java - Spark SQL - DataFrameReader load method with a where condition
I am trying to use DataFrameReader.load("table name") to load records from a Hive table and return a DataFrame.

However, I don't want to load all the records; I only want to fetch the records for a specific date (which is one of the fields in the Hive table). If I add the condition to the returned DataFrame, will it load the entire table first and then filter the records based on the date? This matters because the Hive tables are huge and are partitioned on the date field.

Basically, I want to achieve SELECT * FROM table WHERE date='date' using the load method, without loading the entire table.
Recent versions of Spark support a feature called "predicate push-down", which does what you want: it pushes SQL clauses down to the source where possible. I'm not sure whether predicate push-down works with the Hive data source (it does work with Parquet, JDBC, and other sources). See Does Spark predicate pushdown work with JDBC?
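As a minimal sketch of the idea: DataFrame transformations are lazy, so a filter applied right after the read is part of the query plan before any data is scanned, and a filter on a partition column lets Spark prune partitions at planning time. The database, table, and column names below (`mydb.events`, `date`) are assumptions for illustration, not names from the question.

```scala
import org.apache.spark.sql.SparkSession

object PartitionPruningSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partition-pruning-sketch")
      .enableHiveSupport()        // needed to read Hive tables
      .getOrCreate()
    import spark.implicits._

    // Reading the table is lazy: no records are loaded here.
    val df = spark.read.table("mydb.events")   // assumed table name

    // Filter on the (assumed) partition column. Because transformations
    // are lazy, this filter is part of the plan before any scan happens;
    // for a partition column, Spark scans only the matching partition.
    val oneDay = df.filter($"date" === "2020-01-01")

    // Inspect the physical plan; a PartitionFilters entry indicates
    // that partition pruning is taking place.
    oneDay.explain(true)
  }
}
```

Equivalently, `spark.sql("SELECT * FROM mydb.events WHERE date='2020-01-01'")` produces the same pruned plan; either way, checking the output of `explain` is the reliable way to confirm that the full table is not being scanned.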