Tuesday, 31 August 2021

Hive with Parquet file

File location on HDFS:

/data44/2/part-00000-ddd75d27-e608-4a17-a96d-4d631c71e875-c000.snappy.parquet

Create an external table in Hive:

create external table test_paq4(deptid string,name string,id int) stored as parquet location 'hdfs://127.0.0.1:9000/data44/2';

hive> select * from test_paq4;


Note: With Parquet, the columns in the table definition can be declared in any order, e.g. id, name, deptid or deptid, name, id.

Only the column names have to match the column names in the Parquet file. This works because each Parquet file stores its own schema (including the column names), so Hive can match columns by name rather than by position.
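As a minimal sketch of the point above, the same Parquet directory can be exposed through a second table whose columns are declared in a different order. The table name test_paq4_reordered is hypothetical and used only for illustration; the location and column types are the ones from the table above, and the example assumes Hive's default name-based column resolution for Parquet.

```sql
-- Hypothetical table name; same files, columns declared as id, name, deptid
-- instead of deptid, name, id.
create external table test_paq4_reordered(id int, name string, deptid string)
stored as parquet
location 'hdfs://127.0.0.1:9000/data44/2';

-- Columns are resolved by name, so each column should show the same
-- values as the corresponding column in test_paq4.
select id, name, deptid from test_paq4_reordered;
```

If a declared column name does not exist in the Parquet schema, the query would be expected to return NULL for that column rather than pick up values from another position.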