Monday, October 12, 2015

Hive basics for Begginers #1


 1. How to get into hive terminal
        $ hive



 2. Command to display the databases in hive 
    hive> show databases;


    Note:  Hive has a default database 'default', if you don't specify any databases it takes the default database.

3. Command to use database;
      hive> use <database_name>
4. Command to list tables in a database
     hive> show tables;
 5. Create table syntax in hive
   hive> create table emp(eid int, ename string, salary int, gender string, dept_no int ); 
6.Load data into Hive table from Local File System 
  hive> load data local inpath '<Local_Directory_Path>' into table <hive_table_name>;



  emp.txt is a coma delimited file, hence we are getting Null's in all the columns. To overcome this we have to modify our query as below

 7.  Creating a table in hive which can accept the coma delimited file
   hive>create table emp_temp(eid int, ename string, salary int, gender string, dept_no int )
           > row format delimited fields terminated by ',';


   8. Loading data into the above created hive table
      hive>load data local inpath '/tmp/HadoopPractice/emp.txt' into table emp_temp;
     

 Note: In the above output you don't see headers(colum names) for the column. So use the below command to set the headers
  hive> set hive.cli.print.header=true;

9. Now as we have data in our table lets do some analysis on the data
hive> select sum(salary) as salaries_sum from emp_temp;


10. Import data from one hive table to another hive table;
     hive>    insert overwrite table <to_table_name> select eid,ename from <from_table_name>;


     
________________________________________________________________________________

More Hive Commands

hive> describe <table_name>;
hive>describe extended <table_name>;
hive>show functions;
hive> set hive.cli.print.header=true;
hive> describe function <function_name>
hive> load data  inpath '/tmp/HadoopPractice/emp.txt' into table emp_temp; (Loading data from HDFS to Hive).

Note: when you load data from HDFS to Hive, the file in HDFS is actually deleted.


By default all the hive databases are stored in  user/hive/warehouse/
[training@localhost /]$  hadoop fs -ls /user/hive/warehouse/;


















No comments:

Post a Comment