In this we will explains how to create a table and how to insert data into it. The conventions of creating a table in HCatalog is quite similar to creating a table using Hive.
Create Table Statement
Create Table is a statement used to create a table in Hive metastore using HCatalog. Its syntax and example are as follows −
Syntax
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.] table_name [(col_name data_type [COMMENT col_comment], ...)] [COMMENT table_comment] [ROW FORMAT row_format] [STORED AS file_format]
Example
Let us assume you need to create a table named employee using CREATE TABLE statement. The following table lists the fields and their data types in the employee table −
Sr.No | Field Name | Data Type |
---|---|---|
1 | Eid | int |
2 | Name | String |
3 | Salary | Float |
4 | Designation | string |
The following data defines the supported fields such as Comment, Row formatted fields such as Field terminator, Lines terminator, and Stored File type.
COMMENT ‘Employee details’ FIELDS TERMINATED BY ‘\t’ LINES TERMINATED BY ‘\n’ STORED IN TEXT FILE
The following query creates a table named employee using the above data.
./hcat –e "CREATE TABLE IF NOT EXISTS employee ( eid int, name String, salary String, destination String) \ COMMENT 'Employee details' \ ROW FORMAT DELIMITED \ FIELDS TERMINATED BY ‘\t’ \ LINES TERMINATED BY ‘\n’ \ STORED AS TEXTFILE;"
If you add the option IF NOT EXISTS, HCatalog ignores the statement in case the table already exists.
On successful creation of table, you get to see the following response −
OK Time taken: 5.905 seconds
Load Data Statement
Generally, after creating a table in SQL, we can insert data using the Insert statement. But in HCatalog, we insert data using the LOAD DATA statement.
While inserting data into HCatalog, it is better to use LOAD DATA to store bulk records. There are two ways to load data: one is from local file system and second is from Hadoop file system.
Syntax
The syntax for LOAD DATA is as follows −
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
- LOCAL is the identifier to specify the local path. It is optional.
- OVERWRITE is optional to overwrite the data in the table.
- PARTITION is optional.
Example
We will insert the following data into the table. It is a text file named sample.txt in /home/user directory.
1201 zafrul 45000 Technical manager 1202 debi 45000 Proof reader 1203 subrat 40000 Technical writer 1204 jay 40000 Hr Admin 1205 ajit 30000 Op Admin
The following query loads the given text into the table.
./hcat –e "LOAD DATA LOCAL INPATH '/home/user/sample.txt' OVERWRITE INTO TABLE employee;"
On successful download, you get to see the following response −
OK Time taken: 15.905 seconds
Great post.Thanks Again. Really Cool.