HCatalog – Create Table

In this we will explains how to create a table and how to insert data into it. The conventions of creating a table in HCatalog is quite similar to creating a table using Hive.

Create Table Statement

Create Table is a statement used to create a table in Hive metastore using HCatalog. Its syntax and example are as follows −

Syntax

CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.] table_name
[(col_name data_type [COMMENT col_comment], ...)]
[COMMENT table_comment]
[ROW FORMAT row_format]
[STORED AS file_format]

Example

Let us assume you need to create a table named employee using CREATE TABLE statement. The following table lists the fields and their data types in the employee table −

Sr.NoField NameData Type
1Eidint
2NameString
3SalaryFloat
4Designationstring

The following data defines the supported fields such as Comment, Row formatted fields such as Field terminatorLines terminator, and Stored File type.

COMMENT ‘Employee details’
FIELDS TERMINATED BY ‘\t’
LINES TERMINATED BY ‘\n’
STORED IN TEXT FILE

The following query creates a table named employee using the above data.

./hcat –e "CREATE TABLE IF NOT EXISTS employee ( eid int, name String, 
   salary String, destination String) \
COMMENT 'Employee details' \
ROW FORMAT DELIMITED \
FIELDS TERMINATED BY ‘\t’ \
LINES TERMINATED BY ‘\n’ \
STORED AS TEXTFILE;"

If you add the option IF NOT EXISTS, HCatalog ignores the statement in case the table already exists.

On successful creation of table, you get to see the following response −

OK
Time taken: 5.905 seconds

Load Data Statement

Generally, after creating a table in SQL, we can insert data using the Insert statement. But in HCatalog, we insert data using the LOAD DATA statement.

While inserting data into HCatalog, it is better to use LOAD DATA to store bulk records. There are two ways to load data: one is from local file system and second is from Hadoop file system.

Syntax

The syntax for LOAD DATA is as follows −

LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
[PARTITION (partcol1=val1, partcol2=val2 ...)]
  • LOCAL is the identifier to specify the local path. It is optional.
  • OVERWRITE is optional to overwrite the data in the table.
  • PARTITION is optional.

Example

We will insert the following data into the table. It is a text file named sample.txt in /home/user directory.

1201  zafrul        45000    Technical manager
1202  debi          45000    Proof reader
1203  subrat        40000    Technical writer
1204  jay           40000    Hr Admin
1205  ajit          30000    Op Admin

The following query loads the given text into the table.

./hcat –e "LOAD DATA LOCAL INPATH '/home/user/sample.txt'
OVERWRITE INTO TABLE employee;"

On successful download, you get to see the following response −

OK
Time taken: 15.905 seconds

This Post Has One Comment

Leave a Reply