This chapter covers the diagnostic operators of Apache Pig. The LOAD statement simply loads the data into the specified relation. To verify that a LOAD statement did what you expected, you use the diagnostic operators. Pig Latin provides four diagnostic operators:
- Dump operator
- Describe operator
- Explain operator
- Illustrate operator
This chapter focuses on the Dump operator of Pig Latin.
Dump Operator
The Dump operator runs the Pig Latin statements that define a relation and displays the results on the screen. It is generally used for debugging.
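The idea that Dump is what actually triggers execution can be illustrated with a small Python sketch. This is only a toy model, not Pig's implementation: the names `Relation`, `load`, and `dump` are hypothetical, and the sketch assumes that defining a relation merely records how to produce its rows, while dumping it does the work.

```python
class Relation:
    """Toy stand-in for a Pig relation: it remembers how to get rows but reads nothing yet."""

    def __init__(self, name, producer):
        self.name = name
        self._producer = producer  # callable that yields the rows when asked
        self.runs = 0              # how many times the rows were actually computed


def load(name, lines, delimiter=","):
    """Define a relation over some lines of text without reading them (hypothetical helper)."""
    def producer():
        return [tuple(line.split(delimiter)) for line in lines]
    return Relation(name, producer)


def dump(relation):
    """Run the deferred work, print each row as a tuple, and return the rows."""
    relation.runs += 1
    rows = relation._producer()
    for row in rows:
        print("(" + ",".join(row) + ")")
    return rows


student = load("student", ["001,Rajiv,Reddy", "002,siddarth,Battacharya"])
print(student.runs)  # nothing has been computed at this point
dump(student)        # the rows are computed and printed only now
```

In this sketch, a mistake in the load step would only surface when `dump` is called, which is why a diagnostic operator is needed to verify a LOAD statement.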
Syntax
Given below is the syntax of the Dump operator.
grunt> Dump Relation_Name;
Example
Assume we have a file student_data.txt in HDFS with the following content.
001,Rajiv,Reddy,9848022337,Hyderabad
002,siddarth,Battacharya,9848022338,Kolkata
003,Rajesh,Khanna,9848022339,Delhi
004,Preethi,Agarwal,9848022330,Pune
005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
006,Archana,Mishra,9848022335,Chennai
We have read it into a relation named student using the LOAD operator, as shown below.
grunt> student = LOAD 'hdfs://localhost:9000/pig_data/student_data.txt' USING PigStorage(',') as ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray );
Now, let us print the contents of the relation using the Dump operator as shown below.
grunt> Dump student;
Once you execute the above Pig Latin statement, it starts a MapReduce job to read the data from HDFS and produces the following output.
2015-10-01 15:05:27,642 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-10-01 15:05:27,652 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.0 0.15.0 Hadoop 2015-10-01 15:03:11 2015-10-01 05:27 UNKNOWN

Success!

Job Stats (time in seconds):
JobId job_14459_0004
Maps 1
Reduces 0
MaxMapTime n/a
MinMapTime n/a
AvgMapTime n/a
MedianMapTime n/a
MaxReduceTime 0
MinReduceTime 0
AvgReduceTime 0
MedianReducetime 0
Alias student
Feature MAP_ONLY
Outputs hdfs://localhost:9000/tmp/temp580182027/tmp757878456,

Input(s):
Successfully read 0 records from: "hdfs://localhost:9000/pig_data/student_data.txt"

Output(s):
Successfully stored 0 records in: "hdfs://localhost:9000/tmp/temp580182027/tmp757878456"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1443519499159_0004

2015-10-01 15:06:28,403 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2015-10-01 15:06:28,441 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2015-10-01 15:06:28,485 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-10-01 15:06:28,485 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1

(1,Rajiv,Reddy,9848022337,Hyderabad)
(2,siddarth,Battacharya,9848022338,Kolkata)
(3,Rajesh,Khanna,9848022339,Delhi)
(4,Preethi,Agarwal,9848022330,Pune)
(5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar)
(6,Archana,Mishra,9848022335,Chennai)
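Note that the file stores the id as 001 while the dumped tuple shows 1. This follows from the schema, which declares id as int while the remaining fields are chararray. The Python sketch below imitates that behaviour for a single record. It is an illustration only, not how Pig is implemented: `SCHEMA`, `parse_line`, and `format_tuple` are hypothetical names, and Python's `int` and `str` stand in for Pig's int and chararray types.

```python
# Field names and converters mirroring the schema in the LOAD statement above.
# int/str are stand-ins for Pig's int and chararray (an assumption of this sketch).
SCHEMA = [
    ("id", int),
    ("firstname", str),
    ("lastname", str),
    ("phone", str),
    ("city", str),
]


def parse_line(line, delimiter=","):
    """Split one record on the delimiter and convert each field per the schema."""
    fields = line.rstrip("\n").split(delimiter)
    return tuple(convert(value) for (_, convert), value in zip(SCHEMA, fields))


def format_tuple(row):
    """Render a row the way the dumped tuples appear: comma-separated, in parentheses."""
    return "(" + ",".join(str(value) for value in row) + ")"


record = parse_line("001,Rajiv,Reddy,9848022337,Hyderabad")
print(format_tuple(record))  # (1,Rajiv,Reddy,9848022337,Hyderabad)
```

The phone number keeps all of its digits because it is declared as chararray. Had it been declared as a numeric type, any leading zeros would be lost in the same way as in the id.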