Hadoop And Big Data

Published in: Big Data & Hadoop

Important Points on Pig Programming

Priyashree B / Mumbai

35 years of teaching experience

Qualification: M.Tech (RGPV BHOPAL, MP - 2016)

Teaches: Mental Maths, All Subjects, EVS, Mathematics, School Level Computer, Science, Social Studies

  1. Pig Basic Program Structure:
     Script: Pig can run a script file that contains Pig commands.
     Grunt: Grunt is an interactive shell for running Pig commands. Pig scripts can also be run from within Grunt using run and exec.
     Embedded: Pig programs can be run from Java, much as JDBC is used to run SQL programs from Java.
     Note: Pig resides on the user's machine, while the job runs on the Hadoop cluster.

     Pig Latin Program: A Pig Latin program is made up of a series of operations, or transformations, that are applied to the input data to produce output.
     Note: Pig turns the transformations into a series of MapReduce jobs.

     Basic Types of Data Models:
     Bag: A bag is a collection of tuples (outer bag, inner bag).
     Tuple: A tuple is an ordered set of fields.
     Field: A field is a piece of data.
     Data Map: A data map is a map from keys that are string literals to values that can be of any data type.

     Pig Data Types: http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#Data+Types+and+More
     Pig Latin Relational Operators: http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#Data+Types+and+More
     Note: In Pig, when a data element is NULL, its value is unknown.
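
The data model described above (field, tuple, bag, map, and NULL) can be pictured with a rough Python analogy. This is only a sketch, not Pig Latin: every name and value below is hypothetical, a Python list is ordered whereas a Pig bag is not, and None merely stands in for Pig's NULL.

```python
# Rough Python analogy of Pig's data model (illustrative only, not Pig code).
# All names and values are hypothetical.

# Field: a single piece of data.
name = "alice"

# Tuple: an ordered set of fields.
student = ("alice", 21)

# Bag: a collection of tuples. An outer bag plays the role of a relation.
# None stands in for Pig's NULL, meaning the value is unknown.
students = [("alice", 21), ("bob", None), ("carol", 17)]

# Inner bag: a bag nested inside a tuple, such as the result of grouping.
grouped = ("cs101", [("alice", 21), ("bob", None)])

# Data map: string keys mapped to values of any type.
profile = {"city": "Mumbai", "scores": [90, 85]}


def adults(bag):
    """Keep tuples whose age is known and at least 18.

    A tuple with an unknown (None) age is dropped, mirroring how a
    comparison against NULL in Pig does not let a row pass a filter.
    """
    return [t for t in bag if t[1] is not None and t[1] >= 18]


result = adults(students)
```

Here bob is dropped because his age is unknown, not because he is known to be under 18, which is the practical consequence of NULL meaning "unknown".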
  2. File Loaders in Pig Latin:
     BinStorage: "binary" storage.
     PigStorage: loads and stores data that is delimited by a chosen character.
     TextLoader: loads data line by line (delimited by the newline character).
     CSVLoader: loads CSV files.
     XML Loader: loads XML files.

     ETL Operations in Pig: http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#Data+Types+and+More

     Hands On: First Program in Pig
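
The difference between TextLoader and PigStorage in the loader list above can be pictured with a small Python analogy. This is a sketch under stated assumptions, not Pig's implementation: the function names text_loader and pig_storage and the sample data are hypothetical, and real loaders also handle schemas and types, which are ignored here.

```python
import io


def text_loader(stream):
    """Yield one single-field tuple per line, in the spirit of TextLoader."""
    for line in stream:
        yield (line.rstrip("\n"),)


def pig_storage(stream, delimiter="\t"):
    """Yield one tuple per line, split on the delimiter, in the spirit of
    PigStorage (whose default delimiter is the tab character)."""
    for line in stream:
        yield tuple(line.rstrip("\n").split(delimiter))


# Hypothetical comma-delimited input with two records.
data = "alice,21\nbob,19\n"

lines = list(text_loader(io.StringIO(data)))
rows = list(pig_storage(io.StringIO(data), delimiter=","))
```

With this input, text_loader keeps each line whole as a single field, while pig_storage splits each line into separate fields. Those fields stay strings here because no schema is applied.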