What is BloomMapFile used for?

The BloomMapFile is a class that extends MapFile. So its functionality is similar to MapFile. BloomMapFile uses dynamic Bloom filters to provide quick membership test for the keys. It is used in Hbase table format. Can you give us some examples how Hadoop is used...
Read More »

What is PIG?

PIG is a platform for analyzing large data sets that consist of high level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. PIG’s infrastructure layer consists of a compiler that produces sequence of MapReduce Programs. Can you give us some...
Read More »

What is the difference between logical and physical plans?

Pig undergoes some steps when a Pig Latin Script is converted into MapReduce jobs. After performing the basic parsing and semantic checking, it produces a logical plan. The logical plan describes the logical operators that have to be executed by Pig during execution. After this,...
Read More »

Does ‘ILLUSTRATE’ run MR job?

No, illustrate will not pull any MR, it will pull the internal data. On the console, illustrate will not do any job. It just shows output of each stage and not the final output. Can you give us some examples how Hadoop is used in...
Read More »

Is the keyword ‘DEFINE’ like a function name?

Yes, the keyword ‘DEFINE’ is like a function name. Once you have registered, you have to define it. Whatever logic you have written in Java program, you have an exported jar and also a jar registered by you. Now the compiler will check the function...
Read More »

Why do we need MapReduce during Pig programming?

Pig is a high-level platform that makes many Hadoop data analysis issues easier to execute. The language we use for this platform is: Pig Latin. A program written in Pig Latin is like a query written in SQL, where we need an execution engine to...
Read More »