Channel: Apache Timeline

Optimizing Pig script

Hi,

I have written a Pig script that processes Sequence files given as
input.

It works fine, but there is one problem, described below.

I have repetitive statements in my Pig script, as shown below:

- Filtered_Data_1 = FILTER BagName BY ($0 matches 'RegEx-1');
- Filtered_Data_2 = FILTER BagName BY ($0 matches 'RegEx-2');
- Filtered_Data_3 = FILTER BagName BY ($0 matches 'RegEx-3');
- and so on…

Question:

Is there any way to write the above statement once and then loop through
all possible regular expressions, substituting each one into the Pig script?

For Example:

Filtered_Data_X = FILTER BagName BY ($0 matches 'RegEx'); (write this
statement once)

(loop through all possible RegEx values and substitute each one into the
statement)

Right now I am calling the Pig script from a shell script, so a solution
driven from the shell script would also be welcome.
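
To make the shell-driven idea concrete, here is a rough sketch of what I have in mind. It assumes Pig's parameter substitution (the `-param` command-line option) and is a dry run that only prints the commands. The file name `filter.pig`, the parameter names `REGEX` and `IDX`, and the output path are hypothetical placeholders, not part of my actual script.

```shell
#!/bin/sh
# Assumed contents of the hypothetical filter.pig (Pig parameter substitution):
#   Filtered_Data = FILTER BagName BY ($0 matches '$REGEX');
#   STORE Filtered_Data INTO 'output_$IDX';   -- output path is a placeholder

# Print one pig invocation per pattern passed as an argument.
# Remove the `echo` to actually launch Pig instead of printing the command.
build_pig_commands() {
    idx=1
    for regex in "$@"; do
        echo pig -param "IDX=$idx" -param "REGEX=$regex" filter.pig
        idx=$((idx + 1))
    done
}

# Placeholder patterns standing in for the real regular expressions.
build_pig_commands 'RegEx-1' 'RegEx-2' 'RegEx-3'
```

One drawback of this approach is that every pattern becomes a separate Pig run, so the input is read again for each one. Patterns containing quotes or backslashes may also need extra escaping before they are passed as parameters.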

Thanks in advance.

Happy Pigging!!!!
