Channel: Apache Timeline

field name reference - alias

Hello,

Can one refer to a field by its full name (A::x instead of x) even when
there is no ambiguity? Below are two seemingly contradictory behaviors:

*First example:*
A = load '1.txt' using PigStorage(' ') as (x:int, y:chararray, z:chararray);
B = load '1_ext.txt' using PigStorage(' ') as (a:int, b:chararray, c:chararray);
C = JOIN A by x LEFT OUTER, B BY a;
D = FOREACH C GENERATE A::x as toto;
describe C;
describe D;

*output:*
C: {A::x: int,A::y: chararray,A::z: chararray,B::a: int,B::b: chararray,B::c: chararray}
D: {toto: int}

It also works fine if you refer to A::x simply as x.

*Second example with TOMAP:*
A = load '1.txt' using PigStorage(' ') as (x:int, y:chararray, z:chararray);
B = FOREACH A GENERATE TOMAP('toto', x);
describe B;
DUMP B;
store B into '/home/kereno/Documents/pig-0.11.1/workspace/res';

*output:*
B: {map[]}

If you change the script to refer to A::x, you get an error, as follows:
A = load '1.txt' using PigStorage(' ') as (x:int, y:chararray, z:chararray);
B = FOREACH A GENERATE TOMAP('toto', A::x);
describe B;
DUMP B;
store B into '/home/kereno/Documents/pig-0.11.1/workspace/res';

*output:*
<file tomap.pig, line 2, column 37> Invalid field projection. Projected field [A::x] does not exist in schema: x:int,y:chararray,z:chararray.

My question is: why can I use either name in the FOREACH of the first
example but not inside TOMAP?
Side note: I am asking because I am generating the schemas of a Pig script
and using them as input for another language (a project translating Pig to
Algebricks), and I would like to stay consistent with Pig's behavior :).
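
For comparison, here is a sketch of a third variant that is not among the runs above. It assumes the A:: prefix only becomes part of a schema after the JOIN, as the describe output for C and the error message both suggest. If so, TOMAP should accept A::x when it is applied to C rather than directly to A. It reuses the same two input files as the first example.

```pig
-- Sketch only: same inputs as the first example (1.txt and 1_ext.txt).
A = load '1.txt' using PigStorage(' ') as (x:int, y:chararray, z:chararray);
B = load '1_ext.txt' using PigStorage(' ') as (a:int, b:chararray, c:chararray);
C = JOIN A by x LEFT OUTER, B BY a;
-- C's schema lists A::x, so the qualified name should resolve here,
-- unlike in FOREACH A, whose schema only contains x.
D = FOREACH C GENERATE TOMAP('toto', A::x);
describe D;
```

If that holds, the difference would come from which relation the FOREACH iterates over rather than from TOMAP itself.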

Thanks,
Keren
