Channel: Apache Timeline

union

Hi,

According to Pig's documentation on union, two relations which have the same
schema (the same length, and types that can be implicitly cast) can be
concatenated (see http://pig.apache.org/docs/r0.11.1/basic.html#union).

However, when I run:
A = load '1.txt' using PigStorage(' ') as (x:int, y:chararray, z:chararray);
B = load '1_ext.txt' using PigStorage(' ') as (a:int, b:chararray, c:chararray);
C = union A, B;
describe C;
DUMP C;
store C into '/home/kereno/Documents/pig-0.11.1/workspace/res';

with:
~/Documents/pig-0.11.1/workspace 130$ more 1.txt 1_ext.txt
::::::::::::::
1.txt
::::::::::::::
1 a aleph
2 b bet
3 g gimel
::::::::::::::
1_ext.txt
::::::::::::::
0 a alpha
0 b beta
0 g gimel

I get the following result:
~/Documents/pig-0.11.1/workspace 0$ more res/part-m-0000*
::::::::::::::
res/part-m-00000
::::::::::::::
0 a alpha
0 b beta
0 g gimel
::::::::::::::
res/part-m-00001
::::::::::::::
1 a aleph
2 b bet
3 g gimel

Whereas I was expecting something like:
0 a alpha
0 b beta
0 g gimel
1 a aleph
2 b bet
3 g gimel

[all together]

I understand that two files would be generated for non-matching schemas, but
why are two files generated for a union of relations with matching schemas?

Thanks,
Keren
