
Filename in load

Hey all,

I'm loading a group of CSV files into Pig with PigStorage, and I would
like to include the filename in each tuple loaded from that file, so that
each tuple can be identified as coming from that file (each file is for a
particular user).

So for example:
csv_all = LOAD 'sample1.csv, sample2.csv' USING PigStorage('|')
    AS (upc:chararray, store_id:int, date:chararray,
        product_description:chararray);

Is there a way to load the tuples from each CSV so that they include
another field containing the filename or part of it (like
filename:chararray)?

Thanks in advance!

*cavallero.me <http://cavallero.me>*
