Hey all,
I'm loading a group of CSV files into Pig with PigStorage, and I would
like to include the filename in each tuple loaded from that file, so
that I can tell which file each tuple came from (each file is for a
particular user).
So for example:
csv_all = LOAD 'sample1.csv, sample2.csv' USING PigStorage('|')
AS (upc:chararray, store_id:int, date:chararray,
product_description:chararray);
Is there a way to load the tuples from each CSV so that they include
another field containing the filename, or part of it (like
filename:chararray)?
Thanks in advance!
*cavallero.me <http://cavallero.me>*