Hi
I am writing a Pig UDF that returns a Tuple, as per
http://wiki.apache.org/pig/UDFManual . I want the output tuple to have
a particular schema, say {name:chararray, age:int}, after I FLATTEN
the result of the UDF.
As per the UDFManual, the method below
    public Schema outputSchema(Schema input) {
        try {
            Schema tupleSchema = new Schema();
            tupleSchema.add(input.getField(1));
            tupleSchema.add(input.getField(0));
            return new Schema(new Schema.FieldSchema(
                    getSchemaName(this.getClass().getName().toLowerCase(), input),
                    tupleSchema, DataType.TUPLE));
        } catch (Exception e) {
            return null;
        }
    }
gives this.getClass().getName().toLowerCase()::name and
this.getClass().getName().toLowerCase()::age as the field names after I
FLATTEN.
My actual use case has a Tuple whose schema has 100 columns, with
nested bags etc.
Is there some way I can get rid of the prefix on each of the fields?
I just need the schema of the Tuple to be
{ field_name1: datatype1, field_name2: datatype2, .... field_name100:
datatype100 }
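To make the renaming I am after concrete, here is a minimal sketch in plain Java. It uses no Pig classes at all: `AliasPrefix`, `stripPrefix` and `stripAll` are hypothetical names made up for illustration, and it assumes the prefix is always separated from the field name by `::`. It only shows the mapping from prefixed aliases to bare ones, not how to apply it to a Pig Schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AliasPrefix {

    // Hypothetical helper (not a Pig API): drop everything up to and
    // including the last "::" in a flattened alias. Aliases without
    // "::" are returned unchanged.
    static String stripPrefix(String alias) {
        int idx = alias.lastIndexOf("::");
        return idx < 0 ? alias : alias.substring(idx + 2);
    }

    // Apply stripPrefix to every alias, preserving order.
    static List<String> stripAll(List<String> aliases) {
        List<String> out = new ArrayList<String>();
        for (String alias : aliases) {
            out.add(stripPrefix(alias));
        }
        return out;
    }

    public static void main(String[] args) {
        // "com.example.myudf" stands in for the lowercased class name.
        List<String> flattened = Arrays.asList(
                "com.example.myudf::name",
                "com.example.myudf::age");
        System.out.println(stripAll(flattened));
    }
}
```

With the two example aliases this would leave just name and age, which is the shape I want for all 100 columns.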
Thanks
Narayanan