Channel: Apache Timeline

Output Schema of Pig UDF that returns a Tuple

Hi

I am writing a Pig UDF that returns a Tuple, as per
http://wiki.apache.org/pig/UDFManual . I want the output tuple to have
a particular schema, say {name:chararray, age:int}, after I FLATTEN
the result of the UDF.

As per the UDFManual, the method below

    public Schema outputSchema(Schema input) {
        try {
            // Build the schema of the returned tuple from the input fields (swapped).
            Schema tupleSchema = new Schema();
            tupleSchema.add(input.getField(1));
            tupleSchema.add(input.getField(0));
            // Wrap it in a single TUPLE field named after the UDF class.
            return new Schema(new Schema.FieldSchema(
                    getSchemaName(this.getClass().getName().toLowerCase(), input),
                    tupleSchema, DataType.TUPLE));
        } catch (Exception e) {
            return null;
        }
    }

gives this.getClass().getName().toLowerCase()::name and
this.getClass().getName().toLowerCase()::age as the fields after I
flatten.

My actual use case has a Tuple whose schema has 100 columns, including
nested bags.

Is there some way I can get rid of the prefix on each of the fields?

I just need the schema of the Tuple to be

{ field_name1: datatype1, field_name2: datatype2, ..., field_name100:
datatype100 }
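
One possible workaround, sketched here under assumptions: Pig's FOREACH ... GENERATE accepts an AS clause after FLATTEN(...) that names the flattened fields, which would replace the prefixed names. Typing 100 entries by hand is error-prone, so the clause could be generated. The AsClauseBuilder class below is a hypothetical helper (not part of Pig), and MyUdf, A and B are placeholder names. It handles only flat name:type pairs, so a nested bag would need its full type string supplied as the value.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Pig: builds the text of an AS clause
// from an ordered map of field name -> Pig type string.
class AsClauseBuilder {
    static String build(Map<String, String> fields) {
        StringJoiner joiner = new StringJoiner(", ", "AS (", ")");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            joiner.add(field.getKey() + ":" + field.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, which must match the tuple's field order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "chararray");
        fields.put("age", "int");
        // MyUdf, A and B are placeholders for the real UDF and relation names.
        System.out.println("B = FOREACH A GENERATE FLATTEN(MyUdf(name, age)) "
                + build(fields) + ";");
    }
}
```

The printed statement can be pasted into the Pig script. This does not change what outputSchema returns; it only renames the fields at the point where the tuple is flattened.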

Thanks
Narayanan
