Hi all,
My script basically performs a calculation and outputs an array of values.
Pig script:
register pcent.py using jython as pc;
data = load 'Data.csv' as (value:int);
B = FOREACH data GENERATE pc.ptile(value);
C = group B by (percentiles);
store C into 'C';
UDF:
@outputSchema("y:float")
def ptile(value):
    unique = set(value)
    maps = {}
    pc = float(1)/(len(unique)-1)
    for n, i in enumerate(unique):
        maps[i] = (n*pc)
    return [maps.get(i) for i in value]
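For reference, this is roughly how I check the function outside Pig. It is only a minimal sketch: the sample list is made up, and the @outputSchema decorator is left out because it only exists inside Pig. The last part tries a single int instead of a list, on the assumption (not confirmed) that Pig may be calling the UDF once per row with one value:

```python
def ptile(value):
    # Same body as the UDF above, without the Pig-only decorator.
    unique = set(value)
    maps = {}
    pc = float(1) / (len(unique) - 1)
    for n, i in enumerate(unique):
        maps[i] = n * pc
    return [maps.get(i) for i in value]


# Made-up sample data: a whole column passed in as one list.
sample = [10, 20, 30, 20]
result = ptile(sample)
print(result)

# Assumption: inside FOREACH, the function may receive a single int per row.
# set() on an int raises TypeError, so the call fails before any mapping is built.
try:
    ptile(10)
    scalar_error = None
except TypeError as exc:
    scalar_error = exc
print(repr(scalar_error))
```

With a list the function returns one fraction per input element; with a single int it raises a TypeError, which may or may not be what is hidden behind "ERROR 0" below.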
The Python script runs fine on its own, but I keep getting an error in Pig, and the stack trace is not very helpful:
Backend error message
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Error executing function
at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:120)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:337)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:410)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:344)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
at o
Pig Stack Trace
ERROR 0: Error executing function
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Error executing function
at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:120)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:337)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:410)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:344)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)