Hello,
I've been having trouble with JsonStorage(). First, since my Python UDF had
an outputSchema that returned floats, I was getting an error in JsonStorage
trying to cast Double to Float. I resolved this by changing my UDF to
return doubles.
Versions: Pig 0.11.1, Hadoop 1.0.3.
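For illustration only, here is a minimal sketch of the kind of change described above. It is not the actual UDF: the function name `ratio` and the field name in the schema string are hypothetical. The assumption is that a Python float is a double-precision value, so declaring the field as `double` rather than `float` keeps the declared schema in line with what the UDF really returns. The fallback decorator is a stand-in so the file can also be run outside Pig.

```python
# Sketch of a Python UDF whose declared schema matches its return type.
# Assumption: Pig makes `outputSchema` available when the script is
# registered as a UDF; the fallback below is only a stand-in for running
# the file on its own.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def decorate(func):
            # Record the schema string on the function object.
            func.outputSchema = schema
            return func
        return decorate


# Declaring "double" instead of "float": a Python float is double precision.
@outputSchema("ratio:double")
def ratio(numerator, denominator):
    # Hypothetical UDF body; always returns a Python float.
    return float(numerator) / float(denominator)
```

It would be registered with something like REGISTER 'udfs.py' USING jython AS udfs; (the file name is also hypothetical).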
Next, I am able to write JSON files out to S3 (I watched the Pig job as it
ran and grabbed a sample), but at what appears to be the final step, writing
.pig_schema, this error is thrown:
grunt> STORE firsts INTO 's3n://n2ygk/firsthops.json' using JsonStorage();
... chugs along for a while, successfully writing
s3://n2ygk/firsthops.json/part-r-* into the bucket ... and then:
java.lang.IllegalArgumentException: This file system object (hdfs://10.253.44.244:9000) does not support access to the request path 's3n://n2ygk/firsthops.json/.pig_schema' You possibly called FileSystem.get(conf) when you should have called FileSystem.get(uri, conf) to obtain a file system supporting your path.
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:384)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:129)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:513)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:770)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:200)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:128)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:144)
    at org.apache.pig.builtin.JsonMetadata.storeSchema(JsonMetadata.java:294)
    at org.apache.pig.builtin.JsonStorage.storeSchema(JsonStorage.java:274)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.storeCleanup(PigOutputCommitter.java:141)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitJob(PigOutputCommitter.java:204)
    at org.apache.hadoop.mapred.Task.runJobCleanupTask(Task.java:1060)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:362)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
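As far as the exception message goes, the failing check seems to be a comparison between the file system object's own URI and the URI of the requested path. The sketch below is a simplified model of that comparison in plain Python 3, run outside Pig. It is not Hadoop's actual checkPath code, and the helper name `supports_path` is made up for this example.

```python
from urllib.parse import urlparse


def supports_path(fs_uri, path):
    """Simplified model: a file system object only accepts paths whose
    scheme and authority match its own, or paths that carry no scheme."""
    fs = urlparse(fs_uri)
    target = urlparse(path)
    if not target.scheme:
        # A scheme-less path such as /tmp/x is resolved against this file system.
        return True
    return (target.scheme.lower() == fs.scheme.lower()
            and target.netloc.lower() == fs.netloc.lower())


# The default (HDFS) file system, like the one FileSystem.get(conf) returned here.
default_fs = "hdfs://10.253.44.244:9000"
schema_path = "s3n://n2ygk/firsthops.json/.pig_schema"

print(supports_path(default_fs, schema_path))      # scheme mismatch: hdfs vs s3n
print(supports_path("s3n://n2ygk", schema_path))   # file system chosen from the path's URI
```

Under this model the part files succeed because they are written through a file system chosen from the output URI, while the .pig_schema write goes through the default HDFS file system and is rejected.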
Any ideas?
Thanks.
/a