Channel: Apache Timeline

How can I make Camus ETL point to HDFS instead of the local filesystem?

I tried to run Camus with the properties file from the examples directory:

java -cp ...... com.linkedin.camus.etl.kafka.CamusJob -P myproperties.properties

It then fails because the execution directory /camus/exec does not exist:
~/tools/camus/camus-etl-kafka$ java -cp target/camus-etl-kafka-0.1.0-SNAPSHOT.jar com.linkedin.camus.etl.kafka.CamusJob -P camus.properties
Starting Kafka ETL Job
The blacklisted topics: []
The whitelisted topics: []
Dir Destination set to: /camus/out
Getting the base paths.
The execution base path does not exist. Creating the directory
The history base path does not exist. Creating the directory.
Exception in thread "main" java.io.FileNotFoundException: File /camus/exec does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
    at org.apache.hadoop.fs.FileSystem.getContentSummary(FileSystem.java:801)
    at com.linkedin.camus.etl.kafka.CamusJob.run(CamusJob.java:223)
    at com.linkedin.camus.etl.kafka.CamusJob.run(CamusJob.java:556)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)

When I changed the directory to hdfs://localhost/camus/out, it failed with:
$ java -cp target/camus-etl-kafka-0.1.0-SNAPSHOT.jar com.linkedin.camus.etl.kafka.CamusJob -P camus.properties
Starting Kafka ETL Job
The blacklisted topics: []
The whitelisted topics: []
Dir Destination set to: hdfs://localhost/camus/out
Getting the base paths.
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost/camus/exec, expected: file:///
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:381)
    at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:55)
    at org.apache.hadoop.fs.LocalFileSystem.pathToFile(LocalFileSystem.java:61)
    at org.apache.hadoop.fs.LocalFileSystem.exists(LocalFileSystem.java:51)
    at com.linkedin.camus.etl.kafka.CamusJob.run(CamusJob.java:211)
    at com.linkedin.camus.etl.kafka.CamusJob.run(CamusJob.java:556)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)

So how can I let Camus know that it is running in an HDFS environment, so that
the code at CamusJob.java:140

private Job createJob(Properties props) throws IOException {
Job job = new Job(getConf());

gets a conf that points to the HDFS setup?

I have already set the HADOOP_CONF_DIR environment variable to point to my running Hadoop setup.
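
One thing that may be worth checking: HADOOP_CONF_DIR is normally read by the hadoop launcher scripts, whereas a plain java -cp run only sees core-site.xml if the directory containing it is on the classpath. Without it, Hadoop's Configuration falls back to the local filesystem (file:///), which would be consistent with both errors above. The snippet below is only a sketch of how the classpath could be built. The /etc/hadoop/conf fallback and the variable names CAMUS_JAR and CAMUS_CP are assumptions for illustration, not anything defined by Camus.

```shell
#!/bin/sh
# Assumption: HADOOP_CONF_DIR is the directory that holds core-site.xml;
# /etc/hadoop/conf is only a placeholder fallback for this sketch.
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
CAMUS_JAR="target/camus-etl-kafka-0.1.0-SNAPSHOT.jar"

# Put the config directory itself (not the XML files) on the classpath so
# that Hadoop's Configuration can load core-site.xml as a classpath resource.
CAMUS_CP="${HADOOP_CONF_DIR}:${CAMUS_JAR}"
echo "Classpath: ${CAMUS_CP}"

# Only launch the job when the jar is actually present.
if [ -f "${CAMUS_JAR}" ]; then
  java -cp "${CAMUS_CP}" com.linkedin.camus.etl.kafka.CamusJob -P camus.properties
else
  echo "Jar not found: ${CAMUS_JAR}"
fi
```

Launching the jar through the hadoop launcher (hadoop jar) instead of java -cp is another common way to get the cluster configuration onto the classpath, assuming the hadoop command is available on the machine.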

Thanks,
Yang
