I am using Cloudera CDH 5.1 and running a Flume agent configured by
Cloudera manager.
I would like to send Avro data to Flume, and I assumed the Avro Source
would be the appropriate way to receive data sent like this.
However, the Java client examples that send data via the Avro Source send
simple strings rather than serialized Avro objects, e.g. the example
here: https://flume.apache.org/FlumeDeveloperGuide.html
And the examples of Avro serialization all seem to be about serializing
to disk.
In my use case, I am basically receiving a real-time stream of JSON
documents, which I am able to convert to Avro objects, and would like to
put them into Flume. I would then like to be able to index this Avro data
in Solr via the Solr sink, and convert it to Parquet format in HDFS using
the HDFS sink.
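For reference, here is roughly what I had in mind: serialize each converted record to Avro binary in memory and send those bytes as the Flume event body through the `RpcClient` API. The schema, host name, and port below are placeholders for my actual setup, not anything from the Flume docs:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.flume.Event;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;

public class AvroFlumeClient {

    // Placeholder schema standing in for my real JSON-document schema
    private static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Doc\",\"fields\":["
      + "{\"name\":\"id\",\"type\":\"string\"},"
      + "{\"name\":\"body\",\"type\":\"string\"}]}");

    public static void main(String[] args) throws IOException {
        // Connect to the Avro Source configured on the Flume agent
        // ("flume-host" / 41414 are placeholders for my agent's host/port)
        RpcClient client = RpcClientFactory.getDefaultInstance("flume-host", 41414);
        try {
            // Build the record I would have converted from an incoming JSON doc
            GenericRecord record = new GenericData.Record(SCHEMA);
            record.put("id", "doc-1");
            record.put("body", "example payload");

            // Serialize the record to Avro binary in memory (not to disk)
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
            new GenericDatumWriter<GenericRecord>(SCHEMA).write(record, encoder);
            encoder.flush();

            // Use the serialized bytes as the Flume event body
            Event event = EventBuilder.withBody(out.toByteArray());
            client.append(event);
        } finally {
            client.close();
        }
    }
}
```

My uncertainty is whether the Solr and HDFS sinks downstream can interpret an event body that is just raw Avro binary like this, or whether they expect something else (e.g. a container-file framing or headers).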
Is this possible, or am I going about this the wrong way?