I'm receiving messages through a SyslogUDP source and storing them on HDFS, but the events in the stored data are not separated by \n as I would expect, even though serializer.appendNewline is set to true.
Without the newlines, the result is hard to process further with the standard tool set.
# flume config
tier1.sources.sourceDHCP_Raw.type = syslogudp
tier1.sources.sourceDHCP_Raw.host = 0.0.0.0
tier1.sources.sourceDHCP_Raw.port = 5141
tier1.sources.sourceDHCP_Raw.channels = channelDHCP_Raw
tier1.channels.channelDHCP_Raw.type = memory
tier1.channels.channelDHCP_Raw.capacity = 100
tier1.sinks.sinkDHCP_Raw.type = hdfs
tier1.sinks.sinkDHCP_Raw.hdfs.path = /flume/TV/DHCP_RAW
tier1.sinks.sinkDHCP_Raw.hdfs.rollInterval = 10000
tier1.sinks.sinkDHCP_Raw.serializer.appendNewline = true
tier1.sinks.sinkDHCP_Raw.channel = channelDHCP_Raw
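
One thing that may be worth checking (a sketch, not verified against this setup): the config above does not set `hdfs.fileType`, and the Flume HDFS sink defaults to `SequenceFile`. As far as I understand, the `serializer.*` settings, including `appendNewline`, only take effect when the file type is `DataStream` or `CompressedStream`. Assuming plain text output is what is wanted, the sink section would gain one line:

```properties
# Assumption: plain-text files are wanted instead of the default SequenceFile.
# With DataStream the sink uses the event serializer (TEXT by default),
# so serializer.appendNewline should then be honored.
tier1.sinks.sinkDHCP_Raw.hdfs.fileType = DataStream
tier1.sinks.sinkDHCP_Raw.serializer.appendNewline = true
```

Files already written as SequenceFile would stay in that format; only files rolled after the change would be affected.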
__What comes through the network__
# tcpdump -n udp -A | grep 'ZXAN'
.....B......y..<191>S,3,00029b221441,0.0.0.0,2014-6-5 12:47:14.716,2014-6-5 12:47:14,ZXAN pon 0/ 2/3/ 8/2:6,28734950,0x853DE730
.....B......y..<191>S,3,00029b221441,0.0.0.0,2014-6-5 12:46:27.451,2014-6-5 12:46:27,ZXAN pon 0/ 2/3/ 8/2:6,28734950,0x853DE730