Hi all,
I'm looking at setting up a (small) Kafka cluster for streaming microscope data to Spark Streaming.
The producer would be a single Windows 7 machine with a 1Gb or 10Gb Ethernet connection, sending HTTP POSTs from MATLAB (this bit is a little fuzzy; I'm the admin, not the user). The consumers would be 10-60 (or more) Linux nodes running Spark Streaming, each with a 10Gb Ethernet connection. The target data rate, per the user, is <200MB/sec, although I can see this scaling in the future.
Based on the documentation, my initial thoughts were as follows:
3 nodes, each running both ZooKeeper (ZK) and a Kafka broker
Dell R620
2x8 core 2.6GHz Xeon
256GB RAM
8x300GB 15K SAS drives (OS runs on 2, ZK on 1, broker on the last 5)
10Gb ethernet (single port)
Do these specs make sense? Am I over- or under-speccing in any area? It made sense to me to make the filesystem cache as large as possible, particularly since I'm dealing with a small number of brokers.
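As a rough sanity check on the numbers above, here is a back-of-the-envelope sketch. The hardware figures and the 200MB/sec ceiling are from this post. Everything else is an assumption of mine, not something the user has specified: a replication factor of 3, data spread evenly across the brokers, decimal units (1GB = 1000MB), all RAM acting as page cache, and the 5 broker disks used as raw capacity with no RAID overhead.

```python
# Back-of-the-envelope sizing for the proposed 3-broker cluster.
# Values marked ASSUMPTION are illustrative guesses, not requirements.

INGEST_MB_S = 200      # target ceiling quoted by the user
BROKERS = 3            # proposed node count
REPLICATION = 3        # ASSUMPTION: every partition replicated to all 3 brokers
RAM_GB = 256           # per node
DATA_DISKS = 5         # disks reserved for the broker, per node
DISK_GB = 300          # per disk
NIC_GBIT = 10          # single 10Gb port per node


def per_broker_write_mb_s(ingest_mb_s, brokers, replication):
    """Each message is written `replication` times, spread evenly over brokers."""
    return ingest_mb_s * replication / brokers


def seconds_to_fill(capacity_gb, write_mb_s):
    """Time to write `capacity_gb` at `write_mb_s`, using 1GB = 1000MB."""
    return capacity_gb * 1000 / write_mb_s


write_mb_s = per_broker_write_mb_s(INGEST_MB_S, BROKERS, REPLICATION)

# ASSUMPTION: all RAM is page cache (in practice the OS and JVM take a share),
# so this is an upper bound on how much recent data stays in memory.
cache_minutes = seconds_to_fill(RAM_GB, write_mb_s) / 60

# ASSUMPTION: raw capacity of the 5 broker disks, no RAID or filesystem overhead.
disk_hours = seconds_to_fill(DATA_DISKS * DISK_GB, write_mb_s) / 3600

# Nominal line rate of the NIC in MB/s, and the share taken by incoming writes.
nic_mb_s = NIC_GBIT * 1000 / 8
nic_write_fraction = write_mb_s / nic_mb_s

print(f"per-broker write rate: {write_mb_s:.0f} MB/s")
print(f"page cache holds at most ~{cache_minutes:.0f} minutes of data")
print(f"broker disks fill in ~{disk_hours:.1f} hours")
print(f"incoming writes use ~{nic_write_fraction:.0%} of the NIC")
```

Under those assumptions, the broker disks rather than RAM or the network look like the tightest constraint, since sustained ingest at the full rate would fill them in a couple of hours. A lower replication factor or a lower sustained rate changes that picture proportionally.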
Thanks,
Ken Carlile
Senior Unix Engineer, Scientific Computing Systems
Janelia Farm Research Campus, HHMI