
Hardware planning

Hi all,

I'm looking at setting up a (small) Kafka cluster for streaming microscope data to Spark-Streaming.

The producer would be a single Windows 7 machine with a 1Gb or 10Gb ethernet connection, sending HTTP POSTs from Matlab (this bit is a little fuzzy; I'm not the user, I'm an admin). The consumers would be 10-60 (or more) Linux nodes running Spark-Streaming, each with a 10Gb ethernet connection. The target data rate, per the user, is <200MB/sec, although I can see this scaling in the future.
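
For context, my rough mental model of the producer side is a small bridge process that takes whatever Matlab POSTs and republishes it to Kafka with the Java producer client, with batching and compression turned up for the ~200MB/sec target. Something like the sketch below; the topic, host names, and settings are placeholders I made up (and I'm assuming the 0.8.2+ producer API), so treat it as a guess at the shape rather than what the user will actually run:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Placeholder bridge: takes frames (however that ends up working) and
// publishes them to a "microscope-frames" topic on the three brokers.
object MicroscopeIngest {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "kafka01:9092,kafka02:9092,kafka03:9092") // placeholder hosts
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer")
    props.put("acks", "1")               // leader-only ack; "all" if durability beats latency
    props.put("batch.size", "1048576")   // 1MB batches to keep per-request overhead down
    props.put("linger.ms", "50")         // let batches fill before sending
    props.put("compression.type", "snappy")

    val producer = new KafkaProducer[String, Array[Byte]](props)
    val frame: Array[Byte] = Array.fill(64 * 1024)(0.toByte) // stand-in for one POSTed chunk
    producer.send(new ProducerRecord[String, Array[Byte]]("microscope-frames", "scope-1", frame))
    producer.close()
  }
}

At that rate I'd expect batch.size, linger.ms, and compression to matter more than anything else on the producer side, but again, that part is a little fuzzy to me.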

Based on the documentation, my initial thoughts were as follows:

3 nodes, each running ZK and a broker (back-of-envelope sizing after the specs below):

Dell R620
2x8 core 2.6GHz Xeon
256GB RAM
8x300GB 15K SAS drives (OS runs on 2, ZK on 1, broker on the last 5)
10Gb ethernet (single port)
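
To sanity-check that layout against the 200MB/sec target, here's the back-of-envelope arithmetic I'm working from (replication factor 3 and an even spread across the three brokers are assumptions on my part):

// Rough sizing numbers; everything here is an assumption or a placeholder.
object Sizing extends App {
  val ingestMBs   = 200.0     // target aggregate producer rate
  val consumeMBs  = 200.0     // Spark reads everything once
  val replication = 3         // assumed, not decided yet
  val brokers     = 3
  val dataDiskGB  = 5 * 300.0 // log disks per broker

  // Each broker writes its share of leader data plus the follower copies it hosts.
  val writePerBroker  = ingestMBs * replication / brokers
  val nicInPerBroker  = writePerBroker // producer traffic + replication fetches arriving
  val nicOutPerBroker = (consumeMBs + ingestMBs * (replication - 1)) / brokers

  // How long the log disks last at that write rate before retention has to kick in.
  val retentionHours = dataDiskGB * 1024 / writePerBroker / 3600

  println(f"per broker: $writePerBroker%.0f MB/s disk writes, " +
          f"$nicInPerBroker%.0f MB/s NIC in, $nicOutPerBroker%.0f MB/s NIC out")
  println(f"retention possible on the data disks: ~$retentionHours%.1f hours")
}

If I've got that right, each broker sees roughly 200MB/sec of mostly-sequential writes and about the same through the NIC in each direction, which seems to fit a single 10Gb port and 5 dedicated 15K spindles, but the ~2 hours of log retention on 1.5TB of data disk is probably the number I should be questioning hardest.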

Do these specs make sense? Am I over- or under-speccing in any areas? It made sense to me to make the filesystem cache as large as possible, particularly since I'm dealing with a small number of brokers.
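
For reference, I'm picturing the consumer side as something like the sketch below (assuming Spark 1.x with the Kafka 0.8 direct-stream integration from spark-streaming-kafka; topic and broker names are the same placeholders as above). As I understand it, the direct stream gives one Spark partition per Kafka partition, so it's the topic's partition count rather than the broker count that has to keep 10-60 consumer nodes busy:

import kafka.serializer.{DefaultDecoder, StringDecoder}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object MicroscopeConsumer {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("microscope-stream")
    val ssc  = new StreamingContext(conf, Seconds(2))

    val kafkaParams = Map("metadata.broker.list" -> "kafka01:9092,kafka02:9092,kafka03:9092")
    val topics      = Set("microscope-frames")

    // One Spark partition per Kafka partition, pulled straight from the brokers.
    val stream = KafkaUtils.createDirectStream[String, Array[Byte], StringDecoder, DefaultDecoder](
      ssc, kafkaParams, topics)

    // Placeholder work: report bytes received per batch.
    stream.map { case (_, frame) => frame.length.toLong }
          .reduce(_ + _)
          .print()

    ssc.start()
    ssc.awaitTermination()
  }
}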

Thanks,
Ken Carlile
Senior Unix Engineer, Scientific Computing Systems
Janelia Farm Research Campus, HHMI
