Dear all,
I'm new to Kafka, and I'm considering using it for a somewhat unusual
purpose. I want it to be a backend for data synchronization between a
multitude of devices that are not always online (mobile and embedded
devices). All the synchronized information belongs to some user and can be
identified by the user ID. There are several data types, and a user can have
many entries of each data type coming from many different devices.
This solution has to scale up to hundreds of thousands of users, and, as far
as I understand, Kafka stores every partition in a single file. I've been
thinking about creating a topic for every data type and a separate partition
for every user. The amount of data stored per user is no more than several
megabytes over the whole lifetime, because the data would be stored as keyed
messages, and I'm expecting the log to be compacted.
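To make the proposed layout concrete, here is a rough back-of-the-envelope sketch of what it would imply. It does not call Kafka at all; the data type names, the user count of 300,000, and the 5 MB per-user ceiling are placeholder assumptions standing in for "several data types", "hundreds of thousands of users", and "several megabytes".

```python
# Rough sizing of the proposed layout. All concrete numbers and names below
# are illustrative assumptions, not measurements.
DATA_TYPES = ["contacts", "settings", "notes"]  # hypothetical data types
USERS = 300_000          # stands in for "hundreds of thousands of users"
MAX_MB_PER_USER = 5      # stands in for "several megabytes" per user


def layout_summary(users, data_types, max_mb_per_user):
    """Summarize one-topic-per-data-type, one-partition-per-user."""
    topics = len(data_types)
    partitions_per_topic = users              # one partition per user
    total_partitions = topics * partitions_per_topic
    total_gb = users * max_mb_per_user / 1024  # upper bound after compaction
    return {
        "topics": topics,
        "partitions_per_topic": partitions_per_topic,
        "total_partitions": total_partitions,
        "total_gb_upper_bound": total_gb,
    }


if __name__ == "__main__":
    for name, value in layout_summary(USERS, DATA_TYPES, MAX_MB_PER_USER).items():
        print(f"{name}: {value}")
```

Under these assumptions the total data volume is modest, while the partition count grows with the number of users times the number of data types, which is the part I'm unsure about.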
So what I'm wondering is: would Kafka be the right approach for such a task,
and if so, would this architecture (one topic per data type and one partition
per user) scale to the specified extent?
Thanks,
Roman.