How do I design a system to process messages sequentially | Lobsters
source link: https://lobste.rs/s/w1bk6l/how_do_i_design_system_process_messages
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
I have come across this problem at work and I want to know how it done/implemented generally. I don’t know where else to ask/discuss this. I would be really grateful if there are more communities where I can post this.
This is a chat application which includes multiple rooms. The end users are typically on platforms like WhatsApp, Instagram etc. and there is one room for each user. These platforms have an API which lets me queue the messages. Once the message is sent to the user’s accounts/devices, they make a callback to our webhook APIs. So, only way to send messages and ensure sequence is to:
- queue the message by calling their APIs
- wait for the ack which says message is sent
- repeat
Now how do I design a system which does this, at scale?
To give more details:
-
The messages to be sent are written to a queue (GCP PubSub) and they are ordered within a context of the user/room. However, messages aren’t batched. For example, the queue could look like this:
+----------------------------------------------------------+ +----+ +----+ +----+ +----+ +----+ +----+ +----+ +------> | A7 | | B3 | | A6 | | A5 | | C5 | | C4 | | B2 | +-----> +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----------------------------------------------------------+
If A, B, C are different users, you can see that in this queue, for a user, their messages are ordered.
- I would like to have multiple workers which consume from this queue.
- I would like to have multiple webhook servers which listen to the acks from the external APIs
- Sometimes I never get the ack from the external APIs. So, I would like have a timer so that if I haven’t received the ack within X milliseconds, then mark the current message as sent and proceed to next.
Problems:
- Sharing the state between worker and webhook servers. The webhook server somehow needs to communicate to the worker that this message been sent to the user.
- Processing messages for different rooms in parallel. If for some reason, it is taking more time to send the message to one user, then it should not slow down for other users.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK