Advice on MoM and large messages
I'm designing a system that will use JMS and some messaging software (I'm leaning towards ActiveMQ) as middleware. There will be fewer than 100 agents, each pushing at most 5000 messages per day through the queue.
The payload will be around 100 bytes per message. I expect roughly half (2500) of the messages to cluster around midnight, with the other half distributed fairly evenly during the day. The figures above are all on the higher end of what I expect. (Yeah, I'll probably eat that statement in the near future.)
There is one type of message where the payload will be considerably larger, say in the range of 5-50 MB. These messages will only be sent a few times per day from each agent.
My questions are: Will this cause me problems in any way or is it perfectly normal to send larger amounts of data through a message queue?
For example, will it reduce throughput (smaller messages queuing up) while dealing with the larger messages?
Or will the message queue choke on larger messages?
Or should I approach this in a different way, say sending the location of the data through JMS and letting the end receiver pick up the data elsewhere? (I was hoping not to have a special case, because of coupling, security issues, and extra configuration.)
I'm completely new to the practical details of JMS, so just tell me if I need to provide more details.
Edited: I accepted Andre's truly awesome answer. Keep posting advice and opinions, I will keep upvoting everything useful.
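For reference, the "send the location, fetch the data elsewhere" variant could look roughly like the sketch below. It uses only the Java standard library and assumes a directory that both sender and receiver can reach; the class name `ClaimCheck`, the `|`-separated reference format, and the checksum are illustrative choices, not part of JMS or ActiveMQ. The returned string is what would travel as the body of a small JMS text message.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

public class ClaimCheck {

    /** Writes the payload into sharedDir and returns "path|size|sha256" as the reference. */
    public static String store(Path sharedDir, byte[] payload)
            throws IOException, NoSuchAlgorithmException {
        Path file = sharedDir.resolve(UUID.randomUUID() + ".bin");
        Files.write(file, payload);
        return file.toAbsolutePath() + "|" + payload.length + "|" + sha256Hex(payload);
    }

    /** Reads the payload named by the reference and verifies its size and checksum. */
    public static byte[] fetch(String reference)
            throws IOException, NoSuchAlgorithmException {
        // Assumption: the path itself contains no '|' character.
        String[] parts = reference.split("\\|");
        byte[] payload = Files.readAllBytes(Paths.get(parts[0]));
        boolean sizeOk = payload.length == Long.parseLong(parts[1]);
        if (!sizeOk || !sha256Hex(payload).equals(parts[2])) {
            throw new IOException("payload does not match reference");
        }
        return payload;
    }

    /** Hex-encoded SHA-256 digest of the given bytes. */
    public static String sha256Hex(byte[] data) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("claim-check");
        byte[] big = new byte[5 * 1024 * 1024]; // stand-in for a 5 MB payload

        String reference = store(dir, big);
        // In a real system, 'reference' would be sent as the JMS message body here.
        byte[] received = fetch(reference);

        System.out.println("reference length: " + reference.length() + " chars");
        System.out.println("payload length:   " + received.length + " bytes");
    }
}
```

The trade-off is the one already mentioned: the broker only ever sees a short string, but the shared storage becomes a second channel that needs its own access control and cleanup.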
Larger messages will definitely have an impact, but the sizes you mention here (5-50 MB) should be manageable by any decent JMS server.
However, consider the following. While a particular message is being processed, the entire message is read into memory. So if 100 agents each send a 50 MB message to a different queue at around the same time, or at different times but the messages take a long time to dequeue, you could end up trying to hold 5000 MB worth of messages in memory. I have run into similar problems with 4 MB messages in ActiveMQ in the past, although there were more messages being sent than the figures mentioned here. If the messages are all sent to the same (persistent) queue, this should not be a problem, as only the message being processed needs to be in memory.
So it depends on your setup. If the theoretical upper limit of 5000 MB is manageable for you (and keep the 32-bit JVM limit of 2000 MB in mind), then go ahead. However, this approach clearly does not scale very well, so I wouldn't suggest it. If everything is sent to one persistent queue, it would probably be fine, but I would recommend putting a prototype under load first to make sure. Processing might be slow, but not necessarily slower than if the data is fetched by some other mechanism. Either way, I would definitely recommend sending the smaller messages to separate destinations where they can be processed in parallel with the larger messages.
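To make the arithmetic concrete, here is a small sketch of that worst-case estimate. The class and method names are made up for illustration, and the model is deliberately crude: it assumes each agent has at most a given number of large messages held in memory at once and ignores broker overhead.

```java
public class BrokerMemoryEstimate {

    /** Worst case in MB when every agent has inFlightPerAgent large messages in memory. */
    public static long worstCaseMb(int agents, int maxMessageMb, int inFlightPerAgent) {
        return (long) agents * maxMessageMb * inFlightPerAgent;
    }

    /** True if the worst case fits into the given heap budget (both in MB). */
    public static boolean fits(long worstCaseMb, long heapBudgetMb) {
        return worstCaseMb <= heapBudgetMb;
    }

    public static void main(String[] args) {
        // 100 agents, 50 MB each, one message per agent in memory at once.
        long separateQueues = worstCaseMb(100, 50, 1);
        // One shared persistent queue: only the message being processed is in memory.
        long singleQueue = worstCaseMb(1, 50, 1);
        // Heap available to this JVM, used here as a stand-in for the broker's heap.
        long heapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);

        System.out.println("separate queues: " + separateQueues + " MB, fits: "
                + fits(separateQueues, heapMb));
        System.out.println("single queue:    " + singleQueue + " MB, fits: "
                + fits(singleQueue, heapMb));
    }
}
```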
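A minimal sketch of the "separate destinations" idea, assuming the sender picks the queue by payload size. The queue names and the 1 MB threshold are hypothetical and would need tuning; the actual JMS send is omitted so the example runs without a broker.

```java
public class DestinationRouter {

    // Hypothetical queue names, not taken from any existing setup.
    static final String SMALL_QUEUE = "agents.small";
    static final String LARGE_QUEUE = "agents.large";

    // Assumption: anything of 1 MB or more counts as a large message.
    static final int LARGE_THRESHOLD_BYTES = 1024 * 1024;

    /** Returns the name of the queue a payload of the given size should go to. */
    public static String queueFor(int payloadBytes) {
        return payloadBytes >= LARGE_THRESHOLD_BYTES ? LARGE_QUEUE : SMALL_QUEUE;
    }

    public static void main(String[] args) {
        System.out.println(queueFor(100));              // typical 100-byte message
        System.out.println(queueFor(50 * 1024 * 1024)); // 50 MB message
    }
}
```

With separate consumers on each queue, a slow 50 MB transfer on the large queue does not hold up the 100-byte messages on the small one.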
We are running a similar scenario with a higher number of messages. We did it along the lines of Andre's proposal, using different queues for the large number of smaller messages (which are still ~3-5 MB in our scenario) and the few big messages, which are around 50-150 MB.
In addition to the memory problems already cited, we also encountered general performance issues on the message broker when processing a huge number of persistent large messages. The broker has to persist these messages to the filesystem somehow, and we ran into bottlenecks on that side.
Of course the message size has an impact on the throughput (in msgs/sec). The larger the messages, the smaller the throughput.