Queuing jobs in a processing chain in Java
I am currently designing a correlation engine in Java which extracts data from PDF files and correlates it with structured data from a relational database, raising alerts where necessary.
Focusing on the processing of the PDF files, the system consists of:
A component which performs the custom extraction from the PDF.
A component which parses the sometimes unordered, unclean data into the required data structures.
A normalisation component which normalises the values for comparison.
And a component which interfaces with the DB (where the extracted data will be inserted with the rest of the data).
The components should be reusable in other processing chains, but they will all run on the same system initially.
I think it's wise to have some sort of buffering between components. Would JMS queueing be a sensible choice, or would it overcomplicate matters? I have been experimenting with a simple LinkedBlockingQueue, but the queue has to be passed between components, so it requires a master component which drives everything, and I am not sure that is desirable. Is there a standard way of approaching this problem?
I would use chained calls unless you have additional requirements.
loadPDF(new PDFExtractor(new PDFParser(new Normalizer(new DBEnricher(listener)))));
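To make the chaining idea concrete, here is a minimal sketch of how such stages could be wired together. The `Stage` interface and the `Parser`, `Normalizer` and `CollectingSink` classes are hypothetical stand-ins invented for this example, not the actual components from the question, and the "parsing" and "normalising" they do is deliberately trivial.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stage contract: a stage receives an item and hands its result to the next stage.
interface Stage<T> {
    void accept(T item);
}

// Stand-in for the parsing component: cleans up the raw text and forwards it.
class Parser implements Stage<String> {
    private final Stage<String> next;

    Parser(Stage<String> next) {
        this.next = next;
    }

    @Override
    public void accept(String raw) {
        next.accept(raw.trim());
    }
}

// Stand-in for the normalisation component: lower-cases the value and forwards it.
class Normalizer implements Stage<String> {
    private final Stage<String> next;

    Normalizer(Stage<String> next) {
        this.next = next;
    }

    @Override
    public void accept(String value) {
        next.accept(value.toLowerCase(Locale.ROOT));
    }
}

// Stand-in for the final stage (the DB component in the question): just records what arrives.
class CollectingSink implements Stage<String> {
    final List<String> received = new ArrayList<>();

    @Override
    public void accept(String value) {
        received.add(value);
    }
}

class ChainDemo {
    // Builds the chain from the last stage backwards and pushes every value through it.
    static List<String> run(List<String> rawValues) {
        CollectingSink sink = new CollectingSink();
        Stage<String> chain = new Parser(new Normalizer(sink));
        for (String raw : rawValues) {
            chain.accept(raw);
        }
        return sink.received;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("  Foo ", "BAR"))); // prints [foo, bar]
    }
}
```

Each stage only knows about the next stage through the interface, so the same component can be reused in a different chain without a master component or a shared queue.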
If you want multiple threads, I would process each file in a different thread using an ExecutorService thread pool.
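As a rough sketch of the one-file-per-task approach, assuming a hypothetical `processFile` method that stands in for running the whole chain on a single file (the class name, method names and pool size are all illustrative):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerFileProcessing {
    // Hypothetical placeholder for "run the full processing chain on one PDF".
    static String processFile(String fileName) {
        return "processed:" + fileName;
    }

    // Submits one task per file to a fixed-size pool and collects results in submission order.
    static List<String> processAll(List<String> fileNames, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String name : fileNames) {
                Callable<String> task = () -> processFile(name);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that file is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("processing a file failed", e.getCause());
        } finally {
            pool.shutdown(); // stop accepting new tasks; already submitted ones finish
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(Arrays.asList("a.pdf", "b.pdf"), 2));
    }
}
```

Because each file goes through the whole chain inside one task, the stages themselves need no queues between them; the pool's internal work queue provides the buffering.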