Techniques for querying a set of objects in-memory in a Java application

2022-12-30 10:59

We have a system which performs a 'coarse search' by invoking an interface on another system which returns a set of Java objects. Once we have received the search results I need to be able to further filter the resulting Java objects based on certain criteria describing the state of the attributes (e.g. from the initial objects return all objects where x.y > z && a.b == c).

The criteria used to filter the set of objects each time are partially user configurable. By this I mean that users will be able to select the values and ranges to match on, but the attributes they can pick from will be a fixed set.
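As a rough illustration of what "fixed attributes, user-chosen ranges" could look like in code, here is a minimal sketch. The `Trade` class, the `TradeAttribute` enum, `RangeCriterion` and `CriteriaFilter` are hypothetical names invented for this example, not the real domain classes, and the sketch assumes Java 8 or later.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.ToDoubleFunction;
import java.util.stream.Collectors;

// Hypothetical domain object; the real ones would be the Hibernate/JPA entities.
final class Trade {
    final double price;
    final int quantity;

    Trade(double price, int quantity) {
        this.price = price;
        this.quantity = quantity;
    }
}

// The fixed set of attributes a user may filter on.
enum TradeAttribute {
    PRICE(t -> t.price),
    QUANTITY(t -> t.quantity);

    private final ToDoubleFunction<Trade> extractor;

    TradeAttribute(ToDoubleFunction<Trade> extractor) {
        this.extractor = extractor;
    }

    double read(Trade trade) {
        return extractor.applyAsDouble(trade);
    }
}

// One user-supplied criterion: an inclusive range on one of the fixed attributes.
final class RangeCriterion {
    final TradeAttribute attribute;
    final double min;
    final double max;

    RangeCriterion(TradeAttribute attribute, double min, double max) {
        this.attribute = attribute;
        this.min = min;
        this.max = max;
    }

    Predicate<Trade> toPredicate() {
        return trade -> {
            double value = attribute.read(trade);
            return value >= min && value <= max;
        };
    }
}

final class CriteriaFilter {
    // Keeps only the trades that satisfy every criterion (logical AND).
    static List<Trade> filter(List<Trade> trades, List<RangeCriterion> criteria) {
        Predicate<Trade> all = trade -> true;
        for (RangeCriterion criterion : criteria) {
            all = all.and(criterion.toPredicate());
        }
        return trades.stream().filter(all).collect(Collectors.toList());
    }
}
```

Because the attribute set is an enum, the user interface can only ever offer attributes the code knows how to read, while the range bounds remain free-form user input.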

The data sets are likely to contain <= 10,000 objects for each search. The search will be executed manually by the application user base probably no more than 2000 times a day (approx). It's probably worth mentioning that all the objects in the result set are known domain object classes which have Hibernate and JPA annotations describing their structure and relationship.

Possible Solutions

Off the top of my head I can think of 3 ways of doing this:

1. For each search, persist the initial result set objects in our database, then use Hibernate to re-query them using the finer-grained criteria.
2. Use an in-memory database (such as hsqldb?) to query and refine the initial result set.
3. Write some custom code which iterates the initial result set and pulls out the desired records.

Option 1

Option 1 seems to involve a lot of toing and froing across a network to a physical Database (Oracle 10g) which might result in a lot of network and disk activity. It would also require the results from each search to be isolated from other result sets to ensure that different searches don't interfere with each other.

Option 2

Option 2 seems like a good idea in principle as it would allow me to do the finer query in memory and would not require the persistence of result data which would only be discarded after the search was complete. Gut feeling is that this could be pretty performant too but might result in larger memory overheads (which is fine as we can be pretty flexible on the amount of memory our JVM gets).

Option 3

Option 3 could be very performant but is something I would like to avoid, as any code we write would require such careful testing that the time taken to achieve something flexible and robust enough would probably be prohibitive.

I don't have time to prototype all 3 ideas so I am looking for comments people may have on the 3 options above, plus any further ideas I have not considered, to help me decide which idea might be most suitable. I'm currently leaning toward option 2 (in memory database) so would be keen to hear from people with experience of querying POJOs in memory too.

Hopefully I have described the situation in enough detail but don't hesitate to ask if any further information is required to better understand the scenario.

Cheers,

Edd

Options 1 and 2 are quite compatible: by implementing one you can replace it with the other with simple reconfiguration of persistence.xml (given that in-memory database is JPA compatible, e.g. JavaDB, Derby, etc.).

Option 3 is re-implementing both third-party software (database) and your own code (existing JPA entities). You also listed its advantages as concerns. It's clearly a less feasible option in your case. I can't think of anything else to promote Option 3 either.

An in-memory database seems more suitable given the use cases and how short-lived the data is. If the requirements become less transient, you can switch to Oracle.

If your expressions are not too complex, you can use an expression language for evaluating string queries on your Java objects (POJOs). I can recommend MVEL http://mvel.codehaus.org .

The idea is that you put your objects into an MVEL context, provide a string query written in MVEL's simple notation, and then evaluate the expression.

Example taken from MVEL site:

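import java.util.HashMap;
import java.util.Map;
import org.mvel2.MVEL;  // assumes MVEL 2.x; the package name differs in older releases

Map vars = new HashMap();
vars.put("x", new Integer(5));
vars.put("y", new Integer(10));

Integer result = (Integer) MVEL.eval("x * y", vars);
assert result.intValue() == 50;  // Mind the JDK 1.4 compatible code :)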

Usually expression languages support traversing your object graph (collections) and accessing members in JSP EL style (dot notation).

Also, I can suggest looking at OGNL (google it, I can't add more than one link)

How complex are the refining criteria? If the majority are quite simple, I'd be tempted to go for option (3) to start with, but make sure it's encapsulated behind a suitable interface so that if you come across something that is too complex or inefficient to code up yourself you can switch to the in-memory DB at that point (either wholesale for all queries, or just for the complex ones if there's an overhead in setting up the temporary tables).
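To show what "encapsulated behind a suitable interface" might look like, here is a minimal sketch. All names (`SearchResult`, `RefineCriteria`, `ResultRefiner`, `IteratingRefiner`) are hypothetical, and the two fields stand in for the `x.y > z && a.b == c` example from the question. The criteria are deliberately kept as plain data rather than code, on the assumption that a database-backed implementation would later need to translate them into a query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one of the domain entities.
final class SearchResult {
    final int y;
    final String b;

    SearchResult(int y, String b) {
        this.y = y;
        this.b = b;
    }
}

// Criteria held as data, so any implementation can interpret them its own way.
final class RefineCriteria {
    final int yGreaterThan;
    final String bEquals;

    RefineCriteria(int yGreaterThan, String bEquals) {
        this.yGreaterThan = yGreaterThan;
        this.bEquals = bEquals;
    }
}

// The seam: callers depend only on this interface.
interface ResultRefiner {
    List<SearchResult> refine(List<SearchResult> coarseResults, RefineCriteria criteria);
}

// Option 3: a single pass over the coarse result set.
final class IteratingRefiner implements ResultRefiner {
    @Override
    public List<SearchResult> refine(List<SearchResult> coarseResults, RefineCriteria criteria) {
        List<SearchResult> matches = new ArrayList<>();
        for (SearchResult candidate : coarseResults) {
            if (candidate.y > criteria.yGreaterThan && criteria.bEquals.equals(candidate.b)) {
                matches.add(candidate);
            }
        }
        return matches;
    }
}
```

If a query later turns out to be too complex for the hand-written loop, a second `ResultRefiner` implementation backed by the in-memory DB could be dropped in without touching the calling code.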

Option 2 seems good, since you can switch between options 1 and 2 as needed. Option 3 is also limited if data sizes grow in future. Querying the objects directly would also make you more dependent on the code structure for both storage and querying.

It would probably be a good idea to include a caching mechanism (Ehcache/memcached) alongside Option 2, and then profile to check the performance difference.
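As a rough idea of what such caching could do, here is a minimal sketch that keeps the most recently used coarse result sets in memory so repeated refinements do not re-invoke the remote search. Note that a plain `LinkedHashMap`-based LRU stands in for Ehcache/memcached here, purely to keep the example dependency-free; `CoarseResultCache` and its method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical LRU cache of coarse search results, keyed by the search request.
final class CoarseResultCache<K, V> {
    private final Map<K, List<V>> entries;

    CoarseResultCache(final int maxEntries) {
        // accessOrder = true keeps the least recently used entry first,
        // so removeEldestEntry evicts it once the cache is over capacity.
        this.entries = new LinkedHashMap<K, List<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, List<V>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the cached result set, running the coarse search only on a miss.
    synchronized List<V> get(K searchKey, Function<K, List<V>> coarseSearch) {
        List<V> cached = entries.get(searchKey);
        if (cached == null) {
            cached = coarseSearch.apply(searchKey);
            entries.put(searchKey, cached);
        }
        return cached;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Whether this helps at all depends on how often users refine the same coarse search more than once, which is what the profiling would need to show.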

