Out of Core Rules Engine
Are there any implementations of production rule systems that operate out of core?
I've checked out the open source implementations like CLIPS and Jess, but these only operate in memory, so they tend to crash or force heavy disk swapping when operating on large numbers of facts and rules (e.g. in the billions/trillions).
I'm playing around with the idea of possibly porting a simple rules engine, like Pychinko to a SQL backend, using Django's ORM. However, supporting the level of functionality found in CLIPS would be very non-trivial, and I don't want to reinvent the wheel.
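To make the idea concrete, here's a minimal, hypothetical sketch (plain `sqlite3` rather than Django's ORM, and invented table/column names) of storing facts as rows and evaluating one rule's conditions as a SQL join, so the working set can live on disk instead of in memory:

```python
# Hypothetical sketch: facts as subject/predicate/object rows in SQLite.
# The rule "?x grandparent ?z if ?x parent ?y and ?y parent ?z" becomes
# a self-join, letting the database engine handle paging and indexing.
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path would keep facts out of core
conn.execute("CREATE TABLE facts (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [("alice", "parent", "bob"), ("bob", "parent", "carol")],
)

# The rule's two conditions join on the shared variable ?y.
rows = conn.execute(
    """SELECT a.subject, b.object
       FROM facts a JOIN facts b ON a.object = b.subject
       WHERE a.predicate = 'parent' AND b.predicate = 'parent'"""
).fetchall()
print(rows)  # [('alice', 'carol')]
```

This only covers simple conjunctive rules; CLIPS-level features (negation, salience, truth maintenance) would need much more machinery on top.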
Are there any alternatives for scaling up a production rule system?
You could check Jena and similar RDF rule engines, which are designed to work with very large fact databases.
This isn't a direct answer to your question, but it may give you a line of attack on the problem.
Back in the '80s and '90s we fielded an information retrieval system that allowed for very large numbers of standing queries. Specifically, we had systems with 64MB of memory (which was a buttload in those days) ingesting upwards of a million messages a day and applying 10,000 to 100,000+ standing queries against that stream.
If all we had done was iteratively apply each standing query against the most recent documents, we would have been dead meat. What we did instead was perform a sort of inversion of the queries, specifically identifying the must-have and may-have terms in each query. We then used the term list from the document to find those queries that had any chance of succeeding. The customers learned to create queries with strong differentiators and, as a result, sometimes only 10 or 20 queries had to be fully evaluated.
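The inversion described above can be sketched roughly like this (a toy illustration with made-up query IDs and terms, not our original system): index each standing query by its must-have terms, then use a document's term set to pull only the queries that could possibly match.

```python
# Hypothetical sketch of query inversion: term -> queries requiring it.
from collections import defaultdict

queries = {
    "q1": {"must": {"missile", "export"}, "may": {"treaty"}},
    "q2": {"must": {"wheat", "futures"}, "may": set()},
    "q3": {"must": {"export"}, "may": {"embargo"}},
}

# Invert: map each must-have term to the queries that require it.
index = defaultdict(set)
for qid, q in queries.items():
    for term in q["must"]:
        index[term].add(qid)

def candidates(doc_terms):
    """Return only the queries worth fully evaluating for this document."""
    hit = set()
    for term in doc_terms:
        hit |= index[term]
    # A query survives only if ALL of its must-have terms appear.
    return {qid for qid in hit if queries[qid]["must"] <= doc_terms}

doc = {"missile", "export", "embargo", "treaty"}
print(sorted(candidates(doc)))  # ['q1', 'q3']
```

The same trick might apply to rules: index each rule by the fact patterns it cannot fire without, and only run full matching on rules whose required patterns are present.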
I don't know your dataset, and I don't know what your rules look like, but there might be something similar you could try.