2007-07-03

Node sharing in the Rete network

Joe asked about beta node sharing on the Ruleby mailing list a couple of days ago. Whilst I answered Joe's question, Peter managed to tell him what he wanted to know. He described, with a short example, how node sharing is done in the beta network. When I read his description I couldn't help but to think about how I've read a paper about different strategies for sharing nodes in a Rete network. However, I can't remember which paper it is.

It is quite possible that I dreamed the whole thing or that I've misinterpreted a paper describing something related (or possibly even something completely different ;-). I've re-read the relevant sections of all Rete papers I've got and I've checked the source codes of a number of Rete implementations and I've found nothing. They all do (more or less) the same thing.

[2007-07-05] Update: Turns out I'm not very crazy, the paper I was thinking about is An Optimization Algorithm for Production Systems by Toru Ishida. However, I think I may have misinterpreted it since I've only had time to skim through it and it appears that he mostly concentrates on topology transformation which I guess is more or less equivalent to re-ordering the conditional elements of a rule. It has little or nothing to do with manipulating the actual sharing of nodes from a rule compiler's perspective.

The following CLIPS rules and their compilation messages should come as no surprise to those familiar with Rete.

(defrule A
(foo ?foo ?bar)
(bar ?bar ?baz)
(baz ?baz ?qux)
(qux BLAH)
=> )
Compilation output is, as expected:
+j+j+j+j
If we define another rule
(defrule B
(foo ?foo ?bar)
(bar ?bar ?baz)
(baz ?baz ?qux)
(qux BLECH)
=> )
most of the nodes are shared:
=j=j=j+j
If we define a third rule, with the exact same conditional elements but put the last element first instead
(defrule C
(qux BLABLAH)
(foo ?foo ?bar)
(bar ?bar ?baz)
(baz ?baz ?qux)
=> )
NO nodes are shared
+j+j+j+j
This is one of the gotchas with Rete-based engines. The rule author needs to know how the engine compiles the Rete network so that he/she won't write rules that create an unneccessary large network. But considering that database vendors seems to have come quite far with techniques for optimizing SQL query statements (regardless of how clumsily the author has written them) I'm guessing "something" could be done for Rete implementations as well.

I can think of ways to compile the network to maximize sharing but that would require storing additional information in tokens which won't be very good because it will probably eat more memory than those extra nodes shared anyway.

Does anyone know if someone has tried to experiment with different techniques for sharing nodes? Or is this issue "solved" already in the sense that it cannot be fixed and current sharing techniques are as good as it gets?

3 kommentarer:

woolfel sa...

There's actually a very simple method of maximizing the beta node sharing.

Calculate the complexity of each conditional element and order the conditional elements by complexity. usually the more complex conditional elements are above the less complex. There's different ways of calculating the complexity. One way is to count the number of attributes the conditional element uses. Another is to include attribute count with bindings. Another potential method is to give each object a priority value, which tells the engine "transactions" object is more important than "PhoneNumber".

Beyond that, one can perform static analysis of the entire ruleset before generating the RETE network. The kinds of things one would want to look for:
1. how often is a given attribute used in the rules.
2. does an attribute participate in joins, bindings or functions
3. if the attribute a string or numeric value
4. count how many conditional elements the rules have

Once you have the information, you can order the conditional elements in optimal order. Another potential way to optimize the execution is to scan the database using the rules and determine which rules are "more likely" to fire. By scan the database, I mean translate each conditional element into a SQL query like "select count(*) where blah='blah'".

Johan Lindberg sa...

And that's more or less what Toru describes in the paper.

It's worth reading though because he describes a cost model and an algorithm to calculate a "cost" for each node in the Rete network. The costs are then used when choosing which topology variant to morph the network into.

The downside is that the production system has to be run once (to gather enough data for the cost algorithm to work with) and I'm not sure how it works if you change the rule base at run time.

woolfel sa...

there's some other papers that cover that type of approach. to my knowledge, if the ruleset changes at runtime, you would have to re-run the analysis. though, even without re-running the analysis, the performance may be sufficient.