2006-09-05

The Devil is in the details!

As I continue to work on pyTER, there are some things that trouble me. Apart from getting the Rete algorithm in place without damaging it there are also other subtle little things that could be implemented in various ways. It's not that I can't figure out how to do it as much as I don't know which of all possible alternatives is correct.

One thing that I thought a lot about last week was Conflict Resolution. I was a bit confused because in most (high-level) descriptions I've read, I've interpreted Conflict Resolution as a way of sorting Rules (not Activations). Consider the following Conflict Set:

[ (Rule 1, [Fact 1, Fact2]),
(Rule 1, [Fact 3, Fact4]), ]
How do I determine which Activation to fire first?

Luckily, the CLIPS documentation includes the CLIPS 6.24 Basic Programming Guide which (in section 5.3) explains the built-in Conflict Resolution Strategies in detail. Previously I had read about Conflict Resolution in Jess In Action (chapter 7.5) and Expert Systems - Principles and Programming (chapter 1.9). But I couldn't really parse sentences like "... the most recently activated rules to fire first, ..." until I read the CLIPS document.

Another thing that I (still) worry about is the Fact repository. Whenever a Fact is asserted I do 3 things:
1) The Fact is deconstructed into WMEs (Python tuples of size 4).
2) I modify the Fact's Class by adding it's __setattr__ to a function that automatically notifies the Working Memory's modify method.
3) Pass all of the WMEs through the Rete Network.

I'm not sure how others (Jess, JBoss Rules etc) are doing this. Whether it's tightly coupled (as in pyTER now) or if the Fact objects are removed from the program's environment and kept in a "shadow" repository that's not accessible from the outside.

I can see upsides and downsides with each of the implementations but I need to read more and study whatever effects this has on the client program using pyTER. For now, I'll keep with the tightly coupled design and hope that Python's dynamic abilities won't mess things up downstream.

2 kommentarer:

woolfel sa...

Since Drools3 is open source, you can look at the code and save yourself some work. Sumatra's source code is also on sourceforge, so you can compare contrast the two implementations to see the effect of design decisions on performance and functionality.

There really is no "right" way to implement RETE. Just efficient and in-efficient implementations. For example, the original agenda implementation in Drools3 and in Sumatra was a simple list. Well that turns out to be really slow if there are alot of rules that match fully. It forces the agenda to search the entire agenda when an activation is removed. In activation in this case is an object that knows the facts that fire a given rule.

The solution that drools3, Sumatra and JESS take is to make it so removing an activation doesn't iterate over the entire list of activations in the agenda. You can see the impact of this in recent posts on my blog. Hope that helps save you some time.

peter lin

Johan Lindberg sa...

Thanks for the pointers. I really apprecieate it.

I've got the source code for Drools/JBoss Rules already and I regularly check little details in it. I'm downloading the snapshot zip of Sumatra as I'm writing this.