2006-09-11

About the Alpha Network

Peter has previously pointed me to a lot of resources about Rete and different implementations (Jess, JBoss Rules, Sumatra etc). I spent some time during the weekend going through source code, documentation and reading archived blog posts and I've come to realize that I have done things a bit differently in my implementation. Among other things I've read this, this and this.

Before I go into detail about what the differences are I need to tell you about the design of the Working Memory. There are two interesting methods in the public API: assertObject and retractObject.

Asserting an Object does 3 things:
1) The Object is indexed and added to the Object Repository, it's also given a name: objN (where N is a positive integer).
2) The Object's __setattr__ method is modified so that each property change in the Object notifies the WorkingMemory.
3) The Object is deconstructed into Facts (Python tuples) and each Fact is asserted.

>>> wm = WorkingMemory(None, None)
>>> class Foo(object):
... def __init__(self, n):
... self.n = n

>>> f = Foo(1)
>>> wm.assertObject(f)
# ==> ('obj1', 'instanceof', '==', 'Foo')
# ==> ('obj1', 'n', '==', 1)
The interesting thing about the above code is that what's passed to the Alpha Network is not the Object f but two tuples of size 4. RetractObject basically does the opposite of assert, it removes all of the indexes, resets the __setattr__ method and retracts all Facts associated with the Object.

Modifying a property in an asserted Object triggers a modification of the corresponding 'fact'.
>>> f.n = 2
# <== ('obj1', 'n', '==', 1) # ==> ('obj1', 'n', '==', 2)
I don't know whether or not this is a smart thing to do but it has definitely influenced the design of my Alpha Network implementation.

The LHS of a rule is also converted into a tuple and although the process is a bit different and I haven't really written a complete implementation for all cases yet I'll use a very simple example for demonstrational purposes.
>>> def aRule():
... if ?obj.n == 1:
... pass
The code above would be converted into this tuple:
(('?obj', 'n'), '==', 1)
which in turn will be compiled into the following Alpha Network:
{ '*': { 'n': { '==': { 1: { 'am': AlphaMemory1 }}}}}
As you can see, the Alpha Network is implemented as a dictionary of dictionaries. '*' indicates a variable. Adding two more tuples, for example:
(('?obj', 'n'), '==', 2)
(('?obj', 'p'), '==', 1)
changes the network into:
{ '*': { 'n': { '==': { 1: { 'am': AlphaMemory1 },
2: { 'am': AlphaMemory2 }}},
'p': { '==': { 1: { 'am': AlphaMemory3 }}}}}
The method addFactToReteNetwork is used to add a Fact to the Rete Network.
def addFactToReteNetwork(self, fact, _content = None, _network = None):
if _network is None:
_network = self.alphaNetwork
if _content is None:
_content = fact

for value in _content:
try:
_network['am'].alphaActivate(fact)
except KeyError:
pass

for v in ['*', value,]:
try:
node = _network[v]
if len(_content) > 1:
self.addFactToReteNetwork(fact, _content[1:], node)
else:
node['am'].alphaActivate(fact)
except KeyError:
pass
Whenever a Fact tuple reaches an Alpha Memory it triggers an activation. But more about that in another blog post. Now let's look at some of the differences.

1) I don't have an ObjectTypeNode.
2) I don't have a way of checking intra-element conditions.

I'm not too worried about the ObjectTypeNode. I compile type checks into:
('objN', None, 'instanceof', 'Class')
which can be tested in the Alpha Network just as any other property. I don't really know what to do about the intra-element conditions though.

Well, I've got some more reading to do and I guess I need to think about building the Alpha Network using a linked list instead. That would definitely solve the problem with intra-element conditions but it changes a lot of other things as well... but since I'm sort of starting fresh. I guess, at least, the timing's quite good ;-)

7 kommentarer:

woolfel sa...

One critical aspect of the ObjectTypeNode is it filters the objects. Implementation that remove the ObjectTypeNode end up scanning more nodes than necessary.

In other words, having an objectTypeNode reduces the cost of pattern matching. For small rulesets, leaving out the ObjectTypeNode doesn't have much effect, but once the rule count reaches 100, it starts to have a huge impact. Or put it another way, if there isn't an objectTypeNode, how many nodes would the rule engine need to check? How many of those alphaNodes should be checked at runtime?

By removing the ObjectTypeNode, the implementation is no longer RETE. Fuxi also removed the ObjectTypeNode. I tried to explain it to the author of Fuxi, but he claimed ObjectTypeNodes aren't needed, even though Forgy clearly says it's important. Hope that helps.

Johan Lindberg sa...

Thanks for the comment Peter, you make a good point.

I can see that not having the ObjectTypeNode to filter out large mismatches early may stress the Alpha Network more than neccessary.

The plan, or whatever I ought to call it, was to compile any type checks (these are, after all, optional in Python) into a tuple and have it filtered through as any other Fact.

The problem is of course that in my example. The two Facts are interrelated and I don't have an efficient strategy to NOT check the second tuple if the first one fails. In my first attempt I did this in the Beta Network. That was one of the reasons I posted as I did a couple of days ago.

In my example, this is not *really* a problem, because there are only two Facts, but it could have been a lot more and potentially all of them would have ended up in an AlphaMemory.

To put it another way, if I assert an object with 10 interesting properties (interesting means that they are used in at least one rule) and if the rules use type checks then the object will be broken up into 11 fact tuples. The type check will be first out and if it fails another 40 dictionary lookups and method calls are performed. Hmmmm...

Johan Lindberg sa...

Part of the problem is that Python uses duck typing and that I would either have to go against the design of the language and force those that use the rule engine to declare types for all of their rules or I would have to figure out a way to efficiently figure out whether or not this object matches the interface information provided in the rule.

I think I'll try out both of the approaches (it's still fun, so I don't mind the extra work :-) and see whichever works the best.

chimezie sa...
Den här kommentaren har tagits bort av bloggadministratören.
chimezie sa...

Fuxi also removed the ObjectTypeNode.

This is not correct and a misunderstanding on your part on how Horn-like description and first order logic rules were mapped to the RETE algorithm.

I've finalized a function-free N3 python reasoner. It uses iterator algebra for very high speed hash-based beta node joins.

In my transposition, the ObjectTypeNode is expressed explicitely in the rule language and builtin into the axioms (subsumption axioms) associated with the rules it uses.

You might find it interesting Johan:

http://copia.ogbuji.net/blog/2006-09-16/fuxi-compiling-and-evaluating

Johan Lindberg sa...

Thanks.

I'll have a look, it sounds interesting.

I've previously downloaded both the pychinko and FuXi source code but I must say that I'm not too familiar with RDF/OWL or the n3 notation. I'm having some problems with figuring out how to test and compare the match algorithm implementation with anything I've made since it's so different (at least in my eyes).

OTOH RDF and OWL seems interesting enough to learn more about so I'll spend some time trying to learn and hopefully I'll be able to make better use of your code.

chimezie sa...

Fuxi no longer depends on Pychinko. It used to use Pychinko for the RETE impl, but it's been rewritten to be stand alone - mostly due to the packaging headaches of Pychinko and as an oppurtunity to improve hashing techniques using additional native programming idioms.