2007-08-26

Duck typing, again.

I have previously posted about duck typing and whether it should be possible to write a rule that doesn't check the Fact object's class name but instead checks for the existence of the attributes that are tested in the Alpha and Beta nodes.

A couple of months ago I concluded that it's best to leave that sort of thing out and instead require that the rule's definition include class names for each variable.

However, things are not so simple. Just because pyRete won't allow you to compile a "duck-typed" Rete Network doesn't mean the Fact objects passing through it will play nice as well.

Consider the following rule:

>>> @pyRete.rule
... def foo(a=Foo):
...     if a.n > 10:
...         pass

and this Fact class:

>>> class Foo(pyRete.Fact):
...     def __init__(self, n):
...         self.n = n

Using it as defined won't cause any problems, but unfortunately (or luckily, depending on how you view the world ;-) there's no stopping this type of behaviour:

>>> obj = Foo(n = 10)
>>> obj.n
10
>>> del obj.n
>>> obj.n

Traceback (most recent call last):
  File "", line 1, in
    obj.n
AttributeError: 'Foo' object has no attribute 'n'

That makes it obvious that the ObjectType node test won't be sufficient for pyRete. Even though the above (using del on an instance's attribute) is probably rather rare, there's no way to *know* whether the public interface of a Fact instance has been modified.

I think I'm going to have to implement the Rete Network more or less as described in Forgy's 1979 paper, where he tests the length and the contents of a Fact object in order to categorize it properly. In the 1982 article, these tests appear to have vanished and been replaced by a test on a "type" symbol in the first element of each fact, which, I believe, is what is now usually known as an ObjectType node.

Now, the only question is: should I distribute the tests among the Alpha and Beta nodes, or should I extend the ObjectType node with some attribute existence tests as well?
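
To make the second option a bit more concrete, here is a minimal sketch of an ObjectType test extended with attribute existence checks. The `Fact` stand-in and the `make_object_type_test` helper are made up for illustration; they are not pyRete's actual API.

```python
class Fact:
    """Stand-in for pyRete.Fact; hypothetical, for illustration only."""


class Foo(Fact):
    def __init__(self, n):
        self.n = n


def make_object_type_test(cls, required_attrs):
    """Build a test that checks the class and that every attribute exists."""
    def test(obj):
        return isinstance(obj, cls) and all(
            hasattr(obj, name) for name in required_attrs
        )
    return test


is_usable_foo = make_object_type_test(Foo, ["n"])

intact = Foo(n=10)
damaged = Foo(n=10)
del damaged.n  # the instance is still a Foo, but no longer has 'n'

print(is_usable_foo(intact))     # True
print(is_usable_foo(damaged))    # False
print(isinstance(damaged, Foo))  # True: a class-only test would let it through
```

The check only holds at the moment the fact enters the network; an attribute deleted afterwards would still slip past it.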

15 comments:

woolfel said...

Another solution to this is to notify the engine when an object definition changes. For example, when users re-declare a deftemplate, the engine is notified of the change and handles the network changes appropriately.

That's one thing in jamocha, which I haven't finished yet. Since jamocha's design is positional, if a deftemplate is modified such that a slot is removed, it means I have to either A) not allow the deletion of the slot, or B) re-calculate the network.

Handling model changes, whether through the interface or the class, isn't trivial :)

Johan Lindberg said...

Hi Peter,

Yes, you're right. Only problem is, I don't have the equivalent of a deftemplate.

One of the main ideas behind pyRete is to allow *any* Python object to be used as a Fact. The idea is that you should (at least in theory) be able to add pyRete to an existing project without modifying anything (or at least very little).

I could of course change that decision and require the user to define his/her fact objects, but I'd rather not.

Joe Kutner said...

I am taking the same approach with Ruleby. *Any* Ruby object can be used as a Fact. I feel this is important because the majority of its users are not rule-engine geared. If they were, they would be using CLIPS, jamocha or something faster/better.

Instead, the goal of Ruleby is to provide a rule-engine aspect to an existing program rather than have it be the focal point.

But in that same respect, duck-typing is big with the Ruby crowd. And I should probably be thinking about this more.

Johan, do you share nodes between classes? For example, can the conditions (Foo.n == 10) and (Bar.n == 10) share a node? This seems like it would be related to duck typing.

Johan Lindberg said...

Hi Joe,

> Instead, the goal of Ruleby is to
> provide a rule-engine aspect to an
> existing program rather than have
> it be the focal point.

That's very well put. I might steal that ;-)

> Johan, do you share nodes between
> classes? For example, can the
> conditions (Foo.n == 10) and
> (Bar.n == 10) share a node? This
> seems like it would be related to
> duck typing.

Currently (the code in Subversion today) they would not share a node, since they would be descendants of different ObjectTypeNodes.

I'm not sure about how to tackle that in the new design. It would definitely be possible to share such a node since the code that would be generated in both cases would be something like:

>>> if hasattr(obj, 'n') and \
...         getattr(obj, 'n') == 10:
...     pass  # propagate

I'll have to think about that though because I suspect it may have an effect on subsequent nodes.

I would have to find a way of "splitting" the tokens again if there are more conditional elements in the rules that are not shared. Something has to handle which facts go where. It's probably not a big deal but... like I said, I have to think about it.
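
As a rough sketch of what sharing could look like: identical conditions are looked up by a key so that `Foo.n == 10` and `Bar.n == 10` end up on the same node, and the node keeps its matches apart per class so they can be "split" again further down. `AttributeTestNode`, `Network` and `node_for` are hypothetical names for illustration, not the pyRete code.

```python
import operator


class AttributeTestNode:
    """Hypothetical alpha node: tests one attribute, ignoring the class."""

    def __init__(self, attr, op, value):
        self.attr, self.op, self.value = attr, op, value
        self.matches_by_class = {}  # class -> list of facts that passed

    def activate(self, obj):
        if hasattr(obj, self.attr) and self.op(getattr(obj, self.attr), self.value):
            # Keep facts apart per class so later nodes can pick their own.
            self.matches_by_class.setdefault(type(obj), []).append(obj)
            return True
        return False


class Network:
    def __init__(self):
        self.nodes = {}

    def node_for(self, attr, op, value):
        # Identical conditions map to the same node object.
        key = (attr, op, value)
        if key not in self.nodes:
            self.nodes[key] = AttributeTestNode(attr, op, value)
        return self.nodes[key]


class Foo:
    def __init__(self, n):
        self.n = n


class Bar:
    def __init__(self, n):
        self.n = n


net = Network()
foo_node = net.node_for("n", operator.eq, 10)  # from a rule on Foo
bar_node = net.node_for("n", operator.eq, 10)  # from a rule on Bar
print(foo_node is bar_node)  # True

for fact in (Foo(10), Bar(10), Bar(3)):
    foo_node.activate(fact)

print(len(foo_node.matches_by_class[Foo]))  # 1
print(len(foo_node.matches_by_class[Bar]))  # 1
```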

woolfel said...

Here's a thought. In Dr. Forgy's original paper, he organized the network by slot count rather than object type. Using that approach doesn't help with the case where an attribute is deleted.

I think a critical part of supporting duck typing is to inject a listener into the object, so the engine is notified the structure has changed. If the engine is notified, you can always recalculate the network.

Without some kind of listener interface, the engine would need to evaluate all nodes in the network to be complete.
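
In Python, one way such a listener might be injected is by overriding `__setattr__` and `__delattr__` on a base class. This is only a sketch under that assumption; `Engine`, `ObservedFact`, `attach` and `structure_changed` are invented names.

```python
class Engine:
    """Hypothetical engine that only records structural change notices."""

    def __init__(self):
        self.changes = []

    def structure_changed(self, obj, attr, kind):
        self.changes.append((type(obj).__name__, attr, kind))


class ObservedFact:
    """Hypothetical base class that reports added and deleted attributes."""

    _engine = None  # class-level default, replaced per instance by attach()

    def attach(self, engine):
        # Bypass our own __setattr__ so attaching is not reported as a change.
        object.__setattr__(self, "_engine", engine)

    def __setattr__(self, name, value):
        is_new = not hasattr(self, name)
        object.__setattr__(self, name, value)
        if is_new and self._engine is not None:
            self._engine.structure_changed(self, name, "added")

    def __delattr__(self, name):
        object.__delattr__(self, name)
        if self._engine is not None:
            self._engine.structure_changed(self, name, "deleted")


class Foo(ObservedFact):
    def __init__(self, n):
        self.n = n


engine = Engine()
obj = Foo(n=10)   # not attached yet, so nothing is reported
obj.attach(engine)
obj.n = 11        # value change only: no structural notice
del obj.n         # structural change
obj.m = 1         # structural change

print(engine.changes)  # [('Foo', 'n', 'deleted'), ('Foo', 'm', 'added')]
```

Note that this requires facts to inherit from a common base class, which sits uneasily with the goal of accepting any Python object as a Fact.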

Johan Lindberg said...

> Using that approach doesn't help
> with the case where an attribute
> is deleted.

That's true. We'll end up with a possibly incorrect network either way. But how should the engine handle the situation? Should it raise an Exception or silently ignore errors caused by interface mismatches?

If I distribute the hasattr tests along with the Alpha and Beta nodes I can at least detect the mismatch and log it before the VM raises an exception.
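
Something along these lines, as a minimal sketch: `guarded_test` and the logger name are hypothetical, and treating the mismatch as a failed match (rather than letting the AttributeError propagate after logging) is just one possible choice.

```python
import logging

logger = logging.getLogger("pyrete.sketch")  # hypothetical logger name


def guarded_test(obj, attr, predicate):
    """Run predicate on obj.<attr>; log and fail the match if attr is gone."""
    if not hasattr(obj, attr):
        logger.warning(
            "%s instance is missing attribute %r; skipping node test",
            type(obj).__name__, attr,
        )
        return False
    return predicate(getattr(obj, attr))


class Foo:
    def __init__(self, n):
        self.n = n


ok = Foo(n=11)
broken = Foo(n=11)
del broken.n

print(guarded_test(ok, "n", lambda n: n > 10))      # True
print(guarded_test(broken, "n", lambda n: n > 10))  # False, plus a warning
```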

I hope that any user that is bold/stupid enough to put him/herself in this situation won't be surprised that the outcome of a run() probably won't match expectations.

> If the engine is notified, you
> can always recalculate the
> network.

The problem is that it's the *instance* that's being modified, so re-calculating/compiling the network is actually the wrong way to go. If a user then creates another instance of the same class, the two instances' public interfaces won't match, and the new instance won't match the new nodes in the network.

There are ways to switch the class definition as well, and *then* it would be appropriate to re-compile the network: not to replace nodes, but rather to *add* nodes in order to support the new class definition (that just happens to have the same name as another one). Since we can never be certain that the user will stop using instances of the old class, we have to support both.
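
A small illustration of why both have to be supported: rebinding a class name creates a second, unrelated class, and instances of the old one live on. The `object_type_nodes` dictionary is a hypothetical stand-in for however the network keeps track of its ObjectType nodes.

```python
class Foo:
    def __init__(self, n):
        self.n = n


OldFoo = Foo
old_instance = Foo(n=10)


class Foo:  # rebinding the name creates a second, unrelated class
    def __init__(self, m):
        self.m = m


new_instance = Foo(m=10)

print(OldFoo is Foo)                     # False
print(OldFoo.__name__ == Foo.__name__)   # True
print(isinstance(old_instance, Foo))     # False
print(hasattr(old_instance, "n"), hasattr(new_instance, "n"))  # True False

# A registry keyed by the class object (not its name) keeps both alive.
object_type_nodes = {OldFoo: "node for old Foo", Foo: "node for new Foo"}
print(len(object_type_nodes))  # 2
```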

I'm hoping that this situation will never ever turn up in real life code but since I can't be sure I thought I'd best prepare.

The thing is, I used to *like* the fact that Python is absurdly dynamic and that everything happens at run-time. Now, I'm not so sure ;-)

woolfel said...

You're right that re-calculating the network is impractical. The only time one could recalculate the network is in between requests. Even then, it's only "feasible" for a small network and zero facts.

Using Robert Doorenbos's IAV design from the RETE-UL paper "could" make it easier to re-calculate, but it also requires the listener to notify the engine.

One potential approach is to get rid of the objectTypeNode and replace it with an attributeNode. This way, you can delete an attribute from an object. The downside is that you'd need to track the Entity-Attribute association some other way.

I don't think there's any easy way to support this level of dynamic models.

Johan Lindberg said...

> One potential approach is to get
> rid of the objectTypeNode and
> replace it with attributeNode.

Is there any particular reason you'd have the attribute check as (a) separate node/nodes? Currently I've bundled attribute checks together with the Alpha and Beta nodes. It felt simpler that way but I haven't thought too much about it.

> I don't think there's any easy
> way to support this level of
> dynamic models.

Amen to that! ;-)

woolfel said...

The reason for replacing the objectType node with an attributeNode is to group all nodes for the given attribute under it. By putting the attribute + object test with each node, you'd have to evaluate more nodes.
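
Roughly, in Python terms, and only as a sketch: `AttributeIndex` and its methods are invented for illustration, and it assumes facts keep their attributes in the instance `__dict__`.

```python
from collections import defaultdict


class AttributeIndex:
    """Hypothetical root that groups node tests by attribute name."""

    def __init__(self):
        self.tests = defaultdict(list)  # attr name -> [(label, predicate)]

    def add_test(self, attr, label, predicate):
        self.tests[attr].append((label, predicate))

    def activate(self, obj):
        """Return labels of tests passed, visiting only attributes obj has."""
        passed = []
        for attr, value in vars(obj).items():
            for label, predicate in self.tests.get(attr, ()):
                if predicate(value):
                    passed.append(label)
        return passed


class Foo:
    def __init__(self, n):
        self.n = n


index = AttributeIndex()
index.add_test("n", "n > 10", lambda n: n > 10)
index.add_test("n", "n == 11", lambda n: n == 11)
index.add_test("m", "m is set", lambda m: m is not None)

obj = Foo(n=11)
print(index.activate(obj))  # ['n > 10', 'n == 11']

del obj.n
print(index.activate(obj))  # []
```

The existence check happens once per attribute instead of once per node, and tests for attributes the object lacks are never visited.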

I haven't thought through it completely. I'll try to write an entry on it.

Johan Lindberg said...

> the reason for replacing the
> objectType node with an
> attributeNode is to group all nodes
> for the given attribute under it.
> by putting the attribute + object
> test with each node, you'd have to
> evaluate more nodes.

That's quite clever, and a lot closer to Forgy's 1979 Rete description than doing it my way. I'll give it a try. Thanks.

woolfel said...

I just posted an entry on the idea. Hope it helps.

woolfel said...

If you try the approach, I'd love to hear your results.

Johan Lindberg said...

> if you try the approach, I'd
> love to hear your results.

Of course. I'll publish them as soon as I get it all working.

This week has been a bit more hectic than I expected; I have barely even had time to read the posts on your blog.

woolfel said...

Having a baby tends to make free time disappear :) All the work is worth it.

Johan Lindberg said...

Yeah, that's true! But my inactivity is actually mostly due to a fever that I've had for the past few days. So most of my spare time has been spent sleeping instead.