2007-12-14

MultiField variables in pyRete

I've been thinking about MultiField variables quite a bit these past few weeks. I want to fit them into pyRete. However, they're a bit different from SingleField variables - as this snippet shows:

CLIPS> (deftemplate Person (multislot name))
CLIPS> (assert (Person (name Astrid Ingrid Eowyn Lindberg)))
[fact-1]
CLIPS> (defrule match-names
   (Person (name $?n1 $?n2))
   =>
   (printout t $?n1 " " $?n2 crlf))
CLIPS> (run)
() (Astrid Ingrid Eowyn Lindberg)
(Astrid) (Ingrid Eowyn Lindberg)
(Astrid Ingrid) (Eowyn Lindberg)
(Astrid Ingrid Eowyn) (Lindberg)
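(Astrid Ingrid Eowyn Lindberg) ()

As you can see, a MultiField variable can, and will, take on every possible run of consecutive values in a multislot: the rule fires once for each way of splitting the name between $?n1 and $?n2, empty splits included.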

Given the current design of pyRete, this means I'll have to generate all of those combinations *within* a Node's match method. I'm slightly worried that this will make processing quite a bit slower (I believe it's slower in CLIPS as well, but I haven't benchmarked it, so I can't say for sure).
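
To get a feel for how much work that is, here is a minimal sketch of the enumeration in plain Python. It is not pyRete's match code; multifield_splits is a hypothetical helper name, and it assumes the pattern consists only of MultiField variables, as in the CLIPS rule above. For n values and k variables it produces C(n + k - 1, k - 1) splits, so 5 for four names and two variables.

```python
from itertools import combinations_with_replacement


def multifield_splits(values, n_vars):
    """Yield every way to split `values` into `n_vars` contiguous,
    possibly empty, segments (hypothetical helper, not part of pyRete)."""
    if n_vars < 1:
        raise ValueError("need at least one MultiField variable")
    values = tuple(values)
    # Pick n_vars - 1 non-decreasing cut positions between 0 and len(values).
    positions = range(len(values) + 1)
    for cuts in combinations_with_replacement(positions, n_vars - 1):
        bounds = (0,) + cuts + (len(values),)
        # Consecutive pairs of bounds delimit one segment each.
        yield tuple(values[a:b] for a, b in zip(bounds, bounds[1:]))


names = ("Astrid", "Ingrid", "Eowyn", "Lindberg")
for n1, n2 in multifield_splits(names, 2):
    print(n1, n2)
```

The splits come out in the same order as the CLIPS output above, starting with an empty first segment. Adding a third variable to the same four names already gives 15 splits, which is why the cost worries me.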

The other thing is that CLIPS has no way of nesting multifield structures: you cannot have a multifield of multifields. In Python, nested tuples/lists/dicts are common, and I'd like to end up with an implementation that supports them.

The question is how best to do that.

In pyRete today there's no equivalent to Single or MultiField variables. Variables are bound to fact-instances and not parts of a fact (attributes of an object). You *can* introduce a variable in the RHS that binds to a part of a fact, but that cannot be done using pattern-matching.

>>> def rule(foo = Foo):
...     if foo(something = 100):
...         something_else = foo.something_else
...         # ...

So what we're talking about here is basically the possibility to do this:

>>> def rule(foo = Foo):
...     if foo(something = 100, something_else = something_else):
...         # ...

However, this will of course let you do far more sophisticated things than that. For example, you could "filter" out facts based on attribute values in another fact:

>>> def rule(foo = Foo, bar = Bar):
...     if foo(something = var) and bar(something_else = var):
...         # ...
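
Spelled out in plain Python, the intended meaning of that shared variable is roughly the join below. This is only a sketch of the semantics, not of how pyRete evaluates the rule: the Foo and Bar dataclasses and the join_on_var function are stand-ins made up for illustration, and a Rete network would keep partial matches in memory rather than rescanning every pair.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Foo:
    something: int


@dataclass
class Bar:
    something_else: int


def join_on_var(foos, bars):
    """Yield (foo, bar, var) for each pair that agrees on the shared variable."""
    for foo, bar in product(foos, bars):
        var = foo.something            # first occurrence binds var
        if bar.something_else == var:  # second occurrence tests against it
            yield foo, bar, var


foos = [Foo(1), Foo(2)]
bars = [Bar(2), Bar(3), Bar(2)]
for foo, bar, var in join_on_var(foos, bars):
    print(foo, bar, var)
```

Only Foo(2) pairs up with anything here, once for each Bar whose something_else is 2.
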
My thinking so far is to use __ the way CLIPS uses $? and _ the way CLIPS uses ?. Named SingleField and MultiField variables would be _foo and __foo respectively. Variables default to SingleField, so foo and _foo are really the same thing.

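Unfortunately, names with leading underscores already carry meaning in Python: a single underscore conventionally marks a name as private, and a double underscore triggers name mangling inside class bodies. I'm not sure I can afford to care about that, though, since there's really no other way to distinguish variable names without breaking syntax rules (which I'm hoping to avoid).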
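
As a rough sketch of the naming convention (classify_variable is a hypothetical helper, not something pyRete has), telling the two kinds of variable apart could be as simple as checking the prefix. An empty base name stands for the anonymous wildcards _ and __.

```python
def classify_variable(name):
    """Return (kind, base_name) for a pattern variable name.

    Assumes the proposed convention: a "__" prefix means MultiField,
    a "_" prefix or no prefix means SingleField.
    """
    if name.startswith("__"):
        return "multi", name[2:]
    if name.startswith("_"):
        return "single", name[1:]
    return "single", name


for name in ("foo", "_foo", "__foo", "_", "__"):
    print(name, classify_variable(name))
```

The double-underscore check has to come first, since every __foo also starts with a single underscore.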

Also, I'd like to introduce a way of specifying a minimum, and possibly a maximum, number of values a MultiField may take on. That would allow you to, for example, say that all MultiField variables must have at least one value. My CLIPS example would then give the following result instead:
(Astrid) (Ingrid Eowyn Lindberg)
(Astrid Ingrid) (Eowyn Lindberg)
(Astrid Ingrid Eowyn) (Lindberg)

I don't really have a suggestion for syntax for that though, _foo_1toN doesn't really feel right :-)
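
Leaving syntax aside, the effect of such bounds can be sketched in plain Python. Again this is an illustration under assumptions, not pyRete code: bounded_splits, min_len and max_len are made-up names, and the same bounds are applied to every variable in the pattern.

```python
def bounded_splits(values, n_vars, min_len=0, max_len=None):
    """Yield every split of `values` into `n_vars` contiguous segments
    whose lengths all lie between min_len and max_len (inclusive)."""
    if n_vars < 1:
        raise ValueError("need at least one MultiField variable")
    values = tuple(values)
    if max_len is None:
        max_len = len(values)

    def split(start, remaining):
        if remaining == 1:
            # The last variable must take whatever is left.
            tail = values[start:]
            if min_len <= len(tail) <= max_len:
                yield (tail,)
            return
        longest = min(max_len, len(values) - start)
        for size in range(min_len, longest + 1):
            head = values[start:start + size]
            for rest in split(start + size, remaining - 1):
                yield (head,) + rest

    yield from split(0, n_vars)


names = ("Astrid", "Ingrid", "Eowyn", "Lindberg")
for n1, n2 in bounded_splits(names, 2, min_len=1):
    print(n1, n2)
```

With min_len=1 the two splits that leave one variable empty disappear, leaving the three results listed above.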

If you've got any suggestions or thoughts about this I'd love to hear them.

2 comments:

woolfel said...

Generally, multislots are used in two ways. The first is comparing whether two multislots are equal. The second is testing whether an element is in the multislot. In CLIPS and JESS, it's a test pattern.

Multislots are more expensive, but there are various techniques to get around that.

Johan Lindberg said...

Hm... well, both of those uses are actually covered by allowing lists in a fact's attributes and by allowing the use of Python's in operator in conditional elements. And that's probably enough for more than 99% of situations.