2008-04-04

Another attempt at pattern matching

In my previous post I'm talking about feature-tests without really explaining what I mean. So here's an attempt to re-phrase, expand on and give some background to my thoughts.

I spent yesterday re-reading parts of Forgy's thesis (using OPS2) and the AI Journal article (using OPS5) describing the Rete Algorithm.

The implementation described in the thesis uses a whole lot of nodes to test for features of the fact-structure as well as the values contained in a fact. This approach of classifying things is known as Structural Typing.

Compiling the MB11 production in OPS2 (see 2.2.1 Productions with one Condition Element in the thesis):

MB11 ( (Want (Monkey Near =P))
-->
(Want (Monkey On Floor)) )
produces no less than five nodes in the Rete Network:

1. Is the element a list of two subelements?
2. Is the first subelement Want?
3. Is the second subelement a list of three subelements?
4. Is the first subelement of the second subelement Monkey?
5. Is the second subelement of the second subelement Near?

The article, however, describes a different approach, based on Static Typing, which requires you to declare the structure of a fact before using it. This way, we can remove all feature-tests and replace them with only one test, an expression node (also known as an object type node).

The MB11 production for OPS5 looks like this:
(p mb11
(goal ^status active ^type walk-to ^object [p])
-->
(make goal ^status active ^type on ^object floor))
They're not the same, I know.

But still, from the description in the article this would be compiled into three nodes:

1. Is the element's class goal?
2. Is the value of the status attribute active?
3. Is the value of the type attribute walk-to?

It doesn't take too large a rulebase before the benefits of this approach start to show. But at the same time, it seems that we have also lost the possibility to match nested structures.

In OPS5, according to the OPS5 Reference Manual, there's a possibility to create a vector attribute which can hold several values. Here's part of an OPS5 dialog that shows what happens when you match a vector attribute:
OPS> (vector-attribute bar)
(BAR)
OPS> (literalize foo bar)
NIL
OPS> (p rule
(foo ^bar [bar])
-->
(write " [bar] = " [bar] (crlf)))
*
NIL
OPS> (make foo ^bar 1 2 3)
NIL
OPS> (run)
1. RULE 1 [bar] = 1
I haven't yet been able to match several values with one variable (as $? allows you to in CLIPS). I'm not sure if this is even possible in OPS5. Actually, in chapter 2.6 The Range of Applicability of the Algorithm of Forgy's thesis he describes some of the constraints that we have to work around. In short, they are: 1) facts must only contain constants and 2) it must be possible to determine (at compile-time) which subelement each condition matches. This last constraint applies to Multifield variables and is probably one explanation to why they're so much slower than regular variables.

My question in the previous post was: why aren't nested data structures supported? Don't get me wrong. I'm not out to convince anyone to change anything or question whatever decisions has been made. I don't know if the issue is related to Multifield variables or not and I'm not sure I really care. The situation is as it is and CLIPS works just fine for my needs.

But. I'd really, *really* love to learn more about the design decisions made in different production systems regarding this, and other, areas. If you know of papers, articles, books or magazines that contain information about OPS, CLIPS and/or ART history please let me know.

[Update 2008-04-05]: The article The OPS Languages - An Historical Overview in this issue of PC AI magazine looks interesting enough to order.

6 kommentarer:

Gary Riley sa...

ART supported nested lists. The return value constraint, =, was used to distinguish between an expression and a sub-list. So (assert (a (+ 3 (* 2 4) (* 5 3) 4))) would create the fact (a (+ 3 (* 2 4) (* 5 3) 4)) while (assert (b =(+ 3 (* 2 4) (* 5 3) 4))) would create the fact (b 30).

It's a critically useful feature, but only for a subset of applications. I suspect that is why so few tools implement it rather than performance/technical issues.

Johan Lindberg sa...

Hi Gary,

ART supported nested lists. ... So (assert (a (+ 3 (* 2 4) (* 5 3) 4))) would create the fact (a (+ 3 (* 2 4) (* 5 3) 4)) ...

Ok. Was it also possible to match anywhere within that expression or would you have to match the "outer" set of parens?

It's a critically useful feature, but only for a subset of applications. I suspect that is why so few tools implement it rather than performance/technical issues.

Yes, that makes sense and you're most probably right.

I'm still curious about Multifields though ;-) Where they available in ART as well? Or is that something that first appeared in CLIPS?

woolfel sa...

If only I had a copy of the old ART manual, I could check. I'll try to ask paul haley, he should know.

Johan Lindberg sa...

If only I had a copy of the old ART manual, ...

The manual? I want the source code ;-)

Gary Riley sa...

ART allowed you to match anywhere within a nested list. So (a (+ (?o ?v ?v) (* ?x ?y) ?z)) was a valid pattern. ART also supported multifield wildcards and variables in patterns.

Johan Lindberg sa...

Interesting. Now I really wish I had a copy of the source code.

The CL version of OPS5 I'm running won't even let me eval functions within a make call and it's equally unimpressed with quotes.

OPS> (make foo ^bar (+ 4 3))
?..+..functions cannot be used at top level
NIL
OPS> (make foo ^bar `(+ 4 3))
?..BACKQUOTE..functions cannot be used at top level
NIL
OPS> (make foo ^bar '(+ 4 3))
?..QUOTE..functions cannot be used at top level
NIL
OPS>