Ruleby - a Ruby Rule Engine

I just found the Ruleby project and I think it looks really promising.

Ruleby is a rule engine written in the Ruby language. It is a system for executing a set of IF-THEN statements known as production rules. These rules are matched to objects using the forward chaining Rete algorithm.
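The idea is easy to sketch outside any real engine. Below is a toy forward-chaining loop in plain Ruby. It is not Ruleby's API (the `Rule` struct, `forward_chain`, and the doubling rule are all invented for illustration); it only shows the match-and-fire cycle that the Rete algorithm exists to optimise:

```ruby
# A production rule: IF condition(fact) THEN assert action(fact).
# Purely illustrative; real engines match patterns over many facts.
Rule = Struct.new(:name, :condition, :action)

def forward_chain(facts, rules)
  loop do
    fired = false
    rules.each do |rule|
      # Match phase: find a fact that satisfies the rule and whose
      # derived fact is new (so the loop terminates).
      fact = facts.find do |f|
        rule.condition.call(f) && !facts.include?(rule.action.call(f))
      end
      next unless fact
      facts << rule.action.call(fact)   # Fire phase: assert the new fact
      fired = true
      break
    end
    break unless fired   # quiescence: no rule can fire any more
  end
  facts
end

doubling = Rule.new("double", ->(f) { f < 8 }, ->(f) { f * 2 })
forward_chain([1], [doubling])   # => [1, 2, 4, 8]
```

A Rete engine avoids re-running every condition against every fact on each cycle, as this naive loop does, by compiling the conditions into a network and propagating only changes to working memory through it.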

The code isn't available as I'm writing this, but that appears to be a configuration problem rather than a deliberate decision. I don't quite understand what they're saying on their benchmarks and open issues pages because I have trouble relating it to my understanding of how the Rete algorithm works. I'm sure it will all become clear once I can access the source code.

I really thought that Michael Neale's effort Ruby Rules would be the first pure-Ruby implementation of the Rete algorithm but unfortunately development seems to have slowed down.

22 comments:

woolfel said...

that description of network optimization is just node sharing. looks like these guys have some common misunderstandings about RETE. On the rubyforge website, you'll see they started in 2005 http://rubyforge.org/projects/ruleby/ and didn't really get anywhere for a long time :)

I came across ruleby early last year, but didn't see any activity. the disclaimer about manners makes me think they didn't implement the conflict resolution strategy correctly. I hope they keep slugging through it and get it complete. I really doubt it will be faster than RETE written in other languages, since ruby's VM is slow. though as long as it's fast enough, it should be useful.

Johan Lindberg said...

And I thought they were brand new :-)

I have a hard time figuring out how their implementation works from their examples and descriptions and I'm hoping to see some source code soon. I'll have to brush up my Ruby skills but that's another matter :-)

They say on the Benchmarks page that they needed a modification to make Manners work (which I assume is because of the CR strategy they've chosen). It seems to be based on the timestamp of activations, salience and recency of fact objects.

But what I'm most interested in is what they mean with:

For Ruleby, this benchmark ran well for 16 guests. However, as the number of guests increases to 32 and 64, the relative performance of the benchmark drops off. This is due to a hack/bug in the implementation of the Rete algorithm.

I really don't think that Ruleby (or pyRete for that matter) will ever be very fast (I hope I'm going to be proven wrong though).

The underlying languages are too slow for large sets of rules/facts. I do, however, find pyRete useful in many other ways: small sets of rules/facts, prototyping, explaining Rule Engines etc.

Charles said...

I don't really consider setting the salience of make_path higher than path_done to be a 'cheat'. The original benchmark specification not only does not define salience, it also does not specify a CR strategy. The closest it gets is in the accompanying documentation which states that Manners performs a depth-first search. For Miss Manners, getting the best performance from an engine depends almost entirely on ensuring that activations of the find_seating rule are ordered on the agenda by the recency of the Seating facts which they match. If the activation for the most recent Seating fact is always placed at the top of the agenda, the Miss Manners benchmark will operate in a depth-first fashion.
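That recency rule can be sketched in a few lines of Ruby. The structures here are hypothetical (no engine's real code): activations of equal salience are ordered so that the one matching the most recently asserted fact sits on top of the agenda.

```ruby
# Hypothetical depth-style agenda: order by salience, then by the
# timestamp of the most recently asserted fact in each activation.
Activation = Struct.new(:rule, :salience, :fact_timestamps)

def depth_agenda(activations)
  activations.sort_by { |a| [-a.salience, -a.fact_timestamps.max] }
end

agenda = depth_agenda([
  Activation.new(:find_seating, 0, [3, 5]),
  Activation.new(:find_seating, 0, [3, 7]),  # matches the newest Seating fact
  Activation.new(:path_done,    0, [1, 2])
])
agenda.first.fact_timestamps.max   # => 7: the newest Seating wins the top slot
```

Under an ordering like this, the engine always pursues the most recent Seating first, which is exactly the depth-first behaviour Manners relies on.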

It has been mistakenly assumed (e.g., in the JBoss Rules documentation, if memory serves me correctly) that if the CR strategy does not allow a depth-first approach, the benchmark breaks. This is not strictly true. It works, but the amount of work the engine undertakes rises so rapidly that it becomes infeasible to run the benchmark. A breadth-first approach in Miss Manners can take many years to provide a result. A breadth-first approach, in effect, calculates every candidate solution, although the benchmark would only ever report one of these. For the data files I use (I have generated my own data files using Miranker’s data generator compiled with the Microsoft C compiler because of inconsistencies in the data files published with the benchmark), I once calculated that there were 1.63e+9 candidate solutions for the 16-guest run, and 1.61e+178 candidate solutions for the 128-guest run. Amusingly, MS BRE uses a pseudo-random CR strategy, and this gives rise to the bizarre phenomenon of the benchmark taking wildly different times to complete a run depending on entirely arbitrary aspects of the facts that you assert. Change a hobby of a single guest, and the engine can complete the benchmark orders of magnitude faster! I had to search very carefully for an approach in my variant Manners implementation for MS BRE which allowed me to force depth-first without any major reduction in the amount of evaluation work undertaken. I did find a solution, incidentally, though there is a whole story to tell just on this alone (MS BRE automatically optimised my amended Rete, whereas Jess, CLIPS et al did not, so I had to push my change out of the rule set and into custom code in order to hide it from the RuleSet-to-Rete translator – hence you can’t see my change in the rule set itself!).

One wrinkle is that it is assumed that the depth-first approach in Manners is purely the result of choosing a depth-first CR strategy. In fact, things are not quite as simple as this. For example, in CLIPS, Manners only works in a depth-first fashion because of Rete, rather than CR features. CLIPS does use timestamps (‘timetags’), but only at the level of activations. An activation timestamp reflects the time at which the rule was activated, and not (directly) the time at which any individual fact was asserted. The fact that CLIPS performs Manners in a depth-first fashion may reflect recency ordering of WMEs at the level of the working memory, rather than the agenda, though I have not ascertained if this is the case.

In Manners, the salience problem is quite distinct from the depth-first issue. The Ruleby developers found correctly that if activations of the make_path rule are not ordered higher in the agenda than activations of the path_done rule, the benchmark breaks (I discovered the same thing when implementing my variant version of Manners for MS BRE). Manners only works unchanged on OPS5, CLIPS and Jess because each engine, by default, orders activations of different rules by complexity on the agenda. Each engine, however, has a different way of doing this. OPS5 does this via the default LEX CR strategy. CLIPS has an internal feature called ‘dynamic salience’ in which it silently assigns salience to each activation according to a simple complexity measure (different, incidentally, to the complexity measure used in its ‘Complexity’ CR strategy). For Jess, the complexity ordering is a little rough and ready, and is a co-incidental by-product of the way fact timestamp values are summed for each activation. The more facts an activation contains, the higher the summed value is likely to be! Hence activations of rules with more facts (a simple complexity measure) have higher timestamp values than activations of rules with fewer facts. By pure co-incidence, this allows Manners to perform correctly on Jess.
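The summed-timestamp by-product described above is easy to mimic in Ruby. The structures are invented, purely for illustration: summing fact timestamps makes activations with more facts tend to sort higher, giving a rough complexity ordering by accident.

```ruby
# Hypothetical sketch of a summed-timestamp tie-break: activations of
# equal salience are ordered by the sum of their fact timestamps, so
# rules matching more facts tend to land higher on the agenda.
Act = Struct.new(:rule, :timestamps)

def summed_timestamp_order(activations)
  activations.sort_by { |a| -a.timestamps.sum }
end

order = summed_timestamp_order([
  Act.new(:path_done, [4, 5]),      # two facts,   sum  9
  Act.new(:make_path, [4, 5, 6])    # three facts, sum 15
])
order.map(&:rule)   # => [:make_path, :path_done]
```

Note how fragile this is: the ordering holds only while the extra fact's timestamp keeps the sum higher, which is why it is fairly described as co-incidental.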

Engines differ considerably in terms of their support for CR, and I can't see that it is really possible to say that an engine has an 'incorrect' CR strategy implementation unless it obviously has a bug and/or fails to work as documented (though I note that most engines fail to adequately document their CR strategies). If Manners is going to be used as a general comparative benchmark for different engines, then it is reasonable to suggest that it should not depend on any specific CR strategy in order to operate correctly. It should certainly not depend on co-incidental or arbitrary features. This is one of several problems I have with accepting Manners as a good benchmark for comparative testing. Just like the Ruleby developers, I found that the benchmark (specifically, a variant of Manners) could not function correctly on MS BRE unless I specified salience for the same two rules.

Setting salience as described in the Ruleby article avoids any dependency on a specific form of CR in order to ensure correct operation. Personally, I think salience, which is supported by most engines, should be defined as a formal part of the benchmark. Importantly, simply setting the salience doesn't appear to reduce the amount of node evaluation performed by the Manners Rete. It just ensures that engines that don't share the same default CR behaviour as OPS5 nevertheless process the benchmark in broadly the same fashion.

In addition, to be entirely consistent, the are_we_done rule should have a higher salience than the continue rule, and the print_results rule should also specify higher salience than the all_done rule. Using salience in conjunction with control states (i.e., the context state slot in Manners) to control priority for rules that match the same state value is entirely acceptable IMHO. I explicitly set the salience of each rule accordingly in my variant Miss Manners for MS BRE.

Of course, my main problem in MS BRE was the lack of support for negated conjunction which unfortunately is the main feature tested by the benchmark. For Jess, 95.81% of all node evaluations occur in NotCE nodes for the 128 guest run, with almost all this work (202,929,954 out of 207,511,768 NotCE evaluations) done at one single node. I got almost identical results for JBoss Rules. My only recourse in MS BRE was to write custom code to track path and chosen fact assertions and retractions, and to provide a ‘strong’ statement of non-existence in lieu of weak negation. Hence, my variant cannot be used for comparative performance testing, although I am confident it is about as close to a true implementation of Miss Manners as it is possible to get for MS BRE.

Like Johan, I also don't understand the issue they are describing re iteration over working memory. It will be interesting to find out more in due course.

Charles said...

Oh, and for the sake of clarity, I should say that I am perfectly aware that other engines such as Jess have a 'dynamic salience' feature as well. The point is that in CLIPS, the dynamic salience value is automatically set according to a complexity measure, whereas in other engines, the feature is used to allow salience to be set dynamically at run-time. I can't recall if CLIPS supports this more general meaning of the term, and I can't be bothered to search through the manual or source code to find out :-)

Gary Riley said...

The default conflict resolution strategy in CLIPS places new activations at the topmost position in the agenda among activations of equal salience. No complexity measure or silent dynamic salience is used in determining placement. Thus for activations triggered by the assertion of a fact, it is the most recent fact in the activation which determines the position of the activation on the agenda. The ordering is unspecified for activations created by the assertion of the same fact.

CLIPS does have a dynamic salience feature, but it does not automatically set the salience to a complexity measure. The user specifies a global variable or function which is evaluated to determine the salience of activations. The evaluation can be set to occur either when the rule is defined, when the activation is created, or every execution cycle.
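That evaluate-on-demand behaviour is simple to sketch in Ruby. The class and names below are invented (CLIPS itself takes a global variable or function in the rule's declaration); the point is only that the salience expression is stored and can be re-evaluated at whichever moment the engine chooses:

```ruby
# Hypothetical dynamic salience: the user supplies an expression and
# the engine chooses WHEN to re-evaluate it (at rule definition, at
# activation creation, or every execution cycle).
class DynamicSalience
  def initialize(&expr)
    @expr = expr
  end

  def current
    @expr.call
  end
end

urgency = 1
salience = DynamicSalience.new { urgency * 10 }
at_definition = salience.current   # 10
urgency = 5                        # state changes at run-time...
at_activation = salience.current   # 50: same rule, new priority
```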

Charles Young said...

Thanks Gary. Can't argue with the author! Clearly my mistake. When I have a moment, I will dig out the copy I have of the CLIPS source code and try to work out why I misunderstood what was going on, and perhaps, if I may, seek further clarification.

Any other corrections very welcome (Peter?)

woolfel said...

I'll defer to gary's comment, since I really suck at reading C code. By suck, I mean I read C at a 3rd grade level :)

plus, I haven't looked at Clips code in over a year, so my rusty brain has a hard time remembering where to look.

Johan Lindberg said...

The default conflict resolution strategy in CLIPS places new activations at the topmost position in the agenda among activations of equal salience.

AFAICT from the Ruleby source code, Activations (which cannot be separated by Salience, Recency or "Timestamp") are put on the Agenda in the order of when the Rule was added to the engine.

But, I may be mistaken here because I read Ruby like a six-year old ;-)

Charles Young said...

Well, I looked again at the CLIPS source and yes, of course Gary is quite correct and I am hopelessly, hopelessly wrong. It would appear that when I was looking at this issue several months ago, I totally misinterpreted a piece of code in rulebin.c. Not only is this code not invoked when you run Miss Manners (it implements bsave), it doesn't even work the way I thought it did! All a little embarrassing really. Thanks to Gary for putting me right.

However, although I was quite incorrect, I have reminded myself of the behaviour I observed in CLIPS at the time, and would love an explanation of what is happening. The easiest thing is to explain how to reproduce the issue. Take the Miss Manners benchmark, and add a third condition to the path_done rule. It doesn't matter too much what this condition is, but you should keep it very simple – e.g., by adding (test (= 1 1)). This doesn't disturb the Rete very much (at least, not in Jess). Another not very disruptive change would be (last_seat).

Now run the benchmark and look carefully at the results. The test is broken. Now swap the positions of the path_done and make_path rules and re-try. The benchmark works again. There may be a possibility that you could observe the opposite results - i.e., it works to begin with, but breaks when you swap rule positions.

Try this in Jess, and the benchmark will always run correctly. Remove the additional condition, and instead declare salience on the path_done rule to give it a higher salience than make_path. This will break the benchmark just like adding the additional condition to path_done and swapping its position with make_path. However, in this case, the benchmark will be broken regardless of the position of path_done. It's broken in both CLIPS and Jess.

When I was experimenting a few months ago, I, oh so very mistakenly, thought I had found the answer. I was completely wrong. Playing around with this again tonight I note that, even if you add lots of additional conditions to path_done, the benchmark will still work correctly if path_done appears before make_path. This suggests that this behaviour is not based in any way on a count of the number of conditions in a rule. What is going on? Why does the order of declarative rules in a rule set influence the behaviour of CLIPS? To understand that is probably to be able to understand why Miss Manners is able to run unchanged in CLIPS.

One possibility is that the changes I described above are altering the order in which the join node that joins context on seating is performing activations. This node is shared between the two rules. If it activates its child nodes in a different order, then I guess this would result in activations being created in a different order and hence have different timetags. That would neatly explain why the benchmark can perform as if path_done had a higher salience. The path_done activations will have later timetags. Indeed, you can see by watching activations that the activations are indeed created in a different order depending on the position of the amended path_done rule. This is probably where the explanation lies. For some reason, adding an additional condition leads to a different activation order. Moving the amended rule causes the order to revert. I wouldn’t expect Jess to behave the same way because it computes a sum of the timestamps of each individual fact in the activation, and uses this to order activations of the same salience. If this is the explanation, then it shows how the Miss Manners benchmark can depend on quite co-incidental and arbitrary features of a specific engine, and why salience should, IMHO, be added to the benchmark definition.

jpkutner said...

Johan, you are correct about activations defaulting to the order in which the rules were added to the system. In fact, you can demonstrate this in the Miss Manners example. Switch the order of the make_path and path_done rules, and remove the negative salience from path_done. The benchmark will fire correctly.

I purposely left them in the opposite sequence to illustrate that they are dependent on some kind of forced ordering (rather than have it slip under the radar).

I am glad that you had time to look at the source code. However, you may find some gross mistakes in the implementation. I am very new to the Rete algorithm (within the last few months). But I am determined to fix these problems.

jpkutner said...

Actually, I was wrong... Switching the order of the make_path and path_done rules does break things along the way. I must have overlooked some other factor.

Johan Lindberg said...

Hi Joe.

I'm sure you'll manage to get the Ruleby Rete implementation in good shape soon enough.

I'm also working on a Rete implementation (using Python) and I'm also struggling to get the details right.

I look forward to trying Ruleby out in more detail. I am having some trouble in getting going though. Mostly because I don't know enough Ruby, but I'll try to fix that this summer. Is there any way to get at the examples? They weren't included in the gem.

Johan Lindberg said...

Is there any way to get at the examples? They weren't included in the gem.

Forget I asked! I just tried subversion again and managed to download the source code including the examples directory.

Matt Smith said...

Hi Johan,

Thanks for pointing out the examples did not make the released gem. We will do a new release soon and have the examples included in the gem.



woolfel said...

if path_done fires before make_path, it ends up creating a loop :)

as you already discovered, all the make_path activations must fire first and path_done should fire last.

Gary Riley said...

The path_done rule subsumes the make_path rule. Any set of facts which activates the make_path rule will also activate the path_done rule. This occurs not by coincidence, but because the pattern and join nodes generated for the path_done rule can be reused by the make_path rule. There's only one context fact and one seating fact that will match the first two patterns for both rules, so using either the OPS5 lex or mea strategy always results in the make_path rule being applied first because it has more patterns.

CLIPS doesn't create a separate terminal node for a rule in most cases, so subsuming rules usually share all of their joins with the rules that they subsume. In the case of the make_path and path_done rules, this means that the path_done rule will always be activated first since its terminal node will be encountered before the other join nodes of the make_path rule. Since the make_path activations follow, they end up being higher on the agenda. Switching the order of the rules doesn't make any difference because the joins are all shared regardless of the order the rules are encountered.

If you add an additional condition to the path_done rule, such as (test (= 1 1)), the rule no longer shares all of its joins with the make_path rule. Its position relative to make_path then affects the order in which the joins are visited since additional joins need to be added to the network.
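The activation order that falls out of shared joins can be sketched in Ruby (purely illustrative; nothing here is CLIPS internals). With a depth strategy, the newest activation goes on top, so the rule activated last at the shared join ends up firing first:

```ruby
# Depth strategy: the newest activation goes on top of the agenda.
# When all of path_done's joins are shared with make_path, the shared
# join reaches path_done's terminal first, then propagates the same
# partial match on through make_path's extra joins.
def propagate_shared_join(agenda)
  agenda.unshift(:path_done)   # activated first at the shared join
  agenda.unshift(:make_path)   # created later, so it lands on top
  agenda
end

agenda = propagate_shared_join([])
agenda.first   # => :make_path fires first, as Manners requires
```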

Charles Young said...

Thanks Gary. Very illuminating. I spent some time single-stepping through CLIPS to ensure I had understood correctly. I can see that the joinNode struct has a 'ruleToActivate' field containing a defrule struct. Looking at the PPDrive() procedure in drive.c, I see that if this field is not null, the join node will add an activation for the given rule to the agenda. I guess the defrule can be considered to be the terminal node, but rather than being activated as a 'nextLevel' node, it is simply passed to the AddActivation procedure. Once this has been done, the join node continues by passing partial matches to a 'nextLevel' joinNode, if any. As you say, because path_done subsumes make_path, the path_done activations will always be added to the agenda before make_path activations. This is slightly different to other implementations I'm familiar with which create and activate terminal nodes (e.g., Defrule objects per disjunct in Jess and TerminalNode objects per rule in MS BRE).

Doesn't this essentially mean that, at least in the limited context of subsumption, subsumed rules (which are, by definition, more 'complex') could be regarded as having the equivalent of a higher salience than subsuming rules? It's similar in Jess, of course, though for different reasons. Jess will generally tend to fire activations for rules with more matched facts before rules with fewer matched facts because the summed timestamp values will typically be higher. These subtle features can be quite important to understand - after all, Miss Manners cannot work correctly as-is in an engine that does not naturally give preference to make_path activations over path_done activations. That's what I mean by using the term 'co-incidence' - the benchmark requires preference to be given to make_path activations, but does not define salience explicitly. Some engines, by happy co-incidence, give make_path activations a higher preference over path_done activations. Others do not.

To my way of thinking, the dependency in CLIPS on rule order in my doctored version of the rule set serves to underline salience as an indispensable tool when defining rules. Even though the use of salience is sometimes deemed to compromise the declarative nature of rules, it remains a necessary method of control. I think that the Miss Manners benchmark would be improved by introducing explicit salience declarations, though I still don't think it serves well as a benchmark for comparison between different engines.

Charles Young said...

Re. Peter's point about loops if path_done fires before make_path...I looked up the JBoss Rules document which says "When manners is executed in CLIPS with breadth strategy, it may result in an infinite loop if path_done is fired before make_path activations. This is because path_done modifies the context and removes the activations for make_path rule." Peter is broadly correct. In this case, path_done activations will still have an earlier timetag value than make_path activations, but will fire first because of the breadth strategy. I ran Manners in CLIPS and sure enough, it goes into a never-ending loop. I say 'broadly' correct, because it turns out that you run into the same issue with the continue and are_we_done rules.

When I created the variant implementation for MS BRE, I had to set salience on path_done & make_path and continue & are_we_done. I also set salience on print_results & all_done. It was in this context that I discovered that a breadth-first approach works. This, I think, is reasonable. The depth-first search in Manners fundamentally depends on the engine always firing a find_seating activation for the most recently asserted seating during any one reasoning cycle. The three pairs of rules on which I set salience logically should have salience set in order to ensure correct functioning of the benchmark. So, I accept that if you use the CLIPS (or Jess) breadth-first CR strategy on Manners as-is, it breaks. If you set salience on the three pairs of rules, and then use a breadth strategy, it works, but does many more evaluations.

Gary Riley said...

I vote for salience. I'll even argue that the lack of salience in OPS5 and the subsequent reliance on the lex and mea strategies for correct execution compromises the declarative nature of the rules. Providing explicit mechanisms for the control of execution is a good thing. It allows you to separate the declarative aspects of your rules from the imperative aspects.

In the case of manners, the path_done rule doesn't have any declarative knowledge in it anyway. Most of the declarative knowledge, seating men next to women and seating people with similar hobbies next to each other, is contained in the find_seating rule. The other rules all deal with the implementation of the solution rather than the declaration of the problem.

The correct execution of manners is critically dependent on the make_path rule firing before the path_done rule. But even if you are familiar with the OPS5 resolution strategy, you can't know whether or not the programmer intended one rule to execute before the other. In this case, assigning a salience to the rule that controls execution is actually declarative.

Another aspect of the Manners rules which is non-declarative is the use of the context facts in the rules. I use this technique all the time, particularly for small programs, but this is another case of mixing implementation details in with the declarative knowledge of the program. I think supporting multiple agendas (a.k.a. CLIPS modules) is a better, more explicit mechanism for specifying that one group of rules should execute before another group.

Charles Young said...

Excellent. So the Ruleby developers shouldn’t feel bad about defining salience to get Miss Manners to work on their engine.

To me, the issues around declarativity and control in Rete sound eerily familiar, and similar to the issues of 'purity' debated by the proponents of functional languages like Scheme and Haskell. Put very simplistically, the more a language avoids side effects, the closer it gets to being purely declarative. However, a truly 'pure' functional language can't really, in itself, be used to build real-world programs. Haskell claims to safeguard functional purity by using Monads to control interactions with state. In other languages, there is a greater or lesser degree of 'compromise' in which procedural approaches are used alongside functional code. These languages typically try to provide a fairly 'clean' separation of imperative and declarative code.

Production engines are highly stateful (working memories, token lists, agendas, etc.) and internal functions such as node activation directly result in all kinds of side effects. It shouldn't come as a surprise, then, that rule developers typically have to combine a degree of proceduralism with their declarative rules.

More to the point, the discussion above shows that different engines exhibit subtle but important differences in their side effects. Where, I wonder, does all this leave efforts like the OMG's PRR with its stated goal of "allowing interoperability across different vendors providing production rule implementations"? Will a standardised representation of production rules ever be of any practical use in allowing rules to be portable across different engines? I suppose that all depends on how well engine vendors can automate the production of PSMs from PRR PIMs. I suspect that in reality rule developers will still have to do a lot of hand-crafting.

Johan Lindberg said...

Since the discussion took off in this direction, I started to think a couple of days ago about what a formalisation of a Rete/Production System would look like.

I didn't find much work done on the subject and I'm not sure if that's because not many have tried or if my Google skills aren't good enough.

There was at least one paper, Production Systems and Rete Algorithm Formalisation, Rapport Intermédiaire, Oct 2004, about the subject that I could find. But the formalisation only concludes that a Rete/Production System must have a Conflict Resolution mechanism that computes which Activation to fire from a given Conflict Set.

And, as this discussion has shown, there are *many* ways (more than I thought actually ;-) to select that Activation even if all engines are using "depth" strategy.

I'm not saying that someone should try to formalise the inner workings of Conflict Resolution so that they can force implementation details on engine vendors. But maybe there can be some agreement on two, maybe three "new" and more detailed strategies that could be used across engine implementations (and we'd need a test-suite to prove it of course).

As you point out, I'm not sure if the common formats/languages will work in all situations. I don't have much hope for it to be honest but I really hope that properly written rules (with proper use of salience and modules) should work out most of the time. Ms Manners, however, won't! I think we can all agree on that ;-)

Johan Lindberg said...

It just occurred to me that what I'm looking for is not so much formalisation as it is standardization.