"The Combs Method For Rapid Inference"
Copyright ã 1997 by William E. Combs.
All rights reserved. No part of this document may be reproduced or transmitted in any form by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from William E. Combs.
William E. Combs
The Boeing Company
P.O. Box 3707, MS: 6A-06
Seattle, WA 98124
(425)477-3970
william.e.combs@boeing.com
The combinatorial problem: fuzzy logic's Achilles' heel
If you have ever tried to use fuzzy logic in complex, real-time systems, you have likely struggled with what is popularly known as the combinatorial rule explosion problem. Rules are traditionally defined by the intersection of the input subsets. As inputs increase, the number of rules can explode exponentially, quickly diminishing performance to unacceptable levels in time-critical applications.
It is one thing to produce an actuarial table with an over-night batch program or balance an inverted pendulum. It is quite another to find an optimum path in an ever-changing, on-line network or to control the dynamics of a wind tunnel.
As a quick example, suppose that each input universal set has five input subsets. If a rule matrix were generated to contain all possible rules, then a two input, one output system would contain 25 rules; a three input system would contain 125 rules; . . . and a six input system would contain 15,625 rules! Since every rule is processed during each inference cycle, this exponential increase can bring an on-line application to its knees.
Rule reduction algorithms have been proposed using neural networks, genetic algorithms and a variety of clustering techniques to select only those rules that make a significant contribution to the inference output. While these methods can significantly reduce the number of system rules, they do not address the reason why the rules increase exponentially as inputs are added to the system.
What we would like is a rule configuration that models the entire problem space without incurring a combinatorial penalty. In order to understand how to do that, we need to go to the heart of the combinatorial problem.
We are accustomed to rules in this format: If the temperature is cool and the rate of temperature change is decreasing then the furnace output should be moderately high. Such a rule can be generalized to the following propositional logic construct:
If (p and q) then r (1)
Note, however, that neither p nor q has an independent relation with r. It is only the intersection of p and q that has this relation. If we define this intersection as w, then Eq.(1) can be restated as:
If w then r (2)
In fact, w can represent the intersection value of any number of antecedent elements in the rule configuration of Eq.(1). Since this rule configuration focuses on the intersection of the antecedent elements, let’s call it the intersection rule configuration (IRC).
There are two significant difficulties with the IRC, the first of which affects the combinatorial problem:
Whenever either p or q changes, w is altered as well, producing a different relation with r (a new rule).
W’s relation with r is only as accurate as our ability to define p’s intersection with q. As antecedent elements increase, this intersection definition becomes more complex.
What we would prefer in a rule configuration is a way for the antecedent elements to relate directly to their consequent counterparts. Using propositional logic, we can transform the following four rule configurations into a more desirable structure (the first line is highlighted because we will focus on it later):
[(p and q) then r] is equivalent to [(p then r) or (q then r)] (3)
[(p or q) then r] is equivalent to [(p then r) and (q then r)] (4)
[p then (r and s)] is equivalent to [(p then r) and (p then s)] (5)
[p then (r or s)] is equivalent to [(p then r) or (p then s)] (6)
Of course, I don’t expect you to take my word for these four constructs. So, let’s prove their equivalence by using truth tables. In propositional logic, a truth table can be built on the relation between a single antecedent and a single consequent, defined as (p implies r) or (if p then r). Since it is somewhat difficult to understand what implication actually means, we can state (if p then r) in terms of its material implication equivalent, (not p or r). Then, taking the four possible values for p and r, we can build the table in Figure 1.
p |
not p |
r |
not p or r |
if p then r |
T |
F |
T |
T |
T |
T |
F |
F |
F |
F |
F |
T |
T |
T |
T |
F |
T |
F |
T |
T |
Fig. 1
Looking at the implication (if p then r) another way, we can say that if p is the case, then r must also be the case. Thus, the second row of Figure 1 is not valid. With this table as a starting point, we can test whether or not our alternative rule configurations are equivalent. For the tables in Figures 2 - 5, we will see that the bolded columns are identical.
[(p and q) then r] is equivalent to [(p then r) or (q then r)]:
p |
q |
r |
(p and q) |
(p and q) then r |
(p then r) |
(q then r) |
(p then r) or (q then r) |
T |
T |
T |
T |
T |
T |
T |
T |
T |
T |
F |
T |
F |
F |
F |
F |
T |
F |
T |
F |
T |
T |
T |
T |
T |
F |
F |
F |
T |
F |
T |
T |
F |
T |
T |
F |
T |
T |
T |
T |
F |
T |
F |
F |
T |
T |
F |
T |
F |
F |
T |
F |
T |
T |
T |
T |
F |
F |
F |
F |
T |
T |
T |
T |
Fig. 2
[(p or q) then r] is equivalent to [(p then r) and (q then r)]:
p |
q |
r |
(p or q) |
(p or q) then r |
(p then r) |
(q then r) |
(p then r) and (q then r) |
T |
T |
T |
T |
T |
T |
T |
T |
T |
T |
F |
T |
F |
F |
F |
F |
T |
F |
T |
T |
T |
T |
T |
T |
T |
F |
F |
T |
F |
F |
T |
F |
F |
T |
T |
T |
T |
T |
T |
T |
F |
T |
F |
T |
F |
T |
F |
F |
F |
F |
T |
F |
T |
T |
T |
T |
F |
F |
F |
F |
T |
T |
T |
T |
Fig. 3
[p then (r and s)] is equivalent to [(p then r) and (p then s)]:
p |
r |
s |
(r and s) |
p then (r and s) |
(p then r) |
(p then s) |
(p then r) and (p then s) |
T |
T |
T |
T |
T |
T |
T |
T |
T |
T |
F |
F |
F |
T |
F |
F |
T |
F |
T |
F |
F |
F |
T |
F |
T |
F |
F |
F |
F |
F |
F |
F |
F |
T |
T |
T |
T |
T |
T |
T |
F |
T |
F |
F |
T |
T |
T |
T |
F |
F |
T |
F |
T |
T |
T |
T |
F |
F |
F |
F |
T |
T |
T |
T |
Fig. 4
[p then (r or s)] is equivalent to [(p then r) or (p then s)]:
p |
r |
s |
(r or s) |
p then (r or s) |
(p then r) |
(p then s) |
(p then r) or (p then s) |
T |
T |
T |
T |
T |
T |
T |
T |
T |
T |
F |
T |
T |
T |
F |
T |
T |
F |
T |
T |
T |
F |
T |
T |
T |
F |
F |
F |
F |
F |
F |
F |
F |
T |
T |
T |
T |
T |
T |
T |
F |
T |
F |
T |
T |
T |
T |
T |
F |
F |
T |
T |
T |
T |
T |
T |
F |
F |
F |
F |
T |
T |
T |
T |
Fig. 5
If you are like me and would rather picture this equivalence visually, the Venn Diagrams in Figures 6 - 7 demonstrate the validity of Eq.(3). First, we will convert [(p and q) then r] to [not (p and q) or r)] by material implication since Venn Diagrams can only display AND, OR and NOT relations. On the left of Figure 6, is the Venn Diagram for (p and q). In the middle, NOT is applied to (p and q) so that the only area not colored is the intersection of p and q. On the right, the set r is OR’d into the Venn Diagram coloring a portion of the intersection of p and q.
(p and q) (not (p and q)) [not (p and q) or r]
Fig. 6
Next, we convert [(p then r) or (q then r)] to [(not p or r) or (not q or r)] by material implication. On the left of Figure 7 is the Venn Diagram for (not p or r). In the middle is the diagram for (not q or r). The right diagram is the result of the first two diagrams OR’d together.
(not p or r) (not q or r) [(not p or r) or (not q or r)]
Fig. 7
Note that the completed Venn diagrams of both halves of Eq.(3) are identical. Since [(p then r) or (q then r)] focuses on the union of the rule’s single input, single output relations, let’s call it the union rule configuration (URC).
The formal transformation steps for Eq.(3) in propositional logic are as follows:
(p and q) then r the initial IRC
not(p and q) or r by material implication
(not p or not q) or r by DeMorgan’s Law
not p or (not q or r) by association
(not q or r) or not p by commutation
(q then r) or not p by material implication
((q then r) or not p) or r by addition
(q then r) or (not p or r) by association
(q then r) or (p then r) by material implication
(p then r) or (q then r) by commutation yields the URC
Now that we have rule constructs providing direct relations between antecedent and consequent elements, you might be wondering why we have couched them in propositional logic terms.
The combinatorial problem is not unique to fuzzy logic. The IRC generates an exponential rule explosion in the other forms of logic as well. Propositional logic is the foundation of all formal logic, including fuzzy logic, and defines how rule elements relate to each other. Since the combinatorial problem is a generic issue, it is appropriate to work out alternative rule constructs using propositional logic methods.
Moreover, fuzzy logic is not limited to any particular rule configuration. That is, it is able to cope with intersection and union operators in the antecedent as well as the consequent portions of any rule structure. Once our equivalent rule formats have been verified in propositional logic, fuzzy logic is free to process their fuzzy versions using normal fuzzy inference methods.
The big adjustment I have had to make with these alternative rule constructs is that they are not intuitively obvious. When I got my first driver’s license, my insurance agent reminded me that since I was sixteen AND male AND single, my insurance premium would be high. Later, after college, he said that since I was in my mid-twenties AND male AND married, my insurance premium would be moderately low.
This latter statement seems to make more intuitive sense than our alternative format: since I was in my mid-twenties, my insurance premium would be moderately low, OR since I was a male, my insurance premium would be moderately high, OR since I was married, my insurance premium would be low.
No agent wanting to close a sale would utter such a seemingly conflicting statement. One of the problems with the URC is that the transformation from [(p and q) then r] to [(p then r) or (q then r)] shifts our focus from one rule to what appears to be the union of two (or more) rules. In addition, since each of these alternative rules can contain different consequent subsets, they seem to be contradicting each other: either my premium is high OR my premium is low. How can it be both?
This apparent conflict might lead us to view the OR operator as an EXCLUSIVE OR so that [(p then r) or (q then r)] becomes [either (p then r) or (q then r) but not both]. In reality, as the Venn Diagram shows, the OR in the URC is not exclusive. The final value of the rule is defined by the UNION of the two (or more) relations within the rule.
Our intuition, trained as it is, forms the hurdle that must be overcome in order to trust this solution to the combinatorial problem. I remember when I first encountered Reverse Polish Notation, my initial reaction was, "How could anyone think this way!!" Yet, this counter-intuitive approach gives the computer a simple tool for stack operations. In the same way, the counter-intuitive URC gives the computer a simple tool for solving the combinatorial problem.
How does the URC affect the multiplication of rules?
At first blush, it appears that all we have done with these alternative rule constructs is redefine what an individual rule looks like. How can such a redefinition impact rule multiplication? In order to show how the URC eliminates combinatorial rule explosion, I will use a simple, term-life insurance example. We will begin with the two input, one output model shown in Figure 8 since it is not easy to draw a five-dimensional IRC rule matrix.
Fig. 8
Based on the universal sets in Figure 8, an intersection rule matrix (IRM) may be populated with rules as shown in Figure 9.
Age/Health |
Youthful |
Young |
Middle Aged |
Mature |
Old |
Excellent |
(1) Very Low |
(6) Low |
(11) Mod. Low |
(16) Mod. Low |
(21) Moderate |
Good |
(2) Low |
(7) Mod. Low |
(12 ) Mod. Low |
(17) Moderate |
(22) Mod. High |
Average |
(3) Mod. Low |
(8) Mod. Low |
(13) Moderate |
(18) Mod. High |
(23) Mod. High |
Below Ave. |
(4) Mod. Low |
(9) Moderate |
(14) Mod. High |
(19) Mod. High |
(24) High |
Poor |
(5) Moderate |
(10) Mod. High |
(15) Mod. High |
(20) High |
(25) Very High |
Fig. 9. Note: The health status below poor and the age beyond old makes a person uninsurable.
Here is a list of these IRM rules:
Rule 1: IF the person is Youthful and his/her health is Excellent, THEN the term- life insurance premium will be Very Low.
Rule 2: IF the person is Youthful and his/her health is Good, THEN the term-life insurance premium will be Low.
Rule 3: IF the person is Youthful and his/her health is Average, THEN the term-life insurance premium will be Moderately Low.
Rule 4: IF the person is Youthful and his/her health is Below Average, THEN the term-life insurance premium will be Moderately Low.
Rule 5: IF the person is Youthful and his/her health is Poor, THEN the term-life insurance premium will be Moderate.
- - - - - -
Rule 21: IF the person is Old and his/her health is Excellent, THEN the term-life insurance premium will be Moderate.
Rule 22: IF the person is Old and his/her health is Good, THEN the term-life insurance premium will be Moderately High.
Rule 23: IF the person is Old and his/her health is Average, THEN the term-life insurance premium will be Moderately High.
Rule 24: IF the person is Old and his/her health is Below Average, THEN the term-life insurance premium will be High.
Rule 25: IF the person is Old and his/her health is Poor, THEN the term-life insurance premium will be Very High.
Since each antecedent element in the URC relates directly to its consequent counterpart, we will reduce the number of output subsets in Figure 10 from seven to five . . .
Fig. 10
. . . and a union rule matrix (URM) may be populated as shown in Figure 11.
Age |
Youthful Then Low |
Young Then Mod. Low |
Middle Aged Then Mod. |
Mature Then Mod. High |
Old Then High |
Health Status |
Excellent Then Low |
Good Then Mod. Low |
Average Then Mod. |
Below Average Then Mod. High |
Poor Then High |
Fig. 11
For simplicity, we can state that there are ten rules in this URM. But, remember these fuzzy rules grow out of the propositional logic format: [(p then r) or (q then r)] where p represents Age, q represents Health Status and r represents Premium. Therefore, since each one of these fuzzy relations is separated logically by an OR for inference purposes, we could just as easily state that there are two rules, one for each input . . . or that there is just one rule which mirrors the one unioned propositional logic rule.
This last option emphasizes that the one fuzzy rule (composed of a series of single input, single output sub-rules) models the entire problem space:
Rule: [IF the person is Youthful, THEN the premium will be Low OR
IF the person is Young, THEN the premium will be Moderately Low
OR
IF the person is Middle Aged, THEN the premium will be Moderate
OR
IF the person is Mature, THEN the premium will be Moderately High
OR
IF the person is Old, THEN the premium will be High]
OR
[IF the person’s Health is Excellent, THEN the premium will be Low
OR
IF the person’s Health is Good, THEN the premium will be Moderately Low
OR
IF the person’s Health is Average, THEN the premium will be Moderate
OR
IF the person’s Health is Below Average, THEN the premium will be Moderately High
OR
IF the person’s Health is Poor, THEN the premium will be High].
Whether you prefer to think of this alternative format as containing ten rules, two rules or one rule (it doesn’t really matter to the computer), you will note that there are just ten implications in this URM (five subsets times two inputs) instead of twenty-five implications in the IRM (five subsets raised to the power of two inputs). You will also note that there is just one antecedent element in each URM implication while in each IRM implication, there are two (the number of inputs).
Now, let’s see what happens when we add three more inputs to the URM: a person’s family health history; the level of risk in one’s occupation; and the level of risk if a person’s occupation is in a foreign country (Figure 12).
Age |
Youthful Then Low |
Young Then Mod Low |
Middle Aged Then Mod. |
Mature Then Mod. High |
Old Then High |
Health Status |
Excellent Then Low |
Good Then Mod. Low |
Average Then Mod. |
Below Average Then Mod. High |
Poor Then High |
Health History |
Excellent Then Low |
Good Then Mod. Low |
Average Then Mod. |
Below Average Then Mod. High |
Poor Then High |
Job Risk |
Minor Then Low |
Below Average Then Mod. Low |
Average Then Mod. |
Above Average Then Mod. High |
Major Then High |
Foreign Risk |
Minor Then Low |
Below Average Then Mod. Low |
Average Then Mod. |
Above Average Then Mod. High |
Major Then High |
Fig. 12
We are up to 25 implications for the URM as opposed to 3,125 rules for the IRM. And each implication still contains just one antecedent element, while there are five antecedent elements in each IRM rule. What is most important is that the URC/URM is modeling the entire problem space in a manner equivalent to the IRC/IRM.
Please understand that we are NOT advocating the transformation of existing rules from one format to another. Instead, we are suggesting that the problem space be modeled using a different rule configuration (URC) than the one normally used (IRC).
How does the URC work?
As stated earlier, fuzzy logic can work with any valid rule configuration, so all of the standard fuzzification, inference and defuzzification techniques apply. However, since the URC reduces the basic rule structure to a series of single input, single output implications coupled by union, it lends itself to further computational streamlining. For the sake of space, I will only illustrate one simple inference method to emphasize this optimization.
First, we place the single input, single output subset relations of the URC in a matrix format such as the one in Figure 12. Our next task is to calculate the output subset membership values from this matrix. But before we do, I want to briefly outline four processes that can help streamline URM inference: singleton output subsets, the Mamdani Method of inference calculated by SUM/PRODUCT inference operators, and Centroid defuzzification.
Singleton output subsets reduce the representation in Figure 13 . . .
Fig. 13
. . . to that shown in Figure 14.
Fig. 14
The only significant restrictions that singleton subsets bring to the problem space when used with Centroid defuzzification, are that they cannot be shaped and the output universal set must be represented by more than one singleton subset. They are commonly used in conjunction with Centroid defuzzification to decrease calculation time.
The Mamdani Method of inference is one of the most common inference methods in fuzzy logic and interprets [p then r] as [p and r]. I personally like to use Lotfi Zadeh’s SUM/PRODUCT inference operators to calculate Mamdani’s implication because these operators provide a smooth, predictable output that is easily computed. (Zadeh’s MAX/MIN inference operators are more often used to explain the inference process. Zadeh’s operators are also among the most popular.)
The PRODUCT inference operator calculates the intersection portions of the rule. If we use Mamdani's implication [if p and r] with singleton output subsets, the output from the PRODUCT operator will always equal the antecedent membership value since the consequent membership value will always be 1.0:
Inference Membership Value =
The antecedent membership value (a value from 0.0 to 1.0)
. . . multiplied by . . .
The consequent singleton subset (whose value is always 1.0).
The SUM inference operator calculates the union portions of the rule. In using this operator, we SUM all the single input, single output relations that correspond to a given consequent subset and then normalize the membership values of all the consequent subsets to 1.0. However, if we use Centroid defuzzification, there is no need to perform the normalization step since the consequent membership values are in both the numerator and the denominator of the Centroid equation.
So, if we use these four processes, and assemble the antecedent membership values in a union rule matrix format, we can sum the columns into an accumulator array representing the output subset membership values (Figure 15).
Union Rule Matrix and Accumulator
Age |
m Youthful |
m Young |
m Middle Aged |
m Mature |
m Old |
H. Status |
m Excellent |
m Good |
m Average |
m Below Average |
m Poor |
H. History |
m Excellent |
m Good |
m Average |
m Below Average |
m Poor |
Job Risk |
m Minor |
m Below Average |
m Average |
m Above Average |
m Major |
F. Risk |
m Minor |
m Below Average |
m Average |
m Above Average |
m Major |
Premium |
m Low |
m Mod. Low |
m Moderate |
m Mod. High |
m High |
Fig. 15
All that is left to do after this inference step is to plug the accumulator’s membership values into the Centroid defuzzification function to obtain an output Premium value.
Diagram of the URM inference engine
Fig. 16
Of course, the URC is not limited to this streamlined process. You are free to use any fuzzy logic functions because fuzzy logic is able to work with any valid rule configuration.
Modeling each input’s relative importance
There are times when we want to modify the importance of each input universal set. One way to do this is through the use of universal and subset weights as shown in Figure 17.
Universal
Weight
Age |
0.7 |
Youthful Then Low |
Young Then Mod Low |
Middle Aged Then Mod. |
Mature Then Mod. High |
Old Then High |
Health Status |
1.0 |
Excellent Then Low |
Good Then Mod. Low |
Average Then Mod. |
Below Average Then Mod. High |
Poor Then High |
Health History |
0.8 |
Excellent Then Low |
Good Then Mod. Low |
Average Then Mod. |
Below Average Then Mod. High |
Poor Then High |
Job Risk |
1.0 |
Minor Then Low |
Below Average Then Mod. Low |
Average Then Mod. |
Above Average Then Mod. High |
Major Then High |
Foreign Risk |
0.9 |
Minor Then Low |
Below Average Then Mod. Low |
Average Then Mod. |
Above Average Then Mod. High |
Major Then High |
Fig. 17
Briefly, a universal set weight is multiplied by every subset membership value in the universal set, as shown above, providing a way to relate universal sets to each other. Subset weights permit us to model the relative importance of each subset within a universal set by multiplying a weight with its corresponding subset membership value.
One area where I use subset weights is in modeling rate changes such as in a temperature control problem. To solve this problem with an IRM, we could establish three input subsets for Temperature (COLD, COOL and WARM), three for Rate of Temperature Change (DECREASING, STEADY and INCREASING), and five subsets for furnace output (LOW, MODERATELY LOW, MODERATE, MODERATELY HIGH and HIGH). Initially, we could arrange the rules as shown in Figure 18.
Temp/Rate |
Cold |
Cool |
Warm |
Decreasing |
High |
Mod. High |
Moderate |
Steady |
Mod. High |
Moderate |
Mod. Low |
Increasing |
Moderate |
Mod. Low |
Low |
Fig.18
However, such an arrangement means that if the temperature is Cold (say in the forties) and has been that way for some time, the furnace will not be producing at its highest level. Worse yet, when the temperature reaches 72 degrees and the rate of temperature change remains steady, the furnace will still be pumping heat into the residence even though we really want it to shut off. So, an alternative rule distribution to correct this situation might be that of Figure 19.
Temp/Rate |
Cold |
Cool |
Warm |
Decreasing |
(1) High |
(4) Mod. High |
(7) Moderate |
Steady |
(2) High |
(5) Moderate |
(8) Low |
Increasing |
(3) Moderate |
(6) Mod. Low |
(9) Low |
Fig. 19 Note: Rule numbers are in parentheses.
These rules solve the immediate problem but produce jumps between non-contiguous output subsets when the system moves from rules 2 to 3, 2 to 5, 5 to 8, and 7 to 8 which, in turn, adversely affects the smoothness of the output.
The fact that the output subsets are identical for rules 2 and 1, 8 and 9, 5 and 3, and 5 and 7, indicates that we could eliminate the influence of the STEADY subset altogether.
Unfortunately, if the STEADY row is cut from the IRM and the Rate input is STEADY, then the membership values of both DECREASING and INCREASING will fall to zero and no IRM rules will fire.
The URM configuration, using subset weights, offers a much simpler solution (Figure 20).
Subset Subset Subset
Weight Weight Weight
Temperature |
1.0 |
Cold Then High |
1.0 |
Cool Then Moderate |
1.0 |
Warm Then Low |
Rate |
1.0 |
Decreasing Then High |
0.0 |
Steady Then Moderate |
1.0 |
Increasing Then Low |
Fig. 20
Note that STEADY relates only to the MODERATE output subset and since this sub-rule is associated with all other implications by union, the system is not adversely affected if we multiply STEADY’s membership value by a zero subset weight.
Why should we care that the URM can provide a smoother output than the IRM in this situation? After all, will we be able to notice the difference if the furnace output jumps from LOW to MODERATE instead of to MODERATELY LOW? It is one thing to discuss temperature control for a residential thermostat that may only be sampling its environment every few minutes. It is another to consider the control requirements for an ultra-high speed drill bit in an NC milling machine where precise control enables the drill bit to travel across the milling surface at an incredibly rapid rate.
Of course, you can use methods other than weights to tune your application such as alpha cuts and hedges as well as varying subset shapes and sizes. I have briefly discussed weights as just one tuning example.
And speaking of tuning . . .
The IRC is not only at the heart of the combinatorial problem, it can also adversely affect tuning. Recall Eq.(2) [If w then r] where w represents the subset intersections of the various antecedent universal sets and r represents the subsets of the consequent universal set. We said that whenever either p or q changed, w was altered as well, necessitating a different relation with r (a new rule). In this configuration, as rules become more numerous, r subsets must also increase in order to accommodate the additional implications.
But what happens when the consequent subsets do not keep pace? Consider rules 18, 19 and 23 from Figure 9 above. Let’s suppose that I am fully mature. That is, my age places me in the subset MATURE with a membership value of 1.0. Let’s also suppose that my health is the essence of average so that it falls completely within the subset AVERAGE. With these two inputs, the IRM in Figure 9 declares that my premium is MODERATELY HIGH.
Now, suppose that either my health deteriorates to BELOW AVERAGE [m (19) = 1.0] or my health remains stable but I live to be OLD [m (23) = 1.0]. Not to worry (unless you are my insurance company) because my premium will not change one penny according to this IRM! The reason is simple: whenever we move either vertically or horizontally to a contiguous cell in an intersection rule matrix without changing output subsets, the output values plateau.
There are several options to address this problem. First, we could increase the output subsets from seven to nine so that vertical or horizontal IRM movement would yield different outputs. Unfortunately with this proposal, a seven-by-seven matrix would need thirteen subsets . . and what about a five or six input IRM? Computers do not care how many output subsets we have. But as system designers, we prefer to assign some meaningful linguistic title to each one, and modifiers can only provide so much distinctive variation.
Another option is to alter the size, shape and position of the input and output subsets, thereby modifying the shape of the plateaus to more faithfully produce a realistic output. Tuning configurations, such as Figure 21, are not uncommon.
Fig. 21
To see how this option works, note that all values between 30 and 40 fall entirely within the domain of subset B. Yet, only 35 has a membership value of 1.0. All other values have lower membership in the subset and therefore, have less influence on the overall system, tailoring the naturally flat output of a plateau to a more desired shape. This process imposes a kind of variable subset weighting structure on the inference engine. But the method is not straight forward, particularly for more complex systems, making tuning more difficult.
Why should the URC make tuning easier? Remember that the URC allows each antecedent element to have its own relation with r, independent of the other inputs unlike the IRC where input subsets must relate to their output counterparts through their intersection value w. Since the output universal set relates directly to each input universal set, no additional output subsets are needed when inputs are added to the system in order to enable the additional implications. Thus, no plateaus are created.
Tuning, while still necessary to shape the inference engine to the desired real-world system, is greatly simplified in the URC because there are no plateaus to overcome.
Design considerations
It is important to underscore that the combinatorial rule explosion problem is not unique to fuzzy logic. We have rooted our solution in propositional logic so that the URC can be applied to all disciplines of formal logic. Having proved its equivalency to the IRC, we can be confident that fuzzy logic will operate just as easily on URC rules as on those in the more familiar format.
As a matter of fact, the inference task for the URC can be viewed as a normal two-phase compositional rule of inference operation in a manner identical to that used for the IRC. Take Zadeh’s MIN/MAX operator as an example. If we picture the URC as a series of single input, single output rules coupled by union, the first inference phase would be to apply the intersection operator (MIN) to each rule in order to obtain its weight. The second phase would be to use the union operator (MAX) to obtain the maximum membership value (weight) for each output subset of these minimized fuzzy rules.
A simpler analogy can be seen by reducing Figure 11 to a single input, single output system shown in Figure 22.
Age |
Youthful Then Low |
Young Then Mod Low |
Middle Aged Then Mod. |
Mature Then Mod. High |
Old Then High |
Fig. 22
The inference tasks for this system are identical regardless of whether we perceive the rules as being in an IRC or URC format.
Even though these two rule configurations are equivalent, you may not be accustomed to working with systems based on single input, single output relations. So, for a moment, I would like to focus on this simple pairing. As a general rule, these input and output universal sets should be designed as monotones. That is, the values along each of the two X-axes should either only increase or only decrease.
For what appears to be an exception to this general guideline, consider the example in Figure 23, taken from Cox’s Fuzzy Systems Handbook (First Edition), pp. 81 - 89.
Fig. 23
This single fuzzy set defines the automobile insurance underwriting risk for drivers between the ages of 16 and 95 depicting young and very old drivers as high risks while placing persons between 30 and 70 in a moderately low risk category. This set is obviously non-monotonic.
Yet, on closer inspection, the membership values represent the degree of risk, not the degree of membership in the universal set AGE. Cox has cleverly incorporating several risk-related factors, including driving experience, reflex reaction time and general health, vision and hearing, into a composite set representing DRIVING RISK relative to age. Taken separately, these inputs are monotones. But as a group, they form a complex risk curve. Cox combined these elements into one set to improve performance. Now that the URC is able to efficiently manage large numbers of inputs, the need for composite curves to expedite computation should be greatly diminished.
[In the paper I wrote with James Andrews nearly a year ago for the IEEE Transactions On Fuzzy Systems, we said URC universal sets must be monotones, and their input and output universal sets must contain the same number of subsets. Since that writing, we have come to the understanding that these constraints may be too conservative. For a simple spreadsheet example that extends beyond these two constraints, see the accompanying CD ROM program entitled "URCCurve".]
An apparent mathematical stumbling block
There is a class of mathematical functions which appears to contain an intersection imbedded within the function. For equations such as A = SIN(X*Y), the multiplication is interpreted as the intersection of X and Y. This proposed intersection is further defined using a product operator so that a positive value is produced when both X and Y have the same sign. (This is also the case when a value is squared [A = X^{2} + Y^{2}].)
The argument builds on this conjecture ([X*Y] represents an intersection) by stating that since the union of a negative X value with a negative Y value cannot produce a positive result like that obtained by the fuzzy intersection product operator, the Combs method is not capable of handling this class of functions.
On closer inspection, this argument runs into trouble. While it is perfectly legitimate to use an arithmetic product operator to approximate a fuzzy intersection, the reverse is not tenable. That is, it is not legitimate to use a fuzzy intersection operator to approximate an arithmetic product.
To see the fallacy of this approach, we will assume that [X*Y] is performing a fuzzy intersection and then substitute a MIN operator for the PROD operator. Both of these operators should provide a solution that is in the same ballpark if we are dealing with fuzzy intersections since PROD is a more restrictive intersection operator than MIN. However, the MIN of two negative numbers is still a negative number.
Put another way, we know that an intersection operator produces a subset of the sets it is acting upon. This means that the elements of the intersection subset must also be contained in the sets involved in the intersection. If subsets X_{1} and Y_{1} (of universal sets X and Y) contain only negative numbers, then the subset created by the intersection of X_{1} and Y_{1} must also be limited to negative numbers.
So, what we are really dealing with in [X*Y] is arithmetic multiplication rather than fuzzy intersection. By placing the X and Y universal sets in the two dimensions of an intersection rule matrix and using a fuzzy product intersection operator, a fuzzy engine is able to calculate the same output as the [X*Y] function. However, in this case, the PROD operator is performing an arithmetic product operation, not a fuzzy intersection.
In order to accurately process this class of functions using fuzzy logic, we must either perform the arithmetic multiplication before we use a fuzzy engine or we must deal with each antecedent (X and Y) separately in a fuzzy engine and then multiply the results in order to insure that we do not inadvertently mix these fuzzy and arithmetic disciplines.
An HMO Scheduling Program
An HMO (Health Maintenance Organization) differs from traditional hospitals and clinics in that its manifesto focuses on the maintenance of health rather than the resolution of illness. It functions in a manner similar to an insurance policy and hospital/clinic rolled into one. In return for a financial premium, the HMO provides for the health care needs of its members. It is in the organization's interest to keep its members healthy since chronic illness generates significant, long-term expense.
For this example, we will concentrate on a very small portion of the business. Normally, a member would have an appointment with a technician, nurse, general practitioner or specialist, depending on the circumstances. We will consider only those members entering the facility without appointments to see general practitioners.
The spreadsheet on the accompanying CD ROM entitled "HMOSched" is limited to one triage nurse and seven general practitioners. The triage nurse interviews each walk-in patient and selects an appropriate physician based on the member's condition , and the doctor’s competency, patient load, and availability to see walk-in members. This program was adapted from my off-hour class, "Fuzzy Logic for Business Applications" at the University of Washington. Presented during the course’s first computer lab, it is a simplified example, introducing the student to basic fuzzy logic functionality.
Conclusion
The information age has put mountains of data at our finger tips through corporate databases and the worldwide web. The growing challenge for computer professionals, however, is not just how to retrieve and store this information, but how to interpret it -- how to use this unprecedented access to our advantage. Meaning is not a Boolean issue. It dwells in the ever-changing shades of gray of our complex world. The URC should find ready use in software agents, data mining tools and sophisticated modeling systems in addition to the more traditional control applications in science and engineering.
This brief chapter has outlined an inference method that facilitates the creation of large-scale time-critical fuzzy logic applications by eliminating combinatorial rule explosion from the inference process. It also cuts overall development time by simplifying tuning. We are not implying, however, that the URC should obsolete the more conventional rule configuration. Instead, we hope you will view this alternative method as an addition to your tool kit.
Acknowledgments
I want to thank my friends: James E. Andrews, for the creation of the propositional logic proof for Eq.(3), the truth tables and the Venn Diagrams as well as his technical and stylistic support; and Earl Cox, for his encouragement and enthusiasm.
REFERENCES
J. Andrews, "Taming complexity in large-scale fuzzy systems," PC AI, Phoenix: Knowledge Technology Inc., May/June 1997, pp. 39-42.
S. Chiu, "Method and software for extracting fuzzy classification rules by subtractive clustering," 1996 Biennial Conference of the North American Fuzzy Information Processing Society - NAFIPS (Cat. No.96TH8171), Berkeley, CA, USA. pp. 461-5. NAFIPS - North American Fuzzy Inf. Process. Soc. BISC - Berkeley Initiative in Soft Computing IEEE Neural Networks Council. IEEE Systems, Man and Cybernetics Society, June 1996, pp. 19-22.
S. Chiu, "Fuzzy model identification based on cluster estimation," Journal of Intelligent and Fuzzy Systems, vol. 2., New York: John Wiley & Sons, 1994, pp.267-278.
W. Combs and J. Andrews, "Combinatorial rule explosion eliminated by a fuzzy rule configuration," IEEE Transactions on Fuzzy Systems, February 1998, Volume 06, Number 01, pp. 1-11.
W. Combs, "Reconfiguring the fuzzy rule matrix for large, time-critical applications," Third Annual International Conference on Fuzzy-Neural Applications, Systems and Tools, PennWell Publishing Company, 10 Tara Blvd., 5th Floor, Nashua, NH 03062-2801, November 1995, pp. 18:1-7.
I. Copi and C. Cohen, Introduction to Logic, Ninth Edition, New Jersey: Prentice Hall, 1994.
E. Cox, The Fuzzy Systems Handbook, A Practitioner’s Guide To Building, Using, and Maintaining Fuzzy Systems, Boston: Academic Press, 1994, p. 323.
E. Cox, Fuzzy Logic For Business And Industry, Rockland, Massachusetts: Charles River Media, Inc., 1995.
S. Halgamuge and M. Glesner, "Neural networks in designing fuzzy systems for real world applications," Fuzzy Sets and Systems, vol. 65, no. 1., pp. 1-12, July 1994.
H. Ishibuchi, K. Nozaki, N. Yamamoto and H. Tanaka, "Selecting fuzzy if-then rules for classification problems using genetic algorithms," IEEE Transactions on Fuzzy Systems, vol. 3, no. 3., pp. 260-270, August 1995.
G. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic, New Jersey: Prentice Hall, 1995, pp. 231-236, 317-322.
G. Klir, Ute H. St. Clair and Bo Yuan, Fuzzy Set Theory: Foundations and Applications, New Jersey: Prentice Hall, 1997.
B. Kosko, Neural Networks and Fuzzy Systems, A Dynamical Systems Approach to Machine Intelligence, New Jersey: Prentice Hall, 1992, pp. 322-326.
R. Rovatti, R. Guerrieri and G. Baccarani, "Fuzzy rules optimization and logic synthesis," Second IEEE International Conference on Fuzzy Systems (Cat. No. 93CH3136-9), San Francisco, March/April 1993, pp. 1247-1252, vol. 2.
F. Watkins, "Building Large-Scale Fuzzy Logic Systems: Combinatorial Challenges," PC AI Magazine, March/April Edition, 1995.
F. Watkins, Fuzzy Engineering, Ph.D Thesis, University of California, Irvine, 1994. UMI Order Number 9417921.
R. Yager and L. Zadeh, Eds., Fuzzy Sets, Neural Networks and Soft Computing, New York: Van Nostrand Reinhold, 1994, pp. 2ff, 85-125.
R. Yager and D. Filev, "Learning of fuzzy rules by mountain climbing," Proceedings of the SPIE Conference on Applications of Fuzzy Logic Technology, Boston, 1993, pp. 246-254.
L. Zadeh and J. Kacprzyk, Eds., Fuzzy Logic for the Management of Uncertainty, New York: John Wiley & Sons, 1992, pp. 265-276.
H.-J. Zimmermann, Fuzzy Logic Theory - and its Applications, Second Edition, Dordrecht: Kluwer Academic Publishers, 1991.
BIOGRAPHY
William E. Combs is a computing systems architect for the Boeing Company. His department is responsible for providing configuration control of on-line CATIA CAD/CAM models worldwide on MVS, UNIX and NT platforms for all Boeing airplanes. He also teaches an off-hour class, "Fuzzy Logic For Business Applications", at the University of Washington. His university degrees include: Bachelor of Arts in Mathematics and Music from Alaska Methodist University, Anchorage, Alaska; and Master of Divinity and Doctor of Ministry from Fuller Theological Seminary in Pasadena, California. He is a member of the IEEE Computer Society.
E-Mail: william.e.combs@boeing.com