Learning rules from data

NAVIGATION

Summary : Users guide
Previous page : Handling rules
Next page : FIS usage and Defuzzification

Page content:

This page describes how you can automatically build a rule base from training data.

Introduction

You need to have an idea of the range of your data, because you first have to set up the input and output sets of membership functions. This can be done using the application data-analysis.cpp.

For now, let's assume you know the min and max values, and that you have a correctly defined FIS (the easiest way is to generate the sets of functions automatically from these min-max values). All you have to do is define the granularity of each set (the number of functions used). (For more on defining sets of functions, see page Sets of membership functions.)

Learning rules from data

Once you have set up the inputs and output of the FIS, learning the rules from data can be done in two ways:

Adding rules one by one. At present, this can only be done for Mamdani FIS types.
Adding rules from a whole dataset. This handles the whole process. It is valid for both Mamdani and TS FIS types, but it requires loading all the data at once, making it unsuitable for (very) large training sets.

Warning: Please note that the number of rules needed grows very quickly as the number of inputs increases, particularly if the input partition is arbitrary. This reduces reliability, as you won't necessarily have a sufficient number of training patterns for all the possible combinations of inputs. In that case, it is better to tune the input partition manually, using for example a data clustering technique such as the one described in [Chi, 1996]. This is NOT implemented here at present (but might be one day).

Adding rules one by one

This means adding a new rule each time you get a data point, so you don't need to store all the data points. It involves more steps, as you need to:

add the rules from each datapoint,
then, reduce rule base,
then, factorize rule base.

Adding rules

First, adding rules. This can be done this way:

int NbInputs = 4;       // for example
std::vector<REALVAL> in_val(NbInputs);  // a vector of 4 values
while( NotFinished )
{
//  get input data in vector 'in_val'
        ....
// get corresponding output data in out_val
        REALVAL out_val = ...
// Create a rule from these values
        fis.AddRuleFromValues ( in_val, out_val );
}

This will create a new rule for each data vector. Each rule will be assigned a fuzzy "degree" value (it can be fetched with RULE_IDX::GetDegree() ), which defines how well the data point fits the rule it produced. The degree is computed as the product of the membership function values.

For example, say we have a FIS with three inputs {A,B,C}. For a given input value $\{ x^A,x^B,x^C\}$ , we first search, for each input $j$, the membership function $\mu^j$ that is "triggered" by the value $x^j$ with the maximum fuzzy value $ y^j $ . The degree of the corresponding rule will be computed as $ d= y^A \cdot y^B \cdot y^C $ .
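The degree computation described above can be sketched as follows. This is a minimal standalone illustration, not the library's code: the triangular function type `TriMF` and the helper `RuleDegree` are hypothetical names, and plain `double` is assumed in place of REALVAL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical triangular membership function (not a library type).
struct TriMF {
    double left, peak, right;
    double operator()(double x) const {
        if (x <= left || x >= right) return 0.0;
        return x <= peak ? (x - left) / (peak - left)
                         : (right - x) / (right - peak);
    }
};

// For each input, keep the highest membership value among the functions
// of that input, then multiply these maxima to get the rule degree.
double RuleDegree(const std::vector<std::vector<TriMF>>& inputs,
                  const std::vector<double>& in_val)
{
    assert(inputs.size() == in_val.size());
    double degree = 1.0;
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        double best = 0.0;
        for (const TriMF& mf : inputs[i])
            best = std::max(best, mf(in_val[i]));
        degree *= best;
    }
    return degree;
}
```

A data point lying exactly on the peak of one function for every input gets a degree of 1; the degree drops quickly as soon as one input falls between two functions.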

In real life, the data can be fetched directly from some other piece of software, but it can also be stored in a file. This library provides code for reading data files in several formats, see Dataset handling.

Reducing rule base

Before using the FIS, you need to reduce the rule base, as it will contain both redundant and contradictory rules. This is done with a call to SLIFIS::ReduceRuleBase().

Several methods are available for reducing the number of rules. They are all based on the "Wang & Mendel" technique (see References). We group all equal rules together in several tables, one for each output, then select the group of rules that gives one output and reduce it to a single rule.

At present, three methods are available (see slifis::EN_REDUCE_METHOD for details):

REDM_HIGHEST_SUM (default): keep the rule for which the sum of degrees is the highest,
REDM_HIGHEST_NBRULES: keep the rule for which the number of rules with the same consequence part is the highest,
REDM_HIGHEST: keep the rule that has the highest degree, regardless of the other ones.

For the third method, the degree of the winner rule is kept as is. For the first two methods, the degree of the final rule is computed as the ratio of the sum of all the degrees of the considered group of rules over the number of rules (i.e. it is the mean value of the degrees of the considered subset of rules).
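The reduction step can be sketched as follows. This is only one possible reading of the description above, not the code of SLIFIS::ReduceRuleBase(): the types `Rule`, `Stats` and `Method` and the functions `Score` and `Reduce` are hypothetical, and rules are assumed to be grouped by identical premise before the conflicting outputs are compared. On a tie, this sketch keeps the lowest output index.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

struct Rule {
    std::vector<int> antecedent;  // index of the selected function, per input
    int consequent;               // index of the output function
    double degree;
};

// Hypothetical counterparts of the three reduction methods.
enum class Method { HighestSum, HighestNbRules, Highest };

struct Stats { double sum = 0.0; int count = 0; double best = 0.0; };

double Score(const Stats& s, Method m)
{
    switch (m) {
        case Method::HighestSum:     return s.sum;
        case Method::HighestNbRules: return static_cast<double>(s.count);
        case Method::Highest:        return s.best;
    }
    return 0.0;
}

std::vector<Rule> Reduce(const std::vector<Rule>& rules, Method method)
{
    // premise -> (output -> statistics of the rules sharing both)
    std::map<std::vector<int>, std::map<int, Stats>> groups;
    for (const Rule& r : rules) {
        assert(r.degree >= 0.0 && r.degree <= 1.0);
        Stats& s = groups[r.antecedent][r.consequent];
        s.sum += r.degree;
        s.count += 1;
        s.best = std::max(s.best, r.degree);
    }
    std::vector<Rule> reduced;
    for (const auto& group : groups) {
        auto winner = group.second.begin();
        for (auto it = group.second.begin(); it != group.second.end(); ++it)
            if (Score(it->second, method) > Score(winner->second, method))
                winner = it;
        const Stats& s = winner->second;
        // Highest keeps the winning degree; the others use the mean degree.
        const double degree =
            (method == Method::Highest) ? s.best : s.sum / s.count;
        reduced.push_back(Rule{group.first, winner->first, degree});
    }
    return reduced;
}
```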

Factorizing rule base

This is done with SLIFIS::FactorizeRuleBase().

This aims at reducing the number of rules by grouping rules that are similar into a single rule. For example, consider a 3-input FIS where:

input 1 is {blue,red,green},
input 2 is {high,low},
input 3 is {high,medium,low},
output is {great,good,bad}.

Say we have the following rule base:

r1: IF input1 is 'blue'  AND input2 is 'high' and input3 is 'high'   THEN output is 'great'
r2: IF input1 is 'blue'  AND input2 is 'high' and input3 is 'medium' THEN output is 'great'
r3: IF input1 is 'blue'  AND input2 is 'high' and input3 is 'low'    THEN output is 'great'
r4: IF input1 is 'red'   AND input2 is 'low'  and input3 is 'medium' THEN output is 'good'
r5: IF input1 is 'green' AND input2 is 'low'  and input3 is 'low'    THEN output is 'bad'

The first three rules can be factorized into a single rule, giving:

r1: IF input1 is 'blue'  AND input2 is 'high'                        THEN output is 'great'
r4: IF input1 is 'red'   AND input2 is 'low'  and input3 is 'medium' THEN output is 'good'
r5: IF input1 is 'green' AND input2 is 'low'  and input3 is 'low'    THEN output is 'bad'

This can be done only because:

the output value was the same for the three rules,
the factorized input (input 3) was covered over its whole domain. For example, if rule n°3 had been missing, we couldn't have done this factorization.

Factorization is not always possible, particularly when the number of rules is too small, or when the inputs have many membership functions.
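The two conditions above can be illustrated with the following sketch, which factorizes a rule base on a single input. It is a standalone illustration under stated assumptions, not the code of SLIFIS::FactorizeRuleBase(): `Rule`, `ANY` and `FactorizeOnInput` are hypothetical names, membership functions are identified by their index, and rule degrees are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

const int ANY = -1;  // marks an input that no longer appears in the premise

struct Rule {
    std::vector<int> antecedent;  // function index per input, or ANY
    int consequent;               // output function index
};

// Merge the rules that differ only on input k, provided they share the
// same output and cover all nbMF functions of that input.
std::vector<Rule> FactorizeOnInput(const std::vector<Rule>& rules,
                                   std::size_t k, int nbMF)
{
    typedef std::pair<std::vector<int>, int> Key;  // masked premise + output
    std::map<Key, std::set<int>> seen;             // values found on input k
    for (const Rule& r : rules) {
        assert(k < r.antecedent.size());
        std::vector<int> masked = r.antecedent;
        masked[k] = ANY;
        seen[Key(masked, r.consequent)].insert(r.antecedent[k]);
    }
    std::vector<Rule> out;
    std::set<Key> emitted;
    for (const Rule& r : rules) {
        std::vector<int> masked = r.antecedent;
        masked[k] = ANY;
        const Key id(masked, r.consequent);
        const std::set<int>& values = seen[id];
        const bool full = static_cast<int>(values.size()) == nbMF
                          && values.count(ANY) == 0;
        if (!full) {
            out.push_back(r);              // keep the rule unchanged
        } else if (emitted.insert(id).second) {
            out.push_back(Rule{masked, r.consequent});  // one merged rule
        }
    }
    return out;
}
```

With the five rules of the example (colors, levels and outputs numbered from 0), factorizing on input 3 gives three rules; removing rule n°3 first leaves the rule base unchanged.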

Adding rules from a whole dataset

Once you have filled a DATA_SET object, the whole process described above can be done at once:

        SLIFIS fis;
        // ... define inputs/output
        DATA_SET dataset;
        // ... load it with values
        RBB_PARAMS params; // parameters for rule base building
        fis.BuildRuleBaseFromData( dataset, params );

You can adjust the parameters by editing values in RBB_PARAMS (default values are provided).

For a Mamdani FIS type, it basically does the same as above (add rules one by one, then reduce, then factorize, depending on the parameters selected in RBB_PARAMS). For a TS FIS type, it produces a rule for each possible combination of inputs and computes the corresponding coefficients. In that case, neither reduction nor factorization is done (the output coefficients are computed using an estimation process, so it is very unlikely that they are the same from one rule to another).

Takagi-Sugeno learning

Warning: Please note that in order to do the fitting required for building the TS coefficients, this library relies on an external library that is not included at present. Please read this: Dependencies.

For a TS FIS, you need to have all the data available in a single DATA_SET object before building the rules (see Dataset handling). Then, just as for a Mamdani FIS, call the SLIFIS::BuildRuleBaseFromData() function (see above). This will call SLIFIS::BuildTSRulesFromValues() and will compute a rule for each possible combination of inputs (assuming there are enough data points to cover all the possible situations). For example, for a 3-input FIS with 3 MF for input 1, 4 MF for input 2 and 2 MF for input 3, this will produce $3 \times 4 \times 2 = 24$ rules.
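The enumeration of input combinations can be sketched as below. This is a hypothetical helper (`AllCombinations` is not a library function) that takes the number of membership functions of each input and lists every combination of function indexes, counting like an odometer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Enumerate every combination of membership function indexes,
// one index per input.
std::vector<std::vector<int>> AllCombinations(const std::vector<int>& nbMF)
{
    std::vector<std::vector<int>> combos;
    for (int n : nbMF) { assert(n > 0); }
    if (nbMF.empty()) return combos;
    std::vector<int> current(nbMF.size(), 0);
    while (true) {
        combos.push_back(current);
        // Increment the last index; carry to the left on overflow.
        std::size_t i = nbMF.size();
        while (i > 0) {
            --i;
            if (++current[i] < nbMF[i]) break;
            current[i] = 0;
            if (i == 0) return combos;  // every index wrapped: done
        }
    }
}
```

For the example above, `AllCombinations({3, 4, 2})` returns 24 combinations, one per TS rule to build.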

The process does the following steps:

For each possible combination of inputs, extract the subset of data that fits inside the defined inputs, i.e. all the data points whose fuzzified values, using the considered input membership functions, are above a threshold (see RBB_PARAMS).
Then, using this subset, compute the TS coefficients for this rule by doing a least-squares fitting.
Finally, add the rule to the rule base.
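The first step (extracting the subset of data for one rule) could look like the sketch below. The names `TriMF`, `DataPoint` and `ExtractSubset` are hypothetical, and the sketch assumes that a data point is kept only if every input is above the threshold for the membership function selected by the rule; the actual criterion and threshold name are defined by RBB_PARAMS.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical triangular membership function (not a library type).
struct TriMF {
    double left, peak, right;
    double operator()(double x) const {
        if (x <= left || x >= right) return 0.0;
        return x <= peak ? (x - left) / (peak - left)
                         : (right - x) / (right - peak);
    }
};

struct DataPoint {
    std::vector<double> in;  // one value per input
    double out;              // corresponding output value
};

// Keep the data points whose membership value is above the threshold
// for each of the membership functions selected by the rule.
std::vector<DataPoint> ExtractSubset(const std::vector<DataPoint>& data,
                                     const std::vector<TriMF>& ruleMF,
                                     double threshold)
{
    std::vector<DataPoint> subset;
    for (const DataPoint& p : data) {
        assert(p.in.size() == ruleMF.size());
        bool keep = true;
        for (std::size_t i = 0; i < ruleMF.size() && keep; ++i)
            keep = ruleMF[i](p.in[i]) > threshold;
        if (keep) subset.push_back(p);
    }
    return subset;
}
```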

For each rule, the fitting is done with the extracted subset of data, by building the following linear system and solving it. For example, for a FIS with $ n=3 $ inputs, say we have $ m=5 $ data points in the considered subset, with:

$ a^i_j $ is the value for input i, for data point n° j,
$ b_j $ is the output value for data point n° j,
$ x^i $ is the TS coeff for input i (what we are looking for).

We need to build the following $ m $ equations:

$\begin{eqnarray*} b_1 &= x^0 + x^1 a^1_1 + x^2 a^2_1 + x^3 a^3_1 \\ b_2 &= x^0 + x^1 a^1_2 + x^2 a^2_2 + x^3 a^3_2 \\ b_3 &= x^0 + x^1 a^1_3 + x^2 a^2_3 + x^3 a^3_3 \\ b_4 &= x^0 + x^1 a^1_4 + x^2 a^2_4 + x^3 a^3_4 \\ b_5 &= x^0 + x^1 a^1_5 + x^2 a^2_5 + x^3 a^3_5 \\ \end{eqnarray*}$

This can be expressed using matrix notations:

$[\mathbf{A}]. [\mathbf{x}] = [\mathbf{b}]$

with:

$[\mathbf{x}]$ a vector of $n+1$ elements (3 inputs plus the constant term) $\{ x^0, x^1, x^2, x^3 \}$ ,
$[\mathbf{b}]$ a vector of $m$ elements (as many as there are data points) holding the output values: $\{ b_1, b_2, b_3, b_4, b_5 \}$ ,
$[\mathbf{A}]$ a matrix ( $m$ lines x $n+1$ cols ) holding all the input values :

\f{matrix}{
 1 & a^1_1 & a^2_1 & a^3_1 \\
 1 & a^1_2 & a^2_2 & a^3_2 \\
 1 & a^1_3 & a^2_3 & a^3_3 \\
 1 & a^1_4 & a^2_4 & a^3_4 \\
 1 & a^1_5 & a^2_5 & a^3_5
\f}

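Formally, the numerical value for $[\mathbf{x}]$ could be computed by: $[\mathbf{x}] = [\mathbf{A}]^{-1} [\mathbf{b}]$ (using the pseudo-inverse, as $[\mathbf{A}]$ is generally not square), but explicitly inverting the matrix is well known to be numerically unstable, so we proceed with an SVD solving instead.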

For Eigen usage, see http://eigen.tuxfamily.org/dox/TutorialLinearAlgebra.html
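To make the system above concrete, here is a small dependency-free sketch that solves it in the least-squares sense. Please note that it does NOT use an SVD: for illustration only, it solves the normal equations $[\mathbf{A}]^T [\mathbf{A}] [\mathbf{x}] = [\mathbf{A}]^T [\mathbf{b}]$ by Gaussian elimination with partial pivoting, which is less robust than the SVD approach described above. The names `Vec`, `Mat` and `LeastSquares` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::vector<double> Vec;
typedef std::vector<Vec> Mat;

// Least-squares solution of A x = b through the normal equations
// (illustration only, an SVD solver is preferable in practice).
Vec LeastSquares(const Mat& A, const Vec& b)
{
    const std::size_t m = A.size();
    assert(m > 0 && b.size() == m);
    const std::size_t n = A[0].size();
    // Augmented matrix [A^T A | A^T b]: n lines, n+1 columns.
    Mat M(n, Vec(n + 1, 0.0));
    for (std::size_t r = 0; r < m; ++r) {
        assert(A[r].size() == n);
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = 0; j < n; ++j)
                M[i][j] += A[r][i] * A[r][j];
            M[i][n] += A[r][i] * b[r];
        }
    }
    // Forward elimination with partial pivoting.
    for (std::size_t c = 0; c < n; ++c) {
        std::size_t piv = c;
        for (std::size_t r = c + 1; r < n; ++r)
            if (std::fabs(M[r][c]) > std::fabs(M[piv][c])) piv = r;
        assert(std::fabs(M[piv][c]) > 1e-12);  // singular: not enough data
        std::swap(M[c], M[piv]);
        for (std::size_t r = c + 1; r < n; ++r) {
            const double f = M[r][c] / M[c][c];
            for (std::size_t j = c; j <= n; ++j)
                M[r][j] -= f * M[c][j];
        }
    }
    // Back substitution.
    Vec x(n, 0.0);
    for (std::size_t i = n; i-- > 0;) {
        double s = M[i][n];
        for (std::size_t j = i + 1; j < n; ++j)
            s -= M[i][j] * x[j];
        x[i] = s / M[i][i];
    }
    return x;
}
```

Each line of `A` holds a leading 1 followed by the input values of one data point, as in the matrix above, and the returned vector holds the constant term followed by one coefficient per input.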
