Settings for Apriori Algorithm
The defaults have been selected to work well in most cases. For more information about confidence and support, see Support and Confidence. You can change the following:
- Minimum Support: A number between 0 and 100 indicating a percentage. Smaller values result in slower builds and require more system resources. The default is 5%.
- Minimum Confidence: The minimum confidence required of a rule. A number between 0 and 100 indicating a percentage. Higher values result in faster builds. The default is 10%.
- Limit Number of Attributes in Each Rule: This option is selected by default. When it is selected, specify the maximum number of attributes in each rule; this number must be an integer between 2 and 20, and the default value is 3. Higher limits result in slower builds, and a large number of attributes in each rule increases the number of rules considerably. A good practice is to start with the default and increase this number slowly. To specify no limit for the number of attributes in a rule, clear the check box.
To restore the default values, click Restore.
If you increase Minimum Support and Minimum Confidence, fewer rules are generated.
In most cases, confidence should be strictly greater than support.
If a model has 0 rules, you must modify the build options and rebuild the model. First try decreasing Minimum Support; if that doesn't work, decrease Minimum Confidence. You may need to specify a value less than 1 for either setting.
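The effect of the two thresholds on the number of rules can be illustrated with a minimal Python sketch. This is not how the product builds a model; the rule list, its support and confidence values, and the helper filter_rules are all hypothetical, and the sketch assumes that a rule is kept when its values are greater than or equal to the minimums.

```python
# Hypothetical candidate rules: (rule, support %, confidence %).
# These values are made up for illustration only.
rules = [
    ("A -> B", 12.0, 40.0),
    ("A, B -> C", 2.0, 10.0),
    ("C -> D", 6.0, 9.0),
    ("B -> D", 8.0, 25.0),
    ("D -> A", 0.5, 50.0),
]


def filter_rules(rules, min_support, min_confidence):
    """Keep rules that meet both minimums (assumed to be inclusive)."""
    return [
        rule
        for rule in rules
        if rule[1] >= min_support and rule[2] >= min_confidence
    ]


# Defaults from this page: Minimum Support 5%, Minimum Confidence 10%.
defaults = filter_rules(rules, 5.0, 10.0)

# Raising both minimums keeps fewer rules.
stricter = filter_rules(rules, 10.0, 30.0)

# Lowering both minimums (support below 1) keeps more rules.
looser = filter_rules(rules, 0.5, 5.0)

print(len(defaults), len(stricter), len(looser))
```

With this made-up rule list, raising both minimums leaves fewer rules than the defaults, and lowering Minimum Support below 1 brings back the rule that occurs in very few baskets.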
Click OK to return to the calling page.
Support and Confidence
Support and confidence are defined as follows:
- Support for a rule is the percentage of baskets containing the items in the rule.
- Confidence for a rule is the percentage of baskets containing all of the items in the antecedent that also contain the consequent.
For example, suppose that there are 100 baskets in the training dataset. Suppose that 20 baskets contain A and B. Suppose that 2 of the 20 baskets that contain A and B also contain C. Then the support for "if A and B then C" is 2% (2 out of 100) and the confidence is 10% (2 out of 20).
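The arithmetic in this example can be reproduced with a short Python sketch. The function support_and_confidence and the basket data are hypothetical, constructed only to match the counts in the example (100 baskets, 20 containing A and B, 2 of those also containing C); this is not the product's implementation.

```python
def support_and_confidence(baskets, antecedent, consequent):
    """Return (support %, confidence %) for the rule antecedent -> consequent."""
    antecedent = set(antecedent)
    rule_items = antecedent | set(consequent)

    # Baskets that contain every item in the antecedent.
    n_antecedent = sum(1 for basket in baskets if antecedent <= set(basket))
    # Baskets that contain every item in the rule (antecedent and consequent).
    n_rule = sum(1 for basket in baskets if rule_items <= set(basket))

    support = 100.0 * n_rule / len(baskets)
    confidence = 100.0 * n_rule / n_antecedent if n_antecedent else 0.0
    return support, confidence


# 2 baskets with A, B, and C; 18 with A and B only; 80 with neither A nor B.
baskets = [{"A", "B", "C"}] * 2 + [{"A", "B"}] * 18 + [{"D"}] * 80

support, confidence = support_and_confidence(baskets, {"A", "B"}, {"C"})
print(support, confidence)  # 2.0 10.0
```

Because the baskets that contain the antecedent can never outnumber all baskets, the confidence of a rule computed this way is never lower than its support.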
Copyright © 2006, 2008, Oracle. All rights reserved.