NULL values usually indicate missing values. The following Oracle Data Mining algorithms assume that NULL values are indicators of missing values (and are not indicators of sparse data): Naive Bayes, Attribute Importance, k-Means (Java interface), and O-Cluster.
Most Oracle Data Mining algorithms are robust in handling missing values and do not require users to treat missing values in any special way. Oracle Data Mining ignores the missing values in a case but still uses the non-missing data in that case. See Recommended Missing Values Treatments for the exceptions.
There are several ways to treat missing values; one way is to replace the missing value with a "typical" value, such as the mean or the mode.
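As an illustration of replacing a missing value with a "typical" value, the following is a minimal sketch in plain Python. It is not Oracle Data Mining code. The function name `impute`, the `kind` argument, and the sample data are hypothetical, and missing values are assumed to be represented as `None`.

```python
from collections import Counter
from statistics import mean


def impute(values, kind):
    """Replace None entries with the mean (numerical) or the mode (categorical)."""
    present = [v for v in values if v is not None]
    if not present:
        # No observed values, so there is no typical value to fill in.
        return list(values)
    if kind == "numerical":
        fill = mean(present)
    else:
        # Most frequent observed category.
        fill = Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


# Hypothetical sample attributes with one missing value each.
ages = impute([30, None, 50], "numerical")
colors = impute(["red", None, "red", "blue"], "categorical")
```

The typical value is computed from the non-missing entries only, so the missing entries themselves do not influence the mean or the mode.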
For best results with the following algorithms, replace missing values with the mean for numerical data and the mode for categorical data:
If you invoke the Missing Values Transformation from a Mining Activity, default treatments are specified.
There is a difference between a missing value that is unknown and a missing value that has meaning. For example, if an attribute contains a list of products and quantities purchased by a customer in a retail store, each entry may have a small number of products chosen from thousands of possible products. Most of the products are "missing" in the sense that the missing products were not purchased. Such an attribute is said to be sparse. Normally you don't want to impose a missing value treatment on that attribute. The default is to skip such attributes in defining a Missing Value Treatment, but you can change that by clicking the checkbox below the Case Count. See Sparsity for more information.
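The distinction between an unknown value and a meaningful absence can be sketched in plain Python. This is an illustration only, not Oracle Data Mining code. The names `basket`, `catalog`, and `quantity` are hypothetical, and the sketch assumes that a product absent from a customer's purchases means a quantity of zero.

```python
def quantity(basket, product):
    """For a sparse attribute, an absent product means 'not purchased' (0), not 'unknown'."""
    return basket.get(product, 0)


# Hypothetical data: one customer's purchases out of a larger catalog.
basket = {"milk": 2, "bread": 1}
catalog = ["milk", "bread", "eggs", "tea"]

# Expanding the sparse record to one value per product fills absences with 0.
dense = [quantity(basket, p) for p in catalog]
```

Replacing the absent entries with the mean quantity instead would wrongly suggest that the customer bought every product in the catalog, which is why a missing value treatment is normally not applied to sparse attributes.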
Copyright © 2006, 2008, Oracle. All rights reserved.