Review Data Usage

The wizard displays the attributes in a grid.

The wizard determines a default mining type for each attribute and determines which attributes are sparse. You must check that these defaults are appropriate.

You can perform the following tasks: select mining inputs, change the mining type, and change sparsity.

To see a summary of the data, click Data Summary.

When you are done, click Next to proceed.

Select Mining Inputs

By default, all columns (attributes) are used as input, except for columns that have one value or only a very small number of values (that is, attributes that are constants) and columns that are not supported by the selected algorithm. To change whether a column is selected, click the checkbox in the Input column.
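The default selection rule described above can be sketched in Python. This is an illustration only, not the wizard's implementation: the function name default_inputs, the min_distinct cutoff, and the unsupported set are hypothetical, and the wizard's actual threshold for "a very small number of values" is not stated in this section.

```python
def default_inputs(columns, unsupported=frozenset(), min_distinct=2):
    """Return the names of columns that would be selected as input by default.

    columns: dict mapping column name -> list of values (None stands for NULL).
    unsupported: names of columns the chosen algorithm cannot use (hypothetical).
    min_distinct: assumed cutoff; columns with fewer distinct values are excluded.
    """
    selected = []
    for name, values in columns.items():
        if name in unsupported:
            continue  # not supported by the selected algorithm
        distinct = {v for v in values if v is not None}
        if len(distinct) < min_distinct:
            continue  # constant column, excluded by default
        selected.append(name)
    return selected


data = {
    "AGE": [23, 41, 35, 41],
    "COUNTRY": ["US", "US", "US", "US"],  # constant, so excluded
    "NOTES": ["a", "b", "c", "d"],        # assumed unsupported in this example
}
print(default_inputs(data, unsupported={"NOTES"}))
```

In this sketch only AGE remains selected. Clicking a checkbox in the Input column corresponds to overriding the result for a single column.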

Two buttons below the grid allow you to include or exclude all attributes:

Change Mining Type

The wizard determines a default mining type for each attribute.

For details about the text mining type, see Text Mining Type.

You can change mining types in some cases:

Mining Type for O-Cluster

If you are building an O-Cluster model, ensure that

Text Mining Type

The wizard sets the mining type to text only when the datatype obviously indicates that the field is text (that is, for datatypes such as BLOB or CLOB) and the algorithm that you selected supports text data. If a text attribute has some other datatype, such as VARCHAR2, the default mining type is categorical. If the attribute should have mining type text, you must select text explicitly.
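The defaulting rule for text can be summarized as a small Python sketch. It covers only the cases this section describes; the function name default_mining_type and the TEXT_DATATYPES set are hypothetical, and the set lists just the two datatypes named above, which may not be exhaustive.

```python
# Datatypes this section names as obviously textual (assumed, not exhaustive).
TEXT_DATATYPES = {"BLOB", "CLOB"}


def default_mining_type(datatype, algorithm_supports_text):
    """Return the default mining type described above, or None when the
    section does not say what the default is."""
    dt = datatype.upper()
    if dt in TEXT_DATATYPES and algorithm_supports_text:
        return "text"
    if dt == "VARCHAR2":
        return "categorical"  # must be changed to text explicitly if needed
    return None  # case not covered by this section


print(default_mining_type("CLOB", True))      # text
print(default_mining_type("VARCHAR2", True))  # categorical
```

A VARCHAR2 column that holds free-form text therefore starts out as categorical in this sketch, which is why the explicit change to text is needed.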

Note: Not all algorithms support text mining. See Text Mining for more information.

Change Sparsity

An attribute can be sparse. For non-transactional data, sparsity indicates the percentage of cases with a NULL value for that attribute. For transactional data, sparsity indicates the percentage of possible values included. For example, if an average customer record shows the purchase of 4 of a possible 10,000 products, then that transactional attribute is sparse. The wizard applies internal heuristics to decide whether to check the sparsity indicator on this page; you can change the indicator if you know something about the data that contradicts the heuristics.
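The two percentages mentioned above can be worked through in a short Python sketch. The function names are hypothetical, and the sketch only computes the percentages; it does not reproduce the wizard's internal heuristics, which are not described here.

```python
def null_percentage(values):
    """Percentage of cases whose value is NULL (None) for one
    non-transactional attribute."""
    if not values:
        raise ValueError("values must not be empty")
    nulls = sum(1 for v in values if v is None)
    return 100.0 * nulls / len(values)


def percent_of_possible_values(present, possible):
    """Percentage of the possible values that appear in an average
    transactional record."""
    if possible <= 0:
        raise ValueError("possible must be positive")
    return 100.0 * present / possible


print(null_percentage([None, None, 7, None]))  # 75.0
print(percent_of_possible_values(4, 10_000))   # 0.04
```

For the example in the text, 4 of 10,000 possible products is 0.04 percent of the possible values, which is why that attribute is considered sparse.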

You can change the value in the Sparsity column by clicking the checkbox.

Note: Text data is not sparse. If you change the mining type to text, do not select the sparsity checkbox.