The wizard displays the attributes in a grid.
The wizard determines a default mining type for each attribute and determines which attributes are sparse. You must check that these defaults are appropriate.
You can perform the following tasks:
To see a summary of the data, click Data Summary.
When you are done, click Next to proceed.
The default is to use all columns (attributes) as input, except for columns that have one value or only a very small number of values (that is, attributes that are constants) and columns that are not supported by the selected algorithm. You can change the selection of a column by clicking the checkbox in the Input column.
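The default selection rule can be sketched roughly as follows. This is only an illustration, not the wizard's actual logic: the function name, the `min_distinct` cutoff (the text does not say how small "a very small number of values" is), and the sample column names are all assumptions.

```python
def default_input_columns(distinct_counts, supported_columns, min_distinct=2):
    """Return the column names that would be selected as input by default.

    distinct_counts: dict mapping column name -> number of distinct values.
    supported_columns: set of column names the selected algorithm can use.
    min_distinct: hypothetical cutoff; columns with fewer distinct values
        are treated as constants and left out.
    """
    return [
        name
        for name, n_distinct in distinct_counts.items()
        if n_distinct >= min_distinct and name in supported_columns
    ]


# Hypothetical example: COUNTRY has a single value, COMMENTS is unsupported.
counts = {"AGE": 57, "COUNTRY": 1, "COMMENTS": 900}
supported = {"AGE", "COUNTRY"}
selected = default_input_columns(counts, supported)
print(selected)
```

Whatever the defaults are, the checkbox in the Input column remains the way to override them.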
Two buttons below the grid allow you to include or exclude all attributes:
The wizard determines a default mining type for each attribute.
For details about the text mining type, see Text Mining Type.
You can change mining types in some cases:
NUMBER: you can change the mining type between categorical and numerical.
CHAR or VARCHAR2: you can change the mining type between categorical and text. You can select text as a mining type only if the selected algorithm supports text.
If you are building an O-Cluster model, ensure that
numerical mining type
categorical mining type
The wizard sets the mining type to text only when the datatype clearly indicates that the field is text (that is, for datatypes such as BLOB or CLOB) and the algorithm that you selected supports text data. If a text attribute has some other datatype, such as VARCHAR2, the default mining type is categorical. If the attribute should have mining type text, you must select text explicitly.
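The rules above can be summarized in a short sketch. It is not the wizard's implementation: the function names are hypothetical, and cases that this page does not describe (for example, the default for NUMBER, or BLOB and CLOB when the algorithm does not support text) are deliberately left undefined.

```python
def allowed_mining_types(datatype, algorithm_supports_text):
    """Mining types the user may switch between, per the rules on this page."""
    dt = datatype.upper()
    if dt == "NUMBER":
        return {"categorical", "numerical"}
    if dt in ("CHAR", "VARCHAR2"):
        types = {"categorical"}
        if algorithm_supports_text:
            types.add("text")  # text is offered only if the algorithm supports it
        return types
    return set()  # no change is described here for other datatypes


def default_mining_type(datatype, algorithm_supports_text):
    """Default mining type where this page states one, else None."""
    dt = datatype.upper()
    if dt in ("BLOB", "CLOB") and algorithm_supports_text:
        return "text"
    if dt in ("CHAR", "VARCHAR2"):
        return "categorical"  # text must be selected explicitly
    return None  # default not stated on this page


print(default_mining_type("CLOB", True))
print(default_mining_type("VARCHAR2", True))
print(sorted(allowed_mining_types("VARCHAR2", True)))
```

The sketch makes the VARCHAR2 case visible: text is an allowed choice when the algorithm supports it, but it is never the default.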
Note: Not all algorithms support text mining. See Text Mining for more information.
An attribute can be sparse. For non-transactional data, sparsity indicates the percentage of cases with a NULL value for that attribute. For transactional data, sparsity indicates the percentage of possible values included. For example, if an average customer record shows the purchase of 4 of a possible 10,000 products, then that transactional attribute is sparse. Internal heuristics determine whether the sparsity indicator on this page is checked; you can change the indicator if you have knowledge that contradicts the heuristics.
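The two percentages described above can be computed as in the following sketch. The function names are hypothetical, and the sketch only computes the percentages; the heuristics that decide whether the sparsity indicator is checked are internal and are not reproduced here.

```python
def null_percentage(values):
    """Non-transactional data: percentage of cases whose value is NULL (None)."""
    if not values:
        return 0.0
    null_count = sum(1 for v in values if v is None)
    return 100.0 * null_count / len(values)


def included_percentage(values_present, possible_values):
    """Transactional data: percentage of the possible values that are included."""
    return 100.0 * values_present / possible_values


# A column with two NULLs in four cases.
print(null_percentage([1, None, 3, None]))

# The example from the text: 4 of a possible 10,000 products.
print(included_percentage(4, 10_000))
```

In the purchase example, only a small fraction of one percent of the possible products appears in an average record, which is why that attribute is considered sparse.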
You can change the value in the Sparsity column by clicking the checkbox.
Note: Text data is not sparse; that is, if you change the mining type to text, do not click the sparsity checkbox.
Copyright © 2006, 2008, Oracle. All rights reserved.