Data mining algorithms often require that input data be discretized (binned) before the data is used for model building, testing, and applying (scoring). Binning means grouping related values together in bins to reduce the number of distinct values for an attribute. For example, you might bin ages into 10-year groups. After you bin data, you perform calculations based on the bins rather than on the original values. Having fewer distinct values typically leads to a more compact model that builds faster, but it can also cause some loss of accuracy. The target attribute is not usually binned.
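As an illustration of the age example, the following Python sketch groups ages into 10-year bins and counts the distinct values before and after. The function name bin_age, the label format, and the sample ages are assumptions made for this example; they are not part of Oracle Data Miner.

```python
def bin_age(age, width=10):
    """Map an age to the label of its fixed-width bin, e.g. 34 -> '30-39'."""
    if age < 0:
        raise ValueError("age must be non-negative")
    lower = (age // width) * width          # lower boundary of the bin
    return f"{lower}-{lower + width - 1}"   # inclusive upper boundary

# Hypothetical sample data for illustration only.
ages = [23, 27, 34, 35, 41, 58, 59]
binned = [bin_age(a) for a in ages]

distinct_before = len(set(ages))    # one distinct value per original age
distinct_after = len(set(binned))   # one distinct value per occupied bin

print(binned)
print(distinct_before, distinct_after)
```

The seven distinct ages collapse into four bins, which is the reduction in distinct values that binning is meant to achieve.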
You can either create bins manually, using a strategy of your own (with the Discretization Wizard or by some other means), or follow the recommendations of a mining activity. Different algorithms may require different binning strategies. Because binning can affect a model's accuracy, getting the best results usually requires a careful binning strategy based on information about the data or the problem.
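To show why the choice of strategy matters, the sketch below compares two common generic strategies on the same skewed data: equal-width bins and quantile (equal-frequency) bins. The source does not say which strategies the Discretization Wizard or a mining activity uses, so both functions and the sample incomes are assumptions for illustration.

```python
import statistics

def equal_width_edges(values, n_bins):
    """Interior boundaries that split the value range into equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]

def quantile_edges(values, n_bins):
    """Interior boundaries that put roughly the same number of values in each bin."""
    return statistics.quantiles(values, n=n_bins)

# Hypothetical, strongly skewed sample data.
incomes = [12, 15, 18, 20, 22, 25, 30, 45, 90, 200]

width_edges = equal_width_edges(incomes, 4)
freq_edges = quantile_edges(incomes, 4)

print("equal-width boundaries:", width_edges)
print("quantile boundaries:   ", freq_edges)
```

On skewed data like this, equal-width boundaries leave most values in the first bin, while quantile boundaries spread them more evenly. Which behavior is preferable depends on the algorithm and the problem.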
Use the Discretize transformation to create a view based on your discretization strategy; you can then pass the view to a data mining object.
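The idea of a view that exposes binned values can be sketched outside Oracle Data Miner. The example below uses Python's built-in sqlite3 module with an in-memory database. It is not the output of the Discretize transformation; the table name customers, the view name customers_binned, and the 10-year strategy are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table with an unbinned age attribute.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, age INTEGER)")
conn.executemany(
    "INSERT INTO customers (id, age) VALUES (?, ?)",
    [(1, 23), (2, 34), (3, 35), (4, 58)],
)

# The view applies the binning strategy; the source table is left unchanged.
# Integer division maps each age to the lower boundary of its 10-year bin.
conn.execute(
    """
    CREATE VIEW customers_binned AS
    SELECT id, (age / 10) * 10 AS age_bin_start
    FROM customers
    """
)

rows = conn.execute(
    "SELECT id, age_bin_start FROM customers_binned ORDER BY id"
).fetchall()
print(rows)
```

Because the strategy lives in the view definition, anything that reads from the view sees binned values without the original data being modified.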
Note: If you use your own binning strategy to build a model, you must bin data in the same way when you test or apply the model.
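A minimal way to honor this requirement is to define the bin boundaries once and reuse them for every data set. The sketch below assumes hand-chosen boundaries; the names AGE_EDGES and to_bins and the sample data are hypothetical.

```python
from bisect import bisect_right

# Boundaries chosen once, when the model is built (assumed values).
AGE_EDGES = [30, 40, 50]

def to_bins(values, edges):
    """Return the bin index of each value: 0 is below the first edge,
    len(edges) is at or above the last edge."""
    return [bisect_right(edges, v) for v in values]

build_ages = [20, 30, 40, 50, 60]   # data used to build the model
apply_ages = [18, 33, 47, 75]       # new data to be scored

build_bins = to_bins(build_ages, AGE_EDGES)
apply_bins = to_bins(apply_ages, AGE_EDGES)   # same edges, not recomputed

print(build_bins)
print(apply_bins)
```

Recomputing boundaries from the apply data would change what each bin index means, so the model would see values it was not built to interpret.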
Copyright © 2006, 2008, Oracle. All rights reserved.