Support Vector Machine (SVM) is a classification and regression prediction tool that uses machine learning theory to maximize predictive accuracy while automatically avoiding over fit of the data.
One-class SVM classifiers are used for anomaly detection; for more information, see Anomaly Detection.
The SVM algorithm is actually a suite of algorithms, adaptable for use with a variety of problems and data. By swapping one kernel for another, SVM can fit diverse problem spaces. Oracle Data Mining supports two kernels, Linear and Gaussian. Data records with n
attributes can be thought of as points in n
-dimensional space. SVM attempts to separate the points into subsets with homogeneous target values, by hyperplanes in the linear case, and by non-linear separators in the non-linear case (Gaussian). SVM finds the vectors that define the separators giving the widest separation of classes (the "support vectors").
SVM solves regression problems (problems where the target attributes have continuous values) by defining an n
-dimensional tube around the data points, determining the vectors giving the widest separation.
SVM can emulate traditional methods such as linear regression, radial basis functions, and neural nets, but goes far beyond those methods in flexibility, scalability, and speed. For example, SVM can act like a neural net in calculating predictions, but can work on data with thousands of attributes, far more attributes than a neural net supports. Moreover, while a neural net might mistake a local change in direction as a point of minimum error, SVM tries to find the global point of minimum error.
There is no upper limit on the number of attributes and target cardinality for SVMs.
SVMs perform well with real-world applications such as classifying text, recognizing hand-written characters, classifying images, as well as bioinformatics and biosequence analysis. Their introduction in the early 1990's led to an explosion of applications and deepening theoretical analysis that established SVM along with neural networks as one of the standard tools for machine learning and data mining.
SVM can be used for regression and classification problems where the input table has text columns. For details, see Text Mining.
Copyright © 2006, 2008, Oracle. All rights reserved.