One-Class Support Vector Machine Algorithm

The One-Class Support Vector Machine (SVM) algorithm solves anomaly detection problems, which can be treated as a kind of classification problem. Standard classification algorithms require both positive examples and negative examples (counterexamples) for a target class. One-class SVM classification requires only examples of a single target class. The model learns to discriminate between the known examples of the positive class and the unknown set of negative counterexamples. The goal is to estimate a function that is positive if an example belongs to the set and negative or zero if the example belongs to the complement of the set.
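The role of such a function can be illustrated with a minimal sketch. It assumes the commonly used kernel form of a one-class SVM decision function, f(x) = sum of alpha_i * K(sv_i, x) minus rho, with a Gaussian (RBF) kernel. The support vectors, coefficients, offset, and gamma below are hypothetical, hand-picked values, not the output of a trained model, and the code does not represent the Oracle Data Mining implementation.

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)

def decision_value(x, support_vectors, alphas, rho, gamma=0.5):
    """Kernel-weighted similarity to the support vectors, minus the offset rho."""
    similarity = sum(
        alpha * rbf_kernel(sv, x, gamma)
        for sv, alpha in zip(support_vectors, alphas)
    )
    return similarity - rho

def is_typical(x, support_vectors, alphas, rho, gamma=0.5):
    """Positive value: x fits the profile. Zero or negative: x is an anomaly."""
    return decision_value(x, support_vectors, alphas, rho, gamma) > 0

# Hypothetical parameters chosen for illustration only (not learned from data).
SUPPORT_VECTORS = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
ALPHAS = [0.4, 0.3, 0.3]
RHO = 0.5

near_point = (0.3, 0.3)  # close to the known examples
far_point = (5.0, 5.0)   # far from the known examples

print(is_typical(near_point, SUPPORT_VECTORS, ALPHAS, RHO))
print(is_typical(far_point, SUPPORT_VECTORS, ALPHAS, RHO))
```

In this sketch, a point close to the hypothetical support vectors receives a positive value, and a distant point receives a negative value and is treated as an anomaly.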

Solving such a "one-class" classification problem is difficult. The goal of anomaly detection is to provide useful information where none was previously attainable. To do this, the model creates a "profile" of the known class and applies that profile to the general population to identify individuals who are "different" from the profile in some way.

One-class SVM models, unlike SVM classification and regression models, do not have a target.

One-class SVM models are also useful in cases where it is difficult to provide counterexamples. For example, in text document classification, it is easy to identify documents that belong to a given topic. However, the universe of documents that do not belong to the topic can be very large, and it may not be feasible to provide counterexamples.

If you have enough examples of the rare event to serve as counterexamples, you can still build an anomaly detection model; in this case, however, an SVM classification model may provide better results.

You can use one-class SVM for anomaly detection where the input table has text columns. For details, see Text Mining.

SVM is the only Oracle Data Mining algorithm that supports one-class mode.

Another way to detect rare events is to build a clustering model, apply it, and then find items that do not fit in any cluster.
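The clustering approach can be sketched as follows. This is a minimal illustration that assumes a simple k-means procedure with caller-supplied starting centroids and a hypothetical distance threshold (max_distance) for deciding that an item does not fit in any cluster. The sample data, function names, and threshold are assumptions made for this example and do not represent an Oracle Data Mining clustering algorithm.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Basic k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else old
            for members, old in zip(clusters, centroids)
        ]
    return centroids

def find_outliers(points, centroids, max_distance):
    """Return the points whose nearest centroid is farther away than max_distance."""
    return [
        p for p in points
        if min(math.dist(p, c) for c in centroids) > max_distance
    ]

# Hypothetical sample data: two tight groups plus one item that fits neither.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
    (5.0, 20.0),
]

# Build the model, apply it, and flag the items that do not fit any cluster.
centroids = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
outliers = find_outliers(points, centroids, max_distance=4.0)
print(outliers)
```

The threshold determines how far an item may lie from its nearest cluster center before it is reported, so it must be chosen to suit the scale of the data.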