Non-Negative Matrix Factorization Algorithm

Non-Negative Matrix Factorization (NMF) is the algorithm used for feature extraction. NMF decomposes multivariate data by creating a user-defined number of features, which results in a reduced representation of the original data. NMF decomposes a data matrix V into the product of two lower rank matrices W and H so that V is approximately equal to the product WH. NMF uses an iterative procedure to modify the initial values of W and H so that the product approaches V. The procedure terminates when the approximation error converges or the specified number of iterations is reached. Each feature is a linear combination of the original attribute set; the coefficients of these linear combinations are non-negative. During model apply, an NMF model maps the original data into the new set of attributes (features) discovered by the model.

There is no upper limit on the number of attributes or on the target cardinality for NMF.

NMF in Text Mining

Text mining involves extracting information from unstructured data. Further, typical text data is high-dimensional and sparse. NMF reduces dimensionality of the data, and therefore, can be useful in text mining.

You can use NMF for feature extraction where the input table has text columns. For details, see Text Mining.