Non-negative Matrix Factorization - Background

Background

Let matrix be the product of the matrices and such that:

Matrix multiplication can be implemented as linear combinations of column vectors in with coefficients supplied by cell values in . Each column in can be computed as follows:


\mathbf{v}_i = \sum_{j=1}^N \mathbf{H}_{ji}\mathbf{w}_j

where: is the number of columns in

is the column vector of the product matrix

is the cell value in the row and column of the matrix

is the column of the matrix

When multiplying matrices the factor matrices can be of significantly lower rank than the product matrix and it's this property that forms the basis of NMF. If we can factorize a matrix into factors of significantly lower rank than the original matrix then the column vectors of the first factor matrix can be considered as spanning vectors of the vector space defined by the original matrix.

Here's an example based on a text-mining application:

  • Let the input matrix (the matrix to be factored) be with 10000 rows and 500 columns where words are in rows and documents are in columns. In other words, we have 500 documents indexed by 10000 words. It follows that a column vector in represents a document.
  • Assume we ask the algorithm to find 10 features in order to generate a features matrix with 10000 rows and 10 columns and a coefficients matrix with 10 rows and 500 columns.
  • The product of and is a matrix with 10000 rows and 500 columns, the same shape as the input matrix and, if the factorization worked, also a reasonable approximation to the input matrix .
  • From the treatment of matrix multiplication above it follows that each column in the product matrix is a linear combination of the 10 column vectors in the features matrix with coefficients supplied by the coefficients matrix .

This last point is the basis of NMF because we can consider each original document in our example as being built from a small set of hidden features. NMF generates these features.

It's useful to think of each feature (column vector) in the features matrix as a document archetype comprising a set of words where each word's cell value defines the word's prominence in the feature: the higher a word's cell value the higher the word's prominence in the feature. A column in the coefficients matrix represents an original document with a cell value defining the document's ranking for a feature. This follows because each row in represents a feature. We can now reconstruct a document (column vector) from our input matrix by a linear combination of our features (column vectors in ) where each feature is weighted by the feature's cell value from the document's column in .

Read more about this topic:  Non-negative Matrix Factorization

Famous quotes containing the word background:

    I had many problems in my conduct of the office being contrasted with President Kennedy’s conduct in the office, with my manner of dealing with things and his manner, with my accent and his accent, with my background and his background. He was a great public hero, and anything I did that someone didn’t approve of, they would always feel that President Kennedy wouldn’t have done that.
    Lyndon Baines Johnson (1908–1973)

    They were more than hostile. In the first place, I was a south Georgian and I was looked upon as a fiscal conservative, and the Atlanta newspapers quite erroneously, because they didn’t know anything about me or my background here in Plains, decided that I was also a racial conservative.
    Jimmy Carter (James Earl Carter, Jr.)

    Pilate with his question “What is truth?” is gladly trotted out these days as an advocate of Christ, so as to arouse the suspicion that everything known and knowable is an illusion and to erect the cross upon that gruesome background of the impossibility of knowledge.
    Friedrich Nietzsche (1844–1900)