One of the simplest, yet most heavily constrained, forms of matrix factorization is vector quantization (VQ). Heavily used in image and video compression, the VQ problem is a factorization X ≈ UV, where V (our dictionary) is called the codebook and is designed to cover the cloud of data points effectively, and each row of U is a unit vector.

This means that each data point x_i is approximated as x_i ≈ v_k. In other words, the closest row vector (codeword / dictionary atom) v_k of V is chosen as an approximation, and this is encoded as a unit vector u_i = e_k. The data representation U is composed of such vectors.
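As a concrete illustration, here is a minimal NumPy sketch of this encoding step. The data and codebook are random toy values; the names X, U, and V follow the factorization above.

```python
import numpy as np

rng = np.random.RandomState(0)

# Toy data: n points in d dimensions, and a codebook V with k codewords as rows.
X = rng.randn(10, 4)          # data matrix, one point per row
V = rng.randn(3, 4)           # codebook: k = 3 codewords

# Assign each point to its closest codeword (Euclidean distance).
dists = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=-1)
codes = dists.argmin(axis=1)  # index of the chosen codeword for each point

# Encode each point as a unit vector: U[i] = e_{codes[i]}, so that X ≈ U @ V.
U = np.zeros((X.shape[0], V.shape[0]))
U[np.arange(X.shape[0]), codes] = 1.0

X_approx = U.dot(V)           # each row is simply the selected codeword
```

Each row of U contains a single 1, so storing the representation only takes one integer index per data point.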

There is a variation called gain-shape VQ where, instead of approximating each point as one of the codewords, we allow a scalar multiplication invariance: x_i ≈ α v_k. This model requires considerably more storage (each data point needs a floating-point number and an unsigned index, as opposed to just the index), but it leads to a much better approximation.

Gain-shape VQ can equivalently be accomplished by normalizing each data vector prior to fitting the codebook.
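Here is a minimal sketch of gain-shape encoding, assuming a codebook whose rows have already been normalized to unit length; under that assumption the optimal gain for a codeword v is simply the inner product of x with v, and the best codeword is the one with the largest absolute inner product.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(10, 4)                            # data points, one per row
V = rng.randn(3, 4)
V /= np.linalg.norm(V, axis=1, keepdims=True)   # unit-norm codewords (shapes)

# Gain-shape VQ: pick the codeword maximizing |<x, v_k>| and use that
# inner product as the scalar gain alpha, so that x ≈ alpha * v_k.
sims = X.dot(V.T)                               # inner products with each codeword
codes = np.abs(sims).argmax(axis=1)             # shape index for each point
gains = sims[np.arange(len(X)), codes]          # gain alpha for each point

X_approx = gains[:, None] * V[codes]            # x_i ≈ alpha_i * v_{k_i}
```

Since each approximation is an orthogonal projection of x onto the line spanned by the chosen codeword, the residual can never be larger than the norm of x itself.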

In order to fit a codebook for efficient VQ use, the K-Means clustering [1] algorithm is a natural choice. K-means is an iterative algorithm that incrementally improves the dispersion of k cluster centers in the data space until convergence. The cluster centers are initialized in a random or procedural fashion; then, at each iteration, each data point is assigned to the closest cluster center, and each center is subsequently moved to the mean of the points assigned to it.
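The loop described above can be sketched in a few lines of NumPy. This is a plain Lloyd-style iteration with random initialization from the data, written for illustration only, not the toolkit's implementation.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd iteration: assign each point to its nearest center,
    then move each center to the mean of the points assigned to it."""
    rng = np.random.RandomState(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # init from random data points
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):               # leave empty clusters where they are
                centers[j] = members.mean(axis=0)
    return centers, labels

# Example: two well-separated blobs of points
rng = np.random.RandomState(1)
X = np.vstack([rng.randn(50, 2) + 5, rng.randn(50, 2) - 5])
centers, labels = kmeans(X, 2)
```

The learned centers then serve directly as the rows of the codebook V.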

The `scikits.learn.decomposition.KMeansCoder` object from our work-in-progress dictionary learning toolkit can learn a dictionary from image patches using the K-Means algorithm, with optional local contrast normalization and a PCA whitening transform. Using a trained object to transform data points with orthogonal matching pursuit, with the parameter `n_atoms=1`, is equivalent to gain-shape VQ. Of course, you are free to use any method of sparse coding, such as LARS. The code used to produce the example images at the top of this post can be found in [2].

Using K-Means for learning the dictionary does not optimize over linear combinations of dictionary atoms, as standard dictionary learning methods do. However, it's considerably faster, and Adam Coates and Andrew Ng suggest in [3] that as long as the dictionary contains a large enough number of atoms and covers the cloud of data points (and of future test points) well enough, then K-Means, or even random sampling of image patches, can perform remarkably well for some tasks.


Fabian

July 11, 2011

Hey! This looks awesome. I've been doing some numerical experiments with the LARS algorithm and found it to be really unstable… I hope that is not affecting your work too much.

Fabian.

Sameer Gupta

July 14, 2011

I'd want to know more.

Vlad N.

July 18, 2011

I’d be happy to answer any questions.

Gael Varoquaux

July 14, 2011

The fact that LARS is unstable is an issue when you are trying to do model identification, but for denoising or prediction it is much less important, as you are not really using the estimated coefficients themselves.

fionntan

July 18, 2011

Nice work, man. Wondering though, is the dictionary learning toolkit available yet? I checked out the bleeding edge scikits.learn and it didn’t seem to be in it.

Cheers.

Vlad N.

July 18, 2011

Thank you! The dictionary learning branches are pending review. The first part of my GSoC project, the SparsePCA branch, was merged a couple of days ago. If you would like to look at my development branch (which should be usable; there are tests and examples), see the branches 'sc' and 'kmeanscoder' on my github.