Image denoising with dictionary learning

July 7, 2011

3

I am presenting an image denoising example that runs entirely under my local scikits-learn fork. Coming soon near you! The 400-square-pixel area covering Lena’s face was distorted by additive Gaussian noise with a standard deviation of 50 (pixel values range from 0 to 255). The dictionary contains 100 atoms of shape 4×4 and was trained […]

Dictionary learning sneak peek

June 24, 2011

2

Closing in on the goal of integrating J. Mairal’s dictionary learning in the scikit, I stitched together a couple of examples. The code is not yet integrated according to our standards, but here is the kind of result you can expect. Here is what a dictionary obtained from 8×8 patches of Lena looks like. Pretty […]

Posted in: Uncategorized

Summer of Code roadmap, part 1

June 12, 2011

2

After a busy little while, I have graduated and entered the summer vacation, which means time for serious GSoC work. So we had a little conference to discuss what will be done and when. We have gathered quite a few code snippets since the official start of the project, but it’s now time to […]

Posted in: Uncategorized

First thoughts on Orthogonal Matching Pursuit

May 30, 2011

4

I am working on implementing the Orthogonal Matching Pursuit (OMP) algorithm for the scikit. It is an elegant algorithm (that almost writes itself in NumPy!) to compute a greedy approximation to the solution of a sparse coding problem: $\arg\min_{\gamma} \|y - X\gamma\|_2^2$ subject to $\|\gamma\|_0 \le n_{\text{nonzero coefs}}$ or (in a different parametrization) $\arg\min_{\gamma} \|\gamma\|_0$ subject to $\|y - X\gamma\|_2^2 \le \text{tol}$. The second formulation is interesting in that it […]
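As a rough illustration of the greedy idea behind OMP (a minimal sketch, not the code from the post and not the scikit's implementation), the loop below picks the atom most correlated with the residual and then refits all selected atoms by least squares. The function name `omp_sketch`, the argument names, and the early-stopping threshold are assumptions made for this example; the dictionary columns are assumed to have unit norm.

```python
import numpy as np


def omp_sketch(X, y, n_nonzero_coefs):
    """Greedy sparse approximation of y using columns (atoms) of X.

    Hypothetical helper for illustration; assumes unit-norm columns in X.
    """
    residual = y.astype(float)  # astype returns a copy, y is left untouched
    support = []                # indices of the atoms selected so far
    sol = np.zeros(0)
    for _ in range(n_nonzero_coefs):
        # Stop early once the signal is already fully explained.
        if np.linalg.norm(residual) < 1e-12:
            break
        # Select the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(X.T @ residual)))
        support.append(idx)
        # Refit every selected atom jointly (the "orthogonal" step).
        sol, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ sol
    coef = np.zeros(X.shape[1])
    coef[support] = sol
    return coef


# Toy usage: with an orthonormal dictionary the two largest entries are recovered.
X = np.eye(4)
y = np.array([0.0, 3.0, 0.0, -2.0])
coef = omp_sketch(X, y, n_nonzero_coefs=2)
```

The joint least-squares refit is what separates this from plain matching pursuit: after each step the residual is orthogonal to every selected atom, so in exact arithmetic an atom is not picked twice.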

Posted in: Uncategorized

Sparse PCA

May 23, 2011

6

I have been working on the integration into the scikits.learn codebase of a sparse principal components analysis (SparsePCA) algorithm coded by Gaël and Alexandre and based on [1]. Because the name “sparse PCA” has some inherent ambiguity, I will describe in greater depth what problem we are actually solving, and what it can be used for. […]

Customizing scikits.learn for a specific text analysis task

April 29, 2011

2

Scikits.learn is a great general library, but machine learning has so many different applications that it is often very helpful to be able to extend its API to better integrate with your code. With scikits.learn, this is extremely easy to do using inheritance and using the pipeline module. The problem While continuing the morphophonetic analysis […]

Posted in: nlp, scikits.learn

An overview of dictionary learning: Terminology

April 15, 2011

2

My GSoC proposal is titled “Dictionary learning in scikits.learn” and in the project, I plan to implement methods used in state-of-the-art research and industry applications in signal and image processing. In this post, I want to clarify the terminology used. Usually the terms dictionary learning and sparse coding are used interchangeably. Also […]

Newton interpolation and numerical differentiation

April 15, 2011

1

I am sharing some Python code that I wrote as a school assignment. This computes the Newton form of the interpolation polynomial of a given set of points, and allows for the evaluation of both the polynomial and its derivative, at a given point. This is an accurate way of estimating the derivative of a […]
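The approach described above can be sketched as follows. This is a minimal illustration written for this page, not the assignment code from the post; the function names `divided_differences` and `newton_eval` are hypothetical. The coefficients come from the divided-difference table, and a Horner-style recurrence evaluates the polynomial and its derivative in one pass.

```python
def divided_differences(xs, ys):
    """Return the Newton-form coefficients f[x0], f[x0,x1], ... (hypothetical helper)."""
    coef = [float(v) for v in ys]
    n = len(xs)
    for j in range(1, n):
        # Work from the end so lower-order entries are still available.
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef


def newton_eval(coef, xs, t):
    """Evaluate the Newton-form polynomial and its derivative at t."""
    p = coef[-1]
    dp = 0.0
    for k in range(len(coef) - 2, -1, -1):
        # Product rule applied to p * (t - xs[k]) + coef[k];
        # dp must be updated before p is overwritten.
        dp = dp * (t - xs[k]) + p
        p = p * (t - xs[k]) + coef[k]
    return p, dp


# Toy usage: three samples of f(x) = x**2 determine the parabola exactly.
xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 4.0]
coef = divided_differences(xs, ys)
value, slope = newton_eval(coef, xs, 3.0)
```

For samples of a polynomial of degree at most `len(xs) - 1`, as in the toy usage, both the value and the derivative are reproduced exactly up to rounding; for other functions the derivative is only an estimate whose quality depends on the choice of nodes.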

Posted in: python

A look at Romanian verbs with scikits-learn

April 14, 2011

4

One of the problems we tackled here at my university is one as old as the modern Romanian language. It is a problem for linguists, as well as for foreigners trying to learn the language. We call it the root alternations problem. Similar to French and other languages, Romanian verbs are split into four groups […]

Posted in: nlp, scikits.learn

Tweaking matplotlib subplots for pretty results

April 4, 2011

1

When plotting multiple subplots using matplotlib, the axes rarely look pretty with the default configuration. Since matplotlib figures are abstract objects, designed for consistency in print as well as on screen, tweaking their layout can get tricky. An example The following code is taken from the face recognition example in scikits.learn: pl.figure(figsize=(1.8 * n_col, 2.4 […]

Posted in: python