A while back, Bob L. Sturm blogged about a similar implementation of OMP to the one in scikit-learn. Instead of using the Cholesky decomposition like we did, his Matlab code uses the QR decomposition, to a similar (or maybe even identical) outcome, in theory. So lucky that Alejandro pointed out to him the existence of […]
October 9, 2011
One year ago I had the chance to take a class on Monte Carlo simulation with prof. Ion Văduva, and my assignment for the class was to implement exactly what it says in the title of the blog post. I am going to walk you through the idea behind this. General formulation The ratio-of-uniforms is […]
September 20, 2011
Last week was marked by the international RANLP (Recent Advances in Natural Language Processing) conference, taking place in a nice spa in Hissar, Bulgaria. The excellent folks from the computational linguistics group at the University of Wolverhampton were behind it, together with the Institute of Information and Communication Technologies from the Bulgarian Academy of Sciences. […]
September 19, 2011
Thanks to Olivier, Gaël and Alex, who reviewed the code heavily the last couple of days, and with apologies for my lack of activity during a sequence of conferences, Dictionary learning has officially been merged into scikit-learn master, and just in time for the new scikit-learn 0.9 release. Here are some glimpses of the examples […]
September 5, 2011
Anybody reading my blog should have expected me to blog about the end of my GSoC. Sorry to disappoint, but I simply did not experience anything similar to an ending. On the contrary, I feel like things have barely started. Also, I apologize for one of the few posts here without pretty pictures! 🙂 For […]
August 11, 2011
EDIT: There was a bug in the final version of the code presented here. It is fixed now, for its backstory, check out my blog post on it. When we last saw our hero, he was fighting with the dreaded implementation of least-angle regression, knowing full well that it was his destiny to be faster. […]
August 7, 2011
After intense code optimization work, my implementation of OMP finally beat least-angle regression! This was the primary issue discussed during the pull request, so once performance was taken care of, the code was ready for merge. Orthogonal matching pursuit is now available in scikits.learn as a sparse linear regression model. OMP is a key building […]
August 2, 2011
Since orthogonal matching pursuit (OMP) is an important part of signal processing and therefore crucial to the image processing aspect of dictionary learning, I am currently focusing on optimizing the OMP code and making sure it is stable. OMP is a forward method like least-angle regression, so it is natural to bench them against one […]
July 19, 2011
I am happy to announce that the Sparse PCA code has been reviewed and merged into the main scikits.learn repository. You can use it if you install the bleeding edge scikits.learn git version, by first downloading the source code as explained in the user’s guide, and then running python setup.py install. To see what code […]
July 10, 2011
One of the simplest, and yet most heavily constrained form of matrix factorization, is vector quantization (VQ). Heavily used in image/video compression, the VQ problem is a factorization where (our dictionary) is called the codebook and is designed to cover the cloud of data points effectively, and each line of is a unit vector. This […]
November 18, 2011
1