Last summer, while working at Numenta, I spent some time reading about sensory coding and how it relates to pattern recognition. I read a lot about sparse coding, principal component analysis, independent component analysis, nonnegative matrix factorization and other methods. It kept my mind fertile as I thought about the challenges of building information processing algorithms based on Hierarchical Temporal Memory, a theory of the neocortex (wikipedia, talk, white paper).
One well-known and well-understood method for sensory coding is Principal Component Analysis (PCA). It is a very common technique for dimensionality reduction and decorrelation of data. It is one of my favorite techniques, not just because it is so useful but also because it is surprisingly deep. There are many perspectives from which to view it, spanning topics from statistics and linear algebra to information theory and optimization. These perspectives are useful as tools for thinking about PCA, as tools for thinking about the large class of methods related to it, and in their own right as mathematical tools.
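To make "dimensionality reduction and decorrelation" a little more concrete, here is a minimal sketch in Python with NumPy. It is not taken from the article: the function name `pca_project`, the toy data, and the mixing matrix are all hypothetical, and it uses the eigendecomposition of the sample covariance matrix, which is only one of several equivalent ways to compute PCA.

```python
import numpy as np


def pca_project(X, k):
    """Project the rows of X onto the top-k principal components.

    Hypothetical helper for illustration only.
    Returns (projected data, component directions, variances along them).
    """
    # Center each feature (column) at zero mean.
    X_centered = X - X.mean(axis=0)
    # Sample covariance matrix of the features.
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so the directions of largest variance come first.
    order = np.argsort(eigvals)[::-1][:k]
    components = eigvecs[:, order]
    return X_centered @ components, components, eigvals[order]


# Toy data (an assumption for this sketch): 3 correlated features
# generated from 2 hidden factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = np.array([[2.0, 0.5, 1.0],
                   [0.0, 1.0, -1.0]])
X = latent @ mixing + 0.01 * rng.normal(size=(500, 3))

Z, W, variances = pca_project(X, k=2)

# Dimensionality reduction: 3 features become 2.
print(X.shape, "->", Z.shape)
# Decorrelation: the covariance of the projected data is diagonal
# up to floating-point error.
cov_z = np.cov(Z, rowvar=False)
print(np.round(cov_z, 6))
```

The off-diagonal entries of `cov_z` come out at zero up to rounding, which is the decorrelation property, and the diagonal entries match the leading eigenvalues of the original covariance matrix.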
I wrote a draft of an article that attempts to introduce the reader to Principal Component Analysis through a variety of perspectives and to draw connections to the related methods of Independent Component Analysis, Kernel PCA, and Nonnegative Matrix Factorization. There are entire books on this topic that are far more comprehensive. Natural Image Statistics by Hyvärinen, Hurri and Hoyer (free e-book!) and Independent Component Analysis by Hyvärinen, Karhunen and Oja are two very good ones.
Any serious student of data analysis methods would do well to go through both of these. My aim, however, was to write a short article that distills the important elements, one that a reader could jump into with little or no background in machine learning or data analysis.
Check out my article here. It is still very much a work in progress and far from comprehensive, especially the parts that have to do with topics other than PCA. I would love to hear any feedback about any aspect of the article. Enjoy!