Saturday, November 13, 2010

Comparing the new algorithms to the first generation

It is interesting to compare the new algorithms to the original zeta1 algorithms released by Numenta back in 2007. In a May 2007 blog post, Numenta discussed the limitations of those algorithms. Here were the limitations noted at that time (see this link for the blog post):

1. The 2007 algorithms had no time based inference, so inference was based only on, for instance, a single snapshot of a picture to recognize an object. Now, of course, the algorithms fully employ time-based inference, which should make computer vision applications (and other applications) based on HTM much more powerful.

2. In 2007, time was used for learning, but it was only "first-order" time based learning. That meant that when the software was attempting to learn sequences of patterns, it would only account for the current time step and one prior time step. Imagine trying to learn invariant representations for dogs, cars, people, and other complex objects based on only two consecutive still pictures of data. Our brains learn by seeing many "movies" of objects in the world around us, so this was a very significant limitation on the power of HTM. Now, it appears that HTM can learn sequences of essentially unlimited length.
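To make the first-order limitation concrete, here is a hypothetical sketch (my own illustration, not Numenta's code) of a first-order sequence learner. Because it stores only (previous, next) transitions, two sequences that share an element become confused, which is exactly why invariant representations were so hard to build:

```python
from collections import defaultdict

def learn_first_order(sequences):
    """Record which elements follow each element (one step of context only)."""
    transitions = defaultdict(set)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            transitions[prev].add(nxt)
    return transitions

# Two training sequences that share the middle element "B".
trans = learn_first_order([["A", "B", "C"], ["D", "B", "E"]])

# After seeing "B", a first-order model predicts both "C" and "E" --
# it cannot use the fact that the sequence began with "A" or "D".
print(sorted(trans["B"]))  # ['C', 'E']
```

The new algorithms, by keeping context from many prior time steps, can distinguish "B after A" from "B after D" and so predict the correct continuation.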

3. The 2007 algorithms had abrupt, discrete "nodes" without overlapping boundaries. According to the blog, this diminished the ability of the system to create invariant representations. Now, each level of the HTM hierarchy is one continuous region (no more nodes). This is a big change I wasn't expecting, and a welcome one, because continuous regions of neurons, rather than discrete nodes, are the way the brain works.


4. The 2007 algorithms did not use sparse distributed representations, which also severely limited their scalability due to memory requirements. Now, it goes without saying that sparse distributed representations are the key to making the new algorithms work. Not only does this make the algorithms much, much more scalable, but it also facilitates generalization.
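As a rough illustration of why sparse distributed representations help (my own sketch, not Numenta's implementation): an SDR is a long binary vector with only a few active bits, and patterns are compared by the overlap of their active bits. Identical inputs match fully, while unrelated random patterns share almost nothing, so a small overlap threshold gives both huge capacity and graceful generalization to partially similar inputs:

```python
import random

N = 2048  # total bits in the representation
W = 40    # active bits (~2% sparsity)

def random_sdr(rng):
    """Represent an SDR compactly as the set of indices of its active bits."""
    return set(rng.sample(range(N), W))

def overlap(a, b):
    """Number of active bits two SDRs share."""
    return len(a & b)

rng = random.Random(42)
a, b = random_sdr(rng), random_sdr(rng)

print(overlap(a, a))       # 40: a pattern matches itself on every active bit
print(overlap(a, b) < 10)  # True: two random patterns barely overlap
```

The expected overlap of two random SDRs here is about W*W/N, i.e. under one bit, so accidental matches are vanishingly unlikely even with millions of stored patterns, while inputs that genuinely share structure share many bits.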

In short, every major shortcoming listed for the original HTM software has now been addressed. I expect to see many commercial applications start to come from Numenta's work. Hopefully my blog will soon be able to focus as much on applications as on the core technology. It will be interesting to see the extent to which this technology takes off over the next few years. Personally, I am particularly interested in robotics, and hope to see HTMs begin to be used to create robots that can intelligently perceive the world and perform useful tasks. Navigation, object recognition and manipulation, and language understanding are all things that could theoretically be done by HTM.


  1. It's very interesting. Do you know what version of the software was used for the surveillance application, which is the only commercial product so far, as far as I know?

    Is the software complete now, with no major shortcomings? And if it doesn't work, does that mean the model is not a good representation of the brain?

  2. The Vitamin D surveillance camera application (at least the current version) is based on the old algorithms, although I am sure that Vitamin D is now implementing the new algorithms for a souped-up version of its software. Numenta has not made public any information about applications using the new algorithms, so we don't yet know how well they will work. There are other applications using the old algorithms (and maybe the new ones by now). They have worked with Forbes on website click prediction, Tyzx on 3-D computer vision, EDSA Power on power system failure prediction, medical imaging companies on software that searches high-resolution pathology slides for cancerous growths, and an unnamed major car company on pedestrian detection/recognition for cars (Volvo just rolled out a production model with this capability).

    As far as the state of the software, Numenta is the first to admit that its software is still in its infancy. The cortex has six layers of neurons, each of which is doing something different. This new software is a model of one of the six layers. Numenta's white paper hypothesizes that the other layers are also performing the same learning, inference, and prediction functions, but to do different things that the current software cannot yet do, such as attention, specific timing, and controlling motor behavior (robotics). In other words, Numenta has figured out the mechanisms used by each of the six layers to do their work, but hasn't necessarily figured out how to wire them all together to do these various things. Personally, I think that they have now figured out the hardest part.

  3. The link where Dileep describes the limitations of the old algorithms does not work now, so I saved the Google cache; if anyone needs it, just say so. (Maybe the link will work again soon, but who knows.)