A few days ago Numenta sent out a newsletter with a quick update on their work. The newsletter notes that Numenta has now posted the Smith Group lecture on its website (it runs much more smoothly than the version on the University's website). It also announced the first updates to the new learning algorithm documentation. The additions were helpful, particularly an appendix that goes into some depth about the neuron model used by the HTM software. It includes some of the graphics used by Numenta in the online lecture.
Sadly, the newsletter noted that Numenta is temporarily deferring its work on computer vision problems in favor of applications that are more focused on temporal patterns, such as web click prediction and credit card fraud prediction. I can't say that I am too surprised by this. In hindsight, based on the online video and the whitepaper, it is clear that Numenta ran into some problems with its vision experiments with the new algorithms. The current algorithms can model layer 3 or layer 4 of the cortex (layer 3 for variable-order, time-based learning, or layer 4 for learning that does not rely on context). The whitepaper hypothesizes that layer 4 allows the brain to learn spatial invariance while layer 3 allows the brain to learn temporal invariance, but that for vision problems the brain is somehow combining layers 3 and 4 to create spatial and temporal invariance at the same time. Until Numenta figures out how to model both layers working together, as they do in the real brain, computer vision probably isn't going to work terribly well.
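To make the distinction between context-free and context-dependent learning concrete, here is a toy sketch in Python. It is not Numenta's algorithm and has nothing to do with the actual HTM code; the sequences, the function names `train` and `predict`, and the use of a fixed-length context as a stand-in for variable-order memory are all my own assumptions for illustration. It only shows why a predictor that remembers more of the preceding sequence can resolve a case that a predictor looking at the last symbol alone cannot.

```python
from collections import defaultdict

def train(sequences, order):
    """Count which symbol follows each context of up to `order` previous symbols."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i in range(1, len(seq)):
            context = tuple(seq[max(0, i - order):i])
            counts[context][seq[i]] += 1
    return counts

def predict(counts, history, order):
    """Return the set of symbols seen after the last `order` symbols of `history`."""
    context = tuple(history[-order:])
    return set(counts.get(context, {}))

# Two hypothetical sequences that share the middle "BC" but end differently.
sequences = ["ABCD", "XBCY"]
first_order = train(sequences, order=1)
third_order = train(sequences, order=3)

# With only the last symbol ("C") as context, both endings look possible.
ambiguous = predict(first_order, "ABC", order=1)
# With the full history as context, only one ending remains.
resolved = predict(third_order, "ABC", order=3)

print(sorted(ambiguous), sorted(resolved))
```

In this toy, the first-order predictor cannot tell the two sequences apart once it reaches "C", while the longer-context predictor can. That is the general flavor of what context buys you in sequence learning, nothing more.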
I don't know why Dileep has started a new company now, if they have problems with vision.
I hope they're still working together (with Numenta) behind the scenes.
Adam,
It is difficult to know why Dileep left Numenta. It could be that Numenta and Dileep disagree on how to solve the vision problem. It is interesting to me that Numenta has deferred its vision work when its most visible customer (Vitamin D) relies entirely on Numenta's algorithms to run its vision software. I am not sure where that leaves Vitamin D.
It could be the case that, due to resource limitations rather than fundamental technical barriers, Numenta is focusing first on the most near-term commercial applications. I too am curious as to the extent that Dileep's commercial applications will continue to rely on HTM-like technology.
Yeah, have to be honest, click prediction and the like don't exactly get my heart racing. Vision, voice recognition, language: these are the promised-land stuff that's really going to attract people. I'm sure Numenta knows this, and is probably just doing its typical keep-expectations-low deal. Although JH's proclamation "this [machine learning] is the place to be" still hints at high hopes.
Yes, Numenta has done a better job lately at managing expectations. Hawkins' statement that now, rather than twenty years from now, is the time to do this, also hints at high hopes. It used to be that Numenta would say that it could take a number of years to really figure out how to do HTM, but it may be that they now think that we are much, much closer to a huge market in cortical software.
Harkening back to the Michael Anissimov blog post that I criticized earlier, I really liked the comment by the poster who likened Numenta's new learning algorithms to the discovery of the transistor. Anissimov is essentially complaining that Numenta hasn't built the AI equivalent of supercomputers, while failing to recognize that algorithms that can do what the neurons, columns, and regions of the brain do are THE essential building block for a future of AI "supercomputers." My glass-half-full argument, then, is that if Numenta has figured this out, then the road forward actually gets easier from here. If they have succeeded in creating the building block of how the cortical layers function, then adding feedback, attention mechanisms, motor behavior, and the like may not be terribly difficult. After all, we have a good idea of the wiring of the six cortical layers and the region-to-region connectivity of the brain. If Numenta has figured out what is going on at the column and neuron level, that is huge news, because when that knowledge is combined with what is known about level-to-level and region-to-region wiring, I would think that the puzzle pieces start to fall into place more quickly. Of course, this hinges on whether Numenta is correct that the six layers are all implementing a similar learning, inference, and prediction function.
UPDATE:
In the Numenta forums, a Numenta employee just made a post with a bit of insight into why vision is not being pursued in 2011. Here is the direct quote:
"The new algorithms can be applied to vision tasks, but we decided not to pursue this now for two reasons. 1) There are many tasks we can address today that are of higher value than vision. 2) Vision tasks require significantly more memory and processing time, so that the algorithms require vision-specific optimizations to be feasible. Therefore, for the coming year at least, we do not expect to release new vision tools or upgrade our old vision tools."
I find that comment to be very interesting. Most tellingly, it sounds as though the hurdles to computer vision are not the kind of technical problem that I thought. Rather, the memory and processing requirements for a sufficiently large HTM region to do vision are prohibitively large with today's technology. It sounds like your typical desktop computer doesn't have the power needed to do that type of task without heavy software optimization. I have to say that it would still be nice if they posted some experimental results.