Machine learning already allows computers to identify people by their faces and read medical images. But interpreting what is happening in real-time video has required cumbersome algorithms – until researchers from MIT and IBM took up the problem.
Researchers at the MIT-IBM Watson AI Lab have figured out how to shrink video recognition models. This both speeds up training and lets such “lightweight” algorithms run even on mobile devices, reports Engadget.
The trick is to change how video recognition models handle time. Modern neural networks encode timing as a sequence of images, which inflates their size and computational cost. The MIT and IBM researchers developed a “temporal shift module” that gives the model a sense of the passage of time without representing it explicitly: part of each frame’s feature channels is simply shifted to neighboring frames, so ordinary 2-D convolutions can mix information across time at almost no extra cost.
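The core operation can be sketched in a few lines. This is an illustrative simplification, not the authors’ actual code: real activations are 4-D tensors (time, channels, height, width), and the shift fraction here is an assumed example value. The idea is that some channels are shifted one frame back, some one frame forward, and the rest are left in place:

```python
import numpy as np

def temporal_shift(x, shift_fraction=0.25):
    """Shift a fraction of channels along the time axis.

    x: activations of shape (T, C) -- T frames, C channels per frame
    (spatial dimensions omitted for clarity). A quarter of the channels
    is shifted one frame back in time, a quarter one frame forward,
    and the remaining half stays put. Vacated slots are zero-filled.
    """
    T, C = x.shape
    fold = int(C * shift_fraction)
    out = np.zeros_like(x)
    out[:-1, :fold] = x[1:, :fold]                # shift toward earlier frames
    out[1:, fold:2 * fold] = x[:-1, fold:2 * fold]  # shift toward later frames
    out[:, 2 * fold:] = x[:, 2 * fold:]           # unshifted channels
    return out
```

Because the shift is just memory movement with no multiplications, each frame’s features now contain traces of its neighbors, letting a standard image network reason about motion without 3-D convolutions.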
In tests, a deep video recognition network trained with this method learned three times faster than existing alternatives.
The temporal shift module makes it possible to run video recognition models on mobile devices. “Our goal is to make AI accessible to anyone with a low-cost device,” said MIT professor Song Han. “To do this, we need to design efficient AI models – less demanding of energy and resources – that can run on edge devices, where artificial intelligence is now migrating.”
Microsoft CEO Satya Nadella also recently spoke about the importance of edge computing. At a summit in Washington, he argued in favor of the technology, which, in his view, will soon work in tandem with the cloud.