Every day, Facebook’s more than 2.2 billion users post an overwhelming number of images, which the social media giant must catalog, surface in search results, and scan for harmful content. A large share of those pictures contain text that must be analyzed as well.

To manage this enormous workload, Facebook has developed a sophisticated artificial intelligence system known as “Rosetta”. The company announced the system in a blog post published on September 11. Every day, Rosetta extracts text in a range of languages from more than a billion images publicly shared on Facebook and Instagram. It can examine both standalone image files and the individual frames within videos. It differs from earlier text recognition software in the method it uses to scan images.

AI systems of this type are generally capable only of recognizing individual characters in a piece of text, without comprehending its meaning or capturing any higher-level details. Facebook, however, needed something more advanced. The company wanted a system that could relate text to the context of the image it appears in, so the engineers trained Rosetta with predictive capabilities.

Rosetta treats text recognition as a sequence prediction problem. It examines an image and draws on what it has learned from past data, in addition to the visual profile of each character, to work out what the text says. Facebook claims that this method helps Rosetta recognize words of arbitrary length, including ones it did not encounter during training.
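
To make the idea of sequence prediction more concrete, here is a minimal Python sketch of how a row of per-position character scores could be turned into a word. The article does not describe Rosetta's decoding step, so this assumes a CTC-style greedy decoding, a common choice for sequence-based text recognition. The alphabet, the blank symbol, and the function name are all hypothetical.

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # hypothetical character set
BLANK = 0  # assumption: index 0 is reserved for a "no character" symbol


def greedy_decode(scores):
    """Turn per-position scores into a string.

    scores: array of shape (positions, 1 + len(ALPHABET)), one row of
    character scores for each horizontal position in a word crop.
    """
    best = scores.argmax(axis=1)  # most likely symbol at each position
    chars = []
    prev = BLANK
    for idx in best:
        # Skip blanks, and collapse a run of the same symbol into one character.
        if idx != BLANK and idx != prev:
            chars.append(ALPHABET[idx - 1])
        prev = idx
    return "".join(chars)


def one_hot_scores(labels):
    """Helper for the demo: build a score matrix that picks the given labels."""
    return np.eye(len(ALPHABET) + 1)[labels]
```

Because the output is assembled one position at a time, the length of the decoded word is not fixed in advance, which is consistent with the claim that such a model can handle words of different lengths. A doubled letter needs a blank between its two occurrences, for example the label sequence `[12, 0, 12]` for "ll".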

“Once we obtain the bounding boxes for word locations on an image, they are cropped and resized to a height of 32 pixels with the aspect ratio maintained,” the engineers at Facebook who developed Rosetta explained. “All such crops for an image are batched into a single tensor with zero padding as needed and then processed at once by the text recognition model.”
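
The preprocessing the engineers describe can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not Facebook's code: it assumes a grayscale image stored as a 2-D array, boxes given as `(x0, y0, x1, y1)` pixel coordinates, and a simple nearest-neighbour resize. The function names are hypothetical.

```python
import numpy as np

TARGET_HEIGHT = 32  # crop height given in the quoted description


def resize_to_height(crop, height=TARGET_HEIGHT):
    """Nearest-neighbour resize of a 2-D crop to a fixed height, keeping aspect ratio."""
    h, w = crop.shape
    new_w = max(1, round(w * height / h))
    # Map every output row and column back to a source row and column.
    rows = np.arange(height) * h // height
    cols = np.arange(new_w) * w // new_w
    return crop[rows[:, None], cols]


def batch_crops(image, boxes):
    """Crop each word box, resize it, and stack all crops into one zero-padded array.

    boxes: non-empty list of (x0, y0, x1, y1) tuples (assumed format).
    Returns the batch of shape (num_boxes, 32, max_width) and the true widths.
    """
    resized = [resize_to_height(image[y0:y1, x0:x1]) for x0, y0, x1, y1 in boxes]
    widths = [r.shape[1] for r in resized]
    batch = np.zeros((len(resized), TARGET_HEIGHT, max(widths)), dtype=image.dtype)
    for i, r in enumerate(resized):
        batch[i, :, : r.shape[1]] = r  # columns to the right stay zero (padding)
    return batch, widths
```

Returning the true widths alongside the batch lets a downstream model ignore the padded columns. Stacking the crops into one array is what allows all words in an image to be processed in a single pass.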

Facebook is using Rosetta to power a variety of features. The system makes images searchable through Facebook’s and Instagram’s respective search engines. It also helps decide how images should appear in the News Feed and whether they contain harmful content. Over time, Facebook plans to expand the system’s capabilities further.

Figure: Facebook Rosetta architecture

“As we look beyond images, one of the biggest challenges is extracting text efficiently from videos,” Facebook’s engineers wrote. “The naive approach of applying image-based text extraction to every single video frame is not scalable, because of the massive growth of videos on the platform, and would only lead to wasted computational resources.”

The engineers also said they are exploring 3D convolutions, a machine learning technique, as a way to improve the selection of video frames for text extraction.