Microsoft’s Seeing AI is a smartphone app that “narrates the world around you”. Development of this software started back in 2014, when it used a neural network to help find and identify surrounding objects. However, it was far too slow, with a latency of approximately 10 seconds.
In a recent Nvidia blog post, the graphics chip firm tells the story of Seeing AI’s development, from its high-latency, limited-scope past to the wide-ranging and responsive app of today. The app was demonstrated at GTC 2018 last month, as it was trained using Nvidia hardware. The development team completed some local training using the Nvidia Titan X GPU, with more computationally expensive work completed on a Microsoft Azure cloud instance running Nvidia Tesla P100 GPUs.
Over time the app has gained new ‘channels’, which are extra capabilities. Seeing AI can now complete the following useful tasks, some of which even work offline:
- Read printed short texts as soon as they are presented via the smart device camera
- Recognise, capture and read through documents
- Read handwritten text
- Identify products
- Identify people – if the person is a pre-configured contact then you will be told their name, otherwise they will be described
- Judge a person’s emotions from their expression
- Recognise and describe scenes around you
- Recognise and describe scenes in other smartphone apps such as Mail, Twitter, Gallery and more
- Identify currency – useful for paying for products and services
- Measure light
- Describe colour
You can see many of the above capabilities in demo videos via the Seeing AI product page.
In training the app to recognise money, the research team thought it was important not to ‘guess’ currency values incorrectly. The AI would rather not classify than get it wrong. Thus a very wide range of photo sources of currency was used for training. The cash photos included examples that were partly obscured, blurred, out of context or too zoomed in, to try to cover any occasion.
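The "rather not classify than get it wrong" behaviour can be illustrated with a confidence threshold, which is one common way to let a classifier abstain. The Nvidia blog does not say how Seeing AI actually implements this, so the sketch below is only an assumption: the function names, the denomination labels and the 0.9 threshold are all hypothetical.

```python
import math


def softmax(scores):
    """Convert raw model scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_or_abstain(scores, labels, threshold=0.9):
    """Return the most likely label, or None if the model is not confident.

    `threshold` is a hypothetical cut-off; a real app would tune it
    against held-out photos.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None  # abstain rather than risk announcing the wrong value
    return labels[best]


# Hypothetical denominations and scores, for illustration only.
labels = ["5", "10", "20"]
clear_photo = classify_or_abstain([8.0, 1.0, 0.5], labels)
blurry_photo = classify_or_abstain([1.0, 1.1, 0.9], labels)
```

With these made-up scores, the first call returns a label because one class dominates, while the second returns `None` because the three classes are nearly tied. Raising the threshold trades more abstentions for fewer wrong answers.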
Seeing AI is quite a popular app; the Nvidia blog says that Stevie Wonder uses it every day, for example. At the moment it is iOS only. Whether it will come to Android isn’t clear, as it is rumoured that this iOS test version was only released in preparation for Microsoft’s goal of producing a refined HoloLens app.