
Google opens Cloud Vision API beta, world + dog asked to try it

60 cents to run OCR on 1,000 images. Not bad

Google has released a beta of its Cloud Vision API, allowing developers to submit images to its machine learning models for automated content analysis.

Following a limited preview release in early December, during which Google claimed thousands of companies used the API, generating millions of requests for image annotations, the Chocolate Factory has opened up access to everyone.

The API detects everyday objects (Google suggests "sports car," "sushi," and "eagle" as examples), reads text within images, and identifies product logos.

During the beta period, which runs from now until 1 March, users will have a quota of 20 million images per month. While this is unlikely to let developers use the Cloud Vision API for real-time, mission-critical applications, it should provide a taster of its capabilities.

Google has also announced the pricing for the API, which takes effect at the end of February. Google says users can "apply Label Detection on an image for as little as $2 per 1,000 images", or apply "Optical Character Recognition (OCR) for $0.60 for 1,000 images."
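
For a rough sense of what those rates mean in practice, here is a minimal Python sketch. It assumes a flat per-1,000-image rate with no free tier or volume discount, which may not match Google's actual billing (the "as little as" wording suggests the label rate can be higher). The function and dictionary names are illustrative only.

```python
# Per-1,000-image rates quoted in the article, in US dollars.
# Assumption: flat rates, no free tier, no volume tiers.
RATES_PER_1000 = {
    "label_detection": 2.00,
    "ocr": 0.60,
}


def estimate_cost(feature: str, images: int) -> float:
    """Return an estimated charge in dollars for annotating `images` images."""
    return round(RATES_PER_1000[feature] * images / 1000, 2)


print(estimate_cost("ocr", 1_000))               # 0.6
print(estimate_cost("label_detection", 50_000))  # 100.0
```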

What does it do?

The automated analysis of the content of images is a non-trivial task for computers, although it is obviously quite trivial for humans. It has been a key interest for Google, whose image search capabilities have lagged far behind its text search abilities – though it has invested many hours into creating an effective Safe Search filter, which is also available as a feature in the API.

The documentation provided explains that the Vision API can be enabled from within the Cloud Developer console, as long as the billing and authentication requirements are met.

The Vision API features currently include facial analysis – allowing applications to analyse the sentiment of those viewing logos, for instance – and landmark detection, alongside the ability to recognise those logos, general text, and NSFW content – as well as what Google describes as "labels".

Label detection will be familiar to Google Photos users. It is essentially the key metadata feature of Google's Cloud Vision API, enabling the creation of categories of content within an image, "ranging from modes of transportation to animals", for classification purposes.
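
To illustrate how a developer might ask for several of these features at once, the sketch below builds a JSON request body in Python. The field names and feature type strings follow the shape of the Vision REST API's `images:annotate` call as it is generally documented, but treat them as assumptions here: the beta interface may differ, and the helper name `build_annotate_request` is hypothetical. The code only constructs the payload and sends nothing over the network.

```python
import base64
import json

# Feature types corresponding to the capabilities listed above (assumed names).
FEATURES = [
    "FACE_DETECTION",
    "LANDMARK_DETECTION",
    "LOGO_DETECTION",
    "TEXT_DETECTION",
    "SAFE_SEARCH_DETECTION",
    "LABEL_DETECTION",
]


def build_annotate_request(image_bytes: bytes, feature_types, max_results: int = 5) -> dict:
    """Build a request body asking for the given features on one image."""
    return {
        "requests": [
            {
                # Image data is sent inline as base64-encoded text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": t, "maxResults": max_results} for t in feature_types
                ],
            }
        ]
    }


# Placeholder bytes stand in for a real image file.
body = build_annotate_request(b"placeholder bytes", ["LABEL_DETECTION", "TEXT_DETECTION"])
print(json.dumps(body, indent=2))
```

Sending this body would additionally require the billing and authentication setup mentioned above.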

Google says its Cloud Vision API "is our first step on the journey to enable applications to see, hear and make information in the world more useful." ®
