Google Lens has been on the market for only a year, but its growth has been massive. According to a Google blog post, the artificial-intelligence camera feature can now detect over a billion items, up from roughly 250,000 at launch.
Google Lens was fed a huge number of brand labels. The app's optical character recognition (OCR) engine processed so many labels that it expanded its "vocabulary," which enhanced the recognition algorithm and let Lens identify products just by reading their names.
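A rough sketch of that idea, recognizing a product purely from OCR'd text: map a possibly garbled reading onto the closest entry in a known product vocabulary. The vocabulary, the misread string, and the `match_product` helper are all invented for illustration; the real Lens pipeline uses its own OCR engine and a vastly larger vocabulary.

```python
# Toy sketch: map noisy OCR output onto a known product vocabulary.
# Vocabulary and inputs are made up; real Lens works at a far larger scale.
import difflib

product_vocabulary = ["PlayStation 4", "Coca-Cola", "Kindle Paperwhite"]

def match_product(ocr_text, vocabulary, cutoff=0.6):
    """Return the closest known product name for a garbled OCR reading."""
    matches = difflib.get_close_matches(ocr_text, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(match_product("PlayStatlon 4", product_vocabulary))  # -> PlayStation 4
```

Fuzzy matching like this is one simple way a text-reading system can tolerate OCR noise while still landing on a known product name.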
Google Lens has also been fed more data from photos captured on smartphones, making the app more accurate and robust overall. The billion-item database available to Google Lens makes recognizing products even more reliable.
Google Lens expanded its product database via Google Shopping. That catalog helped greatly, but the system is still far from perfect: Lens will most likely fail to recognize rare and obscure items, such as vintage cars or stereo systems from the '70s.
Despite this shortcoming, Google Lens has plenty of practical applications. It can turn an image into a search query and bring up all the relevant information about it. How? You guessed it: machine learning and artificial intelligence, both hot fields in the IT sector.
Lens uses TensorFlow, Google's open-source machine learning framework, to connect an image to the words that best describe it. For example, if a user snaps a picture of a friend's PlayStation 4, Google Lens will connect the image to the labels "PlayStation 4" and "gaming console."
The algorithms then connect those labels to Google's Knowledge Graph, with its tens of billions of facts, which is how the system learns that the PS4 is a gaming console. Obviously, it isn't perfect: it can get confused by a similar-looking object.
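The label-to-Knowledge-Graph step can be pictured as a simple lookup: keep the classifier's confident labels and attach whatever facts the graph holds for them. The labels, scores, and graph entries below are invented stand-ins; the real system uses TensorFlow models and Google's actual Knowledge Graph.

```python
# Conceptual sketch of linking vision labels to a knowledge graph.
# All data here is invented for illustration.

# Hypothetical classifier output: (label, confidence) pairs.
vision_labels = [("PlayStation 4", 0.92), ("gaming console", 0.88), ("DVD player", 0.31)]

# A tiny stand-in for Knowledge Graph facts, keyed by entity name.
knowledge_graph = {
    "PlayStation 4": {"type": "gaming console", "manufacturer": "Sony", "released": 2013},
    "gaming console": {"type": "product category"},
}

def describe(labels, graph, threshold=0.5):
    """Keep confident labels and attach any facts the graph has for them."""
    return {
        name: graph.get(name, {})
        for name, score in labels
        if score >= threshold
    }

result = describe(vision_labels, knowledge_graph)
# The low-confidence "DVD player" label is discarded; the rest gain facts.
```

The confidence threshold is also where the "confused by a similar-looking object" failure shows up: a wrong label with a high score sails straight through.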
Another example of Google Lens' sophistication: a kid might describe a dog breed as "cute, tan fur, with a white spot in the middle." Point the camera at the dog and Lens will identify the breed correctly: it's a Shiba Inu. Lens then presents a card that describes the breed and shows all the relevant images.
It does this with computer vision and machine learning. Machine learning lets computers learn about new things automatically, but it requires a lot of data to work with, so the algorithm is only as good as the data it is trained on.
That is why Google Lens draws on the hundreds of millions of Image Search queries for "Shiba Inu," along with the thousands of images returned for each query. These relevant images help train the algorithms to produce even more accurate results.
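Why more labeled images help can be seen in even the simplest classifier: a nearest-neighbor lookup gets more reliable as the pool of labeled examples grows. The feature vectors and labels below are invented toy data, not real image features, and nearest-neighbor is a deliberate simplification of the deep models Lens actually uses.

```python
# A minimal 1-nearest-neighbour sketch of classification from labelled examples.
# Vectors and labels are toy data invented for illustration.
import math

def nearest_label(query, examples):
    """Return the label of the training example closest to the query vector."""
    return min(examples, key=lambda ex: math.dist(query, ex[0]))[1]

# Pretend each tuple is (image feature vector, label) harvested from search queries.
training = [
    ((0.9, 0.1), "Shiba Inu"),
    ((0.85, 0.2), "Shiba Inu"),
    ((0.1, 0.9), "smartphone"),
]

print(nearest_label((0.8, 0.15), training))  # -> Shiba Inu
```

With only one "Shiba Inu" example, an unusual photo could easily land nearer the wrong class; piling on more labeled examples fills in the feature space and makes such misses rarer.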
Google's Knowledge Graph helps bring up tens of billions of facts on topics ranging from "pop stars to puppy breeds." This is what tells the system that a Shiba Inu is a dog breed and not, say, a smartphone brand.
Even so, the system is often easy to fool. We take photos from different angles, against different backgrounds, and in varying lighting conditions, all of which can produce photos that differ from the database available to Google Lens.
To avoid getting confused, Lens will need to become smarter. This is why the team at Google is expanding Lens' database by feeding it pictures taken on smartphones. Google Lens can also read text off menus or books.
Lens makes these words interactive so that users can perform various actions on them. For example, users can point their smartphone's camera at a business card and add it to their contact list, or capture ingredients from a physical recipe book and save them to a shopping list.
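The business-card action boils down to structuring free-form OCR text into contact fields. A toy version of that step is sketched below; the card layout, the regexes, and the `parse_card` helper are assumptions for illustration, not Lens's actual extraction logic.

```python
# Toy sketch: turn OCR'd business-card text into contact fields.
# Layout and patterns are invented for illustration.
import re

card_text = """Jane Doe
Acme Widgets Inc.
jane.doe@example.com
+1 555-0123"""

def parse_card(text):
    """Pull a name, email, and phone number out of free-form card text."""
    email = re.search(r"[\w.+-]+@[\w.-]+\.\w+", text)
    phone = re.search(r"\+?\d[\d\s().-]{6,}\d", text)
    return {
        "name": text.splitlines()[0].strip(),
        "email": email.group() if email else None,
        "phone": phone.group() if phone else None,
    }

contact = parse_card(card_text)
```

Once the text is structured like this, "add to contacts" or "save to shopping list" is just handing the fields to the appropriate app.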
Google taught Lens how to read in a very bare-bones manner: the company had to start from scratch. Google developed an optical character recognition (OCR) engine and combined it with its understanding of how language works from Google Search and the Knowledge Graph. The machine learning algorithms were trained on different characters, languages, and fonts, drawing on sources such as Google Books scans.
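At its most bare-bones, character recognition is matching glyph shapes to known characters, which the tiny lookup below illustrates. The 3x3 "glyphs" are invented, and real OCR engines learn robust models across fonts and languages rather than doing exact template lookups.

```python
# A deliberately tiny illustration of template-based character recognition.
# The 3x3 bitmaps are invented; real OCR generalises across fonts and noise.
GLYPHS = {
    ((1, 1, 1), (1, 0, 1), (1, 1, 1)): "O",
    ((1, 0, 0), (1, 0, 0), (1, 1, 1)): "L",
}

def read_glyphs(bitmaps):
    """Recognise each bitmap by exact template lookup; '?' when unknown."""
    return "".join(GLYPHS.get(b, "?") for b in bitmaps)

word = read_glyphs([
    ((1, 0, 0), (1, 0, 0), (1, 1, 1)),
    ((1, 1, 1), (1, 0, 1), (1, 1, 1)),
])
print(word)  # -> LO
```

The gap between this sketch and a real engine, handling every font, language, and scan quality, is exactly what the training on Google Books scans and other sources had to bridge.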
Lens also has a style-search feature that suggests stylistically similar items. Point the camera at an outfit or a piece of home decor and Lens will surface items in a similar style. If a user spots a lamp they find interesting, for example, Google Lens will suggest lamps with similar designs, along with product reviews for those stylistically relevant products.
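A common way to implement "stylistically similar" retrieval is to rank catalog items by cosine similarity between feature vectors. The vectors and catalog below are made up for illustration; in a real system the embeddings would come from a learned vision model rather than being hand-written.

```python
# Sketch of style search: rank items by cosine similarity of feature vectors.
# Embeddings here are invented; real systems learn them from images.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

catalog = {
    "arc floor lamp": (0.9, 0.2, 0.1),
    "tripod lamp": (0.8, 0.3, 0.2),
    "office chair": (0.1, 0.9, 0.4),
}

query = (0.85, 0.25, 0.15)  # hypothetical embedding of the photographed lamp
ranked = sorted(catalog, key=lambda item: cosine(query, catalog[item]), reverse=True)
# Both lamps rank above the chair for a lamp-like query.
```

Because similarity is computed in the embedding space rather than on pixels, two lamps can match on "style" even when their photos look quite different.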
Our smartphones are becoming smarter with each passing day. We rely on them heavily because of their portability and overall ease of use; in terms of everyday utility, they have overtaken computers.
Google realized this long ago and is focusing heavily on improving its smartphone-based services; Google Lens is just one example. Deep learning algorithms like those behind Google Lens have shown great potential in detecting signs of diabetic retinopathy just from photos of eyes, and similar algorithms help capture better images in low-light scenarios.
Lens is a genuinely useful piece of software, but for now it works only on smartphones, so Google is weaving its features ever deeper into smartphone technology. Google Lens will only get better from here; before long it should be able to narrow down searches precisely from simple visual cues. The future is bright for Google Lens, and it will only become more useful for the average person.