Fast-learning digital assistants are finding their voice in India

Photo Credit: 123RF.com

N Chandrababu Naidu is currently in his fourteenth year as Andhra Pradesh’s chief minister. For the first 13 years, which were spread across two stints, he resorted to the time-consuming approach of seeking a report from bureaucrats every time he wanted updates about a particular district or the status of a welfare scheme. But since last February, he has been simply asking Alexa, Amazon’s digital voice assistant, to spell out the statistics.

Beyond the chief minister’s office, voice assistants are becoming ubiquitous in India across smartphones and connected homes. The likes of Alexa and its peers such as Google Assistant, Apple’s Siri, Samsung’s Bixby and Microsoft’s Cortana are accomplishing a wide variety of tasks such as hailing a cab, calling a friend, sending a message, and playing songs.

At enterprises, voice-enabled bots are handling core functions in areas such as enterprise resource planning, human resource management, business intelligence and banking operations. Typically, these functions include generating proactive reports, flagging anomalies and boosting the efficiency of employees.


The vast majority of technology majors have been running R&D programmes for voice computing and are launching commercial products as well.

“Voice computing went mainstream in 2018, gaining more prominence among consumers and enterprises,” said Vivian Gomes, vice president and head of marketing & inside sales at CSS Corp.

In fact, at one million utterances a week, more people summon Alexa in India than anywhere else in the world. And at least one user says “I love you, Alexa” every minute, according to Puneesh Kumar, country manager for Alexa experiences and devices at Amazon India.


Technological advances

The current boom has come on the back of two decades of experimentation by the industry, according to Puneet Gupta, chief technology officer of digital product engineering services company GlobalLogic’s India division.

“Voice assistants can be fundamentally broken down into two key technology challenges. One, converting voice to text and two, interpreting the text for the intent,” he said.
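The two stages Gupta describes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `transcribe` function is a placeholder that treats the audio bytes as already-transcribed text, and the keyword table `INTENT_KEYWORDS` is hypothetical. Production assistants use trained speech-recognition and language models for both stages, not keyword matching.

```python
import re

# Hypothetical keyword sets, one per intent. A real assistant would learn
# these associations from data rather than use a hand-written table.
INTENT_KEYWORDS = {
    "hail_cab": {"cab", "taxi", "ride"},
    "call_contact": {"call", "dial", "phone"},
    "send_message": {"message", "text", "sms"},
    "play_music": {"play", "song", "music"},
}


def transcribe(audio: bytes) -> str:
    """Stage one (placeholder): convert voice to text.

    This stub simply decodes the bytes as UTF-8, standing in for a real
    speech-recognition model.
    """
    return audio.decode("utf-8")


def detect_intent(text: str) -> str:
    """Stage two: interpret the text for the intent.

    Counts keyword overlaps per intent and returns the best match,
    or "unknown" when no keyword matches at all.
    """
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {intent: len(words & kws) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def handle_utterance(audio: bytes) -> str:
    """Run both stages in sequence: voice to text, then text to intent."""
    return detect_intent(transcribe(audio))
```

Calling `handle_utterance(b"Please book me a cab to the airport")` passes the utterance through both stages and matches it to the hypothetical `hail_cab` intent, while an utterance with no known keywords falls through to `unknown`.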


All voice assistants are trained using machine learning and artificial intelligence models to understand the context of human speech in different situations, across regions and languages.

Gupta believes what tipped the scales in recent years has been the proliferation of cloud computing and advanced machine learning technologies such as deep learning.

“These models of ML are computationally-intensive, especially in the training phase. Once trained, very efficient implementations are possible both at the server end and the client side,” he explained.


Another dimension has been added by artificial intelligence techniques that extract user intent from text and usage in order to resolve queries or provide relevant responses.

“Today the data is predominately in text, whether structured or unstructured. This means that all processing will happen in text. So any voice input is currently converted to text and processed using natural language processing (NLP) algorithms,” said Animesh Samuel, co-founder of Light Information Systems, a startup which provides AI and NLP services.

“Conversely, voice increases the volume of data which in turn improves the output of the AI and makes these really smart interactions possible,” he explained.


While the base technology is broadly the same, different companies adopt different approaches towards training their voice assistants. For instance, Google held ‘translatathons’ across the world to help its engine comprehend different languages.

Amazon did things differently. It trained Alexa by inviting select users to interact with the assistant in a language of their choice in order to build proficiency. It also deployed Cleo, an Alexa ‘skill’ that lets people teach the voice assistant new languages by speaking to it.

Boom in India


India has been among the fastest adopters of voice computing. According to an Accenture report that surveyed 22,500 consumers in 21 countries including 1,000 Indian consumers, half of online consumers globally now use digital voice assistants, led by emerging markets such as India with 72% adoption.

The study further said that standalone voice assistants or smart speakers are one of the fastest-adopted technologies in India and have a 97% satisfaction rate among Indian consumers.

Mahesh Makhija, partner and leader for digital and emerging technologies at professional services firm EY, said that speech recognition has made impressive strides in the country over the last few years with algorithms having improved due to large data sets available to Indian researchers and developers.

He further attributed the growth to the proliferation of new-age internet devices such as smartphones and increasingly cheap data from service providers such as Reliance Jio, which is driving higher consumption among customers. Around 500 million Indians are now estimated to have access to the Internet.

The explosion of data usage has also helped voice assistants overcome an initial hiccup: accuracy.

“Voice assistants run on deep learning algorithms which help them train themselves the more they are used,” said Faisal Kawoosa, co-founder and chief analyst at research firm TechArc. “However, just a year and a half back, these voice assistants didn’t have much access to data, which is why it was difficult for them to understand some nuances of accent or context in voice commands from users.”

“With increasing adoption and satisfaction levels of smart speaker technology in India, we will see digital voice assistants influencing the whole consumer technology and service ecosystem in a way that no other device, including smartphones, has done before,” said Aditya Chaudhuri, managing director and lead in Accenture’s communications, media & technology practice in India.

He added that as consumers shift behaviours from smartphones to voice assistants, there is a clear expectation that smart speakers will take on progressively complex workloads in the future.

One phenomenon experts have observed is that people who are not fluent in English end up using digital assistants for a variety of purposes such as simple searches for finding out the spelling of a word.

A Google spokesperson told TechCircle that while the company doesn’t bring out region-specific numbers, voice searches in India have gone up by as much as 270% year-on-year with most of them originating from rural areas.

Amazon’s Puneesh Kumar said that Alexa was being used by a school teacher named Amol Bhuyar in Maharashtra, who built a mannequin around an Alexa device to train students in speaking fluent English.

Like other assistants, Alexa is also looking to learn Indian languages.

“While we are still at the nascent stage, we are continuously adding lexicons of local languages. We have also opened up Alexa skills for developers to allow them to add more functions or train it in other languages,” Kumar told TechCircle.

Google’s spokesperson said that the complex script of most Indian languages makes text input difficult, which is why the company has been investing heavily in voice. Google is also focusing on localising user experience by integrating Indian references and nuances.

“We have already built in several key Hinglish phrases into the English Google Assistant in India,” the spokesperson said, adding that the Assistant can understand Indian accents too.

Interestingly, TechArc's Kawoosa also pointed out that urban consumers are more conscious about what they say to voice assistants in public, thereby limiting their use.

Potential speed bumps

Analysts and experts say there will be explosive growth in voice computing in the year ahead with more innovations on the horizon.

“Half the searches on the internet will be voice-based in the next two to three years,” said Lux Rao, director of solutions at Johannesburg-based IT services firm Dimension Data’s India division.

Rao, who claims that voice will be the new oil, said that voice biometrics will be the first step and will see adoption from banks because voice analytics are more accurate than text data analytics.

However, there are still some challenges to overcome.

The Accenture report cited above names privacy as one of the biggest impediments to the mushrooming of smart speakers. Alexa devices have found themselves in hot water on a few occasions for recording users’ conversations without their consent.

“Nearly 46% of consumers globally believe they don’t have control of their data with voice assistants and over 58% are more likely to re-evaluate their trust in this service by continually checking how their information is being used,” the report said.

Makhija of EY said that the technology has to evolve to handle more sophisticated conversations such as engaging a consumer during complex purchase scenarios. He also said that though companies were building up Indic language capabilities, these were still at an early stage.
