With more than 100 million users registered on its platform, Amazon knows what Indians buy and how one region differs from another. The key to this know-how is data, harnessed by the company's machine learning (ML) initiatives. These initiatives will lay the foundation for artificial intelligence-based shopping in the future.
Rajeev Rastogi, director of machine learning at Amazon, spoke to TechCircle about how ML addresses India-specific concerns on customer feedback, product quality issues, regional languages and preferences, and infrastructure and payments challenges. Edited excerpts:
At Amazon, the culture has been to first address the customer and then the product. How does that impact the way the ML team works?
I talk with the business side to understand their requirements for ML solutions. The business side gets a lot of feedback and anecdotes from customers. A lot of people write to (founder) Jeff Bezos or to Amit (Agarwal). That is one source for us to know what pain points customers have. There are also metrics that we continuously track for our business, which give us a sense of where to improve.
Customer anecdotes are a very powerful mechanism for us. When customers give feedback and talk about reviews, policies or certain metrics, or if we see too many damaged products, those are cases when our ML team takes notice. In other cases, we get feedback that certain prices are not competitive or that the delivery time was too long.
You have more than 100 million customers now in India. Does that provide you with enough customer behaviour data to work out problem areas?
Customer feedback is always going to be important to improve your service. We can leverage customer behaviour to optimise our systems.
Some things I can also figure out from the data, like which items are selling more and which aren’t. For some learnings, you can have insights based on customer behaviour on your platform.
Customer feedback helps us understand why an item did not sell well. Behavioural data will tell you there is a problem but not the root cause of the problem. We now have a large number of customers, so we train models and get insights from data.
Which areas get the most customer feedback and how do you improve it?
One area we have worked on is addresses. Many times, we weren't able to deliver a package because the address provided by the customer was not complete or had a street name omitted.
We have developed algorithms to figure out the quality of those addresses. When a delivery fails or we are unable to deliver to an address, it costs us money. Just knowing the location of the address doesn't mean knowing how to get there. It requires a lot of effort to figure out the exact house or office.
How does ML resolve the issue around addresses?
ML helps us infer the location of an address even if we haven't delivered to that particular address. For example, if we have delivered to one apartment in a building, then we can easily pinpoint other apartments in that building. ML can identify that two addresses are in the same building. When someone enters an address, we prompt them with suggestions to make it complete or correct spelling errors.
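The two ideas in this answer, recognising that two addresses belong to the same building and suggesting corrections for misspelt entries, can be illustrated with a small sketch. It is not Amazon's method: it swaps in plain string similarity from Python's standard `difflib` module for a trained model, and the function names, the unit-prefix pattern, the 0.85 threshold and the sample addresses are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher, get_close_matches


def normalize(address: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", address.lower())
    return " ".join(cleaned.split())


def strip_unit(address: str) -> str:
    """Drop a leading flat/apartment designator such as 'Flat 302' (assumed pattern)."""
    return re.sub(r"^(?:flat|apt|apartment|unit)\s+\w+\s+", "", normalize(address))


def same_building(a: str, b: str, threshold: float = 0.85) -> bool:
    """Guess whether two addresses share a building, ignoring the unit number."""
    ratio = SequenceMatcher(None, strip_unit(a), strip_unit(b)).ratio()
    return ratio >= threshold


def suggest_locality(typed: str, known_localities: list[str]) -> list[str]:
    """Offer the closest known locality name for a possibly misspelt entry."""
    return get_close_matches(typed, known_localities, n=1, cutoff=0.8)


# Hypothetical sample data
delivered = "Flat 302, Lakeview Residency, 5th Cross, Indiranagar, Bengaluru"
new_order = "Apt 104 Lakeview Residancy, 5th Cross Indiranagar Bengaluru"
print(same_building(delivered, new_order))
print(suggest_locality("Indranagar", ["Indiranagar", "Koramangala", "Jayanagar"]))
```

A production system would learn from delivery outcomes and map coordinates rather than rely on character overlap, but the sketch shows why one successful delivery to a building helps locate its neighbours.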
Besides addresses, what other India-specific issues has ML identified?
The data quality for the catalogue is not of international standard, because we only have a marketplace model here. In the US, we have both a retail and a marketplace (model). For retail, a lot of the data is curated by Amazon, because the set of products is more limited.
But on the marketplace, any seller can create a product, so the product set is much larger, which cannot be curated. As the marketplace in India is populated by sellers, the quality of products can be mixed. The value and data they provide may not be of very high quality. So data quality is a big issue in India. ML addresses the quality of the catalogue by auto-populating information or data about standardised products.
Secondly, we have a lot of languages. E-commerce is new and English may not be everybody's native language. Even on search, we see mixed-language queries, with Indian words such as chappal thrown in. Content in Indian languages will increase over the next five years, as there will be many more vernacular users than English-speaking ones.
How does ML decide the ranking of search results, especially regarding regional preferences?
This is another India-specific initiative. We rank products with faster delivery speeds higher and give preference to regionally popular products.
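A ranking that favours faster delivery and regional popularity could look like the sketch below. The scoring formula, the weights `w_speed` and `w_region`, the `Product` fields and the sample figures are all hypothetical; the interview does not describe how Amazon actually combines these signals.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    relevance: float            # text-match score in [0, 1] (assumed input)
    delivery_days: int          # estimated days to reach the shopper
    regional_popularity: dict   # region name -> popularity in [0, 1]


def score(product: Product, region: str,
          w_speed: float = 0.2, w_region: float = 0.3) -> float:
    """Boost relevance by delivery speed and by popularity in the shopper's region."""
    speed = 1.0 / (1 + product.delivery_days)   # fewer days gives a larger boost
    popularity = product.regional_popularity.get(region, 0.0)
    return product.relevance + w_speed * speed + w_region * popularity


def rank(products: list, region: str) -> list:
    """Return products ordered from highest to lowest score for the region."""
    return sorted(products, key=lambda p: score(p, region), reverse=True)


# Made-up catalogue entries: equal relevance and delivery time, different regional appeal
catalogue = [
    Product("sari_a", 0.8, 2, {"Bengaluru": 0.9, "Kolkata": 0.2}),
    Product("sari_b", 0.8, 2, {"Bengaluru": 0.1, "Kolkata": 0.9}),
]
print([p.name for p in rank(catalogue, "Bengaluru")])
print([p.name for p in rank(catalogue, "Kolkata")])
```

With these invented numbers the same query returns the two saris in opposite order for the two cities, which is the effect described in the answer.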
Are there any infrastructure or logistics-related challenges that ML solves?
Infrastructure in India is not great. In many countries, shipping from one end of the country to another may not be so different. But in India, depending on the distance, delivery time can vary a lot. So, it's very important for local addresses to be available, because then you can ship much faster.
Similarly, payment is an area of concern. When a payment I make through my credit card doesn’t go through, I have no idea why it didn’t get processed. It could be a network issue or the server could be down. We use ML to predict which payment instrument has the most likelihood of succeeding and then tell our users.
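As a rough illustration of recommending the payment instrument most likely to succeed, the sketch below uses a smoothed historical success rate per instrument. This is a simple frequency baseline standing in for a trained model, not the approach described by Amazon; the function names, the smoothing prior and the sample attempts are assumptions.

```python
from collections import defaultdict


def success_rates(attempts, prior_success=1, prior_total=2):
    """Smoothed success rate per instrument from (instrument, succeeded) pairs."""
    counts = defaultdict(lambda: [0, 0])   # instrument -> [successes, attempts]
    for instrument, succeeded in attempts:
        counts[instrument][0] += int(succeeded)
        counts[instrument][1] += 1
    # The prior keeps instruments with very few attempts from looking perfect or hopeless
    return {
        name: (wins + prior_success) / (total + prior_total)
        for name, (wins, total) in counts.items()
    }


def recommend(attempts):
    """Return the instrument with the highest smoothed success rate."""
    rates = success_rates(attempts)
    return max(rates, key=rates.get)


# Hypothetical recent payment attempts
history = [
    ("credit_card", True), ("credit_card", False), ("credit_card", False),
    ("upi", True), ("upi", True), ("upi", False),
]
print(recommend(history))
```

A real predictor would also condition on signals such as the bank, the time of day and current network health, which is where a learned model improves on raw frequencies.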
How does ML help you understand the cultural nuances within India?
India is a large country with regional preferences. People searching for a sari in Bengaluru, Kolkata or Delhi each have a different style or design in mind. Regional tastes and preferences become very important. In our search algorithms, we look at the location of the users and match them with the products popular in their region.