
Curated data fails at scale; India is the real AI testbed: Jayaprakash Nair, Altimetrik


As AI adoption moves from experimentation to real-world implementation, many challenges remain. In a conversation with TechCircle, Jayaprakash Nair, Head of AI & Analytics at Altimetrik, shared how the company is approaching this shift, focusing on production-ready architecture, data quality, AI safety, and solving business problems without relying on hype. He also discussed India's role in the global AI landscape and the strategic areas where Altimetrik is placing its biggest bets over the next few years. Edited excerpts:
What does your AI production architecture look like today, especially around data, model deployment, and observability? What’s been the hardest part to get right?
When we talk about architecture, there are two key aspects to consider: conceptual architecture and physical or infrastructure architecture. Since this is a technical discussion, I’ll go into some detail.
Conceptual architecture depends on the business problem at hand. It could be a retrieval-augmented generation (RAG) based setup, or a combination of RAG and fine-tuning, sometimes referred to in the industry as RAFT (retrieval-augmented fine-tuning).

In real-world production deployments, simply calling a large language model (LLM) is not sufficient. A solid pre-processing pipeline is necessary to ensure the inputs sent to the LLM, such as the prompt and context, are properly curated. This helps optimize the model’s output and reduces common issues like hallucination. After receiving the LLM's response, some level of post-processing is often needed as well.
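As a rough illustration of that pattern, not Altimetrik's actual stack, the sketch below wraps an LLM call in explicit pre- and post-processing stages. The redaction rule, context cap, and refusal check are hypothetical placeholders; the client usage assumes the openai Python package (v1.x) with an API key in the environment.

```python
import re
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def preprocess(question: str, retrieved_chunks: list[str]) -> str:
    """Curate the context before it reaches the LLM: trim, de-duplicate,
    and redact obvious sensitive tokens (illustrative rule only)."""
    seen, curated = set(), []
    for chunk in retrieved_chunks:
        chunk = chunk.strip()
        if chunk and chunk not in seen:
            seen.add(chunk)
            curated.append(re.sub(r"\b\d{16}\b", "[REDACTED]", chunk))
    context = "\n\n".join(curated[:5])  # cap context to control cost and drift
    return (
        "Answer strictly from the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

def postprocess(answer: str) -> str:
    """Post-process the response: return a controlled refusal when the model
    signals it lacks grounding, instead of passing a guess downstream."""
    if "i don't know" in answer.lower():
        return "No grounded answer found in the provided sources."
    return answer.strip()

def rag_answer(question: str, retrieved_chunks: list[str]) -> str:
    prompt = preprocess(question, retrieved_chunks)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # low temperature suits grounded Q&A
    )
    return postprocess(resp.choices[0].message.content)
```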
So in most architectures, there are distinct pre-processing and post-processing stages. Everything mentioned so far relates to generative AI, but not all industry problems are limited to GenAI solutions. Some require other approaches, like discriminative AI, causal AI, or similar techniques. We often see different types of AI used together to address complex business problems.
Our process always starts with a problem. We don’t create solutions looking for a use case. If the problem can be solved with GenAI alone, that’s what we use. If it requires combining GenAI with other AI methods, we do that.

Infrastructure architecture, the way these solutions are deployed, is shaped by the customer’s existing investments. Some customers use AWS, others use Azure, GCP, or other platforms. We build solutions that work with their existing infrastructure rather than replacing it, extending what they already have to solve their business problems.
What are the most common technical reasons why AI proof of concepts fail to scale in production? Is infrastructure the main issue, or are there other factors like data quality or depth?
There are several reasons. Let's look at a specific scenario to understand the broader issue. Suppose you're building a proof of concept (POC) for a customer. Typically, the customer provides a manually curated snapshot of data: pulled from multiple sources, combined in one place, and cleaned by hand. Only the data relevant to the use case is included.
This curated data is then used for analysis, model building, and evaluation. The POC works, the model performs well, and the business is happy. But when it's time to move to production, the same approach doesn't scale. You can't rely on manual data prep every day. Instead, you need automated pipelines that pull data from multiple systems, store it properly, ensure data quality and security, and avoid breaks in the flow.

In a POC, an expert might catch and fix data issues manually. In production, those fixes need to be built into the system. That automation is complex and often becomes the biggest hurdle in moving from POC to production.
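A minimal sketch of the kind of automated quality gate that replaces those manual fixes, assuming pandas; the schema, thresholds, and the load_source helper in the closing comment are illustrative assumptions, not a specific customer pipeline.

```python
import pandas as pd

REQUIRED_COLUMNS = {"customer_id", "event_ts", "amount"}  # hypothetical schema

def quality_gate(df: pd.DataFrame) -> pd.DataFrame:
    """Automated stand-in for the fixes an expert applies by hand in a POC:
    fail loudly on structural problems, repair only what is safe to repair."""
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"Schema drift: missing columns {missing}")

    df = df.drop_duplicates()
    df["event_ts"] = pd.to_datetime(df["event_ts"], errors="coerce")

    bad_rows = df["event_ts"].isna() | df["amount"].isna()
    if bad_rows.mean() > 0.05:  # more than 5% unparseable: stop the pipeline
        raise ValueError(f"{bad_rows.mean():.1%} of rows failed validation")

    return df[~bad_rows]

# In production this gate runs on every scheduled pull, not once per demo:
# df = quality_gate(load_source("orders"))  # load_source is hypothetical
```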
This challenge isn't rare. It happens often across the industry. Business teams are convinced by the POC, but the shift to production reveals the real complexity: it's not just one model, but an entire AI pipeline with many parts that must be automated. That's where the real work begins.
Many companies today say they're solving the "last mile of AI." What does that phrase mean in your context? How does your recent ALTI Lab announcement relate to that goal?
From the beginning, we made a clear commitment: we will not engage in hype. Before sharing this externally, we aligned on it internally. In all conversations, whether with customers, partners, or others, we avoid name-dropping, jargon, or distractions. We focus entirely on the real business problem at hand. While we work with advanced technologies, we’re honest about their limits. Instead of hiding those limits, we bring them into the conversation. We clarify what’s possible, where the boundaries are, and what trade-offs exist. We walk through pros, cons, optimizations, and workarounds. Everything is shared openly and directly. That’s how we differentiate, by staying transparent and avoiding false promises.

Another key area we focus on is AI safety. It's a broad topic that includes multiple concerns, such as bias in models or the risk of sensitive data leaking outside the enterprise. We've built internal tools to address these. OpenAI, for instance, offers a Compliance API that logs ChatGPT interactions across the enterprise; those logs can be used for both internal and external audits.
We’ve taken this further with a tool called SafePrompt. If an enterprise user enters sensitive content, like business confidential data or PII, into a ChatGPT input, SafePrompt intercepts it before the model receives it. It flags the issue and blocks the message, preventing the data from leaving the organization’s firewall. This is different from simply logging what’s already been sent; it prevents the mistake altogether.
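SafePrompt itself is proprietary, so the following is only a toy sketch of the general interception pattern: scan an outbound prompt for PII-like strings and refuse to send it. The regexes are illustrative stand-ins; a production gateway would rely on far more robust detection.

```python
import re

# Illustrative PII patterns only; a real gateway would combine trained
# classifiers with enterprise-specific rules, not three regexes.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

class PromptBlocked(Exception):
    pass

def guard_prompt(prompt: str) -> str:
    """Intercept a prompt before it leaves the firewall. Raises instead of
    sending, so flagged data never reaches the external model."""
    hits = [name for name, pat in PII_PATTERNS.items() if pat.search(prompt)]
    if hits:
        raise PromptBlocked(f"Blocked: possible {', '.join(hits)} in prompt")
    return prompt

# Usage: every outbound call goes through the guard first.
# safe = guard_prompt(user_input)   # raises PromptBlocked on a match
# resp = client.chat.completions.create(
#     model="gpt-4o-mini", messages=[{"role": "user", "content": safe}])
```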
SafePrompt is just one of several innovations we've developed around both discriminative and generative AI. With the launch of our lab, we're formalizing and scaling these efforts. We're putting structured systems in place for AI safety governance, production enablement, and agentic development, moving from ad hoc progress to disciplined, enterprise-ready frameworks.
What role does India play in your company’s global AI strategy, and what unique advantages or challenges do Indian enterprises face in building enterprise-grade AI?

One key advantage we have in India is the volume and diversity of data we generate. As you know, many regions around the world are introducing policies to keep data within their borders. This trend is growing, with most countries either having such regulations in place already or moving in that direction. The idea behind this is to maintain control over locally generated data.
This has a direct impact on AI. Since training AI models requires large volumes of data, restricted data flow can make it difficult to build effective models in regions where data is limited. In contrast, India produces vast amounts of data across both B2C and B2B sectors. This creates a strong foundation for developing advanced AI systems.
A good example of this is Tesla’s interest in testing its vehicles in India. The complex road conditions here offer a wide range of edge cases and unique data points, which are valuable for training models. India’s demographic and environmental diversity naturally generates multi-dimensional data, making it ideal not just for training language models, but also for building sophisticated reasoning systems.
What are the most important AI bets you're making today that you believe will deliver results in the near future?

One important point, though not exactly a bet, since a bet implies a chance of failure, is based on what we know for certain and what the industry has confirmed: large language models are powerful, but they also come with limitations. Not every industry problem can be solved directly with these models. And by large language models, we mean any of the many variants that exist today.
Some problems require smaller, domain-specific models. Instead of using a 500-billion-parameter model, a company might use a base model with 50–60 billion parameters and fine-tune it with its own domain-specific data. This approach creates a model that performs well for that specific enterprise, rather than trying to be general-purpose.
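The interview doesn't disclose how this is done inside Altimetrik, but the underlying technique, parameter-efficient fine-tuning of a mid-size base model on domain data, can be sketched with the Hugging Face transformers and peft libraries. The base model name and target modules below are assumptions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

BASE = "meta-llama/Llama-2-13b-hf"  # placeholder: any mid-size base model

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)

# LoRA trains small adapter matrices instead of all base weights, which is
# what makes domain-specific tuning affordable for a single enterprise.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections; model-dependent
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of base parameters

# Training on the enterprise's domain corpus would follow, e.g. with
# transformers.Trainer or trl's SFTTrainer (omitted here for brevity).
```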
We're seeing growing interest in this area, which is why we built a framework called Domain Forge. It helps companies build domain-specific models efficiently. These models won’t answer every question, but they are much better than general-purpose models when it comes to tasks within the trained domain, whether it's answering, searching, or interpreting data.
This is one of the key offerings we’re taking to market. The second area we’re focusing on is safety. I've already shared a few examples of that. The third area relates to the wide range of language models available today. We aim to make it easier for people to benchmark and compare these models, especially for those trying to solve more general problems across industries.
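As a hedged sketch of what such benchmarking can look like in practice, the loop below runs a shared prompt suite through several candidate models and records latency alongside each output; the model list, prompts, and the missing quality-scoring hook are all placeholders.

```python
import time
from openai import OpenAI

client = OpenAI()
CANDIDATES = ["gpt-4o-mini", "gpt-4o"]  # placeholder model list
PROMPTS = ["Summarise the following text: ...", "Classify the sentiment: ..."]

def run_benchmark() -> list[dict]:
    """Run every prompt against every candidate and collect comparable records."""
    results = []
    for model in CANDIDATES:
        for prompt in PROMPTS:
            start = time.perf_counter()
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                temperature=0,  # deterministic-ish outputs aid comparison
            )
            results.append({
                "model": model,
                "prompt": prompt,
                "latency_s": round(time.perf_counter() - start, 2),
                "output": resp.choices[0].message.content,
                # a real harness would add a task-specific quality score here
            })
    return results
```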
Another key point is our partnership with OpenAI. We're one of the few official partners. That partnership followed a series of deep technical discussions and supports our differentiated approach in the market. It also strengthens our go-to-market strategy.
These are just some examples of the multi-channel strategies and offerings we’ve already launched. There will be more coming as a result of ongoing work in our labs.