Commvault’s Balaji Rao: Customers are searching for the right framework to deploy AI safely
Enterprises in India are still early in adopting generative and agent-based AI, but governance and data handling have become immediate concerns as pilots move toward production.
In a recent conversation with TechCircle, Balaji Rao, Area Vice President for India & SAARC at Commvault, described what he hears from CIOs and C-suite leaders about AI security risks, including the risk of exposing personal data under India’s DPDP Act, and more.
Edited Excerpts:
What’s the biggest AI security worry you’re hearing from Indian CIOs and C-suite leaders over the past year?
Most customers are in the early stages of the journey, having made some investments, running pilots, and having prototypes ready, with some of them going into production. But fundamentally, as you move from assistant AI to autonomous AI, where decisions are being taken by agents, there is a concern: is the data being provided clean and safe, or does it contain any personally identifiable information? So those concerns around governance are quite high.
While I would say the resilience needs are also high, customers have to reach a stage of good production adoption before they worry about that. They are more worried about their cloud infrastructure and other areas when it comes to resilience and ransomware protection. Whereas here the question is: as you build the models, how do you ensure that there is clean data available, and how do you follow a proper governance framework, where you are in a position to, say, backtrack in case something goes wrong?
Avoidance is the best thing that can be done. These concerns are also aligned with the fact that the DPDP Act is in force, and there are concerns around that, too, because the fines are heavy. If something goes wrong and some personal information gets published somewhere, knowingly or unknowingly through the agent, then it has different implications, financial as well. So these concerns are beginning to crop up, and customers are searching for the right framework and the right architecture for deploying these in a safe and responsible way.
Your company also addresses a term called data poisoning. So, what is data poisoning in simple terms, and how common is it in Indian AI projects?
I think, whether Indian or global, the concern is pretty much common. One of the use cases that our customers have been asking about over the last 12 months, which I think is a very practical use case for somebody like Commvault, is that we have been custodians of data for years, and customers want to use it. Some of them have a seven-year retention, some of them have a 10-year retention. In that context, you have seven to 10 years of your company’s data in Commvault.
And obviously, the models that need to be built need to be fed the right kind of data. So, what is a safe way by which this data can be provided, so that customers can train these language models to do what they want them to do? Commvault is coming up with Commvault Data Rooms, a way of providing the data safely and in a form that can be consumed, because most of this data is encrypted or stored in a safe way, whereas the models need to be fed in a certain way so that they can use the data effectively.
So, Commvault Data Rooms is one way of ensuring that clean data is provided and fed to the AI models, and that the necessary output can be achieved in a safe way. In the absence of that, you have issues of another kind, such as the data poisoning you mentioned.
Outside of this, Satori, an Israeli company we acquired recently, gives us good inroads into sensitive data governance, especially around structured data. Satori is a leader in doing that with databases, which is where some of the larger financial institutions and others are concerned about the governance piece. And it fits in very well with that piece of the puzzle.
Where do you see the biggest “silent failures” in today’s AI stack—data, vector databases, RAG, or production inference?
I would say it’s all about data in this business. The first place is the data itself.
The second place, where Commvault may not have a role to play but which is very critical, is the pace at which we have moved on digitisation over the last few years. For example, if it’s a bank, you’re worried about how to onboard a customer in two or three steps, how to provide them a mobile interface, and how to simplify banking. That was more of an application build-out running on infrastructure, and it didn’t have anything to do with internal people and organisational processes.
Whereas today, if you look at the scale of agents, we are talking about one human to 80 agents already. Essentially, it means there is going to be some kind of reorganisation that needs to happen internally. Even roles and responsibilities would change internally. So apart from the fact that we are talking about a data issue and a governance issue, the larger issue here, unlike digitisation, unlike mobile adoption, unlike other things that happened, is more internal to the organisation, where the CEO’s office has to drive a lot of change. To use a broader term, culture could change things in many ways as we adopt AI going forward.
AI, cloud, and ransomware now share the same risk environment. What’s the main choke point: identity, storage, or something else?
If you treat AI as another workload, though a very significant workload, customers already have a multi-cloud scenario, and some of them might have on-prem as well. The threat of ransomware is all across, whether it’s an AI application or any other application.
So the challenge for the customer is recovery. If there is a ransomware situation, whether it’s an AI workload or a cloud workload, the time taken to recover continues to be one of the biggest challenges. This is a board-level question: if faced with ransomware, how quickly can we recover? The complexities involved in these multiple workloads make recovery difficult for a customer.
This is where Commvault comes in with a unified platform that we just announced called Commvault Cloud Unity Platform, wherein we support not only multi-cloud workloads and on-prem, but also AI workloads.
Along with it, the point about identity resilience is becoming extremely important, because without an AD or an Okta (largely AD in these parts of the world), you don’t have access to your own house. Users and business users also won’t have access. While you may have a clean copy of data and your application may be up and running, if AD doesn’t authenticate, the story is over.
So the ability to protect and recover AD fast is very critical. That is where Commvault has launched automated forest recovery of AD, where we can bring all the AD infrastructure up like a runbook in an automated way. This is important even in a multi-cloud or hybrid scenario, but it becomes even more accentuated in an AI scenario, with multiple agents logging in with the necessary identity.
We have built more IP around this, integrated with our Commvault Cloud Unity Platform, so customers can recover workloads and infrastructure in multi-cloud environments and ensure identity is recovered. They also have the ability to test AD recovery, so in a given scenario, they know how long it will take to recover.
How has the DPDP Act shifted resilience priorities for enterprise leaders—more audits, stricter retention, and faster breach response?
Customers are generally worried now about the right kind of data—personal data—and how it is being used. This concern was a little less a couple of years back, but with the Act, it has become more prominent.
Data classification is one of the things customers don’t do well enough. We have always had that ability for unstructured data. Now, with the Satori acquisition, we have it for structured data, too.
If you can classify your data in an organisation, put it in various buckets, and keep your personal information and your IP secure (suppose I am Commvault: the code is very critical, and it can’t be floating here and there), then the ability to keep that data safe, back it up differently, or keep it in an air gap is something customers want now.
They want to make sure sensitive personal data doesn’t get exposed to any external site or show up on the dark web. That’s the larger concern, especially since penalties are high. Customers are worried about that scenario.
That said, some of our customers are already used to it because they have been adhering to the GDPR regime due to global exposure. GDPR is, I would say, “DPDP++” in terms of the compliance levels required. So some are used to it in some way. But some Indian organisations that are local and do not have a global presence are now getting used to this Act.
We help them by mapping relevant parts of the DPDP Act to what our technology can do, and we help them with the feature functionalities they can use to achieve compliance.
So if you talk about resilience, what's the next category you think will merge into resilience? Do you think it's data governance or AI safety or something else entirely?
From a current theme standpoint, data governance seems to be on top of people’s minds, and I think it is going to evolve even more as we see more use cases of AI evolve.
We are seeing use cases come forward internally and externally. We also use AI within Commvault. In the way our engineering uses AI, and in the governance process around it, there are about 10 steps we follow for any AI application. Nine steps would probably be governance, and the tenth would be log files, because the ability to trace back something that happened and take it back to a previous state is critical.
There is going to be more governance around this, because without it, this could go any which way. And you also have to ensure the goal can be achieved. A high level of governance can slow things down. You want velocity, and velocity without governance increases risk. So we have to ensure there is a balanced approach.
One example from a governance standpoint: if there are five or six people on the team and you want to give them data to build a model, and some of it is personal data, we have the ability to redact it at a certain point in time and not give it away forever. Some governance mechanisms are being built in with the tools that we have.
As models get more mature, we will see more challenges evolve. Given the high level of automation and the black box mechanism these models work with, we have to be careful about what we expose.
What’s next for Commvault in AI? Any expansion or acquisitions?
A lot of what’s next is in our recent announcements. We’ve introduced synthetic recovery—an AI-assisted, patent-pending capability expected to be released soon—to help customers remove malicious code from the latest backup so they can recover without rolling back to older data.
We’re also adding conversational AI to help teams investigate alerts by tracing issues back a few days and taking action earlier. Alongside Commvault Data Rooms and Satori for sensitive data governance, these capabilities are being integrated into the Commvault Cloud Unity Platform, which supports structured and unstructured data governance and AI workloads.
The platform also includes air-gapped backups and clean room recovery across major hyperscalers, plus infrastructure recovery—such as rebuilding AWS infrastructure from an India region to Singapore, in most cases in under an hour—combined with identity resilience so Active Directory can be recovered first, followed by applications and data. We integrate with security vendors as part of what we call resiliency operations, linking recovery with SecOps in a bidirectional way and aligning coverage across the NIST framework.

