The Economic Survey of 2019 has recommended the creation of a central database with a strong privacy framework that could benefit citizens, governments and private sector firms as India begins its digital transformation.
Pointing out that India was undergoing a data explosion and that the cost of data storage per gigabyte had fallen from $61,050 in 1981 to $3.48 today, the Survey said that information stored in large datasets, if connected, could help minimise errors and save valuable government resources that could be passed back to citizens.
The Survey gave the example of merging two disparate datasets: transaction data from Jan Dhan accounts and data from the Mahatma Gandhi National Rural Employment Guarantee Scheme (MGNREGS).
“As MGNREGS can be a real-time indicator of rural distress, the credit scoring done using the transactions data of Jan Dhan accounts can be used to provide credit in districts/panchayats that are experiencing distress. Such combining of disparate datasets can be extremely useful in obtaining the necessary richness required to design and implement welfare policies,” it said.
The Survey also said that the declining cost of gathering, storing, processing and disseminating data could yield increasing marginal benefits such as evidence-based policy, accountability in public services, better targeting of welfare programmes, nationally integrated markets and product innovation.
In fact, the Survey cites a similar trend in the private sector, which is seeing investment in data-related endeavours. According to a Forbes survey done in 2017, 53% of companies use Big Data to take decisions.
To take advantage of these data trends, the Survey proposes creating a new kind of central database that will work quite differently from the databases now maintained separately by local, state and central governments.
For example, if a person’s medical records are held in a government-run district hospital, they will most likely be on paper and will be aggregated and available for analysis at the state or central level.
However, the new central database makes each department responsible for collecting, sharing and protecting the data it holds, and for setting its own standards of what counts as public and private data, in line with the guidelines of the data privacy law.
“This data is then made available through a data access fiduciary to data requestors. Data requestors may be public or private institutions but can only access the data if they have appropriate user consent. The data access fiduciaries themselves have no visibility on the data due to end-to-end encryption. Such a model puts user consent in the centre of the government’s initiative to make data a public good,” the Survey said.
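The consent rule in this model can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the class `DataAccessFiduciary`, its methods and the sample identifiers are hypothetical names, not anything the Survey specifies, and encryption itself is not implemented. The payload is simply treated as opaque bytes that the fiduciary forwards without reading.

```python
class DataAccessFiduciary:
    """Hypothetical go-between that forwards data only with user consent."""

    def __init__(self):
        # (citizen_id, requestor) pairs for which consent is on record.
        self._consents = set()

    def grant_consent(self, citizen_id, requestor):
        self._consents.add((citizen_id, requestor))

    def revoke_consent(self, citizen_id, requestor):
        self._consents.discard((citizen_id, requestor))

    def fetch(self, citizen_id, requestor, provider):
        """Return the provider's payload only if consent exists.

        `provider` is any callable returning bytes. The fiduciary passes the
        bytes through untouched; in the model described above they would be
        encrypted end to end, which this sketch does not attempt.
        """
        if (citizen_id, requestor) not in self._consents:
            raise PermissionError("no user consent on record for this requestor")
        return provider(citizen_id)


def hospital_provider(citizen_id):
    # Stand-in for a department's data service (assumed, illustrative only).
    return b"opaque-encrypted-record-for-" + citizen_id.encode("utf-8")


fiduciary = DataAccessFiduciary()
fiduciary.grant_consent("citizen-1", "insurer-x")
payload = fiduciary.fetch("citizen-1", "insurer-x", hospital_provider)
```

A requestor without recorded consent gets a `PermissionError` instead of data, which is the behaviour the quoted passage describes.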
In outlining how to create the central database, the Survey said that the government collects four kinds of data – administrative data, survey data, transactions data and institutional data.
Administrative data consists of birth, death, pensions, tax and marriage records, while survey data consists of census data and National Sample Survey data. Transactions data consists of datasets from e-NAM (National Agriculture Market) and the Unified Payments Interface.
Institutional data consists of data from public schools and hospitals, etc.
However, consolidating data or information can be a challenge in the country because there is no common code or identifier: several departments and ministries use their own codes.
To address this issue, the Survey recommends following the Local Government Directory, an application developed by the ministry of panchayati raj that assigns a unique code to every place and makes these codes available to the public.
“If all government databases requiring location codes are aligned with the codes in this directory, then all databases will share a common location field that can help in merging data, and reduce accuracy errors in the distribution of welfare,” it said.
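A minimal Python sketch shows why a shared location field makes merging straightforward. The field name `location_code`, the codes and all the figures below are invented for illustration; they are not real directory codes or real scheme data.

```python
from collections import defaultdict


def merge_on_location(left_rows, right_rows, key="location_code"):
    """Inner-join two lists of dict records on a shared location code."""
    by_code = defaultdict(list)
    for row in right_rows:
        by_code[row[key]].append(row)

    merged = []
    for row in left_rows:
        # Only locations present in both datasets produce a merged record.
        for match in by_code.get(row[key], []):
            merged.append({**row, **match})
    return merged


# Illustrative records only (assumed values).
employment_rows = [
    {"location_code": "A001", "work_demand_days": 1200},
    {"location_code": "A002", "work_demand_days": 300},
]
credit_rows = [
    {"location_code": "A001", "avg_credit_score": 0.62},
    {"location_code": "A003", "avg_credit_score": 0.71},
]

merged_rows = merge_on_location(employment_rows, credit_rows)
```

Without a common code, the same join would first need a manual mapping between each department's own identifiers, which is where mismatches and errors tend to creep in.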
The Survey also suggests three critical rules to ensure a strong privacy framework, so that data is not tampered with and citizens retain the option to opt out in some cases.
“First, while any ministry should be able to view the complete database, a given ministry can manipulate only those data fields for which it is responsible. Second, updating of data should happen in real time and in such a way that one ministry’s engagement with the database does not affect other ministries’ access,” it said, adding that the database should be secure, with absolutely no room for tampering.
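The first rule, that every ministry can view a record but change only its own fields, can be sketched in Python as follows. The ministry names, field names and the `FIELD_OWNERS` mapping are hypothetical, chosen only to illustrate the idea; the Survey does not prescribe an implementation.

```python
# Hypothetical mapping of each data field to the ministry responsible for it.
FIELD_OWNERS = {
    "tax_status": "finance",
    "vaccination_status": "health",
}


class SharedRecord:
    """A record any ministry may read but only field owners may modify."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def view(self):
        # Return a copy so that callers cannot modify the record by accident.
        return dict(self._fields)

    def update(self, ministry, field, value):
        if FIELD_OWNERS.get(field) != ministry:
            raise PermissionError(
                f"{ministry} is not responsible for field {field!r}"
            )
        self._fields[field] = value


record = SharedRecord({"tax_status": "filed", "vaccination_status": "unknown"})
record.update("health", "vaccination_status", "complete")
snapshot = record.view()
```

Here a call such as `record.update("health", "tax_status", "pending")` raises `PermissionError`, while `record.view()` stays open to every ministry.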
The Survey also proposes that people can opt out of divulging data to the government, where possible. This means that one can choose not to use government-run payments services or participate in a survey.
However, citizens can’t choose to opt out of sending or giving data in places where they are legally bound to do so such as in the case of vehicle registration.
In order to promote transparency, the Survey also talks about maintaining immutable access logs for all data that would be made available to the public so that citizens know who has seen their data and why.
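One common way to make such a log tamper-evident is a hash chain, in which every entry's hash also covers the previous entry's hash. The Python sketch below assumes that technique purely for illustration; the Survey does not say how the logs should be built, and the class and field names are hypothetical.

```python
import hashlib
import json


def _digest(entry, prev_hash):
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps(entry, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AccessLog:
    """Append-only log in which any later edit breaks the hash chain."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []

    def record(self, requestor, citizen_id, purpose):
        entry = {
            "requestor": requestor,
            "citizen_id": citizen_id,
            "purpose": purpose,
        }
        prev = self._entries[-1]["hash"] if self._entries else self.GENESIS
        self._entries.append({"entry": entry, "hash": _digest(entry, prev)})

    def entries_for(self, citizen_id):
        """Let a citizen see who accessed their data and why."""
        return [
            item["entry"]
            for item in self._entries
            if item["entry"]["citizen_id"] == citizen_id
        ]

    def verify(self):
        """Recompute the chain; False means some entry was altered."""
        prev = self.GENESIS
        for item in self._entries:
            if item["hash"] != _digest(item["entry"], prev):
                return False
            prev = item["hash"]
        return True


log = AccessLog()
log.record("insurer-x", "citizen-1", "claim verification")
log.record("state-health-dept", "citizen-1", "scheme eligibility")
chain_ok = log.verify()
```

A hash chain detects tampering rather than preventing it, so a real deployment would also need the chain to be published or replicated where no single party can rewrite it.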
The Survey also proposes an infrastructure for creating and maintaining the central database. This includes digitising existing paper-based data and collecting data digitally at source.
In terms of storage, the Survey proposes real-time storage for select data and reducing time lag between collection and data entry.
The Survey also says that governments at all administrative levels should invest in building their internal capacities to exploit data in real time, perform analyses and translate data into meaningful information.
“While every government department may have a dedicated analytics or data insights division, the ministry of statistics and programme implementation and the ministry of electronics and information technology can act as nodal departments to steer such efforts at the national level,” it said, adding that the government may open up data processing to private players with necessary safeguards.
For dissemination of data, the Survey proposes the creation of an Open Government Data Portal for the public, along with dashboards for every scheme launched.
In terms of applications, the government, the private sector and citizens can all be beneficiaries, the Survey said.
While the government can use the database to cross-verify income-tax returns against Goods and Services Tax (GST) data to highlight tax evasion, private firms can be granted access to select databases for commercial use.
The Survey gives the example of access to data on students’ test scores across districts. “Using test scores of students, demographic characteristics of each district and publicly available data on the efficacy of public education schemes, a private firm may be able to uncover unmet needs in education and cater to these needs by developing innovative tutoring products tailored to the specific needs of specific districts. These products would not only create profits for the private sector, but also monetise data and generate revenues for the government, in addition to improving education levels and social welfare,” it said.
Another use case could be the sale of datasets to analytics agencies that generate insights and sell them to the corporate sector so that it can predict demand and discover untapped market opportunities.
For citizens, the Survey said that use cases include services such as Digi-Locker and the NBFC-AA.