As modern markets become increasingly data-driven, every participating individual and firm in these markets generates large data trails. To the extent that this data relates to personal information of individuals, India has witnessed an extended public debate on a new legal framework to govern personal data protection. Non-personal information, which does not identify individuals, also carries immense economic value in terms of the insights it can generate based on aggregate patterns. For example, commuter patterns can be revealed from traffic data, purchasing patterns in a demographic can be inferred from e-commerce data, and inferences about the spread of diseases can be made from public health information. Presently, the flow of such Non-Personal Data (NPD) is not regulated in India.
Discussions around the appropriate stance for Indian policy with respect to such data flows have grown in recent months. In July 2019, the Economic Survey of India 2018-19 called out the “data explosion of recent years” and stated that the data of Indians was akin to a natural resource belonging to the country, or a public good which may be utilised for economic benefit (Ministry of Finance, 2019). These wider debates have led the nodal ministry for information technology in India, the Ministry of Electronics and Information Technology (MeitY), to constitute a Committee to deliberate on these very concerns and formulate a data governance framework for NPD.
In this blog, we identify the policy objectives that should guide the policy stance in India on the governance of NPD. We also examine whether these policy objectives fall under the purview of existing regulatory authorities in India, or a future Data Protection Authority (DPA). We conclude by considering the question of whether there is a case for mandating free flow of NPD across sectors in India and across borders.
1. What is Non-Personal Data?
We interpret the term Non-Personal Data (NPD) to include all kinds of data except Personal Data. Personal data is defined under the draft Personal Data Protection Bill, 2018 (draft Bill) under section 3(29) (MEITy, 2018). It includes all data about or relating to a natural person who is directly or indirectly identifiable by such data. Identifiability of a natural person appears to be core to the definition of Personal Data.
Consequently, for clarity, we propose that data that does not pertain to or identify a human being should fall in the scope of NPD. Further, NPD appears to be of two types:
(i) Non-human NPD i.e. data which does not originate from or identify any human being. Examples of Non-human NPD could include statistical measures (such as GDP), data on weather and climatic conditions, supply chain data, data from industrial machines, aggregated e-commerce sales data etc. It could also include aggregate data sourced from multiple individuals where individuals are not identifiable, e.g. commute patterns, frequencies and loads on public transport systems.
(ii) Human NPD i.e. data that originally pertained to or identified a person but has subsequently been anonymised, making it impossible to identify the underlying natural person. Human NPD includes anonymised datasets of personal data such as personal health records, online/e-commerce shopping histories, location histories etc.
A note on “mixed datasets”: In most real-life situations, a dataset is very likely to be composed of both personal data and NPD. This is often referred to as a “mixed dataset” (European Commission, 2019). Examples of mixed datasets include a company’s tax records that mention the name and telephone number of the company’s managing director, a company’s knowledge of IT problems and solutions based on individual incident reports, or a research institution’s anonymised statistical data held together with the raw data initially collected (such as replies of individual respondents to survey questionnaires). Developing legal literature suggests that data protection obligations will be applicable when the mixed data can be used to directly or indirectly identify a data principal (Patrick Breyer v Bundesrepublik Deutschland, 2016). Where personal data and NPD in a mixed dataset are inextricably linked, it appears that personal data protection laws should apply to the whole set (European Commission, 2019). The current framing in the draft Bill also suggests that where such inextricably linked sets of personal data and NPD exist, they would be caught by the definition of personal data in the draft Bill.
2. Possible Objectives for Governing Non-Personal Data
The objectives that could guide any future policy on NPD appear to be driven by four core areas of concern:
(i) to ensure competitiveness in the digital economy;
(ii) the growth and development of international trade & commerce in the digital economy;
(iii) national security, and
(iv) mitigating privacy risks due to the re-identification of individuals from NPD datasets.
Objectives (i) to (iii) in the list above would guide any policy relating to all NPD (i.e. both human NPD and non-human NPD) while objective (iv) relating to privacy risks would need to guide policy regarding the processing of human NPD.
From this analysis, we find that there appears to be a limited role for the Data Protection Authority (DPA) in policy around NPD flows, given the scope and purview set out in the draft Bill. The DPA is mainly empowered to implement and enforce the provisions of the Bill itself. If NPD is not considered personal data, then it would fall outside the scope of the DPA’s authority except to the extent of objective (iv) above. Accordingly, the relevant laws, ministries and regulators in India will be tasked with the policy approach on NPD, while MeitY and the future DPA (as currently envisioned) would have a more limited role.
On the specific types of NPD mentioned in the question, it appears that:
- Community data (assuming it does not contain personally identifiable information) and Anonymised data would be human NPD which would need to be governed taking into account all four considerations above relating to competition, trade & commerce, national security and privacy (re-identification risks).
- E-Commerce data could contain mixed datasets of personal information, non-human NPD and human NPD so would need to be governed taking into account all four considerations above.
It is reiterated that where these datasets are inextricably linked to personal data, they would be regulated as personal data rather than NPD.
2.1. Ensuring competitiveness in the digital economy
As data forms an essential input for the digital economy, the way it is shared and pooled across various stakeholders can profoundly affect a market’s structure. Increasingly, jurisdictions globally are examining competition issues which may arise in the context of a digital and data-intensive economy. Data-intensive models could upset competitiveness in the market by raising high barriers to entry in the following ways:
(i) Economies of scale: Data-intensive business models exhibit unprecedented economies of scale. By definition, economies of scale make it more profitable to serve many consumers than a few (OECD, 2002), as average costs decline with output. This yields a significant competitive advantage for incumbents and explains the rise in zero-price services (European Commission, 2019).
(ii) Network Effects: The extreme economies of scale are complemented by network effects. A network effect “refers to the effect that one user of a good or service has on the value of that product to other existing or potential users” (UNCTAD, 2019). Positive network effects lead to greater value being generated for each incoming individual, leading to further entrenchment of incumbents.
(iii) Control over data: Together, economies of scale and network effects can lead to the generation of more data, which can help incumbents to fine-tune their services. This further confers market power on them.
(iv) Economies of scope: As incumbent providers get access to varied datasets over time, they are also able to enter other markets more easily and stunt the development of secondary markets elsewhere (UNCTAD, 2019).
Together these features create a tendency for digital markets to ‘tip’ swiftly and disproportionately in favour of an incumbent provider (Digital Competition Expert Panel, 2019). This dominance of an incumbent provider may appear innocuous at first. However, it can work to the detriment of consumers in the long run. It can reduce the choices available to consumers, adversely affect the quality of products available to them, increase the prices they face and, most of all, impede innovation in the sector (Digital Competition Expert Panel, 2019).
These competition and anti-trust issues emanating from the use of NPD appear closer to the jurisdiction of the competition authorities in India and the current consumer protection regime. These regimes should address these concerns, if necessary or appropriate. However, where these concerns are raised with regard to NPD, they are unlikely to fall within the purview of the draft Bill and the DPA, as no personal data of individuals is involved.
2.2. Considerations of international trade
Supporting the growth and development of trade and commerce is a key imperative for the Indian Government. As trade becomes increasingly data-driven and intermediated through digital processes, this objective will no doubt also apply when considering the policy position on trade-related NPD. Examples include data from routine trade activity like supply chains and trading contracts, data from e-commerce, information technology services etc. International trade rules matter for national data governance frameworks since they (i) impinge on the cross-border flow of data by regulating the underlying trade in goods and services and protecting intellectual property, (ii) impose certain international rules that require changes in national laws, and (iii) limit the policy space of national governments (Burri, 2017, p. 68).
The agreements negotiated by the World Trade Organization (WTO) in particular could have a large impact in this area. Member states of the WTO are essentially restricted from discriminating between products and services coming from different WTO Members, and between foreign and domestic products and services, unless they can avail of exceptions (Burri, 2017, pp. 72–73). Accordingly, to enable free trade & commerce, the general practice of WTO member states has been to avoid imposing customs duties on electronic transmissions, thus enabling the free flow of data across borders (WTO, 2019). India has been calling for a re-opening of the debate on the moratorium on customs duties on electronic transmissions (WTO Delegations of India & South Africa, 2018). As these negotiations progress, they could have an impact on policy frameworks that would govern personal data and NPD.
The Ministry of Commerce and Industry, Department of Commerce is the nodal agency of the Government of India for all matters pertaining to the WTO (Ministry of Electronics and Information Technology, n.d.), with support from relevant ministries including MeitY. Any policy on the governance of NPD flows will need to take into account India’s obligations under the international trade regime. These matters are unlikely to fall within the purview of the draft Bill and the DPA (except to the extent of protection where personal data of individuals is involved).
2.3. National Security
Access to data, both personal and non-personal, for safeguarding national security is common in legislation across the world (Scott, 2019).
The Indian Government has several initiatives and data sharing schemes that collect and process data for the purpose of enhancing public security, such as the National Intelligence Grid (NATGRID), the Crime and Criminal Tracking Network & Systems (CCTNS), the Network Traffic Analysis (NETRA) System and the Central Monitoring System, among others. These already collect large amounts of both personal and non-personal data, including satellite data, traffic and commute data, health data and financial transactions data (Xynou & Hickok, 2009). Several existing laws give power to state authorities to summon documentation, direct the furnishing of data and access computer resources held by others.
Any overarching policy framework for NPD flows will need to take account of all these existing regimes that already govern access to such information for national security purposes. To prevent the misuse of such access, it would be important to delineate the bounds within which NPD must be made accessible to the Government, using principles such as proportionality and necessity. The draft Bill contains a wide exemption from protections of personal data for Security of the State (in section 42). However, in the case of NPD access, these matters are unlikely to fall within the purview of the draft Bill and the DPA (except to the extent of protection where personal data of individuals is involved).
2.4. Privacy considerations in Non-Personal Data
Privacy considerations arise where natural persons are identified through the processing of NPD or re-identified when anonymised NPD is de-anonymised. This would convert the NPD to personal data, and bring it within the scope of the draft Bill. Re-identification risks can arise when (i) NPD records are singled out to directly or indirectly identify a data principal, (ii) records are linked with similar records in other datasets to narrow down an identity, or (iii) inferences about identity are made from the data that is available (Article 29 Data Protection Working Party, 2014).
Even anonymisation does not guarantee that privacy risks will not arise from processing activities. It is widely acknowledged that anonymisation can be reversed and carries a high risk of re-identification (Wes, 2017). The risk of re-identification can increase with the variety of data and the number of datasets that are accessible to an entity. Compiled datasets hence carry a high risk of re-identification, after which the re-identified personal data can be used for malicious purposes that harm data principals. This raises serious privacy concerns such as personal data breaches and illegal use of personal data.
This presents a compelling reason for the proposed DPA to set out policies for NPD to mitigate re-identification risk. The potential measures it could support to mitigate re-identification risks are as follows:
(i) The DPA could support codes to set standards for anonymisation that are thorough in masking directly and indirectly identifiable data to prevent singling out, linking or inference. It could monitor technological developments and commercial practices that may affect personal data protection, and review anonymisation methods as required.
(ii) The DPA could support the development of rigorous risk analysis techniques for use by data fiduciaries to estimate the risk of reidentification before anonymising data or before sharing it with a third party.
(iii) The DPA could support data audits & reviews of data fiduciaries’ anonymisation methods as well as anonymised datasets to check for reidentification risks.
(iv) The DPA could consider limitations such as requiring sharing of human NPD under contract in certain sectors so that data fiduciaries retain sufficient control on how it is used.
(v) The consent of data principals could be sought before sharing human NPD in cases where the consequences of re-identification may be dire.
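To make the idea of a re-identification risk assessment in measure (ii) more concrete, the sketch below estimates how easily records in an anonymised dataset could be singled out. It uses k-anonymity, a commonly used measure that is our choice for illustration and is not prescribed by the draft Bill or the sources cited here. The records, the field names (pin_prefix, age_band, gender) and the choice of quasi-identifiers are all hypothetical.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group: each record matches at least k-1 others."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def unique_fraction(records, quasi_identifiers):
    """Share of records that are unique on the quasi-identifiers (can be singled out)."""
    if not records:
        return 0.0
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    singled_out = sum(1 for size in sizes.values() if size == 1)
    return singled_out / len(records)


# Hypothetical anonymised health records: names removed, indirect identifiers kept.
records = [
    {"pin_prefix": "5600", "age_band": "30-39", "gender": "F", "diagnosis": "asthma"},
    {"pin_prefix": "5600", "age_band": "30-39", "gender": "F", "diagnosis": "diabetes"},
    {"pin_prefix": "5600", "age_band": "40-49", "gender": "M", "diagnosis": "asthma"},
    {"pin_prefix": "1100", "age_band": "30-39", "gender": "F", "diagnosis": "flu"},
    {"pin_prefix": "1100", "age_band": "30-39", "gender": "F", "diagnosis": "asthma"},
]

# Assumed quasi-identifiers: fields an outsider could plausibly match against other datasets.
quasi_identifiers = ["pin_prefix", "age_band", "gender"]

k = k_anonymity(records, quasi_identifiers)
share_unique = unique_fraction(records, quasi_identifiers)
print(f"k = {k}, share of records that can be singled out = {share_unique:.0%}")
```

In this toy dataset one record is unique on the chosen quasi-identifiers, so anyone holding a second dataset with the same three fields could single that person out. A data fiduciary could run a check of this kind before sharing human NPD and generalise or suppress fields until the smallest group is acceptably large. What counts as acceptable is a policy choice that a code of practice would need to set.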
3. Proposed Approach for Governance of Non-Personal Data
As identified, any future framework for the governance of NPD must consider the objectives of competition, trade, national security and privacy. It is proposed that the DPA only regulate aspects pertaining to privacy & data protection risks in NPD. In other areas, existing laws and regulators can address NPD. Accordingly, we suggest the following approach.
(i) Competition-related issues relating to NPD such as anti-competitive practices, abuse of dominance, market distortion and trade barriers created by entities are already addressed by the Competition Commission of India (CCI) pursuant to s.18, s.19 and s.20 of the Competition Act, 2002. The CCI could be the regulator for NPD in this context.
(ii) Trade-related issues relating to NPD pertain to matters which require serious consideration of domestic and foreign trade policies which are governed under international frameworks like GATT, GATS and the WTO. The Ministry of Commerce and Industry is better suited to address these issues.
(iii) National security-related issues relating to NPD pertain to the use, and the potential for misuse of the vast amounts of human and non-human non-personal data by government agencies. The collection and use of data are currently regulated by the Ministry of Home Affairs (Privacy International, 2019).
(iv) Privacy-related issues relating to NPD pertain to the privacy risks raised by the re-identification of human non-personal data. The DPA is responsible for data protection and preserving informational privacy under the draft Bill. It has the power to make regulations and codes to mitigate re-identification risks and allied privacy concerns. The DPA should, therefore, be the regulator for human non-personal data and mixed data in this context.
4. Conclusion: Is there a case for mandating free access to Non-Personal Data?
The Economic Survey of India 2018-19 explores the potential of data generated by Indians for enhancing the economy and improving policymaking. It considers the value that can be extracted from this data by means of aggregation as well as the data analytical methods that are now available. This provides some indication of the intentions of the Government with respect to data governance, especially regarding entities that possess large volumes of data of Indian consumers.
As noted with respect to the identified objectives, the DPA defined under the draft Personal Data Protection Bill may not be suitable as the sole regulator for the different objectives relating to NPD. Given the different considerations for the different categories of NPD, a blanket, one-size-fits-all governance framework may not be the optimal regulatory stance. While there are clear benefits to the free flow of data across the economy, research is slowly uncovering some effects that might counter or offset some of those benefits.
Wholesale access to data through data interoperability could dismantle the dominance of a few select firms, but it is unclear how it would affect the incentives for innovation in the market. Data access might ease the opening of secondary markets for complementary services or help dislodge a dominant provider (European Commission, 2019). However, data interoperability may not be a universally applicable tool to promote competition, given its adverse impact on incentives for innovation and business secrets. It is prudent to consider mandated data interoperability and free flow of NPD in a sector-specific manner, based on the merits of the case and subject to safeguards to protect overall economic and social welfare.
Article 29 Data Protection Working Party. (2014). Opinion 05/2014 on Anonymisation Techniques.
Committee of the Experts under the chairmanship of Justice Srikrishna. (2018). Free and Fair Digital Economy. Government of India.
Digital Competition Expert Panel. (2019). Unlocking Digital Competition: Report of the Digital Competition Expert Panel. Retrieved from https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/785547/unlocking_digital_competition_furman_review_web.pdf
European Commission. (2019). Competition Policy for the Digital Era. Basel: European Commission.
European Commission. (2019, May 29). Guidance on the Regulation on a framework for the free flow of non-personal data in the European Union. Retrieved from EUR-Lex: https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52019DC0250&from=EN
Ministry of Electronics and Information Technology (MEITy). (2018, July 27). The Personal Data Protection Bill, 2018. Retrieved from Ministry of Electronics and Information Technology: https://meity.gov.in/content/personal-data-protection-bill-2018
Ministry of Electronics and Information Technology (MEITy). (2019, September 13). Constitution of a Committee of Experts to deliberate on Data Governance Framework. Retrieved from Ministry of Electronics and Information Technology (MEITy): https://meity.gov.in/writereaddata/files/constitution_of_committee_of_experts_to_deliberate_on_data_governance-framework.pdf
Ministry of Consumer Affairs, Food and Public Distribution. (2019, August 2). Draft Guidelines on e-commerce for consumer protection. Retrieved from Department of Consumer Affairs: https://consumeraffairs.nic.in/sites/default/files/file-uploads/latestnews/Guidelines%20on%20e-Commerce.pdf
Ministry of Finance. (2019, July). The Economic Survey of India 2018-2019. Retrieved from Economic Survey, Ministry of Finance, Government of India: https://www.indiabudget.gov.in/economicsurvey/
OECD. (2002). Economies of Scale. Retrieved from OECD Glossary of Statistical Terms: https://stats.oecd.org/glossary/detail.asp?ID=3203
Patrick Breyer v Bundesrepublik Deutschland, C-582/14 (Court of Justice of the European Union October 19, 2016).
Privacy International. (2019, January). State of Privacy India. Retrieved from Privacy International: https://privacyinternational.org/state-privacy/1002/state-privacy-india#dataprotection
Scott, P. F. (2019, February 23). NATIONAL SECURITY, DATA PROTECTION AND DATA SHARING AFTER THE DATA PROTECTION ACT 2018. Retrieved from SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3340543
UNCTAD. (2019). Competition issues in digital economy.
Wes, M. (2017, April 25). Looking to comply with GDPR? Here’s a primer on anonymisation and pseudonymisation. Retrieved from IAPP: https://iapp.org/news/a/looking-to-comply-with-gdpr-heres-a-primer-on-anonymization-and-pseudonymization/
Xynou, M., & Hickok, E. (2009, December 23). Security, Surveillance and Data Sharing Schemes and Bodies in India. Retrieved from Centre for Internet and Society: https://cis-india.org/internet-governance/blog/security-surveillance-and-data-sharing.pdf
 A version of this research and analysis has been shared with MeitY.
 Section 3(29) of the draft Bill states that “personal data” means data about or relating to a natural person who is directly or indirectly identifiable, having regard to any characteristic, trait, attribute or any other feature of the identity of such natural person, or any combination of such features, or any combination of such features with any other information.
 Earlier this year, the UK Treasury released the report of its expert panel on competition in the digital economy (Digital Competition Expert Panel, 2019), followed by the European Commission’s report on competition issues (European Commission, 2019) and UNCTAD’s report on these issues (UNCTAD, 2019).
 Section 91 of the Code of Criminal Procedure (CrPC), 1973 (Summons to produce document or other thing) provides for access to any stored content. More specifically, this section empowers the courts of India, or relevant enforcement personnel, to summon a person to produce any articles or documents that may be deemed necessary for any inquiry, investigation or trial taking place under the CrPC. This provision may be extended to access non-personal data as well (Privacy International, 2019).
Section 43(F) (Obligation to furnish information) of the Unlawful Activities (Prevention) Act, 1967 requires any firm, institution, establishment or organisation to furnish information that is deemed necessary for the purposes of the Act (Ministry of Home Affairs, 1967).
Section 69 of the Information Technology Act, 2000 (the IT Act) empowers central and state governments to issue directives for the monitoring, interception or decryption of any information transmitted, received or stored through a computer resource (Privacy International, 2019).
Separately, section 70 of the IT Act gives any competent authority the power to notify computer resources as Critical Information Infrastructure and to authorise access to them. Within the IT Act, Critical Information Infrastructure refers to a computer resource the destruction or incapacitation of which would negatively impact national security (Ministry of Electronics and Information Technology, 2000). In the absence of any legal provisions, the Government may rely on the IT Act to obtain access to NPD from relevant entities.
 The assessment can consider factors including (a) the sensitivity of the personal data, (b) the potential for indirect identification in the dataset, (c) publicly available datasets which can be combined with the anonymised dataset to suggest links between records, and (d) the consequences of re-identification (MEITy, 2018) (Article 29 Data Protection Working Party, 2014) (United Kingdom Information Commissioner’s Office, 2012).