In many ways, the next technological revolution will be one based on data; the businesses that can best use that data in a legal, private, and secure manner will be the winners
By Cameron Kerry
Our world is undergoing an information Big Bang, in which the universe of data doubles every two years and quintillions of bytes of data are generated every day. For decades, Moore’s Law on the doubling of computing power every 18-24 months has driven the growth of information technology.
Now, as billions of smartphones and other devices collect and transmit data over high-speed global networks, store it in ever-larger data centres, and analyse it using increasingly powerful and sophisticated software, Metcalfe’s Law comes into play. It treats the value of a network as a function of the square of the number of nodes, meaning that network effects compound this historical growth in information. As 5G networks and, eventually, quantum computing are deployed, this data explosion will grow even bigger and faster.
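In concrete terms, Metcalfe’s Law counts the pairwise connections a network makes possible. The short Python sketch below (the node counts are arbitrary) shows how doubling the nodes roughly quadruples the potential connections:

```python
# A minimal illustration of Metcalfe's Law: the value of a network grows
# roughly with the square of its number of nodes, because each of n nodes
# can connect to (n - 1) others.

def metcalfe_value(nodes: int) -> int:
    """Number of possible pairwise connections among `nodes` participants."""
    return nodes * (nodes - 1) // 2

for n in (10, 100, 1_000, 10_000):
    print(f"{n:>6} nodes -> {metcalfe_value(n):>12,} possible connections")

# Doubling the nodes roughly quadruples the potential connections,
# which is why network effects compound the growth in data so quickly.
```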
The impact of big data is commonly described in terms of three “Vs”: volume, variety, and velocity. More data makes analysis more powerful and more granular. Variety adds to this power and enables new and unanticipated inferences and predictions. And velocity facilitates analysis and sharing in real time. Streams of data from mobile phones and other online devices expand the volume, variety, and velocity of information about every facet of our lives and put privacy into the spotlight as a global public policy issue.
Artificial intelligence is likely to accelerate this trend. Much of the most privacy-sensitive data analysis today, such as search algorithms, recommendation engines, and AdTech networks, is driven by machine learning and decisions by algorithms. As artificial intelligence evolves, it magnifies the ability to use personal information in ways that can intrude on privacy interests by raising the analysis of personal information to new levels of power and speed.
Facial recognition systems offer a preview of the privacy issues that emerge. With the benefit of rich databases of digital photographs available via social media, websites, driver’s license registries, surveillance cameras, and many other sources, machine recognition of faces has progressed from fuzzy images of cats to rapid (though still imperfect) identification of individual humans. Facial recognition systems are being deployed in cities and airports around America. However, China’s use of facial recognition as a tool of authoritarian control in Xinjiang and elsewhere has awakened opposition to this expansion and prompted calls for a ban on the technology. Owing to these concerns, the cities of Oakland, Berkeley, and San Francisco in California, as well as Brookline, Cambridge, Northampton, and Somerville in Massachusetts, have adopted bans on facial recognition, and California, New Hampshire, and Oregon all have enacted legislation banning its use with police body cameras.
Privacy issues in AI
The challenge is to pass privacy legislation that protects individuals against any adverse effects from the use of personal information in AI, but without unduly restricting AI development or ensnaring privacy legislation in complex social and political thickets. The discussion of AI in the context of the privacy debate often brings up the limitations and failures of AI systems, such as predictive policing that could disproportionately affect minorities or Amazon’s failed experiment with a hiring algorithm that replicated the company’s existing disproportionately male workforce. These both raise significant issues, but privacy legislation is complicated enough even without packing in all the social and political issues that can arise from uses of information. To evaluate the effect of AI on privacy, it is necessary to distinguish between data issues that are endemic to all AI, like the incidence of false positives and negatives or overfitting to patterns, and those that are specific to use of personal information.
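To make that distinction concrete, the toy Python sketch below computes false positive and false negative rates for an arbitrary classifier; the labels and predictions are invented, and nothing in the calculation depends on whether the underlying data is personal:

```python
# Toy illustration of error types endemic to any predictive model,
# independent of whether the underlying data is personal.
# Labels and predictions here are invented for the example.

actual    = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = event occurred
predicted = [1, 1, 0, 1, 0, 0, 1, 1]  # model's guesses

false_positives = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
false_negatives = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
negatives = actual.count(0)
positives = actual.count(1)

print(f"False positive rate: {false_positives / negatives:.0%}")  # 2/4 = 50%
print(f"False negative rate: {false_negatives / positives:.0%}")  # 1/4 = 25%
```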
The privacy legislative proposals that involve these issues do not address artificial intelligence by name. Rather, they refer to “automated decisions” (borrowed from EU data protection law) or “algorithmic decisions” (the term used in this discussion). This language shifts the focus from the use of AI as such to the use of personal data in AI, and to the impact this use may have on individuals. The debate centres in particular on algorithmic bias and the potential for algorithms to produce unlawful or undesired discrimination in the decisions to which they relate. These are major concerns for civil rights and consumer organisations that represent populations that suffer undue discrimination.
Addressing algorithmic discrimination presents basic questions about the scope of privacy legislation. First, to what extent can or should legislation address issues of algorithmic bias? Discrimination is not self-evidently a privacy issue: it presents broad social problems that persist even without the collection and use of personal information, and it falls under the domain of various civil rights laws. Moreover, opening those laws up for debate could effectively open a Pandora’s box, given the charged political issues they touch on and the multiple congressional committees with jurisdiction over them. Even so, discrimination is based on personal attributes such as skin colour, sexual identity, and national origin. Using personal information about these attributes, whether explicitly or, more likely and less obviously, via proxies, in automated decision-making that works against the interests of the individual involved thus implicates the privacy interest in controlling how information is used.
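A hypothetical sketch shows how a proxy can do this work: even when the protected attribute is dropped from a dataset, a correlated feature such as a postal code can let a model reconstruct it (all values below are invented):

```python
# Hypothetical illustration of proxy discrimination: even if a protected
# attribute is removed from the training data, a correlated feature such
# as a postal code can let a model infer it. All data here is invented.
from collections import Counter

records = [
    # (postal_code, protected_group)
    ("10001", "A"), ("10001", "A"), ("10001", "A"), ("10001", "B"),
    ("20002", "B"), ("20002", "B"), ("20002", "B"), ("20002", "A"),
]

by_code = {}
for code, group in records:
    by_code.setdefault(code, Counter())[group] += 1

for code, counts in by_code.items():
    majority, n = counts.most_common(1)[0]
    print(f"Postal code {code}: {n}/{sum(counts.values())} in group {majority}")

# Knowing only the postal code predicts group membership 75% of the time,
# so a model trained "without" the protected attribute can still encode it.
```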
Second, protecting such privacy interests in the context of AI will require a change in the paradigm of privacy regulation. Most existing privacy laws are rooted in a model of consumer choice based on “notice-and-choice” (also referred to as “notice-and-consent”). Consumers encounter this approach in the barrage of notifications and banners linked to lengthy and uninformative privacy policies and terms and conditions that we ostensibly consent to but seldom read. This charade of consent has made it obvious that notice-and-choice has become meaningless. For many AI applications—smart traffic signals and other sensors needed to support self-driving cars as one prominent example—it will become utterly impossible.
A model focused on data collection and processing may affect AI and algorithmic discrimination in several ways. One, data stewardship requirements, such as duties of fairness or loyalty, could militate against uses of personal information that are adverse or unfair to the individuals the data relates to. Two, transparency or disclosure rules, as well as individuals’ right to access information relating to them, could illuminate uses of algorithmic decision-making.
Three, data governance rules that prescribe the appointment of privacy officers, conduct of privacy impact assessments, or product planning through “privacy by design” may surface issues concerning use of algorithms. Four, rules on data collection and sharing could reduce the aggregation of data that enables inferences and predictions, but may involve some trade-offs with the benefits of large and diverse datasets. In addition to these provisions of general applicability that may affect algorithmic decisions indirectly, a number of proposals specifically address the subject.
As the power of AI increases, so will the amount and type of data collected and used. Where current AI technology may identify diseases in an MRI scan, future AI technology may be able to identify the individual behind the scan as well. Notably, patient health records have already been at issue in data privacy fines for lax protection. Innovative uses of AI technology may likewise create data privacy compliance issues. For example, the Spanish soccer league, LaLiga, was recently fined for its use of location data and speech recognition technology aimed at preventing piracy.
Furthermore, once data becomes subject to data privacy laws, the burden of compliance will fall on businesses, which must accurately map, monitor, and understand how data is collected, used, shared, and retained. In order to maintain customer trust in AI, comply with the myriad privacy and data protection regulations, and add value for customers through the collection and use of their data, businesses should consider the following:
Taking an inventory of AI practices
Businesses should understand what, where, and how AI is employed. For example, does the customer service team use a chatbot powered by AI? Does the finance team use AI for fraud detection? Does the security team use AI to predict and manage incidents? Documenting these AI practices, and considering whether to perform Privacy Impact Assessments or other compliance reviews to confirm that adequate controls are in place and existing governing policies are met, is the first step toward compliance.
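As a rough illustration, an inventory entry might capture the system, its owner, its purpose, the categories of personal data it touches, and whether a Privacy Impact Assessment has been completed. The Python sketch below is one possible shape for such a record; the fields and example entries are illustrative assumptions, not a prescribed schema:

```python
# A minimal sketch of an AI-practices inventory. The fields and example
# entries are illustrative assumptions, not a prescribed schema.
from dataclasses import dataclass

@dataclass
class AISystemRecord:
    name: str                 # e.g., "Customer service chatbot"
    owner_team: str           # team accountable for the system
    purpose: str              # why the system exists
    personal_data: list       # categories of personal data processed
    pia_completed: bool       # has a Privacy Impact Assessment been done?

inventory = [
    AISystemRecord("Support chatbot", "Customer Service",
                   "Answer routine customer questions",
                   ["chat transcripts", "account IDs"], pia_completed=True),
    AISystemRecord("Fraud detection model", "Finance",
                   "Flag anomalous transactions",
                   ["transaction history", "device IDs"], pia_completed=False),
]

# Flag systems that process personal data but lack a completed PIA.
for record in inventory:
    if record.personal_data and not record.pia_completed:
        print(f"Review needed: {record.name} ({record.owner_team})")
```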
Businesses should also understand the motivation for deploying AI. For example, who are the stakeholders developing the business strategies for AI? How does AI add value to service and product offerings, and how is that value communicated to consumers? How does the business’s AI strategy align with company culture and with public-facing statements such as financial disclosures, Privacy Statements, and remarks by its executive team? Notably, even tech giant Google was fined for lack of transparency with regard to its personalized advertising.
With an understanding of where AI is used and of the business strategy behind it, businesses should consider leveraging existing governing policies for privacy, security, and confidentiality. Several data protection principles should also be observed when developing AI: fairness of processing, purpose limitation, data minimization, and transparency/right to information. For example, data privacy regulations around the world commonly give consumers the right to access collected data, to understand how it is used, to restrict how it is processed, and to have it removed on request. Privacy laws also raise challenges with respect to consent, including the scope of use of personal data for ongoing and future training of algorithms and for product development and validation.
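As a simplified illustration of those consumer rights, the sketch below routes access, restriction, and deletion requests against an in-memory store; the store and handler names are hypothetical, and a real implementation would also need authentication, audit logging, and propagation to backups and processors:

```python
# A hedged sketch of routing the data-subject rights common to many
# privacy regimes. The store and handler names are hypothetical.

user_store = {
    "user-42": {"email": "a@example.com", "location_history": ["..."],
                "processing_restricted": False},
}

def handle_request(user_id, request_type):
    record = user_store.get(user_id)
    if record is None:
        return "no data held"
    if request_type == "access":          # right to access collected data
        return dict(record)
    if request_type == "restrict":        # right to restrict processing
        record["processing_restricted"] = True
        return "processing restricted"
    if request_type == "delete":          # right to removal on request
        del user_store[user_id]
        return "data deleted"
    raise ValueError(f"unknown request type: {request_type}")

print(handle_request("user-42", "access"))
print(handle_request("user-42", "delete"))
```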
Considering ethics and bias
Businesses may wish to publish AI or Data Usage Ethics Principles, as many companies (e.g., Microsoft) have done. These principles often go beyond what is legally compliant and impose obligations grounded in ethics: what is the right thing to do with data, and what do customers expect? In addition, policymakers have expressed increasing concern about the implications of AI for fairness, bias, and discrimination. Businesses are well advised to confirm that their policies and processes include controls for rooting out bias and discrimination in their algorithms; one common such control is sketched below.
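One widely used check is to measure whether an algorithm’s favourable outcomes are distributed evenly across groups, often called demographic parity. The Python sketch below computes the gap on invented decisions; the 20-percent threshold is an illustrative choice, not a legal standard:

```python
# Illustrative bias check: compare an algorithm's favourable-outcome rate
# across groups (demographic parity). Outcomes and groups are invented.

decisions = [
    # (group, approved)
    ("group_a", True), ("group_a", True), ("group_a", False), ("group_a", True),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

rates = {}
for group in {g for g, _ in decisions}:
    outcomes = [approved for g, approved in decisions if g == group]
    rates[group] = sum(outcomes) / len(outcomes)

for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} approved")

gap = max(rates.values()) - min(rates.values())
print(f"Demographic parity gap: {gap:.0%}")  # 75% - 25% = 50 points
if gap > 0.2:  # threshold is an illustrative choice, not a legal standard
    print("Large gap: investigate features and training data for bias.")
```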
While consumers value the privacy and security of their personal data, the potential of AI innovation to improve products and services is undeniable. Businesses cannot ignore novel implementations of AI technology because of data privacy concerns, but neither can they ignore data privacy issues in the pursuit of AI innovation. Similar to Intellectual Property law, where AI has unsettled traditional concepts such as inventorship and protectability, and Competition law, where AI has raised fears about anti-competitive practices such as collusion and parallel pricing, businesses must understand that complying with data privacy laws when introducing a new AI feature is as much a part of the AI innovation as the feature itself.