Any discussion of privacy issues concerning AI quickly starts to read like an exam question in which potentially every privacy requirement worldwide comes into play. In many ways, this rapidly developing technology puts privacy regulation to the test, and may even call into question whether that regulation is fit for the job. How do those looking to develop and use AI in the media and entertainment sector even begin to navigate issues such as data minimization when data sets have grown exponentially, and how does a user even know whether personal data is involved? What about the practical issues of informing data subjects, in a way that is understandable, about how their personal data is used, and of implementing rights management practices at scale when even identifying where personal data sits may have become opaque? In this chapter, we look at some of these key issues.
Show me the data: Ways in which personal data is processed in AI
Like other industries, media and entertainment companies must pay close attention to how they use and protect personal and sensitive data. Globally, companies must navigate an increasingly complex regulatory landscape with respect to consumer personal and sensitive data. Personal data refers to information that can be used to identify an individual, such as a name, address or date of birth. Sensitive data is a subset of personal data that is considered particularly sensitive or private, such as medical records, financial information, race, religion or ethnicity.
In the context of AI, media and entertainment companies rely heavily on personal and sensitive data to personalize content recommendations, to target advertising, to create interactive content and to develop new products and services. These companies engage in complex data-gathering efforts to provide services accessible across multiple platforms (e.g., mobile, web, television and gaming consoles) and across a vast range of jurisdictions. They leverage algorithms to analyze consumer viewing habits and preferences to make targeted decisions about content. For these reasons, these companies must pay attention to data privacy regulations governing personal data.
In the United States, one of the key legal frameworks governing personal data is the California Consumer Privacy Act (CCPA), which went into effect on Jan. 1, 2020. The law applies to a company that meets at least one of the following thresholds: (1) it has annual gross revenue of more than $25 million; (2) it buys, receives or sells the personal data of 50,000 or more California residents, households or devices each year; or (3) it earns half or more of its annual revenue from selling California residents’ personal information. Companies can continue to collect consumer data but are required under the CCPA to disclose the personal information they process and with whom it is shared. Under the CCPA, personal information is broadly defined as “information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household.” This can include commercial, electronic, behavioral, biometric, financial and educational information. For example, companies that stream podcasts would be required to disclose the mobile phone numbers they use to identify listeners.
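The three thresholds described above amount to a simple "any one of" test, which can be sketched in code. The following is a minimal illustration only, not legal advice: the class and field names are hypothetical, and the figures are those quoted in the text, held as parameters because statutory thresholds can be amended.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CcpaThresholds:
    """Threshold figures as quoted in the text (assumed; adjust if the law changes)."""
    min_gross_revenue_usd: float = 25_000_000
    min_california_records: int = 50_000
    min_revenue_share_from_sales: float = 0.5


@dataclass(frozen=True)
class CompanyProfile:
    """Hypothetical inputs; the field names are illustrative only."""
    annual_gross_revenue_usd: float
    california_records_per_year: int      # residents, households or devices
    revenue_share_from_selling_pi: float  # fraction between 0 and 1


def thresholds_met(company, t=CcpaThresholds()):
    """Return the names of the thresholds the company meets (empty list if none)."""
    met = []
    # "greater than $25 million" is a strict comparison
    if company.annual_gross_revenue_usd > t.min_gross_revenue_usd:
        met.append("revenue")
    # "50,000 or more" and "half or more" are inclusive comparisons
    if company.california_records_per_year >= t.min_california_records:
        met.append("volume")
    if company.revenue_share_from_selling_pi >= t.min_revenue_share_from_sales:
        met.append("sales_share")
    return met


# A hypothetical podcast app: modest revenue, but a large California audience.
podcast_app = CompanyProfile(8_000_000, 120_000, 0.1)
print(thresholds_met(podcast_app))  # ['volume']
```

Because meeting any single threshold is enough, a non-empty result suggests the company may be in scope. The actual analysis depends on statutory definitions and exemptions that this sketch does not model.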
One of the most influential legal frameworks governing the use of personal data is the EU’s General Data Protection Regulation, together with the UK GDPR (collectively, GDPR). The GDPR applies to companies based in the EU/UK that process personal data, including special category data (the GDPR’s term for sensitive data), but it also has extraterritorial effect: for example, it can apply to a media company that is based in the United States but provides goods or services to, or monitors, individuals located in the EU/UK. Therefore, any company targeting an audience in the EU or UK for goods and services, or monitoring the behavior of individuals, including through geolocation data and online identifiers (which is likely to be a core function of AI data mining), must comply with the GDPR. The GDPR has spawned similar laws in many countries around the world, and the issues they raise will be similar. An AI provider’s role is likely to be that of a controller because of the discretion used in formulating the AI algorithms. The only way to fall outside the GDPR is to anonymize data, a process that is itself fraught, given the difficulty of doing so and the question of whether the resulting data would have any utility. Those using AI technologies and outputs may also be acting as data controllers.
To comply with these regulations, throughout the AI life cycle from development to deployment, companies must consider how to address their use of personal and sensitive information. Various considerations are discussed in the sections that follow, such as the lawful basis for processing personal data, transparency, data minimization and data integrity or accuracy. Other privacy/data protection by design considerations include:
- Design and governance. During the design phase, determine what personal data will be required and how it will be used. Document how personal and sensitive data will flow through and out of the AI system. Many privacy laws, such as the GDPR, require accountability and governance measures that show the steps taken to protect personal data. This may require records of processing, policies and assessments. If other companies are involved, appropriate contracts may need to be put in place.
- Data rights. Consider how individual rights requests will be addressed. When designing an AI framework, consider how to index personal data so that it can be retrieved when a request is received. Functionality should exist at the outset that enables the company to respond to requests from consumers.
- Training and testing. To guard against unintended scope creep caused by a learning algorithm, conduct training and testing. Evaluate whether there are any changes to the purpose for which the data has been collected and confirm that any new purposes are lawful. If there are any changes in purpose from what was originally disclosed, update privacy information provided to consumers and consider whether additional consent must be obtained.
- Security. Personal data must be kept secure. This is another reason not to use personal data sets as inputs without first thinking through how that personal data will then be held in systems and to whom it may become accessible.
- Transfers. Many privacy laws around the world include restrictions on data transfers. Consider how to comply with them, thinking through where personal data is input and how and where outputs can be used.
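The "Data rights" consideration above, indexing personal data so that it can be retrieved when a request is received, can be sketched as a small lookup structure. This is a minimal in-memory illustration under stated assumptions: the class names, system names and identifiers are all hypothetical, and a production system would need persistent storage, access controls and a way to handle data already absorbed into trained models.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordLocation:
    """Where one piece of a person's data lives (all values are illustrative)."""
    system: str     # e.g. "recommendation_training_set"
    record_id: str  # identifier of the record inside that system
    category: str   # e.g. "viewing_history"


def _sort_key(location):
    return (location.system, location.record_id)


class PersonalDataIndex:
    """Maps a data subject to every known location of their personal data."""

    def __init__(self):
        self._by_subject = defaultdict(set)

    def register(self, subject_id, location):
        """Record a location at the moment personal data enters a system."""
        self._by_subject[subject_id].add(location)

    def locate(self, subject_id):
        """Access request: list every known location, in a stable order."""
        return sorted(self._by_subject.get(subject_id, set()), key=_sort_key)

    def forget(self, subject_id):
        """Erasure request: drop the index entry and return the locations
        that the underlying systems still need to delete from."""
        return sorted(self._by_subject.pop(subject_id, set()), key=_sort_key)


index = PersonalDataIndex()
index.register("user-123", RecordLocation("recommendation_training_set", "row-9", "viewing_history"))
index.register("user-123", RecordLocation("ad_targeting_profiles", "p-4", "online_identifier"))

for loc in index.locate("user-123"):
    print(loc.system, loc.record_id, loc.category)
```

Registering locations as data is ingested, rather than searching for them after a request arrives, is what makes it feasible to respond at scale. The index itself contains identifiers, so it should be protected like any other personal data.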
- Privacy laws around the world contain core principles and requirements (such as data minimization) that present new challenges for those using and developing AI
- At the same time, variations in approach and detail, specifically as regards consent and the use of publicly available data, mean that it is an increasingly complex landscape to navigate
- Privacy regulators are beginning to flex their muscles with enforcement action and new guidance on AI, so such issues can no longer be ignored