Using AI in HR and recruitment – data protection issues

In an employment and HR context, AI is increasingly used to automate the more administrative tasks, freeing up HR teams (and recruiters) to focus on the human-input elements and leading to cost and time savings. In recruitment, AI is commonly used to screen CVs, establish candidates' interest, screen (or profile) social media, and run interviews and personality tests. Wider HR uses include automating benefits management and common requests, processing performance data, automating parts of the onboarding process, agile learning and workplace data analysis.

The UK data protection framework

Using AI in recruitment inevitably involves processing personal data and will be subject to data protection law, both at an EU level (the GDPR) and at a Member State level (in the UK, the Data Protection Act 2018 – the DPA18). As governments and regulators around the world struggle to promote and regulate the use of ethical AI, the UK's ICO has focused on the practical. In addition to general guidance on aspects of the GDPR, it has published a draft AI auditing framework for consultation and, together with the Alan Turing Institute, guidance on explaining decisions made with AI. Running to more than a combined 200 pages, the guidance looks initially daunting but is comprehensive and accessible.

While not specifically targeted for use in an employment context, both sets of guidance will be extremely helpful to HR teams and recruiters considering using AI and often use HR examples by way of illustration. They are designed to be read in tandem and are aimed at two broad audiences – those with a compliance focus, and technology specialists. This includes data protection officers (DPOs), risk and cybersecurity managers, legal counsel, developers and engineers. The issues covered in the guidance are extensive – essentially, most of the GDPR will apply, but there are a few which employers and recruiters will find particularly interesting.

Starting as you mean to go on

As the ICO says, using AI to process personal data is likely to result in a high risk to the rights and freedoms of individuals, which means a Data Protection Impact Assessment (DPIA) is legally required.

A properly considered DPIA is fundamental to meeting accountability requirements under the GDPR. It also helps embed privacy by design and default into technology and processing operations, including by assessing appropriate cybersecurity – a particular issue for AI models. A DPIA for the planned use of AI must cover all the GDPR requirements but should also include an explanation of any margins of error which might affect the accuracy of the processing, the effect any decisions automated by the AI might have on individuals, and the degree of any human involvement or intervention in the decision-making process.

Where you're buying a third-party AI solution or outsourcing it, you will need to carry out due diligence to ensure the product is GDPR-compliant and this will also need to be documented and assessed as part of the DPIA.

Lawfulness, fairness and transparency

HR departments and recruiters should be familiar with the need to have a lawful basis for each processing operation. As with other areas, you need to be wary about relying on consent, which is unlikely to be valid in an employment context, although it may be less problematic when relied on by recruitment consultants (see our article). This is particularly true in relation to the use of AI, which will involve a high degree of risk. You also need to be careful not to prioritise the business's legitimate interests over the rights and freedoms of the individuals whose data is being processed. If the system is processing special category (sensitive) personal data, you will also need to satisfy an Article 9 GDPR condition, and this is in addition to any requirements under the Equality Act 2010.

Processing personal data using AI to infer data about individuals must be fair. The system needs to be sufficiently accurate and avoid discrimination (see below), and the impact on the reasonable expectations of the individuals must be considered.

Transparency is particularly important if you are going to use AI in a way which impacts actual and prospective employees or workers. This is partly because the technology is new, so people won't have assumptions about how it works and the impact it has, and partly because it's hard for most of us to understand exactly how it works. In fact, the degree of difficulty in being genuinely transparent about the use of AI is demonstrated by the fact that the ICO/Alan Turing Institute guidance on explaining AI is 136 pages long.

Transparency is not just about being able to communicate what you are doing to the individuals whose data is being processed, although that is certainly a major factor. Employees using the AI also need to understand what it does and how to use it responsibly and lawfully. This means, as the ICO guidance makes clear, there will be layers of different types of information provided to different stakeholders explaining the concepts of the AI, the decisions it takes, as well as creating policies and procedures to inform and protect affected individuals.

Getting it right

Processing operations involved in AI products are complex. It can be hard to work out how to comply with data subject rights, which may apply to personal data in the training data, to data used to make a prediction during deployment, to the result of the prediction itself, and to data used or contained in the model.

Even though training data may be stripped of unique identifiers, the GDPR will still apply where it can be used to single out an individual in combination with other data you may process. Where you have bought a third-party product without having input into the overarching decisions the system makes, you are unlikely to be the data controller of the original training data, but you will probably be the controller of any additional data you input to train the system.

It may become more complex to give effect to data subject rights when using AI. By way of example, the right to data portability will apply to data provided by individuals, including where it is used as training data for the model. However, where training data is significantly altered, it may no longer count as 'provided' by the individual and the right to data portability may no longer apply even though the data remains personal data. Whatever the complexity though, data subject rights must be respected.

How much data do you really need?

The GDPR requires that personal data be collected for specified purposes and only to the extent required for those purposes. These principles of purpose limitation and data minimisation can be particularly hard to comply with given that AI can process and analyse unprecedented amounts of data in record time. There are particular risks if you are buying a third-party product which may have been designed to collect far more data than you actually need. The ICO guidance cites CV screening products as an example. It suggests that purchasers negotiate the right to modify the system if they cannot justify processing the data and, preferably, use feature selection methods when the system is trained to ensure that they are only processing the data needed for their purposes.
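One way to picture data minimisation before training is an allow-list applied to each CV record, so that only fields you can justify are ever passed to the model. The sketch below is a minimal illustration under stated assumptions: the field names and the allow-list are hypothetical, and a simple allow-list is a stand-in for the statistical feature selection methods the guidance refers to, not an implementation of them.

```python
# Hypothetical allow-list: the fields an employer has justified as necessary
# for shortlisting. Every field name here is illustrative only.
ALLOWED_FIELDS = {"years_experience", "qualifications", "skills"}


def minimise(record, allowed_fields=ALLOWED_FIELDS):
    """Return a copy of the record containing only allow-listed fields."""
    return {key: value for key, value in record.items() if key in allowed_fields}


def dropped_fields(record, allowed_fields=ALLOWED_FIELDS):
    """List the fields that would be discarded, e.g. to note in a DPIA."""
    return sorted(set(record) - set(allowed_fields))


# Made-up example record, as a third-party tool might collect it.
cv_record = {
    "name": "A. Candidate",
    "date_of_birth": "1990-01-01",
    "home_address": "1 Example Street",
    "years_experience": 6,
    "qualifications": ["BSc"],
    "skills": ["payroll", "employment law"],
}

training_row = minimise(cv_record)
print(training_row)
print(dropped_fields(cv_record))
```

Keeping the allow-list in one place makes it easier to show, and to document, which data the system is permitted to use and why.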

Machine learning models usually require the full set of predictor variables for a person to be included in a query. Again, there are steps which can be taken at the inference stage to minimise the data, including converting personal data into less human-readable formats, making inferences locally and introducing privacy-preserving queries.
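As a rough illustration of the first of those steps, the sketch below turns a candidate record into a numeric feature vector and replaces the candidate identifier with a keyed hash before the query leaves your systems. It is one possible reading of "less human-readable formats", not a method taken from the guidance; the field names, the category list and the key are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical category list for one predictor variable.
DEGREE_LEVELS = ["none", "bachelor", "master", "doctorate"]


def pseudonymise(candidate_id, secret_key):
    """Replace an identifier with a keyed SHA-256 hash (HMAC)."""
    return hmac.new(secret_key, candidate_id.encode("utf-8"), hashlib.sha256).hexdigest()


def to_feature_vector(record):
    """Encode the predictor variables as numbers: experience, then one-hot degree."""
    one_hot = [1.0 if record["degree"] == level else 0.0 for level in DEGREE_LEVELS]
    return [float(record["years_experience"])] + one_hot


def build_query(record, secret_key):
    """Build the payload sent to the model: no name, no raw identifier."""
    return {
        "ref": pseudonymise(record["candidate_id"], secret_key),
        "features": to_feature_vector(record),
    }


# Made-up record and key, for illustration only.
candidate = {
    "candidate_id": "C-1001",
    "name": "A. Candidate",
    "years_experience": 5,
    "degree": "master",
}
query = build_query(candidate, b"demo-key")
print(query["features"])
```

Bear in mind that data pseudonymised in this way remains personal data under the GDPR, because whoever holds the key and the original records can re-identify the individual. The step reduces exposure; it does not take the processing outside data protection law.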

Who or what makes the decisions?

The GDPR has rules around solely automated decision-making (including profiling), with legal or similarly significant effects. What counts as "similarly significant" is undefined but the ICO says in its Guide to the GDPR that it would include e-recruiting practices without human intervention. Again this is not altogether specific. It is conceivable that a CV screening tool would be used without human intervention to select candidates to go forward to the next stage but the next stage might well involve human decision making; so the solely automated part of the process is limited but it could nonetheless lead to a candidate's rejection for a role.

AI which results in solely automated decisions with legal or similarly significant effects on individuals can only be used where the decision:

Is necessary for entering into or performance of a contract between an organisation and the affected individual
Is authorised by law (for example, to prevent fraud or tax evasion)
Is based on the individual's explicit consent – problematic in an employment context.

There are additional requirements in terms of information which needs to be given to individuals and efforts which need to be taken to eliminate bias, discrimination and inaccuracy. Individuals must be able to challenge decisions and obtain human intervention.
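One possible design response to the CV screening scenario described above is to let the tool advance candidates on its own but never reject one without a person looking at the application. The sketch below assumes a hypothetical screening score between 0 and 1 and a hypothetical threshold; it shows the routing logic only and says nothing about whether the human review is meaningful in practice, which is a separate question.

```python
from dataclasses import dataclass


@dataclass
class ScreeningResult:
    candidate_ref: str
    score: float  # hypothetical model output between 0 and 1


def route(result, advance_threshold=0.7):
    """Decide the next step for a screened candidate.

    The tool may advance a candidate automatically, but a low score is
    never treated as a rejection: it is passed to a human reviewer.
    """
    if result.score >= advance_threshold:
        return "advance"
    return "human_review"


# Made-up results, for illustration only.
results = [ScreeningResult("ref-1", 0.9), ScreeningResult("ref-2", 0.2)]
for item in results:
    print(item.candidate_ref, route(item))
```

Under this design, the only outcome the system can reach without a person is a favourable one, which narrows the solely automated part of the process.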

How fair is your AI?

One of the much-vaunted benefits of using AI to recruit is the claim that it eliminates actual or unconscious bias and helps promote equality and diversity. This is a claim to be treated with considerable caution. It can certainly be true, but machine learning AI is only as good as the data it is trained on. Imbalanced training data or data which reflects past discrimination can lead to unfair results. For example, a CV screening tool trained on past recruitment in an area which has historically had no women in senior management roles could inadvertently produce unfair outcomes given the previous lack of diversity. Businesses need to be alive to the risk of bias or unfair outcomes when using AI and carry out appropriate due diligence to ensure their models do not produce the opposite outcomes to those intended.

Whether you are building an AI model or using a third-party one, you may need to modify training data, change the learning process or modify the model after training using mathematical 'fairness' measures against which to measure the results. Many of these will involve using datasets containing personal data of representative samples of the population. For each individual, labels for protected characteristics of interest, including those outlined in the Equality Act 2010, are needed. Some of these may also be classed as special category data under the GDPR, which means an Article 9 condition needs to be identified as well as an Article 6 lawful basis.
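To give a sense of what a mathematical fairness measure looks like, the sketch below compares shortlisting rates between groups, a measure often called demographic parity. It is one commonly used measure among several, chosen here for simplicity rather than because the guidance prescribes it; the group labels and outcomes are invented for illustration.

```python
from collections import defaultdict


def selection_rates(outcomes):
    """Compute the share of candidates shortlisted in each group.

    `outcomes` is an iterable of (group_label, was_shortlisted) pairs.
    """
    totals = defaultdict(int)
    selected = defaultdict(int)
    for group, shortlisted in outcomes:
        totals[group] += 1
        if shortlisted:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


def parity_ratio(rates):
    """Lowest group rate divided by the highest; 1.0 means equal rates."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was shortlisted in any group
    return min(rates.values()) / highest


# Invented outcomes: group A has 4 of 8 shortlisted, group B has 1 of 4.
outcomes = (
    [("A", True)] * 4 + [("A", False)] * 4
    + [("B", True)] * 1 + [("B", False)] * 3
)
rates = selection_rates(outcomes)
print(rates, parity_ratio(rates))
```

A ratio well below 1.0 does not prove discrimination on its own, but it is the kind of signal that should prompt a closer look at the training data and the model.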

Trying to eliminate bias from your AI can be easier if you include rather than ignore certain protected characteristics. The ICO gives the example of an AI system used to sort job applicants where it might be more effective to include someone's disability status to ensure they are not discriminated against. Not doing so might inadvertently lead the system to discriminate against someone with a disability because it does not factor in the effect of their condition on other features used to make a prediction. The ICO says that this approach amounts to making decisions about individuals in a solely automated way with significant effects using special category data and is prohibited under the GDPR unless you have explicit consent or can meet one of the substantial public interest conditions in Schedule 1 of the DPA18.

It's worth noting that even if you don't use protected characteristics or special category data in your model, the model may detect patterns of discrimination based on those protected characteristics and reproduce them in outputs. Some of the protected characteristics may also be special category data.

The ICO recommends putting in place appropriate procedures and policies at the design or procurement stage and throughout the lifecycle of the AI to mitigate the risk of bias and discrimination. The approach needs to be signed off by senior management.

Making the trade

Choosing between different risk-management approaches will involve making 'trade-offs' at a technical level which will need to be explainable to non-technical staff. Technical staff will, conversely, need to understand the requirements and priorities of the business. Tensions between the two as well as between the competing demands of data protection and statistical accuracy will need to be addressed.

Generally, the more data an AI model is trained on, the more statistically accurate it is likely to be, so there is an immediate trade-off between statistical accuracy and the data minimisation principle. Avoiding bias and discrimination may also call for more training data rather than less. There is also a potential trade-off between explainability and statistical accuracy – the more complex the system, the harder it is to explain, creating a tension with the transparency principle. It's important to understand and mitigate these trade-offs and to decide on a policy for managing them.

Appropriate governance

The ICO's auditing framework underlines that accountability rests with senior management and DPOs. They are responsible for understanding and addressing complexities associated with using AI and cannot delegate the issues to data scientists or engineering teams.

This makes it important for businesses to establish a chain of stakeholders which goes up to board level. This might involve the DPO, general counsel, the CTO and, of course, the head of HR. In addition, businesses should consider setting up a wider oversight mechanism to help ensure AI is used ethically. This would include consideration of privacy and other legal issues, but also of the wider ethical considerations of using AI in the context of HR and recruitment.

GDPR compliance when using AI for HR or recruitment is complex and we've only touched on a few issues here. Our experts can advise in detail on this area.

Global Data Hub 10 July 2020

Consent and personal data in an employment setting

Elaine Fletcher and Mary Rendle look at the difficulties with using consent as a lawful basis for processing HR data and consider the alternatives.

Global Data Hub 10 July 2020

Employee monitoring in the context of COVID-19

Sally Annereau looks at issues to consider when contemplating return to work or remote work monitoring.

Global Data Hub 10 July 2020

Processing employee fingerprint data

Debbie Heywood looks at processing employee biometric data in light of a recent fine imposed by the Dutch Data Protection Authority.

Global Data Hub 10 July 2020

Protecting employee data – what did we learn from the Morrisons decision?

We look at how to manage the risk of an employee data breach (particularly while working from home) in the context of the Morrisons Supreme Court decision.

Global Data Hub 8 July 2020

Returning to work post COVID-19: Employers' top 10 data and employment questions

Helen Farr and Sally Annereau answer the big questions UK employers have about returning to the workplace during the COVID-19 pandemic.
