Businesses are failing to make machine learning fair and safe
Almost nine in ten businesses deploying machine learning (ML) technologies have not yet considered important questions that will affect data quality, consumer privacy and, ultimately, the quality of ML applications, according to recent research from O’Reilly Media. The research, conducted among 2,000 senior business leaders in the EU, found that 86 per cent of businesses fail to account for all four of compliance, privacy, fairness and bias in their model-building checklist.
O’Reilly Media warns that businesses that do not take these factors into account will end up developing flawed, biased and unethical applications that will not only fail to deliver effective results but will also put people’s privacy at risk.
According to the research, over half of EU businesses (55 per cent) have not included privacy provisions in their model-building checklist; 53 per cent do not take account of compliance, while 62 per cent do not include fairness and bias. Only one in seven businesses (14 per cent) included all four elements – compliance, privacy, fairness and bias – in their model-building checklist.
In the context of machine learning, fairness describes efforts to prevent discrimination based on sensitive characteristics. A model that does not account for fairness will often behave very differently across demographic groups. Bias can creep into models when organisations make assumptions about different groups (such as those based on demographics). The machine learning research community has settled on a few strategies for improving fairness, but each has its limitations. Ultimately, it is not possible to rely on a single strategy and hope that it addresses the problem; instead, a data scientist will need to work closely with a subject expert to create specific tests for fairness.
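As a rough illustration of what one such fairness test might look like, the Python sketch below compares a model's accuracy and positive-prediction rate across demographic groups and reports the largest gap. It is not drawn from the O’Reilly research: the function names (`group_rates`, `parity_gap`), the toy data and the choice of metrics are all assumptions made for this example, and a real test would be designed with a subject expert.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return accuracy and positive-prediction rate for each group.

    Hypothetical helper for illustration only.
    """
    counts = defaultdict(lambda: {"n": 0, "correct": 0, "positive": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        stats = counts[group]
        stats["n"] += 1
        stats["correct"] += int(truth == pred)
        stats["positive"] += int(pred == 1)
    return {
        group: {
            "accuracy": stats["correct"] / stats["n"],
            "positive_rate": stats["positive"] / stats["n"],
        }
        for group, stats in counts.items()
    }


def parity_gap(rates, key="positive_rate"):
    """Largest difference in the chosen metric between any two groups."""
    values = [metrics[key] for metrics in rates.values()]
    return max(values) - min(values)


# Toy data, invented for this example: 1 = positive outcome, 0 = negative.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = group_rates(y_true, y_pred, groups)
gap = parity_gap(rates)
accuracy_gap = parity_gap(rates, key="accuracy")

print(rates)
print("positive-rate gap:", gap)
print("accuracy gap:", accuracy_gap)
```

A large gap on either measure is only a prompt for further investigation. What counts as an acceptable difference, and which metric matters, depends on the application.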
“There is much more to machine learning than just optimising your business metrics,” said Ben Lorica, Chief Data Scientist at O’Reilly Media and AI London Conference chair. “It’s critical that those developing these transformational applications understand the power they’re harnessing, and how small errors or omissions can lead to major problems down the line.
“Too often, the task of developing ML models falls to data scientists, with no oversight from lawyers, compliance and privacy experts,” he continued. “Since the introduction of the GDPR, businesses should be on heightened alert for anything that could compromise consumer privacy. Yet, over half of machine learning projects still fail to take this into account. This is simply storing up trouble for the future.
“Meanwhile, other failings such as bias and fairness will mean that organisations won’t get full value from their ML investment – and could even end up with applications that are fundamentally inaccurate and therefore less than useless.”
O’Reilly Media is urging businesses taking their first steps into machine learning – as well as those with more experience of the technology – to ensure that they account for all four factors when building their models. Furthermore, every organisation should involve compliance leads, legal specialists and Chief Data Officers in its model development, and develop shared processes and terminology so that everyone can communicate effectively.
“It’s easy for errors to creep in, especially when a project is overseen by machine learning engineers and no-one else,” said Lorica. “It’s vital that businesses have multiple ‘lines of defence’ consisting of different reviewers who conduct periodic reviews across every phase of development. Alarmingly, though, only 16 per cent of the businesses we polled had a Chief Data Officer, which does not bode well for the accuracy of new machine learning applications, or for consumer privacy.
“The problem with any new technology is that developers and engineers are often focused on its potential for good, rather than worrying about dangers such as privacy. To maintain public trust in these technologies, it’s critical that we address these problems before machine learning applications come online,” concluded Lorica.