An upcoming report from the Algorithmic Justice League (AJL), a private nonprofit organization, recommends requiring disclosure when an AI model is used and creating a public repository of incidents where AI has caused harm. The repository could help auditors spot potential problems with algorithms and help regulators investigate or fine repeat offenders. AJL co-founder Joy Buolamwini co-authored an influential 2018 audit that found that face recognition algorithms work best on white men and worst on dark-skinned women.
The report says it is crucial that auditors be independent and that results be publicly available. Without these safeguards, “there is no accountability mechanism at all,” says AJL research director Sasha Costanza-Chock. “If they want to, they can just bury it; if a problem exists, there is no guarantee that it will be resolved. It’s toothless, it’s a secret, and the auditors have no leverage.”
Deb Raji, a fellow at AJL who evaluates audits, participated in the 2018 audit of face recognition algorithms. She warns that Big Tech companies appear to be taking a more adversarial approach to external auditors, sometimes threatening lawsuits on privacy or anti-hacking grounds. In August, Facebook prevented NYU academics from monitoring political advertising spending and thwarted a German researcher’s efforts to investigate the Instagram algorithm.
Raji calls for establishing an audit oversight board within a federal agency to do things like enforce standards or mediate disputes between auditors and companies. Such a board could be modeled on the Financial Accounting Standards Board or on the Food and Drug Administration’s standards for evaluating medical devices.
Standards for audits and auditors are important because growing calls to regulate AI have led to the creation of a number of auditing startups, some founded by critics of AI and others that may be more favorable to the companies they audit. In 2019, a coalition of AI researchers from 30 organizations recommended external audits and regulation that creates a marketplace for auditors, as part of building AI that people trust, with verifiable results.
Cathy O’Neil started a company, O’Neil Risk Consulting & Algorithmic Auditing (Orcaa), in part to assess AI that is invisible or inaccessible to the public. For example, Orcaa works with the attorneys general of four U.S. states to evaluate financial or consumer product algorithms. But O’Neil says she loses potential customers because companies want to maintain plausible deniability and don’t want to know if or how their AI is hurting people.
Earlier this year, Orcaa conducted an audit of an algorithm that HireVue used to analyze people’s faces during job interviews. A press release from the company claimed that the audit found no issues with accuracy or bias, but the audit made no attempt to assess the system’s code, training data, or performance for different groups of people. Critics said HireVue’s characterization of the audit was misleading and disingenuous. Shortly before the release of the audit, HireVue said it would stop using AI in video job interviews.
O’Neil believes that audits can be useful, but she says that in some respects it is too early to take the approach prescribed by AJL, in part because there are no standards for audits and we do not fully understand the ways in which AI harms people. Instead, O’Neil prefers another approach: algorithmic impact assessments.
While an audit may evaluate the output of an AI model to see if, for example, it treats men differently than women, an impact assessment may focus more on how an algorithm was designed, who could be harmed, and who is responsible if things go wrong. In Canada, companies must assess the risk to individuals and communities of deploying an algorithm; in the United States, assessments are being developed to determine when AI is low or high risk, and to quantify how much people trust AI.
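The output-focused kind of audit described above can be illustrated with a minimal sketch. It assumes the auditor has a list of model decisions labeled by group; the function names, the sample data, and the choice of a lowest-to-highest selection-rate ratio as the comparison are all illustrative assumptions, not a method attributed to AJL, Orcaa, or anyone else in this article.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the share of favorable outcomes for each group.

    `records` is an iterable of (group, outcome) pairs, where outcome is
    True when the model's decision was favorable to the person.
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        if outcome:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Ratio of the lowest to the highest group rate (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favorable outcome
    return min(rates.values()) / highest


# Made-up decisions for illustration only: 10 per group.
decisions = (
    [("men", True)] * 6 + [("men", False)] * 4
    + [("women", True)] * 3 + [("women", False)] * 7
)

rates = selection_rates(decisions)
ratio = disparity_ratio(rates)
print(rates)
print(ratio)
```

A ratio well below 1.0 would flag a gap worth investigating, but it says nothing about why the gap exists, which is the kind of question an impact assessment is meant to raise.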
The idea of measuring impact and potential harm began in the 1970s with the National Environmental Policy Act, which led to the creation of environmental impact statements. Those reports take into account factors ranging from pollution to the potential discovery of ancient artifacts; similarly, impact assessments for algorithms would consider a wide range of factors.