Misplaced Faith in Computer Precision

Computers can be fantastically precise, and humans tend to assume that this precision means something.

For example, an automated license plate reader might flag the license plate in front of you as “stolen.” You look at the report, confirm that it matches the plate, and arrest the driver. You may not consider that the report itself might be wrong. Even if the technology works exactly as intended, its output doesn’t necessarily mean what you assume it means.

Joe Posnanski suggests this kind of faith in computer precision may be unfairly impacting athletes as well:

Maybe you heard about the truly insane false-start controversy in track and field? Devon Allen — a wide receiver for the Philadelphia Eagles — was disqualified from the 110-meter hurdles at the World Athletics Championships a few weeks ago for a false start.

Here’s the problem: You can’t see the false start. Nobody can see the false start. By sight, Allen most definitely does not leave before the gun.


Allen’s reaction time was 0.099 seconds, just 1/1000th of a second under the “allowable limit” of 0.1 seconds.

Posnanski writes:

World Athletics has determined that it is not possible for someone to push off the block within a tenth of a second of the gun without false starting. They have science that shows it is beyond human capabilities to react that fast. Of course there are those (I’m among them) who would tell you that’s nonsense, that’s pseudoscience, there’s no way that they can limit human capabilities like that. There is science that shows it is humanly impossible to hit a fastball. There was once science that showed human beings could not run a four-minute mile.

The computer can tell you his reaction time was 0.099 seconds. But it can’t tell you what that means.
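To make that gap concrete, here is a minimal Python sketch of the rule as described above. The function name and the choice of integer milliseconds are illustrative assumptions; this is not World Athletics' actual timing software.

```python
# Hypothetical sketch: a hard threshold applied to a very precise measurement.
# Times are integer milliseconds to avoid floating-point rounding.

ALLOWABLE_LIMIT_MS = 100  # the 0.1-second limit cited above


def judge_start(reaction_time_ms: int, limit_ms: int = ALLOWABLE_LIMIT_MS) -> dict:
    """Apply the rule: any reaction faster than the limit is a false start."""
    return {
        "false_start": reaction_time_ms < limit_ms,
        "margin_ms": limit_ms - reaction_time_ms,
    }


allen = judge_start(99)  # Allen's 0.099-second reaction
print(allen)             # {'false_start': True, 'margin_ms': 1}
```

The output is exact, but nothing in it says whether 100 milliseconds is the right limit. That judgment was made by people and is simply encoded as a constant.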

As we rely more and more on computers to make decisions, especially “artificially intelligent” computers, it will be critical to understand what they are telling us and what they are not.

Bias mitigations for the DALL-E 2 image generation AI

OpenAI has a post explaining the three main techniques it used to “prevent generated images from violating our content policy.”

First, they filtered out violent and sexual images from the training dataset:

[W]e prioritized filtering out all of the bad data over leaving in all of the good data. This is because we can always fine-tune our model with more data later to teach it new things, but it’s much harder to make the model forget something that it has already learned.

Second, they found that the filtering can actually amplify bias because the smaller remaining datasets may be less diverse:

We hypothesize that this particular case of bias amplification comes from two places: first, even if women and men have roughly equal representation in the original dataset, the dataset may be biased toward presenting women in more sexualized contexts; and second, our classifiers themselves may be biased either due to implementation or class definition, despite our efforts to ensure that this was not the case during the data collection and validation phases. Due to both of these effects, our filter may remove more images of women than men, which changes the gender ratio that the model observes in training.

They fix this by re-weighting the training dataset so that the balance of categories in the filtered data matches the balance in the unfiltered data.
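As a rough illustration of the re-weighting idea, the sketch below computes one weight per category so that the weighted filtered data has the same category mix as the unfiltered data. It is a simplified sketch, not OpenAI's implementation: it assumes every image carries an explicit category label, and the function name and toy counts are made up for the example.

```python
from collections import Counter


def rebalancing_weights(unfiltered_labels, filtered_labels):
    """Per-category weights that make the filtered data's category mix
    match the unfiltered data's category mix."""
    before = Counter(unfiltered_labels)
    after = Counter(filtered_labels)
    n_before = len(unfiltered_labels)
    n_after = len(filtered_labels)
    weights = {}
    for category, kept in after.items():
        share_before = before[category] / n_before  # share before filtering
        share_after = kept / n_after                # share after filtering
        weights[category] = share_before / share_after
    return weights


# Toy data (hypothetical): the filter removes more "woman" images than "man" images.
unfiltered = ["woman"] * 50 + ["man"] * 50
filtered = ["woman"] * 30 + ["man"] * 45

weights = rebalancing_weights(unfiltered, filtered)
print(weights)
```

In this toy case each remaining "woman" image gets a weight above 1 and each "man" image a weight below 1, so the two groups contribute equally during training, as they did before filtering.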

Third, they needed to prevent image regurgitation to avoid IP and privacy issues. They found that most regurgitated images (a) were simple vector graphics; and (b) had many near-duplicates in the training set. As a result, these images were easier for the model to memorize. So they de-duplicated images with a clustering algorithm.

To test the effect of deduplication on our models, we trained two models with identical hyperparameters: one on the full dataset, and one on the deduplicated version of the dataset. . . . Surprisingly, we found that human evaluators slightly preferred the model trained on deduplicated data, suggesting that the large amount of redundant images in the dataset was actually hurting performance.
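The sketch below shows the general shape of near-duplicate removal. It is a stand-in under stated assumptions, not the clustering algorithm OpenAI used: it assumes each image has already been turned into a non-zero embedding vector, and it uses a greedy cosine-similarity threshold (an arbitrary 0.95 here) instead of true clustering.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def deduplicate(embeddings, threshold=0.95):
    """Greedy near-duplicate removal: keep an item only if it is not too
    similar to any item already kept. Returns the indices of kept items."""
    kept = []
    for i, vec in enumerate(embeddings):
        if all(cosine_similarity(vec, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy "image embeddings" (hypothetical): the first three are near-duplicates.
embeddings = [
    [1.0, 0.0, 0.0],
    [0.99, 0.01, 0.0],
    [0.98, 0.0, 0.02],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
print(deduplicate(embeddings))  # [0, 3, 4]
```

The greedy loop compares every candidate against every kept item, which is fine for a toy example but would be far too slow for a dataset of hundreds of millions of images. That cost is one reason to cluster first.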

Given the obviously impressive results, this is an instructive set of techniques for AI model bias mitigation.

Facebook settles housing discrimination lawsuit

In 2019, Facebook was sued for housing discrimination because its machine learning advertising algorithm functioned “just like an advertiser who intentionally targets or excludes users based on their protected class.”

They have now settled the lawsuit by agreeing to scrap the algorithm:

Under the settlement, Meta will stop using an advertising tool for housing ads (known as the “Special Ad Audience” tool) which, according to the complaint, relies on a discriminatory algorithm to find users who “look like” other users based on FHA-protected characteristics.  Meta also will develop a new system over the next six months to address racial and other disparities caused by its use of personalization algorithms in its ad delivery system for housing ads.  If the United States concludes that the new system adequately addresses the discriminatory delivery of housing ads, then Meta will implement the system, which will be subject to Department of Justice approval and court oversight.  If the United States concludes that the new system is insufficient to address algorithmic discrimination in the delivery of housing ads, then the settlement agreement will be terminated.

United States Attorney Resolves Groundbreaking Suit Against Meta Platforms, Inc., Formerly Known As Facebook, To Address Discriminatory Advertising For Housing

Government lawyers will need to approve Meta’s new algorithm, and Meta was fined $115,054, “the maximum penalty available under the Fair Housing Act.”

The DOJ’s press release states: “This settlement marks the first time that Meta will be subject to court oversight for its ad targeting and delivery system.”

People don’t reason well about robots

Andrew Keane Woods in the University of Colorado Law Review:

[D]octors continue to privilege their own intuitions over automated decision-making aids. Since Meehl’s time, a growing body of social psychology scholarship has offered an explanation: bias against nonhuman decision-makers…. As Jack Balkin notes, “When we talk about robots, or AI agents, or algorithms, we usually focus on whether they cause problems or threats. But in most cases, the problem isn’t the robots. It’s the humans.”


Making decisions that go against our own instincts is very difficult (see also List of cognitive biases), and relying on data and algorithms is no different.

A major challenge of AI ethics is figuring out when to trust AIs.

Andrew Keane Woods suggests (1) defaulting to the use of AIs; (2) anthropomorphizing machines to encourage us to treat them as fellow decision-makers; (3) educating against robophobia; and perhaps most dramatically (4) banning humans from the loop. 😲

Believing AIs is sometimes easy, and sometimes hard

Most ethicists worry that AIs will be wrong and that we will harm people by deferring to them. But AIs can also be right and ignored:

NURSE DINA SARRO didn’t know much about artificial intelligence when Duke University Hospital installed machine learning software to raise an alarm when a person was at risk of developing sepsis, a complication of infection that is the number one killer in US hospitals. The software, called Sepsis Watch, passed alerts from an algorithm Duke researchers had tuned with 32 million data points from past patients to the hospital’s team of rapid response nurses, co-led by Sarro.

But when nurses relayed those warnings to doctors, they sometimes encountered indifference or even suspicion. When docs questioned why the AI thought a patient needed extra attention, Sarro found herself in a tough spot. “I wouldn’t have a good answer because it’s based on an algorithm,” she says.

AI Can Help Patients—but Only If Doctors Understand It

AI test proctor fails

One college student went viral on TikTok after posting a video in which she said that a test proctoring program had flagged her behavior as suspicious because she was reading the question aloud, resulting in her professor assigning her a failing grade.

A student says test proctoring AI flagged her as cheating when she read a question out loud. Others say the software could have more dire consequences.

This is basic ethics: if your AI has real consequences, you’d better get it right.

The value of distinguishing AIs from humans

What will happen when we can no longer distinguish human tweets from AI tweets? Does it matter? Should we care? Will there be a verified human status?

Renée DiResta, writing for The Atlantic:

Amid the arms race surrounding AI-generated content, users and internet companies will give up on trying to judge authenticity tweet by tweet and article by article. Instead, the identity of the account attached to the comment, or person attached to the byline, will become a critical signal toward gauging legitimacy. Many users will want to know that what they’re reading or seeing is tied to a real person—not an AI-generated persona. . . .

. . . . .

The idea that a verified identity should be a precondition for contributing to public discourse is dystopian in its own way. Since the dawn of the nation, Americans have valued anonymous and pseudonymous speech: Alexander Hamilton, James Madison, and John Jay used the pen name Publius when they wrote the Federalist Papers, which laid out founding principles of American government. Whistleblowers and other insiders have published anonymous statements in the interest of informing the public. Figures as varied as the statistics guru Nate Silver (“Poblano”) and Senator Mitt Romney (“Pierre Delecto”) have used pseudonyms while discussing political matters on the internet. The goal shouldn’t be to end anonymity online, but merely to reserve the public square for people who exist—not for artificially intelligent propaganda generators.

The Supply of Disinformation Will Soon Be Infinite

The idea that we should reserve the public square for humans is remarkable, if only because it shows that this technology is now upon us. Human sentiments have value; AI facsimiles do not.

An optimistic take is that perhaps we will instead pay attention to the useful content of such messages, rather than inflammatory rhetoric. A good idea is a good idea, AI or not.

Portland bans facial recognition by private entities

34.10.030 Prohibition.

Except as provided in the Exceptions section below, a Private Entity shall not use Face Recognition Technologies in Places of Public Accommodation within the boundaries of the City of Portland.

34.10.040 Exceptions.

The prohibition in this Chapter does not apply to use of Face Recognition Technologies:

1. To the extent necessary for a Private Entity to comply with federal, state, or local laws;

2. For user verification purposes by an individual to access the individual’s own personal or employer issued communication and electronic devices; or

3. In automatic face detection services in social media applications.

Prohibit the use of Face Recognition Technologies by Private Entities in Places of Public Accommodation in the City (via PRIVACY & INFORMATION SECURITY LAW BLOG)

Note the exception for use in “social media applications.”

What does it mean for AI to be “explainable”?

A NIST paper attempts to answer this question:

Briefly, our four principles of explainable AI are:

Explanation: Systems deliver accompanying evidence or reason(s) for all outputs. 

Meaningful: Systems provide explanations that are understandable to individual users. 

Explanation Accuracy: The explanation correctly reflects the system’s process for generating the output. 

Knowledge Limits: The system only operates under conditions for which it was designed or when the system reaches a sufficient confidence in its output. 

Four Principles of Explainable Artificial Intelligence

Stating this differently: there should be an explanation, it should be understandable and accurate, and the system should stop when it’s generating nonsense.
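The last principle, knowledge limits, is the easiest to picture in code. The following is a minimal sketch assuming a classifier that returns a probability per label; the function name, the labels, and the 0.8 cutoff are all hypothetical, and the sketch further assumes the model's probabilities are reasonably well calibrated, which is often not the case in practice.

```python
def answer_or_abstain(class_probabilities, min_confidence=0.8):
    """Return the top label only when the model is confident enough;
    otherwise decline to answer and say why."""
    label, confidence = max(class_probabilities.items(), key=lambda kv: kv[1])
    if confidence < min_confidence:
        return {
            "label": None,
            "confidence": confidence,
            "explanation": f"Declined: top confidence {confidence:.2f} "
                           f"is below the {min_confidence:.2f} cutoff.",
        }
    return {
        "label": label,
        "confidence": confidence,
        "explanation": f"Chose '{label}' with confidence {confidence:.2f}.",
    }


confident = answer_or_abstain({"A": 0.93, "B": 0.07})
unsure = answer_or_abstain({"A": 0.55, "B": 0.45})
print(confident["label"])  # A
print(unsure["label"])     # None
```

Note that the "explanation" here only reports a number. It says nothing about why the model arrived at that number, which is the harder part of the first three principles.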

These are very reasonable principles, but likely tough to deliver with current technology.

Indeed, the paper notes that humans are often unable to explain why they took a certain action:

People fabricate reasons for their decisions, even those thought to be immutable, such as personally held opinions [24, 34, 99]. In fact, people’s conscious reasoning that is able to be verbalized does not seem to always occur before the expressed decision. Instead, evidence suggests that people make their decision and then apply reasons for those decisions after the fact [95]. From a neuroscience perspective, neural markers of a decision can occur up to 10 seconds before a person’s conscious awareness [85]. This finding suggests that decision making processes begin long before our conscious awareness. 

Id. at 14.

And it is well documented that even experts generally cannot predict their own accuracy.

What hope do the AIs have?