German Data Ethics Commission insists AI regulation is necessary

The German Data Ethics Commission has issued a 240-page report with 75 recommendations for regulating data, algorithmic systems, and AI. It takes one of the strongest positions on ethical AI to date and favors explicit regulation.

The Data Ethics Commission holds the view that regulation is necessary, and cannot be replaced by ethical principles.

Opinion of the Data Ethics Commission – Executive Summary at 7 (emphasis original).

The report divides ethical considerations into concerns about either data or algorithmic systems. For data, the report suggests that rights associated with the data will play a significant role in the ethical landscape. For example, ensuring that individuals provide informed consent for use of their personal data addresses a number of significant ethical issues.

For algorithmic systems, however, the report suggests that AI systems might have no connection to the affected individuals. As a result, even non-personal data for which there are no associated rights could be used in an unethical manner. The report concludes that regulation is necessary to the extent there is a potential for harm.

The report identifies five levels of algorithmic system criticality. Applications with zero or negligible potential for harm would face no regulation. The regulatory burden would increase as the potential for harm increases, up to a total ban. For applications with serious potential for harm, the report recommends constant oversight.
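The tiered approach can be sketched as a simple lookup. This is only an illustration of the summary above, not the report's own wording: the level names are made up, placing "serious" at the fourth level is an assumption, and the two intermediate levels are left unspecified because the summary does not describe them.

```python
from enum import IntEnum


class Criticality(IntEnum):
    """Five levels ordered by potential for harm (names are illustrative)."""
    NEGLIGIBLE = 1
    LEVEL_2 = 2
    LEVEL_3 = 3
    SERIOUS = 4      # assumption: the tier just below an outright ban
    UNTENABLE = 5


# Only the three measures named in the summary are filled in.
MEASURES = {
    Criticality.NEGLIGIBLE: "no regulation",
    Criticality.SERIOUS: "constant oversight",
    Criticality.UNTENABLE: "total ban",
}


def required_measure(level: Criticality) -> str:
    """Return the regulatory measure for a level, if the summary names one."""
    return MEASURES.get(level, "intermediate requirements (not detailed here)")
```

Because the levels are ordered, a comparison such as `Criticality.UNTENABLE > Criticality.SERIOUS` expresses the idea that the regulatory burden rises with the potential for harm.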

The framework appears to be a good candidate for future ethical AI regulation in Europe, and perhaps (by default) the world.

White House Releases AI Principles

The White House has released draft “guidance for regulation of artificial intelligence applications.” The memo states that “Federal agencies must avoid regulatory or non-regulatory actions that needlessly hamper AI innovation and growth.”

Agencies should consider new regulation only after they have reached the decision . . . that Federal regulation is necessary.

Nevertheless, the memo enumerates ten principles that agencies should take into account should they ultimately take action that impacts AI:

  1. Public Trust in AI. Don’t undermine it by allowing AI systems to pose risks to privacy, individual rights, autonomy, and civil liberties.
  2. Public Participation. Don’t block public participation in the rulemaking process.
  3. Scientific Integrity and Information Quality. Use scientific principles.
  4. Risk Assessment and Management. Use risk management principles.
  5. Benefits and Costs.
  6. Flexibility. Be flexible and ensure American companies are not disadvantaged by the United States’ regulatory regime.
  7. Fairness and Non-Discrimination.
  8. Disclosure and Transparency.
  9. Safety and Security.
  10. Interagency Coordination. Agencies should coordinate.

Overall, the memo is a long-winded directive that agencies should not regulate, but if for some reason they feel they have to, they should consider the same basic principles that everyone else is listing about AI concerns: safety, security, transparency, fairness.

Biased algorithms are easier to fix

Sendhil Mullainathan in an excellent essay for the NYT:

Humans are inscrutable in a way that algorithms are not. Our explanations for our behavior are shifting and constructed after the fact. To measure racial discrimination by people, we must create controlled circumstances in the real world where only race differs. For an algorithm, we can create equally controlled conditions just by feeding it the right data and observing its behavior.

Biased Algorithms Are Easier to Fix Than Biased People

This is a fascinating counterpoint to the concern that deep learning algorithms are a black box and we do not understand how they work. Even so, they are much easier to study than humans. Algorithms are tractable in a way that humans are not.
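To make the idea of a controlled test concrete, here is a minimal sketch of a paired audit: score each record twice, changing only one attribute, and average the difference. The function name, the `group` field, and the toy model are all hypothetical and are not taken from the essay.

```python
from typing import Callable, Dict, List

Applicant = Dict[str, object]


def paired_audit(model: Callable[[Applicant], float],
                 applicants: List[Applicant],
                 attribute: str,
                 value_a: object,
                 value_b: object) -> float:
    """Average change in score when only `attribute` switches from a to b."""
    if not applicants:
        raise ValueError("need at least one applicant")
    total = 0.0
    for person in applicants:
        # Two copies of the same record that differ in exactly one field.
        as_a = {**person, attribute: value_a}
        as_b = {**person, attribute: value_b}
        total += model(as_b) - model(as_a)
    return total / len(applicants)


def toy_model(applicant: Applicant) -> float:
    """Deliberately biased toy scorer, used only to exercise the audit."""
    score = 0.1 * applicant["years_experience"]
    if applicant["group"] == "B":
        score -= 0.2
    return score


applicants = [
    {"years_experience": 2, "group": "A"},
    {"years_experience": 7, "group": "B"},
]
gap = paired_audit(toy_model, applicants, "group", "A", "B")
print(f"average score change from A to B: {gap:.2f}")
```

A nonzero gap shows the model responds to the attribute itself. A gap of zero does not prove fairness, since a model can still rely on other fields that act as proxies for the attribute.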

At its core, this essay is an argument for AI regulation, and an argument that such regulation will actually work.

AI-powered text adventure game

Anyone remember Zork I? It’s now part of a genre called “interactive fiction.” A computer describes something to you, you type a response, you get a custom response, and off you go.

Now developer Nick Walton has created an AI version of this type of game. He’s calling it AI Dungeon 2, and the dialog is created by the AI on the fly. It’s also not fully coherent. But it’s still amazing!

My favorite exchange from The Verge article:

You can play AI Dungeon 2 yourself here.

Australia rolls out driving-while-talking AI detectors

Cameras in New South Wales, Australia will detect when drivers are using mobile phones. Importantly, the system keeps a human in the loop to verify the accuracy of each detection.

This kind of automatic policing raises concerns among many ethicists. (What if the system is bad at detecting certain races or genders and skews the enforcement?) But overall it is hard to find fault in this kind of efficient safety innovation. Innocent people are killed every day by distracted drivers.

Copyrightability of AI creations

One of the many fascinating questions about AI is whether AI creations can be copyrighted and, if so, by whom. Under traditional copyright analysis, the human(s) who made some contribution to the creative work own the copyright by default. If there is no human contribution, there is no copyright. See, for example, the so-called “monkey selfie” case in which a monkey took a selfie and the photographer who owned the camera got no copyright in the photo.

But when an AI creates a work of art, is there human involvement? A human created the AI, and might have fiddled with its knobs, so to speak. Is that sufficient? The U.S. Patent and Trademark Office is asking about this. One question from its request for comments is this:

2. Assuming involvement by a natural person is or should be required, what kind of involvement would or should be sufficient so that the work qualifies for copyright protection? For example, should it be sufficient if a person

(i) designed the AI algorithm or process that created the work;

(ii) contributed to the design of the algorithm or process;

(iii) chose data used by the algorithm for training or otherwise;

(iv) caused the AI algorithm or process to be used to yield the work;

or (v) engaged in some specific combination of the foregoing activities? Are there other contributions a person could make in a potentially copyrightable AI-generated work in order to be considered an “author”?

Request for Comments on Intellectual Property Protection for Artificial Intelligence Innovation

No one really knows the answer to this because (1) it is going to be very fact intensive (lots of different ways for humans to be involved or not involved); and (2) it feels weird to do a lot of work or spend a lot of money to build an AI and not be entitled to copyright over its creations.

In any case, these issues are going to be litigated soon. A Reddit user recently used a widely available AI program called StyleGAN to create a music visualization. And although the underlying AI was not authored by the Reddit poster, the output was allegedly created by “transfer learning with a custom dataset of images curated by the artist.”

Does the Reddit poster (aka self-proclaimed “artist”) own a copyright on the output? Good question.

AI snake oil

Professor Arvind Narayanan of Princeton published a brief deck on AI snake oil. Helpfully, he divides applications into three (non-exclusive) domains:

  1. Perception (e.g., face recognition, speech to text): “genuine, rapid progress”
  2. Automating judgment (e.g., spam detection, essay grading): “imperfect but improving”
  3. Predicting social outcomes (e.g., recidivism, job success): “fundamentally dubious”

His point is that humans are terrible at predicting social outcomes, and AIs are no better. In fact, manually scoring the data using just a few features is often the best we know how to do.

So AIs predicting job success = snake oil.
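For a sense of how simple such a manual rule can be, here is a minimal sketch of a hand-written point score. The features, thresholds, and cutoff are hypothetical and do not come from the deck.

```python
def manual_score(prior_incidents: int, age: int) -> int:
    """Hand-written point score; features and thresholds are hypothetical."""
    points = 0
    if prior_incidents >= 3:
        points += 2
    elif prior_incidents >= 1:
        points += 1
    if age < 25:
        points += 1
    return points


def flag_high_risk(prior_incidents: int, age: int, cutoff: int = 2) -> bool:
    """Flag a case when the point score reaches the (assumed) cutoff."""
    return manual_score(prior_incidents, age) >= cutoff
```

A rule like this fits on an index card and anyone can inspect it, which is part of why it makes a useful baseline for judging a more complex model.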

NYC issues report on use of algorithms

New York City convened a task force in 2017 to “develop recommendations that will provide a framework for the use of and policy around ADS [automated decision systems].” The report is now out, and has been immediately criticized:

“It’s a waste, really,” says Meredith Whittaker, co-founder of the AI Now Institute and a member of the task force. “This is a sad precedent.” . . .

Ultimately, she says, the report, penned by city officials, “reflects the city’s view and disappointingly leaves out a lot of the dissenting views of task force members.” Members of the task force were given presentations on automated systems that Whittaker says “felt more like pitches or endorsements.” Efforts to make specific policy changes, like developing informational cards on algorithms, were scrapped, she says.

NYC’s algorithm task force was ‘a waste,’ member says

The report itself makes three fairly pointless recommendations: (1) build capacity for an equitable, effective, and responsible approach to the City’s ADS; (2) broaden public discussion on ADS; and (3) formalize ADS management functions.

Someone should really start thinking about this!

The report’s summary acknowledges that “we did not reach consensus on every potentially relevant issue . . . .”

Anonymizing data is hard

Google tried to anonymize health care data and failed:

On July 19, NIH contacted Google to alert the company that its researchers had found dozens of images still included personally identifying information, including the dates the X-rays were taken and distinctive jewelry that patients were wearing when the X-rays were taken, the emails show.

Google almost made 100,000 chest X-rays public — until it realized personal data could be exposed

This article comes across as a warning, but it’s a success story. Smart people thought they could anonymize data, someone noticed they couldn’t, the lawyers got involved, and the project was called off.

That’s how the system is supposed to work.