Is ShotSpotter AI?

 A federal lawsuit filed Thursday alleges Chicago police misused “unreliable” gunshot detection technology and failed to pursue other leads in investigating a grandfather from the city’s South Side who was charged with killing a neighbor.

. . . . .

ShotSpotter’s website says the company is “a leader in precision policing technology solutions” that help stop gun violence by using sensors, algorithms and artificial intelligence to classify 14 million sounds in its proprietary database as gunshots or something else.

Lawsuit: Chicago police misused ShotSpotter in murder case

Some commentators (e.g., link) have jumped on this story as an example of someone (allegedly) being wrongly imprisoned due to AI.

But maybe ShotSpotter is just bad software that is used improperly? Does it matter?

Defining AI is so difficult that we may soon find ourselves regulating all software.
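
To make the line-drawing problem concrete, here is a toy sketch of a sound "classifier." Everything in it (features, weights, threshold) is invented and reflects nothing about ShotSpotter's actual system. If the weights are hand-tuned by an engineer, is it just software? If they were fit to 14 million labeled sounds, is it AI? The code looks the same either way.

```python
# Toy gunshot "detector" sketch. All features, weights, and thresholds are
# invented for illustration; none of this reflects ShotSpotter's system.
import numpy as np

def features(clip: np.ndarray, sample_rate: int = 8000) -> np.ndarray:
    """Two simple acoustic features: peak amplitude and spectral centroid."""
    spectrum = np.abs(np.fft.rfft(clip))
    freqs = np.fft.rfftfreq(len(clip), d=1 / sample_rate)
    centroid_khz = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-9) / 1000
    return np.array([np.max(np.abs(clip)), centroid_khz])

def is_gunshot(clip: np.ndarray) -> bool:
    """Linear classifier over the features. Whether the weights below were
    hand-tuned or learned from labeled data is invisible in the code."""
    weights, bias = np.array([4.0, 0.5]), -3.0  # made-up parameters
    score = features(clip) @ weights + bias
    return 1 / (1 + np.exp(-score)) > 0.5

# A synthetic impulsive bang vs. quiet background noise.
rng = np.random.default_rng(0)
bang = np.concatenate([rng.normal(0, 1.0, 80), rng.normal(0, 0.01, 720)])
print(is_gunshot(bang), is_gunshot(rng.normal(0, 0.01, 800)))  # True False
```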

Pilot confusion over computer control in Airbus planes

An Airbus A321 jet unexpectedly rolled hard left on takeoff in an April 2019 incident.

The NTSB concluded the incident was pilot error: too much rudder while taking off in a crosswind.

But the pilots were very confused:

A transcript of conversations between the captain and his first officer shows how the two were confused and didn’t realize how badly damaged the plane was as they continued to climb out of New York.

“Are we continuing?” the first officer asked, according to a transcript released by the NTSB. “ … I thought we were gone.”

“Well she feels normal now,” the captain said a few minutes later.

NTSB: Pilot described ‘near death’ experience after wing hit the ground

And at least one pilot blamed the confusing nature of the Airbus flight computers (Cockpit Transcript at 18).

Thankfully the pilots safely returned to JFK airport.

“The internet is less free, more fragmented, and less secure”

The Council on Foreign Relations, described by Wikipedia as a “right leaning American think tank specializing in U.S. foreign policy and international relations,” has issued a report titled Confronting Reality in Cyberspace:

The major findings of the Task Force are as follows:

The era of the global internet is over.

U.S. policies promoting an open, global internet have failed, and Washington will be unable to stop or reverse the trend toward fragmentation.

Data is a source of geopolitical power and competition and is seen as central to economic and national security.

The report is a warning that the U.S. needs to get serious about a fragmenting internet or risk losing digital leadership entirely.

AI discoveries in chess

AlphaZero shocked the chess world in 2018.

Now an economics paper is trying to quantify the effect of this new chess knowledge:

[W]e are not aware of any previously documented evidence comparing human performance before and after the introduction of an AI system, showing that humans have learned from AI’s ideas, and that this has pushed the frontier of our understanding.

AlphaZero Ideas

The paper shows that the top-ranked chess player in the world, Magnus Carlsen, meaningfully altered his play and incorporated ideas from AlphaZero on openings, sacrifices, and the early advance of the h-pawn.

Carlsen himself acknowledged the influence:

Question: We are really curious about the influence of AlphaZero in your game.

Answer: Yes, I have been influenced by my hero AlphaZero recently. In essence, I have become a very different player in terms of style than I was a bit earlier and it’s been a great ride.

Id. at 25 (citing a June 14, 2019 interview in Norway Chess 2019).

Bias mitigations for the DALL-E 2 image generation AI

OpenAI has a post explaining the three main techniques it used to “prevent generated images from violating our content policy.”

First, they filtered out violent and sexual images from the training dataset:

[W]e prioritized filtering out all of the bad data over leaving in all of the good data. This is because we can always fine-tune our model with more data later to teach it new things, but it’s much harder to make the model forget something that it has already learned.
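
In classifier terms, that priority is just an aggressive threshold: tolerate false positives (good images discarded) to minimize false negatives (bad images retained). A minimal sketch, with `score_image` standing in for a trained content classifier (an assumption of mine, not OpenAI's API):

```python
# Sketch of recall-favoring filtering. `score_image` is a hypothetical
# stand-in for a trained classifier returning P(image violates policy).
from typing import Callable, Iterable

def filter_dataset(images: Iterable[str],
                   score_image: Callable[[str], float],
                   threshold: float = 0.2) -> list[str]:
    """Keep only images the classifier is fairly confident are clean.
    A low threshold deliberately over-filters: some good images are lost,
    but few policy-violating images survive."""
    return [img for img in images if score_image(img) < threshold]

# Usage with a toy scorer keyed off filenames.
toy_scores = {"cat.png": 0.05, "borderline.png": 0.35, "bad.png": 0.9}
print(filter_dataset(toy_scores, toy_scores.get))  # ['cat.png']
```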

Second, they found that the filtering can actually amplify bias because the smaller remaining datasets may be less diverse:

We hypothesize that this particular case of bias amplification comes from two places: first, even if women and men have roughly equal representation in the original dataset, the dataset may be biased toward presenting women in more sexualized contexts; and second, our classifiers themselves may be biased either due to implementation or class definition, despite our efforts to ensure that this was not the case during the data collection and validation phases. Due to both of these effects, our filter may remove more images of women than men, which changes the gender ratio that the model observes in training.

They fix this by re-weighting the filtered training dataset so that its category balance matches that of the unfiltered dataset.
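
A simplified version of the arithmetic: weight each category by its share before filtering divided by its share after, so over-filtered categories are up-weighted during training. (OpenAI describes estimating the weights with a classifier rather than explicit category labels; the labels below are invented for illustration.)

```python
# Sketch of per-category re-weighting; category labels are invented.
from collections import Counter

def category_weights(unfiltered: list[str], filtered: list[str]) -> dict[str, float]:
    """weight(c) = share of c before filtering / share of c after filtering,
    so categories the filter hit hardest are up-weighted in training."""
    before, after = Counter(unfiltered), Counter(filtered)
    n_before, n_after = len(unfiltered), len(filtered)
    return {c: (before[c] / n_before) / (after[c] / n_after) for c in after}

# Toy example: the filter removed more "woman" images than "man" images.
unfiltered = ["woman"] * 500 + ["man"] * 500
filtered = ["woman"] * 300 + ["man"] * 450
print(category_weights(unfiltered, filtered))  # {'woman': 1.25, 'man': 0.83...}
```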

Third, they needed to prevent image regurgitation to avoid IP and privacy issues. They found that most regurgitated images (a) were simple vector graphics; and (b) had many near-duplicates in the training set. As a result, these images were easier for the model to memorize. So they de-duplicated images with a clustering algorithm.

To test the effect of deduplication on our models, we trained two models with identical hyperparameters: one on the full dataset, and one on the deduplicated version of the dataset. . . . Surprisingly, we found that human evaluators slightly preferred the model trained on deduplicated data, suggesting that the large amount of redundant images in the dataset was actually hurting performance.
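
The deduplication step might look roughly like the greedy sketch below: keep an image only if it isn't too similar to anything already kept. The embeddings are assumed to exist already, and OpenAI describes a more scalable clustering-based approach, since naive pairwise comparison doesn't work across hundreds of millions of images.

```python
# Greedy near-duplicate removal sketch over precomputed image embeddings.
import numpy as np

def deduplicate(embeddings: np.ndarray, threshold: float = 0.95) -> list[int]:
    """Return indices to keep: an image survives only if its cosine
    similarity to every already-kept image is below `threshold`."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    kept: list[int] = []
    for i, vec in enumerate(unit):
        if all(vec @ unit[j] < threshold for j in kept):
            kept.append(i)
    return kept

# Usage: three embeddings, the first two nearly identical.
embs = np.array([[1.0, 0.0], [0.999, 0.04], [0.0, 1.0]])
print(deduplicate(embs))  # [0, 2]
```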

Given the obviously impressive results, this is an instructive set of techniques for AI model bias mitigation.

Keyword search warrants are (too?) powerful

Three teenagers set fire to a home in Denver because they believed someone who stole a phone lived there. Five members of a family died.

The police had video from a neighbor’s house showing three people in hooded sweatshirts and masks near the home at the time of the fire. But for weeks they had no further evidence.

Then the police subpoenaed cell tower data to see who was in the area. They got data on 7,000 devices, which they narrowed by excluding neighbors' devices and any that did not match the movements of a vehicle investigators had observed. Only 33 devices remained.
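
Mechanically, that winnowing is simple set filtering over the tower dump. A minimal sketch, with the data shapes and the route-matching predicate invented for illustration:

```python
# Sketch of the investigators' winnowing, under assumed data shapes.
from typing import Callable

Ping = tuple[float, float, float]  # (timestamp, lat, lon): assumed shape

def narrow_devices(tower_dump: dict[str, list[Ping]],
                   known_neighbors: set[str],
                   matches_vehicle: Callable[[list[Ping]], bool]) -> set[str]:
    """Drop neighbors' devices; keep only devices whose pings track the
    observed vehicle's movement."""
    return {device for device, pings in tower_dump.items()
            if device not in known_neighbors and matches_vehicle(pings)}

# Toy usage: the predicate here just checks whether a device moved at all.
dump = {"a": [(0.0, 39.70, -104.98), (60.0, 39.72, -104.95)],
        "b": [(0.0, 39.70, -104.98), (60.0, 39.70, -104.98)]}
moved = lambda pings: pings[0][1:] != pings[-1][1:]
print(narrow_devices(dump, known_neighbors=set(), matches_vehicle=moved))  # {'a'}
```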

Then they went to Google:

[A] warrant to Google asked for any searches for the destroyed house’s address anytime in the two weeks before the fire. Google provided five accounts that made that search — including three accounts with email addresses that included [the suspects’ names].

Teen charged in deadly Denver arson told investigators he set fire over stolen phone, detective says

One of the defendants has filed a motion to suppress the Google search evidence, and the EFF has filed an amicus brief in support:

Should the police be able to ask Google for the name of everyone who searched for the address of an abortion provider in a state where abortions are now illegal? Or who searched for the drug mifepristone? What about people who searched for gender-affirming healthcare providers in a state that has equated such care with child abuse? Or everyone who searched for a dispensary in a state that has legalized cannabis but where the federal government still considers it illegal?

EFF to File Amicus Brief in First U.S. Case Challenging Dragnet Keyword Warrant

Fascinating case. Some version of this feels destined for the U.S. Supreme Court.

States aren’t any better at privacy

A press release by the California Department of Justice acknowledges that it leaked personal data on individuals who applied for concealed-carry weapons permits between 2011 and 2021.

The leaked data included “names, date of birth, gender, race, driver’s license number, addresses, and criminal history.”

The California Attorney General page on the California Consumer Privacy Act:

https://oag.ca.gov/privacy/ccpa

At least GDPR applies to public entities in Europe.

UK IPO suggests copyright exception for text and data mining

The United Kingdom’s Intellectual Property Office has concluded a study on “how AI should be dealt with in the patent and copyright systems.”

For text and data mining, we plan to introduce a new copyright and database exception which allows TDM for any purpose. Rights holders will still have safeguards to protect their content, including a requirement for lawful access.

Consultation outcome / Artificial Intelligence and IP: copyright and patents

They also considered copyright protection for computer-generated works without a human author, and patent protection for AI-devised inventions. But they suggest no changes in the law for these latter two areas.

Some companies agree to not use location data from “sensitive points of interest”

A subset of Network Advertising Initiative companies have voluntarily agreed that they will not use location data associated with “sensitive points of interest,” which include the following (a sketch of such a filter appears at the end of this item):

Places of religious worship

Correctional facilities

Places that may be used to infer an LGBTQ+ identification

Places that may be used to infer engagement with explicit sexual content, material, or acts

Places primarily intended to be occupied by children under 16

Domestic abuse shelters, including rape crisis centers

Welfare or homeless shelters and halfway houses

Dependency or addiction treatment centers

Medical facilities that cater predominantly to sensitive conditions, such as cancer centers, HIV/AIDS, fertility or abortion clinics, mental health treatment facilities, or emergency room trauma centers

Places that may be used to infer refugee or immigrant status, such as refugee or immigration centers and immigration services

Credit repair, debt services, bankruptcy services, or payday lending institutions

Temporary places of assembly such as political rallies, marches, or protests, during the times that such rallies, marches, or protests take place

Military bases

NAI PRECISE LOCATION INFORMATION SOLUTION PROVIDER VOLUNTARY ENHANCED STANDARDS

The announcement comes amid increasing public concern that location data brokers might, willingly or under legal compulsion, provide data on individuals visiting abortion clinics.
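
Operationally, the standard amounts to a category blocklist applied before location data is used or sold. A minimal sketch, with invented category tags and record shape:

```python
# Blocklist sketch of the NAI standard. Category tags and the record shape
# are invented; real providers resolve lat/lon to POI categories first.
SENSITIVE_CATEGORIES = {
    "place_of_worship", "correctional_facility", "lgbtq_associated",
    "adult_content_venue", "childrens_venue", "domestic_abuse_shelter",
    "welfare_or_homeless_shelter", "addiction_treatment",
    "sensitive_medical_facility", "refugee_or_immigration_service",
    "debt_or_credit_service", "temporary_assembly", "military_base",
}

def usable_events(events: list[dict]) -> list[dict]:
    """Drop any location event whose resolved POI category is sensitive."""
    return [e for e in events if e.get("poi_category") not in SENSITIVE_CATEGORIES]

# Usage: only the coffee-shop visit survives.
events = [{"device": "x", "poi_category": "coffee_shop"},
          {"device": "x", "poi_category": "sensitive_medical_facility"}]
print(usable_events(events))
```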