Learning from synthetic data

Microsoft trained an excellent 3D face reconstruction model using synthetic data.

Synthetic (i.e., computer-generated) data is helpful because it takes humans a long time to look at many faces and label all of their features. Synthetic data arrives already labeled, which allows for good and fast training:

Can we keep things simple by just using more landmarks?

In answer, we present the first method that accurately predicts ten times as many landmarks as usual, covering the whole head, including the eyes and teeth. This is accomplished using synthetic training data, which guarantees perfect landmark annotations.

3D Face Reconstruction with Dense Landmarks

Maybe the police should be able to use facial recognition…

Scott Ikeda for CPO Magazine:

Some cities and states that were early to ban law enforcement from using facial recognition software appear to be having second thoughts, which privacy advocates with the Electronic Frontier Foundation (EFF) and other organizations largely attribute to an uptick in certain types of urban crime.

Facial Recognition Bans Begin To Fall Around the US as Re-Funding of Law Enforcement Becomes Politically Popular

New Orleans and Virginia have both backtracked a bit, now allowing facial recognition technology under supervision and for more serious types of crime.

Virginia in particular has imposed a requirement that facial recognition technology have an accuracy rating of at least 98% across all demographics.
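
A requirement like this can only be checked if accuracy is broken out per demographic group rather than averaged over everyone. Here is a minimal Python sketch, assuming hypothetical evaluation records of (group, correct) pairs. The group labels, the sample counts, and the `meets_threshold` helper are illustrative assumptions, not anything drawn from the Virginia law or a vendor's test protocol:

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return {group: fraction correct} for (group, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        if is_correct:
            correct[group] += 1
    return {g: correct[g] / totals[g] for g in totals}

def meets_threshold(records, threshold=0.98):
    """True only if every group reaches the threshold (hypothetical helper)."""
    per_group = accuracy_by_group(records)
    # An empty evaluation set proves nothing, so treat it as a failure.
    return bool(per_group) and all(acc >= threshold for acc in per_group.values())

# Hypothetical data: group A is 299/300 correct, group B is 96/100 correct.
records = (
    [("A", True)] * 299 + [("A", False)] * 1
    + [("B", True)] * 96 + [("B", False)] * 4
)
print(accuracy_by_group(records))
print(meets_threshold(records))
```

In this made-up data the pooled accuracy is 395/400, or 98.75%, yet the check fails because group B sits at 96%. That gap is what an "across all demographics" requirement is meant to catch.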

More data for AI interpretation of patents

Google has released the Patent Phrase Similarity dataset, intended to help AI models better understand the somewhat odd world of patent language:

The process of using traditional patent search methods (e.g., keyword searching) to search through the corpus of over one hundred million patent documents can be tedious and result in many missed results due to the broad and non-standard language used. For example, a “soccer ball” may be described as a “spherical recreation device”, “inflatable sportsball” or “ball for ball game”.

Announcing the Patent Phrase Similarity Dataset

The dataset was used in the U.S. Patent Phrase to Phrase Matching Kaggle competition with some close-to-human results.
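
The soccer ball example shows why keyword search struggles with this language. Here is a minimal Python sketch that uses word-level Jaccard overlap as a stand-in for keyword matching. That choice is an illustrative assumption, not the method used by Google or the Kaggle entrants, and the `jaccard` helper is hypothetical:

```python
def jaccard(a, b):
    """Word-level Jaccard overlap between two phrases (0.0 to 1.0)."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

query = "soccer ball"
# Alternative descriptions quoted in the dataset announcement.
candidates = [
    "spherical recreation device",
    "inflatable sportsball",
    "ball for ball game",
]
for phrase in candidates:
    print(f"{phrase!r}: {jaccard(query, phrase):.2f}")
```

Two of the three phrases share no words with the query and score zero, even though a human reader sees them as near-synonyms. A labeled similarity dataset gives models examples of exactly those pairs.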

Commercial (legal) limitations of DALL-E 2

Louise Matsakis reporting for The Information:

At least one major brand has already tried incorporating Dall-e 2 into an advertising campaign, inadvertently demonstrating how legal snafus could arise. When Heinz’s marketing team fed Dall-e 2 “generic ketchup-related prompts,” the program almost exclusively produced images closely resembling the company’s trademarked condiment bottle. “We ultimately found that no matter how we were asking, we were still seeing results that looked like Heinz,” a company representative told AdWeek.

Can Creatives Survive the Future War Against Dall-e 2?

The image generation AIs are remarkable, but they still have significant technical limitations, particularly an inability to generate unusual images (“a cup on a spoon”).

Post-quantum encryption scheme broken with classical math

Dan Goodin for Ars Technica:

SIKE is the second NIST-designated PQC candidate to be invalidated this year. In February, IBM post-doc researcher Ward Beullens published research that broke Rainbow, a cryptographic signature scheme with its security, according to Cryptomathic, “relying on the hardness of the problem of solving a large system of multivariate quadratic equations over a finite field.”

Post-quantum encryption contender is taken out by single-core PC and 1 hour
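
To make the quoted phrase "a large system of multivariate quadratic equations over a finite field" concrete, here is a toy Python sketch over the two-element field GF(2). The three equations, the `SYSTEM` encoding, and the `evaluate` and `solve` helpers are all made up for illustration. The sketch does not model Rainbow itself or the attack described in the article:

```python
from itertools import product

# Each equation is a list of monomials; a monomial is a tuple of variable
# indices: () is the constant 1, (0,) is x0, and (0, 1) is x0*x1.
# An equation holds when its monomials sum to 0 modulo 2.
SYSTEM = [
    [(0, 1), (2,), ()],  # x0*x1 + x2 + 1 = 0
    [(0, 2), (1,)],      # x0*x2 + x1 = 0
    [(1, 2), (0,)],      # x1*x2 + x0 = 0
]

def evaluate(equation, x):
    """Evaluate one equation at assignment x, with arithmetic in GF(2)."""
    total = 0
    for monomial in equation:
        term = 1
        for i in monomial:
            term &= x[i]   # multiplication in GF(2) is AND
        total ^= term      # addition in GF(2) is XOR
    return total

def solve(system, n_vars):
    """Brute force: try all 2**n_vars assignments and keep the solutions."""
    return [
        x for x in product((0, 1), repeat=n_vars)
        if all(evaluate(eq, x) == 0 for eq in system)
    ]

print(solve(SYSTEM, 3))
```

With three variables there are only eight assignments to try. The search space doubles with every added variable, which is the intuition behind treating large systems of this kind as hard to solve.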

One of the SIKE inventors conceded that many cryptographers “do not understand as much mathematics as we really should.”

One gets a sense that the AIs are going to be really good at this, though.

Discovery sanctions for GDPR redactions

An order by Judge Payne of the Eastern District of Texas rejects the claim that redactions allegedly required by the GDPR were proper:

To further demonstrate the alleged bad faith application of the GDPR, Arigna showed where Continental blacked out the faces of its Executive Board in a picture even though that picture was available on Continental’s public website without the redactions. Based on these redactions and failure to timely produce the ESI, Arigna seeks an adverse inference instruction; an order precluding Continental from using any document that it did not timely produce, and Arigna’s costs and fees.

In response, Continental argued (but did not show) that it received an opinion letter from a law firm based in Europe stating the redactions were required by the GDPR, and that it had worked diligently to produce the ESI while also complying with the GDPR.

July 29, 2022 Memorandum Order, Case No. 22-cv-00126 (EDTX)

Wikipedia influences judicial decisions

Bob Ambrogi:

To assess whether Wikipedia impacts judicial decisions, the researchers set out to test for two types of influence: (1) whether the creation of a Wikipedia article on a case leads to that case being cited more often in judicial decisions; and (2) whether the text of judicial decisions is influenced by the text of the corresponding Wikipedia article.

Scientists Conclude that Wikipedia Influences Judges’ Legal Reasoning

They found that the addition of a case to Wikipedia increased the case’s citations by 20%.

They also purport to demonstrate with natural language analysis that “a textual similarity exists between the judicial decisions and the Wikipedia articles.”

I’m skeptical that this method proves actual influence by a Wikipedia article. But it’s easy to believe that case salience would have an impact.

Convenience vs Privacy

Very cool technology:

Delta Air Lines recently introduced a “Parallel Reality” system that lets travelers access individual flight information on a shared overhead screen based on a scan of their boarding pass — or their face. The twist is that 100 people can do this at a time, all using the same digital screen but only seeing their own personal details.

Unlike a regular TV or video wall, in which each pixel would emit the same color of light in every direction, the board sends different colors of light in different directions.

Coming to a giant airport screen: Your personal flight information

But it does require that computers know exactly who and where you are.

Is ShotSpotter AI?

A federal lawsuit filed Thursday alleges Chicago police misused “unreliable” gunshot detection technology and failed to pursue other leads in investigating a grandfather from the city’s South Side who was charged with killing a neighbor.

. . . . .

ShotSpotter’s website says the company is “a leader in precision policing technology solutions” that help stop gun violence by using sensors, algorithms and artificial intelligence to classify 14 million sounds in its proprietary database as gunshots or something else.

Lawsuit: Chicago police misused ShotSpotter in murder case

Some commentators (e.g., link) have jumped on this story as an example of someone (allegedly) being wrongly imprisoned due to AI.

But maybe ShotSpotter is just bad software that is used improperly? Does it matter?

The definition of AI is so difficult that we may soon find ourselves regulating all software.