Using facial recognition in police investigations

The Georgetown Law Center on Privacy & Technology issued a report (with its own vanity URL!) on the NYPD’s use of face recognition technology, and it starts with a particularly arresting anecdote:

On April 28, 2017, a suspect was caught on camera reportedly stealing beer from a CVS in New York City. The store surveillance camera that recorded the incident captured the suspect’s face, but it was partially obscured and highly pixelated. When the investigating detectives submitted the photo to the New York Police Department’s (NYPD) facial recognition system, it returned no useful matches.

Rather than concluding that the suspect could not be identified using face recognition, however, the detectives got creative.

One detective from the Facial Identification Section (FIS), responsible for conducting face recognition searches for the NYPD, noted that the suspect looked like the actor Woody Harrelson, known for his performances in Cheers, Natural Born Killers, True Detective, and other television shows and movies. A Google image search for the actor predictably returned high-quality images, which detectives then submitted to the face recognition algorithm in place of the suspect’s photo. In the resulting list of possible candidates, the detectives identified someone they believed was a match—not to Harrelson but to the suspect whose photo had produced no possible hits.

This celebrity “match” was sent back to the investigating officers, and someone who was not Woody Harrelson was eventually arrested for petit larceny.


The report describes a number of incidents that it views as problematic, and they basically fall into two categories: (1) editing or reconstructing photos before submitting them to face recognition systems; and (2) simply uploading composite sketches of suspects to face recognition systems.

The report also describes a few incidents in which individuals were arrested based on very little evidence apart from the results of the face recognition technology, and it makes the claim that:

If it were discovered that a forensic fingerprint expert was graphically replacing missing or blurry portions of a latent print with computer-generated—or manually drawn—lines, or mirroring over a partial print to complete the finger, it would be a scandal.

I’m not sure this is true. Helping a computer system latch onto a possible set of matches seems an excellent way to narrow a list of suspects. But of course we should not be permitted to arrest or convict based solely on fabricated fingerprint or facial “evidence”. We need to understand the limits of technology used in the investigative process.

As technology becomes more complex, it is increasingly difficult to understand how it works and does not work. License plate readers are fantastically powerful technology, responsible for solving really terrible crimes. But the technology stack makes mistakes. You cannot rely on it alone.

There is no difference in principle between facial recognition technology, genealogy searches, and license plate readers. They are powerful tools but they are not perfect. And, crucially, they can be far less accurate when used on minority populations. Using powerful tools requires training. And the benefits are remarkable. But users need to understand how the technology works and where it can break down. This will always be true.

Salvador Dalí recreated with AI at Dalí Museum in Florida

What is dead may never die, at least with AI. The painter Salvador Dalí has been recreated on life-size video to interact with visitors to the Dalí Museum in St. Petersburg, Florida.

Using archival footage from interviews, GS&P pulled over 6,000 frames and used 1,000 hours of machine learning to train the AI algorithm on Dalí’s face. His facial expressions were then imposed over an actor with Dalí’s body proportions, and quotes from his interviews and letters were synced with a voice actor who could mimic his unique accent, a mix of French, Spanish, and English.

. . . . .

It’s hard to think of another artist who would be better suited for this than Dalí.

Deepfake Salvador Dalí takes selfies with museum visitors

This is going to be everywhere soon. How long until people start paying to have themselves recreated after they die?

The video is worth watching:

SF restricts its government agencies from using facial recognition technology

There are many reports that “SF bans facial recognition” (I’m looking at you, NYT), but this is not true. The “ban” only restricts the city’s own government agencies (including the police) from using facial recognition.

San Francisco’s ban covers government agencies, including the city police and county sheriff’s department, but doesn’t affect the technology that unlocks your iPhone or cameras installed by businesses or individuals. It’s part of a broader package of rules, introduced in January by supervisor Aaron Peskin, that will require agencies to gain approval from the board before purchasing surveillance tech and will require that they publicly disclose its intended use.


None of the reporting seems to link to the actual ordinance, but you can find it on the SF Board of Supervisors’ website. It is file #190110, introduced 1/29/2019. The actual ordinance is here. Summary is here.

Play with OpenAI’s GPT-2 language generation model

In February 2019, OpenAI disclosed a language generation algorithm called GPT-2. It does only one thing: predict the next word given all previous words in the text. And, while not perfect, it does this very well. When prompted with:

In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.

it responds with:

The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science.

Now, after almost two centuries, the mystery of what sparked this odd phenomenon is finally solved.

(The text continues.)

GPT-2 is a transformer-based neural network with 1.5 billion parameters trained on a dataset of 8 million web pages. Transformer-based networks were introduced by Google researchers in 2017 primarily for the purpose of language translation. They work on language by figuring out how much attention to pay to which words. Some words have more semantic value than others, and transformer-based neural networks can learn how to value different words with large amounts of training data. The biggest benefit of a transformer-based network is that the computation can be easily performed in parallel, in contrast to the more traditional and sequential RNN models used for language translation.
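To make “figuring out how much attention to pay to which words” a bit more concrete, here is a minimal sketch of scaled dot-product attention, the core operation in a transformer. It is not GPT-2’s code: the three words and their 2-dimensional vectors are made up for illustration, and a real model would first pass each vector through learned projection matrices, which are omitted here.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # How strongly does this word's query match every word's key?
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is a weighted average of the value vectors.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-dimensional vectors for three words (assumed values).
words = ["the", "unicorn", "spoke"]
vectors = [[0.1, 0.0], [1.0, 2.0], [0.5, 1.0]]

outputs, weights = attention(vectors, vectors, vectors)
for word, row in zip(words, weights):
    print(word, [round(w, 3) for w in row])
```

Each pass through the loop over queries depends only on that word’s own query, not on the result for any other word, which is why the computation parallelizes so easily compared with an RNN that has to process words one after another.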

In a controversial move, OpenAI originally declined to make the GPT-2 model available to researchers, citing concerns about it being used to create “deceptive, biased, or abusive language at scale . . . .” Recently, however, they have released a smaller, less capable version of the model, and are considering other ways to share the research with AI partners.

Anyways…. now you can play with the smaller GPT-2 model at

AI radiology has arrived

This has been predicted for a long time, but AI radiology is here:

A commercial artificial intelligence (AI) system matched the accuracy of over 28,000 interpretations of breast cancer screening mammograms by 101 radiologists. Although the most accurate mammographers outperformed the AI system, it achieved a higher performance than the majority of radiologists.

Artificial intelligence versus 101 radiologists

Tasks that involve recognizing objects or features in images are going to be among the first mastered by convolutional neural networks: radiology, surveillance, counting stuff, etc.

AI beats esports world champion team for first time

Some humans have gotten very good at playing the video game Dota 2. It’s a complex game with over 100 different character types, an in-game economy, and an audience of spectators on Oh, and tournaments in which professional Dota 2 players have earned over $100M. And now the championship team has been crushed by an AI:

Within the simplified bounds of the game, OpenAI Five was an astounding triumph. One thing to look for in evaluating the performance of an AI system on a strategy game is whether it’s merely winning with what gamers call “micro” — the second-to-second positioning and attack skills where a computer’s reflexes are a huge advantage. 

OpenAI Five did have good micro, but it also did well in ways that human players, now that they’ve seen it, may well choose to emulate — suggesting that it didn’t just succeed through superior reflexes. The commentators watching the game criticized OpenAI Five’s eagerness to buy back into the game when its heroes died, for example, but the tactic was borne out — maybe suggesting that human pros should be a bit more willing to pay to rejoin the field. 

And OpenAI had a deeper strategic understanding of the board than the human commentators. When the commentators were asserting that the game looked evenly matched, OpenAI would declare that it perceived a 90 percent chance of victory. (It turns out that soberly announced probability estimates make for great trash talk, and these declarations frequently rattled their opponents OG). To us, the game may have seemed open, but to the computer, it was obviously nearly over.

AI triumphs against the world’s top pro team in strategy game Dota 2

Three points to note here:

  1. Rate of improvement. AIs are improving at an astonishing rate. Chess fell, then Go, now very complex multi-player strategic games like Dota 2. It used to be that game-playing algorithms were customized for specific games and had little applicability to other domains. This is truly a revolution.
  2. Scale of computation. The scale of computation available to the AIs matters a lot. OpenAI, the researchers behind this AI victory, improved on their previous performance by utilizing eight times more training compute. They trained this model on 45,000 years of Dota self-play over 10 realtime months. Good luck humans.
  3. Real-world applications. Dota 2 is a very complex game with many characters making independent real-time judgments as part of teams trying to take over each other’s bases while protecting their own. It’s a complex simulation of war. The real world is of course still more complex, but this is a domain in which the AIs appear to do well. Defense departments around the world are paying attention.

Update: The OpenAI team let their AI play against regular Dota 2 players. Out of 7,257 matches, the AIs won 7,215 (99.4%) and lost just 42.
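For a sense of scale, the numbers in this post can be checked with a few lines of arithmetic. This is just a back-of-the-envelope sketch that takes the quoted figures (7,257 matches, 7,215 wins, 45,000 years of self-play, 10 months of training) at face value; the variable names are mine.

```python
# Public matches against regular players, as quoted above.
matches = 7_257
wins = 7_215
losses = matches - wins
win_rate = wins / matches

# Self-play experience versus wall-clock training time, as quoted above.
self_play_years = 45_000
wall_clock_years = 10 / 12  # 10 realtime months
speedup = self_play_years / wall_clock_years

print(f"losses: {losses}")
print(f"win rate: {win_rate:.1%}")
print(f"game-years of practice per real year: {speedup:,.0f}")
```

On those figures, the AI accumulated on the order of 54,000 years of game experience for every year of real time, which is the part no human team can match.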

U.S. facial recognition also rolling out

Jon Porter, writing for The Verge:

The Department of Homeland Security says it expects to use facial recognition technology on 97 percent of departing passengers within the next four years. The system, which involves photographing passengers before they board their flight, first started rolling out in 2017, and was operational in 15 US airports as of the end of 2018. 

The facial recognition system works by photographing passengers at their departure gate. It then cross-references this photograph against a library populated with face images from visa and passport applications, as well as those taken by border agents when foreigners enter the country.

US facial recognition will cover 97 percent of departing airline passengers within four years

It’s not automated racism, but it’s similar in scope to China’s rollout. Routine facial recognition for tracking is here, like it or not.
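The cross-referencing step in the quote above can be sketched roughly as follows. This is not the DHS system, whose internals the article does not describe. It assumes some face model (not shown) has already turned each photo into a numeric vector, and the traveler names, vectors, and 0.9 threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two vectors by angle; identical directions score 1.0."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(probe, gallery, threshold=0.9):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [
        (name, cosine_similarity(probe, vector))
        for name, vector in gallery.items()
    ]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical library of stored face vectors (assumed values).
gallery = {
    "traveler_a": [0.9, 0.1, 0.3],
    "traveler_b": [0.2, 0.8, 0.5],
    "traveler_c": [0.85, 0.2, 0.35],
}

# Hypothetical vector for the photo taken at the gate.
probe = [0.88, 0.12, 0.32]

candidates = rank_candidates(probe, gallery)
for name, score in candidates:
    print(name, round(score, 3))
```

Note that two different people clear the threshold in this toy example. A search like this returns a ranked list of candidates, not an identity, and how that list is used is where the training and judgment come in.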

Automated Racism in China

Paul Mozur, writing for the New York Times:

Now, documents and interviews show that the authorities are also using a vast, secret system of advanced facial recognition technology to track and control the Uighurs, a largely Muslim minority. It is the first known example of a government intentionally using artificial intelligence for racial profiling, experts said.

The facial recognition technology, which is integrated into China’s rapidly expanding networks of surveillance cameras, looks exclusively for Uighurs based on their appearance and keeps records of their comings and goings for search and review. The practice makes China a pioneer in applying next-generation technology to watch its people, potentially ushering in a new era of automated racism.

One Month, 500,000 Face Scans: How China Is Using A.I. to Profile a Minority

Bill Gates recently said that AI is the new nuclear technology: both promising and dangerous.

Our long-term survival probably requires being good at managing the dangers of increasingly powerful technologies. Not a great start.

AI Transparency Tension: NYPD Sex Chat Bots

The NYPD is using AI chat bots to surface and warn individuals looking to buy sex:

A man texts an online ad for sex.

He gets a text back: “Hi Papi. Would u like to go on a date?” There’s a conversation: what he’d like the woman to do, when and where to meet, how much he will pay.

After a few minutes, the texts stop. It’s not unexpected — women offering commercial sex usually text with several potential buyers at once. So the man, too, usually texts several women at once.

What he doesn’t expect is this: He is texting not with a woman but with a computer program, a piece of artificial intelligence that has taught itself to talk like a sex worker.

A.I. Joins the Campaign Against Sex Trafficking

The article posts an example of an actual chat conversation and it is worth reading to get a sense of the AI capabilities.

Ethics tension. It’s worth noting that many AI ethics frameworks emphasize the importance of informing humans when they are interacting with bots. See also the Google Duplex controversy. This system does the opposite: it is deception by design. How does this fall within an ethical framework? Are we immediately making trade-offs between effectiveness and transparency?