This book is an intervention.
From predicting criminality to inferring sexual orientation, fake and deeply flawed Artificial Intelligence (AI) is rampant. Amidst this feverishly hyped atmosphere, this book interrogates the rise and fall of AI hype, pseudoscience and snake oil. Bringing together different perspectives and voices from across disciplines and countries, it draws connections between injustices inflicted by inappropriate AI. Each chapter unpacks lazy and harmful assumptions made by developers when designing AI tools and systems, and examines the existential underpinnings of the technology itself to ask: why are there so many pointless, and even dangerously flawed, AI systems?
All Meatspace Press publications are free to download, or can be ordered in print from meatspacepress.com.
Book release 14/12/2021.
Contents
Introduction
Frederike Kaltheuner
This book is an intervention
Chapter 1
An interview with Arvind Narayanan
AI Snake Oil, Pseudoscience and Hype
Chapter 2
Abeba Birhane
Cheap AI
Chapter 3
Deborah Raji
The bodies underneath the rubble
Chapter 4
Frederike Kaltheuner
Who am I as data?
Chapter 5
Razvan Amironesei, Emily Denton, Alex Hanna, Hilary Nicole, Andrew Smart
The case for interpretive techniques in machine learning
Chapter 6
Serena Dokuaa Oduro
Do we need AI or do we need Black feminisms? A poetic guide
Chapter 7
James Vincent
How (not) to blog about an intelligent toothbrush
Chapter 8
Alexander Reben
Learn to take on the ceiling
Chapter 9
Gemma Milne
Uses (and abuses) of hype
Chapter 10
Crofton Black
Talking heads
Chapter 11
Adam Harvey
What is a face?
Chapter 12
Andrew Strait
Why automated content moderation won’t save us
Chapter 13
Tulsi Parida, Aparna Ashok
Consolidating power in the name of progress: techno-solutionism and farmer protests in India
Chapter 14
Favour Borokini, Ridwan Oloyede
When fintech meets 60 million unbanked citizens
Chapter 15
Fieke Jansen, Corinne Cath
Algorithmic registers and their limitations as a governance practice
Chapter 16
Aidan Peppin
The power of resistance: from plutonium rods to silicon chips
This project was made possible by the generous support of the Mozilla Foundation through its Tech Policy Fellowship programme.
License: Creative Commons BY-NC-SA
“Much of what is sold commercially today as ‘AI’ is what I call ‘snake oil’. We have no evidence that it works, and based on our scientific understanding of the relevant domains, we have strong reasons to believe that it couldn’t possibly work.”
Arvind Narayanan
Introduction
Frederike Kaltheuner
Not a week passes without some research paper, feature article or product marketing making exaggerated or even entirely unfounded claims about the capabilities of Artificial Intelligence (AI). From academic papers that claim AI can predict criminality, personality or sexual orientation, to the companies that sell these supposed capabilities to law enforcement, border control or human resources departments around the world, fake and deeply flawed AI is rampant.
The current wave of public interest in AI was spurred by the genuinely remarkable progress that has been made with some AI techniques in the past decade. For narrowly defined tasks, such as recognising objects, AI can now perform at the same level as, or even better than, humans. However, that progress, as Arvind Narayanan has argued, does not automatically translate into solving other tasks. In fact, when it comes to predicting any social outcome, using AI is fundamentally dubious. [1]
The ease and frequency with which AI’s real and imagined gains are conflated results in real, tangible harms.
For those subject to automated systems, it can mean the difference between getting a job and not getting a job, between being allowed to cross a border and being denied access. Worse, the ways in which these systems are so often built in practice mean that the burden of proof often falls on those affected to prove that they are in fact who they say they are. On a societal level, widespread belief in fake AI means that we risk redirecting resources to the wrong places. As Aidan Peppin argues in this book, it could also mean that public resistance to the technology ends up stifling those areas where genuine progress is being made.
What makes the phenomenon of fake AI especially curious is the fact that, in many ways, 2020-21 has been a time of great AI disillusionment. The Economist dedicated its entire summer Technology Quarterly to the issue, concluding that “An understanding of AI’s limitations is starting to sink in.” [2] For a technology that has been touted as the solution to virtually every challenge imaginable—from curing cancer, to fighting poverty, predicting criminality, reversing climate change and even ending death—AI has played a remarkably minor role [3] in the global response to a very real challenge the world is facing today, the Covid-19 pandemic. [4] As we find ourselves on the downward slope of the AI hype cycle, this is a unique moment to take stock, to look back and to examine the underlying causes, dynamics, and logics behind the rise and fall of fake AI.
Bringing together different perspectives and voices from across disciplines and countries, this book interrogates the rise and fall of AI hype, pseudoscience, and snake oil. It does this by drawing connections between specific injustices inflicted by inappropriate AI, unpacking lazy and harmful assumptions made by developers when designing AI tools and systems, and examining the existential underpinnings of the technology itself to ask: why are there so many useless, and even dangerously flawed, AI systems?
Any serious writing about AI will have to wrestle with the fact that AI itself has become an elusive term. As every computer scientist will be quick to point out, AI is an umbrella term that’s used for a set of related technologies. Yet while these same computer scientists are quick to offer a precise definition and remind us that much of what we call AI today is in fact machine learning, in the public imagination, the term AI has taken on a meaning of its own. Here, AI is a catch-all phrase used to describe a wide-ranging set of technologies, most of which apply statistical modelling to find patterns in large data sets and make predictions based on those patterns—as Fieke Jansen and Corinne Cath argue in their piece about the false hope that’s placed in AI registers.
Just as AI has become an imprecise word, hype, pseudoscience, and snake oil are frequently used interchangeably to call out AI research or AI tools that claim to do something they either cannot, or should not do. If we look more closely however, these terms are distinct. Each highlights a different aspect of the phenomenon that this book interrogates.
As Abeba Birhane powerfully argues in her essay, Cheap AI, the return of pseudoscience, such as race science, is neither unique nor exclusive to AI research. What is unique is that dusty and long-discredited ideas have found new legitimacy through AI.
Dangerously, they’ve acquired a veneer of innovation, a sheen of progress, even. By contrast, in a wide-ranging interview that considers how much, and how little, has changed since his original talk three years ago, Arvind Narayanan homes in on “AI snake oil”, explaining how it is distinct from pseudoscience. Vendors of AI snake oil use deceptive marketing, fraud, and even scams to sell their products as solutions to problems for which AI techniques are either ill-equipped or completely useless.
The environment in which snake oil and pseudoscience thrive is characterised by genuine excitement, unchallenged hype, bombastic headlines, and billions of dollars of investment, all coupled with a naïve belief in the idea that technology will save us. Journalist James Vincent writes about his first encounter with a PR pitch for an AI toothbrush and reflects on the challenges of covering hyped technology without further feeding unrealistic expectations. As someone who worked as a content moderator for Google in the mid-2010s, Andrew Strait makes a plea against placing too much hope in automated content moderation.
Each piece in this book provides a different perspective and proposes different answers to problems which circle around the shared question of what is driving exaggerated, flawed or entirely unfounded hopes and expectations about AI. Against broad-brush claims, they call for precise thinking and scrupulous expression.
For Deborah Raji, the lack of care with which engineers so often design algorithmic systems today belongs to a long history of engineering irresponsibility in constructing material artefacts like bridges and cars. Razvan Amironesei, Emily Denton, Alex Hanna, Andrew Smart and Hilary Nicole describe how benchmark datasets contribute to the belief that algorithmic systems are objective or scientific in nature. The artist Adam Harvey picks apart what exactly defines a “face” for AI.
A recurring theme throughout this book is that harms and risks are unevenly distributed.
Tulsi Parida and Aparna Ashok examine the effects of inappropriately applied AI through the lens of the Indian concept of jugaad. Favour Borokini and Ridwan Oloyede warn of the dangers that come with AI hype in Nigeria’s fintech sector.
Amidst this feverishly hyped atmosphere, this book makes the case for nuance. It invites readers to carefully separate the real progress that AI research has made in the past few years from fundamentally dubious or dangerously exaggerated claims about AI’s capabilities.
We are not heading towards Artificial General Intelligence (AGI). We are not locked in an AI race that can only be won by those countries with the least regulation and the most investment.
Instead, the real advances in AI pose both old and new challenges that can only be tamed if we see AI for what it is. Namely, a powerful technology that at present is produced by only a handful of companies with workforces that are not representative of those who are disproportionately affected by its risks and harms.
Notes
1. Narayanan, A. (2019) How to recognize AI snake oil. Princeton University, Department of Computer Science. https://www.cs.princeton.edu/~arvindn/talks/MIT-STS-AI-snakeoil.pdf
2. Cross, T. (2020, 13 June) An understanding of AI’s limitations is starting to sink in. The Economist. https://www.economist.com/technology-quarterly/2020/06/11/an-understanding-of-ais-limitations-is-starting-to-sink-in
3. Mateos-Garcia, J., Klinger, J., Stathoulopoulos, K. (2020) Artificial Intelligence and the Fight Against COVID-19. Nesta. https://www.nesta.org.uk/report/artificial-intelligence-and-fight-against-covid-19/
4. Peach, K. (2020) How the pandemic has exposed AI’s limitations. Nesta. https://www.nesta.org.uk/blog/how-the-pandemic-has-exposed-ais-limitations/
“Just as the car manufacturer called out by Nader shifted blame onto car dealerships for failing to recommend tyre pressures to “correct” the Corvair’s faulty steering, algorithm developers also seek scapegoats for their own embarrassing failures.”
Deborah Raji
AI Snake Oil, Pseudoscience and Hype
An interview with Arvind Narayanan
The term “snake oil” originates from the United States in the mid-19th century, when Chinese immigrants working on the railroads introduced their American counterparts to a traditional treatment for arthritis and bursitis made of oil derived from the Chinese water snake. The effectiveness of the oil, which is high in omega-3 acids, and its subsequent popularity prompted some profiteers to get in on a lucrative market. These unscrupulous sellers peddled quack remedies which contained inferior rattlesnake oil or completely arbitrary ingredients to an unsuspecting public. By the early 20th century, “snake oil” had taken on its modern, pejorative meaning to become a byword for fake miracle cures, groundless claims, and brazen falsehoods.
Much of what is sold commercially as AI is snake oil, says Arvind Narayanan, Associate Professor for Computer Science at Princeton University—we have no evidence that it works, and based on our scientific understanding, we have strong reasons to believe that it couldn’t possibly work. And yet, companies continue to market AI products that claim to predict anything from crime, to job performance, sexual orientation or gender. What makes the public so susceptible to these claims is the fact that in recent years, in some domains of AI research, there has been genuine and impressive progress. How, then, did AI become attached to so many products and services of questionable or unverifiable quality, and slim to non-existent usefulness?
Frederike Kaltheuner spoke to Arvind Narayanan via Zoom in January 2021: Frederike from lockdown in Berlin, and Arvind from his office in Princeton.
F: Your talk, How to Recognise AI Snake Oil, went viral in 2019. What inspired you to write about AI snake oil, and were you surprised by the amount of attention your talk received?
A: Throughout the last 15 years or so of my research, one of my regular motivations for getting straight into a research topic has been hype in the industry around something. That’s how I first got started on privacy research. My expertise, the desire for consumer protection, and the sense that industry hype had got out of control all converged in the case of AI snake oil. The AI narrative had been getting somewhat unhinged from reality for years, but the last straw was seeing how prominent these AI-based hiring companies had become. How many customers they have, and how many millions of people have been put through these demeaning video interviews where AI would supposedly figure out someone’s job suitability based on how they talked and other irrelevant factors. That’s really what triggered me to feel “I have to say something here”.
I was very surprised by its reception. In addition to the attention on Twitter, I received something like 50 invitations for papers, books… That had never happened to me before. In retrospect I think many people suspected what was happening was snake oil but didn’t feel they had the expertise or authority to say anything. People were speaking up of course, but perhaps weren’t being taken as seriously because they didn’t have the “Professor of Computer Science” title. That we still put so much stock in credentials is, I think, unfortunate. So when I stood up and said this, I was seen as someone who had the authority. People really felt it was an important counter to the hype.
F: … and it is still important to counter the hype today, especially in policy circles. Just how much of what is usually referred to as AI falls under the category of AI snake oil? And how can we recognise it?
A: Much of what is sold commercially today as “AI” is what I call “snake oil”. We have no evidence that it works, and based on our scientific understanding of the relevant domains, we have strong reasons to believe that it couldn’t possibly work. My educated guess is that this is because “AI” is a very loose umbrella term. This happens with buzzwords in the tech industry (like “blockchain”): after a point nobody really knows what it means. Some of what is sold as AI is not snake oil. There has been genuinely remarkable scientific progress. But because of this, companies put all kinds of systems under the AI umbrella—including those you would have more accurately called regression 20 years ago, or statistics, except that statistics asks rigorous questions about whether something is working and how we can quantify it. Because of the hype, people have skipped this step and the public and policymakers have bought into it.
Surveys show that the public largely seems to believe that Artificial General Intelligence (AGI) is right around the corner—which would be a turning point in the history of human civilisation! I don’t think that’s true at all, and most experts don’t either. The idea that our current progress with AI would lead to AGI is as absurd as building a taller and taller ladder in the hope of reaching the moon. There are fundamental differences between what we’re building now and what it would take to build AGI. AGI is not task-specific, so that’s in part why I think it will take something fundamentally new and different to get there.
F: To build on your metaphor—if the genuinely remarkable scientific progress under the AI umbrella is a ladder to the moon, then AGI would take an entirely different ladder altogether. AI companies are pointing at genuine progress to make claims that require an entirely different kind of progress altogether?
A: Right. There’s this massive confusion around what AI is, which companies have exploited to create hype. Point number two is that the types of applications of so-called “AI” are fundamentally dubious. One important category is predicting the future, that is, predicting social outcomes. Which kids might drop out of school? Who might be arrested for a crime in the future? Who should we hire? These are all contingent on an incredible array of factors that we still have trouble quantifying—and it’s not clear if we ever will.
A few scientific studies have looked rigorously at how good we are at predicting these future social outcomes and shown that it’s barely better than random. We can’t really do much better than simple regression models with a few variables. My favourite example is the “Fragile Families Challenge” led by my Princeton colleague Professor Matt Salganik, along with colleagues and collaborators around the world. Hundreds of participants used state-of-the-art machine learning techniques and a phenomenal dataset that tracked “at-risk” kids over a decade to try to predict (based on a child’s circumstances today) what their outcomes might be six years in the future. The negative results are very telling. No team, on any of these social outcomes, could produce predictions that were significantly better than random prediction. This is a powerful statement about why trying to predict future social outcomes is a fundamentally different type of task to those that AI has excelled at. These things don’t work well and we shouldn’t expect them to.
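The pattern the challenge documented can be sketched in a few lines. The code below is only an illustration with synthetic data and scikit-learn, not the Fragile Families data or any challenge entry: when an outcome is dominated by factors the features do not capture, a flexible model barely improves on a simple few-variable regression, and both explain little of the variance on held-out cases.

```python
# Illustrative sketch only (synthetic data, scikit-learn assumed); it is not
# the Fragile Families data or any challenge entry. When most of an outcome
# is driven by factors the features do not capture, a flexible model barely
# beats a simple few-variable regression, and both explain little variance.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, n_features = 4000, 200

X = rng.normal(size=(n, n_features))
# Only four features carry weak signal; the rest of the outcome stands in
# for unmeasured life circumstances and is pure noise here.
y = 0.3 * X[:, :4].sum(axis=1) + rng.normal(scale=2.0, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

baseline = LinearRegression().fit(X_tr[:, :4], y_tr)          # four variables
flexible = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)

print("4-variable regression, held-out R^2:", round(baseline.score(X_te[:, :4], y_te), 3))
print("gradient boosting,     held-out R^2:", round(flexible.score(X_te, y_te), 3))
```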
F: Which domains seem to have a lot of snake oil in them and why?
A: My educated guess is that to understand the prevalence of AI snake oil it’s better to look at the consumers/buyers than the sellers. Companies will spring up around any type of technology for which high demand exists. So why are people willing to buy certain types of snake oil? That’s interesting.
I think it’s because certain domains (like hiring) are so broken that even an elaborate random-number generator (which is what I think some of these AI tools are) is an improvement over what people are doing today. And I don’t make this statement lightly. In a domain like hiring we—culturally as well as in business—have a hard time admitting that there is not much we can do to predict who’s going to be most productive in a job. The best we can do is have some basic tests of preparation, ability and competence, and beyond that just accept that it’s essentially a lottery. I think we’re not willing to accept that so much success in life is just randomness, and in our capitalistic economy there’s this constant push for more “rationality”, whether or not that makes sense.
So the way hiring works is a) fundamentally arbitrary because these outcomes are hard to predict, and b) there’s a lot of bias along all axes that we know about. What these tools promise to do is cut down on bias that is relatively easy to statistically quantify, but it’s much harder to prove that these tools are actually selecting candidates who will do better than candidates who were not selected. The companies who are buying these tools are either okay with that or don’t want to know. Look at it from their perspective: they might have a thousand applications for two positions. It’s an enormous investment of time to read those applications and interview those candidates, and it’s frustrating not to be able to make decisions on a clear candidate ranking. And against this backdrop emerges a tool that promises to be AI and has a veneer of scientific sophistication. It says it will cut down on bias and find the best candidates in a way that is much cheaper to their company than a traditional interview and hiring process. That seems like a great deal.
F: So what you’re saying is the domains in which snake oil is more prevalent are the ones where either the market is broken or where we have a desire for certainty that maybe doesn’t exist?
A: I hesitate to provide some sort of sweeping characterisation that explains where there is a lot of snake oil. My point is more that if we look at the individual domains, there seem to be some important reasons why there are buyers in that domain. We have to look at each specific domain and see what is specifically broken there. There’s also a lot of AI snake oil that’s being sold to governments. I think what’s going on there is that there’s not enough expertise in procurement departments to really make nuanced decisions about whether this algorithmic tool can do what it claims.
F: Do you think this problem is limited to products and services that are being sold or is this also something you observe within the scientific community?
A: A lot of my thinking evolved through the “Limits to Prediction” course that I co-taught with Professor Matt Salganik, whom I mentioned earlier. We wanted to get a better scientific understanding of when prediction is even possible, and the limits of its accuracy. One of the things that stuck out for me is that there’s also a lot of misguided research and activity around prediction where we have to ask: what is even the point?
One domain is political prediction. There’s a great book by Eitan Hersh which criticises the idea of politics, and even political activism, as a sport—a horse race that turns into a hobby or entertainment. What I find really compelling about this critique is what it implies about efforts like FiveThirtyEight that involve a lot of statistics and technology for predicting the outcomes of various elections. Why? That’s the big question to me. Of course, political candidates themselves might want to know where to focus their campaigning efforts. Political scientists might want to understand what drives people to vote—those are all great. But why as members of the public…?
Let me turn this inwards. I’m one of those people who refreshes the New York Times needle and FiveThirtyEight’s predictions. Why do I participate in this way? I was forced to turn that critique on myself, and I realised it’s because uncertainty is so uncomfortable. Anything that promises to quell the terror that uncertainty produces and tell us that “there’s an 84% chance this candidate will win” just fills a huge gap in our subconscious vulnerabilities. I think this is a real problem. It’s not just FiveThirtyEight. There’s a whole field of research to figure out how to predict elections. Why? The answer is not clear at all. So, it’s not just in the commercial sphere, there’s also a lot of other misguided activity around prediction. We’ve heard a lot about how these predictions have not been very successful, but we’ve heard less about why people are doing these predictions at all.
F: Words like “pseudoscience” and “snake oil” are often thrown around to denote anything from harmful AI, to poorly-done research, to scams, essentially. But you chose your words very carefully. Why “misguided research” rather than, let’s say, “pseudoscience”?
A: I think all these terms are distinct, at least somewhat. Snake oil describes commercial products that are sold as something that’s going to solve a problem. Pseudoscience is where scientific claims are being made, but they’re based on fundamentally shaky assumptions. The classic example is, of course, a paper on supposedly predicting criminality from facial images. When I say “misguided research”, a good example is electoral prediction by political scientists. This is very, very careful research conducted by very rigorous researchers. They know their statistics, I don’t think they’re engaged in pseudoscience. By “misguided” I mean they’re not asking the question of “who is this research helping?”
F: That’s really interesting. The question you’re asking then is epistemological. Why do you think this is the case and what do you see as the problems arising from not asking these questions?
A: That’s a different kind of critique. It’s not the same level of irresponsibility as some of this harmful AI present in academia and on the street. Once an academic community decides something is an important research direction, you stop asking these questions. It’s frankly difficult to ask that question for every paper that you write. But sometimes an entire community starts down a path that ultimately leads nowhere and is not going to help anybody. It might even have some harmful side-effects. There’s interesting research coming out suggesting that the false confidence people get from seeing these probability scores actually depresses turnout. This might be a weird thing to say right after an election that saw record levels of turnout, but we don’t know whether even more people might have voted had it not been for this entire industry of predicting elections, and splashing those predictions on the frontpages. This is why misguided research is, I think, a separate critique.
F: Moving onto a different theme, I have two questions on the limit of predictability. It seems like every other year a research paper tries to predict criminality. The other one for me that surprisingly doesn’t die is a 2017 study by two Stanford researchers on predicting homosexuality from faces. There are many, many problems with this paper, but what still fascinates me is that the conversations with policymakers and journalists often revolved around “Well maybe we can’t predict this now, but who knows if we will be able to predict it in future?”. In your talk you said that this is an incomplete categorisation of tasks that AI can be used to solve—and I immediately thought of predicting identity. It’s futile, but the reason why ultimately lies somewhere else. It’s more a question of who we think has the ultimate authority about who defines who we are. It’s an ontological question rather than one about accuracy or biology. I am curious how you refute this claim that AI will be able to predict things in the future, and place an inherent limit on what can be predicted?
A: If we look at the authors of the paper on predicting sexual orientation, one of their main supposed justifications for writing the paper is they claim to be doing this in the interest of the gay community. As repressive governments want to identify sexuality through photos and social media to come after people, they think it’s better for this research to be out there for everybody to see and take defensive measures.
I think that argument makes sense in some domains like computer security. It absolutely does not make sense in this domain. Doing this research is exactly the kind of activity that gives a veneer of legitimacy to an oppressive government who says “Look! There’s a peer-reviewed research paper and it says that this is scientifically accurate, and so we’re doing something that’s backed by science!” Papers like this give ammunition to people who might do such things for repressive ends. The other part is that if you find a vulnerability in a computer program, it’s very easy to fix—finding the vulnerability is the hard part. It’s very different in this case. If it is true (and of course it’s very doubtful) that it’s possible to accurately infer sexual orientation from people’s images on social media, what are these authors suggesting people do to protect themselves from oppressive governments other than disappear from the internet?
F: I think that the suggestion was “accept the death of privacy as a fact and adapt to social norms” which… yeah…
A: Right. I would find the motivations for doing this research in the first place to be very questionable. Similarly, predicting gender. One of the main applications is to put a camera in the back of the taxi that can infer the rider’s gender and show targeted advertisements on the little television screen. That’s one of the main applications that I’m seeing commercially. Why? You know… I think we should push back on that application in the first place. And if none of these applications make sense, we should ask why people are even working on predicting gender from facial images.
F: So you would rephrase the question and not even engage in discussions about accuracy, and just ask whether we should be doing this in the first place?
A: That’s right. I think there are several kinds of critique for questionable uses of AI. There’s the bias critique, the accuracy critique, and the questionable-application critique. I think these critiques are separate (there’s often a tendency to confuse them) and what I tried to do in the AI Snake Oil talk is focus on one particular critique, the critique of accuracy. But that’s not necessarily the most relevant critique in all cases.
F: Let’s talk about AI and the current state of the world. I was moderately optimistic that there was less AI solutionism in response to Covid-19 than I feared. Could this be a positive indicator that the debate has matured in the past two years?
A: It’s hard to tell, but that’s a great question. It’s true that companies didn’t immediately start blowing the AI horn when Covid-19 happened, and that is good news. But it’s hard to tell if that’s because they just didn’t see enough commercial opportunity there or because the debate has in fact matured.
F: There could be various explanations for that…
A: Yeah. There is a lot of snake oil and misguided AI in the medical domain. You see a lot where machine learning was tested on what is called a “retrospective test”, where you collect data first from a clinical setting, develop your algorithm on that data and then just test the algorithm on a different portion of the same data. That is a very misleading type of test, because the data might have been collected from one hospital but when you test it on a different hospital in a different region—with different cultural assumptions, different demographics—where the patterns are different, the tool totally fails. We have papers that look at what happens if you test these retrospectively-developed tools in a prospective clinical setting: there’s a massive gap in accuracies. We know there’s a lot of this going on in the medical machine learning domain, but whether the relative dearth of snake oil AI for Covid-19 is due to the debate maturing or some other factor, who can tell.
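The retrospective-versus-prospective gap described here can be illustrated with a toy example. The sketch below assumes scikit-learn and uses entirely synthetic data, not any real clinical tool: a model validated on a held-out slice of “Hospital A” leans on a site-specific artifact and looks far better there than on “Hospital B”, where that artifact carries no information.

```python
# Toy illustration (synthetic data, scikit-learn assumed), not a real clinical
# tool: a "retrospective" test on held-out data from Hospital A rewards a
# site-specific artifact, so reported accuracy drops when the same model is
# applied to Hospital B, where the artifact is uninformative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_hospital(n, artifact_strength):
    signal = rng.normal(size=n)                           # genuine predictor
    label = (signal + rng.normal(size=n) > 0).astype(int)
    # A site-specific quirk (e.g. a scanner setting) that happens to track
    # the label only at this hospital.
    artifact = label * artifact_strength + rng.normal(scale=0.3, size=n)
    return np.column_stack([signal, artifact]), label

X_a, y_a = make_hospital(4000, artifact_strength=2.0)   # artifact tracks the label here
X_b, y_b = make_hospital(4000, artifact_strength=0.0)   # ...but not at Hospital B

X_tr, X_te, y_tr, y_te = train_test_split(X_a, y_a, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)

print("Retrospective test (held-out Hospital A):", round(model.score(X_te, y_te), 3))
print("Prospective proxy (Hospital B):          ", round(model.score(X_b, y_b), 3))
```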
F: One thing I was wondering… do you feel like you’ve made an impact?
A: (laughs)
F: As in, are you seeing less snake oil now than you did, say two years ago?
A: That’s hard to know. I think there is certainly more awareness among the people who’ve been doing critical AI work. I’m seeing less evidence that awareness is coming through in journalism, although I’m optimistic that that will change. I have a couple of wish-list items for journalists who often unwittingly provide cover for overhyped claims. One is: please stop attributing agency to AI. I don’t understand why journalists do this (presumably it drives clicks?) but it’s such a blatantly irresponsible thing to do. Headlines like “AI discovered how to cure a type of cancer”. Of course it’s never AI that did this. It’s researchers, very hardworking researchers, who use AI machine learning tools like any other tool. It’s both demeaning to the researchers who did that work and creates massive confusion among the public when journalists attribute agency to AI. There’s no reason to do that, especially in headlines.
And number two is that it’s virtually never meaningful to provide an accuracy number, like “AI used to predict earthquakes is 93% accurate”. I see that all the time. It never makes sense in a headline and most of the time never makes sense even in the body of the article. Here’s why: I can take any classifier and make it have just about any accuracy I want by changing the data distribution on which I do the test. I can give it arbitrarily easy instances to classify, I can give it arbitrarily hard instances to classify. That choice is completely up to the researcher or the company that’s doing the test. In most cases there’s not an agreed-upon standard, so unless you’re reporting accuracies on a widely-used, agreed-upon benchmark dataset (which is virtually never the case; it’s usually the company deciding on its own how to do the test) it never makes sense to report an accuracy number like that without a lengthy explanation and many, many other caveats. So don’t provide these oversimplified headline accuracy numbers. Try to provide the caveats and give qualitative descriptions of accuracy. What does this mean? What are the implications if you were to employ this in a commercial application? How often would you have false positives? Those are the kinds of things that policymakers should know, not these oversimplified accuracy numbers.
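The point about headline accuracy numbers is easy to demonstrate. The sketch below (synthetic data, scikit-learn assumed) keeps a single trained classifier fixed and reports three very different accuracies simply by choosing easier or harder test instances.

```python
# Minimal sketch (synthetic data, scikit-learn assumed): one fixed classifier,
# three different headline accuracies, obtained only by changing how hard the
# test instances are -- i.e. by changing the data distribution of the test set.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def sample(n, separation):
    # Two classes whose means sit `separation` apart: a bigger gap means easier cases.
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 2)) + np.where(y[:, None] == 1, separation, -separation) / 2
    return X, y

X_train, y_train = sample(5000, separation=2.0)
clf = LogisticRegression().fit(X_train, y_train)

for name, sep in [("easy test set", 6.0), ("typical test set", 2.0), ("hard test set", 0.5)]:
    X_test, y_test = sample(5000, separation=sep)
    print(f"{name}: accuracy = {clf.score(X_test, y_test):.2f}")
```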
Want to read more? All Meatspace Press publications are free to download, or can be ordered in print from meatspacepress.com.
Design notes
AI-constrained design
By Carlos Romo-Melgar, John Philip Sage and Roxy Zeiher
The design of this book explores the use of artificial intelligence to conceptualise its main visual elements. From an optimistic standpoint of AI abundance, the book design is a semi-fiction, a staged and micromanaged use of a GAN (Generative Adversarial Network), an unsupervised machine learning framework in which two neural networks contest with each other in order to generate visual output. Stretching the narrative, this book could be framed as a/the (first) book designed by an AI. In this scenario, the collaborating AI (more like the AI-as-head-of-design-that-doesn’t-know-how-to-design) has informed, but also constrained, the possibilities for working visually with the pages.

The design strategy adopts the Wizard of Oz Technique, a method that originated in interaction design, where something that is seemingly autonomous is in reality disguising the work of humans ‘as a proxy for the system behind the scenes’. [1] The use of the GAN, which a reader might expect to act as a simplification, a symbol of technological ergonomics, has instead complicated the process. As a result, the contents contort around the spaces that the AI imagination left for them, revealing an apparently spontaneous visual language.
The book features results from two separate datasets, addressing the overall layout composition, and an (overly sensitive) recognition algorithm which targets all instances of ‘AI’, ‘ai’ and ‘Ai’, regardless of their position or meaning.
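As a rough stand-in for that matcher (the book itself uses InDesign GREP styles, so the Python regular expression below is only an assumed analogue), a pattern with no word boundaries fires on every ‘ai’ it meets, including those buried inside unrelated words.

```python
# Hypothetical stand-in for the "overly sensitive" matcher described above
# (the book uses InDesign GREP styles; this regex is only an assumed analogue).
# With no word boundaries it flags every 'AI', 'Ai' or 'ai', even inside words.
import re

pattern = re.compile(r"[Aa][Ii]")  # deliberately no word boundaries

text = "AI hype aims to maintain momentum, the authors said."
print(pattern.findall(text))  # ['AI', 'ai', 'ai', 'ai', 'ai']
```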
MetaGAN v.3 Layouts
The dataset used to produce the compositions above is a collection of book scans. The purpose of an image GAN is to create new instances by detecting, deconstructing and subsequently reconstructing existing patterns, speculating about continuations. Reusing existing layout materials, conceived by human creativity, opens up the discussion of AI creativity. The outcomes, which could be perceived as surprising, original and novel, are however subject to human selection and valuation. In training the MetaGAN, the dissimilarity of the data points, in combination with the small size of the dataset (200 images), led to the idiosyncrasy of overfitting. An overfitted model generates outcomes ‘that correspond too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably’. [2]
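The overfitting pattern described here is easy to reproduce at small scale. The sketch below assumes scikit-learn and synthetic points standing in for the 200 book scans (it is not the MetaGAN itself): a near-interpolating polynomial fitted to a dozen samples achieves almost zero training error while its error on new data explodes.

```python
# Small sketch of overfitting (synthetic data, scikit-learn assumed; it stands
# in for the 200 book scans, not the MetaGAN itself): a high-capacity model
# fitted to very few examples matches them almost exactly but fails on new data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(2)

def true_fn(x):
    return np.sin(2 * np.pi * x)

x_train = rng.uniform(0, 1, size=12)
y_train = true_fn(x_train) + rng.normal(scale=0.1, size=12)
x_test = rng.uniform(0, 1, size=200)
y_test = true_fn(x_test) + rng.normal(scale=0.1, size=200)

for degree in (3, 11):  # modest capacity vs. nearly enough terms to interpolate
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train[:, None], y_train)
    train_mse = mean_squared_error(y_train, model.predict(x_train[:, None]))
    test_mse = mean_squared_error(y_test, model.predict(x_test[:, None]))
    print(f"degree {degree:2d}: train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```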


AI type results
These AI letterings are the results of a GAN using a dataset containing logos from various AI-related brands (or belonging to Anguilla, whose country code top-level domain is ‘.ai’). The use of these characters is indeed automated in the design of the book, but it is done using GREP styles.
Notes
1. Martin, B. & Hanington, B. (2012) Universal Methods of Design. Beverly, MA: Rockport Publishers. p. 204.
2. https://www.lexico.com/definition/overfitting
Contributors
[In order of appearance]
Abeba Birhane is a cognitive science PhD candidate at the Complex Software Lab, University College Dublin, Ireland, and Lero, the Science Foundation Ireland Research Centre for Software.
Deborah Raji is a Mozilla fellow, interested in algorithmic auditing. She also works closely with the Algorithmic Justice League initiative to highlight bias in deployed products.
Frederike Kaltheuner is a tech policy analyst and researcher. She is also the Director of the European AI Fund, a philanthropic initiative to strengthen civil society in Europe.
Dr Razvan Amironesei is a research fellow in data ethics at the University of San Francisco and a visiting researcher at Google, currently working on key topics in algorithmic fairness.
Dr Emily Denton is a Senior Research Scientist on Google’s Ethical AI team, studying the norms, values, and work practices that structure the development and use of machine learning datasets.
Dr Alex Hanna is a sociologist and researcher at Google.
Hilary Nicole is a researcher at Google.
Andrew Smart is a researcher at Google working on AI governance, sociotechnical systems, and basic research on conceptual foundations of AI.
Serena Dokuaa Oduro is a writer and policy researcher aiming for algorithms to uplift Black communities. She is the Policy Research Analyst at Data & Society.
James Vincent is a senior reporter for The Verge who covers artificial intelligence and other things. He lives in London and loves to blog.
Alexander Reben is an MIT-trained artist and technologist who explores the inherently human nature of the artificial.
Gemma Milne is a Scottish science and technology writer and PhD researcher in Science & Technology Studies at University College London. Her debut book is Smoke & Mirrors: How Hype Obscures the Future and How to See Past It (2020).
Dr. Crofton Black is a writer and investigator. He leads the Decision Machines project at The Bureau of Investigative Journalism. He has a PhD in the history of philosophy from the Warburg Institute, London.
Adam Harvey is a researcher and artist based in Berlin. His most recent project, Exposing.ai, analyses the information supply chains of face recognition training datasets.
Andrew Strait is a former Legal Policy Specialist at Google and works on technology policy issues. He holds an MSc in Social Science of the Internet from the Oxford Internet Institute.
Tulsi Parida is a socio-technologist currently working on AI and data policy in fintech. Her previous work has been in edtech, with a focus on responsible and inclusive learning solutions.
Aparna Ashok is an anthropologist, service designer, and AI ethics researcher. She specialises in ethical design of automated decision-making systems.
Fieke Jansen is a doctoral candidate at Cardiff University. Her research is part of the Data Justice project funded by ERC Starting Grant (no.759903).
Dr Corinne Cath is a recent graduate of the Oxford Internet Institute's doctoral programme.
Aidan Peppin is a Senior Researcher at the Ada Lovelace Institute. He researches the relationship between society and technology, and brings public voices to ethical issues of data and AI.
Fake AI
Edited by: Frederike Kaltheuner
Publisher: Meatspace Press (2021)
Weblink: meatspacepress.com
Design: Carlos Romo-Melgar, John Philip Sage and Roxy Zeiher
Copy editors: David Sutcliffe and Katherine Waters
Format: Paperback and pdf
Printed by: Petit. Lublin, Poland.
Paper: Munken Print White 20 - 90 gsm
Set in: Roobert and Times New Roman
Length: 206 pages
Language: English
Product code: MSP112101
ISBN (paperback): 978-1-913824-02-0
ISBN (pdf, e-book): 978-1-913824-03-7
License: Creative Commons BY-NC-SA
For press requests, please email: mail [a] frederike-kaltheuner.com