As more and more problems with AI have surfaced, including biases around race, gender, and age, many tech companies have installed “ethical AI” teams ostensibly dedicated to identifying and mitigating such issues.
Twitter’s META unit was more progressive than most in publishing details of problems with the company’s AI systems, and in allowing outside researchers to probe its algorithms for new issues.
Last year, after Twitter users noticed that a photo-cropping algorithm seemed to favor white faces when choosing how to trim images, Twitter took the unusual decision to let its META unit publish details of the bias it uncovered. The group also launched one of the first ever “bias bounty” contests, which let outside researchers test the algorithm for other problems. Last October, Chowdhury’s team also published details of unintentional political bias on Twitter, showing how right-leaning news sources were, in fact, promoted more than left-leaning ones.
Many outside researchers saw the layoffs as a blow, not just for Twitter but for efforts to improve AI. “What a tragedy,” Kate Starbird, an associate professor at the University of Washington who studies online disinformation, wrote on Twitter.
“The META team was one of the only good case studies of a tech company running an AI ethics group that interacts with the public and academia with substantial credibility,” says Ali Alkhatib, director of the Center for Applied Data Ethics at the University of San Francisco.
Alkhatib says Chowdhury is incredibly well thought of within the AI ethics community and her team did genuinely valuable work holding Big Tech to account. “There aren’t many corporate ethics teams worth taking seriously,” he says. “This was one of the ones whose work I taught in classes.”
Mark Riedl, a professor studying AI at Georgia Tech, says the algorithms that Twitter and other social media giants use have a huge impact on people’s lives, and need to be studied. “Whether META had any impact inside Twitter is hard to discern from the outside, but the promise was there,” he says.
Riedl adds that letting outsiders probe Twitter’s algorithms was an important step toward more transparency and understanding of issues around AI. “They were becoming a watchdog that could help the rest of us understand how AI was affecting us,” he says. “The researchers at META had outstanding credentials with long histories of studying AI for social good.”
As for Musk’s idea of open-sourcing the Twitter algorithm, the reality would be far more complicated. There are many different algorithms that affect the way information is surfaced, and it’s challenging to understand them without the real-time data they are being fed in terms of tweets, views, and likes.
The idea that there is one algorithm with explicit political leaning might oversimplify a system that can harbor more insidious biases and problems. Uncovering these is precisely the kind of work that Twitter’s META group was doing. “There aren’t many groups that rigorously study their own algorithms’ biases and errors,” says Alkhatib at the University of San Francisco. “META did that.” And now, it doesn’t.
The character of conflict between nations has fundamentally changed. Governments and militaries now fight on our behalf in the “gray zone,” where the boundaries between peace and war are blurred. They must navigate a complex web of ambiguous and deeply interconnected challenges, ranging from political destabilization and disinformation campaigns to cyberattacks, assassinations, proxy operations, election meddling, or perhaps even human-made pandemics. Add to this list the existential threat of climate change (and its geopolitical ramifications) and it is clear that the description of what now constitutes a national security issue has broadened, each crisis straining or degrading the fabric of national resilience.
Traditional analysis tools are poorly equipped to predict and respond to these blurred and intertwined threats. Instead, in 2022 governments and militaries will use sophisticated and credible real-life simulations, putting software at the heart of their decision-making and operating processes. The UK Ministry of Defence, for example, is developing what it calls a military Digital Backbone. This will incorporate cloud computing, modern networks, and a new transformative capability called a Single Synthetic Environment, or SSE.
This SSE will combine artificial intelligence, machine learning, computational modeling, and modern distributed systems with trusted data sets from multiple sources to support detailed, credible simulations of the real world. This data will be owned by critical institutions, but will also be sourced via an ecosystem of trusted partners, such as the Alan Turing Institute.
An SSE offers a multilayered simulation of a city, region, or country, including high-quality mapping and information about critical national infrastructure, such as power, water, transport networks, and telecommunications. This can then be overlaid with other information, such as smart-city data, information about military deployment, or data gleaned from social listening. From this, models can be constructed that give a rich, detailed picture of how a region or city might react to a given event: a disaster, epidemic, or cyberattack or a combination of such events organized by state enemies.
Defense synthetics are not a new concept. However, previous solutions have been built in a standalone way that limits reuse, longevity, choice, and—crucially—the speed of insight needed to effectively counteract gray-zone threats.
National security officials will be able to use SSEs to identify threats early, understand them better, explore their response options, and analyze the likely consequences of different actions. They will even be able to use them to train, rehearse, and implement their plans. By running thousands of simulated futures, senior leaders will be able to grapple with complex questions, refining policies and complex plans in a virtual world before implementing them in the real one.
One key question that will only grow in importance in 2022 is how countries can best secure their populations and supply chains against dramatic weather events coming from climate change. SSEs will be able to help answer this by pulling together regional infrastructure, networks, roads, and population data, with meteorological models to see how and when events might unfold.
Delphi taps the fruits of recent advances in AI and language. Feeding very large amounts of text to algorithms that use mathematically simulated neural networks has yielded surprising advances.
In June 2020, researchers at OpenAI, a company working on cutting-edge AI tools, demonstrated a program called GPT-3 that can predict, summarize, and auto-generate text with what often seems like remarkable skill, although it will also spit out biased and hateful language learned from text it has read.
The researchers behind Delphi also asked ethical questions of GPT-3. They found its answers agreed with those of the crowd workers just over 50 percent of the time—little better than a coin flip.
Improving the performance of a system like Delphi will require different AI approaches, potentially including some that allow a machine to explain its reasoning and indicate when it is conflicted.
The idea of giving machines a moral code stretches back decades both in academic research and science fiction. Isaac Asimov’s famous Three Laws of Robotics popularized the idea that machines might follow human ethics, although the short stories that explored the idea highlighted contradictions in such simplistic reasoning.
Choi says Delphi should not be taken as providing a definitive answer to any ethical questions. A more sophisticated version might flag uncertainty, because of divergent opinions in its training data. “Life is full of gray areas,” she says. “No two human beings will completely agree, and there’s no way an AI program can match people’s judgments.”
Other machine learning systems have displayed their own moral blind spots. In 2016, Microsoft released a chatbot called Tay designed to learn from online conversations. The program was quickly sabotaged and taught to say offensive and hateful things.
Efforts to explore ethical perspectives related to AI have also revealed the complexity of such a task. A project launched in 2018 by researchers at MIT and elsewhere sought to explore the public’s view of ethical conundrums that might be faced by self-driving cars. They asked people to decide, for example, whether it would be better for a vehicle to hit an elderly person, a child, or a robber. The project revealed differing opinions across different countries and social groups. Those from the US and Western Europe were more likely than respondents elsewhere to spare the child over an older person.
Some of those building AI tools are keen to engage with the ethical challenges. “I think people are right to point out the flaws and failures of the model,” says Nick Frost, CEO of Cohere, a startup that has developed a large language model that is accessible to others via an API. “They are informative of broader, wider problems.”
Cohere devised ways to guide the output of its algorithms, which are now being tested by some businesses. It curates the content that is fed to the algorithm and trains the algorithm to learn to catch instances of bias or hateful language.
Frost says the debate around Delphi reflects a broader question that the tech industry is wrestling with—how to build technology responsibly. Too often, he says, when it comes to content moderation, misinformation, and algorithmic bias, companies try to wash their hands of the problem by arguing that all technology can be used for good and bad.
When it comes to ethics, “there’s no ground truth, and sometimes tech companies abdicate responsibility because there’s no ground truth,” Frost says. “The better approach is to try.”
There’s an old joke that physicists like to tell: Everything has already been discovered and reported in a Russian journal in the 1960s, we just don’t know about it. Though hyperbolic, the joke accurately captures the current state of affairs. The volume of knowledge is vast and growing quickly: The number of scientific articles posted on arXiv (the largest and most popular preprint server) in 2021 is expected to reach 190,000—and that’s just a subset of the scientific literature produced this year.
It’s clear that we do not really know what we know, because nobody can read the entire literature even in their own narrow field (which includes, in addition to journal articles, PhD theses, lab notes, slides, white papers, technical notes, and reports). Indeed, it’s entirely possible that in this mountain of papers, answers to many questions lie hidden, important discoveries have been overlooked or forgotten, and connections remain concealed.
Artificial intelligence is one potential solution. Algorithms can already analyze text without human supervision to find relations between words that help uncover knowledge. But far more can be achieved if we move away from writing traditional scientific articles whose style and structure have hardly changed in the past hundred years.
Text mining comes with a number of limitations, including access to the full text of papers and legal concerns. But most importantly, AI does not really understand concepts and the relationships between them, and is sensitive to biases in the data set, like the selection of papers it analyzes. It is hard for AI—and, in fact, even for a nonexpert human reader—to understand scientific papers in part because the use of jargon varies from one discipline to another and the same term might be used with completely different meanings in different fields. The increasing interdisciplinarity of research means that it is often difficult to define a topic precisely using a combination of keywords in order to discover all the relevant papers. Making connections and (re)discovering similar concepts is hard even for the brightest minds.
As long as this is the case, AI cannot be trusted and humans will need to double-check everything an AI outputs after text-mining, a tedious task that defies the very purpose of using AI. To solve this problem we need to make science papers not only machine-readable but machine-understandable, by (re)writing them in a special type of programming language. In other words: Teach science to machines in the language they understand.
Writing scientific knowledge in a programming-like language will be dry, but it will be sustainable, because new concepts will be directly added to the library of science that machines understand. Plus, as machines are taught more scientific facts, they will be able to help scientists streamline their logical arguments; spot errors, inconsistencies, plagiarism, and duplications; and highlight connections. AI with an understanding of physical laws is more powerful than AI trained on data alone, so science-savvy machines will be able to help future discoveries. Machines with a great knowledge of science could assist rather than replace human scientists.
Mathematicians have already started this process of translation. They are teaching mathematics to computers by writing theorems and proofs in languages like Lean. Lean is a proof assistant and programming language in which one can introduce mathematical concepts in the form of objects. Using the known objects, Lean can check whether a proposed proof of a statement is valid, helping mathematicians verify proofs and identify places where their logic is insufficiently rigorous. The more mathematics Lean knows, the more it can do. The Xena Project at Imperial College London is aiming to input the entire undergraduate mathematics curriculum in Lean. One day, proof assistants may help mathematicians do research by checking their reasoning and searching the vast mathematics knowledge they possess.
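To make this concrete, here is a minimal sketch of what such a translation can look like in Lean 4. It is only an illustration, not taken from the Xena Project or any published library: the names `IsEven`, `four_is_even`, and `add_comm_example` are hypothetical, chosen for this example, while `Nat.add_comm` is a theorem that ships with Lean's core library.

```lean
-- Introduce a concept as an object: "n is even" means n is twice some k.
-- (IsEven is a hypothetical name used only for this illustration.)
def IsEven (n : Nat) : Prop := ∃ k, n = 2 * k

-- A statement built from that object, with a proof Lean can check:
-- the witness is k = 2, and 4 = 2 * 2 holds by computation.
theorem four_is_even : IsEven 4 := ⟨2, rfl⟩

-- Reusing what Lean already knows: commutativity of addition on the
-- natural numbers is available in the core library as Nat.add_comm.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Lean can also confirm that a simple arithmetic claim is false,
-- here by deciding the inequality mechanically.
example : 2 + 2 ≠ 5 := by decide
```

If any step were unjustified (say, offering 3 as the witness for `IsEven 4`), Lean would refuse to accept the proof, which is the kind of rigor check described above.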
In the past decade, autonomous driving has gone from “maybe possible” to “definitely possible” to “inevitable” to “how did anyone ever think this wasn’t inevitable?” to “now commercially available.” In December 2018, Waymo, the company that emerged from Google’s self-driving-car project, officially started its commercial self-driving-car service in the suburbs of Phoenix. At first, the program was underwhelming: available only to a few hundred vetted riders, and human safety operators remained behind the wheel. But in the past four years, Waymo has slowly opened the program to members of the public and has begun to run robotaxis without drivers inside. The company has since brought its act to San Francisco. People are now paying for robot rides.
And it’s just a start. Waymo says it will expand the service’s capability and availability over time. Meanwhile, its onetime monopoly has evaporated. Every significant automaker is pursuing the tech, eager to rebrand and rebuild itself as a “mobility provider.” Amazon bought a self-driving-vehicle developer, Zoox. Autonomous trucking companies are raking in investor money. Tech giants like Apple, IBM, and Intel are looking to carve off their slice of the pie. Countless hungry startups have materialized to fill niches in a burgeoning ecosystem, focusing on laser sensors, compressing mapping data, setting up service centers, and more.
This 21st-century gold rush is motivated by the intertwined forces of opportunity and survival instinct. By one account, driverless tech will add $7 trillion to the global economy and save hundreds of thousands of lives in the next few decades. Simultaneously, it could devastate the auto industry and its associated gas stations, drive-thrus, taxi drivers, and truckers. Some people will prosper. Most will benefit. Some will be left behind.
It’s worth remembering that when automobiles first started rumbling down manure-clogged streets, people called them horseless carriages. The moniker made sense: Here were vehicles that did what carriages did, minus the hooves. By the time “car” caught on as a term, the invention had become something entirely new. Over a century, it reshaped how humanity moves and thus how (and where and with whom) humanity lives. This cycle has restarted, and the term “driverless car” may soon seem as anachronistic as “horseless carriage.” We don’t know how cars that don’t need human chauffeurs will mold society, but we can be sure a similar gear shift is on the way.
The First Self-Driving Cars
Just over a decade ago, the idea of being chauffeured around by a string of zeros and ones was ludicrous to pretty much everybody who wasn’t at an abandoned Air Force base outside Los Angeles, watching a dozen driverless cars glide through real traffic. That event was the Urban Challenge, the third and final competition for autonomous vehicles put on by Darpa, the Pentagon’s skunkworks arm.
At the time, America’s military-industrial complex had already thrown vast sums and years of research at trying to make unmanned trucks. It had laid a foundation for this technology, but stalled when it came to making a vehicle that could drive at practical speeds, through all the hazards of the real world. So, Darpa figured, maybe someone else (someone outside the DOD’s standard roster of contractors, someone not tied to a list of detailed requirements but striving for a slightly crazy goal) could put it all together. It invited the whole world to build a vehicle that could drive across California’s Mojave Desert, and whoever’s robot did it the fastest would get a million-dollar prize.
The 2004 Grand Challenge was something of a mess. Each team grabbed some combination of the sensors and computers available at the time, wrote their own code, and welded their own hardware, looking for the right recipe that would take their vehicle across 142 miles of sand and dirt of the Mojave. The most successful vehicle went just seven miles. Most crashed, flipped, or rolled over within sight of the starting gate. But the race created a community of people—geeks, dreamers, and lots of students not yet jaded by commercial enterprise—who believed the robot drivers people had been craving for nearly forever were possible, and who were suddenly driven to make them real.
They came back for a follow-up race in 2005 and proved that making a car drive itself was indeed possible: Five vehicles finished the course. By the 2007 Urban Challenge, the vehicles were not just avoiding obstacles and sticking to trails but following traffic laws, merging, parking, even making safe, legal U-turns.
When Google launched its self-driving car project in 2009, it started by hiring a team of Darpa Challenge veterans. Within 18 months, they had built a system that could handle some of California’s toughest roads (including the famously winding block of San Francisco’s Lombard Street) with minimal human involvement. A few years later, Elon Musk announced Tesla would build a self-driving system into its cars. And the proliferation of ride-hailing services like Uber and Lyft weakened the link between being in a car and owning that car, helping set the stage for a day when actually driving that car falls away too. In 2015, Uber poached dozens of scientists from Carnegie Mellon University—a robotics and artificial intelligence powerhouse—to get its effort going.