The Fall of Babylon Is a Warning for AI Unicorns

In late 2016, Hugh Harvey was working as a consultant doctor in the UK’s National Health Service. Harvey had dabbled in machine learning while doing a research degree, and had seen the potential for artificial intelligence to revolutionize health care. But he felt strongly that the introduction of AI into medicine was not going to come from within the NHS—it was going to come from industry. So when an opportunity opened up at a buzzy new health-tech startup, Babylon Health, he applied.

Founded in London in 2013 by Ali Parsa, a British-Iranian ex-banker, Babylon had a lofty goal: It wanted to do with health care what Google did with information; that is, make it freely and easily available to everyone. By the time Harvey joined the company in 2016, it was already picking up tens of millions in venture capital funding—even though at that point, all it had built was an app that let patients have video calls with their doctors. Helpful, yes, but not exactly revolutionary. The company’s value was in its grand ambition to add on an AI-powered symptom checker, which would speed up—or even automate—diagnoses.

Accustomed to the frugal conditions of the perpetually cash-strapped NHS, Harvey says he was taken in by the lavish setup: laptop waiting for him on his desk, fancy office in upmarket South Kensington, free office beers and pizzas at lunch. But soon, Harvey got to take a peek at the software that was behind all the excitement. What he was shown was a bunch of Excel spreadsheets containing clinical decision pathways written by junior doctors at the company. They had essentially divided the body up into different parts, and depending on which part of the body the user clicked on, the app would follow what they called “clinical flows,” or decision trees. “I was like, well, this isn’t really artificial intelligence,” Harvey recalls thinking.

But over the next few years, the hype around Babylon just kept growing. It picked up contracts with the NHS and British health insurance providers. Chinese tech giant Tencent signed a deal to offer services through WeChat. Saudi Arabia’s sovereign wealth fund invested $550 million. By the time it went public on the New York Stock Exchange in 2021, Babylon was valued at $4.2 billion. But the wheels were already coming off. The company’s losses were mounting as it spent big to chase growth. Its share price quickly went into free fall. In mid-August this year, after a rescue deal fell apart, it was announced that the UK side of the business was going into administration—a process similar to bankruptcy protection in the US. The company shuttered its US headquarters, laid off scores of employees, and filed for bankruptcy there, too.

WIRED spoke to three former employees at Babylon Health to uncover what went so wrong for the darling of the “digital-first” health-tech landscape. What they say about the company’s collapse—at a moment when VC interest in AI and health care is at a fever pitch—is a warning about the dangers of backing hype over delivery.

Neither Parsa nor Babylon Health responded to multiple requests for comment.

Parsa named his company after the ancient city of Babylon, which, according to the Greek historian Herodotus, had a square where citizens gathered to share tips on how to treat their ailments.

Former employees say Parsa was obsessed with “blitzscaling”—the kind of entrepreneurial hypergrowth popularized by LinkedIn cofounder Reid Hoffman. The company went on uncontrolled hiring sprees, ex-employees say, and teams were often working on overlapping projects. Three teams were working on three different, mutually incompatible versions of the symptom checker at one point, says an ex-employee, who spoke on condition of anonymity. The employee says they once found a product manager wandering the building on his second day at the company. He had been left looking for a team to work with because nobody had onboarded him or told him where he should be. “He assumed it was some kind of onboarding ‘challenge’ to just find a team to join,” the employee says.

The C-suite experienced lots of turnover. Senior leadership would go on retreats to Antigua, which wasn’t widely known by staff—until it was leaked on a public Slack channel. Parsa “once presented a stand-up from Antigua while pretending to be in his office,” one ex-employee says. Former staff say Parsa’s leadership style was “idiosyncratic” and “occasionally megalomaniacal.” At one point, Parsa tried to ban Microsoft PowerPoint at the company. Workers, whom Parsa referred to as Babylonians, were chastised by the CEO for leaving at 5:30 pm, Harvey says.

Parsa’s rush for scale outpaced Babylon Health’s ability to actually put out finished products, according to former employees. After Harvey joined, the company reassured him that its data science team was working on a knowledge graph, which connects bits of knowledge by probabilities. What this looked like was Harvey and his clinician colleagues answering thousands of medical questions, like “What is the probability of someone with jaundice having hepatitis?” The questions progressively became more fine-grained; what’s, say, the probability of someone having two weeks of jaundice and having hepatitis B?

“The questions just became more and more ridiculous and unrelated,” Harvey says—and it still wasn’t really AI. (Another former employee of Babylon Health, who worked on the AI team, says that it’s likely that the machine learning team just showed Harvey Excel spreadsheets for simplicity, but admits the decision tree model was “not particularly sophisticated.”)

At one point, the BBC were scheduled to visit the office to film the technology. But there was one problem: The app hadn’t been finished yet. It had only been modeled for gastroenterology; basically, stomach problems. It had no interface, so Harvey recalls a data scientist having to sleep in the office for several nights and over the weekend as they raced to build something that looked like an app. “But we all knew … that’s not the product we’re building,” Harvey says. “This is a mock-up of something that has been put together in haste with a lot of man-hours to demonstrate to the BBC.” Harvey’s account was corroborated by another former employee.

Babylon’s symptom-checking app, called GP at Hand, was launched in 2017, promising to help tackle the NHS’s long waiting lists by automating some patient inquiries.

Harvey’s role at Babylon was to get the go-ahead from regulators that the app could be used to triage patients—a preliminary assessment that ascertains how urgently a patient needs to be seen by a doctor. But this was not the party line. Parsa was publicly saying in 2017 that it could diagnose patients: a much grander statement. Harvey says Parsa would come up to him on a near daily basis to ask whether they had gotten regulatory clearance yet. Harvey would explain that they would get it—but only to triage.

Later that year, the company claimed its AI performed better than humans on an exam used to test doctors’ ability to diagnose (a claim that was quickly questioned by experts). By then, Harvey had quit and returned to the NHS as a consultant radiologist. But the GP at Hand app grew in popularity—albeit not without criticism by health care professionals.

One of the first people to raise the alarm about the effectiveness of Babylon Health’s AI was a consultant oncologist for the NHS, David Watkins. Tweeting at first under the alias @DrMurphy11, Watkins regularly documented online the unusual departures from the clinical norm the bot would take, like asking a 66-year-old woman concerned about a breast lump whether she was pregnant or breastfeeding, and failing to spot the symptoms of a heart attack. The company dubbed him a “troll” in a public statement. But Watkins’ concerns were also reportedly shared within the company, and, it turned out, by the UK’s medical regulator.

A 2017 report from the Care Quality Commission, the regulator of health and social care services in England, called the safety and effectiveness of the company’s services into question, a finding over which Babylon threatened to sue. In 2019, WIRED reported that Babylon was costing the NHS upward of £26 million ($32 million). Then, in 2020, the company admitted that its GP at Hand app had suffered a data breach that let users see dozens of video consultations done by other patients. And, even as its service was being adopted across the country, Babylon Health was struggling to make its model work financially in the UK. Parsa blamed its failure on structural problems within the NHS that meant it never managed to turn a profit. It quit its final NHS contract in August last year.

But Parsa had long held ambitions to go global anyway. The company set up shop in Canada—but sold its operations there in 2021 as part of a licensing deal. The same year, a Canadian government investigation found the app was not compliant with the country’s privacy regulations. Babylon shifted its focus to the US, where it could make more money through health insurance programs Medicaid and Medicare. Parsa even relocated there.

But the US venture was also ultimately doomed. It was entering a very crowded market, and wasn’t ready to compete. “There are a lot of scaled telemedicine companies here that have been around a lot longer than Babylon,” says Christina Farr, a health-tech investor at OMERS Ventures in San Francisco.

One ex-employee says that Parsa didn’t fully understand that the US was a mature market. The final straw for the employee was when they saw a contract being drawn up to provide telehealth services in Missouri through Medicaid. Essentially, Babylon would be taking on all of the financial responsibility and financial liability of a health insurer, but without any of the sky-high premiums that are required to cover that kind of liability. “I was like, ‘No, absolutely not,’” says the ex-employee. “‘This is going to go tits up, and I don’t want to be around when that happens.’” They quit.

Even the company’s stock market debut quickly went south. Within 18 months of listing, its shares had dropped 99 percent. Parsa described the nosedive as an “unbelievable, unmitigated disaster.” It wasn’t that surprising. Although Babylon was generating revenue, it was losing a lot of money. In 2022, the company lost $221 million. In the first three months of 2023, it lost a further $63 million. In May 2023, the company’s biggest lender, Albacore Capital, took the company private and tried to merge it with another health-tech company, MindMaze. The merger fell through in early August.

Babylon isn’t the first company at the interface of AI and health care to struggle to move from hype to commercial success. Its fate “raises questions around how you commercialize AI in health care,” says David Wong, an associate professor of health informatics and data science at the University of Leeds in the UK. Wong points to another failure: the collapse of Sensyne Health, an AI startup, which cost two NHS trusts $18 million when it was delisted from the London Stock Exchange in 2022. The same year, IBM dumped Watson Health. Olive AI, a health care automation startup valued at $4 billion in 2021, fired a third of its staff in February 2023.

The reason companies like Babylon fail, experts say, is simply that it’s hard to replace flesh-and-blood clinicians with an algorithm, and there’s an inherent mismatch between the move-fast-break-things culture of tech startups and that of health care, where caring for patients requires thoughtfulness and context.

“I think probably the tricky part of the startup world is there are a lot of people with ideas, and most of them won’t work,” Wong says. “And I think if there were more clinicians on board, most of them would be very quick in telling you which ones had a chance of working and which ones didn’t.”

Grace Browne