A relatively new startup called EvolutionaryScale has secured a huge sum of money to develop AI models that can generate novel proteins for scientific research.
EvolutionaryScale announced Tuesday that it has raised $142 million in a seed round led by former GitHub CEO Nat Friedman, Daniel Gross and Lux Capital. Amazon and NVentures, Nvidia’s corporate venture arm, also participated. The startup also released ESM3, an AI model it describes as a “frontier model” for biology – one that can generate proteins for use in drug discovery and materials science.
“ESM3 takes a step toward a future of biology where AI is a tool to engineer from the ground up, just as we design structures, machines and microchips and write computer programs,” said Alexander Rives, co-founder and chief scientist of EvolutionaryScale, in a statement.
Rives began developing generative AI models to study proteins at Meta’s FAIR AI research lab in 2019, along with Tom Sercu and Sal Candido. After their team was disbanded, Rives, Sercu and Candido left Meta to continue the work they had started.
Characterizing proteins can reveal the mechanisms of a disease, including ways to slow or reverse it, while creating proteins has the potential to lead to entirely new classes of drugs, tools, and therapeutics. Yet the current process of designing proteins in the lab is costly in terms of both computational and human resources.
Designing a protein means first coming up with a structure that could plausibly perform a task in the body or in a product, and then finding a protein sequence (the sequence of amino acids that make up a protein) that is likely to “fold” into that structure. Proteins must fold correctly into three-dimensional shapes to perform their intended function.
ESM3 was trained on a dataset of 2.78 billion proteins and can “figure out” the sequence, structure and function of proteins, Rives says. This allows the model to generate new proteins, much like Google DeepMind’s AlphaFold. EvolutionaryScale is making the full 98-billion-parameter model available for non-commercial use through its cloud developer platform, Forge, and is releasing a smaller version of the model for offline use.
EvolutionaryScale claims it has used ESM3 to create a new variant of the green fluorescent protein (GFP), which is responsible for the glow of jellyfish and the bright colors of corals. A preprint on the company’s website describes its work in detail.
“We have been working on this for a long time and are excited to share it with the scientific community and see what they do with it,” Rives said.
EvolutionaryScale isn’t a charity, of course. The company, which employs around 20 people, told TechCrunch that it plans to make money through a combination of partnerships, royalties and revenue shares. For example, EvolutionaryScale could partner with pharmaceutical companies to integrate ESM3 into their workflows, or earn revenue shares with researchers for breakthrough discoveries commercialized using ESM3.
To that end, EvolutionaryScale says it will soon make ESM3 and its derivatives available to select AWS customers through the cloud provider’s SageMaker AI development platform, Bedrock AI platform, and HealthOmics service. ESM3 will also be available to select customers through Nvidia’s NIM microservices, supported by an Nvidia enterprise software license.
According to EvolutionaryScale, both AWS and Nvidia customers will be able to fine-tune ESM3 using their own data.
It could be a while before EvolutionaryScale turns a profit. In the company’s presentation, a copy of which was obtained by Forbes last August, EvolutionaryScale repeatedly stressed that it could take a decade for generative AI models to help develop therapies. The company also faces competition from DeepMind’s spin-off Isomorphic Labs, which already has deals with major pharmaceutical companies, as well as Insitro, publicly traded Recursion, and Inceptive.
The purpose of EvolutionaryScale’s large funding round is to expand the training of its model to include data beyond proteins, with the goal of creating a general AI model for biotech applications.
“The incredible pace of new AI advances is driven by ever-larger models, ever-larger data sets, and increasing computational power,” said a spokesperson for EvolutionaryScale. “The same is true in biology. Over the past five years, the ESM team’s research has examined scaling in biology. We’ve found that as language models scale, they develop an understanding of the underlying principles of biology and discover biological structures and functions.”
It all sounds extremely ambitious to this reporter, but having deep-pocketed investors on board will surely help.