Power of proteins

AI has enabled us to crack the secret of protein folding, opening the door to faster drug development, more resilient crops and bacteria-fuelled recycling.

Proteins are at the heart of cells, and cells are the building blocks of life. Understanding how protein structures form and change is key to understanding biology, paving the way for faster development of new drugs, the creation of more resilient crops and even the breaking down of plastic waste.

Yet protein structures have, until recently, been difficult to understand because of their three-dimensional shape, which is folded from a linear chain of the protein’s amino acid building blocks. The folding allows for optimal interactions between the amino acids, and the end result is a bit like origami made with a string of beads instead of paper.

“Determining a protein structure using experiments is labour-intensive and slow. Humanity has only done these a few 100,000 times in the last half century since the first protein structure was determined”, explains Dr Chris Bahl, co-founder of AI Proteins, a drug discovery platform. That may sound like a large number, but it’s tiny compared with the hundreds of millions of possible structures out there. As well as years of painstaking work, elucidating a protein structure has often required costly techniques such as X-ray crystallography and cryo-electron microscopy.

That all changed in 2021 with the release of AlphaFold, developed by DeepMind in partnership with the European Molecular Biology Laboratory (EMBL), an intergovernmental research institute. Using artificial intelligence, AlphaFold can predict a protein structure from its amino acid sequence at a rate that “far outpaces humanity’s ability”, according to Bahl. The AlphaFold database provides access to over 200 million protein structure predictions.

The following year, Facebook’s parent company, Meta, released a database showing the predicted shapes of 600 million proteins from bacteria, viruses and other microorganisms that had not yet been characterised. Its approach used a large language model (LLM), a technology since popularised by the launch of ChatGPT, which can predict text from a few letters or words, creating a kind of protein ‘autocomplete.’

A key difference between this and AlphaFold is that the language model does not need information about related amino acid sequences in the form of multiple sequence alignments (MSAs). An MSA is built by querying databases of protein sequences to identify similar sequences that are already known in living organisms. The language model, by contrast, can predict the structure of proteins that bear no resemblance to other known proteins, giving it an advantage in detecting what would happen to a protein if there is a point mutation. The algorithm is not as accurate as AlphaFold, according to researchers, but it is quicker: it allowed scientists to predict the structures in the database in just two weeks. “I’m so happy to be a scientist who can actually live through this revolution,” says Professor Edith Heard, Director General at EMBL.
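To give a feel for the database lookup that MSA-based methods depend on, here is a minimal Python sketch. It is an illustration only, not how AlphaFold or real alignment tools work: the sequences and the names `KNOWN_SEQUENCES`, `similarity` and `find_similar` are made up, and the standard library's `difflib.SequenceMatcher` stands in for a proper alignment algorithm.

```python
from difflib import SequenceMatcher

# Hypothetical mini "database" of known protein sequences (invented for illustration).
KNOWN_SEQUENCES = {
    "protein_a": "MKTAYIAKQRQISFVKSHFSRQ",
    "protein_b": "MKTAYIAKQRNISFVKSHFARQ",  # protein_a with two point mutations
    "protein_c": "GSHMLEDPVVLQRRDWENPGVT",  # unrelated sequence
}


def similarity(a: str, b: str) -> float:
    """Crude similarity score between 0 and 1 (an assumption, not a true alignment score)."""
    return SequenceMatcher(None, a, b).ratio()


def find_similar(query: str, threshold: float = 0.8) -> list[str]:
    """Return names of known sequences scoring at or above the threshold, best match first."""
    scored = [(similarity(query, seq), name) for name, seq in KNOWN_SEQUENCES.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


query = "MKTAYIAKQRQISFVKSHFSRQ"
hits = find_similar(query)
print(hits)
```

A query with close relatives in the database returns hits to align against. A protein with no known relatives returns nothing, which is the situation where a method that does not rely on MSAs has the edge.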

Crucially, the new discoveries are widely available. AlphaFold is an open access resource, while Meta has published the code used to create its database. This approach gives the algorithms enormous reach and reflects tech companies’ reliance on public data resources to build them: DeepMind’s algorithms could only be developed with the data held by EMBL. “If we really wanted to make this a game changer, it had to be open [access], it had to be shared by all,” says Heard.

Turbocharged research

AI-powered prediction is turbocharging scientific research. Biochemists at the University of Colorado were able to determine a bacterial protein structure in 15 minutes, having tried to figure it out for 10 years, aiding their efforts to combat antibiotic resistance. Scientists at the University of Portsmouth are applying AlphaFold to develop enzymes which can degrade plastics. “These can be used as planetary healers. That is amazing and something we could never have thought of doing at such high speed a few years ago,” says Heard. 

A team at the Karolinska Institute in Sweden has used AlphaFold to determine the structure of a protein which could block bacterial infections in the urinary tract and gastrointestinal system. Researchers at the University of Oxford are working on malaria vaccines that target every phase of the parasite’s infection cycle, helping tackle not just disease but onward transmission. Malaria has evaded a vaccine solution because the parasite has hundreds or thousands of surface proteins, which makes it hard to target. AlphaFold has surpassed existing techniques in identifying the properties of one key protein, known as Pfs48/45, which is essential to the development of the parasite in the mosquito’s gut.

In pharmaceutical research, significant time and money are wasted in going after the wrong drug targets; predictive AI can improve the odds of new drug candidates being successful. “Whole areas of science will blossom, because before it was just too timely and too costly,” says EMBL’s Heard.

Neurodegenerative conditions including Alzheimer’s and Parkinson’s diseases are linked to protein misfolding. These, alongside other modern mass killers like diabetes and cancer, are in large part the result not of bacteria or viruses, our arch nemeses for millennia, but of our own bodies misfiring. Since most drugs work by targeting specific proteins in the body, access to information about the structure of the misfolded proteins will aid drug discovery and allow the design of drugs that bind precisely to the target protein and alter its function. Dr Bahl is optimistic about advances beyond medicine, in areas like next-generation pesticides and agricultural applications. “This is a way to unlock control over biology; biology is fundamentally mediated by proteins and designing proteins will give us unprecedented control over biology.”

Beyond proteins

However, not all disease-related proteins react to drugs. Some are “undruggable”, which means they are unable to bind strongly and effectively to drug molecules. Here, too, AI can help, but this time by focusing on RNA. RNA represents the critical step between DNA – the molecule that contains our genetic code and the blueprint for making the proteins that are essential for life to function – and the actual production of those proteins. Each of the nearly 100,000 different types of proteins that human cells produce has its own unique RNA sequence that has been transcribed from the cell’s DNA sequence.
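The path from DNA to RNA to protein described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the DNA string is invented, the function names `transcribe` and `translate` are hypothetical, and the codon table is only a small excerpt of the standard genetic code (any codon outside it is shown as "X").

```python
# Excerpt of the standard genetic code: RNA codon -> one-letter amino acid.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the usual start codon
    "UUU": "F", "UUC": "F",  # phenylalanine
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "GCU": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "AAA": "K", "AAG": "K",  # lysine
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}


def transcribe(coding_dna: str) -> str:
    """Return the messenger RNA matching a DNA coding strand (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the RNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # X = codon not in this toy table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCTTTTAAAGGCTAA"  # made-up example sequence
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna, protein)
```

The RNA is the intermediate message here, which is why a drug that acts on the RNA can influence a protein before it has even been made.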

Targeting the RNA before proteins are made would allow the drug to alter the protein before or during its synthesis. From Covid-19 vaccines to some cancer drugs, RNA therapeutics have already benefited millions of people, and the ability to predict RNA shapes quickly and accurately on a computer will help accelerate the understanding of RNA molecules and expand their use in healthcare.

“The reason why AI for RNA structure prediction makes a significant impact is that there have been severe difficulties in finding drugs that are selective enough to target just the RNA you care about”, explains Dr Raphael Townshend, CEO and founder of Atomic AI. Knowing the structure of the RNA would therefore make the process more selective.

Although the findings look promising, there is still a lot to do – both in terms of science and in terms of regulation. The FDA’s Digital Health Innovation Action Plan was released in 2017 with the hope of speeding up the approvals process for digital health products, followed in 2021 by guiding principles for the use of machine learning in medical devices, developed by the FDA and Canadian and UK regulatory agencies.

As yet, there is no guidance on the specific use of AI in pharma manufacturing, though the FDA has released a discussion paper which highlights considerations for policy development in AI, to encourage feedback from the public, industry and research centres. “It is something that regulators need to really figure out in a hurry, because clinical trials are going to become a massive bottleneck in our ability to make new medicines”, cautions Bahl from AI Proteins.

But his overarching sentiment is one of optimism for a new dawn in medicine. He says predictive AI in biology is part of “a renaissance in the arts and sciences in every field – AI, biomedical research, astrophysics. It's all happening synergistically and advancements in computational technology are marching hand in hand with advancements in laboratory technology and automation.”
