Expanding functional protein sequence spaces using generative adversarial networks
Journal article, 2021

De novo protein design for catalysis of any desired chemical reaction is a long-standing goal in protein engineering because of the broad spectrum of technological, scientific and medical applications. However, mapping protein sequence to protein function is currently neither computationally nor experimentally tangible. Here, we develop ProteinGAN, a self-attention-based variant of the generative adversarial network that is able to ‘learn’ natural protein sequence diversity and enables the generation of functional protein sequences. ProteinGAN learns the evolutionary relationships of protein sequences directly from the complex multidimensional amino-acid sequence space and creates new, highly diverse sequence variants with natural-like physical properties. Using malate dehydrogenase (MDH) as a template enzyme, we show that 24% (13 out of 55 tested) of the ProteinGAN-generated and experimentally tested sequences are soluble and display MDH catalytic activity in the tested conditions in vitro, including a highly mutated variant of 106 amino-acid substitutions. ProteinGAN therefore demonstrates the potential of artificial intelligence to rapidly generate highly diverse functional proteins within the allowed biological constraints of the sequence space.

Authors

Donatas Repecka

Biomatter Designs

Vykintas Jauniskis

Chalmers, Biology and Biological Engineering, Systems Biology

Biomatter Designs

Laurynas Karpus

Biomatter Designs

Elzbieta Rembeza

Chalmers, Biology and Biological Engineering, Systems Biology

Irmantas Rokaitis

Biomatter Designs

Jan Zrimec

Chalmers, Biology and Biological Engineering, Systems Biology

Simona Poviloniene

Vilniaus universitetas

Audrius Laurynenas

Biomatter Designs

Vilniaus universitetas

Sandra Viknander

Chalmers, Biology and Biological Engineering, Systems Biology

Wissam Abuajwa

Chalmers, Biology and Biological Engineering, Systems Biology

Otto Savolainen

Chalmers, Biology and Biological Engineering, Systems Biology

Rolandas Meskys

Vilniaus universitetas

Martin Engqvist

Chalmers, Biology and Biological Engineering, Systems Biology

Aleksej Zelezniak

Chalmers, Biology and Biological Engineering, Systems Biology

Science for Life Laboratory (SciLifeLab)

Nature Machine Intelligence

2522-5839 (eISSN)

Vol. 3, No. 4, pp. 324-333

Subject categories

Biochemistry and Molecular Biology

Bioinformatics and Systems Biology

Other Industrial Biotechnology

DOI

10.1038/s42256-021-00310-5

More information

Last updated

2024-01-03