Experimental and computational exploration of enzyme sequence space
Doctoral thesis, 2021

Millions of enzymes with desirable features or exciting new activities can be found in organisms occupying diverse niches all around the earth. However, enzyme studies tend to be biased towards the characterisation of representatives from eukaryotes, model organisms, or disease-causing bacteria. As a result, a large number of enzymes remain underexplored. The so-called sequence space of proteins, meaning all possible protein sequences, is even greater when we include not only natural sequences but also those designed by humans or artificial intelligence. This thesis explores various reasons for, approaches to, and outcomes of investigating large enzymatic sequence spaces.

In the first part of my work, I investigated a natural sequence space of oxidases using a high-throughput activity profiling platform. A functional screen of an industrially important class of enzymes, S-2-hydroxyacid oxidases (EC 1.1.3.15), revealed that nearly 80% of the sequences in the class are misannotated. Further exploration of annotations in public databases indicated that similar annotation errors can be found in other enzyme classes. A broader activity profiling of 1.1.3.x oxidases resulted in the discovery of two novel microbial enzymes: an N-acetyl-hexosamine oxidase and a novel type of long-chain alcohol oxidase.

Natural enzymes often need to be improved before they can be applied industrially, for example to become more stable or to accept non-natural substrates. A novel and constantly developing approach to enzyme design involves the use of machine learning (ML) tools. The second part of my work focused on screening an enzyme sequence space designed by generative adversarial networks. Our work showed that ML methods can generate fully functional enzymes that mimic sequences present in nature.

Enzyme assays are necessary to get a full understanding of how enzymes work. Traditional kinetic assays are time- and reagent-consuming, and as a result only a limited number of variants and conditions are tested for each target. In my final work, I describe a novel approach to enzyme kinetic studies based on the adaptation of a microfluidic qPCR device.

high-throughput screening

protein annotation

enzyme discovery

enzyme sequence space

oxidases

10an-salen, Forskarhus 1, Kemigården 4
Opponent: Zbynek Prokop, Loschmidt Laboratories and Masaryk University, Brno

Author

Elzbieta Rembeza

Chalmers, Biology and Biological Engineering, Systems Biology

Discovery of Two Novel Oxidases Using a High-Throughput Activity Screen

ChemBioChem, Vol. 23 (2022)

Journal article

Adaptation of a Microfluidic qPCR System for Enzyme Kinetic Studies

ACS Omega, Vol. 6 (2021), p. 1985-1990

Journal article

Expanding functional protein sequence spaces using generative adversarial networks

Nature Machine Intelligence, Vol. 3 (2021), p. 324-333

Journal article

Proteins are biomolecules that perform many different roles inside organisms, such as structural support, transport, or signaling. Despite their vast functional diversity, they are all built from the same basic building blocks: 20 amino acids linked together in an order particular to each protein. A protein built of 100 amino acids can be made in about 10^130 unique ways (20^100). Such a hypothetical sequence space of proteins is huge; however, nature covers only a fraction of these possible combinations. The order of amino acids in a protein is very important, as it dictates the protein’s function. But why does one protein consist of this and not another set of building blocks? We still struggle to fully understand this sequence-function relationship.
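
The size of this sequence space can be checked with a few lines of arithmetic. The snippet below is only an illustrative sketch, not part of the thesis: it assumes each of the 100 positions can independently hold any of the 20 standard amino acids, and the variable names are chosen here for illustration.

```python
import math

AMINO_ACIDS = 20   # assumption: the 20 standard amino acids
LENGTH = 100       # residues in the hypothetical protein

# Exact number of distinct sequences; Python integers have arbitrary precision.
n_sequences = AMINO_ACIDS ** LENGTH

# Order of magnitude: log10(20^100) = 100 * log10(20)
log10_sequences = LENGTH * math.log10(AMINO_ACIDS)

print(f"20^100 has {len(str(n_sequences))} digits")      # 131 digits
print(f"log10(20^100) = {log10_sequences:.1f}")          # 130.1
```

A 131-digit number is on the order of 10^130, which matches the figure quoted above.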

Enzymes are a special kind of protein that act as biological catalysts: they enable reactions to happen inside organisms. Enzymes are also used in our everyday life in detergents and cosmetics, and as pharmaceuticals and food additives. In my work I have dived deep into the sequence space of enzymes to explore their potential and their sequence-function relationships. I sifted through hundreds of enzymes to find ones with novel functions, as well as to learn more about their functional annotations in biological databases. I also explored the potential of artificial intelligence to design enzymes and investigated novel ways to perform high-throughput experiments on enzymes.

Subject categories

Biochemistry and molecular biology

Foundations

Basic sciences

ISBN

978-91-7905-576-9

Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5043

Publisher

Chalmers

Online

More information

Last updated

2021-12-13