Parallel Factor Analysis Enables Quantification and Identification of Highly Convolved Data-Independent-Acquired Protein Spectra
Journal article, 2020

The latest high-throughput mass spectrometry-based technologies can record virtually all molecules from complex biological samples, providing a holistic picture of proteomes in cells and tissues and enabling an evaluation of the overall status of a person's health. However, current best practices are still only scratching the surface of the wealth of available information obtained from the massive proteome datasets, and efficient novel data-driven strategies are needed. Powered by advances in GPU hardware and open-source machine-learning frameworks, we developed a data-driven approach, CANDIA, which disassembles highly complex proteomics data into the elementary molecular signatures of the proteins in biological samples. Our work provides a performant and adaptable solution that complements existing mass spectrometry techniques. As the central mathematical methods are generic, other scientific fields that are dealing with highly convolved datasets will benefit from this work.

deconvolution

data-independent acquisition

tensor factorization

DSML 2: Proof-of-Concept: Data science output has been formulated, implemented, and tested for one domain/problem

canonical decomposition

big data

proteomics

PARAFAC

mass spectrometry

Authors

Filip Buric

Chalmers, Biology and Biological Engineering, Systems Biology

Jan Zrimec

Chalmers, Biology and Biological Engineering, Systems Biology

Aleksej Zelezniak

Chalmers, Biology and Biological Engineering, Systems Biology

Science for Life Laboratory (SciLifeLab)

Patterns

2666-3899 (eISSN)

Vol. 1, Issue 9, 100137

Subject categories

Analytical Chemistry

Bioinformatics (Computational Biology)

Bioinformatics and Systems Biology

DOI

10.1016/j.patter.2020.100137

More information

Last updated

2020-12-28