Optimal group testing algorithms with interval queries and their application to splice site detection
Journal article, 2005

We consider the following constrained version of the classical Group Testing problem: Given a finite set of items identified with the set of natural numbers from 1 to n, and an unknown distinguished subset P of up to p positive elements, the goal is to identify the items in P by asking the least number of queries of the type "does the subset Q intersect P?", where Q is a subset of consecutive elements of cardinality at most d. This particular case of the Group Testing problem naturally arises in computational biology in the context of searching for splice sites within a gene. In this paper we focus on algorithms that solve the aforesaid problem and for which queries can be arranged in stages: in each stage a certain number of queries can be performed in parallel, while queries of a given stage can be chosen depending on the answers to those of previous stages. Algorithms that operate in few stages are usually preferred in practical applications. We study the case with one positive element comprehensively. We obtain asymptotically tight bounds on the number of queries of two-stage strategies for arbitrarily many positives. Finally, we extend our results to strategies with any number of stages and positives.
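To make the query model concrete, here is a minimal Python sketch of a naive two-stage strategy. It is not one of the optimal strategies studied in the paper, only an illustration of interval queries arranged in stages. The names `make_oracle` and `two_stage_identify` are hypothetical, and the oracle merely simulates the answers an experiment would give.

```python
from typing import Callable, List, Set, Tuple

Interval = Tuple[int, int]  # inclusive endpoints, items numbered from 1


def make_oracle(positives: Set[int], d: int) -> Callable[[Interval], bool]:
    """Simulated test: answers 'does the interval [lo, hi] intersect P?'."""
    def query(interval: Interval) -> bool:
        lo, hi = interval
        if not (1 <= lo <= hi) or hi - lo + 1 > d:
            raise ValueError("query must be a nonempty interval of length at most d")
        return any(lo <= x <= hi for x in positives)
    return query


def two_stage_identify(
    n: int, d: int, query: Callable[[Interval], bool]
) -> Tuple[Set[int], int]:
    """Naive two-stage strategy (illustrative, not optimal).

    Returns the identified positives and the total number of queries used.
    """
    # Stage 1: disjoint blocks of length at most d covering 1..n,
    # all asked in parallel.
    stage1: List[Interval] = [(lo, min(lo + d - 1, n)) for lo in range(1, n + 1, d)]
    answers1 = [query(iv) for iv in stage1]

    # Stage 2: chosen from the stage-1 answers; every item of every
    # positive block is tested individually, again in parallel.
    stage2: List[Interval] = [
        (x, x)
        for (lo, hi), hit in zip(stage1, answers1)
        if hit
        for x in range(lo, hi + 1)
    ]
    answers2 = [query(iv) for iv in stage2]

    found = {iv[0] for iv, hit in zip(stage2, answers2) if hit}
    return found, len(stage1) + len(stage2)


if __name__ == "__main__":
    oracle = make_oracle({7, 13}, d=5)
    print(two_stage_identify(20, 5, oracle))
```

With n = 20, d = 5 and P = {7, 13}, stage 1 asks 4 block queries and stage 2 asks 10 single-item queries. The paper's contribution is to bound how much better than such a baseline a staged strategy can do.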

gene prediction

combinatorial group testing

splice sites

Authors

Peter Damaschke

Chalmers, Computer Science and Engineering, Computing Science

Ugo Vaccaro

International Journal of Bioinformatics Research and Applications

1744-5485 (ISSN) 1744-5493 (eISSN)

Vol. 1, No. 4, pp. 363-388

Subject categories

Computer Science