Optimal group testing algorithms with interval queries and their application to splice site detection
Journal article, 2005
We consider the following constrained version of the classical Group Testing problem: given a finite set of items identified with the natural numbers from 1 to n, and an unknown distinguished subset P of at most p positive elements, the goal is to identify the items in P by asking as few queries as possible of the type "does the subset Q intersect P?", where Q is a subset of consecutive elements of cardinality at most d. This particular case of the Group Testing problem arises naturally in computational biology in the context of searching for splice sites within a gene. In this paper we focus on algorithms that solve this problem and whose queries can be arranged in stages: in each stage a certain number of queries are performed in parallel, and the queries of a given stage can be chosen depending on the answers to those of previous stages. Algorithms that operate in few stages are usually preferred in practical applications. We study the case with one positive element comprehensively. We obtain asymptotically tight bounds on the number of queries of two-stage strategies for arbitrarily many positives. Finally, we extend our results to strategies with any number of stages and positives.
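To make the query model concrete, the following is a minimal sketch of a naive two-stage strategy for the case of at most one positive. It is not the strategy analysed in the paper and makes no claim to optimality; the names `make_oracle` and `two_stage_single_positive` are hypothetical and introduced only for illustration. Stage 1 queries disjoint intervals of length at most d in parallel; stage 2 queries single items inside the one interval that answered yes.

```python
def make_oracle(positives):
    """Build a query function for 'does the interval [lo, hi] intersect P?'.

    Hypothetical helper: also returns a counter so queries can be tallied.
    """
    P = set(positives)
    count = {"queries": 0}

    def query(lo, hi):
        count["queries"] += 1
        return any(lo <= x <= hi for x in P)

    return query, count


def two_stage_single_positive(n, d, query):
    """Naive two-stage search, assuming P contains at most one item.

    Returns the positive item, or None if every stage-1 answer is negative.
    """
    # Stage 1: all queries are fixed in advance (disjoint blocks of length <= d).
    blocks = [(lo, min(lo + d - 1, n)) for lo in range(1, n + 1, d)]
    answers = [query(lo, hi) for lo, hi in blocks]
    hits = [block for block, positive in zip(blocks, answers) if positive]
    if not hits:
        return None

    # Stage 2: singleton queries on all but the last item of the positive block;
    # these depend only on stage-1 answers, so they can also run in parallel.
    lo, hi = hits[0]
    candidates = list(range(lo, hi))
    stage2 = [query(x, x) for x in candidates]
    for x, positive in zip(candidates, stage2):
        if positive:
            return x
    # No singleton answered yes, so the remaining item must be the positive one.
    return hi


if __name__ == "__main__":
    query, count = make_oracle([13])
    print(two_stage_single_positive(20, 5, query), count["queries"])
```

Under these assumptions the sketch uses ceil(n/d) queries in the first stage and at most d - 1 in the second; the bounds obtained in the paper concern how far such counts can be reduced.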
gene prediction
combinatorial group testing
splice sites