Fast algorithms for finding disjoint subsequences with extremal densities
Journal article, 2006

We derive fast algorithms for the following problem: given a set of n points on the real line and two parameters s and p, find s disjoint intervals of maximum total length that together contain at most p of the given points. Our main contribution consists of algorithms whose time bounds improve upon a straightforward dynamic programming algorithm in the relevant case where the input size n is much larger than the parameters s and p. These results are achieved by selecting a few candidate intervals that are provably sufficient for building an optimal solution via dynamic programming. As a byproduct of this idea, we improve an algorithm of Chen, Lu and Tang (2005) for a similar subsequence problem. The problems are motivated by the search for significant patterns in biological data. Finally, we propose several heuristics that further reduce the time complexity in typical instances. One of them leads to an apparently open subsequence sum problem of independent interest.
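To make the problem statement concrete, the following is a minimal sketch of one possible straightforward dynamic program of the kind the abstract uses as a baseline. It is not the authors' algorithm, and it rests on assumptions that the abstract does not state: intervals are open, their endpoints are input points, only points strictly inside an interval count towards p, adjacent intervals may share an endpoint, and up to s intervals may be used. The function name max_total_length is hypothetical. Under these assumptions the running time is O(n s p) after sorting.

```python
from typing import Sequence


def max_total_length(points: Sequence[float], s: int, p: int) -> float:
    """Baseline DP sketch (assumed conventions, not the paper's method).

    Chooses up to s open intervals with endpoints at input points so that
    the total length is maximal and at most p input points lie strictly
    inside the chosen intervals.
    """
    xs = sorted(points)
    neg = float("-inf")

    # closed[k][q]: best total length with k intervals and q interior points,
    #               where the most recent gap is NOT covered.
    # opened[k][q]: same, but the most recent gap IS covered, so the last
    #               interval can still be extended to the right.
    closed = [[neg] * (p + 1) for _ in range(s + 1)]
    opened = [[neg] * (p + 1) for _ in range(s + 1)]
    closed[0][0] = 0.0

    for i in range(len(xs) - 1):
        gap = xs[i + 1] - xs[i]

        # Option 1: skip this gap; any open interval ends at xs[i].
        new_closed = [
            [max(closed[k][q], opened[k][q]) for q in range(p + 1)]
            for k in range(s + 1)
        ]
        new_opened = [[neg] * (p + 1) for _ in range(s + 1)]

        for k in range(s + 1):
            for q in range(p + 1):
                best = neg
                # Option 2: start a new interval at xs[i]; no point is consumed.
                if k >= 1:
                    best = max(closed[k - 1][q], opened[k - 1][q]) + gap
                # Option 3: extend the open interval across xs[i], which
                # becomes an interior point and uses one unit of the budget p.
                if q >= 1:
                    best = max(best, opened[k][q - 1] + gap)
                new_opened[k][q] = best

        closed, opened = new_closed, new_opened

    return max(
        max(max(row) for row in closed),
        max(max(row) for row in opened),
    )
```

For example, with the points 0, 1, 5, 6, 10 and s = 2, p = 0, the sketch selects the two gaps of length 4. Raising p to 1 lets one interval absorb a neighbouring gap of length 1 at the cost of one interior point.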

dynamic programming

selection algorithms

protein torsion angle

range prediction

time complexity

holes in data

protein structure prediction

Authors

Peter Damaschke

Chalmers, Computer Science and Engineering (Chalmers), Computing Science (Chalmers)

Anders Bergkvist

Chalmers

Pattern Recognition

0031-3203 (ISSN)

Vol. 39, No. 12, pp. 2281-2292

Subject Categories (SSIF 2011)

Computer Science

DOI

10.1016/j.patcog.2006.01.008

More information

Latest update

9/10/2018