Segmenting strings homogeneously via trees
Paper in proceedings, 2007

We divide a string into k segments, each with only one sort of symbol, so as to minimize the total number of exceptions. The motivation comes from machine learning and data mining. For binary strings we develop a linear-time algorithm for any k. The key to its efficiency is a special-purpose data structure, called a W-tree, which reflects relations between the repetition lengths of symbols. Whether algorithms faster than the obvious dynamic programming exist for non-binary strings remains open. Our problem is also equivalent to finding weighted independent sets of prescribed size in paths. We show that this problem is fixed-parameter tractable (FPT) in bounded-degree graphs.
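
As an illustration of the problem statement, the following is a minimal sketch of the kind of straightforward dynamic programming the abstract refers to as the obvious baseline. It is not the linear-time W-tree algorithm of the paper. The function name `min_exceptions` is hypothetical, and the sketch assumes that an exception is a position whose symbol differs from the single symbol assigned to its segment, and that fewer than k segments may be used.

```python
def min_exceptions(s: str, k: int) -> int:
    """Minimum number of exceptions when s is cut into at most k segments.

    Assumption (not from the source): each segment is assigned one symbol,
    and every position carrying a different symbol counts as one exception.
    Runs in O(n * k * |alphabet|) time; this is the plain baseline, not the
    W-tree method.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    alphabet = sorted(set(s))
    INF = float("inf")
    # dp[j][c]: best cost for the prefix read so far using exactly j segments,
    # where the last segment is assigned alphabet[c].
    dp = [[INF] * len(alphabet) for _ in range(k + 1)]
    # best[j]: minimum of dp[j] over all symbols; best[0] = 0 for the empty prefix.
    best = [0] + [INF] * k
    for ch in s:
        new = [[INF] * len(alphabet) for _ in range(k + 1)]
        for j in range(1, k + 1):
            for c, symbol in enumerate(alphabet):
                cost = 0 if ch == symbol else 1
                # Either extend the current segment or open a new one here.
                new[j][c] = min(dp[j][c], best[j - 1]) + cost
        dp = new
        best = [INF] + [min(row) for row in dp[1:]]
    return min(best)


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, min_exceptions("0010011", k))
```

For the binary string `0010011`, one segment forces three exceptions (the three 1s), while four segments (`00`, `1`, `00`, `11`) need none. The table keeps one entry per segment count and symbol, which is why the running time grows with both k and the alphabet size.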

segmentation

parameterized complexity

dynamic programming

interval graphs

tree computations

weighted independent set

Author

Peter Damaschke

Chalmers, Computer Science and Engineering (Chalmers), Computing Science (Chalmers)

33rd International Workshop on Graph-Theoretic Concepts in Computer Science WG 2007, Lecture Notes in Computer Science

Vol. 4769, pp. 214-225
978-3-540-74838-0 (ISBN)

Subject Categories

Computer Science

More information

Created

10/6/2017