Segmenting strings homogeneously via trees
Paper in proceedings, 2007

We divide a string into k segments, each assigned a single symbol, so as to minimize the total number of exceptions, that is, positions whose symbol differs from that of their segment. Motivations come from machine learning and data mining. For binary strings we develop a linear-time algorithm for any k. The key to efficiency is a special-purpose data structure, called a W-tree, which reflects relations between the repetition lengths of symbols. Whether algorithms faster than the obvious dynamic programming exist remains open for non-binary strings. Our problem is also equivalent to finding weighted independent sets of prescribed size in paths. We show that this problem is fixed-parameter tractable (FPT) in bounded-degree graphs.
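For orientation, the dynamic-programming baseline mentioned in the abstract might look roughly like the sketch below. It is not the paper's W-tree algorithm, and the function name `min_exceptions` and the table layout are assumptions made for illustration. The sketch allows at most k segments; since adjacent segments may carry the same symbol, this should give the same optimum as exactly k segments whenever the string has at least k characters. It runs in O(n · k · |alphabet|) time.

```python
def min_exceptions(s: str, k: int) -> int:
    """Fewest exceptions when s is cut into at most k labelled segments.

    Illustrative baseline only (hypothetical helper, not the W-tree method).
    """
    if not s:
        return 0
    if k < 1:
        raise ValueError("k must be at least 1")

    alphabet = sorted(set(s))
    INF = float("inf")

    # best[j][c]: fewest exceptions for the prefix read so far, split into
    # j + 1 segments, where the last segment is labelled alphabet[c].
    best = [[INF] * len(alphabet) for _ in range(k)]

    for i, ch in enumerate(s):
        new = [[INF] * len(alphabet) for _ in range(k)]
        for j in range(k):
            # Cost of opening segment number j + 1 exactly at position i.
            if i == 0:
                open_here = 0 if j == 0 else INF
            else:
                open_here = min(best[j - 1]) if j > 0 else INF
            for c, label in enumerate(alphabet):
                miss = 0 if ch == label else 1
                # Either extend the current segment or open a new one here.
                new[j][c] = min(best[j][c], open_here) + miss
        best = new

    return min(min(row) for row in best)


if __name__ == "__main__":
    print(min_exceptions("0001000", 1))
    print(min_exceptions("0001000", 3))
```

With one segment, the single `1` in `0001000` is the only exception; with three segments the string splits into homogeneous pieces and no exceptions remain.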

segmentation

parameterized complexity

dynamic programming

interval graphs

tree computations

weighted independent set

Author

Peter Damaschke

Chalmers, Computer Science and Engineering, Computing Science

33rd International Workshop on Graph-Theoretic Concepts in Computer Science WG 2007, Lecture Notes in Computer Science

Vol. 4769, pp. 214-225

Subject Categories

Computer Science

ISBN

978-3-540-74838-0