A Krylov subspace method for information retrieval
Journal article, 2004
A new algorithm for information retrieval is described. It is a vector space method with automatic query expansion. The original user query is projected onto a Krylov subspace generated by the query and the term-document matrix. Each dimension of the Krylov subspace is generated by a simple vector space search, using first the user query and then new queries that the algorithm generates and that are orthogonal to the previous query vectors. The new algorithm is closely related to latent semantic indexing (LSI), but it is a local algorithm that works on a new subspace of very low dimension for each query. This makes it faster and more flexible than LSI. No preliminary computation of the singular value decomposition (SVD) is needed, and changes in the database cause no complication. Numerical tests on both a small data set (Cranfield) and a larger one (Financial Times data from the TREC collection) are reported. In the cases compared, the new algorithm gives better precision at given recall levels than the simple vector space method and LSI. © 2005 Society for Industrial and Applied Mathematics.
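
As an illustration of the idea, the following Python sketch builds an orthonormal basis of a Krylov subspace from a query and a term-document matrix, using one vector space search per dimension, and then scores documents against a projected query. It is a minimal sketch under assumptions, not the algorithm as published: the function names `krylov_basis` and `expanded_query_scores`, the repeated Gram-Schmidt orthogonalization, the choice to project the query onto the span of `A @ A.T @ Q`, the cosine scoring, and the toy matrix are all hypothetical choices made here for illustration.

```python
import numpy as np


def krylov_basis(A, q, k):
    """Orthonormal basis of span{q, (A A^T) q, ..., (A A^T)^(k-1) q}.

    A is a terms-by-documents matrix and q a nonzero query in term space.
    Fewer than k columns are returned if the subspace stops growing.
    """
    m = A.shape[0]
    Q = np.zeros((m, k))
    v = q / np.linalg.norm(q)
    for j in range(k):
        Q[:, j] = v
        doc_scores = A.T @ v   # simple vector space search with the current query
        w = A @ doc_scores     # candidate next query, mapped back to term space
        for _ in range(2):     # Gram-Schmidt, applied twice for numerical stability
            w = w - Q[:, : j + 1] @ (Q[:, : j + 1].T @ w)
        w_norm = np.linalg.norm(w)
        if w_norm < 1e-12:     # no new direction: return the basis found so far
            return Q[:, : j + 1]
        v = w / w_norm         # new query, orthogonal to all previous ones
    return Q


def expanded_query_scores(A, q, k):
    """Cosine of each document with the query projected onto span(A A^T Q).

    Assumption: this projection is one plausible reading of "query expansion";
    the published method may define the subspace and the scores differently.
    """
    Q = krylov_basis(A, q, k)
    M = A @ (A.T @ Q)
    # Least squares gives the orthogonal projection of q onto the range of M.
    coeffs = np.linalg.lstsq(M, q, rcond=None)[0]
    q_hat = M @ coeffs
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(q_hat)
    return (A.T @ q_hat) / np.maximum(denom, 1e-12)


# Toy data (hypothetical): 4 terms, 3 documents.
# doc0 contains terms 0 and 1, doc1 contains terms 1 and 2, doc2 contains term 3.
A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
q = np.array([1.0, 0.0, 0.0, 0.0])   # the query asks for term 0 only

plain_scores = A.T @ q                # unnormalized plain vector space scores
scores = expanded_query_scores(A, q, k=1)
```

On this toy data the plain vector space search gives the second document a score of zero because it shares no term with the query, whereas the one-dimensional projection gives it a positive score through the term it shares with the first document. No decomposition of `A` is computed in advance, so a changed matrix only affects the next query.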