Information Retrieval Using Krylov Subspace Methods
In this dissertation we discuss how simple Krylov subspace methods can be used for information retrieval (IR). The dissertation consists of two parts. The first part gives background on IR and introduces the vector space model for IR and the Krylov subspace methods that we use. The second part consists of four articles.
The first article introduces the concept of subspace methods and, in particular, shows how simple Krylov subspace methods can be used for IR.
In the second article we show how simple modifications of the original Krylov subspace method for IR can help to steer which documents the process brings in and which it avoids, and thereby increase retrieval performance.
Retrieval performance of IR systems improves significantly if proper term weighting is used: terms with high search value are weighted up and terms with low search value are weighted down. Several term weighting schemes have been proposed in the IR community. In the third article we experiment with different term weighting schemes.
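As an illustration of the weighting idea, the following is a minimal Python sketch of one widely used scheme, tf-idf (term frequency times inverse document frequency). It is an assumption chosen for illustration and not necessarily one of the schemes tested in the third article; the function name `tf_idf_weights` and the toy documents are hypothetical.

```python
import math


def tf_idf_weights(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    `docs` is a list of token lists. This is an illustrative scheme only,
    not necessarily one of those evaluated in the dissertation.
    """
    n_docs = len(docs)

    # Document frequency: the number of documents that contain each term.
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1

    weights = []
    for doc in docs:
        w = {}
        for term in doc:
            w[term] = w.get(term, 0) + 1  # raw term frequency
        for term in w:
            # Terms occurring in few documents get a larger factor;
            # a term occurring in every document gets weight zero.
            w[term] *= math.log(n_docs / df[term])
        weights.append(w)
    return weights


# Hypothetical toy collection of three tokenized documents.
docs = [
    ["krylov", "subspace", "retrieval"],
    ["term", "weighting", "retrieval"],
    ["krylov", "method", "retrieval"],
]
weights = tf_idf_weights(docs)
```

In this toy collection, "retrieval" occurs in every document and so receives weight zero, while "subspace", which occurs in only one document, is weighted higher than "krylov", which occurs in two.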
In the fourth article we discuss how the Krylov subspace method is able to indicate even weak connections between groups of relevant documents. We also show how simple modifications of the method can be used to lower the scores of irrelevant documents. All experiments in the fourth article are carried out on data sets from the TREC (Text REtrieval Conference) collection.