Empirical Observations regarding Predictability in User Access-Behavior in a Distributed Digital Library System
Paper in proceedings, 2002

Document archives today are geographically distributed but often not replicated. This can result in a low quality of service in terms of reduced availability and long user-perceived access times, especially during peak hours. Indiscriminate replication is not feasible because of the sheer size of the database and the effort of administering it. The goal of an ongoing project is to study how effectively caching techniques such as prefetching and selective preloading can improve the quality of service of digital library systems. In this paper, we use user access logs from DADS, a digital library system developed at the Technical Knowledge Center of Denmark, DTV, to analyze whether user access behavior is predictable enough to guide which articles to prefetch or preload. We have found that once a literature search has been narrowed down to fewer than ten articles, there is a high likelihood that some of them will eventually be downloaded. This suggests that prefetching can be used to hide the article transfer latency. We have also found that as many as 80% of the article downloads are confined to fewer than 20% of the journals. This suggests that preloading a small fraction of the digital library database can significantly shorten the access latency and improve availability.
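
The journal-concentration finding in the abstract can be illustrated with a small calculation over a download log. The following is a minimal sketch only, not the authors' method: the function name `journal_share_for_coverage`, the log format (one journal identifier per download), and the sample data are all assumptions made for illustration.

```python
from collections import Counter


def journal_share_for_coverage(downloads, total_journals, coverage=0.8):
    """Return the fraction of all journals needed to cover `coverage` of downloads.

    `downloads` is a hypothetical log: one journal identifier per article download.
    `total_journals` is the number of journals in the library (assumed known).
    """
    if total_journals <= 0:
        raise ValueError("total_journals must be positive")
    counts = Counter(downloads)
    total = sum(counts.values())
    covered = 0
    needed = 0
    # Walk journals from most to least downloaded until the target share is reached.
    for _, n in counts.most_common():
        covered += n
        needed += 1
        if covered >= coverage * total:
            break
    return needed / total_journals


# Made-up example: 10 downloads spread over 3 of the library's 10 journals.
sample_log = ["J1"] * 7 + ["J2"] * 2 + ["J3"]
share = journal_share_for_coverage(sample_log, total_journals=10)
```

A small returned share would indicate that preloading only the most popular journals covers most downloads, which is the kind of skew the abstract reports.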

Digital Library

Caching

Authors

Jochen Hollmann

Chalmers, Department of Computer Engineering, Computer Architecture

Anders Ardö

Per Stenström

Chalmers, Department of Computer Engineering, Computer Architecture

Proceedings of the 16th International Parallel and Distributed Processing Symposium

221-228
0-7695-1573-8 (ISBN)

Subject Categories

Computer Engineering

Other Computer and Information Science

DOI

10.1109/IPDPS.2002.1016636

More information

Created

10/7/2017