Diverse Data Expansion with Semi-Supervised k-Determinantal Point Processes
Paper in proceedings, 2023

Determinantal point processes (DPPs) have become prominent in data summarization and recommender system tasks for their ability to model both diversity and relevance. In practical applications, k-determinantal point processes (k-DPPs) are used to select the k items that are most representative of a set of size N. In this paper, we study a special case of the diverse subset selection problem in which a fixed set G0 is already given as a forced recommendation and the task is to determine the remainder of the recommendation, G1. Here, the standard k-DPP optimization objectives can suggest items that are close to optimal when only the items in G1 are considered, but that lie arbitrarily close to items in G0, i.e., they might not be sufficiently diverse w.r.t. G0. We explore a semi-supervised k-DPP objective that simultaneously considers G0 and G1 and compares the difference between the two recommendations. We demonstrate our findings using multiple examples where the diverse subset selection problem with forced recommendation is important in practice.
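The failure mode described in the abstract can be illustrated with a small sketch. This is not the objective or algorithm from the paper; it assumes a common formulation in which a subset is scored by the log-determinant of the corresponding kernel submatrix, and it uses a simple greedy heuristic. The function name `greedy_conditional_selection`, the toy feature matrix `X`, and the kernel construction are all hypothetical choices made for illustration.

```python
import numpy as np


def greedy_conditional_selection(L, g0, k):
    """Greedily pick k items G1 that increase log det(L[G0 ∪ G1]) the most.

    L  : (N, N) symmetric positive semi-definite kernel matrix
    g0 : indices of the forced items G0 (may be empty)
    k  : number of additional items to select
    """
    selected = list(g0)
    candidates = [i for i in range(L.shape[0]) if i not in set(g0)]
    g1 = []
    for _ in range(k):
        best_item, best_score = None, -np.inf
        for c in candidates:
            idx = selected + [c]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            # A non-positive determinant means the candidate adds no volume.
            score = logdet if sign > 0 else -np.inf
            if score > best_score:
                best_item, best_score = c, score
        if best_item is None:
            break
        selected.append(best_item)
        candidates.remove(best_item)
        g1.append(best_item)
    return g1


# Toy data (assumed): item 1 is almost parallel to item 0 but has the
# largest norm; item 2 is orthogonal to item 0.
X = np.array([[1.0, 0.0],
              [1.1, 0.1],
              [0.0, 0.9]])
L = X @ X.T + 1e-9 * np.eye(3)  # small ridge for numerical stability

g0 = [0]
rest = [1, 2]

# Objective that scores only G1: restrict the kernel to the remaining items.
ignoring_g0 = [rest[i] for i in
               greedy_conditional_selection(L[np.ix_(rest, rest)], [], 1)]

# Objective that scores G0 and G1 jointly.
aware_of_g0 = greedy_conditional_selection(L, g0, 1)

print(ignoring_g0)  # [1]: the near-duplicate of the forced item
print(aware_of_g0)  # [2]: the item that is diverse w.r.t. G0
```

In this toy setting, scoring G1 alone favors item 1 because of its large norm, even though it nearly duplicates the forced item 0, while the joint score prefers the orthogonal item 2.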

Data Summarization

Diversity and Relevance

Determinantal Point Process

Authors

Simon Johansson

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Ola Engkvist

AstraZeneca AB

Morteza Haghir Chehreghani

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Alexander Schliep

Brandenburg University of Technology

Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023

5260-5265
9798350324457 (ISBN)

2023 IEEE International Conference on Big Data, BigData 2023
Sorrento, Italy

Subject Categories

Computer Science

DOI

10.1109/BigData59044.2023.10386642

More information

Latest update

2/26/2024