Diverse Data Expansion with Semi-Supervised k-Determinantal Point Processes
Paper in proceedings, 2023

Determinantal point processes (DPPs) have become prominent in data summarization and recommender system tasks for their ability to model diversity and relevance simultaneously. In practical applications, k-determinantal point processes (k-DPPs) are used to select the k items from a set of size N that are most representative of the set. In this paper, we study a special case of the diverse subset selection problem where a fixed set G0 is already given as a forced recommendation and the task is to determine the remainder of the recommendation, G1. Here, the standard k-DPP optimization objectives can suggest items that are close to optimal when considering only the items in G1 but are arbitrarily close to items in G0, i.e., they might not be sufficiently diverse w.r.t. G0. We explore a semi-supervised k-DPP objective that simultaneously considers G0 and G1 and compares the difference between the two recommendations. We demonstrate our findings using multiple examples where the diverse subset selection problem with forced recommendation is important in practice.
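
The contrast described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes an RBF similarity kernel, a greedy log-determinant heuristic in place of exact k-DPP optimization, and a toy 2D dataset. The names rbf_kernel, greedy_logdet, g1_plain and g1_semi are hypothetical. The determinant of the kernel submatrix shrinks when selected items are similar, so scoring G0 and G1 jointly penalizes candidates that sit close to the forced items.

```python
import numpy as np

def rbf_kernel(X, gamma=0.1):
    # Assumed similarity kernel: L[i, j] = exp(-gamma * ||x_i - x_j||^2).
    diff = X[:, None, :] - X[None, :, :]
    return np.exp(-gamma * np.sum(diff ** 2, axis=-1))

def greedy_logdet(L, k, forced, candidates):
    # Greedily add k candidates, each time maximizing log det of the
    # kernel submatrix over forced items plus everything chosen so far.
    # This is a common heuristic, not an exact k-DPP optimizer.
    selected = list(forced)
    remaining = list(candidates)
    for _ in range(k):
        best, best_val = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            val = logdet if sign > 0 else -np.inf
            if best is None or val > best_val:
                best, best_val = i, val
        selected.append(best)
        remaining.remove(best)
    return selected[len(forced):]

# Toy data: item 0 is the forced set G0, item 1 is a near-duplicate of it.
X = np.array([[0.0, 0.0], [0.1, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 4.0]])
L = rbf_kernel(X)
G0 = [0]
pool = [1, 2, 3, 4]

# Standard objective: diversity is measured among the candidates only.
g1_plain = greedy_logdet(L, k=2, forced=[], candidates=pool)

# Semi-supervised objective: diversity is measured over G0 and G1 jointly.
g1_semi = greedy_logdet(L, k=2, forced=G0, candidates=pool)

print("ignoring G0:     ", g1_plain)
print("conditioned on G0:", g1_semi)
```

In this toy setup, the joint objective avoids item 1 because including it next to item 0 makes the submatrix nearly singular, whereas the candidates-only objective has no information about item 0 and therefore no reason to avoid it.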

Data Summarization

Diversity and Relevance

Determinantal Point Process

Authors

Simon Johansson

Chalmers, Computer Science and Engineering, Data Science and AI

Ola Engkvist

AstraZeneca AB

Morteza Haghir Chehreghani

Chalmers, Computer Science and Engineering, Data Science and AI

Alexander Schliep

Brandenburgische Technische Universität

Proceedings - 2023 IEEE International Conference on Big Data, BigData 2023

5260-5265
9798350324457 (ISBN)

2023 IEEE International Conference on Big Data, BigData 2023
Sorrento, Italy

Subject Categories

Computer Science

DOI

10.1109/BigData59044.2023.10386642

More information

Latest update

2024-02-26