Direct-to-Reverberant Energy Ratio Estimation and Extrapolation from Own Speech
Paper in proceedings, 2025

Accurately characterizing a user's acoustic environment is essential for creating virtual sound sources in augmented reality that blend seamlessly into the real environment. The acoustic parameters of an environment can be calculated from a room impulse response (RIR), and the authors recently presented a method to blindly estimate RIRs from speech signals captured with a head-worn microphone array. The approach uses either speech from a distant speaker or the own speech of the person wearing the array. While both variants provide reliable reverberation time estimates, direct-to-reverberant energy ratio (DRR) estimates from the user's own speech deviate significantly from the expected DRR of a distant virtual source because the direct sound level is higher. This study investigates the feasibility of extrapolating DRR values from own speech to predict the DRRs of distant sources. The approach relies on two acoustic assumptions: (i) the mouth-to-array transfer paths do not change significantly between users, and (ii) the reverberant field is homogeneous. Our findings show that the assumptions hold above the Schröder frequency and in sufficiently reverberant conditions. Average DRR extrapolation errors are below 2 dB at mid frequencies when using mouth simulator measurements and around 3 dB with actual speech recordings.
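To make the quantities in the abstract concrete, the following is a minimal sketch of how a DRR can be computed from an RIR and how an own-speech DRR could be shifted toward that of a distant source. It is not the authors' method. The function names, the 2.5 ms direct-sound window, the synthetic RIR, and the single known direct-level offset are all assumptions made for illustration. The shift relies only on assumption (ii): if the reverberant energy is the same for both sources, the two DRRs differ by the difference in direct sound level.

```python
import numpy as np


def drr_db(rir, fs, direct_window_ms=2.5):
    """DRR in dB: energy in a window around the strongest peak, divided by
    the energy of everything after that window.
    The window length is an assumed value, not taken from the paper."""
    rir = np.asarray(rir, dtype=float)
    peak = int(np.argmax(np.abs(rir)))
    half = int(round(direct_window_ms * 1e-3 * fs))
    start = max(peak - half, 0)
    stop = min(peak + half + 1, rir.size)
    direct = np.sum(rir[start:stop] ** 2)
    reverberant = np.sum(rir[stop:] ** 2)
    return 10.0 * np.log10(direct / reverberant)


def extrapolate_drr_db(drr_own_db, direct_level_offset_db):
    """Hypothetical extrapolation: with identical reverberant energy, the
    distant-source DRR is the own-speech DRR minus the (assumed known) dB
    amount by which the own-speech direct sound exceeds the distant one."""
    return drr_own_db - direct_level_offset_db


def synthetic_rir(direct_gain, fs=16000, seed=0):
    """Toy RIR for illustration only: one direct impulse followed by an
    exponentially decaying noise tail that is the same for every gain."""
    rng = np.random.default_rng(seed)
    rir = np.zeros(fs // 2)
    rir[100] = direct_gain
    tail = rng.standard_normal(rir.size - 400)
    decay = np.exp(-np.arange(tail.size) / (0.05 * fs))
    rir[400:] = 0.05 * tail * decay
    return rir


if __name__ == "__main__":
    fs = 16000
    own = synthetic_rir(direct_gain=10.0, fs=fs)      # close source (mouth)
    distant = synthetic_rir(direct_gain=1.0, fs=fs)   # same tail, weaker direct
    offset_db = 20.0 * np.log10(10.0 / 1.0)           # direct level difference
    print("own-speech DRR [dB]:  ", drr_db(own, fs))
    print("distant DRR [dB]:     ", drr_db(distant, fs))
    print("extrapolated DRR [dB]:", extrapolate_drr_db(drr_db(own, fs), offset_db))
```

In this toy case the extrapolated value matches the distant-source DRR by construction, because both RIRs share the same tail. In measured rooms the match depends on how well the two assumptions stated in the abstract hold.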

Room Acoustics

Augmented Reality

Direct-to-Reverberant Energy Ratio

Room Impulse Response

Authors

Nils Meyer-Kahlen

Aalto-Yliopisto

Thomas Deppisch

Chalmers, Architecture and Civil Engineering, Applied Acoustics

Proc. 33rd European Signal Processing Conference (EUSIPCO)


978-9-46-459362-4 (ISBN)

33rd European Signal Processing Conference (EUSIPCO)
Palermo, Italy

Areas of Advance

Information and Communication Technology

Subject Categories (SSIF 2025)

Signal Processing

DOI

10.23919/EUSIPCO63237.2025.11226315

More information

Last updated

2025-12-12