Speech and Video Coding for Unreliable Channels

Daniel Persson

Doctoral thesis, 2009

Speech and video communications via the Internet and cellular phones have become a part of modern life. This thesis deals with different methods for speech and video transmission over noisy and lossy communication channels. The block-based video error concealment problem, where redundancy in the received data is used for mending lost parts, is considered first. A Gaussian mixture model (GMM)-based estimator of lost pixels is devised, and different analytical solutions are derived for the important case of consecutive packet losses. Though asymptotically optimal in the number of mixture components, the GMM-based scheme is computationally complex and relies on offline parameter optimization that does not maximize the established peak signal-to-noise ratio (PSNR) distortion measure used for evaluating the system. In a second effort, a low-complexity mixture-based estimator, whose parameters are found by an algorithm that maximizes PSNR in every iteration, is therefore proposed. The new estimator outperforms the GMM-based solution in terms of PSNR for all computational complexities. In a third effort, a slightly more general error concealment problem is considered, namely the estimation of both motion vectors and pixels. Error concealment based on Markov random field modeling of surrounding motion vectors and pixels, motion vector estimates, and a replacement by an arbitrary spatial error concealment method, is proposed.

A newly suggested quantization method, power series quantization (PSQ), uses a codebook containing non-linear functions of previously quantized data. Though general in nature, this scheme has been employed for linear prediction-based voice coding, where it increased performance compared to several common quantization methods for sources with memory. We derive a PSQ-based optimal encoder, and several codebook optimization algorithms, for the case of noisy channels. Simulations are in agreement with high rate-capacity and rate-distortion-capacity performance predictions. We finally propose a framework for transmission of sources with memory by means of a multiple description coding strategy over packet channels. Within this framework, we derive an optimal PSQ encoder and a codebook optimization algorithm.
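
As a rough illustration of the GMM-based concealment idea mentioned in the abstract, the sketch below computes the conditional mean of missing components given observed ones under a Gaussian mixture, together with the PSNR measure. It is a minimal sketch and not the estimator derived in the thesis: the function names, the index arguments `obs_idx` and `mis_idx`, and the assumption that the mixture parameters were already trained offline are all illustrative choices.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate Gaussian evaluated at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + maha)

def gmm_conditional_mean(x_obs, weights, means, covs, obs_idx, mis_idx):
    """Estimate missing components as E[missing | observed] under a GMM.

    Assumption: weights, means and covs describe a mixture over the joint
    vector (observed and missing parts) that was fitted beforehand.
    """
    log_post, cond_means = [], []
    for w, mu, C in zip(weights, means, covs):
        mu_o, mu_m = mu[obs_idx], mu[mis_idx]
        C_oo = C[np.ix_(obs_idx, obs_idx)]
        C_mo = C[np.ix_(mis_idx, obs_idx)]
        # Unnormalized log-posterior of this component given the observed part.
        log_post.append(np.log(w) + gaussian_logpdf(x_obs, mu_o, C_oo))
        # Per-component linear (Gaussian conditional mean) estimate.
        cond_means.append(mu_m + C_mo @ np.linalg.solve(C_oo, x_obs - mu_o))
    log_post = np.array(log_post)
    post = np.exp(log_post - log_post.max())  # subtract max for stability
    post /= post.sum()
    # Posterior-weighted combination of the per-component estimates.
    return post @ np.array(cond_means)

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB, assuming 8-bit pixels by default."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    mse = np.mean((ref - est) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

In a concealment setting, `x_obs` would hold pixels from the received spatial and temporal neighbourhood and the returned vector would replace the lost pixels; how that neighbourhood is chosen is left open here.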

joint source-channel coding

channel-optimized quantization

ED-salen, Hörsalsvägen 11, Chalmers

Opponent: Sören Holdt Jensen, Dept. of Electronic Systems, Aalborg University, Denmark

Author

Daniel Persson

Chalmers, Signals and Systems

Subject categories (SSIF 2011)

Signal processing

ISBN

978-91-7385-216-6

Doctoral theses at Chalmers University of Technology. New series: 2897

More information

Created

2017-10-08
