Unequal Probability Sampling in Active Learning and Traffic Safety
Licentiate thesis, 2019
Using finite population sampling methodology, we address the issue of appropriate subset selection. We show how sample selection may be optimised to maximise precision in estimating various parameters and quantities of interest, and extend the existing finite population sampling methodology to an adaptive, sequential sampling framework, where information required for sample scheme optimisation may be updated iteratively as more data is collected. The implications of model misspecification are discussed, and the robustness of the finite population sampling methodology against model misspecification is highlighted.
The proposed methods are illustrated and evaluated on two problems: subset selection for optimal prediction in active learning (Paper I), and optimal control sampling for the analysis of safety-critical events in naturalistic driving studies (Paper II). It is demonstrated that optimised sample selection may reduce the number of records for which complete information needs to be collected by as much as 50%, compared to conventional methods and uniform random sampling.
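As a rough illustration of the general idea behind unequal probability sampling (not the optimised schemes developed in the thesis), the sketch below draws a Poisson sample with inclusion probabilities proportional to an auxiliary variable and estimates a population total by inverse probability weighting (the Horvitz-Thompson estimator). The simulated population, the function names, and the choice of Poisson sampling are all assumptions made for this example.

```python
import random


def inclusion_probabilities(size_measure, expected_n):
    """Inclusion probabilities proportional to a positive size measure, capped at 1."""
    total = sum(size_measure)
    return [min(1.0, expected_n * s / total) for s in size_measure]


def poisson_sample(probs, rng):
    """Select each unit independently with its own inclusion probability."""
    return [i for i, p in enumerate(probs) if rng.random() < p]


def horvitz_thompson_total(y, probs, sample):
    """Estimate the population total by weighting each sampled value by 1 / pi_i."""
    return sum(y[i] / probs[i] for i in sample)


def poisson_ht_variance(y, probs):
    """Exact design variance of the estimator under Poisson sampling."""
    return sum((1.0 - p) / p * yi ** 2 for yi, p in zip(y, probs))


# Hypothetical population: x is cheap and known for all units, y is costly to observe.
rng = random.Random(0)
N, n = 500, 50
x = [rng.uniform(1.0, 10.0) for _ in range(N)]
y = [2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
true_total = sum(y)

unequal_probs = inclusion_probabilities(x, n)  # units with large x are sampled more often
uniform_probs = [n / N] * N                    # every unit equally likely

sample = poisson_sample(unequal_probs, rng)
estimate = horvitz_thompson_total(y, unequal_probs, sample)

print("true total:", round(true_total, 1))
print("estimate from one unequal probability sample:", round(estimate, 1))
print("design variance, unequal probabilities:", round(poisson_ht_variance(y, unequal_probs)))
print("design variance, uniform probabilities:", round(poisson_ht_variance(y, uniform_probs)))
```

In this toy setting the sampling probabilities are fixed in advance from the auxiliary variable. In a sequential framework of the kind described above, they would instead be updated as more data is collected.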
naturalistic driving
active learning
sampling weighting
optimal design
sequential sampling
probability sampling
Author
Henrik Imberg
Chalmers, Mathematical Sciences, Applied Mathematics and Statistics
Optimal sampling in unbiased active learning
Proceedings of Machine Learning Research, Vol. 108 (2020), p. 559-569
Paper in proceedings
Optimization of Two-Phase Sampling Designs with Application to Naturalistic Driving Studies
IEEE Transactions on Intelligent Transportation Systems, Vol. 23 (2022), p. 3575-3588
Journal article
Subject Categories (SSIF 2011)
Probability Theory and Statistics
Publisher
Chalmers
Euler, Mathematical Sciences, Skeppsgränd 3, Gothenburg
Opponent: Associate Professor Krzysztof Bartoszek, Division of Statistics and Machine Learning, Department of Computer and Information Science, Linköping University, Linköping, Sweden