Robust Visual Object Tracking: mean shift, particle filters and point features
Visual object tracking has been identified as a promising technique for many computer vision applications such as surveillance, flight navigation, video compression and driver assistance. The main idea is to find the state of the object and how it changes over time across successive video frames. The state can be arbitrarily complex but usually consists of object shape and appearance. A robust visual tracker has to deal with many challenges, such as object affine transformation, pose change, deformation, partial or full occlusion, complex background or clutter, a dynamic camera, image noise, illumination change and real-time constraints.
This thesis addresses these problems in visual object tracking by combining object appearance, shape and the motion of the bounding box. The proposed method uses an anisotropic mean-shift tracker for appearance similarity and an SIR particle filter for tracking the bounding box. It adds spatial information to the histogram calculation by dividing the bounding box into disjoint areas, and finds the height, width and orientation of the region by checking the goodness of the sub-region bandwidth estimate through the Bhattacharyya similarity coefficient. Unlike previous algorithms that embed the mean-shift tracker in a particle-filter framework, the proposed method extends this idea by not only renewing the box center but also estimating the width, height and orientation of the region using multi-mode anisotropic mean shift. The combined scheme maintains the merits of both methods, uses a small number of particles (<20) and stabilizes the target trajectory during occlusion.
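To make the appearance model concrete, the following is a minimal numpy sketch (an illustration, not the thesis's exact implementation) of the two ingredients named above: a bounding box divided into disjoint sub-regions with one normalized histogram per sub-region, and the Bhattacharyya coefficient used to score candidate regions against the target model. The grid size, bin count and grayscale input are assumptions for the example.

```python
import numpy as np

def subregion_histograms(patch, n_bins=16, grid=(2, 2)):
    """Split a grayscale patch into disjoint sub-regions and compute one
    normalized intensity histogram per sub-region (a simplified stand-in
    for spatially partitioned kernel histograms)."""
    h, w = patch.shape
    rows, cols = grid
    hists = []
    for i in range(rows):
        for j in range(cols):
            sub = patch[i * h // rows:(i + 1) * h // rows,
                        j * w // cols:(j + 1) * w // cols]
            hist, _ = np.histogram(sub, bins=n_bins, range=(0, 256))
            hists.append(hist / max(hist.sum(), 1))
    return hists

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalized histograms:
    1.0 for identical distributions, smaller for dissimilar ones."""
    return float(np.sum(np.sqrt(p * q)))

# Toy usage: a patch compared with itself gives the maximal score.
patch = np.random.default_rng(0).integers(0, 256, (32, 32))
target = subregion_histograms(patch)
candidate = subregion_histograms(patch)
score = np.mean([bhattacharyya(p, q) for p, q in zip(target, candidate)])
```

In a tracker, each particle (or mean-shift iterate) would propose a box, and the mean of the per-sub-region coefficients would serve as its appearance likelihood; the spatial partition penalizes candidates whose overall color distribution matches but whose layout does not.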
A method for combining local point features and global appearance-based video tracking is also discussed. In the first step, it uses the SIFT algorithm to extract and match local features and RANSAC to estimate the affine transformation parameters of the bounding box. In the second step, the results are further refined by appearance-based tracking through mean shift. The goal of this fusion is twofold: to avoid point-feature tracking failure when the number of correct matches falls below the minimum required to estimate the motion model, and to handle partial occlusion by utilizing the local salient point features in the non-occluded part.
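The RANSAC step above can be sketched as follows. This is an illustrative numpy implementation under assumed parameters (iteration count, inlier threshold, minimum inlier count), taking already-matched point pairs as input rather than re-implementing SIFT; returning None when too few inliers survive corresponds to the failure case in which the fusion falls back on the appearance-based tracker.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine matrix A such that dst ≈ [src, 1] @ A.T."""
    X = np.hstack([src, np.ones((len(src), 1))])
    A, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return A.T

def ransac_affine(src, dst, n_iter=200, thresh=2.0, min_inliers=6, seed=0):
    """RANSAC: repeatedly fit an affine model to 3 random correspondences
    and keep the model with the most inliers. Returns None if the best
    model explains fewer than min_inliers matches."""
    rng = np.random.default_rng(seed)
    best, best_inliers = None, 0
    X = np.hstack([src, np.ones((len(src), 1))])
    for _ in range(n_iter):
        idx = rng.choice(len(src), 3, replace=False)
        A = fit_affine(src[idx], dst[idx])
        residuals = np.linalg.norm(X @ A.T - dst, axis=1)
        inliers = int(np.sum(residuals < thresh))
        if inliers > best_inliers:
            best, best_inliers = A, inliers
    return best if best_inliers >= min_inliers else None

# Toy usage: recover a known rotation + translation from synthetic matches.
rng = np.random.default_rng(1)
src = rng.uniform(0, 100, (30, 2))
theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
dst = src @ R.T + np.array([5.0, -3.0])
A = ransac_affine(src, dst)
```

Three correspondences are the minimum that determine an affine transform, which is why the matcher must supply at least that many correct pairs before the motion model can be estimated at all.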
Experiments showed better performance compared to previous approaches in handling partial occlusions, object intersections, clutter and tracking drifts, and in the tightness and accuracy of the bounding box.
Keywords: visual object tracking
Room EA, floor 4, Hörsalsvägen 11
Opponent: Prof. Hamid Krim, Dept. of Electrical and Computer Engineering, North Carolina State University, USA