Tracking Scheme for Rigid Object with Instance Matching and Online Learning
Jui-Hsin (Larry) Lai



Demo Video www.larry-lai.com/tracking.html
2
Outline
Difficulty in Object Tracking

Proposed Tracking Scheme

Experimental Results

Conclusion & Extensions
3
Difficulty in Object Tracking (1/4)
Translation
translation is the simplest difficulty in object tracking

most previous works were designed to solve this kind of problem
Zooming
the tracking features should be well designed

e.g., scale-invariant features

e.g., a scalable object size
4
Difficulty in Object Tracking (2/4)
Rotation
cannot be solved without directional features

most previous works fail on this problem
Panning/Tilting
a very difficult challenge in object tracking

the object’s appearance goes beyond the initial training
5
Difficulty in Object Tracking (3/4)
Occlusion
a common challenge in real cases

tracking features should be tolerant to outliers
Illumination
a common challenge in real cases

tracking features should be tolerant to luminance change
6
Difficulty in Object Tracking (4/4)
Blur
out-of-focus capture is common

tracking features should be tolerant to image blur
The real cases...
a combination of all the difficulties

designing a robust tracking algorithm seems like mission impossible
7
Challenge
overcome these difficulties

translation, zooming, rotation, tilting/panning

occlusion, illumination, blur, and their combination
A tracking algorithm
highly accurate performance

high precision rate

high recall rate
low computational load

real-time tracking system

target: 640x480 at 30 fps
8
Outline
Difficulty in Object Tracking

Proposed Tracking Scheme

Experimental Results

Conclusion & Extensions
9
Tracking Scheme -- Step 1
... Target Instance
Possible Region
Region filter to preserve possible regions
10
Reduce total execution time

feature extraction on the whole frame is time-consuming

instance matching on the whole frame is also time-consuming
Find the possible regions
Region Filter
Increase tracking accuracy

tracking performance would decrease if matching against the whole frame
11
?
t+3
Particles are randomly distributed, and the important particles are resampled.
t
t+1
t+2
Particle Filter as the Region Filter
Particles with high matching
probability are considered as
the possible regions
Sampling Importance Resampling (SIR)
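The sampling-importance-resampling step above can be sketched as follows (a minimal illustration, not the deck's implementation; `match_score` is a stand-in for the histogram-based importance described on the next slide):

```python
import random

def systematic_resample(particles, weights):
    """Systematic resampling: particles with high importance weights are
    duplicated, low-weight ones are dropped; the particle count is kept."""
    n = len(particles)
    total = sum(weights)
    step = total / n
    u = random.uniform(0, step)
    resampled, cum, i = [], weights[0], 0
    for k in range(n):
        target = u + k * step
        while target > cum:
            i += 1
            cum += weights[i]
        resampled.append(particles[i])
    return resampled

def region_filter(particles, match_score, spread=5.0):
    """One tracking step: weight each particle region by its matching
    probability, resample, then perturb to re-spread the particles."""
    weights = [match_score(p) for p in particles]
    survivors = systematic_resample(particles, weights)
    return [p + random.gauss(0.0, spread) for p in survivors]
```

After resampling, the surviving particles mark the possible regions handed to the later matching stages.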
12
Similarity of luminance histogram
a regional statistical feature for the luminance distribution

a histogram-shift method compensates for luminance variation

copes with translation, zooming, and blur
“Importance” in Particle Matching
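A minimal sketch of the histogram-shift similarity (our own illustration; the bin count and shift range are assumptions, not the deck's settings):

```python
import numpy as np

def hist_similarity(patch_a, patch_b, bins=32, max_shift=4):
    """Luminance-histogram similarity with histogram shift: sliding one
    histogram by a few bins compensates for a global luminance change."""
    ha, _ = np.histogram(patch_a, bins=bins, range=(0, 256))
    hb, _ = np.histogram(patch_b, bins=bins, range=(0, 256))
    ha = ha / ha.sum()
    hb = hb / hb.sum()
    best = 0.0
    for s in range(-max_shift, max_shift + 1):
        # histogram intersection of ha with a shifted copy of hb
        best = max(best, float(np.minimum(ha, np.roll(hb, s)).sum()))
    return best
```

Because the statistic ignores pixel positions, it tolerates translation, zooming, and blur, as the slide notes.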
[Figure: luminance histograms (pixel counts, 0-20000, over 256 luminance levels) of the target and candidate regions at several scales]
Scaling
13
Tracking Scheme -- Step 2
Unified Instance Size
Feature Detection
...
...
...
...
Descriptor
Feature extraction on unified instance size
14
More Accurate Features
Need more accurate features to cope with Rotation and Tilting/Panning

feature detectors with accurate feature location/size/orientation/transformation

feature descriptors with more detailed descriptions
How to choose the feature detector

FAST, MSER, STAR, DoG, Harris, Hessian, GoodFeaturesToTrack (GFTT), ...

Harris-Affine, Hessian-Affine, ...
How to choose the feature descriptor

SIFT, A-SIFT, SURF, BRIEF, ...
15
More Powerful Feature Detectors
Simple features

only cope with translation and rotation, without scale invariance

computation time is short
Affine-invariant features

invariant to affine transformations

but computation time increases drastically

affine-invariant features have lower accuracy than simple features, especially at angles below 40°

most importantly, the real case is usually a perspective transformation, not an affine transformation
Trade-off
In our opinion, no feature detector can completely handle the real cases.
16
Unified Instance Size
Normalize the candidate region from the Region Filter

apply a simple feature detector and descriptor

cope with translation, zooming, and rotation
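The normalization step can be sketched as follows (our own nearest-neighbour illustration; the experiments later fix the unified size at 152x152):

```python
import numpy as np

def unify_instance(region, size=152):
    """Resize a candidate region to the unified instance size with
    nearest-neighbour sampling, so a simple (non-scale-invariant)
    feature detector always sees the object at the same scale."""
    h, w = region.shape[:2]
    ys = np.arange(size) * h // size   # source row for each output row
    xs = np.arange(size) * w // size   # source column for each output column
    return region[ys[:, None], xs]
```

Feature detection and description then run on this fixed-size patch, which also bounds the per-object extraction time.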
17
Tracking Scheme -- Step 3
Panning in X-axis
Tilting
in
Y-axis
Instance Number
Instance
Database
Instance Matching
& Pose Estimation
Instance matching and online learning
18
Instance Matching
Matching against the instances in the database
Find the best-matching instance

calculate the perspective transformation from the matched features

the successful matching angle is below 30° because a simple feature detector is used
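The feature-matching step can be sketched as a nearest-neighbour search with Lowe's ratio test (our own illustration; the perspective transformation would then be fitted to the surviving matches, e.g. with RANSAC):

```python
import numpy as np

def match_features(desc_query, desc_instance, ratio=0.8):
    """Return (query_index, instance_index) pairs whose nearest neighbour
    is clearly closer than the second nearest (Lowe's ratio test)."""
    matches = []
    for i, d in enumerate(desc_query):
        dists = np.linalg.norm(desc_instance - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches
```

The instance with the most surviving matches is taken as the best-matching one.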
19
Pose Estimation
Find the object’s XYZ rotation and XYZ translation

refer to the AR-calibration method to find the rotation and translation in real-world coordinates

a refinement model is proposed to solve the jitter problem
A perspective model alone is not enough
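The slide does not give the refinement model itself; as one hypothetical illustration of jitter suppression, small frame-to-frame pose changes can be damped with an exponential filter (all names and thresholds here are our assumptions, not the deck's method):

```python
def refine_pose(prev_pose, new_pose, alpha=0.7, jitter_eps=0.5):
    """Hypothetical jitter filter: blend each pose component toward the
    new estimate; changes below jitter_eps are treated as noise and
    damped much harder than genuine motion."""
    refined = []
    for p, n in zip(prev_pose, new_pose):
        gain = alpha if abs(n - p) > jitter_eps else 0.1
        refined.append(p + gain * (n - p))
    return refined
```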
20
Online Learning (1/2)
Panning in X-axis
Tilting
in
Y-axis
Instance Number
3D Instance Model

learn all of the instance’s appearances online

the appearance at each panning/tilting angle is recorded

construct the instance’s 3D model
Construct the database online
Multiple Instances

multiple instances at each panning/tilting angle

record varying luminance, occlusion, and blur situations
21
Online Learning (2/2)
The database keeps growing
Trade-off

a more complete database provides higher tracking performance

but the computation time increases linearly as the database grows
Set an upper bound on the database size

the total instance number at each angle is fixed

a First-In-First-Out mechanism removes the earliest instance from the database

=> computation time stays fixed and performance remains high
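The bounded database can be sketched with a per-angle FIFO queue (a minimal illustration; the experiments later fix the per-angle bound at 17 instances):

```python
from collections import deque

class InstanceDatabase:
    """Instance store keyed by quantized (pan, tilt) angle; each angle
    keeps at most max_per_angle instances, evicting the oldest first,
    so matching time stays bounded as online learning continues."""
    def __init__(self, max_per_angle=17):
        self.max_per_angle = max_per_angle
        self._bins = {}

    def add(self, pan_tilt, instance):
        # deque(maxlen=...) silently drops the earliest entry when full
        bin_ = self._bins.setdefault(pan_tilt, deque(maxlen=self.max_per_angle))
        bin_.append(instance)

    def instances(self, pan_tilt):
        return list(self._bins.get(pan_tilt, ()))
```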
22
Review of Proposed Tracking Scheme
Region filter to preserve possible regions
... Target Instance
Possible Region
...
...
...
...
Descriptor
Feature Detection
Scaling
Unified Instance Size
Feature extraction on unified instance size
Instance matching and online learning
Panning in X-axis
Tilting
in
Y-axis
Instance Number
Instance Matching
& Pose Estimation
Instance
Database
Cope with


Translation


Zooming


Blur
Cope with


Translation


Zooming


Rotation
Cope with


Pan/Tilt


Illumination


Occlusion


Blur
23
Outline
Difficulty in Object Tracking

Proposed Tracking Scheme

Experimental Results

Conclusion & Extensions
24
Testing Videos (1/3)
Synthetic videos by 3ds Max

isolate each tracking difficulty

ground-truth results are available
Testing videos with ground-truth results
Background: Simple vs. Complex
Object: Textured vs. Textureless
Provide a complete set of testing videos for future researchers.
25
Testing Videos (2/3)
Testing videos with 13 factors; each video is 10 s long.
1 Zooming for the change of object size


2 Zooming for the change of camera focal length
3 Rotation about the z-axis of the object


4 Rotation about the z-axis of the camera


5 Panning/Tilting change by spinning the object


6 Panning/Tilting change by spinning the camera


7 Translation of the object


8 Translation of the camera


9 Occlusion (by a textured object)


10 Illumination change


11 Deformation
12 Blur
13 Combination
26
Testing Videos (3/3)
The total number of testing videos is 52 (4×13)!
Complex B. & Textured O. Simple B. & Textured O.
Complex B. & Textureless O. Simple B. & Textureless O.
27
Experiments Overview
Exp. 1 -- Performance analysis of tracking scheme


performance w/o region filter


performance w/o unified instance size


performance w/o online learning
Exp. 2 -- Performance analysis of various feature
detector and descriptor


tracking accuracy


tracking time
Exp. 3 -- Performance comparison

3 related studies (state-of-the-art)

objective evaluation vs. subjective evaluation
Exp. 4 -- Real-Time tracking system


video of live demos
28
Exp. 1 -- Tracking Scheme Analysis (1/9)
Figure description
Index of Testing Video
F-Value
Average F-value over the testing video (10 seconds)
The overlapping region between the tracking result and the ground truth is used to calculate the precision and recall rates
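The evaluation above can be sketched as follows (our own illustration, with boxes as (x1, y1, x2, y2)): precision is overlap area over predicted area, recall is overlap area over ground-truth area, and the F-value is their harmonic mean.

```python
def f_value(pred, gt):
    """F-value of a predicted box against a ground-truth box, both given
    as (x1, y1, x2, y2): F = 2PR / (P + R) with precision P and recall R
    computed from the overlapping region."""
    ix = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    iy = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    overlap = ix * iy
    if overlap == 0.0:
        return 0.0
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    precision = overlap / area(pred)
    recall = overlap / area(gt)
    return 2 * precision * recall / (precision + recall)
```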
29
Exp. 1 -- Tracking Scheme Analysis (2/9)
F-values of the proposed tracking scheme
Complex B. & Textureless O. Simple B. & Textureless O.
Complex B. & Textured O. Simple B. & Textured O.
Occlusion (by a textured object)
Rotation Translation
30
Discussion
Exp. 1 -- Tracking Scheme Analysis (3/9)
Performance with the proposed tracking scheme

high precision and recall rates on most testing videos

textureless objects have lower performance than textured objects

a complex background or an occluding object sometimes decreases tracking performance
Complex Background / Textured occluding object
31
Exp. 1 -- Tracking Scheme Analysis (4/9)
F-values with and without Region Filter
Complex B. & Textureless O. Simple B. & Textureless O.
Complex B. & Textured O. Simple B. & Textured O.
32
Performance without Region Filter

precision and recall rates of most testing videos drastically decrease

it is difficult to find correct feature matches when facing a large number of features

execution time is 12.36 times slower on average due to the large number of features in the whole frame
Discussion
Exp. 1 -- Tracking Scheme Analysis (5/9)
33
F-values with and without Unified Instance Size
Exp. 1 -- Tracking Scheme Analysis (6/9)
Complex B. & Textureless O. Simple B. & Textureless O.
Complex B. & Textured O. Simple B. & Textured O.
34
Discussion
Performance without Unified Instance Size

performance drops in some cases and the overall performance becomes less robust

execution time varies per tracked object due to different object sizes
Exp. 1 -- Tracking Scheme Analysis (7/9)
Settings of Unified Instance Size
set the instance size to 152x152 pixels

employ the FAST detector + SIFT descriptor
35
Exp. 1 -- Tracking Scheme Analysis (8/9)
F-values with and without Online Learning
Complex B. & Textureless O. Simple B. & Textureless O.
Complex B. & Textured O. Simple B. & Textured O.
36
Exp. 1 -- Tracking Scheme Analysis (9/9)
Discussion
Performance without Online Learning

precision and recall rates of most testing videos drastically decrease, especially for textureless objects

but execution time is 5.31 times faster due to single matching
Settings of Online Learning
the upper bound on the instance number is set to 17
37
Exp. 2 -- Feature Extractor Analysis (1/3)
Performance analysis under various feature
detectors and descriptors
28 combinations with 52 testing videos!
7 feature detectors


FAST, Harris, GFTT


MSER, STAR, DoG, Hessian
4 feature descriptors


SIFT, SURF, BRIEF, ORB
All methods are taken from the OpenCV library
38
Exp. 2 -- Feature Extractor Analysis (2/3)
Accuracy of Descriptor
noise robustness and a clear description are the key factors

SIFT performs the best among these 4 descriptors

BRIEF and ORB achieve good results on textured objects

SURF works poorly in our tracking scheme
Accuracy of Detector
stability of pixel location and feature count are the key factors

lightweight detectors (like FAST and Harris) work comparably well in our framework
Best performance is obtained by FAST + SIFT
39
Exp. 2 -- Feature Extractor Analysis (3/3)
SIFT : SURF : BRIEF : ORB


15.80 : 3.55 : 1.00 : 1.58
Extraction time of Descriptor
SIFT : SURF : BRIEF : ORB


9.23 : 4.83 : 2.22 : 1.00
Matching time of Descriptor
FAST : Harris : GFTT : MSER : DoG : SURF : STAR : ORB


1.00 : 4.38 : 5.84 : 34.36 : 163.63 : 47.38 : 4.54 : 6.88
Extraction time of Detector
40
Exp. 3 -- Performance Comparison (1/8)
Online AdaBoost

Multiple Instance Learning

[2011 PAMI] Robust Object Tracking with Online Multiple Instance Learning
41
Exp. 3 -- Performance Comparison (2/8)
http://vision.ucsd.edu/~bbabenko/project_miltrack.shtml
Performance of PAMI2011
42
Exp. 3 -- Performance Comparison (3/8)
[2010 ICPR] Forward-Backward Error: Automatic Detection of Tracking Failures
Predator
[2010 ICIP] Face-TLD: Tracking-Learning-Detection Applied to Faces
[2010 CVPR] P-N Learning: Bootstrapping Binary Classifiers by Structural
Constraints
[2009 OLCV] Online learning of robust object detectors during unstable tracking
43
Exp. 3 -- Performance Comparison (4/8)
http://info.ee.surrey.ac.uk/Personal/Z.Kalal/tld.html
Performance of Predator
44
Exp. 3 -- Performance Comparison (5/8)
F-value comparison for OAB, MIL and Ours
Complex B. & Textureless O. Simple B. & Textureless O.
Complex B. & Textured O. Simple B. & Textured O.
45
Exp. 3 -- Performance Comparison (6/8)
Comparison of tracking abilities
Predator OAB MIL Ours
Translation
Zooming
Rotation
Panning/Tilting
Occlusion
Illumination
Blur
46
Exp. 3 -- Performance Comparison (7/8)
Computation time (normalized by our approach): MIL 1.298, OAB 0.563, Our Approach 1.00
47
Exp. 3 -- Performance Comparison (8/8)
No gold-standard answer, only user ranking
Subjective evaluation
48
Exp. 4 -- Video of Live Demos
Under Construction
49
Outline
Difficulty in Object Tracking

Proposed Tracking Scheme

Experimental Results

Conclusion & Extensions
50
Conclusion (1/2)
Tracking Scheme
Region Filter
only candidate regions are used for subsequent feature processing

drastically improves tracking accuracy and efficiency

copes with translation, zooming, and blur
Unified Instance Size
a simple feature detector on a unified size obtains scale invariance

makes tracking accuracy more robust and fixes the tracking execution time

copes with translation, zooming, and rotation
51
Conclusion (2/2)
Tracking Scheme
Online Learning
construct the instance’s appearance as a 3D instance model

multiple instances record varying instance appearances

drastically improves tracking ability and increases accuracy

copes with panning/tilting, illumination, occlusion, and blur
52
Contribution
Propose a tracking scheme
Accuracy
comparable performance to the state-of-the-art in translation/zooming/occlusion/illumination/blur

better performance than the state-of-the-art in rotation and panning/tilting
Capability
copes with the 7 difficulties of real cases

experimental results show the robustness
Computation
low computation requirement

achieves 640x480 at 30 fps
Computation
53
Extensions
It’s just the beginning
Combination with Object Recognition
feature points as the tracking feature are powerful and versatile

no manual user initialization is needed

tour navigation, advertisement extension, ...
Face Application
a face can be seen as a rigid instance

improve face recognition with face detection and face tracking
Non-Rigid Objects
the proposed scheme can be extended to non-rigid objects

modify the feature extraction and the 3D instance model
54
Thanks for Your Attention
