Ph.D. Dissertation
An Empirical Approach to Grouping and Segmentation
by
David Royal Martin
Doctor of Philosophy in Computer Science
University of California, Berkeley
Professor Jitendra Malik, Co-Chair
Professor David Patterson, Co-Chair
Professor Stephen Palmer
Abstract
This thesis presents a novel dataset of 12,000 segmentations of 1,000
natural images by 30 human subjects. The subjects marked the locations
of objects in the images, providing ground truth data for
learning grouping cues and benchmarking grouping algorithms. We feel
that the data-driven approach is critical for two reasons: (1) the
data reflects ``ecological statistics'' that the human visual system
has evolved to exploit, and (2) innovations in computational vision
should be evaluated quantitatively.
We develop a battery of segmentation comparison measures that we use
both to validate the consistency of the human data and to provide
approaches for evaluating grouping algorithms. In conjunction with the
segmentation dataset, the various measures provide
``micro-benchmarks'' for boundary detection algorithms and pixel
affinity functions, as well as a benchmark for complete segmentation
algorithms. Using these performance measures, we can systematically
improve grouping algorithms with the human ground truth as our goal.
Starting at the lowest level, we present local boundary models based on
brightness, color, and texture cues, where the cues are
individually optimized with respect to the dataset and then combined in
a statistically optimal manner with classifiers. The resulting
detector is shown to significantly outperform prior state-of-the-art
algorithms. Next, we learn from data how to combine the boundary
model with patch-based features in a pixel affinity model to settle
long-standing debates in computer vision with empirical results: (1)
brightness boundaries are more informative than patches, and vice versa
for color; (2) texture boundaries and patches are the two most
powerful cues; (3) proximity is not a useful cue for grouping but is
simply a result of the grouping process; and (4) both boundary-based and
region-based approaches provide significant independent information for
grouping.

Chapters
Abstract
Front Matter
Chapter 1: Introduction
Chapter 2: A Dataset of Human Segmented Natural Images
Chapter 3: Segmentation Consistency Measures
Chapter 4: Learning a Local Boundary Model
Chapter 5: Learning a Pixel Affinity Model
Chapter 6: Summary and Conclusion
Appendices
Bibliography