Overview
Browsing tools that let the user quickly get an idea
of the content of video footage are an important missing component
of video database systems. Browsers that present one key-frame
per shot do not adequately summarize the complex information content
of long shots in which human motion, object motion, or camera
motion progressively reveals entirely new situations. We are developing
video browsing techniques that sample video sequences more or
less densely according to changes in specific features defined
by the user.
The sheer volume of video data can make any retrieval task overwhelming.
A fast-forward mode that accelerates over eventless video portions
and slows down over portions with significant changes in user-specified
features makes it easier to scan large amounts of video data rapidly,
and reduces the risk of missing information that the user judges
significant.
Sampling techniques that attempt to detect significant information
changes simply by comparing pairs of frames are bound to lack
robustness in the presence of noise. Methods such as Ramer's algorithm
are available to detect perceptually significant points and discontinuities
in noisy curves, and they can be applied to the video browsing problem.
In order to apply such methods, we represent a video sequence
as a polygonal curve by mapping a feature vector for each frame
to a point in a high dimensional Euclidean space. This feature
vector can be defined according to the user's specific interest.
Applying Ramer's method, we recursively split the video curve
until each resulting curve segment can be replaced by a line segment.
This replacement can occur if the distance from the curve to the line
is below the given temporal detail level. Significant frames at
a given temporal detail level are defined as the junctions between
the line segments that "summarize" the video curve at that
detail level. Our browser lets the user choose the
detail level with a vertical slider. Only the significant frames
at the requested detail level are displayed.
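The splitting procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the browser's actual implementation: the names significant_frames and point_to_segment_distance are hypothetical, each frame is assumed to be already mapped to a feature vector (a tuple of numbers), the distance is measured to the chord segment rather than to the infinite line, and the tolerance argument stands in for the temporal detail level chosen with the slider.

```python
import math

def point_to_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b, in any dimension."""
    ab = [bj - aj for aj, bj in zip(a, b)]
    ap = [pj - aj for aj, pj in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0.0:
        # Degenerate chord: both end frames have the same feature vector.
        return math.sqrt(sum(c * c for c in ap))
    # Projection parameter of p onto the chord, clamped to the segment.
    t = sum(x * y for x, y in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))
    return math.sqrt(sum((x - t * y) ** 2 for x, y in zip(ap, ab)))

def significant_frames(curve, tolerance):
    """Return sorted frame indices kept by Ramer-style recursive splitting.

    curve     -- list of feature vectors, one per frame (assumed input format)
    tolerance -- stands in for the temporal detail level (assumption)
    """
    if len(curve) < 3:
        return list(range(len(curve)))
    keep = {0, len(curve) - 1}
    stack = [(0, len(curve) - 1)]  # explicit stack instead of recursion
    while stack:
        first, last = stack.pop()
        worst_dist, worst_idx = 0.0, None
        for i in range(first + 1, last):
            d = point_to_segment_distance(curve[i], curve[first], curve[last])
            if d > worst_dist:
                worst_dist, worst_idx = d, i
        # Split at the farthest frame only if it deviates more than allowed.
        if worst_idx is not None and worst_dist > tolerance:
            keep.add(worst_idx)
            stack.append((first, worst_idx))
            stack.append((worst_idx, last))
    return sorted(keep)

# Toy two-dimensional "video curve" with a single burst of activity.
toy_curve = [(0, 0), (1, 0), (2, 0), (3, 5), (4, 0), (5, 0)]
coarse = significant_frames(toy_curve, 10.0)
finer = significant_frames(toy_curve, 2.0)
```

Lowering the tolerance can only add junctions, because the split points found at a coarse level are found again at every finer level. The frames shown at a coarse slider position therefore remain visible as the user asks for more detail.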
The video browser is functional for short video sequences.
In the user interface, a sampling stripe provides the user with a
view of the level of sampling performed along the video sequence
by the curve summarization process. It is a long black stripe
with one white vertical tick mark for each displayed frame at
the requested level of detail. A triangular frame marker above
the sampling stripe slides along the stripe as the video clip
is being played and indicates which frame is being displayed.
When all frames are to be played, the sampling stripe is completely
white. At intermediate detail slider positions, only a few white
frame tick marks appear. Video portions that do not contain variations
of the feature vector are visually characterized by the sparsity
of frame tick marks in the sampling stripe, while video portions
with significant feature vector activity are characterized by
dense sampling and dense tick marks on the sampling stripe. The
user can click-drag the triangular frame marker above the sampling
stripe, and the significant frames over which the frame marker is
dragged are instantly displayed. This provides very fine temporal
control of the fly-over, while the detail slider provides fine
control of the "altitude" of the fly-over.
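As a rough illustration of the sampling stripe and the draggable frame marker, the sketch below renders the stripe as a text string and maps a marker position to a displayed frame. The function names, the text rendering ('|' for a white tick, '.' for the black background), and the choice to snap the marker to the nearest displayed frame are all assumptions made for this example; the actual interface may behave differently.

```python
import bisect

def sampling_stripe(num_frames, shown_frames, width):
    """Render the stripe as text: '|' marks a displayed frame, '.' is background."""
    cells = ["."] * width
    for f in shown_frames:
        # Scale the frame index to a column of the stripe.
        col = f * (width - 1) // max(1, num_frames - 1)
        cells[col] = "|"
    return "".join(cells)

def frame_under_marker(marker_frame, shown_frames):
    """Snap a dragged marker position to the nearest displayed frame.

    shown_frames must be a non-empty sorted list of frame indices.
    """
    i = bisect.bisect_left(shown_frames, marker_frame)
    # Only the neighbours on either side of the marker can be nearest.
    candidates = shown_frames[max(0, i - 1): i + 1]
    return min(candidates, key=lambda f: abs(f - marker_frame))

# Ten-frame clip in which frames 0, 3 and 9 are significant at the current level.
stripe = sampling_stripe(10, [0, 3, 9], 10)
snapped = frame_under_marker(4, [0, 3, 9])
```

When every frame is displayed, every column receives a tick, which corresponds to the completely white stripe described above.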
Full paper (ACM Multimedia 98).