Overview
In the MPEG encoding scheme, blocks of pixels of a frame are correlated
to areas of the previous frame, and only differences between blocks
and their correlated areas are encoded. The translation vector
between a block and the area that most closely matches it is called
a motion vector. Motion vectors are generated by camera motion
and by object motion. Motion vector fields generated by camera
motion have a consistent structure that can be differentiated
from object motion, at least when the object covers only a small
part of the field of view. We have developed a simple averaging
method for computing the camera motion (pan, tilt, swing, and
zoom) that best explains a motion vector field. This camera motion
is used to build a mosaic. A mosaic gathers the information from
the frames obtained by the moving camera into the field of view
of a virtual large-angle still camera. Our technique brings all
the rays obtained by the rotating camera into a single reference
frame. Given a ray in the virtual still camera, we find the values
of the pixels, if any, produced by rays hitting the scene at the
same point in the moving camera, and place one of these values
in the image of the virtual still camera. This operation requires
only that the rays of the virtual still camera be expressed in
the reference frames of the moving camera using the computed camera
motion angles.
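The ray lookup described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: it assumes a pinhole camera with focal lengths given in pixels, image coordinates measured from the image center, and a rotation order (pan, then tilt, then swing) that the text does not specify. The function names `world_to_camera` and `lookup_in_frame` are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def world_to_camera(pan, tilt, swing):
    """Rotation taking a ray from the virtual still camera's frame into the
    frame of a moving camera oriented by (pan, tilt, swing), in radians.
    Assumed convention: pan about y, tilt about x, swing about z."""
    cp, sp = math.cos(pan), math.sin(pan)
    ct, st = math.cos(tilt), math.sin(tilt)
    cs, ss = math.cos(swing), math.sin(swing)
    undo_pan = [[cp, 0.0, -sp], [0.0, 1.0, 0.0], [sp, 0.0, cp]]
    undo_tilt = [[1.0, 0.0, 0.0], [0.0, ct, -st], [0.0, st, ct]]
    undo_swing = [[cs, ss, 0.0], [-ss, cs, 0.0], [0.0, 0.0, 1.0]]
    return matmul(undo_swing, matmul(undo_tilt, undo_pan))

def lookup_in_frame(u, v, f_virtual, rot, f_frame, width, height):
    """Map pixel (u, v) of the virtual still camera to a position in one
    frame of the moving camera. Returns None when the ray points away from
    the camera or lands outside the frame (the "if any" case)."""
    ray = (u, v, f_virtual)
    x, y, z = (sum(rot[i][k] * ray[k] for k in range(3)) for i in range(3))
    if z <= 0:
        return None  # ray points away from the moving camera
    px, py = f_frame * x / z, f_frame * y / z
    if abs(px) > width / 2 or abs(py) > height / 2:
        return None  # ray misses this frame
    return px, py
```

To fill a mosaic pixel, one would call `lookup_in_frame` for each frame of the sequence and keep one of the pixel values found; which value to keep when several frames see the same point is a design choice left open here.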
The feasibility of the method has been demonstrated by building panorama
views of the lab from MPEG sequences obtained by panning the camera
across the lab.
The proposed mosaicing method takes advantage of the image matching
information encoded by MPEG in the motion vectors. Therefore there
is no need for additional image-to-image correspondence or optical
flow computation. In addition, our ray tracing approach provides
a better understanding and control of the distortions generated
by the mosaicing process than the direct image-to-image warping
proposed in other techniques.
Our goal is to provide automatic mosaicing of MPEG sequences. We would
like to mosaic shots where the camera is used to pan a fairly
static scene, and not shots where motion vectors are generated
by significant object motion. To distinguish the two, we plan
to use the motion vectors to compute camera motion in both cases,
then compare the global motion vectors that would be generated
if such camera motion actually occurred with the actual motion
vector field. Mosaicing will be performed only when the actual motion
field is properly explained by the camera motion.
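The planned test can be illustrated with a small sketch. The exact averaging formulas are not given in this overview, so the code below substitutes a simplified small-motion model as an assumption: pan and tilt appear as a uniform image shift, zoom as a radial term, and swing as a rotational term, each recovered by averaging over the blocks. The function names and the 1-pixel threshold are hypothetical.

```python
import math

def fit_camera_motion(blocks):
    """blocks: list of (x, y, vx, vy), block center and its motion vector.
    Fits v = (tx, ty) + zoom * p + swing * perp(p), where p is the block
    position relative to the centroid of all blocks (assumed model)."""
    n = len(blocks)
    cx = sum(b[0] for b in blocks) / n
    cy = sum(b[1] for b in blocks) / n
    tx = sum(b[2] for b in blocks) / n  # average shift: pan-like term
    ty = sum(b[3] for b in blocks) / n  # average shift: tilt-like term
    num_zoom = num_swing = denom = 0.0
    for x, y, vx, vy in blocks:
        px, py, dx, dy = x - cx, y - cy, vx - tx, vy - ty
        num_zoom += px * dx + py * dy    # radial component
        num_swing += px * dy - py * dx   # rotational component
        denom += px * px + py * py
    zoom = num_zoom / denom if denom else 0.0
    swing = num_swing / denom if denom else 0.0
    return {"cx": cx, "cy": cy, "tx": tx, "ty": ty,
            "zoom": zoom, "swing": swing}

def predict(model, x, y):
    """Global motion vector the fitted camera motion would produce at (x, y)."""
    px, py = x - model["cx"], y - model["cy"]
    return (model["tx"] + model["zoom"] * px - model["swing"] * py,
            model["ty"] + model["zoom"] * py + model["swing"] * px)

def mean_residual(blocks, model):
    """Average distance between actual and predicted motion vectors."""
    total = 0.0
    for x, y, vx, vy in blocks:
        gx, gy = predict(model, x, y)
        total += math.hypot(vx - gx, vy - gy)
    return total / len(blocks)

def explained_by_camera(blocks, threshold=1.0):
    """True when the field is close to what the fitted camera motion predicts.
    The threshold, in pixels, is an illustrative value only."""
    return mean_residual(blocks, fit_camera_motion(blocks)) < threshold
```

A field produced by a pure shift plus zoom passes this check, while a field in which a large object moves independently leaves a large residual and would be excluded from mosaicing.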