|
|

Using only shadow trajectories of
stationary objects in a scene, we demonstrate that a set of six
or more photographs is sufficient to accurately calibrate the camera.
Moreover, we present a novel application where, using only three points
from the shadow trajectory of the objects, one can accurately determine
the geo-location of the camera, up to a longitude ambiguity, and also
the date of image acquisition without using any GPS or other special
instruments. We refer to this as "geo-temporal localization". We
consider possible cases where ambiguities can be removed if additional
information is available. Our method does not require any knowledge of
the date or the time when the pictures are taken, and geo-temporal
information is recovered directly from the images. We demonstrate the
accuracy of our technique for both steps of calibration and
geo-temporal localization using synthetic and real data.
|
|
 |
We developed a novel framework for
view-invariant recognition of human actions and human body poses.
Unlike previous works that regard an action as a whole object, or as a
sequence of individual poses, we represent an action as a set of pose
transitions defined by all possible triplets of body points, i.e., we
break down further each pose into a set of point-triplets and find
invariants for the motion of these triplets across frames. We proposed
two distinct methods of recognizing pose transitions independent of
camera calibration matrix and viewpoint, and applied them to the
problem of action recognition and pose recognition. The UCF-CIL Action
Dataset was used in this work.
|
|
We address the issue of Euclidean
path modeling in a single camera for activity monitoring in a
multi-camera video surveillance system. The method consists of a path
building training phase and a testing phase. During the unsupervised
training phase, after auto-calibrating a camera and thereafter metric
rectifying the input trajectories, a weighted graph is constructed with
trajectories represented by the nodes, and weights determined by a
similarity measure. Normalized-cuts are recursively used to partition
the graph into prototype paths. Each path, consisting of a partitioned
group of trajectories, is represented by a path envelope and an average
trajectory. For every prototype path, features such as spatial
proximity, motion characteristics, curvature, and absolute world
velocity are then recovered directly in the rectified images or by
registering to aerial views. During the testing phase, using our simple
yet efficient similarity measures for these features, we seek a
relation between the trajectories of an incoming sequence and the
prototype path models to identify anomalous and unusual behaviors.
Real-world pedestrian sequences are used to evaluate the steps, and
demonstrate the practicality of the proposed approach.
|
|
 |
In this work, we present a novel
technique for realistically removing static or moving objects in a
video sequence obtained by a perspective camera. Different from
previous efforts, which are typically based on processing in the 3D
data volume, we slice the volume along the motion manifold of the
moving object, and therefore reduce the search space from 3D to 2D,
while still preserving the spatial and temporal coherence. In addition
to the computational efficiency, based on geometric video analysis, the
proposed approach is also able to handle real videos under perspective
distortion, as well as common camera motions, such as panning, tilting,
and zooming. The experimental results demonstrate that our algorithm
performs comparably to 3D search based methods, but extends the current
state-of-the-art techniques to videos with projective effects, as well
as illumination changes.
|
|
 |
Calibration and Light Source Orientation from
Solar Shadows
Description:
In this work, we have developed a
method for recovering camera parameters from perspective views of
daylight shadows in a scene, given only minimal geometric information
determined from the images. This minimal information consists of two 3D
stationary points and their cast shadows on the ground plane. We show
that this information captured in two views is sufficient to determine
the focal length, the aspect ratio, and the principal point of a
pinhole camera with fixed intrinsic parameters. In addition, we are
also able to compute the orientation of the light source. We also
demonstrate the application to an image-based rendering problem.
|
|
 |
Self-Calibration for Turn-table Sequences under
Variable Zoom
Description:
In this work, we propose a novel
method for self-calibration, using constant inter-frame motion from an
image sequence of an object rotating around a single axis with varying
camera internal parameters. Our approach makes use of the facts that in
many commercial systems rotation angles are often controlled by an
electromechanical system, and that the inter-frame essential matrices
are invariant if the rotation angles are constant but not necessarily
known. Using the bundle adjustment method tailored to the special case,
i.e., static camera and constant object rotation, the 3D structure of
the object is recovered and the camera parameters are refined
simultaneously.
|
|
 |
In this project, we present an
approach to accurately align the CT scans of a patient to a stone-cast
model of his/her mandible or maxilla, and use the result of
registration to clean up the patient's scans from artifacts and
defects. The proposed approach assumes that the maxillofacial features
are roughly symmetric with respect to a 3D plane. Then 3D volumetric
models of both the patient and the stone-cast are reconstructed from
the input data using a marching cubes algorithm. The planes of symmetry
are extracted using an improved Extended Gaussian Images method. After
an initial alignment of the two volumes guided by the plane of symmetry
due to 3D homology, we minimize a global cost function that depends on
the sum of square differences (SSD) of patient data with the stone-cast
model to finally recover the rigid transformation between the two scans.
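
The SSD cost in the final step can be illustrated with a small sketch. This is not the project's implementation: it replaces the 3D rigid transformation with a one-dimensional integer shift, uses an exhaustive search instead of a real optimizer, and the names `ssd`, `shift`, and `best_shift` as well as the sample profiles are hypothetical.

```python
def ssd(volume_a, volume_b):
    """Sum of squared differences between two equally sized sample lists."""
    if len(volume_a) != len(volume_b):
        raise ValueError("inputs must have the same number of samples")
    return sum((a - b) ** 2 for a, b in zip(volume_a, volume_b))


def shift(signal, k, fill=0):
    """Shift a 1D signal right by k samples (left if k < 0), padding with fill."""
    n = len(signal)
    out = [fill] * n
    for i, value in enumerate(signal):
        j = i + k
        if 0 <= j < n:
            out[j] = value
    return out


def best_shift(moving, fixed, candidates):
    """Return the candidate shift of `moving` that minimizes the SSD to `fixed`."""
    return min(candidates, key=lambda k: ssd(shift(moving, k), fixed))


# Toy data (assumed): the "patient" profile is the "stone-cast" profile
# displaced by two samples.
stone_cast = [0, 0, 1, 3, 5, 3, 1, 0, 0, 0]
patient = shift(stone_cast, 2)

recovered = best_shift(patient, stone_cast, range(-3, 4))
print(recovered, ssd(shift(patient, recovered), stone_cast))
```

In the actual registration problem the search runs over the six parameters of a 3D rigid transformation, which is why a good initial alignment from the plane of symmetry matters before minimizing the cost.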
|
|
 |
Given two video sequences of
different scenes acquired with moving cameras, it is interesting to
seamlessly transfer a 3D object from one sequence to the other. In this
paper, we present a video-based augmented reality approach to
transferring realistic 3D objects, with correct geometry, between two or more
non-overlapping, moving video sequences. Our framework builds upon
techniques in camera pose estimation, 3D spatiotemporal video
alignment, depth recovery, key-frame editing, natural video matting,
and image-based rendering.
|
|
 |
In this project, we have developed
a unified framework for realistically compositing scenes from multiple
videos or images directly in the 2D image domain. Our approach is
flexible in that it allows for compositing of video streams or images
without any prior knowledge of the geometry or the lighting, in both
source and target scenes. This flexibility also allows for inserting
virtual objects in a realistic fashion. We combine this work with our
shadow and reflection synthesis techniques to provide an image-based
direct lighting in the composition process for generating realistic
rendering including shadows and reflections, with the color
characteristics learnt from the target scene. Various direct lighting
scenarios for daylight and night time are demonstrated.
|
|
 |
In this project, we developed an
approach for creating realistic shadows and reflections of objects
composited into a novel background. We observe that scenes containing a
ground plane and some upright objects, e.g. walls, crowds, desks,
street lamps, trees, etc., are common in natural environments, and show
that two vertical lines and their cast shadows are enough to recover
the primary light source, which makes it possible to construct shadows of
inserted planar objects in a single target image. We also demonstrate a
technique to calibrate the camera when two views are available, which
enables us to create shadows of general 3D objects in a target scene.
For reflections, we exploit the fact that an object and its reflection on a
planar surface define a cross-ratio in the image plane, which of course
is invariant to camera perspective transformation. This allows us to
realistically generate object reflections directly in the image domain.
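
The invariance of the cross-ratio under perspective transformation can be checked numerically. The sketch below is only an illustration of that property, not the reflection-synthesis method itself; the homography `H`, the four sample points, and all function names are assumptions chosen for the example.

```python
def apply_homography(H, p):
    """Map a 2D point p = (x, y) through the 3x3 homography H."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def signed_offset(origin, far_point, p):
    """Signed position of p along the line from origin towards far_point."""
    dx = far_point[0] - origin[0]
    dy = far_point[1] - origin[1]
    return ((p[0] - origin[0]) * dx + (p[1] - origin[1]) * dy) / (dx * dx + dy * dy)


def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four collinear 2D points."""
    ta, tb, tc, td = (signed_offset(a, d, p) for p in (a, b, c, d))
    return ((tc - ta) * (td - tb)) / ((tc - tb) * (td - ta))


# Four collinear points on the line y = 2x + 1 (assumed sample data).
points = [(0.0, 1.0), (1.0, 3.0), (3.0, 7.0), (4.0, 9.0)]

# An arbitrary projective transformation standing in for a change of view.
H = [[1.0, 0.2, 3.0],
     [0.1, 0.9, -2.0],
     [0.001, 0.002, 1.0]]

cr_before = cross_ratio(*points)
cr_after = cross_ratio(*(apply_homography(H, p) for p in points))
print(cr_before, cr_after)
```

Because the two values agree up to rounding, a cross-ratio measured in one image fixes where the fourth point must lie in any other perspective view of the same line.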
|
|
 |
In this project, we present a
method for real-time simulation of 3-dimensional fire inspired by an
old trick known as the "silk torch". The silk torch has a proven illusory
effect on human visual perception of fire. We combine this compelling
visual effect of the computer-generated silk torch with additional
physical properties of the combustion process, such as the speed of the
vaporized fuel, the effect of wind, gravity, and other internal/external
forces, in order to provide a realistic simulation of the dynamics of
various types of flame under different conditions.
|
|
 |
Camera Calibration Using Mirror Symmetry (ICPR Piero Zamperoni Best Paper Award)
Description:
This project addresses the problem
of calibrating a pinhole camera from images of an isosceles trapezoid.
Assuming a unit aspect ratio and zero skew, we introduce a novel and
simple camera calibration approach. The key features of the proposed
technique are its simplicity and the fact that it needs no 3D coordinate
information about the calibration object, i.e., the isosceles
trapezoid. By utilizing the symmetry of such a trapezoid,
we show that one can obtain both the internal and the external camera
parameters. To demonstrate the effectiveness of the algorithm, we
present the processing results on synthetic and real images, and
compare our results to Zhang’s flexible calibration method.
|
|
 |
Expression Morphing
Description:
In this project, we propose an
image-based approach to photo-realistic view synthesis by integrating
field morphing and view morphing in a single framework. We thus provide
a unified technique for synthesizing new images that include both
viewpoint changes and object deformations. For view morphing, we relax
the requirement of monotonicity along epipolar lines to piecewise
monotonicity, by incorporating a segmentation stage prior to
interpolation. This allows for dealing with occlusions and visibility
issues, and hence alleviates the “ghosting
effects” that typically occur when morphing is performed between
distant viewpoints. We have particularly applied our approach to the
synthesis of human facial expressions, while allowing for wide change
of viewing positions and directions.
|
|
|
Image-Based Metrology
Description:
In this project, we describe how
3D Euclidean measurements can be made in a pair of perspective images,
when only minimal geometric information is available in the image
planes. This minimal information consists of one line on a reference
plane and one vanishing point for a direction perpendicular to the
plane. Given this information, we show that the length ratio of two
objects perpendicular to the reference plane can be
expressed as a function of the camera principal point. Assuming that
the camera intrinsic parameters remain invariant between the two views,
we recover the principal point and the camera focal length by
minimizing the symmetric transfer error of geometric distances.
Euclidean metric measurements can then be made directly from the images.
To demonstrate
the effectiveness of the approach, we present the processing results
for synthetic and natural images, including measurements along both
parallel and non-parallel lines.
|
|
 |
Self-calibrated Stereo Modeling
Description:
The traditional stereo
reconstruction techniques based on point correspondences and the
estimation of the cameras from the fundamental matrix introduce a
four-fold ambiguity. Moreover, there is a projective ambiguity inherent
in the fundamental matrix. We show that a symmetric object can be
modeled even under partial occlusion with a pair of uncalibrated stereo
images. This implies that unlike traditional stereo algorithms, we can
extract 3D information from two arbitrary viewpoints, even when there
are no left-to-right point correspondences. To demonstrate the
effectiveness of the method, we present experimental results on both
synthetic and real images.
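
The four-fold ambiguity mentioned above can be made concrete with a small sketch. It assumes calibrated cameras, so the essential matrix E = [t]x R stands in for the fundamental matrix, and a unit-length baseline t. The sample rotation, the baseline, and all helper names are hypothetical. The sketch shows that four distinct (R, t) pairs yield the same E up to sign, which is why point correspondences alone cannot single out one camera pose.

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def skew(t):
    """Cross-product matrix [t]x, so that skew(t) applied to v equals t x v."""
    return [[0.0, -t[2], t[1]],
            [t[2], 0.0, -t[0]],
            [-t[1], t[0], 0.0]]


def rotation_z(angle):
    """Rotation by `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def essential(R, t):
    """Essential matrix E = [t]x R for relative rotation R and baseline t."""
    return matmul(skew(t), R)


def candidate_poses(R, t):
    """Four (R, t) pairs that share one essential matrix up to sign.

    Assumes t has unit length. The "twisted" rotation composes R with a
    180-degree turn about the baseline direction t.
    """
    half_turn = [[2.0 * t[i] * t[j] - (1.0 if i == j else 0.0) for j in range(3)]
                 for i in range(3)]
    R_twisted = matmul(half_turn, R)
    neg_t = [-x for x in t]
    return [(R, t), (R, neg_t), (R_twisted, t), (R_twisted, neg_t)]


def same_up_to_sign(A, B, tol=1e-12):
    """True if A equals B or -B entry by entry, within tol."""
    plus = all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))
    minus = all(abs(A[i][j] + B[i][j]) < tol for i in range(3) for j in range(3))
    return plus or minus


R_true = rotation_z(0.3)   # assumed sample rotation
t_true = [0.6, 0.0, 0.8]   # assumed unit-length baseline
E_true = essential(R_true, t_true)

poses = candidate_poses(R_true, t_true)
agreement = [same_up_to_sign(essential(R, t), E_true) for R, t in poses]
print(agreement)
```

In conventional pipelines the ambiguity is usually resolved by keeping the candidate that places the reconstructed points in front of both cameras.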
|
|
 |
Urban Simulation
Description:
In this project, we have introduced
a new method for rectifying stereo pairs that does not require any
calibration information or any knowledge of the epipolar geometry. We
then describe a Bayesian stereo technique that fuses information from
both monocular and binocular vision in order to overcome the complexity
of data in cluttered/dense urban areas. Finally, a model for reducing
perspective distortions is given, which otherwise could introduce
severe errors in the actual 3D models. We have also developed a
multi-sensor platform for acquisition and construction of 3D city
models at close range, which we plan to fuse with far-range stereo
models.
|
|
|