Camera Motion Estimation Using Monocular Image Sequences and Inertial Data

Camera Motion Estimation Using Monocular Image Sequences and Inertial Data
Author:
Publisher:
Total Pages: 28
Release: 1999
Genre: Image processing
ISBN:

This paper presents a robust model-based algorithm for camera motion estimation using a monocular image sequence and inertial data. Conventional algorithms using only video information encounter difficulties in robustly estimating irregular camera motion. In our approach, we use both video and inertial information to estimate the camera motion. The inertial data are acquired by a set of on-board Micro-Electro-Mechanical-System (MEMS) based sensors. The key features of our algorithm are (1) utilization of inertial data and (2) full exploitation of Directions of Location (DOLs) of feature points. By using inertial data and DOLs of feature points, we track and recover more complex platform motion than conventional algorithms can. An Iterated Extended Kalman Filter (IEKF) is used to estimate the motion parameters. The algorithm has been tested on synthetic and real image sequences, and the results in both cases show the efficacy of our approach.
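Since the abstract names the Iterated Extended Kalman Filter as its estimator, a minimal sketch may help fix the idea: the IEKF repeats the EKF measurement update, relinearizing the measurement model around each refined estimate. The toy bearing-to-landmark model below is purely illustrative and is not the paper's actual measurement model.

    # Minimal IEKF measurement-update sketch (illustrative model, not the paper's).
    import numpy as np

    def iekf_update(x, P, z, h, H_jac, R, n_iter=5):
        """Iterated EKF update: relinearize h() at each iterate x_i."""
        x_i = x.copy()
        for _ in range(n_iter):
            H = H_jac(x_i)                      # Jacobian at current iterate
            S = H @ P @ H.T + R                 # innovation covariance
            K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
            # Gauss-Newton style iterate: innovation is referenced back to x
            x_i = x + K @ (z - h(x_i) - H @ (x - x_i))
        H = H_jac(x_i)
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        P_new = (np.eye(len(x)) - K @ H) @ P    # posterior covariance
        return x_i, P_new

    # Toy example: refine a 2D camera position from a bearing to a known landmark.
    landmark = np.array([5.0, 3.0])
    h = lambda x: np.array([np.arctan2(landmark[1] - x[1], landmark[0] - x[0])])
    def H_jac(x):
        dx, dy = landmark - x
        r2 = dx**2 + dy**2
        return np.array([[dy / r2, -dx / r2]])

    x0 = np.array([0.5, -0.2])                  # prior mean
    P0 = np.eye(2) * 0.25                       # prior covariance
    z = h(np.array([0.0, 0.0]))                 # bearing observed at the true position
    x_post, P_post = iekf_update(x0, P0, z, h, H_jac, np.array([[1e-4]]))
    print(x_post)

The relinearization loop is what distinguishes the IEKF from a plain EKF update and is what gives it its robustness to the irregular motion the abstract targets.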

Motion Estimation from Image and Inertial Measurements

Motion Estimation from Image and Inertial Measurements
Author: Dennis W. Strelow
Publisher:
Total Pages: 154
Release: 2004
Genre: Computer vision
ISBN:

Abstract: "Robust motion estimation from image measurements would be an enabling technology for Mars rover, micro air vehicle, and search and rescue robot navigation; modeling complex environments from video; and other applications. While algorithms exist for estimating six degree of freedom motion from image measurements, motion from image measurements suffers from inherent problems. These include sensitivity to incorrect or insufficient image feature tracking; sensitivity to camera modeling and calibration errors; and long-term drift in scenarios with missing observations, i.e., where image features enter and leave the field of view. The integration of image and inertial measurements is an attractive solution to some of these problems. Among other advantages, adding inertial measurements to image-based motion estimation can reduce the sensitivity to incorrect image feature tracking and camera modeling errors. On the other hand, image measurements can be exploited to reduce the drift that results from integrating noisy inertial measurements, and allows the additional unknowns needed to interpret inertial measurements, such as the gravity direction and magnitude, to be estimated. This work has developed both batch and recursive algorithms for estimating camera motion, sparse scene structure, and other unknowns from image, gyro, and accelerometer measurements. A large suite of experiments uses these algorithms to investigate the accuracy, convergence, and sensitivity of motion from image and inertial measurements. Among other results, these experiments show that the correct sensor motion can be recovered even in some cases where estimates from image or inertial estimates alone are grossly wrong, and explore the relative advantages of image and inertial measurements and of omnidirectional images for motion estimation. To eliminate gross errors and reduce drift in motion estimates from real image sequences, this work has also developed a new robust image feature tracker that exploits the rigid scene assumption and eliminates the heuristics required by previous trackers for handling large motions, detecting mistracking, and extracting features. A proof of concept system is also presented that exploits this tracker to estimate six degrees of freedom motion from long image sequences, and limits drift in the estimates by recognizing previously visited locations."

Motion Analysis and Image Sequence Processing

Motion Analysis and Image Sequence Processing
Author: M. Ibrahim Sezan
Publisher: Springer Science & Business Media
Total Pages: 499
Release: 2012-12-06
Genre: Technology & Engineering
ISBN: 1461532361

An image or video sequence is a series of two-dimensional (2-D) images sequentially ordered in time. Image sequences can be acquired, for instance, by video, motion picture, X-ray, or acoustic cameras, or they can be synthetically generated by sequentially ordering 2-D still images, as in computer graphics and animation. The use of image sequences in areas such as entertainment, visual communications, multimedia, education, medicine, surveillance, remote control, and scientific research is constantly growing as the use of television and video systems becomes more and more common. The boosted interest in digital video for both consumer and professional products, along with the availability of fast processors and memory at reasonable cost, has been a major driving force behind this growth. Before we elaborate on the two major terms that appear in the title of this book, namely motion analysis and image sequence processing, we would like to place them in their proper contexts within the range of possible operations that involve image sequences. In this book, we choose to classify these operations into three major categories, namely (i) image sequence processing, (ii) image sequence analysis, and (iii) visualization. The interrelationship among these three categories is pictorially described in Figure 1 below in the form of an "image sequence triangle".

Structure from Motion using the Extended Kalman Filter

Structure from Motion using the Extended Kalman Filter
Author: Javier Civera
Publisher: Springer Science & Business Media
Total Pages: 180
Release: 2011-11-05
Genre: Technology & Engineering
ISBN: 3642248330

The fully automated estimation of the six-degree-of-freedom camera motion and the imaged 3D scenario, using as the only input the pictures taken by the camera, has been a long-term aim in the computer vision community. The associated line of research has been known as Structure from Motion (SfM). An intense research effort over recent decades has produced spectacular advances; the topic has reached a consistent state of maturity, and most of its aspects are well known nowadays. 3D vision has immediate applications in many diverse fields such as robotics, video games, and augmented reality, and technological transfer is starting to become a reality. This book describes one of the first systems for sparse point-based 3D reconstruction and egomotion estimation from an image sequence that is able to run in real time at video frame rate while assuming quite weak prior knowledge about camera calibration, motion, or the scene. Its chapters unify the current perspectives of the robotics and computer vision communities on the 3D vision topic. As usual in robotics sensing, the explicit estimation and propagation of uncertainty holds a central role in the sequential video processing and is shown to boost the efficiency and performance of the 3D estimation. On the other hand, some of the most relevant topics discussed in SfM by computer vision scientists are addressed under this probabilistic filtering scheme, namely projective models, spurious rejection, model selection, and self-calibration.
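To illustrate the uncertainty propagation the book places at the center of sequential video processing, here is a hedged sketch of an EKF prediction step under a constant-velocity camera model, a common choice in EKF-based SfM; the state layout and noise values are placeholders, not the book's full state vector:

    # EKF prediction under a constant-velocity model: propagate mean and covariance.
    import numpy as np

    dt = 1.0 / 30.0                           # video frame rate
    # State: [position (3), linear velocity (3)]; rotation omitted for brevity.
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)                # p' = p + v * dt, v' = v

    def predict(x, P, accel_noise=0.5):
        # Process noise enters as an unknown acceleration impulse.
        G = np.vstack([0.5 * dt**2 * np.eye(3), dt * np.eye(3)])
        Q = G @ G.T * accel_noise**2
        return F @ x, F @ P @ F.T + Q         # propagate mean and covariance

    x = np.zeros(6)
    P = np.eye(6) * 1e-4
    for _ in range(30):                       # one second of prediction only:
        x, P = predict(x, P)                  # uncertainty grows steadily
    print(np.sqrt(np.diag(P)[:3]))            # position std. dev. without updates

Measurement updates from tracked image features are what shrink this covariance again; the balance between the two is exactly the filtering perspective the book describes.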

Motion Estimation

Motion Estimation
Author: Fouad Sabry
Publisher: One Billion Knowledgeable
Total Pages: 123
Release: 2024-05-12
Genre: Computers
ISBN:

What is motion estimation? In computer vision and image processing, motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another, usually between adjacent frames in a video sequence. It is an ill-posed problem, as the motion happens in three dimensions (3D) but the images are a projection of the 3D scene onto a 2D plane. The motion vectors may relate to the whole image or to specific parts, such as rectangular blocks, arbitrarily shaped patches, or even individual pixels. They may be represented by a translational model or by many other models that can approximate the motion of a real video camera, such as rotation and translation in all three dimensions and zoom. How you will benefit: (I) Insights and validations about the following topics: Chapter 1: Motion estimation; Chapter 2: Motion compensation; Chapter 3: Block-matching algorithm; Chapter 4: H.261; Chapter 5: H.262/MPEG-2 Part 2; Chapter 6: Advanced Video Coding; Chapter 7: Global motion compensation; Chapter 8: Block-matching and 3D filtering; Chapter 9: Video compression picture types; Chapter 10: Video super-resolution. (II) Answers to the public's top questions about motion estimation. (III) Real-world examples of the use of motion estimation in many fields. Who this book is for: professionals, undergraduate and graduate students, enthusiasts, hobbyists, and anyone who wants to go beyond basic knowledge of motion estimation.
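The block-based translational model described above is straightforward to make concrete. The following sketch (block size, search range, and the toy frames are illustrative) exhaustively searches for the displacement of one block that minimizes the sum of absolute differences (SAD):

    # Exhaustive block matching: find the motion vector for one block via SAD.
    import numpy as np

    def block_match(prev, curr, top, left, block=16, search=8):
        """Return the (dy, dx) motion vector for one block."""
        ref = prev[top:top + block, left:left + block].astype(np.int32)
        best, best_mv = None, (0, 0)
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = top + dy, left + dx
                if y < 0 or x < 0 or y + block > curr.shape[0] or x + block > curr.shape[1]:
                    continue                     # candidate block outside the frame
                cand = curr[y:y + block, x:x + block].astype(np.int32)
                sad = np.abs(ref - cand).sum()   # matching cost
                if best is None or sad < best:
                    best, best_mv = sad, (dy, dx)
        return best_mv

    # Toy frames: shift a random texture by (2, -3) and recover the vector.
    rng = np.random.default_rng(1)
    prev = rng.integers(0, 256, (64, 64), dtype=np.uint8)
    curr = np.roll(prev, shift=(2, -3), axis=(0, 1))
    print(block_match(prev, curr, 24, 24))       # -> (2, -3)

Practical coders replace the exhaustive search with faster patterns (three-step, diamond), but the cost function and block structure are the same.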

Fast and Accurate Camera Motion Estimation For Static and Dynamic Scenes

Fast and Accurate Camera Motion Estimation For Static and Dynamic Scenes
Author: Haleh Azartash
Publisher:
Total Pages: 95
Release: 2014
Genre:
ISBN: 9781303995729

Visual Odometry (VO) is the process of finding a camera's relative pose at different time instants by analyzing the images taken by the camera. Visual odometry, also known as ego-motion estimation, has a variety of applications including image stabilization, unmanned aerial vehicle (UAV) and robotic navigation, scene reconstruction, and augmented reality. VO has been studied extensively over the past three decades for stationary and dynamic scenes using monocular, stereo, and more recently RGB-D cameras. It is important to note that camera motion estimation is application specific, and the solution should be adjusted to the requirements at hand. In this thesis, we present different methods to estimate visual odometry accurately for camera stabilization and robotic navigation using monocular, stereo, and RGB-D cameras for both stationary and dynamic scenes. For image stabilization, we propose a fast and robust 2D-affine ego-motion estimation algorithm based on phase correlation in the Fourier-Mellin domain using a single camera. The 2D motion parameters, rotation-scale-translation (RST), are estimated in a coarse-to-fine manner, thus ensuring convergence for large camera displacements. Using RANSAC-based robust least-squares model fitting in the refinement step, we find the final motion accurately in a way that is robust to outliers such as moving objects or flat areas, making the method suitable for both static and dynamic scenes. Even though this method estimates the 2D camera motion accurately, it is only applicable to scenes with small depth variation. Consequently, a stereo camera is used to overcome this limitation. A stereo camera enables us to find the 3D camera motion (instead of 2D) of an arbitrarily moving rig in any static environment, with no limitation on depth variation. We propose a feature-based method that estimates large 3D translational and rotational motion of a moving rig. The translational velocity, acceleration, and angular velocity of the rig are then estimated using a recursive method. In addition, we account for different motion types, such as pure rotation and pure translation in different directions. Although a stereo rig lets us recover the arbitrary motion of a moving platform, the observed environment must be stationary; moreover, estimating the disparity between the stereo images increases the complexity of the proposed method. Therefore, we propose a robust method to estimate visual odometry using RGB-D cameras that is applicable to dynamic scenes as well. RGB-D cameras provide a color image and a depth map of the scene simultaneously and therefore significantly reduce the complexity and computation time of visual odometry algorithms. To exclude the dynamic regions of the scene from the camera motion estimation process, we use image segmentation to separate the moving parts from the stationary parts of the scene. We use an enhanced depth-aware segmentation method that improves the segmentation output and joins areas where the depth value is not available. A dense 3D point cloud is then constructed by finding dense correspondences between the reference and current frames using optical flow. The motion parameters for each segment are calculated using the iterative closest point (ICP) technique (with six degrees of freedom). Finally, to find the true motion of the camera and exclude the motion parameters of dynamic regions, we perform motion optimization by finding the linear combination of motion parameters that minimizes the residual difference between the reference and current images.
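The phase-correlation step underlying the 2D stabilization method can be sketched for the pure-translation case (the thesis extends it to rotation and scale via the Fourier-Mellin transform): the normalized cross-power spectrum of two frames has an inverse FFT that peaks at their relative shift. Frame sizes and the shift below are illustrative:

    # Phase correlation: recover the integer translation between two frames.
    import numpy as np

    def phase_correlation(a, b):
        """Estimate the integer translation taking frame a to frame b."""
        Fa, Fb = np.fft.fft2(a), np.fft.fft2(b)
        cross = np.conj(Fa) * Fb
        cross /= np.abs(cross) + 1e-12         # keep phase only
        corr = np.fft.ifft2(cross).real
        peak = np.unravel_index(np.argmax(corr), corr.shape)
        # Map peak indices to signed shifts (FFT output wraps around).
        shifts = [p if p <= s // 2 else p - s for p, s in zip(peak, corr.shape)]
        return tuple(shifts)

    rng = np.random.default_rng(2)
    a = rng.random((128, 128))
    b = np.roll(a, shift=(5, -7), axis=(0, 1))
    print(phase_correlation(a, b))             # -> (5, -7)

Because only the spectral phase is kept, the estimate is insensitive to global brightness changes, which is part of what makes the approach robust for stabilization.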

Continuous Models for Cameras and Inertial Sensors

Continuous Models for Cameras and Inertial Sensors
Author: Hannes Ovrén
Publisher: Linköping University Electronic Press
Total Pages: 67
Release: 2018-07-25
Genre:
ISBN: 917685244X

Using images to reconstruct the world in three dimensions is a classical computer vision task. Some examples of applications where this is useful are autonomous mapping and navigation, urban planning, and special effects in movies. One common approach to 3D reconstruction is "structure from motion", where a scene is imaged multiple times from different positions, e.g. by moving the camera. However, in a twist of irony, many structure-from-motion methods work best when the camera is stationary while the image is captured. This is because the motion of the camera can cause distortions in the image that lead to worse image measurements, and thus a worse reconstruction. One such distortion, common to all cameras, is motion blur, while another is connected to the use of an electronic rolling shutter. Instead of capturing all pixels of the image at once, a camera with a rolling shutter captures the image row by row. If the camera is moving while the image is captured, the rolling shutter causes non-rigid distortions in the image that, unless handled, can severely impact the reconstruction quality. This thesis studies methods to robustly perform 3D reconstruction with a moving camera. To do so, the proposed methods make use of an inertial measurement unit (IMU). The IMU measures the angular velocities and linear accelerations of the camera, and these can be used to estimate the trajectory of the camera over time. Knowledge of the camera motion can then be used to correct for the distortions caused by the rolling shutter. Another benefit of an IMU is that it can provide measurements in situations where a camera cannot, e.g. because of excessive motion blur or an absence of scene structure. To use a camera together with an IMU, the camera-IMU system must be jointly calibrated: the relationship between their respective coordinate frames needs to be established, and their timings need to be synchronized. This thesis shows how to perform this calibration and synchronization automatically, without requiring e.g. calibration objects or special motion patterns. In standard structure from motion, the camera trajectory is modeled as discrete poses, with one pose per image. Switching instead to a formulation with a continuous-time camera trajectory provides a natural way to handle rolling-shutter distortions, and also to incorporate inertial measurements. To model the continuous-time trajectory, many authors have used splines. The ability of a spline-based trajectory to model the real motion depends on the density of its spline knots: choosing a spline that is too smooth results in approximation errors. This thesis proposes a method to estimate the spline approximation error and use it to better balance camera and IMU measurements in a sensor fusion framework. Also proposed is a way to automatically decide how dense the spline needs to be to achieve a good reconstruction. Another approach to reconstructing a 3D scene is to use a camera that directly measures depth. Some depth cameras, like the well-known Microsoft Kinect, are susceptible to the same rolling-shutter effects as normal cameras. This thesis quantifies the effect of the rolling-shutter distortion on 3D reconstruction, depending on the amount of motion. It is also shown that a better 3D model is obtained if the depth images are corrected using inertial measurements.
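The rolling-shutter geometry discussed above can be made concrete with a small sketch: each image row has its own capture time, and gyro measurements can remove the rotation accumulated up to that time from an observed point. The readout time, image height, and angular rate below are illustrative, and the small-angle correction is a simplification of the thesis's continuous-time spline models:

    # Per-row capture time and a small-angle rolling-shutter rotation correction.
    import numpy as np

    height = 480                # image rows
    t_readout = 0.030           # 30 ms to read the full frame, top to bottom

    def row_time(row, t_frame_start):
        """Capture time of a given row under a linear rolling shutter."""
        return t_frame_start + (row / height) * t_readout

    def correct_point(p_norm, row, omega):
        """Remove rotation accumulated since frame start from a normalized point."""
        dt = (row / height) * t_readout
        theta = omega * dt                       # integrated gyro rate (rad)
        # Small-angle rotation: R ~ I + [theta]_x; apply the inverse to undo it
        # (the sign convention depends on how the gyro axes map to the camera).
        skew = np.array([[0, -theta[2], theta[1]],
                         [theta[2], 0, -theta[0]],
                         [-theta[1], theta[0], 0]])
        p = np.array([p_norm[0], p_norm[1], 1.0])
        p_corr = (np.eye(3) - skew) @ p
        return p_corr[:2] / p_corr[2]

    omega = np.array([0.0, 0.5, 0.0])            # rad/s pan during readout
    print(row_time(240, t_frame_start=0.0))      # mid-frame row time
    print(correct_point(np.array([0.1, -0.05]), 400, omega))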

A "half-perspective" Approach to Robust Ego-motion Estimation for Calibrated Cameras

A "half-perspective" Approach to Robust Ego-motion Estimation for Calibrated Cameras
Author: Robert Wagner
Publisher:
Total Pages: 58
Release: 1997
Genre: Computer vision
ISBN:

Abstract: "A new computational approach to estimate the ego-motion of a camera from sets of point correspondences taken from a monocular image sequence is presented. The underlying theory is based on a decomposition of the complete set of model parameters into suitable subsets to be optimized separately, e.g. all stationary parameters concerning camera calibration are adjusted in advance (calibrated case). The first part of the paper is devoted to the description of the mathematical model, the so-called conic error model, and the numerical solution of the derived optimization problem. In contrast to existing methods, the conic error model permits to distinguish between feasible and non-feasible image correspondences related to 3D object points in front of and behind the camera, respectively. Based on this 'half-perspective' point of view, a well-balanced objective function is derived that encourages the proper detection of mismatches and distinct relative motions. In the second part, the results of various tests are presented and analyzed. The experimental study clearly shows that the numerical stability of the new approach is superior to that of so-called self-calibration techniques (uncalibrated case). Furthermore, the precision of the estimates is better than that achieved by comparable methods in the calibrated case based on a 'full-perspective' modeling and the related epipolar geometry. Accordingly, the accuracy of the resulting ego-motion estimation turns out to be excellent, even without any further temporal filtering."