Project Details
Project Code | JWA141 |
Title | Vehicle Detection in Aerial Surveillance Using Dynamic Bayesian Networks |
Project Type | Java Swing Application |
Front End | Eclipse |
Back End | Nil |
Project Cost |
Abstract
We present an automatic vehicle detection system for aerial surveillance. The system departs from existing frameworks for vehicle detection in aerial surveillance, which are either region based or sliding-window based, and instead uses a pixelwise classification method. The novelty lies in the fact that, despite performing pixelwise classification, the relations among neighboring pixels in a region are preserved in the feature extraction process. We consider features including vehicle colors and local features. For vehicle color extraction, we utilize a color transform to separate vehicle colors and non-vehicle colors effectively. For edge detection, we apply moment-preserving thresholding to adjust the thresholds of the Canny edge detector automatically, which increases the adaptability and the accuracy of detection in various aerial images. Afterward, a dynamic Bayesian network (DBN) is constructed for classification. We convert regional local features into quantitative observations that can be referenced when applying pixelwise classification via the DBN. Experiments were conducted on a wide variety of aerial videos. The results demonstrate the flexibility and good generalization ability of the proposed method on a challenging data set with aerial surveillance images taken at different heights and under different camera angles.
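As a concrete illustration of the automatic threshold selection mentioned above, the sketch below computes a moment-preserving (Tsai-style) bilevel threshold from an 8-bit histogram (for example, of gradient magnitudes) and derives Canny high/low thresholds from it. This is a minimal sketch under stated assumptions: the 0.4 low-to-high ratio, the choice of input histogram, and the class and method names are illustrative and are not taken from the paper.

```java
// Sketch: moment-preserving (Tsai) bilevel threshold used to set Canny thresholds automatically.
// Assumption: the low threshold is taken as 0.4 * high; the paper's exact ratio is not stated here.
public final class MomentPreservingThreshold {

    /** Returns a value t in [0,255] that preserves the first three moments of the input values. */
    public static int computeThreshold(int[] values) {
        double n = values.length;
        double[] hist = new double[256];
        for (int v : values) hist[v] += 1.0 / n;          // normalized histogram

        double m1 = 0, m2 = 0, m3 = 0;                    // first three moments (m0 = 1)
        for (int z = 0; z < 256; z++) {
            m1 += z * hist[z];
            m2 += (double) z * z * hist[z];
            m3 += (double) z * z * z * hist[z];
        }

        // Solve for the two representative levels z0 < z1 (moment-preserving bilevel thresholding).
        double cd = m2 - m1 * m1;
        double c0 = (m1 * m3 - m2 * m2) / cd;
        double c1 = (m1 * m2 - m3) / cd;
        double disc = Math.sqrt(c1 * c1 - 4 * c0);
        double z0 = (-c1 - disc) / 2;
        double z1 = (-c1 + disc) / 2;

        // Fraction of pixels that should fall below the threshold.
        double p0 = (z1 - m1) / (z1 - z0);

        // p-tile selection: pick the level whose cumulative frequency first reaches p0.
        double cum = 0;
        for (int z = 0; z < 256; z++) {
            cum += hist[z];
            if (cum >= p0) return z;
        }
        return 128;                                       // fallback for degenerate histograms
    }

    /** Canny {low, high} thresholds derived from the moment-preserving threshold. */
    public static double[] cannyThresholds(int[] values) {
        int high = computeThreshold(values);
        return new double[] { 0.4 * high, high };
    }
}
```

Because the threshold is recomputed from each image's own histogram, the edge detector adapts to the varying contrast of aerial images instead of relying on fixed, hand-tuned Canny thresholds.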
Introduction
Aerial surveillance has a long history in the military for observing enemy activities and in the commercial world for monitoring resources such as forests and crops. Similar imaging techniques are used in aerial news gathering and in search and rescue. Historically, aerial surveillance has been performed primarily using film or electronic framing cameras. The objective has been to gather high-resolution still images of an area under surveillance that could later be examined by human or machine analysts to derive information of interest. Currently, there is growing interest in using video cameras for these tasks. Video captures dynamic events that cannot be understood from aerial still images. It enables feedback and triggering of actions based on dynamic events and provides crucial and timely intelligence and understanding that is not otherwise available. Video observations can be used to detect and geo-locate moving objects in real time and to control the camera, for example, to follow detected vehicles or constantly monitor a site. However, video also brings new technical challenges. Video cameras have lower resolution than framing cameras. To obtain the resolution required to identify objects on the ground, it is generally necessary to use a telephoto lens with a narrow field of view. This leads to the most serious shortcoming of video in surveillance: it provides only a “soda straw” view of the scene. The camera must then be scanned to cover extended regions of interest. An observer watching this video must pay constant attention, as objects of interest move rapidly in and out of the camera's field of view. The video also lacks a larger visual context; the observer has difficulty relating the locations of objects seen at one point in time to objects seen moments before. In addition, geodetic coordinates for objects of interest seen in the video are not available.
One of the main topics in aerial image analysis is scene registration and alignment. Another very important topic in intelligent aerial surveillance is vehicle detection and tracking. The challenges of vehicle detection in aerial surveillance include camera motions such as panning, tilting, and rotation. In addition, airborne platforms at different heights result in different sizes of target objects.
In this paper, we design a new vehicle detection framework that preserves the advantages of the existing works and avoids their drawbacks. The framework can be divided into the training phase and the detection phase. In the training phase, we extract multiple features, including local edge and corner features as well as vehicle colors, to train a dynamic Bayesian network (DBN). In the detection phase, we first perform background color removal. Afterward, the same feature extraction procedure is performed as in the training phase. The extracted features serve as the evidence to infer the unknown state of the trained DBN, which indicates whether a pixel belongs to a vehicle or not. In this paper, we do not perform region-based classification, which would depend heavily on the results of color segmentation algorithms such as mean shift. There is no need to generate multiscale sliding windows either. The distinguishing feature of the proposed framework is that the detection task is based on pixelwise classification. However, the features are extracted in a neighborhood region of each pixel. Therefore, the extracted features comprise not only pixel-level information but also relationships among neighboring pixels in a region. Such a design is more effective and efficient than region-based or multiscale sliding-window detection methods, as the sketch below illustrates.
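To make the detection-phase flow concrete, the following sketch shows one way the pixelwise classification could be organized: for each pixel that survives background color removal, edge, corner, and vehicle-color responses are aggregated over a small neighborhood and passed to a trained classifier. This is a hedged sketch, not the paper's implementation: the 7x7 window, the simple background test, the 0.5 decision threshold, and the `DbnClassifier` interface are assumptions, and the actual DBN structure and color transform are not reproduced here.

```java
/** Sketch of the detection phase: pixelwise classification from neighborhood features.
 *  The 7x7 window, the background test, and the classifier interface are assumptions. */
public final class PixelwiseVehicleDetector {

    /** Stand-in for the trained dynamic Bayesian network; any probabilistic classifier fits here. */
    public interface DbnClassifier {
        /** Returns P(pixel belongs to a vehicle | observed regional features). */
        double vehicleProbability(double edgeDensity, double cornerDensity, double colorFraction);
    }

    private static final int HALF_WINDOW = 3;   // 7x7 neighborhood (assumed size)

    private final DbnClassifier classifier;

    public PixelwiseVehicleDetector(DbnClassifier classifier) {
        this.classifier = classifier;
    }

    /**
     * @param edgeMap      1 where the Canny detector fired, else 0
     * @param cornerMap    1 where a corner detector (e.g., Harris) fired, else 0
     * @param vehicleColor 1 where the color transform marked a likely vehicle color, else 0
     * @param background   1 where background color removal discarded the pixel, else 0
     * @return per-pixel vehicle mask (true = vehicle)
     */
    public boolean[][] detect(int[][] edgeMap, int[][] cornerMap,
                              int[][] vehicleColor, int[][] background) {
        int h = edgeMap.length, w = edgeMap[0].length;
        boolean[][] mask = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (background[y][x] == 1) continue;       // skip pixels removed as background
                double edges = 0, corners = 0, colors = 0, count = 0;
                for (int dy = -HALF_WINDOW; dy <= HALF_WINDOW; dy++) {
                    for (int dx = -HALF_WINDOW; dx <= HALF_WINDOW; dx++) {
                        int ny = y + dy, nx = x + dx;
                        if (ny < 0 || ny >= h || nx < 0 || nx >= w) continue;
                        edges += edgeMap[ny][nx];
                        corners += cornerMap[ny][nx];
                        colors += vehicleColor[ny][nx];
                        count++;
                    }
                }
                // Regional observations referenced by the pixelwise classifier.
                double p = classifier.vehicleProbability(edges / count, corners / count, colors / count);
                mask[y][x] = p > 0.5;                      // decision threshold (assumed)
            }
        }
        return mask;
    }
}
```

A concrete `DbnClassifier` would be fitted in the training phase from the same neighborhood features; it is left as an interface here so the per-pixel inference step stays readable and independent of any particular DBN library.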