SHAPE DETECTION BASED MULTI OBJECT TRACKING full report
31-03-2010, 01:21 PM
Matriculation no.: 0861090
Prof. Dr.-Ing. K. Kyamakya
SHAPE DETECTION BASED MULTI TRACKING
This paper presents the detection of a wheel in the frames of a video. Two detection approaches are considered: the first is based on pixel changes between frames, and the second on Haar-like features. Detection here relies on Haar-like features, and the AdaBoost boosting technique is used to select the features that best identify the desired object. Detecting an object and tracking it across video frames raises many obstacles, so we use a Kalman filter to track the detected object from frame to frame. The paper focuses mainly on the detection and tracking of a wheel, and it also discusses how AdaBoost is used in feature extraction to detect a desired object or feature accurately.
The growing demand for cars has increased the need for safety and for surveillance of vehicles. The intelligent transportation systems community has held many seminars, presentations and conferences in this field, which deals with high-tech equipment on the vehicle and at the roadside. Technical systems today are tightly bound to their environment and cannot be separated from it. This paper deals mainly with systems built to observe traffic, vehicle size and other properties of the road network. Shape detection is a growing and essential technology. Detection is needed in many situations, such as automated security systems, bill collection, vehicle security, real-time vehicle management, night vision systems and image processing; in each case a moving or stationary object is detected across the frames of a video. How can a system understand what we want it to detect? A general example makes this easier to see. To teach a child what a thing is, we show the thing and say its name. The child remembers the object through its features and can recall it when the same object is shown again. If we show a different object, the child may not recognize it, because he does not know any object apart from those he has been shown.
In the same manner, the system must be trained to know what the desired object looks like, as in the example above, and it must also recognize the features that have to be remembered for detection.
To recognize the object or its features accurately we use Haar-like features. Forming Haar-like features on an image produces a very large number of features, and extracting the useful ones is a big task, so the AdaBoost boosting technique is used to select the desired features from this large set. The selection proceeds in several stages of classification, after which we obtain one strong classifier that detects the desired object or feature accurately. AdaBoost selects the best weak classifiers and, over a number of iterations, combines them into a strong classifier. The final detector is built as a cascade of such classifiers, in which a classifier is applied only when all previous classifiers have accepted a particular image (or a sub-window of an image). Detection has to continue across the frames without a break, so the object must be tracked from frame to frame. Here we use a Kalman filter for tracking; it predicts the future position of a moving object from a mathematical model of the system. With suitable equations for the dynamic behaviour of the system, it gives a good prediction in the presence of white Gaussian noise, so the desired object can be tracked through the frames without losing the detection. The detector is implemented with OpenCV, an open-source computer vision library originally developed by Intel, which provides more than 500 functions for image processing and machine vision.
The system needs no manual intervention during detection and is robust to occlusions and other perturbations that occur while a particular object is being detected.
A training set representing different kinds of wheels is needed to prepare the system to detect only the wheel in an image or in the frames of a video.
To prepare an image for wheel detection, features must be formed on the image; this is done with Haar-like features.
The required features are extracted robustly from the full feature set by AdaBoost through an iterative process.
The wheel features are then used to recognize the wheel accurately in subsequent frames of the video.
A classifier is used to detect the wheel in an image.
The classifiers work at high speed with little computation: they eliminate most of the images without wheels and localize the wheel in the remaining images by drawing a rectangle around it.
In real-time capture the wheel is detected automatically, without a human operator or any manual process, and the system must be robust to dynamic changes in the background, such as changes in the weather.
This paper is organized as follows. After describing the general framework of the proposed wheel detection approach, we discuss Haar-like features and their types. The following section shows how the training stage for wheel detection is designed, after which we discuss the AdaBoost classification in detail. Finally we give a clear view of Kalman filter tracking and conclude with the results.
The aim of this paper is to detect the wheel of a vehicle passing through the field of view of a fixed camera. Initially the system has no idea what a wheel is or how to recognize one in a sequence of video frames. It therefore has to be trained with datasets that tell it clearly what a wheel looks like and how to recognize one in a sequence of frames. The system also has to understand that any detected object that does not match the given dataset is not a wheel. Many datasets are available on the internet, for example for faces, bicycles, cars, windows, ears and noses, but our search found no dataset of wheels. We therefore collected wheel images manually from the internet and captured further images with a webcam.
The datasets must be arranged so that positive and negative images are kept separate. These images are given to the system in an organized manner, so any object detection or recognition task needs well-formed datasets; we also need to understand the training stages, because all of them depend on how the datasets are ordered. Knowing the approach in advance helps to control later errors. Such errors and fluctuations are seen in many detection and recognition projects when the samples are not all of the same size: if the height and width differ from sample to sample, many wrong detections result, and the cause is hard to identify. All samples in a dataset should therefore have the same height and width. The positive and negative image representations are shown below.
A positive image contains a wheel, whereas a negative image contains no wheel at all. Organizing this information carefully gives a system that detects better. To teach the system what a wheel is and what it looks like, we use a folder that contains only positive images; negative images are also used during training, but for testing the system against non-wheel content. A well-built and organized dataset helps to build a robust system that detects the wheel accurately. One more important point is that we detect the wheel from the front, so the dataset should be built from frontal images of the wheel, as shown above.
These features are also known as digital image features and are used to detect objects or to form features in an image. As in the example of the child above, the system, like the child, knows only the things we show it. For this kind of training we need Haar-like features, which work on image intensities (i.e., the pixel values at each pixel of the image).
The figures above show the basic and the extended sets of Haar-like features. Each feature consists of two or three adjacent white and black rectangles; the pixel values inside these regions are summed, and the resulting value is the feature. The process begins when an image is passed to the Haar-like feature stage: the rectangle features are placed over the whole area of the image, covering every part of it. In this way a very large number of features is found; for an image of 24×24 pixels we find nearly 49,556 features, and we need to extract those that serve our detection task. The feature value is computed as shown below: the pixel sum of one region is subtracted from the pixel sum of the other, and the difference is the required feature.
$$f(x) = \sum_{\text{black rectangle}} (\text{pixel gray level}) - \sum_{\text{white rectangle}} (\text{pixel gray level})$$
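The formula above can be illustrated with a minimal sketch that sums the pixels directly. The function names, the left/right layout of the white and black rectangles and the 4×4 patch are assumptions made for illustration; they are not taken from the report's implementation.

```python
def region_sum(image, top, left, height, width):
    """Sum the gray levels of a rectangular region of a 2D list."""
    return sum(
        image[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )


def two_rect_feature(image, top, left, height, width):
    """Two-rectangle edge feature.

    Assumed layout: the left half is the white rectangle and the right
    half is the black rectangle. Returns sum(black) - sum(white).
    """
    half = width // 2
    white = region_sum(image, top, left, height, half)
    black = region_sum(image, top, left + half, height, half)
    return black - white


# Hypothetical 4x4 gray-level patch: dark on the left, bright on the right.
patch = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
edge_response = two_rect_feature(patch, 0, 0, 4, 4)

# A uniform patch has no edge, so the two sums cancel.
flat_response = two_rect_feature([[50] * 4 for _ in range(4)], 0, 0, 4, 4)
```

A large absolute value signals a strong vertical edge inside the window, while a value near zero signals a flat region.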
There is a reason for using Haar-like features rather than another method for wheel detection in this experiment. The alternative, a pixel-change method, needs a lot of computation: it updates the detection only when large changes appear in the image or frame, and it processes the whole image instead of the desired object. Haar-like features, in contrast, search each frame for the desired region of interest using the features produced by the Haar-like process. The first research on Haar-like features was done by Papageorgiou & Poggio (2000) and Viola & Jones (2001).
A Haar-like filter consists of two or three rectangles that together form one feature. Before filtering, the image is converted from RGB to gray-level format. The feature is computed by subtracting the pixel sum of the white region from the pixel sum of the black region (normalized by a coefficient in the case of a filter with three rectangles or five rectangles). Viola and Jones introduced the integral image, an intermediate representation of the input image that reduces the computation time of the filters. The value of the integral image at a point is the sum of all pixels above and to the left of that point, so rectangle features can be computed very quickly from it.
The figure above shows the integral image; the point P(x,y) at the corner of the region holds the sum of all pixels in that region. The system computes this value for every point of the image, and the features are derived from these values. This is the case of a single rectangle, whose sum is obtained from four array references, as shown below.
The figure above shows the computation for the Haar filter: the sum of a single rectangle needs four array references, a two-rectangle feature needs six, a three-rectangle feature needs eight and a four-rectangle feature needs nine, as shown below.
In Figure 5, P1, P2, P3 and P4 are the four reference points of a single rectangle, and the figure gives a clear idea of how the sum of a single rectangle is computed.
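The four-reference computation for a single rectangle can be sketched as follows. This is a minimal pure-Python illustration rather than the OpenCV routine; the function names and the 3×3 test image are assumptions, and the integral image is padded with one row and one column of zeros to keep the indexing simple.

```python
def integral_image(image):
    """Build a padded integral image.

    ii[y][x] holds the sum of image[r][c] for all r < y and c < x,
    so row 0 and column 0 of ii are all zeros.
    """
    rows, cols = len(image), len(image[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        row_sum = 0
        for x in range(cols):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Pixel sum of a rectangle from four array references (P1 to P4)."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
total = rect_sum(ii, 0, 0, 3, 3)        # whole image
lower_right = rect_sum(ii, 1, 1, 2, 2)  # pixels 5, 6, 8, 9
```

Once the integral image is built, the cost of a rectangle sum no longer depends on the rectangle's size. A two-rectangle feature needs only six references because the two adjacent rectangles share two corner points.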
Feature selection and extraction:
A feature is a specific representation of image structure. It can be represented as a Boolean variable at a given point that describes whether the feature is located at that point or not. Feature extraction is the main task of the whole process, and it has to be done correctly to build a robust wheel detection system. Feature selection involves some image processing, and choosing specific features is a big task; it is done by feature reduction with attribute selection. Features are selected by choosing specific examples that tell the system what the desired detection task is meant to be. If the selection is not done correctly, the classifier built on it will not work properly and will produce many wrong detections. The whole selection process is carried out by programs that ship with OpenCV, and a long preparation is needed before feature extraction can start.
Feature extraction means extracting the features required to detect the desired object; if the wrong features are extracted, the detection will be wrong. It is a technique used in machine learning: selecting the relevant subset of features from the frames helps to build a robust and accurate detection process. Feature extraction is a low-level image processing operation and is the first operation performed on the image; it visits every pixel to check whether a feature is present there. Common types of feature are edges, corners or interest points, blobs or regions of interest, and ridges. The tool used here for wheel detection offers a large set of built-in functions for feature selection and feature extraction, and its bundled examples give a clear view of how to generate features and organize them on the image.
Feature vectors and feature spaces:
Many applications detect or track a particular object for different needs, and the same features are not used every time; different features are needed to represent a particular edge, corner or ridge. For robust detection of an object we therefore extract two or more features at a single image point or region. These features are stored together in a single vector, known as the feature vector, which represents that image point. The set of all possible feature vectors is known as the feature space.
Training prepares the system for the desired detection task, which here is to detect the wheel and to track its features across the frames of a video. We feed the system with correct training examples, that is, examples of what the detection task should find. Training is the most vital part of the process, and it has to be robust so that detection is well directed. The figure below shows the stages of the training process, followed by an explanation of each stage.
The figure above gives a clear view of the AdaBoost training process. As discussed previously, the process begins by feeding the system with correct samples, and it ends with the desired detection. Training is based on AdaBoost: the features are generated with Haar filters, the useful ones are selected from the thousands of features available for a single image, and finally feature vectors are built for accurate detection.
In the first stage we collect all the samples needed to detect the object in the frames. All images must be supplied in an ordered format so that the training stage is built on the same type of features. After collecting the samples in a folder, we work through every frame or image to find the desired object in it. The figure below shows how a sample is taken for wheel recognition.
In the second stage all the samples are converted to gray-scale format and passed on to feature selection, where features are built at every pixel position of each image. The images sent to the Haar filters are positive samples only, that is, images that contain a wheel; we must take care that no negative image is sent to this feature selection step. Negative images are needed at a later point in the training process, where they act as perturbations that make the system robust.
The features are generated automatically during training. The images, together with their features, are then sent to the AdaBoost feature cascade, which extracts the useful features from the large feature set; several features of the same type may be kept for a single pixel position if they help detection. The cascade stage also eliminates most of the negative images, that is, images that contain no wheel. The images pass through the classifiers, which extract features based on the training samples. Features are not selected in one cycle; the selection is iterative, so a wrong decision by one classifier is corrected by the next, and a wrong detection made by a previous classifier can be eliminated. Classifiers that are not accurate on their own are called weak classifiers. They are added repeatedly until the required detection rate is reached, and the combination of these weak classifiers is known as a strong classifier. The resulting cascade of classifiers represents the cascade of features.
Boosting is one of the most common methods for improving the accuracy of a learning algorithm. It starts from weak classifiers, which on their own make poor predictions. A general example gives a clear overview of boosting and weak classifiers. A gambler who bets on bike races decides to improve his chances of winning by building a program that predicts the winner of a race. He first needs a rule of thumb, which is only possible given clear information about all the racers and their racing history. He then collects more information from other bettors about each individual racer, which shows which racer has the best chances and which has the most favourable odds. A single rule of thumb is not enough to predict the winner of a race, but it is more accurate than a random guess. By repeatedly asking other bettors and experts, the gambler gathers new information and many rules of thumb, which together allow a more accurate prediction program.
The gambler now has a body of information about races and winners, and needs a way to extract rules of thumb from it and to use them to predict winners. No single rule found in this information is reliable, but boosting combines the rules into an accurate prediction: it starts from rough rules and improves the prediction through repeated iterations until it reaches the best one.
Boosting is a general framework from machine learning with its roots in the PAC (probably approximately correct) learning model. Prediction is needed in many settings besides racing, and complete information is often not available, so predictions have to be made with the limited information the system provides; boosting techniques and the PAC learning model address this case. The PAC model is due to Valiant; see Kearns and Vazirani for a good introduction to this model. Kearns and Valiant [25,26] were the first to pose the question of whether a weak learning algorithm which performs just slightly better than random guessing in the PAC model can be boosted into an arbitrarily accurate strong learning algorithm. Schapire came up with the first provable polynomial-time boosting algorithm in 1989. A year later, Freund developed a much more efficient boosting algorithm which, although optimal in a certain sense, nevertheless suffered from certain practical drawbacks. The first experiments with these early boosting algorithms were carried out by Drucker, Schapire and Simard on an OCR task.
Boosting is a machine learning meta-algorithm for supervised learning. Boosting techniques are widely used in image processing and have been successful in feature selection and in recognition applications such as face recognition, gesture recognition and generic object recognition. AdaBoost is the best known and most popular boosting algorithm. It combines predictions so as to separate the classes by a margin; it works well on data with little noise but less well on very noisy data. AdaBoost separates the positive and negative samples by a margin, and it separates the training data well without harming the generalization performance.
AdaBoost starts from weak classifiers and turns them into a strong classifier through an iterative process. The margin measures how well the samples are separated, and AdaBoost is good at margin convergence when separating the training samples.
The figure above shows the margin convergence of AdaBoost; the classification is obtained iteratively.
AdaBoost does not build the strong classifier in one cycle; it goes through several cycles. In each cycle it identifies a weak classifier, so called because it still makes wrong detections of the object or region of interest. Whatever one classifier gets wrong, the next classifier has the chance to check and correct. After every cycle the weights are redistributed: a high weight is given to wrong detections and a low weight to good detections. The final classifier is a linear combination of the weak classifiers, with coefficients denoted by lambda. Here lambda is the coefficient of a weak classifier.
The expression above shows how the weak classifiers are combined into the strong classifier through lambda. There is one lambda for every iteration, and its value lies between 0 and 1; if one lambda reaches the value 1, the combination is just that single weak classifier. Lambda is hard to follow directly, and we cannot expect it to stay within a particular region, so it is easier to observe Dt, the distribution of weights, as shown below.
Here Dt is the distribution of weights, and the weights sum to 1. AdaBoost arranges all the data given to it in the form of a matrix and matches the data: if an item is relevant to the object it is searching for, it is given a value; otherwise it is removed.
The lambda values maximize the margin between the given positive and negative samples, and f(x) is the feature that distinguishes the samples. AdaBoost performs feature extraction and selection through this organized, iterative process. In practice the whole process is hidden, because the tool we use provides built-in functions and programs made specifically for AdaBoost, which makes the work easier and faster. The program needs well-prepared samples and a feature space in order to extract and select the features needed to detect the wheel in the frames of a video. Most of the images without a wheel, that is, the negative sub-windows, are eliminated early in the search, which gives a fast detection.
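The reweighting loop described above can be sketched in a few lines. This is a minimal sketch of standard discrete AdaBoost with threshold classifiers on a single made-up feature value; it is not the OpenCV training program the report relies on. The toy data, the function names and the coefficient formula 0.5 ln((1 - e) / e), the usual unnormalised form of what the text calls lambda, are all assumptions.

```python
import math


def train_stump(values, labels, weights):
    """Return (error, threshold, polarity) with the lowest weighted error."""
    best = None
    for threshold in sorted(set(values)):
        for polarity in (1, -1):
            preds = [polarity if v >= threshold else -polarity for v in values]
            error = sum(w for p, y, w in zip(preds, labels, weights) if p != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(values, labels, rounds):
    """Discrete AdaBoost on one feature; labels are +1 (wheel) or -1."""
    n = len(values)
    weights = [1.0 / n] * n  # the distribution Dt, initially uniform
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(values, labels, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid log(0)
        lam = 0.5 * math.log((1 - error) / error)  # coefficient of this weak classifier
        ensemble.append((lam, threshold, polarity))
        # Raise the weight of misclassified samples, lower the rest.
        updated = []
        for v, y, w in zip(values, labels, weights):
            pred = polarity if v >= threshold else -polarity
            updated.append(w * math.exp(-lam * y * pred))
        total = sum(updated)
        weights = [w / total for w in updated]  # renormalise so Dt sums to 1
    return ensemble, weights


def predict(ensemble, value):
    """Sign of the lambda-weighted vote of all weak classifiers."""
    score = sum(lam * (p if value >= t else -p) for lam, t, p in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical feature values and labels; no single threshold separates them.
values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
labels = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble, weights = adaboost(values, labels, rounds=3)
predictions = [predict(ensemble, v) for v in values]
```

On this toy data the best single threshold classifier still misclassifies three of the ten samples, while the weighted vote of three such classifiers labels all ten correctly.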
Cascade of Classifier:
The cascade of classifiers is built from the features selected by AdaBoost and detects the object at different scales. The features are available after the Haar filter stage, and the main task is to extract the features required for the desired detection; many machine learning algorithms could be used to build the classification function. The main work of the cascade is to combine many classifiers to achieve the desired detection. The cascade accepts the sub-windows that contain the desired object and does most of its work in a very short time by rejecting the many negative sub-windows while passing the positive ones. Simple classifiers are used first to remove most of the negative sub-windows efficiently, so the more complex classifiers are rarely called upon and only have to examine the remaining sub-windows. The figure above shows that the first classifier receives all the sub-windows, and that splitting the work over several classifiers reduces the computation considerably. Each classifier works on the output of the previous one, and the features are finally stored as feature vectors. The result of one classifier is not final: a wrong detection by an earlier classifier is corrected by a later one, and sub-windows without the desired object are removed from the process. As the figure shows, every classifier either forwards a sub-window or rejects it. The outcome of the cascade of classifiers is used in the further steps of the detection process.
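The early-rejection behaviour of the cascade can be sketched as follows. The stage functions, thresholds and feature values are hypothetical; in a real detector each stage would be a boosted strong classifier rather than a single threshold test.

```python
def make_stage(feature_index, threshold):
    """Hypothetical stage: accept a sub-window if one feature is large enough."""
    def stage(features):
        return features[feature_index] >= threshold
    return stage


def run_cascade(stages, features):
    """Return (accepted, stages_evaluated), stopping at the first rejection."""
    for evaluated, stage in enumerate(stages, start=1):
        if not stage(features):
            return False, evaluated
    return True, len(stages)


# Simple, permissive stage first; stricter stages later.
stages = [make_stage(0, 0.2), make_stage(1, 0.5), make_stage(2, 0.8)]

background = [0.1, 0.9, 0.9]  # rejected by the first stage
near_miss = [0.6, 0.7, 0.3]   # passes two stages, rejected by the third
wheel = [0.9, 0.8, 0.95]      # accepted by every stage

results = [run_cascade(stages, window) for window in (background, near_miss, wheel)]
```

Because most sub-windows of a frame are background, most of them leave the cascade after one cheap test, which is where the saving in computation comes from.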
Training a cascade of classifier
Each classifier in the cascade is trained in the same way. The cascade process begins with the set of detections, and each classifier improves on the work of the previous ones, so the false detection rate can be checked and reduced stage by stage. The stages are trained individually: the first filter detects the object, the second works on the features passed on by the first, and the third works in a similar way. The features have to remain constant across the frames, while the motion of the object is updated along with the classification. Most of the classifiers share the work of finding the correct detection and the required features in the sequence of frames, which reduces the computation and makes the process faster.
The Kalman filter is one of the most widely used filters and is applicable in many systems, such as vision technologies, control systems, and electrical and electronic systems. It was first used mainly for spacecraft tracking and was then found useful in many other applications. It estimates the future states or behaviour of a system that can only be observed indirectly and inaccurately.
Filtering is needed in many applications, such as mechanical, electrical or embedded systems. Think of a cell phone call or a radio news broadcast: the listener hears noise and disturbance, and a good filter algorithm removes the noise from the electrical signal and passes only the useful information to the listener. The Kalman filter is a beneficial tool for which a clear estimate can be written in mathematical form, and it performs well when estimating the states of a linear system. It is popular because it works well in practice, and it is also theoretically attractive, because it shows clearly how the estimation of a system's future state works; moreover, it is very good at removing the noise that arises between states. The filter is interesting because it supports estimates of past, present and even future states, even when the precise behaviour of the system is unknown. To use the Kalman filter to remove noise, the system must be described by a linear model; a vehicle driving along a road segment or a plane moving about the earth are simple examples. A linear system with noise is described by the following two state-space equations:

$$x_{k+1} = A x_k + B u_k + w_k$$
$$y_k = C x_k + z_k$$

The inputs to the system are given as matrices. In the above equations A, B and C are matrices, k is the time index, x is the state of the system, u is the input to the system and y is the output of the system. The variable w is the process noise, and z is the measurement noise. The Kalman filter works with the measurement noise and the process noise: by considering the previous and present states of the system we can estimate its future state.
There in above we can see the input to the system is given as state of the system and the process noise, where noise could be like White noise or Gaussian noise, like mean to zero and the state of the system is filtered and given to the next stage of the system with removed process noise from the state of the system and then in the output equation contains the measured noise, which is occurs by the variance occurred from the errors and this is minimized by the process, where Vk is the velocity of the system.
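The predict and correct cycle described above can be sketched for the scalar case, where A, B and C reduce to plain numbers. This is only an illustration under stated assumptions: the names `kalman_step` and `estimate_constant`, the noise variances `q` and `r`, and the synthetic measurements are all hypothetical and are not taken from the project code.

```python
import random


def kalman_step(x_est, p_est, u, y, a, b, c, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    x_est, p_est: previous state estimate and its error variance
    u, y: known input and noisy measurement at this step
    a, b, c: scalar stand-ins for the matrices A, B, C
    q, r: assumed variances of the process noise w and measurement noise z
    """
    # Predict the next state from the model x_{k+1} = a*x_k + b*u_k
    x_pred = a * x_est + b * u
    p_pred = a * p_est * a + q
    # Weigh the prediction against the measurement y_k = c*x_k + z_k
    gain = p_pred * c / (c * p_pred * c + r)
    x_new = x_pred + gain * (y - c * x_pred)
    p_new = (1.0 - gain * c) * p_pred
    return x_new, p_new


def estimate_constant(true_value=10.0, steps=200, seed=0):
    """Filter noisy readings of a constant quantity (synthetic data only)."""
    rng = random.Random(seed)
    x_est, p_est = 0.0, 1000.0  # deliberately vague initial guess
    for _ in range(steps):
        y = true_value + rng.gauss(0.0, 1.0)
        x_est, p_est = kalman_step(x_est, p_est, u=0.0, y=y,
                                   a=1.0, b=0.0, c=1.0, q=1e-5, r=1.0)
    return x_est, p_est
```

Calling `estimate_constant()` returns an estimate close to the true value together with a much smaller error variance than the initial guess, which is the noise-removal effect the text describes.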
Tracking of Features
The main task is to track the features through the sequence of video frames. The camera is fixed and covers a given region, and it should be able to follow the features across that region without losing the detected object; for this purpose a Kalman filter is used to estimate how each feature moves from frame to frame. Before tracking is implemented, the motion of the object through the image sequence has to be examined to verify that the same features reappear in successive frames. After wheel detection, the vehicle is tracked as it moves through the subsequent video frames. Tracking is needed to deduce information about the vehicle, such as its speed. Tracking a vehicle simply means identifying unique points or features on the vehicle and following those features as they move through subsequent frames.
For feature extraction, the Lucas-Kanade method can be used. It is a two-frame differential method for optical flow estimation and image registration. It adds a term to the optical flow equation by assuming that the flow is constant in a local neighborhood around the central pixel under consideration at any given time.
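Assuming that a specific point/feature in an image I(x, y, t) shifts to another position while keeping its brightness, then:
$$I(x + \delta x,\; y + \delta y,\; t + \delta t) = I(x, y, t)$$
First order Taylor Expansion:
$$I_x\,\delta x + I_y\,\delta y + I_t\,\delta t = 0$$
Dividing by $\delta t$ and denoting $u = \delta x / \delta t$ and $v = \delta y / \delta t$:
$$I_x u + I_y v = -I_t$$
Assuming $(u, v)$ constant in a small neighborhood of pixels $p_1, \dots, p_n$, the equation above can be written for the whole neighborhood as:
$$A\,d = b, \qquad A = \begin{bmatrix} I_x(p_1) & I_y(p_1) \\ \vdots & \vdots \\ I_x(p_n) & I_y(p_n) \end{bmatrix}, \quad d = \begin{bmatrix} u \\ v \end{bmatrix}, \quad b = -\begin{bmatrix} I_t(p_1) \\ \vdots \\ I_t(p_n) \end{bmatrix}$$
Our goal is to minimize $\|A d - b\|^2$. Multiplying both sides of the equation above by $A^T$, we get:
$$A^T A\, d = A^T b$$
$$A^T A = \begin{bmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{bmatrix}$$
$A^T A$ represents the local neighborhood. We want this matrix to be invertible, i.e. to have no zero eigenvalues. According to the Shi/Tomasi method, a good feature is one for which $A^T A$ has large eigenvalues. So the aim of finding unique features is to find those points where $A^T A$ has large eigenvalues.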
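The least-squares step and the eigenvalue criterion can be sketched in a few lines. This is a minimal illustration, assuming the image gradients for one window are already available as flat lists; the function names and the sample gradient values are hypothetical. Here the 2x2 matrix is the sum of gradient products over the window, and the right-hand side uses the negated temporal gradient.

```python
import math


def structure_tensor(ix, iy):
    """Sum the gradient products over the window (entries of the 2x2 matrix)."""
    sxx = sum(gx * gx for gx in ix)
    sxy = sum(gx * gy for gx, gy in zip(ix, iy))
    syy = sum(gy * gy for gy in iy)
    return sxx, sxy, syy


def min_eigenvalue(sxx, sxy, syy):
    """Smaller eigenvalue of the symmetric matrix [[sxx, sxy], [sxy, syy]]."""
    half_trace = 0.5 * (sxx + syy)
    root = math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy ** 2)
    return half_trace - root


def lucas_kanade_flow(ix, iy, it):
    """Solve the 2x2 normal equations for the flow (u, v) of one window."""
    sxx, sxy, syy = structure_tensor(ix, iy)
    # Right-hand side: negated sums of spatial times temporal gradients
    bx = -sum(gx * gt for gx, gt in zip(ix, it))
    by = -sum(gy * gt for gy, gt in zip(iy, it))
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("window has no usable texture (singular matrix)")
    u = (syy * bx - sxy * by) / det
    v = (sxx * by - sxy * bx) / det
    return u, v
```

A window whose gradients all point the same way gives a smallest eigenvalue of zero, so the flow cannot be solved there; this is why the Shi/Tomasi criterion prefers points with two large eigenvalues.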
Using the OpenCV library function cvGoodFeaturesToTrack(), good features can easily be found by the Shi/Tomasi algorithm. The function is called as:
cvGoodFeaturesToTrack(frame, eig_image, temp_image, frame_features, &number_of_features, .01, .01, NULL);
Where frame is the input image, eig_image and temp_image work as workspaces for the algorithm, frame_features is the output array in which the features are returned and number_of_features receives the number of features found. The sixth argument .01 specifies the minimum quality of the features (based on the eigenvalues). The seventh argument .01 specifies the minimum Euclidean distance between features. The eighth argument NULL means using the entire input image (it can also select a part of the image).
Tracking also requires the speed of the vehicle to be known: once the position of the object is known, detecting it in the next frame is easier and needs less computation, because the search can be based on the previous information. OpenCV provides well-defined built-in Kalman filter support for this (the CvKalman structure), which is used in this work to track the features and so improve detection of the desired object.
This section describes how the project was developed and implemented, with a brief block diagram of the approach, as shown below.
It sets out the approach used to implement object detection, the programs used in the implementation and the outputs obtained by running them. OpenCV provides built-in programs for many of these tasks.
Three programs that ship with OpenCV are used. They are found on the C: drive, or wherever OpenCV is installed. The first is createsamples.exe, which is run on the available sample information and produces the output needed for the next step. The second is haartraining.exe, which takes the output of createsamples.exe as its input and produces the cascade used in the rest of the work. The last is performance.exe, which checks the performance of the cascade classifier and the accuracy of the wheel detection.
Images can be acquired anywhere wheels are found, for example on the street with any kind of camcorder. Some of the images were collected through web searches, and many sites provide images for training detectors. Any object detector needs a large dataset. The dataset consists of positive sample images and negative sample images, as shown below.
Positive and negative images are collected for wheels. 100 positive and 100 negative images can be used, and more for more robust detection. Note that only the frontal view of the wheel is detected here, so the whole set of positive samples consists of frontal views of wheels.
Before running createsamples.exe, the images have to be cropped to specify the position and number of wheels, that is, the regions that the detector should learn. A downloaded cropping program was used for this. The path of the folder of positive samples is passed to the program in the command; the program then displays the images in that folder one after another and terminates once all images have been processed. In each image the position of every wheel is marked with the mouse. The program returns the number of wheels and the coordinates of each wheel in the image and saves this information in a text file named positivesample.txt. This file is essential for the next step, where it is passed to the createsamples.exe command.
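The annotation file described above holds one line per image. A minimal sketch of writing such a line follows, assuming the usual layout for the -info file of createsamples (image path, number of objects, then x, y, width and height for each object). The helper name and the sample path are hypothetical, and the cropping program actually used in the project may format its output differently.

```python
def format_info_line(image_path, boxes):
    """Build one annotation line: path, object count, then x y w h per object.

    boxes is a list of (x, y, width, height) tuples, one per marked wheel.
    """
    fields = [image_path, str(len(boxes))]
    for x, y, w, h in boxes:
        fields.extend(str(value) for value in (x, y, w, h))
    return " ".join(fields)


def write_info_file(path, annotations):
    """Write all annotations; annotations maps image path -> list of boxes."""
    with open(path, "w") as handle:
        for image_path, boxes in annotations.items():
            handle.write(format_info_line(image_path, boxes) + "\n")
```

For example, `format_info_line("wheels/img001.jpg", [(140, 100, 45, 60)])` produces a single line that names the image, states that it contains one wheel and gives the rectangle marked with the mouse.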
Most importantly, all the images should have the same height and width; otherwise the cascade classifier is constructed poorly, and the resulting classification errors are hard to identify. To create the samples, a vector file is built from positivesample.txt by typing the command below, which names the text file, at the command prompt. This vector file is mainly used for creating the classifier:
C:\program files\OpenCV\bin>createsamples -info Positivesamples.txt -num 166 -show -w 15 -h 20 -vec CreateSample.dat > createsample.log
Training Using AdaBoost:
AdaBoost is a learning algorithm that combines weak classifiers into a strong classifier, and in doing so it extracts the features required for detecting a wheel. A very large number of features can be formed, and the task is to extract those relevant to the desired object. In this phase of training the best features (Haar features) are selected automatically by haartraining.exe.
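How weak classifiers are combined into a strong one can be illustrated with a small AdaBoost sketch. It is not the implementation inside haartraining.exe: it assumes a single numeric feature per sample (standing in for one Haar feature value), uses threshold "stumps" as the weak classifiers, and all names and data are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: vote `polarity` when x >= threshold, else the opposite."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for x, y, w in zip(xs, ys, weights)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Return a list of (alpha, threshold, polarity) weak classifiers."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1.0 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1.0 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified samples, lower the others
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, threshold, polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1
```

On labels that no single threshold can separate, one stump alone misclassifies some samples, while the weighted vote of a few stumps can classify all of them. This is the sense in which boosting turns weak classifiers into a strong one.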
The haartraining command builds the cascade of classifiers in the TrainingSample folder and produces a cascade classifier file. For detection, this cascade of classifiers must be converted to XML format.
Testing is essential to check the work done and to confirm that the approach behaves as intended. The test starts from a text file that lists all the images together with the x, y coordinates, width and height of each object. The performance command then checks the cascade files and the classification of the objects to be detected. The test reports errors in some cases; these are caused by differences in the height and width of the images, or in other cases by problems in the cascade itself. The performance program tests all the images to find the wheels in them and saves the results for all the images in a separate folder, where the images can be viewed to check the results. The format of the text file is shown below.
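The kind of bookkeeping such a test performs can be sketched as follows. This is an assumption-laden illustration rather than a description of performance.exe: it matches detections to annotated wheels using an intersection-over-union rule with a hypothetical cut-off of 0.5, whereas the OpenCV utility applies its own matching criterion. Rectangles are (x, y, width, height) tuples.

```python
def iou(a, b):
    """Intersection over union of two (x, y, width, height) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    overlap_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = overlap_w * overlap_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def score_detections(detections, truths, min_iou=0.5):
    """Count hits, missed wheels and false alarms for one image."""
    used = set()
    hits = 0
    for truth in truths:
        for index, detection in enumerate(detections):
            # Each detection may explain at most one annotated wheel
            if index not in used and iou(detection, truth) >= min_iou:
                used.add(index)
                hits += 1
                break
    missed = len(truths) - hits
    false_alarms = len(detections) - len(used)
    return hits, missed, false_alarms
```

Summing the three counts over all test images gives the hit rate and false-alarm totals from which the accuracy of the cascade can be judged.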
Real Time Software Configuration
Obstacles arise when the system is placed in a real-time environment: it may make wrong detections for many reasons, and these must be overcome to make the system robust and fit for real-time use. This calls for a real-time computer-vision-based monitoring system that can track vehicle wheels 24/7 and collect relevant information about them, working in all conditions such as low luminosity, bad weather and other environmental changes.
A computer-based traffic monitoring system can collect much more information than conventional systems. Such a system should be developed at the lowest possible cost so that it can feasibly be deployed over a large geographical area. With some image processing, detecting vehicle wheels in an image is not very difficult. The problem arises in correctly tracking the vehicles as they move along their path. Moreover, to achieve a good detection and tracking rate in all conditions, the system needs to sense environmental changes dynamically in real time and configure itself accordingly. Our proposed system is a low-cost monitoring system that is able to operate in real time with fast and efficient processing. It operates autonomously, requires low maintenance and can operate under different lighting conditions. All that the system requires is a camera to record the wheels and the traffic scene, a computer to run the image processing algorithms in real time and a screen to display the wheel information. The background of the video also has to be updated dynamically at all times; for this the Lucas-Kanade method is used together with the Kalman filter.
Video Capture and wheel Detection:
A camera is mounted at a suitable place on the road and captures video frames of the road. The wheel detection algorithm may be divided into three parts: first, the generation of a suitable reference or background image; second, the arithmetic subtraction operation; and third, the selection and application of a suitable threshold. An empty background image, showing the road without any object, is obtained. This background image is used as a reference image from which subsequent frames are subtracted. Each captured frame is subtracted from the reference image using the simple difference method, which provides an absolute pixel-to-pixel difference between the background image and the current image:
$$D(i,j) = \left| B(i,j) - I(i,j) \right| \qquad \text{(Eq-1)}$$
Where D(i, j) represents the M x N difference image, B(i, j) the background image and I(i, j) the current image. After that, a blob image is obtained which contains only those pixels of the difference image D(i, j) that are above a certain specified threshold, as shown in Eq-2, where $T(i,j)$ denotes the blob image and $\tau$ the threshold:
$$T(i,j) = \begin{cases} D(i,j) & \text{if } D(i,j) > \tau \\ 0 & \text{otherwise} \end{cases} \qquad \text{(Eq-2)}$$
This threshold value is set according to the lighting conditions and must be adaptive.
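The subtraction and thresholding steps can be sketched on small grayscale images stored as nested lists. The function names, the pixel values and the threshold of 30 are illustrative assumptions only; a real system would work on camera frames and would adapt the threshold to the lighting.

```python
def difference_image(background, current):
    """Absolute pixel-to-pixel difference between background and current frame."""
    return [[abs(b - c) for b, c in zip(background_row, current_row)]
            for background_row, current_row in zip(background, current)]


def blob_image(difference, threshold):
    """Keep only the difference pixels above the threshold; zero the rest."""
    return [[d if d > threshold else 0 for d in row] for row in difference]


# Illustrative 2 x 3 frames: the middle column changes strongly
background = [[10, 10, 10],
              [10, 10, 10]]
current = [[12, 200, 10],
           [9, 180, 11]]
difference = difference_image(background, current)
blobs = blob_image(difference, threshold=30)
```

Small differences caused by sensor noise fall below the threshold and are suppressed, while the large change in the middle column survives as the blob.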
After thresholding, a contour image of the vehicle's wheels is obtained, that is, an image that contains only the outlines of the wheels. The contour image is used to set a bounding box around the vehicle and to set the Region Of Interest (ROI). The region of interest is set in order to track the wheels of the vehicle in forthcoming frames throughout its path. Reference images can be generated by a variety of methods, e.g. from a background image acquired during a period of relative inactivity within the scene, or from a temporally adjacent image in a dynamic sequence. To adapt to both global and local illumination changes (e.g. clouds, shadows), updating strategies can be applied to keep the reference image up to date; the Kalman filter background updating is described in detail in a later section.
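Setting a bounding box from the thresholded image can be sketched as below. This simplification assumes that every non-zero pixel belongs to a single object, so it returns one rectangle for the whole mask; separating several vehicles would need contour or connected-component analysis, which is not shown. The function name is hypothetical.

```python
def bounding_box(mask):
    """Return (x, y, width, height) enclosing all non-zero pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None  # nothing above the threshold in this frame
    x, y = min(cols), min(rows)
    return x, y, max(cols) - x + 1, max(rows) - y + 1
```

The returned rectangle can then serve as the region of interest in which the wheel is searched for in the following frames.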
Image differencing was introduced in the section above. The background image needs to be updated whenever the video frames change suddenly; both large changes in the scene and changes in lighting are detected by checking the variance between the pixels of two frames, and the background is updated accordingly. The difference image gives the change from the previous frame. Thresholding then marks the region of interest in the image or frame by placing a blob or rectangle around the detected wheel.
Region of Interest:
In the video frames the region of interest has to be identified, and the whole wheel detection process is defined in terms of it. Once the region of interest is defined, OpenCV works further on that region; the sections above discussed in detail how the object to be detected is specified and found in the images. To mark the region of interest the samples are cropped: by cropping the region of interest in each sample, the system learns what to detect in the image. OpenCV provides a function for working with the region of interest once it has been cropped.
The sub-image is what is referred to as the region of interest, and it is the target of detection in any application built for object detection.
Different regions of an image have different intensities: the lighting varies across the frame, so some areas are dim and others bright, and the threshold has to be set correctly for each region of the image. Thresholding is an image segmentation method; to threshold an image, it is first converted to grayscale, and the result is then handled as a binary image. To build different thresholds for different regions of the image, adaptive thresholding is needed. Adaptive thresholding gives the required threshold for any region of the image, including the corners. Using different thresholds for different regions is also known as local or dynamic thresholding [wiki].
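A local threshold can be sketched by comparing each pixel with the mean of its own neighborhood. This is one common form of adaptive thresholding and is shown only as an illustration; the function name, the window radius and the offset are assumptions, and the OpenCV built-in function may differ in detail.

```python
def adaptive_threshold(gray, block_radius=1, offset=0):
    """Binarize a grayscale image using the mean of each pixel's neighborhood.

    gray is a list of rows of intensities; the result holds 255 where the
    pixel is brighter than (local mean - offset) and 0 elsewhere.
    """
    height, width = len(gray), len(gray[0])
    result = []
    for r in range(height):
        row = []
        for c in range(width):
            # Clip the window at the image border
            r0, r1 = max(0, r - block_radius), min(height, r + block_radius + 1)
            c0, c1 = max(0, c - block_radius), min(width, c + block_radius + 1)
            window = [gray[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            local_mean = sum(window) / len(window)
            row.append(255 if gray[r][c] > local_mean - offset else 0)
        result.append(row)
    return result
```

Because the reference level is computed per neighborhood, a moderately bright spot in a dim part of the frame is still picked out, which a single global threshold tuned for the bright areas would miss.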
Figures 16 and 17 show the grayscale image and the thresholded image. OpenCV provides built-in functions for thresholding both with and without adaptation, as shown below.
As discussed above, a real-time detection system faces obstacles: the road environment is prone to changes due to variation in light, weather conditions and other environmental changes. Image thresholding is affected by these changing conditions, and the vehicle then cannot be detected properly. The threshold value cannot be fixed for all conditions and needs to be changed dynamically. For the same reason, the background image needs to be updated in real time. To sense the environmental change and update the background image accordingly, we use a Kalman filter.
The Kalman filter makes a prediction of the future state of the system from the system equations while accounting for the process noise and the measurement noise. It predicts a new system state, compares the prediction with the actual measured value, and obtains a new estimate by weighting the difference between the prediction and the measured value. Therefore, measured values which do not fit the actual system behavior get a lower weight. In our system, the filter is applied to each image pixel P(i, j) of the input image to adaptively predict an estimate of the related background pixel at each time instant.
Our dynamic system can be characterized by the following equation:
$$\hat{b}_{t+1}(p) = b_t(p) + \mu_t(p)$$
Where $b_t(p)$ represents the background image point p at the current frame, $\hat{b}_{t+1}(p)$ represents an estimate of the same quantity at the next frame and $\mu_t(p)$ is an estimate of the system model error. The model error takes into account the system model approximation and is formed by two components:
$$\mu_t(p) = \beta_t(p) + \eta_t(p)$$
Where $\beta_t(p)$ represents a slow variation, $\eta_t(p)$ is white noise with zero mean and the model of $\beta_t(p)$ is a random walk model.
The measurement model is represented by the following equation:
$$I_t(p) = b_t(p) + \epsilon_t(p)$$
Where $I_t(p)$ represents the current image point p and $\epsilon_t(p)$ is the estimate of the noise affecting the input image, i.e. a measure of system error. The updating module uses the value $D_t(p)$ of the difference image point p and the estimate $\epsilon_t(p)$ of the noise affecting the input image to update the filter gain $K_t(p)$. Then the updating module computes an estimate of the system model error on the basis of the filter gain and the value $D_t(p)$, i.e.
$$\mu_t(p) = K_t(p)\, D_t(p)$$
The prediction module computes an estimate of the background point at the next frame, $\hat{b}_{t+1}(p)$. Another simplified view of this process is shown below.
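The per-pixel update can be sketched as follows. This is a simplified stand-in rather than the filter used in the project: instead of deriving the gain from noise estimates, it assumes two fixed gains, a larger one for pixels judged to be background and a smaller one for pixels judged to be foreground, in the spirit of the Kalman-based background estimation literature listed in the references. The gain values, the threshold and the function name are hypothetical.

```python
def update_background(background, frame, threshold,
                      gain_background=0.1, gain_foreground=0.01):
    """Move each background pixel toward the current frame by gain * difference.

    Pixels whose difference exceeds the threshold are treated as foreground
    and adapt slowly, so a passing vehicle does not bleed into the background.
    """
    updated = []
    for background_row, frame_row in zip(background, frame):
        new_row = []
        for b, i in zip(background_row, frame_row):
            difference = i - b  # signed difference between frame and background
            is_foreground = abs(difference) > threshold
            gain = gain_foreground if is_foreground else gain_background
            new_row.append(b + gain * difference)
        updated.append(new_row)
    return updated
```

Applied frame after frame, gradual lighting changes are absorbed into the background quickly, while large, short-lived differences caused by moving wheels change it only slightly.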
The aim of detection is to find the wheel in an image or in the frames of a video and to draw a red blob or rectangle around it. The results of the system are shown below.
A low-cost prototype of a traffic monitoring system based on computer vision was proposed. The system keeps infrastructure and implementation costs to a minimum by relying on a camera that captures video images of the road and on image processing algorithms. The information obtained by tracking the vehicle wheel needs to be stored in order to determine the type of the vehicle. In further work, wheel tracking could be used to collect charges on overweight vehicles by observing the pressure on the wheels, which is particularly useful information.
Pablo Negri, Xavier Clady, Shehzad Muhammad Hanif, and Lionel Prevost, A Cascade of Boosted Generative and Discriminative Classifiers for Vehicle Detection, 2008, Article ID 782432, 12 pages, doi:10.1155/2008/782432
S. Han, E. Ahn, and N. Kwak, Detection of multiple vehicles in image sequences for driving assistance system, in Proceedings of the International Conference on Computational Science and Its Applications (ICCSA '05), vol. 3480, pp. 1122-1128, Singapore, May 2005.
Prof. Dan Simon, Kalman Filtering, Embedded Systems Programming, June 2001
P. A. Viola and M. J. Jones, Robust real-time face detection, in Proceedings of the 8th IEEE International Conference on Computer Vision (ICCV '01), vol. 2, p. 747, Vancouver, BC, Canada, July 2001.
Gary Bradski, Adrian Kaehler and Vadim Pisarevsky. Learning-Based Computer Vision with Intel's Open Source Computer Vision Library. In Intel Technologies Journal, volume 9, issue 2, 2005.
P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. In IEEE Conference on Computer Vision and Pattern Recognition 2001, 2001.
Gonçalo Monteiro, Paulo Peixoto, Urbano Nunes, VISION-BASED PEDESTRIAN DETECTION USING HAAR-LIKE FEATURES
Rainer Lienhart, Alexander Kuranov and Vadim Pisarevsky. An empirical analysis of boosting algorithms for rapid objects with an extended set of Haar-like features. In Intel Technical Report MRL-TR-July 02-01, 2002.
Y. Freund and R. E. Schapire. Experiments with a new boosting algorithm. In Proceedings of 13th International Conference on Machine Learning, 1996.
J. Quinlan. Induction of decision trees. Machine Learning, 1:81-106, 1986.
S. M. S. Islam, M. Bennamoun and R. Davies, Fast and Fully Automatic Ear Detection Using Cascaded AdaBoost
Konstantinos G. Derpanis, Integral image-based representations, July 14, 2007
Pierre F. Gabriel, Jacques G. Verly, Justus H. Piater, and André Genon, The State of the Art in Multiple Object Tracking Under Occlusion in Video Sequences
P.L. Rosin and T. Ellis, Image difference threshold strategies and shadow detection, Birmingham, UK, 1995, pp. 347-356.
B.D. Lucas and T. Kanade, An iterative image registration technique with an application to stereo vision, 1981, pp. 674-679.
B.K.P. Horn and B.G. Schunck, Determining optical flow, vol. 17, 1981, pp. 185-203.
L.G. Brown, A survey of image registration techniques, ACM computing surveys (CSUR), vol. 24, no. 4, 1992, pp. 325-376.
J. Shi and C. Tomasi, Good features to track, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1994, pp. 593-600
C. Ridder, et al., Adaptive background estimation and foreground detection using Kalman-filtering, 1995, pp. 193-199
Prof. Gian Luca Foresti, Background Updating Using Kalman Filter, Lecture on Artificial Vision, May 6, 2009, Alpen Adria University, Klagenfurt.
Yoav Freund, Robert E. Schapire, A Short Introduction to Boosting, Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September, 1999.
L. G. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134-1142, November 1984.
Michael J. Kearns and Umesh V. Vazirani. An Introduction to Computational Learning Theory. MIT Press, 1994.
Michael Kearns and Leslie G. Valiant. Learning Boolean formulae or finite automata is as hard as factoring. Technical Report TR-14-88, Harvard University Aiken Computation Laboratory, August 1988.
Michael Kearns and Leslie G. Valiant. Cryptographic limitations on learning Boolean formulae and finite automata. Journal of the Association for Computing Machinery, 41(1):67-95, January 1994.
Robert E. Schapire. The strength of weak learnability. Machine Learning, 5(2):197-227,
Yoav Freund. Boosting a weak learning algorithm by majority. Information and Computation, 121(2):256-285, 1995.
Harris Drucker, Robert Schapire, and Patrice Simard. Boosting performance in neural networks. International Journal of Pattern Recognition and Artificial Intelligence, 7(4):705-719, 1993.
Cynthia Rudin, Robert E. Schapire, Ingrid Daubechies, Analysis of Boosting Algorithms using the Smooth Margin Function: A Study of Three Algorithms, 2004.
Rainer Lienhart, Alexander Kuranov, Vadim Pisarevsky, Empirical Analysis of Detection Cascades of Boosted Classifiers for Rapid Object Detection, MRL Technical Report, May 2002
4. Haar-Like Features
4.1. Integral Image
4.2. Feature Selection and Extraction
4.3. Feature vectors and feature spaces
5. Training Stage
7.1. Weak Classifier
8. Cascade of Classifier
8.1. Training Of Cascade of Classifier
9. Kalman Filter
9.1. Tracking of Feature
10. Experimental Work
10.1. Programs Used
10.2. Image Acquisition
10.3. Creating Samples
10.4. Training Using AdaBoost
11. Real Time Software Configuration
11.1. Video Capture and wheel Detection
11.2. Image Differencing
11.3. Region of Interest
11.5. Background Updating
12. Results Verification