Deep neural network for automatic vehicle detection
Hakola, Anni (2019)
Electrical Engineering
Informaatioteknologian ja viestinnän tiedekunta - Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2019-05-22
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-201905211710
Abstract
Machine learning has taken on an important role in research, business and everyday life, for example in automatic aviation, face and speech recognition and virtual reality games. Visy Oy, a company in Tampere, Finland, develops tools for automatic traffic control. These include an access gate consisting of an inductive loop or a laser scanner, a barrier and a camera. The loop or scanner triggers the camera to take an image when a vehicle is at the spot on which the camera is zoomed and focused. The image is fed to license plate recognition software, and an access decision is made based on the recognized plate. If access is granted, the barrier opens.
This Thesis has two aims regarding machine learning combined with automatic traffic control. The first aim is to find, study and test high-image-quality cameras and to decide whether they are suitable for Visy projects. The high image quality is motivated by customers’ need to recognize small details, such as seals and dangerous goods labels, in an image taken of a whole container. The cameras that Visy Oy currently uses are not sufficient for this purpose.
Three cameras are chosen for the camera tests: Sony’s video surveillance camera, Canon’s digital single-lens reflex camera and Basler’s video surveillance camera, which is the camera currently used in the projects. Only Sony and Basler are included in the final tests because of a software support problem with Canon’s camera. The tests are performed from Visy Oy’s perspective and for Visy Oy’s needs at the Visy Oy office, and the results are assessed visually. In the tests, the cameras take an image every 15 minutes, including during the night, and the images are saved to a folder on a computer.
The Sony camera is found to have significantly higher image quality than the Basler camera, especially at night. It fulfils Visy’s requirements and is found to be suitable for Visy’s projects. It has already been proposed for a potential project where small details need to be recognized, but the project had not been confirmed at the time of writing this Thesis.
The second aim of this Thesis is to implement a deep convolutional neural network for automatic vehicle detection, called a virtual trigger. Its purpose is to replace inductive loops and laser scanners in Visy projects. In other words, image frames are captured from a camera, and each frame is classified according to whether or not it contains a vehicle at the correct spot. If a frame is classified as having a vehicle at the correct spot, an image for license plate recognition is triggered. Three network models are implemented, trained and tested: two pre-trained models and one model created from scratch.
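The frame-by-frame triggering described above can be illustrated with a minimal Python sketch. It is not the implementation from the Thesis: the function name `virtual_trigger`, the callable `vehicle_probability` (a stand-in for the trained network), the threshold of 0.5 and the choice to trigger only once when a vehicle first appears are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, TypeVar

Frame = TypeVar("Frame")


def virtual_trigger(
    frames: Iterable[Frame],
    vehicle_probability: Callable[[Frame], float],
    threshold: float = 0.5,  # assumed decision threshold, not from the Thesis
) -> Iterator[int]:
    """Yield the index of each frame at which a plate image would be triggered.

    `vehicle_probability` is a hypothetical stand-in for the trained network:
    it returns the estimated probability that a vehicle is at the correct spot.
    """
    vehicle_present = False
    for index, frame in enumerate(frames):
        is_vehicle = vehicle_probability(frame) >= threshold
        # Assumption: trigger only on the transition from "no vehicle" to
        # "vehicle", so one vehicle does not cause repeated triggers.
        if is_vehicle and not vehicle_present:
            yield index
        vehicle_present = is_vehicle


# Stand-in data: each "frame" is simply a precomputed classifier score.
scores = [0.1, 0.2, 0.9, 0.95, 0.3, 0.8]
triggers = list(virtual_trigger(scores, lambda score: score))
print(triggers)
```

In a real installation the frames would come from the camera stream and the callable would wrap the chosen network model; the sketch only shows the control flow between classification and triggering.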
The virtual trigger network is required to be fast and to classify images with high accuracy, meaning over 99 %. The neural network tests show that one of the pre-trained network models achieves almost all the goals, and it is chosen for real-life tests, which are not part of this Thesis. The virtual trigger is now operating in a real installation. The results are promising, but further improvements are needed to reach over 99 % accuracy in real life.
Almost all the goals were achieved: a suitable camera was found, and the virtual trigger obtained over 99 % validation accuracy. The camera tests were slightly one-sided and the virtual trigger did not reach the target on the test data, but the future of both parts looks promising.