Object Recognition for Maritime Application Using Deep Neural Networks
Hakala, Jarmo (2018)
Information Technology
Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Date of approval
2018-11-07
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-201810242474
Abstract
The aim of this thesis was to study object recognition with state-of-the-art methods in order to evaluate their potential for autonomous maritime logistics. In autonomous maritime transportation, object localization and recognition are crucial for safe and efficient traffic flow. Object recognition was studied by training deep convolutional neural networks for image classification and evaluating their classification and computational performance.
In machine learning, a classification algorithm is trained by supervised learning on a dataset of input-output examples. Object recognition is a classification task in which objects are classified from images. In deep learning, neural networks with multiple layers learn hierarchical representations of the data. Training them requires more computation and data than traditional machine learning methods. In recent years, more data has become available and computational capacity has increased dramatically. As a result, deep neural networks have outperformed traditional machine learning algorithms in many tasks, including object recognition. The best results in object recognition are achieved with deep convolutional neural networks.
In the experiments, deep convolutional neural networks were trained for image classification on the Rolls-Royce Maritime Image (RRMI) dataset. A Small-CNN architecture was generated and trained from random weight initialization using random hyperparameter search, whereas the VGG16, ResNet50 and MobileNet architectures were trained with transfer learning. The classification and computational performance of the models was measured. Transfer learning improved classification performance. VGG16 achieved the best accuracy on the dataset, 84.0%, and ResNet50 achieved the best average class accuracy, 78.4%. Computational performance was evaluated by measuring the time required for image classification on a CPU and on a GPU, in order to assess the models' potential for a real-time object localization and recognition system. On the GPU the models were much faster, requiring 3.6-16.0 milliseconds.
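The abstract reports two different scores, accuracy and average class accuracy, and a different model wins on each. The following minimal sketch shows how the two can diverge on an imbalanced dataset. It assumes that "average class accuracy" means the unweighted mean of per-class accuracies, which the abstract does not define; the function names and the toy labels are hypothetical and are not taken from the thesis or the RRMI dataset.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def average_class_accuracy(y_true, y_pred):
    """Unweighted mean of per-class accuracies (assumed definition)."""
    total = defaultdict(int)    # samples seen per true class
    correct = defaultdict(int)  # correctly classified samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    per_class = [correct[c] / total[c] for c in total]
    return sum(per_class) / len(per_class)


# Hypothetical, imbalanced toy labels: 8 "ship" images and 2 "buoy" images.
y_true = ["ship"] * 8 + ["buoy"] * 2
# The classifier gets every ship right but misses one of the two buoys.
y_pred = ["ship"] * 8 + ["ship", "buoy"]

print(overall_accuracy(y_true, y_pred))        # 9 of 10 samples correct
print(average_class_accuracy(y_true, y_pred))  # mean of 8/8 and 1/2
```

Under this assumed definition, a model that favours frequent classes can score well on overall accuracy while scoring lower on average class accuracy, which is one possible reason the two metrics rank the models differently.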