Privacy conscious computer vision
Yrjänäinen, Jukka (2020)
Yrjänäinen, Jukka
2020
Degree Programme in Electrical Engineering, MSc (Tech)
Faculty of Information Technology and Communication Sciences
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Approval date
2020-05-20
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tuni-202005125231
Abstract
This work describes the design and implementation of a distributed computer vision solution for person tracking and counting in a public museum. The system consists of a fleet of edge devices that send data to a cloud server for further analysis. The key design goal is to protect the privacy of the persons being monitored. This is achieved by a system that does not send actual images to the server; instead, detected persons are represented by feature vectors extracted from the images. Privacy-sensitive image data is not stored or transmitted anywhere in the system.
The device design consists of Raspberry Pi single-board computers equipped with neural network acceleration hardware and a camera module. These devices locate persons in the camera view with an object detection neural network. After detection, a re-identification neural network is applied to the found object to generate a feature vector representation, which is sent to the cloud server. Based on the feature vectors, it is possible to associate detected people across multiple cameras and moments in time. However, it is not possible to reconstruct the original image from a feature vector.
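As an illustration of this pipeline, the following is a minimal sketch of the edge-device processing loop, assuming a Python implementation. The detector and re-identification wrappers, the feature dimension and the server endpoint are placeholders rather than the actual thesis code; the point is that only the feature vector, camera identifier and timestamp leave the device.

```python
# Hypothetical sketch of the edge-device pipeline: detect persons, extract a
# re-identification feature vector per detection, and transmit only the vector.
# Model wrappers and the server URL are placeholders, not the thesis code.
import time

import numpy as np
import requests

SERVER_URL = "https://example.org/api/detections"  # placeholder endpoint


def detect_persons(frame: np.ndarray) -> list[tuple[int, int, int, int]]:
    """Stand-in for the object detection network; returns bounding boxes."""
    # A real implementation would run an accelerated detector on the
    # neural network accelerator attached to the Raspberry Pi.
    return [(10, 20, 110, 220)]


def reid_features(crop: np.ndarray) -> np.ndarray:
    """Stand-in for the re-identification network; returns a feature vector."""
    vec = np.random.rand(256).astype(np.float32)
    return vec / np.linalg.norm(vec)  # unit-normalised descriptor


def process_frame(frame: np.ndarray, camera_id: str) -> None:
    for (x1, y1, x2, y2) in detect_persons(frame):
        crop = frame[y1:y2, x1:x2]
        vec = reid_features(crop)
        # Only the vector, camera id and timestamp are transmitted;
        # the image crop is discarded on the device.
        payload = {
            "camera": camera_id,
            "timestamp": time.time(),
            "feature": vec.tolist(),
        }
        requests.post(SERVER_URL, json=payload, timeout=5)


if __name__ == "__main__":
    dummy_frame = np.zeros((480, 640, 3), dtype=np.uint8)  # simulated camera frame
    process_frame(dummy_frame, camera_id="cam-01")
```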
The experiments and performance measurements with the edge devices show that the simultaneous use of deep neural networks for object detection and feature generation on relatively low-cost hardware is feasible. The design recommendation based on the experiments is to use a dedicated hardware accelerator for running all neural networks. The analysis also shows that varying the accuracy and computational complexity of the neural networks offers a range of feasible performance trade-offs.
The data analysis method in the server, which uses only feature vector data for tracking and clustering, is evaluated. Experiments with a publicly available image dataset indicate that the proposed approach can approximate the person count with reasonable accuracy.
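One plausible way to realise such server-side counting is to cluster the collected feature vectors and report the number of clusters as the person-count estimate. The sketch below assumes DBSCAN with cosine distance, which is an illustrative choice and not necessarily the method evaluated in the thesis.

```python
# Minimal sketch of server-side counting: cluster received ReID feature
# vectors and use the number of clusters as a person-count estimate.
# DBSCAN with cosine distance is an assumed, illustrative choice.
import numpy as np
from sklearn.cluster import DBSCAN


def estimate_person_count(features: np.ndarray, eps: float = 0.3) -> int:
    """Estimate the number of distinct persons from ReID feature vectors."""
    labels = DBSCAN(eps=eps, min_samples=2, metric="cosine").fit_predict(features)
    # Each cluster is treated as one re-identified person; noise points
    # (label -1) are counted as individual sightings.
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    n_noise = int(np.sum(labels == -1))
    return n_clusters + n_noise


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated data: three persons, five noisy detections each.
    persons = rng.normal(size=(3, 128))
    features = np.vstack([p + 0.05 * rng.normal(size=(5, 128)) for p in persons])
    print(estimate_person_count(features))  # expected estimate: 3
```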