The CORSMAL benchmark for the prediction of the properties of containers

Xompero, Alessio; Donaher, Santiago; Iashin, Vladimir; Palermo, Francesca; Solak, Gokhan; Coppola, Claudio; Ishikawa, Reina; Nagao, Yuichi; Hachiuma, Ryo; Liu, Qi; Feng, Fan; Lan, Chuanlin; Chan, Rosa H.M.; Christmann, Guilherme; Song, Jyun Ting; Neeharika, Gonuguntla; Reddy, Chinnakotla K.T.; Jain, Dinesh; Rehman, Bakhtawar Ur; Cavallaro, Andrea (2022)

 




IEEE Access
doi:10.1109/ACCESS.2022.3166906
The permanent address of this publication is
https://urn.fi/URN:NBN:fi:tuni-202205064441

Description

Peer reviewed
Abstract
The contactless estimation of the weight of a container and the amount of its content, while manipulated by a person, is a key prerequisite for safe human-to-robot handovers. However, the opaqueness and transparency of the container and the content, and the variability of materials, shapes, and sizes, make this problem challenging. In this paper, we present a range of methods and an open framework to benchmark acoustic and visual perception for the estimation of the capacity of a container, and the type, mass, and amount of its content. The framework includes a dataset, specific tasks, and performance measures. We conduct a fair and in-depth comparative analysis of methods that used this framework and of audio-only or vision-only baselines designed from related works. Based on this analysis, we conclude that audio-only and audio-visual classifiers are suitable for estimating the type and amount of the content using different types of convolutional neural networks, combined with either recurrent neural networks or a majority voting strategy, whereas computer vision methods are suitable for determining the capacity of the container using regression and geometric approaches. Classifying the content type and level using only audio achieves a weighted average F1-score of up to 81% and 97%, respectively. Estimating the container capacity with vision-only approaches and the filling mass with audio-visual, multi-stage algorithms reaches up to 65% weighted average capacity and mass scores. These results show that there is still room for improvement in the design of future methods, which will be ranked and compared on the individual leaderboards provided by our open framework.
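To make two of the terms in the abstract concrete, the following is a minimal sketch of a majority voting step and of a weighted average F1-score. It is not the benchmark's own evaluation code: the function names, the label strings, and the toy predictions are assumptions made for illustration, and the framework's official scoring may differ in detail.

```python
from collections import Counter


def majority_vote(segment_predictions):
    """Return the most frequent label among per-segment predictions.

    Hypothetical helper: assumes a classifier emitted one label per
    audio segment and that a single label is wanted per recording.
    """
    return Counter(segment_predictions).most_common(1)[0][0]


def weighted_f1(y_true, y_pred):
    """Weighted average F1: per-class F1 weighted by class support."""
    total = len(y_true)
    score = 0.0
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Support is the number of true instances of this class.
        score += f1 * (tp + fn) / total
    return score


# Illustrative labels only; not taken from the dataset.
y_true = ["water", "water", "rice", "pasta"]
y_pred = ["water", "rice", "rice", "pasta"]
print(majority_vote(["rice", "water", "rice"]))
print(weighted_f1(y_true, y_pred))
```

Weighting each class by its support keeps frequent classes from being under-represented in the final score, which matters when the classes are imbalanced.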
Collections
  • TUNICRIS-julkaisut [20143]
Kalevantie 5
PL 617
33014 Tampereen yliopisto
oa[@]tuni.fi | Privacy | Accessibility statement
 

 
