"Xie, Huang" - Browse by author: TUNICRIS publications
-
Crowdsourcing and Evaluating Text-Based Audio Retrieval Relevances
Xie, Huang; Khorrami, Khazar; Räsänen, Okko; Virtanen, Tuomas; Fuentes, Magdalena; Heittola, Toni; Imoto, Keisuke; Mesaros, Annamaria; Politis, Archontis; Serizel, Romain; Virtanen, Tuomas (Tampere University, 2023)
conferenceObject -
Language-based Audio Retrieval Task in DCASE 2022 Challenge
Xie, Huang; Lipping, Samuel; Virtanen, Tuomas (DCASE, 2022)
conferenceObject -
On Negative Sampling for Contrastive Audio-Text Retrieval
Xie, Huang; Räsänen, Okko; Virtanen, Tuomas
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (IEEE, 2023)
conferenceObject
This paper investigates negative sampling for contrastive learning in the context of audio-text retrieval. The strategy for negative sampling refers to selecting negatives (either audio clips or textual descriptions) from ... -
Unsupervised Audio-Caption Aligning Learns Correspondences between Individual Sound Events and Textual Phrases
Xie, Huang; Räsänen, Okko; Drossos, Konstantinos; Virtanen, Tuomas
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (IEEE, 05/2022)
conferenceObject
We investigate unsupervised learning of correspondences between sound events and textual phrases through aligning audio clips with textual captions describing the content of a whole audio clip. We align originally unaligned ... -
Zero-Shot Audio Classification using Image Embeddings
Dogan, Duygu; Xie, Huang; Heittola, Toni; Virtanen, Tuomas
European Signal Processing Conference (IEEE, 2022)
conferenceObject -
Zero-Shot Audio Classification Via Semantic Embeddings
Xie, Huang; Virtanen, Tuomas (2021)
article
In this paper, we study zero-shot learning in audio classification via semantic embeddings extracted from textual labels and sentence descriptions of sound classes. Our goal is to obtain a classifier that is capable of ... -
Zero-shot audio classification with factored linear and nonlinear acoustic-semantic projections
Xie, Huang; Räsänen, Okko; Virtanen, Tuomas
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (IEEE, 2021)
conferenceObject
In this paper, we study zero-shot learning in audio classification through factored linear and nonlinear acoustic-semantic projections between audio instances and sound classes. Zero-shot learning in audio classification ...