Items 1-2 of 2
WaveTransformer: An Architecture For Automated Audio Captioning
Automated audio captioning is a multi-modal task in which the system receives an audio sample as an input and generates a text (a caption) that describes the information presented in the audio. The system not only detects ...
Sequence Temporal Sub-Sampling for Automated Audio Captioning
Audio captioning is a novel task in machine learning which involves the generation of a textual description for an audio signal. For example, a method for audio captioning must be able to generate descriptions like “two ...