NTT's team wins 1st place in Audio Captioning task at DCASE 2020 Challenge

Yuma Koizumi, with the Media Intelligence Laboratories of the Service Innovation Laboratory Group, and Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, and Kunio Kashino, with the Communication Science Laboratories of the Science and Core Technology Laboratory Group, won 1st place in the Audio Captioning task at the DCASE 2020 Challenge held from March to July this year.

Yuma Koizumi
(Researcher)

Daiki Takeuchi

Yasunori Ohishi
(Senior Research Scientist)

Noboru Harada
(Senior Research Scientist, Supervisor)

Kunio Kashino
(Senior Distinguished Researcher)

The DCASE* Challenge is an annual international competition officially recognized by the IEEE Audio and Acoustic Signal Processing Technical Committee, and this year's event was the sixth. "Automated audio captioning" is a new task that DCASE introduced this year. The challenge is to automatically generate appropriate and accurate text descriptions or explanations for given audio signals of various non-speech sounds. Ten teams from around the world competed in the task.

NTT is one of the earliest research institutes in the world to work on the verbalization of sounds. To tackle the task, we took full advantage of the algorithms and knowledge accumulated by the above members, and combined various ideas ranging from pre-processing to post-processing and automated meta-parameter tuning.

Automated audio captioning is an emerging technology field, and no method for achieving it has yet been established. The capability to describe all kinds of sounds in text could bring many benefits to our lives in the near future. NTT will therefore continue its research to further strengthen the technology.

* DCASE: Detection and Classification of Acoustic Scenes and Events is an international conference on sound event detection and sound scene classification.

DCASE2020 Challenge (DCASE2020)

Automatic creation of textual content descriptions for general audio signals. (DCASE2020)
