Comparing everyday listening to auditory perception stimuli by classifying two million YouTube videos
Name:Andres Elizondo Lopez
School/Affiliation:McMaster University
Co-Authors:Michael Schutz
Virtual or In-person:In-person
Abstract:
Roughly 90% of auditory experiments are conducted using simplistic, computer-generated stimuli (Schutz & Gillard, 2020). The classification of audio from 2 million YouTube videos into 527 labels by Gemmeke et al. (2017) provides a useful approximation of the sounds encountered in everyday listening. Two research assistants reviewed 10 representative videos from each of the 527 labels, classifying them according to the categories used by Schutz and Gillard (2020). We found that the distribution of YouTube labels is roughly the inverse of the distribution of stimuli used in auditory experiments: most sounds heard in YouTube videos (~90%) exhibit complex temporal structures. These findings contrast with the stimuli used to assess the auditory system, of which only 11% exhibit the temporal complexity of referential sounds (Schutz & Gillard, 2020). This raises important questions about whether theories and models derived from auditory experiments focused on simplistic sounds generalize to auditory processing as it occurs outside the laboratory. It also helps clarify why a small but growing body of experimental work demonstrates that theories derived from experiments using artificial stimuli fail to generalize when assessed with the more complex sounds common outside the laboratory.