P1-6 Predicting Musical Chills From Autoencoder Latent Space Representations
Name: Arun Asthagiri
School/Affiliation: Northeastern University
Co-Authors: Psyche Loui
Virtual or In-person: In-person
Abstract:
Music-induced chills arise from heightened states of physiological arousal that correspond to intense, pleasurable musical moments (Salimpoor, 2009). It is generally thought that musical features, such as melodic appoggiaturas or harmonic sequences (Sloboda, 1991), elicit physiological responses during these pleasurable moments. However, these features may interact in how they individually or jointly contribute to a physiological response, which makes predicting chills in music at scale challenging. More recently, deep-learning encoder-decoder approaches have garnered attention for their ability to create meaningful latent-space representations from complex data, yet it is not well understood how closely these learned representations resemble human music perception. Here, we investigate whether embeddings in the latent space of a pre-trained neural audio compression model (Kumar, 2023) can predict physiological arousal and music-induced chills. Northeastern students (n=46) listened to self-selected “chills-inducing” and “neutral” music, reporting moments of chills along with ratings of valence, engagement, and surprise. Skin conductance and photoplethysmography (PPG) data were collected using wearables. First, we reproduce findings linking music-induced chills to physiological arousal, including increased heart rate and skin conductance. Then, we extract time-varying embeddings of songs from a neural autoencoder; these embeddings form apparent clusters within songs and correlate with structural elements (e.g., section transitions). Finally, we show that transitions in the embedding space coincide in time with physiological changes. This approach of using encoder representations of raw audio holds promise for extracting perceptually and physiologically meaningful information from music.