Estimating severity of depression from acoustic features and embeddings of natural speech

Sri Harsha Dumpala; Sheri Rempel; Katerina Dikaios; Mehri Sajjadian; Rudolf Uher; Sageev Oore

doi:10.1109/ICASSP39728.2021.9414129

Medicine

Research output: Contribution to journal › Conference article › peer-review

14 Citations (Scopus)

Abstract

Major depressive disorder, referred to as depression, is a leading cause of disability, absence from work, and premature death. Automatic assessment of depression from speech is a critical step towards improving diagnosis and treatment of depression. Previous works on depression assessment from speech considered various acoustic features extracted from speech to estimate depression severity. But performance of these approaches is not at clinical standards, and thus requires further improvement. In this work, we examine two novel approaches for improving depression severity estimation from short audio recordings of speech. Specifically, in audio recordings of a narrative by individuals diagnosed with major depressive disorder, we analyze spectral-based and excitation source-based features extracted from speech, and significance of sentiment and emotion classification in estimation of depression severity. Initial results indicate synchrony between depression scores and the sentiment and emotion labels. We propose the use of sentiment and emotion based embeddings obtained using machine learning techniques in estimation of depression severity. We also propose use of multi-task training to better estimate depression severity. We show that the proposed approaches provide additive improvements in the estimation of depression severity.

Original language	English
Pages (from-to)	7278-7282
Number of pages	5
Journal	Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Volume	2021-June
DOIs	https://doi.org/10.1109/ICASSP39728.2021.9414129
Publication status	Published - 2021
Event	2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada
Duration	Jun 6 2021 → Jun 11 2021

Bibliographical note

Funding Information:
Resources used in preparing this research were provided, in part, by CIHR funding reference #165835, NSERC, the Province of Ontario, Canada Research Chairs Program, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute www.vectorinstitute.ai/#partners.

Publisher Copyright:
©2021 IEEE

ASJC Scopus Subject Areas

Software
Signal Processing
Electrical and Electronic Engineering

Cite this

@article{3d60d243bbfa4a5d958d066ec8208135,

title = "Estimating severity of depression from acoustic features and embeddings of natural speech",

abstract = "Major depressive disorder, referred to as depression, is a leading cause of disability, absence from work, and premature death. Automatic assessment of depression from speech is a critical step towards improving diagnosis and treatment of depression. Previous works on depression assessment from speech considered various acoustic features extracted from speech to estimate depression severity. But performance of these approaches is not at clinical standards, and thus requires further improvement. In this work, we examine two novel approaches for improving depression severity estimation from short audio recordings of speech. Specifically, in audio recordings of a narrative by individuals diagnosed with major depressive disorder, we analyze spectral-based and excitation source-based features extracted from speech, and significance of sentiment and emotion classification in estimation of depression severity. Initial results indicate synchrony between depression scores and the sentiment and emotion labels. We propose the use of sentiment and emotion based embeddings obtained using machine learning techniques in estimation of depression severity. We also propose use of multi-task training to better estimate depression severity. We show that the proposed approaches provide additive improvements in the estimation of depression severity.",

author = "Dumpala, {Sri Harsha} and Sheri Rempel and Katerina Dikaios and Mehri Sajjadian and Rudolf Uher and Sageev Oore",

note = "Funding Information: Resources used in preparing this research were provided, in part, by CIHR funding reference \#165835, NSERC, the Province of Ontario, Canada Research Chairs Program, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute www.vectorinstitute.ai/\#partners. Publisher Copyright: {\textcopyright}2021 IEEE; 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 ; Conference date: 06-06-2021 Through 11-06-2021",

year = "2021",

doi = "10.1109/ICASSP39728.2021.9414129",

language = "English",

volume = "2021-June",

pages = "7278--7282",

journal = "Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing",

issn = "1520-6149",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Estimating severity of depression from acoustic features and embeddings of natural speech

AU - Dumpala, Sri Harsha

AU - Rempel, Sheri

AU - Dikaios, Katerina

AU - Sajjadian, Mehri

AU - Uher, Rudolf

AU - Oore, Sageev

N1 - Funding Information: Resources used in preparing this research were provided, in part, by CIHR funding reference #165835, NSERC, the Province of Ontario, Canada Research Chairs Program, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute www.vectorinstitute.ai/#partners. Publisher Copyright: ©2021 IEEE

PY - 2021

Y1 - 2021

N2 - Major depressive disorder, referred to as depression, is a leading cause of disability, absence from work, and premature death. Automatic assessment of depression from speech is a critical step towards improving diagnosis and treatment of depression. Previous works on depression assessment from speech considered various acoustic features extracted from speech to estimate depression severity. But performance of these approaches is not at clinical standards, and thus requires further improvement. In this work, we examine two novel approaches for improving depression severity estimation from short audio recordings of speech. Specifically, in audio recordings of a narrative by individuals diagnosed with major depressive disorder, we analyze spectral-based and excitation source-based features extracted from speech, and significance of sentiment and emotion classification in estimation of depression severity. Initial results indicate synchrony between depression scores and the sentiment and emotion labels. We propose the use of sentiment and emotion based embeddings obtained using machine learning techniques in estimation of depression severity. We also propose use of multi-task training to better estimate depression severity. We show that the proposed approaches provide additive improvements in the estimation of depression severity.

AB - Major depressive disorder, referred to as depression, is a leading cause of disability, absence from work, and premature death. Automatic assessment of depression from speech is a critical step towards improving diagnosis and treatment of depression. Previous works on depression assessment from speech considered various acoustic features extracted from speech to estimate depression severity. But performance of these approaches is not at clinical standards, and thus requires further improvement. In this work, we examine two novel approaches for improving depression severity estimation from short audio recordings of speech. Specifically, in audio recordings of a narrative by individuals diagnosed with major depressive disorder, we analyze spectral-based and excitation source-based features extracted from speech, and significance of sentiment and emotion classification in estimation of depression severity. Initial results indicate synchrony between depression scores and the sentiment and emotion labels. We propose the use of sentiment and emotion based embeddings obtained using machine learning techniques in estimation of depression severity. We also propose use of multi-task training to better estimate depression severity. We show that the proposed approaches provide additive improvements in the estimation of depression severity.

UR - http://www.scopus.com/inward/record.url?scp=85112030239&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85112030239&partnerID=8YFLogxK

U2 - 10.1109/ICASSP39728.2021.9414129

DO - 10.1109/ICASSP39728.2021.9414129

M3 - Conference article

AN - SCOPUS:85112030239

SN - 1520-6149

VL - 2021-June

SP - 7278

EP - 7282

JO - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing

JF - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing

T2 - 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021

Y2 - 6 June 2021 through 11 June 2021

ER -
