Conversational Transfer Learning for Emotion Recognition

Devamanyu Hazarika^a, Soujanya Poria^c,*, Roger Zimmermann^a, Rada Mihalcea^b

^a School of Computing, National University of Singapore
^b Computer Science & Engineering, University of Michigan, USA
^c Information Systems Technology and Design, Singapore University of Technology and Design
Abstract

Recognizing emotions in conversations is a challenging task due to the presence of contextual dependencies governed by self- and inter-personal influences. Recent approaches have focused on modeling these dependencies primarily via supervised learning. However, purely supervised strategies demand large amounts of annotated data, which is lacking in most of the available corpora for this task. To tackle this challenge, we look at transfer learning approaches as a viable alternative. Given the large amount of available conversational data, we investigate whether generative conversational models can be leveraged to transfer affective knowledge for the target task of detecting emotions in context. We propose an approach where we first train a neural dialogue generation model and then perform parameter transfer to initialize our target emotion classifier. Apart from the traditional pre-trained sentence encoders, we also incorporate parameter transfer from the recurrent components that model inter-sentence context across the whole conversation. Based on this idea, we perform several experiments across multiple datasets and find improvements in performance and robustness against limited training data. Our models also achieve better validation performance in significantly fewer epochs. Overall, we infer that knowledge acquired from dialogue generators can indeed help recognize emotions in conversations.

Keywords: Emotion Recognition in Conversations, Transfer Learning, Generative Pre-training
1. Introduction

Emotion Recognition in Conversations (ERC) is the task of detecting emotions from utterances in a conversation. It is an important task with applications ranging from dialogue understanding to affective dialogue systems [1]. Apart from the traditional challenges of dialogue understanding, such as intent detection and contextual grounding [2], ERC presents additional challenges as it requires the ability to model emotional dynamics governed by the self- and inter-speaker influences at play [3]. Further complications arise due to the limited availability of annotated data (especially in multimodal ERC) and the variability in annotations owing to the subjectivity of annotators in interpreting emotions.
In this work, we focus on these issues by investigating a framework of sequential inductive transfer learning (TL) [5]. In particular, we attempt to transfer contextual affective information from a generative conversation-modeling task to ERC. But why should generative modeling of conversations acquire knowledge of emotional dynamics? To answer this question, we first observe the role of emotions in conversations. Several works in the literature have indicated that emotional goals and influences act as latent controllers in dialogues [6, 7]. Fig. 1 provides
* Corresponding author. Contributions: ideation (use of transfer learning in emotion recognition in conversation from generative conversation modeling) and organization of the paper.
Email addresses: hazarika@comp.nus.edu.sg (Devamanyu Hazarika), soujanya_poria@sutd.edu.sg (Soujanya Poria), rogerz@comp.nus.edu.sg (Roger Zimmermann), mihalcea@umich.edu (Rada Mihalcea)
Preprint submitted to Journal of Elsevier October 11, 2019
[Figure 1: Samples from the Cornell Movie Dialog Corpus [4] demonstrating the presence of emotional dynamics in conversations: (a) an exchange between Harald and Snorri in which Snorri stays frustrated across turns; (b) an exchange between Enid and Rebecca who mirror each other's disgust; (c) an exchange between John and Enid in which John shifts from neutral to anger.]
some examples demonstrating the existence of diverse emotional dynamics in such conversations. In the figure, conversation (a) illustrates the presence of emotional inertia [8], which occurs through self-influences on emotional states: the character Snorri maintains a frustrated emotional state by not being affected by the other speaker. Conversations (b) and (c), in contrast, demonstrate the role of inter-speaker influences in emotional transitions across turns. In (c), the character John undergoes an emotional shift triggered by his counterpart's responses, while (b) demonstrates the effect of mirroring [9], which often arises from topical agreement between speakers. All these examples demonstrate emotional dynamics that are not just inherent in conversations but also help shape them [1].
To model such conversations, a generator would require the ability to 1) interpret latent emotions from the contextual turns and 2) model the complex dynamics governing them. In addition, it would also need to interpret other factors such as the topic of the conversation, speaker personalities, and intents. Such a model would then be a perfect dialogue generator. We illustrate this in Fig. 1, where a model generating utterance utt_{t+1} would need to understand the emotions of the context arising from the utterances utt_t, utt_{t-1}, and so on. We therefore hypothesize that a trained dialogue generator would possess the ability to model implicit affective patterns across a conversation [10]. Consequently, we propose a framework that uses TL to transfer this affective knowledge into our target discriminative task, i.e., ERC.
In our approach, we first pre-train a model on the source task of conversation modeling, which, being unsupervised (or self-supervised), typically benefits from a large amount of data in the form of multi-turn chats. Next, we adapt our model to the target task (ERC) by transferring the inter-sentence context-modeling parameters from the trained source model. For sentence encoding, we choose the BERT model [11], which is pre-trained on masked language modeling and next sentence prediction objectives.
Although we acknowledge that training a perfect dialogue generator is presently challenging, we demonstrate that benefits can be observed even with a popular baseline generator. In the bigger picture, our approach can enable the co-evolution of both generative and discriminative models for the tasks mentioned above: an emotion classifier improved using a dialogue model can, in turn, be utilized to further improve dialogue models with emotional intelligence, leading to an iterative cycle of improvements for both applications. Overall, our contributions comprise the proposed TL framework, experiments across multiple datasets showing improved performance and robustness to limited training data, and analyses of the transferred affective knowledge.
[Figure 3: Proposed framework for ERC using TL parameters. Source task (conversation modeling): sentence encoders feed a context RNN whose states drive decoders generating the next utterances. Target task (Emotion Recognition in Conversations): sentence encoders feed a context RNN whose states drive per-utterance classifiers. The transferred parameters are {θ_enc^source, θ_BERT} for sentence encoding and θ_cxt^source for context encoding.]
2.2. Transfer Learning for Affect

TL for affective analysis has gained momentum in recent years, with several works adopting TL-based approaches for their respective tasks. These works leverage diverse source tasks, such as sentiment/emotion analysis in text [23, 24, 25], large-scale image classification in vision [26], and sparse auto-encoding in speech [27]. To the best of our knowledge, our work is among the first to explore TL in ERC.
2.3. Emotion Recognition in Conversations

ERC is an emerging sub-field of affective computing and is developing into an active area of research. Current works model contextual relationships amongst utterances in a supervised fashion to capture the implicit emotional dynamics. Strategies include modeling speaker-based dependencies using recurrent neural networks [28, 29], memory networks [3, 30], graph neural networks [31, 32], and quantum-inspired networks [33], amongst others. Some of these works also explore challenges such as multi-speaker modeling [34], multimodal processing [30], and knowledge infusion [35]. However, there is a dearth of work that considers the scarcity of annotated data and leverages TL to transfer affective knowledge from generative models. Our work strives to fill this gap by providing a systematic study of TL in ERC.
Our proposed framework is summarized in Fig. 3. First, we define the source generative model, trained as a dialogue generator. We then describe the target model, which performs hierarchical context encoding for the task of ERC using BERT-based sentence encoders and context weights learnt from the source model.
3.1. Source: Generative Conversation Modeling

To perform the generative task of conversation modeling, we use the Hierarchical Recurrent Encoder-Decoder (HRED) architecture [36]. HRED is a classic framework for seq2seq conversational response generation that models conversations in a hierarchical fashion using three sequential components: encoder recurrent neural networks (RNNs) for sentence encoding, context RNNs for modeling the conversational context across sentences, and decoder RNNs for generating the response sentence. For a given conversation context with sentences x_1, ..., x_t, HRED generates the response x_{t+1} as follows:
h_t^{enc} = f_{\theta_{enc}}(x_t, h_{t-1}^{enc})
h_t^{cxt} = f_{\theta_{cxt}}(h_t^{enc}, h_{t-1}^{cxt})
p_\theta(x_{t+1} \mid x_{\le t}) = f_{\theta_{dec}}(x_{t+1} \mid h_t^{cxt}) = \prod_i f_{\theta_{dec}}(x_{t+1,i} \mid h_t^{cxt}, x_{t+1,<i})
With the i-th conversation being a sequence of utterances C_i = [x_{i,1}, ..., x_{i,n_i}], HRED is trained on all the conversations in the dataset together using the maximum likelihood objective arg max_θ Σ_i log p_θ(C_i). The HRED model allows multiple complexities to be introduced, such as multi-layer RNNs and other novel encoding strategies. In this work, we experiment with the original version of the architecture with single-layer components, so that we can analyze our hypothesis without unwanted contributions from added complexities. In our source model, f_{θ_enc} can be any RNN function, which we model using the bi-directional Gated Recurrent Unit (GRU) variant [37] to encode each sentence. We denote the parameters associated with this GRU as θ_enc^source. For both the context RNN (f_{θ_cxt}) and the decoder RNN, we use uni-directional GRUs, with parameters θ_cxt^source and θ_dec^source, respectively, and complement the decoder with beam decoding for generation.^1
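To make the architecture concrete, the following is a minimal PyTorch sketch of HRED's three components and the MLE training signal. The dimensions, vocabulary size, and teacher-forced decoding are illustrative assumptions; this is not the exact implementation adapted from the repository cited in footnote 1.

```python
import torch
import torch.nn as nn

class HRED(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=300, hid_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # f_{theta_enc}: bi-directional GRU over the tokens of one utterance
        self.encoder = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)
        # f_{theta_cxt}: uni-directional GRU over utterance representations
        self.context = nn.GRU(2 * hid_dim, hid_dim, batch_first=True)
        # f_{theta_dec}: uni-directional GRU that generates the response
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, utterances, response):
        # utterances: (batch, n_turns, n_tokens); response: (batch, n_tokens)
        b, t, l = utterances.shape
        tokens = self.embedding(utterances.view(b * t, l))
        _, h_enc = self.encoder(tokens)                   # (2, b*t, hid)
        h_enc = h_enc.transpose(0, 1).reshape(b, t, -1)   # concat both directions
        h_cxt, _ = self.context(h_enc)                    # h_t^{cxt} per turn
        # condition the decoder on the last context state (teacher forcing)
        h0 = h_cxt[:, -1:].transpose(0, 1).contiguous()
        dec_out, _ = self.decoder(self.embedding(response[:, :-1]), h0)
        return self.out(dec_out)                          # logits over the vocabulary

# Maximizing sum_i log p_theta(C_i) amounts to cross-entropy on the shifted response:
# loss = nn.CrossEntropyLoss()(logits.flatten(0, 1), response[:, 1:].flatten())
```

At generation time, the greedy teacher-forced decoding above would be replaced by beam decoding, as noted in the text.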
3.2. Target: Emotion Recognition in Conversations

The input for this task is also a conversation C with constituent utterances [x_1, ..., x_n]. Each x_i is associated with an emotion label y_i ∈ Y. We adopt a setup similar to the three components described for the source task, as in Poria et al. [38]. However, the decoder in this setup is replaced by a discriminative mapping to the label space instead of a generative network. Below, we describe the different initialization parameters that we consider for the first two stages of the network:
3.2.1. Sentence Encoding

To encode each utterance in the conversation, we consider the state-of-the-art universal sentence encoder BERT [11], with its parameters represented as θ_BERT. We choose BERT over the HRED sentence encoder (θ_enc^source) as it provides better performance (see Table 7). Moreover, BERT includes next sentence prediction as one of its training objectives, which aligns with the inter-sentence level of abstraction that we consider in this work. We choose the BERT-base uncased pre-trained model as our sentence encoder.^2 Although this model contains 12 transformer layers, to limit the total number of parameters in our model, we restrict it to the first 4 transformer layers. To get a sentential representation, we take the hidden vectors of the first token [CLS] across the considered transformer layers (see Devlin et al. [11]) and mean-pool them to obtain the final sentence representation.
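As a rough illustration of this pooling scheme, the sketch below extracts the [CLS] hidden vector from each of the first 4 transformer layers and mean-pools them. It uses the current Hugging Face transformers API for readability (the paper cites the older pytorch-pretrained-BERT package), and it runs the full 12-layer model rather than truncating it to 4 layers as the paper does to limit parameters.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

def encode_sentence(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = bert(**inputs)
    # hidden_states: tuple of 13 tensors (embedding layer + 12 transformer
    # layers), each of shape (1, seq_len, 768); keep only layers 1-4
    layers = outputs.hidden_states[1:5]
    cls_vectors = torch.stack([h[:, 0, :] for h in layers])  # [CLS] per layer
    return cls_vectors.mean(dim=0).squeeze(0)                # (768,)

sent_repr = encode_sentence("Do you want something to drink?")
```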
3.2.2. Context Encoding

We use a context encoder RNN similar to the source HRED model, with the option to transfer the learnt parameters θ_cxt^source. For the input sentence representation h_t^{enc} provided by the encoder, the context RNN transforms it as follows:
z_t = \sigma(V^z h_t^{enc} + W^z h_{t-1}^{cxt} + b^z)
r_t = \sigma(V^r h_t^{enc} + W^r h_{t-1}^{cxt} + b^r)
\tilde{h}_t = \tanh(V h_t^{enc} + W (r_t \odot h_{t-1}^{cxt}) + b)
h_t^{cxt} = (1 - z_t) \odot h_{t-1}^{cxt} + z_t \odot \tilde{h}_t
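A hedged sketch of the resulting target network and the transfer step follows: the target context RNN shares the source context GRU's architecture, so θ_cxt^source can be copied in directly, and the generative decoder is replaced by a linear mapping to the label space. How the 768-dimensional BERT representations are reconciled with the source context RNN's input size is not spelled out here, so the projection layer, the dimensions, and the class count are our assumptions; `hred` refers to a trained source model as in the Section 3.1 sketch.

```python
import torch.nn as nn

class ERCModel(nn.Module):
    def __init__(self, bert_dim=768, cxt_in_dim=512, hid_dim=256, n_classes=6):
        super().__init__()
        # assumed linear projection from BERT space to the source
        # context RNN's expected input size (2 * encoder hidden size)
        self.project = nn.Linear(bert_dim, cxt_in_dim)
        # same uni-directional GRU form as the source context RNN
        self.context = nn.GRU(cxt_in_dim, hid_dim, batch_first=True)
        # the generative decoder is replaced by a mapping to the label space
        self.classifier = nn.Linear(hid_dim, n_classes)

    def forward(self, sent_reprs):            # (batch, n_turns, bert_dim)
        h_cxt, _ = self.context(self.project(sent_reprs))
        return self.classifier(h_cxt)         # one label distribution per turn

model = ERCModel()
# transfer theta_cxt^{source} from the trained source HRED (Section 3.1)
model.context.load_state_dict(hred.context.state_dict())
```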
^1 Model implementations are adapted from https://github.com/ctr4si/
^2 https://github.com/huggingface/pytorch-pretrained-BERT
        Dataset            train       validation   test
Source
        Cornell      #D    66,477      8,310        8,…
                     #U    244,030     30,436       30,…
        Ubuntu       #D    898,142     18,920       19,…
                     #U    6,893,060   135,747      139,…
Target
        IEMOCAP      #D    120         –            …
                     #U    5,810       –            1,…
        SEMAINE      #D    58          –            …
                     #U    4,386       –            1,…
        DailyDialog  #D    11,118      1,000        1,000
                     #U    87,170      7,740        8,…

Table 1: Sizes of the datasets used in this work. #D represents the number of dialogues, whereas #U represents the total number of constituent utterances.
Table 1 provides the sizes along with the split distributions for the above-mentioned datasets. For both IEMOCAP and SEMAINE, we generate the validation sets by randomly sampling 20% of the dialogue videos from the training sets.
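A minimal sketch of this split generation, assuming train_dialogues is a list of dialogue (video) objects and that a fixed seed is used for reproducibility (an assumption, as the paper does not specify one):

```python
import random

random.seed(0)  # assumed seed, for reproducibility
held_out = set(random.sample(range(len(train_dialogues)), k=len(train_dialogues) // 5))
val_set = [d for i, d in enumerate(train_dialogues) if i in held_out]
train_set = [d for i, d in enumerate(train_dialogues) if i not in held_out]
```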
4.1.3. Metrics

We choose the pre-training weights from the source task based on the best validation perplexity score [40]. For ERC, we use the weighted F-score for the classification tasks on IEMOCAP and DailyDialog. For DailyDialog, we remove the no emotion class from the F-score calculations due to its high majority (82.6%/81.3% occupancy in the training/testing set), which hinders the evaluation of the other classes.^3 For the regression task on SEMAINE, we take the Pearson correlation coefficient (r) as the metric. We also provide the average best epoch (BE) at which the lowest validation losses, across the multiple runs, are observed and the testing evaluations are performed. A lower BE indicates the model's ability to reach optimal performance in fewer training epochs.
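For DailyDialog, the exclusion of the majority class can be expressed with scikit-learn's labels argument, which restricts the classes entering the weighted average. The label indices below are illustrative assumptions, not the dataset's canonical encoding:

```python
from sklearn.metrics import f1_score

NO_EMOTION = 0                        # assumed index of the "no emotion" class
EMOTION_LABELS = [1, 2, 3, 4, 5, 6]   # assumed: anger, disgust, fear, joy, sadness, surprise

def daily_dialog_f1(y_true, y_pred):
    # `labels=` restricts which classes enter the weighted average, so
    # "no emotion" contributes neither a per-class F1 nor a class weight
    return f1_score(y_true, y_pred, labels=EMOTION_LABELS, average="weighted")
```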
4.2. Model Size

We consider two versions of the source generative model, HRED-small and HRED-large, with 256- and 1000-dimensional hidden states, respectively. Testing both on the IEMOCAP dataset, we find that the context weights from HRED-small (trained on the Cornell dataset) provide better performance on average (58.5% F-score) than those from HRED-large (55.3% F-score). Following this observation, and to avoid over-fitting on the small target datasets due to the increased number of parameters, we choose HRED-small as the source model for our TL procedure.
4.3. Training Criteria

We train our models on each target dataset for multiple runs (10 for IEMOCAP, 5 for DailyDialog, 5 for SEMAINE). In each run, we evaluate performance on the testing set using the parameters that provide the lowest validation loss, and we use early stopping (patience 10) as the stopping criterion, as sketched below. We perform a hyper-parameter search for the different datasets and models, where we keep the model architecture constant but vary the learning rate (1e-3, 1e-4, and 1e-5), the optimizer (Adam, RMSprop [44]), the batch size (2-… videos/batch), and the dropout ({0.0, 0.5}; BERT parameters use a dropout of 0.1, as in Devlin et al. [11]). The best combination is chosen based on performance on the respective validation sets. In the case of a negligible difference between combinations, we use the Adam optimizer [45] as the default, with β = [0.9, 0.999] and learning rate 1e-4.
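A minimal sketch of the stopping criterion: keep the checkpoint with the lowest validation loss and stop after 10 epochs without improvement; the epoch of the best checkpoint feeds the BE metric. train_epoch and eval_loss are assumed helper functions, not part of the paper:

```python
import copy

def train_with_early_stopping(model, patience=10, max_epochs=100):
    best_loss, best_state, best_epoch, wait = float("inf"), None, 0, 0
    for epoch in range(1, max_epochs + 1):
        train_epoch(model)                   # assumed: one pass over training data
        val_loss = eval_loss(model)          # assumed: loss on the validation set
        if val_loss < best_loss:
            best_loss, best_epoch, wait = val_loss, epoch, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            wait += 1
            if wait >= patience:             # 10 epochs with no improvement
                break
    model.load_state_dict(best_state)        # test with the best checkpoint
    return model, best_epoch                 # best_epoch feeds the BE metric
```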
     Initial Weights                         Model Description
     sentenc     cxtenc
1)   –           –                           Parameters of both the sentence and context encoders are randomly initialized.
2)   θ_BERT      –                           Sentence encoders are initialized with BERT parameters; context encoders are randomly initialized.
3)   θ_BERT      θ_cxt^{ubuntu/cornell}      Sentence encoders are initialized with BERT parameters; context encoders are initialized from generative models pre-trained on the Ubuntu/Cornell corpus.

Table 2: Variants of the model used in the experiments.
Dataset: IEMOCAP
Initial Weights                 10%              25%              50%              100%
sentenc   cxtenc                F-Score    BE    F-Score    BE    F-Score    BE    F-Score    BE
θ_BERT    θ_cxt^ubuntu          35.7±1.1   14.2  45.9±2.0   11.2  53.1±0.7†  7.8   58.8±0.5†  5.…
θ_BERT    θ_cxt^cornell         36.3±1.1†  17.0  46.0±0.5†  11.2  50.9±1.5   8.2   58.5±0.8   5.…

Table 3: IEMOCAP results. Metric: weighted F-score averaged over 10 random runs. BE = Best Epoch. Results span different amounts of available training data. Validation and testing splits are fixed across configurations. † represents a significant difference with p < 0.05 over the randomly initialized model as per a two-tailed Wilcoxon rank-sum hypothesis test [46].
4.4. Model Variants

We experiment with different variants based on their parameter initialization. A summary of these variants is provided in Table 2.
Tables 3 and 4 report the ERC results on the classification datasets IEMOCAP and DailyDialog, respectively. In both tables, we observe clear and statistically significant improvements for the models that use pre-trained weights over the randomly initialized variant. We see further improvements when the context-modeling parameters from the source task (θ_cxt^source) are transferred, indicating the benefit of applying TL at this context-level hierarchy. Similar trends are observed in the regression task on the SEMAINE corpus (see Table 5). For the valence, arousal, and power dimensions, the improvement is significant. For expectation, the performance is only marginally better, but at a much lower BE, indicating faster generalization.
In the following sections, we take a closer look at various aspects of our approach, including robustness in limited-data scenarios, generalization time, and design choices. We also provide additional analyses that probe the existence of data-split bias, domain influence, and the effect of fine-tuning strategies.
5.1. Target Data Size

Present approaches in ERC primarily adopt supervised learning strategies that demand a large amount of annotated data. However, the publicly available datasets in this field fall in the small-to-medium range of the spectrum of dataset sizes in NLP. This constraint inhibits the true potential of systems trained on these datasets.
^3 Evaluation strategy adapted from the SemEval-2019 ERC task: www.humanizing-ai.com/emocontext.html
[Figure 4: Validation loss across epochs for different weight-initialization settings ({θ_BERT} vs. {θ_BERT + θ_cxt^cornell}) on the IEMOCAP dataset. Part a) shows results when trained on 100% of the training data, part b) on a 10% training split. For fair comparison, the optimizer learning rates are fixed at 1e-4.]
[Table 6: Investigation of whether split randomness incurs bias in the results. Comparisons are held between two limited-training-data scenarios comprising 10% and 50% of the available training data; for both cases, 4 independently sampled splits (split*1 to split 4) are compared, with sentenc = θ_BERT. Metric: weighted F-score averaged over 10 random runs.]
demonstrate that BERT provides better representations, which lead to better performance. Moreover, the positive effects of the context parameters are observed when coupled with the BERT encoders. This behavior indicates that the performance boost provided by the context encoders is contingent on the quality of the sentence encoders. Based on this empirical evidence, we choose BERT-based sentence encoders in our final network.
5.4. Impact of Source Domain

We investigate whether the choice of source dataset incurs any significant change in the results. First, we define an emotional profile for the source datasets and check whether any correlation exists between their emotive content and the performance boost achieved by pre-training on them. To set up an emotional profile, we look at the respective vocabularies of both corpora. For each token, we check its association with any emotion using the emotion lexicon provided by Mohammad and Turney [47]. The NRC Emotion Lexicon contains 6,423 words belonging to the emotion categories fear, trust, anger, sadness, anticipation, joy, surprise, and disgust. It also assigns two broad categories, positive and negative, to describe the type of connotation evoked by the words. We count the frequency of each emotion category amongst the tokens of each source dataset's vocabulary. To compose the vocabularies of the two source datasets, we set a minimum frequency threshold of 5, which yields 13,518 and 18,473 unique tokens
[Table 7: HRED encoder vs. BERT. Metric: weighted F-score averaged over 10 random runs. BE = Best Epoch (average).]
[Figure 5: a) Frequency of emotive words (per NRC category: negative, positive, fear, trust, anger, sadness, anticipation, disgust, joy, surprise) in the source datasets Cornell and Ubuntu. b) Randomly sampled Cornell words associated with selected emotions, e.g., Negative: reckless, trap, plight, timid, profanity; Positive: sweetheart, jackpot, exciting, harmony; Anger: accusation, storm, brazen, depraved, revolt; Joy: praise, heavenly, satisfied, confidence, lucky.]
for Cornell and Ubuntu, respectively. Each unique token is then lemmatized^4 and cross-referenced with the lexicon, which yields 3,099 (Cornell) and 2,003 (Ubuntu) tokens with associated emotions.
Fig. 5 presents the emotional profiles, which indicate that the Cornell dataset has a higher number of emotive tokens in its vocabulary. However, the results illustrated in Tables 3, 4, and 5 do not show any significant difference between the two sources. A possible reason for this behavior is that such an emotional profile relies on surface emotions derived from the vocabularies, whereas, as per our hypothesis, response generation includes emotional understanding as a latent process. This leads us to believe that surface emotions need not correlate with performance increments; rather, the quality of generation would include such properties intrinsically.
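A sketch of this profiling procedure is given below. It assumes the standard tab-separated NRC lexicon file format and a vocabulary given as token-to-frequency counts; both are assumptions about details the paper does not specify.

```python
from collections import Counter
from nltk.stem import WordNetLemmatizer

def load_nrc(path):
    # assumed file format: "word<TAB>category<TAB>0/1" per line
    lexicon = {}
    with open(path) as f:
        for line in f:
            word, category, flag = line.strip().split("\t")
            if flag == "1":
                lexicon.setdefault(word, set()).add(category)
    return lexicon

def emotional_profile(vocab_counts, lexicon, min_freq=5):
    # count, per NRC category, how many vocabulary tokens
    # (with frequency >= min_freq) carry that association
    lemmatize = WordNetLemmatizer().lemmatize
    profile = Counter()
    for token, freq in vocab_counts.items():
        if freq >= min_freq:
            for category in lexicon.get(lemmatize(token), ()):
                profile[category] += 1
    return profile  # e.g. Counter({'negative': ..., 'joy': ...})
```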
In this section, we list the different challenges that we observed while experimenting with the proposed idea. These challenges provide roadmaps for further research on this topic towards building better and more robust systems.
6.1. Adaptation Strategies

We try the two primary adaptation techniques used in inductive TL: frozen and fine-tuned. In the former setting, the borrowed weights are used purely for feature extraction, while in the latter, we train the weights along with the rest of the network, as sketched below.
^4 https://www.nltk.org/_modules/nltk/stem/wordnet.html
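The two strategies differ only in whether gradients flow into the transferred weights. A minimal sketch, assuming model is the ERC network from Section 3.2 with the transferred context RNN:

```python
import torch

def set_adaptation(model, strategy="fine-tuned"):
    # "frozen": the borrowed context weights act as a fixed feature extractor;
    # "fine-tuned": they keep training along with the rest of the network
    for param in model.context.parameters():
        param.requires_grad = (strategy == "fine-tuned")

set_adaptation(model, "frozen")
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
```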
Initial Weights     IEMOCAP F-Score   DailyDialog F-Score
HRED                58.5              48.…
VHRED               58.6              48.…

Table 9: Average performance on ERC with pre-trained weights {θ_BERT + θ_cxt^cornell}; for VHRED, θ_cxt^cornell contains additional parameters modeling the latent prior state.
Generative Training   IEMOCAP F-Score   DailyDialog F-Score
Source                58.5              48.…
Source + Target       58.0              47.…

Table 10: Average performance on ERC with pre-trained weights {θ_BERT + θ_cxt^cornell}.
6.4. In-domain Generative Fine-tuning

We try in-domain tuning of the generative HRED model by performing conversation modeling on the ERC resources, and then transfer these re-tuned weights to the discriminative ERC task. However, we do not find this procedure to be helpful (Table 10). TL between generative tasks, especially with small-scale target resources, is challenging. As a result, we observe sub-optimal generation on the ERC datasets, and transferring these weights to the classification task does not provide any improvement.
In this paper, we presented a novel transfer learning framework for ERC that uses pre-trained affective information from dialogue generators. We presented experiments in different scenarios to investigate the effect of this procedure and found that using such pre-trained weights helps the overall task and also provides the added benefit of fewer training epochs for good generalization. We primarily experimented on dyadic conversations in both the source and the target tasks. In the future, we aim to investigate the more general setting of multi-party conversations. This setting will increase the complexity of the task, as pre-training will require multi-party data and special training schemes to capture complex influence dynamics.
Acknowledgement
This research is supported by the Singapore Ministry of Education Academic Research Fund Tier 1 under MOE's official grant number T1 251RES1820. We also gratefully acknowledge the support of NVIDIA Corporation through the donation of a Titan Xp GPU used for this research.
References
[1] S. Poria, N. Majumder, R. Mihalcea, E. H. Hovy, Emotion recognition in conversation: Research challenges, datasets, and recent advances, IEEE Access 7 (2019) 100943–100953. doi:10.1109/ACCESS.2019.2929050.
[2] H. Chen, X. Liu, D. Yin, J. Tang, A survey on dialogue systems: Recent advances and new frontiers, SIGKDD Explorations 19 (2017) 25–35. doi:10.1145/3166054.3166058.
[3] D. Hazarika, S. Poria, A. Zadeh, E. Cambria, L. Morency, R. Zimmermann, Conversational memory network for emotion recognition in dyadic dialogue videos, in: Proceedings of NAACL-HLT 2018, New Orleans, Louisiana, USA, Volume 1 (Long Papers), 2018, pp. 2122–2132. URL: https://www.aclweb.org/anthology/N18-1193/.
[4] C. Danescu-Niculescu-Mizil, L. Lee, Chameleons in imagined conversations: A new approach to understanding coordination of linguistic style in dialogs, in: Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics, Association for Computational Linguistics, 2011, pp. 76–87.
[5] S. J. Pan, Q. Yang, A survey on transfer learning, IEEE Trans. Knowl. Data Eng. 22 (2010) 1345–1359. doi:10.1109/TKDE.2009.191.
[6] E. Weigand, Emotions in dialogue, Dialoganalyse VI/1: Referate der 6. Arbeitstagung, Prag 1996 16 (2017) 35.
[7] J. Sidnell, T. Stivers, The Handbook of Conversation Analysis, volume 121, John Wiley & Sons, 2012.
[8] P. Koval, P. Kuppens, Changing emotion dynamics: individual differences in the effect of anticipatory social stress on emotional inertia, Emotion 12 (2012) 256.
[9] C. Navarretta, Mirroring facial expressions and emotions in dyadic conversations, in: Proceedings of the Tenth International Conference on Language Resources and Evaluation, LREC 2016, Portorož, Slovenia, 2016. URL: http://www.lrec-conf.org/proceedings/lrec2016/summaries/258.html.
[10] T. Shimizu, N. Shimizu, H. Kobayashi, Pretraining sentiment classifiers with unlabeled dialog data, in: Proceedings of ACL 2018, Melbourne, Australia, Volume 2 (Short Papers), 2018, pp. 764–770. doi:10.18653/v1/P18-2121.
[11] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: [51], 2019, pp. 4171–4186. URL: https://www.aclweb.org/anthology/N19-1423/.
[12] R. K. Ando, T. Zhang, A framework for learning predictive structures from multiple tasks and unlabeled data, J. Mach. Learn. Res. 6 (2005) 1817–1853. URL: http://jmlr.org/papers/v6/ando05a.html.
[13] S. Ruder, M. E. Peters, S. Swayamdipta, T. Wolf, Transfer learning in natural language processing, in: Proceedings of NAACL-HLT 2019: Tutorial Abstracts, Minneapolis, MN, USA, 2019, pp. 15–18. URL: https://www.aclweb.org/anthology/N19-5004/.
[14] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, in: Advances in Neural Information Processing Systems 26, Lake Tahoe, Nevada, USA, 2013, pp. 3111–3119.
[15] B. McCann, J. Bradbury, C. Xiong, R. Socher, Learned in translation: Contextualized word vectors, in: Advances in Neural Information Processing Systems 30, Long Beach, CA, USA, 2017, pp. 6294–6305. URL: http://papers.nips.cc/paper/7209-learned-in-translation-contextualized-word-vectors.
[16] M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, L. Zettlemoyer, Deep contextualized word representations, in: Proceedings of NAACL-HLT 2018, New Orleans, Louisiana, USA, Volume 1 (Long Papers), 2018, pp. 2227–2237. URL: https://www.aclweb.org/anthology/N18-1202/.
[17] A. M. Dai, Q. V. Le, Semi-supervised sequence learning, in: Advances in Neural Information Processing Systems 28, Montreal, Quebec, Canada, 2015, pp. 3079–3087. URL: http://papers.nips.cc/paper/5949-semi-supervised-sequence-learning.
[18] Z. Yang, Z. Dai, Y. Yang, J. G. Carbonell, R. Salakhutdinov, Q. V. Le, XLNet: Generalized autoregressive pretraining for language understanding, CoRR abs/1906.08237 (2019). URL: http://arxiv.org/abs/1906.08237.
[19] J. Howard, S. Ruder, Universal language model fine-tuning for text classification, in: Proceedings of ACL 2018, Melbourne, Australia, Volume 1 (Long Papers), 2018, pp. 328–339. doi:10.18653/v1/P18-1031.
[20] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, V. Stoyanov, RoBERTa: A robustly optimized BERT pretraining approach, CoRR abs/1907.11692 (2019). URL: http://arxiv.org/abs/1907.11692.
[21] L. Chen, A. Moschitti, Transfer learning for sequence labeling using source model and target data, arXiv preprint arXiv:1902.05309 (2019).
[22] M. Qiu, L. Yang, F. Ji, W. Zhou, J. Huang, H. Chen, B. Croft, W. Lin, Transfer learning for context-aware question matching in information-seeking conversations in e-commerce, in: Proceedings of ACL 2018, Volume 2 (Short Papers), 2018, pp. 208–213.
[23] J. Yu, L. Marujo, J. Jiang, P. Karuturi, W. Brendel, Improving multi-label emotion classification via sentiment classification with dual attention transfer network, in: [52], 2018, pp. 1097–1102. URL: https://www.aclweb.org/anthology/D18-1137/.
[24] G. Daval-Frerot, A. Bouchekif, A. Moreau, EPITA at SemEval-2018 Task 1: Sentiment analysis using transfer learning approach, in: Proceedings of the 12th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2018, New Orleans, Louisiana, USA, 2018, pp. 151–155. URL: https://www.aclweb.org/anthology/S18-1021/.
[25] A. Bouchekif, P. Joshi, L. Bouchekif, H. Afli, EPITA-ADAPT at SemEval-2019 Task 3: Detecting emotions in textual conversations using deep learning models combination, in: Proceedings of the 13th International Workshop on Semantic Evaluation, 2019, pp. 215–219.
[26] H. Ng, V. D. Nguyen, V. Vonikakis, S. Winkler, Deep learning for emotion recognition on small datasets using transfer learning, in: Proceedings of the 2015 ACM International Conference on Multimodal Interaction, Seattle, WA, USA, ACM, 2015, pp. 443–449. doi:10.1145/2818346.2830593.
[27] J. Deng, Z. Zhang, E. Marchi, B. W. Schuller, Sparse autoencoder-based feature transfer learning for speech emotion recognition, in: 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, ACII 2013, Geneva, Switzerland, IEEE Computer Society, 2013, pp. 511–516. doi:10.1109/ACII.2013.90.
[28] A. V. González-Garduño, V. P. B. Hansen, J. Bingel, I. Augenstein, A. Søgaard, CoAStaL at SemEval-2019 Task 3: Affect classification in dialogue using attentive BiLSTMs, in: Proceedings of the 13th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2019, Minneapolis, MN, USA, 2019, pp. 169–174. URL: https:
[51] J. Burstein, C. Doran, T. Solorio (Eds.), Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, Volume 1 (Long and Short Papers), Association for Computational Linguistics, 2019. URL: https://www.aclweb.org/anthology/volumes/N19-1/.
[52] E. Riloff, D. Chiang, J. Hockenmaier, J. Tsujii (Eds.), Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, Association for Computational Linguistics, 2018. URL: https://www.aclweb.org/anthology/volumes/D18-1/.
[53] S. Kraus (Ed.), Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, ijcai.org, 2019. doi:10.24963/ijcai.