Impact of Robot Behavior on Human Backchanneling in HRI, Lecture notes of Communication

This thesis explores the impact of a robot's behavior on human backchanneling feedback during conversation, the effect of the amount of backchanneling on conversation perception, and the differences in human backchanneling behavior when listening to a robot versus a human. The research is significant for creating fluent human-robot interactions and using human listener feedback to adjust conversations.

What you will learn

  • Does the amount of backchanneling feedback influence conversation perception?
  • How does human backchanneling behavior differ when listening to a robot compared to a human?
  • How does a robot's behavior affect human backchanneling feedback?
  • How can backchanneling behavior be generated in human-robot interaction?

Typology: Lecture notes

2021/2022

Uploaded on 09/12/2022

Backchanneling in Human-Robot
Interaction
Differences in Human Backchanneling Behavior when
Communicating with another Human versus a Robot
Adna Bliek
Internal Supervisor: Prof. Dr. N.A. Taatgen
(Artificial Intelligence, University of Groningen)
External Supervisor: Prof. Dr. T. Hellström
(Umeå University, Sweden)
Artificial Intelligence
University of Groningen, The Netherlands


Abstract

During human conversations, information is conveyed not only by the speaker but also by the listener, who uses backchannel feedback for this purpose. Backchanneling can be verbal or non-verbal and is a key aspect of human conversation: it shows listener attention and contributes to a more fluent interaction. In recent years, the effect of a robot’s backchanneling while listening to a human has been investigated, but the effect of backchanneling by a human listening to a robot has received little study. In this thesis, we investigate (1) how a robot’s behavior affects the human’s backchanneling feedback, (2) whether the amount of backchanneling feedback has an effect on how the conversation is perceived, and (3) how human backchanneling behavior differs when listening to a robot compared to a human. The investigation of human backchanneling is important in Human-Robot Interaction to create fluent conversations and to be able to use the feedback of the human listener to adjust the conversation. We conducted an experiment looking at the backchanneling cues gaze, gesture, and pauses that were exhibited by a semi-humanoid robot. We found that pauses have a significant influence on the backchannel behavior of the human. Furthermore, we found that backchanneling behavior differs significantly between listening to a robot and listening to a human.


Contents

  • 1 Introduction
    • 1.1 Human Communication
    • 1.2 Significance of the Study
    • 1.3 Research Question
    • 1.4 Overview of the Thesis
    • 1.5 Outline
  • 2 Background
    • 2.1 Backchanneling Definition
    • 2.2 Backchanneling in Human-Human Communication
      • 2.2.1 Types of Backchanneling
      • 2.2.2 Backchannel Timing
      • 2.2.3 Effect of Backchanneling Feedback
      • 2.2.4 Effect of Backchanneling Cues
      • 2.2.5 Cultural Differences
    • 2.3 Backchanneling in Human-Robot Interaction
      • 2.3.1 Simulated Agents
      • 2.3.2 Physical Robots
  • 3 Method
    • 3.1 Robot - Pepper
    • 3.2 Backchannel Cues
    • 3.3 Backchannel Behaviors
    • 3.4 First Pilot Study
      • 3.4.1 Findings and Conclusions
    • 3.5 Second Pilot Study
      • 3.5.1 Findings and Conclusion
    • 3.6 Final Experiment
      • 3.6.1 Experimental Setup
      • 3.6.2 Video Analysis
      • 3.6.3 Participants
  • 4 Results
    • 4.1 Backchanneling per Condition
    • 4.2 Feedback Questions
    • 4.3 Post-Questionnaire
      • 4.3.1 Technological Experience
      • 4.3.2 Attitude Towards Pepper
    • 4.4 Differences in Backchanneling to a Robot and a Human
      • 4.4.1 Backchanneling Behavior
      • 4.4.2 Feedback Questions
    • 4.5 Difference Between Dilemmas and Start-Stop Text
    • 4.6 Cultural Backchanneling Differences
      • 4.6.1 Differences Between Native Swedish and Non-Swedish
  • 5 Discussion
    • 5.1 Research Question
    • 5.2 Research Question
    • 5.3 Research Question
    • 5.4 Limitations
  • 6 Conclusion
    • 6.1 Future Research
  • A Dilemmas
  • B Feedback Questions
    • B.1 Map Task
    • B.2 Dilemmas
  • C Post-Questionnaire
    • C.1 Socio-Demographic Questions
    • C.2 Technological Experience
    • C.3 Attitude Towards Pepper
  • D Statistical Tests

Backchanneling in Human Robot Communication

4.5 Number of backchanneling responses for the eight observed human backchanneling behaviors, with separate bars for backchanneling while listening to a dilemma and the start or stop text respectively. The height of each bar is the average over all stories and participants.

4.6 Number of backchanneling responses for the eight observed human backchanneling behaviors, with separate bars for backchanneling by native Swedes and non-Swedes respectively. The height of each bar is the average over all stories and participants.

List of Tables

3.1 During the experiments eight different combinations of backchanneling cues were executed by the robot. For example, during condition 6 pauses and gestures are present but no gaze.

3.2 List of annotated human backchannel behaviors. The labels show the possible states for each behavior. The first label is the default value, i.e. the value that is expected to be seen when the participant is not backchanneling.

3.3 Parameters of the cue conditions exhibited by the robot per experiment. The parameters of the cue conditions were changed after each of the pilot experiments according to feedback given by the participants to create more natural-feeling conversations with the robot.

4.1 Normalized scores of the amount of backchanneling responses, for each one of the eight cue conditions.

4.2 Percentage of backchanneling behaviors used by the participants for each cue condition.

4.3 User ratings for the four feedback questions asked after each dilemma for each cue condition.

4.4 Scores on the post-questionnaire. The maximum possible score on the anthropomorphism, animacy and likability questions is 25 and the maximum possible score on the emotional state questions is 15.

4.5 Number of backchanneling responses observed while the participant is listening to the robot or test leader. The numbers are averaged over all stories and participants.

4.6 Percentage of backchanneling responses observed while the participant is listening to the robot or test leader.

4.7 Comparison of the user ratings for the four feedback questions asked after each dilemma.

4.8 Number of backchanneling responses observed while the participant is listening to the robot telling a dilemma or start or stop text. The numbers are averaged over all stories and participants.

4.9 Percentage of backchanneling responses observed while the participant is listening to the robot telling a dilemma or start or stop text.

D.1 Model = Amount of Backchanneling ∼ Pause + (1|Participant)


1.2 Significance of the Study

Backchanneling has been studied in human-human conversations in different cultures, situations, and modalities [12, 24, 8, 21]. In recent years there has also been research on backchanneling in Human-Robot Interaction (HRI). This research has mainly focused on a robot giving backchanneling feedback while listening to a human [13, 10, 16]. The focus was on how a robot could communicate that it was listening to the human and create a more fluent conversation. In this research, we look at how to trigger backchanneling behavior by a human when listening to a robot. This part of HRI is largely missing in the literature. With this research we hope to contribute preliminary results on how human backchanneling affects HRI and how it can be triggered by a robot, so that a robot can actively contribute to more fluent and effective conversations.

1.3 Research Question

The main research questions addressed in this thesis are:

  1. How should a robot behave in order to trigger a human’s natural backchanneling behavior?
  2. Does human backchanneling affect how the listener perceives the interaction?
  3. How does a human’s backchanneling behavior differ when listening to a robot compared to a human?

In order to answer the questions, an experiment with human participants listening to a story-telling robot was conducted. The first research question was investigated by implementing three backchanneling-inviting cues in a robot. The cues were pauses, gestures, and gaze. The effect on the human backchanneling behavior was measured by counting the amount of backchanneling feedback given by a human during each condition. The second research question was examined by looking at how the participants self-reported their perception of the conversation. The third research question was investigated by comparing the backchanneling behavior of the participants while listening to a story told by the robot versus a human.

1.4 Overview of the Thesis

Three experiments were conducted, all using the semi-humanoid robot Pepper. The first two experiments were pilot experiments. The first experiment was inspired by a map task [1]. During the second and third experiment, ethical dilemmas were told by the robot. To answer the first research question, three backchanneling cues were used by the robot: gestures, gaze, and pauses. We found that pauses have a significant influence on the amount of backchanneling performed by the human participants when listening to the ethical dilemmas. The other two backchanneling cues did not have a significant influence on the amount of backchanneling. To answer the second research question, we looked at ratings by the participants, and did not find a significant effect of backchanneling on the perception of the conversation. The third research question was answered by comparing the amount of backchanneling by the participants to the robot versus a human. We found that the participants backchanneled significantly more to the human and used different backchanneling behaviors.
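The model listed for Table D.1, Amount of Backchanneling ∼ Pause + (1|Participant), appears to be written in the formula notation used by mixed-model software, where (1|Participant) denotes a random intercept per participant. Read that way, it can be written out as below. The symbols y, β, u, and ε are introduced here for illustration only, and the normality assumptions are those of a standard linear mixed model rather than something stated in the text.

```latex
y_{ij} = \beta_0 + \beta_1\,\mathrm{Pause}_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here y_{ij} is the amount of backchanneling of participant j in observation i, Pause_{ij} indicates whether the pause cue was present, and u_j captures how much participant j backchannels in general, so that the pause effect β_1 is estimated relative to each participant's own baseline.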

1.5 Outline

The remainder of the thesis is structured as follows. First, Chapter 2 discusses the background, describing backchanneling research in human-human communication and the current state of the art in relevant parts of human-robot interaction. Then, Chapter 3 discusses the methodology, describing which robot and methods were used in the experiments. Furthermore, it describes the two pilot experiments and the design and implementation of the final experiment. Chapter 4 describes the results obtained in the final experiment. Chapter 5 discusses the results and puts them into the context of the background. Chapter 6 concludes the thesis by summarizing the research and describing possible future research.


has defined six categories of reasons why listeners use backchannel feedback. The defined backchannel categories are continuer, display of understanding, support towards the speaker’s judgment, agreement, strong emotional response, and minor addition, correction, or request for information. The most frequent function found in English conversations is as continuer [25]. Continuers are used by the listener to communicate that they do not intend to take over the speaking turn but would like the speaker to continue.

The categorization of backchannel behaviors can be challenging, as words can be ambiguous in meaning, especially when only a single word is considered. For example, the often-used backchannel ‘yeah’ can be categorized as a continuer, display of understanding, or support towards the speaker’s judgment in a backchannel situation, or even as a response to a question in a non-backchannel situation. To distinguish between the different meanings, lexical and semantic knowledge has to be used [18].

2.2.1 Types of Backchanneling

What kind of backchanneling response is used varies between languages, cultures, and individuals. In addition, the backchannel responses used also depend on the context: whereas nodding is often used in face-to-face conversations, it does not convey any feedback in phone conversations.

Backchanneling feedback can be divided into two categories: generic and specific feedback. Generic backchannel feedback includes all feedback that is not specific to the context and is often seen as the standard backchannel response. Generic responses include nodding or verbal phrases like ‘yeah’; these responses can most often be characterized as continuers. The second category is specific backchannel responses. These responses are related to the content, for example, looking sad at appropriate moments, or mirroring the speaker’s gestures and movements. These responses permit the listener to become a co-narrator, illustrating or adding to the story [3].

2.2.2 Backchannel Timing

Backchannel feedback can be given during pauses or overlapping the speech of the speaker. Feedback during speech pauses is a universal phenomenon, whereas backchannel feedback overlapping speech is not present in all cultures and languages. Native German speakers produce fewer overlapping backchannel responses than American speakers in their native languages, and their conversation can be disturbed by overlapping feedback [12].

In American English, it was found that backchannel responses seem to follow intonational phrases with rising pitch [5]. In addition, it has been found that in American English backchanneling occurs at grammatically significant breaks, such as at the ends of clauses or in sentence-final positions [25].

The amount of backchanneling responses of distracted listeners was found to be significantly lower than that of listeners attending to the content of the story. This effect was found for both generic and specific responses but was larger for specific responses. An increase in the cognitive demand of a task while still being able to attend to the story did not change the amount and kind of backchanneling feedback of the listener [3].

2.2.3 Effect of Backchanneling Feedback

How well a speaker can tell a story can be influenced by the amount of appropriate backchanneling. When a listener is distracted and uses fewer backchannel responses and fewer specific responses, the quality of the storytelling was found to be lower than when appropriate backchanneling was present [3]. In addition, the number of backchanneling responses was found to have a weak positive correlation with task success [5]. But more backchanneling is not always better: too much backchanneling has, for example, been found to have a negative effect on the enjoyment of the conversation [24]. In addition to improving the storytelling, backchanneling can also express rapport and grounding in a conversation [27]. Rapport is the perceived connection between the listener and speaker, and grounding can show acknowledgment by the listener.

2.2.4 Effect of Backchanneling Cues

As described in subsection 2.2.2, the amount of backchanneling can be influenced by different factors, for example backchannel cues. Backchannel cues are cues given by the speaker while speaking that provoke a backchannel response by the listener. Such a cue can, for example, be a rising pitch, a pause, a gaze shift, or a gesture. These cues can have an individual influence on the backchannel behavior, but they can also have a joint influence. A joint influence means that the influence of multiple cues used at the same time is larger than the individual influences of the cues used [27]. The cues also do not always have to be synchronized with the speech; a gesture can, for example, be performed a bit after the corresponding speech [26]. In this thesis, the effect of the backchanneling cues pause, gesture, and gaze will be investigated.

Pause

In a conversation different kinds of pauses can be present. Pauses can be categorized by function and by whether they are filled or silent. Filled pauses are, in contrast to silent pauses, pauses that are filled with filler words like ‘eehm’. The specific filler words can vary per culture and language. Three functions of pauses can be identified: a psycholinguistic function to allow the speaker to breathe, a cognitive function to allow the speaker to plan the next part of their speech, and a communicative function to help the listener identify significant syntactic places in the speech stream [6]. The last two functions can indicate to the listener whether the speaker wants to continue their turn or give the turn to the listener.

Gesture

Gestures are a universal phenomenon: there has not been any report of a culture that does not use gestures accompanying speech [20]. They are already present and synchronized in children in the one-word stage. Those children will, for example, gesture to an object while uttering the word drink to show what they would like to drink [9]. Even though gestures are present in all cultures, they vary between them in terms of position, size, and plane (lateral, sagittal, or vertical) [20]. A gesture that is appropriate in one culture can be misunderstood in another or even be perceived as rude.

2.3 Backchanneling in Human-Robot Interaction

The effects of human backchanneling behavior in Human-Robot Interaction (HRI) have not been extensively studied. HRI research has mainly studied how the backchanneling behavior of a robot affects the human, but not the effect of backchanneling-inviting cues provided by the robot. HRI experiments can be done with physical robots or virtual agents, but the results found with virtual agents cannot always be translated to interactions with physical robots.

2.3.1 Simulated Agents

Agent Giving Backchannel Cues

In an experiment by Hjalmarsson and Oertel [13], a virtual agent used gaze as a backchannel cue. The participants were divided into two groups: the agent looked at the participants of the first group at backchannel-inviting moments, and at random moments for the second group. The researchers found that the participants backchanneled more when the agent used the backchannel-inviting gaze condition. A limitation of the study is that it did not examine natural backchannel behavior; instead, the participants were asked to press a button when they thought that a backchannel would be appropriate. This experiment shows that gaze used as a backchannel-inviting cue by a simulated agent could influence the behavior of the human listener, but more investigation is needed into whether this translates to natural backchannel behavior and to physical robots.

Agent Giving Backchanneling Feedback

Gratch et al. [10] created a virtual agent designed to build rapport in a conversation between a human and the agent. To create good rapport, the agent used backchanneling behavior when the participant was talking. The feedback was generated using real-time analysis of the acoustic properties of the speech and the speaker’s gestures. To test the agent, four conditions were compared: human-human face to face, mediated (the virtual agent’s movements were copied from a human), responsive (the agent reacts to the participant using the automatically generated responses), and non-contingent (responsive feedback from an earlier session was used, which was not synchronized to the speech of the current participant). They found that the responsive agent was as effective as a human listener in creating rapport and more effective than the mediated or non-contingent agent.

2.3.2 Physical Robots

Robot Giving Backchannel Feedback

Inden et al. [16] devised five strategies for modeling the best timing of backchannel behavior when listening to a human speaker. The strategies tested were: (1) copying the timing of the original human listener, (2) producing backchannels at randomly selected times, (3) producing backchannels according to high-level timing distributions relative to the interlocutor’s utterances and pauses, (4) producing them according to local entrainment to the interlocutor’s vowels, or (5) according to both. They concluded that the strategies that generate backchanneling behavior using empirically derived global timing distributions were perceived as missing fewer opportunities for backchannel feedback than the random strategy.

Hussain et al. [14] used a Markov decision process to train a social robot to backchannel at appropriate times to maximize the engagement of the user. They found that reinforcement learning could be a useful way of learning backchannel behavior, as the robot is able to learn immediately from the earlier time-steps and adjust its feedback. It should be noted, however, that their results have not been tested in an HRI experiment.

Robot Detecting Backchanneling Behavior

Backchannel feedback can be a predictor of how engaged the listener is during a conversation. Lala et al. [22] used this property of backchannel feedback to detect how engaged the human was during a conversation and used the feedback to keep the user engaged during the conversation with the robot.

Backchanneling Condition   Pause   Gaze   Gesture
C1                         0       0      0
C2                         0       0      1
C3                         0       1      0
C4                         0       1      1
C5                         1       0      0
C6                         1       0      1
C7                         1       1      0
C8                         1       1      1

Table 3.1: During the experiments eight different combinations of backchanneling cues were executed by the robot. For example, during condition 6 pauses and gestures are present but no gaze.
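The eight conditions in Table 3.1 are the full factorial combination of three binary cues. As a minimal sketch of that design (the names `CUES` and `cue_conditions` are hypothetical and not taken from the thesis code), the table can be generated as follows:

```python
from itertools import product

# Column order of Table 3.1. The constant name is an assumption for this sketch.
CUES = ("pause", "gaze", "gesture")


def cue_conditions():
    """Return a dict mapping "C1".."C8" to the on/off flag of each cue.

    itertools.product enumerates (0, 0, 0), (0, 0, 1), ..., (1, 1, 1),
    which is the same row order as Table 3.1.
    """
    conditions = {}
    for index, flags in enumerate(product((0, 1), repeat=len(CUES)), start=1):
        conditions[f"C{index}"] = dict(zip(CUES, flags))
    return conditions


if __name__ == "__main__":
    for name, flags in cue_conditions().items():
        print(name, flags)
```

Looking up `cue_conditions()["C6"]` gives pause and gesture switched on and gaze switched off, which matches the example in the table caption.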

Category        Labels
Lean            neutral, towards, away
Brow            neutral, raise
Smile           none, smile
Frown           none, frown
Nod             none, nod
Shake Head      none, shake head
Head Movement   neutral, forward, up, tilt
Utter           none, utterance
Hand in Face    not in face, in face

Table 3.2: List of annotated human backchannel behaviors. The labels show the possible states for each behavior. The first label is the default value, i.e. the value that is expected to be seen when the participant is not backchanneling.

programming languages, Pepper can also be programmed and controlled using the Choregraphe suite, which is a block-based programming environment. The tablet located on Pepper’s chest is an Android tablet; Android applications can be installed on it, or it can be controlled using the NAOqi framework. In the latter case an HTML website is displayed, and the feedback from this website can be used by the robot. In this project, Pepper was programmed using the NAOqi framework for Python. In addition, HTML, JavaScript, and CSS were used to program the website shown on the tablet. In the JavaScript code the NAOqi SDK was used to establish communication between the Python and JavaScript code.

3.2 Backchannel Cues

During all experiments three backchannel cues that can be created by Pepper were manipulated: pauses, gestures, and gaze. The cues were tested in all possible combinations (see Table 3.1). The backchannel cues were placed at points where a topic in the text was finished (see Fig. 3.4c and Fig. 3.5).

Speech pauses were created by adding additional pauses to the text spoken by Pepper. During all three experiments pauses were presented at all possible cue places. After the pilot experiments, the length of the pauses was reduced from 2 seconds to 1.2 seconds, as the participants perceived the pauses as too long.

The gestures used were predefined gestures in the NAOqi framework. They were categorized under the topic "explain" and were chosen randomly from that topic. Gestures were performed while the robot was talking. During the execution of a gesture, Pepper moved its body, arms, and hands. During the pilot experiments, gestures were presented randomly 50% of the time to ensure that their use felt natural; using gestures the whole time could have felt unnatural. The share of gestures was increased to 60% after the pilot experiments.

When the gaze cue was used, Pepper did not look at the human participant all the time but looked away and back at the participant at the moments where pauses could also be placed. During the pilot experiments, the robot looked away or back at the participant at 50% of the possible cue places. After the pilot experiments this number was increased to 60%.

The parameters of the cue conditions were changed after each of the pilot experiments according to feedback given by the participants and observations made during the experiments. The parameters for each cue condition during the three experiments can be seen in Table 3.3.

Experiment             Pause Length   Pause Probability   Gesture Probability   Gaze Angle           Gaze Probability   Voice Speed
Map Task Pilot Study   2 sec          100%                50%                   yaw: 0. pitch: 0.1   50%                slow: 70%, normal: 90%
Dilemmas Pilot Study   2 sec          100%                50%                   yaw: 0. pitch: 0.    50%                90%
Dilemmas Experiment    1.2 sec        100%                60%                   yaw: 0. pitch: 0.1   60%                80%

Table 3.3: Parameters of the cue conditions exhibited by the robot per experiment. The parameters of the cue conditions were changed after each of the pilot experiments according to feedback given by the participants to create more natural-feeling conversations with the robot.

Figure 3.2: Example images of backchanneling behavior: (a) Lean Towards, (b) Lean Back, (c) Smile, (d) Brow Raise, (e) Head Neutral, (f) Head Up, (g) Head Down, (h) Frown.
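To make the cue parameters concrete, the following is a minimal sketch of how cues could be scheduled for one story under a given condition. It is not the thesis implementation: the names `CueSettings`, `CueEvent`, and `plan_cues` are hypothetical, the defaults are the values reported for the final dilemmas experiment (1.2 s pauses at every cue place, gestures and gaze shifts with 60% probability), and drawing the gesture decision once per text segment is a simplifying assumption. The sketch deliberately contains no NAOqi calls.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CueSettings:
    """Cue parameters; defaults follow the final dilemmas experiment."""
    pause_length_s: float = 1.2
    pause_probability: float = 1.0
    gesture_probability: float = 0.6
    gaze_probability: float = 0.6


@dataclass(frozen=True)
class CueEvent:
    """Cues planned for the cue place that follows one text segment."""
    segment: str
    pause_s: float
    gesture: bool
    gaze_shift: bool


def plan_cues(segments, condition, settings=CueSettings(), rng=None):
    """Decide, per segment, which cues the robot shows.

    `condition` maps "pause", "gaze" and "gesture" to 0 or 1, as in one
    row of Table 3.1. A cue that is switched off never occurs; a cue that
    is switched on occurs with the probability given in `settings`.
    """
    rng = rng or random.Random()
    events = []
    for segment in segments:
        pause = bool(condition["pause"]) and rng.random() < settings.pause_probability
        gesture = bool(condition["gesture"]) and rng.random() < settings.gesture_probability
        gaze = bool(condition["gaze"]) and rng.random() < settings.gaze_probability
        events.append(
            CueEvent(segment, settings.pause_length_s if pause else 0.0, gesture, gaze)
        )
    return events


if __name__ == "__main__":
    story = ["First topic.", "Second topic.", "Third topic."]
    condition_6 = {"pause": 1, "gaze": 0, "gesture": 1}
    for event in plan_cues(story, condition_6, rng=random.Random(0)):
        print(event)
```

Passing a seeded `random.Random` makes a plan reproducible, which would be convenient when the same story has to be replayed for video annotation.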

3.3 Backchannel Behaviors

The participants’ backchannel behaviors that were annotated can be found in Table 3.2. The choice of behaviors was based on the earlier work in [23], and was extended with commonly observed backchannel behaviors in the videos of the pilot studies. The added behaviors were Shake head and Head move. Examples of real backchannel behaviors can be found in Figure 3.2.
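Since the first label of each behavior in Table 3.2 is the default state, counting backchannel responses amounts to counting annotations that deviate from that default. The sketch below illustrates this under stated assumptions: the annotation format (tuples of condition, behavior, label) and the names `BEHAVIOR_LABELS` and `count_backchannels` are hypothetical and do not describe the annotation tool used in the thesis.

```python
from collections import Counter

# Behaviors and labels from Table 3.2. The first label of each behavior is the
# default, i.e. the state expected when the participant is not backchanneling.
BEHAVIOR_LABELS = {
    "Lean": ("neutral", "towards", "away"),
    "Brow": ("neutral", "raise"),
    "Smile": ("none", "smile"),
    "Frown": ("none", "frown"),
    "Nod": ("none", "nod"),
    "Shake Head": ("none", "shake head"),
    "Head Movement": ("neutral", "forward", "up", "tilt"),
    "Utter": ("none", "utterance"),
    "Hand in Face": ("not in face", "in face"),
}


def count_backchannels(annotations):
    """Count non-default annotations per (condition, behavior).

    `annotations` is an iterable of (condition, behavior, label) tuples;
    this input format is an assumption made for the sketch.
    """
    counts = Counter()
    for condition, behavior, label in annotations:
        labels = BEHAVIOR_LABELS[behavior]
        if label not in labels:
            raise ValueError(f"unknown label {label!r} for behavior {behavior!r}")
        if label != labels[0]:  # skip the default, non-backchanneling state
            counts[(condition, behavior)] += 1
    return counts


if __name__ == "__main__":
    example = [
        ("C5", "Nod", "nod"),
        ("C5", "Nod", "none"),
        ("C5", "Smile", "smile"),
        ("C1", "Lean", "towards"),
    ]
    print(count_backchannels(example))
```

Summing the resulting counts over behaviors gives one number per condition, which is the kind of per-condition amount of backchanneling the first research question relies on.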
