

Professor Gunhee Kim's Research Team at SNU Develops AI Dialogue Model That Mimics Human Habits and Backchannels

  • Uploaded by

    Office of External Relations

  • Upload Date

    2025.05.30


- Wins Senior Area Chair Award in Speech Processing and Spoken Language Understanding at NAACL 2025, one of the top global NLP conferences
- Expected to be applied in podcast production, counseling AIs, voice assistants, and care services

Group photo of the research team
▲ (From left) Kang-wook Kim (undergraduate researcher), Professor Gunhee Kim, and Sehun Lee (PhD candidate), all from the Department of Computer Science, Seoul National University

Seoul National University College of Engineering announced that Professor Gunhee Kim's team from the Department of Computer Science has developed a speech dialogue generation technology in which artificial intelligence (AI) understands and reproduces human conversational behaviors such as verbal habits, backchannels, and interruptions.

In this study, the team built "Behavior-SD," the world’s largest spoken dialogue dataset based on conversational behavior, and proposed an AI model called "BeDLM" that enables more natural speech interaction.

The research team presented their paper at the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025), held from April 29 to May 4 in Albuquerque, New Mexico. The study received the Senior Area Chair Award, the top honor in the Speech Processing and Spoken Language Understanding category. NAACL is one of the most prestigious global academic conferences in natural language processing (NLP), a field of AI focused on enabling computers to understand and generate human language.

The research team focused on the fact that people exhibit distinctive conversational behaviors in spoken dialogue that are not typically present in text-based communication. For example, during conversations, people often use verbal fillers like "um..." or "you know...", insert short affirmations like "right" or "yeah" at appropriate moments, or sometimes interrupt the other speaker. Existing AI dialogue systems fail to reflect these subtle features and therefore often sound unnatural and robotic. Professor Kim's team concluded that integrating conversational behaviors is essential for achieving truly human-like speech-based AI interaction.

To address this challenge, the team introduced both a spoken dialogue dataset and a generation model that meticulously capture individual behaviors like verbal habits, backchannels, interruptions, and emotional expressions.

First, they constructed the large-scale "Behavior-SD" dataset, which comprises 100,000 dialogue patterns and over 2,000 hours of speech data and is designed to closely replicate real conversation settings. Beyond the basic exchange of utterances, the data is annotated in fine detail with a wide range of conversational behaviors.

▲ Illustration of BeDLM generating dialogue based on speaker behavior patterns (e.g., frequency of interruptions and backchannels)

Using this dataset, the team developed BeDLM (Behaviorally Aware Spoken Dialogue Generation with Large Language Models), a dialogue generation model that incorporates conversational behavior. Based on a large language model (LLM), BeDLM takes in a dialogue situation and behavioral patterns of two speakers and generates a spoken dialogue that closely resembles real human interaction. Because the model flexibly reproduces features such as backchannels, interruptions, and verbal habits, it overcomes the limitations of conventional AI dialogue systems and creates speech interactions that feel more human.
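As a rough illustration of the input described above, the following Python sketch shows one way a dialogue situation and two speakers' behavior patterns could be packed into a text prompt for a language model. Every name here (`SpeakerBehavior`, `build_prompt`, the `low`/`medium`/`high` levels, and the prompt layout) is a hypothetical choice made for this example; none of it is taken from the BeDLM code or the Behavior-SD data format.

```python
from dataclasses import dataclass

# Hypothetical frequency levels; the actual dataset may use different labels.
LEVELS = ("low", "medium", "high")


@dataclass(frozen=True)
class SpeakerBehavior:
    """Illustrative per-speaker behavior profile (not the Behavior-SD schema)."""
    name: str
    fillers: str        # how often the speaker uses fillers such as "um"
    backchannels: str   # how often the speaker says "right" or "yeah"
    interruptions: str  # how often the speaker cuts in

    def __post_init__(self) -> None:
        # Reject any level outside the assumed label set.
        for field_name in ("fillers", "backchannels", "interruptions"):
            value = getattr(self, field_name)
            if value not in LEVELS:
                raise ValueError(
                    f"{field_name} must be one of {LEVELS}, got {value!r}"
                )

    def describe(self) -> str:
        return (f"{self.name}: fillers={self.fillers}, "
                f"backchannels={self.backchannels}, "
                f"interruptions={self.interruptions}")


def build_prompt(situation: str, a: SpeakerBehavior, b: SpeakerBehavior) -> str:
    """Combine a situation and two behavior profiles into one text prompt."""
    return "\n".join([
        f"Situation: {situation}",
        a.describe(),
        b.describe(),
        "Dialogue:",
    ])


if __name__ == "__main__":
    speaker_a = SpeakerBehavior("Speaker A", "low", "high", "low")
    speaker_b = SpeakerBehavior("Speaker B", "high", "low", "medium")
    print(build_prompt("Two friends plan a weekend trip.", speaker_a, speaker_b))
```

Keeping the behavior profile separate from the situation text means the same scenario can be rendered with different speaker styles, which is the kind of control the article attributes to the model.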

BeDLM is expected to be widely applicable in areas requiring emotional and interactive AI, such as podcast content creation, counseling AIs, and personalized voice assistants. The technology is also expected to support smoother communication between humans and AI in fields like education, therapy, and elder care. Additionally, the Behavior-SD dataset and the code developed through this research have been released as open source, so researchers worldwide can use them freely. This openness is expected to accelerate the dissemination of related technologies and inspire follow-up research.

Professor Gunhee Kim stated, "When people converse, they listen and adapt to the other’s verbal and visual reactions even as they speak. Existing AI dialogue models failed to reflect this dynamic interaction. We aimed to go beyond that limitation, and I believe this study takes speech dialogue AI one step closer to human-level naturalness."

Sehun Lee, the first author of the paper and a PhD candidate in the Department of Computer Science at SNU, commented, "This study demonstrates that by reflecting diverse behavioral patterns unique to speech dialogue in data and models, we can achieve more human-like AI conversations." He added, "Building the dataset and finding the right modeling method for AI to understand conversational behavior were not easy. I hope BeDLM will be adopted into real-world voice services and become a natural and immersive AI communication technology."

Currently pursuing his PhD at SNU, Lee is researching the advancement of behavior-based spoken dialogue models so that AI can model and control more diverse human behaviors. He plans to continue studying advanced speech-based conversational AI technologies and gain industry experience through internships to contribute to real-world deployment and scalability.

NAACL 2025 top paper award in the speech recognition category
▲ Senior Area Chair Award in Speech Processing and Spoken Language Understanding at NAACL 2025

Profile photos of the research team
▲ (From left) Sehun Lee (PhD candidate), Kang-wook Kim (undergraduate researcher), and Professor Gunhee Kim, all from the Department of Computer Science, Seoul National University


[Reference Materials]
- Paper/Conference: "Behavior-SD: Behaviorally Aware Spoken Dialogue Generation with Large Language Models", 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL 2025)
- Paper Link: https://aclanthology.org/2025.naacl-long.484.pdf