Tan Lee



2025

PodAgent: A Comprehensive Framework for Podcast Generation
Yujia Xiao | Lei He | Haohan Guo | Feng-Long Xie | Tan Lee
Findings of the Association for Computational Linguistics: ACL 2025

Existing automatic audio generation methods struggle to generate podcast-like audio programs effectively. The key challenges lie in in-depth content generation and in appropriate, expressive voice production. This paper proposes PodAgent, a comprehensive framework for creating audio programs. PodAgent 1) generates informative topic-discussion content by designing a Host-Guest-Writer multi-agent collaboration system, 2) builds a voice pool for suitable voice-role matching, and 3) utilizes an LLM-enhanced speech synthesis method to generate expressive conversational speech. Given the absence of standardized evaluation criteria for podcast-like audio generation, we developed comprehensive assessment guidelines to effectively evaluate the model’s performance. Experimental results demonstrate PodAgent’s effectiveness, significantly surpassing direct GPT-4 generation in topic-discussion dialogue content, achieving an 87.4% voice-matching accuracy, and producing more expressive speech through LLM-guided synthesis. Demo page: https://podcast-agent.github.io/demo/. Source code: https://github.com/yujxx/PodAgent.

2009

Automatic Recognition of Cantonese-English Code-Mixing Speech
Joyce Y. C. Chan | Houwei Cao | P. C. Ching | Tan Lee
International Journal of Computational Linguistics & Chinese Language Processing, Volume 14, Number 3, September 2009

2007

Integrating Complementary Features from Vocal Source and Vocal Tract for Speaker Identification
Nengheng Zheng | Tan Lee | Ning Wang | P. C. Ching
International Journal of Computational Linguistics & Chinese Language Processing, Volume 12, Number 3, September 2007: Special Issue on Invited Papers from ISCSLP 2006

2006

Using Duration Information in Cantonese Connected-Digit Recognition
Yu Zhu | Tan Lee
International Journal of Computational Linguistics & Chinese Language Processing, Volume 11, Number 1, March 2006: Special Issue on Human Computer Speech Processing

Modeling Cantonese Pronunciation Variations for Large-Vocabulary Continuous Speech Recognition
Tan Lee | Patgi Kam | Frank K. Soong
International Journal of Computational Linguistics & Chinese Language Processing, Volume 11, Number 1, March 2006: Special Issue on Human Computer Speech Processing

2001

Design, Compilation and Processing of CUCall: A Set of Cantonese Spoken Language Corpora Collected Over Telephone Networks
W.K. Lo | P.C. Ching | Tan Lee | Helen Meng
Proceedings of Research on Computational Linguistics Conference XIV