Hsin-Min Wang - ACL Anthology

This is an internal, incomplete preview of a proposed change to the ACL Anthology. For efficiency reasons, we don't generate MODS or Endnote formats, and the preview may be incomplete in other ways, or contain mistakes. Do not treat this content as an official publication.

Hsin-Min Wang

Also published as: Hsin-min Wang

2025

The AS-SLAM system for Formosa Speech Recognition Challenge 2025
Chih-Hsi Chen | Pei-Jun Liao | Chia-Hua Wu | Pang-Cheng Wu | Hsin-Min Wang
Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)

In recent years, large-scale pre-trained speech models such as Whisper have been widely applied to speech recognition. While they achieve strong performance on high-resource languages such as English and Mandarin, dialects and other low-resource languages remain challenging due to limited data availability. The government-led “Formosa Speech in the Wild (FSW) project” is an important cultural preservation initiative for Hakka, a regional dialect, where the development of Hakka ASR systems represents a key technological milestone. Beyond model architecture, data processing and training strategies are also critical. In this paper, we explore data augmentation techniques for Hakka speech, including TTS and MUSAN-based approaches, and analyze different data combinations by fine-tuning the pre-trained Whisper model. We participated in the 2025 Hakka FSR ASR competition (student track) for the Dapu and Zhaoan varieties. In the pilot test, our system achieved 7th place in Hanzi recognition (CER: 15.92) and 3rd place in Pinyin recognition (SER: 20.49). In the official finals, our system ranked 6 in Hanzi recognition (CER: 15.73) and 4 in Pinyin recognition (SER: 20.68). We believe that such data augmentation strategies can advance research on Hakka ASR and support the long-term preservation of Hakka culture.

2023

Enhancing Automatic Speech Recognition Performance Through Multi-Speaker Text-to-Speech
Po-Kai Chen | Bing-Jhih Huang | Chi-Tao Chen | Hsin-Min Wang | Jia-Ching Wang
Proceedings of the 35th Conference on Computational Linguistics and Speech Processing (ROCLING 2023)

2022

Aligning Sentences in a Paragraph-Paraphrased Corpus with New Embedding-based Similarity Measures
Aleksandra Smolka | Hsin-Min Wang | Jason S. Chang | Keh-Yih Su
International Journal of Computational Linguistics & Chinese Language Processing, Volume 27, Number 2, December 2022

Chinese Movie Dialogue Question Answering Dataset
Shang-Bao Luo | Cheng-Chung Fan | Kuan-Yu Chen | Yu Tsao | Hsin-Min Wang | Keh-Yih Su
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

This paper constructs a Chinese dialogue-based information-seeking question answering dataset CMDQA, which is mainly applied to the scenario of getting Chinese movie related information. It contains 10K QA dialogs (40K turns in total). All questions and background documents are compiled from the Wikipedia via an Internet crawler. The answers to the questions are obtained via extracting the corresponding answer spans within the related text passage. In CMDQA, in addition to searching related documents, pronouns are also added to the question to better mimic the real dialog scenario. This dataset can test the individual performance of the information retrieval, the question answering and the question re-writing modules. This paper also provides a baseline system and shows its performance on this dataset. The experiments elucidate that it still has a big gap to catch the human performance. This dataset thus provides enough challenge for the researcher to conduct related research.

Is Character Trigram Overlapping Ratio Still the Best Similarity Measure for Aligning Sentences in a Paraphrased Corpus?
Aleksandra Smolka | Hsin-Min Wang | Jason S. Chang | Keh-Yih Su
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

Sentence alignment is an essential step in studying the mapping among different language expressions, and the character trigram overlapping ratio was reported to be the most effective similarity measure in aligning sentences in the text simplification dataset. However, the appropriateness of each similarity measure depends on the characteristics of the corpus to be aligned. This paper studies if the character trigram is still a suitable similarity measure for the task of aligning sentences in a paragraph paraphrasing corpus. We compare several embedding-based and non-embeddings model-agnostic similarity measures, including those that have not been studied previously. The evaluation is conducted on parallel paragraphs sampled from the Webis-CPC-11 corpus, which is a paragraph paraphrasing dataset. Our results show that modern BERT-based measures such as Sentence-BERT or BERTScore can lead to significant improvement in this task.

2021

Sequence to General Tree: Knowledge-Guided Geometry Word Problem Solving
Shih-hung Tsai | Chao-Chun Liang | Hsin-Min Wang | Keh-Yih Su
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

With the recent advancements in deep learning, neural solvers have gained promising results in solving math word problems. However, these SOTA solvers only generate binary expression trees that contain basic arithmetic operators and do not explicitly use the math formulas. As a result, the expression trees they produce are lengthy and uninterpretable because they need to use multiple operators and constants to represent one single formula. In this paper, we propose sequence-to-general tree (S2G) that learns to generate interpretable and executable operation trees where the nodes can be formulas with an arbitrary number of arguments. With nodes now allowed to be formulas, S2G can learn to incorporate mathematical domain knowledge into problem-solving, making the results more interpretable. Experiments show that S2G can achieve a better performance against strong baselines on problems that require domain knowledge.

Answering Chinese Elementary School Social Studies Multiple Choice Questions
Chao-Chun Liang | Daniel Lee | Meng-Tse Wu | Hsin-Min Wang | Keh-Yih Su
International Journal of Computational Linguistics & Chinese Language Processing, Volume 26, Number 2, December 2021

This paper presents a framework to answer the questions that require various kinds of inference mechanisms (such as Extraction, Entailment-Judgement, and Summarization). Most of the previous approaches adopt a rigid framework which handles only one inference mechanism. Only a few of them adopt several answer generation modules for providing different mechanisms; however, they either lack an aggregation mechanism to merge the answers from various modules, or are too complicated to be implemented with neural networks. To alleviate the problems mentioned above, we propose a divide-and-conquer framework, which consists of a set of various answer generation modules, a dispatch module, and an aggregation module. The answer generation modules are designed to provide different inference mechanisms, the dispatch module is used to select a few appropriate answer generation modules to generate answer candidates, and the aggregation module is employed to select the final answer. We test our framework on the 2020 Formosa Grand Challenge Contest dataset. Experiments show that the proposed framework outperforms the state-of-the-art Roberta-large model by about 11.4%.

Mining Commonsense and Domain Knowledge from Math Word Problems
Shih-Hung Tsai | Chao-Chun Liang | Hsin-Min Wang | Keh-Yih Su
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)

Current neural math solvers learn to incorporate commonsense or domain knowledge by utilizing pre-specified constants or formulas. However, as these constants and formulas are mainly human-specified, the generalizability of the solvers is limited. In this paper, we propose to explicitly retrieve the required knowledge from math problemdatasets. In this way, we can determinedly characterize the required knowledge andimprove the explainability of solvers. Our two algorithms take the problem text andthe solution equations as input. Then, they try to deduce the required commonsense and domain knowledge by integrating information from both parts. We construct two math datasets and show the effectiveness of our algorithms that they can retrieve the required knowledge for problem-solving.

2019

Influences of Prosodic Feature Replacement on the Perceived Singing Voice Identity
Kuan-Yi Kang | Yi-Wen Liu | Hsin-Min Wang
Proceedings of the 31st Conference on Computational Linguistics and Speech Processing (ROCLING 2019)

2018

WaveNet 聲碼器及其於語音轉換之應用 (WaveNet Vocoder and its Applications in Voice Conversion) [In Chinese]
Wen-Chin Huang | Chen-Chou Lo | Hsin-Te Hwang | Yu Tsao | Hsin-Min Wang
Proceedings of the 30th Conference on Computational Linguistics and Speech Processing (ROCLING 2018)

2017

基於鑑別式自編碼解碼器之錄音回放攻擊偵測系統 (A Replay Spoofing Detection System Based on Discriminative Autoencoders) [In Chinese]
Yu-Ding Lu | Hung-Shin Lee | Yu Tsao | Hsin-Min Wang
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing (ROCLING 2017)

基於i-vector與PLDA並使用GMM-HMM強制對位之自動語者分段標記系統 (Speaker Diarization based on I-vector PLDA Scoring and using GMM-HMM Forced Alignment) [In Chinese]
Cheng-Jo Ray Chang | Hung-Shin Lee | Hsin-Min Wang | Jyh-Shing Roger Jang
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing (ROCLING 2017)

使用查詢意向探索與類神經網路於語音文件檢索之研究 (Exploring Query Intent and Neural Network modeling Techniques for Spoken Document Retrieval) [In Chinese]
Tien-Hong Lo | Ying-Wen Chen | Berlin Chen | Kuan-Yu Chen | Hsin-Min Wang
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing (ROCLING 2017)

當代非監督式方法之比較於節錄式語音摘要 (An Empirical Comparison of Contemporary Unsupervised Approaches for Extractive Speech Summarization) [In Chinese]
Shih-Hung Liu | Kuan-Yu Chen | Kai-Wun Shih | Berlin Chen | Hsin-Min Wang | Wen-Lian Hsu
International Journal of Computational Linguistics & Chinese Language Processing, Volume 22, Number 1, June 2017

語音文件檢索使用類神經網路技術 (On the Use of Neural Network Modeling Techniques for Spoken Document Retrieval) [In Chinese]
Tien-Hong Lo | Ying-Wen Chen | Kuan-Yu Chen | Hsin-Min Wang | Berlin Chen
International Journal of Computational Linguistics & Chinese Language Processing, Volume 22, Number 2, December 2017-Special Issue on Selected Papers from ROCLING XXIX

基於鑑別式自編碼解碼器之錄音回放攻擊偵測系統 (A Replay Spoofing Detection System Based on Discriminative Autoencoders) [In Chinese]
Chia-Lung Wu | Hsiang-Ping Hsu | Yu-Ding Lu | Yu Tsao | Hung-Shin Lee | Hsin-Min Wang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 22, Number 2, December 2017-Special Issue on Selected Papers from ROCLING XXIX

2016

Learning to Distill: The Essence Vector Modeling Framework
Kuan-Yu Chen | Shih-Hung Liu | Berlin Chen | Hsin-Min Wang
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

In the context of natural language processing, representation learning has emerged as a newly active research subject because of its excellent performance in many applications. Learning representations of words is a pioneering study in this school of research. However, paragraph (or sentence and document) embedding learning is more suitable/reasonable for some tasks, such as sentiment classification and document summarization. Nevertheless, as far as we are aware, there is only a dearth of research focusing on launching unsupervised paragraph embedding methods. Classic paragraph embedding methods infer the representation of a given paragraph by considering all of the words occurring in the paragraph. Consequently, those stop or function words that occur frequently may mislead the embedding learning process to produce a misty paragraph representation. Motivated by these observations, our major contributions are twofold. First, we propose a novel unsupervised paragraph embedding method, named the essence vector (EV) model, which aims at not only distilling the most representative information from a paragraph but also excluding the general background information to produce a more informative low-dimensional vector representation for the paragraph. We evaluate the proposed EV model on benchmark sentiment classification and multi-document summarization tasks. The experimental results demonstrate the effectiveness and applicability of the proposed embedding method. Second, in view of the increasing importance of spoken content processing, an extension of the EV model, named the denoising essence vector (D-EV) model, is proposed. The D-EV model not only inherits the advantages of the EV model but also can infer a more robust representation for a given spoken paragraph against imperfect speech recognition. The utility of the D-EV model is evaluated on a spoken document summarization task, confirming the effectiveness of the proposed embedding method in relation to several well-practiced and state-of-the-art summarization methods.

運用序列到序列生成架構於重寫式自動摘要(Exploiting Sequence-to-Sequence Generation Framework for Automatic Abstractive Summarization)[In Chinese]
Yu-Lun Hsieh | Shih-Hung Liu | Kuan-Yu Chen | Hsin-Min Wang | Wen-Lian Hsu | Berlin Chen
Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016)

2015

Proceedings of the 27th Conference on Computational Linguistics and Speech Processing (ROCLING 2015)
Sin-Horng Chen | Hsin-Min Wang | Jen-Tzung Chien
Proceedings of the 27th Conference on Computational Linguistics and Speech Processing (ROCLING 2015)

表示法學習技術於節錄式語音文件摘要之研究(A Study on Representation Learning Techniques for Extractive Spoken Document Summarization) [In Chinese]
Kai-Wun Shih | Berlin Chen | Kuan-Yu Chen | Shih-Hung Liu | Hsin-Min Wang
Proceedings of the 27th Conference on Computational Linguistics and Speech Processing (ROCLING 2015)

調變頻譜分解之改良於強健性語音辨識(Several Refinements of Modulation Spectrum Factorization for Robust Speech Recognition) [In Chinese]
Ting-Hao Chang | Hsiao-Tsung Hung | Kuan-Yu Chen | Hsin-Min Wang | Berlin Chen
Proceedings of the 27th Conference on Computational Linguistics and Speech Processing (ROCLING 2015)

節錄式語音文件摘要使用表示法學習技術 (Extractive Spoken Document Summarization with Representation Learning Techniques) [In Chinese]
Kai-Wun Shih | Kuan-Yu Chen | Shih-Hung Liu | Hsin-Min Wang | Berlin Chen
International Journal of Computational Linguistics & Chinese Language Processing, Volume 20, Number 2, December 2015 - Special Issue on Selected Papers from ROCLING XXVII

調變頻譜分解技術於強健語音辨識之研究 (Investigating Modulation Spectrum Factorization Techniques for Robust Speech Recognition) [In Chinese]
Ting-Hao Chang | Hsiao-Tsung Hung | Kuan-Yu Chen | Hsin-Min Wang | Berlin Chen
International Journal of Computational Linguistics & Chinese Language Processing, Volume 20, Number 2, December 2015 - Special Issue on Selected Papers from ROCLING XXVII

2014

Leveraging Effective Query Modeling Techniques for Speech Recognition and Summarization
Kuan-Yu Chen | Shih-Hung Liu | Berlin Chen | Ea-Ee Jan | Hsin-Min Wang | Wen-Lian Hsu | Hsin-Hsi Chen
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Proceedings of the 26th Conference on Computational Linguistics and Speech Processing (ROCLING 2014)
Jing-Yang Jou | Chia-Hui Chang | Hsin-Min Wang
Proceedings of the 26th Conference on Computational Linguistics and Speech Processing (ROCLING 2014)

探究新穎語句模型化技術於節錄式語音摘要 (Investigating Novel Sentence Modeling Techniques for Extractive Speech Summarization) [In Chinese]
Shih-Hung Liu | Kuan-Yu Chen | Yu-Lun Hsieh | Berlin Chen | Hsin-Min Wang | Wen-Lian Hsu
Proceedings of the 26th Conference on Computational Linguistics and Speech Processing (ROCLING 2014)

2013

Semantic Naïve Bayes Classifier for Document Classification
How Jing | Yu Tsao | Kuan-Yu Chen | Hsin-Min Wang
Proceedings of the Sixth International Joint Conference on Natural Language Processing

改良語句模型技術於節錄式語音摘要之研究 (Improved Sentence Modeling Techniques for Extractive Speech Summarization) [In Chinese]
Shih-Hung Liu | Kuan-Yu Chen | Hsin-Min Wang | Wen-Lian Hsu | Berlin Chen
Proceedings of the 25th Conference on Computational Linguistics and Speech Processing (ROCLING 2013)

A Study of Language Modeling for Chinese Spelling Check
Kuan-Yu Chen | Hung-Shin Lee | Chung-Han Lee | Hsin-Min Wang | Hsin-Hsi Chen
Proceedings of the Seventh SIGHAN Workshop on Chinese Language Processing

2007

A Novel Characterization of the Alternative Hypothesis Using Kernel Discriminant Analysis for LLR-Based Speaker Verification
Yi-Hsiang Chao | Hsin-Min Wang | Ruei-Chuan Chang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 12, Number 3, September 2007: Special Issue on Invited Papers from ISCSLP 2006

2006

A Maximum Entropy Approach for Semantic Language Modeling
Chuang-Hua Chueh | Hsin-Min Wang | Jen-Tzung Chien
International Journal of Computational Linguistics & Chinese Language Processing, Volume 11, Number 1, March 2006: Special Issue on Human Computer Speech Processing

An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Continuous Speech Recognition
Jen-Wei Kuo | Shih-Hung Liu | Hsin-Min Wang | Berlin Chen
International Journal of Computational Linguistics & Chinese Language Processing, Volume 11, Number 3, September 2006: Special Issue on Selected Papers from ROCLING XVII

2005

MATBN: A Mandarin Chinese Broadcast News Corpus
Hsin-Min Wang | Berlin Chen | Jen-Wei Kuo | Shih-Sian Cheng
International Journal of Computational Linguistics & Chinese Language Processing, Volume 10, Number 2, June 2005: Special Issue on Annotated Speech Corpora

2004

Proceedings of the 16th Conference on Computational Linguistics and Speech Processing
Lee-Feng Chien | Hsin-Min Wang
Proceedings of the 16th Conference on Computational Linguistics and Speech Processing

藍芽無線環境下中文語音辨識效能之評估與分析 (Performance Evaluation and Analysis of Mandarin Speech Recognition over Bluetooth Communication Environments) [In Chinese]
Yin-Cheng Chen | Tan-Hsu Tan | Hsin-Min Wang | Wei-Ho Tsai
Proceedings of the 16th Conference on Computational Linguistics and Speech Processing

2001

Mandarin-English Information: Investigating Translingual Speech Retrieval
Helen Meng | Berlin Chen | Sanjeev Khudanpur | Gina-Anne Levow | Wai-Kit Lo | Douglas Oard | Patrick Shone | Karen Tang | Hsin-Min Wang | Jianqiang Wang
Proceedings of the First International Conference on Human Language Technology Research

2000

Mandarin-English Information (MEI): Investigating Translingual Speech Retrieval
Helen Meng | Sanjeev Khudanpur | Gina Levow | Douglas W. Oard | Hsin-Min Wang
ANLP-NAACL 2000 Workshop: Embedded Machine Translation Systems

1999

A New Syllable-based Approach for Retrieving Mandarin Spoken Documents Using Short Speech Queries
Hsin-min Wang
Proceedings of Research on Computational Linguistics Conference XII

1998

Statistical Analysis of Mandarin Acoustic Units and Automatic Extraction of Phonetically Rich Sentences Based Upon a Very Large Chinese Text Corpus
Hsin-min Wang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 3, Number 2, August 1998

1993

從中文語料庫中自動選取連續國語語音特性平衡句的方法 (Automatic Selection of Phonetically Rich Sentences from A Chinese Text Corpus) [In Chinese]
Hsin-Min Wang | Yuan-Cheng Chang | Lin-Shan Lee
Proceedings of Rocling VI Computational Linguistics Conference VI

Co-authors

Hung-Shin Lee 4

Chao-Chun Liang 4

Aleksandra Smolka 3

Jason S. Chang 2

Ting-Hao Chang 2

Hsin-Hsi Chen 2

Ying-Wen Chen 2

Jen-Tzung Chien 2

Cheng-Chung Fan 2

Hsiao-Tsung Hung 2

Sanjeev Khudanpur 2

Gina-Anne Levow 2

Shang-Bao Luo 2

Douglas W. Oard 2

Shih-Hung Tsai 2

Kuang-Yu Chang 1

Ruei-Chuan Chang 1

Chia-Hui Chang 1

Cheng-Jo Ray Chang 1

Yuan-Cheng Chang 1

Yi-Hsiang Chao 1

Chih-Hsi Chen 1

Yin-Cheng Chen 1

Sin-Horng Chen 1

Shih-Sian Cheng 1

Lee-Feng Chien 1

Chuang-Hua Chueh 1

Chiao-Wei Hsu 1

Hsiang-Ping Hsu 1

Bing-Jhih Huang 1

Wen-Chin Huang 1

Hsin-Te Hwang 1

Jyh-Shing Roger Jang 1

Jing-Yang Jou 1

Chia-Chih Kuo 1

Chung-Han Lee 1

Patrick Shone 1

Shih-Hong Tsai 1

Jia-Ching Wang 1

Jianqiang Wang 1

Pang-Cheng Wu 1

Venues