Biyi Fang


2025

TrInk: Ink Generation with Transformer Network
Zezhong Jin | Shubhang Desai | Xu Chen | Biyi Fang | Zhuoyi Huang | Zhe Li | Chong-Xin Gan | Xiao Tu | Man-Wai Mak | Yan Lu | Shujie Liu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

In this paper, we propose TrInk, a Transformer-based model for ink generation, which effectively captures global dependencies. To better facilitate the alignment between the input text and generated stroke points, we introduce scaled positional embeddings and a Gaussian memory mask in the cross-attention module. Additionally, we design both subjective and objective evaluation pipelines to comprehensively assess the legibility and style consistency of the generated handwriting. Experiments demonstrate that our Transformer-based model achieves a 35.56% reduction in character error rate (CER) and a 29.66% reduction in word error rate (WER) on the IAM-OnDB dataset compared to previous methods. We provide a demo page with handwriting samples from TrInk and baseline models at: https://akahello-a11y.github.io/trink-demo/

2022

SLATE: A Sequence Labeling Approach for Task Extraction from Free-form Inked Content
Apurva Gandhi | Ryan Serrao | Biyi Fang | Gilbert Antonius | Jenna Hong | Tra My Nguyen | Sheng Yi | Ehi Nosakhare | Irene Shaffer | Soundararajan Srinivasan
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track

We present SLATE, a sequence labeling approach for extracting tasks from free-form content such as digitally handwritten (or “inked”) notes on a virtual whiteboard. Our approach allows us to create a single, low-latency model that simultaneously performs sentence segmentation and classification of these sentences into task/non-task sentences. SLATE greatly outperforms a baseline two-model approach (a sentence segmentation model followed by a classification model), achieving a task F1 score of 84.4%, a sentence segmentation (boundary similarity) score of 88.4%, and three times lower latency compared to the baseline. Furthermore, we provide insights into tackling the challenges of performing NLP in the inking domain. We release both our code and dataset for this novel task.