@inproceedings{fang-etal-2024-efficiently,
    title = "Efficiently Acquiring Human Feedback with {B}ayesian Deep Learning",
    author = "Fang, Haishuo  and
      Gor, Jeet  and
      Simpson, Edwin",
    editor = {V{\'a}zquez, Ra{\'u}l  and
      Celikkanat, Hande  and
      Ulmer, Dennis  and
      Tiedemann, J{\"o}rg  and
      Swayamdipta, Swabha  and
      Aziz, Wilker  and
      Plank, Barbara  and
      Baan, Joris  and
      de Marneffe, Marie-Catherine},
    booktitle = "Proceedings of the 1st Workshop on Uncertainty-Aware NLP (UncertaiNLP 2024)",
    month = mar,
    year = "2024",
    address = "St Julians, Malta",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2024.uncertainlp-1.7/",
    pages = "70--80",
    abstract = "Learning from human feedback can improve models for text generation or passage ranking, aligning them better to a user{'}s needs. Data is often collected by asking users to compare alternative outputs to a given input, which may require a large number of comparisons to learn a ranking function. The amount of comparisons needed can be reduced using Bayesian Optimisation (BO) to query the user about only the most promising candidate outputs. Previous applications of BO to text ranking relied on shallow surrogate models to learn ranking functions over candidate outputs,and were therefore unable to fine-tune rankers based on deep, pretrained language models. This paper leverages Bayesian deep learning (BDL) to adapt pretrained language models to highly specialised text ranking tasks, using BO to tune the model with a small number of pairwise preferences between candidate outputs. We apply our approach to community question answering (cQA) and extractive multi-document summarisation (MDS) with simulated noisy users, finding that our BDL approach significantly outperforms both a shallow Gaussian process model and traditional active learning with a standard deep neural network, while remaining robust to noise in the user feedback."
}