@inproceedings{kopiczko-etal-2025-bitune,
    title = "Bitune: Leveraging Bidirectional Attention to Improve Decoder-Only {LLM}s",
    author = "Kopiczko, Dawid Jan  and
      Blankevoort, Tijmen  and
      Asano, Yuki M",
    editor = "Christodoulopoulos, Christos  and
      Chakraborty, Tanmoy  and
Ros{\'e}, Carolyn  and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.481/",
    pages = "9521--9547",
    ISBN = "979-8-89176-332-6",
    abstract = "Decoder-only large language models typically rely solely on masked causal attention, which limits their expressiveness by restricting information flow to one direction. We propose Bitune, a method that enhances pretrained decoder-only LLMs by incorporating bidirectional attention into prompt processing. We evaluate Bitune in instruction-tuning and question-answering settings, showing significant improvements in performance on commonsense reasoning, arithmetic, and language understanding tasks. Furthermore, extensive ablation studies validate the role of each component of the method, and demonstrate that Bitune is compatible with various parameter-efficient finetuning techniques and full model finetuning."
}