Jawad IbnAhad
Large Language Models (LLMs) have achieved great success in tasks like sentiment analysis, machine translation, and question answering, yet their effectiveness in the multilingual financial domain remains underexplored. This study examines the potential of generative LLMs for classifying financial sustainability in four diverse languages: English, Hindi, Bengali, and Telugu, spanning low-, medium-, and high-resource categories. We propose a novel fine-tuning approach that integrates both positive and negative rationales alongside classification labels. Unlike existing approaches, our method improves classification performance by incorporating structured bidirectional reasoning into financial decision-making. Extensive evaluations demonstrate that the proposed approach consistently outperforms prior methods across all four languages, establishing new benchmark results for multilingual financial NLP. Notably, it also enables smaller models to achieve competitive or even superior performance compared to significantly larger models fine-tuned with conventional methods, demonstrating its suitability for industry applications.
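To make the rationale-augmented format described above concrete, the sketch below packs a classification label together with a positive and a negative rationale into a single supervised fine-tuning record. This is a minimal illustration only: the field names, prompt wording, and JSON layout are assumptions, not the paper's actual data format.

    # Hypothetical sketch of rationale-augmented fine-tuning data; field names
    # and prompt wording are assumptions, not the paper's specification.
    import json

    def build_example(text, label, positive_rationale, negative_rationale):
        """Pack one fine-tuning record: the model learns to emit the label
        together with reasons for it (positive rationale) and reasons against
        the alternative (negative rationale)."""
        prompt = (
            "Classify the financial sustainability of the following text "
            "and justify the decision in both directions.\n\n"
            f"Text: {text}"
        )
        completion = (
            f"Label: {label}\n"
            f"Why this label holds: {positive_rationale}\n"
            f"Why the alternative fails: {negative_rationale}"
        )
        return {"prompt": prompt, "completion": completion}

    record = build_example(
        text="The firm issued green bonds to fund its solar expansion.",
        label="sustainable",
        positive_rationale="Proceeds are earmarked for renewable capacity.",
        negative_rationale="Nothing indicates greenwashing or fossil lock-in.",
    )
    print(json.dumps(record, indent=2, ensure_ascii=False))

Pairing both rationales in one completion is what gives the training signal its bidirectional character: the model is supervised not only on why the chosen label fits but also on why the competing label does not.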
Large Language Models (LLMs) excel at complex reasoning tasks, yet their performance hinges on the quality of their prompts and pipeline structures. Manual prompt design, as used in frameworks like DSPy, poses significant limitations: it is time-intensive, demands substantial expertise, and lacks scalability, restricting the widespread use of LLMs across diverse applications. To overcome these challenges, we introduce AutoDSPy, the first framework to fully automate DSPy pipeline construction using reinforcement learning (RL). AutoDSPy leverages an RL-tuned policy network to dynamically select optimal reasoning modules, such as Chain-of-Thought for logical tasks or ReAct for tool integration, along with input-output signatures and execution strategies, entirely eliminating the need for manual configuration. Experimental results on the GSM8K and HotPotQA benchmarks demonstrate that AutoDSPy outperforms traditional DSPy baselines, achieving accuracy gains of up to 4.3% while reducing inference time, even with smaller models like GPT-2 (127M). By integrating RL-based automation, AutoDSPy enhances both efficiency and accessibility, simplifying the development of structured, high-performing LLM solutions and enabling scalability across a wide range of tasks.
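The module-selection step this abstract describes can be sketched as a policy choosing among DSPy modules and updating from a task-accuracy reward. In the sketch below, StubPolicy is a hypothetical epsilon-greedy stand-in for the paper's RL-tuned policy network, and the reward handling is an assumption; dspy.ChainOfThought and dspy.Predict are real DSPy constructors, while the ReAct option from the abstract is omitted because it additionally requires tool definitions.

    # Minimal sketch of automated DSPy module selection; StubPolicy is a
    # placeholder for the RL policy network, not the authors' implementation.
    import random
    import dspy

    MODULES = {
        "cot": lambda sig: dspy.ChainOfThought(sig),   # multi-step reasoning
        "predict": lambda sig: dspy.Predict(sig),      # direct prediction
    }

    class StubPolicy:
        """Epsilon-greedy placeholder: tracks a running mean reward per
        module and mostly picks the best-performing one seen so far."""
        def __init__(self, arms, epsilon=0.1):
            self.eps = epsilon
            self.stats = {a: [0.0, 0] for a in arms}  # arm -> [mean, count]

        def select(self):
            if random.random() < self.eps:
                return random.choice(list(self.stats))  # explore
            return max(self.stats, key=lambda a: self.stats[a][0])  # exploit

        def update(self, arm, reward):
            mean, n = self.stats[arm]
            self.stats[arm] = [(mean * n + reward) / (n + 1), n + 1]

    policy = StubPolicy(MODULES)
    choice = policy.select()                        # e.g. "cot" for GSM8K
    module = MODULES[choice]("question -> answer")  # DSPy signature string
    policy.update(choice, reward=1.0)               # reward = accuracy signal

A full implementation would replace the stub with a trained network conditioned on task features and would also choose signatures and execution strategies, as the abstract states; the sketch shows only the select-execute-update loop that such automation rests on.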