Friday, July 4, 2025

Pre-translation vs. direct inference in multilingual LLM applications

Large language models (LLMs) have become ubiquitous tools for solving a wide range of problems. However, their effectiveness in handling diverse languages has been hampered by inherent limitations in training data, which is often skewed towards English. To address this, pre-translation, where inputs are translated to English before being fed to the LLM, has become a standard practice.
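
To make the two setups concrete, here is a minimal sketch contrasting a pre-translation pipeline with direct inference. The functions `translate_to_english`, `translate_from_english`, and `llm_generate` are hypothetical placeholders for a machine-translation system and an LLM call, not any specific API.

```python
def translate_to_english(text: str, source_lang: str) -> str:
    """Placeholder: translate `text` from `source_lang` into English."""
    raise NotImplementedError("plug in your MT system here")


def translate_from_english(text: str, target_lang: str) -> str:
    """Placeholder: translate English `text` back into `target_lang`."""
    raise NotImplementedError("plug in your MT system here")


def llm_generate(prompt: str) -> str:
    """Placeholder: call the LLM and return its completion."""
    raise NotImplementedError("plug in your LLM call here")


def answer_with_pretranslation(prompt: str, lang: str) -> str:
    # Pre-translation: source language -> English -> LLM -> source language.
    english_prompt = translate_to_english(prompt, source_lang=lang)
    english_answer = llm_generate(english_prompt)
    return translate_from_english(english_answer, target_lang=lang)


def answer_with_direct_inference(prompt: str, lang: str) -> str:
    # Direct inference: the LLM consumes and produces the original language.
    return llm_generate(prompt)
```

The pre-translation path adds two translation steps around every LLM call, each a potential source of added complexity and information loss, which is exactly the overhead direct inference avoids.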

Prior research has demonstrated the effectiveness of pre-translation for optimal LLM performance with GPT-3/3.5/4, ChatGPT, PaLM, and other models. While pre-translation helps address the language bias issue, it introduces complexities and inefficiencies, and it can lead to information loss. With the introduction of new, powerful LLMs trained on vast multilingual datasets, it is time to revisit the assumed necessity of pre-translation.

In our recent work “Breaking the Language Barrier: Can Direct Inference Outperform Pre-Translation in Multilingual LLM Applications?”, to be presented at NAACL’24, we re-evaluate the need for pre-translation using PaLM2, which has been established as highly performant on multilingual tasks. Our findings challenge the pre-translation paradigm established in prior research and highlight the advantages of direct inference in PaLM2. Specifically, we demonstrate that PaLM2-L consistently outperforms pre-translation in 94 out of 108 languages, offering a more efficient and effective tool in multilingual settings while unlocking linguistic authenticity and alleviating the limitations of pre-translation.
