Friday, December 13, 2024

Sony AI and AI Singapore collaborate to create a large-scale language model.

Sony Research has partnered with AI Singapore (AISG) to refine and optimize the Southeast Asian Languages in One Network (SEA-LION) artificial intelligence model, with a focus on Indian languages such as Tamil.

Sony’s AI arm will work with AISG to address the challenge of ensuring that large language models accurately reflect the world’s diverse populations and languages. The partners said Tuesday that they will collaborate on research under the SEA-LION umbrella, a collection of large language models (LLMs) specifically pre-trained and fine-tuned for Southeast Asian languages and cultures.

The open-source large language model has been trained on 981 billion language tokens, which AISG defines as fragments of words created through the tokenization process. The training corpus comprises approximately 623 billion English tokens, 128 billion Southeast Asian tokens, and 91 billion Chinese tokens.
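
As a rough illustration of what those token counts mean in practice, the sketch below counts tokens for short English and Tamil strings with a Hugging Face tokenizer. The repository name "aisingapore/sea-lion-7b" and the use of trust_remote_code are assumptions about how the public SEA-LION checkpoints are distributed, not details taken from the announcement.

# Minimal sketch: counting tokens the way a corpus size would be measured.
# The repository ID below is an assumption; substitute the actual SEA-LION
# checkpoint name if it differs.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "aisingapore/sea-lion-7b",
    trust_remote_code=True,  # assumed: SEA-LION ships a custom tokenizer class
)

samples = {
    "English": "Large language models learn from tokenized text.",
    "Tamil": "மொழி மாதிரிகள் உரையிலிருந்து கற்றுக்கொள்கின்றன.",
}

for language, text in samples.items():
    tokens = tokenizer.encode(text)
    print(f"{language}: {len(tokens)} tokens")

The same text can tokenize into very different numbers of tokens depending on the language, which is why a corpus is described in tokens per language rather than in words or documents.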

Sony will help refine and evaluate the AI model, drawing on its research capabilities in India and its expertise in developing large language models for Indian languages, including Tamil. Tamil is spoken by an estimated 60-85 million people worldwide, primarily in India and Southeast Asia.

Sony will also share best practices for LLM development and evaluation methodologies, as well as applications of its research in speech recognition, content analysis, and natural language processing.

According to AISG senior director of AI products Leslie Teo, adding Tamil capabilities to the SEA-LION model holds great promise for improving real-world applications. The Singapore organization is also willing to share its knowledge and best practices for accelerating large language model development.

AI researchers, along with a range of industry players, are collaborating to refine the regional large language model and to make it accessible to developers, who can then build tailored AI applications on top of it.

“The lack of access to large language models that cover the world’s languages and cultures has hindered our ability to conduct meaningful research and develop technologies that are representative of, and equitable for, the diverse populations we aim to serve,” said Hiroaki Kitano, president of Sony Research. “There is significant power in diversity and localization.” Southeast Asia alone is home to more than 1,000 distinct languages, underscoring the importance of building AI technologies that serve diverse populations around the world.

Founded in April 2023, Sony Research explores the intersection of technology and creativity to enhance content creation and fan engagement, focusing on potential applications of AI, sensing technologies, and digital innovations. Its research teams have been exploring technologies such as model compression and neural rendering, with the aim of integrating these innovations into Sony’s deep learning development tools: the GUI-based Neural Network Console and the open-source Neural Network Libraries.
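
For readers unfamiliar with these tools, the following is a minimal sketch of how Sony’s open-source Neural Network Libraries (nnabla) is typically used to define and train a small network. The layer sizes and the random data are placeholders for illustration only, not anything from Sony’s products or the SEA-LION work.

# Minimal sketch with Sony's open-source Neural Network Libraries (nnabla).
# Network shape and data are placeholders chosen for illustration.
import numpy as np
import nnabla as nn
import nnabla.functions as F
import nnabla.parametric_functions as PF
import nnabla.solvers as S

batch_size, n_features, n_classes = 32, 16, 4

# Define a small feed-forward classifier.
x = nn.Variable((batch_size, n_features))
t = nn.Variable((batch_size, 1))
h = F.relu(PF.affine(x, 64, name="fc1"))
y = PF.affine(h, n_classes, name="fc2")
loss = F.mean(F.softmax_cross_entropy(y, t))

solver = S.Adam()
solver.set_parameters(nn.get_parameters())

# One training step on random placeholder data.
x.d = np.random.randn(batch_size, n_features)
t.d = np.random.randint(0, n_classes, size=(batch_size, 1))
loss.forward()
solver.zero_grad()
loss.backward()
solver.update()
print("loss:", loss.d)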

These technologies can power innovative products across Sony’s businesses, including gaming, entertainment, film, music and interactive experiences, according to the company.

Its interactive entertainment unit has filed a patent for a harassment detection apparatus that uses an input unit to collect biometric data and infer users’ emotional states, according to a search of the World Intellectual Property Organization’s PatentScope platform.

Sony aims to develop technology that can identify and mitigate harmful interactions in multiplayer gaming or virtual reality environments as part of its efforts to combat online harassment. According to the filing, the system uses machine learning and AI models to analyze biometric data, including audio cues such as speech, and to infer a participant’s emotional state from indicators such as sobbing or screaming. These indicators could also be used to identify potential victims of harassment within a shared environment.
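
To make the general idea concrete, the sketch below shows one way audio cues could be mapped to a coarse emotional-state label. It is an illustration under generic assumptions (MFCC features, a two-label scheme, and a hypothetical pre-trained model file "distress_classifier.pkl"), not the system described in the patent filing.

# Illustrative sketch: labeling a voice clip's emotional state from audio
# features. This is NOT the patented system; the feature choice (MFCCs),
# the label set, and the model file name are all assumptions.
import joblib
import librosa
import numpy as np

LABELS = ["neutral", "distressed"]  # hypothetical label set

def extract_features(path: str) -> np.ndarray:
    """Load an audio clip and summarize it as a fixed-length MFCC vector."""
    audio, sr = librosa.load(path, sr=16000, mono=True)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=20)
    return mfcc.mean(axis=1)  # average over time -> 20-dim vector

def predict_state(path: str, model_path: str = "distress_classifier.pkl") -> str:
    """Predict a coarse emotional state for one clip with a pre-trained model."""
    model = joblib.load(model_path)  # e.g. an sklearn classifier trained offline
    features = extract_features(path).reshape(1, -1)
    return LABELS[int(model.predict(features)[0])]

if __name__ == "__main__":
    print(predict_state("player_voice_chat.wav"))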

Separately, Sony Music Group has said it will not permit the scraping of its artists’ copyrighted works, including compositions, lyrics, and audio recordings, to train AI models unless explicit approval is granted.
