Core42 Announces Open-Source Arabic LLM, Jais 30B
Jais 30B is Core42’s latest open-source Arabic Large Language Model (LLM).
Core42, a G42 company, has announced the launch of Jais 30B, a bilingual Large Language Model (LLM) which has the potential to improve Arabic verbosity by 160% and English by 233%, the UAE-based company said in a press release [1]. With 30 bn parameters, compared with the 13 bn of its predecessor Jais 13B, launched in August of this year, Jais 30B facilitates summarization, translation and Q&A.
LLMs are a kind of generative AI that uses machine learning to understand and produce text. They are built on billions of parameters. LLMs have potential applications in telecommunications, energy, education, healthcare and marketing communications.
Jais 30B’s open-source bilingual model aims to cater to over 400 mn Arabic speakers worldwide, supporting the UAE’s digital platforms and frameworks and acting as a tool for Arabic speakers across MENA. Tests showed that it improved summarization by 53% in Arabic and 85% in English. It also improved formatting by 130% in Arabic and 134% in English, making it equivalent to monolingual English models.
Jais 30B was trained on a dataset of 126 bn Arabic tokens, 251 bn English tokens and 50 bn code tokens, using Cerebras’s Condor Galaxy 1 AI supercomputer.
The newly formed Core42, part of the G42 group of companies, is the UAE’s national-scale enabler for cloud and generative AI. Jais is a collaboration among Core42, Inception (which has since been merged into Core42 as its applied AI research unit), Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) and California-based chip company Cerebras Systems.
In 2018, the UAE launched the National Strategy for Artificial Intelligence 2031, which aims to put the country at the forefront of AI globally. A report by PwC estimates that AI will contribute $320 bn to Middle East economies by 2030, with AI accounting for almost 14% of the UAE’s GDP, equivalent to $96 bn, by that year.