SEA-LION is a family of open-source large language models developed by AI Singapore as part of the National Multi-Modal Large Language Model project. Trained on multilingual datasets from Southeast Asia, SEA-LION supports low-resource languages such as Thai, Vietnamese, and Bahasa Indonesia. The models aim to improve the representation of Southeast Asian languages and cultures in AI and to make multilingual natural language processing (NLP) tasks, including translation, summarization, and question answering, more accessible.
A growing observatory of examples of how open data from official sources and generative artificial intelligence (AI) are intersecting across domains and geographies.
Share your project for inclusion. We seek to learn from generative AI initiatives that use open government and research data across a Spectrum of Scenarios. More information on each scenario can be found in our report: A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI.