Observatory of Examples of How Open Data and Generative AI Intersect

A growing observatory of examples of how open data from official sources and generative artificial intelligence (AI) are intersecting across domains and geographies.

Share your project for inclusion. We seek to learn from generative AI initiatives that use open government and research data across a Spectrum of Scenarios. More information on each scenario can be found in our report: A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI.

SEA-LION

SEA-LION is a family of open-source large language models developed by AI Singapore as part of the National Multi-Modal Large Language Model project. Trained on multilingual datasets from Southeast Asia, SEA-LION supports low-resource languages such as Thai, Vietnamese, and Bahasa Indonesia. The models aim to improve cultural representation in AI and make multilingual natural language processing (NLP) tasks, including translation, summarization, and question answering, more accessible.

Region: APAC
Sector: Academia
Scenario: Pre-training
Start Date: 2023
Location: Singapore