Asclepius is a large language model for the medical domain, trained on synthetic clinical notes generated from publicly available biomedical information. The team chose to experiment with synthetic data because of the privacy concerns associated with using patient data in LLMs. The authors indicate that an LLM trained on synthetic data can produce outputs of similar quality to those of models trained on real patient data.
A growing observatory of examples of how open data from official sources and generative artificial intelligence (AI) are intersecting across domains and geographies.
Share your project for inclusion. We seek to learn from generative AI initiatives that use open government and research data across a Spectrum of Scenarios. More information on each scenario can be found in our report, "A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI."