Can open datasets help machine learning solve medical mysteries?

The medical data housed in patient records are a clinical researcher’s dream: the key, potentially, to better tools to treat disease and screen with precision. They’re also a computer scientist’s nightmare: locked away in hospital systems, subject to restrictive data-sharing agreements, and often too messy to use.

A new open science project wants to accelerate ethical AI in medicine by doing the hard work of collecting and cleaning that data. Nightingale Open Science launched in December with $6 million in funding, led by Schmidt Futures, the philanthropy of ex-Google CEO Eric Schmidt. (It has no affiliation with Google’s controversial health record-mining partnership with Ascension, which went by the code name Project Nightingale.) It will freely share de-identified clinical datasets with researchers, linking medical images like X-rays, ECG results, and biopsy slides (40 terabytes’ worth, to start) to outcomes from partnered health systems. Hundreds of researchers signed up for access in its first month.
