Predicting High Risk Breast Cancer - Phase 2 (2023)

Daily submission limit: 1


Every year, 40 million women get a mammogram; some go on to have an invasive biopsy to better examine a concerning area. Underneath these routine tests lies a deep—and disturbing—mystery. Since the 1990s, we have found far more ‘cancers’, which has in turn prompted vastly more surgical procedures and chemotherapy. But death rates from metastatic breast cancer have hardly changed.

When a pathologist looks at a biopsy slide, she is looking for known signs of cancer: tubules, cells with atypical looking nuclei, evidence of rapid cell division. These features, first identified in 1928, still underlie critical decisions today: which women must receive urgent treatment with surgery and chemotherapy? And which can be prescribed “watchful waiting”, sparing them invasive procedures for cancers that would not harm them?

There is already evidence that algorithms can predict which cancers will metastasize and harm patients on the basis of the biopsy image. Fascinatingly, these algorithms also home in on features that humans neglect, for example, the nature of the non-cancerous tissue surrounding the tumor. But to date, the datasets linking biopsy images to patient outcomes (metastasis, death) have been far smaller than what is needed to apply modern approaches.

Updates in Phase 2

Phase 1 of this contest (results), which ended in January 2023, included the entire dataset, and the objective was to predict the cancer stage from the biopsy slides. That dataset reflects the real-world skew towards stages 0 and 1, as well as a lower proportion of minority races and ethnicities. For the current (Phase 2) contest, we have created a subset to balance the proportion of patients with various cancer stages, as well as minority races and ethnicities. The scoring method has also changed: entries are now scored by the area under the ROC curve (one-vs-rest macro average AUC) instead of the mean squared error.

This dataset contains images and outcomes for 1,000 biopsies, corresponding to 842 patients, collected from 2014 to 2020. Please refer to the full version of the dataset documentation as you get started to learn more about the cohort and key variables for this challenge, including mortality and cancer stage.

Baseline Score

The Nightingale team used CLAM's out-of-the-box model to calculate the baseline score using only the biopsy images. CLAM is a method that uses weakly supervised learning with slide-level labels to classify whole slide images (WSI). It identifies the regions of slides that contain tissue and uses transfer learning from a pre-trained model. With this out-of-the-box model, our baseline obtained an AUC of 0.68.


Providence St. Joseph, Nightingale, and The Association for Health Learning and Inference (AHLI) developed this challenge in order to catalyze the development of algorithms that find new signal in digital pathology images, ultimately providing new insights into which patients may be at risk and need preventive treatment.

The goal of this challenge is to predict the stage of a patient’s cancer, using only the slide images generated by a breast biopsy.

Cancer staging is a complex, multidisciplinary task: while it does take into account some features of the biopsy, it also integrates a wide variety of external information: the size of the lesion biopsied, its appearance and location on imaging, and a variety of other tests (imaging and more) to determine whether the cancer has spread to other locations in the body. This important contextual information, most of which is not present in the whole slide image, serves as our ground-truth label for the challenge. By linking features of the whole slide image to this label, algorithmic approaches have the potential to find new sources of signal—beyond the tubules, atypical nuclei, and cell division markers pathologists consider today—that can identify patients with benign or deadly cancers.

Building on successful work in this challenge, a particularly interesting next step is to identify predictable “outliers”: patients whose cancer is far more, or less, benign than it appears to the pathologists. Researchers at Providence, who have access to rich and granular data on pathologists’ judgments, are eager to collaborate on this exciting follow-on work.

Prizes and Compute Credits

Nightingale OS and AHLI have announced prizes up to $20,000 and free compute credits worth up to $50,000 for the two phases. For the current phase (phase 2), up to $10,000 will be awarded as prize money.

Compute Credits

Each team will be allocated compute credits worth $250 when they join the contest and request them. Active teams may request additional tranches of $250 of free compute credits when they have less than $50 balance, up to a maximum of $1,000 per team for the duration of the contest. Complimentary compute credits will expire at the end of the contest. Teams may also pay for their own compute credits beyond this limit – they will not expire at the end of the contest.


Based on the AUC, the winning team will be awarded a cash prize of $3,000 and the runner-up team will be awarded a cash prize of $2,000.

Additional prizes of up to $5,000 may be awarded based on clinical significance of the submitted results, such as the ability to distinguish metastatic vs non-metastatic cancers, or the accuracy in minority patients. All entries on the leaderboard will be evaluated based on various criteria of clinical significance as defined by the review committee.

All prize winning entries are subject to rules governing the prizes and the contest. Please see the rules for US export controls and tax reporting/withholding information.


This dataset contains whole slide images from 1,000 breast biopsies, in 842 patients, over the years 2014 to 2020. For our purposes, an observation in this dataset corresponds to a biopsy (i.e., performance in the hold-out set will be evaluated at the biopsy level). The contest content can be found in the contest directory.


Images: Each biopsy generates between one and one hundred physical slides (processed with hematoxylin and eosin stain). The slides have been digitized at 40x magnification with a Hamamatsu slide scanner, yielding a whole slide image. These images have a resolution of around 100,000 x 150,000 pixels, and each is stored as a single NDPI file (average size ~2GB). An NDPI file is a TIFF-like file, and libraries like openslide can be used to interact with it. The 1,000 biopsies of this dataset generate 10,856 whole slide images, with a median of 5 WSI per biopsy.
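Because a whole slide image of this size cannot be loaded into memory at once, a common approach is to read it tile by tile. The following is a minimal sketch, assuming the openslide-python package is available in your instance; the helper names `tile_origins` and `read_tile` and the 512-pixel tile size are illustrative choices, not part of the contest tooling.

```python
from typing import Iterator, Tuple


def tile_origins(width: int, height: int, tile: int) -> Iterator[Tuple[int, int]]:
    """Yield top-left (x, y) level-0 coordinates of a non-overlapping tile grid."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield x, y


def read_tile(slide_path: str, x: int, y: int, tile: int = 512):
    """Read one RGB tile at full resolution (level 0) from a whole slide image."""
    import openslide  # imported lazily so the grid helper works without it

    slide = openslide.OpenSlide(slide_path)
    try:
        # read_region takes level-0 coordinates, a pyramid level, and a size.
        region = slide.read_region((x, y), 0, (tile, tile))
        return region.convert("RGB")  # drop the alpha channel
    finally:
        slide.close()
```

At 512 x 512 pixels, a 100,000 x 150,000 image yields a grid of 196 x 293 tiles, so in practice most pipelines first discard tiles that contain no tissue.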

Labels: The primary label is the cancer stage associated with a biopsy. 94% of staging judgments in this dataset are made within one month of biopsy.

Dataset splits: The dataset has been split randomly at the patient level. It was a 75/25 split, with the 25% holdout used for validation purposes. The 75% portion was then subsampled in order to increase minority patient representation, and this subset of 1,000 biopsies has been made available for phase two of this competition. Refer to Table 1 for what is expected to be made available.

Table 1

                         Train    Holdout
N biopsies               1000     1077
N patients               842      856
N images                 10856    16899
N biopsies with stage    1000     886
  Stage 0                18%      redacted
  Stage I                39%
  Stage II               20%
  Stage III              17%
  Stage IV               6.0%
N unstaged biopsies      0        191
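Since the training set has more biopsies than patients (1,000 versus 842), a local validation split that mirrors the patient-level split described above should keep all biopsies of a patient on the same side. This is a minimal sketch under that assumption; the function name, the `biopsy_to_patient` mapping, and the default fraction are hypothetical and would need to be built from the dataset tables.

```python
import random
from typing import Dict, List, Tuple


def split_by_patient(
    biopsy_to_patient: Dict[str, str], val_fraction: float = 0.25, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Split biopsy ids so that no patient appears on both sides."""
    patients = sorted(set(biopsy_to_patient.values()))
    random.Random(seed).shuffle(patients)  # seeded for reproducibility
    n_val = max(1, round(len(patients) * val_fraction))
    val_patients = set(patients[:n_val])
    train = [b for b, p in biopsy_to_patient.items() if p not in val_patients]
    val = [b for b, p in biopsy_to_patient.items() if p in val_patients]
    return train, val
```

Splitting on biopsies instead would let slides from one patient leak into both sets and could inflate local validation scores.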

Model performance measurement: The primary metric by which we will evaluate model performance is prediction of cancer stage among staged cases in the holdout set. More detailed information on the exact scoring methodology is below. The holdout set contains staged and unstaged biopsies (see Table 1), but only staged biopsies will be used to calculate the primary Challenge metric. We will award additional prizes for other aspects of performance, reflecting both clinical utility of models and Nightingale Open Science’s commitment to equity.

You are free to use images from unstaged cases in any way you’d like in the training process. You are also free to use any other available information in the training process: detailed demographic information including age, sex, and self-reported race; and other information about the cancer and its progression beyond stage, including mortality and ICD codes for metastatic disease (though keep in mind the many caveats of this information, as noted for example here). However, note that none of this contextual information will be provided in the holdout set—only the slide images. Please refer to the full version of the dataset documentation as you get started to learn more about the cohort and key variables.

Important Challenge Dates

ATTENTION: The contest ends on May 3, 2023.


  1. Registration required. Only registered Nightingale OS users may participate in the contest.
    1. Registration is open to anyone worldwide who has an active affiliation with an accredited academic institution.
    2. Your registration must be approved by Nightingale before you can access contest data or any other Nightingale OS resources.
    3. You can only sign up for one account.
    4. You must abide by all Nightingale Terms of Service. In particular, no portion of a Nightingale dataset may be downloaded.
  2. Teams and collaboration
    1. Size limit. Teams can be any size. As a general rule, contests often limit team sizes to about 5. Because this is a multi-disciplinary research area, Nightingale recognizes that teams may be larger from time to time.
    2. No merges. You can only be a member of one team during a given phase.
      1. If you entered the competition as an individual, you may join someone else’s project only if you have not submitted any entries for scoring during the contest’s current phase.
    3. No sharing. Sharing code between teams is prohibited unless the information shared is free and publicly available.
    4. Public discussion. You may not publicly describe the methodology you are using for the competition until after the submission deadline for a given phase. You may describe your methodology in the context of other datasets as long as you don’t also indicate this is the methodology you used in competition.
    5. Publication. Please use the recommended citation found on the dataset documentation page. Although the dataset is available to all Nightingale users, please do not submit contest-related work to other conferences or journals until the current contest has ended.
  3. Scoring
    1. Predictions CSV. Teams submit entries for scoring in the form of a predictions file. Nightingale will score the entry according to the methodology described in the Scoring methodology section.
    2. Entry limit. Each team can submit 1 entry per day.
    3. Public leaderboard. During the competition period, each team’s current ranking will be visible on the competition public leaderboard. After the end date, the leaderboard will reveal each team’s score and model description.
    4. High score. The best score from all your team’s submissions will be used in the leaderboard, along with your description of that entry, if any. See Scoring for details.
    5. Tiebreaker. In the event of a tie, the first submission will outrank subsequent submissions.
    6. Submission requirements. Nightingale may make efforts to help you validate your predictions file and fix any issues before scoring. However, invalid submissions may result in no score and may count against your team’s entry limit.
  4. Resources
    1. Use of non-public data or software. Use of free and publicly available external data is allowed, including pre-trained models. If you use any proprietary software or data that is not free and publicly available, such as a model pre-trained on private data, then you should declare this by including “[NON-PUBLIC]” in your Nightingale project description field. Teams using non-public resources will have their rankings and scores published on the leaderboard and are encouraged to contribute to the conferences, but they will not be eligible for any prizes.
    2. Billing. Nightingale OS provides both free and non-free compute resources to contest participants. To support participants with the cost of non-free compute servers, Nightingale allocates a limited amount of free compute credits to the teams on sign up. Additional free compute credits will be allocated periodically based on availability of credits and each team’s activity. Teams are responsible for all additional costs incurred for non-free resources after they consume the free compute credits.
  5. Prizes
    1. Review of entries and Deadlines. All top scoring entries are subject to Nightingale’s review for compliance with guidelines and quality of results. After the deadline, the organizers will add the reviewers to the leading projects so that their submissions can be reviewed for conformance with the rules for awarding prizes. Because the goal of this contest is to promote research and to stimulate interest in these kinds of applications, we could extend the deadline in order to get broader participation and high quality submissions. We will let everyone know in advance if we do change any key dates.
    2. Publishing and Licensing of work. To be eligible for prizes, teams must publish their winning solution on a public code repository (GitHub preferred) as open source under the Apache License v2 or an Apache-compatible open source license. The license must be added to your project files by the deadline to be eligible - see here for an example of how to do this. Third-party resources, if used, must be freely available to the public, and must not form a significant part of the solution as determined by the judges. This clause is intended to promote the repeatability and reproducibility of science. If a team is unable to comply with this clause, they may still compete and will be placed on the leaderboard, but they will not be eligible for prizes.
    3. Export control. Prize money is subject to US export control regulations. Teams in countries prohibited by US export control regulations will not be eligible for prizes.
    4. Tax reporting and withholding. ALL TAXES IMPOSED ON PRIZES ARE THE SOLE RESPONSIBILITY OF THE WINNERS. Payments to potential winners are subject to the express requirement that they submit all documentation requested by competition organizers for compliance with tax reporting and withholding regulations. Prizes will be net of any taxes that the competition organizer is required by law to withhold. Foreign residents (as defined by US tax regulations) will be subject to mandatory 30% tax withholding. If a potential winner fails to provide any required documentation or comply with applicable laws, the Prize may be forfeited and the organizer may select an alternative potential winner. Any winners who are U.S. residents may receive an IRS Form 1099 in the amount of their prize. Any winners who are foreign residents may receive an IRS Form 1042-S in the amount of their prize and the tax amount withheld.


Scoring methodology

Cancer stage takes on discrete values (0, I, II, III, IV), and clinically, some errors are more costly than others: in broad strokes, metastatic disease (stage IV) is managed very differently from locoregional disease, and carries a much higher mortality rate. We use the one-vs-rest weighted average AUC (area under the ROC-curve) to score the entries in this contest.
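To make the metric concrete, here is a minimal pure-Python sketch of a one-vs-rest AUC. It is not the organizers' scoring code: the function names are illustrative, and the `average` argument covers both the macro and the weighted variants, since this page mentions each. Stages are assumed to be encoded as integers 0 to 4 that index the probability columns.

```python
from typing import Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a positive outscores a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ovr_auc(
    y_true: Sequence[int], probs: Sequence[Sequence[float]], average: str = "macro"
) -> float:
    """Average the per-stage AUCs; 'weighted' weights each stage by its count."""
    aucs, weights = [], []
    for c in sorted(set(y_true)):  # stages absent from y_true are skipped
        labels = [1 if y == c else 0 for y in y_true]
        scores = [row[c] for row in probs]
        aucs.append(binary_auc(labels, scores))
        weights.append(sum(labels) if average == "weighted" else 1)
    return sum(a * w for a, w in zip(aucs, weights)) / sum(weights)
```

Each stage is scored as its own binary problem (that stage versus all others) using only that stage's probability column, so the ranking of biopsies within a column matters more than the absolute values.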

Nightingale has randomly sampled a set of patients to create a holdout set on which entries will be scored. You can see the images in the dataset directory in the holdout subdirectory, but of course, stage and mortality fields are excluded.


To submit an entry, predict the stage for each biopsy and write results to a CSV file.

Predictions CSV file format

Although teams are being asked to use slides to train their algorithms, predictions for this contest are for biopsies. Each biopsy produces multiple slides. Teams will need to incorporate multiple slides into their algorithms or aggregate slide predictions at a biopsy level. Also be aware that biopsy_id, slide_id, and patient_ngsci_id all are hexadecimal strings of the same length.
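One simple way to aggregate slide predictions at the biopsy level is to average the per-slide stage probabilities. The sketch below assumes you already have one probability vector per slide and know which biopsy each slide belongs to; the function name and input layout are hypothetical, and mean pooling is only one of several reasonable choices.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def aggregate_to_biopsy(
    slide_probs: Sequence[Tuple[str, Sequence[float]]]
) -> Dict[str, List[float]]:
    """Average slide-level stage probabilities within each biopsy."""
    grouped = defaultdict(list)
    for biopsy_id, probs in slide_probs:
        grouped[biopsy_id].append(probs)
    result = {}
    for biopsy_id, rows in grouped.items():
        n = len(rows)
        # zip(*rows) walks the rows column by column, one column per stage.
        result[biopsy_id] = [sum(col) / n for col in zip(*rows)]
    return result
```

Alternatives such as taking the maximum per stage, or training a model directly on all slides of a biopsy, may work better when only a few slides contain tumor.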

Column index   Contents              Data type
0              Biopsy_ID             string
1              Probability_Stage_0   float in range [0,1]
2              Probability_Stage_1   float in range [0,1]
3              Probability_Stage_2   float in range [0,1]
4              Probability_Stage_3   float in range [0,1]
5              Probability_Stage_4   float in range [0,1]
6              Predicted_Stage       integer in {0,1,2,3,4}

Example CSV


note: no header is included
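A predictions file in this layout can be written with Python's csv module. This is a sketch under a few assumptions: the function name is illustrative, the predicted stage is taken to be the most probable stage (the format above does not require that), and probabilities are written with six decimal places.

```python
import csv
from typing import Dict, Sequence


def write_predictions(path: str, biopsy_probs: Dict[str, Sequence[float]]) -> None:
    """Write one header-less row per biopsy: id, five probabilities, stage."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for biopsy_id, probs in biopsy_probs.items():
            if len(probs) != 5 or not all(0.0 <= p <= 1.0 for p in probs):
                raise ValueError(f"bad probabilities for {biopsy_id}")
            # Assumption: predicted stage is the argmax over stages 0 to 4.
            predicted = max(range(5), key=lambda k: probs[k])
            writer.writerow([biopsy_id, *(f"{p:.6f}" for p in probs), predicted])
```

The resulting file has seven columns per row and no header line, matching the column table above.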

Submission description (recommended)

Frequently Asked Questions

I’m new to Nightingale. How can I learn to use the platform quickly?

Please see the quick start tutorial here. The tutorial notebook is also available inside the platform - you can copy, run, and modify the notebook hands-on to learn to use the platform quickly.

How to request complimentary compute credits?

To request these compute credits, teams should submit a help request from the platform or by email, mentioning their project id. Complimentary credits will expire at the end of the contest. Please see the details on the amount of free compute credits here.

How to get additional paid compute credits?

If a team wants to pay for additional compute credits, they may generate an invoice for a specific dollar amount and have it mailed to the payee’s email address. This invoice email will contain an electronic link for online payment using a credit card. On successful payment, these credits will be added to the project. These paid compute credits will not expire at the end of the contest, and we are able to transfer unspent compute credits to a new project created by the same user.

How to submit an entry?

Submit your predictions file inside any instance in your Nightingale project:

>>> import ngsci
>>> ngsci.submit_contest_entry("path/to/your.csv", description="our model")
 (<Result.SUCCESS: 1>, 'success')

After you submit a file, Nightingale will attempt to validate the schema of your CSV. If it fails validation, you will see one or more errors in the Submissions tab in your Nightingale OS project (not in your JupyterLab editor). Any validation errors detected automatically will not count against your submission limit.

After a predictions file passes validation, it will be submitted for scoring. The result will be posted in your project Submissions view, and if the result is a new high score for your team, the public leaderboard will also be updated.

Can I see a sample submission?

A sample CSV and notebook are located in ~/datasets/brca-psj-path/contest-phase-2/sample-submission. The sample CSV is in the proper format and contains all the biopsy_id values needed for a submission. The notebook describes the holdout tables and the process of submitting an entry.

How should I submit my model?

Teams that are in contention for winning prizes will be expected to share their projects with Nightingale staff within 3 days after the submission deadline. More precise instructions will be given to the top teams a week before the deadline.

Shared projects are expected to include:

  1. The notebooks or scripts used to train the model
  2. The weights or serialized model used for the winning submission, which can be used to reproduce submission results
  3. An open-source license

Here are some tutorials for saving weights and models in PyTorch and TensorFlow.

How to enter the contest and form a team?

After you have registered for Nightingale and a Nightingale admin has admitted you, then you can enter the currently active contest. For example: Predicting High Risk Breast Cancer Phase 2 - 2023.

When you enter the contest, a new, empty project is created for you and associated with this contest. This project is your team workspace.

How to transfer data from a Phase 1 contest project into a Phase 2 project?

If you worked on Phase 1, you probably want to use some of the project files that you created in your new Phase 2 project. There isn’t a direct way to copy from one project to another because each Nightingale OS project is isolated. You can migrate project data, however, using your home directory.

  1. Start an instance in the Phase 1 project.
  2. Copy data that you want to transfer from the (shared) ${HOME}/project directory to your (private) ${HOME} directory.
  3. Stop your instance in the Phase 1 project.
  4. Start an instance in the Phase 2 project.
  5. Move or copy the data that you want to transfer from ${HOME} to the (new) ${HOME}/project directory.
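Steps 2 and 5 above are ordinary directory copies and can be run from a terminal or a notebook. As a minimal sketch in Python, assuming Python 3.8 or later; the `models` and `phase1-transfer` folder names are hypothetical placeholders for your own paths:

```python
import shutil
from pathlib import Path


def copy_tree(src: Path, dst: Path) -> Path:
    """Copy a directory tree, creating dst if it does not exist yet."""
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


# Step 2, run inside the Phase 1 instance (hypothetical folder names):
# copy_tree(Path.home() / "project" / "models", Path.home() / "phase1-transfer")

# Step 5, run inside the Phase 2 instance:
# copy_tree(Path.home() / "phase1-transfer", Path.home() / "project" / "models")
```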

Host organizations

This is a challenge jointly hosted by Nightingale Open Science; The Association for Health Learning and Inference; and Providence St. Joseph Health.

Nightingale Open Science is a platform connecting researchers with deidentified, cutting edge medical datasets. The Nightingale OS team works closely with health systems around the world to create and curate datasets of medical images linked to ground-truth labels, and make them freely available to academic researchers. Nightingale OS launched at the 2021 NeurIPS conference with five anchor datasets spanning different disease areas.

The Association for Health Learning and Inference (AHLI) is a not-for-profit organization dedicated to building a transdisciplinary machine learning and health community. AHLI works with its partners to advance health data quality and access, knowledge discovery, and meaningful use of complex health data. AHLI was founded in September 2021 with generous support from Schmidt Futures.

Providence St. Joseph Health is a not-for-profit health care system operating in seven states and serves as the parent organization for 100,000 caregivers. The combined system includes 51 hospitals, 829 clinics, and other health, education and social services across Washington, Oregon, California, Alaska, Montana, New Mexico, and Texas.

Together, our three teams are thrilled to collaborate on this contest to spur collaboration and competition in the field of computational medicine.


We thank our generous funders: Schmidt Futures, The Gordon and Betty Moore Foundation, and Ken Griffin. We would also like to acknowledge the team that created and conceived of this dataset, and worked with us to make this challenge possible: Carlo Bifulco, MD, Director of Molecular Pathology and Pathology Informatics; Brian Piening, PhD, Technical Director of Clinical Genomics; Tucker Bower, Bioinformatics Scientist. Many thanks also to the leadership of Ari Robicsek, Chief Medical Analytics Officer at Providence, Bill Wright, VP of Health Innovation at Providence, and Raina Tamakawna, Enterprise and GME Research Program Manager at Providence.

We also thank Hamamatsu, developers of the NanoZoomer 360 platform. Hamamatsu supported this work with a grant from their Product Marketing Division and partnered with Providence to ensure a seamless start to the dataset creation process.
