The ABCD Neurocognitive Prediction Challenge (ABCD-NP-Challenge 2019) invites researchers to submit their methods for predicting fluid intelligence from T1-weighted MRI (11K subjects in total, ages 9-10). The data of about 4K individuals will be provided for training. The accuracy of each method will be assessed on its predicted fluid intelligence scores for about 6K children, whose actual scores will be revealed after the challenge deadline. Downloading the data requires prior approval by NIH NDAR, which in turn requires sign-off by the institution you are affiliated with, so start the application process early. Please also sign up for the mailing list to receive updates about the challenge.
About ABCD: The ABCD study is the largest long-term study of brain development and child health in the United States. The ABCD Research Consortium consists of a Coordinating Center, a Data Informatics and Analysis Center, and 21 research sites across the country, which recruited over 11K children ages 9-10. For each participant, the study acquires structural, diffusion, and functional brain MRIs as well as genetic, neuropsychological, behavioral, and other health assessments. The goal of ABCD is to determine how childhood experiences (such as sports, videogames, social media, unhealthy sleep patterns, and smoking) interact with each other and with a child's changing biology to affect brain development and social, behavioral, academic, health, and other outcomes.
About the Challenge: Determining the neural mechanisms underlying general intelligence is fundamental to understanding cognitive development, how this relates to real-world health outcomes, and how interventions (education, environment) might improve outcomes through adolescence and into adulthood. A major factor in measuring general intelligence is fluid intelligence (Carroll, 1993), which ABCD measures via the NIH Toolbox Neurocognition battery and from which demographic confounding factors (e.g., sex and age) are removed. The fluid intelligence scores of 4154 subjects will be provided to participants for training (3739 samples) and validation (415 samples), while the scores of about 6K subjects will have to be predicted based on T1-weighted MRI. The MRIs are acquired according to the following acquisition protocol.
The fluid intelligence scores are pre-residualized on data collection site, sociodemographic variables, and brain volume. Using the R function lm, a linear regression model was constructed with fluid intelligence as the dependent variable and brain volume, data collection site, age at baseline, sex at birth, race/ethnicity, highest parental education, parental income, and parental marital status as independent variables. Any subject in the ABCD NDA Release 1.1 data set with a missing value in the dependent variable or in any independent variable of this linear model was excluded from the training set. After fitting the linear model on the resulting subset of list-wise complete data, the residuals were extracted; these residuals constitute the training values for the prediction contest. The R code used for computing the residuals on the training data will be made available for download soon.
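To make the residualization step concrete, the following is a minimal Python sketch of list-wise deletion followed by an ordinary least-squares fit and residual extraction. It is not the organizers' R code (which uses lm); the function name residualize, the use of NaN to mark missing values, and the requirement that categorical covariates such as site or sex be dummy-coded beforehand are all assumptions made for illustration.

```python
import numpy as np


def residualize(y, covariates):
    """Return OLS residuals of y given covariates, plus the mask of rows kept.

    Hypothetical helper: missing values are assumed to be encoded as NaN, and
    categorical covariates are assumed to be dummy-coded by the caller.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(covariates, dtype=float)
    if X.ndim == 1:
        X = X[:, None]

    # List-wise deletion: drop any subject with a missing value anywhere.
    complete = ~np.isnan(y) & ~np.isnan(X).any(axis=1)
    y_c, X_c = y[complete], X[complete]

    # Design matrix with an intercept column, then a least-squares fit.
    design = np.column_stack([np.ones(len(y_c)), X_c])
    beta, *_ = np.linalg.lstsq(design, y_c, rcond=None)

    # The residuals play the role of the "pre-residualized" scores.
    residuals = y_c - design @ beta
    return residuals, complete
```

By construction, the returned residuals have zero mean and are uncorrelated with every covariate in the model, so a predictor cannot score well on them simply by recovering, say, brain volume or age from the image.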
In addition to the fluid intelligence scores, the challenge organizers will also provide skull-stripped images affinely aligned to the SRI24 atlas, segmented into regions of interest (ROIs) according to that atlas, and the corresponding volume scores of each ROI via a csv file. Neither the challenge organizers nor ABCD are responsible for the quality of the derived data. Publications using the data should cite the Data Supplement of Pfefferbaum et al., Altered Brain Developmental Trajectories in Adolescents After Initiating Drinking. Am J Psychiatry, 175(4), pp. 370-380, 2018. Specifically, the raw T1-weighted MRI was first transformed into a NIfTI file using the Minimal Processing Pipeline by ABCD (Hagler et al., Image processing and analysis methods for the Adolescent Brain Cognitive Development Study, Under Review at Neuroimage). The T1 images were then processed with the cross-sectional component of the NCANDA pipeline (see Data Supplement). The processing involved noise removal and correction of field inhomogeneity confined to the brain mask defined by non-rigidly aligning the SRI24 atlas to the T1w MRI via ANTS. The brain mask was refined by majority voting across maps extracted by FSL BET, AFNI 3dSkullStrip, FreeSurfer mri_gcut, and the Robust Brain Extraction (ROBEX) method, which were applied to combinations of bias-corrected and non-bias-corrected T1w images. Using the refined mask, image inhomogeneity correction was repeated and the skull-stripped T1w image was affinely registered to the SRI24 atlas via ANTS. The resulting T1w image was visually inspected and rejected from the challenge if it failed the two-tier quality check. The image was segmented into brain tissue (gray matter, white matter, and cerebrospinal fluid) via Atropos. Gray matter tissue was further parcellated according to the SRI24 atlas.
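The mask-refinement step can be illustrated with a small Python sketch of voxel-wise majority voting over several binary brain masks. This is only an illustration of the general technique, not the NCANDA pipeline code: the function name majority_vote is hypothetical, the masks are assumed to be already loaded as arrays of identical shape, and treating a tie as "not brain" is an assumption, since the text does not say how ties are resolved.

```python
import numpy as np


def majority_vote(masks):
    """Combine binary masks: keep a voxel if more than half of the masks include it.

    Hypothetical helper; ties (possible with an even number of masks) are
    treated as background, which is an assumption.
    """
    stack = np.stack([np.asarray(m, dtype=bool) for m in masks])
    votes = stack.sum(axis=0)          # number of masks that include each voxel
    return votes * 2 > stack.shape[0]  # strict majority
```

For example, with three masks a voxel survives when at least two of the extraction methods label it as brain.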
Contestants will be ranked separately on the validation data set (for which the pre-residualized fluid intelligence scores will be provided) and on the test data set. On each data set, we will compute the Mean Squared Prediction Error (MSPE) between the predicted scores and the pre-residualized fluid intelligence scores. The pre-residualized fluid intelligence is computed via the algorithm described for the training data. If the algorithm is unable to produce a numerical prediction for a given test subject, the predicted value for that subject will be set to the value that gives the worst performance (i.e., largest MSPE) from among the set of values produced by the same algorithm on the subjects in the test dataset.
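As a rough illustration of this scoring rule, the sketch below computes the MSPE, i.e. the mean of (predicted - actual)^2 over all subjects, and fills each missing prediction with whichever of the algorithm's own produced values yields the largest squared error for that subject. This is one reading of the rule above, not the organizers' evaluation script; the function name and the use of NaN to mark a missing prediction are assumptions.

```python
import numpy as np


def mspe_with_worst_case_fill(y_true, y_pred):
    """Mean squared prediction error with worst-case filling of missing predictions.

    Hypothetical helper: a missing prediction is assumed to be encoded as NaN.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float).copy()

    missing = np.isnan(y_pred)
    produced = y_pred[~missing]
    if produced.size == 0:
        raise ValueError("no numerical predictions were produced")

    if missing.any():
        # Squared error of every produced value against each missing
        # subject's actual score; rows = missing subjects.
        err = (produced[None, :] - y_true[missing, None]) ** 2
        # Substitute the produced value that maximizes the error per subject.
        y_pred[missing] = produced[err.argmax(axis=1)]

    return float(np.mean((y_pred - y_true) ** 2))
```

Because each subject contributes independently to the mean, choosing the worst produced value subject by subject also maximizes the overall MSPE.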
Data Access: The fluid intelligence scores, raw T1-weighted MRIs, and derived data will be accessible to the challenge participants via the NDAR portal. To gain access to the data, please follow the four steps outlined in the tutorial. If NDAR approves your application, you will be allowed to download 500GB of the ABCD data for free. This credit can be used to download the raw baseline T1-weighted MRIs of the ABCD study or the corresponding derived data provided by the challenge organizers. Note that (as of today) the entire training and validation dataset can be downloaded, but only 3648 samples of the test dataset. Before the submission deadline, around 2.5K additional test samples will be made available. More specific download instructions will be posted soon.
December 17, 2018
January 15, 2019
March 10, 2019