How Autism Speaks is changing the future of autism research with open science



Autism refers to a broad range of neurodevelopmental differences characterized by challenges with social skills, repetitive behaviours, speech, and nonverbal communication. Autism affects an estimated 1 in 36 (about 2.8%) children in the United States. Autism Speaks is a non-profit organization dedicated to creating an inclusive world for all individuals with autism throughout their lifespan. It is the largest autism research organization in the United States. MSSNG (pronounced “missing”) is a groundbreaking collaboration between Autism Speaks, Verily, DNAstack, The Hospital for Sick Children (SickKids), and the research community to create the world’s largest whole genome sequencing database on autism with deep phenotyping.


Autism Speaks needed a software solution to support the processing and private sharing of whole genome sequence and deep phenotype data from over ten thousand people with autism and their relatives in MSSNG. They wanted a solution that is cloud-based and GA4GH-compliant and that can connect additional datasets from collaborating organizations around the globe, with the goal of creating the world’s largest federated network of data for autism research.


Autism Speaks leveraged Omics AI to harmonize processing and sharing of this data through Neuroscience AI. The solution uses Publisher to connect data, Explorer to share it, and Workbench to process it. Bioinformatics and visualization services were used to author workflows and generate interactive visualizations.


Autism Speaks partnered with DNAstack to help create and share a harmonized collection of whole genome sequences and deep phenotype data collected through MSSNG. The resulting dataset is controlled-access and available on Neuroscience AI, the world’s first federated network for autism research. To create this collection, Autism Speaks enlisted bioinformatics services to author an open-source pipeline for data processing. The pipeline runs automatically through Workbench and performs read alignment, quality control, haplotype calling, and joint variant calling. Genomic data and metadata are connected using Publisher and shared through Neuroscience AI, which is powered by Explorer. This collaboration has enabled hundreds of researchers to access one of the world’s largest genomic datasets of its kind, leading to novel insights about the biology of autism.


Samples analyzed


Core hours to process


Genes identified


Papers published


Researchers worldwide
Autism Speaks is excited to be part of this impressive consortium and to continue our work with DNAstack to develop federated systems that can help to responsibly share autism data. Because autism is complex and diverse, there is a pressing need to bring together a wide variety of data sources to understand autism and how best to serve the many needs of the autism community. This unique, collective thinking has the ability to deliver individualized care, guide health decisions for autistic individuals and families and serve as a driver for discovering new, effective therapies.
Dean Hartley
Senior Director of Genetic Discovery & Translational Medicine
