Neuroimaging data sets contain vast amounts of information. Source: J. T. Vogelstein et al.

Recent technological developments such as high-throughput imaging enable the collection of terabytes of data per day. These data cannot be stored and processed on conventional computer systems and must instead be processed on high-performance compute clusters.

To cope with such large datasets, researchers launched NeuroData, an open-access data repository powered by open-source web services that store, analyze and visualize data. The system is designed to analyze disparate datasets by reusing components originally designed for other applications. The repository holds nearly 100 public and private datasets, including large amounts of complex multidimensional data, from 30 collaborators.

NeuroData uses the Block Object Storage Service developed at Johns Hopkins Applied Physics Laboratory for rapid storage of petabytes of data in a secure cloud-based environment. As the code is open source, any user can download, set up and update this ecosystem.

Researchers from Johns Hopkins University, Seattle’s Allen Institute for Brain Science, Facebook, Janelia Research Campus (Ashburn, Virginia), Gigantum (Washington, DC) and Stanford University contributed to the development of the neuroscience data repository.

To contact the author of this article, email shimmelstein@globalspec.com