2016 NSLS-II & CFN Joint Users' Meeting

Workshop 2: Meeting the Challenges of Big Data Sets


Dmitri Zakharov, Center for Functional Nanomaterials (CFN-BNL); Shinjae Yoo, Computational Science Center (CC-BNL); Stuart Wilkins, National Synchrotron Light Source II (NSLS-II-BNL)


Berkner Hall, Bldg. 488, Conference Room B


The workshop will showcase state-of-the-art characterization and modeling techniques centered on the acquisition, analysis, and mining of large data sets. Recent innovations in instrument design and advances in automation have enabled the integration of multiple characterization techniques into a single multimodal instrument. Furthermore, technological breakthroughs, together with advances in applied mathematics and statistical algorithms, have enabled more sensitive detection, which aids the push toward higher spatial frequencies and finer angular and temporal resolution. As a result, new approaches generate very large data sets, which require efficient data acquisition, processing, visualization, and storage schemes. The purpose of this workshop is to bring together scientists and engineers from academia and industry working on synthesis, assembly, characterization, and modeling to open a dialog about the challenges of big data management at BNL. We hope that the discussion will contribute to the development of guidelines, best practices, and common solutions that address the needs of the scientific community.

Translating Big Data into Better Information for TEM: The Camera Evolves
Cory Czarnik, Gatan Inc.
(11:30 a.m. – 12:15 p.m.)

As we transition from the previous generation of cameras and detectors (e.g., CCD-based sensors, relatively slow frame rates, binned images to improve frame rate) to the current generation of cameras (large-format CMOS detectors, video-like frame rates, and extensive in-situ imaging), the demands on data workflow and data management are growing exponentially. Historically, it was only practical to record a “picture” of an event in the TEM (e.g., VHS tape, screen capture software, etc.), whereas today’s camera systems can store the data in each of perhaps 16 million pixels at tens of frames per second directly to disk for offline quantitative analysis. This is facilitated by the rapid increase in computational power, data transfer efficiency, and data storage capacity, while the cost of these capabilities has fallen to the point where they can be incorporated into a single (albeit high-end) PC with a commercial operating system. This has opened up many new applications, including identifying fine detail in structural biology specimens (including, recently, the Zika virus), 4D STEM applied to strain mapping in materials, sub-millisecond temporal resolution for in-situ reaction studies, and analysis of beam-sensitive zeolite and metal-organic framework samples. It has also simply improved the usability of cameras for “picture taking” through fast frame drift correction of images.

It is increasingly clear that a streamlined, robust, yet simple method of managing and analyzing large data sets needs to be developed in parallel with the applications themselves. For example, the structural biology community currently employs direct detection cameras that output 80 Gb/sec; these data are subsequently processed and reduced in size to make them useful for quick-turn analysis. While it may be straightforward to use a “brute force” method of storing, moving, and analyzing a single data set to demonstrate proof of concept for a given application, optimizing the workflow for each application will be required for widespread adoption and implementation.
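To give a sense of scale for the figures quoted above, the following minimal sketch estimates uncompressed data rates and storage needs. Only the 16 million pixels, the "tens of frames per second", and the 80 Gb/sec output come from the abstract; the 2 bytes per pixel, the choice of 25 frames per second, the reading of "Gb" as gigabits, and the function names are illustrative assumptions, not specifications of any particular camera.

```python
def raw_data_rate_bytes_per_s(pixels, bytes_per_pixel, frames_per_s):
    """Uncompressed camera output in bytes per second."""
    return pixels * bytes_per_pixel * frames_per_s


def storage_bytes(rate_bytes_per_s, seconds):
    """Storage needed to record at a constant rate for a given duration."""
    return rate_bytes_per_s * seconds


# From the abstract: roughly 16 million pixels per frame.
PIXELS = 16_000_000
# Assumption: 16-bit pixels (2 bytes each); the abstract gives no bit depth.
BYTES_PER_PIXEL = 2
# Assumption: 25 frames/s as one example of "tens of frames per second".
FRAMES_PER_S = 25

rate = raw_data_rate_bytes_per_s(PIXELS, BYTES_PER_PIXEL, FRAMES_PER_S)
per_minute = storage_bytes(rate, 60)

# Assumption: "80 Gb/sec" read as 80 gigabits per second (8 bits per byte).
direct_rate = 80_000_000_000 // 8
direct_per_minute = storage_bytes(direct_rate, 60)

GB = 1_000_000_000  # decimal gigabyte
print(f"16M-pixel camera: {rate / GB:.1f} GB/s, {per_minute / GB:.0f} GB per minute")
print(f"Direct detector:  {direct_rate / GB:.1f} GB/s, {direct_per_minute / GB:.0f} GB per minute")
```

Under these assumptions, even a single minute of recording runs to tens or hundreds of gigabytes, which is why on-the-fly reduction matters for quick-turn analysis.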


The Annual National Synchrotron Light Source II (NSLS-II) and Center for Functional Nanomaterials (CFN) Users' Meeting provides a venue for scientists from diverse disciplines who will use the NSLS-II and CFN facilities to share their work and discuss future directions for their research. New results and advances in experimental capabilities in synchrotron radiation and nanoscale science research will be highlighted.
Monday, May 23, 2016
8:30 am - 6:00 pm
Brookhaven National Laboratory
Center for Functional Nanomaterials
Upton, NY 11973
United States