Data and Knowledge Systems

[Figure: PyNGL demonstration]

As research datasets in the atmospheric and related sciences become more complex, diverse, and geographically distributed, next-generation knowledge tools are needed to manage, access, and analyze terabytes of data. To meet this challenge, CISL, in cooperation with partner agencies, centers, and universities, is developing an end-to-end simulation and analysis environment that will support a new era of scientific discovery.

CISL is increasing the breadth and richness of its key data holdings, delivering new datasets to the community, and evaluating new methods of digital preservation. CISL is also developing new software for data analysis and visualization, building portals to computational and data resources, and creating Grid-enabled cyberinfrastructure for professional collaboration.

[Figure: Comparison of two datasets]

Data curation

Building data services for an Earth system knowledge environment requires expanding NCAR’s Research Data Archive, which holds more than 600 datasets used in the geosciences. CISL is working to deliver new, very large datasets to the research community, including reanalyses, regional climate model results, and global weather ensembles. CISL also plays a leadership role in developing global, interoperable data systems and contributes to World Meteorological Organization efforts in this area.

CISL works continually to improve and expand NCAR’s Research Data Archive, preserve historical observational and analyzed collections, and make available the new products that leading-edge research requires. CISL is also collaborating with other institutions to establish standard reference datasets, preserve collections that evolve over time, and provide permanent storage for digital assets that might otherwise be lost.

[Figures: PyNGL visualization; VAPoR visualization]

Data analysis and visualization

High-performance computers running simulations of the climate, weather, oceans, cryosphere, and biosphere generate massive amounts of numerical data, but researchers must analyze this output to gain scientific insight from it. For this, they turn to scientific visualization, which represents complex numerical information as charts, graphs, images, and animations.

In its Analysis Environment Project, CISL is developing a cyberinfrastructure of high-performance visualization computers and shared file systems that will help scientists explore their data through interactive post-processing, analysis, and visualization. CISL is also developing versatile and scalable software tools for geoscientific visualization. PyNGL and PyNIO are Python interfaces to the NCAR Command Language’s graphics and file input/output libraries, respectively, while VAPoR is open-source software for exploring terascale datasets.

[Figures: SCD Portal; Community Data Portal browser; Earth System Grid points]

Portals and Grid computing

As part of its ongoing commitment to advancing information technology, CISL is developing and deploying specialized Web portals that provide access to geoscience data and computational resources via the Internet. These portals offer customizable interfaces, integration of functions from multiple systems, a single access point for services, and tools for scientific collaboration.

The SCD Portal is an easy-to-use online gateway to NCAR computational resources, providing access to supercomputer, queue, job-statistics, and charging information. CISL’s goal in developing this portal has been to integrate and simplify the tools scientists need to do their research.

The Community Data Portal (CDP) offers a central point of entry to the large and diverse data holdings of NCAR, UCAR, and the UCAR Office of Programs. The CDP will provide a broad spectrum of functionality, including data search and discovery, data catalog and metadata browsing, reliable high-performance data download, and server-side data processing.

CISL and a number of collaborating laboratories and universities released the Earth System Grid Web portal in the summer of 2004. Designed for general use by the climate modeling community, it allows easy access to the latest data from NCAR’s Community Climate System Model (CCSM). The Earth System Grid makes it easier to manage and access terascale climate-model data across high-performance, broadband networks. Users can browse data catalogs hierarchically, search metadata, download full files, or subset the virtual aggregated datasets.
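
The three user operations named above (hierarchical browsing, metadata search, and subsetting) can be illustrated with a small Python sketch. It is not the Earth System Grid's interface: the `Catalog` and `Dataset` classes, the `browse`, `search`, and `subset` functions, and all of the sample names and values are hypothetical, chosen only to show how the operations relate to one another.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical catalog entry: a name, metadata, and yearly values."""
    name: str
    metadata: dict
    values_by_year: dict  # year -> value; stands in for an aggregated series


@dataclass
class Catalog:
    """Hypothetical hierarchical catalog: child catalogs plus datasets."""
    name: str
    children: list = field(default_factory=list)
    datasets: list = field(default_factory=list)


def browse(catalog, depth=0):
    """Yield (depth, name) pairs while walking the hierarchy top-down."""
    yield depth, catalog.name
    for ds in catalog.datasets:
        yield depth + 1, ds.name
    for child in catalog.children:
        yield from browse(child, depth + 1)


def search(catalog, **criteria):
    """Return every dataset whose metadata matches all key/value pairs."""
    hits = [ds for ds in catalog.datasets
            if all(ds.metadata.get(k) == v for k, v in criteria.items())]
    for child in catalog.children:
        hits.extend(search(child, **criteria))
    return hits


def subset(dataset, first_year, last_year):
    """Return only the values inside the inclusive year range."""
    return {year: value for year, value in dataset.values_by_year.items()
            if first_year <= year <= last_year}


# Illustrative data only: the names and numbers are invented for this sketch.
run_a = Dataset("run_a_surface_temp",
                {"variable": "surface_temperature", "model": "example"},
                {2000: 14.1, 2001: 14.3, 2002: 14.2})
run_b = Dataset("run_b_precip",
                {"variable": "precipitation", "model": "example"},
                {2000: 2.6, 2001: 2.7})
root = Catalog("root",
               children=[Catalog("atmosphere", datasets=[run_a, run_b])])
```

In this sketch, `browse(root)` walks the catalog tree, `search(root, variable="precipitation")` filters on metadata, and `subset(run_a, 2001, 2002)` returns part of a series in place of the full file. A production portal would typically perform the subsetting on the server so that only the requested slice crosses the network.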