While some of the data stored on the NCAR MSS originate from field experiments and observations, the bulk of the data is generated by global climate-simulation models and other earth-science models that run on supercomputers, and SCD faces an increasing demand to archive data from ever-faster supercomputers. Essentially, the faster the supercomputer, the more data there are to be archived. Even greater demands for archiving data will result from the growing use of coupled atmospheric/oceanic simulation models.

| MSS growth statistics | eFY96 | eFY97 |
|---|---|---|
| Total storage (TB) | 82 | 110 |
| Total files (× 10⁶) | 2.9 | 3.9 |
| Net growth (TB per month) | 1.5 | 3.0 |
| Data read/written (TB per month) | 8 | 16¹ |
| Data migrated internally (TB per month) | 8 | 16 |
| Manual tape mounts (number per month) | 45,000 | 60,000 |
| Robotic tape mounts (number per month) | 40,000 | 50,000 |
| Offline cartridge count | 145,000 | 165,000² |
| GFLOPS on NCAR computing floor | ~5 | ~10 |

¹ 16 TB per month = 5 MB/sec

² All on IBM 3490 cartridge media

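As a rough check on footnote 1, a monthly data volume can be converted to a sustained transfer rate. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a 30-day month, none of which the table states; the function name is illustrative. Under those assumptions 16 TB per month works out to about 6 MB/sec, so the footnote's 5 MB/sec is best read as an approximate figure.

```python
SECONDS_PER_DAY = 86_400


def tb_per_month_to_mb_per_sec(tb_per_month, days_per_month=30):
    """Convert a monthly volume in TB to a sustained rate in MB/sec.

    Assumes decimal units and a fixed-length month (both assumptions,
    not taken from the report).
    """
    bytes_per_month = tb_per_month * 1e12
    seconds_per_month = days_per_month * SECONDS_PER_DAY
    return bytes_per_month / seconds_per_month / 1e6


rate = tb_per_month_to_mb_per_sec(16)
print(f"{rate:.1f} MB/sec")
```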
First, working with the NCAR/SCD Distributed Computing Services (DCS) project, the MSS group co-developed an interface between the MSS Master File Directory (MFD) and a set of POSIX-style commands. The MSS MFD is a metadata database that contains information on every file stored in the NCAR MSS. The DCS command set, placed into production in FY97, gives users standard UNIX-style commands for manipulating MSS files just as they manipulate UNIX files on their computers. In support of the DCS command set, a trash can feature was added to the NCAR MSS: deleted or purged MSS files are held in the trash can for a short period so that they can be retrieved. Also during FY97, a beta msrcp command was introduced. msrcp is similar to the UNIX rcp command, except that either the source or the target of an msrcp transfer is an MSS file. Like its UNIX counterpart, msrcp supports wildcards and recursive subdirectory descent.
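The trash can behavior described above can be modeled in a few lines. This is a minimal sketch, not the MSS implementation: the `TrashCan` class, the dictionary standing in for the MFD, the example path, and the 30-day retention period are all hypothetical, since the report says only that files are kept "for a short period of time".

```python
import time


class TrashCan:
    """Hypothetical model: deleted entries stay recoverable until they expire."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._entries = {}  # path -> (time of deletion, saved metadata)

    def delete(self, files, path, now=None):
        """Move an entry out of the live directory and into the trash."""
        now = time.time() if now is None else now
        self._entries[path] = (now, files.pop(path))

    def expire(self, now=None):
        """Permanently drop entries older than the retention period."""
        now = time.time() if now is None else now
        expired = [p for p, (deleted_at, _) in self._entries.items()
                   if now - deleted_at > self.retention_seconds]
        for p in expired:
            del self._entries[p]
        return expired

    def restore(self, files, path, now=None):
        """Return an entry to the live directory; KeyError if it is gone."""
        now = time.time() if now is None else now
        self.expire(now)
        _, metadata = self._entries.pop(path)
        files[path] = metadata


# Stand-in for the MFD: path -> metadata (assumed structure).
files = {"/USER/run01/output.nc": {"size_mb": 120}}
trash = TrashCan(retention_seconds=30 * 86_400)  # assumed retention period
trash.delete(files, "/USER/run01/output.nc", now=0)
trash.restore(files, "/USER/run01/output.nc", now=86_400)  # one day later
```

Passing `now` explicitly keeps the example deterministic; a real service would rely on the system clock and run expiry as a scheduled job.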
Second, the Mass Storage Control Processor (MSCP) was upgraded, increasing its CPU capacity by 50% and doubling the central memory.
Third, the remaining older MSS diskfarm disks were replaced with newer disk technology, increasing the total diskfarm capacity from 140 GB to 180 GB.
Fourth, a StorageTek Powderhorn ACS with SD-3 (Redwood) cartridge drives was placed into production, raising the online storage capacity to 300 terabytes (TB). In addition, manually mounted SD-3 cartridge drives were placed into production, increasing the offline capacity well beyond 1 petabyte (PB).
Finally, the NCAR MSS Group completed a working prototype of an MSS-IV Data Migration server. This server will support both the internal migration of MSS data within the storage hierarchy and the "oozing" (gradual migration) of existing archive data to new storage technologies.
In addition to 3480 and 3490E cartridge tapes, and 7- and 9-track round tapes, the NCAR MSS also offers import/export to single- and double-density Exabyte cartridge tapes.
MSS-IV is the design of the next-generation NCAR Mass Storage System. Based on the proven MSS-III design architecture, MSS-IV is meant to:
MSS-III can be decomposed into a set of functional components. These components are based on the IEEE Storage Reference Model Version 2. While adhering to this model, MSS-IV is being designed as a distributed system. An initial design requirement of MSS-IV was to eliminate the dependence on MVS mainframes and move toward a more heterogeneous, vendor-independent implementation.
Initially, the platform of choice will be smaller UNIX computers, but MSS-IV is not limited to these, and it will be deployable on MVS mainframes. MSS-IV will be implemented in a distributed computing environment where the functional components will be matched with an appropriate compute platform. That is, a data mover function will be deployed on a platform that is configured for bulk data transfer efficiency, while a database function will be placed on a platform that is configured for transaction processing efficiency.
MSS-IV extends the capabilities of the current MSS-III system. Certain device connections that are impossible on an MVS mainframe, such as SCSI-attached devices, can be achieved in MSS-IV. MSS-IV also allows vendor-supplied software to be integrated as new or improved functional components become available. Such a component can be as specific as a device interface or as extensive as a fully functional archive system.
In addition to advanced capabilities, the MSS-IV design extends the capacity of the system well beyond that of MSS-III. MSS-III must be deployed on a single machine. MSS-IV is a distributed design that can be deployed across multiple machines. Hence the total system capacity of MSS-IV can exceed what can be achieved on a single machine. In addition, the capacity of MSS-IV can be extended simply by replicating a functional component. For example, the total data migration function could be deployed on multiple machines that yield a higher aggregate data transfer capacity than a single machine could achieve.
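The scaling argument can be illustrated with a small model of replicated data movers. This is only a sketch: the `DataMover` type, the round-robin assignment policy, and the 5 MB/sec per-mover rate are assumptions chosen for illustration, not details of the MSS-IV design.

```python
from dataclasses import dataclass


@dataclass
class DataMover:
    name: str
    mb_per_sec: float  # hypothetical sustained transfer rate of one replica


def aggregate_rate(movers):
    """Upper bound on combined throughput when replicas work in parallel."""
    return sum(m.mb_per_sec for m in movers)


def assign_round_robin(files, movers):
    """Spread files across mover replicas in round-robin order."""
    plan = {m.name: [] for m in movers}
    for i, f in enumerate(files):
        plan[movers[i % len(movers)].name].append(f)
    return plan


movers = [DataMover(f"mover{i}", 5.0) for i in range(3)]
plan = assign_round_robin(["a", "b", "c", "d"], movers)
print(aggregate_rate(movers), plan)
```

In practice the achievable aggregate also depends on shared resources such as networks and tape drives, which this model ignores.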
The design of MSS-IV eliminates the need to build a complete MSS-IV system before it can be deployed. Therefore, MSS-IV will be deployed incrementally, requiring each MSS-IV component to interoperate with other components from both MSS-III and MSS-IV. This design allows for a user-transparent migration from MSS-III to MSS-IV in an orderly, incremental manner.
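One way to picture the incremental cut-over is a registry that records which generation currently serves each function. The sketch below is an assumption-laden illustration: the `ComponentRegistry` class and its methods are hypothetical, and the function names are borrowed loosely from the examples in the text.

```python
class ComponentRegistry:
    """Hypothetical record of which system generation serves each function."""

    def __init__(self, functions):
        # Every function starts out on the existing MSS-III system.
        self._provider = {f: "MSS-III" for f in functions}

    def cut_over(self, function):
        """Switch a single function to its MSS-IV component."""
        if function not in self._provider:
            raise KeyError(function)
        self._provider[function] = "MSS-IV"

    def provider(self, function):
        return self._provider[function]

    def fully_migrated(self):
        return all(p == "MSS-IV" for p in self._provider.values())


registry = ComponentRegistry(["data mover", "database", "data migration"])
registry.cut_over("data migration")  # deploy one component first
print(registry.provider("data migration"), registry.provider("database"))
```

Because callers look up the provider for each request, components from both generations can coexist until `fully_migrated()` returns true.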