
While some of the data stored on the NCAR MSS originate from field experiments and observations, the bulk of the data is generated by global climate-simulation models and other earth-science models that run on supercomputers, and SCD faces an increasing demand to archive data from ever-faster supercomputers. Essentially, the faster the supercomputer, the more data there are to be archived. Even greater demands for archiving data will result from the growing use of coupled atmospheric/oceanic simulation models.
The following table compares year-end statistics for FY1996, FY1997, and FY1998 with projected year-end statistics for FY1999 and FY2005. The FY2005 estimates assume a flat budget for supercomputing, historical data storage trends at NCAR, and Moore's Law growth in computer performance per unit cost. Even with the most optimistic vendor projections for storage densities and costs, these estimates indicate that the NCAR MSS would require between one and two dozen ACSs and that the annual MSS budget would exceed that for supercomputers.
| | eFY1996 | eFY1997 | eFY1998 | eFY1999⁴ | eFY2005⁴ |
|---|---|---|---|---|---|
| Total storage (TB) | 82 | 110 | 150 | 256 | 5,700 |
| Total files (×10⁶) | 2.9 | 3.9 | 5.1 | 8.5 | 190 |
| Net growth (TB per month) at eFY | 1.5 | 3.0 | 5.0 | 7.5 | 220 |
| Data read/written (TB per month) | 8 | 16¹ | 20 | 30 | 500 |
| Data migrated internally (TB per month) | 8 | 16 | 20 | 30 | 500 |
| Manual tape mounts (number per month) | 45,000 | 60,000 | 37,000 | 18,000 | 1,000⁵ |
| Robotic tape mounts (number per month) | 40,000 | 50,000 | 37,000 | 55,000 | 900,000⁵ |
| Offline cartridge count | 145,000 | 165,000² | 169,000³ | 169,000 | 85,000⁶ |
| GFLOPS on NCAR computing floor | ~5 | ~10 | ~20 | ~36 | ~1,000 |
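As a rough check on the projections in the table, the compound annual growth implied by the FY1999 and FY2005 columns can be computed directly. The sketch below uses only figures copied from the table; the function and variable names are illustrative and are not part of any MSS software.

```python
import math

# Year-end figures copied from the table above.
# FY1999 and FY2005 are projections, not measurements.
total_storage_tb = {1996: 82, 1997: 110, 1998: 150, 1999: 256, 2005: 5700}
gflops = {1996: 5, 1997: 10, 1998: 20, 1999: 36, 2005: 1000}


def annual_growth(series, start, end):
    """Compound annual growth factor between two fiscal years."""
    years = end - start
    return (series[end] / series[start]) ** (1.0 / years)


def doubling_time_years(growth_factor):
    """Years needed to double at a constant annual growth factor."""
    return math.log(2) / math.log(growth_factor)


if __name__ == "__main__":
    for name, series in (("storage", total_storage_tb), ("GFLOPS", gflops)):
        g = annual_growth(series, 1999, 2005)
        print(f"{name}: x{g:.2f} per year, "
              f"doubling every {doubling_time_years(g):.2f} years")
```

With these figures, both archived storage and installed computing capacity grow by a factor of roughly 1.7 per year between FY1999 and FY2005, which illustrates how closely the archive projection tracks the computing projection.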
First, working with the Distributed Computing Services project team, the MSS group co-developed a data transfer interface between the MSS and workstation/server machines. In FY1997 a beta version of the msrcp command was introduced. msrcp is similar to the UNIX rcp command, except that the source or target of an msrcp transfer is an MSS file. Like its UNIX counterpart, msrcp supports wildcards and recursive descent into subdirectories. In FY1998, msrcp became a production interface.
Second, additional ESCON channels were added to the Mass Storage Control Processor (MSCP) and additional ports were added to the ESCON director switch. These channels and ports will be used to support additional ESCON-attached tape drives to increase MSS performance and to evaluate emerging tape technologies.
Third, serial High Performance Parallel Interfaces (HiPPI) were integrated into the MSS High Performance Data Fabric (HPDF). Serial HiPPI is an optical-fibre-based connection that links compute servers to the MSS HPDF. It has many advantages over parallel copper HiPPI, including lower deployment costs and smaller, more reliable connectors and cables.
Fourth, the ability to automatically create multiple copies of MSS files was implemented to address tape transport and media reliability issues encountered in late FY1997 and early FY1998. The NCAR MSS now has the capability of creating from 1 to 32 copies of every file written into the MSS. Currently, the MSS will create two copies of selected MSS files to maintain file error and loss rates much better than one in 100 million.
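The effect of multiple copies on the loss rate can be illustrated with a small calculation. This is only a sketch: it assumes that copies fail independently, and the per-copy loss probabilities used below are hypothetical values chosen for illustration, not measured MSS figures. Only the 1-to-32 copy range and the one-in-100-million target come from the text above.

```python
MAX_COPIES = 32  # upper limit on copies per file stated in the text


def loss_probability(per_copy_loss, copies):
    """Probability that every copy of a file is lost.

    Assumes independent failures, which is an idealization: copies that
    share a drive type or media batch may fail together.
    """
    if not 1 <= copies <= MAX_COPIES:
        raise ValueError("copies must be between 1 and 32")
    return per_copy_loss ** copies


def copies_needed(per_copy_loss, target_loss):
    """Smallest copy count whose combined loss probability meets the target.

    Returns None if even 32 copies do not reach the target.
    """
    for copies in range(1, MAX_COPIES + 1):
        if loss_probability(per_copy_loss, copies) <= target_loss:
            return copies
    return None


if __name__ == "__main__":
    target = 1e-8  # one in 100 million
    for p in (1e-3, 1e-5, 1e-9):  # hypothetical per-copy loss probabilities
        print(p, copies_needed(p, target))
```

Under the independence assumption, a second copy squares the per-copy loss probability, which is why two copies can be enough for selected files when the media are reasonably reliable on their own.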
Finally, the NCAR MSS Group completed a production MSS-IV Data Migration server. This server supports the internal migration of MSS data in the storage hierarchy and the oozing of existing archive data to new storage technologies. A Silicon Graphics Cray Origin2000 four-processor system was purchased to support the new server code and will be deployed for production in early FY1999.
Optical-fibre-based serial HiPPI was introduced into the HPDF in FY1998. Serial HiPPI has many advantages over parallel copper HiPPI implementations, including lower deployment cost and smaller, more reliable connectors and cables. Older parallel copper HiPPI interfaces will be phased out as the host machines supporting those interfaces are replaced or retired.
HiPPI technology continues to be deployed only in a niche market. It has not shown signs of spreading into the commodity marketplace; as a result, the cost of HiPPI technology has remained high and the number of HiPPI vendors is dwindling. The lack of availability and support of HiPPI technology is becoming a critical issue for the continued operation of the MSS. Replacement technologies are on the horizon, but they are not yet widely available, nor are they functional enough to replace HiPPI immediately. The most promising replacement technologies are Fibre Channel and Network Attached Storage Devices. Fibre-Channel-attached RAID units are available today at extremely attractive costs. During FY1998 the MSS Group evaluated Fibre Channel RAID technology, which will be deployed in FY1999 to supplement the disk capacity of the DataPark. Over the next few years, the number and types of available Fibre-Channel-attached devices are expected to grow and to include tape storage. Once tape devices can be Fibre Channel attached, SCD intends to evaluate replacing our HiPPI fabric with Fibre Channel.
Network Attached Storage Devices (NASD) are another emerging technology that SCD is tracking closely. Today a handful of vendors supply Network File System (NFS)-based NASD products. Some vendors are developing "local-disk"-attached NASD products using Fibre Channel and HiPPI connections. SCD's current strategy is to deploy a Fibre Channel infrastructure and add NASD to it at a later time. The end result will be the decommissioning of our HiPPI fabric and our ESCON and BMX storage devices, and the wholesale replacement of those older technologies with new, vendor-supported (and hopefully standards-based) technologies.
"Migration" refers to the massive task of transferring tens of terabytes of data from old media to modern media before the equipment that uses the old media becomes obsolete. This task by itself is straightforward; however, this data migration must be handled as a background task while the processing and storage components of the system remain fully dedicated to supplying prompt, 24-hour-per-day service to users. When the migration is complete, the total capacity of the offline archive (assuming no reduction in the offline archive's available floor space in the SCD machine room) will exceed 1 petabyte. The migration was started near the end of FY1998 and is expected to take 2 to 3 years to complete.
Expansion of the MSS storage hierarchy is planned over the next five years through the introduction of new tape technologies and new ACSs, and through the integration of a front-end file server with its own HSM to offload active and temporary data. The MSS Archive will become a back-end store for the file server, accessed only by the front-end HSM. A single global name space will be provided for all data managed by SCD. Evaluation of HSM solutions began in FY1998 and will continue in FY1999.
Options to exchange data with smaller satellite storage systems are being investigated. Using this technique, data generated at NCAR could be transferred to remote sites for further analysis. The NCAR SCD storage model would thus be geographically distributed, rather than centrally located and administered.
In addition to 3480 and 3490E cartridge tapes and 9-track round tapes, the NCAR MSS also offers import/export to single- and double-density Exabyte cartridge tapes. The deployment of an MSS-IV Import/Export server in FY1999 will make it possible to support many more device types, such as CD-ROM, DAT, and newer Exabyte media.
MSS-IV is the design of the next-generation NCAR Mass Storage System. Based on the proven MSS-III design architecture, MSS-IV is meant to:
MSS-III can be decomposed into a set of functional components. These components are based on the IEEE Storage Reference Model Version 2. While adhering to this model, MSS-IV is being designed as a distributed system. An initial design requirement of MSS-IV was to eliminate the dependence on MVS mainframes and move toward a more heterogeneous, vendor-independent implementation.
Initially, the platform of choice will be smaller UNIX computers, but MSS-IV is not limited to these, and it will be deployable on MVS mainframes. MSS-IV will be implemented in a distributed computing environment where the functional components will be matched with an appropriate compute platform. That is, a data mover function will be deployed on a platform that is configured for bulk data transfer efficiency, while a database function will be placed on a platform that is configured for transaction processing efficiency.
MSS-IV extends the capabilities of the current MSS-III system. Certain device connections that are impossible on an MVS mainframe, such as SCSI-attached devices, can be achieved in MSS-IV. MSS-IV also allows vendor-supplied software to be integrated as new or improved functional components become available. Such a component can be as specific as a device interface or as extensive as a fully functional archive system.
In addition to advanced capabilities, the MSS-IV design extends the capacity of the system well beyond that of MSS-III. MSS-III must be deployed on a single machine. MSS-IV is a distributed design that can be deployed across multiple machines. Hence the total system capacity of MSS-IV can exceed what can be achieved on a single machine. In addition, the capacity of MSS-IV can be extended simply by replicating one or more of its functional components. For example, the total data migration function could be deployed on multiple machines that yield a higher aggregate data transfer capacity than a single machine could achieve.
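The capacity argument above can be sketched numerically. The example below is a simplified model, not a description of the MSS-IV implementation: it assumes that replicated data movers scale linearly with no shared bottleneck (network, tape drives, or metadata), and the mover rates and data volume are hypothetical values chosen only to make the arithmetic easy to follow.

```python
SECONDS_PER_DAY = 86_400
MB_PER_TB = 1_000_000  # decimal units


def aggregate_rate_mb_s(mover_rates_mb_s):
    """Combined transfer rate of all data movers, assuming linear scaling."""
    return sum(mover_rates_mb_s)


def days_to_migrate(total_tb, mover_rates_mb_s, duty_cycle=1.0):
    """Days needed to move total_tb of data with the given movers.

    duty_cycle is the fraction of time the movers spend on migration,
    since migration runs in the background behind user requests.
    """
    if not 0 < duty_cycle <= 1:
        raise ValueError("duty_cycle must be in (0, 1]")
    rate = aggregate_rate_mb_s(mover_rates_mb_s) * duty_cycle
    if rate <= 0:
        raise ValueError("at least one mover with a positive rate is required")
    return total_tb * MB_PER_TB / rate / SECONDS_PER_DAY


if __name__ == "__main__":
    volume_tb = 86.4            # hypothetical volume to migrate
    one_mover = [10.0]          # hypothetical rate in MB/s
    two_movers = [10.0, 10.0]   # same component replicated on a second machine
    print(days_to_migrate(volume_tb, one_mover))
    print(days_to_migrate(volume_tb, two_movers))
```

In this model, replicating the data migration component on a second machine halves the elapsed time. In practice the gain depends on whether the movers contend for shared devices.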
The design of MSS-IV eliminates the need to build a complete MSS-IV system before it can be deployed. Therefore, MSS-IV will be deployed incrementally, requiring each MSS-IV component to interoperate with other components from both MSS-III and MSS-IV. This design allows for a user-transparent migration from MSS-III to MSS-IV in an orderly, incremental manner.
FY1998 saw the deployment of the first production MSS-IV server, a data migration server, along with the underlying infrastructure upon which other MSS-IV servers will be built. An import/export Exabyte server is scheduled for deployment in FY1999. A metadata server will be designed in FY1999 and a storage server the following year.