1998 ASR Home
Back
SCD ASR Index
Next
SCD Home

Maintaining the existing production supercomputer environment

Although the integration of Distributed Shared Memory (DSM) computer systems into both the Climate Simulation Laboratory (CSL) and Community computational environments brought significant changes during FY1998, SCD continued to maintain and enhance its existing production parallel-vector supercomputer (PVP) environment. In FY1999 SCD will continue to support the established PVP environment, but will devote more resources to maintaining and enhancing the DSM systems and user environments that became part of the NCAR computational environment in FY1998.

During FY1998 the major changes in the SCD production supercomputer environment were:

During FY1998, SCD integrated new DSM systems into the existing parallel-vector supercomputer production environment and worked to make the computational environment as seamless as possible for CSL and Community users. Although SCD introduced two DSM computer systems, the Silicon Graphics Cray Origin2000 (ute) into the CSL environment and the HP SPP-2000 (sioux) into the Community environment, SCD remains committed to supporting production parallel-vector supercomputer systems at least through FY1999, and likely well beyond. Historically, these parallel-vector systems have served the computational needs of NCAR and the atmospheric/oceanic sciences, and SCD will continue to support them to minimize the impact of introducing new computing architectures and systems.

At the end of FY1998, the "production computational environment" managed by SCD for NCAR includes five Cray supercomputers, known generically as "Parallel Vector Processor" or PVP systems: a C90/16 (antero), a J90/20 (aztec), a J90/16 (paiute), and a pair of J90se/24 systems (ouray and chipeta). It also includes two Distributed Shared Memory supercomputers, the Silicon Graphics Cray Origin2000 (ute) and the HP SPP-2000 (sioux); the NCAR Mass Storage system; the HiPPI data communications fabric and networking facilities; the DataPark, a Silicon Graphics PowerChallenge XL (winterpark); a test system, a Silicon Graphics Cray Origin2000 (mouache); fileservers; and the Visualization Lab. This environment's reliability is enhanced by the systems support, operational monitoring, and user services activities provided by SCD staff.

One of the most important aspects of maintaining the existing production supercomputer environments is providing 7x24 operation and service. The following tables show average system performance and utilization for FY1998:

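Average Community supercomputer system performance and utilization statistics for FY1998

| System | GFLOPS | Utilz'n | User | Idle | System | WaitIO | IOfs | IOswp |
|---|---|---|---|---|---|---|---|---|
| chipeta | 1.605 | 92.7% | 95.2% | 1.6% | 3.2% | 0.2% | - | - |
| ouray | 1.498 | 92.4% | 93.9% | 2.3% | 3.8% | 0.2% | - | - |
| paiute | 0.871 | 87.0% | 87.9% | 8.4% | 3.7% | 3.7% | - | - |
| sioux | ~2.0 | 51.5% | 52.5% | 47.5% | - | - | - | - |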

In the tables in this section, the first "System" column gives the system name, and the remaining columns are defined as follows: "GFLOPS" is the average number of floating point operations per second (in billions) during the measuring period; "Utilz'n" is the average user utilization of the system (system downtime counts against utilization); "User" is the percent of uptime spent performing computation for user processes; "Idle" is the percent of uptime spent idle; the second "System" column is the percent of uptime consumed by system overhead; "WaitIO" is the percent of uptime spent awaiting I/O completion; "IOfs" is the percent of the WaitIO time spent performing user filesystem I/O; and "IOswp" is the percent of the WaitIO time spent performing process swapping/paging.
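
The column definitions suggest two simple consistency checks, sketched below in Python using the Community table values. This is an illustration, not part of SCD's reporting method. It assumes that "Utilz'n" equals "User" scaled by the fraction of time the system was up (an inference from the statement that downtime counts against utilization), and that User, Idle, System, and WaitIO together account for all uptime. The function names are hypothetical.

```python
# Rows from the Community table: (name, Utilz'n, User, Idle, System, WaitIO),
# all in percent. None marks a value the table reports as "-".
ROWS = [
    ("chipeta", 92.7, 95.2, 1.6, 3.2, 0.2),
    ("ouray", 92.4, 93.9, 2.3, 3.8, 0.2),
    ("paiute", 87.0, 87.9, 8.4, 3.7, 3.7),
    ("sioux", 51.5, 52.5, 47.5, None, None),
]


def implied_uptime(utilization, user):
    """Percent of time the system was up, ASSUMING
    utilization = user * (uptime fraction). This relation is inferred
    from the column definitions, not stated in the report."""
    return 100.0 * utilization / user


def breakdown_total(*parts):
    """Sum of the uptime categories, skipping unreported ("-") values.
    ASSUMPTION: the categories are meant to be exhaustive, so the
    total should be close to 100%."""
    return sum(p for p in parts if p is not None)


if __name__ == "__main__":
    for name, util, user, idle, system, wait_io in ROWS:
        up = implied_uptime(util, user)
        total = breakdown_total(user, idle, system, wait_io)
        print(f"{name:8s} implied uptime {up:5.1f}%  breakdown total {total:5.1f}%")
```

Under these assumptions, most rows total within a few tenths of 100%, while the paiute row totals 103.7%. The table alone does not indicate whether that reflects rounding, a different accounting of WaitIO, or something else.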


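Average CSL supercomputer system performance and utilization statistics for FY1998

| System | GFLOPS | Utilz'n | User | Idle | System | WaitIO | IOfs | IOswp |
|---|---|---|---|---|---|---|---|---|
| antero | 4.730 | 90.7% | 92.6% | 3.5% | 3.8% | 0.5% | - | - |
| T3D | ~1.1 | 73.1% | 75.9% | 24.1% | - | - | - | - |
| aztec | 1.209 | 93.7% | 95.3% | 2.3% | 2.4% | 0.1% | - | - |
| ute | ~4.5 | 66.8% | 68.1% | 28.2% | 2.2% | 1.3% | 89.4% | 0.1% |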



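Average DataPark and test system performance and utilization statistics for FY1998

| System | GFLOPS | Utilz'n | User | Idle | System | WaitIO | IOfs | IOswp |
|---|---|---|---|---|---|---|---|---|
| mouache | ~0.2 | 1.8% | 1.8% | 96.3% | 1.3% | 0.5% | 79.7% | 19.4% |
| winterpark | ~0.4 | 18.7% | 18.8% | 57.9% | 6.3% | 16.5% | 86.6% | 12.9% |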



Key maintenance activities

During FY1998, SCD provided ongoing maintenance activities to ensure the integrity and reliability of existing computational systems. Some of the key areas were:
