The procurement of ARCS, NCAR's Advanced Research Computing System

In February 2000, SCD began drafting the technical requirements for new supercomputing equipment to support the NCAR Community and the Climate Simulation Laboratory (CSL), in preparation for issuing an open and competitive procurement. SCD engaged the UCAR Contracts office and requested that each of the NCAR division directors appoint a technical representative who could participate with SCD in drafting the Request For Proposal (RFP). The RFP2000 project, as it was known then, was formally inaugurated on March 10, 2000, with the first meeting of the Technical Committee. The committee comprised 18 SCD staff members, 8 scientific advisors from NCAR's science divisions, and 4 members from UCAR Contracts. Later in the process, three members of the NCAR community were added as an External Review Team. The committee soon agreed to give the procurement effort a more appropriate name: the NCAR Advanced Research Computing System (ARCS).

The initial draft of the ARCS technical requirements was assembled by SCD; these were the requirements it believed would best serve the users of NCAR's computing facility. In addition, the committee examined requirements from the 1995 NCAR Accelerated Computing Environment (ACE) RFP and from recent RFPs issued by a number of peer centers in North America and Europe, including the Geophysical Fluid Dynamics Lab (GFDL), the National Centers for Environmental Prediction (NCEP), and the NOAA Forecast Systems Laboratory (FSL). Throughout the summer of 2000, the ARCS committee evaluated the scientific needs of the NCAR Community and the CSL and forged those into explicit requirements for the ARCS RFP. In the meantime, the ARCS benchmark suite, including performance assessment kernels, system and I/O subsystem tests, and representative models run at NCAR, was being assembled and tested in preparation for release with the RFP.
A Vendor Review Draft of the RFP was released for comment in late August 2000. This attracted the interest of 14 prospective offerors, all of whom were placed under a UCAR non-disclosure agreement. Of those interested offerors, 9 responded with comments. As the final revisions were made to the RFP documents and the benchmark suite was being completed, SCD established a secure website for distributing the RFP documents to prospective offerors; without the website, these materials and the ARCS benchmark suite would have required 5 CD-ROMs and over 200 pages of printed material per interested offeror. On November 1, 2000, the ARCS RFP was released with a due date for proposals of January 9, 2001. The RFP requested three- and five-year proposals, and stated: "The ARCS should provide a highly productive computational environment for the development and execution of complex, long-running, computationally intensive earth system models. The paramount objective is to advance atmospheric and related-science research across broad fronts (e.g., turbulence, micro- and mesoscale meteorology, weather, atmospheric chemistry, ocean modeling, climate prediction, paleoclimate, upper atmosphere, solar-terrestrial interactions, the solar atmosphere and its interior, etc.). NCAR seeks the highest level of computing capacity and capability to address this central objective." The due date for the initial proposals was extended by two weeks, to January 23, 2001, at the request of several offerors and in consideration of the Christmas and New Year holidays. On December 6, 2000, SCD hosted a Pre-Proposal Vendor Conference at NCAR's Mesa Lab to give prospective offerors the opportunity to view the computer room infrastructure and physical plant and to provide a uniform set of questions and answers.
All formal communications with prospective offerors, including amendments to the RFP and benchmark suite, additional questions posed by the offerors, and formal responses by the committee, were conducted via the secure website. By SCD's estimate, the secure website provided a uniform mechanism for communications and expedited the process significantly. A number of prospective offerors commended NCAR for using this technology both to deliver information to them and to let them submit their proposals. While awaiting proposal submission, SCD developed spreadsheets to be used by the three ARCS subcommittees, or teams (business, price, and technical), to objectively assess each of the proposals. These spreadsheets incorporated an automatic scoring mechanism tailored to the evaluation specifications provided by UCAR Contracts. The technical evaluation spreadsheet included all 416 attributes of the technical requirements of the RFP. Of the 19 prospective offerors, 3 submitted proposals for the ARCS system on January 23, 2001. Those three offerors were Compaq, IBM, and SGI. SCD had developed a second secure website for the exclusive use of ARCS committee members to electronically access all offeror-supplied information, including proposals, benchmark results, and pricing and business information. Access control mechanisms were put in place so that during the proposal evaluation and assessment phase, the technical team could access only the technical components of the proposals, thus preventing the business and price proposals from influencing the technical proposal evaluations. Each of the 14 members of the ARCS technical team conducted a thorough assessment of each of the technical proposals and used the technical evaluation spreadsheet to submit an individual assessment of each proposal.
These 42 spreadsheets were electronically combined and analyzed to produce summary reports of the strengths, weaknesses, and deficiencies of the proposals. Additionally, the benchmark results were analyzed and used to generate performance projections for each model included in the benchmark suite for each of the three proposals. In a series of meetings, these results were presented to and considered by the ARCS committee. The ARCS committee unanimously agreed that all proposals were relatively disappointing and sought a second round of proposals, termed BAFOs (Best And Final Offers), due March 26, 2001. While awaiting the BAFO proposals, the ARCS committee continued its analysis of the initial proposals and benchmark results and submitted sets of clarification questions to the offerors. Upon receipt of the BAFO proposals, the ARCS committee repeated the evaluation and assessment process conducted on the previous proposals. New performance projections were generated, presented, and considered by the committee. The committee also reviewed offeror responses to clarification questions and issued new sets of clarification questions to the offerors. Though the BAFO proposals were more favorable, the committee still felt that the offered equipment fell disappointingly short of expectations, with too little equipment delivered early and major upgrades delivered too late in the contract window. Thus a third set of proposals was solicited from the three offerors, with common guidance that NCAR sought to at least double computational capacity with the first equipment drop and to achieve sustained performance exceeding 1 TFLOPS by 2005. The due date for these Supplemental Proposals was set at May 17, 2001. Two of the three offerors provided revised proposals on May 17, while the third advised that its final proposal was unchanged from its BAFO proposal.
The ARCS committee again evaluated the proposals, reviewed new performance projections based on them, and in a series of meetings during May and June debated the relative merits of the supplemental proposals. SCD presented the results to the NCAR Directors and members of the SCD Advisory Panel in June, and a formal recommendation was presented to UCAR President Rick Anthes, NCAR Director Tim Killeen, and the NCAR Directors on June 21, 2001. SCD compiled an ARCS Evaluation and Recommendation Report and submitted it as supplementary supporting information on July 5. NCAR and UCAR management concurred with the ARCS committee's and SCD's recommendations, approving that contract negotiations first be entered with IBM. These negotiations took place in late July, with final contractual terms and conditions completed on August 14. UCAR Contracts submitted the IBM ARCS Contract to the National Science Foundation on September 14, and NSF approved the contract on October 5, 2001.

The ARCS RFP results

The ARCS system, as contracted with IBM, will provide a phased introduction of new computational, storage, and communications technologies through the life of the contract. This will allow NCAR's Scientific Computing Division to maintain a stable, state-of-the-art production facility for the next three to five years. The initial delivery augments the current blackforest system by more than doubling its computational capacity, from 0.9 to 2.0 peak TFLOPS, and provides a five-fold increase in disk storage capacity. A second delivery, in September 2002, will introduce IBM's next-generation processor (POWER4), node (Regatta), and switch (Colony) technologies, adding almost 5 peak TFLOPS, upgraded switch communications, and 21 TB of new disk storage. This system will be called bluesky. In the fall of 2003, the Colony switch will be replaced with IBM's next-generation Federation switch technology, which provides much lower latency and higher bandwidth than the Colony switch.
If NCAR chooses to exercise the two-year contract extension option, in the fall of 2004 the bluesky system will be upgraded with an additional 4 peak TFLOPS and 32 TB of new disk storage. The following tables summarize the major attributes of the ARCS systems through the contract lifetime:
The contract established minimum capability performance requirements and a continuing process for maintaining current versions of NCAR models as part of the suite of codes used to measure system performance. The minimum model capability requirements, which can also be thought of as model speed-up relative to blackforest, are 1.0x for the blackforest upgrade, 3.1x for bluesky, and 4.6x for the bluesky upgrade. If IBM fails to meet these capability requirements, a corresponding additional equipment delivery will be required, increasing the total capacity, and thus the peak TFLOPS, of the system.

IBM and SCD have agreed to work together to improve the user environment and user support services that will be provided to NCAR and CSL. This agreement covers many aspects of the ARCS, including on-site IBM applications specialists, training in advanced programming, performance analysis and tuning techniques, and a more efficient process for reporting, escalating, and resolving compiler and tools problems. The agreement with IBM also includes early access to new hardware and software technologies, Live Test Demonstrations of those technologies prior to equipment delivery, new software feature and tools development, training for systems engineers and operators, and specialized training tailored to the needs of the user communities served by NCAR.

Additionally, the agreement with IBM will provide the opportunity for NCAR to participate in IBM's Blue Light HPC project. Blue Light, like IBM's Blue Gene project, is an exploratory effort of IBM's Exploratory Server Systems department at IBM Research to develop future PetaFLOPS supercomputing systems. NCAR's collaboration with IBM in Blue Light holds the promise of significant and revolutionary advancements in climate, weather, and Earth systems models, and will provide IBM with valuable input on hardware and software design.
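The minimum capability requirements described above (1.0x, 3.1x, and 4.6x relative to blackforest) can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: it treats speed-up as the ratio of wall-clock run times for the same model, which the text does not spell out, and the function names and the timings in the usage lines are hypothetical rather than taken from the contract or the benchmark results.

```python
# Minimum model speed-ups relative to blackforest, as stated in the text.
MIN_SPEEDUP = {
    "blackforest upgrade": 1.0,
    "bluesky": 3.1,
    "bluesky upgrade": 4.6,
}


def speedup(baseline_seconds: float, measured_seconds: float) -> float:
    """Speed-up of a model run relative to the blackforest baseline.

    Assumption: speed-up is the ratio of wall-clock times for the same model.
    """
    if baseline_seconds <= 0 or measured_seconds <= 0:
        raise ValueError("run times must be positive")
    return baseline_seconds / measured_seconds


def meets_requirement(phase: str, baseline_seconds: float,
                      measured_seconds: float) -> bool:
    """True if the measured speed-up reaches the phase's stated minimum."""
    return speedup(baseline_seconds, measured_seconds) >= MIN_SPEEDUP[phase]


# Hypothetical timings, for illustration only (not benchmark data).
baseline = 1000.0  # seconds on the original blackforest
print(meets_requirement("bluesky", baseline, 300.0))
print(meets_requirement("bluesky upgrade", baseline, 250.0))
```

With these made-up timings, the first run is about 3.3 times faster than the baseline and clears the 3.1x minimum for bluesky, while the second is 4.0 times faster and falls short of the 4.6x minimum for the bluesky upgrade. The sketch does not attempt to size the additional equipment delivery a shortfall would trigger, because the text gives no formula for it.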
The first ARCS equipment delivery, the blackforest upgrade, occurred on Friday, October 5, 2001, with the delivery of 13 SP frames and pallets of cables and support equipment. By SCD's estimate, once the blackforest upgrade equipment is online, NCAR's IBM SP system will rank number 8 on the TOP500 Supercomputer Sites list (see http://www.top500.org/lists/2001/06/top500_0106.pdf). While SCD expects NCAR to slip out of the top ten by the end of 2001 as other centers upgrade their systems, the installation of the bluesky system in the second phase of ARCS, with IBM's POWER4-based processors, has the potential to place NCAR within the top five. SCD looks forward to the future of high-performance computing and to maintaining a state-of-the-art computational and storage facility through the next three to five years. It is SCD's intent that the ARCS will not only maintain, but also enhance, NCAR's leadership role in Earth science research.