Technical support services
Providing software engineering and math libraries support for scientists
using NCAR/SCD's high-performance scientific computing facilities is
the core business and mission of SCD's Technical Consulting Group (TCG).
This group is the first point of contact for users with questions and
concerns about their scientific computing efforts. It provides the user
community with a centralized interface for resolving technical problems,
advising users on optimal software design and implementation
techniques, and channeling needs expressed by users into SCD's planning
process. When the assistance of a specialist from another SCD section
is required to resolve a problem, TCG coordinates SCD efforts and
manages the follow-through with the user. Collaborations with other
SCD groups, users, vendors, and other high-performance computing
centers are central to maintaining the expertise required to support
this mission.
In addition to TCG's core business, the group identified several
near-term projects and goals which are critical to SCD's mission and
which require special attention because of recent developments in
supercomputing and scientific computing technology. While striving to
absorb the impact of the September 1997 staff reduction, TCG has made
progress on many of these projects; their current status is described below.
Distributed Shared Memory (DSM) architecture computer systems currently
appear to be the most likely future of high-performance computing at NCAR.
TCG is striving to develop a level of expertise with these architectures
similar to what is currently available for Parallel-Vector
architectures. TCG has accomplished the following toward this goal:
- arranged a weekly series of short seminars on SGI's distributed
shared memory implementation.
- produced User Guides for both the SGI (ute) and HP (sioux)
systems.
- worked with both HP and SGI to identify and isolate many software
problems for resolution.
- tracked progress and developments in the OpenMP standards
and their availability from vendors.
- provided math libraries support, maintenance, verification, and
research.
Supporting the production computing environment per TCG's core mission
statement has continued to be critical to the success of the
Scientific Computing Division. Historically, SCD's best asset in the
eyes of the user community has been user support. TCG maintains the
highest standards in user responsiveness, as well as diligence in
system test and checkout to guarantee a stable and productive work
environment for our users. TCG recognizes the need to expand user
outreach, collaboration, and individualized service, and to improve
their quality.
In addition to UNICOS operating system support on Crays, SCD now provides
support for new platforms running Irix and SPP-UX. Last year, SCD
established a new "Data Park" platform, "winterpark," an SGI Power
Challenge running the Irix operating system. Winterpark was configured
to provide users with fast, free access to the Mass Storage System and
ample disk space for postanalysis of data. TCG helped early users
adjust to the new operating system, and produced a winterpark user
guide. User demand for this system has been overwhelming.
TCG is responsible for testing the user environment in cooperation
with HPS to ensure that operating system and programming environment
software upgrades have minimal impact on user productivity. To this
end, TCG was involved in every operating system and compiler environment
software installation. As problems were uncovered, TCG
collaborated with HPS to develop a strategy for testing and isolating
each problem, based on whether it lay within the operating
system or within the programming environment. For programming
environment issues, TCG takes the lead in characterizing the problem
for the vendor and following the bug fixes through the pipeline.
There have been significant developments in the NCAR Graphics product,
NCAR Command Language (NCL), over the last year. TCG increasingly
consults on NCL as its use among users grows. In collaboration with
scientists from the NCAR Climate and Global Dynamics Section and the
SCD Graphics and Data Analysis Group, TCG helped design and review
"getting started" documentation to help new users become productive
more quickly with this powerful but complex graphics and analysis
application.
TCG staff have contributed a significant amount of programming effort
to the Distributed Computing Services (DCS) project over the last
several years. The
goal of the DCS effort is the redesign of the user interface to the
Mass Storage System. In addition to the initial MSS metadata
commands, file transfer commands and support were completed this
fiscal year and are seeing widespread use. Several divisions within
NCAR are currently in the process of adding DCS to their divisional
servers.
Over the last year, approximately 1 million metadata commands were
serviced via the DCS system. During a six-week period this past summer,
1.4 terabytes of data were transferred using the new MSS file access
method.
The final stage of DCS implementation, the importing and exporting of
data between various media and the MSS, is in the initial testing phase.
It is expected that this feature will be completed during the first
half of the next fiscal year.
TCG has been incrementally adding information on RISC-based
processors, DSM technology, and cache optimization to its
documentation, as well as updated information on the Fortran and
OpenMP standards. Cray documentation is now available on the Web from
SCD as well. Documentation in these areas has received much attention
from the TCG staff in order to allow users immediate access to
information on their own.
TCG includes external collaborations on its list of goals because
community leadership and a high level of awareness of industry trends
are critical to providing the SCD users with the information they need
to stay current with new hardware and software developments such as
those involved with the recent move toward DSM architectures. Over the
last year, TCG has maintained a presence at the Supercomputing
conferences, in user groups, and in standards organizations, including
the Parallel Tools Consortium and the High Performance Debugging Forum.