by Juli Rew
Today's Internet potentially offers
gigabit-per-second bandwidth, but few users get even a small fraction of that,
unless they have networking wizards performing hand-tuning of the network at both
ends of the application. SCD researchers, along with their partners at the Pittsburgh
Supercomputing Center (PSC) and the National Center for Supercomputing Applications
(NCSA) at the University of Illinois, are embarking on a National Science Foundation-funded
project called Web100 that will help ordinary users exploit 100% of the available
network bandwidth.
Providing end-to-end performance
The main way data is transferred from
one networked computer to another is via the venerable Transmission Control Protocol/Internet
Protocol (TCP/IP). IP is implemented in all of the end systems and the routers
and acts as a relay to move packets of data from one host, through one or more
routers, to another host. TCP keeps track of the packets of data to ensure that
all are delivered reliably and in order to the appropriate hosts. Unfortunately,
the default TCP configurations used by most end-systems may not be appropriate
for the available network bandwidth. Furthermore, a message may traverse multiple
heterogeneous local and wide-area networks to reach its destination, so it is
difficult for a user to manually optimize TCP for such complex networks.
SCD co-principal investigators Basil Irwin and Marla Meehl, along
with investigators at PSC and NCSA, say
that a primary goal of Web100 is to develop software that interacts with the operating
system and user applications to automatically
optimize performance for all TCP transfers.
Tuning up TCP
"Most TCP applications blindly use the
default TCP buffer size," Basil explains. "If the TCP buffer is too small to hold
enough packets to fill a high-bandwidth large-latency network pipe, then transmission
of packets is forced to prematurely halt until some of the packets in the filled
buffer are acknowledged by the receiver as having correctly arrived. Such premature
transmission halting means the full bandwidth of a pipe isn't being fully utilized.
While a network administrator can alter the TCP buffer size manually, it's not
a trivial task."
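The arithmetic behind Basil's point can be sketched in a few lines. This is a minimal illustration, not Web100 code: the function names are hypothetical, and the link speed, round-trip time, and buffer size used below are assumed values chosen only for illustration. The only real interface used is the standard `SO_SNDBUF` socket option, which is how an individual application can ask for a larger send buffer today.

```python
import socket


def bandwidth_delay_product_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to keep a path of this speed and delay full."""
    # bits/second * milliseconds / 1000 = bits; / 8 = bytes
    return bandwidth_bps * rtt_ms // 8000


def throughput_ceiling_bps(buffer_bytes: int, rtt_ms: int) -> int:
    """Upper bound on throughput when at most one buffer's worth of data
    can be unacknowledged per round trip."""
    return buffer_bytes * 8 * 1000 // rtt_ms


def request_send_buffer(sock: socket.socket, nbytes: int) -> int:
    """Ask the OS for a send buffer of nbytes and return what it actually granted.

    The OS may round or clamp the request to its own limits, so the
    returned value can differ from nbytes.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, nbytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


if __name__ == "__main__":
    # Assumed example path: 1 Gbit/s with a 70 ms round-trip time.
    needed = bandwidth_delay_product_bytes(1_000_000_000, 70)
    # Assumed example buffer: 64 KiB.
    ceiling = throughput_ceiling_bps(65_536, 70)
    print(f"buffer needed to fill the pipe: {needed} bytes")
    print(f"ceiling with a 64 KiB buffer:   {ceiling} bits/s")
```

With these assumed numbers, the path needs roughly 8.75 MB in flight, while a 64 KiB buffer caps the transfer at about 7.5 Mbit/s, which is under 1% of the gigabit link. Choosing the right value by hand means knowing both the bandwidth and the round-trip time of every path, which is why manual tuning is hard.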
The Web100 project will seek to develop a mechanism to allow the operating
system to change the TCP buffer size dynamically, transparently, and automatically
for all TCP sessions. The first step is to endow TCP implementations with better
instrumentation so that they can detect undersized buffer conditions and better
see where there are bottlenecks in the path or other TCP bugs.
Once the operating system TCP implementation has been beefed up with real-time
TCP metrics, it will also be possible to gather many more statistics about individual
TCP sessions. The investigators then envision developing an "autotuner" that will
monitor session performance and respond with needed adjustments. Autotuning, the
ability to automatically tune TCP to simultaneously achieve maximum throughput
across all connections for all applications within the resource limits of the
host, has already been successfully demonstrated by PSC in the FreeBSD operating
system.
The tuning process can be quite complicated, so autotuning will be partitioned
so that the operating system kernel simply accepts certain basic tuning adjustments,
while the complex tuning algorithms that determine what these adjustments will
be will run in user mode, where network researchers can easily extend, replace,
or disable them.
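The kernel/user-mode split described above might look something like the following sketch. It is not the Web100 or PSC algorithm. The `ConnStats` fields, the proportional scale-down policy, and the `apply` callback (standing in for whatever basic adjustment the kernel would accept) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ConnStats:
    """Hypothetical per-connection metrics an instrumented TCP might expose."""
    conn_id: str
    rtt_ms: int             # measured round-trip time
    est_bandwidth_bps: int  # estimated path bandwidth


def desired_buffer(stats: ConnStats) -> int:
    """Buffer (bytes) needed to keep this connection's path full."""
    return stats.est_bandwidth_bps * stats.rtt_ms // 8000


def plan_buffers(conns: List[ConnStats], host_limit_bytes: int,
                 floor_bytes: int = 4096) -> Dict[str, int]:
    """User-mode policy: give each connection what it wants if the host can
    afford it, otherwise scale every request down proportionally."""
    wanted = {c.conn_id: max(floor_bytes, desired_buffer(c)) for c in conns}
    total = sum(wanted.values())
    if total <= host_limit_bytes:
        return wanted
    # The floor can push the scaled total slightly above the limit;
    # a real policy would have to decide how to handle that case.
    return {cid: max(floor_bytes, w * host_limit_bytes // total)
            for cid, w in wanted.items()}


def tune_once(conns: List[ConnStats], host_limit_bytes: int,
              apply: Callable[[str, int], None]) -> None:
    """One pass of the tuner: compute the plan, then hand each simple
    adjustment to the (assumed) kernel interface via apply()."""
    for conn_id, nbytes in plan_buffers(conns, host_limit_bytes).items():
        apply(conn_id, nbytes)
```

Keeping `plan_buffers` separate from `apply` mirrors the partitioning in the text: the policy can be replaced or disabled without touching the part that actually changes buffer sizes.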
Will fixing TCP buffer sizes end most data transfer bottlenecks on the Internet?
That depends on whether a given bottleneck is caused by an incorrect TCP buffer
or some other problem. Other bottlenecks can be caused by the network dropping
TCP packets, or even by the applications themselves. Conversely, will ubiquitous
well-tuned TCP crush the net? If so, that may simply mean network capacity
needs to grow now that users can, for the first time, use large amounts of
bandwidth effectively and conveniently.
Web tools for the desktop
A second Web100 goal is to develop diagnostic
and performance monitoring tools that will allow users to monitor the status of
their data transfers. Ideally, at least some of the tools for end-users should
pass the "granny test" -- that is, be simple enough for a grandmother with little
network experience to use and understand. For instance, a simple display that
shows bytes/second and lost packets/second should be understandable to most people.
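As a rough idea of how little such a display needs, the sketch below turns two readings of a connection's counters into the two figures mentioned above. The `Snapshot` fields are hypothetical; it assumes the instrumented TCP exposes running totals of bytes sent and packets lost, which is an assumption about the interface rather than a description of it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Snapshot:
    """One reading of a connection's running counters (hypothetical fields)."""
    t_s: float         # time of the reading, in seconds
    bytes_sent: int    # total bytes sent so far
    packets_lost: int  # total packets counted as lost so far


def rates(prev: Snapshot, cur: Snapshot) -> Tuple[float, float]:
    """Return (bytes/second, lost packets/second) between two readings."""
    dt = cur.t_s - prev.t_s
    if dt <= 0:
        raise ValueError("snapshots must be taken at increasing times")
    return ((cur.bytes_sent - prev.bytes_sent) / dt,
            (cur.packets_lost - prev.packets_lost) / dt)


def format_line(bytes_per_s: float, lost_per_s: float) -> str:
    """Render the two numbers as a single human-readable line."""
    return f"{bytes_per_s:,.0f} bytes/s   {lost_per_s:.1f} lost packets/s"
```

For example, two readings taken two seconds apart with 3,000,000 more bytes sent and 4 more packets lost would display as `1,500,000 bytes/s   2.0 lost packets/s`.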
The initial Web100 products will be based on kernel modifications to the Linux
operating system to allow TCP autotuning. Linux is widely used at research universities,
and its source code is freely available. The Web100 group will be testing their
new tools primarily on Intel-based computers, since Linux is widely used on Intel-based
systems.
Alpha release available
A pre-Alpha release of the Web100 code
was made in November 2000, and it is being tested by Stanford Linear Accelerator
Center and Oak Ridge National Laboratory. A new Alpha0 version was released in
late March. It includes many new instruments, some simple diagnostic tools, some
library functions, and a simple autotuning daemon. Additional testers for Alpha0
include Argonne National Laboratory, Lawrence Berkeley National Laboratory, Globus,
and Internet2.
The future: Spreading the word
Basil cautions that the goal of universal
tuned access may not be reachable soon since the Internet is such an amorphous
distributed network. Individuals and institutions choose and install their own
computers, with most being non-Linux systems. Achieving universal 100%
throughput will require adoption of autotuning by all commercial operating system
vendors. Thus, although the Web100 group will be making its Linux-based code
freely available, experience suggests that convincing other vendors to incorporate
the code in their own systems will take a while.
Marla says, "We are excited the Web100 will finally help deliver the speed
that these networks are capable of and if not speed, at least help us to diagnose
and correct fundamental network problems such as packet loss and routing problems."
For more information
A web site with news items, papers,
and other information about Web100 is available at http://www.web100.org.
As the Web100 code matures, it will eventually be downloadable from the site.
Web100 is looking for active co-developers. It plans to set up a special Web
site known as a portal that will allow its members to contribute to the project
and its products.