
How to handle floating-point exceptions in Fortran

Error trapping is a reliable method for discovering FPEs in your code

by Richard Valent, SCD technical consultant

When computations go awry in your program, you may notice incorrect numbers in some output fields, even though your program continues to execute. Sometimes you may notice strings like INF and NaN in fields where only numbers should be; these indicate certain kinds of floating-point exceptions (FPEs). INF means "infinity" and NaN means "not a number." Sometimes it's hard to find where these FPEs occur in your code, but you must find and fix them. They are useful only as diagnostics, and they harm performance since each FPE interrupts the processor on which it occurs.
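As a minimal sketch of how these strings arise (this program is illustrative and is not one of the original examples; the names are arbitrary), the following free-form Fortran produces an infinity by overflow and a NaN by dividing zero by zero. It assumes a compiler that accepts the Fortran 2003 volatile attribute, which is used here only to discourage the compiler from evaluating the arithmetic at compile time. The exact spelling printed (INF, Infinity, NaN, and so on) varies by compiler.

```fortran
program show_inf_nan
  implicit none
  ! volatile discourages compile-time evaluation of the expressions below
  real, volatile :: zero, big
  real :: x, y

  zero = 0.0
  big  = huge(1.0)      ! largest finite single-precision value

  x = big * 10.0        ! overflow: the result is too large and becomes infinity
  y = zero / zero       ! invalid operation: the result is NaN

  write(6,*) 'x =', x
  write(6,*) 'y =', y
  ! A NaN is the only value that compares unequal to itself
  write(6,*) 'y is NaN: ', (y /= y)
end program show_inf_nan
```

Unless trapping is enabled, a program like this runs to completion without complaint, which is why such values can go unnoticed until they show up in output.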

So how do you find where FPEs are occurring in your code?

Salting your code with print statements is hit-or-miss and invasive, and we do not recommend it. If you believe you have only a few FPEs, you are well advised to use a debugger like TotalView or dbx, which will often automatically point at the first FPE in your core file. But if you have many FPEs, weeding them out in this manner can be tedious.


Error trapping

An alternative and reliable method is called "trapping." By trapping, we mean setting a trap at your program's runtime that gets tripped when an FPE occurs, after which the program execution follows a prescribed course of your choice. This course is referred to as "handling" the error, where the handling you choose may cause the program to abort, print a diagnostic message, or provide a traceback. With certain methods of trapping, you can even provide a subroutine or function that changes the behavior of the floating-point arithmetic, though you should consult a numerical analyst about the consequences before handling errors in this manner.

Since trapping and handling require extra processor time, you may wish to remove trapping/handling subroutine calls and compiler options after you have removed your program's FPEs.

Both trapping and handling are implemented via "signals," and you often find their documentation under the broader topics of "signals" or "signal handling".

All computers discussed in this article utilize IEEE binary floating-point arithmetic [1], with the exception of Cray, which uses Cray floating-point arithmetic. (Please consult a Cray Research CPU hardware reference manual if you need information about Cray's format.)


Six floating-point error types

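This note distinguishes six FPE types. Five of them (underflow, overflow, divide by zero, invalid operand, and inexact result) are the exceptions defined by IEEE floating-point arithmetic [1]; integer overflow is listed with them because it is commonly reported through the same mechanism.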

Underflow
One form of underflow exception is signaled by the creation of a tiny nonzero result between ±2^Emin (where Emin is the minimum expressible exponent), which, because it is so tiny, may cause some other exception later. The other form of underflow exception is signaled by an extraordinary loss of accuracy during the approximation of such tiny numbers by denormalized numbers.

Overflow
The overflow exception is signaled when what would have been the magnitude of the rounded floating-point result, were the exponent range unbounded, is larger than the destination format's largest finite number.

Integer overflow
The integer overflow exception is signaled when an integer quantity is larger than the destination format's largest integer.

Divide by zero
The divide-by-zero exception is signaled on an implemented divide operation if the divisor is zero and the dividend is a finite nonzero number.

Invalid operand (infinity)
The invalid operand exception is signaled when one or both of the operands are invalid for an implemented operation. The result (if not trapped) is NaN for floating-point numbers and not defined for fixed-point numbers.

Inexact result
The inexact result exception is signaled when the rounded result of an operation is not exact or if it overflows without an overflow trap. Users normally do not trap or handle this type of FPE, in deference to the others.
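The five IEEE exception types above (integer overflow is not among them) can also be observed without trapping, by polling status flags. The sketch below is an illustration only. It assumes a compiler that provides the Fortran 2003 intrinsic module ieee_exceptions, which postdates the compilers discussed in this note, and the program and variable names are hypothetical.

```fortran
program poll_fpe_flags
  use, intrinsic :: ieee_exceptions
  implicit none
  ! volatile discourages compile-time evaluation of the arithmetic
  real, volatile :: zero, one, big, small, r
  logical :: raised

  zero  = 0.0
  one   = 1.0
  big   = huge(1.0)     ! largest finite single-precision value
  small = tiny(1.0)     ! smallest normalized single-precision value

  ! Divide by zero: finite nonzero dividend, zero divisor
  call ieee_set_flag(ieee_all, .false.)
  r = one / zero
  call ieee_get_flag(ieee_divide_by_zero, raised)
  write(6,*) 'divide by zero signaled: ', raised

  ! Overflow: rounded result exceeds the largest finite number
  call ieee_set_flag(ieee_all, .false.)
  r = big * big
  call ieee_get_flag(ieee_overflow, raised)
  write(6,*) 'overflow signaled:       ', raised

  ! Underflow: result is tiny and cannot be represented accurately
  call ieee_set_flag(ieee_all, .false.)
  r = small * small
  call ieee_get_flag(ieee_underflow, raised)
  write(6,*) 'underflow signaled:      ', raised

  ! Invalid operand: zero divided by zero yields NaN
  call ieee_set_flag(ieee_all, .false.)
  r = zero / zero
  call ieee_get_flag(ieee_invalid, raised)
  write(6,*) 'invalid signaled:        ', raised

  ! Inexact result: one third has no exact binary representation
  call ieee_set_flag(ieee_all, .false.)
  r = one / 3.0
  call ieee_get_flag(ieee_inexact, raised)
  write(6,*) 'inexact signaled:        ', raised
end program poll_fpe_flags
```

Each flag is cleared before the operation of interest and queried immediately afterward, all within the main program, so that no intervening procedure call or I/O statement can disturb the flags being examined.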

Variables, utilities, and calls

Vendors may choose among three interfaces for trapping FPEs: environment variables, utilities, and subroutine calls. Environment variables provide the least invasive interface of the three, but only SGI provides it.

Each vendor discussed in this note provides the subroutine-call interface for trapping FPEs in Fortran, but each has its own implementation, so portability is lost. A proposal for IEEE floating-point exception handling in Fortran is given in [2]. However, no vendor has implemented it, to our knowledge.
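The interface proposed in [2] was later incorporated into the Fortran 2003 standard as the intrinsic modules ieee_exceptions, ieee_arithmetic, and ieee_features. As a hedged sketch, assuming a compiler and processor that support these modules (which, per the statement above, was not the case for the platforms in this note), portable trapping of an overflow might look like the following. The program name and loop bounds are illustrative.

```fortran
program halt_on_overflow
  use, intrinsic :: ieee_exceptions
  implicit none
  ! volatile discourages compile-time evaluation of the loop
  real, volatile :: x
  integer :: k

  ! Halting is optional in the standard, so ask before enabling it
  if (ieee_support_halting(ieee_overflow)) then
    call ieee_set_halting_mode(ieee_overflow, .true.)
  else
    write(6,*) 'halting on overflow is not supported; loop will finish'
  end if

  ! Force a single-precision overflow by repeated multiplication
  x = 1.0
  do k = 1, 50
    x = 10.0 * x
  end do

  ! Reached only if halting is unsupported or was not enabled
  write(6,*) 'x =', x
end program halt_on_overflow
```

When halting is enabled, execution stops at the multiplication that overflows. What is reported at that point (a message, a traceback, a core file) depends on the compiler and its options, so a debugger or the compiler's traceback option is still needed to locate the offending line.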

To find out what a vendor offers for FPE trapping and handling, you can browse the vendor's online documentation, using the search engine and search words like "FPE" and "signal." Looking at man pages and hardcopy manuals helps too. Vendors provide too little documentation in the area of trapping; one almost feels it is done as an afterthought.

The remainder of this note is devoted to showing interfaces for trapping FPEs, in the context of our experience at NCAR. We include the Cray example at the end, since in our opinion it is least helpful. Here is a table of the interfaces you will see below:

Machine   Environment variable   Utility interface   Subroutine interface
IBM       --                     dbx                 external fhandler_
SGI       TRAP_FPE               ssrun, prof         call handle_sigfpes
SUN       --                     dbx                 call ieee_handler
CRAY      --                     --                  call sigon, sigoff, fsigctl



IBM trapping FPEs via subroutine fhandler_

Documentation
XL Fortran for AIX: User's Guide, Version 6, Release 1: "Detecting and Trapping Floating-Point Exceptions," p. 312. Also -qflttrap option, pp. 192-193.


OS and compiler
AIX 4.3.3.10, Fortran Version 07.01


Compilation
xlf -c -qfree -qflttrap=und:en -qsigtrap=fhandler_ job.f -lmass
cc -c flttrap_handler.c
xlf job.o flttrap_handler.o


job.f explanation
The program arranges for the IBM-provided subroutine fhandler_ to be called when an underflow is encountered, so that underflows are cut over to zero. This is accomplished through compiler flags and an external statement, not by instrumenting the code with explicit calls to fhandler_.

Note: subroutine fhandler_ handles each type of FPE, not just underflows. You will want to study subroutine fhandler_ to see if it handles FPEs according to your needs, and modify it accordingly.


Comments
Not straightforward until you read the documentation and know to pick up file flttrap_handler.c from directory /usr/lpp/xlf/samples/floating_point.


job.f
program main
  implicit none
  integer i
  real*8 u,v
  external fhandler_
!
  v = 1.0d-300
  u = exp(v)
  do i=1,25
    v = v*1.0d-01
    u = exp(v)
    write(6,*)'i,u=',i,u,v
  end do
  stop
end program main

SGI trapping FPEs via environment variable TRAP_FPE

Documentation
man sigfpe


OS and compiler:
IRIX 64 6.5, MIPSpro f90 Version 7.2.1


Compilation
See script immediately below. Note -l fpe required.


Comments
None. TRAP_FPE is an excellent trapping interface.

SGI provides environment variable TRAP_FPE as a convenient way to count errors and trace overflows in your program without having to add routine calls or code to your program. To use it, you must set TRAP_FPE and compile your code with library option -l fpe. See the fsigfpe man page for more information. To duplicate the floating-point behavior on UNICOS, set TRAP_FPE as follows:

 
setenv TRAP_FPE \
      "UNDERFL=FLUSH_ZERO; OVERFL=ABORT,TRACE; DIVZERO=ABORT,TRACE; \
      INVALID=ABORT,TRACE"
f90 -64 -mips4 job.f -l fpe
a.out

SGI trapping FPEs via ssrun and prof utilities

Documentation
man ssrun; man prof


OS and compiler:
IRIX 64 6.5, MIPSpro f90 Version 7.2.1


Compilation
See script immediately below. Note -l fpe -l fpe_ss required.


Comments
None.

SGI provides utilities ssrun and prof which may be used together to determine where floating-point exceptions occur in your code. Use SpeedShop utility ssrun with option -fpe on your executable to build an intermediate file, which you then profile with the prof command to make a report that counts FPE exceptions routine by routine. You must build your executable with library options -l fpe and -l fpe_ss. The resulting report is helpful for overviewing your code's FPEs, but it is not a replacement for a full trace report obtainable by the above methods. Example:

f90 -64 -mips4 job.f -l fpe -l fpe_ss
ssrun -fpe a.out
prof a.out.fpe.m1043136
a.out

SGI trapping FPEs via subroutine calls

Documentation
/usr/include/f90sigfpe.h (text file)


OS and compiler:
IRIX 64 6.5, MIPSpro f90 Version 7.2.1


Compilation
f90 -mips4 -64 -fixedform job.f -L/usr/lib64 -lfpe


job.f explanation
The program calls user-provided subroutine abort_overfl when an overflow is encountered. Statements like fsigfpe(2) % abort =2 set Fortran 90 structure component values.


Comments
The vendor should provide a man page at the very least.


job.f
      include '/usr/include/f90sigfpe.h'
      external  abort_overfl
      real      x
! set OVERFL abort and trace values in f90sigfpe.h common block
! sigfpe via f90 structures as per documentation in f90sigfpe.h
      fsigfpe(2) % abort =2
      fsigfpe(2) % trace =2
! turn on handler
      call handle_sigfpes
     1  (FPE_ON, FPE_EN_OVERFL, 0, FPE_ABORT_ON_ERROR,abort_overfl)
! do the work here
      x = pow(10.0,10)
      x = pow(10.0,40)
      x = pow(10.0,50)
! turn off handling
      call handle_sigfpes(FPE_OFF, FPE_EN_OVERFL, 0, 0, 0)
      stop
      end

      real function pow(x,n)
      integer n
      real    x
      pow = x**n
      print*, ' x,n,pow=', x,n,pow
      return
      end

      subroutine abort_overfl(pc)
      integer*4 pc
      print *, 'subroutine abort_overfl: pc=', pc
      return
      end

SUN trapping FPEs via the dbx utility

Documentation
AnswerBook (SUN's online documentation)


OS and compiler:
SunOS 5.5.1, SunSoft F90 Version 1.0.1.0


Comments
dbx's catch FPE is handy for finding an FPE's line number.

f90 -g job.f90
dbx a.out
(dbx) catch FPE
(dbx) run

SUN trapping FPEs via subroutine calls

Documentation
man -s 3f f77_ieee_environment


OS and compiler:
Solaris 2.5.1, f90 WorkShop Compilers 4.2


Compilation
f77 job.f


job.f explanation
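The program calls user-provided subroutine sample_handler when one of the "common" exceptions (invalid, overflow, or division by zero) is encountered; the sample triggers it with a division by zero. It does this by passing the routine name through SUN's IEEE_HANDLER routine.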


Comments
The documentation is hard to follow.


job.f
      program sun
C
C  Sample program to illustrate using SUN Fortran ieee exception handling.
C
C    There are three types of  action  :  get,  set,  and  clear.
C    There are five types of exception :
C         inexact
C         division       ... division by zero exception
C         underflow
C         overflow
C         invalid
C         all            ... all five exceptions above
C         common         ...  invalid,  overflow,  and   division
C                        exceptions
C
C    Note: all and common only make sense with set or clear.
C
C    Individual calls to ieee_handler accumulate the requests.
C
      external sample_handler
C
C  Set up traps on the common exceptions (invalid, overflow, division).
C
      ieeer = ieee_handler ( 'set', 'common', sample_handler)
      if (ieeer .ne. 0) print *,' ieee_handler cannot set exceptions '
C
      a = 0.
      print *,a
      b = 1./a
      print *,b
      c = 5.
      print *,c
      stop
      end
      integer function sample_handler ( sig, code, sigcontext)
C
C  User-supplied exception handler.
C
      integer sig, code, sigcontext(5)
      print *, 'ieee exception'
      stop
      end

Cray trapping FPEs via subroutine calls

Caveat: Cray C90 and J90 series run Cray arithmetic, not IEEE binary floating-point arithmetic. The only FPEs you can trap on these Crays are floating-point overflow and divide by zero. For completeness, we provide an example showing how to do this on Cray machines.

Documentation
man signal, man fsigctl

OS and compiler:
UNICOS 10.0.0.3 and f90 Version 3.1.0.0

Compilation
f90 job.f

job.f explanation
Use routines fsigctl, sigoff, and sigon to trap floating-point exceptions and other signals. Provide your own routine sighndlr to do what you want when an exception is encountered, e.g., call routine tracebk for a trace.

Comments
There is no way to distinguish between different kinds of floating-point exceptions. But underflow is not a problem since Cray arithmetic automatically cuts over to zero.

job.f
	program main
	real x
	external sighndlr
C
C register to catch signals 8==SIGFPE, floating-point error
	call fsigctl('REGISTER',8,sighndlr)
C
C no interruptions
	call sigoff()
C
C force overflow
      x = 1.0
      do 20 k=1,3000
        x = 10.0*x
   20 continue
      print*, x
C
C release signals
	call sigon()
	write(*,*) 'after sigrelease'
	stop 'test'
	end

	subroutine sighndlr()
	write(*,*) '      in signal handler.'
	write(*,*) '      do whatever needs to be done,'
	write(*,*) '      then return to point of interruption'
	return
	end


References

  1. IEEE Standard for Binary Floating-Point Arithmetic (IEEE Std 754-1985)
  2. WG5 (1997), "Technical Report for Floating-Point Exception Handling in Fortran," ISO/IEC/JTC1/SC22/WG5 N1281
