DEC Alpha Options (Using the GNU Compiler Collection (GCC))

3.20.13 DEC Alpha Options

These -m options are defined for the DEC Alpha implementations:

-mno-soft-float

-msoft-float

Use (do not use) the hardware floating-point instructions for floating-point operations. When -msoft-float is specified, functions in libgcc.a are used to perform floating-point operations. Unless they are replaced by routines that emulate the floating-point operations, or compiled in such a way as to call such emulation routines, these routines issue floating-point operations. If you are compiling for an Alpha without floating-point operations, you must ensure that the library is built so as not to call them.

Note that Alpha implementations without floating-point operations are required to have floating-point registers.

-mfp-regs

-mno-fp-regs

Generate code that uses (does not use) the floating-point register set. -mno-fp-regs implies -msoft-float. If the floating-point register set is not used, floating-point operands are passed in integer registers as if they were integers and floating-point results are passed in $0 instead of $f0. This is a non-standard calling sequence, so any function with a floating-point argument or return value called by code compiled with -mno-fp-regs must also be compiled with that option.

A typical use of this option is building a kernel that does not use, and hence need not save and restore, any floating-point registers.

-mieee

The Alpha architecture implements floating-point hardware optimized for maximum performance. It is mostly compliant with the IEEE floating-point standard. However, for full compliance, software assistance is required. This option generates fully IEEE-compliant code, except that the inexact-flag is not maintained (see below). If this option is turned on, the preprocessor macro _IEEE_FP is defined during compilation. The resulting code is less efficient but is able to correctly support denormalized numbers and exceptional IEEE values such as not-a-number and plus/minus infinity. Other Alpha compilers call this option -ieee_with_no_inexact.

-mieee-with-inexact

This is like -mieee except the generated code also maintains the IEEE inexact-flag. Turning on this option causes the generated code to implement fully-compliant IEEE math. In addition to _IEEE_FP, _IEEE_FP_EXACT is defined as a preprocessor macro. On some Alpha implementations the resulting code may execute significantly slower than the code generated by default. Since there is very little code that depends on the inexact-flag, you should normally not specify this option. Other Alpha compilers call this option -ieee_with_inexact.
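As a minimal sketch of how source code might react to these macros, the following helper reports which mode a translation unit was compiled in. The function name ieee_fp_mode is hypothetical; the macro names are the ones given in the text above. On targets other than Alpha neither macro is defined, so the helper returns "default".

```c
#include <string.h>

/* Hypothetical helper: reports the IEEE mode of this translation unit,
   based only on the preprocessor macros described in the text.
   Compiled without -mieee or -mieee-with-inexact, it returns "default". */
static const char *ieee_fp_mode(void)
{
#if defined(_IEEE_FP) && defined(_IEEE_FP_EXACT)
    return "ieee-with-inexact";   /* inexact-flag is maintained */
#elif defined(_IEEE_FP)
    return "ieee";                /* IEEE-compliant, no inexact-flag */
#else
    return "default";             /* fastest code, not fully compliant */
#endif
}
```

Code that depends on denormalized numbers or infinities could use such a check to refuse to build, or to select a slower fallback path, when the required mode is not active.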

-mfp-trap-mode=trap-mode

This option controls what floating-point related traps are enabled. Other Alpha compilers call this option -fptm trap-mode. The trap mode can be set to one of four values:

n: This is the default (normal) setting. The only traps that are enabled are the ones that cannot be disabled in software (e.g., division by zero trap).
u: In addition to the traps enabled by n, underflow traps are enabled as well.
su: Like u, but the instructions are marked to be safe for software completion (see Alpha architecture manual for details).
sui: Like su, but inexact traps are enabled as well.

-mfp-rounding-mode=rounding-mode

Selects the IEEE rounding mode. Other Alpha compilers call this option -fprm rounding-mode. The rounding-mode can be one of:

n: Normal IEEE rounding mode. Floating-point numbers are rounded towards the nearest machine number or towards the even machine number in case of a tie.
m: Round towards minus infinity.
c: Chopped rounding mode. Floating-point numbers are rounded towards zero.
d: Dynamic rounding mode. A field in the floating-point control register (fpcr, see Alpha architecture reference manual) controls the rounding mode in effect. The C library initializes this register for rounding towards plus infinity. Thus, unless your program modifies the fpcr, d corresponds to round towards plus infinity.
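The difference between the four modes can be illustrated with a small model that rounds an exact quotient to an integer, using integers as stand-ins for "machine numbers". This is only an analogy for the hardware behavior, not code that GCC emits; the function name round_quotient is hypothetical, and the d case simply follows the statement above that the default dynamic mode rounds towards plus infinity.

```c
#include <stdint.h>

/* Hypothetical model: rounds the exact value num/den (den > 0) to an
   integer according to a -mfp-rounding-mode letter.
   Unknown letters are treated like 'n'. */
static int64_t round_quotient(int64_t num, int64_t den, char mode)
{
    int64_t q = num / den;                 /* C division truncates toward zero */
    int64_t r = num % den;                 /* remainder has the sign of num */
    if (r == 0)
        return q;                          /* exact: every mode agrees */

    int64_t floor_q = (r < 0) ? q - 1 : q; /* largest integer below num/den */

    switch (mode) {
    case 'c':                              /* chopped: toward zero */
        return q;
    case 'm':                              /* toward minus infinity */
        return floor_q;
    case 'd':                              /* dynamic: plus infinity, per the text */
        return floor_q + 1;
    default: {                             /* 'n': nearest, ties to even */
        int64_t twice = 2 * (num - floor_q * den); /* 2 * fraction * den */
        if (twice > den)
            return floor_q + 1;
        if (twice < den)
            return floor_q;
        return (floor_q % 2 == 0) ? floor_q : floor_q + 1;
    }
    }
}
```

For example, the value 5/2 rounds to 2 under n, m and c, but to 3 under d, while -5/2 rounds to -2 under n, c and d, but to -3 under m.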

-mtrap-precision=trap-precision

In the Alpha architecture, floating-point traps are imprecise. This means that without software assistance it is impossible to recover from a floating-point trap, and program execution normally needs to be terminated. GCC can generate code that can assist operating system trap handlers in determining the exact location that caused a floating-point trap. Depending on the requirements of an application, different levels of precision can be selected:

p: Program precision. This option is the default and means a trap handler can only identify which program caused a floating-point exception.
f: Function precision. The trap handler can determine the function that caused a floating-point exception.
i: Instruction precision. The trap handler can determine the exact instruction that caused a floating-point exception.

Other Alpha compilers provide the equivalent options called -scope_safe and -resumption_safe.

-mieee-conformant

This option marks the generated code as IEEE conformant. You must not use this option unless you also specify -mtrap-precision=i and either -mfp-trap-mode=su or -mfp-trap-mode=sui. Its only effect is to emit the line .eflag 48 in the function prologue of the generated assembly file.
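The constraint on the accompanying options can be written down as a simple check, for instance in a build script or configuration tool. The sketch below is hypothetical (GCC itself does not expose such a function); it only encodes the rule stated above, taking the values that would be passed to -mfp-trap-mode= and -mtrap-precision=.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical check: returns true when -mieee-conformant may be used
   together with the given -mfp-trap-mode= and -mtrap-precision= values. */
static bool ieee_conformant_allowed(const char *trap_mode,
                                    const char *trap_precision)
{
    /* Instruction-level trap precision is required. */
    bool precise = strcmp(trap_precision, "i") == 0;

    /* The trap mode must be one of the software-completion modes. */
    bool sw_completion = strcmp(trap_mode, "su") == 0
                      || strcmp(trap_mode, "sui") == 0;

    return precise && sw_completion;
}
```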

-mbuild-constants

Normally GCC examines a 32- or 64-bit integer constant to see if it can construct it from smaller constants in two or three instructions. If it cannot, it outputs the constant as a literal and generates code to load it from the data segment at run time.

Use this option to require GCC to construct all integer constants using code, even if it takes more instructions (the maximum is six).

You typically use this option to build a shared library dynamic loader. Itself a shared library, it must relocate itself in memory before it can find the variables and constants in its own data segment.
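To give a feel for what "constructing a constant from smaller constants" involves, the following sketch splits a 32-bit value into two sign-extended 16-bit pieces such that value = high * 65536 + low. It assumes the pieces are added by instructions with signed 16-bit immediates (as the Alpha lda and ldah instructions do); the function name is hypothetical, and this is an illustration of the idea, not GCC's actual algorithm.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: splits value so that value == high * 65536 + low,
   with low in [-32768, 32767]. Returns true when high also fits in a
   signed 16-bit immediate, i.e. when two instructions would suffice
   under the assumption stated above. */
static bool split_constant(int32_t value, int32_t *high, int32_t *low)
{
    int64_t v = value;

    /* Sign-extend the low 16 bits. */
    int64_t lo = ((v & 0xffff) ^ 0x8000) - 0x8000;

    /* The difference is an exact multiple of 65536. */
    int64_t hi = (v - lo) / 65536;

    *high = (int32_t)hi;
    *low = (int32_t)lo;
    return hi >= -32768 && hi <= 32767;
}
```

Because the low piece is sign-extended, a value such as 0x7fff8000 yields a high piece of 32768, which no longer fits in 16 signed bits; cases like this are why some constants need more than two instructions.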

-mbwx

-mcix

-mfix

-mmax

Indicate whether GCC should generate code to use the optional BWX, CIX, FIX and MAX instruction sets. The default is to use the instruction sets supported by the CPU type specified via the -mcpu= option, or those of the CPU on which GCC was built if none is specified.

-msafe-bwa

-mno-safe-bwa

Indicate whether in the absence of the optional BWX instruction set GCC should generate multi-thread and async-signal safe code for byte and aligned word memory accesses.
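The hazard is easier to see when the byte store is spelled out. Without byte store instructions, writing one byte means loading the enclosing aligned 64-bit word, replacing one byte lane and storing the whole word back. The sketch below models that sequence in portable C; the function name is hypothetical and the code is an illustration of the read-modify-write pattern, not the instruction sequence GCC emits.

```c
#include <stdint.h>

/* Hypothetical model of a byte store done without byte instructions:
   the whole 64-bit word is read, modified and written back. */
static void store_byte_rmw(uint64_t *quad, unsigned lane, uint8_t value)
{
    unsigned shift = (lane & 7u) * 8u;                 /* little-endian byte lane */

    uint64_t old = *quad;                              /* 1. load all 8 bytes */
    uint64_t cleared = old & ~((uint64_t)0xff << shift); /* 2. clear the lane */
    uint64_t merged = cleared | ((uint64_t)value << shift); /* 3. insert byte */

    *quad = merged;                                    /* 4. store all 8 bytes */
}
```

If another thread or a signal handler modifies a neighbouring byte of the same word between steps 1 and 4, step 4 silently overwrites that change with the stale value. Code that is safe in this sense has to make the whole sequence behave atomically.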

-msafe-partial

-mno-safe-partial

Indicate whether GCC should generate multi-thread and async-signal safe code for partial memory accesses, including piecemeal accesses to unaligned data as well as block accesses to leading and trailing parts of aggregate types or other objects in memory that do not respectively start and end on an aligned 64-bit data boundary.

-mfloat-vax

-mfloat-ieee

Generate code that uses (does not use) VAX F and G floating-point arithmetic instead of IEEE single and double precision.

-mexplicit-relocs

-mno-explicit-relocs

Older Alpha assemblers provided no way to generate symbol relocations except via assembler macros. Use of these macros does not allow optimal instruction scheduling. GNU binutils as of version 2.12 supports a new syntax that allows the compiler to explicitly mark which relocations should apply to which instructions. This option is mostly useful for debugging, as GCC detects the capabilities of the assembler when it is built and sets the default accordingly.

-msmall-data

-mlarge-data

When -mexplicit-relocs is in effect, static data is accessed via gp-relative relocations. When -msmall-data is used, objects 8 bytes long or smaller are placed in a small data area (the .sdata and .sbss sections) and are accessed via 16-bit relocations off of the $gp register. This limits the size of the small data area to 64KB, but allows the variables to be directly accessed via a single instruction.

The default is -mlarge-data. With this option the data area is limited to just below 2GB. Programs that require more than 2GB of data must use malloc or mmap to allocate the data in the heap instead of in the program's data segment.

When generating code for shared libraries, -fpic implies -msmall-data and -fPIC implies -mlarge-data.
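The numbers quoted for -msmall-data follow directly from the 16-bit relocation: 16 bits address 65536 distinct byte offsets, which is 64KB. The short sketch below restates the two limits from the text as code; the names are hypothetical and the placement rule is only the one described above.

```c
#include <stdbool.h>
#include <stddef.h>

enum {
    SMALL_DATA_MAX_OBJECT = 8,   /* bytes; larger objects stay out of .sdata/.sbss */
    GP_DISPLACEMENT_BITS = 16    /* width of the $gp-relative relocation */
};

/* Hypothetical helper: would an object of this size be placed in the
   small data area under -msmall-data, according to the text? */
static bool fits_small_data(size_t object_size)
{
    return object_size <= SMALL_DATA_MAX_OBJECT;
}

/* Number of bytes reachable through a 16-bit displacement. */
static long small_data_reach(void)
{
    return 1L << GP_DISPLACEMENT_BITS;
}
```

Under these rules a double or a pointer qualifies for the small data area, while a 16-byte structure does not.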

-msmall-text

-mlarge-text

When -msmall-text is used, the compiler assumes that the code of the entire program (or shared library) fits in 4MB, and is thus reachable with a branch instruction. When -msmall-data is used, the compiler can assume that all local symbols share the same $gp value, and thus reduce the number of instructions required for a function call from 4 to 1.

The default is -mlarge-text.

-mcpu=cpu_type

Set the instruction set and instruction scheduling parameters for machine type cpu_type. You can specify either the EV style name or the corresponding chip number. GCC supports scheduling parameters for the EV4, EV5 and EV6 family of processors and chooses the default values for the instruction set from the processor you specify. If you do not specify a processor type, GCC defaults to the processor on which the compiler was built.

Supported values for cpu_type are

ev4, ev45, 21064: Schedules as an EV4 and has no instruction set extensions.
ev5, 21164: Schedules as an EV5 and has no instruction set extensions.
ev56, 21164a: Schedules as an EV5 and supports the BWX extension.
pca56, 21164pc, 21164PC: Schedules as an EV5 and supports the BWX and MAX extensions.
ev6, 21264: Schedules as an EV6 and supports the BWX, FIX, and MAX extensions.
ev67, 21264a: Schedules as an EV6 and supports the BWX, CIX, FIX, and MAX extensions.
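The list above can also be read as a lookup table from cpu_type to scheduling model and instruction set extensions. The following sketch restates it as a C data structure; the type, enumerator and function names are hypothetical and do not correspond to GCC internals.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bit flags for the optional instruction set extensions. */
enum { EXT_BWX = 1, EXT_CIX = 2, EXT_FIX = 4, EXT_MAX = 8 };

struct alpha_cpu {
    const char *name;       /* value accepted by -mcpu= */
    const char *schedule;   /* scheduling model used */
    unsigned extensions;    /* bitwise OR of EXT_* flags */
};

/* One entry per name in the list above. */
static const struct alpha_cpu alpha_cpus[] = {
    { "ev4",     "EV4", 0 },
    { "ev45",    "EV4", 0 },
    { "21064",   "EV4", 0 },
    { "ev5",     "EV5", 0 },
    { "21164",   "EV5", 0 },
    { "ev56",    "EV5", EXT_BWX },
    { "21164a",  "EV5", EXT_BWX },
    { "pca56",   "EV5", EXT_BWX | EXT_MAX },
    { "21164pc", "EV5", EXT_BWX | EXT_MAX },
    { "21164PC", "EV5", EXT_BWX | EXT_MAX },
    { "ev6",     "EV6", EXT_BWX | EXT_FIX | EXT_MAX },
    { "21264",   "EV6", EXT_BWX | EXT_FIX | EXT_MAX },
    { "ev67",    "EV6", EXT_BWX | EXT_CIX | EXT_FIX | EXT_MAX },
    { "21264a",  "EV6", EXT_BWX | EXT_CIX | EXT_FIX | EXT_MAX },
};

/* Returns the table entry for name, or NULL if the name is not listed. */
static const struct alpha_cpu *find_alpha_cpu(const char *name)
{
    for (size_t i = 0; i < sizeof alpha_cpus / sizeof alpha_cpus[0]; i++) {
        if (strcmp(alpha_cpus[i].name, name) == 0)
            return &alpha_cpus[i];
    }
    return NULL;
}
```

Read this way, ev67 is the only listed family that enables CIX by default, and BWX is available from ev56 onwards.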

Native toolchains also support the value native, which selects the best architecture option for the host processor. -mcpu=native has no effect if GCC does not recognize the processor.

-mtune=cpu_type

Set only the instruction scheduling parameters for machine type cpu_type. The instruction set is not changed.

Native toolchains also support the value native, which selects the best architecture option for the host processor. -mtune=native has no effect if GCC does not recognize the processor.

-mmemory-latency=time

Sets the latency the scheduler should assume for typical memory references as seen by the application. This number is highly dependent on the memory access patterns used by the application and the size of the external cache on the machine.

Valid options for time are

number: A decimal number representing clock cycles.
L1, L2, L3, main: The compiler contains estimates of the number of clock cycles for typical EV4 & EV5 hardware for the Level 1, 2 & 3 caches (also called Dcache, Scache, and Bcache), as well as to main memory. Note that L3 is only valid for EV5.