3.18.3 ARC Options

The following options control the architecture variant for which code is being compiled:

-mbarrel-shifter
Generate instructions supported by the barrel shifter. This is the default unless -mcpu=ARC601 or -mcpu=ARCEM is in effect.
-mcpu= cpu
Set architecture type, register usage, and instruction scheduling parameters for cpu. There are also shortcut alias options available for backward compatibility and convenience. Supported values for cpu are:
ARC600
arc600
Compile for ARC600. Aliases: -mA6, -mARC600.
ARC601
arc601
Compile for ARC601. Alias: -mARC601.
ARC700
arc700
Compile for ARC700. Aliases: -mA7, -mARC700. This is the default when configured with --with-cpu=arc700.
ARCEM
arcem
Compile for ARC EM.
ARCHS
archs
Compile for ARC HS.
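
The alias options listed above can be read as a simple lookup onto -mcpu= values. The following is a minimal sketch of that mapping, not part of GCC; the names cpu_alias, cpu_aliases and cpu_for_alias are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Alias option and the -mcpu= value it stands for, as listed above.
   The type and table names are illustrative, not taken from GCC. */
struct cpu_alias {
    const char *alias;
    const char *cpu;
};

static const struct cpu_alias cpu_aliases[] = {
    { "-mA6",     "ARC600" },
    { "-mARC600", "ARC600" },
    { "-mARC601", "ARC601" },
    { "-mA7",     "ARC700" },
    { "-mARC700", "ARC700" },
};

/* Return the -mcpu= value for an alias option, or NULL when the
   option is not one of the aliases documented above. */
const char *cpu_for_alias(const char *option)
{
    size_t i;

    for (i = 0; i < sizeof cpu_aliases / sizeof cpu_aliases[0]; i++) {
        if (strcmp(option, cpu_aliases[i].alias) == 0)
            return cpu_aliases[i].cpu;
    }
    return NULL;
}
```

No aliases are listed for ARC EM or ARC HS, so the sketch returns NULL for anything resembling them.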

-mdpfp
-mdpfp-compact
FPX: Generate Double Precision FPX instructions, tuned for the compact implementation.
-mdpfp-fast
FPX: Generate Double Precision FPX instructions, tuned for the fast implementation.
-mno-dpfp-lrsr
Disable LR and SR instructions from using FPX extension aux registers.
-mea
Generate Extended arithmetic instructions. Currently only divaw, adds, subs, and sat16 are supported. This is always enabled for -mcpu=ARC700.
-mno-mpy
Do not generate mpy instructions for ARC700.
-mmul32x16
Generate 32x16 bit multiply and mac instructions.
-mmul64
Generate mul64 and mulu64 instructions. Only valid for -mcpu=ARC600.
-mnorm
Generate norm instruction. This is the default if -mcpu=ARC700 is in effect.
-mspfp
-mspfp-compact
FPX: Generate Single Precision FPX instructions, tuned for the compact implementation.
-mspfp-fast
FPX: Generate Single Precision FPX instructions, tuned for the fast implementation.
-msimd
Enable generation of ARC SIMD instructions via target-specific builtins. Only valid for -mcpu=ARC700.
-msoft-float
This option is ignored; it is provided for compatibility purposes only. Software floating-point code is emitted by default, and this default can be overridden by the FPX options: -mspfp, -mspfp-compact, or -mspfp-fast for single precision, and -mdpfp, -mdpfp-compact, or -mdpfp-fast for double precision.
-mswap
Generate swap instructions.
-matomic
This enables the Locked Load/Store Conditional extension to implement atomic memory built-in functions. Not available for ARC 6xx or ARC EM cores.
-mdiv-rem
Enable DIV/REM instructions for ARCv2 cores.
-mcode-density
Enable code density instructions for ARC EM. This is enabled by default for ARC HS.
-mll64
Enable double load/store operations for ARC HS cores.
-mmpy-option= multo
Compile ARCv2 code with a multiplier design option. ‘wlh1’ is the default value. The recognized values for multo are:
0
No multiplier available.
1
The multiply option is set to w: 16x16 multiplier, fully pipelined. The following instructions are enabled: MPYW and MPYUW.
2
The multiply option is set to wlh1: 32x32 multiplier, fully pipelined (1 stage). The following instructions are additionally enabled: MPY, MPYU, MPYM, MPYMU, and MPY_S.
3
The multiply option is set to wlh2: 32x32 multiplier, fully pipelined (2 stages). The following instructions are additionally enabled: MPY, MPYU, MPYM, MPYMU, and MPY_S.
4
The multiply option is set to wlh3: Two 16x16 multipliers, blocking, sequential. The following instructions are additionally enabled: MPY, MPYU, MPYM, MPYMU, and MPY_S.
5
The multiply option is set to wlh4: One 16x16 multiplier, blocking, sequential. The following instructions are additionally enabled: MPY, MPYU, MPYM, MPYMU, and MPY_S.
6
The multiply option is set to wlh5: One 32x4 multiplier, blocking, sequential. The following instructions are additionally enabled: MPY, MPYU, MPYM, MPYMU, and MPY_S.

This option is only available for ARCv2 cores.
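
The numeric values above map onto design option names and instruction sets in a regular way. The table below is a rough sketch of that mapping for quick reference; the struct and function names are hypothetical, and the label "none" for value 0 is an illustrative placeholder rather than a name given in the text above.

```c
#include <stddef.h>

/* One row per -mmpy-option= value, following the list above.
   Names in this sketch are illustrative, not taken from GCC. */
struct mpy_option {
    const char *name;   /* design option name */
    int has_mpyw;       /* MPYW and MPYUW are enabled */
    int has_mpy32;      /* MPY, MPYU, MPYM, MPYMU and MPY_S are enabled */
};

static const struct mpy_option mpy_options[] = {
    { "none", 0, 0 },   /* 0: no multiplier ("none" is a placeholder label) */
    { "w",    1, 0 },   /* 1: 16x16, fully pipelined */
    { "wlh1", 1, 1 },   /* 2: 32x32, fully pipelined, 1 stage (default) */
    { "wlh2", 1, 1 },   /* 3: 32x32, fully pipelined, 2 stages */
    { "wlh3", 1, 1 },   /* 4: two 16x16, blocking, sequential */
    { "wlh4", 1, 1 },   /* 5: one 16x16, blocking, sequential */
    { "wlh5", 1, 1 },   /* 6: one 32x4, blocking, sequential */
};

/* Return the row for a numeric multo value, or NULL if out of range. */
const struct mpy_option *mpy_option_lookup(int multo)
{
    int count = (int)(sizeof mpy_options / sizeof mpy_options[0]);

    if (multo < 0 || multo >= count)
        return NULL;
    return &mpy_options[multo];
}
```

Values 2 through 6 enable the same instructions and differ only in the multiplier implementation they describe.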

-mfpu= fpu
Enables a specific floating-point hardware extension for ARCv2 cores. Supported values for fpu are:
fpus
Enables support for single precision floating point hardware extensions.
fpud
Enables support for double precision floating point hardware extensions. The single precision floating point extension is also enabled. Not available for ARC EM.
fpuda
Enables support for double precision floating point hardware extensions using double precision assist instructions. The single precision floating point extension is also enabled. This option is only available for ARC EM.
fpuda_div
Enables support for double precision floating point hardware extensions using double precision assist instructions, and single precision square-root and divide hardware extensions. The single precision floating point extension is also enabled. This option is only available for ARC EM.
fpuda_fma
Enables support for double precision floating point hardware extensions using double precision assist instructions, and the single precision fused multiply and add hardware extension. The single precision floating point extension is also enabled. This option is only available for ARC EM.
fpuda_all
Enables support for double precision floating point hardware extensions using double precision assist instructions, and all single precision hardware extensions. The single precision floating point extension is also enabled. This option is only available for ARC EM.
fpus_div
Enables support for single precision floating point, and single precision square-root and divide hardware extensions.
fpud_div
Enables support for double precision floating point, and double precision square-root and divide hardware extensions. This option includes option ‘fpus_div’. Not available for ARC EM.
fpus_fma
Enables support for single precision floating point, and single precision fused multiply and add hardware extensions.
fpud_fma
Enables support for double precision floating point, and double precision fused multiply and add hardware extensions. This option includes option ‘fpus_fma’. Not available for ARC EM.
fpus_all
Enables support for all single precision floating point hardware extensions.
fpud_all
Enables support for all single and double precision floating point hardware extensions. Not available for ARC EM.
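
The availability notes for the -mfpu= values above fall into three groups: values with no stated restriction, values not available for ARC EM, and values only available for ARC EM. A minimal sketch of a validity check built from those notes follows. The identifiers are hypothetical, and treating the fpus variants as valid on any ARCv2 core is an assumption based on the absence of a restriction in the text.

```c
#include <stddef.h>
#include <string.h>

/* Availability classes derived from the notes in the list above. */
enum fpu_avail { FPU_ANY, FPU_NOT_EM, FPU_EM_ONLY };

struct fpu_entry {
    const char *name;
    enum fpu_avail avail;
};

static const struct fpu_entry fpu_table[] = {
    { "fpus",      FPU_ANY },      /* assumption: no restriction stated */
    { "fpus_div",  FPU_ANY },
    { "fpus_fma",  FPU_ANY },
    { "fpus_all",  FPU_ANY },
    { "fpud",      FPU_NOT_EM },
    { "fpud_div",  FPU_NOT_EM },
    { "fpud_fma",  FPU_NOT_EM },
    { "fpud_all",  FPU_NOT_EM },
    { "fpuda",     FPU_EM_ONLY },
    { "fpuda_div", FPU_EM_ONLY },
    { "fpuda_fma", FPU_EM_ONLY },
    { "fpuda_all", FPU_EM_ONLY },
};

/* Return 1 if the fpu value is documented as usable on the given core,
   0 if it is excluded, and -1 if the value is not in the list above. */
int fpu_allowed(const char *fpu, int is_arc_em)
{
    size_t i;

    for (i = 0; i < sizeof fpu_table / sizeof fpu_table[0]; i++) {
        if (strcmp(fpu, fpu_table[i].name) != 0)
            continue;
        switch (fpu_table[i].avail) {
        case FPU_NOT_EM:  return !is_arc_em;
        case FPU_EM_ONLY: return is_arc_em != 0;
        default:          return 1;
        }
    }
    return -1;
}
```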

The following options are passed through to the assembler, and also define preprocessor macro symbols.

-mdsp-packa
Passed down to the assembler to enable the DSP Pack A extensions. Also sets the preprocessor symbol __Xdsp_packa.
-mdvbf
Passed down to the assembler to enable the dual viterbi butterfly extension. Also sets the preprocessor symbol __Xdvbf.
-mlock
Passed down to the assembler to enable the Locked Load/Store Conditional extension. Also sets the preprocessor symbol __Xlock.
-mmac-d16
Passed down to the assembler. Also sets the preprocessor symbol __Xxmac_d16.
-mmac-24
Passed down to the assembler. Also sets the preprocessor symbol __Xxmac_24.
-mrtsc
Passed down to the assembler to enable the 64-bit Time-Stamp Counter extension instruction. Also sets the preprocessor symbol __Xrtsc.
-mswape
Passed down to the assembler to enable the swap byte ordering extension instruction. Also sets the preprocessor symbol __Xswape.
-mtelephony
Passed down to the assembler to enable dual and single operand instructions for telephony. Also sets the preprocessor symbol __Xtelephony.
-mxy
Passed down to the assembler to enable the XY Memory extension. Also sets the preprocessor symbol __Xxy.
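
Because each of these options defines a preprocessor symbol, source code can test for an extension at compile time. The sketch below collects the symbols listed above into a bit mask; the EXT_* flag names and the function name are hypothetical and chosen only for illustration.

```c
/* Bit flags naming the extensions; the EXT_* names are illustrative. */
enum arc_ext {
    EXT_DSP_PACKA = 1u << 0,
    EXT_DVBF      = 1u << 1,
    EXT_LOCK      = 1u << 2,
    EXT_MAC_D16   = 1u << 3,
    EXT_MAC_24    = 1u << 4,
    EXT_RTSC      = 1u << 5,
    EXT_SWAPE     = 1u << 6,
    EXT_TELEPHONY = 1u << 7,
    EXT_XY        = 1u << 8
};

/* Return a mask of the extensions whose preprocessor symbols were
   defined when this file was compiled. */
unsigned arc_enabled_extensions(void)
{
    unsigned mask = 0;

#ifdef __Xdsp_packa
    mask |= EXT_DSP_PACKA;
#endif
#ifdef __Xdvbf
    mask |= EXT_DVBF;
#endif
#ifdef __Xlock
    mask |= EXT_LOCK;
#endif
#ifdef __Xxmac_d16
    mask |= EXT_MAC_D16;
#endif
#ifdef __Xxmac_24
    mask |= EXT_MAC_24;
#endif
#ifdef __Xrtsc
    mask |= EXT_RTSC;
#endif
#ifdef __Xswape
    mask |= EXT_SWAPE;
#endif
#ifdef __Xtelephony
    mask |= EXT_TELEPHONY;
#endif
#ifdef __Xxy
    mask |= EXT_XY;
#endif
    return mask;
}
```

Compiled without any of these options, or for a non-ARC target, the function returns 0; compiled with, say, -mswape, the EXT_SWAPE bit would be expected to be set.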

The following options control how the assembly code is annotated:

-misize
Annotate assembler instructions with estimated addresses.
-mannotate-align
Explain what alignment considerations lead to the decision to make an instruction short or long.

The following options are passed through to the linker:

-marclinux
Passed through to the linker, to specify use of the arclinux emulation. This option is enabled by default in tool chains built for arc-linux-uclibc and arceb-linux-uclibc targets when profiling is not requested.
-marclinux_prof
Passed through to the linker, to specify use of the arclinux_prof emulation. This option is enabled by default in tool chains built for arc-linux-uclibc and arceb-linux-uclibc targets when profiling is requested.

The following options control the semantics of generated code:

-mlong-calls
Generate call insns as register indirect calls, thus providing access to the full 32-bit address range.
-mmedium-calls
Do not use less than a 25-bit addressing range for calls, which is the offset available for an unconditional branch-and-link instruction. Conditional execution of function calls is suppressed, to allow use of the 25-bit range, rather than the 21-bit range with conditional branch-and-link. This is the default for tool chains built for arc-linux-uclibc and arceb-linux-uclibc targets.
-mno-sdata
Do not generate sdata references. This is the default for tool chains built for arc-linux-uclibc and arceb-linux-uclibc targets.
-mucb-mcount
Instrument with mcount calls as used in UCB code. That is, do the counting in the callee, not the caller. By default ARC instrumentation counts in the caller.
-mvolatile-cache
Use ordinarily cached memory accesses for volatile references. This is the default.
-mno-volatile-cache
Enable cache bypass for volatile references.
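
To put the 21-bit and 25-bit ranges mentioned under -mmedium-calls, and the full 32-bit range under -mlong-calls, into concrete terms, the sketch below classifies a call displacement by the smallest form that could reach it. It assumes the ranges are signed byte offsets, which gives roughly +/-1 MiB and +/-16 MiB. It is purely illustrative and is not how the compiler chooses a call form, since the compiler generally cannot know the final distance; that is why these options exist. All names are hypothetical.

```c
#include <stdint.h>

/* Call forms discussed above, from shortest to longest reach. */
enum call_form {
    CALL_COND_BL,    /* conditional branch-and-link, 21-bit range */
    CALL_BL,         /* unconditional branch-and-link, 25-bit range */
    CALL_INDIRECT    /* register indirect call, full 32-bit range */
};

/* Return nonzero if disp fits in a signed offset of the given width.
   Assumes 1 <= bits <= 63. */
int fits_signed(int64_t disp, unsigned bits)
{
    int64_t limit = (int64_t)1 << (bits - 1);

    return disp >= -limit && disp < limit;
}

/* Pick the shortest-reach form that can still encode disp. */
enum call_form smallest_call_form(int64_t disp)
{
    if (fits_signed(disp, 21))
        return CALL_COND_BL;
    if (fits_signed(disp, 25))
        return CALL_BL;
    return CALL_INDIRECT;
}
```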

The following options fine tune code generation:

-malign-call
Do alignment optimizations for call instructions.
-mauto-modify-reg
Enable the use of pre/post modify with register displacement.
-mbbit-peephole
Enable bbit peephole2.
-mno-brcc
This option disables a target-specific pass in arc_reorg to generate BRcc instructions. It has no effect on BRcc generation driven by the combiner pass.
-mcase-vector-pcrel
Use pc-relative switch case tables - this enables case table shortening. This is the default for -Os.
-mcompact-casesi
Enable compact casesi pattern. This is the default for -Os.
-mno-cond-exec
Disable ARCompact specific pass to generate conditional execution instructions. Due to delay slot scheduling and interactions between operand numbers, literal sizes, instruction lengths, and the support for conditional execution, the target-independent pass to generate conditional execution is often lacking, so the ARC port has kept a special pass around that tries to find more conditional execution generating opportunities after register allocation, branch shortening, and delay slot scheduling have been done. This pass generally, but not always, improves performance and code size, at the cost of extra compilation time, which is why there is an option to switch it off. If you have a problem with call instructions exceeding their allowable offset range because they are conditionalized, you should consider using -mmedium-calls instead.
-mearly-cbranchsi
Enable pre-reload use of the cbranchsi pattern.
-mexpand-adddi
Expand adddi3 and subdi3 at rtl generation time into add.f, adc, etc.
-mindexed-loads
Enable the use of indexed loads. This can be problematic because some optimizers then assume that indexed stores exist, which is not the case.

-mlra
Enable Local Register Allocation. This is still experimental for ARC, so by default the compiler uses standard reload (i.e. -mno-lra).

-mlra-priority-none
Don't indicate any priority for target registers.
-mlra-priority-compact
Indicate target register priority for r0..r3 / r12..r15.
-mlra-priority-noncompact
Reduce target register priority for r0..r3 / r12..r15.
-mno-millicode
When optimizing for size (using -Os), prologues and epilogues that have to save or restore a large number of registers are often shortened by using a call to a special function in libgcc; this is referred to as a millicode call. As these calls can pose performance issues, and/or cause linking issues when linking in a nonstandard way, this option is provided to turn off millicode call generation.
-mmixed-code
Tweak register allocation to help 16-bit instruction generation. This generally has the effect of decreasing the average instruction size while increasing the instruction count.
-mq-class
Enable 'q' instruction alternatives. This is the default for -Os.
-mRcq
Enable Rcq constraint handling - most short code generation depends on this. This is the default.
-mRcw
Enable Rcw constraint handling - ccfsm condexec mostly depends on this. This is the default.
-msize-level= level
Fine-tune size optimization with regard to instruction lengths and alignment. The recognized values for level are:
0
No size optimization. This level is deprecated and treated like ‘1’.
1
Short instructions are used opportunistically.
2
In addition, alignment of loops and of code after barriers is dropped.
3
In addition, optional data alignment is dropped, and the option -Os is enabled.

This defaults to ‘3’ when -Os is in effect. Otherwise, the behavior when this is not set is equivalent to level ‘1’.
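
The defaulting rules for -msize-level= can be summarized in a few lines. The following is a sketch of the rules as stated above, not the compiler's actual code; SIZE_LEVEL_UNSET and effective_size_level are hypothetical names, and the function assumes a requested level in the range 0 to 3.

```c
/* Hypothetical sentinel meaning "-msize-level= was not given". */
#define SIZE_LEVEL_UNSET (-1)

/* Return the level that takes effect, following the rules above:
   an unset level becomes 3 under -Os and 1 otherwise, and the
   deprecated level 0 is treated like 1. */
int effective_size_level(int requested, int optimize_for_size)
{
    if (requested == SIZE_LEVEL_UNSET)
        return optimize_for_size ? 3 : 1;
    if (requested == 0)
        return 1;
    return requested;
}
```

An explicit level always wins over the -Os default in this sketch, so requesting level 2 together with -Os yields 2.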

-mtune= cpu
Set instruction scheduling parameters for cpu, overriding any implied by -mcpu=.

Supported values for cpu are:

ARC600
Tune for ARC600 cpu.
ARC601
Tune for ARC601 cpu.
ARC700
Tune for ARC700 cpu with standard multiplier block.
ARC700-xmac
Tune for ARC700 cpu with XMAC block.
ARC725D
Tune for ARC725D cpu.
ARC750D
Tune for ARC750D cpu.

-mmultcost= num
Cost to assume for a multiply instruction, with ‘4’ being equal to a normal instruction.
-munalign-prob-threshold= probability
Set probability threshold for unaligning branches. When tuning for ‘ARC700’ and optimizing for speed, branches without a filled delay slot are preferably emitted unaligned and long, unless profiling indicates that the probability for the branch to be taken is below probability. See Cross-profiling. The default is (REG_BR_PROB_BASE/2), i.e. 5000.
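
Since the default of REG_BR_PROB_BASE/2 is given as 5000, probabilities here are on a scale where 10000 means certainty. The sketch below converts a percentage to that scale and applies the comparison described above. It covers only the probability test; the other conditions (tuning for ARC700, optimizing for speed, no filled delay slot) are left out. PROB_BASE and the function names are hypothetical stand-ins.

```c
/* Scale inferred from the stated default: base / 2 == 5000. */
#define PROB_BASE 10000

/* Convert a whole-number percentage (0..100) into a threshold value
   suitable for -munalign-prob-threshold=. */
int threshold_from_percent(int percent)
{
    return percent * PROB_BASE / 100;
}

/* Return nonzero if a branch with the given taken-probability would
   be preferred unaligned and long, i.e. its probability is not below
   the threshold. */
int prefer_unaligned(int taken_prob, int threshold)
{
    return taken_prob >= threshold;
}
```

For example, a 75% threshold corresponds to the value 7500 on this scale.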

The following options are maintained for backward compatibility, but are now deprecated and will be removed in a future release:

-margonaut
Obsolete FPX.
-mbig-endian
-EB
Compile code for big endian targets. Use of these options is now deprecated. Users wanting big-endian code should use the arceb-elf32 and arceb-linux-uclibc targets when building the tool chain, for which big-endian is the default.
-mlittle-endian
-EL
Compile code for little endian targets. Use of these options is now deprecated. Users wanting little-endian code should use the arc-elf32 and arc-linux-uclibc targets when building the tool chain, for which little-endian is the default.
-mbarrel_shifter
Replaced by -mbarrel-shifter.
-mdpfp_compact
Replaced by -mdpfp-compact.
-mdpfp_fast
Replaced by -mdpfp-fast.
-mdsp_packa
Replaced by -mdsp-packa.
-mEA
Replaced by -mea.
-mmac_24
Replaced by -mmac-24.
-mmac_d16
Replaced by -mmac-d16.
-mspfp_compact
Replaced by -mspfp-compact.
-mspfp_fast
Replaced by -mspfp-fast.
-mtune= cpu
Values ‘arc600’, ‘arc601’, ‘arc700’ and ‘arc700-xmac’ for cpu are replaced by ‘ARC600’, ‘ARC601’, ‘ARC700’ and ‘ARC700-xmac’ respectively.
-multcost= num
Replaced by -mmultcost.