This is gccint.info, produced by makeinfo version 7.1 from gccint.texi.

Copyright © 1988-2024 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with the Invariant Sections being "Funding Free Software", the Front-Cover Texts being (a) (see below), and with the Back-Cover Texts being (b) (see below). A copy of the license is included in the section entitled "GNU Free Documentation License".

(a) The FSF's Front-Cover Text is:

A GNU Manual

(b) The FSF's Back-Cover Text is:

You have freedom to copy and modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development.

INFO-DIR-SECTION Software development
START-INFO-DIR-ENTRY
* gccint: (gccint). Internals of the GNU Compiler Collection.
END-INFO-DIR-ENTRY

This file documents the internals of the GNU compilers.

File: gccint.info, Node: Pattern Ordering, Next: Dependent Patterns, Prev: Standard Names, Up: Machine Desc

17.11 When the Order of Patterns Matters
========================================

Sometimes an insn can match more than one instruction pattern. Then the pattern that appears first in the machine description is the one used. Therefore, more specific patterns (patterns that will match fewer things) and faster instructions (those that will produce better code when they do match) should usually go first in the description.

In some cases the effect of ordering the patterns can be used to hide a pattern when it is not valid. For example, the 68000 has an instruction for converting a fullword to floating point and another for converting a byte to floating point. An instruction converting an integer to floating point could match either one. We put the pattern to convert the fullword first to make sure that one will be used rather than the other. (Otherwise a large integer might be generated as a single-byte immediate quantity, which would not work.) Instead of using this pattern ordering it would be possible to make the pattern for convert-a-byte smart enough to deal properly with any constant value.

File: gccint.info, Node: Dependent Patterns, Next: Jump Patterns, Prev: Pattern Ordering, Up: Machine Desc

17.12 Interdependence of Patterns
=================================

In some cases machines support instructions identical except for the machine mode of one or more operands. For example, there may be "sign-extend halfword" and "sign-extend byte" instructions whose patterns are

     (set (match_operand:SI 0 ...)
          (extend:SI (match_operand:HI 1 ...)))

     (set (match_operand:SI 0 ...)
          (extend:SI (match_operand:QI 1 ...)))

Constant integers do not specify a machine mode, so an instruction to extend a constant value could match either pattern. The pattern it actually will match is the one that appears first in the file. For correct results, this must be the one for the widest possible mode (‘HImode’, here). If the pattern matches the ‘QImode’ instruction, the results will be incorrect if the constant value does not actually fit that mode.
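
The hazard with constants can be reproduced with plain integer arithmetic. The following C sketch is illustrative only and is not taken from GCC; the helper names ‘sign_extend_qi’ and ‘sign_extend_hi’ are hypothetical models of what a sign-extend-byte and a sign-extend-halfword instruction would compute from a constant operand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a "sign-extend byte" instruction: only the
   low 8 bits of the operand contribute to the result.  */
int32_t
sign_extend_qi (int32_t value)
{
  int32_t low = value & 0xff;
  return (low & 0x80) ? low - 0x100 : low;
}

/* Hypothetical model of a "sign-extend halfword" instruction: the
   low 16 bits of the operand contribute to the result.  */
int32_t
sign_extend_hi (int32_t value)
{
  int32_t low = value & 0xffff;
  return (low & 0x8000) ? low - 0x10000 : low;
}
```

For the constant 300, ‘sign_extend_hi’ returns 300 but ‘sign_extend_qi’ returns 44, because 300 does not fit in eight bits. This is the wrong result that placing the ‘HImode’ pattern first avoids.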

Such instructions to extend constants are rarely generated because they are optimized away, but they do occasionally happen in nonoptimized compilations.

If a constraint in a pattern allows a constant, the reload pass may replace a register with a constant permitted by the constraint in some cases. Similarly for memory references. Because of this substitution, you should not provide separate patterns for increment and decrement instructions. Instead, they should be generated from the same pattern that supports register-register add insns by examining the operands and generating the appropriate machine instruction.

File: gccint.info, Node: Jump Patterns, Next: Looping Patterns, Prev: Dependent Patterns, Up: Machine Desc

17.13 Defining Jump Instruction Patterns
========================================

GCC does not assume anything about how the machine realizes jumps. The machine description should define a single pattern, usually a ‘define_expand’, which expands to all the required insns.

Usually, this would be a comparison insn to set the condition code and a separate branch insn testing the condition code and branching or not according to its value. For many machines, however, separating compares and branches is limiting, which is why the more flexible approach with one ‘define_expand’ is used in GCC. The machine description becomes clearer for architectures that have compare-and-branch instructions but no condition code. It also works better when different sets of comparison operators are supported by different kinds of conditional branches (e.g. integer vs. floating-point), or by conditional branches with respect to conditional stores.

Two separate insns are always used on most machines that use a separate condition code register (*note Condition Code::). Even in this case having a single entry point for conditional branches is advantageous, because it handles equally well the case where a single comparison instruction records the results of both signed and unsigned comparison of the given operands (with the branch insns coming in distinct signed and unsigned flavors) as in the x86 or SPARC, and the case where there are distinct signed and unsigned compare instructions and only one set of conditional branch instructions as in the PowerPC.

File: gccint.info, Node: Looping Patterns, Next: Insn Canonicalizations, Prev: Jump Patterns, Up: Machine Desc

17.14 Defining Looping Instruction Patterns
===========================================

Some machines have special jump instructions that can be utilized to make loops more efficient. A common example is the 68000 ‘dbra’ instruction which performs a decrement of a register and a branch if the result was greater than zero. Other machines, in particular digital signal processors (DSPs), have special block repeat instructions to provide low-overhead loop support. For example, the TI TMS320C3x/C4x DSPs have a block repeat instruction that loads special registers to mark the top and end of a loop and to count the number of loop iterations. This avoids the need for fetching and executing a ‘dbra’-like instruction and avoids pipeline stalls associated with the jump.

GCC has two special named patterns to support low overhead looping. They are ‘doloop_begin’ and ‘doloop_end’. These are emitted by the loop optimizer for certain well-behaved loops with a finite number of loop iterations using information collected during strength reduction.

The ‘doloop_end’ pattern describes the actual looping instruction (or the implicit looping operation) and the ‘doloop_begin’ pattern is an optional companion pattern that can be used for initialization needed for some low-overhead looping instructions.

Note that some machines require the actual looping instruction to be emitted at the top of the loop (e.g., the TMS320C3x/C4x DSPs). Emitting the true RTL for a looping instruction at the top of the loop can cause problems with flow analysis. So instead, a dummy ‘doloop’ insn is emitted at the end of the loop. The machine dependent reorg pass checks for the presence of this ‘doloop’ insn and then searches back to the top of the loop, where it inserts the true looping insn (provided there are no instructions in the loop which would cause problems). Any additional labels can be emitted at this point. In addition, if the desired special iteration counter register was not allocated, this machine dependent reorg pass could emit a traditional compare and jump instruction pair.

For the ‘doloop_end’ pattern, the loop optimizer allocates an additional pseudo register as an iteration counter. This pseudo register cannot be used within the loop (i.e., general induction variables cannot be derived from it); however, in many cases the loop induction variable may become redundant and be removed by the flow pass.

The ‘doloop_end’ pattern must have a specific structure to be handled correctly by GCC. The example below is taken (slightly simplified) from the PDP-11 target:

     (define_expand "doloop_end"
       [(parallel [(set (pc)
                        (if_then_else
                         (ne (match_operand:HI 0 "nonimmediate_operand" "+r,!m")
                             (const_int 1))
                         (label_ref (match_operand 1 "" ""))
                         (pc)))
                   (set (match_dup 0)
                        (plus:HI (match_dup 0)
                                 (const_int -1)))])]
       ""
       "{
         if (GET_MODE (operands[0]) != HImode)
           FAIL;
       }")

     (define_insn "doloop_end_insn"
       [(set (pc)
             (if_then_else
              (ne (match_operand:HI 0 "nonimmediate_operand" "+r,!m")
                  (const_int 1))
              (label_ref (match_operand 1 "" ""))
              (pc)))
        (set (match_dup 0)
             (plus:HI (match_dup 0)
                      (const_int -1)))]
       ""
       {
         if (which_alternative == 0)
           return "sob %0,%l1";

         /* emulate sob */
         output_asm_insn ("dec %0", operands);
         return "bne %l1";
       })

The first part of the pattern describes the branch condition. GCC supports three cases for the way the target machine handles the loop counter:

   • Loop terminates when the loop register decrements to zero. This is represented by a ‘ne’ comparison of the register (its old value) with constant 1 (as in the example above).

   • Loop terminates when the loop register decrements to −1. This is represented by a ‘ne’ comparison of the register with constant zero.

   • Loop terminates when the loop register decrements to a negative value. This is represented by a ‘ge’ comparison of the register with constant zero. For this case, GCC will attach a ‘REG_NONNEG’ note to the ‘doloop_end’ insn if it can determine that the register will be non-negative.

Since the ‘doloop_end’ insn is a jump insn that also has an output, the reload pass does not handle the output operand. Therefore, the constraint must allow for that operand to be in memory rather than a register. In the example shown above, that is handled (in the ‘doloop_end_insn’ pattern) by using a loop instruction sequence that can handle memory operands when the memory alternative appears.

GCC does not check the mode of the loop register operand when generating the ‘doloop_end’ pattern. If the pattern is only valid for some modes but not others, the pattern should be a ‘define_expand’ pattern that checks the operand mode in the preparation code, and issues ‘FAIL’ if an unsupported mode is found. The example above does this, since the machine instruction to be used only exists for ‘HImode’.

If the ‘doloop_end’ pattern is a ‘define_expand’, there must also be a ‘define_insn’ or ‘define_insn_and_split’ matching the generated pattern. Otherwise, the compiler will fail during loop optimization.
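
The three termination conventions listed above differ in how many times the loop body runs for a given initial counter value. The following C sketch is a simplified model written for illustration, not GCC code; the names ‘doloop_stop’, ‘doloop_finished’ and ‘doloop_body_executions’ are hypothetical. It models each convention by the value the counter holds after the decrement when the loop stops. For the first case, testing the old value against 1 is the same as testing the decremented value against zero.

```c
#include <assert.h>

/* Hypothetical names for the three conventions described in the text.  */
enum doloop_stop { STOP_AT_ZERO, STOP_AT_MINUS_ONE, STOP_WHEN_NEGATIVE };

/* Return nonzero if a loop of kind KIND stops once its counter has
   been decremented to COUNTER.  */
static int
doloop_finished (enum doloop_stop kind, int counter)
{
  switch (kind)
    {
    case STOP_AT_ZERO:
      return counter == 0;
    case STOP_AT_MINUS_ONE:
      return counter == -1;
    default:
      return counter < 0;
    }
}

/* Count how often the loop body runs when the counter starts at START.
   The model assumes START is at least 1 so that every kind terminates.  */
int
doloop_body_executions (enum doloop_stop kind, int start)
{
  int counter = start;
  int executions = 0;

  assert (start >= 1);
  do
    {
      executions++;   /* the loop body runs */
      counter--;      /* the looping insn decrements the counter */
    }
  while (!doloop_finished (kind, counter));
  return executions;
}
```

In this model, a counter that starts at 5 gives five executions of the body for the first convention and six for the other two, so the initial counter value depends on the convention the target uses.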

File: gccint.info, Node: Insn Canonicalizations, Next: Expander Definitions, Prev: Looping Patterns, Up: Machine Desc

17.15 Canonicalization of Instructions
======================================

There are often cases where multiple RTL expressions could represent an operation performed by a single machine instruction. This situation is most commonly encountered with logical, branch, and multiply-accumulate instructions. In such cases, the compiler attempts to convert these multiple RTL expressions into a single canonical form to reduce the number of insn patterns required.

In addition to algebraic simplifications, the following canonicalizations are performed:

   • For commutative and comparison operators, a constant is always made the second operand. If a machine only supports a constant as the second operand, only patterns that match a constant in the second operand need be supplied.

   • For the ‘vec_merge’ with constant mask (the third operand), the first and the second operand can be exchanged by inverting the mask. In such cases, a constant is always made the second operand; otherwise the least significant bit of the mask is always set (select the first operand first).

   • For associative operators, a sequence of operators will always chain to the left; for instance, only the left operand of an integer ‘plus’ can itself be a ‘plus’. ‘and’, ‘ior’, ‘xor’, ‘plus’, ‘mult’, ‘smin’, ‘smax’, ‘umin’, and ‘umax’ are associative when applied to integers, and sometimes to floating-point.

   • For these operators, if only one operand is a ‘neg’, ‘not’, ‘mult’, ‘plus’, or ‘minus’ expression, it will be the first operand.

   • In combinations of ‘neg’, ‘mult’, ‘plus’, and ‘minus’, the ‘neg’ operations (if any) will be moved inside the operations as far as possible. For instance, ‘(neg (mult A B))’ is canonicalized as ‘(mult (neg A) B)’, but ‘(plus (mult (neg B) C) A)’ is canonicalized as ‘(minus A (mult B C))’.

   • For the ‘compare’ operator, a constant is always the second operand if the first argument is a condition code register.

   • For instructions that inherently set a condition code register, the ‘compare’ operator is always written as the first RTL expression of the ‘parallel’ instruction pattern. For example,

          (define_insn ""
            [(set (reg:CCZ FLAGS_REG)
                  (compare:CCZ
                    (plus:SI
                      (match_operand:SI 1 "register_operand" "%r")
                      (match_operand:SI 2 "register_operand" "r"))
                    (const_int 0)))
             (set (match_operand:SI 0 "register_operand" "=r")
                  (plus:SI (match_dup 1) (match_dup 2)))]
            ""
            "addl %0, %1, %2")

   • An operand of ‘neg’, ‘not’, ‘mult’, ‘plus’, or ‘minus’ is made the first operand under the same conditions as above.

   • ‘(ltu (plus A B) B)’ is converted to ‘(ltu (plus A B) A)’. Likewise with ‘geu’ instead of ‘ltu’.

   • ‘(minus X (const_int N))’ is converted to ‘(plus X (const_int -N))’.

   • Within address computations (i.e., inside ‘mem’), a left shift is converted into the appropriate multiplication by a power of two.

   • De Morgan's Law is used to move bitwise negation inside a bitwise logical-and or logical-or operation. If this results in only one operand being a ‘not’ expression, it will be the first one.

     A machine that has an instruction that performs a bitwise logical-and of one operand with the bitwise negation of the other should specify the pattern for that instruction as

          (define_insn ""
            [(set (match_operand:M 0 ...)
                  (and:M (not:M (match_operand:M 1 ...))
                         (match_operand:M 2 ...)))]
            "..."
            "...")

     Similarly, a pattern for a "NAND" instruction should be written

          (define_insn ""
            [(set (match_operand:M 0 ...)
                  (ior:M (not:M (match_operand:M 1 ...))
                         (not:M (match_operand:M 2 ...))))]
            "..."
            "...")

     In both cases, it is not necessary to include patterns for the many logically equivalent RTL expressions.

   • The only possible RTL expressions involving both bitwise exclusive-or and bitwise negation are ‘(xor:M X Y)’ and ‘(not:M (xor:M X Y))’.

   • The sum of three items, one of which is a constant, will only appear in the form

          (plus:M (plus:M X Y) CONSTANT)

   • Equality comparisons of a group of bits (usually a single bit) with zero will be written using ‘zero_extract’ rather than the equivalent ‘and’ or ‘sign_extract’ operations.

   • ‘(sign_extend:M1 (mult:M2 (sign_extend:M2 X) (sign_extend:M2 Y)))’ is converted to ‘(mult:M1 (sign_extend:M1 X) (sign_extend:M1 Y))’, and likewise for ‘zero_extend’.

   • ‘(sign_extend:M1 (mult:M2 (ashiftrt:M2 X S) (sign_extend:M2 Y)))’ is converted to ‘(mult:M1 (sign_extend:M1 (ashiftrt:M2 X S)) (sign_extend:M1 Y))’, and likewise for patterns using ‘zero_extend’ and ‘lshiftrt’. If the second operand of ‘mult’ is also a shift, then that is extended also. This transformation is only applied when it can be proven that the original operation had sufficient precision to prevent overflow.

Further canonicalization rules are defined in the function ‘commutative_operand_precedence’ in ‘gcc/rtlanal.cc’.

File: gccint.info, Node: Expander Definitions, Next: Insn Splitting, Prev: Insn Canonicalizations, Up: Machine Desc

17.16 Defining RTL Sequences for Code Generation
================================================

On some target machines, some standard pattern names for RTL generation cannot be handled with a single insn, but a sequence of RTL insns can represent them. For these target machines, you can write a ‘define_expand’ to specify how to generate the sequence of RTL.

A ‘define_expand’ is an RTL expression that looks almost like a ‘define_insn’; but, unlike the latter, a ‘define_expand’ is used only for RTL generation and it can produce more than one RTL insn.

A ‘define_expand’ RTX has four operands:

   • The name. Each ‘define_expand’ must have a name, since the only use for it is to refer to it by name.

   • The RTL template. This is a vector of RTL expressions representing a sequence of separate instructions. Unlike ‘define_insn’, there is no implicit surrounding ‘PARALLEL’.

   • The condition, a string containing a C expression. This expression is used to express how the availability of this pattern depends on subclasses of target machine, selected by command-line options when GCC is run. This is just like the condition of a ‘define_insn’ that has a standard name. Therefore, the condition (if present) may not depend on the data in the insn being matched, but only the target-machine-type flags. The compiler needs to test these conditions during initialization in order to learn exactly which named instructions are available in a particular run.

   • The preparation statements, a string containing zero or more C statements which are to be executed before RTL code is generated from the RTL template.

     Usually these statements prepare temporary registers for use as internal operands in the RTL template, but they can also generate RTL insns directly by calling routines such as ‘emit_insn’, etc. Any such insns precede the ones that come from the RTL template.

   • Optionally, a vector containing the values of attributes. *Note Insn Attributes::.

Every RTL insn emitted by a ‘define_expand’ must match some ‘define_insn’ in the machine description. Otherwise, the compiler will crash when trying to generate code for the insn or trying to optimize it.

The RTL template, in addition to controlling generation of RTL insns, also describes the operands that need to be specified when this pattern is used. In particular, it gives a predicate for each operand.

A true operand, which needs to be specified in order to generate RTL from the pattern, should be described with a ‘match_operand’ in its first occurrence in the RTL template. This enters information on the operand's predicate into the tables that record such things. GCC uses the information to preload the operand into a register if that is required for valid RTL code. If the operand is referred to more than once, subsequent references should use ‘match_dup’.

The RTL template may also refer to internal "operands" which are temporary registers or labels used only within the sequence made by the ‘define_expand’. Internal operands are substituted into the RTL template with ‘match_dup’, never with ‘match_operand’. The values of the internal operands are not passed in as arguments by the compiler when it requests use of this pattern. Instead, they are computed within the pattern, in the preparation statements. These statements compute the values and store them into the appropriate elements of ‘operands’ so that ‘match_dup’ can find them.

There are two special macros defined for use in the preparation statements: ‘DONE’ and ‘FAIL’. Use them with a following semicolon, as a statement.

‘DONE’
     Use the ‘DONE’ macro to end RTL generation for the pattern. The only RTL insns resulting from the pattern on this occasion will be those already emitted by explicit calls to ‘emit_insn’ within the preparation statements; the RTL template will not be generated.

‘FAIL’
     Make the pattern fail on this occasion. When a pattern fails, it means that the pattern was not truly available. The calling routines in the compiler will try other strategies for code generation using other patterns.

     Failure is currently supported only for binary (addition, multiplication, shifting, etc.) and bit-field (‘extv’, ‘extzv’, and ‘insv’) operations.

If the preparation falls through (invokes neither ‘DONE’ nor ‘FAIL’), then the ‘define_expand’ acts like a ‘define_insn’ in that the RTL template is used to generate the insn.

The RTL template is not used for matching, only for generating the initial insn list.
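
The three ways in which preparation statements can end (‘DONE’, ‘FAIL’, or falling through) can be pictured with a toy model in C. Everything in this sketch is hypothetical and written for illustration: the macros below are stand-ins that merely return a status and are not GCC's definitions, and ‘toy_emit_insn’ and ‘toy_expand_mul’ do not exist in GCC. The imagined expander multiplies by a constant factor: it fails for factors it cannot handle, emits a shift itself for powers of two, and otherwise leaves the work to the RTL template.

```c
#include <assert.h>

/* Hypothetical status codes for the three outcomes.  */
enum expand_outcome { USED_TEMPLATE, ENDED_WITH_DONE, FAILED };

/* Toy stand-ins for the real macros; they only return a status.  */
#define DONE return ENDED_WITH_DONE
#define FAIL return FAILED

/* Counts the insns that preparation statements emit explicitly.  */
int toy_emitted_insns;

static void
toy_emit_insn (void)
{
  toy_emitted_insns++;
}

/* Model of the preparation statements of an imaginary expander that
   multiplies by the constant FACTOR.  */
enum expand_outcome
toy_expand_mul (long factor)
{
  if (factor <= 0)
    FAIL;                       /* pattern not available for this operand */

  if ((factor & (factor - 1)) == 0)
    {
      toy_emit_insn ();         /* emit a shift instead of a multiply */
      DONE;                     /* the RTL template is not generated */
    }

  /* Falling through: the RTL template generates the insn.  */
  return USED_TEMPLATE;
}
```

In this model a factor of 8 ends with ‘DONE’ after one explicitly emitted insn, a factor of −3 fails so that the caller would have to try another strategy, and a factor of 6 falls through to the template.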

If the preparation statement always invokes ‘DONE’ or ‘FAIL’, the RTL template may be reduced to a simple list of operands, such as this example:

     (define_expand "addsi3"
       [(match_operand:SI 0 "register_operand" "")
        (match_operand:SI 1 "register_operand" "")
        (match_operand:SI 2 "register_operand" "")]
       ""
       "
     {
       handle_add (operands[0], operands[1], operands[2]);
       DONE;
     }")

Here is an example, the definition of left-shift for the SPUR chip:

     (define_expand "ashlsi3"
       [(set (match_operand:SI 0 "register_operand" "")
             (ashift:SI
               (match_operand:SI 1 "register_operand" "")
               (match_operand:SI 2 "nonmemory_operand" "")))]
       ""
       "
     {
       if (GET_CODE (operands[2]) != CONST_INT
           || (unsigned) INTVAL (operands[2]) > 3)
         FAIL;
     }")

This example uses ‘define_expand’ so that it can generate an RTL insn for shifting when the shift-count is in the supported range of 0 to 3 but fail in other cases where machine insns aren't available. When it fails, the compiler tries another strategy using different patterns (such as a library call).

If the compiler were able to handle nontrivial condition-strings in patterns with names, then it would be possible to use a ‘define_insn’ in that case. Here is another case (zero-extension on the 68000) which makes more use of the power of ‘define_expand’:

     (define_expand "zero_extendhisi2"
       [(set (match_operand:SI 0 "general_operand" "")
             (const_int 0))
        (set (strict_low_part
               (subreg:HI
                 (match_dup 0)
                 0))
             (match_operand:HI 1 "general_operand" ""))]
       ""
       "operands[1] = make_safe_from (operands[1], operands[0]);")

Here two RTL insns are generated, one to clear the entire output operand and the other to copy the input operand into its low half. This sequence is incorrect if the input operand refers to [the old value of] the output operand, so the preparation statement makes sure this isn't so. The function ‘make_safe_from’ copies ‘operands[1]’ into a temporary register if it refers to ‘operands[0]’. It does this by emitting another RTL insn.
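
The reason the clear-then-copy sequence needs this protection can be shown with a small register model in C. This is a sketch under stated assumptions, not GCC code: the array ‘regs’ and the functions ‘zero_extend_naive’ and ‘zero_extend_safe’ are hypothetical, and the safe variant always copies the input to a temporary, whereas ‘make_safe_from’ does so only when the input refers to the output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file of four 32-bit registers.  */
uint32_t regs[4];

/* Clear the whole output register, then copy the low half of the
   input register into its low half, with no protection.  */
void
zero_extend_naive (int out, int in)
{
  regs[out] = 0;
  regs[out] = (regs[out] & 0xffff0000u) | (regs[in] & 0xffffu);
}

/* Same sequence, but the input is first saved in a temporary, in the
   spirit of what make_safe_from arranges.  */
void
zero_extend_safe (int out, int in)
{
  uint32_t tmp = regs[in];

  regs[out] = 0;
  regs[out] = (regs[out] & 0xffff0000u) | (tmp & 0xffffu);
}
```

When the input and output are the same register, ‘zero_extend_naive’ clears the value before it is read and the result is always zero, while ‘zero_extend_safe’ produces the low halfword as intended.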

Finally, a third example shows the use of an internal operand. Zero-extension on the SPUR chip is done by ‘and’-ing the result against a halfword mask. But this mask cannot be represented by a ‘const_int’ because the constant value is too large to be legitimate on this machine. So it must be copied into a register with ‘force_reg’ and then the register used in the ‘and’.

     (define_expand "zero_extendhisi2"
       [(set (match_operand:SI 0 "register_operand" "")
             (and:SI (subreg:SI
                       (match_operand:HI 1 "register_operand" "")
                       0)
                     (match_dup 2)))]
       ""
       "operands[2]
          = force_reg (SImode, GEN_INT (65535)); ")

_Note:_ If the ‘define_expand’ is used to serve a standard binary or unary arithmetic operation or a bit-field operation, then the last insn it generates must not be a ‘code_label’, ‘barrier’ or ‘note’. It must be an ‘insn’, ‘jump_insn’ or ‘call_insn’. If you don't need a real insn at the end, emit an insn to copy the result of the operation into itself. Such an insn will generate no code, but it can avoid problems in the compiler.

File: gccint.info, Node: Insn Splitting, Next: Including Patterns, Prev: Expander Definitions, Up: Machine Desc

17.17 Defining How to Split Instructions
========================================

There are two cases where you should specify how to split a pattern into multiple insns. On machines that have instructions requiring delay slots (*note Delay Slots::) or that have instructions whose output is not available for multiple cycles (*note Processor pipeline description::), the compiler phases that optimize these cases need to be able to move insns into one-instruction delay slots. However, some insns may generate more than one machine instruction. These insns cannot be placed into a delay slot.

Often you can rewrite the single insn as a list of individual insns, each corresponding to one machine instruction. The disadvantage of doing so is that it will cause the compilation to be slower and require more space. If the resulting insns are too complex, it may also suppress some optimizations. The compiler splits the insn if there is a reason to believe that it might improve instruction or delay slot scheduling.

The insn combiner phase also splits putative insns. If three insns are merged into one insn with a complex expression that cannot be matched by some ‘define_insn’ pattern, the combiner phase attempts to split the complex pattern into two insns that are recognized. Usually it can break the complex pattern into two patterns by splitting out some subexpression. However, in some other cases, such as performing an addition of a large constant in two insns on a RISC machine, the way to split the addition into two insns is machine-dependent.

The ‘define_split’ definition tells the compiler how to split a complex insn into several simpler insns. It looks like this:

     (define_split
       [INSN-PATTERN]
       "CONDITION"
       [NEW-INSN-PATTERN-1
        NEW-INSN-PATTERN-2
        ...]
       "PREPARATION-STATEMENTS")

INSN-PATTERN is a pattern that needs to be split and CONDITION is the final condition to be tested, as in a ‘define_insn’. When an insn matching INSN-PATTERN and satisfying CONDITION is found, it is replaced in the insn list with the insns given by NEW-INSN-PATTERN-1, NEW-INSN-PATTERN-2, etc.

The PREPARATION-STATEMENTS are similar to those statements that are specified for ‘define_expand’ (*note Expander Definitions::) and are executed before the new RTL is generated to prepare for the generated code or emit some insns whose pattern is not fixed. Unlike those in ‘define_expand’, however, these statements must not generate any new pseudo-registers. Once reload has completed, they also must not allocate any space in the stack frame.

There are two special macros defined for use in the preparation statements: ‘DONE’ and ‘FAIL’. Use them with a following semicolon, as a statement.

‘DONE’
     Use the ‘DONE’ macro to end RTL generation for the splitter. The only RTL insns generated as replacement for the matched input insn will be those already emitted by explicit calls to ‘emit_insn’ within the preparation statements; the replacement pattern is not used.

‘FAIL’
     Make the ‘define_split’ fail on this occasion. When a ‘define_split’ fails, it means that the splitter was not truly available for the inputs it was given, and the input insn will not be split.

If the preparation falls through (invokes neither ‘DONE’ nor ‘FAIL’), then the ‘define_split’ uses the replacement template.

Patterns are matched against INSN-PATTERN in two different circumstances. If an insn needs to be split for delay slot scheduling or insn scheduling, the insn is already known to be valid, which means that it must have been matched by some ‘define_insn’ and, if ‘reload_completed’ is nonzero, is known to satisfy the constraints of that ‘define_insn’. In that case, the new insn patterns must also be insns that are matched by some ‘define_insn’ and, if ‘reload_completed’ is nonzero, must also satisfy the constraints of those definitions.

As an example of this usage of ‘define_split’, consider the following example from ‘a29k.md’, which splits a ‘sign_extend’ from ‘HImode’ to ‘SImode’ into a pair of shift insns:

     (define_split
       [(set (match_operand:SI 0 "gen_reg_operand" "")
             (sign_extend:SI (match_operand:HI 1 "gen_reg_operand" "")))]
       ""
       [(set (match_dup 0)
             (ashift:SI (match_dup 1)
                        (const_int 16)))
        (set (match_dup 0)
             (ashiftrt:SI (match_dup 0)
                          (const_int 16)))]
       "
     { operands[1] = gen_lowpart (SImode, operands[1]); }")

When the combiner phase tries to split an insn pattern, it is always the case that the pattern is _not_ matched by any ‘define_insn’. The combiner pass first tries to split a single ‘set’ expression and then the same ‘set’ expression inside a ‘parallel’, but followed by a ‘clobber’ of a pseudo-reg to use as a scratch register. In these cases, the combiner expects exactly one or two new insn patterns to be generated.
It will verify that these patterns match some ‘define_insn’ definitions, so you need not do this test in the ‘define_split’ (of course, there is no point in writing a ‘define_split’ that will never produce insns that match).

Here is an example of this use of ‘define_split’, taken from ‘rs6000.md’:

     (define_split
       [(set (match_operand:SI 0 "gen_reg_operand" "")
             (plus:SI (match_operand:SI 1 "gen_reg_operand" "")
                      (match_operand:SI 2 "non_add_cint_operand" "")))]
       ""
       [(set (match_dup 0) (plus:SI (match_dup 1) (match_dup 3)))
        (set (match_dup 0) (plus:SI (match_dup 0) (match_dup 4)))]
     "
     {
       int low = INTVAL (operands[2]) & 0xffff;
       int high = (unsigned) INTVAL (operands[2]) >> 16;

       if (low & 0x8000)
         high++, low |= 0xffff0000;

       operands[3] = GEN_INT (high << 16);
       operands[4] = GEN_INT (low);
     }")

Here the predicate ‘non_add_cint_operand’ matches any ‘const_int’ that is _not_ a valid operand of a single add insn. The add with the smaller displacement is written so that it can be substituted into the address of a subsequent operation.

An example that uses a scratch register, from the same file, generates an equality comparison of a register and a large constant:

     (define_split
       [(set (match_operand:CC 0 "cc_reg_operand" "")
             (compare:CC (match_operand:SI 1 "gen_reg_operand" "")
                         (match_operand:SI 2 "non_short_cint_operand" "")))
        (clobber (match_operand:SI 3 "gen_reg_operand" ""))]
       "find_single_use (operands[0], insn, 0)
        && (GET_CODE (*find_single_use (operands[0], insn, 0)) == EQ
            || GET_CODE (*find_single_use (operands[0], insn, 0)) == NE)"
       [(set (match_dup 3) (xor:SI (match_dup 1) (match_dup 4)))
        (set (match_dup 0) (compare:CC (match_dup 3) (match_dup 5)))]
       "
     {
       /* Get the constant we are comparing against, C, and see what it
          looks like sign-extended to 16 bits.  Then see what constant
          could be XOR'ed with C to get the sign-extended value.  */

       int c = INTVAL (operands[2]);
       int sextc = (c << 16) >> 16;
       int xorv = c ^ sextc;

       operands[4] = GEN_INT (xorv);
       operands[5] = GEN_INT (sextc);
     }")

To avoid confusion, don't write a single ‘define_split’ that accepts some insns that match some ‘define_insn’ as well as some insns that don't. Instead, write two separate ‘define_split’ definitions, one for the insns that are valid and one for the insns that are not valid.

The splitter is allowed to split jump instructions into a sequence of jumps or create new jumps while splitting non-jump instructions. As the control flow graph and branch prediction information needs to be updated after the splitter runs, several restrictions apply.

Splitting of a jump instruction into a sequence that has another jump instruction to the same label is always valid, as the compiler expects identical behavior of the new jump. When the new sequence contains multiple jump instructions or new labels, more assistance is needed. The splitter is permitted to create only unconditional jumps, or simple conditional jump instructions. Additionally it must attach a ‘REG_BR_PROB’ note to each conditional jump. A global variable ‘split_branch_probability’ holds the probability of the original branch in case it was a simple conditional jump, −1 otherwise. To simplify recomputing of edge frequencies, the new sequence is permitted to have only forward jumps to the newly-created labels.

For the common case where the pattern of a ‘define_split’ exactly matches the pattern of a ‘define_insn’, use ‘define_insn_and_split’. It looks like this:

     (define_insn_and_split
       [INSN-PATTERN]
       "CONDITION"
       "OUTPUT-TEMPLATE"
       "SPLIT-CONDITION"
       [NEW-INSN-PATTERN-1
        NEW-INSN-PATTERN-2
        ...]
       "PREPARATION-STATEMENTS"
       [INSN-ATTRIBUTES])

INSN-PATTERN, CONDITION, OUTPUT-TEMPLATE, and INSN-ATTRIBUTES are used as in ‘define_insn’. The NEW-INSN-PATTERN vector and the PREPARATION-STATEMENTS are used as in a ‘define_split’.
The SPLIT-CONDITION is also used as in ‘define_split’, with the additional behavior that if the condition starts with ‘&&’, the condition used for the split will be constructed as a logical "and" of the split condition with the insn condition. For example, from ‘i386.md’:

     (define_insn_and_split "zero_extendhisi2_and"
       [(set (match_operand:SI 0 "register_operand" "=r")
          (zero_extend:SI (match_operand:HI 1 "register_operand" "0")))
        (clobber (reg:CC 17))]
       "TARGET_ZERO_EXTEND_WITH_AND && !optimize_size"
       "#"
       "&& reload_completed"
       [(parallel [(set (match_dup 0)
                        (and:SI (match_dup 0) (const_int 65535)))
                   (clobber (reg:CC 17))])]
       ""
       [(set_attr "type" "alu1")])

In this case, the actual split condition will be ‘TARGET_ZERO_EXTEND_WITH_AND && !optimize_size && reload_completed’.

The ‘define_insn_and_split’ construction provides exactly the same functionality as two separate ‘define_insn’ and ‘define_split’ patterns. It exists for compactness, and as a maintenance tool that removes the need to keep the two patterns' templates in sync by hand.

It is sometimes useful to have a ‘define_insn_and_split’ that replaces specific operands of an instruction but leaves the rest of the instruction pattern unchanged. You can do this directly with a ‘define_insn_and_split’, but it requires a NEW-INSN-PATTERN-1 that repeats most of the original INSN-PATTERN. There is also the complication that an implicit ‘parallel’ in INSN-PATTERN must become an explicit ‘parallel’ in NEW-INSN-PATTERN-1, which is easy to overlook. A simpler alternative is to use ‘define_insn_and_rewrite’, which is a form of ‘define_insn_and_split’ that automatically generates NEW-INSN-PATTERN-1 by replacing each ‘match_operand’ in INSN-PATTERN with a corresponding ‘match_dup’, and each ‘match_operator’ in the pattern with a corresponding ‘match_op_dup’.
The arguments are otherwise identical to ‘define_insn_and_split’: (define_insn_and_rewrite [INSN-PATTERN] "CONDITION" "OUTPUT-TEMPLATE" "SPLIT-CONDITION" "PREPARATION-STATEMENTS" [INSN-ATTRIBUTES]) The ‘match_dup’s and ‘match_op_dup’s in the new instruction pattern use any new operand values that the PREPARATION-STATEMENTS store in the ‘operands’ array, as for a normal ‘define_insn_and_split’. PREPARATION-STATEMENTS can also emit additional instructions before the new instruction. They can even emit an entirely different sequence of instructions and use ‘DONE’ to avoid emitting a new form of the original instruction. The split in a ‘define_insn_and_rewrite’ is only intended to apply to existing instructions that match INSN-PATTERN. SPLIT-CONDITION must therefore start with ‘&&’, so that the split condition applies on top of CONDITION. Here is an example from the AArch64 SVE port, in which operand 1 is known to be equivalent to an all-true constant and isn't used by the output template: (define_insn_and_rewrite "*while_ult_cc" [(set (reg:CC CC_REGNUM) (compare:CC (unspec:SI [(match_operand:PRED_ALL 1) (unspec:PRED_ALL [(match_operand:GPI 2 "aarch64_reg_or_zero" "rZ") (match_operand:GPI 3 "aarch64_reg_or_zero" "rZ")] UNSPEC_WHILE_LO)] UNSPEC_PTEST_PTRUE) (const_int 0))) (set (match_operand:PRED_ALL 0 "register_operand" "=Upa") (unspec:PRED_ALL [(match_dup 2) (match_dup 3)] UNSPEC_WHILE_LO))] "TARGET_SVE" "whilelo\t%0., %2, %3" ;; Force the compiler to drop the unused predicate operand, so that we ;; don't have an unnecessary PTRUE. "&& !CONSTANT_P (operands[1])" { operands[1] = CONSTM1_RTX (mode); } ) The splitter in this case simply replaces operand 1 with the constant value that it is known to have. 
The equivalent ‘define_insn_and_split’ would be:

     (define_insn_and_split "*while_ult_cc"
       [(set (reg:CC CC_REGNUM)
             (compare:CC
               (unspec:SI [(match_operand:PRED_ALL 1)
                           (unspec:PRED_ALL
                             [(match_operand:GPI 2 "aarch64_reg_or_zero" "rZ")
                              (match_operand:GPI 3 "aarch64_reg_or_zero" "rZ")]
                             UNSPEC_WHILE_LO)]
                          UNSPEC_PTEST_PTRUE)
               (const_int 0)))
        (set (match_operand:PRED_ALL 0 "register_operand" "=Upa")
             (unspec:PRED_ALL [(match_dup 2)
                               (match_dup 3)]
                              UNSPEC_WHILE_LO))]
       "TARGET_SVE"
       "whilelo\t%0., %2, %3"
       ;; Force the compiler to drop the unused predicate operand, so that we
       ;; don't have an unnecessary PTRUE.
       "&& !CONSTANT_P (operands[1])"
       [(parallel
          [(set (reg:CC CC_REGNUM)
                (compare:CC
                  (unspec:SI [(match_dup 1)
                              (unspec:PRED_ALL [(match_dup 2)
                                                (match_dup 3)]
                                               UNSPEC_WHILE_LO)]
                             UNSPEC_PTEST_PTRUE)
                  (const_int 0)))
           (set (match_dup 0)
                (unspec:PRED_ALL [(match_dup 2)
                                  (match_dup 3)]
                                 UNSPEC_WHILE_LO))])]
       {
         operands[1] = CONSTM1_RTX (mode);
       }
     )

File: gccint.info, Node: Including Patterns, Next: Peephole Definitions, Prev: Insn Splitting, Up: Machine Desc

17.18 Including Patterns in Machine Descriptions.
=================================================

The ‘include’ pattern tells the compiler tools where to look for patterns that are in files other than the ‘.md’ file. This is used only at build time and there is no preprocessing allowed.

It looks like:

     (include PATHNAME)

For example:

     (include "filestuff")

Here PATHNAME is a string that specifies the location of the file; the example above places the include file in ‘gcc/config/TARGET/filestuff’. The directory ‘gcc/config/TARGET’ is regarded as the default directory.

Machine descriptions may be split up into smaller, more manageable subsections and placed into subdirectories. By specifying:

     (include "BOGUS/filestuff")

the include file is specified to be in ‘gcc/config/TARGET/BOGUS/filestuff’.

Specifying an absolute path for the include file, such as:

     (include "/u2/BOGUS/filestuff")

is permitted but is not encouraged.
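The lookup rule described above can be sketched in C. This is only an illustration under stated assumptions: ‘resolve_include’ and its parameters are hypothetical names, not part of the GCC generator programs, and the sketch covers only the two cases described in this section (a pathname relative to the default directory, and an absolute pathname).

```c
#include <stdio.h>

/* Hypothetical helper, for illustration only: build the file name that an
   (include PATHNAME) form refers to.  DEFAULT_DIR stands for the default
   directory, for example "gcc/config/TARGET".  Returns 0 on success, or -1
   if the result does not fit in OUT (which holds OUT_SIZE bytes).  */
int
resolve_include (const char *default_dir, const char *pathname,
                 char *out, size_t out_size)
{
  int n;

  if (pathname[0] == '/')
    /* An absolute pathname is used as given.  */
    n = snprintf (out, out_size, "%s", pathname);
  else
    /* A relative pathname is taken relative to the default directory.  */
    n = snprintf (out, out_size, "%s/%s", default_dir, pathname);

  return (n < 0 || (size_t) n >= out_size) ? -1 : 0;
}
```

Under these assumptions, ‘"BOGUS/filestuff"’ with a default directory of ‘gcc/config/TARGET’ yields ‘gcc/config/TARGET/BOGUS/filestuff’, while ‘"/u2/BOGUS/filestuff"’ is returned unchanged.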
17.18.1 RTL Generation Tool Options for Directory Search
--------------------------------------------------------

The ‘-IDIR’ option specifies directories to search for machine descriptions. For example:

     genrecog -I/p1/abc/proc1 -I/p2/abcd/pro2 target.md

This adds the directory DIR to the head of the list of directories to be searched for included files. It can be used to override a system machine definition file, substituting your own version, since these directories are searched before the default machine description file directories. If you use more than one ‘-I’ option, the directories are scanned in left-to-right order; the standard default directory comes after.

File: gccint.info, Node: Peephole Definitions, Next: Insn Attributes, Prev: Including Patterns, Up: Machine Desc

17.19 Machine-Specific Peephole Optimizers
==========================================

In addition to instruction patterns the ‘md’ file may contain definitions of machine-specific peephole optimizations.

The combiner does not notice certain peephole optimizations when the data flow in the program does not suggest that it should try them. For example, sometimes two consecutive insns related in purpose can be combined even though the second one does not appear to use a register computed in the first one. A machine-specific peephole optimizer can detect such opportunities.

There are two forms of peephole definitions that may be used. The original ‘define_peephole’ is run at assembly output time to match insns and substitute assembly text. Use of ‘define_peephole’ is deprecated.

A newer ‘define_peephole2’ matches insns and substitutes new insns. The ‘peephole2’ pass is run after register allocation but before scheduling, which may result in much better code for targets that do scheduling.
* Menu:

* define_peephole::     RTL to Text Peephole Optimizers
* define_peephole2::    RTL to RTL Peephole Optimizers

File: gccint.info, Node: define_peephole, Next: define_peephole2, Up: Peephole Definitions

17.19.1 RTL to Text Peephole Optimizers
---------------------------------------

A definition looks like this:

     (define_peephole
       [INSN-PATTERN-1
        INSN-PATTERN-2
        ...]
       "CONDITION"
       "TEMPLATE"
       "OPTIONAL-INSN-ATTRIBUTES")

The last string operand may be omitted if you are not using any machine-specific information in this machine description. If present, it must obey the same rules as in a ‘define_insn’.

In this skeleton, INSN-PATTERN-1 and so on are patterns to match consecutive insns. The optimization applies to a sequence of insns when INSN-PATTERN-1 matches the first one, INSN-PATTERN-2 matches the next, and so on.

Each of the insns matched by a peephole must also match a ‘define_insn’. Peepholes are checked only at the last stage just before code generation, and only optionally. Therefore, any insn which would match a peephole but no ‘define_insn’ will cause a crash in code generation in an unoptimized compilation, or at various optimization stages.

The operands of the insns are matched with ‘match_operand’, ‘match_operator’, and ‘match_dup’, as usual. What is not usual is that the operand numbers apply to all the insn patterns in the definition. So, you can check for identical operands in two insns by using ‘match_operand’ in one insn and ‘match_dup’ in the other.

The operand constraints used in ‘match_operand’ patterns do not have any direct effect on the applicability of the peephole, but they will be validated afterward, so make sure your constraints are general enough to apply whenever the peephole matches. If the peephole matches but the constraints are not satisfied, the compiler will crash.

It is safe to omit constraints in all the operands of the peephole; or you can write constraints which serve as a double-check on the criteria previously tested.
Once a sequence of insns matches the patterns, the CONDITION is checked. This is a C expression which makes the final decision whether to perform the optimization (we do so if the expression is nonzero). If CONDITION is omitted (in other words, the string is empty) then the optimization is applied to every sequence of insns that matches the patterns. The defined peephole optimizations are applied after register allocation is complete. Therefore, the peephole definition can check which operands have ended up in which kinds of registers, just by looking at the operands. The way to refer to the operands in CONDITION is to write ‘operands[I]’ for operand number I (as matched by ‘(match_operand I ...)’). Use the variable ‘insn’ to refer to the last of the insns being matched; use ‘prev_active_insn’ to find the preceding insns. When optimizing computations with intermediate results, you can use CONDITION to match only when the intermediate results are not used elsewhere. Use the C expression ‘dead_or_set_p (INSN, OP)’, where INSN is the insn in which you expect the value to be used for the last time (from the value of ‘insn’, together with use of ‘prev_nonnote_insn’), and OP is the intermediate value (from ‘operands[I]’). Applying the optimization means replacing the sequence of insns with one new insn. The TEMPLATE controls ultimate output of assembler code for this combined insn. It works exactly like the template of a ‘define_insn’. Operand numbers in this template are the same ones used in matching the original sequence of insns. The result of a defined peephole optimizer does not need to match any of the insn patterns in the machine description; it does not even have an opportunity to match them. The peephole optimizer definition itself serves as the insn pattern to control how the insn is output. Defined peephole optimizers are run as assembler code is being output, so the insns they produce are never combined or rearranged in any way. 
Here is an example, taken from the 68000 machine description: (define_peephole [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4))) (set (match_operand:DF 0 "register_operand" "=f") (match_operand:DF 1 "register_operand" "ad"))] "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])" { rtx xoperands[2]; xoperands[1] = gen_rtx_REG (SImode, REGNO (operands[1]) + 1); #ifdef MOTOROLA output_asm_insn ("move.l %1,(sp)", xoperands); output_asm_insn ("move.l %1,-(sp)", operands); return "fmove.d (sp)+,%0"; #else output_asm_insn ("movel %1,sp@", xoperands); output_asm_insn ("movel %1,sp@-", operands); return "fmoved sp@+,%0"; #endif }) The effect of this optimization is to change jbsr _foobar addql #4,sp movel d1,sp@- movel d0,sp@- fmoved sp@+,fp0 into jbsr _foobar movel d1,sp@ movel d0,sp@- fmoved sp@+,fp0 INSN-PATTERN-1 and so on look _almost_ like the second operand of ‘define_insn’. There is one important difference: the second operand of ‘define_insn’ consists of one or more RTX's enclosed in square brackets. Usually, there is only one: then the same action can be written as an element of a ‘define_peephole’. But when there are multiple actions in a ‘define_insn’, they are implicitly enclosed in a ‘parallel’. Then you must explicitly write the ‘parallel’, and the square brackets within it, in the ‘define_peephole’. Thus, if an insn pattern looks like this, (define_insn "divmodsi4" [(set (match_operand:SI 0 "general_operand" "=d") (div:SI (match_operand:SI 1 "general_operand" "0") (match_operand:SI 2 "general_operand" "dmsK"))) (set (match_operand:SI 3 "general_operand" "=d") (mod:SI (match_dup 1) (match_dup 2)))] "TARGET_68020" "divsl%.l %2,%3:%0") then the way to mention this insn in a peephole is as follows: (define_peephole [... 
(parallel [(set (match_operand:SI 0 "general_operand" "=d") (div:SI (match_operand:SI 1 "general_operand" "0") (match_operand:SI 2 "general_operand" "dmsK"))) (set (match_operand:SI 3 "general_operand" "=d") (mod:SI (match_dup 1) (match_dup 2)))]) ...] ...)  File: gccint.info, Node: define_peephole2, Prev: define_peephole, Up: Peephole Definitions 17.19.2 RTL to RTL Peephole Optimizers -------------------------------------- The ‘define_peephole2’ definition tells the compiler how to substitute one sequence of instructions for another sequence, what additional scratch registers may be needed and what their lifetimes must be. (define_peephole2 [INSN-PATTERN-1 INSN-PATTERN-2 ...] "CONDITION" [NEW-INSN-PATTERN-1 NEW-INSN-PATTERN-2 ...] "PREPARATION-STATEMENTS") The definition is almost identical to ‘define_split’ (*note Insn Splitting::) except that the pattern to match is not a single instruction, but a sequence of instructions. It is possible to request additional scratch registers for use in the output template. If appropriate registers are not free, the pattern will simply not match. Scratch registers are requested with a ‘match_scratch’ pattern at the top level of the input pattern. The allocated register (initially) will be dead at the point requested within the original sequence. If the scratch is used at more than a single point, a ‘match_dup’ pattern at the top level of the input pattern marks the last position in the input sequence at which the register must be available. Here is an example from the IA-32 machine description: (define_peephole2 [(match_scratch:SI 2 "r") (parallel [(set (match_operand:SI 0 "register_operand" "") (match_operator:SI 3 "arith_or_logical_operator" [(match_dup 0) (match_operand:SI 1 "memory_operand" "")])) (clobber (reg:CC 17))])] "! optimize_size && ! 
TARGET_READ_MODIFY" [(set (match_dup 2) (match_dup 1)) (parallel [(set (match_dup 0) (match_op_dup 3 [(match_dup 0) (match_dup 2)])) (clobber (reg:CC 17))])] "") This pattern tries to split a load from its use in the hopes that we'll be able to schedule around the memory load latency. It allocates a single ‘SImode’ register of class ‘GENERAL_REGS’ (‘"r"’) that needs to be live only at the point just before the arithmetic. A real example requiring extended scratch lifetimes is harder to come by, so here's a silly made-up example: (define_peephole2 [(match_scratch:SI 4 "r") (set (match_operand:SI 0 "" "") (match_operand:SI 1 "" "")) (set (match_operand:SI 2 "" "") (match_dup 1)) (match_dup 4) (set (match_operand:SI 3 "" "") (match_dup 1))] "/* determine 1 does not overlap 0 and 2 */" [(set (match_dup 4) (match_dup 1)) (set (match_dup 0) (match_dup 4)) (set (match_dup 2) (match_dup 4)) (set (match_dup 3) (match_dup 4))] "") If we had not added the ‘(match_dup 4)’ in the middle of the input sequence, it might have been the case that the register we chose at the beginning of the sequence is killed by the first or second ‘set’. There are two special macros defined for use in the preparation statements: ‘DONE’ and ‘FAIL’. Use them with a following semicolon, as a statement. ‘DONE’ Use the ‘DONE’ macro to end RTL generation for the peephole. The only RTL insns generated as replacement for the matched input insn will be those already emitted by explicit calls to ‘emit_insn’ within the preparation statements; the replacement pattern is not used. ‘FAIL’ Make the ‘define_peephole2’ fail on this occasion. When a ‘define_peephole2’ fails, it means that the replacement was not truly available for the particular inputs it was given. In that case, GCC may still apply a later ‘define_peephole2’ that also matches the given insn pattern. (Note that this is different from ‘define_split’, where ‘FAIL’ prevents the input insn from being split at all.) 
If the preparation falls through (invokes neither ‘DONE’ nor ‘FAIL’), then the ‘define_peephole2’ uses the replacement template.

Insns are scanned in forward order from beginning to end for each basic block. Matches are attempted in order of ‘define_peephole2’ appearance in the ‘md’ file. After a successful replacement, scanning for further ‘define_peephole2’ opportunities resumes with the first generated replacement insn as the first insn to be matched against all ‘define_peephole2’ patterns. For the example above, after its successful replacement, the first insn that can be matched by a ‘define_peephole2’ is ‘(set (match_dup 4) (match_dup 1))’.

File: gccint.info, Node: Insn Attributes, Next: Conditional Execution, Prev: Peephole Definitions, Up: Machine Desc

17.20 Instruction Attributes
============================

In addition to describing the instructions supported by the target machine, the ‘md’ file also defines a group of “attributes” and a set of values for each. Every generated insn is assigned a value for each attribute. One possible attribute would be the effect that the insn has on the machine's condition code.

* Menu:

* Defining Attributes::            Specifying attributes and their values.
* Expressions::                    Valid expressions for attribute values.
* Tagging Insns::                  Assigning attribute values to insns.
* Attr Example::                   An example of assigning attributes.
* Insn Lengths::                   Computing the length of insns.
* Constant Attributes::            Defining attributes that are constant.
* Mnemonic Attribute::             Obtain the instruction mnemonic as attribute value.
* Delay Slots::                    Defining delay slots required for a machine.
* Processor pipeline description:: Specifying information for insn scheduling.

File: gccint.info, Node: Defining Attributes, Next: Expressions, Up: Insn Attributes

17.20.1 Defining Attributes and their Values
--------------------------------------------

The ‘define_attr’ expression is used to define each attribute required by the target machine.
It looks like: (define_attr NAME LIST-OF-VALUES DEFAULT) NAME is a string specifying the name of the attribute being defined. Some attributes are used in a special way by the rest of the compiler. The ‘enabled’ attribute can be used to conditionally enable or disable insn alternatives (*note Disable Insn Alternatives::). The ‘predicable’ attribute, together with a suitable ‘define_cond_exec’ (*note Conditional Execution::), can be used to automatically generate conditional variants of instruction patterns. The ‘mnemonic’ attribute can be used to check for the instruction mnemonic (*note Mnemonic Attribute::). The compiler internally uses the names ‘ce_enabled’ and ‘nonce_enabled’, so they should not be used elsewhere as alternative names. LIST-OF-VALUES is either a string that specifies a comma-separated list of values that can be assigned to the attribute, or a null string to indicate that the attribute takes numeric values. DEFAULT is an attribute expression that gives the value of this attribute for insns that match patterns whose definition does not include an explicit value for this attribute. *Note Attr Example::, for more information on the handling of defaults. *Note Constant Attributes::, for information on attributes that do not depend on any particular insn. For each defined attribute, a number of definitions are written to the ‘insn-attr.h’ file. For cases where an explicit set of values is specified for an attribute, the following are defined: • A ‘#define’ is written for the symbol ‘HAVE_ATTR_NAME’. • An enumerated class is defined for ‘attr_NAME’ with elements of the form ‘UPPER-NAME_UPPER-VALUE’ where the attribute name and value are first converted to uppercase. • A function ‘get_attr_NAME’ is defined that is passed an insn and returns the attribute value for that insn. For example, if the following is present in the ‘md’ file: (define_attr "type" "branch,fp,load,store,arith" ...) the following lines will be written to the file ‘insn-attr.h’. 
#define HAVE_ATTR_type 1 enum attr_type {TYPE_BRANCH, TYPE_FP, TYPE_LOAD, TYPE_STORE, TYPE_ARITH}; extern enum attr_type get_attr_type (); If the attribute takes numeric values, no ‘enum’ type will be defined and the function to obtain the attribute's value will return ‘int’. There are attributes which are tied to a specific meaning. These attributes are not free to use for other purposes: ‘length’ The ‘length’ attribute is used to calculate the length of emitted code chunks. This is especially important when verifying branch distances. *Note Insn Lengths::. ‘enabled’ The ‘enabled’ attribute can be defined to prevent certain alternatives of an insn definition from being used during code generation. *Note Disable Insn Alternatives::. ‘mnemonic’ The ‘mnemonic’ attribute can be defined to implement instruction specific checks in e.g. the pipeline description. *Note Mnemonic Attribute::. For each of these special attributes, the corresponding ‘HAVE_ATTR_NAME’ ‘#define’ is also written when the attribute is not defined; in that case, it is defined as ‘0’. Another way of defining an attribute is to use: (define_enum_attr "ATTR" "ENUM" DEFAULT) This works in just the same way as ‘define_attr’, except that the list of values is taken from a separate enumeration called ENUM (*note define_enum::). This form allows you to use the same list of values for several attributes without having to repeat the list each time. For example: (define_enum "processor" [ model_a model_b ... ]) (define_enum_attr "arch" "processor" (const (symbol_ref "target_arch"))) (define_enum_attr "tune" "processor" (const (symbol_ref "target_tune"))) defines the same attributes as: (define_attr "arch" "model_a,model_b,..." (const (symbol_ref "target_arch"))) (define_attr "tune" "model_a,model_b,..." (const (symbol_ref "target_tune"))) but without duplicating the processor list. 
The second example defines two separate C enums (‘attr_arch’ and ‘attr_tune’) whereas the first defines a single C enum (‘processor’).  File: gccint.info, Node: Expressions, Next: Tagging Insns, Prev: Defining Attributes, Up: Insn Attributes 17.20.2 Attribute Expressions ----------------------------- RTL expressions used to define attributes use the codes described above plus a few specific to attribute definitions, to be discussed below. Attribute value expressions must have one of the following forms: ‘(const_int I)’ The integer I specifies the value of a numeric attribute. I must be non-negative. The value of a numeric attribute can be specified either with a ‘const_int’, or as an integer represented as a string in ‘const_string’, ‘eq_attr’ (see below), ‘attr’, ‘symbol_ref’, simple arithmetic expressions, and ‘set_attr’ overrides on specific instructions (*note Tagging Insns::). ‘(const_string VALUE)’ The string VALUE specifies a constant attribute value. If VALUE is specified as ‘"*"’, it means that the default value of the attribute is to be used for the insn containing this expression. ‘"*"’ obviously cannot be used in the DEFAULT expression of a ‘define_attr’. If the attribute whose value is being specified is numeric, VALUE must be a string containing a non-negative integer (normally ‘const_int’ would be used in this case). Otherwise, it must contain one of the valid values for the attribute. ‘(if_then_else TEST TRUE-VALUE FALSE-VALUE)’ TEST specifies an attribute test, whose format is defined below. The value of this expression is TRUE-VALUE if TEST is true, otherwise it is FALSE-VALUE. ‘(cond [TEST1 VALUE1 ...] DEFAULT)’ The first operand of this expression is a vector containing an even number of expressions and consisting of pairs of TEST and VALUE expressions. The value of the ‘cond’ expression is that of the VALUE corresponding to the first true TEST expression. 
     If none of the TEST expressions are true, the value of the ‘cond’ expression is that of the DEFAULT expression.

TEST expressions can have one of the following forms:

‘(const_int I)’
     This test is true if I is nonzero and false otherwise.

‘(not TEST)’
‘(ior TEST1 TEST2)’
‘(and TEST1 TEST2)’
     These tests are true if the indicated logical function is true.

‘(match_operand:M N PRED CONSTRAINTS)’
     This test is true if operand N of the insn whose attribute value is being determined has mode M (this part of the test is ignored if M is ‘VOIDmode’) and the function specified by the string PRED returns a nonzero value when passed operand N and mode M (this part of the test is ignored if PRED is the null string).

     The CONSTRAINTS operand is ignored and should be the null string.

‘(match_test C-EXPR)’
     The test is true if C expression C-EXPR is true. In non-constant attributes, C-EXPR has access to the following variables:

     INSN
          The rtl instruction under test.
     WHICH_ALTERNATIVE
          The ‘define_insn’ alternative that INSN matches. *Note Output Statement::.
     OPERANDS
          An array of INSN's rtl operands.

     C-EXPR behaves like the condition in a C ‘if’ statement, so there is no need to explicitly convert the expression into a boolean 0 or 1 value. For example, the following two tests are equivalent:

          (match_test "x & 2")
          (match_test "(x & 2) != 0")

‘(le ARITH1 ARITH2)’
‘(leu ARITH1 ARITH2)’
‘(lt ARITH1 ARITH2)’
‘(ltu ARITH1 ARITH2)’
‘(gt ARITH1 ARITH2)’
‘(gtu ARITH1 ARITH2)’
‘(ge ARITH1 ARITH2)’
‘(geu ARITH1 ARITH2)’
‘(ne ARITH1 ARITH2)’
‘(eq ARITH1 ARITH2)’
     These tests are true if the indicated comparison of the two arithmetic expressions is true. Arithmetic expressions are formed with ‘plus’, ‘minus’, ‘mult’, ‘div’, ‘mod’, ‘abs’, ‘neg’, ‘and’, ‘ior’, ‘xor’, ‘not’, ‘ashift’, ‘lshiftrt’, and ‘ashiftrt’ expressions.

     ‘const_int’ and ‘symbol_ref’ are always valid terms (*note Insn Lengths::, for additional forms).
‘symbol_ref’ is a string denoting a C expression that yields an ‘int’ when evaluated by the ‘get_attr_...’ routine. It should normally be a global variable. ‘(eq_attr NAME VALUE)’ NAME is a string specifying the name of an attribute. VALUE is a string that is either a valid value for attribute NAME, a comma-separated list of values, or ‘!’ followed by a value or list. If VALUE does not begin with a ‘!’, this test is true if the value of the NAME attribute of the current insn is in the list specified by VALUE. If VALUE begins with a ‘!’, this test is true if the attribute's value is _not_ in the specified list. For example, (eq_attr "type" "load,store") is equivalent to (ior (eq_attr "type" "load") (eq_attr "type" "store")) If NAME specifies an attribute of ‘alternative’, it refers to the value of the compiler variable ‘which_alternative’ (*note Output Statement::) and the values must be small integers. For example, (eq_attr "alternative" "2,3") is equivalent to (ior (eq (symbol_ref "which_alternative") (const_int 2)) (eq (symbol_ref "which_alternative") (const_int 3))) Note that, for most attributes, an ‘eq_attr’ test is simplified in cases where the value of the attribute being tested is known for all insns matching a particular pattern. This is by far the most common case. ‘(attr_flag NAME)’ The value of an ‘attr_flag’ expression is true if the flag specified by NAME is true for the ‘insn’ currently being scheduled. NAME is a string specifying one of a fixed set of flags to test. Test the flags ‘forward’ and ‘backward’ to determine the direction of a conditional branch. This example describes a conditional branch delay slot which can be nullified for forward branches that are taken (annul-true) or for backward branches which are not taken (annul-false). 
(define_delay (eq_attr "type" "cbranch") [(eq_attr "in_branch_delay" "true") (and (eq_attr "in_branch_delay" "true") (attr_flag "forward")) (and (eq_attr "in_branch_delay" "true") (attr_flag "backward"))]) The ‘forward’ and ‘backward’ flags are false if the current ‘insn’ being scheduled is not a conditional branch. ‘attr_flag’ is only used during delay slot scheduling and has no meaning to other passes of the compiler. ‘(attr NAME)’ The value of another attribute is returned. This is most useful for numeric attributes, as ‘eq_attr’ and ‘attr_flag’ produce more efficient code for non-numeric attributes.  File: gccint.info, Node: Tagging Insns, Next: Attr Example, Prev: Expressions, Up: Insn Attributes 17.20.3 Assigning Attribute Values to Insns ------------------------------------------- The value assigned to an attribute of an insn is primarily determined by which pattern is matched by that insn (or which ‘define_peephole’ generated it). Every ‘define_insn’ and ‘define_peephole’ can have an optional last argument to specify the values of attributes for matching insns. The value of any attribute not specified in a particular insn is set to the default value for that attribute, as specified in its ‘define_attr’. Extensive use of default values for attributes permits the specification of the values for only one or two attributes in the definition of most insn patterns, as seen in the example in the next section. The optional last argument of ‘define_insn’ and ‘define_peephole’ is a vector of expressions, each of which defines the value for a single attribute. The most general way of assigning an attribute's value is to use a ‘set’ expression whose first operand is an ‘attr’ expression giving the name of the attribute being set. The second operand of the ‘set’ is an attribute expression (*note Expressions::) giving the value of the attribute. 
When the attribute value depends on the ‘alternative’ attribute (i.e., which is the applicable alternative in the constraint of the insn), the ‘set_attr_alternative’ expression can be used. It allows the specification of a vector of attribute expressions, one for each alternative. When the generality of arbitrary attribute expressions is not required, the simpler ‘set_attr’ expression can be used, which allows specifying a string giving either a single attribute value or a list of attribute values, one for each alternative. The form of each of the above specifications is shown below. In each case, NAME is a string specifying the attribute to be set. ‘(set_attr NAME VALUE-STRING)’ VALUE-STRING is either a string giving the desired attribute value, or a string containing a comma-separated list giving the values for succeeding alternatives. The number of elements must match the number of alternatives in the constraint of the insn pattern. Note that it may be useful to specify ‘*’ for some alternative, in which case the attribute will assume its default value for insns matching that alternative. ‘(set_attr_alternative NAME [VALUE1 VALUE2 ...])’ Depending on the alternative of the insn, the value will be one of the specified values. This is a shorthand for using a ‘cond’ with tests on the ‘alternative’ attribute. ‘(set (attr NAME) VALUE)’ The first operand of this ‘set’ must be the special RTL expression ‘attr’, whose sole operand is a string giving the name of the attribute being set. VALUE is the value of the attribute. 
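How a ‘set_attr’ VALUE-STRING maps onto alternatives can be sketched in C. The function below is a hypothetical illustration, not code from the GCC generator programs. It assumes that alternatives are numbered from 0, as ‘which_alternative’ is, that the list contains no whitespace, and that the caller supplies the attribute's default value as a string.

```c
#include <string.h>

/* Hypothetical helper, for illustration only.  Copy into OUT the attribute
   value that VALUE_STRING gives for alternative ALT (numbered from 0).
   A string without commas applies to every alternative.  An element "*"
   selects DEFAULT_VALUE.  Returns 0 on success, or -1 if ALT has no
   element in the list or OUT (of OUT_SIZE bytes) is too small.  */
int
attr_value_for_alternative (const char *value_string, int alt,
                            const char *default_value,
                            char *out, size_t out_size)
{
  const char *start = value_string;
  const char *end;
  size_t len;

  if (alt < 0)
    return -1;

  if (strchr (value_string, ',') != NULL)
    {
      /* Skip ALT elements of the comma-separated list.  */
      for (int i = 0; i < alt; i++)
        {
          start = strchr (start, ',');
          if (start == NULL)
            return -1;          /* Fewer elements than alternatives.  */
          start++;
        }
    }

  /* The selected element runs up to the next comma or the end.  */
  end = strchr (start, ',');
  len = end ? (size_t) (end - start) : strlen (start);

  if (len == 1 && start[0] == '*')
    {
      /* "*" means: use the attribute's default value.  */
      start = default_value;
      len = strlen (default_value);
    }

  if (len >= out_size)
    return -1;
  memcpy (out, start, len);
  out[len] = '\0';
  return 0;
}
```

With the value string ‘"load,store,arith"’, this sketch selects ‘load’ for alternative 0, ‘store’ for alternative 1 and ‘arith’ for alternative 2; a list that is shorter than the number of alternatives is reported as an error, mirroring the rule that the element count must match.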
The following shows three different ways of representing the same attribute value specification (alternatives are numbered from 0, as with ‘which_alternative’):

     (set_attr "type" "load,store,arith")

     (set_attr_alternative "type"
                           [(const_string "load") (const_string "store")
                            (const_string "arith")])

     (set (attr "type")
          (cond [(eq_attr "alternative" "0") (const_string "load")
                 (eq_attr "alternative" "1") (const_string "store")]
                (const_string "arith")))

The ‘define_asm_attributes’ expression provides a mechanism to specify the attributes assigned to insns produced from an ‘asm’ statement. It has the form:

     (define_asm_attributes [ATTR-SETS])

where ATTR-SETS is specified the same as for both the ‘define_insn’ and the ‘define_peephole’ expressions.

These values will typically be the "worst case" attribute values. For example, they might indicate that the condition code will be clobbered.

A specification for a ‘length’ attribute is handled specially. The way to compute the length of an ‘asm’ insn is to multiply the length specified in the expression ‘define_asm_attributes’ by the number of machine instructions specified in the ‘asm’ statement, determined by counting the number of semicolons and newlines in the string. Therefore, the value of the ‘length’ attribute specified in a ‘define_asm_attributes’ should be the maximum possible length of a single machine instruction.

File: gccint.info, Node: Attr Example, Next: Insn Lengths, Prev: Tagging Insns, Up: Insn Attributes

17.20.4 Example of Attribute Specifications
-------------------------------------------

The judicious use of defaulting is important in the efficient use of insn attributes. Typically, insns are divided into “types” and an attribute, customarily called ‘type’, is used to represent this value. This attribute is normally used only to define the default value for other attributes. An example will clarify this usage.

Assume we have a RISC machine with a condition code and in which only full-word operations are performed in registers.
Let us assume that we can divide all insns into loads, stores, (integer) arithmetic operations, floating point operations, and branches. Here we will concern ourselves with determining the effect of an insn on the condition code and will limit ourselves to the following possible effects: The condition code can be set unpredictably (clobbered), not be changed, be set to agree with the results of the operation, or only changed if the item previously set into the condition code has been modified. Here is part of a sample ‘md’ file for such a machine: (define_attr "type" "load,store,arith,fp,branch" (const_string "arith")) (define_attr "cc" "clobber,unchanged,set,change0" (cond [(eq_attr "type" "load") (const_string "change0") (eq_attr "type" "store,branch") (const_string "unchanged") (eq_attr "type" "arith") (if_then_else (match_operand:SI 0 "" "") (const_string "set") (const_string "clobber"))] (const_string "clobber"))) (define_insn "" [(set (match_operand:SI 0 "general_operand" "=r,r,m") (match_operand:SI 1 "general_operand" "r,m,r"))] "" "@ move %0,%1 load %0,%1 store %0,%1" [(set_attr "type" "arith,load,store")]) Note that we assume in the above example that arithmetic operations performed on quantities smaller than a machine word clobber the condition code since they will set the condition code to a value corresponding to the full-word result.  File: gccint.info, Node: Insn Lengths, Next: Constant Attributes, Prev: Attr Example, Up: Insn Attributes 17.20.5 Computing the Length of an Insn --------------------------------------- For many machines, multiple types of branch instructions are provided, each for different length branch displacements. In most cases, the assembler will choose the correct instruction to use. However, when the assembler cannot do so, GCC can when a special attribute, the ‘length’ attribute, is defined. This attribute must be defined to have numeric values by specifying a null string in its ‘define_attr’. 
In the case of the ‘length’ attribute, two additional forms of arithmetic terms are allowed in test expressions: ‘(match_dup N)’ This refers to the address of operand N of the current insn, which must be a ‘label_ref’. ‘(pc)’ For non-branch instructions and backward branch instructions, this refers to the address of the current insn. But for forward branch instructions, this refers to the address of the next insn, because the length of the current insn is to be computed. For normal insns, the length will be determined by value of the ‘length’ attribute. In the case of ‘addr_vec’ and ‘addr_diff_vec’ insn patterns, the length is computed as the number of vectors multiplied by the size of each vector. Lengths are measured in addressable storage units (bytes). Note that it is possible to call functions via the ‘symbol_ref’ mechanism to compute the length of an insn. However, if you use this mechanism you must provide dummy clauses to express the maximum length without using the function call. You can see an example of this in the ‘pa’ machine description for the ‘call_symref’ pattern. The following macros can be used to refine the length computation: ‘ADJUST_INSN_LENGTH (INSN, LENGTH)’ If defined, modifies the length assigned to instruction INSN as a function of the context in which it is used. LENGTH is an lvalue that contains the initially computed length of the insn and should be updated with the correct length of the insn. This macro will normally not be required. A case in which it is required is the ROMP. On this machine, the size of an ‘addr_vec’ insn must be increased by two to compensate for the fact that alignment may be required. The routine that returns ‘get_attr_length’ (the value of the ‘length’ attribute) can be used by the output routine to determine the form of the branch instruction to be written, as the example below illustrates. As an example of the specification of variable-length branches, consider the IBM 360. 
If we adopt the convention that a register will be set to the starting address of a function, we can jump to labels within 4k of the start using a four-byte instruction. Otherwise, we need a six-byte sequence to load the address from memory and then branch to it. On such a machine, a pattern for a branch instruction might be specified as follows: (define_insn "jump" [(set (pc) (label_ref (match_operand 0 "" "")))] "" { return (get_attr_length (insn) == 4 ? "b %l0" : "l r15,=a(%l0); br r15"); } [(set (attr "length") (if_then_else (lt (match_dup 0) (const_int 4096)) (const_int 4) (const_int 6)))])  File: gccint.info, Node: Constant Attributes, Next: Mnemonic Attribute, Prev: Insn Lengths, Up: Insn Attributes 17.20.6 Constant Attributes --------------------------- A special form of ‘define_attr’, where the expression for the default value is a ‘const’ expression, indicates an attribute that is constant for a given run of the compiler. Constant attributes may be used to specify which variety of processor is used. For example, (define_attr "cpu" "m88100,m88110,m88000" (const (cond [(symbol_ref "TARGET_88100") (const_string "m88100") (symbol_ref "TARGET_88110") (const_string "m88110")] (const_string "m88000")))) (define_attr "memory" "fast,slow" (const (if_then_else (symbol_ref "TARGET_FAST_MEM") (const_string "fast") (const_string "slow")))) The routine generated for constant attributes has no parameters as it does not depend on any particular insn. RTL expressions used to define the value of a constant attribute may use the ‘symbol_ref’ form, but may not use either the ‘match_operand’ form or ‘eq_attr’ forms involving insn attributes.  File: gccint.info, Node: Mnemonic Attribute, Next: Delay Slots, Prev: Constant Attributes, Up: Insn Attributes 17.20.7 Mnemonic Attribute -------------------------- The ‘mnemonic’ attribute is a string type attribute holding the instruction mnemonic for an insn alternative. 
The attribute values will automatically be generated by the machine description parser if there is an attribute definition in the md file: (define_attr "mnemonic" "unknown" (const_string "unknown")) The default value can be freely chosen as long as it does not collide with any of the instruction mnemonics. This value will be used whenever the machine description parser is not able to determine the mnemonic string. This might be the case for output templates containing more than a single instruction as in ‘"mvcle\t%0,%1,0\;jo\t.-4"’. The ‘mnemonic’ attribute set is not generated automatically if the instruction string is generated via C code. An existing ‘mnemonic’ attribute set in an insn definition will not be overridden by the md file parser. That way it is possible to manually set the instruction mnemonics for the cases where the md file parser fails to determine them automatically. The ‘mnemonic’ attribute is useful for dealing with instruction-specific properties in the pipeline description without defining additional insn attributes. (define_attr "ooo_expanded" "" (cond [(eq_attr "mnemonic" "dlr,dsgr,d,dsgf,stam,dsgfr,dlgr") (const_int 1)] (const_int 0)))  File: gccint.info, Node: Delay Slots, Next: Processor pipeline description, Prev: Mnemonic Attribute, Up: Insn Attributes 17.20.8 Delay Slot Scheduling ----------------------------- The insn attribute mechanism can be used to specify the requirements for delay slots, if any, on a target machine. An instruction is said to require a “delay slot” if some instructions that are physically after the instruction are executed as if they were located before it. Classic examples are branch and call instructions, which often execute the following instruction before the branch or call is performed. On some machines, conditional branch instructions can optionally “annul” instructions in the delay slot. This means that the instruction will not be executed for certain branch outcomes. 
Both instructions that annul if the branch is true and instructions that annul if the branch is false are supported. Delay slot scheduling differs from instruction scheduling in that determining whether an instruction needs a delay slot is dependent only on the type of instruction being generated, not on data flow between the instructions. See the next section for a discussion of data-dependent instruction scheduling. The requirement of an insn needing one or more delay slots is indicated via the ‘define_delay’ expression. It has the following form: (define_delay TEST [DELAY-1 ANNUL-TRUE-1 ANNUL-FALSE-1 DELAY-2 ANNUL-TRUE-2 ANNUL-FALSE-2 ...]) TEST is an attribute test that indicates whether this ‘define_delay’ applies to a particular insn. If so, the number of required delay slots is determined by the length of the vector specified as the second argument. An insn placed in delay slot N must satisfy attribute test DELAY-N. ANNUL-TRUE-N is an attribute test that specifies which insns may be annulled if the branch is true. Similarly, ANNUL-FALSE-N specifies which insns in the delay slot may be annulled if the branch is false. If annulling is not supported for that delay slot, ‘(nil)’ should be coded. For example, in the common case where branch and call insns require a single delay slot, which may contain any insn other than a branch or call, the following would be placed in the ‘md’ file: (define_delay (eq_attr "type" "branch,call") [(eq_attr "type" "!branch,call") (nil) (nil)]) Multiple ‘define_delay’ expressions may be specified. In this case, each such expression specifies different delay slot requirements and there must be no insn for which tests in two ‘define_delay’ expressions are both true. 
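The way a ‘define_delay’ selects insns and constrains their delay slots can be modelled in a few lines of Python. This is a sketch of the rules described above, not GCC code: insns are represented as plain dictionaries of attribute values, ‘None’ stands in for ‘(nil)’, and the helper names ‘eq_attr’, ‘delay_slots’ and ‘may_fill’ are hypothetical.

```python
def eq_attr(name, values):
    """Model (eq_attr NAME VALUES); a leading '!' negates the value list."""
    negate = values.startswith("!")
    listed = set((values[1:] if negate else values).split(","))
    return lambda insn: (insn[name] in listed) != negate


def delay_slots(insn, define_delays):
    """Return the (delay, annul_true, annul_false) triples that apply to insn."""
    matches = [slots for test, slots in define_delays if test(insn)]
    # The text requires that at most one define_delay test is true per insn.
    assert len(matches) <= 1, "overlapping define_delay tests"
    return matches[0] if matches else []


def may_fill(candidate, slot):
    """True if candidate satisfies the DELAY-N test of the given slot triple."""
    delay_test, _annul_true, _annul_false = slot
    return delay_test(candidate)


# The single-slot example from the text; None stands in for (nil).
DELAYS = [
    (eq_attr("type", "branch,call"),
     [(eq_attr("type", "!branch,call"), None, None)]),
]

if __name__ == "__main__":
    slots = delay_slots({"type": "branch"}, DELAYS)
    print(len(slots), may_fill({"type": "arith"}, slots[0]))
```

The number of delay slots is simply the number of triples returned, which mirrors the rule that it is determined by the length of the vector.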
For example, if we have a machine that requires one delay slot for branches but two for calls, no delay slot can contain a branch or call insn, and any valid insn in the delay slot for the branch can be annulled if the branch is true, we might represent this as follows: (define_delay (eq_attr "type" "branch") [(eq_attr "type" "!branch,call") (eq_attr "type" "!branch,call") (nil)]) (define_delay (eq_attr "type" "call") [(eq_attr "type" "!branch,call") (nil) (nil) (eq_attr "type" "!branch,call") (nil) (nil)])  File: gccint.info, Node: Processor pipeline description, Prev: Delay Slots, Up: Insn Attributes 17.20.9 Specifying processor pipeline description ------------------------------------------------- To achieve better performance, most modern processors (super-pipelined, superscalar RISC, and VLIW processors) have many “functional units” on which several instructions can be executed simultaneously. An instruction starts execution if its issue conditions are satisfied. If not, the instruction is stalled until its conditions are satisfied. Such “interlock (pipeline) delay” causes interruption of the fetching of successor instructions (or demands nop instructions, e.g. for some MIPS processors). There are two major kinds of interlock delays in modern processors. The first one is a data dependence delay determining “instruction latency time”. The instruction execution is not started until all source data have been evaluated by prior instructions (there are more complex cases when the instruction execution starts even when the data are not available but will be ready in given time after the instruction execution start). Taking the data dependence delays into account is simple. The data dependence (true, output, and anti-dependence) delay between two instructions is given by a constant. In most cases this approach is adequate. The second kind of interlock delays is a reservation delay. 
The reservation delay means that two instructions under execution will be in need of shared processor resources, i.e. buses, internal registers, and/or functional units, which are reserved for some time. Taking this kind of delay into account is complex, especially for modern RISC processors. The task of exploiting more processor parallelism is solved by an instruction scheduler. For a better solution to this problem, the instruction scheduler has to have an adequate description of the processor parallelism (or “pipeline description”). GCC machine descriptions describe processor parallelism and functional unit reservations for groups of instructions with the aid of “regular expressions”. The GCC instruction scheduler uses a “pipeline hazard recognizer” to determine whether the processor can issue an instruction on a given simulated processor cycle. The pipeline hazard recognizer is automatically generated from the processor pipeline description. The pipeline hazard recognizer generated from the machine description is based on a deterministic finite state automaton (DFA): the instruction issue is possible if there is a transition from one automaton state to another one. This algorithm is very fast, and furthermore, its speed is not dependent on processor complexity(1). The rest of this section describes the directives that constitute an automaton-based processor pipeline description. The order of these constructions within the machine description file is not important. The following optional construction describes names of automata generated and used for the pipeline hazards recognition. Sometimes the generated finite state automaton used by the pipeline hazard recognizer is large. If we use more than one automaton and bind functional units to the automata, the total size of the automata is usually less than the size of the single automaton. If there is no such construction, only one finite state automaton is generated. 
(define_automaton AUTOMATA-NAMES) AUTOMATA-NAMES is a string giving names of the automata. The names are separated by commas. All the automata should have unique names. The automaton name is used in the constructions ‘define_cpu_unit’ and ‘define_query_cpu_unit’. Each processor functional unit used in the description of instruction reservations should be described by the following construction. (define_cpu_unit UNIT-NAMES [AUTOMATON-NAME]) UNIT-NAMES is a string giving the names of the functional units separated by commas. Do not use the name ‘nothing’; it is reserved for other purposes. AUTOMATON-NAME is a string giving the name of the automaton with which the unit is bound. The automaton should be described in construction ‘define_automaton’. You should give AUTOMATON-NAME if there is a defined automaton. The assignment of units to automata is constrained by the uses of the units in insn reservations. The most important constraint is: if a unit reservation is present on a particular cycle of an alternative for an insn reservation, then some unit from the same automaton must be present on the same cycle for the other alternatives of the insn reservation. The rest of the constraints are mentioned in the description of the subsequent constructions. The following construction describes CPU functional units analogously to ‘define_cpu_unit’. The reservation of such units can be queried for an automaton state. The instruction scheduler never queries reservation of functional units for a given automaton state. So as a rule, you don't need this construction. This construction could be used for future code generation goals (e.g. to generate VLIW insn templates). (define_query_cpu_unit UNIT-NAMES [AUTOMATON-NAME]) UNIT-NAMES is a string giving names of the functional units separated by commas. AUTOMATON-NAME is a string giving the name of the automaton with which the unit is bound. The following construction is the major one to describe pipeline characteristics of an instruction. 
(define_insn_reservation INSN-NAME DEFAULT_LATENCY CONDITION REGEXP) DEFAULT_LATENCY is a number giving latency time of the instruction. There is an important difference between the old description and the automaton based pipeline description. The latency time is used for all dependencies when we use the old description. In the automaton based pipeline description, the given latency time is only used for true dependencies. The cost of anti-dependencies is always zero and the cost of output dependencies is the difference between latency times of the producing and consuming insns (if the difference is negative, the cost is considered to be zero). You can always change the default costs for any description by using the target hook ‘TARGET_SCHED_ADJUST_COST’ (*note Scheduling::). INSN-NAME is a string giving the internal name of the insn. The internal names are used in constructions ‘define_bypass’ and in the automaton description file generated for debugging. The internal name has nothing in common with the names in ‘define_insn’. It is a good practice to use insn classes described in the processor manual. CONDITION defines what RTL insns are described by this construction. You should remember that you will be in trouble if CONDITION for two or more different ‘define_insn_reservation’ constructions is TRUE for an insn. In this case what reservation will be used for the insn is not defined. Such cases are not checked during generation of the pipeline hazards recognizer because in general recognizing that two conditions may have the same value is quite difficult (especially if the conditions contain ‘symbol_ref’). It is also not checked during the pipeline hazard recognizer work because it would slow down the recognizer considerably. REGEXP is a string describing the reservation of the cpu's functional units by the instruction. 
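The default dependence costs stated above for the automaton-based description can be summarized in a short Python model. This is a sketch only, not GCC code: the function name ‘default_dep_cost’ is hypothetical, and the output-dependence case assumes that "difference" means the producer's latency minus the consumer's latency, which is one reading of the text.

```python
def default_dep_cost(kind, producer_latency, consumer_latency):
    """Model the default cost of a dependence between two insns.

    Hypothetical helper; kind is "true", "anti" or "output".
    """
    if kind == "true":
        # Only true dependencies use the given latency time.
        return producer_latency
    if kind == "anti":
        # Anti-dependencies always cost zero.
        return 0
    if kind == "output":
        # Assumed reading: producer minus consumer, clamped at zero.
        return max(0, producer_latency - consumer_latency)
    raise ValueError("unknown dependence kind: " + kind)


if __name__ == "__main__":
    for kind in ("true", "anti", "output"):
        print(kind, default_dep_cost(kind, 9, 2))
```

A target that needs different costs would adjust them through ‘TARGET_SCHED_ADJUST_COST’, as noted above; the model covers only the defaults.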
The reservations are described by a regular expression according to the following syntax: regexp = regexp "," oneof | oneof oneof = oneof "|" allof | allof allof = allof "+" repeat | repeat repeat = element "*" number | element element = cpu_function_unit_name | reservation_name | result_name | "nothing" | "(" regexp ")" • ‘,’ is used for describing the start of the next cycle in the reservation. • ‘|’ is used for describing a reservation described by the first regular expression *or* a reservation described by the second regular expression *or* etc. • ‘+’ is used for describing a reservation described by the first regular expression *and* a reservation described by the second regular expression *and* etc. • ‘*’ is used for convenience and simply means a sequence in which the regular expression is repeated NUMBER times with cycle advancing (see ‘,’). • ‘cpu_function_unit_name’ denotes reservation of the named functional unit. • ‘reservation_name’ -- see description of construction ‘define_reservation’. • ‘nothing’ denotes no unit reservations. Sometimes unit reservations for different insns contain common parts. In such a case, you can simplify the pipeline description by describing the common part by the following construction (define_reservation RESERVATION-NAME REGEXP) RESERVATION-NAME is a string giving name of REGEXP. Functional unit names and reservation names are in the same name space. So the reservation names should be different from the functional unit names and cannot be the reserved name ‘nothing’. The following construction is used to describe exceptions in the latency time for a given instruction pair. These are so-called bypasses. (define_bypass NUMBER OUT_INSN_NAMES IN_INSN_NAMES [GUARD]) NUMBER defines when the result generated by the instructions given in string OUT_INSN_NAMES will be ready for the instructions given in string IN_INSN_NAMES. 
Each of these strings is a comma-separated list of filename-style globs and they refer to the names of ‘define_insn_reservation’s. For example: (define_bypass 1 "cpu1_load_*, cpu1_store_*" "cpu1_load_*") defines a bypass between instructions that start with ‘cpu1_load_’ or ‘cpu1_store_’ and those that start with ‘cpu1_load_’. GUARD is an optional string giving the name of a C function which defines an additional guard for the bypass. The function will get the two insns as parameters. If the function returns zero the bypass will be ignored for this case. The additional guard is necessary to recognize complicated bypasses, e.g. when the consumer is only an address of insn ‘store’ (not a stored value). If there is more than one bypass with the same output and input insns, the chosen bypass is the first bypass with a guard in the description whose guard function returns nonzero. If there is no such bypass, then the bypass without the guard function is chosen. The following five constructions are usually used to describe VLIW processors, or more precisely, to describe a placement of small instructions into VLIW instruction slots. They can be used for RISC processors, too. (exclusion_set UNIT-NAMES UNIT-NAMES) (presence_set UNIT-NAMES PATTERNS) (final_presence_set UNIT-NAMES PATTERNS) (absence_set UNIT-NAMES PATTERNS) (final_absence_set UNIT-NAMES PATTERNS) UNIT-NAMES is a string giving names of functional units separated by commas. PATTERNS is a string giving patterns of functional units separated by commas. Currently a pattern is one unit or several units separated by white space. The first construction (‘exclusion_set’) means that each functional unit in the first string cannot be reserved simultaneously with a unit whose name is in the second string and vice versa. For example, the construction is useful for describing processors (e.g. 
some SPARC processors) with a fully pipelined floating point functional unit which can execute simultaneously only single floating point insns or only double floating point insns. The second construction (‘presence_set’) means that each functional unit in the first string cannot be reserved unless at least one of the patterns of units whose names are in the second string is reserved. This is an asymmetric relation. For example, it is useful for describing that VLIW ‘slot1’ is reserved after ‘slot0’ reservation. We could describe it by the following construction (presence_set "slot1" "slot0") Or ‘slot1’ is reserved only after ‘slot0’ and unit ‘b0’ reservation. In this case we could write (presence_set "slot1" "slot0 b0") The third construction (‘final_presence_set’) is analogous to ‘presence_set’. The difference between them is when checking is done. When an instruction is issued in a given automaton state reflecting all current and planned unit reservations, the automaton state is changed. The first state is a source state, the second one is a result state. Checking for ‘presence_set’ is done on the source state reservation, checking for ‘final_presence_set’ is done on the result reservation. This construction is useful to describe a reservation which is actually two subsequent reservations. For example, if we use (presence_set "slot1" "slot0") the following insn will never be issued (because ‘slot1’ requires ‘slot0’ which is absent in the source state). (define_reservation "insn_and_nop" "slot0 + slot1") but it can be issued if we use the analogous ‘final_presence_set’. The fourth construction (‘absence_set’) means that each functional unit in the first string can be reserved only if each pattern of units whose names are in the second string is not reserved. This is an asymmetric relation (actually ‘exclusion_set’ is analogous to this one but it is symmetric). 
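The pattern checks behind ‘presence_set’ and ‘absence_set’ can be modelled in Python as follows. This is an illustrative sketch, not the code of the automaton generator: the function names are hypothetical, reserved units are represented as a plain set of names, and a pattern is taken to be "reserved" when all of its units are reserved.

```python
def parse_patterns(patterns):
    """Split PATTERNS: commas separate patterns, white space separates units."""
    return [set(p.split()) for p in patterns.split(",") if p.strip()]


def presence_ok(reserved, patterns):
    """presence_set check: at least one pattern has all its units reserved."""
    return any(p <= reserved for p in parse_patterns(patterns))


def absence_ok(reserved, patterns):
    """absence_set check: no pattern has all of its units reserved."""
    return not any(p <= reserved for p in parse_patterns(patterns))


if __name__ == "__main__":
    # (presence_set "slot1" "slot0 b0"): slot1 needs both slot0 and b0.
    print(presence_ok({"slot0"}, "slot0 b0"))
    print(presence_ok({"slot0", "b0"}, "slot0 b0"))
```

For the ‘final_’ variants the same checks would be applied to the result state rather than the source state; which set of reserved units is passed in is the only difference in this model.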
For example it might be useful in a VLIW description to say that ‘slot0’ cannot be reserved after either ‘slot1’ or ‘slot2’ have been reserved. This can be described as: (absence_set "slot0" "slot1, slot2") Or ‘slot2’ cannot be reserved if ‘slot0’ and unit ‘b0’ are reserved or ‘slot1’ and unit ‘b1’ are reserved. In this case we could write (absence_set "slot2" "slot0 b0, slot1 b1") All functional units mentioned in a set should belong to the same automaton. The last construction (‘final_absence_set’) is analogous to ‘absence_set’ but checking is done on the result (state) reservation. See comments for ‘final_presence_set’. You can control the generator of the pipeline hazard recognizer with the following construction. (automata_option OPTIONS) OPTIONS is a string giving options which affect the generated code. Currently there are the following options: • “no-minimization” makes no minimization of the automaton. This is only worth to do when we are debugging the description and need to look more accurately at reservations of states. • “time” means printing time statistics about the generation of automata. • “stats” means printing statistics about the generated automata such as the number of DFA states, NDFA states and arcs. • “v” means a generation of the file describing the result automata. The file has suffix ‘.dfa’ and can be used for the description verification and debugging. • “w” means a generation of warning instead of error for non-critical errors. • “no-comb-vect” prevents the automaton generator from generating two data structures and comparing them for space efficiency. Using a comb vector to represent transitions may be better, but it can be very expensive to construct. This option is useful if the build process spends an unacceptably long time in genautomata. • “ndfa” makes nondeterministic finite state automata. This affects the treatment of operator ‘|’ in the regular expressions. 
The usual treatment of the operator is to try the first alternative and, if the reservation is not possible, the second alternative. The nondeterministic treatment means trying all alternatives, some of them may be rejected by reservations in the subsequent insns. • “collapse-ndfa” modifies the behavior of the generator when producing an automaton. An additional state transition to collapse a nondeterministic NDFA state to a deterministic DFA state is generated. It can be triggered by passing ‘const0_rtx’ to state_transition. In such an automaton, cycle advance transitions are available only for these collapsed states. This option is useful for ports that want to use the ‘ndfa’ option, but also want to use ‘define_query_cpu_unit’ to assign units to insns issued in a cycle. • “progress” means output of a progress bar showing how many states were generated so far for automaton being processed. This is useful during debugging a DFA description. If you see too many generated states, you could interrupt the generator of the pipeline hazard recognizer and try to figure out a reason for generation of the huge automaton. As an example, consider a superscalar RISC machine which can issue three insns (two integer insns and one floating point insn) on the cycle but can finish only two insns. To describe this, we define the following functional units. (define_cpu_unit "i0_pipeline, i1_pipeline, f_pipeline") (define_cpu_unit "port0, port1") All simple integer insns can be executed in any integer pipeline and their result is ready in two cycles. The simple integer insns are issued into the first pipeline unless it is reserved, otherwise they are issued into the second pipeline. Integer division and multiplication insns can be executed only in the second integer pipeline and their results are ready correspondingly in 9 and 4 cycles. The integer division is not pipelined, i.e. the subsequent integer division insn cannot be issued until the current division insn finished. 
Floating point insns are fully pipelined and their results are ready in 3 cycles. Where the result of a floating point insn is used by an integer insn, an additional delay of one cycle is incurred. To describe all of this we could specify (define_cpu_unit "div") (define_insn_reservation "simple" 2 (eq_attr "type" "int") "(i0_pipeline | i1_pipeline), (port0 | port1)") (define_insn_reservation "mult" 4 (eq_attr "type" "mult") "i1_pipeline, nothing*2, (port0 | port1)") (define_insn_reservation "div" 9 (eq_attr "type" "div") "i1_pipeline, div*7, div + (port0 | port1)") (define_insn_reservation "float" 3 (eq_attr "type" "float") "f_pipeline, nothing, (port0 | port1)") (define_bypass 4 "float" "simple,mult,div") To simplify the description we could describe the following reservation (define_reservation "finish" "port0|port1") and use it in all ‘define_insn_reservation’ as in the following construction (define_insn_reservation "simple" 2 (eq_attr "type" "int") "(i0_pipeline | i1_pipeline), finish") ---------- Footnotes ---------- (1) However, the size of the automaton depends on processor complexity. To limit this effect, machine descriptions can split orthogonal parts of the machine description among several automata: but then, since each of these must be stepped independently, this does cause a small decrease in the algorithm's performance.  File: gccint.info, Node: Conditional Execution, Next: Define Subst, Prev: Insn Attributes, Up: Machine Desc 17.21 Conditional Execution =========================== A number of architectures provide for some form of conditional execution, or predication. The hallmark of this feature is the ability to nullify most of the instructions in the instruction set. When the instruction set is large and not entirely symmetric, it can be quite tedious to describe these forms directly in the ‘.md’ file. An alternative is the ‘define_cond_exec’ template. 
(define_cond_exec [PREDICATE-PATTERN] "CONDITION" "OUTPUT-TEMPLATE" "OPTIONAL-INSN-ATTRIBUTES") PREDICATE-PATTERN is the condition that must be true for the insn to be executed at runtime and should match a relational operator. One can use ‘match_operator’ to match several relational operators at once. Any ‘match_operand’ operands must have no more than one alternative. CONDITION is a C expression that must be true for the generated pattern to match. OUTPUT-TEMPLATE is a string similar to the ‘define_insn’ output template (*note Output Template::), except that the ‘*’ and ‘@’ special cases do not apply. This is only useful if the assembly text for the predicate is a simple prefix to the main insn. In order to handle the general case, there is a global variable ‘current_insn_predicate’ that will contain the entire predicate if the current insn is predicated, and will otherwise be ‘NULL’. OPTIONAL-INSN-ATTRIBUTES is an optional vector of attributes that gets appended to the insn attributes of the produced cond_exec rtx. It can be used to add some distinguishing attribute to cond_exec rtxs produced that way. An example usage would be to use this attribute in conjunction with attributes on the main pattern to disable particular alternatives under certain conditions. When ‘define_cond_exec’ is used, an implicit reference to the ‘predicable’ instruction attribute is made. *Note Insn Attributes::. This attribute must be a boolean (i.e. have exactly two elements in its LIST-OF-VALUES), with the possible values being ‘no’ and ‘yes’. The default and all uses in the insns must be a simple constant, not a complex expression. It may, however, depend on the alternative, by using a comma-separated list of values. If that is the case, the port should also define an ‘enabled’ attribute (*note Disable Insn Alternatives::), which should also allow only ‘no’ and ‘yes’ as its values. 
For each ‘define_insn’ for which the ‘predicable’ attribute is true, a new ‘define_insn’ pattern will be generated that matches a predicated version of the instruction. For example, (define_insn "addsi" [(set (match_operand:SI 0 "register_operand" "r") (plus:SI (match_operand:SI 1 "register_operand" "r") (match_operand:SI 2 "register_operand" "r")))] "TEST1" "add %2,%1,%0") (define_cond_exec [(ne (match_operand:CC 0 "register_operand" "c") (const_int 0))] "TEST2" "(%0)") generates a new pattern (define_insn "" [(cond_exec (ne (match_operand:CC 3 "register_operand" "c") (const_int 0)) (set (match_operand:SI 0 "register_operand" "r") (plus:SI (match_operand:SI 1 "register_operand" "r") (match_operand:SI 2 "register_operand" "r"))))] "(TEST2) && (TEST1)" "(%3) add %2,%1,%0")  File: gccint.info, Node: Define Subst, Next: Constant Definitions, Prev: Conditional Execution, Up: Machine Desc 17.22 RTL Templates Transformations =================================== For some hardware architectures there are common cases when the RTL templates for the instructions can be derived from the other RTL templates using simple transformations. E.g., ‘i386.md’ contains an RTL template for the ordinary ‘sub’ instruction-- ‘*subsi_1’, and for the ‘sub’ instruction with subsequent zero-extension--‘*subsi_1_zext’. Such cases can be easily implemented by a single meta-template capable of generating a modified case based on the initial one: (define_subst "NAME" [INPUT-TEMPLATE] "CONDITION" [OUTPUT-TEMPLATE]) INPUT-TEMPLATE is a pattern describing the source RTL template, which will be transformed. CONDITION is a C expression that is conjunct with the condition from the input-template to generate a condition to be used in the output-template. OUTPUT-TEMPLATE is a pattern that will be used in the resulting template. ‘define_subst’ mechanism is tightly coupled with the notion of the subst attribute (*note Subst Iterators::). 
The use of ‘define_subst’ is triggered by a reference to a subst attribute in the transforming RTL template. This reference initiates duplication of the source RTL template and substitution of the attributes with their values. The source RTL template is left unchanged, while the copy is transformed by ‘define_subst’. This transformation can fail in the case when the source RTL template is not matched against the input-template of the ‘define_subst’. In such case the copy is deleted. ‘define_subst’ can be used only in ‘define_insn’ and ‘define_expand’, it cannot be used in other expressions (e.g. in ‘define_insn_and_split’). * Menu: * Define Subst Example:: Example of ‘define_subst’ work. * Define Subst Pattern Matching:: Process of template comparison. * Define Subst Output Template:: Generation of output template.  File: gccint.info, Node: Define Subst Example, Next: Define Subst Pattern Matching, Up: Define Subst 17.22.1 ‘define_subst’ Example ------------------------------ To illustrate how ‘define_subst’ works, let us examine a simple template transformation. Suppose there are two kinds of instructions: one that touches flags and the other that does not. The instructions of the second type could be generated with the following ‘define_subst’: (define_subst "add_clobber_subst" [(set (match_operand:SI 0 "" "") (match_operand:SI 1 "" ""))] "" [(set (match_dup 0) (match_dup 1)) (clobber (reg:CC FLAGS_REG))]) This ‘define_subst’ can be applied to any RTL pattern containing ‘set’ of mode SI and generates a copy with clobber when it is applied. Assume there is an RTL template for a ‘max’ instruction to be used in ‘define_subst’ mentioned above: (define_insn "maxsi" [(set (match_operand:SI 0 "register_operand" "=r") (max:SI (match_operand:SI 1 "register_operand" "r") (match_operand:SI 2 "register_operand" "r")))] "" "max\t{%2, %1, %0|%0, %1, %2}" [...]) To mark the RTL template for ‘define_subst’ application, subst-attributes are used. 
They should be declared in advance:

     (define_subst_attr "add_clobber_name" "add_clobber_subst" "_noclobber" "_clobber")

Here ‘add_clobber_name’ is the attribute name, ‘add_clobber_subst’ is the name of the corresponding ‘define_subst’, the third argument (‘_noclobber’) is the attribute value that would be substituted into the unchanged version of the source RTL template, and the last argument (‘_clobber’) is the value that would be substituted into the second, transformed, version of the RTL template.

Once the subst-attribute has been defined, it should be used in RTL templates which need to be processed by the ‘define_subst’. So, the original RTL template should be changed:

     (define_insn "maxsi<add_clobber_name>"
       [(set (match_operand:SI 0 "register_operand" "=r")
             (max:SI
               (match_operand:SI 1 "register_operand" "r")
               (match_operand:SI 2 "register_operand" "r")))]
       ""
       "max\t{%2, %1, %0|%0, %1, %2}"
      [...])

The result of the ‘define_subst’ usage would look like the following:

     (define_insn "maxsi_noclobber"
       [(set (match_operand:SI 0 "register_operand" "=r")
             (max:SI
               (match_operand:SI 1 "register_operand" "r")
               (match_operand:SI 2 "register_operand" "r")))]
       ""
       "max\t{%2, %1, %0|%0, %1, %2}"
      [...])
     (define_insn "maxsi_clobber"
       [(set (match_operand:SI 0 "register_operand" "=r")
             (max:SI
               (match_operand:SI 1 "register_operand" "r")
               (match_operand:SI 2 "register_operand" "r")))
        (clobber (reg:CC FLAGS_REG))]
       ""
       "max\t{%2, %1, %0|%0, %1, %2}"
      [...])

File: gccint.info, Node: Define Subst Pattern Matching, Next: Define Subst Output Template, Prev: Define Subst Example, Up: Define Subst

17.22.2 Pattern Matching in ‘define_subst’
------------------------------------------

All expressions, allowed in ‘define_insn’ or ‘define_expand’, are allowed in the input-template of ‘define_subst’, except ‘match_par_dup’, ‘match_scratch’, ‘match_parallel’.
The meanings of expressions in the input-template were changed: ‘match_operand’ matches any expression (possibly, a subtree in RTL-template), if modes of the ‘match_operand’ and this expression are the same, or mode of the ‘match_operand’ is ‘VOIDmode’, or this expression is ‘match_dup’, ‘match_op_dup’. If the expression is ‘match_operand’ too, and predicate of ‘match_operand’ from the input pattern is not empty, then the predicates are compared. That can be used for more accurate filtering of accepted RTL-templates. ‘match_operator’ matches common operators (like ‘plus’, ‘minus’), ‘unspec’, ‘unspec_volatile’ operators and ‘match_operator’s from the original pattern if the modes match and ‘match_operator’ from the input pattern has the same number of operands as the operator from the original pattern.  File: gccint.info, Node: Define Subst Output Template, Prev: Define Subst Pattern Matching, Up: Define Subst 17.22.3 Generation of output template in ‘define_subst’ ------------------------------------------------------- If all necessary checks for ‘define_subst’ application pass, a new RTL-pattern, based on the output-template, is created to replace the old template. Like in input-patterns, meanings of some RTL expressions are changed when they are used in output-patterns of a ‘define_subst’. Thus, ‘match_dup’ is used for copying the whole expression from the original pattern, which matched corresponding ‘match_operand’ from the input pattern. ‘match_dup N’ is used in the output template to be replaced with the expression from the original pattern, which matched ‘match_operand N’ from the input pattern. As a consequence, ‘match_dup’ cannot be used to point to ‘match_operand’s from the output pattern, it should always refer to a ‘match_operand’ from the input pattern. 
If a ‘match_dup N’ occurs more than once in the output template, its first occurrence is replaced with the expression from the original pattern, and the subsequent expressions are replaced with ‘match_dup N’, i.e., a reference to the first expression. In the output template one can refer to the expressions from the original pattern and create new ones. For instance, some operands could be added by means of standard ‘match_operand’. After replacing ‘match_dup’ with some RTL-subtree from the original pattern, it could happen that several ‘match_operand’s in the output pattern have the same indexes. It is unknown, how many and what indexes would be used in the expression which would replace ‘match_dup’, so such conflicts in indexes are inevitable. To overcome this issue, ‘match_operands’ and ‘match_operators’, which were introduced into the output pattern, are renumerated when all ‘match_dup’s are replaced. Number of alternatives in ‘match_operand’s introduced into the output template ‘M’ could differ from the number of alternatives in the original pattern ‘N’, so in the resultant pattern there would be ‘N*M’ alternatives. Thus, constraints from the original pattern would be duplicated ‘N’ times, constraints from the output pattern would be duplicated ‘M’ times, producing all possible combinations.  File: gccint.info, Node: Constant Definitions, Next: Iterators, Prev: Define Subst, Up: Machine Desc 17.23 Constant Definitions ========================== Using literal constants inside instruction patterns reduces legibility and can be a maintenance problem. To overcome this problem, you may use the ‘define_constants’ expression. It contains a vector of name-value pairs. From that point on, wherever any of the names appears in the MD file, it is as if the corresponding value had been written instead. You may use ‘define_constants’ multiple times; each appearance adds more constants to the table. It is an error to redefine a constant with a different value. 
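The table behaviour described above (each ‘define_constants’ adds entries, and redefining a name with a different value is an error) can be modelled in a few lines of C. This is only an illustrative sketch, not the code GCC's generator programs use; the names ‘add_md_constant’, ‘lookup_md_constant’ and ‘MAX_CONSTANTS’ are hypothetical, and treating a repeated definition with the same value as harmless is an assumption drawn from the wording above.

```c
#include <string.h>

/* Hypothetical fixed-size model of the name-value table that
   successive define_constants expressions add to.  */
#define MAX_CONSTANTS 64

struct md_constant
{
  const char *name;
  int value;
};

static struct md_constant md_constants[MAX_CONSTANTS];
static int n_md_constants;

/* Add NAME with VALUE to the table.  Return 0 on success (including a
   repeated definition with the same value, which is assumed to be
   harmless) and -1 if NAME already has a different value or the table
   is full.  */
int
add_md_constant (const char *name, int value)
{
  for (int i = 0; i < n_md_constants; i++)
    if (strcmp (md_constants[i].name, name) == 0)
      return md_constants[i].value == value ? 0 : -1;

  if (n_md_constants == MAX_CONSTANTS)
    return -1;

  md_constants[n_md_constants].name = name;
  md_constants[n_md_constants].value = value;
  n_md_constants++;
  return 0;
}

/* Store the value of NAME in *VALUE and return 1, or return 0 if NAME
   has not been defined.  */
int
lookup_md_constant (const char *name, int *value)
{
  for (int i = 0; i < n_md_constants; i++)
    if (strcmp (md_constants[i].name, name) == 0)
      {
        *value = md_constants[i].value;
        return 1;
      }
  return 0;
}
```

In this model, adding ‘R_CR’ as 179 twice succeeds, while a later attempt to add ‘R_CR’ as 178 is rejected.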
To come back to the a29k load multiple example, instead of

     (define_insn ""
       [(match_parallel 0 "load_multiple_operation"
          [(set (match_operand:SI 1 "gpc_reg_operand" "=r")
                (match_operand:SI 2 "memory_operand" "m"))
           (use (reg:SI 179))
           (clobber (reg:SI 179))])]
       ""
       "loadm 0,0,%1,%2")

you could write:

     (define_constants [
         (R_BP 177)
         (R_FC 178)
         (R_CR 179)
         (R_Q  180)
     ])

     (define_insn ""
       [(match_parallel 0 "load_multiple_operation"
          [(set (match_operand:SI 1 "gpc_reg_operand" "=r")
                (match_operand:SI 2 "memory_operand" "m"))
           (use (reg:SI R_CR))
           (clobber (reg:SI R_CR))])]
       ""
       "loadm 0,0,%1,%2")

The constants that are defined with ‘define_constants’ are also output in the ‘insn-constants.h’ header file as ‘#define’s.

You can also use the machine description file to define enumerations. Like the constants defined by ‘define_constants’, these enumerations are visible to both the machine description file and the main C code.

The syntax is as follows:

     (define_c_enum "NAME" [
       VALUE0
       VALUE1
       (VALUE32 32)
       VALUE33
       ...
       VALUEN
     ])

This definition causes the equivalent of the following C code to appear in ‘insn-constants.h’:

     enum NAME {
       VALUE0 = 0,
       VALUE1 = 1,
       VALUE32 = 32,
       VALUE33 = 33,
       ...
       VALUEN = N
     };
     #define NUM_CNAME_VALUES (N + 1)

where CNAME is the capitalized form of NAME. It also makes each VALUEI available in the machine description file, just as if it had been declared with:

     (define_constants [(VALUEI I)])

Each VALUEI is usually an upper-case identifier and usually begins with CNAME.

You can split the enumeration definition into as many statements as you like. The above example is directly equivalent to:

     (define_c_enum "NAME" [VALUE0])
     (define_c_enum "NAME" [VALUE1])
     ...
     (define_c_enum "NAME" [VALUEN])

Splitting the enumeration helps to improve the modularity of each individual ‘.md’ file.
For example, if a port defines its synchronization instructions in a separate ‘sync.md’ file, it is convenient to define all synchronization-specific enumeration values in ‘sync.md’ rather than in the main ‘.md’ file. Some enumeration names have special significance to GCC: ‘unspecv’ If an enumeration called ‘unspecv’ is defined, GCC will use it when printing out ‘unspec_volatile’ expressions. For example: (define_c_enum "unspecv" [ UNSPECV_BLOCKAGE ]) causes GCC to print ‘(unspec_volatile ... 0)’ as: (unspec_volatile ... UNSPECV_BLOCKAGE) ‘unspec’ If an enumeration called ‘unspec’ is defined, GCC will use it when printing out ‘unspec’ expressions. GCC will also use it when printing out ‘unspec_volatile’ expressions unless an ‘unspecv’ enumeration is also defined. You can therefore decide whether to keep separate enumerations for volatile and non-volatile expressions or whether to use the same enumeration for both. Another way of defining an enumeration is to use ‘define_enum’: (define_enum "NAME" [ VALUE0 VALUE1 ... VALUEN ]) This directive implies: (define_c_enum "NAME" [ CNAME_CVALUE0 CNAME_CVALUE1 ... CNAME_CVALUEN ]) where CVALUEI is the capitalized form of VALUEI. However, unlike ‘define_c_enum’, the enumerations defined by ‘define_enum’ can be used in attribute specifications (*note define_enum_attr::).  File: gccint.info, Node: Iterators, Prev: Constant Definitions, Up: Machine Desc 17.24 Iterators =============== Ports often need to define similar patterns for more than one machine mode or for more than one rtx code. GCC provides some simple iterator facilities to make this process easier. * Menu: * Mode Iterators:: Generating variations of patterns for different modes. * Code Iterators:: Doing the same for codes. * Int Iterators:: Doing the same for integers. * Subst Iterators:: Generating variations of patterns for define_subst. * Parameterized Names:: Specifying iterator values in C++ code.  
File: gccint.info, Node: Mode Iterators, Next: Code Iterators, Up: Iterators 17.24.1 Mode Iterators ---------------------- Ports often need to define similar patterns for two or more different modes. For example: • If a processor has hardware support for both single and double floating-point arithmetic, the ‘SFmode’ patterns tend to be very similar to the ‘DFmode’ ones. • If a port uses ‘SImode’ pointers in one configuration and ‘DImode’ pointers in another, it will usually have very similar ‘SImode’ and ‘DImode’ patterns for manipulating pointers. Mode iterators allow several patterns to be instantiated from one ‘.md’ file template. They can be used with any type of rtx-based construct, such as a ‘define_insn’, ‘define_split’, or ‘define_peephole2’. * Menu: * Defining Mode Iterators:: Defining a new mode iterator. * Substitutions:: Combining mode iterators with substitutions * Examples:: Examples  File: gccint.info, Node: Defining Mode Iterators, Next: Substitutions, Up: Mode Iterators 17.24.1.1 Defining Mode Iterators ................................. The syntax for defining a mode iterator is: (define_mode_iterator NAME [(MODE1 "COND1") ... (MODEN "CONDN")]) This allows subsequent ‘.md’ file constructs to use the mode suffix ‘:NAME’. Every construct that does so will be expanded N times, once with every use of ‘:NAME’ replaced by ‘:MODE1’, once with every use replaced by ‘:MODE2’, and so on. In the expansion for a particular MODEI, every C condition will also require that CONDI be true. For example: (define_mode_iterator P [(SI "Pmode == SImode") (DI "Pmode == DImode")]) defines a new mode suffix ‘:P’. Every construct that uses ‘:P’ will be expanded twice, once with every ‘:P’ replaced by ‘:SI’ and once with every ‘:P’ replaced by ‘:DI’. The ‘:SI’ version will only apply if ‘Pmode == SImode’ and the ‘:DI’ version will only apply if ‘Pmode == DImode’. As with other ‘.md’ conditions, an empty string is treated as "always true". 
‘(MODE "")’ can also be abbreviated to ‘MODE’. For example: (define_mode_iterator GPR [SI (DI "TARGET_64BIT")]) means that the ‘:DI’ expansion only applies if ‘TARGET_64BIT’ but that the ‘:SI’ expansion has no such constraint. It is also possible to include iterators in other iterators. For example: (define_mode_iterator VI [V16QI V8HI V4SI V2DI]) (define_mode_iterator VF [V8HF V4SF (V2DF "TARGET_DOUBLE")]) (define_mode_iterator V [VI (VF "TARGET_FLOAT")]) makes ‘:V’ iterate over the modes in ‘VI’ and the modes in ‘VF’. When a construct uses ‘:V’, the ‘V8HF’ and ‘V4SF’ expansions require ‘TARGET_FLOAT’ while the ‘V2DF’ expansion requires ‘TARGET_DOUBLE && TARGET_FLOAT’. Iterators are applied in the order they are defined. This can be significant if two iterators are used in a construct that requires substitutions. *Note Substitutions::.  File: gccint.info, Node: Substitutions, Next: Examples, Prev: Defining Mode Iterators, Up: Mode Iterators 17.24.1.2 Substitution in Mode Iterators ........................................ If an ‘.md’ file construct uses mode iterators, each version of the construct will often need slightly different strings or modes. For example: • When a ‘define_expand’ defines several ‘addM3’ patterns (*note Standard Names::), each expander will need to use the appropriate mode name for M. • When a ‘define_insn’ defines several instruction patterns, each instruction will often use a different assembler mnemonic. • When a ‘define_insn’ requires operands with different modes, using an iterator for one of the operand modes usually requires a specific mode for the other operand(s). GCC supports such variations through a system of "mode attributes". There are two standard attributes: ‘mode’, which is the name of the mode in lower case, and ‘MODE’, which is the same thing in upper case. You can define other attributes using: (define_mode_attr NAME [(MODE1 "VALUE1") ... 
(MODEN "VALUEN")])

where NAME is the name of the attribute and VALUEI is the value associated with MODEI.

When GCC replaces some :ITERATOR with :MODE, it will scan each string and mode in the pattern for sequences of the form ‘<ITERATOR:ATTR>’, where ATTR is the name of a mode attribute. If the attribute is defined for MODE, the whole ‘<...>’ sequence will be replaced by the appropriate attribute value.

For example, suppose an ‘.md’ file has:

     (define_mode_iterator P [(SI "Pmode == SImode") (DI "Pmode == DImode")])
     (define_mode_attr load [(SI "lw") (DI "ld")])

If one of the patterns that uses ‘:P’ contains the string ‘"<P:load>\t%0,%1"’, the ‘SI’ version of that pattern will use ‘"lw\t%0,%1"’ and the ‘DI’ version will use ‘"ld\t%0,%1"’.

Here is an example of using an attribute for a mode:

     (define_mode_iterator LONG [SI DI])
     (define_mode_attr SHORT [(SI "HI") (DI "SI")])
     (define_insn ...
       (sign_extend:LONG (match_operand:<LONG:SHORT> ...)) ...)

The ‘ITERATOR:’ prefix may be omitted, in which case the substitution will be attempted for every iterator expansion.

File: gccint.info, Node: Examples, Prev: Substitutions, Up: Mode Iterators

17.24.1.3 Mode Iterator Examples
................................

Here is an example from the MIPS port.
It defines the following modes and attributes (among others):

     (define_mode_iterator GPR [SI (DI "TARGET_64BIT")])
     (define_mode_attr d [(SI "") (DI "d")])

and uses the following template to define both ‘subsi3’ and ‘subdi3’:

     (define_insn "sub<mode>3"
       [(set (match_operand:GPR 0 "register_operand" "=d")
             (minus:GPR (match_operand:GPR 1 "register_operand" "d")
                        (match_operand:GPR 2 "register_operand" "d")))]
       ""
       "<d>subu\t%0,%1,%2"
       [(set_attr "type" "arith")
        (set_attr "mode" "<MODE>")])

This is exactly equivalent to:

     (define_insn "subsi3"
       [(set (match_operand:SI 0 "register_operand" "=d")
             (minus:SI (match_operand:SI 1 "register_operand" "d")
                       (match_operand:SI 2 "register_operand" "d")))]
       ""
       "subu\t%0,%1,%2"
       [(set_attr "type" "arith")
        (set_attr "mode" "SI")])

     (define_insn "subdi3"
       [(set (match_operand:DI 0 "register_operand" "=d")
             (minus:DI (match_operand:DI 1 "register_operand" "d")
                       (match_operand:DI 2 "register_operand" "d")))]
       "TARGET_64BIT"
       "dsubu\t%0,%1,%2"
       [(set_attr "type" "arith")
        (set_attr "mode" "DI")])

File: gccint.info, Node: Code Iterators, Next: Int Iterators, Prev: Mode Iterators, Up: Iterators

17.24.2 Code Iterators
----------------------

Code iterators operate in a similar way to mode iterators. *Note Mode Iterators::.

The construct:

     (define_code_iterator NAME [(CODE1 "COND1") ... (CODEN "CONDN")])

defines a pseudo rtx code NAME that can be instantiated as CODEI if condition CONDI is true. Each CODEI must have the same rtx format. *Note RTL Classes::.

As with mode iterators, each pattern that uses NAME will be expanded N times, once with all uses of NAME replaced by CODE1, once with all uses replaced by CODE2, and so on. *Note Defining Mode Iterators::.

It is possible to define attributes for codes as well as for modes. There are two standard code attributes: ‘code’, the name of the code in lower case, and ‘CODE’, the name of the code in upper case. Other attributes are defined using:

     (define_code_attr NAME [(CODE1 "VALUE1") ...
(CODEN "VALUEN")])

Instruction patterns can use code attributes as rtx codes, which can be useful if two sets of codes act in tandem. For example, the following ‘define_insn’ defines two patterns, one calculating a signed absolute difference and another calculating an unsigned absolute difference:

     (define_code_iterator any_max [smax umax])
     (define_code_attr paired_min [(smax "smin") (umax "umin")])
     (define_insn ...
       [(set (match_operand:SI 0 ...)
             (minus:SI (any_max:SI (match_operand:SI 1 ...)
                                   (match_operand:SI 2 ...))
                       (<paired_min>:SI (match_dup 1) (match_dup 2))))]
       ...)

The signed version of the instruction uses ‘smax’ and ‘smin’ while the unsigned version uses ‘umax’ and ‘umin’. There are no versions that pair ‘smax’ with ‘umin’ or ‘umax’ with ‘smin’.

Here's an example of code iterators in action, taken from the MIPS port:

     (define_code_iterator any_cond [unordered ordered unlt unge uneq ltgt unle ungt
                                     eq ne gt ge lt le gtu geu ltu leu])

     (define_expand "b<code>"
       [(set (pc)
             (if_then_else (any_cond:CC (cc0)
                                        (const_int 0))
                           (label_ref (match_operand 0 ""))
                           (pc)))]
       ""
     {
       gen_conditional_branch (operands, <CODE>);
       DONE;
     })

This is equivalent to:

     (define_expand "bunordered"
       [(set (pc)
             (if_then_else (unordered:CC (cc0)
                                         (const_int 0))
                           (label_ref (match_operand 0 ""))
                           (pc)))]
       ""
     {
       gen_conditional_branch (operands, UNORDERED);
       DONE;
     })

     (define_expand "bordered"
       [(set (pc)
             (if_then_else (ordered:CC (cc0)
                                       (const_int 0))
                           (label_ref (match_operand 0 ""))
                           (pc)))]
       ""
     {
       gen_conditional_branch (operands, ORDERED);
       DONE;
     })

     ...

File: gccint.info, Node: Int Iterators, Next: Subst Iterators, Prev: Code Iterators, Up: Iterators

17.24.3 Int Iterators
---------------------

Int iterators operate in a similar way to code iterators. *Note Code Iterators::.

The construct:

     (define_int_iterator NAME [(INT1 "COND1") ... (INTN "CONDN")])

defines a pseudo integer constant NAME that can be instantiated as INTI if condition CONDI is true.
Int iterators can appear in only those rtx fields that have 'i', 'n', 'w', or 'p' as the specifier. This means that each INT has to be a constant defined using ‘define_constants’ or ‘define_c_enum’.

As with mode and code iterators, each pattern that uses NAME will be expanded N times, once with all uses of NAME replaced by INT1, once with all uses replaced by INT2, and so on. *Note Defining Mode Iterators::.

It is possible to define attributes for ints as well as for codes and modes. Attributes are defined using:

     (define_int_attr ATTR_NAME [(INT1 "VALUE1") ... (INTN "VALUEN")])

In addition to these user-defined attributes, it is possible to use ‘<NAME>’ to refer to the current expansion of iterator NAME (such as INT1, INT2, and so on).

Here's an example of int iterators in action, taken from the ARM port:

     (define_int_iterator QABSNEG [UNSPEC_VQABS UNSPEC_VQNEG])

     (define_int_attr absneg [(UNSPEC_VQABS "abs") (UNSPEC_VQNEG "neg")])

     (define_insn "neon_vq<absneg><mode>"
       [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
             (unspec:VDQIW [(match_operand:VDQIW 1 "s_register_operand" "w")
                            (match_operand:SI 2 "immediate_operand" "i")]
                           QABSNEG))]
       "TARGET_NEON"
       "vq<absneg>.<V_s_elem>\t%<V_reg>0, %<V_reg>1"
       [(set_attr "type" "neon_vqneg_vqabs")]
     )

This is equivalent to:

     (define_insn "neon_vqabs<mode>"
       [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
             (unspec:VDQIW [(match_operand:VDQIW 1 "s_register_operand" "w")
                            (match_operand:SI 2 "immediate_operand" "i")]
                           UNSPEC_VQABS))]
       "TARGET_NEON"
       "vqabs.<V_s_elem>\t%<V_reg>0, %<V_reg>1"
       [(set_attr "type" "neon_vqneg_vqabs")]
     )

     (define_insn "neon_vqneg<mode>"
       [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
             (unspec:VDQIW [(match_operand:VDQIW 1 "s_register_operand" "w")
                            (match_operand:SI 2 "immediate_operand" "i")]
                           UNSPEC_VQNEG))]
       "TARGET_NEON"
       "vqneg.<V_s_elem>\t%<V_reg>0, %<V_reg>1"
       [(set_attr "type" "neon_vqneg_vqabs")]
     )

File: gccint.info, Node: Subst Iterators, Next: Parameterized Names, Prev: Int Iterators, Up: Iterators

17.24.4 Subst Iterators
-----------------------

Subst iterators are a special type of iterator with the
following restrictions: they cannot be declared explicitly, they always have exactly two values, and they do not have an explicit dedicated name. A subst iterator is triggered only when the corresponding subst-attribute is used in an RTL pattern.

Subst iterators transform templates in the following way: the templates are duplicated, the subst-attributes in these templates are replaced with the corresponding values, and a new attribute is implicitly added to the given ‘define_insn’/‘define_expand’. The name of the added attribute matches the name of the ‘define_subst’. Such attributes are declared implicitly, and it is not allowed to have a ‘define_attr’ with the same name as a ‘define_subst’.

Each subst iterator is linked to a ‘define_subst’. It is declared implicitly by the first appearance of the corresponding ‘define_subst_attr’, and it is not allowed to define it explicitly.

Declarations of subst-attributes have the following syntax:

     (define_subst_attr "NAME"
       "SUBST-NAME"
       "NO-SUBST-VALUE"
       "SUBST-APPLIED-VALUE")

NAME is a string by which the given subst-attribute can be referred to.

SUBST-NAME shows which ‘define_subst’ should be applied to an RTL-template if the given subst-attribute is present in the RTL-template.

NO-SUBST-VALUE is the value with which the subst-attribute is replaced in the first copy of the original RTL-template.

SUBST-APPLIED-VALUE is the value with which the subst-attribute is replaced in the second copy of the original RTL-template.

File: gccint.info, Node: Parameterized Names, Prev: Subst Iterators, Up: Iterators

17.24.5 Parameterized Names
---------------------------

Ports sometimes need to apply iterators using C++ code, in order to get the code or RTL pattern for a specific instruction.
For example, suppose we have the ‘neon_vq<absneg><mode>’ pattern given above:

     (define_int_iterator QABSNEG [UNSPEC_VQABS UNSPEC_VQNEG])

     (define_int_attr absneg [(UNSPEC_VQABS "abs") (UNSPEC_VQNEG "neg")])

     (define_insn "neon_vq<absneg><mode>"
       [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
             (unspec:VDQIW [(match_operand:VDQIW 1 "s_register_operand" "w")
                            (match_operand:SI 2 "immediate_operand" "i")]
                           QABSNEG))]
       ...
     )

A port might need to generate this pattern for a variable ‘QABSNEG’ value and a variable ‘VDQIW’ mode. There are two ways of doing this. The first is to build the rtx for the pattern directly from C++ code; this is a valid technique and avoids any risk of combinatorial explosion. The second is to prefix the instruction name with the special character ‘@’, which tells GCC to generate the four additional functions below. In each case, NAME is the name of the instruction without the leading ‘@’ character, without the ‘<...>’ placeholders, and with any underscore before a ‘<...>’ placeholder removed if keeping it would lead to a double or trailing underscore.

‘insn_code maybe_code_for_NAME (I1, I2, ...)’
     See whether replacing the first ‘<...>’ placeholder with iterator value I1, the second with iterator value I2, and so on, gives a valid instruction. Return its code if so, otherwise return ‘CODE_FOR_nothing’.

‘insn_code code_for_NAME (I1, I2, ...)’
     Same, but abort the compiler if the requested instruction does not exist.

‘rtx maybe_gen_NAME (I1, I2, ..., OP0, OP1, ...)’
     Check for a valid instruction in the same way as ‘maybe_code_for_NAME’. If the instruction exists, generate an instance of it using the operand values given by OP0, OP1, and so on, otherwise return null.

‘rtx gen_NAME (I1, I2, ..., OP0, OP1, ...)’
     Same, but abort the compiler if the requested instruction does not exist, or if the instruction generator invoked the ‘FAIL’ macro.
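The rule for deriving NAME from a pattern name can be sketched in C as follows. This is a minimal illustration of the rule as worded above, not the code GCC itself uses; the function name ‘overloaded_stem’ and the pattern name ‘@foo_<mode>_bar’ are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Copy into OUT (of size OUT_SIZE) the NAME derived from PATTERN:
   drop a leading '@', drop every <...> placeholder, and drop an
   underscore that immediately precedes a placeholder when keeping it
   would leave a trailing or double underscore.  Return 0 on success,
   or -1 if OUT is too small or a '<' has no matching '>'.  */
int
overloaded_stem (const char *pattern, char *out, size_t out_size)
{
  size_t len = 0;
  const char *p = pattern;

  if (out_size == 0)
    return -1;
  if (*p == '@')
    p++;

  while (*p != '\0')
    {
      if (*p == '<')
        {
          const char *end = strchr (p, '>');
          if (end == NULL)
            return -1;
          /* The text after the placeholder decides whether the
             underscore before it would become trailing ("foo_") or
             doubled ("foo__bar").  */
          if (len > 0 && out[len - 1] == '_'
              && (end[1] == '\0' || end[1] == '_'))
            len--;
          p = end + 1;
        }
      else
        {
          if (len + 1 >= out_size)
            return -1;
          out[len++] = *p++;
        }
    }

  out[len] = '\0';
  return 0;
}
```

Under this sketch, a pattern named ‘@foo_<mode>_bar’ yields the stem ‘foo_bar’, so the generated functions would be called ‘code_for_foo_bar’, ‘gen_foo_bar’, and so on.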
For example, changing the pattern above to:

     (define_insn "@neon_vq<absneg><mode>"
       [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
             (unspec:VDQIW [(match_operand:VDQIW 1 "s_register_operand" "w")
                            (match_operand:SI 2 "immediate_operand" "i")]
                           QABSNEG))]
       ...
     )

would define the same patterns as before, but in addition would generate the four functions below:

     insn_code maybe_code_for_neon_vq (int, machine_mode);
     insn_code code_for_neon_vq (int, machine_mode);
     rtx maybe_gen_neon_vq (int, machine_mode, rtx, rtx, rtx);
     rtx gen_neon_vq (int, machine_mode, rtx, rtx, rtx);

Calling ‘code_for_neon_vq (UNSPEC_VQABS, V8QImode)’ would then give ‘CODE_FOR_neon_vqabsv8qi’.

It is possible to have multiple ‘@’ patterns with the same name and same types of iterator. For example:

     (define_insn "@some_arithmetic_op<mode>"
       [(set (match_operand:INTEGER_MODES 0 "register_operand") ...)]
       ...
     )

     (define_insn "@some_arithmetic_op<mode>"
       [(set (match_operand:FLOAT_MODES 0 "register_operand") ...)]
       ...
     )

would produce a single set of functions that handles both ‘INTEGER_MODES’ and ‘FLOAT_MODES’.

It is also possible for these ‘@’ patterns to have different numbers of operands from each other. For example, patterns with a binary rtl code might take three operands (one output and two inputs) while patterns with a ternary rtl code might take four operands (one output and three inputs). This combination would produce separate ‘maybe_gen_NAME’ and ‘gen_NAME’ functions for each operand count, but it would still produce a single ‘maybe_code_for_NAME’ and a single ‘code_for_NAME’.

File: gccint.info, Node: Target Macros, Next: Host Config, Prev: Machine Desc, Up: Top

18 Target Description Macros and Functions
******************************************

In addition to the file ‘MACHINE.md’, a machine description includes a C header file conventionally given the name ‘MACHINE.h’ and a C source file named ‘MACHINE.c’.
The header file defines numerous macros that convey the information about the target machine that does not fit into the scheme of the ‘.md’ file. The file ‘tm.h’ should be a link to ‘MACHINE.h’. The header file ‘config.h’ includes ‘tm.h’ and most compiler source files include ‘config.h’. The source file defines a variable ‘targetm’, which is a structure containing pointers to functions and data relating to the target machine. ‘MACHINE.c’ should also contain their definitions, if they are not defined elsewhere in GCC, and other functions called through the macros defined in the ‘.h’ file. * Menu: * Target Structure:: The ‘targetm’ variable. * Driver:: Controlling how the driver runs the compilation passes. * Run-time Target:: Defining ‘-m’ options like ‘-m68000’ and ‘-m68020’. * Per-Function Data:: Defining data structures for per-function information. * Storage Layout:: Defining sizes and alignments of data. * Type Layout:: Defining sizes and properties of basic user data types. * Registers:: Naming and describing the hardware registers. * Register Classes:: Defining the classes of hardware registers. * Stack and Calling:: Defining which way the stack grows and by how much. * Varargs:: Defining the varargs macros. * Trampolines:: Code set up at run time to enter a nested function. * Library Calls:: Controlling how library routines are implicitly called. * Addressing Modes:: Defining addressing modes valid for memory operands. * Anchored Addresses:: Defining how ‘-fsection-anchors’ should work. * Condition Code:: Defining how insns update the condition code. * Costs:: Defining relative costs of different operations. * Scheduling:: Adjusting the behavior of the instruction scheduler. * Sections:: Dividing storage into text, data, and other sections. * PIC:: Macros for position independent code. * Assembler Format:: Defining how to write insns and pseudo-ops to output. * Debugging Info:: Defining the format of debugging output. 
* Floating Point:: Handling floating point for cross-compilers. * Mode Switching:: Insertion of mode-switching instructions. * Target Attributes:: Defining target-specific uses of ‘__attribute__’. * Emulated TLS:: Emulated TLS support. * MIPS Coprocessors:: MIPS coprocessor support and how to customize it. * PCH Target:: Validity checking for precompiled headers. * C++ ABI:: Controlling C++ ABI changes. * D Language and ABI:: Controlling D ABI changes. * Rust Language and ABI:: Controlling Rust ABI changes. * Named Address Spaces:: Adding support for named address spaces * Misc:: Everything else.  File: gccint.info, Node: Target Structure, Next: Driver, Up: Target Macros 18.1 The Global ‘targetm’ Variable ================================== -- Variable: struct gcc_target targetm The target ‘.c’ file must define the global ‘targetm’ variable which contains pointers to functions and data relating to the target machine. The variable is declared in ‘target.h’; ‘target-def.h’ defines the macro ‘TARGET_INITIALIZER’ which is used to initialize the variable, and macros for the default initializers for elements of the structure. The ‘.c’ file should override those macros for which the default definition is inappropriate. For example: #include "target.h" #include "target-def.h" /* Initialize the GCC target structure. */ #undef TARGET_COMP_TYPE_ATTRIBUTES #define TARGET_COMP_TYPE_ATTRIBUTES MACHINE_comp_type_attributes struct gcc_target targetm = TARGET_INITIALIZER; Where a macro should be defined in the ‘.c’ file in this manner to form part of the ‘targetm’ structure, it is documented below as a "Target Hook" with a prototype. Many macros will change in future from being defined in the ‘.h’ file to being part of the ‘targetm’ structure. Similarly, there is a ‘targetcm’ variable for hooks that are specific to front ends for C-family languages, documented as "C Target Hook". 
This is declared in ‘c-family/c-target.h’, the initializer ‘TARGETCM_INITIALIZER’ in ‘c-family/c-target-def.h’. If targets initialize ‘targetcm’ themselves, they should set ‘target_has_targetcm=yes’ in ‘config.gcc’; otherwise a default definition is used. Similarly, there is a ‘targetm_common’ variable for hooks that are shared between the compiler driver and the compilers proper, documented as "Common Target Hook". This is declared in ‘common/common-target.h’, the initializer ‘TARGETM_COMMON_INITIALIZER’ in ‘common/common-target-def.h’. If targets initialize ‘targetm_common’ themselves, they should set ‘target_has_targetm_common=yes’ in ‘config.gcc’; otherwise a default definition is used. Similarly, there is a ‘targetdm’ variable for hooks that are specific to the D language front end, documented as "D Target Hook". This is declared in ‘d/d-target.h’, the initializer ‘TARGETDM_INITIALIZER’ in ‘d/d-target-def.h’. If targets initialize ‘targetdm’ themselves, they should set ‘target_has_targetdm=yes’ in ‘config.gcc’; otherwise a default definition is used. Similarly, there is a ‘targetrustm’ variable for hooks that are specific to the Rust language front end, documented as "Rust Target Hook". This is declared in ‘rust/rust-target.h’, the initializer ‘TARGETRUSTM_INITIALIZER’ in ‘rust/rust-target-def.h’. If targets initialize ‘targetrustm’ themselves, they should set ‘target_has_targetrustm=yes’ in ‘config.gcc’; otherwise a default definition is used.  File: gccint.info, Node: Driver, Next: Run-time Target, Prev: Target Structure, Up: Target Macros 18.2 Controlling the Compilation Driver, ‘gcc’ ============================================== You can control the compilation driver. -- Macro: DRIVER_SELF_SPECS A list of specs for the driver itself. It should be a suitable initializer for an array of strings, with no surrounding braces. 
The driver applies these specs to its own command line between loading default ‘specs’ files (but not command-line specified ones) and choosing the multilib directory or running any subcommands. It applies them in the order given, so each spec can depend on the options added by earlier ones. It is also possible to remove options using ‘%abi_limb_mode’ ‘CEIL (N, GET_MODE_PRECISION (info->abi_limb_mode))’ limbs, ordered from least significant to most significant if ‘!info->big_endian’, otherwise from most significant to least significant. If ‘info->extended’ is false, the bits above or equal to N are undefined when stored in a register or memory, otherwise they are zero or sign extended depending on if it is ‘unsigned _BitInt(N)’ or one of ‘_BitInt(N)’ or ‘signed _BitInt(N)’. Alignment of the type is ‘GET_MODE_ALIGNMENT (info->limb_mode)’. -- Target Hook: machine_mode TARGET_PROMOTE_FUNCTION_MODE (const_tree TYPE, machine_mode MODE, int *PUNSIGNEDP, const_tree FUNTYPE, int FOR_RETURN) Like ‘PROMOTE_MODE’, but it is applied to outgoing function arguments or function return values. The target hook should return the new mode and possibly change ‘*PUNSIGNEDP’ if the promotion should change signedness. This function is called only for scalar _or pointer_ types. FOR_RETURN allows to distinguish the promotion of arguments and return values. If it is ‘1’, a return value is being promoted and ‘TARGET_FUNCTION_VALUE’ must perform the same promotions done here. If it is ‘2’, the returned mode should be that of the register in which an incoming parameter is copied, or the outgoing result is computed; then the hook should return the same mode as ‘promote_mode’, though the signedness may be different. TYPE can be NULL when promoting function arguments of libcalls. The default is to not promote arguments and return values. You can also define the hook to ‘default_promote_function_mode_always_promote’ if you would like to apply the same rules given by ‘PROMOTE_MODE’. 
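The limb count ‘CEIL (N, GET_MODE_PRECISION (info->abi_limb_mode))’ mentioned above is a ceiling division of the ‘_BitInt’ width by the limb width. The following standalone C fragment is only an illustration of that arithmetic and is not GCC code; the helper name ‘bitint_limb_count’ and the 64-bit limb width used in the note below are assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical helper (not part of GCC): number of limbs needed to
   hold an N-bit _BitInt when each limb is LIMB_BITS wide.  This is a
   ceiling division of N by LIMB_BITS.  */
unsigned int
bitint_limb_count (unsigned int n, unsigned int limb_bits)
{
  assert (limb_bits != 0);
  return (n + limb_bits - 1) / limb_bits;
}
```

Assuming 64-bit limbs, a ‘_BitInt(129)’ would occupy three limbs under this formula, and a ‘_BitInt(64)’ exactly one.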
-- Macro: PARM_BOUNDARY Normal alignment required for function parameters on the stack, in bits. All stack parameters receive at least this much alignment regardless of data type. On most machines, this is the same as the size of an integer. -- Macro: STACK_BOUNDARY Define this macro to the minimum alignment enforced by hardware for the stack pointer on this machine. The definition is a C expression for the desired alignment (measured in bits). This value is used as a default if ‘PREFERRED_STACK_BOUNDARY’ is not defined. On most machines, this should be the same as ‘PARM_BOUNDARY’. -- Macro: PREFERRED_STACK_BOUNDARY Define this macro if you wish to preserve a certain alignment for the stack pointer, greater than what the hardware enforces. The definition is a C expression for the desired alignment (measured in bits). This macro must evaluate to a value equal to or larger than ‘STACK_BOUNDARY’. -- Macro: INCOMING_STACK_BOUNDARY Define this macro if the incoming stack boundary may be different from ‘PREFERRED_STACK_BOUNDARY’. This macro must evaluate to a value equal to or larger than ‘STACK_BOUNDARY’. -- Macro: FUNCTION_BOUNDARY Alignment required for a function entry point, in bits. -- Macro: BIGGEST_ALIGNMENT Biggest alignment that any data type can require on this machine, in bits. Note that this is not the biggest alignment that is supported, just the biggest alignment that, when violated, may cause a fault. -- Target Hook: HOST_WIDE_INT TARGET_ABSOLUTE_BIGGEST_ALIGNMENT If defined, this target hook specifies the absolute biggest alignment that a type or variable can have on this machine, otherwise, ‘BIGGEST_ALIGNMENT’ is used. -- Macro: MALLOC_ABI_ALIGNMENT Alignment, in bits, a C conformant malloc implementation has to provide. If not defined, the default value is ‘BITS_PER_WORD’. -- Macro: ATTRIBUTE_ALIGNED_VALUE Alignment used by the ‘__attribute__ ((aligned))’ construct. If not defined, the default value is ‘BIGGEST_ALIGNMENT’. 
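The relationships among the stack boundary macros described above can be summarized as a small consistency check. The following is a sketch in standalone C rather than code from GCC; the function name ‘stack_boundaries_consistent’ and any sample values are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check (not part of GCC) of the constraints stated for
   the stack boundary macros.  All values are alignments in bits.  */
bool
stack_boundaries_consistent (unsigned int stack_boundary,
                             unsigned int preferred_stack_boundary,
                             unsigned int incoming_stack_boundary)
{
  assert (stack_boundary != 0);

  /* PREFERRED_STACK_BOUNDARY must be equal to or larger than
     STACK_BOUNDARY.  */
  if (preferred_stack_boundary < stack_boundary)
    return false;

  /* INCOMING_STACK_BOUNDARY must be equal to or larger than
     STACK_BOUNDARY.  */
  if (incoming_stack_boundary < stack_boundary)
    return false;

  return true;
}
```

For instance, a hardware boundary of 64 bits with a preferred boundary of 128 bits passes this check, while a preferred boundary smaller than the hardware boundary does not.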
-- Macro: MINIMUM_ATOMIC_ALIGNMENT If defined, the smallest alignment, in bits, that can be given to an object that can be referenced in one operation, without disturbing any nearby object. Normally, this is ‘BITS_PER_UNIT’, but may be larger on machines that don't have byte or half-word store operations. -- Macro: BIGGEST_FIELD_ALIGNMENT Biggest alignment that any structure or union field can require on this machine, in bits. If defined, this overrides ‘BIGGEST_ALIGNMENT’ for structure and union fields only, unless the field alignment has been set by the ‘__attribute__ ((aligned (N)))’ construct. -- Macro: ADJUST_FIELD_ALIGN (FIELD, TYPE, COMPUTED) An expression for the alignment of a structure field FIELD of type TYPE if the alignment computed in the usual way (including applying of ‘BIGGEST_ALIGNMENT’ and ‘BIGGEST_FIELD_ALIGNMENT’ to the alignment) is COMPUTED. It overrides alignment only if the field alignment has not been set by the ‘__attribute__ ((aligned (N)))’ construct. Note that FIELD may be ‘NULL_TREE’ in case we just query for the minimum alignment of a field of type TYPE in structure context. -- Macro: MAX_STACK_ALIGNMENT Biggest stack alignment guaranteed by the backend. Use this macro to specify the maximum alignment of a variable on stack. If not defined, the default value is ‘STACK_BOUNDARY’. -- Macro: MAX_OFILE_ALIGNMENT Biggest alignment supported by the object file format of this machine. Use this macro to limit the alignment which can be specified using the ‘__attribute__ ((aligned (N)))’ construct for functions and objects with static storage duration. The alignment of automatic objects may exceed the object file format maximum up to the maximum supported by GCC. If not defined, the default value is ‘BIGGEST_ALIGNMENT’. On systems that use ELF, the default (in ‘config/elfos.h’) is the largest supported 32-bit ELF section alignment representable on a 32-bit host e.g. ‘(((uint64_t) 1 << 28) * 8)’. 
On 32-bit ELF the largest supported section alignment in bits is ‘(0x80000000 * 8)’, but this is not representable on 32-bit hosts. -- Target Hook: void TARGET_LOWER_LOCAL_DECL_ALIGNMENT (tree DECL) Define this hook to lower alignment of local, parm or result decl ‘(DECL)’. -- Target Hook: HOST_WIDE_INT TARGET_STATIC_RTX_ALIGNMENT (machine_mode MODE) This hook returns the preferred alignment in bits for a statically-allocated rtx, such as a constant pool entry. MODE is the mode of the rtx. The default implementation returns ‘GET_MODE_ALIGNMENT (MODE)’. -- Macro: DATA_ALIGNMENT (TYPE, BASIC-ALIGN) If defined, a C expression to compute the alignment for a variable in the static store. TYPE is the data type, and BASIC-ALIGN is the alignment that the object would ordinarily have. The value of this macro is used instead of that alignment to align the object. If this macro is not defined, then BASIC-ALIGN is used. One use of this macro is to increase alignment of medium-size data to make it all fit in fewer cache lines. Another is to cause character arrays to be word-aligned so that ‘strcpy’ calls that copy constants to character arrays can be done inline. -- Macro: DATA_ABI_ALIGNMENT (TYPE, BASIC-ALIGN) Similar to ‘DATA_ALIGNMENT’, but for the cases where the ABI mandates some alignment increase, instead of optimization only purposes. E.g. AMD x86-64 psABI says that variables with array type larger than 15 bytes must be aligned to 16 byte boundaries. If this macro is not defined, then BASIC-ALIGN is used. -- Target Hook: HOST_WIDE_INT TARGET_CONSTANT_ALIGNMENT (const_tree CONSTANT, HOST_WIDE_INT BASIC_ALIGN) This hook returns the alignment in bits of a constant that is being placed in memory. CONSTANT is the constant and BASIC_ALIGN is the alignment that the object would ordinarily have. The default definition just returns BASIC_ALIGN. 
The typical use of this hook is to increase alignment for string constants to be word aligned so that ‘strcpy’ calls that copy constants can be done inline. The function ‘constant_alignment_word_strings’ provides such a definition. -- Macro: LOCAL_ALIGNMENT (TYPE, BASIC-ALIGN) If defined, a C expression to compute the alignment for a variable in the local store. TYPE is the data type, and BASIC-ALIGN is the alignment that the object would ordinarily have. The value of this macro is used instead of that alignment to align the object. If this macro is not defined, then BASIC-ALIGN is used. One use of this macro is to increase alignment of medium-size data to make it all fit in fewer cache lines. If the value of this macro has a type, it should be an unsigned type. -- Target Hook: HOST_WIDE_INT TARGET_VECTOR_ALIGNMENT (const_tree TYPE) This hook can be used to define the alignment for a vector of type TYPE, in order to comply with a platform ABI. The default is to require natural alignment for vector types. The alignment returned by this hook must be a power-of-two multiple of the default alignment of the vector element type. -- Macro: STACK_SLOT_ALIGNMENT (TYPE, MODE, BASIC-ALIGN) If defined, a C expression to compute the alignment for stack slot. TYPE is the data type, MODE is the widest mode available, and BASIC-ALIGN is the alignment that the slot would ordinarily have. The value of this macro is used instead of that alignment to align the slot. If this macro is not defined, then BASIC-ALIGN is used when TYPE is ‘NULL’. Otherwise, ‘LOCAL_ALIGNMENT’ will be used. This macro is to set alignment of stack slot to the maximum alignment of all possible modes which the slot may have. If the value of this macro has a type, it should be an unsigned type. -- Macro: LOCAL_DECL_ALIGNMENT (DECL) If defined, a C expression to compute the alignment for a local variable DECL. If this macro is not defined, then ‘LOCAL_ALIGNMENT (TREE_TYPE (DECL), DECL_ALIGN (DECL))’ is used. 
One use of this macro is to increase alignment of medium-size data to make it all fit in fewer cache lines. If the value of this macro has a type, it should be an unsigned type. -- Macro: MINIMUM_ALIGNMENT (EXP, MODE, ALIGN) If defined, a C expression to compute the minimum required alignment for dynamic stack realignment purposes for EXP (a type or decl), MODE, assuming normal alignment ALIGN. If this macro is not defined, then ALIGN will be used. -- Macro: EMPTY_FIELD_BOUNDARY Alignment in bits to be given to a structure bit-field that follows an empty field such as ‘int : 0;’. If ‘PCC_BITFIELD_TYPE_MATTERS’ is true, it overrides this macro. -- Macro: STRUCTURE_SIZE_BOUNDARY Number of bits which any structure or union's size must be a multiple of. Each structure or union's size is rounded up to a multiple of this. If you do not define this macro, the default is the same as ‘BITS_PER_UNIT’. -- Macro: STRICT_ALIGNMENT Define this macro to be the value 1 if instructions will fail to work if given data not on the nominal alignment. If instructions will merely go slower in that case, define this macro as 0. -- Macro: PCC_BITFIELD_TYPE_MATTERS Define this if you wish to imitate the way many other C compilers handle alignment of bit-fields and the structures that contain them. The behavior is that the type written for a named bit-field (‘int’, ‘short’, or other integer type) imposes an alignment for the entire structure, as if the structure really did contain an ordinary field of that type. In addition, the bit-field is placed within the structure so that it would fit within such a field, not crossing a boundary for it. Thus, on most machines, a named bit-field whose type is written as ‘int’ would not cross a four-byte boundary, and would force four-byte alignment for the whole structure. (The alignment used may not be four bytes; it is controlled by the other alignment parameters.) An unnamed bit-field will not affect the alignment of the containing structure. 
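The placement rule just described, where a named bit-field does not cross a boundary of its declared type, can be modelled in a few lines of standalone C. This is a simplified sketch and not GCC's layout code; the function name ‘place_bitfield’ and its parameters are hypothetical, and it assumes a nonzero field width that does not exceed the width of the declared type.

```c
#include <assert.h>

/* Hypothetical model (not part of GCC): return the bit offset at
   which a bit-field of WIDTH bits is placed, given the next free bit
   offset OFFSET and a declared type that is TYPE_BITS wide.  If the
   field would cross a TYPE_BITS boundary, it is moved up to the next
   such boundary.  */
unsigned int
place_bitfield (unsigned int offset, unsigned int width,
                unsigned int type_bits)
{
  assert (type_bits != 0 && width != 0 && width <= type_bits);

  unsigned int first_unit = offset / type_bits;
  unsigned int last_unit = (offset + width - 1) / type_bits;

  if (first_unit != last_unit)
    /* The field would cross a boundary: start at the next one.  */
    return (first_unit + 1) * type_bits;

  return offset;
}
```

With a 32-bit declared type, a 4-bit field whose next free offset is bit 30 would start at bit 32 in this model, whereas one at bit 28 stays where it is.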
If the macro is defined, its definition should be a C expression; a nonzero value for the expression enables this behavior. Note that if this macro is not defined, or its value is zero, some bit-fields may cross more than one alignment boundary. The compiler can support such references if there are ‘insv’, ‘extv’, and ‘extzv’ insns that can directly reference memory. The other known way of making bit-fields work is to define ‘STRUCTURE_SIZE_BOUNDARY’ as large as ‘BIGGEST_ALIGNMENT’. Then every structure can be accessed with full words. Unless the machine has bit-field instructions or you define ‘STRUCTURE_SIZE_BOUNDARY’ that way, you must define ‘PCC_BITFIELD_TYPE_MATTERS’ to have a nonzero value. If your aim is to make GCC use the same conventions for laying out bit-fields as are used by another compiler, here is how to investigate what the other compiler does. Compile and run this program:

     #include <stdio.h>

     struct foo1 {
       char x;
       char :0;
       char y;
     };

     struct foo2 {
       char x;
       int :0;
       char y;
     };

     int
     main (void)
     {
       /* sizeof yields a size_t, so print it with %zu.  */
       printf ("Size of foo1 is %zu\n", sizeof (struct foo1));
       printf ("Size of foo2 is %zu\n", sizeof (struct foo2));
       return 0;
     }

If this prints 2 and 5, then the compiler's behavior is what you would get from ‘PCC_BITFIELD_TYPE_MATTERS’. -- Macro: BITFIELD_NBYTES_LIMITED Like ‘PCC_BITFIELD_TYPE_MATTERS’ except that its effect is limited to aligning a bit-field within the structure. -- Target Hook: bool TARGET_ALIGN_ANON_BITFIELD (void) When ‘PCC_BITFIELD_TYPE_MATTERS’ is true, this hook will determine whether unnamed bit-fields affect the alignment of the containing structure. The hook should return true if the structure should inherit the alignment requirements of an unnamed bit-field's type. -- Target Hook: bool TARGET_NARROW_VOLATILE_BITFIELD (void) This target hook should return ‘true’ if accesses to volatile bit-fields should use the narrowest mode possible. It should return ‘false’ if these accesses should use the bit-field container type. The default is ‘false’.
-- Target Hook: bool TARGET_MEMBER_TYPE_FORCES_BLK (const_tree FIELD, machine_mode MODE) Return true if a structure, union or array containing FIELD should be accessed using ‘BLKmode’. If FIELD is the only field in the structure, MODE is its mode; otherwise MODE is ‘VOIDmode’. MODE is provided for the case where a structure with a single field would otherwise take on that field's mode. Normally, this is not needed. -- Macro: ROUND_TYPE_ALIGN (TYPE, COMPUTED, SPECIFIED) Define this macro as an expression for the alignment of a type (given by TYPE as a tree node) if the alignment computed in the usual way is COMPUTED and the alignment explicitly specified was SPECIFIED. The default is to use SPECIFIED if it is larger; otherwise, use the smaller of COMPUTED and ‘BIGGEST_ALIGNMENT’. -- Macro: MAX_FIXED_MODE_SIZE An integer expression for the size in bits of the largest integer machine mode that should actually be used. All integer machine modes of this size or smaller can be used for structures and unions with the appropriate sizes. If this macro is undefined, ‘GET_MODE_BITSIZE (DImode)’ is assumed. -- Macro: STACK_SAVEAREA_MODE (SAVE_LEVEL) If defined, an expression of type ‘machine_mode’ that specifies the mode of the save area operand of a ‘save_stack_LEVEL’ named pattern (*note Standard Names::). SAVE_LEVEL is one of ‘SAVE_BLOCK’, ‘SAVE_FUNCTION’, or ‘SAVE_NONLOCAL’ and selects which of the three named patterns is having its mode specified. You need not define this macro if it always returns ‘Pmode’. You would most commonly define this macro if the ‘save_stack_LEVEL’ patterns need to support both a 32- and a 64-bit mode. -- Macro: STACK_SIZE_MODE If defined, an expression of type ‘machine_mode’ that specifies the mode of the size increment operand of an ‘allocate_stack’ named pattern (*note Standard Names::). You need not define this macro if it always returns ‘word_mode’.
You would most commonly define this macro if the ‘allocate_stack’ pattern needs to support both a 32- and a 64-bit mode. -- Target Hook: scalar_int_mode TARGET_LIBGCC_CMP_RETURN_MODE (void) This target hook should return the mode to be used for the return value of compare instructions expanded to libgcc calls. If not defined, ‘word_mode’ is returned, which is the right choice for the majority of targets. -- Target Hook: scalar_int_mode TARGET_LIBGCC_SHIFT_COUNT_MODE (void) This target hook should return the mode to be used for the shift count operand of shift instructions expanded to libgcc calls. If not defined, ‘word_mode’ is returned, which is the right choice for the majority of targets. -- Target Hook: scalar_int_mode TARGET_UNWIND_WORD_MODE (void) Return the machine mode to be used for the ‘_Unwind_Word’ type. The default is to use ‘word_mode’. -- Target Hook: bool TARGET_MS_BITFIELD_LAYOUT_P (const_tree RECORD_TYPE) This target hook returns ‘true’ if bit-fields in the given RECORD_TYPE are to be laid out following the rules of Microsoft Visual C/C++, namely: (i) a bit-field won't share the same storage unit with the previous bit-field if their underlying types have different sizes, and the bit-field will be aligned to the highest alignment of the underlying types of itself and of the previous bit-field; (ii) a zero-sized bit-field will affect the alignment of the whole enclosing structure, even if it is unnamed; except that (iii) a zero-sized bit-field will be disregarded unless it follows another bit-field of nonzero size. If this hook returns ‘true’, other macros that control bit-field layout are ignored. When a bit-field is inserted into a packed record, the whole size of the underlying type is used by one or more same-size adjacent bit-fields (that is, if it is ‘long:3’, 32 bits are used in the record, and any additional adjacent ‘long’ bit-fields are packed into the same chunk of 32 bits; however, if the size changes, a new field of that size is allocated).
In an unpacked record, this is the same as using alignment, but not equivalent when packing. If both MS bit-fields and ‘__attribute__((packed))’ are used, the latter will take precedence. If ‘__attribute__((packed))’ is used on a single field when MS bit-fields are in use, it will take precedence for that field, but the alignment of the rest of the structure may affect its placement. -- Target Hook: bool TARGET_DECIMAL_FLOAT_SUPPORTED_P (void) Returns true if the target supports decimal floating point. -- Target Hook: bool TARGET_FIXED_POINT_SUPPORTED_P (void) Returns true if the target supports fixed-point arithmetic. -- Target Hook: void TARGET_EXPAND_TO_RTL_HOOK (void) This hook is called just before expansion into rtl, allowing the target to perform additional initializations or analysis before the expansion. For example, the rs6000 port uses it to allocate a scratch stack slot for use in copying SDmode values between memory and floating point registers whenever the function being expanded has any SDmode usage. -- Target Hook: void TARGET_INSTANTIATE_DECLS (void) This hook allows the backend to perform additional instantiations on rtl that are not actually in any insns yet, but will be later. -- Target Hook: const char * TARGET_MANGLE_TYPE (const_tree TYPE) If your target defines any fundamental types, or any types your target uses should be mangled differently from the default, define this hook to return the appropriate encoding for these types as part of a C++ mangled name. The TYPE argument is the tree structure representing the type to be mangled. The hook may be applied to trees which are not target-specific fundamental types; it should return ‘NULL’ for all such types, as well as arguments it does not recognize. If the return value is not ‘NULL’, it must point to a statically-allocated string constant. Target-specific fundamental types might be new fundamental types or qualified versions of ordinary fundamental types. 
Encode new fundamental types as ‘u N NAME’, where NAME is the name used for the type in source code, and N is the length of NAME in decimal. Encode qualified versions of ordinary types as ‘U N NAME CODE’, where NAME is the name used for the type qualifier in source code, N is the length of NAME as above, and CODE is the code used to represent the unqualified version of this type. (See ‘write_builtin_type’ in ‘cp/mangle.cc’ for the list of codes.) In both cases the spaces are for clarity; do not include any spaces in your string. This hook is applied to types prior to typedef resolution. If the mangled name for a particular type depends only on that type's main variant, you can perform typedef resolution yourself using ‘TYPE_MAIN_VARIANT’ before mangling. The default version of this hook always returns ‘NULL’, which is appropriate for a target that does not define any new fundamental types. -- Target Hook: void TARGET_EMIT_SUPPORT_TINFOS (emit_support_tinfos_callback CALLBACK) If your target defines any fundamental types which depend on ISA flags, they might need C++ tinfo symbols in libsupc++/libstdc++ regardless of the ISA flags the library is compiled with. This hook allows creating tinfo symbols even in those cases, by temporarily creating the corresponding fundamental type tree for each such type, calling the CALLBACK function on it, and setting the type back to ‘nullptr’.
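The ‘u N NAME’ encoding for new fundamental types described under ‘TARGET_MANGLE_TYPE’ can be illustrated with a short standalone helper. This is a sketch only and not part of GCC; the function name ‘encode_vendor_type’ and the type name ‘__myfloat’ are made up for illustration. Note that a real hook must return a statically-allocated string constant rather than fill a caller-supplied buffer as done here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): write the encoding
   "u<N><NAME>" for a new fundamental type into BUF, where <N> is the
   length of NAME in decimal.  No spaces are emitted.  Returns 0 on
   success, or -1 if BUF is too small to hold the result.  */
int
encode_vendor_type (const char *name, char *buf, size_t buf_size)
{
  assert (name != NULL && buf != NULL);

  int written = snprintf (buf, buf_size, "u%zu%s", strlen (name), name);
  if (written < 0 || (size_t) written >= buf_size)
    return -1;
  return 0;
}
```

For the made-up name ‘__myfloat’, which is nine characters long, this produces ‘u9__myfloat’.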