| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
This adds the relocation type and other necessary infrastructure
to use the @toc@h modifier in the assembler.
llvm-svn: 184549
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds necessary infrastructure to support the @h modifier.
Note that all required relocation types were already present
(and unused).
This patch provides support for using @h in the assembler;
it would also be possible to now use this feature in code
generated by the compiler, but this is not done yet.
llvm-svn: 184548
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This renames more VK_PPC_ enums, to make them more closely reflect
the @modifier string they represent. This also prepares for adding
a bunch of new VK_PPC_ enums in upcoming patches.
For consistency, some MO_ flags related to VK_PPC_ enums are
likewise renamed.
No change in behaviour.
llvm-svn: 184547
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is another minor cleanup; to bring enum names in line
with the corresponding @modifier names, this renames:
VK_PPC_TOC -> VK_PPC_TOCBASE
VK_PPC_TOC_ENTRY -> VK_PPC_TOC16
No code change intended.
llvm-svn: 184491
|
|
|
|
|
|
|
|
|
| |
This just re-sorts the big switch statement in
PPCELFObjectWriter::getRelocTypeInner to follow
the (numerical) order of the reloc types, and
fixes a couple of whitespace issues.
llvm-svn: 184485
|
|
|
|
|
|
|
| |
The isDarwin parameter to the llvm::LowerPPCMachineInstrToMCInst
routine is now no longer needed; remove it.
llvm-svn: 184441
|
|
|
|
|
|
|
| |
This (hopefully) fixes build failures resulting from r184436;
the PowerPC asm parser now depends on PowerPC target expresssions.
llvm-svn: 184439
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for having the assembler optimize fixups
to constructs like "symbol@ha" or "symbol@l" if "symbol" can be
resolved at assembler time.
This optimization is already present in the PPCMCExpr.cpp code
for handling PPC_HA16/PPC_LO16 target expressions. However,
those target expression were used only on Darwin targets.
This patch changes target expression code so that they are
usable also with the GNU assembler (using the @ha / @l syntax
instead of the ha16() / lo16() syntax), and changes the
MCInst lowering code to generate those target expressions
where appropriate.
It also changes the asm parser to generate HA16/LO16 target
expressions when parsing assembler source that uses the
@ha / @l modifiers. The effect is that now the above-
mentioned optimization automatically becomes available
for those situations too.
llvm-svn: 184436
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just like for branch mnemonics (where support was recently added), the
assembler is supposed to support extended mnemonics for the compare
instructions where no condition register is specified explicitly
(and CR0 is assumed implicitly).
This patch adds support for those extended compare mnemonics.
Index: llvm-head/test/MC/PowerPC/ppc64-encoding-ext.s
===================================================================
--- llvm-head.orig/test/MC/PowerPC/ppc64-encoding-ext.s
+++ llvm-head/test/MC/PowerPC/ppc64-encoding-ext.s
@@ -449,21 +449,37 @@
# CHECK: cmpdi 2, 3, 128 # encoding: [0x2d,0x23,0x00,0x80]
cmpdi 2, 3, 128
+# CHECK: cmpdi 0, 3, 128 # encoding: [0x2c,0x23,0x00,0x80]
+ cmpdi 3, 128
# CHECK: cmpd 2, 3, 4 # encoding: [0x7d,0x23,0x20,0x00]
cmpd 2, 3, 4
+# CHECK: cmpd 0, 3, 4 # encoding: [0x7c,0x23,0x20,0x00]
+ cmpd 3, 4
# CHECK: cmpldi 2, 3, 128 # encoding: [0x29,0x23,0x00,0x80]
cmpldi 2, 3, 128
+# CHECK: cmpldi 0, 3, 128 # encoding: [0x28,0x23,0x00,0x80]
+ cmpldi 3, 128
# CHECK: cmpld 2, 3, 4 # encoding: [0x7d,0x23,0x20,0x40]
cmpld 2, 3, 4
+# CHECK: cmpld 0, 3, 4 # encoding: [0x7c,0x23,0x20,0x40]
+ cmpld 3, 4
# CHECK: cmpwi 2, 3, 128 # encoding: [0x2d,0x03,0x00,0x80]
cmpwi 2, 3, 128
+# CHECK: cmpwi 0, 3, 128 # encoding: [0x2c,0x03,0x00,0x80]
+ cmpwi 3, 128
# CHECK: cmpw 2, 3, 4 # encoding: [0x7d,0x03,0x20,0x00]
cmpw 2, 3, 4
+# CHECK: cmpw 0, 3, 4 # encoding: [0x7c,0x03,0x20,0x00]
+ cmpw 3, 4
# CHECK: cmplwi 2, 3, 128 # encoding: [0x29,0x03,0x00,0x80]
cmplwi 2, 3, 128
+# CHECK: cmplwi 0, 3, 128 # encoding: [0x28,0x03,0x00,0x80]
+ cmplwi 3, 128
# CHECK: cmplw 2, 3, 4 # encoding: [0x7d,0x03,0x20,0x40]
cmplw 2, 3, 4
+# CHECK: cmplw 0, 3, 4 # encoding: [0x7c,0x03,0x20,0x40]
+ cmplw 3, 4
# FIXME: Trap mnemonics
Index: llvm-head/lib/Target/PowerPC/PPCInstrInfo.td
===================================================================
--- llvm-head.orig/lib/Target/PowerPC/PPCInstrInfo.td
+++ llvm-head/lib/Target/PowerPC/PPCInstrInfo.td
@@ -2201,3 +2201,12 @@ defm : BranchExtendedMnemonic<"ne", 68>;
defm : BranchExtendedMnemonic<"nu", 100>;
defm : BranchExtendedMnemonic<"ns", 100>;
+def : InstAlias<"cmpwi $rA, $imm", (CMPWI CR0, gprc:$rA, s16imm:$imm)>;
+def : InstAlias<"cmpw $rA, $rB", (CMPW CR0, gprc:$rA, gprc:$rB)>;
+def : InstAlias<"cmplwi $rA, $imm", (CMPLWI CR0, gprc:$rA, u16imm:$imm)>;
+def : InstAlias<"cmplw $rA, $rB", (CMPLW CR0, gprc:$rA, gprc:$rB)>;
+def : InstAlias<"cmpdi $rA, $imm", (CMPDI CR0, g8rc:$rA, s16imm:$imm)>;
+def : InstAlias<"cmpd $rA, $rB", (CMPD CR0, g8rc:$rA, g8rc:$rB)>;
+def : InstAlias<"cmpldi $rA, $imm", (CMPLDI CR0, g8rc:$rA, u16imm:$imm)>;
+def : InstAlias<"cmpld $rA, $rB", (CMPLD CR0, g8rc:$rA, g8rc:$rB)>;
+
llvm-svn: 184435
|
|
|
|
|
|
| |
caching it. The TLI may change between functions. No functionality change.
llvm-svn: 184349
|
|
|
|
|
|
|
|
|
| |
Someone may want to do something crazy, like replace these objects if they
change or something.
No functionality change intended.
llvm-svn: 184175
|
|
|
|
|
|
|
|
|
|
| |
MachineInstrs
Frame index handling is now target-agnostic, so delete the target hooks
for creation & asm printing of target-specific addressing in DBG_VALUEs
and any related functions.
llvm-svn: 184067
|
|
|
|
|
|
|
| |
Now that the PRED_BAD has been removed, this is failing the Clang
-Werror build due to -Wcovered-switch-default.
llvm-svn: 183863
|
|
|
|
|
|
|
|
| |
I'm taking David Blaikie's suggestion to use an
Optional<PPC::Predicate> return value instead. That's the right
solution for this problem. Thanks for pointing out that possibility!
llvm-svn: 183858
|
|
|
|
|
|
|
| |
Introducing PRED_BAD caused some unexpected warnings that are now
suppressed.
llvm-svn: 183854
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a preparatory patch for fast-isel support. The instruction
selector will need to access some functions in PPCGenCallingConv.inc,
which in turn requires several helper functions to be defined. These
are currently defined near the only use of PCCGenCallingConv.inc,
inside PPCISelLowering.cpp. This patch moves the declaration of the
functions into the associated header file to provide the needed
visibility.
No functional change intended.
llvm-svn: 183844
|
|
|
|
|
|
|
|
| |
Allows returning a PPC::Predicate from a function with a no-predicate
value possible. Preparatory patch for fast-isel on PPC64 ELF. No
behavioral change intended.
llvm-svn: 183841
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've been comparing the object file output of LLVM's integrated
assembler against the external assembler on PowerPC, and one
area where differences still remain are in DWARF sections.
In particular, the GNU assembler generates .debug_frame and
.debug_line sections using a code alignment factor of 4, since
all PowerPC instructions have size 4 and must be aligned to a
multiple of 4. However, current MC code hard-codes a code
alignment factor of 1.
This patch changes this by adding a "minimum instruction alignment"
data element to MCAsmInfo and using this as code alignment factor.
This requires passing a MCContext into MCDwarfLineAddr::Encode
and MCDwarfLineAddr::EncodeAdvanceLoc. Note that one caller,
MCDwarfLineAddr::Write, didn't actually have that information
available. However, it turns out that this routine is in fact
never used in the whole code base, so the patch simply removes
it. If it turns out to be needed again at a later time, it
could be re-added with an updated interface.
llvm-svn: 183834
|
|
|
|
|
|
|
|
|
| |
A plain "sc" without argument is supposed to be treated like "sc 0"
by the assembler. This patch adds a corresponding alias.
Problem reported by Joerg Sonnenberger.
llvm-svn: 183687
|
|
|
|
|
|
|
|
|
|
| |
The extended branch mnemonics are supposed to use an implied CR0
if there is no explicit condition register specified. This patch
adds extra variants of the mnemonics to this effect.
Problem reported by Joerg Sonnenberger.
llvm-svn: 183686
|
|
|
|
|
|
|
|
|
| |
This patch removes some redundancy by generating the extended branch
mnemonics via a multiclass.
No change in behaviour expected.
llvm-svn: 183685
|
|
|
|
|
|
|
|
| |
On PPC32, [su]div,rem on i64 types are transformed into runtime library
function calls. As a result, they are not allowed in counter-based loops (the
counter-loops verification pass caught this error; this change fixes PR16169).
llvm-svn: 183581
|
|
|
|
|
|
| |
Avoids unused variable warnings in Release builds.
llvm-svn: 183512
|
|
|
|
|
|
|
|
| |
the internals of TargetMachine could change.
No functionality change intended.
llvm-svn: 183494
|
|
|
|
|
|
|
| |
This also makes TableGen able to compute sizes/offsets of synthesized
indices representing tuples.
llvm-svn: 183061
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes PR16146: gdb.base__call-ar-st.exp fails after
pre-RA-sched=source fixes.
Patch by Xiaoyi Guo!
This also fixes an unsupported dbg.value test case. Codegen was
previously incorrect but the test was passing by luck.
llvm-svn: 182885
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
isConsecutiveLS is a slightly more general form of
SelectionDAG::isConsecutiveLoad. Aside from also handling stores, it also does
not assume equality of the chain operands is necessary. In the case of the PPC
backend, this chain condition is checked in a more general way by the
surrounding code.
Mostly, this part of the refactoring in preparation for supporting optimized
unaligned stores.
llvm-svn: 182723
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When expanding unaligned Altivec loads, we use the decremented offset trick to
prevent page faults. Unfortunately, if we have a sequence of consecutive
unaligned loads, this leads to suboptimal code generation because the 'extra'
load from the first unaligned load can be combined with the base load from the
second (but only if the decremented offset trick is not used for the first).
Search up and down the chain, through loads and token factors, looking for
consecutive loads, and if one is found, don't use the offset reduction trick.
These duplicate loads are later combined to yield the desired sequence (in the
future, we might want a more-powerful chain search, but that will require some
changes to allow the combiner routines to access the AA object).
This should complete the initial implementation of the optimized unaligned
Altivec load expansion. There is some refactoring that should be done, but
that will happen when the unaligned store expansion is added.
llvm-svn: 182719
|
|
|
|
|
|
|
|
|
| |
The lvsl permutation control instruction is a function only of the alignment of
the pointer operand (relative to the 16-byte natural alignment of Altivec
vectors). As a result, multiple lvsl intrinsics where the operands differ by a
multiple of 16 can be combined.
llvm-svn: 182708
|
|
|
|
|
|
|
| |
Change SelectionDAG::getXXXNode() interfaces as well as call sites of
these functions to pass in SDLoc instead of DebugLoc.
llvm-svn: 182703
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Altivec only directly supports aligned loads, but the loads have a strange
property: If given an unaligned address, they truncate the address to the next
lower aligned address, and load from there. This property, along with an extra
load and some special-purpose permutation-control instructions that generate
the appropriate permutations from the original unaligned address, allow
efficient lowering of aligned loads. This code uses the trick explained in the
Apple Velocity Engine optimization overview document to prevent the needed
extra load from possibly causing a page fault if the original address happens
to be aligned.
As noted in the FIXMEs, there are several additional optimizations that can be
performed to reduce the cost of these loads even more. These will be
implemented in future commits.
llvm-svn: 182691
|
|
|
|
| |
llvm-svn: 182680
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that there is no longer any distinction between symbolLo
and symbolHi operands in either printing, encoding, or parsing,
the operand types can be removed in favor of simply using
s16imm.
This completes the patch series to decouple lo/hi operand part
processing from the particular instruction whose operand it is.
No change in code generation expected from this patch.
llvm-svn: 182618
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When targeting the Darwin assembler, we need to generate markers ha16() and
lo16() to designate the high and low parts of a (symbolic) immediate. This
is necessary not just for plain symbols, but also for certain symbolic
expression, typically along the lines of ha16(A - B). The latter doesn't
work when simply using VariantKind flags on the symbol reference.
This is why the current back-end uses hacks (explicitly called out as such
via multiple FIXMEs) in the symbolLo/symbolHi print methods.
This patch uses target-defined MCExpr codes to represent the Darwin
ha16/lo16 constructs, following along the lines of the equivalent solution
used by the ARM back end to handle their :upper16: / :lower16: markers.
This allows us to get rid of special handling both in the symbolLo/symbolHi
print method and in the common code MCExpr::print routine. Instead, the
ha16 / lo16 markers are printed simply in a custom print routine for the
target MCExpr types. (As a result, the symbolLo/symbolHi print methods
can now replaced by a single printS16ImmOperand routine that also handles
symbolic operands.)
The patch also provides a EvaluateAsRelocatableImpl routine to handle
ha16/lo16 constructs. This is not actually used at the moment by any
in-tree code, but is provided as it makes merging into David Fang's
out-of-tree Mach-O object writer simpler.
Since there is no longer any need to treat VK_PPC_GAS_HA16 and
VK_PPC_DARWIN_HA16 differently, they are merged into a single
VK_PPC_ADDR16_HA (and likewise for the _LO16 types).
llvm-svn: 182616
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using PatLeaf rather than ImmLeaf when defining immediate predicates
prevents simple patterns using those predicates from being recognized
for fast instruction selection. This patch replaces the immSExt16
PatLeaf predicate with two ImmLeaf predicates, imm32SExt16 and
imm64SExt16, allowing a few more patterns to be recognized (ADDI,
ADDIC, MULLI, ADDI8, and ADDIC8). Using the new predicates does not
help for LI, LI8, SUBFIC, and SUBFIC8 because these are rejected for
other reasons, but I see no reason to retain the PatLeaf predicate.
No functional change intended, and thus no test cases yet. This is
preliminary work for enabling fast-isel support for PowerPC. When
that support is ready, we'll be able to test this function.
llvm-svn: 182510
|
|
|
|
|
|
|
|
|
|
|
|
| |
Although I had added some support for the BDZ/BDNZ branches into the selector
(in r158204), I had not correctly adjusted the condition at the top of the
loop. As a result, these branches were still essentially unsupported.
This fixes PR16086. Unfortunately, any test case would be very large (because
it would need to force the loop backedge to exceed the range of the 16-bit
immediate).
llvm-svn: 182385
|
|
|
|
|
|
| |
As discussed, LoopUtils.h is a better name.
llvm-svn: 182314
|
|
|
|
|
|
|
|
|
| |
Now that the preheader insertion logic in LoopSimplify is externally exposed,
use it, and remove the copy-and-pasted version.
No functionality change intended.
llvm-svn: 182300
|
|
|
|
|
|
|
|
|
|
| |
As the pairing of this instruction form with the bdnz/bdz branches is now
enforced by the verification pass, make it clear from the name that these
are used only for counter-based loops.
No functionality change intended.
llvm-svn: 182296
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When asserts are enabled, this adds a verification pass for PPC counter-loop
formation. Unfortunately, without sacrificing code quality, there is no better
way of forming counter-based loops except at the (late) IR level. This means
that we need to recognize, at the IR level, anything which might turn into a
function call (or indirect branch). Because this is currently a finite set of
things, and because SelectionDAG lowering is basic-block local, this can be
done. Nevertheless, it is fragile, and failure results in a miscompile. This
verification pass checks that all (reachable) counter-based branches are
dominated by a loop mtctr instruction, and that no instructions in between
clobber the counter register. If these conditions are not satisfied, then an
ICE will be triggered.
In short, this is to help us sleep better at night.
llvm-svn: 182295
|
|
|
|
|
|
|
|
| |
We don't need to reject all inline asm as using the counter register (most does
not). Only those that explicitly clobber the counter register need to prevent
the transformation.
llvm-svn: 182191
|
|
|
|
| |
llvm-svn: 182180
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the equivalent change to r182091/r182092
in the old-style code emitter. Instead of having two separate
16-bit immediate encoding routines depending on the instruction,
this patch introduces a single encoder that checks the machine
operand flags to decide whether the low or high half of a
symbol address is required.
Since now both encoders make no further distinction between
"symbolLo" and "symbolHi", the .td operand can now use a
single getS16ImmEncoding method.
Tested by running the old-style JIT tests on 32-bit Linux.
llvm-svn: 182097
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that fixup_ppc_ha16 and fixup_ppc_lo16 are being treated exactly
the same everywhere, it no longer makes sense to have two fixup types.
This patch merges them both into a single type fixup_ppc_half16,
and renames fixup_ppc_lo16_ds to fixup_ppc_half16ds for consistency.
(The half16 and half16ds names are taken from the description of
relocation types in the PowerPC ABI.)
No change in code generation expected.
llvm-svn: 182092
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current PowerPC MC back end distinguishes between fixup_ppc_ha16
and fixup_ppc_lo16, which are determined by the instruction the fixup
applies to, and uses this distinction to decide whether a fixup ought
to resolve to the high or the low part of a symbol address.
This isn't quite correct, however. It is valid -if unusual- assembler
to use, e.g.
li 1, symbol@ha
or
lis 1, symbol@l
Whether the high or the low part of the address is used depends solely
on the @ suffix, not on the instruction.
In addition, both
li 1, symbol
and
lis 1, symbol
are valid, assuming the symbol address fits into 16 bits; again, both
will then refer to the actual symbol value (so li will load the value
itself, while lis will load the value shifted by 16).
To fix this, two places need to be adapted. If the fixup cannot be
resolved at assembler time, a relocation needs to be emitted via
PPCELFObjectWriter::getRelocType. This routine already looks at
the VK_ type to determine the relocation. The only problem is that
will reject any _LO modifier in a ha16 fixup and vice versa. This
is simply incorrect; any of those modifiers ought to be accepted
for either fixup type.
If the fixup *can* be resolved at assembler time, adjustFixupValue
currently selects the high bits of the symbol value if the fixup
type is ha16. Again, this is incorrect; see the above example
lis 1, symbol
Now, in theory we'd have to respect a VK_ modifier here. However,
in fact common code never even attempts to resolve symbol references
using any nontrivial VK_ modifier at assembler time; it will always
fall back to emitting a reloc and letting the linker handle it.
If this ever changes, presumably there'd have to be a target callback
to resolve VK_ modifiers. We'd then have to handle @ha etc. there.
llvm-svn: 182091
|
|
|
|
|
|
|
| |
Now that we have good testing, remove addFrameMove and create cfi
instructions directly.
llvm-svn: 182052
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some IR-level instructions (such as FP <-> i64 conversions) are not chained
w.r.t. the mtctr intrinsic and yet may become function calls that clobber the
counter register. At the selection-DAG level, these might be reordered with the
mtctr intrinsic causing miscompiles. To avoid this situation, if an existing
preheader has instructions that might use the counter register, create a new
preheader for the mtctr intrinsic. This extra block will be remerged with the
old preheader at the MI level, but will prevent unwanted reordering at the
selection-DAG level.
llvm-svn: 182045
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the second part of the change to always return "true"
offset values from getPreIndexedAddressParts, tackling the
case of "memrix" type operands.
This is about instructions like LD/STD that only have a 14-bit
field to encode immediate offsets, which are implicitly extended
by two zero bits by the machine, so that in effect we can access
16-bit offsets as long as they are a multiple of 4.
The PowerPC back end currently handles such instructions by
carrying the 14-bit value (as it will get encoded into the
actual machine instructions) in the machine operand fields
for such instructions. This means that those values are
in fact not the true offset, but rather the offset divided
by 4 (and then truncated to an unsigned 14-bit value).
Like in the case fixed in r182012, this makes common code
operations on such offset values not work as expected.
Furthermore, there doesn't really appear to be any strong
reason why we should encode machine operands this way.
This patch therefore changes the encoding of "memrix" type
machine operands to simply contain the "true" offset value
as a signed immediate value, while enforcing the rules that
it must fit in a 16-bit signed value and must also be a
multiple of 4.
This change must be made simultaneously in all places that
access machine operands of this type. However, just about
all those changes make the code simpler; in many cases we
can now just share the same code for memri and memrix
operands.
llvm-svn: 182032
|
|
|
|
|
|
|
| |
On PPC32, i64 FP conversions are implemented using runtime calls (which clobber
the counter register). These must be excluded.
llvm-svn: 182023
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DAGCombiner::CombineToPreIndexedLoadStore calls a target routine to
decompose a memory address into a base/offset pair. It expects the
offset (if constant) to be the true displacement value in order to
perform optional additional optimizations; in particular, to convert
other uses of the original pointer into uses of the new base pointer
after pre-increment.
The PowerPC implementation of getPreIndexedAddressParts, however,
simply calls SelectAddressRegImm, which returns a TargetConstant.
This value is appropriate for encoding into the instruction, but
it is not always usable as true displacement value:
- Its type is always MVT::i32, even on 64-bit, where addresses
ought to be i64 ... this causes the optimization to simply
always fail on 64-bit due to this line in DAGCombiner:
// FIXME: In some cases, we can be smarter about this.
if (Op1.getValueType() != Offset.getValueType()) {
- Its value is truncated to an unsigned 16-bit value if negative.
This causes the above opimization to generate wrong code.
This patch fixes both problems by simply returning the true
displacement value (in its original type). This doesn't
affect any other user of the displacement.
llvm-svn: 182012
|