| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This first cut is pretty conservative. The final argument register (R6)
is call-saved, so we would need to make sure that the R6 argument to a
sibling call is the same as the R6 argument to the calling function,
which seems worth keeping as a separate patch.
Saying that integer truncations are free means that we no longer
use the extending instructions LGF and LLGF for spills in int-conv-09.ll
and int-conv-10.ll. Instead we treat the registers as 64 bits wide and
truncate them to 32-bits where necessary. I think it's unlikely we'd
use LGF and LLGF for spills in other situations for the same reason,
so I'm removing the tests rather than replacing them. The associated
code is generic and applies to many more instructions than just
LGF and LLGF, so there is no corresponding code removal.
llvm-svn: 188669
|
| |
|
|
|
|
|
| |
* pow(x, 0.5) -> fabs(sqrt(x))
* pow(2.0, x) -> exp2(x)
llvm-svn: 188656
|
| |
|
|
|
|
|
|
|
|
| |
We had previously been asserting when faced with a FCOPYSIGN f64, ppcf128 node
because there was no way to expand the FCOPYSIGN node. Because ppcf128 is the
sum of two doubles, and the first double must have the larger magnitude, we
can take the sign from the first double. As a result, in addition to fixing the
crash, this is also an optimization.
llvm-svn: 188655
|
| |
|
|
|
|
|
|
|
| |
Modern PPC cores support a floating-point copysign instruction, and we can use
this to lower the FCOPYSIGN node (which is created from calls to the libm
copysign function). A couple of extra patterns are necessary because the
operand types of FCOPYSIGN need not agree.
llvm-svn: 188653
|
| |
|
|
|
|
|
|
|
| |
This reduces the noise in diffs making it more likely that, at least for
LLVM revision-over-revision, diffs will actually yield usable results.
This is consistent with objdump's DWARF dumping behavior.
llvm-svn: 188650
|
| |
|
|
|
|
|
|
| |
We check this in many/all other cases, just missed this one it seems.
Perhaps it'd be worth unifying this so we never emit zero-length
DW_AT_names.
llvm-svn: 188649
|
| |
|
|
|
|
|
|
| |
When patching inlineasm nodes to use GPRPair for 64-bit values, we
were dropping the information that two operands were tied, which
effectively broke the live-interval of vregs affected.
llvm-svn: 188643
|
| |
|
|
| |
llvm-svn: 188637
|
| |
|
|
|
| |
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 188598
|
| |
|
|
|
| |
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 188597
|
| |
|
|
|
| |
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 188596
|
| |
|
|
|
|
|
|
| |
Ongoing 'make the verifier happy' improvements to ARM fast-isel.
rdar://12594152
llvm-svn: 188595
|
| |
|
|
|
|
|
|
|
|
| |
Properly constrain the operand register class for instructions used
in [sz]ext expansion. Update more tests to use the verifier now that
we're getting the register classes correct.
rdar://12594152
llvm-svn: 188594
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Teach the generic instruction selection helper functions to constrain
the register classes of their input operands. For non-physical register
references, the generic code needs to be careful not to mess that up
when replacing references to result registers. As the comment indicates
for MachineRegisterInfo::replaceRegWith(), it's important to call
constrainRegClass() first.
rdar://12594152
llvm-svn: 188593
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Lots of machine verifier errors result from using a plain GPR regclass
for incoming argument copies. A more restrictive rGPR class is more
appropriate since it more accurately represents what's happening, plus
it lines up better with isel later on so the verifier is happier.
Reduces the number of ARM fast-isel tests not running with the verifier
enabled by over half.
rdar://12594152
llvm-svn: 188592
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This regards how mips16 is viewed. It's not really a target type but
there has always been a target for it in the td files. It's more properly
-mcpu=mips32 -mattr=+mips16 . This is how clang treats it but we have
always had the -mcpu=mips16 which I probably should delete now but it will
require updating all the .ll test cases for mips16. In this case it changed
how we decide if we have a count bits instruction and whether instruction
lowering should then expand ctlz. Now that we have dual mode compilation,
-mattr=+mips16 really just indicates the inital processor mode that
we are compiling for. (It is also possible to have -mcpu=64 -mattr=+mips16
but as far as I know, nobody has even built such a processor, though there
is an architecture manual for this).
llvm-svn: 188586
|
| |
|
|
|
|
| |
buildbots.
llvm-svn: 188567
|
| |
|
|
|
|
|
| |
Fixes two recent piglit regressions with radeonsi.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188559
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The logic in SIInsertWaits::getHwCounts() only really made sense for SMRD
instructions, and trying to shoehorn it into handling DS_WRITE_B32 caused
it to corrupt the encoding of that by clobbering the first operand with
the second one.
Undo that damage and only apply the SMRD logic to that.
Fixes some derivates related piglit regressions with radeonsi.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188558
|
| |
|
|
|
|
|
|
|
|
| |
address.
This unbreaks PIC with fast isel on ELF targets (PR16717). The output matches
what GCC and SDag do for PIC but may not cover all of the many flavors of PIC
that exist.
llvm-svn: 188551
|
| |
|
|
|
|
|
|
| |
Thumb2 literal loads use an offset encoding which allows for
negative zero. This fixes parsing and encoding so that #-0
is correctly processed. The parser represents #-0 as INT32_MIN.
llvm-svn: 188549
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are many Thumb instructions which take 12-bit immediates encoded in a special
8-byte value + 4-byte rotator form. Not all numbers are represented, and it's legal
to transform an assembly instruction to be able to encode the immediate.
For example: AND and BIC are complementary instructions; one can switch the AND
to a BIC as long as the immediate is complemented.
The intent is to switch one instruction into its complementary one when the immediate
cannot be encoded in the form requested in the original assembly and when the
complementary immediate is encodable.
The patch addresses two issues:
1. definition of t2SOImmNot immediate - it has to check that the orignal value is
not encoded naturally
2. t2AND and t2BIC instruction aliases which should use the Thumb2 SOImm operand
rather than the ARM one.
llvm-svn: 188548
|
| |
|
|
|
|
| |
It would also make sense to use it for memchr; I'm working on that now.
llvm-svn: 188547
|
| |
|
|
| |
llvm-svn: 188546
|
| |
|
|
| |
llvm-svn: 188544
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Generalize r188163 to cope with return types other than MVT::i32, just
as the existing visitMemCmpCall code did. I've split this out into a
subroutine so that it can be used for other upcoming patches.
I also noticed that I'd used the wrong API to record the out chain.
It's a load that uses DAG.getRoot() rather than getRoot(), so the out
chain should go on PendingLoads. I don't have a testcase for that because
we don't do any interesting scheduling on z yet.
llvm-svn: 188540
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
r188163 used CLC to implement memcmp. Code that compares the result
directly against zero can test the CC value produced by CLC, but code
that needs an integer result must use IPM. The sequence I'd used was:
ipm <reg>
sll <reg>, 2
sra <reg>, 30
but I'd forgotten that this inverts the order, so that CC==1 ("less")
becomes an integer greater than zero, and CC==2 ("greater") becomes
an integer less than zero. This sequence should only be used if the
CLC arguments are reversed to compensate. The problem then is that
the branch condition must also be reversed when testing the CLC
result directly.
Rather than do that, I went for a different sequence that works with
the natural CLC order:
ipm <reg>
srl <reg>, 28
rll <reg>, <reg>, 31
One advantage of this is that it doesn't clobber CC. A disadvantage
is that any sign extension to 64 bits must be done separately,
rather than being folded into the shifts.
llvm-svn: 188538
|
| |
|
|
|
|
| |
files.
llvm-svn: 188537
|
| |
|
|
|
|
| |
v8i64.
llvm-svn: 188534
|
| |
|
|
| |
llvm-svn: 188529
|
| |
|
|
|
|
|
|
| |
- Benjamin fixed the emission of this file in r179937, but it still lives on a
few buildbots. We should probably clean up the build dirs once in a while,
eh?
llvm-svn: 188527
|
| |
|
|
| |
llvm-svn: 188526
|
| |
|
|
|
|
|
| |
This reverts commit a6a39ced095c2f453624ce62c4aead25db41a18f.
This is the wrong version of this fix.
llvm-svn: 188523
|
| |
|
|
|
|
|
|
|
| |
The SIInsertWaits pass was overwriting the first operand (gds bit) of
DS_WRITE_B32 with the second operand (value to write). This meant that
any time the value to write was stored in an odd number VGPR, the gds
bit would be set causing the instruction to write to GDS instead of LDS.
llvm-svn: 188522
|
| |
|
|
|
| |
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 188521
|
| |
|
|
|
| |
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 188520
|
| |
|
|
|
| |
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 188519
|
| |
|
|
|
| |
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 188518
|
| |
|
|
|
| |
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 188517
|
| |
|
|
|
| |
Tested-by: Aaron Watry <awatry@gmail.com>
llvm-svn: 188515
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Instead of setting the suffixes in a bunch of places, just set one master
list in the top-level config. We now only modify the suffix list in a few
suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py).
- Aside from removing the need for a bunch of lit.local.cfg files, this enables
4 tests that were inadvertently being skipped (one in
Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and
CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been
XFAILED).
- This commit also fixes a bunch of config files to use config.root instead of
older copy-pasted code.
llvm-svn: 188513
|
| |
|
|
|
|
|
|
|
|
|
| |
When both constants are positive or both constants are negative,
InstCombine already simplifies comparisons like this, but when
it's exactly zero and -1, the operand sorting ends up reversed
and the pattern fails to match. Handle that special case.
Follow up for rdar://14689217
llvm-svn: 188512
|
| |
|
|
|
|
|
| |
This path wasn't tested before without a datalayout,
so add some more tests and re-run with and without one.
llvm-svn: 188507
|
| |
|
|
| |
llvm-svn: 188503
|
| |
|
|
|
|
|
|
|
| |
the input character is not converted to char before comparing with zero.
The patch was discussed in this thread:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130812/184069.html
llvm-svn: 188489
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When the -dfsan-debug-nonzero-labels parameter is supplied, the code
is instrumented such that when a call parameter, return value or load
produces a nonzero label, the function __dfsan_nonzero_label is called.
The idea is that a debugger breakpoint can be set on this function
in a nominally label-free program to help identify any bugs in the
instrumentation pass causing labels to be introduced.
Reviewers: eugenis
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1405
llvm-svn: 188472
|
| |
|
|
|
|
|
|
|
| |
1. The offset range for Thumb1 PC relative loads is [0..1020] and not [-1024..1020]
2. Thumb2 PC relative loads may define the PC, so the restriction placed on target register is removed
3. Removes unneeded alias between "ldr.n" and t1LDRpci. ".n" is actually stripped by both tablegen
and the ASM parser, so this alias rule really does nothing
llvm-svn: 188466
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Includes:
madd_q, maddr_q, maddv, max_[asu], maxi_[su], min_[asu], mini_[su], mod_[su],
msub_q, msubr_q, msubv, mul_q, mulr_q, mulv, nloc, nlzc, nori, ori, pckev,
pckod, pcnt, sat_[su], shf, sld, sldi, sll, slli, splat, splati, sr[al],
sr[al]i, subs_[su], subss_u, subus_s, subv, subvi, vshf, xori
Patch by Daniel Sanders
llvm-svn: 188460
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Includes:
fadd, fceq, fcg[et], fclass, fcl[et], fcne, fcun, fdiv, fexdo, fexp2,
fexup[lr], ffint_[su], ffql, ffqr, fill, flog2, fmadd, fmax, fmax_a, fmin,
fmin_a, fmsub, fmul, frint, frcp, frsqrt, fseq, fsge, fsgt, fsle, fslt,
fsne, fsqr, fsub, ftint_s, ftq
Patch by Daniel Sanders
llvm-svn: 188458
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Includes:
add_a, adds_[asu], addv, addvi, andi.b, asub_[su].[bhwd], aver?_[su]_[bhwd],
bclr, bclri, bins[lr], bins[lr]i, bmnzi, bmzi, bneg, bnegi, bseli, bset, bseti,
c(eq|ne), c(eq|ne)i, cl[et]_[su], cl[et]i_[su], copy_[su].[bhw], div_[su],
dotp_[su], dpadd_[su], dpsub_[su], ilvev, ilvl, ilvod, ilvr, insv, insve,
ldi
Patch by Daniel Sanders
llvm-svn: 188457
|