path: root/llvm/test/CodeGen
Commit message | Author | Age | Files | Lines
* Emit function alias to data as a function symbol. | Evgeniy Stepanov | 2015-12-04 | 1 | -0/+12
    CFI emits jump slots for indirect functions as a byte array constant, and declares function-typed aliases to these constants. This change fixes AsmPrinter to emit these aliases as function symbols and not data symbols.
    llvm-svn: 254674
* CodeGen peephole: fold redundant phys reg copies | JF Bastien | 2015-12-03 | 1 | -0/+190
    Code generation often exposes redundant physical register copies through virtual registers such as:
        %vreg = COPY %PHYSREG
        ...
        %PHYSREG = COPY %vreg
    There are cases where no intervening clobber of %PHYSREG occurs, and the later copy could therefore be removed. In some cases this further allows us to remove the initial copy.
    This patch contains a motivating example which comes from the x86 build of Chrome, specifically cc::ResourceProvider::UnlockForRead uses libstdc++'s implementation of hash_map. That example has two tests live at the same time, and after machine sinking LLVM has confused itself enough and thinks spilling EFLAGS is a great idea even though it's never restored and the comparison results are both live.
    Before this patch we have:
        DEC32m %RIP, 1, %noreg, <ga:@L>, %noreg, %EFLAGS<imp-def>
        %vreg1<def> = COPY %EFLAGS; GR64:%vreg1
        %EFLAGS<def> = COPY %vreg1; GR64:%vreg1
        JNE_1 <BB#1>, %EFLAGS<imp-use>
    Both copies are useless. This patch tries to eliminate the later copy in a generic manner. dec is especially confusing to LLVM when compared with sub.
    I wrote this patch to treat all physical registers generically, but only remove redundant copies of non-allocatable physical registers because the allocatable ones caused issues (e.g. when calling conventions weren't properly modeled) and should be handled later by the register allocator anyways.
    The following tests used to fail when the patch also replaced allocatable registers:
        CodeGen/X86/StackColoring.ll
        CodeGen/X86/avx512-calling-conv.ll
        CodeGen/X86/copy-propagation.ll
        CodeGen/X86/inline-asm-fpstack.ll
        CodeGen/X86/musttail-varargs.ll
        CodeGen/X86/pop-stack-cleanup.ll
        CodeGen/X86/preserve_mostcc64.ll
        CodeGen/X86/tailcallstack64.ll
        CodeGen/X86/this-return-64.ll
    This happens because COPY has other special meaning for e.g. dependency breakage and x87 FP stack.
    Note that all other backends' tests pass.
    Reviewers: qcolombet
    Subscribers: llvm-commits
    Differential Revision: http://reviews.llvm.org/D15157
    llvm-svn: 254665
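The copy-folding idea in the message above can be sketched on a toy instruction list. This is only an illustration in Python, not the LLVM peephole code: the `Instr` tuple, `is_virtual`, and `fold_redundant_copies` are hypothetical names, and the sketch assumes a single basic block where every register definition is listed explicitly. It drops only the later copy; removing the then-dead initial copy is left to a separate cleanup.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    opcode: str
    defs: tuple  # registers written
    uses: tuple  # registers read


def is_virtual(reg):
    # Toy convention (assumption): virtual registers are spelled %vregN.
    return reg.startswith("%vreg")


def fold_redundant_copies(block, non_allocatable):
    """Drop `%PHYS = COPY %vreg` when %vreg still holds an unclobbered copy of %PHYS."""
    copied_from = {}  # vreg -> physical register it was copied from
    out = []
    for ins in block:
        if ins.opcode == "COPY":
            dst, src = ins.defs[0], ins.uses[0]
            if (not is_virtual(dst) and is_virtual(src)
                    and dst in non_allocatable
                    and copied_from.get(src) == dst):
                continue  # the physical register already holds this value
        for d in ins.defs:
            if is_virtual(d):
                copied_from.pop(d, None)  # vreg redefined, old record is stale
            else:
                # A write to a physical register clobbers every record taken from it.
                for v in [v for v, p in copied_from.items() if p == d]:
                    del copied_from[v]
        if ins.opcode == "COPY" and is_virtual(dst) and not is_virtual(src):
            copied_from[dst] = src
        out.append(ins)
    return out


block = [
    Instr("DEC32m", ("%EFLAGS",), ()),
    Instr("COPY", ("%vreg1",), ("%EFLAGS",)),
    Instr("COPY", ("%EFLAGS",), ("%vreg1",)),
    Instr("JNE_1", (), ("%EFLAGS",)),
]
folded = fold_redundant_copies(block, non_allocatable={"%EFLAGS"})
```

Restricting the fold to the `non_allocatable` set mirrors the restriction described in the message; an intervening write to `%EFLAGS` between the two copies keeps the second copy in place.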
* [WebAssembly] Fix dominance check for PHIs in the StoreResult pass | Dan Gohman | 2015-12-03 | 1 | -0/+43
    When a block has no terminator instructions, getFirstTerminator() returns end(), which can't be used in dominance checks. Check dominance for phi operands separately.
    Also, remove some bits from WebAssemblyRegStackify.cpp that were causing trouble on the same testcase; they were left behind from an earlier experiment.
    Differential Revision: http://reviews.llvm.org/D15210
    llvm-svn: 254662
* [X86] Put no-op ADJCALLSTACK markers around all dynamic lowerings | Reid Kleckner | 2015-12-03 | 3 | -8/+49
    Summary: These ADJCALLSTACK markers don't generate code, but they keep dynamic alloca code that calls chkstk out of the prologue. This slightly pessimizes inalloca calls by preventing some register copy coalescing, but I can live with that.
    Reviewers: qcolombet
    Subscribers: hans, llvm-commits
    Differential Revision: http://reviews.llvm.org/D15200
    llvm-svn: 254645
* Move branch folding test to a better location. | Andrew Kaylor | 2015-12-03 | 1 | -0/+110
    llvm-svn: 254640
* AArch64FastISel: Use cbz/cbnz to branch on i1 | Matthias Braun | 2015-12-03 | 3 | -19/+7
    In the case of a conditional branch without a preceding cmp we used to emit an "and; cmp; b.eq/b.ne" sequence; use tbz/tbnz instead.
    Differential Revision: http://reviews.llvm.org/D15122
    llvm-svn: 254621
* AMDGPU/SI: Emit constant arrays in the .hsrodata_readonly_agent section | Tom Stellard | 2015-12-03 | 1 | -0/+36
    Summary: This is done only when targeting HSA.
    Reviewers: arsenm
    Subscribers: arsenm, llvm-commits
    Differential Revision: http://reviews.llvm.org/D13807
    llvm-svn: 254587
* Revert "ScheduleDAGInstrs: Rework schedule graph builder." | Matthias Braun | 2015-12-03 | 14 | -84/+84
    This works mostly fine but breaks some stage 1 builders when compiling compiler-rt on i386. Revert for further investigation as I can't see an obvious cause/fix.
    This reverts commit r254577.
    llvm-svn: 254586
* ScheduleDAGInstrs: Rework schedule graph builder. | Matthias Braun | 2015-12-03 | 14 | -84/+84
    The new algorithm remembers the uses encountered while walking backwards until a matching def is found. Contrary to the previous version this:
    - Works without LiveIntervals being available
    - Allows increasing the precision to subregisters/lanemasks (not used for now)
    The changes in the AMDGPU tests are necessary because the R600 scheduler is not stable with respect to the order of nodes in the ready queues.
    Differential Revision: http://reviews.llvm.org/D9068
    llvm-svn: 254577
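The backward walk described above can be illustrated with a small sketch. It is a simplified Python model, not the ScheduleDAGInstrs implementation: `build_data_edges` and the `(name, defs, uses)` tuples are hypothetical, and only true data dependencies on whole registers are modeled (no subregisters, lanemasks, or anti/output dependencies).

```python
from collections import defaultdict


def build_data_edges(instrs):
    """instrs: list of (name, defs, uses) in program order.

    Returns a set of (def_name, use_name) data-dependency edges.
    """
    pending_uses = defaultdict(list)  # reg -> users seen so far that still await a def
    edges = set()
    for name, defs, uses in reversed(instrs):
        # Handle defs first: this instruction feeds every later use seen so far.
        for reg in defs:
            for user in pending_uses.pop(reg, []):
                edges.add((name, user))
        # Then remember this instruction's own uses for an earlier def to claim.
        for reg in uses:
            pending_uses[reg].append(name)
    return edges


program = [
    ("a", ["r0"], []),
    ("b", ["r1"], ["r0"]),
    ("c", ["r0"], ["r0", "r1"]),
    ("d", [], ["r0"]),
]
edges = build_data_edges(program)
```

Processing an instruction's defs before its uses matters for something like `c`, which both reads and writes `r0`: its read is matched to the earlier def in `a`, not to itself.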
* [WebAssembly] Add a test for wasm-store-results pass | Derek Schuff | 2015-12-03 | 1 | -0/+18
    Differential Revision: http://reviews.llvm.org/D15167
    llvm-svn: 254570
* Tests: PPC: remove unnecessary metadata. NFC | Kyle Butt | 2015-12-02 | 1 | -3/+0
    Remove unnecessary metadata from a test case.
    llvm-svn: 254544
* AMDGPU/SI: Correctly emit agent global segment variables when targeting HSA | Tom Stellard | 2015-12-02 | 1 | -0/+105
    Differential Revision: http://reviews.llvm.org/D14508
    llvm-svn: 254540
* [CodeGen]: Fix bad interaction with AntiDep breaking and inline asm. | Kyle Butt | 2015-12-02 | 1 | -0/+308
    AggressiveAntiDepBreaker was renaming registers specified by the user for inline assembly. While this will work for compiler-specified registers, it won't work for user-specified registers, and at the time this runs, I don't currently see a way to distinguish them.
    llvm-svn: 254532
* AArch64: use ldxp/stxp pair to implement 128-bit atomic loads. | Tim Northover | 2015-12-02 | 2 | -2/+11
    The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic if there has been a corresponding successful stxp. It's less clear for AArch32, so I'm leaving that alone for now.
    llvm-svn: 254524
* AMDGPU/SI: Don't emit group segment global variables | Tom Stellard | 2015-12-02 | 1 | -0/+14
    Summary: Only global or readonly segment variables should appear in object files.
    Reviewers: arsenm
    Subscribers: arsenm, llvm-commits
    Differential Revision: http://reviews.llvm.org/D15111
    llvm-svn: 254519
* [AArch64]: Add support for Cortex-A35 | Christof Douma | 2015-12-02 | 3 | -0/+37
    Adds support for the new Cortex-A35 ARMv8-A core.
    llvm-svn: 254503
* Patch to fix a crash in the PowerPC back end due to ISD::ROTL and ISD::ROTR not being expanded. | Nemanja Ivanovic | 2015-12-02 | 1 | -0/+12
    Test case included.
    llvm-svn: 254501
* [X86][FMA] Optimize FNEG(FMUL) Patterns | Simon Pilgrim | 2015-12-02 | 2 | -2/+166
    On FMA targets, we can avoid having to load a constant to negate a float/double multiply by instead using a FNMSUB (-(X*Y)-0).
    Fix for PR24366
    Differential Revision: http://reviews.llvm.org/D14909
    llvm-svn: 254495
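The identity behind this rewrite, FNEG(FMUL(X, Y)) = -(X*Y) - 0, can be checked numerically. The sketch below is an assumption-laden model in Python doubles: `fnmsub` and `fneg_fmul` are hypothetical helper names, and the multiply and subtract are rounded separately here, whereas the hardware instruction is fused (which does not matter when the subtrahend is zero).

```python
import math


def fnmsub(x, y, z):
    # Unfused model of FNMSUB: -(x * y) - z
    return -(x * y) - z


def fneg_fmul(x, y):
    # The pattern being replaced: negate the result of a multiply.
    return -(x * y)


def same_value(a, b):
    # Equal magnitude and equal sign bit, so +0.0 and -0.0 are told apart.
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)


samples = [0.0, -0.0, 1.5, -2.25, 1e308, math.inf]
all_match = all(
    same_value(fnmsub(x, y, 0.0), fneg_fmul(x, y))
    for x in samples
    for y in samples
    if not math.isnan(x * y)  # skip inf * 0, which is NaN on both sides
)
```

Subtracting +0.0 preserves the sign of a zero product, which is why the signed-zero cases in `samples` still agree.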
* [X86][AVX512] add comi with Sae | Asaf Badouh | 2015-12-02 | 1 | -0/+75
    add builtin_ia32_vcomisd and builtin_ia32_vcomisd
    Differential Revision: http://reviews.llvm.org/D14331
    llvm-svn: 254493
* [X86] Make sure the prologue does not clobber EFLAGS when it lives across it. | Quentin Colombet | 2015-12-02 | 1 | -0/+113
    This is a superset of the fix done in r254448.
    This fixes PR25607.
    llvm-svn: 254478
* AArch64: fix 128-bit shifts | Tim Northover | 2015-12-02 | 1 | -37/+43
    We mustn't introduce a shift of exactly 64 bits for any inputs, since that's an UNDEF value (and worse, it's not what you want with the natural AArch64 implementation).
    The generated code is pretty horrific, but I couldn't come up with an obviously better alternative (if the amount is constant EXTR could help). Turns out 128-bit shifts are just nasty.
    rdar://22491037
    llvm-svn: 254475
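To see why a 64-bit shift creeps in, consider building a 128-bit left shift from two 64-bit halves: the naive carry term `lo >> (64 - amt)` shifts by exactly 64 when `amt` is 0. The Python sketch below shows one well-known way to avoid that by splitting the carry shift in two. It is not claimed to be the lowering this commit uses; `shl64`, `lshr64`, and `shl128` are hypothetical helpers that reject out-of-range amounts to stand in for UNDEF.

```python
MASK64 = (1 << 64) - 1


def shl64(x, n):
    # Model of a 64-bit shift: amounts outside 0..63 are treated as undefined.
    if not 0 <= n < 64:
        raise ValueError("undefined 64-bit shift amount")
    return (x << n) & MASK64


def lshr64(x, n):
    if not 0 <= n < 64:
        raise ValueError("undefined 64-bit shift amount")
    return x >> n


def shl128(lo, hi, amt):
    """Shift the 128-bit value (hi:lo) left by amt in 0..127; returns (new_lo, new_hi)."""
    if not 0 <= amt < 128:
        raise ValueError("undefined 128-bit shift amount")
    if amt >= 64:
        return 0, shl64(lo, amt - 64)
    # Carry from lo into hi is lo >> (64 - amt); split it as (lo >> 1) >> (63 - amt)
    # so that neither step ever shifts by 64, even when amt == 0.
    carry = lshr64(lshr64(lo, 1), 63 - amt)
    return shl64(lo, amt), shl64(hi, amt) | carry


lo, hi = 0x8000000000000001, 0x0123456789ABCDEF
results = [shl128(lo, hi, amt) for amt in range(128)]
```

Every amount from 0 to 127 goes through without raising, which is the property the commit message asks for.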
* AMDGPU: Error on addrspacecasts that aren't actually implemented | Matt Arsenault | 2015-12-01 | 2 | -52/+66
    llvm-svn: 254469
* AMDGPU: Implement isNoopAddrSpaceCast | Matt Arsenault | 2015-12-01 | 1 | -0/+66
    llvm-svn: 254468
* [X86] Make sure the prologue does not clobber EFLAGS when it lives across it. | Quentin Colombet | 2015-12-01 | 1 | -0/+91
    This fixes PR25629.
    llvm-svn: 254448
* Fix Thumb1 epilogue generation | Artyom Skrobov | 2015-12-01 | 1 | -0/+60
    Summary: This had been broken for a very long time, but nobody noticed until D14357 enabled shrink-wrapping by default.
    Reviewers: jroelofs, qcolombet
    Subscribers: tyomitch, llvm-commits, rengolin
    Differential Revision: http://reviews.llvm.org/D14986
    llvm-svn: 254444
* [AArch64] Fix a corner case in BitField select | Weiming Zhao | 2015-12-01 | 1 | -0/+22
    Summary: When there are no useful bits, BitWidth becomes 0 and APInt will not be happy. See https://llvm.org/bugs/show_bug.cgi?id=25571
    We can just mark the operand as IMPLICIT_DEF if none of its bits are used.
    Reviewers: t.p.northover, jmolloy
    Subscribers: gberry, jmolloy, mgrang, aemerson, llvm-commits, rengolin
    Differential Revision: http://reviews.llvm.org/D14803
    llvm-svn: 254440
* AVX-512: regenerated test for avx512 arithmetics, NFC | Elena Demikhovsky | 2015-12-01 | 1 | -61/+222
    llvm-svn: 254410
* Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics. | Yury Gribov | 2015-12-01 | 1 | -0/+21
    The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset from native stack pointer to the address of the most recent dynamic alloca on the caller's stack. These intrinsics are intended for use in combination with @llvm.stacksave and @llvm.stackrestore to get a pointer to the most recent dynamic alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning routines.
    Patch by Max Ostapenko.
    Differential Revision: http://reviews.llvm.org/D14983
    llvm-svn: 254404
* Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces. | Cong Hou | 2015-12-01 | 15 | -48/+48
    (This is the second attempt to submit this patch. The first caused two assertion failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687)
    The patch in http://reviews.llvm.org/D13745 is broken into four parts:
    1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
    2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361).
    3. Use new interfaces in all other passes.
    4. Remove old interfaces.
    This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using them together with known probabilities in a successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses.
    All uses of weight-based interfaces are now updated to use probability-based ones.
    Differential revision: http://reviews.llvm.org/D14973
    llvm-svn: 254377
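The successor-list rule described above (all probabilities known, or all unknown with an implied 1/N each) can be modeled in a few lines. This is a hedged Python sketch, not the MachineBasicBlock API: the `Block` class and its method names are hypothetical, and the mixed-use check is raised as a `ValueError` here where the commit message describes an assertion.

```python
from fractions import Fraction


class Block:
    def __init__(self, name):
        self.name = name
        self.successors = []  # list of (Block, Fraction or None); None means unknown

    def add_successor(self, succ, prob=None):
        # The default probability is unknown. Mixing known and unknown is rejected.
        if self.successors:
            existing_unknown = self.successors[0][1] is None
            if existing_unknown != (prob is None):
                raise ValueError("cannot mix known and unknown probabilities")
        self.successors.append((succ, prob))

    def succ_probability(self, succ):
        for s, p in self.successors:
            if s is succ:
                # With all-unknown probabilities, assume a uniform 1/N split.
                return Fraction(1, len(self.successors)) if p is None else p
        raise KeyError(succ.name)


entry, a, b, c = Block("entry"), Block("a"), Block("b"), Block("c")
for succ in (a, b, c):
    entry.add_successor(succ)          # all unknown
uniform = entry.succ_probability(b)    # 1/3 under the uniform assumption

loop = Block("loop")
loop.add_successor(a, Fraction(9, 10))
loop.add_successor(b, Fraction(1, 10))  # all known
```

Keeping the list homogeneous means a query never has to guess how much probability mass the unknown entries should share with the known ones.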
* Revert r254348: "Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces." | Hans Wennborg | 2015-12-01 | 15 | -48/+48
    and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction."
    Asserts were firing in Chromium builds. See PR25687.
    llvm-svn: 254366
* Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces. | Cong Hou | 2015-12-01 | 15 | -48/+48
    The patch in http://reviews.llvm.org/D13745 is broken into four parts:
    1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
    2. Use new interfaces in SelectionDAG, while in other passes treat probabilities as weights (http://reviews.llvm.org/D14361).
    3. Use new interfaces in all other passes.
    4. Remove old interfaces.
    This patch is 3+4 above. In this patch, MBB won't provide weight-based interfaces any more, which are totally replaced by probability-based ones. The interface addSuccessor() is redesigned so that the default probability is unknown. We allow unknown probabilities but don't allow using them together with known probabilities in a successor list. That is to say, we either have a list of successors with all known probabilities, or all unknown probabilities. In the latter case, we assume each successor has 1/N probability where N is the number of successors. An assertion checks if the user is attempting to add a successor with the disallowed mixed use as stated above. This can help us catch many misuses.
    All uses of weight-based interfaces are now updated to use probability-based ones.
    Differential revision: http://reviews.llvm.org/D14973
    llvm-svn: 254348
* [X86][FMA4] Prefer FMA4 to FMA | Simon Pilgrim | 2015-11-30 | 3 | -3/+3
    We currently output FMA instructions on targets which support both FMA4 + FMA (i.e. later Bulldozer CPUs bdver2/bdver3/bdver4). This patch flips this so FMA4 is preferred; this is for several reasons:
    1 - FMA4 is non-destructive, reducing the need for mov instructions.
    2 - It's more straightforward to commute and fold inputs (although the recent work on FMA has reduced this difference).
    3 - All supported targets have FMA4 performance equal to or better than FMA - Piledriver (bdver2) in particular has half the throughput when executing FMA instructions.
    It looks like no future AMD processor lines will support FMA4 after the Bulldozer series so we're not causing problems for later CPUs.
    Differential Revision: http://reviews.llvm.org/D14997
    llvm-svn: 254339
* Have 'optnone' respect the -fast-isel=false option. | Paul Robinson | 2015-11-30 | 1 | -2/+2
    This is primarily useful for debugging optnone v. ISel issues.
    Differential Revision: http://reviews.llvm.org/D14792
    llvm-svn: 254335
* [X86] Update test/CodeGen/X86/avg.ll with the help of update_llc_test_checks.py. NFC. | Cong Hou | 2015-11-30 | 1 | -250/+347
    llvm-svn: 254334
* AMDGPU: Rework how private buffer is passed for HSA | Matt Arsenault | 2015-11-30 | 12 | -195/+406
    If we know we have stack objects, we reserve the registers in which the private buffer resource and wave offset are passed and use them directly.
    If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the prologue.
    This also only selectively enables the input registers which are really required instead of always enabling all of them.
    llvm-svn: 254331
* AMDGPU: Remove SIPrepareScratchRegs | Matt Arsenault | 2015-11-30 | 7 | -32/+704
    It does not work because of emergency stack slots. This pass was supposed to eliminate dummy registers for the spill instructions, but the register scavenger can introduce more during PrologEpilogInserter, so some would end up left behind if they were needed.
    The potential for spilling the scratch resource descriptor and offset register makes doing something like this overly complicated. Reserve registers to use for the resource descriptor and use them directly in eliminateFrameIndex.
    Also removes creating another scratch resource descriptor when directly selecting scratch MUBUF instructions.
    The choice of which registers are reserved is temporary. For now it attempts to pick the next available registers after the user and system SGPRs.
    llvm-svn: 254329
* AMDGPU: Use assert zext for workgroup sizes | Matt Arsenault | 2015-11-30 | 1 | -0/+60
    llvm-svn: 254328
* [ARM] For old thumb ISA like v4t, we cannot use PC directly in pop. | Quentin Colombet | 2015-11-30 | 1 | -4/+23
    Fix the epilogue emission to account for that.
    llvm-svn: 254325
* [X86] Add RIP to GR64_TCW64 | David Majnemer | 2015-11-30 | 2 | -1/+17
    The MachineVerifier wants to check that the register operands of an instruction belong to the instruction's register class. RIP-relative control flow instructions violated this by referencing RIP. While this was fixed for SysV, it was never fixed for Win64.
    llvm-svn: 254315
* Enable shrink wrapping for PPC64 | Kit Barton | 2015-11-30 | 1 | -1/+0
    Re-enable shrink wrapping for PPC64 Little Endian. One minor modification to PPCFrameLowering::findScratchRegister was necessary to handle fall-thru blocks (blocks with no terminator) correctly.
    Tested with all LLVM tests, clang tests, and the self-hosting build, with no problems found.
    Phabricator: http://reviews.llvm.org/D14778
    llvm-svn: 254314
* AMDGPU: Don't reserve SCRATCH_PTR input register | Matt Arsenault | 2015-11-30 | 1 | -1/+1
    This hasn't been doing anything since using relocations was added.
    llvm-svn: 254304
* AVX512: regenerate avx512bw intrinsics tests results. | Igor Breger | 2015-11-30 | 1 | -470/+879
    Differential Revision: http://reviews.llvm.org/D15069
    llvm-svn: 254295
* [X86] int_x86_avx2_permps and X86ISD::VPERMV should take an integer vector for its shuffle indices. | Craig Topper | 2015-11-29 | 2 | -6/+6
    llvm-svn: 254269
* [X86][SSE] Added support for lowering to ADDSUBPS/ADDSUBPD with commuted inputs | Simon Pilgrim | 2015-11-29 | 1 | -0/+70
    We could already recognise shuffle(FSUB, FADD) -> ADDSUB; this allows us to recognise shuffle(FADD, FSUB) -> ADDSUB by commuting the shuffle mask prior to matching.
    llvm-svn: 254259
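Commuting a two-input shuffle mask means swapping the operands and remapping each index to the other half. The Python sketch below illustrates this on 4-lane vectors; `commute_mask`, `shuffle`, and `is_addsub_mask` are hypothetical helpers, undef mask lanes are not modeled, and the ADDSUB lane pattern assumed is subtract in even lanes, add in odd lanes.

```python
def shuffle(a, b, mask):
    # Two-input shuffle: indices 0..n-1 select from a, n..2n-1 select from b.
    both = list(a) + list(b)
    return [both[m] for m in mask]


def commute_mask(mask, n):
    # Rewrite the mask so that shuffle(b, a, new_mask) == shuffle(a, b, mask).
    return [m + n if m < n else m - n for m in mask]


def is_addsub_mask(mask, n):
    # With operands (FSUB, FADD): even lanes from FSUB, odd lanes from FADD.
    return all(m == (i if i % 2 == 0 else n + i) for i, m in enumerate(mask))


x, y = [1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]
fadd = [p + q for p, q in zip(x, y)]
fsub = [p - q for p, q in zip(x, y)]

mask = [4, 1, 6, 3]                  # shuffle(FADD, FSUB, mask): the commuted form
commuted = commute_mask(mask, 4)     # mask for shuffle(FSUB, FADD, ...)
matches = is_addsub_mask(commuted, 4)
```

Here `commuted` is `[0, 5, 2, 7]`, the form the existing matcher already handles, so one extra mask rewrite is enough to catch the swapped-operand case.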
* [X86][AVX] Regenerate ADDSUB tests | Simon Pilgrim | 2015-11-28 | 2 | -132/+309
    Tidied up triple and regenerated tests using update_llc_test_checks.py
    llvm-svn: 254237
* Revert "[ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM." | Renato Golin | 2015-11-28 | 5 | -100/+0
    This reverts commit r254201 and r254202, as it broke test-suite, self-hosting and sanitizer tests on ARM buildbots.
    llvm-svn: 254234
* [X86][FMA] Added 512-bit tests to match 128/256-bit tests coverage | Simon Pilgrim | 2015-11-28 | 1 | -0/+487
    As discussed on D14909
    llvm-svn: 254233
* [X86][FMA] More thorough FMA tests | Simon Pilgrim | 2015-11-28 | 2 | -186/+510
    Added FMADD/FMSUB/FNMADD/FNMSUB tests for all types
    Added load folding tests for 512-bit vectors
    NOTE: Many of the AVX512 FMA instructions don't yet commute/fold correctly
    As discussed on D14909
    llvm-svn: 254232
* [X86][AVX2] Tidied up PBROADCAST tests | Simon Pilgrim | 2015-11-28 | 1 | -86/+154
    Tidied up triple and regenerated tests using update_llc_test_checks.py
    llvm-svn: 254231
* llvm/test/CodeGen/SystemZ/alloca-04.ll REQUIRES asserts due to -debug-pass. | NAKAMURA Takumi | 2015-11-28 | 1 | -1/+1
    llvm-svn: 254230