path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* [x86] enable machine combiner reassociations for 128-bit vector logical ↵Sanjay Patel2015-09-122-11/+75
| | | | | | | | | | integer insts (2nd try) The changes in: test/CodeGen/X86/machine-cp.ll are just due to scheduling differences after some logic instructions were reassociated. llvm-svn: 247516
* [X86] Added i1 vector sextload testsSimon Pilgrim2015-09-121-0/+1257
| | | | llvm-svn: 247509
* [X86][FMA] Refreshed fma testsSimon Pilgrim2015-09-122-152/+228
| | | | llvm-svn: 247508
* revert r247506; need to verify changes in existing testsSanjay Patel2015-09-121-68/+0
| | | | llvm-svn: 247507
* [x86] enable machine combiner reassociations for 128-bit vector logical ↵Sanjay Patel2015-09-121-0/+68
| | | | | | integer insts llvm-svn: 247506
* [X86][SSE] Use general sext IR for (v)pmovsx stack folding testsSimon Pilgrim2015-09-122-37/+37
| | | | llvm-svn: 247502
* Use function attribute "stackrealign" to decide whether stackAkira Hatanaka2015-09-1110-11/+11
| | | | | | | | | | | | | | | | | realignment should be forced. With this commit, we can now force stack realignment when doing LTO and do so on a per-function basis. Also, add a new cl::opt option "stackrealign" to CommandFlags.h which is used to force stack realignment via llc's command line. Out-of-tree projects currently using -force-align-stack to force stack realignment should make changes to attach the attribute to the functions in the IR. Differential Revision: http://reviews.llvm.org/D11814 llvm-svn: 247450
* [X86] Make sure startproc/endproc are pairedDavid Majnemer2015-09-111-0/+37
| | | | | | | | | | We used different conditions to determine if we should emit startproc vs endproc. Use the same condition to ensure that they will always be paired. This fixes PR24374. llvm-svn: 247435
* [IR] Print the label operands of a catchpad like an invokeReid Kleckner2015-09-111-3/+6
| | | | | | | | | | | | | The rest of the EH pads are fine, since they have at most one label and take fewer operands for the personality. Old catchpad vs. new: %5 = catchpad [i8* bitcast (i32 ()* @"\01?filt$0@0@main@@" to i8*)] to label %__except.ret.10 unwind label %catchendblock.9 ----- %5 = catchpad [i8* bitcast (i32 ()* @"\01?filt$0@0@main@@" to i8*)] to label %__except.ret.10 unwind label %catchendblock.9 llvm-svn: 247433
* [opaque pointer type] Add textual IR support for explicit type parameter for ↵David Blaikie2015-09-1120-59/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | global aliases update.py: import fileinput import sys import re alias_match_prefix = r"(.*(?:=|:|^)\s*(?:external |)(?:(?:private|internal|linkonce|linkonce_odr|weak|weak_odr|common|appending|extern_weak|available_externally) )?(?:default |hidden |protected )?(?:dllimport |dllexport )?(?:unnamed_addr |)(?:thread_local(?:\([a-z]*\))? )?alias" plain = re.compile(alias_match_prefix + r" (.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|addrspacecast|\[\[[a-zA-Z]|\{\{).*$)") cast = re.compile(alias_match_prefix + r") ((?:bitcast|inttoptr|addrspacecast)\s*\(.* to (.*?)(| addrspace\(\d+\) *)\*\)\s*(?:;.*)?$)") gep = re.compile(alias_match_prefix + r") ((?:getelementptr)\s*(?:inbounds)?\s*\((?P<type>.*), (?P=type)(?:\s*addrspace\(\d+\)\s*)?\* .*\)\s*(?:;.*)?$)") def conv(line): m = re.match(cast, line) if m: return m.group(1) + " " + m.group(3) + ", " + m.group(2) m = re.match(gep, line) if m: return m.group(1) + " " + m.group(3) + ", " + m.group(2) m = re.match(plain, line) if m: return m.group(1) + ", " + m.group(2) + m.group(3) + "*" + m.group(4) + "\n" return line for line in sys.stdin: sys.stdout.write(conv(line)) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name *.ll | xargs ./apply.sh From llvm/src/tools/clang: find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll | xargs ./apply.sh llvm-svn: 247378
* [WinEH] Push and pop EBP for 32-bit funcletsReid Kleckner2015-09-103-1/+14
| | | | | | | | The Win32 EH runtime caller does not preserve EBP, even though it does preserve the CSRs (EBX, ESI, EDI) for us. The result was that each finally funclet call would leave the frame pointer off by 12 bytes. llvm-svn: 247348
* [SPARC] Switch to the Machine Scheduler.James Y Knight2015-09-106-55/+53
| | | | | | | | | | | | | The (mostly-deprecated) SelectionDAG-based ILPListDAGScheduler scheduler was making poor scheduling decisions, causing high register pressure and extraneous register spills. Switching to the newer machine scheduler generates better code -- even without there being a machine model defined for SPARC yet. (Actually committing the test changes too, this time, unlike r247315) llvm-svn: 247343
* Fix SEH state numbering algorithm to handle cleanupendpadsReid Kleckner2015-09-102-0/+219
| | | | | | | WinEHPrepare's new coloring algorithm really expects to see cleanupendpads now, so Clang will start emitting them soon. llvm-svn: 247341
* Tidy up some alias syntax to make explicit pointer type migration easierDavid Blaikie2015-09-101-1/+1
| | | | llvm-svn: 247312
* [WinEH] Fix single-block cleanup coloringJoseph Tremoulet2015-09-101-9/+45
| | | | | | | | | | | | | | | | | | Summary: The coloring code in WinEHPrepare queues cleanuprets' successors with the correct color (the parent one) when it sees their cleanuppad, and so later when iterating successors knows to skip processing cleanuprets since they've already been queued. This latter check was incorrectly under an 'else' condition and so inadvertently was not kicking in for single-block cleanups. This change sinks the check out of the 'else' to fix the bug. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12751 llvm-svn: 247299
* Fix PR 24724 - The implicit register verifier shouldn't assume certain operandAlex Lorenz2015-09-104-2/+49
| | | | | | | | | | order. The implicit register verifier in the MIR parser should only check if the instruction's default implicit operands are present in the instruction. It should not check the order in which they occur. llvm-svn: 247283
* AVX512: Implemented encoding and intrinsics forIgor Breger2015-09-106-29/+132
| | | | | | | vextracti64x4, vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11802 llvm-svn: 247276
* [DAGCombine] Truncate BUILD_VECTOR operators if necessary when constant ↵Silviu Baranga2015-09-101-0/+12
| | | | | | | | | | | | | | | | | | | | | folding vectors Summary: The BUILD_VECTOR node will truncate its operators to match the type. We need to take this into account when constant folding - we need to perform a truncation before constant folding the elements. This is because the upper bits can change the result, depending on the operation type (for example this is the case for min/max). This change also adds a regression test. Reviewers: jmolloy Subscribers: jmolloy, llvm-commits Differential Revision: http://reviews.llvm.org/D12697 llvm-svn: 247265
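Why the truncation has to come before the fold can be illustrated with a minimal sketch. It is plain Python, not LLVM code; the helper names and the i32/i8 operand values are hypothetical, chosen only to show that the upper bits can change an unsigned-min result.

```python
def truncate(value, bits):
    """Keep only the low `bits` bits, as an implicit BUILD_VECTOR truncation would."""
    return value & ((1 << bits) - 1)


def fold_umin_without_truncation(a, b, bits):
    # Fold on the wide operands and truncate the result afterwards
    # (the order that lets the upper bits influence the answer).
    return truncate(min(a, b), bits)


def fold_umin_with_truncation(a, b, bits):
    # Truncate each operand to the element width first, then fold.
    return min(truncate(a, bits), truncate(b, bits))


# Hypothetical wide operands feeding an 8-bit vector element.
wide_a, wide_b = 0x100, 0x01
wrong = fold_umin_without_truncation(wide_a, wide_b, 8)
right = fold_umin_with_truncation(wide_a, wide_b, 8)
print(wrong, right)
```

The two orders disagree here because 0x100 truncates to 0, which is smaller than 1, while the untruncated 0x100 is larger than 1.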
* [ARM] Do not use vtrn for vectorshuffle if the order is reversedJames Molloy2015-09-103-0/+50
| | | | | | | | The tests in isVTRNMask and isVTRN_v_undef_Mask should also check that the elements of the upper and lower half of the vectorshuffle occur in the correct order when both halves are used. Without this test the code assumes that it is correct to use vector transpose (vtrn) for the masks <1, 1, 0, 0> and <1, 3, 0, 2>, among others, but the transpose actually incorrectly generates shuffles for <0, 0, 1, 1> and <0, 2, 1, 3> in this case. Patch by Jeroen Ketema! llvm-svn: 247254
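The masks quoted in this message can be reproduced with a small model of the transpose. This is a sketch in plain Python, not the ARM backend code; `vtrn` and `vtrn_concat_mask` are hypothetical helpers that model the two results of a vector transpose and concatenate them, which is the "both halves used" case the message describes.

```python
def vtrn(a, b):
    """Model a NEON-style transpose of two equal-length vectors.

    The first result interleaves the even-indexed elements of a and b,
    the second result interleaves the odd-indexed elements.
    """
    assert len(a) == len(b) and len(a) % 2 == 0
    first, second = [], []
    for i in range(0, len(a), 2):
        first += [a[i], b[i]]
        second += [a[i + 1], b[i + 1]]
    return first, second


def vtrn_concat_mask(n, second_is_undef=False):
    """Shuffle mask that selects both transpose results, lower half first."""
    v1 = list(range(n))
    # With an undef second operand, both inputs refer to the first vector.
    v2 = v1 if second_is_undef else list(range(n, 2 * n))
    first, second = vtrn(v1, v2)
    return first + second


print(vtrn_concat_mask(2))                        # mask the transpose really produces
print(vtrn_concat_mask(2, second_is_undef=True))  # same, for the v_undef variant
```

Under this model, a mask such as <1, 3, 0, 2> contains the same two halves in the opposite order, so a check that ignores the order of the halves would accept it even though the transpose yields <0, 2, 1, 3>.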
* Enable the shrink wrapping optimization for PPC64.Kit Barton2015-09-101-0/+556
| | | | | | | | | | | | | | The changes in this patch are as follows: 1. Modify the emitPrologue and emitEpilogue methods to work properly when the prologue and epilogue blocks are not the first/last blocks in the function 2. Fix a bug in PPCEarlyReturn optimization caused by an empty entry block in the function 3. Override the runShrinkWrap PredicateFtor (defined in TargetMachine) to check whether shrink wrapping should run: Shrink wrapping will run on PPC64 (Little Endian and Big Endian) unless -enable-shrink-wrap=false is specified on command line A new test case, ppc-shrink-wrapping.ll was created based on the existing shrink wrapping tests for x86, arm, and arm64. Phabricator review: http://reviews.llvm.org/D11817 llvm-svn: 247237
* [AArch64] Match FI+offset in STNP addressing mode.Ahmed Bougacha2015-09-101-6/+2
| | | | | | | | | | | | | | | First, we need to teach isFrameOffsetLegal about STNP. It already knew about the STP/LDP variants, but those were probably never exercised, because it's only the load/store optimizer that generates STP/LDP, and the only user of the method is frame lowering, which runs earlier. The STP/LDP cases were wrong: they didn't take into account the fact that they return two results, not one, so the immediate offset will be the 4th operand, not the 3rd. Follow-up to r247234. llvm-svn: 247236
* [AArch64] Match base+offset in STNP addressing mode.Ahmed Bougacha2015-09-101-14/+165
| | | | | | Followup to r247231. llvm-svn: 247234
* [AArch64] Support selecting STNP.Ahmed Bougacha2015-09-101-0/+192
| | | | | | | | | | | | | | | | | | We could go through the load/store optimizer and match STNP where we would have matched a nontemporal-annotated STP, but that's not reliable enough, as an opportunistic optimization. Instead, we can guarantee emitting STNP, by matching them at ISel. Since there are no single-input nontemporal stores, we have to resort to some high-bits-extracting trickery to generate an STNP from a plain store. Also, we need to support another, LDP/STP-specific addressing mode, base + signed scaled 7-bit immediate offset. For now, only match the base. Let's make it smart separately. Part of PR24086. llvm-svn: 247231
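The two ideas in this message, splitting one stored value into two halves and the scaled 7-bit offset, can be sketched as follows. This is plain Python, not the AArch64 backend; the function names are hypothetical, and the offset check assumes the immediate is a signed 7-bit value scaled by the size of one stored register.

```python
def split_halves(value, total_bits):
    """Split a value into (low, high) halves, as a pair store needs two inputs."""
    half = total_bits // 2
    mask = (1 << half) - 1
    return value & mask, (value >> half) & mask


def is_legal_pair_offset(byte_offset, access_size):
    """Check a byte offset against a signed, scaled 7-bit immediate.

    access_size is the size in bytes of one of the two stored registers.
    """
    if byte_offset % access_size != 0:
        return False  # the offset must be a multiple of the access size
    scaled = byte_offset // access_size
    return -64 <= scaled <= 63


low, high = split_halves(0x0123456789ABCDEF_FEDCBA9876543210, 128)
print(hex(low), hex(high))
print(is_legal_pair_offset(504, 8), is_legal_pair_offset(512, 8))
```

For 8-byte registers this gives a reachable range of -512 to 504 bytes in steps of 8.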
* [WinEH] Add codegen support for cleanuppad and cleanupretReid Kleckner2015-09-102-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | All of the complexity is in cleanupret, and it mostly follows the same codepaths as catchret, except it doesn't take a return value in RAX. This small example now compiles and executes successfully on win32: extern "C" int printf(const char *, ...) noexcept; struct Dtor { ~Dtor() { printf("~Dtor\n"); } }; void has_cleanup() { Dtor o; throw 42; } int main() { try { has_cleanup(); } catch (int) { printf("caught it\n"); } } Don't try to put the cleanup in the same function as the catch, or Bad Things will happen. llvm-svn: 247219
* [SEH] Emit 32-bit SEH tables for the new EH IRReid Kleckner2015-09-091-0/+219
| | | | | | | | | | | The 32-bit tables don't actually contain PC range data, so emitting them is incredibly simple. The 64-bit tables, on the other hand, use the same table for state numbering as well as label ranges. This makes things more difficult, so it will be implemented later. llvm-svn: 247192
* [WebAssembly] Update target datalayout strings.Dan Gohman2015-09-0910-10/+10
| | | | llvm-svn: 247187
* Revert "AVX512: Implemented encoding and intrinsics for vextracti64x4 ↵Renato Golin2015-09-096-132/+29
| | | | | | | | ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding." This reverts commit r247149, as it was breaking numerous buildbots of varied architectures. llvm-svn: 247177
* [WebAssembly] Implement calls with void return types.Dan Gohman2015-09-091-0/+10
| | | | llvm-svn: 247158
* AMDGPU/SI: Fold operands through REG_SEQUENCE instructionsTom Stellard2015-09-093-14/+6
| | | | | | | | | | | | | | Summary: This helps mostly when we use add instructions for address calculations that contain immediates. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12256 llvm-svn: 247157
* AVX512: Implemented encoding and intrinsics forIgor Breger2015-09-096-29/+132
| | | | | | | vextracti64x4, vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11802 llvm-svn: 247149
* Fix PR 24633 - Handle undef values when parsing standalone constants.Alex Lorenz2015-09-091-0/+16
| | | | llvm-svn: 247145
* Fix vector splitting for extract_vector_elt and vector elements of <8-bits.Daniel Sanders2015-09-091-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: One of the vector splitting paths for extract_vector_elt tries to lower: define i1 @via_stack_bug(i8 signext %idx) { %1 = extractelement <2 x i1> <i1 false, i1 true>, i8 %idx ret i1 %1 } to: define i1 @via_stack_bug(i8 signext %idx) { %base = alloca <2 x i1> store <2 x i1> <i1 false, i1 true>, <2 x i1>* %base %2 = getelementptr <2 x i1>, <2 x i1>* %base, i32 %idx %3 = load i1, i1* %2 ret i1 %3 } However, the elements of <2 x i1> are not byte-addressable. The result of this is that the getelementptr expands to '%base + %idx * (1 / 8)' which simplifies to '%base + %idx * 0', and then simply '%base' causing all values of %idx to extract element zero. This commit fixes this by promoting the vector elements of <8-bits to i8 before splitting the vector. This fixes a number of test failures in pocl. Reviewers: pekka.jaaskelainen Subscribers: pekka.jaaskelainen, llvm-commits Differential Revision: http://reviews.llvm.org/D12591 llvm-svn: 247128
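The address arithmetic described above can be sketched in a few lines. This is plain Python modelling the integer math only, not the legalizer itself; both helper names are hypothetical.

```python
def element_offset_bytes(index, element_bits):
    """Byte offset of an element when the stride is computed as bits // 8."""
    stride = element_bits // 8  # 1 // 8 == 0 for i1 elements
    return index * stride


def promoted_element_offset_bytes(index, element_bits):
    """Same computation after promoting sub-byte elements to 8 bits."""
    promoted_bits = max(element_bits, 8)
    return index * (promoted_bits // 8)


for idx in range(2):
    print(idx, element_offset_bytes(idx, 1), promoted_element_offset_bytes(idx, 1))
```

With i1 elements every index maps to offset 0 before promotion, which is why all values of %idx extracted element zero; after promotion each index gets its own byte.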
* [WebAssembly] Fix lowering of calls with more than one argument.Dan Gohman2015-09-091-0/+13
| | | | llvm-svn: 247118
* SelectionDAG: Support Expand of f16 extloadsMatt Arsenault2015-09-091-0/+81
| | | | | | | | | | Currently this hits an assert that extload should always be supported, which assumes integer extloads. This moves a hack out of SI's argument lowering and is covered by existing tests. llvm-svn: 247113
* [WebAssembly] Implement WebAssemblyInstrInfo::copyPhysRegDan Gohman2015-09-091-0/+26
| | | | llvm-svn: 247110
* [WinEH] Avoid creating MBBs for LLVM BBs that cannot contain codeReid Kleckner2015-09-081-6/+0
| | | | | | | | | | | | | | Typically these are catchpads, which hold data used to decide whether to catch the exception or continue unwinding. We also shouldn't create MBBs for catchendpads, cleanupendpads, or terminatepads, since no real code can live in them. This fixes a problem where MI passes (like the register allocator) would try to put code into catchpad blocks, which are not executed by the runtime. In the new world, blocks ending in invokes now have many possible successors. llvm-svn: 247102
* [WinEH] Emit prologues and epilogues for funcletsReid Kleckner2015-09-082-14/+190
| | | | | | | | | | | | | | | | | | | | | | | Summary: 32-bit funclets have short prologues that allocate enough stack for the largest call in the whole function. The runtime saves CSRs for the funclet. It doesn't restore CSRs after we finally transfer control back to the parent function via a CATCHRET, but that's a separate issue. 32-bit funclets also have to adjust the incoming EBP value, which is what llvm.x86.seh.recoverframe does in the old model. 64-bit funclets need to spill CSRs as normal. For simplicity, this just spills the same set of CSRs as the parent function, rather than trying to compute different CSR sets for the parent function and each funclet. 64-bit funclets also allocate enough stack space for the largest outgoing call frame, like 32-bit. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12546 llvm-svn: 247092
* Fix the PPC CTR Loop pass to look for calls to the intrinsics thatEric Christopher2015-09-081-0/+349
| | | | | | read CTR and count them as reading the CTR. llvm-svn: 247083
* Fix comments and RUN line in x86-64 stdarg test leftover from last commitDerek Schuff2015-09-081-2/+2
| | | | | | From http://reviews.llvm.org/D12346 llvm-svn: 247070
* x32. Fixes a bug in how struct va_list is initialized in x32Derek Schuff2015-09-084-2/+153
| | | | | | | | | | | | | Summary: This patch modifies X86TargetLowering::LowerVASTART so that struct va_list is initialized with 32 bit pointers in x32. It also includes tests that call @llvm.va_start() for x32. Patch by João Porto Subscribers: llvm-commits, hjl.tools Differential Revision: http://reviews.llvm.org/D12346 llvm-svn: 247069
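The effect of pointer width on the va_list layout can be sketched with Python's struct module. This is an illustration only, assuming the System V x86-64 va_list fields (gp_offset, fp_offset, overflow_arg_area, reg_save_area); the format strings and constant names are hypothetical stand-ins for the two ABIs.

```python
import struct

# Two 32-bit offsets followed by two pointers.
# LP64: pointers are 8 bytes ("Q"); x32: pointers are 4 bytes ("I").
VA_LIST_LP64 = "<IIQQ"  # gp_offset, fp_offset, overflow_arg_area, reg_save_area
VA_LIST_X32 = "<IIII"

lp64_size = struct.calcsize(VA_LIST_LP64)
x32_size = struct.calcsize(VA_LIST_X32)
print(lp64_size, x32_size)

# Offset of reg_save_area in each layout: everything that precedes it.
lp64_reg_save_offset = struct.calcsize("<IIQ")
x32_reg_save_offset = struct.calcsize("<III")
print(lp64_reg_save_offset, x32_reg_save_offset)
```

Initializing the structure with 64-bit pointer stores on x32 would therefore write past the 4-byte pointer fields, which is the kind of mismatch the patch addresses.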
* x32. Fixes a bug in i8mem_NOREX declaration.Derek Schuff2015-09-084-8/+20
| | | | | | | | | | | | The old implementation assumed LP64 which is broken for x32. Specifically, the MOVE8rm_NOREX and MOVE8mr_NOREX, when selected, would cause a 'Cannot emit physreg copy instruction' error message to be reported. This patch also enables the h-register*ll tests for x32. Differential Revision: http://reviews.llvm.org/D12336 Patch by João Porto llvm-svn: 247058
* AMDGPU: Handle sub of constant for DS offset foldingMatt Arsenault2015-09-081-0/+125
| | | | | | | sub C, x -> add (sub 0, x), C for DS offsets. This is mostly to fix regressions that show up when SeparateConstOffsetFromGEP is enabled. llvm-svn: 247054
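That the rewrite preserves the computed address can be checked with wrapping 32-bit arithmetic. This is a sketch in plain Python, not the AMDGPU selection code; the helper names and the sample constants are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sub32(a, b):
    """32-bit wrapping subtraction."""
    return (a - b) & MASK32


def add32(a, b):
    """32-bit wrapping addition."""
    return (a + b) & MASK32


def original(c, x):
    return sub32(c, x)  # sub C, x


def rewritten(c, x):
    return add32(sub32(0, x), c)  # add (sub 0, x), C


samples = [(0, 0), (16, 4), (4, 16), (0xFFFFFFFF, 1), (1, 0xFFFFFFFF)]
all_equal = all(original(c, x) == rewritten(c, x) for c, x in samples)
print(all_equal)
```

After the rewrite the constant C appears as the second operand of an add, which is the form an offset-folding pattern can pick up as an immediate.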
* Fix CPP Backend for GEP API changes for opaque pointer typesDavid Blaikie2015-09-081-0/+10
| | | | | | Based on a patch by Jerome Witmann. llvm-svn: 247047
* WebAssembly: NFC rename shr/sarJF Bastien2015-09-083-6/+6
| | | | | | Renamed from: https://github.com/WebAssembly/design/pull/332 llvm-svn: 247028
* [WebAssembly] Temporarily disable this test, as it depends on additional ↵Dan Gohman2015-09-081-0/+3
| | | | | | patches that aren't yet checked in. llvm-svn: 247011
* AVX512: kunpck encoding implementation Igor Breger2015-09-081-1/+26
| | | | | | | | Added tests for encoding. Differential Revision: http://reviews.llvm.org/D12061 llvm-svn: 247010
* [WebAssembly] Enable SSA lowering and other pre-regalloc passesDan Gohman2015-09-081-0/+22
| | | | llvm-svn: 247008
* [mips] Reserve address spaces 1-255 for software use.Daniel Sanders2015-09-081-0/+12
| | | | | | | | | | | | Summary: And define them to have noop casts with address spaces 0-255. Reviewers: pekka.jaaskelainen Subscribers: pekka.jaaskelainen, llvm-commits Differential Revision: http://reviews.llvm.org/D12678 llvm-svn: 246990
* AVX-512: Lowering for 512-bit vector shuffles.Elena Demikhovsky2015-09-083-506/+367
| | | | | | | | Vector types: <8 x 64>, <16 x 32>, <32 x 16> float and integer. Differential Revision: http://reviews.llvm.org/D10683 llvm-svn: 246981
* [X86][MMX] Added missing stack folding tests for MMX/SSSE3 instructionsSimon Pilgrim2015-09-061-2/+146
| | | | llvm-svn: 246949