path: root/llvm/test/CodeGen
Commit log (each entry: commit message, author, date, files changed, lines -/+):
* [opaque pointer type] Add textual IR support for explicit type parameter to the call instruction (David Blaikie, 2015-04-16; 358 files, -1021/+1021)

  See r230786 and r230794 for similar changes to gep and load respectively.

  Call is a bit different because it often doesn't have a single explicit type - usually the type is deduced from the arguments, and just the return type is explicit. In those cases there's no need to change the IR.

  When that's not the case, the IR usually contains the pointer type of the first operand - but since typed pointers are going away, that representation is insufficient so I'm just stripping the "pointerness" of the explicit type away.

  This does make the IR a bit weird - it /sort of/ reads like the type of the first operand: "call void () %x(" but %x is actually of type "void ()*" and will eventually be just of type "ptr". But this seems not too bad and I don't think it would benefit from repeating the type ("void (), void () * %x(" and then eventually "void (), ptr %x(") as has been done with gep and load.

  This also has a side benefit: since the explicit type is no longer a pointer, there's no ambiguity between an explicit type and a function that returns a function pointer. Previously this case needed an explicit type (eg: a function returning a void() function was written as "call void () () * @x(" rather than "call void () * @x(" because of the ambiguity between a function returning a pointer to a void() function and a function returning void). No ambiguity means even function pointer return types can just be written alone, without writing the whole function's type.

  This leaves /only/ the varargs case where the explicit type is required.

  Given the special type syntax in call instructions, the regex-fu used for migration was a bit more involved in its own unique way (as every one of these is), so here it is. Use it in conjunction with the apply.sh script and associated find/xargs commands I've provided in r230786 to migrate your out of tree tests. Do let me know if any of this doesn't cover your cases & we can iterate on a more general script/regexes to help others with out of tree tests.

  About 9 test cases couldn't be automatically migrated - half of those were functions returning function pointers, where I just had to manually delete the function argument types now that we didn't need an explicit function type there. The other half were typedefs of function types used in calls - just had to manually drop the * from those.

    import fileinput
    import sys
    import re

    pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)')
    addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$")
    func_end = re.compile("(?:void.*|\)\s*)\*$")

    def conv(match, line):
      if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)):
        return line
      return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():]

    for line in sys.stdin:
      sys.stdout.write(conv(re.search(pat, line), line))

  llvm-svn: 235145
* Disable AArch64 fast-isel on big-endian call vector returns. (Pete Cooper, 2015-04-16; 1 file, -0/+172)
  A big-endian vector return needs a byte-swap which we aren't doing right now. For now just bail on these cases to get correctness back.
  llvm-svn: 235133
* [WinEH] Handle a landingpad, resume, and cleanup all rolled into a BB (Reid Kleckner, 2015-04-16; 1 file, -0/+35)
  This happens a lot with simple cleanups after SimplifyCFG.
  llvm-svn: 235117
* Revert the switch lowering change (r235101, r235103, r235106) (Hans Wennborg, 2015-04-16; 8 files, -357/+58)
  Looks like it broke the sanitizer-ppc64-linux1 build. Reverting for now.
  llvm-svn: 235108
* Add a triple to switch.ll test. (Hans Wennborg, 2015-04-16; 1 file, -2/+2)
  llvm-svn: 235103
* Switch lowering: extract jump tables and bit tests before building binary tree (PR22262) (Hans Wennborg, 2015-04-16; 8 files, -58/+357)

  This is a major rewrite of the SelectionDAG switch lowering. The previous code would lower switches as a binary tree, discovering clusters of cases suitable for lowering by jump tables or bit tests as it went along. To increase the likelihood of finding jump tables, the binary tree pivot was selected to maximize case density on both sides of the pivot. By not selecting the pivot in the middle, the binary trees would not always be balanced, leading to performance problems in the generated code.

  This patch rewrites the lowering to search for clusters of cases suitable for jump tables or bit tests first, and then builds the binary tree around those clusters. This way, the binary tree will always be balanced.

  This has the added benefit of decoupling the different aspects of the lowering: tree building and jump table or bit test finding are now easier to tweak separately. For example, this will enable us to balance the tree based on profile info in the future.

  The algorithm for finding jump tables is O(n^2), whereas the previous algorithm was O(n log n) for common cases, and quadratic only in the worst case. This doesn't seem to be a major problem in practice; e.g. compiling a file consisting of a 10k-case switch was only 30% slower, and such large switches should be rare in practice. Compiling e.g. gcc.c showed no compile-time difference. If this does turn out to be a problem, we could limit the search space of the algorithm.

  This commit also disables all optimizations during switch lowering in -O0.

  Differential Revision: http://reviews.llvm.org/D8649
  llvm-svn: 235101
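  A minimal sketch of the kind of dense switch the new lowering would pick out as a jump-table cluster before building the binary tree (hypothetical test input, not taken from the commit):

    define i32 @dense_switch(i32 %x) {
    entry:
      ; contiguous cases 0..5 are a natural jump-table candidate
      switch i32 %x, label %default [
        i32 0, label %bb0
        i32 1, label %bb1
        i32 2, label %bb0
        i32 3, label %bb1
        i32 4, label %bb0
        i32 5, label %bb1
      ]
    bb0:
      ret i32 10
    bb1:
      ret i32 20
    default:
      ret i32 0
    }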
* TRUNCATE constant folding - minor fix for rL233224 (Simon Pilgrim, 2015-04-16; 1 file, -0/+21)
  Fix for a test case found by James Molloy - TRUNCATE of constant build vectors can be achieved more simply by replacing them with a new build vector node of the truncated value type - there is no need to touch the scalar operands at all.
  llvm-svn: 235079
* [CodeGen] Re-apply r234809 (concat of scalars), with an x86_mmx fix. (Ahmed Bougacha, 2015-04-16; 2 files, -0/+144)

  The only type that isn't an integer, isn't floating point, and isn't a vector; ladies and gentlemen, the gift that keeps on giving: x86_mmx!

  Fixes PR23246.

  Original message (reverted in r235062):
    [CodeGen] Combine concat_vectors of scalars into build_vector.

    Combine something like:
      (v8i8 concat_vectors (v2i8 bitcast (i16)) x4)
    into:
      (v8i8 (bitcast (v4i16 BUILD_VECTOR (i16) x4)))

    If any of the scalars are floating point, use that throughout.

  Differential Revision: http://reviews.llvm.org/D8948
  llvm-svn: 235072
* Revert r234809 because it caused PR23246. (Nick Lewycky, 2015-04-16; 1 file, -125/+0)
  llvm-svn: 235062
* [SEH] Deal with users of the old lpad for SEH catch-all blocks (Reid Kleckner, 2015-04-16; 1 file, -0/+59)
  The way we split SEH catch-all blocks can leave some dead EH values behind at -O0. Try to remove them, and if we fail, replace them all with undef.
  Fixes a crash when removing the old unreachable landingpad which is still used by extractvalue instructions in the catch-all block.
  llvm-svn: 235061
* DebugInfo: Remove 'inlinedAt:' field from MDLocalVariable (Duncan P. N. Exon Smith, 2015-04-15; 4 files, -18/+18)

  Remove 'inlinedAt:' from MDLocalVariable. Besides saving some memory (variables with it seem to be the single largest `Metadata` contributor to memory usage right now in -g -flto builds), this stops optimization and backend passes from having to change local variables.

  The 'inlinedAt:' field was used by the backend in two ways:
    1. To tell the backend whether and into what a variable was inlined.
    2. To create a unique id for each inlined variable.

  Instead, rely on the 'inlinedAt:' field of the intrinsic's `!dbg` attachment, and change the DWARF backend to use a typedef called `InlinedVariable` which is `std::pair<MDLocalVariable*, MDLocation*>`. This `DebugLoc` is already passed reliably through the backend (as verified by r234021).

  This commit removes the check from r234021, but I added a new check (that will survive) in r235048, and changed the `DIBuilder` API in r235041 to require a `!dbg` attachment whose 'scope:' is in the same `MDSubprogram` as the variable's.

  If this breaks your out-of-tree testcases, perhaps the script I used (mdlocalvariable-drop-inlinedat.sh) will help; I'll attach it to PR22778 in a moment.

  llvm-svn: 235050
* DebugInfo: Add missing !dbg attachments to intrinsics (Duncan P. N. Exon Smith, 2015-04-15; 16 files, -26/+26)
  Add missing `!dbg` attachments to `@llvm.dbg.*` intrinsics. I updated these using a script (add-dbg-to-intrinsics.sh) that I'll attach to PR22778 for posterity.
  llvm-svn: 235040
* [WinEH] Try to make the MachineFunction CFG more accurate (Reid Kleckner, 2015-04-15; 1 file, -0/+6)
  This avoids emitting code for unreachable landingpad blocks that contain calls to llvm.eh.actions and indirectbr.
  It's also a first step towards unifying the SEH and WinEH lowering codepaths. I'm keeping the old fan-in lowering of SEH around until the preparation version works well enough that we can switch over without breaking existing users.
  llvm-svn: 235037
* Reland "[WinEH] Use the parent function when computing frameescape labels"Reid Kleckner2015-04-151-0/+163
| | | | | | Fixed the test by removing extraneous quotes. llvm-svn: 235028
* Revert "[WinEH] Use the parent function when computing frameescape labels"Reid Kleckner2015-04-151-163/+0
| | | | | | This reverts commit r235025. The test isn't passing yet. llvm-svn: 235027
* [WinEH] Use the parent function when computing frameescape labels (Reid Kleckner, 2015-04-15; 1 file, -0/+163)
  Fixes assertions in MC when a local label wasn't defined.
  llvm-svn: 235025
* Update tests to not be as dependent on section numbers. (Rafael Espindola, 2015-04-15; 4 files, -9/+4)
  Many of these predate llvm-readobj. With elf-dump we had to match a relocation to symbol number and symbol number to symbol name or section number.
  llvm-svn: 235015
* [X86] add an exedepfix entry for movq == movlps == movlpd (Sanjay Patel, 2015-04-15; 12 files, -18/+93)
  This is a 1-line patch (with a TODO for AVX because that will affect even more regression tests) that lets us substitute the appropriate 64-bit store for the float/double/int domains.
  It's not clear to me exactly what the difference is between the 0xD6 (MOVPQI2QImr) and 0x7E (MOVSDto64mr) opcodes, but this is apparently the right choice.
  Differential Revision: http://reviews.llvm.org/D8691
  llvm-svn: 235014
* [x86] Implement combineRepeatedFPDivisors (Sanjay Patel, 2015-04-15; 1 file, -0/+31)

  Set the transform bar at 2 divisions because the fastest current x86 FP divider circuit is in SandyBridge / Haswell at 10 cycle latency (best case) relative to a 5 cycle multiplier. So that's the worst case for this transform (no latency win), but multiplies are obviously pipelined while divisions are not, so there's still a big throughput win which we would expect to show up in typical FP code.

  These are the sequences I'm comparing:

    divss %xmm2, %xmm0
    mulss %xmm1, %xmm0
    divss %xmm2, %xmm0

  Becomes:

    movss LCPI0_0(%rip), %xmm3 ## xmm3 = mem[0],zero,zero,zero
    divss %xmm2, %xmm3
    mulss %xmm3, %xmm0
    mulss %xmm1, %xmm0
    mulss %xmm3, %xmm0

  [Ignore for the moment that we don't optimize the chain of 3 multiplies into 2 independent fmuls followed by 1 dependent fmul... this is the DAG version of: https://llvm.org/bugs/show_bug.cgi?id=21768 ... if we fix that, then the transform becomes even more profitable on all targets.]

  Differential Revision: http://reviews.llvm.org/D8941
  llvm-svn: 235012
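  A minimal sketch of IR that this combine targets (hypothetical test input; it assumes the relaxed FP conditions the combine requires are in effect, so both divisions by %z may be rewritten as multiplies by a single reciprocal):

    define float @repeated_divisor(float %x, float %y, float %z) {
      ; two divisions share the divisor %z
      %div0 = fdiv fast float %x, %z
      %div1 = fdiv fast float %y, %z
      %sum = fadd fast float %div0, %div1
      ret float %sum
    }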
* Re-apply r234898 and fix tests. (Daniel Jasper, 2015-04-15; 3 files, -4/+4)

  This commit makes LLVM not estimate branch probabilities when doing a single-bit bitmask test. The code that originally made me discover this is:

    if ((a & 0x1) == 0x1) {
      ..
    }

  In this case we don't actually have any branch probability information and should not assume to have any. LLVM transforms this into:

    %and = and i32 %a, 1
    %tobool = icmp eq i32 %and, 0

  So, in this case, the result of a bitwise and is compared against 0, but nevertheless, we should not assume to have probability information.

  CodeGen/ARM/2013-10-11-select-stalls.ll started failing because the changed probabilities changed the results of ARMBaseInstrInfo::isProfitableToIfCvt() and led to an Ifcvt of the diamond in the test. AFAICT, the test was never meant to test this and thus changing the test input slightly to not change the probabilities seems like the best way to preserve the meaning of the test.

  llvm-svn: 234979
* [WinEH] Avoid emitting xdata tables twice for cleanups (Reid Kleckner, 2015-04-14; 1 file, -0/+20)
  Since adding invokes of llvm.donothing to cleanups, we come here now, and trivial EH cleanup usage from clang fails to compile.
  llvm-svn: 234948
* Revert "The code that originally made me discover this is:"Rafael Espindola2015-04-142-3/+3
| | | | | | | This reverts commit r234898. CodeGen/ARM/2013-10-11-select-stalls.ll was faling. llvm-svn: 234903
* Change the testcase mtriple to x86_64-unknown-unknown (Krzysztof Parzyszek, 2015-04-14; 1 file, -1/+1)
  llvm-svn: 234900
* The code that originally made me discover this is: (Daniel Jasper, 2015-04-14; 2 files, -3/+3)

    if ((a & 0x1) == 0x1) {
      ..
    }

  In this case we don't actually have any branch probability information and should not assume to have any. LLVM transforms this into:

    %and = and i32 %a, 1
    %tobool = icmp eq i32 %and, 0

  So, in this case, the result of a bitwise and is compared against 0, but nevertheless, we should not assume to have probability information.

  llvm-svn: 234898
* R600/SI: Fix verifier error caused by SIAnnotateControlFlow (Tom Stellard, 2015-04-14; 1 file, -0/+25)
  This pass will always try to insert llvm.SI.ifbreak intrinsics in the same block that its conditional value is computed in. This is a problem when conditions for breaks or continue are computed outside of the loop, because the llvm.SI.ifbreak intrinsic ends up being inserted outside of the loop.
  This patch fixes this problem by inserting the llvm.SI.ifbreak intrinsics in the loop header when the condition is computed outside the loop.
  llvm-svn: 234891
* [CodeGen] Combine concat_vectors of scalars into build_vector. (Ahmed Bougacha, 2015-04-13; 1 file, -0/+125)
  Combine something like:
    (v8i8 concat_vectors (v2i8 bitcast (i16)) x4)
  into:
    (v8i8 (bitcast (v4i16 BUILD_VECTOR (i16) x4)))
  If any of the scalars are floating point, use that throughout.
  Differential Revision: http://reviews.llvm.org/D8948
  llvm-svn: 234809
* Settle on a specific triple for the aarch64 testcase (Krzysztof Parzyszek, 2015-04-13; 1 file, -1/+1)
  llvm-svn: 234801
* Also add mtriple to the aarch64 testcase (Krzysztof Parzyszek, 2015-04-13; 1 file, -1/+1)
  llvm-svn: 234797
* Add mtriple to test case to avoid problems with different naming schemes (Krzysztof Parzyszek, 2015-04-13; 1 file, -1/+1)
  llvm-svn: 234793
* Remove this test until I figure out why it fails (Krzysztof Parzyszek, 2015-04-13; 1 file, -31/+0)
  llvm-svn: 234777
* Use FileCheck for test (Matthias Braun, 2015-04-13; 1 file, -13/+17)
  llvm-svn: 234774
* Make the ARM testcase from r234764 also pass on Thumb (Krzysztof Parzyszek, 2015-04-13; 1 file, -3/+3)
  llvm-svn: 234772
* Revert revisions r234755, r234759, r234760 (Jan Vesely, 2015-04-13; 7 files, -80/+28)
  Revert "Remove default in fully-covered switch (to fix Clang -Werror -Wcovered-switch-default)"
  Revert "R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO"
  Revert "LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB"
  Using overflow operations fails CodeGen/Generic/2011-07-07-ScheduleDAGCrash.ll on hexagon, nvptx, and r600. Revert while I investigate.
  llvm-svn: 234768
* Allow memory intrinsics to be tail calls (Krzysztof Parzyszek, 2015-04-13; 5 files, -0/+155)
  llvm-svn: 234764
* DAGCombiner: Fix crash in select(select) opt. (Matthias Braun, 2015-04-13; 1 file, -0/+21)
  In case of different types used for the condition of the selects, the select(select) -> select(and) normalisation cannot be performed.
  See also: http://reviews.llvm.org/D7622
  llvm-svn: 234763
* R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO (Jan Vesely, 2015-04-13; 4 files, -8/+74)
  v2: tighten the sub64 tests
  v3: rename to CARRY/BORROW
  v4: fixup test cmdline
      add known bits computation
      use sign extend instead of sub 0,x
      better add test
  v5: remove redundant break
      move lowering to separate functions
      fix comments
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewers: arsenm
  llvm-svn: 234759
* LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB (Jan Vesely, 2015-04-13; 4 files, -24/+10)
  v2: consider BooleanContents when processing overflow
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewers: resistor, jholewinsky (nvidia parts)
  Differential Revision: http://reviews.llvm.org/D6340
  llvm-svn: 234755
* [ARM] Align global variables passed to memory intrinsics (John Brawn, 2015-04-13; 2 files, -13/+51)
  Fill in the TODO in CodeGenPrepare::OptimizeCallInst so that global variables that are passed to memory intrinsics are aligned in the same way that allocas are.
  Differential Revision: http://reviews.llvm.org/D8421
  llvm-svn: 234735
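  A minimal sketch of the pattern affected (hypothetical test input; a small constant global passed to memcpy that CodeGenPrepare may now over-align, just as it already does for allocas):

    @src = private unnamed_addr constant [5 x i8] c"abcd\00"

    declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture readonly, i32, i32, i1)

    define void @copy(i8* %dst) {
      ; copy from a global whose alignment CodeGenPrepare can raise
      %p = getelementptr inbounds [5 x i8], [5 x i8]* @src, i32 0, i32 0
      call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dst, i8* %p, i32 5, i32 1, i1 false)
      ret void
    }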
* llvm/test/CodeGen/R600/fminnum.ll: Relax an expression for NaN on MSVCRT like r204118. (NAKAMURA Takumi, 2015-04-13; 1 file, -1/+1)
    <stdin>:202:2: note: possible intended match here
    2143289344(1.#QNAN0e+00), 2(2.802597e-45)
  llvm-svn: 234719
* R600: Make FMIN/MAXNUM legal on all asics (Jan Vesely, 2015-04-12; 2 files, -0/+182)
  v2: Add tests
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  reviewer: arsenm
  llvm-svn: 234716
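  A minimal sketch of the kind of test this enables (hypothetical input; llvm.minnum/llvm.maxnum calls select onto the now-legal FMINNUM/FMAXNUM nodes):

    declare float @llvm.minnum.f32(float, float)
    declare float @llvm.maxnum.f32(float, float)

    define float @clamp(float %x, float %lo, float %hi) {
      ; clamp %x into [%lo, %hi] using maxnum then minnum
      %a = call float @llvm.maxnum.f32(float %x, float %lo)
      %b = call float @llvm.minnum.f32(float %a, float %hi)
      ret float %b
    }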
* [PowerPC] Really iterate over all loops in PPCLoopDataPrefetch/PPCLoopPreIncPrep (Hal Finkel, 2015-04-12; 1 file, -0/+48)
  When I fixed these a couple of days ago to iterate over all loops, not just depth == 1 loops, I inadvertently made it such that we'd only look at the first top-level loop. Make sure that we really look at all of them.
  llvm-svn: 234705
* [PowerPC] Disable part-word atomics on the P7 (Hal Finkel, 2015-04-11; 1 file, -16/+16)
  As it turns out, even though these are part of ISA 2.06, the P7 does not support them (or, at least, not any P7s we've tested so far).
  llvm-svn: 234686
* Add direct moves to/from VSR and exploit them for FP/INT conversions (Nemanja Ivanovic, 2015-04-11; 1 file, -0/+426)
  This patch corresponds to review: http://reviews.llvm.org/D8928
  It adds direct move instructions to/from VSX registers to GPR's. These are exploited for FP <-> INT conversions.
  llvm-svn: 234682
* [PowerPC] Fix PPCLoopPreIncPrep for depth > 1 loops (Hal Finkel, 2015-04-11; 1 file, -0/+52)
  This pass had the same problem as the data-prefetching pass: it was only checking for depth == 1 loops in practice. Fix that, add some debugging statements, and make sure that, when we grab an AddRec, it is for the loop we expect.
  llvm-svn: 234670
* [CodeGen] Split -enable-global-merge into ARM and AArch64 options. (Ahmed Bougacha, 2015-04-11; 5 files, -17/+19)
  Currently, there's a single flag, checked by the pass itself. It can't force-enable the pass (and is on by default), because it might not even have been created, as that's the target's decision. Instead, have separate explicit flags, so that the decision is consistently made in the target.
  Keep the flag as a last-resort "force-disable GlobalMerge" for now, for backwards compatibility.
  llvm-svn: 234666
* [WinEH] Recognize SEH finally block inserted by the frontend (Reid Kleckner, 2015-04-10; 1 file, -0/+155)
  This allows winehprepare to build sensible llvm.eh.actions calls for SEH finally blocks. The pattern matching in this change is brittle and should be replaced with something more robust soon. In the meantime, this will let us write the code that produces __C_specific_handler xdata tables, which we need regardless of how we decide to get finally blocks through EH preparation.
  llvm-svn: 234663
* [InstCombine][CodeGenPrep] Create llvm.uadd.with.overflow in CGP. (Sanjoy Das, 2015-04-10; 1 file, -22/+5)
  Summary:
  This change moves creating calls to `llvm.uadd.with.overflow` from InstCombine to CodeGenPrep. Combining overflow check patterns into calls to the said intrinsic in InstCombine inhibits optimization because it introduces an intrinsic call that not all other transforms and analyses understand.
  Depends on D8888.
  Reviewers: majnemer, atrick
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8889
  llvm-svn: 234638
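  A minimal sketch of an overflow-check idiom of the kind involved (hypothetical example; the exact patterns matched are defined in the patch, not reproduced here):

    define i1 @overflows(i32 %a, i32 %b) {
      %sum = add i32 %a, %b
      ; unsigned overflow iff the wrapped sum is smaller than an operand
      %ovf = icmp ult i32 %sum, %a
      ret i1 %ovf
    }

    ; a form using the intrinsic that such checks can be rewritten into:
    ;   %res = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
    ;   %ovf = extractvalue { i32, i1 } %res, 1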
* Avoid spewing binary to stdout in some filetype=obj tests (Reid Kleckner, 2015-04-10; 1 file, -10/+10)
  llvm-svn: 234627
* use update_llc_test_checks.py to tighten checking (Sanjay Patel, 2015-04-10; 1 file, -94/+103)
  test features, not CPUs
  remove unnecessary cruft
  llvm-svn: 234622
* [WinEH] Try to make outlining invokes work a little better (Reid Kleckner, 2015-04-10; 1 file, -0/+91)
  WinEH currently turns invokes into calls. Long term, we will reconsider this, but for now, make sure we remap the operands and clone the successors of the new terminator.
  llvm-svn: 234608