path: root/llvm
Commit message | Author | Age | Files | Lines
...
* Fixed MSVC compile warning issue introduced in r232837 | Simon Pilgrim | 2015-03-22 | 1 | -1/+2
  - was reporting 'warning C4715: 'getType32' : not all control paths return a value'
  llvm-svn: 232913
* [SimplifyLibCalls] Fix negative shifts being produced by the memchr -> bitfield transform. | Benjamin Kramer | 2015-03-21 | 2 | -1/+15
  llvm-svn: 232903
* [SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check. | Benjamin Kramer | 2015-03-21 | 4 | -7/+138
  strchr("123!", C) != nullptr is a common pattern to check if C is one of 1, 2, 3 or !. If the largest element of the string is smaller than the target's register size we can easily create a bitfield and just do a simple test for set membership.

  int foo(char C) { return strchr("123!", C) != nullptr; } now becomes

    cmpl $64, %edi ## range check
    sbbb %al, %al
    movabsq $0xE000200000001, %rcx
    btq %rdi, %rcx ## bit test
    sbbb %cl, %cl
    andb %al, %cl ## and the two conditions
    andb $1, %cl
    movzbl %cl, %eax ## returning an int
    ret

  (imho the backend should expand this into a series of branches, but that's a different story)

  The code is currently limited to bit fields that fit in a register, so usually 64 or 32 bits. Sadly, this misses anything using alpha chars or {}. This could be fixed by just emitting a i128 bit field, but that can generate really ugly code so we have to find a better way. To some degree this is also recreating switch lowering logic, but we can't simply emit a switch instruction and thus change the CFG within instcombine.
  llvm-svn: 232902
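The set-membership test described in r232902 can be sketched in portable C++. This is only an illustration of the idea, not the code from the commit: the helper names `buildMask` and `inSet` are hypothetical, and a 64-bit mask is assumed. The nul terminator is included in the mask because strchr also matches it, which is why bit 0 is set in the constant above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the commit): build a 64-bit membership mask
// from the bytes of Str, including its nul terminator. Returns false if any
// byte value is 64 or larger and so cannot be represented in the mask.
bool buildMask(const char *Str, uint64_t &Mask) {
  Mask = 0;
  size_t Len = std::strlen(Str) + 1; // +1 to include the terminator
  for (size_t I = 0; I != Len; ++I) {
    unsigned char Ch = static_cast<unsigned char>(Str[I]);
    if (Ch >= 64)
      return false;
    Mask |= uint64_t(1) << Ch;
  }
  return true;
}

// Hypothetical helper: range check first, then bit test. The range check is
// required because shifting a 64-bit value by 64 or more is undefined in C++.
bool inSet(uint64_t Mask, char C) {
  unsigned char U = static_cast<unsigned char>(C);
  return U < 64 && ((Mask >> U) & 1) != 0;
}
```

For the string "123!" this sketch produces the same constant that appears in the assembly above, 0xE000200000001.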
* R600: Cleanup test with multiple check prefixes | Matt Arsenault | 2015-03-21 | 1 | -42/+40
  llvm-svn: 232901
* StringRef: Just forward StringRef::find to libc's memchr. | Benjamin Kramer | 2015-03-21 | 1 | -3/+6
  Modern libc's have an SSE version of memchr which is a lot faster than our hand-rolled version. In the past I was reluctant to use it because Darwin's memchr used a naive ridiculously slow implementation, but that has been fixed some versions ago.

  Should have zero functional impact.
  llvm-svn: 232898
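A character search that forwards to memchr, as r232898 describes, might look roughly like the following. This is a minimal stand-alone sketch, not the StringRef implementation: `findChar` and `kNpos` are hypothetical names, and the string is passed as a pointer plus length so that embedded nul bytes are handled.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical "not found" sentinel, mirroring the usual npos convention.
constexpr size_t kNpos = static_cast<size_t>(-1);

// Hypothetical helper: return the index of the first occurrence of C in
// Data[From, Length), or kNpos if there is none. The search itself is
// delegated to the C library's memchr.
size_t findChar(const char *Data, size_t Length, char C, size_t From = 0) {
  if (From >= Length)
    return kNpos; // also avoids calling memchr on an empty range
  const void *P = std::memchr(Data + From, static_cast<unsigned char>(C),
                              Length - From);
  if (!P)
    return kNpos;
  return static_cast<size_t>(static_cast<const char *>(P) - Data);
}
```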
* Revert accidental commit. | Benjamin Kramer | 2015-03-21 | 1 | -6/+3
  While this is a fun change, I didn't really test it :)
  llvm-svn: 232897
* SimplifyLibCalls: Add basic optimization of memchr calls. | Benjamin Kramer | 2015-03-21 | 4 | -3/+168
  This is just memchr(x, y, 0) -> nullptr and constant folding.
  llvm-svn: 232896
* ValueTracking: Forward getConstantStringInfo's TrimAtNul param into recursive invocation | Benjamin Kramer | 2015-03-21 | 1 | -2/+3
  Currently this is only used to tweak the backend's memcpy inlining heuristics, testing that isn't very helpful. A real test case will follow in the next commit, where this behavior would cause a real miscompilation.
  llvm-svn: 232895
* Tidied up vec_zero_cse.ll test. NFCI. | Simon Pilgrim | 2015-03-21 | 1 | -9/+10
  Added target triple and refactored the CHECKs to be per function.
  llvm-svn: 232894
* MemoryDependenceAnalysis: Don't miscompile atomics | David Majnemer | 2015-03-21 | 3 | -96/+21
  r216771 introduced a change to MemoryDependenceAnalysis that allowed it to reason about acquire/release operations. However, this change does not ensure that the acquire/release operations pair. Unfortunately, this leads to miscompiles as we won't see an acquire load as properly memory affecting.

  This largely reverts r216771.

  This fixes PR22708.
  llvm-svn: 232889
* AArch64: simplify test case | Tim Northover | 2015-03-21 | 1 | -18/+2
  llvm-svn: 232886
* Remove the target independent TargetMachine::getSubtarget and TargetMachine::getSubtargetImpl routines. | Eric Christopher | 2015-03-21 | 12 | -25/+25
  This keeps the target independent code free of bare subtarget calls while the remainder of the backends are migrated, or not if they don't wish to support per-function subtargets as would be needed for function multiversioning or LTO of disparate cpu subarchitecture types, e.g.

    clang -msse4.2 -c foo.c -emit-llvm -o foo.bc
    clang -c bar.c -emit-llvm -o bar.bc
    llvm-link foo.bc bar.bc -o baz.bc
    llc baz.bc

  and get appropriate code for what the command lines requested.
  llvm-svn: 232885
* Remove the bare getSubtargetImpl call from the AArch64 port. | Eric Christopher | 2015-03-21 | 3 | -6/+37
  As part of this add a test that shows we can generate code for functions that specifically enable a subtarget feature.
  llvm-svn: 232884
* Remove the bare getSubtargetImpl call from the PPC port. | Eric Christopher | 2015-03-21 | 3 | -5/+44
  As part of this add a test that shows we can generate code for functions that differ by subtarget feature.
  llvm-svn: 232882
* Forward the Function based getSubtarget call to the appropriate Impl call. | Eric Christopher | 2015-03-21 | 1 | -2/+2
  llvm-svn: 232881
* Grab a subtarget off of an AMDGPUTargetMachine rather than a bare target machine in preparation for the TargetMachine bare getSubtarget/getSubtargetImpl calls going away. | Eric Christopher | 2015-03-21 | 1 | -11/+11
  llvm-svn: 232880
* Cache the Function dependent subtarget on the MachineFunction. | Eric Christopher | 2015-03-21 | 4 | -2/+108
  As preparation for removing the getSubtargetImpl() call from TargetMachine go ahead and flip the switch on caching the function dependent subtarget and remove the bare getSubtargetImpl call from the X86 port. As part of this add a few tests that show we can generate code and assemble on X86 based on features/cpu on the Function.
  llvm-svn: 232879
* Grab the cached subtarget off of the MachineFunction. | Eric Christopher | 2015-03-21 | 1 | -5/+4
  llvm-svn: 232878
* Grab a subtarget off of a MipsTargetMachine rather than a bare target machine in preparation for the TargetMachine bare getSubtarget/getSubtargetImpl calls going away. | Eric Christopher | 2015-03-21 | 2 | -7/+11
  llvm-svn: 232877
* Simplify the query for a subtarget in the NVPTX pass manager. | Eric Christopher | 2015-03-21 | 1 | -2/+1
  llvm-svn: 232876
* Change getISAEncoding to use the target triple to determine thumb-ness similar to the rest of the Module level asm printing infrastructure as debug info finalization happens after the function may be missing. | Eric Christopher | 2015-03-21 | 3 | -6/+8
  llvm-svn: 232875
* Make the Hexagon ISelDAGToDAG pass set the subtarget dynamically on each runOnMachineFunction invocation. | Eric Christopher | 2015-03-21 | 2 | -12/+19
  llvm-svn: 232874
* [sanitizer] experimental tracing for cmp instructions | Kostya Serebryany | 2015-03-21 | 2 | -13/+63
  llvm-svn: 232873
* [CodeGen][IfCvt] Don't re-ifcvt blocks with unanalyzable terminators. | Ahmed Bougacha | 2015-03-21 | 2 | -0/+62
  If we couldn't analyze its terminator (i.e., it's an indirectbr, or some other weirdness), we can't safely re-if-convert a predicated block, because we can't tell whether the predicated terminator can fallthrough (it does).

  Currently, we would completely ignore the fallthrough successor. In the added testcase, this means we used to generate:

      ...
    @ %entry:
      cmp r5, #21
      ittt ne
    @ %cc1f:
      cmpne r7, #42
    @ %cc2t:
      strne.w r5, [r8]
      movne pc, r10
    @ %cc1t:
      ...

  Whereas the successor of %cc1f was originally %bb1. With the fix, we get the correct:

      ...
    @ %entry:
      cmp r5, #21
      itt eq
    @ %cc1t:
      streq.w r5, [r11]
      moveq pc, r0
    @ %cc1f:
      cmp r7, #42
      itt ne
    @ %cc2t:
      strne.w r5, [r8]
      movne pc, r10
    @ %bb1:
      ...

  rdar://20192768
  Differential Revision: http://reviews.llvm.org/D8509
  llvm-svn: 232872
* [AArch64] Prefer UZP for concat_vector of illegal truncs. | Ahmed Bougacha | 2015-03-21 | 2 | -22/+28
  Follow-up to r232459: prefer a UZP shuffle to the intermediate truncs.
  llvm-svn: 232871
* Make getLastArgNoClaim work for up to 4 arguments. | Filipe Cabecinhas | 2015-03-20 | 2 | -0/+24
  Summary: This is needed for http://reviews.llvm.org/D8507

  I have no idea what stand-alone tests could be done, if needed.

  Reviewers: Bigcheese, craig.topper, samsonov
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D8508
  llvm-svn: 232859
* Tell lit.cfg about more Windows triples. | Yunzhong Gao | 2015-03-20 | 1 | -1/+1
  For example, the host triple on my 64-bit PC is x86_64-pc-windows-msvc.
  llvm-svn: 232854
* [X86, AVX] instcombine common cases of vperm2* intrinsics into shuffles | Sanjay Patel | 2015-03-20 | 2 | -0/+295
  vperm2* intrinsics are just shuffles. In a few special cases, they're not even shuffles.

  Optimizing intrinsics in InstCombine is better than handling this in the front-end for at least two reasons:

  1. Optimizing custom-written SSE intrinsic code at -O0 makes vector coders really angry (and so I have regrets about some patches from last week).
  2. Doing mask conversion logic in header files is hard to write and subsequently read.

  There are a couple of TODOs in this patch to complete this optimization.

  Differential Revision: http://reviews.llvm.org/D8486
  llvm-svn: 232852
* Fixing a bug with WinEH PHI handling | Andrew Kaylor | 2015-03-20 | 3 | -5/+271
  llvm-svn: 232851
* [X86] Prefer blendps over insertps codegen for one special case | Sanjay Patel | 2015-03-20 | 2 | -18/+54
  With this patch, for this one exact case, we'll generate:

    blendps %xmm0, %xmm1, $1

  instead of:

    insertps %xmm0, %xmm1, $0

  If there's a memory operand available for load folding and we're optimizing for size, we'll still generate the insertps.

  The detailed performance data motivation for this may be found in D7866; in summary, blendps has 2-3x throughput vs. insertps on widely used chips.

  Differential Revision: http://reviews.llvm.org/D8332
  llvm-svn: 232850
* X86: Make helper functions static. NFC. | Benjamin Kramer | 2015-03-20 | 1 | -4/+4
  llvm-svn: 232848
* Remove dead calls and function arguments dealing with TRI in StackMaps. | Eric Christopher | 2015-03-20 | 2 | -5/+3
  llvm-svn: 232847
* DebugInfo: Require valid DIDescriptors | Duncan P. N. Exon Smith | 2015-03-20 | 1 | -10/+13
  As part of PR22777, switch from `dyn_cast_or_null<>` to `cast<>` in most `DIDescriptor` accessors. These classes are lightweight wrappers around pointers, so the users should check for valid pointers before using them.

  This survives a Darwin clang -g bootstrap (after fixing testcases), but it's possible the bots will complain about other configurations. I'll fix any fallout as quickly as I can! Once this bakes for a bit I'll remove the macros.

  Note that `DebugLoc` implicitly gets stricter with this change as well, since it forwards to `DILocation`. Any code that's using `DebugLoc` accessors should check `DebugLoc::isUnknown()` first.

  (BTW, I'm also partway through a cleanup of the `DebugLoc` API to make it more obvious what it is (a glorified pointer wrapper) and remove cruft from before the Metadata/Value split. I'll commit soon.)
  llvm-svn: 232844
* Don't declare all text sections at the start of the .s | Rafael Espindola | 2015-03-20 | 14 | -127/+108
  The code this patch removes was there to make sure the text sections went before the dwarf sections. That is necessary because MachO uses offsets relative to the start of the file, so adding a section can change relaxations.

  The dwarf sections were being printed at the start just to produce symbols pointing at the start of those sections.

  The underlying issue was fixed in r231898. The dwarf sections are now printed when they are about to be used, which is after we printed the text sections.

  To make sure we don't regress, the patch makes the MachO streamer assert if CodeGen puts anything unexpected after the DWARF sections.
  llvm-svn: 232842
* Bugpoint: Fix invalid 'inlinedAt:' references in testcase | Duncan P. N. Exon Smith | 2015-03-20 | 1 | -6/+6
  These are causing crashes in `DebugInfoFinder` after a WIP patch to increase strictness of `DIDescriptor` accessors.
  llvm-svn: 232839
* AsmPrinter: Check subprogram before using it | Duncan P. N. Exon Smith | 2015-03-20 | 1 | -2/+5
  Check return of `getDISubprogram()` before using it. A WIP patch makes `DIDescriptor` accessors more strict (and would crash on this).
  llvm-svn: 232838
* Reorganize the x86 ELF relocation selection logic. | Rafael Espindola | 2015-03-20 | 3 | -176/+207
  The main differences are:

  * Split in 32 and 64 bit functions.
  * First switch on the Modifier so that we have only one non fully covered switch.
  * Map the fixup kind first to a x86_64 (or i386) specific enum, to make it easy to handle cases like X86::reloc_riprel_4byte_movq_load.
  * Switch on IsPCRel last, which reduces code duplication.

  Fixes pr22308.
  llvm-svn: 232837
* DwarfDebug: Check for null DebugLocs | Duncan P. N. Exon Smith | 2015-03-20 | 1 | -13/+15
  `DL` might be null, so check for that before using accessors. A WIP patch to make `DIDescriptors` more strict fails otherwise.

  As a bonus, I think the logic is easier to follow now (despite the extra nesting depth).
  llvm-svn: 232836
* Verifier: Check that !dbg attachments have the right type | Duncan P. N. Exon Smith | 2015-03-20 | 4 | -37/+20
  A WIP patch makes `DIDescriptor` accessors more strict, which in turn causes the `DebugInfoFinder` to crash on wrongly typed `!dbg` attachments. Catch that error up front in `Verifier::visitInstruction()`.

  Also remove a test that we "handle" invalid `!dbg` attachments, added back in r99938. We don't want to handle those anymore.

  Note: I'm *not* recursing and verifying the debug info graph reachable from this node; that work is already done by `verifyDebugInfo()`.
  llvm-svn: 232834
* DebugInfoFinder: Check for null imported entities | Duncan P. N. Exon Smith | 2015-03-20 | 1 | -0/+2
  Don't use the accessors in `DIImportedEntity` on a null pointer. (A WIP patch to make `DIDescriptor` accessors more strict crashes here otherwise.)
  llvm-svn: 232833
* SanitizerCoverage: Check for null DebugLocs | Duncan P. N. Exon Smith | 2015-03-20 | 1 | -2/+3
  After a WIP patch to make `DIDescriptor` accessors more strict, this started asserting.
  llvm-svn: 232832
* SelectionDAGBuilder: Rangeify a loop. NFC. | Hans Wennborg | 2015-03-20 | 1 | -8/+6
  llvm-svn: 232831
* SelectionDAGBuilder::handleJTSwitchCase, simplify loop; NFC | Hans Wennborg | 2015-03-20 | 1 | -9/+4
  llvm-svn: 232830
* Rewrite test/Feature/md_on_instruction.ll | Duncan P. N. Exon Smith | 2015-03-20 | 1 | -22/+23
  This test is supposed to be testing whether metadata attachments to instructions work, but it was using invalid debug info to do so. (This was causing assertion failures in the `DebugInfoFinder` with a WIP patch to be more strict about `DIDescriptor` accessors.)

  Rather than fix the debug info -- which is better tested elsewhere -- just test the IR feature directly.
  llvm-svn: 232828
* Correctly estimate SROA savings for store operands in inline cost analysis. | Wei Mi | 2015-03-20 | 2 | -2/+24
  When estimating SROA savings, we want to see if an address is derived off an alloca in the caller. For store instructions, operand 1 is the address operand, but the current code uses operand 0. Use getPointerOperand for loads and stores to fix this.

  Patch by Easwaran Raman.
  http://reviews.llvm.org/D8425
  llvm-svn: 232827
* Small optimization to avoid getting pass info when we will not run loop | Daniel Berlin | 2015-03-20 | 1 | -0/+3
  llvm-svn: 232826
* [ARM] Fix handling of thumb1 out-of-range frame offsets | John Brawn | 2015-03-20 | 9 | -19/+59
  LocalStackSlotPass assumes that isFrameOffsetLegal doesn't change its answer when the base register changes. Unfortunately this isn't true in thumb1, where SP-based loads allow a larger offset than non-SP-based loads, and this causes the base register reuse code to generate instructions that are unencodable, causing an assertion failure.

  Solve this by adding a BaseReg parameter to isFrameOffsetLegal, which ARMBaseRegisterInfo can then make use of to give the correct answer.

  Differential Revision: http://reviews.llvm.org/D8419
  llvm-svn: 232825
* Stripped trailing whitespace. NFC. | Simon Pilgrim | 2015-03-20 | 1 | -15/+15
  llvm-svn: 232822
* Rewrite StackMap location handling to pre-compute the dwarf register numbers before emission. | Eric Christopher | 2015-03-20 | 2 | -83/+99
  This removes a dependency on being able to access TRI at the module level and is similar to the DwarfExpression handling. I've modified the debug support into print/dump routines that'll do the same dumping but are now callable anywhere and, if TRI isn't available, will go ahead and just print out raw register numbers.
  llvm-svn: 232821
* At the beginning of doFinalization set the MachineFunction to nullptr so that users get an earlier dereferencing error and so that we can use it to conditionalize access to MachineFunction specific data. | Eric Christopher | 2015-03-20 | 1 | -0/+5
  llvm-svn: 232820