summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86][XOP] BITREVERSE lowering using VPPERMSimon Pilgrim2016-03-301-1/+70
| | | | | | XOP's VPPERM has some great 'permute operations' that it can do as well as part of shuffling the bytes of a 128-bit vector - in this case we use it to perform BITREVERSE in a single instruction. llvm-svn: 264870
* [ThinLTO] Serialize the Module SourceFileName to/from LLVM assemblyTeresa Johnson2016-03-305-0/+23
| | | | | | | | | | | | | | | | | | | Summary: This change serializes out and in the SourceFileName to LLVM assembly so that it is preserved through "llvm-dis | llvm-as". This is necessary to ensure that the global identifiers created for local values in the module summary index are the same even if the bitcode is streamed out and read back from LLVM assembly. Serializing the summary itself to LLVM assembly is in progress. Reviewers: joker.eph Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D18588 llvm-svn: 264869
* [NVPTX] Avoid temporary std::string and make single-use function local to ↵Benjamin Kramer2016-03-302-5/+4
| | | | | | | | the cpp file. No functionality change intended. llvm-svn: 264861
* [VectorUtils] Don't try and truncate PHIs to a smaller bitwidthJames Molloy2016-03-301-0/+15
| | | | | | | | | | We already try not to truncate PHIs in computeMinimalBitwidths. LoopVectorize can't handle it and we really don't need to, because both induction and reduction PHIs are truncated by other means. However, we weren't bailing out in all the places we should have, and we ended up by returning a PHI to be truncated, which has caused PR27018. This fixes PR17018. llvm-svn: 264852
* [x86] Fix a horrible bug in our lowering of x86 floating point atomicChandler Carruth2016-03-301-24/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | operations. Specifically, we had code that tried to badly approximate reconstructing all of the possible variations on addressing modes in two x86 instructions based on those in one pseudo instruction. This is not the first bug uncovered with doing this, so stop doing it altogether. Instead generically and pedantically copy every operand from the address over to both new instructions, and strip kill flags from any register operands. This fixes a subtle bug seen in the wild where we would mysteriously drop parts of the addressing mode, causing for example the index argument in the added test case to just be completely ignored. Hypothetically, this was an extremely bad miscompile because it actually caused a predictable and leveragable write of a 64bit quantity to an unintended offset (the first element of the array intead of whatever other element was intended). As a consequence, in theory this could even have introduced security vulnerabilities. However, this was only something that could happen with an atomic floating point add. No other operation could trigger this bug, so it seems extremely unlikely to have occured widely in the wild. But it did in fact occur, and frequently in scientific applications which were using relaxed atomic updates of a floating point value after adding a delta. Those would end up being quite badly miscompiled by LLVM, which is how we found this. Of course, this often looks like a race condition in the code, but it was actually a miscompile. I suspect that this whole RELEASE_FADD thing was a complete mistake. There is no such operation, and I worry that anything other than add will get remarkably worse codegeneration. But that's not for this change.... llvm-svn: 264845
* IR: Constify LLVMContext::discardValueNames, NFCDuncan P. N. Exon Smith2016-03-301-1/+1
| | | | llvm-svn: 264823
* BitcodeReader: Fix weird whitespace, NFCDuncan P. N. Exon Smith2016-03-301-1/+1
| | | | llvm-svn: 264822
* [MemorySSA] Make the visitor more careful with calls.George Burgess IV2016-03-301-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to this patch, the MemorySSA caching visitor would cache all calls that it visited. When paired with phi optimization, this can be problematic. Consider: define void @foo() { ; 1 = MemoryDef(liveOnEntry) call void @clobberFunction() br i1 undef, label %if.end, label %if.then if.then: ; MemoryUse(??) call void @readOnlyFunction() ; 2 = MemoryDef(1) call void @clobberFunction() br label %if.end if.end: ; 3 = MemoryPhi(...) ; MemoryUse(?) call void @readOnlyFunction() ret void } When optimizing MemoryUse(?), we visit defs 1 and 2, so we note to cache them later. We ultimately end up not being able to optimize passed the Phi, so we set MemoryUse(?) to point to the Phi. We then cache the clobbering call for def 1 to be the Phi. This commit changes this behavior so that we wipe out any calls added to VisistedCalls while visiting the defs of a phi we couldn't optimize. Aside: With this patch, we now can bootstrap clang/LLVM without a single MemorySSA verifier failure. Woohoo. :) llvm-svn: 264820
* [x86] Extract a helper function to compute the full addressing mode fromChandler Carruth2016-03-302-21/+29
| | | | | | | | | an x86 MachineInstr's operands. This will be super useful to fix some bad atomics code in my next commit. No functionality changed. llvm-svn: 264819
* [PGO] Handle invoke inst in IR based icall instrumentationXinliang David Li2016-03-301-5/+7
| | | | | | Differential Revision: http://reviews.llvm.org/D18580 llvm-svn: 264818
* [MemorySSA] Change how the walker views/walks visited phis.George Burgess IV2016-03-301-22/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch teaches the caching MemorySSA walker a few things: 1. Not to walk Phis we've walked before. It seems that we tried to do this before, but it didn't work so well in cases like: define void @foo() { %1 = alloca i8 %2 = alloca i8 br label %begin begin: ; 3 = MemoryPhi({%0,liveOnEntry},{%end,2}) ; 1 = MemoryDef(3) store i8 0, i8* %2 br label %end end: ; MemoryUse(?) load i8, i8* %1 ; 2 = MemoryDef(1) store i8 0, i8* %2 br label %begin } Because we wouldn't put Phis in Q.Visited until we tried to visit them. So, when trying to optimize MemoryUse(?): - We would visit 3 above - ...Which would make us put {%0,liveOnEntry} in Q.Visited - ...Which would make us visit {%0,liveOnEntry} - ...Which would make us put {%end,2} in Q.Visited - ...Which would make us visit {%end,2} - ...Which would make us visit 3 - ...Which would realize we've already visited everything in 3 - ...Which would make us conservatively return 3. In the added test-case, (@looped_visitedonlyonce) this behavior would cause us to give incorrect results. Specifically, we'd visit 4 twice in the same query, but on the second visit, we'd skip while.cond because it had been visited, visit if.then/if.then2, and cache "1" as the clobbering def on the way back. 2. If we try to walk the defs of a {Phi,MemLoc} and see it has been visited before, just hand back the Phi we're trying to optimize. I promise this isn't as terrible as it seems. :) We now insert {Phi,MemLoc} pairs just before walking the Phi's upward defs. So, we check the cache for the {Phi,MemLoc} pair before checking if we've already walked the Phi. The {Phi,MemLoc} pair is (almost?) always guaranteed to have a cache entry if we've already fully walked it, because we cache as we go. So, if the {Phi,MemLoc} pair isn't in cache, either: (a) we must be in the process of visiting it (in which case, we can't give a better answer in a cache-as-we-go DFS walker) (b) we visited it, but didn't cache it on the way back (...which seems to require `ModifyingAccess` to not dominate `StartingAccess`, so I'm 99% sure that would be an error. If it's not an error, I haven't been able to get it to happen locally, so I suspect it's rare.) - - - - - As a consequence of this change, we no longer skip upward defs of phis, so we can kill the `VisitedOnlyOne` check. This gives us better accuracy than we had before, at the cost of potentially doing a bit more work when we have a loop. llvm-svn: 264814
* [Aarch64] Turn on the LoopDataPrefetch pass for CycloneAdam Nemet2016-03-301-1/+1
| | | | llvm-svn: 264811
* [PPC] Remove -ppc-loop-prefetch-distance in favor of -prefetch-distanceAdam Nemet2016-03-291-7/+5
| | | | | | | After the previous change, this can now be overridden centrally in the pass. llvm-svn: 264807
* [LoopDataPrefetch] Centralize the tuning cl::opts under the passAdam Nemet2016-03-292-25/+41
| | | | | | | | | This is effectively NFC, minus the renaming of the options (-cyclone-prefetch-distance -> -prefetch-distance). The change was requested by Tim in D17943. llvm-svn: 264806
* [tsan] Do not instrument reads/writes to instruction profile counters.Anna Zaks2016-03-291-1/+25
| | | | | | | | | We have known races on profile counters, which can be reproduced by enabling -fsanitize=thread and -fprofile-instr-generate simultaneously on a multi-threaded program. This patch avoids reporting those races by not instrumenting the reads and writes coming from the instruction profiler. llvm-svn: 264805
* [libFuzzer] more docsKostya Serebryany2016-03-291-1/+2
| | | | llvm-svn: 264803
* ADCE: Remove debug info intrinsics in dead scopesDuncan P. N. Exon Smith2016-03-291-6/+60
| | | | | | | | | | | | | | | | | During ADCE, track which debug info scopes still have live references from the code, and delete debug info intrinsics for the dead ones. These intrinsics describe the locations of variables (in registers or stack slots). If there's no code left corresponding to a variable's scope, then there's no way to reference the variable in the debugger and it doesn't matter what its value is. I add a DEBUG printout when the described location in an SSA register, in case it helps some trying to track down why locations get lost. However, we still delete these; the scope itself isn't attached to any real code, so the ship has already sailed. llvm-svn: 264800
* MachineSink: make shouldSink a TII target hookFiona Glaser2016-03-291-7/+2
| | | | | | | Some targets may disagree on what they want sunk or not sunk, so make this a target hook instead of hardcoded. llvm-svn: 264799
* [LoopDataPrefetch] Make more member functions private, NFC.Adam Nemet2016-03-291-1/+2
| | | | llvm-svn: 264798
* Add a print method to MachineFunctionProperties for better error messagesDerek Schuff2016-03-292-2/+37
| | | | | | | | | This makes check failures much easier to understand. Make it empty (but leave it in the class) for NDEBUG builds. Differential Revision: http://reviews.llvm.org/D18529 llvm-svn: 264780
* [SPARC] Use AtomicExpandPass to expand AtomicRMW instructions.James Y Knight2016-03-293-173/+10
| | | | | | | | | They were previously expanded to CAS loops in a custom isel expansion, but AtomicExpandPass knows how to do that generically. Testing is covered by the existing sparc atomics.ll testcases. llvm-svn: 264771
* MachineVerifier: On dead-def live segments, check that corresponding machine ↵Matthias Braun2016-03-291-3/+18
| | | | | | operand has a dead flag llvm-svn: 264769
* LiveVariables: Fix typo and shorten commentMatthias Braun2016-03-291-4/+2
| | | | llvm-svn: 264768
* IR: Add DbgInfoIntrinsic::getVariableLocationDuncan P. N. Exon Smith2016-03-291-22/+5
| | | | | | | | | | | | Create a common accessor, DbgInfoIntrinsic::getVariableLocation, which doesn't care about the type of debug info intrinsic. Use this to further unify the implementations of DbgDeclareInst::getAddress and DbgValueInst::getValue. Besides being a cleanup, I'm planning to use this to prepare DEBUG output without having to branch on the concrete type. llvm-svn: 264767
* [ThinLTO] Remove post-pass metadata linking supportTeresa Johnson2016-03-295-354/+37
| | | | | | | | | | | Since we have moved to a model where functions are imported in bulk from each source module after making summary-based importing decisions, there is no longer a need to link metadata as a postpass, and all users have been removed. This essentially reverts r255909 and follow-on fixes. llvm-svn: 264763
* Add support for no-jump-tablesNirav Dave2016-03-291-2/+7
| | | | | | | | | | | | | Add function soft attribute to the generation of Jump Tables in CodeGen as initial step towards clang support of gcc's no-jump-table support Reviewers: hans, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18321 llvm-svn: 264756
* Add MachineVerifier check for AllVRegsAllocated MachineFunctionPropertyDerek Schuff2016-03-295-15/+20
| | | | | | | | | | | | | | | | | Summary: Check that any function that has the property set is free of virtual register operands. Also, it is actually VirtRegMap (and not the register allocators) that acutally remove the VReg operands (except for RegAllocFast). Reviewers: qcolombet Subscribers: MatzeB, llvm-commits, qcolombet Differential Revision: http://reviews.llvm.org/D18535 llvm-svn: 264755
* Swift Calling Convention: add swiftself attribute.Manman Ren2016-03-2917-5/+59
| | | | | | Differential Revision: http://reviews.llvm.org/D17866 llvm-svn: 264754
* [SCEV] Extract out a MatchBinaryOp; NFCISanjoy Das2016-03-291-222/+284
| | | | | | | | | MatchBinaryOp abstracts out the IR instructions from the operations they represent. While this change is NFC, we will use this factoring later to map things like `(extractvalue 0 (sadd.with.overflow X Y))` to `(add X Y)`. llvm-svn: 264747
* [SCEV] Use Operator::getOpcode instead of manual dispatch; NFCSanjoy Das2016-03-291-8/+3
| | | | llvm-svn: 264746
* Make InlineSimple's one-arg constructor explicit. NFCJustin Lebar2016-03-291-1/+2
| | | | llvm-svn: 264744
* Reformat a comment in InlineSimple.cpp. NFCJustin Lebar2016-03-291-3/+3
| | | | llvm-svn: 264743
* Test commit accessKonstantin Zhuravlyov2016-03-291-1/+1
| | | | llvm-svn: 264736
* [ThinLTO] Use new GlobalValue::getGUID helper (NFC)Teresa Johnson2016-03-291-2/+1
| | | | | | | This was already being used for functions and aliases, was missed when handling global variables. llvm-svn: 264734
* [mips] Test commit: Mark insertNoop as dead code (NFC)Simon Dardis2016-03-291-0/+1
| | | | llvm-svn: 264728
* [mips] Correct MIPS16 jal/jalx to have uimm26 offsets and add MC layer range ↵Daniel Sanders2016-03-293-5/+9
| | | | | | | | | | | | | | | | checks. NFC. Summary: However, this has no effect at this time because the instructions affected are marked 'isCodeGenOnly=1' and have no alternative for the MC layer. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D18179 llvm-svn: 264712
* AVX-512: fixed a bug in fp_to_uint pattern on KNLElena Demikhovsky2016-03-291-0/+4
| | | | | | | | | Fixed fp_to_uint instruction selection on KNL. One pattern was missing for <4 x double> to <4 x i32> Differential Revision: http://reviews.llvm.org/D18512 llvm-svn: 264701
* BitcodeReader: Allow METADATA_STRINGS to only have !""Duncan P. N. Exon Smith2016-03-291-1/+1
| | | | | | | Support parsing a METADATA_STRINGS record that only has a single piece of metadata, !"". Fixes a corner case in r264551. llvm-svn: 264699
* [SimlifyCFG] Prevent passes from destroying canonical loop structure, ↵Hyojin Sung2016-03-293-7/+32
| | | | | | | | | | | | | | | | | especially for nested loops When eliminating or merging almost empty basic blocks, the existence of non-trivial PHI nodes is currently used to recognize potential loops of which the block is the header and keep the block. However, the current algorithm fails if the loops' exit condition is evaluated only with volatile values hence no PHI nodes in the header. Especially when such a loop is an outer loop of a nested loop, the loop is collapsed into a single loop which prevent later optimizations from being applied (e.g., transforming nested loops into simplified forms and loop vectorization). The patch augments the existing PHI node-based check by adding a pre-test if the BB actually belongs to a set of loop headers and not eliminating it if yes. llvm-svn: 264697
* RegisterPressure: Simplify liveness tracking when lanemasks are not checked.Matthias Braun2016-03-291-31/+66
| | | | | | | | | | | | | | Split RegisterOperands code that collects defs/uses into a variant with and without lanemask tracking. This is a bit of code duplication, but there are enough subtle differences between the two variants that this seems cleaner (and potentially faster). This also fixes a problem where lanes where tracked even though TrackLaneMasks was false. This is part of the fix for http://llvm.org/PR27106. I will commit the testcase when it is completely fixed. llvm-svn: 264696
* LiveVariables: Do not remove dead flags from vreg operandsMatthias Braun2016-03-291-3/+8
| | | | | | | Also add a FIXME comment on why Mips RDDSP causes bogus dead flags to be added which LiveVariables cleans up by accident. llvm-svn: 264695
* [PowerPC] Refactor popcnt[dw] target featuresHal Finkel2016-03-295-18/+28
| | | | | | | | | Instead of using two feature bits, one to indicate the availability of the popcnt[dw] instructions, and another to indicate whether or not they're fast, use a single enum. This allows more consistent control via target attribute strings, and via Clang's command line. llvm-svn: 264690
* [Codegen] Decrease minimum jump table density.Kyle Butt2016-03-292-8/+23
| | | | | | | | | | | Minimum density for both optsize and non optsize are now options -sparse-jump-table-density (default 10) for non optsize functions -dense-jump-table-density (default 40) for optsize functions, which matches the current default. This improves several benchmarks at google at the cost of a small codesize increase. For code compiled with -Os, the old behavior continues llvm-svn: 264689
* Sample profile summary cleanupEaswaran Raman2016-03-282-7/+7
| | | | | | | | Replace references to MaxHeadSamples with MaxFunctionCount Differential Revision: http://reviews.llvm.org/D18522 llvm-svn: 264686
* [WebAssembly] Remove duplicate disabling of passesDerek Schuff2016-03-281-12/+6
| | | | | | Also put all the disabled passes together llvm-svn: 264684
* [PowerPC] Clarify a comment in PPCTTI about vector loadsHal Finkel2016-03-281-1/+1
| | | | | | | This should say that we could do unaligned vector loads on the P7 using VSX instructions, not that we should. llvm-svn: 264683
* Remove personality for declarations in CloneModule.Evgeniy Stepanov2016-03-281-0/+2
| | | | | | | | | | Personality is copied as part of copyFunctionAttributes, but it is invalid on a declaration. Remove the personality attribute it the function body is not cloned. Also add a verifier run over output modules in the llvm-split tool. llvm-svn: 264667
* [X86][SSE] Vectorize a bit (AND/XOR/OR) op if a BUILD_VECTOR has the same op ↵Simon Pilgrim2016-03-281-0/+50
| | | | | | | | | | | | | | for all their scalar elements. If all a BUILD_VECTOR's source elements are the same bit (AND/XOR/OR) operation type and each has one constant operand, lower to a pair of BUILD_VECTOR and just apply the bit operation to the vectors. The constant operands will form a constant vector meaning that we still only have a single BUILD_VECTOR to lower and we will have replaced all the scalarized operations with a single SSE equivalent. Its not in our interest to start make a general purpose vectorizer from this, but I'm seeing enough of these scalar bit operations from the later legalization/scalarization stages to support them at least. Differential Revision: http://reviews.llvm.org/D18492 llvm-svn: 264666
* Reapply (2x) "[PGO] Fix name encoding for ObjC-like functions"Vedant Kumar2016-03-281-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | Function names in ObjC can have spaces in them. This interacts poorly with name compression, which uses spaces to separate PGO names. Fix the issue by using a different separator and update a test. I chose "\01" as the separator because 1) it's non-printable, 2) we strip it from PGO names, and 3) it's the next natural choice once "\00" is discarded (that one's overloaded). What's changed since the original commit? - I fixed up the covmap-V2 binary format tests using a linux VM. - I weakened the CHECK lines in instrprof-comdat.h to account for the fact that there have been bugfixes to clang coverage. These will be fixed up in a follow-up. - I added an assert to make sure we don't get bitten by this again. - I constructed the c-general.profraw file without name compression enabled to appease some bots. Differential Revision: http://reviews.llvm.org/D18516 llvm-svn: 264658
* Add an IR Verifier check for orphaned DICompileUnits.Adrian Prantl2016-03-281-2/+23
| | | | | | | | | A DICompileUnit that is not listed in llvm.dbg.cu will cause assertion failures and/or crashes in the backend. The Verifier should reject this. rdar://problem/25369499 llvm-svn: 264657
OpenPOWER on IntegriCloud