summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Wrong attribute. LLVM_ATTRIBUTE_USED not LLVM_ATTRIBUTE_UNUSEDSid Manning2014-10-151-1/+1
| | | | llvm-svn: 219837
* Allow forward references to section symbols.Rafael Espindola2014-10-151-1/+8
| | | | llvm-svn: 219835
* Teach ScalarEvolution to sharpen range information.Sanjoy Das2014-10-151-0/+38
| | | | | | | | | | | | If x is known to have the range [a, b) in a loop predicated by (icmp ne x, a), its range can be sharpened to [a + 1, b). Get ScalarEvolution and hence IndVars to exploit this fact. This change triggers an optimization to widen-loop-comp.ll, so it had to be edited to get it to pass. phabricator: http://reviews.llvm.org/D5639 llvm-svn: 219834
* Add LLVM_ATTRIBUTE_UNUSED to function currently just used in an assertSid Manning2014-10-151-0/+2
| | | | | | Fixes break when -Wunused-function is used. llvm-svn: 219833
* InstCombine: Narrow switch instructions using known bits.Akira Hatanaka2014-10-151-0/+31
| | | | | | | | | Truncate the operands of a switch instruction to a narrower type if the upper bits are known to be all ones or zeros. rdar://problem/17720004 llvm-svn: 219832
* Reapply "[FastISel][AArch64] Add custom lowering for GEPs."Juergen Ributzka2014-10-151-0/+76
| | | | | | | | | | | This is mostly a copy of the existing FastISel GEP code, but we have to duplicate it for AArch64, because otherwise we would bail out even for simple cases. This is because the standard fastEmit functions don't cover MUL at all and ADD is lowered very inefficientily. The original commit had a bug in the add emit logic, which has been fixed. llvm-svn: 219831
* [FastISel][AArch64] Factor out add with immediate emission into a helper ↵Juergen Ributzka2014-10-151-13/+28
| | | | | | | | function. NFC. Simplify add with immediate emission by factoring it out into a helper function. llvm-svn: 219830
* Correctly handle references to section symbols.Rafael Espindola2014-10-153-50/+54
| | | | | | | | | | | | | | | | | When processing assembly like .long .text we were creating a new undefined symbol .text. GAS on the other hand would handle that as a reference to the .text section. This patch implements that by creating the section symbols earlier so that they are visible during asm parsing. The patch also updates llvm-readobj to print the symbol number in the relocation dump so that the test can differentiate between two sections with the same name. llvm-svn: 219829
* Enable the instruction printer in HexagonMCTargetDescSid Manning2014-10-154-4/+64
| | | | | | | | | | This adds the MCInstPrinter to the LLVMHexagonDesc library and removes the dependency LLVMHexagonAsmPrinter had on LLVMHexagonDesc. This is a prerequisite needed by the disassembler. Phabricator Revision: http://reviews.llvm.org/D5734 llvm-svn: 219826
* R600/SI: Also try to use 0 base for misaligned 8-byte DS loads.Matt Arsenault2014-10-151-0/+17
| | | | llvm-svn: 219823
* R600: Fix miscompiles when BFE has multiple usesMatt Arsenault2014-10-151-7/+10
| | | | | | SimplifyDemandedBits would break the other uses of the operand. llvm-svn: 219819
* correct const-ness with auto and dyn_castSanjay Patel2014-10-151-3/+3
| | | | | | | | | 1. Use const with autos. 2. Don't bother with explicit const in cast ops because they do it automagically. Thanks, David B. / Aaron B. / Reid K. llvm-svn: 219817
* [SLPVectorize] Basic ephemeral-value awarenessHal Finkel2014-10-151-3/+30
| | | | | | | | | | | | | | The SLP vectorizer should not vectorize ephemeral values. These are used to express information to the optimizer, and vectorizing them does not lead to faster code (because the ephemeral values are dropped prior to code generation, vectorized or not), and obscures the information the instructions are attempting to communicate (the logic that interprets the arguments to @llvm.assume generically does not understand vectorized conditions). Also, uses by ephemeral values are free (because they, and the necessary extractelement instructions, will be dropped prior to code generation). llvm-svn: 219816
* Treat the WorkSet used to find ephemeral values as double-endedHal Finkel2014-10-151-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to make sure that we visit all operands of an instruction before moving deeper in the operand graph. We had been pushing operands onto the back of the work set, and popping them off the back as well, meaning that we might visit an instruction before visiting all of its uses that sit in between it and the call to @llvm.assume. To provide an explicit example, given the following: %q0 = extractelement <4 x float> %rd, i32 0 %q1 = extractelement <4 x float> %rd, i32 1 %q2 = extractelement <4 x float> %rd, i32 2 %q3 = extractelement <4 x float> %rd, i32 3 %q4 = fadd float %q0, %q1 %q5 = fadd float %q2, %q3 %q6 = fadd float %q4, %q5 %qi = fcmp olt float %q6, %q5 call void @llvm.assume(i1 %qi) %q5 is used by both %qi and %q6. When we visit %qi, it will be marked as ephemeral, and we'll queue %q6 and %q5. %q6 will be marked as ephemeral and we'll queue %q4 and %q5. Under the old system, we'd then visit %q4, which would become ephemeral, %q1 and then %q0, which would become ephemeral as well, and now we have a problem. We'd visit %rd, but it would not be marked as ephemeral because we've not yet visited %q2 and %q3 (because we've not yet visited %q5). This will be covered by a test case in a follow-up commit that enables ephemeral-value awareness in the SLP vectorizer. llvm-svn: 219815
* [MC] Make bundle alignment mode setting idempotent and support nested bundlesDerek Schuff2014-10-152-12/+36
| | | | | | | | | | | | | | | | | | | | | | Summary: Currently an error is thrown if bundle alignment mode is set more than once per module (either via the API or the .bundle_align_mode directive). This change allows setting it multiple times as long as the alignment doesn't change. Also nested bundle_lock groups are currently not allowed. This change allows them, with the effect that the group stays open until all nests are exited, and if any of the bundle_lock directives has the align_to_end flag, the group becomes align_to_end. These changes make the bundle aligment simpler to use in the compiler, and also better match the corresponding support in GNU as. Reviewers: jvoung, eliben Differential Revision: http://reviews.llvm.org/D5801 llvm-svn: 219811
* DI: Make comments "brief"-er, NFCDuncan P. N. Exon Smith2014-10-151-8/+8
| | | | | | | | | | | Follow-up to r219801. Post-commit review pointed out that all comments require a `\brief` description [1], so I converted many and recrafted a few to be briefer or to include a brief intro. (If I'm going to clean them up, I should do it right!) [1]: http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments llvm-svn: 219808
* Use 'auto' for easier reading; no functional change intended.Sanjay Patel2014-10-151-6/+3
| | | | llvm-svn: 219804
* DI: Cleanup comments, NFCDuncan P. N. Exon Smith2014-10-151-98/+0
| | | | | | | | | | | | | | | | | A number of comment cleanups: - Remove duplicated function and class names from comments. - Remove duplicated comments from source file (some of which were out-of-sync). - Move any unduplicated comments from source file to header. - Remove some noisy comments entirely (e.g., a comment for `DIDescriptor::print()` saying "print descriptor" just gets in the way of reading the code). llvm-svn: 219801
* Simplify handling of --noexecstack by using getNonexecutableStackSection.Rafael Espindola2014-10-1524-97/+63
| | | | llvm-svn: 219799
* DI: Use a `DenseMap` instead of named metadata, NFCDuncan P. N. Exon Smith2014-10-152-49/+5
| | | | | | | Remove a strange round-trip through named metadata to assign preserved local variables to their subprograms. llvm-svn: 219798
* Move getNonexecutableStackSection up to the base ELF class.Rafael Espindola2014-10-157-23/+9
| | | | | | The .note.GNU-stack section is not SystemZ/X86 specific. llvm-svn: 219796
* R600: Use existing variableMatt Arsenault2014-10-151-1/+1
| | | | llvm-svn: 219778
* R600: Remove outdated commentMatt Arsenault2014-10-151-3/+0
| | | | llvm-svn: 219777
* Revert "[FastISel][AArch64] Add custom lowering for GEPs."Juergen Ributzka2014-10-151-85/+0
| | | | | | This breaks our internal build bots. Reverting it to get the bots green again. llvm-svn: 219776
* [MachineSink] Use the real post dominator treeJingyue Wu2014-10-151-21/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree and MachineLoopInfo. The old heuristics caused instructions to sink unnecessarily, and might create register pressure. This is the second try of the fix. The first one (D4814) caused a performance regression due to failing to sink instructions out of loops (PR21115). This patch fixes PR21115 by sinking an instruction from a deeper loop to a shallower one regardless of whether the target block post-dominates the source. Thanks Alexey Volkov for reporting PR21115! Test Plan: Added a NVPTX codegen test to verify that our change prevents the backend from over-sinking. It also shows the unnecessary register pressure caused by over-sinking. Added an X86 test to verify we can sink instructions out of loops regardless of the dominance relationship. This test is reduced from Alexey's test in PR21115. Updated an affected test in X86. Also ran SPEC CINT2006 and llvm-test-suite for compilation time and runtime performance. Results are attached separately in the review thread. Reviewers: Jiangning, resistor, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, bruno, volkalexey, llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D5633 llvm-svn: 219773
* ARM: drop check for triple that's no longer used.Tim Northover2014-10-151-3/+2
| | | | | | | | | | Early attempts to support AAPCS bare metal MachO targets based the decision on the CPU being compiled for. This was not a particularly great idea and we've got a better option now, but this check remained. No functional change for any target we care about. llvm-svn: 219767
* Remove unused variable.Eric Christopher2014-10-151-1/+0
| | | | llvm-svn: 219750
* No need to cache this unused variable.Eric Christopher2014-10-141-3/+1
| | | | | | Patch by Ehsan Akhgari. llvm-svn: 219749
* [AArch64] Wrong CC access in CSINC-conditional branch sequenceGerolf Hoflehner2014-10-141-5/+1
| | | | | | | | This is a follow up to commit r219742. It removes the CCInMI variable and accesses the CC in CSCINC directly. In the case of a conditional branch accessing the CC with CCInMI was wrong. llvm-svn: 219748
* [AAarch64] Optimize CSINC-branch sequenceGerolf Hoflehner2014-10-143-29/+149
| | | | | | | | | | | | | | | | | | | | | Peephole optimization that generates a single conditional branch for csinc-branch sequences like in the examples below. This is possible when the csinc sets or clears a register based on a condition code and the branch checks that register. Also the condition code may not be modified between the csinc and the original branch. Examples: 1. Convert csinc w9, wzr, wzr, <CC>;tbnz w9, #0, 0x44 to b.<invCC> 2. Convert csinc w9, wzr, wzr, <CC>; tbz w9, #0, 0x44 to b.<CC> rdar://problem/18506500 llvm-svn: 219742
* [LoopVectorize] Ignore @llvm.assume for cost estimates and legalityHal Finkel2014-10-141-3/+32
| | | | | | | | | | | | | | A few minor changes to prevent @llvm.assume from interfering with loop vectorization. First, treat @llvm.assume like the lifetime intrinsics, which are scalarized (but don't otherwise interfere with the legality checking). Second, ignore the cost of ephemeral instructions in the loop (these will go away anyway during CodeGen). Alignment assumptions and other uses of @llvm.assume can often end up inside of loops that should be vectorized (this is not uncommon for assumptions generated by __attribute__((align_value(n))), for example). llvm-svn: 219741
* [X86][SSE] pslldq/psrldq shuffle mask decodesSimon Pilgrim2014-10-143-0/+71
| | | | | | | | Patch to provide shuffle decodes and asm comments for the sse pslldq/psrldq SSE2/AVX2 byte shift instructions. Differential Revision: http://reviews.llvm.org/D5598 llvm-svn: 219738
* ARM: remove ARM/Thumb distinction for preferred alignment.Tim Northover2014-10-141-5/+0
| | | | | | | | | | | | Thumb1 has legitimate reasons for preferring 32-bit alignment of types i1/i8/i16, since the 16-bit encoding of "add rD, sp, #imm" requires #imm to be a multiple of 4. However, this is a trade-off betweem code size and RAM usage; the DataLayout string is not the best place to represent it even if desired. So this patch removes the extra Thumb requirements, hopefully making ARM and Thumb completely compatible in this respect. llvm-svn: 219734
* ARM: allow misaligned local variables in Thumb1 mode.Tim Northover2014-10-141-3/+1
| | | | | | | | There's no hard requirement on LLVM to align local variable to 32-bits, so the Thumb1 frame handling needs to be able to deal with variables that are only naturally aligned without falling over. llvm-svn: 219733
* [FastISel][AArch64] Add custom lowering for GEPs.Juergen Ributzka2014-10-141-0/+85
| | | | | | | | This is mostly a copy of the existing FastISel GEP code, but on AArch64 we bail out even for simple cases, because the standard fastEmit functions don't cover MUL and ADD is lowered inefficientily. llvm-svn: 219726
* [x86 asm] allow fwait alias in both At&t and Intel modes (PR21208)Hans Wennborg2014-10-141-1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D5741 llvm-svn: 219725
* ARM: set preferred aggregate alignment to 32 universally.Tim Northover2014-10-141-4/+3
| | | | | | | | | | | Before, ARM and Thumb mode code had different preferred alignments, which could lead to some rather unexpected results. There's justification for reducing it from the default 64-bits (wasted space), but I don't think there is for going below 32-bits. There's no actual ABI change here, just to reassure people. llvm-svn: 219719
* [CFL-AA] CFL-AA should not assert on an va_arg instructionHal Finkel2014-10-141-0/+11
| | | | | | | | | | The CFL-AA implementation was missing a visit* routine for va_arg instructions, causing it to assert when run on a function that had one. For now, handle these in a conservative way. Fixes PR20954. llvm-svn: 219718
* Optimize away fabs() calls when input is squared (known positive).Sanjay Patel2014-10-141-1/+30
| | | | | | | | | | | | Eliminate library calls and intrinsic calls to fabs when the input is a squared value. Note that no unsafe-math / fast-math assumptions are needed for this optimization. Differential Revision: http://reviews.llvm.org/D5777 llvm-svn: 219717
* [FastISel][AArch64] Fix sign-/zero-extend folding when SelectionDAG is involved.Juergen Ributzka2014-10-141-39/+190
| | | | | | | | | | | Sign-/zero-extend folding depended on the load and the integer extend to be both selected by FastISel. This cannot always be garantueed and SelectionDAG might interfer. This commit adds additonal checks to load and integer extend lowering to catch this. Related to rdar://problem/18495928. llvm-svn: 219716
* InstCombine: Don't miscompile X % ((Pow2 << A) >>u B)David Majnemer2014-10-141-7/+4
| | | | | | | | | | | | | | | | | | | We assumed that A must be greater than B because the right hand side of a remainder operator must be nonzero. However, it is possible for A to be less than B if Pow2 is a power of two greater than 1. Take for example: i32 %A = 0 i32 %B = 31 i32 Pow2 = 2147483648 ((Pow2 << 0) >>u 31) is non-zero but A is less than B. This fixes PR21274. llvm-svn: 219713
* Reapply "R600: Add new intrinsic to read work dimensions"Jan Vesely2014-10-143-5/+20
| | | | | | | | | This effectively reverts revert 219707. After fixing the test to work with new function name format and renamed intrinsic. Reviewed-by: Tom Stellard <tom@stellard.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 219710
* Revert "r216914 - Revert: [APFloat] Fixed a bug in method 'fusedMultiplyAdd'"Hal Finkel2014-10-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reapply r216913, a fix for PR20832 by Andrea Di Biagio. The commit was reverted because of buildbot failures, and credit goes to Ulrich Weigand for isolating the underlying issue (which can be confirmed by Valgrind, which does helpfully light up like the fourth of July). Uli explained the problem with the original patch as: It seems the problem is calling multiplySignificand with an addend of category fcZero; that is not expected by this routine. Note that for fcZero, the significand parts are simply uninitialized, but the code in (or rather, called from) multiplySignificand will unconditionally access them -- in effect using uninitialized contents. This version avoids using a category == fcZero addend within multiplySignificand, which avoids this problem (the Valgrind output is also now clean). Original commit message: [APFloat] Fixed a bug in method 'fusedMultiplyAdd'. When folding a fused multiply-add builtin call, make sure that we propagate the correct result in the case where the addend is zero, and the two other operands are finite non-zero. Example: define double @test() { %1 = call double @llvm.fma.f64(double 7.0, double 8.0, double 0.0) ret double %1 } Before this patch, the instruction simplifier wrongly folded the builtin call in function @test to constant 'double 7.0'. With this patch, method 'fusedMultiplyAdd' correctly evaluates the multiply and propagates the expected result (i.e. 56.0). Added test fold-builtin-fma.ll with the reproducible from PR20832 plus extra test cases to verify the behavior of method 'fusedMultiplyAdd' in the presence of NaN/Inf operands. This fixes PR20832. llvm-svn: 219708
* Revert "R600: Add new intrinsic to read work dimensions"Rafael Espindola2014-10-143-20/+5
| | | | | | | | This reverts commit r219705. CodeGen/R600/work-item-intrinsics.ll was failing on linux. llvm-svn: 219707
* Remove unused member variable.Rafael Espindola2014-10-142-5/+3
| | | | | | Fixes pr20904. llvm-svn: 219706
* R600: Add new intrinsic to read work dimensionsJan Vesely2014-10-143-5/+20
| | | | | | | | | | | | | | v2: Add SI lowering Add test v3: Place work dimensions after the kernel arguments. v4: Calculate offset while lowering arguments v5: rebase v6: change prefix to AMDGPU Reviewed-by: Tom Stellard <tom@stellard.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 219705
* R600: FMA is VecALU only instructionJan Vesely2014-10-141-1/+1
| | | | | | Reviewed-by: Tom Stellard <tom@stellard.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 219704
* Finish getting Mips fast-isel to match up with AArch64 fast-iselReed Kotler2014-10-141-402/+396
| | | | | | | | | | | | | | | | | | | Summary: In order to facilitate use of common code, checking by reviewers of other fast-isel ports, and hopefully to eventually move most of Mips and other fast-isel ports into target independent code, I've tried to get the two implementations to line up. There is no functional code change. Just methods moved in the file to be in the same order as in AArch64. Test Plan: No functional change. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits, aemerson, rfuhler Differential Revision: http://reviews.llvm.org/D5692 llvm-svn: 219703
* DebugInfo: Ensure that all debug location scope chains from instructions ↵David Blaikie2014-10-142-2/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | within a function, lead to the function itself. Let me tell you a tale... Originally committed in r211723 after discovering a nasty case of weird scoping due to inlining, this was reverted in r211724 after it fired in ASan/compiler-rt. (minor diversion where I accidentally committed/reverted again in r211871/r211873) After further testing and fixing bugs in ArgumentPromotion (r211872) and Inlining (r212065) it was recommitted in r212085. Reverted in r212089 after the sanitizer buildbots still showed problems. Fixed another bug in ArgumentPromotion (r212128) found by this assertion. Recommitted in r212205, reverted in r212226 after it crashed some more on sanitizer buildbots. Fix clang some more in r212761. Recommitted in r212776, reverted in r212793. ASan failures. Recommitted in r213391, reverted in r213432, trying to reproduce flakey ASan build failure. Fixed bugs in r213805 (ArgPromo + DebugInfo), r213952 (LiveDebugVariables strips dbg_value intrinsics in functions not described by debug info). Recommitted in r214761, reverted in r214999, flakey failure on Windows buildbot. Fixed DeadArgElimination + DebugInfo bug in r219210. Recommitted in r219215, reverted in r219512, failure on ObjC++ atomic properties in the test-suite on Darwin. Fixed ObjC++ atomic properties issue in Clang in r219690. [This commit is provided 'as is' with no hope that this is the last time I commit this change either expressed or implied] llvm-svn: 219702
* Remove method that is identical to the base class one.Rafael Espindola2014-10-141-1/+0
| | | | llvm-svn: 219700
OpenPOWER on IntegriCloud