summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Have MachineInstrBundle use the MachineFunction for subtargetEric Christopher2014-10-141-5/+5
| | | | | | access rather than the TargetMachine. llvm-svn: 219662
* Access the subtarget off of the MachineFunction rather thanEric Christopher2014-10-141-4/+2
| | | | | | through the TargetMachine. llvm-svn: 219661
* Switch to select optimization for two-case switchesMarcello Maggioni2014-10-141-0/+170
| | | | | | | This is the same optimization of r219233 with modifications to support PHIs with multiple incoming edges from the same block and a test to check that this condition is handled. llvm-svn: 219656
* Include map into the A15SDOptimizer rather than pick it upEric Christopher2014-10-141-0/+1
| | | | | | transitively from the DFAPacketizer via TargetInstrInfo.h. llvm-svn: 219652
* Remove the TargetMachine from DFAPacketizer since it was onlyEric Christopher2014-10-142-6/+6
| | | | | | | being used to grab subtarget specific things that we can grab from the MachineFunction anyhow. llvm-svn: 219650
* fix formatting; NFCSanjay Patel2014-10-141-33/+25
| | | | llvm-svn: 219645
* Add some optional passes around the vectorizer to both better prepareChandler Carruth2014-10-141-1/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the IR going into it and to clean up the IR produced by the vectorizers. Note that these are *off by default* right now while folks collect data on whether the performance tradeoff is reasonable. In a build of the 'opt' binary, I see about 2% compile time regression due to this change on average. This is in my mind essentially the worst expected case: very little of the opt binary is going to *benefit* from these extra passes. I've seen several benchmarks improve in performance my small amounts due to running these passes, and there are certain (rare) cases where these passes make a huge difference by either enabling the vectorizer at all or by hoisting runtime checks out of the outer loop. My primary motivation is to prevent people from seeing runtime check overhead in benchmarks where the existing passes and optimizers would be able to eliminate that. I've chosen the sequence of passes based on the kinds of things that seem likely to be relevant for the code at each stage: rotaing loops for the vectorizer, finding correlated values, loop invariants, and unswitching opportunities from any runtime checks, and cleaning up commonalities exposed by the SLP vectorizer. I'll be pinging existing threads where some of these issues have come up and will start new threads to get folks to benchmark and collect data on whether this is the right tradeoff or we should do something else. llvm-svn: 219644
* Introduce LLVMWriteBitcodeToMemoryBuffer C API function.Peter Collingbourne2014-10-141-0/+8
| | | | llvm-svn: 219643
* InstCombine: Fix miscompile in X % -Y -> X % Y transformDavid Majnemer2014-10-131-6/+6
| | | | | | | | | | | We assumed that negation operations of the form (0 - %Z) resulted in a negative number. This isn't true if %Z was originally negative. Substituting the negative number into the remainder operation may result in undefined behavior because the dividend might be INT_MIN. This fixes PR21256. llvm-svn: 219639
* Removing the static destructor from ManagedStatic.cpp by controlling the ↵Chris Bieneman2014-10-131-6/+17
| | | | | | | | allocation and de-allocation of the mutex. This patch adds a new llvm_call_once function which is used by the ManagedStatic implementation to safely initialize a global to avoid static construction and destruction. llvm-svn: 219638
* Migrate another set of getSubtargetImpl away.Eric Christopher2014-10-131-2/+2
| | | | llvm-svn: 219636
* InstCombine: Don't miscompile (x lshr C1) udiv C2David Majnemer2014-10-132-9/+25
| | | | | | | | | | | | | We have a transform that changes: (x lshr C1) udiv C2 into: x udiv (C2 << C1) However, it is unsafe to do so if C2 << C1 discards any of C2's bits. This fixes PR21255. llvm-svn: 219634
* Make first of several changes to bring up to AArch64 fast-isel styleReed Kotler2014-10-131-179/+204
| | | | | | | | | | | | | | | | | | | | | | | Summary: Make Mips fast-isel track the form of AArch64 where practical. This makes it easier for people to review the code, to borrow similar code, and to see how to eventually move a lot of this target code for fast-isels into target independent code. These are just cosmetic changes. Should be no functional difference. Test Plan: make check test-suite for 4 flavors mips32 r1/r2 , -O0/-O2 Reviewers: dsanders Reviewed By: dsanders Subscribers: aemerson, llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D5595 llvm-svn: 219633
* Add an assertion about the integrity of the iterator.Adrian Prantl2014-10-131-0/+5
| | | | | | | | | Broken parent scope pointers in inlined DIVariables can cause ensureAbstractVariableIsCreated to insert new abstract scopes, thus invalidating the iterator in this loop and leading to hard-to-debug crashes. Useful when manually reducing IR for testcases. llvm-svn: 219628
* constify the getters in SDNodeDbgValue.Adrian Prantl2014-10-131-12/+12
| | | | llvm-svn: 219627
* Refactor debug statement and remove dead argument. NFC.Chad Rosier2014-10-132-18/+13
| | | | llvm-svn: 219626
* Fix a broadcast related regression on the vector shuffle lowering.Filipe Cabecinhas2014-10-131-3/+35
| | | | | | | | | | | | Summary: Test by Robert Lougher! Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5745 llvm-svn: 219617
* R600/SI: Minor cleanup of functionMatt Arsenault2014-10-131-9/+11
| | | | llvm-svn: 219616
* [asan-asm-instrumentation] Follow-up fixes to r219602: asserts are moved intoYuri Gorshenin2014-10-131-8/+8
| | | | | | function. llvm-svn: 219610
* Adds support for the Cortex-A17 to the ARM backendRenato Golin2014-10-132-1/+16
| | | | | | Patch by Matthew Wahab. llvm-svn: 219606
* [AArch64] Add workaround for Cortex-A53 erratum (835769)Bradley Smith2014-10-134-0/+238
| | | | | | | | | | | | | | | | | | | | Some early revisions of the Cortex-A53 have an erratum (835769) whereby it is possible for a 64-bit multiply-accumulate instruction in AArch64 state to generate an incorrect result. The details are quite complex and hard to determine statically, since branches in the code may exist in some circumstances, but all cases end with a memory (load, store, or prefetch) instruction followed immediately by the multiply-accumulate operation. The safest work-around for this issue is to make the compiler avoid emitting multiply-accumulate instructions immediately after memory instructions and the simplest way to do this is to insert a NOP. This patch implements such work-around in the backend, enabled via the option -aarch64-fix-cortex-a53-835769. The work-around code generation is not enabled by default. llvm-svn: 219603
* [asan-asm-instrumentation] Fixed memory references which includes %rsp as a ↵Yuri Gorshenin2014-10-131-111/+162
| | | | | | | | | | | | | | base or an index register. Summary: [asan-asm-instrumentation] Fixed memory references which includes %rsp as a base or an index register. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5599 llvm-svn: 219602
* Unix/Signals.inc: Let findModulesAndOffsets() built conditionally regarding ↵NAKAMURA Takumi2014-10-131-2/+3
| | | | | | to (defined(HAVE_BACKTRACE) && defined(ENABLE_BACKTRACES)). [-Wunused-function] llvm-svn: 219596
* Revert r219584, "[X86] Memory folding for commutative instructions."NAKAMURA Takumi2014-10-133-70/+51
| | | | | | It broke i686 selfhosting. llvm-svn: 219595
* [modules] Stop excluding Support/Debug.h from the Support module. This headerRichard Smith2014-10-131-1/+0
| | | | | | | has been modular since r206822, and excluding it was leading to workarounds such as the one in r219592, which this change removes. llvm-svn: 219593
* [Modules] Add some missing includes to make files compile stand-alone.Benjamin Kramer2014-10-122-1/+4
| | | | llvm-svn: 219592
* Modernize old-style static asserts. NFC.Benjamin Kramer2014-10-123-5/+3
| | | | llvm-svn: 219588
* Revert r219223, it creates invalid PHI nodes.Joerg Sonnenberger2014-10-121-170/+0
| | | | llvm-svn: 219587
* InstCombine: Turn (x != 0 & x <u C) into the canonical range check form (x-1 ↵Benjamin Kramer2014-10-121-0/+2
| | | | | | <u C-1) llvm-svn: 219585
* [X86] Memory folding for commutative instructions.Simon Pilgrim2014-10-123-51/+70
| | | | | | | | | | This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning. This mainly helps the stack inliner better fold reloads of 3 (or more) operand instructions (VEX encoded SSE etc.) but by performing this in the lowest foldMemoryOperandImpl implementation it also replaces the X86InstrInfo::optimizeLoadInstr version and is now used by FastISel too. Differential Revision: http://reviews.llvm.org/D5701 llvm-svn: 219584
* InstCombine: Simplify commonIDivTransformsDavid Majnemer2014-10-121-86/+76
| | | | | | | | | | A helper routine, MultiplyOverflows, was a less efficient reimplementation of APInt's smul_ov and umul_ov. While we are here, clean up the code so it's more uniform. No functionality change intended. llvm-svn: 219583
* Test commit access (email fix)Simon Pilgrim2014-10-111-3/+4
| | | | | | Indentation tidyup. llvm-svn: 219577
* AssumptionTracker: Don't create temporary CallbackVHs.Benjamin Kramer2014-10-111-3/+5
| | | | | | Those are expensive to create in cold cache scenarios. NFC. llvm-svn: 219575
* MC: Shrink MCSymbolRefExpr by only storing the bits we need.Benjamin Kramer2014-10-111-8/+19
| | | | | | 32 -> 16 bytes on x86_64. NFC. llvm-svn: 219574
* MC: Bit pack MCSymbolData.Benjamin Kramer2014-10-116-15/+14
| | | | | | | | | On x86_64 this brings it from 80 bytes to 64 bytes. Also make any member variables private and clean up uses to go through the existing accessors. NFC. llvm-svn: 219573
* Test commit accessSimon Pilgrim2014-10-111-1/+1
| | | | | | Fix comment typo + spelling. llvm-svn: 219572
* InstCombine: Don't fold (X <<s log(INT_MIN)) /s INT_MIN to XDavid Majnemer2014-10-111-1/+2
| | | | | | | | | | Consider the case where X is 2. (2 <<s 31)/s-2147483648 is zero but we would fold to X. Note that this is valid when we are in the unsigned domain because we require NUW: 2 <<u 31 results in poison. This fixes PR21245. llvm-svn: 219568
* InstCombine, InstSimplify: (%X /s C1) /s C2 isn't always 0 when C1 * C2 overflowDavid Majnemer2014-10-112-5/+14
| | | | | | | | | | | | | | | | | | consider: C1 = INT_MIN C2 = -1 C1 * C2 overflows without a doubt but consider the following: %x = i32 INT_MIN This means that (%X /s C1) is 1 and (%X /s C1) /s C2 is -1. N. B. Move the unsigned version of this transform to InstSimplify, it doesn't create any new instructions. This fixes PR21243. llvm-svn: 219567
* InstCombine: mul to shl shouldn't preserve nswDavid Majnemer2014-10-111-2/+0
| | | | | | | | | | | | | | | | | consider: mul i32 nsw %x, -2147483648 this instruction will not result in poison if %x is 1 however, if we transform this into: shl i32 nsw %x, 31 then we will be generating poison because we just shifted into the sign bit. This fixes PR21242. llvm-svn: 219566
* [SCEV] Fix one more caller blindly passing the latch to SCEV'sChandler Carruth2014-10-111-2/+1
| | | | | | | | | | | getSmallConstantTripCount even when it isn't the exiting block. I missed this in my first audit, very sorry. This was found in LNT and elsewhere. I don't have a test case, but it was completely obvious from inspection that this was the problem. I'll see if I can reduce a test case, but I'm not really hopeful, and the value seems quite low. llvm-svn: 219562
* Guard the definition of the stack tracing function with the same macrosChandler Carruth2014-10-111-0/+2
| | | | | | | that guard its usage. Without this, we can get unused function warnings when backtraces are disabled. llvm-svn: 219558
* Add basic conditional branches in mips fast-iselReed Kotler2014-10-111-7/+48
| | | | | | | | | | | | | | | | | | Summary: Implement the most basic form of conditional branches in Mips fast-isel. Test Plan: br1.ll run 4 flavors of test-suite. mips32 r1/r2 and at -O0/O2 Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D5583 llvm-svn: 219556
* [SCEV] Add some asserts to the recently improved trip count computationChandler Carruth2014-10-113-10/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | routines and fix all of the bugs they expose. I hit a test case that crashed even without these asserts due to passing a non-exiting latch to the ExitingBlock parameter of the trip count computation machinery. However, when I add the nice asserts, it turns out we have plenty of coverage of these bugs, they just didn't manifest in crashers. The core problem seems to stem from an assumption that the latch *is* the exiting block. While this is often true, and somewhat the "normal" way to think about loops, it isn't necessarily true. The correct way to call the trip count routines in a *generic* fashion (that is, without a particular exit in mind) is to just use the loop's single exiting block if it has one. The trip count can't be computed generically unless it does. This works great for the loop vectorizer. The loop unroller actually *wants* to select the latch when it has to chose between multiple exits because for unrolling it is the latch trips that matter. But if this is the desire, it needs to explicitly guard for non-exiting latches and check for the generic trip count in that case. I've added the asserts, and added convenience APIs for querying the trip count generically that check for a single exit block. I've kept the APIs consistent between computing trip count and trip multiples. Thansk to Mark for the help debugging and tracking down the *right* fix here! llvm-svn: 219550
* [MCJIT] Replace memcpy with readBytesUnaligned in RuntimeDyldMachOI386.Lang Hames2014-10-101-4/+2
| | | | | | | This should fix the failures of the MachO_i386_DynNoPIC_relocations.s test case on MIPS hosts. llvm-svn: 219543
* Return undef on FP <-> Int conversions that overflow (PR21330).Sanjay Patel2014-10-101-5/+14
| | | | | | | | | | | | | | | | | | | | The LLVM Lang Ref states for signed/unsigned int to float conversions: "If the value cannot fit in the floating point value, the results are undefined." And for FP to signed/unsigned int: "If the value cannot fit in ty2, the results are undefined." This matches the C definitions. The existing behavior pins to infinity or a max int value, but that may just lead to more confusion as seen in: http://llvm.org/bugs/show_bug.cgi?id=21130 Returning undef will hopefully lead to a less silent failure. Differential Revision: http://reviews.llvm.org/D5603 llvm-svn: 219542
* Follow-up to r219534 to make symbolization more robust.Alexey Samsonov2014-10-101-2/+7
| | | | | | | | | | 1) Explicitly provide important arguments to llvm-symbolizer, not relying on defaults. 2) Be more defensive about symbolizer output. This might fix weird failures on ninja-x64-msvc-RA-centos6 buildbot. llvm-svn: 219541
* R600/SI: Change how DS offsets are printedMatt Arsenault2014-10-103-16/+61
| | | | | | | Match SC by using offset/offset0/offset1 and printing in decimal. llvm-svn: 219537
* R600/SI: Match read2/write2 stride 64 versionsMatt Arsenault2014-10-101-38/+80
| | | | llvm-svn: 219536
* Re-land r219354: Use llvm-symbolizer to symbolize LLVM/Clang crash dumps.Alexey Samsonov2014-10-101-1/+140
| | | | | | | | | | | | In fact, symbolization is now expected to work only on Linux and FreeBSD/NetBSD, where we have dl_iterate_phdr and can learn the main executable name without argv0 (it will be possible on BSD systems after http://reviews.llvm.org/D5693 lands). #ifdef-out the code for all the rest Unix systems. Reviewed in http://reviews.llvm.org/D5610 llvm-svn: 219534
* R600/SI: Add load / store machine optimizer pass.Matt Arsenault2014-10-107-1/+403
| | | | | | | | | | | | | Currently this only functions to match simple cases where ds_read2_* / ds_write2_* instructions can be used. In the future it might match some of the other weird load patterns, such as direct to LDS loads. Currently enabled only with a subtarget feature to enable easier testing. llvm-svn: 219533
OpenPOWER on IntegriCloud