summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [FunctionAttrs] Add comment and clarify assertion message; NFCSanjoy Das2015-11-071-1/+6
| | | | llvm-svn: 252389
* [OperandBundles] Rename accessor, NFCSanjoy Das2015-11-072-2/+2
| | | | | | Rename getOperandBundle to getOperandBundleAt since that's more obvious. llvm-svn: 252388
* [FunctionAttrs] Add handling for operand bundlesSanjoy Das2015-11-071-4/+31
| | | | | | | | | | | | | | Summary: Teach the FunctionAttrs to do the right thing for IR with operand bundles. Reviewers: reames, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14408 llvm-svn: 252387
* [FunctionAttrs] Fix an iterator wraparound bugSanjoy Das2015-11-071-18/+19
| | | | | | | | | | | | | | | | | | | Summary: This change fixes an iterator wraparound bug in `determinePointerReadAttrs`. Ideally, ++'ing off the `end()` of an iplist should result in a failed assert, but currently iplist seems to silently wrap to the head of the list on `end()++`. This is why the bad behavior is difficult to demonstrate. Reviewers: chandlerc, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14350 llvm-svn: 252386
* [WinEH] Update exception pointer registersJoseph Tremoulet2015-11-0722-60/+184
| | | | | | | | | | | | | | | | | | | | Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383
* [InstCombine] Teach FoldPHIArgZextsIntoPHI about EHPadsDavid Majnemer2015-11-071-0/+6
| | | | | | | | FoldPHIArgZextsIntoPHI cannot insert an instruction after the PHI if there is an EHPad in the BB. Doing so would result in an instruction inserted after a terminator. llvm-svn: 252377
* ADT: Remove last implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-11-075-10/+11
| | | | | | | | | | Some implicit ilist iterator conversions have crept back into Analysis, Transforms, Hexagon, and llvm-stress. This removes them. I'll commit a patch immediately after this to disallow them (in a separate patch so that it's easy to revert if necessary). llvm-svn: 252371
* [InstCombine] Don't insert an instruction after a terminatorDavid Majnemer2015-11-061-0/+6
| | | | | | | | We tried to insert a cast of a phi in a block whose terminator is an EHPad. This is invalid. Do not attempt the transform in these circumstances. llvm-svn: 252370
* Add 'notail' marker for call instructions.Akira Hatanaka2015-11-067-3/+13
| | | | | | | | | | | | This marker prevents optimization passes from adding 'tail' or 'musttail' markers to a call. Is is used to prevent tail call optimization from being performed on the call. rdar://problem/22667622 Differential Revision: http://reviews.llvm.org/D12923 llvm-svn: 252368
* Revert r252366: [Support] Use GetTempDir to get the temporary dir path on ↵Pawel Bylica2015-11-061-10/+37
| | | | | | Windows. llvm-svn: 252367
* [Support] Use GetTempDir to get the temporary dir path on Windows.Pawel Bylica2015-11-061-37/+10
| | | | | | | | | | | | | | | Summary: In general GetTempDir follows the same logic as the replaced code: checks env variables TMP, TEMP, USERPROFILE in order. However, it also perform other checks like making separators native (\), making the path absolute, etc. This change fixes FileSystemTest.CreateDir unittest that had been failing when run from Unix-like shell on Windows (Unix-like path separator (/) used in env variables). Reviewers: chapuni, rafael, aaron.ballman Subscribers: rafael, llvm-commits Differential Revision: http://reviews.llvm.org/D14231 llvm-svn: 252366
* [AArch64][FastISel] Don't even try to select vector icmps.Ahmed Bougacha2015-11-061-0/+4
| | | | | | | | | | | | We used to try to constant-fold them to i32 immediates. Given that fast-isel doesn't otherwise support vNi1, when selecting the result users, we'd fallback to SDAG anyway. However, if the users were in another block, we'd insert broken cross-class copies (GPR32 to FPR64). Give up, let SDAG agree with itself on a vNi1 legalization strategy. llvm-svn: 252364
* [X86] Fold (trunc (i32 (zextload i16))) into vbroadcast.Ahmed Bougacha2015-11-061-0/+6
| | | | | | | | | | | When matching non-LSB-extracting truncating broadcasts, we now insert the necessary SRL. If the scalar resulted from a load, the SRL will be folded into it, creating a narrower, offset, load. However, i16 loads aren't Desirable, so we get i16->i32 zextloads. We already catch i16 aextloads; catch these as well. llvm-svn: 252363
* [X86] SRL non-LSB extracts when folding to truncating broadcasts.Ahmed Bougacha2015-11-061-4/+9
| | | | | | | | | | | | Now that we recognize this, we can support it instead of bailing out. That is, we can fold: (v8i16 (shufflevector (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))), <1,1,...,1>)) into: (v8i16 (vbroadcast (i16 (trunc (srl Y, 16))))) llvm-svn: 252362
* [X86] Don't fold non-LSB extracts into truncating broadcasts.Ahmed Bougacha2015-11-061-12/+52
| | | | | | | | | | | | | | | We used to incorrectly assume that the offset we're extracting from was a multiple of the element size. So, we'd fold: (v8i16 (shufflevector (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))), <1,1,...,1>)) into: (v8i16 (vbroadcast (i16 (trunc Y)))) whereas we should have extracted the higher bits from X. Instead, bail out if the assumption doesn't hold. llvm-svn: 252361
* DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> ↵Tom Stellard2015-11-061-1/+2
| | | | | | | | | | | | extload Reviewers: resistor, arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13805 llvm-svn: 252349
* [InstCombine] Don't RAUW tokens with undefDavid Majnemer2015-11-061-2/+3
| | | | | | Let SimplifyCFG remove unreachable BBs which define token instructions. llvm-svn: 252343
* [SimplifyLibCalls] Don't hardcode the function name.Davide Italiano2015-11-061-1/+2
| | | | llvm-svn: 252342
* [ShrinkWrapping] Teach shrink-wrapping how to analyze RegMask.Quentin Colombet2015-11-061-8/+38
| | | | | | | Previously we were conservatively assuming that RegMask operands clobber callee saved registers. llvm-svn: 252341
* MachineScheduler: Add regpressure information to debug dumpMatthias Braun2015-11-063-9/+36
| | | | llvm-svn: 252340
* AMDGPU/SI: Refactor VOP[12C] tablegen definitionsTom Stellard2015-11-062-97/+75
| | | | | | | | | | | | | | Summary: Pass the VOPProfile object all the through to *_m multiclasses. This will allow us to do more simplifications in the future. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13437 llvm-svn: 252339
* Fix SLPVectorizer commutativity reorderingMehdi Amini2015-11-061-76/+69
| | | | | | | | | | | | | | | | | | | The SLPVectorizer had a very crude way of trying to benefit from associativity: it tried to optimize for splat/broadcast or in order to have the same operator on the same side. This is benefitial to the cost model and allows more vectorization to occur. This patch improve the logic and make the detection optimal (locally, we don't look at the full tree but only at the immediate children). Should fix https://llvm.org/bugs/show_bug.cgi?id=25247 Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D13996 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 252337
* Improved the operands commute transformation for X86-FMA3 instructions.Andrew Kaylor2015-11-063-80/+411
| | | | | | | | | | | | All 3 operands of FMA3 instructions are commutable now. Patch by Slava Klochkov Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab). Differential Revision: http://reviews.llvm.org/D13269 llvm-svn: 252335
* [WebAssembly] Make expression-stack pushing explicitDan Gohman2015-11-061-7/+19
| | | | | | | | | Modelling of the expression stack is evolving. This patch takes another step by making pushes explicit. Differential Revision: http://reviews.llvm.org/D14338 llvm-svn: 252334
* [ValueTracking] Add parameters to isImpliedCondition; NFCSanjoy Das2015-11-064-13/+29
| | | | | | | | | | | | | | | | Summary: This change makes the `isImpliedCondition` interface similar to the rest of the functions in ValueTracking (in that it takes a DataLayout, AssumptionCache etc.). This is an NFC, intended to make a later diff less noisy. Depends on D14369 Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14391 llvm-svn: 252333
* [ValueTracking] De-pessimize isImpliedCondition around unsigned comparesSanjoy Das2015-11-061-4/+4
| | | | | | | | | | | | | | | Summary: Currently `isImpliedCondition` will optimize "I +_nuw C < L ==> I < L" only if C is positive. This is an unnecessary restriction -- the implication holds even if `C` is negative. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14369 llvm-svn: 252332
* [ValueTracking] Add a framework for encoding implication rulesSanjoy Das2015-11-061-21/+67
| | | | | | | | | | | | | | | | | | | | | Summary: This change adds a framework for adding more smarts to `isImpliedCondition` around inequalities. Informally, `isImpliedCondition` will now try to prove "A < B ==> C < D" by proving "C <= A && B <= D", since then it follows "C <= A < B <= D". While this change is in principle NFC, I could not think of a way to not handle cases like "i +_nsw 1 < L ==> i < L +_nsw 1" (that ValueTracking did not handle before) while keeping the change understandable. I've added tests for these cases. Reviewers: reames, majnemer, hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14368 llvm-svn: 252331
* AMDGPU: Cleanup includesMatt Arsenault2015-11-062-6/+4
| | | | llvm-svn: 252328
* AMDGPU: Create emergency stack slots during frame loweringMatt Arsenault2015-11-067-14/+89
| | | | | | Test has a bogus verifier error which will be fixed by later commits. llvm-svn: 252327
* AMDGPU: Remove unused scratch resource operandsMatt Arsenault2015-11-062-75/+131
| | | | | | The SGPR spill pseudos don't actually use them. llvm-svn: 252324
* AMDGPU: Add pass to detect used kernel featuresMatt Arsenault2015-11-064-0/+138
| | | | | | | | | | | Mark kernels that use certain features that require user SGPRs to support with kernel attributes. We need to know before instruction selection begins because it impacts the kernel calling convention lowering. For now this only detects the workitem intrinsics. llvm-svn: 252323
* AMDGPU: Fix hardcoded alignment of spill.Matt Arsenault2015-11-062-13/+12
| | | | | | | Instead of forcing 4 alignment when spilled, set register class alignments. llvm-svn: 252322
* AMDGPU: Hack for VS_32 register pressureMatt Arsenault2015-11-062-4/+17
| | | | | | | | | | | | | For some reason VS_32 ends up factoring into the pressure heuristics even though we should never see a virtual register with this class. When SGPRs are reserved for register spilling, this for some reason triggers reg-crit scheduling. Setting isAllocatable = 0 may help with this since that seems to remove it from the default implementation's generated table. llvm-svn: 252321
* Restore "Move metadata linking after lazy global materialization/linking."Teresa Johnson2015-11-061-1/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This reverts commit r251965. Restore "Move metadata linking after lazy global materialization/linking." This restores commit r251926, with fixes for the LTO bootstrapping bot failure. The bot failure was caused by references from debug metadata to otherwise unreferenced globals. Previously, this caused the lazy linking to link in their defs, which is unnecessary. With this patch, because lazy linking is complete when we encounter the metadata reference, the materializer created a declaration. For definitions such as aliases and comdats, it is illegal to have a declaration. Furthermore, metadata linking should not change code generation. Therefore, when linking of global value bodies is complete, the materializer will simply return nullptr as the new reference for the linked metadata. This change required fixing a different test to ensure there was a real reference to a linkonce global that was only being reference from metadata. Note that the new changes to the only-needed-named-metadata.ll test illustrate an issue with llvm-link -only-needed handling of comdat groups, whereby it may result in an incomplete comdat group. I note this in the test comments, but the issue is orthogonal to this patch (it can be reproduced without any metadata at head). Reviewers: dexonsmith, rafael, tra Subscribers: tobiasvk, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D14447 llvm-svn: 252320
* Restore "Move metadata linking after lazy global materialization/linking."Teresa Johnson2015-11-061-9/+9
| | | | | | This reverts commit r251965. llvm-svn: 252319
* [WinEH] Mark funclet entries and exits as clobbering all registersReid Kleckner2015-11-065-3/+31
| | | | | | | | | | | | | | | | | Summary: In this implementation, LiveIntervalAnalysis invents a few register masks on basic block boundaries that preserve no registers. The nice thing about this is that it prevents the prologue inserter from thinking it needs to spill all XMM CSRs, because it doesn't see any explicit physreg defs in the MI. Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer Subscribers: MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D14407 llvm-svn: 252318
* [LIR] Simplify code by making DataLayout globally accessible. NFC.Chad Rosier2015-11-061-11/+10
| | | | llvm-svn: 252317
* [AArch64]Enable the narrow ld promotion only on profitable microarchitecturesJun Bum Lim2015-11-061-8/+22
| | | | | | | | | The benefit from converting narrow loads into a wider load (r251438) could be micro-architecturally dependent, as it assumes that a single load with two bitfield extracts is cheaper than two narrow loads. Currently, this conversion is enabled only in cortex-a57 on which performance benefits were verified. llvm-svn: 252316
* Bring r252305 back with a test fix.Rafael Espindola2015-11-062-24/+20
| | | | | | | | | | We now create the .eh_frame section early, just like every other special section. This means that the special flags are visible in code that explicitly asks for ".eh_frame". llvm-svn: 252313
* Revert "Simplify the creation of .eh_frame/.debug_frame sections."Rafael Espindola2015-11-061-18/+24
| | | | | | | | This reverts commit r252305. Investigating a test failure. llvm-svn: 252306
* Simplify the creation of .eh_frame/.debug_frame sections.Rafael Espindola2015-11-061-24/+18
| | | | llvm-svn: 252305
* git clang-format and fix variable names. NFC.Rafael Espindola2015-11-061-42/+34
| | | | llvm-svn: 252304
* Use SHT_X86_64_UNWIND on every OS.Rafael Espindola2015-11-061-7/+5
| | | | | | | That is the ABI required type. Linkers still check the section name, so everything should still work. llvm-svn: 252300
* Pass SectionStart directly to the one function that uses it.Rafael Espindola2015-11-061-8/+5
| | | | llvm-svn: 252299
* [mips][ias] Range check uimm4 operands and fixed a bug this revealed.Daniel Sanders2015-11-063-14/+25
| | | | | | | | | | | | | | | Summary: The bug was that the sldi instructions have immediate widths dependant on their element size. So sldi.d has a 1-bit immediate and sldi.b has a 4-bit immediate. All of these were using 4-bit immediates previously. Reviewers: vkalintiris Subscribers: llvm-commits, atanasyan, dsanders Differential Revision: http://reviews.llvm.org/D14018 llvm-svn: 252297
* [mips][ias] Range check uimm3 operands.Daniel Sanders2015-11-062-7/+8
| | | | | | | | | | | | Summary: Reviewers: vkalintiris Subscribers: atanasyan, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14016 llvm-svn: 252296
* [mips][ias] Range check uimm2 operands and fix a bug this revealed.Daniel Sanders2015-11-069-46/+64
| | | | | | | | | | | | | | | Summary: The bug was that the MIPS32R6/MIPS64R6/microMIPS32R6 versions of LSA and DLSA (unlike the MSA version) failed to account for the off-by-one encoding of the immediate. The range is actually 1..4 rather than 0..3. Reviewers: vkalintiris Subscribers: atanasyan, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14015 llvm-svn: 252295
* [mips][ias] Range check uimmz operands.Daniel Sanders2015-11-062-2/+34
| | | | | | | | | | Reviewers: vkalintiris Subscribers: dsanders, atanasyan, llvm-commits Differential Revision: http://reviews.llvm.org/D14013 llvm-svn: 252294
* [mips] Define patterns for the atomic_{load,store}_{8,16,32,64} nodes.Vasileios Kalintiris2015-11-063-4/+26
| | | | | | | | | | | | | | | | Summary: Without these patterns we would generate a complete LL/SC sequence. This would be problematic for memory regions marked as WRITE-only or READ-only, as the instructions LL/SC would read/write to the protected memory regions correspondingly. Reviewers: dsanders Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D14397 llvm-svn: 252293
* AMDGPU/SI: Emit HSA kernels with symbol type STT_AMDGPU_HSA_KERNELTom Stellard2015-11-066-0/+60
| | | | | | | | | | Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13804 llvm-svn: 252291
OpenPOWER on IntegriCloud