summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* fix formatting; NFCSanjay Patel2015-07-211-2/+2
| | | | llvm-svn: 242796
* Revert 242726, it broke ASan on OS X.Nico Weber2015-07-211-40/+0
| | | | llvm-svn: 242792
* Constfold trunc,rint,nearbyint,ceil and floor using APFloatKarthik Bhat2015-07-211-4/+33
| | | | | | | | A patch by Chakshu Grover! This patch allows constfolding of trunc,rint,nearbyint,ceil and floor intrinsics using APFloat class. Differential Revision: http://reviews.llvm.org/D11144 llvm-svn: 242763
* AVX512 : Implemented VPMADDUBSW and VPMADDWD instruction , Igor Breger2015-07-215-1/+38
| | | | | | | | Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11351 llvm-svn: 242761
* [ARM] Define subtarget feature "reserve-r9", which is used to decideAkira Hatanaka2015-07-213-13/+12
| | | | | | | | | | | | | | | | | | | | whether register r9 should be reserved. This recommits r242737, which broke bots because the number of subtarget features went over the limit of 64. This change is needed because we cannot use a backend option to set cl::opt "arm-reserve-r9" when doing LTO. Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to reserve r9 should make changes to add subtarget feature "reserve-r9" to the IR. rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11320 llvm-svn: 242756
* [RewriteStatepointsForGC] Minor code cleanup [NFC]Philip Reames2015-07-211-21/+5
| | | | | | We can use builders to simplify part of the code and we only check for the existance of the metadata value; this enables us to delete some redundant code. llvm-svn: 242751
* AMDGPU: Set isMoveImm on s_movk_i32Matt Arsenault2015-07-211-1/+1
| | | | llvm-svn: 242747
* ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common codeMatthias Braun2015-07-211-166/+186
| | | | | | | | | | | | | | | Re-apply of r241928 which had to be reverted because of the r241926 revert. This commit factors out common code from MergeBaseUpdateLoadStore() and MergeBaseUpdateLSMultiple() and introduces a new function MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a strd/ldrd instruction into an strd/ldrd instruction with writeback where possible. Differential Revision: http://reviews.llvm.org/D10676 llvm-svn: 242743
* ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2Matthias Braun2015-07-211-31/+102
| | | | | | | | | | Re-apply r241926 with an additional check that r13 and r15 are not used for LDRD/STRD. See http://llvm.org/PR24190. This also already includes the fix from r241951. Differential Revision: http://reviews.llvm.org/D10623 llvm-svn: 242742
* Revert r242737.Akira Hatanaka2015-07-203-12/+13
| | | | | | | | This caused builds to fail with the following error message: error:Too many subtarget features! Bump MAX_SUBTARGET_FEATURES. llvm-svn: 242740
* [ARM] Define subtarget feature "reserve-r9", which is used to decideAkira Hatanaka2015-07-203-13/+12
| | | | | | | | | | | | | | | | | whether register r9 should be reserved. This change is needed because we cannot use a backend option to set cl::opt "arm-reserve-r9" when doing LTO. Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to reserve r9 should make changes to add subtarget feature "reserve-r9" to the IR. rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11320 llvm-svn: 242737
* Revert "ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2"Matthias Braun2015-07-201-96/+29
| | | | | | This reverts commit r241926. This caused http://llvm.org/PR24190 llvm-svn: 242735
* Revert "ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code"Matthias Braun2015-07-201-186/+166
| | | | | | This reverts commit r241928. This caused http://llvm.org/PR24190 llvm-svn: 242734
* Revert "ARM: Use SpecificBumpPtrAllocator to fix leak introduced in r241920"Matthias Braun2015-07-201-3/+3
| | | | | | This reverts commit r241951. It caused http://llvm.org/PR24190 llvm-svn: 242733
* AArch64: Restrict macroop fusion heuristics to cyclone.Matthias Braun2015-07-201-31/+33
| | | | | | | Even though this is just some hinting for the scheduler it doesn't make sense to do that unless you know the target can perform the fusion. llvm-svn: 242732
* Targets: commonize some stack realignment codeJF Bastien2015-07-2016-165/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch does the following: * Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`. * Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`, by default only looks for the `no-realign-stack` function attribute. Multiple targets duplicated the same `needsStackRealignment` code: - Aarch64. - ARM. - Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has. - PowerPC. - WebAssembly. - x86 almost: has an extra `-force-align-stack` option, which the default implementation now has. The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects: - AMDGPU - BPF - CppBackend - MSP430 - NVPTX - Sparc - SystemZ - XCore - Out-of-tree targets This is a breaking change! `make check` passes. The only implementation of the `virtual` function (besides the slight different in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation. `needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone. Reviewers: sunfish Subscribers: aemerson, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11160 llvm-svn: 242727
* Don't try to instrument allocas used by outlined SEH funcletsReid Kleckner2015-07-201-0/+40
| | | | | | | | | | | | | | | | | | | Summary: Arguments to llvm.localescape must be static allocas. They must be at some statically known offset from the frame or stack pointer so that other functions can access them with localrecover. If we ever want to instrument these, we can use more indirection to recover the addresses of these local variables. We can do it during clang irgen or with the asan module pass. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11307 llvm-svn: 242726
* AArch64: Add aditional Cyclone macroop fusion opportunitiesMatthias Braun2015-07-201-16/+34
| | | | | | | | Related to rdar://19205407 Differential Revision: http://reviews.llvm.org/D10746 llvm-svn: 242724
* MachineScheduler: Restrict macroop fusion to data-dependent instructions.Matthias Braun2015-07-201-9/+33
| | | | | | | | | | | | | | Before creating a schedule edge to encourage MacroOpFusion check that: - The predecessor actually writes a register that the branch reads. - The predecessor has no successors in the ScheduleDAG so we can schedule it in front of the branch. This avoids skewing the scheduling heuristic in cases where macroop fusion cannot happen. Differential Revision: http://reviews.llvm.org/D10745 llvm-svn: 242723
* Fix comment typo (test commit). NFCGeoff Berry2015-07-201-1/+1
| | | | llvm-svn: 242719
* [ARM] Refactor the prologue/epilogue emission to be more robust.Quentin Colombet2015-07-204-118/+219
| | | | | | | | | | | | | | | | This is the first step toward supporting shrink-wrapping for this target. The changes could be summarized by these items: - Expand the tail-call return as part of the expand pseudo pass. - Get rid of the assumptions that the epilogue is the exit block: * Do not assume which registers are free in the epilogue. (This indirectly improve the lowering of the code for the segmented stacks, see the test cases.) * Take into account that the basic block can be empty. Related to <rdar://problem/20821730> llvm-svn: 242714
* [NVPTX] make load on global readonly memory to use ldgJingyue Wu2015-07-201-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: [NVPTX] make load on global readonly memory to use ldg Summary: As describe in [1], ld.global.nc may be used to load memory by nvcc when __restrict__ is used and compiler can detect whether read-only data cache is safe to use. This patch will try to check whether ldg is safe to use and use them to replace ld.global when possible. This change can improve the performance by 18~29% on affected kernels (ratt*_kernel and rwdot*_kernel) in S3D benchmark of shoc [2]. Patched by Xuetian Weng. [1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache [2] https://github.com/vetter/shoc Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll Reviewers: jholewinski, jingyue Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D11314 llvm-svn: 242713
* [Hexagon] Generate MUX from conditional transfers when dot-new not possibleKrzysztof Parzyszek2015-07-203-0/+328
| | | | llvm-svn: 242711
* MIR Serialization: Initial serialization of machine constant pools.Alex Lorenz2015-07-206-1/+89
| | | | | | | | | | This commit implements the initial serialization of machine constant pools and the constant pool index machine operands. The constant pool is serialized using a YAML sequence of YAML mappings that represent the constant values. The target-specific constant pool items aren't serialized by this commit. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242707
* [ImplicitNullChecks] Work with implicit defs.Sanjoy Das2015-07-202-7/+15
| | | | | | | | | | | | | | | Summary: This change generalizes the implicit null checks pass to work with instructions that don't have any explicit register defs. This lets us use X86's `cmp` against memory as faulting load instructions. Reviewers: reames, JosephTremoulet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11286 llvm-svn: 242703
* MIR Parser: Add support for quoted named global value operands.Alex Lorenz2015-07-203-7/+102
| | | | | | | | | | This commit extends the machine instruction lexer and implements support for the quoted global value tokens. With this change the syntax for the global value identifier tokens becomes identical to the syntax for the global identifier tokens from the LLVM's assembly language. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242702
* [AArch64] Change EON pattern to match more often.Chad Rosier2015-07-201-1/+1
| | | | | | | Phabricator: http://reviews.llvm.org/D11359 Patch by Geoff Berry <gberry@codeaurora.org> llvm-svn: 242694
* AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory opsTom Stellard2015-07-204-6/+70
| | | | | | | | | | | | | | Summary: The MUBUF addr64 bit has been removed on VI, so we must use FLAT instructions when the pointer is stored in VGPRs. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11067 llvm-svn: 242673
* [mips] Added support for the ERETNC instruction.Vasileios Kalintiris2015-07-203-6/+10
| | | | | | | | | | | | | | Summary: This required adding the instruction predicate HasMips32r5. Patch by Scott Egerton. Reviewers: dsanders, vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11136 llvm-svn: 242666
* Revert "MergeFuncs: Transfer the function parameter attributes to the call site"Arnold Schwaighofer2015-07-191-1/+0
| | | | | | | | It is okay to not transfer parameter attributes. This reverts commit r242558. llvm-svn: 242646
* Narrow Callee scope, suggestion from David Blaikie.Yaron Keren2015-07-191-3/+3
| | | | llvm-svn: 242644
* [X86][SSE] Reordered cast vectorization costs. NFCI.Simon Pilgrim2015-07-191-47/+48
| | | | | | Reordered the data tables at the top and placed the lookups after. The first stage in the yak shaving necessary to get more accurate costs for a variety of targets given the recent improvements to SINT_TO_FP/UINT_TO_FP/SIGN_EXTEND vector lowering. llvm-svn: 242643
* De-duplicate CS.getCalledFunction() expression.Yaron Keren2015-07-191-1/+2
| | | | | | | Not sure if the optimizer will save the call as getCalledFunction() is not a trivial access function but the code is clearer this way. llvm-svn: 242641
* [DAGCombiner] Fixed minor typo that was missed in D9097.Simon Pilgrim2015-07-191-2/+2
| | | | | | We don't bitcast the UNDEFs - that is done in visitVECTOR_SHUFFLE, and the getValueType should come from the operand's SDValue not the SDNode. llvm-svn: 242640
* [X86] Add support for tbyte memory operand size for Intel-syntax x86 assemblyMichael Kuperstein2015-07-191-0/+1
| | | | | | | Differential Revision: http://reviews.llvm.org/D11257 Patch by: marina.yatsina@intel.com llvm-svn: 242639
* Remove TargetInstrInfo::canFoldMemoryOperandSimon Pilgrim2015-07-195-73/+0
| | | | | | | | | | canFoldMemoryOperand is not actually used anywhere in the codebase - all existing users instead call foldMemoryOperand directly when they wish to fold and can correctly deduce what they need from the return value. This patch removes the canFoldMemoryOperand base function and the target implementations; only x86 had a real (bit-rotted) implementation, although AMDGPU had a preparatory stub that had never needed to be completed. Differential Revision: http://reviews.llvm.org/D11331 llvm-svn: 242638
* AVX-512: Floating point conversions for SKX - DAG Lowering.Elena Demikhovsky2015-07-192-7/+36
| | | | | | | | | | SKX supports conversion for all FP types. Integer types include doublewords and quardwords. I added "Legal" status for these nodes and a bunch of tests. I added "NoVLX" for AVX DAG selection to force VLX instructions selection when VLX is supported. Differential Revision: http://reviews.llvm.org/D11255 llvm-svn: 242637
* Use SDValue bool check. NFCI.Simon Pilgrim2015-07-191-58/+39
| | | | llvm-svn: 242636
* [X86][SSE] Updated SHL/LSHR i64 vectorization costs.Simon Pilgrim2015-07-181-3/+3
| | | | | | This was missed in D8416. llvm-svn: 242621
* [AggressiveAntiDepBreaker] Use range loops for multimap access.Benjamin Kramer2015-07-181-23/+8
| | | | | | No functionality change intended. llvm-svn: 242620
* Rangify for loops in GlobalDCE, NFC.Yaron Keren2015-07-181-55/+49
| | | | llvm-svn: 242619
* [Hexagon] Use composition instead of inheritance from STL typesBenjamin Kramer2015-07-184-50/+27
| | | | | | | | The standard containers are not designed to be inherited from, as illustrated by the MSVC hacks for NodeOrdering. No functional change intended. llvm-svn: 242616
* [PM/AA] Remove the addEscapingUse update API that won't be easy toChandler Carruth2015-07-185-56/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | directly model in the new PM. This also was an incredibly brittle and expensive update API that was never fully utilized by all the passes that claimed to preserve AA, nor could it reasonably have been extended to all of them. Any number of places add uses of values. If we ever wanted to reliably instrument this, we would want a callback hook much like we have with ValueHandles, but doing this for every use addition seems *extremely* expensive in terms of compile time. The only user of this update mechanism is GlobalsModRef. The idea of using this to keep it up to date doesn't really work anyways as its analysis requires a symmetric analysis of two different memory locations. It would be very hard to make updates be sufficiently rigorous to *guarantee* symmetric analysis in this way, and it pretty certainly isn't true today. However, folks have been using GMR with this update for a long time and seem to not be hitting the issues. The reported issue that the update hook fixes isn't even a problem any more as other changes to GetUnderlyingObject worked around it, and that issue stemmed from *many* years ago. As a consequence, a prior patch provided a flag to control the unsafe behavior of GMR, and this patch removes the update mechanism that has questionable compile-time tradeoffs and is causing problems with moving to the new pass manager. Note the lack of test updates -- not one test in tree actually requires this update, even for a contrived case. All of this was extensively discussed on the dev list, this patch will just enact what that discussion decides on. I'm sending it for review in part to show what I'm planning, and in part to show the *amazing* amount of work this avoids. Every call to the AA here is something like three to six indirect function calls, which in the non-LTO pipeline never do any work! =[ Differential Revision: http://reviews.llvm.org/D11214 llvm-svn: 242605
* [libFuzzer] require the files and directories passed to the fuzzer to existKostya Serebryany2015-07-181-2/+8
| | | | llvm-svn: 242596
* [asan] Fix shadow mapping on Android/AArch64.Evgeniy Stepanov2015-07-171-4/+6
| | | | | | | | | Instrumentation and the runtime library were in disagreement about ASan shadow offset on Android/AArch64. This fixes a large number of existing tests on Android/AArch64. llvm-svn: 242595
* ARM: Enable MachineScheduler and disable PostRAScheduler for swift.Matthias Braun2015-07-173-1038/+14
| | | | | | | | | | | | | | | | | | | | | | | Reapply r242500 now that the swift schedmodel includes LDRLIT. This is mostly done to disable the PostRAScheduler which optimizes for instruction latencies which isn't a good fit for out-of-order architectures. This also allows to leave out the itinerary table in swift in favor of the SchedModel ones. This change leads to performance improvements/regressions by as much as 10% in some benchmarks, in fact we loose 0.4% performance over the llvm-testsuite for reasons that appear to be unknown or out of the compilers control. rdar://20803802 documents the investigation of these effects. While it is probably a good idea to perform the same switch for the other ARM out-of-order CPUs, I limited this change to swift as I cannot perform the benchmark verification on the other CPUs. Differential Revision: http://reviews.llvm.org/D10513 llvm-svn: 242588
* ARM: Add scheduling information for LDRLIT instructions to swift scheduling ↵Matthias Braun2015-07-171-0/+7
| | | | | | | | | | | model These pseudo instructions are only lowered after register allocation and are therefore still present when the machine scheduler runs. Add a run: line to a testcase that uses the uncommon flags necessary to actually produce a LDRLIT instruction on swift. llvm-svn: 242587
* [RAGreedy] Add an experimental deferred spilling feature.Quentin Colombet2015-07-171-6/+37
| | | | | | | | | | | | | | | | The idea of deferred spilling is to delay the insertion of spill code until the very end of the allocation. A "candidate" to spill variable might not required to be spilled because of other evictions that happened after this decision was taken. The spirit is similar to the optimistic coloring strategy implemented in Preston and Briggs graph coloring algorithm. For now, this feature is highly experimental. Although correct, it would require much more modification to properly model the effect of spilling. Anyway, this early patch helps prototyping this feature. Note: The test case cannot unfortunately be reduced and is probably fragile. llvm-svn: 242585
* MIR Parser: Allow the dollar characters in all of the identifier tokens.Alex Lorenz2015-07-171-1/+4
| | | | | | | | | | This commit modifies the machine instruction lexer so that it now accepts the '$' characters in identifier tokens. This change makes the syntax for unquoted global value tokens consistent with the syntax for the global idenfitier tokens in the LLVM's assembly language. llvm-svn: 242584
* AsmParser: Add a function to parse a standalone constant value.Alex Lorenz2015-07-173-0/+50
| | | | | | | | | | | | | | This commit extends the interface provided by the AsmParser library by adding a function that allows the user to parse a standalone contant value. This change is useful for MIR serialization, as it will allow the MIR Parser to parse the constant values in a machine constant pool. Reviewers: Duncan P. N. Exon Smith Differential Revision: http://reviews.llvm.org/D10280 llvm-svn: 242579
OpenPOWER on IntegriCloud