path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [AArch64] Enable FeatureFuseAES on Cortex-A72.
  Author: Florian Hahn; Age: 2017-05-15; Files: 1; Lines: -0/+1
  This patch enables fusing dependent AESE/AESMC and AESD/AESIMC instruction pairs on Cortex-A72, as recommended in the Software Optimization Guide, section 4.10.
  llvm-svn: 303073
* [AMDGPU][MC] Corrected several VI opcodes to avoid printing _e64
  Author: Dmitry Preobrazhensky; Age: 2017-05-15; Files: 1; Lines: -11/+22
  See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936
  Reviewers: artem.tamazov, vpykhtin
  Differential Revision: https://reviews.llvm.org/D33123
  llvm-svn: 303070
* [AMDGPU][MC] Removed V_MQSAD_U16_U8
  Author: Dmitry Preobrazhensky; Age: 2017-05-15; Files: 1; Lines: -3/+0
  This instruction does not really exist.
  See Bug 33018: https://bugs.llvm.org//show_bug.cgi?id=33018
  Reviewers: vpykhtin, artem.tamazov
  Differential Revision: https://reviews.llvm.org/D33126
  llvm-svn: 303055
* [ARM] Mark LEApcrel instructions as isAsCheapAsAMove
  Author: John Brawn; Age: 2017-05-15; Files: 3; Lines: -3/+3
  Doing this means that if an LEApcrel is used in two places we will rematerialize instead of generating two MOVs. This is particularly useful for printfs using the same format string, where we want to generate an address into a register that's going to get corrupted by the call.
  Differential Revision: https://reviews.llvm.org/D32858
  llvm-svn: 303054
* [ARM] Mark LEApcrel as not having side effects
  Author: John Brawn; Age: 2017-05-15; Files: 1; Lines: -2/+2
  Doing this lets us hoist it out of loops. I've also marked it as rematerializable, the same as the thumb1 and thumb2 counterparts.
  Marking LEApcrel as having side effects looks like it was just a mistake: the commit that made that change mentions only LEApcrelJT, and in thumb1 and thumb2 only the LEApcrelJT instructions were marked as having side effects. So it looks like the intent was to mark only LEApcrelJT, and LEApcrel was accidentally marked as well.
  Differential Revision: https://reviews.llvm.org/D32857
  llvm-svn: 303053
* [DWARF] - Speed up handling of relocations in DWARFContextInMemory.
  Author: George Rimar; Age: 2017-05-15; Files: 1; Lines: -4/+17
  I am working on a speedup of building .gdb_index in LLD and noticed that relocations that are processed in DWARFContextInMemory often use the same symbol in a row. This patch introduces caching to reduce the relocation processing time.
  For the benchmark, I took debug LLC binary objects configured with -ggnu-pubnames and linked them using LLD.
  Link time without --gdb-index is about 4.45s.
  Link time with --gdb-index:
  a) Without patch: 19.16s
  b) With patch: 15.52s
  That means time spent on --gdb-index in this configuration is 19.16s - 4.45s = 14.71s (without patch) vs 15.52s - 4.45s = 11.07s (with patch).
  Differential revision: https://reviews.llvm.org/D31136
  llvm-svn: 303051
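The caching idea described in the [DWARF] entry above can be sketched roughly as follows. This is only an illustration and not the code from D31136: `SymInfo`, `SymInfoCache`, and `demoMisses` are hypothetical names, and the resolver lambda is a stand-in for whatever per-symbol work the real relocation handling performs. The point is that a run of relocations against the same symbol pays for the lookup once.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical result of the expensive per-symbol work.
struct SymInfo {
  std::uint64_t Address;
};

// Memoizes symbol lookups so repeated relocations against one symbol are cheap.
class SymInfoCache {
public:
  explicit SymInfoCache(std::function<SymInfo(std::uint32_t)> Resolver)
      : Resolver(std::move(Resolver)) {}

  const SymInfo &get(std::uint32_t SymIndex) {
    auto It = Cache.find(SymIndex);
    if (It == Cache.end()) {
      ++Misses; // only a cache miss pays for the expensive resolution
      It = Cache.emplace(SymIndex, Resolver(SymIndex)).first;
    }
    return It->second;
  }

  unsigned misses() const { return Misses; }

private:
  std::function<SymInfo(std::uint32_t)> Resolver;
  std::map<std::uint32_t, SymInfo> Cache;
  unsigned Misses = 0;
};

// Five relocations that reference only two distinct symbols.
unsigned demoMisses() {
  SymInfoCache Cache([](std::uint32_t Index) {
    SymInfo Info;
    Info.Address = 0x1000 + 16 * static_cast<std::uint64_t>(Index);
    return Info;
  });
  const std::uint32_t RelocSymbols[] = {3, 3, 3, 7, 7};
  for (std::uint32_t Sym : RelocSymbols)
    Cache.get(Sym);
  return Cache.misses();
}
```

In this sketch, `demoMisses()` resolves each distinct symbol once, so the five relocations cost two resolutions rather than five.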
* [X86] Relocate code of replacement of subtarget unsupported masked memory intrinsics to run also on -O0 option.
  Author: Ayman Musa; Age: 2017-05-15; Files: 5; Lines: -546/+667
  Currently, when masked load, store, gather or scatter intrinsics are used, we check in the CodeGenPrepare pass whether the subtarget supports these intrinsics; if not, we replace them with scalar code. This is a functional transformation, not an optimization (it is not optional).
  The CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0).
  Functional transformations should run at all optimization levels, so here I created a new pass which runs at all optimization levels and does no more than this transformation.
  Differential Revision: https://reviews.llvm.org/D32487
  llvm-svn: 303050
* [NVPTX] Don't rely on default arguments to SelectionDAG::getMemIntrinsicNode. NFC.
  Author: Simon Pilgrim; Age: 2017-05-15; Files: 1; Lines: -3/+8
  NFC followup to D33147, this explicitly sets all the arguments (instead of relying on the defaults) to SelectionDAG::getMemIntrinsicNode to help identify -verify-machineinstrs issues.
  llvm-svn: 303047
* [RegisterBankInfo] Remove overly-aggressive asserts
  Author: Tom Stellard; Age: 2017-05-15; Files: 1; Lines: -4/+5
  Summary: We were asserting in RegisterBankInfo if RBI.copyCost() returns UINT_MAX. This is OK for RegBankSelect::Mode::Fast since we only try one instruction mapping and can't recover from this, but for RegBankSelect::Mode::Greedy we will be considering multiple instruction mappings, so we can recover if we see a UINT_MAX copy cost.
  The copy cost for one pair of register banks in the AMDGPU backend will be UINT_MAX, so this patch will prevent AMDGPU tests from breaking.
  Reviewers: ab, qcolombet, t.p.northover, dsanders
  Reviewed By: qcolombet
  Subscribers: tpr, llvm-commits
  Differential Revision: https://reviews.llvm.org/D33144
  llvm-svn: 303043
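To illustrate why a UINT_MAX copy cost is recoverable when several mappings are considered, here is a minimal sketch. It is not the RegBankSelect implementation: `pickCheapestMapping` is a hypothetical helper, and the costs are plain integers rather than real mapping costs. A candidate whose copy cost is UINT_MAX is simply skipped instead of triggering an assert.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Returns the index of the cheapest usable mapping, or -1 if every candidate
// has an impossible (UINT_MAX) copy cost. Hypothetical helper for illustration.
int pickCheapestMapping(const std::vector<unsigned> &CopyCosts) {
  int Best = -1;
  for (int I = 0, E = static_cast<int>(CopyCosts.size()); I != E; ++I) {
    if (CopyCosts[I] == UINT_MAX)
      continue; // impossible copy: skip this candidate rather than asserting
    if (Best < 0 || CopyCosts[I] < CopyCosts[Best])
      Best = I;
  }
  return Best;
}
```

With a single candidate (the Fast-mode situation in the entry above) there is nothing to fall back on, which this sketch reports as -1.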
* MCObjectStreamer: fail with a diagnostic when emitting an out of range value.
  Author: Arnaud A. de Grandmaison; Age: 2017-05-15; Files: 1; Lines: -0/+5
  We were previously silently emitting bogus data in release mode, making it very hard to diagnose the error, or crashing with an assert in debug mode. A proper diagnostic is now always emitted when the value to be emitted is out of range.
  llvm-svn: 303041
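The entry above does not spell out the range rule, so the following is a sketch under an explicit assumption: a value is taken to be in range for an N-byte field if it fits either as a signed or as an unsigned N-byte integer. `fitsInBytes` is a hypothetical name, and the real check in MCObjectStreamer may differ.

```cpp
#include <cassert>
#include <cstdint>

// Assumed rule (not taken from the commit): Value is emittable in Size bytes
// if it is representable as either a signed or an unsigned Size-byte integer.
bool fitsInBytes(std::int64_t Value, unsigned Size) {
  if (Size == 0)
    return Value == 0;
  if (Size >= 8)
    return true; // any int64_t fits in 8 or more bytes
  const unsigned Bits = 8 * Size;
  const std::int64_t MinSigned = -(std::int64_t{1} << (Bits - 1));
  const std::int64_t MaxUnsigned = (std::int64_t{1} << Bits) - 1;
  return Value >= MinSigned && Value <= MaxUnsigned;
}
```

A caller would report a diagnostic when `fitsInBytes` returns false instead of truncating the value silently.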
* [ValueTracking] Replace all uses of ComputeSignBit with computeKnownBits.
  Author: Craig Topper; Age: 2017-05-15; Files: 11; Lines: -101/+74
  This patch finishes off the conversion of ComputeSignBit to computeKnownBits.
  Differential Revision: https://reviews.llvm.org/D33166
  llvm-svn: 303035
* Move some code into ScalarEvolution.cpp; NFC
  Author: Sanjoy Das; Age: 2017-05-15; Files: 1; Lines: -0/+24
  I need to add some asserts to these constructors that are easier to add once they're in the .cpp file.
  llvm-svn: 303032
* [InstCombine] Merge duplicate functionality between InstCombine and ValueTracking
  Author: Craig Topper; Age: 2017-05-15; Files: 4; Lines: -109/+92
  Summary: Merge overflow computation for signed add, appearing both in InstCombine and ValueTracking. As part of the merge, clean up the interface for overflow checks in InstCombine.
  Patch by Yoav Ben-Shalom.
  Reviewers: craig.topper, majnemer
  Reviewed By: craig.topper
  Subscribers: takuto.ikuta, llvm-commits
  Differential Revision: https://reviews.llvm.org/D32946
  llvm-svn: 303029
* [InstCombine] Remove 'return' of a called function that also returned void. NFC
  Author: Craig Topper; Age: 2017-05-15; Files: 1; Lines: -3/+2
  llvm-svn: 303028
* [X86] Utilize SelectionDAG::getSelect(). NFC.
  Author: Zvi Rackover; Age: 2017-05-14; Files: 1; Lines: -34/+27
  Replace SelectionDAG::getNode(ISD::SELECT, ...) and SelectionDAG::getNode(ISD::VSELECT, ...) with SelectionDAG::getSelect(...).
  Saves a few lines of code and in some cases saves the need to explicitly check the type of the desired node.
  llvm-svn: 303024
* [X86][AVX1] Account for cost of extract/insert of 256-bit shifts
  Author: Simon Pilgrim; Age: 2017-05-14; Files: 1; Lines: -49/+49
  llvm-svn: 303023
* [X86][AVX2] Fix costs for v4i64 ashr by splat
  Author: Simon Pilgrim; Age: 2017-05-14; Files: 1; Lines: -0/+5
  llvm-svn: 303022
* [X86][AVX1] Account for cost of extract/insert of 256-bit shifts by splat
  Author: Simon Pilgrim; Age: 2017-05-14; Files: 1; Lines: -12/+12
  llvm-svn: 303021
* [X86] Remove unused value from IntrinsicType enum. NFC
  Author: Craig Topper; Age: 2017-05-14; Files: 2; Lines: -7/+1
  llvm-svn: 303018
* [X86][AVX1] Account for cost of extract/insert of 256-bit SDIV/UDIV by mul sequences
  Author: Simon Pilgrim; Age: 2017-05-14; Files: 1; Lines: -17/+17
  llvm-svn: 303017
* Fix DynamicLibraryTest.cpp on FreeBSD and NetBSD
  Author: Dimitry Andric; Age: 2017-05-14; Files: 1; Lines: -6/+24
  Summary: After rL301562, on FreeBSD the DynamicLibrary unittests fail, because the test uses getMainExecutable("DynamicLibraryTests", Ptr), and since the path does not contain any slashes, retrieving the main executable will not work.
  Reimplement getMainExecutable() for FreeBSD and NetBSD using sysctl(3), which is more reliable than fiddling with relative or absolute paths.
  Also add retrieval of the original argv[] from the GoogleTest framework, to use as a fallback for other OSes.
  Reviewers: emaste, marsupial, hans, krytarowski
  Reviewed By: krytarowski
  Subscribers: krytarowski, llvm-commits
  Differential Revision: https://reviews.llvm.org/D33171
  llvm-svn: 303015
* [COFF] Gracefully handle empty .drectve sections
  Author: Shoaib Meenai; Age: 2017-05-14; Files: 1; Lines: -1/+1
  Running `llvm-readobj -coff-directives msvcrt.lib` resulted in this error:
    Invalid data was encountered while parsing the file
  This happened because some of the object files in the archive have empty `.drectve` sections. These empty sections result in a `parse_failed` error being returned from `COFFObjectFile::getSectionContents()`, which in turn caused `llvm-readobj` to stop.
  With this change, `getSectionContents` now returns success, and like before the resulting array is empty.
  Patch by Dave Lee.
  Differential Revision: https://reviews.llvm.org/D32652
  llvm-svn: 303014
* [X86][XOP] XOP's general v16i8 shifts will be used instead of v8i16 shift + mask.
  Author: Simon Pilgrim; Age: 2017-05-14; Files: 1; Lines: -3/+6
  Tweak cost model to match what lowering actually does.
  llvm-svn: 303013
* [X86][SSE] Account for cost of extract/insert of v32i8 vector shifts
  Author: Simon Pilgrim; Age: 2017-05-14; Files: 1; Lines: -3/+3
  llvm-svn: 303012
* [X86][XOP] Account for cost of extract/insert of 256-bit vector shifts
  Author: Simon Pilgrim; Age: 2017-05-14; Files: 1; Lines: -12/+12
  llvm-svn: 303010
* [X86][AVX] Allow 32-bit targets to peek through subvectors to extract constant splats for vXi64 shifts.
  Author: Simon Pilgrim; Age: 2017-05-14; Files: 1; Lines: -1/+10
  llvm-svn: 303009
* [InstSimplify] Add patterns for folding (A & B) | (~A ^ B) -> (~A ^ B) and its commuted variants.
  Author: Craig Topper; Age: 2017-05-14; Files: 1; Lines: -0/+18
  We already had (A & ~B) | (A ^ B), but we missed the cases where the not was part of the xor.
  llvm-svn: 303004
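The fold in the [InstSimplify] entry above holds because ~A ^ B has a 1 exactly in the bit positions where A and B agree, and A & B can only have a 1 where both are 1, which is a subset of those positions. As a quick sanity check (not part of the commit), the identity can be verified exhaustively for 8-bit values; `foldHoldsForAllBytes` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively checks (A & B) | (~A ^ B) == (~A ^ B) over all 8-bit A and B.
bool foldHoldsForAllBytes() {
  for (unsigned A = 0; A < 256; ++A) {
    for (unsigned B = 0; B < 256; ++B) {
      const std::uint8_t X = static_cast<std::uint8_t>(A);
      const std::uint8_t Y = static_cast<std::uint8_t>(B);
      // ~X promotes to int, so truncate back to 8 bits before comparing.
      const std::uint8_t NotXXorY = static_cast<std::uint8_t>(~X ^ Y);
      const std::uint8_t Full = static_cast<std::uint8_t>((X & Y) | NotXXorY);
      if (Full != NotXXorY)
        return false;
    }
  }
  return true;
}
```

Because the identity is bitwise, checking one byte width is enough to cover every bit position independently.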
* [BasicAA] Alphabetize includes. NFC
  Author: Craig Topper; Age: 2017-05-14; Files: 1; Lines: -1/+1
  llvm-svn: 303002
* Fix test failure on Windows -- do not return deleted func
  Author: Xinliang David Li; Age: 2017-05-14; Files: 1; Lines: -2/+8
  llvm-svn: 302999
* [SelectionDAG] Added support for EXTRACT_SUBVECTOR/CONCAT_VECTORS demandedelts in ComputeNumSignBits
  Author: Simon Pilgrim; Age: 2017-05-13; Files: 1; Lines: -7/+29
  llvm-svn: 302997
* Add missing files
  Author: Peter Collingbourne; Age: 2017-05-13; Files: 2; Lines: -0/+25
  llvm-svn: 302996
* Move lib/LibDriver -> lib/ToolDrivers/llvm-lib. NFCI.
  Author: Peter Collingbourne; Age: 2017-05-13; Files: 6; Lines: -3/+3
  This reorganisation prevents us from cluttering up the top-level lib directory with more driver libraries such as llvm-dlltool (see D29892).
  llvm-svn: 302995
* [SelectionDAG] Add VECTOR_SHUFFLE support to ComputeNumSignBits
  Author: Simon Pilgrim; Age: 2017-05-13; Files: 1; Lines: -0/+34
  llvm-svn: 302993
* [ValueTracking] Remove const_casts on several calls to computeKnownBits and ComputeSignBit. NFC
  Author: Craig Topper; Age: 2017-05-13; Files: 3; Lines: -8/+4
  llvm-svn: 302991
* [x86, SSE] AVX1 PR28129 (256-bit all-ones rematerialization)
  Author: Simon Pilgrim; Age: 2017-05-13; Files: 2; Lines: -10/+22
  Further perf tests on Jaguar indicate that:
    vxorps %ymm0, %ymm0, %ymm0
    vcmpps $15, %ymm0, %ymm0, %ymm0
  is consistently faster (by about 9%) than:
    vpcmpeqd %xmm0, %xmm0, %xmm0
    vinsertf128 $1, %xmm0, %ymm0, %ymm0
  Testing equivalent code on a SandyBridge (E5-2640) puts it slightly (~3%) faster as well.
  Committed on behalf of @dtemirbulatov
  Differential Revision: https://reviews.llvm.org/D32416
  llvm-svn: 302989
* [LoopOptimizer][Fix]PR32859, PR24738
  Author: Simon Pilgrim; Age: 2017-05-13; Files: 1; Lines: -7/+9
  The Loop vectorizer pass introduced an undef value while fixing up the output of LCSSA form. Here it is:
    before: %e.0.ph = phi i32 [ 0, %for.inc.2.i ]
    after: %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ undef, %middle.block ]
  and after this change we have:
    %e.0.ph = phi i32 [ 0, %for.inc.2.i ]
    %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ 0, %middle.block ]
  Committed on behalf of @dtemirbulatov
  Differential Revision: https://reviews.llvm.org/D33055
  llvm-svn: 302988
* This reverts r302984
  Author: Vivek Pandya; Age: 2017-05-13; Files: 1; Lines: -2/+0
  llvm-svn: 302985
* Simplify MIR Output used for Codegen Testing
  Author: Vivek Pandya; Age: 2017-05-13; Files: 1; Lines: -0/+2
  - MIRYamlMapping: Default values provided for fields which have optional mappings. Implemented == operators for the required classes. When a field's value is the same as the specified default value, the YAML IO class will not print it.
  - MIRPrinter: The above-mentioned behaviour is not on by default. If the -simplify-mir option is not specified, yaml::Output prints fields with default values too.
  Differential Revision: https://reviews.llvm.org/D32304
  llvm-svn: 302984
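The printing rule described in the MIR entry above (omit a field when it equals its default, but only when simplification is requested) can be sketched as below. This does not use the real YAML IO classes: `printOptionalField` and `renderExample` are hypothetical helpers, and the field names are only examples.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Prints "Key: Value" unless simplification is on and Value equals Default.
// Relies on operator== for T, mirroring the == operators the entry mentions.
template <typename T>
void printOptionalField(std::ostream &OS, const std::string &Key,
                        const T &Value, const T &Default, bool Simplify) {
  if (Simplify && Value == Default)
    return; // default-valued field is dropped from the simplified output
  OS << Key << ": " << Value << '\n';
}

// Renders two example fields; only the first differs from its default.
std::string renderExample(bool Simplify) {
  std::ostringstream OS;
  OS << std::boolalpha;
  printOptionalField(OS, "alignment", 4u, 0u, Simplify);
  printOptionalField(OS, "hasCalls", false, false, Simplify);
  return OS.str();
}
```

With `Simplify` off, both fields are printed; with it on, only the non-default `alignment` field remains.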
* [APInt] Use Lo_32/Hi_32/Make_64 in a few more places in the divide code. NFCI
  Author: Craig Topper; Age: 2017-05-13; Files: 1; Lines: -6/+6
  llvm-svn: 302983
* [InstCombine] Prevent InstCombine from triggering an extra iteration if something changed in the initial Worklist creation
  Author: Craig Topper; Age: 2017-05-13; Files: 1; Lines: -5/+4
  Summary: If the Worklist build causes an IR change, this change flag currently factors into the flag for running another iteration of the iteration loop. But only changes during processing should trigger another loop.
  This patch captures the worklist creation change flag into the outside-the-loop flag currently used for DbgDeclares and only sends that flag up to the caller. Rerunning the loop only depends on IC.run() now.
  This uses the debug output of InstCombine to determine if one or two iterations run. I couldn't think of a better way to detect it, since the second spurious iteration shouldn't make any visible changes; it is just wasted computation. I can do a pre-commit of the test case with the CHECK-NOT as a CHECK if this is an OK way to check this.
  This is a subset of D31678 as I'm still not sure how to verify the analysis behavior for that.
  Reviewers: davide, majnemer, spatel, chandlerc
  Reviewed By: davide
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D32453
  llvm-svn: 302982
* [APInt] Fix typo in comment. NFC
  Author: Craig Topper; Age: 2017-05-13; Files: 1; Lines: -1/+1
  llvm-svn: 302974
* [AVR] When lowering Select8/Select16, put newly generated MBBs in the same spot
  Author: Dylan McKay; Age: 2017-05-13; Files: 1; Lines: -2/+3
  Contributed by Dr. Gergő Érdi.
  Fixes a bug.
  Raised from (https://github.com/avr-rust/rust/issues/49).
  llvm-svn: 302973
* [AVR] Remove an unused variable
  Author: Dylan McKay; Age: 2017-05-13; Files: 1; Lines: -1/+0
  llvm-svn: 302970
* [PartialInlining] Profile based cost analysis
  Author: Xinliang David Li; Age: 2017-05-12; Files: 1; Lines: -45/+363
  Implemented frequency based cost/saving analysis and related options.
  The pass is now in a state ready to be turned on in the pipeline (in a follow-up).
  Differential Revision: http://reviews.llvm.org/D32783
  llvm-svn: 302967
* [GISel]: Add a getConstantFPVRegVal utility
  Author: Aditya Nandakumar; Age: 2017-05-12; Files: 1; Lines: -0/+8
  This might be useful across various GISel passes.
  https://reviews.llvm.org/D33051
  llvm-svn: 302964
* [GISel]: Fix undefined behavior while accessing DefaultAction map
  Author: Aditya Nandakumar; Age: 2017-05-12; Files: 1; Lines: -1/+1
  We end up dereferencing the end iterator here when the Aspect doesn't exist in the DefaultAction map.
  Change the API to return Optional<LLT> and return None when not found. Also update the callers to handle the None case.
  llvm-svn: 302963
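The bug pattern in the [GISel] entry above (dereferencing `end()` after a failed `find`) and the optional-returning fix can be sketched with standard containers. This is an analogy, not the LLVM code: it uses C++17 `std::optional` in place of `llvm::Optional`, plain `int` in place of `LLT`, and `findDefaultAction` is a hypothetical name.

```cpp
#include <cassert>
#include <map>
#include <optional>

// Looks up Aspect in the map; returns std::nullopt when it is absent instead
// of dereferencing the end iterator (which would be undefined behavior).
std::optional<int> findDefaultAction(const std::map<int, int> &DefaultActions,
                                     int Aspect) {
  auto It = DefaultActions.find(Aspect);
  if (It == DefaultActions.end())
    return std::nullopt;
  return It->second;
}
```

Callers then have to handle the empty case explicitly, for example with `has_value()` or `value_or()`, which is the caller-side change the entry describes.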
* [IR] Fix some Clang-tidy modernize-use-using warnings; other minor fixes (NFC).
  Author: Eugene Zelenko; Age: 2017-05-12; Files: 1; Lines: -8/+19
  llvm-svn: 302961
* [TLI] Add mapping for various '__<func>_finite' forms of the math routines to SVML routines
  Author: Andrew Kaylor; Age: 2017-05-12; Files: 1; Lines: -0/+24
  Patch by Chris Chrulski
  Differential Revision: https://reviews.llvm.org/D31789
  llvm-svn: 302957
* [ConstantFolding] Add folding for various math '__<func>_finite' routines generated from -ffast-math
  Author: Andrew Kaylor; Age: 2017-05-12; Files: 1; Lines: -11/+69
  Patch by Chris Chrulski
  Differential Revision: https://reviews.llvm.org/D31788
  llvm-svn: 302956
* [TLI] Add declarations for various math header file routines from math-finite.h that create '__<func>_finite' as functions
  Author: Andrew Kaylor; Age: 2017-05-12; Files: 1; Lines: -0/+86
  Patch by Chris Chrulski
  Differential Revision: https://reviews.llvm.org/D31787
  llvm-svn: 302955