summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86] Model MXCSR for AVX instructions other than AVX512Wang, Pengfei2019-12-032-12/+17
| | | | | | | | | | | | Summary: Model MXCSR for AVX instructions other than AVX512 Reviewers: craig.topper, RKSimon Subscribers: hiraditya, llvm-commits, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70875
* Place the "cold" code piece into the same section as the original functionBill Wendling2019-12-021-0/+3
| | | | | | | | | | | | | | | | Summary: This cropped up in the Linux kernel where cold code was placed in an incompatible section. Reviewers: compnerd, vsk, tejohnson Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70925
* Temporarily revert "build: avoid hardcoding the libxml2 library name"Eric Christopher2019-12-021-6/+12
| | | | | | | | as it breaks uses of llvm-config --system-libs and the follow-on commit "build: avoid cached literals being linked against" This reverts commits 340e7c0b77a7037afefe7255503afe362967b577 and 340e7c0b77a7037afefe7255503afe362967b577.
* [PGO][PGSO] Add an optional query type parameter to shouldOptimizeForSize.Hiroshi Yamauchi2019-12-027-13/+24
| | | | | | | | | | | | | | Summary: In case of a need to distinguish different query sites for gradual commit or debugging of PGSO. NFC. Reviewers: davidxl Subscribers: hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70510
* [Remarks][ThinLTO] Use the correct file extension based on the formatFrancis Visoiu Mistrih2019-12-021-1/+5
| | | | | Since we now have multiple formats, the ThinLTO remark files should also respect that.
* [MIBundles] Move analyzePhysReg out of MIBundleOperands iterator (NFC).Florian Hahn2019-12-024-17/+8
| | | | | | | | | | | analyzePhysReg does not really fit into the iterator and moving it makes it easier to change the base iterator. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm, qcolombet Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D70559
* [GlobalISel] CombinerHelper: Fix a bug in matchCombineCopyVolkan Keles2019-12-021-3/+26
| | | | | | | | | | | | | | | | | Summary: When combining COPY instructions, we were replacing the destination registers with the source register without checking register constraints. This patch adds a simple logic to check if the constraints match before replacing registers. Reviewers: qcolombet, aditya_nandakumar, aemerson, paquette, dsanders, Petar.Avramovic Reviewed By: aditya_nandakumar Subscribers: rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70616
* [ARM] Add ARMVCCThen to tablegen and make use of it. NFCDavid Green2019-12-022-71/+76
| | | | | | | Similar to the parent, this adds some constants to tablegen to replace the existing magic values. Differential Revision: https://reviews.llvm.org/D70825
* [ARM] Add ARMCC constants to tablegen. NFCDavid Green2019-12-023-151/+133
| | | | | | | | | | I got tired of looking at magic constants in tablegen files. This adds condition codes like ARMCCeq and makes use of them. I also removed the extra patterns for reverse condition codes from D70296, they should now be covered by the parent commit. Differential Revision: https://reviews.llvm.org/D70824
* [ARM] Add some VCMP folding and canonicalisationDavid Green2019-12-023-27/+62
| | | | | | | | | | The VCMP instructions in MVE can accept a register or ZR, but only as the right hand operator. Most of the time this will already be correct because the icmp will have been canonicalised that way already. There are some cases in the lowering of float conditions that this will not apply to though. This code should fix up those cases. Differential Revision: https://reviews.llvm.org/D70822
* [MIBundles] Move analyzeVirtReg out of MIBundleOperands iterator (NFC).Florian Hahn2019-12-022-17/+16
| | | | | | | | | | | analyzeVirtReg does not really fit into the iterator and moving it makes it easier to change the base iterator. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm, qcolombet Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D70558
* Reland "b19ec1eb3d0c [BPI] Improve unreachable/ColdCall heurstics to handle ↵Taewook Oh2019-12-021-57/+75
| | | | | | | | | | | | | loops." Summary: b19ec1eb3d0c has been reverted because of the test failures with PowerPC targets. This patch addresses the issues from the previous commit. Test Plan: ninja check-all. Confirmed that CodeGen/PowerPC/pr36292.ll and CodeGen/PowerPC/sms-cpy-1.ll pass Subscribers: llvm-commits
* [VPlan] Move graph traits (NFC).Florian Hahn2019-12-021-121/+122
| | | | | | | | | | | By defining the graph traits right after the VPBlockBase definitions, we can make use of them earlier in the file. Reviewers: hsaito, Ayal, gilr Reviewed By: gilr Differential Revision: https://reviews.llvm.org/D70733
* [DAGCombine] Factor oplist operations. NFCAmaury Séchet2019-12-021-6/+10
|
* [InstCombine] fix undef propagation for vector urem transform (PR44186)Sanjay Patel2019-12-021-2/+4
| | | | | | | | | As described here: https://bugs.llvm.org/show_bug.cgi?id=44186 The match() code safely allows undef values, but we can't safely propagate a vector constant that contains an undef to the new compare instruction.
* [SelectionDAG] Reduce assumptions made about levels. NFCAmaury Séchet2019-12-021-7/+8
|
* [ARM,MVE] Add intrinsics to deal with predicates.Simon Tatham2019-12-021-12/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This commit adds the `vpselq` intrinsics which take an MVE predicate word and select lanes from two vectors; the `vctp` intrinsics which create a tail predicate word suitable for processing the first m elements of a vector (e.g. in the last iteration of a loop); and `vpnot`, which simply complements a predicate word and is just syntactic sugar for the `~` operator. The `vctp` ACLE intrinsics are lowered to the IR intrinsics we've already added (and which D70592 just reorganized). I've filled in the missing isel rule for VCTP64, and added another set of rules to generate the predicated forms. I needed one small tweak in MveEmitter to allow the `unpromoted` type modifier to apply to predicates as well as integers, so that `vpnot` doesn't pointlessly convert its input integer to an `<n x i1>` before complementing it. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70485
* [ARM,MVE] Rename and clean up VCTP IR intrinsics.Simon Tatham2019-12-022-7/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: D65884 added a set of Arm IR intrinsics for the MVE VCTP instruction, to use in tail predication. But the 64-bit one doesn't work properly: its predicate type is `<2 x i1>` / `v2i1`, which isn't a legal MVE type (due to not having a full set of instructions that manipulate it usefully). The test of `vctp64` in `basic-tail-pred.ll` goes through `opt` fine, as the test expects, but if you then feed it to `llc` it causes a type legality failure at isel time. The usual workaround we've been using in the rest of the MVE intrinsics family is to bodge `v2i1` into `v4i1`. So I've adjusted the `vctp64` IR intrinsic to do that, and completely removed the code (and test) that uses that intrinsic for 64-bit tail predication. That will allow me to add isel rules (upcoming in D70485) that actually generate the VCTP64 instruction. Also renamed all four of these IR intrinsics so that they have `mve` in the name, since its absence was confusing. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: MarkMurrayARM Subscribers: samparker, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70592
* [ARM,MVE] Add an InstCombine rule permitting VPNOT.Simon Tatham2019-12-021-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | Summary: If a user writing C code using the ACLE MVE intrinsics generates a predicate and then complements it, then the resulting IR will use the `pred_v2i` IR intrinsic to turn some `<n x i1>` vector into a 16-bit integer; complement that integer; and convert back. This will generate machine code that moves the predicate out of the `P0` register, complements it in an integer GPR, and moves it back in again. This InstCombine rule replaces `i2v(~v2i(x))` with a direct complement of the original predicate vector, which we can already instruction- select as the VPNOT instruction which complements P0 in place. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70484
* [InstCombine] Revert rL341831: relax one-use check in foldICmpAddConstant() ↵Roman Lebedev2019-12-021-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (PR44100) rL341831 moved one-use check higher up, restricting a few folds that produced a single instruction from two instructions to the case where the inner instruction would go away. Original commit message: > InstCombine: move hasOneUse check to the top of foldICmpAddConstant > > There were two combines not covered by the check before now, > neither of which actually differed from normal in the benefit analysis. > > The most recent seems to be because it was just added at the top of the > function (naturally). The older is from way back in 2008 (r46687) > when we just didn't put those checks in so routinely, and has been > diligently maintained since. From the commit message alone, there doesn't seem to be a deeper motivation, deeper problem that was trying to solve, other than 'fixing the wrong one-use check'. As i have briefly discusses in IRC with Tim, the original motivation can no longer be recovered, too much time has passed. However i believe that the original fold was doing the right thing, we should be performing such a transformation even if the inner `add` will not go away - that will still unchain the comparison from `add`, it will no longer need to wait for `add` to compute. Doing so doesn't seem to break any particular idioms, as least as far as i can see. References https://bugs.llvm.org/show_bug.cgi?id=44100
* [PowerPC] Fix crash in peephole optimizationNemanja Ivanovic2019-12-021-2/+4
| | | | | | | | | | | | When converting reg+reg shifts to reg+imm rotates, we neglect to consider the CodeGenOnly versions of the 32-bit shift mnemonics. This means we produce a rotate with missing operands which causes a crash. Committing this fix without review since it is non-controversial that the list of mnemonics to consider should include the 64-bit aliases for the exact mnemonics. Fixes PR44183.
* [ARM][AArch64] Complex addition Neon intrinsics for Armv8.3-AVictor Campos2019-12-022-0/+44
| | | | | | | | | | | | | | | | | | | Summary: Add support for vcadd_* family of intrinsics. This set of intrinsics is available in Armv8.3-A. The fp16 versions require the FP16 extension, which has been available (opt-in) since Armv8.2-A. Reviewers: t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70862
* [InstCombine] fold copysign with constant sign argument to (fneg+)fabsSanjay Patel2019-12-021-0/+15
| | | | | | | | | | | | | | | If the sign of the sign argument is known (this could be extended to use ValueTracking), then we can use fneg+fabs to clear/set the sign bit of the magnitude argument. http://llvm.org/docs/LangRef.html#llvm-copysign-intrinsic This transform is already done in DAGCombiner, but we can do it sooner in IR as suggested in PR44153: https://bugs.llvm.org/show_bug.cgi?id=44153 We have effectively no analysis for copysign in IR, so we are taking the unusual step of increasing the number of IR instructions for the negative constant case. Differential Revision: https://reviews.llvm.org/D70792
* AMDGPU: Fixed indeterminate map iteration in SIPeepholeSDWATim Renouf2019-12-021-2/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D70783 Change-Id: Ic26f915a4acb4c00ecefa9d09d7c24cec370ed06
* [ARM][MVE][Intrinsics] Add VMINQ/VMAXQ/VMINNMQ/VMAXNMQ intrinsics.Mark Murray2019-12-021-38/+55
| | | | | | | | | | Summary: Add VMINQ/VMAXQ/VMINNMQ/VMAXNMQ intrinsics and their predicated versions. Add unit tests. Subscribers: kristof.beyls, hiraditya, dmgreen, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70829
* [ARM] Remove VHADD patternsDavid Green2019-12-021-54/+0
| | | | | | | | | | | | | | | These instructions do not work quite like I expected them to. They perform the addition and then shift in a higher precision integer, so do not match up with the patterns that we added. For example with s8s, adding 100 and 100 should wrap leaving the shift to work on a negative number. VHADD will instead do the arithmetic in higher precision, giving 100 overall. The vhadd gives a "better" result, but not one that matches up with the input. I am just removing the patterns here. We might be able to re-add them in the future by checking for wrap flags or changing bitwidths. But for the moment just remove them to remove the problem cases.
* [InstCombine] Fix big-endian miscompile of (bitcast (zext/trunc (bitcast)))Bjorn Pettersson2019-12-021-22/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: optimizeVectorResize is rewriting patterns like: %1 = bitcast vector %src to integer %2 = trunc/zext %1 %dst = bitcast %2 to vector Since bitcasting between integer an vector types gives different integer values depending on endianness, we need to take endianness into account. As it happens the old implementation only produced the correct result for little endian targets. Fixes: https://bugs.llvm.org/show_bug.cgi?id=44178 Reviewers: spatel, lattner, lebedev.ri Reviewed By: spatel, lebedev.ri Subscribers: lebedev.ri, hiraditya, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70844
* [ORC] Add a runAsMain utility function to ExecutionUtils.Lang Hames2019-12-021-0/+26
| | | | | | | | | | | The runAsMain function takes a pointer to a function with a standard C main signature, int(*)(int, char*[]), and invokes it using the given arguments and program name. The arguments are copied into writable temporary storage as required by the C and C++ specifications, so runAsMain safe to use when calling main functions that modify their arguments in-place. This patch also uses the new runAsMain function to replace hand-rolled versions in lli, llvm-jitlink, and the SpeculativeJIT example.
* [Orc] Add setters for target options and features to JITTargetMachineBuilder.Lang Hames2019-12-021-3/+1
| | | | Also remove redundant feature initialization steps from the detectHost method.
* Fix broken comment phrasing and indentationMatt Arsenault2019-12-021-7/+6
|
* AMDGPU/GlobalISel: Add AGPR bank and RegBankSelect mfma intrinsicsAustin Kerbow2019-12-015-14/+134
| | | | Differential Revision: https://reviews.llvm.org/D70871
* [SCEV] Make SCEV verification available from command line with new PMDaniil Suchkov2019-12-022-0/+7
| | | | | | | | | | | | | | | | New pass manager doesn't use verifyAnalysis, so currently there is no way to call SCEV verification from command line when new PM is used. This patch adds a pass that allows you to do that. Reviewers: reames, fhahn, sanjoy.google, nikic Reviewed By: fhahn Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70423
* [X86] Add floating point execution domain to ↵Craig Topper2019-11-302-49/+74
| | | | comi/ucomi/cvtss2si/cvtsd2si/cvttss2si/cvttsd2si/cvtsi2ss/cvtsi2sd instructions.
* [InstCombine] Expand usub_sat patterns to handle constantsDavid Green2019-11-301-3/+11
| | | | | | | | The constants come through as add %x, -C, not a sub as would be expected. They need some extra matchers to canonicalise them towards usub_sat. Differential Revision: https://reviews.llvm.org/D69514
* [InstCombine] Adjust usub_sat fold one use checksDavid Green2019-11-301-3/+3
| | | | | | This adjusts the one use checks in the the usub_sat fold code to not increase instruction count, but otherwise do the fold. Reviewed as a part of D69514.
* Revert 651f07908a1 "[AArch64] Don't combine callee-save and local stack ↵Hans Wennborg2019-11-301-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | adjustment when optimizing for size" This caused asserts (and perhaps also miscompiles) while building for Windows on AArch64. See the discussion on D68530 for details and reproducer. Reverting until this can be investigated and fixed. > For arm64, D18619 introduced the ability to combine bumping the stack pointer > upfront in case it needs to be bumped for both the callee-save area as well as > the local stack area. > > That diff already remarks that "This change can cause an increase in > instructions", but argues that even when that happens, it should be still be a > performance benefit because the number of micro-ops is reduced. > > We have observed that this code-size increase can be significant in practice. > This diff disables combining stack bumping for methods that are marked as > optimize-for-size. > > Example of a prologue with the behavior before this diff (combining stack bumping when possible): > sub sp, sp, #0x40 > stp d9, d8, [sp, #0x10] > stp x20, x19, [sp, #0x20] > stp x29, x30, [sp, #0x30] > add x29, sp, #0x30 > [... compute x8 somehow ...] > stp x0, x8, [sp] > > And after this diff, if the method is marked as optimize-for-size: > stp d9, d8, [sp, #-0x30]! > stp x20, x19, [sp, #0x10] > stp x29, x30, [sp, #0x20] > add x29, sp, #0x20 > [... compute x8 somehow ...] > stp x0, x8, [sp, #-0x10]! > > Note that without combining the stack bump there are two auto-decrements, > nicely folded into the stp instructions, whereas otherwise there is a single > sub sp, ... instruction, but not folded. > > Patch by Nikolai Tillmann! > > Differential Revision: https://reviews.llvm.org/D68530
* Fix a typo.Hans Wennborg2019-11-301-1/+1
|
* [PowerPC][AIX] Add support for lowering int/float/double formal arguments.Sean Fertile2019-11-292-3/+119
| | | | | | | | | | | | | This patch adds LowerFormalArguments_AIX, support is added for lowering int, float, and double formal arguments into general purpose and floating point registers only. The aix calling convention testcase have been redone to test for caller and callee functionality in the same lit test. Patch by Zarko Todorovski! Differential Revision: https://reviews.llvm.org/D69578
* Revert "[ARM] Allocatable Global Register Variables for ARM"Carey Williams2019-11-298-68/+22
| | | | This reverts commit 2d739f98d8a53e38bf9faa88cdb6b0c2a363fb77.
* [ARM] Fix instruction selection for ARMISD::CMOV with f16 typeVictor Campos2019-11-292-1/+8
| | | | | | | | | | | | | | | | | | | | | | Summary: In the cases where the CMOV (f16) SDNode is used with condition codes LT, LE, VC or NE, it is successfully selected into a VSEL instruction. In the remaining cases, however, instruction selection fails since VSEL does not support other condition codes. This patch handles such cases by using the single-precision version of the VMOV instruction. Reviewers: ostannard, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70667
* [yaml2obj] - Add a way to describe content of the SHT_GNU_verneed section ↵Georgii Rymar2019-11-292-10/+28
| | | | | | | | | with "Content". There is no way to set raw content for SHT_GNU_verneed section. This patch implements it. Differential revision: https://reviews.llvm.org/D70816
* [Attributor] Deduce dereferenceable based on accessed bytes mapHideto Ueno2019-11-291-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch introduces the deduction based on load/store instructions whose pointer operand is a non-inbounds GEP instruction. For example if we have, ``` void f(int *u){ u[0] = 0; u[1] = 1; u[2] = 2; } ``` then u must be dereferenceable(12). This patch is inspired by D64258 Reviewers: jdoerfert, spatel, hfinkel, RKSimon, sstefan1, xbolva00, dtemirbulatov Reviewed By: jdoerfert Subscribers: jfb, lebedev.ri, xbolva00, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70714
* [Attributor] Remove dereferenceable_or_null when nonull is presentHideto Ueno2019-11-291-0/+10
| | | | | | | | | | | | | | Summary: This patch prevents the simultaneous presence of `dereferenceable` and `dereferenceable_or_null` attribute Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70789
* [PassInstrumentation] Remove excess newline for the new pass managerFangrui Song2019-11-282-4/+1
| | | | This also removes excess newline for the legacy pass manager when -filter-print-funcs is specified.
* [LegalizeTypes] Add strict FP support to SoftenFloatRes_FP_ROUND. Fix ↵Craig Topper2019-11-281-6/+13
| | | | | | | | | | | mistake in SoftenFloatRes_FP_EXTEND. These will be needed for ARM fp-instrinsics.ll which is currently XFAILed. One of the getOperand calls in SoftenFloatRes_FP_EXTEND was not taking strict FP into account. It only affected the call to setTypeListBeforeSoften which only has an effect on some targets.
* [LegalizeTypes] In SoftenFloatRes_FNEG, always generate integer arithmetic, ↵Craig Topper2019-11-281-19/+4
| | | | | | | | never fall back to using fsub. We would previously fallback if the type wasn't f32/f64/f128. But I don't think any of the other floating point types ever go through the softening code anyway. So this code is dead.
* [LegalizeTypes] Use SoftenFloatRes_Unary in SoftenFloatRes_FCBRT to reduce code.Craig Topper2019-11-281-8/+2
| | | | | We don't have a STRICT_CBRT ISD opcode, but we can still use SoftenFloatRes_Unary to simplify some code.
* [DAGCombiner] Peek through vector concats when trying to combine shuffles.Amaury Séchet2019-11-281-9/+42
| | | | | | | | | | | | Summary: This combine showed up as needed when exploring the regression when processing the DAG in topological order. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68195
* [LegacyPassManager] Simplify FunctionPass::assignPassManagerFangrui Song2019-11-281-13/+9
| | | | And make it clear the parameter PreferredType is unused for FunctionPass.
* [LegacyPassManager] Simplify PMStack popFangrui Song2019-11-281-18/+6
|
OpenPOWER on IntegriCloud