path: root/llvm/test
Commit message | Author | Age | Files | Lines
* [InstSimplify, InstCombine] add/update tests with FP +0.0 vector with undef; NFC | Sanjay Patel | 2018-03-25 | 8 | -364/+423
  llvm-svn: 328455
* [X86] Update cost model for Goldmont. Add fsqrt costs for Silvermont | Craig Topper | 2018-03-25 | 2 | -8/+229
  Add fdiv costs for Goldmont using table 16-17 of the Intel Optimization Manual. Also add overrides for FSQRT for Goldmont and Silvermont.
  Reviewers: RKSimon
  Reviewed By: RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D44644
  llvm-svn: 328451
* [InstCombine] adjust test comments; NFC | Sanjay Patel | 2018-03-25 | 1 | -9/+6
  llvm-svn: 328450
* [InstCombine] consolidate casted icmp vector tests | Sanjay Patel | 2018-03-25 | 1 | -660/+43
  We have thorough coverage of predicates and scalar types, so we just need a sampling of vector tests to show whether things are working with vector types.
  llvm-svn: 328449
* [InstCombine] peek through more icmp of FP cast + bitcast | Sanjay Patel | 2018-03-25 | 1 | -135/+45
  This is an extension of rL328426 as noted in D44367.
  llvm-svn: 328448
* [RISCV] Use init_array instead of ctors for RISCV target, by default | Mandeep Singh Grang | 2018-03-24 | 1 | -0/+30
  Summary: LLVM defaults to the newer .init_array/.fini_array scheme for static constructors rather than the less desirable .ctors/.dtors (the UseCtors flag defaults to false). This wasn't being respected in the RISC-V backend because it fails to call TargetLoweringObjectFileELF::InitializeELF with the appropriate flag for UseInitArray. This patch fixes this by implementing RISCVELFTargetObjectFile and overriding its Initialize method to call InitializeELF(TM.Options.UseInitArray).
  Reviewers: asb, apazos
  Reviewed By: asb
  Subscribers: mgorny, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44750
  llvm-svn: 328433
* [InstCombine] peek through FP casts for sign-bit compares (PR36682) | Sanjay Patel | 2018-03-24 | 2 | -107/+27
  This pattern came up in PR36682:
  https://bugs.llvm.org/show_bug.cgi?id=36682
  https://godbolt.org/g/LhuD9A
  Equality checks are planned as a follow-up enhancement.
  Differential Revision: https://reviews.llvm.org/D44367
  llvm-svn: 328426
* [X86][AES] Ensure we're testing both non-VEX/VEX variants of AES instructions on AVX targets | Simon Pilgrim | 2018-03-24 | 1 | -7/+321
  Add skylake server tests as well
  llvm-svn: 328424
* [X86][SSE] Ensure we're testing both non-VEX/VEX variants of SSE instructions on AVX targets | Simon Pilgrim | 2018-03-24 | 6 | -112/+12529
  And ensure we don't use later instruction sets in SSE schedule tests
  llvm-svn: 328423
* [InstCombine] add multi-use/vector tests for intrinsic shrinking; NFC | Sanjay Patel | 2018-03-24 | 1 | -29/+155
  llvm-svn: 328422
* [X86][AVX1] Ensure we don't use later instruction sets in AVX1 schedule tests | Simon Pilgrim | 2018-03-24 | 1 | -9/+9
  llvm-svn: 328421
* [X86][AVX2] Ensure we don't use later instruction sets in AVX2 schedule tests | Simon Pilgrim | 2018-03-24 | 1 | -9/+13
  llvm-svn: 328420
* Add REQUIRES lines for the targets being checked in this test. | Eric Christopher | 2018-03-24 | 1 | -0/+3
  llvm-svn: 328408
* [X86] Add a DAG combine to simplify PMULDQ/PMULUDQ nodes | Craig Topper | 2018-03-24 | 1 | -18/+16
  These nodes only use the lower 32 bits of their inputs so we can use SimplifyDemandedBits to simplify them.
  Differential Revision: https://reviews.llvm.org/D44375
  llvm-svn: 328405
* Allow FDE references outside the +/-2GB range supported by PC relative | Eric Christopher | 2018-03-24 | 1 | -21/+44
  offsets for code models other than small/medium. For JIT application, memory layout is less controlled and can result in truncations otherwise.
  Patch based on one by Olexa Bilaniuk!
  llvm-svn: 328400
* [X86] Fix Windows `i1 zeroext` conventions to use i8 instead of i32 | Reid Kleckner | 2018-03-23 | 7 | -2/+76
  Both GCC and MSVC only look at the low byte of a boolean when it is passed.
  llvm-svn: 328386
* [PM][FunctionAttrs] add NoUnwind attribute inference to PostOrderFunctionAttrs pass | Fedor Sergeev | 2018-03-23 | 31 | -65/+181
  Summary: This was motivated by the absence of PruneEH functionality in the new PM. It was decided that the proper way to do PruneEH is to add NoUnwind inference into PostOrderFunctionAttrs and then perform normal SimplifyCFG on top.
  This change generalizes the attribute handling implemented for (a removal of) the Convergent attribute by introducing a generic builder-like class, AttributeInferer. It registers all the attribute inference requests, storing per-attribute predicates into a vector, and then goes through an SCC Node, scanning all the instructions to check that none breaks the attribute assumptions.
  The main idea is that as soon as all the instructions from all the functions of an SCC Node conform to the attribute assumptions, we are free to infer the attribute as set for all the functions of that SCC Node.
  It handles two distinct cases of attributes:
  - those that might break due to derefinement of the function code: for these attributes we are allowed to apply inference only if all the functions are "exact definitions". Example: NoUnwind.
  - those that do not care about derefinement: for these attributes we are allowed to apply inference as soon as we see any function definition. Example: removal of the Convergent attribute.
  Also in this commit:
  * Converted all the FunctionAttrs tests to use FileCheck and added new-PM invocations to them
  * FunctionAttrs/convergent.ll test demonstrates a difference in behavior between new and old PM implementations. Marked with FIXME.
  * PruneEH tests were converted to new-PM as well, using function-attrs+simplify-cfg combo as intended
  * some of the "other" tests were updated since function-attrs now infers 'nounwind' even for the old PM pipeline
  * -disable-nounwind-inference hidden option added as a possible workaround for a supposedly rare case when nounwind being inferred by default presents a problem
  Reviewers: chandlerc, jlebar
  Reviewed By: jlebar
  Subscribers: eraman, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44415
  llvm-svn: 328377
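The inference scheme described in this commit message can be sketched roughly as follows. This is a minimal Python model, not the LLVM implementation: the `Function` and `InferenceRequest` types, the string-based "instructions", and the `infer_for_scc` helper are all hypothetical names introduced for illustration. It only mirrors the two rules stated above: an attribute is inferred for the whole SCC when no instruction breaks its assumption, and attributes sensitive to derefinement additionally require every function to be an exact definition.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Function:
    """Hypothetical stand-in for a function in an SCC node."""
    name: str
    instructions: List[str]
    is_exact_definition: bool = True


@dataclass
class InferenceRequest:
    """Hypothetical per-attribute request: a predicate plus a derefinement flag."""
    attribute: str
    breaks_assumption: Callable[[str], bool]
    requires_exact_definition: bool


def infer_for_scc(scc: List[Function], requests: List[InferenceRequest]) -> List[str]:
    """Return the attributes that may be set on every function of the SCC."""
    inferred = []
    for req in requests:
        # Attributes that can break under derefinement need exact definitions.
        if req.requires_exact_definition and not all(f.is_exact_definition for f in scc):
            continue
        # A single offending instruction anywhere in the SCC blocks inference.
        if any(req.breaks_assumption(inst) for f in scc for inst in f.instructions):
            continue
        inferred.append(req.attribute)
    return inferred


# Toy request: treat the (made-up) instruction name "throw" as possibly unwinding.
nounwind = InferenceRequest("nounwind", lambda inst: inst == "throw", True)

safe_scc = [Function("f", ["add", "ret"]), Function("g", ["mul", "ret"])]
throwing_scc = [Function("f", ["add", "ret"]), Function("h", ["throw"])]
inexact_scc = [Function("f", ["add", "ret"], is_exact_definition=False)]

print(infer_for_scc(safe_scc, [nounwind]))
print(infer_for_scc(throwing_scc, [nounwind]))
print(infer_for_scc(inexact_scc, [nounwind]))
```

Checking the exact-definition requirement before scanning instructions is a choice made here for brevity; the commit message does not say in which order the real pass performs these checks.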
* [InstCombine] increase test coverage for intrinsic shrinking; NFC | Sanjay Patel | 2018-03-23 | 1 | -48/+48
  There were no tests with vector types before this.
  llvm-svn: 328371
* [Hexagon] Boost profit for word-mask immediates, reduce for others | Krzysztof Parzyszek | 2018-03-23 | 2 | -0/+145
  This avoids unnecessary splitting due to uninteresting immediates.
  llvm-svn: 328364
* [Hexagon] Fold offset in base+immediate loads/stores | Krzysztof Parzyszek | 2018-03-23 | 1 | -0/+55
  Optimize Ry = add(Rx,#n); memw(Ry+#0) = Rz => memw(Rx,#n) = Rz.
  Patch by Jyotsna Verma.
  llvm-svn: 328355
* [AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU | Tony Tye | 2018-03-23 | 2 | -7/+7
  Add two additional implicit arguments for OpenCL for the AMDGPU target using the AMDHSA runtime to support device enqueue.
  Differential Revision: https://reviews.llvm.org/D44697
  llvm-svn: 328351
* [AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU | Tony Tye | 2018-03-23 | 2 | -20/+137
  - Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
  - Use function attribute to communicate to the AMDGPU backend to add implicit arguments for OpenCL kernels for the AMDHSA OS.
  Differential Revision: https://reviews.llvm.org/D43736
  llvm-svn: 328349
* [Hexagon] Always generate mux out of predicated transfers if possible | Krzysztof Parzyszek | 2018-03-23 | 7 | -7/+67
  HexagonGenMux would collapse pairs of predicated transfers if it assumed that the predicated .new forms cannot be created. Turns out that generating mux is preferable in almost all cases.
  Introduce an option -hexagon-gen-mux-threshold that controls the minimum distance between the instruction defining the predicate and the later of the two transfers. If the distance is smaller than the threshold, mux will not be generated. Set the threshold to 0 by default.
  llvm-svn: 328346
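The threshold rule above can be illustrated with a small sketch. It assumes, purely for illustration, that "distance" is a difference of instruction positions within a block; the commit message does not say how the pass measures it, and `should_generate_mux` is a hypothetical helper, not a function in HexagonGenMux.

```python
def should_generate_mux(pred_def_index: int,
                        first_transfer_index: int,
                        second_transfer_index: int,
                        threshold: int = 0) -> bool:
    """Decide whether a pair of predicated transfers may be folded into a mux.

    Indices are assumed to be instruction positions in one basic block.
    """
    # The rule is stated in terms of the later of the two transfers.
    later_transfer = max(first_transfer_index, second_transfer_index)
    distance = later_transfer - pred_def_index
    # Below the threshold the mux is not generated; at or above it, it is.
    return distance >= threshold


print(should_generate_mux(2, 5, 7))               # default threshold of 0
print(should_generate_mux(2, 3, 4, threshold=4))  # distance 2 is below 4
```

With the default threshold of 0, any non-negative distance qualifies, which matches the statement that generating the mux is preferred in almost all cases.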
* [Hexagon] Avoid early if-conversion for one sided branches | Krzysztof Parzyszek | 2018-03-23 | 1 | -0/+80
  Patch by Anand Kodnani.
  llvm-svn: 328344
* [X86][Btver2] Cleanup TEST instructions to use JFPA (+JFPX on ymms) function unit | Simon Pilgrim | 2018-03-23 | 2 | -35/+35
  llvm-svn: 328343
* [HWASan] Port HWASan to Linux x86-64 (LLVM) | Alex Shlyapnikov | 2018-03-23 | 4 | -0/+256
  Summary: Porting HWASan to Linux x86-64, first of the three patches, LLVM part.
  The approach is similar to the ARM case: a trap signal is used to communicate a memory tag check failure. An int3 instruction is used to generate the signal, and the access parameters are stored in a nop [eax + offset] instruction immediately following the int3.
  One notable difference is that x86-64 has to untag the pointer before use due to the lack of a feature comparable to ARM's TBI (Top Byte Ignore).
  Reviewers: eugenis
  Subscribers: kristof.beyls, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44699
  llvm-svn: 328342
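To illustrate what "untagging before use" means, here is a minimal sketch of pointer tagging arithmetic. It assumes an 8-bit tag stored in the top byte of a 64-bit pointer, mirroring the layout that ARM's TBI ignores, and user-space addresses whose top byte is otherwise zero. The constants and helper names are hypothetical; the commit message does not spell out the exact x86-64 encoding.

```python
TAG_SHIFT = 56                       # assumption: tag occupies bits 56..63
ADDRESS_MASK = (1 << TAG_SHIFT) - 1  # keeps the low 56 address bits


def tag_pointer(ptr: int, tag: int) -> int:
    """Place an 8-bit tag in the top byte of a 64-bit pointer value."""
    return (ptr & ADDRESS_MASK) | ((tag & 0xFF) << TAG_SHIFT)


def pointer_tag(ptr: int) -> int:
    """Extract the tag that would be compared against the memory tag."""
    return (ptr >> TAG_SHIFT) & 0xFF


def untag_pointer(ptr: int) -> int:
    """Clear the tag so the value is a plain address again.

    Hardware with TBI ignores the top byte on access; without such a
    feature the tag has to be stripped in software before dereferencing.
    """
    return ptr & ADDRESS_MASK


address = 0x00007F1234567890
tagged = tag_pointer(address, 0xAB)
print(hex(tagged), hex(pointer_tag(tagged)), hex(untag_pointer(tagged)))
```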
* [ARM] Fix "Constant pool entry out of range!" in Thumb1 mode | Ana Pazos | 2018-03-23 | 1 | -0/+359
  This patch fixes PR36658, "Constant pool entry out of range!" in Thumb1 mode.
  In ARMConstantIslands::optimizeThumb2JumpTables() in Thumb1 mode, adjustBBOffsetsAfter() does not calculate postOffset correctly because it fails to account for the padding that is required for the constant pool that immediately follows the jump table branch instruction.
  Reviewers: t.p.northover, eli.friedman
  Reviewed By: t.p.northover
  Subscribers: chrib, tstellar, javed.absar, kristof.beyls, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44709
  llvm-svn: 328341
* [Hexagon] Two fixes in early if-conversion | Krzysztof Parzyszek | 2018-03-23 | 1 | -0/+75
  - Fix checking for vector predicate registers.
  - Avoid speculating llvm.lifetime.end intrinsic.
  Patch by Harsha Jagasia and Brendon Cahoon.
  llvm-svn: 328339
* [X86][Btver2] Cleanup MOVMSK instructions to use JFPA function unit | Simon Pilgrim | 2018-03-23 | 4 | -299/+299
  Add missing non-VEX and (V)PMOVMSKB instructions to the pattern
  llvm-svn: 328338
* [X86][Btver2] Vector permutes use a JFPU01 scheduler pipe and JFPX/JVALU function unit | Simon Pilgrim | 2018-03-23 | 2 | -303/+303
  llvm-svn: 328331
* [InstCombine] auto-generate checks; NFC | Sanjay Patel | 2018-03-23 | 2 | -16/+48
  llvm-svn: 328329
* [X86][Btver2] Vector store instructions use a JFPU1 scheduler pipe and JSAGU/JSTC function units | Simon Pilgrim | 2018-03-23 | 5 | -415/+415
  llvm-svn: 328328
* [InstSimplify] regenerate checks, move tests; NFC | Sanjay Patel | 2018-03-23 | 2 | -34/+43
  llvm-svn: 328327
* Re-commit: [MachineLICM] Add functions to MachineLICM to hoist invariant stores | Zaara Syeda | 2018-03-23 | 3 | -4/+129
  This patch adds functions to allow MachineLICM to hoist invariant stores. Currently, MachineLICM does not hoist any store instructions. However, when storing the same value to a constant spot on the stack, the store instruction should be considered invariant and be hoisted. The function isInvariantStore iterates over each operand of the store instruction and checks that each register operand satisfies isCallerPreservedPhysReg. The store may be fed by a copy, which is hoisted by isCopyFeedingInvariantStore. This patch also adds the PowerPC changes needed to consider the stack register as caller preserved.
  Differential Revision: https://reviews.llvm.org/D40196
  llvm-svn: 328326
* [InstCombine] regenerate test checks; NFC | Sanjay Patel | 2018-03-23 | 1 | -10/+17
  llvm-svn: 328325
* [X86][Btver2] Cleanup DPPS/DPPD instructions to use JFPA/JFPM function units | Simon Pilgrim | 2018-03-23 | 2 | -81/+81
  llvm-svn: 328324
* [AArch64] Don't reduce the width of loads if it prevents combining a shift | John Brawn | 2018-03-23 | 2 | -2/+307
  Loads and stores can only shift the offset register by the size of the value being loaded, but currently the DAGCombiner will reduce the width of the load if it's followed by a trunc, making it impossible to later combine the shift.
  Solve this by implementing shouldReduceLoadWidth for the AArch64 backend and making it prevent the width reduction if this is what would happen, though do allow it if reducing the load width will let us eliminate a later sign or zero extend.
  Differential Revision: https://reviews.llvm.org/D44794
  llvm-svn: 328321
* [X86][Btver2] Fix MicroOps counts for DPPS/YMM memory folded instructions | Simon Pilgrim | 2018-03-23 | 2 | -56/+56
  This was due to a misunderstanding: what llvm calls a micro-op (retirement unit) is actually called a macro-op on the AMD/Jaguar target. Folded loads don't affect the number of macro-ops.
  llvm-svn: 328320
* [X86][Btver2] Cleanup SSE42 PCMPISTR/PCMPESTR string instructions to correctly use JFPU1 scheduler pipe followed by JLAGU/JSAGU/JFPA/JVALU function units | Simon Pilgrim | 2018-03-23 | 2 | -13/+13
  Fixes throughput to match Agner/Fam16h-SoG as well.
  llvm-svn: 328318
* [SLP] Stop counting cost of gather sequences with multiple uses | Matthew Simpson | 2018-03-23 | 2 | -34/+21
  When building the SLP tree, we look for reuse among the vectorized tree entries. However, each gather sequence is represented by a unique tree entry, even though the sequence may be identical to another one. This means, for example, that a gather sequence with two uses will be counted twice when computing the cost of the tree. We should only count the cost of the definition of a gather sequence rather than its uses. During code generation, the redundant gather sequences are emitted, but we optimize them away with CSE. So it looks like this problem just affects the cost model.
  Differential Revision: https://reviews.llvm.org/D44742
  llvm-svn: 328316
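The double counting described above can be shown with a toy cost model. This is only a sketch under stated assumptions: gather sequences are modelled as tuples of scalar names, the per-element cost is a made-up constant, and `gather_cost_per_use` / `gather_cost_per_definition` are hypothetical helpers, not part of the SLP vectorizer.

```python
from typing import Callable, Iterable, Tuple

GatherSequence = Tuple[str, ...]


def gather_cost_per_use(entries: Iterable[GatherSequence],
                        cost_of: Callable[[GatherSequence], int]) -> int:
    """Old behaviour: every tree entry is charged, even identical ones."""
    return sum(cost_of(seq) for seq in entries)


def gather_cost_per_definition(entries: Iterable[GatherSequence],
                               cost_of: Callable[[GatherSequence], int]) -> int:
    """New behaviour: each distinct gather sequence is charged once."""
    return sum(cost_of(seq) for seq in set(entries))


def insert_cost(seq: GatherSequence) -> int:
    # Made-up cost: one unit per scalar inserted into the vector.
    return len(seq)


# The same four-element gather is used twice, plus one unrelated gather.
tree_entries = [("a", "b", "c", "d"), ("a", "b", "c", "d"), ("x", "y")]

print(gather_cost_per_use(tree_entries, insert_cost))
print(gather_cost_per_definition(tree_entries, insert_cost))
```

Deduplicating by value models the observation that CSE removes the redundant sequences after code generation, so only the definition contributes to the real cost.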
* [DEBUGINFO] Add flag for DWARF2 to use sections as references. | Alexey Bataev | 2018-03-23 | 1 | -0/+54
  Summary: Some targets do not support labels inside debug sections, but do support references of the form `section+offset`. This patch adds initial support for this.
  Reviewers: echristo, probinson, jlebar
  Subscribers: llvm-commits, JDevlieghere
  Differential Revision: https://reviews.llvm.org/D43943
  llvm-svn: 328314
* [ARM] Support float literals under XO | Christof Douma | 2018-03-23 | 1 | -0/+118
  When targeting execute-only and fp-armv8, float constants in a compare resulted in instruction selection failures. This is now fixed by using vmov.f32 where possible; otherwise the floating point constant is lowered into an integer constant that is moved into a floating point register.
  This patch also restores using fpcmp with immediate 0 under fp-armv8.
  Change-Id: Ie87229706f4ed879a0c0cf66631b6047ed6c6443
  llvm-svn: 328313
* Revert r328307: [IPSCCP] Use constant range information for comparisons of parameters. | Florian Hahn | 2018-03-23 | 1 | -9/+13
  Reverted for now, due to it causing verifier failures.
  llvm-svn: 328312
* [GlobalISel] Fix legalizer combine to not use illegal input G_EXTRACT. | Amara Emerson | 2018-03-23 | 1 | -3/+3
  This was being masked because GISel is enabled by default for -O0 and the abort was disabled. Modified test to explicitly enable abort.
  llvm-svn: 328311
* [test] Allow for optional No-Op Barrier Pass in O0 pipeline | Matthew Simpson | 2018-03-23 | 1 | -1/+2
  llvm-svn: 328310
* [X86][SandyBridge] Fix missing comma that was causing string concatenation of 2 instregex entries | Simon Pilgrim | 2018-03-23 | 1 | -4/+4
  Found while updating D44687
  llvm-svn: 328308
* [IPSCCP] Use constant range information for comparisons of parameters. | Florian Hahn | 2018-03-23 | 1 | -13/+9
  For comparisons with parameters, we can use the ParamState lattice elements, which also provide constant range information. This improves the code for PR33253 further and gets us closer to using ValueLatticeElement for all values.
  Also, as we are using the range information in the solver directly, we no longer need tryToReplaceWithConstantRange afterwards.
  Reviewers: dberlin, mssimpso, davide, efriedma
  Reviewed By: mssimpso
  Differential Revision: https://reviews.llvm.org/D43762
  llvm-svn: 328307
* [X86][Btver2] Vector move/load/store instructions use a JFPU01 scheduler pipe and JFPX/JVALU function unit as well as the AGUs | Simon Pilgrim | 2018-03-23 | 7 | -520/+520
  llvm-svn: 328304
* [LoopUnroll] Simplify induction variables after peeling too. | Florian Hahn | 2018-03-23 | 1 | -15/+10
  Loop peeling also has an impact on the induction variables, so we should benefit from induction variable simplification after peeling too.
  Reviewers: sanjoy, bogner, mzolotukhin, efriedma
  Reviewed By: efriedma
  Differential Revision: https://reviews.llvm.org/D43878
  llvm-svn: 328301
* [ARM] Error out on .arm assembler directives on windows | Martin Storsjo | 2018-03-23 | 1 | -0/+3
  Windows on ARM is Thumb-only.
  Differential Revision: https://reviews.llvm.org/D43005
  llvm-svn: 328298