path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86][SSE4A] Add support for combining from EXTRQI/INSERTQI shufflesSimon Pilgrim2017-07-032-18/+9
| | | | llvm-svn: 307048
* [X86][SSE4A] Add SSE4A shuffle tests on pre-SSSE3 hardwareSimon Pilgrim2017-07-031-0/+71
| | | | llvm-svn: 307042
* [X86][SSE4A] Test SSE4A shuffle combining on SSE42 capable target as wellSimon Pilgrim2017-07-031-17/+36
| | | | llvm-svn: 307038
* DAGCombine: Combine BUILD_VECTOR to TRUNCATEZvi Rackover2017-07-033-557/+188
| Summary:
| Add a combine for creating a truncate to replace a build_vector composed of extracts with indices that form a stride-2^N series.
|
| Example:
| v8i32 V = ...
| v4i32 build_vector((extract_elt V, 0), (extract_elt V, 2), (extract_elt V, 4), (extract_elt V, 6))
| -->
| v4i32 truncate (bitcast V to v4i64)
|
| Related discussion in llvm-dev about canonicalizing shuffles to truncates in LLVM IR:
| http://lists.llvm.org/pipermail/llvm-dev/2017-January/108936.html.
|
| Reviewers: spatel, RKSimon, efriedma, igorb, craig.topper, wolfgangp, delena
| Reviewed By: delena
| Subscribers: guyblank, delena, javed.absar, llvm-commits
| Differential Revision: https://reviews.llvm.org/D34077
|
| llvm-svn: 307036
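The equivalence stated in the commit above can be checked with a small model. This is only an illustrative sketch, not LLVM code: the helper names are hypothetical, vectors are modelled as Python lists, and a little-endian lane layout is assumed for the bitcast.

```python
import struct

def bitcast_v8i32_to_v4i64(lanes):
    """Reinterpret eight 32-bit lanes as four 64-bit lanes (little-endian layout assumed)."""
    raw = struct.pack("<8I", *lanes)
    return list(struct.unpack("<4Q", raw))

def truncate_to_i32(lanes):
    """Keep only the low 32 bits of every lane."""
    return [x & 0xFFFFFFFF for x in lanes]

def build_vector_stride2(lanes):
    """Model build_vector of extract_elt at indices 0, 2, 4, 6."""
    return [lanes[i] for i in (0, 2, 4, 6)]

V = [0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17]
combined = truncate_to_i32(bitcast_v8i32_to_v4i64(V))
assert combined == build_vector_stride2(V)
```

Under the little-endian assumption, each 64-bit lane holds an even-indexed element in its low half, which is why truncation selects the stride-2 series.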
* [x86] auto-generate complete checks for tests; NFCSanjay Patel2017-07-034-155/+162
| | | | | | | These all used 'CHECK-NOT' which isn't necessary if we have complete checks. There were also over-specifications in the RUN params such as CPU model. llvm-svn: 307033
* [x86] auto-generate complete checks for tests; NFCSanjay Patel2017-07-034-219/+539
| | | | | | | These all used 'CHECK-NOT' which isn't necessary if we have complete checks. There were also several over-specifications in the RUN params such as CPU model or OS requirement llvm-svn: 307028
* [X86][SSE4A] Add tests showing missed opportunities to combine EXTRQI/INSERTQI shufflesSimon Pilgrim2017-07-031-0/+80
| | | | llvm-svn: 307027
* [AMDGPU] Switch scalarize global loads ON by defaultAlexander Timofeev2017-07-03140-558/+788
| | | | | | Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307026
* [x86] auto-generate complete checks for tests; NFCSanjay Patel2017-07-034-337/+337
| | | | | | These all used 'CHECK-NOT' which isn't necessary if we have complete checks. llvm-svn: 307024
* [GlobalISel][X86] fix %ptr(p0) = G_CONSTANT selection.Igor Breger2017-07-032-0/+40
| | | | llvm-svn: 307019
* AMDGPU: Add operand target flags serializationMatt Arsenault2017-07-021-0/+29
| | | | llvm-svn: 306995
* [X86][AVX512] Test AVX512VPOPCNTDQ CTPOP with/without AVX512BWSimon Pilgrim2017-07-021-29/+57
| | | | llvm-svn: 306991
* [X86][AVX512VPOPCNTDQ] Improve support for v16i8/v8i16/v16i16/ CTPOPSimon Pilgrim2017-07-026-156/+111
| | | | | | Zero extend to v16i32/v8i64, use VPOPCNTDQ instructions and truncate back. llvm-svn: 306990
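The widen, count, narrow sequence described in the commit above can be sketched as follows. This is a hedged model in plain Python, not the backend lowering; the function name and its parameters are hypothetical.

```python
def ctpop_via_widening(lanes, narrow_bits=8, wide_bits=32):
    """Population count per lane by zero-extending, counting, then truncating back."""
    narrow_mask = (1 << narrow_bits) - 1
    wide_mask = (1 << wide_bits) - 1
    # Zero extend: the upper bits of each wide lane are zero, so they add nothing to the count.
    widened = [(x & narrow_mask) & wide_mask for x in lanes]
    # Count set bits in each wide lane (stands in for the wide popcount instruction).
    counts = [bin(x).count("1") for x in widened]
    # Truncate back: a count never exceeds narrow_bits, so it always fits.
    return [c & narrow_mask for c in counts]

result = ctpop_via_widening([0x00, 0x01, 0x03, 0xFF])
```

The truncation is lossless because the largest possible count equals the narrow lane width, which is representable in the narrow type.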
* [X86][AVX512] Cleanup tzcnt tests triples and attributesSimon Pilgrim2017-07-021-36/+36
| | | | | | Avoid use of specific -mcpu llvm-svn: 306989
* [X86][AVX512] Cleanup popcnt tests triples and attributesSimon Pilgrim2017-07-021-15/+15
| | | | | | Avoid use of specific -mcpu llvm-svn: 306988
* [x86] auto-generate complete checks for tests; NFCSanjay Patel2017-07-024-72/+126
| | | | | | These all used 'CHECK-NOT' which isn't necessary if we have complete checks. llvm-svn: 306984
* [x86] remove unnecessary RUN for test after auto-generating checks; NFCSanjay Patel2017-07-021-5/+21
| | | | llvm-svn: 306983
* [x86] update test to use FileCheck and auto-generate checks; NFCSanjay Patel2017-07-021-1/+50
| | | | llvm-svn: 306982
* [x86] auto-generate complete checks for tests; NFCSanjay Patel2017-07-024-32/+41
| | | | | | These all used 'CHECK-NOT' which isn't necessary if we have complete checks. llvm-svn: 306981
* [X86][SSE] Attempt to combine 64-bit and 32-bit shuffles to unary shuffles before bit shiftsSimon Pilgrim2017-07-022-2/+2
| | | | | | We are combining shuffles to bit shifts before unary permutes, which means we can't fold loads plus the destination register is destructive llvm-svn: 306978
* [X86][SSE] Attempt to combine 64-bit and 16-bit shuffles to unary shuffles before bit shiftsSimon Pilgrim2017-07-021-5/+2
| | | | | | | | We are combining shuffles to bit shifts before unary permutes, which means we can't fold loads plus the destination register is destructive The 32-bit shuffles are a bit tricky and will be dealt with in a later patch llvm-svn: 306977
* [X86][SSE] Add test showing missed opportunity to combine to pshuflwSimon Pilgrim2017-07-021-0/+18
| | | | | | We are combining shuffles to bit shifts before unary permutes, which means we can't fold loads plus the destination register is destructive llvm-svn: 306976
* [X86] Rerun "update_llc_test_checks" tool on CodeGen tests. NFC.Gadi Haber2017-07-023-0/+75
| | | | | | | | | | This is NFC after rerunning the "update_llc_test_checks.py" tool on the CodeGen X86 tests in order to submit a patch. Minor differences due to added "End of Function" lines. Reviewers: zvi Differential Revision: https://reviews.llvm.org/D34933 llvm-svn: 306973
* [GlobalISel][X86] Support G_GLOBAL_VALUE operation.Igor Breger2017-07-024-0/+220
| Summary: Support G_GLOBAL_VALUE operation. For now, most of the PIC configurations are not implemented yet.
|
| Reviewers: zvi, guyblank
| Reviewed By: guyblank
| Subscribers: rovka, kristof.beyls, llvm-commits
| Differential Revision: https://reviews.llvm.org/D34738
|
| Conflicts:
| test/CodeGen/X86/GlobalISel/regbankselect-X86_64.mir
|
| llvm-svn: 306972
* [GlobalISel][X86] Support vector type G_UNMERGE_VALUES selection.Igor Breger2017-07-023-17/+283
| Summary: Support vector type G_UNMERGE_VALUES selection. For now, G_UNMERGE_VALUES is marked as legal for any type, so there is nothing to do in the legalizer.
|
| Reviewers: t.p.northover, qcolombet, zvi, guyblank
| Reviewed By: guyblank
| Subscribers: rovka, kristof.beyls, guyblank, llvm-commits
| Differential Revision: https://reviews.llvm.org/D33665
|
| llvm-svn: 306971
* fix trivial typos; NFCHiroshi Inoue2017-07-021-1/+1
| | | | | | suport -> support llvm-svn: 306968
* [X86][RDSEED] Split off i64 intrinsic tests and test i16/i32 on 32-bit target as well.Simon Pilgrim2017-07-012-29/+56
| | | | llvm-svn: 306961
* [X86][RDRAND] Split off i64 intrinsic tests and test i16/i32 on 32-bit target as well.Simon Pilgrim2017-07-012-36/+102
| | | | llvm-svn: 306960
* [X86] Removed reference to update_test_checks.pySimon Pilgrim2017-07-011-1/+1
| | | | llvm-svn: 306959
* [X86][AVX] Remove duplicate autogeneration noteSimon Pilgrim2017-07-011-3/+2
| | | | llvm-svn: 306958
* Remove the default ARMSubtarget from the ARM TargetMachine.Eric Christopher2017-07-011-10/+0
| | | | | | | This enables us to ensure better LTO and code generation in the face of module linking. Remove a report_fatal_error from the TargetMachine and replace it with an assert in ARMSubtarget - and remove the test that depended on the error. The assertion will still fire in the case that we were reporting before, but error reporting needs to be in front end tools if possible for options parsing. llvm-svn: 306939
* Recommit "r306541 - Add zero-length check to memcpy/memset load store loop expansion"Teresa Johnson2017-07-011-0/+4
| | | | | | With fix for use-after-free errors. We can't add the new branch and remove the old one until we are done with the Builder constructed for the block. llvm-svn: 306937
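Why a load/store loop expansion wants a zero-length check can be illustrated with a small model. This is a sketch only, and it assumes the expanded loop tests its exit condition at the bottom, after the first iteration has already run; the function name is hypothetical and the lists stand in for memory.

```python
def copy_loop_expansion(dst, src, length):
    """Model of a memcpy expanded into a bottom-tested load/store loop."""
    # Zero-length check: skip the loop entirely, otherwise the
    # bottom-tested loop below would copy one element before testing.
    if length == 0:
        return dst
    index = 0
    while True:
        dst[index] = src[index]  # one load and one store per iteration
        index += 1
        if index >= length:      # exit test at the bottom of the loop
            break
    return dst

untouched = copy_loop_expansion([0, 0, 0], [7, 8, 9], 0)
partial = copy_loop_expansion([0, 0, 0], [7, 8, 9], 2)
```

Without the guard, a call with length 0 on empty buffers would index out of bounds in this model, which mirrors the stray access the check is meant to prevent.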
* Rewrite ARM execute only support to avoid the use of a command line flag and unqualified ARMSubtarget lookup.Eric Christopher2017-07-014-15/+15
| | | | | | Paired with a clang commit to use the new behavior. llvm-svn: 306927
* [ORE] Add diagnostics hotness thresholdBrian Gesiak2017-06-301-0/+27
| | | | | | | | | | | | | | | | | | | | Summary: Add an option to prevent diagnostics that do not meet a minimum hotness threshold from being output. When generating optimization remarks for large codebases with a ton of cold code paths, this option can be used to limit the optimization remark output at a reasonable size. Discussion of this change can be read here: http://lists.llvm.org/pipermail/llvm-dev/2017-June/114377.html Reviewers: anemet, davidxl, hfinkel Reviewed By: anemet Subscribers: qcolombet, javed.absar, fhahn, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D34867 llvm-svn: 306912
* [Hexagon] Implement frame pointer elimination with -fomit-frame-pointerKrzysztof Parzyszek2017-06-303-24/+92
| | | | | | | It applies to leaf functions that are otherwise not required to have a frame pointer. llvm-svn: 306888
* Fix ODR violations due to abuse of LLVM_YAML_IS_(FLOW_)?SEQUENCE_VECTORRichard Smith2017-06-301-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | This is a short-term fix for PR33650 aimed to get the modules build bots green again. Remove all the places where we use the LLVM_YAML_IS_(FLOW_)?SEQUENCE_VECTOR macros to try to locally specialize a global template for a global type. That's not how C++ works. Instead, we now centrally define how to format vectors of fundamental types and of string (std::string and StringRef). We use flow formatting for the former cases, since that's the obvious right thing to do; in the latter case, it's less clear what the right choice is, but flow formatting is really bad for some cases (due to very long strings), so we pick block formatting. (Many of the cases that were using flow formatting for strings are improved by this change.) Other than the flow -> block formatting change for some vectors of strings, this should result in no functionality change. Differential Revision: https://reviews.llvm.org/D34907 Corresponding updates to clang, clang-tools-extra, and lld to follow. llvm-svn: 306878
* GlobalISel: add G_IMPLICIT_DEF instruction.Tim Northover2017-06-306-22/+52
| | | | | | | | | It looks like there are two target-independent but not GISel instructions that need legalization, IMPLICIT_DEF and PHI. These are already anomalies since their operands have important LLTs attached, so to make things more uniform it seems like a good idea to add generic variants. Starting with G_IMPLICIT_DEF. llvm-svn: 306875
* [Hexagon] Emit jump tables in text section based on a flagSumanth Gundapaneni2017-06-301-0/+57
| | | | | | | | This patch adds a new LLVM flag -hexagon-emit-jt-text which is defaulted to "false". The value "true" emits the switch generated jump tables in text section. Differential Revision: https://reviews.llvm.org/D34820 llvm-svn: 306872
* Revert "[Hexagon] Guard the generation of lookup table"Sumanth Gundapaneni2017-06-301-57/+0
| | | | | | | This reverts commit ae521f4192c3ed0202c047fec993cb59133dd1a0. Wrong commit message llvm-svn: 306871
* [Hexagon] Guard the generation of lookup tableSumanth Gundapaneni2017-06-301-0/+57
| | | | | | | | | The llvm flag "-hexagon-emit-lookup-tables" guards the generation of lookup table from a switch statement. Differential Revision: https://reviews.llvm.org/D34819 llvm-svn: 306869
* ARM: fix big-endian 64-bit cmpxchg.Tim Northover2017-06-301-0/+26
| | | | | | | | | | On big-endian machines the high and low parts of the value accessed by ldrexd and strexd are swapped around. To account for this we swap inputs and outputs in ISelLowering. Patch by Bharathi Seshadri. llvm-svn: 306865
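The register swap described in the commit above can be modelled roughly as below. This sketch assumes that the first register of an ldrexd pair receives the word at the lower address and the second the word at the higher address; the function names are hypothetical and `struct` merely simulates memory layout.

```python
import struct

def ldrexd_pair(value, big_endian):
    """Simulate loading a 64-bit value into a register pair (rt, rt2)."""
    raw = struct.pack(">Q" if big_endian else "<Q", value)
    # rt takes the word at the lower address, rt2 the word at the higher one (assumed).
    return struct.unpack(">II" if big_endian else "<II", raw)

def high_low_halves(value, big_endian):
    """Return (high, low) 32-bit halves, swapping the pair on big-endian."""
    rt, rt2 = ldrexd_pair(value, big_endian)
    return (rt, rt2) if big_endian else (rt2, rt)

value = 0x1122334455667788
little = high_low_halves(value, big_endian=False)
big = high_low_halves(value, big_endian=True)
assert little == big
```

The pair order differs between the two layouts, so code that assumes one fixed order reads the halves the wrong way round on the other; swapping at one place keeps the rest of the logic endian-neutral.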
* [PowerPC] auto-generate check lines; NFCSanjay Patel2017-06-301-70/+58
| | | | | | | | | | | The existing check lines were more flexible, but these are small enough tests that there shouldn't be much question about register allocation. I've been hand-modifying this file as I change the CGP memcmp expansion, but that's more error-prone and time-consuming than just running the update script. llvm-svn: 306861
* Revert "[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset"Nirav Dave2017-06-308-163/+209
| | | | | This reverts commit r306819, which appears to be exposing underlying issues in a stage1 ppc64be build. llvm-svn: 306820
* [DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffsetNirav Dave2017-06-308-209/+163
| As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using generic checks. Also, propagate missing local handling from there to BaseIndexOffset checks.
|
| Tests of note:
|
| * test/CodeGen/X86/build-vector* - Improved.
| * test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge
| * test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation.
|
| Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab
| Subscribers: nemanjai, llvm-commits
| Differential Revision: https://reviews.llvm.org/D34472
|
| llvm-svn: 306819
* [X86] Updated 32-bit memcmp tests to run with/without SSE2Simon Pilgrim2017-06-301-347/+402
| | | | llvm-svn: 306816
* Revert "r306541 - Add zero-length check to memcpy/memset load store loop expansion"Daniel Jasper2017-06-301-4/+0
| | | | | | Segfaults in non-optimized builds. I'll get a stack trace and a reproducer to Teresa. llvm-svn: 306793
* [WebAssembly] Add support for exception handling instructionsHeejin Ahn2017-06-301-0/+22
| | | | | | | | | | | | | | | | | | | Summary: This adds backend support for throw, rethrow, try, and try_end instructions. This needs the corresponding clang builtin support: https://reviews.llvm.org/D34783 This follows the Wasm exception handling proposal in https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md Reviewers: sunfish, dschuff Reviewed By: dschuff Subscribers: jfb, sbc100, jgravelle-google Differential Revision: https://reviews.llvm.org/D34826 llvm-svn: 306774
* Unified logic for computing target ABI in backend and front end by moving this common code to Support/TargetParser.Eric Christopher2017-06-306-7/+7
| | | | | | | Modeled Triple::GNU after front end code (aapcs abi) and updated tests that expect apcs abi. Based heavily on a patch by Ana Pazos! llvm-svn: 306768
* [GISel]: New Opcode G_FLOG/G_FLOG2Aditya Nandakumar2017-06-291-0/+19
| | | | | | https://reviews.llvm.org/D34837 llvm-svn: 306766
* Remove redundant copy in recurrencesTaewook Oh2017-06-292-1/+234
| Summary:
| If there is a chain of instructions formulating a recurrence, commuting operands can help remove a redundant copy. In the following example code,
|
| ```
| BB#1: ; Loop Header
|   %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
|   ...
|
| BB#6: ; Loop Latch
|   %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
|   %vreg10<def,tied1> = ADD32rr %vreg1<kill,tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1,%vreg0
|   %vreg3<def,tied1> = ADD32rr %vreg2<kill,tied0>, %vreg10<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2,%vreg10
|   CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
|   %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
|   JL_1 <BB#1>, %EFLAGS<imp-use,kill>
| ```
|
| The existing two-address generation pass generates the following code:
|
| ```
| BB#1:
|   %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
|   ...
|
| BB#6:
|   Predecessors according to CFG: BB#5 BB#4
|   %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
|   %vreg10<def> = COPY %vreg1<kill>; GR32:%vreg10,%vreg1
|   %vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg0
|   %vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10
|   %vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2
|   CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
|   %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
|   JL_1 <BB#1>, %EFLAGS<imp-use,kill>
|   JMP_1 <BB#7>
| ```
|
| This is suboptimal because the generated assembly code has a redundant copy at the end of BB#6 to feed %vreg13 to BB#1:
|
| ```
| .LBB0_6:
|   addl %esi, %edi
|   addl %ebx, %edi
|   cmpl $10, %edi
|   movl %edi, %esi
|   jl .LBB0_1
| ```
|
| This redundant copy can be eliminated by making the instructions in the recurrence chain compute the value "into" the register that actually holds the feedback value. In this example, this can be achieved by commuting %vreg0 and %vreg1 to compute %vreg10. With that change, the code after two-address generation becomes
|
| ```
| BB#1:
|   %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13
|   ...
|
| BB#6: derived from LLVM BB %bb7
|   Predecessors according to CFG: BB#5 BB#4
|   %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15
|   %vreg10<def> = COPY %vreg0<kill>; GR32:%vreg10,%vreg0
|   %vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg1<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1
|   %vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10
|   %vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2
|   CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3
|   %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3
|   JL_1 <BB#1>, %EFLAGS<imp-use,kill>
|   JMP_1 <BB#7>
| ```
|
| and the final assembly does not have the redundant copy:
|
| ```
| .LBB0_6:
|   addl %edi, %eax
|   addl %ebx, %eax
|   cmpl $10, %eax
|   jl .LBB0_1
| ```
|
| Reviewers: qcolombet, MatzeB, wmi
| Reviewed By: wmi
| Subscribers: llvm-commits
| Differential Revision: https://reviews.llvm.org/D31821
|
| llvm-svn: 306758