summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [ppc] Correctly compute the cost of loading 32/64 bit memory into VSRGuozhi Wei2016-12-031-5/+14
| | | | | | | | VSX has instructions lxsiwax/lxsdx that can load 32/64 bit value into VSX register cheaply. That patch makes it known to memory cost model, so the vectorization of the test case in pr30990 is beneficial. Differential Revision: https://reviews.llvm.org/D26713 llvm-svn: 288560
* [lanai] Custom lowering of SHL_PARTSJacques Pienaar2016-12-022-1/+53
| | | | | | | | | | | | Summary: Implement custom lowering of SHL_PARTS to enable lowering of left shift with larger than 32-bit shifts. Reviewers: eliben, majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27232 llvm-svn: 288541
* [WebAssembly] Fix a compiler warning. NFC.Dan Gohman2016-12-021-1/+1
| | | | | | | Fix a warning about a comparison between signed and unsigned integer expressions. llvm-svn: 288532
* [SystemZ] Support remaining atomic instructionsUlrich Weigand2016-12-025-0/+124
| | | | | | | | Add assembler support for all atomic instructions that weren't already supported. Some of those could be used to implement codegen for 128-bit atomic operations, but this isn't done here yet. llvm-svn: 288526
* [SystemZ] Support floating-point control register instructionsUlrich Weigand2016-12-027-11/+94
| | | | | | | | | | Add assembler support for instructions manipulating the FPC. Also add codegen support via the GCC compatibility builtins: __builtin_s390_sfpc __builtin_s390_efpc llvm-svn: 288525
* [SystemZ] Refactor hasSideEffects settingUlrich Weigand2016-12-022-45/+27
| | | | | | | | Move setting of hasSideEffects out of SystemZInstrFormats.td, to allow use of the format classes for instructions where this flag shouldn't be set. NFC. llvm-svn: 288524
* AMDGPU: Implement isCheapAddrSpaceCastMatt Arsenault2016-12-022-2/+13
| | | | llvm-svn: 288523
* Tidyup code with indentation and clang-format. NFCI.Simon Pilgrim2016-12-021-6/+6
| | | | llvm-svn: 288505
* [Sparc] Fix parsing of double-precision %f18, %f20, and %f22Daniel Cederman2016-12-021-1/+1
| | | | | | | | | | | | Summary: They are currently being parsed as %f14, %f16, and %f18. Reviewers: venkatra, jyknight Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27342 llvm-svn: 288503
* [X86][SSE] Add support for extracting constant bit data from broadcasted ↵Simon Pilgrim2016-12-021-24/+44
| | | | | | constants llvm-svn: 288499
* [X86] Refactored getTargetConstantBitsFromNode to allow for expansion. NFCI.Simon Pilgrim2016-12-021-44/+56
| | | | | | | | getTargetConstantBitsFromNode currently only extracts constant pool vector data, but it will need to be generalized to support broadcast and scalar constant pool data as well. Converted Constant bit extraction and Bitset splitting to helper lambda functions. llvm-svn: 288496
* [AVX-512] Add EVEX vpshuflw/vpshufhw/vpshufd instructions to load folding ↵Craig Topper2016-12-021-0/+27
| | | | | | tables. llvm-svn: 288484
* [AVX-512] Add EVEX PSHUFB instructions to load folding tables.Craig Topper2016-12-021-0/+9
| | | | llvm-svn: 288482
* [AVX-512] Add masked VINSERTF/VINSERTI instructions to load folding tables.Craig Topper2016-12-021-1/+25
| | | | llvm-svn: 288481
* IR: Change PointerType to derive from Type rather than SequentialType.Peter Collingbourne2016-12-021-0/+2
| | | | | | | | | | | | | | | | | | | As proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106640.html This is for a couple of reasons: - Values of type PointerType are unlike the other SequentialTypes (arrays and vectors) in that they do not hold values of the element type. By moving PointerType we can unify certain aspects of how the other SequentialTypes are handled. - PointerType will have no place in the SequentialType hierarchy once pointee types are removed, so this is a necessary step towards removing pointee types. Differential Revision: https://reviews.llvm.org/D26595 llvm-svn: 288462
* IR: Change the gep_type_iterator API to avoid always exposing the "current" ↵Peter Collingbourne2016-12-027-9/+9
| | | | | | | | | | | | | type. Instead, expose whether the current type is an array or a struct, if an array what the upper bound is, and if a struct the struct type itself. This is in preparation for a later change which will make PointerType derive from Type rather than SequentialType. Differential Revision: https://reviews.llvm.org/D26594 llvm-svn: 288458
* AMDGPU: Use wider scalar spills for SGPR spillingMatt Arsenault2016-12-021-15/+70
| | | | | | | | | | | | | | | | Since the spill is for the whole wave, these don't have the swizzling problems that vector stores do and a single 4-byte allocation is enough to spill a 64 element register. This should reduce the number of spill instructions and put all the spills for a register in the same cacheline. This should save allocated private size, but for now it doesn't. The extra slots are allocated for each component, but never used because the frame layout is essentially finalized before frame indices are replaced. For always using the scalar store path, this should probably be moved into processFunctionBeforeFrameFinalized. llvm-svn: 288445
* [AArch64] Fold more spilled/refilled COPYs.Geoff Berry2016-12-011-10/+36
| | | | | | | | | | | | | | | Summary: Make AArch64InstrInfo::foldMemoryOperandImpl more general by folding all full COPYs between register classes of the same size that are either spilled or refilled. Reviewers: MatzeB, qcolombet Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27271 llvm-svn: 288439
* [MC] Refactor emitELFSize to make usage more consistent. NFC.Dan Gohman2016-12-011-2/+1
| | | | | | | | | | | | | Move the cast<MCSymbolELF> inside emitELFSize, so that: - it's done in one place instead of at each call - it's more consistent with similar functions like EmitCOFFSafeSEH - ambiguity between cast<> and dyn_cast<> is avoided (which also eliminates an unnecessary dyn_cast call) This also makes it easier to experiment with using ".size" directives on non-ELF targets. llvm-svn: 288437
* [ARM] Fix for 64-bit CAS expansion on ARM32 with -O0Oleg Ranevskyy2016-12-011-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch fixes comparison of 64-bit atomic with its expected value in CMP_SWAP_64 expansion. Currently, the low words are compared with CMP, while the high words are compared with SBC. SBC expects the carry flag to be set if CMP detects a difference. CMP might leave the carry unset for unequal arguments though if the first one is >= than the second. This might cause the comparison logic to detect false equality. Example of the broken C++ code: ``` std::atomic<long long> at(2); long long ll = 1; std::atomic_compare_exchange_strong(&at, &ll, 3); ``` Even though the atomic `at` and the expected value `ll` are not equal and `atomic_compare_exchange_strong` returns `false`, `at` is changed to 3. The patch replaces SBC with CMPEQ. Reviewers: t.p.northover Subscribers: aemerson, rengolin, llvm-commits, asl Differential Revision: https://reviews.llvm.org/D27315 llvm-svn: 288433
* AArch64: fix 128-bit cmpxchg at -O0 (again, again).Tim Northover2016-12-011-6/+14
| | | | | | | | | | | | | | | This time the issue is fortunately just a simple mistake rather than a horrible design spectre. I thought SUBS/SBCS provided sufficient NZCV flags for comparing two 64-bit values, but they don't. The fix is slightly clunkier in AArch64 because we can't use conditional execution to emit a pair of CMPs. Traditionally an "icmp ne i128" would map to an EOR/EOR/ORR/CBNZ, but that uses more registers so it's easier to go with a CSET/CINC/CBNZ combination. Slightly less efficient, but this is -O0 anyway. Thanks to Anton Korobeynikov for pointing out the issue. llvm-svn: 288418
* Fix unused variable warning in Release builds. NFC.Benjamin Kramer2016-12-011-1/+1
| | | | llvm-svn: 288416
* Refactored X86InterleavedAccess into a class. NFCI.David L Kreitzer2016-12-011-67/+171
| | | | | | | | Patch by Farhana Aleen Differential Revision: https://reviews.llvm.org/D25986 llvm-svn: 288410
* Move most EH from MachineModuleInfo to MachineFunctionMatthias Braun2016-12-017-19/+14
| | | | | | | | | | | | | | | | | | | | | | | Recommitting r288293 with some extra fixes for GlobalISel code. Most of the exception handling members in MachineModuleInfo is actually per function data (talks about the "current function") so it is better to keep it at the function instead of the module. This is a necessary step to have machine module passes work properly. Also: - Rename TidyLandingPads() to tidyLandingPads() - Use doxygen member groups instead of "//===- EH ---"... so it is clear where a group ends. - I had to add an ugly const_cast at two places in the AsmPrinter because the available MachineFunction pointers are const, but the code wants to call tidyLandingPads() in between (markFunctionEnd()/endFunction()). Differential Revision: https://reviews.llvm.org/D27227 llvm-svn: 288405
* [X86][SSE] Moved shuffle mask widening/narrowing helper functions earlier in ↵Simon Pilgrim2016-12-011-78/+84
| | | | | | | | the file. Will be necessary for a future patch. llvm-svn: 288395
* [SystemZ] Fix fallout from r288374Ulrich Weigand2016-12-011-1/+2
| | | | | | Avoid undefined behavior due to too-large shift count. llvm-svn: 288391
* [SystemZ] Fix applyFixup for 12-bit fixupsUlrich Weigand2016-12-011-1/+3
| | | | | | | | | Now that we have fixups that only fill parts of a byte, it turns out we have to mask off the bits outside the fixup area when applying them. Failing to do so caused invalid object code to be emitted for bprp with a negative 12-bit displacement. llvm-svn: 288374
* [X86][SSE] Classify AND bitmasks as variable shuffle masksSimon Pilgrim2016-12-011-0/+4
| | | | | | They are loading the bitmasks from the constant pool so the cost is similar to loading a shuffle mask. llvm-svn: 288367
* [X86][SSE] Add support for combining AND bitmasks to shuffles.Simon Pilgrim2016-12-011-0/+11
| | | | llvm-svn: 288365
* [LMT] Restrict nop length to oneAsaf Badouh2016-12-011-2/+2
| | | | | | | | | not all lakemont MCU support long nop. we can't assume we can generate long nop by default for MCU. Differential Revision: https://reviews.llvm.org/D26895 llvm-svn: 288363
* Silence GCC's -Wenum-compare after r288335 in the same way it is doneDaniel Jasper2016-12-011-1/+2
| | | | | | in X86FastISel.cpp. llvm-svn: 288337
* [X86][SSE] Add support for combining target shuffles to AND bitmasks.Simon Pilgrim2016-12-011-0/+31
| | | | llvm-svn: 288335
* [X86][SSE] Add support for combining ISD::AND with shuffles.Simon Pilgrim2016-12-011-0/+19
| | | | | | Attempts to convert an AND with a vector of 255 or 0 values into a shuffle (blend) mask. llvm-svn: 288333
* Temporarily Revert "Move most EH from MachineModuleInfo to MachineFunction"Eric Christopher2016-12-017-14/+19
| | | | | | | | | This apprears to have broken the global isel bot: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-globalisel_build/5174/console This reverts commit r288293. llvm-svn: 288322
* [WebAssembly] Emit .import_global assembler directivesDerek Schuff2016-12-013-0/+16
| | | | | | | | | | | | Support a new assembler directive, .import_global, to declare imported global variables (i.e. those with external linkage and no initializer). The linker turns these into wasm imports. Patch by Jacob Gravelle Differential Revision: https://reviews.llvm.org/D26875 llvm-svn: 288296
* Move most EH from MachineModuleInfo to MachineFunctionMatthias Braun2016-11-307-19/+14
| | | | | | | | | | | | | | | | | | | | | Most of the exception handling members in MachineModuleInfo is actually per function data (talks about the "current function") so it is better to keep it at the function instead of the module. This is a necessary step to have machine module passes work properly. Also: - Rename TidyLandingPads() to tidyLandingPads() - Use doxygen member groups instead of "//===- EH ---"... so it is clear where a group ends. - I had to add an ugly const_cast at two places in the AsmPrinter because the available MachineFunction pointers are const, but the code wants to call tidyLandingPads() in between (markFunctionEnd()/endFunction()). Differential Revision: https://reviews.llvm.org/D27227 llvm-svn: 288293
* Move FrameInstructions from MachineModuleInfo to MachineFunctionMatthias Braun2016-11-3011-89/+89
| | | | | | | | | | | This is per function data so it is better kept at the function instead of the module. This is a necessary step to have machine module passes work properly. Differential Revision: https://reviews.llvm.org/D27185 llvm-svn: 288291
* [PS4] Tighten up a triple check.Paul Robinson2016-11-301-1/+1
| | | | llvm-svn: 288286
* [AArch64] Refactor LSE support as feature separate from V8.1a support.Joel Jones2016-11-305-6/+16
| | | | | | | | | | | | | | | | | | | Summary: This is preparation for ThunderX processors that have Large System Extension (LSE) atomic instructions, but not the other instructions introduced by V8.1a. This will mimic changes to GCC as described here: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00388.html LSE instructions are: LD/ST<op>, CAS*, SWP Reviewers: t.p.northover, echristo, jmolloy, rengolin Subscribers: aemerson, mehdi_amini Differential Revision: https://reviews.llvm.org/D26621 llvm-svn: 288279
* Clarify rules for reserved regs, fix aarch64 ones.Matthias Braun2016-11-303-19/+24
| | | | | | | | | No test case necessary as the problematic condition is checked with the newly introduced assertAllSuperRegsMarked() function. Differential Revision: https://reviews.llvm.org/D26648 llvm-svn: 288277
* [AArch64] Fix useful bits detection for BFM instructionsSilviu Baranga2016-11-301-9/+38
| | | | | | | | | | | | | | | | | | | Summary: When computing useful bits for a BFM instruction, we need to take into consideration the case where both operands of the BFM are equal and provide data that we need to track. Not doing this can cause us to miss useful bits. Fixes PR31138 (https://llvm.org/bugs/show_bug.cgi?id=31138) Reviewers: t.p.northover, jmolloy Subscribers: evandro, gberry, srhines, pirama, mcrosier, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D27130 llvm-svn: 288253
* [X86][SSE] Add support for target shuffle constant foldingSimon Pilgrim2016-11-301-4/+173
| | | | | | | | | | Initial support for target shuffle constant folding in cases where all shuffle inputs are constant. We may be able to relax this and merge shuffles with only some constant inputs in the future. I've added the helper function getTargetConstantBitsFromNode (based off a similar function in X86ShuffleDecodeConstantPool.cpp) that could be reused for other cases requiring constant vector extraction. Differential Revision: https://reviews.llvm.org/D27220 llvm-svn: 288250
* [PowerPC] Preserve machine dominator tree in PPCVSXFMAMutateKrzysztof Parzyszek2016-11-301-0/+4
| | | | | | It is needed by LiveIntervalAnalysis. llvm-svn: 288243
* [PowerPC] Improvements for BUILD_VECTOR Vol. 2Nemanja Ivanovic2016-11-291-4/+98
| | | | | | | | | | This patch corresponds to review: https://reviews.llvm.org/D26023 This patch adds support for converting a vector of loads into a single load if the loads are consecutive (in either direction). llvm-svn: 288219
* [PowerPC] Improvements for BUILD_VECTOR Vol. 2Nemanja Ivanovic2016-11-292-1/+98
| | | | | | | | | | | | | | | This patch corresponds to review: https://reviews.llvm.org/D25980 This is the 2nd patch in a series of 4 that improve the lowering and combining for BUILD_VECTOR nodes on PowerPC. This particular patch combines a build vector of fp-to-int conversions into an fp-to-int conversion of a build vector of fp values. For example: Converts (build_vector (fp_to_[su]i $A), (fp_to_[su]i $B), ...) Into (fp_to_[su]i (build_vector $A, $B, ...))). Which is a natural match for much cleaner code. llvm-svn: 288218
* [lanai] Manually match 0/-1 with R0/R1.Jacques Pienaar2016-11-292-7/+22
| | | | | | | | | | | | Summary: Previously 0 and -1 was matched via tablegen rules. But this could cause problems where a physical register was being used where a virtual register was expected (seen in optimizeSelect and TwoAddressInstructionPass). Instead follow AArch64 and match in DAGToDAGISel. Reviewers: eliben, majnemer Subscribers: llvm-commits, aemerson Differential Revision: https://reviews.llvm.org/D27171 llvm-svn: 288215
* Revert https://reviews.llvm.org/rL287679Nemanja Ivanovic2016-11-292-23/+6
| | | | | | | This commit caused some miscompiles that did not show up on any of the bots. Reverting until we can investigate the cause of those failures. llvm-svn: 288214
* [AArch64] allow and-not-compare transform to form 'bics'Sanjay Patel2016-11-291-0/+5
| | | | | | | | | This target hook was added with D19087: https://reviews.llvm.org/D19087 Differential Revision: https://reviews.llvm.org/D27221 llvm-svn: 288206
* [AArch64] Add a basic SchedMachineModel for Falkor.Chad Rosier2016-11-292-2/+29
| | | | | | Differential Revision: https://reviews.llvm.org/D26972 llvm-svn: 288194
* AMDGPU: Disallow exec as SMEM instruction operandMatt Arsenault2016-11-294-19/+42
| | | | | | | | | | | | | | | | | | | This is not in the list of valid inputs for the encoding. When spilling, copies from exec can be folded directly into the spill instruction which results in broken stores. This only fixes the operand constraints, more codegen work is required to avoid emitting the invalid spills. This sort of breaks the dbg.value test. Because the register class of the s_load_dwordx2 changes, there is a copy to SReg_64, and the copy is the operand of dbg_value. The copy is later dead, and removed from the dbg_value. llvm-svn: 288191
OpenPOWER on IntegriCloud