path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* The TOC save offset can be computed at compile time, do so and propagate changes. | Eric Christopher | 2015-02-13 | 3 | -7/+10
  llvm-svn: 228997
* The return save offset can be computed at initialization time - do so and save the value. | Eric Christopher | 2015-02-13 | 3 | -17/+19
  llvm-svn: 228996
* X86: Don't crash if we can't decode the pshufb mask | David Majnemer | 2015-02-12 | 1 | -0/+2
  Constant pool entries are uniqued by their contents regardless of their type. This means that a pshufb can have a shuffle mask which isn't a simple array of bytes. The code path which attempts to decode the mask didn't check for failure, causing PR22559.
  llvm-svn: 228979
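The failure mode described above can be sketched in isolation. This is a minimal illustration, not the code from the commit: `PoolConstant`, `tryDecodeByteShuffleMask` and `decodedMaskSize` are hypothetical names, and using -1 to mean "this byte is zeroed" is an assumption made for the sketch. The only hardware fact relied on is that pshufb zeroes a destination byte when bit 7 of its control byte is set and otherwise uses the low four bits as an index.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a constant-pool entry; not an LLVM type.
struct PoolConstant {
  bool isByteArray;            // false when the uniqued entry has another type
  std::vector<uint8_t> bytes;  // raw mask bytes, meaningful only if isByteArray
};

// Decodes a pshufb control mask. Returns false (and leaves `mask` empty) when
// the constant cannot be read as a plain byte array.
// Each output element is a source byte index, or -1 for "write zero".
bool tryDecodeByteShuffleMask(const PoolConstant &c, std::vector<int> &mask) {
  mask.clear();
  if (!c.isByteArray)
    return false;
  for (uint8_t b : c.bytes)
    mask.push_back((b & 0x80) ? -1 : (b & 0x0F));
  return true;
}

// A caller that checks for decode failure instead of assuming success.
// Returns the number of decoded elements, or 0 if nothing could be decoded.
std::size_t decodedMaskSize(const PoolConstant &c) {
  std::vector<int> mask;
  if (!tryDecodeByteShuffleMask(c, mask))
    return 0;
  return mask.size();
}
```

The point of the sketch is the early return in the caller: a mask that cannot be decoded is skipped rather than used.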
* Learn that __DATA,__objc_classrefs is not atomized via symbols. | Rafael Espindola | 2015-02-12 | 1 | -0/+4
  This should hopefully fix objc on AArch64.
  llvm-svn: 228976
* Change max interleave factor to 12 for POWER7 and POWER8. | Olivier Sallenave | 2015-02-12 | 1 | -0/+6
  llvm-svn: 228973
* Remove mostly unused setters. | Rafael Espindola | 2015-02-12 | 2 | -25/+1
  Most of the code was setting the TargetOptions directly.
  llvm-svn: 228961
* Add bulk of returning of values to Mips fast-isel | Reed Kotler | 2015-02-12 | 1 | -4/+80
  Summary: Implement the bulk of returning values in Mips fast-isel
  Test Plan: reatabi.ll
  Passes test-suite at -O0,-O2 and with mips32r2 and mips32r1.
  Reviewers: dsanders
  Reviewed By: dsanders
  Subscribers: llvm-commits, aemerson, rfuhler
  Differential Revision: http://reviews.llvm.org/D5920
  llvm-svn: 228958
* Relaxed over-zealous alignment requirement for VEX-encoded AES instructions | Simon Pilgrim | 2015-02-12 | 1 | -2/+2
  llvm-svn: 228953
* On ELF, put PIC jump tables in a non executable section. | Rafael Espindola | 2015-02-12 | 1 | -0/+18
  Fixes PR22558.
  llvm-svn: 228939
* Put each jump table in an independent section if the function is too. | Rafael Espindola | 2015-02-12 | 1 | -0/+5
  This allows the linker to GC both, fixing pr22557.
  llvm-svn: 228937
* MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZeros | Benjamin Kramer | 2015-02-12 | 16 | -36/+34
  Update all callers.
  llvm-svn: 228930
* [X86] Call frame optimization - allow stack-relative movs to be folded into a push | Michael Kuperstein | 2015-02-12 | 1 | -6/+0
  Since we track esp precisely, there's no reason not to allow this.
  llvm-svn: 228924
* ARM: Fix another regression introduced in r223113 | Asiri Rathnayake | 2015-02-12 | 1 | -5/+0
  The changes in r223113 (ARM modified-immediate syntax) have broken instructions like:
    mov r0, #~0xffffff00
  The problem is that I've added a spurious range check on the immediate operand to ensure that it lies between INT32_MIN and UINT32_MAX. While this range check is correct in theory, it causes problems because the operand is stored in an int64_t (by MC). So valid 32-bit constants like #~0xffffff00 become out of range.
  The solution is to simply remove this range check. It is not possible to validate the range of the immediate operand with the current setup because:
    1) The operand is stored in an int64_t by MC,
    2) The immediate can be of the forms #imm, #-imm, #~imm or even #((~imm)) etc.
  So we just chop the value to 32 bits and use it.
  Also noted that the original range check was not tested by any of the unit tests. I've added a new test to cover #~imm kind of operands.
  Change-Id: I411e90d84312a2eff01b732bb238af536c4a7599
  llvm-svn: 228920
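The arithmetic behind this fix can be checked with a few lines of standalone C++. This is a sketch under stated assumptions, not the ARM assembler code: `passesOldRangeCheck`, `chopTo32` and `kExampleOperand` are hypothetical names that model the check and the truncation described in the message.

```cpp
#include <cstdint>

// The operand ~0xffffff00 evaluated in 64 bits, the way an int64_t-based
// expression evaluator would see it: 0xFFFFFFFF000000FF, a negative value.
constexpr int64_t kExampleOperand = ~INT64_C(0xffffff00);

// Hypothetical model of the removed check: INT32_MIN <= v <= UINT32_MAX.
constexpr bool passesOldRangeCheck(int64_t v) {
  return v >= INT32_MIN && v <= UINT32_MAX;
}

// Model of the replacement behavior: keep only the low 32 bits.
constexpr uint32_t chopTo32(int64_t v) {
  return static_cast<uint32_t>(v);
}
```

In 64 bits the operand is below INT32_MIN, so the old check rejects it, while its low 32 bits are 0x000000FF, which is the value the programmer meant.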
* AVX-512: Fixed the "test" operation for i1 type | Elena Demikhovsky | 2015-02-12 | 3 | -33/+8
  Using KORTESTW to compare an i1 value with zero was wrong, since the instruction tests 16 bits. KORTESTW may be used together with KSHIFTL+KSHIFTR, which clear the 15 upper bits. I removed the (X86cmp i1, 0) pattern; the i1 is now zero-extended to i8 and tested with TESTB.
  There are some cases where the i1 is in a mask register and the upper bits are already zeroed. KORTESTW is then the better solution, but that is a subject for later optimization. Meanwhile, I'm fixing the correctness issue.
  llvm-svn: 228916
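A small bit-level model shows why testing all 16 bits is wrong for an i1. The function names below are hypothetical and the model is a simplification: a mask register is represented as a `uint16_t` whose bit 0 holds the i1 and whose upper bits may contain unrelated data.

```cpp
#include <cstdint>

// Models the zero flag KORTESTW sets: all 16 bits of (a | b) are zero.
bool kortestwZeroFlag(uint16_t a, uint16_t b) {
  return static_cast<uint16_t>(a | b) == 0;
}

// Shifting left by 15 and then right by 15 (the KSHIFTL+KSHIFTR pairing)
// keeps bit 0 and clears the 15 upper bits.
uint16_t clearUpper15(uint16_t k) {
  uint16_t shiftedLeft = static_cast<uint16_t>(k << 15);
  return static_cast<uint16_t>(shiftedLeft >> 15);
}

// Zero-extend the i1 (bit 0) to 8 bits and test that byte, as TESTB would.
bool i1IsZero(uint16_t k) {
  uint8_t asByte = static_cast<uint8_t>(k & 1u);
  return asByte == 0;
}
```

With `k = 0x0100` the i1 in bit 0 is zero, yet `kortestwZeroFlag(k, k)` reports non-zero because of the stray bit 8. Either clearing the upper bits first or testing only the zero-extended byte gives the right answer.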
* [X86] A heuristic to estimate the size impact for converting stack-relative parameter movs to pushes | Michael Kuperstein | 2015-02-12 | 1 | -25/+71
  This gives a rough estimate of whether using pushes instead of movs is profitable, in terms of size. We go over all calls in the MachineFunction and compute:
    a) For each callsite that can not use pushes, the penalty of not having a reserved call frame.
    b) For each callsite that can use pushes, the gain of actually replacing the movs with pushes (and the potential penalty of having to readjust the stack).
  Differential Revision: http://reviews.llvm.org/D7561
  llvm-svn: 228915
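The shape of such an estimate can be sketched as a single pass over the call sites. Everything here is an assumption made for illustration: the `CallSite` struct, its field names, and the idea that each cost is already available as a byte count. The commit message does not give the actual cost model.

```cpp
#include <vector>

// Hypothetical per-call-site inputs, all in bytes of code size.
struct CallSite {
  bool canUsePushes;
  int reservedFramePenalty;  // cost of losing the reserved call frame (case a)
  int pushGain;              // saving from replacing movs with pushes (case b)
  int readjustPenalty;       // cost of readjusting the stack (case b)
};

// Sums the estimate over all call sites. A positive result means the
// push-based form is estimated to be smaller overall.
int estimateSizeDelta(const std::vector<CallSite> &calls) {
  int delta = 0;
  for (const CallSite &cs : calls) {
    if (cs.canUsePushes)
      delta += cs.pushGain - cs.readjustPenalty;
    else
      delta -= cs.reservedFramePenalty;
  }
  return delta;
}

// The transformation is considered profitable only if the net estimate is positive.
bool pushesLookProfitable(const std::vector<CallSite> &calls) {
  return estimateSizeDelta(calls) > 0;
}
```

Because the decision is made for the whole function, one call site that cannot use pushes can outweigh the gains at the others.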
* [PowerPC] Mark jumps as expensive (using CR bits) | Hal Finkel | 2015-02-12 | 1 | -1/+3
  On PowerPC, which has a full set of logical operations on (its multiple sets of) condition-register bits, it is not profitable to break up complex conditions feeding a jump into multiple jumps. We can turn off this feature of CGP/SDAGBuilder by marking jumps as "expensive".
  P7 test-suite speedups (no regressions):
    MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 -0.626647% +/- 0.323583%
    MultiSource/Benchmarks/Olden/power/power -18.2821% +/- 8.06481%
  llvm-svn: 228895
* R600/SI: Disable subreg liveness | Tom Stellard | 2015-02-11 | 1 | -1/+1
  This is temporary while we try to fix a crash in the register coalescer.
  llvm-svn: 228861
* R600: Split AMDGPUPassConfig into R600PassConfig and GCNPassConfig | Tom Stellard | 2015-02-11 | 2 | -66/+96
  llvm-svn: 228850
* R600: Create an R600TargetMachine for pre-gcn GPUs | Tom Stellard | 2015-02-11 | 2 | -15/+36
  No functionality change. R600TargetMachine inherits from AMDGPUTargetMachine.
  llvm-svn: 228849
* [mips] Merge disassemblers into a single implementation. | Daniel Sanders | 2015-02-11 | 1 | -84/+18
  Summary: Currently we have Mips32 and Mips64 disassemblers and this causes the target triple to affect the disassembly despite all the relevant information being in the ELF header. These implementations do not need to be separate. This patch merges them together such that the appropriate tables are checked for the subtarget (e.g. Mips64 is checked when GP64 is enabled).
  Reviewers: vmedic
  Reviewed By: vmedic
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D7498
  llvm-svn: 228825
* [X86] Split information collection from actual transformation in call frame optimization | Michael Kuperstein | 2015-02-11 | 1 | -59/+100
  This splits collecting information from actually performing the transformation, so that we can add a heuristic in between the two. NFC.
  Differential Revision: http://reviews.llvm.org/D7497
  llvm-svn: 228817
* [PBQP] Cautiously update edge costs in the solver | Arnaud A. de Grandmaison | 2015-02-11 | 1 | -2/+2
  The NodeMetadata are maintained in an incremental way. When an edge between 2 nodes has its cost updated, in the course of graph reduction for example, the NodeMetadata need first to have the old edge cost removed, then the new edge cost added. Only once the NodeMetadata have been fully updated does it become safe to consider promoting the nodes to the ConservativelyAllocatable or OptimallyReducible sets. Previously, this promotion was occurring right after removing the old cost, and this was breaking the assumption that a ConservativelyAllocatable node should not be spilled.
  This patch also adds asserts that:
    - enforce the invariant that a node's reduction can not be downgraded,
    - ensure that only nodes that are not provably allocatable or optimally reducible can be spilled.
  llvm-svn: 228816
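The ordering problem can be reproduced with a toy model. This is a sketch only: `ToyNodeMetadata`, its fields, and the promotion rule (promote when the running "denied" count drops below the number of options) are hypothetical simplifications, not the solver's real data structures.

```cpp
// Toy, hypothetical model of incrementally maintained node metadata.
struct ToyNodeMetadata {
  int numOptions;                  // options (registers) available to the node
  int deniedOptions;               // running sum contributed by incident edges
  bool conservativelyAllocatable;  // promotion flag; never downgraded here
};

ToyNodeMetadata makeNode(int numOptions, int deniedOptions) {
  ToyNodeMetadata n = {numOptions, deniedOptions, false};
  return n;
}

// Promote only when the current metadata proves the node can be allocated.
void maybePromote(ToyNodeMetadata &n) {
  if (n.deniedOptions < n.numOptions)
    n.conservativelyAllocatable = true;
}

// Order described as correct: remove old cost, add new cost, then promote.
void updateEdgeCost(ToyNodeMetadata &n, int oldCost, int newCost) {
  n.deniedOptions -= oldCost;
  n.deniedOptions += newCost;
  maybePromote(n);
}

// Order described as buggy: promotion is considered on half-updated metadata.
void updateEdgeCostTooEarly(ToyNodeMetadata &n, int oldCost, int newCost) {
  n.deniedOptions -= oldCost;
  maybePromote(n);
  n.deniedOptions += newCost;
}
```

With `makeNode(2, 3)` and an edge cost that stays at 2, the too-early variant sees a transient count of 1 and promotes the node, even though the final metadata (count 3) would not justify it. The correct variant leaves the node unpromoted.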
* Use ADDITIONAL_HEADER_DIRS in all LLVM CMake projects. | Zachary Turner | 2015-02-11 | 1 | -0/+3
  This allows IDEs to recognize the entire set of header files for each of the core LLVM projects.
  Differential Revision: http://reviews.llvm.org/D7526
  Reviewed By: Chris Bieneman
  llvm-svn: 228798
* R600/SI: Store immediate offsets > 12-bits in soffset | Tom Stellard | 2015-02-11 | 1 | -13/+19
  This will save us from having to extend these offsets to 64 bits and store them in a pair of vgprs.
  llvm-svn: 228776
* R600/SI: Add soffset operand to mubuf addr64 instruction | Tom Stellard | 2015-02-11 | 5 | -28/+33
  We were previously hard-coding soffset to 0.
  llvm-svn: 228775
* X86: @llvm.frameaddress should defer to SelectionDAG for Win CFI | David Majnemer | 2015-02-10 | 1 | -2/+7
  llvm-svn: 228754
* X86: Make @llvm.frameaddress work correctly with Windows unwind codes | David Majnemer | 2015-02-10 | 3 | -5/+34
  Simply loading or storing the frame pointer is not sufficient for Windows targets. Instead, create a synthetic frame object that we will lower later. References to this synthetic object will be replaced with the correct reference to the frame address.
  llvm-svn: 228748
* Fix up r228725, missed change in PPCSubtarget definition | Bill Schmidt | 2015-02-10 | 1 | -6/+6
  llvm-svn: 228728
* [PowerPC] Fix reverted patch r227976 to avoid register assignment issues | Bill Schmidt | 2015-02-10 | 11 | -114/+379
  See full discussion in http://reviews.llvm.org/D7491.
  We now hide the add-immediate and call instructions together in a separate pseudo-op, which is tagged to define GPR3 and clobber the call-killed registers. The PPCTLSDynamicCall pass prior to RA now expands this op into the two separate addi and call ops, with explicit definitions of GPR3 on both instructions, and explicit clobbers on the call instruction. The pass is now marked as requiring and preserving the LiveIntervals and SlotIndexes analyses, and fixes these up after the replacement sequences are introduced.
  Self-hosting has been verified on LE P8 and BE P7 with various optimization levels, etc. It has also been verified with the --no-tls-optimize flag workaround removed.
  llvm-svn: 228725
* X86: Emit Win64 SaveXMM opcodes at the right offset in the right order | David Majnemer | 2015-02-10 | 1 | -19/+18
  Walk the instructions marked FrameSetup and consider any stores of XMM registers to the stack as needing a SaveXMM opcode. This fixes PR22521.
  Differential Revision: http://reviews.llvm.org/D7527
  llvm-svn: 228724
* [PowerPC] Support the (old) cntlz instruction alias | Hal Finkel | 2015-02-10 | 1 | -0/+3
  Some old assembly code uses the cntlz alias for cntlzw. Binutils supports this, and we should too. Fixes PR22519.
  llvm-svn: 228719
* [Hexagon] Adding vector load with post-increment instructions. Adding decoder function for 64bit control register class. | Colin LeMahieu | 2015-02-10 | 4 | -4/+319
  llvm-svn: 228708
* [mips][microMIPS] Implement movep instruction | Zoran Jovanovic | 2015-02-10 | 7 | -0/+237
  Differential Revision: http://reviews.llvm.org/D7465
  llvm-svn: 228703
* [X86][AVX2] Missing AVX2 memory folding instructions | Simon Pilgrim | 2015-02-10 | 1 | -3/+35
  Added most of the missing vector folding patterns for AVX2 (as well as fixing the vpermpd and vpermq patterns)
  Differential Revision: http://reviews.llvm.org/D7492
  llvm-svn: 228688
* [X86][XOP] Added XOP memory folding patterns + tests | Simon Pilgrim | 2015-02-10 | 1 | -5/+100
  This patch adds the complete AMD Bulldozer XOP instruction set to the memory folding pattern tables for stack folding, etc.
  Note: Many of the XOP instructions have multiple table entries as they can fold loads from different sources.
  Differential Revision: http://reviews.llvm.org/D7484
  llvm-svn: 228685
* [mips][microMIPS] Fix disassembling of 16-bit microMIPS instructions LWM16 and SWM16 | Jozef Kolek | 2015-02-10 | 2 | -7/+25
  Differential Revision: http://reviews.llvm.org/D7436
  llvm-svn: 228683
* [X86][FastIsel] Avoid introducing legacy SSE instructions if the target has AVX. | Andrea Di Biagio | 2015-02-10 | 1 | -28/+36
  This patch teaches X86FastISel how to select AVX instructions for scalar float/double convert operations. Before this patch, X86FastISel always selected legacy SSE instructions for FPExt (from float to double) and FPTrunc (from double to float).
  For example:
  \code
    define double @foo(float %f) {
      %conv = fpext float %f to double
      ret double %conv
    }
  \end code
  Before (with -mattr=+avx -fast-isel) X86FastIsel selected a CVTSS2SDrr which is legacy SSE:
    cvtss2sd %xmm0, %xmm0
  With this patch, X86FastIsel selects a VCVTSS2SDrr instead:
    vcvtss2sd %xmm0, %xmm0, %xmm0
  Added test fast-isel-fptrunc-fpext.ll to check both the register-register and the register-memory float/double conversion variants.
  Differential Revision: http://reviews.llvm.org/D7438
  llvm-svn: 228682
* [X86] Preserve mem refs on newly created 'Store' node instead of 'Load' node when handling store unfolding. | Craig Topper | 2015-02-10 | 1 | -1/+1
  Bug spotted by Steve King. I have no idea how to test this.
  llvm-svn: 228672
* [X86] Remove unnecessary alignment checks from the load folding tables. | Craig Topper | 2015-02-10 | 1 | -44/+44
  llvm-svn: 228671
* X86: Emit an ABI compliant prologue and epilogue for Win64 | David Majnemer | 2015-02-10 | 1 | -79/+132
  Win64 has specific constraints on what valid prologues and epilogues look like. This constraint is born from the flexibility and descriptiveness of Win64's unwind opcodes. Prologues previously emitted by LLVM could not be represented by the unwind opcodes, preventing operations powered by stack unwinding from working.
  Differential Revision: http://reviews.llvm.org/D7520
  llvm-svn: 228641
* Migrate PPCAsmPrinter's subtarget from reference to pointer in preparation for making it MachineFunction dependent. | Eric Christopher | 2015-02-10 | 1 | -48/+49
  llvm-svn: 228638
* Fix the clang -Werror build (-Wunused-variable) | David Blaikie | 2015-02-10 | 1 | -3/+0
  llvm-svn: 228635
* [Hexagon] Adding missing load instructions and removing an unused multiclass parameter. | Colin LeMahieu | 2015-02-09 | 1 | -38/+169
  llvm-svn: 228630
* [Hexagon] Factoring classes out of some load patterns and deleting some unused ones. | Colin LeMahieu | 2015-02-09 | 1 | -40/+87
  llvm-svn: 228627
* [Hexagon] Removing more V4 predicates since V4 is the required minimum. | Colin LeMahieu | 2015-02-09 | 12 | -470/+226
  llvm-svn: 228614
* [Hexagon] Removing v2-4 flags. V4 is the minimum supported version. | Colin LeMahieu | 2015-02-09 | 4 | -93/+64
  llvm-svn: 228605
* [Hexagon] Factoring classes out of store patterns. | Colin LeMahieu | 2015-02-09 | 1 | -34/+47
  llvm-svn: 228602
* [Hexagon] Formatting v5 TD file. Removing commented defs. | Colin LeMahieu | 2015-02-09 | 1 | -38/+28
  llvm-svn: 228598
* [Hexagon] Cleaning up definition formatting. | Colin LeMahieu | 2015-02-09 | 1 | -85/+85
  llvm-svn: 228593
* This change implements the following three logical vector operations: | Kit Barton | 2015-02-09 | 1 | -0/+25
    veqv (vector equivalence)
    vnand
    vorc
  I increased the AddedComplexity for these instructions to 500 to ensure they are generated instead of issuing other VSX instructions.
  Phabricator review: http://reviews.llvm.org/D7469
  llvm-svn: 228580