summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* Enhance synchscope representationKonstantin Zhuravlyov2017-07-113-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this replaces *singlethread* keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722
* [CodeGen] Rename DEBUG_TYPE to match passnamesEvandro Menezes2017-07-119-9/+9
| | | | | | | | | Rename missing DEBUG_TYPE "machine-scheduler" from backend files, which were absent from https://reviews.llvm.org/rL303921. Differential revision: https://reviews.llvm.org/D35231 llvm-svn: 307719
* [mips][mt] Correct spelling error in comment. NFCI.Simon Dardis2017-07-111-1/+1
| | | | llvm-svn: 307717
* [mips][mt][2/7] Implement .module and .set directives for the MT ASE.Simon Dardis2017-07-113-0/+82
| | | | | | | | | | | | This patch implements the .module and .set directives for the MT ASE, notably that .module sets the relevant flags in .MIPS.abiflags and .set doesn't. Reviewers: slthakur, atanasyan Differential Revision: https://reviews.llvm.org/D35249 llvm-svn: 307716
* [ARM, ELF] Don't shift movt relocation offsetsMartin Storsjo2017-07-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For ELF, a movw+movt pair is handled as two separate relocations. If an offset should be applied to the symbol address, this offset is stored as an immediate in the instruction (as opposed to stored as an offset in the relocation itself). Even though the actual value stored in the movt immediate after linking is the top half of the value, we need to store the unshifted offset prior to linking. When the relocation is made during linking, the offset gets added to the target symbol value, and the upper half of the value is stored in the instruction. This makes sure that movw+movt with offset symbols get properly handled, in case the offset addition in the lower half should be carried over to the upper half. This makes the output from the additions to the test case match the output from GNU binutils. For COFF and MachO, the movw/movt relocations are handled as a pair, and the overflow from the lower half gets carried over to the movt, so they should keep the shifted offset just as before. Differential Revision: https://reviews.llvm.org/D35242 llvm-svn: 307713
* [AArch64] Remove unused IsDarwin & IsNotDarwin predicates (NFCI). Florian Hahn2017-07-111-3/+0
| | | | | | | | | | | | Reviewers: t.p.northover, rengolin Reviewed By: t.p.northover Subscribers: aemerson, javed.absar, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D35266 llvm-svn: 307706
* [mips][mt][1/7] Add the MT ASE as a subtarget feature.Simon Dardis2017-07-115-1/+13
| | | | | | | | | | Preparatory work for adding the MIPS MT (multi-threading) ASE instructions. Reviewers: slthakur, atanasyan Differential Revision: https://reviews.llvm.org/D35247 llvm-svn: 307679
* Revert "AMDGPU: Do not test for SI in getIsaVersion"Konstantin Zhuravlyov2017-07-111-1/+1
| | | | | | | | This reverts commit r307573. This breaks downstream test. llvm-svn: 307678
* [Hexagon] Do not rely on callee-saved info in hasFPKrzysztof Parzyszek2017-07-112-14/+9
| | | | llvm-svn: 307675
* [PPC] Fix two bugs in frame lowering.Tony Jiang2017-07-112-17/+26
| | | | | | | | | | | 1. The available program storage region of the red zone to compilers is 288 bytes rather than 244 bytes. 2. The formula for negative number alignment calculation should be y = x & ~(n-1) rather than y = (x + (n-1)) & ~(n-1). Differential Revision: https://reviews.llvm.org/D34337 llvm-svn: 307672
* [Hexagon] Add support for nontemporal loads and stores on HVXKrzysztof Parzyszek2017-07-114-15/+112
| | | | | | | | Patch by Michael Wu. Differential Revision: https://reviews.llvm.org/D35104 llvm-svn: 307671
* fix formatting; NFCHiroshi Inoue2017-07-111-2/+2
| | | | llvm-svn: 307662
* [SystemZ] Minor fixing in SystemZScheduleZ13.tdJonas Paulsson2017-07-111-69/+84
| | | | | | | Some minor corrections for the recently added instructions. Review: Ulrich Weigand llvm-svn: 307658
* [ARM] GlobalISel: Add reg mapping for s64 G_FCMPDiana Picus2017-07-111-5/+9
| | | | | | Map the result into GPR and the operands into FPR. llvm-svn: 307653
* [ARM] ldr pc,=expression should be allowed in Thumb2Peter Smith2017-07-111-1/+1
| | | | | | | | | | | This change allows the pc to be used as a destination register for the pseudo instruction LDR pc,=expression . The pseudo instruction must not be transformed into a MOV, but it can use the Thumb2 LDR (literal) instruction to a constant pool entry. See (A7.7.43 from ARMv7M ARM ARM). Differential Revision: https://reviews.llvm.org/D34751 llvm-svn: 307640
* [ARM] GlobalISel: Fix oversight in G_FCMP legalizationDiana Picus2017-07-111-0/+1
| | | | | | | We used to forget to erase the original instruction when replacing a G_FCMP true/false. Fix this bug and make sure the tests check for it. llvm-svn: 307639
* [globalisel][tablegen] Correct matching of intrinsic ID's.Daniel Sanders2017-07-111-4/+4
| | | | | | | | | | | | TreePatternNode considers them to be plain integers but MachineInstr considers them to be a distinct kind of operand. The tweak to AArch64InstrInfo.td to produce a simple test case is a NFC for everything except GlobalISelEmitter (confirmed by diffing the tablegenerated files). GlobalISelEmitter is currently unable to infer the type of operands in the Dst pattern from the operands in the Src pattern. llvm-svn: 307634
* [ARM] GlobalISel: Legalize s64 G_FCMPDiana Picus2017-07-112-9/+64
| | | | | | Same as the s32 version, for both hard and soft float. llvm-svn: 307633
* [GlobalISel][X86] Use correct AND instructions.Igor Breger2017-07-111-1/+1
| | | | | | AND8ri8 not supported in 64bit. llvm-svn: 307630
* [PowerPC] fix latency for simple integer instructions in POWER9 schedulerHiroshi Inoue2017-07-111-1/+1
| | | | | | | | | In the POWER9 instruction scheduler, SchedWriteRes for the simple integer instructions are misconfigured to use that of (costly) DFU instructions. This results in surprisingly long instruction latency estimation and causes misbehavior in some optimizers such as if-conversion. Differential Revision: https://reviews.llvm.org/D34869 llvm-svn: 307624
* [PowerPC] avoid redundant analysis while lowering an immediate; NFCHiroshi Inoue2017-07-111-2/+8
| | | | | | | | | | This patch reduces compilation time by avoiding redundant analysis while selecting instructions to create an immediate. If the instruction count required to create the input number without rotate is 2, we do not need further analysis to find a shorter instruction sequence with rotate; rotate + load constant cannot be done by 1 instruction (i.e. getInt64CountDirectnever return 0). This patch should not change functionality. Differential Revision: https://reviews.llvm.org/D34986 llvm-svn: 307623
* [AVR] Remove a few very old TODOs that don't have enough context to understandDylan McKay2017-07-112-3/+4
| | | | llvm-svn: 307622
* [AVR] Rename 'ZREGS' to 'ZREG'Dylan McKay2017-07-113-16/+13
| | | | | | It will only ever contain one register. llvm-svn: 307620
* [AVR] Rename 'AVRTiny' to 'Tiny'Dylan McKay2017-07-112-13/+12
| | | | llvm-svn: 307619
* [AVR] Use the generic branch relaxerDylan McKay2017-07-113-5/+77
| | | | llvm-svn: 307617
* Doxygen formatting. NFCIJoel Jones2017-07-103-6/+15
| | | | llvm-svn: 307597
* Add DAG argument to canMergeStoresTo NFC.Nirav Dave2017-07-105-5/+10
| | | | llvm-svn: 307583
* [Hexagon] Convert typed ISD opcodes to generic ones, NFCKrzysztof Parzyszek2017-07-103-58/+43
| | | | llvm-svn: 307582
* [Hexagon] Remove unused ISD opcodes, NFCKrzysztof Parzyszek2017-07-103-97/+0
| | | | llvm-svn: 307580
* AMDGPU: Allow SIShrinkInstructions to fold FrameIndexesMatt Arsenault2017-07-101-0/+4
| | | | llvm-svn: 307576
* AMDGPU: Allow SIShrinkInstructions to work in non-SSAMatt Arsenault2017-07-101-24/+33
| | | | | | | | Immediates can be folded as long as the immediate is a vreg. Also undo commuting instructions if it didn't fold an immediate. llvm-svn: 307575
* AMDGPU: Remove unnecessary check for constant operandsMatt Arsenault2017-07-101-5/+0
| | | | | | | | | | An instruction that has an immediate operand can't reach this point. This is only called for a freshly shrunk instruction, which prevously couldn't have had a literal constant operand. This was also not conservative enough since it woudl also have had to filter other constant-like inputs like frame indexes. llvm-svn: 307574
* AMDGPU: Do not test for SI in getIsaVersionKonstantin Zhuravlyov2017-07-101-1/+1
| | | | | | SI is being tested by isa version in the first two if statements of the function. llvm-svn: 307573
* [Hexagon] Fix check for HMOTF_ConstExtend operand flagKrzysztof Parzyszek2017-07-103-19/+14
| | | | | | This fixes https://llvm.org/PR33718. llvm-svn: 307566
* [Hexagon] Handle Hexagon-specific machine operand target flags in MIRKrzysztof Parzyszek2017-07-103-4/+63
| | | | llvm-svn: 307564
* [PPC CodeGen] Expand the bitreverse.i64 intrinsic.Tony Jiang2017-07-102-0/+120
| | | | | | | Differential Revision: https://reviews.llvm.org/D34908 Fix PR: https://bugs.llvm.org/show_bug.cgi?id=33093 llvm-svn: 307563
* [PowerPC] Reduce register pressure by not materializing a constant just for ↵Lei Huang2017-07-101-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | use as an index register for X-Form loads/stores. For this example: float test (int *arr) { return arr[2]; } We currently generate the following code: li r4, 8 lxsiwax f0, r3, r4 xscvsxdsp f1, f0 With this patch, we will now generate: addi r3, r3, 8 lxsiwax f0, 0, r3 xscvsxdsp f1, f0 Originally reported in: https://bugs.llvm.org/show_bug.cgi?id=27204 Differential Revision: https://reviews.llvm.org/D35027 llvm-svn: 307553
* [X86] Model 256-bit AVX instructions in the AMD Jaguar scheduler Part-1 ↵Andrew V. Tischenko2017-07-101-0/+77
| | | | | | | | | | | (PR28573). The new version of the model is definitely faster. Differential Revision: https://reviews.llvm.org/D35198 llvm-svn: 307552
* fix typos in comments and error messages; NFCHiroshi Inoue2017-07-101-3/+3
| | | | llvm-svn: 307533
* [ARM] Tidy up ARMBaseRegisterInfo implementation. NFCJaved Absar2017-07-101-11/+8
| | | | | | | Clean up ARMBaseRegisterInfo implementation a bit. Differential Revision: https://reviews.llvm.org/D35116 llvm-svn: 307531
* This patch completely replaces the scheduling information for the ↵Gadi Haber2017-07-101-30/+2442
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SandyBridge architecture target by modifying the file X86SchedSandyBridge.td located under the X86 Target. The SandyBridge architects have provided us with a more accurate information about each instruction latency, number of uOPs and used ports and I used it to replace the existing estimated SNB instructions scheduling and to add missing scheduling information. Please note that the patch extensively affects the X86 MC instr scheduling for SNB. Also note that this patch will be followed by additional patches for the remaining target architectures HSW, IVB, BDW, SKL and SKX. The updated and extended information about each instruction includes the following details: •static latency of the instruction •number of uOps from which the instruction consists of •all ports used by the instruction's' uOPs For example, the following code dictates that instructions, ADC64mr, ADC8mr, SBB64mr, SBB8mr have a static latency of 9 cycles. Each of these instructions is decoded into 6 micro operations which use ports 4, ports 2 or 3 and port 0 and ports 0 or 1 or 5: def SBWriteResGroup94 : SchedWriteRes<[SBPort4,SBPort23,SBPort0,SBPort015]> { let Latency = 9; let NumMicroOps = 6; let ResourceCycles = [1,2,2,1]; } def: InstRW<[SBWriteResGroup94], (instregex "ADC64mr")>; def: InstRW<[SBWriteResGroup94], (instregex "ADC8mr")>; def: InstRW<[SBWriteResGroup94], (instregex "SBB64mr")>; def: InstRW<[SBWriteResGroup94], (instregex "SBB8mr")>; Note that apart for the header, most of the X86SchedSandyBridge.td file was generated by a script. Reviewers: zvi, chandlerc, RKSimon, m_zuckerman, craig.topper, igorb Differential Revision: https://reviews.llvm.org/D35019#inline-304691 llvm-svn: 307529
* [GlobalISel][X86] Support G_LOAD/G_STORE i1.Igor Breger2017-07-101-0/+2
| | | | | | | | | | | | | | Summary: Support G_LOAD/G_STORE i1. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D35178 llvm-svn: 307527
* [GlobalISel][X86] extend G_ZEXT support.Igor Breger2017-07-102-24/+29
| | | | | | | | | | | | | | | | | Summary: Mark G_ZEXT/G_SEXT i1 to i8/i16, i8 to i16 as legal. Support G_ZEXT i1 to i8/i16 instruction selection ( C++ code). This patch requred to support G_LOAD/G_STORE i1. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D35177 llvm-svn: 307526
* fix formatting; NFCHiroshi Inoue2017-07-101-2/+2
| | | | llvm-svn: 307523
* [X86] Allow GHC calling convention to use YMM and ZMM registersSimon Pilgrim2017-07-091-1/+9
| | | | | | | | | | GHC 8.4 will know how to use YMM and ZMM registers for calls. Submitted on behalf of @bgamari (Ben Gamari) Differential Revision: https://reviews.llvm.org/D34854 llvm-svn: 307504
* [IR] Add Type::isIntOrIntVectorTy(unsigned) similar to the existing ↵Craig Topper2017-07-091-2/+1
| | | | | | isIntegerTy(unsigned), but also works for vectors. llvm-svn: 307492
* [IR] Make use of ↵Craig Topper2017-07-091-1/+1
| | | | | | Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC llvm-svn: 307491
* fix trivial typos; NFCHiroshi Inoue2017-07-091-1/+1
| | | | | | sucessor -> successor llvm-svn: 307488
* [AMDGPU] Fix -Wimplicit-fallthrough warning. NFCI.Simon Pilgrim2017-07-081-6/+2
| | | | llvm-svn: 307485
* [AArch64] Fix -Wimplicit-fallthrough warnings. NFCI.Simon Pilgrim2017-07-081-2/+6
| | | | | | Add breaks - doesn't affect results as both GPR/FPU both check for 32/64 bit sizes. So will still default to GenericOps in the same way. llvm-svn: 307484
OpenPOWER on IntegriCloud