| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
global and local memory. These scopes restrict how synchronization is
achieved, which can result in improved performance.
This change extends existing notion of synchronization scopes in LLVM to
support arbitrary scopes expressed as target-specific strings, in addition to
the already defined scopes (single thread, system).
The LLVM IR and MIR syntax for expressing synchronization scopes has changed
to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
replaces *singlethread* keyword), or a target-specific name. As before, if
the scope is not specified, it defaults to CrossThread/System scope.
Implementation details:
- Mapping from synchronization scope name/string to synchronization scope id
is stored in LLVM context;
- CrossThread/System and SingleThread scopes are pre-defined to efficiently
check for known scopes without comparing strings;
- Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
the bitcode.
Differential Revision: https://reviews.llvm.org/D21723
llvm-svn: 307722
|
|
|
|
|
|
|
|
|
| |
Rename missing DEBUG_TYPE "machine-scheduler" from backend files, which were
absent from https://reviews.llvm.org/rL303921.
Differential revision: https://reviews.llvm.org/D35231
llvm-svn: 307719
|
|
|
|
| |
llvm-svn: 307717
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the .module and .set directives for the MT ASE,
notably that .module sets the relevant flags in .MIPS.abiflags and .set
doesn't.
Reviewers: slthakur, atanasyan
Differential Revision: https://reviews.llvm.org/D35249
llvm-svn: 307716
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For ELF, a movw+movt pair is handled as two separate relocations.
If an offset should be applied to the symbol address, this offset is
stored as an immediate in the instruction (as opposed to stored as an
offset in the relocation itself).
Even though the actual value stored in the movt immediate after linking
is the top half of the value, we need to store the unshifted offset
prior to linking. When the relocation is made during linking, the offset
gets added to the target symbol value, and the upper half of the value
is stored in the instruction.
This makes sure that movw+movt with offset symbols get properly
handled, in case the offset addition in the lower half should be
carried over to the upper half.
This makes the output from the additions to the test case match
the output from GNU binutils.
For COFF and MachO, the movw/movt relocations are handled as a pair,
and the overflow from the lower half gets carried over to the movt,
so they should keep the shifted offset just as before.
Differential Revision: https://reviews.llvm.org/D35242
llvm-svn: 307713
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: t.p.northover, rengolin
Reviewed By: t.p.northover
Subscribers: aemerson, javed.absar, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D35266
llvm-svn: 307706
|
|
|
|
|
|
|
|
|
|
| |
Preparatory work for adding the MIPS MT (multi-threading) ASE instructions.
Reviewers: slthakur, atanasyan
Differential Revision: https://reviews.llvm.org/D35247
llvm-svn: 307679
|
|
|
|
|
|
|
|
| |
This reverts commit r307573.
This breaks downstream test.
llvm-svn: 307678
|
|
|
|
| |
llvm-svn: 307675
|
|
|
|
|
|
|
|
|
|
|
| |
1. The available program storage region of the red zone to compilers is 288
bytes rather than 244 bytes.
2. The formula for negative number alignment calculation should be
y = x & ~(n-1) rather than y = (x + (n-1)) & ~(n-1).
Differential Revision: https://reviews.llvm.org/D34337
llvm-svn: 307672
|
|
|
|
|
|
|
|
| |
Patch by Michael Wu.
Differential Revision: https://reviews.llvm.org/D35104
llvm-svn: 307671
|
|
|
|
| |
llvm-svn: 307662
|
|
|
|
|
|
|
| |
Some minor corrections for the recently added instructions.
Review: Ulrich Weigand
llvm-svn: 307658
|
|
|
|
|
|
| |
Map the result into GPR and the operands into FPR.
llvm-svn: 307653
|
|
|
|
|
|
|
|
|
|
|
| |
This change allows the pc to be used as a destination register for the
pseudo instruction LDR pc,=expression . The pseudo instruction must not be
transformed into a MOV, but it can use the Thumb2 LDR (literal) instruction
to a constant pool entry. See (A7.7.43 from ARMv7M ARM ARM).
Differential Revision: https://reviews.llvm.org/D34751
llvm-svn: 307640
|
|
|
|
|
|
|
| |
We used to forget to erase the original instruction when replacing a
G_FCMP true/false. Fix this bug and make sure the tests check for it.
llvm-svn: 307639
|
|
|
|
|
|
|
|
|
|
|
|
| |
TreePatternNode considers them to be plain integers but MachineInstr considers
them to be a distinct kind of operand.
The tweak to AArch64InstrInfo.td to produce a simple test case is a NFC for
everything except GlobalISelEmitter (confirmed by diffing the tablegenerated
files). GlobalISelEmitter is currently unable to infer the type of operands in
the Dst pattern from the operands in the Src pattern.
llvm-svn: 307634
|
|
|
|
|
|
| |
Same as the s32 version, for both hard and soft float.
llvm-svn: 307633
|
|
|
|
|
|
| |
AND8ri8 not supported in 64bit.
llvm-svn: 307630
|
|
|
|
|
|
|
|
|
| |
In the POWER9 instruction scheduler, SchedWriteRes for the simple integer instructions are misconfigured to use that of (costly) DFU instructions.
This results in surprisingly long instruction latency estimation and causes misbehavior in some optimizers such as if-conversion.
Differential Revision: https://reviews.llvm.org/D34869
llvm-svn: 307624
|
|
|
|
|
|
|
|
|
|
| |
This patch reduces compilation time by avoiding redundant analysis while selecting instructions to create an immediate.
If the instruction count required to create the input number without rotate is 2, we do not need further analysis to find a shorter instruction sequence with rotate; rotate + load constant cannot be done by 1 instruction (i.e. getInt64CountDirectnever return 0).
This patch should not change functionality.
Differential Revision: https://reviews.llvm.org/D34986
llvm-svn: 307623
|
|
|
|
| |
llvm-svn: 307622
|
|
|
|
|
|
| |
It will only ever contain one register.
llvm-svn: 307620
|
|
|
|
| |
llvm-svn: 307619
|
|
|
|
| |
llvm-svn: 307617
|
|
|
|
| |
llvm-svn: 307597
|
|
|
|
| |
llvm-svn: 307583
|
|
|
|
| |
llvm-svn: 307582
|
|
|
|
| |
llvm-svn: 307580
|
|
|
|
| |
llvm-svn: 307576
|
|
|
|
|
|
|
|
| |
Immediates can be folded as long as the immediate is a vreg.
Also undo commuting instructions if it didn't fold an immediate.
llvm-svn: 307575
|
|
|
|
|
|
|
|
|
|
| |
An instruction that has an immediate operand can't reach
this point. This is only called for a freshly shrunk instruction,
which prevously couldn't have had a literal constant operand.
This was also not conservative enough since it woudl also have
had to filter other constant-like inputs like frame indexes.
llvm-svn: 307574
|
|
|
|
|
|
| |
SI is being tested by isa version in the first two if statements of the function.
llvm-svn: 307573
|
|
|
|
|
|
| |
This fixes https://llvm.org/PR33718.
llvm-svn: 307566
|
|
|
|
| |
llvm-svn: 307564
|
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D34908
Fix PR: https://bugs.llvm.org/show_bug.cgi?id=33093
llvm-svn: 307563
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
use as an index register for X-Form loads/stores.
For this example:
float test (int *arr) {
return arr[2];
}
We currently generate the following code:
li r4, 8
lxsiwax f0, r3, r4
xscvsxdsp f1, f0
With this patch, we will now generate:
addi r3, r3, 8
lxsiwax f0, 0, r3
xscvsxdsp f1, f0
Originally reported in: https://bugs.llvm.org/show_bug.cgi?id=27204
Differential Revision: https://reviews.llvm.org/D35027
llvm-svn: 307553
|
|
|
|
|
|
|
|
|
|
|
| |
(PR28573).
The new version of the model is definitely faster.
Differential Revision:
https://reviews.llvm.org/D35198
llvm-svn: 307552
|
|
|
|
| |
llvm-svn: 307533
|
|
|
|
|
|
|
| |
Clean up ARMBaseRegisterInfo implementation a bit.
Differential Revision: https://reviews.llvm.org/D35116
llvm-svn: 307531
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SandyBridge architecture target by modifying the file X86SchedSandyBridge.td located under the X86 Target.
The SandyBridge architects have provided us with a more accurate information about each instruction latency, number of uOPs and used ports and I used it to replace the existing estimated SNB instructions scheduling and to add missing scheduling information.
Please note that the patch extensively affects the X86 MC instr scheduling for SNB.
Also note that this patch will be followed by additional patches for the remaining target architectures HSW, IVB, BDW, SKL and SKX.
The updated and extended information about each instruction includes the following details:
•static latency of the instruction
•number of uOps from which the instruction consists of
•all ports used by the instruction's' uOPs
For example, the following code dictates that instructions, ADC64mr, ADC8mr, SBB64mr, SBB8mr have a static latency of 9 cycles. Each of these instructions is decoded into 6 micro operations which use ports 4, ports 2 or 3 and port 0 and ports 0 or 1 or 5:
def SBWriteResGroup94 : SchedWriteRes<[SBPort4,SBPort23,SBPort0,SBPort015]> {
let Latency = 9;
let NumMicroOps = 6;
let ResourceCycles = [1,2,2,1];
}
def: InstRW<[SBWriteResGroup94], (instregex "ADC64mr")>;
def: InstRW<[SBWriteResGroup94], (instregex "ADC8mr")>;
def: InstRW<[SBWriteResGroup94], (instregex "SBB64mr")>;
def: InstRW<[SBWriteResGroup94], (instregex "SBB8mr")>;
Note that apart for the header, most of the X86SchedSandyBridge.td file was generated by a script.
Reviewers: zvi, chandlerc, RKSimon, m_zuckerman, craig.topper, igorb
Differential Revision: https://reviews.llvm.org/D35019#inline-304691
llvm-svn: 307529
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Support G_LOAD/G_STORE i1.
Reviewers: zvi, guyblank
Reviewed By: guyblank
Subscribers: rovka, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D35178
llvm-svn: 307527
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Mark G_ZEXT/G_SEXT i1 to i8/i16, i8 to i16 as legal.
Support G_ZEXT i1 to i8/i16 instruction selection ( C++ code).
This patch requred to support G_LOAD/G_STORE i1.
Reviewers: zvi, guyblank
Reviewed By: guyblank
Subscribers: rovka, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D35177
llvm-svn: 307526
|
|
|
|
| |
llvm-svn: 307523
|
|
|
|
|
|
|
|
|
|
| |
GHC 8.4 will know how to use YMM and ZMM registers for calls.
Submitted on behalf of @bgamari (Ben Gamari)
Differential Revision: https://reviews.llvm.org/D34854
llvm-svn: 307504
|
|
|
|
|
|
| |
isIntegerTy(unsigned), but also works for vectors.
llvm-svn: 307492
|
|
|
|
|
|
| |
Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC
llvm-svn: 307491
|
|
|
|
|
|
| |
sucessor -> successor
llvm-svn: 307488
|
|
|
|
| |
llvm-svn: 307485
|
|
|
|
|
|
| |
Add breaks - doesn't affect results as both GPR/FPU both check for 32/64 bit sizes. So will still default to GenericOps in the same way.
llvm-svn: 307484
|