| Commit message | Author | Age | Files | Lines |
repairing.
Copies are easy because we repair only when there is a mismatch. For
non-copy repairing, i.e., cases that involve breaking down or gathering
up the value, one of the operands may not have a register bank yet.
Thus, deriving a cost from that requires more work.
llvm-svn: 272157
The MSR instructions can write to the CPSR, but we did not model this
fact, so we could emit them in the middle of IT blocks, changing the
condition flags for later instructions in the block.
The tests use two calls to llvm.write_register.i32 because it is valid
to use these instructions at the end of an IT block, which if-conversion
does in some cases. With two calls, the first clobbers the flags, so
a branch has to be used to make the second one conditional.
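A minimal IR sketch of the two-call pattern (the named register below is illustrative, not taken from the tests):
```
declare void @llvm.write_register.i32(metadata, i32)

define void @two_writes(i32 %a, i32 %b) {
  ; The first write clobbers the flags, so the second one cannot simply
  ; be predicated inside the same IT block; a branch is required.
  call void @llvm.write_register.i32(metadata !0, i32 %a)
  call void @llvm.write_register.i32(metadata !0, i32 %b)
  ret void
}
!0 = !{!"apsr_nzcvq"} ; illustrative register name
```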
Differential Revision: http://reviews.llvm.org/D21139
llvm-svn: 272154
The architecture enumeration is shared across ARM and AArch64. However, the
data is not. The code would incorrectly index into the array using an
architecture index offset by the ARMv7 architecture enumeration. We do not
have a marker indicating the architectural family to which the enumeration
belongs, so we cannot be clever about offsetting the index (at least it is
not immediately apparent to me). Instead, fall back to the tried-and-true
method of slowly iterating over the array (it's not a large array, so the
impact of this is not too high).
Because of the incorrect indexing, if we were lucky we would crash, but usually
we would return an invalid StringRef. We did not have any tests for the AArch64
target parser previously. Extend the previous tests I had added for ARM to
cover AArch64, ensuring that we return the expected StringRefs.
Take the opportunity to change some iterator types to references.
This work is needed to support parsing `.arch name` directives in the AArch64
target asm parser.
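A self-contained sketch of the tried-and-true linear scan (the table and field names here are hypothetical stand-ins, not the actual target parser code):
```
// Hypothetical stand-ins for the shared enum and the per-target table.
struct ArchEntry { unsigned ID; const char *Name; };
static const ArchEntry AArch64Names[] = {
    {1, "armv8-a"}, {2, "armv8.1-a"},
};

// Linear scan: correct no matter how the shared enumeration is numbered,
// at the cost of walking a small array.
const char *getArchName(unsigned Kind) {
  for (const ArchEntry &E : AArch64Names)
    if (E.ID == Kind)
      return E.Name;
  return nullptr; // instead of returning an invalid name
}
```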
llvm-svn: 272145
llvm-svn: 272140
llvm-svn: 272139
llvm-svn: 272138
Also, switch to using functions from LiveIntervalAnalysis to update
live intervals, instead of performing the updates manually.
Re-committing r272045.
llvm-svn: 272135
isSwift is tested earlier and known to be false when we reach this code.
llvm-svn: 272127
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.
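For illustration, a generic sketch (not code from this change) of the rewrite the check suggests, plus the lifetime hazard that motivated the audit:
```
#include <string>
#include <vector>

void process(const std::vector<std::string> &Names) {
  for (std::size_t I = 0; I < Names.size(); ++I) {
    // Before: std::string Name = Names[I];   (unnecessary copy)
    // After: bind a const reference instead. Safe only while Names[I]
    // outlives Name -- exactly the lifetime hazard the audit checked for.
    const std::string &Name = Names[I];
    (void)Name;
  }
}
```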
llvm-svn: 272126
llvm-svn: 272122
llvm-svn: 272117
llvm-svn: 272116
the coverage rt (it should now fail with a descriptive message)
llvm-svn: 272090
Long term, we may want to assign a high cost to FPR to/from GPR copies.
llvm-svn: 272086
The cost of a copy may differ based on how many bits we have to
copy around. E.g., an 8-bit copy may cost differently than a 32-bit copy.
llvm-svn: 272084
This will allow code reuse in the coming commits.
llvm-svn: 272083
The MachineMemOperand parser lacked the code to handle %stack.X
references (%fixed-stack.X was working).
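A hedged sketch of the kind of MIR that now parses (the instruction operands are illustrative; the memory operand is the part this fix enables):
```
%eax = MOV32rm %stack.0, 1, _, 0, _ :: (load 4 from %stack.0)
```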
llvm-svn: 272082
llvm-svn: 272078
In the reference code, the field name is `cHashBuckets`.
llvm-svn: 272075
llvm-svn: 272073
This fixes linking problems on OSX.
Unfortunately it turns out we need to use an instance of the
``fuzzer::ExternalFunctions`` object in several places, so this
commit also replaces all instances with a single global instance.
It also turns out initializing a global ``fuzzer::ExternalFunctions``
before main is entered (i.e. letting the object be initialised by the
global initializers) is not safe (on OSX the call to ``Printf()`` in the
CTOR crashes if it is called from a global initializer), so we instead
have a global ``fuzzer::ExternalFunctions*`` and initialize it inside
``FuzzerDriver()``.
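A self-contained sketch of the safe pattern (generic stand-ins, not the actual libFuzzer sources):
```
#include <cstdio>

// Generic stand-in for fuzzer::ExternalFunctions.
struct ExternalFunctions {
  ExternalFunctions() { std::printf("optional symbols resolved\n"); }
};

// A global *pointer* rather than a global object: no constructor runs
// before main(), avoiding the crash seen in global initializers on OSX.
static ExternalFunctions *EF = nullptr;

int main() {
  static ExternalFunctions Instance; // constructed here, inside main
  EF = &Instance;
  return 0;
}
```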
Multiple unit tests also depend on the
``fuzzer::ExternalFunctions*`` global, so a ``main()`` function has been
added that initializes it before running any tests.
Differential Revision: http://reviews.llvm.org/D20943
llvm-svn: 272072
Also use default/delete instead of hand-written ctors.
Thanks to Jia Chen for bringing this stuff up.
llvm-svn: 272064
Summary:
The presence of this attribute indicates that VGPR outputs should be computed
in whole quad mode. This will be used by Mesa for prolog pixel shaders, so
that derivatives can be taken of shader inputs computed by the prolog, fixing
a bug.
The generated code could certainly be improved: if a prolog pixel shader is
used (which isn't common in modern OpenGL - they're used for gl_Color, polygon
stipples, and forcing per-sample interpolation), Mesa will use this attribute
unconditionally, because it has to be conservative. So WQM may be used in the
prolog when it isn't really needed, and furthermore a silly back-and-forth
switch is likely to happen at the boundary between prolog and main shader
parts.
Fixing this is a bit involved: we'd first have to add a mechanism by which
LLVM writes the WQM-related input requirements to the main shader part binary,
and then Mesa specializes the prolog part accordingly. At that point, we may
as well just compile a monolithic shader...
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D20839
llvm-svn: 272063
This is necessary because the existing fuzzer-oom.test was Linux-specific:
it used __sanitizer_print_memory_profile(), which is only available on
Linux right now, so the test would fail on OSX.
Differential Revision: http://reviews.llvm.org/D20977
llvm-svn: 272061
llvm-svn: 272058
Author: Wei Ding <wei.ding2@amd.com>
Date: Tue Jun 7 19:04:44 2016 +0000
Differential Revision: http://reviews.llvm.org/D20557
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272044 91177308-0d34-0410-b5e6-96231b3b80d8
as it was breaking the bots.
This reverts commit r272044.
llvm-svn: 272056
Differential Revision: http://reviews.llvm.org/D21089
llvm-svn: 272054
Summary:
This patch adds support for the MSVC buffer security check implementation.
The buffer security check is turned on with the '/GS' compiler switch.
* https://msdn.microsoft.com/en-us/library/8dbf701c.aspx
* To be added to clang here: http://reviews.llvm.org/D20347
Some overview of buffer security check feature and implementation:
* https://msdn.microsoft.com/en-us/library/aa290051(VS.71).aspx
* http://www.ksyash.com/2011/01/buffer-overflow-protection-3/
* http://blog.osom.info/2012/02/understanding-vs-c-compilers-buffer.html
For the following example:
```
int example(int offset, int index) {
char buffer[10];
memset(buffer, 0xCC, index);
return buffer[index];
}
```
The MSVC compiler adds these instructions to perform the stack integrity check:
```
push ebp
mov ebp,esp
sub esp,50h
[1] mov eax,dword ptr [__security_cookie (01068024h)]
[2] xor eax,ebp
[3] mov dword ptr [ebp-4],eax
push ebx
push esi
push edi
mov eax,dword ptr [index]
push eax
push 0CCh
lea ecx,[buffer]
push ecx
call _memset (010610B9h)
add esp,0Ch
mov eax,dword ptr [index]
movsx eax,byte ptr buffer[eax]
pop edi
pop esi
pop ebx
[4] mov ecx,dword ptr [ebp-4]
[5] xor ecx,ebp
[6] call @__security_check_cookie@4 (01061276h)
mov esp,ebp
pop ebp
ret
```
The instrumentation above:
* [1] loads the global security canary,
* [3] stores the locally computed ([2]) canary to the guard slot,
* [4] loads the guard slot and ([5]) re-computes the global canary,
* [6] validates the resulting canary with '__security_check_cookie' and performs error handling.
Overview of the current stack-protection implementation:
* lib/CodeGen/StackProtector.cpp
* There is a default stack-protection implementation applied on the intermediate representation.
* The target can overload the 'getIRStackGuard' method if it has a standard location for the stack protector cookie.
* An intrinsic 'Intrinsic::stackprotector' is added to the prologue. It will be expanded by the instruction selection pass (DAG or Fast).
* Basic blocks are added to every instrumented function to receive the code for stack guard validation and error handling.
* Guard manipulation and comparison are added directly to the intermediate representation.
* lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
* lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
* There is an implementation that adds instrumentation during instruction selection (for better handling of sibling calls).
* See the long comment above the 'class StackProtectorDescriptor' declaration.
* The target needs to override 'getSDagStackGuard' to activate SDAG stack protection generation (note: getIRStackGuard MUST return nullptr).
* 'getSDagStackGuard' returns the appropriate stack guard (security cookie).
* The code is generated by 'SelectionDAGBuilder.cpp' and 'SelectionDAGISel.cpp'.
* include/llvm/Target/TargetLowering.h
* Contains a function to retrieve the default guard 'Value'; should be overridden by each target to select which implementation is used and to provide the guard 'Value'.
* lib/Target/X86/X86ISelLowering.cpp
* Contains the x86 specialisation; Guard 'Value' used by the SelectionDAG algorithm.
Function-based Instrumentation:
* MSVC doesn't inline the stack guard comparison in every function. Instead, a call to '__security_check_cookie' is added to the epilogue before every return instruction.
* To support function-based instrumentation, this patch is
* adding a function to get the function-based check (llvm 'Value', see include/llvm/Target/TargetLowering.h); if provided, the stack protection instrumentation won't be inlined, and a call to that function will be added to the epilogue,
* modifying SelectionDAGISel.cpp to avoid producing the basic blocks used for inline instrumentation,
* generating the function-based instrumentation during the ISel pass (SelectionDAGBuilder.cpp),
* if using FastISel (not SelectionDAG), falling back to the same function-based mechanism implemented over the intermediate representation (StackProtector.cpp).
Modifications
* adding support for MSVC (lib/Target/X86/X86ISelLowering.cpp)
* adding support for function-based instrumentation (lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp, .h)
Results
* IR generated instrumentation:
```
clang-cl /GS test.cc /Od /c -mllvm -print-isel-input
```
```
*** Final LLVM Code input to ISel ***
; Function Attrs: nounwind sspstrong
define i32 @"\01?example@@YAHHH@Z"(i32 %offset, i32 %index) #0 {
entry:
%StackGuardSlot = alloca i8* <<<-- Allocated guard slot
%0 = call i8* @llvm.stackguard() <<<-- Loading Stack Guard value
call void @llvm.stackprotector(i8* %0, i8** %StackGuardSlot) <<<-- Prologue intrinsic call (store to Guard slot)
%index.addr = alloca i32, align 4
%offset.addr = alloca i32, align 4
%buffer = alloca [10 x i8], align 1
store i32 %index, i32* %index.addr, align 4
store i32 %offset, i32* %offset.addr, align 4
%arraydecay = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 0
%1 = load i32, i32* %index.addr, align 4
call void @llvm.memset.p0i8.i32(i8* %arraydecay, i8 -52, i32 %1, i32 1, i1 false)
%2 = load i32, i32* %index.addr, align 4
%arrayidx = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 %2
%3 = load i8, i8* %arrayidx, align 1
%conv = sext i8 %3 to i32
%4 = load volatile i8*, i8** %StackGuardSlot <<<-- Loading Guard slot
call void @__security_check_cookie(i8* %4) <<<-- Epilogue function-based check
ret i32 %conv
}
```
* SelectionDAG generated instrumentation:
```
clang-cl /GS test.cc /O1 /c /FA
```
```
"?example@@YAHHH@Z": # @"\01?example@@YAHHH@Z"
# BB#0: # %entry
pushl %esi
subl $16, %esp
movl ___security_cookie, %eax <<<-- Loading Stack Guard value
movl 28(%esp), %esi
movl %eax, 12(%esp) <<<-- Store to Guard slot
leal 2(%esp), %eax
pushl %esi
pushl $204
pushl %eax
calll _memset
addl $12, %esp
movsbl 2(%esp,%esi), %esi
movl 12(%esp), %ecx <<<-- Loading Guard slot
calll @__security_check_cookie@4 <<<-- Epilogue function-based check
movl %esi, %eax
addl $16, %esp
popl %esi
retl
```
Reviewers: kcc, pcc, eugenis, rnk
Subscribers: majnemer, llvm-commits, hans, thakis, rnk
Differential Revision: http://reviews.llvm.org/D20346
llvm-svn: 272053
llvm-svn: 272048
Also, switch to using functions from LiveIntervalAnalysis to update
live intervals, instead of performing the updates manually.
llvm-svn: 272045
llvm-svn: 272044
llvm-svn: 272043
This patch does a few things:
- Unifies AttrAll and AttrUnknown (since they were used for more or less
the same purpose anyway).
- Introduces AttrEscaped, an attribute that notes that a value escapes
our analysis for a given set, but not that an unknown value flows into
said set.
- Removes functions that take bit indices, since we also had functions
that took bitsets, and the use of both (with similar names) was
unclear and bug-prone.
Patch by Jia Chen.
Differential Revision: http://reviews.llvm.org/D21000
llvm-svn: 272040
Summary:
The option is very useful for testing, plus I intend to measure
its effect on fuzzer effectiveness.
Differential Revision: http://reviews.llvm.org/D21084
llvm-svn: 272035
negative values.
Originally reviewed here: http://reviews.llvm.org/D17463
llvm-svn: 272023
This reverts commits r271930, r271915, and r271923. They break a Thumb
self-hosting bot.
llvm-svn: 272017
These instructions end in "S" but are not flag-setting, so they need to be
included in the list of special cases in the assembly parser.
Differential Revision: http://reviews.llvm.org/D21077
llvm-svn: 272015
This fixes PR26314. This patch adds a new helper, `isNoWrap`, with
detection of the loop-invariant pointer case.
Patch by Roman Shirokiy.
Ref: https://llvm.org/bugs/show_bug.cgi?id=26314
Differential Revision: http://reviews.llvm.org/D17268
llvm-svn: 272014
llvm-svn: 272013
Currently the only way to use the (V)MOVNTDQA nontemporal vector load instructions is through the int_x86_sse41_movntdqa style builtins.
This patch adds support for lowering nontemporal loads from general IR, allowing us to remove the movntdqa builtins in a future patch.
We currently still fold nontemporal loads into suitable instructions; we should probably look at removing this (and nontemporal stores as well), or at least make the target's folding implementation aware that it's dealing with a nontemporal memory transaction.
There is also an issue that VMOVNTDQA only acts on 128-bit vectors on pre-AVX2 hardware - so currently a normal ymm load is still used on AVX1 targets.
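A minimal IR sketch (not from the patch) of a load that can now be lowered to (V)MOVNTDQA:
```
define <4 x i32> @nt_load(<4 x i32>* %p) {
  ; The !nontemporal metadata requests a nontemporal lowering.
  %v = load <4 x i32>, <4 x i32>* %p, align 16, !nontemporal !0
  ret <4 x i32> %v
}
!0 = !{i32 1}
```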
Differential Revision: http://reviews.llvm.org/D20965
llvm-svn: 272011
Currently the only way to use the (V)MOVNTDQA nontemporal vector load instructions is through the int_x86_sse41_movntdqa style builtins.
This patch adds support for lowering nontemporal loads from general IR, allowing us to remove the movntdqa builtins in a future patch.
We currently still fold nontemporal loads into suitable instructions; we should probably look at removing this (and nontemporal stores as well), or at least make the target's folding implementation aware that it's dealing with a nontemporal memory transaction.
There is also an issue that VMOVNTDQA only acts on 128-bit vectors on pre-AVX2 hardware - so currently a normal ymm load is still used on AVX1 targets.
Differential Revision: http://reviews.llvm.org/D20965
llvm-svn: 272010
Differential Revision: http://reviews.llvm.org/D21040
llvm-svn: 272009
We can materialize these integers using a MOV; ADDi8 pair.
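For example (an illustrative sketch; the exact constant range handled is in the patch), materializing 257:
```
movs r0, #255   @ MOV with an 8-bit immediate
adds r0, #2     @ ADDi8: r0 = 257
```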
llvm-svn: 272007
Differential Revision: http://reviews.llvm.org/D21067
llvm-svn: 272006
Using an LLVM IR aggregate return value type containing three
or more integer values causes an abort in the fast isel pass.
This patch adds two more registers to RetCC_PPC64_ELF_FIS to
allow returning up to four integers with fast isel, just the
same as is currently supported with regular isel (RetCC_PPC).
This is needed for Swift and (possibly) other non-clang frontends.
Fixes PR26190.
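A generic IR sketch of the kind of return type that previously aborted fast isel:
```
; Returning three integers requires three return registers.
define { i64, i64, i64 } @ret3(i64 %a, i64 %b, i64 %c) {
  %r0 = insertvalue { i64, i64, i64 } undef, i64 %a, 0
  %r1 = insertvalue { i64, i64, i64 } %r0, i64 %b, 1
  %r2 = insertvalue { i64, i64, i64 } %r1, i64 %c, 2
  ret { i64, i64, i64 } %r2
}
```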
llvm-svn: 272005
shuffle mask directly
We currently only combine to blend+zero if the target value type has 8 elements or fewer, but this missed a lot of cases where the combined mask had been widened.
This change makes it so we use the combined mask to determine the blend value type, allowing us to catch more widened cases.
llvm-svn: 272003
A Thumb-2 post-indexed LDR instruction such as:
ldr.w r0, [r1], #4
Can be rewritten as:
ldm.n r1!, {r0}
LDMs can be more expensive than LDRs on some cores, so this has been enabled only in minsize mode.
llvm-svn: 272002
If we have an LDM that uses only low registers and doesn't write to its base register:
ldm.w r0, {r1, r2, r3}
And that base register is dead after the LDM, then we can convert it to writeback form and use a narrow encoding:
ldm.n r0!, {r1, r2, r3}
Obviously, this introduces a new register write and so can cause WAW hazards, so I've enabled it only in minsize mode. This is a code size trick that ARM Compiler 5 ("armcc") performs and we previously did not.
llvm-svn: 272000
The Thumb2 conditional branch B<cond>.W has a different encoding (T3)
to the unconditional branch B.W (T4) as it needs to record <cond>.
As the encoding is different the B<cond>.W is given a different
relocation type.
ELF for the ARM Architecture 4.6.1.6 (Table-13) states that
R_ARM_THM_JUMP19 should be used for B<cond>.W. At present the
MC layer is using R_ARM_THM_JUMP24, the relocation for B.W.
This change makes B<cond>.W use R_ARM_THM_JUMP19 and alters the
existing test that checks for R_ARM_THM_JUMP24 to expect
R_ARM_THM_JUMP19.
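Illustration (not from the test) of the two encodings involved:
```
beq.w  target   @ T3, conditional: now relocated with R_ARM_THM_JUMP19
b.w    target   @ T4, unconditional: relocated with R_ARM_THM_JUMP24
```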
llvm-svn: 271997
native shifts
Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit).
If the shift amount is constant we can sometimes convert these instructions to native shifts:
1 - if all shift amounts are in range then the conversion is trivial.
2 - out of range arithmetic shifts can be clamped to (bitwidth - 1) (a legal shift amount) before conversion.
3 - logical shifts just return zero if all elements have out of range shift amounts.
In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts.
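A hedged IR sketch of case 2, assuming the AVX2 intrinsic llvm.x86.avx2.psrav.d: every amount is out of range, so each lane splats its sign bit, which is equivalent to a uniform native shift by 31 (psrad $31):
```
declare <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32>, <4 x i32>)

define <4 x i32> @clamped(<4 x i32> %x) {
  ; Amounts 40/33/64/100 all exceed 31, so after clamping this becomes
  ; a uniform arithmetic shift by 31.
  %r = call <4 x i32> @llvm.x86.avx2.psrav.d(<4 x i32> %x, <4 x i32> <i32 40, i32 33, i32 64, i32 100>)
  ret <4 x i32> %r
}
```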
Differential Revision: http://reviews.llvm.org/D19675
llvm-svn: 271996