| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MSVC always places the implicit sret parameter after the implicit this
parameter of instance methods. We used to handle this for
x86_thiscallcc by allocating the sret parameter on the stack and leaving
the this pointer in ecx, but that doesn't handle alternative calling
conventions like cdecl, stdcall, fastcall, or the win64 convention.
Instead, change the verifier to allow sret on the second parameter.
This also requires changing the Mips and X86 backends to return the
argument with the sret parameter, instead of assuming that the sret
parameter comes first.
The Sparc backend also returns sret parameters in a register, but I
wasn't able to update it to handle secondary sret parameters. It
currently calls report_fatal_error if you feed it an sret in the second
parameter.
Reviewers: rafael.espindola, majnemer
Differential Revision: http://reviews.llvm.org/D3617
llvm-svn: 208453
|
| |
|
|
|
|
| |
ARM64 backend was missing a required_library entry.
llvm-svn: 208437
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support to ARM for custom lowering of the
llvm.{u|s}add.with.overflow.i32 intrinsics for i32/i64. This is particularly useful
for handling idiomatic saturating math functions as generated by
InstCombineCompare.
Test cases included.
rdar://14853450
llvm-svn: 208435
|
| |
|
|
| |
llvm-svn: 208432
|
| |
|
|
|
|
| |
We were dropping the high bits of 64-bit immediate offsets.
llvm-svn: 208431
|
| |
|
|
| |
llvm-svn: 208430
|
| |
|
|
| |
llvm-svn: 208429
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-mcpu=mips[123] does not accept them
Summary:
This required a new instruction group representing the 32-bit subset of
MIPS-IV that was available in MIPS32
A small number of instructions are correctly rejected but with the wrong error
message. These have been placed in a separate test for now.
Depends on D3676
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3677
llvm-svn: 208414
|
| |
|
|
|
|
|
|
|
|
| |
When using the ARM AAPCS, HFAs (Homogeneous Floating-point Aggregates) must
be passed in a block of consecutive floating-point registers, or on the stack.
This means that unused floating-point registers cannot be back-filled with
part of an HFA, however this can currently happen. This patch, along with the
corresponding clang patch (http://reviews.llvm.org/D3083) prevents this.
llvm-svn: 208413
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
No functional change
Depends on D3675
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3676
llvm-svn: 208410
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-mcpu=mips[12] does not accept them
Summary:
This required a new instruction group representing the 32-bit subset of
MIPS-III that was available in MIPS32
A small number of instructions are correctly rejected but with the wrong error
message. These have been placed in a separate test for now.
There's some obvious InstAlias's that ought to be marked MIPS-III but arent.
This is because they are not currently tested. I intend to catch these with
a final pass through the tablegen records to find tablegen records without
ISA annotations.
Depends on D3674
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3675
llvm-svn: 208408
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 208405
|
| |
|
|
| |
llvm-svn: 208400
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds MIPS32r6/MIPS64r6 and checks the compatibility requirements for these
processors.
I've also included comments to describe removed and re-encoded instructions,
along with placeholder def's for the new instructions but there are no
functional changes to codegen at this point.
Reviewers: jkolek, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3622
llvm-svn: 208399
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: dsll, dsrl, sll, and srl already exist.
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3673
llvm-svn: 208397
|
| |
|
|
|
|
|
|
| |
Handle lowering of global addresses for PIC mode compilation on Windows. Always
use the movw/movt load to load the address as Windows on ARM requires ARMv7+ and
is a pure Thumb environment.
llvm-svn: 208385
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Also ran clang-format on the function. The code added is the last else
if block.
Reviewers: nadav, craig.topper, delena
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3518
llvm-svn: 208372
|
| |
|
|
|
|
|
| |
This patch doesn't introduce any functionality change. Test cases will be
added later when v5 support is added.
llvm-svn: 208349
|
| |
|
|
| |
llvm-svn: 208348
|
| |
|
|
| |
llvm-svn: 208344
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shift intrinsics.
This patch teaches the backend how to combine packed SSE2/AVX2 arithmetic shift
intrinsics.
The rules are:
- Always fold a packed arithmetic shift by zero to its first operand;
- Convert a packed arithmetic shift intrinsic dag node into a ISD::SRA only if
the shift count is known to be smaller than the vector element size.
This patch also teaches to function 'getTargetVShiftByConstNode' how fold
target specific vector shifts by zero.
Added two new tests to verify that the DAGCombiner is able to fold
sequences of SSE2/AVX2 packed arithmetic shift calls.
llvm-svn: 208342
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
No functional change
Depends on D3649
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3672
llvm-svn: 208334
|
| |
|
|
| |
llvm-svn: 208330
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The parsing of ADD/SUB shifted immediates needs to be done explicitly so
that better diagnostics can be emitted, as a side effect this also
removes some of the hacks in the current method of handling this operand
type.
Additionally remove manual CMP aliasing to ADD/SUB and use InstAlias
instead.
llvm-svn: 208329
|
| |
|
|
|
|
| |
Also emit a more useful diagnostic when they are not.
llvm-svn: 208318
|
| |
|
|
| |
llvm-svn: 208317
|
| |
|
|
| |
llvm-svn: 208316
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These instructions were added in MIPS-I, and MIPS-II but were removed in
MIPS-III. Interestingly, GAS continues to accept them when assembling for
MIPS-III.
For the moment, these instructions will follow GAS and accept them for
MIPS-III and newer but this will be tightened up when the invalid-*.s
tests are added.
Depends on D3647
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3648
llvm-svn: 208311
|
| |
|
|
|
|
|
|
| |
big endian calls.
SelectionDAG already knows about this, but fast-isel was ignorant.
llvm-svn: 208307
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-mcpu=mips1 does not accept them
Summary:
A small number of instructions are rejected with the wrong error message.
These have been placed in a separate test for now. There seems to be some
parsing quirk that triggers when these instructions are disabled.
Depends on D3571
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3647
llvm-svn: 208305
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: vmedic, dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3571
llvm-svn: 208301
|
| |
|
|
|
|
| |
We need to use a temporary register for a 2-step operation like REM.
llvm-svn: 208297
|
| |
|
|
|
|
| |
Patch by Yuri Gorshenin.
llvm-svn: 208296
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old method used by X86TTI to determine partial-unrolling thresholds was
messy (because it worked by testing target features), and also would not
correctly identify the target CPU if certain target features were disabled.
After some discussions on IRC with Chandler et al., it was decided that the
processor scheduling models were the right containers for this information
(because it is often tied to special uop dispatch-buffer sizes).
This does represent a small functionality change:
- For generic x86-64 (which uses the SB model and, thus, will get some
unrolling).
- For AMD cores (because they still currently use the SB scheduling model)
- For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump
the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has
been moved into BasicTTI, so other targets may now also opt-in to this
functionality simply by setting LoopMicroOpBufferSize in their processor
model definitions.
llvm-svn: 208289
|
| |
|
|
|
|
| |
ARM64 backend.
llvm-svn: 208284
|
| |
|
|
|
|
|
|
| |
This adds FK_SecRel_2 relocation support to ARM. This enables the building of
object files for armv7-windows-msvc which enables CodeView line tables for
debugging as opposed to armv7-windows-itanium which currently uses DWARF.
llvm-svn: 208273
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Vectors built with zeros and elements in the same order as another
(source) vector are optimized to be built using a single insertps
instruction.
Also optimize when we move one element in a vector to a different place
in that vector while zeroing out some of the other elements.
Further optimizations are possible, described in TODO comments.
I will be implementing at least some of them in the near future.
Added some tests for different cases where this optimization triggers.
Reviewers: nadav, delena, craig.topper
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3521
llvm-svn: 208271
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The loop stream detector (LSD) on modern Intel cores, which optimizes the
execution of small loops, has limits on the number of taken branches in
addition to uop-count limits (modern AMD cores have similar limits).
Unfortunately, at the IR level, estimating the number of branches that will be
taken is difficult. For one thing, it strongly depends on later passes (block
placement, etc.). The original implementation took a conservative approach and
limited the maximal BB DFS depth of the loop. However, fairly-extensive
benchmarking by several of us has revealed that this is the wrong approach. In
fact, there are zero known cases where the branch limit prevents a detrimental
unrolling (but plenty of cases where it does prevent beneficial unrolling).
While we could improve the current branch counting logic by incorporating
branch probabilities, this further complication seems unjustified without a
motivating regression. Instead, unless and until a regression appears, the
branch counting will be removed.
llvm-svn: 208255
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Given a FMA family (e.g., 213, 231), not all the variants (i.e., register or
memory) are commutable.
E.g., for the 213 family (with the syntax src1, src2, src3):
fmaXXX213 A, B, reg3/mem3 == fmaXXX213 B, A, reg3/mem3
Now consider the 231 family:
fmaXXX231 A, B, reg3 == fmaXXX231 A, reg3, B
But
fmaXXX231 A, B, mem3 != fmaXXX231 A, mem3, B
Indeed, mem3 cannot be the second argument of the memory variant of fmaXXX231.
Working on a reduced test case!
<rdar://problem/16800495>
llvm-svn: 208252
|
| |
|
|
| |
llvm-svn: 208248
|
| |
|
|
| |
llvm-svn: 208239
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
default architecture for reasonable modern x86 processors, actually be
modern. This processor model should essentially be "tuned" for modern
x86 chips as much as possible without undue penalties on any specific
architecture. Previously we weren't even using the nice scheduling
models. There are a few other tweaks needed here, but this change at
least I have benchmarked across a decent swatch of chips (intel's
clovertown, westmere, and sandybridge; amd's istanbul) and seen no
significant regressions.
If anyone has suggested ways to test this, just let me know. Somewhat
alarmingly, no existing tests failed.
llvm-svn: 208230
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this patch disables the dead register elimination pass and the load/store pair
optimization pass at -O0. The ILP optimizations don't require the optimization
level to be checked because the call to addILPOpts is predicated with the
necessary check. The AdvSIMDScalar pass is disabled by default at all
optimization levels. This patch leaves that pass disabled by default.
Also, move command-line options into ARM64TargetMachine.cpp and add a few
additional flags to aid in debugging. This fixes an issue with the
-debug-pass=Structure flag where passes were printed, but not actually run
(i.e., AdvSIMDScalar pass).
llvm-svn: 208223
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These processors will only be available for the integrated assembler at
first (CodeGen will emit a fatal error saying they are not implemented).
The intention is to work through the existing instructions and correctly
annotate the ISA they were added in so that we have a sufficiently good
base to start MIPS64r6 development. MIPS64r6 removes/re-encodes certain
instructions and I believe it is best to define ISA's using set-union's
as far as possible rather than using set-subtraction.
Reviewers: vmedic
Subscribers: emaste, llvm-commits
Differential Revision: http://reviews.llvm.org/D3569
llvm-svn: 208221
|
| |
|
|
| |
llvm-svn: 208218
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FGRPredicates/GPRPredicates
Summary:
No functional change (confirmed by diffing tablegen-erated files).
Depends on D3642
Reviewers: vmedic, dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3645
llvm-svn: 208213
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AdditionalPredicates overrides
Summary:
No functional change
Depends on D3641
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3642
llvm-svn: 208212
|
| |
|
|
|
|
|
|
|
| |
When performing a scalar comparison that feeds into a vector select,
it's actually better to do the comparison on the vector side: the
scalar route would be "CMP -> CSEL -> DUP", the vector is "CM -> DUP"
since the vector comparisons are all mask based.
llvm-svn: 208210
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AdditionalPredicates overrides
Summary:
One small functional change. The recently added PAUSE instruction now has
the HasStdEnc predicate which was accidentally removed by a Requires<>.
Depends on D3640
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3641
llvm-svn: 208209
|
| |
|
|
|
|
| |
We were already always passing true, this just removes the option.
llvm-svn: 208205
|