| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
| |
Fix memory leaks on check-llvm tests detected by Asan.
This reverts commit r298282.
llvm-svn: 298329
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The glueless lowering of addc/adde in Thumb1 has known serious
miscompiles (see https://reviews.llvm.org/D31081), and r297820
causes an infinite loop for certain constructs. It's not
clear when they will be fixed, so let's just take them out
of the tree for now.
(I resolved a small conflict with r297453.)
llvm-svn: 298328
|
| |
|
|
|
|
|
|
|
|
|
| |
The special case of zero sized values was previously not handled correctly.
This patch handles this by not promoting if the size is zero.
Patch by Tim Neumann.
Differential Revision: https://reviews.llvm.org/D31116
llvm-svn: 298320
|
| |
|
|
|
|
|
|
|
|
|
| |
Make x86_64-fuchsia targets under -mcmodel=kernel use %gs rather
than %fs to access ABI slots for stack-protector and safe-stack.
Patch by Roland McGrath.
Differential Revision: https://reviews.llvm.org/D30870
llvm-svn: 298302
|
| |
|
|
|
|
|
|
|
| |
Regain the ability to recognize loops calculating polynomial modulo
operation. This ability has been lost due to some changes in the
preceding optimizations. Add code to preprocess the IR to a form
that the pattern matching code can recognize.
llvm-svn: 298282
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D31141
llvm-svn: 298281
|
| |
|
|
|
|
|
|
|
|
| |
This fix enables sp3 abs modifier with constants
Reviewers: artem.tamazov
Differential Revision: https://reviews.llvm.org/D30825
llvm-svn: 298265
|
| |
|
|
|
|
|
| |
I don't know how to type. This fixes the last commit which would have made all
of the overflows legal, and kept the screaming.
llvm-svn: 298263
|
| |
|
|
|
|
|
|
|
|
| |
Forgot to remove some output before committing last time. (Instruction fixups
don't actually overflow anywhere in the test suite so far, so I missed it).
To prevent the outliner from screaming "Overflow!" in the event that that
does happen, this commit removes that output.
llvm-svn: 298260
|
| |
|
|
|
|
|
|
|
|
| |
Fixed several related issues with VOP3 fp modifiers.
Reviewers: artem.tamazov
Differential Revision: https://reviews.llvm.org/D30821
llvm-svn: 298255
|
| |
|
|
|
|
|
|
|
|
|
| |
This commit adds a parameter that lets us pass in the calling convention
of the call to CallLowering::lowerCall. This allows us to handle
situations where the calling convention of the callee is different from
that of the caller.
Differential Revision: https://reviews.llvm.org/D31039
llvm-svn: 298254
|
| |
|
|
|
|
| |
This reverts commit r297958, it breaks device-libs build.
llvm-svn: 298239
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
instead of isel
Summary:
Currently we handle these intrinsics at isel with special patterns. But as they just map to normal logic operations, we should just handle them at lowering. This will expose them to DAG combine optimizations. Right now the kor-sequence test generates a bunch of regclass copies between GR16 and VK16 that the peephole optimizer and/or register coalescing are removing to keep everything in the mask domain. By handling the logic op intrinsics earlier, these copies become bitcasts in the DAG and get removed by DAG combine, which seems more robust.
This should help enable my plan to stop copying between K registers and GR8/GR16. The peephole optimizer can't remove a chain of copies between K and GR32 with insert_subreg/extract_subreg present in the chain, so the kor-sequence test breaks. But this patch should dodge the problem entirely.
Reviewers: zvi, delena, RKSimon, igorb
Reviewed By: igorb
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31056
llvm-svn: 298228
|
| |
|
|
|
|
| |
labels". NFCI.
llvm-svn: 298225
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The MIR printer dumps a string that describes the register mask of a function.
A static predefined list of register masks matches a static list of strings.
However, when the register mask is not from the static predefined list, there is no descriptor string and the printer fails.
This patch adds support for custom register mask printing and dumping.
Also the list of callee saved registers (describing the registers that must be preserved for the caller) might be dynamic.
As such this data needs to be dumped and parsed back to the Machine Register Info.
Differential Revision: https://reviews.llvm.org/D30971
llvm-svn: 298207
|
| |
|
|
|
|
|
|
| |
Let targets specialize the pass with the register class so we can get a
parameterless default constructor and can put the pass into the pass
registry to enable testing with -run-pass=.
llvm-svn: 298184
|
| |
|
|
|
|
|
| |
Normalize ExeDepsFix, execution-fix, ExecutionDependencyFix and
ExecutionDepsFix to the last one.
llvm-svn: 298183
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: mkuper, rnk
Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D27050
llvm-svn: 298179
|
| |
|
|
| |
llvm-svn: 298178
|
| |
|
|
|
|
|
|
|
| |
This is a direct port of HSAILAliasAnalysis pass, just cleaned for
style and renamed.
Differential Revision: https://reviews.llvm.org/D31103
llvm-svn: 298172
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds the necessary target hooks for outlining in AArch64. It also
refactors the switch statement used in `getMemOpBaseRegImmOfsWidth` into a
more general function, `getMemOpInfo`. This allows the outliner to share that
code without copying and pasting it.
The AArch64 outliner can be run using -mllvm -enable-machine-outliner, as with
the X86-64 outliner.
The test for this pass verifies that the outliner does, in fact, outline
functions, fixes up the stack accesses properly, and can correctly generate a
tail call. In the future, this test should be replaced with a MIR test, so that
we can properly test immediate offset overflows in fixed-up instructions.
llvm-svn: 298162
|
| |
|
|
|
|
| |
Fixes bug 32248.
llvm-svn: 298125
|
| |
|
|
|
|
|
|
| |
If the loop condition was an i1 phi with a constantexpr input, this
would add a loop intrinsic fed by a phi dependent on a call to
if.break in the same block. Insert the call in the loop header.
llvm-svn: 298121
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move backend internal intrinsics along with the rest of the
normal intrinsics, and use the Intrinsic::getDeclaration
API instead of manually constructing the type list.
It's surprising this was working before. fdiv.fast had
the wrong number of parameters. The control flow intrinsic
declaration attributes were not being applied, and
their types were inconsistent. The actual IR use types
did not match the declaration, and were closer to the
types used for the patterns. The brcond lowering
was changing the types, so introduce new nodes for those.
llvm-svn: 298119
|
| |
|
|
| |
llvm-svn: 298118
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Use this code pattern when RAX is live, instead of emitting up to 2
billion adjustments:
pushq %rax
movabsq +-$Offset+-8, %rax
addq %rsp, %rax
xchg %rax, (%rsp)
movq (%rsp), %rsp
Try to clean this code up a bit while I'm here. In particular, hoist the
logic that handles the entire adjustment with `movabsq $imm, %rax` out
of the loop.
This negates the offset in the prologue and uses ADD because X86 only
has a two operand subtract which always subtracts from the destination
register, which can no longer be RSP.
Fixes PR31962
Reviewers: majnemer, sdardis
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30052
llvm-svn: 298116
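The five-instruction sequence in this message can be checked with a small register-level model. The sketch below is not LLVM code: the function name, the dictionary used as stack memory, and the immediate formula (negated offset plus 8 for a subtraction, offset plus 8 for an addition, which is one reading of `+-$Offset+-8`) are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1
SLOT_SIZE = 8  # bytes moved by pushq on x86-64


def adjust_rsp_with_live_rax(rsp, rax, offset, is_sub):
    """Model the sequence from the commit message; return (rsp, rax)."""
    mem = {}  # sparse model of 8-byte stack slots, keyed by address

    # pushq %rax
    rsp = (rsp - SLOT_SIZE) & MASK64
    mem[rsp] = rax

    # movabsq $imm, %rax
    # Assumed immediate: negate for a subtraction (so ADD can be used)
    # and correct by 8 bytes for the push that just happened.
    imm = -(offset - SLOT_SIZE) if is_sub else offset + SLOT_SIZE
    rax = imm & MASK64

    # addq %rsp, %rax  (RAX now holds the target stack pointer)
    rax = (rax + rsp) & MASK64

    # xchg %rax, (%rsp)  (restore RAX, park the target in the slot)
    rax, mem[rsp] = mem[rsp], rax

    # movq (%rsp), %rsp
    rsp = mem[rsp]
    return rsp, rax


if __name__ == "__main__":
    new_rsp, new_rax = adjust_rsp_with_live_rax(
        0x00007FFFFFFF0000, 0x1234, 20 << 30, is_sub=True)
    print(hex(new_rsp), hex(new_rax))
```

In this model RAX comes back unchanged and RSP moves by exactly the requested offset in either direction, which is the property the sequence is meant to provide when no scratch register is free.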
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
As noted in the comment, we might want to account for this case,
but I didn't look at what that would mean for the asm.
I'm also not sure why this only reproduces with avx512, but I'm
putting a conservative fix in for now to avoid the crash.
Also, if both sides of an add are zexted, shouldn't we shrink that add?
https://bugs.llvm.org/show_bug.cgi?id=32316
llvm-svn: 298107
|
| |
|
|
| |
llvm-svn: 298106
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Loop unswitching can be extremely harmful for a SIMT target. If the
hoisted condition is not uniform, a SIMT machine will execute both
clones of the loop sequentially. Therefore LoopUnswitch checks that the
condition is non-divergent.
Since DivergenceAnalysis adds an expensive PostDominatorTree analysis
that is not needed for non-SIMT targets, a new option is added to avoid
unneeded analysis initialization. The method getAnalysisUsage is called
when TargetTransformInfo is not yet available, so we cannot use it here.
For that reason a new field, DivergentTarget, is added to
PassManagerBuilder to control the behavior, and a target sets this field.
Differential Revision: https://reviews.llvm.org/D30796
llvm-svn: 298104
|
| |
|
|
|
|
|
|
| |
This allows the optimization to rearrange loads and stores more aggressively.
Differential Revision: http://reviews.llvm.org/D30903
llvm-svn: 298092
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Fixing triple format in the tests added for the branch label fix for Thumb
Targets. Also recommitting previously approved patch, see
https://reviews.llvm.org/D30943.
Reviewed by: samparker
Differential Revision: https://reviews.llvm.org/D30987
llvm-svn: 298056
|
| |
|
|
|
|
|
|
|
|
|
|
| |
regardless of whether +fma was added on the command line.
We weren't able to handle isel of the 128/256-bit FMA instructions when AVX512F was enabled but VLX and FMA weren't.
I didn't make FeatureAVX512 imply FeatureFMA as I wasn't sure I wanted disabling FMA to also disable AVX512. Instead we just can't prevent FMA instructions if AVX512 is enabled.
Another option would be to promote 128/256-bit to 512-bit, do the operation and extract it. But that requires a lot of extra isel patterns. Since no CPUs exist that support AVX512 but not FMA, just using the VEX instructions seems better.
llvm-svn: 298051
|
| |
|
|
| |
llvm-svn: 298050
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If one of the subregs of the 128 bit reg is undefined when splitMove() splits
a store into two instructions, a use of an undefined physical register
results.
To remedy this, an implicit use of the super register is added onto both new
instructions, along with propagated kill and undef flags.
This was discovered with llvm-stress, and that test case is attached as
test/CodeGen/SystemZ/splitMove_undefReg_mverifier.ll
Thanks to Matthias Braun for helping with a nice explanation.
Review: Ulrich Weigand
llvm-svn: 298047
|
| |
|
|
|
|
|
|
| |
FMA, AVX512 and no VLX.
We were giving priority if VLX was enabled.
llvm-svn: 298046
|
| |
|
|
|
|
| |
This makes the values a little more consistent between similar instructions and reduces the values somewhat. This results in better grouping in the isel table, saving a few bytes.
llvm-svn: 298043
|
| |
|
|
|
|
|
| |
associated command line options and functions - it's currently unused
in all of llvm and clang other than being set and reset.
llvm-svn: 298023
|
| |
|
|
|
|
|
|
|
|
| |
This allows the optimization to rearrange loads and stores more
aggressively. This doesn't really affect performance, but it helps
codesize.
Differential Revision: https://reviews.llvm.org/D30839
llvm-svn: 298021
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch cleans the namespace of the Lanai target.
Reviewers: jpienaar
Reviewed By: jpienaar
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30955
llvm-svn: 298015
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Users often call getArgumentList().size(), which is a linear way to get
the number of function arguments. arg_size(), on the other hand, is
constant time.
In general, the fact that arguments are stored in an iplist is an
implementation detail, so I've removed it from the Function interface
and moved all other users to the argument container APIs (arg_begin(),
arg_end(), args(), arg_size()).
Reviewed By: chandlerc
Differential Revision: https://reviews.llvm.org/D31052
llvm-svn: 298010
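The cost difference described in this message can be shown with a toy model. This is not the LLVM API: `ToyArgList` and `ToyFunction` are hypothetical stand-ins, and recording the argument count at construction time is an assumption about how a constant-time `arg_size()` could work.

```python
class ToyArgList:
    """Linked list of argument names; counting it means walking it."""

    def __init__(self, names):
        self.head = None
        for name in reversed(names):
            self.head = (name, self.head)  # node = (value, next)

    def size(self):
        count = 0
        node = self.head
        while node is not None:  # one step per argument: linear time
            count += 1
            node = node[1]
        return count

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node[0]
            node = node[1]


class ToyFunction:
    """Hides the list and exposes only container-style accessors."""

    def __init__(self, arg_names):
        self._arg_list = ToyArgList(arg_names)
        self._num_args = len(arg_names)  # assumed: count recorded up front

    def arg_size(self):
        return self._num_args  # constant time, no list walk

    def args(self):
        return iter(self._arg_list)


if __name__ == "__main__":
    f = ToyFunction(["a", "b", "c"])
    print(f.arg_size(), list(f.args()))
```

Keeping the list private means callers cannot reach for the linear `size()` by accident, which mirrors the motivation given above for removing the list accessor from the interface.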
|
| |
|
|
|
|
|
|
|
|
|
|
| |
A recent change switched the in-memory wasm value types
to be signed integers, but I missed a few cases where
these were being written to the binary.
Differential Revision: https://reviews.llvm.org/D31014
Patch by Sam Clegg
llvm-svn: 297991
|
| |
|
|
|
|
|
|
|
|
| |
In fact this default implementation should be the only implementation;
keep it virtual for now to accommodate targets that don't model flags
correctly.
Differential Revision: https://reviews.llvm.org/D30747
llvm-svn: 297980
|
| |
|
|
|
|
|
|
|
| |
Earlier stages of GlobalISel always use ConstantInt in G_CONSTANT so that's
what we should check for.
This fixes a crash introduced in r297782.
llvm-svn: 297968
|
| |
|
|
| |
llvm-svn: 297959
|
| |
|
|
|
|
|
|
|
|
| |
We can mark functions to always inline early in opt. Since we do not have
call support, this early inlining creates opportunities for inter-procedural
optimizations which would not occur otherwise.
Differential Revision: https://reviews.llvm.org/D31016
llvm-svn: 297958
|
| |
|
|
| |
llvm-svn: 297920
|
| |
|
|
| |
llvm-svn: 297915
|
| |
|
|
| |
llvm-svn: 297913
|
| |
|
|
|
|
| |
Prep work for PR31810
llvm-svn: 297876
|
| |
|
|
|
|
|
| |
We're now able to select ADDWri thanks to the new complex pattern
support. Extend that to ADDXri.
llvm-svn: 297874
|