| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
ARMELFObjectWriter::addTargetSectionFlags
This simplifies the generic interface and also makes SHF_ARM_PURECODE
more robust (fixes a TODO). Inspecting MCDataFragment contents covers
more cases than MCObjectStreamer::EmitBytes.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Linux' current addLibCxxIncludePaths and addLibStdCxxIncludePaths
are actually almost non-Linux-specific at all, and can be reused
almost as such for all gcc toolchains. Only keep
Android/Freescale/Cray hacks in Linux's version.
Patch by sthibaul (Samuel Thibault)
Differential Revision: https://reviews.llvm.org/D69758
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
(PR43660)
Attempt to use combineLogicBlendIntoConditionalNegate for (select M, (sub 0, X), X) -> (sub (xor X, M), M)
We limit this to cases that can't easily replace the VSELECT with a shuffle (non-constant masks) or where a BLENDV is likely to occur (which tends to result in slower codegen).
|
|
|
|
| |
Updates function order in preparation of future fix for PR43660
|
|
|
|
|
|
| |
(NFC)
Silences a copy+paste analyzer warning - all they are doing are inserting NOOPs in exactly the same way.
|
|
|
|
|
|
|
|
|
|
|
| |
This adds extra scalar handling to isFMAFasterThanFMulAndFAdd, allowing
the target independent code to handle more folds in more situations (for
example if the fast math flags are present, but the global
AllowFPOpFusion option isnt). It also splits apart the HasSlowFPVMLx
into HasSlowFPVFMx, to allow VFMA and VMLA to be controlled separately
if needed.
Differential Revision: https://reviews.llvm.org/D72139
|
|
|
|
|
|
| |
This adds fp16 variants of all the fma patterns in the ARM backend.
Differential Revision: https://reviews.llvm.org/D72138
|
| |
|
|
|
|
|
| |
Some combinations of gcc and ccache do not deal well with raw strings in
macros. Moving the string out to attempt to fix the bots.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LegalizeVectorOps to avoid scalarization.
The code here isn't great in all caess. Particularly v4f64->v4i32
on 64-bit AVX targets. But there is some improvement in some
configurations.
There's definitely some issues with computeNumSignBits with
X86ISD::STRICT_FCMP. As well as not being able to propagate sign
bits through merge_values nodes that get created during custom
legalization.
|
|
|
|
|
|
|
|
|
|
|
|
| |
condition to feed the DstVT select.
Previously, for vectors we created a vselect with a condition that
didn't match what the target wanted according to getSetCCResultType.
To make up for this, X86 had a special DAG combine to detect if
the condition was all sign bits and then insert its own truncate
or extend. By adding the extend/truncate here explicitly we can
avoid that.
|
|
|
|
|
|
|
|
|
|
|
| |
UnrollStrictFPOp method. Call that method from ExpandUINT_TO_FLOAT.
ExpandStrictFPOp calls ExpandUINT_TO_FLOAT. Previously, ExpandUINT_TO_FLOAT
returned SDValue() if it wasn't able to handle and needed to unroll.
Then ExpandStrictFPOp would detect his SDValue() and do the unroll.
After this change, ExpandUINT_TO_FLOAT will directly call
UnrollStrictFPOp and return the unrolled result.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
```
lld/ELF/Relocations.cpp:1622:56: warning: loop variable 'ts' of type 'const std::pair<ThunkSection *, uint32_t>' (aka 'const pair<lld::elf::ThunkSection *, unsigned int>') creates a copy from type 'const std::pair<ThunkSection *, uint32_t>' [-Wrange-loop-analysis]
for (const std::pair<ThunkSection *, uint32_t> ts : isd->thunkSections)
```
Drop const qualifier to fix -Wrange-loop-analysis.
We can make -Wrange-loop-analysis warnings (DiagnoseForRangeConstVariableCopies) on `const A` more
permissive on more types (e.g. POD -> trivially copyable), unfortunately it will not make std::pair
good, because `constexpr pair& operator=(const pair& p);` is unfortunately user-defined.
Reviewed By: Mordante
Differential Revision: https://reviews.llvm.org/D72211
|
|
|
|
|
|
| |
This only handled G_SDIV, but they all are trivially scalarizable.
Also define placeholder AMDGPU division legalizer rules.
|
|
|
|
|
| |
This reverts commit 51ef53f3bd23559203fe9af82ff2facbfedc1db3, as it
breaks some bots.
|
|
|
|
|
|
|
|
|
|
|
|
| |
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.
Reviewers: sanjoy.google, efriedma, reames
Reviewed By: sanjoy.google
Differential Revision: https://reviews.llvm.org/D71537
|
| |
|
| |
|
|
|
|
|
| |
Fix selecting these for volatile global loads, and ensure the loads
are constant enough.
|
|
|
|
|
| |
The attempts to widen sufficently aligned, odd sized loads wasn't
consistently applied.
|
|
|
|
|
|
|
| |
This produces more intelligible looking results, more comparabble to
the DAG output in the simplest cases. This is probably wrong in
complex control flow, but RegBankSelect doesn't attempt analyzing if
this is on a masked path for selecting the bank yet.
|
|
|
|
|
|
|
|
|
|
|
| |
This test fails on macOS, causing the following bots to fail
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/7438/
http://green.lab.llvm.org/green/job/clang-stage1-RA/5034/
Error:
Error opening 'build/./lib/libBye.dylib': dlopen(build/./lib/libBye.dylib, 9): image not found
-load request ignored.
|
|
|
|
|
|
|
|
|
|
|
| |
We're checking the current register bank of the registers in the
instruction, but the mapping may have inserted cross bank copies and
is expecting to replace the registers.
We mostly get away with this currently, because VGPR->SGPR copies are
illegal, and we assume this won't happen. In a future change, we'll
start relying on more cross register bank copies being inserted, and
this starts to break down.
|
|
|
|
|
| |
This should fix
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/30086
|
|
|
|
|
|
| |
This reverts commit 19fd8925a4afe6efd248688cce06aceff50efe0c.
Should include a fix for PR44197.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
same address to avoid WAR conflict.
Reviewers: rampitec, vpykhtin, nhaehnle
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D71934
|
|
|
|
|
|
|
|
|
|
| |
hand of select to 0' fold
I would think it's better than having two practically identical folds
next to eachother, but then generalization isn't all that pretty
due to the fact that we need to produce different `sub` each time..
This change is no-functional-changes-intended refactoring.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR44426)
This decreases use count of %Op0, makes one hand of select to be 0,
and possibly exposes further folding potential.
Name: sub %Op0, (select %Cond, %Op0, %FalseVal) -> select %Cond, 0, (sub %Op0, %FalseVal)
%Op0 = %TrueVal
%o = select i1 %Cond, i8 %Op0, i8 %FalseVal
%r = sub i8 %Op0, %o
=>
%n = sub i8 %Op0, %FalseVal
%r = select i1 %Cond, i8 0, i8 %n
Name: sub %Op0, (select %Cond, %TrueVal, %Op0) -> select %Cond, (sub %Op0, %TrueVal), 0
%Op0 = %FalseVal
%o = select i1 %Cond, i8 %TrueVal, i8 %Op0
%r = sub i8 %Op0, %o
=>
%n = sub i8 %Op0, %TrueVal
%r = select i1 %Cond, i8 %n, i8 0
https://rise4fun.com/Alive/aHRt
https://bugs.llvm.org/show_bug.cgi?id=44426
|
|
|
|
| |
https://bugs.llvm.org/show_bug.cgi?id=44426
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This decreases use count of %Op1, makes one hand of select to be 0,
and possibly exposes further folding potential.
Name: sub (select %Cond, %Op1, %FalseVal), %Op1 -> select %Cond, 0, (sub %FalseVal, %Op1)
%Op1 = %TrueVal
%o = select i1 %Cond, i8 %Op1, i8 %FalseVal
%r = sub i8 %o, %Op1
=>
%n = sub i8 %FalseVal, %Op1
%r = select i1 %Cond, i8 0, i8 %n
Name: sub (select %Cond, %TrueVal, %Op1), %Op1 -> select %Cond, (sub %TrueVal, %Op1), 0
%Op1 = %FalseVal
%o = select i1 %Cond, i8 %TrueVal, i8 %Op1
%r = sub i8 %o, %Op1
=>
%n = sub i8 %TrueVal, %Op1
%r = select i1 %Cond, i8 %n, i8 0
https://rise4fun.com/Alive/avL
https://bugs.llvm.org/show_bug.cgi?id=44426
|
|
|
|
| |
https://bugs.llvm.org/show_bug.cgi?id=44426
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
memory usage.
Summary:
For artificial cases (huge array, few usages), Global SRA optimization creates
a lot of redundant data. It creates an instance of GlobalVariable for each array
element. For huge array, that means huge compilation time and huge memory usage.
Following example compiles for 10 minutes and requires 40GB of memory.
namespace {
char LargeBuffer[64 * 1024 * 1024];
}
int main ( void ) {
LargeBuffer[0] = 0;
printf("\n ");
return LargeBuffer[0] == 0;
}
The fix is to avoid Global SRA for large arrays.
Reviewers: craig.topper, rnk, efriedma, fhahn
Reviewed By: rnk
Subscribers: xbolva00, lebedev.ri, lkail, merge_guards_bot, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71993
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for ISD::EXTRACT_VECTOR_ELT (REAPPLIED)
This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract.
In particular this helps remove some unnecessary scalar->vector->scalar patterns.
The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue.
Reapplied after reversion at rL368660 due to PR42982 which was fixed at rGca7fdd41bda0.
Differential Revision: https://reviews.llvm.org/D65887
|
|
|
|
|
|
|
|
|
| |
Both MS link.exe and GNU ld.bfd handle it this way; one can have
multiple object files defining the same absolute symbols, as long
as it defines it to the same value. But if there are multiple absolute
symbols with differing values, it is treated as an error.
Differential Revision: https://reviews.llvm.org/D71981
|
|
|
|
|
|
|
|
|
|
|
|
| |
not use the index to look up the array after the loop.
This represents a more realistic version of the code being tested.
The cmov converter doesn't look at the code after the loop so
it doesn't matter for what's being tested.
But as noted in this twitter thread https://twitter.com/trav_downs/status/1213311159413161987
gcc can turn the previous MaxIndex code into the MaxValue code. So
returning the index makes it a distinct case.
|
|
|
|
|
|
| |
Submitted by: kiszk
Differential Revision: https://reviews.llvm.org/D72171
|
| |
|
|
|
|
|
|
|
|
|
| |
All the windows bots are failing match-tree.td and there's no obvious cause that
I can see. It's not just the %p formatting problem. My best guess is that
there's an ordering issue too but I'll need further information to figure that
out. Revert while I'm investigating.
This reverts commit 64f1bb5cd2c6d69af7c74ec68840029603560238 and 77d4b5f5feff663e70b347516cc4c77fa5cd2a20
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, there is no option to delete all the watchpoint without LLDB
asking for a confirmation. Besides making the watchpoint delete command
homogeneous with the breakpoint delete command, this option could also
become handy to trigger automated watchpoint deletion i.e. using
breakpoint actions.
rdar://42560586
Differential Revision: https://reviews.llvm.org/D72096
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Static archives contain object files which contain sections pointing to
external remark files.
When static archives are shipped without the remark files, dsymutil
shouldn't generate an error.
Instead, generate a warning to inform the user that remarks for that
library won't be available in the .dSYM.
|
| |
|
| |
|