| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The test case included with r279125 exposed an existing signed integer
overflow. Since getTreeCost can return INT_MAX, we can't sum this cost together
with other costs, such as getReductionCost.
This patch removes the possibility of assigning a cost of INT_MAX. Since we
were previously using INT_MAX as an indicator for "should not vectorize", we
now explicitly check this condition with "isTreeTinyAndNotFullyVectorizable"
before computing a cost.
This patch adds a run-line to the test case used for r279125 that ensures we
don't vectorize. Previously, this line would vectorize the test case by chance
due to undefined behavior in the cost calculation.
Differential Revision: https://reviews.llvm.org/D23723
llvm-svn: 279562
|
| |
|
|
| |
llvm-svn: 279561
|
| |
|
|
|
|
|
|
| |
Instructions like G_ICMP have multiple types that may need to be legalized (the
boolean output and nearly arbitrary inputs in this case). So the legalizer must
be capable of deciding what to do for each of them separately.
llvm-svn: 279554
|
| |
|
|
| |
llvm-svn: 279553
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LTO object
Summary:
I assume there was a use case, so maybe this strawman patch will help
clarifying if it is legit.
In any case the current situation is not legit: a ThinLTO compilation
should not trigger an unexpected full LTO compilation.
Right now, adding a --save-temps option triggers this and makes the
number of output differs.
Reviewers: tejohnson
Subscribers: pcc, llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D23600
llvm-svn: 279550
|
| |
|
|
|
|
| |
AArch64.
llvm-svn: 279548
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Completes the m_APInt changes for simplifyICmpWithConstant().
Other commits in this series:
https://reviews.llvm.org/rL279492
https://reviews.llvm.org/rL279530
https://reviews.llvm.org/rL279534
https://reviews.llvm.org/rL279538
llvm-svn: 279543
|
| |
|
|
| |
llvm-svn: 279542
|
| |
|
|
| |
llvm-svn: 279538
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This greatly simplifies our handling of SDNode::SubclassData.
NFC, hopefully. :)
See discussion in D23035 for discussion about the design API of these
bitfields.
Reviewers: chandlerc
Subscribers: llvm-commits, rnk
Differential Revision: https://reviews.llvm.org/D23036
llvm-svn: 279537
|
| |
|
|
| |
llvm-svn: 279536
|
| |
|
|
|
|
|
|
| |
other minor fixes.
Differential revision: https://reviews.llvm.org/D23789
llvm-svn: 279535
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
appropriate options
An important performance setting on the LLVMContext for LTO is
enableDebugTypeODRUniquing(), this adds an automatic merging of
debug information in the context based on type ids.
Also, the lto::Config includes a diagnostic handler that needs to
be set on the Context, as well as the setDiscardValueNames() setting.
llvm-svn: 279532
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
That commit added a new version of Intrinsic::getName which should only
be called when the intrinsic has no overloaded types. There are several
debugging paths, such as SDNode::dump which are printing the name of the
intrinsic but don't have the overloaded types. These paths should be ok
to just print the name instead of crashing.
The fix here is ultimately to just add a 'None' second argument as that
calls the overload capable getName, which is less efficient, but this is a
debugging path anyway, and not perf critical.
Thanks to Björn Pettersson for pointing out that there were more crashes.
llvm-svn: 279528
|
| |
|
|
|
|
| |
Commit r279241 unintentionally reverted that ability.
llvm-svn: 279526
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D23619
llvm-svn: 279523
|
| |
|
|
|
|
| |
The windows build bot did not like constexpr.
llvm-svn: 279517
|
| |
|
|
|
|
|
|
|
| |
The change in r279105 causes an infinite loop in some cases, as it sets the upper bits of an AND mask constant, which DAGCombiner::SimplifyDemandedBits then unsets.
This patch reverts that part of the behaviour, instead relying on .td peepholes to perform the transformation to NILL. I reapplied my original fix for the problem addressed by r279105 (unsetting the upper bits, which prevents a compiler abort for a different reason).
Differential Revision: https://reviews.llvm.org/D23781
llvm-svn: 279515
|
| |
|
|
| |
llvm-svn: 279514
|
| |
|
|
| |
llvm-svn: 279510
|
| |
|
|
|
|
| |
"AllTargetsDescs" in llvm-mc/CMakeLists.txt expects not ${target}MCTargetDesc, but ${target}Desc.
llvm-svn: 279509
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is not an official documented ABI for frame pointers in Thumb2,
but we should try to emit something which is useful.
We use r7 as the frame pointer for Thumb code, which currently means
that if a function needs to save a high register (r8-r11), it will get
pushed to the stack between the frame pointer (r7) and link register
(r14). This means that while a stack unwinder can follow the chain of
frame pointers up the stack, it cannot know the offset to lr, so does
not know which functions correspond to the stack frames.
To fix this, we need to push the callee-saved registers in two batches,
with the first push saving the low registers, fp and lr, and the second
push saving the high registers. This is already implemented, but
previously only used for iOS. This patch turns it on for all Thumb2
targets when frame pointers are required by the ABI, and the frame
pointer is r7 (Windows uses r11, so this isn't a problem there). If
frame pointer elimination is enabled we still emit a single push/pop
even if we need a frame pointer for other reasons, to avoid increasing
code size.
We must also ensure that lr is pushed to the stack when using a frame
pointer, so that we end up with a complete frame record. Situations that
could cause this were rare, because we already push lr in most
situations so that we can return using the pop instruction.
Differential Revision: https://reviews.llvm.org/D23516
llvm-svn: 279506
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: GVNHoist: Use the pass version of MemorySSA and preserve it.
Reviewers: sebpop, george.burgess.iv
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23782
llvm-svn: 279504
|
| |
|
|
|
|
|
|
|
|
| |
MachineFunctionAnalysis => Enable (Machine)ModulePasses"
Reverting while tracking down a use after free.
This reverts commit r279502.
llvm-svn: 279503
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the MachineFunctionAnalysis. Instead we keep a
map from IR Function to MachineFunction in the MachineModuleInfo.
This allows the insertion of ModulePasses into the codegen pipeline
without breaking it because the MachineFunctionAnalysis gets dropped
before a module pass.
Peak memory should stay unchanged without a ModulePass in the codegen
pipeline: Previously the MachineFunction was freed at the end of a codegen
function pipeline because the MachineFunctionAnalysis was dropped; With
this patch the MachineFunction is freed after the AsmPrinter has
finished.
Differential Revision: http://reviews.llvm.org/D23736
llvm-svn: 279502
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
branches
Looping over all terminators exposed AArch64 tests hitting
an assert from analyzeBranch failing. I believe these cases
were miscompiled before.
e.g.
fcmp s0, s1
b.ne LBB0_1
b.vc LBB0_2
b LBB0_2
LBB0_1:
; Large block
LBB0_2:
; ...
Both of the individual conditional branches need to
be expanded, since neither can reach the final block.
Split the original block into ones which analyzeBranch
will be able to understand.
llvm-svn: 279499
|
| |
|
|
|
|
| |
LanaiMemAluCombiner could try to query the debug value of a list sentinel. Add check to exit early instead.
llvm-svn: 279497
|
| |
|
|
|
|
|
|
| |
Given that we're not currently using blocker info, and whether or not we
will end up using it it is unclear, don't waste 8 (or 4) bytes of memory
per path node.
llvm-svn: 279493
|
| |
|
|
|
|
|
|
|
| |
And add a FIXME because the helper excludes folds for vectors. It's
not clear yet how many of these are actually testable (and therefore
necessary?) because later analysis uses computeKnownBits and other
methods to catch many of these cases.
llvm-svn: 279492
|
| |
|
|
|
|
|
|
|
|
| |
The assert in r279466 checks that we call the correct version of
Intrinsic::getName. The version which accepts only an ID should not
be used for intrinsics with overloaded types. The global-isel
code was calling the wrong version. The test CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
will ensure that we call the correct version from now on.
llvm-svn: 279487
|
| |
|
|
|
|
|
|
| |
helpers; NFCI
This saves some casting in the helper functions and eases some further refactoring.
llvm-svn: 279478
|
| |
|
|
|
|
|
|
| |
This should finish the GraphTraits migration.
Differential Revision: http://reviews.llvm.org/D23730
llvm-svn: 279475
|
| |
|
|
|
|
|
|
|
|
| |
Remove all the dead code around ilist_*sentinel_traits. This is a
follow-up to gutting them as part of r279314 (originally r278974),
staged to prevent broken builds in sub-projects.
Uses were removed from clang in r279457 and lld in r279458.
llvm-svn: 279473
|
| |
|
|
|
|
| |
constant vectors
llvm-svn: 279472
|
| |
|
|
|
|
|
|
|
| |
Xcode and MSVC list the headers and source files for each library.
LLVMSupport lists included the source files for ADT but not the headers. This
add the ADT headers so that they are browsable by the UI.
llvm-svn: 279470
|
| |
|
|
|
|
|
|
|
|
|
| |
Philip commented on r279113 to ask for better comments as to
when to use the different versions of getName. Its also possible
to assert in the simple case that we aren't an overloaded intrinsic
as those have to use the more capable version of getName.
Thanks for the comments Philip.
llvm-svn: 279466
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do most of the lowering in a pre-RA pass. Keep the skip jump
insertion late, plus a few other things that require more
work to move out.
One concern I have is now there may be COPY instructions
which do not have the necessary implicit exec uses
if they will be lowered to v_mov_b32.
This has a positive effect on SGPR usage in shader-db.
llvm-svn: 279464
|
| |
|
|
| |
llvm-svn: 279462
|
| |
|
|
| |
llvm-svn: 279461
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[Recommitting now an unrelated assertion in SROA is sorted out]
The new version has several advantages:
1) IMSHO it's more readable and neater
2) It handles loads and stores properly
3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.
With this change we can now finally sink load-modify-store idioms such as:
if (a)
return *b += 3;
else
return *b += 4;
=>
%z = load i32, i32* %y
%.sink = select i1 %a, i32 5, i32 7
%b = add i32 %z, %.sink
store i32 %b, i32* %y
ret i32 %b
When this works for switches it'll be even more powerful.
Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables.
This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup.
llvm-svn: 279460
|
| |
|
|
|
|
|
|
|
| |
Confirmed with aprantl, this assertion is incorrect - code can get here (for example 80-bit FP types) and if it does it's benign. This is exposed by a completely unrelated patch of mine, so stop the compiler falling over.
Original differential: http://reviews.llvm.org/D16187
aprantl's advice to remove assertion: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160815/382129.html
llvm-svn: 279454
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
__guard_local is defined as long on OpenBSD. If the source file contains
a definition of __guard_local, it mismatches with the int8 pointer type
used in LLVM. In that case, Module::getOrInsertGlobal() returns a
cast operation instead of a GlobalVariable. Trying to set the
visibility on the cast operation leads to random segfaults (seen when
compiling the OpenBSD kernel, which also runs with stack protection).
In the kernel, the hidden attribute does not matter. For userspace code,
__guard_local is defined as hidden in the startup code. If a program
re-defines __guard_local, the definition from the startup code will
either win or the linker complains about multiple definitions
(depending on whether the re-defined __guard_local is placed in the
common segment or not).
It also matches what gcc on OpenBSD does.
Thanks Stefan Kempf <sisnkemp@gmail.com> for the patch!
Differential Revision: http://reviews.llvm.org/D23674
llvm-svn: 279449
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: We can allow sinking if the single user block has only one unique predecessor, regardless of the number of edges. Note that a switch statement with multiple cases can have the same destination.
Reviewers: mcrosier, majnemer, spatel, reames
Subscribers: reames, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D23722
llvm-svn: 279448
|
| |
|
|
|
|
| |
This reverts commit r279443. It caused buildbot failures.
llvm-svn: 279447
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new version has several advantages:
1) IMSHO it's more readable and neater
2) It handles loads and stores properly
3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.
With this change we can now finally sink load-modify-store idioms such as:
if (a)
return *b += 3;
else
return *b += 4;
=>
%z = load i32, i32* %y
%.sink = select i1 %a, i32 5, i32 7
%b = add i32 %z, %.sink
store i32 %b, i32* %y
ret i32 %b
When this works for switches it'll be even more powerful.
Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables.
This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup.
llvm-svn: 279443
|
| |
|
|
|
|
|
|
| |
chain (PR29088)
We could improve on this by making X86SubVBroadcast a full memory intrinsic similar to X86vzload
llvm-svn: 279441
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Assembler directives .dtprelword, .dtpreldword, .tprelword, and
.tpreldword generates relocations R_MIPS_TLS_DTPREL32, R_MIPS_TLS_DTPREL64,
R_MIPS_TLS_TPREL32, and R_MIPS_TLS_TPREL64 respectively.
The main motivation for this patch is to be able to write test cases
for checking correctness of the LLD linker's behaviour.
Differential Revision: https://reviews.llvm.org/D23669
llvm-svn: 279439
|
| |
|
|
|
|
|
|
| |
It use to be non-const for the sole purpose of custom handling of
commons symbol. This is moved now in the regular LTO handling now
and such we can constify the callback.
llvm-svn: 279438
|
| |
|
|
| |
llvm-svn: 279437
|
| |
|
|
|
|
| |
As discussed on D23027 we should be trying to be more strict on what is an undefined mask value.
llvm-svn: 279435
|