| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
| |
Modernise the code with range-loops etc
Reviewed by: @fhahn, @rovka
Differential Revision: https://reviews.llvm.org/D36502
llvm-svn: 310807
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These functions were overly complicated. The body of this function was rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and the pieces are in variables. We already had earlier nearby code that checked for ConstantInts. So just inline the remaining parts into the earlier code.
Next step is to use m_APInt instead of ConstantInt.
Reviewers: spatel, efriedma, davide, majnemer
Reviewed By: spatel
Subscribers: zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D36439
llvm-svn: 310806
|
| |
|
|
| |
llvm-svn: 310801
|
| |
|
|
| |
llvm-svn: 310799
|
| |
|
|
|
|
|
|
|
|
|
|
| |
environments
This allows using semicolons for bundling up more than one
statement per line. This is used within the mingw-w64 project in some
assembly files that contain code for multiple architectures.
Differential Revision: https://reviews.llvm.org/D36366
llvm-svn: 310797
|
| |
|
|
|
|
|
|
|
|
|
|
| |
answers for extracting 128-bits from a 512-bit vector and for mask registers.
Previously it would not return true for extracting either of the upper quarters of a 512-bit registers.
For mask registers we support extracting anything from index 0. And otherwise we only support extracting the upper half of a register.
Differential Revision: https://reviews.llvm.org/D36638
llvm-svn: 310794
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Without the SrcVT its hard to know what is really being asked for. For example if your target has 128, 256, and 512 bit vectors. Maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.
For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.
Reviewers: RKSimon, zvi, efriedma
Reviewed By: RKSimon
Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D36649
llvm-svn: 310793
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
information
This is a continuation patch for commit r307529 which completely replaces the scheduling information for the SandyBridge architecture target by modifying the file X86SchedSandyBridge.td located under the X86 Target (see also https://reviews.llvm.org/D35019).
In this patch we added the scheduling information of additional SNB instructions that were missing from the patch commit r307529, fixed the scheduling of several resource groups that include only port0 instead of port05 (i.e., port0 OR port5) and fixed several incorrect instructions' scheduling in the r307529 commit.
The patch also includes the X87 instructions which were missing in previous patch commit r307529 as reported in bugzilla bug 34080.
Reviewers: zvi, RKSimon, chandlerc, igorb, m_zuckerman, craig.topper, aymanmus, dim
Differential Revision: https://reviews.llvm.org/D36388
llvm-svn: 310792
|
| |
|
|
|
|
|
|
|
| |
K0 isn't expected as a write-mask, so provide a detailed error here, instead of the more generic one (invalid op for insn)
Conforms with gas
Differential Revision: https://reviews.llvm.org/D36570
llvm-svn: 310789
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Add an X86 combine for TESTM when one of the operands is a BUILD_VECTOR(0,0,...).
TESTM op0, BUILD_VECTOR(0,0,...) -> BUILD_VECTOR(0,0,...)
TESTM BUILD_VECTOR(0,0,...), op1 -> BUILD_VECTOR(0,0,...)
Differential Revision:
https://reviews.llvm.org/D36536
llvm-svn: 310787
|
| |
|
|
|
|
| |
The combines here shouldn't be done for mask vectors, but it wasn't clear anything was preventing that.
llvm-svn: 310786
|
| |
|
|
| |
llvm-svn: 310785
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
correct type so we don't crash if we use a memory instruction
Summary:
Previously we were creating the flag result with MVT::Other which is interpretted as a Chain node. If we used a memory form of the instruction we would end up with a copyToReg that consumed the chain result of the adcx instruction instead of the flag result.
Pretty sure we should be using MVT::i32 here, that's what we do other places we create these node types.
We should probably consider this for 5.0 as well.
Reviewers: RKSimon, zvi, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36645
llvm-svn: 310784
|
| |
|
|
|
|
|
|
|
|
|
|
| |
If all the operands of a BUILD_VECTOR extract elements from same vector then split the vector efficiently based on the maximum vector access index.
Reapplied with fix to only work with simple value types.
Committed on behalf of @jbhateja (Jatin Bhateja)
Differential Revision: https://reviews.llvm.org/D35788
llvm-svn: 310782
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
isThumb returns true for Thumb triples (little and big endian), isARM
returns true for ARM triples (little and big endian).
There are a few more checks using arm/thumb that are not covered by
those functions, e.g. that the architecture is either ARM or Thumb
(little endian) or ARM/Thumb little endian only.
Reviewers: javed.absar, rengolin, kristof.beyls, t.p.northover
Reviewed By: rengolin
Subscribers: llvm-commits, aemerson
Differential Revision: https://reviews.llvm.org/D34682
llvm-svn: 310781
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PR34037)
nsw, nuw, and exact carry implicit assumptions about their operands, so we need
to clear those after trivializing a value. We decided there was no danger for
llvm.assume or metadata, so there's just a comment about that.
This fixes miscompiles as shown in:
https://bugs.llvm.org/show_bug.cgi?id=33695
https://bugs.llvm.org/show_bug.cgi?id=34037
Differential Revision: https://reviews.llvm.org/D36592
llvm-svn: 310779
|
| |
|
|
|
|
|
|
|
|
|
| |
`external_weak` global
An `external_weak` global may be intended to resolve as a null pointer if it's
not defined, so it doesn't make sense to use a copy relocation for it.
Differential Revision: https://reviews.llvm.org/D36604
llvm-svn: 310773
|
| |
|
|
|
|
| |
(fprofile-instr-generate), Linux-only
llvm-svn: 310771
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The stack alignment depends on the ABI (16 bytes for N32 and N64 and 8
bytes for O32), not the CPU type.
Reviewers: sdardis
Reviewed By: sdardis
Subscribers: atanasyan, arichardson, llvm-commits
Differential Revision: https://reviews.llvm.org/D36326
llvm-svn: 310768
|
| |
|
|
|
|
| |
warnings; other minor fixes (NFC).
llvm-svn: 310766
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Updating remark API to newer OptimizationDiagnosticInfo API. This
allows remarks to show up in diagnostic yaml file, and enables use
of opt-viewer tool.
Hotness information for remarks (L505 and L751) do not display hotness
information, most likely due to profile information not being
propagated yet. Unsure if this is the desired outcome.
Patch by Tarun Rajendran.
Differential Revision: https://reviews.llvm.org/D36127
llvm-svn: 310763
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Previously we would use these instructions if sse was disabled and fastmath was enabled.
As mentioned in D28335, this is a bad idea.
Reviewers: efriedma, scanon, DavidKreitzer
Reviewed By: DavidKreitzer
Subscribers: zvi, llvm-commits
Differential Revision: https://reviews.llvm.org/D36344
llvm-svn: 310762
|
| |
|
|
|
|
|
|
|
|
| |
When the access to a weak symbol is not a call, the access has to be
able to produce the value 0 at runtime.
We were sometimes producing code sequences where that was not possible
if the code was leaded more than 4g away from 0.
llvm-svn: 310756
|
| |
|
|
|
|
| |
Handle the sibling call cases.
llvm-svn: 310753
|
| |
|
|
| |
llvm-svn: 310750
|
| |
|
|
|
|
|
|
|
|
|
| |
zero-instruction emission.
Two of the Windows bots are failing test\CodeGen\X86\GlobalISel\select-inc.mir
which should not have been affected by the change. Reverting while I investigate.
Also reverted r310735 because it builds on r310716.
llvm-svn: 310745
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we were writing an empty globals stream. Windows
tools interpret this as "private symbols are not present in
this PDB", even when they are, so we need to fix this. Regardless,
without it we don't have information about global variables, so
we need to fix it anyway. This patch does that.
With this patch, the "lm" command in WinDbg correctly reports
that we have private symbols available, but the "dv" command
still refuses to display local variables.
Differential Revision: https://reviews.llvm.org/D36535
llvm-svn: 310743
|
| |
|
|
|
|
| |
This only fixes a few things and serves as my initial test commit.
llvm-svn: 310742
|
| |
|
|
|
|
| |
Removed useless assert.
llvm-svn: 310738
|
| |
|
|
| |
llvm-svn: 310737
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pass does simplifications of well known AMD library calls.
If given -amdgpu-prelink option it works in a pre-link mode which
allows to reference new library functions which will be linked in
later.
In addition it also used to process traditional AMD option
-fuse-native which allows to replace some of the functions with
their fast native implementations from the library.
The necessary glue to pass the prelink option and translate
-fuse-native is to be added to the driver.
Differential Revision: https://reviews.llvm.org/D36436
llvm-svn: 310731
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This autoupgrades most of the broadcast intrinsics. They've been unused in clang for some time.
This leaves the 32x2 intrinsics because they are still used in clang.
Reviewers: RKSimon, zvi, igorb
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36606
llvm-svn: 310725
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This teaches 512-bit shuffles to detect unused halfs in order to reduce shuffle size.
We may need to refine the 512-bit exit point. I couldn't remember if we had good cross lane shuffles for 8/16 bit with AVX-512 or not.
I believe this is step towards being able to handle D36454 without a special case.
From here we need to improve our ability to combine extract_subvector with insert_subvector and other extract_subvectors. And we need to support narrowing binary operations where we don't demand all elements. This may be improvements to DAGCombiner::narrowExtractedVectorBinOp(by recognizing an insert_subvector in addition to concat) or we may need a target specific combiner.
Reviewers: RKSimon, zvi, delena, jbhateja
Reviewed By: RKSimon, jbhateja
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36601
llvm-svn: 310724
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous rev (r310208) failed to account for overflow when subtracting the
constants to see if they're suitable for shift/lea. This version add a check
for that and more test were added in r310490.
We can convert any select-of-constants to math ops:
http://rise4fun.com/Alive/d7d
For this patch, I'm enhancing an existing x86 transform that uses fake multiplies
(they always become shl/lea) to avoid cmov or branching. The current code misses
cases where we have a negative constant and a positive constant, so this is just
trying to plug that hole.
The DAGCombiner diff prevents us from hitting a terrible inefficiency: we can start
with a select in IR, create a select DAG node, convert it into a sext, convert it
back into a select, and then lower it to sext machine code.
Some notes about the test diffs:
1. 2010-08-04-MaskedSignedCompare.ll - We were creating control flow that didn't exist in the IR.
2. memcmp.ll - Choose -1 or 1 is the case that got me looking at this again. We could avoid the
push/pop in some cases if we used 'movzbl %al' instead of an xor on a different reg? That's a
post-DAG problem though.
3. mul-constant-result.ll - The trade-off between sbb+not vs. setne+neg could be addressed if
that's a regression, but those would always be nearly equivalent.
4. pr22338.ll and sext-i1.ll - These tests have undef operands, so we don't actually care about these diffs.
5. sbb.ll - This shows a win for what is likely a common case: choose -1 or 0.
6. select.ll - There's another borderline case here: cmp+sbb+or vs. test+set+lea? Also, sbb+not vs. setae+neg shows up again.
7. select_const.ll - These are motivating cases for the enhancement; replace cmov with cheaper ops.
Assembly differences between movzbl and xor to avoid a partial reg stall are caused later by the X86 Fixup SetCC pass.
Differential Revision: https://reviews.llvm.org/D35340
llvm-svn: 310717
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.
Depends on D35833
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D36084
llvm-svn: 310716
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Post commit review of rL308619 highlighted the need for handling N64
with -fno-pic. Testing reveale a stale assert when generating a GP
relative addressing mode.
This patch removes that assert and adds the necessary patterns for
MIPS64 to perform gp relative addressing with -fno-pic
(and the implicit -mno-abicalls + -mgpopt).
Reviewers: atanasyan, nitesh.jain
Differential Revision: https://reviews.llvm.org/D36472
llvm-svn: 310713
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Expose the dependencies of LLVMExecutionEngine library as PUBLIC rather
than PRIVATE when building a shared library. This is necessary because
the library is not contained but exposes API of other LLVM libraries via
its headers.
This causes other libraries to fail to link if the linker verifies for
correctness of -l flags (i.e. fails on indirect dependencies). This e.g.
happens when building LLDB against shared LLVM:
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTIN4llvm18MCJITMemoryManagerE[_ZTIN4llvm18MCJITMemoryManagerE]+0x10): undefined reference to `typeinfo for llvm::RuntimeDyld::MemoryManager'
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN4llvm18MCJITMemoryManagerE[_ZTVN4llvm18MCJITMemoryManagerE]+0x60): undefined reference to `llvm::RuntimeDyld::MemoryManager::anchor()'
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE[_ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE]+0x48): undefined reference to `llvm::RTDyldMemoryManager::deregisterEHFrames()'
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE[_ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE]+0x60): undefined reference to `llvm::RuntimeDyld::MemoryManager::anchor()'
lib64/liblldbExpression.a(IRExecutionUnit.cpp.o):(.data.rel.ro._ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE[_ZTVN12lldb_private15IRExecutionUnit13MemoryManagerE]+0xd0): undefined reference to `llvm::JITSymbolResolver::anchor()'
collect2: error: ld returned 1 exit status
Declaring the dependencies as PUBLIC guarantees that any package using
the ExecutionEngine library will also get explicit -l flags for
the dependent libraries guaranteeing that the symbols exposed in headers
could be resolved.
Patch originally written by NAKAMURA Takumi.
Differential Revision: https://reviews.llvm.org/D36211
llvm-svn: 310712
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Fix insert_subvector / extract_subvector merges of bitcast values.
Reviewers: efriedma, craig.topper, RKSimon
Subscribers: RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D34571
llvm-svn: 310711
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move store merge to happen after intrinsic lowering to allow lowered
stores to be merged.
Some regressions due in MergeConsecutiveStores to missing
insert_subvector that are addressed in follow up patch.
Reviewers: craig.topper, efriedma, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34559
llvm-svn: 310710
|
| |
|
|
|
|
|
|
|
| |
Add assembler and disassembler support for the ARMv8.3-A pointer
authentication instructions.
Differential Revision: https://reviews.llvm.org/D36517
llvm-svn: 310709
|
| |
|
|
|
|
|
|
|
| |
Commit r310480 added the AArch64 ARMv8.2a dot product instructions;
this adds the AArch32 instructions.
Differential Revision: https://reviews.llvm.org/D36575
llvm-svn: 310701
|
| |
|
|
|
|
|
|
| |
rL310372 enabled simplifyShuffleMask to support undef shuffle mask inputs, but its causing hangs.
Removing support until I can triage the problem
llvm-svn: 310699
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes PR32721 in IfConvertTriangle and possible similar problems in
IfConvertSimple, IfConvertDiamond and IfConvertForkedDiamond.
In PR32721 we had a triangle
EBB
| \
| |
| TBB
| /
FBB
where FBB didn't have any successors at all since it ended with an
unconditional return. Then TBB and FBB were be merged into EBB, but EBB
would still keep its successors, and the use of analyzeBranch and
CorrectExtraCFGEdges wouldn't help to remove them since the return
instruction is not analyzable (at least not on ARM).
The edge updating code and branch probability updating code is now pushed
into MergeBlocks() which allows us to share the same update logic between
more callsites. This lets us remove several dependencies on analyzeBranch
and completely eliminate RemoveExtraEdges.
One thing that showed up with this patch was that IfConversion sometimes
left a successor with 0% probability even if there was no branch or
fallthrough to the successor.
One such example from the test case ifcvt_bad_zero_prob_succ.mir. The
indirect branch tBRIND can only jump to bb.1, but without the patch we
got:
bb.0:
successors: %bb.1(0x80000000)
bb.1:
successors: %bb.1(0x80000000), %bb.2(0x00000000)
tBRIND %r1, 1, %cpsr
B %bb.1
bb.2:
There is no way to jump from bb.1 to bb2, but still there is a 0% edge
from bb.1 to bb.2.
With the patch applied we instead get the expected:
bb.0:
successors: %bb.1(0x80000000)
bb.1:
successors: %bb.1(0x80000000)
tBRIND %r1, 1, %cpsr
B %bb.1
Since bb.2 had no predecessor at all, it was removed.
Several testcases had to be updated due to this since the removed
successor made the "Branch Probability Basic Block Placement" pass
sometimes place blocks in a different order.
Finally added a couple of new test cases:
* PR32721_ifcvt_triangle_unanalyzable.mir:
Regression test for the original problem dexcribed in PR 32721.
* ifcvt_triangleWoCvtToNextEdge.mir:
Regression test for problem that caused a revert of my first attempt
to solve PR 32721.
* ifcvt_simple_bad_zero_prob_succ.mir:
Test case showing the problem where a wrong successor with 0% probability
was previously left.
* ifcvt_[diamond|forked_diamond|simple]_unanalyzable.mir
Very simple test cases for the simple and (forked) diamond cases
involving unanalyzable branches that can be nice to have as a base if
wanting to write more complicated tests.
Reviewers: iteratee, MatzeB, grosser, kparzysz
Reviewed By: kparzysz
Subscribers: kbarton, davide, aemerson, nemanjai, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D34099
llvm-svn: 310697
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
printing techniques with a DEBUG_TYPE controlling them.
It was a mistake to start re-purposing the pass manager `DebugLogging`
variable for generic debug printing -- those logs are intended to be
very minimal and primarily used for testing. More detailed and
comprehensive logging doesn't make sense there (it would only make for
brittle tests).
Moreover, we kept forgetting to propagate the `DebugLogging` variable to
various places making it also ineffective and/or unavailable. Switching
to `DEBUG_TYPE` makes this a non-issue.
llvm-svn: 310695
|
| |
|
|
|
|
|
|
|
|
|
| |
This fixes a MachineVerifier failure in machine-outliner.mir. Not explicitly
adding RegState::Define to the LR argument makes it unhappy because an explicit
definition is marked as a use.
Build failure:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/7496/testReport/junit/LLVM/CodeGen_AArch64/machine_outliner_mir/
llvm-svn: 310671
|
| |
|
|
|
|
|
|
| |
This reverts commit r310457.
It causes clang-produced IR to fail llvm codegen.
llvm-svn: 310662
|
| |
|
|
|
|
| |
This reverts commit r310648 which causes an unexpected assertion failure
llvm-svn: 310659
|
| |
|
|
|
|
|
|
| |
This also corrects the description to match what was actually implemented. The old comment said X^(C1|C2), but it implemented X^((C1|C2)&~(C1&C2)). I believe ((C1|C2)&~(C1&C2)) is equivalent to (C1^C2).
Differential Revision: https://reviews.llvm.org/D36505
llvm-svn: 310658
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Allow stores of bitcastable types to be merged by peeking through BITCAST nodes and recasting stored values constant and vector extract nodes as necessary.
Reviewers: jyknight, hfinkel, efriedma, RKSimon, spatel
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34569
llvm-svn: 310655
|
| |
|
|
| |
llvm-svn: 310654
|