| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
Pre-commit as discussed on D27657
llvm-svn: 289425
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have found that -- when the selected subarchitecture has a scheduling model
and we are not optimizing for size -- the machine-instruction combiner uses a
too-simple algorithm to compute the cost of one of the two alternatives [before
and after running a combining pass on a section of code], and therefor it throws
away the combination results too often.
This fix has the potential to help any ISA with the potential to combine
instructions and for which at least one subarchitecture has a scheduling model.
As of now, this is only known to definitely affect AArch64 subarchitectures with
a scheduling model.
Regression tested on AMD64/GNU-Linux, new test case tested to fail on an
unpatched compiler and pass on a patched compiler.
Patch by Abe Skolnik and Sebastian Pop.
llvm-svn: 289399
|
|
|
|
|
|
|
|
| |
from 'large element' scalar/vector to 'small element' vector.
Extension to D27129 which already supported bitcasts from 'small element' vector to 'large element' scalar/vector types.
llvm-svn: 289329
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM's use of DW_OP_bit_piece is incorrect and a based on a
misunderstanding of the wording in the DWARF specification. The offset
argument of DW_OP_bit_piece refers to the offset into the location
that is on the top of the DWARF expression stack, and not an offset
into the source variable. This has since also been clarified in the
DWARF specification.
This patch fixes all uses of DW_OP_bit_piece to emit the correct
offset and simplifies the DwarfExpression class to semi-automaticaly
emit empty DW_OP_pieces to adjust the offset of the source variable,
thus simplifying the code using DwarfExpression.
While this is an incompatible bugfix, in practice I don't expect this
to be much of a problem since LLVM's old interpretation and the
correct interpretation of DW_OP_bit_piece differ only when there are
gaps in the fragmented locations of the described variables or if
individual fragments are smaller than a byte. LLDB at least won't
interpret locations with gaps in them because is has no way to present
undefined bits in a variable, and there is a high probability that an
old-form expression will be malformed when interpreted correctly,
because the DW_OP_bit_piece offset will be outside of the location at
the top of the stack.
As a nice side-effect, this patch enables us to use a more efficient
encoding for subregisters: In order to express a sub-register at a
non-zero offset we now use a DW_OP_bit_piece instead of shifting the
value into place manually.
This patch also adds missing test coverage for code paths that weren't
exercised before.
<rdar://problem/29335809>
Differential Revision: https://reviews.llvm.org/D27550
llvm-svn: 289266
|
|
|
|
|
|
|
|
|
| |
Like DBG_VALUE, these emit nothing to the .text section, and sometimes
have no source location specified. Just ignore them.
Differential Revision: http://reviews.llvm.org/D27492
llvm-svn: 289256
|
|
|
|
|
|
| |
Reapplied with fix for PR31323 - X86 SSE2 vXi16 multiplies for illegal types were creating CONCAT_VECTORS nodes with vector inputs that might not total the number of elements in the result type.
llvm-svn: 289232
|
|
|
|
| |
llvm-svn: 289231
|
|
|
|
|
|
|
|
| |
UseAA is enabled."
This reverts commit r289221 which appears to be triggering an assertion
llvm-svn: 289226
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
enabled.
Retrying after fixing overly aggressive load-store forwarding optimization.
Simplify Consecutive Merge Store Candidate Search
Now that address aliasing is much less conservative, push through
simplified store merging search which only checks for parallel stores
through the chain subgraph. This is cleaner as the separation of
non-interfering loads/stores from the store-merging logic.
Whem merging stores, search up the chain through a single load, and
finds all possible stores by looking down from through a load and a
TokenFactor to all stores visited. This improves the quality of the
output SelectionDAG and generally the output CodeGen (with some
exceptions).
Additional Minor Changes:
1. Finishes removing unused AliasLoad code
2. Unifies the the chain aggregation in the merged stores across
code paths
3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.
4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seemed sufficient to not cause regressions in
tests.
This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.
Many tests required some changes as memory operations are now
reorderable. Some tests relying on the order were changed to use
volatile memory operations
Noteworthy tests:
CodeGen/AArch64/argument-blocks.ll -
It's not entirely clear what the test_varargs_stackalign test is
supposed to be asserting, but the new code looks right.
CodeGen/AArch64/arm64-memset-inline.lli -
CodeGen/AArch64/arm64-stur.ll -
CodeGen/ARM/memset-inline.ll -
The backend now generates *worse* code due to store merging
succeeding, as we do do a 16-byte constant-zero store efficiently.
CodeGen/AArch64/merge-store.ll -
Improved, but there still seems to be an extraneous vector insert
from an element to itself?
CodeGen/PowerPC/ppc64-align-long-double.ll -
Worse code emitted in this case, due to the improved store->load
forwarding.
CodeGen/X86/dag-merge-fast-accesses.ll -
CodeGen/X86/MergeConsecutiveStores.ll -
CodeGen/X86/stores-merging.ll -
CodeGen/Mips/load-store-left-right.ll -
Restored correct merging of non-aligned stores
CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
Improved. Correctly merges buffer_store_dword calls
CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
Improved. Sidesteps loading a stored value and
merges two stores
CodeGen/X86/pr18023.ll -
This test has been removed, as it was asserting incorrect
behavior. Non-volatile stores *CAN* be moved past volatile loads,
and now are.
CodeGen/X86/vector-idiv.ll -
CodeGen/X86/vector-lzcnt-128.ll -
It's basically impossible to tell what these tests are actually
testing. But, looks like the code got better due to the memory
operations being recognized as non-aliasing.
CodeGen/X86/win32-eh.ll -
Both loads of the securitycookie are now merged.
Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel
Differential Revision: https://reviews.llvm.org/D14834
llvm-svn: 289221
|
|
|
|
| |
llvm-svn: 289220
|
|
|
|
|
|
| |
Makes interception of BUILD_VECTOR creation easier for debugging.
llvm-svn: 289218
|
|
|
|
|
|
| |
Part of the work for PR31323 - add extra asserts checking that the input vectors are of consistent type and result in the correct number of vector elements.
llvm-svn: 289214
|
|
|
|
| |
llvm-svn: 289208
|
|
|
|
|
|
|
|
|
|
| |
Adds support for bitcasting a little endian 'small element' vector to 'large element' scalar/vector (e.g. v16i8 to v4i32 or v2i32 to i64), which is required for PR30845. We extract the knownbits for each 'small element' part and concatenate the results together.
We can add support for big endian and 'large element' scalar/vector to 'small element' vector bitcasting once we have test cases for them.
Differential Revision: https://reviews.llvm.org/D27129
llvm-svn: 289200
|
|
|
|
|
|
|
|
| |
This reverts commit r288916 as it is currently causing a crasher in
Halide. Reproducer on llvm.org/PR31323. While it might be that halide is
generating invalid IR, llc shouldn't crash.
llvm-svn: 289194
|
|
|
|
|
|
|
| |
Supporting them properly is a reasonably complex chunk of work, so to allow bot
testing before then we should at least be able to fall back to DAG ISel.
llvm-svn: 289150
|
|
|
|
| |
llvm-svn: 289149
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were falsely claiming that we had an LSDA for the relevant EH
personality before this change, which could lead to the EH machinery
interpreting random adjacent data as an LSDA.
Fixes PR31317
This change is safe because cleanups can't contain exception handlers
today. We do these things to maintain that invariant:
- C++ destructors are naturally out-of-line
- __finally blocks are outlined in clang
- LLVM's inliner will not inline EH constructs into cleanups
llvm-svn: 289101
|
|
|
|
| |
llvm-svn: 289060
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Most targets set the action for these nodes to Expand even though there
isn't actually any code for them in ExpandNode. Instead, targets simply
relied on the fact that no code generates these nodes as long as the
nodes aren't legal or custom.
However, generating these nodes can be useful e.g. for divide-by-constant
in wider integer types.
Expand of [US]MUL_LOHI will use MULH[US] when legal or custom, and
a sequence of half-width multiplications otherwise. Promote uses a wider
multiply.
This patch intends to not change the generated code, but indirect effects
are possible since expansions/promotions that were previously done in
DAGCombine may now be done in LegalizeDAG.
See D24822 for a change that actually uses the new expansion.
Reviewers: spatel, bkramer, venkatra, efriedma, hfinkel, ast, nadav, tstellarAMD
Subscribers: arsenm, jyknight, nemanjai, wdng, nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D24956
llvm-svn: 289050
|
|
|
|
|
|
|
|
|
| |
So far it creates a test helper and so it should be moved there. It also
create a layering cycle between CodeGen and CodeGen/AsmPrinter, which
should be avoided.
Review: https://reviews.llvm.org/D27570
llvm-svn: 289044
|
|
|
|
| |
llvm-svn: 289038
|
|
|
|
|
|
| |
Appears to break on build bots. Reverting pending investigation.
llvm-svn: 289014
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The relocations for `DIEEntry::EmitValue` were wrong for Win64
(emitting FK_Data_4 instead of FK_SecRel_4). This corrects that
oversight so that the DWARF data is correct in Win64 COFF files.
Fixes PR15393.
Patch by Jameson Nash <jameson@juliacomputing.com> based on a patch
by David Majnemer.
Differential Revision: https://reviews.llvm.org/D21731
llvm-svn: 289013
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only tests we have for the DWARF parser are the tests that use llvm-dwarfdump and expect output from textual dumps.
More DWARF parser modification are coming in the next few weeks and I wanted to add tests that can verify that we can encode and decode all form types, as well as test some other basic DWARF APIs where we ask DIE objects for their children and siblings.
DwarfGenerator.cpp was added in the lib/CodeGen directory. This file contains the code necessary to easily create DWARF for tests:
dwarfgen::Generator DG;
Triple Triple("x86_64--");
bool success = DG.init(Triple, Version);
if (!success)
return;
dwarfgen::CompileUnit &CU = DG.addCompileUnit();
dwarfgen::DIE CUDie = CU.getUnitDIE();
CUDie.addAttribute(DW_AT_name, DW_FORM_strp, "/tmp/main.c");
CUDie.addAttribute(DW_AT_language, DW_FORM_data2, DW_LANG_C);
dwarfgen::DIE SubprogramDie = CUDie.addChild(DW_TAG_subprogram);
SubprogramDie.addAttribute(DW_AT_name, DW_FORM_strp, "main");
SubprogramDie.addAttribute(DW_AT_low_pc, DW_FORM_addr, 0x1000U);
SubprogramDie.addAttribute(DW_AT_high_pc, DW_FORM_addr, 0x2000U);
dwarfgen::DIE IntDie = CUDie.addChild(DW_TAG_base_type);
IntDie.addAttribute(DW_AT_name, DW_FORM_strp, "int");
IntDie.addAttribute(DW_AT_encoding, DW_FORM_data1, DW_ATE_signed);
IntDie.addAttribute(DW_AT_byte_size, DW_FORM_data1, 4);
dwarfgen::DIE ArgcDie = SubprogramDie.addChild(DW_TAG_formal_parameter);
ArgcDie.addAttribute(DW_AT_name, DW_FORM_strp, "argc");
// ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref4, IntDie);
ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref_addr, IntDie);
StringRef FileBytes = DG.generate();
MemoryBufferRef FileBuffer(FileBytes, "dwarf");
auto Obj = object::ObjectFile::createObjectFile(FileBuffer);
EXPECT_TRUE((bool)Obj);
DWARFContextInMemory DwarfContext(*Obj.get());
This code is backed by the AsmPrinter code that emits DWARF for the actual compiler.
While adding unit tests it was discovered that DIEValue that used DIEEntry as their values had bugs where DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref8, and DW_FORM_ref_udata forms were not supported. These are all now supported. Added support for DW_FORM_string so we can emit inlined C strings.
Centralized the code to unique abbreviations into a new DIEAbbrevSet class and made both the dwarfgen::Generator and the llvm::DwarfFile classes use the new class.
Fixed comments in the llvm::DIE class so that the Offset is known to be the compile/type unit offset.
DIEInteger now supports more DW_FORM values.
There are also unit tests that cover:
Encoding and decoding all form types and values
Encoding and decoding all reference types (DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref4, DW_FORM_ref8, DW_FORM_ref_udata, DW_FORM_ref_addr) including cross compile unit references with that go forward one compile unit and backward on compile unit.
Differential Revision: https://reviews.llvm.org/D27326
llvm-svn: 289010
|
|
|
|
| |
llvm-svn: 289003
|
|
|
|
| |
llvm-svn: 289002
|
|
|
|
|
|
|
|
| |
list.
Since r287792 if we try to do that we will hit an assert.
llvm-svn: 289001
|
|
|
|
|
|
|
|
| |
ConstantExpr instances were emitting code into the current block rather than
the entry block. This meant they didn't necessarily dominate all uses, which is
clearly wrong.
llvm-svn: 288985
|
|
|
|
|
|
|
| |
Having to ask the MIRBuilder for the current function is a little awkward, and
I'm intending to improve how that's threaded through anyway.
llvm-svn: 288983
|
|
|
|
|
|
|
|
|
|
|
|
| |
MachineIRBuilder had weird before/after and beginning/end flags for the insert
point. Unfortunately the non-default means that instructions will be inserted
in reverse order which is almost never what anyone wants.
Really, I think we just want (like IRBuilder has) the ability to insert at any
C++ iterator-style point (i.e. before any instruction or before MBB.end()). So
this fixes MIRBuilders to behave like IRBuilders in this respect.
llvm-svn: 288980
|
|
|
|
|
|
| |
SMAX/SMIN/UMAX/UMIN opcodes
llvm-svn: 288926
|
|
|
|
| |
llvm-svn: 288916
|
|
|
|
|
|
| |
EXTRACT_VECTOR_ELT does support demanded elts if the element index is known and in range.
llvm-svn: 288913
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On some platforms (like MSP430) the second element of the result
structure for SMULO/UMULO may have a shorter type than the one
returned by SetCC. We need to truncate it to the right type, or
else some incorrect code may be generated later on.
This fixes issue https://github.com/rust-lang/rust/issues/37829
Patch by Vadzim Dambrouski!
Differential Revision: https://reviews.llvm.org/D27154
llvm-svn: 288857
|
|
|
|
|
|
|
| |
We were rounding size in bits down rather than up, leading to 0-sized slots for
i1 (assert!) and bugs for other types not byte-aligned.
llvm-svn: 288848
|
|
|
|
|
|
|
|
| |
Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted.
Differential Revision: https://reviews.llvm.org/D27461
llvm-svn: 288842
|
|
|
|
| |
llvm-svn: 288840
|
|
|
|
|
|
| |
we don't actually demand that element
llvm-svn: 288839
|
|
|
|
|
|
|
|
|
|
| |
There were two problems:
+ AArch64 was reusing random data from its binary op tables, which is
complete nonsense for G_SEQUENCE.
+ Even when AArch64 gave up and said it couldn't handle G_SEQUENCE,
the generic code asserted.
llvm-svn: 288836
|
|
|
|
| |
llvm-svn: 288835
|
|
|
|
|
|
|
|
| |
It'll almost immediately fail because it always tries to half/double the size
until it finds a legal one. Unfortunately, this triggers an assertion
preventing the DAG fallback from being possible.
llvm-svn: 288834
|
|
|
|
| |
llvm-svn: 288814
|
|
|
|
|
|
|
|
|
|
| |
Summary: Add missing parens in assert, which warn in GCC.
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27448
llvm-svn: 288792
|
|
|
|
|
|
|
|
| |
The function used to finish off PHIs by adding the relevant basic blocks can
fail if we're aborting and still don't actually have the needed
MachineBasicBlocks. So avoid trying in that case.
llvm-svn: 288727
|
|
|
|
|
|
|
|
|
| |
When the entry block was empty after arg lowering, we were always placing
constants at the end. This is probably hamrless while translating the same
block, but horribly wrong once its terminator has been translated. So switch to
inserting at the beginning.
llvm-svn: 288720
|
|
|
|
| |
llvm-svn: 288717
|
|
|
|
| |
llvm-svn: 288713
|
|
|
|
|
|
|
|
| |
This makes it more similar to the floating-point constant, and also allows for
larger constants to be translated later. There's no real functional change in
this patch though, just syntax updates.
llvm-svn: 288712
|
|
|
|
|
|
|
|
| |
Returning 0 (NoReg) from getOrCreateVReg leads to unexpected situations later
in the translation. It's better to return a valid (if undefined) register and
let the rest of the instruction carry on as planned.
llvm-svn: 288709
|