| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
When the first instruction of a basic block has no location (consider a
LEA materializing the address of an alloca for a call), we want to start
the line table for the block with the first valid source location in the
block. We need to ignore DBG_VALUE instructions during this scan to get
decent line tables.
llvm-svn: 309628
|
| |
|
|
| |
llvm-svn: 309627
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
D33925 added a control flow simplification for -O2 --lto-O0 builds that
manually splits blocks and reassigns conditional branches but does not
correctly update phi nodes. If the else case being branched to had
incoming phi nodes the control-flow simplification would leave phi nodes
in that BB with an unhandled predecessor.
Patch by Vlad Tsyrklevich!
Differential Revision: https://reviews.llvm.org/D36012
llvm-svn: 309621
|
| |
|
|
|
|
| |
flags for one test (for now)
llvm-svn: 309615
|
| |
|
|
|
|
|
| |
Fix for pr30418 - error in backend: Cannot select: t17: x86mmx = select_cc t2, Constant:i64<0>, t7, t8, seteq:ch
Differential Revision: https://reviews.llvm.org/D34661
llvm-svn: 309614
|
| |
|
|
| |
llvm-svn: 309611
|
| |
|
|
| |
llvm-svn: 309610
|
| |
|
|
|
|
|
|
|
|
| |
We don't write any actual symbols to this stream yet, but for
now we just create the stream and hook it up to the appropriate
places and give it a valid header.
Differential Revision: https://reviews.llvm.org/D35290
llvm-svn: 309608
|
| |
|
|
|
|
|
| |
GCC was complaining about `&&` within `||` without explicit
parentheses. NFCI.
llvm-svn: 309606
|
| |
|
|
|
|
|
|
|
|
| |
This intrinsic clears the upper bits starting at a specified index. If the index is a constant we can do some simplifications.
This could be in InstSimplify, but we don't handle any target specific intrinsics there today.
Differential Revision: https://reviews.llvm.org/D36069
llvm-svn: 309604
|
| |
|
|
|
|
|
|
|
|
| |
This patch adds simplification support for the BEXTR/BEXTRI intrinsics to match gcc. This only supports cases that fold to 0 or can be fully constant folded. Theoretically we could support converting to AND if the shift part is unused or to only a shift if the mask doesn't modify any bits after an equivalent shl. gcc doesn't do these transformations either.
I put this in InstCombine, but it could be done in InstSimplify. It would be the first target specific intrinsic in InstSimplify.
Differential Revision: https://reviews.llvm.org/D36063
llvm-svn: 309603
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch refactors the code used in llc such that all the users of the
addPassesToEmitFile API have access to a homogeneous way of handling
start/stop-after/before options right out of the box.
In particular, just invoking addPassesToEmitFile will set the proper
pipeline without additional effort (modulo parsing a .mir file if the
start-before/after options are used.
NFC.
Differential Revision: https://reviews.llvm.org/D30913
llvm-svn: 309599
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As noted in the code comment, transforming this in the other direction might require
a separate transform here in CGP given the block-at-a-time DAG constraint.
Besides that theoretical motivation, there are 2 practical motivations for the
subtract-of-cmps form:
1. The codegen for both x86 and PPC is better for this IR (though PPC could be better still).
There is discussion about canonicalizing IR to the select form
( http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html ),
so we probably need to add DAG transforms for those patterns anyway, but this improves the
memcmp output without waiting for that step.
2. If we allow vector-sized chunks for the load and compare, x86 is better prepared to convert
that to optimal code when using subtract-of-cmps, so another prerequisite patch is avoided
if we choose to enable that.
Differential Revision: https://reviews.llvm.org/D34904
llvm-svn: 309597
|
| |
|
|
|
|
|
|
| |
verifies that the atom tag is actually the same with the tag of the DIE that we retrieve from the table.
Differential Revision: https://reviews.llvm.org/D35963
llvm-svn: 309596
|
| |
|
|
|
|
|
| |
We are not allowed to reason about an initializer value without first
consulting hasDefinitiveInitializer.
llvm-svn: 309594
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
vmovdqa64/vmovdqu64 instead.
These were taking priority over the aligned load instructions since there is no vmovda8/16. I don't think there is really a difference between aligned and unaligned on newer cpus so I don't think it matters which instructions we use.
But with this change we reduce the size of the isel table a little and we allow the aligned information to pass through to the evex->vec pass and produce the same output has avx/avx2 in some cases.
I also generally dislike patterns rooted in a bitcast which these were.
Differential Revision: https://reviews.llvm.org/D35977
llvm-svn: 309589
|
| |
|
|
| |
llvm-svn: 309584
|
| |
|
|
| |
llvm-svn: 309583
|
| |
|
|
|
|
|
|
|
| |
Updated AArch64 to widen destination to s32.
https://reviews.llvm.org/D35737
Reviewed by Tim
llvm-svn: 309579
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: As per title. This creates useless recombines.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33848
llvm-svn: 309578
|
| |
|
|
| |
llvm-svn: 309576
|
| |
|
|
|
|
| |
GCC7 did not warn about that, but Clang does.
llvm-svn: 309573
|
| |
|
|
|
|
| |
This fixes a buildbot failure with -Werror introduced by r309553
llvm-svn: 309572
|
| |
|
|
|
|
|
|
|
|
| |
DIEs are lazily deserialized so it's possible that the DWO CU is created
before the DIE is parsed. DWO shares .debug_addr and .debug_ranges with the
object file so overwriting the offset with 0 will make the CU unusable.
No test case because I couldn't get clang to emit a non-zero range base.
llvm-svn: 309570
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: All getReductionCost() functions are renamed to getArithmeticReductionCost() + added basic infrastructure to handle non-binary reduction operations.
Reviewers: spatel, mzolotukhin, Ayal, mkuper, gilr, hfinkel
Subscribers: RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D29402
llvm-svn: 309566
|
| |
|
|
| |
llvm-svn: 309563
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PR33883 shows that calls to intrinsic functions should not have their vector
arguments or returns subject to ABI changes required by the target.
This resolves PR33883.
Thanks to Alex Crichton for reporting the issue!
Reviewers: zoran.jovanovic, atanasyan
Differential Revision: https://reviews.llvm.org/D35765
llvm-svn: 309561
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Loop Vectorizer generates redundant operations when manipulating masks:
AND with true, OR with false, compare equal to true. Instead of relying on
a subsequent pass to clean them up, this patch avoids generating them.
Use null (no-mask) to represent all-one full masks, instead of a constant
all-one vector, following the convention of masked gathers and scatters.
Preparing for a follow-up VPlan patch in which these mask manipulating
operations are modeled using recipes.
Differential Revision: https://reviews.llvm.org/D35725
llvm-svn: 309558
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the created object files for the import library were broken.
Write the symbol table before the string table. Simplify the code by
using a separate variable Prefix instead of duplicating a few lines.
Also update the coff-weak-exports to actually check that the generated
weak symbols can be found as intended.
Differential Revision: https://reviews.llvm.org/D36065
llvm-svn: 309555
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Since r293359, most dump() function are only defined when
`!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions
only used by dump() functions are now unused in release builds,
generating lots of warnings. This patch only defines some print()
functions if they are used.
Reviewers: MatzeB
Reviewed By: MatzeB
Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D35949
llvm-svn: 309553
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
value < 0.
Found it during work on LLD, it would crash on following
linker script:
SECTIONS { .foo : { *("*®") } }
That happens because ® has int value -82. And chars are used as
array index in code, and are signed by default.
Differential revision: https://reviews.llvm.org/D35891
llvm-svn: 309549
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Without any information about the called function, we cannot be sure
that it is safe to interchange loops which contain function calls. For
example there could be dependences that prevent interchanging between
accesses in the called function and the loops. Even functions without any
parameters could cause problems, as they could access memory using
global pointers.
For now, I think it is only safe to interchange loops with calls marked
as readnone.
With this patch, the LLVM test suite passes with `-O3 -mllvm
-enable-loopinterchange` and LoopInterchangeProfitability::isProfitable
returning true for all loops. check-llvm and check-clang also pass when
bootstrapped in a similar fashion, although only 3 loops got
interchanged.
Reviewers: karthikthecool, blitz.opensource, hfinkel, mcrosier, mkuper
Reviewed By: mcrosier
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D35489
llvm-svn: 309547
|
| |
|
|
|
|
|
|
|
|
|
| |
Added patterns to recognize AND 1 on the mask of a scalar masked
move is not needed since only the lower bit is relevant for the
instruction.
Differential Revision:
https://reviews.llvm.org/D35897
llvm-svn: 309546
|
| |
|
|
|
|
|
|
| |
Changed method names based on the discussion in https://reviews.llvm.org/D34986:
getInt64 -> selectI64Imm,
getInt64Count -> selectI64ImmInstrCount.
llvm-svn: 309541
|
| |
|
|
|
|
|
|
| |
load involved.
We already had a pattern without load, but with a load we were falling back to a regular 'and' due to pattern complexity priority.
llvm-svn: 309535
|
| |
|
|
|
|
|
| |
Missed the resetting base address selections when going from a base
address version to zero base address for non-base-addressed entries.
llvm-svn: 309529
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
relocations
(from comments in the test)
Group ranges in a range list that apply to the same section and use a base
address selection entry to reduce the number of relocations to one reloc per
section per range list. DWARF5 debug_rnglist will be more efficient than this
in terms of relocations, but it's still better than one reloc per entry in a
range list.
This is an object/executable size tradeoff - shrinking objects, but growing
the linked executable. In one large binary tested, total object size (not just
debug info) shrank by 16%, entirely relocation entries. Linked executable
grew by 4%. This was with compressed debug info in the objects, uncompressed
in the linked executable. Without compression in the objects, the win would be
smaller (the growth of debug_ranges itself would be more significant).
llvm-svn: 309526
|
| |
|
|
|
|
| |
Not sure quite how I failed so clearly to test this, but anyway.
llvm-svn: 309514
|
| |
|
|
|
|
|
|
|
| |
MS ignores the keyword "short" when used after a jc/jz instruction, LLVM ought to do the same.
Test: D35893
Differential Revision: https://reviews.llvm.org/D35892
llvm-svn: 309509
|
| |
|
|
|
|
|
|
|
|
|
| |
I was a bit lazy when I first implemented this & skipped the index
lookup - obviously for large files this becomes pretty crucial, so here
we go, do the index lookup. Speeds up large DWP symbolizing by... lots.
(20m -> 20s, actually, maybe more in a release build (that was a release
build without index lookup, compared to a debug/non-release build with
the index usage))
llvm-svn: 309507
|
| |
|
|
|
|
| |
single set of isel patterns.
llvm-svn: 309502
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
optimization phase.
Summary: This is in preparation of https://reviews.llvm.org/D36052
Reviewers: chandlerc, davidxl, tejohnson
Reviewed By: chandlerc
Subscribers: sanjoy, llvm-commits
Differential Revision: https://reviews.llvm.org/D36053
llvm-svn: 309500
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If you've archived the DWP file somewhere it's probably useful to be
able to just tell llvm-symbolizer where it is when you're symbolizing
stack traces from the binary.
This only provides a mechanism for specifying a single DWP file, good if
you're symbolizing a program with a single DWP file, but it's likely if
the program is dynamically linked that you might have a DWP for each
dynamic library - in which case this feature won't help (at least as
it's surfaced in llvm-symbolizer for now) - in theory it could be
extended to specify a collection of DWP files that could all be
consulted for split CU hash resolution.
llvm-svn: 309498
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fixes PR33790.
This patch still needs a yaml-style test, which I shall write tomorrow
Reviewers: anemet
Reviewed By: anemet
Subscribers: anemet, llvm-commits
Differential Revision: https://reviews.llvm.org/D35981
llvm-svn: 309497
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Most CPUs implementing AES fusion require instruction pairs of the form
AESE Vn, _
AESMC Vn, Vn
and
AESD Vn, _
AESIMC Vn, Vn
The constraint is added to AES(I)MC instructions which use the result of
an AES(E|D) instruction by using AES(I)MCTrr pseudo instructions, which
constraint source and destination registers to be the same.
A nice side effect of this change is that now all possible pairs are
scheduled back-to-back on the exynos-m1 for the misched-fusion-aes.ll
test case.
I had to update aes_load_store. The version I added initially was very
reduced and with the new constraint, AESE/AESMC could not be scheduled
back-to-back. I updated the test to be more realistic and still expose
the same scheduling problem as the initial test case.
Reviewers: t.p.northover, rengolin, evandro, kristof.beyls, silviu.baranga
Reviewed By: t.p.northover, evandro
Subscribers: aemerson, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D35299
llvm-svn: 309495
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change gives a 0.25% speedup on execution time, a 0.82% improvement
in benchmark scores and a 0.20% increase in binary size on a Cortex-A53.
These numbers are the geomean results on a wide range of benchmarks from
the test-suite and a range of proprietary suites.
Reviewers: t.p.northover, aadg, silviu.baranga, mcrosier, rengolin
Reviewed By: rengolin
Subscribers: grimar, davide, aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D35568
llvm-svn: 309494
|
| |
|
|
|
|
|
|
|
|
| |
Rather than passing along most of the parameters, pass a reference to
the MCDWARFrameInfo instead. This makes it easier to pass additional
information about the frame to the checks. We need to keep the extra
constructor for the Key around to allow the construction of the null and
tombstone keys. NFC.
llvm-svn: 309493
|
| |
|
|
|
|
|
|
| |
If the return column is different, we cannot coalesce the CIE across the
FDEs. Add that to the key calculation. This ensures that we emit a
separate CIE.
llvm-svn: 309492
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch is in 2 parts:
1 - replace combineBT's use of SimplifyDemandedBits (hasOneUse only) with SelectionDAG::GetDemandedBits to more aggressively determine the lower bits used by BT.
2 - update SelectionDAG::GetDemandedBits to support ANY_EXTEND - if the demanded bits are only in the non-extended portion, then peek through and demand from the source value and then ANY_EXTEND that if we found a match.
Differential Revision: https://reviews.llvm.org/D35896
llvm-svn: 309486
|
| |
|
|
| |
llvm-svn: 309480
|