| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
| |
This patch enables fusing dependent AESE/AESMC and AESD/AESIMC
instruction pairs on Cortex-A72, as recommended in the Software
Optimization Guide, section 4.10.
llvm-svn: 303073
|
| |
|
|
|
|
|
|
|
|
| |
See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936
Reviewers: artem.tamazov, vpykhtin
Differential Revision: https://reviews.llvm.org/D33123
llvm-svn: 303070
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This instruction does not really exist
See Bug 33018: https://bugs.llvm.org//show_bug.cgi?id=33018
Reviewers: vpykhtin, artem.tamazov
Differential Revision: https://reviews.llvm.org/D33126
llvm-svn: 303055
|
| |
|
|
|
|
|
|
|
|
|
| |
Doing this means that if an LEApcrel is used in two places we will rematerialize
instead of generating two MOVs. This is particularly useful for printfs using
the same format string, where we want to generate an address into a register
that's going to get corrupted by the call.
Differential Revision: https://reviews.llvm.org/D32858
llvm-svn: 303054
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Doing this lets us hoist it out of loops, and I've also marked it as
rematerializable the same as the thumb1 and thumb2 counterparts.
It looks like it being marked as such was just a mistake, as the commit that
made that change only mentions LEApcrelJT and in thumb1 and thumb2 only the
LEApcrelJT instructions were marked as having side-effects, so it looks like
the intent was to only mark LEApcrelJT as having side-effects but LEApcrel was
accidentally marked as such also.
Differential Revision: https://reviews.llvm.org/D32857
llvm-svn: 303053
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I am working on a speedup of building .gdb_index in LLD and
noticed that relocations that are proccessed in DWARFContextInMemory often uses
the same symbol in a row. This patch introduces caching to reduce the relocations
proccessing time.
For benchmark,
I took debug LLC binary objects configured with -ggnu-pubnames and linked it using LLD.
Link time without --gdb-index is about 4,45s.
Link time with --gdb-index: a) Without patch: 19,16s b) With patch: 15,52s
That means time spent on --gdb-index in this configuration is
19,16s - 4,45s = 14,71s (without patch) vs 15,52s - 4,45s = 11,07s (with patch).
Differential revision: https://reviews.llvm.org/D31136
llvm-svn: 303051
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
intrinsics to run also on -O0 option.
Currently, when masked load, store, gather or scatter intrinsics are used, we check in CodeGenPrepare pass if the subtarget support these intrinsics, if not we replace them with scalar code - this is a functional transformation not an optimization (not optional).
CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0).
Functional transformation should run with all optimization levels, so here I created a new pass which runs on all optimization levels and does no more than this transformation.
Differential Revision: https://reviews.llvm.org/D32487
llvm-svn: 303050
|
| |
|
|
|
|
|
|
| |
SelectionDAG::getMemIntrinsicNode. NFC.
NFC followup to D33147, this explicitly sets all the arguments (instead of relying on the defaults) to SelectionDAG::getMemIntrinsicNode to help identify -verify-machineinstrs issues.
llvm-svn: 303047
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We were asserting in RegisterBankInfo if RBI.copyCost() returns
UINT_MAX. This is OK for RegBankSelect::Mode::Fast since we only
try one instruction mapping and can't recover from this, but for
RegBankSelect::Mode::Greedy we will be considering multiple
instruction mappings, so we can recover if we see a UNIT_MAX copy
cost.
The copy cost for one pair of register banks in the AMDGPU backend
will be UNIT_MAX, so this patch will prevent AMDGPU tests from
breaking.
Reviewers: ab, qcolombet, t.p.northover, dsanders
Reviewed By: qcolombet
Subscribers: tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D33144
llvm-svn: 303043
|
| |
|
|
|
|
|
|
|
| |
We were previously silently emitting bogus data in release mode,
making it very hard to diagnose the error, or crashing with an
assert in debug mode. A proper diagnostic is now always emitted
when the value to be emitted is out of range.
llvm-svn: 303041
|
| |
|
|
|
|
|
|
| |
This patch finishes off the conversion of ComputeSignBit to computeKnownBits.
Differential Revision: https://reviews.llvm.org/D33166
llvm-svn: 303035
|
| |
|
|
|
|
|
| |
I need to add some asserts to these constructors that are easier to
add once they're in the .cpp file.
llvm-svn: 303032
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ValueTracking
Summary:
Merge overflow computation for signed add,
appearing both in InstCombine and ValueTracking.
As part of the merge,
cleanup the interface for overflow checks in InstCombine.
Patch by Yoav Ben-Shalom.
Reviewers: craig.topper, majnemer
Reviewed By: craig.topper
Subscribers: takuto.ikuta, llvm-commits
Differential Revision: https://reviews.llvm.org/D32946
llvm-svn: 303029
|
| |
|
|
| |
llvm-svn: 303028
|
| |
|
|
|
|
|
|
|
|
| |
Replace SelectionDAG::getNode(ISD::SELECT, ...)
and SelectionDAG::getNode(ISD::VSELECT, ...)
with SelectionDAG::getSelect(...)
Saves a few lines of code and in some cases saves the need to explicitly
check the type of the desired node.
llvm-svn: 303024
|
| |
|
|
| |
llvm-svn: 303023
|
| |
|
|
| |
llvm-svn: 303022
|
| |
|
|
| |
llvm-svn: 303021
|
| |
|
|
| |
llvm-svn: 303018
|
| |
|
|
|
|
| |
sequences
llvm-svn: 303017
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
After rL301562, on FreeBSD the DynamicLibrary unittests fail, because
the test uses getMainExecutable("DynamicLibraryTests", Ptr), and since
the path does not contain any slashes, retrieving the main executable
will not work.
Reimplement getMainExecutable() for FreeBSD and NetBSD using sysctl(3),
which is more reliable than fiddling with relative or absolute paths.
Also add retrieval of the original argv[] from the GoogleTest framework,
to use as a fallback for other OSes.
Reviewers: emaste, marsupial, hans, krytarowski
Reviewed By: krytarowski
Subscribers: krytarowski, llvm-commits
Differential Revision: https://reviews.llvm.org/D33171
llvm-svn: 303015
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Running `llvm-readobj -coff-directives msvcrt.lib` resulted in this error:
Invalid data was encountered while parsing the file
This happened because some of the object files in the archive have empty
`.drectve` sections. These empty sections result in a `parse_failed` error being
returned from `COFFObjectFile::getSectionContents()`, which in turn caused
`llvm-readobj` to stop. With this change, `getSectionContents` now returns
success, and like before the resulting array is empty.
Patch by Dave Lee.
Differential Revision: https://reviews.llvm.org/D32652
llvm-svn: 303014
|
| |
|
|
|
|
|
|
| |
mask.
Tweak cost model to match what lowering actually does.
llvm-svn: 303013
|
| |
|
|
| |
llvm-svn: 303012
|
| |
|
|
| |
llvm-svn: 303010
|
| |
|
|
|
|
| |
constant splats for vXi64 shifts.
llvm-svn: 303009
|
| |
|
|
|
|
|
|
| |
its commuted variants.
We already had (A & ~B) | (A ^ B), but we missed the cases where the not was part of the xor.
llvm-svn: 303004
|
| |
|
|
| |
llvm-svn: 303002
|
| |
|
|
| |
llvm-svn: 302999
|
| |
|
|
|
|
| |
demandedelts in ComputeNumSignBits
llvm-svn: 302997
|
| |
|
|
| |
llvm-svn: 302996
|
| |
|
|
|
|
|
| |
This reorganisation prevents us from cluttering up the top-level lib directory
with more driver libraries such as llvm-dlltool (see D29892).
llvm-svn: 302995
|
| |
|
|
| |
llvm-svn: 302993
|
| |
|
|
|
|
| |
ComputeSignBit. NFC
llvm-svn: 302991
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Further perf tests on Jaguar indicate that:
vxorps %ymm0, %ymm0, %ymm0
vcmpps $15, %ymm0, %ymm0, %ymm0
is consistently faster (by about 9%) than:
vpcmpeqd %xmm0, %xmm0, %xmm0
vinsertf128 $1, %xmm0, %ymm0, %ymm0
Testing equivalent code on a SandyBridge (E5-2640) puts it slightly (~3%) faster as well.
Committed on behalf of @dtemirbulatov
Differential Revision: https://reviews.llvm.org/D32416
llvm-svn: 302989
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Loop vectorizer pass introduced undef value while it is fixing output of LCSSA form.
Here it is:
before: %e.0.ph = phi i32 [ 0, %for.inc.2.i ]
after: %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ undef, %middle.block ]
and after this change we have:
%e.0.ph = phi i32 [ 0, %for.inc.2.i ]
%e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ 0, %middle.block ]
Committed on behalf of @dtemirbulatov
Differential Revision: https://reviews.llvm.org/D33055
llvm-svn: 302988
|
| |
|
|
| |
llvm-svn: 302985
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
- MIRYamlMapping: Default value provided for fields which have optional
mappings. Implemented == operators for required classes. When a field's value is
same as default value specified YAML IO class will not print it.
- MIRPrinter: Above mentioned behaviour is not on by default. If -simplify-mir
option not specified, then make yaml::Output to print fields with default values
too.
Differential Revision: https://reviews.llvm.org/D32304
llvm-svn: 302984
|
| |
|
|
| |
llvm-svn: 302983
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
something changed in the initial Worklist creation
Summary:
If the Worklist build causes an IR change this change flag currently factors into the flag for running another iteration of the iteration loop. But only changes during processing should trigger another loop.
This patch captures the worklist creation change flag into the outside the loop flag currently used for DbgDeclares and only sends that flag up to the caller. Rerunning the loop only depends on IC.run() now.
This uses the debug output of InstCombine to determine if one or two iterations run. I couldn't think of a better way to detect it since the second spurious iteration shoudn't make any visible changes. Just wasted computation.
I can do a pre-commit of the test case with the CHECK-NOT as a CHECK if this is an ok way to check this.
This is a subset of D31678 as I'm still not sure how to verify the analysis behavior for that.
Reviewers: davide, majnemer, spatel, chandlerc
Reviewed By: davide
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32453
llvm-svn: 302982
|
| |
|
|
| |
llvm-svn: 302974
|
| |
|
|
|
|
|
|
|
|
| |
Contributed by Dr. Gergő Érdi.
Fixes a bug.
Raised from (https://github.com/avr-rust/rust/issues/49).
llvm-svn: 302973
|
| |
|
|
| |
llvm-svn: 302970
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Implemented frequency based cost/saving analysis
and related options.
The pass is now in a state ready to be turne on
in the pipeline (in follow up).
Differential Revision: http://reviews.llvm.org/D32783
llvm-svn: 302967
|
| |
|
|
|
|
|
|
| |
This might be useful across various GISel Passes
https://reviews.llvm.org/D33051
llvm-svn: 302964
|
| |
|
|
|
|
|
|
| |
We end up dereferencing the end iterator here when the Aspect doesn't exist in the DefaultAction map.
Change the API to return Optional<LLT> and return None when not found.
Also update the callers to handle the None case
llvm-svn: 302963
|
| |
|
|
| |
llvm-svn: 302961
|
| |
|
|
|
|
|
|
|
|
| |
to SVML routines
Patch by Chris Chrulski
Differential Revision: https://reviews.llvm.org/D31789
llvm-svn: 302957
|
| |
|
|
|
|
|
|
|
|
| |
generated from -ffast-math
Patch by Chris Chrulski
Differential Revision: https://reviews.llvm.org/D31788
llvm-svn: 302956
|
| |
|
|
|
|
|
|
|
|
| |
math-finite.h that create '__<func>_finite as functions
Patch by Chris Chrulski
Differential Revision: https://reviews.llvm.org/D31787
llvm-svn: 302955
|