| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
| |
Minor tweaks to whitespace formatting that I noticed was off. NFC.
llvm-svn: 227014
|
| |
|
|
|
|
|
|
|
|
|
| |
This fixes a regression introduced by r226816.
When replacing a splat shuffle node with a constant build_vector,
make sure that the new build_vector has a valid number of elements.
Thanks to Patrik Hagglund for reporting this problem and providing a
small reproducible.
llvm-svn: 227002
|
| |
|
|
|
|
| |
Should fix PR22305.
llvm-svn: 226969
|
| |
|
|
|
|
|
|
| |
- input_iterator
- define an operator->
- make constructors private were possible
llvm-svn: 226967
|
| |
|
|
|
|
|
|
| |
DIExpression::Operand, so we can write range-based for loops.
Thanks to David Blaikie for the idea.
llvm-svn: 226939
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This mostly reverts commit r222062 and replaces it with a new enum. At
some point this enum will grow at least for other MSVC EH personalities.
Also beefs up the way we were sniffing the personality function.
Previously we would emit the Itanium LSDA despite using
__C_specific_handler.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D6987
llvm-svn: 226920
|
| |
|
|
| |
llvm-svn: 226919
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: When trying to constant fold an FMA in the DAG, getNode()
fails to fold the FMA if an operand is not finite. In this case this
patch allows the constant folding if !TLI->hasFloatingPointExceptions()
Reviewers: resistor
Reviewed By: resistor
Subscribers: hfinkel, llvm-commits
Differential Revision: http://reviews.llvm.org/D6912
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 226901
|
| |
|
|
|
|
|
|
|
|
|
|
| |
v2: use getZExtValue
add missing break
codestyle
v3: add few more comments
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 226880
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Specifically, gc.result benefits from this greatly. Instead of:
gc.result.int.*
gc.result.float.*
gc.result.ptr.*
...
We now have a gc.result.* that can specialize to literally any type.
Differential Revision: http://reviews.llvm.org/D7020
llvm-svn: 226857
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a 2nd try at the same optimization as http://reviews.llvm.org/D6698.
That patch was checked in at r224611, but reverted at r225031 because it
caused a failure outside of the regression tests.
The cause of the crash was not recognizing consecutive stores that have mixed
source values (loads and vector element extracts), so this patch adds a check
to bail out if any store value is not coming from a vector element extract.
This patch also refactors the shared logic of the constant source and vector
extracted elements source cases into a helper function.
Differential Revision: http://reviews.llvm.org/D6850
llvm-svn: 226845
|
| |
|
|
|
|
|
|
|
|
|
| |
problems when inlining two calls to the same function from the same call site."
The underlying bug has been fixed in r226736 so there's no need to
workaround this anymore.
This reverts commit r220923.
llvm-svn: 226842
|
| |
|
|
|
|
| |
Addresses review feedback from Duncan.
llvm-svn: 226835
|
| |
|
|
|
|
|
|
|
|
|
| |
This solves PR22276.
Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead.
Differential Revision: http://reviews.llvm.org/D7093
Fixed recommit of r226811.
llvm-svn: 226816
|
| |
|
|
| |
llvm-svn: 226814
|
| |
|
|
|
|
|
|
|
| |
This solves PR22276.
Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead.
Differential Revision: http://reviews.llvm.org/D7093
llvm-svn: 226811
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The problem occurs when after vectorization we have type
<2 x i32>. This type is promoted to <2 x i64> and then requires
additional efforts for expanding loads and truncating stores.
I added EXPAND / TRUNCATE attributes to the masked load/store
SDNodes. The code now contains additional shuffles.
I've prepared changes in the cost estimation for masked memory
operations, it will be submitted separately.
llvm-svn: 226808
|
| |
|
|
| |
llvm-svn: 226806
|
| |
|
|
|
|
|
|
|
| |
Type MVT::i1 became legal in KNL, but store operation can't be narrowed to this type,
since the size of VT (1 bit) is not equal to its actual store size(8 bits).
Added a test provided by David (dag@cray.com)
llvm-svn: 226805
|
| |
|
|
| |
llvm-svn: 226767
|
| |
|
|
| |
llvm-svn: 226748
|
| |
|
|
|
|
|
| |
It can help with argument juggling on some targets, and is generally a good
idea.
llvm-svn: 226740
|
| |
|
|
|
|
|
|
|
| |
are only reading a dead superregister value
This was not necessary before as this case can only be detected when the
liveness analysis is at subregister level.
llvm-svn: 226733
|
| |
|
|
|
|
|
|
|
|
|
| |
This cleans up code and is more in line with the general philosophy of
modifying LiveIntervals through LiveIntervalAnalysis instead of changing
them directly.
This also fixes a case where SplitEditor::removeBackCopies() would miss
the subregister ranges.
llvm-svn: 226690
|
| |
|
|
|
|
|
|
| |
This cleans up code and is more in line with the general philosophy of
modifying LiveIntervals through LiveIntervalAnalysis instead of changing
them directly.
llvm-svn: 226687
|
| |
|
|
| |
llvm-svn: 226686
|
| |
|
|
|
|
|
|
| |
It hadn't gone through review yet, but was still on my local copy.
This reverts commit r226663
llvm-svn: 226665
|
| |
|
|
| |
llvm-svn: 226663
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This addresses part of llvm.org/PR22262. Specifically, it prevents
considering the densities of sub-ranges that have fewer than
TLI.getMinimumJumpTableEntries() elements. Those densities won't help
jump tables.
This is not a complete solution but works around the most pressing
issue.
Review: http://reviews.llvm.org/D7070
llvm-svn: 226600
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is in preparation for a fix to llvm.org/PR22262. One of the ideas
here is to first find a good jump table range first and then split
before and after it. Thereby, we don't need to use the
split-based-on-density heuristic at all, which can make the "binary
tree" deteriorate in various cases.
Also some minor cleanups.
No functional changes.
llvm-svn: 226551
|
| |
|
|
|
|
|
|
|
|
| |
a DominatorTree argument as that is the analysis that it wants to
update.
This removes the last non-loop utility function in Utils/ which accepts
a raw Pass argument.
llvm-svn: 226537
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
frontends to use a DIExpression with a DW_OP_deref instead.
This is not only a much more natural place for this informationl; there
is also a technical reason: The FlagIndirectVariable is used to mark a
variable that is turned into a reference by virtue of the calling
convention; this happens for example to aggregate return values.
The inliner, for example, may actually need to undo this indirection to
correctly represent the value in its new context. This is impossible to
implement because the DIVariable can't be safely modified. We can however
safely construct a new DIExpression on the fly.
llvm-svn: 226476
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No change in this commit, but clang was changed to also produce trivial comdats when
needed.
Original message:
Don't create new comdats in CodeGen.
This patch stops the implicit creation of comdats during codegen.
Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.
llvm-svn: 226467
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
APIs and replace it and numerous booleans with an option struct.
The critical edge splitting API has a really large surface of flags and
so it seems worth burning a small option struct / builder. This struct
can be constructed with the various preserved analyses and then flags
can be flipped in a builder style.
The various users are now responsible for directly passing along their
analysis information. This should be enough for the critical edge
splitting to work cleanly with the new pass manager as well.
This API is still pretty crufty and could be cleaned up a lot, but I've
focused on this change just threading an option struct rather than
a pass through the API.
llvm-svn: 226456
|
| |
|
|
|
|
|
|
|
|
| |
source and dest are local
This fixes PR21792.
Differential Revision: http://reviews.llvm.org/D6823
llvm-svn: 226433
|
| |
|
|
| |
llvm-svn: 226414
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Loading 2 2x32-bit float vectors into the bottom half of a 256-bit vector
produced suboptimal code in AVX2 mode with certain IR combinations.
In particular, the IR optimizer folded 2f32 + 2f32 -> 4f32, 4f32 + 4f32
(undef) -> 8f32 into a 2f32 + 2f32 -> 8f32, which seems more canonical,
but then mysteriously generated rather bad code; the movq/movhpd combination
didn't match.
The problem lay in the BUILD_VECTOR optimization path. The 2f32 inputs
would get promoted to 4f32 by the type legalizer, eventually resulting
in a BUILD_VECTOR on two 4f32 into an 8f32. The BUILD_VECTOR then, recognizing
these were both half the output size, concatted them and then produced
a shuffle. However, the resulting concat + shuffle was more complex than
it should be; in the case where the upper half of the output is undef, we
probably want to generate shuffle + concat instead.
This enhancement causes the vector_shuffle combine step to recognize this
suboptimal pattern and correct it. I included it there instead of in BUILD_VECTOR
in case the same suboptimal pattern occurs for other reasons.
This results in the optimizer correctly producing the optimal movq + movhpd
sequence for all three variations on this IR, even with AVX2.
I've included a test case.
Radar link: rdar://problem/19287012
Fix for PR 21943.
From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 226360
|
| |
|
|
| |
llvm-svn: 226353
|
| |
|
|
| |
llvm-svn: 226352
|
| |
|
|
|
|
|
|
|
|
|
| |
- Consistenly put comments above the function declaration, not the
definition. To achieve this some duplicate comments got merged and
some comment parts describing implementation details got moved into their
functions.
- Consistently use doxygen comments above functions.
- Do not use doxygen comments inside functions.
llvm-svn: 226351
|
| |
|
|
| |
llvm-svn: 226350
|
| |
|
|
|
|
| |
Be a bit more explicit about the fact that addrspace(1) is not reserved.
llvm-svn: 226344
|
| |
|
|
|
|
| |
Nothing interesting here...
llvm-svn: 226342
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Note: This change ended up being slightly more controversial than expected. Chandler has tentatively okayed this for the moment, but I may be revisiting this in the near future after we settle some high level questions.
Rather than have the GCStrategy object owned by the GCModuleInfo - which is an immutable analysis pass used mainly by gc.root - have it be owned by the LLVMContext. This simplifies the ownership logic (i.e. can you have two instances of the same strategy at once?), but more importantly, allows us to access the GCStrategy in the middle end optimizer. To this end, I add an accessor through Function which becomes the canonical way to get at a GCStrategy instance.
In the near future, this will allows me to move some of the checks from http://reviews.llvm.org/D6808 into the Verifier itself, and to introduce optimization legality predicates for some of the recent additions to InstCombine. (These will follow as separate changes.)
Differential Revision: http://reviews.llvm.org/D6811
llvm-svn: 226311
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Searching all of the existing gc.root implementations I'm aware of (all three of them), there was exactly one use of this mechanism, and that was to implement a performance improvement that should have been applied to the default lowering.
Having this function is requiring a dependency on a CodeGen class (MachineFunction), in a class which is otherwise completely independent of CodeGen. I could solve this differently, but given that I see absolutely no value in preserving this mechanism, I going to just get rid of it.
Note: Tis is the first time I'm intentionally breaking previously supported gc.root functionality. Given 3.6 has branched, I believe this is a good time to do this.
Differential Revision: http://reviews.llvm.org/D7004
llvm-svn: 226305
|
| |
|
|
|
|
| |
This breaks AddressSanitizer (ninja check-asan) on Windows
llvm-svn: 226251
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r226173, adding r226038 back.
No change in this commit, but clang was changed to also produce trivial comdats for
costructors, destructors and vtables when needed.
Original message:
Don't create new comdats in CodeGen.
This patch stops the implicit creation of comdats during codegen.
Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.
llvm-svn: 226242
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
reserved registers""
Reapply r226071 with fixes. Two fixes:
1. We need to manually remove the old and create the new 'deaf defs'
associated with physical register definitions when we move the definition of
the physical register from the copy point to the point of the original vreg def.
This problem was picked up by the machinstr verifier, and could trigger a
verification failure on test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll, so I've
turned on the verifier in the tests.
2. When moving the def point of the phys reg up, we need to make sure that it
is neither defined nor read in between the two instructions. We don't, however,
extend the live ranges of phys reg defs to cover uses, so just checking for
live-range overlap between the pair interval and the phys reg aliases won't
pick up reads. As a result, we manually iterate over the range and check for
reads.
A test soon to be committed to the PowerPC backend will test this change.
Original commit message:
[RegisterCoalescer] Remove copies to reserved registers
This allows the RegisterCoalescer to join "non-flipped" range pairs with a
physical destination register -- which allows the RegisterCoalescer to remove
copies like this:
<vreg> = something (maybe a load, for example)
... (things that don't use PHYSREG)
PHYSREG = COPY <vreg>
(with all of the restrictions normally applied by the RegisterCoalescer: having
compatible register classes, etc. )
Previously, the RegisterCoalescer handled only the opposite case (copying
*from* a physical register). I don't handle the problem fully here, but try to
get the common case where there is only one use of <vreg> (the COPY).
An upcoming commit to the PowerPC backend will make this pattern much more
common on PPC64/ELF systems.
llvm-svn: 226200
|
| |
|
|
|
|
| |
Use static functions for helpers rather than static member functions. a) this changes the linking (minor at best), and b) this makes it obvious no object state is involved.
llvm-svn: 226198
|
| |
|
|
| |
llvm-svn: 226196
|