| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
| |
llvm-svn: 227173
|
| |
|
|
|
|
|
| |
subtarget and so doesn't need the TargetMachine or to access via
getSubtargetImpl. Update all callers.
llvm-svn: 227160
|
| |
|
|
|
|
| |
getSubtargetImpl.
llvm-svn: 227159
|
| |
|
|
|
|
| |
code.
llvm-svn: 227157
|
| |
|
|
|
|
| |
a subtarget.
llvm-svn: 227156
|
| |
|
|
| |
llvm-svn: 227121
|
| |
|
|
| |
llvm-svn: 227119
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
derived classes.
Since global data alignment, layout, and mangling is often based on the
DataLayout, move it to the TargetMachine. This ensures that global
data is going to be layed out and mangled consistently if the subtarget
changes on a per function basis. Prior to this all targets(*) have
had subtarget dependent code moved out and onto the TargetMachine.
*One target hasn't been migrated as part of this change: R600. The
R600 port has, as a subtarget feature, the size of pointers and
this affects global data layout. I've currently hacked in a FIXME
to enable progress, but the port needs to be updated to either pass
the 64-bitness to the TargetMachine, or fix the DataLayout to
avoid subtarget dependent features.
llvm-svn: 227113
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This change reverts the interesting parts of 226311 (and 227046). This change introduced two problems, and I've been convinced that an alternate approach is preferrable anyways.
The bugs were:
- Registery appears to require all users be within the same linkage unit. After this change, asking for "statepoint-example" in Transform/ would sometimes get you nullptr, whereas asking the same question in CodeGen would return the right GCStrategy. The correct long term fix is to get rid of the utter hack which is Registry, but I don't have time for that right now. 227046 appears to have been an attempt to fix this, but I don't believe it does so completely.
- GCMetadataPrinter::finishAssembly was being called more than once per GCStrategy. Each Strategy was being added to the GCModuleInfo multiple times.
Once I get time again, I'm going to split GCModuleInfo into the gc.root specific part and a GCStrategy owning Analysis pass. I'm probably also going to kill off the Registry. Once that's done, I'll move the new GCStrategyAnalysis and all built in GCStrategies into Analysis. (As original suggested by Chandler.) This will accomplish my original goal of being able to access GCStrategy from Transform/ without adding all of the builtin GCs to IR/.
llvm-svn: 227109
|
| |
|
|
|
|
|
|
|
|
|
| |
physical register that is described in a DBG_VALUE.
In the testcase the DBG_VALUE describing "p5" becomes unavailable
because the register its address is in is clobbered and we (currently)
aren't smart enough to realize that the value is rematerialized immediately
after the DBG_VALUE and/or is actually a stack slot.
llvm-svn: 227056
|
| |
|
|
| |
llvm-svn: 227035
|
| |
|
|
|
|
| |
Minor tweaks to whitespace formatting that I noticed was off. NFC.
llvm-svn: 227014
|
| |
|
|
|
|
|
|
|
|
|
| |
This fixes a regression introduced by r226816.
When replacing a splat shuffle node with a constant build_vector,
make sure that the new build_vector has a valid number of elements.
Thanks to Patrik Hagglund for reporting this problem and providing a
small reproducible.
llvm-svn: 227002
|
| |
|
|
|
|
| |
Should fix PR22305.
llvm-svn: 226969
|
| |
|
|
|
|
|
|
| |
- input_iterator
- define an operator->
- make constructors private were possible
llvm-svn: 226967
|
| |
|
|
|
|
|
|
| |
DIExpression::Operand, so we can write range-based for loops.
Thanks to David Blaikie for the idea.
llvm-svn: 226939
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This mostly reverts commit r222062 and replaces it with a new enum. At
some point this enum will grow at least for other MSVC EH personalities.
Also beefs up the way we were sniffing the personality function.
Previously we would emit the Itanium LSDA despite using
__C_specific_handler.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D6987
llvm-svn: 226920
|
| |
|
|
| |
llvm-svn: 226919
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: When trying to constant fold an FMA in the DAG, getNode()
fails to fold the FMA if an operand is not finite. In this case this
patch allows the constant folding if !TLI->hasFloatingPointExceptions()
Reviewers: resistor
Reviewed By: resistor
Subscribers: hfinkel, llvm-commits
Differential Revision: http://reviews.llvm.org/D6912
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 226901
|
| |
|
|
|
|
|
|
|
|
|
|
| |
v2: use getZExtValue
add missing break
codestyle
v3: add few more comments
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com>
llvm-svn: 226880
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Specifically, gc.result benefits from this greatly. Instead of:
gc.result.int.*
gc.result.float.*
gc.result.ptr.*
...
We now have a gc.result.* that can specialize to literally any type.
Differential Revision: http://reviews.llvm.org/D7020
llvm-svn: 226857
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a 2nd try at the same optimization as http://reviews.llvm.org/D6698.
That patch was checked in at r224611, but reverted at r225031 because it
caused a failure outside of the regression tests.
The cause of the crash was not recognizing consecutive stores that have mixed
source values (loads and vector element extracts), so this patch adds a check
to bail out if any store value is not coming from a vector element extract.
This patch also refactors the shared logic of the constant source and vector
extracted elements source cases into a helper function.
Differential Revision: http://reviews.llvm.org/D6850
llvm-svn: 226845
|
| |
|
|
|
|
|
|
|
|
|
| |
problems when inlining two calls to the same function from the same call site."
The underlying bug has been fixed in r226736 so there's no need to
workaround this anymore.
This reverts commit r220923.
llvm-svn: 226842
|
| |
|
|
|
|
| |
Addresses review feedback from Duncan.
llvm-svn: 226835
|
| |
|
|
|
|
|
|
|
|
|
| |
This solves PR22276.
Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead.
Differential Revision: http://reviews.llvm.org/D7093
Fixed recommit of r226811.
llvm-svn: 226816
|
| |
|
|
| |
llvm-svn: 226814
|
| |
|
|
|
|
|
|
|
| |
This solves PR22276.
Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead.
Differential Revision: http://reviews.llvm.org/D7093
llvm-svn: 226811
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The problem occurs when after vectorization we have type
<2 x i32>. This type is promoted to <2 x i64> and then requires
additional efforts for expanding loads and truncating stores.
I added EXPAND / TRUNCATE attributes to the masked load/store
SDNodes. The code now contains additional shuffles.
I've prepared changes in the cost estimation for masked memory
operations, it will be submitted separately.
llvm-svn: 226808
|
| |
|
|
| |
llvm-svn: 226806
|
| |
|
|
|
|
|
|
|
| |
Type MVT::i1 became legal in KNL, but store operation can't be narrowed to this type,
since the size of VT (1 bit) is not equal to its actual store size(8 bits).
Added a test provided by David (dag@cray.com)
llvm-svn: 226805
|
| |
|
|
| |
llvm-svn: 226767
|
| |
|
|
| |
llvm-svn: 226748
|
| |
|
|
|
|
|
| |
It can help with argument juggling on some targets, and is generally a good
idea.
llvm-svn: 226740
|
| |
|
|
|
|
|
|
|
| |
are only reading a dead superregister value
This was not necessary before as this case can only be detected when the
liveness analysis is at subregister level.
llvm-svn: 226733
|
| |
|
|
|
|
|
|
|
|
|
| |
This cleans up code and is more in line with the general philosophy of
modifying LiveIntervals through LiveIntervalAnalysis instead of changing
them directly.
This also fixes a case where SplitEditor::removeBackCopies() would miss
the subregister ranges.
llvm-svn: 226690
|
| |
|
|
|
|
|
|
| |
This cleans up code and is more in line with the general philosophy of
modifying LiveIntervals through LiveIntervalAnalysis instead of changing
them directly.
llvm-svn: 226687
|
| |
|
|
| |
llvm-svn: 226686
|
| |
|
|
|
|
|
|
| |
It hadn't gone through review yet, but was still on my local copy.
This reverts commit r226663
llvm-svn: 226665
|
| |
|
|
| |
llvm-svn: 226663
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This addresses part of llvm.org/PR22262. Specifically, it prevents
considering the densities of sub-ranges that have fewer than
TLI.getMinimumJumpTableEntries() elements. Those densities won't help
jump tables.
This is not a complete solution but works around the most pressing
issue.
Review: http://reviews.llvm.org/D7070
llvm-svn: 226600
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is in preparation for a fix to llvm.org/PR22262. One of the ideas
here is to first find a good jump table range first and then split
before and after it. Thereby, we don't need to use the
split-based-on-density heuristic at all, which can make the "binary
tree" deteriorate in various cases.
Also some minor cleanups.
No functional changes.
llvm-svn: 226551
|
| |
|
|
|
|
|
|
|
|
| |
a DominatorTree argument as that is the analysis that it wants to
update.
This removes the last non-loop utility function in Utils/ which accepts
a raw Pass argument.
llvm-svn: 226537
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
frontends to use a DIExpression with a DW_OP_deref instead.
This is not only a much more natural place for this informationl; there
is also a technical reason: The FlagIndirectVariable is used to mark a
variable that is turned into a reference by virtue of the calling
convention; this happens for example to aggregate return values.
The inliner, for example, may actually need to undo this indirection to
correctly represent the value in its new context. This is impossible to
implement because the DIVariable can't be safely modified. We can however
safely construct a new DIExpression on the fly.
llvm-svn: 226476
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No change in this commit, but clang was changed to also produce trivial comdats when
needed.
Original message:
Don't create new comdats in CodeGen.
This patch stops the implicit creation of comdats during codegen.
Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.
llvm-svn: 226467
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
APIs and replace it and numerous booleans with an option struct.
The critical edge splitting API has a really large surface of flags and
so it seems worth burning a small option struct / builder. This struct
can be constructed with the various preserved analyses and then flags
can be flipped in a builder style.
The various users are now responsible for directly passing along their
analysis information. This should be enough for the critical edge
splitting to work cleanly with the new pass manager as well.
This API is still pretty crufty and could be cleaned up a lot, but I've
focused on this change just threading an option struct rather than
a pass through the API.
llvm-svn: 226456
|
| |
|
|
|
|
|
|
|
|
| |
source and dest are local
This fixes PR21792.
Differential Revision: http://reviews.llvm.org/D6823
llvm-svn: 226433
|
| |
|
|
| |
llvm-svn: 226414
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Loading 2 2x32-bit float vectors into the bottom half of a 256-bit vector
produced suboptimal code in AVX2 mode with certain IR combinations.
In particular, the IR optimizer folded 2f32 + 2f32 -> 4f32, 4f32 + 4f32
(undef) -> 8f32 into a 2f32 + 2f32 -> 8f32, which seems more canonical,
but then mysteriously generated rather bad code; the movq/movhpd combination
didn't match.
The problem lay in the BUILD_VECTOR optimization path. The 2f32 inputs
would get promoted to 4f32 by the type legalizer, eventually resulting
in a BUILD_VECTOR on two 4f32 into an 8f32. The BUILD_VECTOR then, recognizing
these were both half the output size, concatted them and then produced
a shuffle. However, the resulting concat + shuffle was more complex than
it should be; in the case where the upper half of the output is undef, we
probably want to generate shuffle + concat instead.
This enhancement causes the vector_shuffle combine step to recognize this
suboptimal pattern and correct it. I included it there instead of in BUILD_VECTOR
in case the same suboptimal pattern occurs for other reasons.
This results in the optimizer correctly producing the optimal movq + movhpd
sequence for all three variations on this IR, even with AVX2.
I've included a test case.
Radar link: rdar://problem/19287012
Fix for PR 21943.
From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 226360
|
| |
|
|
| |
llvm-svn: 226353
|
| |
|
|
| |
llvm-svn: 226352
|