llvm-svn: 155746
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However, if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is implemented
with a check of IsTailCallConvention in X86TargetLowering::LowerCall,
but is now also checked properly in X86FastISel::DoSelectCall.
llvm-svn: 155745
Previously, ARMConstantIslandPass would conservatively compute the
address of an aligned basic block as:
  RoundUpToAlignment(Offset + UnknownPadding)
This worked fine for the layout algorithm itself, but it could fool the
verify() function because it accounts for alignment padding twice: once
when adding the worst-case UnknownPadding, and again by rounding up the
fictional block offset. This meant that when optimizeThumb2Instructions
would shrink an instruction, the conservative distance estimate could
grow. That shouldn't be possible, since the worst-case alignment padding
was already included.
This patch drops the use of RoundUpToAlignment, and depends only on
worst case padding to compute conservative block offsets. This has the
weird effect that the computed offset for an aligned block may not be
aligned.
The important difference is that shrinking an instruction can never
cause the estimated distance between two instructions to grow. The
estimated distance is always larger than the real distance that only the
assembler knows.
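To illustrate, a minimal sketch of the two offset computations
(hypothetical code, not the pass's actual implementation; names mirror
the description above):

    // Old: adds worst-case padding and then rounds up again, counting
    // the alignment padding twice:
    //   return RoundUpToAlignment(Offset + UnknownPadding, Align);
    // New: add the worst-case padding exactly once. The result may not
    // itself be aligned, but it can never grow when an instruction
    // shrinks.
    unsigned conservativeOffset(unsigned Offset, unsigned UnknownPadding) {
      return Offset + UnknownPadding;
    }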
<rdar://problem/11339352>
llvm-svn: 155744
This definitely caused a regression with ARM -mno-thumb.
llvm-svn: 155743
vector elements.
llvm-svn: 155742
x == -y --> x+y == 0
x != -y --> x+y != 0
On x86, the generated code goes from
   negl    %esi
   cmpl    %esi, %edi
   je      .LBB0_2
to
   addl    %esi, %edi
   je      .L4
This case is correctly handled for ARM with "cmn".
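As a source-level illustration, a function like this (hypothetical
example, not from the patch) exercises the combine:

    // The equality test against a negation becomes an add followed by
    // a compare against zero.
    bool isNegated(int x, int y) {
      return x == -y;  // -> (x + y) == 0
    }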
Patch by Manman Ren.
rdar://11245199
PR12545
llvm-svn: 155739
llvm-svn: 155735
llvm-svn: 155733
Target specific types should not be vectorized. As a practical matter,
these types are already register matched (at least in the x86 case),
and codegen does not always work correctly (at least in the ppc case,
and this is not worth fixing because ppc_fp128 is currently broken and
will probably go away soon).
llvm-svn: 155729
llvm-svn: 155727
llvm-svn: 155725
<rdar://problem/11325085>.
llvm-svn: 155724
The limit is set to an arbitrary recursion depth of 1000 to avoid stack
overflow issues. <rdar://problem/11286839>.
llvm-svn: 155722
properly with how the code handles all-undef PHI nodes.
llvm-svn: 155721
llvm-svn: 155720
pre-pentiumpro architectures.
* Model FPSW (the FPU status word) as a register.
* Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions.
* During Legalize/Lowering, build a node sequence to transfer the
  comparison result from FPSW into EFLAGS. If you're wondering about the
  right-shift: that's an implicit sub-register extraction (%ax -> %ah),
  which is handled later on by the instruction selector.
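A rough sketch of the shape of that node sequence (the X86ISD opcode
names below are placeholders for illustration, not the patch's actual
node names):

    // FUCOM compares ST(0)/ST(1); FNSTSW copies FPSW into %ax.
    SDValue FpCmp = DAG.getNode(X86ISD::FCMP_FNSTSW, dl, MVT::i16, LHS, RHS);
    // The right-shift nominates %ah; isel turns it into a sub-register
    // extraction rather than a real shift.
    SDValue Ah = DAG.getNode(ISD::SRL, dl, MVT::i16, FpCmp,
                             DAG.getConstant(8, MVT::i8));
    // SAHF moves %ah into EFLAGS, where ordinary branches can use it.
    SDValue Flags = DAG.getNode(X86ISD::SAHF, dl, MVT::i32,
                                DAG.getNode(ISD::TRUNCATE, dl, MVT::i8, Ah));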
Fixes PR6679. Patch by Christoph Erhardt!
llvm-svn: 155704
llvm-svn: 155701
the mask operand in the MCInst.
llvm-svn: 155700
vectors"
It broke stage2 build. stage1/clang sometimes crashed.
llvm-svn: 155699
llvm-svn: 155698
llvm-svn: 155686
instructions.
- However, it does support dmb, dsb, isb, mrs, and msr.
rdar://11331541
llvm-svn: 155685
instead of getAggregateElement. This has the advantage of being
more consistent and allowing higher-level constant folding to
proceed even if an inner extract element cannot be folded.
Make ConstantFoldInstruction call ConstantFoldConstantExpression
on the instruction's operands, making it more consistent with 
ConstantFoldConstantExpression itself. This makes sure that
ConstantExprs get TargetData-aware folding before being handed
off as operands for further folding.
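As a rough sketch of the operands-first shape this creates (hypothetical
helper, not the patch itself):

    // Fold each operand with ConstantFoldConstantExpression before
    // folding the expression itself, so TargetData-aware folds see
    // already-folded operands.
    static Constant *foldOperandsFirst(ConstantExpr *CE,
                                       const TargetData *TD) {
      SmallVector<Constant*, 8> Ops;
      for (unsigned i = 0, e = CE->getNumOperands(); i != e; ++i) {
        Constant *Op = cast<Constant>(CE->getOperand(i));
        if (ConstantExpr *OpCE = dyn_cast<ConstantExpr>(Op))
          if (Constant *Folded = ConstantFoldConstantExpression(OpCE, TD))
            Op = Folded;
        Ops.push_back(Op);
      }
      return ConstantFoldInstOperands(CE->getOpcode(), CE->getType(), Ops, TD);
    }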
This causes more expressions to be folded, but due to a known
shortcoming in constant folding, this currently has the side effect
of stripping a few more nuw and inbounds flags in the non-targetdata
side of constant-fold-gep.ll. This is mostly harmless.
This fixes rdar://11324230.
llvm-svn: 155682
The required checks are moved to ChainInstruction() itself and the
policy decisions are moved to IVChain::isProfitableInc().
Also cache the ExprBase in IVChain to avoid frequent recomputations.
No functional change intended.
llvm-svn: 155676
No functional change intended.
llvm-svn: 155675
(x & y) | (x ^ y) -> x | y
(x & y) + (x ^ y) -> x | y
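These hold because x & y has exactly the bits common to both operands
and x ^ y has exactly the bits where they differ; the two are bitwise
disjoint, so OR and ADD agree. A trivial illustration (hypothetical
example, not from the patch):

    // Both forms compute x | y, since the AND and XOR results never
    // share a set bit (so no carries occur in the ADD case).
    unsigned viaOr (unsigned x, unsigned y) { return (x & y) | (x ^ y); }
    unsigned viaAdd(unsigned x, unsigned y) { return (x & y) + (x ^ y); }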
Patch by Manman Ren.
rdar://10770603
llvm-svn: 155674
DAGCombine strangeness may result in multiple loads from the same
offset. They both may try to glue themselves to another load. We could
insist that the redundant loads glue themselves to each other, but the
better fix is to bail out from bad gluing at the time we detect it.
Fixes rdar://11314175: BuildSchedUnits assert.
llvm-svn: 155668
The base address for the PC-relative load is Align(PC,4), so it's the
address of the word containing the 16-bit instruction, not the address
of the instruction itself. Ugh.
rdar://11314619
llvm-svn: 155659
the FeatureLeaForSP feature bit when llvm auto-detects Intel Atom.
Patch by Andy Zhang.
llvm-svn: 155655
'REPLACEMENT CHARACTER' (U+FFFD) when getAsInteger fails.
llvm-svn: 155653
On some cores it's a bad idea for performance to mix VFP and NEON
instructions, and since these patterns are NEON anyway, the NEON load
should be used.
llvm-svn: 155630
llvm-svn: 155626
corei7-avx, core-avx-i, and core-avx2 cpu names.
llvm-svn: 155618
elements to minimize the number of multiplies required to compute the
final result. This uses a heuristic to attempt to form near-optimal
binary exponentiation-style multiply chains. While there are some cases
it misses, it seems to do at least a decent job on a very diverse range
of inputs.
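The underlying idea is ordinary binary exponentiation: square repeatedly
and multiply in one factor per set bit of the exponent. A standalone
sketch of the idea (illustrative only, not the patch's code):

    // x^13 = x^8 * x^4 * x^1: one squaring per bit position, plus one
    // extra multiply per set bit of n.
    double powBySquaring(double x, unsigned n) {
      double result = 1.0;
      while (n) {
        if (n & 1)
          result *= x;  // fold in the current power of two
        x *= x;         // square for the next bit
        n >>= 1;
      }
      return result;
    }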
Initial benchmarks show no interesting regressions, and an 8%
improvement on SPASS. Let me know if any other interesting results (in
either direction) crop up!
Credit to Richard Smith for the core algorithm, and helping code the
patch itself.
llvm-svn: 155616
the feature set of v7a. This comes about if the user specifies something like
-arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as
uxtab in this case.
rdar://11318438
llvm-svn: 155601
MDNodeOperand value.
llvm-svn: 155599
llvm-svn: 155567
llvm-svn: 155566
right-shifted by #32. These are stored as shifts by #0 in the MCInst (in
the ARM encoding, an immediate shift amount of zero for these operations
denotes a shift of 32) and correctly marshalled when transforming from
or to assembly representation.
llvm-svn: 155565
Cross-class joins have been normal and fully supported for a while now.
With TableGen generating the getMatchingSuperRegClass() hook, they are
unlikely to cause problems again.
llvm-svn: 155552
Remove the heuristic for disabling cross-class joins. The greedy
register allocator can handle the narrow register classes, and when it
splits a live range, it can pick a larger register class.
Benchmarks were unaffected by this change.
<rdar://problem/11302212>
llvm-svn: 155551
only targets that want the function get it. This prevents other targets
from getting an unused-function warning.
llvm-svn: 155538
ZERO_EXTEND/ANY_EXTEND combine. These will be converted to
target-specific nodes during lowering. This is more consistent with
other code.
llvm-svn: 155537
in poor taste.
Talking through some alternate solutions with Chandler.
llvm-svn: 155530
llvm-svn: 155522
of a precise count. Also, move RRInfo's Partial field into PtrState,
now that it won't increase the size.
llvm-svn: 155513
These lists exclude invoke unwind edges and loop backedges, which are
being ignored. This makes it easier to ignore them consistently.
llvm-svn: 155500
When an instruction match is found, but the subtarget features it
requires are not available (a missing floating-point unit, or Thumb vs.
ARM mode, for example), issue a diagnostic that identifies what the
feature mismatch is.
rdar://11257547
llvm-svn: 155499
llvm-svn: 155486
Fix PR12592. Patch by Matt Pharr.
llvm-svn: 155480