| Commit message (Collapse) | Author | Age | Files | Lines |
|
forget to right shift the source by 32 first. rdar://9287902
llvm-svn: 129556
|
Ref: I.1 Instruction encoding diagrams and pseudocode
llvm-svn: 129552
|
llvm-svn: 129551
|
instructions
(single element or n-element structure to all lanes).
llvm-svn: 129550
|
llvm-svn: 129548
|
canonical, and generally leads to better code. Found while looking at
an article about saturating arithmetic.
llvm-svn: 129545
|
repeatedly undo each other. The solution is to perform more aggressive constant folding so that one of the edges is simply folded away rather than threaded.
Fixes <rdar://problem/9284786>.
Discovered with CSmith.
llvm-svn: 129538
|
llvm-svn: 129532
|
operations.
llvm-svn: 129531
|
llvm-svn: 129527
|
RHS of a shift.
llvm-svn: 129522
|
size of the clang binary in Debug builds from 690MB to 679MB.
llvm-svn: 129518
|
llvm-svn: 129517
|
llvm-svn: 129509
|
This is done by pushing physical register definitions close to their
use, which happens to handle flag definitions if they're not glued to
the branch. This seems to be generally a good thing though, so I
didn't need to add a target hook yet.
The primary motivation is to generate code closer to what people
expect and rule out missed opportunity from enabling macro-op
fusion. As a side benefit, we get several 2-5% gains on x86
benchmarks. There is one regression:
SingleSource/Benchmarks/Shootout/lists slows down by 10%. But this is
an independent scheduler bug that will be tracked separately.
See rdar://problem/9283108.
Incidentally, pre-RA scheduling is only half the solution. Fixing the
later passes is tracked by:
<rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump
Fixes:
<rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion
llvm-svn: 129508
|
improvements, that will lead to fixing PR6627.
llvm-svn: 129504
|
llvm-svn: 129503
|
instruction around, reducing work.
Greatly simplify handling of debug instructions. There is no need to
build up a vector of them and then move them into the one predecessor
if we're processing a block. Instead just rescan the block and *copy*
them into the pred. If a block gets merged into multiple preds, this
will retain more debug info.
llvm-svn: 129502
|
llvm-svn: 129501
|
(movzx/movsx) because they give more information. Revert that part of the patch.
llvm-svn: 129498
|
cases, it's much nicer and more informative reading the alias.
llvm-svn: 129497
|
the alias".
llvm-svn: 129485
|
where the RHS is of the legal type for the new operation.
llvm-svn: 129484
|
rdar://problem/9280370
llvm-svn: 129480
|
the same allocation size but different primitive sizes (e.g., <3xi32> and
<4xi32>). When ScalarRepl promotes them, it can't use a bit cast but
should use a shuffle vector instead.
llvm-svn: 129472
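The size mismatch described in the entry above can be illustrated with a small, self-contained sketch. It is not the ScalarRepl code: `VecTy`, `primitiveSizeInBits`, `allocSizeInBytes`, `canBitCast`, and `shuffleMask` are hypothetical names, and the alignment rule used here (round the byte size up to the next power of two) is an assumption standing in for the target's real data layout.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a vector type such as <3 x i32>.
struct VecTy {
  unsigned numElts;
  unsigned eltBits;
};

// Bits actually occupied by the elements.
inline uint64_t primitiveSizeInBits(VecTy t) {
  return static_cast<uint64_t>(t.numElts) * t.eltBits;
}

// ASSUMPTION: the ABI alignment of a vector is its byte size rounded up
// to the next power of two; a real target takes this from its data layout.
inline uint64_t allocSizeInBytes(VecTy t) {
  uint64_t bytes = (primitiveSizeInBits(t) + 7) / 8;
  uint64_t align = 1;
  while (align < bytes)
    align <<= 1;
  // Round the byte size up to a multiple of the alignment.
  return (bytes + align - 1) / align * align;
}

// A bit cast only reinterprets bits, so both types must have the same
// primitive size.
inline bool canBitCast(VecTy from, VecTy to) {
  return primitiveSizeInBits(from) == primitiveSizeInBits(to);
}

// Lane-by-lane mask converting `from` to `to`; -1 marks an undefined lane.
inline std::vector<int> shuffleMask(VecTy from, VecTy to) {
  assert(from.eltBits == to.eltBits && "element types must match");
  std::vector<int> mask;
  for (unsigned i = 0; i < to.numElts; ++i)
    mask.push_back(i < from.numElts ? static_cast<int>(i) : -1);
  return mask;
}
```

Under these assumptions, `<3 x i32>` and `<4 x i32>` both occupy 16 bytes, but their primitive sizes are 96 and 128 bits, so `canBitCast` returns false and the mask `{0, 1, 2, -1}` widens the three-element value.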
|
instructions (tBcc and t2Bcc).
rdar://problem/9280470
llvm-svn: 129471
|
rdar://problem/9279440
llvm-svn: 129469
|
llvm-svn: 129468
|
ignored. There was a test to catch this, but it was just blindly updated in
a large change. This fixes another part of <rdar://problem/9275290>.
llvm-svn: 129466
|
llvm-svn: 129463
|
as such.
rdar://problem/9276651
llvm-svn: 129462
|
understand the actual reason behind this FIXME. Spot checking suggests that newer gdb does not need this.
llvm-svn: 129461
|
not properly handled.
rdar://problem/9276427
llvm-svn: 129456
|
http://llvm.org/viewvc/llvm-project?view=rev&revision=129387.
llvm-svn: 129451
|
llvm-svn: 129450
|
LoopUnroll class's ctor. Doing so will allow multiple contexts with
different loop unroll parameters to run. This is a minor change and has
no effect on existing applications.
llvm-svn: 129449
|
llvm-svn: 129447
|
llvm-svn: 129445
|
related tweaks to ExprMapKeyType.
llvm-svn: 129443
|
llvm-svn: 129442
|
llvm-svn: 129441
|
llvm-svn: 129439
|
llvm-svn: 129435
|
the max itself, so it is not easy to write a test case for this, but I added a
test case that would fail if the code in AsmPrinter were removed.
llvm-svn: 129432
|
llvm-svn: 129429
|
alignment for its type, use the minimum of the specified alignment and the ABI
alignment. This fixes <rdar://problem/9275290>.
llvm-svn: 129428
|
latency.
Additional fixes:
Do something reasonable for subtargets with generic
itineraries by handling node latency the same as for an empty
itinerary. Now nodes default to unit latency unless an itinerary
explicitly specifies a zero-cycle stage or it is a TokenFactor chain.
Original fixes:
UnitsSharePred was a source of randomness in the scheduler: node
priority depended on the queue data structure. I rewrote the recent
VRegCycle heuristics to completely replace the old heuristic without
any randomness. To make the node latency adjustments work, I also
needed to do something a little more reasonable with TokenFactor. I
gave it zero latency to its consumers and always schedule it as low as
possible.
llvm-svn: 129421
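The default latency rule in the entry above can be sketched as a small function. This is an illustration only, not the scheduler's code: `NodeKind`, `SchedNode`, and `nodeLatency` are hypothetical names, and modelling an itinerary as an optional cycle count is an assumption.

```cpp
// Hypothetical node classification; only TokenFactor is treated specially.
enum class NodeKind { Ordinary, TokenFactor };

// Hypothetical scheduling node.
struct SchedNode {
  NodeKind kind;
  bool hasItineraryLatency;   // false for empty or generic itineraries
  unsigned itineraryLatency;  // meaningful only when hasItineraryLatency
};

// Latency a node presents to its consumers under the rule described above.
inline unsigned nodeLatency(const SchedNode &n) {
  // TokenFactor chains get zero latency to their consumers.
  if (n.kind == NodeKind::TokenFactor)
    return 0;
  // An explicit itinerary entry wins, including an explicit zero.
  if (n.hasItineraryLatency)
    return n.itineraryLatency;
  // Empty and generic itineraries are handled alike: unit latency.
  return 1;
}
```

The point of the sketch is that the fallback is deterministic: a node without itinerary data always gets latency 1, so its priority does not depend on how the subtarget happens to describe itself.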
|
llvm-svn: 129419
|
llvm-svn: 129417
|
Implement the ones that were missing in the asm streamer.
llvm-svn: 129413
|