| Commit message | Author | Age | Files | Lines |
|
Switching between i32 and i64 based on the LHS type is a good idea in
theory, but pre-legalisation uses i64 regardless of our choice,
leading to potential ISel errors.
Should fix PR19294.
llvm-svn: 205519
|
We cannot use STACK_LIMIT, as it is not reserved for the compiler
by the C spec.
llvm-svn: 205516
|
This should fix PR19314.
llvm-svn: 205514
|
TargetInstrInfo::findCommutedOpIndices to enable VFMA*231 commutation, rather
than abusing commuteInstruction.
Thanks very much for the suggestion, guys!
llvm-svn: 205489
|
PPCTTI::getMemoryOpCost will now make use of BasicTTI::getMemoryOpCost to
calculate the base cost of the memory access, and then adjust on top of that.
There is no functionality change from this modification, but it will become
important so that PPCTTI can take advantage of scalarization information for which
BasicTTI::getMemoryOpCost will account in the near future.
llvm-svn: 205476
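The layering described above (take the generic base cost, then adjust it for the target) can be sketched roughly as follows. This is only an illustration: `BasicCostModel` and `TargetCostModel` are hypothetical stand-ins for BasicTTI and PPCTTI, the parameter list is simplified, and the cost values are invented.

```cpp
#include <cassert>

// Hypothetical stand-in for BasicTTI; not LLVM's real interface.
struct BasicCostModel {
  virtual ~BasicCostModel() = default;

  // Illustrative base cost: one unit per element accessed.
  virtual unsigned getMemoryOpCost(unsigned numElements, bool isAligned) const {
    (void)isAligned; // the generic estimate ignores alignment in this sketch
    assert(numElements > 0 && "a memory access touches at least one element");
    return numElements;
  }
};

// Hypothetical stand-in for PPCTTI.
struct TargetCostModel : BasicCostModel {
  unsigned getMemoryOpCost(unsigned numElements, bool isAligned) const override {
    // Start from the generic estimate so that future improvements to the
    // base model are picked up automatically.
    unsigned cost = BasicCostModel::getMemoryOpCost(numElements, isAligned);
    if (!isAligned)
      cost *= 2; // assumed penalty for an unaligned access (made-up factor)
    return cost;
  }
};
```

Because the target model only adjusts the value returned by the base model, anything the base model later learns to account for flows through without further changes in the target code.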
|
on FMA3 memory operands. FMA3 instructions are VEX encoded, so they can load
from unaligned memory.
Testcase to follow, along with related patch.
<rdar://problem/16478629>
llvm-svn: 205472
|
GetElementPtr opaque (r204739).
llvm-svn: 205468
|
Update the subtarget information for Windows on ARM. This enables using the MC
layer to target Windows on ARM.
llvm-svn: 205459
|
No functional change.
llvm-svn: 205458
|
There are no implementations of these for R600.
llvm-svn: 205455
|
Just pass a MachineInstr reference rather than an MBB iterator.
Creating a MachineInstr& is the first thing every implementation did
anyway.
llvm-svn: 205453
|
Unlike other v6+ processors, Cortex-M0 never supports unaligned accesses.
From the v6m ARM ARM:
"A3.2 Alignment support: ARMv6-M always generates a fault when an unaligned
access occurs."
rdar://16491560
llvm-svn: 205452
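As a rough illustration of the rule quoted above, a subtarget query might look like the sketch below. The `ArmArch` enum and the function name are hypothetical and do not reflect the actual ARMSubtarget code; the real default also depends on factors this sketch ignores.

```cpp
#include <cassert>

// Hypothetical, heavily reduced list of architecture variants.
enum class ArmArch { V6, V6M, V7A };

// Returns whether unaligned loads and stores may be emitted at all.
// ARMv6-M always faults on an unaligned access, so it is excluded even
// though it is a "v6+" architecture.
bool mayUseUnalignedAccess(ArmArch arch) {
  switch (arch) {
  case ArmArch::V6M:
    return false;
  case ArmArch::V6:
  case ArmArch::V7A:
    return true;
  }
  return false; // not reached for valid enum values
}
```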
|
No functional change, but more readable code.
llvm-svn: 205451
|
Adds the instructions ext/ext32/cins/cins32.
It also changes pop/dpop to accept the two operand version and
adds a simple pattern to generate baddu.
Tests for the two operand versions (including baddu/dmul/dpop/pop)
and the code generation pattern for baddu are included.
Reviewed by: Daniel.Sanders@imgtec.com
llvm-svn: 205449
|
No functional change intended.
llvm-svn: 205446
|
No functional change intended.
llvm-svn: 205445
|
No functional change intended.
llvm-svn: 205444
|
No functional change intended.
llvm-svn: 205443
|
No functional change intended.
llvm-svn: 205442
|
No functional change intended.
llvm-svn: 205441
|
No functional change intended.
llvm-svn: 205440
|
No functional change intended.
llvm-svn: 205439
|
No functional change intended.
llvm-svn: 205438
|
No functional change intended.
llvm-svn: 205437
|
llvm-svn: 205435
|
Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav.
llvm-svn: 205430
|
Weak symbols cannot use the small code model's usual ADRP sequences since the
instruction simply may not be able to encode a value of 0.
This redirects them to use the GOT, which hopefully linkers are able to cope
with even in the static relocation model.
llvm-svn: 205426
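The decision described above can be pictured with a small sketch. The enum and function names are hypothetical, and the real lowering code considers more cases than weak linkage alone.

```cpp
#include <cassert>

// Hypothetical description of how a global's address is materialised.
enum class AddressVia {
  AdrpAdd, // PC-relative ADRP + ADD pair (small code model default)
  Got      // load the address from a GOT entry
};

// An undefined weak symbol resolves to address 0. ADRP is PC-relative with
// a limited range, so 0 may simply be unreachable from the code's location.
// A GOT entry stores a full absolute address and has no such restriction.
AddressVia classifySmallModelReference(bool isWeak) {
  return isWeak ? AddressVia::Got : AddressVia::AdrpAdd;
}
```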
|
We were creating libcall nodes that returned an MVT::f128, when these
particular operations actually return an int of some stripe.
llvm-svn: 205425
|
Again, coalescing and other optimisations swiftly made the MachineInstrs
consistent again, but when compiled at -O0 a bad INSERT_SUBREGISTER was
produced.
llvm-svn: 205423
|
The previous attempt was fine with optimisations, but was actually rather
cavalier with its types. When compiled at -O0, it produced invalid COPY
MachineInstrs.
llvm-svn: 205422
|
llvm-svn: 205421
|
ARM-specific optimization: find places in ARM machine code where two DMBs
follow one another, and eliminate one of them.
Patch by Reinoud Elhorst.
llvm-svn: 205409
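A minimal sketch of the idea, not the actual pass: it works on a hypothetical `Inst` record rather than on MachineInstrs, and it assumes that only a barrier with the same option as the one directly before it is redundant.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a machine instruction; not LLVM's MachineInstr.
struct Inst {
  std::string opcode;  // e.g. "dmb", "ldr", "str"
  std::string operand; // for "dmb": the barrier option, e.g. "ish"
};

// Removes every DMB that immediately follows another DMB with the same
// option, and returns how many barriers were dropped.
std::size_t removeBackToBackDmbs(std::vector<Inst> &block) {
  std::size_t removed = 0;
  std::vector<Inst> kept;
  kept.reserve(block.size());
  for (const Inst &inst : block) {
    const bool redundant = inst.opcode == "dmb" && !kept.empty() &&
                           kept.back().opcode == "dmb" &&
                           kept.back().operand == inst.operand;
    if (redundant) {
      ++removed;
      continue;
    }
    kept.push_back(inst);
  }
  block = std::move(kept);
  return removed;
}
```

Barriers separated by any other instruction are left alone, since the intervening memory access may be what the second barrier is ordering.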
|
and isTargetCygwin() to isTargetWindowsCygwin() to be consistent with the
four Windows environments in Triple.h.
Suggestion by Saleem Abdulrasool!
llvm-svn: 205393
|
framework works (for the compiler part), since the design
document is not available.
llvm-svn: 205379
|
llvm-svn: 205352
|
Environment == Triple::MSVC so it will never be MinGW or Cygwin.
llvm-svn: 205349
|
This provides an initial implementation of getUnrollingPreferences for x86.
getUnrollingPreferences is used by the generic (concatenation) unroller, which
is distinct from the unrolling done by the loop vectorizer. Many modern x86
cores have some kind of uop cache and loop-stream detector (LSD) used to
efficiently dispatch small loops, and taking full advantage of this requires
unrolling small loops (small here means tens of uops).
These caches also have limits on the number of taken branches in the loop, and
so we also cap the loop unrolling factor based on the maximum "depth" of the
loop. This is currently calculated with a partial DFS traversal (partial
because it will stop early if the path length grows too much). This is still an
approximation, and one that is both conservative (because it does not account
for branches eliminated via block placement) and optimistic (because it is only
recording the maximum depth over minimum paths). Nevertheless, because the
loops that fit in these uop caches are so small, it is not clear how much the
details matter.
The original set of patches posted for review produced the following test-suite
performance results (from the TSVC benchmark) at that time:
ControlLoops-dbl - 13% speedup
ControlLoops-flt - 15% speedup
Reductions-dbl - 7.5% speedup
llvm-svn: 205348
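The depth-based cap described above might be approximated as in the sketch below. It is a simplification under stated assumptions: the loop body is modelled as an acyclic successor list with the back edge omitted, the depth is the longest path from the header rather than the "maximum depth over minimum paths" the commit computes, and `maxTakenBranches` is a hypothetical tuning parameter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Loop body as successor lists; block 0 is the header, back edge omitted.
using Graph = std::vector<std::vector<std::size_t>>;

// Depth-first walk that returns the longest path length (in blocks) starting
// at `block`. The walk is partial: once a path reaches `limit` blocks it
// stops descending and reports `limit`.
unsigned boundedDepth(const Graph &g, std::size_t block, unsigned depthSoFar,
                      unsigned limit) {
  const unsigned depth = depthSoFar + 1;
  if (depth >= limit)
    return limit;
  unsigned best = depth;
  for (std::size_t succ : g[block])
    best = std::max(best, boundedDepth(g, succ, depth, limit));
  return best;
}

// Caps the requested unroll count so that (count * depth) stays within the
// assumed taken-branch budget. Never returns less than 1.
unsigned cappedUnrollCount(const Graph &g, unsigned requested,
                           unsigned maxTakenBranches) {
  assert(!g.empty() && maxTakenBranches > 0);
  const unsigned depth = boundedDepth(g, 0, 0, maxTakenBranches);
  const unsigned cap = std::max(1u, maxTakenBranches / depth);
  return std::min(requested, cap);
}
```

The early cutoff bounds the depth of the walk, not the total work on graphs with many diamonds, which is acceptable here only because the loops of interest are very small.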
|
Adds the Octeon cnMips instructions "load multiplier register MPLx" and "load product register Px".
Includes tests.
Reviewed by: Daniel.Sanders@imgtec.com
llvm-svn: 205343
|
Identical to Win32 method except the GS segment register is used for TLS
instead of FS and pvArbitrary is at TEB offset 0x28 instead of 0x14.
llvm-svn: 205342
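The difference between the two targets comes down to a segment register and an offset, which can be written out as a small table. The type and function names below are hypothetical; only the FS/GS choice and the 0x14/0x28 offsets come from the message above.

```cpp
#include <cassert>

// Hypothetical description of where pvArbitrary lives in the TEB.
enum class SegReg { FS, GS };

struct TebSlot {
  SegReg segment;  // segment register that addresses the TEB
  unsigned offset; // byte offset of pvArbitrary within the TEB
};

// Win32 reaches the TEB through FS, Win64 through GS; the field offset
// differs because the preceding TEB fields are pointer-sized.
TebSlot pvArbitrarySlot(bool is64Bit) {
  return is64Bit ? TebSlot{SegReg::GS, 0x28} : TebSlot{SegReg::FS, 0x14};
}
```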
|
to reflect its current functionality.
Based on Takumi NAKAMURA's suggestion.
llvm-svn: 205338
|
ThumbLE/ThumbBE
llvm-svn: 205317
|
llvm-svn: 205314
|
Suggestion from Yaron Keren.
llvm-svn: 205313
|
The Cyclone CPU is similar to Swift for most LLVM purposes, but does have two
preferred instructions for zeroing a VFP register. This teaches LLVM about
them.
llvm-svn: 205309
|
This is for consistency with other functions. The Parse* functions consume
tokens and the Match* functions don't.
No functional change.
llvm-svn: 205305
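The naming convention can be illustrated with a toy token stream. Everything here is hypothetical (`TokenStream`, `matchRegisterName`, `parseRegister`) and far simpler than a real assembly parser; the point is only that a Match* function inspects while a Parse* function advances.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal token stream.
class TokenStream {
  std::vector<std::string> tokens_;
  std::size_t pos_ = 0;

public:
  explicit TokenStream(std::vector<std::string> tokens)
      : tokens_(std::move(tokens)) {}

  bool atEnd() const { return pos_ >= tokens_.size(); }
  std::size_t position() const { return pos_; }

  const std::string &peek() const {
    assert(!atEnd());
    return tokens_[pos_];
  }
  void consume() {
    assert(!atEnd());
    ++pos_;
  }
};

// Match*: inspects the current token and never advances the stream.
bool matchRegisterName(const TokenStream &ts) {
  return !ts.atEnd() && ts.peek().size() > 1 && ts.peek()[0] == '$';
}

// Parse*: consumes the token on success; on failure the stream is untouched.
bool parseRegister(TokenStream &ts, std::string &name) {
  if (!matchRegisterName(ts))
    return false;
  name = ts.peek().substr(1); // drop the leading '$'
  ts.consume();
  return true;
}
```

Keeping the two roles separate means a caller can always tell from the name whether the stream position may have changed.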
|
implicitly. No functional change intended.
llvm-svn: 205304
|
llvm-svn: 205302
|
llvm-svn: 205301
|
Summary:
This should fix the issues that D3222 caused in lld. The testcase is based on
the one that failed on the buildbot.
Depends on D3233
Reviewers: matheusalmeida, vmedic
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3234
llvm-svn: 205298
|
Summary:
Register parsing no longer consumes the $ token before confirming that a register actually follows it, so symbols can still be matched even when registers are tried first.
Depends on D3232
Reviewers: matheusalmeida, vmedic
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D3233
llvm-svn: 205297
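A rough sketch of why the ordering now works, using hypothetical helpers on a plain token vector: the register attempt looks one token past the `$` and commits only if that token is a known register, so a failed attempt leaves the `$` in place for the symbol parser. The register names listed are placeholders, not the real MIPS register table.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

// Placeholder register table (an assumption for this example only).
static bool isRegisterName(const std::string &s) {
  static const std::set<std::string> names = {"zero", "sp", "ra"};
  return names.count(s) != 0;
}

// Tries to parse "$" followed by a register name. `pos` is advanced only
// when the whole register has been recognised.
bool tryParseRegister(const Tokens &toks, std::size_t &pos) {
  if (pos + 1 >= toks.size() || toks[pos] != "$" ||
      !isRegisterName(toks[pos + 1]))
    return false; // nothing consumed, "$" is still available
  pos += 2;
  return true;
}

// Parses "$" followed by any identifier as a symbol such as "$tmp".
bool tryParseSymbol(const Tokens &toks, std::size_t &pos, std::string &sym) {
  if (pos + 1 >= toks.size() || toks[pos] != "$")
    return false;
  sym = "$" + toks[pos + 1];
  pos += 2;
  return true;
}
```

With this shape, trying `tryParseRegister` first and falling back to `tryParseSymbol` handles both `$sp` and `$tmp`.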
|