| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
The terminal barrier of a cmpxchg expansion will be either Acquire or
SequentiallyConsistent. In either case it can be skipped if the
operation has Monotonic requirements on failure.
rdar://problem/15996804
llvm-svn: 205535
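For illustration only (this is a source-level sketch, not code from the patch): C++ `memory_order_relaxed` corresponds to what LLVM IR calls Monotonic, so a compare-exchange like the one below has a failure path that needs no trailing barrier. The helper name `tryClaim` and the "0 means unclaimed" convention are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper, not part of LLVM: claim a slot by swinging it 0 -> id.
// Success ordering is seq_cst; failure ordering is relaxed, which is what
// LLVM IR calls Monotonic.
bool tryClaim(std::atomic<int> &slot, int id) {
  assert(id != 0 && "0 is reserved to mean 'unclaimed'");
  int expected = 0;
  return slot.compare_exchange_strong(expected, id,
                                      std::memory_order_seq_cst,
                                      std::memory_order_relaxed);
}
```

A second call on an already claimed slot fails and leaves the stored value untouched.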
|
| |
|
|
|
|
| |
Differential Revision: http://llvm-reviews.chandlerc.com/D3141
llvm-svn: 205532
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds the 'mips4' processor and a simple test of the ELF e_flags.
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL
I made one small change to the testcase so that it uses
mips64-unknown-linux instead of mips4-unknown-linux.
This patch indirectly adds FeatureCondMov to FeatureMips64. This is ok
because it's supposed to be there anyway and it turns out that
FeatureCondMov is not a predicate of any instructions at the moment
(this is a bug that hasn't been noticed because there are no targets
without the conditional move instructions yet).
CC: theraven
Differential Revision: http://llvm-reviews.chandlerc.com/D3244
llvm-svn: 205530
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
llc doesn't generate nodes for unconditional fall-through branches for targets
without FastISel implementation (X86 has it, but can be disabled by
"-fast-isel=false") in SelectionDAGBuilder::visitBr().
So for line 4 in the following testcase
1: void foo(int i){
2: switch(i){
3: default:
4: break;
5: }
6: return;
7: }
there is no corresponding line in the .debug_line section, and a debugger
cannot set a breakpoint at line 4.
Fix this by always emitting a branch when we're not optimizing and add a
testcase to ensure that there's code on every line we'd want to break.
Patch by Daniil Fukalov.
llvm-svn: 205529
|
| |
|
|
|
|
| |
Differential Revision: http://llvm-reviews.chandlerc.com/D3245
llvm-svn: 205528
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded
at MachineInstr emission time had grown to be extremely large and
involved, to account for the subtly different code needed for the
various flavours (8/16/32/64 bit, cmpxchg/add/minmax).
Moving this transformation into the IR clears up the code
substantially, and makes future optimisations much easier:
1. an atomicrmw followed by using the *new* value can be more
efficient. As an IR pass, simple CSE could handle this
efficiently.
2. Making use of cmpxchg success/failure orderings only has to be done
in one (simpler) place.
3. The common "cmpxchg; did we store?" idiom can be exposed to
optimisation.
I intend to gradually improve this situation within the ARM backend
and make sure there are no hidden issues before moving the code out
into CodeGen to be shared with at least ARM64/AArch64 (though I think
PPC & Mips could benefit too).
llvm-svn: 205525
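The first and third idioms in the list above look roughly like this at the C++ source level. This is a sketch under assumptions: the function names are hypothetical, and how a given compiler lowers them depends on the target and version.

```cpp
#include <atomic>
#include <cassert>

// atomicrmw followed by a use of the *new* value: fetch_add returns the old
// value, so the caller recomputes old + 1 afterwards.
int incrementAndGet(std::atomic<int> &counter) {
  return counter.fetch_add(1) + 1;
}

// "cmpxchg; did we store?": only the success flag of the exchange is used.
bool storeIfEqual(std::atomic<int> &cell, int expected, int desired) {
  return cell.compare_exchange_strong(expected, desired);
}

// Small usage check of both helpers.
void demo() {
  std::atomic<int> counter{41};
  assert(incrementAndGet(counter) == 42);

  std::atomic<int> cell{5};
  assert(storeIfEqual(cell, 5, 6));  // matched, so the store happened
  assert(!storeIfEqual(cell, 5, 7)); // cell now holds 6, so no store
}
```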
|
| |
|
|
|
|
|
| |
The trouble was in ARMAsmParser, in the ParseInstruction method: it assumed that ARM::R12 + 1 == ARM::SP.
That is wrong, since the ARM::<Register> codes are generated by tablegen and could be arbitrary numbers.
llvm-svn: 205524
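A minimal sketch of why arithmetic on generated enum values is fragile. Everything here is hypothetical (the enum values, the table, and the helper name) and is not the code from the fix; it only shows a lookup keyed on the hardware register number instead of the enum value.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for a tablegen-generated enum. The numeric values are
// deliberately unrelated to the hardware register numbers.
enum Reg : unsigned { NoReg = 0, SP = 3, R12 = 7, R11 = 9 };

struct RegEncoding {
  Reg reg;
  unsigned hwNumber; // architectural register number
};

// Hypothetical table mapping enum values to hardware numbers.
constexpr std::array<RegEncoding, 3> kEncodings = {
    {{R11, 11}, {R12, 12}, {SP, 13}}};

// Returns the register whose hardware number follows that of `r`, or NoReg
// if the table has no such register.
Reg nextByHardwareNumber(Reg r) {
  assert(r != NoReg && "expected a real register");
  for (const RegEncoding &cur : kEncodings) {
    if (cur.reg != r)
      continue;
    for (const RegEncoding &cand : kEncodings)
      if (cand.hwNumber == cur.hwNumber + 1)
        return cand.reg;
  }
  return NoReg;
}
```

With these made-up values, `R12 + 1` is 8, which is not `SP`, while the table lookup still finds `SP` as the successor of `R12`.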
|
| |
|
|
|
|
|
|
|
|
| |
type of the
add operation since extract_vector_elt can perform an extend operation. Get the input lane
type from the vector on which we're performing the vpaddl operation on and extend or
truncate it to the output type of the original add node.
llvm-svn: 205523
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
%highest(sym1 - sym2 + const) relocations. Remove "ABS_" from VK_Mips_HI
and VK_Mips_LO enums in MipsMCExpr, to be consistent with VK_Mips_HIGHER
and VK_Mips_HIGHEST.
This change also deletes test file test/MC/Mips/higher_highest.ll and moves
its CHECK's to the new test file test/MC/Mips/higher-highest-addressing.s.
The deleted file tests that R_MIPS_HIGHER and R_MIPS_HIGHEST relocations are
emitted in the .o file. Since it uses -force-mips-long-branch option, it was
created when MipsLongBranch's implementation was emitting R_MIPS_HIGHER and
R_MIPS_HIGHEST relocations in the .o file. It was disabled when MipsLongBranch
started to directly calculate offsets.
Differential Revision: http://llvm-reviews.chandlerc.com/D3230
llvm-svn: 205522
|
| |
|
|
|
|
|
|
|
|
| |
Switching between i32 and i64 based on the LHS type is a good idea in
theory, but pre-legalisation uses i64 regardless of our choice,
leading to potential ISel errors.
Should fix PR19294.
llvm-svn: 205519
|
| |
|
|
|
|
|
| |
We cannot use STACK_LIMIT, as it is not reserved for the compiler
by the C spec.
llvm-svn: 205516
|
| |
|
|
|
|
| |
This should fix PR19314.
llvm-svn: 205514
|
| |
|
|
|
|
|
|
|
|
|
| |
While we were encoding 64-bit values (data8) in the subrange itself,
using a 32-bit type for the subrange was still confusing gdb. Oh,
and make it unsigned too.
As the comment points out, this could be pushed into the frontend so
that it would be 32 or 64 bit as appropriate, etc.
llvm-svn: 205512
|
| |
|
|
|
|
|
|
| |
I should have read that comment a little more carefully. ;)
Regression test in the works, committing in the meantime to un-break people.
llvm-svn: 205511
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This has the following advantages:
* Less code.
* The old ELF implementation was wrong for non-relocatable objects.
* The old ELF implementation (and I think MachO) was wrong for thumb.
No current testcase since this is only used from MCJIT and it only uses
relocatable objects and I don't think it supports thumb yet.
llvm-svn: 205508
|
| |
|
|
|
|
| |
All existing users explicitly ask for an address or a file offset.
llvm-svn: 205503
|
| |
|
|
|
|
|
| |
This code is no longer useful, because we only compute and use the
IDom once. There is no benefit in caching it anymore.
llvm-svn: 205498
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vector loads
When a vector type legalizes to a larger vector type, and the target does not
support the associated extending load (or truncating store), then legalization
will scalarize the load (or store) resulting in an associated scalarization
cost. BasicTTI::getMemoryOpCost needs to account for this.
Between this, and r205487, PowerPC on the P7 with VSX enabled shows:
MultiSource/Benchmarks/PAQ8p/paq8p: 43% speedup
SingleSource/Benchmarks/BenchmarkGame/puzzle: 51% speedup
SingleSource/UnitTests/Vectorizer/gcc-loops 28% speedup
(some of these are new; some of these, such as PAQ8p, just reverse regressions
that VSX support would trigger)
llvm-svn: 205495
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r205479.
It turns out that nm does use addresses; it is just that every reasonable
relocatable ELF object has sections with address 0. I have no idea if objects
with non-zero section addresses exist in reality, but at least it shows that
llvm-nm should use the name address.
The added test includes an unusual .o file with non-zero section addresses. I
created it by hacking ELFObjectWriter.cpp.
Really sorry for the churn.
llvm-svn: 205493
|
| |
|
|
|
|
|
|
|
| |
TargetInstrInfo::findCommutedOpIndices to enable VFMA*231 commutation, rather
than abusing commuteInstruction.
Thanks very much for the suggestion guys!
llvm-svn: 205489
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
For a cast (extension, etc.), the current logic predicts a low cost if the
associated operation (keyed on the destination type) is legal (or promoted).
This is not true when the number of values required to legalize the type is
changing. For example, <8 x i16> being sign extended to <8 x i32> is not
generically cheap on PPC with VSX, even though sign extension to v4i32 is
legal, because two output v4i32 values are required compared to the single
v8i16 input value, and without custom logic in the target, this conversion will
scalarize.
llvm-svn: 205487
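The register-count argument above can be checked with a little arithmetic. The sketch below is a simplification that assumes fixed-width vector registers (128 bits in the example from the message); the function names are hypothetical and this is not the cost-model code from the patch.

```cpp
#include <cassert>

// Number of vector registers needed to hold numElts elements of eltBits bits
// each, given registers of regBits bits (rounding up).
unsigned partsNeeded(unsigned numElts, unsigned eltBits, unsigned regBits) {
  assert(regBits != 0 && "register width must be non-zero");
  return (numElts * eltBits + regBits - 1) / regBits;
}

// True when a cast changes how many registers the value occupies, which is
// the situation the message describes as not generically cheap.
bool castChangesPartCount(unsigned numElts, unsigned srcBits, unsigned dstBits,
                          unsigned regBits) {
  return partsNeeded(numElts, srcBits, regBits) !=
         partsNeeded(numElts, dstBits, regBits);
}
```

For the example in the message, eight 16-bit elements fit in one 128-bit register, while eight 32-bit elements need two.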
|
| |
|
|
|
|
|
|
| |
opportunities in the current basic block, rather than just the last one seen.
<rdar://problem/16478629>
llvm-svn: 205481
|
| |
|
|
|
|
|
|
|
|
|
| |
What llvm-nm prints depends on the file format. On ELF for example, if the
file is relocatable, it prints offsets. If it is not, it prints addresses.
Since it doesn't really need to care what it is that it is printing, use the
generic term value.
Fix or implement getSymbolValue to keep llvm-nm working.
llvm-svn: 205479
|
| |
|
|
|
|
|
|
|
|
| |
PPCTTI::getMemoryOpCost will now make use of BasicTTI::getMemoryOpCost to
calculate the base cost of the memory access, and then adjust on top of that.
There is no functionality change from this modification, but it will become
important so that PPCTTI can take advantage of scalarization information for which
BasicTTI::getMemoryOpCost will account in the near future.
llvm-svn: 205476
|
| |
|
|
|
|
| |
performing unary constant folding operations (r204737).
llvm-svn: 205474
|
| |
|
|
|
|
|
|
|
|
|
| |
on FMA3 memory operands. FMA3 instructions are VEX encoded, so they can load
from unaligned memory.
Testcase to follow, along with related patch.
<rdar://problem/16478629>
llvm-svn: 205472
|
| |
|
|
|
|
|
|
|
|
|
| |
This reverts commit r199244.
Conflicts:
include/llvm-c/lto.h
include/llvm/LTO/LTOCodeGenerator.h
lib/LTO/LTOCodeGenerator.cpp
llvm-svn: 205471
|
| |
|
|
|
|
| |
GetElementPtr opaque (r204739).
llvm-svn: 205468
|
| |
|
|
|
|
|
| |
Update the subtarget information for Windows on ARM. This enables using the MC
layer to target Windows on ARM.
llvm-svn: 205459
|
| |
|
|
|
|
| |
No functional change.
llvm-svn: 205458
|
| |
|
|
|
|
| |
There are no implementations of these for R600.
llvm-svn: 205455
|
| |
|
|
|
|
|
|
| |
Just pass a MachineInstr reference rather than an MBB iterator.
Creating a MachineInstr& is the first thing every implementation did
anyway.
llvm-svn: 205453
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Unlike other v6+ processors, cortex-m0 never supports unaligned accesses.
From the v6m ARM ARM:
"A3.2 Alignment support: ARMv6-M always generates a fault when an unaligned
access occurs."
rdar://16491560
llvm-svn: 205452
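As a source-level aside (an assumption-laden sketch, unrelated to the backend code this commit touches): on a core that faults on unaligned accesses, portable code usually reads from a possibly misaligned address through `memcpy` and leaves the choice of access sequence to the compiler. The helper name `loadU32` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads a 32-bit value from a possibly unaligned address. Going through
// memcpy avoids dereferencing a misaligned uint32_t pointer and lets the
// compiler pick an access sequence the target permits.
std::uint32_t loadU32(const unsigned char *p) {
  assert(p != nullptr);
  std::uint32_t value;
  std::memcpy(&value, p, sizeof(value));
  return value;
}
```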
|
| |
|
|
|
|
| |
No functional change, but more readable code.
llvm-svn: 205451
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Adds the instructions ext/ext32/cins/cins32.
It also changes pop/dpop to accept the two operand version and
adds a simple pattern to generate baddu.
Tests for the two operand versions (including baddu/dmul/dpop/pop)
and the code generation pattern for baddu are included.
Reviewed by: Daniel.Sanders@imgtec.com
llvm-svn: 205449
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 205446
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 205445
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 205444
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 205443
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 205442
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 205441
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 205440
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 205439
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 205438
|
| |
|
|
|
|
| |
No functional change intended.
llvm-svn: 205437
|
| |
|
|
| |
llvm-svn: 205435
|
| |
|
|
|
|
| |
Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav.
llvm-svn: 205430
|
| |
|
|
| |
llvm-svn: 205429
|
| |
|
|
|
|
|
|
|
|
| |
Weak symbols cannot use the small code model's usual ADRP sequences since the
instruction simply may not be able to encode a value of 0.
This redirects them to use the GOT, which hopefully linkers are able to cope
with even in the static relocation model.
llvm-svn: 205426
|
| |
|
|
|
|
|
| |
We were creating libcall nodes that returned an MVT::f128, when these
particular operations actually return an int of some stripe.
llvm-svn: 205425
|