| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
target pointer type.
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering.
llvm-svn: 154310
|
| |
| |
x86 addressing modes. This allows PIE-based TLS offsets to fit directly
into an addressing mode immediate offset, which is the last remaining
code quality issue from PR12380. With this patch, that PR is completely
fixed.
To understand why this patch is correct to match these offsets into
addressing mode immediates, break it down by cases:
1) 32-bit is trivially correct, and unmodified here.
2) 64-bit non-small mode is unchanged and never matches.
3) 64-bit small PIC code which is RIP-relative is handled specially in
the match to try to fit RIP into the base register. If it fails, it
now early exits. This behavior is unchanged by the patch.
4) 64-bit small non-PIC code which is not RIP-relative continues to work
as it did before. These immediates are safe because the
ABI ensures they fit in small mode. This behavior is unchanged.
5) 64-bit small PIC code which is *not* using RIP-relative addressing.
This is the only case changed by the patch, and the primary place you
see it is in TLS, either the win64 section offset TLS or Linux
local-exec TLS model in a PIC compilation. Here the ABI again ensures
that the immediates fit because we are in small mode, and any other
operations required due to the PIC relocation model have been handled
externally to the Wrapper node (extra loads etc are made around the
wrapper node in ISelLowering).
I've tested this as much as I can comparing it with GCC's output, and
everything appears safe. I discussed this with Anton and it made sense
to him at least at face value. That said, if there are issues with PIC
code after this patch, yell and we can revert it.
llvm-svn: 154304
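The five cases in the message above can be summarized as a small decision function. This is only an illustrative sketch, not the actual X86 instruction selection code: the function name, its parameters, and the string value for the code model are all hypothetical.

```python
def can_fold_symbol_offset(is_64bit, code_model, rip_relative,
                           base_reg_free=True):
    """Hypothetical summary of the five cases from the commit message.

    Returns True when a wrapped symbolic offset may be folded into the
    addressing-mode immediate.
    """
    if not is_64bit:
        return True              # case 1: 32-bit is trivially correct
    if code_model != "small":
        return False             # case 2: non-small 64-bit never matches
    if rip_relative:
        # case 3: RIP has to go in the base register, otherwise exit early
        return base_reg_free
    # cases 4 and 5: small mode, not RIP-relative (non-PIC or PIC alike);
    # the ABI guarantees that the immediate fits
    return True
```

Case 5 (small PIC code that is not RIP-relative, such as local-exec TLS) is the only path whose result the patch changes; under this sketch it falls through to the final `return True`.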
|
| |
| |
optimizations which are valid for position independent code being linked
into a single executable, but not for such code being linked into
a shared library.
I discussed the design of this with Eric Christopher, and the decision
was to support an optional bit rather than a completely separate
relocation model. Fundamentally, this is still PIC relocation; it's just
that certain optimizations are only valid under a PIC relocation model
when the resulting code won't be in a shared library. The simplest path
there is to expose a single-bit option in the TargetOptions. If folks
have different/better designs, I'm all ears. =]
I've included the first optimization based upon this: changing TLS
models to the *Exec models when PIE is enabled. This is the LLVM
component of PR12380 and is all of the hard work.
llvm-svn: 154294
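A rough sketch of the TLS model change described above, assuming a single boolean PIE bit alongside the PIC relocation model. The function name and parameters are hypothetical, and the split between local and non-local symbols is an assumption added for illustration; the message itself only says that the models change to the *Exec variants under PIE.

```python
def select_tls_model(is_pic, is_pie, is_local):
    """Hypothetical TLS model selection with an optional PIE bit."""
    if is_pic and not is_pie:
        # Code may end up in a shared library: keep the dynamic models.
        return "local-dynamic" if is_local else "general-dynamic"
    # Non-PIC code, or PIC code known to be linked into an executable:
    # the cheaper *Exec models are valid.
    return "local-exec" if is_local else "initial-exec"
```

Modelling PIE as a bit on top of PIC, rather than as a separate relocation model, keeps every existing PIC check intact and only relaxes the decisions that depend on shared-library linking.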
|
| |
|
|
|
|
|
|
| |
in TargetLowering. There was already a FIXME about this location being
odd. The interface is simplified as a consequence. This will also make
it easier to change TLS models when compiling with PIE.
llvm-svn: 154292
|
| |
|
|
|
|
|
|
| |
Previously we used three instructions to broadcast an immediate value into a
vector register.
On Sandy Bridge we continue to load the broadcast value from the constant pool.
llvm-svn: 154284
|
| |
|
|
|
|
| |
remove patterns for selecting the intrinsic. The same was already done for AVX1.
llvm-svn: 154272
|
| |
|
|
|
|
| |
AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns.
llvm-svn: 154268
|
| |
| |
The tLDRr instruction with the last register operand set to the zero register
prints in assembly as if no register was specified, and the assembler encodes
it as a tLDRi instruction with a zero immediate. With the integrated assembler,
that zero register gets emitted as "r0", so we get "ldr rx, [ry, r0]" which
is broken. Emit the instruction as tLDRi with a zero immediate. I don't
know if there's a good way to write a testcase for this. Suggestions welcome.
Opportunities for follow-up work:
1) The asm printer should complain if a non-optional register operand is set
to the zero register, instead of silently dropping it.
2) The integrated assembler should complain in the same situation, instead of
silently emitting the operand as "r0".
llvm-svn: 154261
|
| |
|
|
|
|
|
| |
Cygwin-1.7 supports dw2. Some recent mingw distros support it, too.
I have confirmed test-suite/SingleSource/Benchmarks/Shootout-C++/except.cpp can pass on Cygwin.
llvm-svn: 154247
|
| |
| |
by default.
This is a behaviour configurable in the MCAsmInfo. I've decided to turn
it on by default in (possibly optimistic) hopes that most assemblers are
reasonably sane. If this proves a problem, switching the default to off seems
reasonable.
I'm not sure if this is the opportune place to test, but it seemed good
to make sure it was tested somewhere.
llvm-svn: 154235
|
| |
|
|
| |
llvm-svn: 154226
|
| |
|
|
| |
llvm-svn: 154210
|
| |
|
|
|
|
|
| |
After register masks were introduced to represent the call clobbers, it
is no longer necessary to have duplicate instructions for iOS.
llvm-svn: 154209
|
| |
|
|
|
|
| |
which exists for this purpose.
llvm-svn: 154199
|
| |
|
|
|
|
|
| |
ARM and Thumb2 mode can use cmn instructions to compare against negative
immediates. Thumb1 mode can't.
llvm-svn: 154183
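The restriction above can be illustrated with a small selector. This is a sketch only: the function name and mode strings are hypothetical, and immediate encodability checks are deliberately left out.

```python
def select_compare(imm, mode):
    """Pick a flag-setting compare for `reg == imm` (hypothetical helper).

    `cmn reg, #n` sets flags from reg + n, so it compares reg against -n.
    """
    if imm >= 0:
        return ("cmp", imm)
    if mode in ("arm", "thumb2"):
        return ("cmn", -imm)
    # Thumb1 cannot use cmn here, so the caller would have to
    # materialize the constant in a register instead.
    return None
```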
|
| |
|
|
| |
llvm-svn: 154171
|
| |
|
|
|
|
| |
a single source. This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413.
llvm-svn: 154166
|
| |
|
|
|
|
|
|
| |
We had special instructions for iOS because r9 is call-clobbered, but
that is represented dynamically by the register mask operands now, so
there is no need for the pseudo-instructions.
llvm-svn: 154144
|
| |
| |
The load/store optimizer splits LDRD/STRD into two instructions when the
register pairing doesn't work out. For negative offsets in Thumb2, it uses
t2STRi8 to do that. That's fine, except for the case when the offset is in
the range [-4,-1]. In that case, we'll also form a second t2STRi8 with
the original offset plus 4, resulting in a t2STRi8 with a non-negative
offset, which ends up as if it were an STRT, which is completely bogus.
Similarly for loads.
No testcase, unfortunately, as any I've been able to construct is both large
and extremely fragile.
rdar://11193937
llvm-svn: 154141
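The offset problem described above is easy to see in a small model. The sketch below is illustrative and not the load/store optimizer's code: it assumes the fix is to choose the addressing form per half, with "i8" standing for the negative-offset form and "i12" for the non-negative one.

```python
def split_pair_access(offset):
    """Split a paired access at `offset` into two word accesses.

    Each half gets its own addressing form (hypothetical labels):
    "i8" for negative offsets, "i12" for non-negative ones.
    """
    def form(off):
        return "i8" if off < 0 else "i12"

    second = offset + 4
    return [(form(offset), offset), (form(second), second)]
```

For an offset of -4 the first half stays in the negative form while the second half lands at offset 0, which is the case where reusing the negative-offset opcode for both halves goes wrong.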
|
| |
|
|
|
|
|
|
|
|
| |
'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out.
Also add Thumb1 aliases for adding a negative immediate to the stack
pointer.
rdar://11192734
llvm-svn: 154123
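The alias amounts to a simple rewrite of the parsed instruction. A minimal sketch, assuming a hypothetical tuple representation of (mnemonic, register, immediate) rather than the assembler's real operand classes:

```python
def normalize_add_immediate(mnemonic, reg, imm):
    """Rewrite an add of a negative immediate as a sub of its magnitude."""
    if mnemonic == "add" and imm < 0:
        return ("sub", reg, -imm)
    return (mnemonic, reg, imm)
```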
|
| |
|
|
|
|
| |
some corner cases involving the PC register as an operand for these instructions.
llvm-svn: 154101
|
| |
|
|
| |
llvm-svn: 154100
|
| |
|
|
|
|
| |
rdar://11189467
llvm-svn: 154087
|
| |
|
|
|
|
|
|
| |
Plain 'cpsr' is an alias for 'cpsr_fc'.
rdar://11153753
llvm-svn: 154080
|
| |
|
|
| |
llvm-svn: 154062
|
| |
|
|
| |
llvm-svn: 154054
|
| |
|
|
|
|
| |
types for N32 ABI. Add new test case and update existing ones.
llvm-svn: 154038
|
| |
|
|
|
|
|
| |
types for N32 ABI. Test case will be updated after the patch that fixes
TargetLowering::getPICJumpTableRelocBase is checked in.
llvm-svn: 154036
|
| |
|
|
|
|
| |
types for N32 ABI and update test case.
llvm-svn: 154034
|
| |
|
|
|
|
|
|
|
|
| |
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.
<rdar://problem/11182914>
llvm-svn: 154033
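Commuting a conditional move relies on the identity "cond ? a : b" equals "!cond ? b : a". The sketch below models this with the standard ARM condition-code pairs; the tuple representation and function names are hypothetical.

```python
# Each ARM condition code paired with its inverse.
INVERSE = {
    "eq": "ne", "ne": "eq", "hs": "lo", "lo": "hs",
    "mi": "pl", "pl": "mi", "vs": "vc", "vc": "vs",
    "hi": "ls", "ls": "hi", "ge": "lt", "lt": "ge",
    "gt": "le", "le": "gt",
}

def commute_movcc(cond, true_val, false_val):
    """Swap the two source operands and invert the condition."""
    return (INVERSE[cond], false_val, true_val)

def select(movcc, cond_holds):
    """Evaluate a (cond, true_val, false_val) triple.

    `cond_holds` is a predicate that tells whether a condition code is
    satisfied by the current flags.
    """
    cond, true_val, false_val = movcc
    return true_val if cond_holds(cond) else false_val
```

Because both forms compute the same value, the register allocator is free to pick whichever operand order avoids a copy.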
|
| |
|
|
|
|
| |
types for N32 ABI and update test case.
llvm-svn: 154031
|
| |
|
|
|
|
|
|
| |
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.
llvm-svn: 154011
|
| |
|
|
|
|
|
| |
And indirectly, a dependency on most of the core LLVM optimization
libraries.
llvm-svn: 153957
|
| |
|
|
|
|
|
|
| |
to issue calls via
the PLT when LLVM is built as a shared library. This mirrors the approach taken by the X86 backend.
llvm-svn: 153938
|
| |
|
|
| |
llvm-svn: 153935
|
| |
|
|
|
|
| |
lib/Target/Mips/Disassembler.
llvm-svn: 153926
|
| |
|
|
| |
llvm-svn: 153925
|
| |
|
|
|
|
| |
Patch by Vladimir Medic.
llvm-svn: 153924
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch allows LLVM to recognize that a 64-bit object file is being produced
and ensures that the subsequently generated ELF header has the correct information.
The test case checks for both big and little endian flavors.
Patch by Jack Carter.
llvm-svn: 153889
|
| |
|
|
| |
llvm-svn: 153886
|
| |
|
|
| |
llvm-svn: 153876
|
| |
|
|
|
|
|
|
| |
MCInstPrinter.
All implementations used the same code.
llvm-svn: 153866
|
| |
|
|
|
|
| |
using the instruction name table from MCInstrInfo. Reduces static data in the InstPrinter implementations.
llvm-svn: 153863
|
| |
|
|
|
|
| |
getInstructionName and the static data it contains since the same tables are already in MCInstrInfo.
llvm-svn: 153860
|
| |
|
|
| |
llvm-svn: 153852
|
| |
|
|
| |
llvm-svn: 153851
|
| |
|
|
| |
llvm-svn: 153850
|
| |
| |
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
(and also scalar_to_vector).
2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))
3. Optimize swizzles of shuffles: shuff(shuff(x, y), undef) -> shuff(x, y).
4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.
Code which was previously compiled to this:
  movd (%rsi), %xmm0
  movdqa .LCPI0_0(%rip), %xmm2
  pshufb %xmm2, %xmm0
  movd (%rdi), %xmm1
  pshufb %xmm2, %xmm1
  pxor %xmm0, %xmm1
  pshufb .LCPI0_1(%rip), %xmm1
  movd %xmm1, (%rdi)
  ret
Now compiles to this:
  movl (%rsi), %eax
  xorl %eax, (%rdi)
  ret
llvm-svn: 153848
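Items 1 and 2 above depend on bitwise operations being lane-position independent. The following sketch checks both identities on plain Python lists and integers; the helper names are hypothetical and the code only models the algebra, not the DAG combiner.

```python
def shuffle(vec, mask):
    """Single-source shuffle (swizzle): pick vec[i] for each mask index."""
    return [vec[i] for i in mask]

def xor_lanes(a, b):
    """Lane-wise xor of two equally sized vectors."""
    return [x ^ y for x, y in zip(a, b)]

def bitcast_to_int(byte_lanes):
    """Reinterpret a vector of bytes as one little-endian integer."""
    return int.from_bytes(bytes(byte_lanes), "little")

a = [0x12, 0x34, 0x56, 0x78]
b = [0xF0, 0x0F, 0xAA, 0x55]
mask = [3, 3, 0, 1]

# Item 2: xor(shuff(A), shuff(B)) == shuff(xor(A, B)) for a shared mask.
same_after_shuffle = (
    xor_lanes(shuffle(a, mask), shuffle(b, mask))
    == shuffle(xor_lanes(a, b), mask)
)

# Item 1: xor(bitcast(A), bitcast(B)) == bitcast(xor(A, B)).
same_after_bitcast = (
    bitcast_to_int(a) ^ bitcast_to_int(b)
    == bitcast_to_int(xor_lanes(a, b))
)
```

The same argument holds for and/or, since each output bit depends only on the two input bits at the same position.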
|
| |
|
|
|
|
|
| |
The 440 and A2 cores have detailed itineraries, and this allows them to be
fully used to maximize throughput.
llvm-svn: 153845
|
| |
|
|
| |
llvm-svn: 153844
|