| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
| |
register support. Test case included.
Contributor: Vladimir Medic
llvm-svn: 163268
|
|
|
|
| |
llvm-svn: 163263
|
|
|
|
| |
llvm-svn: 163258
|
|
|
|
|
|
| |
MachineInstr.
llvm-svn: 163257
|
|
|
|
| |
llvm-svn: 163256
|
|
|
|
|
|
| |
ArchiveMemberHeader. Found by gcc48 -Wcast-qual.
llvm-svn: 163255
|
|
|
|
|
|
| |
of its constness. Found by gcc48 -Wcast-qual.
llvm-svn: 163254
|
|
|
|
|
|
| |
the SubtargetInfoKV tables. Found by gcc48 -Wcast-qual.
llvm-svn: 163251
|
|
|
|
|
|
| |
by casting. Found with gcc48.
llvm-svn: 163247
|
|
|
|
| |
llvm-svn: 163243
|
|
|
|
|
|
|
|
| |
Since TOC is just defined for PPC64, move its definition to PPC64 td file.
Patch by Adhemerval Zanella.
llvm-svn: 163234
|
|
|
|
|
|
| |
inteldialect.
llvm-svn: 163231
|
|
|
|
|
|
|
|
| |
Previous patch accidentally decided it couldn't convert a VFP to a
NEON instruction after it had already destroyed the old one. Not a
good move.
llvm-svn: 163230
|
|
|
|
| |
llvm-svn: 163225
|
|
|
|
|
|
|
| |
Make sure to return a pointer into the target memory, not the local memory.
Often they are the same, but we can't assume that.
llvm-svn: 163217
|
|
|
|
|
|
|
|
| |
It relies on clear() being fast and the cache rarely has more than 1 or 2
elements, so give it an inline capacity and always shrink it back down in case
it grows. DenseMap will grow to 64 buckets which makes clear() a lot slower.
llvm-svn: 163215
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
subreg_hireg of register pair Rp.
* lib/Target/Hexagon/HexagonPeephole.cpp(PeepholeDoubleRegsMap): New
DenseMap similar to PeepholeMap that additionally records subreg info
too.
(runOnMachineFunction): Record information in PeepholeDoubleRegsMap
and copy propagate the high sub-reg of Rp0 in Rp1 = lsr(Rp0, #32) to
the instruction Rx = COPY Rp1:logreg_subreg.
* test/CodeGen/Hexagon/remove_lsr.ll: New test.
llvm-svn: 163214
|
|
|
|
| |
llvm-svn: 163205
|
|
|
|
|
|
| |
types. The previous code assumed that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value is at least as long as the vector lane type, and then use only as many fields as the splat value provides.
llvm-svn: 163203
|
|
|
|
|
|
| |
Reid Watson
llvm-svn: 163199
|
|
|
|
|
|
| |
insert_subvector into undef accomplishes the same thing.
llvm-svn: 163198
|
|
|
|
|
|
| |
Also add patterns to turn subvector inserts with loads to index 0 of an undef into VMOVAPS.
llvm-svn: 163196
|
|
|
|
|
|
| |
major release.
llvm-svn: 163195
|
|
|
|
| |
llvm-svn: 163194
|
|
|
|
| |
llvm-svn: 163193
|
|
|
|
|
|
| |
build time. The same was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores.
llvm-svn: 163192
|
|
|
|
| |
llvm-svn: 163190
|
|
|
|
| |
llvm-svn: 163187
|
|
|
|
|
|
| |
Reader/Writer.
llvm-svn: 163185
|
|
|
|
| |
llvm-svn: 163184
|
|
|
|
| |
llvm-svn: 163181
|
|
|
|
|
|
|
|
| |
pointers-to-strong-pointers may be in play. These can lead to retains and
releases happening in unstructured ways, foiling the optimizer. This fixes
rdar://12150909.
llvm-svn: 163180
|
|
|
|
| |
llvm-svn: 163179
|
|
|
|
|
|
|
| |
Implicit uses can be dynamically tied to defs. This will soon be used
for predicated instructions on ARM.
llvm-svn: 163177
|
|
|
|
|
|
| |
class.
llvm-svn: 163175
|
|
|
|
|
|
|
| |
implementation does not co-exist well with how the sideeffect and alignstack
attributes are handled. This reverts r161641.
llvm-svn: 163174
|
|
|
|
|
|
| |
Don't set MadeChange to TRUE if BypassSlowDivision doesn't change anything.
llvm-svn: 163165
|
|
|
|
|
|
|
|
|
|
| |
Also a few minor changes:
- use pre-inc instead of post-inc
- use isa instead of dyn_cast
- 80 col
- trailing spaces
llvm-svn: 163164
|
|
|
|
| |
llvm-svn: 163154
|
|
|
|
|
|
|
|
|
|
| |
The MachineOperand::TiedTo field was maintained, but not used.
This patch enables it in isRegTiedToDefOperand() and
isRegTiedToUseOperand() which are the actual functions used by the
register allocator.
llvm-svn: 163153
|
|
|
|
| |
llvm-svn: 163152
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After much agonizing, use a full 4 bits of precious MachineOperand space
to encode this. This uses existing padding, and doesn't grow
MachineOperand beyond its current 32 bytes.
This allows tied defs among the first 15 operands on a normal
instruction, just like the current MCInstrDesc constraint encoding.
Inline assembly needs to be able to tie more than the first 15 operands,
and gets special treatment.
Tied uses can appear beyond 15 operands, as long as they are tied to a
def that's in range.
llvm-svn: 163151
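A minimal sketch of how a 4-bit field can tie a use to a def among the first 15 operands. The names `encodeTiedTo` and `decodeTiedTo` are hypothetical, and reserving the value 0 for "not tied" is an assumption made here to explain why 4 bits give 15 usable indices rather than 16; neither is taken from the MachineOperand source.

```cpp
#include <cassert>

// Hypothetical constants for illustration only.
constexpr unsigned TiedFieldBits = 4;
constexpr unsigned TiedFieldMax = (1u << TiedFieldBits) - 1; // 15

// Returns the field value for a def operand index, or 0 ("not tied",
// an assumed sentinel) when the index does not fit in the field.
inline unsigned encodeTiedTo(unsigned DefIdx) {
  return DefIdx < TiedFieldMax ? DefIdx + 1 : 0;
}

// Returns the def operand index, or -1 when the field says "not tied".
inline int decodeTiedTo(unsigned Field) {
  return Field == 0 ? -1 : static_cast<int>(Field) - 1;
}
```

With this scheme, def indices 0 through 14 round-trip through the field, while index 15 and above cannot be represented and would need the kind of special treatment the message mentions for inline assembly.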
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presence of the optimization when calculating
either the quotient, remainder, or both.
Patch by Tyler Nowicki!
llvm-svn: 163150
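The run-time test described in the list above can be sketched in plain C++ as follows. This is an illustration only, not the pass itself: `bypassSlowDiv` and `DivRem` are hypothetical names, checking both operands with a single OR-and-mask is an assumption about how "positive and less than 256" is tested, and the caller is assumed to guarantee a non-zero divisor.

```cpp
#include <cassert>
#include <cstdint>

// Quotient and remainder are computed together so that a later DIV or REM
// on the same operands can reuse the result.
struct DivRem {
  int32_t Quot;
  int32_t Rem;
};

// Hypothetical helper: take the cheap 8-bit path when both operands are
// non-negative and below 256, otherwise fall back to the 32-bit divide.
inline DivRem bypassSlowDiv(int32_t A, int32_t B) {
  const uint32_t Bits = static_cast<uint32_t>(A) | static_cast<uint32_t>(B);
  if ((Bits & ~0xFFu) == 0) {
    // Fast path: stands in for the 8-bit DIVB.
    const uint8_t X = static_cast<uint8_t>(A);
    const uint8_t Y = static_cast<uint8_t>(B);
    return DivRem{static_cast<int32_t>(X / Y), static_cast<int32_t>(X % Y)};
  }
  // Slow path: stands in for the 32-bit IDIV.
  return DivRem{A / B, A % B};
}
```

Returning both values from each path mirrors the point made above: once the divide is split across basic blocks, CSE can no longer merge a separate DIV and REM, so each block produces both results.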
|
|
|
|
|
|
|
| |
Change current Hexagon MI scheduler to use new converging
scheduler. Integrates DFA resource model into it.
llvm-svn: 163137
|
|
|
|
|
|
|
|
|
|
|
| |
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.
Bug 12213
Patch by Yin Ma!
llvm-svn: 163136
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of 4.
Since this specific shuffle is widely used in many workloads, we gain ~10% performance on them.
shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14>
vmovaps (%rdx), %ymm0
vshufps $8, %ymm0, %ymm0, %ymm0
vmovaps (%rcx), %ymm1
vshufps $8, %ymm0, %ymm1, %ymm1
vunpcklps %ymm0, %ymm1, %ymm0
vmovaps (%rcx), %ymm0
vmovsldup (%rdx), %ymm1
vblendps $85, %ymm0, %ymm1, %ymm0
llvm-svn: 163134
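As a sanity check on the shorter three-instruction sequence above, the following scalar sketch models the shufflevector mask and the vmovsldup/vblendps pair on 8-element arrays and compares the results. The helper names are hypothetical, and it assumes %A is the vector loaded from (%rcx) and %B the one loaded from (%rdx), which the message does not state.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using V8 = std::array<float, 8>;

// Mask from the shufflevector in the message: even lanes of A interleaved
// with even lanes of B.
const std::array<int, 8> EvenInterleaveMask = {{0, 8, 2, 10, 4, 12, 6, 14}};

// Builds {Start, Start+1, ..., Start+7} as test data.
inline V8 iota(float Start) {
  V8 R{};
  for (std::size_t I = 0; I < 8; ++I)
    R[I] = Start + static_cast<float>(I);
  return R;
}

// Scalar model of shufflevector: indices 0-7 select from A, 8-15 from B.
inline V8 shuffleVector(const V8 &A, const V8 &B,
                        const std::array<int, 8> &Mask) {
  V8 R{};
  for (std::size_t I = 0; I < 8; ++I) {
    const std::size_t Idx = static_cast<std::size_t>(Mask[I]);
    R[I] = Idx < 8 ? A[Idx] : B[Idx - 8];
  }
  return R;
}

// Scalar model of vmovsldup: each odd lane copies the even lane below it.
inline V8 movsldup(const V8 &S) {
  V8 R{};
  for (std::size_t I = 0; I < 8; ++I)
    R[I] = S[I & ~static_cast<std::size_t>(1)];
  return R;
}

// Scalar model of vblendps: bit I of Imm set selects Src2[I], else Src1[I].
inline V8 blendps(const V8 &Src1, const V8 &Src2, unsigned Imm) {
  V8 R{};
  for (std::size_t I = 0; I < 8; ++I)
    R[I] = ((Imm >> I) & 1u) ? Src2[I] : Src1[I];
  return R;
}
```

The immediate 85 is 0b01010101, so the blend takes A in the even lanes and the duplicated even elements of B in the odd lanes, which is exactly the lane order the mask asks for.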
|
|
|
|
|
|
|
|
|
|
| |
Scan the body of the loop and find instructions that may trap.
Use this information when deciding if it is safe to hoist or sink instructions.
Notice that we can optimize the search of instructions that may throw in the case of nested loops.
rdar://11518836
llvm-svn: 163132
|
|
|
|
| |
llvm-svn: 163131
|
|
|
|
|
|
|
|
|
|
|
| |
by instruction address from DWARF.
Add --inlining flag to llvm-dwarfdump to demonstrate and test this functionality,
so that "llvm-dwarfdump --inlining --address=0x..." now works much like
"addr2line -i 0x...", provided that the binary has debug info
(Clang's -gline-tables-only *is* enough).
llvm-svn: 163128
|
|
|
|
|
|
|
|
|
|
| |
If an allocation has a must-alias relation to the access pointer, we treat it
as a Def. Otherwise, without this check, the code here was just skipping over
the allocation call and ignoring it. I noticed this by inspection and don't
have a specific testcase that it breaks, but it seems like we need to treat
a may-alias allocation as a Clobber.
llvm-svn: 163127
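The classification described above can be sketched as a small mapping from alias result to dependence kind. The enums and `classifyAllocation` are hypothetical stand-ins rather than the real LLVM types, and treating a no-alias allocation as "keep scanning" is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-ins for the alias query result and the dependence kind.
enum class AliasKind { NoAlias, MayAlias, MustAlias };
enum class DepKind { None, Clobber, Def };

// How an allocation call is classified relative to the access pointer.
inline DepKind classifyAllocation(AliasKind AK) {
  switch (AK) {
  case AliasKind::MustAlias:
    return DepKind::Def;     // the allocation defines the accessed memory
  case AliasKind::MayAlias:
    return DepKind::Clobber; // conservatively stop here instead of skipping
  case AliasKind::NoAlias:
    return DepKind::None;    // assumed: unrelated, keep scanning
  }
  return DepKind::None;
}
```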
|