Reviewed-by: Vincent Lejeune <vljn at ovi.com>
llvm-svn: 192812
llvm-svn: 192805
Before this patch we would assert when building llvm as multiple shared
libraries (cmake's BUILD_SHARED_LIBS). The problem was the line
if (T.AsmStreamerCtorFn == Target::createDefaultAsmStreamer)
which returns false because of -fvisibility-inlines-hidden. It is easy
to fix just this one case, but I decided to try to also make the
registration more strict. It looks like the old logic for ignoring
followup registration was just a temporary hack that outlived its
usefulness.
This patch converts the ifs to asserts, fixes the few cases that were
registering twice and makes sure all the asserts compare with null.
Thanks to Joerg for reporting the problem and reviewing the patch.
llvm-svn: 192803
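The stricter registration described in the commit above can be illustrated with a minimal, self-contained sketch. It is not the LLVM code: `ToyTarget`, `registerAsmStreamer`, `createStreamer`, `defaultStreamer` and `customStreamer` are hypothetical names, and the fallback-at-use-site design is an assumption. The point it shows is that an empty slot is represented by null, so registration never has to compare against the address of an inline function (a comparison that can fail across shared libraries under -fvisibility-inlines-hidden).

```cpp
#include <cassert>

// Hypothetical stand-in for a target description; not the real LLVM Target.
struct ToyTarget {
  using StreamerCtor = int (*)();
  // Null means "nothing registered yet". No default function pointer is
  // stored, so no address comparison against an inline function is needed.
  StreamerCtor AsmStreamerCtorFn = nullptr;
};

// Hypothetical constructors, used only to tell the two paths apart.
inline int defaultStreamer() { return 0; }
inline int customStreamer() { return 1; }

// Strict registration: a second registration is a programming error and
// trips the assert instead of being silently ignored.
inline void registerAsmStreamer(ToyTarget &T, ToyTarget::StreamerCtor Fn) {
  assert(Fn && "cannot register a null constructor");
  assert(!T.AsmStreamerCtorFn && "asm streamer already registered");
  T.AsmStreamerCtorFn = Fn;
}

// The default is chosen at the point of use, based on a null check only.
inline int createStreamer(const ToyTarget &T, ToyTarget::StreamerCtor Default) {
  return (T.AsmStreamerCtorFn ? T.AsmStreamerCtorFn : Default)();
}
```

With this shape, an unregistered `ToyTarget` falls back to `defaultStreamer`, and calling `registerAsmStreamer` twice on the same target aborts in a build with assertions enabled.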
value and unsigned saturating accumulate of signed value instructions.
llvm-svn: 192800
The input to an RxSBG operation can be narrower as long as the upper bits
are don't care.  This fixes a FIXME added in r192783.
llvm-svn: 192790
We previously used the default expansion to SELECT_CC, which in turn would
expand to "LHI; BRC; LHI".  In most cases it's better to use an IPM-based
sequence instead.
llvm-svn: 192784
We had a MCAsmInfoCOFF, but no common class for all the ELF MCAsmInfos before.
llvm-svn: 192760
No functionality change, but exposes the API so that codegen can use it too.
Patch by Katya Romanova.
llvm-svn: 192757
llvm-svn: 192752
llvm-svn: 192751
This changes the SelectionDAG scheduling preference to source
order. Soon, the SelectionDAG scheduler can be bypassed, saving
a nice chunk of compile time.
Performance differences that result from this change are often a
consequence of register coalescing. The register coalescer is far from
perfect. Bugs can be filed for deficiencies.
On x86 SandyBridge/Haswell, the source order schedule is often
preserved, particularly for small blocks.
Register pressure is generally improved over the SD scheduler's ILP
mode. However, we are still able to handle large blocks that require
latency hiding, unlike the SD scheduler's BURR mode. MI scheduler also
attempts to discover the critical path in single-block loops and
adjust heuristics accordingly.
The MI scheduler relies on the new machine model. This is currently
unimplemented for AVX, so we may not be generating the best code yet.
Unit tests are updated so they don't depend on SD scheduling heuristics.
llvm-svn: 192750
llvm-svn: 192743
scalar signed saturating negate instructions.
llvm-svn: 192733
PR17309
llvm-svn: 192730
- The type of the index used in extract_vector_elt or insert_vector_elt is
  supposed to be TLI.getVectorIdxTy(), which is the pointer type on most
  targets. It is better to truncate it (or zero-extend it, in case that type
  changes later) to the mask element type to guarantee that they match,
  instead of asserting that they do.
llvm-svn: 192722
- Lower signed division by constant powers of 2 to target-independent
  DAG operators instead of target-dependent ones, to support them better
  on targets where vector types are legal but shift operators on those
  types are illegal. E.g., on AVX, PSRAW is only available on <8 x i16>
  though <16 x i16> is a legal type.
llvm-svn: 192721
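The arithmetic behind lowering a signed division by a power of 2 to plain shifts and an add can be sketched on a scalar. This is an illustration only, assuming the usual sra/srl/add/sra sequence; it is not taken from the patch, and `sdivPow2` is a hypothetical name. A vector lowering would apply the same steps per lane.

```cpp
#include <cassert>
#include <cstdint>

// Divide x by 2^k (1 <= k <= 31), rounding toward zero like C++ '/'.
// All shifts are done on unsigned values so the behaviour is portable.
inline int32_t sdivPow2(int32_t x, unsigned k) {
  assert(k >= 1 && k <= 31 && "shift amount out of range");
  // Step 1 (sra by 31): all ones if x is negative, zero otherwise.
  uint32_t sign = x < 0 ? 0xFFFFFFFFu : 0u;
  // Step 2 (srl by 32 - k): bias of 2^k - 1 for negative x, 0 otherwise.
  uint32_t bias = sign >> (32 - k);
  // Step 3 (add): the bias makes the final shift round toward zero.
  uint32_t biased = static_cast<uint32_t>(x) + bias;
  // Step 4 (sra by k): logical shift plus manual sign fill.
  uint32_t fill = (biased & 0x80000000u) ? ~(0xFFFFFFFFu >> k) : 0u;
  uint32_t result = (biased >> k) | fill;
  // Two's complement conversion (guaranteed since C++20, universal before).
  return static_cast<int32_t>(result);
}
```

For example, `sdivPow2(-7, 1)` adds a bias of 1 before shifting, which gives -3 rather than the -4 a bare arithmetic shift would produce.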
llvm-svn: 192699
(at least) Windows and Darwin.
llvm-svn: 192697
llvm-c headers.
This new library will be linked in when using the "all-targets"
component and contains the LLVMInitializeAll* functions.
This means that those functions will exist as real symbols in
the shared library, and can therefore be called from
bindings that use FFI against the shared library.
llvm-svn: 192690
llvm-svn: 192681
llvm-svn: 192678
x86_sse42_crc32_32_8 and was not mapped to a clang builtin. I'm not even sure why this form of the instruction is even called out explicitly in the docs. Also add AutoUpgrade support to convert it into the other intrinsic with appropriate trunc and zext.
llvm-svn: 192672
parts of the accumulators and gets expanded post-RA.
llvm-svn: 192667
of relying on AddedComplexity.
llvm-svn: 192665
llvm-svn: 192663
llvm-svn: 192662
llvm-svn: 192661
llvm-svn: 192660
through bitcast, ptrtoint, and inttoptr instructions. This is valid
only if the related instructions are in that same basic block, otherwise
we may reference variables that were not live across basic blocks
resulting in undefined virtual registers.
The bug was exposed when both SDISel and FastISel were used within the same
function, i.e., one basic block is issued with FastISel and another with SDISel,
as demonstrated with the testcase.
<rdar://problem/15192473>
llvm-svn: 192636
This pass is needed to break false dependencies. Without it, unlucky
register assignment can result in wild (5x) swings in
performance. This pass was trying to handle AVX but not getting it
right. AVX doesn't have partial register defs, it has unused register
reads in which the high bits of a source operand are copied into the
unused bits of the dest.
Fixing this requires conservative liveness analysis. This is awkward
because the pass already has its own pseudo-liveness. However, proper
liveness is expensive, and we would like to use a generic utility to
compute it. The fix only invokes liveness on-demand. It is rare to
detect a case that needs undef-read dependence breaking, but when it
happens, it can be needed many times within a very large block.
I think the existing heuristic which uses a register window of 16 is
too conservative for loop-carried false dependencies. If the loop is a
reduction, the out-of-order engine may be able to execute several loop
iterations in parallel. However, I'll leave this tuning exercise for
next time.
llvm-svn: 192635
llvm-svn: 192633
a) x86-64 TLS has been documented
b) the code path should use movq for the correct relocation
   to be generated.
I've also added a FIXME to the test case noting that we should improve
the generated code; it should look something like what is documented
in the TLS ABI document.
llvm-svn: 192631
llvm-svn: 192630
llvm-svn: 192629
llvm-svn: 192596
llvm-svn: 192591
Some previous implicit defaults have changed, for example FP and NEON
are now on by default.
llvm-svn: 192590
List of instructions:
bclri.{b,h,w,d}
binsli.{b,h,w,d}
binsri.{b,h,w,d}
bnegi.{b,h,w,d}
bseti.{b,h,w,d}
sat_s.{b,h,w,d}
sat_u.{b,h,w,d}
slli.{b,h,w,d}
srai.{b,h,w,d}
srari.{b,h,w,d}
srli.{b,h,w,d}
srlri.{b,h,w,d}
llvm-svn: 192589
List of instructions:
and.v, bmnz.v, bmz.v, bsel.v, nor.v, or.v, xor.v.
llvm-svn: 192588
llvm-svn: 192587
List of instructions:
copy_s.{b,h,w}
copy_u.{b,h,w}
sldi.{b,h,w,d}
splati.{b,h,w,d}
llvm-svn: 192586
INSERT is the first type of MSA instruction that requires a change to the way
MSA registers are parsed. This happens because MSA registers may be suffixed by
an index in the form of an immediate or a general purpose register. The changes
to parseMSARegs reflect that requirement.
llvm-svn: 192582
The alignment of allocated space was wrong, see Bugzilla 17345.
Done by Zvi Rackover <zvi.rackover@intel.com>.
llvm-svn: 192573
instructions.
llvm-svn: 192568
instructions to parse either GR32 or GR64 without resorting to duplicating instructions.
llvm-svn: 192567
is a shorter encoding that was part of SSE2, but a memory form was added in SSE4.1. This is the register form of that encoding.
llvm-svn: 192566
disassembler tables. Add PINSRWrr64i to complement the AVX version.
llvm-svn: 192565
produce a 1-bit result so we can just use SUBREG_TO_REG to extend the 32-bit versions.
llvm-svn: 192562
llvm-svn: 192557
llvm-svn: 192556