| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Reason: The testcase fails in many architectures.
Differential Revision: http://reviews.llvm.org/D15163
llvm-svn: 255416
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch adds support of conversion (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication can not be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planing to add another patch to support negative cases after this.
Reviewers: craig.topper, RKSimon
Subscribers: aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D14603
llvm-svn: 255415
|
|
|
|
| |
llvm-svn: 255414
|
|
|
|
| |
llvm-svn: 255400
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change is discussed in D15392 and should allow us to effectively
revert:
http://llvm.org/viewvc/llvm-project?view=revision&revision=255261
if we canonicalize bitcasts ahead of extracts.
It should be safe to convert any pair of bitcasts into a single bitcast,
however, it was mentioned here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20110829/127089.html
that we're not allowed to bitcast from an x86_mmx to some other types, but I'm
not seeing any failures from that, and we have regression tests in CodeGen/X86
that appear to cover all of those cases.
Some day we'll get to remove that MMX wart from LLVM IR completely?
Differential Revision: http://reviews.llvm.org/D15468
llvm-svn: 255399
|
|
|
|
|
|
|
|
|
|
|
| |
This branch adds hints for highly biased branches on the PPC architecture. Even
in absence of profiling information, LLVM will mark code reaching unreachable
terminators and other exceptional control flow constructs as highly unlikely to
be reached.
Patch by Tom Jablin!
llvm-svn: 255398
|
|
|
|
|
|
|
|
|
| |
This sets the maximum entry count among all functions in the program to the
module using module flags. This allows the optimizer to use this information.
Differential Revision: http://reviews.llvm.org/D15163
llvm-svn: 255397
|
|
|
|
|
|
|
|
|
| |
Many tests are now passing due to eliminateFrameIndex implementation and
the list needs to be re-triaged because it unblocks other failures, and
some previous failures are different. However I'm about to churn it more
by implementing more lowering, so will wait on that.
llvm-svn: 255396
|
|
|
|
|
|
|
|
| |
multiplication-to-shift conversion.
because it broke buildbot.
llvm-svn: 255395
|
|
|
|
| |
llvm-svn: 255394
|
|
|
|
| |
llvm-svn: 255393
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Use the SP32 physical register as the base for FrameIndex
lowering. Update it and the __stack_pointer global var in the prolog and
epilog. Extend the mapping of virtual registers to wasm locals to
include the physical registers.
Rather than modify the target-independent PrologEpilogInserter (which
asserts that there are no virtual registers left) include a
slightly-modified copy for Wasm that does not have this assertion and
only clears the virtual registers if scavenging was needed (which of
course it isn't for wasm).
Differential Revision: http://reviews.llvm.org/D15344
llvm-svn: 255392
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch adds support of conversion (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication can not be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planing to add another patch to support negative cases after this.
Reviewers: craig.topper, RKSimon
Subscribers: aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D14603
llvm-svn: 255391
|
|
|
|
|
|
|
|
|
| |
appropriate bits.
This fixes the remaining clang regression test failures when linking clang with
lld on Darwin.
llvm-svn: 255390
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DenseMap is the wrong data structure to use for sample records and call
sites. The keys are too large, causing massive core memory growth when
reading profiles.
Before this patch, a 21Mb input profile was causing the compiler to grow
to 3Gb in memory. By switching to std::map, the compiler now grows to
300Mb in memory.
There still are some opportunities for memory footprint reduction. I'll
be looking at those next.
llvm-svn: 255389
|
|
|
|
| |
llvm-svn: 255388
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After much discussion, ending here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html
it has been decided that, instead of having the vectorizer directly generate
special absdiff and horizontal-add intrinsics, we'll recognize the relevant
reduction patterns during CodeGen. Accordingly, these intrinsics are not needed
(the operations they represent can be pattern matched, as is already done in
some backends). Thus, we're backing these out in favor of the current
development work.
r248483 - Codegen: Fix llvm.*absdiff semantic.
r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA
r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation
llvm-svn: 255387
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I am seeing disappointing clang performance on a large PowerPC64
Linux box. GetRandomNumberSeed() does a buffered read from
/dev/urandom to seed its PRNG. As a result we read an entire page
even though we only need 4 bytes.
With every clang task reading a page worth of /dev/urandom we
end up spending a large amount of time stuck on kernel spinlock.
Patch by Anton Blanchard!
llvm-svn: 255386
|
|
|
|
| |
llvm-svn: 255385
|
|
|
|
|
|
|
| |
second one as 0 instead of writing the same bits to the module file twice.
This typically reduces PCM file size by about 1%.
llvm-svn: 255384
|
|
|
|
|
|
|
|
|
| |
have a nested name specifier. Strictly speaking, forward declarations of class
template partial specializations are not permitted at all, but that seems like
an obvious wording defect, and if we allow them without a nested name specifier
we should also allow them with a nested name specifier.
llvm-svn: 255383
|
|
|
|
|
|
|
|
|
| |
There's no way to make a flag alias to two flags, so add a /WCL4 flag that
maps to the All, Extra diag groups. Fixes PR25563.
http://reviews.llvm.org/D15350
llvm-svn: 255382
|
|
|
|
|
|
| |
This reverts commit f994b46a2028c8a8b9b55fe010a95122bca07540.
llvm-svn: 255381
|
|
|
|
| |
llvm-svn: 255380
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D15435
llvm-svn: 255379
|
|
|
|
|
|
|
| |
If we don't filter these out we can end up, generating bogus paths, for example:
/home/user/lld/build/bin -> /home/user/home/user/lld/build/bin/lld/build/bin.
llvm-svn: 255378
|
|
|
|
|
|
|
|
|
| |
a hidden tag"
Now not trying to use a C++ lookup mechanism in C (d'oh). Unqualified
lookup is actually fine for this case in C.
llvm-svn: 255377
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change set includes all changes to make the code conform to the OMP 4.5 specification:
* Removed hint / hinted_init definitions from include/40 files
* Hint values are powers of 2 to enable composition (4.5 spec)
* Hinted lock initialization functions were renamed (4.5 spec)
kmp_init_lock_hinted -> omp_init_lock_with_hint
kmp_init_nest_lock_hinted -> omp_init_nest_lock_with_hint
* __kmpc_critical_section_with_hint was added to support a critical section with
a hint (4.5 spec)
* __kmp_map_hint_to_lock was added to convert a hint (possibly a composite) to
an internal lock type
* kmpc_init_lock_with_hint and kmpc_init_nest_lock_with_hint were added as
internal entries for the hinted lock initializers. The preivous internal
functions (__kmp_init*) were moved to kmp_csupport.c and reused in multiple
places
* Added the two init functions to dllexports
* KMP_USE_DYNAMIC_LOCK is turned on if OMP_41_ENABLED is turned on
Differential Revision: http://reviews.llvm.org/D15205
llvm-svn: 255376
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Added a new user TSX lock implementation, RTM, This implementation is a
light-weight version of the adaptive lock implementation, omitting the
back-off logic for deciding when to specualte (or not). The fall-back lock is
still the queuing lock.
* Changed indirect lock table management. The data for indirect lock management
was encapsulated in the "kmp_indirect_lock_table_t" type. Also, the lock table
dimension was changed to 2D (was linear), and each entry is a
kmp_indirect_lock_t object now (was a pointer to an object).
* Some clean up in the critical section code
* Removed the limits of the tuning parameters read from KMP_ADAPTIVE_LOCK_PROPS
* KMP_USE_DYNAMIC_LOCK=1 also turns on these two switches:
KMP_USE_TSX, KMP_USE_ADAPTIVE_LOCKS
Differential Revision: http://reviews.llvm.org/D15204
llvm-svn: 255375
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: The Hexagon ABI plugin uses hardcoded registers when setting up function calls. This is OK for the Hexagon simulator, but the register numbers are different on the gdbserver running on hardware. Change the hardcoded registers to LLDB generic registers.
Reviewers: clayborg
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D15457
llvm-svn: 255374
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are going to be two more patches which bring this feature up to date and in line with OpenMP 4.5.
* Renamed jump tables for the lock functions (and some clean up).
* Renamed some macros to be in KMP_ namespace.
* Return type of unset functions changed from void to int.
* Enabled use of _xebgin() et al. intrinsics for accessing TSX instructions.
Differential Revision: http://reviews.llvm.org/D15199
llvm-svn: 255373
|
|
|
|
|
|
|
| |
The message for a type definition in an "if" condition was different
from the other three for no particular reason.
llvm-svn: 255372
|
|
|
|
|
|
| |
asm label after the first ODR-use. Detects problems like the one in PR22830 where gcc and clang both compiled the file but with different behaviour.
llvm-svn: 255371
|
|
|
|
|
|
| |
As noted in http://reviews.llvm.org/D15392 , we should be able to improve this.
llvm-svn: 255370
|
|
|
|
| |
llvm-svn: 255369
|
|
|
|
| |
llvm-svn: 255368
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(This is part-2 of the patch -- fixing test cases)
Before the patch, -fprofile-instr-generate compile will fail
if no integrated-as is specified when the file contains
any static functions (the -S output is also invalid).
This patch fixed the issue. With the change, the index format
version will be bumped up by 1. Backward compatibility is
preserved with this change.
Differential Revision: http://reviews.llvm.org/D15243
llvm-svn: 255366
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before the patch, -fprofile-instr-generate compile will fail
if no integrated-as is specified when the file contains
any static functions (the -S output is also invalid).
This patch fixed the issue. With the change, the index format
version will be bumped up by 1. Backward compatibility is
preserved with this change.
Differential Revision: http://reviews.llvm.org/D15243
llvm-svn: 255365
|
|
|
|
|
|
|
|
| |
warnings in source/Target/Target.cpp.
Simplify smart pointers checks in conditions.
llvm-svn: 255364
|
|
|
|
| |
llvm-svn: 255363
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
computeRegisterLiveness() was broken in that it reported dead for a
register even if a subregister was alive. I assume this was because the
results of analayzePhysRegs() are hard to understand with respect to
subregisters.
This commit: Changes the results of analyzePhysRegs (=struct
PhysRegInfo) to be clearly understandable, also renames the fields to
avoid silent breakage of third-party code (and improve the grammar).
Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling
it and removing workarounds for the bug.
This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033
Differential Revision: http://reviews.llvm.org/D15320
llvm-svn: 255362
|
|
|
|
| |
llvm-svn: 255361
|
|
|
|
| |
llvm-svn: 255360
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are redundant pairs of nodes defined for
INSERT_VECTOR_ELEMENT/EXTRACT_VECTOR_ELEMENT.
insertelement/extractelement are slightly closer to the corresponding
C++ node name, and has stricter type checking so prefer it.
Update targets to only use these nodes where it is trivial to do so.
AArch64, ARM, and Mips all have various type errors on simple replacement,
so they will need work to fix.
Example from AArch64:
def : Pat<(sext_inreg (vector_extract (v16i8 V128:$Rn), VectorIndexB:$idx), i8),
(i32 (SMOVvi8to32 V128:$Rn, VectorIndexB:$idx))>;
Which is trying to do sext_inreg i8, i8.
llvm-svn: 255359
|
|
|
|
|
|
|
| |
and appends them to our list of comments (which can additionally include
things like decoded addresses).
llvm-svn: 255358
|
|
|
|
|
|
|
| |
There is work under way in llvm to avoid creating unnecessary names for
symbols. This makes lld capable of handling that.
llvm-svn: 255357
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
ADJCALLSTACK{DOWN,UP} (aka CALLSEQ_{START,END}) MIs are supposed to use
and def the stack pointer. Since they do not, all the nodes are being
eliminated by DeadMachineInstructionElim, so they aren't in the IR when
PrologEpilogInserter/eliminateCallFramePseudo needs them.
This change fixes that, but since RegStackify will not stackify across
them (and it runs early, before PEI), change LowerCall to only emit them
when the call frame size is > 0. That makes the current code work the
same way and makes code handled by D15344 also work the same way. We can
expand the condition beyond NumBytes > 0 in the future if needed.
Reviewers: sunfish, jfb
Subscribers: jfb, dschuff, llvm-commits
Differential Revision: http://reviews.llvm.org/D15459
llvm-svn: 255356
|
|
|
|
|
|
| |
This matches the behavior of both gold and bfd ld.
llvm-svn: 255355
|
|
|
|
|
|
|
|
| |
Revert "[DSE] Disable non-local DSE to see if the bots go green."
Revert "[DeadStoreElimination] Use range-based loops. NFC."
Revert "[DeadStoreElimination] Add support for non-local DSE."
llvm-svn: 255354
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The access function has a short entry and a short exit, the initialization
block is only run the first time. To improve the performance, we want to
have a short frame at the entry and exit.
We explicitly handle most of the CSRs via copies. Only the CSRs that are not
handled via copies will be in CSR_SaveList.
Frame lowering and prologue/epilogue insertion will generate a short frame
in the entry and exit according to CSR_SaveList. The majority of the CSRs will
be handled by register allcoator. Register allocator will try to spill and
reload them in the initialization block.
We add CSRsViaCopy, it will be explicitly handled during lowering.
1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target
supports it for the given calling convention and the function has only return
exits). We also call TLI->initializeSplitCSR to perform initialization.
2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to
virtual registers at beginning of the entry block and copies from virtual
registers to CSRsViaCopy at beginning of the exit blocks.
3> we also need to make sure the explicit copies will not be eliminated.
rdar://problem/23557469
Differential Revision: http://reviews.llvm.org/D15340
llvm-svn: 255353
|