| Commit message (Collapse) | Author | Age | Files | Lines | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary:
These are analog to the existing LLVMConstExactSDiv and LLVMBuildExactSDiv
functions.
Reviewers: deadalnix, majnemer
Subscribers: majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D25259
llvm-svn: 283269
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
This patch adds write methods to StringTableBuilder so that it is
easier to change the underlying implementation.
Using the write methods, avoid creating a temporary buffer when using
mmaped output.
It also uses a more compact key in the DenseMap. Overall this produces
a slightly faster lld:
firefox
  master 6.853419709
  patch  6.841968912 1.00167361138x faster
chromium
  master 4.297280174
  patch  4.298712163 1.00033323147x slower
chromium fast
  master 1.802335952
  patch  1.806872459 1.00251701521x slower
the gold plugin
  master 0.3247149
  patch  0.321971644 1.00852017888x faster
clang
  master 0.551279945
  patch  0.543733194 1.01387951128x faster
llvm-as
  master 0.032743458
  patch  0.032143478 1.01866568391x faster
the gold plugin fsds
  master 0.350814247
  patch  0.348571741 1.00643341309x faster
clang fsds
  master 0.6281672
  patch  0.621130222 1.01132931187x faster
llvm-as fsds
  master 0.030168899
  patch  0.029797155 1.01247582194x faster
scylla
  master 3.104222518
  patch  3.059590248 1.01458766252x faster
llvm-svn: 283266
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
dependency to clang's implementation
Summary:
Attempting to fix PR30384.
Take the same approach as in compiler_rt and add a simplified version of __get_cpuid_max.
Including cpuid.h is no longer needed.
Reviewers: echristo, joerg
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D24597
llvm-svn: 283265
 | 
| | 
| 
| 
|  | 
llvm-svn: 283254
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
The motivation for the change is that we can't have pseudo-global settings for
codegen living in TargetOptions because that doesn't work with LTO.
Ideally, these reciprocal attributes will be moved to the instruction-level via
FMF, metadata, or something else. But making them function attributes is at least
an improvement over the current state.
The ingredients of this patch are:
    Remove the reciprocal estimate command-line debug option.
    Add TargetRecip to TargetLowering.
    Remove TargetRecip from TargetOptions.
    Clean up the TargetRecip implementation to work with this new scheme.
    Set the default reciprocal settings in TargetLoweringBase (everything is off).
    Update the PowerPC defaults, users, and tests.
    Update the x86 defaults, users, and tests.
Note that if this patch needs to be reverted, the related clang patch checked in
at r283251 should be reverted too.
Differential Revision: https://reviews.llvm.org/D24816
llvm-svn: 283252
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
load commands that uses the MachO::encryption_info_command and
MachO::encryption_info_command types but not used in llvm libObject
code but used in llvm tool code.
This includes just LC_ENCRYPTION_INFO and
LC_ENCRYPTION_INFO_64 load commands.
llvm-svn: 283250
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
Patch by Michael LeMay
Differential revision: http://reviews.llvm.org/D24896
llvm-svn: 283248
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
AArch64InstrInfo::shouldScheduleAdjacent() determines whether two
instruction can benefit from macroop fusion on apple CPUs. The list
turned out to be incomplete:
- the "rr" variants of the instructions were missing
- even the "rs" variants can have shift value == 0 and behave like the
  "rr" variants
This also splits the MacropFusion target feature into
ArithmeticBccFusion and ArithmeticCbzFusion.
Differential Revision: https://reviews.llvm.org/D25142
llvm-svn: 283243
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
The VM layout is not stable between iOS version releases, so switch to dynamic shadow offset.
This is the LLVM counterpart of https://reviews.llvm.org/D25218
Differential Revision: https://reviews.llvm.org/D25219
llvm-svn: 283239
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
The purpose of the YAML diagnostic output file is to collect information on
optimizations performed, or not performed, for later processing by tools that
help users (and compiler developers) understand how code was optimized. As
such, the diagnostics that appear in the file should not be coupled to what a
user might want to see summarized for them as the compiler runs, and in fact,
because the user likely does not know what optimization diagnostics their tools
might want to use, the user cannot provide a useful filter regardless. As such,
we shouldn't filter the diagnostics going to the output file.
Differential Revision: https://reviews.llvm.org/D25224
llvm-svn: 283236
 | 
| | 
| 
| 
|  | 
llvm-svn: 283231
 | 
| | 
| 
| 
|  | 
llvm-svn: 283230
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary:
This patch modifies the findBasePointer to handle the shufflevector instruction.
Tests run: RS4GC tests, local downstream tests.
Reviewers: reames, sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25197
llvm-svn: 283219
 | 
| | 
| 
| 
|  | 
llvm-svn: 283216
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
This patch corresponds to review:
The newly added VSX D-Form (register + offset) memory ops target the upper half
of the VSX register set. The existing ones target the lower half. In order to
unify these and have the ability to target all the VSX registers using D-Form
operations, this patch defines Pseudo-ops for the loads/stores which are
expanded post-RA. The expansion then choses the correct opcode based on the
register that was allocated for the operation.
llvm-svn: 283212
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Treat soft-float as unsupported for fast-isel. Additionally, ensure we check
that lowering f32 arguments also considers the case of soft-float mode.
Reviewers: ehostunreach, vkalintiris, zoran.jovanovic
Differential Review: https://reviews.llvm.org/D24505
llvm-svn: 283209
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
The SMULO/UMULO DAG nodes, when not directly supported by the target,
expand to a multiplication twice as wide. In case that the resulting
type is not legal, an __mul?i3 intrinsic is used. Since the type is
not legal, the legalizer cannot directly call the intrinsic with
the wide arguments; instead, it "pre-lowers" them by splitting them
in halves.
The "pre-lowering" code in essence made assumptions about
the calling convention, specifically that i(N*2) values will be
split into two iN values and passed in consecutive registers in
little-endian order. This, naturally, breaks on a big-endian system,
such as our OR1K out-of-tree backend.
Thanks to James Miller <james@aatch.net> for help in debugging.
Differential Revision: https://reviews.llvm.org/D25223
llvm-svn: 283203
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
This fixes the inconsistency of the fp denormal option names: in LLVM this was
DenormalType, but in Clang this is DenormalMode which seems better.
Differential Revision: https://reviews.llvm.org/D24906
llvm-svn: 283192
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
This patch corresponds to review:
https://reviews.llvm.org/D23155
This patch removes the VSHRC register class (based on D20310) and adds
exploitation of the Power9 sub-word integer loads into VSX registers as well
as vector sign extensions.
The new instructions are useful for a few purposes:
    Int to Fp conversions of 1 or 2-byte values loaded from memory
    Building vectors of 1 or 2-byte integers with values loaded from memory
    Storing individual 1 or 2-byte elements from integer vectors
This patch implements all of those uses.
llvm-svn: 283190
 | 
| | 
| 
| 
| 
| 
|  | 
being used and as is is pretty weak anyway
llvm-svn: 283187
 | 
| | 
| 
| 
| 
| 
|  | 
match MOV8rm.
llvm-svn: 283184
 | 
| | 
| 
| 
| 
| 
|  | 
are known to be minimal inputs for at least one coverage feature (works only with -shrink=1 for now)
llvm-svn: 283178
 | 
| | 
| 
| 
| 
| 
| 
|  | 
Allow inserting multiple instructions in the
expanded loop.
llvm-svn: 283177
 | 
| | 
| 
| 
|  | 
llvm-svn: 283175
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
This reverts commit ff234efbe23528e4f4c80c78057b920a51f434b2.
Causing crashes on aarch64 build.
llvm-svn: 283172
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
Slightly improves the precision of GlobalsAA in certain situations, and
makes the behavior of optimization passes more predictable.
Differential Revision: https://reviews.llvm.org/D24104
llvm-svn: 283165
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.
In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.
This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.
llvm-svn: 283164
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
WebAssembly has officially switched from being an AST to being a stack
machine. Update various bits of terminology and README.md entries
accordingly.
llvm-svn: 283154
 | 
| | 
| 
| 
| 
| 
|  | 
The comment is present inside the body of GetVRegDef.
llvm-svn: 283153
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
This is to avoid problems with win32 + ELF which surprisingly happens a
lot in practice: If a user just specifies -march on the commandline the
object format changes along with the architecture to ELF in many
instances while the OS stays with the default/host OS.
llvm-svn: 283151
 | 
| | 
| 
| 
|  | 
llvm-svn: 283150
 | 
| | 
| 
| 
|  | 
llvm-svn: 283147
 | 
| | 
| 
| 
| 
| 
| 
|  | 
WebAssembly documentation consistently says "f32" rather than "fp32" to
describe 32-bit floating-point.
llvm-svn: 283146
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Refactor the code so that the same function can be used for all
instructions with all the same operands for up to 3 operands.
This is going to be useful for cast instructions.
NFC.
llvm-svn: 283144
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
Each shadow only represents data flow that is restricted to its reaching
def. Propagating more than that could lead to spurious register liveness,
resulting in extra (incorrectly) block live-ins.
llvm-svn: 283143
 | 
| | 
| 
| 
|  | 
llvm-svn: 283142
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Windows has no GOT relocations the way elf/darwin has. Some people use
x86_64-pc-win32-macho to build EFI firmware; Do not produce GOT
relocations for this target.
Differential Revision: https://reviews.llvm.org/D24627
llvm-svn: 283140
 | 
| | 
| 
| 
| 
| 
| 
|  | 
This fixes one spot I had missed in r265762.  Credit goes to Philip
Reames for spotting this one!
llvm-svn: 283137
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary: LoopSink pass uses some common function in LICM. This patch refactor the LICM code to make it usable by LoopSink pass (https://reviews.llvm.org/D22778).
Reviewers: davidxl, danielcdh, hfinkel, chandlerc
Subscribers: hfinkel, llvm-commits
Differential Revision: https://reviews.llvm.org/D24168
llvm-svn: 283134
 | 
| | 
| 
| 
|  | 
llvm-svn: 283133
 | 
| | 
| 
| 
|  | 
llvm-svn: 283130
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
|  | 
Splitting the edge is nontrivial because of the landing pad, and we would
currently assert trying to do it.
Differential Revision: https://reviews.llvm.org/D24680
llvm-svn: 283129
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
This reverts commit r283125.
lld needs to be updated.
llvm-svn: 283127
 | 
| | 
| 
| 
| 
| 
|  | 
Print target basic block for a branch.
llvm-svn: 283126
 | 
| | 
| 
| 
| 
| 
| 
|  | 
Also assert isFinalized in getSize(). This just reduces the noise from
another patch.
llvm-svn: 283125
 | 
| | 
| 
| 
|  | 
llvm-svn: 283122
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
(PR30433)
This should fix:
https://llvm.org/bugs/show_bug.cgi?id=30433
There are a couple of open questions about the codegen:
1. Should we let scalar ops be scalars and avoid vector constant loads/splats?
2. Should we have a pass to combine constants such as the inverted pair that we have here?
Differential Revision: https://reviews.llvm.org/D25165
 
llvm-svn: 283119
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
If the llvm. prefix is dropped other parts of llvm don't see this as
an intrinsic.  This means that the number of regular symbols depends
on the context the module is loaded into, which causes LTO to abort.
Fixes PR30509.
llvm-svn: 283117
 | 
| | 
| 
| 
|  | 
llvm-svn: 283115
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Retrying after buildbot reset.
To lex hash directives we peek ahead to find component tokens, create a
unified token, and unlex the peeked tokens so the parser does not need
to parse the tokens then. Make sure we do not to lex another hash
directive during peek operation.
This fixes PR28921.
Reviewers: rnk, loladiro
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D24839
llvm-svn: 283111
 |