Having two ways to do this doesn't seem terribly helpful, and consistently using the insert version (which we already have) seems like it'll make the code easier to understand for anyone working with standard data structures. (I also updated many references to the Entry's key and value to use first() and second instead of getKey{Data,Length,} and get/setValue, for similar consistency.)
Also removes the GetOrCreateValue functions so there's less surface area to StringMap to fix/improve/change/accommodate move semantics, etc.
llvm-svn: 222319

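A minimal sketch of the pair-style StringMap usage the message describes; the map and helper names here are illustrative, not from the commit:

    #include "llvm/ADT/StringMap.h"
    #include <utility>

    // insert() creates the entry if absent and returns {iterator, inserted},
    // mirroring std::map -- the consistency this commit is after.
    void countWord(llvm::StringMap<int> &Counts, llvm::StringRef Word) {
      auto Res = Counts.insert(std::make_pair(Word, 0));
      Res.first->second += 1;                   // the value, as in std::pair
      llvm::StringRef Key = Res.first->first(); // the key; note first() is a call
      (void)Key;
    }
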
llvm-svn: 222292

shift-right for booleans (i1).
Arithmetic shift-right immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact.
llvm-svn: 222272

shift-right for booleans (i1).
Logical shift-right immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact.
llvm-svn: 222270

"zero" shifts."
Shifts also perform sign-/zero-extends to larger types, which requires us to emit an integer extend instead of a simple COPY.
Related to PR21594.
llvm-svn: 222257

| |
"optimizeCompareInstr" converts compares (cmp/cmn) into plain sub/add
instructions when the flags are not used anymore. This conversion is valid for
most instructions, but not all. Some instructions that don't set the flags
(e.g. sub with immediate) can set the SP, whereas the flag setting version uses
the same encoding for the "zero" register.
Update the code to also check for the return register before performing the
optimization to make sure that a cmp doesn't suddenly turn into a sub that sets
the stack pointer.
I don't have a test case for this, because it isn't easy to trigger.
llvm-svn: 222255
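A reduced illustration of the encoding hazard (the helper below is a sketch, not the backend code): on AArch64, register number 31 encodes SP in the plain add/sub immediate forms but the zero register in the flag-setting forms, so 'subs xzr, x1, #1' (i.e. 'cmp x1, #1') must not be rewritten as a plain 'sub' with destination 31.

    // Sketch: dropping the flag-setting 's' is only safe when the
    // destination is not register 31, where the plain form means SP.
    bool canDropFlagSetting(unsigned DestRegNum) {
      return DestRegNum != 31;
    }
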
This change emits a COPY for a shift-immediate with a "zero" shift value.
This fixes PR21594 where we emitted a shift instruction with an incorrect immediate operand.
llvm-svn: 222247

requires TargetLoweringObjectFile to be passed.
llvm-svn: 221926

The generic FastISel code would bail, because it can't emit a sign-extend for AArch64. This copies the code over and uses AArch64-specific emit functions.
This is not ideal and 'computeAddress' should handle this, so it can fold the address computation into the memory operation.
I plan to clean up 'computeAddress' anyway, so I will add that in a future commit.
Related to rdar://problem/18962471.
llvm-svn: 221923

TargetMachine so that different subtargets could share the TLOF effectively
llvm-svn: 221878

'false' value.
Optimize selects of i1 in the presence of 'true' and 'false' operands to simple logic operations.
This fixes rdar://problem/18960150.
llvm-svn: 221848

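The equivalences being exploited, as a sketch (these helpers are illustrative, not the backend code):

    // With boolean operands, a select against a constant true/false
    // reduces to a single logic operation:
    bool selTrue(bool C, bool B) { return C ? true : B; }   // same as C | B
    bool selFalse(bool C, bool A) { return C ? A : false; } // same as C & A
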
This folds the compare emission into the select emission when possible, so we can directly use the flags and don't have to emit a separate compare.
Related to rdar://problem/18960150.
llvm-svn: 221847

Related to rdar://problem/18960150.
llvm-svn: 221846

With this patch MCDisassembler::getInstruction takes an ArrayRef<uint8_t> instead of a MemoryObject.
Even on X86 there is a maximum size an instruction can have. Given that, it seems way simpler and more efficient to just pass an ArrayRef to the disassembler instead of a MemoryObject and have it do a virtual call every time it wants some extra bytes.
llvm-svn: 221751

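A hypothetical decoder shape, not MCDisassembler's actual signature, showing why the bounded view is cheaper than per-byte virtual reads:

    #include "llvm/ADT/ArrayRef.h"
    #include <cstdint>

    // The caller hands over all bytes it is willing to provide up front,
    // so the decoder indexes the buffer directly instead of calling back
    // through a virtual MemoryObject-style interface for each byte.
    bool decodeOne(llvm::ArrayRef<uint8_t> Bytes, uint64_t &Size) {
      if (Bytes.empty())
        return false; // not enough bytes for any instruction
      Size = 1;       // stand-in: a real decoder computes the true length
      return true;
    }
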
Lower the llvm.fabs intrinsic to the 'fabs' MI instruction.
This fixes rdar://problem/18946552.
llvm-svn: 221729

Base classes were storing a second copy.
llvm-svn: 221667

In the case we optimize an integer extend away and replace it directly with the source register, we also have to clear all kill flags at all its uses.
This is necessary, because the original IR instruction might be trivially dead, but we replaced it with a nop at MI level.
llvm-svn: 221628

This fixes a few cases of:
* Wrong variable name style.
* Lines longer than 80 columns.
* Repeated names in comments.
* clang-format of the above.
This makes the next patch a lot easier to read.
llvm-svn: 221615

Reversing a CB* instruction used to drop the flags on the condition. On the included testcase, this led to a read from an undefined vreg.
Using addOperand keeps the flags, here <undef>.
Differential Revision: http://reviews.llvm.org/D6159
llvm-svn: 221507

While fixing up the register classes in the machine combiner in a previous commit I missed one.
This fixes the last one and adds a test case.
llvm-svn: 221308

This kind of pattern is emitted by the loop vectorizer.
llvm-svn: 221289

Differential Revision: http://reviews.llvm.org/D6062
llvm-svn: 221204

llvm-svn: 221199

Some literals in the AArch64 backend had 15 'f's rather than 16, causing comparisons with a constant 0xffffffffffffffff to be miscompiled.
llvm-svn: 221157

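The bug class, reduced to a sketch (the constants are illustrative, not the backend's):

    #include <cstdint>

    const uint64_t AllOnes = 0xffffffffffffffffULL; // 16 'f's: ~0, i.e. -1
    const uint64_t Typo    = 0xfffffffffffffffULL;  // 15 'f's: 2^60 - 1
    // A single missing nibble changes the value, so equality comparisons
    // against all-ones silently never match:
    static_assert(AllOnes != Typo, "one missing 'f' changes the value");
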
This removes calls to isMaterializable in the following cases:
* It was redundant with a call to isDeclaration now that isDeclaration returns the correct answer for materializable functions.
* It was followed by a call to Materialize. Just call Materialize and check EC.
llvm-svn: 221050

Our internal testing revealed that a case like the following should not be transformed:
  cmp x17, #3
  b.lt .LBB10_15
  ...
  subs x12, x12, #1
  b.gt .LBB10_1
where x12 is a live-out, becomes:
  cmp x17, #2
  b.le .LBB10_15
  ...
  subs x12, x12, #2
  b.ge .LBB10_1
Unable to provide a test case, as it is difficult to reproduce on the community branch.
http://reviews.llvm.org/D6048
Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>!
llvm-svn: 220987

a CMP which defines the flags used by B.CC.
http://reviews.llvm.org/D6047
Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>!
llvm-svn: 220961

Benchmarks have shown that it's harmless to performance there, and having a unified set of passes between the two cores where possible helps big.LITTLE deployment.
Patch by Z. Zheng.
llvm-svn: 220744

[-Winconsistent-missing-override]
llvm-svn: 220739

check.
This is a minor change to use the immediate version when the operand is a null value. This should get rid of an unnecessary 'mov' instruction in debug builds and align the code more closely with the one generated by SelectionDAG.
This fixes rdar://problem/18785125.
llvm-svn: 220713

Minor enhancement to use 'tbz' for i1 compare-and-branch to get rid of an 'and' instruction.
This fixes rdar://problem/18784953.
llvm-svn: 220712

The pattern matching for a 'ConstantInt' value was too restrictive. Checking for a 'Constant' with a null value is sufficient for using a 'cbz/cbnz' instruction.
This fixes rdar://problem/18784732.
llvm-svn: 220709

instruction if it is in a different basic block.
This fixes a bug where the input register was not defined for the 'tbz/tbnz' instruction. This happened because we folded the 'and' instruction from a different basic block.
This fixes rdar://problem/18784013.
llvm-svn: 220704

At higher optimization levels the LLVM IR may contain more complex patterns for loads/stores from/to frame indices. The 'computeAddress' function wasn't able to handle this and triggered an assertion.
This fix extends the possible addressing modes for frame indices.
This fixes rdar://problem/18783298.
llvm-svn: 220700

sets as keys into a cache of interference matrix values in the Interference constraint adder.
Creating interference matrices was one of the large remaining time-sinks in PBQP. Caching them reduces the total compile time (when using PBQP) on the nightly test suite by ~10%.
llvm-svn: 220688

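A generic memoization sketch under assumed types; 'Key' and 'Matrix' are stand-ins, while the real patch keys the cache on PBQP allowed-register sets:

    #include <map>
    #include <utility>
    #include <vector>

    using Key = std::pair<unsigned, unsigned>;
    using Matrix = std::vector<std::vector<float>>;

    static Matrix buildMatrix(const Key &) {
      return Matrix(2, std::vector<float>(2, 0.0f)); // stand-in construction
    }

    // Build each matrix once; every edge with the same key reuses it.
    const Matrix &getInterferenceMatrix(std::map<Key, Matrix> &Cache,
                                        const Key &K) {
      auto It = Cache.find(K);
      if (It == Cache.end())
        It = Cache.emplace(K, buildMatrix(K)).first;
      return It->second;
    }
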
This fixes a miscompilation in the AArch64 fast-isel which was triggered when a branch is based on an icmp with condition eq or ne, and type i1, i8 or i16. The cbz instruction compares the whole 32-bit register, so values with the bottom 1, 8 or 16 bits clear would cause the wrong branch to be taken.
llvm-svn: 220553

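A reduced illustration, not the backend code, of why testing the full register is wrong for the narrow types:

    #include <cstdint>

    // An i8 value lives in a 32-bit register whose upper bits may hold
    // leftover data: Reg = 0x100 is a zero i8, yet the 32-bit compare
    // that cbz performs reports "nonzero".
    bool isZeroAsI8(uint32_t Reg) {
      return (Reg & 0xff) == 0; // compare only the bits of the type
    }
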
This has been implemented using the MCTargetStreamer interface, as is done in the ARM, Mips and PPC backends.
Phabricator: http://reviews.llvm.org/D5891
PR20964
llvm-svn: 220422

And add a long-awaited testcase.
llvm-svn: 220381

This enables targets to adapt their pass pipeline to the register allocator in use. For example, with the AArch64 backend, using PBQP with the cortex-a57, the FPLoadBalancing pass is no longer necessary.
llvm-svn: 220321

We should be talking about the number of source elements, not the number of destination elements, given we know at this point that the source and dest element numbers are not the same.
While we're at it, avoid writing to std::vector::end()...
Bug found with random testing and a lot of coffee.
llvm-svn: 220051

When the constant divisor was larger than 32 bits, the optimized code generated for the AArch64 backend would be wrong, because the shift was defined as a shift of a 32-bit constant '(1<<Lg2(divisor))' and we would lose the upper 32 bits.
This fixes rdar://problem/18678801.
llvm-svn: 219934

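The bug class, reduced to a sketch (the helper is illustrative, not the backend code):

    #include <cstdint>

    // For a power-of-two divisor wider than 32 bits, Lg2 >= 32; shifting
    // a plain 32-bit '1' that far is undefined and drops the upper half.
    uint64_t powerOfTwo(unsigned Lg2) {
      // return 1 << Lg2;   // broken: '1' is a 32-bit int
      return 1ULL << Lg2;   // correct: shift a 64-bit one
    }
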
This is mostly a copy of the existing FastISel GEP code, but we have to duplicate it for AArch64, because otherwise we would bail out even for simple cases. This is because the standard fastEmit functions don't cover MUL at all and ADD is lowered very inefficiently.
The original commit had a bug in the add emit logic, which has been fixed.
llvm-svn: 219831

function. NFC.
Simplify add-with-immediate emission by factoring it out into a helper function.
llvm-svn: 219830

llvm-svn: 219799

This breaks our internal build bots. Reverting it to get the bots green again.
llvm-svn: 219776

llvm-svn: 219750

This is a follow-up to commit r219742. It removes the CCInMI variable and accesses the CC in CSCINC directly. In the case of a conditional branch, accessing the CC with CCInMI was wrong.
llvm-svn: 219748

Peephole optimization that generates a single conditional branch for csinc-branch sequences like in the examples below. This is possible when the csinc sets or clears a register based on a condition code and the branch checks that register. Also, the condition code must not be modified between the csinc and the original branch.
Examples:
1. Convert csinc w9, wzr, wzr, <CC>; tbnz w9, #0, 0x44 to b.<invCC>
2. Convert csinc w9, wzr, wzr, <CC>; tbz w9, #0, 0x44 to b.<CC>
rdar://problem/18506500
llvm-svn: 219742

This is mostly a copy of the existing FastISel GEP code, but on AArch64 we bail out even for simple cases, because the standard fastEmit functions don't cover MUL and ADD is lowered inefficiently.
llvm-svn: 219726

Sign-/zero-extend folding depended on the load and the integer extend both being selected by FastISel. This cannot always be guaranteed and SelectionDAG might interfere. This commit adds additional checks to load and integer extend lowering to catch this.
Related to rdar://problem/18495928.
llvm-svn: 219716