summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombine] (add/uaddo X, Carry) -> (addcarry X, 0, Carry)Amaury Sechet2017-06-011-0/+49
| | | | | | | | | | | | | | | Summary: This enables further transforms. Depends on D32916 Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32925 llvm-svn: 304401
* Add LiveRangeShrink pass to shrink live range within BB.Dehao Chen2017-05-313-0/+233
| | | | | | | | | | | | | | Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB. Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb Reviewed By: MatzeB, andreadb Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D32563 llvm-svn: 304371
* ImplicitNullChecks: Clear kill/dead flags when moving instructions aroundMatthias Braun2017-05-311-2/+14
| | | | | | | The values are marked as livein in the successor blocks so marking them as killed or dead was wrong. llvm-svn: 304366
* Check hasPersonalityFn before calling getPersonalityFnReid Kleckner2017-05-311-4/+5
| | | | llvm-svn: 304365
* [EH] Fix the LSDA that we emit for unknown EH personalitiesReid Kleckner2017-05-312-5/+16
| | | | | | | | | | | | | | We should have a single call site entry with no landing pad. This indicates that no EH action should be taken and the unwinder should unwind to the next frame. We currently don't recognize __gxx_personality_seh0 as a known personality, so we forcibly emit a table, and that table was wrong. This was filed as PR33220. Now we emit a correct table for that personality. The next step is to recognize that we can completely skip the table for this personality. llvm-svn: 304363
* Try to fix buildbotsMatthias Braun2017-05-311-1/+3
| | | | | | | | It seems not all of our bots have a std::vector::erase() taking a const_iterator (even though that seems to be part of C++11) attempt to workaround. llvm-svn: 304349
* X86FloatingPoint: Fix livein listsMatthias Braun2017-05-311-0/+5
| | | | | | | | | | | | After transforming FP to ST registers: - Do not add the ST register to the livein lists, they are reserved so we do not need to track their liveness. - Remove the FP registers from the livein lists, they don't have defs or uses anymore and so are not live. - (The setKillFlags() call is moved to an earlier place as it relies on the FP registers still being present in the livein list.) llvm-svn: 304342
* [ScheduleDAG] Deal with already scheduled loads in ScheduleDAG.Nirav Dave2017-05-311-128/+150
| | | | | | | | | | | | | | | | | Summary: If we attempt to unfold an SUnit in ScheduleDAG that results in finding an already scheduled load, we must should abort the unfold as it will not improve scheduling. This fixes PR32610. Reviewers: jmolloy, sunfish, bogner, spatel Subscribers: llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D32911 llvm-svn: 304321
* TargetMachine: Indicate whether machine verifier passes.Matthias Braun2017-05-311-1/+6
| | | | | | | | | | | | | This adds a callback to the LLVMTargetMachine that lets target indicate that they do not pass the machine verifier checks in all cases yet. This is intended to be a temporary measure while the targets are fixed allowing us to enable the machine verifier by default with EXPENSIVE_CHECKS enabled! Differential Revision: https://reviews.llvm.org/D33696 llvm-svn: 304320
* [PPC] Inline expansion of memcmpZaara Syeda2017-05-312-4/+615
| | | | | | | | | | | | | | | This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case when the memcmp result is used in a compare to zero equality. Differential Revision: https://reviews.llvm.org/D28637 llvm-svn: 304313
* [DAG] Avoid use of stale store.Nirav Dave2017-05-311-2/+2
| | | | | | | | | | | | | | | | | | Correct references to alignment of store which may be deleted in a previous iteration of merge. Instead use first store that would be merged. Corrects pr33172's use-after-poison caught by ASan. Reviewers: spatel, hfinkel, RKSimon Reviewed By: RKSimon Subscribers: thegameg, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33686 llvm-svn: 304299
* [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use ↵Eugene Zelenko2017-05-312-46/+73
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 304265
* MachineInstr: Do not skip dead def operands when printing.Matthias Braun2017-05-301-32/+0
| | | | | | | | This was introduced a long time ago in r86583 when regmask operands didn't exist. Nowadays the behavior hurts more than it helps. This removes it. llvm-svn: 304254
* [AntiDepBreaker] Revert r299124 and add a test.Tim Shen2017-05-302-2/+2
| | | | | | | | | | | | | | | | | Summary: AntiDepBreaker intends to add all live-outs, including the implicit CSRs, in StartBlock. r299124 was done without understanding that intention. Now with the live-ins propagated correctly (D32464), we can revert this change. Reviewers: MatzeB, qcolombet Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D33697 llvm-svn: 304251
* TargetPassConfig: Keep a reference to an LLVMTargetMachine; NFCMatthias Braun2017-05-301-5/+5
| | | | | | | | | | | TargetPassConfig is not useful for targets that do not use the CodeGen library, so we may just as well store a pointer to an LLVMTargetMachine instead of just to a TargetMachine. While at it, also change the constructor to take a reference instead of a pointer as the TM must not be nullptr. llvm-svn: 304247
* MIR: remove explicit "noVRegs" property.Tim Northover2017-05-302-4/+0
| | | | | | | We can infer this from the incoming MIR, so there's no reason to represent it with a special flag. llvm-svn: 304246
* [Localizer] Don't trick to be smart for the insertion pointQuentin Colombet2017-05-301-6/+4
| | | | | | | | | There is no guarantee that the first use of a constant that is traversed is actually the first in the related basic block. Thus, if we use that as the insertion point we may end up with definitions that don't dominate there use. llvm-svn: 304244
* [SelectionDAG] Remove special case for ISD::FPOWI from the strict FP ↵Craig Topper2017-05-301-4/+0
| | | | | | | | intrinsic handling. This code was compensating for FPOWI defaulting to Legal and many targets not changing it to Expand. This was fixed in r304215 to default to Expand so this special handling should no longer be necessary. llvm-svn: 304221
* [CodeView] Rename ModuleDebugFragment -> DebugSubsection.Zachary Turner2017-05-302-10/+9
| | | | | | | This is more concise, and matches the terminology used in other parts of the codebase more closely. llvm-svn: 304218
* [SelectionDAG] Set ISD::FPOWI to Expand by defaultCraig Topper2017-05-302-1/+1
| | | | | | | | | | | | | | | | | Summary: Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie". This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default. Reviewers: spatel, RKSimon, efriedma Reviewed By: RKSimon Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits Differential Revision: https://reviews.llvm.org/D33530 llvm-svn: 304215
* [GlobalIsel] Fix a warning with GCC 7 -Wpedantic. NFCI.Davide Italiano2017-05-291-1/+1
| | | | llvm-svn: 304174
* [DAGCombiner] fix load narrowing transform to exclude loads with extensionSanjay Patel2017-05-291-1/+2
| | | | | | | | | | The extending load possibility was missed in: https://reviews.llvm.org/rL304072 We might want to handle this cases as a follow-up, but bailing out for now to avoid miscompiling. llvm-svn: 304153
* DebugInfo: Include .dwo file name when hashing multiple CUs in a single fileMehdi Amini2017-05-293-3/+13
| | | | | | | | | | | | | | | | | | | | | | | This is really a workaround for ThinLTO in particular - since it can import partial CUs that may end up looking very similar/the same as the same partial import in another ThinLTO compile. An alternative fix would be to change the DICompileUnit metadata to include a "primary file" or the like - and when importing for ThinLTO set the primary file to the name of the DICompileUnit that is being imported into. This involves changing the schema and would reduce the excessive uniqueness in the hash that this change creates - allowing diagnosing of more duplicate CUs than will be caught with this change. But duplicate CUs can still be caught in non-ThinLTO builds & are mostly a nuisance rather than a particularly deliberate/effective tool for finding broken code. (arguably the hash could always include the dwo file and nothing in fission would break, I think..) Reapply of r304119 after adding a triple to the test and moving it to the X86 directory. llvm-svn: 304130
* DebugInfo: Omit an empty CU when a subprogram was moved into its useMehdi Amini2017-05-291-8/+12
| | | | | | | | | | | When the only use of a CU is for a subprogram that's only emitted into the using CU (to avoid cross-CU references in DWO files), avoid creating that CU at all. Reapply of r304111 after adding a triple to the test and moving it to the X86 directory. llvm-svn: 304129
* Revert "[IfConversion] Keep the CFG updated incrementally in IfConvertTriangle"Tobias Grosser2017-05-291-21/+6
| | | | | | | | | | | | | | | | | | The reverted change introdued assertions ala: "MachineBasicBlock::succ_iterator llvm::MachineBasicBlock::removeSuccessor(succ_iterator, bool): Assertion `I != Successors.end() && "Not a current successor!"' Mikael, the original committer, wrote me that he is working on a fix, but that it likely will take some time to get this resolved. As this bug is one of the last two issues that keep the AOSP buildbot from turning green, I revert the original commit r302876. I am looking forward to see this recommitted after the assertion has been resolved. llvm-svn: 304128
* Revert "DebugInfo: Omit an empty CU when a subprogram was moved into its use"Mehdi Amini2017-05-291-12/+8
| | | | | | | This reverts commit r304111. GreenDragon is broken. llvm-svn: 304126
* Revert "DebugInfo: Include .dwo file name when hashing multiple CUs in a ↵Mehdi Amini2017-05-293-13/+3
| | | | | | | | single file" This reverts commit r304119 and r304118. GreenDragon is broken. llvm-svn: 304125
* DebugInfo: Include .dwo file name when hashing multiple CUs in a single fileDavid Blaikie2017-05-293-3/+13
| | | | | | | | | | | | | | | | | | | | This is really a workaround for ThinLTO in particular - since it can import partial CUs that may end up looking very similar/the same as the same partial import in another ThinLTO compile. An alternative fix would be to change the DICompileUnit metadata to include a "primary file" or the like - and when importing for ThinLTO set the primary file to the name of the DICompileUnit that is being imported into. This involves changing the schema and would reduce the excessive uniqueness in the hash that this change creates - allowing diagnosing of more duplicate CUs than will be caught with this change. But duplicate CUs can still be caught in non-ThinLTO builds & are mostly a nuisance rather than a particularly deliberate/effective tool for finding broken code. (arguably the hash could always include the dwo file and nothing in fission would break, I think..) llvm-svn: 304119
* Prune trailing whitespace. (To regenerate makefiles)NAKAMURA Takumi2017-05-281-2/+2
| | | | llvm-svn: 304112
* DebugInfo: Omit an empty CU when a subprogram was moved into its useDavid Blaikie2017-05-281-8/+12
| | | | | | | | When the only use of a CU is for a subprogram that's only emitted into the using CU (to avoid cross-CU references in DWO files), avoid creating that CU at all. llvm-svn: 304111
* [DAGCombiner] use narrow load to avoid vector extractSanjay Patel2017-05-271-0/+50
| | | | | | | | | | | | | | | | | | If we have (extract_subvector(load wide vector)) with no other users, that can just be (load narrow vector). This is intentionally conservative. Follow-ups may loosen the one-use constraint to account for the extract cost or just remove the one-use check. The memop chain updating is based on code that already exists multiple times in x86 lowering, so that should be pulled into a helper function as a follow-up. Background: this is a potential improvement noticed via regressions caused by making x86's peekThroughBitcasts() not loop on consecutive bitcasts (see comments in D33137). Differential Revision: https://reviews.llvm.org/D33578 llvm-svn: 304072
* AArch64/PEI: Do not add reserved regs to liveinsMatthias Braun2017-05-271-1/+2
| | | | | | | We do not track liveness for reserved registers. It is unnecessary to add them to block livein lists. llvm-svn: 304059
* ScheduleDAGInstrs: Fix fixupKills()Matthias Braun2017-05-273-159/+51
| | | | | | | | | | | | Rewrite fixupKills() to use the LivePhysRegs class. Simplifies the code and fixes a bug where the CSR registers in return blocks where missed leading to invalid kill flags. Also remove the unnecessary rule that we wouldn't set kill flags on tied operands. No tests as I have an upcoming commit improving MachineVerifier checks to catch these cases in multiple existing lit tests. llvm-svn: 304055
* [GlobalISel] Add a localizer pass for target to useQuentin Colombet2017-05-273-0/+127
| | | | | | | | | | | | | This reverts commit r299287 plus clean-ups. The localizer pass is a helper pass that could be run at O0 in the GISel pipeline to work around the deficiency of the fast register allocator. It basically shortens the live-ranges of the constants so that the allocator does not spill all over the place. Long term fix would be to make the greedy allocator fast. llvm-svn: 304051
* BranchRelaxation: computeLiveIns() after creating new blockMatthias Braun2017-05-271-0/+4
| | | | | | | | One case in BranchRelaxation did not compute liveins after creating a new block. This is catched by existing tests with an upcoming commit that will improve MachineVerifier checking of livein lists. llvm-svn: 304049
* LivePhysRegs: Add default for removeRegsInMask(Clobbers); NFCMatthias Braun2017-05-261-1/+1
| | | | llvm-svn: 304036
* MachineVerifier: Remove unused set; NFCMatthias Braun2017-05-261-5/+0
| | | | llvm-svn: 304035
* Make helper functions static. NFC.Benjamin Kramer2017-05-262-6/+8
| | | | llvm-svn: 304029
* DebugInfo: Do not emit empty CUsDavid Blaikie2017-05-262-15/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consistent with GCC and addresses a shortcoming with ThinLTO where many imported CUs may end up being empty (because the functions imported from them either ended up not being used (and were then discarded, since they're imported as available_externally) or optimized away entirely). Test cases previously testing empty CUs (either intentionally, or because they didn't need anything more complicated) had a trivial 'int' or similar basic type added to their retained types list. This is a first order approximation - a deeper implementation could do things like: 1) Be more lazy about construction of the CU - for example if two CUs containing a single identical retained type are linked together, with this change one of the two CUs will be produced but empty (since a duplicate type won't be produced). 2) Go further and invert all the CU links the same way the subprogram link is inverted - keep named CU lists of retained types, macros, etc, and have those link back to the CU. Then if they're emitted, the CU is emitted, but never otherwise - this would allow the metadata itself to be dropped earlier too, though it seems unlikely that's an important optimization as there shouldn't be many CUs relative to the number of other entities. llvm-svn: 304020
* DebugInfo: Don't include locations for debug-having code inlined into ↵David Blaikie2017-05-261-0/+4
| | | | | | | | | | | | | | | nodebug functions This produced 'strange' DWARF anyway - the CU would have no ranges (or at least not a range including the inlined code) nor any subprogram or inlined_subroutine - yet the line table would have entries for these instructions. (this actually becomes more relevant with changes coming after this, where a CU without any contents will be omitted entirely - so there would be no line table to put this on anyway) llvm-svn: 304004
* LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEIMatthias Braun2017-05-261-31/+48
| | | | | | | | | | | | | | | | | Re-commit r303938 and r303954 with a fix for addLiveIns(): the internal addPristines() function must be called on an empty set or it may accidentally reset saved registers. - addLiveOutsNoPristines() needs to add callee saved registers that are actually saved and restored somewhere to the set (they are not pristine). - Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines(). This fixes the problem from D32156. Differential Revision: https://reviews.llvm.org/D32464 llvm-svn: 304001
* [DAGCombiner] use narrow vector ops to eliminate concat/extract (PR32790)Sanjay Patel2017-05-261-0/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the best case: extract (binop (concat X1, X2), (concat Y1, Y2)), N --> binop XN, YN ...we kill all of the extract/concat and just have narrow binops remaining. If only one of the binop operands is amenable, this transform is still worthwhile because we kill some of the extract/concat. Optional bitcasting makes the code more complicated, but there doesn't seem to be a way to avoid that. The TODO about extending to more than bitwise logic is there because we really will regress several x86 tests including madd, psad, and even a plain integer-multiply-by-2 or shift-left-by-1. I don't think there's anything fundamentally wrong with this patch that would cause those regressions; those folds are just missing or brittle. If we extend to more binops, I found that this patch will fire on at least one non-x86 regression test. There's an ARM NEON test in test/CodeGen/ARM/coalesce-subregs.ll with a pattern like: t5: v2f32 = vector_shuffle<0,3> t2, t4 t6: v1i64 = bitcast t5 t8: v1i64 = BUILD_VECTOR Constant:i64<0> t9: v2i64 = concat_vectors t6, t8 t10: v4f32 = bitcast t9 t12: v4f32 = fmul t11, t10 t13: v2i64 = bitcast t12 t16: v1i64 = extract_subvector t13, Constant:i32<0> There was no functional change in the codegen from this transform from what I could see though. For the x86 test changes: 1. PR32790() is the closest call. We don't reduce the AVX1 instruction count in that case, but we improve throughput. Also, on a core like Jaguar that double-pumps 256-bit ops, there's an unseen win because two 128-bit ops have the same cost as the wider 256-bit op. SSE/AVX2/AXV512 are not affected which is expected because only AVX1 has the extract/concat ops to match the pattern. 2. do_not_use_256bit_op() is the best case. Everyone wins by avoiding the concat/extract. Related bug for IR filed as: https://bugs.llvm.org/show_bug.cgi?id=33026 3. The SSE diffs in vector-trunc-math.ll are just scheduling/RA, so nothing real AFAICT. 4. The AVX1 diffs in vector-tzcnt-256.ll are all the same pattern: we reduced the instruction count by one in each case by eliminating two insert/extract while adding one narrower logic op. https://bugs.llvm.org/show_bug.cgi?id=32790 Differential Revision: https://reviews.llvm.org/D33137 llvm-svn: 303997
* [DAG] Move legal type checks in store merge to be checked onlyNirav Dave2017-05-261-2/+4
| | | | | | on non-legal cases. NFC. llvm-svn: 303994
* [ARM] Fix lowering of misaligned memcpy/memsetJohn Brawn2017-05-261-12/+12
| | | | | | | | | | | | | | | | | Currently getOptimalMemOpType returns i32 for large enough sizes without checking for alignment, leading to poor code generation when misaligned accesses aren't permitted as we generate a word store then later split it up into byte stores. This means we inadvertantly go over the MaxStoresPerMemcpy limit and for memset we splat the memset value into a word then immediately split it up again. Fix this by leaving it up to FindOptimalMemOpLowering to figure out which type to use, but also fix a bug there where it wasn't correctly checking if misaligned memory accesses are allowed. Differential Revision: https://reviews.llvm.org/D33442 llvm-svn: 303990
* LivePhysRegs: Skip reserved regs in computeLiveIns; NFCIMatthias Braun2017-05-264-6/+12
| | | | | | | | | | Re-commit r303937 + r303949 as they were not the cause for the build failures. We do not track liveness of reserved registers so adding them to the liveins list in computeLiveIns() was completely unnecessary. llvm-svn: 303970
* Revert "LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEI"Matthias Braun2017-05-261-41/+27
| | | | | | | | | | Tentatively revert this to see if it fixes the buildbot stage2 breakages. This reverts commit r303938. This reverts commit r303954. llvm-svn: 303960
* Revert "LivePhysRegs: Skip reserved regs in computeLiveIns; NFCI"Matthias Braun2017-05-264-12/+6
| | | | | | | | | | Tentatively revert, suspecting that it caused breakage in stage2 buildbots. This reverts commit r303949. This reverts commit r303937. llvm-svn: 303955
* LivePhysRegs: Follow-up to r303937Matthias Braun2017-05-261-1/+1
| | | | | | | We may have situations in which a superregister is reserved and not added to liveins, so we have to add the subregisters. llvm-svn: 303949
* LivePhysRegs: Fix addLiveOutsNoPristines() for return blocks past PEIMatthias Braun2017-05-251-27/+41
| | | | | | | | | | | | | - addLiveOutsNoPristines() needs to add callee saved registers that are actually saved and restored somewhere to the set (they are not pristine). - Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines(). This fixes the problem from D32156. Differential Revision: https://reviews.llvm.org/D32464 llvm-svn: 303938
* LivePhysRegs: Skip reserved regs in computeLiveIns; NFCIMatthias Braun2017-05-254-5/+11
| | | | | | | We do not track liveness of reserved registers so adding them to the liveins list in computeLiveIns() was completely unnecessary. llvm-svn: 303937
OpenPOWER on IntegriCloud