summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/X86/cmpxchg-clobber-flags.ll
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Handle COPYs of physregs better (regalloc hints)Simon Pilgrim2018-09-191-2/+2
| | | | | | | | | | | | | | Enable enableMultipleCopyHints() on X86. Original Patch by @jonpa: While enabling the mischeduler for SystemZ, it was discovered that for some reason a test needed one extra seemingly needless COPY (test/CodeGen/SystemZ/call-03.ll). The handling for that is resulted in this patch, which improves the register coalescing by providing not just one copy hint, but a sorted list of copy hints. On SystemZ, this gives ~12500 less register moves on SPEC, as well as marginally less spilling. Instead of improving just the SystemZ backend, the improvement has been implemented in common-code (calculateSpillWeightAndHint(). This gives a lot of test failures, but since this should be a general improvement I hope that the involved targets will help and review the test updates. Differential Revision: https://reviews.llvm.org/D38128 llvm-svn: 342578
* [x86] Switch EFLAGS copy lowering to use reg-reg form of testing forChandler Carruth2018-04-181-6/+6
| | | | | | | | | | | | | | | | a zero register. Previously I tried this and saw LLVM unable to transform this to fold with memory operands such as spill slot rematerialization. However, it clearly works as shown in this patch. We turn these into `cmpb $0, <mem>` when useful for folding a memory operand without issue. This form has no disadvantage compared to `testb $-1, <mem>`. So overall, this is likely no worse and may be slightly smaller in some cases due to the `testb %reg, %reg` form. Differential Revision: https://reviews.llvm.org/D45475 llvm-svn: 330269
* [x86] Introduce a pass to begin more systematically fixing PR36028 and ↵Chandler Carruth2018-04-101-289/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | similar issues. The key idea is to lower COPY nodes populating EFLAGS by scanning the uses of EFLAGS and introducing dedicated code to preserve the necessary state in a GPR. In the vast majority of cases, these uses are cmovCC and jCC instructions. For such cases, we can very easily save and restore the necessary information by simply inserting a setCC into a GPR where the original flags are live, and then testing that GPR directly to feed the cmov or conditional branch. However, things are a bit more tricky if arithmetic is using the flags. This patch handles the vast majority of cases that seem to come up in practice: adc, adcx, adox, rcl, and rcr; all without taking advantage of partially preserved EFLAGS as LLVM doesn't currently model that at all. There are a large number of operations that techinaclly observe EFLAGS currently but shouldn't in this case -- they typically are using DF. Currently, they will not be handled by this approach. However, I have never seen this issue come up in practice. It is already pretty rare to have these patterns come up in practical code with LLVM. I had to resort to writing MIR tests to cover most of the logic in this pass already. I suspect even with its current amount of coverage of arithmetic users of EFLAGS it will be a significant improvement over the current use of pushf/popf. It will also produce substantially faster code in most of the common patterns. This patch also removes all of the old lowering for EFLAGS copies, and the hack that forced us to use a frame pointer when EFLAGS copies were found anywhere in a function so that the dynamic stack adjustment wasn't a problem. None of this is needed as we now lower all of these copies directly in MI and without require stack adjustments. Lots of thanks to Reid who came up with several aspects of this approach, and Craig who helped me work out a couple of things tripping me up while working on this. Differential Revision: https://reviews.llvm.org/D45146 llvm-svn: 329657
* [x86] Tidy up test case, generate check lines with script. NFC.Chandler Carruth2018-04-031-141/+412
| | | | | | | | | Just adds basic block labels and tidies up where comments go in the test case and then generates fresh CHECK lines with the script. This way, the check lines are much easier to maintain. They were already close to this but not quite there. llvm-svn: 329040
* X86: Fix X86CallFrameOptimization to search for the COPY StackPointerZvi Rackover2017-10-241-8/+34
| | | | | | | | | | | | | | | | | | | SelectionDAG inserts a copy of ESP into a virtual register. X86CallFrameOptimization assumed that the COPY, if present, is always right after the call-frame setup instruction (ADJCALLSTACKDOWN). This was a wrong assumption as the COPY can be located anywhere between the call-frame setup instruction and its first use. If the COPY happened to be located in a different location than what X86CallFrameOptimization assumed, visiting it while processing the call chain would lead to a conservative bail-out. The fix is quite straightfoward, scan ahead for the stack-pointer copy and make note of it so it can be ignored while processing the call chain. Fixes pr34903 Differential Revision: https://reviews.llvm.org/D38730 llvm-svn: 316416
* [MachineBasicBlock] Take advantage of the partially dead information.Quentin Colombet2016-04-261-2/+1
| | | | | | | Thanks to that information we wouldn't lie on a register being live whereas it is not. llvm-svn: 267622
* [MachineInstrBundle] Improvement the recognition of dead definitions.Quentin Colombet2016-04-261-2/+1
| | | | | | | Now, it is possible to know that partial definitions are dead definitions and recognize that clobbered registers are also dead. llvm-svn: 267621
* [X86] Enable call frame optimization ("mov to push") not only for optsize ↵Hans Wennborg2016-03-301-2/+4
| | | | | | | | | | (PR26325) The size savings are significant, and from what I can tell, both ICC and GCC do this. Differential Revision: http://reviews.llvm.org/D18573 llvm-svn: 264966
* MachineInstrBundle: Fix reversed isSuperRegisterEq() callMatthias Braun2016-01-051-8/+15
| | | | | | | | | | | | | | | Unfortunately this fix had the effect of exposing the -verify-machineinstrs FIXME of X86InstrInfo.cpp in two testcases for which I disabled it for now. Two testcases also have additional pushq/popq where the corrected code cannot prove that %rax is dead any longer. Looking at the examples, this could potentially be fixed by improving computeRegisterLiveness() to check the live-in lists of the successors blocks when reaching the end of a block. This fixes http://llvm.org/PR25951. llvm-svn: 256799
* CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness()Matthias Braun2015-12-111-43/+8
| | | | | | | | | | | | | | | | | | | | computeRegisterLiveness() was broken in that it reported dead for a register even if a subregister was alive. I assume this was because the results of analayzePhysRegs() are hard to understand with respect to subregisters. This commit: Changes the results of analyzePhysRegs (=struct PhysRegInfo) to be clearly understandable, also renames the fields to avoid silent breakage of third-party code (and improve the grammar). Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling it and removing workarounds for the bug. This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033 Differential Revision: http://reviews.llvm.org/D15320 llvm-svn: 255362
* X86: Don't emit SAHF/LAHF for 64-bit targets unless explicitly supportedHans Wennborg2015-12-041-23/+49
| | | | | | | | | | | | | | | These instructions are not supported by all CPUs in 64-bit mode. Emitting them causes Chromium to crash on start-up for users with such chips. (GCC puts these instructions behind -msahf on 64-bit for the same reason.) This patch adds FeatureLAHFSAHF, enables it by default for 32-bit targets and modern CPUs, and changes X86InstrInfo::copyPhysReg back to the lowering from before r244503 when the instructions are not available. Differential Revision: http://reviews.llvm.org/D15240 llvm-svn: 254793
* X86InstrInfo::copyPhysReg: workaround reg livenessJF Bastien2015-12-041-4/+39
| | | | | | | | | | | | | | | | | | | | | | Summary: computeRegisterLiveness and analyzePhysReg are currently getting confused about liveness in some cases, breaking copyPhysReg's calculation of whether AX is dead in some cases. Work around this issue temporarily by assuming that AX is always live. See detail in: https://llvm.org/bugs/show_bug.cgi?id=25033#c7 And associated bugs PR24535 PR25033 PR24991 PR24992 PR25201. This workaround makes the code correct but slightly inefficient, but it seems to confuse the machine instr verifier which now things EAX was undefined in some cases where it's being conservatively saved / restored. Reviewers: majnemer, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15198 llvm-svn: 254680
* x86: Emit LAHF/SAHF instead of PUSHF/POPFJF Bastien2015-08-101-28/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (priviledged emulators have forgotten to mask system bits in the past, and EFLAGS's DF bit is a constant source of hilarity). Commit r220529 fixed PR20376 by saving cmpxchg's flags result using EFLAGS, this commit now generated LAHF/SAHF instead, for all of x86 (not just NaCl) because it leads to an overall performance gain over PUSHF/POPF. As with the previous patch this code generation is pretty bad because it occurs very later, after register allocation, and in many cases it rematerializes flags which were already available (e.g. already in a register through SETE). Fortunately it's somewhat rare that this code needs to fire. I did [[ https://github.com/jfbastien/benchmark-x86-flags | a bit of benchmarking ]], the results on an Intel Haswell E5-2690 CPU at 2.9GHz are: | Time per call (ms) | Runtime (ms) | Benchmark | | 0.000012514 | 6257 | sete.i386 | | 0.000012810 | 6405 | sete.i386-fast | | 0.000010456 | 5228 | sete.x86-64 | | 0.000010496 | 5248 | sete.x86-64-fast | | 0.000012906 | 6453 | lahf-sahf.i386 | | 0.000013236 | 6618 | lahf-sahf.i386-fast | | 0.000010580 | 5290 | lahf-sahf.x86-64 | | 0.000010304 | 5152 | lahf-sahf.x86-64-fast | | 0.000028056 | 14028 | pushf-popf.i386 | | 0.000027160 | 13580 | pushf-popf.i386-fast | | 0.000023810 | 11905 | pushf-popf.x86-64 | | 0.000026468 | 13234 | pushf-popf.x86-64-fast | Clearly `PUSHF`/`POPF` are suboptimal. It doesn't really seems to be worth teaching LLVM about individual flags, at least not for this purpose. Reviewers: rnk, jvoung, t.p.northover Subscribers: llvm-commits Differential revision: http://reviews.llvm.org/D6629 llvm-svn: 244503
* Revert "Fix MO's analyzePhysReg, it was confusing sub- and super-registers. ↵JF Bastien2015-08-051-96/+28
| | | | | | | | Problem pointed out by Michael Hordijk." I mistakenly committed the patch for D6629, and was trying to commit another. Reverting until it gets proper signoff. llvm-svn: 244121
* Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem ↵JF Bastien2015-08-051-28/+96
| | | | | | pointed out by Michael Hordijk. llvm-svn: 244120
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794
* x86-32: PUSHF/POPF use/def EFLAGSJF Bastien2014-12-161-14/+15
| | | | | | | | | | | | | | Summary: As a side-quest for D6629 jvoung pointed out that I should use -verify-machineinstrs and this found a bug in x86-32's handling of EFLAGS for PUSHF/POPF. This patch fixes the use/def, and adds -verify-machineinstrs to all x86 tests which contain 'EFLAGS'. One exception: this patch leaves inline-asm-fpstack.ll as-is because it fails -verify-machineinstrs in a way unrelated to EFLAGS. This patch also modifies cmpxchg-clobber-flags.ll along the lines of what D6629 already does by also testing i386. Test Plan: ninja check Reviewers: t.p.northover, jvoung Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6687 llvm-svn: 224359
* ScheduleDAG: record PhysReg dependencies represented by CopyFromReg nodesTim Northover2014-10-231-0/+86
x86's CMPXCHG -> EFLAGS consumer wasn't being recorded as a real EFLAGS dependency because it was represented by a pair of CopyFromReg(EFLAGS) -> CopyToReg(EFLAGS) nodes. ScheduleDAG was expecting the source to be an implicit-def on the instruction, where the result numbers in the DAG and the Uses list in TableGen matched up precisely. The Copy notation seems much more robust, so this patch extends ScheduleDAG rather than refactoring x86. Should fix PR20376. llvm-svn: 220529
OpenPOWER on IntegriCloud