path: root/llvm/lib/Target/X86/X86ISelLowering.cpp
Commit message | Author | Age | Files | Lines
...
* Remove ISD::SETCC match from combineX86ADD. It's done improperly and doesn't ↵Amaury Sechet2017-06-011-2/+1
| | | | | | work. llvm-svn: 304403
* Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.Galina Kistanova2017-05-311-5/+13
| | | | llvm-svn: 304312
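The kind of annotation this commit adds can be illustrated with a minimal sketch. It assumes a C++17 compiler, and `SKETCH_FALLTHROUGH` and `countLevels` are hypothetical names standing in for LLVM_FALLTHROUGH and the real call sites; it does not reproduce the actual change.

```cpp
// Hypothetical stand-in for LLVM_FALLTHROUGH; assumes C++17 [[fallthrough]].
#define SKETCH_FALLTHROUGH [[fallthrough]]

// Counts how many levels apply to a value: level 2 implies level 1 as well.
int countLevels(int Level) {
  int Count = 0;
  switch (Level) {
  case 2:
    ++Count;
    SKETCH_FALLTHROUGH; // intentional: level 2 also runs the level 1 case
  case 1:
    ++Count;
    break;
  default:
    break;
  }
  return Count;
}
```

Marking the fall-through explicitly tells both the compiler and the reader that the missing `break` is deliberate, which is what silences the warning without changing behavior.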
* Revert "This patch closes PR28513: an optimization of multiplication by ↵Vedant Kumar2017-05-301-79/+1
| | | | different constants. It's implemented on DAG combiner level."
| | | | This reverts commit r304209. I think this change is responsible for a tablegen failure in stage2 builds: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rthinlto_build/2171/
| | | | I reproduced the failure locally (without ThinLTO), reverted the commit, rebuilt the stage1 clang, rebuilt the stage2 llvm-tblgen tool, and found that the crash disappears when the commit is reverted.
| | | | Here is the stack trace:
| | | | FAILED: lib/Target/ARM/ARMGenRegisterBank.inc.tmp
| | | | cd /Volumes/Builds/pz-master-stage2-RA/lib/Target/ARM && /Volumes/Builds/pz-master-stage2-RA/bin/llvm-tblgen -gen-register-bank -I /Users/vk/llvm/lib/Target/ARM -I /Users/vk/llvm/include -I /Users/vk/llvm/lib/Target /Users/vk/llvm/lib/Target/ARM/ARM.td -o /Volumes/Builds/pz-master-stage2-RA/lib/Target/ARM/ARMGenRegisterBank.inc.tmp
| | | | 0 llvm-tblgen 0x0000000106fc9568 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 40
| | | | 1 llvm-tblgen 0x0000000106fc9be6 SignalHandler(int) + 422
| | | | 2 libsystem_platform.dylib 0x00000001076a7fba _sigtramp + 26
| | | | 3 libsystem_platform.dylib 0x00007fff58deb468 _sigtramp + 1366570184
| | | | 4 llvm-tblgen 0x0000000106e89cc7 llvm::CodeGenRegBank::getCompositeSubRegIndex(llvm::CodeGenSubRegIndex*, llvm::CodeGenSubRegIndex*) + 615
| | | | 5 llvm-tblgen 0x0000000106e88be6 llvm::CodeGenRegister::computeSubRegs(llvm::CodeGenRegBank&) + 2182
| | | | 6 llvm-tblgen 0x0000000106e8e9f0 llvm::CodeGenRegBank::CodeGenRegBank(llvm::RecordKeeper&) + 2192
| | | | 7 llvm-tblgen 0x0000000106f384a1 llvm::EmitRegisterBank(llvm::RecordKeeper&, llvm::raw_ostream&) + 65
| | | | 8 llvm-tblgen 0x0000000106f72c64 (anonymous namespace)::LLVMTableGenMain(llvm::raw_ostream&, llvm::RecordKeeper&) + 1172
| | | | 9 llvm-tblgen 0x0000000106fcb15f llvm::TableGenMain(char*, bool (*)(llvm::raw_ostream&, llvm::RecordKeeper&)) + 3599
| | | | 10 llvm-tblgen 0x0000000106f727a6 main + 134
| | | | 11 libdyld.dylib 0x000000010733c6a5 start + 1
| | | | Stack dump:
| | | | 0. Program arguments: /Volumes/Builds/pz-master-stage2-RA/bin/llvm-tblgen -gen-register-bank -I /Users/vk/llvm/lib/Target/ARM -I /Users/vk/llvm/include -I /Users/vk/llvm/lib/Target /Users/vk/llvm/lib/Target/ARM/ARM.td -o /Volumes/Builds/pz-master-stage2-RA/lib/Target/ARM/ARMGenRegisterBank.inc.tmp
| | | | /bin/sh: line 1: 41986 Segmentation fault: 11 /Volumes/Builds/pz-master-stage2-RA/bin/llvm-tblgen -gen-register-bank -I /Users/vk/llvm/lib/Target/ARM -I /Users/vk/llvm/include -I /Users/vk/llvm/lib/Target /Users/vk/llvm/lib/Target/ARM/ARM.td -o /Volumes/Builds/pz-master-stage2-RA/lib/Target/ARM/ARMGenRegisterBank.inc.tmp
| | | | llvm-svn: 304231
* [SelectionDAG] Set ISD::FPOWI to Expand by defaultCraig Topper2017-05-301-1/+0
| | | | | | | | | | | | | | | | | Summary: Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie". This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default. Reviewers: spatel, RKSimon, efriedma Reviewed By: RKSimon Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits Differential Revision: https://reviews.llvm.org/D33530 llvm-svn: 304215
* This patch closes PR28513: an optimization of multiplication by different ↵Andrew V. Tischenko2017-05-301-1/+79
| | | | | | | | constants. It's implemented on DAG combiner level. llvm-svn: 304209
* [X86] Adding vpopcntd and vpopcntq instructionsOren Ben Simhon2017-05-251-0/+8
| | | | | | | | | AVX512_VPOPCNTDQ is a new feature set that was published by Intel. The patch represents the LLVM side of the addition of two new intrinsic based instructions (vpopcntd and vpopcntq). Differential Revision: https://reviews.llvm.org/D33169 llvm-svn: 303858
* [X86][AVX512] Make i1 illegal in the CodeGenGuy Blank2017-05-191-59/+56
| | | | | | | | | | This patch defines the i1 type as illegal in the X86 backend for AVX512. For DAG operations on <N x i1> types (build vector, extract vector element, ...) i8 is used, and should be truncated/extended. This should produce better scalar code for i1 types since GPRs will be used instead of mask registers. Differential Revision: https://reviews.llvm.org/D32273 llvm-svn: 303421
* Fix PR33028Michael Liao2017-05-171-13/+34
| | | | | | | | | | - '-verify-machineinstrs' starts to complain about allocatable live-in physical registers on non-entry or non-landing-pad basic blocks. - Refactor the XBEGIN translation to define EAX on a dedicated fallback code path due to XABORT. Add a pseudo instruction to define EAX explicitly to avoid adding a physical register live-in. Differential Revision: https://reviews.llvm.org/D33168 llvm-svn: 303306
* IR: Give function GlobalValue::getRealLinkageName() a less misleading name: ↵Peter Collingbourne2017-05-161-2/+2
| | | | | | | | | | | | dropLLVMManglingEscape(). This function gives the wrong answer on some non-ELF platforms in some cases. The function that does the right thing lives in Mangler.h. To try to discourage people from using this function, give it a different name. Differential Revision: https://reviews.llvm.org/D33162 llvm-svn: 303134
* [X86] Utilize SelectionDAG::getSelect(). NFC.Zvi Rackover2017-05-141-34/+27
| | | | | | | | | | Replace SelectionDAG::getNode(ISD::SELECT, ...) and SelectionDAG::getNode(ISD::VSELECT, ...) with SelectionDAG::getSelect(...) Saves a few lines of code and in some cases saves the need to explicitly check the type of the desired node. llvm-svn: 303024
* [X86] Remove unused value from IntrinsicType enum. NFCCraig Topper2017-05-141-6/+0
| | | | llvm-svn: 303018
* [X86][AVX] Allow 32-bit targets to peek through subvectors to extract ↵Simon Pilgrim2017-05-141-1/+10
| | | | | | constant splats for vXi64 shifts. llvm-svn: 303009
* [KnownBits] Add bit counting methods to KnownBits struct and use them where ↵Craig Topper2017-05-121-1/+1
| | | | | | | | | | | | possible This patch adds min/max population count, leading/trailing zero/one bit counting methods. The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give. Differential Revision: https://reviews.llvm.org/D32931 llvm-svn: 302925
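The difference between the min and max counting methods described above can be sketched on a simplified model. The struct below is a hypothetical 32-bit stand-in, not LLVM's KnownBits class, and all names in it are illustrative assumptions.

```cpp
#include <bitset>
#include <cstdint>

// Counts the run of set bits starting at the most significant bit.
inline unsigned leadingOnes(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Bit = 0x80000000u; Bit != 0 && (V & Bit); Bit >>= 1)
    ++N;
  return N;
}

// Simplified 32-bit model of known-bits tracking (hypothetical).
struct KnownBits32 {
  uint32_t Zero; // bits known to be 0
  uint32_t One;  // bits known to be 1

  // Min population: only bits known to be 1 are counted.
  unsigned minPopulation() const {
    return static_cast<unsigned>(std::bitset<32>(One).count());
  }
  // Max population: every bit not known to be 0 might turn out to be 1.
  unsigned maxPopulation() const {
    return static_cast<unsigned>(std::bitset<32>(~Zero).count());
  }
  // Min leading zeros: the run of known-zero bits at the top.
  unsigned minLeadingZeros() const { return leadingOnes(Zero); }
  // Max leading zeros: top bits not known to be 1 might all be 0.
  unsigned maxLeadingZeros() const { return leadingOnes(~One); }
};
```

For a value whose upper 16 bits are known zero and whose two lowest bits are known one, this model reports a population between 2 and 16 and between 16 and 30 leading zeros.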
* Issue diagnostics when returning FP values on x86_64 without SSE1/2Reid Kleckner2017-05-111-9/+24
| | | | | | | | | | | Avoid using report_fatal_error, because it will ask the user to file a bug. If the user attempts to disable SSE on x86_64 and then use floating point, that's a bug in their code, not a bug in the compiler. This is just a start. There are other ways to crash the backend in this configuration, but they should be updated to follow this pattern. Differential Revision: https://reviews.llvm.org/D27522 llvm-svn: 302835
* [x86] Fix a failure to select with AVX-512 when the type legalizerChandler Carruth2017-05-111-5/+29
| | | | | | | | | | | | | | | | | | | | | | | | | manages to form a VSELECT with a non-i1 element type condition. Those are technically allowed in SDAG (at least, the generic type legalization logic will form them and I wouldn't want to try to audit everything to preclude forming them) so we need to be able to lower them. This isn't too hard to implement. We mark VSELECT as custom so we get a chance in C++, add a fast path for i1 conditions to get directly handled by the patterns, and a fallback when we need to manually force the condition to be an i1 that uses the vptestm instruction to turn a non-mask into a mask. This, unsurprisingly, generates awful code. But it at least doesn't crash. This was actually impacting open source packages built with LLVM for AVX-512 in the wild, so quickly landing a patch that at least stops the immediate bleeding. I think I've found where to fix the codegen quality issue, but less confident of that change so separating it out from the thing that doesn't change the result of any existing test case but causes mine to not crash. llvm-svn: 302785
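The fallback described above, turning a non-mask condition vector into a mask and then selecting with it, can be modeled per lane. This is a sketch under assumptions: `testToMask` models a vptestm-style test as "mask bit set when the AND of the two lanes is non-zero", and both function names are hypothetical rather than taken from the patch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Models a test-to-mask operation: bit I is set when (A[I] & B[I]) != 0.
template <std::size_t N>
uint16_t testToMask(const std::array<int32_t, N> &A,
                    const std::array<int32_t, N> &B) {
  static_assert(N <= 16, "the mask is modeled as 16 bits");
  uint16_t Mask = 0;
  for (std::size_t I = 0; I < N; ++I)
    if ((A[I] & B[I]) != 0)
      Mask = static_cast<uint16_t>(Mask | (1u << I));
  return Mask;
}

// Models a mask-driven vector select: lane I takes T[I] when bit I is set.
template <std::size_t N>
std::array<int32_t, N> selectByMask(uint16_t Mask,
                                    const std::array<int32_t, N> &T,
                                    const std::array<int32_t, N> &F) {
  std::array<int32_t, N> R{};
  for (std::size_t I = 0; I < N; ++I)
    R[I] = ((Mask >> I) & 1) ? T[I] : F[I];
  return R;
}
```

Testing the condition vector against itself yields one mask bit per non-zero lane, which is the shape a mask-register select expects.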
* Strip trailing whitespace. NFCI.Simon Pilgrim2017-05-111-1/+1
| | | | llvm-svn: 302784
* Suppress all uses of LLVM_END_WITH_NULL. NFC.Serge Guelton2017-05-091-3/+2
| | | | | | | | | Use variadic templates instead of relying on <cstdarg> + sentinel. This enforces better type checking and makes code more readable. Differential Revision: https://reviews.llvm.org/D32541 llvm-svn: 302571
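The general idea of replacing a sentinel-terminated variadic call with a variadic template can be sketched as follows. `makeNameList` is a hypothetical helper chosen for illustration; it is not one of the functions the commit touches.

```cpp
#include <string>
#include <vector>

// Builds a list from any number of string-like arguments. No trailing null
// sentinel is needed, and every argument is type-checked at compile time.
template <typename... Ts>
std::vector<std::string> makeNameList(const Ts &... Names) {
  return {std::string(Names)...};
}
```

A call such as `makeNameList("eax", "ebx")` yields two entries, while passing a non-string argument is rejected by the compiler instead of being misread through `va_arg` at run time.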
* [AVX512] Only look at lower bit in constant scalar masksGuy Blank2017-05-091-2/+4
| | | | | | | | | For scalar masked instructions only the lower bit of the mask is relevant, so for constant masks we should either do an unmasked operation or no operation, depending on the value of the lower bit. This patch handles cases where the lower bit is '1'. Differential Revision: https://reviews.llvm.org/D32805 llvm-svn: 302546
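A scalar model makes the reasoning concrete: if only bit 0 of the mask is read, then any constant mask with bit 0 set behaves like an unmasked operation. The functions below are hypothetical illustrations, not code from the patch.

```cpp
#include <cstdint>

// Models a scalar masked add: only bit 0 of Mask decides whether the
// operation happens or the pass-through value is kept.
int32_t maskedScalarAdd(int32_t A, int32_t B, int32_t PassThru, uint8_t Mask) {
  return (Mask & 1) ? A + B : PassThru;
}

// A constant mask can be treated as "unmasked" whenever its low bit is 1,
// regardless of the upper bits.
bool constantMaskMeansUnmasked(uint8_t Mask) { return (Mask & 1) != 0; }
```

Masks 0x01, 0x03 and 0xFF therefore all produce the same result, while 0xFE leaves the pass-through value in place.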
* Add extra operand to CALLSEQ_START to keep frame part set up previouslySerge Pavlov2017-05-091-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using arguments with the inalloca attribute creates problems for verification of the machine representation. This attribute tells the backend that the argument is prepared on the stack prior to the CALLSEQ_START..CALLSEQ_END sequence (see http://llvm.org/docs/InAlloca.htm for details). The frame size stored in CALLSEQ_START in this case does not count the size of this argument. However, CALLSEQ_END still keeps the total frame size, as the caller can be responsible for cleanup of the entire frame. So CALLSEQ_START and CALLSEQ_END keep different frame sizes, and the difference is treated by MachineVerifier as a stack error. Currently there is no way to distinguish this case from actual errors. This patch adds an additional argument to CALLSEQ_START and its target-specific counterparts to keep the size of the stack that is set up prior to the call frame sequence. This argument allows MachineVerifier to calculate the actual frame size associated with the frame setup instruction and correctly process the case of inalloca arguments. The changes made by the patch are: - Frame setup instructions get a second mandatory argument. It affects all targets that use frame pseudo instructions and touches many files, although the changes are uniform. - Access to frame properties is implemented using special instructions rather than calls to getOperand(N).getImm(). For X86 and ARM such a replacement was made previously. - Changes that reflect the appearance of the additional argument of the frame setup instruction. These involve proper instruction initialization and methods that access instruction arguments. - MachineVerifier retrieves the frame size using a method which reports the sum of the frame parts initialized inside the frame instruction pair and outside it. The patch implements the approach proposed by Quentin Colombet in https://bugs.llvm.org/show_bug.cgi?id=27481#c1. It fixes 9 tests that failed with the machine verifier enabled and are listed in PR27481. 
Differential Revision: https://reviews.llvm.org/D32394 llvm-svn: 302527
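The frame-size bookkeeping described in this commit can be modeled with two small records. This is only a sketch: `FrameSetup`, `FrameDestroy` and the helper functions are hypothetical names, and the real verifier works on machine instructions rather than plain structs.

```cpp
#include <cstdint>

// Models a frame setup pseudo instruction with its two size operands.
struct FrameSetup {
  int64_t SizeInSequence;  // stack set up inside the call frame sequence
  int64_t SizeSetUpBefore; // stack prepared earlier, e.g. inalloca arguments
};

// Models a frame destroy pseudo instruction, which keeps the total size.
struct FrameDestroy {
  int64_t TotalSize;
};

// The size a verifier would compare: both parts of the frame added together.
int64_t totalFrameSize(const FrameSetup &S) {
  return S.SizeInSequence + S.SizeSetUpBefore;
}

// True when setup and destroy describe the same frame.
bool frameSizesAgree(const FrameSetup &S, const FrameDestroy &D) {
  return totalFrameSize(S) == D.TotalSize;
}
```

With the extra operand, a 16-byte sequence preceded by 8 bytes of earlier setup matches a 24-byte destroy; without it, the same pair would be reported as a mismatch.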
* [X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X)Simon Pilgrim2017-05-091-0/+8
| | | | | | | | Similar to what we do for vXi8 ASHR(X, 7), use SSE42's PCMPGTQ to splat the sign instead of using PSRAD+PSHUFD. By avoiding bitcasts, this improves combines that utilize computeNumSignBits, permits memory folding and reduces pipe pressure. Although it does require a second register, given that this is a (cheap) zero register the impact is minimal. Differential Revision: https://reviews.llvm.org/D32973 llvm-svn: 302525
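The equivalence behind this lowering can be checked on a single 64-bit lane. The sketch below assumes that right-shifting a negative signed value is an arithmetic shift (guaranteed since C++20 and the behavior of mainstream compilers before that); the function names are illustrative only.

```cpp
#include <cstdint>

// One lane of ASHR(X, 63): splats the sign bit across all 64 bits.
// Assumes >> on a signed value is an arithmetic shift.
int64_t ashr63(int64_t X) { return X >> 63; }

// One lane of a signed greater-than compare: all ones if A > B, else zero.
int64_t pcmpgtqLane(int64_t A, int64_t B) { return A > B ? -1 : 0; }

// The two forms agree: 0 > X holds exactly when X is negative.
bool signSplatsAgree(int64_t X) { return ashr63(X) == pcmpgtqLane(0, X); }
```

Both forms yield all ones for negative inputs and zero otherwise, so the compare against a zero register can stand in for the shift.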
* [X86][SSE] Improve combineLogicBlendIntoPBLENDV to use general masks.Simon Pilgrim2017-05-081-23/+13
| | | | | | | | | | | | | Currently combineLogicBlendIntoPBLENDV can only match ASHR to detect sign splatting of a bit mask, this patch generalises this to use computeNumSignBits instead. This is a first step in several things we can do to improve PBLENDV support: * Better matching of X86ISD::ANDNP patterns. * Handle floating point cases. * Better vector and bitcast support in computeNumSignBits. * Recognise that PBLENDV only uses the sign bit of the mask, we should be able to strip away sign splats (ASHR, PCMPGT isNeg tests etc.). Differential Revision: https://reviews.llvm.org/D32953 llvm-svn: 302424
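Why a logic blend can become a sign-bit blend when the mask is a sign splat can be shown on bytes. This is a sketch: `blendBySignBit` models a PBLENDVB-style byte select on the top mask bit, the operand order is an assumption made for the example, and none of the names come from the patch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes = std::array<uint8_t, 4>;

// Variable blend model: lane I takes B[I] when the sign bit of Mask[I] is set.
Bytes blendBySignBit(const Bytes &Mask, const Bytes &A, const Bytes &B) {
  Bytes R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = (Mask[I] & 0x80) ? B[I] : A[I];
  return R;
}

// Logic blend model: (Mask & B) | (~Mask & A), computed bit by bit.
Bytes blendByLogic(const Bytes &Mask, const Bytes &A, const Bytes &B) {
  Bytes R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = static_cast<uint8_t>((Mask[I] & B[I]) | (~Mask[I] & A[I]));
  return R;
}
```

The two functions agree whenever every mask byte is all ones or all zeros, which is the property a sign-bit analysis such as computeNumSignBits can establish without requiring a literal ASHR.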
* [XRay] Custom event logging intrinsicDean Michael Berris2017-05-081-0/+4
| | | | | | | | | | | | | | | | | | This patch introduces an LLVM intrinsic and a target opcode for custom event logging in XRay. Initially, its use case will be to allow users of XRay to log some type of string ("poor man's printf"). The target opcode compiles to a noop sled large enough to enable calling through to a runtime-determined relative function call. At runtime, when XRay is enabled, the sled is replaced by compiler-rt with a trampoline to the logic for creating the custom log entries. Future patches will implement the compiler-rt parts and clang-side support for emitting the IR corresponding to this intrinsic. Reviewers: timshen, dberris Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits Differential Revision: https://reviews.llvm.org/D27503 llvm-svn: 302405
* [X86][AVX512] Relax assertion and just exit combine for unsupported types ↵Simon Pilgrim2017-05-061-1/+3
| | | | | | (PR32907) llvm-svn: 302361
* [X86][AVX512] Move v2i64/v4i64 VPABS lowering to tablegenSimon Pilgrim2017-05-061-2/+4
| | | | | | Extend NoVLX targets to use the 512-bit versions llvm-svn: 302359
* [X86] Reduce code for setting operations actions by merging into loops ↵Simon Pilgrim2017-05-061-37/+21
| | | | | | across multiple types/ops. NFCI. llvm-svn: 302357
* [X86][SSE] Break register dependencies on v16i8/v8i16 BUILD_VECTOR on SSE41Simon Pilgrim2017-05-061-2/+2
| | | | | | rL294581 broke unnecessary register dependencies on partial v16i8/v8i16 BUILD_VECTORs, but on SSE41 we (currently) use insertion for full BUILD_VECTORs as well. By allowing full insertion to occur on SSE41 targets we can break register dependencies here as well. llvm-svn: 302355
* [X86] Use SDValue::getConstantOperandVal helper. NFCI.Simon Pilgrim2017-05-051-9/+6
| | | | llvm-svn: 302286
* [KnownBits] Add wrapper methods for setting and clearing all bits in the ↵Craig Topper2017-05-051-2/+2
| | | | | | | | | | underlying APInts in KnownBits. This adds routines for resetting KnownBits to unknown, making the value all zeros or all ones. It also adds methods for querying if the value is zero, all ones or unknown. Differential Revision: https://reviews.llvm.org/D32637 llvm-svn: 302262
* [X86][AVX512] Improve support and testing for CTLZ of 512-bit vectors ↵Simon Pilgrim2017-05-051-3/+7
| | | | | | without CDI llvm-svn: 302233
* [X86] Remove duplicate operation actions. NFCI.Simon Pilgrim2017-05-051-5/+0
| | | | llvm-svn: 302230
* [X86][AVX512CDI] Move v2i64/v4i64 and v4i32/v8i32 VPLZCNT lowering to tablegenSimon Pilgrim2017-05-051-39/+13
| | | | | | Extend NoVLX targets to use the 512-bit versions llvm-svn: 302229
* Remove unused variableSimon Pilgrim2017-05-051-1/+0
| | | | llvm-svn: 302226
* [X86][AVX] Add LowerIntUnary helpers to split unary vector ops in half. NFCI.Simon Pilgrim2017-05-051-76/+51
| | | | | | Same as LowerIntArith helpers but for unary ops instead of binary. llvm-svn: 302222
* [KnownBits] Add zext, sext, and trunc methods to KnownBitsCraig Topper2017-05-031-2/+1
| | | | | | | | This patch adds zext, sext, and trunc methods to KnownBits and uses them where possible. Differential Revision: https://reviews.llvm.org/D32784 llvm-svn: 302088
* Silence a 'enum and non-enum used in conditional' warning.Simon Pilgrim2017-05-031-1/+1
| | | | llvm-svn: 302048
* [X86][LWP] Add llvm support for LWP instructions (reapplied).Simon Pilgrim2017-05-031-0/+14
| | | | | | | | This patch adds support for the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4). Reapplied - this time without changing line endings of existing files. Differential Revision: https://reviews.llvm.org/D32769 llvm-svn: 302041
* Revert rL302028 due to accidental line ending changes.Simon Pilgrim2017-05-031-14/+0
| | | | llvm-svn: 302038
* [X86][LWP] Add llvm support for LWP instructions.Simon Pilgrim2017-05-031-0/+14
| | | | | | This patch adds support for the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4). Differential Revision: https://reviews.llvm.org/D32769 llvm-svn: 302028
* [X86][AVX512] remove unnecessary case. NFCGuy Blank2017-05-031-2/+1
| | | | | | | | VFPCLASS is for vector types and not scalar, so it cannot get here. Differential Revision: https://reviews.llvm.org/D32694 llvm-svn: 302023
* [X86] Support of no_caller_saved_registers attributeOren Ben Simhon2017-05-031-7/+23
| | | | | | | | | This patch implements the LLVM part for no_caller_saved_registers attribute as appears here: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=5ed3cc7b66af4758f7849ed6f65f4365be8223be. In order to implement the attribute, we use the dynamic CSR mechanism to remove returned/passed arguments from the function regmask/CSR list. Differential Revision: https://reviews.llvm.org/D31876 llvm-svn: 302020
* [X86] Refactored LowerINTRINSIC_W_CHAIN to use a switch statement. NFCI.Simon Pilgrim2017-05-031-7/+9
| | | | | | Pre-commit as requested in D32769. llvm-svn: 302010
* [X86] Tidyup subvector insert/extract helpers. NFCI.Simon Pilgrim2017-05-021-24/+9
| | | | | | Use getConstantOperandVal where possible. llvm-svn: 301912
* Fix typo in comment. NFCI.Simon Pilgrim2017-05-021-2/+2
| | | | llvm-svn: 301911
* [X86] Reduce code for setting operations actions by merging into loops ↵Simon Pilgrim2017-05-011-129/+68
| | | | | | across multiple types/ops. NFCI. llvm-svn: 301879
* [X86][AVX] Rename LowerVectorBroadcast to lowerBuildVectorAsBroadcast. NFCI.Simon Pilgrim2017-05-011-11/+8
| | | | | | Since the shuffle refactor, this is only used during BUILD_VECTOR lowering. llvm-svn: 301834
* Generalize the specialized flag-carrying SDNodes by moving flags into SDNode.Amara Emerson2017-05-011-10/+8
| | | | | | | | This removes BinaryWithFlagsSDNode, and flags are now all passed by value. Differential Revision: https://reviews.llvm.org/D32527 llvm-svn: 301803
* Do not legalize large add with addc/adde, introduce addcarry and do it with ↵Amaury Sechet2017-04-301-0/+64
| | | | | | | | | | | | uaddo/addcarry Summary: As per discussion on how to get better codegen on large int legalization, it became clear that using a glue for the carry was preventing several desirable optimizations. Passing the carry down as a value allows for more flexibility. Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D29872 llvm-svn: 301775
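The idea of carrying the carry as an ordinary value can be sketched by splitting a 64-bit add into two 32-bit halves. The helpers below only model the shape of uaddo and addcarry (a sum plus a carry flag); their names and signatures are assumptions for this example, not the DAG node definitions.

```cpp
#include <cstdint>
#include <utility>

// uaddo-like step: returns the wrapped sum and the carry-out as a value.
std::pair<uint32_t, bool> addWithOverflow(uint32_t A, uint32_t B) {
  uint32_t Sum = A + B;
  return {Sum, Sum < A};
}

// addcarry-like step: A + B + CarryIn, again returning the carry-out.
std::pair<uint32_t, bool> addWithCarry(uint32_t A, uint32_t B, bool CarryIn) {
  uint32_t Partial = A + B;
  bool Carry1 = Partial < A;
  uint32_t Sum = Partial + (CarryIn ? 1u : 0u);
  bool Carry2 = Sum < Partial;
  return {Sum, Carry1 || Carry2};
}

// A 64-bit add legalized into two 32-bit halves linked by a carry value.
uint64_t add64ViaHalves(uint64_t A, uint64_t B) {
  auto Lo = addWithOverflow(static_cast<uint32_t>(A), static_cast<uint32_t>(B));
  auto Hi = addWithCarry(static_cast<uint32_t>(A >> 32),
                         static_cast<uint32_t>(B >> 32), Lo.second);
  return (static_cast<uint64_t>(Hi.first) << 32) | Lo.first;
}
```

Because the carry is a plain boolean result here, it can be inspected, reused or folded like any other value, which is the flexibility the commit message refers to.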
* [APInt] Replace calls to setBits with more specific calls to setBitsFrom and ↵Craig Topper2017-04-301-3/+3
| | | | | | setLowBits where possible. llvm-svn: 301768
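How the more specific calls relate to a general range setter can be sketched on a plain 64-bit integer. The free functions below are hypothetical models of setBits, setBitsFrom and setLowBits operating on `uint64_t`; they are not the APInt member functions themselves.

```cpp
#include <cstdint>

// General form: sets the half-open bit range [Lo, Hi) in a 64-bit value.
uint64_t withBits(uint64_t V, unsigned Lo, unsigned Hi) {
  if (Lo >= Hi)
    return V;
  unsigned Width = Hi - Lo;
  // Avoid shifting by 64, which would be undefined behavior.
  uint64_t Mask = (Width == 64) ? ~0ull : (((1ull << Width) - 1) << Lo);
  return V | Mask;
}

// Specific form: sets every bit from Lo up to the top bit.
uint64_t withBitsFrom(uint64_t V, unsigned Lo) { return withBits(V, Lo, 64); }

// Specific form: sets the N lowest bits.
uint64_t withLowBits(uint64_t V, unsigned N) { return withBits(V, 0, N); }
```

The specific forms state the intent at the call site and drop one argument, which is the readability gain of preferring them where they apply.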
* [X86] Clear KnownBits instead of reconstructing it. NFCCraig Topper2017-04-301-1/+1
| | | | llvm-svn: 301767
* TargetLowering: Add finalizeLowering() function; NFCMatthias Braun2017-04-281-4/+11
| | | | | | | | | | | | | | | Adds a new method finalizeLowering to TargetLoweringBase. This is in preparation for an upcoming commit. This function is meant for target specific adjustments to MachineFrameInfo or register reservations. Move the freezeRegisters() and the hasCopyImplyingStackAdjustment() handling into the new function to prove the concept. As an added bonus GlobalISel no longer misses the hasCopyImplyingStackAdjustment() handling with this. Differential Revision: https://reviews.llvm.org/D32621 llvm-svn: 301679