summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [DWARF] Suppress .loc directives from CFI instructionsPaul Robinson2016-12-091-2/+2
| | | | | | | | | Like DBG_VALUE, these emit nothing to the .text section, and sometimes have no source location specified. Just ignore them. Differential Revision: http://reviews.llvm.org/D27492 llvm-svn: 289256
* [SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes (REAPPLIED)Simon Pilgrim2016-12-091-0/+36
| | | | | | Reapplied with fix for PR31323 - X86 SSE2 vXi16 multiplies for illegal types were creating CONCAT_VECTORS nodes with vector inputs that might not total the number of elements in the result type. llvm-svn: 289232
* AMDGPU: Fix i128 mulMatt Arsenault2016-12-091-1/+1
| | | | llvm-svn: 289231
* Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵Nirav Dave2016-12-092-142/+276
| | | | | | | | UseAA is enabled." This reverts commit r289221 which appears to be triggering an assertion llvm-svn: 289226
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵Nirav Dave2016-12-092-276/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled. Retrying after fixing overly aggressive load-store forwarding optimization. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates *worse* code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores *CAN* be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289221
* Use SelectionDAG.getSplatBuildVector helper. NFCI.Simon Pilgrim2016-12-091-6/+5
| | | | llvm-svn: 289220
* [SelectionDAG] Use SelectionDAG.getBuildVector helper. NFCI.Simon Pilgrim2016-12-092-9/+6
| | | | | | Makes interception of BUILD_VECTOR creation easier for debugging. llvm-svn: 289218
* [SelectionDAG] Add additional checks to CONCAT_VECTORS creationSimon Pilgrim2016-12-091-0/+10
| | | | | | Part of the work for PR31323 - add extra asserts checking that the input vectors are of consistent type and result in the correct number of vector elements. llvm-svn: 289214
* Plug another leak in the DWARF unittests, DIEInlineStrings are never destroyed.Benjamin Kramer2016-12-091-1/+1
| | | | llvm-svn: 289208
* [SelectionDAG] Add partial BITCAST support to computeKnownBitsSimon Pilgrim2016-12-091-0/+44
| | | | | | | | | | Adds support for bitcasting a little endian 'small element' vector to 'large element' scalar/vector (e.g. v16i8 to v4i32 or v2i32 to i64), which is required for PR30845. We extract the knownbits for each 'small element' part and concatenate the results together. We can add support for big endian and 'large element' scalar/vector to 'small element' vector bitcasting once we have test cases for them. Differential Revision: https://reviews.llvm.org/D27129 llvm-svn: 289200
* Revert "[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes"Daniel Jasper2016-12-091-36/+0
| | | | | | | | This reverts commit r288916 as it is currently causing a crasher in Halide. Reproducer on llvm.org/PR31323. While it might be that halide is generating invalid IR, llc shouldn't crash. llvm-svn: 289194
* GlobalISel: fall back gracefully for debug intrinsics.Tim Northover2016-12-081-0/+6
| | | | | | | Supporting them properly is a reasonably complex chunk of work, so to allow bot testing before then we should at least be able to fall back to DAG ISel. llvm-svn: 289150
* GlobalISel: factor overflow handling into separate function. NFC.Tim Northover2016-12-081-28/+38
| | | | llvm-svn: 289149
* Don't emit .seh_handler directives for any cleanup funcletsReid Kleckner2016-12-081-6/+6
| | | | | | | | | | | | | | | | We were falsely claiming that we had an LSDA for the relevant EH personality before this change, which could lead to the EH machinery interpreting random adjacent data as an LSDA. Fixes PR31317 This change is safe because cleanups can't contain exception handlers today. We do these things to maintain that invariant: - C++ destructors are naturally out-of-line - __finally blocks are outlined in clang - LLVM's inliner will not inline EH constructs into cleanups llvm-svn: 289101
* Prune unused libdeps.NAKAMURA Takumi2016-12-082-2/+2
| | | | llvm-svn: 289060
* [SelectionDAG] Add expansion and promotion of [US]MUL_LOHINicolai Haehnle2016-12-084-27/+179
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Most targets set the action for these nodes to Expand even though there isn't actually any code for them in ExpandNode. Instead, targets simply relied on the fact that no code generates these nodes as long as the nodes aren't legal or custom. However, generating these nodes can be useful e.g. for divide-by-constant in wider integer types. Expand of [US]MUL_LOHI will use MULH[US] when legal or custom, and a sequence of half-width multiplications otherwise. Promote uses a wider multiply. This patch intends to not change the generated code, but indirect effects are possible since expansions/promotions that were previously done in DAGCombine may now be done in LegalizeDAG. See D24822 for a change that actually uses the new expansion. Reviewers: spatel, bkramer, venkatra, efriedma, hfinkel, ast, nadav, tstellarAMD Subscribers: arsenm, jyknight, nemanjai, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24956 llvm-svn: 289050
* Move DwarfGenerator.cpp to unittestsDaniel Jasper2016-12-083-494/+0
| | | | | | | | | So far it creates a test helper and so it should be moved there. It also create a layering cycle between CodeGen and CodeGen/AsmPrinter, which should be avoided. Review: https://reviews.llvm.org/D27570 llvm-svn: 289044
* Wdocumentation fixSimon Pilgrim2016-12-081-1/+1
| | | | llvm-svn: 289038
* Revert "[CodeGen] Fix invalid DWARF info on Win64"Keno Fischer2016-12-084-5/+5
| | | | | | Appears to break on build bots. Reverting pending investigation. llvm-svn: 289014
* [CodeGen] Fix invalid DWARF info on Win64Keno Fischer2016-12-084-5/+5
| | | | | | | | | | | | | | | The relocations for `DIEEntry::EmitValue` were wrong for Win64 (emitting FK_Data_4 instead of FK_SecRel_4). This corrects that oversight so that the DWARF data is correct in Win64 COFF files. Fixes PR15393. Patch by Jameson Nash <jameson@juliacomputing.com> based on a patch by David Majnemer. Differential Revision: https://reviews.llvm.org/D21731 llvm-svn: 289013
* Make a DWARF generator so we can unit test DWARF APIs with gtest.Greg Clayton2016-12-087-130/+749
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only tests we have for the DWARF parser are the tests that use llvm-dwarfdump and expect output from textual dumps. More DWARF parser modification are coming in the next few weeks and I wanted to add tests that can verify that we can encode and decode all form types, as well as test some other basic DWARF APIs where we ask DIE objects for their children and siblings. DwarfGenerator.cpp was added in the lib/CodeGen directory. This file contains the code necessary to easily create DWARF for tests: dwarfgen::Generator DG; Triple Triple("x86_64--"); bool success = DG.init(Triple, Version); if (!success) return; dwarfgen::CompileUnit &CU = DG.addCompileUnit(); dwarfgen::DIE CUDie = CU.getUnitDIE(); CUDie.addAttribute(DW_AT_name, DW_FORM_strp, "/tmp/main.c"); CUDie.addAttribute(DW_AT_language, DW_FORM_data2, DW_LANG_C); dwarfgen::DIE SubprogramDie = CUDie.addChild(DW_TAG_subprogram); SubprogramDie.addAttribute(DW_AT_name, DW_FORM_strp, "main"); SubprogramDie.addAttribute(DW_AT_low_pc, DW_FORM_addr, 0x1000U); SubprogramDie.addAttribute(DW_AT_high_pc, DW_FORM_addr, 0x2000U); dwarfgen::DIE IntDie = CUDie.addChild(DW_TAG_base_type); IntDie.addAttribute(DW_AT_name, DW_FORM_strp, "int"); IntDie.addAttribute(DW_AT_encoding, DW_FORM_data1, DW_ATE_signed); IntDie.addAttribute(DW_AT_byte_size, DW_FORM_data1, 4); dwarfgen::DIE ArgcDie = SubprogramDie.addChild(DW_TAG_formal_parameter); ArgcDie.addAttribute(DW_AT_name, DW_FORM_strp, "argc"); // ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref4, IntDie); ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref_addr, IntDie); StringRef FileBytes = DG.generate(); MemoryBufferRef FileBuffer(FileBytes, "dwarf"); auto Obj = object::ObjectFile::createObjectFile(FileBuffer); EXPECT_TRUE((bool)Obj); DWARFContextInMemory DwarfContext(*Obj.get()); This code is backed by the AsmPrinter code that emits DWARF for the actual compiler. While adding unit tests it was discovered that DIEValue that used DIEEntry as their values had bugs where DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref8, and DW_FORM_ref_udata forms were not supported. These are all now supported. Added support for DW_FORM_string so we can emit inlined C strings. Centralized the code to unique abbreviations into a new DIEAbbrevSet class and made both the dwarfgen::Generator and the llvm::DwarfFile classes use the new class. Fixed comments in the llvm::DIE class so that the Offset is known to be the compile/type unit offset. DIEInteger now supports more DW_FORM values. There are also unit tests that cover: Encoding and decoding all form types and values Encoding and decoding all reference types (DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref4, DW_FORM_ref8, DW_FORM_ref_udata, DW_FORM_ref_addr) including cross compile unit references with that go forward one compile unit and backward on compile unit. Differential Revision: https://reviews.llvm.org/D27326 llvm-svn: 289010
* TargetPassConfig: Rename DisablePostRA -> DisablePostRASched; NFCMatthias Braun2016-12-081-3/+3
| | | | llvm-svn: 289003
* LivePhysReg: Use reference instead of pointer in init(); NFCMatthias Braun2016-12-084-8/+8
| | | | llvm-svn: 289002
* [InlineSpiller] Don't call TargetInstrInfo::foldMemoryOperand with an empty ↵Quentin Colombet2016-12-081-0/+5
| | | | | | | | list. Since r287792 if we try to do that we will hit an assert. llvm-svn: 289001
* GlobalISel: use correct builder for ConstantExprs.Tim Northover2016-12-071-32/+45
| | | | | | | | ConstantExpr instances were emitting code into the current block rather than the entry block. This meant they didn't necessarily dominate all uses, which is clearly wrong. llvm-svn: 288985
* GlobalISel: store the current MachineFunction as direct state. NFC.Tim Northover2016-12-071-45/+41
| | | | | | | Having to ask the MIRBuilder for the current function is a little awkward, and I'm intending to improve how that's threaded through anyway. llvm-svn: 288983
* GlobalISel: simplify MachineIRBuilder interface.Tim Northover2016-12-072-27/+21
| | | | | | | | | | | | MachineIRBuilder had weird before/after and beginning/end flags for the insert point. Unfortunately the non-default means that instructions will be inserted in reverse order which is almost never what anyone wants. Really, I think we just want (like IRBuilder has) the ability to insert at any C++ iterator-style point (i.e. before any instruction or before MBB.end()). So this fixes MIRBuilders to behave like IRBuilders in this respect. llvm-svn: 288980
* [SelectionDAG] Add knownbits support for vector demandedelts in ↵Simon Pilgrim2016-12-071-2/+4
| | | | | | SMAX/SMIN/UMAX/UMIN opcodes llvm-svn: 288926
* [SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodesSimon Pilgrim2016-12-071-0/+36
| | | | llvm-svn: 288916
* [SelectionDAG] Removed old knownbits TODO comment. NFCI.Simon Pilgrim2016-12-071-3/+0
| | | | | | EXTRACT_VECTOR_ELT does support demanded elts if the element index is known and in range. llvm-svn: 288913
* [CodeGen] Fix result type for SMULO/UMULO legalizationEli Friedman2016-12-061-0/+9
| | | | | | | | | | | | | | | On some platforms (like MSP430) the second element of the result structure for SMULO/UMULO may have a shorter type than the one returned by SetCC. We need to truncate it to the right type, or else some incorrect code may be generated later on. This fixes issue https://github.com/rust-lang/rust/issues/37829 Patch by Vadzim Dambrouski! Differential Revision: https://reviews.llvm.org/D27154 llvm-svn: 288857
* GlobalISel: correctly handle small args via memory.Tim Northover2016-12-061-1/+1
| | | | | | | We were rounding size in bits down rather than up, leading to 0-sized slots for i1 (assert!) and bugs for other types not byte-aligned. llvm-svn: 288848
* [DAGCombine] Add (sext_in_reg (zext x)) -> (sext x) combineSimon Pilgrim2016-12-061-0/+9
| | | | | | | | Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted. Differential Revision: https://reviews.llvm.org/D27461 llvm-svn: 288842
* GlobalISel: fall back gracefully when we hit unhandled legalizer default.Tim Northover2016-12-061-1/+3
| | | | llvm-svn: 288840
* [SelectionDAG] We can ignore knownbits from an undef shuffle vector index if ↵Simon Pilgrim2016-12-061-3/+3
| | | | | | we don't actually demand that element llvm-svn: 288839
* GlobalISel: handle G_SEQUENCE fallbacks gracefully.Tim Northover2016-12-061-0/+3
| | | | | | | | | | There were two problems: + AArch64 was reusing random data from its binary op tables, which is complete nonsense for G_SEQUENCE. + Even when AArch64 gave up and said it couldn't handle G_SEQUENCE, the generic code asserted. llvm-svn: 288836
* GlobalISel: allow G_SELECT instructions for pointers.Tim Northover2016-12-061-4/+5
| | | | llvm-svn: 288835
* GlobalISel: stop the legalizer from trying to handle oddly-sized types.Tim Northover2016-12-061-0/+5
| | | | | | | | It'll almost immediately fail because it always tries to half/double the size until it finds a legal one. Unfortunately, this triggers an assertion preventing the DAG fallback from being possible. llvm-svn: 288834
* Avoid repeated calls to Op.getOpcode(). NFCI.Simon Pilgrim2016-12-061-3/+4
| | | | llvm-svn: 288814
* Add missing parens in assert.Sam McCall2016-12-061-1/+1
| | | | | | | | | | Summary: Add missing parens in assert, which warn in GCC. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27448 llvm-svn: 288792
* GlobalISel: avoid looking too closely at PHIs when we bail.Tim Northover2016-12-051-9/+11
| | | | | | | | The function used to finish off PHIs by adding the relevant basic blocks can fail if we're aborting and still don't actually have the needed MachineBasicBlocks. So avoid trying in that case. llvm-svn: 288727
* GlobalISel: place constants correctly in the entry block.Tim Northover2016-12-051-1/+1
| | | | | | | | | When the entry block was empty after arg lowering, we were always placing constants at the end. This is probably hamrless while translating the same block, but horribly wrong once its terminator has been translated. So switch to inserting at the beginning. llvm-svn: 288720
* GlobalISel: handle pointer arguments that get assigned to the stack.Tim Northover2016-12-051-1/+4
| | | | llvm-svn: 288717
* GlobalISel: translate constants larger than 64 bits.Tim Northover2016-12-051-1/+1
| | | | llvm-svn: 288713
* GlobalISel: make G_CONSTANT take a ConstantInt rather than int64_t.Tim Northover2016-12-053-7/+21
| | | | | | | | This makes it more similar to the floating-point constant, and also allows for larger constants to be translated later. There's no real functional change in this patch though, just syntax updates. llvm-svn: 288712
* GlobalISel: improve translation fallback for constants.Tim Northover2016-12-051-1/+1
| | | | | | | | Returning 0 (NoReg) from getOrCreateVReg leads to unexpected situations later in the translation. It's better to return a valid (if undefined) register and let the rest of the instruction carry on as planned. llvm-svn: 288709
* [DIExpression] Introduce a dedicated DW_OP_LLVM_fragment operationAdrian Prantl2016-12-0513-140/+148
| | | | | | | | | | | | | | | | | | | | | | | | | | | | so we can stop using DW_OP_bit_piece with the wrong semantics. The entire back story can be found here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161114/405934.html The gist is that in LLVM we've been misinterpreting DW_OP_bit_piece's offset field to mean the offset into the source variable rather than the offset into the location at the top the DWARF expression stack. In order to be able to fix this in a subsequent patch, this patch introduces a dedicated DW_OP_LLVM_fragment operation with the semantics that we used to apply to DW_OP_bit_piece, which is what we actually need while inside of LLVM. This patch is complete with a bitcode upgrade for expressions using the old format. It does not yet fix the DWARF backend to use DW_OP_bit_piece correctly. Implementation note: We discussed several options for implementing this, including reserving a dedicated field in DIExpression for the fragment size and offset, but using an custom operator at the end of the expression works just fine and is more efficient because we then only pay for it when we need it. Differential Revision: https://reviews.llvm.org/D27361 rdar://problem/29335809 llvm-svn: 288683
* [TargetLowering] add special-case for demanded bits analysis of 'not'Sanjay Patel2016-12-051-5/+19
| | | | | | | | | | | | | | | | We treat bitwise 'not' as a special operation and try not to reduce its all-ones mask. Presumably, this is because a 'not' may be cheaper than a generic 'xor' or it may get folded into another logic op if the target has those. However, if we can remove a logic instruction by changing the xor's constant mask value, that should always be a win. Note that the IR version of SimplifyDemandedBits() does not treat 'not' as a special-case currently (although that's marked with a FIXME). So if you run this IR through -instcombine, you should get the same end result. I'm hoping to add a different backend transform that will expose this problem though, so I need to solve this first. Differential Revision: https://reviews.llvm.org/D27356 llvm-svn: 288676
* [GlobalISel] Extract handleAssignments out of AArch64CallLoweringDiana Picus2016-12-051-2/+38
| | | | | | | | | | | | This function seems target-independent so far: all the target-specific behaviour is isolated in the CCAssignFn and the ValueHandler (which we're also extracting into the generic CallLowering). The intention is to use this in the ARM backend. Differential Revision: https://reviews.llvm.org/D27045 llvm-svn: 288658
* DAG: Fold out out of bounds insert_vector_eltMatt Arsenault2016-12-031-0/+7
| | | | | | | getNode already prevents formation of out of bounds constant extract_vector_elts. Do the same for insert_vector_elt. llvm-svn: 288603
OpenPOWER on IntegriCloud