summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [SelectionDAG] Add support for EXTRACT_SUBVECTOR to ComputeNumSignBitsSimon Pilgrim2016-12-121-0/+2
| | | | | | Pre-commit as discussed on D27657 llvm-svn: 289425
* instr-combiner: sum up all latencies of the transformed instructionsSebastian Pop2016-12-111-2/+9
| | | | | | | | | | | | | | | | | | | | We have found that -- when the selected subarchitecture has a scheduling model and we are not optimizing for size -- the machine-instruction combiner uses a too-simple algorithm to compute the cost of one of the two alternatives [before and after running a combining pass on a section of code], and therefor it throws away the combination results too often. This fix has the potential to help any ISA with the potential to combine instructions and for which at least one subarchitecture has a scheduling model. As of now, this is only known to definitely affect AArch64 subarchitectures with a scheduling model. Regression tested on AMD64/GNU-Linux, new test case tested to fail on an unpatched compiler and pass on a patched compiler. Patch by Abe Skolnik and Sebastian Pop. llvm-svn: 289399
* [SelectionDAG] Add ability for computeKnownBits to peek through bitcasts ↵Simon Pilgrim2016-12-101-1/+23
| | | | | | | | from 'large element' scalar/vector to 'small element' vector. Extension to D27129 which already supported bitcasts from 'small element' vector to 'large element' scalar/vector types. llvm-svn: 289329
* Fix LLVM's use of DW_OP_bit_piece in DWARF expressions.Adrian Prantl2016-12-097-156/+129
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LLVM's use of DW_OP_bit_piece is incorrect and a based on a misunderstanding of the wording in the DWARF specification. The offset argument of DW_OP_bit_piece refers to the offset into the location that is on the top of the DWARF expression stack, and not an offset into the source variable. This has since also been clarified in the DWARF specification. This patch fixes all uses of DW_OP_bit_piece to emit the correct offset and simplifies the DwarfExpression class to semi-automaticaly emit empty DW_OP_pieces to adjust the offset of the source variable, thus simplifying the code using DwarfExpression. While this is an incompatible bugfix, in practice I don't expect this to be much of a problem since LLVM's old interpretation and the correct interpretation of DW_OP_bit_piece differ only when there are gaps in the fragmented locations of the described variables or if individual fragments are smaller than a byte. LLDB at least won't interpret locations with gaps in them because is has no way to present undefined bits in a variable, and there is a high probability that an old-form expression will be malformed when interpreted correctly, because the DW_OP_bit_piece offset will be outside of the location at the top of the stack. As a nice side-effect, this patch enables us to use a more efficient encoding for subregisters: In order to express a sub-register at a non-zero offset we now use a DW_OP_bit_piece instead of shifting the value into place manually. This patch also adds missing test coverage for code paths that weren't exercised before. <rdar://problem/29335809> Differential Revision: https://reviews.llvm.org/D27550 llvm-svn: 289266
* [DWARF] Suppress .loc directives from CFI instructionsPaul Robinson2016-12-091-2/+2
| | | | | | | | | Like DBG_VALUE, these emit nothing to the .text section, and sometimes have no source location specified. Just ignore them. Differential Revision: http://reviews.llvm.org/D27492 llvm-svn: 289256
* [SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes (REAPPLIED)Simon Pilgrim2016-12-091-0/+36
| | | | | | Reapplied with fix for PR31323 - X86 SSE2 vXi16 multiplies for illegal types were creating CONCAT_VECTORS nodes with vector inputs that might not total the number of elements in the result type. llvm-svn: 289232
* AMDGPU: Fix i128 mulMatt Arsenault2016-12-091-1/+1
| | | | llvm-svn: 289231
* Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵Nirav Dave2016-12-092-142/+276
| | | | | | | | UseAA is enabled." This reverts commit r289221 which appears to be triggering an assertion llvm-svn: 289226
* In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵Nirav Dave2016-12-092-276/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled. Retrying after fixing overly aggressive load-store forwarding optimization. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates *worse* code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores *CAN* be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289221
* Use SelectionDAG.getSplatBuildVector helper. NFCI.Simon Pilgrim2016-12-091-6/+5
| | | | llvm-svn: 289220
* [SelectionDAG] Use SelectionDAG.getBuildVector helper. NFCI.Simon Pilgrim2016-12-092-9/+6
| | | | | | Makes interception of BUILD_VECTOR creation easier for debugging. llvm-svn: 289218
* [SelectionDAG] Add additional checks to CONCAT_VECTORS creationSimon Pilgrim2016-12-091-0/+10
| | | | | | Part of the work for PR31323 - add extra asserts checking that the input vectors are of consistent type and result in the correct number of vector elements. llvm-svn: 289214
* Plug another leak in the DWARF unittests, DIEInlineStrings are never destroyed.Benjamin Kramer2016-12-091-1/+1
| | | | llvm-svn: 289208
* [SelectionDAG] Add partial BITCAST support to computeKnownBitsSimon Pilgrim2016-12-091-0/+44
| | | | | | | | | | Adds support for bitcasting a little endian 'small element' vector to 'large element' scalar/vector (e.g. v16i8 to v4i32 or v2i32 to i64), which is required for PR30845. We extract the knownbits for each 'small element' part and concatenate the results together. We can add support for big endian and 'large element' scalar/vector to 'small element' vector bitcasting once we have test cases for them. Differential Revision: https://reviews.llvm.org/D27129 llvm-svn: 289200
* Revert "[SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes"Daniel Jasper2016-12-091-36/+0
| | | | | | | | This reverts commit r288916 as it is currently causing a crasher in Halide. Reproducer on llvm.org/PR31323. While it might be that halide is generating invalid IR, llc shouldn't crash. llvm-svn: 289194
* GlobalISel: fall back gracefully for debug intrinsics.Tim Northover2016-12-081-0/+6
| | | | | | | Supporting them properly is a reasonably complex chunk of work, so to allow bot testing before then we should at least be able to fall back to DAG ISel. llvm-svn: 289150
* GlobalISel: factor overflow handling into separate function. NFC.Tim Northover2016-12-081-28/+38
| | | | llvm-svn: 289149
* Don't emit .seh_handler directives for any cleanup funcletsReid Kleckner2016-12-081-6/+6
| | | | | | | | | | | | | | | | We were falsely claiming that we had an LSDA for the relevant EH personality before this change, which could lead to the EH machinery interpreting random adjacent data as an LSDA. Fixes PR31317 This change is safe because cleanups can't contain exception handlers today. We do these things to maintain that invariant: - C++ destructors are naturally out-of-line - __finally blocks are outlined in clang - LLVM's inliner will not inline EH constructs into cleanups llvm-svn: 289101
* Prune unused libdeps.NAKAMURA Takumi2016-12-082-2/+2
| | | | llvm-svn: 289060
* [SelectionDAG] Add expansion and promotion of [US]MUL_LOHINicolai Haehnle2016-12-084-27/+179
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Most targets set the action for these nodes to Expand even though there isn't actually any code for them in ExpandNode. Instead, targets simply relied on the fact that no code generates these nodes as long as the nodes aren't legal or custom. However, generating these nodes can be useful e.g. for divide-by-constant in wider integer types. Expand of [US]MUL_LOHI will use MULH[US] when legal or custom, and a sequence of half-width multiplications otherwise. Promote uses a wider multiply. This patch intends to not change the generated code, but indirect effects are possible since expansions/promotions that were previously done in DAGCombine may now be done in LegalizeDAG. See D24822 for a change that actually uses the new expansion. Reviewers: spatel, bkramer, venkatra, efriedma, hfinkel, ast, nadav, tstellarAMD Subscribers: arsenm, jyknight, nemanjai, wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D24956 llvm-svn: 289050
* Move DwarfGenerator.cpp to unittestsDaniel Jasper2016-12-083-494/+0
| | | | | | | | | So far it creates a test helper and so it should be moved there. It also create a layering cycle between CodeGen and CodeGen/AsmPrinter, which should be avoided. Review: https://reviews.llvm.org/D27570 llvm-svn: 289044
* Wdocumentation fixSimon Pilgrim2016-12-081-1/+1
| | | | llvm-svn: 289038
* Revert "[CodeGen] Fix invalid DWARF info on Win64"Keno Fischer2016-12-084-5/+5
| | | | | | Appears to break on build bots. Reverting pending investigation. llvm-svn: 289014
* [CodeGen] Fix invalid DWARF info on Win64Keno Fischer2016-12-084-5/+5
| | | | | | | | | | | | | | | The relocations for `DIEEntry::EmitValue` were wrong for Win64 (emitting FK_Data_4 instead of FK_SecRel_4). This corrects that oversight so that the DWARF data is correct in Win64 COFF files. Fixes PR15393. Patch by Jameson Nash <jameson@juliacomputing.com> based on a patch by David Majnemer. Differential Revision: https://reviews.llvm.org/D21731 llvm-svn: 289013
* Make a DWARF generator so we can unit test DWARF APIs with gtest.Greg Clayton2016-12-087-130/+749
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only tests we have for the DWARF parser are the tests that use llvm-dwarfdump and expect output from textual dumps. More DWARF parser modification are coming in the next few weeks and I wanted to add tests that can verify that we can encode and decode all form types, as well as test some other basic DWARF APIs where we ask DIE objects for their children and siblings. DwarfGenerator.cpp was added in the lib/CodeGen directory. This file contains the code necessary to easily create DWARF for tests: dwarfgen::Generator DG; Triple Triple("x86_64--"); bool success = DG.init(Triple, Version); if (!success) return; dwarfgen::CompileUnit &CU = DG.addCompileUnit(); dwarfgen::DIE CUDie = CU.getUnitDIE(); CUDie.addAttribute(DW_AT_name, DW_FORM_strp, "/tmp/main.c"); CUDie.addAttribute(DW_AT_language, DW_FORM_data2, DW_LANG_C); dwarfgen::DIE SubprogramDie = CUDie.addChild(DW_TAG_subprogram); SubprogramDie.addAttribute(DW_AT_name, DW_FORM_strp, "main"); SubprogramDie.addAttribute(DW_AT_low_pc, DW_FORM_addr, 0x1000U); SubprogramDie.addAttribute(DW_AT_high_pc, DW_FORM_addr, 0x2000U); dwarfgen::DIE IntDie = CUDie.addChild(DW_TAG_base_type); IntDie.addAttribute(DW_AT_name, DW_FORM_strp, "int"); IntDie.addAttribute(DW_AT_encoding, DW_FORM_data1, DW_ATE_signed); IntDie.addAttribute(DW_AT_byte_size, DW_FORM_data1, 4); dwarfgen::DIE ArgcDie = SubprogramDie.addChild(DW_TAG_formal_parameter); ArgcDie.addAttribute(DW_AT_name, DW_FORM_strp, "argc"); // ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref4, IntDie); ArgcDie.addAttribute(DW_AT_type, DW_FORM_ref_addr, IntDie); StringRef FileBytes = DG.generate(); MemoryBufferRef FileBuffer(FileBytes, "dwarf"); auto Obj = object::ObjectFile::createObjectFile(FileBuffer); EXPECT_TRUE((bool)Obj); DWARFContextInMemory DwarfContext(*Obj.get()); This code is backed by the AsmPrinter code that emits DWARF for the actual compiler. While adding unit tests it was discovered that DIEValue that used DIEEntry as their values had bugs where DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref8, and DW_FORM_ref_udata forms were not supported. These are all now supported. Added support for DW_FORM_string so we can emit inlined C strings. Centralized the code to unique abbreviations into a new DIEAbbrevSet class and made both the dwarfgen::Generator and the llvm::DwarfFile classes use the new class. Fixed comments in the llvm::DIE class so that the Offset is known to be the compile/type unit offset. DIEInteger now supports more DW_FORM values. There are also unit tests that cover: Encoding and decoding all form types and values Encoding and decoding all reference types (DW_FORM_ref1, DW_FORM_ref2, DW_FORM_ref4, DW_FORM_ref8, DW_FORM_ref_udata, DW_FORM_ref_addr) including cross compile unit references with that go forward one compile unit and backward on compile unit. Differential Revision: https://reviews.llvm.org/D27326 llvm-svn: 289010
* TargetPassConfig: Rename DisablePostRA -> DisablePostRASched; NFCMatthias Braun2016-12-081-3/+3
| | | | llvm-svn: 289003
* LivePhysReg: Use reference instead of pointer in init(); NFCMatthias Braun2016-12-084-8/+8
| | | | llvm-svn: 289002
* [InlineSpiller] Don't call TargetInstrInfo::foldMemoryOperand with an empty ↵Quentin Colombet2016-12-081-0/+5
| | | | | | | | list. Since r287792 if we try to do that we will hit an assert. llvm-svn: 289001
* GlobalISel: use correct builder for ConstantExprs.Tim Northover2016-12-071-32/+45
| | | | | | | | ConstantExpr instances were emitting code into the current block rather than the entry block. This meant they didn't necessarily dominate all uses, which is clearly wrong. llvm-svn: 288985
* GlobalISel: store the current MachineFunction as direct state. NFC.Tim Northover2016-12-071-45/+41
| | | | | | | Having to ask the MIRBuilder for the current function is a little awkward, and I'm intending to improve how that's threaded through anyway. llvm-svn: 288983
* GlobalISel: simplify MachineIRBuilder interface.Tim Northover2016-12-072-27/+21
| | | | | | | | | | | | MachineIRBuilder had weird before/after and beginning/end flags for the insert point. Unfortunately the non-default means that instructions will be inserted in reverse order which is almost never what anyone wants. Really, I think we just want (like IRBuilder has) the ability to insert at any C++ iterator-style point (i.e. before any instruction or before MBB.end()). So this fixes MIRBuilders to behave like IRBuilders in this respect. llvm-svn: 288980
* [SelectionDAG] Add knownbits support for vector demandedelts in ↵Simon Pilgrim2016-12-071-2/+4
| | | | | | SMAX/SMIN/UMAX/UMIN opcodes llvm-svn: 288926
* [SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodesSimon Pilgrim2016-12-071-0/+36
| | | | llvm-svn: 288916
* [SelectionDAG] Removed old knownbits TODO comment. NFCI.Simon Pilgrim2016-12-071-3/+0
| | | | | | EXTRACT_VECTOR_ELT does support demanded elts if the element index is known and in range. llvm-svn: 288913
* [CodeGen] Fix result type for SMULO/UMULO legalizationEli Friedman2016-12-061-0/+9
| | | | | | | | | | | | | | | On some platforms (like MSP430) the second element of the result structure for SMULO/UMULO may have a shorter type than the one returned by SetCC. We need to truncate it to the right type, or else some incorrect code may be generated later on. This fixes issue https://github.com/rust-lang/rust/issues/37829 Patch by Vadzim Dambrouski! Differential Revision: https://reviews.llvm.org/D27154 llvm-svn: 288857
* GlobalISel: correctly handle small args via memory.Tim Northover2016-12-061-1/+1
| | | | | | | We were rounding size in bits down rather than up, leading to 0-sized slots for i1 (assert!) and bugs for other types not byte-aligned. llvm-svn: 288848
* [DAGCombine] Add (sext_in_reg (zext x)) -> (sext x) combineSimon Pilgrim2016-12-061-0/+9
| | | | | | | | Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted. Differential Revision: https://reviews.llvm.org/D27461 llvm-svn: 288842
* GlobalISel: fall back gracefully when we hit unhandled legalizer default.Tim Northover2016-12-061-1/+3
| | | | llvm-svn: 288840
* [SelectionDAG] We can ignore knownbits from an undef shuffle vector index if ↵Simon Pilgrim2016-12-061-3/+3
| | | | | | we don't actually demand that element llvm-svn: 288839
* GlobalISel: handle G_SEQUENCE fallbacks gracefully.Tim Northover2016-12-061-0/+3
| | | | | | | | | | There were two problems: + AArch64 was reusing random data from its binary op tables, which is complete nonsense for G_SEQUENCE. + Even when AArch64 gave up and said it couldn't handle G_SEQUENCE, the generic code asserted. llvm-svn: 288836
* GlobalISel: allow G_SELECT instructions for pointers.Tim Northover2016-12-061-4/+5
| | | | llvm-svn: 288835
* GlobalISel: stop the legalizer from trying to handle oddly-sized types.Tim Northover2016-12-061-0/+5
| | | | | | | | It'll almost immediately fail because it always tries to half/double the size until it finds a legal one. Unfortunately, this triggers an assertion preventing the DAG fallback from being possible. llvm-svn: 288834
* Avoid repeated calls to Op.getOpcode(). NFCI.Simon Pilgrim2016-12-061-3/+4
| | | | llvm-svn: 288814
* Add missing parens in assert.Sam McCall2016-12-061-1/+1
| | | | | | | | | | Summary: Add missing parens in assert, which warn in GCC. Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27448 llvm-svn: 288792
* GlobalISel: avoid looking too closely at PHIs when we bail.Tim Northover2016-12-051-9/+11
| | | | | | | | The function used to finish off PHIs by adding the relevant basic blocks can fail if we're aborting and still don't actually have the needed MachineBasicBlocks. So avoid trying in that case. llvm-svn: 288727
* GlobalISel: place constants correctly in the entry block.Tim Northover2016-12-051-1/+1
| | | | | | | | | When the entry block was empty after arg lowering, we were always placing constants at the end. This is probably hamrless while translating the same block, but horribly wrong once its terminator has been translated. So switch to inserting at the beginning. llvm-svn: 288720
* GlobalISel: handle pointer arguments that get assigned to the stack.Tim Northover2016-12-051-1/+4
| | | | llvm-svn: 288717
* GlobalISel: translate constants larger than 64 bits.Tim Northover2016-12-051-1/+1
| | | | llvm-svn: 288713
* GlobalISel: make G_CONSTANT take a ConstantInt rather than int64_t.Tim Northover2016-12-053-7/+21
| | | | | | | | This makes it more similar to the floating-point constant, and also allows for larger constants to be translated later. There's no real functional change in this patch though, just syntax updates. llvm-svn: 288712
* GlobalISel: improve translation fallback for constants.Tim Northover2016-12-051-1/+1
| | | | | | | | Returning 0 (NoReg) from getOrCreateVReg leads to unexpected situations later in the translation. It's better to return a valid (if undefined) register and let the rest of the instruction carry on as planned. llvm-svn: 288709
OpenPOWER on IntegriCloud