summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Omit DW_TAG_subprograms for subprograms without inlined subroutines when ↵David Blaikie2014-09-192-24/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | producing -gmlt data To reduce the size of -gmlt data, skip the subprograms without any inlined subroutines. Since we've now got the ability to make these determinations in the backend (funnily enough - we added the flag so we wouldn't produce ranges under -gmlt, but with this change we use the flag, but go back to producing ranges under -gmlt). Instead, just produce CU ranges to inform the consumer which parts of the code are described by this CU's line table. Tools could inspect the line table directly to compute the range, but the CU ranges only seem to be about 0.5% of object/executable size, so I'm not too worried about teaching llvm-symbolizer that trick just yet - it's certainly a possible piece of future work. Update an llvm-symbolizer test just to demonstrate that this schema is acceptable there (if it wasn't, the compiler-rt tests would catch this, but good to have an in-llvm-tree test for llvm-symbolizer's behavior here) Building the clang binary with -gmlt with this patch reduces the total size of object files by 5.1% (5.56% without ranges) without compression and the executable by 4.37% (4.75% without ranges). llvm-svn: 218129
* Change DwarfCompileUnit::createGlobalVariable to getOrCreateGlobalVariable.Frederic Riss2014-09-193-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This will allow to request the creation of a forward delacred variable at is point of use (for imported declarations, this will be DwarfDebug::constructImportedEntityDIE) rather than having to put the forward decl in a retention list. Note that getOrCreateGlobalVariable returns the actual definition DIE when the routine creates a declaration and a definition DIE. If you agree this is the right behavior, then I'll have a followup patch that registers the definition in the DIE map instead of the declaration as it is today (this 'breaks' only one test, where we test that the imported entity is the declaration). I'm not sure what's best here, but it's easy enough for a consumer to follow the DW_AT_specification link to get to the declaration, whereas it takes more work to find the actual definition from a declaration DIE. Reviewers: echristo, dblaikie, aprantl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5381 llvm-svn: 218126
* Optionally enable more-aggressive FMA formation in DAGCombineHal Finkel2014-09-191-5/+10
| | | | | | | | | | | | | | | | | The heuristic used by DAGCombine to form FMAs checks that the FMUL has only one use, but this is overly-conservative on some systems. Specifically, if the FMA and the FADD have the same latency (and the FMA does not compete for resources with the FMUL any more than the FADD does), there is no need for the restriction, and furthermore, forming the FMA leaving the FMUL can still allow for higher overall throughput and decreased critical-path length. Here we add a new TLI callback, enableAggressiveFMAFusion, false by default, to elide the hasOneUse check. This is enabled for PowerPC by default, as most PowerPC systems will benefit. Patch by Olivier Sallenave, thanks! llvm-svn: 218120
* Optimize sext/zext insertion algorithm in back-end.Jiangning Liu2014-09-193-8/+61
| | | | | | | | With this optimization, we will not always insert zext for values crossing basic blocks, but insert sext if the users of a value crossing basic block has preference of sign predicate. llvm-svn: 218101
* Omit DW_AT_frame_base under -gmlt for sizeDavid Blaikie2014-09-191-3/+7
| | | | llvm-svn: 218100
* Describe the -gmlt optimization committed in the previous revision.David Blaikie2014-09-191-0/+1
| | | | llvm-svn: 218099
* Omit all the extra static attributes on subprograms in -gmltDavid Blaikie2014-09-191-0/+3
| | | | | | | | This omission will be done in a fancier manner once we're dealing with "put gmlt in the skeleton CUs under fission" - it'll have to be conditional on the kind of CU we're emitting into (skeleton or gmlt). llvm-svn: 218098
* Fix an it's vs. its typo.Hans Wennborg2014-09-191-1/+1
| | | | llvm-svn: 218093
* Revert part of r218041.Frederic Riss2014-09-181-0/+3
| | | | | | | | The patch moved some logic around in an attempt to generate potentially more DW_AT_declaration attributes. The patch was flawed though and it stopped generating the attribute in some cases. llvm-svn: 218060
* Always emit DW_AT_declaration attribute when the variable isn't a definition.Frederic Riss2014-09-181-3/+3
| | | | | | | | | | | | | | | Summary: This doesn't show up today as we don't emit decalration only variables. This will be tested when the followup patches implementing import of forward declared entities lands in clang. Reviewers: echristo, dblaikie, aprantl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5382 llvm-svn: 218041
* Add a new pass FunctionTargetTransformInfo. This pass serves as aEric Christopher2014-09-181-3/+3
| | | | | | | | | | | shim between the TargetTransformInfo immutable pass and the Subtarget via the TargetMachine and Function. Migrate a single call from BasicTargetTransformInfo as an example and provide shims where TargetMachine begins taking a Function to determine the subtarget. No functional change. llvm-svn: 218004
* [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPassRobin Morisset2014-09-171-52/+133
| | | | | | | | | | | | This required a new hook called hasLoadLinkedStoreConditional to know whether to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to CmpXchg (X86). Apart from that, the new code in AtomicExpandPass is mostly moved from X86AtomicExpandPass. The main result of this patch is to get rid of that pass, which had lots of code duplicated with AtomicExpandPass. llvm-svn: 217928
* [CodeGenPrepare][AddressingModeMatcher] The promotion mechanism was expectingQuentin Colombet2014-09-161-45/+55
| | | | | | instructions when truncate, sext, or zext were created. Fix that. llvm-svn: 217926
* Add back a fallback case for targets that do not or cannot implement ↵Owen Anderson2014-09-161-1/+5
| | | | | | getNoopForMachoTarget(). llvm-svn: 217899
* Fix BasicTTI::getCmpSelInstrCost to deal with illegal vector typesHal Finkel2014-09-161-1/+2
| | | | | | | | | | | | | | | | | | | The default implementation of getCmpSelInstrCost, which provides the cost of icmp/fcmp/select instructions, did not deal sensibly with illegal vector types that were scalarized. We'd ask for the legalization cost of the vector type, which would return something like (4, f64) given an input of <4 x double>, and we'd then check the TLI status of the ISD opcode on that scalar type. This would result in querying (ISD::VSELECT, f64), for example. Amusingly enough, ISD::VSELECT on scalar types is marked as Legal by default (as with most other operations), and most backends never change this because VSELECT is never generated on scalars. However, seeing the resulting operation as Legal, we'd neglect to add the scalarization cost before returning. The result is that we'd grossly under-estimate the cost of cmps/selects on illegal vector types. Now, if type legalization clearly results in scalarization, we skip the early return and add the scalarization cost. llvm-svn: 217859
* DebugInfo: Add comment describing the need to disable address pool usage in ↵David Blaikie2014-09-151-0/+5
| | | | | | | | skeleton units. Post commit review from Eric Christopher. llvm-svn: 217842
* Replace repeated null checks with an assert. NFC.Sanjay Patel2014-09-151-18/+14
| | | | | | | Without a vector to hold the created ops, these functions don't have any use. llvm-svn: 217831
* [FastISel] Move optimizeCmpPredicate to FastISel base class. NFC.Juergen Ributzka2014-09-151-0/+40
| | | | | | Make the optimizeCmpPredicate function available to all targets. llvm-svn: 217822
* Replace dead links to "Hacker's Delight" with general references. NFC.Sanjay Patel2014-09-153-10/+10
| | | | llvm-svn: 217814
* Fix a lot of confusion around inserting nops on empty functions.Rafael Espindola2014-09-152-15/+7
| | | | | | | | | | | | | | | | On MachO, and MachO only, we cannot have a truly empty function since that breaks the linker logic for atomizing the section. When we are emitting a frame pointer, the presence of an unreachable will create a cfi instruction pointing past the last instruction. This is perfectly fine. The FDE information encodes the pc range it applies to. If some tool cannot handle this, we should explicitly say which bug we are working around and only work around it when it is actually relevant (not for ELF for example). Given the unreachable we could omit the .cfi_def_cfa_register, but then again, we could also omit the entire function prologue if we wanted to. llvm-svn: 217801
* [CodeGenPrepare][AddressingModeMatcher] Fix a think-o for the sext(zext) -> ↵Quentin Colombet2014-09-151-7/+9
| | | | | | | | | | | zext promotion introduced in r217629. We were returning the old sext instead of the new zext as the promoted instruction! Thanks Joerg Sonnenberger for the test case. llvm-svn: 217800
* In DwarfEHPrepare, after all passes are run, RewindFunction may be a danglingYaron Keren2014-09-141-0/+5
| | | | | | | pointer to a dead function. To make sure it's valid, doFinalization nullptrs RewindFunction just like the constructor and so it will be found on next run. llvm-svn: 217737
* Allow targets to custom legalize vector insertion and extraction.Owen Anderson2014-09-121-0/+8
| | | | llvm-svn: 217711
* Remove an unnecessary restriction. MIsNeedChainEdge() should be checked ↵Owen Anderson2014-09-121-1/+1
| | | | | | | | | even when scheduler AliasAnalysis is not enabled. A good chunk of the MIsNeedChainEdge() is logic that is valid and should be applied even for targets that are not using for alias analysis. llvm-svn: 217706
* Legalizer: Use the scalar bit width when promoting bit counting instrs onBenjamin Kramer2014-09-121-5/+6
| | | | | | | | | vectors. e.g. when promoting ctlz from <2 x i32> to <2 x i64> we have to fixup the result by 32 bits, not 64. PR20917. llvm-svn: 217671
* [CodeGenPrepare] Teach the addressing mode matcher how to promote zext.Quentin Colombet2014-09-111-13/+56
| | | | | | I.e., teach it about 'sext (zext a to ty) to ty2' => zext a to ty2. llvm-svn: 217629
* Remove the unused string section symbol parameter from DwarfFile::emitStringsDavid Blaikie2014-09-119-61/+43
| | | | | | | | | | | | | | | | | | | And since it /looked/ like the DwarfStrSectionSym was unused, I tried removing it - but then it turned out that DwarfStringPool was reconstructing the same label (and expecting it to have already been emitted) and uses that. So I kept it around, but wanted to pass it in to users - since it seemed a bit silly for DwarfStringPool to have it passed in and returned but itself have no use for it. The only two users don't handle strings in both .dwo and .o files so they only ever need the one symbol - no need to keep it (and have an unused symbol) in the DwarfStringPool used for fission/.dwo. Refactor a bunch of accelerator table usage to remove duplication so I didn't have to touch 4-5 callers. llvm-svn: 217628
* Add DAG combine for shl + add of constants.Matt Arsenault2014-09-111-32/+12
| | | | | | | | | | | | | | Do (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) This is already done for multiplies, but since multiplies by powers of two are turned into shifts, we also need to handle it here. This might want checks for isLegalAddImmediate to avoid transforming an add of a legal immediate with one that isn't. llvm-svn: 217610
* Combine fmul vector FP constants when unsafe math is allowed.Sanjay Patel2014-09-111-6/+22
| | | | | | | | | | | | | | | | | | | This is an extension of the change made with r215820: http://llvm.org/viewvc/llvm-project?view=revision&revision=215820 That patch allowed combining of splatted vector FP constants that are multiplied. This patch allows combining non-uniform vector FP constants too by relaxing the check on the type of vector. Also, canonicalize a vector fmul in the same way that we already do for scalars - if only one operand of the fmul is a constant, make it operand 1. Otherwise, we miss potential folds. This fold is also done by -instcombine, but it's possible that extra fmuls may have been generated during lowering. Differential Revision: http://reviews.llvm.org/D5254 llvm-svn: 217599
* Build correct vector filled with undef nodesDavid Xu2014-09-111-4/+20
| | | | llvm-svn: 217570
* Cleanup: Use the appropriate API for accessing the DIVariable of aAdrian Prantl2014-09-101-1/+1
| | | | | | DBG_VALUE intrinsic. llvm-svn: 217533
* Rename getMaximumUnrollFactor -> getMaxInterleaveFactor; also rename option ↵Sanjay Patel2014-09-101-2/+2
| | | | | | | | | | | names controlling this variable. "Unroll" is not the appropriate name for this variable. Clang already uses the term "interleave" in pragmas and metadata for this. Differential Revision: http://reviews.llvm.org/D5066 llvm-svn: 217528
* [asan-assembly-instrumentation] Added CFI directives to the generated ↵Yuri Gorshenin2014-09-101-0/+5
| | | | | | | | | | | | | | instrumentation code. Summary: [asan-assembly-instrumentation] Added CFI directives to the generated instrumentation code. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5189 llvm-svn: 217482
* Sink PrevCU updating into DwarfUnit::addRange to ensure consistencyDavid Blaikie2014-09-094-6/+8
| | | | | | | | | So that the two operations in DwarfDebug couldn't get separated (because I accidentally separated them in some work in progress), put them together. While we're here, move DwarfUnit::addRange to DwarfCompileUnit, since it's not relevant to type units. llvm-svn: 217468
* Remove DwarfDebug::PrevSection, PrevCU is sufficient for handling address ↵David Blaikie2014-09-093-15/+3
| | | | | | | | | | | | | range holes. PrevSection/PrevCU are used to detect holes in the address range of a CU to ensure the DW_AT_ranges does not include those holes. When we see a function with no debug info, though it may be in the same range as the prior and subsequent functions, there should be a gap in the CU's ranges. By setting PrevCU to null in that case, the range would not be extended to cover the gap. llvm-svn: 217466
* [MachineSinking] Conservatively clear kill flags after coalescing.Patrik Hagglund2014-09-091-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This solves the problem of having a kill flag inside a loop with a definition of the register prior to the loop: %vreg368<def> ... Inside loop: %vreg520<def> = COPY %vreg368 %vreg568<def,tied1> = add %vreg341<tied0>, %vreg520<kill> => was coalesced into => %vreg568<def,tied1> = add %vreg341<tied0>, %vreg368<kill> MachineVerifier then complained: *** Bad machine code: Virtual register killed in block, but needed live out. *** The kill flag for %vreg368 is incorrect, and is cleared by this patch. This is similar to the clearing done at the end of MachineSinking::SinkInstruction(). Patch provided by Jonas Paulsson. Reviewed by Quentin Colombet and Juergen Ributzka. llvm-svn: 217427
* Fast-ISel: Remove dead code after falling back from selecting call ↵Hans Wennborg2014-09-081-15/+10
| | | | | | | | | | | | | | | | | instructions (PR20863) Previously, fast-isel would not clean up after failing to select a call instruction, because it would have called flushLocalValueMap() which moves the insertion point, making SavedInsertPt in selectInstruction() invalid. Fixing this by making SavedInsertPt a member variable, and having flushLocalValueMap() update it. This removes some redundant code at -O0, and more importantly fixes PR20863. Differential Revision: http://reviews.llvm.org/D5249 llvm-svn: 217401
* Group unsafe fmul math folds together for easier reading. No functional change.Sanjay Patel2014-09-081-6/+10
| | | | llvm-svn: 217399
* Fix the FIXME that was just added in r217390 - remove a bunch of redundant ↵Sanjay Patel2014-09-081-43/+2
| | | | | | | | fold permutations. The testcases for these folds already exist in test/CodeGen/X86/fp-fast.ll. llvm-svn: 217393
* group unsafe math folds together for easier readingSanjay Patel2014-09-081-150/+142
| | | | | | Also added a FIXME regarding redundant folds for non-canonicalized constants. llvm-svn: 217390
* [AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph.Chad Rosier2014-09-081-0/+9
| | | | | | | Patch by Sanjin Sijaric <ssijaric@codeaurora.org>! Phabricator Review: http://reviews.llvm.org/D5103 llvm-svn: 217371
* DebugInfo: Do not use DW_FORM_GNU_addr_index in skeleton CUs, GDB 7.8 errors ↵David Blaikie2014-09-071-1/+1
| | | | | | | | | | on this. It's probably not a huge deal to not do this - if we could, maybe the address could be reused by a subprogram low_pc and avoid an extra relocation, but it's just one per CU at best. llvm-svn: 217338
* Allow vector fsub ops with constants to get the same optimizations as scalars.Sanjay Patel2014-09-051-2/+2
| | | | | | | | This problem is bigger than just fsub, but this is the minimum fix to solve fneg for PR20556 ( http://llvm.org/bugs/show_bug.cgi?id=20556 ), and we solve zero subtraction with the same change. llvm-svn: 217286
* clean up; NFCSanjay Patel2014-09-051-2/+2
| | | | llvm-svn: 217278
* Revert "Disable the fix for pr20793 because of a gnu ld bug."Rafael Espindola2014-09-051-11/+0
| | | | | | | | | | This reverts commit r217211. Both the bfd ld and gold outputs were valid. They were using a Rela relocation, so the value present in the relocated location was not used, which caused me to misread the output. llvm-svn: 217264
* Set the parent pointer of cloned DBG_VALUE instructions correctly.Adrian Prantl2014-09-051-1/+1
| | | | | | | | | | | | | | | | | | Fixes PR20523. When spilling variables onto the stack, spillVirtReg() is setting the parent pointer of the cloned DBG_VALUE intrinsic for the stack location to the parent pointer of the original intrinsic. MachineInstr parent pointers should however always point to the parent basic block. MBB is shadowing the MBB member variable. The instruction still ends up being inserted into the right basic block, because it's inserted after MI which serves as the iterator. I failed at constructing a reliable testcase for this, see http://llvm.org/bugs/show_bug.cgi?id=20523 for a large testcases. llvm-svn: 217260
* Disable the fix for pr20793 because of a gnu ld bug.Rafael Espindola2014-09-051-0/+11
| | | | llvm-svn: 217211
* Refactor to avoid code duplication. NFC.Rafael Espindola2014-09-051-34/+25
| | | | llvm-svn: 217207
* Fix pr20793.Rafael Espindola2014-09-041-24/+48
| | | | | | With this patch the third field of llvm.global_ctors is also used on ELF. llvm-svn: 217202
* MC Win64: Put unwind info for COMDAT code into the same COMDAT groupReid Kleckner2014-09-041-19/+4
| | | | | | | | | | | | | | | | | Summary: This fixes a long standing issue where we would emit many little .text sections and only one .pdata and .xdata section. Now we generate one .pdata / .xdata pair per .text section and associate them correctly. Fixes PR19667. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5181 llvm-svn: 217176
OpenPOWER on IntegriCloud