path: root/llvm/lib/CodeGen
Commit message Author Age Files Lines
* [MC] Create unique .pdata sections for every .text sectionReid Kleckner2016-05-022-14/+16
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This adds a unique ID to the COFF section uniquing map, similar to the one we have for ELF. The unique id is not currently exposed via the assembler because we don't have a use case for it yet. Users generally create .pdata with the .seh_* family of directives, and the assembler internally needs to produce .pdata and .xdata sections corresponding to the code section. The association between .text sections and the assembler-created .xdata and .pdata sections is maintained as an ID field of MCSectionCOFF. The CFI-related sections are created with the given unique ID, so if more code is added to the same text section, we can find and reuse the CFI sections that were already created. Reviewers: majnemer, rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19376 llvm-svn: 268331
* [MachineBlockPlacement] Let the target optimize the branches at the end.Quentin Colombet2016-05-021-0/+13
| | | | | | | | | | | | | | | After the layout of the basic blocks is set, the target may be able to get rid of unconditional branches to fallthrough blocks that the generic code does not catch. This happens any time TargetInstrInfo::AnalyzeBranch is not able to analyze all the branches involved in the terminator sequence, while still understanding a few of them. In such a situation, AnalyzeBranch can directly modify the branches if it has been instructed to do so. This patch takes advantage of that. llvm-svn: 268328
* [X86] Model FAULTING_LOAD_OP as a terminator and branch.Quentin Colombet2016-05-021-13/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | This operation may branch to the handler block and we do not want it to happen anywhere within the basic block. Moreover, by marking it "terminator and branch" the machine verifier does not wrongly assume (because of AnalyzeBranch not knowing better) the branch is analyzable. Indeed, the target was seeing only the unconditional branch and not the faulting load op and thought it was a simple unconditional block. The machine verifier was complaining because of that and, moreover, other optimizations could have performed wrong transformations! In the process, simplify the representation of the handler block in the faulting load op. Now, we directly reference the handler block instead of using a label. This has the benefits of: 1. MC knows how to issue a label for a BB, so leave that to it. 2. Accessing the target BB from its label is painful, whereas it is direct from a MBB operand. Note: The 2-byte offset in implicit-null-check.ll comes from the fact that the unconditional jumps are not removed anymore, as the whole terminator sequence is not analyzable anymore. Will fix it in a subsequent commit. llvm-svn: 268327
* DebugInfo: Avoid propagating incorrect debug locations in SelectionDAG via CSE.Wolfgang Pieb2016-05-021-31/+37
| | | | | | | | | | | | | | | | | | Summary: When SelectionDAG performs CSE it is possible that the context's source location is different from that of the selected node. This can lead to incorrect line number records. We update the debug location to the one that occurs earlier in the instruction sequence. This fixes PR21006. Reviewers: echristo, sdmitrouk Subscribers: jevinskie, asl, llvm-commits Differential Revision: http://reviews.llvm.org/D12094 llvm-svn: 268323
* ScheduleDAGInstrs.cpp: Don't peel the iterator when it points to the end. This ↵NAKAMURA Takumi2016-05-021-1/+1
| | | | | | will fix the crash in r268143. llvm-svn: 268257
* Cleanup comments. NFC.Chad Rosier2016-05-021-7/+9
| | | | llvm-svn: 268233
* Fix grammar and correct comment - the debug information wasn't incorrect, ↵Eric Christopher2016-05-021-2/+2
| | | | | | rather suboptimal. llvm-svn: 268211
* [CodeGen] Add OPC_MoveChild0-OPC_MoveChild7 opcodes to isel matching tables ↵Craig Topper2016-05-021-0/+12
| | | | | | to optimize table size. Shaves about 12K off the X86 matcher table. llvm-svn: 268209
* getelementptr instruction, support index vector of EVT.Igor Breger2016-05-011-1/+2
| | | | | | Differential Revision: http://reviews.llvm.org/D19775 llvm-svn: 268195
* CodeGen: convert to range based loopsSaleem Abdulrasool2016-04-301-36/+20
| | | | | | | Convert to using some range based loops, avoid unnecessary variables for unchecked casts. NFC. llvm-svn: 268165
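As an illustration of the kind of rewrite this commit describes, here is a minimal sketch. The `Operand` type and both function names are hypothetical and are not taken from the commit; only the loop shape is the point.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an element type; not from the LLVM sources.
struct Operand {
  int Reg;
};

// Before: explicit iterator pair with extra loop variables.
int sumRegsIteratorStyle(const std::vector<Operand> &Ops) {
  int Sum = 0;
  for (std::vector<Operand>::const_iterator I = Ops.begin(), E = Ops.end();
       I != E; ++I)
    Sum += I->Reg;
  return Sum;
}

// After: range-based loop with the same behaviour and fewer variables.
int sumRegsRangeStyle(const std::vector<Operand> &Ops) {
  int Sum = 0;
  for (const Operand &Op : Ops)
    Sum += Op.Reg;
  return Sum;
}
```

Both functions visit the elements in the same order, which is why such a change can be marked NFC (no functional change).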
* Reverting 268054 & 268063 as they caused PR27579.Amjad Aboud2016-04-306-201/+52
| | | | llvm-svn: 268150
* [MBP] Use Function::optForSize() instead of checking OptimizeForSize directly.Haicheng Wu2016-04-291-2/+1
| | | | | | Fix a FIXME. Disable loop alignment if compiled with -Oz now. llvm-svn: 268121
* DAGCombiner: Reduce truncated shl widthMatt Arsenault2016-04-291-0/+19
| | | | llvm-svn: 268094
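The commit has no description beyond its title. Assuming it narrows a shift whose result is truncated (an assumption, since the log does not say so), the arithmetic identity behind such a transform can be sketched as follows. The function names are illustrative and the widths are fixed at 64 and 32 bits for the example.

```cpp
#include <cassert>
#include <cstdint>

// Wide form: shift in 64 bits, then truncate to 32 bits.
uint32_t truncOfWideShl(uint64_t X, unsigned Amt) {
  return static_cast<uint32_t>(X << Amt);
}

// Narrow form: truncate first, then shift in 32 bits.
// Only equivalent when the shift amount is smaller than the narrow width.
uint32_t narrowShl(uint64_t X, unsigned Amt) {
  assert(Amt < 32 && "shift amount must be smaller than the narrow width");
  return static_cast<uint32_t>(X) << Amt;
}
```

The two forms agree because the low 32 bits of a left shift depend only on the low 32 bits of the input.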
* Use SelectionDAG::getTargetConstant* helper functions. NFC.Simon Pilgrim2016-04-291-4/+4
| | | | | | Instead of SelectionDAG::getConstant directly to make it more obvious that we're creating target constants. llvm-svn: 268074
* [MBP] Split placement and alignment into two functions. NFC.Haicheng Wu2016-04-291-0/+5
| | | | | | Cut and Paste. llvm-svn: 268067
* Recommitted r264280 "Supporting all entities declared in lexical scope in ↵Amjad Aboud2016-04-296-52/+201
| | | | | | | | LLVM debug info." After fixing PR26942 in r267004. llvm-svn: 268054
* Unify XDEBUG and EXPENSIVE_CHECKS (into the latter), and add an option to ↵Filipe Cabecinhas2016-04-292-4/+4
| | | | | | | | | | | | | | | | | | | the cmake build to enable them. Summary: Historically, we had a switch in the Makefiles for turning on "expensive checks". This has never been ported to the cmake build, but the (dead-ish) code is still around. This will also make it easier to turn it on in buildbots. Reviewers: chandlerc Subscribers: jyknight, mzolotukhin, RKSimon, gberry, llvm-commits Differential Revision: http://reviews.llvm.org/D19723 llvm-svn: 268050
* RegisterPressure: Fix default lanemask for missing regunit intervalsMatthias Braun2016-04-291-35/+33
| | | | | | | | | | | | | | In case of missing live intervals for a physical register getLanesWithProperty() would report 0 which was not a safe default in all situations. Add a parameter to pass in a safe default. No testcase because in-tree targets do not skip computing register unit live intervals. Also clean up the getXXX() functions to not perform the RequireLiveIntervals checks anymore so we do not even need to return safe defaults. llvm-svn: 267977
* RegisterPressure: Cannot produce dead (subregister) defs anymoreMatthias Braun2016-04-291-3/+2
| | | | | | | | | With the DetectDeadLanes pass in place we cannot run into situations anymore where defs suddenly become dead. Also add a missing check so we do not try to add an undef flag to a physreg (found by visual inspection, no failing test). llvm-svn: 267976
* LiveIntervalAnalysis: Remove LiveVariables requirementMatthias Braun2016-04-283-7/+3
| | | | | | | | | | | | This requirement was a huge hack to keep LiveVariables alive because it was optionally used by TwoAddressInstructionPass and PHIElimination. However we have AnalysisUsage::addUsedIfAvailable() which we can use in those passes. This re-applies r260806 with LiveVariables manually added to PowerPC to hopefully not break the stage 2 bots this time. llvm-svn: 267954
* [CodeGen] Remove extra ';'Marcin Koscielnicki2016-04-281-1/+1
| | | | | | Squashes a -Wpedantic warning. llvm-svn: 267944
* LiveIntervalAnalysis: No need to deal with dead subregister defs anymore.Matthias Braun2016-04-281-20/+3
| | | | | | | | The DetectDeadLanes pass already ensures that we have no dead subregister definitions, making the special handling in LiveIntervalAnalysis unnecessary. This reverts most of r248335. llvm-svn: 267937
* Reset the TopRPTracker's position in ScheduleDAGMILive::initQueuesKrzysztof Parzyszek2016-04-281-5/+11
| | | | | | | | | | | | | | | | | | | ScheduleDAGMI::initQueues changes the RegionBegin to the first non-debug instruction. Since it does not track register pressure, it does not affect any RP trackers. ScheduleDAGMILive inherits initQueues from ScheduleDAGMI, and it does reset the TopRPTracker in its schedule method. Any derived, target-specific scheduler will need to do it as well, but the TopRPTracker is only exposed as a "const" object to derived classes. Without the ability to modify the tracker directly, this leaves a derived scheduler with a potential of having the TopRPTracker out-of-sync with the CurrentTop. The symptom of the problem: void llvm::ScheduleDAGMILive::scheduleMI(llvm::SUnit *, bool): Assertion `TopRPTracker.getPos() == CurrentTop && "out of sync"' failed. Differential Revision: http://reviews.llvm.org/D19438 llvm-svn: 267918
* Debug Info: Restore the pre-r240853 behavior for DWARF2 bitfields.Adrian Prantl2016-04-281-24/+10
| | | | | | | | | The DWARF2 specification of DW_AT_bit_offset is ambiguous for little-endian machines, but by restoring to the old behavior we match what debuggers expect and what other popular compilers generate. llvm-svn: 267896
* Debug info: Support DWARF4 bitfields via DW_AT_data_bit_offset.Adrian Prantl2016-04-281-28/+30
| | | | | | | | | | | | | The DWARF2 specification of DW_AT_bit_offset was written from the perspective of a big-endian machine with unclear semantics for other systems. DWARF4 deprecated DW_AT_bit_offset and introduced a new attribute DW_AT_data_bit_offset that simply counts the number of bits from the beginning of the containing entity regardless of endianness. After this patch LLVM emits DW_AT_bit_offset for DWARF 2 or 3 and DW_AT_data_bit_offset when DWARF 4 or later is requested. llvm-svn: 267895
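The difference between the two attributes can be sketched with a small calculation. This is not LLVM's implementation: the struct and function names are hypothetical, the storage unit is assumed to be aligned to its own size, and the little-endian formula is the convention commonly attributed to existing compilers rather than something the DWARF 2 text spells out.

```cpp
#include <cassert>
#include <cstdint>

// All quantities are in bits. Names are illustrative, not LLVM's.
struct BitField {
  uint64_t OffsetInStruct; // first bit of the field, from the start of the struct
  uint64_t Size;           // width of the field
  uint64_t StorageSize;    // width of the storage unit, e.g. 32 for an int
};

// DWARF 4 style (DW_AT_data_bit_offset): count bits from the beginning of
// the containing entity, regardless of endianness.
uint64_t dataBitOffset(const BitField &F) { return F.OffsetInStruct; }

// DWARF 2 style (DW_AT_bit_offset): bits to the left of the field's most
// significant bit inside its storage unit.
uint64_t bitOffsetDwarf2(const BitField &F, bool LittleEndian) {
  uint64_t StartInUnit = F.OffsetInStruct % F.StorageSize;
  if (LittleEndian)
    return F.StorageSize - (StartInUnit + F.Size); // assumed LE convention
  return StartInUnit;
}
```

For `struct { int a : 3; int b : 5; }`, field `b` starts at bit 3. Under these assumptions the DWARF 4 value is 3 on every target, while the DWARF 2 value is 3 on big-endian and 24 on little-endian targets.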
* [CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in ↵Craig Topper2016-04-281-0/+4
| | | | | | TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior. llvm-svn: 267853
* CodeGen: Add DetectDeadLanes pass.Matthias Braun2016-04-284-0/+534
| | | | | | | | | | | | | | | | | | | | The DetectDeadLanes pass performs a dataflow analysis of used/defined subregister lanes across COPY instructions and instructions that will get lowered to copies. It detects dead definitions and uses reading undefined values which are obscured by COPY and subregister usage. These dead definitions cause trouble in the register coalescer which cannot deal with definitions suddenly becoming dead after coalescing COPY instructions. For now the pass only adds dead and undef flags to machine operands. It should be possible to extend it in the future to remove the dead instructions and redo the analysis for the affected virtual registers. Differential Revision: http://reviews.llvm.org/D18427 llvm-svn: 267851
* LiveIntervalAnalysis: Fix handleMove() using wrong value numbersMatthias Braun2016-04-281-2/+1
| | | | | | | | | | handleMove() was incorrectly swapping two value numbers. This was missed before because the problem only occurred when moving subregister definitions and needed -verify-machineinstrs to be detected. I cannot add a testcase as long as I cannot reapply r260905/r260806. llvm-svn: 267840
* [ImplicitNullChecks] Properly update the live-in of the block of the memory ↵Quentin Colombet2016-04-271-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | operation. We basically replace:
|   HoistBB:
|     cond_br NullBB, NotNullBB
|   NullBB:
|     ...
|   NotNullBB:
|     <reg> = load
| into
|   HoistBB:
|     <reg> = load_faulting_op NullBB
|     uncond_br NotNullBB
|   NullBB:
|     ...
|   NotNullBB: ## <reg> is now live-in of NotNullBB
|     ...
| This partially fixes the machine verifier error for test/CodeGen/X86/implicit-null-check.ll, but it still fails because of the implicit CFG structure. llvm-svn: 267817
* Fix build failure under NDEBUG.Than McIntosh2016-04-271-0/+4
| | | | llvm-svn: 267774
* [CodeGenPrepare] Don't sink a cast past its userDavid Majnemer2016-04-271-0/+5
| | | | | | | | | | The sink cast machinery is supposed to sink casts as close to their user as possible. However, an EH pad is the first instruction in its basic block. Don't sink if the user is an EH pad. This fixes PR27536. llvm-svn: 267767
* Refactor debugging code, NFC.Than McIntosh2016-04-271-31/+30
| | | | | | | | | | | | | | | | | Summary: Refactor debugging routines to reduce code duplication. Remove a couple of #include's that were not needed. Don't require MachineDominator as a prereq for this pass (not needed). These changes split off from http://reviews.llvm.org/D18827. Reviewers: wmi, gbiv, qcolombet Subscribers: llvm-commits, davidxl, jevinskie Differential Revision: http://reviews.llvm.org/D18992 llvm-svn: 267766
* [DAGCombiner] Follow coding convention for function name (NFC)Gerolf Hoflehner2016-04-271-2/+2
| | | | llvm-svn: 267745
* Revert r267649, it caused PR27539.Nico Weber2016-04-271-11/+7
| | | | llvm-svn: 267723
* Detects the SAD pattern on X86 so that much better code will be emitted once ↵Cong Hou2016-04-271-7/+11
| | | | | | | | the pattern is matched. Differential revision: http://reviews.llvm.org/D14840 llvm-svn: 267649
* [MachineInstrBundle] Actually set the PartialDeadDef flag only when the registerQuentin Colombet2016-04-271-1/+1
| | | | | | | | | is defined! The users were checking the proper thing (Defined + PartialDeadDef), but the information may have been wrong for other use cases, so fix that. llvm-svn: 267641
* [MachineBasicBlock] Take advantage of the partially dead information.Quentin Colombet2016-04-261-2/+9
| | | | | | | Thanks to that information, we no longer report a register as live when it is not. llvm-svn: 267622
* [MachineInstrBundle] Improve the recognition of dead definitions.Quentin Colombet2016-04-261-3/+7
| | | | | | | Now, it is possible to know that partial definitions are dead definitions and recognize that clobbered registers are also dead. llvm-svn: 267621
* [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.Ahmed Bougacha2016-04-262-40/+25
| | | | | | Differential Revision: http://reviews.llvm.org/D17176 llvm-svn: 267606
* [Tail duplication] Handle source registers with subregistersKrzysztof Parzyszek2016-04-261-34/+80
| | | | | | | | | | | | | | When a block is tail-duplicated, the PHI nodes from that block are replaced with appropriate COPY instructions. When those PHI nodes contained use operands with subregisters, the subregisters were dropped from the COPY instructions, resulting in incorrect code. Keep track of the subregister information and use this information when remapping instructions from the duplicated block. Differential Revision: http://reviews.llvm.org/D19337 llvm-svn: 267583
* [CodeGenPrepare] use branch weight metadata to decide if a select should be ↵Sanjay Patel2016-04-262-11/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | turned into a branch This is part of solving PR27344: https://llvm.org/bugs/show_bug.cgi?id=27344 CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the IR further with a select in place. For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly predictable branch. Even the most limited HW branch predictors will be correct on this branch almost all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all the times the branch was predicted correctly. As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's MispredictPenalty. Or we could just let targets override the default by implementing the hook with that and other target-specific options. Note that trying to statically determine mispredict rates for close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. I.e., 50/50 taken/not-taken might still be 100% predictable. Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable() branch weight default values are 4 and 64. A proposal to change that is in D19435. Differential Revision: http://reviews.llvm.org/D19488 llvm-svn: 267572
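The threshold arithmetic in this message can be worked through with a small sketch. The function below is hypothetical and is not the TLI hook added by the patch; it only models "more than 99% taken or not taken" in integer arithmetic, and it assumes the weights are small enough that multiplying by 100 does not overflow.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if one side of a two-way branch accounts for more than
// ThresholdPercent of the total profile weight. Illustrative only.
bool isHighlyPredictable(uint64_t TrueWeight, uint64_t FalseWeight,
                         unsigned ThresholdPercent = 99) {
  uint64_t Sum = TrueWeight + FalseWeight;
  if (Sum == 0)
    return false; // no profile information
  uint64_t Max = TrueWeight > FalseWeight ? TrueWeight : FalseWeight;
  // Max / Sum > ThresholdPercent / 100, rearranged to avoid division.
  return Max * 100 > Sum * ThresholdPercent;
}
```

With the weights 4 and 64 mentioned above, the larger side is 64 / 68, roughly 94%, so a check of this shape would not treat the branch as highly predictable. That is consistent with the note that the patch alone does not resolve PR27344.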
* [CodeGenPrepare] don't convert an unpredictable select into control flowSanjay Patel2016-04-261-1/+2
| | | | | | | Suggested in the review of D19488: http://reviews.llvm.org/D19488 llvm-svn: 267504
* [PR27390] [CodeGen] Reject indexed loads in DAGCombiner.Marcin Koscielnicki2016-04-252-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | visitAND, when folding and (load) forgets to check which output of an indexed load is involved, happily folding the updated address output on the following testcase:
|   target datalayout = "e-m:e-i64:64-n32:64"
|   target triple = "powerpc64le-unknown-linux-gnu"
|   %typ = type { i32, i32 }
|   define signext i32 @_Z8access_pP1Tc(%typ* %p, i8 zeroext %type) {
|     %b = getelementptr inbounds %typ, %typ* %p, i64 0, i32 1
|     %1 = load i32, i32* %b, align 4
|     %2 = ptrtoint i32* %b to i64
|     %3 = and i64 %2, -35184372088833
|     %4 = inttoptr i64 %3 to i32*
|     %_msld = load i32, i32* %4, align 4
|     %zzz = add i32 %1, %_msld
|     ret i32 %zzz
|   }
| Fix this by checking ResNo. I've found a few more places that currently neglect to check for indexed load, and tightened them up as well, but I don't have test cases for them. In fact, they might not be triggerable at all, at least with current targets. Still, better safe than sorry. Differential Revision: http://reviews.llvm.org/D19202 llvm-svn: 267420
* [WinEH] Update SplitAnalysis::computeLastSplitPoint to cope with multiple EH ↵David Majnemer2016-04-252-14/+12
| | | | | | | | | | | | | | | | | | | successors We didn't have logic to correctly handle CFGs where there was more than one EH-pad successor (these are novel with WinEH). There were situations where a register was live in one exceptional successor but not another but the code as written would only consider the first exceptional successor it found. This resulted in split points which were insufficiently early if an invoke was present. This fixes PR27501. N.B. This removes getLandingPadSuccessor. llvm-svn: 267412
* [MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098)Gerolf Hoflehner2016-04-243-3/+26
| | | | | | | | | | | | | | | | | | | The original patch caused crashes because it could dereference a null pointer for SelectionDAGTargetInfo for targets that do not define it. Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The target decides whether the machine combiner should optimize for throughput or latency. - Support for fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267328
* [CodeGen] Teach DAG combine to fold select_cc seteq X, 0, sizeof(X), ↵Craig Topper2016-04-241-0/+35
| | | | | | | | ctlz_zero_undef(X) -> ctlz(X). InstCombine already does this for IR and X86 pattern matches this during isel. A follow up commit will remove the X86 patterns to allow this to be tested. llvm-svn: 267325
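The equivalence behind this fold can be checked with a small model. This is a sketch on 32-bit values with hand-written helpers, not the DAG combine itself. `sizeof(X)` in the title is read here as the bit width of X, and `ctlzZeroUndef32` only models "result unspecified for zero" by refusing that input.

```cpp
#include <cassert>
#include <cstdint>

// Reference ctlz: defined for zero, where it returns the bit width (32).
unsigned ctlz32(uint32_t X) {
  unsigned N = 0;
  for (uint32_t Mask = 0x80000000u; Mask != 0 && (X & Mask) == 0; Mask >>= 1)
    ++N;
  return N;
}

// Models ctlz_zero_undef: callers must not pass zero.
unsigned ctlzZeroUndef32(uint32_t X) {
  assert(X != 0 && "result is unspecified for a zero input");
  return ctlz32(X);
}

// The pattern before the fold: a select guarding the zero case.
unsigned selectPattern(uint32_t X) {
  return X == 0 ? 32u : ctlzZeroUndef32(X);
}
```

For every input, `selectPattern(X)` and `ctlz32(X)` return the same value, so the select can be dropped in favour of a plain ctlz.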
* DebugInfo: Remove MDString-based type referencesDuncan P. N. Exon Smith2016-04-234-35/+14
| | | | | | | | | | | | | | | | | | | | | | | | Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around DIType*. It is no longer legal to refer to a DICompositeType by its 'identifier:', and DIBuilder no longer retains all types with an 'identifier:' automatically. Aside from the bitcode upgrade, this is mainly removing logic to resolve an MDString-based reference to an actual DIType. The commits leading up to this have made the implicit type map in DICompileUnit's 'retainedTypes:' field superfluous. This does not remove DITypeRef, DIScopeRef, DINodeRef, and DITypeRefArray, or stop using them in DI-related metadata. Although as of this commit they aren't serving a useful purpose, there are patches under review to reuse them for CodeView support. The tests in LLVM were updated with deref-typerefs.sh, which is attached to the thread "[RFC] Lazy-loading of debug info metadata": http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html llvm-svn: 267296
* replace duplicated static functions for profile metadata access with ↵Sanjay Patel2016-04-231-25/+2
| | | | | | BranchInst member function; NFCI llvm-svn: 267295
* [CodeGen] When promoting CTTZ operations to larger type, don't insert a ↵Craig Topper2016-04-231-9/+11
| | | | | | select to detect if the input is zero to return the original size instead of the extended size. Instead just set the first bit in the zero extended part. llvm-svn: 267280
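The trick described here can be illustrated on a concrete pair of widths. The sketch below promotes an 8-bit cttz to 32 bits. The helper names are hypothetical and the code models the arithmetic only, not the legalizer.

```cpp
#include <cassert>
#include <cstdint>

// Reference 32-bit cttz, returning 32 for a zero input.
unsigned cttz32(uint32_t X) {
  if (X == 0)
    return 32;
  unsigned N = 0;
  while ((X & 1u) == 0) {
    X >>= 1;
    ++N;
  }
  return N;
}

// cttz of an 8-bit value computed in 32 bits. Setting bit 8, the first bit
// of the zero-extended part, makes a zero input yield 8 (the original
// width) without a separate select on X == 0.
unsigned cttz8ViaPromotion(uint8_t X) {
  uint32_t Wide = static_cast<uint32_t>(X) | (1u << 8);
  return cttz32(Wide);
}
```

For nonzero inputs the extra bit sits above every original bit, so it never changes the count of trailing zeros.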
* Re-commit optimization bisect support (r267022) without new pass manager ↵Andrew Kaylor2016-04-2215-15/+18
| | | | | | | | | | support. The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling). Differential Revision: http://reviews.llvm.org/D19172 llvm-svn: 267231