summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [StatepointLowering] Remove distinction between call and invoke safepointsIgor Laevsky2015-11-042-24/+31
| | | | | | | | | | | | There is no point in having invoke safepoints handled differently than the call safepoints. All relevant decisions could be made by looking at whether or not gc.result and gc.relocate lay in a same basic block. This change will allow to lower call safepoints with relocates and results in a different basic blocks. See test case for example. Differential Revision: http://reviews.llvm.org/D14158 llvm-svn: 252028
* CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering.Peter Collingbourne2015-11-031-3/+28
| | | | | | | | | | | | | | | | | A profile of an LTO link of Chrome revealed that we were spending some ~30-50% of execution time in the function Constant::getRelocationInfo(), which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn from TargetMachine::getNameWithPrefix(). It turns out that we only need the result of getKindForGlobal() when targeting Mach-O, so this change moves the relevant part of the logic to TargetLoweringObjectFileMachO. NFCI. Differential Revision: http://reviews.llvm.org/D14168 llvm-svn: 252014
* [SelectionDAG] Use existing constant nodes instead of recreating them. NFC.Simon Pilgrim2015-11-031-9/+6
| | | | llvm-svn: 251990
* Delete dead code.Rafael Espindola2015-11-031-3/+0
| | | | llvm-svn: 251960
* [CodegenPrepare] Do not rematerialize gc.relocates across different basic blocksIgor Laevsky2015-11-031-0/+8
| | | | | | Differential Revision: http://reviews.llvm.org/D14258 llvm-svn: 251957
* [X86] Generate .cfi_adjust_cfa_offset correctly when pushing argumentsMichael Kuperstein2015-11-031-0/+3
| | | | | | | | | | | | | | | When push instructions are being used to pass function arguments on the stack, and either EH or debugging are enabled, we need to generate .cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is enough for the CFA offset to be correct at every call site, while for debugging we want to be correct after every push. Darwin does not support this well, so don't use pushes whenever it would be required. Differential Revision: http://reviews.llvm.org/D13767 llvm-svn: 251904
* RegisterPressure: Improve assert messageMatthias Braun2015-11-031-1/+2
| | | | llvm-svn: 251885
* RegisterPressure: Slightly nicer pressure diff dumpingMatthias Braun2015-11-031-1/+3
| | | | llvm-svn: 251884
* ScheduleDAGInstrs: Remove IsPostRA flag; NFCMatthias Braun2015-11-034-37/+27
| | | | | | | | | | | | | | | | | ScheduleDAGInstrs doesn't behave differently before or after register allocation. It was only used in a method of MachineSchedulerBase which behaved differently in MachineScheduler/PostMachineScheduler. Change this to let MachineScheduler/PostMachineScheduler just pass in a parameter to that function. The order of the LiveIntervals* and bool RemoveKillFlags paramters have been switched to make out-of-tree code fail instead of unintentionally passing a value intended for the IsPostRA flag to the (previously following and default initialized) RemoveKillFlags. Differential Revision: http://reviews.llvm.org/D14245 llvm-svn: 251883
* [CGP] widen switch condition and case constants to target's register width ↵Sanjay Patel2015-11-021-0/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (2nd try) This is a redo of r251849 except the tests have been split into arch-specific folders to hopefully make the bots happy. This is a follow-up from the discussion in D12965. The block-at-a-time limitation of SelectionDAG also came up in D13297. Without the InstCombine change from D12965, I don't expect this patch to make any difference in the real world because InstCombine does not shrink cases like this in visitSwitchInst(). But we need to have this CGP safety harness in place before proceeding with any shrinkage in D12965, so we won't generate extra extends for compares. I've opted for IR regression tests in the patch because that seems like a clearer way to test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86 will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473 Before: BB#0: mr 4, 3 extsh. 3, 4 ble 0, .LBB0_5 BB#1: cmpwi 3, 99 bgt 0, .LBB0_9 BB#2: rlwinm 4, 4, 0, 16, 31 <--- 32-bit mask/extend li 3, 0 cmplwi 4, 1 beqlr 0 BB#3: cmplwi 4, 10 bne 0, .LBB0_12 BB#4: li 3, 1 blr .LBB0_5: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 65436 beq 0, .LBB0_13 BB#6: cmplwi 3, 65526 beq 0, .LBB0_15 BB#7: cmplwi 3, 65535 bne 0, .LBB0_12 BB#8: li 3, 4 blr .LBB0_9: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 100 beq 0, .LBB0_14 ... After: BB#0: rlwinm 4, 3, 0, 16, 31 <--- mask/extend to 32-bit and then use that for comparisons cmpwi 4, 999 ble 0, .LBB0_5 BB#1: lis 3, 0 ori 3, 3, 65525 cmpw 4, 3 bgt 0, .LBB0_9 BB#2: cmplwi 4, 1000 beq 0, .LBB0_14 BB#3: cmplwi 4, 65436 bne 0, .LBB0_13 BB#4: li 3, 6 blr .LBB0_5: li 3, 0 cmplwi 4, 1 beqlr 0 BB#6: cmplwi 4, 10 beq 0, .LBB0_12 BB#7: cmplwi 4, 100 bne 0, .LBB0_13 BB#8: li 3, 2 blr .LBB0_9: cmplwi 4, 65526 beq 0, .LBB0_15 BB#10: cmplwi 4, 65535 bne 0, .LBB0_13 ... Differential Revision: http://reviews.llvm.org/D13532 llvm-svn: 251857
* revert r251849; need to move tests to arch-specific foldersSanjay Patel2015-11-021-47/+0
| | | | llvm-svn: 251851
* [CGP] widen switch condition and case constants to target's register widthSanjay Patel2015-11-021-0/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a follow-up from the discussion in D12965. The block-at-a-time limitation of SelectionDAG also came up in D13297. Without the InstCombine change from D12965, I don't expect this patch to make any difference in the real world because InstCombine does not shrink cases like this in visitSwitchInst(). But we need to have this CGP safety harness in place before proceeding with any shrinkage in D12965, so we won't generate extra extends for compares. I've opted for IR regression tests in the patch because that seems like a clearer way to test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86 will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473 Before: BB#0: mr 4, 3 extsh. 3, 4 ble 0, .LBB0_5 BB#1: cmpwi 3, 99 bgt 0, .LBB0_9 BB#2: rlwinm 4, 4, 0, 16, 31 <--- 32-bit mask/extend li 3, 0 cmplwi 4, 1 beqlr 0 BB#3: cmplwi 4, 10 bne 0, .LBB0_12 BB#4: li 3, 1 blr .LBB0_5: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 65436 beq 0, .LBB0_13 BB#6: cmplwi 3, 65526 beq 0, .LBB0_15 BB#7: cmplwi 3, 65535 bne 0, .LBB0_12 BB#8: li 3, 4 blr .LBB0_9: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 100 beq 0, .LBB0_14 ... After: BB#0: rlwinm 4, 3, 0, 16, 31 <--- mask/extend to 32-bit and then use that for comparisons cmpwi 4, 999 ble 0, .LBB0_5 BB#1: lis 3, 0 ori 3, 3, 65525 cmpw 4, 3 bgt 0, .LBB0_9 BB#2: cmplwi 4, 1000 beq 0, .LBB0_14 BB#3: cmplwi 4, 65436 bne 0, .LBB0_13 BB#4: li 3, 6 blr .LBB0_5: li 3, 0 cmplwi 4, 1 beqlr 0 BB#6: cmplwi 4, 10 beq 0, .LBB0_12 BB#7: cmplwi 4, 100 bne 0, .LBB0_13 BB#8: li 3, 2 blr .LBB0_9: cmplwi 4, 65526 beq 0, .LBB0_15 BB#10: cmplwi 4, 65535 bne 0, .LBB0_13 ... Differential Revision: http://reviews.llvm.org/D13532 llvm-svn: 251849
* In MachineBlockPlacement, filter cold blocks off the loop chain when profile ↵Cong Hou2015-11-021-2/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | data is available. In the current BB placement algorithm, a loop chain always contains all loop blocks. This has a drawback that cold blocks in the loop may be inserted on a hot function path, hence increasing branch cost and also reducing icache locality. Consider a simple example shown below: A | B⇆C | D When B->C is quite cold, the best BB-layout should be A,B,D,C. But the current implementation produces A,C,B,D. This patch filters those cold blocks off from the loop chain by comparing the ratio: LoopBBFreq / LoopFreq to 20%: if it is less than 20%, we don't include this BB to the loop chain. Here LoopFreq is the frequency of the loop when we reduce the loop into a single node. In general we have more cold blocks when the loop has few iterations. And vice versa. Differential revision: http://reviews.llvm.org/D11662 llvm-svn: 251833
* Fix two issues in MergeConsecutiveStores:James Y Knight2015-11-021-2/+15
| | | | | | | | | | | | | | | | | | | | | | 1) PR25154. This is basically a repeat of PR18102, which was fixed in r200201, and broken again by r234430. The latter changed which of the store nodes was merged into from the first to the last. Thus, we now also need to prefer merging a later store at a given address into the target node, instead of an earlier one. 2) While investigating that, I also realized I'd introduced a bug in r236850. There, I removed a check for alignment -- not realizing that nothing except the alignment check was ensuring that none of the stores were overlapping! This is a really bogus way to ensure there's no aliased stores. A better solution to both of these issues is likely to always use the code added in the 'if (UseAA)' branches which rearrange the chain based on a more principled analysis. I'll look into whether that can be used always, but in the interest of getting things back to working, I think a minimal change makes sense. llvm-svn: 251816
* [MachineVerifier] Analyze MachineMemOperands for mem-to-mem moves.Jonas Paulsson2015-10-291-6/+25
| | | | | | | | | | | | | Since the verifier will give false reports if it incorrectly thinks MI is loading or storing using an FI, it is necessary to scan memoperands and find out how the FI is used in the instruction. This should be relatively rare. Needed to make CodeGen/SystemZ/spill-01.ll pass, which now runs with this flag. Reviewed by Quentin Colombet. llvm-svn: 251620
* Revert "ScheduleDAGInstrs: Remove IsPostRA flag"Matthias Braun2015-10-292-19/+28
| | | | | | | | It broke 3 arm testcases. This reverts commit r251608. llvm-svn: 251615
* MachineScheduler: Fix typo in debug messageMatthias Braun2015-10-291-1/+1
| | | | | | Maybe I just missed the humor there ;-) llvm-svn: 251609
* ScheduleDAGInstrs: Remove IsPostRA flagMatthias Braun2015-10-292-28/+19
| | | | | | | | This was a layering violation in ScheduleDAGInstrs (and MachineSchedulerBase) they both shouldn't know directly whether they are used by the PostMachineScheduler or the MachineScheduler. llvm-svn: 251608
* MachineScheduler: Use ranged for and slightly simplify the codeMatthias Braun2015-10-291-11/+12
| | | | llvm-svn: 251607
* ARM: support .watchos_version_min and .tvos_version_min.Tim Northover2015-10-281-4/+12
| | | | | | | | These MachO file directives are used by linkers and other tools to provide compatibility information, much like the existing .ios_version_min and .macosx_version_min. llvm-svn: 251569
* [ValueTracking] Use !range metadata more aggressively in KnownBitsSanjoy Das2015-10-281-1/+1
| | | | | | | | | | | | | | Summary: Teach `computeKnownBitsFromRangeMetadata` to use `!range` metadata more aggressively. Reviewers: majnemer, nlewycky, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14100 llvm-svn: 251487
* [SelectionDAG] Don't inspect !range metadata for extended loadsSanjoy Das2015-10-281-1/+2
| | | | | | | | | | | | | | | | | | Summary: Don't call `computeKnownBitsFromRangeMetadata` for extended loads -- this can cause a mismatch between the width of the !range metadata and the width of the APInt's accumulating `KnownZero` (and `KnownOne` in the future). This isn't a problem now, but will be after a future change. Note: this can be made more aggressive in the future. Reviewers: nlewycky Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14107 llvm-svn: 251486
* Make the SelectionDAG graph printer use SDNode::PersistentId labels.James Y Knight2015-10-273-11/+16
| | | | | | | | r248010 changed the -debug output to use short ids, but did not similarly modify the graph printer. Change to be consistent, for ease of cross-reference. llvm-svn: 251465
* Use the 'arcp' fast-math-flag when combining repeated FP divisorsSanjay Patel2015-10-271-5/+11
| | | | | | | | | | | | This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes. This was originally part of D8900. Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and possibly other changes. Differential Revision: http://reviews.llvm.org/D9708 llvm-svn: 251450
* Create a new interface addSuccessorWithoutWeight(MBB*) in MBB to add ↵Cong Hou2015-10-273-21/+32
| | | | | | | | | | | | | | successors when optimization is disabled. When optimization is disabled, edge weights that are stored in MBB won't be used so that we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But that the weight list is empty doesn't mean disabled optimization (as is stated several times in MachineBasicBlock.cpp): it may also mean all successors just have default weights. We should discourage using default weights when adding successors, because it is very easy for users to forget update the correct edge weights instead of using default ones (one exception is that the MBB only has one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimizations is disabled. In this patch, a new interface addSuccessorWithoutWeight(MBB*) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this as for many uses of addSuccessor() whether optimization is disabled or not is not checked. But it can guarantee that if optimization is enabled, then the weight list always has the same size of the successor list. Differential revision: http://reviews.llvm.org/D13963 llvm-svn: 251429
* Do not use "else" when both branches return (NFC)Mehdi Amini2015-10-271-2/+1
| | | | | From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 251398
* Fix llc crash processing S/UREM for -Oz builds caused by rL250825.Steve King2015-10-271-5/+21
| | | | | | | | | | | | | | When taking the remainder of a value divided by a constant, visitREM() attempts to convert the REM to a longer but faster sequence of instructions. This conversion calls combine() on a speculative DIV instruction. Commit rL250825 may cause this combine() to return a DIVREM, corrupting nearby nodes. Flow eventually hits unreachable(). This patch adds a test case and a check to prevent visitREM() from trying to convert the REM instruction in cases where a DIVREM is possible. See http://reviews.llvm.org/D14035 llvm-svn: 251373
* Fix indents. It's a follow up to r251353.Ivan Krasin2015-10-261-2/+2
| | | | llvm-svn: 251364
* Move imported entities into DwarfCompilationUnit to speed up LTO linking.Ivan Krasin2015-10-264-22/+14
| | | | | | | | | | | | | | | | Summary: In particular, this CL speeds up the official Chrome linking with LTO by 1.8x. See more details in https://crbug.com/542426 Reviewers: dblaikie Subscribers: jevinskie Differential Revision: http://reviews.llvm.org/D13918 llvm-svn: 251353
* Remove assert(false) in favor of asserting the if conditional it is ↵David Blaikie2015-10-261-8/+5
| | | | | | | | contained within. Also adjust the code to avoid 3 redundant map lookups. llvm-svn: 251327
* [safestack] Fast access to the unsafe stack pointer on AArch64/Android.Evgeniy Stepanov2015-10-261-0/+13
| | | | | | | | | | | | | | | | | | | | | Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers builting_thread_pointer as aeabi_read_tp, which is not available on Android. The previous iteration of this change was reverted in r250461. This version leaves the generic, compiler-rt based implementation in SafeStack.cpp instead of moving it to TargetLoweringBase in order to allow testing without a TargetMachine. llvm-svn: 251324
* Scalarizer for masked.gather and masked.scatter intrinsics.Elena Demikhovsky2015-10-251-1/+261
| | | | | | | | | | When the target does not support these intrinsics they should be converted to a chain of scalar load or store operations. If the mask is not constant, the scalarizer will build a chain of conditional basic blocks. I added isLegalMaskedGather() isLegalMaskedScatter() APIs. Differential Revision: http://reviews.llvm.org/D13722 llvm-svn: 251237
* [X86] Use correct calling convention for MCU psABI libcallsMichael Kuperstein2015-10-251-0/+3
| | | | | | | | | | | | When using the MCU psABI, compiler-generated library calls should pass some parameters in-register. However, since inreg marking for x86 is currently done by the front end, it will not be applied to backend-generated calls. This is a workaround for PR3997, which describes a similar issue for -mregparm. Differential Revision: http://reviews.llvm.org/D13977 llvm-svn: 251223
* Refactor: Simplify boolean conditional return statements in lib/CodeGen.Rafael Espindola2015-10-249-60/+24
| | | | | | Patch by Richard. llvm-svn: 251213
* [DAGCombiner] Tidy up ConstantFP commutation. NFCISimon Pilgrim2015-10-241-37/+21
| | | | | | Move ConstantFP canonicalization of commutative instructions to start of 2-op node creation (matches integer) - simplifies constant folding code. llvm-svn: 251203
* [DAGCombiner] Generalize masking of constant rotates.Simon Pilgrim2015-10-241-5/+10
| | | | | | | | We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded. Followup to D13851. llvm-svn: 251197
* [X86][XOP] Add support for lowering vector rotationsSimon Pilgrim2015-10-241-55/+55
| | | | | | | | | | This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions. This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future. Differential Revision: http://reviews.llvm.org/D13851 llvm-svn: 251188
* [CodeGen] Mark setjmp/catchret MBBs address-takenJoseph Tremoulet2015-10-233-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This ensures that BranchFolding (and similar) won't remove these blocks. Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are address-taken but do not have BBs that are address-taken, since otherwise its call to getAddrLabelSymbolTableToEmit would fail an assertion on such blocks. I audited the other callers of getAddrLabelSymbolTableToEmit (and getAddrLabelSymbol); they all have BBs known to be address-taken except for the call through getAddrLabelSymbol from WinException::create32bitRef; that call is actually now unreachable, so I've removed it and updated the signature of create32bitRef. This fixes PR25168. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, llvm-commits Differential Revision: http://reviews.llvm.org/D13774 llvm-svn: 251113
* [CodeGen] Remove usage of NDEBUG in header.Davide Italiano2015-10-231-7/+0
| | | | | | Moreover, this seems unused. llvm-svn: 251081
* MachineScheduler: Add a way to disable the 'ReduceLatency' heuristicMatthias Braun2015-10-221-2/+2
| | | | llvm-svn: 251037
* Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. ↵Craig Topper2015-10-225-71/+66
| | | | | | This removes the need to pass a hardcoded size in many places. NFC llvm-svn: 251032
* [X86] - Catch extra combine opportunities for redundant imuls.Zia Ansari2015-10-221-8/+92
| | | | | | | | | | | | When we fold "mul ((add x, c1), c1)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple users which would result in an extra add instruction. In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add. I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works). Differential Revision: http://reviews.llvm.org/D13740 llvm-svn: 251028
* [WinEH] Remove extraneous call to emitEHRegistrationOffsetLabelDavid Majnemer2015-10-211-1/+0
| | | | | | It's a relic from the earlier implementation, let's remove it. llvm-svn: 250964
* LegalizeDAG: Implement promote for build_vectorMatt Arsenault2015-10-211-0/+30
| | | | | | | | | | This will be used in future commits for AMDGPU to promote operations on i64 vectors into operations on 32-bit vector components. This will be used / tested in future AMDGPU commits. llvm-svn: 250945
* Masked Load/Store optimization for scalar codeElena Demikhovsky2015-10-211-12/+72
| | | | | | | | | When we have to convert the masked.load, masked.store to scalar code, we generate a chain of conditional basic blocks. I added optimization for constant mask vector. Differential Revision: http://reviews.llvm.org/D13855 llvm-svn: 250893
* Let MachineVerifier be aware of mem-to-mem instructions.Jonas Paulsson2015-10-211-2/+8
| | | | | | | | | | | | | | | A mem-to-mem instruction (that both loads and stores), which store to an FI, cannot pass the verifier since it thinks it is loading from the FI. For the mem-to-mem instruction, do a looser check in visitMachineOperand() and only check liveness at the reg-slot while analyzing a frame index operand. Needed to make CodeGen/SystemZ/xor-01.ll pass with -verify-machineinstrs, which now runs with this flag. Reviewed by Evan Cheng and Quentin Colombet. llvm-svn: 250885
* Tail duplication can mix incompatible registers in phi nodesKrzysztof Parzyszek2015-10-211-0/+21
| | | | | | | | | Do not tail duplicate blocks where the successor has a phi node, and the corresponding value in that phi node uses a subregister. http://reviews.llvm.org/D13922 llvm-svn: 250877
* Two switch blocks in VectorLegalizer::LegalizeOp already have aArtyom Skrobov2015-10-201-0/+1
| | | | | | | | | | default: llvm_unreachable("This action is not supported yet!"); -- so I'm adding one to the third switch block, too. This is a follow-up fix for http://reviews.llvm.org/D13862 llvm-svn: 250830
* Adding support for TargetLoweringBase::LibCallArtyom Skrobov2015-10-201-251/+275
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: TargetLoweringBase::Expand is defined as "Try to expand this to other ops, otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between the two possibilities was defined in a rather convoluted way: - if DIVREM is legal, expand to DIVREM - if DIVREM has a custom lowering, expand to DIVREM - if DIVREM libcall is defined and a remainder from the same division is computed elsewhere, expand to a DIVREM libcall - else, expand to a DIV libcall This had the undesirable effect that if both DIV and DIVREM are implemented as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM libcall, even when the remainder isn't used. The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that backends can directly control whether they prefer an expansion or a conversion to a libcall. This makes the generic lowering code even more generic, allowing its reuse in a wider range of target-specific configurations. The useful effect is that ARM backend will now generate a call to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where it doesn't need the remainder. There's no functional change outside the ARM backend. Reviewers: t.p.northover, rengolin Subscribers: t.p.northover, llvm-commits, aemerson Differential Revision: http://reviews.llvm.org/D13862 llvm-svn: 250826
* Combining DIV+REM->DIVREM doesn't belong in LegalizeDAG; move it over into ↵Artyom Skrobov2015-10-203-67/+99
| | | | | | | | | | | | | | | | | | | | | | | | DAGCombiner. Summary: In addition to moving the code over, this patch amends the DIV,REM -> DIVREM combining to run on all affected nodes at once: if the nodes are converted to DIVREM one at a time, then the resulting DIVREM may get legalized by the backend into something target-specific that we won't be able to recognize and correlate with the remaining nodes. The motivation is to "prepare terrain" for D13862: when we set DIV and REM to be legalized to libcalls, instead of the DIVREM, we otherwise lose the ability to combine them together. To prevent this, we need to take the DIV,REM -> DIVREM combining out of the lowering stage. Reviewers: RKSimon, eli.friedman, rengolin Subscribers: john.brawn, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13733 llvm-svn: 250825
OpenPOWER on IntegriCloud