path: root/llvm/lib/CodeGen/TargetLoweringBase.cpp
Commit message (Author, Date, Files changed, Lines -deleted/+added)
* R600: Implement getRecipEstimate (Matt Arsenault, 2015-01-13, 1 file, -0/+1)
  This requires a new hook to prevent expanding sqrt in terms of rsqrt and reciprocal. v_rcp_f32, v_rsq_f32, and v_sqrt_f32 are all the same rate, so this expansion would just double the number of instructions and cycles.
  llvm-svn: 225828
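  For illustration, a minimal sketch of a reciprocal-estimate override under assumed signatures (the hook shape and the AMDGPUISD::RCP usage are assumptions, not taken from this log):

    // Hypothetical override: hand back the hardware rcp directly and ask
    // for zero Newton-Raphson refinement steps, since v_rcp_f32 is
    // already full rate.
    SDValue MyTargetLowering::getRecipEstimate(SDValue Operand,
                                               DAGCombinerInfo &DCI,
                                               unsigned &RefinementSteps) const {
      if (Operand.getValueType() == MVT::f32) {
        RefinementSteps = 0; // hardware result is precise enough as-is
        return DCI.DAG.getNode(AMDGPUISD::RCP, SDLoc(Operand), MVT::f32, Operand);
      }
      return SDValue(); // fall back to the default expansion
    }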
* [CodeGen] Use MVT iterator_ranges in legality loops. NFC intended. (Ahmed Bougacha, 2015-01-07, 1 file, -19/+14)
  A few loops do trickier things than just iterating on an MVT subset, so I'll leave them be for now. Follow-up of r225387.
  llvm-svn: 225392
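  For illustration, the shape of the cleanup (a representative sketch; the opcode shown is not from this commit):

    // Before: manual iteration over a contiguous chunk of the MVT enum.
    for (unsigned VT = (unsigned)MVT::FIRST_FP_VALUETYPE;
         VT <= (unsigned)MVT::LAST_FP_VALUETYPE; ++VT)
      setOperationAction(ISD::FSIN, (MVT::SimpleValueType)VT, Expand);

    // After: range-based loop over the iterator_range added in r225387.
    for (MVT VT : MVT::fp_valuetypes())
      setOperationAction(ISD::FSIN, VT, Expand);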
* [CodeGenPrepare] Reapply r224351 with a fix for the assertion failure (Quentin Colombet, 2014-12-17, 1 file, -0/+1)
  The type promotion helper does not support vector types, so it does not kick in in such cases.

  Original commit message:

  [CodeGenPrepare] Move sign/zero extensions near loads using type promotion.

  This patch extends the optimization in CodeGenPrepare that moves a sign/zero extension near a load when the target can combine them. The optimization may promote any operations between the extension and the load to make that possible. Although this optimization may be beneficial for all targets, in particular AArch64, this is enabled for X86 only as I have not benchmarked it for other targets yet.

  ** Context **

  Most targets feature extended loads, i.e., loads that perform a zero or sign extension for free. In that context it is interesting to expose such a pattern in CodeGenPrepare so that the instruction selection pass can form such loads. Sometimes this pattern is blocked because of instructions between the load and the extension. When those instructions are promotable to the extended type, we can expose this pattern.

  ** Motivating Example **

  Let us consider an example:

    define void @foo(i8* %addr1, i32* %addr2, i8 %a, i32 %b) {
      %ld = load i8* %addr1
      %zextld = zext i8 %ld to i32
      %ld2 = load i32* %addr2
      %add = add nsw i32 %ld2, %zextld
      %sextadd = sext i32 %add to i64
      %zexta = zext i8 %a to i32
      %addza = add nsw i32 %zexta, %zextld
      %sextaddza = sext i32 %addza to i64
      %addb = add nsw i32 %b, %zextld
      %sextaddb = sext i32 %addb to i64
      call void @dummy(i64 %sextadd, i64 %sextaddza, i64 %sextaddb)
      ret void
    }

  As it is, this IR generates the following assembly on x86_64:

    [...]
    movzbl (%rdi), %eax   # zero-extended load
    movl (%rsi), %esi     # plain load
    addl %eax, %esi       # 32-bit add
    movslq %esi, %rdi     # sign extend the result of add
    movzbl %dl, %edx      # zero extend the first argument
    addl %eax, %edx       # 32-bit add
    movslq %edx, %rsi     # sign extend the result of add
    addl %eax, %ecx       # 32-bit add
    movslq %ecx, %rdx     # sign extend the result of add
    [...]

  The throughput of this sequence is 7.45 cycles on Ivy Bridge according to IACA.

  Now, by promoting the additions to form more extended loads we would generate:

    [...]
    movzbl (%rdi), %eax   # zero-extended load
    movslq (%rsi), %rdi   # sign-extended load
    addq %rax, %rdi       # 64-bit add
    movzbl %dl, %esi      # zero extend the first argument
    addq %rax, %rsi       # 64-bit add
    movslq %ecx, %rdx     # sign extend the second argument
    addq %rax, %rdx       # 64-bit add
    [...]

  The throughput of this sequence is 6.15 cycles on Ivy Bridge according to IACA. This kind of sequence happens a lot in code using 32-bit indexes on 64-bit architectures.

  Note: The throughput numbers are similar on Sandy Bridge and Haswell.

  ** Proposed Solution **

  To avoid the penalty of all these sign/zero extensions, we merge them into the loads at the beginning of the chain of computation by promoting the whole chain of computation to the extended type. The promotion is done if and only if we do not introduce new extensions, i.e., if we do not degrade the code quality. To achieve this, we extend the existing "move ext to load" optimization with the promotion mechanism introduced to match larger patterns for addressing mode (r200947). The idea of this extension is to perform the following transformation:

    ext(promotableInst1(...(promotableInstN(load))))
    => promotedInst1(...(promotedInstN(ext(load))))

  The promotion mechanism in that optimization is enabled by a new TargetLowering switch, which is off by default. In other words, by default, the optimization performs the "move ext to load" optimization as it was before this patch.

  ** Performance **

  Configuration: x86_64, Ivy Bridge fixed at 2900MHz running OS X 10.10.
  Tested optimization levels: O3/Os
  Tests: llvm-testsuite + externals
  Results:
  - No regression beside noise.
  - Improvements:
      CINT2006/473.astar: ~2%
      Benchmarks/PAQ8p: ~2%
      Misc/perlin: ~3%

  The results are consistent for both O3 and Os.

  <rdar://problem/18310086>

  llvm-svn: 224402
* Revert "[CodeGenPrepare] Move sign/zero extensions near loads using type ↵Reid Kleckner2014-12-171-1/+0
| | | | | | | | | promotion." This reverts commit r224351. It causes assertion failures when building ICU. llvm-svn: 224397
* [CodeGenPrepare] Move sign/zero extensions near loads using type promotion. (Quentin Colombet, 2014-12-16, 1 file, -0/+1)
  This patch extends the optimization in CodeGenPrepare that moves a sign/zero extension near a load when the target can combine them. The optimization may promote any operations between the extension and the load to make that possible. Although this optimization may be beneficial for all targets, in particular AArch64, this is enabled for X86 only as I have not benchmarked it for other targets yet.

  ** Context **

  Most targets feature extended loads, i.e., loads that perform a zero or sign extension for free. In that context it is interesting to expose such a pattern in CodeGenPrepare so that the instruction selection pass can form such loads. Sometimes this pattern is blocked because of instructions between the load and the extension. When those instructions are promotable to the extended type, we can expose this pattern.

  ** Motivating Example **

  Let us consider an example:

    define void @foo(i8* %addr1, i32* %addr2, i8 %a, i32 %b) {
      %ld = load i8* %addr1
      %zextld = zext i8 %ld to i32
      %ld2 = load i32* %addr2
      %add = add nsw i32 %ld2, %zextld
      %sextadd = sext i32 %add to i64
      %zexta = zext i8 %a to i32
      %addza = add nsw i32 %zexta, %zextld
      %sextaddza = sext i32 %addza to i64
      %addb = add nsw i32 %b, %zextld
      %sextaddb = sext i32 %addb to i64
      call void @dummy(i64 %sextadd, i64 %sextaddza, i64 %sextaddb)
      ret void
    }

  As it is, this IR generates the following assembly on x86_64:

    [...]
    movzbl (%rdi), %eax   # zero-extended load
    movl (%rsi), %esi     # plain load
    addl %eax, %esi       # 32-bit add
    movslq %esi, %rdi     # sign extend the result of add
    movzbl %dl, %edx      # zero extend the first argument
    addl %eax, %edx       # 32-bit add
    movslq %edx, %rsi     # sign extend the result of add
    addl %eax, %ecx       # 32-bit add
    movslq %ecx, %rdx     # sign extend the result of add
    [...]

  The throughput of this sequence is 7.45 cycles on Ivy Bridge according to IACA.

  Now, by promoting the additions to form more extended loads we would generate:

    [...]
    movzbl (%rdi), %eax   # zero-extended load
    movslq (%rsi), %rdi   # sign-extended load
    addq %rax, %rdi       # 64-bit add
    movzbl %dl, %esi      # zero extend the first argument
    addq %rax, %rsi       # 64-bit add
    movslq %ecx, %rdx     # sign extend the second argument
    addq %rax, %rdx       # 64-bit add
    [...]

  The throughput of this sequence is 6.15 cycles on Ivy Bridge according to IACA. This kind of sequence happens a lot in code using 32-bit indexes on 64-bit architectures.

  Note: The throughput numbers are similar on Sandy Bridge and Haswell.

  ** Proposed Solution **

  To avoid the penalty of all these sign/zero extensions, we merge them into the loads at the beginning of the chain of computation by promoting the whole chain of computation to the extended type. The promotion is done if and only if we do not introduce new extensions, i.e., if we do not degrade the code quality. To achieve this, we extend the existing "move ext to load" optimization with the promotion mechanism introduced to match larger patterns for addressing mode (r200947). The idea of this extension is to perform the following transformation:

    ext(promotableInst1(...(promotableInstN(load))))
    => promotedInst1(...(promotedInstN(ext(load))))

  The promotion mechanism in that optimization is enabled by a new TargetLowering switch, which is off by default. In other words, by default, the optimization performs the "move ext to load" optimization as it was before this patch.

  ** Performance **

  Configuration: x86_64, Ivy Bridge fixed at 2900MHz running OS X 10.10.
  Tested optimization levels: O3/Os
  Tests: llvm-testsuite + externals
  Results:
  - No regression beside noise.
  - Improvements:
      CINT2006/473.astar: ~2%
      Benchmarks/PAQ8p: ~2%
      Misc/perlin: ~3%

  The results are consistent for both O3 and Os.

  <rdar://problem/18310086>

  llvm-svn: 224351
* [Statepoints 2/4] Statepoint infrastructure for garbage collection: MI & x86-64 Backend (Philip Reames, 2014-12-01, 1 file, -1/+7)
  This is the second patch in a small series. This patch contains the MachineInstruction and x86-64 backend pieces required to lower Statepoints. It does not include the code to actually generate the STATEPOINT machine instruction and as a result, the entire patch is currently dead code. I will be submitting the SelectionDAG parts within the next 24-48 hours. Since those pieces are by far the most complicated, I wanted to minimize the size of that patch. That patch will include the tests which exercise the functionality in this patch. The entire series can be seen as one combined whole in http://reviews.llvm.org/D5683.

  The STATEPOINT pseudo node is generated after all gc values are explicitly spilled to stack slots. The purpose of this node is to wrap an actual call instruction while recording the spill locations of the meta arguments used for garbage collection and other purposes. The STATEPOINT is modeled as modifying all of those locations to prevent backend optimizations from forwarding the value from before the STATEPOINT to after the STATEPOINT. (Doing so would break relocation semantics for collectors which wish to relocate roots.)

  The implementation of STATEPOINT is closely modeled on PATCHPOINT. Eventually, much of the code in this patch will be removed. The long term plan is to merge the functionality provided by statepoints and patchpoints. Merging their implementations in the backend is likely to be a good starting point.

  Reviewed by: atrick, ributzka
  llvm-svn: 223085
* Target triple OS detection tidyup. NFC (Simon Pilgrim, 2014-11-29, 1 file, -1/+1)
  Use Triple::isOS*() helpers where possible.
  llvm-svn: 222960
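  For illustration, the kind of call-site simplification this covers (a hypothetical helper, not an actual call site from the commit):

    #include "llvm/ADT/Triple.h"
    using namespace llvm;

    static bool isDarwinTarget(const Triple &TT) {
      // Before: TT.getOS() == Triple::Darwin || TT.getOS() == Triple::MacOSX ||
      //         TT.getOS() == Triple::IOS
      return TT.isOSDarwin(); // one helper covers all Darwin variants
    }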
* Replace a couple asserts with static_asserts. (Craig Topper, 2014-11-17, 1 file, -2/+2)
  llvm-svn: 222114
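  For illustration, the usual pattern behind such a change (the names here are hypothetical):

    // The table size is known at compile time, so the consistency check can
    // move from a runtime assert() to a compile-time static_assert.
    static const char *const BoolContentNames[] = {"Undefined", "ZeroOrOne",
                                                   "ZeroOrNegativeOne"};
    static_assert(sizeof(BoolContentNames) / sizeof(BoolContentNames[0]) == 3,
                  "boolean content name table out of sync");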
* We can get the TLOF from the TargetMachine, so the constructor no longer requires TargetLoweringObjectFile to be passed. (Aditya Nandakumar, 2014-11-13, 1 file, -4/+3)
  llvm-svn: 221926
* This patch changes the ownership of TLOF from TargetLoweringBase to TargetMachine, so that different subtargets can share the TLOF effectively. (Aditya Nandakumar, 2014-11-13, 1 file, -4/+0)
  llvm-svn: 221878
* PR20557: Fix the bug that a bogus cpu parameter crashes llc on the AArch64 backend. (Hao Liu, 2014-10-31, 1 file, -1/+5)
  Initial patch by Oleg Ranevskyy.
  llvm-svn: 220945
* Add minnum / maxnum codegen (Matt Arsenault, 2014-10-21, 1 file, -0/+20)
  llvm-svn: 220342
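  A sketch of the target-side plumbing this enables; ISD::FMINNUM/ISD::FMAXNUM are the nodes involved, but the types and actions below are illustrative:

    // In a target's TargetLowering constructor: claim the IEEE-754
    // minNum/maxNum nodes where the ISA has native instructions and let
    // everything else take the default expansion.
    setOperationAction(ISD::FMINNUM, MVT::f32, Legal);
    setOperationAction(ISD::FMAXNUM, MVT::f32, Legal);
    setOperationAction(ISD::FMINNUM, MVT::f64, Expand);
    setOperationAction(ISD::FMAXNUM, MVT::f64, Expand);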
* Reinstate "Nuke the old JIT."Eric Christopher2014-09-021-1/+0
| | | | | | | | Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reinstates commits r215111, 215115, 215116, 215117, 215136. llvm-svn: 216982
* name change: isPow2DivCheap -> isPow2SDivCheap (Sanjay Patel, 2014-08-21, 1 file, -1/+1)
  That name doesn't specify signed or unsigned. Lazy as I am, I eventually read the function and variable comments. It turns out that this is strictly about signed div. But I discovered that the comments are wrong: srl/add/sra is not the general sequence for signed integer division by power-of-2. We need one more 'sra':

    sra/srl/add/sra

  That's the sequence produced in DAGCombiner. The first 'sra' may be removed when dividing by exactly '2', but that's a special case. This patch corrects the comments, changes the name of the flag bit, and changes the name of the accessor methods. No functional change intended.

  Differential Revision: http://reviews.llvm.org/D5010
  llvm-svn: 216237
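  To make the sra/srl/add/sra sequence concrete, a worked sketch of a signed divide by 4 built from that sequence (variable names are illustrative):

    // x / 4 with round-toward-zero semantics, as DAGCombiner builds it.
    int sdiv_by_4(int x) {
      int sign = x >> 31;                   // sra: 0 for x >= 0, -1 for x < 0
      unsigned bias = (unsigned)sign >> 30; // srl: 3 when negative, else 0
      int biased = x + (int)bias;           // add: round negatives toward zero
      return biased >> 2;                   // sra: the actual divide
    }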
* Added a TLI hook to signal that the target does not have or does not care about floating point exceptions (Pedro Artigas, 2014-08-08, 1 file, -0/+1)
  Added use of the flag to fold potentially exception-raising floating point math in selection DAG. No functionality change, as targets have to explicitly ask for this behavior and none does today.
  llvm-svn: 215222
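  A sketch of how a target might opt in; the setter name below is an assumption about the new hook's spelling, not confirmed by this log:

    // Hypothetical target constructor code: declare that FP exceptions are
    // not observable, letting the DAG fold math that could otherwise trap.
    setHasFloatingPointExceptions(false); // assumed setter name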
* Temporarily Revert "Nuke the old JIT." as it's not quite ready to be deleted. (Eric Christopher, 2014-08-07, 1 file, -0/+1)
  This will be reapplied as soon as possible, and before the 3.6 branch date at any rate. Approved by Jim Grosbach, Lang Hames, Rafael Espindola. This reverts commits r215111, r215115, r215116, r215117, r215136.
  llvm-svn: 215154
* Nuke the old JIT. (Rafael Espindola, 2014-08-07, 1 file, -1/+0)
  I am sure we will be finding bits and pieces of dead code for years to come, but this is a good start. Thanks to Lang Hames for making MCJIT a good replacement!
  llvm-svn: 215111
* Remove the TargetMachine forwards for TargetSubtargetInfo-based information and update all callers. No functional change. (Eric Christopher, 2014-08-04, 1 file, -7/+8)
  llvm-svn: 214781
* CodeGen: emit IR-level f16 conversion intrinsics as fptrunc/fpext (Tim Northover, 2014-07-21, 1 file, -1/+4)
  This makes the first stage DAG for @llvm.convert.to.fp16 an fptrunc, and correspondingly @llvm.convert.from.fp16 an fpext. The legalisation path is now uniform, regardless of the input IR:

    fptrunc -> FP_TO_FP16 (if f16 illegal) -> libcall
    fpext   -> FP16_TO_FP (if f16 illegal) -> libcall

  Each target should be able to select the version that best matches its operations and not be required to duplicate patterns for both fptrunc and FP_TO_FP16 (for example). As a result we can remove some redundant AArch64 patterns.
  llvm-svn: 213507
* CodeGen: soften f16 type by default instead of marking legal. (Tim Northover, 2014-07-18, 1 file, -0/+7)
  Actual support for softening f16 operations is still limited, and can be added when it's needed. But Soften is much closer to being a useful thing to try than keeping it Legal when no registers can actually hold such values. Longer term, we probably want something between Soften and Promote semantics for most targets; it'll be more efficient to promote the 4 basic operations to f32 than libcall them.
  llvm-svn: 213372
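  A sketch of the longer-term direction the message describes, promoting the basic f16 arithmetic to f32 instead of libcalls (illustrative, not what this commit implements):

    // Promote the four basic operations: compute in f32, truncate back.
    for (unsigned Op : {(unsigned)ISD::FADD, (unsigned)ISD::FSUB,
                        (unsigned)ISD::FMUL, (unsigned)ISD::FDIV})
      setOperationAction(Op, MVT::f16, Promote);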
* CodeGen: generate single libcall for fptrunc -> f16 operations. (Tim Northover, 2014-07-17, 1 file, -1/+16)
  Previously we asserted on this code. Currently compiler-rt doesn't actually implement any of these new libcalls, but external help is pretty much the only viable option for LLVM. I've followed the much more generic "__truncST2" naming, as opposed to the odd name for f32 -> f16 truncation. This can obviously be changed later, or overridden by any targets that need to.
  llvm-svn: 213252
* [x86,SDAG] Introduce any- and sign-extend-vector-inreg nodes analogous to the zero-extend-vector-inreg node (Chandler Carruth, 2014-07-10, 1 file, -0/+4)
  These nodes serve the same purpose as the previously introduced zero-extend-vector-inreg node: to manage the type legalization of widened extend operations, especially to support the experimental widening mode for x86. I'm adding both because sign-extend is expanded in terms of any-extend with shifts to propagate the sign bit.

  This removes the last fundamental scalarization from vec_cast2.ll (a test case that hit many really bad edge cases for widening legalization), although the trunc tests in that file still appear scalarized because the shuffle legalization is scalarizing. Funny thing, I've been working on that.

  Some initial experiments with this and SSE2 scenarios are showing moderately good behavior already for sign extension. Still some work to do on the shuffle combining on X86 before we're generating optimal sequences, but avoiding scalarization is a huge step forward.
  llvm-svn: 212714
* Make it possible for ints/floats to return different values from getBooleanContents() (Daniel Sanders, 2014-07-10, 1 file, -0/+1)
  Summary: On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer comparisons return 0 or 1. Updated the various uses of getBooleanContents. Two simplifications had to be disabled when float and int boolean contents differ:
  - ScalarizeVecRes_VSELECT, except when the kind of boolean contents is trivially discoverable (i.e. when the condition of the VSELECT is a SETCC node).
  - visitVSELECT (select C, 0, 1) -> (xor C, 1). Come to think of it, this one could test for the common case of 'C' being a SETCC too.

  Preserved existing behaviour for all other targets and updated the affected MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark, where the 'low' variable was counting in the wrong direction because it thought it could simply add the result of the comparison.

  Reviewers: hfinkel
  Reviewed By: hfinkel
  Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits
  Differential Revision: http://reviews.llvm.org/D4389
  llvm-svn: 212697
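  A sketch of the target-side configuration; the two-argument setBooleanContents overload shown is an assumption about how the int/float split is expressed:

    // Hypothetical MIPS32r6-style setup: integer setcc yields 0/1, while
    // floating point compares yield an all-ones mask (0/-1).
    setBooleanContents(ZeroOrOneBooleanContent,          // integer results
                       ZeroOrNegativeOneBooleanContent); // FP results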
* [SDAG] Make the new zext-vector-inreg node default to expand so targets don't need to set it manually. (Chandler Carruth, 2014-07-09, 1 file, -1/+4)
  This is based on feedback from Tom, who pointed out that if every target needs to handle this we need to reach out to those maintainers. In fact, it doesn't make sense to duplicate everything when anything other than expand seems unlikely at this stage.
  llvm-svn: 212661
* [codegen,aarch64] Add a target hook to the code generator to control vector type legalization strategies in a more fine grained manner (Chandler Carruth, 2014-07-03, 1 file, -40/+48)
  This changes the legalization of several v1iN types and v1f32 to be widening rather than scalarization on AArch64. It fixes an assertion failure caused by scalarizing nodes like "v1i32 trunc v1i64": as v1i64 is legal, it will fail to scalarize v1i32. This also provides a foundation for other targets to have more granular control over how vector types are legalized.

  Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow some work to start taking place on top of this patch as it adds some really important hooks to the backend that I'd like to immediately start using. =]

  http://reviews.llvm.org/D4322
  llvm-svn: 212242
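  A sketch of the kind of override this hook enables; the getPreferredVectorAction signature is assumed from later LLVM sources, and the filter is illustrative:

    // Hypothetical AArch64-style override: widen single-element vectors
    // instead of scalarizing them, so nodes like "v1i32 trunc v1i64" stay
    // legalizable.
    TargetLoweringBase::LegalizeTypeAction
    MyTargetLowering::getPreferredVectorAction(EVT VT) const {
      if (VT.isVector() && VT.getVectorNumElements() == 1 && VT != MVT::v1i64)
        return TypeWidenVector;
      return TargetLoweringBase::getPreferredVectorAction(VT);
    }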
* IR: add "cmpxchg weak" variant to support permitted failure.Tim Northover2014-06-131-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. llvm-svn: 210903
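  A sketch of the opt-in the message mentions; the target and action are illustrative:

    // Default: ATOMIC_CMP_SWAP_WITH_SUCCESS is Expanded back to the old
    // ATOMIC_CMP_SWAP. A backend that can consume the success flag
    // directly claims the node instead:
    setOperationAction(ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS, MVT::i32, Custom);
    setOperationAction(ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS, MVT::i64, Custom);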
* InitLibcallNames can take a Triple instead of a TargetMachine. (Eric Christopher, 2014-06-02, 1 file, -4/+4)
  llvm-svn: 210045
* Remove unused variable. (Eric Christopher, 2014-05-22, 1 file, -1/+0)
  llvm-svn: 209391
* X86TTI: Adjust sdiv cost now that we can lower it on plain SSE2. (Benjamin Kramer, 2014-04-27, 1 file, -1/+1)
  Includes a fix for a horrible typo that caused all SDIV costs to be slightly off :)
  llvm-svn: 207371
* Revert r206749 till a final decision about the intrinsics is made. (Michael Zolotukhin, 2014-04-26, 1 file, -1/+0)
  llvm-svn: 207313
* Set default value of HasExtractBitsInsn to false (Yi Jiang, 2014-04-21, 1 file, -0/+1)
  llvm-svn: 206803
* Reapply r206732. This time without optimization of branches. (Michael Zolotukhin, 2014-04-21, 1 file, -0/+1)
  llvm-svn: 206749
* Revert r206732, which is causing llc to crash on most of the build bots. (Chandler Carruth, 2014-04-21, 1 file, -1/+0)
  Original commit message: Implement builtins for safe division: safe.sdiv.iN, safe.udiv.iN, safe.srem.iN, safe.urem.iN (iN = i8, i16, i32, or i64).
  llvm-svn: 206735
* Implement builtins for safe division: safe.sdiv.iN, safe.udiv.iN, safe.srem.iN, safe.urem.iN (iN = i8, i16, i32, or i64). (Michael Zolotukhin, 2014-04-21, 1 file, -0/+1)
  llvm-svn: 206732
* [C++11] More 'nullptr' conversion. In some cases just using a boolean check instead of comparing to nullptr. (Craig Topper, 2014-04-14, 1 file, -17/+17)
  llvm-svn: 206142
* Change shouldSplitVectorElementType to better match the description. (Matt Arsenault, 2014-03-31, 1 file, -1/+1)
  Pass the entire vector type, and not just the element.
  llvm-svn: 205247
* CodeGen: add sensible defaults for the ISD::FROUND operation (Tim Northover, 2014-03-29, 1 file, -0/+9)
  Some exotic types didn't know how to handle FROUND, which ARM64 uses.
  llvm-svn: 205088
* CodeGenPrep: wrangle IR to exploit AArch64 tbz/tbnz inst. (Tim Northover, 2014-03-29, 1 file, -0/+1)
  Given IR like:

    %bit = and %val, #imm-with-1-bit-set
    %tst = icmp %bit, 0
    br i1 %tst, label %true, label %false

  some targets can emit just a single instruction (tbz/tbnz in the AArch64 case). However, with ISel acting at the basic-block level, all three instructions need to be together for this to be possible. This adds another transformation to CodeGenPrep to expose these opportunities, if targets opt in via the hook.
  llvm-svn: 205086
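  A sketch of the opt-in; the flag name is an assumption about the hook this commit adds:

    // Hypothetical target constructor code: a masked single-bit test plus
    // branch folds into one instruction, so sinking and/icmp/br into one
    // block is profitable.
    MaskAndBranchFoldingIsLegal = true; // assumed TLI flag name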
* [C++11] Replace llvm::tie with std::tie. (Benjamin Kramer, 2014-03-02, 1 file, -1/+1)
  The old implementation is no longer needed in C++11.
  llvm-svn: 202644
* move getNameWithPrefix and getSymbol to TargetMachine. (Rafael Espindola, 2014-02-19, 1 file, -27/+0)
  TargetLoweringBase is implemented in CodeGen, so before this patch we had a dependency from Target to CodeGen. This would show up as a link failure of llvm-stress when building with -DBUILD_SHARED_LIBS=ON. This fixes pr18900.
  llvm-svn: 201711
* Add back r201608, r201622, r201624 and r201625 (Rafael Espindola, 2014-02-19, 1 file, -0/+29)
  r201608 made llvm correctly handle private globals with MachO. r201622 fixed a bug in it, and r201624 and r201625 were changes for using private linkage, assuming that llvm would do the right thing. They all got reverted because r201608 introduced a crash in LTO. This patch includes a fix for that. The issue was that TargetLoweringObjectFile now has to be initialized before we can mangle names of private globals. This is trivially true during the normal codegen pipeline (the asm printer does it), but LTO has to do it manually.
  llvm-svn: 201700
* Revert r201622 and r201608. (Daniel Jasper, 2014-02-19, 1 file, -29/+0)
  This causes the LLVMgold plugin to segfault. More information in the replies to r201608.
  llvm-svn: 201669
* Avoid an infinite cycle with private linkage and -f{data|function}-sections. (Rafael Espindola, 2014-02-19, 1 file, -2/+3)
  When outputting an object we check its section to find its name, but when looking for the section with -ffunction-sections we look for the symbol name. Break the loop by requesting a name with the private prefix when constructing the section name. This matches the behavior before r201608.
  llvm-svn: 201622
* Fix PR18743. (Rafael Espindola, 2014-02-18, 1 file, -0/+28)
  The IR

    @foo = private constant i32 42

  is valid, but before this patch we would produce an invalid MachO from it. It was invalid because it would use an L label in a section where the linker needs the labels in order to atomize it.

  One way of fixing it would be to just reject this IR in the backend, but that would not be very front end friendly. What this patch does instead is use an 'l' prefix in sections where we know the linker requires symbols for atomizing them. This allows frontends to just use private and not worry about which sections the globals go to or how the linker handles them.

  One small issue with this strategy is that now a symbol name depends on the section, which is not available before codegen. This is not a problem in practice, because it only happens with private linkage, which will be ignored by the non-codegen users (llvm-nm and llvm-ar).
  llvm-svn: 201608
* Rename some member variables from TD to DL. (Rafael Espindola, 2014-02-18, 1 file, -5/+5)
  TargetData was renamed DataLayout back in r165242.
  llvm-svn: 201581
* TargetLowering: n * r where n > 2 should be an illegal addressing mode (Tom Stellard, 2014-02-14, 1 file, -0/+2)
  llvm-svn: 201433
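  A simplified reconstruction of the default scale handling after this change (a sketch, not the verbatim source):

    // Sketch of TargetLoweringBase::isLegalAddressingMode's scale check.
    bool isLegalScaleSketch(const TargetLowering::AddrMode &AM) {
      switch (AM.Scale) {
      case 0: return true;            // reg + imm
      case 1: return true;            // reg + reg (+ imm)
      case 2: return !AM.HasBaseReg;  // bare 2*r can be folded as r + r
      default: return false;          // n*r with n > 2 is now rejected
      }
    }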
* Disable compare sinking in CodeGenPrepare when multiple condition registers are available (Hal Finkel, 2014-01-02, 1 file, -0/+1)
  As noted in the comment above CodeGenPrepare::OptimizeInst, which aggressively sinks compares to reduce pressure on the condition register(s), for targets such as PowerPC with multiple condition registers this may not be the right thing to do. This adds a HasMultipleConditionRegisters boolean to TLI, and CodeGenPrepare::OptimizeInst is skipped when HasMultipleConditionRegisters is true. This functionality will be used by the PowerPC backend in an upcoming commit. Especially when the PowerPC backend starts tracking individual condition register bits as separate allocatable entities (which will happen in that upcoming commit), this sinking in CodeGenPrepare::OptimizeInst is significantly suboptimal.
  llvm-svn: 198354
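  A sketch of the target-side opt-in; the setter spelling is assumed:

    // Hypothetical PowerPC-style constructor code: with several condition
    // register fields, keep compares near their defs rather than sinking
    // them to their uses.
    setHasMultipleConditionRegisters(); // assumed setter for the new flag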
* Remove the 's' DataLayout specification (Rafael Espindola, 2014-01-01, 1 file, -1/+1)
  During the years there have been some attempts at figuring out how to align byval arguments. A look at the commit log suggests that they were:
  - Use the ABI alignment.
  - When that was not sufficient for x86-64, I added the 's' specification to DataLayout.
  - When that was not sufficient, Evan added the virtual getByValTypeAlignment.
  - When even that was not sufficient, we just got the FE to add the alignment to the byval.

  This patch is just a simple cleanup that removes my first attempt at fixing the problem. I also added an AArch64 implementation of getByValTypeAlignment to make sure this patch is a nop. I also left the 's' parsing for backward compatibility. I will send a short email to llvmdev about the change for anyone maintaining an out of tree target.
  llvm-svn: 198287
* Remove unused variable from r195944. (Lang Hames, 2013-11-29, 1 file, -1/+0)
  llvm-svn: 195945
* Refactor a lot of patchpoint/stackmap related code to simplify it and make it target independent. (Lang Hames, 2013-11-29, 1 file, -0/+56)
  Most of the x86-specific stackmap/patchpoint handling was necessitated by the use of the native address-mode format for frame index operands. PEI has now been modified to treat stackmap/patchpoint similarly to DEBUG_INFO, allowing us to use a simple, platform-independent register/offset pair for frame indexes on stackmaps/patchpoints.

  Notes:
  - Folding is now platform independent and automatically supported.
  - Emitting patchpoints with direct memory references now just involves calling the TargetLoweringBase::emitPatchPoint utility method from the target's XXXTargetLowering::EmitInstrWithCustomInserter method. (See X86TargetLowering for an example.)
  - No more ugly platform-specific operand parsers.

  This patch shouldn't change the generated output for X86.
  llvm-svn: 195944
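  For illustration, the usage pattern the notes describe (a sketch; surrounding cases omitted):

    MachineBasicBlock *
    X86TargetLowering::EmitInstrWithCustomInserter(MachineInstr *MI,
                                                   MachineBasicBlock *BB) const {
      switch (MI->getOpcode()) {
      case TargetOpcode::STACKMAP:
      case TargetOpcode::PATCHPOINT:
        // Platform-independent frame-index rewriting and folding live in
        // the TargetLoweringBase utility.
        return emitPatchPoint(MI, BB);
      default:
        llvm_unreachable("unexpected custom-inserter opcode");
      }
    }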