path: root/llvm/lib/Target
Commit log: each entry shows the commit message (author, date; files changed, lines -deleted/+added), followed by the commit body and its llvm-svn revision.
...
* After r147827 and r147902, it's now possible for unallocatable registers to be live across BBs before register allocation. (Evan Cheng, 2012-01-14; 1 file, -0/+5)
  This miscompiled 197.parser when a cmp + b were optimized to a cbnz instruction even though the CPSR def is live into a successor:
      cbnz r6, LBB89_12
      ...
    LBB89_12:
      ble LBB89_1
  The fix consists of two parts: 1) teach LiveVariables that some unallocatable registers might be live-outs, and don't mark their last use as a kill if they are; 2) the ARM constant-pool island pass shouldn't form cbz/cbnz if the conditional branch does not kill CPSR.
  rdar://10676853
  llvm-svn: 148168
* Fix pasto from r146196. (Chad Rosier, 2012-01-14; 1 file, -2/+2)
  llvm-svn: 148167
* Use RegisterTuples to generate pseudo-registers. (Jakob Stoklund Olesen, 2012-01-13; 4 files, -45/+51)
  The QQ and QQQQ registers are not 'real'; they are pseudo-registers used to model some vld and vst instructions. This makes the call clobber lists longer, but I intend to get rid of those soon.
  llvm-svn: 148151
* Revert r148131; it was committed before it was ready. (Devang Patel, 2012-01-13; 1 file, -46/+40)
  llvm-svn: 148134
* Refactor. (Devang Patel, 2012-01-13; 1 file, -40/+46)
  llvm-svn: 148131
* Convert SHUFPD with the same register for both sources to PSHUFD if it would prevent a register copy. (Craig Topper, 2012-01-13; 2 files, -1/+20)
  Similar to SHUFPS, but requires the mask to be converted.
  llvm-svn: 148112
* Use v8i32 as the optimal memory type over v8f32 if AVX2 is enabled, similar to SSE2 vs. SSE1. (Craig Topper, 2012-01-13; 1 file, -3/+6)
  llvm-svn: 148109
* Make X86 instruction selection use 256-bit VPXOR for build_vector of all ones if AVX2 is enabled. (Craig Topper, 2012-01-13; 4 files, -37/+62)
  This gives the ExeDepsFix pass a chance to choose FP vs. int as appropriate. Also use v8i32 as the type for getZeroVector if AVX2 is enabled, consistent with SSE2 preferring v4i32.
  llvm-svn: 148108
* Add patterns for v16i16 and v32i8 immAllZerosV to select VPXOR, matching v4i64 and v8i32. (Craig Topper, 2012-01-13; 1 file, -0/+8)
  llvm-svn: 148106
* Added the MachineSchedulerPass skeleton. (Andrew Trick, 2012-01-13; 1 file, -1/+10)
  llvm-svn: 148105
* Use an 8i32 constant pool entry for converting AVX2_SETALLONES. Possibly fixes PR11750. (Craig Topper, 2012-01-13; 1 file, -0/+2)
  llvm-svn: 148101
* Fix typo in PerformAddCombine that caused any vector type to be checked for horizontal add/sub if AVX2 is enabled. (Craig Topper, 2012-01-13; 1 file, -1/+1)
  This caused an assert to fail for non-128/256-bit vectors when done before type legalization. Fixes PR11749.
  llvm-svn: 148096
* Fix off-by-one error. (Bill Wendling, 2012-01-13; 1 file, -1/+1)
  llvm-svn: 148077
* Fix the code that was WRONG. (Bill Wendling, 2012-01-12; 1 file, -13/+6)
  The registers are placed into the saved-registers list in reverse order, which is why the original loop was written to iterate backwards.
  llvm-svn: 148064
* Fixed a bug in LowerVECTOR_SHUFFLE that caused an assertion failure. (Elena Demikhovsky, 2012-01-12; 1 file, -1/+5)
      llc: X86ISelLowering.cpp:6480: llvm::SDValue llvm::X86TargetLowering::LowerVECTOR_SHUFFLE(llvm::SDValue, llvm::SelectionDAG&) const: Assertion `V1.getOpcode() != ISD::UNDEF && "Op 1 of shuffle should not be undef"' failed.
  Added a test.
  llvm-svn: 148044
* Support segmented stacks on 64-bit FreeBSD. (Rafael Espindola, 2012-01-12; 1 file, -2/+8)
  This patch uses the tcb_spare field in the tcb structure to store the info.
  Patch by Jyun-Yan You.
  llvm-svn: 148041
* Support segmented stacks on win32. (Rafael Espindola, 2012-01-12; 1 file, -7/+17)
  Uses the pvArbitrary slot of the TIB, which is reserved for applications. We only support frames with a static size.
  llvm-svn: 148040
* Rename X86ATTAsmParser -> X86AsmParser. (Devang Patel, 2012-01-12; 2 files, -19/+18)
  We are using one parser to parse AT&T- as well as Intel-style syntax.
  llvm-svn: 148032
* After Jakob's r147938, exception handling on i386 was completely broken. (Benjamin Kramer, 2012-01-12; 1 file, -0/+7)
  Restore the (obviously wrong) behavior from before r147938 without relying on undefined behavior, and add a fat FIXME note. This should fix nightly tester failures.
  llvm-svn: 148030
* Fix a bug in the AVX 256-bit shuffle code in cases where the splat element is on the boundary of two 128-bit vectors. (Nadav Rotem, 2012-01-12; 1 file, -1/+1)
  The attached testcase was stuck in an endless loop.
  llvm-svn: 148027
* X86: Generalize the x << (y & const) optimization to also catch masks with more bits set than 31 or 63. (Benjamin Kramer, 2012-01-12; 1 file, -21/+25)
  llvm-svn: 148024
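  For context: x86 shift instructions only read the low 5 (32-bit) or 6 (64-bit) bits of the shift amount, so an explicit mask that keeps all of those bits is redundant. A minimal C sketch (hand-written here, not from the patch) of the kind of source this broader check catches:

      #include <stdint.h>

      /* 255 keeps all 6 bits the hardware reads, so the AND can be folded
       * away and a plain SHL emitted, even though 255 has more bits set
       * than the 31/63-only masks the old check accepted. */
      uint64_t shl_masked(uint64_t x, uint64_t y) {
          return x << (y & 255);
      }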
* Add a predicate method check to match memory operand size, if available. (Devang Patel, 2012-01-12; 2 files, -17/+96)
  In AT&T-style asm syntax the memory operand size is derived from the suffix attached to the mnemonic. In Intel-style asm syntax it is part of the memory operand, hence a predicate method check is required to select the appropriate instruction.
  llvm-svn: 148006
* Add Intel-style operand parser skeleton. (Devang Patel, 2012-01-12; 1 file, -1/+97)
  This is a work in progress.
  llvm-svn: 148002
* Switch all of the uses of my InsertDAGNode helper to follow the exact same pattern. (Chandler Carruth, 2012-01-12; 1 file, -8/+22)
  We already had this pattern in a few places, but others tried to make a rough approximation of an actual DAG structure. As not everywhere went to this trouble, nothing could rely on this being done. In fact, I've checked all references to these node IDs, and the ones that use the topo-sort properties are actually satisfied with a strict weak ordering; the requirement appears to be that Use >= Def. I've added a big blurb of comments to this bit of the transform to clarify why the order is so important for the next reader of the code.
  I'm starting with this change as it is very small and trivially reverted if something breaks, or if the >= above really does need to be >. If that proves the case, we can hide the problem by reverting this patch, but the problem exists elsewhere as well, and so a more comprehensive solution will be needed.
  llvm-svn: 148001
* Fix assert. (Eric Christopher, 2012-01-11; 1 file, -2/+2)
  llvm-svn: 147966
* Support segmented stacks on Mac. (Rafael Espindola, 2012-01-11; 2 files, -18/+68)
  This uses TLS slot 90, which actually belongs to JavaScriptCore. We only support frames with a static size.
  Patch by Brian Anderson.
  llvm-svn: 147960
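  For readers new to segmented stacks, a hedged C sketch of what the emitted prologue check behaves like; the names here (stack_limit, needs_new_segment) are illustrative stand-ins, not from the patch, and the real code reads a raw TLS slot and calls the __morestack runtime routine:

      #include <stdint.h>

      /* Stand-in for the stack limit cached in a TLS slot (slot 90 on Mac,
       * per the commit). */
      static uintptr_t stack_limit = 0x1000;

      static int needs_new_segment(uintptr_t sp, uintptr_t frame_size) {
          /* Would the new frame dip below the cached limit? If so, the real
           * prologue calls __morestack to grab a new stack segment. */
          return sp - frame_size < stack_limit;
      }

      int main(void) {
          return needs_new_segment(0x2000, 0x800) ? 1 : 0; /* 0: frame fits */
      }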
* Generate the segmented stack prologue for fastcc too. (Rafael Espindola, 2012-01-11; 1 file, -1/+2)
  Patch by Brian Anderson.
  llvm-svn: 147958
* Revert r147945, which disabled an addressing mode transformation. (Chandler Carruth, 2012-01-11; 1 file, -4/+0)
  I had hoped this would revive one of the llvm-gcc selfhost build bots, but it didn't, so it doesn't appear that my transform is the culprit. If anyone else is seeing failures, please let me know!
  llvm-svn: 147957
* Use unsigned comparison in segmented stack prologue. (Rafael Espindola, 2012-01-11; 1 file, -1/+1)
  This is a comparison of two addresses, and GCC does the comparison unsigned.
  Patch by Brian Anderson.
  llvm-svn: 147954
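  A tiny self-contained demonstration (hand-written, not from the patch) of why signedness matters when comparing addresses: on a 32-bit target, a stack pointer in the upper half of the address space looks negative to a signed compare.

      #include <stdint.h>
      #include <stdio.h>

      int main(void) {
          uint32_t sp = 0xC0000000u, limit = 0x00100000u;
          printf("unsigned says need-more-stack: %d\n", sp < limit); /* 0 */
          printf("signed says need-more-stack:   %d\n",
                 (int32_t)sp < (int32_t)limit);                      /* 1 */
          return 0;
      }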
* Explicitly set the scale to 1 on some segstack prologue instrs. (Rafael Espindola, 2012-01-11; 2 files, -4/+4)
  Patch by Brian Anderson.
  llvm-svn: 147952
* Add XOP intrinsics and tests. (Jan Sjödin, 2012-01-11; 1 file, -73/+662)
  llvm-svn: 147949
* Fix a bug in the lowering of BUILD_VECTOR for AVX. (Nadav Rotem, 2012-01-11; 1 file, -4/+2)
  SCALAR_TO_VECTOR does not zero untouched elements; use INSERT_VECTOR_ELT instead.
  llvm-svn: 147948
* Disable the transformation I added in r147936 to see if it fixes some strange build bot failures that look like a miscompile into an infloop. (Chandler Carruth, 2012-01-11; 1 file, -0/+4)
  I'll investigate this tomorrow, but I'd both like to know whether my patch is the culprit and get the bots back to green.
  llvm-svn: 147945
* Hoist a really redundant code pattern into a helper function, and delete lots of lines of code. (Chandler Carruth, 2012-01-11; 1 file, -80/+29)
  No functionality changed.
  llvm-svn: 147942
* Simplify the AND-rooted mask+shift checking code to match that of the SRL-rooted code. (Chandler Carruth, 2012-01-11; 1 file, -8/+6)
  llvm-svn: 147941
* Unify the interface of the three mask+shift transform helpers, and factor the differences that were hiding in one of them into its other caller, the SRL handling code. (Chandler Carruth, 2012-01-11; 1 file, -26/+34)
  No change in behavior.
  llvm-svn: 147940
* Clarify and make explicit some of the requirements for transforming mask+shift pairs at the beginning of the ISD::AND case block, and then hoist the final pattern into a helper function, simplifying and reflowing it appropriately. (Chandler Carruth, 2012-01-11; 1 file, -52/+64)
  This should have no observable behavior change, but several simplifications fell out of it, such as directly computing the new mask constant.
  llvm-svn: 147939
* Fix undefined code and reenable test case. (Jakob Stoklund Olesen, 2012-01-11; 1 file, -2/+2)
  I don't think the compact encoding code is right, but at least it has defined behavior now.
  llvm-svn: 147938
* Hoist the logic to transform shift+mask combinations into sub-register extracts and scaled addressing modes into its own helper function. (Chandler Carruth, 2012-01-11; 1 file, -56/+68)
  No functionality changed here, just hoisting and layout fixes falling out of that hoisting.
  llvm-svn: 147937
* Teach the X86 instruction selection to do some heroic transforms to detect a pattern which can be implemented with a small 'shl' embedded in the addressing mode scale. (Chandler Carruth, 2012-01-11; 1 file, -0/+146)
  This happens in real code as follows:

      unsigned x = my_accelerator_table[input >> 11];

  Here we have some lookup table that we look into using the high bits of 'input'. Each entity in the table is 4 bytes, which means this implicitly gets turned into (once lowered out of a GEP):

      *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));

  The shift right followed by a shift left is canonicalized to a smaller shift right and masking off the low bits. That hides the shift right which x86 has an addressing mode designed to support. We now detect masks of this form, and produce the longer shift right followed by the proper addressing mode. In addition to saving a (rather large) instruction, this also reduces stalls in Intel chips on benchmarks I've measured.
  In order for all of this to work, one part of the DAG needs to be canonicalized *still further* than it currently is. This involves removing pointless 'trunc' nodes between a zextload and a zext. Without that, we end up generating spurious masks and hiding the pattern.
  llvm-svn: 147936
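  To make the canonicalization concrete, a small self-check (hand-written, not from the commit) of the rewrite the new code has to see through: the shift pair (input >> 11) << 2 becomes the shorter shift plus mask (input >> 9) & ~3u.

      #include <assert.h>
      #include <stdio.h>

      int main(void) {
          /* Verify the two forms agree; the masked one hides the >> 11
           * that x86's scaled addressing mode could otherwise absorb. */
          for (unsigned input = 0; input < 1u << 20; ++input)
              assert(((input >> 11) << 2) == ((input >> 9) & ~3u));
          puts("equivalent");
          return 0;
      }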
* Add big-endian Mips support. Based on a patch by Jack Carter. (Rafael Espindola, 2012-01-11; 3 files, -16/+20)
  llvm-svn: 147924
* Add the skeleton of an asm parser for Mips. (Rafael Espindola, 2012-01-11; 7 files, -2/+114)
  llvm-svn: 147923
* ARM Ld/St Optimizer fix. (Andrew Trick, 2012-01-11; 1 file, -3/+4)
  Allow LDRD to be formed from pairs with different LDR encodings. This was the original intention of the pass. Somewhere along the way, the LDR opcodes were refined, which broke the optimization. We really don't care what the original opcodes are as long as they both map to the same LDRD and the immediate still fits.
  Fixes rdar://10435045: ARMLoadStoreOptimization cannot handle mixed LDRi8/LDRi12.
  llvm-svn: 147922
* Fixed order of operands in comment to match code. (Lang Hames, 2012-01-10; 1 file, -1/+1)
  llvm-svn: 147890
* Default stack alignment for 32-bit x86 should be 4 bytes, not 8 bytes. (Joerg Sonnenberger, 2012-01-10; 1 file, -1/+1)
  Add a test that checks the stack alignment of a simple function for Darwin, Linux, and NetBSD in 32-bit and 64-bit mode.
  llvm-svn: 147888
* Consider unknown alignment caused by OptimizeThumb2Instructions(). (Jakob Stoklund Olesen, 2012-01-10; 1 file, -4/+25)
  This function runs after all constant islands have been placed, and may shrink some instructions to their 2-byte forms. This can actually cause some constant pool entries to move out of range because of growing alignment padding. Treat instructions that may be shrunk the same as inline asm: they erode the known alignment bits.
  Also reinstate an old assertion in verify(); it is correct now that basic block offsets include alignments.
  Add a single large test case that will hopefully exercise many parts of the constant island pass.
  <rdar://problem/10670199>
  llvm-svn: 147885
* Add missing VEX predicates to VMOVSDto64rr/VMOVSDto64mr. (Chad Rosier, 2012-01-10; 1 file, -2/+3)
  This fixes a few failing test cases on our internal AVX nightly tester.
  rdar://10663637
  llvm-svn: 147881
* ARM: fix VST2 pseudo-lowering for fixed vs. register update forms. (Jim Grosbach, 2012-01-10; 3 files, -8/+8)
  rdar://10663487
  llvm-svn: 147876
* Fix some leftover "control reaches end of non-void function" warnings. (Benjamin Kramer, 2012-01-10; 3 files, -2/+4)
  llvm-svn: 147874
* Move default case for covered enum outside of switch. (Richard Smith, 2012-01-10; 1 file, -1/+1)
  llvm-svn: 147870