path: root/llvm
Commit message | Author | Age | Files | Lines
* [x86] Teach the x86 DAG combiner to form MOVSLDUP and MOVSHDUP | Chandler Carruth | 2014-09-15 | 5 | -30/+135
| instructions when it finds an appropriate pattern.
| These are lovely instructions, and it's a shame to not use them. =] They are fast, and can have loads folded into their operands, etc.
| I've also plumbed the comment shuffle decoding through the various layers so that the test cases are printed nicely.
| llvm-svn: 217758
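The "appropriate pattern" mentioned above can be illustrated with a small model. MOVSLDUP duplicates the even-indexed 32-bit lanes and MOVSHDUP the odd-indexed ones, so the matching 4-lane shuffle masks are (0, 0, 2, 2) and (1, 1, 3, 3). The sketch below is not the LLVM implementation; the helper names are hypothetical, and the use of -1 for an undef lane is an assumption made for illustration.

```python
# Illustrative model only; helper names are hypothetical, not LLVM APIs.
# -1 marks an undef (don't-care) lane in a shuffle mask.
MOVSLDUP_MASK = (0, 0, 2, 2)  # duplicates the even-indexed lanes
MOVSHDUP_MASK = (1, 1, 3, 3)  # duplicates the odd-indexed lanes


def mask_matches(mask, pattern):
    """True if every defined lane of `mask` equals the same lane of `pattern`."""
    return len(mask) == len(pattern) and all(
        m == -1 or m == p for m, p in zip(mask, pattern)
    )


def pick_dup_instruction(mask):
    """Name the duplicating instruction a 4-lane unary mask fits, or None."""
    if mask_matches(mask, MOVSLDUP_MASK):
        return "movsldup"
    if mask_matches(mask, MOVSHDUP_MASK):
        return "movshdup"
    return None


def apply_shuffle(vec, mask):
    """Apply a unary shuffle mask; undef lanes are reported as None."""
    return [None if m == -1 else vec[m] for m in mask]
```

For example, `pick_dup_instruction((1, -1, 3, 3))` fits the MOVSHDUP pattern because the undef lane is free to take any value.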
* Fix a non-virtual destructor warning introduced in r217747. | Frederic Riss | 2014-09-15 | 1 | -0/+2
| llvm-svn: 217756
* [x86] Undo a flawed transform I added to form UNPCK instructions when | Chandler Carruth | 2014-09-15 | 7 | -102/+101
| AVX is available, and generally tidy up things surrounding UNPCK formation.
| Originally, I was thinking that the only advantage of PSHUFD over UNPCK instruction variants was its free copy, and otherwise we should use the shorter-encoding UNPCK instructions. This isn't right, though: there is a larger advantage of being able to fold a load into the operand of a PSHUFD. For UNPCK, the operand *must* be in a register so it can be the second input.
| This removes the UNPCK formation in the target-specific DAG combine for v4i32 shuffles. It also lifts the v8 and v16 cases out of the AVX-specific check as they are potentially replacing multiple instructions with a single instruction and so should always be valuable. The floating point checks are simplified accordingly.
| This also adjusts the formation of PSHUFD instructions to attempt to match the shuffle mask to one which would fit an UNPCK instruction variant. This was originally motivated to allow it to match the UNPCK instructions in the combiner, but clearly won't now.
| Eventually, we should add a MachineCombiner pass that can form UNPCK instructions post-RA when the operand is known to be in a register and thus there is no loss.
| llvm-svn: 217755
* [x86] Teach the new vector shuffle lowering to use 'punpcklwd' and | Chandler Carruth | 2014-09-15 | 2 | -26/+51
| 'punpckhwd' instructions when suitable rather than falling back to the generic algorithm.
| While we could canonicalize to these patterns late in the process, that wouldn't help when the freedom to use them is only visible during initial lowering when undef lanes are well understood. This, it turns out, is very important for matching the shuffle patterns that are used to lower sign extension. Fixes a small but relevant regression in gcc-loops with the new lowering.
| When I changed this I noticed that several 'pshufd' lowerings became unpck variants. This is bad because it removes the ability to freely copy in the same instruction. I've adjusted the widening test to handle undef lanes correctly and now those will correctly continue to use 'pshufd' to lower. However, this caused a bunch of churn in the test cases. No functional change, just churn.
| Both of these changes are part of addressing a general weakness in the new lowering -- it doesn't sufficiently leverage undef lanes. I've at least a couple of patches that will help there at least in an academic sense.
| llvm-svn: 217752
* Fix ambiguous typedef introduced in r217747. | Frederic Riss | 2014-09-15 | 1 | -2/+2
| Use fully qualified name inside a typedef from llvm::iterator_range<...> to iterator_range. This is reported (rightly I think) by GCC as an ambiguous name redefinition.
| Hope this fixes the buildbots.
| llvm-svn: 217751
* InstSimplify: Simplify trivial and/or of icmps | David Majnemer | 2014-09-15 | 2 | -0/+234
| Some ICmpInsts, when anded/ored with another ICmpInst, trivially reduce to true or false depending on whether or not all integers or no integers satisfy the intersected/unioned range.
| This sort of trivial-looking code can come about when InstCombine performs a range reduction-type operation on sdiv and the like.
| This fixes PR20916.
| llvm-svn: 217750
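The idea of reducing an and/or of two comparisons by looking at the intersected or unioned range can be shown with a brute-force model. This is only a sketch under stated assumptions: it enumerates all 8-bit unsigned values instead of using range arithmetic, it handles only comparisons of one variable against constants, and the function names are hypothetical rather than taken from InstSimplify.

```python
# Brute-force illustration over 8-bit unsigned integers; not how LLVM does it.
BITS = 8
UNIVERSE = frozenset(range(1 << BITS))

PREDICATES = {
    "eq": lambda x, c: x == c,
    "ne": lambda x, c: x != c,
    "ult": lambda x, c: x < c,
    "ule": lambda x, c: x <= c,
    "ugt": lambda x, c: x > c,
    "uge": lambda x, c: x >= c,
}


def satisfying(pred, const):
    """Set of all values x for which `x <pred> const` holds."""
    return frozenset(x for x in UNIVERSE if PREDICATES[pred](x, const))


def simplify_and(lhs, rhs):
    """Return False/True if the conjunction is constant, else None."""
    both = satisfying(*lhs) & satisfying(*rhs)  # intersected range
    if not both:
        return False
    if both == UNIVERSE:
        return True
    return None


def simplify_or(lhs, rhs):
    """Return True/False if the disjunction is constant, else None."""
    either = satisfying(*lhs) | satisfying(*rhs)  # unioned range
    if either == UNIVERSE:
        return True
    if not either:
        return False
    return None
```

With this model, `x < 5 and x > 10` folds to false because no integer satisfies both, while `x < 10 or x > 5` folds to true because every integer satisfies at least one side.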
* Fix DebugInfo replaceAllUsesWith. | Frederic Riss | 2014-09-15 | 1 | -1/+1
| Summary: replaceAllUsesWith had been modified to allow a DbgNode value to be replaced by itself. In that case a new node is created by copying the current DbgNode and the copy is used as replacement value. When that copying happens, the value stored in this->DbgNode at the end of RAUW would be a reference to the Node that has just been deleted.
| This doesn't produce any bug right now, because the DI node on which we call RAUW won't be used again.
| Reviewers: dblaikie, echristo, aprantl
| Subscribers: llvm-commits
| Differential Revision: http://reviews.llvm.org/D5326
| llvm-svn: 217749
* Move replaceAllUsesWith() from DIType to DIDescriptor. | Frederic Riss | 2014-09-15 | 2 | -7/+7
| RAUW was only used on DIType to merge declarations and full definitions of types. In order to support the same functionality for functions and global variables, move the function up the DI type hierarchy to the common parent of DIType, DISubprogram and DIVariable, which is DIDescriptor.
| This functionality will be exercised when we add the code to emit imported declarations for forward declared functions/variables.
| Reviewers: echristo, dblaikie, aprantl
| Subscribers: llvm-commits
| Differential Revision: http://reviews.llvm.org/D5325
| llvm-svn: 217748
* Introduce the DWARFUnitSection abstraction. | Frederic Riss | 2014-09-15 | 6 | -44/+69
| A DWARFUnitSection is the collection of Units that have been extracted from the same debug section.
| By embedding a reference to their DWARFUnitSection in each unit, the DIEs will be able to resolve inter-unit references by interrogating their Unit's DWARFUnitSection.
| This is a minimal patch where the DWARFUnitSection is-a SmallVector of Units, thus exposing exactly the same interface as before. Follow-up patches might change from inheritance to composition in order to expose only the wanted DWARFUnitSection abstraction.
| Differential Revision: http://reviews.llvm.org/D5310
| llvm-svn: 217747
* llvm-cov: Clean up some redundancy in the view API (NFC) | Justin Bogner | 2014-09-15 | 3 | -44/+29
| This removes the need to pass a starting and ending line when creating a SourceCoverageView, since these are easy to determine.
| llvm-svn: 217746
* llvm-cov: Simplify CounterMappingRegion, pushing logic to its user | Justin Bogner | 2014-09-15 | 2 | -18/+17
| A single function in SourceCoverageDataManager was the only user of some of the comparisons in CounterMappingRegion, and at this point we know that only one file is relevant. This lets us use slightly simpler logic directly in the client.
| llvm-svn: 217745
* [x86] Teach the new vector shuffle lowering to use BLENDPS and BLENDPD. | Chandler Carruth | 2014-09-14 | 4 | -37/+134
| These are super simple. They even take precedence over crazy instructions like INSERTPS because they have very high throughput on modern x86 chips.
| I still have to teach the integer shuffle variants about this to avoid so many domain crossings. However, due to the particular instructions available, that's a touch more complex and so a separate patch.
| Also, the backend doesn't seem to realize it can commute blend instructions by negating the mask. That would help remove a number of copies here. Suggestions on how to do this welcome, it's an area I'm less familiar with.
| llvm-svn: 217744
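The remark about commuting a blend by negating the mask can be checked with a small model. A blend picks each lane from one of two inputs according to one bit of an immediate mask, so swapping the inputs and inverting the mask bits selects the same lanes. The sketch below assumes a 4-lane blend in the style of BLENDPS (bit i set means lane i comes from the second input); the function names are hypothetical.

```python
# Illustrative model of an immediate-mask blend; names are hypothetical.
def blend(a, b, mask):
    """Lane i comes from b when bit i of mask is set, otherwise from a."""
    return [b[i] if (mask >> i) & 1 else a[i] for i in range(len(a))]


def commute_blend_mask(mask, lanes):
    """Mask that gives the same result once the two inputs are swapped."""
    return ~mask & ((1 << lanes) - 1)


a = [1.0, 2.0, 3.0, 4.0]
b = [5.0, 6.0, 7.0, 8.0]
# Swapping the operands and inverting the mask leaves the result unchanged.
same = all(
    blend(a, b, m) == blend(b, a, commute_blend_mask(m, 4)) for m in range(16)
)
```

Being able to swap operands this way is what would let a register allocator avoid a copy when only the "wrong" operand is available in the destination register.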
* llvm/test/CodeGen/X86/vec_shuffle-38.ll: Add explicit -mtriple=x86_64-unknown to avoid incompatibility with Win32. | NAKAMURA Takumi | 2014-09-14 | 1 | -1/+1
| llvm-svn: 217742
* [x86] Add an SSE41 mode to this test. Nothing interesting here, it's the | Chandler Carruth | 2014-09-14 | 1 | -0/+19
| same as SSE3.
| llvm-svn: 217741
* [x86] Switch this test to use an ALL prefix with special SSE2 and SSE3 | Chandler Carruth | 2014-09-14 | 1 | -120/+129
| variants where significant. This will make it more obvious what is happening when we start using blends in SSE41.
| llvm-svn: 217740
* [x86] Add some test cases where we should emit blendpd in SSE4.1. No | Chandler Carruth | 2014-09-14 | 1 | -0/+15
| actual change yet though.
| llvm-svn: 217739
* [x86] Teach the vector combiner that picks a canonical shuffle form to | Chandler Carruth | 2014-09-14 | 9 | -31/+88
| support transforming the forms from the new vector shuffle lowering to use 'movddup' when appropriate.
| A bunch of the cases where we actually form 'movddup' don't actually show up in the test results because something even later than DAG legalization maps them back to 'unpcklpd'. If this shows back up as a performance problem, I'll probably chase it down, but it is at least an encoded size loss. =/
| To make this work, also always do this canonicalizing step for floating point vectors where the baseline shuffle instructions don't provide any free copies of their inputs. This also causes us to canonicalize unpck[hl]pd into mov{hl,lh}ps (resp.) which is a nice encoding space win.
| There is one test which is "regressed" by this: extractelement-load. In the test case where the optimization under test *fails*, the exact instruction pattern which results is slightly different. This should probably be fixed by having the appropriate extract formed earlier in the DAG, but that would defeat the purpose of the test.... If this test case is critically important for anyone, please let me know and I'll try to work on it. The prior behavior was actually contrary to the comment in the test case and seems likely to have been an accident.
| llvm-svn: 217738
* In DwarfEHPrepare, after all passes are run, RewindFunction may be a dangling | Yaron Keren | 2014-09-14 | 1 | -0/+5
| pointer to a dead function. To make sure it's valid, doFinalization nullptrs RewindFunction just like the constructor and so it will be found on next run.
| llvm-svn: 217737
* R600/SI: Fix broken check lines | Matt Arsenault | 2014-09-14 | 1 | -2/+2
| llvm-svn: 217736
* [A57FPLoadBalancing] Modify r217689 - actually we do need to check defs | James Molloy | 2014-09-14 | 1 | -6/+6
| ... Just make sure we check uses first so we see the kill first. It turns out ignoring defs gives some pretty nasty runtime failures. I'm certain this is the fix but I'm still reducing a testcase.
| llvm-svn: 217735
* [FastISel][AArch64] Add support for non-native types for logical ops. | Juergen Ributzka | 2014-09-13 | 2 | -36/+224
| Extend the logical ops selection to also support non-native types such as i1, i8, and i16.
| Fixes rdar://problem/18330589.
| llvm-svn: 217732
* Add control of function merging to the PMBuilder. | Nick Lewycky | 2014-09-13 | 2 | -0/+11
| llvm-svn: 217731
* Fix typo | Matt Arsenault | 2014-09-13 | 1 | -4/+4
| llvm-svn: 217730
* Simplify code. No functionality change. | Benjamin Kramer | 2014-09-13 | 1 | -15/+3
| llvm-svn: 217726
* [AArch64] Update test case to pass with post-RA MI scheduler. | Chad Rosier | 2014-09-13 | 1 | -1/+1
| Check that the post RA scheduler is being skipped, regardless of whether it's the top-down list latency scheduler or the post-RA MI scheduler.
| llvm-svn: 217725
* [llvm-objdump] Use PRIX64 with format() | Nick Kledzik | 2014-09-13 | 1 | -1/+2
| llvm-svn: 217724
* Stop suppressing error messages in test case to see why one buildbot is failing | Nick Kledzik | 2014-09-12 | 1 | -1/+1
| llvm-svn: 217715
* [AArch64] Don't enable the post-RA MI scheduler at OptNone. | Chad Rosier | 2014-09-12 | 1 | -1/+2
| Hopefully, this will appease the bots.
| llvm-svn: 217712
* Allow targets to custom legalize vector insertion and extraction. | Owen Anderson | 2014-09-12 | 1 | -0/+8
| llvm-svn: 217711
* [llvm-objdump] support -rebase option for mach-o to dump rebasing info | Nick Kledzik | 2014-09-12 | 7 | -1/+356
| Similar to my previous -exports-trie option, the -rebase option dumps info from the LC_DYLD_INFO load command. The rebasing info is a list of the locations that dyld needs to adjust if a mach-o image is not loaded at its preferred address. Since ASLR is now the default, images almost never load at their preferred address, and thus need to be rebased by dyld.
| llvm-svn: 217709
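What "rebasing" means for the listed locations can be sketched as follows. This is a simplified model, not dyld's or llvm-objdump's code: memory is a plain dictionary from location to stored 64-bit pointer value, and the function and parameter names are hypothetical. Each listed location holds a pointer computed for the preferred load address, so it is adjusted by the difference between the actual and preferred addresses.

```python
# Simplified model of rebasing; names and the dict-based memory are assumptions.
def apply_rebases(memory, rebase_locations, preferred_base, actual_base):
    """Return a copy of `memory` with the pointer at each location slid."""
    slide = actual_base - preferred_base
    fixed = dict(memory)
    for loc in rebase_locations:
        # Keep the result within 64 bits, as a stored pointer would be.
        fixed[loc] = (fixed[loc] + slide) & 0xFFFFFFFFFFFFFFFF
    return fixed


memory = {0x2000: 0x100001000, 0x2008: 42}  # 0x2008 holds plain data
rebased = apply_rebases(memory, [0x2000], 0x100000000, 0x100004000)
```

Only the location named in the rebase list changes; the plain data value at `0x2008` is left alone, which is why the image needs an explicit list of locations.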
* llvm-profdata: Avoid undefined behaviour when reading raw profiles | Justin Bogner | 2014-09-12 | 2 | -2/+3
| The raw profiles that are generated in compiler-rt always add padding so that each profile is aligned, so we can simply treat files that don't have this property as malformed.
| Caught by Alexey's new ubsan bot. Thanks!
| llvm-svn: 217708
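The check described above amounts to rejecting a profile whose start offset is not suitably aligned, rather than reading from it. The sketch below is a hypothetical illustration: the 8-byte alignment is an assumption (the commit message does not state the value), and the function names are invented for this example.

```python
# Hypothetical sketch; the 8-byte alignment is an assumption for illustration.
ASSUMED_ALIGNMENT = 8


def is_aligned(offset, alignment=ASSUMED_ALIGNMENT):
    """True if `offset` is a multiple of `alignment`."""
    return offset % alignment == 0


def validate_profile_offset(offset, alignment=ASSUMED_ALIGNMENT):
    """Return the offset if aligned, otherwise report the file as malformed."""
    if not is_aligned(offset, alignment):
        raise ValueError("malformed raw profile: misaligned at %d" % offset)
    return offset
```

Treating misalignment as a format error is safe here only because, per the commit message, the writer always pads each profile to an aligned boundary.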
* Remove an unnecessary restriction. MIsNeedChainEdge() should be checked even when scheduler AliasAnalysis is not enabled. | Owen Anderson | 2014-09-12 | 1 | -1/+1
| A good chunk of MIsNeedChainEdge() is logic that is valid and should be applied even for targets that are not using alias analysis.
| llvm-svn: 217706
* The MCAssembler.h include isn't used. | Yaron Keren | 2014-09-12 | 1 | -1/+0
| llvm-svn: 217705
* Add an overload of getLastArgNoClaim taking two OptSpecifiers. | Ehsan Akhgari | 2014-09-12 | 2 | -0/+10
| Summary: This will be used in clang.
| Test Plan: Will be tested on the clang side.
| Reviewers: hansw
| Subscribers: llvm-commits
| Differential Revision: http://reviews.llvm.org/D5337
| llvm-svn: 217702
* FileCheckize. NFC. | Chad Rosier | 2014-09-12 | 1 | -21/+25
| llvm-svn: 217698
* Add support for le64. | JF Bastien | 2014-09-12 | 2 | -2/+10
| Summary: le64 is a generic little-endian 64-bit processor, mimicking le32.
| Depends on D5318.
| Test Plan: make check-all
| Reviewers: dschuff
| Subscribers: llvm-commits
| Differential Revision: http://reviews.llvm.org/D5319
| llvm-svn: 217697
* [AArch64] Enable post-RA MI scheduler. | Chad Rosier | 2014-09-12 | 3 | -1/+37
| Phabricator Revision: http://reviews.llvm.org/D5278
| Patch by Sanjin Sijaric!
| llvm-svn: 217693
* [A57FPLoadBalancing] Remove support for vector types | James Molloy | 2014-09-12 | 1 | -5/+0
| Vector MUL/MLAs have tied operands, which gives us extra constraints that we currently can't handle. Instead of silently doing the wrong thing, remove support, to be re-added properly later.
| llvm-svn: 217690
* [A57FPLoadBalancing] Ignore <def>s when checking if a chain may be killed. | James Molloy | 2014-09-12 | 1 | -0/+4
| Defs are seen before uses, so a def without the kill flag doesn't necessarily mean that the register is not killed on that instruction. It may be killed in a later use operand.
| llvm-svn: 217689
* [lit] Parse all strings as UTF-8 rather than ASCII. | Jordan Rose | 2014-09-12 | 5 | -15/+33
| As far as I can tell UTF-8 has been supported since the beginning of Python's codec support, and it's the de facto standard for text these days, at least for primarily-English text. This allows us to put Unicode into lit RUN lines.
| rdar://problem/18311663
| llvm-svn: 217688
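The practical difference between the two codecs can be seen with a RUN line containing a non-ASCII character. This is a small standalone sketch using only Python's built-in `bytes.decode`; the sample RUN line and the helper name `decode_or_none` are made up for illustration and are not taken from lit.

```python
# Standalone illustration; the RUN line and helper name are hypothetical.
run_line = "RUN: echo 'caf\u00e9'".encode("utf-8")  # bytes as read from a file


def decode_or_none(data, encoding):
    """Decode bytes with the given codec, or return None if it fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_ascii = decode_or_none(run_line, "ascii")  # fails on the non-ASCII byte
as_utf8 = decode_or_none(run_line, "utf-8")   # decodes the full line
```

Because ASCII is a subset of UTF-8, pure-ASCII RUN lines decode identically under both codecs, so switching the codec does not change how existing tests are read.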
* Move sys::fs::AccessMode out of @brief in the function. [-Wdocumentation] | NAKAMURA Takumi | 2014-09-12 | 1 | -1/+2
| FIXME: Annotate sys::fs::AccessMode.
| llvm-svn: 217685
* sys::fs::access(): Fix @param [-Wdocumentation] | NAKAMURA Takumi | 2014-09-12 | 1 | -1/+1
| llvm-svn: 217684
* llvm/test/CodeGen/X86/vec_ctbits.ll: Add explicit -mtriple=x86_64-unknown. It was incompatible with Win32 x64. | NAKAMURA Takumi | 2014-09-12 | 1 | -1/+1
| llvm-svn: 217683
* [A57LoadBalancing] unique_ptr-ify. | James Molloy | 2014-09-12 | 1 | -25/+20
| Thanks to David Blaikie for the in-depth review!
| llvm-svn: 217682
* [mips][microMIPS] Implement JRADDIUSP instruction | Zoran Jovanovic | 2014-09-12 | 5 | -0/+57
| Differential Revision: http://reviews.llvm.org/D5046
| llvm-svn: 217681
* Address comments on r217622 | Bill Schmidt | 2014-09-12 | 2 | -4/+18
| llvm-svn: 217680
* [mips][microMIPS] Implement BGEZALS and BLTZALS instructions | Zoran Jovanovic | 2014-09-12 | 3 | -0/+23
| Differential Revision: http://reviews.llvm.org/D5004
| llvm-svn: 217678
* [mips][microMIPS] Implement JALS and JALRS instructions. | Zoran Jovanovic | 2014-09-12 | 3 | -4/+47
| Differential Revision: http://reviews.llvm.org/D5003
| llvm-svn: 217676
* [mips][microMIPS] Implement TLBP, TLBR, TLBWI and TLBWR instructions | Zoran Jovanovic | 2014-09-12 | 4 | -5/+31
| Differential Revision: http://reviews.llvm.org/D5211
| llvm-svn: 217675
* [ARM] Teach the cost model that cross-class copies are costly. | James Molloy | 2014-09-12 | 2 | -56/+63
| Cross-class copies being expensive is actually a trait of the microarchitecture, but as I haven't yet seen an example of a microarchitecture where they're cheap it seems best to just enable this by default, covering the non-mcpu build case.
| llvm-svn: 217674