path: root/llvm
Commit message | Author | Age | Files | Lines
...
* Convert some X86 blendv* intrinsics into IR. | Filipe Cabecinhas | 2014-05-27 | 5 | -0/+157
  Summary:
  Implemented an InstCombine transformation that takes a blendv* intrinsic call and translates it into an IR select, if the mask is constant.

  This will eventually get lowered into blends with immediates if possible, or pblendvb (with an option to further optimize if we can transform the pblendvb into a blend+immediate instruction, depending on the selector). It will also enable optimizations by the IR passes, which give up on sight of the intrinsic.

  Both the transformation and the lowering of its result to asm got shiny new tests.

  The transformation is a bit convoluted because of blendvp[sd]'s definition: Its mask is a floating point value! This forces us to convert it and get the highest bit. I suppose this happened because the mask has type __m128 in Intel's intrinsic and v4sf (for blendps) in gcc's builtin.

  I will send an email to llvm-dev to discuss if we want to change this or not.

  Reviewers: grosbach, delena, nadav

  Differential Revision: http://reviews.llvm.org/D3859

  llvm-svn: 209643
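The equivalence this commit relies on can be sketched outside LLVM. The following is a minimal model, not the InstCombine code: the names signBitSet, blendvModel, maskToSelector and selectModel are hypothetical, and the lane semantics (sign bit set picks the second operand) are assumed from the documented behaviour of blendvps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: true when the highest (sign) bit of a 32-bit float is set.
bool signBitSet(float f) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return (bits >> 31) != 0;
}

// Model of the intrinsic: lane i takes b[i] when the sign bit of mask[i]
// is set, otherwise a[i].
std::vector<float> blendvModel(const std::vector<float> &a,
                               const std::vector<float> &b,
                               const std::vector<float> &mask) {
  assert(a.size() == b.size() && a.size() == mask.size());
  std::vector<float> result(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    result[i] = signBitSet(mask[i]) ? b[i] : a[i];
  return result;
}

// With a constant mask, the sign bits can be extracted once, ahead of time.
std::vector<bool> maskToSelector(const std::vector<float> &mask) {
  std::vector<bool> selector;
  for (float m : mask)
    selector.push_back(signBitSet(m));
  return selector;
}

// Model of a vector select: selector[i] ? b[i] : a[i].
std::vector<float> selectModel(const std::vector<bool> &selector,
                               const std::vector<float> &a,
                               const std::vector<float> &b) {
  assert(selector.size() == a.size() && a.size() == b.size());
  std::vector<float> result(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    result[i] = selector[i] ? b[i] : a[i];
  return result;
}
```

Note that a mask lane of -0.0f still selects the second operand, because only the sign bit is inspected; this is why the mask has to be converted rather than compared against zero.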
* Fix link. | Rafael Espindola | 2014-05-26 | 1 | -1/+3
  llvm-svn: 209640
* Use existing helper function. | Rafael Espindola | 2014-05-26 | 1 | -8/+1
  No functionality change.

  llvm-svn: 209639
* [PPC] Use alias symbols in address computation. | Rafael Espindola | 2014-05-26 | 3 | -34/+46
  This seems to match what gcc does for ppc and what every other llvm backend does.

  llvm-svn: 209638
* AArch64: force i1 to be zero-extended at an ABI boundary. | Tim Northover | 2014-05-26 | 2 | -0/+67
  This commit is debatable. There are two possible approaches, neither of which is really satisfactory:

  1. Use "@foo(i1 zeroext)" to mean an extension to 32 bits on Darwin, and 8 bits otherwise.
  2. Redefine "@foo(i1)" to mean that the i1 is extended by the caller to 8 bits. This goes against the spirit of "zeroext" I think, but it's a bit of a vague construct anyway (by definition you're going to extend to the amount required by the ABI, that's why it's the ABI!).

  This implements option 2. The DAG machinery really isn't set up for the first (there's a fairly strong assumption that "zeroext" goes to at least the smallest register size), and even if it were, the resulting DAG looks like it would be inferior in many cases.

  Theoretically we could add AssertZext nodes in the consumers of ABI-passed values too now, but this actually seems to make the code worse in practice by making truncation proceed in two steps. The code produced is equally valid if we continue to assume only the low bit is defined.

  Should fix PR19850

  llvm-svn: 209637
* AArch64: simplify calling conventions slightly. | Tim Northover | 2014-05-26 | 5 | -128/+44
  We can eliminate the custom C++ code in favour of some TableGen to check the same things. Functionality should be identical, except for a buffer overrun that was present in the C++ code and meant webkit failed if any small argument needed to be passed on the stack.

  llvm-svn: 209636
* Some cleanup for r209568. | Michael Zolotukhin | 2014-05-26 | 1 | -9/+7
  llvm-svn: 209634
* Convert a few loops to use ranges. | Rafael Espindola | 2014-05-26 | 1 | -54/+51
  llvm-svn: 209628
* [AArch64] Add store + add folding regression tests for the load/store optimization pass. | Tilmann Scheller | 2014-05-26 | 1 | -2/+66
  Add tests for the following transform:

    str X, [x0, #32]
    ...
    add x0, x0, #32
    ->
    str X, [x0, #32]!

  with X being either w1, x1, s0, d0 or q0.

  llvm-svn: 209627
* [AArch64] Add more regression tests for the load/store optimization pass. | Tilmann Scheller | 2014-05-26 | 1 | -11/+81
  Cover the following cases:

    ldr X, [x0, #32]
    ...
    add x0, x0, #32
    ->
    ldr X, [x0, #32]!

  with X being either w1, x1, s0, d0 or q0.

  llvm-svn: 209624
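Why the two sequences above are interchangeable can be checked with a small model. This is only an illustration under assumptions: State, load64, loadThenAdd and loadPreIndexed are hypothetical names, memory is modelled as 64-bit words, X is taken to be x1, and the instructions elided by "..." are assumed not to touch x0.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical register file holding only the registers used here.
struct State {
  std::uint64_t x0; // base address register
  std::uint64_t x1; // destination register (the X of the commit message)
};

// Hypothetical memory model: 64-bit words addressed by byte offset.
std::uint64_t load64(const std::vector<std::uint64_t> &mem, std::uint64_t addr) {
  assert(addr % 8 == 0 && addr / 8 < mem.size());
  return mem[addr / 8];
}

// ldr x1, [x0, #32] followed by add x0, x0, #32
State loadThenAdd(const std::vector<std::uint64_t> &mem, State s) {
  s.x1 = load64(mem, s.x0 + 32);
  s.x0 += 32;
  return s;
}

// ldr x1, [x0, #32]!  (pre-indexed: update the base first, then load from it)
State loadPreIndexed(const std::vector<std::uint64_t> &mem, State s) {
  s.x0 += 32;
  s.x1 = load64(mem, s.x0);
  return s;
}
```

Both functions leave the same values in x0 and x1 for any starting state, which is the property the folded form depends on.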
* [asan] decrease asan-instrumentation-with-call-threshold from 10000 to 7000, see PR17409 | Kostya Serebryany | 2014-05-26 | 1 | -1/+1
  llvm-svn: 209623
* Remove accidentally committed whitespace. | Tilmann Scheller | 2014-05-26 | 1 | -2/+2
  llvm-svn: 209619
* [AArch64] Add a regression test for the load store optimizer. | Tilmann Scheller | 2014-05-26 | 1 | -0/+31
  We have a couple of regression tests for load/store pairing, but (to my knowledge) there are no regression tests for the load/store + add/sub folding.

  As a first step towards increased test coverage of this area, this commit adds a test for one instance of a load + add to pre-indexed load transformation.

  llvm-svn: 209618
* Make the LoopRotate pass's maximum header size configurable both programmatically and via the command line, mirroring similar functionality in LoopUnroll. | Owen Anderson | 2014-05-26 | 2 | -5/+15
  In situations where clients used custom unrolling thresholds, their intent could previously be foiled by LoopRotate having a hardcoded threshold.

  llvm-svn: 209617
* DebugInfo: Test linkonce-odr functions under LTO. | David Blaikie | 2014-05-26 | 1 | -0/+74
  This was previously regressed/broken by r192749 (reverted due to this issue in r192938) and I was about to break it again by accident with some more invasive changes that deal with the subprogram lists. So to avoid that and further issues - here's a test.

  It's a pretty basic test - in both r192749 and my impending case, this test would crash, but checking the basics (that we put a subprogram in just one of the two CUs) seems like a good start.

  We still get this wrong in weird ways if the linkonce-odr function happens to not be identical in the metadata (because it's defined in two different files (hence the # line directives in this test), etc) even though it meets the language requirements (identical token stream) for such a thing. That results in two subprogram DIEs, but only one of them gets the parameter and high/low pc information, etc.

  We probably need to use the DIRef infrastructure to deduplicate functions as we do types to address this issue - or perhaps teach the BC linker to remove the duplicate entries in subprogram lists?

  llvm-svn: 209614
* DwarfUnit: Remove some misleading no-op code introduced in r204162. | David Blaikie | 2014-05-26 | 1 | -4/+0
  Post commit review feedback from Manman called this out, but it looks like it slipped through the cracks.

  llvm-svn: 209611
* Just check the entire string. | Rafael Espindola | 2014-05-26 | 2 | -64/+64
  Thanks to David Blaikie for the suggestion.

  llvm-svn: 209610
* Reformat linefeeds. | NAKAMURA Takumi | 2014-05-26 | 4 | -8/+1
  llvm-svn: 209609
* Trailing whitespace. | NAKAMURA Takumi | 2014-05-26 | 1 | -6/+6
  llvm-svn: 209608
* tools: avoid use of std::function | Saleem Abdulrasool | 2014-05-25 | 3 | -12/+14
  Remove the use of the std::function and replace the capturing lambda with a non-capturing one, opting to pass the user data down to the context. This is needed as std::function is not yet available on all hosted platforms (it requires RTTI, which breaks on Windows).

  Thanks to Nico Rieck for pointing this out!

  llvm-svn: 209607
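The general pattern described here, a non-capturing lambda plus explicitly passed user data, might look like the sketch below. It does not reproduce the llvm-readobj code: VisitFn, forEachValue and sumValues are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical callback type: a plain function pointer plus opaque user data,
// used in place of std::function.
using VisitFn = void (*)(int value, void *context);

// Calls fn once per element, handing the caller's context through unchanged.
void forEachValue(const std::vector<int> &values, VisitFn fn, void *context) {
  assert(fn != nullptr);
  for (int v : values)
    fn(v, context);
}

// A lambda with an empty capture list converts implicitly to a function
// pointer; the state it needs travels through the context argument instead.
int sumValues(const std::vector<int> &values) {
  int total = 0;
  forEachValue(
      values,
      [](int v, void *ctx) { *static_cast<int *>(ctx) += v; },
      &total);
  return total;
}
```

The trade-off is the loss of type safety on the context pointer, which the callback has to cast back to the type the caller passed in.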
* tools: split out Win64EHDumper from COFFDumper | Saleem Abdulrasool | 2014-05-25 | 4 | -328/+406
  Move the implementation of the Win64 EH printer from the COFFDumper into its own class. This is in preparation for adding support to print ARM EH information.

  The only real change here is in printUnwindInfo where we now lambda lift the implicit this parameter for the resolveFunction. Also set up the printing to handle ARM. This sets the stage to introduce ARM EH printing.

  llvm-svn: 209606
* tools: inline simple single-use function | Saleem Abdulrasool | 2014-05-25 | 1 | -18/+6
  This inlines the single use function in preparation for splitting the Win64EH printing out of the COFFDumper into its own entity.

  llvm-svn: 209605
* tools: refactor COFFDumper symbol resolution logic | Saleem Abdulrasool | 2014-05-25 | 1 | -61/+69
  Make the use of the cache more transparent to the users. There is no reason that the cached entries really need to be passed along. The overhead for doing so is minimal: a single extra parameter. This requires that some standalone functions be brought into the COFFDumper class so that they may access the cache.

  llvm-svn: 209604
* tools: use references rather than out pointers in COFFDumper | Saleem Abdulrasool | 2014-05-25 | 1 | -18/+8
  Switch to use references for parameters that are guaranteed to be non-null. Simplifies the code slightly in preparation for another change.

  llvm-svn: 209603
* DebugInfo: Fix inlining with #file directives a little harder | David Blaikie | 2014-05-25 | 2 | -5/+8
  Seems my previous fix was insufficient - we were still not adding the inlined function to the abstract scope list. Which meant it wasn't flagged as inline, didn't have nested lexical scopes in the abstract definition, and didn't have abstract variables - so the inlined variable didn't reference an abstract variable, instead being described completely inline.

  llvm-svn: 209602
* Streamline test case by avoiding a temporary file and piping llc output straight to llvm-dwarfdump | David Blaikie | 2014-05-25 | 1 | -2/+1
  We still do temporary files in many cases, just updating this particular one because I was debugging it and made this change while doing so.

  llvm-svn: 209601
* Emit data or code export directives based on the type. | Rafael Espindola | 2014-05-25 | 2 | -7/+7
  Currently we look at the Aliasee to decide what type of export directive to use. It seems better to use the type of the alias directly. This is similar to how we handle the alias having the same address but other attributes (linkage, visibility) from the aliasee.

  With this patch it is now possible to do things like

    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-pc-windows-msvc"

    @foo = global [6 x i8] c"\B8*\00\00\00\C3", section ".text", align 16
    @f = dllexport alias i32 (), [6 x i8]* @foo

    !llvm.module.flags = !{!0}
    !0 = metadata !{i32 6, metadata !"Linker Options", metadata !1}
    !1 = metadata !{metadata !2, metadata !3}
    !2 = metadata !{metadata !"/DEFAULTLIB:libcmt.lib"}
    !3 = metadata !{metadata !"/DEFAULTLIB:oldnames.lib"}

  llvm-svn: 209600
* Make these CHECKs a bit more strict. | Rafael Espindola | 2014-05-25 | 2 | -62/+62
  The " at the end of the line makes sure we matched the entire directive.

  llvm-svn: 209599
* Add an extension point for peephole optimizers. | Peter Collingbourne | 2014-05-25 | 2 | -1/+15
  This extension point allows adding passes that perform peephole optimizations similar to the instruction combiner. These passes will be inserted after each instance of the instruction combiner pass.

  Differential Revision: http://reviews.llvm.org/D3905

  llvm-svn: 209595
* Fix some misplaced spaces around 'override' | Hans Wennborg | 2014-05-24 | 4 | -12/+12
  llvm-svn: 209589
* build: sort llvm-readobj sources | Saleem Abdulrasool | 2014-05-24 | 1 | -4/+4
  Sort the source files. NFC.

  llvm-svn: 209587
* llvm-readobj: remove some dead code | Saleem Abdulrasool | 2014-05-24 | 1 | -28/+0
  llvm-svn: 209586
* AArch64: disable FastISel for large code model. | Tim Northover | 2014-05-24 | 1 | -0/+5
  The code emitted is what would be expected for the small model, so it shouldn't be used when objects can be the full 64 bits away.

  This fixes MCJIT tests on Linux.

  llvm-svn: 209585
* MachineVerifier: Clean up some syntactic weirdness left behind by find&replace. | Benjamin Kramer | 2014-05-24 | 1 | -6/+6
  No functionality change.

  llvm-svn: 209581
* CodeGen: Make MachineBasicBlock::back skip to the beginning of the last bundle. | Benjamin Kramer | 2014-05-24 | 4 | -13/+95
  This makes front/back symmetric with begin/end, avoiding some confusion.

  Added instr_front/instr_back for the old behavior, corresponding to instr_begin/instr_end. Audited all three in-tree users of back(), all of them look like they don't want to look inside bundles.

  Fixes an assertion (PR19815) when generating debug info on mips, where a delay slot was bundled at the end of a branch.

  llvm-svn: 209580
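The difference between the two accessors can be illustrated with a toy model. This is not the MachineBasicBlock implementation: Instr, instrBackIndex and backIndex are hypothetical, and a bundle is assumed to be a run of instructions each flagged as bundled with its predecessor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical instruction record: bundledWithPred marks an instruction that
// belongs to the same bundle as the one before it.
struct Instr {
  std::string name;
  bool bundledWithPred;
};

// Old behaviour (instr_back in the commit): the very last instruction,
// even when it sits inside a bundle.
std::size_t instrBackIndex(const std::vector<Instr> &block) {
  assert(!block.empty());
  return block.size() - 1;
}

// New behaviour (back in the commit): walk backwards to the first
// instruction of the last bundle.
std::size_t backIndex(const std::vector<Instr> &block) {
  assert(!block.empty());
  std::size_t i = block.size() - 1;
  while (i > 0 && block[i].bundledWithPred)
    --i;
  return i;
}
```

For a block ending in a branch with a bundled delay slot, as in the mips case mentioned above, backIndex lands on the branch while instrBackIndex lands on the delay-slot instruction.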
* AArch64/ARM64: move ARM64 into AArch64's place | Tim Northover | 2014-05-24 | 636 | -14518/+14412
  This commit starts with a "git mv ARM64 AArch64" and continues out from there, renaming the C++ classes, intrinsics, and other target-local objects for consistency.

  "ARM64" test directories are also moved, and tests that began their life in ARM64 use an arm64 triple; those from AArch64 use an aarch64 triple. Both should be equivalent though.

  This finishes the AArch64 merge, and everyone should feel free to continue committing as normal now.

  llvm-svn: 209577
* AArch64/ARM64: remove AArch64 from tree prior to renaming ARM64. | Tim Northover | 2014-05-24 | 355 | -67373/+73
  I'm doing this in two phases for a better "git blame" record. This commit removes the previous AArch64 backend and redirects all functionality to ARM64. It also deduplicates test-lines and removes orphaned AArch64 tests.

  The next step will be "git mv ARM64 AArch64" and rewire most of the tests.

  Hopefully LLVM is still functional, though it would be even better if no-one ever had to care because the rename happens straight afterwards.

  llvm-svn: 209576
* llvm/test/Object/ar-error.test: Don't check the message "No such file or directory". | NAKAMURA Takumi | 2014-05-24 | 1 | -1/+2
  It didn't match on non-English versions of Windows.

  llvm-svn: 209570
* Implement sext(C1 + C2*X) --> sext(C1) + sext(C2*X) and sext{C1,+,C2} --> sext(C1) + sext{0,+,C2} transformation in Scalar Evolution. | Michael Zolotukhin | 2014-05-24 | 2 | -0/+210
  This helps the SLP vectorizer to recognize consecutive loads/stores.

  <rdar://problem/14860614>

  llvm-svn: 209568
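Splitting a sign extension across an addition is not valid in general, and this log entry does not state the side conditions the commit checks. The sketch below is therefore only an illustration under an assumption made here: that C2 is a power of two and 0 <= C1 < C2, so adding C1 can never carry into the sign bit. It models i8 values widened to a larger type; wrapToI8, splitHolds and splitHoldsForAllX are hypothetical names.

```cpp
#include <cassert>

// Wraps an integer to the signed 8-bit range [-128, 127], modelling i8
// arithmetic. The returned int is also the sign-extended value.
int wrapToI8(int v) {
  int m = ((v % 256) + 256) % 256;
  return m >= 128 ? m - 256 : m;
}

// True when sext(C1 + C2*X) == sext(C1) + sext(C2*X) for these i8 inputs.
bool splitHolds(int c1, int c2, int x) {
  assert(c1 >= -128 && c1 <= 127 && x >= -128 && x <= 127);
  int product = wrapToI8(c2 * x);         // C2*X computed in i8
  int narrowSum = wrapToI8(c1 + product); // add in i8, then sign-extend
  int wideSum = c1 + product;             // sign-extend both, add in wide type
  return narrowSum == wideSum;
}

// Checks the identity for every i8 value of X.
bool splitHoldsForAllX(int c1, int c2) {
  for (int x = -128; x <= 127; ++x)
    if (!splitHolds(c1, c2, x))
      return false;
  return true;
}
```

With C1 = 3 and C2 = 4 the identity holds for every X, since C2*X always has its two low bits clear. With C1 = 100, C2 = 1 and X = 100 it fails, because the 8-bit addition wraps before the extension.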
* ARM64: extract a 32-bit subreg when selecting an inreg extend | Tim Northover | 2014-05-24 | 2 | -12/+154
  After the load/store refactoring, we were sometimes trying to feed a GPR64 into a 32-bit register offset operand. This failed in copyPhysReg.

  llvm-svn: 209566
* DebugInfo: Generalize some tests to handle variations in attribute ordering. | David Blaikie | 2014-05-23 | 12 | -58/+73
  In an effort to fix inlined debug info in situations where the out of line definition of a function precedes any inlined usage, the order in which some attributes are added to subprogram DIEs may change. (in essence, definition-necessary attributes like DW_AT_low_pc/high_pc will be added immediately, but the names, types, and other features will be delayed to module end where they may either be added to the subprogram DIE or instead reference an abstract definition for those values)

  These tests can be generalized to be resilient to this change. 5 or so tests actually have to be incompatibly changed to cope with this reordering and will go along with the change that affects the order.

  llvm-svn: 209554
* DebugInfo: Generalize a test case to not depend on abbreviation numbering. | David Blaikie | 2014-05-23 | 1 | -8/+8
  It's an unnecessary detail for this test and just gets in the way when making unrelated changes to the output in this test.

  llvm-svn: 209553
* Test case comments. Fix sloppiness. | Andrew Trick | 2014-05-23 | 1 | -2/+2
  llvm-svn: 209551
* clang-format function. | Rafael Espindola | 2014-05-23 | 1 | -8/+6
  llvm-svn: 209550
* Remove a confusing use of a static method. | Rafael Espindola | 2014-05-23 | 1 | -1/+1
  No functionality change.

  llvm-svn: 209548
* DebugInfo: Put concrete definitions referencing abstract definitions in the same scope as the abstract definition. | David Blaikie | 2014-05-23 | 2 | -1/+95
  This seems like a simple cleanup/improved consistency, but also helps lay the foundation to fix the bug mentioned in the test case: concrete definitions preceding any inlined usage aren't properly split into concrete + abstract (because they're not known to need it until it's too late).

  Once we start deferring this choice until later, we won't have the choice to put concrete definitions for inlined subroutines in a different scope from concrete definitions for non-inlined subroutines (since we won't know at time-of-construction which one it'll be). This change brings those two cases into alignment ahead of that future change/fix.

  llvm-svn: 209547
* Fix and improve SCEV ComputeBackedgeTakenCount. | Andrew Trick | 2014-05-23 | 2 | -19/+102
  This is a follow-up to r209358: PR19799: Indvars miscompile due to an incorrect max backedge taken count from SCEV.

  That fix was incomplete as pointed out by Arnold and Michael Z. The code was also too confusing. It needed a careful rewrite with more unit tests. This version will also happen to optimize more cases.

  <rdar://17005101> PR19799: Indvars miscompile...

  llvm-svn: 209545
* Revert part of "Fix broken FileCheck prefixes" | Nico Rieck | 2014-05-23 | 2 | -11/+11
  This reverts part of commit r209538.

  llvm-svn: 209544
* Use alias linkage and visibility to decide tls access mode. | Rafael Espindola | 2014-05-23 | 3 | -16/+13
  This matches both what we do for the non-thread case and what gcc does.

  With this patch clang would match gcc's behaviour in

    static __thread int a = 42;
    extern __thread int b __attribute__((alias("a")));
    int *f(void) { return &a; }
    int *g(void) { return &b; }

  if not for pr19843. Manually writing the IL does produce the same access modes.

  It is also a step in the direction of fixing pr19844.

  llvm-svn: 209543
* Remove unused CHECK lines | Nico Rieck | 2014-05-23 | 1 | -4/+0
  llvm-svn: 209539