summaryrefslogtreecommitdiffstats
path: root/llvm/test
Commit message (Collapse)AuthorAgeFilesLines
...
* [InstCombine] Sharpended test case in pr21210.llGerolf Hoflehner2016-04-271-3/+4
| | | | llvm-svn: 267742
* [AMDGPU][llvm-mc] Add support of TTMP quads. Rework M0 exclusion for SMRD.Artem Tamazov2016-04-273-0/+107
| | | | | | | | | | | Added support of TTMP quads. Reworked M0 exclusion machinery for SMRD and similar instructions to enable usage of TTMP registers in those instructions as destinations. Tests added. Differential Revision: http://reviews.llvm.org/D19342 llvm-svn: 267733
* [PDB] Fix function names for private symbols in PDBsReid Kleckner2016-04-274-14/+22
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: llvm-symbolizer wants to get linkage names of functions for historical reasons. Linkage names are only recorded in the PDB for public symbols, and the linkage name is apparently stored separately in some "public symbol" record. We had a workaround in PDBContext which would look for such symbols when the user requested linkage names. However, when given an address that was truly in a private function and public funciton, we would accidentally find nearby public symbols and return those function names. The fix is to look for both function symbols and public symbols and only prefer the public symbol name if the addresses of the symbols agree. Fixes PR27492 Reviewers: zturner Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19571 llvm-svn: 267732
* AMDGPU/SI: Add llvm.amdgcn.s.waitcnt.all intrinsicNicolai Haehnle2016-04-271-0/+38
| | | | | | | | | | | | | | | | | Summary: So it appears that to guarantee some of the ordering requirements of a GLSL memoryBarrier() executed in the shader, we need to emit an s_waitcnt. (We can't use an s_barrier, because memoryBarrier() may appear anywhere in the shader, in particular it may appear in non-uniform control flow.) Reviewers: arsenm, mareko, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19203 llvm-svn: 267729
* [TTI] Add hook for vector extract with extensionMatthew Simpson2016-04-271-19/+14
| | | | | | | | | | | | | | | This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725
* [AMDGPU][llvm-mc] s_getreg/setreg* - Support symbolic names of hardware ↵Artem Tamazov2016-04-274-23/+60
| | | | | | | | | | | | registers. Possibility to specify code of hardware register kept. Disassemble to symbolic name, if name is known. Tests updated/added. Differential Revision: http://reviews.llvm.org/D19335 llvm-svn: 267724
* Revert r267649, it caused PR27539.Nico Weber2016-04-271-973/+0
| | | | llvm-svn: 267723
* Remove size 1 from check as that isn't part of what the test is meant to be ↵Kristof Beyls2016-04-271-1/+1
| | | | | | | | | | | | | testing. This test also runs on e.g. ARM-native builds when the X86 backend is also built. This test produces code for the default instruction set, even though it is in a "X86" sub-directory. Given that this test doesn't seem to be testing anything architecture-specific, it seems it's best to adapt the check to not check for an architecture-dependent value (the size of the function), rather than hard-code the test to target x86. llvm-svn: 267722
* [ThinLTO] Use valueid instead of bitcode offsets in combined index fileTeresa Johnson2016-04-278-27/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: With the removal of support for lazy parsing of combined index summary records (e.g. r267344), we no longer need to include the summary record bitcode offset in the VST entries for definitions. Change the combined index format to be similar to the per-module index format in using value ids to cross-reference from the summary record to the VST entry (rather than the summary record bitcode offset to cross-reference in the other direction). The visible changes are: 1) Add the value id to the combined summary records 2) Remove the summary offset from the combined VST records, which has the following effects: - No longer need the VST_CODE_COMBINED_GVDEFENTRY record, as all combined index VST entries now only contain the value id and corresponding GUID. - No longer have duplicate VST entries in the case where there are multiple definitions of a symbol (e.g. weak/linkonce), as they all have the same value id and GUID. An implication of #2 above is that in order to hook up an alias to the correct aliasee based on the value id of the aliasee recorded in the combined index alias record, we need to scan the entries in the index for that GUID to find the one from the same module (i.e. the case where there are multiple entries for the aliasee). But the reader no longer has to maintain a special map to hook up the alias/aliasee. Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19481 llvm-svn: 267712
* [InstCombine][SSE] Regenerated vector shift testsSimon Pilgrim2016-04-271-356/+505
| | | | llvm-svn: 267699
* [mips][microMIPS] Add CodeGen support for SUBU16, SUB, SUBU, DSUB and DSUBU ↵Zlatko Buljan2016-04-274-37/+93
| | | | | | | | instructions Differential Revision: http://reviews.llvm.org/D16676 llvm-svn: 267694
* [mips][microMIPS] Add CodeGen support for SLL16, SRL16, SLL, SLLV, SRA, ↵Zlatko Buljan2016-04-2712-4/+207
| | | | | | | | SRAV, SRL and SRLV instructions Differential Revision: http://reviews.llvm.org/D17989 llvm-svn: 267693
* Use DL preferred alignment for alloca in Value::getPointerAlignmentArtur Pilipenko2016-04-271-1/+7
| | | | | | | | | | Teach Value::getPointerAlignment that allocas with no explicit alignment are aligned to preferred alignment of the allocated type. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D17569 llvm-svn: 267689
* [InstCombine][SSE] Added DemandedBits tests for MOVMSK instructionsSimon Pilgrim2016-04-271-0/+137
| | | | | | MOVMSK zeros the upper bits of the gpr - we should be able to use this. llvm-svn: 267686
* [LoopDist] Add llvm.loop.distribute.enable loop metadataAdam Nemet2016-04-271-0/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modifies this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 llvm-svn: 267672
* Revert "Support "preserving" the summary information when using setModule() ↵Mehdi Amini2016-04-271-37/+0
| | | | | | | | | | API in LTOCodeGenerator" This reverts commit r267665. ASAN shows that there is a use of undefined value. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267668
* Support "preserving" the summary information when using setModule() API in ↵Mehdi Amini2016-04-271-0/+37
| | | | | | | | | LTOCodeGenerator Another attempt at r267655... From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267665
* Revert "Support "preserving" the summary information when using setModule() ↵Mehdi Amini2016-04-271-37/+0
| | | | | | | | | | API in LTOCodeGenerator" This reverts commit r267657, r267656, and r267655. The test does not pass on multiple bots, I'm unsure why yet but let's unbreak them. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267664
* The patch fixes PR27392.Evgeny Stupachenko2016-04-274-18/+28
| | | | | | | | | | | | | | | | | Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct. Reviewer: hfinkel Differential Revision: http://reviews.llvm.org/D19256 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 267662
* [ppc64] fix bug in prologue that mfocrf's cr operand should be explict state ↵Chuang-Yu Cheng2016-04-271-1/+1
| | | | | | | | | | | | instead of implicit This fixes PR27414 Reviewers: kbarton mgrang tjablin http://reviews.llvm.org/D19255 llvm-svn: 267660
* Fix the test from r267656: Support "preserving" the summary information when ↵Mehdi Amini2016-04-271-2/+0
| | | | | | | using setModule() API in LTOCodeGenerator From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267657
* Add a test for r267655: Support "preserving" the summary information when ↵Mehdi Amini2016-04-271-0/+39
| | | | | | | using setModule() API in LTOCodeGenerator From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267656
* [X86] Don't assume that MMX extractelts are from index 0.Ahmed Bougacha2016-04-271-0/+28
| | | | | | | It's probably the case for all 3 MMX users out there, but with hand-crafted IR, you can trigger selection failures. Fix that. llvm-svn: 267652
* [X86] Re-enable MMX i32 extractelt combine.Ahmed Bougacha2016-04-271-0/+15
| | | | | | | | | This effectively adds back the extractelt combine removed by r262358: the direct case can still occur (because x86_mmx is special, see r262446), but it's the indirect case that's now superseded by the generic combine. llvm-svn: 267651
* Detects the SAD pattern on X86 so that much better code will be emitted once ↵Cong Hou2016-04-271-0/+973
| | | | | | | | the pattern is matched. Differential revision: http://reviews.llvm.org/D14840 llvm-svn: 267649
* ThinLTO: do not promote GlobalVariable that have a specific section.Mehdi Amini2016-04-272-0/+38
| | | | | | | Differential Revision: http://reviews.llvm.org/D18298 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267646
* LTOCodeGenerator: turns linkonce(_odr) into weak_(odr) when present ↵Mehdi Amini2016-04-271-3/+5
| | | | | | | | | | | | | | | | | "MustPreserve" set Summary: If the linker requested to preserve a linkonce function, we should honor this even if we drop all uses. Reviewers: dexonsmith Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19527 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 267644
* [LVI] Reduce compile time by lazily scanning blocks if neededPhilip Reames2016-04-271-0/+22
| | | | | | | | | | When encountering a non-local pointer, LVI would eagerly scan the block for dereferences of the given object to prove the pointer to be non null. That's all well and good, but *then* we'd go recurse through our input blocks. As a result, we could end up scanning each and every block we traverse, even if the final definition was obviously non null or we found a constant value somewhere up the chain. The previous code papered over this by using the isKnownNonNull routine from value tracking. This made the duplication less painful in the common case. Instead, we know do the block scan only *after* we've gotten the recursive results back. This lets us stop scanning individual blocks as soon as we've determined it to be non-null in any predecessor block and use our usual merge rules to propagate that information cheaply through successor blocks. For a pointer which can be found non-null, this does strictly less work and sometimes substaintially so. Note that the case where we *can't* prove something non-null is still the really expensive case. We end up scanning each and every block looking for a dereference and never end up finding one. llvm-svn: 267642
* [X86] Make sure it is safe to clobber EFLAGS, if need be, when choosingQuentin Colombet2016-04-261-0/+44
| | | | | | | | | | | | | | | the prologue. Do not use basic blocks that have EFLAGS live-in as prologue if we need to realign the stack. Realigning the stack uses AND instruction and this clobbers EFLAGS. An other alternative would have been to save and restore EFLAGS around the stack realignment code, but this is likely inefficient. Fixes PR27531. llvm-svn: 267634
* PM: Port Reassociate to the new pass managerJustin Bogner2016-04-261-0/+1
| | | | llvm-svn: 267631
* [X86] Replace -mcpu with -mattr in several testsMitch Bodart2016-04-264-5/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D19568 llvm-svn: 267629
* [SimplifyCFG] propagate branch metadata when creating selectSanjay Patel2016-04-261-1/+28
| | | | llvm-svn: 267624
* [MachineBasicBlock] Take advantage of the partially dead information.Quentin Colombet2016-04-261-2/+1
| | | | | | | Thanks to that information we wouldn't lie on a register being live whereas it is not. llvm-svn: 267622
* [MachineInstrBundle] Improvement the recognition of dead definitions.Quentin Colombet2016-04-261-2/+1
| | | | | | | Now, it is possible to know that partial definitions are dead definitions and recognize that clobbered registers are also dead. llvm-svn: 267621
* [LVI] Apply transfer rule for overdefine inputs for binary operatorsPhilip Reames2016-04-261-0/+23
| | | | | | | | As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well. This patch builds on 267609 which did the same thing for unary casts. llvm-svn: 267620
* [LVI] A better fix for the assertion error introduced by 267609Philip Reames2016-04-261-0/+11
| | | | | | Essentially, I was using the wrong size function. For types which were sized, but not primitive, I wasn't getting a useful size for the operand and failed an assert. I fixed this, and also added a guard that the input is a sized type. Test case is for the original mistake. I'm not sure how to actually exercise the sized type check. llvm-svn: 267618
* [LowerExpectIntrinsic] make default likely/unlikely ratio biggerSanjay Patel2016-04-261-4/+4
| | | | | | | | | | We need the default ratio to be sufficiently large that it triggers transforms based on block frequency info (BFI) and plays well with the recently introduced BranchProbability used by CGP. Differential Revision: http://reviews.llvm.org/D19435 llvm-svn: 267615
* [LVI] Infer local facts from unary expressionsPhilip Reames2016-04-261-0/+37
| | | | | | | | | | As pointed out by John Regehr over in http://reviews.llvm.org/D19485, LVI was being incredibly stupid about applying its transfer rules. Rather than gathering local facts from the expression itself, it was simply giving up entirely if one of the inputs was overdefined. This greatly impacts the precision of the overall analysis and makes it far more fragile as well. This patch implements only the unary operation case. Once this is in, I'll implement the same for the binary operations. Differential Revision: http://reviews.llvm.org/D19492 llvm-svn: 267609
* Revert "[SimplifyLibCalls] sprintf doesn't copy null bytes"David Majnemer2016-04-261-3/+2
| | | | | | | | | | The destination buffer that sprintf uses is restrict qualified, we do not need to worry about derived pointers referenced via format specifiers. This reverts commit r267580. llvm-svn: 267605
* Try to get ResponseFile.ll passing on Windows after r267556.Nico Weber2016-04-261-2/+4
| | | | llvm-svn: 267599
* Masked Store in Loop Vectorizer - bugfixElena Demikhovsky2016-04-261-0/+46
| | | | | | | | Fixed a bug in loop vectorization with conditional store. Differential Revision: http://reviews.llvm.org/D19532 llvm-svn: 267597
* PM: Port Internalize to the new pass managerJustin Bogner2016-04-261-0/+1
| | | | llvm-svn: 267596
* Parse and dump PDB DBI Stream Header InformationZachary Turner2016-04-261-0/+10
| | | | | | | | | | | | | | | | | | The DBI stream contains a lot of bookkeeping information for other streams. In particular it contains information about section contributions and linked modules. This patch is a first attempt at parsing some of the information out of the DBI stream. It currently only parses and dumps the headers of the DBI stream, so none of the module data or section contribution data is pulled out. This is just a proof of concept that we understand the basic properties of the DBI stream's metadata, and followup patches will try to extract more detailed information out. Differential Revision: http://reviews.llvm.org/D19500 Reviewed By: majnemer, ruiu llvm-svn: 267585
* [Tail duplication] Handle source registers with subregistersKrzysztof Parzyszek2016-04-261-0/+67
| | | | | | | | | | | | | | When a block is tail-duplicated, the PHI nodes from that block are replaced with appropriate COPY instructions. When those PHI nodes contained use operands with subregisters, the subregisters were dropped from the COPY instructions, resulting in incorrect code. Keep track of the subregister information and use this information when remapping instructions from the duplicated block. Differential Revision: http://reviews.llvm.org/D19337 llvm-svn: 267583
* Reapply: "ARM: put correct symbol index on indirect pointers in __thread_ptr.""Tim Northover2016-04-261-1/+5
| | | | | | | A latent bug in llvm-objdump used the wrong format specifier on 32-bit targets, causing the test to fail. This fixes the issue. llvm-svn: 267582
* [SimplifyLibCalls] sprintf doesn't copy null bytesDavid Majnemer2016-04-261-2/+3
| | | | | | | | | | sprintf doesn't read or copy the terminating null byte from it's string operands. sprintf will append it's own after processing all of the format specifiers. This fixes PR27526. llvm-svn: 267580
* Swift Calling Convention: use %RAX for sret.Manman Ren2016-04-261-19/+24
| | | | | | | We don't need to copy the sret argument into %rax upon return. rdar://25671494 llvm-svn: 267579
* tests: tweak MIR for ARM tests to correct MI issuesSaleem Abdulrasool2016-04-262-5/+7
| | | | | | | | | The Machine Instruction Verifier flagged some issues in the serialized MIR. Adjust the input to correct them. Fixes the remaining portion of PR27480. llvm-svn: 267578
* test: remove some bleeding whitespaceSaleem Abdulrasool2016-04-262-57/+57
| | | | | | Kill bleeding whitespace. NFC llvm-svn: 267577
* [CodeGenPrepare] use branch weight metadata to decide if a select should be ↵Sanjay Patel2016-04-261-3/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | turned into a branch This is part of solving PR27344: https://llvm.org/bugs/show_bug.cgi?id=27344 CGP should undo the SimplifyCFG transform for the same reason that earlier patches have used this same mechanism: it's possible that passes between SimplifyCFG and CGP may be able to optimize the IR further with a select in place. For the TLI hook default, >99% taken or not taken is chosen as the default threshold for a highly predictable branch. Even the most limited HW branch predictors will be correct on this branch almost all the time, so even a massive mispredict penalty perf loss would be overcome by the win from all the times the branch was predicted correctly. As a follow-up, we could make the default target hook less conservative by using the SchedMachineModel's MispredictPenalty. Or we could just let targets override the default by implementing the hook with that and other target-specific options. Note that trying to statically determine mispredict rates for close-to-balanced profile weight data is generally impossible if the HW is sufficiently advanced. Ie, 50/50 taken/not-taken might still be 100% predictable. Finally, note that this patch as-is will not solve PR27344 because the current __builtin_unpredictable() branch weight default values are 4 and 64. A proposal to change that is in D19435. Differential Revision: http://reviews.llvm.org/D19488 llvm-svn: 267572
OpenPOWER on IntegriCloud