path: root/llvm
Commit message | Author | Age | Files | Lines
* getParent() ^ 3 == getModule() ; NFCI | Sanjay Patel | 2015-12-14 | 16 | -40/+28
  llvm-svn: 255511
* Remove dead function AArch64TargetLowering::getFunctionAlignment. NFC. | Geoff Berry | 2015-12-14 | 2 | -8/+0
  Reviewers: t.p.northover, jmolloy, mcrosier
  Subscribers: aemerson, rengolin, llvm-commits
  Differential Revision: http://reviews.llvm.org/D15458
  llvm-svn: 255509
* AMDGPU: Fix splitting vector loads with existing offsets | Matt Arsenault | 2015-12-14 | 2 | -9/+122
  If the original MMO had an offset, it was dropped. Also use the correct alignment after adding the new offset.
  llvm-svn: 255508
* [InstCombine] fold trunc ([lshr] (bitcast vector) ) --> extractelement (PR25543) | Sanjay Patel | 2015-12-14 | 2 | -66/+54
  This is a fix for PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543
  The idea is to take the existing fold of:
    bitcast ( trunc ( lshr ( bitcast X))) --> extractelement (bitcast X)
  ( http://reviews.llvm.org/rL112232 )
  And break it into less specific transforms so we'll catch more cases such as the example in the bug report:
    bitcast ( trunc ( lshr ( bitcast X))) --> bitcast ( extractelement (bitcast X)) --> extractelement (bitcast X)
  Enabling patches for this change:
    http://reviews.llvm.org/rL255399 (combine bitcasts)
    http://reviews.llvm.org/rL255433 (canonicalize extractelement(bitcast X))
  Differential Revision: http://reviews.llvm.org/D15392
  llvm-svn: 255504
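The equivalence behind this fold can be checked numerically. The sketch below is only an illustration in Python, not LLVM code: it models a `<4 x i8>` vector as a list of integers, assumes little-endian lane order (lane 0 occupies the lowest bits of the bitcast result), and uses helper names that are hypothetical.

```python
def bitcast_vector_to_int(elements, elem_bits=8):
    """Model a little-endian bitcast of a vector of unsigned lanes to one wide integer."""
    lane_mask = (1 << elem_bits) - 1
    value = 0
    for index, element in enumerate(elements):
        value |= (element & lane_mask) << (index * elem_bits)
    return value


def trunc_lshr(value, shift, width):
    """Model trunc(lshr(value, shift)) to a `width`-bit integer."""
    return (value >> shift) & ((1 << width) - 1)


vec = [0x11, 0x22, 0x33, 0x44]       # stands in for a <4 x i8> value
wide = bitcast_vector_to_int(vec)    # stands in for bitcast <4 x i8> to i32

# Shifting by a whole number of lanes and truncating to the lane width
# yields the same bits as extracting that lane directly.
for lane in range(len(vec)):
    assert trunc_lshr(wide, 8 * lane, 8) == vec[lane]
```

On a big-endian target the lane index would be counted from the other end, which is why the lane order is called out as an assumption here.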
* [Hexagon] Subtarget features/default CPU corrections | Krzysztof Parzyszek | 2015-12-14 | 6 | -15/+22
  llvm-svn: 255501
* [PPC] Early exit loop. NFC. | Chad Rosier | 2015-12-14 | 1 | -1/+4
  llvm-svn: 255497
* [sanitizer] [msan] VarArgHelper for AArch64 | Adhemerval Zanella | 2015-12-14 | 2 | -0/+314
  This patch adds support for variadic arguments for AArch64. All the MSAN unit tests are not passing as well as the signal_stress_test (currently set as XFAIL for aarch64).
  llvm-svn: 255495
* Don't create unnecessary PHIs | James Molloy | 2015-12-14 | 3 | -4/+205
  In conditional store merging, we were creating PHIs when we didn't need to. If the value to be predicated isn't defined in the block we're predicating, then it doesn't need a PHI at all (because we only deal with triangles and diamonds, any value not in the predicated BB must dominate the predicated BB).
  This fixes a large code size increase in some benchmarks in a popular embedded benchmark suite.
  llvm-svn: 255489
* Reformat to untabify. | NAKAMURA Takumi | 2015-12-14 | 2 | -12/+11
  llvm-svn: 255483
* [llvm-dwp] Deduplicate type units | David Blaikie | 2015-12-14 | 4 | -6/+47
  It's O(N^2) because it does a simple walk through the existing types to find duplicates, but that will be fixed in a follow-up commit to use a mapping data structure of some kind.
  llvm-svn: 255482
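The difference between the quadratic walk and a mapping-based lookup can be sketched generically. This is a Python illustration of the idea only, not the llvm-dwp code; representing a type unit as a `(signature, payload)` pair is an assumption made for the example.

```python
def dedup_linear(type_units):
    """Quadratic: for every incoming unit, walk all units kept so far."""
    kept = []
    for signature, payload in type_units:
        if all(signature != seen_signature for seen_signature, _ in kept):
            kept.append((signature, payload))
    return kept


def dedup_mapped(type_units):
    """Linear on average: remember seen signatures in a hash set."""
    seen = set()
    kept = []
    for signature, payload in type_units:
        if signature not in seen:
            seen.add(signature)
            kept.append((signature, payload))
    return kept


units = [(0xA1, "first"), (0xB2, "second"), (0xA1, "duplicate of first")]
assert dedup_linear(units) == dedup_mapped(units)
```

Both versions keep the first unit seen for each signature; only the cost of the membership check differs.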
* [llvm-dwp] Remove some unused test code | David Blaikie | 2015-12-14 | 2 | -5/+0
  llvm-svn: 255481
* [Docs] Fix underlines that were too short or too long. | Akira Hatanaka | 2015-12-14 | 1 | -3/+3
  llvm-svn: 255480
* Added a triple flag for the x86-evenDirective test. | Michael Zuckerman | 2015-12-13 | 1 | -1/+1
  Continuation of rL255461
  Differential Revision: http://reviews.llvm.org/D15413
  llvm-svn: 255469
* Revert r255460, which still causes test failures on some platforms. | Cong Hou | 2015-12-13 | 3 | -176/+32
  Further investigation on the failures is ongoing.
  llvm-svn: 255463
* [X86][inline asm] support even directive | Michael Zuckerman | 2015-12-13 | 4 | -1/+76
  The .even directive aligns content to an even-numbered address.
  In AT&T syntax: .even
  In Microsoft syntax: even (without the dot).
  Differential Revision: http://reviews.llvm.org/D15413
  llvm-svn: 255462
* Fix a type issue in r255455. Should not use unsigned type as std::abs()'s template type. | Cong Hou | 2015-12-13 | 1 | -1/+1
  llvm-svn: 255461
* [LoopVectorizer] Refine loop vectorizer's register usage calculator by ignoring specific instructions. | Cong Hou | 2015-12-13 | 3 | -32/+176
  (This is the second attempt to check in this patch: REQUIRES: asserts is added to reg-usage.ll now.)
  LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the register usage for specific VFs. However, it takes into account many instructions that won't be vectorized, such as induction variables, GetElementPtr instructions, etc. This makes the loop vectorizer too conservative when choosing VF. In this patch, the induction variables that won't be vectorized plus GetElementPtr instructions will be added to the ValuesToIgnore set so that their register usage won't be considered any more.
  Differential revision: http://reviews.llvm.org/D15177
  llvm-svn: 255460
* Fix line endings | Simon Pilgrim | 2015-12-13 | 1 | -14/+14
  llvm-svn: 255459
* Replace <cstdint> by llvm/Support/DataTypes.h for the typedef of uint64_t. NFC. | Cong Hou | 2015-12-13 | 1 | -1/+1
  llvm-svn: 255458
* Add the missing header file <cstdint> needed by uint64_t | Cong Hou | 2015-12-13 | 1 | -0/+1
  llvm-svn: 255457
* Revert r255454 as it leads to several test failures on buildbots. | Cong Hou | 2015-12-13 | 3 | -175/+32
  llvm-svn: 255456
* Normalize MBB's successors' probabilities in several locations. | Cong Hou | 2015-12-13 | 12 | -27/+68
  This patch adds some missing calls to MBB::normalizeSuccProbs() in several locations where it should be called. Those places were found by checking whether the sum of successors' probabilities is approximately one in the MachineBlockPlacement pass with some instrumented code (not in this patch).
  Differential revision: http://reviews.llvm.org/D15259
  llvm-svn: 255455
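What normalizing successor probabilities means can be shown with a small stand-alone sketch. This is Python with exact fractions, not the MBB::normalizeSuccProbs() implementation; the function name and the even split used when all weights are zero are assumptions made for the illustration.

```python
from fractions import Fraction


def normalize_succ_probs(probs):
    """Scale non-negative weights so that they sum to exactly one."""
    if not probs:
        return []
    total = sum(probs)
    if total == 0:
        # Assumed fallback for this sketch: split evenly among successors.
        return [Fraction(1, len(probs))] * len(probs)
    return [Fraction(p) / total for p in probs]


# Probabilities that drifted away from a sum of one, for example after an edge was removed.
drifted = [Fraction(1, 4), Fraction(1, 4)]
normalized = normalize_succ_probs(drifted)
assert sum(normalized) == 1
```

The check the commit message describes corresponds to the final assertion: after normalization, the probabilities of a block's successors add up to one.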
* [LoopVectorizer] Refine loop vectorizer's register usage calculator by ignoring specific instructions. | Cong Hou | 2015-12-13 | 3 | -32/+175
  LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the register usage for specific VFs. However, it takes into account many instructions that won't be vectorized, such as induction variables, GetElementPtr instructions, etc. This makes the loop vectorizer too conservative when choosing VF. In this patch, the induction variables that won't be vectorized plus GetElementPtr instructions will be added to the ValuesToIgnore set so that their register usage won't be considered any more.
  Differential revision: http://reviews.llvm.org/D15177
  llvm-svn: 255454
* ARM: only emit EABI attributes on EABI targets | Saleem Abdulrasool | 2015-12-13 | 2 | -1/+12
  EABI attributes should only be emitted on EABI targets. This prevents the emission of the optimization goals EABI attribute on Windows ARM.
  llvm-svn: 255448
* Revert r255444. | Nico Weber | 2015-12-13 | 6 | -333/+0
  It doesn't build on Windows and broke the Windows LLD and LLDB bots:
    http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/27693/steps/build_Lld/logs/stdio
    http://lab.llvm.org:8011/builders/lldb-x86-windows-msvc/builds/13468/steps/build/logs/stdio
  llvm-svn: 255446
* Add a C++11 ThreadPool implementation in LLVM | Mehdi Amini | 2015-12-12 | 6 | -0/+333
  This is a very simple implementation of a thread pool using C++11 threads. It accepts any std::function<void()> for asynchronous execution. Individual tasks can be synchronized using the returned future, or the client can block on the full queue completion.
  In case LLVM is configured with threading disabled, it falls back to sequential execution using std::async with std::launch::deferred.
  This is intended to support parallelism for ThinLTO processing in the linker plugin, but is generic enough for any other uses.
  Differential Revision: http://reviews.llvm.org/D15464
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 255444
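The usage pattern described here (submit callables, wait on an individual future or on the whole queue, and run sequentially when threading is off) can be sketched with Python's standard `concurrent.futures` module. This is an analogy only, not the LLVM ThreadPool API; `run_all` and its parameters are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_all(tasks, threading_enabled=True, workers=4):
    """Run zero-argument callables and return their results in submission order."""
    if not threading_enabled:
        # Sequential fallback, similar in spirit to the deferred execution
        # the commit describes for builds with threading disabled.
        return [task() for task in tasks]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        # Each future can be waited on individually.
        results = [future.result() for future in futures]
    # Leaving the `with` block waits for the whole queue to drain.
    return results


tasks = [lambda i=i: i * i for i in range(5)]
assert run_all(tasks) == run_all(tasks, threading_enabled=False)
```

Keeping the sequential path behind the same function means callers do not need to know whether threads are available, which mirrors the design choice stated in the commit message.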
* [llvm-objdump/MachoDump] Simplify. | Davide Italiano | 2015-12-12 | 1 | -7/+3
  llvm-svn: 255443
* [X86][AVX512] Added support for VMOVQ shuffle comments | Simon Pilgrim | 2015-12-12 | 2 | -54/+26
  llvm-svn: 255442
* Partially fix memcpy / memset / memmove lowering in SelectionDAG construction if address space != 0. | Manuel Jacob | 2015-12-12 | 3 | -22/+31
  Summary: Previously SelectionDAGBuilder asserted that the pointer operands of memcpy / memset / memmove intrinsics are in address space < 256. This assert implicitly assumed the X86 backend, where all address spaces < 256 are equivalent to address space 0 from the code generator's point of view. On some targets (R600 and NVPTX) several address spaces < 256 have a target-defined meaning, so this assert made little sense for these targets.
  This patch removes this wrong assertion and adds extra checks before lowering these intrinsics to library calls. If a pointer operand can't be cast to address space 0 without changing semantics, a fatal error is reported to the user.
  The new behavior should be valid for all targets that give address spaces != 0 a target-specified meaning (NVPTX, R600, X86).
  NVPTX lowers big or variable-sized memory intrinsics before SelectionDAG construction. All other memory intrinsics are inlined (the threshold is set very high for this target).
  R600 doesn't support memcpy / memset / memmove library calls (previously the illegal emission of a call to such a library function triggered an error somewhere in the code generator).
  X86 now emits inline loads and stores for address spaces 256 and 257 up to the same threshold that is used for address space 0 and reports a fatal error otherwise.
  I call this a "partial fix" because there are still cases that can't be lowered. A fatal error is reported in these cases.
  Reviewers: arsenm, theraven, compnerd, hfinkel
  Subscribers: hfinkel, llvm-commits, alex
  Differential Revision: http://reviews.llvm.org/D7241
  llvm-svn: 255441
* [PGO] Stop using invalid char in instr variable names. | Xinliang David Li | 2015-12-12 | 4 | -10/+30
  Before the patch, a -fprofile-instr-generate compile will fail if no integrated-as is specified when the file contains any static functions (the -S output is also invalid).
  This is the second try. The fix in this patch is very localized. Only profile symbol names of profile symbols with internal linkage are fixed up, while the initializers of name syms are not changed. This means there is no format change nor version bump.
  llvm-svn: 255434
* [InstCombine] canonicalize (bitcast (extractelement X)) --> (extractelement(bitcast X)) | Sanjay Patel | 2015-12-12 | 2 | -30/+19
  This change was discussed in D15392. It allows us to remove the fold that was added in:
    http://reviews.llvm.org/r255261
  ...and it will allow us to generalize this fold:
    http://reviews.llvm.org/rL112232
  while preserving the order of bitcast + extract that it produces and testing shows is better handled by the backend.
  Note that the existing check for "isVectorTy()" wasn't strong enough in general and specifically because: x86_mmx. It's not a vector, but it's not vectorizable either. So here we check VectorType::isValidElementType() directly before proceeding with the transform.
  llvm-svn: 255433
* [X86][AVX] Tests tidyup | Simon Pilgrim | 2015-12-12 | 2 | -69/+68
  Cleanup/regenerate some tests for some upcoming patches.
  llvm-svn: 255432
* Try to appease sphinx | David Majnemer | 2015-12-12 | 1 | -0/+1
  llvm-svn: 255429
* Move catchpad-phi-cast.ll to the X86 specific subdirectory | David Majnemer | 2015-12-12 | 1 | -0/+0
  It is X86 specific and will not be properly exercised unless LLVM is built with the X86 target.
  llvm-svn: 255426
* Try to appease a buildbot | David Majnemer | 2015-12-12 | 1 | -0/+1
  The builder complains thusly: error C2027: use of undefined type 'llvm::raw_ostream'
  Try to make it happy by including raw_ostream.h
  llvm-svn: 255425
* [IR] Reformulate LLVM's EH funclet IR | David Majnemer | 2015-12-12 | 106 | -6086/+3078
  While we have successfully implemented a funclet-oriented EH scheme on top of LLVM IR, our scheme has some notable deficiencies:
  - catchendpad and cleanupendpad are necessary in the current design but they are difficult to explain to others, even to seasoned LLVM experts.
  - catchendpad and cleanupendpad are optimization barriers. They cannot be split and force all potentially throwing call-sites to be invokes. This has a noticeable effect on the quality of our code generation.
  - catchpad, while similar in some aspects to invoke, is fairly awkward. It is unsplittable, starts a funclet, and has control flow to other funclets.
  - The nesting relationship between funclets is currently a property of control flow edges. Because of this, we are forced to carefully analyze the flow graph to see if there might potentially exist illegal nesting among funclets. While we have logic to clone funclets when they are illegally nested, it would be nicer if we had a representation which forbade them upfront.
  Let's clean this up a bit by doing the following:
  - Instead, make catchpad more like cleanuppad and landingpad: no control flow, just a bunch of simple operands; catchpad would be splittable.
  - Introduce catchswitch, a control flow instruction designed to model the constraints of funclet oriented EH.
  - Make funclet scoping explicit by having funclet instructions consume the token produced by the funclet which contains them.
  - Remove catchendpad and cleanupendpad. Their presence can be inferred implicitly using coloring information.
  N.B. The state numbering code for the CLR has been updated but the veracity of its output cannot be spoken for. An expert should take a look to make sure the results are reasonable.
  Reviewers: rnk, JosephTremoulet, andrew.w.kaylor
  Differential Revision: http://reviews.llvm.org/D15139
  llvm-svn: 255422
* [PowerPC] OutStreamer cleanup in PPCAsmPrinter | Hal Finkel | 2015-12-12 | 1 | -23/+19
  We don't need to pass OutStreamer as a parameter to LowerSTACKMAP and LowerPATCHPOINT. It is a member variable of PPCAsmPrinter, and thus, is already available. NFC.
  llvm-svn: 255418
* [X86ISelLowering] Add additional support for multiplication-to-shift conversion. | Chen Li | 2015-12-12 | 2 | -3/+70
  Summary: This patch adds support for the conversions (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication cannot be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planning to add another patch to support negative cases after this.
  Reviewers: craig.topper, RKSimon
  Subscribers: aemerson, llvm-commits
  Differential Revision: http://reviews.llvm.org/D14603
  llvm-svn: 255415
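The two identities can be exercised with a short sketch. This is a Python model of the arithmetic with wrap-around at a fixed bit width, not the X86ISelLowering code; `mul_by_shift` is a hypothetical helper, and like the patch description it handles only positive constants.

```python
def mul_by_shift(x, c, bits=32):
    """Return x * c (mod 2**bits) via shift and add/sub, or None if c has neither form."""
    mask = (1 << bits) - 1
    if c > 2 and (c - 1) & (c - 2) == 0:
        # c == 2^N + 1, so x * c == (x << N) + x
        n = (c - 1).bit_length() - 1
        return ((x << n) + x) & mask
    if c > 0 and (c + 1) & c == 0:
        # c == 2^N - 1, so x * c == (x << N) - x
        n = (c + 1).bit_length() - 1
        return ((x << n) - x) & mask
    return None


# Constants of the form 2^N + 1 or 2^N - 1 agree with plain multiplication.
for constant in (3, 5, 7, 9, 15, 17):
    assert mul_by_shift(13, constant) == (13 * constant) & 0xFFFFFFFF
```

A constant such as 10 fits neither form, so the helper returns `None` and the multiplication would be left alone.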
* Fix test/CodeGen/PowerPC/ppc-shrink-wrapping.ll after r255398 | Hal Finkel | 2015-12-12 | 1 | -1/+1
  llvm-svn: 255414
* [InstCombine] allow any pair of bitcasts to be combined | Sanjay Patel | 2015-12-12 | 2 | -22/+18
  This change is discussed in D15392 and should allow us to effectively revert:
    http://llvm.org/viewvc/llvm-project?view=revision&revision=255261
  if we canonicalize bitcasts ahead of extracts.
  It should be safe to convert any pair of bitcasts into a single bitcast, however, it was mentioned here:
    http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20110829/127089.html
  that we're not allowed to bitcast from an x86_mmx to some other types, but I'm not seeing any failures from that, and we have regression tests in CodeGen/X86 that appear to cover all of those cases.
  Some day we'll get to remove that MMX wart from LLVM IR completely?
  Differential Revision: http://reviews.llvm.org/D15468
  llvm-svn: 255399
* [PowerPC] Add Branch Hints for Highly-Biased Branches | Hal Finkel | 2015-12-12 | 3 | -2/+209
  This branch adds hints for highly biased branches on the PPC architecture. Even in absence of profiling information, LLVM will mark code reaching unreachable terminators and other exceptional control flow constructs as highly unlikely to be reached.
  Patch by Tom Jablin!
  llvm-svn: 255398
* [WebAssembly] Update test expectations | Derek Schuff | 2015-12-12 | 1 | -108/+49
  Many tests are now passing due to eliminateFrameIndex implementation and the list needs to be re-triaged because it unblocks other failures, and some previous failures are different. However I'm about to churn it more by implementing more lowering, so will wait on that.
  llvm-svn: 255396
* Revert rL255391: [X86ISelLowering] Add additional support for multiplication-to-shift conversion. | Chen Li | 2015-12-12 | 2 | -71/+3
  because it broke buildbot.
  llvm-svn: 255395
* use FileCheck for better checking | Sanjay Patel | 2015-12-12 | 1 | -3/+22
  llvm-svn: 255394
* [WebAssembly] Implement prolog/epilog insertion and FrameIndex elimination | Derek Schuff | 2015-12-11 | 12 | -19/+1303
  Summary: Use the SP32 physical register as the base for FrameIndex lowering. Update it and the __stack_pointer global var in the prolog and epilog. Extend the mapping of virtual registers to wasm locals to include the physical registers.
  Rather than modify the target-independent PrologEpilogInserter (which asserts that there are no virtual registers left) include a slightly-modified copy for Wasm that does not have this assertion and only clears the virtual registers if scavenging was needed (which of course it isn't for wasm).
  Differential Revision: http://reviews.llvm.org/D15344
  llvm-svn: 255392
* [X86ISelLowering] Add additional support for multiplication-to-shift conversion. | Chen Li | 2015-12-11 | 2 | -3/+71
  Summary: This patch adds support for the conversions (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication cannot be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planning to add another patch to support negative cases after this.
  Reviewers: craig.topper, RKSimon
  Subscribers: aemerson, llvm-commits
  Differential Revision: http://reviews.llvm.org/D14603
  llvm-svn: 255391
* SamplePGO - Reduce memory utilization by 10x. | Diego Novillo | 2015-12-11 | 3 | -61/+8
  DenseMap is the wrong data structure to use for sample records and call sites. The keys are too large, causing massive core memory growth when reading profiles.
  Before this patch, a 21 MB input profile was causing the compiler to grow to 3 GB in memory. By switching to std::map, the compiler now grows to 300 MB in memory.
  There still are some opportunities for memory footprint reduction. I'll be looking at those next.
  llvm-svn: 255389
* SelectionDAG: Match min/max if the scalar operation is legal | Matt Arsenault | 2015-12-11 | 7 | -87/+361
  llvm-svn: 255388
* Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics | Hal Finkel | 2015-12-11 | 16 | -399/+35
  After much discussion, ending here:
    http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html
  it has been decided that, instead of having the vectorizer directly generate special absdiff and horizontal-add intrinsics, we'll recognize the relevant reduction patterns during CodeGen. Accordingly, these intrinsics are not needed (the operations they represent can be pattern matched, as is already done in some backends). Thus, we're backing these out in favor of the current development work.
  r248483 - Codegen: Fix llvm.*absdiff semantic.
  r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA
  r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
  r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation
  llvm-svn: 255387
* Avoid buffered reads of /dev/urandom | Rafael Espindola | 2015-12-11 | 1 | -4/+9
  I am seeing disappointing clang performance on a large PowerPC64 Linux box. GetRandomNumberSeed() does a buffered read from /dev/urandom to seed its PRNG. As a result we read an entire page even though we only need 4 bytes.
  With every clang task reading a page worth of /dev/urandom we end up spending a large amount of time stuck on a kernel spinlock.
  Patch by Anton Blanchard!
  llvm-svn: 255386
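The idea of asking the kernel for exactly the bytes needed, rather than letting a buffered stream pull in a whole page, can be sketched as follows. This is a Python illustration on a POSIX system, not the GetRandomNumberSeed() change itself; the function name and the little-endian decoding are arbitrary choices for the example.

```python
import os


def read_seed_unbuffered(num_bytes=4, path="/dev/urandom"):
    """Read exactly `num_bytes` from `path` using unbuffered system calls."""
    fd = os.open(path, os.O_RDONLY)
    try:
        data = b""
        while len(data) < num_bytes:
            # os.read issues a read(2) for at most the requested byte count.
            chunk = os.read(fd, num_bytes - len(data))
            if not chunk:
                raise OSError("unexpected end of file while reading " + path)
            data += chunk
    finally:
        os.close(fd)
    return int.from_bytes(data, "little")


seed = read_seed_unbuffered()
```

Because the file descriptor is used directly, no userspace buffer is filled, so only the 4 requested bytes are drawn from the device.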