summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [X86] Remove more dead code left over from the handling of i8/i16 ↵Craig Topper2018-04-041-21/+0
| | | | | | UMUL_LOHI/SMUL_LOHI that is no longer needed. NFC llvm-svn: 329152
* [SCEV] Prove implications for SCEVUnknown PhisMax Kazantsev2018-04-041-0/+118
| | | | | | | | | | | | | | | | | This patch teaches SCEV how to prove implications for SCEVUnknown nodes that are Phis. If we need to prove `Pred` for `LHS, RHS`, and `LHS` is a Phi with possible incoming values `L1, L2, ..., LN`, then if we prove `Pred` for `(L1, RHS), (L2, RHS), ..., (LN, RHS)` then we can also prove it for `(LHS, RHS)`. If both `LHS` and `RHS` are Phis from the same block, it is sufficient to prove the predicate for values that come from the same predecessor block. The typical case that it handles is that we sometimes need to prove that `Phi(Len, Len - 1) >= 0` given that `Len > 0`. The new logic was added to `isImpliedViaOperations` and only uses it and non-recursive reasoning to prove the facts we need, so it should not hurt compile time a lot. Differential Revision: https://reviews.llvm.org/D44001 Reviewed By: anna llvm-svn: 329150
* [X86] Remove dead code for handling i8/i16 UMUL_LOHI/SMUL_LOHI from ↵Craig Topper2018-04-041-12/+0
| | | | | | | | X86ISelDAGToDAG.cpp. NFC These are promoted to i16/i32 multiplies by a DAG combine. llvm-svn: 329147
* [X86] Remove some code that was only needed when i1 was a legal type. NFCCraig Topper2018-04-041-6/+0
| | | | llvm-svn: 329146
* [SimplifyCFG] Teach merge conditional stores to handle cases where the ↵Craig Topper2018-04-041-1/+16
| | | | | | | | | | | | | | | | | | | PostBB has more than 2 predecessors by inserting a new block for the store. Summary: Currently merge conditional stores can't handle cases where PostBB (the block we need to move the store to) has more than 2 predecessors. This patch removes that restriction by creating a new block with only the 2 predecessors we care about and an unconditional branch to the original block. This provides a place to put the store. Reviewers: efriedma, jmolloy, ABataev Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39760 llvm-svn: 329142
* Fix bad #include path in r329139Vlad Tsyrklevich2018-04-041-1/+1
| | | | llvm-svn: 329140
* Add the ShadowCallStack passVlad Tsyrklevich2018-04-044-0/+334
| | | | | | | | | | | | | | | | | | | | Summary: The ShadowCallStack pass instruments functions marked with the shadowcallstack attribute. The instrumented prolog saves the return address to [gs:offset] where offset is stored and updated in [gs:0]. The instrumented epilog loads/updates the return address from [gs:0] and checks that it matches the return address on the stack before returning. Reviewers: pcc, vitalybuka Reviewed By: pcc Subscribers: cryptoad, eugenis, craig.topper, mgorny, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D44802 llvm-svn: 329139
* Minor no-op cmake file style fix.Nico Weber2018-04-041-1/+2
| | | | llvm-svn: 329137
* Reapply r329133 with fix.Lang Hames2018-04-041-5/+36
| | | | llvm-svn: 329136
* Revert r329133 "[RuntimeDyld][AArch64] Add some error pluming / generation..."Lang Hames2018-04-041-32/+4
| | | | | | This broke a number of buildbots. Looking in to it now... llvm-svn: 329135
* [MachineOutliner] Test for X86FI->getUsesRedZone() as well as ↵Jessica Paquette2018-04-031-1/+5
| | | | | | | | | | | | | | | | Attribute::NoRedZone This commit is similar to r329120, but uses the existing getUsesRedZone() function in X86MachineFunctionInfo. This teaches the outliner to look at whether or not a function *truly* uses a redzone instead of just the noredzone attribute on a function. Thus, after this commit, it's possible to outline from x86 without using -mno-red-zone and still get outlining results. This also adds a new test for the new redzone behaviour. llvm-svn: 329134
* [RuntimeDyld][AArch64] Add some error pluming / generation to catch unhandledLang Hames2018-04-031-4/+32
| | | | | | relocation types on AArch64. llvm-svn: 329133
* [AMDGPU] performMinMaxCombine should not optimize patterns of vectors to ↵Farhana Aleen2018-04-031-1/+1
| | | | | | | | | | | | | | | | min3/max3. Summary: There are no packed instructions for min3 or max3. So, performMinMaxCombine should not optimize vectors of f16 to min3/max3. Author: FarhanaAleen Reviewed By: arsenm Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D45219 llvm-svn: 329131
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-04-031-12/+12
| | | | | | Fix typo and simplify matching expression. llvm-svn: 329130
* [Hexagon] peel loops with runtime small trip countsIkhlas Ajbar2018-04-032-4/+2
| | | | | | | | Move the check canPeel() to Hexagon Target before setting PeelCount. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329129
* [InstCombine] allow more fmul folds with 'reassoc'Sanjay Patel2018-04-031-64/+64
| | | | | | | | The tests marked with 'FIXME' require loosening the check in SimplifyAssociativeOrCommutative() to optimize completely; that's still checking isFast() in Instruction::isAssociative(). llvm-svn: 329121
* [MachineOutliner] Keep track of fns that use a redzone in AArch64FunctionInfoJessica Paquette2018-04-033-7/+27
| | | | | | | | | | | | This patch adds a hasRedZone() function to AArch64MachineFunctionInfo. It returns true if the function is known to use a redzone, false if it is known to not use a redzone, and no value otherwise. This removes the requirement to pass -mno-red-zone when outlining for AArch64. https://reviews.llvm.org/D45189 llvm-svn: 329120
* Revert "MSG"Farhana Aleen2018-04-031-1/+1
| | | | | | | | This reverts commit 9a0ce889d1c39c74d69ecad5ce9c875155ae55de. This was committed by mistake. llvm-svn: 329119
* Fix bad copy-and-paste in r329108Vlad Tsyrklevich2018-04-031-1/+1
| | | | llvm-svn: 329118
* [MachineOutliner][NFC] Make outlined functions have internal linkageJessica Paquette2018-04-031-1/+1
| | | | | | | | | | | | The linkage type on outlined functions was private before. This meant that if you set a breakpoint in an outlined function, the debugger wouldn't be able to give a sane name to the outlined function. This commit changes the linkage type to internal and updates any tests that relied on the prefixes on the names of outlined functions. llvm-svn: 329116
* MSGFarhana Aleen2018-04-031-1/+1
| | | | llvm-svn: 329114
* [coroutines] Respect alloca alignment requirements when building coroutine frameGor Nishanov2018-04-032-9/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: If an alloca need to be stored in the coroutine frame and it has an alignment specified and the alignment does not match the natural alignment of the alloca type. Insert appropriate padding into the coroutine frame to make sure that it gets requested alignment. For example for a packet type (which natural alignment is 1), but alloca alignment is 8, we may need to insert a padding field with required number of bytes to make sure it is properly aligned. ``` %PackedStruct = type <{ i64 }> ... %data = alloca %PackedStruct, align 8 ``` If the previous field in the coroutine frame had alignment 2, we would have [6 x i8] inserted before %PackedStruct in the coroutine frame: ``` %f.Frame = type { ..., i16, [6 x i8], %PackedStruct } ``` Reviewers: rnk, lewissbaker, modocache Reviewed By: modocache Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D45221 llvm-svn: 329112
* [LoopInterchange] Add remark for calls preventing interchanging.Florian Hahn2018-04-031-0/+7
| | | | | | | | | | | | | | It also updates test/Transforms/LoopInterchange/call-instructions.ll to use accesses where we can prove dependence after D35430. Reviewers: sebpop, karthikthecool, blitz.opensource Reviewed By: sebpop Differential Revision: https://reviews.llvm.org/D45206 llvm-svn: 329111
* Add the ShadowCallStack attributeVlad Tsyrklevich2018-04-039-0/+16
| | | | | | | | | | | | | | | | | | Summary: Introduce the ShadowCallStack function attribute. It's added to functions compiled with -fsanitize=shadow-call-stack in order to mark functions to be instrumented by a ShadowCallStack pass to be submitted in a separate change. Reviewers: pcc, kcc, kubamracek Reviewed By: pcc, kcc Subscribers: cryptoad, mehdi_amini, javed.absar, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D44800 llvm-svn: 329108
* [DebugInfoPDB] Add a few missing definitions to PDBTypes.hAaron Smith2018-04-031-0/+3
| | | | | | | | | The missing definitions are from cvconst.h shipped with DIA SDK. Correct the url to MSDN for MemoryTypeEnum and set the underlying type of PDB_StackFrameType and PDB_MemoryType to uint16_t. llvm-svn: 329104
* [CodeGen]Add NoVRegs property on PostRASink and ShrinkWrapJun Bum Lim2018-04-034-3/+14
| | | | | | | | | | | | | | | | | Summary: This change declare that PostRAMachineSinking and ShrinkWrap require NoVRegs property, so now the MachineFunctionPass can enforce this check. These passes are disabled in NVPTX & WebAssembly. Reviewers: dschuff, jlebar, tra, jgravelle-google, MatzeB, sebpop, thegameg, mcrosier Reviewed By: dschuff, thegameg Subscribers: jholewinski, jfb, sbc100, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D45183 llvm-svn: 329095
* [SLP] Fixed formatting, NFC.Alexey Bataev2018-04-031-1/+2
| | | | llvm-svn: 329091
* [DEBUGINFO] Add option that allows to disable emission of flags in .loc ↵Alexey Bataev2018-04-032-19/+32
| | | | | | | | | | | | | | | | | | directives. Summary: Some targets do not support extended format of .loc directive and support only simple format: .loc <FileID> <Line> <Column>. Patch adds MCAsmInfo flag and option that allows emit .loc directive without additional flags. Reviewers: echristo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45184 llvm-svn: 329089
* [InstCombine] Fold compare of int constant against a splatted vector of intsDaniel Neilson2018-04-032-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Folding patterns like: %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %ext = extractelement <4 x i8> %insvec, i32 0 %cond = icmp eq i32 %ext, 0 Combined with existing rules, this allows us to fold patterns like: %insvec = insertelement <4 x i8> undef, i8 %val, i32 0 %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %cond = icmp eq i8 %val, 0 When we construct a splat vector via a shuffle, and bitcast the vector into an integer type for comparison against an integer constant. Then we can simplify the the comparison to compare the splatted value against the integer constant. Reviewers: spatel, anna, mkazantsev Reviewed By: spatel Subscribers: efriedma, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D44997 llvm-svn: 329087
* [SLP] Fix PR36481: vectorize reassociated instructions.Alexey Bataev2018-04-032-92/+289
| | | | | | | | | | | | | | | | | | Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Patch does not support reordering of the repeated instruction, this must be handled in the separate patch. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 329085
* Recommit "[SLP] Fix issues with debug output in the SLP vectorizer."Alexey Bataev2018-04-031-3/+4
| | | | | | | | | | | | | The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used errs() rather than dbgs(). llvm-svn: 329082
* [Hexagon] Remove -mhvx-double and the corresponding subtarget featureKrzysztof Parzyszek2018-04-033-50/+32
| | | | | | | Specifying the HVX vector length should be done via the -mhvx-length option. llvm-svn: 329079
* Adding optional Name parameter to createVirtualRegister and ↵Puyan Lotfi2018-04-031-4/+5
| | | | | | createGenericVirtualRegister. llvm-svn: 329076
* Revert "[SLP] Fix PR36481: vectorize reassociated instructions."Benjamin Kramer2018-04-032-291/+95
| | | | | | This reverts commit r328980 and r329046. Makes the vectorizer crash. llvm-svn: 329071
* [MC] Fix -Wmissing-field-initializer warning after r329067.Andrea Di Biagio2018-04-031-0/+1
| | | | | | | | This should fix the problem reported by the lld buildbots: - Builder lld-x86_64-darwin13, Build #19782 - Builder lld-perf-testsuite, Build #1419 llvm-svn: 329068
* [MC][Tablegen] Allow the definition of processor register files in the ↵Andrea Di Biagio2018-04-031-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | scheduling model for llvm-mca This patch allows the description of register files in processor scheduling models. This addresses PR36662. A new tablegen class named 'RegisterFile' has been added to TargetSchedule.td. Targets can optionally describe register files for their processors using that class. In particular, class RegisterFile allows to specify: - The total number of physical registers. - Which target registers are accessible through the register file. - The cost of allocating a register at register renaming stage. Example (from this patch - see file X86/X86ScheduleBtVer2.td) def FpuPRF : RegisterFile<72, [VR64, VR128, VR256], [1, 1, 2]> Here, FpuPRF describes a register file for MMX/XMM/YMM registers. On Jaguar (btver2), a YMM register definition consumes 2 physical registers, while MMX/XMM register definitions only cost 1 physical register. The syntax allows to specify an empty set of register classes. An empty set of register classes means: this register file models all the registers specified by the Target. For each register class, users can specify an optional register cost. By default, register costs default to 1. A value of 0 for the number of physical registers means: "this register file has an unbounded number of physical registers". This patch is structured in two parts. * Part 1 - MC/Tablegen * A first part adds the tablegen definition of RegisterFile, and teaches the SubtargetEmitter how to emit information related to register files. Information about register files is accessible through an instance of MCExtraProcessorInfo. The idea behind this design is to logically partition the processor description which is only used by external tools (like llvm-mca) from the processor information used by the llvm machine schedulers. I think that this design would make easier for targets to get rid of the extra processor information if they don't want it. * Part 2 - llvm-mca related * The second part of this patch is related to changes to llvm-mca. The main differences are: 1) class RegisterFile now needs to take into account the "cost of a register" when allocating physical registers at register renaming stage. 2) Point 1. triggered a minor refactoring which lef to the removal of the "maximum 32 register files" restriction. 3) The BackendStatistics view has been updated so that we can print out extra details related to each register file implemented by the processor. The effect of point 3. is also visible in tests register-files-[1..5].s. Differential Revision: https://reviews.llvm.org/D44980 llvm-svn: 329067
* [PowerPC] reorder entries in P9InstrResources.td in alphabetical order; NFCHiroshi Inoue2018-04-031-1/+1
| | | | | | Reorder entries added in my previous commit (rL328969) to keep alphabetical order. llvm-svn: 329064
* MSan: introduce the conservative assembly handling mode.Alexander Potapenko2018-04-031-1/+49
| | | | | | | | | | | | The default assembly handling mode may introduce false positives in the cases when MSan doesn't understand that the assembly call initializes the memory pointed to by one of its arguments. We introduce the conservative mode, which initializes the first |sizeof(type)| bytes for every |type*| pointer passed into the assembly statement. llvm-svn: 329054
* [SCEV] Fix PR36974.Serguei Katkov2018-04-031-5/+6
| | | | | | | | | | | | | The patch changes the usage of dominate to properlyDominate to satisfy the condition !(a < a) while using std::max. It is actually NFC due to set data structure is used to keep the Loops and no two identical loops can be in collection. So in reality there is no difference between usage of dominate and properlyDominate in this particular case. However it might be changed so it is better to fix it. llvm-svn: 329051
* [X86] Reduce number of OpPrefix bits in TSFlags to 2. NFCICraig Topper2018-04-033-33/+37
| | | | | | TSFlag doesn't need to disambiguate NoPrfx from PS. So shift the encodings so PS is NoPrfx|0x4. llvm-svn: 329049
* [SCEV] Make computeExitLimit more simple and more powerfulMax Kazantsev2018-04-031-58/+17
| | | | | | | | | | | | | | | | | | | | | | | Current implementation of `computeExitLimit` has a big piece of code the only purpose of which is to prove that after the execution of this block the latch will be executed. What it currently checks is actually a subset of situations where the exiting block dominates latch. This patch replaces all these checks for simple particular cases with domination check over loop's latch which is the only necessary condition of taking the exiting block into consideration. This change allows to calculate exact loop taken count for simple loops like for (int i = 0; i < 100; i++) { if (cond) {...} else {...} if (i > 50) break; . . . } Differential Revision: https://reviews.llvm.org/D44677 Reviewed By: efriedma llvm-svn: 329047
* [SLP] Fix issues with debug output in the SLP vectorizer.Chandler Carruth2018-04-031-10/+10
| | | | | | | | | | | | | The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used `errs()` rather than `dbgs()`. llvm-svn: 329046
* bpf: fix incorrect SELECT_CC loweringYonghong Song2018-04-031-6/+0
| | | | | | | | | | | | | | | | | | | | | | | Commit 37962a331c77 ("bpf: Improve expanding logic in LowerSELECT_CC") intended to improve code quality for certain jmp conditions. The commit, however, has a couple of issues: (1). In code, just swap is not enough, ConditionalCode CC should also be swapped, otherwise incorrect code will be generated. (2). The ConditionalCode swap should be subject to getHasJmpExt(). If getHasJmpExt() is False, certain conditional codes will not be supported and swap may generate incorrect code. The original goal for this patch is to optimize jmp operations which does not have JmpExt turned on. If JmpExt is on, better code could be generated. For example, the test select_ri.ll is introduced to demonstrate the optimization. The same result can be achieved with -mcpu=v2 flag. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 329043
* peel loops with runtime small trip countsIkhlas Ajbar2018-04-033-2/+17
| | | | | | | | | | For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329042
* [SLP] Distinguish "demanded and shrinkable" from "demanded and not ↵Haicheng Wu2018-04-031-17/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | shrinkable" values when determining the minimum bitwidth We use two approaches for determining the minimum bitwidth. * Demanded bits * Value tracking If demanded bits doesn't result in a narrower type, we then try value tracking. We need this if we want to root SLP trees with the indices of getelementptr instructions since all the bits of the indices are demanded. But there is a missing piece though. We need to be able to distinguish "demanded and shrinkable" from "demanded and not shrinkable". For example, the bits of %i in %i = sext i32 %e1 to i64 %gep = getelementptr inbounds i64, i64* %p, i64 %i are demanded, but we can shrink %i's type to i32 because it won't change the result of the getelementptr. On the other hand, in %tmp15 = sext i32 %tmp14 to i64 %tmp16 = insertvalue { i64, i64 } undef, i64 %tmp15, 0 it doesn't make sense to shrink %tmp15 and we can skip the value tracking. Ideas are from Matthew Simpson! Differential Revision: https://reviews.llvm.org/D44868 llvm-svn: 329035
* [Coroutines] Avoid assert splitting hidden corosBrian Gesiak2018-04-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When attempting to split a coroutine with 'hidden' visibility (for example, a C++ coroutine that is inlined when compiled with the option '-fvisibility-inlines-hidden'), LLVM would hit an assertion in include/llvm/IR/GlobalValue.h:240: "local linkage requires default visibility". The issue is that the visibility is copied from the source of the function split in the `CloneFunctionInto` function, but the linkage is not. To fix, create the new function first with external linkage, then copy the linkage from the original function *after* `CloneFunctionInto` is called. Since `GlobalValue::setLinkage` in turn calls `maybeSetDsoLocal`, the explicit call to `setDSOLocal` can be removed in CoroSplit.cpp. Test Plan: check-llvm Reviewers: GorNishanov, lewissbaker, EricWF, majnemer, rnk Reviewed By: rnk Subscribers: llvm-commits, eric_niebler Differential Revision: https://reviews.llvm.org/D44185 llvm-svn: 329033
* Align stubs for external and common global variables to pointer size.Rafael Espindola2018-04-021-0/+1
| | | | | | | | | This patch fixes PR36885: clang++ generates unaligned stub symbol holding a pointer. Patch by Rahul Chaudhry! llvm-svn: 329030
* [InstCombine] Don't strip function type casts from musttail callsReid Kleckner2018-04-021-1/+10
| | | | | | | | | | | | | | | Summary: The cast simplifications that instcombine does here do not make any attempt to obey the verifier rules for musttail calls. Therefore we have to disable them. Reviewers: efriedma, majnemer, pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45186 llvm-svn: 329027
* Treat inlining a notail call as a regular, non-tail callReid Kleckner2018-04-021-0/+6
| | | | | | | | | Otherwise, we end up inlining a musttail call into a non-tail position, which breaks verifier invariants. Fixes PR31014 llvm-svn: 329015
* [ORC] Create a new SymbolStringPool by default in ExecutionSession constructor.Lang Hames2018-04-023-12/+3
| | | | | | This makes the common case of constructing an ExecutionSession tidier. llvm-svn: 329013
OpenPOWER on IntegriCloud