summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU] performMinMaxCombine should not optimize patterns of vectors to ↵Farhana Aleen2018-04-031-1/+1
| | | | | | | | | | | | | | | | min3/max3. Summary: There are no packed instructions for min3 or max3. So, performMinMaxCombine should not optimize vectors of f16 to min3/max3. Author: FarhanaAleen Reviewed By: arsenm Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D45219 llvm-svn: 329131
* [AArch64] Adjust the cost model for Exynos M3Evandro Menezes2018-04-031-12/+12
| | | | | | Fix typo and simplify matching expression. llvm-svn: 329130
* [Hexagon] peel loops with runtime small trip countsIkhlas Ajbar2018-04-032-4/+2
| | | | | | | | Move the check canPeel() to Hexagon Target before setting PeelCount. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329129
* [InstCombine] allow more fmul folds with 'reassoc'Sanjay Patel2018-04-031-64/+64
| | | | | | | | The tests marked with 'FIXME' require loosening the check in SimplifyAssociativeOrCommutative() to optimize completely; that's still checking isFast() in Instruction::isAssociative(). llvm-svn: 329121
* [MachineOutliner] Keep track of fns that use a redzone in AArch64FunctionInfoJessica Paquette2018-04-033-7/+27
| | | | | | | | | | | | This patch adds a hasRedZone() function to AArch64MachineFunctionInfo. It returns true if the function is known to use a redzone, false if it is known to not use a redzone, and no value otherwise. This removes the requirement to pass -mno-red-zone when outlining for AArch64. https://reviews.llvm.org/D45189 llvm-svn: 329120
* Revert "MSG"Farhana Aleen2018-04-031-1/+1
| | | | | | | | This reverts commit 9a0ce889d1c39c74d69ecad5ce9c875155ae55de. This was committed by mistake. llvm-svn: 329119
* Fix bad copy-and-paste in r329108Vlad Tsyrklevich2018-04-031-1/+1
| | | | llvm-svn: 329118
* [MachineOutliner][NFC] Make outlined functions have internal linkageJessica Paquette2018-04-031-1/+1
| | | | | | | | | | | | The linkage type on outlined functions was private before. This meant that if you set a breakpoint in an outlined function, the debugger wouldn't be able to give a sane name to the outlined function. This commit changes the linkage type to internal and updates any tests that relied on the prefixes on the names of outlined functions. llvm-svn: 329116
* MSGFarhana Aleen2018-04-031-1/+1
| | | | llvm-svn: 329114
* [coroutines] Respect alloca alignment requirements when building coroutine frameGor Nishanov2018-04-032-9/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: If an alloca need to be stored in the coroutine frame and it has an alignment specified and the alignment does not match the natural alignment of the alloca type. Insert appropriate padding into the coroutine frame to make sure that it gets requested alignment. For example for a packet type (which natural alignment is 1), but alloca alignment is 8, we may need to insert a padding field with required number of bytes to make sure it is properly aligned. ``` %PackedStruct = type <{ i64 }> ... %data = alloca %PackedStruct, align 8 ``` If the previous field in the coroutine frame had alignment 2, we would have [6 x i8] inserted before %PackedStruct in the coroutine frame: ``` %f.Frame = type { ..., i16, [6 x i8], %PackedStruct } ``` Reviewers: rnk, lewissbaker, modocache Reviewed By: modocache Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D45221 llvm-svn: 329112
* [LoopInterchange] Add remark for calls preventing interchanging.Florian Hahn2018-04-031-0/+7
| | | | | | | | | | | | | | It also updates test/Transforms/LoopInterchange/call-instructions.ll to use accesses where we can prove dependence after D35430. Reviewers: sebpop, karthikthecool, blitz.opensource Reviewed By: sebpop Differential Revision: https://reviews.llvm.org/D45206 llvm-svn: 329111
* Add the ShadowCallStack attributeVlad Tsyrklevich2018-04-039-0/+16
| | | | | | | | | | | | | | | | | | Summary: Introduce the ShadowCallStack function attribute. It's added to functions compiled with -fsanitize=shadow-call-stack in order to mark functions to be instrumented by a ShadowCallStack pass to be submitted in a separate change. Reviewers: pcc, kcc, kubamracek Reviewed By: pcc, kcc Subscribers: cryptoad, mehdi_amini, javed.absar, llvm-commits, kcc Differential Revision: https://reviews.llvm.org/D44800 llvm-svn: 329108
* [DebugInfoPDB] Add a few missing definitions to PDBTypes.hAaron Smith2018-04-031-0/+3
| | | | | | | | | The missing definitions are from cvconst.h shipped with DIA SDK. Correct the url to MSDN for MemoryTypeEnum and set the underlying type of PDB_StackFrameType and PDB_MemoryType to uint16_t. llvm-svn: 329104
* [CodeGen]Add NoVRegs property on PostRASink and ShrinkWrapJun Bum Lim2018-04-034-3/+14
| | | | | | | | | | | | | | | | | Summary: This change declare that PostRAMachineSinking and ShrinkWrap require NoVRegs property, so now the MachineFunctionPass can enforce this check. These passes are disabled in NVPTX & WebAssembly. Reviewers: dschuff, jlebar, tra, jgravelle-google, MatzeB, sebpop, thegameg, mcrosier Reviewed By: dschuff, thegameg Subscribers: jholewinski, jfb, sbc100, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D45183 llvm-svn: 329095
* [SLP] Fixed formatting, NFC.Alexey Bataev2018-04-031-1/+2
| | | | llvm-svn: 329091
* [DEBUGINFO] Add option that allows to disable emission of flags in .loc ↵Alexey Bataev2018-04-032-19/+32
| | | | | | | | | | | | | | | | | | directives. Summary: Some targets do not support extended format of .loc directive and support only simple format: .loc <FileID> <Line> <Column>. Patch adds MCAsmInfo flag and option that allows emit .loc directive without additional flags. Reviewers: echristo Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45184 llvm-svn: 329089
* [InstCombine] Fold compare of int constant against a splatted vector of intsDaniel Neilson2018-04-032-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Folding patterns like: %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %ext = extractelement <4 x i8> %insvec, i32 0 %cond = icmp eq i32 %ext, 0 Combined with existing rules, this allows us to fold patterns like: %insvec = insertelement <4 x i8> undef, i8 %val, i32 0 %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer %cast = bitcast <4 x i8> %vec to i32 %cond = icmp eq i32 %cast, 0 into: %cond = icmp eq i8 %val, 0 When we construct a splat vector via a shuffle, and bitcast the vector into an integer type for comparison against an integer constant. Then we can simplify the the comparison to compare the splatted value against the integer constant. Reviewers: spatel, anna, mkazantsev Reviewed By: spatel Subscribers: efriedma, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D44997 llvm-svn: 329087
* [SLP] Fix PR36481: vectorize reassociated instructions.Alexey Bataev2018-04-032-92/+289
| | | | | | | | | | | | | | | | | | Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Patch does not support reordering of the repeated instruction, this must be handled in the separate patch. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 329085
* Recommit "[SLP] Fix issues with debug output in the SLP vectorizer."Alexey Bataev2018-04-031-3/+4
| | | | | | | | | | | | | The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used errs() rather than dbgs(). llvm-svn: 329082
* [Hexagon] Remove -mhvx-double and the corresponding subtarget featureKrzysztof Parzyszek2018-04-033-50/+32
| | | | | | | Specifying the HVX vector length should be done via the -mhvx-length option. llvm-svn: 329079
* Adding optional Name parameter to createVirtualRegister and ↵Puyan Lotfi2018-04-031-4/+5
| | | | | | createGenericVirtualRegister. llvm-svn: 329076
* Revert "[SLP] Fix PR36481: vectorize reassociated instructions."Benjamin Kramer2018-04-032-291/+95
| | | | | | This reverts commit r328980 and r329046. Makes the vectorizer crash. llvm-svn: 329071
* [MC] Fix -Wmissing-field-initializer warning after r329067.Andrea Di Biagio2018-04-031-0/+1
| | | | | | | | This should fix the problem reported by the lld buildbots: - Builder lld-x86_64-darwin13, Build #19782 - Builder lld-perf-testsuite, Build #1419 llvm-svn: 329068
* [MC][Tablegen] Allow the definition of processor register files in the ↵Andrea Di Biagio2018-04-031-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | scheduling model for llvm-mca This patch allows the description of register files in processor scheduling models. This addresses PR36662. A new tablegen class named 'RegisterFile' has been added to TargetSchedule.td. Targets can optionally describe register files for their processors using that class. In particular, class RegisterFile allows to specify: - The total number of physical registers. - Which target registers are accessible through the register file. - The cost of allocating a register at register renaming stage. Example (from this patch - see file X86/X86ScheduleBtVer2.td) def FpuPRF : RegisterFile<72, [VR64, VR128, VR256], [1, 1, 2]> Here, FpuPRF describes a register file for MMX/XMM/YMM registers. On Jaguar (btver2), a YMM register definition consumes 2 physical registers, while MMX/XMM register definitions only cost 1 physical register. The syntax allows to specify an empty set of register classes. An empty set of register classes means: this register file models all the registers specified by the Target. For each register class, users can specify an optional register cost. By default, register costs default to 1. A value of 0 for the number of physical registers means: "this register file has an unbounded number of physical registers". This patch is structured in two parts. * Part 1 - MC/Tablegen * A first part adds the tablegen definition of RegisterFile, and teaches the SubtargetEmitter how to emit information related to register files. Information about register files is accessible through an instance of MCExtraProcessorInfo. The idea behind this design is to logically partition the processor description which is only used by external tools (like llvm-mca) from the processor information used by the llvm machine schedulers. I think that this design would make easier for targets to get rid of the extra processor information if they don't want it. * Part 2 - llvm-mca related * The second part of this patch is related to changes to llvm-mca. The main differences are: 1) class RegisterFile now needs to take into account the "cost of a register" when allocating physical registers at register renaming stage. 2) Point 1. triggered a minor refactoring which lef to the removal of the "maximum 32 register files" restriction. 3) The BackendStatistics view has been updated so that we can print out extra details related to each register file implemented by the processor. The effect of point 3. is also visible in tests register-files-[1..5].s. Differential Revision: https://reviews.llvm.org/D44980 llvm-svn: 329067
* [PowerPC] reorder entries in P9InstrResources.td in alphabetical order; NFCHiroshi Inoue2018-04-031-1/+1
| | | | | | Reorder entries added in my previous commit (rL328969) to keep alphabetical order. llvm-svn: 329064
* MSan: introduce the conservative assembly handling mode.Alexander Potapenko2018-04-031-1/+49
| | | | | | | | | | | | The default assembly handling mode may introduce false positives in the cases when MSan doesn't understand that the assembly call initializes the memory pointed to by one of its arguments. We introduce the conservative mode, which initializes the first |sizeof(type)| bytes for every |type*| pointer passed into the assembly statement. llvm-svn: 329054
* [SCEV] Fix PR36974.Serguei Katkov2018-04-031-5/+6
| | | | | | | | | | | | | The patch changes the usage of dominate to properlyDominate to satisfy the condition !(a < a) while using std::max. It is actually NFC due to set data structure is used to keep the Loops and no two identical loops can be in collection. So in reality there is no difference between usage of dominate and properlyDominate in this particular case. However it might be changed so it is better to fix it. llvm-svn: 329051
* [X86] Reduce number of OpPrefix bits in TSFlags to 2. NFCICraig Topper2018-04-033-33/+37
| | | | | | TSFlag doesn't need to disambiguate NoPrfx from PS. So shift the encodings so PS is NoPrfx|0x4. llvm-svn: 329049
* [SCEV] Make computeExitLimit more simple and more powerfulMax Kazantsev2018-04-031-58/+17
| | | | | | | | | | | | | | | | | | | | | | | Current implementation of `computeExitLimit` has a big piece of code the only purpose of which is to prove that after the execution of this block the latch will be executed. What it currently checks is actually a subset of situations where the exiting block dominates latch. This patch replaces all these checks for simple particular cases with domination check over loop's latch which is the only necessary condition of taking the exiting block into consideration. This change allows to calculate exact loop taken count for simple loops like for (int i = 0; i < 100; i++) { if (cond) {...} else {...} if (i > 50) break; . . . } Differential Revision: https://reviews.llvm.org/D44677 Reviewed By: efriedma llvm-svn: 329047
* [SLP] Fix issues with debug output in the SLP vectorizer.Chandler Carruth2018-04-031-10/+10
| | | | | | | | | | | | | The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used `errs()` rather than `dbgs()`. llvm-svn: 329046
* bpf: fix incorrect SELECT_CC loweringYonghong Song2018-04-031-6/+0
| | | | | | | | | | | | | | | | | | | | | | | Commit 37962a331c77 ("bpf: Improve expanding logic in LowerSELECT_CC") intended to improve code quality for certain jmp conditions. The commit, however, has a couple of issues: (1). In code, just swap is not enough, ConditionalCode CC should also be swapped, otherwise incorrect code will be generated. (2). The ConditionalCode swap should be subject to getHasJmpExt(). If getHasJmpExt() is False, certain conditional codes will not be supported and swap may generate incorrect code. The original goal for this patch is to optimize jmp operations which does not have JmpExt turned on. If JmpExt is on, better code could be generated. For example, the test select_ri.ll is introduced to demonstrate the optimization. The same result can be achieved with -mcpu=v2 flag. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> llvm-svn: 329043
* peel loops with runtime small trip countsIkhlas Ajbar2018-04-033-2/+17
| | | | | | | | | | For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 329042
* [SLP] Distinguish "demanded and shrinkable" from "demanded and not ↵Haicheng Wu2018-04-031-17/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | shrinkable" values when determining the minimum bitwidth We use two approaches for determining the minimum bitwidth. * Demanded bits * Value tracking If demanded bits doesn't result in a narrower type, we then try value tracking. We need this if we want to root SLP trees with the indices of getelementptr instructions since all the bits of the indices are demanded. But there is a missing piece though. We need to be able to distinguish "demanded and shrinkable" from "demanded and not shrinkable". For example, the bits of %i in %i = sext i32 %e1 to i64 %gep = getelementptr inbounds i64, i64* %p, i64 %i are demanded, but we can shrink %i's type to i32 because it won't change the result of the getelementptr. On the other hand, in %tmp15 = sext i32 %tmp14 to i64 %tmp16 = insertvalue { i64, i64 } undef, i64 %tmp15, 0 it doesn't make sense to shrink %tmp15 and we can skip the value tracking. Ideas are from Matthew Simpson! Differential Revision: https://reviews.llvm.org/D44868 llvm-svn: 329035
* [Coroutines] Avoid assert splitting hidden corosBrian Gesiak2018-04-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When attempting to split a coroutine with 'hidden' visibility (for example, a C++ coroutine that is inlined when compiled with the option '-fvisibility-inlines-hidden'), LLVM would hit an assertion in include/llvm/IR/GlobalValue.h:240: "local linkage requires default visibility". The issue is that the visibility is copied from the source of the function split in the `CloneFunctionInto` function, but the linkage is not. To fix, create the new function first with external linkage, then copy the linkage from the original function *after* `CloneFunctionInto` is called. Since `GlobalValue::setLinkage` in turn calls `maybeSetDsoLocal`, the explicit call to `setDSOLocal` can be removed in CoroSplit.cpp. Test Plan: check-llvm Reviewers: GorNishanov, lewissbaker, EricWF, majnemer, rnk Reviewed By: rnk Subscribers: llvm-commits, eric_niebler Differential Revision: https://reviews.llvm.org/D44185 llvm-svn: 329033
* Align stubs for external and common global variables to pointer size.Rafael Espindola2018-04-021-0/+1
| | | | | | | | | This patch fixes PR36885: clang++ generates unaligned stub symbol holding a pointer. Patch by Rahul Chaudhry! llvm-svn: 329030
* [InstCombine] Don't strip function type casts from musttail callsReid Kleckner2018-04-021-1/+10
| | | | | | | | | | | | | | | Summary: The cast simplifications that instcombine does here do not make any attempt to obey the verifier rules for musttail calls. Therefore we have to disable them. Reviewers: efriedma, majnemer, pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45186 llvm-svn: 329027
* Treat inlining a notail call as a regular, non-tail callReid Kleckner2018-04-021-0/+6
| | | | | | | | | Otherwise, we end up inlining a musttail call into a non-tail position, which breaks verifier invariants. Fixes PR31014 llvm-svn: 329015
* [ORC] Create a new SymbolStringPool by default in ExecutionSession constructor.Lang Hames2018-04-023-12/+3
| | | | | | This makes the common case of constructing an ExecutionSession tidier. llvm-svn: 329013
* [InstCombine] add folds for icmp + sub (PR36969)Sanjay Patel2018-04-021-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | (A - B) >u A --> A <u B C <u (C - D) --> C <u D https://rise4fun.com/Alive/e7j Name: ugt %sub = sub i8 %x, %y %cmp = icmp ugt i8 %sub, %x => %cmp = icmp ult i8 %x, %y Name: ult %sub = sub i8 %x, %y %cmp = icmp ult i8 %x, %sub => %cmp = icmp ult i8 %x, %y This should fix: https://bugs.llvm.org/show_bug.cgi?id=36969 llvm-svn: 329011
* Fix header mismatch in DIBuilder Type APIsHarlan Haskins2018-04-021-5/+5
| | | | | | | Some of the headers changed slightly, and the accompanying implementation didn't change. This caused a silent failure. llvm-svn: 329003
* [llvm-pdbutil] Add an export subcommand.Zachary Turner2018-04-022-4/+13
| | | | | | | | | | | | | | | | | This command can dump the binary contents of a stream to a file. This is useful when you want to do side-by-side comparisons of a specific stream from two PDBs to examine the differences between them. You can export both of them to a file, then open them up side by side in a hex editor (for example), so as to eliminate any differences that might arise from the contents being on different blocks in the PDB. In subsequent patches I plan to improve the "explain" subcommand so that you can explain the contents of a binary file that isn't necessarily a full PDB, but one of these dumped streams, by telling the subcommand how to interpret the contents. llvm-svn: 329002
* Remove HAVE_LIBPSAPI, HAVE_SHELL32.Nico Weber2018-04-022-11/+1
| | | | | | | | | | These used to be set in the old autoconf build, but the cmake build has had a "TODO: actually check for these" comment since it was checked in, and they were set to 1 on mingw unconditionally. It seems safe to say that they always exist under mingw, so just remove them and assume they're set exactly when on mingw (with msvc, we use `pragma comment` instead of linking these via flags). llvm-svn: 328992
* [DeadArgumentElim] Clone function level metadatasRong Xu2018-04-021-5/+11
| | | | | | | | | | | | Some Function level metadatas, such as function entry count, are not cloned in DeadArgumentElim. This happens a lot in lto/thinlto because of DeadArgumentElim after internalization. This patch clones the metadatas in the original function to the new function. Differential Revision: https://reviews.llvm.org/D44127 llvm-svn: 328991
* Remove HAVE_DIRENT_H.Nico Weber2018-04-021-17/+2
| | | | | | | The autoconf manual: "This macro is obsolescent, as all current systems with directory libraries have <dirent.h>. New programs need not use this macro." llvm-svn: 328989
* [AMDGPU][MC][GFX9] Added instructions v_cvt_norm_*16_f16, v_sat_pk_u8_i16Dmitry Preobrazhensky2018-04-021-0/+8
| | | | | | | | | See bug 36847: https://bugs.llvm.org/show_bug.cgi?id=36847 Differential Revision: https://reviews.llvm.org/D45097 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328988
* [coroutines] Add support for llvm.coro.noop intrinsicsGor Nishanov2018-04-022-7/+48
| | | | | | | | | | | | | | | | Summary: A recent addition to Coroutines TS (https://wg21.link/p0913) adds a pre-defined coroutine noop_coroutine that does nothing. To implement this feature, we implemented an llvm.coro.noop intrinsic that returns a coroutine handle to a coroutine that does nothing when resumed or destroyed. Reviewers: EricWF, modocache, rnk, lewissbaker Reviewed By: modocache Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D45114 llvm-svn: 328986
* [AMDGPU][MC][GFX9] Added s_atomic_* and s_buffer_atomic_* instructionsDmitry Preobrazhensky2018-04-024-1/+204
| | | | | | | | | | | Fixed a bug which caused Tablegen crash. See bug 36837: https://bugs.llvm.org/show_bug.cgi?id=36837 Differential Revision: https://reviews.llvm.org/D45085 Reviewers: artem.tamazov, arsenm, timcorringham llvm-svn: 328983
* [Hexagon] Clean up some code in HexagonAsmPrinter, NFCKrzysztof Parzyszek2018-04-022-48/+53
| | | | llvm-svn: 328981
* [SLP] Fix PR36481: vectorize reassociated instructions.Alexey Bataev2018-04-022-92/+288
| | | | | | | | | | | | | | | Summary: If the load/extractelement/extractvalue instructions are not originally consecutive, the SLP vectorizer is unable to vectorize them. Patch allows reordering of such instructions. Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43776 llvm-svn: 328980
* Revert r328975, it makes TableGen assert on the bots.Nico Weber2018-04-024-204/+1
| | | | llvm-svn: 328978
OpenPOWER on IntegriCloud