bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[GISel]: Remove redundant copies at the end of ISel	Aditya Nandakumar	2018-01-24	14	-83/+43
\| \| \| \| \| \| \| \| \|	https://reviews.llvm.org/D42402 A lot of these copies are useless (copies b/w VRegs having the same regclass) and should be cleaned up. llvm-svn: 323291
*	AArch64: Cyclone: Remove SlowMisaligned128Store tuning flag	Matthias Braun	2018-01-24	4	-14/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove FeatureSlowMisaligned128Store from cyclone flags. This flag causes splitting of 16 byte wide stores into 2 stored of 8 bytes. This was useful on older apple CPUs which were slow for 16byte stores that were not aligned on 16byte. As the compiler often cannot predict the actual alignment, the splitting was choosen. This has been a topic for a lot of debate as the splitting also decreases performance for some benchmarks. Measuring the effects on newer apple chips (rdar://35525421) shows that it harms more cases than it helps. So it is time to retire this workaround. llvm-svn: 323289
*	[WebAssembly] MC: Use inline triple in test bitcode files	Sam Clegg	2018-01-23	17	-18/+54
\| \| \| \| \| \| \| \| \|	This matches the CodeGen tests and makes it a little easy to run these from the command line manually. Differential Revision: https://reviews.llvm.org/D42440 llvm-svn: 323275
*	[WebAssembly] Add to test expectations for ↵	Sam Clegg	2018-01-23	1	-2/+7
\| \| \| \| \| \| \| \|	test/MC/WebAssembly/weak-alias.ll. NFC. Split out from D42095 llvm-svn: 323272
*	[PPC] Avoid incorrect fp-i128-fp lowering.	Tim Shen	2018-01-23	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Fix an issue that's similar to what D41411 fixed: float(__int128(float_var)) shouldn't be optimized to xscvdpsxds + xscvsxdsp, as they mean (float)(int64_t)float_var. Reviewers: jtony, hfinkel, echristo Subscribers: sanjoy, nemanjai, hiraditya, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D42400 llvm-svn: 323270
*	[SLPVectorizer] add test for PR13837; NFC	Sanjay Patel	2018-01-23	1	-0/+31
\| \| \| \| \| \| \| \| \|	This was probably fixed long ago, but I don't see a test that lines up with the example and target in the bug report: https://bugs.llvm.org/show_bug.cgi?id=13837 ...so adding it here. llvm-svn: 323269
*	Add bdver shuffle sink tests.	Simon Pilgrim	2018-01-23	1	-0/+21
\| \| \| \|	llvm-svn: 323268
*	[llvm-extract] Support extracting basic blocks	Volkan Keles	2018-01-23	6	-0/+147
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently, there is no way to extract a basic block from a function easily. This patch extends llvm-extract to extract the specified basic block(s). Reviewers: loladiro, rafael, bogner Reviewed By: bogner Subscribers: hintonda, mgorny, qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D41638 llvm-svn: 323266
*	Regenerate select test. NFCI.	Simon Pilgrim	2018-01-23	1	-53/+74
\| \| \| \|	llvm-svn: 323265
*	Regenerate shuffle sink test. NFCI.	Simon Pilgrim	2018-01-23	1	-42/+69
\| \| \| \|	llvm-svn: 323264
*	[X86] Move 'Int_' to the end of the name of the VCOMISS/VUCOMISS and ↵	Craig Topper	2018-01-23	3	-64/+64
\| \| \| \| \| \| \| \|	instructions to get them picked up by the scheduler model regexs. All other intrinsic instructions put the _Int on the end. This make these instructions consistent and gets the prefix instregexs in the scheduler models to pick them up. llvm-svn: 323261
*	[X86][AVX] LowerBUILD_VECTORAsVariablePermute - add support for VPERMILPV to ↵	Simon Pilgrim	2018-01-23	1	-15/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	v2i64/v2f64 Minor refactor to make it possible for LowerBUILD_VECTORAsVariablePermute to be used with a wider variety of shuffles op and types. I'd have liked to add v4i32/v4f32 support as well but we don't see v4i32 index extractions at the moment (which is why I created D42308) After this I intend to begin adding scaling support for PSHUFB (v8i16, v4i32, v2i64)) and VPERMPS (v4f64, v4i64). Differential Revision: https://reviews.llvm.org/D42431 llvm-svn: 323260
*	[safestack] Inline safestack pointer access when possible.	Evgeniy Stepanov	2018-01-23	2	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This adds an -mllvm flag that forces the use of a runtime function call to get the unsafe stack pointer, the same that is currently used on non-x86, non-aarch64 android. The call may be inlined. Reviewers: pcc Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D37405 llvm-svn: 323259
*	[Debugify] Add a mode to opt to enable faster testing	Vedant Kumar	2018-01-23	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Opt's "-enable-debugify" mode adds an instance of Debugify at the beginning of the pass pipeline, and an instance of CheckDebugify at the end. You can enable this mode with lit using: -Dopt="opt -enable-debugify". Note that running test suites in this mode will result in many failures due to strict FileCheck commands, etc. It can be more useful to look for assertion failures which arise only when Debugify is enabled, e.g to prove that we have (or do not have) test coverage for some code path with debug info present. Differential Revision: https://reviews.llvm.org/D41793 llvm-svn: 323256
*	Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements ↵	Alexey Bataev	2018-01-23	3	-24/+24
\| \| \| \| \| \| \| \|	as shuffle." This reverts commit r323246 because of the broken buildbots. llvm-svn: 323252
*	[Hexagon] Add patterns for sext_inreg of HVX vector types	Krzysztof Parzyszek	2018-01-23	1	-0/+54
\| \| \| \|	llvm-svn: 323250
*	[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle.	Alexey Bataev	2018-01-23	3	-24/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and cost of this gather is counter as cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind. Reviewers: spatel, RKSimon, mkuper, hfinkel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38697 llvm-svn: 323246
*	X86 Tests: Add AVX512BW config to CodeGenPrepare test. NFC	Zvi Rackover	2018-01-23	1	-9/+10
\| \| \| \| \| \| \|	Case points out that we don't consider shifts supported by AVX512BW in isVectorShiftByScalarCheap() llvm-svn: 323242
*	[WebAssembly] Remove "name" section of object wasm object files	Sam Clegg	2018-01-23	6	-58/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	LLD is unaffected, no changes needed there. LLD continues to write out a name section, using the symbol names. Fixes: https://github.com/WebAssembly/tool-conventions/issues/37 Patch by Nicholas Wilson! Differential Revision: https://reviews.llvm.org/D42425 llvm-svn: 323234
*	[Hexagon] Implement basic vector operations on vectors vNi1	Krzysztof Parzyszek	2018-01-23	4	-0/+155
\| \| \| \| \| \| \| \| \| \| \|	In addition to that, make sure that there are no boolean vector types that are associated with multiple register classes. Specifically, remove v32i1 and v64i1 from integer register classes. These types will correspond to results of vector comparisons, and as such should belong to the vector predicate class. Having them in scalar registers as well makes legalization ambiguous. llvm-svn: 323229
*	[X86][SSE] LowerBUILD_VECTORAsVariablePermute - extract subvector from ↵	Simon Pilgrim	2018-01-23	1	-50/+1
\| \| \| \| \| \|	oversized index vectors llvm-svn: 323223
*	[WebAssembly] Add mem.* intrinsics.	Dan Gohman	2018-01-23	1	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \|	The grow_memory and current_memory instructions are expected to be officially renamed to mem.grow and mem.size. Introduce new intrinsics with the new names. These new names aren't yet official, so for now, use them at your own risk. Also, take this opportunity to add arguments for the currently unused immediate field in those instructions. llvm-svn: 323222
*	[WebAssembly] Switch to *-wasm as the default target triple.	Dan Gohman	2018-01-23	4	-11/+11
\| \| \| \| \| \| \| \|	This makes wasm32-unknown-unknown-wasm the default, which supports the .o file writer and the new linking ABI. To enable s2wasm-compatible output, use the wasm32-unknown-unknown-elf triple. llvm-svn: 323220
*	Verifier: fix bug treating debug info issue as non-debug info issue	Yaxun Liu	2018-01-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Normally when llvm-as sees only debug info errors in LLVM assembly, it simply drops the debug info and outputs a valid LLVM bitcode and returns 0. There is a bug in LLVM verifier which incorrectly treats a debug info error as non-debug info error, which causes llvm-as returns 1 even though llvm-as can drop the invalid debug info and outputs a valid LLVM bitcode. This patch fixes that. Differential Revision: https://reviews.llvm.org/D42391 llvm-svn: 323216
*	[x86] Reautogenerate a bunch of tests for D42287. NFC	Alexander Ivchenko	2018-01-23	6	-257/+273
\| \| \| \|	llvm-svn: 323215
*	CodeGen: Fix assertion in ScheduleDAGMILive::scheduleMI due to llvm.dbg.value	Yaxun Liu	2018-01-23	1	-0/+436
\| \| \| \| \| \| \| \| \| \|	Fix a bug in ScheduleDAGMILive::scheduleMI which causes BotRPTracker not tracking CurrentBottom in some rare cases involving llvm.dbg.value. This issues causes amdgcn target to assert when compiling some user codes with -g. Differential Revision: https://reviews.llvm.org/D42394 llvm-svn: 323214
*	[X86] Rewrite vXi1 element insertion by using a vXi1 scalar_to_vector and ↵	Craig Topper	2018-01-23	6	-552/+514
\| \| \| \| \| \| \| \| \| \|	inserting into a vXi1 vector. The existing code was already doing something very similar to subvector insertion so this allows us to remove the nearly duplicate code. This patch is a little larger than it should be due to differences between the DQI handling between the two today. llvm-svn: 323212
*	[X86][SSE] LowerBUILD_VECTORAsVariablePermute - ensure that the source ↵	Simon Pilgrim	2018-01-23	1	-0/+250
\| \| \| \| \| \| \| \|	vector is not larger than the destination We might be able to support this in the future with VPERMV3, OR(PSHUFB, PSHUFB) etc. llvm-svn: 323210
*	[x86] Mostly reautogenerate a bunch of tests that affect D37775. NFC	Alexander Ivchenko	2018-01-23	17	-1418/+2140
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tests required minor manual tweaks: CodeGen/MIR/X86/generic-instr-type.mir CodeGen/X86/GlobalISel/select-copy.mir CodeGen/X86/GlobalISel/select-ext.mir CodeGen/X86/GlobalISel/select-intrinsic-x86-flags-read-u32.mir CodeGen/X86/GlobalISel/select-phi.mir CodeGen/X86/GlobalISel/select-trunc.mir CodeGen/X86/GlobalISel/select-frameIndex.mir And following tests are split into 32/64 versions: CodeGen/X86/GlobalISel/legalize-GV.mir CodeGen/X86/GlobalISel/select-frameIndex.mir llvm-svn: 323209
*	[X86][SSE] LowerBUILD_VECTORAsVariablePermute - ensure that the index vector ↵	Simon Pilgrim	2018-01-23	1	-0/+111
\| \| \| \| \| \|	has the correct number of elements llvm-svn: 323206
*	AArch64: get type from correct result when forming BFX	Tim Northover	2018-01-23	1	-0/+17
\| \| \| \| \| \| \|	Some nodes produce multiple values so when obtaining the type of an ISD::OR we need to make sure we ask for the correct one. Hopefully that's all of them. llvm-svn: 323205
*	AArch64: get type from correct result when forming BFI/BFM	Tim Northover	2018-01-23	1	-0/+17
\| \| \| \| \| \| \|	Some nodes produce multiple values so when obtaining the type of an ISD::OR we need to make sure we ask for the correct one. llvm-svn: 323202
*	[X86] Legalize v32i1 without BWI via splitting to v16i1 rather than the ↵	Craig Topper	2018-01-23	14	-16698/+3332
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	default of promoting to v32i8. Summary: For the most part its better to keep v32i1 as a mask type of a narrower width than trying to promote it to a ymm register. I had to add some overrides to the methods that get the types for the calling convention so that we still use v32i8 for argument/return purposes. There are still some regressions in here. I definitely saw some around shuffles. I think we probably should move vXi1 shuffle from lowering to a DAG combine where I think the extend and truncate we have to emit would be better combined. I think we also need a DAG combine to remove trunc from (extract_vector_elt (trunc)) Overall this removes something like 13000 CHECK lines from lit tests. Reviewers: zvi, RKSimon, delena, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42031 llvm-svn: 323201
*	[CGP] Fix the GV handling in complex addressing mode	Serguei Katkov	2018-01-23	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If in complex addressing mode the difference is in GV then base reg should not be installed because we plan to use base reg as a merge point of different GVs. This is a fix for PR35980. Reviewers: reames, john.brawn, santosh Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42230 llvm-svn: 323192
*	[X86][SSE] LowerBUILD_VECTORAsVariablePermute - fix PSHUFB source/index ↵	Simon Pilgrim	2018-01-23	1	-3/+2
\| \| \| \| \| \| \| \| \| \|	operand ordering As detailed in rL317463, PSHUFB (like most variable shuffle instructions) uses Op[0] for the source vector and Op[1] for the shuffle index vector, VPERMV works in reverse which is probably where the confusion comes from. Differential Revision: https://reviews.llvm.org/D42380 llvm-svn: 323190
*	[Analysis] Disable exp/exp2/pow finite lib calls on Android with -ffast-math.	MinSeong Kim	2018-01-23	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Since r322087, glibc's finite lib calls are generated when possible. However, glibc is not supported on Android. Therefore this change enables llvm to finely distinguish between linux and Android for unsupported library calls. The change also include some regression tests. Reviewers: srhines, pirama Reviewed By: srhines Subscribers: kongyi, chh, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D42288 llvm-svn: 323187
*	[mips] Properly select abs and sqrt instructions	Stefan Maksimovic	2018-01-23	26	-33/+242
\| \| \| \| \| \| \| \| \| \| \| \| \|	- Alter abs for micromips to have both AFGR64 and FGR64 variants, same as sqrt - Remove sqrt and abs from MicroMips32r6InstrInfo.td, use micromips FGR64 variants - Restrict non-micromips abs/sqrt with NotInMicroMips predicate Differential revision: https://reviews.llvm.org/D41439 llvm-svn: 323184
*	[InstSimplify] (X << Y) % X -> 0	Anton Bikineev	2018-01-23	1	-12/+4
\| \| \| \|	llvm-svn: 323182
*	[X86] Don't reorder (srl (and X, C1), C2) if (and X, C1) can be matched as a ↵	Craig Topper	2018-01-23	3	-1058/+1036
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	movzx Summary: If we can match as a zero extend there's no need to flip the order to get an encoding benefit. As movzx is 3 bytes with independent source/dest registers. The shortest 'and' we could make is also 3 bytes unless we get lucky in the register allocator and its on AL/AX/EAX which have a 2 byte encoding. This patch was more impressive before r322957 went in. It removed some of the same Ands that got deleted by that patch. Reviewers: spatel, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D42313 llvm-svn: 323175
*	[X86] Remove 'NOREX' comment from the printing of _NOREX instructions.	Craig Topper	2018-01-23	9	-50/+50
\| \| \| \| \| \|	Some of the NOREX instructions are used in 32-bit mode making this printing confusing. It also doesn't provide a lot of value since you can see the h-register being used by the instruction. llvm-svn: 323174
*	NewPM: Add an extension point for the start of the pipeline.	David Blaikie	2018-01-23	2	-3/+15
\| \| \| \| \| \| \| \| \| \|	This applies to most pipelines except the LTO and ThinLTO backend actions - it is for use at the beginning of the overall pipeline. This extension point will be used to add the GCOV pass when enabled in Clang. llvm-svn: 323166
*	[WebAssembly] Store function index rather than table index in TABLE_INDEX ↵	Sam Clegg	2018-01-23	4	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	relocations Relocations of type R_WEBASSEMBLY_TABLE_INDEX represent places where the table index for a given function is needed. While the value stored in this location is a table index, the index in the relocation entry itself is a function index (the index of the function which is to be called indirectly). This is how is was spec'd originally but the LLVM implementation didn't do this. This makes things a little simpler in the linker since the table in the input file can essentially be ignored that the output table can be created purely based on these relocations. Patch by Nicholas Wilson! Differential Revision: https://reviews.llvm.org/D42080 llvm-svn: 323165
*	Introduce the "retpoline" x86 mitigation technique for variant #2 of the ↵	Chandler Carruth	2018-01-22	4	-0/+599
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre.. Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processors cache to determine which direction the branch took in the speculative execution, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain. The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers. However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address. On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address. This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886 We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` in addition to `-mretpoline`. In this case, on x86-64 thu thunk names must be: ``` __llvm_external_retpoline_r11 ``` or on 32-bit: ``` __llvm_external_retpoline_eax __llvm_external_retpoline_ecx __llvm_external_retpoline_edx __llvm_external_retpoline_push ``` And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction. There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection. The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness. For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile all libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller. When manually apply similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel. When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%. However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we strongly recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline. We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors. This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design. Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41723 llvm-svn: 323155
*	[AMDGPU] SI Load Store Optimizer: When merging with offset, use ↵	Mark Searles	2018-01-22	2	-22/+82
\| \| \| \| \| \| \| \| \| \| \|	V_ADD_{I\|U}32_e64 - Change inserted add ( V_ADD_{I\|U}32_e32 ) to _e64 version ( V_ADD_{I\|U}32_e64 ) so that the add uses a vreg for the carry; this prevents inserted v_add from killing VCC; the _e64 version doesn't accept a literal in its encoding, so we need to introduce a mov instr as well to get the imm into a register. - Change pass name to "SI Load Store Optimizer"; this removes the '/', which complicates scripts. Differential Revision: https://reviews.llvm.org/D42124 llvm-svn: 323153
*	[llvm-objcopy] Use physical instead of virtual address when aligning and ↵	Jake Ehrlich	2018-01-22	6	-12/+100
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	placing sections in binary For sections with different virtual and physical addresses, alignment and placement in the output binary should be based on the physical address. Ran into this problem with a bare metal ARM project where llvm-objcopy added a lot of zero-padding before the .data section that had differing addresses. GNU objcopy did not add the padding, and after this fix, neither does llvm-objcopy. Update a test case so a section has different physical and virtual addresses. Fixes B35708 Authored By: Owen Shaw (owenpshaw) Differential Revision: https://reviews.llvm.org/D41619 llvm-svn: 323144
*	[mips] add warnings for using dsp and msa flags with inappropriate revisions	Petar Jovanovic	2018-01-22	1	-0/+44
\| \| \| \| \| \| \| \| \| \| \|	Dsp and dspr2 require MIPS revision 2, while msa requires revision 5. Adding warnings for cases when these flags are used with earlier revision. Patch by Milos Stojanovic. Differential Revision: https://reviews.llvm.org/D40490 llvm-svn: 323131
*	[AArch64][SVE] Asm: PTRUE and PTRUES instructions	Sander de Smalen	2018-01-22	4	-0/+586
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: These instructions initialize a predicate vector from a pattern/immediate. Reviewers: fhahn, rengolin, evandro, mcrosier, t.p.northover, samparker, olista01 Reviewed By: samparker Subscribers: aemerson, javed.absar, tschuett, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D41819 llvm-svn: 323124
*	[AArch64] optimise v4f16 fcmps to utilise vector instructions	Carey Williams	2018-01-22	1	-176/+85
\| \| \| \| \| \| \| \| \|	Improves the code generation for v4f16 FCMP instructions when FullFP16 is not supported. Generating FCTVL(s) rather than a longer series of FCVTs. Differential Revision: https://reviews.llvm.org/D41772 llvm-svn: 323118
*	[ThinLTO] Re-commit of dot dumper after test fix	Eugene Leviant	2018-01-22	2	-0/+78
\| \| \| \|	llvm-svn: 323116
*	[X86][AVX] Add test case for PR34370	Simon Pilgrim	2018-01-22	1	-0/+78
\| \| \| \|	llvm-svn: 323106