| Commit message | Author | Age | Files | Lines |
| |
MemorySSA.
Summary:
Experimentally we found that promotion to scalars carries fewer benefits
than sinking and hoisting in LICM. When using MemorySSA, we build an
AliasSetTracker on demand in order to reuse the current infrastructure.
We only build it if fewer than AccessCapForMSSAPromotion accesses exist in
the loop, a cap that is by default set to 250. This value ensures there are
no runtime regressions, and there are small compile time gains for
pathological cases. A much lower value (20) was found to yield a single
regression in the llvm-test-suite and much higher benefits for compile
times. Conservatively we set the current cap to a high value, but we will
explore lowering it when MemorySSA is enabled by default.
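As an illustration only (not the actual LICM code), here is a minimal self-contained sketch of the gating idea, using hypothetical stand-in types; the cap name mirrors the AccessCapForMSSAPromotion option mentioned above:
```
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the real LICM/MemorySSA types.
struct MemoryAccess {};
struct AliasSetTracker {
  void add(const MemoryAccess &) {}
};

constexpr unsigned AccessCapForMSSAPromotion = 250; // Default cap.

// Build the tracker only when the loop is small enough for promotion to
// be worth the extra analysis cost.
bool shouldAttemptPromotion(const std::vector<MemoryAccess> &LoopAccesses,
                            AliasSetTracker &AST,
                            unsigned Cap = AccessCapForMSSAPromotion) {
  if (LoopAccesses.size() >= Cap)
    return false;                 // Too many accesses: skip promotion.
  for (const MemoryAccess &MA : LoopAccesses)
    AST.add(MA);                  // Populate the AST on demand.
  return true;
}
```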
Reviewers: sanjoy, chandlerc
Subscribers: nemanjai, jlebar, Prazek, george.burgess.iv, jfb, jsji, llvm-commits
Differential Revision: https://reviews.llvm.org/D56625
llvm-svn: 353339
| |
llvm-svn: 353338
| |
Instructions in GlobalIsel
Reviewers: aditya_nandakumar, volkan
Reviewed By: aditya_nandakumar
Subscribers: rovka, kristof.beyls, volkan, Petar.Avramovic
Differential Revision: https://reviews.llvm.org/D57630
llvm-svn: 353336
| |
Summary:
Pass the alias info to addPointer when available. Will save an alias()
call for must sets when adding a known Must or May alias.
[Part of a series of cleanup patches]
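A minimal, self-contained sketch of the idea (hypothetical simplified types, not the real AliasSetTracker API): addPointer only falls back to an alias() query when the caller does not supply a known result.
```
#include <optional>
#include <vector>

// Hypothetical stand-ins for AliasAnalysis and AliasSet.
enum class AliasResult { NoAlias, MayAlias, MustAlias };

struct Pointer { const void *V; };

struct AAResults {
  // Placeholder for a potentially expensive query.
  AliasResult alias(const Pointer &, const Pointer &) const {
    return AliasResult::MayAlias;
  }
};

struct AliasSet {
  std::vector<Pointer> Ptrs;
  bool IsMust = true;

  // If the caller already knows how NewPtr aliases the set's representative
  // pointer, reuse that result instead of re-querying AA.
  void addPointer(AAResults &AA, const Pointer &NewPtr,
                  std::optional<AliasResult> Known = std::nullopt) {
    if (!Ptrs.empty()) {
      AliasResult R = Known ? *Known : AA.alias(Ptrs.front(), NewPtr);
      if (R != AliasResult::MustAlias)
        IsMust = false;
    }
    Ptrs.push_back(NewPtr);
  }
};
```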
Reviewers: reames, mkazantsev
Subscribers: sanjoy, jlebar, llvm-commits
Differential Revision: https://reviews.llvm.org/D56613
llvm-svn: 353335
| |
combineExtractWithShuffle may leave a dangling bitcast which may
prevent further optimization in later passes. Avoid constructing it
unless it is used.
llvm-svn: 353333
| |
Since SystemZ supports counting of leading zeros with the FLOGR instruction,
isCheapToSpeculateCtlz() should return true, which it now does.
ISD::CTLZ_ZERO_UNDEF i32 is now handled the same way as ISD::CTLZ is, which
is needed since promotion to i64 is required and CTLZ_ZERO_UNDEF is only
expanded to CTLZ if it is Legal or Custom.
Review: Ulrich Weigand
https://reviews.llvm.org/D57710
llvm-svn: 353330
| |
As far as I can tell, malloc.h is only being used here to provide
a definition of mallinfo (malloc itself is declared in stdlib.h via
cstdlib). We already have a macro for whether mallinfo is available,
so switch to using that instead.
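Roughly the shape of the pattern being described, as a hedged sketch; the HAVE_MALLINFO macro comes from the surrounding text, while the uordblks field used here is an assumption for illustration:
```
// Guard both the include and the use behind the existing feature macro.
#if defined(HAVE_MALLINFO)
#include <malloc.h>
#endif

#include <cstddef>
#include <cstdlib> // malloc/free are declared here, not in malloc.h.

size_t getMallocUsage() {
#if defined(HAVE_MALLINFO)
  struct mallinfo MI = ::mallinfo();
  return MI.uordblks; // Assumed field; bytes currently allocated.
#else
  return 0; // No portable way to query the allocator here.
#endif
}
```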
Differential Revision: https://reviews.llvm.org/D57807
llvm-svn: 353329
| |
Don't lower BUILD_VECTORs to BYTE_MASK, but instead expose the BUILD_VECTORs
to the DAGCombiner and select them to VGBM in Select(). This allows the
DAGCombiner to understand the constant vector values.
For floating point, only all-zeros vectors are now generated with VGBM, as it
turned out to be somewhat complicated to handle arbitrary constants, and in
practice this is very rare and hardly needed.
The SystemZ ISD opcodes z_byte_mask, z_vzero and z_vones have been removed.
Review: Ulrich Weigand
https://reviews.llvm.org/D57152
llvm-svn: 353325
| |
Don't repeat the function name in some doxygen
comments.
(Just a minor cleanup, done while testing pushing
from the git monorepo setup.)
llvm-svn: 353317
| |
There was a lot of repeated code wrt unary math intrinsics in
translateKnownIntrinsic. This factors out the repeated MIRBuilder code into
two functions: translateSimpleUnaryIntrinsic and getSimpleUnaryIntrinsicOpcode.
This simplifies adding simple unary intrinsics, since after this, all you have
to do is add the mapping to SimpleUnaryIntrinsicOpcodes.
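A sketch of the described factoring, with hypothetical enums and a stub builder standing in for Intrinsic::ID, the G_* opcodes, and MIRBuilder; the switch plays the role of the SimpleUnaryIntrinsicOpcodes mapping:
```
#include <optional>

// Hypothetical intrinsic and generic-opcode enums.
enum class IntrinsicID { Sqrt, Fabs, Ceil, Unknown };
enum class Opcode { G_FSQRT, G_FABS, G_FCEIL };

// Single point of truth: the intrinsic -> opcode mapping.
std::optional<Opcode> getSimpleUnaryIntrinsicOpcode(IntrinsicID ID) {
  switch (ID) {
  case IntrinsicID::Sqrt: return Opcode::G_FSQRT;
  case IntrinsicID::Fabs: return Opcode::G_FABS;
  case IntrinsicID::Ceil: return Opcode::G_FCEIL;
  default:                return std::nullopt;
  }
}

struct MIRBuilderStub {
  void buildInstr(Opcode, unsigned /*DstReg*/, unsigned /*SrcReg*/) {}
};

// One shared translation routine instead of one copy per intrinsic.
bool translateSimpleUnaryIntrinsic(MIRBuilderStub &B, IntrinsicID ID,
                                   unsigned DstReg, unsigned SrcReg) {
  if (auto Op = getSimpleUnaryIntrinsicOpcode(ID)) {
    B.buildInstr(*Op, DstReg, SrcReg);
    return true;
  }
  return false; // Not a simple unary intrinsic; fall back elsewhere.
}
```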
Differential Revision: https://reviews.llvm.org/D57774
llvm-svn: 353316
| |
yaml2obj previously only recognised standard STT_* names, and didn't
allow arbitrary numbers. This change allows the user to specify a number
for the type instead. It also adds a test to verify the existing
obj2yaml behaviour for unknown symbol types.
Reviewed by: grimar
Differential Revision: https://reviews.llvm.org/D57822
llvm-svn: 353315
| |
We should canonicalize to one of these forms,
and compare-with-zero could be more conducive
to follow-on transforms. This also leads to
generally better codegen as shown in PR40611:
https://bugs.llvm.org/show_bug.cgi?id=40611
llvm-svn: 353313
| |
ARMv8.1a CASP instructions need the first of the pair to be an even register
(otherwise the encoding is unallocated). We enforced this during assembly, but
not during CodeGen before this patch.
llvm-svn: 353308
| |
Allow custom handling of inline assembly output parameters and add X86
flag parameter support.
llvm-svn: 353307
| |
llvm-svn: 353305
| |
The IPM sequence currently generated to compute the strcmp/memcmp
result will return INT_MIN for the "less than zero" case. While
this is in compliance with the standard, strictly speaking, it
turns out that common applications cannot handle this, e.g. because
they negate a comparison result in order to implement reverse
compares.
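For illustration, a sketch of the kind of caller that breaks when the inlined compare expansion can return INT_MIN: negating that value overflows a signed int.
```
#include <cstring>

// A typical "reverse" comparator simply negates the forward result.
// If the (inlined) compare may return INT_MIN, the negation overflows
// (undefined behavior); a small negative result such as -2 is safe.
int reverseCompare(const char *A, const char *B) {
  return -std::strcmp(A, B);
}
```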
This patch changes code to use a different sequence that will result
in -2 for the "less than zero" case (same as GCC). However, this
requires that the two source operands of the compare instructions
are inverted, which breaks the optimization in removeIPMBasedCompare.
Therefore, I've removed this (and all of optimizeCompareInstr), and
replaced it with a mostly equivalent optimization in combineCCMask
at the DAGCombine level.
llvm-svn: 353304
| |
A quirk of the v8.1a spec is that when the writeback register for an atomic
read-modify-write instruction is wzr/xzr, the instruction no longer enforces
acquire ordering. However, it's still written with the misleading 'a' mnemonic.
So this adds an annotation when disassembling such instructions, mentioning the
change.
llvm-svn: 353303
| |
The proposal in D56796 may cross the line because we're trying to avoid vectorization
transforms in generic DAG combining. So this is an alternate, later, x86-specific
translation of that patch.
There are several potential follow-ups to enhance this:
1. Allow extraction from non-zero element index.
2. Peek through extends of smaller width integers.
3. Support x86-specific conversion opcodes like X86ISD::CVTSI2P
Differential Revision: https://reviews.llvm.org/D56864
llvm-svn: 353302
| |
When a resource unit R is released, the ResourceManager notifies groups that
contain R. Before this patch, the logic in method ResourceManager::release()
implemented a potentially slow iterative search of dependent groups on the
entire set of processor resources.
This patch replaces that logic with a simpler (and often faster) lookup on array
`Resource2Groups`. This patch gives an average speedup of ~3-4% (observed on a
release build when testing for target btver2).
No functional change intended.
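A simplified sketch of the lookup (hypothetical field names and a bit-mask representation, not the actual llvm-mca classes): release() consults a precomputed per-unit group mask instead of scanning every processor resource.
```
#include <cstdint>
#include <vector>

struct ResourceManagerSketch {
  // Resource2Groups[UnitIndex] is the precomputed mask of all groups that
  // contain that resource unit; built once when the manager is constructed.
  std::vector<uint64_t> Resource2Groups;
  std::vector<uint64_t> AvailableUnitsOfGroup; // Per-group availability mask.

  void release(unsigned UnitIndex) {
    uint64_t Groups = Resource2Groups[UnitIndex]; // Direct lookup, no scan.
    while (Groups) {
      unsigned GroupIdx = __builtin_ctzll(Groups); // Lowest set group bit.
      AvailableUnitsOfGroup[GroupIdx] |= (1ULL << UnitIndex);
      Groups &= Groups - 1; // Clear the bit we just handled.
    }
  }
};
```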
llvm-svn: 353301
| |
GatherAllAliases only makes sense for LSBaseSDNode. Enforce it with
static typing instead of runtime cast.
llvm-svn: 353291
| |
llvm-svn: 353290
| |
The wrong variable was being used when printing the address increment in
verbose output of .debug_line. This patch fixes this.
Reviewed by: JDevlieghere
Differential Revision: https://reviews.llvm.org/D57693
llvm-svn: 353288
| |
Summary:
llvm-exegesis uses this functionality to read its benchmark dumps.
This reading of `.yaml`s takes ~60% of runtime for 14656 benchmark points (i.e. one sweep over all x86 instructions),
but only 30% of the time for 3x as many benchmark points.
In particular, this `BinaryRef` appears to be an obvious pain point.
Without patch:
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-orig.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-orig.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-orig.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-orig.html' (25 runs):
972.86 msec task-clock # 0.994 CPUs utilized ( +- 0.25% )
30 context-switches # 30.774 M/sec ( +- 21.74% )
0 cpu-migrations # 0.370 M/sec ( +- 67.81% )
11873 page-faults # 12211.512 M/sec ( +- 0.00% )
3898373408 cycles # 4009682.186 GHz ( +- 0.25% ) (83.12%)
360399748 stalled-cycles-frontend # 9.24% frontend cycles idle ( +- 0.54% ) (83.24%)
1099450483 stalled-cycles-backend # 28.20% backend cycles idle ( +- 0.59% ) (33.63%)
4910528820 instructions # 1.26 insn per cycle
# 0.22 stalled cycles per insn ( +- 0.13% ) (50.21%)
1111976775 branches # 1143726625.854 M/sec ( +- 0.10% ) (66.77%)
23248474 branch-misses # 2.09% of all branches ( +- 0.19% ) (83.29%)
0.97850 +- 0.00647 seconds time elapsed ( +- 0.66% )
```
With the patch:
```
$ perf stat -r 25 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-new.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-new.html'
Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-new.html' (25 runs):
905.29 msec task-clock # 0.999 CPUs utilized ( +- 0.11% )
15 context-switches # 16.533 M/sec ( +- 32.27% )
0 cpu-migrations # 0.000 K/sec
11873 page-faults # 13121.789 M/sec ( +- 0.00% )
3627759720 cycles # 4009283.100 GHz ( +- 0.11% ) (83.19%)
370401480 stalled-cycles-frontend # 10.21% frontend cycles idle ( +- 0.22% ) (83.19%)
1007114438 stalled-cycles-backend # 27.76% backend cycles idle ( +- 0.34% ) (33.62%)
4414014304 instructions # 1.22 insn per cycle
# 0.23 stalled cycles per insn ( +- 0.08% ) (50.36%)
1003751700 branches # 1109314021.971 M/sec ( +- 0.07% ) (66.97%)
24611010 branch-misses # 2.45% of all branches ( +- 0.10% ) (83.41%)
0.90593 +- 0.00105 seconds time elapsed ( +- 0.12% )
```
So this decreases the overall run time of llvm-exegesis analysis mode (on one sweep) by roughly 7%.
To be noted, the `BinaryRef::writeAsBinary()` change is the reason for the perf changes;
the use of `llvm::isHexDigit()` instead of `isxdigit()` does not appear to have any perf impact,
I have only changed it "for symmetry".
The `writeAsBinary()` change is correct: it produces an identical de-hex-ified buffer, and the final output is thus identical:
```
$ sha512sum /tmp/clusters-*
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf /tmp/clusters-new.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf /tmp/clusters-orig.html
```
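For context, a hedged, self-contained sketch of the general idea behind the speedup (not the actual BinaryRef::writeAsBinary() code): decode two hex digits per output byte into a buffer in one pass, rather than formatting nibble by nibble through the stream.
```
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

uint8_t hexValue(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  assert(C >= 'A' && C <= 'F' && "not a hex digit");
  return C - 'A' + 10;
}

// Decode the whole hex string into one buffer, then hand the buffer to the
// output stream in a single write.
std::vector<uint8_t> decodeHex(const std::string &Hex) {
  assert(Hex.size() % 2 == 0 && "odd-length hex string");
  std::vector<uint8_t> Out(Hex.size() / 2);
  for (size_t I = 0; I != Out.size(); ++I)
    Out[I] = uint8_t(hexValue(Hex[2 * I]) << 4) | hexValue(Hex[2 * I + 1]);
  return Out;
}
```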
Reviewers: silvas, espindola, sbc100, zturner, courbet, gchatelet
Reviewed By: gchatelet
Subscribers: tschuett, RKSimon, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57699
llvm-svn: 353282
| |
llvm-svn: 353277
| |
llvm-svn: 353276
| |
llvm-svn: 353275
| |
llvm-svn: 353274
| |
llvm-svn: 353273
| |
Summary:
Follow up to D57082 which moved splitting earlier in the pipeline, in
order to perform it before inlining. However, it was moved too early,
before the IR is annotated with instrumented PGO data. This caused the
splitting to incorrectly determine cold functions.
Move it to just after PGO annotation (still before inlining), in both
pass managers.
Reviewers: vsk, hiraditya, sebpop
Subscribers: mehdi_amini, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57805
llvm-svn: 353270
| |
ranges [NFC]
llvm-svn: 353267
| |
DomTreeUpdater depends on headers from Analysis, but is in IR. This is a
layering violation since Analysis depends on IR. Relocate this code from IR
to Analysis to fix the layering violation.
llvm-svn: 353265
| |
Summary:
- Delete {} for one-line `let` statements
- Don't indent within `let` blocks
- Add comments after `let` block's closing braces
Reviewers: tlively
Subscribers: dschuff, sbc100, jgravelle-google, sunfish, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57730
llvm-svn: 353248
| |
Summary:
Use a small cache for Values tested by nonEscapingLocalObject().
Since the calls to PointerMayBeCaptured are fairly expensive, this saves
a good amount of compile time for anything relying heavily on
BasicAA.alias() calls.
This uses the same approach as the AliasCache, i.e. the cache is reset
after each alias() call. The cache is not used or updated by modRefInfo
calls since it's harder to know when to reset the cache.
Testcases that show improvements with this patch are too large to
include. Example compile time improvement: 7s to 6s.
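A minimal sketch of the caching pattern described above, with hypothetical names; the real change lives inside BasicAA and keys off llvm::Value.
```
#include <unordered_map>

struct Value {}; // Stand-in for llvm::Value.

struct EscapeQueryCache {
  // Cleared after every top-level alias() query, mirroring the AliasCache.
  std::unordered_map<const Value *, bool> IsNonEscaping;

  template <typename ExpensiveCheckT>
  bool isNonEscapingLocalObject(const Value *V, ExpensiveCheckT Check) {
    auto It = IsNonEscaping.find(V);
    if (It != IsNonEscaping.end())
      return It->second;           // Cache hit: skip the capture analysis.
    bool Result = Check(V);        // e.g. !PointerMayBeCaptured(V, ...)
    IsNonEscaping.insert({V, Result});
    return Result;
  }

  void resetAfterAliasQuery() { IsNonEscaping.clear(); }
};
```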
Reviewers: chandlerc, sunfish
Subscribers: sanjoy, jlebar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57627
llvm-svn: 353245
| |
Resumes that are not reachable from a cleanup landing pad are considered
to be unreachable. It’s not safe to split them out.
rdar://47808235
llvm-svn: 353242
| |
A fallible iterator is one whose increment or decrement operations may fail.
This would usually be supported by replacing the ++ and -- operators with
methods that return error:
class MyFallibleIterator {
public:
// ...
Error inc();
Error dec();
// ...
};
The downside of this style is that it no longer conforms to the C++ iterator
concept, and cannot make use of standard algorithms and features such as
range-based for loops.
The fallible_iterator wrapper takes an iterator written in the style above
and adapts it to (mostly) conform with the C++ iterator concept. It does this
by providing standard ++ and -- operator implementations, returning any errors
generated via a side channel (an Error reference passed into the wrapper at
construction time), and immediately jumping the iterator to a known 'end'
value upon error. It also marks the Error as checked any time an iterator is
compared with a known end value and found to be unequal, allowing early exit
from loops without redundant error checking*.
Usage looks like:
MyFallibleIterator I = ..., E = ...;
Error Err = Error::success();
for (auto &Elem : make_fallible_range(I, E, Err)) {
// Loop body is only entered when safe.
// Early exits from loop body permitted without checking Err.
if (SomeCondition)
return;
}
if (Err)
// Handle error.
* Since failure causes a fallible iterator to jump to end, testing that a
fallible iterator is not an end value implicitly verifies that the error is a
success value, and so is equivalent to an error check.
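A hedged, self-contained sketch of the operator++ adaptation described above (a hypothetical SimpleError type stands in for llvm::Error; it assumes the wrapped iterator exposes inc(), a static end() sentinel, dereference, and inequality):
```
#include <optional>
#include <string>
#include <utility>

using SimpleError = std::optional<std::string>; // Stand-in for llvm::Error.

template <typename UnderlyingT> class FallibleIteratorSketch {
  UnderlyingT I;
  SimpleError *Err; // Side channel supplied at construction time.

public:
  FallibleIteratorSketch(UnderlyingT It, SimpleError &E)
      : I(std::move(It)), Err(&E) {}

  FallibleIteratorSketch &operator++() {
    if (SimpleError E = I.inc()) { // Underlying increment may fail.
      *Err = std::move(E);         // Report through the side channel...
      I = UnderlyingT::end();      // ...and jump straight to the end value,
    }                              // so the loop terminates on the next test.
    return *this;
  }

  decltype(auto) operator*() { return *I; }

  friend bool operator!=(const FallibleIteratorSketch &LHS,
                         const FallibleIteratorSketch &RHS) {
    return LHS.I != RHS.I;
  }
};
```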
Reviewers: dblaikie, rupprecht
Subscribers: mgorny, dexonsmith, kristina, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57618
llvm-svn: 353237
| |
As discussed in D53037, this can lead to worse codegen, and we
don't generally expect the backend to be able to optimize
arbitrary shuffles. If there's only one use of the 1st shuffle,
that means it's getting removed, so that should always be
safe.
llvm-svn: 353235
| |
Factored out the code for creating the variable for the profile file name into
a function.
llvm-svn: 353230
| |
simplify getOperand(i).getReg()
https://reviews.llvm.org/D57608
It's a common pattern in GISel to have a MachineInstrBuilder from which we get various regs
(commonly MIB->getOperand(0).getReg()). This adds a helper method so that the above can be
replaced with MIB.getReg(0).
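As a toy illustration only (stub types, not the real MachineInstrBuilder), the shape of the convenience accessor:
```
#include <vector>

struct MachineOperandStub {
  unsigned Reg;
  unsigned getReg() const { return Reg; }
};

struct MachineInstrStub {
  std::vector<MachineOperandStub> Operands;
  const MachineOperandStub &getOperand(unsigned I) const { return Operands[I]; }
};

struct MachineInstrBuilderStub {
  MachineInstrStub *MI;
  MachineInstrStub *operator->() const { return MI; }

  // The shortcut: MIB.getReg(0) instead of MIB->getOperand(0).getReg().
  unsigned getReg(unsigned Idx) const { return MI->getOperand(Idx).getReg(); }
};
```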
llvm-svn: 353223
| |
Summary:
Before r349976, MC ignored such directives when producing an object file
and asserted when re-producing textual assembly output. I turned this
assertion into a hard error in both cases in r349976, but this makes it
unnecessarily difficult to write a single assembly file that supports
both MachO and other object formats that support .file. A user reported
this as PR40578, and we decided to go back to ignoring the directive.
Fixes PR40578
Reviewers: mstorsjo
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57772
llvm-svn: 353218
| |
Summary: The lowering is identical to the memcpy lowering.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57727
llvm-svn: 353216
| |
Regroup supported and unsupported functions by precision and C standard.
llvm-svn: 353213
| |
llvm-svn: 353209
| |
Ensure the XOR in the waterfall loop for indirect addressing is considered a terminator.
Differential Revision: https://reviews.llvm.org/D57703
llvm-svn: 353207
| |
Summary:
According to
https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf,
the compiler should emit the DW_AT_address_class attribute for all
variables and parameters. This means that the DW_AT_address_class attribute
should be used in a non-standard way to support compatibility with the
cuda-gdb debugger.
Clang is able to generate the information about the variable address
class. This information is emitted as the expression sequence
`DW_OP_constu <DWARF Address Space> DW_OP_swap DW_OP_xderef`. The patch
tries to find all such expressions and transform them into
`DW_AT_address_class <DWARF Address Space>` if the target is NVPTX and the debugger is gdb.
If the expression is not found, then default values are used. For the
local variables <DWARF Address Space> is set to ADDR_local_space(6), for
the globals <DWARF Address Space> is set to ADDR_global_space(5). The
values are taken from the table in section 5.2, CUDA-Specific DWARF
Definitions, of the same document.
Reviewers: echristo, probinson
Subscribers: jholewinski, aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D57157
llvm-svn: 353203
| |
The v2i64 argument is lowered to a bitcast of v4i32 build_vector.
This would then attempt to use the i32-element as the source of the
vector truncate. This really would need to collect 2 elements from the
build_vector to produce the intended truncate.
llvm-svn: 353202
| |
rL352997 enabled ZERO_EXTEND from non-shuffle-able value types. I've disabled it for now to fix a regression identified by @asbirlea until I can fix this properly.
llvm-svn: 353198
| |
Summary:
Adds the standard gauntlet of accessors for global indirect functions and updates the echo test.
Now it would be nice to have a target abstraction so one could know if they have access to a suitable ELF linker and runtime.
Reviewers: whitequark, deadalnix
Reviewed By: whitequark
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D56177
llvm-svn: 353193
| |
Patch by Kristina Bessonova!
Differential Revision: https://reviews.llvm.org/D56787
llvm-svn: 353192
| |
We can't outline BTI instructions, because they need to be the very first
instruction executed after an indirect call or branch. If we outline them, then
an indirect call might go to the branch to the outlined function, which will
fault.
Differential revision: https://reviews.llvm.org/D57753
llvm-svn: 353190
| |
llvm-svn: 353189