bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[X86] Make combineLoopSADPattern use CONCAT_VECTORS instead of ↵	Craig Topper	2019-08-23	1	-3/+5
\| \| \| \| \| \| \| \| \|	INSERT_SUBVECTORS for widening with zeros. CONCAT_VECTORS is more canonical for the early DAG combine runs until we start getting into the op legalization phases. llvm-svn: 369734
*	[X86] Improve lowering of v2i32 SAD handling in combineLoopSADPattern.	Craig Topper	2019-08-23	1	-3/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For v2i32 we only feed 2 i8 elements into the psadbw instructions with 0s in the other 14 bytes. The resulting psadbw instruction will produce zeros in bits [127:16] of the output. We need to take the result and feed it to a v2i32 add where the first element includes bits [15:0] of the sad result. The other element should be zero. Prior to this patch we were using a truncate to take 0 from bits 95:64 of the psadbw. This results in a pshufd to move those bits to 63:32. But since we also have zeroes in bits 63:32 of the psadbw output, we should just take those bits. The previous code probably worked better with promoting legalization, but now we use widening legalization. I've preserved the old behavior if -x86-experimental-vector-widening-legalization=false until we get that option removed. llvm-svn: 369733
*	[IndVars] Fix a bug noticed by inspection	Philip Reames	2019-08-23	1	-1/+2
\| \| \| \| \| \|	We were computing the loop exit value, but not ensuring the addrec belonged to the loop whose exit value we were computing. I couldn't actually trip this; the test case shows the basic setup which might trip this, but none of the variations I've tried actually do. llvm-svn: 369730
*	[AlignmentFromAssumptions] getNewAlignmentDiff(): use getURemExpr()	Fangrui Song	2019-08-23	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The alignment is calculated incorrectly, thus sometimes it doesn't generate aligned mov instructions, as shown by the example below: ``` // b.cc typedef long long index; extern "C" index g_tid; extern "C" index g_num; void add3(float* __restrict__ a, float* __restrict__ b, float* __restrict__ c) { index n = 641024; index m = 161024; index k = 41024; index tid = g_tid; index num = g_num; __builtin_assume_aligned(a, 32); __builtin_assume_aligned(b, 32); __builtin_assume_aligned(c, 32); for (index i0=tidk; i0<m; i0+=numk) for (index i1=0; i1<nm; i1+=m) for (index i2=0; i2<k; i2++) c[i1+i0+i2] = b[i0+i2] + a[i1+i0+i2]; } ``` Compile with `clang b.cc -Ofast -march=skylake -mavx2 -S` ``` vmovaps -224(%rdi,%rbx,4), %ymm0 vmovups -192(%rdi,%rbx,4), %ymm1 # should be movaps vmovups -160(%rdi,%rbx,4), %ymm2 # should be movaps vmovups -128(%rdi,%rbx,4), %ymm3 # should be movaps vaddps -224(%rsi,%rbx,4), %ymm0, %ymm0 vaddps -192(%rsi,%rbx,4), %ymm1, %ymm1 vaddps -160(%rsi,%rbx,4), %ymm2, %ymm2 vaddps -128(%rsi,%rbx,4), %ymm3, %ymm3 vmovaps %ymm0, -224(%rdx,%rbx,4) vmovups %ymm1, -192(%rdx,%rbx,4) # should be movaps vmovups %ymm2, -160(%rdx,%rbx,4) # should be movaps vmovups %ymm3, -128(%rdx,%rbx,4) # should be movaps ``` Differential Revision: https://reviews.llvm.org/D66575 Patch by Dun Liang llvm-svn: 369723
*	hwasan: Untag unwound stack frames by wrapping personality functions.	Peter Collingbourne	2019-08-23	1	-8/+102
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	One problem with untagging memory in landing pads is that it only works correctly if the function that catches the exception is instrumented. If the function is uninstrumented, we have no opportunity to untag the memory. To address this, replace landing pad instrumentation with personality function wrapping. Each function with an instrumented stack has its personality function replaced with a wrapper provided by the runtime. Functions that did not have a personality function to begin with also get wrappers if they may be unwound past. As the unwinder calls personality functions during stack unwinding, the original personality function is called and the function's stack frame is untagged by the wrapper if the personality function instructs the unwinder to keep unwinding. If unwinding stops at a landing pad, the function is still responsible for untagging its stack frame if it resumes unwinding. The old landing pad mechanism is preserved for compatibility with old runtimes. Differential Revision: https://reviews.llvm.org/D66377 llvm-svn: 369721
*	[MC] Minor cleanup to MCFixup::Kind handling. NFC.	Sam Clegg	2019-08-23	19	-43/+39
\| \| \| \| \| \| \| \| \| \|	Prefer `MCFixupKind` where possible and add getTargetKind() to convert to `unsigned` when needed rather than scattering cast operators around the place. Differential Revision: https://reviews.llvm.org/D59890 llvm-svn: 369720
*	IR. Change strip* family of functions to not look through aliases.	Peter Collingbourne	2019-08-22	12	-39/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I noticed another instance of the issue where references to aliases were being replaced with aliasees, this time in InstCombine. In the instance that I saw it turned out to be only a QoI issue (a symbol ended up being missing from the symbol table due to the last reference to the alias being removed, preventing HWASAN from symbolizing a global reference), but it could easily have manifested as incorrect behaviour. Since this is the third such issue encountered (previously: D65118, D65314) it seems to be time to address this common error/QoI issue once and for all and make the strip* family of functions not look through aliases. Includes a test for the specific issue that I saw, but no doubt there are other similar bugs fixed here. As with D65118 this has been tested to make sure that the optimization isn't load bearing. I built Clang, Chromium for Linux, Android and Windows as well as the test-suite and there were no size regressions. Differential Revision: https://reviews.llvm.org/D66606 llvm-svn: 369697
*	Fight a bit against global initializers. NFC.	Benjamin Kramer	2019-08-22	1	-1/+1
\| \| \| \|	llvm-svn: 369695
*	GlobalISel: Don't create G_UADDE with constant false carry in	Matt Arsenault	2019-08-22	1	-5/+7
\| \| \| \| \| \| \| \|	The x86 tests are now broken (in paticular add-scalar.ll now hits the DAG fallback) due to not handling G_UADDO. The DAG x86 backend has a custom lowering for this, so that will need to be implemented. llvm-svn: 369673
*	[MachO][TLOF] Use hasLocalLinkage to determine if indirect symbol is local	Francis Visoiu Mistrih	2019-08-22	6	-14/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Local symbols in the indirect symbol table contain the value `INDIRECT_SYMBOL_LOCAL` and the corresponding __pointers entry must contain the address of the target. In r349060, I added support for local symbols in the indirect symbol table, which was checking if the symbol `isDefined` && `!isExternal` to determine if the symbol is local or not. It turns out that `isDefined` will return false if the user of the symbol comes before its definition, and we'll again generate .long 0 which will be the symbol at the adress 0x0. Instead of doing that, use GlobalValue::hasLocalLinkage() to check if the symbol is local. Differential Revision: https://reviews.llvm.org/D66563 llvm-svn: 369671
*	[X86] Remove MCInstLower code that drops operands from some CALL and TAILJMP ↵	Craig Topper	2019-08-22	1	-18/+8
\| \| \| \| \| \| \| \| \| \|	instructions. Add asserts to verify operand count It appears the FIXME here was handled at some point. r159728 from 2012 seems to be at least aportion of fixing it. Differential Revision: https://reviews.llvm.org/D66570 llvm-svn: 369665
*	[MBP] Disable aggressive loop rotate in plain mode	Guozhi Wei	2019-08-22	1	-36/+80
\| \| \| \| \| \| \| \| \| \|	Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 369664
*	[DAGCombiner] Remove explicit call to AddToWorklist in sqrt and reciprocal ↵	Amaury Sechet	2019-08-22	1	-32/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	computations Summary: These nodes end up being processed regardless due to DAGCombiner ensuring arguments are processed. This changes the order in which nodes are processed, which fixes an issue on PowerPC. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri, mcberg2017, stefanp, hfinkel Subscribers: nemanjai, MaskRay, jsji, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66548 llvm-svn: 369662
*	[X86][BtVer2] Fix latency/throughput of scalar integer MUL instructions.	Andrea Di Biagio	2019-08-22	1	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Single operand MUL instructions that implicitly set EAX have the following latency/throughput profile (see below): imul %cl # latency: 3cy - uOPs: 1 - 1 JMul imul %cx # latency: 3cy - uOPs: 3 - 3 JMul imul %ecx # latency: 3cy - uOPs: 2 - 2 JMul imul %rcx # latency: 6cy - uOPs: 2 - 4 JMul mul %cl # latency: 3cy - uOPs: 1 - 1 JMul mul %cx # latency: 3cy - uOPs: 3 - 3 JMul mul %ecx # latency: 3cy - uOPs: 2 - 2 JMul mul %rcx # latency: 6cy - uOPs: 2 - 4 JMul Excluding the 64bit variant, which has a latency of 6cy, every other instruction has a latency of 3cy. However, the number of decoded macro-opcodes (as well as the resource cyles) depend on the MUL size. The two operand MULs have a more predictable profile (see below): imul %dx, %dx # latency: 3cy - uOPs: 1 - 1 JMul imul %edx, %edx # latency: 3cy - uOPs: 1 - 1 JMul imul %rdx, %rdx # latency: 6cy - uOPs: 1 - 4 JMul imul $3, %dx, %dx # latency: 4cy - uOPs: 2 - 2 JMul imul $3, %ecx, %ecx # latency: 3cy - uOPs: 1 - 1 JMul imul $3, %rdx, %rdx # latency: 6cy - uOPs: 1 - 4 JMul This patch updates the values in the Jaguar scheduling model and regenerates llvm-mca tests. Differential Revision: https://reviews.llvm.org/D66547 llvm-svn: 369661
*	[PowerPC] Add combined ELF ABI and 32/64 bit queries to the subtarget. [NFC]	Sean Fertile	2019-08-22	6	-58/+59
\| \| \| \| \| \| \| \| \| \| \| \|	A lot of places in the code combine checks for both ABI (SVR4/Darwin/AIX) and addressing mode (64-bit vs 32-bit). In an attempt to make some of the code more readable I've added a couple functions that combine checking for the ELF abi and 64-bit/32-bit code at once. As we add more AIX support I intend to add similar functions for the AIX ABI. Differential Revision: https://reviews.llvm.org/D65814 llvm-svn: 369658
*	[PowerPC][XCOFF][MC] Explicitly set containing csect on symbols. [NFC]	Sean Fertile	2019-08-22	2	-2/+3
\| \| \| \| \| \| \| \| \| \| \|	Previously we would get the csect a symbol was contained in through its fragment. This works only if we are writing an object file, and only for defined symbols. To fix this we set the contating csect explicitly on the MCSymbolXCOFF object. Differential Revision: https://reviews.llvm.org/D66032 llvm-svn: 369657
*	[Attributor][NFC] Move DerefState to header and use StateWrapper	Hideto Ueno	2019-08-22	1	-96/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In D65402, I want to get DerefState from AADereferenceable but it was not allowed. This patch moves DerefState definition into Attributor.h and makes AADerefenceable inherit StateWrapper. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66585 llvm-svn: 369653
*	[SlotIndexes] Add print-slotindexes to disable printing slotindexes	Jinsong Ji	2019-08-22	1	-2/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: When we print the IR with --print-after/before-*, SlotIndexes will be printed whenever available (We haven't freed it). This introduces some noises when we try to compare the IR among different optimizations. eg: -print-before=machine-cp will print SlotIndexes for 1st machine-cp pass, but NOT for 2nd machine-cp; -print-after=machine-cp will NOT print SlotIndexes for both machine-cp passes. So SlotIndexes in 1st pass introduce noises when differing these IRs. This patch introduces an option to hide indexes. Reviewers: stoklund, thegameg, qcolombet Reviewed By: thegameg Subscribers: hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66500 llvm-svn: 369650
*	[MCA] consistently use MCPhysReg instead of unsigned as register type. NFCI	Andrea Di Biagio	2019-08-22	4	-17/+15
\| \| \| \|	llvm-svn: 369648
*	[yaml2obj] - Lookup relocation symbols in dynamic symbol when .dynsym ↵	George Rimar	2019-08-22	1	-10/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	referenced. This fixes https://bugs.llvm.org/show_bug.cgi?id=40337. Previously, it was always assumed that relocations referenced symbols in the static symbol table. Now, if the Link field references a section called ".dynsym" it will look up these symbols in the dynamic symbol table. This patch is heavily based on D59097 by James Henderson Differential revision: https://reviews.llvm.org/D66532 llvm-svn: 369645
*	[X86][BtVer2] Fix latency and throughput of XCHG and XADD.	Andrea Di Biagio	2019-08-22	1	-1/+91
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On Jaguar, XCHG has a latency of 1cy and decodes to 2 macro-opcodes. Maximum throughput for XCHG is 1 IPC. The byte exchange has worse latency and decodes to 1 extra uOP; maximum observed throughput is 0.5 IPC. ``` xchgb %cl, %dl # Latency: 2cy - uOPs: 3 - 2 ALU xchgw %cx, %dx # Latency: 1cy - uOPs: 2 - 2 ALU xchgl %ecx, %edx # Latency: 1cy - uOPs: 2 - 2 ALU xchgq %rcx, %rdx # Latency: 1cy - uOPs: 2 - 2 ALU ``` The reg-mem forms of XCHG are atomic operations with an observed latency of 16cy. The resource usage is similar to the XCHGrr variants. The biggest difference is obviously the bus-locking, which prevents the LS to issue other memory uOPs in parallel until the unlocking store uOP is executed. ``` xchgb %cl, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy xchgw %cx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy xchgl %ecx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy xchgq %rcx, (%rsp) # Latency: 16cy - uOPs: 3 - ECX latency: 11cy ``` The exchanged in/out register operand becomes available after 11cy from the start of execution. Added test xchg.s to verify that we correctly see that register write committed in 11cy (and not 16cy). Reg-reg XADD instructions have the same latency/throughput than the byte exchange (register-register variant). ``` xaddb %cl, %dl # latency: 2cy - uOPs: 3 - 3 ALU xaddw %cx, %dx # latency: 2cy - uOPs: 3 - 3 ALU xaddl %ecx, %edx # latency: 2cy - uOPs: 3 - 3 ALU xaddq %rcx, %rdx # latency: 2cy - uOPs: 3 - 3 ALU ``` The non-atomic RM variants have a latency of 11cy, and decode to 4 macro-opcodes. They still consume 2 ALU pipes, and the exchange in/out register operand becomes available in 3cy (it matches the 'load-to-use latency'). ``` xaddb %cl, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU xaddw %cx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU xaddl %ecx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU xaddq %rcx, (%rsp) # latency: 11cy - uOPs: 4 - 3 ALU ``` The atomic XADD variants execute in 16cy. The in/out register operand is available after 11cy from the start of execution. ``` lock xaddb %cl, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy lock xaddw %cx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy lock xaddl %ecx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy lock xaddq %rcx, (%rsp) # latency: 16cy - uOPs: 4 - 3 ALU -- ECX latency: 11cy ``` Added test xadd.s to verify those latencies as well as read-advance values. Differential Revision: https://reviews.llvm.org/D66535 llvm-svn: 369642
*	[MVT] Add MVT equivalent to EVT::getHalfNumVectorElementsVT() helper. NFCI.	Simon Pilgrim	2019-08-22	1	-21/+11
\| \| \| \| \| \|	Allows for some cleanup in a lot of SSE/AVX vector splitting code llvm-svn: 369640
*	Reapply: [ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32	Sam Tebbs	2019-08-22	1	-6/+7
\| \| \| \| \| \| \| \| \|	The CodeGen/Thumb2/mve-vaddv.ll test needed to be amended to reflect the changes from the above patch. This reverts commit cd53ff6, reapplying 7c6b229. llvm-svn: 369638
*	[Loop Peeling] Fix silly bug in metadata update.	Serguei Katkov	2019-08-22	1	-6/+6
\| \| \| \| \| \| \|	We must update loop metedata before we moved to parent loop if it is present. llvm-svn: 369637
*	Revert r369626 "[ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32"	Hans Wennborg	2019-08-22	1	-7/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It broke the bots, see e.g. http://lab.llvm.org:8011/builders/clang-cuda-build/builds/36275/ > This patch fixes shifts by a 128/256 bit shift amount. It also fixes > codegen for shifts of 32 by delegating to LLVM's default optimisation > instead of emitting a long shift. > > Tests that used to generate long shifts of 32 are updated to check for the > more optimised codegen. > > Differential revision: https://reviews.llvm.org/D66519 > > llvm-svn: 369626 llvm-svn: 369636
*	[X86] Lower the cost of v2i32->v2f64 sint_to_fp under vector widening ↵	Craig Topper	2019-08-22	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	legalization. I don't really understand the costs we're using for fp_to_sint, but prior to widening legalization we used 20 as the cost for this via the v2i64->v2f64 entry. That number seems better than the 40 we got with widening legalization. So now we need either a v2i32->v2f64 entry or a v4i32->v2f64 entry depending on whether AVX is enabled or not since we skip the first SSE2 table look up under AVX. llvm-svn: 369628
*	[Support] Improve readNativeFile(Slice) interface	Pavel Labath	2019-08-22	3	-92/+66
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: There was a subtle, but pretty important difference between the Slice and regular versions of this function. The Slice function was zero-initializing the rest of the buffer when the read syscall returned less bytes than expected, while the regular function did not. This patch removes the inconsistency by making both functions not zero-initialize the buffer. The zeroing code is moved to the MemoryBuffer class, which is currently the only user of this code. This makes the API more consistent, and the code shorter. While in there, I also refactor the functions to return the number of bytes through the regular return value (via Expected<size_t>) instead of a separate by-ref argument. Reviewers: aganea, rnk Subscribers: kristina, Bigcheese, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66471 llvm-svn: 369627
*	[ARM] Fix lsrl with a 128/256 bit shift amount or a shift of 32	Sam Tebbs	2019-08-22	1	-6/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes shifts by a 128/256 bit shift amount. It also fixes codegen for shifts of 32 by delegating to LLVM's default optimisation instead of emitting a long shift. Tests that used to generate long shifts of 32 are updated to check for the more optimised codegen. Differential revision: https://reviews.llvm.org/D66519 llvm-svn: 369626
*	[TargetLowering] Remove optional arguments passing to makeLibCall	Shiva Chen	2019-08-22	9	-95/+177
\| \| \| \| \| \| \| \| \| \|	The patch introduces MakeLibCallOptions struct as suggested by @efriedma on D65497. The struct contain argument flags which will pass to makeLibCall function. The patch should not has any functionality changes. Differential Revision: https://reviews.llvm.org/D65795 llvm-svn: 369622
*	[X86] Making X86OptimizeLEAs pass public. NFC	Pengfei Wang	2019-08-22	3	-26/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: wxiao3, LuoYuanke, andrew.w.kaylor, craig.topper, annita.zhang, liutianle, pengfei, xiangzhangllvm, RKSimon, spatel, andreadb Reviewed By: RKSimon Subscribers: andreadb, hiraditya, llvm-commits Tags: #llvm Patch by Gen Pei (gpei) Differential Revision: https://reviews.llvm.org/D65933 llvm-svn: 369612
*	[COFF] Fix section name for constants larger than 64 bits on Windows	Fangrui Song	2019-08-22	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	APIntToHexString returns wrong value ("0000000000000000ffffffffffffffff") for integer larger than 64 bits, and thus TargetLoweringObjectFileCOFF::getSectionForConstant returns same section name for all numbers larger than 64 bits. This patch tries to fix it. Differential Revision: https://reviews.llvm.org/D66458 Patch by Senran Zhang llvm-svn: 369610
*	[Object] FIX: update PlatformKind name in TapiFile	Cyndy Ishida	2019-08-21	1	-2/+2
\| \| \| \| \| \| \|	Buildbots that use GCC failed to compile because overwritten namespace with variable name llvm-svn: 369602
*	[Object] Add tapi files to object	Cyndy Ishida	2019-08-21	5	-3/+163
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The intention for this is to allow reading and printing symbols out from llvm-nm. Tapi file, and Tapi universal follow a similiar format to their respective MachO Object format. The tests are dependent on llvm-nm processing tbd files which is why its in D66160 Reviewers: ributzka, steven_wu, lhames Reviewed By: ributzka, lhames Subscribers: mgorny, hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66159 llvm-svn: 369600
*	[X86] Correct the scheduler classes for TAILJMP and TCRETURN CodeGenOnly ↵	Craig Topper	2019-08-21	1	-24/+25
\| \| \| \| \| \| \| \| \| \| \| \|	instructions. We had an odd combination of WriteJump applied to some memory instructions and WriteJumpLd applied to register and immediate instructions. Thsi should hopefully assign them all correctly. llvm-svn: 369599
*	[X86] Replace a couple hardcoded '5's with X86::AddrNumOperands for ↵	Craig Topper	2019-08-21	1	-2/+3
\| \| \| \| \| \|	readability. NFC llvm-svn: 369598
*	[RISCV] Remove fix introduced by r369573, superseded by r369580	Luis Marques	2019-08-21	1	-3/+0
\| \| \| \|	llvm-svn: 369590
*	[Attributor] Fix: Gracefully handle non-instruction users	Johannes Doerfert	2019-08-21	1	-1/+5
\| \| \| \| \| \| \|	Function can have users that are not instructions, e.g., bitcasts. For now, we simply give up when we see them. llvm-svn: 369588
*	Add FileWriter to GSYM and encode/decode functions to AddressRange and ↵	Greg Clayton	2019-08-21	3	-0/+115
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	AddressRanges The full GSYM patch started with: https://reviews.llvm.org/D53379 This patch add the ability to encode data using the new llvm::gsym::FileWriter class. FileWriter is a simplified binary data writer class that doesn't require targets, target definitions, architectures, or require any other optional compile time libraries to be enabled via the build process. This class needs the ability to seek to different spots in the binary data that it produces to fix up offsets and sizes in GSYM data. It currently uses std::ostream over llvm::raw_ostream because llvm::raw_ostream doesn't support seeking which is required when encoding and decoding GSYM data. AddressRange objects are encoded and decoded to be relative to a base address. This will be the FunctionInfo's start address if the AddressRange is directly contained in a FunctionInfo, or a base address of the containing parent AddressRange or AddressRanges. This allows address ranges to be efficiently encoded using ULEB128 encodings as we encode the offset and size of each range instead of full addresses. This also makes encoded addresses easy to relocate as we just need to relocate one base address. Differential Revision: https://reviews.llvm.org/D63828 llvm-svn: 369587
*	[RISCV] Fix use of side-effects in asserts in decoder functions	Luis Marques	2019-08-21	1	-6/+9
\| \| \| \|	llvm-svn: 369580
*	[BinaryFormat] Teach identify_magic about Tapi files.	Cyndy Ishida	2019-08-21	4	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Tapi files are YAML files that start with the !tapi tag. The only execption are TBD v1 files, which don't have a tag. In that case we have to scan a little further and check if the first key "archs" exists. This is the first patch in a series of patches to add libObject support for text-based dynamic library (.tbd) files. This patch is practically exactly the same as D37820, that was never pushed to master, and is needed for future commits related to reading tbd files for llvm-nm Reviewers: ributzka, steven_wu, bollu, espindola, jfb, shafik, jdoerfert Reviewed By: steven_wu Subscribers: dexonsmith, llvm-commits Tags: #llvm, #clang, #sanitizers, #lldb, #libc, #openmp Differential Revision: https://reviews.llvm.org/D66149 llvm-svn: 369579
*	[Attributor][NFC] Fix copy & paste error	Johannes Doerfert	2019-08-21	1	-1/+1
\| \| \| \|	llvm-svn: 369577
*	[Attributor][NFC] Remove leftover semicolon	Johannes Doerfert	2019-08-21	1	-1/+1
\| \| \| \|	llvm-svn: 369576
*	[Attributor] Use existing unreachable instead of introducing new ones	Johannes Doerfert	2019-08-21	1	-0/+3
\| \| \| \| \| \| \|	So far we split the unreachable off and placed a new one, this is not necessary. llvm-svn: 369575
*	Fix -Werror=unused-variable error after r369528.	Richard Smith	2019-08-21	1	-0/+3
\| \| \| \|	llvm-svn: 369573
*	[GVN] Do PHI translations across all edges between the load and the ↵	Florian Hahn	2019-08-21	1	-6/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	unavailable pred. Currently we do not properly translate addresses with PHIs if LoadBB != LI->getParent(), because PHITranslateAddr expects a direct predecessor as argument, because it considers all instructions outside of the current block to not requiring translation. The amount of cases that trigger this should be very low, as most single predecessor blocks should be folded into their predecessor by GVN before we actually start with value numbering. It is still not guaranteed to happen, so we should do PHI translation along all edges between the loads' block and the predecessor where we have to place a load. There are a few test cases showing current limits of the PHI translation, which could be improved later. Reviewers: spatel, reames, efriedma, john.brawn Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D65020 llvm-svn: 369570
*	Revert r369549 as it broke the bots.	Aaron Ballman	2019-08-21	1	-4/+3
\| \| \| \| \| \|	http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/13605/ llvm-svn: 369569
*	Revert r367389 (and follow-up r368404); it caused PR43073.	Nico Weber	2019-08-21	1	-50/+76
\| \| \| \|	llvm-svn: 369567
*	[WebAssembly] Handle aliases in WebAssemblyFixFunctionBitcasts	Sam Clegg	2019-08-21	1	-0/+2
\| \| \| \| \| \| \| \|	Fixes: https://github.com/emscripten-core/emscripten/issues/8770 Differential Revision: https://reviews.llvm.org/D66508 llvm-svn: 369566
*	[MVT] Add v16f16 and v32f16 vectors.	Craig Topper	2019-08-21	2	-0/+6
\| \| \| \| \| \| \| \| \|	I might look at improving PR43065 which will require being able to mark a 256 and 512 bit vector of f16 as Legal. Differential Revision: https://reviews.llvm.org/D66515 llvm-svn: 369565
*	[mips] Replace call `expandLoadAddress` by `loadAndAddSymbolAddress`. NFC	Simon Atanasyan	2019-08-21	1	-2/+2
\| \| \| \| \| \| \| \|	In case of expanding `lw/sw $reg, symbol($reg)` instruction for PIC it's enough to call the `loadAndAddSymbolAddress` method. Additional work performed by the `expandLoadAddress` is not required here. llvm-svn: 369563