bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Revert "[mips] Correct the predicates of sign extension instructions"	Simon Dardis	2018-05-02	4	-5/+29
	I accidentally committed this patch after asking for a review, but it has not been reviewed yet. This reverts r331346. llvm-svn: 331348
*	[X86] Convert most remaining uses of X86SchedWritePair scheduler classes to ↵	Simon Pilgrim	2018-05-02	2	-194/+222
	X86SchedWriteWidths. We've dealt with the majority already. llvm-svn: 331347
*	[mips] Correct the predicates of sign extension instructions	Simon Dardis	2018-05-02	4	-29/+5
	And eliminate the duplication of those instructions for microMIPS32r6. llvm-svn: 331346
*	[AArch64][SVE] Asm: Support for non-temporal, contiguous LDNT1/STNT1 ↵	Sander de Smalen	2018-05-02	2	-0/+150
	load/store instructions. Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D46269 llvm-svn: 331343
*	[LoopInterchange] Update some loops to use range-based for loops (NFC).	Florian Hahn	2018-05-02	1	-30/+24
	llvm-svn: 331342
*	[mips] Correct the predicates for shifts.	Simon Dardis	2018-05-02	2	-23/+21
	Reviewers: smaksimovic, abeserminji, atanasyan Differential Revision: https://reviews.llvm.org/D46123 llvm-svn: 331341
*	[X86] Cleanup WriteFAdd/WriteFCmp scheduler classes with more common default ↵	Simon Pilgrim	2018-05-02	6	-105/+41
	values Intel models were targeting x87 instead of packed sse. Also fixes XOP's VFRCZ to use WriteFAdd/WriteFAddY. llvm-svn: 331340
*	[AArch64][SVE] Asm: Support for LD1RQ load-and-replicate quad-word vector ↵	Sander de Smalen	2018-05-02	4	-0/+77
	instructions. Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D46250 llvm-svn: 331339
*	[SelectionDAG] Selection of DBG_VALUE using a PHI node result (pt 2)	Bjorn Pettersson	2018-05-02	2	-6/+36
	Summary: This is a follow up to rL331182. A PHI node can be split up into several MIR PHI nodes when being selected. When there is a dbg.value intrinsic that uses the result of such a PHI node we need to select several DBG_VALUE instructions, with fragment expressions, in order to do a correct selection. Reviewers: rnk, aprantl, vsk Reviewed By: vsk Subscribers: mattd, llvm-commits, JDevlieghere, aprantl, gbedwell, rnk Tags: #debug-info Differential Revision: https://reviews.llvm.org/D46329 llvm-svn: 331337
*	Fix release build breakage	Sam Clegg	2018-05-02	1	-0/+2
	This function was added in rL331220 but wasn't tested in release configurations. llvm-svn: 331320
*	[AMDGPU] Support horizontal vectorization.	Farhana Aleen	2018-05-01	3	-0/+43
	Author: FarhanaAleen Reviewed By: rampitec, arsenm Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D46213 llvm-svn: 331313
*	[CFLGraph][NFC] Simplify/reorder switch in visitConstantExpr	David Bolvansky	2018-05-01	1	-37/+17
	Reviewers: hfinkel, efriedma, spatel, dsanders, Danil, rjmccall Reviewed By: rjmccall Subscribers: dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D46259 llvm-svn: 331312
*	[AggressiveInstCombine] convert a chain of 'or-shift' bits into masked compare	Sanjay Patel	2018-05-01	1	-21/+94
	and (or (lshr X, C), ...), 1 --> (X & C') != 0 I initially thought about implementing the minimal pattern in instcombine as mentioned here: https://bugs.llvm.org/show_bug.cgi?id=37098#c6 ...but we need to do better to catch the more general sequence from the motivating test (more than 2 bits in the compare). And a test-suite run with statistics showed that this pattern only happened 2 times currently. It would potentially happen more often if reassociation worked better (D45842), but it's probably still not too frequent? This is small enough that I didn't see a need to create a whole new class/file within AggressiveInstCombine. There are likely other relatively small matchers like what was discussed in D44266 that would slide under foldUnusualPatterns() (name suggestions welcome). We could potentially also consolidate matchers for ctpop, bswap, etc under here. Differential Revision: https://reviews.llvm.org/D45986 llvm-svn: 331311
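The fold described above can be illustrated with a small sketch. It is not the code from the patch: the function names and the shift amounts 3, 5 and 7 are assumptions chosen for illustration. It only shows that OR-ing several right-shifted copies of a value and keeping bit 0 asks the same question as a single masked compare.

```cpp
#include <cstdint>

// Original shape: and (or (lshr X, 3), (lshr X, 5), (lshr X, 7)), 1
// The shift amounts are illustrative, not taken from the patch.
constexpr bool anyBitSetViaShifts(std::uint32_t x) {
  return (((x >> 3) | (x >> 5) | (x >> 7)) & 1u) != 0;
}

// Folded shape: (X & C') != 0, where C' has one bit per shift amount.
constexpr std::uint32_t kMask = (1u << 3) | (1u << 5) | (1u << 7);

constexpr bool anyBitSetViaMask(std::uint32_t x) {
  return (x & kMask) != 0;
}
```

Both functions return true exactly when bit 3, 5 or 7 of the input is set, so the shift chain can be replaced by one AND and one compare.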
*	Create a MachineBasicBlock for created IR-level BasicBlock	Jessica Paquette	2018-05-01	1	-0/+9
	While running the lit tests for the most recent version of D45916 (https://reviews.llvm.org/D45916), I found that a couple tests for this pass suddenly started segfaulting. Since the outliner wasn't actually doing anything to the code in either of these tests I got curious. I found that the pass doesn’t completely create the machine-level constructs necessary to actually add a MachineFunction and MachineBasicBlock to the module. This patch adds in those missing bits. After this, adding the outliner before this pass won’t cause it to segfault. You can recreate this behaviour by adding the MachineOutliner directly before the pass and having it return false immediately. https://reviews.llvm.org/D46330 llvm-svn: 331307
*	[DAGCombiner] Fix SDLoc in a (zext (zextload x)) combine (4/N)	Vedant Kumar	2018-05-01	1	-33/+35
	The logic for this combine is almost identical to the logic for a (sext (sextload x)) combine. This commit factors out the logic so it can be shared by both combines, and corrects the SDLoc assigned in the zext version of the combine. Prior to this patch, for the given test case, we would apply the location associated with the udiv instruction to instructions which perform the load. Part of: llvm.org/PR37262 llvm-svn: 331303
*	[DAGCombiner] Fix SDLoc in a (sext (sextload x)) combine (3/N)	Vedant Kumar	2018-05-01	1	-3/+3
	Prior to this patch, for the given test case, we would apply the location associated with the sdiv instruction to instructions which perform the load. Part of: llvm.org/PR37262. Differential Revision: https://reviews.llvm.org/D46222 llvm-svn: 331302
*	[DAGCombiner] Change the SDLoc on split extloads (2/N)	Vedant Kumar	2018-05-01	1	-1/+1
	In DAGCombiner, we try to simplify this pattern: ([s|z]ext (load ...)) Conceptually, a new extload which is created while splitting the load should have the same debug location as the load. Making this change affects the IROrder of the new load, causing some test case churn. In practice, the new location is never different from the location of the [s|z]ext, at least not during check-llvm or a stage2 build. Part of: llvm.org/PR37262 Differential Revision: https://reviews.llvm.org/D46156 llvm-svn: 331301
*	[DAGCombiner] Set the right SDLoc on a newly-created zextload (1/N)	Vedant Kumar	2018-05-01	1	-1/+1
	Setting the right SDLoc on a newly-created zextload fixes a line table bug which resulted in non-linear stepping behavior. Several backend tests contained CHECK lines which relied on the IROrder inherited from the wrong SDLoc. This patch breaks that dependence where feasible and regenerates test cases where not. In some cases, changing a node's IROrder may alter register allocation and spill behavior. This can affect performance. I have chosen not to prevent this by applying a "known good" IROrder to SDLocs, as this may hide a more general bug in the scheduler, or cause regressions on other test inputs. rdar://33755881, Part of: llvm.org/PR37262 Differential Revision: https://reviews.llvm.org/D45995 llvm-svn: 331300
*	AMDGPU: Remove remnants of gfx901 (it was deprecated some time ago)	Konstantin Zhuravlyov	2018-05-01	1	-2/+1
	llvm-svn: 331298
*	[X86][AMD][Bulldozer] Fix Bulldozer Model 2 detection.	Roman Lebedev	2018-05-01	1	-2/+2
	Summary: I have discovered an issue by accident. ``` $ lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 Thread(s) per core: 2 Core(s) per socket: 4 Socket(s): 1 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 21 Model: 2 Model name: AMD FX(tm)-8350 Eight-Core Processor Stepping: 0 CPU MHz: 3584.018 CPU max MHz: 4000.0000 CPU min MHz: 1400.0000 BogoMIPS: 8027.22 Virtualization: AMD-V L1d cache: 16K L1i cache: 64K L2 cache: 2048K L3 cache: 8192K NUMA node0 CPU(s): 0-7 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate vmmcall bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold ``` So this is model-2 bulldozer AMD CPU. 
GCC agrees: ``` $ echo \| gcc -E - -march=native -### <...> /usr/lib/gcc/x86_64-linux-gnu/7/cc1 -E -quiet -imultiarch x86_64-linux-gnu - "-march=bdver2" -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -msse4a -mcx16 -msahf -mno-movbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mlwp -mfma -mfma4 -mxop -mbmi -mno-sgx -mno-bmi2 -mtbm -mavx -mno-avx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mno-rdrnd -mf16c -mno-fsgsbase -mno-rdseed -mprfchw -mno-adx -mfxsr -mxsave -mno-xsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 -mno-clflushopt -mno-xsavec -mno-xsaves -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-avx5124fmaps -mno-avx5124vnniw -mno-clwb -mno-mwaitx -mno-clzero -mno-pku -mno-rdpid --param "l1-cache-size=16" --param "l1-cache-line-size=64" --param "l2-cache-size=2048" "-mtune=bdver2" <...> ``` But clang does not: (look for `bdver1`) ``` $ echo \| clang -E - -march=native -### clang version 7.0.0- (trunk) Target: x86_64-pc-linux-gnu Thread model: posix InstalledDir: /usr/local/bin "/usr/lib/llvm-7/bin/clang" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-E" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "-" "-mrelocation-model" "static" "-mthread-model" "posix" "-mdisable-fp-elim" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-fuse-init-array" "-target-cpu" "bdver1" "-target-feature" "+sse2" "-target-feature" "+cx16" "-target-feature" "+sahf" "-target-feature" "+tbm" "-target-feature" "-avx512ifma" "-target-feature" "-sha" "-target-feature" "-gfni" "-target-feature" "+fma4" "-target-feature" "-vpclmulqdq" "-target-feature" "+prfchw" "-target-feature" "-bmi2" "-target-feature" "-cldemote" "-target-feature" "-fsgsbase" "-target-feature" "-xsavec" "-target-feature" "+popcnt" "-target-feature" "+aes" "-target-feature" "-avx512bitalg" "-target-feature" "-xsaves" "-target-feature" "-avx512er" "-target-feature" "-avx512vnni" "-target-feature" 
"-avx512vpopcntdq" "-target-feature" "-clwb" "-target-feature" "-avx512f" "-target-feature" "-clzero" "-target-feature" "-pku" "-target-feature" "+mmx" "-target-feature" "+lwp" "-target-feature" "-rdpid" "-target-feature" "+xop" "-target-feature" "-rdseed" "-target-feature" "-waitpkg" "-target-feature" "-ibt" "-target-feature" "+sse4a" "-target-feature" "-avx512bw" "-target-feature" "-clflushopt" "-target-feature" "+xsave" "-target-feature" "-avx512vbmi2" "-target-feature" "-avx512vl" "-target-feature" "-avx512cd" "-target-feature" "+avx" "-target-feature" "-vaes" "-target-feature" "-rtm" "-target-feature" "+fma" "-target-feature" "+bmi" "-target-feature" "-rdrnd" "-target-feature" "-mwaitx" "-target-feature" "+sse4.1" "-target-feature" "+sse4.2" "-target-feature" "-avx2" "-target-feature" "-wbnoinvd" "-target-feature" "+sse" "-target-feature" "+lzcnt" "-target-feature" "+pclmul" "-target-feature" "-prefetchwt1" "-target-feature" "+f16c" "-target-feature" "+ssse3" "-target-feature" "-sgx" "-target-feature" "-shstk" "-target-feature" "+cmov" "-target-feature" "-avx512vbmi" "-target-feature" "-movbe" "-target-feature" "-xsaveopt" "-target-feature" "-avx512dq" "-target-feature" "-adx" "-target-feature" "-avx512pf" "-target-feature" "+sse3" "-dwarf-column-info" "-debugger-tuning=gdb" "-resource-dir" "/usr/lib/llvm-7/lib/clang/7.0.0" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/usr/lib/llvm-7/lib/clang/7.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fdebug-compilation-dir" "/build/llvm-build-Clang-release" "-ferror-limit" "19" "-fmessage-length" "271" "-fobjc-runtime=gcc" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "-" "-x" "c" "-" ``` So clang, unlike gcc, considers this to be `bdver1`. After some digging, i've come across `getAMDProcessorTypeAndSubtype()` in `Host.cpp`. 
I have added the following debug printf after the call to that function in `sys::getHostCPUName()`: ``` errs() << "Family " << Family << " Model " << Model << " Type " << Type << "\n"; ``` Which produced: ``` Family 21 Model 2 Type 5 ``` Which matches the `lscpu` output. As was pointed out in the review by @craig.topper: >>! In D46314#1084123, @craig.topper wrote: > I don't think this is right. Here is what I found on wikipedia. https://en.wikipedia.org/wiki/List_of_AMD_CPU_microarchitectures. > > AMD Bulldozer Family 15h - the successor of 10h/K10. Bulldozer is designed for processors in the 10 to 220W category, implementing XOP, FMA4 and CVT16 instruction sets. Orochi was the first design which implemented it. For Bulldozer, CPUID model numbers are 00h and 01h. > AMD Piledriver Family 15h (2nd-gen) - successor to Bulldozer. CPUID model numbers are 02h (earliest "Vishera" Piledrivers) and 10h-1Fh. > AMD Steamroller Family 15h (3rd-gen) - third-generation Bulldozer derived core. CPUID model numbers are 30h-3Fh. > AMD Excavator Family 15h (4th-gen) - fourth-generation Bulldozer derived core. CPUID model numbers are 60h-6Fh, later updated revisions have model numbers 70h-7Fh. > > > So there's a weird exception where model 2 should go with 0x10-0x1f. Though it does not help that the code can't be tested at the moment. With this logical change, the `bdver2` is properly detected. 
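The model ranges quoted in the review can be written down as a small classifier. This is a sketch only: the enum and function names are hypothetical, it is not the code in `Host.cpp`, and returning `Unknown` for models outside the quoted ranges is an assumption made here for clarity. It relies on the usual naming where `bdver1` through `bdver4` correspond to Bulldozer, Piledriver, Steamroller and Excavator.

```cpp
// Hypothetical names; a sketch of the Family 15h model ranges quoted above.
enum class Bdver { Unknown, V1, V2, V3, V4 };

constexpr Bdver classifyFamily15h(unsigned model) {
  return (model >= 0x60 && model <= 0x7f) ? Bdver::V4   // Excavator
       : (model >= 0x30 && model <= 0x3f) ? Bdver::V3   // Steamroller
       : (model == 0x02 ||
          (model >= 0x10 && model <= 0x1f)) ? Bdver::V2 // Piledriver, incl. the model 02h exception
       : (model <= 0x01) ? Bdver::V1                    // Bulldozer
       : Bdver::Unknown;                                // not covered by the quoted ranges
}
```

Under this mapping the FX-8350 from the `lscpu` output (family 21, model 2) classifies as `Bdver::V2`.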
``` $ echo \| /build/llvm-build-Clang-release/bin/clang -E - -march=native -### clang version 7.0.0 (trunk 331249) (llvm/trunk 331256) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /build/llvm-build-Clang-release/bin "/build/llvm-build-Clang-release/bin/clang-7" "-cc1" "-triple" "x86_64-unknown-linux-gnu" "-E" "-disable-free" "-main-file-name" "-" "-mrelocation-model" "static" "-mthread-model" "posix" "-mdisable-fp-elim" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-fuse-init-array" "-target-cpu" "bdver2" "-target-feature" "+sse2" "-target-feature" "+cx16" "-target-feature" "+sahf" "-target-feature" "+tbm" "-target-feature" "-avx512ifma" "-target-feature" "-sha" "-target-feature" "-gfni" "-target-feature" "+fma4" "-target-feature" "-vpclmulqdq" "-target-feature" "+prfchw" "-target-feature" "-bmi2" "-target-feature" "-cldemote" "-target-feature" "-fsgsbase" "-target-feature" "-xsavec" "-target-feature" "+popcnt" "-target-feature" "+aes" "-target-feature" "-avx512bitalg" "-target-feature" "-movdiri" "-target-feature" "-xsaves" "-target-feature" "-avx512er" "-target-feature" "-avx512vnni" "-target-feature" "-avx512vpopcntdq" "-target-feature" "-clwb" "-target-feature" "-avx512f" "-target-feature" "-clzero" "-target-feature" "-pku" "-target-feature" "+mmx" "-target-feature" "+lwp" "-target-feature" "-rdpid" "-target-feature" "+xop" "-target-feature" "-rdseed" "-target-feature" "-waitpkg" "-target-feature" "-movdir64b" "-target-feature" "-ibt" "-target-feature" "+sse4a" "-target-feature" "-avx512bw" "-target-feature" "-clflushopt" "-target-feature" "+xsave" "-target-feature" "-avx512vbmi2" "-target-feature" "-avx512vl" "-target-feature" "-avx512cd" "-target-feature" "+avx" "-target-feature" "-vaes" "-target-feature" "-rtm" "-target-feature" "+fma" "-target-feature" "+bmi" "-target-feature" "-rdrnd" "-target-feature" "-mwaitx" "-target-feature" "+sse4.1" "-target-feature" "+sse4.2" "-target-feature" "-avx2" 
"-target-feature" "-wbnoinvd" "-target-feature" "+sse" "-target-feature" "+lzcnt" "-target-feature" "+pclmul" "-target-feature" "-prefetchwt1" "-target-feature" "+f16c" "-target-feature" "+ssse3" "-target-feature" "-sgx" "-target-feature" "-shstk" "-target-feature" "+cmov" "-target-feature" "-avx512vbmi" "-target-feature" "-movbe" "-target-feature" "-xsaveopt" "-target-feature" "-avx512dq" "-target-feature" "-adx" "-target-feature" "-avx512pf" "-target-feature" "+sse3" "-dwarf-column-info" "-debugger-tuning=gdb" "-resource-dir" "/build/llvm-build-Clang-release/lib/clang/7.0.0" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/build/llvm-build-Clang-release/lib/clang/7.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fdebug-compilation-dir" "/build/llvm-build-Clang-release" "-ferror-limit" "19" "-fmessage-length" "271" "-fobjc-runtime=gcc" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "-" "-x" "c" "-" ``` Reviewers: craig.topper, GBuella, RKSimon, asbirlea, echristo, bkramer, spatel, andreadb, GGanesh Reviewed By: craig.topper Subscribers: sdardis, aprantl, arichardson, JDevlieghere, llvm-commits Differential Revision: https://reviews.llvm.org/D46314 llvm-svn: 331294
*	[X86] Split WriteFMul/WriteFDiv into XMM and YMM/ZMM scheduler classes	Simon Pilgrim	2018-05-01	10	-89/+73
	llvm-svn: 331293
*	llvm-symbolizer: Handle function definitions nested within other functions	David Blaikie	2018-05-01	1	-2/+6
	LLVM always puts function definition DIEs at the top level, but under some circumstances GCC does not (at least in this case with member functions of a function-local type). To ensure that it doesn't appear as though the local type's member function is unduly inlined within the outer function, ensure the inline discovery DIE parent walk stops at the first DW_TAG_subprogram. llvm-svn: 331291
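The parent walk described above can be sketched with a toy tree. The `Die` struct, the `Tag` enum and the function name below are simplified stand-ins invented for illustration; they are not LLVM's `DWARFDie` API and the real change is not reproduced here. The point is only that the walk counts inlined frames on the way up and stops at the first subprogram it meets.

```cpp
// Hypothetical, simplified stand-ins for DWARF DIEs.
enum class Tag { CompileUnit, Subprogram, InlinedSubroutine, StructureType };

struct Die {
  Tag tag;
  const Die *parent; // nullptr for the root
};

// Number of frames in the inlined chain starting at `die`: each
// InlinedSubroutine on the way up, plus the first Subprogram, where the
// walk stops instead of continuing into enclosing functions.
constexpr int inlinedChainLength(const Die *die) {
  return die == nullptr                       ? 0
         : die->tag == Tag::Subprogram        ? 1
         : die->tag == Tag::InlinedSubroutine ? 1 + inlinedChainLength(die->parent)
                                              : inlinedChainLength(die->parent);
}

// GCC-style nesting: an outer function contains a local type whose member
// function definition is nested under it, with one call inlined into it.
constexpr Die kUnit{Tag::CompileUnit, nullptr};
constexpr Die kOuter{Tag::Subprogram, &kUnit};
constexpr Die kLocalType{Tag::StructureType, &kOuter};
constexpr Die kMember{Tag::Subprogram, &kLocalType};
constexpr Die kInlinedCall{Tag::InlinedSubroutine, &kMember};
```

Starting from `kInlinedCall`, the chain has two frames (the inlined call and the member function). The outer function is not counted, because the walk ends at `kMember`.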
*	[X86] Split WriteFRcp/WriteFRsqrt/WriteFSqrt into XMM and YMM/ZMM scheduler ↵	Simon Pilgrim	2018-05-01	12	-97/+117
	classes llvm-svn: 331290
*	[X86] Split WriteFCmp into XMM and YMM/ZMM scheduler classes	Simon Pilgrim	2018-05-01	12	-83/+48
	Removes more WriteFCmp InstRW overrides llvm-svn: 331283
*	[X86] Split WriteFAdd into XMM and YMM/ZMM scheduler classes	Simon Pilgrim	2018-05-01	10	-79/+19
	Removes more WriteFAdd InstRW overrides llvm-svn: 331276
*	Remove @brief commands from doxygen comments, too.	Adrian Prantl	2018-05-01	16	-89/+89
	This is a follow-up to r331272. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers in our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done https://reviews.llvm.org/D46290 llvm-svn: 331275
*	[X86] Convert all uses of WriteFAdd to X86SchedWriteWidths.	Simon Pilgrim	2018-05-01	3	-162/+180
	In preparation of splitting WriteFAdd by vector width. llvm-svn: 331273
*	Remove \brief commands from doxygen comments.	Adrian Prantl	2018-05-01	411	-1883/+1883
	We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers in our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 llvm-svn: 331272
*	[LV] Preserve inbounds on created GEPs	Daniel Neilson	2018-05-01	1	-3/+22
	Summary: This is a fix for PR23997. The loop vectorizer is not preserving the inbounds property of GEPs that it creates. This is inhibiting some optimizations. This patch preserves the inbounds property in the case where a load/store is being fed by an inbounds GEP. Reviewers: mkuper, javed.absar, hsaito Reviewed By: hsaito Subscribers: dcaballe, hsaito, llvm-commits Differential Revision: https://reviews.llvm.org/D46191 llvm-svn: 331269
*	Fix the issue that ComputeValueKnownInPredecessors only handles the case when	Wei Mi	2018-05-01	1	-3/+15
	phi is on lhs of a comparison op. For the following testcase, L1: %t0 = add i32 %m, 7 %t3 = icmp eq i32* %t2, null br i1 %t3, label %L3, label %L2 L2: %t4 = load i32, i32* %t2, align 4 br label %L3 L3: %t5 = phi i32 [ %t0, %L1 ], [ %t4, %L2 ] %t6 = icmp eq i32 %t0, %t5 br i1 %t6, label %L4, label %L5 We know if we go through the path L1 --> L3, %t6 should always be true. However currently, if the rhs of the eq comparison is phi, JumpThreading fails to evaluate %t6 to true. And we know that Instcombine cannot guarantee always canonicalizing phi to the left hand side of the comparison operation according to the operand priority comparison mechanism in instcombine. The patch handles the case when rhs of the comparison op is a phi. Differential Revision: https://reviews.llvm.org/D46275 llvm-svn: 331266
*	[X86] Split WriteFShuffle into XMM and YMM/ZMM scheduler classes	Simon Pilgrim	2018-05-01	10	-89/+22
	Removes more WriteFShuffle InstRW overrides llvm-svn: 331264
*	[X86] Convert all uses of WriteFShuffle to X86SchedWriteWidths.	Simon Pilgrim	2018-05-01	2	-123/+153
	In preparation of splitting WriteFShuffle by vector width. llvm-svn: 331262
*	[AArch64][SVE] Asm: Support for contiguous ST1 (scalar+scalar) store ↵	Sander de Smalen	2018-05-01	2	-1/+43
	instructions. Reviewers: fhahn, rengolin, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D46121 llvm-svn: 331260
*	Reland r331175: "[mips] Fix the predicates of jump and branch and link ↵	Simon Dardis	2018-05-01	4	-50/+69
	instructions" The previous version of this patch restricted the 'jal' instruction to MIPS and microMIPSr3. microMIPS32r6 does not have this instruction and instead uses jal as an alias for balc. Original commit message: > Reviewers: smaksimovic, atanasyan, abeserminji > > Differential Revision: https://reviews.llvm.org/D46114 > llvm-svn: 331259
*	[X86] Split WriteVecLogic into XMM and YMM/ZMM scheduler classes	Simon Pilgrim	2018-05-01	10	-38/+16
	This removes all the WriteVecLogic InstRW overrides. llvm-svn: 331258
*	[InstCombine] Adjusting bswap pattern matching to hold for And/Shift mixed case	Omer Paparo Bivas	2018-05-01	1	-1/+12
	Differential Revision: https://reviews.llvm.org/D45731 Change-Id: I85d4226504e954933c41598327c91b2d08192a9d llvm-svn: 331257
*	[X86] Convert all uses of WriteFLogic/WriteVecLogic to X86SchedWriteWidths.	Simon Pilgrim	2018-05-01	2	-141/+157
	In preparation of splitting WriteVecLogic by vector width. llvm-svn: 331256
*	[MC] Add llvm_unreachable to toString to fix compile time warning.	Florian Hahn	2018-05-01	1	-2/+1
	Without this change, GCC 7 raises the warning below: control reaches end of non-void function Reviewers: sbc100, andreadb Reviewed By: andreadb Differential Revision: https://reviews.llvm.org/D46304 llvm-svn: 331255
*	[X86] Tag PSLLDQ/PSRLDQ as WriteShuffle scheduler classes instead of shifts.	Simon Pilgrim	2018-05-01	6	-39/+25
	Although they are encoded similarly to bit shifts, the byte shifts behave like shuffles from a scheduling point of view. llvm-svn: 331253
*	[X86] Correct spill slot size.	Andrea Di Biagio	2018-05-01	1	-2/+2
	This patch fixes a bug introduced by revision 330778 (originally reviewed at: https://reviews.llvm.org/D44782), where function isFrameLoadOpcode returned the wrong number of bytes read for opcodes VMOVSSrm and VMOVSDrm. This corrects that mistake, and extends the regression test to catch cases where the dead stores should be removed. Patch by Jeremy Morse. Differential Revision: https://reviews.llvm.org/D46256 llvm-svn: 331252
*	NFC, Avoid a warning in WasmObjectWriter	Gabor Buella	2018-05-01	1	-0/+2
	The warning was (introduced in r331220): lib/MC/WasmObjectWriter.cpp:51:1: warning: control reaches end of non-void function [-Wreturn-type] } ^ llvm-svn: 331251
*	[X86] movdiri and movdir64b instructions	Gabor Buella	2018-05-01	6	-2/+57
	Reviewers: spatel, craig.topper, RKSimon Reviewed By: craig.topper, RKSimon Differential Revision: https://reviews.llvm.org/D45983 llvm-svn: 331248
*	[PM/LoopUnswitch] Remove the last manual domtree update code from loop	Chandler Carruth	2018-05-01	1	-170/+18
	unswitch and replace it with the amazingly simple update API code. This addresses piles of FIXMEs around the update logic here and makes everything substantially simpler. llvm-svn: 331247
*	[PM/LoopUnswitch] Add back a successor set that was removed based on	Chandler Carruth	2018-05-01	1	-2/+6
	code review. It turns out this is necessary, and I read the comment on the API correctly the first time. ;] The `applyUpdates` routine requires that updates are "balanced". This is in order to cleanly handle cycles like inserting, removing, and then re-inserting the same edge. This precludes inserting the same edge multiple times in a row as handling that would cause the insertion logic to become ordered instead of unordered (which is what the API provides). It happens that in this specific case nothing (other than an assert and contract violation) goes wrong because we're never inserting and removing the same edge. The implementation happens to do the right thing to eliminate redundant insertions in that case. But the requirement is there and there is an assert to catch it. Somehow, after the code review I never did another asserts-clang build testing loop-unswitch for a long time. As a consequence, I didn't notice this despite a bunch of testing going on, but it shows up immediately with an asserts build of clang itself. llvm-svn: 331246
*	[X86] Remove 'opaque ptr' from the intel syntax parser and printer.	Craig Topper	2018-05-01	6	-58/+55
	Previously for instructions like fxsave we would print "opaque ptr" as part of the memory operand. Now we print nothing. We also no longer accept "opaque ptr" in the parser. We still accept any size to be specified for these instructions, but we may want to consider only parsing when no explicit size is specified. This is what gas does. llvm-svn: 331243
*	Temporarily revert "[DEBUG] Initial adaptation of NVPTX target for debug ↵	Eric Christopher	2018-05-01	12	-222/+280
	info emission." This appears to have some issues associated with the file directive output causing multiple global symbols with the name "file" to be emitted into a startup section. I'm investigating more specific causes and working with the original author. This reverts commit r330271. Also Revert "[DEBUGINFO, NVPTX] Add the test for the debug info of the local" This reverts commit r330592 and the follow up of 330779 as the testcase is dependent upon r330271. llvm-svn: 331237
*	[ModRefInfo] Rename local variable IsMustAlias to avoid shadowing MustAlias ↵	Alina Sbirlea	2018-04-30	1	-3/+3
	enum entry. llvm-svn: 331222
*	[SimplifyCFG] Use BB::instructionsWithoutDebug to skip DbgInfo (NFC).	Florian Hahn	2018-04-30	1	-37/+14
	This patch updates some code responsible for skipping debug info to use BasicBlock::instructionsWithoutDebug. I think this makes things slightly simpler and more direct. Reviewers: aprantl, vsk, hans, danielcdh Reviewed By: hans Differential Revision: https://reviews.llvm.org/D46252 llvm-svn: 331221
*	[WebAssembly] MC: Improve debug output	Sam Clegg	2018-04-30	1	-8/+33
	llvm-svn: 331220
*	[LivePhysRegs] Remove registers clobbered by regmasks from the live set	Krzysztof Parzyszek	2018-04-30	2	-6/+5
	Dead defs were being removed from the live set (in stepForward), but registers clobbered by regmasks weren't (more specifically, they were actually removed by removeRegsInMask, but then they were added back in). llvm-svn: 331219