bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[X86][NFC] Generalize the naming of "Retpoline Thunks" and related code to ↵	Scott Constable	2020-06-24	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	"Indirect Thunks" There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g., https://software.intel.com/security-software-guidance/software-guidance/load-value-injection Therefore it makes sense to refactor X86RetpolineThunks as a more general capability. Differential Revision: https://reviews.llvm.org/D76810
*	[ms] [X86] Use "P" modifier on all branch-target operands in inline X86 ↵	Eric Astor	2020-01-09	1	-27/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	assembly. Summary: Extend D71677 to apply to all branch-target operands, rather than special-casing call instructions. Also add a regression test for llvm.org/PR44272, since this finishes fixing it. Reviewers: thakis, rnk Reviewed By: thakis Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72417
*	[X86] AMD Znver2 (Rome) Scheduler enablement	Ganesh Gopalasubramanian	2020-01-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	The patch gives out the details of the znver2 scheduler model. There are few improvements with respect to execution units, latencies and throughput when compared with znver1. The tests that were present for znver1 for llvm-mca tool were replicated. The latencies, execution units, timeline and throughput information are updated for znver2. Reviewers: craig.topper, Simon Pilgrim Differential Revision: https://reviews.llvm.org/D66088
*	[X86] Reorder X86any* PatFrags to put the strict node first so that chain ↵	Craig Topper	2020-01-03	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	property will be inferred for the instruction by the tablegen backend. Also use X86any_vfpround instead of X86vfpround in some instruction definitions so the strict version can be used to infer the chain property. Without these changes we don't propagate strict FP chain through isel for some instructions.
*	[FPEnv][X86] Constrained FCmp intrinsics enabling on X86	Wang, Pengfei	2019-12-11	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is a follow up of D69281, it enables the X86 backend support for the FP comparision. Reviewers: uweigand, kpn, craig.topper, RKSimon, cameron.mcinally, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70582
*	[PGO][PGSO] DAG.shouldOptForSize part.	Hiroshi Yamauchi	2019-11-21	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: (Split of off D67120) SelectionDAG::shouldOptForSize changes for profile guided size optimization. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70095
*	[X86][BMI] Pull out schedule classes from bmi_andn<> and bmi_bls<>	Simon Pilgrim	2019-10-21	1	-9/+10
\| \| \| \| \| \|	Stop hardwiring classes llvm-svn: 375470
*	[X86] Require last argument to LWPINS/LWPVAL builtins to be an ICE. Add ↵	Craig Topper	2019-09-22	1	-4/+4
\| \| \| \| \| \| \| \|	ImmArg to the llvm intrinsics. Update the isel patterns to use timm instead of imm. llvm-svn: 372534
*	[X86] Updated target specific selection dag code to conservatively check for ↵	Philip Reames	2019-09-10	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	isAtomic in addition to isVolatile See D66309 for context. This is the first sweep of x86 target specific code to add isAtomic bailouts where appropriate. The intention here is to have the switch from AtomicSDNode to LoadSDNode/StoreSDNode be close to NFC; that is, I'm not looking to allow additional optimizations at this time. Sorry for the lack of tests. As discussed in the review, most of these are vector tests (for which atomicity is not well defined) and I couldn't figure out to exercise the anyextend cases which aren't vector specific. Differential Revision: https://reviews.llvm.org/D66322 llvm-svn: 371547
*	Revert [Windows] Disable TrapUnreachable for Win64, add SEH_NoReturn	Reid Kleckner	2019-09-03	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts r370525 (git commit 0bb1630685fba255fa93def92603f064c2ffd203) Also reverts r370543 (git commit 185ddc08eed6542781040b8499ef7ad15c8ae9f4) The approach I took only works for functions marked `noreturn`. In general, a call that is not known to be noreturn may be followed by unreachable for other reasons. For example, there could be multiple call sites to a function that throws sometimes, and at some call sites, it is known to always throw, so it is followed by unreachable. We need to insert an `int3` in these cases to pacify the Windows unwinder. I think this probably deserves its own standalone, Win64-only fixup pass that runs after block placement. Implementing that will take some time, so let's revert to TrapUnreachable in the mean time. llvm-svn: 370829
*	Fix SEH_NoReturn machine verifier error	Reid Kleckner	2019-08-30	1	-1/+1
\| \| \| \|	llvm-svn: 370543
*	[Windows] Disable TrapUnreachable for Win64, add SEH_NoReturn	Reid Kleckner	2019-08-30	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Users have complained llvm.trap produce two ud2 instructions on Win64, one for the trap, and one for unreachable. This change fixes that. TrapUnreachable was added and enabled for Win64 in r206684 (April 2014) to avoid poorly understood issues with the Windows unwinder. There seem to be two major things in play: - the unwinder - C++ EH, _CxxFrameHandler3 & co The unwinder disassembles forward from the return address to scan for epilogues. Inserting a ud2 had the effect of stopping the unwinder, and ensuring that it ran the EH personality function for the current frame. However, it's not clear what the unwinder does when the return address happens to be the last address of one function and the first address of the next function. The Visual C++ EH personality, _CxxFrameHandler3, needs to figure out what the current EH state number is. It does this by consulting the ip2state table, which maps from PC to state number. This seems to go wrong when the return address is the last PC of the function or catch funclet. I'm not sure precisely which system is involved here, but in order to address these real or hypothetical problems, I believe it is enough to insert int3 after a call site if it would otherwise be the last instruction in a function or funclet. I was able to reproduce some similar problems locally by arranging for a noreturn call to appear at the end of a catch block immediately before an unrelated function, and I confirmed that the problems go away when an extra trailing int3 instruction is added. MSVC inserts int3 after every noreturn function call, but I believe it's only necessary to do it if the call would be the last instruction. This change inserts a pseudo instruction that expands to int3 if it is in the last basic block of a function or funclet. I did what I could to run the Microsoft compiler EH tests, and the ones I was able to run showed no behavior difference before or after this change. Differential Revision: https://reviews.llvm.org/D66980 llvm-svn: 370525
*	[X86] Remove what little support we had for MPX	Craig Topper	2019-08-29	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-Deprecate -mmpx and -mno-mpx command line options -Remove CPUID detection of mpx for -march=native -Remove MPX from all CPUs -Remove MPX preprocessor define I've left the "mpx" string in the backend so we don't fail on old IR, but its not connected to anything. gcc has also deprecated these command line options. https://www.phoronix.com/scan.php?page=news_item&px=GCC-Patch-To-Drop-MPX Differential Revision: https://reviews.llvm.org/D66669 llvm-svn: 370393
*	[X86] Improve the diagnostic for larger than 4-bit immediate for ↵	Craig Topper	2019-08-10	1	-0/+1
\| \| \| \| \| \|	vpermil2pd/ps. Only allow MCConstantExprs. llvm-svn: 368505
*	[X86] Allow any 8-bit immediate to be used with bt/btc/btr/bts memory aliases.	Craig Topper	2019-08-07	1	-4/+4
\| \| \| \| \| \| \| \|	We have aliases that disambiguate memory forms of bt/btc/btr/bts without suffixes to the 32-bit form. These aliases should have been updated when the instructions were updated in r356413. llvm-svn: 368127
*	[X86] Limit vpermil2pd/vpermil2ps immediates to 4 bits in the assembly parser.	Craig Topper	2019-08-07	1	-0/+14
\| \| \| \| \| \| \| \| \| \|	The upper 4 bits of the immediate byte are used to encode a register. We need to limit the explicit immediate to fit in the remaining 4 bits. Fixes PR42899. llvm-svn: 368123
*	[X86] Add patterns with and_flag_nocf for BLSI and TBM instructions.	Craig Topper	2019-07-10	1	-6/+19
\| \| \| \| \| \|	Fixes similar issues to r352306. llvm-svn: 365705
*	[X86] Add VP2INTERSECT instructions	Pengfei Wang	2019-05-31	1	-0/+28
\| \| \| \| \| \| \| \| \| \|	Support Intel AVX512 VP2INTERSECT instructions in llvm Patch by Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D62366 llvm-svn: 362188
*	[X86] Add ENQCMD instructions	Pengfei Wang	2019-05-30	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \|	For more details about these instructions, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference. Patch by Tianqing Wang (tianqing) Differential Revision: https://reviews.llvm.org/D62281 llvm-svn: 362053
*	Enable AVX512_BF16 instructions, which are supported for BFLOAT16 in Cooper Lake	Luo, Yuanke	2019-05-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: 1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake; 2. Enable VCVTNE2PS2BF16, VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision. VCVTNE2PS2BF16: Convert Two Packed Single Data to One Packed BF16 Data. VCVTNEPS2BF16: Convert Packed Single Data to Packed BF16 Data. VDPBF16PS: Dot Product of BF16 Pairs Accumulated into Packed Single Precision. For more details about BF16 isa, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Author: LiuTianle Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, RKSimon, spatel Reviewed By: craig.topper Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60550 llvm-svn: 360017
*	[X86] Use (SUBREG_TO_REG (MOV32rm)) for extloadi64i8/extloadi64i16 when the ↵	Craig Topper	2019-04-07	1	-1/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	load is 4 byte aligned or better and not volatile. Summary: Previously we would use MOVZXrm8/MOVZXrm16, but those are longer encodings. This is similar to what we do in the loadi32 predicate. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60341 llvm-svn: 357875
*	[X86] Merge the different CMOV instructions for each condition code into ↵	Craig Topper	2019-04-05	1	-13/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	single instructions that store the condition code as an immediate. Summary: Reorder the condition code enum to match their encodings. Move it to MC layer so it can be used by the scheduler models. This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between CMOV instructions and condition codes. Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser. This does complicate the scheduler models a little since we can't assign the A and BE instructions to a separate class now. I plan to make similar changes for SETcc and Jcc. Reviewers: RKSimon, spatel, lebedev.ri, andreadb, courbet Reviewed By: RKSimon Subscribers: gchatelet, hiraditya, kristina, lebedev.ri, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60041 llvm-svn: 357800
*	[IR] Refactor attribute methods in Function class (NFC)	Evandro Menezes	2019-04-04	1	-5/+5
\| \| \| \| \| \| \| \|	Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731
*	[X86] Remove CustomInserters for RDPKRU/WRPKRU. Use some custom lowering and ↵	Craig Topper	2019-04-04	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	new ISD opcodes instead. These inserters inserted some instructions to zero some registers and copied from virtual registers to physical registers. This change instead inserts the zeros directly into the DAG at lowering time using new ISD opcodes that take the extra zeroes as inputs. The zeros will then go through isel on their own to select the MOV32r0 pseudo. Then we just need to mention the physical registers directly in the isel patterns and the isel table and InstrEmitter will take care of inserting the necessary copies to/from physical registers. llvm-svn: 357659
*	[X86] Remove CustomInserter pseudos for MONITOR/MONITORX/CLZERO. Use custom ↵	Craig Topper	2019-04-03	1	-21/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instruction selection instead. This custom inserter existed so we could do a weird thing where we pretended that the instructions support a full address mode instead of taking a pointer in EAX/RAX. I think was largely so we could be pointer size agnostic in the isel pattern. To make this work we would then put the address into an LEA into EAX/RAX in front of the instruction after isel. But the LEA is overkill when we just have a base pointer. So we end up using the LEA as a slower MOV instruction. With this change we now just do custom selection during isel instead and just assign the incoming address of the intrinsic into EAX/RAX based on its size. After the intrinsic is selected, we can let isel take care of selecting an LEA or other operation to do any address computation needed in this basic block. I've also split the instruction into a 32-bit mode version and a 64-bit mode version so the implicit use is properly sized based on the pointer. Without this we get comments in the assembly output about killing eax and defing rax or vice versa depending on whether we define the instruction to use EAX/RAX. llvm-svn: 357652
*	[X86] Classify the AVX512 rounding control operand as ↵	Craig Topper	2019-04-01	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	X86::OPERAND_ROUNDING_CONTROL instead of MCOI::OPERAND_IMMEDIATE. Add an assert on legal values of rounding control in the encoder and remove an explicit mask. This should allow llvm-exegesis to intelligently constrain the rounding mode. The mask in the encoder shouldn't be necessary any more. We used to allow codegen to use 8-11 for rounding mode and the assembler would use 0-3 to mean the same thing so we masked here and in the printer. Codegen now matches the assembler and the printer was updated, but I forgot to update the encoder. llvm-svn: 357419
*	Revert r356688 "[X86] Don't avoid folding multiple use sign extended 8-bit ↵	Craig Topper	2019-03-25	1	-0/+13
\| \| \| \| \| \| \| \|	immediate into instructions under optsize." Looking back over how the one use optimization works, I don't think this is the right way to fix this. llvm-svn: 356866
*	[X86] Don't avoid folding multiple use sign extended 8-bit immediate into ↵	Craig Topper	2019-03-21	1	-13/+0
\| \| \| \| \| \| \| \| \| \| \| \|	instructions under optsize. Under optsize we try to avoid folding immediates into instructions under optsize. But if the immediate is 16-bits or 32 bits, but can be encoded as an 8-bit immediate we don't save enough from disabling the folding unless the immediate has enough uses to make up for the size of the move which is either 3 bytes or 5 bytes since there are no sign extended 8-bit moves. We would also save something if the immediate was a live out of the basic block and thus a move was unavoidable, but that would require a more advanced heuristic than just counting uses. Note we only avoid folding multiple use immediates into the patterns that use X86ISD::ADD/SUB/XOR/OR/AND/CMP/ADC/SBB nodes and not the more common ISD::ADD/SUB/XOR/OR/AND nodes. Differential Revision: https://reviews.llvm.org/D59522 llvm-svn: 356688
*	[X86] Add CMPXCHG8B feature flag. Set it for all CPUs except i386/i486 ↵	Craig Topper	2019-03-20	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \|	including 'generic'. Disable use of CMPXCHG8B when this flag isn't set. CMPXCHG8B was introduced on i586/pentium generation. If its not enabled, limit the atomic width to 32 bits so the AtomicExpandPass will expand to lib calls. Unclear if we should be using a different limit for other configs. The default is 1024 and experimentation shows that using an i256 atomic will cause a crash in SelectionDAG. Differential Revision: https://reviews.llvm.org/D59576 llvm-svn: 356631
*	[X86] Remove X86 specific dag nodes for RDTSC/RDTSCP/RDPMC. NFCI	Andrea Di Biagio	2019-03-20	1	-7/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch removes the following dag node opcodes from namespace X86ISD: RDTSC_DAG, RDTSCP_DAG, RDPMC_DAG The logic that expands RDTSC/RDPMC/XGETBV intrinsics is basically the same. The only differences are: RDTSC/RDTSCP don't implicitly read ECX. RDTSCP also implicitly writes ECX. I moved the common expansion logic into a helper function with the goal to get rid of code repetition. That helper is now used for the expansion of RDTSC/RDTSCP/RDPMC/XGETBV intrinsics. No functional change intended. Differential Revision: https://reviews.llvm.org/D59547 llvm-svn: 356546
*	[X86] Re-disable cmpxchg16b for 32-bit mode assembly parsing.	Craig Topper	2019-03-19	1	-1/+2
\| \| \| \| \| \|	This was broken recently when I factored the 64 bit mode check into hasCmpxchg16 without thinking about the AssemblerPredicate. llvm-svn: 356531
*	[X86] Allow any 8-bit immediate to be used with BT/BTC/BTR/BTS not just sign ↵	Craig Topper	2019-03-18	1	-30/+46
\| \| \| \| \| \| \| \|	extended 8-bit immediates. We need to allow [128,255] in addition to [-128, 127] to match gas. llvm-svn: 356413
*	[X86] Replace uses of i64immSExt32_su with i64relocImmSExt32_su.	Craig Topper	2019-03-18	1	-4/+1
\| \| \| \| \| \|	For the i8, i16, and i32 instructions we were using a relocImm. Presumably we should for i64 as well. llvm-svn: 356406
*	[X86] Rename imm8_su/imm16_su/imm32_su to ↵	Craig Topper	2019-03-18	1	-6/+6
\| \| \| \| \| \|	relocImm8_su/relocImm16_su/relocImm32_su/ to accurately reflect what they are. llvm-svn: 356393
*	[X86] Remove the _alt forms of (V)CMP instructions. Use a combination of ↵	Craig Topper	2019-03-18	1	-10/+0
\| \| \| \| \| \| \| \| \| \|	custom printing and custom parsing to achieve the same result and more Similar to previous change done for VPCOM and VPCMP Differential Revision: https://reviews.llvm.org/D59468 llvm-svn: 356384
*	[X86] Merge printf32mem/printi32mem into a single printdwordmem. Do the same ↵	Craig Topper	2019-03-17	1	-32/+30
\| \| \| \| \| \| \| \| \| \| \| \|	for all other printing functions. The only thing the print methods currently need to know is the string to print for the memory size in intel syntax. This patch merges the functions based on this string. If we ever need something else in the future, its easy to split them back out. This reduces the number of cases in the assembly printers. It shrinks the intel printer to only use 7 bytes per instruction instead of 8. llvm-svn: 356352
*	[X86] Remove the _alt forms of AVX512 VPCMP instructions. Use a combination ↵	Craig Topper	2019-03-17	1	-5/+0
\| \| \| \| \| \| \| \| \| \|	of custom printing and custom parsing to achieve the same result and more Similar to the previous patch for VPCOM. Differential Revision: https://reviews.llvm.org/D59398 llvm-svn: 356344
*	[X86] Remove the _alt forms of XOP VPCOM instructions. Use a combination of ↵	Craig Topper	2019-03-17	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	custom printing and custom parsing to achieve the same result and more Previously we had a regular form of the instruction used when the immediate was 0-7. And _alt form that allowed the full 8 bit immediate. Codegen would always use the 0-7 form since the immediate was always checked to be in range. Assembly parsing would use the 0-7 form when a mnemonic like vpcomtrueb was used. If the immediate was specified directly the _alt form was used. The disassembler would prefer to use the 0-7 form instruction when the immediate was in range and the _alt form otherwise. This way disassembly would print the most readable form when possible. The assembly parsing for things like vpcomtrueb relied on splitting the mnemonic into 3 pieces. A "vpcom" prefix, an immediate representing the "true", and a suffix of "b". The tablegenerated printing code would similarly print a "vpcom" prefix, decode the immediate into a string, and then print "b". The _alt form on the other hand parsed and printed like any other instruction with no specialness. With this patch we drop to one form and solve the disassembly printing issue by doing custom printing when the immediate is 0-7. The parsing code has been tweaked to turn "vpcomtrueb" into "vpcomb" and then the immediate for the "true" is inserted either before or after the other operands depending on at&t or intel syntax. I'd rather not do the custom printing, but I tried using an InstAlias for each possible mnemonic for all 8 immediates for all 16 combinations of element size, signedness, and memory/register. The code emitted into printAliasInstr ended up checking the number of operands, the register class of each operand, and the immediate for all 256 aliases. This was repeated for both the at&t and intel printer. Despite a lot of common checks between all of the aliases, when compiled with clang at least this commonality was not well optimized. Nor do all the checks seem necessary. Since I want to do a similar thing for vcmpps/pd/ss/sd which have 32 immediate values and 3 encoding flavors, 3 register sizes, etc. This didn't seem to scale well for clang binary size. So custom printing seemed a better trade off. I also considered just using the InstAlias for the matching and not the printing. But that seemed like it would add a lot of extra rows to the matcher table. Especially given that the 32 immediates for vpcmpps have 46 strings associated with them. Differential Revision: https://reviews.llvm.org/D59398 llvm-svn: 356343
*	[X86] Check for 64-bit mode in X86Subtarget::hasCmpxchg16b()	Craig Topper	2019-03-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	The feature flag alone can't be trusted since it can be passed via -mattr. Need to ensure 64-bit mode as well. We had a 64 bit mode check on the instruction to make the assembler work correctly. But we weren't guarding any of our lowering code or the hooks for the AtomicExpandPass. I've added 32-bit command lines to atomic128.ll with and without cx16. The tests there would all previously fail if -mattr=cx16 was passed to them. I had to move one test case for f128 to a new file as it seems to have a different 32-bit mode or possibly sse issue. Differential Revision: https://reviews.llvm.org/D59308 llvm-svn: 356078
*	[X86] Print all register forms of x87 fadd/fsub/fdiv/fmul as having two ↵	Craig Topper	2019-02-04	1	-26/+26
\| \| \| \| \| \| \| \| \| \|	arguments where on is %st. All of these instructions consume one encoded register and the other register is %st. They either write the result to %st or the encoded register. Previously we printed both arguments when the encoded register was written. And we printed one argument when the result was written to %st. For the stack popping forms the encoded register is always the destination and we didn't print both operands. This was inconsistent with gcc and objdump and just makes the output assembly code harder to read. This patch changes things to always print both operands making us consistent with gcc and objdump. The parser should still be able to handle the single register forms just as it did before. This also matches the GNU assembler behavior. llvm-svn: 353061
*	[X86] Print %st(0) as %st when its implicit to the instruction. Continue ↵	Craig Topper	2019-02-04	1	-9/+9
\| \| \| \| \| \| \| \|	printing it as %st(0) when its encoded in the instruction. This is a step back from the change I made in r352985. This appears to be more consistent with gcc and objdump behavior. llvm-svn: 353015
*	Revert r352985 "[X86] Print %st(0) as %st to match what gcc inline asm uses ↵	Craig Topper	2019-02-04	1	-8/+8
\| \| \| \| \| \| \| \| \| \|	as the clobber name to make MS inline asm work correctly" Looking into gcc and objdump behavior more this was overly aggressive. If the register is encoded in the instruction we should print %st(0), if its implicit we should print %st. I'll be making a more directed change in a future patch. llvm-svn: 353013
*	[X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber ↵	Craig Topper	2019-02-03	1	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	name to make MS inline asm work correctly Summary: When calculating clobbers for MS style inline assembly we fail if the asm clobbers stack top because we print st(0) and try to pass it through the gcc register name check. This was found with when I attempted to make a emms/femms clobber all ST registers. If you use emms/femms in MS inline asm we would try to use st(0) as the clobber name but clang would think that wasn't a valid clobber name. This also matches what objdump disassembly prints. It's also what is printed by gcc -S. Reviewers: RKSimon, rnk, efriedma, spatel, andreadb, lebedev.ri Reviewed By: rnk Subscribers: eraman, gbedwell, lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D57621 llvm-svn: 352985
*	[X86] Remove a couple places where we unnecessarily pass 0 to the ↵	Craig Topper	2019-01-30	1	-4/+4
\| \| \| \| \| \| \| \| \| \|	EmitPriority of some FP instruction aliases. NFC As far as I can tell we already won't emit these aliases due to an operand count check in the tablegen code. Removing these because I couldn't make sense of the inconsistency between fadd and fmul from reading the code. I checked the AsmMatcher and AsmWriter files before and after this change and there were no differences. llvm-svn: 352608
*	[X86] Add some missing blsr patterns	Gabor Buella	2019-01-27	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The add+and sequence followed by a branch can happen e.g. when looping over the set bits of an integer: ``` while (x != 0) { func(x & ~x); x &= x - 1; } ``` Reviewed By: ctopper Differential Revision: https://reviews.llvm.org/D57296 llvm-svn: 352306
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	[X86] Remove X86ISD::INC/DEC. Just select them from X86ISD::ADD/SUB at isel time	Craig Topper	2019-01-02	1	-9/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	INC/DEC are pretty much the same as ADD/SUB except that they don't update the C flag. This patch removes the special nodes and just pattern matches from ADD/SUB during isel if the C flag isn't being used. I had to avoid selecting DEC is the result isn't used. This will become a SUB immediate which will turned into a CMP later by optimizeCompareInstr. This lead to the one test change where we use a CMP instead of a DEC for an overflow intrinsic since we only checked the flag. This also exposed a hole in our RMW flag matching use of hasNoCarryFlagUses. Our root node for the match is a store and there's no guarantee that all the flag users have been selected yet. So hasNoCarryFlagUses needs to check copyToReg and machine opcodes, but it also needs to check for the pre-match SETCC, SETCC_CARRY, BRCOND, and CMOV opcodes. Differential Revision: https://reviews.llvm.org/D55975 llvm-svn: 350245
*	[X86] Add isel patterns to match BMI/TBMI instructions when lowering has ↵	Craig Topper	2018-12-21	1	-0/+57
\| \| \| \| \| \| \| \| \| \| \| \|	turned the root nodes into one of the flag producing binops. This fixes the patterns that have or/and as a root. 'and' is handled differently since thy usually have a CMP wrapped around them. I had to look for uses of the CF flag because all these nodes have non-standard CF flag behavior. A real or/xor would always clear CF. In practice we shouldn't be using the CF flag from these nodes as far as I know. Differential Revision: https://reviews.llvm.org/D55813 llvm-svn: 349962
*	[SelectionDAG] Move (repeated) SDTIntShiftDOp double shift node def to ↵	Simon Pilgrim	2018-11-16	1	-4/+0
\| \| \| \| \| \| \| \|	common code. NFCI. Prep work for PR39467. llvm-svn: 347067
*	Bias physical register immediate assignments	Nirav Dave	2018-11-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The machine scheduler currently biases register copies to/from physical registers to be closer to their point of use / def to minimize their live ranges. This change extends this to also physical register assignments from immediate values. This causes a reduction in reduction in overall register pressure and minor reduction in spills and indirectly fixes an out-of-registers assertion (PR39391). Most test changes are from minor instruction reorderings and register name selection changes and direct consequences of that. Reviewers: MatzeB, qcolombet, myatsina, pcc Subscribers: nemanjai, jvesely, nhaehnle, eraman, hiraditya, javed.absar, arphaman, jfb, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D54218 llvm-svn: 346894