bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[X86] Move GISel accessor initialization from TargetMachine to Subtarget.	Quentin Colombet	2017-07-01	1	-0/+55
\| \| \| \| \| \|	NFC llvm-svn: 306921
*	Sort the remaining #include lines in include/... and lib/....	Chandler Carruth	2017-06-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
*	[X86] Adding vpopcntd and vpopcntq instructions	Oren Ben Simhon	2017-05-25	1	-0/+1
\| \| \| \| \| \| \| \| \|	AVX512_VPOPCNTDQ is a new feature set that was published by Intel. The patch represents the LLVM side of the addition of two new intrinsic based instructions (vpopcntd and vpopcntq). Differential Revision: https://reviews.llvm.org/D33169 llvm-svn: 303858
*	[globalisel][tablegen] Demote OptForSize/OptForMinSize/ForCodeSize to ↵	Daniel Sanders	2017-05-19	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	per-function predicates. Summary: This causes them to be re-computed more often than necessary but resolves objections that were raised post-commit on r301750. Reviewers: qcolombet, ab, t.p.northover, rovka, kristof.beyls Reviewed By: qcolombet Subscribers: igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D32861 llvm-svn: 303418
*	[X86] Replace slow LEA instructions in X86	Lama Saba	2017-05-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	According to Intel's Optimization Reference Manual for SNB+: " For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must dispatch via port 1: - LEA that has all three source operands: base, index, and offset - LEA that uses base and index registers where the base is EBP, RBP,or R13 - LEA that uses RIP relative addressing mode - LEA that uses 16-bit addressing mode " This patch currently handles the first 2 cases only. Differential Revision: https://reviews.llvm.org/D32277 llvm-svn: 303333
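The two cases this patch handles can be sketched as a small predicate. This is only an illustration under stated assumptions: `Reg`, `LeaOperand` and `isSlowLea` are hypothetical names, not the backend's own, and a zero displacement is treated as "no offset" for simplicity.

```cpp
#include <cstdint>

// Hypothetical, simplified view of an LEA address operand. The real backend
// inspects machine instruction operands rather than a struct like this.
enum class Reg { None, RAX, RBX, RCX, EBP, RBP, R13 };

struct LeaOperand {
  Reg Base;
  Reg Index;
  std::int32_t Offset;
};

// Returns true for the two slow forms named in the commit message:
//  1. base, index and offset are all present;
//  2. base and index are present and the base is EBP, RBP or R13.
bool isSlowLea(const LeaOperand &Op) {
  const bool HasBase = Op.Base != Reg::None;
  const bool HasIndex = Op.Index != Reg::None;
  const bool HasOffset = Op.Offset != 0; // assumption: 0 means "no offset"
  const bool SlowBase =
      Op.Base == Reg::EBP || Op.Base == Reg::RBP || Op.Base == Reg::R13;
  return (HasBase && HasIndex && HasOffset) || (HasIndex && SlowBase);
}
```

The RIP-relative and 16-bit addressing cases from the quoted manual text are deliberately left out, matching the scope stated in the message.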
*	[X86] Disabling PLT in Regcall CC Functions	Oren Ben Simhon	2017-05-04	1	-2/+8
\| \| \| \| \| \| \| \| \| \|	According to psABI, PLT stub clobbers XMM8-XMM15. In Regcall calling convention those registers are used for passing parameters. Thus we need to prevent lazy binding in Regcall. Differential Revision: https://reviews.llvm.org/D32430 llvm-svn: 302124
*	[X86][LWP] Add llvm support for LWP instructions (reapplied).	Simon Pilgrim	2017-05-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	This patch adds support for the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4). Reapplied - this time without changing line endings of existing files. Differential Revision: https://reviews.llvm.org/D32769 llvm-svn: 302041
*	Revert rL302028 due to accidental line ending changes.	Simon Pilgrim	2017-05-03	1	-1/+0
\| \| \| \|	llvm-svn: 302038
*	[X86][LWP] Add llvm support for LWP instructions.	Simon Pilgrim	2017-05-03	1	-0/+1
\| \| \| \| \| \| \| \|	This patch adds support for the LightWeight Profiling (LWP) instructions which are available on all AMD Bulldozer class CPUs (bdver1 to bdver4). Differential Revision: https://reviews.llvm.org/D32769 llvm-svn: 302028
*	X86: initialize a few subtarget variables.	Tim Northover	2017-05-01	1	-0/+3
\| \| \| \| \| \|	Otherwise an indeterminate value gets read, causing a bunch of UBSan failures. llvm-svn: 301819
*	[globalisel][tablegen] Compute available feature bits correctly.	Daniel Sanders	2017-04-29	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Predicate<> now has a field to indicate how often it must be recomputed. Currently, there are two frequencies, per-module (RecomputePerFunction==0) and per-function (RecomputePerFunction==1). Per-function predicates are currently recomputed more frequently than necessary since the only predicate in this category is cheap to test. Per-module predicates are now computed in getSubtargetImpl() while per-function predicates are computed in selectImpl(). Tablegen now manages the PredicateBitset internally. It should only be necessary to add the required includes. Also fixed a problem revealed by the test case where constrainSelectedInstRegOperands() would attempt to tie operands that BuildMI had already tied. Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar Reviewed By: rovka Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D32491 llvm-svn: 301750
*	Rename FastString flag.	Clement Courbet	2017-04-21	1	-1/+1
\| \| \| \|	llvm-svn: 300959
*	X86 memcpy: use REPMOVSB instead of REPMOVS{Q,D,W} for inline copies	Clement Courbet	2017-04-21	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	when the subtarget has fast strings. This has two advantages: - Speed is improved. For example, on Haswell throughput improvements increase linearly with size from 256 to 512 bytes, after which they plateau: (e.g. 1% for 260 bytes, 25% for 400 bytes, 40% for 508 bytes). - Code is much smaller (no need to handle boundaries). llvm-svn: 300957
*	[X86] Generate VZEROUPPER for Skylake-avx512.	Amjad Aboud	2017-03-03	1	-1/+1
\| \| \| \| \| \| \| \|	VZEROUPPER should not be issued on Knights Landing (KNL), but on Skylake-avx512 it should be. Differential Revision: https://reviews.llvm.org/D29874 llvm-svn: 296859
*	[X86] Use SHLD with both inputs from the same register to implement rotate ↵	Craig Topper	2017-02-21	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	on Sandy Bridge and later Intel CPUs Summary: Sandy Bridge and later CPUs have better throughput using a SHLD to implement rotate versus the normal rotate instructions. Additionally it saves one uop and avoids a partial flag update dependency. This patch implements this change on any Sandy Bridge or later processor without BMI2 instructions. With BMI2 we will use RORX as we currently do. Reviewers: zvi Reviewed By: zvi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D30181 llvm-svn: 295697
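Why SHLD with both inputs from the same register is a rotate can be seen in a scalar model. The sketch below is not backend code: `shld32` and `rotl32` are hypothetical helper names, and the model only covers shift counts 1 to 31.

```cpp
#include <cassert>
#include <cstdint>

// Model of 32-bit SHLD: shift Dst left by N and fill the vacated low bits
// with the top N bits of Src.
std::uint32_t shld32(std::uint32_t Dst, std::uint32_t Src, unsigned N) {
  assert(N > 0 && N < 32 && "this model only covers counts 1..31");
  return (Dst << N) | (Src >> (32 - N));
}

// Reference rotate-left, written the usual way.
std::uint32_t rotl32(std::uint32_t X, unsigned N) {
  assert(N > 0 && N < 32 && "this model only covers counts 1..31");
  return (X << N) | (X >> (32 - N));
}
```

With `Dst == Src`, the bits shifted out of the top re-enter at the bottom, so `shld32(X, X, N)` equals `rotl32(X, N)` for every count in range.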
*	[X86] Remove the HLE feature flag.	Craig Topper	2017-02-09	1	-1/+0
\| \| \| \| \| \|	We only implemented it for one of the 3 HLE instructions and that instruction is also under the RTM flag. Clang only implements the RTM flag from its command line. llvm-svn: 294562
*	[X86] Clzero intrinsic and its addition under znver1	Craig Topper	2017-02-09	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch does the following. 1. Adds an Intrinsic int_x86_clzero which works with __builtin_ia32_clzero 2. Identifies clzero feature using cpuid info. (Function:8000_0008, Checks if EBX[0]=1) 3. Adds the clzero feature under znver1 architecture. 4. The custom inserter is added in Lowering. 5. A testcase is added to check the intrinsic. 6. The clzero instruction is added to assembler test. Patch by Ganesh Gopalasubramanian with a couple formatting tweaks, a disassembler test, and using update_llc_test.py from me. Differential revision: https://reviews.llvm.org/D29385 llvm-svn: 294558
*	[X86] Fix some Clang-tidy modernize and Include What You Use warnings; other ↵	Eugene Zelenko	2017-02-02	1	-6/+7
\| \| \| \| \| \|	minor fixes (NFC). llvm-svn: 293949
*	X86: Produce @ABS8 symbol modifiers for absolute symbols in range [0,128).	Peter Collingbourne	2017-02-02	1	-2/+12
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D28689 llvm-svn: 293844
*	Remove an overeager assert from r288844.	Joerg Sonnenberger	2017-01-17	1	-3/+0
\| \| \| \|	llvm-svn: 292244
*	IR, X86: Understand !absolute_symbol metadata on global variables.	Peter Collingbourne	2016-12-08	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Attaching !absolute_symbol to a global variable does two things: 1) Marks it as an absolute symbol reference. 2) Specifies the value range of that symbol's address. Teach the X86 backend to allow absolute symbols to appear in place of immediates by extending the relocImm and mov64imm32 matchers. Start using relocImm in more places where it is legal. As previously proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-October/105800.html Differential Revision: https://reviews.llvm.org/D25878 llvm-svn: 289087
*	[X86] Prefer reduced width multiplication over pmulld on Silvermont	Zvi Rackover	2016-12-06	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Prefer expansions such as: pmullw,pmulhw,unpacklwd,unpackhwd over pmulld. On Silvermont [source: Optimization Reference Manual]: PMULLD has a throughput of 1/11 [instruction/cycles]. PMULHUW/PMULHW/PMULLW have a throughput of 1/2 [instruction/cycles]. Fixes pr31202. Analysis of this issue was done by Fahana Aleen. Reviewers: wmi, delena, mkuper Subscribers: RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D27203 llvm-svn: 288844
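A per-lane scalar model of the reduced-width expansion, assuming both operands fit in 16 bits: PMULLW yields the low half of the product, PMULHW the high half, and the unpack step interleaves them back into a 32-bit lane. The function names here are hypothetical and the code does not use the actual vector instructions.

```cpp
#include <cstdint>

// Low 16 bits of the signed 16x16 product (one lane of PMULLW).
std::uint16_t mulLo16(std::int16_t A, std::int16_t B) {
  const std::int32_t Product = static_cast<std::int32_t>(A) * B;
  return static_cast<std::uint16_t>(static_cast<std::uint32_t>(Product));
}

// High 16 bits of the signed 16x16 product (one lane of PMULHW).
std::uint16_t mulHi16(std::int16_t A, std::int16_t B) {
  const std::int32_t Product = static_cast<std::int32_t>(A) * B;
  return static_cast<std::uint16_t>(static_cast<std::uint32_t>(Product) >> 16);
}

// Bit pattern of the 32-bit product, rebuilt from the two halves the way
// the unpack instructions interleave low and high words.
std::uint32_t mul32Via16(std::int16_t A, std::int16_t B) {
  return (static_cast<std::uint32_t>(mulHi16(A, B)) << 16) | mulLo16(A, B);
}
```

The result is returned as an unsigned bit pattern so the sketch avoids implementation-defined signed conversions; it matches the two's-complement encoding of the 32-bit product.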
*	[X86][GlobalISel] Add minimal call lowering support to the IRTranslator	Zvi Rackover	2016-11-15	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Add basic functionality to support call lowering for X86. Currently only supports functions which return void and take zero arguments. Inspired by commit 286573. Reviewers: ab, qcolombet, t.p.northover Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26593 llvm-svn: 286935
*	[X86] Take advantage of the lzcnt instruction on btver2 architectures when ↵	Pierre Gousseau	2016-10-14	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ORing comparisons to zero. This change adds transformations such as: zext(or(setcc(eq, (cmp x, 0)), setcc(eq, (cmp y, 0)))) To: srl(or(ctlz(x), ctlz(y)), log2(bitsize(x))) This optimisation is beneficial on Jaguar architecture only, where lzcnt has a good reciprocal throughput. Other architectures such as Intel's Haswell/Broadwell or AMD's Bulldozer/PileDriver do not benefit from it. For this reason the change also adds a "HasFastLZCNT" feature which gets enabled for Jaguar. Differential Revision: https://reviews.llvm.org/D23446 llvm-svn: 284248
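The equivalence behind this transformation can be checked with a scalar model for 32-bit values. It relies on lzcnt returning 32 for a zero input, so bit 5 of the count is set only when the input is zero. The helper names are hypothetical, and `lzcnt32` is a portable loop rather than the hardware instruction.

```cpp
#include <cstdint>

// Portable model of 32-bit LZCNT: number of leading zero bits, 32 for X == 0.
unsigned lzcnt32(std::uint32_t X) {
  unsigned N = 0;
  for (std::uint32_t Mask = 0x80000000u; Mask != 0 && (X & Mask) == 0;
       Mask >>= 1)
    ++N;
  return N;
}

// Original form: zext(or(x == 0, y == 0)).
unsigned eitherIsZeroCmp(std::uint32_t X, std::uint32_t Y) {
  return (X == 0 || Y == 0) ? 1u : 0u;
}

// Transformed form: srl(or(ctlz(x), ctlz(y)), log2(32)).
unsigned eitherIsZeroLzcnt(std::uint32_t X, std::uint32_t Y) {
  return (lzcnt32(X) | lzcnt32(Y)) >> 5;
}
```

For non-zero inputs both counts are at most 31, so the OR stays below 32 and the shift yields 0.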
*	[X86] Heuristic to selectively build Newton-Raphson SQRT estimation	Nikolai Bozhenov	2016-08-04	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On modern Intel processors hardware SQRT in many cases is faster than RSQRT followed by Newton-Raphson refinement. The patch introduces a simple heuristic to choose between hardware SQRT instruction and Newton-Raphson software estimation. The patch treats scalars and vectors differently. The heuristic is that for scalars the compiler should optimize for latency while for vectors it should optimize for throughput. It is based on the assumption that throughput bound code is likely to be vectorized. Basically, the patch disables scalar NR for big cores and disables NR completely for Skylake. Firstly, scalar SQRT has shorter latency than NR code in big cores. Secondly, vector SQRT has been greatly improved in Skylake and has better throughput compared to NR. Differential Revision: https://reviews.llvm.org/D21379 llvm-svn: 277725
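For reference, the software path that the heuristic weighs against the hardware SQRT looks roughly like this: start from a reciprocal square root estimate, apply one Newton-Raphson step, then multiply by x. This is a sketch, not the code the backend emits; the function names are hypothetical and the estimate is passed in as a parameter to stand in for the RSQRT instruction.

```cpp
#include <cmath>

// One Newton-Raphson step for y ~ 1/sqrt(x): y' = y * (1.5 - 0.5 * x * y * y).
float refineRsqrt(float X, float Y) {
  return Y * (1.5f - 0.5f * X * Y * Y);
}

// sqrt(x) computed as x * rsqrt(x), using the refined estimate.
// Estimate is a stand-in for the hardware's approximate RSQRT result.
float sqrtViaEstimate(float X, float Estimate) {
  return X * refineRsqrt(X, Estimate);
}
```

Each refinement costs several dependent multiplies, which is why the commit treats latency-bound scalar code differently from throughput-bound vector code.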
*	Delete unused includes. NFC.	Rafael Espindola	2016-06-30	1	-1/+0
\| \| \| \|	llvm-svn: 274225
*	Drop support for creating $stubs.	Rafael Espindola	2016-06-29	1	-7/+0
\| \| \| \| \| \|	They are created by ld64 since OS X 10.5. llvm-svn: 274130
*	Move shouldAssumeDSOLocal to Target.	Rafael Espindola	2016-06-27	1	-3/+2
\| \| \| \| \| \|	Should fix the shared library build. llvm-svn: 273958
*	Simplify PICStyles.	Rafael Espindola	2016-06-20	1	-14/+6
\| \| \| \| \| \| \| \|	The main difference is that StubDynamicNoPIC is gone. The dynamic-no-pic mode as the name implies is simply not pic. It is just conservative about what it assumes to be dso local. llvm-svn: 273222
*	[X86Subtarget] Use isPositionIndependent(). NFC.	Davide Italiano	2016-06-18	1	-3/+3
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D21480 llvm-svn: 273071
*	Use shouldAssumeDSOLocal on AArch64.	Rafael Espindola	2016-05-26	1	-43/+1
\| \| \| \| \| \|	This reduces code duplication and now AArch64 also handles PIE. llvm-svn: 270844
*	Fix shouldAssumeDSOLocal for private linkage.	Rafael Espindola	2016-05-25	1	-1/+1
\| \| \| \|	llvm-svn: 270746
*	[X86] Reduce memory allocations in X86TargetMachine::getSubtargetImpl	David Majnemer	2016-05-20	1	-2/+2
\| \| \| \| \| \| \| \|	We performed a number of memory allocations each time getTTI was called, remove them by using SmallString. No functionality change intended. llvm-svn: 270246
*	Refactor X86 symbol access classification.	Rafael Espindola	2016-05-20	1	-100/+110
\| \| \| \| \| \| \| \| \| \| \| \|	This refactors the logic in X86 to avoid code duplication. It also splits it in two steps: it first decides if a symbol is local to the DSO and then uses that information to decide how to access it. The first part is implemented by shouldAssumeDSOLocal. It is not in any way specific to X86. In a followup patch I intend to move it to somewhere common and reuse it in other backends. llvm-svn: 270209
*	Record a TargetMachine instead of a Reloc::Model.	Rafael Espindola	2016-05-19	1	-6/+7
\| \| \| \| \| \|	Addresses r270095's code review. llvm-svn: 270147
*	Remember the relocation model. NFC.	Rafael Espindola	2016-05-19	1	-9/+8
\| \| \| \| \| \|	This avoids passing a TargetMachine in a few places. llvm-svn: 270095
*	Style fixes. NFC.	Rafael Espindola	2016-05-19	1	-3/+3
\| \| \| \|	llvm-svn: 270093
*	Add new flag and intrinsic support for MWAITX and MONITORX instructions	Ashutosh Nema	2016-05-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT pair while adding a timer function, such that another termination of the MWAITX instruction occurs when the timer expires. The presence of the MONITORX and MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29. The MONITORX and MWAITX instructions are intercepted by the same bits that intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be monitored. MWAITX instruction causes the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events. Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is "0F 01 FB". This opcode information is used in adding tests for the disassembler. These instructions are enabled for AMD's bdver4 architecture. Patch by Ganesh Gopalasubramanian! Reviewers: echristo, craig.topper, RKSimon Subscribers: RKSimon, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D19795 llvm-svn: 269911
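The feature test described above reduces to a single bit check. In this sketch the caller is assumed to have already executed CPUID with leaf 0x80000001 and to pass in the resulting ECX value; `hasMwaitx` is a hypothetical name, not a function from the patch.

```cpp
#include <cstdint>

// Ecx is the ECX register value returned by CPUID leaf 0x80000001.
// Bit 29 indicates the presence of MONITORX and MWAITX.
bool hasMwaitx(std::uint32_t Ecx) {
  return ((Ecx >> 29) & 1u) != 0;
}
```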
*	Simplify handling of hidden stub.	Rafael Espindola	2016-05-17	1	-3/+3
\| \| \| \| \| \| \| \| \|	Since r207518 they are printed exactly like non-hidden stubs on x86 and since r207517 on ARM. This means we can use a single set for all stubs in those platforms. llvm-svn: 269776
*	[X86] Extend some Linux special cases to cover kFreeBSD.	Marcin Koscielnicki	2016-05-05	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Both Linux and kFreeBSD use glibc, so they follow similar code paths. Add isTargetGlibc to check for this, and use it instead of isTargetLinux in a few places. Fixes PR22248 for kFreeBSD. Differential Revision: http://reviews.llvm.org/D19104 llvm-svn: 268624
*	Differential Revision: http://reviews.llvm.org/D19733	Sriraman Tallam	2016-04-29	1	-2/+1
\| \| \| \|	llvm-svn: 268106
*	Differential Revision: http://reviews.llvm.org/D19040	Sriraman Tallam	2016-04-22	1	-4/+11
\| \| \| \|	llvm-svn: 267229
*	[X86] enable PIE for functions	Asaf Badouh	2016-04-20	1	-0/+29
\| \| \| \| \| \| \| \|	Call locally defined function directly for PIE/fPIE Differential Revision: http://reviews.llvm.org/D19226 llvm-svn: 266863
*	Test commit.	Sriraman Tallam	2016-04-11	1	-1/+1
\| \| \| \|	llvm-svn: 265976
*	[X86] Introduction of FeatureX87.	Andrey Turetskiy	2016-03-23	1	-0/+1
\| \| \| \| \| \| \| \|	Add FeatureX87 in X86 backend to be able to define CPUs which don't have x87. Differential Revision: http://reviews.llvm.org/D13979 llvm-svn: 264148
*	Disable the vzeroupper insertion pass on PS4.	Yunzhong Gao	2016-02-12	1	-0/+1
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D16837 llvm-svn: 260764
*	Added Skylake client to X86 targets and features	Elena Demikhovsky	2016-01-24	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Changes in X86.td: I set features of Intel processors in incremental form: IVB = SNB + X, HSW = IVB + X, and so on. I added the Skylake client processor and defined its features. FeatureADX was missing on KNL. Added some new features to the appropriate processors: SMAP, IFMA, PREFETCHWT1, VMFUNC and others. Differential Revision: http://reviews.llvm.org/D16357 llvm-svn: 258659
*	[AVX512] adding AVXVBMI feature flag	Michael Zuckerman	2016-01-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	The feature flag is for the VPERMB, VPERMI2B, VPERMT2B and VPMULTISHIFTQB instructions. More about these instructions can be found in: https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf Differential Revision: http://reviews.llvm.org/D16190 llvm-svn: 258012
*	[x86] adding PKU feature flag	Asaf Badouh	2015-12-15	1	-0/+1
\| \| \| \| \| \| \| \| \|	The feature flag is essential for the RDPKRU and WRPKRU instructions. More about these instructions can be found in the SDM rev 56, vol 2, from http://www.intel.com/sdm Differential Revision: http://reviews.llvm.org/D15491 llvm-svn: 255644
*	X86: Don't emit SAHF/LAHF for 64-bit targets unless explicitly supported	Hans Wennborg	2015-12-04	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These instructions are not supported by all CPUs in 64-bit mode. Emitting them causes Chromium to crash on start-up for users with such chips. (GCC puts these instructions behind -msahf on 64-bit for the same reason.) This patch adds FeatureLAHFSAHF, enables it by default for 32-bit targets and modern CPUs, and changes X86InstrInfo::copyPhysReg back to the lowering from before r244503 when the instructions are not available. Differential Revision: http://reviews.llvm.org/D15240 llvm-svn: 254793