bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[AMDGPU] Implemented fma cost analysis	Stanislav Mekhanoshin	2019-12-18	1	-0/+10
	Differential Revision: https://reviews.llvm.org/D71676
*	[ARM] Teach the Arm cost model that a Shift can be folded into other instructions	David Green	2019-12-09	1	-6/+7
	This attempts to teach the cost model in Arm that code such as:
	  %s = shl i32 %a, 3
	  %a = and i32 %s, %b
	Can under Arm or Thumb2 become:
	  and r0, r1, r2, lsl #3
	So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost. We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instructions that the shift can be folded into includes ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp.
	Differential Revision: https://reviews.llvm.org/D70966
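The folding rule described in this message can be sketched in isolation. This is a minimal standalone illustration, not the code from D70966: the `Opcode` enum and the `canFoldShiftInto` and `shiftCost` helpers are hypothetical names, whereas the real change inspects the LLVM instruction passed to getArithmeticInstrCost.

```cpp
// Hypothetical stand-in for the IR opcodes named in the commit message.
enum class Opcode { Add, Sub, And, Or, Xor, ICmp, Mul, Shl };

// True when the user is one of the IR operations the message lists as able
// to absorb a shifted operand (Add, Sub, And, Or, Xor and ICmp).
constexpr bool canFoldShiftInto(Opcode User) {
  return User == Opcode::Add || User == Opcode::Sub || User == Opcode::And ||
         User == Opcode::Or || User == Opcode::Xor || User == Opcode::ICmp;
}

// Cost of a shift: free when it has exactly one user and that user can fold
// it, otherwise the ordinary cost supplied by the caller.
constexpr int shiftCost(unsigned NumUsers, Opcode OnlyUser, int BaseCost) {
  return (NumUsers == 1 && canFoldShiftInto(OnlyUser)) ? 0 : BaseCost;
}
```

Under these assumptions a shift feeding a single `and` is free, while a shift with two users, or one feeding a multiply, keeps its base cost.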
*	AMDGPU: Refactor treatment of denormal mode	Matt Arsenault	2019-11-19	1	-2/+4
	Start moving towards treating this as a property of the calling convention, and not the subtarget. The default denormal mode should not be part of the subtarget, and should be moved into a separate function attribute. This patch is still NFC. The denormal mode remains as a subtarget feature for now, but this makes the necessary changes to switch to using an attribute.
*	[AMDGPU] Tune inlining parameters for AMDGPU target (part 2)	dfukalov	2019-11-19	1	-1/+1
	Summary: Most IR instructions got better code size estimates after commit 47a5c36b, so the default parameter values should be updated to improve inlining and unrolling for the target. Reviewers: rampitec, arsenm Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70391
*	[AMDGPU] Improve code size cost model	Daniil Fukalov	2019-10-17	1	-1/+2
	Summary: Added estimation for zero size insertelement, extractelement and llvm.fabs operators. Updated inline/unroll parameters default values. Reviewers: rampitec, arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68881 llvm-svn: 375109
*	[System Model] [TTI] Define AMDGPUTTIImpl::getST and AMDGPUTTIImpl::getTLI	Vitaly Buka	2019-10-09	1	-2/+10
	To fix "infinite recursion" warning. llvm-svn: 374222
*	InferAddressSpaces: Move target intrinsic handling to TTI	Matt Arsenault	2019-08-14	1	-0/+5
	I'm planning on handling intrinsics that will benefit from checking the address space enums. Don't bother moving the address collection for now, since those won't need the enums. llvm-svn: 368895
*	[AMDGPU] Tune inlining parameters for AMDGPU target	Daniil Fukalov	2019-07-17	1	-1/+3
	Summary: Since the target gains no significant advantage from vectorization, the vector instructions threshold bonus should be optional. The amdgpu-inline-arg-alloca-cost parameter default value and the target InliningThresholdMultiplier value are tuned accordingly. Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64642 llvm-svn: 366348
*	AMDGPU: Ignore subtarget for InferAddressSpaces	Matt Arsenault	2019-06-17	1	-2/+1
	Even if the target doesn't have flat instructions, addrspace(0) is still flat. It just happens to not work. llvm-svn: 363561
*	AMDGPU: Assume ECC is enabled by default if supported	Matt Arsenault	2019-04-03	1	-0/+4
	The test should really be checking for the property directly in the code object headers, but there are problems with this. I don't see this directly represented in the text form, and for the binary emission this is depending on a function level subtarget feature to emit a global flag. llvm-svn: 357558
*	AMDGPU: Remove debugger related subtarget features	Matt Arsenault	2019-02-21	1	-2/+0
	As far as I know these aren't needed anymore. llvm-svn: 354634
*	AMDGPU: Ignore CodeObjectV3 when inlining	Matt Arsenault	2019-02-12	1	-0/+1
	This was inhibiting inlining of library functions when clang was invoking the inliner directly. This is covering a bit of a mess with subtarget feature handling, and this shouldn't be a subtarget feature. The behavior is different depending on whether you are using a -mattr flag in clang, or llc, opt. llvm-svn: 353899
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	AMDGPU: Remove remnants of old address space mapping	Matt Arsenault	2018-08-31	1	-1/+1
	llvm-svn: 341165
*	AMDGPU: Refactor Subtarget classes	Tom Stellard	2018-07-11	1	-3/+3
	Summary: This is a follow-up to r335942. - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget - Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation. Reviewers: arsenm, jvesely Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D49037 llvm-svn: 336851
*	AMDGPU: Separate R600 and GCN TableGen files	Tom Stellard	2018-06-28	1	-11/+6
	Summary: We now have two sets of generated TableGen files, one for R600 and one for GCN, so each sub-target now has its own tables of instructions, registers, ISel patterns, etc. This should help reduce compile time since each sub-target now only has to consider information that is specific to itself. This will also help prevent the R600 sub-target from slowing down new features for GCN, like disassembler support, GlobalISel, etc. Reviewers: arsenm, nhaehnle, jvesely Reviewed By: arsenm Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46365 llvm-svn: 335942
*	AMDGPU: Remove ability to reserve VGPRs for debugger	Konstantin Zhuravlyov	2018-06-21	1	-1/+0
	Differential Revision: https://reviews.llvm.org/D48234 llvm-svn: 335288
*	AMDGPU: Split AMDGPUTTI into GCNTTI and R600TTI	Tom Stellard	2018-05-30	1	-1/+65
	Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D47359 llvm-svn: 333605
*	AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers	Tom Stellard	2018-05-22	1	-0/+1
	Summary: MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instruction and register definitions, which are huge so we only want to include them where needed. This will also make it easier if we want to split the R600 and GCN definitions into separate tablegenerated files. I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h because it uses some enums from the header to initialize default values for the SIMachineFunction class, so I ended up having to remove includes of SIMachineFunctionInfo.h from headers too. Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46272 llvm-svn: 332930
*	[AMDGPU] Support horizontal vectorization of min/max.	Farhana Aleen	2018-05-09	1	-0/+3
	Author: FarhanaAleen Reviewed By: rampitec Subscribers: AMDGPU Differential Revision: https://reviews.llvm.org/D46604 llvm-svn: 331920
*	[AMDGPU] Support horizontal vectorization.	Farhana Aleen	2018-05-01	1	-0/+4
	Author: FarhanaAleen Reviewed By: rampitec, arsenm Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D46213 llvm-svn: 331313
*	[AMDGPU] Increased vector length for global/constant loads.	Farhana Aleen	2018-03-07	1	-0/+6
	Summary: GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache; loadstoreVectorizer should take advantage of the wider vector length and pack 16/8 elements of dwords/quadwords. Author: FarhanaAleen Reviewed By: rampitec Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D44179 llvm-svn: 326910
*	Revert "[AMDGPU] Widened vector length for global/constant address space."	Farhana Aleen	2018-03-07	1	-6/+0
	This reverts commit ce988cc100dc65e7c6c727aff31ceb99231cab03. llvm-svn: 326907
*	[AMDGPU] Widened vector length for global/constant address space.	Farhana Aleen	2018-03-07	1	-0/+6
	llvm-svn: 326904
*	Revert "[AMDGPU] Increased vector length for global/constant loads."	Konstantin Zhuravlyov	2018-02-20	1	-6/+0
	https://reviews.llvm.org/rL325518 It breaks the following OpenCL conformance tests: - Basic - parameter_types - Basic - vload_private llvm-svn: 325643
*	[AMDGPU] Increased vector length for global/constant loads.	Mark Searles	2018-02-19	1	-0/+6
	Summary: GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache; loadstoreVectorizer should take advantage of the wider vector length and pack 16/8 elements of dwords/quadwords. Author: FarhanaAleen Reviewed By: rampitec Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D43275 llvm-svn: 325518
*	LSR: Check more intrinsic pointer operands	Matt Arsenault	2017-12-11	1	-0/+2
	llvm-svn: 320424
*	[AMDGPU] Port of HSAIL inliner	Stanislav Mekhanoshin	2017-09-20	1	-0/+2
	Differential Revision: https://reviews.llvm.org/D36849 llvm-svn: 313714
*	[AMDGPU] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).	Eugene Zelenko	2017-08-08	1	-8/+20
	llvm-svn: 310429
*	AMDGPU: Use a custom areInlineCompatible	Matt Arsenault	2017-08-07	1	-0/+29
	Fixes not inlining OpenCL library functions on AMDGPU, which don't have an explicitly set target-cpu. llvm-svn: 310269
*	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI.	Geoff Berry	2017-06-28	1	-1/+2
	Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34531 llvm-svn: 306554
*	AMDGPU: Allow vectorization of packed types	Matt Arsenault	2017-06-20	1	-2/+4
	llvm-svn: 305844
*	DivergencyAnalysis patch for review	Alexander Timofeev	2017-06-15	1	-0/+1
	llvm-svn: 305494
*	Const correctness for TTI::getRegisterBitWidth	Daniel Neilson	2017-06-12	1	-1/+1
	Summary: The method TargetTransformInfo::getRegisterBitWidth() is declared const, but the type erasing implementation classes (TargetTransformInfo::Concept & TargetTransformInfo::Model) that were introduced by Chandler in https://reviews.llvm.org/D7293 do not have the method declared const. This is an NFC to tidy up the const consistency between TTI and its implementation. Reviewers: chandlerc, rnk, reames Reviewed By: reames Subscribers: reames, jfb, arsenm, dschuff, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D33903 llvm-svn: 305189
*	AMDGPU: Make some packed shuffles free	Matt Arsenault	2017-05-10	1	-0/+3
	VOP3P instructions can encode access to either half of the register. llvm-svn: 302730
*	[AMDGPU] Get address space mapping by target triple environment	Yaxun Liu	2017-03-27	1	-1/+1
	As we introduced target triple environment amdgiz and amdgizcl, the address space values are no longer enums. We have to decide the value by target triple. The basic idea is to use struct AMDGPUAS to represent address space values. For address space values which do not depend on the target triple, use static const members, so that they don't occupy extra memory space and are equivalent to compile time constants. Since the struct is lightweight and cheap, it can be created on the fly at the point of usage. Or it can be added as a member to a pass and created at the beginning of the run* function. Differential Revision: https://reviews.llvm.org/D31284 llvm-svn: 298846
*	LoadStoreVectorizer: Split even sized illegal chains properly	Matt Arsenault	2017-02-23	1	-0/+11
	Implement isLegalToVectorizeLoadChain for AMDGPU to avoid producing private address space accesses that will need to be split up later. This was doing the wrong thing in the case where the queried chain was an even number of elements. A possible <4 x i32> store was being split into
	  store <2 x i32>
	  store i32
	  store i32
	rather than
	  store <2 x i32>
	  store <2 x i32>
	when legal. llvm-svn: 295933
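The intended split for an even-sized chain can be sketched as follows. This is a hedged illustration rather than the LoadStoreVectorizer code: `ChainSplit` and `splitChain` are hypothetical names, and peeling a single trailing element off an odd-sized chain is an assumption the message does not confirm.

```cpp
// Element counts of the two pieces a chain is split into.
struct ChainSplit {
  unsigned Left;
  unsigned Right;
};

// Hypothetical helper: an even-sized chain is halved so that both pieces can
// remain vectors; an odd-sized chain peels off one trailing element
// (assumed behaviour, not taken from the commit).
constexpr ChainSplit splitChain(unsigned NumElts) {
  return (NumElts % 2 == 0) ? ChainSplit{NumElts / 2, NumElts / 2}
                            : ChainSplit{NumElts - 1, 1};
}
```

For the <4 x i32> store in the message, this yields two pieces of two elements each, matching the pair of <2 x i32> stores.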
*	AMDGPU: Fix warning	Matt Arsenault	2017-01-31	1	-1/+2
	llvm-svn: 293717
*	AMDGPU: Implement hook for InferAddressSpaces	Matt Arsenault	2017-01-31	1	-1/+11
	For now just port some of the existing NVPTX tests and some from an old HSAIL optimization pass which approximately did the same thing. Don't enable the pass yet until more testing is done. llvm-svn: 293580
*	[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.	Mohammed Agabaria	2017-01-11	1	-1/+2
	Updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd. Special optimization case which replaces pmulld with a pmullw\pmulhw\pshuf sequence when the real operand bitwidth is <= 16. Differential Revision: https://reviews.llvm.org/D28104 llvm-svn: 291657
*	Do a sweep over move ctors and remove those that are identical to the default.	Benjamin Kramer	2016-10-20	1	-7/+0
	All of these existed because MSVC 2013 was unable to synthesize default move ctors. We recently dropped support for it so all that error-prone boilerplate can go. No functionality change intended. llvm-svn: 284721
*	Add new target hooks for LoadStoreVectorizer	Volkan Keles	2016-10-03	1	-1/+1
	Summary: Added 6 new target hooks for the vectorizer in order to filter types, handle size constraints and decide how to split chains. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, mzolotukhin, wdng, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D24727 llvm-svn: 283099
*	[TTI] The cost model should not assume vector casts get completely scalarized	Michael Kuperstein	2016-07-06	1	-0/+2
	The cost model should not assume vector casts get completely scalarized, since on targets that have vector support, the common case is a partial split up to the legal vector size. So, when a vector cast gets split, the resulting casts end up legal and cheap. Instead of pessimistically assuming scalarization, base TTI can use the costs the concrete TTI provides for the split vector, plus a fudge factor to account for the cost of the split itself. This fudge factor is currently 1 by default, except on AMDGPU where inserts and extracts are considered free. Differential Revision: http://reviews.llvm.org/D21251 llvm-svn: 274642
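The split-based costing described in this message can be modelled roughly as below. This is a simplified sketch and not the base TTI implementation: `castCost` and its parameters are hypothetical, element counts are assumed to be powers of two, and charging the fudge factor once per split step is an assumption.

```cpp
// Hypothetical model: cost of a cast on a vector of NumElts elements when
// vectors of up to LegalElts elements are legal. A legal cast costs
// LegalCost; an illegal one is split in half, paying SplitCost (the "fudge
// factor") once plus the cost of casting both halves.
constexpr unsigned castCost(unsigned NumElts, unsigned LegalElts,
                            unsigned LegalCost, unsigned SplitCost) {
  return NumElts <= LegalElts
             ? LegalCost
             : SplitCost + 2 * castCost(NumElts / 2, LegalElts, LegalCost,
                                        SplitCost);
}
```

In this model, a 16-element cast over 4-element legal vectors with a legal cost of 1 comes to 7 with a fudge factor of 1, and to 4 with a fudge factor of 0, the case the message describes for AMDGPU.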
*	AMDGPU: Implement getLoadStoreVecRegBitWidth	Matt Arsenault	2016-07-01	1	-0/+1
	llvm-svn: 274312
*	AMDGPU: Implement per-function subtargets	Matt Arsenault	2016-06-27	1	-3/+4
	llvm-svn: 273940
*	AMDGPU: Other sizes of popcnt are fast	Matt Arsenault	2016-05-18	1	-1/+1
	We can chain bcnt instructions together, so any width popcnt is pretty fast. llvm-svn: 269950
*	AMDGPU: Partially implement getArithmeticInstrCost for FP ops	Matt Arsenault	2016-03-25	1	-1/+30
	llvm-svn: 264374
*	AMDGPU: R600 code splitting cleanup	Matt Arsenault	2016-03-11	1	-3/+3
	Move a few functions only used by R600 to R600 specific code, fix header macros to stop using R600, mark classes as final. llvm-svn: 263204
*	AMDGPU: Override getCFInstrCost	Matt Arsenault	2015-12-16	1	-0/+2
	The default cost was 0 with the assumption that it is predictable. llvm-svn: 255796
*	AMDGPU/SI: Implement AMDGPUTargetTransformInfo::isSourceOfDivergence()	Tom Stellard	2015-12-15	1	-0/+1
	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15476 llvm-svn: 255661