bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[X86][SSE41] Non-temporal loads shouldn't be folded if it can be avoided ↵	Simon Pilgrim	2017-06-05	1	-96/+252
	(PR32743) Missed SSE41 non-temporal load case in previous commit Differential Revision: https://reviews.llvm.org/D33728 llvm-svn: 304722
*	Symbols re-defined with -wrap and -defsym need to be excluded from inter-procedural ↵	Dmitry Mikulin	2017-06-05	1	-0/+16
	optimizations to prevent dropping symbols and allow the linker to process re-directs. PR33145: --wrap doesn't work with lto. Differential Revision: https://reviews.llvm.org/D33621 llvm-svn: 304719
*	[X86][AVX1] Split 256-bit vector non-temporal loads to keep it non-temporal ↵	Simon Pilgrim	2017-06-05	2	-100/+202
	(PR32744) Differential Revision: https://reviews.llvm.org/D33728 llvm-svn: 304718
*	[X86][SSE] Non-temporal loads shouldn't be folded if it can be avoided (PR32743)	Simon Pilgrim	2017-06-05	1	-65/+148
	Differential Revision: https://reviews.llvm.org/D33728 llvm-svn: 304717
*	[ARM] GlobalISel: Constrain callee register on indirect calls	Diana Picus	2017-06-05	1	-4/+6
	When lowering calls, we generate instructions with machine opcodes rather than generic ones. Therefore, we need to constrain the register classes of the operands. Also enable the machine verifier on the arm-irtranslator.ll test, since that would've caught this issue. Fixes (part of) PR32146. llvm-svn: 304712
*	[LLVM-C] [OCaml] Expose Type::subtypes.	whitequark	2017-06-05	1	-0/+11
	The C functions added are LLVMGetNumContainedTypes and LLVMGetSubtypes. The OCaml function added is Llvm.subtypes. Patch by Ekaterina Vaartis. Differential Revision: https://reviews.llvm.org/D33677 llvm-svn: 304709
*	Move ARM specific test to ELF/ARM dir	Javed Absar	2017-06-05	1	-0/+0
	Moving ARM specific test clang-section.s from MC/ELF to MC/ELF/ARM Buildbots reported failures on commit https://reviews.llvm.org/rL304705 Full details are available at: http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/10333 llvm-svn: 304706
*	Add support for #pragma clang section	Javed Absar	2017-06-05	2	-0/+539
	This patch provides a means to specify section-names for global variables, functions and static variables, using #pragma directives. This feature is only defined to work sensibly for ELF targets.
	One can specify section names as:
		#pragma clang section bss="myBSS" data="myData" rodata="myRodata" text="myText"
	One can "unspecify" a section name with empty string e.g.
		#pragma clang section bss="" data="" text="" rodata=""
	Reviewers: Roger Ferrer, Jonathan Roelofs, Reid Kleckner Differential Revision: https://reviews.llvm.org/D33413 llvm-svn: 304704
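A minimal sketch of how the pragma described above might be used, assuming Clang and an ELF target (other compilers are expected to ignore the unknown pragma). The variable and function names are hypothetical, and the section placement noted in the comments is what the commit message describes, not something this snippet verifies.

```cpp
#include <cassert>

// Assumption: compiled with Clang for an ELF target.
#pragma clang section bss="myBSS" data="myData" rodata="myRodata" text="myText"
int zeroInit;                          // zero-initialized global: intended for "myBSS"
int counter = 42;                      // initialized global: intended for "myData"
const int table[] = {1, 2, 3};         // constant data: intended for "myRodata"
int nextValue() { return ++counter; }  // function: intended for "myText"

// Empty strings "unspecify" the names, so later definitions use default sections.
#pragma clang section bss="" data="" rodata="" text=""
int afterReset = 7;

// The pragma only affects placement, not behaviour.
void selfCheck() { assert(table[1] == 2); }
```

On an ELF build, a tool such as `llvm-objdump -t` can be used to inspect which section each symbol ended up in.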
*	[ARM] Support fixup for Thumb2 modified immediate	Peter Smith	2017-06-05	5	-2/+76
	This change adds a new fixup fixup_t2_so_imm for the t2_so_imm_asmoperand "T2SOImm". The fixup permits code such as:
		.L1:
		 sub r3, r3, #.L2 - .L1
		.L2:
	to assemble in Thumb2 as well as in ARM state.
	The operand predicate isT2SOImm() explicitly doesn't match expressions containing :upper16: and :lower16: as expressions with these operators must match the movt and movw instructions.
	The test mov r0, foo2 in thumb2-diagnostics is moved to a new file as the fixup delays the error message till after the assembler has quit due to the other errors.
	As the mov instruction shares the t2_so_imm_asmoperand, mov instructions with a non constant expression now match t2MOVi rather than t2MOVi16, so the error message is slightly different.
	Fixes PR28647 Differential Revision: https://reviews.llvm.org/D33492 llvm-svn: 304702
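For context, a fixup like the one above can only succeed if the resolved value of `.L2 - .L1` fits the Thumb2 "modified immediate" encoding. The sketch below checks that property for a 32-bit value, following the architectural encoding rules; it is an illustration, not LLVM's implementation, and `isThumb2ModifiedImm` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit value left by n bits (0 < n < 32).
static uint32_t rotl32(uint32_t v, unsigned n) {
  return (v << n) | (v >> (32 - n));
}

// Returns true if `value` can be written as a Thumb2 modified immediate:
// a plain 8-bit value, one of the replicated-byte patterns, or an 8-bit
// value with its top bit set rotated right by 8..31 bits.
bool isThumb2ModifiedImm(uint32_t value) {
  if (value <= 0xFF) return true;                      // 0x000000XY
  uint32_t lo = value & 0xFF;
  if (value == (lo | (lo << 16))) return true;         // 0x00XY00XY
  uint32_t hi = (value >> 8) & 0xFF;
  if (value == ((hi << 8) | (hi << 24))) return true;  // 0xXY00XY00
  if (value == lo * 0x01010101u) return true;          // 0xXYXYXYXY
  for (unsigned rot = 8; rot < 32; ++rot) {
    // Undo a right-rotation by `rot`; the result must be 0x80..0xFF.
    uint32_t unrotated = rotl32(value, rot);
    if (unrotated >= 0x80 && unrotated <= 0xFF) return true;
  }
  return false;
}

void selfCheck() { assert(isThumb2ModifiedImm(0xFF000000u)); }
```

A label difference such as 0x101 has set bits nine positions apart, so it fits none of these forms and the assembler would have to report an error.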
*	[InstCombine] Fix extractelement use before def	Sven van Haastregt	2017-06-05	1	-0/+23
	This fixes a bug that can cause extractelements with operands that haven't been defined yet to be inserted at a wrong point when optimising insertelements. Patch by Karl Hylen. Differential Revision: https://reviews.llvm.org/D33449 llvm-svn: 304701
*	Revert "[sanitizer-coverage] one more flavor of coverage: ↵	Renato Golin	2017-06-05	1	-13/+0
	-fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet." This reverts commit r304630, as it broke ARM/AArch64 bots for 2 days. llvm-svn: 304698
*	[AMDGPU] Fix SIFoldOperands crash with clamp	Stanislav Mekhanoshin	2017-06-05	1	-0/+20
	Fixes bug #33302. The pass did not account for Src1 of the max instruction being an immediate. Differential Revision: https://reviews.llvm.org/D33884 llvm-svn: 304696
*	[X86][SSE] Change BUILD_VECTOR interleaving ordering to improve ↵	Simon Pilgrim	2017-06-04	19	-1297/+1240
	coalescing/combine opportunities
	We currently generate BUILD_VECTOR as a tree of UNPCKL shuffles of the same type, e.g. for v4f32:
		Step 1: unpcklps 0, 2 ==> X: <?, ?, 2, 0>
		      : unpcklps 1, 3 ==> Y: <?, ?, 3, 1>
		Step 2: unpcklps X, Y ==> <3, 2, 1, 0>
	The issue is that, because we are not placing sequential vector elements together early enough, we fail to recognise many combinable patterns: consecutive scalar loads, extractions etc.
	Instead, this patch unpacks progressively larger sequential vector elements together, e.g. for v4f32:
		Step 1: unpcklps 0, 1 ==> X: <?, ?, 1, 0>
		      : unpcklps 2, 3 ==> Y: <?, ?, 3, 2>
		Step 2: unpcklpd X, Y ==> <3, 2, 1, 0>
	This does mean that we are creating UNPCKL shuffles of different value types, but the relevant combines that benefit from this are quite capable of handling the additional BITCASTs that are now included in the shuffle tree.
	Differential Revision: https://reviews.llvm.org/D33864 llvm-svn: 304688
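The two shuffle trees can be modelled on plain arrays to confirm that both produce the same vector. This is a sketch only: the lane model and helper names are assumptions, not LLVM code. The new scheme is taken to pair elements (0, 1) and (2, 3) in step 1, which is what the intermediate results <?, ?, 1, 0> and <?, ?, 3, 2> imply.

```cpp
#include <array>
#include <cassert>

// Lanes are listed low to high: {lane0, lane1, lane2, lane3}.
// The commit message prints them high to low, e.g. <3, 2, 1, 0>.
using V4 = std::array<int, 4>;

// UNPCKLPS: interleave the two low 32-bit lanes of a and b.
V4 unpcklps(const V4 &a, const V4 &b) { return {a[0], b[0], a[1], b[1]}; }

// UNPCKLPD: concatenate the low 64-bit halves of a and b.
V4 unpcklpd(const V4 &a, const V4 &b) { return {a[0], a[1], b[0], b[1]}; }

// Scalar element i sitting in lane 0; -1 marks an undefined lane ('?').
V4 scalar(int i) { return {i, -1, -1, -1}; }

// Old tree: same-typed unpcklps at every level.
V4 buildOld() {
  V4 x = unpcklps(scalar(0), scalar(2));  // <?, ?, 2, 0>
  V4 y = unpcklps(scalar(1), scalar(3));  // <?, ?, 3, 1>
  return unpcklps(x, y);
}

// New tree: sequential elements are paired first, then widened.
V4 buildNew() {
  V4 x = unpcklps(scalar(0), scalar(1));  // <?, ?, 1, 0>
  V4 y = unpcklps(scalar(2), scalar(3));  // <?, ?, 3, 2>
  return unpcklpd(x, y);
}

void selfCheck() { assert(buildOld() == buildNew()); }
```

In the new tree the step 1 results already hold adjacent elements, which is the property the combines for consecutive loads and extractions look for.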
*	[GlobalISel][X86] merge irtranslator-call test files. NFC	Igor Breger	2017-06-04	3	-57/+34
	llvm-svn: 304683
*	[X86] Replace 'REQUIRES: x86' in tests with 'REQUIRES: ↵	Craig Topper	2017-06-04	10	-10/+10
	x86-registered-target' which seems to be the correct way to make them run on an x86 build. llvm-svn: 304682
*	[ConstantFolding] Properly support constant folding of vector powi ↵	Craig Topper	2017-06-04	1	-2/+1
	intrinsic. The second argument is not a vector so needs special treatment. llvm-svn: 304679
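The special treatment can be pictured as an element-wise fold in which only the first operand is indexed per lane while the exponent stays a single shared scalar. The sketch below models that on plain doubles; `powi` and `foldVectorPowi` are hypothetical helpers, not the ConstantFolding code.

```cpp
#include <cassert>
#include <vector>

// Integer power by repeated squaring; a negative exponent takes the reciprocal.
double powi(double base, int exp) {
  long long e = exp;
  bool negative = e < 0;
  if (negative) e = -e;
  double result = 1.0;
  while (e != 0) {
    if (e & 1) result *= base;
    base *= base;
    e >>= 1;
  }
  return negative ? 1.0 / result : result;
}

// Fold a constant vector powi: each lane of the first operand is folded
// separately, but the exponent is one scalar reused for every lane.
std::vector<double> foldVectorPowi(const std::vector<double> &lanes, int exp) {
  std::vector<double> folded;
  folded.reserve(lanes.size());
  for (double lane : lanes) folded.push_back(powi(lane, exp));
  return folded;
}

void selfCheck() { assert(foldVectorPowi({2.0}, 3).front() == 8.0); }
```

A folder that tried to read lane i of the exponent, as it would for a true two-vector operation, would go wrong here, which is the situation the commit addresses.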
*	[InstSimplify] Add test case demonstrating that we fail to constant fold ↵	Craig Topper	2017-06-04	1	-0/+24
	vector llvm.powi intrinsics due to the second argument not being a vector. llvm-svn: 304678
*	[InstCombine] Add support for simplifying ctlz/cttz intrinsics based on ↵	Craig Topper	2017-06-03	1	-10/+4
	known bits. llvm-svn: 304669
*	[ConstantFolding] Fix constant folding for vector cttz and ctlz intrinsics ↵	Craig Topper	2017-06-03	2	-6/+3
	to understand that the second argument is still a scalar. llvm-svn: 304668
*	[InstCombine][InstSimplify] Add various tests for ctlz/cttz with vectors, ↵	Craig Topper	2017-06-03	2	-0/+171
	some showing missed optimizations. NFC llvm-svn: 304667
*	[InstCombine] Use cttz instead of ctlz in the cttz_cmp_vec test case. Looks ↵	Craig Topper	2017-06-03	1	-1/+1
	like a copy paste mistake. llvm-svn: 304666
*	[AMDGPU] Untangle SDWA pass from SIShrinkInstructions	Stanislav Mekhanoshin	2017-06-03	22	-95/+102
	Remove dependency of SDWA pass on SIShrinkInstructions. The goal is to move SDWA even higher in the stack to avoid second run of MachineLICM, MachineCSE and SIFoldOperands. Also added handling to preserve original src modifiers. Differential Revision: https://reviews.llvm.org/D33860 llvm-svn: 304665
*	Regenerate expectations for trunc-to-bool.ll . NFC	Amaury Sechet	2017-06-03	1	-10/+60
	llvm-svn: 304660
*	[X86][SSE] Add SCALAR_TO_VECTOR(PEXTRW/PEXTRB) support to faux shuffle combining	Simon Pilgrim	2017-06-03	1	-39/+5
	Generalized existing SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT) code to support AssertZext + PEXTRW/PEXTRB cases as well. llvm-svn: 304659
*	[sanitizer-coverage] one more flavor of coverage: ↵	Kostya Serebryany	2017-06-03	1	-0/+13
	-fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet. llvm-svn: 304630
*	AMDGPU/GlobalISel: Mark 1-bit integer constants as legal	Tom Stellard	2017-06-03	1	-0/+9
	Summary: These are mostly legal, but will probably need special lowering for some cases. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D33791 llvm-svn: 304628
*	Revert "[CFI] Remove LinkerSubsectionsViaSymbols."	Evgeniy Stepanov	2017-06-03	1	-1/+15
	This reverts commit r304582: breaks cfi-devirt :: anon-namespace.cpp on Darwin. llvm-svn: 304626
*	[AMDGPU] Preserve operand order in SIFoldOperands	Stanislav Mekhanoshin	2017-06-03	3	-12/+8
	SIFoldOperands can commute operands even if no folding was done. This change preserves the IR if no folding was done. Differential Revision: https://reviews.llvm.org/D33802 llvm-svn: 304625
*	[SystemZ] Simplify test case. NFC	Quentin Colombet	2017-06-02	1	-12/+0
	Remove useless successors information. llvm-svn: 304615
*	[x86] fix over-specific triple; NFC	Sanjay Patel	2017-06-02	1	-204/+204
	There's nothing darwin-specific in these tests, and using that setting causes extra phantom diffs when the auto-generated check lines are regenerated today. llvm-svn: 304614
*	Canonicalize a test via utils/update_test_checks.py	Philip Reames	2017-06-02	1	-31/+91
	Turns out I might not have further changes to make here, but with the way I'd written the tests, even I couldn't tell that. :( llvm-svn: 304613
*	[x86] add tests for unsigned vector compares with known signbits; NFC (PR33276)	Sanjay Patel	2017-06-02	1	-0/+519
	llvm-svn: 304612
*	RegisterScavenging: Add ScavengerTest pass	Matthias Braun	2017-06-02	2	-0/+203
	This pass allows running register scavenging independently of PrologEpilogInserter, which enables targeted testing. Also adds some basic register scavenging tests. llvm-svn: 304606
*	[RABasic] Properly update the LiveRegMatrix when LR splitting occurs	Quentin Colombet	2017-06-02	1	-0/+279
	Prior to this patch we used to not touch the LiveRegMatrix while doing live-range splitting. In other words, when live-range splitting was occurring, the LiveRegMatrix was not reflecting the changes. This is generally fine because it means the query to the LiveRegMatrix will be conservatively correct. However, when decisions are taken based on what is going to happen on the interferences (e.g., when we spill a register and know that it is going to be available for another one), we might hit an assertion that the color used for the assignment is still in use. This patch makes sure the changes on the live-ranges are properly reflected in the LiveRegMatrix, so the assertions don't break. An alternative could have been to remove the assertion, but it would make the invariants of the code and the general reasoning more complicated in my opinion. http://llvm.org/PR33057 llvm-svn: 304603
*	[RABasic] Properly initialize the pass	Quentin Colombet	2017-06-02	1	-0/+1
	Use the initializeXXX method to initialize the RABasic pass in the pipeline. This enables us to take advantage of the .mir infrastructure. llvm-svn: 304602
*	[PartialInlining] Minor cost analysis tuning	Xinliang David Li	2017-06-02	2	-0/+105
	Also added a test option and 2 cost analysis related tests. llvm-svn: 304599
*	[InlineCost] Enable the new switch cost heuristic	Jun Bum Lim	2017-06-02	1	-2/+2
	Summary: This is to enable the new switch inline cost heuristic (r301649) by removing the old heuristic as well as the flag itself. In my experiment for LLVM test suite and spec2000/2006, a 17.82% performance improvement and an 8% code size reduction were observed in spec2000/vertex with O3 LTO in AArch64. No significant code size / performance regression was found in O3/O2/Os. No significant complaints were reported on the llvm-dev thread. Reviewers: hans, chandlerc, eraman, haicheng, mcrosier, bmakam, eastig, ddibyend, echristo Reviewed By: echristo Subscribers: javed.absar, kristof.beyls, echristo, aemerson, rengolin, mehdi_amini Differential Revision: https://reviews.llvm.org/D32653 llvm-svn: 304594
*	[X86] Correctly broadcast NaN-like integers as float on AVX.	Ahmed Bougacha	2017-06-02	1	-0/+32
	Since r288804, we try to lower build_vectors on AVX using broadcasts of float/double. However, when we broadcast integer values that happen to have a NaN float bitpattern, we lose the NaN payload, thereby changing the integer value being broadcast. This is caused by ConstantFP::get, to which we pass the splat i32 as a float (by bitcasting it using bitsToFloat). ConstantFP::get takes a double parameter, so we end up lossily converting a single-precision NaN to double-precision. Instead, avoid any kinds of conversions by directly building an APFloat from the splatted APInt. Note that this also fixes another piece of code (broadcast of subvectors), that currently isn't susceptible to the same problem. Also note that we could really just use APInt and ConstantInt throughout: the constant pool type doesn't matter much. Still, for consistency, use the appropriate type. llvm-svn: 304590
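The hazard described above can be reproduced outside LLVM with plain floating-point conversions. The sketch below contrasts a float-to-double-to-float round trip with a pure bit copy. The helper names are hypothetical (`bitsToFloat` merely mirrors the name used in the commit message), and whether a particular NaN pattern changes on the round trip depends on the hardware, so no specific output is claimed for it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret 32 bits as a float and back, without any numeric conversion.
float bitsToFloat(uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}
uint32_t floatToBits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Route modelled on the bug: the value passes through a double. The volatile
// forces a real widening conversion, which may alter a signaling NaN
// (for example by setting its quiet bit).
uint32_t viaDouble(uint32_t bits) {
  volatile double widened = bitsToFloat(bits);
  return floatToBits(static_cast<float>(widened));
}

// Route modelled on the fix: the bits are only copied, never converted.
uint32_t viaBits(uint32_t bits) { return floatToBits(bitsToFloat(bits)); }

void selfCheck() { assert(viaDouble(0x3F800000u) == 0x3F800000u); }
```

Comparing `viaDouble(0x7F800001u)` with `viaBits(0x7F800001u)` on a given machine shows whether the integer pattern survives. Ordinary values such as 0x3F800000 (1.0f) survive both routes, which is why the problem only appears for NaN-like integers.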
*	[CodeView] Support CodeView subsections in any order.	Zachary Turner	2017-06-02	2	-41/+43
	Previously we would expect certain subsections to appear in a certain order because some subsections would reference other subsections, but in practice we need to support arbitrary orderings since some object file and PDB file producers generate them this way. This also paves the way for supporting Yaml <-> Object File conversion of CodeView, since Object Files typically have quite a large number of subsections in their debug info. Differential Revision: https://reviews.llvm.org/D33807 llvm-svn: 304588
*	Regenerate expectation for wide-fma-contraction.ll . NFC	Amaury Sechet	2017-06-02	1	-16/+38
	llvm-svn: 304586
*	[SROA] Fix crash due to bad bitcast	Keno Fischer	2017-06-02	1	-0/+18
	Summary: As shown in the test case, SROA was crashing when trying to split stores (to the alloca) of loads (from anywhere), because it assumed the pointer operand to the loads and stores had to have the same address space. This isn't the case. Make sure to use the correct pointer type for both the load and the store. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D32593 llvm-svn: 304585
*	[CFI] Remove LinkerSubsectionsViaSymbols.	Evgeniy Stepanov	2017-06-02	1	-15/+1
	Since D17854 LinkerSubsectionsViaSymbols is unnecessary. It is interfering with ThinLTO implementation of CFI-ICall, where the aliases used on the !LinkerSubsectionsViaSymbols branch are needed to export jump tables to ThinLTO backends. llvm-svn: 304582
*	Skip CFI for dead functions.	Evgeniy Stepanov	2017-06-02	14	-11/+42
	Differential Revision: https://reviews.llvm.org/D33805 llvm-svn: 304578
*	Move summary dead stripping before regular LTO.	Evgeniy Stepanov	2017-06-02	1	-1/+15
	This way dead stripping results are recorded in combined summary and can be used in regular LTO passes. Differential Revision: https://reviews.llvm.org/D33615 llvm-svn: 304577
*	AMDGPU: Make auto waitcnt before barrier a feature	Konstantin Zhuravlyov	2017-06-02	1	-3/+6
	Differential Revision: https://reviews.llvm.org/D33793 llvm-svn: 304571
*	Add placeholder for more extensive verification of pseudo ops	Philip Reames	2017-06-02	12	-13/+13
	This initial patch doesn't actually do much that is useful. It's just to show where the new code goes. Once this is in, I'll extend the verification logic to check more useful properties. For those curious, the more complicated version of this patch already found one very suspicious thing. Differential Revision: https://reviews.llvm.org/D33819 llvm-svn: 304564
*	[InstCombine] fix icmp with not op and constant to work with splat vector ↵	Sanjay Patel	2017-06-02	1	-2/+1
	constant llvm-svn: 304562
*	[InstSimplify][ConstantFolding] Teach constant folding how to handle icmp ↵	Craig Topper	2017-06-02	1	-1/+1
	null, (inttoptr x) as well as it handles icmp (inttoptr x), null Summary: The constant folding code currently assumes that the constant expression will always be on the left and the simple null will be on the right. But that's not true at least on the path from InstSimplify. This patch adds support to ConstantFolding to detect the reversed case. Reviewers: spatel, dberlin, majnemer, davide, joey Reviewed By: joey Subscribers: joey, llvm-commits Differential Revision: https://reviews.llvm.org/D33801 llvm-svn: 304559
*	Update select.ll expected results. NFC	Amaury Sechet	2017-06-02	1	-0/+31
	llvm-svn: 304557
*	[InstCombine] fix/add tests for icmp with not ops; NFC	Sanjay Patel	2017-06-02	1	-10/+40
	The existing test was not minimal, and there was no coverage for the variants with a constant or vector types. llvm-svn: 304555