bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Empty line. NFCI	Xin Tong	2017-02-25	1	-1/+0
\| \| \| \|	llvm-svn: 296250
*	Minor code cleanup. NFC.	Junmo Park	2017-02-25	1	-1/+1
\| \| \| \|	llvm-svn: 296222
*	Remove redundant code. NFC.	Akira Hatanaka	2017-02-25	1	-4/+0
\| \| \| \|	llvm-svn: 296219
*	Clean up ObjCARCOpts.cpp. NFC.	Akira Hatanaka	2017-02-25	1	-81/+7
\| \| \| \| \| \| \|	I removed unused functions and variables and moved variables closer to their uses. llvm-svn: 296218
*	[PDB] General improvements to Stream library.	Zachary Turner	2017-02-25	31	-287/+312
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds various new functionality and cleanup surrounding the use of the Stream library. Major changes include: * Renaming of all classes for more consistency / meaningfulness * Addition of some new methods for reading multiple values at once. * Full suite of unit tests for reader / writer functionality. * Full set of doxygen comments for all classes. * Streams now store their own endianness. * Fixed some bugs in a few of the classes that were discovered by the unit tests. llvm-svn: 296215
*	[PDB] Rename Stream related source files.	Zachary Turner	2017-02-25	33	-52/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is part of a larger effort to get the Stream code moved up to Support. I don't want to do it in one large patch, in part because the changes are so big that it will treat everything as file deletions and add, losing history in the process. Aside from that though, it's just a good idea in general to make small changes. So this change only changes the names of the Stream related source files, and applies necessary source fix ups. llvm-svn: 296211
*	[InlineCost] Move the code in isGEPOffsetConstant to a lambda.	Easwaran Raman	2017-02-25	1	-13/+9
\| \| \| \| \| \|	Differential revision: https://reviews.llvm.org/D30112 llvm-svn: 296208
*	Minor code cleanup. NFC.	Junmo Park	2017-02-25	1	-1/+1
\| \| \| \|	llvm-svn: 296207
*	[PGO] Directory name stripping in global identifier for static functions	Rong Xu	2017-02-25	1	-1/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current internal option -static-func-full-module-prefix keeps the full directory path in the profile counter names for static functions. The default of this option is false. This strips the directory names from the source filename, which is problematic: (1) it creates linker errors for profile-generation compilation, exposed in our internal benchmarks. We are seeing messages like "warning: relocation refers to discarded section". This is due to the name conflicts after the stripping. (2) the stripping only applies to getPGOFuncName. The current Thin-LTO module importing for indirect calls assumes the source directory name is not stripped. The current default value for this option can potentially prevent some inter-module indirect-call promotions. This patch turns the default value for -static-func-full-module-prefix to true. The second part of the patch is to have an alternative implementation under the internal option -static-func-strip-dirname-prefix=<value>. This option specifies the number of directory levels to be stripped from the source filename. Using a large value as the parameter has the same effect as -static-func-full-module-prefix. Differential Revision: http://reviews.llvm.org/D29512 llvm-svn: 296206
*	[WebAssembly] Add support for using a wasm global for the stack pointer.	Dan Gohman	2017-02-24	7	-42/+137
\| \| \| \| \| \| \|	This replaces the __stack_pointer variable which was allocated in linear memory. llvm-svn: 296201
*	[Hexagon] Undo shift folding where it could simplify addressing mode	Krzysztof Parzyszek	2017-02-24	1	-3/+75
\| \| \| \| \| \| \| \| \| \| \| \|	For example, avoid (single shift): r0 = and(##536870908,lsr(r0,#3)) r0 = memw(r1+r0<<#0) in favor of (two shifts): r0 = lsr(r0,#5) r0 = memw(r1+r0<<#2) llvm-svn: 296196
*	[WebAssembly] Basic support for Wasm object file encoding.	Dan Gohman	2017-02-24	28	-153/+1587
\| \| \| \| \| \| \| \| \|	With the "wasm32-unknown-unknown-wasm" triple, this allows writing out simple wasm object files, and is another step in a larger series toward migrating from ELF to general wasm object support. Note that this code and the binary format itself is still experimental. llvm-svn: 296190
*	[Hexagon] Prettify code in HexagonDAGToDAGISel::Select	Krzysztof Parzyszek	2017-02-24	1	-47/+13
\| \| \| \|	llvm-svn: 296187
*	AMDGPU : Replace FMAD with FMA when denormals are enabled.	Wei Ding	2017-02-24	4	-1/+20
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D29958 llvm-svn: 296186
*	Revert "Correct register pressure calculation in presence of subregs"	Stanislav Mekhanoshin	2017-02-24	5	-64/+13
\| \| \| \| \| \| \| \|	This reverts commit r296009. It broke one out of tree target and also does not account for all partial lines added or removed when calculating PressureDiff. llvm-svn: 296182
*	Disallow redefinition of section symbols.	Evgeniy Stepanov	2017-02-24	1	-1/+5
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D30235 llvm-svn: 296180
*	Initialize MCContext::InlineSrcMgr in the constructor.	Evgeniy Stepanov	2017-02-24	1	-2/+3
\| \| \| \| \| \|	Found with ASan (and a local source change) on test/CodeGen/XCore/section-name.ll. llvm-svn: 296179
*	GlobalISel: check for CImm rather than Imm on G_CONSTANTs.	Tim Northover	2017-02-24	1	-2/+5
\| \| \| \| \| \| \|	All G_CONSTANTS created by the MachineIRBuilder have an operand of type CImm (i.e. a ConstantInt), so that's what the selector needs to look for. llvm-svn: 296176
*	[WebAssembly] Handle f16 in fast-isel.	Dan Gohman	2017-02-24	1	-0/+2
\| \| \| \|	llvm-svn: 296172
*	Fix Indentation. NFCI	Xin Tong	2017-02-24	1	-2/+2
\| \| \| \|	llvm-svn: 296169
*	[CodeGenPrepare] Make -addr-sink-using-gep work with address spaces.	Eli Friedman	2017-02-24	1	-4/+4
\| \| \| \| \| \| \| \| \| \|	When we construct addressing modes, we use isNoopAddrSpaceCast to ignore addrspacecast instructions. Make sure we insert the correct addrspacecast when we reconstruct the addressing mode. Differential Revision: https://reviews.llvm.org/D30114 llvm-svn: 296167
*	[InstCombine] Fix bug in pointer replacement	Yaxun Liu	2017-02-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	This optimisation was crashing when there was a chain of more than one bitcast instruction to replace, as a result of the changes in D27283. Patch by James Price. Differential Revision: https://reviews.llvm.org/D30347 llvm-svn: 296163
*	[Target/MIPS] Kill dead code, no functional change intended.	Davide Italiano	2017-02-24	1	-11/+0
\| \| \| \| \| \|	Hopefully placates gcc with -Werror. llvm-svn: 296153
*	[CGP] Split some critical edges coming out of indirect branches	Michael Kuperstein	2017-02-24	1	-0/+155
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Splitting critical edges when one of the source edges is an indirectbr is hard in general (because it requires changing the memory the indirectbr reads). But if a block only has a single indirectbr predecessor (which is the common case), we can simulate splitting that edge by splitting the destination block, and retargeting the direct branches. This is motivated by the use of computed gotos in python 2.7: PyEval_EvalFrame() ends up using an indirect branch with ~100 successors, and passing a constant to each of those. Since MachineSink can't break indirect critical edges on demand (and doing this in MIR doesn't look feasible), this causes us to emit about 100 defs of registers containing constants, which end up in the predecessor block, where only one of those constants is used in each successor. So, at each computed goto, we needlessly spill about 100 constants to stack. The end result is that a clang-compiled python interpreter can be about 2.5x slower on a simple python reduction loop than a gcc-compiled interpreter. Differential Revision: https://reviews.llvm.org/D29916 llvm-svn: 296149
*	Revert: r296141 [APInt] Add APInt::extractBits() method to extract APInt subrange	Simon Pilgrim	2017-02-24	3	-39/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current pattern for extract bits in range is typically: Mask.lshr(BitOffset).trunc(SubSizeInBits); Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation of memory for the temporary variable. This is another of the compile time issues identified in PR32037 (see also D30265). This patch adds the APInt::extractBits() helper method which avoids the temporary memory allocation. Differential Revision: https://reviews.llvm.org/D30336 llvm-svn: 296147
*	[LV] Merge floating-point and integer induction widening code	Matthew Simpson	2017-02-24	1	-65/+92
\| \| \| \| \| \| \| \| \| \| \|	This patch merges the existing floating-point induction variable widening code into the integer induction variable widening code, creating a single set of functions for both kinds of inductions. The primary motivation for doing this is to enable vector phi node creation for floating-point induction variables. Differential Revision: https://reviews.llvm.org/D30211 llvm-svn: 296145
*	[PowerPC] Use subfic instruction for subtract from immediate	Nemanja Ivanovic	2017-02-24	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \|	Provide a 64-bit pattern to use SUBFIC for subtracting from a 16-bit immediate. The corresponding pattern already exists for 32-bit integers. Committing on behalf of Hiroshi Inoue. Differential Revision: https://reviews.llvm.org/D29387 llvm-svn: 296144
*	[PowerPC] Use rldicr instruction for AND with an immediate if possible	Nemanja Ivanovic	2017-02-24	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \|	Emit clrrdi (extended mnemonic for rldicr) for AND-ing with masks that clear bits from the right-hand side. Committing on behalf of Hiroshi Inoue. Differential Revision: https://reviews.llvm.org/D29388 llvm-svn: 296143
*	[APInt] Add APInt::extractBits() method to extract APInt subrange	Simon Pilgrim	2017-02-24	3	-8/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current pattern for extract bits in range is typically: Mask.lshr(BitOffset).trunc(SubSizeInBits); Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation of memory for the temporary variable. This is another of the compile time issues identified in PR32037 (see also D30265). This patch adds the APInt::extractBits() helper method which avoids the temporary memory allocation. Differential Revision: https://reviews.llvm.org/D30336 llvm-svn: 296141
*	[DAGCombiner] add missing folds for scalar select of {-1,0,1}	Sanjay Patel	2017-02-24	1	-3/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The motivation for filling out these select-of-constants cases goes back to D24480, where we discussed removing an IR fold from add(zext) --> select. And that goes back to: https://reviews.llvm.org/rL75531 https://reviews.llvm.org/rL159230 The idea is that we should always canonicalize patterns like this to a select-of-constants in IR because that's the smallest IR and the best for value tracking. Note that we currently do the opposite in some cases (like the cases in this patch). I.e., the proposed folds in this patch already exist in InstCombine today: https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineSelect.cpp#L1151 As this patch shows, most targets generate better machine code for simple ext/add/not ops rather than a select of constants. So the follow-up steps to make this less of a patchwork of special-case folds and missing IR canonicalization: 1. Have DAGCombiner convert any select of constants into ext/add/not ops. 2. Have InstCombine canonicalize in the other direction (create more selects). Differential Revision: https://reviews.llvm.org/D30180 llvm-svn: 296137
*	Recommit "[mips] Fix atomic compare and swap at O0."	Simon Dardis	2017-02-24	7	-151/+400
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This time with the missing files. Similar to PR/25526, fast-regalloc introduces spills at the end of basic blocks. When this occurs in between an ll and sc, the store can cause the atomic sequence to fail. This patch fixes the issue by introducing more pseudos to represent atomic operations and moving their lowering to after the expansion of postRA pseudos. This resolves PR/32020. Thanks to James Cowgill for reporting the issue! Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D30257 llvm-svn: 296134
*	Revert "[mips] Fix atomic compare and swap at O0."	Simon Dardis	2017-02-24	6	-59/+151
\| \| \| \| \| \|	This reverts r296132. I forgot to include the tests. llvm-svn: 296133
*	[mips] Fix atomic compare and swap at O0.	Simon Dardis	2017-02-24	6	-151/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to PR/25526, fast-regalloc introduces spills at the end of basic blocks. When this occurs in between an ll and sc, the store can cause the atomic sequence to fail. This patch fixes the issue by introducing more pseudos to represent atomic operations and moving their lowering to after the expansion of postRA pseudos. This resolves PR/32020. Thanks to James Cowgill for reporting the issue! Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D30257 llvm-svn: 296132
*	[globalisel] Decouple src pattern operands from dst pattern operands.	Daniel Sanders	2017-02-24	2	-1/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This isn't testable for AArch64 by itself so this patch also adds support for constant immediates in the pattern and physical register uses in the result. The new IntOperandMatcher matches the constant in patterns such as '(set $rd:GPR32, (G_XOR $rs:GPR32, -1))'. It's always safe to fold immediates into an instruction so this is the first rule that will match across multiple BB's. The Renderer hierarchy is responsible for adding operands to the result instruction. Renderers can copy operands (CopyRenderer) or add physical registers (in particular %wzr and %xzr) to the result instruction in any order (OperandMatchers now import the operand names from SelectionDAG to allow renderers to access any operand). This allows us to emit the result instruction for: %1 = G_XOR %0, -1 --> %1 = ORNWrr %wzr, %0 %1 = G_XOR -1, %0 --> %1 = ORNWrr %wzr, %0 although the latter is untested since the matcher/importer has not been taught about commutativity yet. Added BuildMIAction which can build new instructions and mutate them where possible. W.r.t the mutation aspect, MatchActions are now told the name of an instruction they can recycle and BuildMIAction will emit mutation code when the renderers are appropriate. They are appropriate when all operands are rendered using CopyRenderer and the indices are the same as the matcher. This currently assumes that all operands have at least one matcher. Finally, this change also fixes a crash in AArch64InstructionSelector::select() caused by an immediate operand passing isImm() rather than isCImm(). This was uncovered by the other changes and was detected by existing tests. 
Depends on D29711 Reviewers: t.p.northover, ab, qcolombet, rovka, aditya_nandakumar, javed.absar Reviewed By: rovka Subscribers: aemerson, dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D29712 llvm-svn: 296131
*	[X86][SSE] Target shuffle combine can try to combine up to 16 vectors	Simon Pilgrim	2017-02-24	1	-6/+6
\| \| \| \| \| \|	Noticed while profiling PR32037, the target shuffle ops were being stored in SmallVector<*,8> types but the combiner could store as many as 16 ops at maximum depth (2 per depth). llvm-svn: 296130
*	[InstCombine] don't try SimplifyDemandedInstructionBits from zext/sext because it's slow and unnecessary	Sanjay Patel	2017-02-24	1	-10/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This one seems more obvious than D30270 that it can't make improvements because an extension always needs all of the incoming bits. There's one specific transform in SimplifyDemandedInstructionBits of converting a sext to a zext when the sign-bit is known zero, but that is handled explicitly in visitSext() with ComputeSignBit(). Like D30270, there are no IR differences (other than instruction names) for the case in PR32037: https://bugs.llvm.org//show_bug.cgi?id=32037 ...and no regression test differences. Zext/sext are a smaller part of the profile, but this still appears to shave off another 0.5% or so from 'opt -O2'. Differential Revision: https://reviews.llvm.org/D30280 llvm-svn: 296129
*	[x86] use DAG.getAllOnesConstant(); NFCI	Sanjay Patel	2017-02-24	1	-18/+11
\| \| \| \|	llvm-svn: 296128
*	[mips] Handle 64 bit immediate in and/or/xor pseudo instructions on mips64	Simon Dardis	2017-02-24	3	-15/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously LLVM was assuming 32-bit signed immediates, which caused an 'and' with a bitmask that has bit 31 set to incorrectly include bits 63-32 in the result. After applying this patch I can now compile all of the FreeBSD mips assembly code with clang. This issue also affects the nor, slt and sltu macros and I will fix those in a separate review. Patch By: Alexander Richardson Commit message reformatted by sdardis. Reviewers: atanasyan, theraven, sdardis Differential Revision: https://reviews.llvm.org/D30298 llvm-svn: 296125
*	[ARM] GlobalISel: Select G_STORE	Diana Picus	2017-02-24	1	-16/+20
\| \| \| \| \| \|	Same as selecting G_LOAD. llvm-svn: 296122
*	[ARM] GlobalISel: Add reg bank mappings for stores	Diana Picus	2017-02-24	1	-0/+2
\| \| \| \| \| \|	Same as the ones for loads. llvm-svn: 296115
*	[ARM] GlobalISel: Legalize stores	Diana Picus	2017-02-24	1	-3/+6
\| \| \| \| \| \|	Allow the same types that we allow for loads. llvm-svn: 296108
*	[mips][mc] Fix a crash when disassembling odd sized sections	Simon Dardis	2017-02-24	1	-30/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the MIPS disassembler consistent with the other targets in returning a Size of zero when the input buffer cannot contain an instruction due to its size. Previously it reported the minimum instruction size when it failed because the buffer was not big enough for an instruction, causing llvm-objdump to crash when disassembling all sections. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D29984 llvm-svn: 296105
*	Revert "[ARM] GlobalISel: Legalize stores"	Diana Picus	2017-02-24	1	-5/+3
\| \| \| \| \| \|	This reverts commit r296103 because the test broke on one of the bots. Sorry! llvm-svn: 296104
*	[ARM] GlobalISel: Legalize stores	Diana Picus	2017-02-24	1	-3/+5
\| \| \| \| \| \|	Allow the same types that we allow for loads. llvm-svn: 296103
*	[APInt] Add APInt::setBits() method to set all bits in range	Simon Pilgrim	2017-02-24	3	-9/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current pattern for setting bits in range is typically: Mask \|= APInt::getBitsSet(MaskSizeInBits, LoPos, HiPos); Which can be particularly slow for large APInts (MaskSizeInBits > 64) as they require the allocation of memory for the temporary variable. This is one of the key compile time issues identified in PR32037. This patch adds the APInt::setBits() helper method which avoids the temporary memory allocation completely. This first implementation uses setBit() internally instead, but already significantly reduces the regression in PR32037 (~10% drop). Additional optimization may be possible. I investigated whether there is a need for APInt::clearBits() and APInt::flipBits() equivalents but haven't seen these patterns to be particularly common, though reusing the code would be trivial. Differential Revision: https://reviews.llvm.org/D30265 llvm-svn: 296102
*	Add missing initialization for MachineOptimizationRemarkEmitter	Justin Bogner	2017-02-24	1	-0/+1
\| \| \| \| \| \|	This was missed in r293110. llvm-svn: 296096
*	[WebAssembly] Add a README.txt entry for mergeable sections.	Dan Gohman	2017-02-24	1	-0/+5
\| \| \| \|	llvm-svn: 296095
*	[AVX-512] Separate the fadd/fsub/fmul/fdiv/fmax/fmin with rounding mode ISD opcodes into separate packed and scalar opcodes.	Craig Topper	2017-02-24	5	-26/+38
\| \| \| \| \| \|	This is more consistent with the rest of the ISD opcodes. NFC llvm-svn: 296094
*	[ExecutionDepsFix] Use range-based for loop. NFC	Craig Topper	2017-02-24	1	-2/+1
\| \| \| \|	llvm-svn: 296093
*	[IR][X86] Fix llvm version number in comments in AutoUpgrade. Forgot the next release is 5.0 not 4.1	Craig Topper	2017-02-24	1	-13/+13
\| \| \| \| \| \|	llvm-svn: 296092
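
The APInt::setBits() (r296102) and APInt::extractBits() (r296141) entries above both replace a "build a temporary, then combine" pattern with a direct bit-range operation. The sketch below shows only the underlying bit arithmetic on a single 64-bit word. It is not LLVM's APInt code: the helper names bitsSetMask, setBitsInPlace and extractBitsFromWord are hypothetical, and the half-open [lo, hi) range convention is an assumption made for this example. A single word also cannot show the heap allocation that the commits avoid for values wider than 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM): returns a word with bits [lo, hi)
// set. Precondition: lo <= hi <= 64.
inline std::uint64_t bitsSetMask(unsigned lo, unsigned hi) {
    assert(lo <= hi && hi <= 64);
    if (lo == hi) return 0;  // empty range
    const unsigned width = hi - lo;
    // Shifting a 64-bit value by 64 is undefined, so handle width == 64 apart.
    const std::uint64_t ones =
        width == 64 ? ~std::uint64_t{0} : ((std::uint64_t{1} << width) - 1);
    return ones << lo;
}

// Analogue of the "Mask |= getBitsSet(...)" pattern from r296102, done in
// place on one word.
inline void setBitsInPlace(std::uint64_t &word, unsigned lo, unsigned hi) {
    word |= bitsSetMask(lo, hi);
}

// Analogue of the "Mask.lshr(BitOffset).trunc(SubSizeInBits)" pattern from
// r296141: shift the range down to bit 0, then keep the low numBits bits.
// Precondition: numBits >= 1 and bitPosition + numBits <= 64.
inline std::uint64_t extractBitsFromWord(std::uint64_t word, unsigned numBits,
                                         unsigned bitPosition) {
    assert(numBits >= 1 && bitPosition + numBits <= 64);
    const std::uint64_t shifted = word >> bitPosition;
    return numBits == 64 ? shifted
                         : (shifted & ((std::uint64_t{1} << numBits) - 1));
}
```

Under these assumptions, bitsSetMask(4, 8) yields 0xF0, and extractBitsFromWord(0xABCD, 8, 4) yields 0xBC, since 0xABCD shifted right by 4 is 0xABC and its low 8 bits are 0xBC.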