bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Resubmit "[CodeView] Hook CodeViewRecordIO for reading/writing symbols."	Zachary Turner	2016-12-16	1	-13/+17
\| \| \| \| \| \| \|	The original patch was broken due to some undefined behavior as well as warnings that were triggering -Werror. llvm-svn: 290000
*	[CodeGenPrep] Skip merging empty case blocks	Jun Bum Lim	2016-12-16	1	-31/+137
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a recommit of r287553 after fixing the invalid loop info after eliminating an empty block and unit test failures in AVR and WebAssembly. Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targeting. Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22696 llvm-svn: 289988
*	Inline stripInvariantGroupMetadata out of existence	Sanjoy Das	2016-12-16	1	-7/+2
\| \| \| \| \| \| \|	As a one-liner function, I don't think it is pulling its weight in terms of helping readability. llvm-svn: 289987
*	Revert "[IR] Remove the DIExpression field from DIGlobalVariable."	Adrian Prantl	2016-12-16	5	-130/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 289920 (again). I forgot to implement a Bitcode upgrade for the case where a DIGlobalVariable has no DIExpression. Unfortunately it is not possible to safely upgrade these variables without adding a flag to the bitcode record indicating which version they are. My plan of record is to roll the planned follow-up patch that adds a unit: field to DIGlobalVariable into this patch before recommitting. This way we only need one Bitcode upgrade for both changes (with a version flag in the bitcode record to safely distinguish the record formats). Sorry for the churn! llvm-svn: 289982
*	Revert "[CodeView] Hook CodeViewRecordIO for reading/writing symbols."	Zachary Turner	2016-12-16	1	-17/+13
\| \| \| \| \| \| \|	This reverts commit r289978, which is failing due to some rebase/merge issues. llvm-svn: 289981
*	[CodeView] Hook CodeViewRecordIO for reading/writing symbols.	Zachary Turner	2016-12-16	1	-13/+17
\| \| \| \| \| \| \| \| \|	This is the 3rd of 3 patches to get reading and writing of CodeView symbol and type records to use a single codepath. Differential Revision: https://reviews.llvm.org/D26427 llvm-svn: 289978
*	Implement LaneBitmask::any(), use it to replace !none(), NFCI	Krzysztof Parzyszek	2016-12-16	16	-55/+55
\| \| \| \|	llvm-svn: 289974
*	Fix CodeGenPrepare::stripInvariantGroupMetadata	Sanjoy Das	2016-12-16	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \|	`dropUnknownNonDebugMetadata` takes a list of "known" metadata IDs. The only reason it worked at all is that `getMetadataID` returns something unrelated -- it returns the subclass ID of the receiver (which is used in `dyn_cast` etc.). That does not numerically match `LLVMContext::MD_invariant_group` and ends up dropping `invariant_group` along with every other metadata that does not numerically match `LLVMContext::MD_invariant_group`. llvm-svn: 289973
*	Fix name typo in SelectonDAG	Joel Jones	2016-12-16	1	-4/+4
\| \| \| \|	llvm-svn: 289969
*	Revert "[CodeGenPrep] Skip merging empty case blocks"	Jun Bum Lim	2016-12-16	1	-137/+31
\| \| \| \| \| \|	This reverts commit r289951. llvm-svn: 289960
*	[CodeGenPrep] Skip merging empty case blocks	Jun Bum Lim	2016-12-16	1	-31/+137
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a recommit of r287553 after fixing the invalid loop info after eliminating an empty block. Summary: Merging an empty case block into the header block of switch could cause ISel to add COPY instructions in the header of switch, instead of the case block, if the case block is used as an incoming block of a PHI. This could potentially increase dynamic instructions, especially when the switch is in a loop. I added a test case which was reduced from the benchmark I was targeting. Reviewers: t.p.northover, mcrosier, manmanren, wmi, joerg, davidxl Subscribers: joerg, qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22696 llvm-svn: 289951
*	[MIRParser] Add parsing hex literals of arbitrary size as unsigned integers	Krzysztof Parzyszek	2016-12-16	1	-13/+38
\| \| \| \| \| \|	The current code does not parse hex literals larger than 32-bit. llvm-svn: 289943
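A hex literal wider than one machine word can be parsed by filling fixed-size words from the least significant digit upward. The following is a minimal sketch of that idea in plain C++, not the MIRParser code from this commit; `parseHexLiteral` and its word-vector result are hypothetical stand-ins for whatever arbitrary-precision type the parser actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of LLVM): parses "0x..." of any length.
// On success, Words holds the value as little-endian 64-bit words
// (Words[0] is the least significant). Returns false on malformed input.
bool parseHexLiteral(const std::string &Text, std::vector<uint64_t> &Words) {
  if (Text.size() < 3 || Text[0] != '0' || (Text[1] != 'x' && Text[1] != 'X'))
    return false;

  Words.assign(1, 0);
  unsigned Shift = 0; // bit position of the next digit in the current word

  // Walk the digits from the rightmost (least significant) to the leftmost.
  for (std::size_t I = Text.size(); I-- > 2;) {
    const char C = Text[I];
    uint64_t Digit;
    if (C >= '0' && C <= '9')
      Digit = static_cast<uint64_t>(C - '0');
    else if (C >= 'a' && C <= 'f')
      Digit = static_cast<uint64_t>(C - 'a' + 10);
    else if (C >= 'A' && C <= 'F')
      Digit = static_cast<uint64_t>(C - 'A' + 10);
    else
      return false;

    // The current word is full after 16 hex digits; start a new one.
    if (Shift == 64) {
      Words.push_back(0);
      Shift = 0;
    }
    Words.back() |= Digit << Shift;
    Shift += 4;
  }
  return true;
}
```

A 17-digit literal therefore spills its top digit into a second word instead of being truncated.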
*	[ARM] GlobalISel: Select add i32, i32	Diana Picus	2016-12-16	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add the minimal support necessary to select a function that returns the sum of two i32 values. This includes some support for argument/return lowering of i32 values through registers, as well as the handling of copy and add instructions throughout the GlobalISel pipeline. Differential Revision: https://reviews.llvm.org/D26677 llvm-svn: 289940
*	[codegen] Add generic functions to skip debug values.	Florian Hahn	2016-12-16	5	-75/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This commit moves skipDebugInstructionsForward and skipDebugInstructionsBackward from lib/CodeGen/IfConversion.cpp to include/llvm/CodeGen/MachineBasicBlock.h and updates some codegen files to use them. This refactoring was suggested in https://reviews.llvm.org/D27688 and I thought it's best to do the refactoring in a separate review, but I could also put both changes in a single review if that's preferred. Also, the names for the functions aren't the snappiest and I would be happy to rename them if anybody has suggestions. Reviewers: eli.friedman, iteratee, aprantl, MatzeB Subscribers: MatzeB, llvm-commits Differential Revision: https://reviews.llvm.org/D27782 llvm-svn: 289933
*	[IR] Remove the DIExpression field from DIGlobalVariable.	Adrian Prantl	2016-12-16	5	-85/+130
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289920
*	Add extra headers that got deleted by my revert in r289916 but for which	Chandler Carruth	2016-12-16	1	-1/+2
\| \| \| \| \| \|	new usage had already grown in the file. llvm-svn: 289917
*	Revert patch series introducing the DAG combine to match a load-by-bytes	Chandler Carruth	2016-12-16	1	-283/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	idiom. r289538: Match load by bytes idiom and fold it into a single load r289540: Fix a buildbot failure introduced by r289538 r289545: Use more detailed assertion messages in the code ... r289646: Add a couple of assertions to the load combine code ... This DAG combine has a bad crash in it that is quite hard to trigger sadly -- it relies on sneaking code with UB through the SDAG build and into this particular combine. I've responded to the original commit with a test case that reproduces it. However, the code also has other problems that will require substantial changes to address and so I'm going ahead and reverting it for now. This should unblock us and perhaps others that are hitting the crash in the wild and will let a fresh patch with updated approach come in cleanly afterward. Sorry for any trouble or disruption! llvm-svn: 289916
*	Revert "[IR] Remove the DIExpression field from DIGlobalVariable."	Adrian Prantl	2016-12-16	5	-130/+85
\| \| \| \| \| \|	This reverts commit 289902 while investigating bot breakage. llvm-svn: 289906
*	[IR] Remove the DIExpression field from DIGlobalVariable.	Adrian Prantl	2016-12-16	5	-85/+130
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289902
*	DebugInfo: Address non-deterministic output (iterating a SmallPtrSet) in 289697	David Blaikie	2016-12-15	3	-9/+5
\| \| \| \| \| \| \| \|	Post-commit review feedback from Adrian Prantl. Hopefully this fixes that up :) llvm-svn: 289892
*	[IRTranslator] Merge the entry and ABI lowering blocks.	Quentin Colombet	2016-12-15	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The IRTranslator uses an additional block before the LLVM-IR entry block to perform all the ABI lowering and the constant hoisting. Thus, this block is the actual entry block and it falls through the LLVM-IR entry block. However, with such representation, we end up with two basic blocks that are not maximal. Therefore, this patch adds a bit of canonicalization by merging both the LLVM-IR entry block and the ABI lowering/constants hoisting into one block, making the resulting block more likely to be maximal (indeed the LLVM-IR entry block might not have been maximal). llvm-svn: 289891
*	DebugInfo: Emit ranges for functions with DISubprograms but lacking ↵	David Blaikie	2016-12-15	3	-29/+20
\| \| \| \| \| \| \| \| \|	locations on any instructions This seems more consistent, and helps tidy up/simplify some other code in this change. llvm-svn: 289889
*	Don't combine splats with other shuffles.	Eli Friedman	2016-12-15	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	We sometimes end up creating shuffles which are worse than the obvious translation of the IR. Fixes https://llvm.org/bugs/show_bug.cgi?id=31301 . Differential Revision: https://reviews.llvm.org/D27793 llvm-svn: 289882
*	Don't combine a shuffle of two BUILD_VECTORs with duplicate elements.	Eli Friedman	2016-12-15	1	-10/+23
\| \| \| \| \| \| \| \| \| \| \| \| \|	Targets can't handle this case well in general; we often transform a shuffle of two cheap BUILD_VECTORs to element-by-element insertion, which is very inefficient. Fixes https://llvm.org/bugs/show_bug.cgi?id=31364 . Partially fixes https://llvm.org/bugs/show_bug.cgi?id=31301. Differential Revision: https://reviews.llvm.org/D27787 llvm-svn: 289874
*	[LiveRangeEdit] Change eliminateDeadDef assert to if condition.	Geoff Berry	2016-12-15	1	-4/+5
\| \| \| \| \| \| \| \| \| \|	The assert could potentially fire (though no cases have been encountered), so just check that the instruction we're handling specially for rematerialization only has one def to begin with. Reviewed by Wei Mi over email. llvm-svn: 289861
*	Extract LaneBitmask into a separate type	Krzysztof Parzyszek	2016-12-15	23	-195/+209
\| \| \| \| \| \| \| \| \| \| \| \|	Specifically avoid implicit conversions from/to integral types to avoid potential errors when changing the underlying type. For example, a typical initialization of a "full" mask was "LaneMask = ~0u", which would result in a value of 0x00000000FFFFFFFF if the type was extended to uint64_t. Differential Revision: https://reviews.llvm.org/D27454 llvm-svn: 289820
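The widening pitfall described in this commit message can be reproduced in plain C++. The sketch below is an illustration only and is not LLVM's `LaneBitmask`; `fullMaskFromUnsigned` and `StrongMask` are hypothetical names, and the first `static_assert` assumes a platform where `unsigned` is 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration, not LLVM code: "~0u" has type unsigned, so
// converting it to uint64_t zero-extends and sets only the low 32 bits
// (assuming a 32-bit unsigned).
constexpr uint64_t fullMaskFromUnsigned() { return ~0u; }

// A wrapper whose constructor is explicit rejects "StrongMask M = ~0u;"
// at compile time, so callers must use a named factory such as getAll().
struct StrongMask {
  uint64_t Bits;
  constexpr explicit StrongMask(uint64_t B) : Bits(B) {}
  static constexpr StrongMask getAll() { return StrongMask(~uint64_t(0)); }
  constexpr bool all() const { return Bits == ~uint64_t(0); }
};

// The "full" mask built from ~0u is only half full after widening.
static_assert(fullMaskFromUnsigned() == 0x00000000FFFFFFFFull,
              "assumes 32-bit unsigned");
static_assert(StrongMask::getAll().all(), "named factory sets every bit");
```

Making the conversion explicit turns a silent value change into a compile error at every site that still relies on an integer literal.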
*	[ARM] Implement execute-only support in CodeGen	Prakhar Bahuguna	2016-12-15	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This implements execute-only support for ARM code generation, which prevents the compiler from generating data accesses to code sections. The following changes are involved: * Add the CodeGen option "-arm-execute-only" to the ARM code generator. * Add the clang flag "-mexecute-only" as well as the GCC-compatible alias "-mpure-code" to enable this option. * When enabled, literal pools are replaced with MOVW/MOVT instructions, with VMOV used in addition for floating-point literals. As the MOVT instruction is required, execute-only support is only available in Thumb mode for targets supporting ARMv8-M baseline or Thumb2. * Jump tables are placed in data sections when in execute-only mode. * The execute-only text section is assigned section ID 0, and is marked as unreadable with the SHF_ARM_PURECODE flag with symbol 'y'. This also overrides selection of ELF sections for globals. llvm-svn: 289784
*	Trying to fix NDEBUG build after r289764	Hal Finkel	2016-12-15	1	-0/+2
\| \| \| \|	llvm-svn: 289766
*	[MachineBlockPlacement] Don't make blocks "uneditable"	Sanjoy Das	2016-12-15	2	-7/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This fixes an issue with MachineBlockPlacement due to a badly timed call to `analyzeBranch` with `AllowModify` set to true. The timeline is as follows: 1. `MachineBlockPlacement::maybeTailDuplicateBlock` calls `TailDup.shouldTailDuplicate` on its argument, which in turn calls `analyzeBranch` with `AllowModify` set to true. 2. This `analyzeBranch` call edits the terminator sequence of the block based on the physical layout of the machine function, turning an unanalyzable non-fallthrough block to a unanalyzable fallthrough block. Normally MBP bails out of rearranging such blocks, but this block was unanalyzable non-fallthrough (and thus rearrangeable) the first time MBP looked at it, and so it goes ahead and decides where it should be placed in the function. 3. When placing this block MBP fails to analyze and thus update the block in keeping with the new physical layout. Concretely, before (1) we have something like: ``` LBL0: < unknown terminator op that may branch to LBL1 > jmp LBL1 LBL1: ... A LBL2: ... B ``` In (2), analyze branch simplifies this to ``` LBL0: < unknown terminator op that may branch to LBL2 > ;; jmp LBL1 <- redundant jump removed LBL1: ... A LBL2: ... B ``` In (3), MachineBlockPlacement goes ahead with its plan of putting LBL2 after the first block since that is profitable. ``` LBL0: < unknown terminator op that may branch to LBL2 > ;; jmp LBL1 <- redundant jump LBL2: ... B LBL1: ... A ``` and the program now has incorrect behavior (we no longer fall-through from `LBL0` to `LBL1`) because MBP can no longer edit LBL0. There are several possible solutions, but I went with removing the teeth off of the `analyzeBranch` calls in TailDuplicator. 
That makes thinking about the result of these calls easier, and breaks nothing in the lit test suite. I've also added some bookkeeping to the MachineBlockPlacement pass and used that to write an assert that would have caught this. Reviewers: chandlerc, gberry, MatzeB, iteratee Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D27783 llvm-svn: 289764
*	[DAG] allow more select folding for targets that have 'and not' (PR31175)	Sanjay Patel	2016-12-14	1	-6/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original motivation for this patch comes from wanting to canonicalize more IR to selects and also canonicalizing min/max. If we're going to do that, we need more backend fixups to undo select codegen when simpler ops will do. I chose AArch64 for the tests because that shows the difference in the simplest way. This should fix: https://llvm.org/bugs/show_bug.cgi?id=31175 Differential Revision: https://reviews.llvm.org/D27489 llvm-svn: 289738
*	DebugInfo: Improve type safety and simplify some subprogram finalization code	David Blaikie	2016-12-14	2	-11/+9
\| \| \| \| \| \| \|	This probably ended up this way after the subprogram<>function link inversion and debug info metadata schema changes. llvm-svn: 289697
*	[WinEH] Avoid holding references to BlockColor (DenseMap) entries while ↵	Andrew Kaylor	2016-12-14	1	-1/+5
\| \| \| \| \| \| \| \|	inserting new elements Differential Revision: https://reviews.llvm.org/D27693 llvm-svn: 289694
*	Revert "In visitSTORE, always use FindBetterChain, rather than only when ↵	Nirav Dave	2016-12-14	2	-229/+279
\| \| \| \| \| \| \| \| \| \|	UseAA is enabled." Reverting due to ARM MCJIT and MIPS LLD error. This reverts commit r289659. llvm-svn: 289667
*	In visitSTORE, always use FindBetterChain, rather than only when UseAA is ↵	Nirav Dave	2016-12-14	2	-279/+229
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	enabled. Retrying after fixing after removing load-store factoring through token factors in favor of improved token factor operand pruning Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. 
CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 llvm-svn: 289659
*	[DAGCombiner] Try to use SelectionDAG::isKnownToBeAPowerOfTwo instead of ↵	Simon Pilgrim	2016-12-14	2	-30/+63
\| \| \| \| \| \| \| \| \| \| \| \|	just APInt::isPowerOf2 Generalize sdiv/udiv/srem/urem combines using APInt::isPowerOf2, which only works for const/splat-const values, to call SelectionDAG::isKnownToBeAPowerOfTwo instead which recognises many more cases. Added a DAGCombiner::BuildLogBase2 helper since PowerOf2 combines often involve taking the log2 of such a value. Differential Revision: https://reviews.llvm.org/D27714 llvm-svn: 289654
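The combines mentioned here rest on a scalar identity: for a power-of-two divisor, unsigned division is a right shift by the base-2 logarithm and unsigned remainder is a mask. The sketch below shows that identity on plain integers; it is not DAGCombiner code, and `isPowerOf2`, `logBase2`, `udivByPow2` and `uremByPow2` are hypothetical free functions written for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// True when exactly one bit of V is set.
bool isPowerOf2(uint32_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Index of the highest set bit; for a power of two this is its log2.
unsigned logBase2(uint32_t V) {
  unsigned Log = 0;
  while (V >>= 1)
    ++Log;
  return Log;
}

// X / D for a power-of-two D, expressed as a shift.
uint32_t udivByPow2(uint32_t X, uint32_t D) {
  assert(isPowerOf2(D) && "divisor must be a power of two");
  return X >> logBase2(D);
}

// X % D for a power-of-two D, expressed as a mask of the low bits.
uint32_t uremByPow2(uint32_t X, uint32_t D) {
  assert(isPowerOf2(D) && "divisor must be a power of two");
  return X & (D - 1);
}
```

The constant-only check corresponds to testing a literal divisor; the generalization in the commit asks the same question of values that are only known to be powers of two by analysis.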
*	Replace APFloatBase static fltSemantics data members with getter functions	Stephan Bergmann	2016-12-14	6	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \|	At least the plugin used by the LibreOffice build (<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly uses those members (through inline functions in LLVM/Clang include files in turn using them), but they are not exported by utils/extract_symbols.py on Windows, and accessing data across DLL/EXE boundaries on Windows is generally problematic. Differential Revision: https://reviews.llvm.org/D26671 llvm-svn: 289647
*	Add a couple of assertions to the load combine code introduced by r289538	Artur Pilipenko	2016-12-14	1	-1/+5
\| \| \| \|	llvm-svn: 289646
*	[DWARF] Preserve column number when emitting 'line 0' record	Paul Robinson	2016-12-14	1	-4/+9
\| \| \| \| \| \| \| \|	Follow-up to r289256, address a FIXME to avoid resetting the column number. This reduced .debug_line by 2.6% in a RelWithDebInfo self-build of clang. llvm-svn: 289620
*	Generalize strided store pattern in interleave access pass	Alina Sbirlea	2016-12-13	1	-16/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch aims to generalize matching of the strided store accesses to more general masks. The more general rule is to have consecutive accesses based on the stride: [x, y, ... z, x+1, y+1, ...z+1, x+2, y+2, ...z+2, ...] All elements in the masks need not form a contiguous space, there may be gaps. As before, undefs are allowed and filled in with adjacent element loads. Reviewers: HaoLiu, mssimpso Subscribers: mkuper, delena, llvm-commits Differential Revision: https://reviews.llvm.org/D23646 llvm-svn: 289573
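The rule `[x, y, ... z, x+1, y+1, ...z+1, ...]` can be checked lane by lane: every element at position `I * Factor + J` must equal the lane's starting index plus `I`, and undefined elements are skipped. The function below is a simplified sketch of that check, not the interleaved access pass itself; `isStridedStoreMask` is a hypothetical name and a negative mask element is assumed to denote undef.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical checker (not LLVM code). Mask is a shuffle mask, Factor the
// interleave factor. Negative elements are treated as undef and ignored.
bool isStridedStoreMask(const std::vector<int> &Mask, std::size_t Factor) {
  if (Factor < 2 || Mask.empty() || Mask.size() % Factor != 0)
    return false;

  const std::size_t NumGroups = Mask.size() / Factor;
  for (std::size_t J = 0; J < Factor; ++J) {
    bool HaveBase = false;
    int Base = 0; // starting index (x, y, ... z) of lane J
    for (std::size_t I = 0; I < NumGroups; ++I) {
      const int Elt = Mask[I * Factor + J];
      if (Elt < 0)
        continue; // undef: compatible with any start
      const int Start = Elt - static_cast<int>(I);
      if (Start < 0)
        return false;
      if (!HaveBase) {
        Base = Start;
        HaveBase = true;
      } else if (Start != Base) {
        return false; // lane J is not consecutive
      }
    }
  }
  return true;
}
```

Because each lane only needs its own consecutive run, the starting indices of different lanes may leave gaps between them, which is the generalization the summary describes.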
*	Use more detailed assertion messages in the code introduced by r289538	Artur Pilipenko	2016-12-13	1	-4/+8
\| \| \| \|	llvm-svn: 289545
*	Fix a buildbot failure introduced by r289538	Artur Pilipenko	2016-12-13	1	-2/+1
\| \| \| \| \| \|	Build failed because of unused variable in product mode. llvm-svn: 289540
*	[DAGCombiner] Match load by bytes idiom and fold it into a single load	Artur Pilipenko	2016-12-13	1	-0/+276
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the target supports it. Assuming little endian target: i8 a = ... i32 val = a[0] \| (a[1] << 8) \| (a[2] << 16) \| (a[3] << 24) => i32 val = ((i32)a) i8 a = ... i32 val = (a[0] << 24) \| (a[1] << 16) \| (a[2] << 8) \| a[3] => i32 val = BSWAP(((i32)a)) This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in the presence of atomic loads, load widening is an irreversible transformation and it might hinder other optimizations. Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part: i32 val = a[i] \| (a[i + 1] << 8) \| (a[i + 2] << 16) \| (a[i + 3] << 24) Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses. The general scheme is to match OR expressions by recursively calculating the origin of individual bits which constitute the resulting OR value. If all the OR bits come from memory, verify that they are adjacent and match the little or big endian encoding of a wider value. If so, and the load of the wider type (and bswap if needed) is allowed by the target, generate a load and a bswap if needed. Reviewed By: hfinkel, RKSimon, filcab Differential Revision: https://reviews.llvm.org/D26149 llvm-svn: 289538
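The two source patterns from this commit message can be written out in C++ to show why the second one is the byte-swapped form of the first. This is a sketch of the idiom being matched, not of the DAG combine; `loadBytesLE`, `loadBytesBE` and `byteSwap32` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// First pattern: byte 0 is the least significant byte of the result.
uint32_t loadBytesLE(const uint8_t *A) {
  return uint32_t(A[0]) | (uint32_t(A[1]) << 8) | (uint32_t(A[2]) << 16) |
         (uint32_t(A[3]) << 24);
}

// Second pattern: byte 0 is the most significant byte of the result.
uint32_t loadBytesBE(const uint8_t *A) {
  return (uint32_t(A[0]) << 24) | (uint32_t(A[1]) << 16) |
         (uint32_t(A[2]) << 8) | uint32_t(A[3]);
}

// Portable 32-bit byte swap, standing in for a BSWAP operation.
uint32_t byteSwap32(uint32_t V) {
  return (V >> 24) | ((V >> 8) & 0x0000FF00u) | ((V << 8) & 0x00FF0000u) |
         (V << 24);
}
```

On a little-endian target the first function computes what a single 32-bit load would produce, and the second is that load followed by a byte swap, which is the pair of rewrites listed in the message.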
*	Move BaseIndexOffset in DAGCombiner.cpp so it will be available for the ↵	Artur Pilipenko	2016-12-13	1	-104/+104
\| \| \| \| \| \|	upcoming user llvm-svn: 289537
*	[SelectionDAG] computeKnownBits - simplified knownbits sign extension. NFCI.	Simon Pilgrim	2016-12-13	1	-13/+4
\| \| \| \| \| \|	We don't need to extract+test the sign bit of the known ones/zeros, we can use sext which will handle all of this. llvm-svn: 289534
*	[GlobalISel] Move extendRegister where it belongs. NFCI	Diana Picus	2016-12-13	1	-0/+29
\| \| \| \| \| \|	Apparently I missed this one when I moved ValueHandler back in r288658. Sorry! llvm-svn: 289528
*	[peephole] Enhance folding logic to work for STATEPOINTs	Philip Reames	2016-12-13	1	-9/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The general idea here is to get enough of the existing restrictions out of the way that the already existing folding logic in foldMemoryOperand can kick in for STATEPOINTs and fold references to immutable stack slots. The key changes are: support for folding multiple operands at once which reference the same load; support for folding multiple loads into a single instruction; walk all the operands of the instruction for variadic instructions (this is a bug fix!). Once this lands, I'll post another patch which refactors the TII interface here. There's nothing actually x86 specific about the x86 code used here. Differential Revision: https://reviews.llvm.org/D24103 llvm-svn: 289510
*	[Statepoints] Reuse stack slots more than once within a basic block	Philip Reames	2016-12-13	1	-4/+9
\| \| \| \| \| \| \| \| \| \|	The stack slot reuse code had a really amusing bug. We ended up only reusing a stack slot exactly once (initial use + reuse) within a basic block. If we had a third statepoint to process, we ended up allocating a new set of stack slots. If we crossed a basic block boundary, the set got cleared. As a result, code which is invoke heavy doesn't see the problem, but multiple calls within a basic block do. Net result: as we optimize invokes into calls, lowering gets worse. The root error here is that the bitmap used by the custom allocator wasn't kept in sync. The result was that we ended up resizing the bitmap on the next statepoint (to handle the cross block case), reset the bit once, but then never reset it again. Differential Revision: https://reviews.llvm.org/D25243 llvm-svn: 289509
*	Avoid infinite loops in branch folding	Andrew Kaylor	2016-12-12	1	-1/+13
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D27582 llvm-svn: 289486
*	Recommit r288212: Emit 'no line' information for interesting 'orphan' ↵	Paul Robinson	2016-12-12	3	-16/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	instructions. DWARF specifies that "line 0" really means "no appropriate source location" in the line table. By default, use this for branch targets and some other cases that have no specified source location, to prevent inheriting unfortunate line numbers from physically preceding instructions (which might be from completely unrelated source). Updated patch allows enabling or suppressing this behavior for all unspecified source locations. Differential Revision: http://reviews.llvm.org/D24180 llvm-svn: 289468
*	[LiveRangeEdit] Add assert string and descriptive comment.	Geoff Berry	2016-12-12	1	-1/+3
\| \| \| \|	llvm-svn: 289456