summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [AVX512] Fix load opcode for fast isel.Igor Breger2016-06-071-1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D21067 llvm-svn: 272006
* [PowerPC] Support multiple return values with fast iselUlrich Weigand2016-06-071-1/+1
| | | | | | | | | | | | | | | Using an LLVM IR aggregate return value type containing three or more integer values causes an abort in the fast isel pass. This patch adds two more registers to RetCC_PPC64_ELF_FIS to allow returning up to four integers with fast isel, just the same as is currently supported with regular isel (RetCC_PPC). This is needed for Swift and (possibly) other non-clang frontends. Fixes PR26190. llvm-svn: 272005
* [X86][SSE] Improved blend+zero target shuffle combining to use combined ↵Simon Pilgrim2016-06-071-7/+11
| | | | | | | | | | shuffle mask directly We currently only combine to blend+zero if the target value type has 8 elements or less, but this was missing a lot of cases where the combined mask had been widened. This change makes it so we use the combined mask to determine the blend value type, allowing us to catch more widened cases. llvm-svn: 272003
* [ARM] Shrink post-indexed LDR and STR to LDM/STMJames Molloy2016-06-071-0/+42
| | | | | | | | | | | | | | A Thumb-2 post-indexed LDR instruction such as: ldr.w r0, [r1], #4 Can be rewritten as: ldm.n r1!, {r0} LDMs can be more expensive than LDRs on some cores, so this has been enabled only in minsize mode. llvm-svn: 272002
* [ARM] Transform LDMs into writeback form to save code sizeJames Molloy2016-06-071-3/+23
| | | | | | | | | | | | | | If we have an LDM that uses only low registers and doesn't write to its base register: ldm.w r0, {r1, r2, r3} And that base register is dead after the LDM, then we can convert it to writeback form and use a narrow encoding: ldm.n r0!, {r1, r2, r3} Obviously, this introduces a new register write and so can cause WAW hazards, so I've enabled it only in minsize mode. This is a code size trick that ARM Compiler 5 ("armcc") does that we don't. llvm-svn: 272000
* [ARM] Incorrect relocation type for Thumb2 B<cond>.wPeter Smith2016-06-071-0/+2
| | | | | | | | | | | | | | | | | The Thumb2 conditional branch B<cond>.W has a different encoding (T3) to the unconditional branch B.W (T4) as it needs to record <cond>. As the encoding is different the B<cond>.W is given a different relocation type. ELF for the ARM Architecture 4.6.1.6 (Table-13) states that R_ARM_THM_JUMP19 should be used for B<cond>.W. At present the MC layer is using the R_ARM_THM_JUMP24 from B.W. This change makes B<cond>.W use R_ARM_THM_JUMP19 and alters the existing test that checks for R_ARM_THM_JUMP24 to expect R_ARM_THM_JUMP19. llvm-svn: 271997
* [InstCombine][AVX2] Add support for simplifying AVX2 per-element shifts to ↵Simon Pilgrim2016-06-071-0/+125
| | | | | | | | | | | | | | | | | | native shifts Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit). If the shift amount is constant we can sometimes convert these instructions to native shifts: 1 - if all shift amounts are in range then the conversion is trivial. 2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion. 3 - logical shifts just return zero if all elements have out of range shift amounts. In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts. Differential Revision: http://reviews.llvm.org/D19675 llvm-svn: 271996
* [InstCombine][SSE] Add MOVMSK constant folding (PR27982)Simon Pilgrim2016-06-071-0/+51
| | | | | | | | | | This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions. The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs. Differential Revision: http://reviews.llvm.org/D20998 llvm-svn: 271990
* [AVX512] Allow avx2 and sse41 nontemporal load intrinsics to select EVEX ↵Craig Topper2016-06-072-11/+13
| | | | | | encoded instructions when VLX is enabled. llvm-svn: 271988
* [AVX512] Remove unnecessary mayLoad, mayStore, hasSidEffects flags from ↵Craig Topper2016-06-072-580/+488
| | | | | | instructions that have patterns that imply them. Add the same set of flags to instructions that don't have patterns to imply them. llvm-svn: 271987
* [AVX512] Add NoVLX to a couple patterns that have VLX equivalents. Ordering ↵Craig Topper2016-06-071-1/+1
| | | | | | of the patterns in the .td file protects this, but its better to be explicit. llvm-svn: 271986
* [pdb] Use MappedBlockStream to parse the PDB directory.Zachary Turner2016-06-0711-125/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to efficiently write PDBs, we need to be able to make a StreamWriter class similar to a StreamReader, which can transparently deal with writing to discontiguous streams, and we need to use this for all writing, similar to how we use StreamReader for all reading. Most discontiguous streams are the typical numbered streams that appear in a PDB file and are described by the directory, but the exception to this, that until now has been parsed by hand, is the directory itself. MappedBlockStream works by querying the directory to find out which blocks a stream occupies and various other things, so naturally the same logic could not possibly work to describe the blocks that the directory itself resided on. To solve this, I've introduced an abstraction IPDBStreamData, which allows the client to query for the list of blocks occupied by the stream, as well as the stream length. I provide two implementations of this: one which queries the directory (for indexed streams), and one which queries the super block (for the directory stream). This has the side benefit of vastly simplifying the code to parse the directory. Whereas before a mini state machine was rolled by hand, now we simply use FixedStreamArray to read out the stream sizes, then build a vector of FixedStreamArrays for the stream map, all in just a few lines of code. Reviewed By: ruiu Differential Revision: http://reviews.llvm.org/D21046 llvm-svn: 271982
* [LibFuzzer] s/dataflow sanitizer/DataflowSanitizer/Dan Liew2016-06-071-2/+2
| | | | llvm-svn: 271980
* [LibFuzzer] Disable building and running LSan tests on Apple platforms ↵Dan Liew2016-06-074-0/+18
| | | | | | | | because LSan is not currently supported. Differential Revision: http://reviews.llvm.org/D20947 llvm-svn: 271979
* ARM: correct TLS access on WoASaleem Abdulrasool2016-06-075-4/+25
| | | | | | | | | | | | TLS access requires an offset from the TLS index. The index itself is the section-relative distance of the symbol. For ARM, the relevant relocation (IMAGE_REL_ARM_SECREL) is applied as a constant. This means that the value may not be an immediate and must be lowered into a constant pool. This offset will not be base relocated. We were previously emitting the actual address of the symbol which would be base relocated and would therefore be the vaue offset by the ImageBase + TLS Offset. llvm-svn: 271974
* ARM: clang-format a couple of switches, add commentsSaleem Abdulrasool2016-06-073-15/+25
| | | | | | | clang-format a couple of switches in preparation for a future change. Add some enumeration comments llvm-svn: 271973
* ARM: normalise space in the patternsSaleem Abdulrasool2016-06-071-8/+7
| | | | | | Just adjust the whitespace for the selection patterns. NFC. llvm-svn: 271972
* Add comments.Rui Ueyama2016-06-071-0/+2
| | | | llvm-svn: 271967
* Re-land "[codeview] Emit information about global variables"Reid Kleckner2016-06-072-21/+101
| | | | | | | | | | | | This reverts commit r271962 and reinstantes r271957. MSVC's linker doesn't appear to like it if you have an empty symbol substream, so only open a symbol substream if we're going to emit something about globals into it. Makes check-asan pass. llvm-svn: 271965
* Try one more time to pacify -Wpessimizing-move, MSVC, libstdc++4.7, and the ↵Reid Kleckner2016-06-061-2/+1
| | | | | | world without a named variable llvm-svn: 271964
* Revert "[codeview] Emit information about global variables"Reid Kleckner2016-06-062-95/+21
| | | | | | This reverts commit r271957, it broke check-asan on Windows. llvm-svn: 271962
* [InstCombine] scalarizePHI should not assume the code it sees has been CSE'dMichael Kuperstein2016-06-061-12/+26
| | | | | | | | | | | | | | scalarizePHI only looked for phis that have exactly two uses - the "latch" use, and an extract. Unfortunately, we can not assume all equivalent extracts are CSE'd, since InstCombine itself may create an extract which is a duplicate of an existing one. This extends it to handle several distinct extracts from the same index. This should fix at least some of the performance regressions from PR27988. Differential Revision: http://reviews.llvm.org/D20983 llvm-svn: 271961
* Attempt to work around lack of std::map::emplace in libstdc++4.7Reid Kleckner2016-06-061-1/+2
| | | | llvm-svn: 271958
* [codeview] Emit information about global variablesReid Kleckner2016-06-062-21/+95
| | | | | | | This currently emits everything as S_GDATA32, which isn't right for things like thread locals, but it's a start. llvm-svn: 271957
* Verifier: Simplify and fix issue where we were not verifying unmaterialized ↵Peter Collingbourne2016-06-061-15/+13
| | | | | | | | | | | | | | functions. Arrange to call verify(Function &) on each function, followed by verify(Module &), whether the verifier is being used from the pass or from verifyModule(). As a side effect, this fixes an issue that caused us not to call verify(Function &) on unmaterialized functions from verifyModule(). Differential Revision: http://reviews.llvm.org/D21042 llvm-svn: 271956
* [pdbdump] Verify the size of TPI hash records.Rui Ueyama2016-06-061-0/+5
| | | | llvm-svn: 271954
* Verifier: Remove dead code.Peter Collingbourne2016-06-061-14/+8
| | | | | | | Remove previously unreachable code that verifies that a function definition has an entry block. By definition, a function definition has at least one block. llvm-svn: 271948
* [LibFuzzer] Provide stub implementation of __sanitizer_cov_trace_pc_indirDan Liew2016-06-061-1/+9
| | | | | | | | | | | | | Calls to this function are currently injected by the ``SanitizerCoverageModule`` pass when the both the ``indirect-calls`` and ``trace-pc`` sanitizer coverage options are enabled and the code being instrumented has indirect calls. Previously because LibFuzzer did not define this function this would lead to link errors when building some of the tests on OSX. Differential Revision: http://reviews.llvm.org/D20946 llvm-svn: 271938
* AMDGPU: Add function for getting instruction sizeMatt Arsenault2016-06-062-0/+51
| | | | llvm-svn: 271936
* AMDGPU: Fix constantexpr addrspacecastsMatt Arsenault2016-06-062-4/+71
| | | | | | | | If we had a constant group address space cast the queue pointer wasn't enabled for the function, resulting in a crash on noreg later. llvm-svn: 271935
* [PM] Preserve the correct set of analyses for GVN.Davide Italiano2016-06-061-1/+6
| | | | llvm-svn: 271934
* [GVN] Switch dump() definition over to LLVM_DUMP_METHOD.Davide Italiano2016-06-061-2/+1
| | | | llvm-svn: 271932
* [LoopUnrollAnalyzer] Fix a crash in analyzeLoopUnrollCost.Michael Zolotukhin2016-06-061-3/+5
| | | | | | | | | | | In some cases, when simplifying with SCEV, we might consider pointer values as just usual integer values. Thus, we might get a different type from what we had originally in the map of simplified values, and hence we need to check types before operating on the values. This fixes PR28015. llvm-svn: 271931
* Reapply [LSR] Create fewer redundant instructions.Geoff Berry2016-06-061-20/+22
| | | | | | | | | | | | | | | | | | | Summary: Fix LSRInstance::HoistInsertPosition() to check the original insert position block first for a canonical insertion point that is dominated by all inputs. This leads to SCEV being able to reuse more instructions since it currently tracks the instructions it creates for reuse by keeping a table of <Value, insert point> pairs. Originally reviewed in http://reviews.llvm.org/D18001 Reviewers: atrick Subscribers: llvm-commits, mzolotukhin, mcrosier Differential Revision: http://reviews.llvm.org/D18480 llvm-svn: 271929
* [pdbdump] Print out New FPO stream contents.Rui Ueyama2016-06-061-1/+24
| | | | | | | | | | The data strucutre in the new FPO stream is described in the PE/COFF spec. There is one record per function if frame pointer is omitted. Differential Revision: http://reviews.llvm.org/D20999 llvm-svn: 271926
* [MBP] Reduce code size by running tail merging in MBP.Haicheng Wu2016-06-063-25/+92
| | | | | | | | | | | | | The code layout that TailMerging (inside BranchFolding) works on is not the final layout optimized based on the branch probability. Generally, after BlockPlacement, many new merging opportunities emerge. This patch calls Tail Merging after MBP and calls MBP again if Tail Merging merges anything. Differential Revision: http://reviews.llvm.org/D20276 llvm-svn: 271925
* [BranchFolding] Replace MachineBlockFrequencyInfo with MBFIWrapper. NFC.Haicheng Wu2016-06-063-9/+29
| | | | | | Differential Revision: http://reviews.llvm.org/D20184 llvm-svn: 271923
* [cpu-detection] Substantial refactor of Host CPU detection code (x86)Alina Sbirlea2016-06-061-261/+650
| | | | | | | | | | | | | | | | | | Summary: Following D20970 (committed as r271726). This is a substantial refactoring of the host CPU detection code. There is no functionality change intended, but the changes are extensive. Definitions of architecture types and subtypes are by no means exhaustive or perfectly defined, but a fair starting point. Suggestions for futher improvements are welcome. Reviewers: llvm-commits Differential Revision: http://reviews.llvm.org/D20988 llvm-svn: 271921
* [InstCombine] limit icmp transform to ConstantInt (PR28011)Sanjay Patel2016-06-061-3/+5
| | | | | | | | | | | | | | | In r271810 ( http://reviews.llvm.org/rL271810 ), I loosened the check above this to work for any Constant rather than ConstantInt. AFAICT, that part makes sense if we can determine that the shrunken/extended constant remained equal. But it doesn't make sense for this later transform where we assume that the constant DID change. This could assert for a ConstantExpr: https://llvm.org/bugs/show_bug.cgi?id=28011 And it could be wrong for a vector as shown in the added regression test. llvm-svn: 271908
* [AMDGPU][llvm-mc] v_cndmask_b32: src2 is mandatory; do not enforce VOP2 when ↵Artem Tamazov2016-06-062-14/+76
| | | | | | | | | | | | src2 == VCC. Another step for unification llvm assembler/disassembler with sp3. Besides, CodeGen output is a bit improved, thus changes in CodeGen tests. Assembler/Disassembler tests updated/added. Differential Revision: http://reviews.llvm.org/D20796 llvm-svn: 271900
* [LAA] Use load and store vectors (NFC)Matthew Simpson2016-06-061-11/+7
| | | | | | | Contributed-by: Aditya Kumar <hiraditya@msn.com> Differential Revision: http://reviews.llvm.org/D20953 llvm-svn: 271895
* [KNL] Fix UMULO lowering.Igor Breger2016-06-061-1/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D21013 llvm-svn: 271891
* Remove dead function with incredibly broken assert.Benjamin Kramer2016-06-061-6/+0
| | | | | | Found by clang-tidy's misc-assert-side-effect. llvm-svn: 271887
* [NFC] Silence gcc warning (-Wsign-compare)Filipe Cabecinhas2016-06-061-1/+1
| | | | llvm-svn: 271882
* [AVX512] Remove masked palignr intrinsics and auto-upgrade them to native IR ↵Craig Topper2016-06-061-0/+54
| | | | | | of vector shuffle and select. llvm-svn: 271872
* [AVX512] Add PALIGNR shuffle lowering for v32i16 and v16i32.Craig Topper2016-06-061-0/+13
| | | | llvm-svn: 271870
* Fix spelling and capitalization in comments. NFCNick Lewycky2016-06-061-2/+2
| | | | llvm-svn: 271862
* LICM: Don't sink stores out of loops that may throw.Eli Friedman2016-06-051-0/+10
| | | | | | | | | | | | | | | | Summary: This hasn't been caught before because it requires noalias or similarly strong alias analysis to actually reproduce. Fixes http://llvm.org/PR27952 . Reviewers: hfinkel, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20944 llvm-svn: 271858
* Add safety check to InstCombiner::commonIRemTransformsSanjoy Das2016-06-051-2/+11
| | | | | | | | | | | | | | | | Since FoldOpIntoPhi speculates the binary operation to potentially each of the predecessors of the PHI node (pulling it out of arbitrary control dependence in the process), we can FoldOpIntoPhi only if we know the operation doesn't have UB. This also brings up an interesting profitability question -- the way it is written today, commonIRemTransforms will hoist out work from dynamically dead code into code that will execute at runtime. Perhaps that isn't the best canonicalization? Fixes PR27968. llvm-svn: 271857
* [BitCode] Make sure atomicrmw's argument is an actual PointerTypeFilipe Cabecinhas2016-06-051-0/+1
| | | | llvm-svn: 271851
OpenPOWER on IntegriCloud