path: root/llvm
Commit message | Author | Age | Files | Lines
* [Mem2Reg] Respect optnone | James Molloy | 2015-12-11 | 2 | -0/+24
  Mem2Reg shouldn't be optimizing a function that is marked optnone. There is a test checking this that fails when mem2reg is explicitly added to the standard pass pipeline.
  llvm-svn: 255336
* [InstCombine] Make MatchBSwap also match bit reversals | James Molloy | 2015-12-11 | 3 | -103/+250
  MatchBSwap has most of the functionality to match bit reversals already. If we switch it from looking at bytes to individual bits and remove a few early exits, we can extend the main recursive function to match any sequence of ORs, ANDs and shifts that assemble a value from different parts of another, base value.
  Once we have this bit->bit mapping, we can very simply detect if it is appropriate for a bswap or bitreverse.
  llvm-svn: 255334
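The final step described above, classifying a bit->bit mapping as a bswap or a bitreverse, can be sketched roughly as follows. This is not the InstCombine code: the names `trace_bits` and `classify` are hypothetical, and the mapping is recovered here by probing a Python function one bit at a time instead of walking ORs, ANDs and shifts in the IR.

```python
def trace_bits(f, width):
    """Return prov, where prov[dst] is the source bit feeding result bit dst.

    Hypothetical helper: probes f with one-hot inputs, which only makes
    sense when f is a pure bit permutation of its argument.
    """
    prov = [None] * width
    for src in range(width):
        out = f(1 << src)
        for dst in range(width):
            if (out >> dst) & 1:
                prov[dst] = src
    return prov


def classify(prov):
    """Classify a bit->bit mapping as 'bitreverse', 'bswap', or None."""
    width = len(prov)
    # Bit reversal: result bit i comes from source bit (width - 1 - i).
    if prov == [width - 1 - i for i in range(width)]:
        return "bitreverse"
    # Byte swap: bytes are reversed, bits inside each byte keep their order.
    # Only widths that are a multiple of 16 bits are considered here.
    if width % 16 == 0:
        nbytes = width // 8
        if prov == [(nbytes - 1 - i // 8) * 8 + i % 8 for i in range(width)]:
            return "bswap"
    return None


def bswap32(x):
    """Hand-written 32-bit byte swap built from ANDs, shifts and ORs."""
    return (((x & 0xFF) << 24) | ((x & 0xFF00) << 8)
            | ((x >> 8) & 0xFF00) | ((x >> 24) & 0xFF))


def bitreverse8(x):
    """Hand-written 8-bit bit reversal."""
    return sum(((x >> i) & 1) << (7 - i) for i in range(8))
```

For instance, `classify(trace_bits(bswap32, 32))` reports a byte swap, while an identity function matches neither pattern.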
* Revert previous test commit. | Maxim Ostapenko | 2015-12-11 | 1 | -1/+0
  llvm-svn: 255331
* This is a test commit to check my commit access works. | Maxim Ostapenko | 2015-12-11 | 1 | -0/+1
  llvm-svn: 255330
* [PGO] Read VP raw data without depending on the Value field | Xinliang David Li | 2015-12-11 | 2 | -16/+14
  Before this patch, each function's on-disk VP data is 'pointed' to by the Value field of the per-function ProfileData structure, and reading relies on this field (relocated with the ValueDataDelta field) to read the value data. However, this means the Value field needs to be updated at runtime before dumping, which creates undesirable data races.
  With this patch, the reading of VP data no longer depends on the Value field. There is no format change. The ValueDataDelta header field becomes obsolete but will be kept for compatibility reasons (it will be removed the next time a raw format change is needed).
  llvm-svn: 255329
* Fix build after r255319. | Hans Wennborg | 2015-12-11 | 1 | -1/+1
  llvm-svn: 255322
* Fix a spurious if. | Eric Christopher | 2015-12-11 | 1 | -1/+1
  llvm-svn: 255321
* [LazyValueInfo] Stop inserting overdefined values into ValueCache to reduce memory usage. | Akira Hatanaka | 2015-12-11 | 1 | -18/+48
  Previously, LazyValueInfoCache inserted overdefined lattice values into both ValueCache and OverDefinedCache. This wasn't necessary and was causing LazyValueInfo to use an excessive amount of memory in some cases.
  This patch changes LazyValueInfoCache to insert overdefined values only into OverDefinedCache. The memory usage decreases by 70 to 75% when one of the files in llvm is compiled.
  rdar://problem/11388615
  Differential revision: http://reviews.llvm.org/D15391
  llvm-svn: 255320
* [PPC]: Peephole optimize small accesses to aligned globals. | Kyle Butt | 2015-12-11 | 2 | -9/+361
  Access to aligned globals gives us a chance to peephole optimize nonzero offsets. If a struct is 4 byte aligned, then accesses to bytes 0-3 won't overflow the available displacement. For example:

    addis 3, 2, b4v@toc@ha
    addi 4, 3, b4v@toc@l
    lbz 5, b4v@toc@l(3)   ; This is the result of the current peephole
    lbz 6, 1(4)           ; optimizer
    lbz 7, 2(4)
    lbz 8, 3(4)

  If b4v is 4-byte aligned, we can skip using register 4 because we know that b4v@toc@l+{1,2,3} won't overflow 32K, and instead generate:

    addis 3, 2, b4v@toc@ha
    lbz 4, b4v@toc@l(3)
    lbz 5, b4v@toc@l+1(3)
    lbz 6, b4v@toc@l+2(3)
    lbz 7, b4v@toc@l+3(3)

  Saving a register and an addition. Larger alignments allow larger structures/arrays to be optimized.
  llvm-svn: 255319
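The overflow argument above can be checked with a little arithmetic. The sketch below uses a simplified model in which `@l` is the low 16 bits of the symbol's TOC-relative address, read as a signed value; the helper names `low16_signed` and `offset_fits` are hypothetical, and the alignment is assumed to be a power of two no larger than 32768.

```python
def low16_signed(addr):
    """Low 16 bits of addr interpreted as a signed 16-bit value (model of @l)."""
    lo = addr & 0xFFFF
    return lo - 0x10000 if lo >= 0x8000 else lo


def offset_fits(alignment, offset):
    """True if @l + offset fits in a signed 16-bit displacement for every
    address with the given alignment (simplified model, not the LLVM code)."""
    # Largest aligned value the signed low half can take, e.g. 32764 for 4.
    max_lo = 0x7FFF - (0x7FFF % alignment)
    return offset >= 0 and max_lo + offset <= 0x7FFF
```

With 4-byte alignment the worst case is 32764 + 3 = 32767, which still fits, whereas an offset of 4 could reach 32768 and overflow. In this model the condition reduces to `offset < alignment`.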
* Check in the script for building Win snapshots | Hans Wennborg | 2015-12-11 | 1 | -0/+93
  llvm-svn: 255318
* [ProfileData] clang-format TextInstrProfReader::hasFormat. NFC. | Vedant Kumar | 2015-12-11 | 1 | -2/+3
  llvm-svn: 255317
* [X86][SSE] Update the cost table for integer-integer conversions on SSE2/SSE4.1. | Cong Hou | 2015-12-11 | 3 | -5/+435
  Previously, the conversion cost table had no entries for integer-integer conversions on SSE2. This results in imprecise costs for certain vectorized operations. This patch adds those entries for SSE2 and SSE4.1. The cost numbers are counted from the result of running llc on the new test case in this patch.
  Differential revision: http://reviews.llvm.org/D15132
  llvm-svn: 255315
* Format fix (NFC) | Xinliang David Li | 2015-12-10 | 1 | -2/+4
  llvm-svn: 255313
* s/need/needs | Eric Christopher | 2015-12-10 | 1 | -2/+2
  llvm-svn: 255306
* Fix (bitcast (fabs x)), (bitcast (fneg x)) and (bitcast (fcopysign cst, x)) combines for ppc_fp128, since signbit computation is more complicated. | Eric Christopher | 2015-12-10 | 2 | -0/+171
  Discussion thread: http://lists.llvm.org/pipermail/llvm-dev/2015-November/092863.html
  Patch by Tim Shen!
  llvm-svn: 255305
* Attempt to fix the ReST compilation to html of the C API docs. | Eric Christopher | 2015-12-10 | 1 | -14/+14
  llvm-svn: 255304
* More non-ascii quote characters. | Eric Christopher | 2015-12-10 | 1 | -2/+2
  llvm-svn: 255303
* Clarify some of the wording on adding a new subcomponent to the C API. | Eric Christopher | 2015-12-10 | 1 | -2/+2
  llvm-svn: 255302
* Fix non-ascii quotes. | Eric Christopher | 2015-12-10 | 1 | -4/+4
  llvm-svn: 255301
* Add C API guidelines to the developer policy to match discussions on the llvm mailing lists. | Eric Christopher | 2015-12-10 | 1 | -0/+27
  llvm-svn: 255300
* PPC: Teach FMA mutate to respect register classes. | Kyle Butt | 2015-12-10 | 2 | -2/+98
  This was causing bad code gen and assembly that won't assemble, as mixed altivec and vsx code would end up with a vsx high register assigned to an altivec instruction, which won't work. Constraining the classes allows the optimization to proceed.
  llvm-svn: 255299
* [CMake] Add LLVM_BUILD_INSTRUMENTED option to enable building with -fprofile-instr-generate | Chris Bieneman | 2015-12-10 | 1 | -0/+8
  This is the first step in supporting PGO data generation via CMake. I've marked the option as advanced and experimental until it is fleshed out further.
  llvm-svn: 255298
* [LibFuzzer] Introducing FUZZER_FLAG_UNSIGNED and using it for seeding. | Mike Aizatsky | 2015-12-10 | 5 | -9/+25
  Differential Revision: http://reviews.llvm.org/D15339
  done
  llvm-svn: 255296
* EarlyCSE: add tests | JF Bastien | 2015-12-10 | 1 | -10/+68
  Summary: As a follow-up to rL255054 I wasn't able to convince myself that the code did what I thought, so I wrote more tests.
  Reviewers: reames
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D15371
  llvm-svn: 255295
* Add a forward declaration (NFC) | Xinliang David Li | 2015-12-10 | 1 | -0/+1
  llvm-svn: 255292
* Delete a duplicate branch in IfConversion.cpp. NFC. | Cong Hou | 2015-12-10 | 1 | -9/+0
  llvm-svn: 255291
* [DAGCombiner] Fix PR25763 - vector comparison constant folding + sign-extension | Simon Pilgrim | 2015-12-10 | 2 | -5/+24
  PR25763 demonstrated an issue with D14683 - vector comparison constant folding only works for i1 results, so we need to split off the sign-extension of the result to the required type. Luckily this can be done with the existing type legalization code.
  llvm-svn: 255289
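The two-step shape of this fix (fold the comparison to i1 lanes, then sign-extend separately) can be illustrated with a small model. This is only a sketch of the arithmetic, not the DAGCombiner code; `fold_compare_i1` and `sign_extend_i1` are hypothetical names, and lanes are modeled as plain Python integers.

```python
import operator


def fold_compare_i1(lhs, rhs, pred):
    """Constant-fold a lane-wise comparison into a vector of i1 values."""
    return [1 if pred(a, b) else 0 for a, b in zip(lhs, rhs)]


def sign_extend_i1(lanes, width):
    """Sign-extend each i1 lane to `width` bits: true becomes all ones."""
    all_ones = (1 << width) - 1
    return [all_ones if bit else 0 for bit in lanes]


# Example: compare constant vectors with 'greater than', then widen to 32 bits.
mask = sign_extend_i1(fold_compare_i1([1, 5, 3, 7], [4, 4, 4, 4], operator.gt), 32)
```

Keeping the widening as its own step means the folding logic only ever has to produce i1 results.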
* [DSE] Disable non-local DSE to see if the bots go green. | Chad Rosier | 2015-12-10 | 5 | -5/+5
  I see a few bots timing out, so I'm speculatively disabling r255247.
  llvm-svn: 255286
* Fix another case where the linkage was not set. | Rafael Espindola | 2015-12-10 | 3 | -2/+13
  llvm-svn: 255272
* [PGO] Use %t as the temporary profdata filename in the test cases. | Rong Xu | 2015-12-10 | 10 | -19/+19
  Using %t rather than %T/<specific_name> as the temporary profdata filename.
  llvm-svn: 255271
* Verifier: Avoid quadratic checking of aggregates for bad bitcasts | Duncan P. N. Exon Smith | 2015-12-10 | 1 | -38/+37
  Avoid O(N^2) behaviour when checking for bad bitcasts in `ConstantExpr`s buried inside of aggregate initializers to `GlobalVariable`s. I've:
  - centralized the "visited" set for recursing through `ConstantExpr`s so that expressions are only visited once per Verifier run,
  - removed the duplicate logic for the stack visit, and
  - avoided recursing into other `GlobalValue`s.
  This recovers roughly a 100x time difference in clang compiles of a particular input file (filled with large cross-referencing tables) that depends on whether `-disable-llvm-verifier` is on. This slowdown was caused by r187506, which introduced these checks. Now, avoiding `-disable-llvm-verifier` only causes a 2x slowdown for this case.
  (Interestingly, dumping the textual IR for this file starts with at least 50GB of global variable initializers (I don't know the total, since I killed the dump)...)
  llvm-svn: 255269
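The effect of centralizing the "visited" set can be seen in a toy model. The sketch below is not the Verifier code: the graph is a plain dictionary standing in for constant expressions, and `count_visits` is a hypothetical helper that counts how many nodes get checked with and without a set shared across all roots.

```python
def count_visits(graph, roots, share_visited):
    """Count node checks during a stack-based walk from every root.

    graph maps a node to the nodes it references. With share_visited=False
    each root starts with a fresh set, so shared subexpressions are rescanned.
    """
    visits = 0
    visited = set()
    for root in roots:
        if not share_visited:
            visited = set()
        stack = [root]
        while stack:
            node = stack.pop()
            if node in visited:
                continue
            visited.add(node)
            visits += 1
            stack.extend(graph[node])
    return visits


# A chain of 100 expressions where every expression is also a root.
n = 100
chain = {i: ([i + 1] if i + 1 < n else []) for i in range(n)}
per_root = count_visits(chain, list(range(n)), share_visited=False)
shared = count_visits(chain, list(range(n)), share_visited=True)
```

On this chain the per-root walk performs n(n+1)/2 checks, while the shared set performs exactly n.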
* [DeadStoreElimination] Use range-based loops. NFC. | Chad Rosier | 2015-12-10 | 1 | -9/+6
  llvm-svn: 255265
* [ProfileData] Add unit test infrastructure for sample profile reader/writer | Nathan Slingerland | 2015-12-10 | 6 | -23/+187
  Summary: Adds support for in-memory round-trip of sample profile data along with basic round trip unit tests. This will also make it easier to include unit tests for future changes to sample profiling.
  Reviewers: davidxl, dnovillo, silvas
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D15211
  llvm-svn: 255264
* Fix fptosi, fptoui from f16 vectors to i8, i16 vectors | Pirama Arumuga Nainar | 2015-12-10 | 3 | -1/+105
  Summary: Convert f16 vectors to corresponding f32 vectors before doing the conversion to int. Add tests for v4f16, v8f16.
  Reviewers: ab, jmolloy
  Subscribers: llvm-commits, srhines
  Differential Revision: http://reviews.llvm.org/D14936
  llvm-svn: 255263
* [InstCombine] fold bitcasts around an extractelement (3rd try) | Sanjay Patel | 2015-12-10 | 2 | -8/+73
  This is a redo of r255137 (reverted at r255227) which was a redo of r255124 (reverted at r255126) with a fixed check for a scalar source type and an added test for the failure that caused the revert.
  Original commit message:
  Example:
    bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
    --->
    extractelement <2 x float> %X, i32 1
  This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543
  The next step will be to generalize this fold:
    trunc ( lshr ( bitcast X) ) -> extractelement (X)
  I.e., I'm hoping to replace the existing transform of:
    bitcast ( trunc ( lshr ( bitcast X)))
  added by: http://reviews.llvm.org/rL112232
  with 2 less specific transforms to catch the case in the bug report.
  Differential Revision: http://reviews.llvm.org/D14879
  llvm-svn: 255261
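Why the two forms in the example are equivalent can be checked on concrete bytes. The sketch below models a `<2 x float>` as eight bytes using Python's `struct` module, assuming a little-endian layout; `via_bitcasts` and `direct` are hypothetical names for the unfolded and folded forms, and this is an illustration of the equivalence, not the InstCombine implementation.

```python
import struct


def via_bitcasts(vec, idx):
    """bitcast <2 x float> to <2 x i32>, extract lane idx, bitcast to float."""
    raw = struct.pack("<2f", *vec)            # bytes of the <2 x float>
    ints = struct.unpack("<2I", raw)          # reinterpret as <2 x i32>
    lane = ints[idx]                          # extractelement on the i32 vector
    return struct.unpack("<f", struct.pack("<I", lane))[0]  # i32 -> float


def direct(vec, idx):
    """extractelement directly on the <2 x float> (rounded to 32-bit floats)."""
    return struct.unpack("<2f", struct.pack("<2f", *vec))[idx]
```

Both functions return the same value for any lane, because the casts only reinterpret the same 32 bits and never change them.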
* [ThinLTO] Debug message cleanup (NFC) | Teresa Johnson | 2015-12-10 | 1 | -8/+8
  Added some missing spaces between the module identifier and the start of the debug message. Also added a ":" after the module identifier to make this look a little nicer.
  llvm-svn: 255259
* Avoid undefined behavior when vector is empty. | Rafael Espindola | 2015-12-10 | 2 | -2/+3
  Found by ubsan.
  llvm-svn: 255258
* remove duplicated comments and don't repeat function names in comments; NFC | Sanjay Patel | 2015-12-10 | 1 | -142/+83
  llvm-svn: 255257
* [ThinLTO] Release files in gold plugin during combined index (take 2) | Teresa Johnson | 2015-12-10 | 1 | -4/+2
  Ensure we release the files even when they don't hold a function index summary section, by restructuring the control flow a little bit.
  llvm-svn: 255256
* [WebAssembly] Tighten up several CHECK tests. | Dan Gohman | 2015-12-10 | 4 | -18/+18
  llvm-svn: 255255
* Split lib/Linker in two. | Rafael Espindola | 2015-12-10 | 9 | -1483/+1718
  A linker normally has two stages: symbol resolution and "moving stuff".
  In lib/Linker there is the complication of lazy linking some globals, but it was still far more mixed than it needed to be.
  This splits the linker into a lower level IRMover and the linker proper. The IRMover just takes a list of globals to move and a callback that lets the user control what is lazy linked.
  The main motivation is that now tools/gold (and soon lld) can use their own symbol resolution to instruct IRMover what to do.
  llvm-svn: 255254
* [WebAssembly] Make WebAssemblyStoreResults only return true when it has a change. | Dan Gohman | 2015-12-10 | 1 | -1/+3
  llvm-svn: 255253
* [WebAssembly] Fix WebAssemblyPeephole to set Changed to true when making changes. | Dan Gohman | 2015-12-10 | 1 | -0/+1
  llvm-svn: 255252
* [WebAssembly] Declare that WebAssemblyPeephole does not modify the CFG. | Dan Gohman | 2015-12-10 | 1 | -0/+5
  llvm-svn: 255251
* [WebAssembly] Remove an unneeded getAnalysisUsage override. | Dan Gohman | 2015-12-10 | 1 | -4/+0
  llvm-svn: 255250
* [DeadStoreElimination] Add support for non-local DSE. | Chad Rosier | 2015-12-10 | 6 | -90/+393
  We extend the search for redundant stores to predecessor blocks that unconditionally lead to the block BB with the current store instruction. That also includes single-block loops that unconditionally lead to BB, and if-then-else blocks where then- and else-blocks unconditionally lead to BB.
  http://reviews.llvm.org/D13363
  Patch by Ivan Baev <ibaev@codeaurora.org>!
  llvm-svn: 255247
* Bitcasts between FP and INT values using direct moves | Nemanja Ivanovic | 2015-12-10 | 3 | -92/+334
  This patch corresponds to review: http://reviews.llvm.org/D15286
  LLVM IR frequently contains bitcast operations between floating point and integer values of the same width. Doing this through memory operations is quite expensive on PPC. This patch allows the use of direct register moves between FPRs and GPRs for lowering bitcasts.
  llvm-svn: 255246
* Macro debug info support in LLVM IR | Amjad Aboud | 2015-12-10 | 20 | -40/+509
  Introduced DIMacro and DIMacroFile debug info metadata in the LLVM IR to support macros.
  Differential Revision: http://reviews.llvm.org/D14687
  llvm-svn: 255245
* [LLE] Use the PredicatedScalarEvolution interface to query SCEVs for dependences | Silviu Baranga | 2015-12-10 | 1 | -16/+15
  Summary: LAA uses the PredicatedScalarEvolution interface, so it can produce forward/backward dependences having SCEVs that are AddRecExprs only after being transformed by PredicatedScalarEvolution.
  Use PredicatedScalarEvolution to get the expected expressions.
  Reviewers: anemet
  Subscribers: llvm-commits, sanjoy
  Differential Revision: http://reviews.llvm.org/D15382
  llvm-svn: 255241
* [PostRA scheduling] Allow a target to do scheduling when it wants post RA. | Jonas Paulsson | 2015-12-10 | 4 | -5/+29
  SystemZ needs to do its scheduling after branch relaxation, which can only happen after block placement, and therefore the standard PostRAScheduler point in the pass sequence is too early.
  TargetMachine::targetSchedulesPostRAScheduling() is a new method that, when it returns true, signals that the target will insert the final scheduling pass on its own.
  Reviewed by Hal Finkel
  llvm-svn: 255234