path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [SCEV] Use lambda instead of std::bind; NFCSanjoy Das2015-11-291-2/+3
| | | | | | The lambda is more readable. llvm-svn: 254276
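As a rough illustration of why a lambda can read better than std::bind, the sketch below writes the same predicate both ways. The `Checker` type and both function names are hypothetical and are not taken from the SCEV code touched by this commit.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical type used only for illustration; not part of SCEV.
struct Checker {
  int Limit;
  bool isBelowLimit(int V) const { return V < Limit; }
};

// std::bind form: the bound member function and the placeholder hide the
// shape of the call.
inline bool allBelowWithBind(const Checker &C, const std::vector<int> &Vals) {
  using namespace std::placeholders;
  return std::all_of(Vals.begin(), Vals.end(),
                     std::bind(&Checker::isBelowLimit, &C, _1));
}

// Lambda form: the same predicate, with the call written out explicitly.
inline bool allBelowWithLambda(const Checker &C, const std::vector<int> &Vals) {
  return std::all_of(Vals.begin(), Vals.end(),
                     [&C](int V) { return C.isBelowLimit(V); });
}
```

Both functions return the same result for any input; only the readability differs.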
* [SCEV] Use range version of all_of; NFCSanjoy Das2015-11-291-13/+10
| | | | llvm-svn: 254275
* [X86] Remove duplicate entries from intrinsics tables and add asserts to ↵Craig Topper2015-11-291-22/+7
| | | | | | verify there are no others. llvm-svn: 254274
* [WebAssembly] Delete an obsolete TODO comment.Dan Gohman2015-11-291-1/+0
| | | | llvm-svn: 254272
* [WebAssembly] Set several MCInstrDesc flags.Dan Gohman2015-11-294-0/+20
| | | | llvm-svn: 254271
* [X86] int_x86_avx2_permps and X86ISD::VPERMV should take an integer vector ↵Craig Topper2015-11-292-4/+6
| | | | | | for its shuffle indices. llvm-svn: 254269
* [WebAssembly] Delete unused functions. NFC.Dan Gohman2015-11-291-6/+0
| | | | llvm-svn: 254268
* [WebAssembly] Minor clang-format and selected clang-tidy cleanups. NFC.Dan Gohman2015-11-2913-64/+68
| | | | llvm-svn: 254267
* fix typos in comments; NFCSanjay Patel2015-11-291-6/+8
| | | | llvm-svn: 254266
* [SimplifyLibCalls] Don't crash if the function doesn't have a name.Davide Italiano2015-11-291-3/+2
| | | | llvm-svn: 254265
* [SimplifyLibCalls] Cross out implemented transformations.Davide Italiano2015-11-291-2/+0
| | | | llvm-svn: 254264
* [SimplifyLibCalls] Transform log(pow(x, y)) -> y*log(x).Davide Italiano2015-11-291-5/+50
| | | | | | | | | | | | | | | | | | This one is enabled only under -ffast-math. There are cases where the difference between the value computed and the correct value is huge even for -ffast-math, e.g. as Steven pointed out: x = -1, y = 4: log(pow(-1, 4)) = 0, but 4*log(-1) = NaN. I checked what GCC does and apparently they do the same optimization (which results in the dramatic difference). Future work might try to make this (slightly) less bad. Differential Revision: http://reviews.llvm.org/D14400 llvm-svn: 254263
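The divergence described in this commit message can be reproduced with plain floating-point code. The following is a minimal sketch, assuming x = -1 and y = 4 as in the expressions of the example; the function names are made up for illustration and this is not the SimplifyLibCalls implementation.

```cpp
#include <cmath>

// Original form: log(pow(x, y)).
inline double logOfPow(double X, double Y) {
  return std::log(std::pow(X, Y));
}

// Rewritten form produced by the transformation: y * log(x).
inline double rewritten(double X, double Y) {
  return Y * std::log(X);
}
```

For positive x the two forms agree up to rounding. For x = -1 and y = 4 the original form evaluates log(1), while the rewritten form evaluates log(-1) first and yields NaN, which is the reason the transformation is restricted to fast-math.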
* SamplePGO - Do not use std::to_string in diagnostics.Diego Novillo2015-11-291-12/+17
| | | | | | | | This fixes buildbots on systems where std::to_string is not present. It also tidies the output of the diagnostic to render doubles a bit better (thanks Ben Kramer for help with string streams and format). llvm-svn: 254261
* Use a lambda instead of std::bind and std::mem_fn I introduced in r254242. NFCCraig Topper2015-11-291-2/+3
| | | | llvm-svn: 254260
* [X86][SSE] Added support for lowering to ADDSUBPS/ADDSUBPD with commuted inputsSimon Pilgrim2015-11-291-5/+10
| | | | | | We could already recognise shuffle(FSUB, FADD) -> ADDSUB; this allows us to recognise shuffle(FADD, FSUB) -> ADDSUB by commuting the shuffle mask prior to matching. llvm-svn: 254259
* Simplify. NFC.Rafael Espindola2015-11-291-16/+12
| | | | llvm-svn: 254254
* AVX512:Implemented encoding for the vmovq.s instruction.Igor Breger2015-11-291-0/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D14810 llvm-svn: 254248
* Remove an intermediate lambda. NFCCraig Topper2015-11-291-3/+2
| | | | llvm-svn: 254246
* Remove unnecessary intermediate lambda. NFCCraig Topper2015-11-292-5/+2
| | | | llvm-svn: 254243
* [SelectionDAG] Use std::any_of instead of a manually coded loop. NFCCraig Topper2015-11-291-8/+4
| | | | llvm-svn: 254242
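The kind of cleanup named in this commit title can be sketched as follows. The predicate (checking for a negative value) and both function names are assumptions chosen for illustration; they are unrelated to the actual SelectionDAG code.

```cpp
#include <algorithm>
#include <vector>

// Manually coded loop: returns true if any element is negative.
inline bool hasNegativeLoop(const std::vector<int> &Vals) {
  for (int V : Vals)
    if (V < 0)
      return true;
  return false;
}

// Equivalent std::any_of form: the intent is stated by the algorithm name.
inline bool hasNegativeAnyOf(const std::vector<int> &Vals) {
  return std::any_of(Vals.begin(), Vals.end(), [](int V) { return V < 0; });
}
```

Both versions stop at the first matching element and return false for an empty range.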
* Correctly handle llvm.global_ctors merging.Rafael Espindola2015-11-291-42/+48
| | | | | | | We were not handling the case where an entry must be dropped and the destination module has no llvm.global_ctors. llvm-svn: 254241
* Fix a crash when writing merged bitcode.Rafael Espindola2015-11-291-5/+14
| | | | | | | Playing with mutateType in here was making getValueType and getType incompatible. llvm-svn: 254240
* [SimplifyLibCalls] Use any_of(). Suggested by David Blaikie!Davide Italiano2015-11-281-4/+3
| | | | llvm-svn: 254239
* [SimplifyLibCalls] Fix inverted condition that led to an uninitialized ↵Benjamin Kramer2015-11-281-2/+2
| | | | | | | | memory read below. Found by msan! llvm-svn: 254238
* [PGO] Move value profile format related structures and APIs to common fileXinliang David Li2015-11-281-177/+4
| | | | | | | | | | This is the last step to enable profile runtime to share the same value prof data format and reader/writer code with llvm host tools. The VP related data structures are moved to a section in InstrProfData.inc enabled with macro INSTR_PROF_VALUE_PROF_DATA, and common API implementations are enabled with INSTR_PROF_COMMON_API_IMPL. There should be no functional change. llvm-svn: 254235
* Revert "[ARM] Generate ABI_optimization_goals build attribute, as described ↵Renato Golin2015-11-282-46/+4
| | | | | | | | | in the ARM ARM." This reverts commit r254201 and r254202, as it broke test-suite, self-hosting and sanitizer tests on ARM buildbots. llvm-svn: 254234
* [Stack realignment] Handling of aligned allocas.Jonas Paulsson2015-11-283-15/+48
| | | | | | | | | | | | | | | | | | | | This patch implements dynamic realignment of stack objects for targets with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo is changed so that for a target that has StackRealignable set to false, over-aligned static allocas are considered to be variable-sized objects and are handled with DYNAMIC_STACKALLOC nodes. It would be good to group aligned allocas into a single big alloca as an optimization, but this is yet todo. SystemZ benefits from this, due to its stack frame layout. New tests SystemZ/alloca-03.ll for aligned allocas, and SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions. Review and help from Ulrich Weigand and Hal Finkel. llvm-svn: 254227
* Use range-based for loops. NFCCraig Topper2015-11-281-37/+20
| | | | llvm-svn: 254222
* [PGO] Add return code for vp rt record init routine to indicate error conditionXinliang David Li2015-11-281-3/+6
| | | | llvm-svn: 254220
* [PGO] Allow value profile writer interface to allocated target buffer Xinliang David Li2015-11-281-9/+13
| | | | | | | | | Raw profile writer needs to write all data of one kind in one continuous block, so the buffer needs to be pre-allocated and passed to the writer method in pieces for function profile data. The change adds the support for raw value data writing. llvm-svn: 254219
* Function name cleanup (NFC)Xinliang David Li2015-11-281-4/+4
| | | | llvm-svn: 254218
* [PGO] Extract VP data integrity check code into a helper function (NFC)Xinliang David Li2015-11-281-17/+21
| | | | llvm-svn: 254217
* SamplePGO - Add initial support for inliner annotations.Diego Novillo2015-11-271-1/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds two thresholds to the sample profiler to affect inlining decisions: the concept of global hotness and coldness. Functions that have accumulated more than a certain fraction of samples at runtime, are annotated with the InlineHint attribute. Conversely, functions that accumulate less than a certain fraction of samples, are annotated with the Cold attribute. This is very similar to the hints emitted by Clang when using instrumentation profiles. Notice that this is a very blunt instrument. A function may have globally collected a significant fraction of samples, but that does not necessarily mean that every callsite for that function is hot. Ideally, we would annotate each callsite with the samples collected at that callsite. This way, the inliner can incorporate all these weights into its cost model. Once the inliner offers this functionality, we can change the hints emitted here to a more precise per-callsite annotation. For now, this is providing some measure of speedups with our internal benchmarks. I've observed speedups of up to 23% (though the geo mean is about 3%). I expect these numbers to improve as the inliner gets better annotations. llvm-svn: 254212
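The two-threshold scheme described above can be sketched as a small classification function. This is only an illustration under stated assumptions: the enum, the function name, and the threshold parameters are hypothetical, and the commit message does not give the actual default fractions.

```cpp
#include <cstdint>

// Hypothetical result type; the real pass attaches function attributes.
enum class InlineAnnotation { None, InlineHint, Cold };

// Classify a function by the fraction of all samples it accumulated.
// HotFraction and ColdFraction are assumed, caller-supplied thresholds.
inline InlineAnnotation classifyFunction(std::uint64_t FunctionSamples,
                                         std::uint64_t TotalSamples,
                                         double HotFraction,
                                         double ColdFraction) {
  if (TotalSamples == 0)
    return InlineAnnotation::None; // no profile data, no hint
  double Fraction = static_cast<double>(FunctionSamples) /
                    static_cast<double>(TotalSamples);
  if (Fraction > HotFraction)
    return InlineAnnotation::InlineHint; // globally hot
  if (Fraction < ColdFraction)
    return InlineAnnotation::Cold; // globally cold
  return InlineAnnotation::None;
}
```

As the message notes, this is a blunt, per-function decision; a per-callsite annotation would need sample counts for each callsite rather than one global fraction.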
* SamplePGO - Fix default threshold for hot callsites.Diego Novillo2015-11-271-3/+4
| | | | | | | | | | | | | | | | | | | | | | | Based on testing of internal benchmarks, I'm lowering this threshold to a value of 0.1%. This means that SamplePGO will respect 99.9% of the original inline decisions when following a profile. The performance difference is noticeable in some tests. With the previous threshold, the speedup over baseline -O2 was about 0.63%. With the new default, the speedups are around 3% on average. The point of this threshold is not to do more aggressive inlining. When an inlined callsite crosses this threshold, SamplePGO will redo the inline decision so that it can better apply the input profile. By respecting most original inline decisions, we can apply more of the input profile because the shape of the code follows the profile more closely. In the next series, I'll be looking at adding some inline hints for the cold callsites and for toplevel functions that are hot/cold as well. llvm-svn: 254211
* Simplify the linking of recursive data.Rafael Espindola2015-11-272-41/+45
| | | | | | | | Now the ValueMapper has two callbacks. The first one maps the declaration. The ValueMapper records the mapping and then materializes the body/initializer. llvm-svn: 254209
* Follow-up fix for r254201Artyom Skrobov2015-11-271-1/+1
| | | | llvm-svn: 254202
* [ARM] Generate ABI_optimization_goals build attribute, as described in the ↵Artyom Skrobov2015-11-272-4/+46
| | | | | | | | | | | | | | | | | | | ARM ARM. Summary: Since this build attribute corresponds to a whole module, and different functions in a module may differ in the optimizations enabled for them, this attribute is emitted after all functions, and only in the case that the optimization goals for all functions match. Reviewers: logan, hans Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D14934 llvm-svn: 254201
* [AArch64] Add ARMv8.2-A FP16 scalar instructionsOliver Stannard2015-11-273-43/+222
| | | | | | | | | | | | | | | ARMv8.2-A adds 16-bit floating point versions of all existing VFP floating-point instructions. This is an optional extension, so all of these instructions require the FeatureFullFP16 subtarget feature. Most of these instructions are the same as the 32- and 64-bit versions, but with the type field (bits 23-22) set to 0b11. Previously the top bit of the size field was always 0, so the instruction classes only provided a 1-bit size field, which I have widened to 2 bits. Differential Revision: http://reviews.llvm.org/D15014 llvm-svn: 254198
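The widening of the type field to two bits (bits 23-22) can be pictured with ordinary bit manipulation on a 32-bit instruction word. The helper names below are hypothetical and the sketch does not reproduce any real AArch64 encoding tables or the TableGen classes changed by this commit.

```cpp
#include <cstdint>

// Position and width of the type field described above (bits 23-22).
constexpr std::uint32_t TypeShift = 22;
constexpr std::uint32_t TypeMask = 0x3u << TypeShift;

// Return Encoding with its 2-bit type field replaced by Type.
inline std::uint32_t withTypeField(std::uint32_t Encoding, std::uint32_t Type) {
  return (Encoding & ~TypeMask) | ((Type & 0x3u) << TypeShift);
}

// Read the 2-bit type field back out of an encoding.
inline std::uint32_t typeFieldOf(std::uint32_t Encoding) {
  return (Encoding & TypeMask) >> TypeShift;
}
```

With a 1-bit field only the low bit could vary, so the value 0b11 used for the 16-bit variants could not be expressed; a 2-bit field covers all four values.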
* [sanitizer] [dfsan] Unify aarch64 mappingAdhemerval Zanella2015-11-271-16/+21
| | | | | | | | | | This patch changes the DFSan instrumentation for aarch64 so that, instead of using a fixed application mask defined by SANITIZER_AARCH64_VMA, it reads the application shadow mask value from compiler-rt. The value is initialized based on runtime VMA detection. Along with this patch a compiler-rt one will also be added to export the shadow mask variable. llvm-svn: 254196
* [SimplifyLibCalls] Use range-based loop. NFC.Davide Italiano2015-11-271-4/+2
| | | | llvm-svn: 254193
* [X86] Pair a NoVLX with HasAVX512 to match the others and remove a unique ↵Craig Topper2015-11-271-1/+1
| | | | | | predicate check in the isel tables. NFC llvm-svn: 254191
* MC: Simplify handling of temporary symbols in COFF writer.Peter Collingbourne2015-11-261-81/+23
| | | | | | | | | | | | | | The COFF object writer was previously adding unnecessary symbols to its temporary data structures and cleaning them up later. This made the code harder to understand and caused a bug (aliases classed as temporary symbols would cause an assertion failure). A much simpler way of handling such symbols is to ask the layout for their section-relative position when needed. Tested with a bootstrap on Windows and by building Chrome. Differential Revision: http://reviews.llvm.org/D14975 llvm-svn: 254183
* [LoopVectorize] Use MapVector rather than DenseMap for MinBWs.Charlie Turner2015-11-262-6/+6
| | | | | | | | | | | | | | | The order in which instructions are truncated in truncateToMinimalBitwidths affects code generation. Switch to a map with a deterministic order, since the iteration order over a DenseMap is not defined. This code is not hot, so the difference in container performance isn't interesting. Many thanks to David Blaikie for making me aware of MapVector! Fixes PR25490. Differential Revision: http://reviews.llvm.org/D14981 llvm-svn: 254179
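The idea behind an insertion-ordered map can be sketched with standard containers. The class below is a simplified stand-in written for illustration; it is not llvm::MapVector and its name and interface are assumptions. It pairs a hash index with a vector so that iteration always follows insertion order.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified stand-in for an insertion-ordered map; NOT llvm::MapVector.
template <typename K, typename V> class InsertionOrderedMap {
  std::unordered_map<K, std::size_t> Index; // key -> position in Entries
  std::vector<std::pair<K, V>> Entries;     // entries in insertion order

public:
  using const_iterator = typename std::vector<std::pair<K, V>>::const_iterator;

  // Look up Key, inserting a default-constructed value on first use.
  V &operator[](const K &Key) {
    auto It = Index.find(Key);
    if (It == Index.end()) {
      It = Index.emplace(Key, Entries.size()).first;
      Entries.emplace_back(Key, V());
    }
    return Entries[It->second].second;
  }

  // Iteration walks the vector, so the order never depends on hashing.
  const_iterator begin() const { return Entries.begin(); }
  const_iterator end() const { return Entries.end(); }
  std::size_t size() const { return Entries.size(); }
};
```

Iterating this container visits keys in the order they were first inserted on every run, whereas iterating a hash map directly gives an order that is not defined. The cost is an extra index and a second container, which matters little for code that is not hot.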
* [X86] Now that X86VPermt2 is used in all the avx512_perm_t_sizes just ↵Craig Topper2015-11-261-29/+27
| | | | | | hardcode it into the patterns instead of passing as an argument. NFC llvm-svn: 254177
* [X86] Merge X86VPermt2Fp and X86VPermt2Int back together by weakening them ↵Craig Topper2015-11-262-11/+7
| | | | | | just enough. The SDTCisSameSizeAs introduced in r254138 helps here. llvm-svn: 254176
* [X86] Split ISD node for Vfpclass and Vfpclasss so that we can write strong ↵Craig Topper2015-11-264-7/+15
| | | | | | type constraints for each that don't cause ambiguous isel. llvm-svn: 254172
* Disallow aliases to available_externally.Rafael Espindola2015-11-263-39/+6
| | | | | | | | | | | | They are as much trouble as aliases to declarations. They are requiring the code generator to define a symbol with the same value as another symbol, but the second symbol is undefined. If representing this is important for some optimization, we could add support for available_externally aliases. They would be *required* to point to a declaration (or available_externally definition). llvm-svn: 254170
* [X86] Revert part of r254167 to recover bots.Craig Topper2015-11-261-6/+3
| | | | llvm-svn: 254169
* [Hexagon] Lowering of V60/HVX vector typesKrzysztof Parzyszek2015-11-263-89/+469
| | | | llvm-svn: 254168
* [X86] Strengthen more type constraints to reduce isel table size.Craig Topper2015-11-261-12/+24
| | | | llvm-svn: 254167