summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [WebAssembly] Delete unused functions. NFC.Dan Gohman2015-11-291-6/+0
| | | | llvm-svn: 254268
* [WebAssembly] Minor clang-format and selected clang-tidy cleanups. NFC.Dan Gohman2015-11-2913-64/+68
| | | | llvm-svn: 254267
* fix typos in comments; NFCSanjay Patel2015-11-291-6/+8
| | | | llvm-svn: 254266
* [SimplifyLibCalls] Don't crash if the function doesn't have a name.Davide Italiano2015-11-291-3/+2
| | | | llvm-svn: 254265
* [SimplifyLibCalls] Cross out implemented transformations.Davide Italiano2015-11-291-2/+0
| | | | llvm-svn: 254264
* [SimplifyLibCalls] Tranform log(pow(x, y)) -> y*log(x).Davide Italiano2015-11-291-5/+50
| | | | | | | | | | | | | | | | | | This one is enabled only under -ffast-math. There are cases where the difference between the value computed and the correct value is huge even for ffast-math, e.g. as Steven pointed out: x = -1, y = -4 log(pow(-1), 4) = 0 4*log(-1) = NaN I checked what GCC does and apparently they do the same optimization (which result in the dramatic difference). Future work might try to make this (slightly) less worse. Differential Revision: http://reviews.llvm.org/D14400 llvm-svn: 254263
* SamplePGO - Do not use std::to_string in diagnostics.Diego Novillo2015-11-291-12/+17
| | | | | | | | This fixes buildbots in systems that std::to_string is not present. It also tidies the output of the diagnostic to render doubles a bit better (thanks Ben Kramer for help with string streams and format). llvm-svn: 254261
* Use a lambda instead of std::bind and std::mem_fn I introduced in r254242. NFCCraig Topper2015-11-291-2/+3
| | | | llvm-svn: 254260
* [X86][SSE] Added support for lowering to ADDSUBPS/ADDSUBPD with commuted inputsSimon Pilgrim2015-11-291-5/+10
| | | | | | We could already recognise shuffle(FSUB, FADD) -> ADDSUB, this allow us to recognise shuffle(FADD, FSUB) -> ADDSUB by commuting the shuffle mask prior to matching. llvm-svn: 254259
* Simplify. NFC.Rafael Espindola2015-11-291-16/+12
| | | | llvm-svn: 254254
* AVX512:Implemented encoding for the vmovq.s instruction.Igor Breger2015-11-291-0/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D14810 llvm-svn: 254248
* Remove an intermediate lambda. NFCCraig Topper2015-11-291-3/+2
| | | | llvm-svn: 254246
* Remove unnecessary intermediate lambda. NFCCraig Topper2015-11-292-5/+2
| | | | llvm-svn: 254243
* [SelectionDAG] Use std::any_of instead of a manually coded loop. NFCCraig Topper2015-11-291-8/+4
| | | | llvm-svn: 254242
* Correctly handle llvm.global_ctors merging.Rafael Espindola2015-11-291-42/+48
| | | | | | | We were not handling the case where an entry must be dropped and the destination module has no llvm.global_ctors. llvm-svn: 254241
* Fix a crash when writing merged bitcode.Rafael Espindola2015-11-291-5/+14
| | | | | | | Playing with mutateType in here was making getValueType and getType incompatible. llvm-svn: 254240
* [SimplifyLibCalls] Use any_of(). Suggested by David Blaikie!Davide Italiano2015-11-281-4/+3
| | | | llvm-svn: 254239
* [SimplifyLibCalls] Fix inverted condition that lead to an uninitialized ↵Benjamin Kramer2015-11-281-2/+2
| | | | | | | | memory read below. Found by msan! llvm-svn: 254238
* [PGO] Move value profile format related structures and APIs to common fileXinliang David Li2015-11-281-177/+4
| | | | | | | | | | This is the last step to enable profile runtime to share the same value prof data format and reader/writer code with llvm host tools. The VP related data structures are moved to a section in InstrProfData.inc enabled with macro INSTR_PROF_VALUE_PROF_DATA, and common API implementations are enabled with INSTR_PROF_COMMON_API_IMPL. There should be no functional change. llvm-svn: 254235
* Revert "[ARM] Generate ABI_optimization_goals build attribute, as described ↵Renato Golin2015-11-282-46/+4
| | | | | | | | | in the ARM ARM." This reverts commit r254201 and r254202, as it broke test-suite, self-hosting and sanitizer tests on ARM buildbots. llvm-svn: 254234
* [Stack realignment] Handling of aligned allocas.Jonas Paulsson2015-11-283-15/+48
| | | | | | | | | | | | | | | | | | | | This patch implements dynamic realignment of stack objects for targets with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo is changed so that for a target that has StackRealignable set to false, over-aligned static allocas are considered to be variable-sized objects and are handled with DYNAMIC_STACKALLOC nodes. It would be good to group aligned allocas into a single big alloca as an optimization, but this is yet todo. SystemZ benefits from this, due to its stack frame layout. New tests SystemZ/alloca-03.ll for aligned allocas, and SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions. Review and help from Ulrich Weigand and Hal Finkel. llvm-svn: 254227
* Use range-based for loops. NFCCraig Topper2015-11-281-37/+20
| | | | llvm-svn: 254222
* [PGO] Add return code for vp rt record init routine to indicate error conditionXinliang David Li2015-11-281-3/+6
| | | | llvm-svn: 254220
* [PGO] Allow value profile writer interface to allocated target buffer Xinliang David Li2015-11-281-9/+13
| | | | | | | | | Raw profile writer needs to write all data of one kind in one continuous block, so the buffer needs to be pre-allocated and passed to the writer method in pieces for function profile data. The change adds the support for raw value data writing. llvm-svn: 254219
* Function name cleanup (NFC)Xinliang David Li2015-11-281-4/+4
| | | | llvm-svn: 254218
* [PGO] Extract VP data integrity check code into a helper function (NFC)Xinliang David Li2015-11-281-17/+21
| | | | llvm-svn: 254217
* SamplePGO - Add initial support for inliner annotations.Diego Novillo2015-11-271-1/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds two thresholds to the sample profiler to affect inlining decisions: the concept of global hotness and coldness. Functions that have accumulated more than a certain fraction of samples at runtime, are annotated with the InlineHint attribute. Conversely, functions that accumulate less than a certain fraction of samples, are annotated with the Cold attribute. This is very similar to the hints emitted by Clang when using instrumentation profiles. Notice that this is a very blunt instrument. A function may have globally collected a significant fraction of samples, but that does not necessarily mean that every callsite for that function is hot. Ideally, we would annotate each callsite with the samples collected at that callsite. This way, the inliner can incorporate all these weights into its cost model. Once the inliner offers this functionality, we can change the hints emitted here to a more precise per-callsite annotation. For now, this is providing some measure of speedups with our internal benchmarks. I've observed speedups of up to 23% (though the geo mean is about 3%). I expect these numbers to improve as the inliner gets better annotations. llvm-svn: 254212
* SamplePGO - Fix default threshold for hot callsites.Diego Novillo2015-11-271-3/+4
| | | | | | | | | | | | | | | | | | | | | | | Based on testing of internal benchmarks, I'm lowering this threshold to a value of 0.1%. This means that SamplePGO will respect 99.9% of the original inline decisions when following a profile. The performance difference is noticeable in some tests. With the previous threshold, the speedups over baseline -O2 was about 0.63%. With the new default, the speedups are around 3% on average. The point of this threshold is not to do more aggressive inlining. When an inlined callsite crosses this threshold, SamplePGO will redo the inline decision so that it can better apply the input profile. By respecting most original inline decisions, we can apply more of the input profile because the shape of the code follows the profile more closely. In the next series, I'll be looking at adding some inline hints for the cold callsites and for toplevel functions that are hot/cold as well. llvm-svn: 254211
* Simplify the linking of recursive data.Rafael Espindola2015-11-272-41/+45
| | | | | | | | Now the ValueMapper has two callbacks. The first one maps the declaration. The ValueMapper records the mapping and then materializes the body/initializer. llvm-svn: 254209
* Follow-up fix for r254201Artyom Skrobov2015-11-271-1/+1
| | | | llvm-svn: 254202
* [ARM] Generate ABI_optimization_goals build attribute, as described in the ↵Artyom Skrobov2015-11-272-4/+46
| | | | | | | | | | | | | | | | | | | ARM ARM. Summary: Since this build attribute corresponds to a whole module, and different functions in a module may differ in the optimizations enabled for them, this attribute is emitted after all functions, and only in the case that the optimization goals for all functions match. Reviewers: logan, hans Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D14934 llvm-svn: 254201
* [AArch64] Add ARMv8.2-A FP16 scalar instructionsOliver Stannard2015-11-273-43/+222
| | | | | | | | | | | | | | | ARMv8.2-A adds 16-bit floating point versions of all existing VFP floating-point instructions. This is an optional extension, so all of these instructions require the FeatureFullFP16 subtarget feature. Most of these instructions are the same as the 32- and 64-bit versions, but with the type field (bits 23-22) set to 0b11. Previously the top bit of the size field was always 0, so the instruction classes only provided a 1-bit size field, which I have widened to 2 bits. Differential Revision: http://reviews.llvm.org/D15014 llvm-svn: 254198
* [sanitizer] [dfsan] Unify aarch64 mappingAdhemerval Zanella2015-11-271-16/+21
| | | | | | | | | | | | This patch changes the DFSan instrumentation for aarch64 to instead of using fixes application mask defined by SANITIZER_AARCH64_VMA to read the application shadow mask value from compiler-rt. The value is initialized based on runtime VAM detection. Along with this patch a compiler-rt one will also be added to export the shadow mask variable. llvm-svn: 254196
* [SimplifyLibCalls] Use range-based loop. NFC.Davide Italiano2015-11-271-4/+2
| | | | llvm-svn: 254193
* [X86] Pair a NoVLX with HasAVX512 to match the others and remove a unique ↵Craig Topper2015-11-271-1/+1
| | | | | | predicate check in the isel tables. NFC llvm-svn: 254191
* MC: Simplify handling of temporary symbols in COFF writer.Peter Collingbourne2015-11-261-81/+23
| | | | | | | | | | | | | | The COFF object writer was previously adding unnecessary symbols to its temporary data structures and cleaning them up later. This made the code harder to understand and caused a bug (aliases classed as temporary symbols would cause an assertion failure). A much simpler way of handling such symbols is to ask the layout for their section-relative position when needed. Tested with a bootstrap on Windows and by building Chrome. Differential Revision: http://reviews.llvm.org/D14975 llvm-svn: 254183
* [LoopVectorize] Use MapVector rather than DenseMap for MinBWs.Charlie Turner2015-11-262-6/+6
| | | | | | | | | | | | | | | | | The order in which instructions are truncated in truncateToMinimalBitwidths effects code generation. Switch to a map with a determinisic order, since the iteration order over a DenseMap is not defined. This code is not hot, so the difference in container performance isn't interesting. Many thanks to David Blaikie for making me aware of MapVector! Fixes PR25490. Differential Revision: http://reviews.llvm.org/D14981 llvm-svn: 254179
* [X86] Now that X86VPermt2 is used in all the avx512_perm_t_sizes just ↵Craig Topper2015-11-261-29/+27
| | | | | | hardcode it into the patterns instead of passing as an argument. NFC llvm-svn: 254177
* [X86] Merge X86VPermt2Fp and X86VPermt2Int back together by weakening them ↵Craig Topper2015-11-262-11/+7
| | | | | | just enough. The SDTCisSameSizeAs introduced in r254138 helps here. llvm-svn: 254176
* [X86] Split ISD node for Vfpclass and Vfpclasss so that we can write strong ↵Craig Topper2015-11-264-7/+15
| | | | | | type constraints for each that don't cause ambiguous isel. llvm-svn: 254172
* Disallow aliases to available_externally.Rafael Espindola2015-11-263-39/+6
| | | | | | | | | | | | They are as much trouble as aliases to declarations. They are requiring the code generator to define a symbol with the same value as another symbol, but the second symbol is undefined. If representing this is important for some optimization, we could add support for available_externally aliases. They would be *required* to point to a declaration (or available_externally definition). llvm-svn: 254170
* [X86] Revert part of r254167 to recover bots.Craig Topper2015-11-261-6/+3
| | | | llvm-svn: 254169
* [Hexagon] Lowering of V60/HVX vector typesKrzysztof Parzyszek2015-11-263-89/+469
| | | | llvm-svn: 254168
* [X86] Strengthen more type constraints to reduce isel table size.Craig Topper2015-11-261-12/+24
| | | | llvm-svn: 254167
* [Hexagon] Hexagon V60 HVX intrinsic defintionsKrzysztof Parzyszek2015-11-265-1/+911
| | | | | Author: Ron Lieberman <ronl@codeaurora.org> llvm-svn: 254165
* [mips][ias] Range check uimm5 operands and fix several bugs this revealed.Daniel Sanders2015-11-265-77/+127
| | | | | | | | | | | | | | | | | | | | | | | | Summary: The bugs were: * append, prepend, and balign were not tested * balign takes a uimm2 not a uimm5. * drotr32 was correctly implemented with a uimm5 but the tests expected '52' to be valid. * li/la were implemented with a uimm5 instead of simm32. simm32 isn't completely correct either but I'll fix that when I get to simm32. A notable omission are some of the shift instructions. Several of these have been implemented using a single uimm6 instruction (rather than two uimm5 instructions and a CodeGen-only uimm6 pseudo). These will be updated in the uimm6 patch. Reviewers: vkalintiris Subscribers: llvm-commits, dsanders Differential Revision: http://reviews.llvm.org/D14712 llvm-svn: 254164
* [AArch64] Add ARMv8.2-A new AT instruction variantsOliver Stannard2015-11-263-1/+32
| | | | | | | | | | | ARMv8.2-A adds new variants of the "at" (address translate) system instruction, which take the PSTATE.PAN bit (added in ARMv8.1-A). These are a required part of ARMv8.2-A, so no additional subtarget features are required. Differential Revision: http://reviews.llvm.org/D15018 llvm-svn: 254159
* ARM: address WOA unsigned division overflow crashMartell Malone2015-11-262-20/+19
| | | | | | | | | Building on r253865 the crash is not limited to signed overflows. Disable custom handling of unsigned 32-bit and 64-bit integer divide. Add test cases for both 32-bit and 64-bit unsigned integer overflow. llvm-svn: 254158
* [AArch64] Add ARMv8.2-A UAO PSTATE bitOliver Stannard2015-11-265-3/+17
| | | | | | | | | | | | | ARMv8.2-A adds a new PSTATE bit, PSTATE.UAO, which allows the LDTR/STTR instructions to behave the same as LDR/STR with respect to execute-only pages at higher privilege levels. New variants of the MSR/MRS instructions are added to allow reading and writing this bit. It is a required part of ARMv8.2-A, so no additional subtarget features are required. Differential Revision: http://reviews.llvm.org/D15020 llvm-svn: 254157
* [AArch64] Add ARMv8.2-A persistent memory instructionOliver Stannard2015-11-263-3/+18
| | | | | | | | | | | ARMv8.2-A adds the "dc cvap" instruction, which is a system instruction that cleans caches to the point of persistence (for systems that have persistent memory). It is a required part of ARMv8.2-A, so no additional subtarget features are required. Differential Revision: http://reviews.llvm.org/D15016 llvm-svn: 254156
OpenPOWER on IntegriCloud