summaryrefslogtreecommitdiffstats
path: root/llvm
Commit message (Collapse)AuthorAgeFilesLines
* Emit ABI_FP_rounding attribute.Charlie Turner2014-12-032-0/+28
| | | | | | | | | | | | LLVM understands a -enable-sign-dependent-rounding-fp-math codegen option. When the user has specified this option, the Tag_ABI_FP_rounding attribute should be emitted with value 1. This option currently does not appear to disable transformations and optimizations that assume default floating point rounding behavior, AFAICT, but the intention should be recorded in the build attributes, regardless of what the compiler actually does with the intention. Change-Id: If838578df3dc652b6f2796b8d152545674bcb30e llvm-svn: 223218
* Add tests for default value of Tag_ABI_FP_rounding.Charlie Turner2014-12-031-0/+48
| | | | | Change-Id: I051866d073fc6ce87ce3e693a3762da6d81f4393 llvm-svn: 223217
* Fix a typo in the documentation of LTOBenjamin Poulain2014-12-031-1/+1
| | | | | | | | Fix defininitions->definitions. Reviewed by David Blaikie. llvm-svn: 223216
* Ask the module for its the identified types.Rafael Espindola2014-12-038-7/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When lazy reading a module, the types used in a function will not be visible to a TypeFinder until the body is read. This patch fixes that by asking the module for its identified struct types. If a materializer is present, the module asks it. If not, it uses a TypeFinder. This fixes pr21374. I will be the first to say that this is ugly, but it was the best I could find. Some of the options I looked at: * Asking the LLVMContext. This could be made to work for gold, but not currently for ld64. ld64 will load multiple modules into a single context before merging them. This causes us to see types from future merges. Unfortunately, MappedTypes is not just a cache when it comes to opaque types. Once the mapping has been made, we have to remember it for as long as the key may be used. This would mean moving MappedTypes to the Linker class and having to drop the Linker::LinkModules static methods, which are visible from C. * Adding an option to ignore function bodies in the TypeFinder. This would fix the PR by picking the worst result. It would work, but unfortunately we are currently quite dependent on the upfront type merging. I will try to reduce our dependency, but it is not clear that we will be able to get rid of it for now. The only clean solution I could think of is making the Module own the types. This would have other advantages, but it is a much bigger change. I will propose it, but it is nice to have this fixed while that is discussed. With the gold plugin, this patch takes the number of types in the LTO clang binary from 52817 to 49669. llvm-svn: 223215
* ADT: Rename argument in emplace_back_implDuncan P. N. Exon Smith2014-12-031-2/+2
| | | | | | | Rename a functor argument in r223201 from `emplace` to `construct` to reduce confusion. llvm-svn: 223212
* Revert r222997. The newly added compile-time checks are finding missing ↵Nick Lewycky2014-12-031-10/+9
| | | | | | origins, testcase is being reduced and a PR will be posted shortly. llvm-svn: 223211
* LoopVectorize: Remove unnecessary RAUWDuncan P. N. Exon Smith2014-12-031-2/+0
| | | | | | | | | | Remove an unnecessary `MDNode::replaceAllUsesWith()`. In the preceding line, `TheLoop->setLoopID()` visits all backedges and sets the new loop ID. This sufficiently updates the loop metadata. Metadata RAUW is going away as part of PR21532. llvm-svn: 223210
* R600/SI: Fix SIFixSGPRCopies for copies to physical registersMatt Arsenault2014-12-031-1/+6
| | | | | | | This shows up when operands required to be passed in VCC are copied to. llvm-svn: 223208
* R600/SI: Remove incorrect assertionMatt Arsenault2014-12-031-5/+5
| | | | | | This can be a COPY to a physical register, such as VCC llvm-svn: 223207
* R600/SI: Remove i1 pseudo VALU opsMatt Arsenault2014-12-0310-124/+339
| | | | | | | | | | | | | | Select i1 logical ops directly to 64-bit SALU instructions. Vector i1 values are always really in SGPRs, with each bit for each item in the wave. This saves about 4 instructions when and/or/xoring any condition, and also helps write conditions that need to be passed in vcc. This should work correctly now that the SGPR live range fixing pass works. More work is needed to eliminate the VReg_1 pseudo regclass and possibly the entire SILowerI1Copies pass. llvm-svn: 223206
* R600/SI: Fix suspicious indexingMatt Arsenault2014-12-031-5/+7
| | | | | | | | The loop is over the operands of an instruction, and checks the register with the sub reg index of the dest register. This probably meant to be checking the sub reg index of the same operand. llvm-svn: 223205
* R600/SI: Fix running SILowerI1Copies a second timeMatt Arsenault2014-12-031-2/+1
| | | | llvm-svn: 223204
* R600/SI: Fix live range error hidden by SIFoldOperandsMatt Arsenault2014-12-031-0/+9
| | | | | | | | | | | | | | | m0 is treated as a virtual register class with a single register rather than the physical register it really is. This was updating the live range of the used virtual copy of m0 from the first ds_read instruction, and leaving the unused copy unchanged. This resulted in a "Live segment doesn't end at a valid instruction" verifier error because the erased instructions. Update the live range of the second copy (which should be dead). No test since I'm not sure how to trigger this with SIFoldOperands enabled. llvm-svn: 223203
* ADT: Add SmallVector<>::emplace_back(): fixupDuncan P. N. Exon Smith2014-12-031-1/+1
| | | | | | | Add missing `void` return type from `!LLVM_HAS_VARIADIC_TEMPLATES` case in r223201. llvm-svn: 223202
* ADT: Add SmallVector<>::emplace_back()Duncan P. N. Exon Smith2014-12-032-0/+176
| | | | llvm-svn: 223201
* StructurizeCFG: Use LoopInfo analysis for better loop detectionTom Stellard2014-12-032-1/+47
| | | | | | | | We were assuming that each back-edge in a region represented a unique loop, which is not always the case. We need to use LoopInfo to correctly determine which back-edges are loops. llvm-svn: 223199
* NVPTX: Delete dead codeDuncan P. N. Exon Smith2014-12-031-5/+0
| | | | | | `MDNode` does not inherit from `User`, and it never has a name. llvm-svn: 223198
* R600/SI: Enable inline assemblyTom Stellard2014-12-032-2/+12
| | | | | | | | We just needed to remove the assertion in AMDGPURegisterInfo::getFrameRegister(), which is called when initializing the parser for inline assembly. llvm-svn: 223197
* [OCaml] [cmake] Disable OCaml bindings if ctypes >=0.3 is not found.Peter Zotov2014-12-031-4/+8
| | | | llvm-svn: 223195
* R600/SI: Change mubuf offsets to print as decimalMatt Arsenault2014-12-0317-95/+95
| | | | | | This matches SC's behavior. llvm-svn: 223194
* Emit the entry block first and the exit block second, then all the blocks in ↵Nick Lewycky2014-12-032-3/+73
| | | | | | between afterwards. This is what gcc always does, and some out of tree tools depend on that. llvm-svn: 223193
* GCRelocateOperands: Try to appease msc17.NAKAMURA Takumi2014-12-031-2/+3
| | | | llvm-svn: 223192
* Prologue supportPeter Collingbourne2014-12-0323-65/+306
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch by Ben Gamari! This redefines the `prefix` attribute introduced previously and introduces a `prologue` attribute. There are a two primary usecases that these attributes aim to serve, 1. Function prologue sigils 2. Function hot-patching: Enable the user to insert `nop` operations at the beginning of the function which can later be safely replaced with a call to some instrumentation facility 3. Runtime metadata: Allow a compiler to insert data for use by the runtime during execution. GHC is one example of a compiler that needs this functionality for its tables-next-to-code functionality. Previously `prefix` served cases (1) and (2) quite well by allowing the user to introduce arbitrary data at the entrypoint but before the function body. Case (3), however, was poorly handled by this approach as it required that prefix data was valid executable code. Here we redefine the notion of prefix data to instead be data which occurs immediately before the function entrypoint (i.e. the symbol address). Since prefix data now occurs before the function entrypoint, there is no need for the data to be valid code. The previous notion of prefix data now goes under the name "prologue data" to emphasize its duality with the function epilogue. The intention here is to handle cases (1) and (2) with prologue data and case (3) with prefix data. References ---------- This idea arose out of discussions[1] with Reid Kleckner in response to a proposal to introduce the notion of symbol offsets to enable handling of case (3). [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073235.html Test Plan: testsuite Differential Revision: http://reviews.llvm.org/D6454 llvm-svn: 223189
* ExceptionDemo: Let setMCJITMemoryManager() take unique_ptr, since r223183.NAKAMURA Takumi2014-12-031-2/+2
| | | | llvm-svn: 223188
* [X86][MC] Intel syntax: accept implicit memory operand sizes larger than 80.Ahmed Bougacha2014-12-032-1/+27
| | | | | | | | | | The X86AsmParser intel handling was refactored in r216481, making it try each different memory operand size to see which one matches. Operand sizes larger than 80 ("[xyz]mmword ptr") were forgotten, which led to an "invalid operand" error for code such as: movdqa [rax], xmm0 llvm-svn: 223187
* [MCJIT] Unique-ptrify the RTDyldMemoryManager member of MCJIT. NFC.Lang Hames2014-12-038-29/+51
| | | | llvm-svn: 223183
* [PowerPC] Fix readcyclecounter to be custom expanded for all 32-bit targetsHal Finkel2014-12-032-6/+5
| | | | | | | We need to use the custom expansion of readcyclecounter on all 32-bit targets (even those with 64-bit registers). This should fix the ppc64 buildbot. llvm-svn: 223182
* AArch64: strengthen Darwin ABI alignment assumptionsTim Northover2014-12-024-11/+8
| | | | | | | | | | A global variable without an explicit alignment specified should be assumed to be ABI-aligned according to its type, like on other platforms. This allows us to use better memory operations when accessing it. rdar://18533701 llvm-svn: 223180
* Use a typed enum instead of 'unsigned char' for packed field. NFC.Pete Cooper2014-12-021-7/+5
| | | | | | This makes it easier to debug Twine as the 'Kind' fields now show their enum values in lldb and not escaped characters. llvm-svn: 223178
* AArch64: don't be too greedy when folding :lo12: accesses into mem ops.Tim Northover2014-12-025-35/+48
| | | | | | | | | | | | | | | This frequently leads to cases like: ldr xD, [xN, :lo12:var] add xA, xN, :lo12:var ldr xD, [xA, #8] where the ADD would have been needed anyway, and the two distinct addressing modes can prevent the formation of an ldp. Because of how we handle ADRP (aggressively forming an ADRP/ADD pseudo-inst at ISel time), this pattern also results in duplicated ADRP instructions (one on its own to cover the ldr, and one combined with the add). llvm-svn: 223172
* PR21302. Vectorize only bottom-tested loops.Michael Zolotukhin2014-12-022-0/+40
| | | | | | rdar://problem/18886083 llvm-svn: 223171
* Apply loop-rotate to several vectorizer tests.Michael Zolotukhin2014-12-023-181/+151
| | | | | | | | Such loops shouldn't be vectorized due to the loops form. After applying loop-rotate (+simplifycfg) the tests again start to check what they are intended to check. llvm-svn: 223170
* [X86][SSE] Keep 4i32 vector insertions in integer domain on SSE4.1 targetsSimon Pilgrim2014-12-025-26/+26
| | | | | | | | | | 4i32 shuffles for single insertions into zero vectors lowers to X86vzmovl which was using (v)blendps - causing domain switch stalls. This patch fixes this by using (v)pblendw instead. The updated tests on test/CodeGen/X86/sse41.ll still contain a domain stall due to the use of insertps - I'm looking at fixing this in a future patch. Differential Revision: http://reviews.llvm.org/D6458 llvm-svn: 223165
* Give lit a --xunit-xml-output option for saving results in xunit formatChris Matthews2014-12-022-7/+55
| | | | | | | | --xunit-xml-output saves test results to disk in JUnit's xml format. This will allow Jenkins to report the details of a lit run. Based on a patch by David Chisnall. llvm-svn: 223163
* [PowerPC] Implement readcyclecounter for PPC32Hal Finkel2014-12-026-0/+102
| | | | | | | | | | | | | | | | | | | We've long supported readcyclecounter on PPC64, but it is easier there (the read of the 64-bit time-base register can be accomplished via a single instruction). This now provides an implementation for PPC32 as well. On PPC32, the time-base register is still 64 bits, but can only be read 32 bits at a time via two separate SPRs. The ISA manual explains how to do this properly (it involves re-reading the upper bits and looping if the counter has wrapped while being read). This requires PPC to implement a custom integer splitting legalization for the READCYCLECOUNTER node, turning it into a target-specific SDAG node, which then gets turned into a pseudo-instruction, which is then expanded to the necessary sequence (which has three SPR reads, the comparison and the branch). Thanks to Paul Hargrove for pointing out to me that this was still unimplemented. llvm-svn: 223161
* R600/SI: Emit amd_kernel_code_t header for AMDGPU environmentTom Stellard2014-12-025-1/+829
| | | | llvm-svn: 223160
* Make sure that the TargetOptions operator== is checking theEric Christopher2014-12-021-0/+6
| | | | | | full contents of the class. llvm-svn: 223159
* [AArch64][Stackmaps] Optimize stackmap shadows on AArch64.Lang Hames2014-12-022-1/+31
| | | | | | | | | | Reduce the number of nops emitted for stackmap shadows on AArch64 by counting non-stackmap instructions up to the next branch target towards the requested shadow. <rdar://problem/14959522> llvm-svn: 223156
* R600/SI: Move more information into SIProgramInfo structTom Stellard2014-12-025-53/+83
| | | | llvm-svn: 223154
* Add bindings for the rest of the MCJIT options that we previouslyEric Christopher2014-12-022-0/+15
| | | | | | | had support for. We're still missing a binding for an MCJIT memory manager. llvm-svn: 223153
* R600: Cleanup some tests and add missing testcasesMatt Arsenault2014-12-026-234/+363
| | | | llvm-svn: 223151
* Restructure some assertion checking based on post commit feedback by Aaron ↵Philip Reames2014-12-022-7/+13
| | | | | | and Tom. llvm-svn: 223150
* [mips] Fix passing of small structures for big-endian O32.Daniel Sanders2014-12-022-0/+57
| | | | | | | | | | | | | | | | | | | Summary: Like N32/N64, they must be passed in the upper bits of the register. The new code could be merged with the existing if-statements but I've refrained from doing this since it will make porting the O32 implementation to tablegen harder later. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6463 llvm-svn: 223148
* Introduce CPUStringIsValid() into MCSubtargetInfo and use it for ARM .cpu ↵Roman Divacky2014-12-023-0/+33
| | | | | | | | | | parsing. Previously .cpu directive in ARM assembler didnt switch to the new CPU and therefore acted as a nop. This implemented real action for .cpu and eg. allows to assembler FreeBSD kernel with -integrated-as. llvm-svn: 223147
* R600/SI: Refactor AMDGPUAsmPrinter::EmitProgramInfoSI()Tom Stellard2014-12-021-9/+11
| | | | llvm-svn: 223144
* [Statepoints 4/4] Statepoint infrastructure for garbage collection: ↵Philip Reames2014-12-021-0/+209
| | | | | | | | | | | | | Documentation This is the fourth and final patch in the statepoint series. It contains the documentation for the statepoint intrinsics and their usage. There's definitely still room to improve the documentation here, but I wanted to get this landed so it was available for others. There will likely be a series of small cleanup changes over the next few weeks as we work to clarify and revise the documentation. If you have comments or questions, please feel free to discuss them either in this commit thread, the original review thread, or on llvmdev. Comments are more than welcome. Reviewed by: atrick, ributzka Differential Revision: http://reviews.llvm.org/D5683 llvm-svn: 223143
* Appease a build bot complaining about an unused variable that's used in an ↵Philip Reames2014-12-021-0/+1
| | | | | | assertion. llvm-svn: 223142
* cmake: Remove MAXPATHLEN define as autoconf does not provide itReid Kleckner2014-12-022-7/+0
| | | | | | | | Presumably it was added to the CMake system when MAXPATHLEN was still used by code built for Windows. Currently only lib/Support/Path.inc uses MAXPATHLEN, and it should be available on all Unices. llvm-svn: 223139
* Remove '#undef const' from config.h.cmake to sync with autoconfReid Kleckner2014-12-021-3/+0
| | | | | | | This define was removed from config.h.in when Rafael removed our use of libtool. llvm-svn: 223138
* [Statepoints 3/4] Statepoint infrastructure for garbage collection: ↵Philip Reames2014-12-0214-0/+1345
| | | | | | | | | | | | | | | | | | SelectionDAGBuilder This is the third patch in a small series. It contains the CodeGen support for lowering the gc.statepoint intrinsic sequences (223078) to the STATEPOINT pseudo machine instruction (223085). The change also includes the set of helper routines and classes for working with gc.statepoints, gc.relocates, and gc.results since the lowering code uses them. With this change, gc.statepoints should be functionally complete. The documentation will follow in the fourth change, and there will likely be some cleanup changes, but interested parties can start experimenting now. I'm not particularly happy with the amount of code or complexity involved with the lowering step, but at least it's fairly well isolated. The statepoint lowering code is split into it's own files and anyone not working on the statepoint support itself should be able to ignore it. During the lowering process, we currently spill aggressively to stack. This is not entirely ideal (and we have plans to do better), but it's functional, relatively straight forward, and matches closely the implementations of the patchpoint intrinsics. Most of the complexity comes from trying to keep relocated copies of values in the same stack slots across statepoints. Doing so avoids the insertion of pointless load and store instructions to reshuffle the stack. The current implementation isn't as effective as I'd like, but it is functional and 'good enough' for many common use cases. In the long term, I'd like to figure out how to integrate the statepoint lowering with the register allocator. In principal, we shouldn't need to eagerly spill at all. The register allocator should do any spilling required and the statepoint should simply record that fact. Depending on how challenging that turns out to be, we may invest in a smarter global stack slot assignment mechanism as a stop gap measure. Reviewed by: atrick, ributzka llvm-svn: 223137
OpenPOWER on IntegriCloud