path: root/llvm
Commit message | Author | Age | Files | Lines
* Revert part of the PIC tests (TLS part)Justin Hibbits2014-11-121-1/+1
| | | | | | | This change actually wasn't warranted for -O0, and the new changes prove it and break the build. llvm-svn: 221793
* Fix the tests.Justin Hibbits2014-11-122-2/+2
| | | | | | | I seem to have missed the update I made for changing 'flag_pic' to "PIC Level". Mea culpa. llvm-svn: 221792
* Add support for small-model PIC for PowerPC.Justin Hibbits2014-11-1212-84/+153
| | | | | | | | | | | | | | | | | | | | Summary: Large-model was added first. With the addition of support for multiple PIC models in LLVM, now add small-model PIC for 32-bit PowerPC, SysV4 ABI. This generates more optimal code for shared libraries with fewer than about 16380 data objects. Test Plan: Test cases added or updated Reviewers: joerg, hfinkel Reviewed By: hfinkel Subscribers: jholewinski, mcrosier, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D5399 llvm-svn: 221791
* Reduce code duplication a bit. NFC.Rafael Espindola2014-11-122-11/+6
| | | | llvm-svn: 221785
* Fix the test.Rafael Espindola2014-11-121-2/+10
| | | | | | It was broken since r221708. llvm-svn: 221783
* Fixing more -Wcast-qual warnings; NFC.Aaron Ballman2014-11-122-6/+14
| | | | llvm-svn: 221782
* Fixing a -Wcast-qual warning; NFC.Aaron Ballman2014-11-121-2/+3
| | | | llvm-svn: 221781
* [mips][micromips] Add predicate 'InMicroMips' at CodeGen patterns for ↵Zoran Jovanovic2014-11-122-1/+26
| | | | | | | | microMIPS instructions Differential Revision: http://reviews.llvm.org/D6198 llvm-svn: 221780
* [x86] Start improving the matching of unpck instructions based on testChandler Carruth2014-11-123-15/+13
| | | | | | | | cases from Halide folks. This initial step was extracted from a prototype change by Clay Wood to try and address regressions found with Halide and the new vector shuffle lowering. llvm-svn: 221779
* [x86] Clean up a bunch of vector shuffle tests with my script. Notably,Chandler Carruth2014-11-124-65/+69
| | | | | | | removes Windows line endings and other noise. This is a prelude to making substantive changes to these tests. llvm-svn: 221776
* MCDisassembler::getInstruction(): Prune also "\param Region", since it was ↵NAKAMURA Takumi2014-11-121-1/+0
| | | | | | removed in r221751. [-Wdocumentation] llvm-svn: 221775
* AVX-512: Intrinsics for ERIElena Demikhovsky2014-11-128-94/+180
| | | | | | | | | 3 instructions: vrcp28, vrsqrt28, vexp2, only vector forms. Intrinsics include SAE (Suppress All Exceptions) parameter. http://reviews.llvm.org/D6214 llvm-svn: 221774
* Reverts r221772 which fails testsJingyue Wu2014-11-123-106/+12
| | | | llvm-svn: 221773
* Disable indvar widening if arithmetics on the wider type are more expensiveJingyue Wu2014-11-123-12/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: IndVarSimplify should not widen an indvar if arithmetics on the wider indvar are more expensive than those on the narrower indvar. For instance, although NVPTX64 treats i64 as a legal type, an ADD on i64 is twice as expensive as that on i32, because the hardware needs to simulate a 64-bit integer using two 32-bit integers. Split from D6188, and based on D6195 which adds NVPTXTargetTransformInfo. Fixes PR21148. Test Plan: Added @indvar_32_bit that verifies we do not widen an indvar if the arithmetics on the wider type are more expensive. Reviewers: jholewinski, eliben, meheff, atrick Reviewed By: atrick Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D6196 llvm-svn: 221772
* Delete dead code. NFC.Rafael Espindola2014-11-121-6/+0
| | | | llvm-svn: 221770
* [PowerPC] Add vec_vsx_ld and vec_vsx_st intrinsicsBill Schmidt2014-11-127-12/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables the vec_vsx_ld and vec_vsx_st intrinsics for PowerPC, which provide programmer access to the lxvd2x, lxvw4x, stxvd2x, and stxvw4x instructions. New LLVM intrinsics are provided to represent these four instructions in IntrinsicsPowerPC.td. These are patterned after the similar intrinsics for lvx and stvx (Altivec). In PPCInstrVSX.td, these intrinsics are tied to the code gen patterns, with additional patterns to allow plain vanilla loads and stores to still generate these instructions. At -O1 and higher the intrinsics are immediately converted to loads and stores in InstCombineCalls.cpp. This will open up more optimization opportunities while still allowing the correct instructions to be generated. (Similar code exists for aligned Altivec loads and stores.) The new intrinsics are added to the code that checks for consecutive loads and stores in PPCISelLowering.cpp, as well as to PPCTargetLowering::getTgtMemIntrinsic(). There's a new test to verify the correct instructions are generated. The loads and stores tend to be reordered, so the test just counts their number. It runs at -O2, as it's not very effective to test this at -O0, when many unnecessary loads and stores are generated. I ended up having to modify vsx-fma-m.ll. It turns out this test case is slightly unreliable, but I don't know a good way to prevent problems with it. The xvmaddmdp instructions read and write the same register, which is one of the multiplicands. Commutativity allows either to be chosen. If the FMAs are reordered differently than expected by the test, the register assignment can be different as a result. Hopefully this doesn't change often. There is a companion patch for Clang. llvm-svn: 221767
* Merge StreamableMemoryObject into MemoryObject.Rafael Espindola2014-11-126-63/+51
| | | | | | | | | Every MemoryObject is a StreamableMemoryObject since the removal of StringRefMemoryObject, so just merge the two. I will clean up the MemoryObject interface in the upcoming commits. llvm-svn: 221766
* Fix non-variadic function_ref cases to match r221753David Blaikie2014-11-121-4/+16
| | | | llvm-svn: 221763
* Don't duplicate name in comments. NFC.Rafael Espindola2014-11-122-40/+33
| | | | llvm-svn: 221762
* Revert "Use a function_ref now that it works (r221753)."Rafael Espindola2014-11-121-2/+2
| | | | | | | | This reverts commit r221756. David Blaikie pointed out it was unsafe. llvm-svn: 221761
* Remove unused method. NFC.Rafael Espindola2014-11-123-22/+0
| | | | llvm-svn: 221759
* Make readBytes pure virtual. Every real implementation has it.Rafael Espindola2014-11-122-20/+2
| | | | llvm-svn: 221758
* Remove unused method. NFC.Rafael Espindola2014-11-124-8/+1
| | | | llvm-svn: 221757
* Use a function_ref now that it works (r221753).Rafael Espindola2014-11-121-2/+2
| | | | llvm-svn: 221756
* Remove the now unused StringRefMemoryObject.h.Rafael Espindola2014-11-127-75/+0
| | | | llvm-svn: 221755
* Ensure function_refs are copyable even from non-const referencesDavid Blaikie2014-11-123-2/+34
| | | | | | | | | | | | | | | | | | A subtle bug was found where attempting to copy a non-const function_ref lvalue would actually invoke the generic forwarding constructor (as it was a closer match - being T& rather than the const T& of the implicit copy constructor). In the particular case this led to a dangling function_ref member (since it had referenced the function_ref passed by value to its ctor, rather than the outer function_ref that was still alive). SFINAE the converting constructor to not be considered if the copy constructor is available and demonstrate that this causes the copy to refer to the original functor, not to the function_ref it was copied from. (Without the code change, the test would fail as Y would be referencing X and Y() would see the result of the mutation to X, i.e., 2.) llvm-svn: 221753
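The overload-resolution problem described in r221753 can be reproduced with a small stand-in class. The sketch below is an illustration only: `FnRef` and `copyThenReassign` are hypothetical names, and the class is a simplified imitation, not the actual `llvm::function_ref` source. It assumes the fix amounts to disabling the converting constructor whenever the argument is itself an `FnRef`.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical, simplified stand-in for a non-owning callable reference.
template <typename Fn> class FnRef;

template <typename Ret, typename... Params> class FnRef<Ret(Params...)> {
  Ret (*Callback)(void *Callee, Params... Ps) = nullptr;
  void *Callee = nullptr;

  // Trampoline that restores the erased callable type and invokes it.
  template <typename Callable>
  static Ret callbackFn(void *C, Params... Ps) {
    return (*static_cast<Callable *>(C))(std::forward<Params>(Ps)...);
  }

public:
  // Converting constructor. The enable_if removes it from overload
  // resolution when Callable is FnRef itself, so copying a non-const
  // FnRef lvalue selects the implicit copy constructor instead.
  template <typename Callable,
            typename = typename std::enable_if<!std::is_same<
                typename std::remove_cv<
                    typename std::remove_reference<Callable>::type>::type,
                FnRef>::value>::type>
  FnRef(Callable &&C)
      : Callback(callbackFn<typename std::remove_reference<Callable>::type>),
        Callee(const_cast<void *>(static_cast<const void *>(&C))) {}

  Ret operator()(Params... Ps) const {
    assert(Callback && "FnRef must refer to a callable");
    return Callback(Callee, std::forward<Params>(Ps)...);
  }
};

// Mirrors the scenario in the commit message: Y is copied from the
// non-const lvalue X, then X is re-pointed at a different functor.
int copyThenReassign() {
  auto A = [] { return 1; };
  auto B = [] { return 2; };
  FnRef<int()> X = A;
  FnRef<int()> Y = X; // copy constructor, so Y refers to A, not to X
  X = B;
  return Y();
}
```

With the `enable_if` in place, `copyThenReassign()` returns 1. If the constraint is deleted, the forwarding constructor is the better match for the non-const `X`, `Y` ends up referring to `X`, and the call observes the reassignment.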
* Pass an ArrayRef to MCDisassembler::getInstruction.Rafael Espindola2014-11-1217-113/+95
| | | | | | | | | | | | With this patch MCDisassembler::getInstruction takes an ArrayRef<uint8_t> instead of a MemoryObject. Even on X86 there is a maximum size an instruction can have. Given that, it seems way simpler and more efficient to just pass an ArrayRef to the disassembler instead of a MemoryObject and have it do a virtual call every time it wants some extra bytes. llvm-svn: 221751
* Object, support both mach-o archive t.o.c file namesNick Kledzik2014-11-123-1/+2
| | | | | | | | | | For historical reasons archives on mach-o have two possible names for the file containing the table of contents for the archive: "__.SYMDEF SORTED" and "__.SYMDEF". But the libObject archive reader only supported the former. This patch fixes llvm::object::Archive to support both names. llvm-svn: 221747
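The name check described in r221747 can be pictured as a two-way comparison. This is a minimal sketch under stated assumptions: `isMachOArchiveTocName` is a hypothetical helper written for illustration, not a function in `llvm::object::Archive`, and the real reader may structure the test differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns true when an archive member name is one of
// the two table-of-contents names mentioned in the commit message.
bool isMachOArchiveTocName(const std::string &Name) {
  return Name == "__.SYMDEF SORTED" || Name == "__.SYMDEF";
}
```

A reader that previously accepted only `"__.SYMDEF SORTED"` would treat a plain `"__.SYMDEF"` member as an ordinary file; accepting both names avoids that.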
* Remove a bit of dead code.Rafael Espindola2014-11-124-20/+11
| | | | | | Every "real" object file implements this and ptx doesn't use it. llvm-svn: 221746
* Extend intrinsic name mangling to support arrays, named structs, and ↵Philip Reames2014-11-121-6/+28
| | | | | | | | | | | | | function types. Currently, we have a type parameter mechanism for intrinsics. Rather than having to specify a separate intrinsic for each combination of argument and return types, we can specify a single intrinsic with one or more type parameters. These type parameters are passed explicitly to Intrinsic::getDeclaration or can be specified implicitly in the naming of the intrinsic function in an LL file. Today, the types are limited to integer, floating point, and pointer types. With a goal of supporting symbolic targets for patchpoints and statepoints, this change adds support for function types. The change also includes support for first class aggregate types (named structures and arrays) since these appear in function types we've encountered. Reviewed by: atrick, ributzka Differential Revision: http://reviews.llvm.org/D4608 llvm-svn: 221742
* Make TreePattern::error use TwineMatt Arsenault2014-11-112-2/+2
| | | | | | | The underlying error function already uses a Twine, and most of the uses build up strings. llvm-svn: 221740
* [Reassociate] Canonicalize negative constants out of expressions.Chad Rosier2014-11-112-1/+50
| | | | | | Add support for FDiv, which was regressed by the previous commit. llvm-svn: 221738
* Canonicalize an assume(load != null) into !nonnull metadataPhilip Reames2014-11-112-0/+94
| | | | | | | | | | | We currently have two ways of informing the optimizer that the result of a load is never null: metadata and assume. This change converts the latter into the former. This avoids a need to implement optimizations using both forms. We should probably extend this basic idea to metadata of other forms; in particular, range metadata. Our view is that assumes should be considered a "last resort" for when there isn't a more canonical way to represent something. Reviewed by: Hal Differential Revision: http://reviews.llvm.org/D5951 llvm-svn: 221737
* libLTO: Allow linker to choose context of modules and codegenDuncan P. N. Exon Smith2014-11-112-1/+67
| Add API for specifying which `LLVMContext` each `lto_module_t` and
| `lto_code_gen_t` is in.
|
| In particular, this enables the following flow:
|
|     for (auto &File : Files) {
|       lto_module_t M = lto_module_create_in_local_context(File...);
|       querySymbols(M);
|       lto_module_dispose(M);
|     }
|
|     lto_code_gen_t CG = lto_codegen_create_in_local_context();
|     for (auto &File : FilesToLink) {
|       lto_module_t M = lto_module_create_in_codegen_context(File..., CG);
|       lto_codegen_add_module(CG, M);
|       lto_module_dispose(M);
|     }
|     lto_codegen_compile(CG);
|     lto_codegen_write_merged_modules(CG, ...);
|     lto_codegen_dispose(CG);
|
| This flow has a few benefits.
|
|   - Only one module (two if you count the combined module in the code
|     generator) is in memory at a time.
|   - Metadata (and constants) from files that are parsed to query symbols
|     but not linked into the code generator don't pollute the global
|     context.
|   - The first for loop can be parallelized, since each module is in its
|     own context.
|   - When the code generator is disposed, the memory from LTO gets freed.
|
| rdar://problem/18767512
|
| llvm-svn: 221733
* Initialize new subtarget feature variable for generating reciprocal estimate ↵Sanjay Patel2014-11-111-0/+1
| | | | | | | | instructions. This was missed in r221706. llvm-svn: 221731
* libLTO: Assert if LTOCodeGenerator and LTOModule are from different contextsDuncan P. N. Exon Smith2014-11-111-0/+3
| | | | llvm-svn: 221730
* [FastISel][AArch64] Add support for fabs intrinsic.Juergen Ributzka2014-11-112-0/+45
| | | | | | | | Lower the llvm.fabs intrinsic to the 'fabs' MI instruction. This fixes rdar://problem/18946552. llvm-svn: 221729
* libLTO: Allow LTOModule to own a contextDuncan P. N. Exon Smith2014-11-113-9/+59
| | | | llvm-svn: 221728
* libLTO: Allow LTOCodeGenerator to own a contextDuncan P. N. Exon Smith2014-11-112-4/+21
| | | | llvm-svn: 221726
* [asan] adding ShadowOffset64 for mips64, patch by Kumar SukhaniKostya Serebryany2014-11-111-0/+5
| | | | llvm-svn: 221725
* [Reassociate] Canonicalize negative constants out of expressions.Chad Rosier2014-11-115-115/+205
| | | | | | | | This is a reapplication of r221171, but we only perform the transformation on expressions which include a multiplication. We do not transform rem/div operations as this doesn't appear to be safe in all cases. llvm-svn: 221721
* Move asan-coverage into a separate phase.Kostya Serebryany2014-11-119-171/+287
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This change moves asan-coverage instrumentation into a separate Module pass. The other part of the change in clang introduces a new flag -fsanitize-coverage=N. Another small patch will update tests in compiler-rt. With this patch no functionality change is expected except for the flag name. The following changes will make the coverage instrumentation work with tsan/msan Test Plan: Run regression tests, chromium. Reviewers: nlewycky, samsonov Reviewed By: nlewycky, samsonov Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6152 llvm-svn: 221718
* Revert "IR: MDNode => Value"Duncan P. N. Exon Smith2014-11-1146-201/+173
| | | | | | | | | | | | | | | | | Instead, we're going to separate metadata from the Value hierarchy. See PR21532. This reverts commit r221375. This reverts commit r221373. This reverts commit r221359. This reverts commit r221167. This reverts commit r221027. This reverts commit r221024. This reverts commit r221023. This reverts commit r220995. This reverts commit r220994. llvm-svn: 221711
* Fix build break: remove unused variable in FCFI.Tom Roeder2014-11-111-1/+0
| | | | llvm-svn: 221710
* Totally forget deallocated SDNodes in SDDbgInfo.Frederic Riss2014-11-112-4/+16
| | | | | | | | | | | | | | | | What would happen before that commit is that the SDDbgValues associated with a deallocated SDNode would be marked Invalidated, but SDDbgInfo would keep a map entry keyed by the SDNode pointer pointing to this list of invalidated SDDbgNodes. As the memory gets reused, the list might get wrongly associated with another new SDNode. As the SDDbgValues are cloned when they are transferred, this can lead to an exponential number of SDDbgValues being produced during DAGCombine like in http://llvm.org/bugs/show_bug.cgi?id=20893 Note that the previous behavior wasn't really buggy as the invalidation made sure that the SDDbgValues won't be used. This commit can be considered a memory optimization and as such is really hard to validate in a unit-test. llvm-svn: 221709
* Add Forward Control-Flow Integrity.Tom Roeder2014-11-1129-133/+873
| | | | | | | | | | | | | | | | | | | | This commit adds a new pass that can inject checks before indirect calls to make sure that these calls target known locations. It supports three types of checks and, at compile time, it can take the name of a custom function to call when an indirect call check fails. The default failure function ignores the error and continues. This pass incidentally moves the function JumpInstrTables::transformType from private to public and makes it static (with a new argument that specifies the table type to use); this is so that the CFI code can transform function types at call sites to determine which jump-instruction table to use for the check at that site. Also, this removes support for jumptables in ARM, pending further performance analysis and discussion. Review: http://reviews.llvm.org/D4167 llvm-svn: 221708
* [llvm-mc] Fixing case where if a file ended with non-newline whitespace or a ↵Colin LeMahieu2014-11-112-15/+11
| | | | | | | | comma it would access invalid memory. Cleaned up parse loop. llvm-svn: 221707
* Use rcpss/rcpps (X86) to speed up reciprocal calcs (PR21385).Sanjay Patel2014-11-115-1/+116
| | | | | | | | | | | | | | | | | | | | | | This is a first step for generating SSE rcp instructions for reciprocal calcs when fast-math allows it. This is very similar to the rsqrt optimization enabled in D5658 ( http://reviews.llvm.org/rL220570 ). For now, be conservative and only enable this for AMD btver2 where performance improves significantly both in terms of latency and throughput. We may never enable this codegen for Intel Core* chips because the divider circuits are just too fast. On SandyBridge, divss can be as fast as 10 cycles versus the 21 cycle critical path for the rcp + mul + sub + mul + add estimate. Follow-on patches may allow configuration of the number of Newton-Raphson refinement steps, add AVX512 support, and enable the optimization for more chips. More background here: http://llvm.org/bugs/show_bug.cgi?id=21385 Differential Revision: http://reviews.llvm.org/D6175 llvm-svn: 221706
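The "rcp + mul + sub + mul + add" sequence mentioned in r221706 corresponds to one Newton-Raphson refinement step applied to a hardware reciprocal estimate. The sketch below shows only the refinement arithmetic in scalar C++. `refineReciprocal` is a hypothetical name, the initial estimate is passed in by the caller rather than produced by `rcpss`, and the exact operation order LLVM emits is an assumption here.

```cpp
#include <cassert>
#include <cmath>

// One Newton-Raphson step toward 1/A, starting from the estimate X0:
//   X1 = X0 + X0 * (1 - A * X0)
// In hardware, X0 would come from an instruction such as rcpss; here the
// caller supplies it so the example stays portable.
float refineReciprocal(float A, float X0) {
  float Err = 1.0f - A * X0; // mul + sub: residual of the current estimate
  return X0 + X0 * Err;      // mul + add: corrected estimate
}
```

For example, refining the estimate 0.24 for 1/4 gives roughly 0.2496, so the absolute error drops from 0.01 to about 0.0004. Each step roughly squares the relative error, which is why a single step after a coarse hardware estimate can be enough when fast-math permits it.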
* Simplify testcase. NFC.Rafael Espindola2014-11-111-2/+2
| | | | | | Thanks to Filipe Cabecinhas for the tip. llvm-svn: 221705
* [PowerPC] Replace foul hackery with real calls to __tls_get_addrBill Schmidt2014-11-118-125/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | My original support for the general dynamic and local dynamic TLS models contained some fairly obtuse hacks to generate calls to __tls_get_addr when lowering a TargetGlobalAddress. Rather than generating real calls, special GET_TLS_ADDR nodes were used to wrap the calls and only reveal them at assembly time. I attempted to provide correct parameter and return values by chaining CopyToReg and CopyFromReg nodes onto the GET_TLS_ADDR nodes, but this was also not fully correct. Problems were seen with two back-to-back stores to TLS variables, where the call sequences ended up overlapping with unhappy results. Additionally, since these weren't real calls, the proper register side effects of a call were not recorded, so clobbered values were kept live across the calls. The proper thing to do is to lower these into calls in the first place. This is relatively straightforward; see the changes to PPCTargetLowering::LowerGlobalTLSAddress() in PPCISelLowering.cpp. The changes here are standard call lowering, except that we need to track the fact that these calls will require a relocation. This is done by adding a machine operand flag of MO_TLSLD or MO_TLSGD to the TargetGlobalAddress operand that appears earlier in the sequence. The calls to LowerCallTo() eventually find their way to LowerCall_64SVR4() or LowerCall_32SVR4(), which call FinishCall(), which calls PrepareCall(). In PrepareCall(), we detect the calls to __tls_get_addr and immediately snag the TargetGlobalTLSAddress with the annotated relocation information. This becomes an extra operand on the call following the callee, which is expected for nodes of type tlscall. We change the call opcode to CALL_TLS for this case. Back in FinishCall(), we change it again to CALL_NOP_TLS for 64-bit only, since we require a TOC-restore nop following the call for the 64-bit ABIs. 
During selection, patterns in PPCInstrInfo.td and PPCInstr64Bit.td convert the CALL_TLS nodes into BL_TLS nodes, and convert the CALL_NOP_TLS nodes into BL8_NOP_TLS nodes. This replaces the code removed from PPCAsmPrinter.cpp, as the BL_TLS or BL8_NOP_TLS nodes can now be emitted normally using their patterns and the associated printTLSCall print method. Finally, as a result of these changes, all references to get-tls-addr in its various guises are no longer used, so they have been removed. There are existing TLS tests to verify the changes haven't messed anything up. I've added one new test that verifies that the problem with the original code has been fixed. llvm-svn: 221703