summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* Revert "Decouple dllexport/dllimport from linkage"Nico Rieck2014-01-1417-189/+93
| | | | | | | | Revert this for now until I fix an issue in Clang with it. This reverts commit r199204. llvm-svn: 199207
* Revert "Handle dllexport for global aliases"Nico Rieck2014-01-141-15/+0
| | | | | | This reverts commit r199205. llvm-svn: 199206
* Handle dllexport for global aliasesNico Rieck2014-01-141-0/+15
| | | | llvm-svn: 199205
* Decouple dllexport/dllimport from linkageNico Rieck2014-01-1417-93/+189
| | | | | | | | | | | | | | | | | | | Representing dllexport/dllimport as distinct linkage types prevents using these attributes on templates and inline functions. Instead of introducing further mixed linkage types to include linkonce and weak ODR, the old import/export linkage types are replaced with a new separate visibility-like specifier: define available_externally dllimport void @f() {} @Var = dllexport global i32 1, align 4 Linkage for dllexported globals and functions is now equal to their linkage without dllexport. Imported globals and functions must be either declarations with external linkage, or definitions with AvailableExternallyLinkage. llvm-svn: 199204
* Fix fastcall mangling of dllimported symbolsNico Rieck2014-01-141-7/+6
| | | | | | | | | | | | | fastcall requires @ as global prefix instead of _ but getNameWithPrefix wrongly assumes the OutName buffer is empty and replaces at index 0. For imported functions this buffer is pre-filled with "__imp_" resulting in broken "@_imp_foo@0" mangling. Instead replace at the proper index. We also never have to prepend the @-prefix because this fastcall mangling is only used on 32-bit Windows targets which have _ has global prefix. llvm-svn: 199203
* Revert r199191, "LTO: add API to set strategy for -internalize"NAKAMURA Takumi2014-01-142-38/+19
| | | | | | Please update also Other/link-opts.ll, in next time. llvm-svn: 199197
* Separate the concept of 16-bit/32-bit operand size controlled by 0x66 prefix ↵Craig Topper2014-01-1411-313/+325
| | | | | | | | and the current mode from the concept of SSE instructions using 0x66 prefix as part of their encoding without being affected by the mode. This should allow SSE instructions to be encoded correctly in 16-bit mode which r198586 probably broke. llvm-svn: 199193
* LTO: add API to set strategy for -internalizeDuncan P. N. Exon Smith2014-01-142-19/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add API to LTOCodeGenerator to specify a strategy for the -internalize pass. This is a new attempt at Bill's change in r185882, which he reverted in r188029 due to problems with the gold linker. This puts the onus on the linker to decide whether (and what) to internalize. In particular, running internalize before outputting an object file may change a 'weak' symbol into an internal one, even though that symbol could be needed by an external object file --- e.g., with arclite. This patch enables three strategies: - LTO_INTERNALIZE_FULL: the default (and the old behaviour). - LTO_INTERNALIZE_NONE: skip -internalize. - LTO_INTERNALIZE_HIDDEN: only -internalize symbols with hidden visibility. LTO_INTERNALIZE_FULL should be used when linking an executable. Outputting an object file (e.g., via ld -r) is more complicated, and depends on whether hidden symbols should be internalized. E.g., for ld -r, LTO_INTERNALIZE_NONE can be used when -keep_private_externs, and LTO_INTERNALIZE_HIDDEN can be used otherwise. However, LTO_INTERNALIZE_FULL is inappropriate, since the output object file will eventually need to link with others. lto_codegen_set_internalize_strategy() sets the strategy for subsequent calls to lto_codegen_write_merged_modules() and lto_codegen_compile*(). <rdar://problem/14334895> llvm-svn: 199191
* Always let value types influence register classes.Jakob Stoklund Olesen2014-01-141-4/+13
| | | | | | | | | | | | | | | | | | | | | When creating a virtual register for a def, the value type should be used to pick the register class. If we only use the register class constraint on the instruction, we might pick a too large register class. Some registers can store values of different sizes. For example, the x86 xmm registers can hold f32, f64, and 128-bit vectors. The three different value sizes are represented by register classes with identical register sets: FR32, FR64, and VR128. These register classes have different spill slot sizes, so it is important to use the right one. The register class constraint on an instruction doesn't necessarily care about the size of the value its defining. The value type determines that. This fixes a problem where InstrEmitter was picking 32-bit register classes for 64-bit values on SPARC. llvm-svn: 199187
* Switch the NEON register class from QPR to DPair.Jakob Stoklund Olesen2014-01-141-1/+1
| | | | | | | | | | | | | The already allocatable DPair superclass contains odd-even D register pair in addition to the even-odd pairs in the QPR register class. There is no reason to constrain the set of D register pairs that can be used for NEON values. Any NEON instructions that require a Q register will automatically constrain the register class to QPR. The allocation order for DPair begins with the QPR registers, so register allocation is unlikely to change much. llvm-svn: 199186
* Replace .mips_hack_stocg with ".set micromips" and ".set nomicromips".Rafael Espindola2014-01-145-49/+47
| | | | | | | This matches what gnu as does and implementing this is easier than arguing about it. llvm-svn: 199181
* Fix llc to not reuse spill slots in functions that invoke setjmp()Mark Seaborn2014-01-141-4/+2
| | | | | | | | | | | | | | | | | | We need to ensure that StackSlotColoring.cpp does not reuse stack spill slots in functions that call "returns_twice" functions such as setjmp(), otherwise this can lead to miscompiled code, because a stack slot would be clobbered when it's still live. This was already handled correctly for functions that call setjmp() (though this wasn't covered by a test), but not for functions that invoke setjmp(). We fix this by changing callsFunctionThatReturnsTwice() to check for invoke instructions. This fixes PR18244. llvm-svn: 199180
* Make getTargetStreamer return a possibly null pointer.Rafael Espindola2014-01-148-9/+10
| | | | | | | | | This will allow it to be called from target independent parts of the main streamer that don't know if there is a registered target streamer or not. This in turn will allow targets to perform extra actions at specified points in the interface: add extra flags for some labels, extra work during finalization, etc. llvm-svn: 199174
* Fix uninitialized warning in llvm/lib/IR/DataLayout.cpp.Cameron McInally2014-01-131-2/+3
| | | | llvm-svn: 199147
* [DAG] Refactor ReassociateOps - no functional change intended.Juergen Ributzka2014-01-131-73/+44
| | | | llvm-svn: 199146
* [DAG] Teach DAG to also reassociate vector operationsJuergen Ributzka2014-01-132-16/+60
| | | | | | | | | | This commit teaches DAG to reassociate vector ops, which in turn enables constant folding of vector op chains that appear later on during custom lowering and DAG combine. Reviewed by Andrea Di Biagio llvm-svn: 199135
* Hide the pre-RA-sched= option.Andrew Trick2014-01-132-2/+2
| | | | | | | | | This is a very confusing option for a feature that will go away. -enable-misched is exposed instead to help triage issues with the new scheduler. llvm-svn: 199133
* Fix PR 18369: [Thumbv8] asserts due to inconsistent CPSR liveness of IT blocksWeiming Zhao2014-01-131-0/+3
| | | | | | | | | The issue is caused when Post-RA scheduler reorders a bundle instruction (IT block). However, it only flips the CPSR liveness of the bundle instruction, leaves the instructions inside the bundle unchanged, which causes inconstancy and crashes Thumb2SizeReduction.cpp::ReduceMBB(). llvm-svn: 199127
* Update getLazyBitcodeModule to use ErrorOr for error handling.Rafael Espindola2014-01-134-20/+27
| | | | llvm-svn: 199125
* [AArch64] Fix assertion failure caused by an invalid comparison between ↵Andrea Di Biagio2014-01-131-2/+3
| | | | | | | | | | | | | APInt values. APInt only knows how to compare values with the same BitWidth and asserts in all other cases. With this fix, function PerformORCombine does not use the APInt equality operator if the APInt values returned by 'isConstantSplat' differ in BitWidth. In that case they are different and no comparison is needed. llvm-svn: 199119
* Fix indentation.Joerg Sonnenberger2014-01-131-11/+11
| | | | llvm-svn: 199118
* [SystemZ] Optimize (sext (ashr (shl ...), ...))Richard Sandiford2014-01-132-0/+36
| | | | | | | | | | ...into (ashr (shl (anyext X), ...), ...), which requires one fewer instruction. The (anyext X) can sometimes be simplified too. I didn't do this in DAGCombiner because widening shifts isn't a win on all targets. llvm-svn: 199114
* ARM: constrain Thumb LDRLIT pseudo-instructions to r0-r7.Tim Northover2014-01-131-4/+5
| | | | | | | | | | | | Previously we only used GPR for the destination placeholder in "ldr rD, [pc, incorrect codegen under the integrated assembler. This should fix both issues (which probably only affect MachO targets at the moment). rdar://problem/15800156 llvm-svn: 199108
* [x86] Fix retq/retl handling in 64-bit modeDavid Woodhouse2014-01-135-11/+25
| | | | | | | | | | | | | | | | | | | | | | | | | This finishes the job started in r198756, and creates separate opcodes for 64-bit vs. 32-bit versions of the rest of the RET instructions too. LRETL/LRETQ are interesting... I can't see any justification for their existence in the SDM. There should be no 'LRETL' in 64-bit mode, and no need for a REX.W prefix for LRETQ. But this is what GAS does, and my Sandybridge CPU and an Opteron 6376 concur when tested as follows: asm __volatile__("pushq $0x1234\nmovq $0x33,%rax\nsalq $32,%rax\norq $1f,%rax\npushq %rax\nlretl $8\n1:"); asm __volatile__("pushq $1234\npushq $0x33\npushq $1f\nlretq $8\n1:"); asm __volatile__("pushq $0x33\npushq $1f\nlretq\n1:"); asm __volatile__("pushq $0x1234\npushq $0x33\npushq $1f\nlretq $8\n1:"); cf. PR8592 and commit r118903, which added LRETQ. I only added LRETIQ to match it. I don't quite understand how the Intel syntax parsing for ret instructions is working, despite r154468 allegedly fixing it. Aren't the explicitly sized 'retw', 'retd' and 'retq' supposed to work? I have at least made the 'lretq' work with (and indeed *require*) the 'q'. llvm-svn: 199106
* [PM] Split DominatorTree into a concrete analysis result object whichChandler Carruth2014-01-1352-220/+283
| | | | | | | | | | | | | | | | | | | | | | | can be used by both the new pass manager and the old. This removes it from any of the virtual mess of the pass interfaces and lets it derive cleanly from the DominatorTreeBase<> template. In turn, tons of boilerplate interface can be nuked and it turns into a very straightforward extension of the base DominatorTree interface. The old analysis pass is now a simple wrapper. The names and style of this split should match the split between CallGraph and CallGraphWrapperPass. All of the users of DominatorTree have been updated to match using many of the same tricks as with CallGraph. The goal is that the common type remains the resulting DominatorTree rather than the pass. This will make subsequent work toward the new pass manager significantly easier. Also in numerous places things became cleaner because I switched from re-running the pass (!!! mid way through some other passes run!!!) to directly recomputing the domtree. llvm-svn: 199104
* AVX-512: Embedded Rounding Control - encoding and printingElena Demikhovsky2014-01-136-160/+220
| | | | | | Changed intrinsics for vrcp14/vrcp28 vrsqrt14/vrsqrt28 - aligned with GCC. llvm-svn: 199102
* [PM] Pull the generic graph algorithms and data structures for dominatorChandler Carruth2014-01-135-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | trees into the Support library. These are all expressed in terms of the generic GraphTraits and CFG, with no reliance on any concrete IR types. Putting them in support clarifies that and makes the fact that the static analyzer in Clang uses them much more sane. When moving the Dominators.h file into the IR library I claimed that this was the right home for it but not something I planned to work on. Oops. So why am I doing this? It happens to be one step toward breaking the requirement that IR verification can only be performed from inside of a pass context, which completely blocks the implementation of verification for the new pass manager infrastructure. Fixing it will also allow removing the concept of the "preverify" step (WTF???) and allow the verifier to cleanly flag functions which fail verification in a way that precludes even computing dominance information. Currently, that results in a fatal error even when you ask the verifier to not fatally error. It's awesome like that. The yak shaving will continue... llvm-svn: 199095
* Revert "ReMat: fix overly cavalier attitude to sub-register indices"Tim Northover2014-01-131-4/+24
| | | | | | | | | | Very sorry, this was a premature patch that I still need to investigate and finish off (for some reason beyond me at the moment it doesn't actually fix the issue in all cases). This reverts commit r199091. llvm-svn: 199093
* ReMat: fix overly cavalier attitude to sub-register indicesTim Northover2014-01-131-24/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two attempted optimisations in reMaterializeTrivialDef, trying to avoid promoting the size of a register too much when rematerializing. Unfortunately, both appear to be flawed. First, we see if the original register would have worked, but this is inadequate. Consider: v1 = SOMETHING (v1 is QQ) v2:Q0 = COPY v1:Q1 (v1, v2 are QQ) ... uses of v2 In this case even though v2 *could* be used directly as the output of SOMETHING, this would set the wrong bits of the QQ register involved. The correct rematerialization must be: v2:Q0_Q1 = SOMETHING (v2 promoted to QQQ) ... uses of v2:Q1_Q2 For the second optimisation, if the correct remat is "v2:idx = SOMETHING" then we can't necessarily expect v2 itself to be valid for SOMETHING, but we do try to hunt for a class between v1 and v2 that works. Unfortunately, this is also wrong: v1 = SOMETHING (v1 is QQ) v2:Q0_Q1 = COPY v1 (v1 is QQ, v2 is QQQ) ... uses of v2 as a QQQ The canonical rematerialization here is "v2:Q0_Q1 = SOMETHING". However current logic would decide that v2 could be a QQ (no interest is taken in later uses). This patch, therefore, always accepts the widened register class without trying to be clever. Generally there is no penalty to this (e.g. in the common GR32 < GR64 case, expanding the width doesn't matter because it's not like you were going to do anything else with the high bits of a GR32 register). It can increase register pressure in cases like the ARM VFP regs though (multiple non-overlapping but equivalent subregisters). Hopefully this situation is rare enough that it won't matter. Unfortunately, no in-tree targets actually expose this as far as I can tell (there are so few isAsCheapAsAMove instructions for it to trigger on) so I've been unable to produce a test. It was exposed in our ARM64 SPEC tests though, and I will be adding a test there that we should be able to contribute soon(TM). llvm-svn: 199091
* [cleanup] Move the Dominators.h and Verifier.h headers into the IRChandler Carruth2014-01-1366-74/+74
| | | | | | | | | | | | | | | | | | directory. These passes are already defined in the IR library, and it doesn't make any sense to have the headers in Analysis. Long term, I think there is going to be a much better way to divide these matters. The dominators code should be fully separated into the abstract graph algorithm and have that put in Support where it becomes obvious that evn Clang's CFGBlock's can use it. Then the verifier can manually construct dominance information from the Support-driven interface while the Analysis library can provide a pass which both caches, reconstructs, and supports a nice update API. But those are very long term, and so I don't want to leave the really confusing structure until that day arrives. llvm-svn: 199082
* Re-sort #include lines again, prior to moving headers around.Chandler Carruth2014-01-133-5/+5
| | | | llvm-svn: 199080
* [PM] Wire up support for writing bitcode with new PM.Chandler Carruth2014-01-131-3/+9
| | | | | | | | | | This moves the old pass creation functionality to its own header and updates the callers of that routine. Then it adds a new PM supporting bitcode writer to the header file, and wires that up in the opt tool. A test is added that round-trips code into bitcode and back out using the new pass manager. llvm-svn: 199078
* [AArch64 NEON] Add missing patterns for bitcast from or to v1f64Kevin Qin2014-01-131-0/+14
| | | | llvm-svn: 199070
* [AArch64 NEON] Add more scenarios to use perm instructions when lowering ↵Kevin Qin2014-01-131-10/+37
| | | | | | | | | | | | | | | | shuffle_vector This patch covered 2 more scenarios: 1. Two operands of shuffle_vector are the same, like %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> %a, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> 2. One of operands is undef, like %shuffle.i = shufflevector <8 x i8> %a, <8 x i8> undef, <8 x i32> <i32 0, i32 2, i32 4, i32 6, i32 8, i32 10, i32 12, i32 14> After this patch, perm instructions will have chance to be emitted instead of lots of INS. llvm-svn: 199069
* correct target directive handling error handlingSaleem Abdulrasool2014-01-135-64/+112
| | | | | | | | | | | | | | The target specific parser should return `false' if the target AsmParser handles the directive, and `true' if the generic parser should handle the directive. Many of the target specific directive handlers would `return Error' which does not follow these semantics. This change simply changes the target specific routines to conform to the semantis of the ParseDirective correctly. Conformance to the semantics improves diagnostics emitted for the invalid directives. X86 is taken as a sample to ensure that multiple diagnostics are not presented for a single error. llvm-svn: 199068
* Handle bundled terminators in isBlockOnlyReachableByFallthrough.Jakob Stoklund Olesen2014-01-122-42/+7
| | | | | | | | | | Targets like SPARC and MIPS have delay slots and normally bundle the delay slot instruction with the corresponding terminator. Teach isBlockOnlyReachableByFallthrough to find any MBB operands on bundled terminators so SPARC doesn't need to specialize this function. llvm-svn: 199061
* raw_fd_ostream: Don't change STDERR to O_BINARY, or w*printf() (in assert()) ↵NAKAMURA Takumi2014-01-121-2/+3
| | | | | | would barf wide chars after llvm::errs(). llvm-svn: 199057
* raw_stream formatter: [Win32] Use std::signbit() if available, instead of ↵NAKAMURA Takumi2014-01-121-0/+6
| | | | | | | _fpclass(). FIXME: It should be generic to C++11. For now, it is dedicated to mingw-w64. llvm-svn: 199052
* Fix non-deterministic SDNodeOrder-dependent codegenNico Rieck2014-01-122-1/+6
| | | | | | | Reset SelectionDAGBuilder's SDNodeOrder to ensure deterministic code generation. llvm-svn: 199050
* [PM] Add module and function printing passes for the new pass manager.Chandler Carruth2014-01-122-24/+41
| | | | | | | | | This implements the legacy passes in terms of the new ones. It adds basic testing using explicit runs of the passes. Next up will be wiring the basic output mechanism of opt up when the new pass manager is engaged unless bitcode writing is requested. llvm-svn: 199049
* [PM] Simplify the IR printing passes significantly now that a narrowerChandler Carruth2014-01-121-38/+19
| | | | | | | | | | API is exposed. This removes the support for deleting the ostream, switches the member and constructor order arround to be consistent with the creation routines, and switches to using references. llvm-svn: 199047
* [PM] Simplify the interface exposed for IR printing passes.Chandler Carruth2014-01-125-21/+17
| | | | | | | | | | | | Nothing was using the ability of the pass to delete the raw_ostream it printed to, and nothing was trying to pass it a pointer to the raw_ostream. Also, the function variant had a different order of arguments from all of the others which was just really confusing. Now the interface accepts a reference, doesn't offer to delete it, and uses a consistent order. The implementation of the printing passes haven't been updated with this simplification, this is just the API switch. llvm-svn: 199044
* [PM] Run clang-format and remove redundant or obvious comments beforeChandler Carruth2014-01-121-93/+89
| | | | | | | the heavy factoring needed to share logic between the new pass manager and the old. llvm-svn: 199043
* [PM] Rename the IR printing pass header to a more generic and correctChandler Carruth2014-01-129-9/+9
| | | | | | | | name to match the source file which I got earlier. Update the include sites. Also modernize the comments in the header to use the more recommended doxygen style. llvm-svn: 199041
* ARM IAS: fix diagnostics of improper qualificationSaleem Abdulrasool2014-01-121-0/+1
| | | | | | | | An improper qualifier would result in a superfluous error due to the parser not consuming the remainder of the statement. Simply consume the remainder of the statement to avoid the error. llvm-svn: 199035
* [Sparc] Add support for parsing floating point instructions.Venkatraman Govindaraju2014-01-123-164/+238
| | | | llvm-svn: 199033
* ARM: change implicit immediate forms of {ld,st}r{,b}t to psuedo-instructionsSaleem Abdulrasool2014-01-123-65/+89
| | | | | | | | | | | | | The implicit immediate 0 forms are assembly aliases, not distinct instruction encodings. Fix the initial implementation introduced in r198914 to an alias to avoid two separate instruction definitions for the same encoding. An InstAlias is insufficient in this case as the necessary due to the need to add a new additional operand for the implicit zero. By using the AsmPsuedoInst, fall back to the C++ code to transform the instruction to the equivalent _POST_IMM form, inserting the additional implicit immediate 0. llvm-svn: 199032
* [Sparc] Replace (unsigned)-1 with ~OU as suggested by Reid Kleckner.Venkatraman Govindaraju2014-01-121-9/+9
| | | | llvm-svn: 199031
* The SPARCv9 ABI returns a float in %f0.Jakob Stoklund Olesen2014-01-122-3/+12
| | | | | | | | | | | | | | | This is different from the argument passing convention which puts the first float argument in %f1. With this patch, all returned floats are treated as if the 'inreg' flag were set. This means multiple float return values get packed in %f0, %f1, %f2, ... Note that when returning a struct in registers, clang will set the 'inreg' flag on the return value, so that behavior is unchanged. This also happens when returning a float _Complex. llvm-svn: 199028
* Add missing mul aliases for armv4 support. Add checks that armv4 canJoerg Sonnenberger2014-01-121-3/+12
| | | | | | assemble the various mul instructions. llvm-svn: 199026
OpenPOWER on IntegriCloud