summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Bitcode/Writer
Commit message (Collapse)AuthorAgeFilesLines
...
* Bitcode: Move the DEBUG_LOC record to DEBUG_LOC_OLDDuncan P. N. Exon Smith2015-01-091-1/+1
| | | | | | Prepare to simplify the `DebugLoc` record. llvm-svn: 225498
* IR: Add 'distinct' MDNodes to bitcode and assemblyDuncan P. N. Exon Smith2015-01-081-1/+3
| | | | | | | | | | | | | | | | | | Propagate whether `MDNode`s are 'distinct' through the other types of IR (assembly and bitcode). This adds the `distinct` keyword to assembly. Currently, no one actually calls `MDNode::getDistinct()`, so these nodes only get created for: - self-references, which are never uniqued, and - nodes whose operands are replaced that hit a uniquing collision. The concept of distinct nodes is still not quite first-class, since distinct-ness doesn't yet survive across `MapMetadata()`. Part of PR22111. llvm-svn: 225474
* clang-format. NFC.Rafael Espindola2015-01-081-11/+22
| | | | llvm-svn: 225454
* [PM] Switch the new pass manager to use a reference-based API for IRChandler Carruth2015-01-051-2/+2
| | | | | | | | | | | | | | | | | | | | | units. This was debated back and forth a bunch, but using references is now clearly cleaner. Of all the code written using pointers thus far, in only one place did it really make more sense to have a pointer. In most cases, this just removes immediate dereferencing from the code. I think it is much better to get errors on null IR units earlier, potentially at compile time, than to delay it. Most notably, the legacy pass manager uses references for its routines and so as more and more code works with both, the use of pointers was likely to become really annoying. I noticed this when I ported the domtree analysis over and wrote the entire thing with references only to have it fail to compile. =/ It seemed better to switch now than to delay. We can, of course, revisit this is we learn that references are really problematic in the API. llvm-svn: 225145
* Make ValueEnumerator::print use OS for metadata too. Noticed by inspection.Nick Lewycky2014-12-171-2/+1
| | | | llvm-svn: 224404
* Bitcode: Use unsigned char to record MDStringsDuncan P. N. Exon Smith2014-12-111-1/+1
| | | | | | | | | | `MDString`s can have arbitrary characters in them. Prevent an assertion that fired in `BitcodeWriter` because of sign extension by copying the characters into the record as `unsigned char`s. Based on a patch by Keno Fischer; fixes PR21882. llvm-svn: 224077
* Bitcode: Add METADATA_NODE and METADATA_VALUEDuncan P. N. Exon Smith2014-12-112-38/+10
| | | | | | | | | | | | | | | | This reflects the typelessness of `Metadata` in the bitcode format, removing types from all metadata operands. `METADATA_VALUE` represents a `ValueAsMetadata`, and always has two fields: the type and the value. `METADATA_NODE` represents an `MDNode`, and unlike `METADATA_OLD_NODE`, doesn't store types. It stores operands at their ID+1 so that `0` can reference `nullptr` operands. Part of PR21532. llvm-svn: 224073
* Bitcode: Add `OLD_` prefix to metadata node recordsDuncan P. N. Exon Smith2014-12-111-3/+3
| | | | | | | | I'm about to change these, so move the old ones out of the way. Part of PR21532. llvm-svn: 224070
* IR: Split Metadata from ValueDuncan P. N. Exon Smith2014-12-093-134/+183
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Split `Metadata` away from the `Value` class hierarchy, as part of PR21532. Assembly and bitcode changes are in the wings, but this is the bulk of the change for the IR C++ API. I have a follow-up patch prepared for `clang`. If this breaks other sub-projects, I apologize in advance :(. Help me compile it on Darwin I'll try to fix it. FWIW, the errors should be easy to fix, so it may be simpler to just fix it yourself. This breaks the build for all metadata-related code that's out-of-tree. Rest assured the transition is mechanical and the compiler should catch almost all of the problems. Here's a quick guide for updating your code: - `Metadata` is the root of a class hierarchy with three main classes: `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the `Value` class hierarchy. It is typeless -- i.e., instances do *not* have a `Type`. - `MDNode`'s operands are all `Metadata *` (instead of `Value *`). - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively. If you're referring solely to resolved `MDNode`s -- post graph construction -- just use `MDNode*`. - `MDNode` (and the rest of `Metadata`) have only limited support for `replaceAllUsesWith()`. As long as an `MDNode` is pointing at a forward declaration -- the result of `MDNode::getTemporary()` -- it maintains a side map of its uses and can RAUW itself. Once the forward declarations are fully resolved RAUW support is dropped on the ground. This means that uniquing collisions on changing operands cause nodes to become "distinct". (This already happened fairly commonly, whenever an operand went to null.) If you're constructing complex (non self-reference) `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node (or on a top-level node that somehow references all of the nodes). Also, don't do that. Metadata cycles (and the RAUW machinery needed to construct them) are expensive. - An `MDNode` can only refer to a `Constant` through a bridge called `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`). As a side effect, accessing an operand of an `MDNode` that is known to be, e.g., `ConstantInt`, takes three steps: first, cast from `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`; third, cast down to `ConstantInt`. The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away from using `Constant`s when the type isn't important (and they don't care about referring to `GlobalValue`s). In the meantime, I've added transitional API to the `mdconst` namespace that matches semantics with the old code, in order to avoid adding the error-prone three-step equivalent to every call site. If your old code was: MDNode *N = foo(); bar(isa <ConstantInt>(N->getOperand(0))); baz(cast <ConstantInt>(N->getOperand(1))); bak(cast_or_null <ConstantInt>(N->getOperand(2))); bat(dyn_cast <ConstantInt>(N->getOperand(3))); bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4))); you can trivially match its semantics with: MDNode *N = foo(); bar(mdconst::hasa <ConstantInt>(N->getOperand(0))); baz(mdconst::extract <ConstantInt>(N->getOperand(1))); bak(mdconst::extract_or_null <ConstantInt>(N->getOperand(2))); bat(mdconst::dyn_extract <ConstantInt>(N->getOperand(3))); bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4))); and when you transition your metadata schema to `MDInt`: MDNode *N = foo(); bar(isa <MDInt>(N->getOperand(0))); baz(cast <MDInt>(N->getOperand(1))); bak(cast_or_null <MDInt>(N->getOperand(2))); bat(dyn_cast <MDInt>(N->getOperand(3))); bay(dyn_cast_or_null<MDInt>(N->getOperand(4))); - A `CallInst` -- specifically, intrinsic instructions -- can refer to metadata through a bridge called `MetadataAsValue`. This is a subclass of `Value` where `getType()->isMetadataTy()`. `MetadataAsValue` is the *only* class that can legally refer to a `LocalAsMetadata`, which is a bridged form of non-`Constant` values like `Argument` and `Instruction`. It can also refer to any other `Metadata` subclass. (I'll break all your testcases in a follow-up commit, when I propagate this change to assembly.) llvm-svn: 223802
* Prologue supportPeter Collingbourne2014-12-032-5/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch by Ben Gamari! This redefines the `prefix` attribute introduced previously and introduces a `prologue` attribute. There are a two primary usecases that these attributes aim to serve, 1. Function prologue sigils 2. Function hot-patching: Enable the user to insert `nop` operations at the beginning of the function which can later be safely replaced with a call to some instrumentation facility 3. Runtime metadata: Allow a compiler to insert data for use by the runtime during execution. GHC is one example of a compiler that needs this functionality for its tables-next-to-code functionality. Previously `prefix` served cases (1) and (2) quite well by allowing the user to introduce arbitrary data at the entrypoint but before the function body. Case (3), however, was poorly handled by this approach as it required that prefix data was valid executable code. Here we redefine the notion of prefix data to instead be data which occurs immediately before the function entrypoint (i.e. the symbol address). Since prefix data now occurs before the function entrypoint, there is no need for the data to be valid code. The previous notion of prefix data now goes under the name "prologue data" to emphasize its duality with the function epilogue. The intention here is to handle cases (1) and (2) with prologue data and case (3) with prefix data. References ---------- This idea arose out of discussions[1] with Reid Kleckner in response to a proposal to introduce the notion of symbol offsets to enable handling of case (3). [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073235.html Test Plan: testsuite Differential Revision: http://reviews.llvm.org/D6454 llvm-svn: 223189
* Add and use Type::subtypes. NFC.Rafael Espindola2014-11-241-3/+2
| | | | llvm-svn: 222682
* Pass a reference to ValueEnumerator.Rafael Espindola2014-11-173-37/+36
| | | | | | NFC. This will just make it easier to use std::unique_ptr in a caller. llvm-svn: 222170
* Revert "IR: MDNode => Value"Duncan P. N. Exon Smith2014-11-112-5/+5
| | | | | | | | | | | | | | | | | Instead, we're going to separate metadata from the Value hierarchy. See PR21532. This reverts commit r221375. This reverts commit r221373. This reverts commit r221359. This reverts commit r221167. This reverts commit r221027. This reverts commit r221024. This reverts commit r221023. This reverts commit r220995. This reverts commit r220994. llvm-svn: 221711
* IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc()Duncan P. N. Exon Smith2014-11-032-5/+5
| | | | | | | Change `Instruction::getAllMetadataOtherThanDebugLoc()` from a vector of `MDNode` to one of `Value`. Part of PR21433. llvm-svn: 221167
* IR: Reorder metadata bitcode serialization, NFCDuncan P. N. Exon Smith2014-10-211-12/+14
| | | | | | | | | Enumerate `MDNode`'s operands *before* the node itself, so that the reader requires less RAUW. Although this will cause different code paths to be hit in the reader, this should effectively be no functionality change. llvm-svn: 220340
* IR: Remove dead code in metadata bitcode writing, NFCDuncan P. N. Exon Smith2014-10-213-16/+12
| | | | | | | No one cares how many uses each metadata value has, so don't bother counting. llvm-svn: 220337
* correct const-ness with auto and dyn_castSanjay Patel2014-10-151-3/+3
| | | | | | | | | 1. Use const with autos. 2. Don't bother with explicit const in cast ops because they do it automagically. Thanks, David B. / Aaron B. / Reid K. llvm-svn: 219817
* Use 'auto' for easier reading; no functional change intended.Sanjay Patel2014-10-151-6/+3
| | | | llvm-svn: 219804
* Introduce LLVMWriteBitcodeToMemoryBuffer C API function.Peter Collingbourne2014-10-141-0/+8
| | | | llvm-svn: 219643
* Modernize raw_fd_ostream's constructor a bit.Rafael Espindola2014-08-251-3/+3
| | | | | | | | | | Take a StringRef instead of a "const char *". Take a "std::error_code &" instead of a "std::string &" for error. A create static method would be even better, but this patch is already a bit too big. llvm-svn: 216393
* Canonicalize header guards into a common format.Benjamin Kramer2014-08-131-2/+2
| | | | | | | | | | Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558
* UseListOrder: Handle self-usersDuncan P. N. Exon Smith2014-07-311-3/+3
| | | | | | | | | | Correctly sort self-users (such as PHI nodes). I added a targeted test in `test/Bitcode/use-list-order.ll` and the final missing RUN line to tests in `test/Assembly`. This is part of PR5680. llvm-svn: 214417
* UseListOrder: Don't give constant IDs to GlobalValuesDuncan P. N. Exon Smith2014-07-311-3/+6
| | | | | | | | | | | Since initializers of GlobalValues are being assigned IDs before GlobalValues themselves, explicitly exclude GlobalValues from the constant pool. Added targeted test in `test/Bitcode/use-list-order.ll` and added two more RUN lines in `test/Assembly`. This is part of PR5680. llvm-svn: 214368
* UseListOrder: Visit global valuesDuncan P. N. Exon Smith2014-07-301-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When predicting use-list order, we visit functions in reverse order followed by `GlobalValue`s and write out use-lists at the first opportunity. In the reader, this will translate to *after* the last use has been added. For this to work, we actually need to descend into `GlobalValue`s. Added a targeted test in `use-list-order.ll` and `RUN` lines to the newly passing tests in `test/Bitcode`. There are two remaining failures in `test/Bitcode`: - blockaddress.ll: I haven't thought through how to model the way block addresses change the order of use-lists (or how to work around it). - metadata-2.ll: There's an old-style `@llvm.used` global array here that I suspect the .ll parser isn't upgrading properly. When it round-trips through bitcode, the .bc reader *does* upgrade it, so the extra variable (`i8* null`) has an extra use, and the shuffle vector doesn't match. I think the fix is to upgrade old-style global arrays (or reject them?) in the .ll parser. This is part of PR5680. llvm-svn: 214321
* Reapply "UseListOrder: Order GlobalValue uses after initializers"Duncan P. N. Exon Smith2014-07-301-14/+55
| | | | | | | This reverts commit r214249, reapplying r214242 and r214243, now that r214270 has fixed the UB. llvm-svn: 214271
* UseListOrder: Fix undefined behaviourDuncan P. N. Exon Smith2014-07-301-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | This commit fixes undefined behaviour that caused the revert in r214249. The problem was two unsequenced operations on a `DenseMap<>`, giving different behaviour in GCC and Clang. This: DenseMap<T*, unsigned> DM; for (auto &X : ...) DM[&X] = DM.size() + 1; should have been: DenseMap<T*, unsigned> DM; for (auto &X : ...) { unsigned Size = DM.size(); DM[&X] = Size + 1; } Until r214242, this difference between compilers didn't matter. In r214242, `OrderMap::LastGlobalValueID` was introduced and compared against IDs, which in GCC were off-by-one my expectations. llvm-svn: 214270
* Revert "UseListOrder: Order GlobalValue uses after initializers"Duncan P. N. Exon Smith2014-07-291-55/+14
| | | | | | | | | | | | This reverts commits r214242 and r214243 while I investigate buildbot failures [1][2][3]. I can't reproduce these failures locally, so if anyone can see what I've done wrong, I'd appreciate a note. [1]: http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/9840 [2]: http://lab.llvm.org:8011/builders/clang-hexagon-elf/builds/14981 [3]: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/15191 llvm-svn: 214249
* UseListOrder: Order GlobalValue uses after initializersDuncan P. N. Exon Smith2014-07-291-14/+55
| | | | | | | | | | | | To avoid unnecessary forward references, the reader doesn't process initializers of `GlobalValue`s until after the constant pool has been processed, and then in reverse order. Model this when predicting use-list order. This gets two more Bitcode tests passing with `llvm-uselistorder`. Part of PR5680. llvm-svn: 214242
* UseListOrder: Create a struct around OrderMap, NFCDuncan P. N. Exon Smith2014-07-291-1/+9
| | | | llvm-svn: 214241
* IR: Create the use-list order shuffle vector in-placeDuncan P. N. Exon Smith2014-07-291-4/+3
| | | | | | | Per David Blaikie's review of r214135, this is a more natural way to initialize. llvm-svn: 214184
* Bitcode: Correctly compare a Use against itselfDuncan P. N. Exon Smith2014-07-291-0/+3
| | | | | | | | | | | | | | | Fix the sort of expected order in the reader to correctly return `false` when comparing a `Use` against itself. This was caught by test/Bitcode/binaryIntInstructions.3.2.ll, so I'm adding a `RUN` line using `llvm-uselistorder` for every test in `test/Bitcode` that passes. A few tests still fail, so I'll investigate those next. This is part of PR5680. llvm-svn: 214157
* IR: Optimize size of use-list order shuffle vectorsDuncan P. N. Exon Smith2014-07-281-6/+5
| | | | | | | | | Since we're storing lots of these, save two-pointers per vector with a custom type rather than using the relatively heavy `SmallVector`. Part of PR5680. llvm-svn: 214135
* Bitcode: Serialize (and recover) use-list orderDuncan P. N. Exon Smith2014-07-283-94/+234
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Predict and serialize use-list order in bitcode. This makes the option `-preserve-bc-use-list-order` work *most* of the time, but this is still experimental. - Builds a full value-table up front in the writer, sets up a list of use-list orders to write out, and discards the table. This is a simpler first step than determining the order from the various overlapping IDs of values on-the-fly. - The shuffles stored in the use-list order list have an unnecessarily large memory footprint. - `blockaddress` expressions cause functions to be materialized out-of-order. For now I've ignored this problem, so use-list orders will be wrong for constants used by functions that have block addresses taken. There are a couple of ways to fix this, but I don't have a concrete plan yet. - When materializing functions lazily, the use-lists for constants will not be correct. This use case is out of scope: what should the use-list order be, if it's incomplete? This is part of PR5680. llvm-svn: 214125
* Bitcode: Don't optimize constants when preserving use-list orderDuncan P. N. Exon Smith2014-07-251-0/+6
| | | | | | | | | | | | | | | | | | | | `ValueEnumerator::OptimizeConstants()` creates forward references within the constant pools, which makes predicting constants' use-list order difficult. For now, just disable the optimization. This can be re-enabled in the future in one of two ways: - Enable a limited version of this optimization that doesn't create forward references. One idea is to categorize constants by their "height" and make that the top-level sort. - Enable it entirely. This requires predicting how may times each constant will be recreated as its operands' and operands' operands' (etc.) forward references get resolved. This is part of PR5680. llvm-svn: 213953
* IPO: Add use-list-order verifierDuncan P. N. Exon Smith2014-07-251-7/+2
| | | | | | | | | | | | | | | | | | | | Add a -verify-use-list-order pass, which shuffles use-list order, writes to bitcode, reads back, and verifies that the (shuffled) order matches. - The utility functions live in lib/IR/UseListOrder.cpp. - Moved (and renamed) the command-line option to enable writing use-lists, so that this pass can return early if the use-list orders aren't being serialized. It's not clear that this pass is the right direction long-term (perhaps a separate tool instead?), but short-term it's a great way to test the use-list order prototype. I've added an XFAIL-ed testcase that I'm hoping to get working pretty quickly. This is part of PR5680. llvm-svn: 213945
* Add a dereferenceable attributeHal Finkel2014-07-181-0/+2
| | | | | | | | | This attribute indicates that the parameter or return pointer is dereferenceable. Practically speaking, loads from such a pointer within the associated byte range are safe to speculatively execute. Such pointer parameters are common in source languages (C++ references, for example). llvm-svn: 213385
* Rename AlignAttribute to IntAttributeHal Finkel2014-07-181-1/+1
| | | | | | | | | | | | Currently the only kind of integer IR attributes that we have are alignment attributes, and so the attribute kind that takes an integer parameter is called AlignAttr, but that will change (we'll soon be adding a dereferenceable attribute that also takes an integer value). Accordingly, rename AlignAttribute to IntAttribute (class names, enums, etc.). No functionality change intended. llvm-svn: 213352
* Roundtrip the inalloca bit on allocas through bitcodeReid Kleckner2014-07-161-2/+9
| | | | | | | | | | | | | This was an oversight in the original support. As it is, I stuffed this bit into the alignment. The alignment is stored in log2 form, so it doesn't need more than 5 bits, given that Value::MaximumAlignment is 1 << 29. Reviewers: nicholas Differential Revision: http://reviews.llvm.org/D3943 llvm-svn: 213118
* IR: Add COMDATs to the IRDavid Majnemer2014-06-273-1/+54
| | | | | | | | | | | | | | | | This new IR facility allows us to represent the object-file semantic of a COMDAT group. COMDATs allow us to tie together sections and make the inclusion of one dependent on another. This is required to implement features like MS ABI VFTables and optimizing away certain kinds of initialization in C++. This functionality is only representable in COFF and ELF, Mach-O has no similar mechanism. Differential Revision: http://reviews.llvm.org/D4178 llvm-svn: 211920
* Convert a few loops to use ranges.Rafael Espindola2014-06-171-18/+15
| | | | llvm-svn: 211089
* IR: add "cmpxchg weak" variant to support permitted failure.Tim Northover2014-06-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. llvm-svn: 210903
* Allow aliases to be unnamed_addr.Rafael Espindola2014-06-061-2/+2
| | | | | | | | | | | | | | | | | | Alias with unnamed_addr were in a strange state. It is stored in GlobalValue, the language reference talks about "unnamed_addr aliases" but the verifier was rejecting them. It seems natural to allow unnamed_addr in aliases: * It is a property of how it is accessed, not of the data itself. * It is perfectly possible to write code that depends on the address of an alias. This patch then makes unname_addr legal for aliases. One side effect is that the syntax changes for a corner case: In globals, unnamed_addr is now printed before the address space. llvm-svn: 210302
* Add a new attribute called 'jumptable' that creates jump-instruction tables ↵Tom Roeder2014-06-051-0/+2
| | | | | | | | | | | | for functions marked with this attribute. It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables. This also adds backend support for generating the jump-instruction tables on ARM and X86. Note that since the jumptable attribute creates a second function pointer for a function, any function marked with jumptable must also be marked with unnamed_addr. llvm-svn: 210280
* [pr19844] Add thread local mode to aliases.Rafael Espindola2014-05-281-1/+3
| | | | | | | | | | This matches gcc's behavior. It also seems natural given that aliases contain other properties that govern how it is accessed (linkage, visibility, dll storage). Clang still has to be updated to expose this feature to C. llvm-svn: 209759
* Convert a few loops to use ranges.Rafael Espindola2014-05-261-54/+51
| | | | llvm-svn: 209628
* Add 'nonnull', a new parameter and return attribute which indicates that the ↵Nick Lewycky2014-05-201-0/+2
| | | | | | pointer is not null. Instcombine will elide comparisons between these and null. Patch by Luqman Aden! llvm-svn: 209185
* [IR] Make {extract,insert}element accept an index of any integer type.Michael J. Spencer2014-05-011-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Given the following C code llvm currently generates suboptimal code for x86-64: __m128 bss4( const __m128 *ptr, size_t i, size_t j ) { float f = ptr[i][j]; return (__m128) { f, f, f, f }; } ================================================= define <4 x float> @_Z4bss4PKDv4_fmm(<4 x float>* nocapture readonly %ptr, i64 %i, i64 %j) #0 { %a1 = getelementptr inbounds <4 x float>* %ptr, i64 %i %a2 = load <4 x float>* %a1, align 16, !tbaa !1 %a3 = trunc i64 %j to i32 %a4 = extractelement <4 x float> %a2, i32 %a3 %a5 = insertelement <4 x float> undef, float %a4, i32 0 %a6 = insertelement <4 x float> %a5, float %a4, i32 1 %a7 = insertelement <4 x float> %a6, float %a4, i32 2 %a8 = insertelement <4 x float> %a7, float %a4, i32 3 ret <4 x float> %a8 } ================================================= shlq $4, %rsi addq %rdi, %rsi movslq %edx, %rax vbroadcastss (%rsi,%rax,4), %xmm0 retq ================================================= The movslq is uneeded, but is present because of the trunc to i32 and then sext back to i64 that the backend adds for vbroadcastss. We can't remove it because it changes the meaning. The IR that clang generates is already suboptimal. What clang really should emit is: %a4 = extractelement <4 x float> %a2, i64 %j This patch makes that legal. A separate patch will teach clang to do it. Differential Revision: http://reviews.llvm.org/D3519 llvm-svn: 207801
* raw_ostream: Forward declare OpenFlags and include FileSystem.h only where ↵Benjamin Kramer2014-04-291-0/+1
| | | | | | necessary. llvm-svn: 207593
* Add 'musttail' marker to call instructionsReid Kleckner2014-04-241-1/+2
| | | | | | | | | | | | This is similar to the 'tail' marker, except that it guarantees that tail call optimization will occur. It also comes with convervative IR verification rules that ensure that tail call optimization is possible. Reviewers: nicholas Differential Revision: http://llvm-reviews.chandlerc.com/D3240 llvm-svn: 207143
* [C++11] More 'nullptr' conversion. In some cases just using a boolean check ↵Craig Topper2014-04-151-1/+1
| | | | | | instead of comparing to nullptr. llvm-svn: 206252
OpenPOWER on IntegriCloud