summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/ARM
Commit message (Collapse)AuthorAgeFilesLines
...
* Revert r237590, "ARM: allow jump tables to be placed as constant islands."Peter Collingbourne2015-05-212-42/+2
| | | | | | | Caused a miscompile of the Android port of Chromium, details forthcoming. llvm-svn: 237972
* [Target/ARM] Only enable OptimizeBarrierPass at -O1 and above.Davide Italiano2015-05-202-1/+16
| | | | | | | | | | Ideally this is going to be and LLVM IR pass (shared, among others with AArch64), but for the time being just enable it if consumers ask us for optimization and not unconditionally. Discussed with Tim Northover on IRC. llvm-svn: 237837
* ARM: allow jump tables to be placed as constant islands.Tim Northover2015-05-182-2/+42
| | | | | | | | | | | | | | | | | Previously, they were forced to immediately follow the actual branch instruction. This was usually OK (the LEAs actually accessing them got emitted nearby, and weren't usually separated much afterwards). Unfortunately, a sufficiently nasty phi elimination dumps many instructions right before the basic block terminator, and this can increase the range too much. This patch frees them up to be placed as usual by the constant islands pass, and consequently has to slightly modify the form of TBB/TBH tables to refer to a PC-relative label at the final jump. The other jump table formats were already position-independent. rdar://20813304 llvm-svn: 237590
* Revert r237579, as it broke windows buildbotsOliver Stannard2015-05-185-301/+2
| | | | llvm-svn: 237583
* [LLVM - ARM/AArch64] Add ACLE special register intrinsicsOliver Stannard2015-05-185-2/+301
| | | | | | | | | | | | | | | | | | | This patch implements LLVM support for the ACLE special register intrinsics in section 10.1, __arm_{w,r}sr{,p,64}. This patch is intended to lower the read/write_register instrinsics, used to implement the special register intrinsics in the clang patch for special register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific instructions MRC,MCR,MSR etc. to allow reading an writing of coprocessor registers in AArch32 and AArch64. This is done by inspecting the register string passed to the intrinsic and then lowering to the appropriate instruction. Patch by Luke Cheeseman. Differential Revision: http://reviews.llvm.org/D9699 llvm-svn: 237579
* [CodeGen] Use standard -not gnueabi- naming for f16 libcalls on Darwin.Ahmed Bougacha2015-05-141-3/+3
| | | | | | | | Other targets probably should as well. Since r237161, compiler-rt has both, but I don't see why anything other than gnueabi would use a gnueabi naming scheme. llvm-svn: 237324
* CodeGen: ignore DEBUG_VALUE nodes in KILL taggingSaleem Abdulrasool2015-05-121-0/+88
| | | | | | | DEBUG_VALUE nodes do not take part in code generation. Ignore them when performing KILL updates. Addresses PR23486. llvm-svn: 237211
* Changed renaming of local symbols by inserting a dot vefore the numeric suffix.Sunil Srivastava2015-05-122-4/+4
| | | | | | | One code change and several test changes to match that details in http://reviews.llvm.org/D9481 llvm-svn: 237150
* [ARM] Use AEABI aligned function variantsJohn Brawn2015-05-121-31/+82
| | | | | | | | | | | AEABI defines aligned variants of memcpy etc. that can be faster than the default version due to not having to do alignment checks. When emitting target code for these functions make use of these aligned variants if possible. Also convert memset to memclr if possible. Differential Revision: http://reviews.llvm.org/D8060 llvm-svn: 237127
* Migrate existing backends that care about software floating pointEric Christopher2015-05-121-2/+2
| | | | | | | | | | | | | | | | | | | | to use the information in the module rather than TargetOptions. We've had and clang has used the use-soft-float attribute for some time now so have the backends set a subtarget feature based on a particular function now that subtargets are created based on functions and function attributes. For the one middle end soft float check go ahead and create an overloadable TargetLowering::useSoftFloat function that just checks the TargetSubtargetInfo in all cases. Also remove the command line option that hard codes whether or not soft-float is set by using the attribute for all of the target specific test cases - for the generic just go ahead and add the attribute in the one case that showed up. llvm-svn: 237079
* [Fast-ISel] Don't mark the first use of a remat constant as killed.Pete Cooper2015-05-091-0/+29
| | | | | | | | | | | | | | | | | | | When emitting something like 'add x, 1000' if we remat the 1000 then we should be able to mark the vreg containing 1000 as killed. Given that we go bottom up in fast-isel, a later use of 1000 will be higher up in the BB and won't kill it, or be impacted by the lower kill. However, rematerialised constant expressions aren't generated bottom up. The local value save area grows downwards. This means that if you remat 2 constant expressions which both use 1000 then the first will kill it, then the second, which is *lower* in the BB will read a killed register. This is the case in the attached test where the 2 GEPs both need to generate 'add x, 6680' for the constant offset. Note that this commit only makes kill flag generation conservative. There's nothing else obviously wrong with the local value save area growing downwards, and in fact it needs to for handling arbitrarily complex constant expressions. However, it would be nice if there was a solution which would let us generate more accurate kill flags, or just kill flags completely. llvm-svn: 236922
* ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not ↵Arnold Schwaighofer2015-05-082-11/+11
| | | | | | | | | | | | | | | | | | | | non-aliasing distinct objects The code that builds the dependence graph assumes that two PseudoSourceValues don't alias. In a tail calling function two FixedStackObjects might refer to the same location. Worse 'immutable' fixed stack objects like function arguments are not immutable and will be clobbered. Change this so that a load from a FixedStackObject is not invariant in a tail calling function and don't return a PseudoSourceValue for an instruction in tail calling functions when building the dependence graph so that we handle function arguments conservatively. Fix for PR23459. rdar://20740035 llvm-svn: 236916
* [Fast-ISel] Clear kill flags on registers replaced by updateValueMap.Pete Cooper2015-05-081-0/+24
| | | | | | | | | | When selecting an extract instruction, we don't actually generate code but instead work out which register we are reading, and rewrite uses of the extract def to the source register. This is done via updateValueMap,. However, its possible that the source register we are rewriting *to* to also have uses. If those uses are after a kill of the value we are rewriting *from* then we have uses after a kill and the verifier fails. This code checks for the case where the to register is also used, and if so it clears all kill on the from register. This is conservative, but better that always clearing kills on the from register. llvm-svn: 236897
* Clear kill flags in tail duplication.Pete Cooper2015-05-071-0/+54
| | | | | | | | | | | | | | If we duplicate an instruction then we must also clear kill flags on any uses we rewrite. Otherwise we might be killing a register which was used in other BBs. For example, here the entry BB ended up with these instructions, the ADD having been tail duplicated. %vreg24<def> = t2ADDri %vreg10<kill>, 1, pred:14, pred:%noreg, opt:%noreg; GPRnopc:%vreg24 rGPR:%vreg10 %vreg22<def> = COPY %vreg10; GPR:%vreg22 rGPR:%vreg10 The copy here is inserted after the add and so needs vreg10 to be live. llvm-svn: 236782
* Handle dead defs in the if converter.Pete Cooper2015-05-061-0/+55
| | | | | | | | | | | | | | | | | | | | | | | | We had code such as this: r2 = ... t2Bcc label1: ldr ... r2 label2; return r2<dead, def> The if converter was transforming this to r2<def> = ... return [pred] r2<dead,def> ldr <r2, kill> return which fails the machine verifier because the ldr now reads from a dead def. The fix here detects dead defs in stepForward and passes them back to the caller in the clobbers list. The caller then clears the dead flag from the def is the value is live. llvm-svn: 236660
* Fix incorrect kill flags in fastisel.Pete Cooper2015-05-061-0/+25
| | | | | | | | If called twice in the same BB on the same constant, FastISel::fastEmit_ri_ was marking the materialized vreg as killed on each use, instead of only the last use. Change this to only mark the last use as killed by making earlier uses check if the vreg is already used elsewhere. llvm-svn: 236650
* [ARM] Fast-Isel was incorrectly selecting <2 x double> adds.Pete Cooper2015-05-061-0/+33
| | | | | | | | | | With neon enabled, we reach SelectBinaryFPOp and are able to get registers for a <2 x double> add. However, we shouldn't actually attempt arithmetic on it as ARMIselLowering says "v2f64 is legal so that QR subregs can be extracted as f64 elements, but neither Neon nor VFP support any arithmetic operations on it." This commit disables SelectBinaryFPOp for any vector types. There's already a FIXME to try handle neon. Doing so would require fixing this conditional which isn't safe for vectors 'VT == MVT::f64 || VT == MVT::i64' llvm-svn: 236609
* [ARM] generate VMAXNM/VMINNM for a compare followed by a select, in safe ↵Artyom Skrobov2015-05-061-0/+285
| | | | | | math mode too llvm-svn: 236590
* [ARM][FastISel] Use TST #1 instead of CMP #0 for select.Ahmed Bougacha2015-05-061-12/+12
| | | | | | | | | | Since r234249, i1 are sext instead of zext; because of that, doing "CMP rN, #0; IT EQ/NE" isn't correct anymore. "TST #1" is the conservatively correct alternative - the tradeoff being that it doesn't have a 16-bit encoding -, so use that instead. llvm-svn: 236569
* Fix IfConverter to handle regmask machine operands.Pete Cooper2015-05-051-0/+45
| | | | | | | | | | | | | | Note, this is a recommit of r236515 after fixing an error in r236514. The buildbot ran fast enough that it picked up r236514 prior to r236515 and threw an error. r236515 itself ran 'make check' without errors. Original commit message follows: A regmask (typically seen on a call) clobbers the set of registers it lists. The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks. These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier. Otherwise, uses after the if converted call could think they are reading an undefined register. Reviewed by Matthias Braun and Quentin Colombet. llvm-svn: 236550
* Revert "Fix IfConverter to handle regmask machine operands."Pete Cooper2015-05-051-45/+0
| | | | | | | | This reverts commit b27413cbfd78d959c18e713bfa271fb69e6b3303 (ie r236515). This is to get the bots green while i investigate the failures. llvm-svn: 236517
* Fix IfConverter to handle regmask machine operands.Pete Cooper2015-05-051-0/+45
| | | | | | | | | | A regmask (typically seen on a call) clobbers the set of registers it lists. The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks. These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier. Otherwise, uses after the if converted call could think they are reading an undefined register. Reviewed by Matthias Braun and Quentin Colombet. llvm-svn: 236515
* ARM: Align functions containing Thumb-2 jump tables to 4 bytes.Peter Collingbourne2015-05-011-0/+39
| | | | | | | | | Functions with jump tables need an alignment of 4 because they use the ADR instruction, which aligns the PC to 4 bytes before adding an offset. Differential Revision: http://reviews.llvm.org/D9424 llvm-svn: 236327
* [ARM][TEST] Strengthen test against smarter reg alloc.Quentin Colombet2015-05-011-3/+3008
| | | | | | | | Follow-up of r236247. rdar://problem/20770899 llvm-svn: 236296
* [ARM] optimizeSelect should clear kill flags.Pete Cooper2015-04-301-0/+63
| | | | | | | | | | | | If we move an instruction from one block down to a MOVC and predicate it, then the original instruction could be moved in to a loop. In this case, its invalid for any kill flags to remain on there. Fails with -verfy-machineinstrs. rdar://problem/20752113 llvm-svn: 236290
* Commute the internal flag on MachineOperands.Pete Cooper2015-04-301-0/+173
| | | | | | | | | | | | | | When commuting a thumb instruction in the size reduction pass, thumb instructions are represented as a bundle and so some operands may be marked as internal. The internal flag has to move with the operand when commuting. This test is sensitive to register allocation so can't specifically check that this error was happening, but so long as it continues to pass with -verify then hopefully its still ok. rdar://problem/20752113 llvm-svn: 236282
* Don't always apply kill flag in thumb2 ABS pseudo expansion.Pete Cooper2015-04-301-0/+23
| | | | | | | | | The expansion for t2ABS was always setting the kill flag on the rsb instruction. It should instead only be set on rsb if it was set on the original ABS instruction. rdar://problem/20752113 llvm-svn: 236272
* [ARM] Do not generate invalid encoding for stack adjust, even if this is justQuentin Colombet2015-04-301-0/+3839
| | | | | | | | | | | | | temporary. Because of that: 1. The machine verifier was complaining on such code. 2. The generate code worked just because the thumb reduction size pass fixed the opcode. rdar://problem/20749824 llvm-svn: 236247
* IR: Give 'DI' prefix to debug info metadataDuncan P. N. Exon Smith2015-04-2919-733/+733
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Finish off PR23080 by renaming the debug info IR constructs from `MD*` to `DI*`. The last of the `DIDescriptor` classes were deleted in r235356, and the last of the related typedefs removed in r235413, so this has all baked for about a week. Note: If you have out-of-tree code (like a frontend), I recommend that you get everything compiling and tests passing with the *previous* commit before updating to this one. It'll be easier to keep track of what code is using the `DIDescriptor` hierarchy and what you've already updated, and I think you're extremely unlikely to insert bugs. YMMV of course. Back to *this* commit: I did this using the rename-md-di-nodes.sh upgrade script I've attached to PR23080 (both code and testcases) and filtered through clang-format-diff.py. I edited the tests for test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns were off-by-three. It should work on your out-of-tree testcases (and code, if you've followed the advice in the previous paragraph). Some of the tests are in badly named files now (e.g., test/Assembler/invalid-mdcompositetype-missing-tag.ll should be 'dicompositetype'); I'll come back and move the files in a follow-up commit. llvm-svn: 236120
* ARM: fix peephole optimisation of TSTTim Northover2015-04-281-6/+28
| | | | | | | | | | | We were trying to look through COPY instructions, but only to the next instruction in a BB and incorrectly anyway. The cases where that would actually be a good idea are rare enough (and not even tested!) that it's not worth trying to get right. rdar://20721342 llvm-svn: 236050
* Switch lowering: Take branch weight into account when ordering for fall-throughHans Wennborg2015-04-271-2/+4
| | | | | | | | | | | Previously, the code would try to put a fall-through case last, even if that meant moving a case with much higher branch weight further down the chain. Ordering by branch weight is most important, putting a fall-through block last is secondary. llvm-svn: 235942
* ARM: When re-creating a branch via InsertBranch, preserve CPSR flags.Peter Collingbourne2015-04-231-1/+1
| | | | | | | | | | | | | In particular, this preserves the kill flag, which allows the Thumb2 cbn?z optimization to be applied in cases where a branch has been re-created after the live variables analysis pass, e.g. by the machine block placement pass. This appears to be low risk; a number of other targets seem to already be doing something similar, e.g. AArch64, PowerPC. Differential Revision: http://reviews.llvm.org/D9184 llvm-svn: 235639
* ARM: When spilling extra registers for alignment, prefer low registers on ↵Peter Collingbourne2015-04-236-29/+29
| | | | | | | | | | | all Thumb targets. This makes it more likely that we can use the 16-bit push and pop instructions on Thumb-2, saving around 4 bytes per function. Differential Revision: http://reviews.llvm.org/D9165 llvm-svn: 235637
* ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools.Peter Collingbourne2015-04-231-0/+15
| | | | | | | | | | | | | | | | | This appears to have been introduced back in r76698 as part of an unrelated change. I can find no official ARM documentation stating that Thumb-2 functions require 4-byte alignment; in fact, ARM documentation appears to contradict this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement, section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions."). Also remove code that sets alignment for ARM functions, which is redundant with code in the MachineFunction constructor, and remove the hidden -arm-align-constant-islands flag, which has been enabled by default since r146739 (Dec 2011) and has probably received sufficient testing by now. Differential Revision: http://reviews.llvm.org/D9138 llvm-svn: 235636
* Re-commit r235560: Switch lowering: extract jump tables and bit tests before ↵Hans Wennborg2015-04-232-3/+3
| | | | | | | | | | | building binary tree (PR22262) Third time's the charm. The previous commit was reverted as a reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--' on an iterator at the beginning of a vector, causing asserts when using debugging iterators. This commit fixes that. llvm-svn: 235608
* Revert r235560; this commit was causing several failed assertions in Debug ↵Aaron Ballman2015-04-232-3/+3
| | | | | | builds using MSVC's STL. The iterator is being used outside of its valid range. llvm-svn: 235597
* Switch lowering: extract jump tables and bit tests before building binary ↵Hans Wennborg2015-04-222-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tree (PR22262) This is a re-commit of r235101, which also fixes the problems with the previous patch: - Switches with only a default case and non-fallthrough were handled incorrectly - The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here. > This is a major rewrite of the SelectionDAG switch lowering. The previous code > would lower switches as a binary tre, discovering clusters of cases > suitable for lowering by jump tables or bit tests as it went along. To increase > the likelihood of finding jump tables, the binary tree pivot was selected to > maximize case density on both sides of the pivot. > > By not selecting the pivot in the middle, the binary trees would not always > be balanced, leading to performance problems in the generated code. > > This patch rewrites the lowering to search for clusters of cases > suitable for jump tables or bit tests first, and then builds the binary > tree around those clusters. This way, the binary tree will always be balanced. > > This has the added benefit of decoupling the different aspects of the lowering: > tree building and jump table or bit tests finding are now easier to tweak > separately. > > For example, this will enable us to balance the tree based on profile info > in the future. > > The algorithm for finding jump tables is quadratic, whereas the previous algorithm > was O(n log n) for common cases, and quadratic only in the worst-case. This > doesn't seem to be major problem in practice, e.g. compiling a file consisting > of a 10k-case switch was only 30% slower, and such large switches should be rare > in practice. Compiling e.g. gcc.c showed no compile-time difference. If this > does turn out to be a problem, we could limit the search space of the algorithm. > > This commit also disables all optimizations during switch lowering in -O0. > > Differential Revision: http://reviews.llvm.org/D8649 llvm-svn: 235560
* Fix flakiness in fp16-promote.llPirama Arumuga Nainar2015-04-201-578/+194
| | | | | | | | | | | | | | | | | | | | | | | Summary: In the f16-promote test, make the checks for native conversion instructions similar to the libcall checks: - Remove hard coded register names - Do not check exact instruction sequences. This fixes test flakiness due to non-determinism in instruction scheduling and register allocation. I also fixed a few minor things in the CHECK-LIBCALL checks. I'll try to find a way to check that unnecessary loads, stores, or conversions don't happen. Reviewers: mzolotukhin, srhines, ab Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9112 llvm-svn: 235363
* [GlobalMerge] Look at uses to create smaller global sets.Ahmed Bougacha2015-04-186-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Instead of merging everything together, look at the users of GlobalVariables, and try to group them by function, to create sets of globals used "together". Using that information, a less-aggressive alternative is to keep merging everything together *except* globals that are only ever used alone, that is, those for which it's clearly non-profitable to merge with others. In my testing, grouping by Function is too aggressive, but grouping by BasicBlock is too conservative. Anything in-between isn't trivially available, so stick with Function grouping for now. cl::opts are added for testing; both enabled by default. A few of the testcases aren't testing the merging proper, but just various edge cases when merging does occur. Update them to use the previous grouping behavior. Also, one of the tests is unrelated to GlobalMerge; change it accordingly. While there, switch to r234666' flags rather than the brutal -O3. Differential Revision: http://reviews.llvm.org/D8070 llvm-svn: 235249
* Add support to promote f16 to f32Pirama Arumuga Nainar2015-04-171-0/+1287
| | | | | | | | | | | | | | Summary: This patch adds legalization support to operate on FP16 as a load/store type and do operations on it as floats. Tests for ARM are added to test/CodeGen/ARM/fp16-promote.ll Reviewers: srhines, t.p.northover Differential Revision: http://reviews.llvm.org/D8755 llvm-svn: 235215
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-04-1651-102/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the call instruction See r230786 and r230794 for similar changes to gep and load respectively. Call is a bit different because it often doesn't have a single explicit type - usually the type is deduced from the arguments, and just the return type is explicit. In those cases there's no need to change the IR. When that's not the case, the IR usually contains the pointer type of the first operand - but since typed pointers are going away, that representation is insufficient so I'm just stripping the "pointerness" of the explicit type away. This does make the IR a bit weird - it /sort of/ reads like the type of the first operand: "call void () %x(" but %x is actually of type "void ()*" and will eventually be just of type "ptr". But this seems not too bad and I don't think it would benefit from repeating the type ("void (), void () * %x(" and then eventually "void (), ptr %x(") as has been done with gep and load. This also has a side benefit: since the explicit type is no longer a pointer, there's no ambiguity between an explicit type and a function that returns a function pointer. Previously this case needed an explicit type (eg: a function returning a void() function was written as "call void () () * @x(" rather than "call void () * @x(" because of the ambiguity between a function returning a pointer to a void() function and a function returning void). No ambiguity means even function pointer return types can just be written alone, without writing the whole function's type. This leaves /only/ the varargs case where the explicit type is required. Given the special type syntax in call instructions, the regex-fu used for migration was a bit more involved in its own unique way (as every one of these is) so here it is. Use it in conjunction with the apply.sh script and associated find/xargs commands I've provided in rr230786 to migrate your out of tree tests. Do let me know if any of this doesn't cover your cases & we can iterate on a more general script/regexes to help others with out of tree tests. About 9 test cases couldn't be automatically migrated - half of those were functions returning function pointers, where I just had to manually delete the function argument types now that we didn't need an explicit function type there. The other half were typedefs of function types used in calls - just had to manually drop the * from those. import fileinput import sys import re pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)') addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$") func_end = re.compile("(?:void.*|\)\s*)\*$") def conv(match, line): if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)): return line return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():] for line in sys.stdin: sys.stdout.write(conv(re.search(pat, line), line)) llvm-svn: 235145
* Revert the switch lowering change (r235101, r235103, r235106)Hans Wennborg2015-04-162-3/+3
| | | | | | Looks like it broke the sanitizer-ppc64-linux1 build. Reverting for now. llvm-svn: 235108
* Switch lowering: extract jump tables and bit tests before building binary ↵Hans Wennborg2015-04-162-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tree (PR22262) This is a major rewrite of the SelectionDAG switch lowering. The previous code would lower switches as a binary tre, discovering clusters of cases suitable for lowering by jump tables or bit tests as it went along. To increase the likelihood of finding jump tables, the binary tree pivot was selected to maximize case density on both sides of the pivot. By not selecting the pivot in the middle, the binary trees would not always be balanced, leading to performance problems in the generated code. This patch rewrites the lowering to search for clusters of cases suitable for jump tables or bit tests first, and then builds the binary tree around those clusters. This way, the binary tree will always be balanced. This has the added benefit of decoupling the different aspects of the lowering: tree building and jump table or bit tests finding are now easier to tweak separately. For example, this will enable us to balance the tree based on profile info in the future. The algorithm for finding jump tables is O(n^2), whereas the previous algorithm was O(n log n) for common cases, and quadratic only in the worst-case. This doesn't seem to be major problem in practice, e.g. compiling a file consisting of a 10k-case switch was only 30% slower, and such large switches should be rare in practice. Compiling e.g. gcc.c showed no compile-time difference. If this does turn out to be a problem, we could limit the search space of the algorithm. This commit also disables all optimizations during switch lowering in -O0. Differential Revision: http://reviews.llvm.org/D8649 llvm-svn: 235101
* DebugInfo: Remove 'inlinedAt:' field from MDLocalVariableDuncan P. N. Exon Smith2015-04-152-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove 'inlinedAt:' from MDLocalVariable. Besides saving some memory (variables with it seem to be single largest `Metadata` contributer to memory usage right now in -g -flto builds), this stops optimization and backend passes from having to change local variables. The 'inlinedAt:' field was used by the backend in two ways: 1. To tell the backend whether and into what a variable was inlined. 2. To create a unique id for each inlined variable. Instead, rely on the 'inlinedAt:' field of the intrinsic's `!dbg` attachment, and change the DWARF backend to use a typedef called `InlinedVariable` which is `std::pair<MDLocalVariable*, MDLocation*>`. This `DebugLoc` is already passed reliably through the backend (as verified by r234021). This commit removes the check from r234021, but I added a new check (that will survive) in r235048, and changed the `DIBuilder` API in r235041 to require a `!dbg` attachment whose 'scope:` is in the same `MDSubprogram` as the variable's. If this breaks your out-of-tree testcases, perhaps the script I used (mdlocalvariable-drop-inlinedat.sh) will help; I'll attach it to PR22778 in a moment. llvm-svn: 235050
* DebugInfo: Add missing !dbg attachments to intrinsicsDuncan P. N. Exon Smith2015-04-152-2/+2
| | | | | | | | Add missing `!dbg` attachments to `@llvm.dbg.*` intrinsics. I updated these using a script (add-dbg-to-intrinsics.sh) that I'll attach to PR22778 for posterity. llvm-svn: 235040
* Update tests to not be as dependent on section numbers.Rafael Espindola2015-04-151-5/+0
| | | | | | | | Many of these predate llvm-readobj. With elf-dump we had to match a relocation to symbol number and symbol number to symbol name or section number. llvm-svn: 235015
* Re-apply r234898 and fix tests.Daniel Jasper2015-04-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit makes LLVM not estimate branch probabilities when doing a single bit bitmask tests. The code that originally made me discover this is: if ((a & 0x1) == 0x1) { .. } In this case we don't actually have any branch probability information and should not assume to have any. LLVM transforms this into: %and = and i32 %a, 1 %tobool = icmp eq i32 %and, 0 So, in this case, the result of a bitwise and is compared against 0, but nevertheless, we should not assume to have probability information. CodeGen/ARM/2013-10-11-select-stalls.ll started failing because the changed probabilities changed the results of ARMBaseInstrInfo::isProfitableToIfCvt() and led to an Ifcvt of the diamond in the test. AFAICT, the test was never meant to test this and thus changing the test input slightly to not change the probabilities seems like the best way to preserve the meaning of the test. llvm-svn: 234979
* Remove this test until I figure out why it failsKrzysztof Parzyszek2015-04-131-31/+0
| | | | llvm-svn: 234777
* Make the ARM testcase from r234764 also pass on ThumbKrzysztof Parzyszek2015-04-131-3/+3
| | | | llvm-svn: 234772
* Allow memory intrinsics to be tail callsKrzysztof Parzyszek2015-04-131-0/+31
| | | | llvm-svn: 234764
OpenPOWER on IntegriCloud