summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86] Bring some better consistency to the naming of the move to/from ↵Craig Topper2015-01-022-44/+41
| | | | | | %al/ax/eax/rax with memory offset. llvm-svn: 225078
* [X86] Make the instructions that use AdSize16/32/64 co-exist together ↵Craig Topper2015-01-025-144/+241
| | | | | | | | | | without using mode predicates. This is necessary to allow the disassembler to be able to handle AdSize32 instructions in 64-bit mode when address size prefix is used. Eventually we should probably also support 'addr32' and 'addr16' in the assembler to override the address size on some of these instructions. But for now we'll just use special operand types that will lookup the current mode size to select the right instruction. llvm-svn: 225075
* [PowerPC] use UINT64_C instead of ulHal Finkel2015-01-011-4/+4
| | | | | | | Attempting to fix PR22078 (building on 32-bit systems) by replacing my careless use of 1ul to be a uint64_t constant with UINT64_C(1). llvm-svn: 225066
* [PowerPC] Improve instruction selection bit-permuting operations (64-bit)Hal Finkel2015-01-012-97/+868
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the second installment of improvements to instruction selection for "bit permutation" instruction sequences. r224318 added logic for instruction selection for 32-bit bit permutation sequences, and this adds lowering for 64-bit sequences. The 64-bit sequences are more complicated than the 32-bit ones because: a) the 64-bit versions of the 32-bit rotate-and-mask instructions work by replicating the lower 32-bits of the value-to-be-rotated into the upper 32 bits -- and integrating this into the cost modeling for the various bit group operations is non-trivial b) unlike the 32-bit instructions in 32-bit mode, the rotate-and-mask instructions cannot, in one instruction, specify the mask starting index, the mask ending index, and the rotation factor. Also, forming arbitrary 64-bit constants is more complicated than in 32-bit mode because the number of instructions necessary is value dependent. Plus, support for 'late masking' was added: it is sometimes more efficient to treat the overall value as if it had no mandatory zero bits when planning the bit-group insertions, and then mask them in at the very end. Unfortunately, as the structure of the bit groups is different in the two cases, the more feasible implementation technique was to generate both instruction sequences, and then pick the shorter one. And finally, we now generate reasonable code for i64 bswap: rldicl 5, 3, 16, 0 rldicl 4, 3, 8, 0 rldicl 6, 3, 24, 0 rldimi 4, 5, 8, 48 rldicl 5, 3, 32, 0 rldimi 4, 6, 16, 40 rldicl 6, 3, 48, 0 rldimi 4, 5, 24, 32 rldicl 5, 3, 56, 0 rldimi 4, 6, 40, 16 rldimi 4, 5, 48, 8 rldimi 4, 3, 56, 0 vs. what we used to produce: li 4, 255 rldicl 5, 3, 24, 40 rldicl 6, 3, 40, 24 rldicl 7, 3, 56, 8 sldi 8, 3, 8 sldi 10, 3, 24 sldi 12, 3, 40 rldicl 0, 3, 8, 56 sldi 9, 4, 32 sldi 11, 4, 40 sldi 4, 4, 48 andi. 5, 5, 65280 andis. 6, 6, 255 andis. 7, 7, 65280 sldi 3, 3, 56 and 8, 8, 9 and 4, 12, 4 and 9, 10, 11 or 6, 7, 6 or 5, 5, 0 or 3, 3, 4 or 7, 9, 8 or 4, 6, 5 or 3, 3, 7 or 3, 3, 4 which is 12 instructions, instead of 25, and seems optimal (at least in terms of code size). llvm-svn: 225056
* Add r224985 back with a fix.Rafael Espindola2014-12-317-182/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The issues was that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not. Original message: Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 225048
* Reverting 225045 and 225043 and XFAIL multiline.ll on hexagonColin LeMahieu2014-12-311-1/+1
| | | | llvm-svn: 225047
* [Hexagon] Removing assertion to appease buildbot until I can reproduce the ↵Colin LeMahieu2014-12-311-1/+0
| | | | | | problem llvm-svn: 225045
* Revert "Remove doesSectionRequireSymbols."Rafael Espindola2014-12-317-93/+177
| | | | | | | | This reverts commit r224985. I am investigating why it made an Apple bot unhappy. llvm-svn: 225044
* [Hexagon] Changing an llvm_unreachable to an assertion and returning 0. ↵Colin LeMahieu2014-12-311-1/+2
| | | | | | Relocations aren't implemented yet but we don't need to abort for this in release builds. llvm-svn: 225043
* [X86] Fix disassembly of absolute moves to work correctly in 16 and 32-bit ↵Craig Topper2014-12-312-6/+35
| | | | | | modes with all 4 combinations of OpSize and AdSize prefixes being present or not. llvm-svn: 225036
* [x86] Simplify detection of jcxz/jecxz/jrcxz in disassembler.Craig Topper2014-12-311-16/+5
| | | | llvm-svn: 225035
* [Hexagon] Adding accumulating add/sub, doubleword logic-not variants, ↵Colin LeMahieu2014-12-311-0/+111
| | | | | | doubleword bitfield extract, word parity, accumulating multiplies with saturation. llvm-svn: 225024
* [Hexagon] Adding double-logic on predicate instructions.Colin LeMahieu2014-12-301-0/+60
| | | | llvm-svn: 225018
* [Hexagon] Adding newvalue compare and jumps.Colin LeMahieu2014-12-301-17/+35
| | | | llvm-svn: 225015
* [Hexagon] Adding postincrement register newvalue stores.Colin LeMahieu2014-12-301-0/+30
| | | | llvm-svn: 225010
* [Hexagon] Removing old newvalue store variants. Adding postincrement ↵Colin LeMahieu2014-12-302-96/+90
| | | | | | immediate newvalue stores. llvm-svn: 225009
* [mips][microMIPS] Relocate with symbol for micromips symbolsZoran Jovanovic2014-12-301-1/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D6796 llvm-svn: 225008
* [Hexagon] Adding indexed store new-value variants.Colin LeMahieu2014-12-302-45/+100
| | | | llvm-svn: 225007
* [Hexagon] Adding indexed store of immediates.Colin LeMahieu2014-12-302-48/+97
| | | | llvm-svn: 225006
* [Hexagon] Adding indexed stores.Colin LeMahieu2014-12-302-81/+167
| | | | llvm-svn: 225005
* x86_64: Fix calls to __morestack under the large code model.Peter Collingbourne2014-12-301-6/+30
| | | | | | | | | | | | | | | | | | | | Under the large code model, we cannot assume that __morestack lives within 2^31 bytes of the call site, so we cannot use pc-relative addressing. We cannot perform the call via a temporary register, as the rax register may be used to store the static chain, and all other suitable registers may be either callee-save or used for parameter passing. We cannot use the stack at this point either because __morestack manipulates the stack directly. To avoid these issues, perform an indirect call via a read-only memory location containing the address. This solution is not perfect, as it assumes that the .rodata section is laid out within 2^31 bytes of each function body, but this seems to be sufficient for JIT. Differential Revision: http://reviews.llvm.org/D6787 llvm-svn: 225003
* [Hexagon] Adding reg-reg indexed load forms.Colin LeMahieu2014-12-303-85/+135
| | | | llvm-svn: 224997
* [Hexagon] Dropping old combine instructions without encodings.Colin LeMahieu2014-12-303-79/+68
| | | | llvm-svn: 224992
* [Hexagon] Adding compare byte/halfword reg-reg/reg-imm forms. Adding ↵Colin LeMahieu2014-12-301-55/+121
| | | | | | compare to general register reg-imm form. llvm-svn: 224991
* [Hexagon] Updating constant extender def, adding alu-not instructions, ↵Colin LeMahieu2014-12-302-10/+43
| | | | | | compare to general register, and inverted compares. llvm-svn: 224989
* Remove doesSectionRequireSymbols.Rafael Espindola2014-12-307-177/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 224985
* [Hexagon] Adding allocframe, post-increment circular immediate stores, ↵Colin LeMahieu2014-12-293-17/+149
| | | | | | post-increment circular register stores, and bit reversed post-increment stores. llvm-svn: 224957
* [Hexagon] Fixing 224952 where an addressing mode update was missed.Colin LeMahieu2014-12-291-1/+1
| | | | llvm-svn: 224955
* [Hexagon] Adding post-increment register form stores and register-immediate ↵Colin LeMahieu2014-12-296-180/+194
| | | | | | form stores with tests. llvm-svn: 224952
* [Hexagon] Replacing the remaining postincrement stores with versions that ↵Colin LeMahieu2014-12-293-58/+20
| | | | | | have encoding bits. llvm-svn: 224951
* [Hexagon] Renaming old multiclass for removal. Adding post-increment store ↵Colin LeMahieu2014-12-293-9/+104
| | | | | | classes and instruction defs. llvm-svn: 224949
* [X86] Fix some cases where some 8-bit instructions were marked as being ↵Craig Topper2014-12-291-18/+24
| | | | | | convertible to three address instructions, but aren't really. llvm-svn: 224940
* [X86] Add the 0x82 instructions to the disassebmler. They are identical in ↵Craig Topper2014-12-291-6/+35
| | | | | | functionality to the 0x80 opcode instructions, but are not valid in 64-bit mode. llvm-svn: 224939
* [x86] Refactor some tablegen instruction info classes slightly to prepare ↵Craig Topper2014-12-291-29/+28
| | | | | | for another change. NFC. llvm-svn: 224938
* [x86] Remove unused classes from tablegen instruction info.Craig Topper2014-12-291-23/+0
| | | | llvm-svn: 224937
* Add segmented stack support for DragonFlyBSD.Rafael Espindola2014-12-292-3/+12
| | | | | | Patch by Michael Neumann. llvm-svn: 224936
* Refactor duplicated code.Rafael Espindola2014-12-296-30/+8
| | | | | | No intended functionality change. llvm-svn: 224935
* [X86][ISel] Fix a regression I introduced in r224884Keno Fischer2014-12-281-3/+3
| | | | | | | | | | | | | The else case ResultReg was not checked for validity. To my surprise, this case was not hit in any of the existing test cases. This includes a new test cases that tests this path. Also drop the `target triple` declaration from the original test as suggested by H.J. Lu, because apparently with it the test won't be run on Linux llvm-svn: 224901
* [X86] Add missing memory variants to AVX false dependency breakingMichael Kuperstein2014-12-281-2/+26
| | | | | | | | Adds missing memory instruction variants to AVX false dependency breaking handling. (SSE was handled in r224246) Differential Revision: http://reviews.llvm.org/D6780 llvm-svn: 224900
* [CodeGenPrepare] Teach when it is profitable to speculate calls to ↵Andrea Di Biagio2014-12-282-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | @llvm.cttz/ctlz. If the control flow is modelling an if-statement where the only instruction in the 'then' basic block (excluding the terminator) is a call to cttz/ctlz, CodeGenPrepare can try to speculate the cttz/ctlz call and simplify the control flow graph. Example: \code entry: %cmp = icmp eq i64 %val, 0 br i1 %cmp, label %end.bb, label %then.bb then.bb: %c = tail call i64 @llvm.cttz.i64(i64 %val, i1 true) br label %end.bb end.bb: %cond = phi i64 [ %c, %then.bb ], [ 64, %entry] \code In this example, basic block %then.bb is taken if value %val is not zero. Also, the phi node in %end.bb would propagate the size-of in bits of %val only if %val is equal to zero. With this patch, CodeGenPrepare will try to hoist the call to cttz from %then.bb into basic block %entry only if cttz is cheap to speculate for the target. Added two new hooks in TargetLowering.h to let targets customize the behavior (i.e. decide whether it is cheap or not to speculate calls to cttz/ctlz). The two new methods are 'isCheapToSpeculateCtlz' and 'isCheapToSpeculateCttz'. By default, both methods return 'false'. On X86, method 'isCheapToSpeculateCtlz' returns true only if the target has LZCNT. Method 'isCheapToSpeculateCttz' only returns true if the target has BMI. Differential Revision: http://reviews.llvm.org/D6728 llvm-svn: 224899
* [x86] Prevent instruction selection of AVX512 cmp.ps/pd/ss/sd intrinsics ↵Craig Topper2014-12-272-22/+23
| | | | | | with illegal immediates. Correctly this time. I did the wrong patterns the first time. llvm-svn: 224891
* PowerPC: CTR shouldn't fire if a TLS call is in the loopDavid Majnemer2014-12-271-0/+18
| | | | | | | | | | | | | | | Determining the address of a TLS variable results in a function call in certain TLS models. This means that a simple ICmpInst might actually result in invalidating the CTR register. In such cases, do not attempt to rely on the CTR register for loop optimization purposes. This fixes PR22034. Differential Revision: http://reviews.llvm.org/D6786 llvm-svn: 224890
* Fixing another -Wunused-variable warning, this time in release builds ↵Aaron Ballman2014-12-271-3/+3
| | | | | | without asserts. NFC. llvm-svn: 224889
* Removing a variable that is set but never used, to silence a ↵Aaron Ballman2014-12-271-4/+0
| | | | | | -Wunused-but-set-variable warning; NFC. llvm-svn: 224888
* [x86] Prevent instruction selection of AVX512 cmp.ps/pd/ss/sd intrinsics ↵Craig Topper2014-12-271-15/+18
| | | | | | with illegal immediates. Forgot to do this when I did SSE/SSE2/AVX/AVX2. llvm-svn: 224887
* [x86] Assert on invalid immediates in the instruction printer for ↵Craig Topper2014-12-272-4/+8
| | | | | | cmp.ps/pd/ss/sd instead of truncating the immediate. The assembly parser and instruction selection shouldn't generate invalid immediates. llvm-svn: 224886
* [x86] Prevent llvm.x86.cmp.ps/pd/ss/sd from being selected with bad ↵Craig Topper2014-12-272-26/+33
| | | | | | immediates. The frontend now checks this when the builtin is used. This will allow the instruction printer to not have to deal with invalid immediates on these instructions. llvm-svn: 224885
* [FastIsel][X86] Fix invalid register replacement for bool argsKeno Fischer2014-12-271-28/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Consider the following IR: %3 = load i8* undef %4 = trunc i8 %3 to i1 %5 = call %jl_value_t.0* @foo(..., i1 %4, ...) ret %jl_value_t.0* %5 Bools (that are the result of direct truncs) are lowered as whatever the argument to the trunc was and a "and 1", causing the part of the MBB responsible for this argument to look something like this: %vreg8<def,tied1> = AND8ri %vreg7<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg8,%vreg7 Later, when the load is lowered, it will insert %vreg15<def> = MOV8rm %vreg14, 1, %noreg, 0, %noreg; mem:LD1[undef] GR8:%vreg15 GR64:%vreg14 but remember to (at the end of isel) replace vreg7 by vreg15. Now for the bug. In fast isel lowering, we mistakenly mark vreg8 as the result of the load instead of the trunc. This adds a fixup to have vreg8 replaced by whatever the result of the load is as well, so we end up with %vreg15<def,tied1> = AND8ri %vreg15<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg15 which is an SSA violation and causes problems later down the road. This fixes PR21557. Test Plan: Test test case from PR21557 is added to the test suite. Reviewers: ributzka Reviewed By: ributzka Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6245 llvm-svn: 224884
* [Hexagon] Adding auto-incrementing loads with and without byte reversal.Colin LeMahieu2014-12-261-0/+76
| | | | llvm-svn: 224871
* [Hexagon] Adding locked loads.Colin LeMahieu2014-12-261-0/+19
| | | | llvm-svn: 224870
OpenPOWER on IntegriCloud