path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* Revert "X86 memcpy lowering: use "rep movs" even when esi is used as base pointer" (r204174) | Hans Wennborg | 2014-03-26 | 1 | -29/+13
  > For functions where esi is used as base pointer, we would previously fall back
  > from lowering memcpy with "rep movs" because that clobbers esi.
  >
  > With this patch, we just store esi in another physical register, and restore
  > it afterwards. This adds a little bit of register pressure, but the more
  > efficient memcpy should be worth it.
  >
  > Differential Revision: http://llvm-reviews.chandlerc.com/D2968

  This didn't work. I was ending up with code like this:

      lea     edi,[esi+38h]
      mov     ecx,0Fh
      mov     edx,esi
      mov     esi,ebx
      rep movs dword ptr es:[edi],dword ptr [esi]
      lea     ecx,[esi+74h]  <-- Ooops, we're now using esi before restoring it from edx.
      add     ebx,3Ch
      mov     esi,edx

  I guess if we want to do this we need stronger glue or something, or to do
  the expansion much later.

  llvm-svn: 204829
* [PowerPC] Add v2i64 as a legal VSX type | Hal Finkel | 2014-03-26 | 4 | -9/+32
  v2i64 needs to be a legal VSX type because it is the SetCC result type from
  v2f64 comparisons. We need to expand all non-arithmetic v2i64 operations.

  This fixes the lowering for v2f64 VSELECT.

  llvm-svn: 204828
* [mips] Use TwoOperandAliasConstraint for ArithLogicR instructions. | Matheus Almeida | 2014-03-26 | 2 | -24/+1
  This enables TableGen to generate an additional two operand matcher for our
  ArithLogicR class of instructions (constituted by 3 register operands).

  E.g.:

      and $1, $2 <=> and $1, $1, $2

  llvm-svn: 204826
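The two-operand form described above can be modeled as a small rewrite into the three-operand form. This is only an illustrative sketch: `expand_two_operand` is a hypothetical helper, not something TableGen emits (TableGen generates matcher tables for the assembler rather than a function like this).

```python
def expand_two_operand(mnemonic, operands):
    """Rewrite 'op $d, $s' as 'op $d, $d, $s'; leave other forms unchanged.

    Hypothetical helper for illustration only.
    """
    if len(operands) == 2:
        dst, src = operands
        # The destination doubles as the first source operand.
        return mnemonic, [dst, dst, src]
    return mnemonic, list(operands)


two_op = expand_two_operand("and", ["$1", "$2"])
three_op = expand_two_operand("and", ["$1", "$1", "$2"])
```

Both calls yield the same three-operand instruction, which is the equivalence the commit message states.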
* [mips] Add support to the '.dword' directive. | Matheus Almeida | 2014-03-26 | 1 | -0/+5
  The '.dword' directive accepts a list of expressions and emits them in
  8-byte chunks in successive locations.

  llvm-svn: 204822
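As a rough picture of what "8-byte chunks in successive locations" means, the sketch below packs already-evaluated values back to back. The name `emit_dword` and the `little_endian` parameter are assumptions made for illustration; the real directive is handled inside MipsAsmParser and follows the target's endianness.

```python
import struct


def emit_dword(values, little_endian=True):
    """Pack each value into 8 bytes, one after another (hypothetical helper)."""
    fmt = "<Q" if little_endian else ">Q"
    # Mask to 64 bits so negative values wrap the way a fixed-width field would.
    return b"".join(struct.pack(fmt, v & 0xFFFFFFFFFFFFFFFF) for v in values)


data = emit_dword([1, 0x1122334455667788])
```

Two expressions produce 16 bytes, with the second value starting at offset 8.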
* [mips] Rename function in MipsAsmParser. | Matheus Almeida | 2014-03-26 | 1 | -4/+4
  parseDirectiveWord is a generic function that parses an expression, which
  means there's no need for it to have such a specific name. Renaming it to
  parseDataDirective so that it can also be used to handle .dword directives[1].

  [1] To be added in a follow up commit.

  No functional changes.

  llvm-svn: 204818
* [mips] Add support to '.set mips64'. | Matheus Almeida | 2014-03-26 | 3 | -0/+17
  The '.set mips64' directive enables the feature Mips::FeatureMips64
  from assembly. Note that it doesn't modify the ELF header as opposed
  to the use of -mips64 from the command-line. The reason for this
  is that we want to be as compatible as possible with existing assemblers
  like GAS.

  llvm-svn: 204817
* AArch64_BE Elf support for MC-JIT runtime dynamic linker | Christian Pirker | 2014-03-26 | 3 | -2/+4
  llvm-svn: 204816
* [mips] Add support to '.set mips64r2'. | Matheus Almeida | 2014-03-26 | 3 | -0/+17
  The '.set mips64r2' directive enables the feature Mips::FeatureMips64r2
  from assembly. Note that it doesn't modify the ELF header as opposed
  to the use of -mips64r2 from the command-line. The reason for this
  is that we want to be as compatible as possible with existing assemblers
  like GAS.

  llvm-svn: 204815
* AArch64_BE function argument passing for ARM ABI | Christian Pirker | 2014-03-26 | 1 | -2/+11
  llvm-svn: 204814
* ARM: add intrinsics for the v8 ldaex/stlex | Tim Northover | 2014-03-26 | 4 | -26/+98
  We've already got versions without the barriers, so this just adds IR-level
  support for generating the new v8 ones.

  rdar://problem/16227836

  llvm-svn: 204813
* [mips] Hoist common functionality into a new function. | Matheus Almeida | 2014-03-26 | 1 | -29/+30
  Given that we support multiple directives that enable a particular feature
  (e.g. '.set mips16'), it's best to hoist that code into a new function so
  that we don't repeat the same pattern w.r.t. parsing and handling error cases.

  No functional changes.

  llvm-svn: 204811
* Change @llvm.clear_cache default to call rt-lib | Renato Golin | 2014-03-26 | 2 | -10/+0
  After some discussion on IRC, emitting a call to the library function seems
  like a better default, since it turns a compiler internal error into a
  linker error that the user can work around until LLVM is fixed.

  I'm also adding a note on the responsibility of the user to confirm that
  the cache was cleared on platforms where nothing is done.

  llvm-svn: 204806
* [mips] The decision to use MO_GOT_PAGE and MO_GOT_OFST depends on the ABI being N32 or N64 not the arch being MIPS64 | Daniel Sanders | 2014-03-26 | 2 | -9/+11
  Summary: No functional change (in supported use cases)

  Reviewers: matheusalmeida

  Reviewed By: matheusalmeida

  Differential Revision: http://llvm-reviews.chandlerc.com/D3177

  llvm-svn: 204805
* Fix AVX512 Gather and Scatter execution domains. | Cameron McInally | 2014-03-26 | 1 | -7/+16
  llvm-svn: 204804
* [mips] Add support for '.option pic2'. | Matheus Almeida | 2014-03-26 | 3 | -0/+36
  The directive '.option pic2' enables PIC from assembly source.
  At the moment none of the macros/directives check the PIC bit,
  but that's going to be fixed relatively soon. For example, the
  expansion of macros like 'la' depends on the relocation model.

  llvm-svn: 204803
* Add @llvm.clear_cache builtin | Renato Golin | 2014-03-26 | 4 | -0/+17
  Implementing the LLVM part of the call to __builtin___clear_cache,
  which translates into an intrinsic @llvm.clear_cache and is lowered
  by each target, either to a call to __clear_cache or nothing at all
  in case the caches are unified.

  Updating LangRef and adding some tests for the implemented architectures.
  Other archs will have to implement the method in case this builtin
  has to be compiled for it, since the default behaviour is to bail
  unimplemented.

  A Clang patch is required for the builtin to be lowered into the
  llvm intrinsic. This will be done next.

  llvm-svn: 204802
* [PowerPC] Lower VSELECT using xxsel when VSX is available | Hal Finkel | 2014-03-26 | 2 | -3/+22
  With VSX there is a real vector select instruction, and so we should use it.
  Note that VSELECT will still scalarize for v2f64 because the corresponding
  SetCC result type (v2i64) is not currently a legal type.

  llvm-svn: 204801
* [mips] The register names depend on the ABI being N32/N64 rather than the arch being mips64 | Daniel Sanders | 2014-03-26 | 1 | -15/+18
  Summary: Added test cases for O32 and N32 on MIPS64.

  Reviewers: matheusalmeida

  Reviewed By: matheusalmeida

  Differential Revision: http://llvm-reviews.chandlerc.com/D3175

  llvm-svn: 204796
* Follow-up to r204790: don't try to emit line tables if there are no functions with DI in the TU | Timur Iskhodzhanov | 2014-03-26 | 1 | -2/+9
  llvm-svn: 204795
* [mips] $s8 is an alias for $fp in all ABIs, not just N32/N64. | Daniel Sanders | 2014-03-26 | 1 | -2/+2
  llvm-svn: 204793
* Fix PR19239 - Add support for generating debug info for functions without lexical scopes and/or debug info at all | Timur Iskhodzhanov | 2014-03-26 | 2 | -16/+12
  llvm-svn: 204790
* Revert "Prevent alias from pointing to weak aliases." | Rafael Espindola | 2014-03-26 | 12 | -37/+41
  This reverts commit r204781.

  I will follow up with the msan folks to see what they were trying to do
  with aliases to weak aliases.

  llvm-svn: 204784
* [PowerPC] Generate logical vector VSX instructions | Hal Finkel | 2014-03-26 | 1 | -5/+12
  These instructions are essentially the same as their Altivec counterparts, but
  have access to the larger VSX register file.

  llvm-svn: 204782
* Prevent alias from pointing to weak aliases. | Rafael Espindola | 2014-03-26 | 12 | -41/+37
  Aliases are just another name for a position in a file. As such, the
  regular symbol resolutions are not applied. For example, given

      define void @my_func() {
        ret void
      }
      @my_alias = alias weak void ()* @my_func
      @my_alias2 = alias void ()* @my_alias

  We produce without this patch:

              .weak   my_alias
      my_alias = my_func
              .globl  my_alias2
      my_alias2 = my_alias

  That is, in the resulting ELF file my_alias, my_func and my_alias2
  are just 3 names pointing to offset 0 of .text. That is *not* the
  semantics of IR linking. For example, linking in a

      @my_alias = alias void ()* @other_func

  would require the strong my_alias to override the weak one and
  my_alias2 would end up pointing to other_func.

  There is no way to represent that with aliases being just another
  name, so the best solution seems to be to just disallow it, converting
  a miscompile into an error.

  llvm-svn: 204781
* DebugInfo: Add fission-related sections to COFF | David Blaikie | 2014-03-26 | 1 | -0/+27
  Allows this test to pass on COFF platforms so we don't need to restrict
  this test to a single target anymore.

  llvm-svn: 204780
* Correctly detect if a symbol uses a reserved section index or not. | Rafael Espindola | 2014-03-26 | 1 | -3/+5
  The logic was incorrect for variables, causing them to end up in the wrong
  section if the section had an index >= 0xff00.

  llvm-svn: 204771
* [X86] Add broadcast instructions to the table used by ExeDepsFix pass. | Quentin Colombet | 2014-03-26 | 1 | -1/+7
  Adds the different broadcast instructions to the ReplaceableInstrsAVX2 table.
  That way the ExeDepsFix pass can make better decisions when AVX2 broadcasts
  cross domains (int <-> float).

  In particular, prior to this patch we were generating:

      vpbroadcastd  LCPI1_0(%rip), %ymm2
      vpand         %ymm2, %ymm0, %ymm0
      vmaxps        %ymm1, %ymm0, %ymm0  ## <- domain change penalty

  Now, we generate the following nice sequence where everything is in the float
  domain:

      vbroadcastss  LCPI1_0(%rip), %ymm2
      vandps        %ymm2, %ymm0, %ymm0
      vmaxps        %ymm1, %ymm0, %ymm0

  <rdar://problem/16354675>

  llvm-svn: 204770
* Create .symtab_shndxr only when needed. | Rafael Espindola | 2014-03-25 | 1 | -86/+120
  We need .symtab_shndxr if and only if a symbol references a section with an
  index >= 0xff00.

  The old code was trying to figure out if the section was needed ahead of time,
  making it fairly dependent on the code actually writing the table. It was
  also somewhat conservative and would create the section in cases where it was
  not needed.

  If I remember correctly, the old structure was there so that the sections were
  created in the same order gas creates them. That was valuable when MC's support
  for ELF was new and we tested with elf-dump.py.

  This patch refactors the symbol table creation to another class and makes it
  obvious that .symtab_shndxr is really only created when we are about to output
  a reference to a section index >= 0xff00.

  While here, also improve the tests to use macros. One file is one section short
  of needing .symtab_shndxr, the second one has just the right number.

  llvm-svn: 204769
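The 0xff00 threshold above is the ELF constant SHN_LORESERVE: a 16-bit `st_shndx` field cannot hold a real section index at or above it, so the symbol stores the escape value SHN_XINDEX (0xffff) and the real index goes into the extended index table (the section the ELF specification calls `.symtab_shndx`). A minimal sketch of that rule follows; the function names are hypothetical and do not mirror the ELFObjectWriter code.

```python
SHN_LORESERVE = 0xFF00  # first reserved section index
SHN_XINDEX = 0xFFFF     # escape value: real index lives in the extended table


def encode_section_index(index):
    """Return (st_shndx, extended_entry) for a symbol defined in section `index`."""
    if index >= SHN_LORESERVE:
        return SHN_XINDEX, index
    # Small indices fit directly; the extended entry stays 0.
    return index, 0


def needs_extended_table(section_indices):
    """The extended table is required only if some referenced index is reserved."""
    return any(i >= SHN_LORESERVE for i in section_indices)
```

This matches the "one section short" and "just the right number" test pair mentioned in the commit message: the table appears exactly when the first index reaches 0xff00.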
* [PowerPC] Select between VSX A-type and M-type FMA instructions just before RA | Hal Finkel | 2014-03-25 | 3 | -0/+283
  The VSX instruction set has two types of FMA instructions: A-type (where the
  addend is taken from the output register) and M-type (where one of the product
  operands is taken from the output register). This adds a small pass that runs
  just after MI scheduling (and, thus, just before register allocation) that
  mutates A-type instructions (that are created during isel) into M-type
  instructions when:

   1. This will eliminate an otherwise-necessary copy of the addend

   2. One of the product operands is killed by the instruction

  The "right" moment to make this decision is in between scheduling and register
  allocation, because only there do we know whether or not one of the product
  operands is killed by any particular instruction. Unfortunately, this also
  makes the implementation somewhat complicated, because the MIs are not in SSA
  form and we need to preserve the LiveIntervals analysis.

  As a simple example, if we have:

      %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9
      %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16,
                              %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16
        ...
      %vreg9<def,tied1> = XSMADDADP %vreg9<tied0>, %vreg17, %vreg19,
                              %RM<imp-use>; VSLRC:%vreg9,%vreg17,%vreg19
        ...

  We can eliminate the copy by changing from the A-type to the
  M-type instruction. This means:

      %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16,
                              %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16

  is replaced by:

      %vreg16<def,tied1> = XSMADDMDP %vreg16<tied0>, %vreg18, %vreg9,
                              %RM<imp-use>; VSLRC:%vreg16,%vreg18,%vreg9

  and we remove:

      %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9

  llvm-svn: 204768
* Use Endian.h to simplify this code a bit. | Rafael Espindola | 2014-03-25 | 1 | -104/+64
  While at it, factor some logic into FragmentWriter. This will allow more code
  to be factored out of the fairly large ELFObjectWriter.

  llvm-svn: 204765
* [Constant Hoisting] Make the constant candidate map local to the collectConstantCandidates method. | Juergen Ributzka | 2014-03-25 | 1 | -11/+14
  llvm-svn: 204758
* [PowerPC] Correct commutable indices for VSX FMA instructions | Hal Finkel | 2014-03-25 | 2 | -0/+18
  Although the first two operands are the ones that can be swapped, the tied
  input operand is listed before them, so we need to adjust for that.

  I have a test case for this, but it goes along with an upcoming commit (so it
  will come soon).

  llvm-svn: 204748
* [PowerPC] Add a TableGen relation for A-type and M-type VSX FMA instructions | Hal Finkel | 2014-03-25 | 2 | -27/+103
  TableGen will create a lookup table for the A-type FMA instructions providing
  their corresponding M-form opcodes. This will be used by upcoming commits.

  llvm-svn: 204746
* R600: Move computeMaskedBitsForTargetNode out of AMDILISelLowering.cpp | Matt Arsenault | 2014-03-25 | 3 | -39/+12
  Remove handling of select_cc, since it makes no sense to be there. This now
  does nothing, but I'll be adding some handling of other target nodes soon.

  llvm-svn: 204743
* blockfreq: Implement Pass::releaseMemory() | Duncan P. N. Exon Smith | 2014-03-25 | 2 | -19/+20
  Implement Pass::releaseMemory() in BlockFrequencyInfo and
  MachineBlockFrequencyInfo. Just delete the private implementation when
  not in use. Switch to a std::unique_ptr to make the logic more clear.

  <rdar://problem/14292693>

  llvm-svn: 204741
* blockfreq: Use const in MachineBlockFrequencyInfo | Duncan P. N. Exon Smith | 2014-03-25 | 2 | -10/+10
  <rdar://problem/14292693>

  llvm-svn: 204740
* [X86TTI] Make constant base pointers for getElementPtr opaque. | Juergen Ributzka | 2014-03-25 | 1 | -2/+3
  If getElementPtr uses a constant as base pointer, then make the constant opaque.
  This prevents constant folding it with the offset. The offset can usually be
  encoded in the load/store instruction itself and the base address doesn't have
  to be rematerialized several times.

  llvm-svn: 204739
* [Stackmaps][X86TTI] Fix think-o in getIntImmCost calculation. | Juergen Ributzka | 2014-03-25 | 1 | -7/+6
  The cost for the first four stackmap operands was always TCC_Free.
  This is only true for the first two operands. All other operands
  are TCC_Free if they fit within 64 bits.

  llvm-svn: 204738
* [DAG] Keep the opaque constant flag when performing unary constant folding operations. | Juergen Ributzka | 2014-03-25 | 1 | -6/+12
  Usually opaque constants shouldn't be folded, unless the fold is a simple
  unary operation that doesn't create new constants. Even then, the fold
  shouldn't drop the opaque constant flag. This commit fixes that.

  Related to <rdar://problem/14774662>

  llvm-svn: 204737
* [X86] Generate VPSHUFB for in-place v16i16 shuffles | Adam Nemet | 2014-03-25 | 1 | -0/+25
  This used to resort to splitting the 256-bit operation into two 128-bit
  shuffles and then recombining the results.

  Fixes <rdar://problem/16167303>

  llvm-svn: 204735
* [X86] Factor out new helper getPSHUFB | Adam Nemet | 2014-03-25 | 1 | -40/+62
  I found three implementations of this. This splits it out into a new function
  and uses it from the three places. My plan is to add a fourth use when
  lowering a vector_shuffle:v16i16.

  Compared the assembly output of test/CodeGen/X86 before and after.

  The only change is due to how the first PSHUFB was generated in
  LowerVECTOR_SHUFFLEv8i16. If the shuffle mask specified undef (i.e. -1), the
  old implementation would write -1 * 2 and -1 * 2 + 1 (254 and 255) in the
  control mask. Now we write 0x80. These are of course interchangeable since
  bit 7 decides if a constant zero is written in the result byte. The other
  instances of this code use 0x80 consistently.

  Related to <rdar://problem/16167303>

  llvm-svn: 204734
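The mask arithmetic described above can be sketched as follows for a single-input shuffle of 16-bit lanes. `pshufb_control_mask` is a hypothetical stand-in, not the getPSHUFB helper itself, and it ignores the two-input and 256-bit cases.

```python
def pshufb_control_mask(shuffle_mask):
    """Expand a 16-bit-lane shuffle mask into a PSHUFB byte control mask.

    Hypothetical sketch: lane i selects source bytes 2*i and 2*i + 1, and an
    undef lane (-1) becomes 0x80, whose set bit 7 zeroes the result byte.
    """
    control = []
    for lane in shuffle_mask:
        if lane < 0:
            control += [0x80, 0x80]
        else:
            control += [lane * 2, lane * 2 + 1]
    return control


# The old encoding of an undef lane, truncated to one byte each.
old_undef_bytes = [(-1 * 2) & 0xFF, (-1 * 2 + 1) & 0xFF]
new_undef_bytes = pshufb_control_mask([-1])
```

The old bytes (254 and 255) and the new bytes (0x80) all have bit 7 set, which is why the commit message calls them interchangeable.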
* [InstCombine] Don't fold bitcast into store if it would need addrspacecast | Richard Osborne | 2014-03-25 | 1 | -4/+16
  Summary:
  Previously the code didn't check if the before and after types for the
  store were pointers to different address spaces. This resulted in
  instcombine using a bitcast to convert between pointers to different
  address spaces, causing an assertion due to the invalid cast.

  It is not appropriate to use addrspacecast in this case because it is not
  guaranteed to be a no-op cast. Instead bail out and do not do the
  transformation.

  CC: llvm-commits

  Differential Revision: http://llvm-reviews.chandlerc.com/D3117

  llvm-svn: 204733
* Reuse earlier variables to make it clear the types involved in the cast. | Richard Osborne | 2014-03-25 | 1 | -3/+3
  No functionality change.

  llvm-svn: 204732
* ScalarEvolution: Compute exit counts for loops with a power-of-2 step. | Benjamin Kramer | 2014-03-25 | 1 | -0/+10
  If we have a loop of the form

      for (unsigned n = 0; n != (k & -32); n += 32) {}

  then we know that n is always divisible by 32 and the loop must terminate.
  Even if we have a condition where the loop counter will overflow it'll
  always hold this invariant.

  PR19183. Our loop vectorizer creates this pattern and it's also occasionally
  formed by loop counters derived from pointers.

  llvm-svn: 204728
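The termination argument above can be checked by brute force on a small word size. The sketch below simulates the loop with wrap-around arithmetic; `trip_count` and its `bits` parameter are illustrative assumptions, not ScalarEvolution's implementation, and the closed form it is compared against in the note is simply the bound divided by the step.

```python
def trip_count(k, step=32, bits=32):
    """Count iterations of: for (unsigned n = 0; n != (k & -step); n += step) {}

    Simulates unsigned arithmetic modulo 2**bits (hypothetical helper).
    """
    modulus = 1 << bits
    limit = (k & -step) % modulus  # k with its low log2(step) bits cleared
    n, count = 0, 0
    while n != limit:
        n = (n + step) % modulus   # n stays a multiple of step, even on wrap
        count += 1
    return count


count_for_100 = trip_count(100)
```

Because both `n` and the bound are multiples of the step, `n` cannot step over the bound, so for every 8-bit `k` the simulated count equals `(k & -32) // 32`.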
* Fix creating illegal setcc cond codes. | Matt Arsenault | 2014-03-25 | 1 | -10/+18
  If GT/UGT or LT/ULT were set to expand, a comparison with a constant would
  replace it with the illegal cond code.

  There are several more places later in this function that will have the same
  basic problem.

  Theoretically R600 should hit this problem for a test, but for some reason
  it doesn't.

  llvm-svn: 204727
* [msan] More precise instrumentation of select IR. | Evgeniy Stepanov | 2014-03-25 | 1 | -19/+41
  Some bits of select result may be initialized even if select condition
  is not.

  https://code.google.com/p/memory-sanitizer/issues/detail?id=50

  llvm-svn: 204716
* [mips] '.set at=$0' should be equivalent to '.set noat' | Daniel Sanders | 2014-03-25 | 1 | -1/+1
  Differential Revision: http://llvm-reviews.chandlerc.com/D3171

  llvm-svn: 204714
* Fix AVX2 Gather execution domains. | Cameron McInally | 2014-03-25 | 1 | -4/+10
  llvm-svn: 204713
* [mips] Correct testcase for .set at=$reg and emit the new warnings for numeric registers too. | Daniel Sanders | 2014-03-25 | 1 | -11/+19
  Summary:
  Remove the XFAIL added in my previous commit and correct the test such that
  it correctly tests the expansion of the assembler temporary.

  Also added a test to check that $at is always $1 when written by the user.

  Corrected the new assembler temporary warnings so that they are emitted for
  numeric registers too.

  Differential Revision: http://llvm-reviews.chandlerc.com/D3169

  llvm-svn: 204711
* [mips] Fix assembler temporary expansion and add associated warnings about the use of $at. | Daniel Sanders | 2014-03-25 | 1 | -2/+11
  Summary:
  The assembler temporary is normally $at ($1) but can be reassigned using
  '.set at=$reg'. Regardless of which register is nominated as the assembler
  temporary, $at remains $1 when written by the user.

  Adds warnings under the following conditions:
  * The register nominated as the assembler temporary is used by the user.
  * '.set noat' is in effect and $at is used by the user.

  Both of these only work for named registers. I have a follow up commit that
  makes it work for numeric registers as well.

  XFAIL set-at-directive.s since it incorrectly tests that $at is redefined by
  '.set at=$reg'. Testcases will follow in a separate commit.

  Patch by David Chisnall
  His work was sponsored by: DARPA, AFRL

  Differential Revision: http://llvm-reviews.chandlerc.com/D3167

  llvm-svn: 204710