path: root/llvm/lib/Target
Commit message  Author  Age  Files  Lines
...
* R600/SI: Merge offset0 and offset1 fields for single address DS instructions v2Matt Arsenault2014-03-192-18/+27
    Also remove unused data fields from the DS_Load_Helper class.

    v2:
      - Merge fields for DS_WRITE

    llvm-svn: 204269
* [mips] 80-column.Matheus Almeida2014-03-191-8/+12
| | | | llvm-svn: 204252
* Prune includes in X86 target.Craig Topper2014-03-1915-32/+18
| | | | llvm-svn: 204216
* Revert "Add back r203962, r204028 and r204059."Rafael Espindola2014-03-191-70/+0
| | | | | | This reverts commit r204178. llvm-svn: 204203
* Add back r203962, r204028 and r204059.Rafael Espindola2014-03-181-0/+70
| | | | | | | | This reverts commit r204137. This includes a fix for handling aliases of aliases. llvm-svn: 204178
* X86 memcpy lowering: use "rep movs" even when esi is used as base pointerHans Wennborg2014-03-181-13/+29
    For functions where esi is used as base pointer, we would previously fall back
    from lowering memcpy with "rep movs" because that clobbers esi.

    With this patch, we just store esi in another physical register, and restore
    it afterwards. This adds a little bit of register pressure, but the more
    efficient memcpy should be worth it.

    Differential Revision: http://llvm-reviews.chandlerc.com/D2968

    llvm-svn: 204174
* X86: Use enums for memory operand decoding instead of integer literals.Manuel Jacob2014-03-185-53/+54
    Summary:
    X86BaseInfo.h defines an enum for the offset of each operand in a memory operand
    sequence. Some code uses it and some does not. This patch replaces (hopefully)
    all remaining locations where an integer literal was used instead of this enum.
    No functionality change intended.

    Reviewers: nadav

    CC: llvm-commits, t.p.northover

    Differential Revision: http://llvm-reviews.chandlerc.com/D3108

    llvm-svn: 204158
* Enable CFI on Hexagon.Krzysztof Parzyszek2014-03-181-1/+0
| | | | llvm-svn: 204157
* Fix PR19144: Incorrect offset generated for int-to-fp conversion at -O0.Bill Schmidt2014-03-181-3/+5
    When converting a signed 32-bit integer to double-precision floating point on
    hardware without an lfiwax instruction, we have to instead use an lfd followed
    by fcfid. We were erroneously offsetting the address by 4 bytes in
    preparation for either an lfiwax or lfiwzx when generating the lfd. This fixes
    that silly error.

    This was not caught in the test suite since the conversion tests were run with
    -mcpu=pwr7, which implies availability of lfiwax. I've added another test
    case for older hardware that checks the code we expect in the absence of
    lfiwax and other flavors of fcfid. There are fewer tests in this test case
    because we punt to DAG selection in more cases on older hardware. (We must
    generate complex fiddly sequences in those cases, and there is marginal
    benefit in duplicating that logic in fast-isel.)

    llvm-svn: 204155
* Revert r203962 and two revisions depending on it: r204028 and r204059.Alexander Kornienko2014-03-181-70/+0
    The revision I'm reverting breaks handling of transitive aliases. This blocks us
    and breaks sanitizer bootstrap:
    http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/2651
    (and checked locally by Alexey).

    This revision is the result of:
      svn merge -r204059:204058 -r204028:204027 -r203962:203961 .
    + the regression test added to test/MC/ELF/alias.s

    Another way to reproduce the regression with clang:

    $ cat q.c
    void a1();
    void a2() __attribute__((alias("a1")));
    void a3() __attribute__((alias("a2")));
    void a1() {}

    $ ~/work/llvm-build/bin/clang-3.5-good -c q.c && mv q.o good.o && \
      ~/work/llvm-build/bin/clang-3.5-bad -c q.c && mv q.o bad.o && \
      objdump -t good.o bad.o

    good.o: file format elf64-x86-64

    SYMBOL TABLE:
    0000000000000000 l df *ABS* 0000000000000000 q.c
    0000000000000000 l d .text 0000000000000000 .text
    0000000000000000 l d .data 0000000000000000 .data
    0000000000000000 l d .bss 0000000000000000 .bss
    0000000000000000 l d .comment 0000000000000000 .comment
    0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
    0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
    0000000000000000 g F .text 0000000000000006 a1
    0000000000000000 g F .text 0000000000000006 a2
    0000000000000000 g F .text 0000000000000006 a3

    bad.o: file format elf64-x86-64

    SYMBOL TABLE:
    0000000000000000 l df *ABS* 0000000000000000 q.c
    0000000000000000 l d .text 0000000000000000 .text
    0000000000000000 l d .data 0000000000000000 .data
    0000000000000000 l d .bss 0000000000000000 .bss
    0000000000000000 l d .comment 0000000000000000 .comment
    0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
    0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
    0000000000000000 g F .text 0000000000000006 a1
    0000000000000000 g F .text 0000000000000006 a2
    0000000000000000 g .text 0000000000000000 a3

    llvm-svn: 204137
* [C++11] Change DebugInfoFinder to use range-based loopsAlon Mishne2014-03-181-8/+2
| | | | | | Also changes the iterators to return actual DI type over MDNode. llvm-svn: 204130
* [C++11] Mark the target fast isel classes as 'final' so that the compiler ↵Craig Topper2014-03-183-3/+3
| | | | | | can de-virtualize some of the internal calls. llvm-svn: 204123
* ARM: add an assertionSaleem Abdulrasool2014-03-181-0/+1
| | | | | | | Add an assertion that a valid section is referenced. The potential NULL pointer dereference was identified by the clang static analyzer. llvm-svn: 204114
* Make methods staticMatt Arsenault2014-03-171-23/+24
| | | | llvm-svn: 204085
* R600: Match sign_extend_inreg to BFE instructionsMatt Arsenault2014-03-179-47/+154
| | | | llvm-svn: 204072
* [X86] Fix unused variable warning with NDEBUG from r204058Adam Nemet2014-03-171-2/+1
| | | | llvm-svn: 204063
* ARM IAS: support .thumb_setSaleem Abdulrasool2014-03-171-0/+70
    This performs the equivalent of a .set directive in that it creates a symbol
    which is an alias for another symbol or value which may possibly be yet
    undefined. This directive also marks the aliased symbol as being a thumb
    function entry point, in the same way that the .thumb_func directive does.

    The current implementation fails one test due to an unrelated issue. Functions
    within .thumb sections are not marked as thumb_func. The result is that
    the aliasee function is not valued correctly.

    llvm-svn: 204059
* [VectorLegalizer/X86] Don't unvectorize fp_to_uint for v8f32->v8i16Adam Nemet2014-03-171-9/+6
    Rather than LegalizeAction::Expand, this needs LegalizeAction::Promote to get
    promoted to fp_to_sint v8f32->v8i32. This is a legal operation on AVX.

    For that to work properly, we also need to teach the legalizer about the
    specific promotion required here. The default vector promotion uses
    bitcasting to a vector type of the same total size. We want to promote the
    vector element type, effectively widening the operation and then truncating
    the result. This is analogous to the current logic of how int_to_fp is
    promoted.

    The change also factors out some code from the int_to_fp promotion code to
    ValueType::widenIntegerVectorElementType. This is now shared between
    int_to_fp and fp_to_int.

    There is no longer a need for the custom lowering of fp_to_sint f32->v8i16 in
    X86. It can now go through the new target-independent fp_to_*int promotion
    logic.

    I also checked that no other target uses Promote for these ops yet, so there
    shouldn't be any unexpected change in behavior.

    Fixes <rdar://problem/16202247>

    llvm-svn: 204058
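The widen-then-truncate idea in the message above can be sketched in plain C++. This is only an element-wise illustration, not the legalizer code: the function name `fpToUint16ViaInt32` is hypothetical, and `std::array` merely stands in for the v8f32 and v8i16 vector types. It relies on the fact that every value representable in an unsigned 16-bit element also fits in a signed 32-bit element, so the signed conversion to the wider type followed by a truncate gives the same result for in-range inputs.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: models fp_to_uint v8f32 -> v8i16 as
// fp_to_sint v8f32 -> v8i32 followed by a truncate to 16-bit elements.
// Inputs are assumed to lie in [0, 65535]; anything else is out of scope here.
inline std::array<std::uint16_t, 8>
fpToUint16ViaInt32(const std::array<float, 8> &in) {
  std::array<std::uint16_t, 8> out{};
  for (std::size_t i = 0; i < in.size(); ++i) {
    // Promoted step: signed conversion to the wider element type.
    const std::int32_t wide = static_cast<std::int32_t>(in[i]);
    // Truncation step: keep only the low 16 bits.
    out[i] = static_cast<std::uint16_t>(wide);
  }
  return out;
}
```

Values at or above 32768.0f are the interesting inputs: they would not fit in a signed 16-bit element, but they pass through the signed 32-bit intermediate unchanged.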
* R600/SI: Fix implementation of isInlineConstant() used by the verifierTom Stellard2014-03-171-14/+25
| | | | | | | | The type of the immediates should not matter as long as the encoding is equivalent to the encoding of one of the legal inline constants. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 204056
* R600/SI: Use correct dest register class for V_READFIRSTLANE_B32Tom Stellard2014-03-174-6/+28
    This instruction writes to a 32-bit SGPR. This change required adding the
    32-bit VCC_LO and VCC_HI registers, because the full VCC register is 64 bits.

    This fixes verifier errors on several of the indirect addressing piglit tests.

    Tested-by: Michel Dänzer <michel.daenzer@amd.com>

    llvm-svn: 204055
* R600/SI: Add generic checks to SIInstrInfo::verifyInstruction()Tom Stellard2014-03-171-0/+41
| | | | | | | Added checks for number of operands and operand register classes. Tested-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 204054
* [X86] New and improved VZeroUpperInserter optimization.Lang Hames2014-03-171-165/+162
    - Adds support for inserting vzerouppers before tail-calls.
      This is enabled implicitly by having MachineInstr::copyImplicitOps preserve
      regmask operands, which allows VZeroUpperInserter to see where tail-calls use
      vector registers.

    - Fixes a bug that caused the previous version of this optimization to miss some
      vzeroupper insertion points in loops. (Loops-with-vector-code that followed
      loops-without-vector-code were mistakenly overlooked by the previous version).

    - New algorithm never revisits instructions.

    Fixes <rdar://problem/16228798>

    llvm-svn: 204021
* Remove some dead assignments found by scan-buildArnaud A. de Grandmaison2014-03-151-1/+0
| | | | llvm-svn: 204013
* Replace ValueTypes.h with MachineValueType.h if possible.Patrik Hagglund2014-03-154-3/+5
| | | | | | | | | Utilize the previous move of MVT to a separate header for all trivial cases (that don't need any further restructuring). Reviewed By: Tim Northover llvm-svn: 204003
* R600: Remove unnecessary attempt to zext a pointer.Matt Arsenault2014-03-151-3/+6
    Private pointers are now always 32 bits.

    llvm-svn: 203989
* R600: Code cleanup.Matt Arsenault2014-03-151-11/+12
| | | | | | | Use sign_extend_inreg and getZeroExtendInReg instead of using the bit operations they expand into. llvm-svn: 203988
* x86: Add missing break to getCallPreservedMask()Duncan P. N. Exon Smith2014-03-141-0/+1
    This change brings getCallPreservedMask()'s logic in line with
    getCalleeSavedRegs().

    While this changes the control flow slightly, the change is not
    currently observable. is64Bit must be false to get to the accidental
    fallthrough, but the case that we fall into (coldcc) does nothing unless
    is64Bit is true.

    llvm-svn: 203943
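Why the missing break was not observable can be shown with a small stand-alone sketch. This is not the X86 register-info code: the enums `Conv` and `Mask` and both function names are hypothetical, and only the shape of the switch follows the description above (a case that returns when is64Bit is true and otherwise falls into a cold-style case that does nothing unless is64Bit is true).

```cpp
// Hypothetical stand-ins for calling conventions and preserved-register masks.
enum class Conv { A, Cold, Other };
enum class Mask { Default, A64, Cold64 };

// Shape of the code before the fix: case A lacks a break.
inline Mask maskWithFallthrough(Conv cc, bool is64Bit) {
  switch (cc) {
  case Conv::A:
    if (is64Bit)
      return Mask::A64;
    // Missing break: we only get here when is64Bit is false.
  case Conv::Cold:
    if (is64Bit) // Never true when reached through the fallthrough.
      return Mask::Cold64;
    break;
  case Conv::Other:
    break;
  }
  return Mask::Default;
}

// Shape of the code after the fix: the break is present.
inline Mask maskWithBreak(Conv cc, bool is64Bit) {
  switch (cc) {
  case Conv::A:
    if (is64Bit)
      return Mask::A64;
    break;
  case Conv::Cold:
    if (is64Bit)
      return Mask::Cold64;
    break;
  case Conv::Other:
    break;
  }
  return Mask::Default;
}
```

Both functions return the same mask for every combination of convention and is64Bit, which matches the claim that the control-flow change has no visible effect today.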
* x86: NFC: Make getCallPreservedMask() more similar to getCalleeSavedRegs()Duncan P. N. Exon Smith2014-03-141-4/+6
| | | | | | | Changing order of checks in getCallPreservedMask() to match getCalleeSavedRegs() so that the logic is easier to compare. llvm-svn: 203939
* x86: getCalleeSavedRegs() would crash on 0 (so don't default to it)Duncan P. N. Exon Smith2014-03-142-1/+2
| | | | | | | The current logic assumes that MF is not 0. Assert that it isn't, and remove the default of 0 from the header. llvm-svn: 203934
* [ppc64] Avoid copy relocs in named rodata sectionsUlrich Weigand2014-03-141-13/+9
    Commit r181723 introduced code to avoid placing initialized variables
    needing relocations into the .rodata section, which avoids copy relocs
    that do not work as expected on ppc64 function references.

    The same treatment is also needed for *named* .rodata.XXX sections.

    This patch changes PPC64LinuxTargetObjectFile::SelectSectionForGlobal
    to modify "Kind" *before* calling the default SelectSectionForGlobal
    routine, instead of first calling the default routine and then just
    checking for the (main) .rodata section afterwards.

    llvm-svn: 203921
* AddressSanitizer instrumentation for MOV and MOVAPS.Evgeniy Stepanov2014-03-144-3/+304
| | | | | | | | This is an initial version of *Sanitizer instrumentation of assembly code. Patch by Yuri Gorshenin. llvm-svn: 203908
* Remove the linker_private and linker_private_weak linkages.Rafael Espindola2014-03-131-4/+0
    These linkages were introduced some time ago, but it was never very
    clear what exactly their semantics were or what they should be used
    for. Some investigation found these uses:

    * utf-16 strings in clang.
    * non-unnamed_addr strings produced by the sanitizers.

    It turns out they were just working around a more fundamental problem.
    For some sections a MachO linker needs a symbol in order to split the
    section into atoms, and llvm had no idea that was the case. I fixed
    that in r201700 and it is now safe to use the private linkage. When
    the object ends up in a section that requires symbols, llvm will use an
    'l' prefix instead of an 'L' prefix and things just work.

    With that, these linkages were already dead, but there was a potential
    future user in the objc metadata information. I am still looking at
    CGObjcMac.cpp, but at this point I am convinced that linker_private and
    linker_private_weak are not what they need.

    The objc uses are currently split into

    * Regular symbols (no '\01' prefix). LLVM already directly provides
      whatever semantics they need.
    * Uses of a private name (start with "\01L" or "\01l") and private
      linkage. We can drop the "\01L" and "\01l" prefixes as soon as llvm
      agrees with clang on L being ok or not for a given section. I have two
      patches in code review for this.
    * Uses of private name and weak linkage.

    The last case is the one that one could think would fit one of these
    linkages. That is not the case. The semantics are

    * the linker will merge these symbols by *name*.
    * the linker will hide them in the final DSO.

    Given that the merging is done by name, any of the private (or
    internal) linkages would be a bad match. They allow llvm to rename the
    symbols, and that is really not what we want. From the llvm point of
    view, these objects should really be (linkonce|weak)(_odr)?.

    For now, just keeping the "\01l" prefix is probably the best option for
    these symbols.

    If we one day want to have a more direct support in llvm, IMHO what we
    should add is not a linkage, it is just a hidden_symbol attribute. It
    would be applicable to multiple linkages. For example, on weak it would
    produce the current behavior we have for objc metadata. On internal, it
    would be equivalent to private (and we should then remove private).

    llvm-svn: 203866
* Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changingOwen Anderson2014-03-1312-40/+44
| | | | | | | | | | operator* on the by-operand iterators to return a MachineOperand& rather than a MachineInstr&. At this point they almost behave like normal iterators! Again, this requires making some existing loops more verbose, but should pave the way for the big range-based for-loop cleanups in the future. llvm-svn: 203865
* Use printable names to implement directional labels.Rafael Espindola2014-03-131-2/+1
| | | | | | | | | | | | | | This changes the implementation of local directional labels to use a dedicated map. With that it can then just use CreateTempSymbol, which is what the rest of MC uses. CreateTempSymbol doesn't do a great job at making sure the names are unique (or being efficient when the names are not needed), but that should probably be fixed in a followup patch. This fixes pr18928. llvm-svn: 203826
* R600: LDS instructions shouldn't implicitly define OQAPTom Stellard2014-03-131-2/+0
| | | | | | | | | LDS instructions are pseudo instructions which model the OQAP defs and uses within a single instruction. This fixes a hang in the opencv MedianFilter tests. llvm-svn: 203818
* [ARM] Use symbolic register names in .cfi directives only with IAS (PR19110)Hans Wennborg2014-03-132-3/+11
| | | | | | | | | | This is a follow-up to r203635. Saleem pointed out that since symbolic register names are much easier to read, it would be good if we could turn them off only when we really need to because we're using an external assembler. Differential Revision: http://llvm-reviews.chandlerc.com/D3056 llvm-svn: 203806
* CodeGenPrep: sink extends of illegal types into use block.Manuel Jacob2014-03-131-48/+0
    Summary:
    This helps the instruction selector to lower an i64 * i64 -> i128
    multiplication into a single instruction on targets which support it.

    This is an update of D2973 which was reverted because of a bug reported
    as PR19084.

    Reviewers: t.p.northover, chapuni

    Reviewed By: t.p.northover

    CC: llvm-commits, alex, chapuni

    Differential Revision: http://llvm-reviews.chandlerc.com/D3021

    llvm-svn: 203797
* AVX-512: masked load/store + intrinsics for them.Elena Demikhovsky2014-03-131-121/+108
| | | | llvm-svn: 203790
* AArch64: error when both positional & named operands are used.Tim Northover2014-03-133-4/+7
| | | | | | | | | | Only one instruction pair needed changing: SMULH & UMULH. The previous code worked, but MC was doing extra work treating Ra as a valid operand (which then got completely overwritten in MCCodeEmitter). No behaviour change, so no tests. llvm-svn: 203772
* [PowerPC] Initial support for the VSX instruction setHal Finkel2014-03-1319-23/+1275
    VSX is an ISA extension supported on the POWER7 and later cores that enhances
    floating-point vector and scalar capabilities. Among other things, this adds
    <2 x double> support and generally helps to reduce register pressure.

    The interesting part of this ISA feature is the register configuration: there
    are 64 new 128-bit vector registers, the first 32 of which are super-registers
    of the existing 32 scalar floating-point registers, and the second 32 of which
    overlap with the 32 Altivec vector registers. This makes things like vector
    insertion and extraction tricky: this can be free but only if we force a
    restriction to the right register subclass when needed. A new "minipass"
    PPCVSXCopy takes care of this (although it could do a more-optimal job of it;
    see the comment about unnecessary copies below).

    Please note that, currently, VSX is not enabled by default when targeting
    anything because it is not yet ready for that. The assembler and disassembler
    are fully implemented and tested. However:

     - CodeGen support causes miscompiles; test-suite runtime failures:
          MultiSource/Benchmarks/FreeBench/distray/distray
          MultiSource/Benchmarks/McCat/08-main/main
          MultiSource/Benchmarks/Olden/voronoi/voronoi
          MultiSource/Benchmarks/mafft/pairlocalalign
          MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4
          SingleSource/Benchmarks/CoyoteBench/almabench
          SingleSource/Benchmarks/Misc/matmul_f64_4x4

     - The lowering currently falls back to using Altivec instructions far more
       than it should. Worse, there are some things that are scalarized through the
       stack that shouldn't be.

     - A lot of unnecessary copies make it past the optimizers, and this needs to
       be fixed.

     - Many more regression tests are needed.

    Normally, I'd fix these things prior to committing, but there are some
    students and other contributors who would like to work on this, and so it
    makes sense to move this development process upstream where it can be subject
    to the regular code-review procedures.

    llvm-svn: 203768
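The register layout described in the VSX message can be pictured with a tiny lookup. This is a sketch for illustration only, not backend code: the names `LegacyFile`, `Overlap`, and `vsxOverlap` are hypothetical, and the index-for-index pairing (VSX register n with scalar FP register n, VSX register 32+n with Altivec register n) is an assumption the commit message does not spell out.

```cpp
// Which older register file a given VSX register shares storage with.
enum class LegacyFile { ScalarFP, Altivec };

struct Overlap {
  LegacyFile file; // register file that overlaps the VSX register
  unsigned index;  // register number within that file
};

// Hypothetical helper; vsxReg is assumed to be in the range 0..63.
inline Overlap vsxOverlap(unsigned vsxReg) {
  // Registers 0..31 extend the scalar floating-point registers;
  // registers 32..63 overlap the Altivec vector registers.
  return vsxReg < 32 ? Overlap{LegacyFile::ScalarFP, vsxReg}
                     : Overlap{LegacyFile::Altivec, vsxReg - 32};
}
```

A split like this is why a single VSX operand can end up described by two separate register-number ranges.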
* [TableGen] Optionally forbid overlap between named and positional operandsHal Finkel2014-03-131-0/+2
    There are currently two schemes for mapping instruction operands to
    instruction-format variables for generating the instruction encoders and
    decoders for the assembler and disassembler respectively: a) to map by name and
    b) to map by position.

    In the long run, we'd like to remove the position-based scheme and use only
    name-based mapping. Unfortunately, the name-based scheme currently cannot deal
    with complex operands (those with suboperands), and so we currently must use
    the position-based scheme for those. On the other hand, the position-based
    scheme cannot deal with (register) variables that are split into multiple
    ranges. An upcoming commit to the PowerPC backend (adding VSX support) will
    require this capability. While we could teach the position-based scheme to
    handle that, since we'd like to move away from the position-based mapping
    generally, it seems silly to teach it new tricks now. What makes more sense is
    to allow for partial transitioning: use the name-based mapping when possible,
    and only use the position-based scheme when necessary.

    Now the problem is that mixing the two sensibly was not possible: the
    position-based mapping would map based on position, but would not skip those
    variables that were mapped by name. Instead, the two sets of assignments would
    overlap. However, I cannot currently change the current behavior, because there
    are some backends that rely on it [I think mistakenly, but I'll send a message
    to llvmdev about that]. So I've added a new TableGen bit variable:
    noNamedPositionallyEncodedOperands, that can be used to cause the
    position-based mapping to skip variables mapped by name.

    llvm-svn: 203767
* ARM: ignore unused variable to fix -Wunused-variable buildsSaleem Abdulrasool2014-03-131-0/+1
| | | | llvm-svn: 203765
* ARM: support emission of complex SO expressionsSaleem Abdulrasool2014-03-131-2/+13
| | | | | | | | | | | | | | Support to the IAS was added to actually parse and handle the complex SO expressions. However, the object file lowering was not updated to compensate for the fact that the shift operand may be an absolute expression. When trying to assemble to an object file, the lowering would fail while succeeding when emitting purely assembly. Add an appropriate test. The test case is inspired by the test case provided by Jiangning Liu who also brought the issue to light. llvm-svn: 203762
* [X86] Add peephole for masked rotate amountAdam Nemet2014-03-121-0/+2
    Extend what's currently done for shift because the HW performs this masking
    implicitly:

       (rotl:i32 x, (and y, 31)) -> (rotl:i32 x, y)

    I use the newly factored out multiclass that was only supporting shifts so
    far.

    For testing I extended my testcase for the new rotation idiom.

    <rdar://problem/15295856>

    llvm-svn: 203718
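The equivalence behind this peephole can be checked with a small C++ model. It is a sketch, not the TableGen pattern: `rotl32` is a hypothetical helper that models a 32-bit rotate whose count is reduced to its low 5 bits, as the message says the hardware does, and `rotl32Masked` is the source-level idiom with the explicit `and`.

```cpp
#include <cstdint>

// Models a 32-bit rotate-left that uses only the low 5 bits of the count.
inline std::uint32_t rotl32(std::uint32_t x, std::uint32_t count) {
  const std::uint32_t n = count & 31; // implicit masking done by the hardware
  // Guard n == 0 so that the right shift by 32 never happens.
  return n == 0 ? x : (x << n) | (x >> (32 - n));
}

// The idiom before the peephole: mask the count, then rotate.
inline std::uint32_t rotl32Masked(std::uint32_t x, std::uint32_t count) {
  return rotl32(x, count & 31);
}
```

Because the rotate already discards the high bits of the count, `rotl32Masked(x, y)` and `rotl32(x, y)` agree for every `y`, so the explicit `and` can be dropped.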
* Allow exclamation and tilde to be parsed as a part of the ppc asm operand.Roman Divacky2014-03-121-0/+2
| | | | llvm-svn: 203699
* R600: Fix trunc store from i64 to i1Matt Arsenault2014-03-121-0/+6
| | | | llvm-svn: 203695
* [X86] Refactor peepholes for masked shift amount into a multiclassAdam Nemet2014-03-121-55/+25
    The peephole (shift x, (and y, 31)) -> (shift x, y) is repeated for each
    integer type and each shift variant. To improve this a new multiclass is added
    that covers all integer types. The shift patterns are now instantiated from
    this.

    I am planning to add new instances for rotates as well.

    No functional change intended:
    * test/CodeGen/X86/shift-and.ll provides coverage
    * Compared the expanded tablegen output and matched up the defs for these
      Pat<>s before and after

    llvm-svn: 203685
* [X86] Set the scheduling resources of some of the FPStack instructions.Quentin Colombet2014-03-121-0/+17
| | | | | | This is related to <rdar://problem/15607571>. llvm-svn: 203682
* Try harder to evaluate expressions when printing assembly.Rafael Espindola2014-03-124-8/+7
| | | | | | | | | When printing assembly we don't have a Layout object, but we can still try to fold some constants. Testcase by Ulrich Weigand. llvm-svn: 203677
* Add comment pointing to the binutils bugzilla entryHans Wennborg2014-03-121-0/+1
| | | | | | This is a follow-up to r203635 as suggested by Rafael. llvm-svn: 203670