path: root/llvm
Commit message | Author | Age | Files | Lines
...
* [AVX512] Make vextract*x4/vinsert*x4 tests check for the index as well | Adam Nemet | 2014-09-25 | 1 | -7/+7
| | | | | | Extend test so that it provides coverage for the next commit. llvm-svn: 218479
* [AVX512] Refactor subvector extracts | Adam Nemet | 2014-09-25 | 1 | -98/+69
| | | | | | | | | | | | | | | | | | | | | No functional change. These are now implemented as two levels of multiclasses heavily relying on the new X86VectorVTInfo class. The multiclass at the first level that is called with float or int provides the 128 or 256 bit subvector extracts. The second level provides the register and memory variants and some more Pat<>s. I've compared the td.expanded files before and after. One change is that ExeDomain for 64x4 is SSEPackedDouble now. I think this is correct, i.e. a bugfix. (BTW, this is the change that was blocked on the recent tablegen fix. The class-instance values X86VectorVTInfo inside vextract_for_type weren't properly evaluated.) Part of <rdar://problem/17688758> llvm-svn: 218478
* [AVX512] Fix typo | Adam Nemet | 2014-09-25 | 1 | -1/+1
| | | | | | F->I in VEXTRACTF32x4rr. llvm-svn: 218477
* Add SDAG TableGen definitions for BR_CC | Hal Finkel | 2014-09-25 | 1 | -0/+5
| | | | | | | | | Add SelectionDAG TableGen definitions for BR_CC so that targets can instruction-select BR_CC using TableGen pattern matching. Patch by deadal nix. llvm-svn: 218476
* R600: Fix some missing conversion testcases | Matt Arsenault | 2014-09-25 | 3 | -4/+44
| | | | llvm-svn: 218474
* Remove duplicated RUN lines in middle of test | Matt Arsenault | 2014-09-25 | 1 | -2/+0
| | | | llvm-svn: 218473
* [MachineSink+PGO] Teach MachineSink to use BlockFrequencyInfo | Bruno Cardoso Lopes | 2014-09-25 | 2 | -6/+68
| | | | | | | | | | | | | | | Machine Sink uses loop depth information to select between successor BBs to sink machine instructions into, where BBs at smaller loop depths are preferable. This patch adds support for choosing between successors by using profile information from BlockFrequencyInfo instead, whenever the information is available. Tested it under SPEC2006 train (average of 30 runs for each program); ~1.5% execution speedup on average on x86-64 darwin. <rdar://problem/18021659> llvm-svn: 218472
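To illustrate the successor selection described in r218472, here is a minimal standalone C++ sketch. The `Successor` struct, the `sortSuccessors` name, the convention that a frequency of 0 means "no profile data", and the preference for the lower-frequency block are all assumptions made for this example; none of them is taken from the MachineSink source.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a successor basic block.
struct Successor {
  int Id;
  unsigned LoopDepth;
  uint64_t Freq; // assumption: 0 means no profile information is available
};

// Order candidate successors from most to least preferable sink target.
// Assumes either every entry carries a frequency or none does; mixing the
// two would make the comparison inconsistent.
std::vector<Successor> sortSuccessors(std::vector<Successor> Succs) {
  std::stable_sort(Succs.begin(), Succs.end(),
                   [](const Successor &L, const Successor &R) {
                     bool HasFreq = L.Freq != 0 && R.Freq != 0;
                     // With profile data, prefer the colder block;
                     // otherwise fall back to the smaller loop depth.
                     return HasFreq ? L.Freq < R.Freq
                                    : L.LoopDepth < R.LoopDepth;
                   });
  return Succs;
}
```

With frequencies present, a deeper but colder block sorts first; without them, the ordering reduces to the loop-depth heuristic the commit message describes.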
* Object: Add range iterators for Archive children | David Majnemer | 2014-09-25 | 2 | -9/+12
| | | | | | No functional change intended. llvm-svn: 218471
* [Support] Fix Format.h to build on Windows | Nick Kledzik | 2014-09-25 | 1 | -0/+1
| | | | llvm-svn: 218467
* [Support] Add type-safe alternative to llvm::format() | Nick Kledzik | 2014-09-25 | 4 | -0/+164
| | | | | | | | | | | | | | | | | | | | | llvm::format() is somewhat unsafe. The compiler does not check that the integer parameter size matches the %x or %d size, and it does not complain when a StringRef is passed for a %s. Correctly using a StringRef with format() is also ugly because you have to convert it to a std::string and then call c_str(). llvm::format() is useful mainly for controlling how numbers and strings are printed, especially when you want fixed-width output. This patch adds some new formatting functions to raw_streams to format numbers and StringRefs in a type-safe manner. Some examples: OS << format_hex(255, 6) => "0x00ff" OS << format_hex(255, 4) => "0xff" OS << format_decimal(0, 5) => "    0" OS << format_decimal(255, 5) => "  255" OS << right_justify(Str, 5) => "  foo" OS << left_justify(Str, 5) => "foo  " llvm-svn: 218463
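The padding behavior shown in the examples of r218463 can be approximated with the standard library alone. The sketch below is not the LLVM implementation: `formatHex`, `formatDecimal`, `rightJustify`, `leftJustify` and `padLeft` are hypothetical stand-ins that return `std::string`, and the rule that the hex width includes the `0x` prefix is inferred from the examples in the commit message.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Pad S on the left with Fill until it is at least Width characters long.
static std::string padLeft(const std::string &S, unsigned Width, char Fill) {
  return S.size() >= Width ? S : std::string(Width - S.size(), Fill) + S;
}

// Hex with a "0x" prefix; Width counts the prefix (inferred from the examples).
std::string formatHex(uint64_t N, unsigned Width) {
  std::ostringstream Digits;
  Digits << std::hex << N;
  unsigned DigitWidth = Width > 2 ? Width - 2 : 0;
  return "0x" + padLeft(Digits.str(), DigitWidth, '0');
}

// Decimal, right-aligned in a field of Width characters.
std::string formatDecimal(int64_t N, unsigned Width) {
  return padLeft(std::to_string(N), Width, ' ');
}

// Right-align S in a field of Width characters.
std::string rightJustify(const std::string &S, unsigned Width) {
  return padLeft(S, Width, ' ');
}

// Left-align S in a field of Width characters.
std::string leftJustify(const std::string &S, unsigned Width) {
  return S.size() >= Width ? S : S + std::string(Width - S.size(), ' ');
}
```

Because each helper takes a typed argument rather than a printf-style format string, a size or type mismatch becomes a compile-time concern, which is the motivation the commit gives.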
* Refactoring: raw pointer -> unique_ptr | Anton Yartsev | 2014-09-25 | 1 | -5/+3
| | | | llvm-svn: 218462
* ARM: Remove unneeded check for MI->hasPostISelHook() | Tom Stellard | 2014-09-25 | 1 | -6/+0
| | | | llvm-svn: 218459
* SelectionDAG: Remove #if NDEBUG from check for a post-isel hook | Tom Stellard | 2014-09-25 | 1 | -2/+0
| | | | | | | | | | | | | | The InstrEmitter will skip the check of MI.hasPostISelHook() before calling AdjustInstrPostInstrSelection() when NDEBUG is not defined. This was added in r140228, and I'm not sure if it is intentional or not, but it is a likely source for bugs, because it means with Release+Asserts builds you can forget to set the hasPostISelHook flag on TableGen definitions and AdjustInstrPostInstrSelection() will still be called. llvm-svn: 218458
* R600/SI: Add support for global atomic add | Tom Stellard | 2014-09-25 | 5 | -3/+150
| | | | llvm-svn: 218457
* Lower idempotent RMWs to fence+load | Robin Morisset | 2014-09-25 | 5 | -6/+185
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I originally tried doing this specifically for X86 in the backend in D5091, but it was rather brittle and generally running too late to be general. Furthermore, other targets may want to implement similar optimizations. So I reimplemented it at the IR-level, fitting it into AtomicExpandPass as it interacts with that pass (which could not be cleanly done before at the backend level). This optimization relies on a new target hook, which is only used by X86 for now, as the correctness of the optimization on other targets remains an open question. If it is found correct on other targets, it should be trivial to enable for them. Details of the optimization are discussed in D5091. Test Plan: make check-all + a new test Reviewers: jfb Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5422 llvm-svn: 218455
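For readers unfamiliar with the term, an idempotent RMW is an atomic read-modify-write that cannot change the stored value, such as OR-ing with 0. The C++ sketch below shows the two shapes side by side at source level. It is only an analogy: the function names are hypothetical, the choice of sequentially consistent ordering for the fence and the load is an assumption, and the actual transformation in r218455 operates on LLVM IR inside AtomicExpandPass and is enabled only for X86.

```cpp
#include <atomic>

// Idempotent RMW: OR with 0 leaves the value unchanged but still
// performs an atomic read-modify-write on the location.
int readViaIdempotentRMW(std::atomic<int> &X) {
  return X.fetch_or(0, std::memory_order_seq_cst);
}

// Rough source-level analogue of the "fence + load" form.
// Assumption: seq_cst is used for both the fence and the load.
int readViaFenceAndLoad(std::atomic<int> &X) {
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return X.load(std::memory_order_seq_cst);
}
```

Both functions return the current value and leave it unmodified. Whether the second form is a valid replacement on a given target is exactly the correctness question the commit message leaves open for targets other than X86.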
* Since the DisasmMemoryObject only operates on const data, it now only ↵ | Aaron Ballman | 2014-09-25 | 1 | -3/+3
| | | | | | accepts a const data pointer. This silences a -Wcast-qual warning. llvm-svn: 218454
* Add missing attributes to !cmp.[eq,gt,gtu] instructions. | Sid Manning | 2014-09-25 | 4 | -30/+96
| | | | | | | | These instructions do not indicate they are extendable or the number of bits in the extendable operand. Rename to match architected names. Add a testcase for the intrinsics. llvm-svn: 218453
* Add llvm_unreachables() for [ASZ]ExtUpper to X86FastISel.cpp to appease the ↵ | Daniel Sanders | 2014-09-25 | 1 | -0/+3
| | | | | | buildbots. llvm-svn: 218452
* [mips] Add CCValAssign::[ASZ]ExtUpper and CCPromoteToUpperBitsInType and ↵ | Daniel Sanders | 2014-09-25 | 6 | -14/+207
| | | | | | | | | | | | | | | | | | | | | | | | | | handle structs correctly on big-endian N32/N64 return values. Summary: The N32/N64 ABIs require that structs passed in registers are laid out such that spilling the register with 'sd' places the struct at the lowest address. For little endian this is trivial but for big-endian it requires that structs are shifted into the upper bits of the register. We also require that structs passed in registers have the 'inreg' attribute for big-endian N32/N64 to work correctly. This is because the tablegen-erated calling convention implementation only has access to the lowered form of struct arguments (one or more integers of up to 64 bits each) and is unable to determine the original type. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5286 llvm-svn: 218451
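The layout requirement in r218451 comes down to simple shift arithmetic, which the following sketch works through for a single 64-bit register. `promoteToUpperBits` and `storeBigEndian` are hypothetical helper names chosen for this example (the latter models what an `sd` store produces in memory on a big-endian target); neither is part of the MIPS backend.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Shift a struct of SizeInBits bits into the upper bits of a 64-bit register.
// Value is expected to have no bits set above SizeInBits.
uint64_t promoteToUpperBits(uint64_t Value, unsigned SizeInBits) {
  assert(SizeInBits >= 1 && SizeInBits <= 64 && "must fit in one register");
  return Value << (64 - SizeInBits);
}

// Model a big-endian 64-bit store: the most significant byte of the
// register lands at the lowest address.
std::array<uint8_t, 8> storeBigEndian(uint64_t Reg) {
  std::array<uint8_t, 8> Mem{};
  for (unsigned I = 0; I < 8; ++I)
    Mem[I] = static_cast<uint8_t>(Reg >> (56 - 8 * I));
  return Mem;
}
```

For a 3-byte struct holding the bytes AA BB CC, `promoteToUpperBits(0xAABBCC, 24)` gives `0xAABBCC0000000000`, and storing that value big-endian puts AA, BB, CC at offsets 0, 1 and 2, which is the "struct at the lowest address" property the ABI asks for.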
* Add aliases for VAND imm to VBIC ~imm | Renato Golin | 2014-09-25 | 5 | -37/+197
| | | | | | | | | | | | | On ARM NEON, VAND with immediate (16/32 bits) is an alias to VBIC ~imm with the same type size. Adding that logic to the parser, and generating VBIC instructions from VAND asm files. This patch also fixes the validation routines for NEON splat immediates which were wrong. Fixes PR20702. llvm-svn: 218450
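The alias in r218450 rests on a bitwise identity: bit-clear with an immediate computes `x & ~imm`, so AND with `imm` equals bit-clear with `~imm`. A small scalar sketch of one 16-bit lane follows; the function names are made up for illustration and model only the arithmetic, not the NEON immediate encoding rules that the parser has to validate.

```cpp
#include <cstdint>

// One 16-bit lane of a bit-clear: clear every bit of Lane that is set in Imm.
uint16_t vbic16(uint16_t Lane, uint16_t Imm) {
  return static_cast<uint16_t>(Lane & static_cast<uint16_t>(~Imm));
}

// AND with an immediate, expressed as bit-clear with the inverted immediate.
uint16_t vandViaVbic16(uint16_t Lane, uint16_t Imm) {
  return vbic16(Lane, static_cast<uint16_t>(~Imm));
}
```

Inverting the immediate twice restores it, so `vandViaVbic16(x, imm)` yields `x & imm` for every input.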
* [x86] Teach the new vector shuffle lowering to use AVX2 instructions for | Chandler Carruth | 2014-09-25 | 3 | -219/+381
| | | | | | | | | | | | | | | | | | | | | | | v4f64 and v8f32 shuffles when they are lane-crossing. We have fully general lane-crossing permutation functions in AVX2 that make this easy. Part of this also changes exactly when and how these vectors are split up when we don't have AVX2. This isn't always a win but it usually is a win, so on balance I think it's better. The primary regressions are all things that just need to be fixed anyways such as modeling when a blend can be completely accomplished via VINSERTF128, etc. Also, this highlights one of the few remaining big features: we do a really poor job of inserting elements into AVX registers efficiently. This completes almost all of the big tricks I have in mind for AVX2. The only things left that I plan to add: 1) element insertion smarts 2) palignr and other fairly specialized lowerings when they happen to apply llvm-svn: 218449
* Update my previous commit to fit 80 cols... | Sylvestre Ledru | 2014-09-25 | 1 | -1/+2
| | | | llvm-svn: 218448
* Details that -debug-only is not available when LLVM is built with ↵ | Sylvestre Ledru | 2014-09-25 | 1 | -0/+2
| | | | | | --enable-optimized llvm-svn: 218447
* [x86] Teach the new vector shuffle lowering a fancier way to lower | Chandler Carruth | 2014-09-25 | 5 | -287/+264
| | | | | | | | | | | 256-bit vectors with lane-crossing. Rather than immediately decomposing to 128-bit vectors, try flipping the 256-bit vector lanes, shuffling them and blending them together. This reduces our worst case shuffle by a pretty significant margin across the board. llvm-svn: 218446
* [Thumb2] BXJ should be undefined for v7M, v8A | Oliver Stannard | 2014-09-25 | 2 | -1/+11
| | | | | | | | The Thumb2 BXJ instruction (Branch and Exchange Jazelle) is not defined for v7M or v8A. It is defined for all other Thumb2-supporting architectures (v6T2, v7A and v7R). llvm-svn: 218445
* [x86] Fix an oversight in the v8i32 path of the new vector shuffle | Chandler Carruth | 2014-09-25 | 2 | -8/+4
| | | | | | | | | | | | | | | | lowering where it only used the mask of the low 128-bit lane rather than the entire mask. This allows the new lowering to correctly match the unpack patterns for v8i32 vectors. For reference, the reason that we check for the entire mask rather than checking the repeated mask is because the repeated masks don't abide by all of the invariants of normal masks. As a consequence, it is safer to use the full mask with functions like the generic equivalence test. llvm-svn: 218442
* [x86] Rearrange the code for v16i16 lowering a bit for clarity and to | Chandler Carruth | 2014-09-25 | 1 | -29/+18
| | | | | | | | | | | | | | | | | | | | reduce the amount of checking we do here. The first realization is that only non-crossing cases between 128-bit lanes are handled by almost the entire function. It makes more sense to handle the crossing cases first. The second is that until we actually are going to generate fancy shared lowering strategies that use the repeated semantics of the v8i16 lowering, we shouldn't waste time checking for repeated masks. It is simplest to directly test for the entire unpck masks anyways, so we gained nothing from this. This also matches the structure of v32i8 more closely. No functionality changed here. llvm-svn: 218441
* [x86] Implement AVX2 support for v32i8 in the new vector shuffle | Chandler Carruth | 2014-09-25 | 2 | -216/+112
| | | | | | | | | | lowering. This completes the basic AVX2 feature support, but there are still some improvements I'd like to do to really get the last mile of performance here. llvm-svn: 218440
* [x86] More tweaks to the v32i8 test cases. | Chandler Carruth | 2014-09-25 | 1 | -18/+52
| | | | | | | | I made a mistake in the previous commit and produced the wrong pattern. Fix that. Also make one more shuffle pattern byte-based rather than word-based, and add two more blend patterns. llvm-svn: 218439
* [x86] Re-work a bunch of the v32i8 test cases to actually involve byte | Chandler Carruth | 2014-09-25 | 1 | -188/+142
| | | | | | | | | | | | shuffles rather than word shuffles. As you might guess, these were built starting from the word shuffle test cases and I failed to properly port a bunch of them and left them as widened word shuffle test cases. We still have a couple of tests that check our ability to widen shuffles, but now we will test the actual byte shuffle quite a bit better. llvm-svn: 218438
* MC: Use @IMGREL instead of @IMGREL32, which we can't parse | Reid Kleckner | 2014-09-25 | 3 | -3/+3
| | | | | | | | | | | | Nico Rieck added support for this 32-bit COFF relocation some time ago for Win64 stuff. It appears that as an oversight, the assembly output used "foo"@IMGREL32 instead of "foo"@IMGREL, which is what we can parse. Sadly, there were actually tests that took in IMGREL and put out IMGREL32, and we didn't notice the inconsistency. Oh well. Now LLVM can assemble its own output with slightly more fidelity. llvm-svn: 218437
* [x86] Remove the defunct X86ISD::BLENDV entry -- we use vector selects | Chandler Carruth | 2014-09-25 | 2 | -4/+0
| | | | | | | | | for this now. Should prevent folks from running afoul of this and not knowing why their code won't instruction select the way I just did... llvm-svn: 218436
* [x86] Fix the v16i16 blend logic I added in the prior commit and add the | Chandler Carruth | 2014-09-25 | 2 | -3/+138
| | | | | | | | | | missing test cases for it. Unsurprisingly, without test cases, there were bugs here. Surprisingly, this bug wasn't caught at compile time. Yep, there is an X86ISD::BLENDV. It isn't wired to anything. Oops. I'll fix that next. llvm-svn: 218434
* llvm-cov: Combine segments that cover the same location | Justin Bogner | 2014-09-25 | 5 | -4/+62
| | | | | | | | If we have multiple coverage counts for the same segment, we need to add them up rather than arbitrarily choosing one. This fixes that and adds a test with template instantiations to exercise it. llvm-svn: 218432
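The idea in r218432, summing counts for segments that start at the same location instead of keeping an arbitrary one, can be sketched as follows. The `Segment` struct and `combineSegments` function are simplified, hypothetical stand-ins; the real llvm-cov data structures carry more state than a line, a column and a count.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Simplified coverage segment: a start location and an execution count.
struct Segment {
  unsigned Line;
  unsigned Col;
  uint64_t Count;
};

// Merge segments that share a start location by adding their counts.
// The result is ordered by (Line, Col).
std::vector<Segment> combineSegments(const std::vector<Segment> &Segs) {
  std::map<std::pair<unsigned, unsigned>, uint64_t> Totals;
  for (const Segment &S : Segs)
    Totals[std::make_pair(S.Line, S.Col)] += S.Count;

  std::vector<Segment> Out;
  for (const auto &KV : Totals)
    Out.push_back({KV.first.first, KV.first.second, KV.second});
  return Out;
}
```

This matches the template-instantiation case the commit mentions: two instantiations of the same source line each contribute a count, and the combined segment reports their sum.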
* [X86,AVX] Add an isel pattern for X86VBroadcast. | Akira Hatanaka | 2014-09-25 | 2 | -0/+22
| | | | | | This fixes PR21050 and rdar://problem/18434607. llvm-svn: 218431
* [x86] Implement v16i16 support with AVX2 in the new vector shuffle | Chandler Carruth | 2014-09-25 | 5 | -262/+220
| | | | | | | | | | | | | | | lowering. This also implements the fancy blend lowering for v16i16 using AVX2 and teaches the X86 backend to print shuffle masks for 256-bit PSHUFB and PBLENDW instructions. It also makes the mask decoding correct for PBLENDW instructions. The yaks, they are legion. Tests are updated accordingly. There are some missing tests for the VBLENDVB lowering, but I'll add those in a follow-up as this commit has accumulated enough cruft already. llvm-svn: 218430
* Flesh out enough of llvm-objdump’s SymbolizerSymbolLookUp() for Mach-O ↵ | Kevin Enderby | 2014-09-24 | 4 | -37/+311
| | | | | | | | | | | | | | | | | | | | | files to get the literal string “Hello world” printed as a comment on the instruction that loads the pointer to it. For now this is just for x86_64. So for object files with relocation entries it produces things like: leaq L_.str(%rip), %rax ## literal pool for: "Hello world\n" and similar for fully linked images like executables: leaq 0x4f(%rip), %rax ## literal pool for: "Hello world\n" Also to allow testing against darwin’s otool(1), I hooked up the existing -no-show-raw-insn option to the Mach-O parser code, added the new Mach-O only -full-leading-addr option to match otool(1)'s printing of addresses and also added the new -print-imm-hex option. llvm-svn: 218423
* [asan] don't instrument module CTORs that may be run before ↵ | Kostya Serebryany | 2014-09-24 | 2 | -9/+21
| | | | | | asan.module_ctor. This fixes asan running together with -coverage llvm-svn: 218421
* Removing empty ARM tests from failed revert | Renato Golin | 2014-09-24 | 2 | -0/+0
| | | | llvm-svn: 218419
* Removing empty tests from failed revert | Renato Golin | 2014-09-24 | 3 | -0/+0
| | | | llvm-svn: 218417
* Revert 218406 - Refactor the RelocVisitor::visit method | Renato Golin | 2014-09-24 | 4 | -136/+98
| | | | llvm-svn: 218416
* Revert 218407 - Add support for ARM and AArch64 BE object files | Renato Golin | 2014-09-24 | 5 | -51/+4
| | | | llvm-svn: 218415
* Revert 218408 - Report endianness in output of {dwarf, obj}dump | Renato Golin | 2014-09-24 | 5 | -47/+4
| | | | llvm-svn: 218414
* Revert 218411 - XFAIL reloc test on x86/hexagon | Renato Golin | 2014-09-24 | 1 | -1/+0
| | | | llvm-svn: 218413
* XFAIL reloc test on x86/hexagon | Renato Golin | 2014-09-24 | 1 | -0/+1
| | | | llvm-svn: 218411
* Revert r218380. This was breaking Apple internal build bots. | Akira Hatanaka | 2014-09-24 | 1 | -6/+14
| | | | llvm-svn: 218409
* Report endianness in output of {dwarf, obj}dump | Renato Golin | 2014-09-24 | 5 | -4/+47
| | | | | | | | | | For biendian targets like ARM and AArch64, it is useful to have the output of the llvm-dwarfdump and llvm-objdump report the endianness used when the object files were generated. Patch by Charlie Turner. llvm-svn: 218408
* Add support for ARM and AArch64 BE object files | Renato Golin | 2014-09-24 | 5 | -4/+51
| | | | | | | | | | | | This change fixes the ARM and AArch64 relocation visitors in RelocVisitor. They were unconditionally assuming the object data are little-endian. Tests have been added to ensure that the llvm-dwarfdump utility does not crash when processing big-endian object files. Patch by Charlie Turner. llvm-svn: 218407
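The bug class fixed in r218407, decoding object data as little-endian regardless of the file's byte order, is easy to see with a byte-wise reader. The `read32` helper below is a hypothetical illustration and is not the code used by RelocVisitor.

```cpp
#include <cstdint>

// Decode a 32-bit value from four bytes, honoring the object's byte order
// instead of assuming little-endian.
uint32_t read32(const uint8_t *P, bool IsLittleEndian) {
  if (IsLittleEndian)
    return uint32_t(P[0]) | uint32_t(P[1]) << 8 | uint32_t(P[2]) << 16 |
           uint32_t(P[3]) << 24;
  return uint32_t(P[0]) << 24 | uint32_t(P[1]) << 16 | uint32_t(P[2]) << 8 |
         uint32_t(P[3]);
}
```

For the bytes 12 34 56 78, the little-endian interpretation is `0x78563412` and the big-endian one is `0x12345678`; a visitor that always takes the first path would misread every value in a big-endian ARM or AArch64 object.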
* Refactor the RelocVisitor::visit method | Renato Golin | 2014-09-24 | 4 | -98/+136
| | | | | | | | | | | | | | | | | | This change replaces the brittle if/else chain of string comparisons with a switch statement on the detected target triple, removing the need for testing arbitrary architecture names returned from getFileFormatName, whose primary purpose seems to be for display (user-interface) purposes. The visitor now takes a reference to the object file, rather than its arbitrary file format name, to figure out whether the file is a 32 or 64-bit object file and what the detected target triple is. A set of tests has been added to help show that the refactoring processes relocations for the same targets as the original code. Patch by Charlie Turner. llvm-svn: 218406
* pass environment when invoking llvm-config from lit.cfg | Scott Douglass | 2014-09-24 | 1 | -1/+2
| | | | | | | | Use the same environment when invoking llvm-config from lit.cfg as will be used when running tests, so that ASAN_OPTIONS, INCLUDE, etc. are present. llvm-svn: 218403