summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* R600/SI: Use s_movk_i32Matt Arsenault2014-11-133-2/+17
| | | | llvm-svn: 221922
* R600/SI: Fix definition for s_cselect_b32Matt Arsenault2014-11-132-3/+7
| | | | | | | | | | | These were directly using the old base instruction class, and specifying the wrong register classes for operands. The operands can be the other special inputs besides SGPRs. The op name was also being directly used for the asm string, so this was printed without any operands. llvm-svn: 221921
* R600: Fix assert on empty functionMatt Arsenault2014-11-131-1/+0
| | | | | | | | If a function is just an unreachable, this would hit a "this is not a MachO target" assertion because of setting HasSubsectionViaSymbols. llvm-svn: 221920
* Un-break the big-endian buildbotsRui Ueyama2014-11-131-2/+2
| | | | llvm-svn: 221919
* R600: Error on initializer for LDS.Matt Arsenault2014-11-131-2/+21
| | | | | | Also give a proper error for other address spaces. llvm-svn: 221917
* R600/SI: Get rid of FCLAMP_SI pseudoMatt Arsenault2014-11-134-25/+16
| | | | | | | It's not necessary. Also use complex patterns to allow src modifier usage. llvm-svn: 221916
* Object, Mach-O: Refactor and clean code upDavid Majnemer2014-11-131-12/+25
| | | | | | | Don't assert if we can return an error code, reuse existing functionality like is64Bit(). llvm-svn: 221915
* R600/SI: Allow commuting with src2_modifiersMatt Arsenault2014-11-131-5/+0
| | | | llvm-svn: 221911
* R600/SI: Allow commuting some 3 op instructionsMatt Arsenault2014-11-131-3/+27
| | | | | | | | | | | | | e.g. v_mad_f32 a, b, c -> v_mad_f32 b, a, c This simplifies matching v_madmk_f32. This looks somewhat surprising, but it appears to be OK to do this. We can commute src0 and src1 in all of these instructions, and that's all that appears to matter. llvm-svn: 221910
* ARM: allow constpool entry to be moved to the user's block in all cases.Tim Northover2014-11-131-1/+7
| | | | | | | | | | | | | | | Normally entries can only move to a lower address, but when that wasn't viable, the user's block was considered anyway. Unfortunately, it went via createNewWater which wasn't designed to handle the case where there's already an island after the block. Unfortunately, the test we have is slow and fragile, and I couldn't reduce it to anything sane even with the @llvm.arm.space intrinsic. The test change here is recreating the previous one after the change. rdar://problem/18545506 llvm-svn: 221905
* ARM: avoid duplicating branches during constant islands.Tim Northover2014-11-131-6/+10
| | | | | | | | | We were using a naive heuristic to determine whether a basic block already had an unconditional branch at the end. This mostly corresponded to reality (assuming branches got optimised) because there's not much point in a branch to the next block, but could go wrong. llvm-svn: 221904
* ARM: add @llvm.arm.space intrinsic for testing ConstantIslands.Tim Northover2014-11-133-0/+10
| | | | | | | | Creating tests for the ConstantIslands pass is very difficult, since it depends on precise layout details. Having the ability to precisely inject a number of bytes into the stream helps greatly. llvm-svn: 221903
* Fix a regression on the disassembling C API.Rafael Espindola2014-11-131-1/+1
| | | | | | | | | The fix is easy. Unfortunately, we had 0 tests, so adding one was somewhat complicated. Thanks to Kevin Enderby for the report. llvm-svn: 221899
* [Hexagon]Colin LeMahieu2014-11-131-4/+4
| | | | | | NFC Renaming reserved identifier. llvm-svn: 221898
* [Reassociate] Update comment. NFC.Chad Rosier2014-11-131-1/+1
| | | | llvm-svn: 221894
* Fixing -Wtype-limits warnings with the asserts (the expression would always ↵Aaron Ballman2014-11-131-3/+3
| | | | | | evaluate to true). Also fixing a -Wcast-qual warning, where the cast expression isn't required. llvm-svn: 221888
* IR: Create the Metadata classDuncan P. N. Exon Smith2014-11-131-2/+2
| | | | | | | | | This will become the root of a new class hierarchy separate from `Value`. As a first step, stick it between `Value` and `MDNode`. This is part of PR21532. llvm-svn: 221886
* AVX-512: SINT_TO_FP cost model and some bugfixesElena Demikhovsky2014-11-132-4/+25
| | | | | | | Checked some corner cases, for example translation of <8 x i1> to <8 x double> llvm-svn: 221883
* Object, COFF: Refactor code to get relocation iteratorsDavid Majnemer2014-11-131-26/+24
| | | | | | No functional change intended. llvm-svn: 221880
* This patch changes the ownership of TLOF from TargetLoweringBase to ↵Aditya Nandakumar2014-11-1337-63/+136
| | | | | | TargetMachine so that different subtargets could share the TLOF effectively llvm-svn: 221878
* Revert r219432 - "Revert "[BasicAA] Revert "Revert r218714 - Make better use ↵Hal Finkel2014-11-131-5/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | of zext and sign information.""" Let's try this again... This reverts r219432, plus a bug fix. Description of the bug in r219432 (by Nick): The bug was using AllPositive to break out of the loop; if the loop break condition i != e is changed to i != e && AllPositive then the test_modulo_analysis_with_global test I've added will fail as the Modulo will be calculated incorrectly (as the last loop iteration is skipped, so Modulo isn't updated with its Scale). Nick also adds this comment: ComputeSignBit is safe to use in loops as it takes into account phi nodes, and the == EK_ZeroEx check is safe in loops as, no matter how the variable changes between iterations, zero-extensions will always guarantee a zero sign bit. The isValueEqualInPotentialCycles check is therefore definitely not needed as all the variable analysis holds no matter how the variables change between loop iterations. And this patch also adds another enhancement to GetLinearExpression - basically to convert ConstantInts to Offsets (see test_const_eval and test_const_eval_scaled for the situations this improves). Original commit message: This reverts r218944, which reverted r218714, plus a bug fix. Description of the bug in r218714 (by Nick): The original patch forgot to check if the Scale in VariableGEPIndex flipped the sign of the variable. The BasicAA pass iterates over the instructions in the order they appear in the function, and so BasicAliasAnalysis::aliasGEP is called with the variable it first comes across as parameter GEP1. Adding a %reorder label puts the definition of %a after %b so aliasGEP is called with %b as the first parameter and %a as the second. aliasGEP later calculates that %a == %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) - ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly conclude that %a > %b. Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug. Slightly modified by me to add an early exit from the loop and avoid unnecessary, but expensive, function calls. Original commit message: Two related things: 1. Fixes a bug when calculating the offset in GetLinearExpression. The code previously used zext to extend the offset, so negative offsets were converted to large positive ones. 2. Enhance aliasGEP to deduce that, if the difference between two GEP allocations is positive and all the variables that govern the offset are also positive (i.e. the offset is strictly after the higher base pointer), then locations that fit in the gap between the two base pointers are NoAlias. Patch by Nick White! llvm-svn: 221876
* Object, COFF: Increase code reuseDavid Majnemer2014-11-131-24/+32
| | | | | | | | | | Split getObject's smarts into checkOffset, use this to replace the handwritten check in getSectionContents. Similarly, replace checks in section_rel_begin/section_rel_end with getNumberOfRelocations. No functionality change intended. llvm-svn: 221873
* Object, COFF: getRelocationSymbol shouldn't assertDavid Majnemer2014-11-131-1/+1
| | | | | | | | lib/Object is supposed to be robust to malformed object files. Don't assert if we don't have a symbol table. I'll try to come up with a test case later. llvm-svn: 221870
* Object, COFF: Cleanup some code in getSectionNameDavid Majnemer2014-11-131-2/+2
| | | | | | | Use StringRef::startswith to tidy up some code, no functionality change intended. llvm-svn: 221869
* Object, COFF: Fix some theoretical bugsDavid Majnemer2014-11-131-3/+14
| | | | | | | | getObject didn't consider the case where a pointer came before the start of the object file. No test is included, trying to come up with something reasonable. llvm-svn: 221868
* Read 64 bits at a time in the bitcode reader.Rafael Espindola2014-11-131-4/+3
| | | | | | | The reading of 64 bit values could still be optimized, but at least this cuts down on the number of virtual calls to fetch more data. llvm-svn: 221865
* [x86] Teach the vector shuffle lowering to make a more nuanced decisionChandler Carruth2014-11-131-12/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | between splitting a vector into 128-bit lanes and recombining them vs. decomposing things into single-input shuffles and a final blend. This handles a large number of cases in AVX1 where the cross-lane shuffles would be much more expensive to represent even though we end up with a fast blend at the root. Instead, we can do a better job of shuffling in a single lane and then inserting it into the other lanes. This fixes the remaining bits of Halide's regression captured in PR21281 for AVX1. However, the bug persists in AVX2 because I've made this change reasonably conservative. The cases where it makes sense in AVX2 to split into 128-bit lanes are much more rare because we can often do full permutations across all elements of the 256-bit vector. However, the particular test case in PR21281 is an example of one of the rare cases where it is *always* better to work in a single 128-bit lane. I'm going to try to teach the logic to detect and form the good code even in AVX2 next, but it will need to use a separate heuristic. Finally, there is one pesky regression here where we previously would craftily use vpermilps in AVX1 to shuffle both high and low halves at the same time. We no longer pull that off, and not for any really good reason. Ultimately, I think this is just another missing nuance to the selection heuristic that I'll try to add in afterward, but this change already seems strictly worth doing considering the magnitude of the improvements in common matrix math shuffle patterns. As always, please let me know if this causes a surprising regression for you. llvm-svn: 221861
* llvm-readobj: Print out address table when dumping COFF delay-import tableRui Ueyama2014-11-131-0/+14
| | | | llvm-svn: 221855
* Add an assert and a test that verify r221709's fix.Frederic Riss2014-11-131-2/+4
| | | | llvm-svn: 221854
* [x86] Don't form overly fragmented blends when splitting andChandler Carruth2014-11-131-2/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | re-combining shuffles because nothing was available in the wider vector type. The key observation (which I've put in the comments for future maintainers) is that at this point, no further combining is really possible. And so even though these shuffles trivially could be combined, we need to actually do that as we produce them when producing them this late in the lowering. This fixes another (huge) part of the Halide vector shuffle regressions. As it happens, this was already well covered by the tests, but I hadn't noticed how bad some of these got. The specific patterns that turn directly into unpckl/h patterns were occurring *many* times in common vector processing code. There are still more problems here sadly, but trying to incrementally tease them apart and it looks like this is the core of the problem in the splitting logic. There is some chance of regression here, you can see it in the test changes. Specifically, where we stop forming pshufb in some cases, it is possible that pshufb was in fact faster. Intel "says" that pshufb is slower than the instruction sequences replacing it. llvm-svn: 221852
* [CodeGenPrepare] Handle zero extensions in the TypePromotionHelper.Quentin Colombet2014-11-131-111/+143
| | | | | | | | | | | | | | | | | | | Prior to this patch the TypePromotionHelper was promoting only sign extensions. Supporting zero extensions changes: - How constants are extended. - How sign extensions, zero extensions, and truncate are composed together. - How the type of the extended operation is recorded. Now we need to know the kind of the extension as well as its type. Each change is fairly small, unlike the diff. Most of the diff are comments/variable renaming to say "extension" instead of "sign extension". The performance improvements on the test suite are within the noise. Related to <rdar://problem/18310086>. llvm-svn: 221851
* [FastISel][AArch64] Optimize select when one of the operands is a 'true' or ↵Juergen Ributzka2014-11-131-0/+61
| | | | | | | | | | | 'false' value. Optimize selects of i1 in the presence of 'true' and 'false' operands to simple logic operations. This fixes rdar://problem/18960150. llvm-svn: 221848
* [FastISel][AArch64] Fold the cmp into the select when possible.Juergen Ributzka2014-11-131-0/+54
| | | | | | | | | This folds the compare emission into the select emission when possible, so we can directly use the flags and don't have to emit a separate compare. Related to rdar://problem/18960150. llvm-svn: 221847
* [FastISel][AArch64] Extend 'select' lowering to support also i1 to i16.Juergen Ributzka2014-11-131-34/+46
| | | | | | Related to rdar://problem/18960150. llvm-svn: 221846
* Revert "[dwarfdump] Add support for dumping accelerator tables."Frederic Riss2014-11-135-194/+0
| | | | | | | | | | This reverts commit r221836. The tests are asserting on some buildbots. This also reverts the test part of r221837 as it relies on dwarfdump dumping the accelerator tables. llvm-svn: 221842
* Improve long path name support on Windows.Paul Robinson2014-11-131-38/+66
| | | | | | | | | | Windows normally limits the length of an absolute path name to 260 characters; directories can have lower limits. These limits increase to about 32K if you use absolute paths with the special '\\?\' prefix. Teach Support\Windows\Path.inc to use that prefix as needed. TODO: Other parts of Support could also learn to use this prefix. llvm-svn: 221841
* Teach ScalarEvolution to sharpen range information.Sanjoy Das2014-11-131-0/+60
| | | | | | | | | | | | | | | | | | If x is known to have the range [a, b), in a loop predicated by (icmp ne x, a) its range can be sharpened to [a + 1, b). Get ScalarEvolution and hence IndVars to exploit this fact. This change triggers an optimization to widen-loop-comp.ll, so it had to be edited to get it to pass. This change was originally landed in r219834 but had a bug and broke ASan. It was reverted in r219878, and is now being re-landed after fixing the original bug. phabricator: http://reviews.llvm.org/D5639 reviewed by: atrick llvm-svn: 221839
* Fix emission of Dwarf accelerator table when there are multiple CUs.Frederic Riss2014-11-123-7/+10
| | | | | | | | The DIE offset in the accel tables is an offset relative to the start of the debug_info section, but we were encoding the offset to the start of the containing CU. llvm-svn: 221837
* [dwarfdump] Add support for dumping accelerator tables.Frederic Riss2014-11-125-0/+194
| | | | | | | The class used for the dump only allows to dump for the moment, but it can (and will) be easily extended to support search also. llvm-svn: 221836
* Allow DWARFFormValue::extractValue to be called with a null CU.Frederic Riss2014-11-121-15/+17
| | | | | | | | | | | | | | | Currently FormValues are only used for attributes of DIEs and thus uers always have a CU lying around when calling into the FormValue API. Accelerator tables encode their information using the same Forms as the attributes, thus it is natural to use DWARFFormValue to extract/dump them. There is no CU in that case though. Allow the API to be called with a null CU arguemnt by making the RelocMap lookup conditional on the CU pointer validity. And document this new behvior in the header. (Test coverage for this use of the API comes in the DwarfAccelTable support patch) llvm-svn: 221835
* Remove unsused variables.Frederic Riss2014-11-121-2/+0
| | | | llvm-svn: 221834
* [CodeGenPrepare] Replace other uses of EVT::getEVT with TL::getValueType.Ahmed Bougacha2014-11-121-5/+5
| | | | | | | | | | | | | | | | | | | | | r221820 fixed a problem (PR21548) where an iPTR was used in TLI legality checks, which isn't valid and resulted in a failed assertion. The solution was to lower pointer types into the correct target's VT, by using TL::getValueType instead of EVT::getEVT. This commit changes 3 other uses of EVT::getEVT, but without any tests: - One of these non-lowered EVTs is passed to allowsMisalignedMemoryAccesses, which goes into target's TL implementation and doesn't cause any problem (yet.) - Two others are passed to TLI.isOperationLegalOrCustom: - one only looks at extensions, so doesn't concern pointers. - one only looks at binary operators, so also isn't a problem. The latter might some day be exposed to pointers and cause the same assert as the original PR, because there's a comment hinting at also supporting cast ops. For consistency, update all of them and be done with it. llvm-svn: 221827
* [CodeGenPrepare][AArch64] Fix a TLI legality check on iPTR to use a lowered ↵Ahmed Bougacha2014-11-121-2/+2
| | | | | | | | instead. Fixes PR21548. Related to PR20474. llvm-svn: 221820
* Expose the number of Newton-Raphson iterations applied to the hardware's ↵Sanjay Patel2014-11-121-3/+7
| | | | | | | | | | | | reciprocal estimate as a parameter (x86). This is a follow-on to r221706 and r221731 and discussed in more detail in PR21385. This patch also loosens the testcase checking for btver2. We know that the "1.0" will be loaded, but we can't tell exactly when, so replace the CHECK-NEXT specifiers with plain CHECKs. The CHECK-NEXT sequence relied on a quirk of post-RA-scheduling that may change independently of anything in these tests. llvm-svn: 221819
* Add fortified (__*_chk) library functions to TLI (NFC)Ahmed Bougacha2014-11-122-17/+15
| | | | | | | | | | | | One of them (__memcpy_chk) was already there, the others were checked by comparing function names. Note that the fortified libfuncs are now part of TLI, but are always available, because they aren't generated, only optimized into the non-checking versions. Differential Revision: http://reviews.llvm.org/D6179 llvm-svn: 221817
* Temporary fix for PR21528 - use mangled C++ function names in COFF debug ↵Timur Iskhodzhanov2014-11-121-1/+8
| | | | | | info to un-break ASan on Windows llvm-svn: 221813
* [COFF] Make it clearer that the symbols subsection holds function display ↵Timur Iskhodzhanov2014-11-121-1/+1
| | | | | | name rather than just name llvm-svn: 221812
* [AVX512] Add integer shift by immediate intrinsics.Cameron McInally2014-11-122-1/+12
| | | | llvm-svn: 221811
* Changing a StringRef::begin() call into StringRef::data(); NFC.Aaron Ballman2014-11-121-1/+1
| | | | llvm-svn: 221808
* Use the return of readBytes to find out if we are at the end of the stream.Rafael Espindola2014-11-121-12/+0
| | | | | | | This allows the removal of isObjectEnd and opens the way for reading 64 bits at a time. llvm-svn: 221804
OpenPOWER on IntegriCloud