path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* Disable all standard lib functions for NVVM. | Justin Lebar | 2016-01-26 | 1 | -0/+14
  Summary: NVVM doesn't have a standard library, as currently implemented, so this just isn't going to work. I'd like to revisit this, since it's hiding opportunities for optimization, but correctness comes first.
  Thank you to hfinkel for pointing me in the right direction here.
  Reviewers: tra
  Subscribers: echristo, jhen, llvm-commits, hfinkel
  Differential Revision: http://reviews.llvm.org/D16604
  llvm-svn: 258884
* Fix identify_magic() to check that a file that starts with MH_MAGIC is | Kevin Enderby | 2016-01-26 | 1 | -2/+15
  at least as big as the mach header to be identified as a Mach-O file, and make sure smaller files are not identified as Mach-O files but as unknown files. Also fix identify_magic() so it looks at all 4 bytes of the filetype field when determining the type of the Mach-O file.
  Then fix the macho-invalid-header test case to check that it is an unknown file and make sure it does not get the error for object_error::parse_failed. And also update the unit tests.
  llvm-svn: 258883
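The two checks described above can be sketched roughly as follows. This is an illustration only, not LLVM's identify_magic(): `FileKind`, `classifyBuffer`, `readLE32` and the two constants are hypothetical names, and the sketch assumes a little-endian 32-bit Mach-O header (seven 32-bit fields, filetype at byte offset 12).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch (not LLVM's file_magic enum).
enum class FileKind { Unknown, MachOObject, MachOExecutable, MachOOther };

// A 32-bit mach_header has seven 32-bit fields.
constexpr std::size_t kMachHeaderSize = 7 * 4;
// filetype follows magic, cputype and cpusubtype.
constexpr std::size_t kFileTypeOffset = 3 * 4;

// Read a little-endian 32-bit value using all four bytes.
inline std::uint32_t readLE32(const std::uint8_t *p) {
  return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
         (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

inline FileKind classifyBuffer(const std::vector<std::uint8_t> &buf) {
  // Not Mach-O at all unless the first word is MH_MAGIC (0xfeedface).
  if (buf.size() < 4 || readLE32(buf.data()) != 0xfeedfaceu)
    return FileKind::Unknown;
  // A buffer shorter than the header is reported as unknown, not as Mach-O.
  if (buf.size() < kMachHeaderSize)
    return FileKind::Unknown;
  // Inspect the whole 32-bit filetype field, not just its low byte.
  switch (readLE32(buf.data() + kFileTypeOffset)) {
  case 1: // MH_OBJECT
    return FileKind::MachOObject;
  case 2: // MH_EXECUTE
    return FileKind::MachOExecutable;
  default:
    return FileKind::MachOOther;
  }
}
```

A four-byte buffer that contains only the magic number is classified as `Unknown`, and a filetype whose upper bytes are non-zero no longer aliases a known type.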
* [GVN] Split AvailableValueInBlock into two parts [NFC] | Philip Reames | 2016-01-26 | 1 | -29/+69
  AvailableValue is the part that represents the potential rematerialization. AvailableValueInBlock is simply a pair of an AvailableValue and a BB which we might materialize it in.
  This is motivated by http://reviews.llvm.org/D16608. The intent is that we'll have a single function which handles the local case which both local and non-local will use to identify available values. Once that's done, the local case can rematerialize at the use site and the non-local case can do the SSA construction as it does currently.
  llvm-svn: 258882
* [PGO] allow pgo name collector to disable compression (for testing)/NFC | Xinliang David Li | 2016-01-26 | 1 | -2/+3
  llvm-svn: 258876
* [llvm-tblgen] Stop emitting the intrinsic name matching code | Reid Kleckner | 2016-01-26 | 1 | -17/+20
  The AMDGPU backend was the last user of the old StringMatcher recognition code. Move it over to the new lookupLLVMIntrinsicName function, which is now improved to handle all of the interesting edge cases exposed by AMDGPU intrinsic names.
  llvm-svn: 258875
* [WebAssembly] Omit no-op adds for non-mem uses of FrameIndex | Derek Schuff | 2016-01-26 | 3 | -10/+17
  Differential Revision: http://reviews.llvm.org/D16554
  llvm-svn: 258872
* Handle more edge cases in intrinsic name binary search | Reid Kleckner | 2016-01-26 | 2 | -40/+40
  I tried to make the AMDGPU intrinsic info table use this instead of another StringMatcher, and some issues arose.
  llvm-svn: 258871
* [x86] make the subtarget member a const reference, not a pointer ; NFCI | Sanjay Patel | 2016-01-26 | 2 | -731/+731
  It's passed in as a reference; it's not optional; it's not a pointer.
  llvm-svn: 258867
* [X86] Add support for zeroed shuffle elements to getShuffleScalarElt | Simon Pilgrim | 2016-01-26 | 1 | -2/+6
  Enable handling of SM_SentinelZero shuffle elements to getShuffleScalarElt.
  Improves VZEXT_LOAD matches in EltsFromConsecutiveLoads.
  llvm-svn: 258865
* Remove autoconf support | Chris Bieneman | 2016-01-26 | 128 | -2171/+0
  Summary: This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html
  "I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened." - Obi Wan Kenobi
  Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark
  Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits
  Differential Revision: http://reviews.llvm.org/D16471
  llvm-svn: 258861
* [WebAssembly] Remove check for FrameIndex operands in WebAssemblyPeephole | Derek Schuff | 2016-01-26 | 1 | -14/+9
  This pass runs after FrameIndex elimination, so it should never see FI operands. NFC
  llvm-svn: 258860
* [x86] add materializeVectorConstant() helper function; NFC | Sanjay Patel | 2016-01-26 | 1 | -15/+29
  LowerBUILD_VECTOR is still over 300 lines long, but it's a start...
  llvm-svn: 258858
* WebAssembly NFC: update error message | JF Bastien | 2016-01-26 | 1 | -1/+2
  I forgot to update this one in my previous patch.
  llvm-svn: 258853
* WebAssembly: don't optimize memcpy/memmove/memcpy to frame index | JF Bastien | 2016-01-26 | 2 | -20/+31
  r258781 optimized memcpy/memmove/memcpy so the intrinsic call can return its first argument, but missed the frame index case. Teach it to ignore that case so C code doesn't assert out in these cases.
  llvm-svn: 258851
* Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed. | Cong Hou | 2016-01-26 | 2 | -48/+113
  Currently, AnalyzeBranch() fails on non-equality comparisons between floating-point values on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this function can modify the branch by reversing the conditional jump and removing the unconditional jump if there is a proper fall-through. However, in the case of a non-equality comparison between floating-point values, this can turn the branch "unanalyzable". Consider the following case:
      jne .BB1
      jp  .BB1
      jmp .BB2
    .BB1:
      ...
    .BB2:
      ...
  AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be removed:
      jne .BB1
      jnp .BB2
    .BB1:
      ...
    .BB2:
      ...
  However, AnalyzeBranch() cannot analyze this branch anymore as there are two conditional jumps with different targets. This may disable some optimizations like block-placement: in this case the fall-through behavior is enforced even if the fall-through block is very cold, which is suboptimal.
  Actually this optimization is also done in the block-placement pass, which means we can remove this optimization from AnalyzeBranch(). However, currently X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there are no defined negation conditions for them.
  In order to reverse them, this patch defines two new CondCodes, X86::COND_E_AND_NP and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them. Here only the second conditional jump is reversed. This is valid as we only need them to do this "unconditional jump removal" optimization.
  Differential Revision: http://reviews.llvm.org/D11393
  llvm-svn: 258847
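To see why the two new condition codes act as negations of the existing compound ones, the conditions can be modelled on the ZF and PF flags. This is a simplified sketch, not the X86 backend code: the `Cond` enum, `Flags`, `holds`, `reverse` and `isExactNegation` are hypothetical names, and the pairing shown is the one implied by De Morgan's laws (assumed here to match the patch).

```cpp
// Simplified model of the four compound condition codes named in the commit.
enum class Cond { NE_OR_P, NP_OR_E, E_AND_NP, P_AND_NE };

// Only the two flags these conditions depend on.
struct Flags {
  bool ZF; // zero flag: set when the operands compare equal (or unordered)
  bool PF; // parity flag: set when the comparison is unordered
};

// Does condition c hold for the given flags?
inline bool holds(Cond c, Flags f) {
  switch (c) {
  case Cond::NE_OR_P:  return !f.ZF || f.PF;
  case Cond::NP_OR_E:  return !f.PF || f.ZF;
  case Cond::E_AND_NP: return f.ZF && !f.PF;
  case Cond::P_AND_NE: return f.PF && !f.ZF;
  }
  return false;
}

// Negation pairs: !(A || B) == (!A && !B).
inline Cond reverse(Cond c) {
  switch (c) {
  case Cond::NE_OR_P:  return Cond::E_AND_NP;
  case Cond::E_AND_NP: return Cond::NE_OR_P;
  case Cond::NP_OR_E:  return Cond::P_AND_NE;
  case Cond::P_AND_NE: return Cond::NP_OR_E;
  }
  return c;
}

// Check that reverse(c) is true exactly when c is false, for all flag values.
inline bool isExactNegation(Cond c) {
  for (int zf = 0; zf < 2; ++zf)
    for (int pf = 0; pf < 2; ++pf) {
      Flags f{zf != 0, pf != 0};
      if (holds(c, f) == holds(reverse(c), f))
        return false;
    }
  return true;
}
```

Because each "OR" condition needs two jumps to the same target, its negation is an "AND" condition, which is why the patch also has to define how to synthesize instructions for the new codes.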
* [ScheduleDAGInstrs] Simplify logic to improve readability. NFC. | Chad Rosier | 2016-01-26 | 1 | -2/+1
  The call to isInvariantLoad() already returns false for non-load instructions.
  llvm-svn: 258841
* tidy up; NFC | Sanjay Patel | 2016-01-26 | 1 | -9/+9
  llvm-svn: 258838
* [x86] simplify getOnesVector() ; NFCI | Sanjay Patel | 2016-01-26 | 1 | -20/+11
  Let DAG.getConstant() handle the splatting; there's no need to repeat that logic here.
  llvm-svn: 258833
* Fix Clang-tidy modernize-use-nullptr and modernize-use-override warnings; other minor fixes. | Eugene Zelenko | 2016-01-26 | 7 | -33/+25
  Differential revision: reviews.llvm.org/D16568
  llvm-svn: 258831
* Reassociate: Reprocess RedoInsts after each inst | Aditya Nandakumar | 2016-01-26 | 1 | -28/+39
  Previously the RedoInsts was processed at the end of the block. However it was possible that it left behind some instructions that were not canonicalized.
  This should guarantee that any previous instruction in the basic block is canonicalized before we process a new instruction.
  llvm-svn: 258830
* Update wasm target for r258819. | Benjamin Kramer | 2016-01-26 | 1 | -1/+1
  llvm-svn: 258827
* fix formatting; NFC | Sanjay Patel | 2016-01-26 | 1 | -2/+1
  llvm-svn: 258825
* Reflect the MC/MCDisassembler split on the include/ level. | Benjamin Kramer | 2016-01-26 | 21 | -25/+25
  No functional change, just moving code around.
  llvm-svn: 258818
* [LibCallSimplifier] fold memset(malloc(x), 0, x) --> calloc(1, x) | Sanjay Patel | 2016-01-26 | 1 | -0/+81
  This is a step towards solving PR25892: https://llvm.org/bugs/show_bug.cgi?id=25892
  It won't handle the reported case. As noted by the 'TODO' comments in the patch, we need to relax the hasOneUse() constraint and also match patterns that include memset_chk() and the llvm.memset() intrinsic in addition to memset().
  Differential Revision: http://reviews.llvm.org/D16337
  llvm-svn: 258816
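At the source level, the fold replaces an allocate-then-zero sequence with a single zero-initializing allocation. The sketch below only illustrates the equivalence from the caller's point of view; the helper names are hypothetical, and the explicit null check is added here so the example is well defined even if allocation fails (the matched pattern in the commit title has no such check).

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Before the fold: allocate n bytes, then zero the whole allocation.
inline void *zeroedBefore(std::size_t n) {
  void *p = std::malloc(n);
  if (!p)
    return nullptr; // guard added for this sketch only
  return std::memset(p, 0, n); // memset returns its first argument
}

// After the fold: one call that returns zero-initialized storage.
inline void *zeroedAfter(std::size_t n) {
  return std::calloc(1, n);
}

// Helper for checking that n bytes starting at p are all zero.
inline bool allZero(const void *p, std::size_t n) {
  const unsigned char *bytes = static_cast<const unsigned char *>(p);
  for (std::size_t i = 0; i < n; ++i)
    if (bytes[i] != 0)
      return false;
  return true;
}
```

Both helpers hand back a buffer of the same size with every byte zero, which is what makes the rewrite safe when the malloc result has no other use.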
* Revert "Reapply commit r258404 with fix" | Matthew Simpson | 2016-01-26 | 1 | -226/+11
  This commit exposes a crash in computeKnownBits on the Chromium buildbots. Reverting to investigate.
  Reference: https://llvm.org/bugs/show_bug.cgi?id=26307
  llvm-svn: 258812
* Re-submit r256008 "Improve DWARFDebugFrame::parse to also handle __eh_frame." | Igor Laevsky | 2016-01-26 | 3 | -19/+186
  Originally this change was causing failures on Windows buildbots. But those problems were fixed in r258806.
  llvm-svn: 258811
* [WebAssembly] Fix a typo in a comment. | Dan Gohman | 2016-01-26 | 1 | -1/+1
  llvm-svn: 258810
* [DebugInfo] Fix DWARFDebugFrame instruction operand ordering | Igor Laevsky | 2016-01-26 | 1 | -6/+14
  We can't rely on the evaluation order of function arguments.
  Differential Revision: http://reviews.llvm.org/D16509
  llvm-svn: 258806
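The underlying C++ rule is that the order in which the arguments of a function call are evaluated is not fixed by the language, so two reads from the same stream inside one argument list may happen in either order. The sketch below is a generic illustration of the fix, not the DWARFDebugFrame code; `ByteReader`, `Instruction` and `readOperandsInOrder` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Minimal sequential reader over a byte buffer.
class ByteReader {
public:
  explicit ByteReader(std::vector<std::uint8_t> bytes)
      : Bytes(std::move(bytes)) {}
  // Return the next byte and advance; throws std::out_of_range past the end.
  std::uint8_t next() { return Bytes.at(Pos++); }

private:
  std::vector<std::uint8_t> Bytes;
  std::size_t Pos = 0;
};

struct Instruction {
  std::uint8_t Op0;
  std::uint8_t Op1;
};

// Fragile form (shown only as a comment): makeInstruction(R.next(), R.next())
// could read the second operand first, depending on the compiler.
//
// Robust form: read each operand into a named local, one statement at a time,
// so the stream order is the operand order.
inline Instruction readOperandsInOrder(ByteReader &R) {
  std::uint8_t First = R.next();
  std::uint8_t Second = R.next();
  return Instruction{First, Second};
}
```

Different compilers can pick different argument orders, which is consistent with a failure that shows up on only some buildbots.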
* [X86][SSE] Add zero element and general 64-bit VZEXT_LOAD support to EltsFromConsecutiveLoads | Simon Pilgrim | 2016-01-26 | 1 | -56/+87
  This patch adds support for trailing zero elements to VZEXT_LOAD loads (and checks that no zero elts occur within the consecutive load).
  It also generalizes the 64-bit VZEXT_LOAD load matching to work for loads other than 2x32-bit loads.
  After this patch it will also be easier to add support for other basic load patterns like 32-bit VZEXT_LOAD loads, PMOVZX and subvector load insertion.
  Differential Revision: http://reviews.llvm.org/D16217
  llvm-svn: 258798
* [X86] Mark LDS/LES as not being allowed in 64-bit mode. | Craig Topper | 2016-01-26 | 1 | -4/+8
  Their opcodes are used as part of the VEX prefix in 64-bit mode. Clearly the disassembler implicitly decoded them as AVX instructions in 64-bit mode, but I think the AsmParser would have encoded them.
  llvm-svn: 258793
* AMDGPU: Move AMDGPU intrinsics only used by R600 | Matt Arsenault | 2016-01-26 | 2 | -10/+13
  llvm-svn: 258790
* AMDGPU: Tidy minor td file issues | Matt Arsenault | 2016-01-26 | 4 | -247/+249
  Make comments and indentation more consistent.
  Rearrange a few things to be in a more consistent order, such as separating subtarget features that describe an actual device property from those used as options.
  llvm-svn: 258789
* AMDGPU: Make v32i8/v64i8 illegal types | Matt Arsenault | 2016-01-26 | 4 | -21/+13
  Old intrinsics were forcing these, but they have now all been removed.
  This fixes large i8 vector operations generally being broken.
  llvm-svn: 258788
* AMDGPU: Remove old sample intrinsics | Matt Arsenault | 2016-01-26 | 4 | -61/+0
  I did my best to try to update all the uses in tests that just happened to use the old ones to the newer intrinsics.
  I'm not sure I got all of the immediate operand conversions correct, since the value seems to have been ignored by the old pattern but I don't think it really matters.
  llvm-svn: 258787
* AMDGPU: Add new amdgcn intrinsics for cube instructions | Matt Arsenault | 2016-01-26 | 2 | -5/+9
  More cleanup to try to get all intrinsics using the correct amdgcn prefix that are as close to the instruction as possible.
  llvm-svn: 258786
* AMDGPU: Implement read_register and write_register intrinsics | Matt Arsenault | 2016-01-26 | 2 | -0/+50
  Some of the special intrinsics that now correspond to an instruction also have special setting of some registers, e.g. llvm.SI.sendmsg sets m0 as well as using s_sendmsg. Using these explicit register intrinsics may be a better option.
  Reading the exec mask and others may be useful for debugging. For this I'm not sure this is entirely correct because we would want this to be convergent, although it's possible this is already treated sufficiently conservatively.
  llvm-svn: 258785
* AMDGPU: Restore AMDGPU prefixed rsq intrinsic for now | Matt Arsenault | 2016-01-26 | 4 | -6/+13
  Also move into backend intrinsics to discourage use of the old name.
  llvm-svn: 258783
* [WebAssembly] Optimize memcpy/memmove/memcpy calls. | Dan Gohman | 2016-01-26 | 3 | -41/+127
  These calls return their first argument, but because LLVM uses an intrinsic with a void return type, they can't use the returned attribute. Generalize the store results pass to optimize these calls too.
  llvm-svn: 258781
* [WebAssembly] Remove a completed entry from the README.txt. | Dan Gohman | 2016-01-26 | 1 | -4/+0
  llvm-svn: 258780
* [WebAssembly] Implement unaligned loads and stores. | Dan Gohman | 2016-01-26 | 16 | -246/+489
  Differential Revision: http://reviews.llvm.org/D16534
  llvm-svn: 258779
* [LIR] Add support for structs and hand unrolled loops | Haicheng Wu | 2016-01-26 | 3 | -124/+272
  This is a recommit of r258620 which causes PR26293.
  The original message:
  Now LIR can turn the following code into memset:
    typedef struct foo {
      int a;
      int b;
    } foo_t;

    void bar(foo_t *f, unsigned n) {
      for (unsigned i = 0; i < n; ++i) {
        f[i].a = 0;
        f[i].b = 0;
      }
    }

    void test(foo_t *f, unsigned n) {
      for (unsigned i = 0; i < n; i += 2) {
        f[i] = 0;
        f[i+1] = 0;
      }
    }
  llvm-svn: 258777
* Use binary search for intrinsic ID lookups | Reid Kleckner | 2016-01-26 | 1 | -14/+60
  This improves compile time of Function.cpp from 57s to 37s for me locally. Intrinsic IDs are cached on the Function object, so this shouldn't regress performance.
  llvm-svn: 258774
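The general technique is a binary search over a name table kept in sorted order. The following is a minimal sketch of that idea, not the code in Function.cpp: the four-entry table, `kNames` and `lookupIndex` are made up for illustration, and the sketch only does exact matches (it makes no attempt to handle the suffixes of overloaded intrinsic names).

```cpp
#include <algorithm>
#include <cstring>
#include <iterator>

// Hypothetical sample table. It must stay sorted by strcmp order for the
// binary search below to be valid.
static const char *const kNames[] = {
    "llvm.ctlz",
    "llvm.memcpy",
    "llvm.memset",
    "llvm.sqrt",
};

// Return the index of name in kNames, or -1 if it is not present.
inline int lookupIndex(const char *name) {
  auto begin = std::begin(kNames);
  auto end = std::end(kNames);
  // lower_bound finds the first entry that does not compare less than name.
  auto it = std::lower_bound(
      begin, end, name, [](const char *lhs, const char *rhs) {
        return std::strcmp(lhs, rhs) < 0;
      });
  if (it == end || std::strcmp(*it, name) != 0)
    return -1;
  return static_cast<int>(it - begin);
}
```

A lookup costs O(log n) string comparisons and the table is plain data, so the compiler does not have to process a large generated switch.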
* LiveIntervalAnalysis: Improve some comments | Matthias Braun | 2016-01-26 | 1 | -4/+4
  As recommended by Justin.
  llvm-svn: 258771
* Sort intrinsics by LLVM intrinsic name, rather than tablegen def name | Reid Kleckner | 2016-01-26 | 1 | -92/+92
  Step one towards using a simple binary search to lookup intrinsic IDs instead of our crazy table generated switch+memcmp+startswith code that makes Function.cpp take about a minute to compile. See PR24785 and PR11951 for why we should do this.
  The X86 backend contains tables that need to be sorted on intrinsic ID, so reorder those.
  llvm-svn: 258757
* LiveIntervalAnalysis: Cleanup handleMove{Down|Up}() functions, NFC | Matthias Braun | 2016-01-26 | 1 | -131/+141
  These two functions are hard to reason about. This commit makes the code more comprehensible:
  - Use four distinct variables (OldIdxIn, OldIdxOut, NewIdxIn, NewIdxOut) with a fixed value instead of a changing iterator I that points to different things during the function.
  - Remove the early explanation before the function in favor of more detailed comments inside the function. Should have more/clearer comments now stating which conditions are tested and which invariants hold at different points in the functions.
  The behaviour of the code was not changed.
  I hope that this will make it easier to review the changes in http://reviews.llvm.org/D9067 which I will adapt next.
  Differential Revision: http://reviews.llvm.org/D16379
  llvm-svn: 258756
* [MC] Use .p2align instead of .align | Dan Gohman | 2016-01-26 | 1 | -5/+2
  For historic reasons, the behavior of .align differs between targets. Fortunately, there are alternatives, .p2align and .balign, which make the interpretation of the parameter explicit, and which behave consistently across targets.
  This patch teaches MC to use .p2align instead of .align, so that people reading code for multiple architectures don't have to remember which way each platform does its .align directive.
  Differential Revision: http://reviews.llvm.org/D16549
  llvm-svn: 258750
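The difference between the two explicit directives is how the operand is read: `.p2align N` aligns to 2^N bytes, while `.balign N` aligns to N bytes. The helpers below are a hypothetical sketch of emitting either form from a byte alignment; they are not MC's asm printer code.

```cpp
#include <cassert>
#include <string>

// Log2 of a power-of-two byte alignment (1 -> 0, 2 -> 1, 16 -> 4, ...).
inline unsigned log2Alignment(unsigned bytes) {
  assert(bytes != 0 && (bytes & (bytes - 1)) == 0 &&
         "alignment must be a power of two");
  unsigned log = 0;
  while ((1u << log) < bytes)
    ++log;
  return log;
}

// ".p2align N": the operand is the exponent, so 16 bytes is written as 4.
inline std::string p2alignDirective(unsigned bytes) {
  return ".p2align " + std::to_string(log2Alignment(bytes));
}

// ".balign N": the operand is the byte count itself.
inline std::string balignDirective(unsigned bytes) {
  return ".balign " + std::to_string(bytes);
}
```

A bare `.align 4` is ambiguous in exactly this way: depending on the target it can mean 4 bytes or 16 bytes, whereas both forms above have a single reading.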
* [GVN] Rearrange code to make local vs non-local cases more obvious [NFCI] | Philip Reames | 2016-01-25 | 1 | -13/+18
  llvm-svn: 258747
* [cfi] Cross-DSO CFI diagnostic mode (LLVM part). | Evgeniy Stepanov | 2016-01-25 | 1 | -15/+14
  * __cfi_check gets a 3rd argument: ubsan handler data
  * Instead of trapping on failure, call __cfi_check_fail which must be present in the module (generated in the frontend).
  llvm-svn: 258746
* [GVN] Factor out common code [NFCI] | Philip Reames | 2016-01-25 | 1 | -40/+21
  We had the same code duplicated for each type of Def. We also have the entire block duplicated between the local and non-local case, but let's start with local cleanup.
  llvm-svn: 258740
* X86ISelLowering: Fix cmov(cmov) special lowering bug | Matthias Braun | 2016-01-25 | 1 | -1/+2
  There's a special case in EmitLoweredSelect() that produces an improved lowering for cmov(cmov) patterns. However this special lowering is currently broken if the inner cmov has multiple users so this patch stops using it in this case.
  If you wonder why this wasn't fixed by continuing to use the special lowering and inserting a 2nd PHI for the inner cmov: I believe this would incur additional copies/register pressure so the special lowering does not improve upon the normal one anymore in this case.
  This fixes http://llvm.org/PR26256 (= rdar://24329747)
  llvm-svn: 258729