path: root/llvm
Commit message, Author, Age, Files, Lines
...
* AMDGPU/SI: Improve register allocation hints for sopk instructionsTom Stellard2016-08-292-2/+3
| | | | | | | | | | | | | | | | | | | Summary: For shrinking SOPK instructions, we were creating a hint to tell the register allocator to use the register allocated for src0 for the dst operand as well. However, this seems to not work sometimes depending on the order virtual registers are assigned physical registers. To fix this, I've added a second allocation hint which does the reverse, asks that the register allocated for dst is used for src0. Reviewers: arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23862 llvm-svn: 279968
* Use the correct ctor/dtor section for dynamic-no-pic.Rafael Espindola2016-08-292-1/+5
| | | | llvm-svn: 279967
* Mark test as XFAIL instead of disabling it everywhere.Benjamin Kramer2016-08-291-2/+2
| | | | | | | There is no lit feature 'X86' so this test is just disabled completely. Make it XFAIL until a solution is found. llvm-svn: 279966
* Move code only used by codegen out of MC. NFC.Rafael Espindola2016-08-295-51/+64
| | | | | | MC itself never needs to know about these sections. llvm-svn: 279965
* Fix -Wunused-but-set-variable warning.Haojian Wu2016-08-291-4/+0
| | | | | | | | | | | | Summary: A follow-up fix on r279958. Reviewers: bkramer Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23989 llvm-svn: 279964
* AMDGPU/SI: Query AA, if available, in areMemAccessesTriviallyDisjoint()Tom Stellard2016-08-291-0/+11
| | | | | | | | | | | | | | Summary: The SILoadStoreOptimizer will need to use AliasAnalysis here in order to move it before scheduling. Reviewers: arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23813 llvm-svn: 279963
* Fixed a bug in type legalizer for masked gather.Igor Breger2016-08-292-1/+42
| | | | | | | The problem occurs when the node isn't updated in place and UpdateNodeOperation() returns a node that already exists. In this case an assert fails in PromoteIntegerOperand(), as N has 2 results (val + chain). Differential Revision: http://reviews.llvm.org/D23756 llvm-svn: 279961
* [AVX512] In some cases KORTEST instruction may be used instead of ZEXT + ↵Igor Breger2016-08-297-728/+296
| | | | | | | | TEST sequence. Differential Revision: http://reviews.llvm.org/D23490 llvm-svn: 279960
* [InstructionSelect] NumBlocks isn't defined in DEBUG build.Haojian Wu2016-08-291-1/+1
| | | | | | | | | | | | Summary: A follow-up fix for http://llvm.org/viewvc/llvm-project?view=revision&revision=279905. Reviewers: bkramer Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23985 llvm-svn: 279959
* [X86] Don't lower FABS/FNEG masking directly to a ConstantPool load. Just ↵Craig Topper2016-08-298-82/+192
| | | | | | create a ConstantFPSDNode and let that be lowered. This allows broadcast loads to be used when available. llvm-svn: 279958
* [AVX-512] Always use v8i64 when converting 512-bit FAND/FOR/FXOR/FANDN to ↵Craig Topper2016-08-291-5/+3
| | | | | | integer operations when DQI isn't supported. This is consistent with the recent changes to promote logical operations to i64 vectors. llvm-svn: 279957
* [AVX-512] Add 512-bit fabs tests with and without AVX512DQ.Craig Topper2016-08-291-4/+84
| | | | llvm-svn: 279956
* [Orc] Simplify LogicalDylib and move it back inside CompileOnDemandLayer. AlsoLang Hames2016-08-296-324/+158
| | | | | | | | | | | | | | | | | | | | switch to using one indirect stub manager per logical dylib rather than one per input module. LogicalDylib is a helper class used by the CompileOnDemandLayer to manage symbol resolution between modules during lazy compilation. In particular, it ensures that internal symbols resolve correctly even in the case where multiple input modules contain the same internal symbol name (which must be promoted to external hidden linkage so that functions in any given module can be split out by lazy compilation). LogicalDylib's resolution scheme (before this commit) required one stub manager per input module. This made recompilation of functions (by adding a module containing a new definition) difficult, as the stub manager for any given symbol was bound to the module that supplied the original definition. By using one stub manager for the whole logical dylib, symbols can be more easily replaced, although support for doing this is not included in this patch (it will be implemented in a follow-up). llvm-svn: 279952
* [AVX-512] Add support for selecting 512-bit VPABSB/VPABSW when BWI is available.Craig Topper2016-08-283-10/+21
| | | | llvm-svn: 279951
* [AVX-512] Add patterns for selecting 128/256-bit EVEX VPABS instructions.Craig Topper2016-08-282-2/+37
| | | | llvm-svn: 279950
* [AVX-512] Add testcases showing that we don't emit 512-bit vpabsb/vpabsw. ↵Craig Topper2016-08-281-5/+155
| | | | | | Will be fixed in a future commit. llvm-svn: 279949
* Fix some typos in the docSylvestre Ledru2016-08-287-7/+7
| | | | llvm-svn: 279943
* [x86] add tests for <3 x N> vector types (PR29114)Sanjay Patel2016-08-281-0/+40
| | | | llvm-svn: 279939
* [InstCombine] use m_APInt to allow icmp (and X, Y), C folds for splat ↵Sanjay Patel2016-08-285-50/+42
| | | | | | constant vectors llvm-svn: 279937
* [X86][AVX512] Only combine EVEX targets shuffles to shuffles of the same ↵Simon Pilgrim2016-08-282-8/+20
| | | | | | | | number of vector elements Overeager combining prevents the correct folding of writemasks. At the moment this occurs for ALL EVEX shuffles; in the future we need to check that the user of the root shuffle is a VSELECT that can fold to a writemask. llvm-svn: 279934
* [PowerPC] Implement lowering for atomicrmw min/max/umin/umaxHal Finkel2016-08-285-5/+587
| | | | | | Implement lowering for atomicrmw min/max/umin/umax. Fixes PR28818. llvm-svn: 279933
* [Loop Vectorizer] Fixed memory conflict checks.Elena Demikhovsky2016-08-288-30/+109
| | | | | | | Fixed a bug in run-time checks for possible memory conflicts inside a loop. The bug is in the Low <-> High boundary calculation. The High boundary should be calculated as "last memory access pointer + element size". Differential revision: https://reviews.llvm.org/D23176 llvm-svn: 279930
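The boundary rule quoted in this commit message can be illustrated with a small, self-contained sketch. It is not the vectorizer's code: `AccessRange`, `computeRange`, and `mayConflict` are hypothetical names, and the sketch assumes a non-negative stride and at least one access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical half-open byte range [Low, High) touched by a strided access.
struct AccessRange {
    std::uintptr_t Low;
    std::uintptr_t High;
};

// Assumption: Count >= 1 and the stride is non-negative, so the first
// access is the lowest address and the last access is the highest.
AccessRange computeRange(std::uintptr_t Base, std::size_t Count,
                         std::size_t StrideBytes, std::size_t ElemSize) {
    assert(Count >= 1 && "need at least one access");
    std::uintptr_t Last = Base + (Count - 1) * StrideBytes;
    // High is "last memory access pointer + element size": the bytes of the
    // final element belong to the range, so stopping at Last would be too short.
    return {Base, Last + ElemSize};
}

// Two half-open ranges may conflict exactly when they overlap.
bool mayConflict(const AccessRange &A, const AccessRange &B) {
    return A.Low < B.High && B.Low < A.High;
}
```

If High were taken to be the last access pointer itself, a second access that only touches the final element's bytes would be reported as disjoint, which is the kind of missed conflict the commit message describes.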
* [AVX-512] Promote AND/OR/XOR to v2i64/v4i64/v8i64 even when we have ↵Craig Topper2016-08-288-56/+177
| | | | | | | | | | AVX512F/AVX512VL. Previously we weren't creating masked logical operations if bitcasts appeared between the logic operation and the select. The IR optimizers can move bitcasts across logic operations and create these cases. To minimize the number of cases we need to handle, this change promotes all logic ops to an i64 vector type just like when only SSE or AVX is available. Unfortunately, this also has the consequence of making it difficult to select unmasked VPANDD/VPORD/VPXORD in all the cases it was previously used. This is the cause of most of the test change. This shouldn't result in any functional change though. llvm-svn: 279929
* [AVX-512] Add tests to show that we don't select masked logic ops if there ↵Craig Topper2016-08-281-0/+51
| | | | | | | | are bitcasts between the logic op and the select. This is taken from optimized IR of clang test cases for masked logic ops. llvm-svn: 279928
* [X86] Rename PABSB/D/W instructions to be consistent with SSE/AVX ↵Craig Topper2016-08-282-40/+40
| | | | | | instructions instead of ending 128/256. NFC llvm-svn: 279927
* AMDGPU/R600: Enable Load combineJan Vesely2016-08-277-135/+1952
| | | | | | | | Fix and improve tests Differential Revision: https://reviews.llvm.org/D23899 llvm-svn: 279925
* [X86] Rename predicate function that detects if requires one of the REX.B, ↵Craig Topper2016-08-271-15/+16
| | | | | | REX.X or REX.R bits. Its old name conflicted with a function in the X86II namespace that doesn't quite do the same thing. NFC llvm-svn: 279924
* [X86] Keep looping over operands looking for byte registers even if we ↵Craig Topper2016-08-271-5/+4
| | | | | | already found a register that requires a REX prefix. Otherwise we don't error if a high byte register is used after SPL/BPL/DIL/SIL. llvm-svn: 279923
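The control-flow point in this commit message can be sketched as follows. This is an illustrative model only, not the X86 code emitter: the `Reg` enum and the helpers `needsRex`, `isHighByte`, and `isEncodable` are hypothetical. It relies on the x86-64 encoding rule that AH/BH/CH/DH cannot be encoded in an instruction that carries a REX prefix, while SPL/BPL/SIL/DIL require one.

```cpp
#include <vector>

// Hypothetical register model, limited to the registers needed for the example.
enum class Reg { AL, AH, BH, CH, DH, SPL, BPL, SIL, DIL };

// SPL/BPL/SIL/DIL are only addressable when a REX prefix is present.
bool needsRex(Reg R) {
    return R == Reg::SPL || R == Reg::BPL || R == Reg::SIL || R == Reg::DIL;
}

// AH/BH/CH/DH cannot be encoded once a REX prefix is present.
bool isHighByte(Reg R) {
    return R == Reg::AH || R == Reg::BH || R == Reg::CH || R == Reg::DH;
}

// Scan every operand with no early exit. Stopping at the first REX-requiring
// register would miss a high-byte register that appears later in the list.
bool isEncodable(const std::vector<Reg> &Ops) {
    bool UsesRex = false;
    bool UsesHighByte = false;
    for (Reg R : Ops) {
        UsesRex |= needsRex(R);
        UsesHighByte |= isHighByte(R);
    }
    return !(UsesRex && UsesHighByte);
}
```

With this shape, an operand list such as `{Reg::SPL, Reg::AH}` is rejected regardless of operand order.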
* [X86] Include XMM/YMM/ZMM16-23 in X86II::isX86_64ExtendedReg. This feels ↵Craig Topper2016-08-272-8/+4
| | | | | | more consistent with its name and simplifies assembler code. llvm-svn: 279922
* [X86] Don't allow DR8-DR15 to be assembled in 32-bit mode. Add missing test ↵Craig Topper2016-08-272-0/+8
| | | | | | for CR8-CR15. llvm-svn: 279921
* [X86] Remove stale comment about FixupBWInsts pass being off by default. NFCCraig Topper2016-08-271-2/+0
| | | | llvm-svn: 279915
* [AVX-512] Allow EVEX encoding unordered/ordered/equal/notequal ↵Craig Topper2016-08-272-8/+28
| | | | | | VCMPPS/PD/SS/SD to be commuted just like the SSE and AVX counterparts. llvm-svn: 279914
* [X86] Enable FR32/FR64 cmpeq/cmpne/cmpunord/cmpord to be commuted.Craig Topper2016-08-273-4/+13
| | | | llvm-svn: 279913
* [AVX-512] Add load folding for EVEX vcmpps/pd/ss/sd.Craig Topper2016-08-273-0/+62
| | | | llvm-svn: 279912
* [LTO] Don't create a new common unless merged has different sizeTeresa Johnson2016-08-273-7/+8
| | | | | | | | | | | | | | | | | Summary: This addresses a regression in common handling from the new LTO API in r278338. Only create a new common if the size is different. The type comparison against an array type fails when the size is different but not an array. GlobalMerge does not handle the array types as well and we lose some global merging opportunities. Reviewers: mehdi_amini Subscribers: junbuml, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23955 llvm-svn: 279911
* AMDGPU: Mark sched model completeMatt Arsenault2016-08-271-1/+1
| | | | | | Fixes bug 26800 llvm-svn: 279910
* AMDGPU: Remove unneeded implicit exec uses/defsMatt Arsenault2016-08-272-40/+48
| | | | | | | SI_BREAK, SI_IF_BREAK, and SI_ELSE_BREAK do not def exec. SI_IF_BREAK and SI_ELSE_BREAK do not read it either. llvm-svn: 279909
* [Orc] Explicitly specify type for assignment.Lang Hames2016-08-271-3/+3
| | | | | | | This should fix the MSVC errors in http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/15120 llvm-svn: 279908
* GVN-hoist: invalidate MD cache (PR29144)Sebastian Pop2016-08-271-0/+2
| | | | | | | | | Without invalidating the entries in the MD cache we would try to access instructions that were removed in previous iterations of hoisting. Differential Revision: https://reviews.llvm.org/D23927 llvm-svn: 279907
* [RegBankSelect] Do not abort when the target wants to fall back.Quentin Colombet2016-08-272-20/+59
| | | | llvm-svn: 279906
* [InstructionSelect] Do not abort when the target wants to fall back.Quentin Colombet2016-08-272-7/+30
| | | | llvm-svn: 279905
* [MachineLegalize] Do not abort when the target wants to fall back.Quentin Colombet2016-08-273-6/+28
| | | | llvm-svn: 279904
* AMDGPU: Select mulhi 24-bit instructionsMatt Arsenault2016-08-279-57/+456
| | | | llvm-svn: 279902
* AMDGPU: Move cndmask pseudo to be isel pseudoMatt Arsenault2016-08-275-41/+49
| | | | | | | | There's only one use of this for the convenience of a pattern. I think v_mov_b64_pseudo should also be moved, but SIFoldOperands does currently make use of it. llvm-svn: 279901
* AMDGPU: Fix sched type for branchesMatt Arsenault2016-08-271-1/+1
| | | | llvm-svn: 279900
* AMDGPU: Remove register operand from si_mask_branchMatt Arsenault2016-08-272-5/+3
| | | | | | | | | It isn't used for anything, and is also misleading since it could be spilled at the end of the block, so it can't be relied on. There ends up being a verifier error about using an undefined register since the spill kills the register. llvm-svn: 279899
* AMDGPU: Improve error reporting for maximum branch distanceMatt Arsenault2016-08-272-30/+68
| | | | | | Unfortunately this seems to only help the assembler diagnostic. llvm-svn: 279895
* [CMake] Only generate Components.cmake if components are specifiedChris Bieneman2016-08-271-18/+20
| | | | | | | | Generating the Components import file is useless if there are no components coming in from the runtimes configuration, so we should skip generation in that case. This also should fix the configuration error that Renato reported on llvm-dev. llvm-svn: 279893
* [ORC] Fix typo in LogicalDylib, add unit test.Lang Hames2016-08-273-1/+78
| | | | llvm-svn: 279892
* [GlobalISel] Add a fallback path to SDISel.Quentin Colombet2016-08-276-0/+85
| | | | | | | When global-isel fails on a MachineFunction MF, MF will be cleaned up and given to SDISel. Thanks to this fallback, we can already perform correctness tests even if we support only a small portion of the functions in a test. llvm-svn: 279891