summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Remove a temporary hack.Rafael Espindola2014-06-231-10/+0
| | | | | | | Amusingly this survived a lot longer than the CFI transition. We don't even support non-cfi assemblers any more. llvm-svn: 211498
* [PowerPC] Refactor getMinCallFrameSize / getMinCallArgumentsSizeUlrich Weigand2014-06-234-51/+20
| | | | | | | | | | | | | | | | | | | | As of r211495, the only remaining users of getMinCallFrameSize are in core ABI code (LowerFormalParameter / LowerCall). This is actually a good thing, since the details of the parameter save area are ABI specific. With the new ELFv2 ABI in particular, the rules defining the size of the save area will become significantly more complex, so it wouldn't make sense to implement those outside ABI code that has all required information. In preparation, this patch eliminates the getMinCallFrameSize (and associated getMinCallArgumentsSize) routines, and inlines them into all callers. Note that since nearly all call arguments are constant, this allows simplifying the inlined copies to a single line everywhere. No change in generate code expected. llvm-svn: 211497
* [PowerPC] Allow stack frames without parameter save areaUlrich Weigand2014-06-232-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PPCFrameLowering::determineFrameLayout routine currently ensures that every function that allocates a stack frame provides space for the parameter save area (via PPCFrameLowering::getMinCallFrameSize). This is actually not necessary. There may be functions that never call another routine but still allocate a frame; those do not require the parameter save area. In the future, with the ELFv2 ABI, even some routines that do call other functions do not need to allocate the parameter save area. While it is not a bug to allocate the parameter area when it is not needed, it is better to avoid it to save stack space. Note that when any particular function call requires the parameter save area, this space will already have been included by ABI code in the size the CALLSEQ_START insn is annotated with, and therefore included in the size returned by MFI->getMaxCallFrameSize(). This means that determineFrameLayout simply does not need to care about the parameter save area. (It still needs to ensure that every frame provides the linkage area.) This is implemented by this patch. Note that this exposed a bug in the new fast-isel code where the parameter area was *not* included in the CALLSEQ_START size; this is also fixed. A couple of test cases needed to be adapted for the new (smaller) stack frame size those tests now see. llvm-svn: 211495
* [PowerPC] Fix IsDarwin arg in PPCFrameLowering:: callsUlrich Weigand2014-06-231-5/+5
| | | | | | | | | | | | | | | | As remarked in the commit message to r211493, in several places throughout the 64-bit SVR4 ABI code there are calls to PPCFrameLowering::getLinkageSize and getMinCallFrameSize using an incorrect IsDarwin argument of "true". (Some of those were made explicit by the above refactoring patch, others have been there all along.) This patch fixes those places to pass "false" for IsDarwin. No change in generated code expected. llvm-svn: 211494
* [PowerPC] Refactor setMinReservedArea and CalculateParameterAndLinkageAreaSizeUlrich Weigand2014-06-232-127/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PPCISelLowering.cpp routines PPCTargetLowering::setMinReservedArea and CalculateParameterAndLinkageAreaSize are currently used as subroutines from both 64-bit SVR4 and Darwin ABI code. However, the two ABIs are already quite different w.r.t. AltiVec conventions, and they will become more different when the ELFv2 ABI is supported. Also, in general it seems better to disentangle ABI support routines for different ABIs to avoid accidentally affecting one ABI when intending to change only the other. (Actually, the current code strictly speaking already contains a bug: these routines call PPCFrameLowering::getMinCallFrameSize and PPCFrameLowering::getLinkageSize with the IsDarwin parameter set to "true" even on 64-bit SVR4. This bug currently has no adverse effect since those routines always return the same for 64-bit SVR4 and 64-bit Darwin, but it still seems wrong ... I'll fix this in a follow-up commit shortly.) To remove this code sharing, I'm simply inlining both routines into all call sites (there are just two each, one for 64-bit SVR4 and one for Darwin), and simplifying due to constant parameters where possible. A small piece of code that *does* make sense to share is refactored into the new routine EnsureStackAlignment, now also called from 32-bit SVR4 ABI code. No change in generated code is expected. llvm-svn: 211493
* [PowerPC] Fix on-stack AltiVec arguments with 64-bit SVR4Ulrich Weigand2014-06-231-44/+29
| | | | | | | | | | | | | | | | | | Current 64-bit SVR4 code seems to have some remnants of Darwin code in AltiVec argument handing. This had the effect that AltiVec arguments (or subsequent arguments) were not correctly placed in the parameter area in some cases. The correct behaviour with the 64-bit SVR4 ABI is: - All AltiVec arguments take up space in the parameter area, just like any other arguments, whether vararg or not. - They are always 16-byte aligned, skipping a parameter area doubleword (and the associated GPR, if any), if necessary. This patch implements the correct behaviour and adds a test case. (Verified against GCC behaviour via the ABI compat test suite.) llvm-svn: 211492
* ARM: mark UBFX as not allowing PC.Tim Northover2014-06-231-2/+2
| | | | | | | | | Strictly, it's unpredictable. But we don't quite model that yet and an error is better than ignoring the issue. This one somehow got left out before though. rdar://problem/15997748 llvm-svn: 211490
* MC: Cleanup parseMSInlineAsmDavid Majnemer2014-06-231-16/+13
| | | | | | | | | Utilize range based for-loops to simplify some code. Use insert() instead of a loop for simplicity/efficiency. No functionality change. llvm-svn: 211486
* MC: adjust text section flags for WoASaleem Abdulrasool2014-06-222-8/+15
| | | | | | | | | | | | | | | | | | | | | | | Correct the section flags for code built for Windows on ARM with `-ffunction-sections`. Windows on ARM uses solely Thumb-2 instructions, and indicates that the function is thumb by placing it in a text section that has IMAGE_SCN_MEM_16BIT flag set. When we encounter a .section directive, a new section is constructed. This may be a text segment. In order to identify that we need the additional flag, expose the target triple through the ObjectFileInfo as this information is lost otherwise. Since any modern ARM targeting environment on Windows would be Thumb-2 (Windows ARM NT or Windows Embedded Compact), introducing a new flag to indicate the section attribute seems to be a bit overkill. Simply depend on the target triple. Since there is one location that this information is currently needed, creating a target specific assembly parser and delegating the parsing of section switches also feels a bit heavy handed. If it turns out that this information ends up changing additional behaviour, then it may be worth considering that alternative. llvm-svn: 211481
* Revert r211399, "Generate native unwind info on Win64"NAKAMURA Takumi2014-06-2211-414/+190
| | | | | | It broke Legacy JIT Tests on x86_64-{mingw32|msvc}, aka Windows x64. llvm-svn: 211480
* R600: Use LowerSDIVREM for i64 node replaceJan Vesely2014-06-223-92/+120
| | | | | | | | v2: move div/rem node replacement to R600ISelLowering make lowerSDIVREM protected Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 211478
* R600: Implement custom SDIVREM.Jan Vesely2014-06-222-4/+44
| | | | | | | | | | Instead of separate SDIV/SREM. SDIV used UDIV which in turn used UDIVREM anyway. SREM used SDIV(UDIV->UDIVREM)+MUL+SUB, using UDIVREM directly is more efficient. v2: Don't use all caps names Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 211477
* Fix PR20087 by using the source index when changing the vector loadFilipe Cabecinhas2014-06-221-2/+2
| | | | llvm-svn: 211472
* LoopVectorizer: Fix a dominance issueArnold Schwaighofer2014-06-221-8/+17
| | | | | | | The induction variables start value needs to be defined before we branch (overflow check) to the scalar preheader where we used it. llvm-svn: 211460
* MergeFunctions Pass, removed DenseMap helpers.Stepan Dyatkovskiy2014-06-221-104/+0
| | | | | | | | | | | Patch removes rest part of code related to old implementation. This patch belongs to patch series that improves MergeFunctions performance time from O(N*N) to O(N*log(N)). This one was the final patch. llvm-svn: 211457
* MergeFunctions Pass, updated header comments.Stepan Dyatkovskiy2014-06-221-9/+47
| | | | | | | | | | Added short description for new comparison algorithm, that introduces total ordering among functions set. This patch belongs to patch series that improves MergeFunctions performance time from O(N*N) to O(N*log(N)). llvm-svn: 211456
* Report error for non-zero data in .bssWeiming Zhao2014-06-221-2/+8
| | | | | | | | | | | | User may initialize a var with non-zero value and specify .bss section. E.g. : int a __attribute__((section(".bss"))) = 2; This patch converts an assertion to error report for better user experience. Differential Revision: http://reviews.llvm.org/D4199 llvm-svn: 211455
* MergeFunctions Pass, FnSet has been replaced with FnTree.Stepan Dyatkovskiy2014-06-211-37/+49
| | | | | | | | | | | | | | | | Patch activates new implementation. So from now, merging process should take time O(N*log(N)). Where N size of module (we are free to measure it in functions or in instructions). Internally FnTree represents binary tree. So every lookup operation takes O(log(N)) time. It is still not the last patch in series, we also have to clean-up pass from old code, and update pass comments. This patch belongs to patch series that improves MergeFunctions performance time from O(N*N) to O(N*log(N)). llvm-svn: 211445
* MergeFunctions Pass, removed unused methods from old implementation.Stepan Dyatkovskiy2014-06-211-21/+0
| | | | | | | | | | | | | Patch removed next old FunctionComparator methods: * enumerate * isEquivalentOperation * isEquivalentGEP * isEquivalentType This patch belongs to patch series that improves MergeFunctions performance time from O(N*N) to O(N*log(N)). llvm-svn: 211444
* MergeFunctions, doSanityCheck: fixed body comments.Stepan Dyatkovskiy2014-06-211-7/+5
| | | | llvm-svn: 211443
* MergeFunctions Pass, introduced sanity check, that checks order relation,Stepan Dyatkovskiy2014-06-211-0/+88
| | | | | | | | | introduced among functions set. This patch belongs to patch series that improves MergeFunctions performance time from O(N*N) to O(N*log(N)). llvm-svn: 211442
* MergeFunctions Pass, introduced total ordering among top-level comparisonStepan Dyatkovskiy2014-06-211-79/+94
| | | | | | | | | | | | | methods. Patch changes return type of FunctionComparator::compare() and FunctionComparator::compare(const BasicBlock*, const BasicBlock*) methods from bool (equal or not) to {-1, 0, 1} (less, equal, great). This patch belongs to patch series that improves MergeFunctions performance time from O(N*N) to O(N*log(N)). llvm-svn: 211437
* LoopUnrollRuntime: Check for overflow in the trip count calculation.Benjamin Kramer2014-06-211-11/+12
| | | | | | Fixes PR19823. llvm-svn: 211436
* Legalizer: Add support for splitting insert_subvectors.Benjamin Kramer2014-06-212-0/+39
| | | | | | | | | We handle this by spilling the whole thing to the stack and doing the insertion as a store. PR19492. This happens in real code because the vectorizer creates v2i128 when AVX is enabled. llvm-svn: 211435
* SCEVExpander: Fold constant PHIs harder. The logic below only understands ↵Benjamin Kramer2014-06-211-1/+2
| | | | | | | | proper IVs. PR20093. llvm-svn: 211433
* Add back functionality removed in r210497.Richard Trieu2014-06-217-16/+34
| | | | | | Instead of asserting, output a message stating that a null pointer was found. llvm-svn: 211430
* [X86] Add ISel patterns to select SSE3/AVX ADDSUB instructions.Andrea Di Biagio2014-06-211-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds ISel patterns to select SSE3/AVX ADDSUB instructions from a sequence of "vadd + vsub + blend". Example: /// typedef float float4 __attribute__((ext_vector_type(4))); float4 foo(float4 A, float4 B) { float4 X = A - B; float4 Y = A + B; return (float4){X[0], Y[1], X[2], Y[3]}; } /// Before this patch, (with flag -mcpu=corei7) llc produced the following assembly sequence: movaps %xmm0, %xmm2 addps %xmm1, %xmm2 subps %xmm1, %xmm0 blendps $10, %xmm2, %xmm0 With this patch, we now get a single addsubps %xmm1, %xmm0 llvm-svn: 211427
* Fix the MinGW builder. Apparently std::call_once andZachary Turner2014-06-211-17/+10
| | | | | | | | std::recursive_mutex are not available on MinGW and breaks the builder. Revert to using a function local static and sys::Mutex just to get the tree green until we figure out a better solution. llvm-svn: 211424
* Always use a temp symbol for CIE.Rafael Espindola2014-06-201-6/+1
| | | | | | Fixes pr19185. llvm-svn: 211423
* Use compact unwind for the iOS simulator.Rafael Espindola2014-06-201-0/+5
| | | | | | Another step in fixing pr19185. llvm-svn: 211416
* Use a helper function and clang-format.Rafael Espindola2014-06-201-6/+19
| | | | | | No functionality change. llvm-svn: 211415
* Delete dead code.Rafael Espindola2014-06-201-17/+8
| | | | | | The compact unwind info is only used by code that knows it is supported. llvm-svn: 211412
* Support: Write ScaledNumber::getQuotient() and getProduct()Duncan P. N. Exon Smith2014-06-203-91/+120
| | | | llvm-svn: 211409
* Don't produce eh_frame relocations when targeting the IOS simulator.Rafael Espindola2014-06-201-2/+3
| | | | | | First step for fixing pr19185. llvm-svn: 211404
* Revert "Replace Execution Engine's mutex with std::recursive_mutex."Zachary Turner2014-06-204-56/+57
| | | | | | | | | | This reverts commit 1f502bd9d7d2c1f98ad93a09ffe435e11a95aedd, due to GCC / MinGW's lack of support for C++11 threading. It's possible this will go back in after we come up with a reasonable solution. llvm-svn: 211401
* Generate native unwind info on Win64Reid Kleckner2014-06-2011-190/+414
| | | | | | | | | | | | | | | | | | | | This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment. Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer. Patch by Vadim Chugunov! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D4081 llvm-svn: 211399
* Commited patch from Björn Steinbrink:Stepan Dyatkovskiy2014-06-201-1/+6
| | | | | | | | | | | | Summary: Different range metadata can lead to different optimizations in later passes, possibly breaking the semantics of the merged function. So range metadata must be taken into consideration when comparing Load instructions. Thanks! llvm-svn: 211391
* [RuntimeDyld] Fix ppc64 stub relocations on little-endianUlrich Weigand2014-06-201-5/+11
| | | | | | | | | | | | | | | When RuntimeDyldELF creates stub functions, it needs to install relocations that will resolve to the final address of the target routine. Since those are 16-bit relocs, they need to be applied to the least-significant halfword of the instruction. On big-endian ppc64, this means that addresses have to be adjusted by 2, which is what the code currently does. However, on a little-endian system, the address must *not* be adjusted; the least-significant halfword is the first one. This patch updates the RuntimeDyldELF code to take the target byte order into account. llvm-svn: 211384
* Fix a warning about the use of const being ignored with a cast.Kevin Enderby2014-06-201-1/+1
| | | | llvm-svn: 211383
* [RuntimeDyld] Support more PPC64 relocationsUlrich Weigand2014-06-201-4/+70
| | | | | | | | | | This adds support for several missing PPC64 relocations in the straight-forward manner to RuntimeDyldELF.cpp. Note that this actually fixes a failure of a large-model test case on PowerPC, allowing the XFAIL to be removed. llvm-svn: 211382
* R600/SI: Add patterns for ctpop inside a branchTom Stellard2014-06-201-12/+38
| | | | llvm-svn: 211378
* R600/SI: Add a pattern for f32 ftruncTom Stellard2014-06-203-5/+4
| | | | llvm-svn: 211377
* R600: Expand vector flog2Tom Stellard2014-06-201-0/+1
| | | | llvm-svn: 211376
* R600: Expand vector fexp2Tom Stellard2014-06-201-0/+1
| | | | llvm-svn: 211375
* R600/SI: SI Control Flow Annotation bug fixedTom Stellard2014-06-201-25/+21
| | | | | | | | | | | | Mixing of AddAvailableValue and GetValueAtEndOfBlock methods of SSAUpdater leaded to the endless loop generation when the nested loops annotated. This fixes a bug in the OCL_ML/KNN OpenCV test. The test case is too complex for FileCheck and would be very fragile. Patch by: Elena Denisova llvm-svn: 211374
* R600/SI: Add a VALU pattern for i64 xorTom Stellard2014-06-201-4/+7
| | | | llvm-svn: 211373
* [PowerPC] Fix small argument stack slot offset for LEUlrich Weigand2014-06-201-11/+20
| | | | | | | | | | | | | | When small arguments (structures < 8 bytes or "float") are passed in a stack slot in the ppc64 SVR4 ABI, they must reside in the least significant part of that slot. On BE, this means that an offset needs to be added to the stack address of the parameter, but on LE, the least significant part of the slot has the same address as the slot itself. This changes the PowerPC back-end ABI code to only add the small argument stack slot offset for BE. It also adds test cases to verify the correct behavior on both BE and LE. llvm-svn: 211368
* Allow a target to create a null streamer.Rafael Espindola2014-06-205-72/+42
| | | | | | | | | Targets can assume that a target streamer is present, so they have to be able to construct a null streamer in order to set the target streamer in it to. Fixes a crash when using the null streamer with arm. llvm-svn: 211358
* The count() function for STL datatypes returns unsigned, even where it'sYaron Keren2014-06-202-2/+2
| | | | | | | | | | | | | | | | | only 1/0 result like std::set. Some of the LLVM ADT already return unsigned count(), while others still return bool count(). In continuation to r197879, this patch modifies DenseMap, DenseSet, ScopedHashTable, ValueMap:: count() to return size_type instead of bool, 1 instead of true and 0 instead of false. size_type is typedef-ed locally within each class to size_t. http://reviews.llvm.org/D4018 Reviewed by dblaikie. llvm-svn: 211350
* Emit the ARM build attributes ABI_PCS_wchar_t and ABI_enum_size.Oliver Stannard2014-06-201-0/+26
| | | | | | | Emit the ARM build attributes ABI_PCS_wchar_t and ABI_enum_size based on module flags metadata. llvm-svn: 211349
OpenPOWER on IntegriCloud