path: root/llvm
Commit message | Author | Age | Files | Lines
...
* [x86] Refactor some of the new code for lowering v16i8 shuffles toChandler Carruth2014-07-101-17/+17
| | | | | | | | remove duplication and make it easier to select different strategies. No functionality changed. llvm-svn: 212674
* [dfsan] Handle bitcast aliases.Peter Collingbourne2014-07-102-2/+10
| | | | llvm-svn: 212668
* [SDAG] Make the new zext-vector-inreg node default to expand so targetsChandler Carruth2014-07-092-2/+4
| | | | | | | | | | | don't need to set it manually. This is based on feedback from Tom who pointed out that if every target needs to handle this we need to reach out to those maintainers. In fact, it doesn't make sense to duplicate everything when anything other than expand seems unlikely at this stage. llvm-svn: 212661
* Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for ↵David Blaikie2014-07-099-44/+314
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | functions that do not have top level debug information. Reverted by Eric Christopher (Thanks!) in r212203 after Bob Wilson reported LTO issues. Duncan Exon Smith and Aditya Nandakumar helped provide a reduced reproduction, though the failure wasn't too hard to guess, and even easier with the example to confirm. The assertion that the subprogram metadata associated with an llvm::Function matches the scope data referenced by the DbgLocs on the instructions in that function is not valid under LTO. In LTO, a C++ inline function might exist in multiple CUs and the subprogram metadata nodes will refer to the same llvm::Function. In this case, depending on the order of the CUs, the first instance of the subprogram metadata may not be the one referenced by the instructions in that function and the assertion will fail. A test case (test/DebugInfo/cross-cu-linkonce-distinct.ll) is added, the assertion removed and a comment added to explain this situation. Original commit message: If a function isn't actually in a CU's subprogram list in the debug info metadata, ignore all the DebugLocs and don't try to build scopes, track variables, etc. While this is possibly a minor optimization, it's also a correctness fix for an incoming patch that will add assertions to LexicalScopes and the debug info verifier to ensure that all scope chains lead to debug info for the current function. Fix up a few test cases that had broken/incomplete debug info that could violate this constraint. Add a test case where this occurs by design (inlining a debug-info-having function in an attribute nodebug function - we want this to work because /if/ the nodebug function is then inlined into a debug-info-having function, it should be fine (and will work fine - we just stitch the scopes up as usual), but should the inlining not happen we need to not assert fail either). llvm-svn: 212649
* Decouple llvm::SpecialCaseList text representation and its LLVM IR semantics.Alexey Samsonov2014-07-0910-403/+278
| | | | | | | | | | | | | | | | Turn llvm::SpecialCaseList into a simple class that parses text files in a specified format and knows nothing about LLVM IR. Move this class into LLVMSupport library. Implement two users of this class: * DFSanABIList in DFSan instrumentation pass. * SanitizerBlacklist in Clang CodeGen library. The latter will be modified to use actual source-level information from frontend (source file names) instead of unstable LLVM IR things (LLVM Module identifier). Remove dependency edge from ClangCodeGen/ClangDriver to LLVMTransformUtils. No functionality change. llvm-svn: 212643
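The split described above (a plain text parser that knows nothing about LLVM IR, with separate users layered on top) can be sketched roughly as follows. The `prefix:pattern[=category]` line format, the `#` comment syntax, and the function names here are assumptions made for illustration, and shell-style wildcards stand in for whatever matching the real class performs.

```python
import fnmatch

def parse_special_case_list(text):
    """Parse lines of the assumed form 'prefix:pattern[=category]'."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        prefix, _, rest = line.partition(":")
        pattern, _, category = rest.partition("=")
        entries.setdefault((prefix, category), []).append(pattern)
    return entries

def in_section(entries, prefix, query, category=""):
    """True if any pattern stored under (prefix, category) matches query."""
    patterns = entries.get((prefix, category), [])
    return any(fnmatch.fnmatchcase(query, p) for p in patterns)

# Hypothetical list contents, not taken from any real blacklist file.
EXAMPLE = """
# functions an instrumentation pass should leave alone
fun:main=uninstrumented
src:*/third_party/*
"""
entries = parse_special_case_list(EXAMPLE)
```

A user such as an ABI list or a sanitizer blacklist would then decide what string to pass as `query` (a function name, a source file name), which is the part that no longer lives in the parser.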
* Use simpler constructor for range adapter.Tim Northover2014-07-091-2/+1
| | | | | | | It is a good idea, it's slightly clearer and simpler. Unfortunately the headline news is: we save one line! llvm-svn: 212641
* Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine.Matt Arsenault2014-07-093-3/+54
| | | | | | Do this if the truncate is free and the select is legal. llvm-svn: 212640
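The rewrite is value-preserving because truncation is applied to whichever operand the select picks. A minimal sketch on plain integers, with hypothetical helper names and a 16-bit result width chosen only for the example:

```python
def trunc(value, bits):
    """Keep only the low `bits` bits of value."""
    return value & ((1 << bits) - 1)

def select(cond, a, b):
    """Scalar select: a if cond is true, otherwise b."""
    return a if cond else b

def before(cond, a, b, bits=16):
    # trunc (select c, a, b)
    return trunc(select(cond, a, b), bits)

def after(cond, a, b, bits=16):
    # select c, (trunc a), (trunc b)
    return select(cond, trunc(a, bits), trunc(b, bits))
```

Both forms compute the same value; the combine only pays off under the conditions the commit names (the truncate is free and the select is legal on the narrow type), which this sketch does not model.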
* AArch64: Better codegen for storing to __fp16.Jim Grosbach2014-07-092-0/+166
| | | | | | | | | | | | | Storing will generally be immediately preceded by rounding from an f32 or f64, so make sure to match those patterns directly to convert into the FPR16 register class directly rather than going through the integer GPRs. This also eliminates an extra step in the convert-from-f64 path which was first converting to f32 and then to f16 from there. rdar://17594379 llvm-svn: 212638
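The commit describes the old f64 -> f32 -> f16 path only as an extra step. As general floating-point background rather than a claim from the commit message, rounding twice can also produce a different value than rounding once. The sketch below uses Python's `struct` formats as stand-ins for the hardware conversions; the input value is constructed for the example.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (f64) to the nearest f32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_f16(x: float) -> float:
    """Round a Python float (f64) to the nearest f16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Sits just above the midpoint between the adjacent f16 values
# 1.0 and 1.0 + 2**-10.
x = 1.0 + 2.0**-11 + 2.0**-30

direct = to_f16(x)            # one rounding: f64 -> f16
two_step = to_f16(to_f32(x))  # two roundings: f64 -> f32 -> f16
print(direct, two_step)
```

Here the intermediate f32 rounding discards the small `2**-30` term, leaving an exact tie that then rounds to even, so the two results differ.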
* Change an assert() to a diagnostic.Jim Grosbach2014-07-091-3/+5
| | | | llvm-svn: 212637
* TargetRegisterInfo: Remove function that fell out of use years ago.Benjamin Kramer2014-07-093-25/+0
| | | | llvm-svn: 212636
* Update ReleaseNotes to mention Atomic NAND semantic changes.Cameron McInally2014-07-091-0/+4
| | | | llvm-svn: 212635
* [X86] AVX512: Enable it in the Loop VectorizerAdam Nemet2014-07-092-1/+40
| | | | | | | | | | This lets us experiment with 512-bit vectorization without passing force-vector-width manually. The code generated for a simple integer memset loop is properly vectorized. Disassembly is still broken for it though :(. llvm-svn: 212634
* Make AArch64FastISel::EmitIntExt explicitly check its source and destination ↵Louis Gerbarg2014-07-091-3/+8
| | | | | | | | | | types This is a follow up to r212492. There should be no functional difference, but this patch makes it clear that SrcVT must be an i1/i8/i16/i32 and DestVT must be an i8/i16/i32/i64. rdar://17516686 llvm-svn: 212633
* removed duplicate testcaseSanjay Patel2014-07-091-16/+0
| | | | llvm-svn: 212632
* Fix for PR20059 (instcombine reorders shufflevector after instruction that ↵Sanjay Patel2014-07-092-0/+38
| | | | | | | | | | | | may trap) In PR20059 ( http://llvm.org/pr20059 ), instcombine eliminates shuffles that are necessary before performing an operation that can trap (srem). This patch calls isSafeToSpeculativelyExecute() and bails out of the optimization in SimplifyVectorOp() if needed. Differential Revision: http://reviews.llvm.org/D4424 llvm-svn: 212629
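Why the order matters can be shown with a small lane-wise model. Everything here is illustrative: the vectors, the mask, and the helper names are made up, and `ZeroDivisionError` stands in for the hardware trap.

```python
def srem_lane(a, b):
    """Signed remainder for one lane; a zero divisor models a trap."""
    if b == 0:
        raise ZeroDivisionError("srem by zero traps")
    r = abs(a) % abs(b)
    return -r if a < 0 else r

def srem(x, y):
    return [srem_lane(a, b) for a, b in zip(x, y)]

def shuffle(v, mask):
    return [v[m] for m in mask]

x = [7, 9, 11, 13]
y = [4, 5, 0, 6]     # lane 2 would trap if it were ever divided by
mask = [1, 0, 3, 3]  # never selects lane 2

# Original order: shuffle first, so the zero divisor is never used.
safe = srem(shuffle(x, mask), shuffle(y, mask))

def reordered():
    # Transformed order: operate on the full vectors, then shuffle.
    return shuffle(srem(x, y), mask)
```

Calling `reordered()` raises on lane 2 even though that lane never reaches the result, which is the kind of case the `isSafeToSpeculativelyExecute()` check is meant to rule out.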
* Add Imagination Technologies to the vendors in llvm::TripleDaniel Sanders2014-07-092-0/+3
| | | | | | | | Summary: This is a pre-requisite for supporting the mips-img-linux-gnu triple in clang. Differential Revision: http://reviews.llvm.org/D4435 llvm-svn: 212626
* Generic: add range-adapter for option parsing.Tim Northover2014-07-092-17/+20
| | | | | | I want to use it in lld, but while I'm here I'll update LLVM uses. llvm-svn: 212615
* [x86] Fix a bug in my new zext-vector-inreg DAG trickery where we wereChandler Carruth2014-07-093-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | | not widening the input type to the node sufficiently to let the ext take place in a register. This would in turn result in a mysterious bitcast assertion failure downstream. First change here is to add back the helpful assert I had in an earlier version of the code to catch this immediately. Next change is to add support to the type legalization to detect when we have widened the operand either too little or too much (for whatever reason) and find a size-matched legal vector type to convert it to first. This can also fail so we get a new fallback path, but that seems OK. With this, we no longer crash on vec_cast2.ll when using widening. I've also added the CHECK lines for the zero-extend cases here. We still need to support sign-extend and trunc (or something) to get plausible code for the other two thirds of this test which is one of the regression tests that showed the most scalarization when widening was force-enabled. Slowly closing in on widening being a viable legalization strategy without it resorting to scalarization at every turn. =] llvm-svn: 212614
* Sink two variables only used in an assert into the assert itself. ShouldChandler Carruth2014-07-091-3/+3
| | | | | | fix the release builds with Werror. llvm-svn: 212612
* X86: When lowering v8i32 himuls use the correct shuffle masks for AVX2.Benjamin Kramer2014-07-092-24/+36
| | | | | | | | Turns out my trick of using the same masks for SSE4.1 and AVX2 didn't work out as we have to blend two vectors. While there, remove unnecessary cross-lane moves from the shuffles so the backend can lower it to palignr instead of vperm. Fixes PR20118, a miscompilation of vector sdiv by constant on AVX2. llvm-svn: 212611
* [x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when wideningChandler Carruth2014-07-099-1/+106
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vector types to be legal and a ZERO_EXTEND node is encountered. When we use widening to legalize vector types, extend nodes are a real challenge. Either the input or output is likely to be legal, but in many cases not both. As a consequence, we don't really have any way to represent this situation and the prior code in the widening legalization framework would just scalarize the extend operation completely. This patch introduces a new DAG node to represent doing a zero extend of a vector "in register". The core of the idea is to allow legal but different vector types in the input and output. The output vector must have fewer lanes but wider elements. The operation is defined to zero extend the low elements of the input to the size of the output elements, and drop all of the high elements which don't have a corresponding lane in the output vector. It also includes generic expansion of this node in terms of blending a zero vector into the high elements of the vector and bitcasting across. This in turn yields extremely nice code for x86 SSE2 when we use the new widening legalization logic in conjunction with the new shuffle lowering logic. There is still more to do here. We need to support sign extension, any extension, and potentially int-to-float conversions. My current plan is to continue using similar synthetic nodes to model each of these transitions with generic lowering code for each one. However, with this patch LLVM already reaches performance parity with GCC for the core C loops of the x264 code (assuming you disable the hand-written assembly versions) when compiling for SSE2 and SSE3 architectures and enabling the new widening and lowering logic for vectors. Differential Revision: http://reviews.llvm.org/D4405 llvm-svn: 212610
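The semantics described for the new node, and the generic expansion via a zero vector and a bitcast, can be modelled on lists of integers. This is a sketch under stated assumptions: the function names are hypothetical, lanes are unsigned integers, and the bitcast is modelled as little-endian.

```python
def zext_vector_inreg(src, src_bits, dst_bits):
    """Zero extend the low lanes of src; drop the high lanes."""
    assert dst_bits > src_bits and dst_bits % src_bits == 0
    factor = dst_bits // src_bits
    assert len(src) % factor == 0
    dst_lanes = len(src) // factor  # fewer lanes, wider elements
    lane_mask = (1 << src_bits) - 1
    return [src[i] & lane_mask for i in range(dst_lanes)]

def zext_via_zero_blend_and_bitcast(src, src_bits, dst_bits):
    """Same result built from a lane rearrangement plus a bitcast."""
    factor = dst_bits // src_bits
    dst_lanes = len(src) // factor
    # Place each low source lane, followed by lanes taken from a zero vector.
    narrow = []
    for i in range(dst_lanes):
        narrow.append(src[i])
        narrow.extend([0] * (factor - 1))
    # Bitcast: reinterpret each group of narrow lanes as one wide lane
    # (little-endian lane order is an assumption of this model).
    wide = []
    for i in range(dst_lanes):
        group = narrow[i * factor:(i + 1) * factor]
        wide.append(sum(v << (j * src_bits) for j, v in enumerate(group)))
    return wide

v8i8 = [0xFF, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07]
```

With 8-bit source lanes and 32-bit destination lanes, only the two lowest source lanes survive, and both functions agree on the result.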
* [mips][mips64r6] Correct select patterns that have the condition or ↵Daniel Sanders2014-07-096-105/+105
| | | | | | | | | | true/false values backwards Summary: This bug caused SingleSource/Regression/C/uint64_to_float and SingleSource/UnitTests/2002-05-02-CastTest3 to fail (among others). Differential Revision: http://reviews.llvm.org/D4388 llvm-svn: 212608
* [mips][mips64r6] Correct cond names in the cmp.cond.[ds] instructionsDaniel Sanders2014-07-099-158/+159
| | | | | | | | Summary: It seems we accidentally read the wrong column of the table in the MIPS64r6 spec and used the names for c.cond.fmt instead of cmp.cond.fmt. Differential Revision: http://reviews.llvm.org/D4387 llvm-svn: 212607
* [x86] Initialize a pointer to null to fix a bug in r212602.Chandler Carruth2014-07-091-1/+1
| | | | | | | This should restore GCC hosts (which happen to put the bad stuff into the pointer) and MSan, etc. llvm-svn: 212606
* [mips][mips64r6] Use JALR for indirect branches instead of JR (which is not ↵Daniel Sanders2014-07-097-36/+71
| | | | | | | | | | | | | | | | | available on MIPS32r6/MIPS64r6) Summary: This completes the change to use JALR instead of JR on MIPS32r6/MIPS64r6. Reviewers: jkolek, vmedic, zoran.jovanovic, dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4269 llvm-svn: 212605
* [mips][mips64r6] Use JALR for returns instead of JR (which is not available ↵Daniel Sanders2014-07-0913-60/+183
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | on MIPS32r6/MIPS64r6) Summary: RET, and RET_MM have been replaced by a pseudo named PseudoReturn. In addition a version with a 64-bit GPR named PseudoReturn64 has been added. Instruction selection for a return matches RetRA, which is expanded post register allocation to PseudoReturn/PseudoReturn64. During MipsAsmPrinter, PseudoReturn/PseudoReturn64 are emitted as: - (JALR64 $zero, $rs) on MIPS64r6 - (JALR $zero, $rs) on MIPS32r6 - (JR_MM $rs) on microMIPS - (JR $rs) otherwise On MIPS32r6/MIPS64r6, 'jr $rs' is an alias for 'jalr $zero, $rs'. To aid development and review (specifically, to ensure all cases of jr are updated), these aliases are temporarily named 'r6.jr' instead of 'jr'. A follow up patch will change them back to the correct mnemonic. Added (JALR $zero, $rs) to MipsNaClELFStreamer's definition of an indirect jump, and removed it from its definition of a call. Note: I haven't accounted for MIPS64 in MipsNaClELFStreamer since it doesn't appear to account for any MIPS64-specifics. The return instruction created as part of eh_return expansion is now expanded using expandRetRA() so we use the right return instruction on MIPS32r6/MIPS64r6 ('jalr $zero, $rs'). Also, fixed a misuse of isABI_N64() to detect 64-bit wide registers in expandEhReturn(). Reviewers: jkolek, vmedic, mseaborn, zoran.jovanovic, dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4268 llvm-svn: 212604
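The four-way choice listed in the summary amounts to a small decision table. The sketch below restates it; the function name and the subtarget strings are hypothetical labels for this example, not LLVM identifiers, while the opcode names come from the commit message.

```python
def expand_pseudo_return(subtarget: str, rs: str = "$ra"):
    """Pick the concrete return instruction for a pseudo return."""
    if subtarget == "mips64r6":
        return ("JALR64", "$zero", rs)
    if subtarget == "mips32r6":
        return ("JALR", "$zero", rs)
    if subtarget == "micromips":
        return ("JR_MM", rs)
    return ("JR", rs)  # every other case keeps the classic jr
```

On the r6 subtargets the link register operand is `$zero`, so the jump-and-link behaves as a plain jump.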
* Add ability to emit internal instruction representation to CodeGen assembly ↵Daniel Sanders2014-07-091-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | output. Summary: This patch re-uses the implementation of 'llvm-mc -show-inst' and makes it available to llc as 'llc -asm-show-inst'. This is necessary to test parts of MIPS32r6/MIPS64r6 without resorting to 'llc -filetype=obj' tests. For example, on MIPS32r2 and earlier we use the 'jr $rs' instruction for indirect branches and returns. On MIPS32r6, we no longer have 'jr $rs' and use 'jalr $zero, $rs' instead. The catch is that, on MIPS32r6, 'jr $rs' is an alias for 'jalr $zero, $rs' and is the preferred way of writing this instruction. As a result, all MIPS ISAs emit 'jr $rs' in their assembly output and the assembler encodes this to different opcodes according to the ISA. Using this option, we can check that the MCInst really is a JR or a JALR by matching the emitted comment. This removes the need for a 'llc -filetype=obj' test. Reviewers: rafael, dsanders Reviewed By: dsanders Subscribers: zoran.jovanovic, llvm-commits Differential Revision: http://reviews.llvm.org/D4267 llvm-svn: 212603
* [x86] Re-apply a variant of the x86 side of r212324 now that the restChandler Carruth2014-07-093-82/+96
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | has settled without incident, removing the x86-specific and overly strict 'isVectorSplat' routine in favor of generic and more powerful splat detection. The primary motivation and result of this is that the x86 backend can now see through splats which contain undef elements. This is essential if we are using a widening form of legalization and I've updated a test case to also run in that mode as before this change the generated code for the test case was completely scalarized. This version of the patch much more carefully handles the undef lanes. - We aren't overly conservative about them in the shift lowering (where we will never use the splat itself). - One place where the splat would have been re-used by the existing code now explicitly constructs a new constant splat that will be safe. - The broadcast lowering is much more reasonable with undefs by doing a correct check of whether the splat is the only user of a loaded value, checking that the splat actually crosses multiple lanes before using a broadcast, and handling broadcasts of non-constant splats. As a consequence of the last bullet, the weird usage of vpshufd instead of vbroadcast is gone, and we actually can lower an AVX splat with vbroadcastss where before we emitted a really strange pattern of a vector load and a manual splat across the vector. llvm-svn: 212602
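The difference between a strict splat check and one that sees through undef lanes can be sketched on plain lists. The names, the `UNDEF` sentinel, and the choice to reject an all-undef vector are assumptions of this model, not the actual LLVM interface.

```python
UNDEF = object()  # sentinel standing in for an undef build-vector element

def is_strict_splat(elements):
    """Every lane must be defined and hold the same value."""
    return UNDEF not in elements and len(set(elements)) == 1

def get_splat_value(elements):
    """Return (value, undef_lanes) if all defined lanes agree, else None."""
    defined = [e for e in elements if e is not UNDEF]
    if not defined or any(e != defined[0] for e in defined):
        return None
    undef_lanes = [e is UNDEF for e in elements]
    return defined[0], undef_lanes

widened = [7, UNDEF, 7, 7]
```

Returning which lanes were undef lets a caller decide whether it can reuse the original vector or has to build a fresh, fully defined constant splat, which mirrors the care the commit describes around undef lanes.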
* [ASan/Win] Don't instrument COMDAT globals. Properly fixes PR20244.Timur Iskhodzhanov2014-07-091-8/+4
| | | | llvm-svn: 212596
* SourceMgr: consistently use 'unsigned' for the memory buffer ID typeDmitri Gribenko2014-07-092-7/+7
| | | | llvm-svn: 212595
* Prospective -fsanitize=memory build fix following r212586Alp Toker2014-07-091-3/+3
| | | | | | | | | | | This -f group flag appears to influence linker flags, breaking the usual rules and causing CMake's link invocation to fail during feature detection due to missing link dependencies (msan_*). Let's forcibly add it for now to get things the way they were before feature detection started working. llvm-svn: 212590
* Use correct member when displaying StringMap's size.Nikola Smiljanic2014-07-091-2/+2
| | | | llvm-svn: 212588
* CMake: make __DATE__, __TIME__ etc. macro usage an errorAlp Toker2014-07-091-0/+3
| | | | | | | | | | | | | | | | When LLVM_ENABLE_TIMESTAMPS has been disabled we can prevent the preprocessor from embedding dates, times and file timestamps. There are a few motivations for this: 1) Validate the recent CMake feature detection bugfix from LLVM r212586 with a flag that's not actually available everywhere. 2) Dogfood clang's new -Wdate-time warning from r210511 when bootstrapping. 3) Encourage reproducible builds. llvm-svn: 212587
* CMake: fix compiler feature detectionAlp Toker2014-07-091-36/+27
| | | | | | | | | | | | | | | | | | | | add_flag_if_supported() and add_flag_or_print_warning() were effectively no-ops, just returning the value of the first result (usually '-fno-omit-frame-pointer') for all subsequent checks for different flags. Due to the way CMake caches feature detection results, we need to provide symbolic variable names which will persist the cached results. This commit fixes feature detection using these two macros. The feature checks now run and get stored correctly, and the correct output can be observed in configure logs: -- Performing Test C_SUPPORTS_FPIC -- Performing Test C_SUPPORTS_FPIC - Success -- Performing Test CXX_SUPPORTS_FPIC -- Performing Test CXX_SUPPORTS_FPIC - Success llvm-svn: 212586
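The failure mode described above, where every check returns the first cached result, can be modelled with a dictionary keyed by result variable name. This is a rough model of the caching behaviour, not CMake code; the helper names and the `RESULT` variable name are made up for the example.

```python
def make_checker(supported_flags):
    """Return a check(flag, var_name) function with a persistent cache."""
    cache = {}

    def check(flag, var_name):
        # Like a cached feature test: if var_name already has a value,
        # the test is not run again and the stored value is returned.
        if var_name not in cache:
            cache[var_name] = flag in supported_flags
        return cache[var_name]

    return check

check = make_checker({"-fno-omit-frame-pointer", "-fPIC"})

# Reusing one variable name: the second flag inherits the first result.
first = check("-fno-omit-frame-pointer", "RESULT")
stale = check("-fnot-a-real-flag", "RESULT")

# One symbolic name per flag and language, as in the log output above.
fpic = check("-fPIC", "C_SUPPORTS_FPIC")
bogus = check("-fnot-a-real-flag", "C_SUPPORTS_FNOT_A_REAL_FLAG")
```

In this model `stale` comes back true for an unsupported flag, while the per-flag names give the correct answer for each check.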
* [SDAG] At the suggestion of Hal, switch to an output parameter thatChandler Carruth2014-07-094-25/+42
| | | | | | | | tracks which elements of the build vector are in fact undef. This should make actually inspecting them (likely in my next patch) reasonably pretty. Also makes the output parameter optional as it is clear now that *most* users are happy with undefs in their splats. llvm-svn: 212581
* [ms-coff] Add a test for proper handling of full Windows path names in the ↵Ehsan Akhgari2014-07-091-0/+2
| | | | | | | | | | | | | | .drectve section Summary: This test ensures that we can correctly specify a full Windows path to the clang ASAN runtime libraries. This is in preparation to fix PR20246. Reviewers: rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4427 llvm-svn: 212580
* MipsTargetStreamer.h: Avoid "using" to appease msc17.NAKAMURA Takumi2014-07-081-1/+1
| | | | llvm-svn: 212577
* Changed the llvm-nm alias "-s" for -print-armap to "-M".Kevin Enderby2014-07-088-15/+15
| | | | | | | | This will allow the "-s" flag to be implemented in the future as it is in darwin's nm(1) to list symbols only in the specified section. Given a LGTM by Shankar Easwaran who originally implemented the support for llvm-nm's -print-armap and archive map symbols. llvm-svn: 212576
* AArch64: Better codegen for loading from __fp16.Jim Grosbach2014-07-082-0/+163
| | | | | | | | | | | Loading will generally extend to an f32 or an f64, so make sure to match those patterns directly to load into the FPR16 register class directly rather than going through the integer GPRs. This also eliminates an extra step in the convert-to-f64 path which was first converting to f32 and then to f64 from there. rdar://17594379 llvm-svn: 212573
* Improve BasicAA CS-CS queriesHal Finkel2014-07-085-126/+371
| | | | | | | | | | | | | | | | | | | | | | | | | | | | BasicAA contains knowledge of certain intrinsics, such as memcpy and memset, and uses that information to form more-accurate answers to CallSite vs. Loc ModRef queries. Unfortunately, it did not use this information when answering CallSite vs. CallSite queries. Generically, when an intrinsic takes one or more pointers and the intrinsic is marked only to read/write from its arguments, the offset/size is unknown. As a result, the generic code that answers CallSite vs. CallSite (and CallSite vs. Loc) queries in AA uses UnknownSize when forming Locs from an intrinsic's arguments. While BasicAA's CallSite vs. Loc override could use more-accurate size information for some intrinsics, it did not do the same for CallSite vs. CallSite queries. This change refactors the intrinsic-specific logic in BasicAA into a generic AA query function: getArgLocation, which is overridden by BasicAA to supply the intrinsic-specific knowledge, and used by AA's generic implementation. This allows the intrinsic-specific knowledge to be used by both CallSite vs. Loc and CallSite vs. CallSite queries, and simplifies the BasicAA implementation. Currently, only one function, Mac's memset_pattern16, is handled by BasicAA (all the rest are intrinsics). As a side-effect of this refactoring, BasicAA's getModRefBehavior override now also returns OnlyAccessesArgumentPointees for this function (which is an improvement). llvm-svn: 212572
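The effect of size information on a call-vs-call query can be illustrated with a toy model of two memcpy-like calls. Only the name `getArgLocation` comes from the commit (rendered here as `get_arg_location`); the data types, the overlap rule, and the `precise` switch are simplifications invented for this sketch.

```python
from collections import namedtuple

UNKNOWN_SIZE = None
Loc = namedtuple("Loc", "base offset size")
Memcpy = namedtuple("Memcpy", "dst_base dst_off src_base src_off length")

def get_arg_location(call, arg, precise):
    """Location touched through one pointer argument of the call."""
    size = call.length if precise else UNKNOWN_SIZE
    if arg == "dst":
        return Loc(call.dst_base, call.dst_off, size)
    return Loc(call.src_base, call.src_off, size)

def may_overlap(a, b):
    if a.base != b.base:
        return False  # distinct objects never alias in this toy model
    if a.size is UNKNOWN_SIZE or b.size is UNKNOWN_SIZE:
        return True   # deliberately conservative without size information
    return a.offset < b.offset + b.size and b.offset < a.offset + a.size

def calls_may_conflict(a, b, precise):
    """A conflict needs at least one write: dst/dst, dst/src or src/dst."""
    a_dst = get_arg_location(a, "dst", precise)
    a_src = get_arg_location(a, "src", precise)
    b_dst = get_arg_location(b, "dst", precise)
    b_src = get_arg_location(b, "src", precise)
    return (may_overlap(a_dst, b_dst) or may_overlap(a_dst, b_src)
            or may_overlap(a_src, b_dst))

first = Memcpy("buf", 0, "in", 0, 16)     # writes buf[0, 16)
second = Memcpy("out", 0, "buf", 32, 16)  # reads  buf[32, 48)
```

With unknown sizes the two calls must be assumed to conflict because both touch `buf`; once the lengths are used, the byte ranges are seen to be disjoint.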
* DominanceInfo is strongly preferred over RegionInfoTobias Grosser2014-07-081-0/+10
| | | | | | | This is and always was strong community consensus. Make this clear in the header in case newcomers may not be aware. llvm-svn: 212570
* Add support for BSD format Archive map symbols (aka the table of contentsKevin Enderby2014-07-082-6/+69
| | | | | | from a __.SYMDEF or "__.SYMDEF SORTED" archive member). llvm-svn: 212568
* Revert "GlobalDCE: Delete available_externally initializers if it allows ↵Pete Cooper2014-07-082-74/+4
| | | | | | | | | | removing the value the initializer is referring to." This reverts commit 5b55a47e94e28fbb56d0cd5d72c3db9105c15b4c. A test case was found to crash after this was applied. I'll file a bug to track fixing this with the test case needed. llvm-svn: 212550
* [PowerPC] Implement atomic NAND operations as actual NANDUlrich Weigand2014-07-081-4/+4
| | | | | | | | | | | This changes the implementation of atomic NAND operations from "a & ~b" (compatible with GCC < 4.4) to actual "~(a & b)" (compatible with GCC >= 4.4). This is in line with the common-code and ARM back-end change implemented in r212433. llvm-svn: 212547
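The two definitions give different results for most inputs, as a quick model on 32-bit values shows. The function names and the fixed 32-bit width are assumptions for the example; the two expressions are the ones quoted in the commit message.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit atomic word

def nand_old(a: int, b: int) -> int:
    """Previous behaviour, compatible with GCC < 4.4: a & ~b."""
    return a & ~b & MASK32

def nand_new(a: int, b: int) -> int:
    """New behaviour, compatible with GCC >= 4.4: ~(a & b)."""
    return ~(a & b) & MASK32

a, b = 0b1100, 0b1010
print(f"{nand_old(a, b):#010x} {nand_new(a, b):#010x}")
```

For these inputs the old form yields `0b0100`, while the true NAND sets every bit except bit 3.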
* [DAG] Teach how to combine a pair of shuffles into a single shuffle if the ↵Andrea Di Biagio2014-07-083-3/+627
| | | | | | | | | | | | | | | | | | | | | | | resulting mask is legal. This patch teaches how to fold a shuffle according to rule: shuffle (shuffle (x, undef, M0), undef, M1) -> shuffle(x, undef, M2) We do this only if the resulting mask M2 is legal; this is to avoid introducing illegal shuffles that are potentially expanded into a sub-optimal sequence of target specific dag nodes. This patch has the advantage of being target independent, since it works on ISD nodes. Therefore, all targets (not only x86) can take advantage of this rule. The idea behind this patch is that most shuffle pairs can be safely combined before we run the legalizer on vector operations. This allows us to combine/simplify dag nodes earlier in the process and not only immediately before instruction selection stage. That said, this patch is not meant to replace any existing target specific combine rules; backends might still introduce new shuffles during legalization stage. Also, this rule is very simple and avoids aggressively optimizing shuffles. llvm-svn: 212539
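Computing M2 from M0 and M1 is a matter of composing the two masks. A minimal sketch, assuming the usual convention that mask indices outside the first operand (including -1) select an undef lane; the helper names are hypothetical and the target-specific legality check on M2 is not modelled.

```python
UNDEF = None  # stands in for an undef lane

def shuffle(x, mask):
    """shuffle(x, undef, mask): indices outside x select an undef lane."""
    n = len(x)
    return [x[m] if 0 <= m < n else UNDEF for m in mask]

def combine_masks(m0, m1):
    """Mask M2 such that shuffle(shuffle(x, M0), M1) == shuffle(x, M2)."""
    n = len(m0)
    m2 = []
    for m in m1:
        if 0 <= m < n and 0 <= m0[m] < n:
            m2.append(m0[m])  # follow the lane back through the inner shuffle
        else:
            m2.append(-1)     # anything that reached undef stays undef
    return m2

x = [10, 20, 30, 40]
m0 = [3, 2, 1, 0]
m1 = [0, 0, -1, 2]
m2 = combine_masks(m0, m1)
```

A real combine would only replace the pair when the target reports M2 as a legal mask, as the commit message stresses.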
* Fix some Twine locals.Benjamin Kramer2014-07-083-17/+18
| | | | | | Two of those are use after frees. Found by clang-tidy, fixed by me. llvm-svn: 212537
* [ASan/Win] Don't instrument private COMDAT globals until PR20244 is properly ↵Timur Iskhodzhanov2014-07-081-0/+7
| | | | | | fixed llvm-svn: 212530
* [mips] Fixed struct/class mismatch introduced in r212522.Daniel Sanders2014-07-081-1/+1
| | | | | | Clang emits a warning about this. llvm-svn: 212528
* Fix r212522 - [mips] Improve encapsulation of the .MIPS.abiflags ↵Daniel Sanders2014-07-081-0/+3
| | | | | | | | implementation and limit scope of related enums Added two lines that should have been in r212522. llvm-svn: 212523
* [mips] Improve encapsulation of the .MIPS.abiflags implementation and limit ↵Daniel Sanders2014-07-088-298/+391
| | | | | | | | | | | | | scope of related enums Summary: Follow on to r212519 to improve the encapsulation and limit the scope of the enums. Also merged two very similar parser functions, fixed a bug where ASEs were not being reported, and marked CPR1s as being 128-bit when MSA is enabled. Differential Revision: http://reviews.llvm.org/D4384 llvm-svn: 212522