path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* Move Delinearization pass into an anonymous namespace.Benjamin Kramer2013-11-131-0/+4
| | | | llvm-svn: 194582
* Make sure LLVMLoadLibraryPermanently gets an extern "C" symbol.Benjamin Kramer2013-11-131-1/+1
| | | | | | | Otherwise it's impossible to use it. Also don't include C++ headers in a C header. llvm-svn: 194581
* Remove AllowQuotesInName and friends from MCAsmInfo.Rafael Espindola2013-11-138-97/+34
| | | | | | | | | | | Accepting quotes is a property of an assembler, not of an object file. For example, ELF can support any names for sections and symbols, but the gnu assembler only accepts quotes in some contexts and llvm-mc in a few more. LLVM should not produce different symbols based on a guess about which assembler will be reading the code it is printing. llvm-svn: 194575
* Don't call doFinalization from verifyFunction.Rafael Espindola2013-11-131-1/+0
| | | | | | | | | | | | | | | | verifyFunction needs to call doInitialization to collect metadata and avoid crashing when verifying debug info in a function. But it should not call doFinalization since that is where the verifier will check declarations, variables and aliases, which is not desirable when one only wants to verify a function. A possible cleanup would be to split the class into a ModuleVerifier and FunctionVerifier. Issue reported by Ilia Filippov. Patch by Michael Kruse. llvm-svn: 194574
* Fix bug in .gpword directive parsing.Vladimir Medic2013-11-131-4/+2
| | | | llvm-svn: 194570
* Support for microMIPS trap instruction with immediate operands.Zoran Jovanovic2013-11-134-8/+27
| | | | llvm-svn: 194569
* Fix -Wdelete-non-virtual-dtor warnings by making SampleProfile methods ↵Alexey Samsonov2013-11-131-4/+4
| | | | | | non-virtual llvm-svn: 194568
* SampleProfileLoader pass. Initial setup.Diego Novillo2013-11-133-0/+481
| | | | This adds a new scalar pass that reads a file with samples generated by 'perf' during runtime. The samples read from the profile are incorporated and emitted as IR metadata reflecting that profile.
The profile file is assumed to have been generated by an external profile source. The profile information is converted into IR metadata, which is later used by the analysis routines to estimate block frequencies, edge weights and other related data.
External profile information files have no fixed format; each profiler is free to define its own. This includes both the on-disk representation of the profile and the kind of profile information stored in the file. A common kind of profile is based on sampling (e.g., perf), which essentially counts how many times each line of the program has been executed during the run.
The SampleProfileLoader pass is organized as a scalar transformation. On startup, it reads the file given in -sample-profile-file to determine what kind of profile it contains. This file is assumed to contain profile information for the whole application. The profile data in the file is read and incorporated into the internal state of the corresponding profiler.
To facilitate testing, I've organized the profilers to support two file formats: text and native. The native format is whatever on-disk representation the profiler wants to support. I think this will mostly be bitcode files, but it could be anything the profiler wants to support. To do this, every profiler must implement the SampleProfile::loadNative() function.
The text format is mostly meant for debugging. Records are separated by newlines, but each profiler is free to interpret records as it sees fit. Profilers must implement the SampleProfile::loadText() function.
Finally, the pass will call SampleProfile::emitAnnotations() for each function in the current translation unit. This function needs to translate the loaded profile into IR metadata, which the analyzer will later be able to use.
This patch implements the first steps towards the above design. I've implemented a sample-based flat profiler. The format of the profile is fairly simplistic. Each sampled function contains a list of relative line locations (from the start of the function) together with a count representing how many samples were collected at that line during execution. I generate this profile using perf and a separate converter tool.
Currently, I have only implemented a text format for these profiles. I am interested in initial feedback to the whole approach before I send the other parts of the implementation for review.
This patch implements:
- The SampleProfileLoader pass.
- The base ExternalProfile class with the core interface.
- A SampleProfile sub-class using the above interface. The profiler generates branch weight metadata on every branch instruction that matches the profile.
- A text loader class to assist the implementation of SampleProfile::loadText().
- Basic unit tests for the pass.
Additionally, the patch uses profile information to compute branch weights based on instruction samples. It does a fairly simplistic conversion: given a multi-way branch instruction, it calculates the weight of each branch based on the maximum sample count gathered from each target basic block.
Note that this assignment of branch weights is somewhat lossy and can be misleading. If a basic block has more than one incoming branch, all the incoming branches will get the same weight. In reality, it may be that only one of them is the most heavily taken branch. I will adjust this assignment in subsequent patches. llvm-svn: 194566
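The branch-weight conversion described in r194566 can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: `BlockSamples`, `blockWeight` and `branchWeights` are hypothetical names, and a basic block is modelled simply as the list of sample counts recorded for its instructions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a basic block: the sample counts recorded
// for each of its instructions (not an LLVM type).
using BlockSamples = std::vector<uint64_t>;

// Weight of one branch target: the maximum sample count seen in that
// block. A block without samples gets weight 0.
uint64_t blockWeight(const BlockSamples &samples) {
  return samples.empty() ? 0
                         : *std::max_element(samples.begin(), samples.end());
}

// One weight per successor of a multi-way branch, in successor order.
std::vector<uint64_t>
branchWeights(const std::vector<BlockSamples> &successors) {
  std::vector<uint64_t> weights;
  weights.reserve(successors.size());
  for (const BlockSamples &block : successors)
    weights.push_back(blockWeight(block));
  return weights;
}
```

Because the weight depends only on the target block, every branch into the same block receives the same weight, which is the lossiness the commit message points out.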
* XCore target: implement exception handlingRobert Lytton2013-11-133-3/+39
| | | | llvm-svn: 194564
* This patch fixes a bug in floating point operands parsing, when instruction ↵Vladimir Medic2013-11-131-2/+19
| | | | | | alias uses default register operand. llvm-svn: 194562
* Mips16InstrInfo.cpp: Use <cctype> instead of <ctype.h>NAKAMURA Takumi2013-11-131-2/+1
| | | | | | Also, prune <stdlib.h>, seems stray. llvm-svn: 194557
* Allow the code which returns the length for inline assembler to knowReed Kotler2013-11-133-3/+52
| | | | | | | | | specifically about the .space directive. This allows us to force large blocks of code to appear in test cases for things like constant islands without having to make giant test cases to force things like long branches to take effect. llvm-svn: 194555
* R600: Fix selection failure on EXTLOADMatt Arsenault2013-11-131-1/+9
| | | | llvm-svn: 194547
* SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.Juergen Ributzka2013-11-134-11/+51
| | | | This patch reapplies r193676 with an additional fix for the Hexagon backend. The SystemZ backend has already been fixed by r194148. The Type Legalizer recognizes that VSELECT needs to be split, because the type is too wide for the given target. The same does not always apply to SETCC, because less space is required to encode the result of a comparison. As a result VSELECT is split and SETCC is unrolled into scalar comparisons. This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG Combiner. If a matching pattern is found, then the result mask of SETCC is promoted to the expected vector mask type for the given target. Now the type legalizer will split both VSELECT and SETCC. This allows the following X86 DAG Combine code to successfully detect the MIN/MAX pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>. Reviewed by Nadav llvm-svn: 194542
* Introduce an AnalysisManager which is like a pass manager but with a lotChandler Carruth2013-11-132-0/+156
| | | | more smarts in it. This is where most of the interesting logic that used to live in the implicit-scheduling-hackery of the old pass manager will live. Like the previous commits, note that this is a very early prototype! I expect substantial changes before this is ready to use.
The core of the design is the following:
- We have an AnalysisManager which can be used across a series of passes over a module.
- The code setting up a pass pipeline registers the analyses available with the manager.
- Individual transform passes can check that an analysis manager provides the analyses they require in order to fail-fast.
- There is *no* implicit registration or scheduling.
- Analysis passes are different from other passes: they produce an analysis result that is cached and made available via the analysis manager.
- Cached results are invalidated automatically by the pass managers.
- When a transform pass requests an analysis result, either the analysis is run to produce the result or a cached result is provided.
There are a few aspects of this design that I *know* will change in subsequent commits:
- Currently there is no "preservation" system, that needs to be added.
- All of the analysis management should move up to the analysis library.
- The analysis management needs to support at least SCC passes. Maybe loop passes. Living in the analysis library will facilitate this.
- Need support for analyses which are *both* module and function passes.
- Need support for pro-actively running module analyses to have cached results within a function pass manager.
- Need a clear design for "immutable" passes.
- Need support for requesting cached results when available and not re-running the pass even if that would be necessary.
- Need more thorough testing of all of this infrastructure.
There are other aspects that I view as open questions I'm hoping to resolve as I iterate a bit on the infrastructure, and especially as I start writing actual passes against this.
- Should we have separate management layers for function, module, and SCC analyses? I think "yes", but I'm not yet ready to switch the code. Adding SCC support will likely resolve this definitively.
- How should the 'require' functionality work? Should *that* be the only way to request results to ensure that passes always require things?
- How should preservation work?
- Probably some other things I'm forgetting. =]
Look forward to more patches in shorter order now that this is in place. llvm-svn: 194538
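The core behaviour listed in r194538 (explicit registration, cached results, invalidation by the pass manager) can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyAnalysisManager` and all of its members are hypothetical, analyses are identified by a string name, and every result is an `int`. It is not the interface added by the commit.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Toy analysis manager (hypothetical, not the LLVM class): analyses are
// registered explicitly, results are cached per analysis name, and
// invalidation drops the whole cache.
class ToyAnalysisManager {
public:
  using Analysis = std::function<int()>;

  // Explicit registration: nothing is registered or scheduled implicitly.
  void registerAnalysis(const std::string &name, Analysis run) {
    analyses_[name] = std::move(run);
  }

  // Lets a transform pass fail fast if a required analysis is missing.
  bool provides(const std::string &name) const {
    return analyses_.count(name) != 0;
  }

  // Returns the cached result, running the analysis only on a cache miss.
  int getResult(const std::string &name) {
    auto cached = cache_.find(name);
    if (cached != cache_.end())
      return cached->second;
    auto it = analyses_.find(name);
    if (it == analyses_.end())
      throw std::runtime_error("analysis not registered: " + name);
    int result = it->second();
    cache_[name] = result;
    return result;
  }

  // Called after a transform pass has run. With no preservation system,
  // every cached result is dropped.
  void invalidateAll() { cache_.clear(); }

private:
  std::map<std::string, Analysis> analyses_;
  std::map<std::string, int> cache_;
};
```

A preservation system, which the message lists as future work, would replace `invalidateAll` with something that keeps the results a transform declares it did not disturb.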
* Update the docs to match the function name.Nadav Rotem2013-11-131-1/+1
| | | | llvm-svn: 194537
* Replacing HUGE_VALF with llvm::huge_valf in order to work around a warning ↵Aaron Ballman2013-11-133-5/+6
| | | | | | | | triggered in MSVC 12. Patch reviewed by Reid Kleckner and Jim Grosbach. llvm-svn: 194533
* Remove always true flag.Rafael Espindola2013-11-122-7/+4
| | | | llvm-svn: 194530
* Cleanup the stackmap operand folding code and fix a corner case.Andrew Trick2013-11-121-6/+12
| | | | | | | I still don't know how to refer to the fixed operands symbolically. I plan to look into it. llvm-svn: 194529
* delinearization of arraysSebastian Pop2013-11-125-0/+653
| | | | llvm-svn: 194527
* Fold (iszero(A&K1) | iszero(A&K2)) -> (A&(K1|K2)) != (K1|K2) if we know ↵Nadav Rotem2013-11-121-3/+50
| | | | | | that K1 and K2 are 'one-hot' (only one bit is on). llvm-svn: 194525
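The fold in r194525 can be checked exhaustively on small values. The sketch below is not the InstCombine code; `isOneHot`, `unfolded`, `folded` and `foldHoldsForAllBytes` are hypothetical helpers that compare the two forms for every 8-bit `A` and every pair of one-hot constants.

```cpp
#include <cassert>

// True if exactly one bit of k is set.
bool isOneHot(unsigned k) { return k != 0 && (k & (k - 1)) == 0; }

// Original form: at least one of the masked values is zero.
bool unfolded(unsigned a, unsigned k1, unsigned k2) {
  return (a & k1) == 0 || (a & k2) == 0;
}

// Folded form: a single mask-and-compare.
bool folded(unsigned a, unsigned k1, unsigned k2) {
  unsigned mask = k1 | k2;
  return (a & mask) != mask;
}

// Compares both forms for all 8-bit values of a and all one-hot k1, k2.
// Returns true if they always agree.
bool foldHoldsForAllBytes() {
  for (unsigned k1 = 1; k1 < 256; k1 <<= 1) {
    for (unsigned k2 = 1; k2 < 256; k2 <<= 1) {
      assert(isOneHot(k1) && isOneHot(k2));
      for (unsigned a = 0; a < 256; ++a)
        if (unfolded(a, k1, k2) != folded(a, k1, k2))
          return false;
    }
  }
  return true;
}
```

The one-hot requirement matters: with `K1 = K2 = 3` and `A = 1`, the unfolded form is false while the folded form is true.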
* FoldBranchToCommonDest merges branches into a single branch with or/and of ↵Nadav Rotem2013-11-121-2/+7
| | | | | | the condition. It has a heuristic for estimating when some of the dependencies are processed by out-of-order processors. This patch adds another rule to the heuristic: if the "BonusInstruction" that we speculatively execute is used by the condition of the second branch, then it is okay to hoist it. This change exposes more opportunities for other passes to transform the code. It does not matter that much that we if-convert the code because the selectiondag builder splits or/and branches into multiple branches when profitable. llvm-svn: 194524
* [mips] Fix a bug in function CC_MipsO32_FP64. The second double precisionAkira Hatanaka2013-11-121-1/+1
| | | | | | | argument was not being passed in $f14. llvm-svn: 194522
* Add a FIXME for 32-bit q modifiers.Eric Christopher2013-11-121-0/+1
| | | | llvm-svn: 194515
* Protect user-supplied runtime library functions in LTOJustin Bogner2013-11-121-3/+46
| | | | | | | | | | | | | | Add user-supplied C runtime and compiler-rt library functions to llvm.compiler.used to protect them from premature optimization by passes like -globalopt and -ipsccp. Calls to (seemingly unused) runtime library functions can be added by -instcombine and instruction lowering. Patch by Duncan Exon Smith, thanks! Fixes <rdar://problem/14740087> llvm-svn: 194514
* ARM: diagnose invalid system LDM/STMTim Northover2013-11-121-0/+16
| | | | | | | | | | | | | The system LDM and STM instructions can't usually writeback to the base register. The one exception is when an LDM is actually an exception-return (i.e. contains PC in the register list). (There's already a test that "ldm sp!, {r0-r3, pc}^" works, which is why there is no positive test). rdar://problem/15223374 llvm-svn: 194512
* [mips] Revert part of r194510 that was accidentally committed.Akira Hatanaka2013-11-121-1/+1
| | | | llvm-svn: 194511
* [mips] Fix and re-enable a test case that has been disabled for a long time.Akira Hatanaka2013-11-121-1/+1
| | | | llvm-svn: 194510
* Correctly merge constants with explicit and implicit alignments.Rafael Espindola2013-11-121-4/+7
| | | | Constant merge can merge a constant with implicit alignment with one that has explicit alignment. Before this change it was assuming that the explicit alignment was higher than the implicit one, causing the result to be under-aligned in some cases. Fixes pr17815. Patch by Chris Smowton! llvm-svn: 194506
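The alignment problem behind r194506 can be sketched as follows. This is an illustrative model rather than the ConstantMerge code: the function names are hypothetical, and the model assumes that an explicit alignment of 0 means "not specified", in which case the type's implicit alignment applies.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model: an explicit alignment of 0 means "not specified",
// so the implicit alignment of the type is used instead.
unsigned effectiveAlignment(unsigned explicitAlign, unsigned implicitAlign) {
  assert(implicitAlign != 0 && "implicit alignment must be known");
  return explicitAlign != 0 ? explicitAlign : implicitAlign;
}

// Alignment for the constant that survives a merge. It must satisfy both
// originals, so take the larger of the two effective alignments instead of
// assuming the explicit one is always higher.
unsigned mergedAlignment(unsigned explicitA, unsigned explicitB,
                         unsigned implicitAlign) {
  return std::max(effectiveAlignment(explicitA, implicitAlign),
                  effectiveAlignment(explicitB, implicitAlign));
}
```

For example, merging a constant with explicit alignment 1 into one that relies on an implicit alignment of 4 must yield 4. Taking the explicit value alone would leave the result under-aligned.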
* [AArch64] Implemented AdvSIMD scalar x indexed element format and AdvSIMD scalarChad Rosier2013-11-124-31/+327
| | | | | | | | copy in MC layer. Added the MC layer tests. Fixed triple setting in test cases. Patch by Ana Pazos <apazos@codeaurora.org>. llvm-svn: 194501
* Expand rotate instructions on sparcv9 as well.Roman Divacky2013-11-121-0/+2
| | | | llvm-svn: 194500
* Simplify operand folding when rematerializing a load.Andrew Trick2013-11-121-1/+6
| | | | | | | | | | | | We already know how to fold a reload from a frameindex without analyzing the load instruction. Generalize this to handle any frameindex load. This streamlines the logic for rematerializing loads from stack arguments. As a side effect, it allows stackmaps to record a stack argument location without spilling it. Verified no effect on codegen for llvm test-suite. llvm-svn: 194497
* R600: Reenable llvm.R600.load.input/interp.input for compatibilityVincent Lejeune2013-11-122-0/+47
| | | | llvm-svn: 194484
* [mips][msa] Enable inline assembly for MSA.Daniel Sanders2013-11-122-9/+51
| | | | Like GCC, this re-uses the 'f' constraint and a new 'w' print-modifier: asm ("ldi.w %w0, 1" : "=f"(result)); Unlike GCC, the 'w' print-modifier is not _required_ to produce the intended output. This is a consequence of differences in the internal handling of the registers in each compiler. To be source-compatible between the compilers, users must use the 'w' print-modifier. MSA registers (including control registers) are supported in clobber lists. llvm-svn: 194476
* SimplifyCFG: Use existing constant folding logic when forming switch tables.Benjamin Kramer2013-11-121-31/+20
| | | | | | Both simpler and more powerful than the hand-rolled folding logic. llvm-svn: 194475
* [mips][msa] Fix buildbot failures caused by an unused variable when ↵Daniel Sanders2013-11-121-2/+1
| | | | | | assertions are disabled. llvm-svn: 194472
* [mips][msa] Added support for matching bclr, and bclri from normal IR (i.e. ↵Daniel Sanders2013-11-129-38/+161
| | | | | | not intrinsics) llvm-svn: 194471
* [ARM] Add support for FP_HP_extension build attributeBradley Smith2013-11-122-1/+7
| | | | llvm-svn: 194470
* [mips][msa] Added support for matching bset, bseti, bneg, and bnegi from ↵Daniel Sanders2013-11-122-70/+216
| | | | | | normal IR (i.e. not intrinsics) llvm-svn: 194469
* XCore target: fix bug in aligning 'byval i8*' on the stackRobert Lytton2013-11-121-1/+1
| | | | llvm-svn: 194466
* Add XCore support for ATOMIC_FENCE.Robert Lytton2013-11-123-1/+25
| | | | ATOMIC_FENCE is lowered to a compiler barrier which is codegen only. There is no need to emit any instructions since the XCore provides sequential consistency. Original patch by Richard Osborne llvm-svn: 194464
* XCore target: return error for unsupported alignmentRobert Lytton2013-11-121-0/+4
| | | | llvm-svn: 194463
* Change data structure to memoize computed result in ScalarEvolutionWan Xiaofei2013-11-121-22/+42
| | | | Replace std::map with SmallVector to memoize the cached results. A SCEV usually belongs to only a few loops or basic blocks, so a linear scan of a SmallVector is faster than a std::map lookup. Code reviewer: Andrew Trick. Test result: passes the unit tests and the LLVM test suite.
401.bzip2     0.425721   0.419981   101.37%
403.gcc       24.53855   24.2667    101.12%
429.mcf       0.060847   0.059944   101.51%
433.milc      0.646009   0.636119   101.55%
444.namd      1.383928   1.370614   100.97%
445.gobmk     5.836575   5.800225   100.63%
450.soplex    1.911257   1.895963   100.81%
456.hmmer     1.039565   1.032534   100.68%
458.sjeng     0.897401   0.885567   101.34%
464.h264ref   3.645908   3.577991   101.90%
470.lbm       0.049456   0.048398   102.19%
471.omnetpp   5.638575   5.60435    100.61%
bitmnp01      0.045738   0.045291   100.99%
cjpegv2data   0.304359   0.302833   100.50%
idctrn01      0.046433   0.045763   101.46%
quake2        4.534416   4.4952     100.87%
quake         2.688566   2.659208   101.10%
xcsoar        12.42545   12.30385   100.99%
linpack       0.038739   0.03803    101.86%
matrix01      0.053564   0.0528     101.45%
nbench        0.402867   0.395803   101.78%
tblook01      0.021265   0.021015   101.19%
ttsprk01      0.066384   0.065566   101.25%
llvm-svn: 194459
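The data-structure change in r194459 amounts to swapping a tree-based map for a small flat container with linear lookup. The sketch below shows the general idea only: `SmallLinearCache` is a hypothetical class, and `std::vector` stands in for `llvm::SmallVector` so that the example has no LLVM dependency.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Small memoization cache backed by a flat vector with linear lookup.
// With only a handful of entries this avoids the per-node allocations
// and pointer chasing of a tree-based map.
template <typename Key, typename Value>
class SmallLinearCache {
public:
  // Returns a pointer to the cached value, or nullptr on a miss.
  // The pointer is only valid until the next insert.
  const Value *lookup(const Key &key) const {
    for (const auto &entry : entries_)
      if (entry.first == key)
        return &entry.second;
    return nullptr;
  }

  // Stores value for key, overwriting any previous entry for that key.
  void insert(const Key &key, const Value &value) {
    for (auto &entry : entries_) {
      if (entry.first == key) {
        entry.second = value;
        return;
      }
    }
    entries_.emplace_back(key, value);
  }

private:
  std::vector<std::pair<Key, Value>> entries_;
};
```

Whether the linear scan wins depends on the entry count staying small, which is the assumption the commit message makes about SCEV expressions.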
* Correct a glitch in r194424 which may invalidate iterator.Shuxin Yang2013-11-121-1/+3
| | | | llvm-svn: 194457
* llvm-cov: Added call to update run/program counts.Yuchen Wu2013-11-121-0/+8
| | | | | | Also updated test files that were generated from this change. llvm-svn: 194453
* R600/SI: Change formatting of printed registers.Matt Arsenault2013-11-122-2/+64
| | | | | | | | | | | | | | | | | | | | | | | Print the range of registers used with a single letter prefix. This better matches what the shader compiler produces and is overall less obnoxious than concatenating all of the subregister names together. Instead of SGPR0, it will print s0. Instead of SGPR0_SGPR1, it will print s[0:1] and so on. There doesn't appear to be a straightforward way to get the actual register info in the InstPrinter, so this parses the generated name to print with the new syntax. The required test changes are pretty nasty, and register matching regexes are now worse. Since there isn't a way to add to a variable in FileCheck, some of the tests now don't check the exact number of registers used, but I don't think that will be a real problem. llvm-svn: 194443
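The name rewriting described in r194443 can be sketched as a small string transformation. This is not the InstPrinter code: `shortRegisterName` is a hypothetical helper that handles only the `SGPR` prefix mentioned in the message and returns any other name unchanged.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Converts a generated register name such as "SGPR0" or "SGPR0_SGPR1"
// into the short form "s0" or "s[0:1]". Names that do not match the
// expected pattern are returned unchanged.
std::string shortRegisterName(const std::string &name) {
  const std::string prefix = "SGPR";
  if (name.compare(0, prefix.size(), prefix) != 0)
    return name;

  // First register index: the digits right after the leading prefix.
  std::size_t pos = prefix.size();
  std::size_t end = pos;
  while (end < name.size() &&
         std::isdigit(static_cast<unsigned char>(name[end])))
    ++end;
  if (end == pos)
    return name; // no index present
  const std::string first = name.substr(pos, end - pos);
  if (end == name.size())
    return "s" + first; // single register

  // Last register index: the digits after the final "_SGPR".
  std::size_t lastSep = name.rfind("_" + prefix);
  if (lastSep == std::string::npos)
    return name;
  const std::string last = name.substr(lastSep + 1 + prefix.size());
  if (last.empty())
    return name;
  for (char c : last)
    if (!std::isdigit(static_cast<unsigned char>(c)))
      return name;
  return "s[" + first + ":" + last + "]";
}
```

Only the first and last indices are read, so the sketch assumes the concatenated subregisters form a contiguous range.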
* Change the default branch instruction to be the 16 bit variety for mips16.Reed Kotler2013-11-123-5/+26
| | | | | | | | | | | This has no material effect at this time since we don't have a direct object emitter for mips16 and the assembler can't tell them apart. I place a comment "16 bit inst" for those so that I can tell them apart in the output. The constant island pass has only been minimally changed to allow this. More complete branch work is forthcoming but this is the first step. llvm-svn: 194442
* Extract a bc attr parsing helper that returns Attribute::None on errorReid Kleckner2013-11-121-78/+49
| | | | | | | The parsing method still returns llvm::error_code for consistency with other parsing methods. Minor cleanup, no functionality change. llvm-svn: 194437
* Lower X86::MORESTACK_RET and X86::MORESTACK_RET_RESTORE_R10 inLang Hames2013-11-111-12/+12
| | | | | | | | | | | | | | X86AsmPrinter::EmitInstruction, rather than X86MCInstLower::Lower. The aim is to improve the reusability of the X86MCInstLower class by making it more function-like. The X86::MORESTACK_RET_RESTORE_R10 pseudo broke the function model by emitting an extra instruction to the MCStreamer attached to the AsmPrinter. The patch should have no impact on generated code. llvm-svn: 194431
* Fix the recently added anyregcc convention to handle spilled operands.Andrew Trick2013-11-111-1/+10
| | | | | | | | | | | | Fixes <rdar://15432754> [JS] Assertion: "Folded a def to a non-store!" The primary purpose of anyregcc is to prevent a patchpoint's call arguments and return value from being spilled. They must be available in a register, although the calling convention does not pin the register. It's up to the front end to avoid using this convention for calls with more arguments than allocatable registers. llvm-svn: 194428