path: root/llvm/lib
Commit message [Author; Age; Files; Lines]
* Add support for lowercase variants. [Rafael Espindola; 2011-01-23; files: 1; lines: -0/+14]
  llvm-svn: 124071
* Enhance SRoA to be more aggressive about scalarization of aggregate allocas [Chris Lattner; 2011-01-23; files: 1; lines: -12/+114]
  that have PHI or select uses of their element pointers. This can often
  happen when instcombine sinks two loads into a successor, inserting a phi
  or select. With this patch, we can scalarize the alloca, but the pinned
  elements are not yet promoted. This is still a win for large aggregates
  where only one element is used. This fixes rdar://8904039 and part of
  rdar://7339113 (poor codegen on stringswitch).
  llvm-svn: 124070
* Convert two std::vectors to SmallVectors for a 3.4% speedup running -scalarrepl [Cameron Zwarich; 2011-01-23; files: 1; lines: -2/+2]
  on test-suite + SPEC2000 & SPEC2006.
  llvm-svn: 124068
* have AllocaInfo store the alloca being inspected, simplifying callers. [Chris Lattner; 2011-01-23; files: 1; lines: -22/+24]
  No functionality change.
  llvm-svn: 124067
* Rearrange some code a bit. Change MarkUnsafe to [Chris Lattner; 2011-01-23; files: 1; lines: -27/+29]
  handle the "Transformation preventing inst" printing, so that
  -scalarrepl -debug will always print the rejected instruction.
  No functionality change.
  llvm-svn: 124066
* remove an old hack that avoided creating MMX datatypes. The [Chris Lattner; 2011-01-23; files: 1; lines: -22/+1]
  X86 backend has been fixed.
  llvm-svn: 124064
* Use value ranges to fold ext(trunc) in SCEV when possible. [Nick Lewycky; 2011-01-23; files: 1; lines: -0/+34]
  llvm-svn: 124062
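  Note on r124062: the idea behind folding ext(trunc) from a value range can
  be illustrated outside the compiler. The sketch below is not SCEV code; the
  helper names and the 32-bit/8-bit widths are assumptions chosen for
  illustration. It shows that zext(trunc(x)) equals x whenever the known
  unsigned range of x fits in the narrow type.

```cpp
#include <cassert>
#include <cstdint>

// zext(trunc(x)): truncate a 32-bit value to 8 bits, then zero-extend it back.
constexpr uint32_t zextOfTrunc(uint32_t x) {
    return static_cast<uint32_t>(static_cast<uint8_t>(x));
}

// Hypothetical range test (not an LLVM API): the fold zext(trunc(x)) -> x is
// safe when the known unsigned range [lo, hi] of x fits in the narrow type.
constexpr bool rangeFitsInU8(uint32_t lo, uint32_t hi) {
    return lo <= hi && hi <= 0xFFu;
}

// Brute-force confirmation that the fold is exact for every x in [lo, hi].
inline bool foldIsExactOnRange(uint32_t lo, uint32_t hi) {
    assert(lo <= hi);
    for (uint32_t x = lo;; ++x) {
        if (zextOfTrunc(x) != x) return false;
        if (x == hi) break;
    }
    return true;
}
```

  With these helpers, a range of [0, 255] passes both checks, while [0, 256]
  fails because 256 truncates to 0.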
* Delay the creation of eh_frame so that the user can change the defaults. [Rafael Espindola; 2011-01-23; files: 4; lines: -21/+26]
  Add support for SHT_X86_64_UNWIND.
  llvm-svn: 124059
* Remove more duplicated code. [Rafael Espindola; 2011-01-23; files: 10; lines: -111/+111]
  llvm-svn: 124056
* Remove duplicated code. [Rafael Espindola; 2011-01-23; files: 11; lines: -79/+90]
  llvm-svn: 124054
* Have SCEV turn sext(x) into zext(x) when x is s>= 0. This applies many times in [Nick Lewycky; 2011-01-22; files: 1; lines: -0/+4]
  "make check" alone.
  llvm-svn: 124046
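  Note on r124046: a minimal sketch of why sext(x) can become zext(x) when x
  is known to be non-negative. This is plain C++ arithmetic, not SCEV code;
  the function names and the 32-to-64-bit widths are assumptions made for the
  example.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend a 32-bit value to 64 bits.
constexpr int64_t sext(int32_t x) { return static_cast<int64_t>(x); }

// Zero-extend the same 32 bits to 64 bits.
constexpr uint64_t zext(int32_t x) {
    return static_cast<uint64_t>(static_cast<uint32_t>(x));
}

// True when both extensions produce the same 64-bit pattern.
constexpr bool sameBits(int32_t x) {
    return static_cast<uint64_t>(sext(x)) == zext(x);
}

static_assert(sameBits(0), "zero extends identically");
static_assert(sameBits(INT32_MAX), "largest non-negative value agrees");
static_assert(!sameBits(-1), "negative values differ in the upper bits");

// Spot-check a few non-negative values at run time.
inline void checkNonNegativeSamples() {
    const int32_t samples[] = {0, 1, 127, 65536, INT32_MAX};
    for (int32_t x : samples) assert(sameBits(x));
}
```

  The two extensions only disagree when the sign bit is set, which is exactly
  the case the "s>= 0" condition rules out.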
* Add a FIXME explaining the move to a single indirect call bonus per function [Eric Christopher; 2011-01-22; files: 1; lines: -0/+5]
  that we can change from indirect to direct.
  llvm-svn: 124045
* Only apply the devirtualization bonus once instead of per-call site in the [Eric Christopher; 2011-01-22; files: 1; lines: -2/+6]
  target function. Fixes part of rdar://8546196
  llvm-svn: 124044
* Pass sret arguments through the stack instead of through registers in Sparc [Venkatraman Govindaraju; 2011-01-22; files: 3; lines: -4/+75]
  backend. It makes the code generated more compliant with the sparc32 ABI.
  llvm-svn: 124030
* Added ICC, FCC as uses of movcc instruction to generate correct code when [Venkatraman Govindaraju; 2011-01-22; files: 1; lines: -42/+51]
  -mattr=v9 is used.
  llvm-svn: 124027
* Actually check memcpy lengths, instead of just commenting about [Dan Gohman; 2011-01-21; files: 1; lines: -2/+4]
  how they should be checked.
  llvm-svn: 123999
* Sparc backend: [Venkatraman Govindaraju; 2011-01-21; files: 3; lines: -23/+28]
  Rename FLUSH to FLUSHW. Output "ta 3" instead of a "flushw" instruction
  if v8 instruction set is used.
  llvm-svn: 123997
* Just because we have determined that an (fcmp | fcmp) is true for A < B, [Owen Anderson; 2011-01-21; files: 1; lines: -1/+3]
  A == B, and A > B, does not mean we can fold it to true. We still need to
  check for A ? B (A unordered B).
  llvm-svn: 123993
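  Note on r123993: the unordered case can be reproduced with ordinary
  floating-point comparisons. The sketch below is not the instcombine code;
  the choice of the two predicates (ordered less-than and ordered
  greater-or-equal) and the function names are assumptions made for
  illustration.

```cpp
#include <cassert>
#include <cmath>

// Models (fcmp olt A, B) | (fcmp oge A, B): two ordered comparisons OR'ed.
inline bool orderedLtOrGe(double a, double b) { return (a < b) || (a >= b); }

// A and B are unordered when either operand is NaN.
inline bool isUnordered(double a, double b) {
    return std::isnan(a) || std::isnan(b);
}

inline void usageExample() {
    const double nan = std::nan("");
    assert(orderedLtOrGe(1.0, 2.0));   // A < B
    assert(orderedLtOrGe(2.0, 2.0));   // A == B
    assert(orderedLtOrGe(3.0, 2.0));   // A > B
    assert(!orderedLtOrGe(nan, 2.0));  // A unordered B: the OR is false
    assert(isUnordered(nan, 2.0));
}
```

  The expression holds for all three ordered outcomes yet is false when an
  operand is NaN, so it cannot be folded to true without the extra check.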
* Last round of fixes for movw + movt global address codegen. [Evan Cheng; 2011-01-21; files: 9; lines: -75/+136]
  1. Fixed ARM pc adjustment.
  2. Fixed dynamic-no-pic codegen
  3. CSE of pc-relative load of global addresses.
  It's now enabled by default for Darwin.
  llvm-svn: 123991
* Clang was not parsing target triples involving EABI and was generating wrong [Renato Golin; 2011-01-21; files: 1; lines: -3/+33]
  IR (wrong PCS) and passing the wrong information down llc via the
  target-triple printed in IR. I've fixed this by adding the parsing of EABI
  into LLVM's Triple class and using it to choose the correct PCS in Clang's
  Tools. A Clang patch is on its way to use this infrastructure.
  llvm-svn: 123990
* Handles libffi on the CMake build. [Oscar Fuentes; 2011-01-21; files: 1; lines: -0/+16]
  Patch by arrowdodger!
  llvm-svn: 123976
* Fix the encoding of QADD/SUB, QDADD/SUB. While qadd16, qadd8 use "rd, rn, rm", [Bruno Cardoso Lopes; 2011-01-21; files: 2; lines: -15/+24]
  qadd and qdadd use "rd, rm, rn"; the same applies to the 'sub' variants.
  This is described in ARM manuals and matches the encoding used by the gnu
  assembler.
  llvm-svn: 123975
* Implement support for byval arguments in Sparc backend. [Venkatraman Govindaraju; 2011-01-21; files: 1; lines: -1/+31]
  llvm-svn: 123974
* SCCP doesn't actually preserve the CFG. It will delete and insert terminator [Nick Lewycky; 2011-01-21; files: 1; lines: -4/+0]
  instructions.
  llvm-svn: 123973
* Enable support for precise scheduling of the instruction selection [Andrew Trick; 2011-01-21; files: 1; lines: -1/+1]
  DAG. Disable using "-disable-sched-cycles".

  For ARM, this enables a framework for modeling the cpu pipeline and
  counting stalls. It also activates several heuristics to drive scheduling
  based on the model. Scheduling is inherently imprecise at this stage, and
  until spilling is improved it may defeat attempts to schedule. However,
  this framework provides greater control over tuning codegen.

  Although the flag is not target-specific, it should have very little
  effect on the default scheduler used by x86. The only two changes that
  affect x86 are:
  - scheduling a high-latency operation bumps the current cycle so
    independent operations can have their latency covered. i.e. two
    independent 4 cycle operations can produce results in 4 cycles, not 8
    cycles.
  - Two operations with equal register pressure impact and no latency-based
    stalls on their uses will be prioritized by depth before height (height
    is irrelevant if no stalls occur in the schedule below this point).
  llvm-svn: 123971
* Convert -enable-sched-cycles and -enable-sched-hazard to -disable [Andrew Trick; 2011-01-21; files: 4; lines: -43/+56]
  flags. They are still not enabled in this revision.

  Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with the
  scheduler's model of operand latency in the selection DAG.

  Generalized unit tests to work with sched-cycles.
  llvm-svn: 123969
* fix PR9013, an infinite loop in instcombine. [Chris Lattner; 2011-01-21; files: 1; lines: -2/+10]
  llvm-svn: 123968
* update obsolete comment. [Chris Lattner; 2011-01-21; files: 1; lines: -4/+3]
  llvm-svn: 123965
* Don't try to pull vector bitcasts that change the number of elements through [Nick Lewycky; 2011-01-21; files: 1; lines: -3/+17]
  a select. A vector select is pairwise on each element so we'd need a new
  condition with the right number of elements to select on. Fixes PR8994.
  llvm-svn: 123963
* Object: Fix type punned pointer issues by making DataRefImpl a union and [Michael J. Spencer; 2011-01-21; files: 2; lines: -83/+68]
  using intptr_t.
  llvm-svn: 123962
* Add a constant folding of casts from zero to zero. Fixes PR9011! [Nick Lewycky; 2011-01-21; files: 1; lines: -0/+4]
  While here, I'd like to complain about how vector is not an aggregate type
  according to llvm::Type::isAggregateType(), but they're listed under
  aggregate types in the LangRef and zero vectors are stored as
  ConstantAggregateZero.
  llvm-svn: 123956
* Don't be overly aggressive with CSE of "ldr constantpool". If it's a pc-relative [Evan Cheng; 2011-01-20; files: 1; lines: -5/+1]
  value, the "add pc" must be CSE'ed at the same time. We could follow the
  same approach as T2 by adding pseudo instructions that combine the ldr +
  "add pc". But the better approach is to use movw + movt (which I will
  enable soon), so I'll leave this as a TODO.
  llvm-svn: 123949
* Implement requiredTransitive [Tobias Grosser; 2011-01-20; files: 1; lines: -1/+32]
  The PassManager did not implement the transitivity of requiredTransitive.
  This was unnoticed since 2006.
  llvm-svn: 123942
* Fix the encoding and parsing of clrex instruction [Bruno Cardoso Lopes; 2011-01-20; files: 2; lines: -5/+9]
  llvm-svn: 123936
* Change instruction names for consistency [Bruno Cardoso Lopes; 2011-01-20; files: 1; lines: -4/+6]
  llvm-svn: 123930
* Add cdp/cdp2 instructions for thumb/thumb2 [Bruno Cardoso Lopes; 2011-01-20; files: 3; lines: -1/+51]
  llvm-svn: 123929
* - Use a more appropriate name for Owen's ARM Parser isMCR hack since the [Bruno Cardoso Lopes; 2011-01-20; files: 2; lines: -26/+60]
  same operands can be present in cdp/cdp2 instructions. Also increase the
  hack with cdp/cdp2 instructions.
  - Fix the encoding of cdp/cdp2 instructions for ARM (no thumb and thumb2
  yet) and add testcases for them.
  llvm-svn: 123927
* SplitKit requires that all defs are in place before calling useIntv(). [Jakob Stoklund Olesen; 2011-01-20; files: 1; lines: -10/+22]
  The value mapping gets confused about which original values have multiple
  new definitions so they may need phi insertions.

  This could probably be simplified by letting enterIntvBefore() take a live
  range to be added following the instruction. As long as the range stays
  inside the same basic block, value mapping shouldn't be a problem.
  llvm-svn: 123926
* Add LiveIntervalMap::dumpCache() to print out the cache used by the ssa [Jakob Stoklund Olesen; 2011-01-20; files: 2; lines: -0/+24]
  update algorithm.
  llvm-svn: 123925
* Add mcr*2 and mr*c2 support to thumb2 targets [Bruno Cardoso Lopes; 2011-01-20; files: 2; lines: -0/+62]
  llvm-svn: 123919
* Add mcr* and mr*c support to thumb targets [Bruno Cardoso Lopes; 2011-01-20; files: 3; lines: -2/+68]
  llvm-svn: 123917
* Allow sign-extending of i8 and i16 to i128 on SPU. [Kalle Raiskila; 2011-01-20; files: 2; lines: -1/+7]
  llvm-svn: 123912
* At -O123 the early-cse pass is run before instcombine has run. According to my [Duncan Sands; 2011-01-20; files: 2; lines: -32/+173]
  auto-simplifier the transform most missed by early-cse is
  (zext X) != 0 -> X != 0. This patch adds this transform and some related
  logic to InstructionSimplify and removes some of the logic from
  instcombine (unfortunately not all because there are several situations in
  which instcombine can improve things by making new instructions, whereas
  instsimplify is not allowed to do this).

  At -O2 this often results in more than 15% more simplifications by
  early-cse, and results in hundreds of lines of bitcode being eliminated
  from the testsuite. I did see some small negative effects in the
  testsuite, for example a few additional instructions in three programs.
  One program, 483.xalancbmk, got an additional 35 instructions, which seems
  to be due to a function getting an additional instruction and then being
  inlined all over the place.
  llvm-svn: 123911
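  Note on r123911: the transform (zext X) != 0 -> X != 0 is easy to confirm
  exhaustively for a small type. The sketch below is not InstructionSimplify
  code; the function names and the 8-to-32-bit widths are assumptions made
  for the example.

```cpp
#include <cassert>
#include <cstdint>

// Original form: compare the zero-extended value against zero.
constexpr bool beforeSimplify(uint8_t x) {
    return static_cast<uint32_t>(x) != 0u;
}

// Simplified form: compare the narrow value directly.
constexpr bool afterSimplify(uint8_t x) { return x != 0; }

// Exhaustively compare both forms over every 8-bit input.
inline bool equivalentForAllInputs() {
    for (unsigned v = 0; v <= 0xFFu; ++v) {
        const uint8_t x = static_cast<uint8_t>(v);
        if (beforeSimplify(x) != afterSimplify(x)) return false;
    }
    return true;
}

inline void usageExample() { assert(equivalentForAllInputs()); }
```

  Zero extension only adds zero bits, so it cannot turn a zero value into a
  non-zero one or the reverse. The simplified form needs no new instruction,
  which is consistent with the constraint on instsimplify mentioned above.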
* Refactor mcr* and mr*c instructions into classes with the same encoding. No [Bruno Cardoso Lopes; 2011-01-20; files: 1; lines: -108/+46]
  functionality change.
  llvm-svn: 123910
* My editor's indent went crazy. Fix. [Eric Christopher; 2011-01-20; files: 1; lines: -1/+1]
  llvm-svn: 123909
* Expand invalid return values for umulo and smulo. Handle these similarly [Eric Christopher; 2011-01-20; files: 2; lines: -0/+28]
  to add/sub by doing the normal operation and then checking for overflow
  afterwards. This generally relies on the DAG handling the later invalid
  operations as well.

  Fixes the 64-bit part of rdar://8622122 and rdar://8774702.
  llvm-svn: 123908
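  Note on r123908: "do the normal operation, then check for overflow" can be
  sketched for the unsigned case in plain C++. This is not the DAG expansion
  from the patch; the struct, the function name and the division-based check
  are assumptions chosen for illustration, and the signed (smulo) case is
  left out because signed overflow is undefined behaviour in C++.

```cpp
#include <cassert>
#include <cstdint>

// Result of an unsigned multiply together with its overflow flag.
struct MulResult {
    uint64_t value;
    bool overflow;
};

// Multiply first (unsigned arithmetic wraps modulo 2^64), then detect
// overflow afterwards by checking that dividing the product by a gives b.
inline MulResult umulWithOverflow(uint64_t a, uint64_t b) {
    const uint64_t product = a * b;
    const bool overflow = (a != 0) && (product / a != b);
    return {product, overflow};
}

inline void usageExample() {
    const MulResult small = umulWithOverflow(3, 4);
    assert(small.value == 12 && !small.overflow);

    const MulResult big = umulWithOverflow(UINT64_MAX, 2);
    assert(big.overflow);
}
```

  The wrapped product is still returned when overflow is set, so a caller
  gets both the low bits and the flag from one call.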
* Correct itinerary entry for t2MOV_pic_ga_add_pc. [Evan Cheng; 2011-01-20; files: 1; lines: -1/+1]
  llvm-svn: 123907
* Sorry, several patches in one. [Evan Cheng; 2011-01-20; files: 14; lines: -210/+291]
  TargetInstrInfo:
  Change produceSameValue() to take MachineRegisterInfo as an optional
  argument. When in SSA form, targets can use it to make more aggressive
  equality analysis.

  Machine LICM:
  1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
  2. Fix a bug which prevented CSE of instructions which are not
     re-materializable.
  3. Use improved form of produceSameValue.

  ARM:
  1. Teach ARM produceSameValue to look past some PIC labels.
  2. Look for operands from different loads of different constant pool
     entries which have same values.
  3. Re-implement PIC GA materialization using movw + movt. Combine the pair
     with a "add pc" or "ldr [pc]" to form pseudo instructions. This makes
     it possible to re-materialize the instruction, allows machine LICM to
     hoist the set of instructions out of the loop and makes it possible to
     CSE them. It's a bit hacky, but it significantly improves code quality.
  4. Some minor bug fixes as well.

  With the fixes, using movw + movt to materialize GAs significantly
  outperforms the load from constantpool method. 186.crafty and 255.vortex
  improved > 20%, 254.gap and 176.gcc ~10%.
  llvm-svn: 123905
* Object: Add ELF support. [Michael J. Spencer; 2011-01-20; files: 3; lines: -1/+708]
  llvm-svn: 123896
* Object: Add COFF Support. [Michael J. Spencer; 2011-01-20; files: 3; lines: -1/+372]
  llvm-svn: 123895