path: root/llvm/lib
Commit message | Author | Age | Files | Lines
* AMDGPU: Rework how private buffer passed for HSAMatt Arsenault2015-11-309-102/+594
| | | | | | | | | | | | | | | | If we know we have stack objects, we reserve the registers that the private buffer resource and wave offset are passed in and use them directly. If not, reserve the last 5 SGPRs just in case we need to spill. After register allocation, try to pick the next available registers instead of the last SGPRs, and then insert copies from the inputs to the reserved registers in the prologue. This also selectively enables only the input registers that are really required instead of always enabling all of them. llvm-svn: 254331
* AMDGPU: Rename enums to be consistent with HSA code object terminologyMatt Arsenault2015-11-305-50/+49
| | | | llvm-svn: 254330
* AMDGPU: Remove SIPrepareScratchRegsMatt Arsenault2015-11-3012-249/+123
| | | | | | | | | | | | | | | | | | | | | | It does not work because of emergency stack slots. This pass was supposed to eliminate dummy registers for the spill instructions, but the register scavenger can introduce more during PrologEpilogInserter, so some would end up left behind if they were needed. The potential for spilling the scratch resource descriptor and offset register makes doing something like this overly complicated. Reserve registers to use for the resource descriptor and use them directly in eliminateFrameIndex. Also removes creating another scratch resource descriptor when directly selecting scratch MUBUF instructions. The choice of which registers are reserved is temporary. For now it attempts to pick the next available registers after the user and system SGPRs. llvm-svn: 254329
* AMDGPU: Use assert zext for workgroup sizesMatt Arsenault2015-11-302-10/+24
| | | | llvm-svn: 254328
* [ARM] For old thumb ISA like v4t, we cannot use PC directly in pop.Quentin Colombet2015-11-301-18/+5
| | | | | | Fix the epilogue emission to account for that. llvm-svn: 254325
* [SimplifyLibCalls] Transform log(exp2(y)) to y*log(2) under fast-math.Davide Italiano2015-11-301-1/+9
| | | | llvm-svn: 254317
* [X86] Add RIP to GR64_TCW64David Majnemer2015-11-301-1/+1
| | | | | | | | | The MachineVerifier wants to check that the register operands of an instruction belong to the instruction's register class. RIP-relative control flow instructions violated this by referencing RIP. While this was fixed for SysV, it was never fixed for Win64. llvm-svn: 254315
* Enable shrink wrapping for PPC64Kit Barton2015-11-301-6/+14
| | | | | | | | | Re-enable shrink wrapping for PPC64 Little Endian. One minor modification to PPCFrameLowering::findScratchRegister was necessary to handle fall-through blocks (blocks with no terminator) correctly. Tested with all LLVM tests, clang tests, and the self-hosting build, with no problems found. Phabricator: http://reviews.llvm.org/D14778 llvm-svn: 254314
* Fix another llvm.ctors merging bug.Rafael Espindola2015-11-301-2/+3
| | | | | | | We were not looking past casts to see if an element should be included or not. llvm-svn: 254313
* [WebAssembly] Fix a few minor compiler warnings. NFC.Dan Gohman2015-11-301-7/+7
| | | | llvm-svn: 254311
* fix formatting; NFCSanjay Patel2015-11-301-6/+7
| | | | llvm-svn: 254310
* [Hexagon] NFC Reordering headers.Colin LeMahieu2015-11-301-1/+1
| | | | llvm-svn: 254307
* AMDGPU: Don't reserve SCRATCH_PTR input registerMatt Arsenault2015-11-301-12/+4
| | | | | | This hasn't been doing anything since using relocations was added. llvm-svn: 254304
* Silencing a 32-bit to 64-bit implicit conversion warning; NFC.Aaron Ballman2015-11-301-1/+1
| | | | llvm-svn: 254302
* [mips][microMIPS] Implement LBUX, LHX, LWX, MAQ_S[A].W.PHL, MAQ_S[A].W.PHR, MFHI, MFLO, MTHI and MTLO instructionsHrvoje Varga2015-11-303-11/+75
| | | | | | Differential Revision: http://reviews.llvm.org/D14436 llvm-svn: 254297
* [mips][microMIPS] Fix issue with offset operand of BALC and BC instructionsZoran Jovanovic2015-11-304-2/+50
| | | | | The offset operand of the microMIPS BALC and BC instructions is currently shifted by 2 bits, but it should be shifted by 1 bit. Differential Revision: http://reviews.llvm.org/D14770 llvm-svn: 254296
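A minimal sketch of why the shift amount matters when a branch offset field is decoded. This is not the code from the patch: `decodeBranchOffset` is a hypothetical helper, and the 26-bit field width used in the usage note is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the LLVM sources): sign-extend the low
// `Bits` bits of an encoded field and scale the result by 2^Shift to get
// a byte offset.
int64_t decodeBranchOffset(uint32_t Field, unsigned Bits, unsigned Shift) {
  assert(Bits > 0 && Bits < 32 && "field width out of range");
  const uint32_t SignBit = 1u << (Bits - 1);
  const uint32_t Masked = Field & ((1u << Bits) - 1);
  // Standard XOR/subtract trick for sign extension of a Bits-wide value.
  const int64_t Signed =
      static_cast<int64_t>(Masked ^ SignBit) - static_cast<int64_t>(SignBit);
  return Signed * (int64_t{1} << Shift);
}
```

With an assumed 26-bit field, `decodeBranchOffset(4, 26, 1)` yields 8 while `decodeBranchOffset(4, 26, 2)` yields 16, so using the wrong shift doubles every branch distance.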
* [mips][microMIPS] Implement PRECR.QB.PH, PRECR_SRA[_R].PH.W, PRECRQ.PH.W, PRECRQ.QB.PH, PRECRQU_S.QB.PH and PRECRQ_RS.PH.W instructionsZlatko Buljan2015-11-303-7/+41
| | | | | | Differential Revision: http://reviews.llvm.org/D14605 llvm-svn: 254291
* Revert r254279 "[X86] Use ArrayRef. NFC". It seems to have upset an MSVC build bot.Craig Topper2015-11-301-4/+7
| | | | llvm-svn: 254280
* [X86] Use ArrayRef. NFCCraig Topper2015-11-301-7/+4
| | | | llvm-svn: 254279
* [AVX512] The vpermi2 instructions require an integer vector for the index vector.Craig Topper2015-11-303-35/+66
| | | | | | This is reflected correctly in the intrinsics, but was not reflected in the isel patterns. For the floating point types, this requires adding a bitcast to the index vector when it is passed through to the output. llvm-svn: 254277
* [SCEV] Use lambda instead of std::bind; NFCSanjoy Das2015-11-291-2/+3
| | | | | | The lambda is more readable. llvm-svn: 254276
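As an illustration of the readability argument, here is a hedged sketch comparing the two forms. The predicate `isAtLeast` and both counting functions are hypothetical names invented for this example; they are not taken from the SCEV code.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical two-argument predicate standing in for the real callee.
static bool isAtLeast(int Threshold, int Value) { return Value >= Threshold; }

// std::bind form: the first argument is fixed, the second is a placeholder.
int countWithBind(const std::vector<int> &Values, int Threshold) {
  using namespace std::placeholders;
  auto Pred = std::bind(isAtLeast, Threshold, _1);
  return static_cast<int>(std::count_if(Values.begin(), Values.end(), Pred));
}

// Lambda form: the capture and the call are spelled out explicitly.
int countWithLambda(const std::vector<int> &Values, int Threshold) {
  auto Pred = [Threshold](int Value) { return isAtLeast(Threshold, Value); };
  return static_cast<int>(std::count_if(Values.begin(), Values.end(), Pred));
}

// Usage sketch: both forms compute the same result.
void checkBindAndLambdaAgree() {
  const std::vector<int> Values = {1, 5, 7};
  assert(countWithBind(Values, 5) == countWithLambda(Values, 5));
}
```

The lambda names its parameter and shows the argument order at the call site, which is the property the commit message calls more readable.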
* [SCEV] Use range version of all_of; NFCSanjoy Das2015-11-291-13/+10
| | | | llvm-svn: 254275
* [X86] Remove duplicate entries from intrinsics tables and add asserts to verify there are no others.Craig Topper2015-11-291-22/+7
| | | | llvm-svn: 254274
* [WebAssembly] Delete an obsolete TODO comment.Dan Gohman2015-11-291-1/+0
| | | | llvm-svn: 254272
* [WebAssembly] Set several MCInstrDesc flags.Dan Gohman2015-11-294-0/+20
| | | | llvm-svn: 254271
* [X86] int_x86_avx2_permps and X86ISD::VPERMV should take an integer vector for its shuffle indices.Craig Topper2015-11-292-4/+6
| | | | llvm-svn: 254269
* [WebAssembly] Delete unused functions. NFC.Dan Gohman2015-11-291-6/+0
| | | | llvm-svn: 254268
* [WebAssembly] Minor clang-format and selected clang-tidy cleanups. NFC.Dan Gohman2015-11-2913-64/+68
| | | | llvm-svn: 254267
* fix typos in comments; NFCSanjay Patel2015-11-291-6/+8
| | | | llvm-svn: 254266
* [SimplifyLibCalls] Don't crash if the function doesn't have a name.Davide Italiano2015-11-291-3/+2
| | | | llvm-svn: 254265
* [SimplifyLibCalls] Cross out implemented transformations.Davide Italiano2015-11-291-2/+0
| | | | llvm-svn: 254264
* [SimplifyLibCalls] Transform log(pow(x, y)) -> y*log(x).Davide Italiano2015-11-291-5/+50
| | | | | | | | | | | | | | | | | | This one is enabled only under -ffast-math. There are cases where the difference between the computed value and the correct value is huge even for -ffast-math, e.g. as Steven pointed out, with x = -1, y = -4: log(pow(-1, -4)) = 0, but -4*log(-1) = NaN. I checked what GCC does and apparently it does the same optimization (which results in the same dramatic difference). Future work might try to make this (slightly) less bad. Differential Revision: http://reviews.llvm.org/D14400 llvm-svn: 254263
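The divergence for negative bases can be reproduced with a small sketch. The function names below are hypothetical, and the code only evaluates both expressions directly; it is not how SimplifyLibCalls performs the rewrite.

```cpp
#include <cassert>
#include <cmath>

// Original expression: log(pow(x, y)).
double logOfPow(double X, double Y) { return std::log(std::pow(X, Y)); }

// Rewritten expression: y * log(x).
double scaledLog(double X, double Y) { return Y * std::log(X); }

// Usage sketch: for a positive base the two forms agree closely; for
// x = -1, y = -4 the first is finite and the second is NaN.
void checkLogPowForms() {
  assert(std::fabs(logOfPow(2.0, 3.0) - scaledLog(2.0, 3.0)) < 1e-12);
  assert(logOfPow(-1.0, -4.0) == 0.0);      // pow(-1, -4) == 1, log(1) == 0
  assert(std::isnan(scaledLog(-1.0, -4.0))); // log(-1) is NaN
}
```

This sketch must be compiled without fast-math flags, otherwise the compiler may apply the same rewrite to both functions.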
* SamplePGO - Do not use std::to_string in diagnostics.Diego Novillo2015-11-291-12/+17
| | | | | | This fixes buildbots on systems where std::to_string is not present. It also tidies the output of the diagnostic to render doubles a bit better (thanks Ben Kramer for help with string streams and format). llvm-svn: 254261
* Use a lambda instead of std::bind and std::mem_fn I introduced in r254242. NFCCraig Topper2015-11-291-2/+3
| | | | llvm-svn: 254260
* [X86][SSE] Added support for lowering to ADDSUBPS/ADDSUBPD with commuted inputsSimon Pilgrim2015-11-291-5/+10
| | | | We could already recognise shuffle(FSUB, FADD) -> ADDSUB; this allows us to recognise shuffle(FADD, FSUB) -> ADDSUB by commuting the shuffle mask prior to matching. llvm-svn: 254259
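A minimal sketch of the commuting idea, assuming the usual two-input shuffle mask convention: indices in [0, N) select from the first input, indices in [N, 2N) from the second, and negative indices mean undef. `commuteMask` and `isAddSubMask` are hypothetical stand-ins written for this example, not the X86 lowering code.

```cpp
#include <cassert>
#include <vector>

// Rewrite a mask so it describes the same shuffle with its two inputs swapped.
std::vector<int> commuteMask(std::vector<int> Mask) {
  const int NumElts = static_cast<int>(Mask.size());
  for (int &Idx : Mask) {
    if (Idx < 0)
      continue; // undef lane stays undef
    assert(Idx < 2 * NumElts && "mask index out of range");
    Idx = Idx < NumElts ? Idx + NumElts : Idx - NumElts;
  }
  return Mask;
}

// True if the mask takes even lanes from the first input (the FSUB) and odd
// lanes from the second input (the FADD), i.e. the shuffle(FSUB, FADD) form.
bool isAddSubMask(const std::vector<int> &Mask) {
  const int NumElts = static_cast<int>(Mask.size());
  for (int I = 0; I < NumElts; ++I) {
    if (Mask[I] < 0)
      continue;
    const int Expected = (I % 2 == 0) ? I : I + NumElts;
    if (Mask[I] != Expected)
      return false;
  }
  return true;
}
```

For a 4-element vector, the mask {4, 1, 6, 3} on shuffle(FADD, FSUB) does not match directly, but `commuteMask` turns it into {0, 5, 2, 7}, which is the shuffle(FSUB, FADD) pattern the matcher already accepts.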
* Simplify. NFC.Rafael Espindola2015-11-291-16/+12
| | | | llvm-svn: 254254
* AVX512:Implemented encoding for the vmovq.s instruction.Igor Breger2015-11-291-0/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D14810 llvm-svn: 254248
* Remove an intermediate lambda. NFCCraig Topper2015-11-291-3/+2
| | | | llvm-svn: 254246
* Remove unnecessary intermediate lambda. NFCCraig Topper2015-11-292-5/+2
| | | | llvm-svn: 254243
* [SelectionDAG] Use std::any_of instead of a manually coded loop. NFCCraig Topper2015-11-291-8/+4
| | | | llvm-svn: 254242
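For readers unfamiliar with this kind of cleanup, a hedged sketch of replacing a hand-written search loop with std::any_of follows. The "contains a negative value" predicate and both function names are hypothetical and unrelated to the SelectionDAG code that was changed.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Manually coded loop: returns true as soon as a negative element is found.
bool hasNegativeLoop(const std::vector<int> &Values) {
  for (int V : Values)
    if (V < 0)
      return true;
  return false;
}

// Equivalent std::any_of form; it also stops at the first match.
bool hasNegativeAnyOf(const std::vector<int> &Values) {
  return std::any_of(Values.begin(), Values.end(),
                     [](int V) { return V < 0; });
}

// Usage sketch: both forms agree, including on an empty range.
void checkAnyOfAgrees() {
  const std::vector<int> Empty;
  assert(hasNegativeLoop(Empty) == hasNegativeAnyOf(Empty));
}
```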
* Correctly handle llvm.global_ctors merging.Rafael Espindola2015-11-291-42/+48
| | | | | | | We were not handling the case where an entry must be dropped and the destination module has no llvm.global_ctors. llvm-svn: 254241
* Fix a crash when writing merged bitcode.Rafael Espindola2015-11-291-5/+14
| | | | | | | Playing with mutateType in here was making getValueType and getType incompatible. llvm-svn: 254240
* [SimplifyLibCalls] Use any_of(). Suggested by David Blaikie!Davide Italiano2015-11-281-4/+3
| | | | llvm-svn: 254239
* [SimplifyLibCalls] Fix inverted condition that led to an uninitialized memory read below.Benjamin Kramer2015-11-281-2/+2
| | | | | | Found by msan! llvm-svn: 254238
* [PGO] Move value profile format related structures and APIs to common fileXinliang David Li2015-11-281-177/+4
| | | | | | | | | | This is the last step to enable profile runtime to share the same value prof data format and reader/writer code with llvm host tools. The VP related data structures are moved to a section in InstrProfData.inc enabled with macro INSTR_PROF_VALUE_PROF_DATA, and common API implementations are enabled with INSTR_PROF_COMMON_API_IMPL. There should be no functional change. llvm-svn: 254235
* Revert "[ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM."Renato Golin2015-11-282-46/+4
| | | | | | | This reverts commits r254201 and r254202, as they broke test-suite, self-hosting and sanitizer tests on ARM buildbots. llvm-svn: 254234
* [Stack realignment] Handling of aligned allocas.Jonas Paulsson2015-11-283-15/+48
| | | | | | | | | | | | | | | | | | | | This patch implements dynamic realignment of stack objects for targets with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo is changed so that for a target that has StackRealignable set to false, over-aligned static allocas are considered to be variable-sized objects and are handled with DYNAMIC_STACKALLOC nodes. It would be good to group aligned allocas into a single big alloca as an optimization, but this is still to be done. SystemZ benefits from this, due to its stack frame layout. New tests SystemZ/alloca-03.ll for aligned allocas, and SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions. Review and help from Ulrich Weigand and Hal Finkel. llvm-svn: 254227
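One generic way to obtain an over-aligned object without realigning the stack pointer is to over-allocate and round the object address. The sketch below shows only that arithmetic, assuming a downward-growing stack; `allocateOverAligned` and `DynAlloc` are hypothetical names, and this is not claimed to be the lowering the patch emits.

```cpp
#include <cassert>
#include <cstdint>

struct DynAlloc {
  uint64_t NewSP; // stack pointer after the allocation (still StackAlign-aligned)
  uint64_t Addr;  // address of the object, aligned to Align
};

// Hypothetical helper: reserve Size bytes aligned to Align on a stack whose
// pointer is only guaranteed to be StackAlign-aligned.
DynAlloc allocateOverAligned(uint64_t SP, uint64_t Size, uint64_t Align,
                             uint64_t StackAlign) {
  assert((Align & (Align - 1)) == 0 && Align != 0 && "Align must be a power of two");
  assert((StackAlign & (StackAlign - 1)) == 0 && StackAlign != 0);
  assert(Align >= StackAlign && SP % StackAlign == 0 && Size % StackAlign == 0);
  // Worst-case padding needed to reach the next Align boundary.
  const uint64_t Extra = Align - StackAlign;
  const uint64_t NewSP = SP - (Size + Extra);
  // Round down from the top of the padded region; the result stays >= NewSP.
  const uint64_t Addr = (NewSP + Extra) & ~(Align - 1);
  return {NewSP, Addr};
}
```

The stack pointer itself keeps its base alignment; only the returned object address is rounded, and the object always fits between `NewSP` and the old `SP`.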
* Use range-based for loops. NFCCraig Topper2015-11-281-37/+20
| | | | llvm-svn: 254222
* [PGO] Add return code for vp rt record init routine to indicate error conditionXinliang David Li2015-11-281-3/+6
| | | | llvm-svn: 254220
* [PGO] Allow value profile writer interface to allocated target buffer Xinliang David Li2015-11-281-9/+13
| | | | | | | Raw profile writer needs to write all data of one kind in one contiguous block, so the buffer needs to be pre-allocated and passed to the writer method in pieces for function profile data. The change adds support for raw value data writing. llvm-svn: 254219