| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we know we have stack objects, we reserve the registers
in which the private buffer resource and wave offset are passed
and use them directly.
If not, reserve the last 5 SGPRs just in case we need to spill.
After register allocation, try to pick the next available registers
instead of the last SGPRs, and then insert copies from the inputs
to the reserved registers in the prologue.
This also enables only those input registers that are
really required, instead of always enabling all of them.
llvm-svn: 254331
|
| |
|
|
| |
llvm-svn: 254330
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It does not work because of emergency stack slots.
This pass was supposed to eliminate dummy registers for the
spill instructions, but the register scavenger can introduce
more during PrologEpilogInserter, so some would end up
left behind if they were needed.
The potential for spilling the scratch resource descriptor
and offset register makes doing something like this
overly complicated. Reserve registers to use for the resource
descriptor and use them directly in eliminateFrameIndex.
Also removes creating another scratch resource descriptor
when directly selecting scratch MUBUF instructions.
The choice of which registers are reserved is temporary.
For now it attempts to pick the next available registers
after the user and system SGPRs.
llvm-svn: 254329
|
| |
|
|
| |
llvm-svn: 254328
|
| |
|
|
|
|
| |
Fix the epilogue emission to account for that.
llvm-svn: 254325
|
| |
|
|
| |
llvm-svn: 254317
|
| |
|
|
|
|
|
|
|
| |
The MachineVerifier wants to check that the register operands of an
instruction belong to the instruction's register class. RIP-relative
control flow instructions violated this by referencing RIP. While this
was fixed for SysV, it was never fixed for Win64.
llvm-svn: 254315
|
| |
|
|
|
|
|
|
|
|
|
| |
Re-enable shrink wrapping for PPC64 Little Endian.
One minor modification to PPCFrameLowering::findScratchRegister was necessary to handle fall-through blocks (blocks with no terminator) correctly.
Tested with all LLVM tests, clang tests, and the self-hosting build, with no problems found.
Phabricator: http://reviews.llvm.org/D14778
llvm-svn: 254314
|
| |
|
|
|
|
|
| |
We were not looking past casts to see if an element should be included
or not.
llvm-svn: 254313
|
| |
|
|
| |
llvm-svn: 254311
|
| |
|
|
| |
llvm-svn: 254310
|
| |
|
|
| |
llvm-svn: 254307
|
| |
|
|
|
|
| |
This hasn't been doing anything since using relocations was added.
llvm-svn: 254304
|
| |
|
|
| |
llvm-svn: 254302
|
| |
|
|
|
|
|
|
| |
MFHI, MFLO, MTHI and MTLO instructions
Differential Revision: http://reviews.llvm.org/D14436
llvm-svn: 254297
|
| |
|
|
|
|
|
| |
The value of the offset operand for the microMIPS BALC and BC instructions is currently shifted by 2 bits, but it should be shifted by 1 bit.
Differential Revision: http://reviews.llvm.org/D14770
llvm-svn: 254296
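As a rough illustration of why the shift amount matters, the sketch below scales a branch byte offset by one bit in each direction. The helper names and the constant are hypothetical and are not taken from the LLVM sources; the sketch only assumes that the offset is stored divided by 2, which is what a 1-bit shift implies.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constant: the offset is stored shifted by one bit,
// i.e. in units of 2 bytes (assumption for illustration only).
constexpr std::int32_t kOffsetScale = 1 << 1;

// Convert a byte offset into the scaled value stored in the instruction.
std::int32_t encodeBranchOffset(std::int32_t byteOffset) {
  assert(byteOffset % kOffsetScale == 0 && "offset must be 2-byte aligned");
  return byteOffset / kOffsetScale;
}

// Recover the byte offset from the scaled value.
std::int32_t decodeBranchOffset(std::int32_t scaled) {
  return scaled * kOffsetScale;
}

// What a decoder would compute if it wrongly assumed a 2-bit shift.
std::int32_t decodeWithTwoBitShift(std::int32_t scaled) {
  return scaled * (1 << 2);
}
```

With a 2-bit shift on one side only, every decoded target lands at twice the intended distance, which is the kind of mismatch the commit describes.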
|
| |
|
|
|
|
|
|
| |
PRECRQ.QB.PH, PRECRQU_S.QB.PH and PRECRQ_RS.PH.W instructions
Differential Revision: http://reviews.llvm.org/D14605
llvm-svn: 254291
|
| |
|
|
|
|
| |
build bot.
llvm-svn: 254280
|
| |
|
|
| |
llvm-svn: 254279
|
| |
|
|
|
|
|
|
| |
vector. This is reflected correctly in the intrinsics, but was not reflected in the isel patterns.
For the floating point types, this requires adding a bitcast to the index vector when it's passed through to the output.
llvm-svn: 254277
|
| |
|
|
|
|
| |
The lambda is more readable.
llvm-svn: 254276
|
| |
|
|
| |
llvm-svn: 254275
|
| |
|
|
|
|
| |
verify there are no others.
llvm-svn: 254274
|
| |
|
|
| |
llvm-svn: 254272
|
| |
|
|
| |
llvm-svn: 254271
|
| |
|
|
|
|
| |
for its shuffle indices.
llvm-svn: 254269
|
| |
|
|
| |
llvm-svn: 254268
|
| |
|
|
| |
llvm-svn: 254267
|
| |
|
|
| |
llvm-svn: 254266
|
| |
|
|
| |
llvm-svn: 254265
|
| |
|
|
| |
llvm-svn: 254264
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This one is enabled only under -ffast-math. There are cases where the
difference between the computed value and the correct value is huge,
even for -ffast-math, e.g. as Steven pointed out:
x = -1, y = -4
log(pow(-1, 4)) = 0
4*log(-1) = NaN
I checked what GCC does, and apparently it performs the same optimization
(which results in the same dramatic difference). Future work might try to
make this (slightly) less bad.
Differential Revision: http://reviews.llvm.org/D14400
llvm-svn: 254263
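The discrepancy described above can be reproduced with a small standalone sketch. The function names are made up for illustration, and the code is meant to be compiled without -ffast-math so that NaN handling behaves as IEEE 754 specifies.

```cpp
#include <cmath>

// Value of the original expression: log(pow(x, y)).
double logOfPow(double x, double y) {
  return std::log(std::pow(x, y));
}

// Value after the fast-math rewrite: y * log(x).
double rewrittenLogOfPow(double x, double y) {
  return y * std::log(x);
}
```

For x = -1 and an even integer y, pow(x, y) is exactly 1, so the original form yields 0, while the rewritten form takes the logarithm of a negative number and yields NaN.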
|
| |
|
|
|
|
|
|
| |
This fixes buildbots on systems where std::to_string is not present. It
also tidies the output of the diagnostic to render doubles a bit better
(thanks to Ben Kramer for help with string streams and format).
llvm-svn: 254261
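A minimal sketch of the general approach, formatting a double through a string stream instead of std::to_string, might look like the following. The function name and the default precision are assumptions for illustration; the actual change may use LLVM's own stream and format helpers rather than the standard library ones shown here.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Render a double with a fixed number of decimal places without
// relying on std::to_string (hypothetical helper, not LLVM code).
std::string formatDouble(double value, int precision = 2) {
  std::ostringstream os;
  os << std::fixed << std::setprecision(precision) << value;
  return os.str();
}
```

Besides avoiding the missing library function, this gives control over the number of digits, whereas std::to_string always prints six decimal places.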
|
| |
|
|
| |
llvm-svn: 254260
|
| |
|
|
|
|
| |
We could already recognise shuffle(FSUB, FADD) -> ADDSUB; this allows us to recognise shuffle(FADD, FSUB) -> ADDSUB as well, by commuting the shuffle mask prior to matching.
llvm-svn: 254259
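Commuting a shuffle mask means rewriting the indices so that the two inputs can be swapped without changing the result. The sketch below is a standalone illustration with a hypothetical function name, not the code from the patch. It assumes the usual convention that, for two N-element inputs, indices 0..N-1 select from the first input, N..2N-1 select from the second, and negative values mean "undefined".

```cpp
#include <cassert>
#include <vector>

// Return the mask that selects the same elements after the two shuffle
// inputs have been swapped (hypothetical helper for illustration).
std::vector<int> commuteShuffleMask(std::vector<int> mask) {
  const int n = static_cast<int>(mask.size());
  for (int &idx : mask) {
    if (idx < 0)
      continue; // undefined lane stays undefined
    assert(idx < 2 * n && "index out of range for two N-element inputs");
    idx = idx < n ? idx + n : idx - n;
  }
  return mask;
}
```

For example, a 4-lane mask {0, 5, 2, 7} over (FSUB, FADD) takes even lanes from FSUB and odd lanes from FADD. Its commuted form {4, 1, 6, 3} selects the same lanes when the operands are given as (FADD, FSUB), so a single matcher can handle both operand orders.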
|
| |
|
|
| |
llvm-svn: 254254
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D14810
llvm-svn: 254248
|
| |
|
|
| |
llvm-svn: 254246
|
| |
|
|
| |
llvm-svn: 254243
|
| |
|
|
| |
llvm-svn: 254242
|
| |
|
|
|
|
|
| |
We were not handling the case where an entry must be dropped and the
destination module has no llvm.global_ctors.
llvm-svn: 254241
|
| |
|
|
|
|
|
| |
Playing with mutateType in here was making getValueType and getType
incompatible.
llvm-svn: 254240
|
| |
|
|
| |
llvm-svn: 254239
|
| |
|
|
|
|
|
|
| |
memory read below.
Found by msan!
llvm-svn: 254238
|
| |
|
|
|
|
|
|
|
|
| |
This is the last step to enable the profile runtime to share the same value
profile data format and reader/writer code with the LLVM host tools. The VP-related
data structures are moved to a section in InstrProfData.inc enabled with the macro
INSTR_PROF_VALUE_PROF_DATA, and common API implementations are enabled with
INSTR_PROF_COMMON_API_IMPL. There should be no functional change.
llvm-svn: 254235
|
| |
|
|
|
|
|
|
|
| |
in the ARM ARM."
This reverts commits r254201 and r254202, as they broke the test-suite,
self-hosting and sanitizer tests on ARM buildbots.
llvm-svn: 254234
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements dynamic realignment of stack objects for targets
with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo
is changed so that for a target that has StackRealignable set to
false, over-aligned static allocas are considered to be variable-sized
objects and are handled with DYNAMIC_STACKALLOC nodes.
It would be good to group aligned allocas into a single big alloca as
an optimization, but this has not been done yet.
SystemZ benefits from this, due to its stack frame layout.
New tests SystemZ/alloca-03.ll for aligned allocas, and
SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions.
Review and help from Ulrich Weigand and Hal Finkel.
llvm-svn: 254227
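The general idea behind dynamically realigning a single allocation can be sketched as follows: request some extra bytes, then round the resulting address up to the required alignment. This is a generic illustration under stated assumptions, not the code from the patch. The function names are hypothetical, and the sketch rounds upward for simplicity regardless of the direction in which a real stack grows.

```cpp
#include <cassert>
#include <cstdint>

// Round addr up to the next multiple of align (align must be a power of two).
std::uint64_t alignUp(std::uint64_t addr, std::uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0 && "power of two expected");
  return (addr + align - 1) & ~(align - 1);
}

// Bytes to request so that an aligned block of `size` bytes is guaranteed to
// fit, given that the allocator already guarantees `stackAlign` alignment.
std::uint64_t paddedSize(std::uint64_t size, std::uint64_t align,
                         std::uint64_t stackAlign) {
  return align > stackAlign ? size + (align - stackAlign) : size;
}
```

An object that needs 32-byte alignment on a stack that only guarantees 8 bytes would request 24 extra bytes here, which is enough to cover the worst-case distance to the next 32-byte boundary.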
|
| |
|
|
| |
llvm-svn: 254222
|
| |
|
|
| |
llvm-svn: 254220
|
| |
|
|
|
|
|
|
|
| |
The raw profile writer needs to write all data of one kind in one contiguous block,
so the buffer needs to be pre-allocated and passed to the writer method in
pieces for function profile data. The change adds support for raw value data
writing.
llvm-svn: 254219
|