Commit messages
- … small bug in the process. (llvm-svn: 157446)
- llvm-svn: 156633
- On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However, if the calling convention is fastcc, a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention in X86TargetLowering::LowerCall, and is now also checked properly in X86FastISel::DoSelectCall. (This time, actually commit what was reviewed!) (llvm-svn: 155825)
- llvm-svn: 155746
- On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However, if the calling convention is fastcc, a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention in X86TargetLowering::LowerCall, and is now also checked properly in X86FastISel::DoSelectCall. (llvm-svn: 155745)
- … since they are equivalent. (llvm-svn: 155186)
- llvm-svn: 153500
- … uint16_t to reduce space. (llvm-svn: 152538)
- … to static data that should not be modified. (llvm-svn: 151134)
- The different calling conventions and call-preserved registers are represented with regmask operands that are added dynamically. (llvm-svn: 150708)
- Call instructions no longer have a list of 43 call-clobbered registers. Instead, they get a single register mask operand with a bit vector of call-preserved registers.

  This saves a lot of memory (42 x 32 bytes = 1344 bytes per call instruction), and it speeds up building call instructions because those 43 imp-def operands no longer need to be added to use-def lists (and removed, shifted, and re-added for every explicit call operand).

  Passes like LiveVariables, LiveIntervals, RAGreedy, PEI, and BranchFolding are significantly faster because they can deal with call clobbers in bulk.

  Overall, clang -O2 is between 0% and 8% faster, uniformly distributed depending on call density in the compiled code. Debug builds using clang -O0 are 0% - 3% faster.

  I have verified that this patch doesn't change the assembly generated for the LLVM nightly test suite when building with -disable-copyprop and -disable-branch-fold.

  Branch folding behaves slightly differently in a few cases because call instructions have different hash values now.

  Copy propagation flushes its data structures when it crosses a register mask operand. This causes it to leave a few dead copies behind, on the order of 20 instructions across the entire nightly test suite, including SPEC. Fixing this properly would require the pass to use different data structures.

  (llvm-svn: 150638)
- llvm-svn: 150538
- llvm-svn: 148513
- … the final piece to remove the AVX hack that disabled SSE. (llvm-svn: 147843)
- … hasXMM/hasXMMInt instead. Also fix one place that checked SSE3 but accidentally excluded AVX, to use hasSSE3orAVX. This is a step towards removing the AVX hack from X86Subtarget.h. (llvm-svn: 147764)
- … change; now you need a TargetOptions object to create a TargetMachine. Clang patch to follow.

  One small functionality change in PTX: PTX had commented out the machine verifier parts in their copy of printAndVerify. That now calls the version in LLVMTargetMachine. Users of PTX who need verification disabled should rely on not passing the command-line flag to enable it.

  (llvm-svn: 145714)
- … Like V_SET0, these instructions are expanded by ExpandPostRA to xorps / vxorps so they can participate in execution domain swizzling. This also makes the AVX variants redundant. (llvm-svn: 145440)
- … it fails to emit a store. This fixes <rdar://problem/10215997>. (llvm-svn: 142432)
- llvm-svn: 141749
- … an alias involves thread-local storage. (I'm not entirely sure how this is supposed to work, but this patch makes fast-isel consistent with the normal isel path.) (llvm-svn: 140355)
- llvm-svn: 139062
- … missing from fast-isel. (llvm-svn: 139044)
- … Krasin! (llvm-svn: 136663)
- llvm-svn: 136653
- llvm-svn: 135375
- We would put the return value from long double functions in the wrong register. This fixes gcc.c-torture/execute/conversion.c. (llvm-svn: 134205)
- llvm-svn: 134030
- … sink them into the MC layer. Added MCInstrInfo, which captures the tablegen-generated static data, and changed TargetInstrInfo so it's based off MCInstrInfo. (llvm-svn: 134021)
- Drop the FpMov instructions; use plain COPY instead.

  Drop the FpSET/GET instructions for accessing fixed stack positions. Instead, use normal COPY to/from ST registers around inline assembly, and provide a single new FpPOP_RETVAL instruction that can access the return value(s) from a call. This is still necessary since you cannot tell from the CALL instruction alone if it returns anything on the FP stack. Teach fast isel to use this.

  This provides a much more robust way of handling fixed stack registers - we can tolerate arbitrary FP stack instructions inserted around calls and inline assembly. Live range splitting could sometimes break x87 code by inserting spill code in unfortunate places.

  As a bonus, we handle floating point inline assembly correctly now.

  (llvm-svn: 134018)
- llvm-svn: 133726
- … memcpy/memset symbol doesn't get marked up correctly in PIC modes otherwise. Should fix the llvm-x86_64-linux-checks buildbot. Follow-up to r132864. (llvm-svn: 132869)
- … rdar://9431466 (llvm-svn: 132864)
- No functional change. Part of PR6965. (llvm-svn: 132763)
- … simpler and more consistent. The practical effects here are that x86-64 fast-isel can now handle trunc from i8 to i1, and ARM fast-isel can handle many more constructs involving integers narrower than 32 bits (including loads, stores, and many integer casts). rdar://9437928 (llvm-svn: 132099)
- llvm-svn: 131764
- llvm-svn: 131689
- llvm-svn: 131597
- llvm-svn: 131596
- This is r131438 with a couple of small fixes. (llvm-svn: 131474)
- … it more tomorrow. (llvm-svn: 131451)
- llvm-svn: 131438
- llvm-svn: 131420
- … intrinsic from the x86 code to the generic code. (llvm-svn: 131332)
- … to being bottom-up (a very long time ago). (llvm-svn: 131329)
- … rdar://problem/9303592 (llvm-svn: 130429)
- llvm-svn: 130412
- … rdar://problem/9303592 (llvm-svn: 130348)
- … common. rdar://problem/9303592 (llvm-svn: 130338)
- … length. (I'm planning to use this to implement byval.) (llvm-svn: 130274)
- … rdar://problem/9303306 (llvm-svn: 130272)