| Commit message (Collapse) | Author | Age | Files | Lines |
| |
type (in addition to the memory type).
The *LoadExt* legalization handling used to only have one type, the
memory type. This forced users to assume that as long as the extload
for the memory type was declared legal, and the result type was legal,
the whole extload was legal.
However, this isn't always the case. For instance, on X86, with AVX,
this is legal:
v4i32 load, zext from v4i8
but this isn't:
v4i64 load, zext from v4i8
Whereas v4i64 is (arguably) legal, even without AVX2.
Note that the same thing was done a while ago for truncstores (r46140),
but I assume no one needed it yet for extloads, so here we go.
Calls to getLoadExtAction were changed to add the value type, found
manually in the surrounding code.
Calls to setLoadExtAction were mechanically changed, by wrapping the
call in a loop, to match previous behavior. The loop iterates over
the MVT subrange corresponding to the memory type (FP vectors, etc.).
I also pulled neighboring setTruncStoreActions into some of the loops;
those shouldn't make a difference, as the additional types are illegal.
(e.g., i128->i1 truncstores on PPC.)
No functional change intended.
Differential Revision: http://reviews.llvm.org/D6532
llvm-svn: 225421
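To illustrate why the value type has to be part of the key, here is a minimal sketch of a legality table indexed by (extension kind, value type, memory type). Everything in it is hypothetical: `VT`, `Ext`, `Action`, `ExtLoadTable` and `makeAvxLikeTable` are toy stand-ins invented for this example, not LLVM's `MVT` or `TargetLowering` API, and the default of `Expand` for unset entries is an assumption of the sketch.

```cpp
#include <map>
#include <tuple>

// Toy stand-ins for illustration only; these are NOT the LLVM definitions.
enum class VT { v4i8, v4i32, v4i64 };
enum class Ext { Zext, Sext };
enum class Action { Legal, Expand };

class ExtLoadTable {
  // Key: (extension kind, value type, memory type).
  std::map<std::tuple<Ext, VT, VT>, Action> Actions;

public:
  void set(Ext E, VT ValVT, VT MemVT, Action A) {
    Actions[std::make_tuple(E, ValVT, MemVT)] = A;
  }

  // Entries that were never set are treated as Expand (an assumption of
  // this sketch, chosen so that unknown combinations are never accepted).
  Action get(Ext E, VT ValVT, VT MemVT) const {
    auto It = Actions.find(std::make_tuple(E, ValVT, MemVT));
    return It == Actions.end() ? Action::Expand : It->second;
  }
};

// Models the X86-with-AVX example from the message: the same memory type
// (v4i8) is legal for one result type and not for another.
inline ExtLoadTable makeAvxLikeTable() {
  ExtLoadTable T;
  T.set(Ext::Zext, VT::v4i32, VT::v4i8, Action::Legal);
  T.set(Ext::Zext, VT::v4i64, VT::v4i8, Action::Expand);
  return T;
}
```

With a table keyed only by the memory type, the two `v4i8` entries above could not be told apart.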
|
|
|
|
| |
llvm-svn: 225305
|
|
|
|
| |
llvm-svn: 224648
|
|
|
|
|
|
|
| |
If the condition is used for something else, this increases
the number of instructions.
llvm-svn: 224646
|
|
|
|
|
|
|
|
| |
The returned operand needs to be permuted for the unordered
compares. Also fix incorrectly producing fmin_legacy / fmax_legacy
for f64, which don't exist.
llvm-svn: 224094
|
|
|
|
|
|
|
|
| |
Add an option to disable optimization to shrink truncated larger type
loads to smaller type loads. On SI this prevents using scalar load
instructions in some cases, since there are no scalar extloads.
llvm-svn: 224084
|
|
|
|
|
|
|
|
|
| |
There are 3 changes:
- Convert 32-bit S_LSHL/LSHR/ASHR to their V_*REV variants for VI
- Lower RSQ_CLAMP for VI
- Don't generate MIN/MAX_LEGACY on VI
llvm-svn: 223604
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This sort of doesn't matter since the setcc type is i1, but
this previously was using the default UndefinedBooleanContent. This
makes it more consistent with R600. This enables more optimizations
which typically give up on UndefinedBooleanContent. For example,
there is already a special case target DAG combine for
setcc + sext which can be eliminated in favor of what the generic
DAG combiner can do if it assumes boolean values are sign extended.
Since -1 is an inline immediate, using it is basically free and the
backend already uses it when a boolean value is needed in a wider type.
llvm-svn: 222850
|
|
|
|
|
|
|
| |
i1 is not a legal type on Evergreen, so this combine proceeded
and tried to produce a bitcast between i1 and i8.
llvm-svn: 222630
|
|
|
|
|
|
|
|
|
|
| |
This gets the correct NaN behavior based on the compare type
the hardware uses. This now passes the new piglit test I have
for this on SI.
Add stricter tests for the operand order.
llvm-svn: 222079
|
|
|
|
|
|
|
|
| |
This fixes a failure in one of the oclconform tests.
Patch by: Jan Vesely
llvm-svn: 222073
|
|
|
|
|
|
|
|
| |
This is so it could potentially be used by SI. However, the current
implementation does not always produce correct results, so the
IntegerDivisionPass is being used instead.
llvm-svn: 222072
|
|
|
|
| |
llvm-svn: 222032
|
|
|
|
| |
llvm-svn: 222015
|
|
|
|
|
|
| |
select_cc is expanded on SI, so this was never matched.
llvm-svn: 221941
|
|
|
|
|
|
| |
requires TargetLoweringObjectFile to be passed.
llvm-svn: 221926
|
|
|
|
|
|
| |
Also give a proper error for other address spaces.
llvm-svn: 221917
|
|
|
|
|
|
| |
TargetMachine so that different subtargets could share the TLOF effectively
llvm-svn: 221878
|
|
|
|
| |
llvm-svn: 220342
|
|
|
|
| |
llvm-svn: 220338
|
|
|
|
|
|
| |
This was resulting in invalid simplifications of sdiv
llvm-svn: 219953
|
|
|
|
| |
llvm-svn: 219879
|
|
|
|
|
|
|
| |
Zero-width BFEs are combined away already, so there's no point in
handling them.
llvm-svn: 219868
|
|
|
|
| |
llvm-svn: 219867
|
|
|
|
|
|
| |
SimplifyDemandedBits would break the other uses of the operand.
llvm-svn: 219819
|
|
|
|
| |
llvm-svn: 219778
|
|
|
|
| |
llvm-svn: 219777
|
|
|
|
| |
llvm-svn: 219038
|
|
|
|
| |
llvm-svn: 219037
|
|
|
|
|
|
| |
Re-add the tests since they were deleted at some point
llvm-svn: 219036
|
|
|
|
| |
llvm-svn: 218534
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BypassSlowDiv is used by codegen prepare to insert a run-time
check to see if the operands to a 64-bit division are really 32-bit
values and, if they are, to do a 32-bit division instead.
This is not useful for R600, which has predicated control flow, since
both the 32-bit and 64-bit paths will be executed in most cases. It
also increases code size, which can lead to more instruction cache
misses.
llvm-svn: 218252
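The run-time check described above has roughly the following shape when written out by hand. This is a sketch of the idea only, for the unsigned case; `udivBypass` is a hypothetical name and the code is not what the pass actually emits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: divide two 64-bit values, taking a 32-bit fast path
// when neither operand has any of its upper 32 bits set.
inline uint64_t udivBypass(uint64_t A, uint64_t B) {
  assert(B != 0 && "division by zero");
  if (((A | B) >> 32) == 0) {
    // Both operands fit in 32 bits, so the narrow division gives the
    // same quotient.
    return static_cast<uint32_t>(A) / static_cast<uint32_t>(B);
  }
  return A / B; // slow 64-bit path
}
```

On hardware with predicated control flow, both sides of the branch tend to be executed anyway, which is why the extra check buys nothing there.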
|
|
|
|
|
|
|
| |
ISD::MUL and ISD::UMULO are the same except that UMULO sets an overflow
bit. Since we aren't using the overflow bit, we should use ISD::MUL.
llvm-svn: 218251
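The relationship between the two operations can be sketched in plain C++ for 32-bit operands. `mul32`, `umulo32` and `MulResult` are hypothetical names used only for this illustration; they model the semantics described above, not the DAG nodes themselves.

```cpp
#include <cstdint>

// Plain multiply: the low 32 bits of the product.
inline uint32_t mul32(uint32_t A, uint32_t B) { return A * B; }

struct MulResult {
  uint32_t Product; // same low 32 bits as mul32
  bool Overflow;    // set when the full product does not fit in 32 bits
};

// Multiply with overflow: same product, plus an overflow bit.
inline MulResult umulo32(uint32_t A, uint32_t B) {
  uint64_t Wide = static_cast<uint64_t>(A) * B;
  return {static_cast<uint32_t>(Wide), (Wide >> 32) != 0};
}
```

When the overflow bit is ignored, the two produce identical values, so the simpler operation is sufficient.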
|
|
|
|
|
|
| |
Just do the left shift as unsigned to avoid the UB.
llvm-svn: 218092
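A minimal sketch of the general technique, assuming a 32-bit value: cast to unsigned, shift, and cast back. `shlWrapping` is a hypothetical helper, not the code touched by this change. Before C++20, left-shifting a negative signed value is undefined behavior, and the conversion back to `int32_t` is implementation-defined (two's complement wraparound on mainstream compilers, and guaranteed from C++20 on).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: left shift with wraparound semantics and no UB.
inline int32_t shlWrapping(int32_t X, unsigned Amt) {
  assert(Amt < 32 && "shift amount must be smaller than the bit width");
  // Unsigned shifts are always well defined for in-range shift amounts.
  return static_cast<int32_t>(static_cast<uint32_t>(X) << Amt);
}
```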
|
|
|
|
|
|
|
| |
I'm not sure what the hardware actually does, so don't
bother trying to fold it for now.
llvm-svn: 218057
|
|
|
|
| |
llvm-svn: 217553
|
|
|
|
|
|
|
| |
We can use a negate source modifier to match
this for fsub.
llvm-svn: 216735
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
isPow2DivCheap
That name doesn't specify signed or unsigned.
Lazy as I am, I eventually read the function and variable comments. It turns out that this is strictly about signed div. But I discovered that the comments are wrong:
srl/add/sra
is not the general sequence for signed integer division by power-of-2. We need one more 'sra':
sra/srl/add/sra
That's the sequence produced in DAGCombiner. The first 'sra' may be removed when dividing by exactly '2', but that's a special case.
This patch corrects the comments, changes the name of the flag bit, and changes the name of the accessor methods.
No functional change intended.
Differential Revision: http://reviews.llvm.org/D5010
llvm-svn: 216237
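The sra/srl/add/sra sequence can be checked with a small sketch for 32-bit values. `sdivPow2` is a hypothetical function written for this illustration, not the DAGCombiner code. It assumes that `>>` on a negative `int32_t` is an arithmetic shift, which is guaranteed from C++20 on and is what mainstream compilers do before that.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: signed division of X by 2^K, rounding toward zero,
// using only shifts and an add.
inline int32_t sdivPow2(int32_t X, unsigned K) {
  assert(K >= 1 && K <= 30 && "divisor must be 2^K with 1 <= K <= 30");
  int32_t Sign = X >> 31;                                  // sra: 0 or -1
  uint32_t Bias = static_cast<uint32_t>(Sign) >> (32 - K); // srl: 0 or 2^K - 1
  int32_t Adjusted = X + static_cast<int32_t>(Bias);       // add
  return Adjusted >> K;                                    // sra
}
```

For K == 1 the bias is just the sign bit, so `static_cast<uint32_t>(X) >> 31` yields the same value and the first sra can be dropped, which matches the special case mentioned above.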
|
|
|
|
| |
llvm-svn: 215748
|
|
|
|
| |
llvm-svn: 215747
|
|
|
|
| |
llvm-svn: 215734
|
|
|
|
|
|
|
|
|
| |
v2: drop enum keyword
use correct extension mode
don't bother computing the sign in unsigned case
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 215462
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 215461
|
|
|
|
|
|
|
|
|
| |
v2: add tests
rename LowerSDIV24 to LowerSDIVREM24
handle the rem part in this function
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 215460
|
|
|
|
| |
llvm-svn: 215277
|
|
|
|
|
|
|
| |
SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code
path for 8-bit and 16-bit private loads.
llvm-svn: 214865
|
|
|
|
|
|
| |
information and update all callers. No functional change.
llvm-svn: 214781
|
|
|
|
| |
llvm-svn: 214729
|
|
|
|
|
|
|
|
| |
This reverts commit r214566.
I did not mean to commit this yet.
llvm-svn: 214572
|
|
|
|
|
|
|
| |
SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code
path for 8-bit and 16-bit private loads.
llvm-svn: 214566
|