<feed xmlns='http://www.w3.org/2005/Atom'>
<title>bcm5719-llvm/llvm/test/Analysis/CostModel, branch meklort-10.0.1</title>
<subtitle>Project Ortega BCM5719 LLVM</subtitle>
<id>https://git.raptorcs.com/git/bcm5719-llvm/atom?h=meklort-10.0.1</id>
<link rel='self' href='https://git.raptorcs.com/git/bcm5719-llvm/atom?h=meklort-10.0.1'/>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/'/>
<updated>2020-01-06T13:17:02+00:00</updated>
<entry>
<title>[CostModel][X86] Add missing scalar i64-&gt;f32 uitofp costs</title>
<updated>2020-01-06T13:17:02+00:00</updated>
<author>
<name>Simon Pilgrim</name>
<email>llvm-dev@redking.me.uk</email>
</author>
<published>2020-01-06T13:16:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=5d986a68a59c9bed7060e87840e61390d8247c1d'/>
<id>urn:sha1:5d986a68a59c9bed7060e87840e61390d8247c1d</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" as cleanups after D56351</title>
<updated>2019-12-24T23:57:33+00:00</updated>
<author>
<name>Fangrui Song</name>
<email>maskray@google.com</email>
</author>
<published>2019-12-24T23:52:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=502a77f125f43ffde57af34d3fd1b900248a91cd'/>
<id>urn:sha1:502a77f125f43ffde57af34d3fd1b900248a91cd</id>
<content type='text'>
</content>
</entry>
<entry>
<title>[AMDGPU] Implemented fma cost analysis</title>
<updated>2019-12-19T07:54:20+00:00</updated>
<author>
<name>Stanislav Mekhanoshin</name>
<email>Stanislav.Mekhanoshin@amd.com</email>
</author>
<published>2019-12-18T21:29:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=58578f705663a9f31b906a341f0a61ce51f7dcb2'/>
<id>urn:sha1:58578f705663a9f31b906a341f0a61ce51f7dcb2</id>
<content type='text'>
Differential Revision: https://reviews.llvm.org/D71676
</content>
</entry>
<entry>
<title>[AMDGPU] Fixed cost model for packed 16 bit ops</title>
<updated>2019-12-17T23:14:17+00:00</updated>
<author>
<name>Stanislav Mekhanoshin</name>
<email>Stanislav.Mekhanoshin@amd.com</email>
</author>
<published>2019-12-17T19:16:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=b8ac5894a115987fcc7e871049ec31a8eba66741'/>
<id>urn:sha1:b8ac5894a115987fcc7e871049ec31a8eba66741</id>
<content type='text'>
Differential Revision: https://reviews.llvm.org/D71622
</content>
</entry>
<entry>
<title>[ARM] Teach the Arm cost model that a Shift can be folded into other instructions</title>
<updated>2019-12-09T10:24:33+00:00</updated>
<author>
<name>David Green</name>
<email>david.green@arm.com</email>
</author>
<published>2019-12-08T15:33:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=be7a1070700e591732b254e29f2dd703325fb52a'/>
<id>urn:sha1:be7a1070700e591732b254e29f2dd703325fb52a</id>
<content type='text'>
This attempts to teach the cost model in Arm that code such as:
  %s = shl i32 %a, 3
  %r = and i32 %s, %b
Can under Arm or Thumb2 become:
  and r0, r1, r2, lsl #3

So the cost of the shift can essentially be free. To do this without
trying to artificially adjust the cost of the "and" instruction, it
needs to get the users of the shl and check if they are a type of
instruction that the shift can be folded into. And so it needs to have
access to the actual instruction in getArithmeticInstrCost, which if
available is added as an extra parameter much like getCastInstrCost.

We otherwise limit it to shifts with a single user, which should
hopefully handle most of the cases. The list of instructions that the
shift can be folded into includes ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR,
ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and
ICmp.

Differential Revision: https://reviews.llvm.org/D70966
</content>
</entry>
<entry>
<title>[ARM] Additional tests and minor formatting. NFC</title>
<updated>2019-12-09T10:24:33+00:00</updated>
<author>
<name>David Green</name>
<email>david.green@arm.com</email>
</author>
<published>2019-12-08T15:26:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=f008b5b8ce724d60f0f0eeafceee0119c42022d4'/>
<id>urn:sha1:f008b5b8ce724d60f0f0eeafceee0119c42022d4</id>
<content type='text'>
This adds some extra cost model tests for shifts, and makes some minor
adjustments to some Neon code to clarify what it applies to.
Both NFC.
</content>
</entry>
<entry>
<title>[x86] add cost model special-case for insert/extract from element 0</title>
<updated>2019-12-06T18:50:25+00:00</updated>
<author>
<name>Sanjay Patel</name>
<email>spatel@rotateright.com</email>
</author>
<published>2019-12-06T18:29:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=7ff0fcb53f6e71bc22d37494fdfa68bbf2d3709b'/>
<id>urn:sha1:7ff0fcb53f6e71bc22d37494fdfa68bbf2d3709b</id>
<content type='text'>
This is a follow-up to D70607 where we made any
extract element on SLM more costly than default. But that is
pessimistic for extract from element 0 because that corresponds
to x86 movd/movq instructions. These generally have &gt;1 cycle
latency, but they are probably implemented as single uop
instructions.

Note that no vectorization tests are affected by this change.
Also, no targets besides SLM are affected because those are
falling through to the default cost of 1 anyway. But this will
become visible/important if we add more specializations via cost
tables.

Differential Revision: https://reviews.llvm.org/D71023
</content>
</entry>
<entry>
<title>[PowerPC] Separate Features that are known to be Power9 specific from Future CPU</title>
<updated>2019-11-27T21:40:13+00:00</updated>
<author>
<name>Stefan Pintilie</name>
<email>stefanp@ca.ibm.com</email>
</author>
<published>2019-11-27T21:38:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=8e84c9ae99846c91c4e9828f1945c200d26d2fb9'/>
<id>urn:sha1:8e84c9ae99846c91c4e9828f1945c200d26d2fb9</id>
<content type='text'>
The Power9 CPU has some features that are unlikely to be passed on to future
versions of the CPU. This patch separates these out so that the future CPU does
not inherit them.

Differential Revision: https://reviews.llvm.org/D70466
</content>
</entry>
<entry>
<title>[x86] make SLM extract vector element more expensive than default</title>
<updated>2019-11-27T19:08:56+00:00</updated>
<author>
<name>Sanjay Patel</name>
<email>spatel@rotateright.com</email>
</author>
<published>2019-11-27T18:33:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=5c166f1d1969e9c1e5b72aa672add429b9c22b53'/>
<id>urn:sha1:5c166f1d1969e9c1e5b72aa672add429b9c22b53</id>
<content type='text'>
I'm not sure what the effect of this change will be on all of the affected
tests or a larger benchmark, but it fixes the horizontal add/sub problems
noted here:
https://reviews.llvm.org/D59710?vs=227972&amp;id=228095&amp;whitespace=ignore-most#toc

The costs are based on reciprocal throughput numbers in Agner's tables for
PEXTR*; these appear to be very slow ops on Silvermont.

This is a small step towards the larger motivation discussed in PR43605:
https://bugs.llvm.org/show_bug.cgi?id=43605

Also, it seems likely that insert/extract is the source of perf regressions on
other CPUs (up to 30%) that were cited as part of the reason to revert D59710,
so maybe we'll extend the table-based approach to other subtargets.

Differential Revision: https://reviews.llvm.org/D70607
</content>
</entry>
<entry>
<title>AMDGPU: Split test functions to avoid dependency on subtarget</title>
<updated>2019-11-19T05:42:13+00:00</updated>
<author>
<name>Matt Arsenault</name>
<email>Matthew.Arsenault@amd.com</email>
</author>
<published>2019-11-18T06:54:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.raptorcs.com/git/bcm5719-llvm/commit/?id=b337bce8710f2a7ab8ce9f84c80cfbce1032963c'/>
<id>urn:sha1:b337bce8710f2a7ab8ce9f84c80cfbce1032963c</id>
<content type='text'>
Prepare this test for moving the denormal setting out of the
subtarget features.
</content>
</entry>
</feed>
