path: root/lldb/scripts/Python/finishSwigPythonLLDB.py
author      Chandler Carruth <chandlerc@gmail.com>    2014-11-21 13:56:05 +0000
committer   Chandler Carruth <chandlerc@gmail.com>    2014-11-21 13:56:05 +0000
commit      d2b19bc867ee14c93e6ec72b28f25efb8ab8e179 (patch)
tree        7571bffa5699caaff40d9316756065ddd1f4699e    /lldb/scripts/Python/finishSwigPythonLLDB.py
parent      8a3934f85b9ebc171f162f5328219a482bf21ff9 (diff)
download    bcm5719-llvm-d2b19bc867ee14c93e6ec72b28f25efb8ab8e179.tar.gz
            bcm5719-llvm-d2b19bc867ee14c93e6ec72b28f25efb8ab8e179.zip
[x86] Teach the x86 vector shuffle lowering to detect mergable 128-bit lanes.

By special casing these we can often either reduce the total number of shuffles significantly or reduce the number of (high latency on Haswell) AVX2 shuffles that potentially cross 128-bit lanes. Even when these don't actually cross lanes, they have much higher latency to support that. Doing two of them and a blend is worse than doing a single insert across the 128-bit lanes to blend and then doing a single interleaved shuffle.

While this seems like a narrow case, it kept cropping up on me and the difference is *huge* as you can see in many of the test cases. I first hit this trying to perfectly fix the interleaving shuffle patterns used by Halide for AVX2.

llvm-svn: 222533
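As a standalone illustration (not part of this commit's backend code), the sketch below shows the kind of interleaving pattern the message describes, written with AVX2 intrinsics: two in-lane unpacks followed by a single 128-bit lane merge, instead of two lane-crossing shuffles plus a blend. The helper name interleave_low_halves is hypothetical and the sequence is only the programmer-visible shape of such a lowering, not what X86ISelLowering.cpp emits.

    // Interleave the low halves of two 8 x i32 vectors:
    //   r = {a0,b0,a1,b1,a2,b2,a3,b3}
    // Compile with -mavx2. Only the final vperm2i128 crosses 128-bit lanes.
    #include <immintrin.h>
    #include <cstdio>

    static __m256i interleave_low_halves(__m256i a, __m256i b) {
      // In-lane interleaves:
      //   lo = {a0,b0,a1,b1 | a4,b4,a5,b5}
      //   hi = {a2,b2,a3,b3 | a6,b6,a7,b7}
      __m256i lo = _mm256_unpacklo_epi32(a, b);
      __m256i hi = _mm256_unpackhi_epi32(a, b);
      // Single lane-crossing merge: take the low 128-bit lane of each operand.
      return _mm256_permute2x128_si256(lo, hi, 0x20);
    }

    int main() {
      alignas(32) int av[8] = {0, 1, 2, 3, 4, 5, 6, 7};
      alignas(32) int bv[8] = {10, 11, 12, 13, 14, 15, 16, 17};
      __m256i a = _mm256_load_si256((const __m256i *)av);
      __m256i b = _mm256_load_si256((const __m256i *)bv);
      alignas(32) int out[8];
      _mm256_store_si256((__m256i *)out, interleave_low_halves(a, b));
      for (int i = 0; i < 8; ++i)
        std::printf("%d ", out[i]);  // expected: 0 10 1 11 2 12 3 13
      std::printf("\n");
      return 0;
    }

The point of the shape above is the one the commit message makes: keeping the interleaving work inside 128-bit lanes and paying for only one lane-crossing operation avoids the higher-latency AVX2 cross-lane shuffles on Haswell.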
Diffstat (limited to 'lldb/scripts/Python/finishSwigPythonLLDB.py')
0 files changed, 0 insertions, 0 deletions