author: Jingyue Wu <jingyue@google.com> 2015-07-01 20:08:06 +0000
committer: Jingyue Wu <jingyue@google.com> 2015-07-01 20:08:06 +0000
commit: 77b5b385ee4f727a166eb5bd5175e396a17c1407
tree: 268672316de1cab7e9e6a6a62e11f5dee9e0b76c /llvm/test
parent: d4cb1dddebe5a65672d62fe379956f7324749fa4
[NVPTX] Move NVPTXPeephole after NVPTXPrologEpilogPass
Summary:
Offset of frame index is calculated by NVPTXPrologEpilogPass. Before that pass runs, the correct offset of stack objects cannot be obtained, which leads to a wrong offset if there are more than 2 frame objects. This patch moves NVPTXPeephole after NVPTXPrologEpilogPass. Because the frame index is already replaced by %VRFrame in NVPTXPrologEpilogPass, we check the VRFrame register instead, and try to remove VRFrame if it has no uses after the NVPTXPeephole pass.

Patched by Xuetian Weng.

Test Plan:
Strengthened test/CodeGen/NVPTX/local-stack-frame.ll to check the offset calculation based on SP and SPL.

Reviewers: jholewinski, jingyue

Reviewed By: jingyue

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10853

llvm-svn: 241185
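As an illustration of why the pass order matters, the following is a minimal sketch of frame-offset assignment. It is not LLVM's code: `align_to`, `assign_frame_offsets`, and the two 4-byte, 4-byte-aligned slots are assumptions chosen to be consistent with the offsets 0 and 4 that the strengthened test checks. A pass that runs before offsets are assigned has nothing better than offset 0 for every stack object, which is only right for the first one.

```python
# Hypothetical model of stack-frame layout; names and sizes are assumptions,
# not taken from NVPTXPrologEpilogPass.

def align_to(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def assign_frame_offsets(objects):
    """Lay out (size, align) stack objects in order; return their byte offsets."""
    offsets = []
    cursor = 0
    for size, align in objects:
        cursor = align_to(cursor, align)
        offsets.append(cursor)
        cursor += size
    return offsets


# Assumed frame: two 4-byte slots with 4-byte alignment.
frame_objects = [(4, 4), (4, 4)]

# What a pass sees if it runs before offsets exist: every object at offset 0.
unassigned = [0 for _ in frame_objects]

# What a pass sees once the layout has been computed.
assigned = assign_frame_offsets(frame_objects)

print("before layout:", unassigned)
print("after layout: ", assigned)
```

Under these assumptions the second object only gets a distinct offset after layout, which matches the intent of running the peephole pass once the offsets are known.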
Diffstat (limited to 'llvm/test')
-rw-r--r-- llvm/test/CodeGen/NVPTX/local-stack-frame.ll | 6
1 files changed, 6 insertions, 0 deletions
diff --git a/llvm/test/CodeGen/NVPTX/local-stack-frame.ll b/llvm/test/CodeGen/NVPTX/local-stack-frame.ll
index fba5dd883f9..ef1b7da6ad0 100644
--- a/llvm/test/CodeGen/NVPTX/local-stack-frame.ll
+++ b/llvm/test/CodeGen/NVPTX/local-stack-frame.ll
@@ -59,10 +59,16 @@ define void @foo3(i32 %a) {
; PTX32: cvta.local.u32 %SP, %SPL;
; PTX32: add.u32 {{%r[0-9]+}}, %SP, 0;
+; PTX32: add.u32 {{%r[0-9]+}}, %SPL, 0;
+; PTX32: add.u32 {{%r[0-9]+}}, %SP, 4;
+; PTX32: add.u32 {{%r[0-9]+}}, %SPL, 4;
; PTX32: st.local.u32 [{{%r[0-9]+}}], {{%r[0-9]+}}
; PTX32: st.local.u32 [{{%r[0-9]+}}], {{%r[0-9]+}}
; PTX64: cvta.local.u64 %SP, %SPL;
; PTX64: add.u64 {{%rd[0-9]+}}, %SP, 0;
+; PTX64: add.u64 {{%rd[0-9]+}}, %SPL, 0;
+; PTX64: add.u64 {{%rd[0-9]+}}, %SP, 4;
+; PTX64: add.u64 {{%rd[0-9]+}}, %SPL, 4;
; PTX64: st.local.u32 [{{%rd[0-9]+}}], {{%r[0-9]+}}
; PTX64: st.local.u32 [{{%rd[0-9]+}}], {{%r[0-9]+}}
define void @foo4() {