summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms/SROA/alignment.ll
Commit message (Collapse)AuthorAgeFilesLines
* SROA: Don't drop atomic load/store alignments (PR45010)Hans Wennborg2020-02-281-0/+15
| | | | | | | | | | | | | SROA will drop the explicit alignment on allocas when the ABI guarantees enough alignment. Because the alignment on new load/store instructions are set based on the alloca's alignment, that means SROA would end up dropping the alignment from atomic loads and stores, which is not allowed (see bug). For those, make sure to always carry over the alignment from the previous instruction. Differential revision: https://reviews.llvm.org/D75266 (cherry picked from commit d48c981697a49653efff9dd14fa692d99e6fa868)
* Revert "Temporarily Revert "Add basic loop fusion pass.""Eric Christopher2019-04-171-0/+231
| | | | | | | | The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552
* Temporarily Revert "Add basic loop fusion pass."Eric Christopher2019-04-171-231/+0
| | | | | | | | As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546
* Tighten up tests that use -debugify as a shortcut. NFCPaul Robinson2019-03-141-2/+3
| | | | | | | | These now verify that a given instruction has a specific source location, rather than any old location. We want to make sure we propagate the correct locations from one instruction to another. llvm-svn: 356217
* SROA: preserve alignment tags on loads and stores.Tim Northover2018-12-181-0/+47
| | | | | | | | | | | When splitting up an alloca's uses we were dropping any explicit alignment tags, which means they default to the ABI-required default alignment and this can cause miscompiles if the real value was smaller. Also refactor the TBAA metadata into a parent class since it's shared by both children anyway. llvm-svn: 349465
* [SROA] Preserve DebugLoc when rewriting alloca partitionsAnastasis Grammenos2018-06-281-0/+9
| | | | | | | | | When rewriting an alloca partition copy the DL from the old alloca over the the new one. Differential Revision: https://reviews.llvm.org/D48640 llvm-svn: 335904
* Remove alignment argument from memcpy/memmove/memset in favour of alignment ↵Daniel Neilson2018-01-191-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965
* Revert "Change memcpy/memset/memmove to have dest and source alignments."Pete Cooper2015-11-191-9/+9
| | | | | | | | | | This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543
* Change memcpy/memset/memmove to have dest and source alignments.Pete Cooper2015-11-181-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-271-16/+16
| | | | | | | | | | | | | | | | | | | | | | | | load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-271-18/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float*> %x, ... ->getelementptr float, <4 x float*> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. update.py: import fileinput import sys import re ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") normrep = re.compile( r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name *.ll | xargs ./apply.sh From llvm/src/tools/clang: find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll | xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 llvm-svn: 230786
* [SROA] Teach SROA to be more aggressive in splitting now that we haveChandler Carruth2015-01-021-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a pre-splitting pass over loads and stores. Historically, splitting could cause enough problems that I hamstrung the entire process with a requirement that splittable integer loads and stores must cover the entire alloca. All smaller loads and stores were unsplittable to prevent chaos from ensuing. With the new pre-splitting logic that does load/store pair splitting I introduced in r225061, we can now very nicely handle arbitrarily splittable loads and stores. In order to fully benefit from these smarts, we need to mark all of the integer loads and stores as splittable. However, we don't actually want to rewrite partitions with all integer loads and stores marked as splittable. This will fail to extract scalar integers from aggregates, which is kind of the point of SROA. =] In order to resolve this, what we really want to do is only do pre-splitting on the alloca slices with integer loads and stores fully splittable. This allows us to uncover all non-integer uses of the alloca that would benefit from a split in an integer load or store (and where introducing the split is safe because it is just memory transfer from a load to a store). Once done, we make all the non-whole-alloca integer loads and stores unsplittable just as they have historically been, repartition and rewrite. The result is that when there are integer loads and stores anywhere within an alloca (such as from a memcpy of a sub-object of a larger object), we can split them up if there are non-integer components to the aggregate hiding beneath. I've added the challenging test cases to demonstrate how this is able to promote to scalars even a case where we have even *partially* overlapping loads and stores. This restores the single-store behavior for small arrays of i8s which is really nice. I've restored both the little endian testing and big endian testing for these exactly as they were prior to r225061. It also forced me to be more aggressive in an alignment test to actually defeat SROA. =] Without the added volatiles there, we actually split up the weird i16 loads and produce nice double allocas with better alignment. This also uncovered a number of bugs where we failed to handle splittable load and store slices which didn't have a begininng offset of zero. Those fixes are included, and without them the existing test cases explode in glorious fireworks. =] I've kept support for leaving whole-alloca integer loads and stores as splittable even for the purpose of rewriting, but I think that's likely no longer needed. With the new pre-splitting, we might be able to remove all the splitting support for loads and stores from the rewriter. Not doing that in this patch to try to isolate any performance regressions that causes in an easy to find and revert chunk. llvm-svn: 225074
* Update Transforms tests to use CHECK-LABEL for easier debugging. No ↵Stephen Lin2013-07-141-7/+7
| | | | | | | | | | | | | | | | | | | | | | functionality change. This update was done with the following bash script: find test/Transforms -name "*.ll" | \ while read NAME; do echo "$NAME" if ! grep -q "^; *RUN: *llc" $NAME; then TEMP=`mktemp -t temp` cp $NAME $TEMP sed -n "s/^define [^@]*@\([A-Za-z0-9_]*\)(.*$/\1/p" < $NAME | \ while read FUNC; do sed -i '' "s/;\(.*\)\([A-Za-z0-9_]*\):\( *\)@$FUNC\([( ]*\)\$/;\1\2-LABEL:\3@$FUNC(/g" $TEMP done mv $TEMP $NAME fi done llvm-svn: 186268
* Teach SROA to cope with wrapper aggregates. These show up a lot in ABIChandler Carruth2012-10-131-31/+0
| | | | | | | | | | | | | | | | | | | | | | | type coercion code, especially when targetting ARM. Things like [1 x i32] instead of i32 are very common there. The goal of this logic is to ensure that when we are picking an alloca type, we look through such wrapper aggregates and across any zero-length aggregate elements to find the simplest type possible to form a type partition. This logic should (generally speaking) rarely fire. It only ends up kicking in when an alloca is accessed using two different types (for instance, i32 and float), and the underlying alloca type has wrapper aggregates around it. I noticed a significant amount of this occurring looking at stepanov_abstraction generated code for arm, and suspect it happens elsewhere as well. Note that this doesn't yet address truly heinous IR productions such as PR14059 is concerning. Those result in mismatched *sizes* of types in addition to mismatched access and alloca types. llvm-svn: 165870
* Teach the integer-promotion rewrite strategy to be endianness aware.Chandler Carruth2012-10-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | Sorry for this being broken so long. =/ As part of this, switch all of the existing tests to be Little Endian, which is the behavior I was asserting in them anyways! Add in a new big-endian test that checks the interesting behavior there. Another part of this is to tighten the rules abotu when we perform the full-integer promotion. This logic now rejects cases where there fully promoted integer is a non-multiple-of-8 bitwidth or cases where the loads or stores touch bits which are in the allocated space of the alloca but are not loaded or stored when accessing the integer. Sadly, these aren't really observable today as the rest of the pass will already ensure the invariants hold. However, the latter situation is likely to become a potential concern in the future. Thanks to Benjamin and Duncan for early review of this patch. I'm still looking into whether there are further endianness issues, please let me know if anyone sees BE failures persisting past this. llvm-svn: 165219
* Fix an issue where we failed to adjust the alignment constraint onChandler Carruth2012-10-031-0/+31
| | | | | | | | a memcpy to reflect that '0' has a different meaning when applied to a load or store. Now we correctly use underaligned loads and stores for the test case added. llvm-svn: 165101
* Try to use a better set of abstractions for computing the alignmentChandler Carruth2012-10-031-4/+59
| | | | | | | | | | | | | | | | | | | | necessary during rewriting. As part of this, fix a real think-o here where we might have left off an alignment specification when the address is in fact underaligned. I haven't come up with any way to trigger this, as there is always some other factor that reduces the alignment, but it certainly might have been an observable bug in some way I can't think of. This also slightly changes the strategy for placing explicit alignments on loads and stores to only do so when the alignment does not match that required by the ABI. This causes a few redundant alignments to go away from test cases. I've also added a couple of tests that really push on the alignment that we end up with on loads and stores. More to come here as I try to fix an underlying bug I have conjectured and produced test cases for, although it's not clear if this bug is the one currently hitting dragonegg's gcc47 bootstrap. llvm-svn: 165100
* Fix more misspellings found by Duncan during review.Chandler Carruth2012-10-011-2/+2
| | | | llvm-svn: 164940
* Fix several issues with alignment. We weren't always accounting for typeChandler Carruth2012-10-011-0/+31
| | | | | | | | | | alignment requirements of the new alloca. As one consequence which was reported as a bug by Duncan, we overaligned memcpy calls to ranges of allocas after they were rewritten to types with lower alignment requirements. Other consquences are possible, but I don't have any test cases for them. llvm-svn: 164937
* Fix a somewhat surprising miscompile where code relying on an ABIChandler Carruth2012-09-291-1/+22
| | | | | | | | | | | | | | | alignment could lose it due to the alloca type moving down to a much smaller alignment guarantee. Now SROA will actively compute a proper alignment, factoring the target data, any explicit alignment, and the offset within the struct. This will in some cases lower the alignment requirements, but when we lower them below those of the type, we drop the alignment entirely to give freedom to the code generator to align it however is convenient. Thanks to Duncan for the lovely test case that pinned this down. =] llvm-svn: 164891
* When rewriting the pointer operand to a load or store which hasChandler Carruth2012-09-261-0/+18
| | | | | | | alignment guarantees attached, re-compute the alignment so that we consider offsets which impact alignment. llvm-svn: 164690
* Teach all of the loads, stores, memsets and memcpys created by theChandler Carruth2012-09-261-0/+46
rewriter in SROA to carry a proper alignment. This involves interrogating various sources of alignment, etc. This is a more complete and principled fix to PR13920 as well as related bugs pointed out by Eli in review and by inspection in the area. Also by inspection fix the integer and vector promotion paths to create aligned loads and stores. I still need to work up test cases for these... Sorry for the delay, they were found purely by inspection. llvm-svn: 164689
OpenPOWER on IntegriCloud