summaryrefslogtreecommitdiffstats
path: root/llvm/test/Instrumentation/DataFlowSanitizer
Commit message (Collapse)AuthorAgeFilesLines
* Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" ↵Fangrui Song2019-12-241-1/+1
| | | | as cleanups after D56351
* IR: print value numbers for unnamed function argumentsTim Northover2019-08-033-7/+7
| | | | | | | | | | For consistency with normal instructions and clarity when reading IR, it's best to print the %0, %1, ... names of function arguments in definitions. Also modifies the parser to accept IR in that form for obvious reasons. llvm-svn: 367755
* [DFSan] Add UnaryOperator visitor to DataFlowSanitizerCameron McInally2019-06-191-0/+10
| | | | | | Differential Revision: https://reviews.llvm.org/D62815 llvm-svn: 363814
* [DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label.Shiva Chen2018-05-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to set breakpoints on labels and list source code around labels, we need collect debug information for labels, i.e., label name, the function label belong, line number in the file, and the address label located. In order to keep these information in LLVM IR and to allow backend to generate debug information correctly. We create a new kind of metadata for labels, DILabel. The format of DILabel is !DILabel(scope: !1, name: "foo", file: !2, line: 3) We hope to keep debug information as much as possible even the code is optimized. So, we create a new kind of intrinsic for label metadata to avoid the metadata is eliminated with basic block. The intrinsic will keep existing if we keep it from optimized out. The format of the intrinsic is llvm.dbg.label(metadata !1) It has only one argument, that is the DILabel metadata. The intrinsic will follow the label immediately. Backend could get the label metadata through the intrinsic's parameter. We also create DIBuilder API for labels to be used by Frontend. Frontend could use createLabel() to allocate DILabel objects, and use insertLabel() to insert llvm.dbg.label intrinsic in LLVM IR. Differential Revision: https://reviews.llvm.org/D45024 Patch by Hsiangkai Wang. llvm-svn: 331841
* DataFlowSanitizer: wrappers of functions with local linkage should have the ↵Peter Collingbourne2018-03-302-0/+26
| | | | | | | | | | | | | | same linkage as the function being wrapped This patch resolves link errors when the address of a static function is taken, and that function is uninstrumented by DFSan. This change resolves bug 36314. Patch by Sam Kerner! Differential Revision: https://reviews.llvm.org/D44784 llvm-svn: 328890
* Fix DataFlowSanitizer instrumentation pass to take parameter position ↵Peter Collingbourne2018-02-222-0/+64
| | | | | | | | | | | | | | changes into account for custom functions. When DataFlowSanitizer transforms a call to a custom function, the new call has extra parameters. The attributes on parameters must be updated to take the new position of each parameter into account. Patch by Sam Kerner! Differential Revision: https://reviews.llvm.org/D43132 llvm-svn: 325820
* Remove alignment argument from memcpy/memmove/memset in favour of alignment ↵Daniel Neilson2018-01-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965
* [dfsan] Add explicit zero extensions for shadow parameters in function wrappers.Simon Dardis2017-08-173-5/+67
| | | | | | | | | | | | | | | | | In the case where dfsan provides a custom wrapper for a function, shadow parameters are added for each parameter of the function. These parameters are i16s. For targets which do not consider this a legal type, the lack of sign extension information would cause LLVM to generate anyexts around their usage with phi variables and calling convention logic. Address this by introducing zero exts for each shadow parameter. Reviewers: pcc, slthakur Differential Revision: https://reviews.llvm.org/D33349 llvm-svn: 311087
* Add element-atomic mem intrinsic canary tests for Dataflow Sanitizer.Daniel Neilson2017-07-181-0/+37
| | | | | | | | | | | | | | | | | | | Summary: Add canary tests to verify that DFSAN currently does nothing with the element atomic memory intrinsics for memcpy, memmove, and memset. Placeholder tests that will fail once @llvm.mem[cpy|move|set] instrinsics have been added to the MemIntrinsic class hierarchy. These will act as a reminder to verify that DFSAN handles these intrinsics properly once they have been added to that class hierarchy. Note that there could be some trickiness with these element-atomic intrinsics for the dataflow sanitizer in racy multithreaded programs. The data flow sanitizer inserts additional lib calls to mirror the memory intrinsic's action, so it is possible (very likely, even) that the dfsan buffers will not be in sync with the original buffers. Furthermore, implementation of the dfsan buffer updates for the element atomic intrinsics will have to also use unordered atomic instructions. If we can assume that dfsan is never run on racy multithreaded programs, then the element atomic memory intrinsics can pretty much be treated the same as the regular memory intrinsics. Reviewers: reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35507 llvm-svn: 308249
* [PR27284] Reverse the ownership between DICompileUnit and DISubprogram.Adrian Prantl2016-04-151-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently each Function points to a DISubprogram and DISubprogram has a scope field. For member functions the scope is a DICompositeType. DIScopes point to the DICompileUnit to facilitate type uniquing. Distinct DISubprograms (with isDefinition: true) are not part of the type hierarchy and cannot be uniqued. This change removes the subprograms list from DICompileUnit and instead adds a pointer to the owning compile unit to distinct DISubprograms. This would make it easy for ThinLTO to strip unneeded DISubprograms and their transitively referenced debug info. Motivation ---------- Materializing DISubprograms is currently the most expensive operation when doing a ThinLTO build of clang. We want the DISubprogram to be stored in a separate Bitcode block (or the same block as the function body) so we can avoid having to expensively deserialize all DISubprograms together with the global metadata. If a function has been inlined into another subprogram we need to store a reference the block containing the inlined subprogram. Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script that updates LLVM IR testcases to the new format. http://reviews.llvm.org/D19034 <rdar://problem/25256815> llvm-svn: 266446
* testcase gardening: update the emissionKind enum to the new syntax. (NFC)Adrian Prantl2016-04-011-1/+1
| | | | llvm-svn: 265081
* [DFSan] Remove an overly aggressive assert reported in PR26068.Chandler Carruth2016-03-071-3/+40
| | | | | | | | | | | | This code has been successfully used to bootstrap libc++ in a no-asserts mode for a very long time, so the code that follows cannot be completely incorrect. I've added a test that shows the current behavior for this kind of code with DFSan. If it is desirable for DFSan to do something special when processing an invoke of a variadic function, it can be added, but we shouldn't keep an assert that we've been ignoring due to release builds anyways. llvm-svn: 262829
* [sanitizer] [dfsan] Unify aarch64 mappingAdhemerval Zanella2015-11-271-0/+14
| | | | | | | | | | | | This patch changes the DFSan instrumentation for aarch64 to instead of using fixes application mask defined by SANITIZER_AARCH64_VMA to read the application shadow mask value from compiler-rt. The value is initialized based on runtime VAM detection. Along with this patch a compiler-rt one will also be added to export the shadow mask variable. llvm-svn: 254196
* Revert "Change memcpy/memset/memmove to have dest and source alignments."Pete Cooper2015-11-191-2/+2
| | | | | | | | | | This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543
* Change memcpy/memset/memmove to have dest and source alignments.Pete Cooper2015-11-181-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511
* DI: Reverse direction of subprogram -> function edge.Peter Collingbourne2015-11-051-3/+4
| | | | | | | | | | | | | | | | | | | | | | | Previously, subprograms contained a metadata reference to the function they described. Because most clients need to get or set a subprogram for a given function rather than the other way around, this created unneeded inefficiency. For example, many passes needed to call the function llvm::makeSubprogramMap() to build a mapping from functions to subprograms, and the IR linker needed to fix up function references in a way that caused quadratic complexity in the IR linking phase of LTO. This change reverses the direction of the edge by storing the subprogram as function-level metadata and removing DISubprogram's function field. Since this is an IR change, a bitcode upgrade has been provided. Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is attached to the PR. Differential Revision: http://reviews.llvm.org/D14265 llvm-svn: 252219
* [opaque pointer type] Add textual IR support for explicit type parameter for ↵David Blaikie2015-09-112-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | global aliases update.py: import fileinput import sys import re alias_match_prefix = r"(.*(?:=|:|^)\s*(?:external |)(?:(?:private|internal|linkonce|linkonce_odr|weak|weak_odr|common|appending|extern_weak|available_externally) )?(?:default |hidden |protected )?(?:dllimport |dllexport )?(?:unnamed_addr |)(?:thread_local(?:\([a-z]*\))? )?alias" plain = re.compile(alias_match_prefix + r" (.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|addrspacecast|\[\[[a-zA-Z]|\{\{).*$)") cast = re.compile(alias_match_prefix + r") ((?:bitcast|inttoptr|addrspacecast)\s*\(.* to (.*?)(| addrspace\(\d+\) *)\*\)\s*(?:;.*)?$)") gep = re.compile(alias_match_prefix + r") ((?:getelementptr)\s*(?:inbounds)?\s*\((?P<type>.*), (?P=type)(?:\s*addrspace\(\d+\)\s*)?\* .*\)\s*(?:;.*)?$)") def conv(line): m = re.match(cast, line) if m: return m.group(1) + " " + m.group(3) + ", " + m.group(2) m = re.match(gep, line) if m: return m.group(1) + " " + m.group(3) + ", " + m.group(2) m = re.match(plain, line) if m: return m.group(1) + ", " + m.group(2) + m.group(3) + "*" + m.group(4) + "\n" return line for line in sys.stdin: sys.stdout.write(conv(line)) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name *.ll | xargs ./apply.sh From llvm/src/tools/clang: find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll | xargs ./apply.sh llvm-svn: 247378
* DI: Require subprogram definitions to be distinctDuncan P. N. Exon Smith2015-08-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | As a follow-up to r246098, require `DISubprogram` definitions (`isDefinition: true`) to be 'distinct'. Specifically, add an assembler check, a verifier check, and bitcode upgrading logic to combat testcase bitrot after the `DIBuilder` change. While working on the testcases, I realized that test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore. Its purpose was to check for a corner case in PR22792 where two subprogram definitions match exactly and share the same metadata node. The new verifier check, requiring that subprogram definitions are 'distinct', precludes that possibility. I updated almost all the IR with the following script: git grep -l -E -e '= !DISubprogram\(.* isDefinition: true' | grep -v test/Bitcode | xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true\)/= distinct \1/' Likely some variant of would work for out-of-tree testcases. llvm-svn: 246327
* DI: Disallow uniquable DICompileUnitsDuncan P. N. Exon Smith2015-08-031-1/+1
| | | | | | | | | | | | | | | | | | Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s. The backend is liable to start relying on that (if it hasn't already), so make uniquable `DICompileUnit`s illegal and automatically upgrade old bitcode. This is a nice cleanup, since we can remove an unnecessary `DenseSet` (and the associated uniquing info) from `LLVMContextImpl`. Almost all the testcases were updated with this script: git grep -e '= !DICompileUnit' -l -- test | grep -v test/Bitcode | xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,' I imagine something similar should work for out-of-tree testcases. llvm-svn: 243885
* IR: Give 'DI' prefix to debug info metadataDuncan P. N. Exon Smith2015-04-291-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Finish off PR23080 by renaming the debug info IR constructs from `MD*` to `DI*`. The last of the `DIDescriptor` classes were deleted in r235356, and the last of the related typedefs removed in r235413, so this has all baked for about a week. Note: If you have out-of-tree code (like a frontend), I recommend that you get everything compiling and tests passing with the *previous* commit before updating to this one. It'll be easier to keep track of what code is using the `DIDescriptor` hierarchy and what you've already updated, and I think you're extremely unlikely to insert bugs. YMMV of course. Back to *this* commit: I did this using the rename-md-di-nodes.sh upgrade script I've attached to PR23080 (both code and testcases) and filtered through clang-format-diff.py. I edited the tests for test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns were off-by-three. It should work on your out-of-tree testcases (and code, if you've followed the advice in the previous paragraph). Some of the tests are in badly named files now (e.g., test/Assembler/invalid-mdcompositetype-missing-tag.ll should be 'dicompositetype'); I'll come back and move the files in a follow-up commit. llvm-svn: 236120
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-04-161-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the call instruction See r230786 and r230794 for similar changes to gep and load respectively. Call is a bit different because it often doesn't have a single explicit type - usually the type is deduced from the arguments, and just the return type is explicit. In those cases there's no need to change the IR. When that's not the case, the IR usually contains the pointer type of the first operand - but since typed pointers are going away, that representation is insufficient so I'm just stripping the "pointerness" of the explicit type away. This does make the IR a bit weird - it /sort of/ reads like the type of the first operand: "call void () %x(" but %x is actually of type "void ()*" and will eventually be just of type "ptr". But this seems not too bad and I don't think it would benefit from repeating the type ("void (), void () * %x(" and then eventually "void (), ptr %x(") as has been done with gep and load. This also has a side benefit: since the explicit type is no longer a pointer, there's no ambiguity between an explicit type and a function that returns a function pointer. Previously this case needed an explicit type (eg: a function returning a void() function was written as "call void () () * @x(" rather than "call void () * @x(" because of the ambiguity between a function returning a pointer to a void() function and a function returning void). No ambiguity means even function pointer return types can just be written alone, without writing the whole function's type. This leaves /only/ the varargs case where the explicit type is required. Given the special type syntax in call instructions, the regex-fu used for migration was a bit more involved in its own unique way (as every one of these is) so here it is. Use it in conjunction with the apply.sh script and associated find/xargs commands I've provided in rr230786 to migrate your out of tree tests. Do let me know if any of this doesn't cover your cases & we can iterate on a more general script/regexes to help others with out of tree tests. About 9 test cases couldn't be automatically migrated - half of those were functions returning function pointers, where I just had to manually delete the function argument types now that we didn't need an explicit function type there. The other half were typedefs of function types used in calls - just had to manually drop the * from those. import fileinput import sys import re pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)') addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$") func_end = re.compile("(?:void.*|\)\s*)\*$") def conv(match, line): if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)): return line return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():] for line in sys.stdin: sys.stdout.write(conv(re.search(pat, line), line)) llvm-svn: 235145
* DebugInfo: Move new hierarchy into placeDuncan P. N. Exon Smith2015-03-031-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the specialized metadata nodes for the new debug info hierarchy into place, finishing off PR22464. I've done bootstraps (and all that) and I'm confident this commit is NFC as far as DWARF output is concerned. Let me know if I'm wrong :). The code changes are fairly mechanical: - Bumped the "Debug Info Version". - `DIBuilder` now creates the appropriate subclass of `MDNode`. - Subclasses of DIDescriptor now expect to hold their "MD" counterparts (e.g., `DIBasicType` expects `MDBasicType`). - Deleted a ton of dead code in `AsmWriter.cpp` and `DebugInfo.cpp` for printing comments. - Big update to LangRef to describe the nodes in the new hierarchy. Feel free to make it better. Testcase changes are enormous. There's an accompanying clang commit on its way. If you have out-of-tree debug info testcases, I just broke your build. - `upgrade-specialized-nodes.sh` is attached to PR22564. I used it to update all the IR testcases. - Unfortunately I failed to find way to script the updates to CHECK lines, so I updated all of these by hand. This was fairly painful, since the old CHECKs are difficult to reason about. That's one of the benefits of the new hierarchy. This work isn't quite finished, BTW. The `DIDescriptor` subclasses are almost empty wrappers, but not quite: they still have loose casting checks (see the `RETURN_FROM_RAW()` macro). Once they're completely gutted, I'll rename the "MD" classes to "DI" and kill the wrappers. I also expect to make a few schema changes now that it's easier to reason about everything. llvm-svn: 231082
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-274-41/+41
| | | | | | | | | | | | | | | | | | | | | | | | load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794
* [opaque pointer type] Add textual IR support for explicit type parameter to ↵David Blaikie2015-02-273-22/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float*> %x, ... ->getelementptr float, <4 x float*> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. update.py: import fileinput import sys import re ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") normrep = re.compile( r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name *.ll | xargs ./apply.sh From llvm/src/tools/clang: find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll | xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 llvm-svn: 230786
* IR: Move MDLocation into placeDuncan P. N. Exon Smith2015-01-141-1/+1
| | | | | | | | | | | | | | | | | | | | This commit moves `MDLocation`, finishing off PR21433. There's an accompanying clang commit for frontend testcases. I'll attach the testcase upgrade script I used to PR21433 to help out-of-tree frontends/backends. This changes the schema for `DebugLoc` and `DILocation` from: !{i32 3, i32 7, !7, !8} to: !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8) Note that empty fields (line/column: 0 and inlinedAt: null) don't get printed by the assembly writer. llvm-svn: 226048
* IR: Make metadata typeless in assemblyDuncan P. N. Exon Smith2014-12-151-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802. - Only use the `metadata` type when referencing metadata from a call intrinsic -- i.e., only when it's used as a `Value`. - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when referencing it from call intrinsics. So, assembly like this: define @foo(i32 %v) { call void @llvm.foo(metadata !{i32 %v}, metadata !0) call void @llvm.foo(metadata !{i32 7}, metadata !0) call void @llvm.foo(metadata !1, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{metadata !3}, metadata !0) ret void, !bar !2 } !0 = metadata !{metadata !2} !1 = metadata !{i32* @global} !2 = metadata !{metadata !3} !3 = metadata !{} turns into this: define @foo(i32 %v) { call void @llvm.foo(metadata i32 %v, metadata !0) call void @llvm.foo(metadata i32 7, metadata !0) call void @llvm.foo(metadata i32* @global, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{!3}, metadata !0) ret void, !bar !2 } !0 = !{!2} !1 = !{i32* @global} !2 = !{!3} !3 = !{} I wrote an upgrade script that handled almost all of the tests in llvm and many of the tests in cfe (even handling many `CHECK` lines). I've attached it (or will attach it in a moment if you're speedy) to PR21532 to help everyone update their out-of-tree testcases. This is part of PR21532. llvm-svn: 224257
* Add target triples to all dfsan tests.Peter Collingbourne2014-12-0511-0/+11
| | | | llvm-svn: 223536
* [dfsan] Abort at runtime on indirect calls to uninstrumented vararg functions.Peter Collingbourne2014-11-051-2/+5
| | | | | | | | | | | | | We currently have no infrastructure to support these correctly. This is accomplished by generating a call to a runtime library function that aborts at runtime in place of the regular wrapper for such functions. Direct calls are rewritten in the usual way during traversal of the caller's IR. We also remove the "split-stack" attribute from such wrappers, as the code generator cannot currently handle split-stack vararg functions. llvm-svn: 221360
* [dfsan] New calling convention for custom functions with variadic arguments.Peter Collingbourne2014-10-301-21/+30
| | | | | | | | | | | | | | | | | | | Summary: The previous calling convention prevented custom functions from being able to access argument labels unless it knew how many variadic arguments there were, and of which type. This restriction made it impossible to correctly model functions in the printf family, as it is legal to pass more arguments than required to those functions. We now pass arguments in the following order: non-vararg arguments labels for non-vararg arguments [if vararg function, pointer to array of labels for vararg arguments] [if non-void function, pointer to label for return value] vararg arguments Differential Revision: http://reviews.llvm.org/D6028 llvm-svn: 220906
* DebugInfo+DFSan: Ensure that debug info references to llvm::Functions remain ↵David Blaikie2014-10-072-0/+38
| | | | | | | | | | | | | | | | | | | | | | pointing to the underlying function when wrappers are created This is somewhat the inverse of how similar bugs in DAE and ArgPromo manifested and were addressed. In those passes, individual call sites were visited explicitly, and then the old function was deleted. This left the debug info with a null llvm::Function* that needed to be updated to point to the new function. In the case of DFSan, it RAUWs the old function with the wrapper, which includes debug info. So now the debug info refers to the wrapper, which doesn't actually have any instructions with debug info in it, so it is ignored entirely - resulting in a DW_TAG_subprogram with no high/low pc, etc. Instead, fix up the debug info to refer to the original function after the RAUW messed it up. Reviewed/discussed with Peter Collingbourne on the llvm-dev mailing list. llvm-svn: 219249
* Introduce support for custom wrappers for vararg functions.Lorenzo Martignoni2014-09-301-4/+10
| | | | | | Differential Revision: http://reviews.llvm.org/D5412 llvm-svn: 218671
* [dfsan] Fix non-determinism bug in non-zero label check annotator.Peter Collingbourne2014-08-221-5/+8
| | | | | | | We now use a std::vector instead of a DenseSet to store the list of label checks so that we can iterate over it deterministically. llvm-svn: 216255
* [dfsan] Treat vararg custom functions like unimplemented functions.Peter Collingbourne2014-08-201-0/+6
| | | | | | | | | Because declarations of these functions can appear in places like autoconf checks, they have to be handled somehow, even though we do not support vararg custom functions. We do so by printing a warning and calling the uninstrumented function, as we do for unimplemented functions. llvm-svn: 216042
* [dfsan] Try not to create too many additional basic blocks in functions whichPeter Collingbourne2014-08-061-0/+3013
| | | | | | | already have a large number of blocks. Works around a performance issue with the greedy register allocator. llvm-svn: 214944
* [dfsan] Correctly handle loads and stores of zero size.Peter Collingbourne2014-08-012-1/+26
| | | | llvm-svn: 214561
* [dfsan] Introduce further optimization to reduce the number of union queries.Peter Collingbourne2014-07-151-4/+15
| | | | | | | Specifically, do not compute a union if it is statically known that one shadow set subsumes the other. llvm-svn: 213100
* [dfsan] Introduce an optimization to reduce the number of union queries.Peter Collingbourne2014-07-151-0/+41
| | | | | | | Specifically, when building a union query, if we are dominated by an identical query then use the result of that query instead. llvm-svn: 213047
* [dfsan] Handle bitcast aliases.Peter Collingbourne2014-07-101-0/+8
| | | | llvm-svn: 212668
* Introduce two command-line flags for the instrumentation pass to control ↵Peter Collingbourne2013-11-212-111/+256
| | | | | | | | | | | | | | whether the labels of pointers should be ignored in load and store instructions The new command line flags are -dfsan-ignore-pointer-label-on-store and -dfsan-ignore-pointer-label-on-load. Their default value matches the current labelling scheme. Additionally, the function __dfsan_union_load is marked as readonly. Patch by Lorenzo Martignoni! Differential Revision: http://llvm-reviews.chandlerc.com/D2187 llvm-svn: 195382
* DataFlowSanitizer: Implement trampolines for function pointers passed to ↵Peter Collingbourne2013-08-272-5/+17
| | | | | | | | custom functions. Differential Revision: http://llvm-reviews.chandlerc.com/D1503 llvm-svn: 189408
* DataFlowSanitizer: correctly combine labels in the case where they are equal.Peter Collingbourne2013-08-231-4/+5
| | | | llvm-svn: 189133
* DataFlowSanitizer: Replace non-instrumented aliases of instrumented ↵Peter Collingbourne2013-08-222-3/+16
| | | | | | | | functions, and vice versa, with wrappers. Differential Revision: http://llvm-reviews.chandlerc.com/D1442 llvm-svn: 189054
* DataFlowSanitizer: Prefix the name of each instrumented function with "dfs$".Peter Collingbourne2013-08-229-22/+36
| | | | | | | | | | | | | | | | DFSan changes the ABI of each function in the module. This makes it possible for a function with the native ABI to be called with the instrumented ABI, or vice versa, thus possibly invoking undefined behavior. A simple way of statically detecting instances of this problem is to prepend the prefix "dfs$" to the name of each instrumented-ABI function. This will not catch every such problem; in particular function pointers passed across the instrumented-native barrier cannot be used on the other side. These problems could potentially be caught dynamically. Differential Revision: http://llvm-reviews.chandlerc.com/D1373 llvm-svn: 189052
* [tests] Cleanup initialization of test suffixes.Daniel Dunbar2013-08-161-1/+0
| | | | | | | | | | | | | | | | | - Instead of setting the suffixes in a bunch of places, just set one master list in the top-level config. We now only modify the suffix list in a few suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py). - Aside from removing the need for a bunch of lit.local.cfg files, this enables 4 tests that were inadvertently being skipped (one in Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been XFAILED). - This commit also fixes a bunch of config files to use config.root instead of older copy-pasted code. llvm-svn: 188513
* DataFlowSanitizer: Add a debugging feature to help us track nonzero labels.Peter Collingbourne2013-08-151-0/+23
| | | | | | | | | | | | | | | | | | Summary: When the -dfsan-debug-nonzero-labels parameter is supplied, the code is instrumented such that when a call parameter, return value or load produces a nonzero label, the function __dfsan_nonzero_label is called. The idea is that a debugger breakpoint can be set on this function in a nominally label-free program to help identify any bugs in the instrumentation pass causing labels to be introduced. Reviewers: eugenis CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D1405 llvm-svn: 188472
* DataFlowSanitizer: move abilist input file to Inputs.Peter Collingbourne2013-08-142-1/+1
| | | | llvm-svn: 188423
* DataFlowSanitizer: Instrumentation for memset.Peter Collingbourne2013-08-141-0/+11
| | | | | | Differential Revision: http://llvm-reviews.chandlerc.com/D1395 llvm-svn: 188412
* DataFlowSanitizer: greylist is now ABI list.Peter Collingbourne2013-08-142-0/+58
| | | | | | | | | | | | | | | | | | | | | | | | | This replaces the old incomplete greylist functionality with an ABI list, which can provide more detailed information about the ABI and semantics of specific functions. The pass treats every function in the "uninstrumented" category in the ABI list file as conforming to the "native" (i.e. unsanitized) ABI. Unless the ABI list contains additional categories for those functions, a call to one of those functions will produce a warning message, as the labelling behaviour of the function is unknown. The other supported categories are "functional", "discard" and "custom". - "discard" -- This function does not write to (user-accessible) memory, and its return value is unlabelled. - "functional" -- This function does not write to (user-accessible) memory, and the label of its return value is the union of the label of its arguments. - "custom" -- Instead of calling the function, a custom wrapper __dfsw_F is called, where F is the name of the function. This function may wrap the original function or provide its own implementation. Differential Revision: http://llvm-reviews.chandlerc.com/D1345 llvm-svn: 188402
* Reapply r188119 now that the bug it exposed is fixed.Peter Collingbourne2013-08-121-2/+12
| | | | llvm-svn: 188217
* Revert r188119 "Kill some duplicated code for removing unreachable BBs."Arnold Schwaighofer2013-08-101-12/+2
| | | | | | | | | | | | | | It is breaking builbots with libgmalloc enabled on Mac OS X. $ cd llvm ; mkdir release ; cd release $ ../configure --enable-optimized —prefix=$PWD/install $ make $ make check $ Release+Asserts/bin/llvm-lit -v --param use_gmalloc=1 --param \ gmalloc_path=/usr/lib/libgmalloc.dylib \ ../test/Instrumentation/DataFlowSanitizer/args-unreachable-bb.ll llvm-svn: 188142
OpenPOWER on IntegriCloud