Commit message | Author | Age | Files | Lines
* Reverting r335326 while I look at the test failure | Sjoerd Meijer | 2018-06-22 | 2 | -15/+13
  llvm-svn: 335328
* Revert r335324 due to a buildbot failure | Eugene Leviant | 2018-06-22 | 4 | -143/+3
  llvm-svn: 335327
* [ARM] ARMv6m and v8m.baseline strict align | Sjoerd Meijer | 2018-06-22 | 2 | -13/+15
  This sets target feature FeatureStrictAlign for Armv6-m and
  Armv8-m.baseline, because they have no support for unaligned accesses.

  It looks like we always pass target feature "+strict-align" from Clang,
  so this is not a user-facing problem, but querying the subtarget
  (in e.g. llc) for unaligned access support is incorrect.

  Differential Revision: https://reviews.llvm.org/D48437
  llvm-svn: 335326
* AMDGPU: Add patterns for i32/i64 local atomic load/store | Matt Arsenault | 2018-06-22 | 6 | -1/+159
  Not sure why the 32/64 split is needed in the atomic_load store
  hierarchies. The regular PatFrags do this, but we don't do it for the
  existing handling for global.

  llvm-svn: 335325
* [Evaluator] Improve evaluation of call instruction | Eugene Leviant | 2018-06-22 | 4 | -3/+143
  Differential revision: https://reviews.llvm.org/D46584
  llvm-svn: 335324
* [X86] Changing the check for valid inputs in combineScalarToVector | Mikhail Dvoretckii | 2018-06-22 | 2 | -6/+15
  Changing the logic of scalar mask folding to check for valid input types
  rather than against invalid ones, making it more robust and fixing PR37879.

  Differential Revision: https://reviews.llvm.org/D48366
  llvm-svn: 335323
* tsan: fix deficiency in MutexReadOrWriteUnlock | Dmitry Vyukov | 2018-06-22 | 1 | -1/+1
  MutexUnlock uses ReleaseStore on s->clock, which is the right thing to do.
  However MutexReadOrWriteUnlock for writers uses Release on s->clock.
  Make MutexReadOrWriteUnlock also use ReleaseStore for consistency and
  performance.

  Unfortunately, I don't think any test can detect this as this only
  potentially affects performance.

  llvm-svn: 335322
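To illustrate why r335322 should only affect performance, here is a minimal Python sketch of the two clock operations. It assumes, beyond what the commit message states, that Release joins the thread's vector clock into the sync clock by element-wise maximum and that ReleaseStore overwrites the sync clock. The function names and list-based clocks are hypothetical, not tsan's real data structures.

```python
def release(sync_clock, thread_clock):
    """Join (assumed semantics): element-wise maximum of both clocks."""
    return [max(s, t) for s, t in zip(sync_clock, thread_clock)]


def release_store(sync_clock, thread_clock):
    """Store (assumed semantics): overwrite the sync clock with a copy."""
    return list(thread_clock)


# Assumption: a writer acquired the mutex clock when it locked, so its own
# clock is at least as large as the sync clock in every component.
sync_clock = [3, 1, 4]
writer_clock = [5, 2, 4]

joined = release(sync_clock, writer_clock)
stored = release_store(sync_clock, writer_clock)
```

Under that assumption both operations yield the same clock, and the store variant skips the per-element comparison.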
* [clangd] Remove FilterText from the index. | Sam McCall | 2018-06-22 | 9 | -37/+4
  Summary:
  It's almost always identical to Name, and in fact we never used it (we used
  name instead).
  The only case where they differ is objc method selectors (foo: vs foo:bar:).
  We can live with the latter for both name and filterText, so I've made that
  change too.

  Reviewers: ioeric
  Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits

  Differential Revision: https://reviews.llvm.org/D48375
  llvm-svn: 335321
* Revert r335306 (and r335314) - the Call Graph Profile pass. | Chandler Carruth | 2018-06-22 | 16 | -308/+7
  This is the first pass in the main pipeline to use the legacy PM's
  ability to run function analyses "on demand". Unfortunately, it turns
  out there are bugs in that somewhat-hacky approach. At the very least,
  it leaks memory and doesn't support -debug-pass=Structure. Unclear if
  there are larger issues or not, but this should get the sanitizer bots
  back to green by fixing the memory leaks.

  llvm-svn: 335320
* AMDGPU/GlobalISel: Default to using TableGen'd instruction selector | Tom Stellard | 2018-06-22 | 1 | -7/+0
  Summary:
  We can select all instructions that are marked as legal in a full piglit
  run, so now is a good time to make the TableGen'd instruction selector
  default for all opcodes.

  This is NFC for a full piglit run, which is why there are no tests.

  Reviewers: arsenm, nhaehnle
  Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

  Differential Revision: https://reviews.llvm.org/D48198
  llvm-svn: 335319
* AMDGPU/GlobalISel: legalize and select 32-bit G_ASHR | Tom Stellard | 2018-06-22 | 6 | -0/+155
  Reviewers: arsenm, nhaehnle
  Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

  Differential Revision: https://reviews.llvm.org/D48196
  llvm-svn: 335318
* [LegacyPM] Fix PR37888 by teaching the legacy loop pass manager how to clear out deleted loops from the current queue beyond just the current loop. | Chandler Carruth | 2018-06-22 | 2 | -1/+48
  This is important because SimpleLoopUnswitch will now enqueue the same
  loop to be re-processed. When it does this with the legacy PM, we don't
  have a way of canceling the rest of the pipeline and so we can end up
  deleting the loop before we reprocess it. =/

  This change also makes it easy to support deleting other loops in the
  queue to process, although I don't have any use cases for that.

  Differential Revision: https://reviews.llvm.org/D48470
  llvm-svn: 335317
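The queue behaviour described in r335317 can be modelled with a small Python sketch. This is a toy work queue, not the legacy pass manager: `run_loop_passes`, the string loop names, and the `(requeue, deleted)` return convention are all hypothetical.

```python
from collections import deque


def run_loop_passes(initial_loops, process):
    """Run `process` on each queued loop and return the visit order.

    `process(loop)` returns (requeue, deleted): loops to enqueue again and
    the set of loops that were deleted while running the pipeline.
    """
    queue = deque(initial_loops)
    visited = []
    while queue:
        loop = queue.popleft()
        visited.append(loop)
        requeue, deleted = process(loop)
        queue.extend(requeue)
        if deleted:
            # Purge deleted loops from the whole pending queue, not only
            # the loop that is currently being processed.
            queue = deque(l for l in queue if l not in deleted)
    return visited


def process(loop):
    if loop == "L1":
        # An unswitch-like pass re-enqueues L1, then a later pass deletes it.
        return ["L1"], {"L1"}
    return [], set()


order = run_loop_passes(["L1", "L2"], process)
```

Without the purge step, the stale "L1" entry would be popped and processed again after the loop it names had been deleted.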
* AMDGPU/GlobalISel: legalize and select 32-bit G_SITOFP | Tom Stellard | 2018-06-22 | 6 | -0/+68
  Reviewers: arsenm, nhaehnle
  Reviewed By: arsenm
  Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

  Differential Revision: https://reviews.llvm.org/D48195
  llvm-svn: 335316
* AMDGPU/GlobalISel: Implement select() for COPY | Tom Stellard | 2018-06-22 | 2 | -1/+31
  Reviewers: arsenm, nhaehnle
  Reviewed By: nhaehnle
  Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

  Differential Revision: https://reviews.llvm.org/D46151
  llvm-svn: 335315
* Fix test failures after r335306 due to the pipeline changing. | Chandler Carruth | 2018-06-22 | 3 | -0/+12
  This wasn't obvious for the author to fix because this is the first
  pipeline use of the magic utility to get function analyses within
  a module pass in the legacy pass manager. Turns out that has a bug which
  prevents dumping the structure of the pipeline and shows up as an
  unnamed pass. I've just left a FIXME for that as it doesn't seem likely
  worth fixing and certainly shouldn't hold up getting the bots green.

  llvm-svn: 335314
* Remove dead code | Frederic Riss | 2018-06-22 | 1 | -42/+20
  Our DWARF parsing code had a workaround for Objective-C "self" not
  being marked as artificial by the compiler. Clang has been doing this
  since 2010, so let's just drop the workaround.

  llvm-svn: 335313
* [InstCombine] fix shuffle-of-binops bug | Sanjay Patel | 2018-06-21 | 2 | -4/+11
  With non-commutative binops, we could be using the same variable value
  as operand 0 in 1 binop and operand 1 in the other, so we have to check
  for that possibility and bail out.

  llvm-svn: 335312
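A small Python model shows why the fold in r335312 has to bail out. It uses plain lists as vectors and subtraction as the non-commutative binop; the helper name `select` and all values are hypothetical and only illustrate the operand-order problem.

```python
def select(mask, lhs, rhs):
    """Per-lane select: take lhs where the mask is True, otherwise rhs."""
    return [l if take_lhs else r for take_lhs, l, r in zip(mask, lhs, rhs)]


x = [10, 20]
c1 = [1, 2]
c2 = [5, 6]
mask = [True, False]

# Original code: x is operand 0 of the first sub and operand 1 of the second.
first = [a - b for a, b in zip(x, c1)]    # x - c1
second = [a - b for a, b in zip(c2, x)]   # c2 - x
correct = select(mask, first, second)

# Naive fold: pretend both binops are "x - constant" and shuffle constants.
naive = [a - b for a, b in zip(x, select(mask, c1, c2))]
```

The two results differ in the lane taken from the second binop, so the single-binop form is only valid when the variable sits in the same operand position in both.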
* [InstCombine] add test for shuffle-of-binops; NFC | Sanjay Patel | 2018-06-21 | 1 | -0/+14
  This shows a miscompile that was missed in rL335283.

  llvm-svn: 335311
* [x86] Fix a tiny bug in my test case in r335309 by marking that we don't expect any diagnostics. | Chandler Carruth | 2018-06-21 | 1 | -1/+2
  llvm-svn: 335310
* [x86] Teach the builtin argument range check to allow invalid ranges in dead code. | Chandler Carruth | 2018-06-21 | 20 | -300/+340
  This is important for C++ templates that essentially compute the valid
  input in a way that is constant and will cause all the invalid cases to
  be dead code that is deleted. Code in the wild actually does this and
  GCC also accepts these kinds of patterns so it is important to support
  it.

  To make this work, we provide a non-error path to diagnose these issues,
  and use a default-error warning instead. This keeps the relatively
  strict handling but prevents nastiness like SFINAE on these errors. It
  also allows us to safely use the system to diagnose this only when it
  occurs at runtime (in emitted code).

  Entertainingly, this required fixing the syntax in various other ways
  for the x86 test because we never bothered to diagnose that the returns
  were invalid.

  Since debugging these compile failures was super confusing, I've also
  improved the diagnostic to actually say what the value was. Most of the
  checks I've made ignore this to simplify maintenance, but I've checked
  it in a few places to make sure the diagnostic is working.

  Depends on D48462. Without that, we might actually crash some part of
  the compiler after bypassing the error here.

  Thanks to Richard, Ben Kramer, and especially Craig Topper for all the
  help here.

  Differential Revision: https://reviews.llvm.org/D48464
  llvm-svn: 335309
* [X86] Update handling in CGBuiltin to be tolerant of out of range immediates. | Craig Topper | 2018-06-21 | 5 | -32/+48
  D48464 contains changes that will loosen some of the range checks in
  SemaChecking to a DefaultError warning that can be disabled.

  This patch adds explicit masking to avoid using the upper bits of
  immediates to gracefully handle the warning being disabled.

  llvm-svn: 335308
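The explicit masking mentioned in r335308 amounts to keeping only the low bits an instruction actually encodes. The Python helper below is a hypothetical illustration; `mask_immediate` and the bit widths are not taken from CGBuiltin.

```python
def mask_immediate(imm, bits):
    """Keep only the low `bits` bits of an immediate operand."""
    return imm & ((1 << bits) - 1)


in_range = mask_immediate(5, 3)        # already fits in 3 bits
out_of_range = mask_immediate(9, 3)    # upper bit is discarded
wide = mask_immediate(0x1FF, 8)        # clamp to an 8-bit field
```

With the range diagnostic disabled, an out-of-range value then degrades to a well-defined in-range one instead of leaking its upper bits into code generation.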
* AMDGPU/GlobalISel: Implement select() for G_IMPLICIT_DEF | Tom Stellard | 2018-06-21 | 3 | -0/+41
  Reviewers: arsenm, nhaehnle
  Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

  Differential Revision: https://reviews.llvm.org/D46150
  llvm-svn: 335307
* [Instrumentation] Add Call Graph Profile pass | Michael J. Spencer | 2018-06-21 | 13 | -7/+296
  This patch adds support for generating a call graph profile from Branch
  Frequency Info.

  The CGProfile module pass simply gets the block profile count for each BB
  and scans for call instructions. For each call instruction it adds an edge
  from the current function to the called function with the current BB block
  profile count as the weight.

  After scanning all the functions, it generates an appending module flag
  containing the data. The format looks like:

      !llvm.module.flags = !{!0}

      !0 = !{i32 5, !"CG Profile", !1}
      !1 = !{!2, !3, !4} ; List of edges
      !2 = !{void ()* @a, void ()* @b, i64 32} ; Edge from a to b with a weight of 32
      !3 = !{void (i1)* @freq, void ()* @a, i64 11}
      !4 = !{void (i1)* @freq, void ()* @b, i64 20}

  Differential Revision: https://reviews.llvm.org/D48105
  llvm-svn: 335306
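The edge collection described in r335306 can be sketched in Python. This is a toy model rather than the pass itself: the input shape (caller mapped to a list of block counts with their callees) and the name `build_cg_profile` are hypothetical, and summing the weights of repeated caller/callee pairs is an assumption of this sketch.

```python
from collections import defaultdict


def build_cg_profile(functions):
    """Build {(caller, callee): weight} from per-block profile counts.

    `functions` maps a caller name to a list of (block_count, callees)
    pairs, one pair per basic block (hypothetical input shape).
    """
    edges = defaultdict(int)
    for caller, blocks in functions.items():
        for block_count, callees in blocks:
            for callee in callees:
                # Each call contributes its block's profile count.
                edges[(caller, callee)] += block_count
    return dict(edges)


profile = build_cg_profile({
    "a": [(32, ["b"])],
    "freq": [(11, ["a"]), (20, ["b"])],
})
```

The resulting three edges mirror the `!2`, `!3`, and `!4` entries in the module flag example from the commit message.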
* Ignore blacklist when generating __cfi_check_fail. | Evgeniy Stepanov | 2018-06-21 | 2 | -0/+11
  Summary: Fixes PR37898.

  Reviewers: pcc, vlad.tsyrklevich
  Subscribers: cfe-commits

  Differential Revision: https://reviews.llvm.org/D48454
  llvm-svn: 335305
* [X86] Fix 32-bit mingw comdat names, only add one underscore | Reid Kleckner | 2018-06-21 | 3 | -16/+14
  llvm-svn: 335304
* [gdb] Update llvm::Optional | Fangrui Song | 2018-06-21 | 1 | -3/+4
  Reviewers: dblaikie
  Subscribers: llvm-commits

  Differential Revision: https://reviews.llvm.org/D48461
  llvm-svn: 335303
* [AMDGPU] Fix lit failures introduced in r335281 | Scott Linder | 2018-06-21 | 2 | -0/+6
  The tests do not support big-endian hosts.

  llvm-svn: 335302
* [IR] fix typo in comment; NFC | Sanjay Patel | 2018-06-21 | 1 | -1/+1
  llvm-svn: 335301
* Revert r335297 "[X86] Implement more of x86-64 large and medium PIC code models" | Reid Kleckner | 2018-06-21 | 13 | -531/+39
  MCJIT can't handle R_X86_64_GOT64 yet.

  llvm-svn: 335300
* Test commit, made a minor change to a comment | Emmett Neyman | 2018-06-21 | 1 | -1/+1
  llvm-svn: 335299
* [X86] Commit some comments that weren't in the medium code model patch | Reid Kleckner | 2018-06-21 | 1 | -4/+4
  llvm-svn: 335298
* [X86] Implement more of x86-64 large and medium PIC code models | Reid Kleckner | 2018-06-21 | 13 | -37/+529
  Summary:
  The large code model allows code and data segments to exceed 2GB, which
  means that some symbol references may require a displacement that cannot
  be encoded as a displacement from RIP. The large PIC model even relaxes
  the assumption that the GOT itself is within 2GB of all code. Therefore,
  we need a special code sequence to materialize it:

      .LtmpN:
        leaq .LtmpN(%rip), %rbx
        movabsq $_GLOBAL_OFFSET_TABLE_-.LtmpN, %rax # Scratch
        addq %rax, %rbx # GOT base reg

  From that, non-local references go through the GOT base register instead
  of being PC-relative loads. Local references typically use GOTOFF
  symbols, like this:

      movq extern_gv@GOT(%rbx), %rax
      movq local_gv@GOTOFF(%rbx), %rax

  All calls end up being indirect:

      movabsq $local_fn@GOTOFF, %rax
      addq %rbx, %rax
      callq *%rax

  The medium code model retains the assumption that the code segment is
  less than 2GB, so calls are once again direct, and the RIP-relative
  loads can be used to access the GOT. Materializing the GOT is easy:

      leaq _GLOBAL_OFFSET_TABLE_(%rip), %rbx # GOT base reg

  DSO local data accesses will use it:

      movq local_gv@GOTOFF(%rbx), %rax

  Non-local data accesses will use RIP-relative addressing, which means we
  may not always need to materialize the GOT base:

      movq extern_gv@GOTPCREL(%rip), %rax

  Direct calls are basically the same as they are in the small code model:
  They use direct, PC-relative addressing, and the PLT is used for calls
  to non-local functions.

  This patch adds reasonably comprehensive testing of LEA, but there are
  lots of interesting folding opportunities that are unimplemented.

  Reviewers: chandlerc, echristo
  Subscribers: hiraditya, llvm-commits

  Differential Revision: https://reviews.llvm.org/D47211
  llvm-svn: 335297
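The 2GB limit behind r335297 comes from RIP-relative addressing using a signed 32-bit displacement. The Python sketch below checks whether a target is reachable that way; `fits_rip_relative` and the example addresses are hypothetical and only illustrate the range check, not how LLVM selects a code sequence.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def fits_rip_relative(target, next_instruction):
    """Return True if target is within a signed 32-bit displacement of RIP.

    RIP-relative operands are encoded relative to the address of the
    following instruction.
    """
    displacement = target - next_instruction
    return INT32_MIN <= displacement <= INT32_MAX


near = fits_rip_relative(0x401000, 0x400000)
far = fits_rip_relative(0x1_0000_0000, 0x400000)
```

When the check fails, as it may for data in the medium model and for code, data, and the GOT in the large model, a full 64-bit value has to be materialized, which is what the `movabsq` sequences in the commit message do.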
* [scudo] Add a minimal runtime for -fsanitize-minimal-runtime compatibility | Kostya Kortchinsky | 2018-06-21 | 1 | -8/+34
  Summary:
  This patch follows D48373.

  The point is to be able to use Scudo with `-fsanitize-minimal-runtime`. For that
  we need a runtime that doesn't embed the UBSan one. This results in binaries
  that can be compiled with `-fsanitize=scudo,integer -fsanitize-minimal-runtime`.

  Reviewers: eugenis
  Reviewed By: eugenis
  Subscribers: mgorny, delcypher, llvm-commits, #sanitizers

  Differential Revision: https://reviews.llvm.org/D48377
  llvm-svn: 335296
* Re-apply: Add python tool to dump and construct header maps | Bruno Cardoso Lopes | 2018-06-21 | 15 | -20/+345
  Header maps are binary files used by Xcode, which are used to map
  header names or paths to other locations. Clang has had support for
  those since its inception, but there's not a lot of header map
  testing around.

  Since it's a binary format, testing becomes pretty brittle
  and it's hard to even know what's inside if you don't have the
  appropriate tools.

  Add a python based tool that allows creating and dumping header
  maps based on a json description of those. While here, rewrite
  tests to use the tool and remove the binary files from the tree.

  This tool was initially written by Daniel Dunbar.

  Thanks to Stella Stamenova for helping make this work on Windows.

  Differential Revision: https://reviews.llvm.org/D46485
  rdar://problem/39994722
  llvm-svn: 335295
* [GVN] Avoid casting a vector of size less than 8 bits to i8 | Matthew Voss | 2018-06-21 | 2 | -1/+41
  Summary:
  A reprise of D25849.

  This crash was found through fuzzing some time ago and was documented in PR28879.

  No check for load size has been added due to the following tests:
  - Transforms/GVN/invariant.group.ll
  - Transforms/GVN/pr10820.ll

  These tests expect load sizes that are not a multiple of eight.

  Thanks to @davide for the original patch.

  Reviewers: nlopes, davide, RKSimon, reames, efriedma
  Reviewed By: efriedma
  Subscribers: davide, llvm-commits, Prazek

  Differential Revision: https://reviews.llvm.org/D48330
  llvm-svn: 335294
* [dsymutil] Force mmap'ing of binaries | Jonas Devlieghere | 2018-06-21 | 1 | -2/+2
  After the recent refactoring that introduced parallel handling of
  different objects, the binary holder became unique per object file. This
  defeats its optimization of caching archives, leading to an archive
  being opened for every binary it contains. This is obviously unfortunate
  and will need to be refactored soon.

  Luckily in practice, the impact of this is limited as most files are
  mmap'ed instead of memcopy'd. There's a caveat however: when the memory
  buffer requires a null terminator and it's a multiple of the page size,
  we allocate instead of mmap'ing. If this happens for a static archive,
  we end up with N copies of it in memory, where N is the number of
  objects in the archive, leading to excessive memory usage.

  This provides a stopgap solution to ensure that all the files it loads
  are mmap'ed in memory by removing the requirement for a terminating null
  byte.

  Differential revision: https://reviews.llvm.org/D48397
  llvm-svn: 335293
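The caveat described in r335293 can be written as a small predicate. This Python sketch models only the single condition stated in the commit message; `should_use_mmap` and its parameters are hypothetical, and the real buffer-loading logic has other conditions that are not modelled here.

```python
def should_use_mmap(file_size, page_size, requires_null_terminator):
    """Model the one caveat from the commit message.

    A file whose size is an exact multiple of the page size has no spare
    byte in its last mapped page for a terminating null, so when a null
    terminator is required the contents are copied into an allocation.
    """
    if requires_null_terminator and file_size % page_size == 0:
        return False
    return True


copied = should_use_mmap(8192, 4096, requires_null_terminator=True)
mapped = should_use_mmap(8192, 4096, requires_null_terminator=False)
```

Dropping the null-terminator requirement, as the commit does, takes the first branch out of play for every file dsymutil loads.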
* [SCEV] Re-apply r335197 (with Polly fixes). | Tim Shen | 2018-06-21 | 4 | -11/+115
  Summary:
  This initiates a discussion on changing Polly accordingly while re-applying r335197 (D48338).

  I have never worked on Polly. The proposed change to param_div_div_div_2.ll is not educated, but just patterns that match the output.

  All LLVM files are already reviewed in D48338.

  Reviewers: jdoerfert, bollu, efriedma
  Subscribers: jlebar, sanjoy, hiraditya, llvm-commits, bixia

  Differential Revision: https://reviews.llvm.org/D48453
  llvm-svn: 335292
* Revert "[LTO] Enable module summary emission by default for regular LTO" | Tobias Edler von Koch | 2018-06-21 | 6 | -49/+18
  This is breaking a couple of buildbots. We need to run the NameAnonGlobal
  pass for regular LTO now as well (since we're producing a summary). I'll
  post a separate patch for review to make this happen and then re-commit.

  This reverts commit c0759b7b1f4a81ff9021b952aa38a222d5fa4dfd.

  llvm-svn: 335291
* [libFuzzer] Filter architectures for testing on Apple platforms. | George Karpenkov | 2018-06-21 | 1 | -0/+4
  This is done in all other sanitizers, and was missing on libFuzzer.

  llvm-svn: 335290
* [libFuzzer] Provide more descriptive names for testing targets. | George Karpenkov | 2018-06-21 | 1 | -1/+1
  llvm-svn: 335289
* AMDGPU: Remove ability to reserve VGPRs for debugger | Konstantin Zhuravlyov | 2018-06-21 | 8 | -118/+2
  Differential Revision: https://reviews.llvm.org/D48234
  llvm-svn: 335288
* AMDGPU: Remove amdgpu-debugger-reserve-regs feature | Konstantin Zhuravlyov | 2018-06-21 | 2 | -2/+1
  llvm-svn: 335287
* [mingw] Fix GCC ABI compatibility for comdat things | Reid Kleckner | 2018-06-21 | 6 | -9/+119
  Summary:
  GCC and the binutils COFF linker do comdats differently from MSVC.
  If we want to be ABI compatible, we have to do what they do, which is to
  emit unique section names like ".text$_Z3foov" instead of short section
  names like ".text". Otherwise, the binutils linker gets confused and
  reports multiple definition errors when two object files from GCC and
  Clang containing the same inline function are linked together.

  The best description of the issue is probably at
  https://github.com/Alexpux/MINGW-packages/issues/1677, we don't seem to
  have a good one in our tracker.

  I fixed up the .pdata and .xdata sections needed everywhere other than
  32-bit x86. GCC doesn't use associative comdats for those, it appears to
  rely on the section name.

  Reviewers: smeenai, compnerd, mstorsjo, martell, mati865
  Subscribers: llvm-commits, hiraditya

  Differential Revision: https://reviews.llvm.org/D48402
  llvm-svn: 335286
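The naming scheme from r335286 can be illustrated with a one-line Python helper. `gnu_comdat_section_name` is a hypothetical name, and the sketch only shows the string shape quoted in the commit message (base section, a `$`, then the symbol name), not how LLVM's object-file lowering computes it.

```python
def gnu_comdat_section_name(base_section, symbol):
    """Form a unique per-symbol section name such as ".text$_Z3foov"."""
    return f"{base_section}${symbol}"


text_section = gnu_comdat_section_name(".text", "_Z3foov")
pdata_section = gnu_comdat_section_name(".pdata", "_Z3foov")
```

Because the symbol name is part of the section name, objects from GCC and Clang that both define the same inline function end up with matching section names.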
* [OPENMP, NVPTX] Fix globalization of the variables passed to orphaned parallel region. | Alexey Bataev | 2018-06-21 | 3 | -49/+70
  If the current construct requires sharing of the local variable in the
  inner parallel region, this variable must be globalized to avoid
  runtime crash.

  llvm-svn: 335285
* [LTO] Enable module summary emission by default for regular LTO | Tobias Edler von Koch | 2018-06-21 | 6 | -18/+49
  Summary:
  With D33921, we gained the ability to have module summaries in regular
  LTO modules without triggering ThinLTO compilation. Module summaries in
  regular LTO allow garbage collection (dead stripping) before LTO
  compilation and thus open up additional optimization opportunities.

  This patch enables summary emission in regular LTO for all targets
  except ld64-based ones (which use the legacy LTO API).

  Reviewers: pcc, tejohnson, mehdi_amini
  Subscribers: inglorion, eraman, cfe-commits

  Differential Revision: https://reviews.llvm.org/D34156
  llvm-svn: 335284
* [InstCombine] fold vector select of binops with constant ops to 1 binop (PR37806) | Sanjay Patel | 2018-06-21 | 2 | -48/+69
  This is the simplest case from PR37806:
  https://bugs.llvm.org/show_bug.cgi?id=37806

  If we have a common variable operand used in a pair of binops with vector
  constants that are vector selected together, then we can constant shuffle
  the constant vectors to eliminate the shuffle instruction.

  This has some tricky parts that are hopefully addressed in the tests and
  their respective comments:

  1. If the shuffle mask contains an undef element, then that lane of the
     result is undef:
     http://llvm.org/docs/LangRef.html#shufflevector-instruction

     Therefore, we can replace the constant in that lane with an undef value
     except for div/rem. With div/rem, an undef in the divisor would cause
     the whole op to be undef. So I'm using the same hack as in D47686 -
     replace the undefs with '1'.

  2. Intersect the wrapping and FMF of the original binops for the new
     binop. There should be no extra poison or fast-math potential in the
     new binop that wasn't possible in the original code.

  3. Disregard other uses. Given that we're eliminating uses (shortening
     the dependency chain), I think that's always the right IR
     canonicalization. But I purposely chose the udiv test to demonstrate
     the scenario where both intermediate values have other uses because
     that seems likely worse for codegen with an expensive math op. This
     seems like a very rare possibility to me, so I don't think it requires
     a backend patch first.

  Differential Revision: https://reviews.llvm.org/D48401
  llvm-svn: 335283
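The core equivalence behind r335283 can be checked with a toy Python model that uses lists as vectors. The helper names are hypothetical, and the sketch deliberately ignores undef lanes, wrapping flags, and fast-math flags, which the commit message treats separately.

```python
import operator


def select(mask, lhs, rhs):
    """Per-lane select: take lhs where the mask is True, otherwise rhs."""
    return [l if take_lhs else r for take_lhs, l, r in zip(mask, lhs, rhs)]


def binop(op, x, c):
    """Apply a scalar binary operator lane by lane."""
    return [op(a, b) for a, b in zip(x, c)]


x = [10, 20, 30, 40]
c1 = [1, 2, 3, 4]
c2 = [5, 6, 7, 8]
mask = [True, False, False, True]

# Before: two binops on the common operand x, then a vector select.
before = select(mask, binop(operator.sub, x, c1), binop(operator.sub, x, c2))

# After: select the constants first, then a single binop.
after = binop(operator.sub, x, select(mask, c1, c2))
```

Selecting the constants can be done at compile time, so the folded form needs one binop and no shuffle at run time.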
* [bindings] Fix most Python binding unittests on Windows | Jonathan Coe | 2018-06-21 | 3 | -22/+21
  Summary:
  This fixes all but one of the test cases for Windows. TestCDB will
  take more work to debug, as CompilationDatabase seems not to work correctly.

  Reviewers: bkramer, wanders, jbcoe
  Reviewed By: bkramer, jbcoe
  Subscribers: cfe-commits

  Differential Revision: https://reviews.llvm.org/D47864

  Patch written by ethanhs (Ethan)

  llvm-svn: 335282
* [AMDGPU] Update assembler for HSA Code Object v3 | Scott Linder | 2018-06-21 | 11 | -187/+1275
  Update AMDGPU assembler syntax behind the code-object-v3 feature:

  * Replace/rename most AMDGPU assembler directives/symbols and document them.
  * Provide more diagnostics (e.g. values out of range, missing values,
    repeated values).
  * Provide path for backwards compatibility, even with underlying descriptor
    changes.

  Differential Revision: https://reviews.llvm.org/D47736
  llvm-svn: 335281
* atom: Use volatile pointers for cl_khr_{global,local}_int32_{base,extended}_atomics | Jan Vesely | 2018-06-21 | 14 | -20/+20
  int64 versions were switched to volatile pointers in cl1.1
  cl1.1 also renamed atom_ functions to atomic_ that use volatile pointers.
  CTS and applications use volatile pointers.

  Passes CTS on carrizo
  no return piglit tests still pass on turks.

  Reviewed-By: Aaron Watry <awatry@gmail.com>
  Tested-By: Aaron Watry <awatry@gmail.com>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 335280
* atom: Consolidate cl_khr_{local,global}_int32_{base,extended}_atomics implementation | Jan Vesely | 2018-06-21 | 21 | -148/+66
  These are just atomic_* wrappers.
  Switch inc, dec to use atomic_* wrappers as well.

  Reviewed-By: Aaron Watry <awatry@gmail.com>
  Tested-By: Aaron Watry <awatry@gmail.com>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 335279