summaryrefslogtreecommitdiffstats
path: root/clang/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Use zeroinitializer for (trailing zero portion of) large array initializersRichard Smith2018-05-231-0/+10
| | | | | | | | more reliably. This re-commits r333044 with a fix for PR37560. llvm-svn: 333141
* Revert r333044 "Use zeroinitializer for (trailing zero portion of) large ↵Hans Wennborg2018-05-231-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | array initializers" It caused asserts, see PR37560. > Use zeroinitializer for (trailing zero portion of) large array initializers > more reliably. > > Clang has two different ways it emits array constants (from InitListExprs and > from APValues), and both had some ability to emit zeroinitializer, but neither > was able to catch all cases where we could use zeroinitializer reliably. In > particular, emitting from an APValue would fail to notice if all the explicit > array elements happened to be zero. In addition, for large arrays where only an > initial portion has an explicit initializer, we would emit the complete > initializer (which could be huge) rather than emitting only the non-zero > portion. With this change, when the element would have a suffix of more than 8 > zero elements, we emit the array constant as a packed struct of its initial > portion followed by a zeroinitializer constant for the trailing zero portion. > > In passing, I found a bug where SemaInit would sometimes walk the entire array > when checking an initializer that only covers the first few elements; that's > fixed here to unblock testing of the rest. > > Differential Revision: https://reviews.llvm.org/D47166 llvm-svn: 333067
* [X86] In the floating point max reduction intrinsics, negate infinity before ↵Craig Topper2018-05-231-6/+4
| | | | | | | | feeding it to set1. Previously we negated the whole vector after splatting infinity. But its better to negate the infinity before splatting. This generates IR with the negate already folded with the infinity constant. llvm-svn: 333062
* [X86] Remove mask argument from more builtins that are handled completely in ↵Craig Topper2018-05-231-200/+0
| | | | | | CGBuiltin.cpp. Just wrap a select builtin around them in the header file instead. llvm-svn: 333061
* Use zeroinitializer for (trailing zero portion of) large array initializersRichard Smith2018-05-231-0/+10
| | | | | | | | | | | | | | | | | | | | | | | more reliably. Clang has two different ways it emits array constants (from InitListExprs and from APValues), and both had some ability to emit zeroinitializer, but neither was able to catch all cases where we could use zeroinitializer reliably. In particular, emitting from an APValue would fail to notice if all the explicit array elements happened to be zero. In addition, for large arrays where only an initial portion has an explicit initializer, we would emit the complete initializer (which could be huge) rather than emitting only the non-zero portion. With this change, when the element would have a suffix of more than 8 zero elements, we emit the array constant as a packed struct of its initial portion followed by a zeroinitializer constant for the trailing zero portion. In passing, I found a bug where SemaInit would sometimes walk the entire array when checking an initializer that only covers the first few elements; that's fixed here to unblock testing of the rest. Differential Revision: https://reviews.llvm.org/D47166 llvm-svn: 333044
* [CodeGen] use nsw negation for builtin absSanjay Patel2018-05-221-3/+3
| | | | | | | | | | | | | The clang builtins have the same semantics as the stdlib functions. The stdlib functions are defined in section 7.20.6.1 of the C standard with: "If the result cannot be represented, the behavior is undefined." That lets us mark the negation with 'nsw' because "sub i32 0, INT_MIN" would be UB/poison. Differential Revision: https://reviews.llvm.org/D47202 llvm-svn: 333038
* Reland r332885, "CodeGen, Driver: Start using direct split dwarf emission in ↵Peter Collingbourne2018-05-221-0/+7
| | | | | | | | | clang." As well as two follow-on commits r332906, r332911 with a fix for test clang/test/CodeGen/split-debug-filename.c. llvm-svn: 333013
* [CodeGen] produce the LLVM canonical form of absSanjay Patel2018-05-221-6/+6
| | | | | | | | We chose the 'slt' form as canonical in IR with: rL332819 ...so we should generate that form directly for efficiency. llvm-svn: 332989
* [CodeGen] add tests for abs builtins; NFCSanjay Patel2018-05-221-0/+29
| | | | llvm-svn: 332988
* [CodeView] Enable debugging of captured variables within C++ lambdasBrock Wyma2018-05-221-0/+30
| | | | | | | | | This change will help Visual Studio resolve forward references to C++ lambda routines used by captured variables. Differential Revision: https://reviews.llvm.org/D45438 llvm-svn: 332975
* Revert "CodeGen, Driver: Start using direct split dwarf emission in clang."Amara Emerson2018-05-221-6/+0
| | | | | | This reverts commit r332885 as it broke several greendragon buildbots. llvm-svn: 332973
* [X86] Remove a builtin that should have been removed in r332882.Craig Topper2018-05-211-1/+0
| | | | llvm-svn: 332909
* [X86] Remove masking from pternlog llvm intrinsics and use a select ↵Craig Topper2018-05-212-18/+30
| | | | | | | | | | | | instruction instead. Because the intrinsics in the headers are implemented as macros, we can't just use a select builtin and pternlog builtin. This would require one of the macro arguments to be used twice. Depending on what was passed to the macro we could expand an expression twice leading to weird behavior. We could maybe declare our local variable in the macro, but that would need to worry about name collisions. To avoid that just generate IR directly in CGBuiltin.cpp. Differential Revision: https://reviews.llvm.org/D47125 llvm-svn: 332891
* Revert r332847; it caused us to miscompile certain forms of reference ↵Richard Smith2018-05-213-3/+3
| | | | | | initialization. llvm-svn: 332886
* CodeGen, Driver: Start using direct split dwarf emission in clang.Peter Collingbourne2018-05-211-0/+6
| | | | | | | | Fixes PR37466. Differential Revision: https://reviews.llvm.org/D47093 llvm-svn: 332885
* [X86] Use __builtin_convertvector to implement some of the packed integer to ↵Craig Topper2018-05-217-37/+56
| | | | | | | | | | | | packed float conversion intrinsics. I believe this is safe assuming default default FP environment. The conversion might be inexact, but it can never overflow the FP type so this shouldn't be undefined behavior for the uitofp/sitofp instructions. We already do something similar for scalar conversions. Differential Revision: https://reviews.llvm.org/D46863 llvm-svn: 332882
* [CodeGen] Recognize more cases of zero initializationSerge Pavlov2018-05-213-3/+3
| | | | | | | | | | | | | | | | | | | | | If a variable has an initializer, codegen tries to build its value. If the variable is large in size, building its value requires substantial resources. It causes strange behavior from user viewpoint: compilation of huge zero initialized arrays like: char data_1[2147483648u] = { 0 }; consumes enormous amount of time and memory. With this change codegen tries to determine if variable initializer is equivalent to zero initializer. In this case variable value is not constructed. This change fixes PR18978. Differential Revision: https://reviews.llvm.org/D46241 llvm-svn: 332847
* [X86] Remove mask arguments from permvar builtins/intrinsics. Use a select ↵Craig Topper2018-05-206-35/+59
| | | | | | | | in IR instead. Someday maybe we'll use selects for all the builtins. llvm-svn: 332825
* This patch aims to match the changes introducedAlexander Ivchenko2018-05-183-7/+9
| | | | | | | | | | | | | | | | | in gcc by https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html. The -mibt feature flag is being removed, and the -fcf-protection option now also defines a CET macro and causes errors when used on non-X86 targets, while X86 targets no longer check for -mibt and -mshstk to determine if -fcf-protection is supported. -mshstk is now used only to determine availability of shadow stack intrinsics. Comes with an LLVM patch (D46882). Patch by mike.dvoretsky Differential Revision: https://reviews.llvm.org/D46881 llvm-svn: 332704
* [AArch64] Correct inline assembly test case for S modifier [NFC]Peter Smith2018-05-171-3/+3
| | | | | | | | | | | | | | | The existing test for the AArch64 inline assembly constraint S uses the A and L modifiers. These modifiers were implemented in the original AArch64 backend but were not carried forward to the merged backend. The A is associated with ADRP and does nothing, the L is associated with :lo12: . Given that A and L are not supported by GCC and not supported by the new implementation of constraint S in LLVM (see D46745) I've altered the test to put :lo12: directly in the string so that A and L are not needed. Differential Revision: https://reviews.llvm.org/D46932 llvm-svn: 332606
* [X86] Revert part of r332266: Use __builtin_convertvector to replace some of ↵Craig Topper2018-05-152-16/+8
| | | | | | | | the avx512 truncate builtins. The masking doesn't work right in the backend for the ones that produce byte or word elements without avx512bw. llvm-svn: 332322
* [X86] Use __builtin_convertvector to replace some of the avx512 truncate ↵Craig Topper2018-05-144-24/+40
| | | | | | | | | | builtins. As long as the destination type is a 256 or 128 bit vector with the same number of elements we can use __builtin_convertvector to directly generate trunc IR instruction which will be handled natively by the backend. Differential Revision: https://reviews.llvm.org/D46742 llvm-svn: 332266
* [X86] Use select instrution and fpextend in the implementation of ↵Craig Topper2018-05-141-3/+6
| | | | | | _mm512_mask_cvtps_pd and _mm512_maskz_cvtps_pd. llvm-svn: 332213
* [X86] Use __builtin_convertvector to implement _mm512_cvtps_pd.Craig Topper2018-05-141-2/+2
| | | | | | | | If we're using default rounding mode we can let __builtin_convertvector to generate an fpextend. This matches 128 and 256 bit. If we're using the version that takes an explicit rounding mode argument we would need to look at the immediate to see if its CUR_DIRECTION. llvm-svn: 332210
* [X86] Emit better code for _mm_cvtu32_sd, _mm_cvtu64_sd, _mm_cvtu32_ss, and ↵Craig Topper2018-05-131-4/+8
| | | | | | | | | | _mm_cvtu64_ss. We can use direct C code for these that will use uitofp and insertelement instructions. For the versions that take an explicit rounding mode we can't do this. llvm-svn: 332203
* Added atomic_fetch_min, max, umin, umax intrinsics to clang.Elena Demikhovsky2018-05-131-0/+7
| | | | | | | | | These intrinsics work exactly as all other atomic_fetch_* intrinsics and allow to create *atomicrmw* with ordering. Updated the clang-extensions document. Differential Revision: https://reviews.llvm.org/D46386 llvm-svn: 332193
* [Hexagon] Implement checking arguments of builtin callsKrzysztof Parzyszek2018-05-111-0/+30
| | | | llvm-svn: 332105
* [X86] Assume alignment of movdir64b dst argumentGabor Buella2018-05-111-1/+6
| | | | | | | | | | Reviewers: craig.topper Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D46683 llvm-svn: 332091
* [X86] ptwrite intrinsicGabor Buella2018-05-101-0/+22
| | | | | | | | | | Reviewers: craig.topper, RKSimon Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D46540 llvm-svn: 331962
* [X86] Change the implementation of scalar masked load/store intrinsics to ↵Craig Topper2018-05-101-6/+6
| | | | | | | | | | not use a 512-bit intermediate vector. This is unnecessary for AVX512VL supporting CPUs like SKX. We can just emit a 128-bit masked load/store here no matter what. The backend will widen it to 512-bits on KNL CPUs. Fixes the frontend portion of PR37386. Need to fix the backend to optimize the new sequences well. llvm-svn: 331958
* [Builtins] Improve the IR emitted for MSVC compatible rotr/rotl builtins to ↵Craig Topper2018-05-101-72/+60
| | | | | | | | | | | | | | | | | | | | | | | match what the middle and backends understand Previously we emitted something like rotl(x, n) { n &= bitwidth-1; return n != 0 ? ((x << n) | (x >> (bitwidth - n)) : x; } We use a select to avoid the undefined behavior on the (bitwidth - n) shift. The middle and backend don't really recognize this as a rotate and end up emitting a cmov or control flow because of the select. A better pattern is (x << (n & mask)) | (x << (-n & mask)) where mask is bitwidth - 1. Fixes the main complaint in PR37387. There's still some work to be done if the user writes that sequence directly on a short or char where type promotion rules can prevent it from being recognized. The builtin is emitting direct IR with unpromoted types so that isn't a problem for it. Differential Revision: https://reviews.llvm.org/D46656 llvm-svn: 331943
* [Clang] Implement function attribute no_stack_protector.Manoj Gupta2018-05-091-0/+20
| | | | | | | | | | | | | | | | | | | | | | Summary: This attribute tells clang to skip this function from stack protector when -stack-protector option is passed. GCC option for this is: __attribute__((__optimize__("no-stack-protector"))) and the equivalent clang syntax would be: __attribute__((no_stack_protector)) This is used in Linux kernel to selectively disable stack protector in certain functions. Reviewers: aaron.ballman, rsmith, rnk, probinson Reviewed By: aaron.ballman Subscribers: probinson, srhines, cfe-commits Differential Revision: https://reviews.llvm.org/D46300 llvm-svn: 331925
* Revert r331843 "[DebugInfo] Generate debug information for labels."Hans Wennborg2018-05-092-44/+0
| | | | | | | | | | | | | | | | It broke the Chromium build (see reply on the review). > Generate DILabel metadata and call llvm.dbg.label after label > statement to associate the metadata with the label. > > Differential Revision: https://reviews.llvm.org/D45045 > > Patch by Hsiangkai Wang. This doesn't revert the change to backend-unsupported-error.ll that seems to correspond to an llvm-side change. llvm-svn: 331861
* _Atomic of empty struct shouldn't assertJF Bastien2018-05-092-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: An _Atomic of an empty struct is pretty silly. In general we just widen empty structs to hold a byte's worth of storage, and we represent size and alignment as 0 internally and let LLVM figure out what to do. For _Atomic it's a bit different: the memory model mandates concrete effects occur when atomic operations occur, so in most cases actual instructions need to get emitted. It's really not worth trying to optimize empty struct atomics by figuring out e.g. that a fence would do, even though sane compilers should do optimize atomics. Further, wg21.link/p0528 will fix C++20 atomics with padding bits so that cmpxchg on them works, which means that we'll likely need to do the zero-init song and dance for empty atomic structs anyways (and I think we shouldn't special-case this behavior to C++20 because prior standards are just broken). This patch therefore makes a minor change to r176658 "Promote atomic type sizes up to a power of two": if the width of the atomic's value type is 0, just use 1 byte for width and leave alignment as-is (since it should never be zero, and over-aligned zero-width structs are weird but fine). This fixes an assertion: (NumBits >= MIN_INT_BITS && "bitwidth too small"), function get, file ../lib/IR/Type.cpp, line 241. It seems like this has run into other assertions before (namely the unreachable Kind check in ImpCastExprToType), but I haven't reproduced that issue with tip-of-tree. <rdar://problem/39678063> Reviewers: arphaman, rjmccall Subscribers: aheejin, cfe-commits Differential Revision: https://reviews.llvm.org/D46613 llvm-svn: 331845
* [DebugInfo] Generate debug information for labels.Shiva Chen2018-05-093-2/+46
| | | | | | | | | | | Generate DILabel metadata and call llvm.dbg.label after label statement to associate the metadata with the label. Differential Revision: https://reviews.llvm.org/D45045 Patch by Hsiangkai Wang. llvm-svn: 331843
* [X86] Use target feature defines in tests instead of defining our own flag ↵Craig Topper2018-05-073-10/+10
| | | | | | on the command line. NFCI llvm-svn: 331683
* Add -target to address errors in test from r331592Teresa Johnson2018-05-051-8/+7
| | | | | | | | | | | The error turns out to be: Assertion failed: (Target.isCompatibleDataLayout(getDataLayout()) && "Can't create a MachineFunction using a Module with a " "Target-incompatible DataLayout attached\n"), function init, file /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm/lib/CodeGen/MachineFunction.cpp, line 180. Add -target to address this. Also re-enable the test I had temporarily commented, and move it further down in case there is still a failure (since it pipes stderr to FileCheck). llvm-svn: 331597
* Skip part of test added in r331592 to help debug bot failuresTeresa Johnson2018-05-051-2/+3
| | | | | | | | | | Trying to debug why/where a few bots getting exit code 256 e.g. http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/48471/testReport/Clang/CodeGen/thinlto_diagnostic_handler_remarks_with_hotness_ll/ and a few windows bots getting no output from that RUN line e.g. http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/11865/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Athinlto-diagnostic-handler-remarks-with-hotness.ll llvm-svn: 331596
* Add required target to address bot failures from r331592Teresa Johnson2018-05-051-0/+2
| | | | | | Failing on non-x86 bots, needs x86 target for code gen. llvm-svn: 331593
* [ThinLTO] Support opt remarks options with distributed ThinLTO backendsTeresa Johnson2018-05-052-3/+50
| | | | | | | | | | | | | | | | | Summary: Passes down the necessary code ge options to the LTO Config to enable -fdiagnostics-show-hotness and -fsave-optimization-record in the ThinLTO backend for a distributed build. Also, remove warning about not having PGO when the input is IR. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, cfe-commits Differential Revision: https://reviews.llvm.org/D46464 llvm-svn: 331592
* [gcov] Make the CLang side coverage test work with the newChandler Carruth2018-05-021-7/+15
| | | | | | | | | instrumentation codegeneration strategy of using a data structure and a loop. Required some finesse to get the critical things being tested to surface in a nice way for FileCheck but I think this preserves the original intent of the test. llvm-svn: 331411
* [ARM] Remove redundant #if in test. NFCShoaib Meenai2018-05-011-4/+0
| | | | | | | | | | Both sides of this #if #include the same file. Drop the #if, leaving only the #include. Patch by Matt Glazar. Differential Revision: https://reviews.llvm.org/D45779 llvm-svn: 331305
* Update existed CodeGen TBAA testsDanil Malyshev2018-05-016-47/+99
| | | | | | | | | | Reviewers: hfinkel, kosarev, rjmccall Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D44616 llvm-svn: 331292
* [X86] directstore and movdir64b intrinsicsGabor Buella2018-05-011-0/+31
| | | | | | | | | | Reviewers: spatel, craig.topper, RKSimon Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D45984 llvm-svn: 331249
* [MC] Change AsmParser to leverage Assembler during evaluationNirav Dave2018-04-301-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | Teach AsmParser to check with Assembler for when evaluating constant expressions. This improves the handing of preprocessor expressions that must be resolved at parse time. This idiom can be found as assembling-time assertion checks in source-level assemblers. Note that this relies on the MCStreamer to keep sufficient tabs on Section / Fragment information which the MCAsmStreamer does not. As a result the textual output may fail where the equivalent object generation would pass. This can most easily be resolved by folding the MCAsmStreamer and MCObjectStreamer together which is planned for in a separate patch. Currently, this feature is only enabled for assembly input, keeping IR compilation consistent between assembly and object generation. Reviewers: echristo, rnk, probinson, espindola, peter.smith Reviewed By: peter.smith Subscribers: eraman, peter.smith, arichardson, jyknight, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D45164 llvm-svn: 331218
* [Driver, CodeGen] rename options to disable an FP cast optimizationSanjay Patel2018-04-301-4/+8
| | | | | | | | | | | | | | As suggested in the post-commit thread for rL331056, we should match these clang options with the established vocabulary of the corresponding sanitizer option. Also, the use of 'strict' is well-known for these kinds of knobs, and we can improve the descriptive text in the docs. So this intends to match the logic of D46135 but only change the words. Matching LLVM commit to match this spelling of the attribute to follow shortly. Differential Revision: https://reviews.llvm.org/D46236 llvm-svn: 331209
* [Driver, CodeGen] add options to enable/disable an FP cast optimizationSanjay Patel2018-04-271-0/+14
| | | | | | | | | | | | | | | | | | | | | | As discussed in the post-commit thread for: rL330437 ( http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180423/545906.html ) We need a way to opt-out of a float-to-int-to-float cast optimization because too much existing code relies on the platform-specific undefined result of those casts when the float-to-int overflows. The LLVM changes associated with adding this function attribute are here: rL330947 rL330950 rL330951 Also as suggested, I changed the LLVM doc to mention the specific sanitizer flag that catches this problem: rL330958 Differential Revision: https://reviews.llvm.org/D46135 llvm-svn: 331041
* [ARM,AArch64] Add intrinsics for dot product instructionsOliver Stannard2018-04-272-0/+193
| | | | | | | | | | The ACLE spec which describes these intrinsics hasn't been published yet, but this is based on the final draft which will be published soon, and these have already been implemented by GCC. Differential revision: https://reviews.llvm.org/D46109 llvm-svn: 331039
* [x86] Revert r330322 (& r330323): Lowering x86 adds/addus/subs/subus intrinsicsChandler Carruth2018-04-264-558/+89
| | | | | | | | The LLVM commit introduces a crash in LLVM's instruction selection. I filed http://llvm.org/PR37260 with the test case. llvm-svn: 330997
* [X86] Add support for _mm512_mullox_epi64 and _mm512_mask_mullox_epi64 ↵Craig Topper2018-04-261-0/+13
| | | | | | | | | | intrinsics to match icc. On AVX512F targets we'll produce an emulated sequence using 3 pmuludqs with shifts and adds. On AVX512DQ we'll use vpmulld. Fixes PR37140. llvm-svn: 330923
OpenPOWER on IntegriCloud