bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[InstCombine] foldICmpWithLowBitMaskedVal(): handle ~(-1 << y) mask	Roman Lebedev	2018-09-19	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Two folds are happening here: 1. https://rise4fun.com/Alive/oaFX 2. And then `foldICmpWithHighBitMask()` (D52001): https://rise4fun.com/Alive/wsP4 This change doesn't just add the handling for eq/ne predicates, it actually builds upon the previous `foldICmpWithLowBitMaskedVal()` work, so all the 16 fold variants* are immediately supported. I'm indeed only testing these two predicates. I do not feel like re-proving all 16 folds, because they were already proven for the general case of a constant with all-ones in the low bits. So as long as the mask produces all-ones in the low bits, I'm pretty sure the fold is valid. But if required, I can re-prove them; let me know. eq/ne are commutative - 4 folds; ult/ule/ugt/uge - are not commutative (the commuted variant is InstSimplified), 4 folds; slt/sle/sgt/sge are not commutative - 4 folds. 12 folds in total. https://bugs.llvm.org/show_bug.cgi?id=38123 https://bugs.llvm.org/show_bug.cgi?id=38708 Reviewers: spatel, craig.topper, RKSimon Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52146 llvm-svn: 342546
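The eq case of the two-step fold described above can be sanity-checked by brute force. The sketch below is not the InstCombine code; it assumes the chain is `(x & mask) == x`, then `x u<= mask`, then `(x lshr y) == 0`, with `mask = ~(-1 << y)`, and it only covers 8-bit values. The helper names `low_bit_mask` and `check_fold` are made up for illustration.

```python
def low_bit_mask(y: int, width: int = 8) -> int:
    """Return ~(-1 << y) truncated to `width` bits, i.e. the y low bits set."""
    full = (1 << width) - 1
    return ~(-1 << y) & full


def check_fold(width: int = 8) -> int:
    """Exhaustively compare the three forms for every x and every in-range y."""
    checked = 0
    # Shift amounts >= width are skipped here (assumption: they are not
    # meaningful inputs for the fold).
    for y in range(width):
        mask = low_bit_mask(y, width)
        for x in range(1 << width):
            original = (x & mask) == x   # icmp eq (and x, mask), x
            step1 = x <= mask            # icmp ule x, mask
            step2 = (x >> y) == 0        # icmp eq (lshr x, y), 0
            assert original == step1 == step2
            checked += 1
    return checked


if __name__ == "__main__":
    print(check_fold(8), "cases agree")
```

An exhaustive 8-bit check is only supporting evidence; the Alive links in the commit message remain the actual proofs.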
*	[ARM] Fix unwind information for floating point registers	Oliver Stannard	2018-09-19	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \|	Fixes the unwind information generated for floating-point registers. Previously, all padding registers were assumed to be four bytes wide. Now, the width of the register is used to specify the amount of padding. Patch by Jackson Woodruff! Differential revision: https://reviews.llvm.org/D51494 llvm-svn: 342545
*	[New PM] Introducing PassInstrumentation framework	Fedor Sergeev	2018-09-19	6	-0/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Pass Execution Instrumentation interface enables customizable instrumentation of pass execution, as per "RFC: Pass Execution Instrumentation interface" posted 06/07/2018 on llvm-dev@ The intent is to provide a common machinery to implement all the pass-execution-debugging features like print-before/after, opt-bisect, time-passes etc. Here we get a basic implementation consisting of: * PassInstrumentationCallbacks class that handles registration of callbacks and access to them. * PassInstrumentation class that handles instrumentation-point interfaces that call into PassInstrumentationCallbacks. * Callbacks accept StringRef which is just a name of the Pass right now. There were some ideas to pass an opaque wrapper for the pointer to pass instance, however it appears that pointer does not actually identify the instance (adaptors and managers might have the same address with the pass they govern). Hence it was decided to go simple for now and then later decide on what the proper mental model of identifying a "pass in a phase of pipeline" is. * Callbacks accept llvm::Any serving as a wrapper for const IRUnit, to remove direct dependencies on different IRUnits (e.g. Analyses). PassInstrumentationAnalysis analysis is explicitly requested from PassManager through usual AnalysisManager::getResult. All pass managers were updated to run that to get PassInstrumentation object for instrumentation calls. * Using tuples/index_sequence getAnalysisResult helper to extract generic AnalysisManager's extra args out of a generic PassManager's extra args. This is the only way I was able to explicitly run getResult for PassInstrumentationAnalysis out of a generic code like PassManager::run or RepeatedPass::run. 
TODO: Upon lengthy discussions we agreed to accept this as an initial implementation and then get rid of getAnalysisResult by improving RepeatedPass implementation. * PassBuilder takes PassInstrumentationCallbacks object to pass it further into PassInstrumentationAnalysis. Callbacks registration should be performed directly through PassInstrumentationCallbacks. * new-pm tests updated to account for PassInstrumentationAnalysis being run * Added PassInstrumentation tests to PassBuilderCallbacks unit tests. Other unit tests updated with registration of the now-required PassInstrumentationAnalysis. Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D47858 llvm-svn: 342544
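The split between a callback registry and an instrumentation object that calls into it can be pictured with a small model. This is a hypothetical Python sketch, not the LLVM API: the class names `CallbacksSketch` and `InstrumentationSketch` and the helper `run_pass` are invented, and the only details taken from the commit message are that callbacks receive the pass name plus an opaque IR unit.

```python
from typing import Any, Callable, List

# A callback receives the pass name and an opaque IR unit.
Callback = Callable[[str, Any], None]


class CallbacksSketch:
    """Hypothetical registry: owns the registered callbacks."""

    def __init__(self) -> None:
        self.before_pass: List[Callback] = []
        self.after_pass: List[Callback] = []


class InstrumentationSketch:
    """Hypothetical instrumentation points that call into the registry."""

    def __init__(self, callbacks: CallbacksSketch) -> None:
        self.callbacks = callbacks

    def run_before_pass(self, pass_name: str, ir_unit: Any) -> None:
        for cb in self.callbacks.before_pass:
            cb(pass_name, ir_unit)

    def run_after_pass(self, pass_name: str, ir_unit: Any) -> None:
        for cb in self.callbacks.after_pass:
            cb(pass_name, ir_unit)


def run_pass(instr: InstrumentationSketch, pass_name: str,
             pass_fn: Callable[[Any], Any], ir_unit: Any) -> Any:
    """What a pass manager would do around each pass in this model."""
    instr.run_before_pass(pass_name, ir_unit)
    result = pass_fn(ir_unit)
    instr.run_after_pass(pass_name, ir_unit)
    return result
```

In this model a print-before/after feature would just be a pair of callbacks appended to the registry; the pass manager itself never needs to know about it.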
*	[InstCombine] Don't transform sin/cos -> tanl for half types	Benjamin Kramer	2018-09-19	1	-0/+2
\| \| \| \| \| \| \|	This is still unsafe for long double; we will transform things into tanl even if tanl is for another type. But that's for someone else to fix. llvm-svn: 342542
*	Verify commit access in fixing typo	Calixte Denizet	2018-09-19	1	-1/+1
\| \| \| \|	llvm-svn: 342538
*	[RISCV] Codegen for i8, i16, and i32 atomicrmw with RV32A	Alex Bradbury	2018-09-19	9	-4/+730
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Introduce a new RISCVExpandPseudoInsts pass to expand atomic pseudo-instructions after register allocation. This is necessary in order to ensure that register spills aren't introduced between LL and SC, thus breaking the forward progress guarantee for the operation. AArch64 does something similar for CmpXchg (though only at O0), and Mips is moving towards this approach (see D31287). See also [this mailing list post](http://lists.llvm.org/pipermail/llvm-dev/2016-May/099490.html) from James Knight, which summarises the issues with lowering to ll/sc in IR or pre-RA. See the [accompanying RFC thread](http://lists.llvm.org/pipermail/llvm-dev/2018-June/123993.html) for an overview of the lowering strategy. Differential Revision: https://reviews.llvm.org/D47882 llvm-svn: 342534
*	[COFF] Emit @feat.00 on 64-bit and set the CFG bit when emitting guardcf tables	Hans Wennborg	2018-09-19	1	-8/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The 0x800 bit in @feat.00 needs to be set in order to make LLD pick up the .gfid$y table. I believe this is fine to set even if we don't emit the instrumentation. We haven't emitted @feat.00 on 64-bit before. I see that MSVC does emit it, but I'm not entirely sure what the default value should be. I went with zero since that seems as safe as not emitting the symbol in the first place. Differential Revision: https://reviews.llvm.org/D52235 llvm-svn: 342532
*	[DebugInfo][Dexter] Speculated BB presents illegal variable value to debugger.	Carlos Alberto Enciso	2018-09-19	2	-2/+13
\| \| \| \| \| \| \| \|	When SimplifyCFG changes the PHI node into a select instruction, the debug information becomes ambiguous. This causes the debugger to display the wrong variable value. Differential Revision: https://reviews.llvm.org/D51976 llvm-svn: 342527
*	[DWARF Verifier] Add helper function to dump DIEs. [NFC]	Jonas Devlieghere	2018-09-19	1	-24/+18
\| \| \| \| \| \| \| \| \|	It's pretty common for the verifier to dump the relevant DIE when it finds an issue. This tends to be relatively verbose and error prone because we have to pass the DIDumpOptions to the DIE's dump method. This patch adds a helper function to the verifier to make this easier. llvm-svn: 342526
*	[WebAssembly][NFC] Remove extra space in WebAssemblyInstrSIMD.td	Thomas Lively	2018-09-19	1	-1/+1
\| \| \| \|	llvm-svn: 342522
*	AArch64MacroFusion: Factor out some opcode handling code; NFC	Matthias Braun	2018-09-19	1	-121/+110
\| \| \| \|	llvm-svn: 342521
*	ScheduleDAG: Cleanup dumping code; NFC	Matthias Braun	2018-09-19	20	-150/+154
\| \| \| \| \| \| \| \| \| \| \| \|	- Instead of having both `SUnit::dump(ScheduleDAG)` and `ScheduleDAG::dumpNode(ScheduleDAG)`, just keep the latter around. - Add `ScheduleDAG::dump()` and avoid code duplication in several places. Implement it for different ScheduleDAG variants. - Add `ScheduleDAG::dumpNodeName()` in favor of the `SUnit::print()` functions. They were only ever used for debug dumping and putting the function into ScheduleDAG is consistent with the `dumpNode()` change. llvm-svn: 342520
*	[WebAssembly] v4f32.abs and v2f64.abs	Thomas Lively	2018-09-18	1	-0/+8
\| \| \| \| \| \| \| \| \| \|	Summary: implement lowering of @llvm.fabs for vector types. Reviewers: aheejin, dschuff Subscribers: llvm-svn: 342513
*	Do not optimize atomic load to non-atomic memcmp	Christy Lee	2018-09-18	1	-2/+3
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D51998 llvm-svn: 342498
*	[AMDGPU] Match udot8 pattern	Farhana Aleen	2018-09-18	1	-22/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: D.u32 = S0.u4[0] * S1.u4[0] + S0.u4[1] * S1.u4[1] + S0.u4[2] * S1.u4[2] + S0.u4[3] * S1.u4[3] + S0.u4[4] * S1.u4[4] + S0.u4[5] * S1.u4[5] + S0.u4[6] * S1.u4[6] + S0.u4[7] * S1.u4[7] + S2.u32 Author: FarhanaAleen Reviewed By: arsenm, nhaehnle Differential Revision: https://reviews.llvm.org/D51947 llvm-svn: 342497
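The formula in the summary can be written as a reference computation. This is a minimal sketch of the arithmetic only, not the pattern matcher. It assumes lane `i` occupies bits `4*i` to `4*i+3` and that the accumulator wraps modulo 2^32 (no clamping); the function name `udot8` simply mirrors the commit title.

```python
def udot8(s0: int, s1: int, s2: int) -> int:
    """D.u32 = sum over i in 0..7 of S0.u4[i] * S1.u4[i], plus S2.u32."""
    acc = s2
    for i in range(8):
        a = (s0 >> (4 * i)) & 0xF   # i-th unsigned 4-bit lane of S0
        b = (s1 >> (4 * i)) & 0xF   # i-th unsigned 4-bit lane of S1
        acc += a * b
    # Assumption: 32-bit wrap-around, no saturation.
    return acc & 0xFFFFFFFF


if __name__ == "__main__":
    # Lanes of 0x21 are (1, 2); lanes of 0x43 are (3, 4): 1*3 + 2*4 + 10 = 21.
    print(udot8(0x21, 0x43, 10))
```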
*	[PGO][CHR] Add opt remarks.	Hiroshi Yamauchi	2018-09-18	1	-5/+75
\| \| \| \| \| \| \| \| \| \| \| \|	Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52056 llvm-svn: 342495
*	[PDB] Better support for enumerating pointer types.	Zachary Turner	2018-09-18	9	-52/+191
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There were several issues with the previous implementation. 1) There were no tests. 2) We didn't support creating PDBSymbolTypePointer records for builtin types since those aren't described by LF_POINTER records. 3) We didn't support a wide enough variety of builtin types even ignoring pointers. This patch fixes all of these issues. In order to add tests, it's helpful to be able to ignore the symbol index id hierarchy because it makes the golden output from the DIA version not match our output, so I've extended the dumper to disable dumping of id fields. llvm-svn: 342493
*	[PostRASink] Make sure to remove subregisters from live-ins as well	Krzysztof Parzyszek	2018-09-18	1	-2/+5
\| \| \| \|	llvm-svn: 342492
*	[RISCV][MC] Use a custom ParserMethod for the bare_symbol operand type	Alex Bradbury	2018-09-18	2	-33/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This allows the hard-coded shouldForceImmediate logic to be removed because the generated MatchOperandParserImpl makes use of the current context (i.e. the current mnemonic) to determine parsing behaviour, and so won't first try to parse a register before parsing a symbol name. No functional change is intended. gas accepts immediate arguments for call, tail and lla. This patch doesn't address this discrepancy. Differential Revision: https://reviews.llvm.org/D51733 llvm-svn: 342488
*	[RISCV][MC] Reject bare symbols for the simm12 operand type	Alex Bradbury	2018-09-18	1	-3/+5
\| \| \| \| \| \| \|	addi a0, a0, foo and lw a0, foo(a0) and similar are now rejected. An explicit %lo and %pcrel_lo modifier is required. This matches gas behaviour. llvm-svn: 342487
*	[RISCV][MC] Tighten up checking of symbol operands to lui and auipc	Alex Bradbury	2018-09-18	2	-13/+42
\| \| \| \| \| \| \| \| \| \| \| \|	Reject bare symbols and accept only %pcrel_hi(sym) for auipc and %hi(sym) for lui. Also test valid operand modifiers in rv32i-valid.s. Note this is slightly stricter than gas, which will accept either %pcrel_hi or %hi for both lui and auipc. Differential Revision: https://reviews.llvm.org/D51731 llvm-svn: 342486
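For context on why `%hi` pairs with `lui` and `%lo` with a 12-bit immediate, the sketch below shows the usual RV32 split of an absolute address. It is not assembler code from these patches. Because the low 12 bits are sign-extended when added, the high part is rounded by adding 0x800 first. The function names are illustrative, and `%pcrel_hi`/`%pcrel_lo` apply the same split to a PC-relative offset instead of an absolute address.

```python
def hi20(addr: int) -> int:
    """Upper 20 bits as used by %hi(sym): rounded so that lo12 may be negative."""
    return ((addr + 0x800) >> 12) & 0xFFFFF


def lo12(addr: int) -> int:
    """Low 12 bits as used by %lo(sym), interpreted as a signed immediate."""
    low = addr & 0xFFF
    return low - 0x1000 if low >= 0x800 else low


def rebuild(addr: int) -> int:
    """Model `lui rd, %hi(sym)` followed by `addi rd, rd, %lo(sym)` on RV32."""
    return ((hi20(addr) << 12) + lo12(addr)) & 0xFFFFFFFF


if __name__ == "__main__":
    addr = 0x12345FFF
    print(hex(hi20(addr)), lo12(addr), hex(rebuild(addr)))
```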
*	Remove dead function user_cache_directory()	Nico Weber	2018-09-18	3	-39/+0
\| \| \| \| \| \| \| \| \| \| \| \|	It's been unused since it was added almost 3 years ago in https://reviews.llvm.org/D13801 Motivated by https://reviews.llvm.org/rL342002 since it removes one of the functions keeping a ref to SHGetKnownFolderPath. Differential Revision: https://reviews.llvm.org/D52184 llvm-svn: 342485
*	Revert r342457 "Fixes removal of dead elements from PressureDiff (PR37252)."	Hans Wennborg	2018-09-18	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \|	This broke the lit tests on a bunch of buildbots, e.g. http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/36679 > Reviewed By: MatzeB > > Differential Revision: https://reviews.llvm.org/D51495 llvm-svn: 342482
*	[PowerPC] Do not emit record-form rotates when record-form andi/andis suffices	Nemanja Ivanovic	2018-09-18	1	-6/+28
\| \| \| \| \| \| \| \| \| \| \| \|	This is a follow-up to the previous patch that eliminated some of the rotates. With this addition, we will also emit the record-form andis. This patch increases the number of record-form rotates we eliminate by more than 70%. Differential revision: https://reviews.llvm.org/D44897 llvm-svn: 342478
*	[LTO] Make detection of WPD remark enablement more robust	Teresa Johnson	2018-09-18	1	-9/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently only the first function in the module is checked to see if it has remarks enabled. If that first function is a declaration, remarks will be incorrectly skipped. Change to look for the first non-empty function. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D51556 llvm-svn: 342477
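The behavioural change can be modelled in a few lines. This is a hypothetical sketch, not the LTO code: `Function`, `remarks_enabled_first_only` and `remarks_enabled` are invented names, and a declaration is modelled as a function with no blocks, which in this model never reports remarks as enabled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Function:
    name: str
    blocks: List[str] = field(default_factory=list)  # empty means declaration
    remarks: bool = False


def remarks_enabled_first_only(functions: List[Function]) -> bool:
    """Old behaviour in this model: look at the first function only."""
    if not functions:
        return False
    first = functions[0]
    return bool(first.blocks) and first.remarks


def remarks_enabled(functions: List[Function]) -> bool:
    """New behaviour in this model: look at the first non-empty function."""
    first_defined = next((f for f in functions if f.blocks), None)
    return first_defined is not None and first_defined.remarks
```

With a module whose first entry is a declaration, the first-only check reports remarks as disabled even though the first defined function has them enabled.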
*	[LLVM-C][OCaml] Add UnifyFunctionExitNodes pass to C and OCaml APIs	whitequark	2018-09-18	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adds LLVMAddUnifyFunctionExitNodesPass to expose createUnifyFunctionExitNodesPass to the C and OCaml APIs. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52212 llvm-svn: 342476
*	[LLVM-C][OCaml] Add LowerAtomic pass to C and OCaml APIs	whitequark	2018-09-18	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adds LLVMAddLowerAtomicPass to expose createLowerAtomicPass in the C and OCaml APIs. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D52211 llvm-svn: 342475
*	[PowerPC] Optimize compares fed by ANDISo	Nemanja Ivanovic	2018-09-18	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Both ANDIo and ANDISo (and the 64-bit versions) are record-form instructions. When optimizing compares, we handle the former in order to eliminate the compare instruction but not the latter. This patch just adds the latter to the set of instructions we optimize. The reason these instructions need to be handled separately is that they are not part of the RecFormRel map (since they don't have a non-record-form). The missing "and-immediate-shifted" is just an oversight in the initial implementation. Differential revision: https://reviews.llvm.org/D51353 llvm-svn: 342472
*	[TargetLowering] Android has sincos functions	John Brawn	2018-09-18	1	-1/+2
\| \| \| \| \| \| \| \| \|	Since Android API version 9 the Android libm has had the sincos functions, so they should be recognised as libcalls and sincos optimisation should be applied. Differential Revision: https://reviews.llvm.org/D52025 llvm-svn: 342471
*	[X86][SSE] LowerShift - pull out repeated getTargetVShiftUniformOpcode calls. NFCI.	Simon Pilgrim	2018-09-18	1	-25/+19
\| \| \| \| \| \|	llvm-svn: 342462
*	Fixes removal of dead elements from PressureDiff (PR37252).	Yury Gribov	2018-09-18	1	-2/+1
\| \| \| \| \| \| \| \|	Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D51495 llvm-svn: 342457
*	[AArch64] Attempt to parse more operands as expressions	David Green	2018-09-18	1	-24/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This tries to make use of evaluateAsRelocatable in AArch64AsmParser::classifySymbolRef to parse more complex expressions as relocatable operands. It is hopefully better than the existing code which only handles Symbol +- Constant. This allows us to parse more complex adr/adrp, mov, ldr/str and add operands. It also loosens the requirements on parsing addends in ld/st and mov's and adds a number of tests. Differential Revision: https://reviews.llvm.org/D51792 llvm-svn: 342455
*	[IndVars] Remove unreasonable checks in rewriteLoopExitValues	Max Kazantsev	2018-09-18	1	-11/+5
\| \| \| \| \| \| \| \| \| \| \|	A piece of logic in rewriteLoopExitValues has a weird check on the number of users, which allowed an unprofitable transform when an instruction has more than 6 users. Differential Revision: https://reviews.llvm.org/D51404 Reviewed By: etherzhhb llvm-svn: 342444
*	AMDGPU: Don't form fmed3 if it will require materialization	Matt Arsenault	2018-09-18	1	-2/+9
\| \| \| \| \| \| \|	If there is a single use constant, it can be folded into the min/max, but not into med3. llvm-svn: 342443
*	LSV: Fix adjust alloca alignment trick for AMDGPU	Matt Arsenault	2018-09-18	1	-29/+31
\| \| \| \| \| \| \| \| \| \|	This was checking the hardcoded address space 0 for the stack. Additionally, this should be checking for legality with the adjusted alignment, so defer the alignment check. Also try to split if the unaligned access isn't allowed. llvm-svn: 342442
*	[PowerPC] Add Itineraries of IIC_IntMulHD for P7/P8	QingShan Zhang	2018-09-18	2	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When doing some instruction scheduling work, we noticed some missing itineraries. Before we switch to the machine scheduler, those missing itineraries might not have an impact on actual scheduling, because we can still get the same latency due to default values. With the machine scheduler, however, itineraries will have an impact on scheduling. E.g., NumMicroOps will default to 0 if there are no itineraries for a specific instruction class, while most instruction classes with itineraries will have NumMicroOps default to 1. This has an impact on the count of RetiredMOps and affects the Pending/Available queue, which in turn causes different or suboptimal scheduling. Patch By: jsji (Jinsong Ji) Differential Revision: https://reviews.llvm.org/D52040 llvm-svn: 342441
*	AMDGPU: Expand vector canonicalizes	Matt Arsenault	2018-09-18	1	-0/+1
\| \| \| \|	llvm-svn: 342439
*	[LLVM-C][OCaml] Add C and OCaml APIs for llvm::StructType::isLiteral	whitequark	2018-09-18	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds LLVMIsLiteralStruct to the C API to expose StructType::isLiteral. This is then used to implement the analogous addition to the OCaml API. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52209 llvm-svn: 342435
*	[LLVM-C] Add support for ConstantExpr in LLVMGetNumIndices and LLVMGetIndices	whitequark	2018-09-18	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: ConstantExpr supports getIndices, but prior to this patch LLVMGetNumIndices and LLVMGetIndices would error on them. Reviewers: whitequark Reviewed By: whitequark Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52206 llvm-svn: 342434
*	Revert "[ARM] Cleanup ARM CGP isSupportedValue"	Volodymyr Sapsai	2018-09-18	1	-19/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts r342395 as it caused the error > Argument value type does not match pointer operand type! > %0 = atomicrmw volatile xchg i8* %_Value1, i32 1 monotonic, !dbg !25 > i8 in function atomic_flag_test_and_set > fatal error: error in backend: Broken function found, compilation aborted! on bot http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/ More details are available at https://reviews.llvm.org/D52080 llvm-svn: 342431
*	[EarlyCSEwMemorySSA] Add MSSA verification and tests to make EarlyCSE failures easier to track.	Alina Sbirlea	2018-09-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: EarlyCSE can make IR changes that will leave MemorySSA with accesses claiming to be optimized, but for which a subsequent MemorySSA run will yield a different optimized result. Due to relying on AA queries, we can't fix this in general, unless we recompute MemorySSA. Adding some tests to track this and a basic verify for future potential failures. Reviewers: george.burgess.iv, gberry Subscribers: sanjoy, jlebar, Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D51960 llvm-svn: 342422
*	[mips] Fix MIPS N32 ABI triples support	Simon Atanasyan	2018-09-17	4	-4/+19
\| \| \| \| \| \| \| \| \| \| \| \|	Add support for mips64(el)-linux-gnuabin32 triples, and set them to N32. The Debian architecture names mipsn32/mipsn32el are also added. Set UseIntegratedAssembler for N32 if we can detect it. Patch by YunQiang Su. Differential revision: https://reviews.llvm.org/D51408 llvm-svn: 342416
*	[PDB] Make the native reader support enumerators.	Zachary Turner	2018-09-17	6	-12/+254
\| \| \| \| \| \| \| \| \| \| \|	Previously we would dump the names of enum types, but not their enumerator values. This adds support for enumerator values. In doing so, we have to introduce a general purpose mechanism for caching symbol indices of field list members. Unlike global types, FieldList members do not have a TypeIndex. So instead, we identify them by the pair {TypeIndexOfFieldList, IndexInFieldList}. llvm-svn: 342415
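The caching idea, identifying a field-list member by the pair {TypeIndexOfFieldList, IndexInFieldList}, can be sketched with a dictionary keyed on a tuple. This is an illustrative model only: the class name, the method name and the way ids are allocated are assumptions, not the native PDB reader's actual interface.

```python
from typing import Dict, Tuple


class FieldListMemberCache:
    """Hypothetical cache that hands out one stable id per field-list member."""

    def __init__(self) -> None:
        self._ids: Dict[Tuple[int, int], int] = {}
        self._next_id = 0  # assumption: ids are allocated sequentially

    def get_or_create(self, field_list_type_index: int,
                      index_in_field_list: int) -> int:
        # Members have no TypeIndex of their own, so the key is the pair
        # (type index of the containing field list, position inside it).
        key = (field_list_type_index, index_in_field_list)
        if key not in self._ids:
            self._ids[key] = self._next_id
            self._next_id += 1
        return self._ids[key]
```

Looking up the same pair twice returns the same id, which is the property a symbol cache needs so that an enumerator is only materialized once.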
*	[PDB] Make the native reader support modified types.	Zachary Turner	2018-09-17	5	-53/+150
\| \| \| \| \| \| \| \|	Previously for cv-qualified types, we would just ignore them and they would never get printed. Now we can enumerate them and cache them like any other symbol type. llvm-svn: 342414
*	[MC] Avoid inlining constant symbols with variants.	Nirav Dave	2018-09-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Defer unnecessary early inlining of constants to symbol variants. Fixes PR38945. Reviewers: nickdesaulniers, rnk Subscribers: nemanjai, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D52188 llvm-svn: 342412
*	[Loopinfo] Remove one latch-case in getLoopID. NFC.	Michael Kruse	2018-09-17	1	-20/+15
\| \| \| \| \| \| \| \| \| \| \| \|	getLoopID has different control flow for two cases: one for a single loop latch and one for any other number of loop latches (zero or more than one). The latter case should return the same result if there is only a single latch. We can save the preceding redundant search for a latch by handling both cases with the same code. Differential Revision: https://reviews.llvm.org/D52118 llvm-svn: 342406
*	[MachineOutliner][NFC] Don't map more illegal instrs than you have to	Jessica Paquette	2018-09-17	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were mapping an instruction every time we saw something we couldn't map before this. Since each illegal mapping is unique, we only have to do this once. This makes it so that we don't map illegal instructions when the previous mapped instruction was illegal. In CTMark (AArch64), this results in 240 fewer instruction mappings on average over 619 files in total. The largest improvement is 12576 fewer mappings in one file, and the smallest is 0. The median improvement is 101 fewer mappings. llvm-svn: 342405
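The saving described above comes from collapsing a run of consecutive illegal instructions into a single unique marker. The sketch below models that idea on a list of opcode strings. It is not the MachineOutliner implementation: the function name and the numbering scheme (legal ids counting up from 0, illegal ids counting down from -1) are assumptions made for illustration.

```python
from typing import Callable, Dict, List


def map_instructions(instrs: List[str],
                     is_legal: Callable[[str], bool]) -> List[int]:
    """Map instructions to integers, emitting one id per run of illegal ones."""
    mapping: List[int] = []
    legal_ids: Dict[str, int] = {}
    next_illegal = -1          # hypothetical: every illegal id is unique
    prev_was_illegal = False
    for ins in instrs:
        if is_legal(ins):
            # Identical legal instructions share an id so repeats can match.
            if ins not in legal_ids:
                legal_ids[ins] = len(legal_ids)
            mapping.append(legal_ids[ins])
            prev_was_illegal = False
        elif not prev_was_illegal:
            # First illegal instruction of a run: one unique marker suffices.
            mapping.append(next_illegal)
            next_illegal -= 1
            prev_was_illegal = True
        # Further illegal instructions in the same run are not mapped.
    return mapping


if __name__ == "__main__":
    seq = ["add", "call", "ret", "add", "br", "add"]
    print(map_instructions(seq, lambda op: op == "add"))
```

Since each illegal id is unique, a single marker already prevents any candidate sequence from spanning the run, so the skipped mappings carry no extra information.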
*	[X86ISel] Implement byval lowering for Win64 calling convention	Keno Fischer	2018-09-17	2	-9/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The IR reference for the `byval` attribute states: ``` This indicates that the pointer parameter should really be passed by value to the function. The attribute implies that a hidden copy of the pointee is made between the caller and the callee, so the callee is unable to modify the value in the caller. This attribute is only valid on LLVM pointer arguments. ``` However, on Win64, this attribute is unimplemented and the raw pointer is passed to the callee instead. This is problematic, because frontend authors relying on the implicit hidden copy (as happens for every other calling convention) will see the passed value silently (if mutable memory) or loudly (by means of a crash) modified because the callee treats the location as scratch memory space it is allowed to mutate. At this point, it's worth taking a step back to understand the context. In most calling conventions, aggregates that are too large to be passed in registers instead get copied to the stack at a fixed (computable from the signature) offset of the stack pointer. At the LLVM level, we hide this copy behind the `byval` attribute. The caller passes a pointer to the desired data and the callee receives a pointer, but these pointers are not the same. In particular, the pointer that the callee receives points to temporary stack memory allocated as part of the call lowering. In most calling conventions, this pointer is never realized in registers or memory. The temporary memory is simply defined by an implicit offset from the stack pointer at function entry. Win64, uniquely, works differently. 
The structure is still passed in memory, but instead of being stored at an implicit memory offset, the caller computes a pointer to the temporary memory and passes it to the callee as a regular pointer (taking up a register, or if all registers are taken up, an additional stack slot). Presumably, this was done to allow eliding the copy when passing aggregates through several functions on the stack. This explains why ignoring the `byval` attribute mostly works on Win64. The argument simply gets passed as a pointer and as long as we're ok with the callee trampling all over that memory, there are no ill effects. However, it does contradict the documentation of the `byval` attribute, which specifies that there is to be an implicit copy. Frontends can of course work around this by never emitting the `byval` attribute for Win64 and creating `alloca`s for the requisite temporary stack slots (and that does appear to be what frontends are doing). However, the presence of the `byval` attribute is a trap for frontend authors, since it seems to work, but silently modifies the passed memory contrary to documentation. I see two solutions: - Disallow the `byval` attribute in the verifier if using the Win64 calling convention. - Make it work by simply emitting a temporary stack copy as we would with any other calling convention (frontends can of course always not use the attribute if they want to elide the copy). This patch implements the second option (make it work), though I would be fine with the first also. Ref: https://github.com/JuliaLang/julia/issues/28338 Reviewers: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51842 llvm-svn: 342402
*	[AMDGPU] Initialize instruction itinerary from GCNSubtarget	Stanislav Mekhanoshin	2018-09-17	2	-0/+6
\| \| \| \| \| \| \| \|	I need to use it in the GCN codegen. Differential Revision: https://reviews.llvm.org/D52123 llvm-svn: 342400
*	Revert "[DWARF] reposting r342048, which was reverted in r342056 due to buildbot errors. Adjusted 2 test cases for ARM and darwin and fixed a bug with the original change in dsymutil."	Alexander Kornienko	2018-09-17	7	-201/+221
\| \| \| \| \| \| \| \| \|	This reverts commit r342218 due to a number of failures under TSAN. An isolated test case is being worked on. llvm-svn: 342399