bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[RISCV] Fixed test case failure due to r338047	Ana Pazos	2018-07-31	1	-1/+1
|	llvm-svn: 338341
*	[RISCV] Add support for _interrupt attribute	Ana Pazos	2018-07-26	5	-0/+1317
|	- Save/restore only registers that are used. This includes Callee saved registers and Caller saved registers (arguments and temporaries) for integer and FP registers.
|	- If there is a call in the interrupt handler, save/restore all Caller saved registers (arguments and temporaries) and all FP registers.
|	- Emit special return instructions depending on "interrupt" attribute type.
|
|	Based on initial patch by Zhaoshi Zheng.
|
|	Reviewers: asb
|	Reviewed By: asb
|	Subscribers: rkruppe, the_o, MartinMosbeck, brucehoult, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, llvm-commits
|	Differential Revision: https://reviews.llvm.org/D48411
|	llvm-svn: 338047
*	[RISCV] Add machine function pass to merge base + offset	Sameer AbuAsal	2018-06-27	1	-37/+38
|	Summary:
|	In r333455 we added a peephole to fix the corner cases that result from separating base + offset lowering of global address. The peephole didn't handle some of the cases because it only has a basic block view instead of a function level view.
|
|	This patch replaces that logic with a machine function pass. In addition to handling the original cases, it handles uses of the global address across blocks in a function and folding an offset from LW/SW instructions. This pass won't run for OptNone compilation, so there will be a negative impact overall vs the old approach at O0.
|
|	Reviewers: asb, apazos, mgrang
|	Reviewed By: asb
|	Subscribers: MartinMosbeck, brucehoult, the_o, rogfer01, mgorny, rbar, johnrusso, simoncook, niosHD, kito-cheng, shiva0217, zzheng, llvm-commits, edward-jones
|	Differential Revision: https://reviews.llvm.org/D47857
|	llvm-svn: 335786
*	[RISC-V] Fix a test case to not include label names as those aren't	Chandler Carruth	2018-06-21	1	-2/+2
|	stable in non-asserts builds.
|
|	This fixes a test failure in release config.
|
|	llvm-svn: 335202
*	[RISCV] Add tests for overflow intrinsics	Roger Ferrer Ibanez	2018-06-19	1	-0/+84
|	This is using the existing codegen so we can see the change once we custom lower ISD::{U,S}{ADD,SUB}O nodes.
|
|	llvm-svn: 335023
*	[RISCV] Add codegen support for atomic load/stores with RV32A	Alex Bradbury	2018-06-13	1	-0/+217
|	Fences are inserted according to table A.6 in the current draft of version 2.3 of the RISC-V Instruction Set Manual, which incorporates the memory model changes and definitions contributed by the RISC-V Memory Consistency Model task group.
|
|	Instruction selection failures will now occur for 8/16/32-bit atomicrmw and cmpxchg operations when targeting RV32IA until lowering for these operations is added in a follow-on patch.
|
|	Differential Revision: https://reviews.llvm.org/D47589
|	llvm-svn: 334591
*	[RISCV] Codegen support for atomic operations on RV32I	Alex Bradbury	2018-06-13	4	-0/+7345
|	This patch adds lowering for atomic fences and relies on AtomicExpandPass to lower atomic loads/stores, atomic rmw, and cmpxchg to __atomic_* libcalls.
|
|	test/CodeGen/RISCV/atomic-* are modelled on the exhaustive test/CodeGen/PPC/atomics-regression.ll, and will prove more useful once RV32A codegen support is introduced.
|
|	Fence mappings are taken from table A.6 in the current draft of version 2.3 of the RISC-V Instruction Set Manual, which incorporates the memory model changes and definitions contributed by the RISC-V Memory Consistency Model task group.
|
|	Differential Revision: https://reviews.llvm.org/D47587
|	llvm-svn: 334590
*	[RISCV] Add peepholes for Global Address lowering patterns	Sameer AbuAsal	2018-05-29	1	-16/+73
|	Summary:
|	Base and offset are always separated when a GlobalAddress node is lowered (rL332641) as an optimization to reduce instruction count. However, this optimization is not profitable if the Global Address ends up being used in only one instruction.
|
|	This patch adds peephole optimizations that merge an offset of an address calculation into the LUI %hi and ADDI %lo of the lowering sequence.
|
|	The peephole handles three patterns:
|
|	1) ADDI (ADDI (LUI %hi(global)) %lo(global)), offset
|	   --->
|	   ADDI (LUI %hi(global + offset)) %lo(global + offset).
|
|	   This generates:
|	     lui a0, hi(global + offset)
|	     addi a0, a0, lo(global + offset)
|
|	   Instead of:
|	     lui a0, hi(global)
|	     addi a0, lo(global)
|	     addi a0, offset
|
|	   This pattern is for cases when the offset is small enough to fit in the immediate field of ADDI (less than 12 bits).
|
|	2) ADD ((ADDI (LUI %hi(global)) %lo(global)), (LUI hi_offset))
|	   --->
|	   offset = hi_offset << 12
|	   ADDI (LUI %hi(global + offset)) %lo(global + offset)
|
|	   Which generates the ASM:
|	     lui a0, hi(global + offset)
|	     addi a0, lo(global + offset)
|
|	   Instead of:
|	     lui a0, hi(global)
|	     addi a0, lo(global)
|	     lui a1, (offset)
|	     add a0, a0, a1
|
|	   This pattern is for cases when the offset doesn't fit in an immediate field of ADDI but the lower 12 bits are all zeros.
|
|	3) ADD ((ADDI (LUI %hi(global)) %lo(global)), (ADDI lo_offset, (LUI hi_offset)))
|	   --->
|	   offset = (offhi20 << 12) + offlo12
|	   ADDI (LUI %hi(global + offset)) %lo(global + offset)
|
|	   Which generates the ASM:
|	     lui a1, %hi(global + offset)
|	     addi a1, %lo(global + offset)
|
|	   Instead of:
|	     lui a0, hi(global)
|	     addi a0, lo(global)
|	     lui a1, (offhi20)
|	     addi a1, (offlo12)
|	     add a0, a0, a1
|
|	   This pattern is for cases when the offset doesn't fit in an immediate field of ADDI and both the lower 12 bits and high 20 bits are non zero.
|
|	Reviewers: asb
|	Reviewed By: asb
|	Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, apazos, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang
|	llvm-svn: 333455
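The offset folding described in the three patterns relies on how an address is split into a %hi/%lo pair. The following is a minimal Python sketch of that arithmetic, not the code from the patch; the function names (`hi20`, `lo12`, `materialise`, `fold_offset`) are hypothetical and the model assumes 32-bit addresses.

```python
def hi20(addr):
    """Upper 20 bits for LUI, rounded up when the low part will be negative."""
    return ((addr + 0x800) >> 12) & 0xFFFFF


def lo12(addr):
    """Low 12 bits as the signed ADDI immediate, in [-2048, 2047]."""
    low = addr & 0xFFF
    return low - 0x1000 if low >= 0x800 else low


def materialise(addr):
    """Value produced by `lui rd, hi20` followed by `addi rd, rd, lo12` on RV32."""
    return ((hi20(addr) << 12) + lo12(addr)) & 0xFFFFFFFF


def fold_offset(global_addr, offset):
    """Fold a constant offset into the %hi/%lo pair (hypothetical helper)."""
    target = (global_addr + offset) & 0xFFFFFFFF
    return hi20(target), lo12(target)


print(fold_offset(0x10000, 0x7FC))
print(fold_offset(0x10000, 0x800))
```

The `+ 0x800` rounding in `hi20` compensates for ADDI sign-extending its immediate, which is why the folded pair still reconstructs `global + offset` exactly.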
*	[RISCV] Lower the tail pseudoinstruction	Mandeep Singh Grang	2018-05-23	3	-0/+224
|	This patch lowers the tail pseudoinstruction. This has been modeled after ARM's tail call opt.
|
|	llvm-svn: 333137
*	[RISCV] Set CostPerUse for registers	Sameer AbuAsal	2018-05-23	4	-57/+57
|	Summary:
|	Set CostPerUse higher for registers that are not used in the compressed instruction set. This will influence the greedy register allocator to reduce the use of registers that can't be encoded in 16 bit instructions. This affects register allocation even when the compressed instruction set isn't targeted, but we see no major negative codegen impact.
|
|	Reviewers: asb
|	Reviewed By: asb
|	Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, apazos, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang
|	Differential Revision: https://reviews.llvm.org/D47039
|	llvm-svn: 333132
*	[RISCV] Separate base from offset in lowerGlobalAddress	Sameer AbuAsal	2018-05-17	8	-80/+181
|	Summary:
|	When lowering a global address, lower the base as a TargetGlobal first, then create an SDNode for the offset separately and chain it to the address calculation.
|
|	This optimization will create a DAG where the base address of a global access will be reused between different accesses. The offset can later be folded into the immediate part of the memory access instruction.
|
|	With this optimization we generate:
|
|	  lui a0, %hi(s)
|	  addi a0, a0, %lo(s) ; shared base address.
|
|	  addi a1, zero, 20 ; 2 instructions per access.
|	  sw a1, 44(a0)
|
|	  addi a1, zero, 10
|	  sw a1, 8(a0)
|
|	  addi a1, zero, 30
|	  sw a1, 80(a0)
|
|	Instead of:
|
|	  lui a0, %hi(s+44) ; 3 instructions per access.
|	  addi a1, zero, 20
|	  sw a1, %lo(s+44)(a0)
|
|	  lui a0, %hi(s+8)
|	  addi a1, zero, 10
|	  sw a1, %lo(s+8)(a0)
|
|	  lui a0, %hi(s+80)
|	  addi a1, zero, 30
|	  sw a1, %lo(s+80)(a0)
|
|	Which will save one instruction per access.
|
|	Reviewers: asb, apazos
|	Reviewed By: asb
|	Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, apazos, asb, llvm-commits
|	Differential Revision: https://reviews.llvm.org/D46989
|	llvm-svn: 332641
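To make the instruction-count trade-off in the message above concrete, here is a small Python sketch that emits both sequences for a list of stores and compares their lengths. It only models the example in the commit message, with the same register names; the function names `shared_base` and `folded_offsets` are hypothetical, and it assumes every offset fits in the 12-bit store immediate.

```python
def shared_base(symbol, stores):
    """Base address materialised once, then 2 instructions per store."""
    asm = [f"lui a0, %hi({symbol})", f"addi a0, a0, %lo({symbol})"]
    for value, offset in stores:
        asm.append(f"addi a1, zero, {value}")
        asm.append(f"sw a1, {offset}(a0)")
    return asm


def folded_offsets(symbol, stores):
    """Offset folded into %hi/%lo, costing 3 instructions per store."""
    asm = []
    for value, offset in stores:
        asm.append(f"lui a0, %hi({symbol}+{offset})")
        asm.append(f"addi a1, zero, {value}")
        asm.append(f"sw a1, %lo({symbol}+{offset})(a0)")
    return asm


stores = [(20, 44), (10, 8), (30, 80)]  # (value, offset) pairs from the example
print(len(shared_base("s", stores)), len(folded_offsets("s", stores)))
```

Under these assumptions the shared-base form costs 2 + 2n instructions against 3n, so it only wins once a global is accessed more than twice, which is the case the later peephole and merge pass address.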
*	[RISCV] Set isReMaterializable on ADDI and LUI instructions	Alex Bradbury	2018-05-17	1	-38/+29
|	The isReMaterializable flag is somewhat confusing; unlike most other instruction flags it is currently interpreted as a hint (mightBeRematerializable would be a better name). While LUI is always rematerialisable, for an instruction like ADDI it depends on its operands. TargetInstrInfo::isTriviallyReMaterializable will call TargetInstrInfo::isReallyTriviallyReMaterializable, which in turn calls TargetInstrInfo::isReallyTriviallyReMaterializableGeneric. We rely on the logic in the latter to pick out instances of ADDI that really are rematerializable.
|
|	The isReMaterializable flag does make a difference on a variety of test programs. The recently committed remat.ll test case demonstrates how stack usage is reduced and an unnecessary lw/sw can be removed. Stack usage in the Proc0 function in dhrystone reduces from 192 bytes to 112 bytes.
|
|	For the sake of completeness, this patch also implements RISCVRegisterInfo::isConstantPhysReg. Although this is called from a number of places, it doesn't seem to result in different codegen for any programs I've thrown at it. However, it is called in the rematerialisation codepath and it seems sensible to implement something correct here.
|
|	Differential Revision: https://reviews.llvm.org/D46182
|	llvm-svn: 332617
*	[RISCV] Support .option rvc and norvc assembler directives	Alex Bradbury	2018-05-11	2	-0/+30
|	These directives allow the 'C' (compressed) extension to be enabled/disabled within a single file.
|
|	Differential Revision: https://reviews.llvm.org/D45864
|	Patch by Kito Cheng
|
|	llvm-svn: 332107
*	[RISCV] Add remat.ll test case	Alex Bradbury	2018-04-27	1	-0/+209
|	This test case demonstrates suboptimal codegen due to the fact that simple constants aren't recognised as rematerialisable.
|
|	llvm-svn: 331028
*	[RISCV] Implement isZextFree	Alex Bradbury	2018-04-26	1	-10/+7
|	This returns true for 8-bit and 16-bit loads, allowing LBU/LHU to be selected and avoiding unnecessary masks.
|
|	llvm-svn: 330943
*	[RISCV] Add test case showing suboptimal codegen when loading unsigned char/short	Alex Bradbury	2018-04-26	1	-0/+79
|	Implementing isZextFree will allow lbu or lhu to be selected rather than lb+mask and lh+mask.
|
|	llvm-svn: 330942
*	[RISCV] Implement isLegalAddImmediate	Alex Bradbury	2018-04-26	1	-12/+11
|	This causes a trivial improvement in the recently added lsr-legaladdimm.ll test case.
|
|	llvm-svn: 330937
*	[RISCV] Add test/CodeGen/RISCV/lsr-legaladdimm.ll	Alex Bradbury	2018-04-26	1	-0/+49
|	Add a test case which will show a codegen difference upon the implementation of a target-specific isLegalAddImmediate.
|
|	llvm-svn: 330936
*	[RISCV] Expand function call to "call" pseudoinstruction	Shiva Chen	2018-04-25	23	-555/+231
|	To do this:
|	1. Change GlobalAddress SDNode to TargetGlobalAddress to prevent the legalizer from splitting the symbol.
|	2. Change ExternalSymbol SDNode to TargetExternalSymbol to prevent the legalizer from splitting the symbol.
|	3. Let PseudoCALL match direct calls with target operand TargetGlobalAddress and TargetExternalSymbol.
|
|	Differential Revision: https://reviews.llvm.org/D44885
|	llvm-svn: 330827
*	[RISCV] Add test changes missed from rL330293	Alex Bradbury	2018-04-18	1	-4/+0
|	llvm-svn: 330294
*	[RISCV] Introduce pattern for materialising immediates with 0 for lower 12 bits	Alex Bradbury	2018-04-18	5	-52/+29
|	These immediates can be materialised with just an lui, rather than an lui+addi pair.
|
|	llvm-svn: 330293
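As an illustration of the case this pattern covers, the Python sketch below picks an instruction sequence for a 32-bit constant. It is a simplified model rather than the backend's selection logic; `materialise_imm32` and the fixed register `a0` are hypothetical.

```python
from typing import List


def materialise_imm32(value: int) -> List[str]:
    """Choose addi, lui, or lui+addi for a 32-bit constant (simplified model)."""
    value &= 0xFFFFFFFF
    signed = value - (1 << 32) if value & 0x80000000 else value
    if -2048 <= signed <= 2047:
        # Fits the 12-bit signed immediate of a single addi.
        return [f"addi a0, zero, {signed}"]
    hi = ((value + 0x800) >> 12) & 0xFFFFF
    lo = value & 0xFFF
    if lo >= 0x800:
        lo -= 0x1000  # addi sign-extends, so hi was rounded up above
    if lo == 0:
        # Lower 12 bits are zero: a lone lui is enough.
        return [f"lui a0, {hi}"]
    return [f"lui a0, {hi}", f"addi a0, a0, {lo}"]


print(materialise_imm32(0x12345000))
print(materialise_imm32(0x12345678))
```

The middle branch is the one the commit adds a pattern for: when the low 12 bits are zero, the trailing addi would add nothing and can be dropped.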
*	[RISCV] Add imm-cse.ll test case	Alex Bradbury	2018-04-18	1	-0/+39
|	This test case demonstrates that common subexpression elimination takes place between code sequences for materialising constants. In particular, it demonstrates that redundant lui aren't generated.
|
|	This would capture a regression if applying a patch such as D41949.
|
|	llvm-svn: 330291
*	[RISCV] Expand codegen -> compression sanity checks and move to a single file	Alex Bradbury	2018-04-18	4	-85/+169
|	The objdump tests interfere with update_llc_test_checks.py, so they can't be updated automatically. Put the sanity check for compression on the codegen codepath into a separate file, and expand it to also include tests of integer materialisation.
|
|	This would catch changes such as those triggered by D41949.
|
|	llvm-svn: 330288
*	Revert "[RISCV] implement li pseudo instruction"	Alex Bradbury	2018-04-18	5	-104/+127
|	Reverts rL330224, while issues with the C extension and missed common subexpression elimination opportunities are addressed. Neither of these issues is visible in current RISC-V backend unit tests, which clearly need expanding.
|
|	llvm-svn: 330281
*	[RISCV] Add specific tests for materialising imm32hi20 constants	Alex Bradbury	2018-04-18	1	-0/+16
|	i.e. constants that can be materialised with a single lui, as the lower 12 bits are zero.
|
|	llvm-svn: 330274
*	[RISCV] implement li pseudo instruction	Alex Bradbury	2018-04-17	4	-125/+104
|	The implementation follows the MIPS backend and expands the pseudo instruction directly during asm parsing. As a result, only real MC instructions are emitted to the MCStreamer. Additionally, PseudoLI instructions are emitted during codegen. The actual expansion to real instructions is performed during MI to MC lowering and is similar to the expansion performed by the GNU Assembler.
|
|	Differential Revision: https://reviews.llvm.org/D41949
|
|	Patch by Mario Werner.
|
|	llvm-svn: 330224
*	[RISCV] Change function alignment to 4 bytes, and 2 bytes for RVC	Shiva Chen	2018-04-12	1	-0/+13
|	Summary:
|	According to the RISC-V ELF psABI specification, the base RV32 and RV64 ISAs require instructions to be 32-bit aligned, while the C extension allows instructions to be aligned to 16-bit boundaries.
|
|	So we align functions to 4 bytes, or to 2 bytes when the C extension is enabled.
|
|	Reviewers: asb, apazos
|	Differential Revision: https://reviews.llvm.org/D45560
|
|	Patch by Kito Cheng.
|
|	llvm-svn: 329899
*	[RISCV] Codegen support for RV32D floating point comparison operations	Alex Bradbury	2018-04-12	4	-0/+1327
|	Also add double-previous-failure.ll, which captures a test case that at one point triggered a compiler crash while developing calling convention support for f64 on RV32D with the soft-float ABI.
|
|	llvm-svn: 329877
*	[RISCV] Codegen support for RV32D floating point conversion operations	Alex Bradbury	2018-04-12	2	-0/+106
|	This also includes support and a test for truncating stores, which are now possible thanks to the fpround pattern.
|
|	llvm-svn: 329876
*	[RISCV] Add codegen support for RV32D floating point arithmetic operations	Alex Bradbury	2018-04-12	1	-0/+256
|	llvm-svn: 329874
*	[RISCV] Add tests missed in r329871	Alex Bradbury	2018-04-12	6	-0/+464
|	llvm-svn: 329872
*	[RISCV] Tablegen-driven Instruction Compression.	Sameer AbuAsal	2018-04-06	4	-0/+99
|	Summary:
|	This patch implements a tablegen-driven Instruction Compression mechanism for generating RISCV compressed instructions (C Extension) from the expanded instruction form.
|
|	This tablegen backend processes CompressPat declarations in a td file and generates all the compile-time and runtime checks required to validate the declarations, validate the input operands and generate correct instructions. The checks include validating register operands, immediate operands, fixed register operands and fixed immediate operands.
|
|	Example:
|	  class CompressPat<dag input, dag output> {
|	    dag Input = input;
|	    dag Output = output;
|	    list<Predicate> Predicates = [];
|	  }
|
|	  let Predicates = [HasStdExtC] in {
|	  def : CompressPat<(ADD GPRNoX0:$rs1, GPRNoX0:$rs1, GPRNoX0:$rs2),
|	                    (C_ADD GPRNoX0:$rs1, GPRNoX0:$rs2)>;
|	  }
|
|	The result is an auto-generated header file 'RISCVGenCompressEmitter.inc' which exports two functions for compressing/uncompressing MCInst instructions, plus some helper functions:
|
|	  bool compressInst(MCInst& OutInst, const MCInst &MI,
|	                    const MCSubtargetInfo &STI,
|	                    MCContext &Context);
|
|	  bool uncompressInst(MCInst& OutInst, const MCInst &MI,
|	                      const MCRegisterInfo &MRI,
|	                      const MCSubtargetInfo &STI);
|
|	The clients that include this auto-generated header file and invoke these functions can compress an instruction before emitting it, in the target-specific ASM or ELF streamer, or can uncompress an instruction before printing it, when the expanded instruction format is favored.
|
|	The following clients were added to implement compression/uncompression for RISCV:
|
|	1) RISCVAsmParser::MatchAndEmitInstruction: Inserted a call to compressInst() to compress instructions parsed by llvm-mc coming from an ASM input.
|	2) RISCVAsmPrinter::EmitInstruction: Inserted a call to compressInst() to compress instructions that were lowered from Machine Instructions (MachineInstr).
|	3) RVInstPrinter::printInst: Inserted a call to uncompressInst() to print the expanded version of the instruction instead of the compressed one (e.g., add s0, s0, a5 instead of c.add s0, a5) when -riscv-no-aliases is not passed.
|
|	This patch squashes D45119, D42780 and D41932. It was reviewed in smaller patches by asb, efriedma, apazos and mgrang.
|
|	Reviewers: asb, efriedma, apazos, llvm-commits, sabuasal
|	Reviewed By: sabuasal
|	Subscribers: mgorny, eraman, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, niosHD, kito-cheng, shiva0217, zzheng
|	Differential Revision: https://reviews.llvm.org/D45385
|	llvm-svn: 329455
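The single CompressPat shown in the message can be modelled in a few lines of Python to show what the generated checks amount to for that one pattern. This is only a sketch: the tuple representation of an instruction and the names `compress_inst` and `uncompress_inst` are hypothetical stand-ins, not the signatures of the generated compressInst()/uncompressInst().

```python
def compress_inst(inst, has_std_ext_c=True):
    """Model of (ADD GPRNoX0:$rs1, GPRNoX0:$rs1, GPRNoX0:$rs2) -> C_ADD."""
    if not has_std_ext_c:
        return None  # the Predicates list requires the C extension
    if len(inst) != 4 or inst[0] != "add":
        return None
    _, rd, rs1, rs2 = inst
    # Both operands are GPRNoX0, and the destination must equal the first source.
    if rd == rs1 and rd != "zero" and rs2 != "zero":
        return ("c.add", rd, rs2)
    return None


def uncompress_inst(inst):
    """Expand c.add rd, rs2 back to add rd, rd, rs2."""
    if len(inst) == 3 and inst[0] == "c.add":
        _, rd, rs2 = inst
        return ("add", rd, rd, rs2)
    return None


print(compress_inst(("add", "s0", "s0", "a5")))
print(uncompress_inst(("c.add", "s0", "a5")))
```

Returning `None` when no pattern matches mirrors the boolean result of the generated functions, which leave the instruction untouched when compression does not apply.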
*	[RISCV] Use init_array instead of ctors for RISCV target, by default	Mandeep Singh Grang	2018-03-24	1	-0/+30
|	Summary:
|	LLVM defaults to the newer .init_array/.fini_array scheme for static constructors rather than the less desirable .ctors/.dtors (the UseCtors flag defaults to false). This wasn't being respected in the RISC-V backend because it fails to call TargetLoweringObjectFileELF::InitializeELF with the appropriate flag for UseInitArray.
|	This patch fixes this by implementing RISCVELFTargetObjectFile and overriding its Initialize method to call InitializeELF(TM.Options.UseInitArray).
|
|	Reviewers: asb, apazos
|	Reviewed By: asb
|	Subscribers: mgorny, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, llvm-commits
|	Differential Revision: https://reviews.llvm.org/D44750
|	llvm-svn: 328433
*	[RISCV] Codegen support for RV32F floating point comparison operations	Alex Bradbury	2018-03-21	4	-0/+1067
|	This patch also includes extensive tests targeted at select and br+fcmp IR inputs. A sequence of br+fcmp required support for FPR32 registers to be added to RISCVInstrInfo::storeRegToStackSlot and RISCVInstrInfo::loadRegFromStackSlot.
|
|	llvm-svn: 328104
*	[RISCV] Add tests missed from r327979	Alex Bradbury	2018-03-21	1	-0/+53
|	llvm-svn: 328102
*	[RISCV] Add codegen for RV32F floating point load/store	Alex Bradbury	2018-03-20	2	-0/+111
|	As part of this, add support for load/store from the constant pool. This is used to materialise f32 constants.
|
|	llvm-svn: 327979
*	[RISCV] Add codegen for RV32F arithmetic and conversion operations	Alex Bradbury	2018-03-20	2	-0/+262
|	Currently, only a soft floating point ABI is supported.
|
|	llvm-svn: 327976
*	[RISCV] Preserve stack space for outgoing arguments when the function contains variable size objects	Shiva Chen	2018-03-20	2	-7/+53
|	E.g.
|
|	  bar (int x) {
|	    char p[x];
|
|	    push outgoing variables for foo.
|	    call foo
|	  }
|
|	We need to generate stack adjustment instructions for outgoing arguments in eliminateCallFramePseudoInstr when the function contains variable size objects, so that the outgoing variables do not corrupt the variable size object.
|
|	The default hasReservedCallFrame returns !hasFP(). We don't want to generate extra sp adjustment instructions when hasFP() returns true, so we override hasReservedCallFrame as !hasVarSizedObjects().
|
|	Differential Revision: https://reviews.llvm.org/D43752
|	llvm-svn: 327938
*	[RISCV] Peephole optimisation for load/store of global values or constant addresses	Alex Bradbury	2018-03-19	6	-67/+36
|	(load (add base, off), 0) -> (load base, off)
|	(store val, (add base, off)) -> (store val, base, off)
|
|	This is similar to an equivalent peephole optimisation in PPCISelDAGToDAG.
|
|	llvm-svn: 327831
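The load rewrite above can be sketched on a toy expression tree in Python. This is an illustrative model only: the tuple encoding and the name `fold_load` are hypothetical, and the check that the offset fits a 12-bit signed immediate is an assumption added here because that is what a RISC-V load can encode; the commit message does not spell it out.

```python
def fold_load(node):
    """Rewrite ("load", ("add", base, off), 0) into ("load", base, off)."""
    if not (isinstance(node, tuple) and len(node) == 3 and node[0] == "load"):
        return node
    _, addr, imm = node
    if imm != 0 or not (isinstance(addr, tuple) and len(addr) == 3):
        return node
    op, base, off = addr
    # Assumption: only fold constants that fit the signed 12-bit load immediate.
    if op == "add" and isinstance(off, int) and -2048 <= off <= 2047:
        return ("load", base, off)
    return node


print(fold_load(("load", ("add", "base", 8), 0)))
```

Folding removes the separate add, so the address computation is absorbed into the memory instruction's own immediate field.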
*	[RISCV] Update two tests after r326208	Alex Bradbury	2018-02-28	2	-4/+4
|	llvm-svn: 326309
*	[RISCV] Revert r324172 now r323991 was reverted	Alex Bradbury	2018-02-17	2	-4/+4
|	This fixes the build, now that r325421 was committed to revert r323991.
|
|	llvm-svn: 325441
*	[RISCV] Update two RISCV codegen tests after rL323991	Alex Bradbury	2018-02-03	2	-4/+4
|	From the discussion in D41835 it looks possible the change will be backed out, but for now let's fix the RISCV tests.
|
|	llvm-svn: 324172
*	[RISCV] Define getSetCCResultType for setting vector setCC type	Shiva Chen	2018-02-02	1	-0/+35
|	This avoids triggering the "No default SetCC type for vectors!" assertion.
|
|	Differential Revision: https://reviews.llvm.org/D42675
|
|	llvm-svn: 324054
*	Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)	Daniel Neilson	2018-01-19	1	-2/+2
|	Summary:
|	This is a resurrection of work first proposed and discussed in Aug 2015:
|	  http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
|	and initially landed (but then backed out) in Nov 2015:
|	  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html
|
|	The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two.
|
|	This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments.
|
|	In this change we:
|	1) Remove the alignment argument.
|	2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal.
|
|	For example, code which used to read:
|	  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
|	will now read
|	  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)
|
|	Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required.
s~declare void @llvm\.mem(set\|cpy\|move)\.p([^(])$(.), i32, i1$~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(])i8$i8([^])\ (.), i8 (.), i8 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i16$i8([^])\ (.), i8 (.), i16 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i32$i8([^])\ (.), i8 (.), i32 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i64$i8([^])\ (.), i8 (.), i64 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i128$i8([^])\ (.), i8 (.), i128 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i8$i8([^])\ (.), i8 (.), i8 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i8(i8\2 align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i16$i8([^])\ (.), i8 (.), i16 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i16(i8\2 align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i32$i8([^])\ (.), i8 (.), i32 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i32(i8\2 align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i64$i8([^])\ (.), i8 (.), i64 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i64(i8\2 align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i128$i8([^])\ (.), i8 (.), i128 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i128(i8\2 align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8$i8([^])\ (.), i8([^])\ (.), i8 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i8(i8\3 \4, i8\5* \6, i8 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16$i8([^])\ (.), i8([^])\ (.), i16 (.), i32 [01], i1 ([^)])$~call void 
@llvm.mem\1.p\2i16(i8\3 \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32$i8([^])\ (.), i8([^])\ (.), i32 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i32(i8\3 \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64$i8([^])\ (.), i8([^])\ (.), i64 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i64(i8\3 \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128$i8([^])\ (.), i8([^])\ (.), i128 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i128(i8\3 \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8$i8([^])\ (.), i8([^])\ (.), i8 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16$i8([^])\ (.), i8([^])\ (.), i16 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32$i8([^])\ (.), i8([^])\ (.), i32 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64$i8([^])\ (.), i8([^])\ (.), i64 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128$i8([^])\ (.), i8([^])\ (.), i128 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. 
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 llvm-svn: 322965
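For readers who want to see the Step 1 migration on the single example given in the message, the Python sketch below rewrites that one call form. It is far narrower than the sed script quoted above: it only handles `@llvm.memcpy.p0i8.p0i8.i32` calls with plain `%name` operands, and the names `OLD_CALL` and `migrate_memcpy` are hypothetical. Dropping the attribute for alignments 0 and 1 follows the `[01]` cases in the sed script.

```python
import re

# Old form: dest, src, length, alignment, volatile flag.
OLD_CALL = re.compile(
    r"call void @llvm\.memcpy\.p0i8\.p0i8\.i32\("
    r"i8\* (%[\w.]+), i8\* (%[\w.]+), i32 (\d+), i32 (\d+), i1 (\w+)\)"
)


def _rewrite(match):
    dest, src, length, align, volatile = match.groups()
    # Alignments of 0 or 1 carry no information, so no attribute is emitted.
    attr = "" if align in ("0", "1") else f"align {align} "
    return (
        "call void @llvm.memcpy.p0i8.p0i8.i32("
        f"i8* {attr}{dest}, i8* {attr}{src}, i32 {length}, i1 {volatile})"
    )


def migrate_memcpy(line):
    """Move the alignment argument onto the pointer operands as attributes."""
    return OLD_CALL.sub(_rewrite, line)


old = "call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)"
print(migrate_memcpy(old))
```

Lines that do not match the old call form are returned unchanged, so the function can be applied to a whole test file line by line.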
*	[RISCV] Codegen support for the standard RV32M instruction set extension	Alex Bradbury	2018-01-18	3	-3/+204
|	llvm-svn: 322843
*	[RISCV] Implement frame pointer elimination	Alex Bradbury	2018-01-18	31	-1818/+1972
|	llvm-svn: 322839
*	[RISCV][NFC] Add nounwind to functions in div.ll and mul.ll	Alex Bradbury	2018-01-18	2	-16/+16
|	Committing this separately to minimise irrelevant changes for an upcoming patch.
|
|	llvm-svn: 322825
*	[RISCV] Reserve an emergency spill slot for the register scavenger when necessary	Alex Bradbury	2018-01-11	1	-7/+92
|	Although the register scavenger can often find a spare register, an emergency spill slot is needed to guarantee success. Reserve this slot in cases where the function is known to have a large stack (meaning the scavenger may be needed when forming stack addresses).
|
|	llvm-svn: 322269
*	[RISCV] Implement support for the BranchRelaxation pass	Alex Bradbury	2018-01-10	4	-31/+110
|	Branch relaxation is needed to support branch displacements that overflow the instruction's immediate field.
|
|	Differential Revision: https://reviews.llvm.org/D40830
|
|	llvm-svn: 322224
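A rough Python sketch of the range check behind this: a displacement needs relaxing when it does not fit the signed immediate of the instruction. The widths used (13 bits for conditional branches, 21 bits for JAL) come from the base ISA's B-type and J-type encodings rather than from the patch, and the names `fits_signed`, `IMMEDIATE_BITS` and `needs_relaxation` are hypothetical.

```python
def fits_signed(value, bits):
    """True when value is representable as a two's complement number of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


# Displacement widths in bits, including the implicit zero low bit.
IMMEDIATE_BITS = {"branch": 13, "jal": 21}


def needs_relaxation(kind, displacement):
    """Whether a byte displacement overflows the instruction's immediate field."""
    return not fits_signed(displacement, IMMEDIATE_BITS[kind])


print(needs_relaxation("branch", 4094), needs_relaxation("branch", 4096))
```

With these widths a conditional branch reaches roughly 4 KiB in either direction and JAL roughly 1 MiB, so an out-of-range conditional branch is the common case a relaxation pass has to rewrite.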
*	[RISCV] Implement branch analysis	Alex Bradbury	2018-01-10	5	-159/+240
|	This is a prerequisite for the branch relaxation pass, and allows a number of optimisation passes (e.g. BranchFolding and MachineBlockPlacement) to work.
|
|	Differential Revision: https://reviews.llvm.org/D40808
|
|	llvm-svn: 322222