bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[InlineAsm] Remove EarlyClobber on registers that are also inputs	Hal Finkel	2015-04-20	1	-0/+155
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When an inline asm call has an output register marked as early-clobber, but that same register is also an input operand, what should we do? GCC accepts this, and is documented to accept this for read/write operands saying, "Furthermore, if the earlyclobber operand is also a read/write operand, then that operand is written only after it's used." For write-only operands, the situation seems less clear, but I have at least one existing codebase that assumes this will work, in part because it has syscall macros like this: ({ \ register uint64_t r0 __asm__ ("r0") = (__NR_ ## name); \ register uint64_t r3 __asm__ ("r3") = ((uint64_t) (arg0)); \ register uint64_t r4 __asm__ ("r4") = ((uint64_t) (arg1)); \ register uint64_t r5 __asm__ ("r5") = ((uint64_t) (arg2)); \ __asm__ __volatile__ \ ("sc" \ : "=&r"(r0),"=&r"(r3),"=&r"(r4),"=&r"(r5) \ : "0"(r0), "1"(r3), "2"(r4), "3"(r5) \ : "r6","r7","r8","r9","r10","r11","r12","cr0","memory"); \ r3; \ }) Furthermore, with register aliases and subregister relationships that only the backend knows about, rejecting this in the frontend seems like a difficult proposition (if we wanted to do so). However, keeping the early-clobber flag on the INLINEASM MI does not work for us, because it will cause the register's live interval to end to soon (so it will not appear defined to be used as an input). Fortunately, fixing this does not seem hard: When forming the INLINEASM MI, check to see if any of the early-clobber outputs are also inputs, and if so, remove the early-clobber flag. llvm-svn: 235283
*	[opaque pointer type] Add textual IR support for explicit type parameter to ↵	David Blaikie	2015-04-16	36	-98/+98
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the call instruction See r230786 and r230794 for similar changes to gep and load respectively. Call is a bit different because it often doesn't have a single explicit type - usually the type is deduced from the arguments, and just the return type is explicit. In those cases there's no need to change the IR. When that's not the case, the IR usually contains the pointer type of the first operand - but since typed pointers are going away, that representation is insufficient so I'm just stripping the "pointerness" of the explicit type away. This does make the IR a bit weird - it /sort of/ reads like the type of the first operand: "call void () %x(" but %x is actually of type "void ()" and will eventually be just of type "ptr". But this seems not too bad and I don't think it would benefit from repeating the type ("void (), void () %x(" and then eventually "void (), ptr %x(") as has been done with gep and load. This also has a side benefit: since the explicit type is no longer a pointer, there's no ambiguity between an explicit type and a function that returns a function pointer. Previously this case needed an explicit type (eg: a function returning a void() function was written as "call void () () * @x(" rather than "call void () * @x(" because of the ambiguity between a function returning a pointer to a void() function and a function returning void). No ambiguity means even function pointer return types can just be written alone, without writing the whole function's type. This leaves /only/ the varargs case where the explicit type is required. Given the special type syntax in call instructions, the regex-fu used for migration was a bit more involved in its own unique way (as every one of these is) so here it is. Use it in conjunction with the apply.sh script and associated find/xargs commands I've provided in rr230786 to migrate your out of tree tests. Do let me know if any of this doesn't cover your cases & we can iterate on a more general script/regexes to help others with out of tree tests. About 9 test cases couldn't be automatically migrated - half of those were functions returning function pointers, where I just had to manually delete the function argument types now that we didn't need an explicit function type there. The other half were typedefs of function types used in calls - just had to manually drop the * from those. import fileinput import sys import re pat = re.compile(r'((?:=\|:\|^\|\s)call\s(?:[^@]?))(\s$\|\s(?:(?:\[\[[a-zA-Z0-9_]+\]\]\|[@%](?:(")?[\\\?@a-zA-Z0-9_.]?(?(3)"\|)\|{{.}}))(?:$\|$)\|undef\|inttoptr\|bitcast\|null\|asm).$)') addrspace_end = re.compile(r"addrspace\(\d+$\s\$") func_end = re.compile("(?:void.\|\)\s)\$") def conv(match, line): if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)): return line return line[:match.start()] + match.group(1)[:match.group(1).rfind('')].rstrip() + match.group(2) + line[match.end():] for line in sys.stdin: sys.stdout.write(conv(re.search(pat, line), line)) llvm-svn: 235145
*	Revert the switch lowering change (r235101, r235103, r235106)	Hans Wennborg	2015-04-16	2	-56/+45
\| \| \| \| \| \|	Looks like it broke the sanitizer-ppc64-linux1 build. Reverting for now. llvm-svn: 235108
*	Switch lowering: extract jump tables and bit tests before building binary ↵	Hans Wennborg	2015-04-16	2	-45/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tree (PR22262) This is a major rewrite of the SelectionDAG switch lowering. The previous code would lower switches as a binary tre, discovering clusters of cases suitable for lowering by jump tables or bit tests as it went along. To increase the likelihood of finding jump tables, the binary tree pivot was selected to maximize case density on both sides of the pivot. By not selecting the pivot in the middle, the binary trees would not always be balanced, leading to performance problems in the generated code. This patch rewrites the lowering to search for clusters of cases suitable for jump tables or bit tests first, and then builds the binary tree around those clusters. This way, the binary tree will always be balanced. This has the added benefit of decoupling the different aspects of the lowering: tree building and jump table or bit tests finding are now easier to tweak separately. For example, this will enable us to balance the tree based on profile info in the future. The algorithm for finding jump tables is O(n^2), whereas the previous algorithm was O(n log n) for common cases, and quadratic only in the worst-case. This doesn't seem to be major problem in practice, e.g. compiling a file consisting of a 10k-case switch was only 30% slower, and such large switches should be rare in practice. Compiling e.g. gcc.c showed no compile-time difference. If this does turn out to be a problem, we could limit the search space of the algorithm. This commit also disables all optimizations during switch lowering in -O0. Differential Revision: http://reviews.llvm.org/D8649 llvm-svn: 235101
*	Update tests to not be as dependent on section numbers.	Rafael Espindola	2015-04-15	2	-3/+3
\| \| \| \| \| \| \| \|	Many of these predate llvm-readobj. With elf-dump we had to match a relocation to symbol number and symbol number to symbol name or section number. llvm-svn: 235015
*	[PowerPC] Really iterate over all loops in PPCLoopDataPrefetch/PPCLoopPreIncPrep	Hal Finkel	2015-04-12	1	-0/+48
\| \| \| \| \| \| \| \|	When I fixed these a couple of days ago to iterate over all loops, not just depth == 1 loops, I inadvertently made it such that we'd only look at the first top-level loop. Make sure that we really look at all of them. llvm-svn: 234705
*	[PowerPC] Disable part-word atomics on the P7	Hal Finkel	2015-04-11	1	-16/+16
\| \| \| \| \| \| \|	As it turns out, even though these are part of ISA 2.06, the P7 does not support them (or, at least, not any P7s we're tested so far). llvm-svn: 234686
*	Add direct moves to/from VSR and exploit them for FP/INT conversions	Nemanja Ivanovic	2015-04-11	1	-0/+426
\| \| \| \| \| \| \| \| \| \|	This patch corresponds to review: http://reviews.llvm.org/D8928 It adds direct move instructions to/from VSX registers to GPR's. These are exploited for FP <-> INT conversions. llvm-svn: 234682
*	[PowerPC] Fix PPCLoopPreIncPrep for depth > 1 loops	Hal Finkel	2015-04-11	1	-0/+52
\| \| \| \| \| \| \| \| \|	This pass had the same problem as the data-prefetching pass: it was only checking for depth == 1 loops in practice. Fix that, add some debugging statements, and make sure that, when we grab an AddRec, it is for the loop we expect. llvm-svn: 234670
*	[PowerPC] Prefetching should also consider depth > 1 loops	Hal Finkel	2015-04-10	1	-0/+66
\| \| \| \| \| \| \|	Iterating over loops from the LoopInfo instance only provides top-level loops. We need to search the whole tree of loops to find the inner ones. llvm-svn: 234603
*	[PowerPC] Don't crash on PPC32 i64 fp_to_uint on modern cores	Hal Finkel	2015-04-10	1	-0/+23
\| \| \| \| \| \| \| \| \| \|	When we have an instruction for this (and, thus, don't generate a runtime call), we need to custom type legalize this (in a trivial way, just as we do for fp_to_sint). Fixes PR23173. llvm-svn: 234561
*	Add LLVM support for remaining integer divide and permute instructions from ↵	Nemanja Ivanovic	2015-04-09	2	-0/+85
\| \| \| \| \| \| \| \| \| \| \|	ISA 2.06 This is the patch corresponding to review: http://reviews.llvm.org/D8406 It adds some missing instructions from ISA 2.06 to the PPC back end. llvm-svn: 234546
*	Revert "Refactoring and enhancement to FMA combine."	Rafael Espindola	2015-04-09	2	-161/+9
\| \| \| \| \| \|	This reverts commit r234513. It was failing on the bots. llvm-svn: 234518
*	Refactoring and enhancement to FMA combine.	Olivier Sallenave	2015-04-09	2	-9/+161
\| \| \| \|	llvm-svn: 234513
*	Strip trailing whitespace and reword explanatory comment.	Eric Christopher	2015-04-04	1	-10/+5
\| \| \| \|	llvm-svn: 234078
*	[PowerPC] Enable splat generation for BUILD_VECTOR with little endian	Bill Schmidt	2015-04-03	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When enabling PPC64LE, I disabled some optimizations of BUILD_VECTOR nodes for little endian because wrong results were produced. I've subsequently investigated and found this is due to a call to BuildVectorSDNode::isConstantSplat that was always specifying big-endian. With this changed to correctly identify the target endianness, the optimizations work as expected. I found another case of a call to the same method with big-endian hardcoded, in PPC::isAllNegativeZeroVector(). I discovered this was an orphaned method with no callers, so I've just removed it. The existing test/CodeGen/PowerPC/vec_constants.ll checks these optimizations, so for testing I've just added a variant for little endian. llvm-svn: 234011
*	[DAGCombiner] Combine shuffles of BUILD_VECTOR and SCALAR_TO_VECTOR	Simon Pilgrim	2015-04-03	1	-46/+11
\| \| \| \| \| \| \| \|	This patch attempts to fold the shuffling of 'scalar source' inputs - BUILD_VECTOR and SCALAR_TO_VECTOR nodes - if the shuffle node is the only user. This folds away a lot of unnecessary shuffle nodes, and allows quite a bit of constant folding that was being missed. Differential Revision: http://reviews.llvm.org/D8516 llvm-svn: 234004
*	[PowerPC] FastISel can't handle i1 return values when using CR bits	Hal Finkel	2015-04-01	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Under normal circumstances, use of CR bits is disabled when running at -O0, but it is enabled by default otherwise, and if you have optnone functions, they'll still generally be generated with crbits turned on (because nothing else turns them off). FastISel can't handle most things dealing with i1 values when using CR bits, and checks for that, but was not checking the return type on functions; we can't fast-isel function calls with i1 return values either when using CR bits for boolean values. Fixes PR22664. llvm-svn: 233775
*	[PowerPC] Don't use a vector preferred memory type at -O0	Hal Finkel	2015-03-31	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \|	Even at -O0, we fall back to SDAG when we hit intrinsics, and if the intrinsic is a memset/memcpy/etc. we might normally use vector types. At -O0, this is probably not a good idea (because, if there is a bug in the lowering code, there would be no good way to turn it off). At -O0, only use scalar preferred types. Related to PR22754. llvm-svn: 233755
*	[SDAG] Handle non-integer preferred memset types for non-constant values	Hal Finkel	2015-03-31	2	-0/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The existing code in getMemsetValue only handled integer-preferred types when the fill value was not a constant. Make this more robust in two ways: 1. If the preferred type is a floating-point value, do the mul-splat trick on the corresponding integer type and then bitcast. 2. If the preferred type is a vector, do the mul-splat trick on one vector element, and then build a vector out of them. Fixes PR22754 (although, we should also turn off use of vector types at -O0). llvm-svn: 233749
*	DebugInfo: Fix bad debug info for compile units and types	Duncan P. N. Exon Smith	2015-03-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix debug info in these tests, which started failing with a WIP patch to verify compile units and types. The problems look like they were all caused by bitrot. They fell into these categories: - Using `!{i32 0}` instead of `!{}`. - Using `!{null}` instead of `!{}`. - Using `!MDExpression()` instead of `!{}`. - Using `!8` instead of `!{!8}`. - `file:` references that pointed at `MDCompileUnit`s instead of the same `MDFile` as the compile unit. - `file:` references that were numerically off-by-one or (off-by-ten). llvm-svn: 233415
*	Complete the MachineScheduler fix made way back in r210390.	Andrew Trick	2015-03-27	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"Fix the MachineScheduler's logic for updating ready times for in-order. Now the scheduler updates a node's ready time as soon as it is scheduled, before releasing dependent nodes." This fix was only made in one variant of the ScheduleDAGMI driver. Francois de Ferriere reported the issue in the other bit of code where it was also needed. I never got around to coming up with a test case, but it's an obvious fix that shouldn't be delayed any longer. I'll try to refactor this code a little better. I did verify performance on a wide variety of targets and saw no negative impact with this fix. llvm-svn: 233366
*	Testcase for r233239.	Eric Christopher	2015-03-26	1	-0/+32
\| \| \| \|	llvm-svn: 233240
*	Add Hardware Transactional Memory (HTM) Support	Kit Barton	2015-03-25	1	-0/+125
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds Hardware Transaction Memory (HTM) support supported by ISA 2.07 (POWER8). The intrinsic support is based on GCC one [1], but currently only the 'PowerPC HTM Low Level Built-in Function' are implemented. The HTM instructions follows the RC ones and the transaction initiation result is set on RC0 (with exception of tcheck). Currently approach is to create a register copy from CR0 to GPR and comapring. Although this is suboptimal, since the branch could be taken directly by comparing the CR0 value, it generates code correctly on both test and branch and just return value. A possible future optimization could be elimitate the MFCR instruction to branch directly. The HTM usage requires a recently newer kernel with PPC HTM enabled. Tested on powerpc64 and powerpc64le. This is send along a clang patch to enabled the builtins and option switch. [1] https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html Phabricator Review: http://reviews.llvm.org/D8247 llvm-svn: 233204
*	[SDAG] Don't widen VSETCC during type legalization for split operands	Hal Finkel	2015-03-23	1	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \|	Because the operands of a vector SETCC node can be of a different type from the result (and often are), it can happen that even if we'd prefer to widen the result type of the SETCC, the operands have been split instead. In this case, the SETCC result also must be split. This mirrors what is done in WidenVecRes_SELECT, and should be NFC elsewhere because if the operands are not widened the following calls to GetWidenedVector will assert (which is what was happening in the test case). llvm-svn: 232935
*	Remove the bare getSubtargetImpl call from the PPC port. As part	Eric Christopher	2015-03-21	1	-0/+43
\| \| \| \| \| \| \|	of this add a test that shows we can generate code with for functions that differ by subtarget feature. llvm-svn: 232882
*	Fix a nasty bug in DAGCombine of STORE nodes.	Owen Anderson	2015-03-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is very related to the bug fixed in r174431. The problem is that SelectionDAG does not include alignment in the uniquing of loads and stores. When an otherwise no-op DAGCombine would increase the alignment of a load or store, the original node would be returned (with the alignment increased), which would cause the node not to be processed by any further DAGCombines. I don't have a direct testcase for this that manifests on an in-tree target, but I did see some noise in the tests for other targets and have updated them for it. llvm-svn: 232780
*	Note that we don't support COFF on PPC.	Rafael Espindola	2015-03-19	1	-0/+4
\| \| \| \| \| \|	Should bring back the windows bots. llvm-svn: 232701
*	Fix R0 use in PowerPC VSX store for FastIsel.	Samuel Antao	2015-03-17	1	-0/+33
\| \| \| \| \| \|	The VSX stores are sometimes generated with a undefined index register, causing %noreg to be used and R0 to be emitted later on. The semantics of the VSX store (e.g. stdsdx) requires R0 to be used as base if we want zero to be used in the computation of the effective address instead of the content of R0. This patch checks if no index register was generated and forces R0 to be used as base address. llvm-svn: 232486
*	Use createTempSymbol to avoid collisions instead of an ad hoc method.	Rafael Espindola	2015-03-17	2	-8/+8
\| \| \| \|	llvm-svn: 232483
*	Add a bunch of CHECK missing colons in tests. NFC.	Ahmed Bougacha	2015-03-14	1	-3/+3
\| \| \| \| \| \|	Some wouldn't pass; fixed most, the rest will be fixed separately. llvm-svn: 232239
*	[opaque pointer type] Add textual IR support for explicit type parameter to ↵	David Blaikie	2015-03-13	41	-219/+219
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gep operator Similar to gep (r230786) and load (r230794) changes. Similar migration script can be used to update test cases, which successfully migrated all of LLVM and Polly, but about 4 test cases needed manually changes in Clang. (this script will read the contents of stdin and massage it into stdout - wrap it in the 'apply.sh' script shown in previous commits + xargs to apply it over a large set of test cases) import fileinput import sys import re rep = re.compile(r"(getelementptr(?:\s+inbounds)?\s$)((<\d\s+x\s+)?([^@]?)(\|\saddrspace\(\d+$)\s\(?(3)>)\s*)(?=$\|%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|zeroinitializer\|<\|\[\[[a-zA-Z]\|\{\{)", re.MULTILINE \| re.DOTALL) def conv(match): line = match.group(1) line += match.group(4) line += ", " line += match.group(2) return line line = sys.stdin.read() off = 0 for match in re.finditer(rep, line): sys.stdout.write(line[off:match.start()]) sys.stdout.write(conv(match)) off = match.end() sys.stdout.write(line[off:]) llvm-svn: 232184
*	Add support for part-word atomics for PPC	Nemanja Ivanovic	2015-03-10	1	-0/+54
\| \| \| \| \| \|	http://reviews.llvm.org/D8090#inline-67337 llvm-svn: 231843
*	Change the generation of the vmuluwm instruction to be based on the MUL opcode.	Kit Barton	2015-03-10	2	-5/+6
\| \| \| \| \| \|	Phabricator review: http://reviews.llvm.org/D8185 llvm-svn: 231827
*	Use the correct func begin symbol in all places in ppc.	Rafael Espindola	2015-03-05	2	-3/+4
\| \| \| \| \| \|	I missed an occurrence of the old symbol in my previous patch. llvm-svn: 231398
*	Use the generic Lfunc_begin label on ppc.	Rafael Espindola	2015-03-05	5	-25/+89
\| \| \| \| \| \|	This removes yet another custom label to mark the start of a function. llvm-svn: 231390
*	While reviewing the changes to Clang to add builtin support for the vsld, ↵	Kit Barton	2015-03-05	1	-5/+8
\| \| \| \| \| \|	vsrd, and vsrad instructions, it was pointed out that the builtins are generating the LLVM opcodes (shl, lshr, and ashr) not calls to the intrinsics. This patch changes the implementation of the vsld, vsrd, and vsrad instructions from from intrinsics to VXForm_1 instructions and makes them legal with P8 Altivec. It also removes the definition of the int_ppc_altivec_vsld, int_ppc_altivec_vsrd, and int_ppc_altivec_vsrad intrinsics. llvm-svn: 231378
*	Add LLVM support for PPC cryptography builtins	Nemanja Ivanovic	2015-03-04	1	-0/+275
\| \| \| \| \| \|	Review: http://reviews.llvm.org/D7955 llvm-svn: 231285
*	Use the vanilla func_end symbol for .size.	Rafael Espindola	2015-03-04	2	-2/+2
\| \| \| \| \| \|	No need to create yet another temp symbol. llvm-svn: 231198
*	Add the following 64-bit vector integer arithmetic instructions added in POWER8:	Kit Barton	2015-03-03	5	-0/+428
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	vaddudm vsubudm vmulesw vmulosw vmuleuw vmulouw vmuluwm vmaxsd vmaxud vminsd vminud vcmpequd vcmpequd. vcmpgtsd vcmpgtsd. vcmpgtud vcmpgtud. vrld vsld vsrd vsrad Phabricator review: http://reviews.llvm.org/D7959 llvm-svn: 231115
*	DebugInfo: Move new hierarchy into place	Duncan P. N. Exon Smith	2015-03-03	3	-423/+423
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the specialized metadata nodes for the new debug info hierarchy into place, finishing off PR22464. I've done bootstraps (and all that) and I'm confident this commit is NFC as far as DWARF output is concerned. Let me know if I'm wrong :). The code changes are fairly mechanical: - Bumped the "Debug Info Version". - `DIBuilder` now creates the appropriate subclass of `MDNode`. - Subclasses of DIDescriptor now expect to hold their "MD" counterparts (e.g., `DIBasicType` expects `MDBasicType`). - Deleted a ton of dead code in `AsmWriter.cpp` and `DebugInfo.cpp` for printing comments. - Big update to LangRef to describe the nodes in the new hierarchy. Feel free to make it better. Testcase changes are enormous. There's an accompanying clang commit on its way. If you have out-of-tree debug info testcases, I just broke your build. - `upgrade-specialized-nodes.sh` is attached to PR22564. I used it to update all the IR testcases. - Unfortunately I failed to find way to script the updates to CHECK lines, so I updated all of these by hand. This was fairly painful, since the old CHECKs are difficult to reason about. That's one of the benefits of the new hierarchy. This work isn't quite finished, BTW. The `DIDescriptor` subclasses are almost empty wrappers, but not quite: they still have loose casting checks (see the `RETURN_FROM_RAW()` macro). Once they're completely gutted, I'll rename the "MD" classes to "DI" and kill the wrappers. I also expect to make a few schema changes now that it's easier to reason about everything. llvm-svn: 231082
*	Regenerated test case from pr 230801 for change in LLVM IR syntax	Bill Schmidt	2015-02-27	1	-0/+78
\| \| \| \|	llvm-svn: 230811
*	Revert test case until it can be fixed	Bill Schmidt	2015-02-27	1	-65/+0
\| \| \| \|	llvm-svn: 230803
*	[PowerPC] Fix PR22711 - Misaligned .toc section	Bill Schmidt	2015-02-27	1	-0/+65
\| \| \| \| \| \| \| \| \| \| \| \|	Straightforward patch to emit an alignment directive when emitting a TOC entry. The test case was generated from the test in PR22711 that demonstrated a misaligned .toc section. The object code is run through llvm-readobj to verify that the correct alignment has been applied to the .toc section. Thanks to Ulrich Weigand for running down where the fix was needed. llvm-svn: 230801
*	[opaque pointer type] Add textual IR support for explicit type parameter to ↵	David Blaikie	2015-02-27	224	-1331/+1331
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=\|:\|^)\sload (?:atomic )?(?:volatile )?(.?))(\| addrspace$\d+$ )\($\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 llvm-svn: 230794
*	[PowerPC] Use vector types for memcpy and friends (sometimes)	Hal Finkel	2015-02-27	2	-2/+114
\| \| \| \| \| \| \| \| \| \|	When using Altivec, we can use vector loads and stores for aligned memcpy and friends. Starting with the P7 and VXS, we have reasonable unaligned vector stores. Starting with the P8, we have fast unaligned loads too. For QPX, we use vector loads are stores, but only for aligned memory accesses. llvm-svn: 230788
*	[opaque pointer type] Add textual IR support for explicit type parameter to ↵	David Blaikie	2015-02-27	103	-1360/+1360
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float> %x, ... ->getelementptr float, <4 x float> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. update.py: import fileinput import sys import re ibrep = re.compile(r"(^.?[^%\w]getelementptr inbounds )(((?:<\d x )?)(.?)(\| addrspace$\d$) \(\|>)(?:$\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$))") normrep = re.compile( r"(^.?[^%\w]getelementptr )(((?:<\d* x )?)(.?)(\| addrspace$\d$) \(\|>)(?:$\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name .ll \| xargs ./apply.sh From llvm/src/tools/clang: find test/ -name .mm -o -name .m -o -name .cpp -o -name .c \| xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll \| xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 llvm-svn: 230786
*	Change the fast-isel-abort option from bool to int to enable "levels"	Mehdi Amini	2015-02-27	18	-26/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Currently fast-isel-abort will only abort for regular instructions, and just warn for function calls, terminators, function arguments. There is already fast-isel-abort-args but nothing for calls and terminators. This change turns the fast-isel-abort options into an integer option, so that multiple levels of strictness can be defined. This will help no being surprised when the "abort" option indeed does not abort, and enables the possibility to write test that verifies that no intrinsics are forgotten by fast-isel. Reviewers: resistor, echristo Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D7941 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 230775
*	[PowerPC] Make LDtocL and friends invariant loads	Hal Finkel	2015-02-25	4	-37/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LDtocL, and other loads that roughly correspond to the TOC_ENTRY SDAG node, represent loads from the TOC, which is invariant. As a result, these loads can be hoisted out of loops, etc. In order to do this, we need to generate GOT-style MMOs for TOC_ENTRY, which requires treating it as a legitimate memory intrinsic node type. Once this is done, the MMO transfer is automatically handled for TableGen-driven instruction selection, and for nodes generated directly in PPCISelDAGToDAG, we need to transfer the MMOs manually. Also, we were not transferring MMOs associated with pre-increment loads, so do that too. Lastly, this fixes an exposed bug where R30 was not added as a defined operand of UpdateGBR. This problem was highlighted by an example (used to generate the test case) posted to llvmdev by Francois Pichet. llvm-svn: 230553
*	[PowerPC] Add triples to QPX tests	Hal Finkel	2015-02-25	7	-0/+7
\| \| \| \| \| \| \|	Some of these tests fail on Darwin systems because of a lack of a triple; fix that. llvm-svn: 230421