path: root/llvm
Commit message | Author | Age | Files | Lines
...
* SLPVectorizer: Reduce the compile time of the consecutive store lookup. | Nadav Rotem | 2013-07-16 | 1 | -5/+13
    Process groups of stores in chunks of 16.
    llvm-svn: 186420
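The chunking idea in r186420 can be sketched as follows. This is a minimal illustration and not the SLPVectorizer code: `splitIntoChunks` is a hypothetical helper, and only the chunk size of 16 comes from the commit message. The point is that a pairwise (quadratic) search over each chunk is bounded by the chunk size instead of the total number of stores.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: split `items` into consecutive chunks of at most
// `chunkSize` elements, so a pairwise search can run on one chunk at a time.
template <typename T>
std::vector<std::vector<T>> splitIntoChunks(const std::vector<T> &items,
                                            std::size_t chunkSize = 16) {
  assert(chunkSize > 0 && "chunk size must be positive");
  std::vector<std::vector<T>> chunks;
  for (std::size_t begin = 0; begin < items.size(); begin += chunkSize) {
    // The last chunk may be shorter than chunkSize.
    std::size_t end = std::min(begin + chunkSize, items.size());
    chunks.emplace_back(items.begin() + static_cast<std::ptrdiff_t>(begin),
                        items.begin() + static_cast<std::ptrdiff_t>(end));
  }
  return chunks;
}
```

With 40 stores this yields chunks of 16, 16, and 8 elements. The trade-off is that pairs spanning two chunks are no longer considered.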
* Create files with mode 666. This matches the behavior of other unix tools. | Rafael Espindola | 2013-07-16 | 2 | -1/+14
    llvm-svn: 186414
* [Support] Fix some warnings when self-hosting clang on Windows | Reid Kleckner | 2013-07-16 | 2 | -2/+5
    llvm-svn: 186413
* [APFloat] PR16573: Avoid losing mantissa bits in ppc_fp128 to double truncation | Ulrich Weigand | 2013-07-16 | 2 | -0/+28
    When truncating to a format with fewer mantissa bits, APFloat::convert will perform a right shift of the mantissa by the difference of the precision of the two formats. Usually, this will result in just the mantissa bits needed for the target format.

    One special situation is if the input number is denormal. In this case, the right shift may discard significant bits. This is usually not a problem, since truncating a denormal usually results in zero (underflow) after normalization anyway, since the result format's exponent range is usually smaller than the source format's.

    However, there is one case where the latter property does not hold: when truncating from ppc_fp128 to double. In particular, truncating a ppc_fp128 whose first double of the pair is denormal should result in just that first double, not zero. The current code however performs an excessive right shift, resulting in lost result bits. This is then caught in the APFloat::normalize call performed by APFloat::convert and causes an assertion failure.

    This patch checks for the scenario of truncating a denormal, and attempts to (possibly partially) replace the initial mantissa right shift by decrementing the exponent, if doing so will still result in a valid *target format* exponent.

    Index: test/CodeGen/PowerPC/pr16573.ll
    ===================================================================
    --- test/CodeGen/PowerPC/pr16573.ll (revision 0)
    +++ test/CodeGen/PowerPC/pr16573.ll (revision 0)
    @@ -0,0 +1,11 @@
    +; RUN: llc < %s | FileCheck %s
    +
    +target triple = "powerpc64-unknown-linux-gnu"
    +
    +define double @test() {
    +  %1 = fptrunc ppc_fp128 0xM818F2887B9295809800000000032D000 to double
    +  ret double %1
    +}
    +
    +; CHECK: .quad -9111018957755033591
    +
    Index: lib/Support/APFloat.cpp
    ===================================================================
    --- lib/Support/APFloat.cpp (revision 185817)
    +++ lib/Support/APFloat.cpp (working copy)
    @@ -1956,6 +1956,23 @@
         X86SpecialNan = true;
       }
     
    +  // If this is a truncation of a denormal number, and the target semantics
    +  // has larger exponent range than the source semantics (this can happen
    +  // when truncating from PowerPC double-double to double format), the
    +  // right shift could lose result mantissa bits. Adjust exponent instead
    +  // of performing excessive shift.
    +  if (shift < 0 && isFiniteNonZero()) {
    +    int exponentChange = significandMSB() + 1 - fromSemantics.precision;
    +    if (exponent + exponentChange < toSemantics.minExponent)
    +      exponentChange = toSemantics.minExponent - exponent;
    +    if (exponentChange < shift)
    +      exponentChange = shift;
    +    if (exponentChange < 0) {
    +      shift -= exponentChange;
    +      exponent += exponentChange;
    +    }
    +  }
    +
       // If this is a truncation, perform the shift before we narrow the storage.
       if (shift < 0 && (isFiniteNonZero() || category==fcNaN))
         lostFraction = shiftRight(significandParts(), oldPartCount, -shift);

    llvm-svn: 186409
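The exponent adjustment added by this patch can be tried in isolation. The sketch below copies the arithmetic from the hunk above into a free function. The function name, the `ShiftAdjust` struct, and the explicit parameters are assumptions made for illustration; in APFloat these values come from member state (`significandMSB()`, `fromSemantics.precision`, `toSemantics.minExponent`).

```cpp
#include <cassert>

// Hypothetical result type: the shift and exponent after adjustment.
struct ShiftAdjust {
  int shift;
  int exponent;
};

// shift:         target precision minus source precision (negative on truncation)
// exponent:      current exponent of the value
// msb:           zero-based index of the highest set significand bit
// fromPrecision: precision of the source format, in bits
// toMinExponent: smallest exponent of the target format
ShiftAdjust adjustDenormalTruncation(int shift, int exponent, int msb,
                                     int fromPrecision, int toMinExponent) {
  assert(shift < 0 && "only meaningful for truncation");
  // A denormal has leading zero bits, so this is negative for denormals.
  int exponentChange = msb + 1 - fromPrecision;
  // Do not move the exponent below what the target format can represent.
  if (exponent + exponentChange < toMinExponent)
    exponentChange = toMinExponent - exponent;
  // Never replace more of the shift than was requested.
  if (exponentChange < shift)
    exponentChange = shift;
  if (exponentChange < 0) {
    shift -= exponentChange;    // shift right by fewer bits...
    exponent += exponentChange; // ...and lower the exponent instead
  }
  return {shift, exponent};
}
```

As an illustration with made-up inputs, a 106-bit source value with five leading zero bits (`msb == 100`), exponent -969, and a requested shift of -53 ends up with shift -48 and exponent -974, so five fewer low-order bits are discarded.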
* [XCore] Fix printing of inline asm operands. | Richard Osborne | 2013-07-16 | 2 | -11/+39
    Previously an asm operand with no operand modifier would give the error "invalid operand in inline asm".
    llvm-svn: 186407
* ARM: allow printing of ARM atomic DAG nodes. | Tim Northover | 2013-07-16 | 1 | -0/+13
    We'd forgotten to provide string representations for the special ARMISD atomic nodes; this adds them in. No effect on CodeGen, just makes the output of "-view-whatever-dags" slightly more readable.
    llvm-svn: 186406
* [SystemZ] Use ROSBG and non-zero form of RISBG for OR nodes | Richard Sandiford | 2013-07-16 | 3 | -1/+289
    llvm-svn: 186405
* Fixing a buildbot failure:unused function. | Vladimir Medic | 2013-07-16 | 1 | -14/+0
    llvm-svn: 186403
* [SystemZ] Add MC support for R[NOX]SBG | Richard Sandiford | 2013-07-16 | 4 | -0/+179
    CodeGen support will come later.
    llvm-svn: 186401
* [SystemZ] Use RISBG for (shift (and ...)) | Richard Sandiford | 2013-07-16 | 2 | -98/+321
    Another patch in the series to make more use of R.SBG. This one extends r186072 and r186073 to handle cases where the AND is inside the shift.
    llvm-svn: 186399
* This patch represents Mips utilization of r186388 code that allows asm matcher to emit mnemonics containing '.' characters. | Vladimir Medic | 2013-07-16 | 4 | -270/+242
    This makes asm parser code simpler and more efficient.
    llvm-svn: 186397
* PPCJITInfo.cpp: Tweak r186252 with s/__ppc/__powerpc/ to work on powerpc-linux Fedora 12. | NAKAMURA Takumi | 2013-07-16 | 1 | -2/+2
    g++ (GCC) 4.4.4 20100630 (Red Hat 4.4.4-10)
    llvm-svn: 186396
* ARM: implement ldrex, strex and clrex intrinsics | Tim Northover | 2013-07-16 | 8 | -53/+277
    Intrinsics already existed for the 64-bit variants, so these support operations of size at most 32-bits.
    llvm-svn: 186392
* ARM EABI divmod support | Renato Golin | 2013-07-16 | 4 | -2/+289
    This patch enables calls to __aeabi_idivmod when in EABI mode, by using the remainder value returned on registers (R1), enabled by the ARM triple "none-eabi".

    Note that Darwin and GNUEABI triples will continue lowering on GNU style, that is, using the stack for the remainder.

    Still need to add SREM/UREM support fix for 64-bit lowering.
    llvm-svn: 186390
* This patch allows targets to define whether the instruction mnemonics in asm matcher tables will contain '.' characters. | Vladimir Medic | 2013-07-16 | 2 | -4/+10
    llvm-svn: 186388
* llvm/test/Object/directory.ll: Mark it as XFAIL:cygwin. Directories can be opened on cygwin. | NAKAMURA Takumi | 2013-07-16 | 1 | -3/+3
    llvm-svn: 186387
* Use open+fstat instead of stat+open. | Rafael Espindola | 2013-07-16 | 1 | -3/+13
    llvm-svn: 186381
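The open+fstat pattern named in r186381 looks roughly like this on POSIX. This is a minimal sketch and not the LLVM code: `openAndGetSize` is a hypothetical helper. The commit message gives no rationale; one commonly cited benefit of this order is that the size is read from the descriptor that was actually opened, so the path is resolved only once.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: open `path` read-only, then query its size through the
// descriptor with fstat. Returns the descriptor, or -1 on failure.
int openAndGetSize(const char *path, off_t &size) {
  assert(path != nullptr);
  int fd = ::open(path, O_RDONLY);
  if (fd == -1)
    return -1;
  struct stat st;
  if (::fstat(fd, &st) == -1) {
    ::close(fd); // do not leak the descriptor on failure
    return -1;
  }
  size = st.st_size;
  return fd;
}
```

The caller owns the returned descriptor and is expected to `close` it.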
* Remember that we have a null terminated string. | Rafael Espindola | 2013-07-16 | 1 | -4/+4
    This is a micro optimization. Instead of going char*->StringRef->Twine->char*, go char*->Twine->char* and avoid having to copy the filename on the stack.
    llvm-svn: 186380
* [Object/COFF] Add import_directory_table_entry. | Rui Ueyama | 2013-07-16 | 1 | -0/+8
    Summary: Add import_directory_table_entry to use for .idata section.
    Reviewers: Bigcheese
    CC: llvm-commits
    Differential Revision: http://llvm-reviews.chandlerc.com/D1059
    llvm-svn: 186379
* Add a version of sys::fs::status that uses fstat. | Rafael Espindola | 2013-07-16 | 3 | -40/+71
    llvm-svn: 186378
* COFF: Add constants for optional data directory. | Rui Ueyama | 2013-07-16 | 1 | -0/+18
    llvm-svn: 186377
* Instead of friending status, provide windows and posix constructors to file_status. | Rafael Espindola | 2013-07-16 | 3 | -44/+52
    This opens the way to having static helpers in the .inc files that can construct a file_status.
    llvm-svn: 186376
* unittests/Support: Add TimeValue.Win32FILETIME, corresponding to r186374. | NAKAMURA Takumi | 2013-07-16 | 1 | -0/+16
    llvm-svn: 186375
* Fix TimeValue::toWin32Time() to be symmetric to fromWin32Time() and compatible to Win32's FILETIME. | NAKAMURA Takumi | 2013-07-16 | 2 | -5/+3
    llvm-ar is the only user of toWin32Time() (via setLastModificationAndAccessTime), and r186298 can be reverted. It had been buggy since the initial commit.

    FIXME: Could we rename {from|to}Win32Time as {from|to}Win32FILETIME in TimeValue?
    llvm-svn: 186374
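For context on what "symmetric" means here, a Win32 FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC, which is 11644473600 seconds before the Unix epoch. The sketch below shows a pair of conversions that round-trip. It is not the TimeValue implementation: the `UnixTime` struct and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

const std::int64_t kTicksPerSecond = 10000000;        // one tick = 100 ns
const std::int64_t kEpochDiffSeconds = 11644473600LL; // 1601-01-01 to 1970-01-01

// Hypothetical time representation: seconds since the Unix epoch plus nanoseconds.
struct UnixTime {
  std::int64_t seconds;
  std::int32_t nanos;
};

// Convert to FILETIME ticks. Sub-tick precision (below 100 ns) is dropped.
std::uint64_t toFileTime(UnixTime t) {
  assert(t.nanos >= 0 && t.nanos < 1000000000);
  assert(t.seconds >= -kEpochDiffSeconds && "before the FILETIME epoch");
  return static_cast<std::uint64_t>(
      (t.seconds + kEpochDiffSeconds) * kTicksPerSecond + t.nanos / 100);
}

// Inverse of toFileTime for any value representable in whole ticks.
UnixTime fromFileTime(std::uint64_t ticks) {
  const std::uint64_t perSecond = static_cast<std::uint64_t>(kTicksPerSecond);
  UnixTime t;
  t.seconds = static_cast<std::int64_t>(ticks / perSecond) - kEpochDiffSeconds;
  t.nanos = static_cast<std::int32_t>(ticks % perSecond) * 100;
  return t;
}
```

A unit test in the spirit of r186375 would check that converting in one direction and back returns the original value.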
* Rename Support.TimeValue to TimeValue.time_t in unittests/Support. | NAKAMURA Takumi | 2013-07-16 | 2 | -3/+3
    llvm-svn: 186372
* Add 'const' qualifiers to static const char* variables. | Craig Topper | 2013-07-16 | 9 | -37/+38
    llvm-svn: 186371
* Add mingw32 to the XFAIL. I forgot about it when adding win32. | Rafael Espindola | 2013-07-15 | 1 | -1/+1
    llvm-svn: 186365
* PEI: Support for non-zero SPAdj at beginning of a basic block. | Manman Ren | 2013-07-15 | 3 | -15/+280
    We can have a FrameSetup in one basic block and the matching FrameDestroy in a different basic block when we have struct byval. In that case, SPAdj is not zero at beginning of the basic block.

    Modify PEI to correctly set SPAdj at beginning of each basic block using DFS traversal. We used to assume SPAdj is 0 at beginning of each basic block.

    PEI had an assert SPAdjCount || SPAdj == 0. If we have a Destroy <n> followed by a Setup <m>, PEI will assert failure. We can add an extra condition to make sure the pairs are matched: The pairs start with a FrameSetup. But since we are doing a much better job in the verifier, this patch removes the check in PEI.

    PR16393
    llvm-svn: 186364
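The DFS propagation described in r186364 can be illustrated on a toy control-flow graph. This is a sketch under simplifying assumptions and not the PEI code: the `Block` struct, the sign convention (setup adds, destroy subtracts), and the function name are all hypothetical, and each block is summarized by its net stack adjustment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block summary: net stack adjustment made inside the block
// and the indices of its successor blocks.
struct Block {
  int netAdjust;
  std::vector<std::size_t> successors;
};

// Compute the stack adjustment in effect on entry to each block reachable
// from block 0, using an iterative depth-first traversal.
std::vector<int> computeEntrySPAdj(const std::vector<Block> &cfg) {
  std::vector<int> entry(cfg.size(), 0);
  std::vector<bool> visited(cfg.size(), false);
  if (cfg.empty())
    return entry;
  std::vector<std::size_t> stack;
  stack.push_back(0);
  visited[0] = true;
  while (!stack.empty()) {
    std::size_t b = stack.back();
    stack.pop_back();
    int exitAdj = entry[b] + cfg[b].netAdjust;
    for (std::size_t s : cfg[b].successors) {
      assert(s < cfg.size() && "successor index out of range");
      if (visited[s])
        continue; // the verifier is assumed to guarantee agreement at merges
      visited[s] = true;
      entry[s] = exitAdj;
      stack.push_back(s);
    }
  }
  return entry;
}
```

Taking the first value seen at a merge point is only sound if all incoming edges carry the same adjustment, which is the property the machine verifier commit (r186350) checks.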
* PR16628: Fix a bug in the code that merges compares. | Nadav Rotem | 2013-07-15 | 2 | -1/+30
    Compares return i1 but they compare different types.
    llvm-svn: 186359
* PPC: Refactoring to support subtarget feature changing | Hal Finkel | 2013-07-15 | 2 | -37/+69
    This change mirrors the changes that were made to the X86 and ARM targets to support subtarget feature changing. As indicated in r182899, the mechanism is still undergoing revision, and so as with the X86 and ARM targets, there is no test case yet (there is no effective functionality change).
    llvm-svn: 186357
* Further simplify test case from r186119/r186035. | David Blaikie | 2013-07-15 | 1 | -123/+94
    llvm-svn: 186356
* XFAIL on windows too and document the XFAILs. | Rafael Espindola | 2013-07-15 | 1 | -1/+4
    llvm-svn: 186354
* Machine Verifier: verify FrameSetup and FrameDestroy | Manman Ren | 2013-07-15 | 1 | -0/+132
    1> on every path through the CFG, a FrameSetup <n> is always followed by a FrameDestroy <n> and a FrameDestroy is always followed by a FrameSetup.
    2> stack adjustments are identical on all CFG edges to a merge point.
    3> frame is destroyed at end of a return block.

    PR16393
    llvm-svn: 186350
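Rules 1 and 3 above can be sketched for a single path through the CFG. This is an illustration and not the MachineVerifier code: `FrameOp`, `FrameOpKind`, and `isWellPaired` are hypothetical names, and the path is given as a flat list of frame operations ending at a return.

```cpp
#include <cassert>
#include <vector>

enum class FrameOpKind { Setup, Destroy };

// Hypothetical record of one frame pseudo-instruction and its size operand.
struct FrameOp {
  FrameOpKind kind;
  int amount;
};

// Check one path: every Setup <n> is followed by a Destroy <n> before the
// next Setup, no Destroy appears without an open Setup, and nothing is left
// open when the path ends.
bool isWellPaired(const std::vector<FrameOp> &path) {
  bool frameOpen = false;
  int openAmount = 0;
  for (const FrameOp &op : path) {
    assert(op.amount >= 0);
    if (op.kind == FrameOpKind::Setup) {
      if (frameOpen)
        return false; // Setup followed by another Setup
      frameOpen = true;
      openAmount = op.amount;
    } else {
      if (!frameOpen || op.amount != openAmount)
        return false; // unmatched Destroy or mismatched size
      frameOpen = false;
    }
  }
  return !frameOpen; // frame must be destroyed by the end of the path
}
```

Rule 2 (identical adjustments on all edges into a merge point) needs the whole graph and is not covered by this per-path check.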
* Remove an extra is_directory call. | Rafael Espindola | 2013-07-15 | 1 | -11/+0
    I checked that opening a directory on windows does fail, so this saves a "stat".
    llvm-svn: 186345
* Fix register subclass handling in PPCInstrInfo::insertSelect | Hal Finkel | 2013-07-15 | 2 | -5/+60
    PPCInstrInfo::insertSelect and PPCInstrInfo::canInsertSelect were computing the common subclass of the true and false inputs, and then selecting either the 32-bit or the 64-bit isel variant based on the result of calling PPC::GPRCRegClass.hasSubClassEq(RC) and PPC::G8RCRegClass.hasSubClassEq(RC) (where RC is the common subclass). Unfortunately, this is not quite right: if we have something like this:

      %vreg8<def> = SELECT_CC_I8 %vreg4<kill>, %vreg7<kill>, %vreg6<kill>, 76; G8RC_and_G8RC_NOX0:%vreg8 CRRC:%vreg4 G8RC_NOX0:%vreg7,%vreg6

    then the common subclass of G8RC_and_G8RC_NOX0 and G8RC_NOX0 is G8RC_NOX0, and G8RC_NOX0 is not a subclass of G8RC (because it also contains the ZERO8 pseudo-register). As a result, we also need to check the common subclass against GPRC_NOR0 and G8RC_NOX0 explicitly.

    This had not been a problem for clients of insertSelect that called canInsertSelect first (because it had a compensating mistake), but insertSelect is also used by the PPC pseudo-instruction expander, and this error was causing a problem in that context.

    This problem was found by csmith.
    llvm-svn: 186343
* [mc-coff] Resolve aliases when emitting COFF relocations | Reid Kleckner | 2013-07-15 | 2 | -2/+109
    This is consistent with the ELF object writer. Add some COFF tests that relocate against an alias.
    Reviewers: espindola
    Differential Revision: http://llvm-reviews.chandlerc.com/D1079
    llvm-svn: 186341
* R600/SI: Add support for 64-bit loads | Tom Stellard | 2013-07-15 | 5 | -1/+85
    https://bugs.freedesktop.org/show_bug.cgi?id=65873
    llvm-svn: 186339
* Remove invalid assert in DAGTypeLegalizer::RemapValue | Hal Finkel | 2013-07-15 | 2 | -1/+61
    There is a comment at the top of DAGTypeLegalizer::PerformExpensiveChecks which, in part, says:

      // Note that these invariants may not hold momentarily when processing a node:
      // the node being processed may be put in a map before being marked Processed.

    Unfortunately, this assert would be valid only if the above-mentioned invariant held unconditionally. This was causing llc to assert when, in fact, everything was fine.

    Thanks to Richard Sandiford for investigating this issue!

    Fixes PR16562.
    llvm-svn: 186338
* Remove trailing whitespace | Stephen Lin | 2013-07-15 | 1 | -36/+36
    llvm-svn: 186333
* Revert r186316 while I track down an ASan failure and an assert from a bot. | Chandler Carruth | 2013-07-15 | 2 | -973/+1256
    This reverts the commit which introduced a new implementation of the fancy SROA pass designed to reduce its overhead. I'll skip the huge commit log here, refer to r186316 if you're looking for how this all works and why it works that way.
    llvm-svn: 186332
* Teaching llvm-tblgen to not emit a switch statement when there are no case statements. | Aaron Ballman | 2013-07-15 | 3 | -53/+88
    llvm-svn: 186330
* Revert "[Option] Store arg strings in a set backed by a BumpPtrAllocator" | Reid Kleckner | 2013-07-15 | 2 | -14/+4
    This broke clang's crash-report.c test, and I haven't been able to figure it out yet.

    This reverts commit r186319.
    llvm-svn: 186329
* Test commit to see if write access works. | Job Noorman | 2013-07-15 | 1 | -1/+1
    llvm-svn: 186321
* [Option] Store arg strings in a set backed by a BumpPtrAllocator | Reid Kleckner | 2013-07-15 | 2 | -4/+14
    No functionality change. This is preparing to move response file parsing into lib/Option so it can be shared between clang and lld. This change isn't just a micro-optimization. Clang's driver uses a std::set<std::string> to unique arguments while parsing response files, so this matches that.
    llvm-svn: 186319
* XFAIL this on freebsd to bring the bot back. | Rafael Espindola | 2013-07-15 | 1 | -0/+1
    Joerg Sonnenberger tells me one can open a directory in freebsd. I will try to centralize our calls to open so that we can handle O_BINARY in one place, and will then handle this there too.
    llvm-svn: 186317
* Reimplement SROA yet again. Same fundamental principle, but a totally different core implementation strategy. | Chandler Carruth | 2013-07-15 | 2 | -1256/+973
    Previously, SROA would build a relatively elaborate partitioning of an alloca, associate uses with each partition, and then rewrite the uses of each partition in an attempt to break apart the alloca into chunks that could be promoted. This was very wasteful in terms of memory and compile time because regardless of how complex the alloca or how much we're able to do in breaking it up, all of the datastructure work to analyze the partitioning was done up front.

    The new implementation attempts to form partitions of the alloca lazily and on the fly, rewriting the uses that make up that partition as it goes. This has a few significant effects:
    1) Much simpler data structures are used throughout.
    2) No more double walk of the recursive use graph of the alloca, only walk it once.
    3) No more complex algorithms for associating a particular use with a particular partition.
    4) PHI and Select speculation is simplified and happens lazily.
    5) More precise information is available about a specific use of the alloca, removing the need for some side datastructures.

    Ultimately, I think this is a much better implementation. It removes about 300 lines of code, but arguably removes more like 500 considering that some code grew in the process of being factored apart and cleaned up for this all to work.

    I've re-used as much of the old implementation as possible, which includes the lion's share of code in the form of the rewriting logic. The interesting new logic centers around how the uses of a partition are sorted, and split into actual partitions.

    Each instruction using a pointer derived from the alloca gets a 'Partition' entry. This name is totally wrong, but I'll do a rename in a follow-up commit as there is already enough churn here. The entry describes the offset range accessed and the nature of the access.

    Once we have all of these entries we sort them in a very specific way: increasing order of begin offset, followed by whether they are splittable uses (memcpy, etc), followed by the end offset or whatever. Sorting by splittability is important as it simplifies the collection of uses into a partition.

    Once we have these uses sorted, we walk from the beginning to the end building up a range of uses that form a partition of the alloca. Overlapping unsplittable uses are merged into a single partition while splittable uses are broken apart and carried from one partition to the next. A partition is also introduced to bridge splittable uses between the unsplittable regions when necessary.

    I've looked at the performance PRs fairly closely. PR15471 no longer will even load (the module is invalid). Not sure what is up there. PR15412 improves by between 5% and 10%, however it is nearly impossible to know what is holding it up as SROA (the entire pass) takes less time than reading the IR for that test case. The analysis takes the same time as running mem2reg on the final allocas. I suspect (without much evidence) that the new implementation will scale much better however, and it is just the small nature of the test cases that makes the changes small and noisy. Either way, it is still simpler and cleaner I think.
    llvm-svn: 186316
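The sort order described in this message can be sketched as a comparator. This is an illustration and not the SROA code: `UseEntry`, `entryLess`, and `sortEntries` are hypothetical names. The message fixes only the key order (begin offset, then splittability, then end offset); placing unsplittable uses first and larger end offsets first on ties are assumptions made here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical entry: the byte range [begin, end) accessed by one use and
// whether that use can be split (for example a memcpy).
struct UseEntry {
  std::uint64_t begin;
  std::uint64_t end;
  bool splittable;
};

// Order by increasing begin offset, then splittability, then end offset.
bool entryLess(const UseEntry &a, const UseEntry &b) {
  if (a.begin != b.begin)
    return a.begin < b.begin;
  if (a.splittable != b.splittable)
    return !a.splittable; // assumption: unsplittable uses sort first
  return a.end > b.end;   // assumption: wider ranges sort first
}

void sortEntries(std::vector<UseEntry> &entries) {
  for (const UseEntry &e : entries)
    assert(e.begin <= e.end && "malformed offset range");
  std::sort(entries.begin(), entries.end(), entryLess);
}
```

With this order, a single forward walk sees every use that starts at a given offset grouped together, which is what lets the partition be built up incrementally.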
* DebugInfo: Factor out parsing compile unit DIEs to a separate function. | Alexey Samsonov | 2013-07-15 | 2 | -78/+70
    Improve code style and comments. No functionality change.
    llvm-svn: 186315
* Add 'const' qualifier to some arrays. | Craig Topper | 2013-07-15 | 3 | -3/+4
    llvm-svn: 186312
* Make some arrays 'static const' | Craig Topper | 2013-07-15 | 3 | -43/+51
    llvm-svn: 186311
* Add include to hopefully fix windows build. | Craig Topper | 2013-07-15 | 1 | -0/+1
    llvm-svn: 186310