path: root/llvm/tools/llvm-objdump
Commit message | Author | Age | Files | Lines
...
* Add basic YAML MC CFG testcase. | Ahmed Bougacha | 2013-08-21 | 1 | -1/+1
    Drive-by llvm-objdump cleanup (don't hardcode ToolName).
    llvm-svn: 188904
* MC CFG: Add YAML MCModule representation to enable MC CFG testing. | Ahmed Bougacha | 2013-08-21 | 1 | -9/+29
    Like yaml ObjectFiles, this will be very useful for testing the MC CFG
    implementation (mostly MCObjectDisassembler), by matching the output
    with YAML, and for potential users of the MC CFG, by using it as an input.

    There isn't much to the actual format, it is just a serialization of the
    MCModule class. Of note:
    - Basic block references (pred/succ, ..) are represented by the BB's
      start address.
    - Just as in the MC CFG, instructions are MCInsts with a size.
    - Operands have a prefix representing the type (only register and
      immediate supported here).
    - Instruction opcodes are represented by their names; enum values aren't
      stable, enum names mostly are: usually, a change to a name would need
      lots of changes in the backend anyway. Same with registers.

    All in all, an example is better than 1000 words, here goes:

    A simple binary:

      Disassembly of section __TEXT,__text:
      _main:
      100000f9c: 48 8b 46 08             movq 8(%rsi), %rax
      100000fa0: 0f be 00                movsbl (%rax), %eax
      100000fa3: 3b 04 25 48 00 00 00    cmpl 72, %eax
      100000faa: 0f 8c 07 00 00 00       jl 7 <.Lend>
      100000fb0: 2b 04 25 48 00 00 00    subl 72, %eax
      .Lend:
      100000fb7: c3                      ret

    And the (pretty verbose) generated YAML:

      ---
      Atoms:
        - StartAddress: 0x0000000100000F9C
          Size: 20
          Type: Text
          Content:
            - Inst: MOV64rm
              Size: 4
              Ops: [ RRAX, RRSI, I1, R, I8, R ]
            - Inst: MOVSX32rm8
              Size: 3
              Ops: [ REAX, RRAX, I1, R, I0, R ]
            - Inst: CMP32rm
              Size: 7
              Ops: [ REAX, R, I1, R, I72, R ]
            - Inst: JL_4
              Size: 6
              Ops: [ I7 ]
        - StartAddress: 0x0000000100000FB0
          Size: 7
          Type: Text
          Content:
            - Inst: SUB32rm
              Size: 7
              Ops: [ REAX, REAX, R, I1, R, I72, R ]
        - StartAddress: 0x0000000100000FB7
          Size: 1
          Type: Text
          Content:
            - Inst: RET
              Size: 1
              Ops: [ ]
      Functions:
        - Name: __text
          BasicBlocks:
            - Address: 0x0000000100000F9C
              Preds: [ ]
              Succs: [ 0x0000000100000FB7, 0x0000000100000FB0 ]
      <snip>
      ...

    llvm-svn: 188890
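The operand encoding described in this commit (a one-letter type prefix followed by a register name or an immediate value) can be illustrated with a small decoder. This is only a sketch: `OperandKind`, `Operand`, and `parseOperand` are hypothetical names invented for the illustration and are not part of LLVM, and reading a bare `R` as "no register" is an assumption drawn from the example output.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration; these are not LLVM's own classes.
enum class OperandKind { Register, Immediate };

struct Operand {
  OperandKind Kind;
  std::string RegName; // Register name; empty is assumed to mean "no register".
  long long Imm;       // Immediate value; meaningful only for Immediate.
};

// Decode one entry of an "Ops" list. The first character is the type
// prefix ('R' or 'I'); the rest is the register name or the decimal value.
Operand parseOperand(const std::string &Text) {
  assert(!Text.empty() && "an operand needs a type prefix");
  const std::string Payload = Text.substr(1);
  if (Text[0] == 'R')
    return {OperandKind::Register, Payload, 0};
  assert(Text[0] == 'I' && "only register and immediate operands are handled");
  return {OperandKind::Immediate, "", std::stoll(Payload)};
}
```

Under these assumptions, `RRAX` decodes to the register `RAX` and `I72` to the immediate 72.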
* [Object] Split the ELF interface into 3 parts. | Michael J. Spencer | 2013-08-08 | 1 | -9/+7
    * ELFTypes.h contains template magic for defining types based on
      endianness, size, and alignment.
    * ELFFile.h defines the ELFFile class which provides low level ELF
      specific access.
    * ELFObjectFile.h contains ELFObjectFile which uses ELFFile to implement
      the ObjectFile interface.
    llvm-svn: 188022
* keep only the StringRef version of getFileOrSTDIN. | Rafael Espindola | 2013-06-25 | 1 | -1/+1
    llvm-svn: 184826
* Use pointers to the MCAsmInfo and MCRegInfo. | Bill Wendling | 2013-06-18 | 1 | -1/+1
    Someone may want to do something crazy, like replace these objects if
    they change or something. No functionality change intended.
    llvm-svn: 184175
* readobj: Dump PE/COFF optional records. | Rui Ueyama | 2013-06-12 | 1 | -1/+1
    These records are mandatory for executables and are used by the loader.
    Reviewers: rafael
    CC: llvm-commits
    Differential Revision: http://llvm-reviews.chandlerc.com/D939
    llvm-svn: 183852
* Teach llvm-objdump with the -macho parser how to use the data in code table | Kevin Enderby | 2013-06-06 | 1 | -6/+112
    from the LC_DATA_IN_CODE load command. And when disassembling, print the
    data in code formatted for the kind of data it is, and do not disassemble
    those bytes.

    I added the format specific functionality to the derived class
    MachOObjectFile since these tables only appear in Mach-O object files.
    This is my first attempt to modify the libObject stuff, so if folks have
    better suggestions how to fit this in or suggestions on the
    implementation please let me know.

    rdar://11791371
    llvm-svn: 183424
* Handle relocations that don't point to symbols. | Rafael Espindola | 2013-06-05 | 2 | -4/+3
    In ELF (as in MachO), not all relocations point to symbols. Represent
    this properly by using a symbol_iterator instead of a SymbolRef. Update
    llvm-readobj ELF's dumper to handle relocations without symbols.
    llvm-svn: 183284
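The idea of using an iterator instead of a direct reference can be sketched with standard containers: an iterator equal to `end()` plays the role of "this relocation has no symbol". The names below (`SymbolTable`, `Relocation`, `describeTarget`) are hypothetical and only mirror the idea; they are not the libObject API.

```cpp
#include <string>
#include <vector>

// Hypothetical symbol table: just a list of names for the illustration.
using SymbolTable = std::vector<std::string>;

// Hypothetical relocation record. Symbol == Table.end() means the
// relocation does not point to any symbol.
struct Relocation {
  unsigned long long Offset;
  SymbolTable::const_iterator Symbol;
};

// Return the symbol name, or a placeholder when there is no symbol.
std::string describeTarget(const SymbolTable &Table, const Relocation &Rel) {
  if (Rel.Symbol == Table.end())
    return "<no symbol>";
  return *Rel.Symbol;
}
```

A plain reference cannot express the "no symbol" case, which is why a dumper built on references has no clean way to print such relocations.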
* llvm-objdump.cpp: Appease MSC16 x64. utostr(n++) causes internal compiler error. | NAKAMURA Takumi | 2013-05-27 | 1 | -1/+2
    llvm-svn: 182722
* Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros. | Michael J. Spencer | 2013-05-24 | 1 | -1/+1
    llvm-svn: 182680
* MC: Disassembled CFG reconstruction. | Ahmed Bougacha | 2013-05-24 | 6 | -553/+130
    This patch builds on some existing code to do CFG reconstruction from
    a disassembled binary:
    - MCModule represents the binary, and has a list of MCAtoms.
    - MCAtom represents either disassembled instructions (MCTextAtom), or
      contiguous data (MCDataAtom), and covers a specific range of addresses.
    - MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is
      backed by an MCTextAtom, and has the usual successors/predecessors.
    - MCObjectDisassembler creates a module from an ObjectFile using a
      disassembler. It first builds an atom for each section. It can also
      construct the CFG, and this splits the text atoms into basic blocks.

    MCModule and MCAtom were only sketched out; MCFunction and MCBB were
    implemented under the experimental "-cfg" llvm-objdump -macho option.
    This cleans them up for further use; llvm-objdump -d -cfg now generates
    graphviz files for each function found in the binary.

    In the future, MCObjectDisassembler may be the right place to do
    "intelligent" disassembly: for example, handling constant islands is just
    a matter of splitting the atom, using information that may be available
    in the ObjectFile. Also, better initial atom formation than just using
    sections is possible using symbols (and things like Mach-O's
    function_starts load command).

    This brings two minor regressions in llvm-objdump -macho -cfg:
    - The printing of a relocation's referenced symbol.
    - An annotation on loop BBs, i.e., which are their own successor.

    Relocation printing is replaced by the MCSymbolizer; the basic CFG
    annotation will be superseded by more related functionality.

    llvm-svn: 182628
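The atom-splitting step mentioned in this commit (an atom covers an address range, and CFG construction splits text atoms into basic blocks) can be pictured as cutting a half-open address range in two. This is a rough sketch only: `TextAtom` and `splitAtom` are hypothetical stand-ins, not the MCTextAtom interface, and the half-open range is an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for a text atom: a half-open address range.
struct TextAtom {
  std::uint64_t Begin;
  std::uint64_t End; // One past the last byte covered.
  std::uint64_t size() const { return End - Begin; }
};

// Split an atom at SplitAddr. The first half keeps [Begin, SplitAddr),
// the second half covers [SplitAddr, End).
std::pair<TextAtom, TextAtom> splitAtom(const TextAtom &Atom,
                                        std::uint64_t SplitAddr) {
  assert(SplitAddr > Atom.Begin && SplitAddr < Atom.End &&
         "the split point must fall strictly inside the atom");
  return {TextAtom{Atom.Begin, SplitAddr}, TextAtom{SplitAddr, Atom.End}};
}
```

In this picture a branch target inside an atom is a natural split point, since the target has to start a basic block of its own.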
* Add MCSymbolizer for symbolic/annotated disassembly. | Ahmed Bougacha | 2013-05-24 | 2 | -16/+51
    This is a basic first step towards symbolization of disassembled
    instructions. This used to be done using externally provided (C API)
    callbacks. This patch introduces:
    - the MCSymbolizer class, that mimics the same functions that were used
      in the X86 and ARM disassemblers to symbolize immediate operands and
      to annotate loads based off PC (for things like c string literals).
    - the MCExternalSymbolizer class, which implements the old C API.
    - the MCRelocationInfo class, which provides a way for targets to
      translate relocations (either object::RelocationRef, or disassembler
      C API VariantKinds) to MCExprs.
    - the MCObjectSymbolizer class, which does symbolization using what it
      finds in an object::ObjectFile. This makes simple symbolization (with
      no fancy relocation stuff) work for all object formats!
    - x86-64 Mach-O and ELF MCRelocationInfos.
    - A basic ARM Mach-O MCRelocationInfo, that provides just enough to
      support the C API VariantKinds.

    Most of what works in otool (the only user of the old symbolization API
    that I know of) for x86-64 symbolic disassembly (-tvV) works, namely:
    - symbol references: call _foo; jmp 15 <_foo+50>
    - relocations: call _foo-_bar; call _foo-4
    - __cf?string: leaq 193(%rip), %rax ## literal pool for "hello"

    Stub support is the main missing part (because libObject doesn't know,
    among other things, about mach-o indirect symbols).

    As for the MCSymbolizer API, instead of relying on the disassemblers to
    call the tryAdding* methods, maybe this could be done automagically using
    InstrInfo? For instance, even though PC-relative LEAs are used to get the
    address of string literals in a typical Mach-O file, a MOV would be used
    in an ELF file. And right now, the explicit symbolization only recognizes
    PC-relative LEAs. InstrInfo should already have most of what is needed to
    know what to symbolize, so this can definitely be improved.

    I'd also like to remove object::RelocationRef::getValueString (it seems
    only used by relocation printing in objdump), as simply printing the
    created MCExpr is definitely enough (and cleaner than string concats).

    llvm-svn: 182625
* llvm-objdump: Initialize MCDisassembler once instead of for each section. | Ahmed Bougacha | 2013-05-16 | 1 | -45/+45
    llvm-svn: 182054
* Remove the MachineMove class. | Rafael Espindola | 2013-05-13 | 2 | -9/+11
    It was just a less powerful and more confusing version of
    MCCFIInstruction. A side effect is that, since MCCFIInstruction uses
    dwarf register numbers, calls to getDwarfRegNum are pushed out, which
    should allow further simplifications.

    I left the MachineModuleInfo::addFrameMove interface unchanged since this
    patch was already fairly big.
    llvm-svn: 181680
* Introduce convenience typedefs for the 4 ELF object types. | Rafael Espindola | 2013-05-09 | 1 | -8/+4
    llvm-svn: 181509
* Clarify getRelocationAddress x getRelocationOffset a bit. | Rafael Espindola | 2013-04-25 | 2 | -5/+5
    getRelocationAddress is for dynamic libraries and executables,
    getRelocationOffset for relocatable objects.

    Mark the getRelocationAddress of COFF and MachO as not implemented yet.
    Add a test of ELF's. llvm-readobj -r now prints the same values as
    readelf -r.
    llvm-svn: 180259
* Don't read one command past the end. | Rafael Espindola | 2013-04-19 | 1 | -2/+6
    Thanks to Evgeniy Stepanov for reporting this. It might be a good idea to
    add a command iterator abstraction to MachO.h, but this fixes the bug
    for now.
    llvm-svn: 179848
* At Jim Grosbach's request detemplate Object/MachO.h. | Rafael Espindola | 2013-04-18 | 2 | -40/+23
    We are still able to handle mixed endian objects by swapping one struct
    at a time.
    llvm-svn: 179778
* llvm-objdump: Don't print contents of BSS sections: it makes no sense and crashes llvm-objdump on relocated objects with large bss | Alexey Samsonov | 2013-04-16 | 1 | -0/+8
    llvm-svn: 179589
* Finish templating MachObjectFile over endianness. | Rafael Espindola | 2013-04-13 | 2 | -18/+48
    We are now able to handle big endian macho files in llvm-readobject.
    Thanks to David Fang for providing the object files.
    llvm-svn: 179440
* Simplify the code. No functionality change. | Rafael Espindola | 2013-04-11 | 1 | -17/+1
    llvm-svn: 179259
* Template the MachO types over endianness. | Rafael Espindola | 2013-04-10 | 1 | -5/+6
    For now they are still only used as little endian.
    llvm-svn: 179147
* Convert MachOObjectFile to a template. | Rafael Espindola | 2013-04-09 | 2 | -6/+6
    For now it is templated only on being 64 or 32 bits. I will add
    little/big endian next.
    llvm-svn: 179097
* Implement MachOObjectFile::getHeader directly. | Rafael Espindola | 2013-04-07 | 1 | -4/+4
    llvm-svn: 178994
* Remove LoadCommandInfo now that we always have a pointer to the command. | Rafael Espindola | 2013-04-07 | 1 | -3/+3
    LoadCommandInfo was needed to keep a command and its offset in the file.
    Now that we always have a pointer to the command, we don't need the
    offset.
    llvm-svn: 178991
* Add MachOObjectFile::LoadCommandInfo. | Rafael Espindola | 2013-04-07 | 1 | -2/+2
    This avoids using MachOObject::getLoadCommandInfo.
    llvm-svn: 178990
* Remove MachOObjectFile::getObject. | Rafael Espindola | 2013-04-07 | 1 | -10/+8
    llvm-svn: 178986
* Make getObject const. Remove a const_cast. | Rafael Espindola | 2013-04-07 | 1 | -2/+2
    llvm-svn: 178980
* Remove last use of InMemoryStruct in llvm-objdump. | Rafael Espindola | 2013-04-07 | 1 | -2/+2
    llvm-svn: 178979
* Remove dead code. | Rafael Espindola | 2013-04-07 | 1 | -17/+0
    llvm-svn: 178977
* Remove unused argument. | Rafael Espindola | 2013-04-07 | 1 | -3/+1
    llvm-svn: 178976
* Don't fetch pointers from a InMemoryStruct. | Rafael Espindola | 2013-04-05 | 2 | -8/+4
    InMemoryStruct is extremely dangerous as it returns data from an internal
    buffer when the endianness doesn't match. This should fix the tests on
    big endian hosts.
    llvm-svn: 178875
* Don't disassemble symbols with an unknown address or size. | Eric Christopher | 2013-04-03 | 1 | -0/+1
    Patch by Nico Rieck!
    llvm-svn: 178678
* print TLS segment | Shankar Easwaran | 2013-02-27 | 1 | -0/+3
    llvm-svn: 176192
* [objdump] Add PT_PHDR. | Michael J. Spencer | 2013-02-21 | 1 | -0/+3
    llvm-svn: 175709
* [objdump] Print the PT_INTERP and PT_DYNAMIC correctly. | Michael J. Spencer | 2013-02-20 | 1 | -0/+6
    llvm-svn: 175659
* Add static cast to unsigned char whenever a character classification function is called with a signed char argument, in order to avoid assertions in Windows Debug configuration. | Guy Benyei | 2013-02-12 | 1 | -1/+1
    llvm-svn: 175006
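The cast this commit describes matters because the `<cctype>` classification functions require an argument that is either `EOF` or representable as `unsigned char`; a negative `char` value violates that requirement. A minimal sketch of the pattern follows, where `isPrintableByte` is a hypothetical helper name and not the function touched by the commit.

```cpp
#include <cctype>

// Classify a raw byte safely. Without the cast, a byte such as 0xE9 stored
// in a signed char becomes a negative int, which std::isprint must not be
// given (debug C runtimes may assert on it).
bool isPrintableByte(char C) {
  return std::isprint(static_cast<unsigned char>(C)) != 0;
}
```

The same cast applies to the other classification functions, such as `std::isalpha` and `std::isspace`.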
* [objdump,readobj] Document the purpose and goals of each tool. | Michael J. Spencer | 2013-02-05 | 1 | -0/+3
    llvm-svn: 174439
* Remove unneeded #include. | Jakub Staszak | 2013-01-21 | 1 | -1/+0
    llvm-svn: 173088
* Sort all of the includes. Several files got checked in with mis-sorted includes. | Chandler Carruth | 2013-01-19 | 1 | -1/+0
    llvm-svn: 172891
* [Object][ELF] Simplify ELFObjectFile by using ELFType. | Michael J. Spencer | 2013-01-15 | 1 | -12/+12
    This simplifies the usage and implementation of ELFObjectFile by using
    ELFType to replace:

      <endianness target_endianness, std::size_t max_alignment, bool is64Bits>

    This does complicate the base ELF types as they must now use template
    template parameters to partially specialize for the 32 and 64bit cases.
    However these are only defined once.
    llvm-svn: 172515
* [llvm-objdump] Emit addresses with the correct number of leading 0's. | Michael J. Spencer | 2013-01-10 | 1 | -2/+5
    llvm-svn: 172130
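One way to picture "the correct number of leading 0's" is to pad an address to a fixed width that depends on the object's bitness. The sketch below assumes 8 hex digits for 32-bit objects and 16 for 64-bit ones; both that rule and the helper name `formatAddress` are assumptions for illustration, not a description of what the commit actually changed.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Format an address as zero-padded lowercase hex: 16 digits when Is64Bit
// is set, otherwise 8 (assumed widths, see the note above).
std::string formatAddress(std::uint64_t Address, bool Is64Bit) {
  char Buffer[17]; // Up to 16 hex digits plus the terminating NUL.
  std::snprintf(Buffer, sizeof(Buffer),
                Is64Bit ? "%016" PRIx64 : "%08" PRIx64, Address);
  return Buffer;
}
```

Using the `PRIx64` macro keeps the format specifier matched to `std::uint64_t` on every platform.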
* [objdump] Use correct format specifiers and fix C++03 variadic warning. | Michael J. Spencer | 2013-01-06 | 1 | -6/+8
    llvm-svn: 171651
* [objdump] Add --private-headers, -p. | Michael J. Spencer | 2013-01-06 | 4 | -1/+103
    This currently prints the ELF program headers.
    llvm-svn: 171649
* Sort a few more #include lines in tools/... unittests/... and utils/... | Chandler Carruth | 2013-01-02 | 1 | -1/+1
    llvm-svn: 171363
* Add a function to get the segment name of a section. | Rafael Espindola | 2012-12-21 | 2 | -5/+28
    On MachO, sections also have segment names. When a tool looking at a .o
    file prints a segment name, this is what they mean. In reality, a .o has
    only one, anonymous, segment.

    This patch adds a MachO only function to fetch that segment name. I named
    it getSectionFinalSegmentName since the main use for the name seems to be
    informing the linker which segment this section should go to.

    The patch also changes MachOObjectFile::getSectionName to return just the
    section name instead of computing SegmentName,SectionName.

    The main difference from the previous patch is that it doesn't use
    InMemoryStruct. It is extremely dangerous: if the endians match it
    returns a pointer to the file buffer, if not, it returns a pointer to an
    internal buffer that is overwritten in the next API call.

    We should change all of this code to use
    support::detail::packed_endian_specific_integral like ELF, but since
    these functions only handle strings, they work with big and little endian
    machines as is.

    I have tested this by installing ubuntu 12.10 ppc on qemu, that is why it
    took so long :-)
    llvm-svn: 170838
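The hazard described here comes from handing out a pointer whose target depends on the host's endianness. A safer pattern is to decode each field from the file bytes and return it by value, so nothing aliases a shared scratch buffer. The following is a minimal sketch of that pattern; `readU32` is a hypothetical helper and is not how packed_endian_specific_integral is implemented.

```cpp
#include <cstdint>

// Decode a 32-bit field from raw file bytes and return it by value.
// The result is the same on big and little endian hosts because the
// bytes are combined explicitly instead of being reinterpreted.
std::uint32_t readU32(const unsigned char *Bytes, bool IsBigEndian) {
  const std::uint32_t B0 = Bytes[0], B1 = Bytes[1];
  const std::uint32_t B2 = Bytes[2], B3 = Bytes[3];
  if (IsBigEndian)
    return (B0 << 24) | (B1 << 16) | (B2 << 8) | B3;
  return B0 | (B1 << 8) | (B2 << 16) | (B3 << 24);
}
```

Because the caller receives a copy, a later read cannot silently overwrite an earlier result.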
* Revert 170545 while I debug the ppc failures. | Rafael Espindola | 2012-12-19 | 2 | -28/+5
    llvm-svn: 170547
* Add r170095 back. | Rafael Espindola | 2012-12-19 | 2 | -5/+28
    I cannot reproduce the failures locally, so I will keep an eye on the
    ppc bots. This patch does add the change to the "Disassembly of section"
    message, but that is not what was failing on the bots.

    Original message:

    Add a function to get the segment name of a section.

    On MachO, sections also have segment names. When a tool looking at a .o
    file prints a segment name, this is what they mean. In reality, a .o has
    only one, anonymous, segment.

    This patch adds a MachO only function to fetch that segment name. I named
    it getSectionFinalSegmentName since the main use for the name seems to be
    informing the linker which segment this section should go to.

    The patch also changes MachOObjectFile::getSectionName to return just the
    section name instead of computing SegmentName,SectionName.
    llvm-svn: 170545
* Revert "Add a function to get the segment name of a section." | Eric Christopher | 2012-12-13 | 2 | -18/+4
    This reverts commit r170095 since it appears to be breaking the bots.
    llvm-svn: 170105
* Add a function to get the segment name of a section. | Rafael Espindola | 2012-12-13 | 2 | -4/+18
    On MachO, sections also have segment names. When a tool looking at a .o
    file prints a segment name, this is what they mean. In reality, a .o has
    only one, anonymous, segment.

    This patch adds a MachO only function to fetch that segment name. I named
    it getSectionFinalSegmentName since the main use for the name seems to be
    informing the linker which segment this section should go to.

    The patch also changes MachOObjectFile::getSectionName to return just the
    section name instead of computing SegmentName,SectionName.
    llvm-svn: 170095