Monday, July 18, 2011

ELF File Format Parsing

In this post I will explain how to parse the 32-bit ELF file format in order to find the offsets of its sections. First we need to look at the ELF header (the first 52 bytes) to get the offset of the program header table, the entry point address, the offset of the section header table and so on.

If we read the DWORD at offset 24 of the ELF header, we get the entry point of the executable file. The DWORD at offset 32 is the start offset of the section header table, and the size of a single entry in that table is stored at offset 46 as a WORD. We also need the number of section headers, which is stored at offset 48 as a WORD. Finally, the section header string table index is stored at offset 50 as a WORD. So the ELF header structure looks like this:


Magic Number: 4 bytes (0x7F 'E' 'L' 'F')
Class: 1 byte
Data: 1 byte
Version: 1 byte
OS/ABI: 1 byte
ABI Version: 1 byte
Padding: 7 bytes (the identification block is 16 bytes in total)
Type: WORD
Machine: WORD
Version: DWORD
EntryPoint: DWORD
Start of Program Headers: DWORD
Start of Section Headers: DWORD
Flags: DWORD
Size of this header: WORD
Size of program headers: WORD
Number of program headers: WORD
Size of section headers: WORD
Number of section headers: WORD
Section header string table index: WORD
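As a rough illustration, the header layout above can be unpacked in one step with Python's struct module. This is only a sketch: it assumes a little-endian file (as on x86), and the names ELF32_HEADER, Elf32Header and parse_elf32_header are made up for this example.

```python
import struct
from collections import namedtuple

# Assumption: little-endian 32-bit ELF ("<" = little-endian, no padding).
# 16s = identification block, H = WORD, I = DWORD; 52 bytes in total.
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")

Elf32Header = namedtuple(
    "Elf32Header",
    "ident type machine version entry phoff shoff flags "
    "ehsize phentsize phnum shentsize shnum shstrndx",
)


def parse_elf32_header(data: bytes) -> Elf32Header:
    """Unpack the first 52 bytes of a 32-bit ELF file."""
    header = Elf32Header._make(ELF32_HEADER.unpack_from(data, 0))
    if header.ident[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic number")
    if header.ident[4] != 1:  # Class byte: 1 means 32-bit
        raise ValueError("not a 32-bit ELF file")
    return header
```

With the file contents loaded into data, parse_elf32_header(data).entry is the DWORD at offset 24, .shoff the DWORD at offset 32, and .shentsize, .shnum and .shstrndx the WORDs at offsets 46, 48 and 50.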

So far so good: we can successfully parse the ELF header with this structure. But we need to parse the section header table to access sections like .text, .rodata, .strtab, .symtab and so on. (Strictly speaking these are sections; segments are described by the program headers.)


The structure of the section header table is more complicated than the ELF header. According to the output of the readelf utility, the executable file has 29 section headers and each section header is 40 bytes long. Each entry of the section header table looks like this:

Section Name: DWORD (an offset into .shstrtab, which we will use to read the name)
Section Type: DWORD
Section Flag: DWORD
Section Addr: DWORD
Section Start Offset: DWORD
Section Size: DWORD
Section Lk: DWORD
Section Inf: DWORD
Section Align: DWORD
Section ES: DWORD (entry size; readelf prints this column before Lk, but on disk it is the last field)
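A minimal sketch of walking the section header table could look like the following. It assumes a little-endian file and uses the on-disk field order from the ELF specification (name, type, flags, addr, offset, size, link, info, alignment, entry size); SectionHeader and parse_section_headers are illustrative names only.

```python
import struct
from collections import namedtuple

# Assumption: little-endian file. Ten DWORDs = 40 bytes per entry.
SECTION_HEADER = struct.Struct("<10I")

SectionHeader = namedtuple(
    "SectionHeader",
    "name type flags addr offset size link info addralign entsize",
)


def parse_section_headers(data: bytes, shoff: int, shentsize: int, shnum: int) -> list:
    """Read shnum entries of shentsize bytes each, starting at file offset shoff."""
    headers = []
    for index in range(shnum):
        start = shoff + index * shentsize
        headers.append(SectionHeader._make(SECTION_HEADER.unpack_from(data, start)))
    return headers
```

The three arguments shoff, shentsize and shnum are the values taken from offsets 32, 46 and 48 of the ELF header.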

OK, we have successfully parsed the section header table, but we still need to read the section names from the .shstrtab section. The section header string table index points to the section header of .shstrtab. Once we grab the start offset of .shstrtab (the 26th section header), we can read the names by using the section name offsets.
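The name lookup described above can be sketched like this. Names in .shstrtab are NUL-terminated strings, so we read from the computed offset up to the next zero byte; read_section_name is a hypothetical helper written for this post's walkthrough, not part of any library.

```python
def read_section_name(data: bytes, shstrtab_offset: int, name_offset: int) -> str:
    """Return the NUL-terminated name stored at shstrtab_offset + name_offset."""
    start = shstrtab_offset + name_offset
    end = data.index(b"\x00", start)  # raises ValueError if no terminator is found
    return data[start:end].decode("ascii")
```

Here shstrtab_offset is the start offset field of the .shstrtab section header, and name_offset is the first DWORD of the section header whose name we want.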


The example shows that the .shstrtab section is located at offset 1068h (offset: 4200). Let's look at the hexdump of this section.


So let's try to find the section name of an example section header.


The white area is an example section header for this executable. The first DWORD value is the section name offset, and it is 92h (offset 146 within .shstrtab).

0x1068 + 0x92 = 0x10FA

At offset 0x10FA we see the string .text. So we can say the example is the section header of the .text section. Let's parse the .text section. The address of the .text section is 0x08048310 (do you remember the EntryPoint?). Its start offset is 310h (offset: 784) and its size is 17Ch (380 bytes).

0x310 + 0x17C = 0x48C

Now we can say the .text section starts at offset 0x310 and ends at offset 0x48C (the end offset is exclusive, so the last byte of the section is at 0x48B).
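The same arithmetic can be written as a small helper that slices a section's bytes out of the file. This is a sketch with a made-up function name (section_bytes); it only assumes the whole file has been read into memory.

```python
def section_bytes(data: bytes, offset: int, size: int) -> bytes:
    """Return the bytes in the half-open range [offset, offset + size)."""
    end = offset + size  # exclusive end offset
    if end > len(data):
        raise ValueError("section extends past the end of the file")
    return data[offset:end]
```

For the .text example above, section_bytes(data, 0x310, 0x17C) returns the 380 bytes between offsets 0x310 and 0x48C.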


That's it, we have parsed the .text section. You can confirm the result by disassembling the file with objdump or gdb. I have also written a simple ELF parser library in Ruby based on these structures; it will be released soon.

Cya!

3 comments:

  1. yo, i'm following your structures to try and make my own (in vb.net) im just confused at something here, you're saying that shstrtab is the 26th Section Header, right? Considering that this number is for every ELF file, as 26th you mean from 0-30 its 25, or 1-30 actually 26? Thanks in advance

  2. Hi Kamikai,
    Sorry for late response. It should be 0-30 25th but I didn't remember clearly. Please ping me back if you have a chance to try it. Sorry mate.

  3. This comment has been removed by a blog administrator.
