The Common Object File Format
Discussing the intricacies of the COFF file format.
The Common Object File Format (COFF) was originally created by AT&T for Unix System V in 1983. Since then, it has been adopted and modified for use in many modern operating systems, including Windows.
The COFF format used on Windows today isn't the same as the original, but most of its components remain the same, including:
The File header (sometimes called the COFF header)
The Optional header (although it is listed in Microsoft's COFF specification, you'll essentially never see it in a Windows COFF object file; it is usually found within a PE file instead. We'll be skipping this one, as it isn't relevant here.)
The section headers
The sections themselves
The Symbol Table
The Strings Table
Relocation entries
The File Header
Each of these components has a distinct purpose, and we'll discuss them all thoroughly, starting with the file header. If you're at all familiar with the Portable Executable (PE) file format used for executables on Windows, you should know this one well: it's the same header that sits alongside the Optional Header inside the NT headers.
On Windows, this header is the IMAGE_FILE_HEADER structure. It contains an abundance of useful metadata for the file, including some characteristics, the number of sections present, and an offset to the symbol table.
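To make the layout concrete, here's a minimal sketch of the file header as a C struct. The field names follow IMAGE_FILE_HEADER from winnt.h, but the CoffFileHeader type and the coff_read_file_header helper are hypothetical names used only for illustration, and the code assumes a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the fields of IMAGE_FILE_HEADER using fixed-width types. */
#pragma pack(push, 1)
typedef struct {
    uint16_t Machine;              /* target CPU, e.g. 0x8664 for x64 */
    uint16_t NumberOfSections;     /* how many section headers follow */
    uint32_t TimeDateStamp;
    uint32_t PointerToSymbolTable; /* file offset of the symbol table */
    uint32_t NumberOfSymbols;
    uint16_t SizeOfOptionalHeader; /* normally 0 in an object file */
    uint16_t Characteristics;
} CoffFileHeader;
#pragma pack(pop)

static_assert(sizeof(CoffFileHeader) == 20, "file header must be 20 bytes");

/* Hypothetical helper: copies the header out of a raw file buffer.
   Returns 0 on success, -1 if the buffer is too small. */
int coff_read_file_header(const uint8_t *data, size_t size, CoffFileHeader *out)
{
    if (data == NULL || out == NULL || size < sizeof(*out))
        return -1;
    memcpy(out, data, sizeof(*out));
    return 0;
}
```

Copying with memcpy rather than casting the buffer pointer sidesteps alignment problems when the file was read into an arbitrary byte buffer.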
Sections and Section Headers
Again, if you're familiar with the PE format for Windows executables you should immediately recognize this one, as it's identical to the section headers found there. There is typically a single section header for each section within the object file, and each contains metadata about its section, including the size of the section, its characteristics, the location of the section's data (in an object file this is the file offset in PointerToRawData, since the RVA field is generally left at zero until the file is linked), and PointerToRelocations, a crucial member that we'll be using later. These section headers sit directly after the file header, laid out back to back, preceding the sections themselves.
The sections themselves, sometimes referred to as image or COFF sections, come after the section headers. This includes things like .text for executable code, .data for modifiable program data, .pdata for info used during stack unwinding, and others.
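As a rough sketch of how the section headers can be walked, the snippet below mirrors IMAGE_SECTION_HEADER and computes where the header for a given index lives. The CoffSectionHeader type and both helper functions are hypothetical names, the Misc union from winnt.h is flattened into a single VirtualSize field, and a little-endian host is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
typedef struct {
    uint8_t  Name[8];              /* e.g. ".text", padded with nulls */
    uint32_t VirtualSize;          /* Misc.VirtualSize in winnt.h */
    uint32_t VirtualAddress;       /* generally 0 in an object file */
    uint32_t SizeOfRawData;        /* size of the section's data on disk */
    uint32_t PointerToRawData;     /* file offset of the section's data */
    uint32_t PointerToRelocations; /* file offset of its relocation entries */
    uint32_t PointerToLinenumbers;
    uint16_t NumberOfRelocations;
    uint16_t NumberOfLinenumbers;
    uint32_t Characteristics;
} CoffSectionHeader;
#pragma pack(pop)

static_assert(sizeof(CoffSectionHeader) == 40, "section header must be 40 bytes");

enum { COFF_FILE_HEADER_SIZE = 20 };

/* Section headers start after the file header and the (usually absent)
   optional header, and are stored back to back. */
size_t coff_section_header_offset(uint16_t size_of_optional_header, uint16_t index)
{
    return COFF_FILE_HEADER_SIZE + (size_t)size_of_optional_header
         + (size_t)index * sizeof(CoffSectionHeader);
}

/* Returns 0 on success, -1 if the header would fall outside the buffer. */
int coff_read_section_header(const uint8_t *data, size_t size,
                             uint16_t size_of_optional_header, uint16_t index,
                             CoffSectionHeader *out)
{
    size_t off = coff_section_header_offset(size_of_optional_header, index);
    if (data == NULL || out == NULL || off > size || size - off < sizeof(*out))
        return -1;
    memcpy(out, data + off, sizeof(*out));
    return 0;
}
```

Once a header has been read, the section's bytes sit at data + PointerToRawData and run for SizeOfRawData bytes.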
The Symbol And Strings Tables
The symbol table follows the COFF sections. It contains data on the different symbols used within the object file, including functions and variables. The symbol table is essentially an array of IMAGE_SYMBOL structures.
The most important members to take note of are SectionNumber, which indicates which COFF section the symbol belongs to; Value, which is an offset that can be applied to the base address of that section to reach the symbol; Type, which describes what kind of symbol this is (in practice, Microsoft's tools use it mainly to tell functions apart from everything else); and a confusing union, called N, which stores the symbol's name. N overlays an 8-byte array, ShortName, with a struct called Name that has two members, Short and Long. If Short is non-zero, the symbol's name is stored inline in ShortName. If Short is zero, the name is longer than 8 bytes, and you can reach it by applying Name.Long, which is an offset, to the address directly following the symbol table.
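The name lookup is easier to follow in code. The following is a minimal sketch, not a definitive implementation: the struct mirrors IMAGE_SYMBOL (minus the rarely used LongName member of the union), while CoffSymbol and coff_symbol_name are hypothetical names. It assumes the caller passes a pointer to the first byte after the symbol table, which is where the 4-byte size field of the strings table sits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
typedef struct {
    union {
        uint8_t ShortName[8];          /* inline name, may lack a terminator */
        struct {
            uint32_t Short;            /* zero means "use Long instead" */
            uint32_t Long;             /* offset into the strings table */
        } Name;
    } N;
    uint32_t Value;                    /* offset within the symbol's section */
    int16_t  SectionNumber;            /* 1-based section index */
    uint16_t Type;
    uint8_t  StorageClass;
    uint8_t  NumberOfAuxSymbols;
} CoffSymbol;
#pragma pack(pop)

static_assert(sizeof(CoffSymbol) == 18, "symbol record must be 18 bytes");

/* Copies the symbol's name into out as a null-terminated string.
   strtab points at the start of the strings table (its size field).
   Returns 0 on success, -1 on a bad offset or a too-small buffer. */
int coff_symbol_name(const CoffSymbol *sym, const uint8_t *strtab,
                     size_t strtab_size, char *out, size_t out_size)
{
    if (sym->N.Name.Short != 0) {
        /* Short name: up to 8 bytes, padded with nulls if shorter. */
        size_t len = 0;
        while (len < 8 && sym->N.ShortName[len] != 0)
            len++;
        if (out_size < len + 1)
            return -1;
        memcpy(out, sym->N.ShortName, len);
        out[len] = '\0';
        return 0;
    }

    /* Long name: the offset counts from the start of the strings table,
       so anything below 4 would land inside the size field. */
    size_t off = sym->N.Name.Long;
    if (strtab == NULL || off < 4 || off >= strtab_size)
        return -1;
    const uint8_t *end = memchr(strtab + off, 0, strtab_size - off);
    if (end == NULL)
        return -1;
    size_t len = (size_t)(end - (strtab + off));
    if (out_size < len + 1)
        return -1;
    memcpy(out, strtab + off, len + 1);
    return 0;
}
```

Note that an 8-character name fills ShortName completely, which is why the short path cannot rely on a terminator being present.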
This offset points into the strings table, which directly follows the symbol table. The strings table begins with a 4-byte size field, and the offset is measured from the start of that field. The table is mainly used to store symbol names longer than 8 bytes; a name of exactly 8 bytes still fits in ShortName, just without a null terminator.
The strings table is generally a myriad of null-terminated ANSI strings packed together at the end of the file. TrustedSec's whoami BOF, for example, contains a strings table of exactly this kind.
Relocation Table/Entries
There is one final part of the COFF format on Windows we haven't discussed: the relocation entries. Each COFF section has a related array of relocation entries, and each entry is an IMAGE_RELOCATION structure.
Each relocation describes a value that needs to be readjusted for it to be valid in the context of where the file is loaded. The IMAGE_RELOCATION structure contains some important information: VirtualAddress, an offset that can be applied to the beginning of the section the relocation belongs to in order to reach the value that needs relocating; SymbolTableIndex, which indicates which symbol this relocation refers to; and the most important member, Type, which indicates what kind of relocation to perform. There are several kinds of relocations you'll need to handle in a custom COFF loader. I won't go over all of them here, but I'll cover a couple now so I don't have to go over this too much later.
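Here's a small sketch of reading a section's relocation entries, assuming you already have that section's PointerToRelocations and NumberOfRelocations from its header. The struct mirrors IMAGE_RELOCATION (with the VirtualAddress/RelocCount union flattened), the two type values match the IMAGE_REL_AMD64_* constants in winnt.h, and CoffRelocation and coff_read_relocation are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
typedef struct {
    uint32_t VirtualAddress;   /* offset of the value to patch, from section start */
    uint32_t SymbolTableIndex; /* index of the symbol the relocation refers to */
    uint16_t Type;             /* which relocation formula to apply */
} CoffRelocation;
#pragma pack(pop)

static_assert(sizeof(CoffRelocation) == 10, "relocation entry must be 10 bytes");

/* A couple of the x64 relocation types (values from winnt.h). */
enum {
    COFF_REL_AMD64_ADDR64 = 0x0001, /* IMAGE_REL_AMD64_ADDR64 */
    COFF_REL_AMD64_REL32  = 0x0004  /* IMAGE_REL_AMD64_REL32 */
};

/* Reads relocation entry `index` of a section whose entries start at file
   offset `pointer_to_relocations`. Returns 0 on success, -1 if the entry
   would fall outside the buffer. */
int coff_read_relocation(const uint8_t *data, size_t size,
                         uint32_t pointer_to_relocations, uint16_t index,
                         CoffRelocation *out)
{
    size_t off = (size_t)pointer_to_relocations + (size_t)index * sizeof(*out);
    if (data == NULL || out == NULL || off > size || size - off < sizeof(*out))
        return -1;
    memcpy(out, data + off, sizeof(*out));
    return 0;
}
```

A loader would typically call this in a loop from 0 up to the section's NumberOfRelocations and switch on Type for each entry.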
Take the 64-bit absolute relocation first (IMAGE_REL_AMD64_ADDR64 on x64). This one is fairly straightforward. It's a 64-bit relocation, which means you'll need to adjust 64 bits at the relocation's address. Each relocation type has its own formula. For this one, we take the 64-bit value already stored at the relocation's address and add to it the loaded address of the symbol the relocation refers to, that is, the base of the section containing that symbol plus the symbol's Value.
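A hedged sketch of the 64-bit case might look like the following. The coff_apply_addr64 name and its parameters are hypothetical; the sketch assumes the caller has already worked out symbol_address (the loaded base of the section the target symbol lives in, plus that symbol's Value) and that reloc_site points at the loaded section base plus the relocation's VirtualAddress.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Applies a 64-bit absolute relocation in place.
   reloc_site     : pointer to the 8 bytes that need patching
   symbol_address : final address of the symbol being referenced */
void coff_apply_addr64(uint8_t *reloc_site, uint64_t symbol_address)
{
    assert(reloc_site != NULL);

    /* The bytes already stored at the site act as an addend. */
    uint64_t addend;
    memcpy(&addend, reloc_site, sizeof addend);

    uint64_t fixed = symbol_address + addend;
    memcpy(reloc_site, &fixed, sizeof fixed);
}
```

Reading and writing through memcpy keeps the patch safe even when the relocation site is not 8-byte aligned, which is common inside packed code and data.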
Another common one you might encounter is the 32-bit relative relocation (IMAGE_REL_AMD64_REL32 on x64). It indicates that we need to adjust 32 bits at the relocation's address, and it is usually used for a relative jump or call instruction. So, for instance, a jump relative to the instruction pointer, such as jmp rip + 80h, would fall under this category. So now you have a broad idea of some of the relocations that need to take place when we parse the BOF's relocation entries.
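For the 32-bit relative case, a rough sketch is shown below. It assumes the usual x64 rule that the displacement is measured from the end of the 4-byte field being patched, so the result is the symbol's address plus the stored addend, minus the address of the site plus 4. The coff_apply_rel32 name and its parameters are hypothetical; site_address is passed separately from the pointer only so the arithmetic can be exercised with made-up addresses, and in a real loader it would simply be the address reloc_site points to.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Applies a 32-bit RIP-relative relocation in place.
   reloc_site     : pointer to the 4 bytes that need patching
   site_address   : address those 4 bytes will have at run time
   symbol_address : final address of the symbol being referenced
   Returns 0 on success, -1 if the target is out of 32-bit range. */
int coff_apply_rel32(uint8_t *reloc_site, uint64_t site_address,
                     uint64_t symbol_address)
{
    assert(reloc_site != NULL);

    /* The 4 bytes already stored at the site act as a signed addend. */
    int32_t addend;
    memcpy(&addend, reloc_site, sizeof addend);

    /* Distance from the end of the 4-byte field to the symbol. */
    int64_t delta = (int64_t)(symbol_address - (site_address + 4)) + addend;
    if (delta < INT32_MIN || delta > INT32_MAX)
        return -1;

    int32_t fixed = (int32_t)delta;
    memcpy(reloc_site, &fixed, sizeof fixed);
    return 0;
}
```

The range check matters in practice: if the loader places a section or an imported function pointer more than 2 GB away from the code that references it, the displacement cannot be encoded and the relocation has to be treated as a failure.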