Loading An Object File

Detailing the steps needed to load a Beacon Object File.

Now that we have a good understanding of the structure of the COFF format, we can begin writing a loader. Before we do, I'd like to state that while I don't have a public COFF loader available at the moment, I am using one for my upcoming C2's agent. None of this is public yet, and it is very far from being done, so I'll take some time now to showcase a couple of excellent COFF loaders that have been made publicly available on GitHub.

  • TrustedSec's COFF loader: one of the first ones publicly released. Works, but quite ugly.

  • 5pider's CoffeeLdr: much cleaner and easier to read. Currently used in Havoc's flagship agent, Demon. For the remainder of this paper, I'll be using his loader specifically as an example.

Generally, we can follow these steps when writing a COFF loader:

  1. Calculate the total size needed to load the BOF, rounded up to the nearest page boundary.

  2. Allocate pages for the BOF, and copy its sections into their correct virtual addresses.

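  3. Parse the file's relocations for each section. If the relocation is for an imported function, resolve that function's address. The imported function is usually either a Win32 API function or one of the aforementioned Beacon API functions. Once the target address is known, patch it into the section at the location the relocation points to.

  4. Locate the entry point (usually a function named "go") and call it.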

Calculating Required Virtual Memory, Copying Sections

When calculating how much virtual memory the object file's sections need, we have to account for page alignment. Memory protections can only be set for whole pages, so if a section smaller than a page (4096 bytes) shared a page with its neighbor, one of the two could end up with the wrong permissions. Giving every section its own run of pages avoids this. 5pider's COFF loader takes this into account, which we can see here:

for ( UINT16 SecCnt = 0 ; SecCnt < Coffee->Header->NumberOfSections; SecCnt++ )
{
    Coffee->Section  = C_PTR( U_PTR( Coffee->Data ) + sizeof( COFF_FILE_HEADER ) + U_PTR( sizeof( COFF_SECTION ) * SecCnt ) );
    Coffee->BofSize += Coffee->Section->SizeOfRawData;
    Coffee->BofSize  = ( SIZE_T ) ( ULONG_PTR ) PAGE_ALLIGN( Coffee->BofSize );
}

Coffee->BofSize += Coffee->FunMapSize;

Coffee->ImageBase = MmVirtualAlloc( DX_MEM_DEFAULT, NtCurrentProcess(), Coffee->BofSize, PAGE_READWRITE );
if ( ! Coffee->ImageBase )
{
    PUTS( "Failed to allocate memory for the BOF" )
    goto END;
}

Excusing the fact that he misspelled the word align as "allign" here (apologies for the grammar policing), we can see that he is indeed adding up the size of each COFF section and page-aligning the running total as he goes. Finally, he allocates memory for the sections with a call to MmVirtualAlloc, a Havoc internal function that'll end up calling VirtualAlloc or NtAllocateVirtualMemory based on the parameters.

Also, if you don't spend a large amount of time reading 5pider's code as I do, you may be unfamiliar with some of his stylistic choices. Make note of the C_PTR and U_PTR macros, as you'll see them quite frequently in the Havoc source. C_PTR converts a pointer-sized unsigned integer (64 bits on x64) to a pointer type, and U_PTR does the opposite, converting a pointer to an integral type.
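To make the arithmetic concrete, here is a minimal sketch of the size calculation in plain C. It assumes 4 KiB pages, and the names SKETCH_PAGE_SIZE, page_round_up and total_image_size are made up for illustration; they are not Havoc's. Rounding each section up and summing gives the same result as aligning the running total on every iteration.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x1000u /* assumption: 4 KiB pages */

/* Round value up to the next multiple of the page size.
 * Values already on a page boundary are returned unchanged. */
static size_t page_round_up(size_t value)
{
    return (value + SKETCH_PAGE_SIZE - 1) & ~((size_t)SKETCH_PAGE_SIZE - 1);
}

/* Add up the raw sizes of all sections, giving each section its own
 * run of whole pages. */
static size_t total_image_size(const uint32_t *raw_sizes, size_t count)
{
    size_t total = 0;

    for (size_t i = 0; i < count; i++) {
        total += page_round_up(raw_sizes[i]);
    }

    return total;
}
```

With sections of 0x10, 0x1000 and 0x1001 bytes, this reserves one, one and two pages respectively, for 0x4000 bytes in total.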

Next, 5pider copies over each of the COFF sections like so:

NextBase = Coffee->ImageBase;
for ( UINT16 SecCnt = 0 ; SecCnt < Coffee->Header->NumberOfSections; SecCnt++ )
{
    Coffee->Section               = C_PTR( U_PTR( Coffee->Data ) + sizeof( COFF_FILE_HEADER ) + U_PTR( sizeof( COFF_SECTION ) * SecCnt ) );
    Coffee->SecMap[ SecCnt ].Size = Coffee->Section->SizeOfRawData;
    Coffee->SecMap[ SecCnt ].Ptr  = NextBase;

    NextBase += Coffee->Section->SizeOfRawData;
    NextBase  = PAGE_ALLIGN( NextBase );

    PRINTF( "Coffee->SecMap[ %d ].Ptr => %p\n", SecCnt, Coffee->SecMap[ SecCnt ].Ptr )

    MemCopy( Coffee->SecMap[ SecCnt ].Ptr, C_PTR( U_PTR( CoffeeData ) + Coffee->Section->PointerToRawData ), Coffee->Section->SizeOfRawData );
}

Again, this is nothing all that crazy. If you ignore the smaller details and look at the bigger picture, he's simply copying each section from its original location in the file to where it needs to be in virtual memory. The original location is given by the PointerToRawData member of the section header, an offset that, when applied to the base of the unmapped object file, lands you at the section's data. On each iteration, he records NextBase as the section's base, advances NextBase to the start of the next free page, and copies the section's data to the recorded base. He continues to do this until all of the sections are copied over.
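The same copy step can be sketched in portable C. Everything here is hypothetical scaffolding: sketch_section keeps only the two header fields this step needs, and map_sections works on plain buffers rather than real page allocations. The bounds checks and the zero fill for sections with no raw data in the file (PointerToRawData of 0, as is typical for uninitialized data) are additions of this sketch and are not taken from the loader quoted above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 0x1000u /* assumption: 4 KiB pages */

/* Hypothetical, trimmed-down section header */
typedef struct {
    uint32_t size_of_raw_data;
    uint32_t pointer_to_raw_data; /* file offset, 0 when the file holds no data */
} sketch_section;

/* Where each section ended up inside the image */
typedef struct {
    uint8_t *base;
    uint32_t size;
} sketch_section_map;

static size_t page_round_up(size_t value)
{
    return (value + SKETCH_PAGE_SIZE - 1) & ~((size_t)SKETCH_PAGE_SIZE - 1);
}

/* Copy every section to its own page-aligned slot inside image.
 * Returns 0 on success, -1 if a section would fall outside either buffer. */
static int map_sections(const uint8_t *file, size_t file_size,
                        const sketch_section *sections, size_t count,
                        uint8_t *image, size_t image_size,
                        sketch_section_map *map)
{
    size_t next = 0; /* offset of the next free page inside image */

    for (size_t i = 0; i < count; i++) {
        size_t size   = sections[i].size_of_raw_data;
        size_t offset = sections[i].pointer_to_raw_data;

        if (next > image_size || size > image_size - next) {
            return -1;
        }

        if (offset == 0) {
            /* nothing stored in the file: start from zeroed memory */
            memset(image + next, 0, size);
        } else {
            if (offset > file_size || size > file_size - offset) {
                return -1;
            }
            memcpy(image + next, file + offset, size);
        }

        map[i].base = image + next;
        map[i].size = (uint32_t)size;

        next += page_round_up(size);
    }

    return 0;
}
```

Keeping the per-section base in a small map pays off later, because both relocation processing and entry point lookup need to translate a section number into an address.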

Section Parsing

This is where the bulk of the action takes place, and it is probably the most challenging part. We'll need to go through each section of the BOF, walk that section's relocation entries, and perform the necessary adjustments. In Havoc, which again is an excellent, easy-to-follow example, this takes place within the CoffeeProcessSections function.

He begins by creating two nested loops, one that keeps track of which section he's on, and an inner one to keep track of the relocation entry he's accessing.

for ( UINT16 SectionCnt = 0; SectionCnt < Coffee->Header->NumberOfSections; SectionCnt++ )
{
    Coffee->Section = C_PTR( U_PTR( Coffee->Data ) + sizeof( COFF_FILE_HEADER ) + U_PTR( sizeof( COFF_SECTION ) * SectionCnt ) );
    Coffee->Reloc   = C_PTR( U_PTR( Coffee->Data ) + Coffee->Section->PointerToRelocations );

    for ( DWORD RelocCnt = 0; RelocCnt < Coffee->Section->NumberOfRelocations; RelocCnt++ )
    {
        Symbol = &Coffee->Symbol[ Coffee->Reloc->SymbolTableIndex ]; //get the symbol
        
----------- SNIP -----------

Notice how he accesses the symbol that the relocation is referencing by indexing into the symbol table. He does this with the relocation entry's SymbolTableIndex member.
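As a refresher on what happens once the symbol has been located, here is a small sketch of reading its name. The sketch_symbol layout mirrors the 18-byte COFF symbol record, but the type and the symbol_name helper are hypothetical names used only for this example. Short names are copied into a 9-byte scratch buffer because a name that uses all 8 bytes has no terminator.

```c
#include <stdint.h>
#include <string.h>

#pragma pack(push, 1)
/* Hypothetical mirror of the 18-byte COFF symbol record */
typedef struct {
    union {
        uint8_t short_name[8];
        struct {
            uint32_t zeroes; /* 0 means the name lives in the string table */
            uint32_t offset; /* offset into the string table */
        } long_name;
    } name;
    uint32_t value;
    int16_t  section_number;
    uint16_t type;
    uint8_t  storage_class;
    uint8_t  number_of_aux_symbols;
} sketch_symbol;
#pragma pack(pop)

/* Return the name of the symbol at index. The string table starts right
 * after the last symbol record, and offsets into it are measured from its
 * first byte (which is where its 4-byte size field sits). */
static const char *symbol_name(const sketch_symbol *table, uint32_t symbol_count,
                               uint32_t index, char scratch[9])
{
    const sketch_symbol *symbol = &table[index];

    if (symbol->name.long_name.zeroes != 0) {
        /* short name: up to 8 bytes stored inline */
        memcpy(scratch, symbol->name.short_name, 8);
        scratch[8] = '\0';
        return scratch;
    }

    const char *string_table = (const char *)(table + symbol_count);
    return string_table + symbol->name.long_name.offset;
}
```

A relocation's SymbolTableIndex would be passed as index here, which ties the relocation entry to a printable symbol name.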

After accessing the symbol's name via the method I discussed in "The Common Object File Format", it's now possible to perform relocations and address resolution based on the symbol's type. If the symbol's name begins with __imp_, it's an imported function, and it needs to be resolved. Now, here's a crucial detail that I left out earlier: the naming convention for imported functions found in BOFs.

For a BOF to be loaded, the loader needs an easy way of determining which library an imported function belongs to. This is done by putting the name of the library in front of the name of the function in the symbol's declaration, separated by a "$" character. For example:

DECLSPEC_IMPORT BOOL KERNEL32$VirtualProtect(
  LPVOID lpAddress,
  SIZE_T dwSize,
  DWORD  flNewProtect,
  PDWORD lpflOldProtect
);

If we wanted to call VirtualProtect for any reason in our custom BOF, we'd create a declaration like this. KERNEL32 is the library here, and it's included within the function name to give the loader an easier time. It's also marked with DECLSPEC_IMPORT, a macro that expands to __declspec(dllimport), which tells the compiler that this function is an import. This makes the resolution of this symbol significantly easier from the perspective of the loader. The final name of this symbol within the object file will be "__imp_KERNEL32$VirtualProtect". This means that all we need to do is remove "__imp_", split the string at the "$" character, and then use the first half of the string to resolve the library and the second half to resolve the function.
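The string handling described here fits in a few lines of C. This is only a sketch: split_import_symbol and IMPORT_PREFIX are hypothetical names, and it copies into caller-supplied buffers instead of tokenizing the symbol in place.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define IMPORT_PREFIX "__imp_"

/* Split "__imp_LIBNAME$FUNCNAME" into its library and function parts.
 * Returns false when the prefix or the '$' separator is missing, when
 * either part is empty, or when an output buffer is too small. */
static bool split_import_symbol(const char *symbol,
                                char *library, size_t library_cap,
                                char *function, size_t function_cap)
{
    const size_t prefix_len = strlen(IMPORT_PREFIX);

    if (strncmp(symbol, IMPORT_PREFIX, prefix_len) != 0) {
        return false;
    }

    const char *name   = symbol + prefix_len;
    const char *dollar = strchr(name, '$');
    if (dollar == NULL) {
        return false;
    }

    size_t library_len  = (size_t)(dollar - name);
    size_t function_len = strlen(dollar + 1);

    if (library_len == 0 || function_len == 0) {
        return false;
    }
    if (library_len >= library_cap || function_len >= function_cap) {
        return false;
    }

    memcpy(library, name, library_len);
    library[library_len] = '\0';
    memcpy(function, dollar + 1, function_len + 1); /* includes the terminator */

    return true;
}
```

For "__imp_KERNEL32$VirtualProtect" this yields "KERNEL32" and "VirtualProtect", which a loader could then hand to its module loading and export lookup routines. A symbol with no "$", such as "__imp_BeaconPrintf", is rejected and has to be handled separately.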

Additionally, we have to consider Beacon API functions. If we take a look at some of the declarations within beacon.h, we can observe that they also use DECLSPEC_IMPORT to mark themselves as imported functions.

DECLSPEC_IMPORT void   BeaconPrintf(int type, char * fmt, ...);

With Beacon API functions, it's also fairly straightforward. The final symbol name of the above BeaconPrintf would be "__imp_BeaconPrintf", meaning we only need to remove "__imp_" and check if the symbol name starts with "Beacon". If it does, we can check if it's an API function that we support, and if so, resolve the function call to point to the Beacon API function within our program. We can observe 5pider doing exactly this within his COFF loader:

if ( SymBeacon == COFF_PREP_BEACON )
{
    // this is an import symbol from Beacon: __imp_BeaconFUNCNAME
    SymFunction = SymbolName + COFF_PREP_SYMBOL_SIZE;

    for ( DWORD i = 0 ;; i++ )
    {
        if ( ! BeaconApi[ i ].NameHash )
            break;

        if ( HashStringA( SymFunction ) == BeaconApi[ i ].NameHash )
        {
            PUTS( "Found Beacon api function" )
            *pFuncAddr = BeaconApi[ i ].Pointer; //return the address of the Beacon API
            return TRUE;
        }
    }

    goto SymbolNotFound;
}

------- SNIP -------

else if ( SymbolIsImport( SymbolName ) )
{
    // this is a typical import symbol in the form: __imp_LIBNAME$FUNCNAME
    SymLibrary  = Bak + COFF_PREP_SYMBOL_SIZE;
    SymLibrary  = StringTokenA( SymLibrary, "$" );
    SymFunction = SymLibrary + StringLengthA( SymLibrary ) + 1;

    //
    // load the module
    //
    hLibrary    = LdrModuleLoad( SymLibrary );

    if ( ! hLibrary )
    {
        PRINTF( "Failed to load library: Lib:[%s] Err:[%d]\n", SymLibrary, NtGetLastError() );
        goto SymbolNotFound;
    }

    StringCopyA( SymName, SymFunction );

    AnsiString.Length        = StringLengthA( SymName );
    AnsiString.MaximumLength = AnsiString.Length + sizeof( CHAR );
    AnsiString.Buffer        = SymName;

    //
    // get the address of the function
    //
    if ( NT_SUCCESS( Instance->Win32.LdrGetProcedureAddress( hLibrary, &AnsiString, 0, pFuncAddr ) ) )
        return TRUE;

    goto SymbolNotFound;
}

------- SNIP -------

As we can see here, he's comparing the name of the symbol against existing Beacon API functions supported by Demon using hashing. If there's a match, he returns the address of the Beacon function within his program, which will then be used during relocation. For imported functions such as ones found in Kernel32, he separates the symbol name at the "$" character and attempts to load the library into the process. After that's done, he can try to get the address of the function and return it.
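A hash-keyed lookup table of this kind can be sketched as follows. Note the assumptions: the hash is 32-bit FNV-1a, chosen only as a stand-in for whatever hash a real loader uses, the two stub functions stand in for an agent's own Beacon API implementations, and every identifier below is hypothetical. Like the table in the excerpt above, this one ends with an entry whose hash is zero.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef void (*sketch_fn)(void);

typedef struct {
    uint32_t  name_hash; /* 0 marks the end of the table */
    sketch_fn pointer;
} sketch_api_entry;

/* 32-bit FNV-1a, standing in for the loader's real hash function */
static uint32_t hash_name(const char *s)
{
    uint32_t hash = 2166136261u;

    while (*s) {
        hash ^= (uint8_t)*s++;
        hash *= 16777619u;
    }

    return hash;
}

/* Placeholders for the agent's own Beacon API implementations */
static void stub_beacon_printf(void) {}
static void stub_beacon_output(void) {}

/* Fill a three-entry table: two functions plus the terminator */
static void build_demo_table(sketch_api_entry table[3])
{
    table[0].name_hash = hash_name("BeaconPrintf");
    table[0].pointer   = stub_beacon_printf;
    table[1].name_hash = hash_name("BeaconOutput");
    table[1].pointer   = stub_beacon_output;
    table[2].name_hash = 0;
    table[2].pointer   = NULL;
}

/* Resolve "__imp_BeaconXxx" to a function pointer, or NULL if the symbol
 * is not a Beacon import or is not in the table. */
static sketch_fn lookup_beacon_api(const sketch_api_entry *table, const char *symbol)
{
    if (strncmp(symbol, "__imp_", 6) != 0) {
        return NULL;
    }

    const char *name = symbol + 6;
    if (strncmp(name, "Beacon", 6) != 0) {
        return NULL;
    }

    uint32_t hash = hash_name(name);
    for (size_t i = 0; table[i].name_hash != 0; i++) {
        if (table[i].name_hash == hash) {
            return table[i].pointer;
        }
    }

    return NULL;
}
```

Storing hashes rather than names keeps the API names out of the agent's binary as plain strings, at the cost of having to precompute the hashes for the table.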

After the symbol itself is resolved, all that's left to do is perform the relocation using the resolved address. 5pider's loader does this next, and the logic maps onto the relocation types I discussed earlier, in "The Common Object File Format". He handles 32-bit and 64-bit, as well as relative and non-relative relocations, using the previously resolved symbol's address.
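To show what "performing the relocation" means at the byte level, here is a sketch covering two x64 cases only: the absolute 64-bit type and the 32-bit relative family. The numeric values come from the PE/COFF specification (IMAGE_REL_AMD64_ADDR64 is 0x0001, and IMAGE_REL_AMD64_REL32 through IMAGE_REL_AMD64_REL32_5 are 0x0004 through 0x0009), renamed here so the sketch does not clash with winnt.h. The function apply_relocation and its parameters are hypothetical, and other relocation types are deliberately left out. For an __imp_ symbol, target would be the address of a slot holding the resolved function pointer, since the compiled code calls through that slot.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_REL_ADDR64  0x0001 /* IMAGE_REL_AMD64_ADDR64  */
#define SKETCH_REL_REL32   0x0004 /* IMAGE_REL_AMD64_REL32   */
#define SKETCH_REL_REL32_5 0x0009 /* IMAGE_REL_AMD64_REL32_5 */

/* Patch one relocation.
 *   patch         - bytes to rewrite inside the mapped section
 *   patch_address - address those bytes will have at run time
 *   target        - resolved address of the referenced symbol
 * The value already stored at the patch site is treated as an addend.
 * Returns false for unsupported types or out-of-range displacements. */
static bool apply_relocation(uint8_t *patch, uint64_t patch_address,
                             uint16_t type, uint64_t target)
{
    if (type == SKETCH_REL_ADDR64) {
        uint64_t addend;
        memcpy(&addend, patch, sizeof addend);

        uint64_t value = target + addend;
        memcpy(patch, &value, sizeof value);
        return true;
    }

    if (type >= SKETCH_REL_REL32 && type <= SKETCH_REL_REL32_5) {
        int32_t addend;
        memcpy(&addend, patch, sizeof addend);

        /* The displacement is measured from the end of the 4-byte field,
         * plus 0 to 5 extra bytes depending on the exact type. */
        uint64_t next  = patch_address + 4 + (uint64_t)(type - SKETCH_REL_REL32);
        int64_t  delta = (int64_t)(target + (uint64_t)(int64_t)addend - next);

        if (delta < INT32_MIN || delta > INT32_MAX) {
            return false;
        }

        int32_t displacement = (int32_t)delta;
        memcpy(patch, &displacement, sizeof displacement);
        return true;
    }

    return false;
}
```

The range check is the reason a loader wants everything a relative relocation can refer to within 2 GiB of the code: a target further away than that cannot be encoded in a 32-bit displacement.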

Executing BOF Functions

There's one last step involved in the BOF loading process, and it's perhaps the most straightforward one. All we need to do now is execute a specific function in the BOF. The function to run is sometimes specified by the operator, but more often than not, COFF loaders will try to execute a function named "go". This is typically the name given to the pseudo-main function of these files. Your typical "go" function will have a signature like this:

void go(char* args, int argc);

By convention, this will be the entry point for the vast majority of BOFs. Despite its name here, the second parameter is conventionally the length in bytes of the packed argument buffer rather than a count of arguments. In Havoc's Demon agent, execution is handled by the CoffeeExecuteFunction function. For the sake of brevity, I'll be showcasing a different version of this function, the one I'm using for my own COFF loader. The reason is that there are a couple of things in Havoc's version that are (probably) unnecessary, namely how he changes the protection of each section in the BOF to the correct value, as seen here:

BitMask = Coffee->Section->Characteristics & ( IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_WRITE );

if ( BitMask == 0 )
    Protection = PAGE_NOACCESS;
else if ( BitMask == IMAGE_SCN_MEM_EXECUTE )
    Protection = PAGE_EXECUTE;
else if ( BitMask == IMAGE_SCN_MEM_READ )
    Protection = PAGE_READONLY;
else if ( BitMask == ( IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_EXECUTE ) )
    Protection = PAGE_EXECUTE_READ;
else if ( BitMask == IMAGE_SCN_MEM_WRITE )
    Protection = PAGE_WRITECOPY;
else if ( BitMask == ( IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_WRITE ) )
    Protection = PAGE_EXECUTE_WRITECOPY;
else if ( BitMask == ( IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_WRITE ) )
    Protection = PAGE_READWRITE;
else if ( BitMask == ( IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_WRITE ) )
    Protection = PAGE_EXECUTE_READWRITE;
else
{
    PRINTF( "Unknown protection: %x", Coffee->Section->Characteristics );
    Protection = PAGE_EXECUTE_READWRITE;
}

This (probably) isn't necessary, because you realistically only need to make the section you're about to run executable. The rest of the sections at this point will all be read/write, which shouldn't cause any issues. Here's my version of the function:

bool
object_execute(
        object_context* ctx, 
        const char* entry, 
        unsigned char* args, 
        const uint32_t argc
    ) {

    void(*entry_fn)(unsigned char*, uint32_t) = nullptr;

    PIMAGE_SYMBOL   symbol       = nullptr;
    const char*     symbol_name  = nullptr;
    void*           section_base = nullptr;
    uint32_t        section_size = 0;
    DWORD           old_protect  = 0;

    //short names may use all 8 bytes with no terminator, so they get copied into a zeroed 9-byte buffer
    char            short_name[IMAGE_SIZEOF_SHORT_NAME + 1] = { 0 };

    //step over each symbol's auxiliary records so they aren't misread as symbols
    for(size_t i = 0; i < ctx->header->NumberOfSymbols; i += 1 + symbol->NumberOfAuxSymbols) {

        symbol = &ctx->sym_table[i];
        if(symbol->N.Name.Short) {    //if the symbol has a short name
            memcpy(short_name, symbol->N.ShortName, IMAGE_SIZEOF_SHORT_NAME);
            symbol_name = short_name;
        } 
        else {                        //the symbol has a long name, stored in the string table that follows the symbol table
            symbol_name = reinterpret_cast<char*>(PTR_TO_U64(ctx->sym_table + ctx->header->NumberOfSymbols) + INT_TO_U64(symbol->N.Name.Long));
        }

        //SectionNumber > 0 means the function is defined in one of this file's sections
        if(ISFCN(symbol->Type) && symbol->SectionNumber > 0 && strcmp(entry, symbol_name) == 0) {
            section_base = ctx->sec_map[symbol->SectionNumber - 1].base;
            section_size = ctx->sec_map[symbol->SectionNumber - 1].size;

            //make the entry point's section executable
            if(!VirtualProtect(
                section_base,
                section_size,
                PAGE_EXECUTE_READ,
                &old_protect
            )) {
                return false;
            }

            //Value is the function's offset from the base of its section
            entry_fn = reinterpret_cast<decltype(entry_fn)>(PTR_TO_U64(section_base) + symbol->Value);
            entry_fn(args, argc);

            //restore the original protection
            if(!VirtualProtect(
                section_base,
                section_size,
                old_protect,
                &old_protect
            )) {
                return false;
            }

            return true;
        }
    }

    return false;
}

In this code, I iterate through the symbols in the symbol table until I find one whose name matches the entry point, confirming with Microsoft's ISFCN macro that its type is a function. I then retrieve the base of the section that the symbol belongs to. The Value member of the symbol is an offset into that section, so applying it to the section's base gives the exact position of the function in memory. Once I have the address of the function we want to execute (probably "go"), I set the protection of its section to execute/read so that it can run, and restore the previous protection afterwards.

This is (probably) a much simpler way of doing it. Note how I keep using "probably" in parentheses: realistically, I am nowhere near as knowledgeable as someone like 5pider, so there may be a severe issue with doing this instead of changing the protection of each section individually, in which case I will sorely regret this later. One case to watch for is a BOF whose code is spread across more than one executable section, since only the entry point's section is made executable here.

Either way, the BOF should now run, and in my case, I can get this useful output from TrustedSec's "whoami" BOF.
