SecurityXploded.com
 
 
In-Memory Execution of an Executable
Author: Amit Malik 
 
 
 
 
 
 
Introduction

This article is part of our free "Reverse Engineering & Malware Analysis Course".

The training page and the presentations from all previous sessions are available on our website.

 
 

In this article, we will learn how to perform in-memory (file-less) execution of an executable, with a practical code example.

Here I will explain some of the fancy techniques used by exploits and malware, from a shellcode perspective. This article requires a strong understanding of the PE file format. If you are not comfortable with it, first visit our training session on PE Format Basics.

 
 
Technical Introduction
Technically, an exploit is the combination of two things:
  1. Vulnerability – the software security bug
  2. Shellcode – the actual malicious payload
A vulnerability gives us control over the execution flow, while the shellcode is the payload that carries out the malicious activity. Without shellcode, a vulnerability is just a simple software bug.

Further, we can divide shellcodes into two types:

  1. Normal shellcodes
  2. Staged shellcodes (commonly seen in drive-by download attacks)
A normal shellcode carries out the malicious activity itself, for example bind shell or reverse shell shellcodes. It does not need to download any other payload. A staged shellcode, on the other hand, depends on another payload and is usually divided into two stages:

Stage 1 – downloads stage 2.
Stage 2 – the actual malicious payload.

Stage 1 downloads the stage 2 payload and executes it. After that, stage 2 performs the malicious activity. The interesting part is how stage 1 executes the stage 2 payload, and that is what this article discusses in detail.

Stage 1 has two options for executing the stage 2 payload:

  1. Download the payload, save it to disk and create a new process
  2. Download the payload and execute it directly from memory
Option #1 leaves a larger footprint on the system, and there is a greater chance of detection by host-based security software such as antivirus.

In option #2, the payload is executed directly from memory and never touches the disk, so it can evade file-based scanning much more easily. Unfortunately, no Windows API provides a mechanism to execute a file directly from memory. APIs such as CreateProcess, WinExec and ShellExecute all require the file to be present locally.

So the question is: how can we do that if there is no such API?

 
 
 
In-Memory Execution
 
I think the first known work on in-memory execution was done by ZomBie of 29A Labs, and Nologin later published its own version of the same idea. Later on, Stephen Fewer of Harmony Security applied the logic to DLLs and coined the term reflective DLL injection, which is now an integral part of the Metasploit framework.

This is possible because a PE file uses the same headers and data structures on disk as in mapped memory; only the placement of the sections differs (file offsets on disk, RVAs relative to the Image Base in memory). So, using the section headers, we can easily calculate an address in memory if we know the offset on disk, and vice versa. This makes it possible to mimic the operating system loader that loads the executable into memory.
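
The offset/address translation mentioned above can be sketched as follows. This is only an illustration: SectionInfo is a hand-written stand-in for the three IMAGE_SECTION_HEADER fields involved (so the code compiles without windows.h), and OffsetToRva, RvaToOffset and SampleSections are hypothetical helper names with a made-up two-section layout.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the IMAGE_SECTION_HEADER fields used in the translation.
struct SectionInfo {
    uint32_t VirtualAddress;    // RVA of the section once mapped
    uint32_t SizeOfRawData;     // bytes of section data stored in the file
    uint32_t PointerToRawData;  // file offset of that data
};

// Translate a file offset into an RVA; returns false if no section holds it.
bool OffsetToRva(const std::vector<SectionInfo>& sections,
                 uint32_t offset, uint32_t& rva) {
    for (const SectionInfo& s : sections) {
        if (offset >= s.PointerToRawData &&
            offset - s.PointerToRawData < s.SizeOfRawData) {
            rva = s.VirtualAddress + (offset - s.PointerToRawData);
            return true;
        }
    }
    return false;
}

// Translate an RVA back into a file offset; returns false if it has no raw data.
bool RvaToOffset(const std::vector<SectionInfo>& sections,
                 uint32_t rva, uint32_t& offset) {
    for (const SectionInfo& s : sections) {
        if (rva >= s.VirtualAddress &&
            rva - s.VirtualAddress < s.SizeOfRawData) {
            offset = s.PointerToRawData + (rva - s.VirtualAddress);
            return true;
        }
    }
    return false;
}

// Hypothetical two-section layout, used only for illustration.
std::vector<SectionInfo> SampleSections() {
    return {
        {0x1000, 0x200, 0x400},  // e.g. a code section
        {0x2000, 0x200, 0x600},  // e.g. a data section
    };
}
```

With this sample layout, file offset 0x410 lies 0x10 bytes into the first section, so it corresponds to RVA 0x1010; adding the Image Base to an RVA gives the virtual address.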

The operating system loader is responsible for process initialization, so if we can build a prototype of it, we can probably create a process directly from memory. Before that, we need to look at how the OS loader works, especially how it maps an executable into memory.

The following are the simplified steps carried out by the OS loader when you launch an executable.

  1. Read the first page of the file, which includes the DOS header, PE header, section headers etc.

  2. Fetch the Image Base address from the PE header and determine whether that address is available; if not, allocate another area (the relocation case).

  3. Map the sections into the allocated area.

  4. Read the information from the import table and load the DLLs.

  5. Resolve the function addresses and create the Import Address Table (IAT).

  6. Create the initial heap and stack using values from the PE header.

  7. Create the main thread and start the process.

If we can write a program that mimics some of the above steps, then we can execute an exe directly from memory.

For example, consider this situation: you download an exe/dll from the Internet. Until you save it to disk, it remains in volatile memory. This means we can read the header information of that file directly from memory and, following the above steps, execute the file from there. In short, it is possible to execute an exe/dll without its file, i.e. file-less execution is possible.

If you take a close look at the above steps, you can see that most of the information is stored in the PE header itself, which we can read programmatically.
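
As a rough illustration of reading that information programmatically, the sketch below pulls a few fields out of a buffer that holds the file contents. It is not the prototype from this article: it uses fixed PE32 field offsets instead of the winnt.h structures so that it compiles anywhere, it assumes a little-endian host, and PeSummary, ReadAt, ParsePe32Headers and MakeSampleHeader are hypothetical names. MakeSampleHeader builds a synthetic header only, not a runnable executable.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// The header fields a loader prototype needs first.
struct PeSummary {
    uint16_t NumberOfSections;
    uint32_t AddressOfEntryPoint;  // RVA
    uint32_t ImageBase;
    uint32_t SizeOfImage;
    uint32_t SizeOfHeaders;
};

// Bounds-checked read of a value at a byte offset (little-endian host assumed).
template <typename T>
bool ReadAt(const std::vector<uint8_t>& buf, size_t off, T& out) {
    if (off > buf.size() || buf.size() - off < sizeof(T)) return false;
    std::memcpy(&out, buf.data() + off, sizeof(T));
    return true;
}

// Parse the DOS header, NT signature, file header and PE32 optional header.
bool ParsePe32Headers(const std::vector<uint8_t>& buf, PeSummary& pe) {
    uint16_t mz = 0, optMagic = 0;
    uint32_t lfanew = 0, sig = 0;
    if (!ReadAt(buf, 0, mz) || mz != 0x5A4D) return false;             // "MZ"
    if (!ReadAt(buf, 0x3C, lfanew)) return false;                      // e_lfanew
    if (!ReadAt(buf, lfanew, sig) || sig != 0x00004550) return false;  // "PE\0\0"
    const size_t fileHdr = size_t(lfanew) + 4;  // IMAGE_FILE_HEADER (20 bytes)
    const size_t optHdr = fileHdr + 20;         // IMAGE_OPTIONAL_HEADER32
    if (!ReadAt(buf, optHdr, optMagic) || optMagic != 0x10B) return false;  // PE32 only
    return ReadAt(buf, fileHdr + 2, pe.NumberOfSections) &&
           ReadAt(buf, optHdr + 16, pe.AddressOfEntryPoint) &&
           ReadAt(buf, optHdr + 28, pe.ImageBase) &&
           ReadAt(buf, optHdr + 56, pe.SizeOfImage) &&
           ReadAt(buf, optHdr + 60, pe.SizeOfHeaders);
}

// Synthetic header buffer with made-up values, for demonstration only.
std::vector<uint8_t> MakeSampleHeader() {
    std::vector<uint8_t> buf(0x200, 0);
    auto put16 = [&](size_t off, uint16_t v) { std::memcpy(&buf[off], &v, 2); };
    auto put32 = [&](size_t off, uint32_t v) { std::memcpy(&buf[off], &v, 4); };
    put16(0, 0x5A4D);                 // e_magic
    put32(0x3C, 0x80);                // e_lfanew
    put32(0x80, 0x00004550);          // NT signature
    put16(0x80 + 4 + 2, 2);           // NumberOfSections
    put16(0x80 + 24, 0x10B);          // optional header magic (PE32)
    put32(0x80 + 24 + 16, 0x1000);    // AddressOfEntryPoint
    put32(0x80 + 24 + 28, 0x500000);  // ImageBase
    put32(0x80 + 24 + 56, 0x3000);    // SizeOfImage
    put32(0x80 + 24 + 60, 0x400);     // SizeOfHeaders
    return buf;
}
```

The virtual address of the entry point is then ImageBase + AddressOfEntryPoint, which is 0x501000 for the sample values.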

Technically, the minimum information required to run any executable is as follows:

  1. Address space

  2. Proper placement of the exe sections in the address space

  3. Imported API addresses

 
Address space
In PE, everything is relative to the Image Base. If we can get an allocation at the Image Base address, we can proceed to the next steps easily; otherwise we have to add relocation support to our loader prototype. For this article I am ignoring that part and assuming that we have an allocation at the Image Base.

Sections mapped into Address Space
In the PE file header, the NumberOfSections field gives us the total number of sections. After that we can read the section headers and write each section to the proper address in memory: we read the data at the file offset in PointerToRawData and copy it to VirtualAddress (relative to the Image Base), taking the length from the SizeOfRawData field.
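
The section placement can be tried out safely on plain byte buffers before touching real memory. The sketch below is a simulation, not the prototype code: it lays the file bytes out in a std::vector the way they would appear once mapped, and it adds range checks that the prototype omits. SectionInfo is a hand-written stand-in for the IMAGE_SECTION_HEADER fields, and MapSections is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the IMAGE_SECTION_HEADER fields used when placing a section.
struct SectionInfo {
    uint32_t VirtualAddress;    // destination RVA inside the image
    uint32_t SizeOfRawData;     // number of bytes to copy
    uint32_t PointerToRawData;  // source offset inside the file
};

// Build the in-memory layout of the image from the raw file bytes.
// Returns false when a header describes data outside either buffer.
bool MapSections(const std::vector<uint8_t>& file,
                 const std::vector<SectionInfo>& sections,
                 uint32_t sizeOfHeaders, uint32_t sizeOfImage,
                 std::vector<uint8_t>& image) {
    image.assign(sizeOfImage, 0);  // gaps between sections stay zero-filled
    if (sizeOfHeaders > file.size() || sizeOfHeaders > image.size()) return false;
    // The headers sit at the start of the image, exactly as they do in the file.
    std::copy_n(file.begin(), sizeOfHeaders, image.begin());
    for (const SectionInfo& s : sections) {
        // 64-bit arithmetic so the range checks cannot wrap around.
        const uint64_t srcEnd = uint64_t(s.PointerToRawData) + s.SizeOfRawData;
        const uint64_t dstEnd = uint64_t(s.VirtualAddress) + s.SizeOfRawData;
        if (srcEnd > file.size() || dstEnd > image.size()) return false;
        std::copy_n(file.begin() + s.PointerToRawData, s.SizeOfRawData,
                    image.begin() + s.VirtualAddress);
    }
    return true;
}
```

For example, a section with PointerToRawData 0x400, VirtualAddress 0x1000 and SizeOfRawData 0x200 ends up occupying image offsets 0x1000 to 0x11FF, and the bytes after it remain zero until the next section begins.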

Imported API addresses
Again, by reading the import table structure we can get the names of the DLLs and APIs used by the executable. Remember that FirstThunk in the import table structure points to the array that becomes the IAT after name resolution.
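
To make the import table layout concrete, here is a small sketch that only lists the imports found in an already mapped image buffer; it does not load or resolve anything. It assumes a 32-bit, unbound image on a little-endian host, reads the IMAGE_IMPORT_DESCRIPTOR fields by their offsets (Name at +12, FirstThunk at +16, 20 bytes per descriptor), and treats a thunk with the high bit set as an import by ordinal. ImportEntry, ListImports and MakeSampleImports are hypothetical names, and the sample buffer is synthetic.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

struct ImportEntry {
    std::string dll;
    std::string function;  // empty when the import is by ordinal
    uint16_t ordinal;      // meaningful only when function is empty
};

// Bounds-checked 32-bit read at an RVA inside the mapped image.
static bool ReadU32(const std::vector<uint8_t>& img, uint64_t rva, uint32_t& out) {
    if (rva + 4 > img.size()) return false;
    std::memcpy(&out, img.data() + rva, 4);
    return true;
}

// Bounds-checked read of a NUL-terminated string at an RVA.
static bool ReadString(const std::vector<uint8_t>& img, uint64_t rva, std::string& out) {
    out.clear();
    for (; rva < img.size(); ++rva) {
        if (img[rva] == 0) return true;
        out.push_back(char(img[rva]));
    }
    return false;  // ran off the end without finding a terminator
}

// Walk the import descriptors starting at importRva and collect every entry.
bool ListImports(const std::vector<uint8_t>& img, uint32_t importRva,
                 std::vector<ImportEntry>& out) {
    for (uint64_t desc = importRva;; desc += 20) {
        uint32_t nameRva = 0, firstThunk = 0;
        if (!ReadU32(img, desc + 12, nameRva)) return false;
        if (nameRva == 0) return true;  // all-zero descriptor ends the table
        if (!ReadU32(img, desc + 16, firstThunk)) return false;
        std::string dll;
        if (!ReadString(img, nameRva, dll)) return false;
        for (uint64_t thunk = firstThunk;; thunk += 4) {
            uint32_t value = 0;
            if (!ReadU32(img, thunk, value)) return false;
            if (value == 0) break;  // end of this DLL's thunk array
            ImportEntry e{dll, "", 0};
            if (value & 0x80000000u) {
                e.ordinal = uint16_t(value & 0xFFFF);  // import by ordinal
            } else if (!ReadString(img, uint64_t(value) + 2, e.function)) {
                return false;  // +2 skips the Hint field of IMAGE_IMPORT_BY_NAME
            }
            out.push_back(e);
        }
    }
}

// Synthetic 0x100-byte "image" with one descriptor, for demonstration only.
std::vector<uint8_t> MakeSampleImports() {
    std::vector<uint8_t> img(0x100, 0);
    auto put32 = [&](size_t off, uint32_t v) { std::memcpy(&img[off], &v, 4); };
    put32(0x10 + 12, 0x60);        // descriptor at RVA 0x10: Name -> 0x60
    put32(0x10 + 16, 0x40);        // FirstThunk -> 0x40
    put32(0x40, 0x70);             // thunk 0: by name, IMAGE_IMPORT_BY_NAME at 0x70
    put32(0x44, 0x80000000u | 5);  // thunk 1: by ordinal 5
    std::memcpy(&img[0x60], "user32.dll", 11);
    std::memcpy(&img[0x72], "MessageBoxA", 12);  // 0x70..0x71 is the Hint
    return img;
}
```

A loader would pass each collected name (or ordinal) to GetProcAddress and write the returned address back into the same thunk slot, which is how the FirstThunk array turns into the IAT.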

 
 
 
Memory Execution – Prototype Code
 
Based on the above information we can write a basic loader prototype. Please note that I am intentionally ignoring a couple of important things in the code, such as the relocation case, section permissions and fixes for ordinal-based entries.

/* In memory execution example */
/*
Author: Amit Malik
http://www.securityxploded.com
Compile in Dev C++ as a 32-bit build (pointers are stored in DWORDs)
*/

#include <stdio.h>
#include <stdlib.h>
#include <windows.h>

#define DEREF_32( name ) (*(DWORD *)(name))

int main()
{
     char file[20];
     HANDLE handle;
     PVOID vpointer;
     HINSTANCE laddress;
     LPSTR libname;
     DWORD size;
     DWORD EntryAddr;
     int state;
     DWORD byteread;
     PIMAGE_NT_HEADERS nt;
     PIMAGE_SECTION_HEADER section;
     DWORD dwValueA;
     DWORD dwValueB;
     DWORD dwValueC;
     DWORD dwValueD; 

     printf("Enter file name: ");
     scanf("%19s", file);   // limit the input to the size of the buffer

          
           
     // read the file
     printf("Reading file..\n");
     handle = CreateFile(file,GENERIC_READ,0,0,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,0);
     if (handle == INVALID_HANDLE_VALUE)
     {
         printf("Could not open the file.\n");
         return 1;
     }
     
     // get the file size
     size = GetFileSize(handle,NULL);
     
     // Allocate the space 
     vpointer = VirtualAlloc(NULL,size,MEM_COMMIT,PAGE_READWRITE);
     
     // read file on the allocated space
     state = ReadFile(handle,vpointer,size,&byteread,NULL);
     CloseHandle(handle);
     if (!state || byteread != size)
     {
         printf("Could not read the file.\n");
         return 1;
     }
     printf("You can delete the file now!\n");
     system("pause");
     
     // read NT header of the file
     nt = PIMAGE_NT_HEADERS(PCHAR(vpointer) + PIMAGE_DOS_HEADER(vpointer)->e_lfanew);
     handle = GetCurrentProcess();
     
     // get VA of entry point
     EntryAddr = nt->OptionalHeader.ImageBase + nt->OptionalHeader.AddressOfEntryPoint;
     
     // Allocate the space with Imagebase as a desired address allocation request
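     PVOID memalloc = VirtualAllocEx(
                                     handle, 
                                     PVOID(nt->OptionalHeader.ImageBase), 
                                     nt->OptionalHeader.SizeOfImage, 
                                     MEM_RESERVE | MEM_COMMIT, 
                                     PAGE_EXECUTE_READWRITE
                                     );
     // no relocation support here, so stop if the Image Base is not available
     if (memalloc == NULL)
     {
         printf("Could not allocate memory at the Image Base.\n");
         return 1;
     }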
    
     // Write headers on the allocated space
     WriteProcessMemory(handle, 
     memalloc, 
     vpointer, 
     nt->OptionalHeader.SizeOfHeaders, 
     0
     );
     
     
     // write sections on the allocated space
     section = IMAGE_FIRST_SECTION(nt);
     for (ULONG i = 0; i < nt->FileHeader.NumberOfSections; i++) 
     {
         WriteProcessMemory(
                           handle, 
                           PCHAR(memalloc) + section[i].VirtualAddress, 
                           PCHAR(vpointer) + section[i].PointerToRawData, 
                           section[i].SizeOfRawData, 
                           0
                           );
     }
     
     // read import directory
     dwValueB = (DWORD) &(nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT]);
     
     // get the VA 
     dwValueC = (DWORD)(nt->OptionalHeader.ImageBase) + 
                          ((PIMAGE_DATA_DIRECTORY)dwValueB)->VirtualAddress;
     
     
     while(((PIMAGE_IMPORT_DESCRIPTOR)dwValueC)->Name)
     {
            // get DLL name
            libname = (LPSTR)(nt->OptionalHeader.ImageBase + 
                              ((PIMAGE_IMPORT_DESCRIPTOR)dwValueC)->Name);
                              
            // Load dll
            laddress = LoadLibrary(libname);
            
            // get first thunk, it will become our IAT
            dwValueA = nt->OptionalHeader.ImageBase + 
                                  ((PIMAGE_IMPORT_DESCRIPTOR)dwValueC)->FirstThunk;
            
            // resolve function addresses
            while(DEREF_32(dwValueA))
            {
                dwValueD = nt->OptionalHeader.ImageBase + DEREF_32(dwValueA);
                // get function name 
                LPSTR Fname = (LPSTR)((PIMAGE_IMPORT_BY_NAME)dwValueD)->Name;
                // get function addresses
                DEREF_32(dwValueA) = (DWORD)GetProcAddress(laddress,Fname);
                dwValueA += 4;
            }

            dwValueC += sizeof( IMAGE_IMPORT_DESCRIPTOR );

     }
   
   
     // call the entry point :: here we assume that everything is ok.
     ((void(*)(void))EntryAddr)();
     return 0;
           
}
           
           
 

Compile the above code in Dev C++. For the proof of concept, I will execute the MessageBox code that I showed in my 'Assembly Basics' article.

Now perform the following steps:

  1. Compile the MessageBox code again, but first open the project properties in WinAsm (project->Project Properties->Release) and add the following option in the Link block: /BASE:0x500000
  2. Click OK.
  3. Now assemble and link the code. You will get an EXE with Image Base 0x500000, which is good for our POC.
 
 
[Screenshot: the MessageBox executable running directly from memory]
 
 
 
Conclusion
 
Kaspersky recently reported seeing a file-less worm, but these techniques are not new. Metasploit has offered file-less payloads for years in the form of reflective DLL injection. Many malicious programs and packers use these techniques heavily, and they are also well known for bypassing security software.

Overall, it is a very powerful mechanism that every malware analyst should know.

 
 
 
References
  1. Nologin - Remote Library Injection
  2. Harmony Security - Reflective DLL Injection
  3. ZomBie - In Memory Execution
 
 