This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification.
http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/
Student ID: SLAE64-1434
Target Operating System: 64 bit Linux (x86_64 GNU/Linux)
Line 56 is an example of our larger shellcode. In this case it is the ExecveStack shellcode but it could be any shellcode that we wish to execute. Simply replace line 56 with your shellcode of choice.
Assignment 3
Study about the Egghunter shellcode. Create a working demo of the Egghunter with different payloads. Should be configurable for different payloads.
Egg Hunting
The idea behind egg hunting is that sometimes the shellcode that you wish to run will not fit in the available buffer that you wish to overflow. But you are able to fit a much smaller shellcode that has just enough code to search in memory for the larger shellcode.
We need a way for the smaller shellcode to identify the larger shellcode in memory and we accomplish that by providing an identifier that we can use to recognize our larger shellcode. This identifier is referred to as the egg and finding it is called egg hunting.
Once we find our larger shellcode the smaller shellcode will start the larger code executing. In this sense the smaller code acts as a stager, a shim or loader if you will for the larger shellcode.
In the remainder of this blog post, I will show 2 examples of using an egg hunter to find and execute our larger shellcode.
Example 1
In my first example (below) I use an egg hunter that simply scans the stack for our larger shellcode. We mark the larger shellcode with two copies of our egg in a row, so that the hunter does not mistake its own code, which contains a single copy of the egg, for the larger shellcode. In practice the egg hunter would probably also be on the stack as part of a buffer overflow, so it needs to be able to tell the difference between itself and our larger shellcode.
GitHub Link: https://github.com/rtaylor777/nasm/blob/master/EggHunterStack1434.c
Number of bytes = 16
Number of nulls = 0
EggHunterStack1434.c
Here you can see an example of the EggHunterStack1434 having successfully found and executed the ExecveStack shellcode.
Below is an example of testing the EggHunter with the BindShell1434 shellcode.
Details
EggHunterStack1434.nasm
Lines 35 and 36 are getting a pointer to the stack into the RDI register.
Line 37 is pushing our 4 byte egg onto the stack. The bytes just happen to work out to be the same as the code xor edi,edi repeated twice in a row.
Line 38 pops the Egg into the RAX register. This is where it needs to be for the scasd instruction, which compares EAX to the 4 bytes pointed to by RDI. Since we compare 4 bytes at a time, scasd also automatically advances RDI by 4 (assuming the direction flag is clear) so that it points at the next 4 bytes to compare.
Line 40, scasd, compares our Egg, which is in EAX, with the memory pointed to by RDI.
Line 41, if the Egg is not found we jump to "next" which then runs scasd to compare the next 4 bytes.
Line 42, if we found a match for our Egg we scan again to see if there were two Eggs in a row. This is how we tell if we found our larger shellcode.
Line 43, if we didn't find a second Egg we jump back to our scan at line 40.
Line 44, we found our second Egg and we jump to the code that is pointed at by RDI. This starts our larger shellcode executing.
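The scan loop described above can be modelled in C to make the control flow easier to follow. This is only a sketch under stated assumptions, not the author's code: the function name find_double_egg and the helper load_dword are hypothetical, the egg value 0xFF31FF31 is my reading of "xor edi,edi repeated twice" as the bytes 31 FF 31 FF, and the length check exists only so the model can stop safely (the real hunter has no such bound).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed egg: bytes 31 FF 31 FF read as a little-endian dword.
 * The real value is whatever EggHunterStack1434.nasm pushes. */
#define EGG 0xFF31FF31u

/* Load 4 bytes as a little-endian dword, the way scasd sees memory. */
static uint32_t load_dword(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the offset just past two consecutive eggs, or -1 if the
 * buffer ends first. */
long find_double_egg(const uint8_t *buf, size_t len, uint32_t egg)
{
    assert(buf != NULL);
    size_t off = 0;
    while (off + 8 <= len) {
        uint32_t first = load_dword(buf + off);
        off += 4;                 /* scasd always advances by 4 */
        if (first != egg)
            continue;             /* no egg: scan the next dword */
        uint32_t second = load_dword(buf + off);
        off += 4;                 /* second scasd */
        if (second != egg)
            continue;             /* single egg only: keep scanning */
        return (long)off;         /* the payload starts here */
    }
    return -1;
}
```

Because the model steps 4 bytes at a time, like scasd, it only recognises eggs that sit on a 4-byte boundary relative to where the scan started.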
EggStack
I created a script to automate the creation of the EggHunterStack1434.c file. If you have been using my helper scripts (see: http://a41l4.blogspot.ca/2017/02/slae-helper-scripts.html ) you would just put this new script in the same location.
The script will clobber any existing EggHunterStack1434.c file that is in the current directory, so if you made some important edits to the file, move it somewhere safe first.
Running the script, if it is in your parent directory, might look like this:
../EggStack ../Assignment1/b/BindShell1434.o
The parameter that you pass to the script on the command line should be the path and name of the object file containing the main shellcode that you want the egg hunter to find.
The command line to compile the new EggHunterStack1434.c file looks like this:
gcc -fno-stack-protector -zexecstack EggHunterStack1434.c -o EggHunterStack1434
Example 2
In this example we are going to answer the question "What do we do if our main shellcode is not on the stack but is somewhere else in memory?".
I would like to start by recommending that you take a look at this blog post for its author's 32 bit Linux solution: https://www.rcesecurity.com/2014/08/slae-egg-hunters-linux-x86/
ASLR
One topic that you should research and be familiar with is:
ASLR (Address Space Layout Randomization) https://en.wikipedia.org/wiki/Address_space_layout_randomization
On your Linux system you can examine ASLR by looking at the addresses assigned to the various regions of memory as follows (a process can read its own maps file, so root access is not needed):
cat /proc/self/maps
If you run the cat command above several times you will see that each time the address range for the stack and the heap changes. I'll just pull out a couple of lines from my system as an example:
First run:
0064a000-0066b000 rw-p 00000000 00:00 0 [heap]
7ffc145c5000-7ffc145e6000 rw-p 00000000 00:00 0 [stack]
Second run:
01bbf000-01be0000 rw-p 00000000 00:00 0 [heap]
7ffc4cb83000-7ffc4cba4000 rw-p 00000000 00:00 0 [stack]
ASLR is what is preventing us from knowing (either roughly or exactly) where our shellcode might be located.
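The same check can be made from inside a program. The sketch below is illustrative only: the function name find_mapping is hypothetical, and it assumes a Linux system where /proc is mounted. It copies out the first line of /proc/self/maps that contains a given tag such as "[stack]".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copies the first line of /proc/self/maps that contains `tag`
 * (for example "[stack]") into `out`.
 * Returns 1 on success, 0 if no such line was found. */
int find_mapping(const char *tag, char *out, size_t outlen)
{
    assert(tag != NULL && out != NULL && outlen > 0);
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return 0;                 /* no /proc available */
    char line[512];
    int found = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strstr(line, tag) != NULL) {
            snprintf(out, outlen, "%s", line);
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Printing the "[stack]" line from two separate runs of the same program should show different address ranges when ASLR is enabled.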
SIGSEGV
Another problem we face is that of memory protection. See this article: http://www.informit.com/articles/article.aspx?p=29961
"If an access that is not permitted is attempted, a page protection violation fault is raised. When this happens, the kernel responds by sending a segmentation violation signal (SIGSEGV) to the process." (From the article linked above.)
For example if we forge ahead and try scanning memory using this snippet of code:
The code will exit with a Segmentation fault.
A Workaround
One workaround that I discovered from my research is to have the system try to access the protected memory on our behalf via a system call. That way we can find out whether we have access without causing our program to exit in error.
In order to prove that we know what is happening I decided to use a different syscall for this demonstration. Since we are familiar with the Read system call from our earlier shellcoding, let's try using that.
Okay so an example using the Read syscall.
man read
First we set up to read input from standard in and then point to the stack as our buffer.
After executing this code in GDB, it pauses for our input. I hit the space bar and then pressed the enter key. We can see the number of characters read returned in the RAX register.
On the stack we can see a before and after picture that shows the hex values of our keys that we pressed.
With this next section of code we change RSI to point at the memory address 4096, and try our syscall again.
After executing the second Read syscall, we see the following return value in RAX.
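That is the value we were looking for. It indicates that the system call hit the memory fault on our behalf: instead of our process receiving a SIGSEGV, the call fails and returns an error code (EFAULT, see man read).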
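The experiment above can be reproduced from C without a debugger. This is a minimal sketch, not the code from the post: the function name probe_with_read is hypothetical, and it reads from a pipe instead of standard in so that it never waits for a key press. Note that the C library hides the raw return value seen in RAX: on failure the wrapper returns -1 and stores the error in errno. On x86_64 Linux EFAULT is 14, so the raw value would be -14.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Asks the kernel to read 1 byte from a pipe into `dst`.
 * Returns 0 if the kernel could write there, otherwise the errno
 * value (EFAULT when `dst` is not accessible). */
int probe_with_read(void *dst)
{
    int fds[2];
    if (pipe(fds) != 0)
        return errno;
    char byte = ' ';                  /* stands in for the space bar */
    ssize_t w = write(fds[1], &byte, 1);
    assert(w == 1);                   /* an empty pipe has room for 1 byte */
    (void)w;
    errno = 0;
    ssize_t n = read(fds[0], dst, 1);
    int err = (n == 1) ? 0 : errno;   /* capture before close() runs */
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

Calling it with the address of a local buffer should return 0 and leave the space character in the buffer, while calling it with address 4096 should return EFAULT and the program keeps running.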
Access Syscall
man access
int access(const char *pathname, int mode);
"The mode specifies the accessibility check(s) to be performed, and is either the value F_OK, or a mask consisting of the bitwise OR of one or more of R_OK, W_OK, and X_OK. F_OK tests for the existence of the file. R_OK, W_OK, and X_OK test whether the file exists and grants read, write, and execute permissions, respectively."
On my system I found the above values in this file:
/usr/include/unistd.h
Registers used for a system call:
RAX - System Call Number
RDI - First Argument
RSI - Second Argument
RDX - Third Argument
R10 - Fourth Argument
R8 - Fifth Argument
R9 - Sixth Argument
The system call numbers on my system are listed in this file:
"/usr/include/x86_64-linux-gnu/asm/unistd_64.h"
So for the Access syscall:
RAX = 21
RDI = possibly off limits memory address
RSI = F_OK (0), or a bitwise OR of X_OK (1), W_OK (2) and R_OK (4)
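The register setup above corresponds to the C call access(addr, F_OK). As a hedged illustration (the function name can_read is hypothetical and this is not the egg hunter itself), a memory probe built on it might look like this:

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Returns 0 if access() failed with EFAULT, meaning the kernel could
 * not read memory at `addr`. Any other outcome (typically ENOENT,
 * because the bytes there are not a real file name) means the memory
 * was readable, so 1 is returned. */
int can_read(const void *addr)
{
    assert(F_OK == 0);   /* a zeroed second argument means F_OK */
    errno = 0;
    int rc = access((const char *)addr, F_OK);
    if (rc == -1 && errno == EFAULT)
        return 0;
    return 1;
}
```

One caveat of this approach: access() treats the address as a NUL-terminated path, so the kernel keeps reading until it finds a zero byte. A readable address can therefore still report EFAULT if the string runs into an unreadable page before a zero byte appears.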
EggHunter1434.c
The main thing to notice in this file is that we copy the Shellcode that we are looking for onto the heap, so that the hunter can't just find it on the stack as it could with our last egg hunter.
EggHunter1434.nasm
Details
The section of code between the labels "_start" and "nextpage" is just the initialization area.
We zero RDI and RSI. Note that RSI is the second argument to the Access syscall and we are zeroing it to represent the F_OK value.
We put the number 4 into rbx on lines 37 and 38 because we need to use it to point RDI 4 bytes ahead of where we are trying to read for the Access syscall. Trying to use the literal number 4 in the math on line 42 generates NULLS.
The section starting at the label "nextpage" is all about advancing RDI to point at the next page in memory, assuming pages are 4096 bytes in size.
The section starting at the label "nextaddr" is the busiest section of our code.
Line 42. We test our access to memory that is 4 bytes beyond the address we are scanning, because scasd reads 4 bytes at a time and we need to make sure we have permission to read all of them. That means changing RDI, so we first back it up by pushing it onto the stack.
Line 44 is where we point RDI to 4 bytes ahead.
Line 45 and 46 is loading the system call number for Access (21) into RAX.
Line 47 is the Access syscall.
Line 48, before doing anything we need to restore our RDI value from the stack. The danger is that in a tight loop of checking access we could build up a lot of values on the stack if we don't keep the pushing and popping balanced.
Line 49, we can identify the error value which indicates that the syscall could not access the memory (where a direct read would have caused a SIGSEGV) simply by examining the value of AL.
Line 50, we jump back to nextaddr if our comparison on Line 49 determined that we can't read the memory pointed to by RDI. Otherwise, we can scan this memory so move on to line 51.
Line 51, this is where we are loading our Egg into the RAX register in preparation for the next scasd.
Line 52, scasd compares the value in EAX with the 4 bytes pointed to by RDI and advances RDI by a value of 4.
Line 53, if the values do not match jump back to testing whether we can access the next 4 bytes of memory. Otherwise we move on to testing the next 4 bytes for our second EGG.
Line 54. Okay at this point, it is feasible that if we did not choose a unique enough EGG value we could be risking trying to read an area of memory that would cause a SIGSEGV. But assuming that our EGG is unique (or unlikely) enough we hazard the chance and read the next 4 bytes without testing whether we can access them first.
Line 55, We did not find our second egg so jump back to test the next address, or we did find it and all there is left is to execute our Shellcode.
Line 56, JMP to our Shellcode, which is now pointed to by RDI, and start executing it.
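The walk-through above can be summarised as a bounded C model. It is a sketch under stated assumptions rather than a translation of EggHunter1434.nasm: the names next_page, probe_addr and hunt are hypothetical, the page arithmetic (addr | 0xfff) + 1 is a common way to reach the next 4096-byte page and may differ from the author's instructions, and the model stops at an explicit end address whereas the real hunter has no upper bound.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SIZE_ASSUMED 4096u   /* the post assumes 4096-byte pages */

/* Start of the page after the one containing addr. */
static uintptr_t next_page(uintptr_t addr)
{
    return (addr | (PAGE_SIZE_ASSUMED - 1)) + 1;
}

/* 1 if the kernel could read at addr, 0 if access() reported EFAULT. */
static int probe_addr(uintptr_t addr)
{
    errno = 0;
    return !(access((const char *)addr, F_OK) == -1 && errno == EFAULT);
}

/* Hunts [start, end) for two consecutive eggs. Returns the address
 * just past them, or 0 if the range is exhausted. */
uintptr_t hunt(uintptr_t start, uintptr_t end, uint32_t egg)
{
    assert(start % 4 == 0 && start <= end);
    uintptr_t addr = start;
    while (addr + 8 <= end) {
        if (!probe_addr(addr + 4)) {    /* test 4 bytes ahead */
            addr = next_page(addr);     /* unreadable: skip the page */
            continue;
        }
        uint32_t v;
        memcpy(&v, (const void *)addr, 4);
        addr += 4;                      /* first scasd */
        if (v != egg)
            continue;
        memcpy(&v, (const void *)addr, 4);
        addr += 4;                      /* second scasd */
        if (v != egg)
            continue;
        return addr;                    /* the shellcode would start here */
    }
    return 0;
}
```

The model makes one syscall per 4 bytes scanned, which is the main cost of this design: the probe buys safety from SIGSEGV at the price of a much slower scan than the stack-only hunter.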
Testing
Below is an image depicting a test run using the password protected TCP Bind Shell:
EggAll
As you can see in the testing graphic above, the EggAll script is used to create the EggHunter1434.c file. As before, put this script together with the other helper scripts for easy access.
The EggAll script will clobber any existing EggHunter1434.c file in the current directory. So if you have made important changes to the file be sure to copy it somewhere safe.
Summary
Egg hunting is the process of finding a larger shellcode in some unknown place in memory and executing it using a smaller special purpose shellcode. We have demonstrated 2 egg hunter shellcodes, one that searches the stack and one that can search all of memory. I have created scripts that can be used to easily configure each of the egg hunter shellcodes that I created, in order to work with different payloads.
If you wish to learn more about assembly language, I highly recommend the "SecurityTube Linux Assembly Expert" course and certification.
http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert/