Shellcode is a sequence of bytes that encodes machine instructions. Note that these bytes are not assembly source; they are the raw opcodes that assembly mnemonics stand for. For example, 0x90 is the hexadecimal encoding of the instruction ‘nop’. Shellcode and malware have a long history together, and historically shellcode was used to spawn a shell on the infected system. Over time, however, the capabilities of shellcode have grown drastically, and malware authors have found new ways to obfuscate their code while increasing its impact by including shellcode. A basic example of shellcode, which targeted a popular IRC client, is shown below as a hexadecimal representation of assembly instructions. More details for this exploit can be found here.
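As a side illustration of the bytes-versus-instructions point (this is not the IRC exploit shellcode), here is a minimal Python sketch that maps a handful of single-byte 32-bit x86 opcodes to their mnemonics. The table and the describe() helper are made up for this illustration. Real x86 instructions are variable-length, so a real disassembler does far more than a one-byte lookup.

```python
# Toy lookup table: a few single-byte 32-bit x86 opcodes and their mnemonics.
OPCODES = {
    0x90: "nop",
    0x58: "pop eax",
    0x5E: "pop esi",
    0xC3: "ret",
    0xCC: "int3",
}

def describe(raw: bytes) -> list:
    """Map each byte to a mnemonic, or show it as a raw data byte if unknown."""
    return [OPCODES.get(b, f"db 0x{b:02x}") for b in raw]

print(describe(b"\x90\x90\x58\xc3"))  # ['nop', 'nop', 'pop eax', 'ret']
```

The same four bytes can therefore be written either as "\x90\x90\x58\xc3" or as four mnemonics; both describe the same machine code.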
Because shellcode has to fit into and run from a very small area of memory, its main purpose is very often just to download an additional malware component, which then runs as full-fledged malware. Shellcode thus often helps drop or download the next malicious component in the infection chain. Shellcode also needs to obtain information about its environment, such as which Windows API calls it can make.
Fortunately, we do not need to pick up the Intel manual, or any other architecture manual, and map each value to its corresponding instruction, since debuggers and disassemblers know how to interpret these opcodes. For example, below is a view of the above shellcode in radare2:
As stated above, shellcode has a very small buffer to live in, and it needs to know about its environment, including the addresses of local data and variables, so that it can use them. But how can the attacker know where the shellcode resides in the target process's memory?
Shellcode can determine the address where it resides by looking at the EIP register, because EIP (the instruction pointer) stores the address of the next instruction to be executed. The next question is how shellcode can use this register value, since EIP cannot be read directly. Shellcode developers can use a technique like:
E8 00000000 CALL $+5
58 POP EAX
This pair of instructions uses a CALL whose target is zero bytes away, that is, the instruction immediately after the CALL. What will that do? As soon as the CALL instruction executes, it pushes the return address (the value of EIP, which points at the instruction following the CALL) onto the top of the stack. The POP then retrieves that value from the top of the stack and puts it into EAX.
This pattern of a CALL immediately followed by a POP is easily spotted by antimalware solutions, so shellcode developers normally tweak the above logic by adding more instructions. For example, look at the instruction sequence below:
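The CALL/POP trick can be modelled in a few lines of Python. This is only a sketch of the arithmetic, not an emulator: the function name call_pop_getpc and the load address 0x00401000 are made up for illustration, and the stack is a plain Python list.

```python
import struct

def call_pop_getpc(call_addr: int, code: bytes) -> int:
    """Model CALL rel32 followed by POP EAX; return the value left in EAX."""
    assert code[0] == 0xE8, "expected CALL rel32"
    rel32 = struct.unpack_from("<i", code, 1)[0]   # signed 32-bit displacement
    stack = []
    return_addr = call_addr + 5                    # CALL rel32 is 5 bytes long
    stack.append(return_addr)                      # CALL pushes the return address
    eip = (return_addr + rel32) & 0xFFFFFFFF       # target is relative to the next instruction
    assert code[eip - call_addr] == 0x58, "expected POP EAX at the call target"
    eax = stack.pop()                              # POP EAX takes the saved address
    return eax

code = bytes.fromhex("E8 00000000 58")
print(hex(call_pop_getpc(0x00401000, code)))       # 0x401005
```

With a displacement of zero, EAX ends up holding the address of the POP itself, which gives the shellcode a fixed reference point inside its own body.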
00000017 EB 03 JMP SHORT 0000001C
00000019 5E POP ESI
0000001A EB 05 JMP SHORT 00000021
0000001C E8 F8FFFFFF CALL 00000019
00000021 83 C6 03 ADD ESI,3
The first instruction is a JMP SHORT with opcode EB, followed by an operand giving the relative offset in bytes to add to EIP. What do you think the next instruction will be? Will it be the one at 0000001A? No, it will be the instruction at 0000001C, because EIP already holds 00000019 (the address of the instruction after the two-byte JMP) and adding 3 bytes takes it to 0000001C. The instruction at 0000001C calls 00000019, but before transferring control it pushes its return address, 00000021, onto the top of the stack. The instruction at 00000019 then POPs that value into the ESI register. The next instruction, at 0000001A, jumps to 00000021, where the shellcode continues executing with ESI holding a known address inside itself.
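To check the walkthrough, the listing can be traced with a tiny interpreter that understands only the four instruction forms it uses. This is a sketch under assumptions: the CALL displacement F8FFFFFF (-8) and the ADD ESI,3 encoding 83 C6 03 are computed from the addresses in the listing rather than taken from it, and the trace() helper is made up for illustration.

```python
import struct

BASE = 0x17  # address of the first instruction in the listing
# Bytes for 0x17..0x23; CALL displacement and ADD encoding are reconstructed.
CODE = bytes.fromhex("EB03 5E EB05 E8F8FFFFFF 83C603")

def trace(code: bytes, base: int, max_steps: int = 16):
    """Follow control flow; return the visited addresses and the final ESI."""
    eip, esi, stack, visited = base, None, [], []
    while base <= eip < base + len(code) and len(visited) < max_steps:
        visited.append(eip)
        off = eip - base
        op = code[off]
        if op == 0xEB:                               # JMP SHORT rel8
            rel8 = struct.unpack_from("<b", code, off + 1)[0]
            eip = eip + 2 + rel8                     # relative to the next instruction
        elif op == 0xE8:                             # CALL rel32
            rel32 = struct.unpack_from("<i", code, off + 1)[0]
            stack.append(eip + 5)                    # push return address
            eip = eip + 5 + rel32
        elif op == 0x5E:                             # POP ESI
            esi = stack.pop()
            eip += 1
        elif code[off:off + 2] == b"\x83\xC6":       # ADD ESI, imm8
            imm8 = struct.unpack_from("<b", code, off + 2)[0]
            esi = (esi + imm8) & 0xFFFFFFFF
            eip += 3
        else:
            raise ValueError(f"unhandled opcode 0x{op:02x} at 0x{eip:x}")
    return visited, esi

visited, esi = trace(CODE, BASE)
print([hex(a) for a in visited], hex(esi))
# ['0x17', '0x1c', '0x19', '0x1a', '0x21'] 0x24
```

The trace visits 0x17, 0x1C, 0x19, 0x1A and 0x21 in that order, and ESI holds 0x21 after the POP and 0x24 after the ADD.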
There are other scenarios in which first-stage shellcode has to search memory for second-stage shellcode. This technique is called egg hunting. It is useful to attackers when the buffer available for the shellcode is very small and can only hold code that locates the second-stage shellcode, which can be placed in a larger buffer.
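The search itself can be sketched in Python over a byte buffer that stands in for process memory. Everything here is illustrative: the 4-byte tag W00T, the placeholder payload and the hunt_egg() helper are assumptions, and a real egg hunter must also walk memory page by page and avoid touching unmapped addresses, which this sketch ignores.

```python
def hunt_egg(memory: bytes, tag: bytes) -> int:
    """Return the offset of the first byte after the doubled tag, or -1."""
    egg = tag * 2                    # tag is commonly doubled so a single stray copy is not matched
    pos = memory.find(egg)
    return -1 if pos < 0 else pos + len(egg)

TAG = b"W00T"                        # hypothetical 4-byte marker
second_stage = b"\x90\x90\xcc"       # placeholder bytes standing in for the payload
memory = b"\x00" * 32 + TAG + b"junk" + TAG * 2 + second_stage

offset = hunt_egg(memory, TAG)
print(offset, memory[offset:offset + 3].hex())   # 48 9090cc
```

The lone copy of the tag at offset 32 is skipped, and the hunter lands on the first payload byte right after the doubled tag.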
During execution, shellcode needs certain Windows APIs to perform its functions, but the libraries that export them are not necessarily loaded in memory. What helps shellcode is that a common DLL, kernel32.dll, is usually loaded into the process; shellcode can use its LoadLibraryA to load other libraries and GetProcAddress to find the address of an exported function. You may now be wondering how shellcode finds kernel32.dll in the first place. The answer lies in a structure called the Process Environment Block (PEB), which holds details of the process, including information about loaded DLLs. But how do we find the PEB? I hope the way we are working backward toward kernel32.dll is not confusing. On 32-bit Windows, the FS segment register points to the Thread Information Block (TIB), and the TIB holds a pointer to the PEB at offset 0x30. This is the address we were looking for. To load the PEB address, all we have to do is move FS:[0x30] into a register. After the move, EAX holds the address of the PEB, which can then be parsed to find kernel32.dll. The instruction can be as simple as:
MOV EAX,DWORD PTR FS:[30h]
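What that MOV does amounts to reading a 32-bit little-endian pointer at offset 0x30 of the TIB. The Python sketch below does the same read against a fake, zero-filled TIB buffer; the buffer, the PEB address 0x7FFDE000 and the read_peb_pointer() helper are all made up for the demonstration and do not touch any real process.

```python
import struct

PEB_OFFSET_IN_TIB = 0x30             # from the text: the TIB holds the PEB pointer here

def read_peb_pointer(tib: bytes) -> int:
    """Read the 32-bit little-endian PEB pointer out of a TIB image."""
    return struct.unpack_from("<I", tib, PEB_OFFSET_IN_TIB)[0]

fake_peb_address = 0x7FFDE000        # made-up value for the demo
tib = bytearray(0x40)                # zero-filled stand-in for a TIB
struct.pack_into("<I", tib, PEB_OFFSET_IN_TIB, fake_peb_address)

print(hex(read_peb_pointer(bytes(tib))))   # 0x7ffde000
```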
Once kernel32.dll is found, shellcode can use GetProcAddress to find the address of any API function it needs. There is also a way to resolve APIs without calling GetProcAddress: parsing the DLL's export directory table and comparing the ASCII names of the exported functions against the names it needs to resolve (usually a hash of each name is compared instead, to save even more buffer space).
These are the shellcode basics one should be aware of: how shellcode operates and the various patterns it follows to find itself in memory, locate the APIs it needs, and so on.
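Hash-based resolution can be sketched as follows. The text does not say which hash is used, so this example assumes a rotate-right-by-13 additive hash, one scheme often seen in public shellcode; the export list is a small hand-written stand-in for a real export directory, and the helper names are made up.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def ror13_hash(name: str) -> int:
    """Assumed scheme: rotate the running hash right by 13, then add each byte."""
    h = 0
    for ch in name.encode("ascii"):
        h = ror32(h, 13)
        h = (h + ch) & 0xFFFFFFFF
    return h

def resolve_by_hash(export_names, wanted_hash):
    """Walk the export names and return (index, name) of the first hash match."""
    for index, name in enumerate(export_names):
        if ror13_hash(name) == wanted_hash:
            return index, name
    return None

exports = ["ExitProcess", "GetProcAddress", "LoadLibraryA", "WinExec"]
wanted = ror13_hash("LoadLibraryA")      # shellcode would carry only this 4-byte value
print(resolve_by_hash(exports, wanted))  # (2, 'LoadLibraryA')
```

Storing a 4-byte hash instead of a 12-byte name is where the space saving comes from; in real shellcode the matching index is then used to look up the function's address in the export tables.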