23

MMD-0064-2019 - Linux/AirDropBot

 4 years ago
source link: https://www.tuicool.com/articles/NVv2Yvy
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Prologue

There are a lot of botnet aiming multiple architecture of Linux basis internet of thing, and this story is just one of them, but I haven't seen the one was coded like this before.

Like the most of other posts on our analysis reports in MalwareMustDie blog, this post was started from a request from a friend to take a look at a certain binary that was having low (or no) detection and at that time hasn't been categorized into a known threat ID.

This time I decided to write the report along with my style on how to reverse engineering its sample, in MIPS architecture.

So I was sent with this MIPS 32bit binary ..

cloudbot-mips: ELF 32-bit MSB executable, MIPS, MIPS-I version 1 (SYSV), statically linked, stripped

..and according to its hash it is supposed to be a Mirai-like, (thank's to good people for the uploading the sample to VirusTotal)

dxOB7ZuVg-fBrRmstt3EmVsaleV_-P9L6HxsIHK5w3puttv_w8pBzUM3sao1qFynjKVehsyIqcuW3dtU-sfUmA3l6TXGot255o7mfQ_tKl2KDS3XZ6s9_8vTKbVwK81VjsnENa4y32U=w300-h810-no

..and this is just one of a series of badness, my honeypots, OSINT and a given tips was leading me into 26 types of samples that is meant to pwned series of internet of things running on Linux OS and the MIPS-32 one I received is just one of them. If you see the filenames you can guess some of those binaries are meant to aim specific IoT/router platforms and not only for several randomly cross-compiled architecture supported result. This type of binaries seem to be started appearing in the early August, 2019.

jYRbmuJ.png!web Below is the additional list of the compiled binaries that are meant to run on several non-Intel CPU running Linux operating systems, they can affect network devices like routers, bridges, switches, and the small internet of things that maybe we use already:

m68k-68xxx.cloudbot:   32-bit MSB Motorola m68k, 68020, version 1 (SYSV), statically linked
hnios2.cloudbot:       32-bit LSB Altera Nios II, version 1 (SYSV), dynamically linked
hriscv64.cloudbot:     64-bit LSB UCB RISC-V, version 1 (SYSV), dynamically linked
microblazebe.cloudbot: 32-bit MSB Xilinx MicroBlaze 32-bit RISC, version 1 (SYSV), statically linked
microblazeel.cloudbot: 32-bit LSB version 1 (SYSV), statically linked,
sh-sh4.cloudbot:       32-bit LSB Renesas SH, version 1 (SYSV), statically linked.
xtensa.cloudbot:       32-bit LSB Tensilica Xtensa, version 1 (SYSV), dynamically linked.
arcle-750d.cloudbot:   32-bit LSB ARC Cores Tangent-A5, version 1 (SYSV), statically linked.
arc.cloudbot:          32-bit LSB ARC Cores Tangent-A5, version 1 (SYSV), dynamically linked.

(The hashes are in the "Hashes" section of this post)

Binary Analysis

Since I was asked to look into the MIPS sample so I started with it. The binary analysis is showing a striping result, but we can still get some executable section's information, compiler setting/trace that's showing how it should be run, and some information regarding of the size for the section/program headers, but it's too few isn't it? Still these are good clues for supporting dynamic analysis afterward. But I don't think I will go that far on the early stage of analysis with this new binaries, I love to solve stuff statically, as much as possible.

FZ7bq2R.png!web

For file attributes I extracted using Tsurugi DFIR commands which are also not showing special data too, except of what has been recorded into the infected box. I was taking the checks further I run some several ELF pattern I have with Yara rules and ClamAV signature to match it to see what binary is having, and it is only to make me understand why several false-positive results came up. The malware yet is having several interesting strings but they are too generic to be used to identify the threat without reading its assembly further.

So my practical binary analysis for this MIPS binary is going to be it, nothing much.

Some methods on MIPS-32 static analysis to dissect this sample with radare2:)

So this is the fun part, the binary analysis with radare2 ;). no cutter GUI, no fancy huds, just an old-schooler way with command line, visual mode and graph in a r2shell .

I think there is really no such precise step by step "cookbook" on how to to use radare2 during analyzing something, and basically radare2 is enriched in design for any kind of users to use it freely, once you get into it you'll just get use to it, and before you know it you are using it forever.

But first, let's make sure you are setting"mips" and "32" in radare2 environment of assembly architecture (arc) and bits for this binary, then try to recognize the "main function", which is in "0x4016a0 a0" at the pattern/location that's different than Intel basis assembly like shown in the picture below:

yAjMBzR.png!web

Next, I may just run following commands to be sure that it can be reversed well. It is a simple command for only showing how many Linux syscall is used, and this will work after the radare2 parse and analyze the binary to the analysis database.

7RJJFzr.png!web

PS: If you know what you're doing, a more simple way for the MIPS 32bit to seek where the syscall codes placed is by grepping the assembly code with the hex value of "0x0000000c" like below, the same result should come up:

6fQz2aV.png!web

In my case on dealing with Linux or UNIX binaries, I have to know first what syscalls are used (that kernel uses for making basic operations), "syscall" is used to request a service from kernel. Any good or bad program are using those (if they need to run on that OS), so syscalls have to be there. For me, the syscalls is important and its amount will tell you how big the work load will be, ..then the rest is up to you and radare2 to extract them, the more of those syscalls, the merrier our RE life will be, without knowing these syscalls there's no way we can solve such stripped binary :)

In a Linux MIPS architecture, where assembly and register is different than Intel one (MISP is RISC, Intel is CISC). Linux OS in MIPS platform can be configured to run either in big or in little endian mode too, you have to be careful about that in reversing MIPS, like this MIPS binary is using little endian, but i.e. binaries for SGI machines are on big endian. In MIPS the way "syscall" used is also have its own uniqueness. Basically, a designated service code for a syscall must be passed in $v0 register, and arguments are passed in other registers. A simple way in assembly code to recognize a syscall is as per below snipcode:

li $v0, 0x1
add $a0, $t0, $zero
syscall

Explanation: The "0x1" is stored in the "&v0" (it doesn't have to be assembly command "li" but any command in MIPS assembly in example "addliu", etc, can be used for the same effect), which means the service code used to print integer. The next line is to perform a copy value from the register "$t0" to "$a0" (register where argument is saved).

Finally (the third line) the syscall code is there, with these components altogether one "syscall" can be executed.

We can apply the above concept in the previously grep syscall result. The objective is to recognize the address of its syscall wrapper function for this stripped binary analysis purpose. For example, at the second result at "0x004019d0" there's a syscall code, and by radare2 you go to that location with seek (s) command and using visual mode we can figure the function name in no time. I will show you how.

Let's fix the screen for it as per below so we can be at the same page: Ur2qMjB.png!web

I marked the line where it is assigning "0xfa2" value to "$v0", and "0xfa2" is the code for "fork" syscall, the function that's having that syscall code, if you scroll up a bit you can see the function name "fcn.004019a0", which is the "wrapper function" for this "syscall fork".

The manual of syscall [ link ] is a good reference explaining syscall wrapper in libc. Quoted:

Usually, system calls are not invoked directly: 
instead, most system calls have corresponding C library wrapper 
functions which perform the steps required (e.g., trapping to kernel 
mode) in order to invoke the system call.  

Thus, making a system call looks the same as invoking a
normal library function.

In many cases, the C library wrapper function does nothing more than:

*  copying arguments and the unique system call number to the
   registers where the kernel expects them;

*  trapping to kernel mode, at which point the kernel does the real
   work of the system call;

*  setting errno if the system call returns an error number when the
   kernel returns the CPU to user mode.

However, in a few cases, a wrapper function may do rather more than
this, for example, performing some preprocessing of the arguments
before trapping to kernel mode, or postprocessing of values returned
by the system call.  Where this is the case, the manual pages in
Section 2 generally try to note the details of both the (usually GNU)
C library API interface and the raw system call.  Most commonly, the
main DESCRIPTION will focus on the C library interface, and
differences for the system call are covered in the NOTES section.

Using this method, in no time you'll get the full list of the syscall function's used by this malware as per following table that I made for myself during this analysis:

eiuQNjV.png!web The rest is up to you on how to make it easy to name the strings for each "syscall" for your purpose, I go by the above strings naming since it is fit to my RE platform, I suggest you refer to Linux syscall base on naming them [ link ].

The next step is, you may need to change all function name in radare2 according to this "syscall table". Using the visual mode and analyze function name (afn) command is the faster way to do it manually, or you can script that too, radare2 can be used with varied of methods, anything will do as long as we can get the job's done. In my case I like to use these radare2 shell macro based on table I made for myself:

:
s 0x0402060; af; afn ____connect; pdf |head 
s 0x0401CF0; af; afn ____write; pdf |head 
s 0x04019B0; af; afn ____fork; pdf |head 
    :

The result is as per seen in the below screenshot:

zyiYbi3.png!web

Up to this way, we'll have all of the syscalls back in place :) Don't worry, you'll do this faster if you get used to it. FrmQjmB.png!web

The result looks cool enough for me to read the radare2 graph on examining how this MIPS binary further goes.. NFjI3mM.png!web

The next step is a generic way on reversing a stripped binary, by defining the functions that is not part of Libc but likely coded by malware coder. For this task, you have to check the rest of the function and seek whether the XREF doesn't go to any of syscall wrapper functions, nor main(), init_proc() or init_term() functions, and that goes to the below leftover list, just naming it to anything you think it is fit with to what it does.

In my case I named them this way:

r05PVYUL0HuujtiZStkvwRX7XnMm6y_7fGTgOFftCf9rhA6dhjfRjlkQq83uMMxbbuZYIRjWP1ruNiqamlgVYx4zvHAu7SGWiHLA1jZUT71_PZejKZfoBrFIcPvEPqCfG6TWNWKJfW0=w500-h338-no Then we can put the correct function name into the binary using the same macro I showed you previously, then we are pretty much completed in making this binary so readable... hold on, but read it from where? Where to start?

To pick a good place to start to start reversing, this command will help you to pick some juicy spots, all the extractable strings will be dumped and we can pick one interesting one to start, and go up to build the big picture.:)

Actually symbols are giving us much better options, but right now we don't have anything else that is readable enough to start..

NkLmUQYR3h3rWhMvTQxdFdvcLM7icFVoHT27xvS_1IfQROjAUWJz0SrJv8PTsPyBYx_4Mouj6J4FCirSk4dt_dGwbnhMOy1PahpEUzccYyxRKYX2AaPBt04qs9T9u_9Ya-zf5IfFiZk=w580-h852-no You can start to trace this binary from these text address reference and then go up to the call in the main function that supports it. For example, by using the visual mode you can seek the XREF of each text to see how it is called from which function and you can trail them further after that. This isn't going to be difficult to read since you have all functions back in place.

The picture below is showing how the " air dropping " is referred to the caller function.

3Ufm2qZ.png!web

That's it. These methods I shared are useful methodology in analyzing Linux MIPS-32 binary especially stripped ones like the one I have now. I think you're good enough to go to complete your own analysis by yourself too. Please just tried those methods if you don't have any other better ways and don't be afraid if other RE tools can't make you read the MIPS-32 binary well, just fire the radare2 with the tips written above, and everything should be okay :)

We go on with the malware analysis of this binary and its threat then..

What does this MIPS-32 binary do?

Practically. the MIPS binary is bot that is having a mission to infect the host it was dropped into (note: so it needs a dropping scheme to go to the infected host beforehand), making a malicious process called " cloudprocess ", send message of " airdopping clouds " through the standard output (that can be piped later on). It is recording its "PID" and fork its process for the further step.

Upon successful forking it will extract the what the coder so-called " encrypted array ", it's ala Mirai table crypted keywords in its concept, but it is different in implementation., I must guess that it could be originally coded to avoid XOR operation which is the worst Mirai bug in the history :) but this " encrypt_array " is just ending up to an encoded obfuscation function :) - Anyhow the value from this "decrypted" coded is used for further malware process.

Then the malware tries to connect to the C2 which its IP address is hard-coded in the binary, on a success connection attempt to C2 server, it will parse the commands sent by the C2 to perform three weaponized functions on the binary to perform TCP, and UDP DDoS attack with either using the specific hex-coded payload, or the latter on is using a custom pattern so-called " hex-attack " that sends DoS packet in a hex escape strings format to the targeted host.

I will break it down to more details in its specific functions in the next sections.

The "encryption" (aka the obfuscation)

The challenge was the "encryption" part, it was I used radare2 with ESIL to see the "encrypted" variables, as per snipped below as PoC:

The decryption is by [shift-1] as per shown in the cascade loop shown in every encoded strings.

rInINrR.png!web

The result for the "decryption" can be shown as per below, using ESIL with the fake stack can be used to emulate this with the same result, so you don't need to get into the debug mode:

qM7RBzn.png!web The last four strings:

/proc/
/maps
/cmdline 
/status 
/exe

...are used for taking information (process name) from the infected Linux box, that will be used for the malware other functions like "killing" processes, etc. The other decrypted strings are used for infecting purpose (known credentials for telnet operation), and also for other botnet operation related.

The C2, its commands and bot offensive activity

What happened after decryption (encrypt_array) of these strings is, the binary gets into the loop to call the "connecting" function per 5 seconds. If I try to write C code based on this stage it's going to be like below snipcode:

7ixBFIfJIpl10-tFjuwNZnxM5uST7ui18PCXWrhJkyWsDwDmC3zqsi2U3Oh7KSaA5HXr_O80UuCTK3OEdcIhFG5aVCL42APA1QnMRRNjadXuZ9yVI2_hDmfQ-K73Il7IWZkFdR49EGc=w300-h448-no

Within each loop, when it calls "connecting" function it will try to connect the C2 which is defined a struct sockaddr "addr", pointing to port number (htons) 455 (0x1c7) and IP: "179.43.149.189".

YfuQrq2.png!web

When connected to C2, it will listen and receive the data sent by C2, to perform decryption and then to send its decryption result (as per previous logic) to the " command parsing " function, that's having " cmd_parse " sub-function inside. The " command parsing " is delimiting received command with the white space " " for the " cmd_parse " to grep three possible keywords of "udp" , "tcp" , and "hex" , which in next paragraph those keywords will be explained further.

Below is the loop when the command from C2 is received (listened) inside the "connecting" function in radare2:

zuaayan.png!web

Now we come into the offensive capability of this bot binary. The "udp" keyword will trigger the execution of " udpattack " function, "tcp" will execute " tcpattack " and so does the "hex" for executing the " hexattack " function.

TCP and UDP is having the same payload packet in binary is as per below:

n6bqqqv.png!web ...that is sent from tcpattack () and udpattack () in TCP and UDP different socket connection from the target sent by C2.

The hexattack is having a different payload that looks like this:

MrmABjV.png!web

Is there any difference between MIPS and other binaries?

Oh yes it has. The Intel and ARM version (or to binary that is having a scanner function) is interestingly having more functions. If I go to details on each functions for Intel binary maybe I will not stop writing this post, so I will summary them below with a pseudo code snips if necessary.

1. The " array_kill_list " function

This function is used to kill process that matched to these strings:

zyqY3eU.png!web

It seems this is how the bot herder gets rid of the competitor if they're in the same infected Linux box.

This " array_kill_list " is accessed from killer() function that is being executed before going to "connecting" loop in the main for Intel version.

The killer function is having multiple capability to stop unwanted processes too, it will be too long to describe it one by one but in simple C code and comments as per picture below will be enough to get the idea:

MRF7VvI.png!web

2. The scanner, the spreader via exploit

The bot herder is aiming Lynksys tmUnblock.cgi of a known router's brand, the vulnerability that has to be patched since published 5 years ago. For this purpose, in intel and ARM binaries right after killer() function it runs scanner() function which aiming TCP port 8080 with the code translated from assembly to C looks like below:

jTBJGvnSM7n9_bEUHmtphthEU3AlHGVptVEhtqBaBr_Wh72-EeDBLo2KxZnuqIr9R23_b4Pv11xs_5TdcWIapFgkYWLAmRB916509Zgm6dMQtioHWplwULRxViJ5Qg8wPbfGekwabjI=w580-h733-no

This scanner is having four pattern of payloads which I quickly paste it below for your reference if you are either receiving or researching this attack:

iai2eaE.png!web2MJJJjm.png!web

Maybe one of the thing that I may suggest for this bot's scanner functionality is what it seems like a spoof capability. I examined into low level for code generation of about this part and found what the send syscall performed when AirDrop bot make scanning with exploit is interesting :) please take a look yourself of what has been recorded as per below snipcodes:

QNB3UrF.png!web

On those "scanner" function supported binary, the spreading scheme is executed with targeting random generated IP addresses right after the the C2 has been attempted to call, and is using the same socket for multiple effort to infect Linksys CGI vulnerability. Below is the record in re-production this activity:

ZnEjamF.png!web

3. The " singleInstance " function

This is a code to make sure that there is no duplication of " cloudprocess " process that runs after a device getting infected. It's a simple code to kill -KILL the PID of detected double instance. You can easily reverse and examine it by yourself.Below is the example ARM-32 assembly code for this function with my comments in it just in case:

eeai6nN.png!web

for the right side of code, if I write that in C it's going to be something like this, more or less:

BVbQ7na.png!web

In a short summary as the conclusion

This binaries are a DoS bot clients, a part of a DDoS botnet. It spread as a worm with currently aiming Lynksys tmUnblock.cgi routers derived by non MIPS built binaries that infects machines to act as payload spreader too. I must warn you that I did not check the details in every 26 binaries came up during this investigation, but I think the general aspect is covered.

These are malware for Linux platform, it has backdoor, bot functions and are having infection capability with aiming vulnerability in routers CGI or telnet. The malware is coded with many originality intact, it is not Mirai, GafGyt (Qbot/Torlus base), or Kaiten (or STD like), but I can tell that the development is not mature yet. I was about to name it as "Cloudbot" but it looks like a legitimate software already using it so I switched to the "Airdropbot" instead for the hardcoded message printed on a success infection. This is a new strain of various library of IoT botnet, I hope that other security entities and law enforcer aware of what has just been occurred here, before it is making bigger damage like Mirai botnet did before.

Detection

Since the AirDropBot was developed to support many embed platform from various CPU type, to detect it precisely you may need to code several signature. However, if you see the binary carefully, so it is yes, one generic signature can be generated and applied. For that I PoC'ed it myself to develop a bit complex rules to detect them all and to recognize which binary is having the scanner and not.

The snippet code and scan example is as per screenshot below.

IBjYzyR.png!web

The bad actors and his gang are still at large and reading this blog post too :) , I am sorry I can not share the generic scanning code I made in here, but the screenshot is enough for fellow reversers to recognize the idea to detect these series. The rest is opsec.

Hashes

../bins/aarch64be.cloudbot    | 417151777eaaccfc62f778d33fd183ff
../bins/arc.cloudbot          | d31f047c125deb4c2f879d88b083b9d5
../bins/arcle-750d.cloudbot   | ff1eb225f31e5c29dde47c147f40627e
../bins/arcle-hs38.cloudbot   | f3aed39202b51afdd1354adc8362d6bf
../bins/arm.cloudbot          | 083a5f463cb84f7ae8868cb2eb6a22eb
../bins/arm5.cloudbot         | 9ce4decd27c303a44ab2e187625934f3
../bins/arm6.cloudbot         | b6c6c1b2e89de81db8633144f4cb4b7d
../bins/arm7.cloudbot         | abd5008522f69cca92f8eefeb5f160e2
../bins/fritzbox.cloudbot     | a84bbf660ace4f0159f3d13e058235e9
../bins/haarch64.cloudbot     | 5fec65455bd8c842d672171d475460b6
../bins/hnios2.cloudbot       | 4d3cab2d0c51081e509ad25fbd7ff596
../bins/hopenrisc.cloudbot    | 252e2dfdf04290e7e9fc3c4d61bb3529
../bins/hriscv64.cloudbot     | 5dcdace449052a596bce05328bd23a3b
../bins/linksys.cloudbot      | 9c66fbe776a97a8613bfa983c7dca149
../bins/m68k-68xxx.cloudbot   | 59af44a74873ac034bd24ca1c3275af5
../bins/microblazebe.cloudbot | 9642b8aff1fda24baa6abe0aa8c8b173
../bins/microblazeel.cloudbot | e56cec6001f2f6efc0ad7c2fb840aceb
../bins/mips.cloudbot         | 54d93673f9539f1914008cfe8fd2bbdd
../bins/mips2.cloudbot        | a84bbf660ace4f0159f3d13e058235e9
../bins/mpsl.cloudbot         | 9c66fbe776a97a8613bfa983c7dca149
../bins/ppc.cloudbot          | 6d202084d4f25a0aa2225589dab536e7
../bins/sh-sh4.cloudbot       | cfbf1bd882ae7b87d4b04122d2ab42cb
../bins/sh4.cloudbot          | b02af5bd329e19d7e4e2006c9c172713
../bins/x86.cloudbot          | 85a8aad8d938c44c3f3f51089a60ec16
../bins/x86_64.cloudbot       | 2c0afe7b13cdd642336ccc7b3e952d8d
../bins/xtensa.cloudbot       | 94b8337a2d217286775bcc36d9c862d2

Salutation & Epilogue

I would like to thank to @0xrb for his his cooperation and his persistence in trying to convince me this binary is interesting. It is interesting indeed, and as promise this is the analysis I did after work, writing this in 8hours more, non-stop. Thank's also for other readers who keep on supporting MMD, and as team, we appreciate your patience in waiting for our new post.

Thank you pancake and Radare2 teams who keep on making r2 the best RE tools for UNIX (All of the radare2 reversing was done in FreeBSD ), and also Tsurugi DFIR distro for your great forensics tools. For these security open source both frameworks I still keep on helping with tests and some of BSD bug reports to the r2team.

Okay, I will rest and will wordsmith some part of the post later, maybe I will add parts that I didn't have much time to write or paste now, or, to correct some stuff. In the mean time, enjoy the writing, please share with mention or using #MalwareMustDie hashtag. This post is a start for more posts to come.

A tribute to the newborn radare2 community in Japan "r2jp" , that we established in 2013 together with "pancake" on AVTokyo workshop in Tokyo, Japan.

BVz2AfN.png!web

Malware Must Die!


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK