Security + Data Science

Thursday, September 16, 2021

Fuzzing annocheck with AFL++ Part 1

In the last article, we fuzzed annocheck with radamsa. This found several crashes, but annobin-9.94 has cleaned up everything that radamsa can find. We're not done fuzzing yet. We need to try a coverage-guided fuzzer like AFL++ to see if there are more problems.

The strategy for using AFL++ is to build annocheck as a static application. We will start fuzzing it and make some adjustments based on how the initial runs go. Then we will let it run for several hours to see what it finds.

The first step is to download a copy of annobin-9.94 from here. Next, install and build the source rpm. Then cd into the annobin-9.94 source directory. If you need to set up an rpm build environment, there are steps here.

Next, we build annocheck as follows:

export PATH=/home/builder/working/BUILD/repos/git-repos/AFLplusplus:/home/builder/working/BUILD/repos/git-repos/llvm-project/build/bin:$PATH
export AFL_PATH=/home/builder/working/BUILD/repos/git-repos/AFLplusplus
CC=afl-clang-lto CXX=afl-clang-lto++ RANLIB=llvm-ranlib AR=llvm-ar LD=afl-clang-lto++ NM="llvm-nm -g" ./configure --without-gcc_plugin --without-tests --without-docs --enable-static --with-gcc-plugin-dir=`gcc -print-file-name=plugin`

The configure script errors out like this:

configure: creating ./config.lt
config.lt: creating libtool
checking for GCC plugin headers... no
configure: error: GCC plugin headers not found; consider installing GCC plugin development package

After much digging around, I decided that since we are making a static app and not worried about the gcc plugins, we'll just fix the tests to think everything is OK. Apply the following patch:

diff -urp annobin-9.94.orig/config/gcc-plugin.m4 annobin-9.94/config/gcc-plugin.m4
--- annobin-9.94.orig/config/gcc-plugin.m4      2021-08-31 10:02:12.000000000 -0400
+++ annobin-9.94/config/gcc-plugin.m4   2021-09-15 16:31:49.946843182 -0400
@@ -129,7 +129,7 @@ int main () {}
[gcc_plugin_headers=yes],
[gcc_plugin_headers=no])

-if test x"$gcc_plugin_headers" = xyes; then
+if test x"$gcc_plugin_headers" = xyes -o x"$static_plugin" = xyes; then
   AC_MSG_RESULT([yes])
else
   AC_MSG_RESULT([no])

Since we modified the m4 scripts, we need to regenerate the configure script. I used the autogen.sh file from the audit-userspace project as a convenience. Download it and do the following:

./autogen.sh
CC=afl-clang-lto CXX=afl-clang-lto++ RANLIB=llvm-ranlib AR=llvm-ar LD=afl-clang-lto++ NM="llvm-nm -g" ./configure --without-gcc_plugin --without-tests --without-docs --enable-static --with-gcc-plugin-dir=`gcc -print-file-name=plugin`

Now configure finishes correctly. We are now ready to build it. The first step is to export the instrument variable and then run make:

export AFL_LLVM_INSTRUMENT=NATIVE
make

This fails as follows:

afl-cc ++3.14c by Michal Zalewski, Laszlo Szekeres, Marc Heuse - mode: LLVM-LTO-PCGUARD
error: unable to load plugin 'annobin': 'annobin: cannot open shared object file: No such file or directory'
make[1]: *** [Makefile:442: annocheck-annocheck.o] Error 1

After much digging around, I found that the cause of this failure is a plugin option in annocheck_CFLAGS. Open annocheck/Makefile, go down to line 338, and remove the -fplugin=annobin option (the plugin named in the error above). Now run make again, and this time annocheck compiles.

Time to set up for fuzzing. We will use the same program that we created to fuzz using radamsa. Go check the last article if you need the recipe. Do the following:

cd annocheck
mkdir in
cp ~/test/hello in
mkdir -p /tmp/proj/out
ln -s /tmp/proj/out out
# Then we switch to root and do some house keeping to let this
# run as fast as possible
su - root
service auditd stop
service systemd-journald stop
service abrtd stop
auditctl -D
auditctl -e 0
echo core >/proc/sys/kernel/core_pattern
cd /sys/devices/system/cpu
echo performance | tee cpu*/cpufreq/scaling_governor
exit

Now it's time to fuzz!

afl-fuzz -i in -o out ./annocheck @@

And we get the following:

There are plenty of guides that tell you what each of these items means. Please search the internet and find one if you are curious. My main goal in this is to show you how to overcome many obstacles to get to the prize.

I'd like to point out two things. We are getting about 248 executions per second, which is kind of low. And we hit 2 crashes in 25 seconds. I hit Ctrl-C to exit so that we can take an initial look at the crashes. The directory structure of the out directory looks like this:

$ tree out
out
└── default
    ├── cmdline
    ├── crashes
    │   ├── id:000000,sig:06,src:000000,time:17826,op:havoc,rep:4
    │   ├── id:000001,sig:11,src:000000,time:23980,op:havoc,rep:4
    │   └── README.txt
    ├── fuzz_bitmap
    ├── fuzzer_setup
    ├── fuzzer_stats

The actual test cases are in the crashes directory. So, what do we actually do with these? The answer is that we build another copy of annocheck using the address sanitizer and pass these to the sanitized annocheck to see what happens.

The last article explains how to build a sanitized copy of annocheck. Just open the tarball in a different location so that you don't overwrite the AFL++ build of annocheck and compile it. I built it in a directory called annobin-9.94.test. You can do it anywhere, just correct the location.

Next, from the sanitized annocheck dir, we run:

$ ./annocheck /tmp/proj/out/default/crashes/id:000000,sig:06,src:000000,time:17826,op:havoc,rep:4
annocheck: Version 9.94.
Hardened: id:000000,sig:06,src:000000,time:17826,op:havoc,rep:4: FAIL: pie test because not built with '-Wl,-pie' (gcc/clang) or '-buildmode pie' (go)
Hardened: id:000000,sig:06,src:000000,time:17826,op:havoc,rep:4: FAIL: bind-now test because not linked with -Wl,-z,now
annocheck: annocheck.c:1250: find_symbol_in: Assertion `symndx == sym_hdr->sh_size / sym_hdr->sh_entsize' failed.
Aborted

Aborted? That stinks. Programs that call abort trick AFL++ into thinking this is a valid crash. This is because abort kills the process with SIGABRT (the sig:06 in the test case name), and a target dying from a signal is exactly what AFL++ is looking for. The solution is to override the abort call. The program depends on abort stopping execution, so removing it altogether would leave it running in code that the developer never intended. Since it's a false positive, we'll replace the call to abort with exit. So, let's go find it and recompile.

Grep'ing for abort doesn't find anything. But the error message says this occurs on line 1250 in annocheck.c. Opening that file finds an assert macro at that line. That means we need to override the assert macro with one that calls exit.

In annocheck/annocheck.h we find the include of assert.h around line 21. Comment that out using old style /* */ syntax. Next, a little farther down after all the includes, put the new macro:

/* Added this so that we don't trap abort in AFL++ */
#define assert(e) ((void) ((e) ? 0 : __assert (#e, __FILE__, __LINE__)))
#define __assert(e, file, line) ((void)printf ("%s:%u: failed assertion `%s'\n", file, line, e), exit(1), 0)

Now, save, run make clean, and make. One other thing to note: annocheck makes temp directories to work with. When it aborts or crashes, it cannot clean those up. For now, remove them by hand:

rm -rf annocheck.data.*

Back to fuzzing. We really do not need the previous runs. You can clear them out with:

rm -rf out/default

Actually, let's take a quick detour before going back to fuzzing. If you recall from above, I mentioned that the number of executions per second is kind of low. Let's fix that before we start fuzzing.

AFL++ has a mode of operation called deferred instrumentation mode. Normally each test run starts the application from scratch: the runtime linker resolves all symbols, configuration files are opened and parsed, and general initialization is done. In deferred mode, you insert code that marks a spot where all of that has completed, and AFL++ forks new copies from that spot. The placement matters. You need to do it after general startup and before the program reads the file with the fuzzing data in it.

So, if we open annocheck.c and locate the main function at line 1943, let's scan down looking for the place to put it. It checks some arguments, checks elf version, loads rpm's configuration (that takes some time), and then processes the command line arguments. This looks like the spot. Copy and paste this at line 1967 just above processing command line args:

#ifdef __AFL_HAVE_MANUAL_CONTROL
__AFL_INIT();
#endif

Let's make one more change. Annocheck creates a temporary directory while processing a file. Let's put that over in /tmp so as not to wear out a real hard drive. Look for a function named create_tmpdir. At line 1296 it copies a file name template into a buffer. Let's prepend "/tmp/" to that string. If you look a little further down, it calls concat and rewrites the tempdir variable with the current working directory as the path root. We don't want that. So, comment it out and the following line which outputs where the tempdir is. You have to use /* */ commenting on this program.

Save and exit. Rerun make. In another terminal, run the following script. It will periodically clear out the annocheck.data temp directories that aren't removed on a crash.

while true ; do sleep 30 ; rm -rf /tmp/annocheck.data.* ; done

And in the other terminal, from the annocheck directory, run this:

afl-fuzz -i in -o out ./annocheck @@

Now let's look at the executions per second:

Now we are getting about 4.5 times the speed. Using deferred instrumentation mode makes a huge improvement. The reason we want it to run faster is that the more executions per second, the more test cases it can try in the same amount of time. On very mature programs, it can take days of fuzzing to find even one bug.
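If you would rather read the speed from disk than from the status screen, the fuzzer_stats file shown in the tree listing above carries the same number. Here is a minimal sketch, assuming the file has one "name : value" line per statistic with a field named execs_per_sec. The two helper names are my own and are not part of AFL++.

```shell
#!/bin/sh
# execs_per_sec FILE: print the execs_per_sec value from an AFL++ fuzzer_stats
# file. Assumes "name : value" lines, one per statistic.
execs_per_sec() {
    awk -F: '$1 ~ /^execs_per_sec[ \t]*$/ { gsub(/[ \t]/, "", $2); print $2 }' "$1"
}

# speedup BEFORE AFTER: print AFTER / BEFORE rounded to one decimal place.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", after / before }'
}
```

Run execs_per_sec out/default/fuzzer_stats once before and once after the change, then hand both numbers to speedup to get the ratio.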

We'll stop here. You can let it run for a couple of hours. You will probably have over a hundred "unique" crashes to choose from. In the next article, I'll go over how to sort through all that efficiently. If you do get a collection, remember to copy them to persistent disk storage. The /tmp dir goes away when you reboot the computer.
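With that many test cases, replaying them by hand as we did above gets tedious. As a rough first pass, something like the following could run each saved case through the sanitized annocheck and keep the sanitizer output for later. The function name and the log directory layout are my own invention, so adjust to taste.

```shell
#!/bin/sh
# replay_crashes BINARY CRASHDIR LOGDIR
# Run BINARY on every AFL++ test case (files named id:*) in CRASHDIR.
# stderr of each run is saved to LOGDIR/<test case>.log and one line
# "<exit status> <test case>" is printed per case.
replay_crashes() {
    bin=$1; dir=$2; logs=$3
    mkdir -p "$logs" || return 1
    for tc in "$dir"/id:*; do
        [ -f "$tc" ] || continue          # the glob matched nothing
        name=$(basename "$tc")
        "$bin" "$tc" > /dev/null 2> "$logs/$name.log"
        # A status above 128 means the run was killed by a signal
        echo "$? $name"
    done
}
```

From the sanitized build directory, a call might look like replay_crashes ./annocheck /tmp/proj/out/default/crashes ~/crash-logs. The README.txt in the crashes directory is skipped because it does not match id:*.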

Wednesday, September 8, 2021

Fuzzing annocheck with Radamsa

Recently I heard that annocheck was crashing when scanning some files. That gave me an idea - fuzz it! I got the latest code in rawhide, which at the time was 9.93, built it using rpmbuild, and cd'ed into its source directory. Then:

make clean
CFLAGS="-fsanitize=address,undefined,null,return,object-size,nonnull-attribute,returns-nonnull-attribute,bool,enum -ggdb -fno-sanitize-recover=signed-integer-overflow" ./configure --without-tests --without-docs
make -j `nproc`

This sets it up for the address sanitizer so that we can spot fuzzing induced problems. Running make fails unexpectedly:

In function ‘follow_debuglink’,
    inlined from ‘annocheck_walk_dwarf’ at annocheck.c:1092:18:
annocheck.c:776:7: error: ‘%s’ directive argument is null [-Werror=format-overflow=]
776 |       einfo (VERBOSE2, "%s: try: %s", data->filename, debugfile); \
      |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
annocheck.c:805:7: note: in expansion of macro ‘TRY_DEBUG’
805 |       TRY_DEBUG ("%s", debug_file);
      |       ^~~~~~~~~
annocheck.c: In function ‘annocheck_walk_dwarf’:
annocheck.c:776:35: note: format string is defined here
776 |       einfo (VERBOSE2, "%s: try: %s", data->filename, debugfile);

This goes on and on. As I understand it, this is a bug in gcc and will be fixed in an upcoming release. However, what is stopping the build is the -Werror flag. So, you want to edit annocheck/Makefile and remove -Werror. Now running make will produce the binaries. Prior to working with annocheck, I wrote and compiled a little "Hello World" program in a test directory and left it unstripped. To prepare for fuzzing, I did this:

cd annocheck
mkdir in
cp ~/test/hello in/test
mkdir -p /tmp/out
ln -s /tmp/out out

Then I used a script similar to the one discussed in the Fuzzing with Radamsa article from a couple days ago.

#!/bin/sh
LOG="in/test"
TLOG="out/test"

while true
do
        cat $LOG | radamsa > $TLOG
        ./annocheck $TLOG >/dev/null
        rc="$?"
        if [ "$rc" = "1" ] ; then
                exit 1
        fi
        rm -f $TLOG
        echo "==="
done

The basic idea is to put a seed program into the "in" directory. Radamsa mutates it and writes it to the "out" directory, which is a symlink to /tmp/out. As explained in the previous article, you want to do fuzzing writes to a tmpfs file system so that you don't wear out real hardware. Running the script found this on the first test case:

hardened.c:1081:7: runtime error: null pointer passed as argument 1, which is declared to never be null
AddressSanitizer:DEADLYSIGNAL
=================================================================
==48860==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7ffb239405bb bp 0x7fff6413a7d0 sp 0x7fff64139f60 T0)
==48860==The signal is caused by a READ memory access.
==48860==Hint: address points to the zero page.
    #0 0x7ffb239405bb in __interceptor_strcmp.part.0 (/lib64/libasan.so.6+0x8d5bb)
    #1 0x42b9eb in interesting_sec /home/builder/working/BUILD/annobin-9.93/annocheck/hardened.c:1081
    #2 0x42b9eb in interesting_sec /home/builder/working/BUILD/annobin-9.93/annocheck/hardened.c:1074
    #3 0x40dd23 in run_checkers /home/builder/working/BUILD/annobin-9.93/annocheck/annocheck.c:618
<snip>

When I re-run the script, it stops on this error a lot. But if you keep restarting it, eventually you may get this one:

=================================================================
==49841==ERROR: AddressSanitizer: heap-use-after-free on address 0x603000035d40 at pc 0x7fd46885170c bp 0x7ffcf350afc0 sp 0x7ffcf350a770
READ of size 1 at 0x603000035d40 thread T0
    #0 0x7fd46885170b in __interceptor_strcmp.part.0 (/lib64/libasan.so.6+0x8d70b)
    #1 0x42637d in check_for_gaps /home/builder/working/BUILD/annobin-9.93/annocheck/hardened.c:3709
    #2 0x42637d in finish /home/builder/working/BUILD/annobin-9.93/annocheck/hardened.c:3889
    #3 0x42637d in finish /home/builder/working/BUILD/annobin-9.93/annocheck/hardened.c:3844
    #4 0x40e3f6 in run_checkers /home/builder/working/BUILD/annobin-9.93/annocheck/annocheck.c:691
    #5 0x40e3f6 in process_elf /home/builder/working/BUILD/annobin-9.93/annocheck/annocheck.c:1517
    #6 0x40f690 in process_file /home/builder/working/BUILD/annobin-9.93/annocheck/annocheck.c:1732
    #7 0x408880 in process_files /home/builder/working/BUILD/annobin-9.93/annocheck/annocheck.c:1890
    #8 0x408880 in main /home/builder/working/BUILD/annobin-9.93/annocheck/annocheck.c:1982
    #9 0x7fd467b36b74 in __libc_start_main (/lib64/libc.so.6+0x27b74)
    #10 0x409dad in _start (/home/builder/working/BUILD/annobin-9.93/annocheck/annocheck+0x409dad)

0x603000035d40 is located 0 bytes inside of 25-byte region [0x603000035d40,0x603000035d59)
freed by thread T0 here:
    #0 0x7fd468872647 in free (/lib64/libasan.so.6+0xae647)
    #1 0x412cfb in annocheck_get_symbol_name_and_type /home/builder/working/BUILD/annobin-9.93/annocheck/annocheck.c:1463

previously allocated by thread T0 here:
    #0 0x7fd46881d967 in strdup (/lib64/libasan.so.6+0x59967)
    #1 0x412d25 in annocheck_get_symbol_name_and_type /home/builder/working/BUILD/annobin-9.93/annocheck/annocheck.c:1466

SUMMARY: AddressSanitizer: heap-use-after-free (/lib64/libasan.so.6+0x8d70b) in __interceptor_strcmp.part.0

This is a good one. Use after free can be exploitable under the right conditions. I reported these to the annobin developer. He replicated the results, fixed it up, and released 9.94 to Fedora the next day. I checked it and can confirm that Radamsa finds no other problems.

Note that if you actually wanted to fix the bug, the test case is in out/test. Just make the patch, recompile, and manually run the test case to confirm it's fixed. If you would like to preserve the test cases for later, remember that if you shut down the system they are gone unless you moved them from out to a more permanent location.
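One way to avoid losing a test case is to copy it off the tmpfs as soon as the script stops. The helper below is only a sketch. The name save_case and the timestamped naming scheme are my own choices, not something from the fuzzing script above.

```shell
#!/bin/sh
# save_case FILE DESTDIR: copy FILE into DESTDIR under a unique name
# (original name + timestamp + shell pid) and print the new path.
save_case() {
    src=$1; dest=$2
    mkdir -p "$dest" || return 1
    new="$dest/$(basename "$src").$(date +%Y%m%d-%H%M%S).$$"
    cp -p "$src" "$new" && echo "$new"
}
```

For example, save_case out/test ~/annocheck-cases right after the fuzzing script exits keeps the case that triggered the stop.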

Radamsa is a good first fuzzer to reach for when starting to fuzz a new program. It's simple to set up and get running. And it finds all the low hanging fruit. But to go deeper, you need a guided fuzzer like AFL++. And that is exactly what we'll do in a future blog post.

Tuesday, September 7, 2021

Checking application hardening with annocheck

Gcc and glibc have multiple mitigations that are available to prevent certain kinds of exploits. The redhat-rpm-macros package contains the flags that are passed into the build environment when a package is built. If you wanted to see this, look in /usr/lib/rpm/redhat/macros. There is a _hardened_build macro that is defined to a 1. That pulls in _hardened_cflags, which pulls in _hardening_cflags and more and more macros.

However, people write their own build systems. They sometimes override environment variables. Or maybe the spec file is written in a way that cflags cannot be injected. How can we check to see if the intended flags were applied?

One possibility is to use the checksec.sh program. If you google around, you can find it. I have put a copy into the security-assessor github tools. To use it, we can pass a file or a pid. For example:

./checksec.sh --file /usr/sbin/auditd
RELRO       STACK CANARY   FORTIFY SOURCE  PIE          FILE              PACKAGE
Full RELRO  Canary found   Fortify found   PIE enabled  /usr/sbin/auditd  audit-3.0.6

Common Criteria calls out that applications should have stack smashing protection and ASLR. In the output of checksec.sh, these are the stack canary and PIE columns. If PIE is not enabled, then the application has some but not all parts randomized. Specifically, the code segment doesn’t move around. However, making an application fully use ASLR adds a new layer of indirection to the application. This becomes an attack point unless it’s made read-only at application start-up. This is what the RELRO column is talking about. What we want is full RELRO so that we have full ASLR and complete symbol resolution, with all indirection marked read-only. That leads to the question: how does checksec.sh determine that?

To detect whether stack smashing protection is enabled, we need to use readelf. What we can do is look in the symbol table. Stack smashing detection is done by placing a random number on the stack for each function call. On return, it's checked to see if it has changed. If it has, then the code calls the internal function __stack_chk_fail(). On recent Fedora, the binutils were changed to shorten function names. To see them accurately, you need to use the ‘W’ argument. So, to check for stack smashing protection you would do this:

readelf -sW /usr/sbin/auditd | grep __stack_chk_fail

To check for ASLR, we need to examine the ELF headers. One field is called Type. This says what type of ELF file it is. It can be an executable, dynamic, core, or object file. The dynamic type means that it's a shared object or a library file. However, there is almost no difference between a shared object file and a program that is compiled with PIE. The only difference might be that the application has a main function, but so does libc. The check for PIE ASLR would look something like this:

readelf -hW /usr/sbin/auditd | grep 'Type:[[:space:]]*DYN'

But the last item to check is whether we have full RELRO. All applications compiled on Fedora or RHEL automatically have partial RELRO. There was a patch applied to binutils that hardwires this. In order to have full RELRO, the program must be linked with the bind-now linker flag (-Wl,-z,now). The check for this is located in the dynamic section of the program. A test would look like this:

readelf -dW /usr/sbin/auditd | grep 'BIND_NOW'
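Putting the three checks together, a quick-and-dirty version of what checksec.sh reports might look like the sketch below. The function names are made up for illustration, and the patterns are the same greps shown above (using the glibc symbol __stack_chk_fail), so the result is only as reliable as those greps. The small helpers read readelf output on stdin, which makes them easy to try against saved output.

```shell
#!/bin/sh
# Each helper reads readelf output on stdin and succeeds if the marker is found.
has_stack_protector() { grep -q '__stack_chk_fail'; }
is_pie()              { grep -q 'Type:[[:space:]]*DYN'; }
has_bind_now()        { grep -q 'BIND_NOW'; }

# yes_no CMD: run CMD (which reads stdin) and print yes or no.
yes_no() { if "$@"; then echo yes; else echo no; fi; }

# quick_check FILE: print a one line summary for an ELF file.
quick_check() {
    f=$1
    ssp=$(readelf -sW "$f" | yes_no has_stack_protector)
    pie=$(readelf -hW "$f" | yes_no is_pie)
    now=$(readelf -dW "$f" | yes_no has_bind_now)
    echo "$f: stack-protector=$ssp pie=$pie bind-now=$now"
}
```

Calling quick_check /usr/sbin/auditd prints a single summary line for that file.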

Simple...right? Not so fast. Some of these tests are certain to give you a correct answer. For example, there is only one ELF header, so the Type check can give you a reliable answer. However, what about the stack smashing protection? All we can tell is that it’s enabled for at least one object file. We cannot tell if all object files were compiled with stack smashing protection. We also can’t tell if it's regular, strong, or full protection. And that goes for other hardening flags such as stack clash or control flow integrity. If checksec.sh is all we have, then we are reduced to looking for the build logs and verifying that every file got every intended flag.

A Better Mousetrap

This is why we have the annobin and annocheck programs. The annobin program is a gcc plugin that annotates build information in a notes section of each object file. The annocheck program can then read these note sections and reason about the build policy being faithfully carried out. To use it, all you need to do is pass the full path to the program to it. It will check dozens of things about the application. To see these, pass the --verbose flag. But what if we just wanted to recreate the 3 checks that checksec.sh does? We can turn all tests off and then enable the ones we want like this:

# annocheck --verbose --skip-all --test-stack-prot --test-pie --test-bind-now /usr/sbin/auditd | grep -v info:
annocheck: Version 9.79.
Hardened: /usr/sbin/auditd: PASS: pie test
Hardened: /usr/sbin/auditd: PASS: bind-now test
Hardened: /usr/sbin/auditd: PASS: stack-prot test

Based on this, it’s possible to write a script and check all files for stack smashing protection like so:

#!/bin/sh
DIRS="/usr/lib64 /usr/lib /usr/bin /usr/sbin /usr/libexec"
FLAGS="--skip-all --ignore-gaps --test-stack-prot"
for d in $DIRS
do
        if [ ! -d $d ] ; then
                continue
        fi
        echo "Scanning files in $d..."
        for f in `/usr/bin/find $d -type f 2>/dev/null`
        do
                # Get just the elf executables
                testf=`echo $f | /usr/bin/file -n -f - 2>/dev/null | grep ELF`
                if [ x"$testf" != "x" ] ; then
                        # Get results dropping version and first 2 fields
                        res=`annocheck $FLAGS $f 2>/dev/null | grep -v '^annocheck:' | cut -d " " -f 3-`
                        if [ x"$res" != "x" ] ; then
                                echo "$f $res"
                        fi
                fi
        done
done

Saving this as check-ssp and running this on a fully patched Fedora 34 system gives the following results:

$ ./check-ssp | grep FAIL
/usr/lib64/libva.so.2.1100.0 FAIL: stack-prot test because only some functions protected
/usr/lib64/gimp/2.0/plug-ins/fourier FAIL: stack-prot test because stack protection deliberately disabled
/usr/lib64/ocaml/objinfo_helper FAIL: stack-prot test because stack protection deliberately disabled
/usr/lib64/libva-x11.so.2.1100.0 FAIL: stack-prot test because only some functions protected

There are a lot more failures in the video drivers. Hopefully there are no bugs there. :-)

Conclusion
The annobin / annocheck programs allow us to verify that the intended compiler mitigations are present in all ELF files in the distro. It is a better check than the old way. There are times when functions don’t have stack variables or do anything that causes stack smashing protection to be enabled. Only by looking at the annotation from the build can we tell that the flags were passed in and the compiler chose not to need the __stack_chk_fail function. And without annocheck, there is otherwise no visibility into how much stack smashing protection is compiled in.

The annocheck program gives unprecedented visibility into the application hardening on your system. It can let you know if everything is good. It can also be used as a gating test when you build a package and intend to deploy it. It’s worth your time to know about. And it now has online documentation to help you fix programs.

Sunday, September 5, 2021

How to build AFL++ on Fedora 34

In the last article, I explained how to use Radamsa to fuzz applications. But what if you said, "I want to use a real fuzzer, one like AFL"? Well, OK then. When you say you want to use AFL, I think you really mean AFL++. This is the community supported version based on the original, but with a whole lot of new ideas to make it faster and more aggressive. I'm here to show you how to do it...but on Fedora, it's harder than it needs to be.

The thing about AFL++ is that you really want to use the clang-lto mode. To do that means you need the clang gold linker. And for whatever reason, Fedora doesn't ship it. No gold linker, no clang-lto mode. So, the first step of building AFL++ is to build clang from scratch. And unless you have one of those nice AMD 12 or 16 core CPUs, this will take a while.

Suppose you have a 4 core machine, which gives you 8 hyperthreads. Typically when you compile, you can do:

make -j $(nproc)

but a lot of time is spent doing IO. So, you can get a little more speed by doubling that. And that is exactly what we'll do in the instructions below.

Also, I wanted to go with a released version of llvm/clang instead of whatever's in the repo at the moment. So, I'll add the steps to get 12.0.1, which is the current release as of this writing. Building clang takes about an hour on a 4 core Xeon. See you on the other side.

cd working/BUILD/
git clone --depth 1 --branch llvmorg-12.0.1 https://github.com/llvm/llvm-project.git
cd llvm-project
mkdir build
cd build
cmake -G "Unix Makefiles" -DLLVM_ENABLE_PROJECTS='clang;clang-tools-extra;compiler-rt;libclc;libcxx;libcxxabi;libunwind;lld' -DCMAKE_BUILD_TYPE=Release -DLLVM_BINUTILS_INCDIR="/usr/include" ../llvm
make -j $(expr `nproc` \* 2) ENABLE_OPTIMIZED=1

export PATH="$HOME/working/BUILD/llvm-project/build/bin/:$PATH"

Hopefully you had something to do while that built. Anyways, on to doing the real job of making AFL++. This goes much faster.

git clone https://github.com/AFLplusplus/AFLplusplus.git
cd AFLplusplus
make -j $(expr `nproc` \* 2) source-only

export PATH=/home/builder/working/BUILD/AFLplusplus:/home/builder/working/BUILD/llvm-project/build/bin:$PATH
export AFL_PATH=/home/builder/working/BUILD/AFLplusplus

These last two updates of the environment are something you can put in your bashrc or as part of a script to setup for fuzzing. Also note that the first one includes the path to llvm-clang.

So, there you have it. We're all ready to fuzz a target. We'll start a fuzzing project in a future article to show how to fuzz a real program that people are using.

Saturday, September 4, 2021

Simple fuzzing with Radamsa

We will start looking into improving programs by fuzzing them. A simple fuzzer that gives very good results is Radamsa. It is part of the Fedora distribution, so all you need to do is install it.

Radamsa is a file mutator. It takes a file as input and modifies it. So, if we wanted to fuzz the audit search utility, we would gather a sample log, mutate it, and run a search on the mutated log. This is easily scriptable. For example, consider the following bash script:

#!/bin/sh
LOG_DIR="/tmp"
LOG="$LOG_DIR/test.log"
MLOG="$LOG_DIR/stmp.log"
PDIR="/home/audit-3.0.5"
OPTIONS="--format csv --extra-keys --extra-labels --extra-obj2 --extra-time"
export ASAN_OPTIONS=detect_stack_use_after_return=true:strict_string_checks=true:detect_invalid_pointer_pairs=2

# Get fresh log data to test with
echo "Collecting logs..."
ausearch -if /var/log/audit/audit.log --start today --raw > $LOG
echo "Log collected, starting to fuzz..."

# Now fuzz the logs over and over
while true
do
        cat $LOG | radamsa > $MLOG
        date
        LD_LIBRARY_PATH="$PDIR"/auparse/.libs/:"$PDIR"/lib/.libs/ $PDIR/src/.libs/ausearch $OPTIONS -if $MLOG >/dev/null
        if [ "$?" != "0" ] ; then
                exit 1
        fi
        rm -f $MLOG
        echo "==="
done

The idea is to cause ausearch to choke on its input as it parses things. The thing is that many failures can happen but are not visible outside the program. Glibc pads memory allocations out to an alignment boundary (8 or 16 bytes, depending on the platform). If we blow past the buffer by 1 byte, it is not likely to cause a crash.

To detect these kinds of issues, we need to rebuild the audit software with gcc's address sanitizer. This build has all kinds of ways to look at the program to detect these overflows. What I would recommend is to download the source rpm and build it. If you need help setting up a build environment, I have instructions here.

Once it's done do the following:

cd audit-3.0.5
make clean
CFLAGS="-fsanitize=address,pointer-compare,pointer-subtract,unreachable,vla-bound,bounds,undefined,null,return,object-size,nonnull-attribute,returns-nonnull-attribute,bool,enum,builtin -ggdb -fno-sanitize-recover=signed-integer-overflow" ./configure --with-python=no --with-python3=yes --enable-gssapi-krb5=yes --with-arm --with-aarch64 --with-libcap-ng=yes --without-golang --enable-systemd --enable-experimental
make -j 8

Once this completes, you are ready to fuzz with the script above. If you do fuzz ausearch, you will probably find a couple direct memory leaks. If you look at the mutated log, you will probably see that it added 5 comm= fields. This will never happen in real life, so I have not patched ausearch to fix this.

You can apply this same fuzzing technique to other applications. Build it with the address sanitizer, get a sample input, adjust the script, and let it roll.
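To make "adjust the script" a little more concrete, here is one way the loop could be turned into a reusable function. This is a sketch under my own assumptions: the function names are invented, the mutator is expected to read the seed on stdin and write the mutated file on stdout (which is how radamsa is used above), and the target is expected to take the test case as its last argument and return non-zero on failure.

```shell
#!/bin/sh
# fuzz_once MUTATOR SEED CASE TARGET [ARGS...]
# Mutate SEED into CASE, then run TARGET [ARGS...] CASE.
# Returns the target's exit status (125 if the mutator itself failed).
fuzz_once() {
    mutator=$1; seed=$2; case_file=$3
    shift 3
    "$mutator" < "$seed" > "$case_file" || return 125
    "$@" "$case_file" > /dev/null
}

# fuzz_loop COUNT MUTATOR SEED CASE TARGET [ARGS...]
# Repeat fuzz_once up to COUNT times, stopping at the first failure so the
# offending test case is still on disk.
fuzz_loop() {
    count=$1; shift
    i=0
    while [ "$i" -lt "$count" ]; do
        if ! fuzz_once "$@"; then
            echo "stopped at iteration $i, test case kept in $3"
            return 1
        fi
        i=$(( i + 1 ))
    done
    rm -f "$3"
    echo "no failures in $count iterations"
}
```

A run against a sanitizer build of some program might be started with fuzz_loop 1000 radamsa sample.in /tmp/case.bin ./target --some-option, where the file names and the target are placeholders. The target's stderr is left alone so that any sanitizer report still shows up on the terminal.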

Sunday, April 18, 2021

Ambient Capabilities

Support for ambient capabilities has been added to the most recent libcap-ng release (0.8+). Ambient capabilities are designed to let a privileged process bestow capabilities on a non-privileged child process. Normally when you hit execve in a non-privileged process, all capabilities are stripped unless you do a very specific sequence of instructions. This is nicely wrapped up in capng_change_id(). But what if you want to execute a helper process that also needs capabilities?

Ambient capabilities allow this to happen. If you are the privileged process, you can grant yourself ambient capabilities and then call capng_change_id(), after which calling helper processes is not a problem. Systemd takes advantage of this to allow services to be started as non-root and with capabilities. All you have to do is add a line with AmbientCapabilities= to your service file and assign yourself the capabilities you want.

But there is a problem. Ambient capabilities are leaky. You may pass them to a child, the child can pass them to a child, and so on. Whole families of processes could leak capabilities to everything they do. And if one of those applications is exploitable, then the attacker has a way to escalate some privileges. One of the updates to libcap-ng is to also allow netcap and pscap to display processes that have ambient capabilities so that we can hunt down the source of leaks.

Let's test this and look at ways to solve it. The test scenario is this: we have a privileged process running as root with all capabilities. It selects a set of capabilities, including ambient, so that its child can use the capabilities, and changes to uid 99 (nobody). But then suppose that the child program is exploitable and the attacker pops a shell, which becomes the third process. What are the possible mitigations?

In this first run, we have the privileged root process starting off the second as follows:

    capng_clear(CAPNG_SELECT_ALL);
    capng_updatev(CAPNG_ADD,
               CAPNG_EFFECTIVE|CAPNG_PERMITTED|CAPNG_INHERITABLE|
               CAPNG_BOUNDING_SET|CAPNG_AMBIENT, CAP_KILL, CAP_IPC_LOCK, -1);
    capng_change_id(99, 99, CAPNG_DROP_SUPP_GRP);

It drops supplemental groups as a precaution.

# ./first
first
Effective:    00000000, 00004020
Permitted:    00000000, 00004020
Inheritable: 00000000, 00004020
Bounding Set: 00000000, 00004020
Ambient :     00000000, 00004020
second
Effective:    00000000, 00004020
Permitted:    00000000, 00004020
Inheritable: 00000000, 00004020
Bounding Set: 000000FF, FFFFFFFF
Ambient :     00000000, 00004020
third
Effective:    00000000, 00004020
Permitted:    00000000, 00004020
Inheritable: 00000000, 00004020
Bounding Set: 000000FF, FFFFFFFF
Ambient :     00000000, 00004020

As we can see, the attacker has inherited privileges because of the ambient capabilities. We can see that Effective and Permitted have values.
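If you want to double check what a mask like 00004020 means, you can decode it into bit numbers and compare against linux/capability.h, where CAP_KILL is 5 and CAP_IPC_LOCK is 14. The little helper below is just an illustration with a name I made up. It handles one 32-bit word of the output at a time.

```shell
#!/bin/sh
# decode_caps HEXMASK: print the bit numbers that are set in a hex
# capability mask (given without the 0x prefix), lowest bit first.
decode_caps() {
    mask=$(( 0x$1 ))
    bit=0
    out=""
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            out="$out${out:+ }$bit"
        fi
        mask=$(( mask >> 1 ))
        bit=$(( bit + 1 ))
    done
    echo "$out"
}
```

Running decode_caps 00004020 prints 5 14, which matches the two capabilities the first process asked for.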

Does clearing the bounding set in the privileged process have any effect?

The code to do that looks like this:

capng_change_id(99, 99, CAPNG_DROP_SUPP_GRP|CAPNG_CLEAR_BOUNDING);

Now running our test program we see the evolution of permissions as follows:

# ./first
first
Effective:    00000000, 00004020
Permitted:    00000000, 00004020
Inheritable: 00000000, 00004020
Bounding Set: 00000000, 00000000
Ambient :     00000000, 00004020
second
Effective:    00000000, 00004020
Permitted:    00000000, 00004020
Inheritable: 00000000, 00004020
Bounding Set: 00000000, 00000000
Ambient :     00000000, 00004020
third
Effective:    00000000, 00004020
Permitted:    00000000, 00004020
Inheritable: 00000000, 00004020
Bounding Set: 00000000, 00000000
Ambient :     00000000, 00004020

Nope. That changes nothing. The attacker still has privileges in Effective and Permitted capabilities.

Next, let's see what happens if we drop the inheritable capabilities:

capng_updatev(CAPNG_DROP, CAPNG_INHERITABLE, CAP_KILL, CAP_IPC_LOCK, -1);
capng_apply(CAPNG_SELECT_CAPS);

The results are as follows:

# ./first
first
Effective:    00000000, 00004020
Permitted:    00000000, 00004020
Inheritable: 00000000, 00004020
Bounding Set: 00000000, 00004020
Ambient :     00000000, 00004020
second
Effective:    00000000, 00004020
Permitted:    00000000, 00004020
Inheritable: 00000000, 00000000
Bounding Set: 000000FF, FFFFFFFF
Ambient :     00000000, 00000000
third
Effective:    00000000, 00000000
Permitted:    00000000, 00000000
Inheritable: 00000000, 00000000
Bounding Set: 000000FF, FFFFFFFF
Ambient :     00000000, 00000000
attempting to regain ambient capabilities
Effective:    00000000, 00000000
Permitted:    00000000, 00000000
Inheritable: 00000000, 00000000
Bounding Set: 000000FF, FFFFFFFF
Ambient :     00000000, 00000000

This works, but it means the program needs to be capabilities aware. If the program were capabilities aware, it would not need ambient capabilities because it could start as root, set itself up, and change uid.

Maybe just clearing the ambient capabilities in the second process works?

This can be done with the following code:

prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_CLEAR_ALL, 0, 0, 0);

The test results are as follows:

# ./first
first
Effective:    00000000, 00004020
Permitted:    00000000, 00004020
Inheritable: 00000000, 00004020
Bounding Set: 00000000, 00004020
Ambient :     00000000, 00004020
second
Effective:    00000000, 00004020
Permitted:    00000000, 00004020
Inheritable: 00000000, 00004020
Bounding Set: 000000FF, FFFFFFFF
Ambient :     00000000, 00000000
third
Effective:    00000000, 00000000
Permitted:    00000000, 00000000
Inheritable: 00000000, 00004020
Bounding Set: 000000FF, FFFFFFFF
Ambient :     00000000, 00000000
attempting to regain ambient capabilities
Effective:    00000000, 00000000
Permitted:    00000000, 00000000
Inheritable: 00000000, 00004020
Bounding Set: 000000FF, FFFFFFFF
Ambient :     00000000, 00000000

This works. This is probably the best solution for programs that have ambient capabilities and do not want to become capabilities aware. All that's needed is to just make a 1 line call to prctl. You do not need to link against any new library.

Friday, July 5, 2019

About commenting on this blog

I hate to post any note like this, but some people just can't help themselves. I would like to encourage people to comment when there is something constructive to add to the discussion. However, there seems to be a number of comments that simply say, "nice article" and link to their own site and products.

From now on, anyone posting comments that simply say nice article and appear to advertise will have their comment deleted.