Security + Data Science: Using auparse in python

A while back we took a look at how to write a basic auparse program. The audit libraries have python bindings so that can let you write scripts that do things with audit events. Today, we will take a look at previously given example programs for "C" and see how to recreate them in python. I will avoid the lengthy discussion of the how's and why's from the original article, please refer back to it if explanation is needed.

Now in Python
I was going to publish this blog post about 2 weeks ago. In writing the code, I discovered that the python bindings for auparse had bugs and outright errors in them. These were all corrected in the last release, audit-2.7.7. I held up publishing this to give time for various distributions to get this update pushed out. The following code is not guaranteed to work unless you are on 2.7.7 or later.

We started the article off by showing the basic application construct to loop through all the logs. This is the equivalent of the first example:

#!/usr/bin/env python3

import sys
import auparse
import audit

aup = auparse.AuParser(auparse.AUSOURCE_LOGS);
aup.first_record()
while True:
    while True:
        while True:
            aup.get_field_name()
            if not aup.next_field(): break
        if not aup.next_record(): break
    if not aup.parse_next_event(): break
aup = None
sys.exit(0)

Just as stated in the original article...it's not too useful but it shows the basic structure of how to iterate through logs. We start by importing both audit libraries. Then we call the equivalent of auparse_init which is auparse.AuParser. The auparse state is caught in the variable aup. After that, all functions in auparse are called similarly to the C version except you do not need the auparse_ part of the function name. When done with the state variable, it is destroyed by setting it to None.

Now let's recreate example 2 which is a small program that loops through the logs and prints the record type and the field names contained in each record that follows:

#!/usr/bin/env python3

import sys
import auparse
import audit

aup = auparse.AuParser(auparse.AUSOURCE_LOGS);
aup.first_record()
while True:
    while True:
        mytype = aup.get_type_name()
        print("Record type: %s" % mytype, "- ", end='')
        while True:
            print("%s," % aup.get_field_name(), end='')
            if not aup.next_field(): break
        print("\b")
        if not aup.next_record(): break
    if not aup.parse_next_event(): break
aup = None
sys.exit(0)

I don't think there is anything new to mention here. Running it should give some output such as:

Record type: PROCTITLE - type,proctitle,
Record type: SYSCALL - type,arch,syscall,success,exit,a0,a1,a2,a3,items,ppid,pid,auid,uid,gid,euid,suid,fsuid,egid,sgid,fsgid,tty,ses,comm,exe,subj,key,
Record type: CWD - type,cwd,
Record type: PATH - type,item,name,inode,dev,mode,ouid,ogid,rdev,obj,nametype,
Record type: PROCTITLE - type,proctitle,
Record type: SYSCALL - type,arch,syscall,success,exit,a0,a1,a2,a3,items,ppid,pid,auid,uid,gid,euid,suid,fsuid,egid,sgid,fsgid,tty,ses,comm,exe,subj,key,

Now, let's take a quick look at how to use output from the auparse normalizer. I will not repeat the explanation of how auparse_normalize works. Please refer to the original article for a deeper explanation. The next program takes its input from stdin. So, run ausearch --raw and pipe that into the following program.

#!/usr/bin/env python3

import sys
import auparse
import audit

aup = auparse.AuParser(auparse.AUSOURCE_DESCRIPTOR, 0);
if not aup:
    print("Error initializing")
    sys.exit(1)

while aup.parse_next_event():
    print("---")
    mytype = aup.get_type_name()
    print("event: ", mytype)

    if aup.aup_normalize(auparse.NORM_OPT_NO_ATTRS):
        print("Error normalizing")
        continue

    try:
        evkind = aup.aup_normalize_get_event_kind()
    except RuntimeError:
        evkind = ""
    print(" event-kind:", evkind)

    if aup.aup_normalize_session():
        print(" session:", aup.interpret_field())

    if aup.aup_normalize_subject_primary():
        subj = aup.interpret_field()
        field = aup.get_field_name()
        if subj == "unset":
            subj = "system"
        print(" subject.primary:", field, "=", subj)

    if aup.aup_normalize_subject_secondary():
        subj = aup.interpret_field()
        field = aup.get_field_name()
        print(" subject.secondary:", field, "=", subj)

    try:
        action = aup.aup_normalize_get_action()
    except RuntimeError:
        action = ""
    print(" action:", action)

    if aup.aup_normalize_object_primary():
        field = aup.get_field_name()
        print(" object.primary:", field, "=", aup.interpret_field())

    if aup.aup_normalize_object_secondary():
        field = aup.get_field_name()
        print(" object.secondary:", field, "=", aup.interpret_field())

    try:
        str = aup.aup_normalize_object_kind()
    except RuntimeError:
       str = ""
    print(" object-kind:", str)

    try:
        how = aup.aup_normalize_how()
    except RuntimeError:
        how = ""
    print(" how:", how)

aup = None
sys.exit(0)

There is one thing about the function names that I wanted to point out. The auparse_normalizer functions are all prefixed with aup_. There were some unfortunate naming collisions that necessitated the change in names.

Another thing to notice is that the normalizer metadata functions can throw exceptions. They are always a RuntimeError whenever the function would have returned NULL as a C function. The above program also shows how to read a file from stdin which is descriptor 0. Below is some sample output:

ausearch --start today --raw | ./test3.py

---
event: SYSCALL
event-kind: audit-rule
session: 4
subject.primary: auid = sgrubb
subject.secondary: uid = sgrubb
action: opened-file
object.primary: name = /etc/audit/auditd.conf
object-kind: file
how: /usr/sbin/ausearch
---
event: SYSCALL
event-kind: audit-rule
session: 4
subject.primary: auid = sgrubb
subject.secondary: uid = sgrubb
action: opened-file
object.primary: name = /etc/audit/auditd.conf
object-kind: file
how: /usr/sbin/ausearch

Conclusion
The auparse python bindings can be used whenever you want to manipulate audit data via python. This might be preferable in some cases where you want to create a Jupyter notebook with some reports inside. Another possibility is that you can go straight to Keras, Theano, or TensorFlow in the same application. We will eventually cover machine learning and the audit logs. It'll take some time to get there because there are a lot of prerequisite setups that you would need to do.

Security + Data Science

Monday, June 26, 2017

Using auparse in python

2 comments:

Blog Archive