Friday, February 17, 2017

Introduction to Linux Audit

OK, we are about done with all the prerequisites before we jump into doing the more interesting analytics. I expect that some people attracted to this series of articles will have a limited understanding of the Linux audit system and how to set it up. But I would like everyone to be able to recreate the analysis that we are going to do, so we need to go over some basics first. If I show you neat diagrams and show you how I created them, it's nice. But if it also works on your system with your data, then it's much more valuable.

Audit Pieces
The first thing we need to discuss is all the pieces that are in play or may come up in future discussions. Having a common set of terms will let the experienced and the newbie both understand what is being presented. Below is a diagram of the various pieces of the audit system and what they do.

Fig 1

First let me explain the colors. Light blue are the things that create the events, purple is the reporting tools, red is the controller, gray is the logs, and green is the real-time components.

Audit events can be created in two ways. The first is by applications that send events any time something specific happens. For example, if you log in through sshd, it will send a series of events as the login proceeds. It is considered a trusted application and it always tries to send events. If the audit system is not enabled, the event is discarded. Otherwise the kernel accepts the event, time stamps it, adds sender information to the event, and queues it for delivery to the audit daemon, auditd. The only job that the audit daemon has is to reliably dequeue events and write them to the log and to the event dispatcher, audispd.

The other way that events are created is by the kernel observing system activity that matches a rule loaded by auditctl. The kernel is the thing that creates most events...assuming you loaded rules. It uses a first-matching-rule system to decide if a syscall is of any interest.

During each system call made by a program, the kernel will look at the rules and compare attributes of the calling program with a rule. These attributes could be the loginuid, uid, executable name, login session, or many other things. You can look at the auditctl man page for a comprehensive list.

When a rule matches, an event is created that records a lot of information about the process that triggered it. The event is then time stamped and queued to the audit daemon for disposition.
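To make the rule matching described above a bit more concrete, here is a sketch of what a single syscall rule might look like. The rule and its key name, user-exec, are only an illustration and are not taken from the rule files used later in this post. The auid>=1000 cutoff assumes regular user accounts start at 1000 on your system, and the LOAD_RULE switch is a made-up convenience for this example.

```shell
# Hypothetical rule: record every execve() made by a logged-in user.
# 4294967295 is the "unset" loginuid, so daemons are skipped.
user_exec_rule() {
    printf '%s\n' '-a always,exit -F arch=b64 -S execve -F auid>=1000 -F auid!=4294967295 -k user-exec'
}

# Load it only when explicitly asked to, since auditctl needs root.
if [ "${LOAD_RULE:-no}" = "yes" ]; then
    # Unquoted on purpose so the rule splits into separate arguments.
    auditctl $(user_exec_rule)
fi
```

Each -F field is one of the attributes the kernel compares against the calling program. If you type a rule like this directly at a shell prompt, quote fields such as 'auid>=1000' so the shell does not treat the > as a redirect.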

The audit system is configured by a program named auditctl. It can, among other things, enable/disable the audit system, load rules, list rules, and report status.
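As a rough sketch of the status and listing features, the function below wraps the two auditctl options most useful for checking a setup. The function name audit_overview is made up for this example, and it assumes you run it as root on a system with the audit package installed.

```shell
# Print the kernel's audit status and the currently loaded rules.
audit_overview() {
    if ! command -v auditctl >/dev/null 2>&1; then
        echo "auditctl not found; is the audit package installed?"
        return 0
    fi
    auditctl -s   # report status: enabled flag, backlog, lost events, ...
    auditctl -l   # list the rules currently loaded in the kernel
}
```

If the status output shows the audit system as disabled, auditctl -e 1 should enable it.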

The audit daemon simply polls the kernel for events. When it sees an event is available, it looks at its configuration to determine what to do with the event. It may or may not write the event to disk. It may or may not send the event to its real-time interface. Normally, it sends events to both.

Once an event is written to disk, the reporting tools ausearch, aureport, and aulast can be used. Ausearch specializes in picking events out of the audit trail that match its command line options. It is the recommended tool for looking at audit events because it can untangle interlaced events, group the records into a complete event, and interpret the event several ways. Aureport is a program that can summarize kinds of events. The aulast program specializes in showing login sessions.
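Here is a small sketch of how the three reporting tools might be used together to look at today's activity. The function name todays_logins is hypothetical, and the choice of the USER_LOGIN event type is just one example of a filter. It assumes you have read access to the audit logs.

```shell
# Show today's login events three ways, one per reporting tool.
todays_logins() {
    for tool in ausearch aureport aulast; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found; is the audit package installed?"
            return 0
        fi
    done
    ausearch --start today -m USER_LOGIN -i   # matching events, interpreted
    aureport --start today --summary          # counts by kind of event
    aulast                                    # login sessions, like last(1)
}
```

The -i option asks ausearch to interpret numeric fields, such as uids, into readable names.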

If the audit daemon has been configured to send events to the real-time interface (the dispatcher), it queues the event to audispd, whose job it is to distribute the event to any process wanting to see the events in real-time. These could be alerting programs or remote logging programs.

Auditd configuration
There are only a couple settings I want to discuss for the purposes of getting events ready to do analysis. There are other important settings which I will save for another blog post. I will simply list the settings from /etc/audit/auditd.conf that are important for analysis and data science:

local_events = yes
write_logs = yes
log_group = audit
log_format = enriched
flush = INCREMENTAL_ASYNC
freq = 50

The first one determines if the audit daemon should record local events. The answer should be "yes" unless the daemon is running in a container that has no access to the audit netlink interface. In that case it can still be used for aggregating logs from other systems by setting this value to "no".

The write_logs setting should be set to "yes" to put events to disk. This is normal, although some people prefer not to write logs and instead immediately send all events to a remote logging system.

Normally, to see audit logs you must be root. This is not always desirable and it interferes with an easy work flow. The log_group configuration item should be set to a group that you have access to. It could be your own group name, wheel, or maybe you make a special purpose audit group to give people access to logs without the need of privileges.

The log_format should be set to "enriched" so that extra information is recorded about the event at the time it's logged, which lets the event be analyzed on other systems.

The flush technique should be set to INCREMENTAL_ASYNC which gives the highest performance with a reasonable guarantee the event made it to disk. The freq setting tells how often to flush written events to disk. Set this to 50 for normal systems or 100 if you have a busy system.
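A quick way to compare your system against the settings discussed above is to pull just those lines out of the config file. This is a minimal sketch; the function name show_audit_settings is made up, and the default path is the one named earlier in this section.

```shell
# Print the analysis-related settings from an auditd.conf-style file.
# Defaults to /etc/audit/auditd.conf; pass another path to check a copy.
show_audit_settings() {
    conf="${1:-/etc/audit/auditd.conf}"
    grep -E '^[[:space:]]*(local_events|write_logs|log_group|log_format|flush|freq)[[:space:]]*=' "$conf"
}
```

Running show_audit_settings with no arguments should print one line per setting that is present. A setting missing from the output is not set in the file.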

Audit Rules
To make sure we capture important system events, we need to configure some rules. If you are on a Fedora system, the first thing to do is delete /etc/audit/rules.d/audit.rules. It has a rule in it that basically turns off the audit system. This was intentionally mandated by Fedora's governing board, FESCo. Anyways, undo the damage. Next, let's copy a couple of rule files to the right place:

$ cp /usr/share/doc/audit/rules/10-base-config.rules /etc/audit/rules.d/
$ cp /usr/share/doc/audit/rules/30-stig.rules /etc/audit/rules.d/

I'd recommend one change to the stig rules. Around line 95 are six rules that record successful and unsuccessful uses of chmod. They create way too many events. Either delete them or comment them out by placing a '#' at the beginning of each rule.
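If you would rather not edit the file by hand, the chmod rules could be commented out with sed. This is only a sketch: the function name disable_chmod_rules is made up, and it assumes the rules in question contain the text "-S chmod", so check your copy of the file before and after running it.

```shell
# Comment out every active rule that audits the chmod family of syscalls.
# Assumes those rules contain the text "-S chmod".
disable_chmod_rules() {
    rules="${1:-/etc/audit/rules.d/30-stig.rules}"
    # Lines already starting with '#' are left alone; a .bak copy is kept.
    sed -i.bak -E '/^[^#].*-S chmod/ s/^/#/' "$rules"
}
```

Afterwards, running augenrules --load as root should rebuild the combined rule file from /etc/audit/rules.d/ and load it into the kernel.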

Standard Directories
We also want to define the locations of a couple standard directories in your home directory for use in our experiments.

└── R
    ├── audit-data
    ├── extra-data
    └── x86_64-redhat-linux-gnu-library

RStudio will create an 'R' directory for you in your home directory when you install libraries. This is where we want to keep all of our R scripts, data, and libraries. Go ahead and create this directory structure with the same names if it does not exist.
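The two data directories can be created in one step. In this sketch the function name make_r_dirs is made up, and it assumes the library directory is left for R to create when you install packages.

```shell
# Create the data directories used in this series under ~/R (or another base).
make_r_dirs() {
    base="${1:-$HOME/R}"
    # -p creates missing parents and is harmless if the directories exist.
    mkdir -p "$base/audit-data" "$base/extra-data"
}
```

Calling make_r_dirs with no arguments works on ~/R.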

Aliases That Help
A few aliases in the .bashrc file can make for less typing later. I'd recommend:

alias autext='ausearch --format text'
alias aucsv='ausearch --format csv'

Log out and log back in. Try 'autext --start today'.

This concludes the last of the prerequisite and setup blogs. Starting next week, we will begin talking about how to use R to do some pretty interesting things with the audit logs. In the meantime, it might be good to brush up on R programming using one of the tutorials I mentioned in my previous posting.

