Applications often run as root because figuring out which Linux capabilities they actually need is difficult. You might know your web server needs to bind to port 80, which requires CAP_NET_BIND_SERVICE. But what about that database daemon? Or that monitoring agent? Trial and error gets old fast, and running everything as root is a big risk.
The libcap-ng project now includes cap-audit, a tool that traces applications to determine exactly which capabilities they require. Unlike static analysis tools that guess based on which syscalls appear in the binary, cap-audit hooks into the kernel's actual capability checking functions. When the kernel asks "does this process have CAP_NET_RAW?", cap-audit records it. This is ground truth - not guesswork.
How It Works
Cap-audit uses eBPF to hook the kernel's capability checking functions - cap_capable(), ns_capable(), and their variants. These are the functions the kernel calls every time it needs to verify a capability. When you trace an application, cap-audit forks your target program, registers its PID with the eBPF program, and then watches for kernel events.
The eBPF program filters events by PID right at the kprobe entry point. This is critical. Without filtering, it would capture thousands of capability checks per second from every process on the system. By filtering for just the target application and its children, overhead drops to less than 1% for the traced app and effectively zero for everything else.
Each capability check generates an event that includes which capability was checked, whether it was granted or denied, which syscall triggered it, and a user-space stack trace. These events stream through a ring buffer to the userspace program, which aggregates them and generates a report showing exactly what your application needs.
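To make the aggregation step concrete, here is a minimal user-space sketch in Python. It is not cap-audit's implementation: the CapEvent record and its field names are hypothetical stand-ins for whatever the ring buffer actually carries, and the sample events are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CapEvent:
    """Hypothetical event record; the real ring-buffer layout may differ."""
    pid: int
    cap: int        # capability number, e.g. 21 for sys_admin
    granted: bool   # True if the kernel's check succeeded
    syscall: int    # syscall number that triggered the check


def aggregate(events, traced_pids):
    """Count granted/denied checks per capability for the traced PIDs only."""
    stats = defaultdict(lambda: {"granted": 0, "denied": 0, "syscalls": set()})
    for ev in events:
        if ev.pid not in traced_pids:
            continue  # user-space stand-in for the PID filter in the eBPF program
        entry = stats[ev.cap]
        entry["granted" if ev.granted else "denied"] += 1
        entry["syscalls"].add(ev.syscall)
    return dict(stats)


# Made-up events: three from the traced process, one from an unrelated PID.
events = [
    CapEvent(pid=100, cap=21, granted=True, syscall=56),
    CapEvent(pid=100, cap=21, granted=True, syscall=56),
    CapEvent(pid=100, cap=7, granted=False, syscall=105),
    CapEvent(pid=999, cap=12, granted=True, syscall=41),
]
summary = aggregate(events, traced_pids={100})
for cap, s in sorted(summary.items()):
    print(cap, s["granted"], s["denied"], sorted(s["syscalls"]))
```

The per-capability counters are the same shape as the "Checks: N granted, M denied" lines in the report.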
Why System Context Matters
Cap-audit doesn't just tell you which capabilities were checked - it reads various sysctls to understand when those capabilities are actually required. For example, consider /proc/sys/kernel/yama/ptrace_scope. If it's set to 0, a process can ptrace any other process it could normally signal. If it's 1, attaching to a process that is not your descendant generally requires CAP_SYS_PTRACE, and at 2 every attach requires it. Same binary, different capability requirements depending on system configuration.
The tool gathers kernel.perf_event_paranoid, kernel.unprivileged_bpf_disabled, kernel.kptr_restrict, and several others. These aren't just informational - they directly affect which capabilities your application needs to run. A monitoring tool that reads /proc/kallsyms needs CAP_SYSLOG when kernel.kptr_restrict is 1, but not when it's 0. Cap-audit shows you both the capabilities that were actually checked and the system settings that made them necessary.
This means the capability requirements cap-audit reports are specific to your kernel configuration. If you're deploying to containers or hardened systems with different sysctl values, you might need different capabilities. The report includes the system context so you can make informed decisions.
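As a rough illustration of how sysctl values translate into conditional requirements, the sketch below reads two of the sysctls mentioned above and applies the rules as this article states them. The function names and the RULES table are assumptions made for the example, not cap-audit's code, and the table is deliberately incomplete.

```python
import os


def read_sysctl(name, root="/proc/sys"):
    """Return a sysctl such as 'kernel.kptr_restrict' as an int, or None if unavailable."""
    path = os.path.join(root, *name.split("."))
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


# (sysctl, predicate on its value, capability needed when the predicate holds)
# Only the two rules spelled out in the text; a real tool would know more.
RULES = [
    ("kernel.yama.ptrace_scope", lambda v: v > 0, "sys_ptrace"),
    ("kernel.kptr_restrict", lambda v: v == 1, "syslog"),
]


def conditional_caps(root="/proc/sys"):
    """List capabilities that the current sysctl settings make necessary."""
    needed = []
    for sysctl, applies, cap in RULES:
        value = read_sysctl(sysctl, root)
        if value is not None and applies(value):
            needed.append(cap)
    return needed


# The result depends on the machine this runs on.
print(conditional_caps())
```

Running it on the development machine and again inside the target container is a quick way to spot settings that differ between the two.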
You Must Exercise All Code Paths
Here's something important to understand: cap-audit traces what your application actually does, not what it could theoretically do. If your application can set the system clock, which requires CAP_SYS_TIME, but you only trace it handling normal requests, you won't see that capability requirement.
Think of it like code coverage in testing. If you don't exercise a code path during tracing, its capability requirements won't appear in the report. For daemons, this means you need to trigger all administrative operations, error handling paths, and edge cases. For CLI tools, you need to use all major features and options.
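One way to act on the coverage analogy is to trace several scenarios separately and combine the results. The sketch below assumes you have already collected the capability names from each run by hand; the scenario names and capability sets are invented for illustration.

```python
def merge_runs(runs):
    """Union the capability names observed across all trace runs."""
    required = set()
    for caps in runs.values():
        required |= set(caps)
    return required


def only_seen_in(runs, scenario):
    """Capabilities that appear in the named scenario and in no other run."""
    others = set()
    for name, caps in runs.items():
        if name != scenario:
            others |= set(caps)
    return set(runs[scenario]) - others


# Hypothetical results from three separate traces of the same daemon.
runs = {
    "normal requests": {"net_bind_service"},
    "admin: set clock": {"net_bind_service", "sys_time"},
    "error path": set(),
}

print(sorted(merge_runs(runs)))
print(sorted(only_seen_in(runs, "admin: set clock")))
```

A capability that shows up in only one scenario is a reminder that skipping that scenario would have hidden it.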
Let's see it in action:
DEMO: Tracing a daemon
======================================================================
CAPABILITY ANALYSIS FOR: /usr/sbin/irqbalance (PID 8189)
======================================================================
SYSTEM CONTEXT:
----------------------------------------------------------------------
Kernel version: 6.18.7-100.fc42.x86_64
kernel.yama.ptrace_scope: 1
kernel.kptr_restrict: 1
kernel.dmesg_restrict: 1
kernel.modules_disabled: 0
kernel.perf_event_paranoid: 2
kernel.unprivileged_bpf_disabled: 2
net.core.bpf_jit_enable: 1
net.core.bpf_jit_harden: 1
net.core.bpf_jit_kallsyms: 1
vm.mmap_min_addr: 65536
fs.protected_hardlinks: 1
fs.protected_symlinks: 1
fs.suid_dumpable: 2
REQUIRED CAPABILITIES:
----------------------------------------------------------------------
setpcap (#8)
Checks: 43 granted, 0 denied
Reason: Used by prctl (syscall 157)
sys_admin (#21)
Checks: 34 granted, 0 denied
Reason: Used by clone (syscall 56)
CONDITIONAL CAPABILITIES:
----------------------------------------------------------------------
None
ATTEMPTED BUT DENIED:
----------------------------------------------------------------------
None
SUMMARY:
----------------------------------------------------------------------
Total capability checks: 77
Required capabilities: 2
Conditional capabilities: 0
Denied operations: 0
RECOMMENDATIONS:
----------------------------------------------------------------------
Programmatic solution (C with libcap-ng):
  #include <cap-ng.h>
  ...
  capng_clear(CAPNG_SELECT_BOTH);
  capng_updatev(CAPNG_ADD, CAPNG_EFFECTIVE|CAPNG_PERMITTED, SETPCAP, SYS_ADMIN, -1);
  if (capng_change_id(uid, gid, CAPNG_DROP_SUPP_GRP | CAPNG_CLEAR_BOUNDING))
    perror("capng_change_id");
For systemd service:
  [Service]
  User=<non-root-user>
  Group=<non-root-group>
  AmbientCapabilities=setpcap sys_admin
  CapabilityBoundingSet=setpcap sys_admin
For file capabilities (via filecap):
  filecap /path/to/binary setpcap sys_admin
For Docker/Podman:
  docker run --user $(id -u):$(id -g) \
    --cap-drop=ALL \
    --cap-add=setpcap \
    --cap-add=sys_admin \
    your-image:tag
For Kubernetes:
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
    capabilities:
      drop:
        - ALL
      add:
        - setpcap
        - sys_admin
For applications that use file-based capabilities (like /usr/bin/ping with cap_net_raw+ep), cap-audit sees when those capabilities are actually exercised. A binary might have five capabilities set on the file but use only three during normal operation. Cap-audit shows you what's actually needed.
DEMO: File capabilities
======================================================================
CAPABILITY ANALYSIS FOR: /usr/bin/ping (PID 7919)
======================================================================
SYSTEM CONTEXT:
----------------------------------------------------------------------
Kernel version: 6.18.7-100.fc42.x86_64
kernel.yama.ptrace_scope: 1
kernel.kptr_restrict: 1
kernel.dmesg_restrict: 1
kernel.modules_disabled: 0
kernel.perf_event_paranoid: 2
kernel.unprivileged_bpf_disabled: 2
net.core.bpf_jit_enable: 1
net.core.bpf_jit_harden: 1
net.core.bpf_jit_kallsyms: 1
vm.mmap_min_addr: 65536
fs.protected_hardlinks: 1
fs.protected_symlinks: 1
fs.suid_dumpable: 2
REQUIRED CAPABILITIES:
----------------------------------------------------------------------
setpcap (#8)
Checks: 1 granted, 3 denied
Reason: Used by capset (syscall 126)
CONDITIONAL CAPABILITIES:
----------------------------------------------------------------------
None
ATTEMPTED BUT DENIED:
----------------------------------------------------------------------
setuid (#7)
Attempts: 1 (all denied)
Impact: Application may have reduced functionality
sys_admin (#21)
Attempts: 10 (all denied)
Impact: Application may have reduced functionality
SUMMARY:
----------------------------------------------------------------------
Total capability checks: 15
Required capabilities: 1
Conditional capabilities: 0
Denied operations: 2
RECOMMENDATIONS:
----------------------------------------------------------------------
Programmatic solution (C with libcap-ng):
  #include <cap-ng.h>
  ...
  capng_clear(CAPNG_SELECT_BOTH);
  capng_updatev(CAPNG_ADD, CAPNG_EFFECTIVE|CAPNG_PERMITTED, SETPCAP, -1);
  if (capng_change_id(uid, gid, CAPNG_DROP_SUPP_GRP | CAPNG_CLEAR_BOUNDING))
    perror("capng_change_id");
For systemd service:
  [Service]
  User=<non-root-user>
  Group=<non-root-group>
  AmbientCapabilities=setpcap
  CapabilityBoundingSet=setpcap
For file capabilities (via filecap):
  filecap /path/to/binary setpcap
For Docker/Podman:
  docker run --user $(id -u):$(id -g) \
    --cap-drop=ALL \
    --cap-add=setpcap \
    your-image:tag
For Kubernetes:
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
    capabilities:
      drop:
        - ALL
      add:
        - setpcap
Reading the Report
The report breaks down into several sections. Required Capabilities shows capabilities that were successfully checked - these are capabilities your application actively used and will need to function. Each entry includes how many times it was checked and which syscall triggered it. "CAP_NET_BIND_SERVICE: Used by bind (syscall 49)" tells you exactly what's going on.
Conditional Capabilities shows requirements that depend on system configuration. You'll see entries like "CAP_SYS_PTRACE: Needed when kernel.yama.ptrace_scope > 0, Current value: 1 (capability needed)". This tells you the capability is required on your current system, but might not be on systems with different sysctl values.
Attempted But Denied shows capability checks that failed. These are interesting because they reveal functionality your application tried to use but couldn't. Sometimes this is fine - the application has a fallback path. Other times, it indicates reduced functionality. The report notes "Application may have reduced functionality" so you can investigate.
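If you want to post-process a saved report, the section layout shown in the demos is regular enough to parse with a few lines of Python. This is a sketch based only on the text format visible in this article (an upper-case section title ending in a colon, then entries like "setpcap (#8)"); the real output may change between versions, so treat the regular expressions as assumptions.

```python
import re

SECTION_RE = re.compile(r"^([A-Z][A-Z ]+):$")   # e.g. "REQUIRED CAPABILITIES:"
ENTRY_RE = re.compile(r"^(\w+) \(#(\d+)\)$")    # e.g. "setpcap (#8)"


def parse_report(text):
    """Map each section title to the (name, number) capability entries under it."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = SECTION_RE.match(line)
        if m:
            current = m.group(1)
            sections[current] = []
            continue
        m = ENTRY_RE.match(line)
        if m and current is not None:
            sections[current].append((m.group(1), int(m.group(2))))
    return sections


# Excerpt of the ping report from the demo above.
sample = """\
REQUIRED CAPABILITIES:
----------------------------------------------------------------------
setpcap (#8)
Checks: 1 granted, 3 denied
Reason: Used by capset (syscall 126)
CONDITIONAL CAPABILITIES:
----------------------------------------------------------------------
None
ATTEMPTED BUT DENIED:
----------------------------------------------------------------------
setuid (#7)
Attempts: 1 (all denied)
sys_admin (#21)
Attempts: 10 (all denied)
"""

report = parse_report(sample)
print(report["REQUIRED CAPABILITIES"])
print(report["ATTEMPTED BUT DENIED"])
```

Keeping the denied entries separate from the required ones makes it easy to flag them for manual review rather than granting them automatically.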
The recommendations section generates ready-to-use configurations for systemd, Docker, Kubernetes, and file-based capabilities. For systemd, you get AmbientCapabilities and CapabilityBoundingSet directives. For Docker, you get --cap-drop=ALL followed by specific --cap-add entries. For file capabilities, you get the filecap command with the exact capability set. These aren't just suggestions - they're the minimal set your application demonstrated it needs. It also produces a snippet of C or Python code showing how to drop everything else from within the program itself.
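The mapping from a capability list to these configurations is mechanical. The sketch below reproduces the shapes shown in the demo output; it is not cap-audit's generator, and the helper names are made up for this example.

```python
def systemd_directives(caps):
    """Build the two [Service] lines shown in the report for a capability list."""
    joined = " ".join(caps)
    return [f"AmbientCapabilities={joined}", f"CapabilityBoundingSet={joined}"]


def docker_args(caps):
    """Drop everything, then add back only the listed capabilities."""
    return ["--cap-drop=ALL"] + [f"--cap-add={cap}" for cap in caps]


def filecap_command(path, caps):
    """Build the filecap invocation in the form the report prints."""
    return " ".join(["filecap", path, *caps])


caps = ["setpcap", "sys_admin"]  # the irqbalance result from the first demo
print("\n".join(systemd_directives(caps)))
print(" ".join(docker_args(caps)))
print(filecap_command("/path/to/binary", caps))
```

Because every target starts from an empty set and adds only the observed capabilities, a capability missing from the trace is also missing from every generated configuration.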
Ground Truth, Not Guesswork
The key insight is that cap-audit hooks the actual kernel capability checking functions. When mount() checks CAP_SYS_ADMIN, cap-audit sees it. When bind() checks CAP_NET_BIND_SERVICE, cap-audit sees it. There's no parsing of source code, no heuristics based on syscall names, no guessing. The kernel's security subsystem itself is telling you what capabilities are being checked.
This is why the tool requires CAP_BPF and CAP_PERFMON to run - it's instrumenting kernel internals. But once set up, it gives you authoritative answers about capability requirements. If cap-audit says your application needs three capabilities, those are the three it checked during your trace. If it says your application doesn't need elevated capabilities at all, and your trace exercised every code path, you can confidently run it as an unprivileged user.
Run your applications through cap-audit during development or security audits. Exercise all functionality, check the system context, and use the generated configurations to properly scope your capabilities. File an issue on GitHub if you have a request or find that something is off.
