Debugging Linux kernel
Performance
Many factors affect the performance of the Linux kernel, including hardware configuration, software configuration, and workload characteristics.
Optimizing kernel performance means identifying and removing bottlenecks in the system. This can involve tuning kernel parameters, optimizing the use of system resources, and fixing bugs and other issues that degrade performance.
Given the complexity of the kernel and the wide range of factors involved, performance optimization can be challenging. However, with the right tools and techniques, it is possible to significantly improve the performance and reliability of Linux-based systems.
Perf_events
Perf_events, short for performance events, is a powerful interface that provides detailed insights into the performance characteristics of software running on a system. By analyzing the data collected by perf_events, developers can identify performance bottlenecks and optimize software to improve performance and reduce resource utilization. Perf_events is designed to be a lightweight, low-overhead monitoring solution that has minimal impact on system performance.
🔧 TODO
⚲ Interfaces
- man 1 perf – performance analysis tools
- Basic commands:
- man 1 perf-help – display help information about perf
- man 1 perf-top – System profiling tool.
- man 1 perf-record – Run a command and record its profile into perf.data
- man 1 perf-report – Read perf.data (created by perf record) and display the profile
- Other commands:
- man 1 perf-annotate – Read perf.data (created by perf record) and display annotated code
- man 1 perf-archive – Create archive with object files with build-ids found ...
- man 1 perf-arm-spe – Support for Arm Statistical Profiling Extension within...
- man 1 perf-bench – General framework for benchmark suites
- man 1 perf-buildid-cache – Manage build-id cache.
- man 1 perf-buildid-list – List the buildids in a perf.data file
- man 1 perf-c2c – Shared Data C2C/HITM Analyzer.
- man 1 perf-config – Get and set variables in a configuration file.
- man 1 perf-daemon – Run record sessions on background
- man 1 perf-data – Data file related processing
- man 1 perf-diff – Read perf.data files and display the differential profile
- man 1 perf-dlfilter – Filter sample events using a dynamically loaded shared...
- man 1 perf-evlist – List the event names in a perf.data file
- man 1 perf-ftrace – simple wrapper for kernel's ftrace functionality
- man 1 perf-inject – Filter to augment the events stream with additional in...
- man 1 perf-intel-pt – Support for Intel Processor Trace within perf tools
- man 1 perf-iostat – Show I/O performance metrics
- man 1 perf-kallsyms – Searches running kernel for symbols
- man 1 perf-kmem – Tool to trace/measure kernel memory properties
- man 1 perf-kvm – Tool to trace/measure kvm guest os
- man 1 perf-kwork – Tool to trace/measure kernel work properties (latencies)
- man 1 perf-list – List all symbolic event types
- man 1 perf-lock – Analyze lock events
- man 1 perf-mem – Profile memory accesses
- man 1 perf-probe – Define new dynamic tracepoints
- man 1 perf-sched – Tool to trace/measure scheduler properties (latencies)
- man 1 perf-script – Read perf.data (created by perf record) and display tr...
- man 1 perf-script-perl – Process trace data with a Perl script
- man 1 perf-script-python – Process trace data with a Python script
- man 1 perf-stat – Run a command and gather performance counter statistics
- man 1 perf-test – Runs sanity tests.
- man 1 perf-timechart – Tool to visualize total system behavior during a workload
- man 1 perf-trace – strace inspired tool
- man 1 perf-version – display the version of perf binary
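The basic commands above are usually combined into a count, record, report workflow. The following is a minimal sketch, assuming perf is installed and that the kernel.perf_event_paranoid setting allows the current user to profile. The workload (sleep 1) and the 99 Hz sampling frequency are arbitrary illustrative choices, not recommendations from this page.

```shell
#!/bin/sh
# Placeholder workload; replace with the command to be profiled.
WORKLOAD="sleep 1"

if command -v perf >/dev/null 2>&1; then
    # Count events over the whole run ($WORKLOAD is unquoted on purpose
    # so that it is split into command and arguments).
    perf stat -- $WORKLOAD ||
        echo "perf stat failed (check kernel.perf_event_paranoid)" >&2

    # Sample call stacks at 99 Hz and write the profile to perf.data.
    perf record -F 99 -g -o perf.data -- $WORKLOAD ||
        echo "perf record failed" >&2

    # Summarize the recorded profile as plain text.
    if [ -f perf.data ]; then
        perf report -i perf.data --stdio | head -n 40
    fi
else
    echo "perf not found; install the perf package matching your kernel" >&2
fi
```

For a live, system-wide view instead of a recorded profile, perf top can be run in place of the record and report steps.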
⚙️ Internals
- man 2 perf_event_open – sets up performance monitoring
- uapi/linux/perf_event.h inc
- tools/perf src
- linux/perf_event.h inc
- kernel/events/core.c src
- kernel/profile.c src – simple profiling
📖 References
- perf – instruments CPU performance counters, tracepoints, kprobes, and uprobes
- https://perf.wiki.kernel.org/
📚 Further reading
🛠️ Utilities
- Performance Co-Pilot, https://pcp.io/
- Prometheus, https://prometheus.io/
- https://github.com/redhat-nfvpe/container-perf-tools
- https://github.com/brendangregg/perf-tools – performance analysis tools based on Linux perf_events (aka perf) and ftrace
- readprofile – a tool to read kernel profiling information
📚 Further reading
User space debug interfaces
⚲ Interfaces
- man 1 dmesg – prints or controls the kernel ring buffer
- man 2 syslog – system call, which is used to control the kernel printk() buffer
- man 1 strace – system calls and signals tracing tool
- man 2 ptrace – process trace system call
- man 3 klogctl
- man 5 core
- /sys/kernel/debug/ – debugfs
- dmesg --console-level <level>
- gdb /usr/src/linux/vmlinux /proc/kcore
- /proc/self/stack
- dynamic debug doc
- ⌨️ hands-on:
- echo "module atkbd +pfl" | sudo tee /sys/kernel/debug/dynamic_debug/control
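The hands-on command above enables messages for one module. A small sketch of how the same control file can be queried and reset is shown below. The helper names (dyndbg_query, dyndbg_apply, dyndbg_show) are hypothetical and exist only for illustration; the sketch assumes debugfs is mounted at /sys/kernel/debug and that the kernel was built with dynamic debug support.

```shell
#!/bin/sh
# Control file of the dynamic debug facility (requires mounted debugfs).
CONTROL=/sys/kernel/debug/dynamic_debug/control

# Build a query string: dyndbg_query <module> <flags>
dyndbg_query() {
    printf 'module %s %s\n' "$1" "$2"
}

# Write a query to the control file: dyndbg_apply <module> <flags>
dyndbg_apply() {
    dyndbg_query "$1" "$2" | sudo tee "$CONTROL" >/dev/null
}

# List the call sites of a module together with their current flags.
dyndbg_show() {
    sudo grep "\[$1\]" "$CONTROL"
}

# Only print the query here; applying it changes kernel state.
dyndbg_query atkbd +pfl
```

With these helpers, dyndbg_apply atkbd +pfl enables the messages, dyndbg_show atkbd lists the affected call sites, and dyndbg_apply atkbd -pfl turns them off again. In the flags, p enables printing, f adds the function name, and l adds the line number.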
⚙️ Internals
📚 References
Tracing and logging
⚲ API:
User-space interface:
- man 1 dmesg – prints or controls the kernel ring buffer
- man 2 syslog – system call, which is used to control the kernel printk() buffer
- /proc/kmsg
- man 1 trace-cmd – interacts with Ftrace Linux kernel internal tracer /sys/kernel/debug/tracing/
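The tracing directory mentioned above can also be driven directly with plain file writes, without trace-cmd. The following is a minimal sketch, assuming root privileges and a kernel with the function tracer built in. The helper name find_tracefs is hypothetical; the one-second capture window is an arbitrary choice.

```shell
#!/bin/sh
# Locate tracefs: it is commonly mounted at /sys/kernel/tracing,
# and is also reachable through debugfs on many systems.
find_tracefs() {
    for dir in /sys/kernel/tracing /sys/kernel/debug/tracing; do
        if [ -e "$dir/current_tracer" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

if TRACEFS=$(find_tracefs) && [ -w "$TRACEFS/current_tracer" ] &&
   grep -qw function "$TRACEFS/available_tracers"; then
    echo function > "$TRACEFS/current_tracer"  # select the function tracer
    echo 1 > "$TRACEFS/tracing_on"             # start recording
    sleep 1
    echo 0 > "$TRACEFS/tracing_on"             # stop recording
    head -n 20 "$TRACEFS/trace"                # show the first trace lines
    echo nop > "$TRACEFS/current_tracer"       # restore the default tracer
else
    echo "tracefs or the function tracer is not available (try as root)" >&2
fi
```

trace-cmd wraps the same files; a roughly equivalent session would be trace-cmd record -p function sleep 1 followed by trace-cmd report.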
Most common functions
- linux/printk.h inc
- pr_devel id – conditional debug-level message
- pr_debug id – conditional debug-level or dynamic debug doc message
- ⌨️ hands-on:
- echo "module atkbd +pfl" | sudo tee /sys/kernel/debug/dynamic_debug/control
- Log messages with other levels:
- asm-generic/bug.h inc
⚙️ Internals
- printk id
- kernel/printk/printk.c src
- arch/x86/kernel/traps.c src
- lib/dump_stack.c src
- kernel/trace src
- scripts/tracing/draw_functrace.py src
- logging ltp, tracing ltp
- samples/ftrace src
- samples/trace_events src
- samples/trace_printk src
- linux/instrumentation.h inc
📚 References:
- Debugging by printing
- Message logging with printk doc
- SystemTap
- man 1 stap – systemtap script translator/driver
- strace
- man 1 strace – trace system calls and signals
- LTTng
- ftrace
- Linux Tracing Technologies doc
- Tracepoint Analysis doc
- Function Tracer doc – function, latency and event tracing
- Using ftrace to hook to functions doc
- Fprobe - Function entry/exit probe doc
- Kprobes doc
- Kprobe-based Event Tracing doc
- Uprobe-tracer: Uprobe-based Event Tracing doc
- Using the Linux Kernel Tracepoints doc
- Subsystem Trace Points: kmem doc
- Subsystem Trace Points: power doc
- NMI Trace Events doc
- In-kernel memory-mapped I/O tracing doc
- Event Histograms doc
- Histogram Design Notes doc
- Boot-time tracing doc
- Hardware Latency Detector doc
- Intel(R) Trace Hub (TH) doc
- Lockless Ring Buffer Design doc
- System Trace Module doc
- CoreSight - ARM Hardware Trace doc
🔧 TODO. 🚀 advanced features
- linux/kmemleak.h inc – memory leak detector
- pr_cont id – continues a previous log message on the same line
- print_hex_dump_bytes id
- print_hex_dump_debug id
- dump_stack id
- CONFIG_PRINTK_CALLER id
- CONFIG_DEBUG_KERNEL id
- CONFIG_DEBUG_INFO id
- https://git.kernel.org/pub/scm/libs/libtrace/
kgdb and kdb
⚲ Interfaces
⚙️ Internals
📚 References
- Using kgdb, kdb and the kernel debugger internals doc
- kdump
- kdump doc
- man 8 crash – Analyze Linux crash dump data or a live system
⚲ API:
📖 References
📚 Further reading
- man 7 bpf-helpers
- Linux Extended BPF (eBPF) Tracing Tools
- bpftrace – High-level tracing language for Linux eBPF
- BCC – Tools for BPF-based Linux IO analysis, networking, monitoring, and more
- man 8 stapbpf
- eBPF Programming for Linux Kernel Tracing
- lockdep - Runtime locking correctness validator doc
Watchdogs
The Linux Kernel/Softdog Driver
dev_watchdog id – network device watchdog
The NMI watchdog lockup detectors:
⚲ API
- /proc/sys/kernel/nmi_watchdog
- /proc/sys/kernel/soft_watchdog
- /proc/sys/kernel/watchdog
- /proc/sys/kernel/watchdog_cpumask
- /proc/sys/kernel/watchdog_thresh
- /proc/sys/kernel/hardlockup_all_cpu_backtrace
- /proc/sys/kernel/hardlockup_panic
- /proc/sys/kernel/softlockup_all_cpu_backtrace
- /proc/sys/kernel/softlockup_panic
👁️ Example
- ./lib/test_lockup.c src – test module to generate lockups
Provoke NMI watchdog without panic:
echo 0 > /proc/sys/kernel/hardlockup_panic
insmod test_lockup.ko disable_irq=1 time_secs=13
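Before provoking a lockup it can help to check how the detectors are currently configured. The sketch below reads a subset of the /proc/sys/kernel entries listed under API; the function name show_watchdog_settings is hypothetical, and entries missing on a given kernel are reported rather than treated as errors.

```shell
#!/bin/sh
# Print the current value of several lockup-detector settings.
show_watchdog_settings() {
    for name in watchdog nmi_watchdog soft_watchdog watchdog_thresh \
                hardlockup_panic softlockup_panic; do
        file=/proc/sys/kernel/$name
        if [ -r "$file" ]; then
            printf '%s = %s\n' "$name" "$(cat "$file")"
        else
            printf '%s: not available on this kernel\n' "$name"
        fi
    done
}

show_watchdog_settings
```

The same values can be read with sysctl, for example sysctl kernel.watchdog_thresh.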
⚙️ Internals
- kernel/watchdog.c src – detects hard and soft lockups on a system
- kernel/watchdog_perf.c src – detects hard lockups on a system using perf
- kernel/watchdog_buddy.c src
📚 References
- Documentation for /proc/sys/kernel/ doc
- Softlockup detector and hardlockup detector (aka nmi_watchdog) doc
- kernel parameters:
...
⚙️ Internals
📖 References for debugging
- Ramoops oops/panic logger doc
- pstore block oops/panic logger doc
- Fault injection doc
- Bisecting a bug doc
- Development tools for the kernel doc
- linux/tracepoint.h inc
📚 Further reading