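Linux's perf utility is famously used by Brendan Gregg to generate flame graphs for C/C++, JVM code, Node.js code, and so on.
Does the Linux kernel natively understand stack traces? Where can I read more about how a tool is able to introspect into the stack traces of processes, even if the processes are written in completely different languages?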
Gregg has a short introduction to stack traces in perf:
http://www.brendangregg.com/perf.html
4.4 Stack Traces
Always compile with frame pointers. Omitting frame pointers is an evil compiler optimization that breaks debuggers, and sadly, is often the default. Without them, you may see incomplete stacks from perf_events ... There are two ways to fix this: either using dwarf data to unwind the stack, or returning the frame pointers.
Dwarf
Since about the 3.9 kernel, perf_events has supported a workaround for missing frame pointers in user-level stacks: libunwind, which uses dwarf. This can be enabled using "-g dwarf". ... compiler optimizations (-O2), which in this case has omitted the frame pointer. ... recompiling .. with -fno-omit-frame-pointer:
Non-C-style languages may have a different frame format, or may also omit frame pointers:
4.3. JIT Symbols (Java, Node.js)
Programs that have virtual machines (VMs), like Java's JVM and node's v8, execute their own virtual processor, which has its own way of executing functions and managing stacks. If you profile these using perf_events, you'll see symbols for the VM engine .. perf_events has JIT support to solve this, which requires the VM to maintain a /tmp/perf-PID.map file for symbol translation. Note that Java may not show full stacks to begin with, due to hotspot on x86 omitting the frame pointer (just like gcc). On newer versions (JDK 8u60+), you can use the -XX:+PreserveFramePointer option to fix this behavior, ...
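To make the /tmp/perf-PID.map idea concrete, here is a minimal sketch of what a JIT could write. It assumes the text format described in the perf JIT interface notes (one "START SIZE symbolname" entry per line, with START and SIZE as hex numbers without a 0x prefix). The helper names format_perf_map_entry and perf_map_path are made up for this illustration and are not part of perf or of any VM.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format one perf map entry: "START SIZE symbolname\n".
 * START and SIZE are written as hex without a 0x prefix.
 * Returns the number of characters written, or -1 on error
 * (buffer too small, or a newline inside the symbol name). */
static int format_perf_map_entry(char *buf, size_t buflen,
                                 uint64_t start, uint64_t size,
                                 const char *symbol)
{
    assert(buf != NULL && symbol != NULL);
    if (strchr(symbol, '\n') != NULL)
        return -1; /* a newline would split the entry into two lines */
    int n = snprintf(buf, buflen, "%" PRIx64 " %" PRIx64 " %s\n",
                     start, size, symbol);
    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}

/* Build the file name perf looks up for a given process id. */
static int perf_map_path(char *buf, size_t buflen, long pid)
{
    assert(buf != NULL);
    int n = snprintf(buf, buflen, "/tmp/perf-%ld.map", pid);
    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}
```

A VM would append one such line for every compiled method, so that perf can translate sampled addresses inside JIT-generated code back into names.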
Gregg's blog post about Java and stack traces:
http://techblog.netflix.com/2015/07/java-in-flames.html ("Fixing Frame Pointers": fixed in some JDK8 versions and in JDK9 by adding an option at program start)
Now, your questions:
How does linux's perf utility understand stack traces?
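The perf utility basically (in early versions) only parses data returned from the Linux kernel subsystem "perf_events" (sometimes just "events"), which is accessed via the perf_event_open syscall. For call stack traces there are the options PERF_SAMPLE_CALLCHAIN / PERF_SAMPLE_STACK_USER: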
sample_type
PERF_SAMPLE_CALLCHAIN
Records the callchain (stack backtrace).
PERF_SAMPLE_STACK_USER (since Linux 3.7)
Records the user level stack, allowing stack unwinding.
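As a rough illustration of how a tool requests call chains, the sketch below fills a struct perf_event_attr with PERF_SAMPLE_CALLCHAIN and wraps the raw syscall (glibc provides no perf_event_open wrapper). The choice of a software CPU-clock event, the sample period, and the helper names init_callchain_attr and perf_event_open_raw are assumptions made for this example. Reading samples from the mmap ring buffer is left out.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Prepare an event whose samples carry the instruction pointer
 * plus the call chain collected by the kernel. */
static void init_callchain_attr(struct perf_event_attr *attr,
                                unsigned long long period)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_SOFTWARE;         /* assumed: software clock event */
    attr->config = PERF_COUNT_SW_CPU_CLOCK;
    attr->sample_period = period;            /* one sample every `period` ticks */
    attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_CALLCHAIN;
    attr->disabled = 1;                      /* enable later via ioctl */
    attr->exclude_kernel = 1;                /* user-space frames only */
    attr->exclude_hv = 1;
}

/* Thin wrapper over the raw syscall. Returns a file descriptor,
 * or -1 with errno set (for example when perf_event_paranoid forbids it). */
static long perf_event_open_raw(struct perf_event_attr *attr, pid_t pid,
                                int cpu, int group_fd, unsigned long flags)
{
    return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}
```

Calling perf_event_open_raw(&attr, 0, -1, -1, 0) would monitor the calling process on any CPU. Whether it succeeds depends on kernel configuration and permissions.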
Does the Linux kernel natively understand stack traces?
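It may understand them (if implemented) or it may not, depending on your CPU architecture. The functions that sample a callchain (get/read the call stack from a live process) are defined in the architecture-independent part of the kernel as __weak with an empty body: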
http://lxr.free-electrons.com/source/kernel/events/callchain.c?v=4.4#L26
__weak void perf_callchain_kernel(struct perf_callchain_entry *entry,
                                  struct pt_regs *regs)
{
}

__weak void perf_callchain_user(struct perf_callchain_entry *entry,
                                struct pt_regs *regs)
{
}
In the 4.4 kernel, the user-space callchain sampler is redefined in the architecture-dependent part of the kernel for x86/x86_64, ARC, SPARC, ARM/ARM64, Xtensa, Tilera TILE, PowerPC, and Imagination Meta:
http://lxr.free-electrons.com/ident?v=4.4;i=perf_callchain_user
arch/x86/kernel/cpu/perf_event.c, line 2279
arch/arc/kernel/perf_event.c, line 72
arch/sparc/kernel/perf_event.c, line 1829
arch/arm/kernel/perf_callchain.c, line 62
arch/xtensa/kernel/perf_event.c, line 339
arch/tile/kernel/perf_event.c, line 995
arch/arm64/kernel/perf_callchain.c, line 109
arch/powerpc/perf/callchain.c, line 490
arch/metag/kernel/perf_callchain.c, line 59
Reading the callchain from the user stack may be nontrivial for some architectures and/or some modes.
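For the simple case (x86-64 code compiled with frame pointers), the idea behind a user callchain walker can be sketched as follows. This is a simplified illustration and not the kernel's code: the user stack is simulated by a byte buffer in which "addresses" are offsets, and all names (frame_record, fake_stack, walk_frame_pointers) are invented for the example. Each frame record holds the caller's saved frame pointer followed by the return address, and the walker follows that chain until it leaves the buffer or stops moving toward higher addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Frame record left by "push %rbp; mov %rsp,%rbp" on x86-64. */
struct frame_record {
    uint64_t next_fp;  /* caller's saved frame pointer */
    uint64_t ret_addr; /* return address into the caller */
};

/* Simulated user memory: an "address" is a byte offset into mem. */
struct fake_stack {
    const uint8_t *mem;
    size_t len;
};

/* Bounds- and alignment-checked read, standing in for copy_from_user. */
static int read_frame(const struct fake_stack *s, uint64_t fp,
                      struct frame_record *out)
{
    if (fp % sizeof(uint64_t) != 0)
        return 0;
    if (fp > s->len || s->len - fp < sizeof(*out))
        return 0;
    memcpy(out, s->mem + fp, sizeof(*out));
    return 1;
}

/* Store ip and then each return address into chain[]; returns the count. */
static size_t walk_frame_pointers(const struct fake_stack *s,
                                  uint64_t ip, uint64_t fp,
                                  uint64_t *chain, size_t max)
{
    size_t n = 0;
    assert(s != NULL && chain != NULL);
    if (n < max)
        chain[n++] = ip;
    while (n < max) {
        struct frame_record fr;
        if (!read_frame(s, fp, &fr))
            break;
        chain[n++] = fr.ret_addr;
        /* The stack grows down, so a caller's frame must sit at a
         * higher address; anything else ends (or would loop) the walk. */
        if (fr.next_fp <= fp)
            break;
        fp = fr.next_fp;
    }
    return n;
}
```

If the code was built without frame pointers, the register the walker starts from holds arbitrary data, which is why the walk stops early or yields garbage and why the DWARF-based unwinding mentioned earlier exists.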
What CPU architecture do you use? What language and VM?
Where can I read more about how a tool is able to introspect into stack traces of processes, even if processes are written in completely different languages?
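You can try gdb and/or the language's own debuggers, the backtrace function of libc, or the read-only unwinding support in libunwind (there is a local backtrace example in libunwind, show_backtrace()).
They may have better support for frame parsing, or better integration with the language's virtual machine or with unwind info. If gdb (with the backtrace command) or other debuggers cannot get stack traces from the running program, there may be no way to get a stack trace at all.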
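For the libc route, a minimal sketch using glibc's backtrace() and backtrace_symbols() from <execinfo.h> might look like this. It assumes a glibc-based system. The helper name print_current_backtrace and the 64-frame limit are choices made for this example.

```c
#include <assert.h>
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 64 /* arbitrary cap for this sketch */

/* Capture the calling thread's stack and print one line per frame.
 * Returns the number of frames captured. */
static int print_current_backtrace(FILE *out)
{
    void *frames[MAX_FRAMES];
    assert(out != NULL);

    int n = backtrace(frames, MAX_FRAMES);
    char **symbols = backtrace_symbols(frames, n);
    if (symbols == NULL)
        return n; /* out of memory: addresses were captured but not named */

    for (int i = 0; i < n; i++)
        fprintf(out, "#%d %s\n", i, symbols[i]);

    free(symbols); /* one allocation holds all strings */
    return n;
}
```

Function names typically show up only when the program is linked with -rdynamic; otherwise the lines contain mostly raw addresses. This works in-process only, unlike gdb or perf, which inspect another process.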
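If they can get the call trace but perf can't (even after recompiling C/C++ with -fno-omit-frame-pointer), it may be possible to add support for that combination of architecture + frame format to perf_events and perf.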
There are several blogs with some info about generic backtracing problems and solutions:
__builtin_return_address(N) vs glibc's backtrace() vs libunwind's local backtrace
Dwarf support in perf_events / perf: