A grab bag of debugging tips

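Most of the time, debugging comes down to one of a few situations:

1. A program or library has hit a bug

2. A failure that is hard to resolve by reasoning alone

3. Learning how something works

4. Analyzing a vulnerability

Different goals call for different methods, but I think what matters most is familiarity with system calls and network protocols, followed by a good command of the available tools.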

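1 Tracing function calls

Many high-level languages have tools for tracing function calls, and these make it quick to map out a program's call flow. PHP, for example, has Xdebug: you only need to call the functions it provides from your PHP source: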

xdebug_start_trace('/path/to/trace-log');  // path of the trace log to write
// ... the code you want to trace ...
xdebug_stop_trace();
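
The same start/stop idea exists in other languages. As a rough illustration (this is not Xdebug, just an analogous technique), the sketch below uses Python's standard sys.setprofile hook to record which Python functions get called while a piece of code runs. The helper trace_calls and the demo functions parse and normalize are hypothetical names made up for this example.

```python
import sys


def trace_calls(func, *args, **kwargs):
    """Run func and return (result, names of Python functions entered)."""
    calls = []

    def profiler(frame, event, arg):
        # "call" fires when a Python-level function is entered;
        # calls into C builtins arrive as "c_call" and are ignored here.
        if event == "call":
            calls.append(frame.f_code.co_name)

    sys.setprofile(profiler)      # comparable to starting a trace
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)      # always stop tracing, even on error
    return result, calls


# Hypothetical code under test.
def normalize(word):
    return word.strip().lower()


def parse(text):
    words = []
    for word in text.split():
        words.append(normalize(word))
    return words


result, calls = trace_calls(parse, "Hello World")
print(calls)  # ['parse', 'normalize', 'normalize']
```

A real tracer would also record arguments, nesting depth, and timing; this only captures the order of calls.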

1.1 strace

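At the operating-system level, the strace command traces the system calls a program makes. A common case: the documentation for some downloaded software never says where its configuration file lives, so it fails at startup because it cannot find one. With strace we only need to look at which files the program tries to open during startup.

As an example, let's run cat on a nonexistent file under strace (strace cat /etc/test.conf) and see what happens: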

execve("/usr/bin/cat", ["cat", "/etc/test.conf"], 0x7ffed4e728b8 /* 69 vars */) = 0
brk(NULL)                               = 0x558aaba6d000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=164363, ...}) = 0
mmap(NULL, 164363, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f26099c3000
close(3)                                = 0
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\00002\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2123224, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f26099c1000
mmap(NULL, 3926752, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f2609407000
mprotect(0x7f26095bc000, 2097152, PROT_NONE) = 0
mmap(0x7f26097bc000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b5000) = 0x7f26097bc000
mmap(0x7f26097c2000, 15072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f26097c2000
close(3)                                = 0
arch_prctl(ARCH_SET_FS, 0x7f26099c2540) = 0
mprotect(0x7f26097bc000, 16384, PROT_READ) = 0
mprotect(0x558aab405000, 4096, PROT_READ) = 0
mprotect(0x7f26099ec000, 4096, PROT_READ) = 0
munmap(0x7f26099c3000, 164363)          = 0
brk(NULL)                               = 0x558aaba6d000
brk(0x558aaba8e000)                     = 0x558aaba8e000
brk(NULL)                               = 0x558aaba8e000
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=209522432, ...}) = 0
mmap(NULL, 209522432, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f25fcc36000
close(3)                                = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
openat(AT_FDCWD, "/etc/test.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
write(2, "cat: ", 5cat: )                    = 5
write(2, "/etc/test.conf", 14/etc/test.conf)          = 14
openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2997, ...}) = 0
read(3, "# Locale name alias data base.\n#"..., 4096) = 2997
read(3, "", 4096)                       = 0
close(3)                                = 0
openat(AT_FDCWD, "/usr/share/locale/zh_CN.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/zh_CN.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/zh_CN/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/zh.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/zh.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/zh/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
write(2, ": No such file or directory", 27: No such file or directory) = 27
write(2, "\n", 1
)                       = 1
close(1)                                = 0
close(2)                                = 0
exit_group(1)                           = ?
+++ exited with 1 +++

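The output shows that many of cat's openat calls failed (they returned -1 with ENOENT). cat's own error message is mixed into the trace because both are written to the terminal. The following expression restricts the trace to openat calls: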

strace -e 'trace=openat' cat /etc/test.conf
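
When the trace is long, it can help to post-process it. The following is a minimal sketch, assuming the trace was saved as text (for example with strace's -o option) and that each failed call sits on one line in the format shown above. The function name failed_paths is made up for this example, and the regular expressions are deliberately simple: they do not handle escaped quotes inside paths or interleaved program output.

```python
import re

# Matches a failed call such as:
#   openat(AT_FDCWD, "/etc/test.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
FAILED_CALL = re.compile(
    r'^(?P<syscall>\w+)\((?P<args>.*)\)\s+= -1 (?P<errno>[A-Z0-9]+) \((?P<message>.*)\)$'
)
# First double-quoted argument, assumed to be the path.
PATH_ARG = re.compile(r'"([^"]*)"')


def failed_paths(strace_output, syscall="openat"):
    """Return (path, errno name) pairs for failed calls to the given syscall."""
    failures = []
    for line in strace_output.splitlines():
        match = FAILED_CALL.match(line.strip())
        if match is None or match.group("syscall") != syscall:
            continue
        path = PATH_ARG.search(match.group("args"))
        if path:
            failures.append((path.group(1), match.group("errno")))
    return failures


# Three lines taken from the trace above.
sample = '''\
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/test.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
'''
print(failed_paths(sample))  # [('/etc/test.conf', 'ENOENT')]
```

Passing syscall="access" would report the failed access call on /etc/ld.so.preload instead.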

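1.2 ktrace

ktrace is the BSD counterpart of strace on Linux: it traces the system calls of a process. ktrace writes its trace as a binary file, which kdump then decodes.

Example 1: tracking down a missing file

Suppose I changed the data directory in MySQL's configuration file /etc/my.cnf, but it still complains that some file cannot be found. Run it under ktrace: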

# ktrace mysqld_safe

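This creates a binary file named ktrace.out in the current directory. kdump (its -f option names the dump file to read) then shows the recorded calls, for example: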

22866 sh       CALL  munmap(0x32be7206000,0xff0)
22866 sh       RET   munmap 0
22866 sh       CALL  write(2,0x32c3dc7b410,0x6a)
22866 sh       GIO   fd 2 wrote 106 bytes
"/usr/local/bin/mysqld_safe[956]: cannot create /var/mysql/host.err: No such file or directory
"
22866 sh       RET   write 106/0x6a
22866 sh       CALL  read(10,0x32c3dc7bc58,0x200)
22866 sh       RET   read 0
22866 sh       CALL  close(10)
22866 sh       RET   close 0

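Here CALL marks a system call being made, RET is the call's return value, and NAMI is a file path the kernel looked up. For example, to find which external files the command touched, along with the return value of each operation: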

# kdump | fgrep -A 2 NAMI

The result looks something like this:

22866 sh       CALL  sigprocmask(SIG_BLOCK,0x80000<SIGCHLD>)
22866 sh       NAMI  "/var/mysql/host.pid"
22866 sh       RET   stat -1 errno 2 No such file or directory
22866 sh       CALL  pipe(0x7f7ffffda880)
22866 sh       NAMI  "/var/mysql/host.err"
22866 sh       RET   open -1 errno 2 No such file or directory
22866 sh       CALL  issetugid()
22866 sh       NAMI  "/usr/share/nls/C/libc.cat"

Or, to see only the system calls that returned an error:

# kdump | fgrep -B 2 errno
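
The grep context lines work, but pairing each path with its error by eye gets tedious. Below is a minimal sketch of the same idea in Python, assuming kdump's text output has the "pid command TYPE rest" layout seen in the samples above. The function name failed_lookups is hypothetical, and failed calls that have no preceding NAMI record are skipped.

```python
import re

# pid, command, record type, rest of the line (layout assumed from the samples).
RECORD = re.compile(
    r'^\s*(?P<pid>\d+)\s+(?P<cmd>\S+)\s+(?P<kind>[A-Z]+)\s+(?P<rest>.*)$'
)
# e.g. "open -1 errno 2 No such file or directory"
FAILED_RET = re.compile(r'^(?P<syscall>\w+) -1 errno (?P<errno>\d+) (?P<message>.*)$')


def failed_lookups(kdump_output):
    """Pair each failed RET with the latest NAMI path seen for the same pid."""
    last_path = {}   # pid -> path from the most recent NAMI record
    failures = []
    for line in kdump_output.splitlines():
        record = RECORD.match(line)
        if record is None:
            continue  # e.g. the quoted payload lines that follow a GIO record
        pid, kind, rest = record.group("pid", "kind", "rest")
        if kind == "NAMI":
            last_path[pid] = rest.strip().strip('"')
        elif kind == "RET":
            failed = FAILED_RET.match(rest.strip())
            path = last_path.pop(pid, None)  # a RET ends the pending lookup
            if failed and path is not None:
                failures.append(
                    (path, failed.group("syscall"), int(failed.group("errno")))
                )
    return failures


# Lines taken from the kdump output shown earlier.
sample = '''\
22866 sh       NAMI  "/var/mysql/host.pid"
22866 sh       RET   stat -1 errno 2 No such file or directory
22866 sh       NAMI  "/var/mysql/host.err"
22866 sh       RET   open -1 errno 2 No such file or directory
22866 sh       RET   close 0
'''
for path, syscall, errno in failed_lookups(sample):
    print(syscall, errno, path)
```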

Example 2: finding the configuration file

When running Nginx, I want to know which directory its configuration file is loaded from:

# ktrace -t n nginx

The -t option selects which trace points to record; n records namei (path name lookup) translations only.

# kdump -f ktrace.out | fgrep .conf
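
If the fgrep output contains many repeated lookups, a small script can reduce it to a list of distinct paths. This is only a sketch under the same assumption about kdump's NAMI line layout. The function name config_paths is made up, and the pid and paths in the sample are invented placeholders, not real Nginx output. Note that it matches on the path suffix, which is stricter than fgrep's match anywhere in the line.

```python
import re

# A NAMI record carries the looked-up path in double quotes.
NAMI = re.compile(r'\bNAMI\s+"(?P<path>[^"]*)"')


def config_paths(kdump_output, suffix=".conf"):
    """Return distinct NAMI paths ending in suffix, in first-seen order."""
    paths = []
    for line in kdump_output.splitlines():
        match = NAMI.search(line)
        if match is None:
            continue
        path = match.group("path")
        if path.endswith(suffix) and path not in paths:
            paths.append(path)
    return paths


# Invented sample lines; the pid and paths are placeholders for illustration.
sample = '''\
 1234 nginx    NAMI  "/example/etc/nginx/nginx.conf"
 1234 nginx    NAMI  "/example/etc/nginx/mime.types"
 1234 nginx    NAMI  "/example/etc/nginx/nginx.conf"
'''
print(config_paths(sample))  # ['/example/etc/nginx/nginx.conf']
```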