Posted on 

容器中 glibc 兼容问题

今天同事在线上做Debain版本升级遇到问题

1
2
3
4
5
6
# kubectl exec -it fqh-realserver-01-69586d7c74-dd9cl bash
command terminated with exit code 139

# apt-get update
E: Method http has died unexpectedly!
E: Sub-process http received a segmentation fault.

可以看到错误代码是 139 是客户端报错,再看一下内核日志

1
2
3
May 11 16:11:20 xxx kernel: [5790773.000196] bash[231251] vsyscall attempted with vsyscall=none ip:ffffffffff600400 cs:33 sp:7ffc7942aad8 ax:ffffffffff600400 si:7ffc7942af76 di:0
May 11 16:11:20 xxx kernel: [5790773.000200] bash[231251]: segfault at ffffffffff600400 ip ffffffffff600400 sp 00007ffc7942aad8 error 15
May 11 16:11:20 xxx kernel: [5790773.000202] Code: Bad RIP value.

问题分析

错误信息就怀疑是 ABI 层面的兼容问题,要验证想法就要看一下为什么发生 segmentation fault。

构造一下测试环境,在一个终端启动一个 docker,因为 docker 是 c/s 模型,没有必要在客户端 debug,客户端仅仅是个发请求的工具。

1
2
# /usr/bin/docker run -it dockerhub.nie.netease.com/frosty/sea sh
#

因为我发现镜像里面的 sh 是可以启动,通过 sh 先将 container 运行起来,任何再 container 运行其他指令。

可以可看到 container,获取这个 container 的 PID,也就是宿主层面的视角的 container 实体。

1
2
3
4
5
6
7
8
9
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
899eea8d3958 dockerhub.nie.netease.com/frosty/sea "sh" 4 seconds ago Up 3 seconds happy_galileo
307eb53d45df e15788915c71 "/usr/bin/cadvisor -…" 7 weeks ago Up 7 weeks k8s_cadvisor_kube-cadvisor-d429r_kube-system_5b88b786-1ecc-4c63-9c10-2b4d5000d3bc_0
076136a16137 dockerhub.nie.netease.com/whale/google_containers "/pause" 7 weeks ago Up 7 weeks k8s_POD_kube-cadvisor-d429r_kube-system_5b88b786-1ecc-4c63-9c10-2b4d5000d3bc_0
# docker inspect 899eea8d3958 | grep Pid
"Pid": 2204330,
"PidMode": "",
"PidsLimit": 0,

gdb attach 进程,附加上去并继续运行 container(这个 gdb 我安装了 peda 插件

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
# gdb -q attach 2204330
attach: No such file or directory.
Attaching to process 2204330
Reading symbols from target:/bin/dash...(no debugging symbols found)...done.
Reading symbols from target:/lib/x86_64-linux-gnu/libc.so.6...(no debugging symbols found)...done.
Reading symbols from target:/lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.

warning: Target and debugger are in different PID namespaces; thread lists and other data are likely unreliable. Connect to gdbserver inside the container.
[----------------------------------registers-----------------------------------]
RAX: 0xfffffffffffffe00
RBX: 0x0
RCX: 0x7f5da5a917d0 --> 0x3173fffff0013d48
RDX: 0x2000 ('')
RSI: 0x61a200 --> 0x0
RDI: 0x0
RBP: 0x61a200 --> 0x0
RSP: 0x7ffdc70f6608 --> 0x40881d (cmp eax,0x0)
RIP: 0x7f5da5a917d0 --> 0x3173fffff0013d48
R8 : 0x7f5da5d48e40 --> 0x100000000
R9 : 0x7f5da5d48e90 --> 0x0
R10: 0x0
R11: 0x246
R12: 0x0
R13: 0x1
R14: 0x0
R15: 0x0
EFLAGS: 0x246 (carry PARITY adjust ZERO sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
0x7f5da5a917c7 <read+7>: jne 0x7f5da5a917d9 <read+25>
0x7f5da5a917c9 <read+9>: mov eax,0x0
0x7f5da5a917ce <read+14>: syscall
=> 0x7f5da5a917d0 <read+16>: cmp rax,0xfffffffffffff001
0x7f5da5a917d6 <read+22>: jae 0x7f5da5a91809 <read+73>
0x7f5da5a917d8 <read+24>: ret
0x7f5da5a917d9 <read+25>: sub rsp,0x8
0x7f5da5a917dd <read+29>: call 0x7f5da5aaa240
[------------------------------------stack-------------------------------------]
0000| 0x7ffdc70f6608 --> 0x40881d (cmp eax,0x0)
0008| 0x7ffdc70f6610 --> 0x1
0016| 0x7ffdc70f6618 --> 0x0
0024| 0x7ffdc70f6620 --> 0x0
0032| 0x7ffdc70f6628 --> 0x40e9cd (mov ebx,eax)
0040| 0x7ffdc70f6630 --> 0x1
0048| 0x7ffdc70f6638 --> 0x40ca67 (test ebp,ebp)
0056| 0x7ffdc70f6640 --> 0x1
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
0x00007f5da5a917d0 in read () from target:/lib/x86_64-linux-gnu/libc.so.6

切换到容器里面执行 ldd (理论上任何非静态打包的程序都可以

1
2
# /usr/bin/docker run -it dockerhub.nie.netease.com/frosty/sea sh
# ldd

发现进程收到 SIGSEGV信号,而这个信号的默认行为就是终止进程。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
[----------------------------------registers-----------------------------------]
RAX: 0xffffffffff600400
RBX: 0x7ffd0d88cf60 ("/bin/bash")
RCX: 0x7361622f6e69622f ('/bin/bas')
RDX: 0x2f ('/')
RSI: 0x7ffd0d88cf60 ("/bin/bash")
RDI: 0x0
RBP: 0x0
RSP: 0x7ffd0d88baf8 --> 0x7f6be2ca160d --> 0x909090c308c48348
RIP: 0xffffffffff600400
R8 : 0x7ffd0d88cf60 ("/bin/bash")
R9 : 0x1
R10: 0x0
R11: 0x7f6be2ca1600 --> 0xc0c74808ec8348
R12: 0x42164c (<_start>: xor ebp,ebp)
R13: 0x7ffd0d88bd40 --> 0x2
R14: 0x0
R15: 0x0
EFLAGS: 0x10206 (carry PARITY adjust zero sign trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0xffffffffff600400
[------------------------------------stack-------------------------------------]
0000| 0x7ffd0d88baf8 --> 0x7f6be2ca160d --> 0x909090c308c48348
0008| 0x7ffd0d88bb00 --> 0x6d ('m')
0016| 0x7ffd0d88bb08 --> 0x420324 (<main+1028>: mov edx,DWORD PTR [rsp+0x28])
0024| 0x7ffd0d88bb10 --> 0x7ffd0d88bc80 --> 0x200000000
0032| 0x7ffd0d88bb18 --> 0x7ffd0d88bd48 --> 0x7ffd0d88cf60 ("/bin/bash")
0040| 0x7ffd0d88bb20 --> 0x7ffd0d88bd60 --> 0x7ffd0d88cf77 ("HOSTNAME=899eea8d3958")
0048| 0x7ffd0d88bb28 --> 0x7f6b00000002
0056| 0x7ffd0d88bb30 --> 0xf63d4e2e
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV
0xffffffffff600400 in ?? ()

发现 RIP 寄存器访问了内核的地址空间了,内核所在高地址。任何系统就给发了一个 SIGSEGV 信号,默认行为就是终止进程。

1
2
3
4
5
6
gdb-peda$ bt
#0 0xffffffffff600400 in ?? ()
#1 0x00007f828873f60d in time () from target:/lib/x86_64-linux-gnu/libc.so.6
#2 0x0000000000420324 in main ()
#3 0x00007f82886c0ead in __libc_start_main () from target:/lib/x86_64-linux-gnu/libc.so.6
#4 0x0000000000421675 in _start ()

可以看到其实进程死的时候调用了标准库里面的函数 time() ,标准库的问题在容器中位于 /lib/x86_64-linux-gnu/libc.so.6

反汇编看一下这个函数的实现

1
2
3
4
5
6
7
8
gdb-peda$ disassemble 0x00007fa58acd060d
Dump of assembler code for function time:
0x00007fa58acd0600 <+0>: sub rsp,0x8
0x00007fa58acd0604 <+4>: mov rax,0xffffffffff600400
0x00007fa58acd060b <+11>: call rax
0x00007fa58acd060d <+13>: add rsp,0x8
0x00007fa58acd0611 <+17>: ret
End of assembler dump.

可以看到 rax 值就是异常的原因,看一下当前进程的 memory layout。可以看到至少没有将 0xffffffffff600400 这个地址映射到进程的地址空间。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
gdb-peda$ vmmap
Start End Perm Name
0x00400000 0x004e5000 r-xp /bin/bash
0x006e4000 0x006e5000 r--p /bin/bash
0x006e5000 0x006ee000 rw-p /bin/bash
0x006ee000 0x006f4000 rw-p mapped
0x0190e000 0x01910000 rw-p [heap]
0x00007fa58ac33000 0x00007fa58adb5000 r-xp /lib/x86_64-linux-gnu/libc-2.13.so
0x00007fa58adb5000 0x00007fa58afb5000 ---p /lib/x86_64-linux-gnu/libc-2.13.so
0x00007fa58afb5000 0x00007fa58afb9000 r--p /lib/x86_64-linux-gnu/libc-2.13.so
0x00007fa58afb9000 0x00007fa58afba000 rw-p /lib/x86_64-linux-gnu/libc-2.13.so
0x00007fa58afba000 0x00007fa58afbf000 rw-p mapped
0x00007fa58afbf000 0x00007fa58afc1000 r-xp /lib/x86_64-linux-gnu/libdl-2.13.so
0x00007fa58afc1000 0x00007fa58b1c1000 ---p /lib/x86_64-linux-gnu/libdl-2.13.so
0x00007fa58b1c1000 0x00007fa58b1c2000 r--p /lib/x86_64-linux-gnu/libdl-2.13.so
0x00007fa58b1c2000 0x00007fa58b1c3000 rw-p /lib/x86_64-linux-gnu/libdl-2.13.so
0x00007fa58b1c3000 0x00007fa58b1e8000 r-xp /lib/x86_64-linux-gnu/libtinfo.so.5.9
0x00007fa58b1e8000 0x00007fa58b3e7000 ---p /lib/x86_64-linux-gnu/libtinfo.so.5.9
0x00007fa58b3e7000 0x00007fa58b3eb000 r--p /lib/x86_64-linux-gnu/libtinfo.so.5.9
0x00007fa58b3eb000 0x00007fa58b3ec000 rw-p /lib/x86_64-linux-gnu/libtinfo.so.5.9
0x00007fa58b3ec000 0x00007fa58b40c000 r-xp /lib/x86_64-linux-gnu/ld-2.13.so
0x00007fa58b604000 0x00007fa58b607000 rw-p mapped
0x00007fa58b609000 0x00007fa58b60b000 rw-p mapped
0x00007fa58b60b000 0x00007fa58b60c000 r--p /lib/x86_64-linux-gnu/ld-2.13.so
0x00007fa58b60c000 0x00007fa58b60d000 rw-p /lib/x86_64-linux-gnu/ld-2.13.so
0x00007fa58b60d000 0x00007fa58b60e000 rw-p mapped
0x00007ffca937e000 0x00007ffca939f000 rw-p [stack]
0x00007ffca93ca000 0x00007ffca93cd000 r--p [vvar]
0x00007ffca93cd000 0x00007ffca93cf000 r-xp [vdso]

确认一下 glibc的版本并将这个动态连接库复制到宿主上,反汇编分析。

1
2
3
4
5
6
# ls /lib/x86_64-linux-gnu/libc.so.6
/lib/x86_64-linux-gnu/libc.so.6
# ls -al /lib/x86_64-linux-gnu/libc.so.6
lrwxrwxrwx 1 root root 12 Oct 16 2014 /lib/x86_64-linux-gnu/libc.so.6 -> libc-2.13.so
# ls /lib/x86_64-linux-gnu/libc-2.13.so
/lib/x86_64-linux-gnu/libc-2.13.so

反汇编如下可以看到和运行时的代码是一样的,也就是说这个版本 glibc 代码里面硬编码了 0xffffffffff600400 这个地址。

1
2
3
4
5
6
7
8
9
10
# gdb -q libc-2.13.so
Reading symbols from libc-2.13.so...(no debugging symbols found)...done.
gdb-peda$ disassemble time
Dump of assembler code for function time:
0x000000000009d600 <+0>: sub rsp,0x8
0x000000000009d604 <+4>: mov rax,0xffffffffff600400
0x000000000009d60b <+11>: call rax
0x000000000009d60d <+13>: add rsp,0x8
0x000000000009d611 <+17>: ret
End of assembler dump.

这也就初步验证了就是应该是 ABI 层面的兼容问题,Google 一下解决方案也是很简单的在 grub 里面配置启动参数 vsyscall=emulate,之所以这样配置的原因是 kernel Config

修改参数修改一下重新尝试,发现内存里面多了一个 vsyscall 的 4k page。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
gdb-peda$ vmmap
Start End Perm Name
0x00400000 0x004e5000 r-xp /bin/bash
0x006e4000 0x006e5000 r--p /bin/bash
0x006e5000 0x006ee000 rw-p /bin/bash
0x006ee000 0x006f4000 rw-p mapped
0x0134e000 0x0138b000 rw-p [heap]
0x00007f2d59391000 0x00007f2d5939c000 r-xp /lib/x86_64-linux-gnu/libnss_files-2.13.so
0x00007f2d5939c000 0x00007f2d5959b000 ---p /lib/x86_64-linux-gnu/libnss_files-2.13.so
0x00007f2d5959b000 0x00007f2d5959c000 r--p /lib/x86_64-linux-gnu/libnss_files-2.13.so
0x00007f2d5959c000 0x00007f2d5959d000 rw-p /lib/x86_64-linux-gnu/libnss_files-2.13.so
0x00007f2d5959d000 0x00007f2d595a7000 r-xp /lib/x86_64-linux-gnu/libnss_nis-2.13.so
0x00007f2d595a7000 0x00007f2d597a6000 ---p /lib/x86_64-linux-gnu/libnss_nis-2.13.so
0x00007f2d597a6000 0x00007f2d597a7000 r--p /lib/x86_64-linux-gnu/libnss_nis-2.13.so
0x00007f2d597a7000 0x00007f2d597a8000 rw-p /lib/x86_64-linux-gnu/libnss_nis-2.13.so
0x00007f2d597a8000 0x00007f2d597bd000 r-xp /lib/x86_64-linux-gnu/libnsl-2.13.so
0x00007f2d597bd000 0x00007f2d599bc000 ---p /lib/x86_64-linux-gnu/libnsl-2.13.so
0x00007f2d599bc000 0x00007f2d599bd000 r--p /lib/x86_64-linux-gnu/libnsl-2.13.so
0x00007f2d599bd000 0x00007f2d599be000 rw-p /lib/x86_64-linux-gnu/libnsl-2.13.so
0x00007f2d599be000 0x00007f2d599c0000 rw-p mapped
0x00007f2d599c0000 0x00007f2d599c7000 r-xp /lib/x86_64-linux-gnu/libnss_compat-2.13.so
0x00007f2d599c7000 0x00007f2d59bc6000 ---p /lib/x86_64-linux-gnu/libnss_compat-2.13.so
0x00007f2d59bc6000 0x00007f2d59bc7000 r--p /lib/x86_64-linux-gnu/libnss_compat-2.13.so
0x00007f2d59bc7000 0x00007f2d59bc8000 rw-p /lib/x86_64-linux-gnu/libnss_compat-2.13.so
0x00007f2d59bc8000 0x00007f2d59d4a000 r-xp /lib/x86_64-linux-gnu/libc-2.13.so
0x00007f2d59d4a000 0x00007f2d59f4a000 ---p /lib/x86_64-linux-gnu/libc-2.13.so
0x00007f2d59f4a000 0x00007f2d59f4e000 r--p /lib/x86_64-linux-gnu/libc-2.13.so
0x00007f2d59f4e000 0x00007f2d59f4f000 rw-p /lib/x86_64-linux-gnu/libc-2.13.so
0x00007f2d59f4f000 0x00007f2d59f54000 rw-p mapped
0x00007f2d59f54000 0x00007f2d59f56000 r-xp /lib/x86_64-linux-gnu/libdl-2.13.so
0x00007f2d59f56000 0x00007f2d5a156000 ---p /lib/x86_64-linux-gnu/libdl-2.13.so
0x00007f2d5a156000 0x00007f2d5a157000 r--p /lib/x86_64-linux-gnu/libdl-2.13.so
0x00007f2d5a157000 0x00007f2d5a158000 rw-p /lib/x86_64-linux-gnu/libdl-2.13.so
0x00007f2d5a158000 0x00007f2d5a17d000 r-xp /lib/x86_64-linux-gnu/libtinfo.so.5.9
0x00007f2d5a17d000 0x00007f2d5a37c000 ---p /lib/x86_64-linux-gnu/libtinfo.so.5.9
0x00007f2d5a37c000 0x00007f2d5a380000 r--p /lib/x86_64-linux-gnu/libtinfo.so.5.9
0x00007f2d5a380000 0x00007f2d5a381000 rw-p /lib/x86_64-linux-gnu/libtinfo.so.5.9
0x00007f2d5a381000 0x00007f2d5a3a1000 r-xp /lib/x86_64-linux-gnu/ld-2.13.so
0x00007f2d5a599000 0x00007f2d5a59c000 rw-p mapped
0x00007f2d5a59e000 0x00007f2d5a5a0000 rw-p mapped
0x00007f2d5a5a0000 0x00007f2d5a5a1000 r--p /lib/x86_64-linux-gnu/ld-2.13.so
0x00007f2d5a5a1000 0x00007f2d5a5a2000 rw-p /lib/x86_64-linux-gnu/ld-2.13.so
0x00007f2d5a5a2000 0x00007f2d5a5a3000 rw-p mapped
0x00007ffe04499000 0x00007ffe044ba000 rw-p [stack]
0x00007ffe044d0000 0x00007ffe044d3000 r--p [vvar]
0x00007ffe044d3000 0x00007ffe044d5000 r-xp [vdso]
0xffffffffff600000 0xffffffffff601000 r-xp [vsyscall]

也就说启用这个特性操作系统会有更好的向下兼容的特性,但是这个操作系统上的每一个进程都会映射一块内核的地址空间。