Dynamic Linking Position-independent Code (PIC)

Posted on 2017-03-07 Edited on 2017-03-18

Problems Dynamic Linking Runs Into

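One big problem dynamic linking sets out to solve is wasted memory. The intuitive idea is that if a library used by several processes exists only once in memory, space is saved. For processes to share a library, its contents must be identical for each of them. A library consists mainly of instructions and data (as does any executable file). Data cannot be shared between processes, because each process needs its own copy or they would interfere with one another (apparently some ancient systems did share it...), so what can be shared is mainly the instructions.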

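In binary files such as executable files, object files and libraries, instructions access symbols through different addressing modes. Absolute addressing writes the symbol's virtual address into the instruction, while relative addressing depends on the relative position between the instruction and the data. In other words, whether it is an executable file or a library meant to be shared, its instructions may embed information about symbol addresses.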

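Unlike an executable file, a library cannot know at compile time where it will be loaded. A system holds many libraries, and if each one picked its own load address they could collide, so the decision is left to the system at runtime, and symbol locations are likewise only known at runtime. As a result, the address information inside instructions cannot be fixed up at build time, which is exactly what static linking does.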

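On the other hand, patching a library's instructions at runtime would leave different processes with different copies of those instructions, so they could not be shared. For example, library A uses an external symbol foo, and process 1 and process 2 both use library A, but they rely on library B and library C respectively to provide foo to library A. The address of foo is then in library B for process 1 and in library C for process 2, so patching library A's instructions at runtime produces two versions.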

Position-independent Code (PIC)

The problems above basically come from instructions containing symbol address information, and Position-independent Code (PIC) was introduced to solve them.

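Because each process has its own data section and can modify the values in it, PIC moves the parts of a library that would have to be patched (the address information in instructions) into the data section, so the instructions no longer depend on addresses and can be shared.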

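Address references in a library fall into references within the library and references across libraries, and each kind is further split into references to data and references to instructions (function call or jump). How a reference is handled depends mainly on whether it stays within the library or crosses libraries.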

Within a library

Within one library the relative positions of instructions and data are fixed, so data access, function calls and jumps can all use relative addressing.
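
To make the relative-addressing idea concrete, here is a minimal sketch of how an x86-64 RIP-relative operand is resolved: the 32-bit displacement is added to the address of the next instruction. The function names are made up for illustration. The second helper checks that the same code loaded at two different bases still reaches the same library-relative target, which is why no patching is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Target of an x86-64 RIP-relative operand: the displacement is relative
 * to the address of the NEXT instruction (insn_addr + insn_len). */
uint64_t rip_relative_target(uint64_t insn_addr, uint64_t insn_len,
                             int32_t disp32)
{
    assert(insn_len > 0 && insn_len <= 15);   /* x86 instructions are 1..15 bytes */
    return insn_addr + insn_len + (uint64_t)(int64_t)disp32;
}

/* Hypothetical helper: load the same code at two bases and compare the
 * targets as offsets from each base. Returns 1 when they agree. */
int same_offset_at_both_bases(uint64_t base1, uint64_t base2,
                              uint64_t insn_off, uint64_t insn_len,
                              int32_t disp32)
{
    uint64_t t1 = rip_relative_target(base1 + insn_off, insn_len, disp32);
    uint64_t t2 = rip_relative_target(base2 + insn_off, insn_len, disp32);
    return (t1 - base1) == (t2 - base2);
}
```

For instance, the 7-byte mov at offset 0xa in the -fPIC listing in the Example section has a zero displacement before relocation, so rip_relative_target(0xa, 7, 0) gives 0x11, the same value objdump prints in its comment.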

Across libraries

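ELF places an array of pointers to symbols in other libraries in the data section, called the Global Offset Table (GOT). An instruction finds the corresponding element in the GOT and makes an indirect reference through it. It first locates the GOT (platforms differ here: relative addressing works, and some keep the GOT address in a dedicated register), then combines the GOT with the symbol's offset within the GOT, which the instruction knows, to get the element, and finally obtains the symbol address.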
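
A toy model of that lookup, assuming a GOT is just a per-process array of addresses. The type got_t, the size GOT_SLOTS and the function load_int_via_got are all invented for this sketch; a real GOT is laid out by the linker, not declared in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { GOT_SLOTS = 4 };   /* hypothetical table size */

/* One GOT per process: it lives in writable data, so each process can
 * hold different addresses while the code stays byte-for-byte identical. */
typedef struct {
    uintptr_t slot[GOT_SLOTS];
} got_t;

/* What a PIC data access boils down to: the code only knows the slot
 * index; the actual address comes from the table at runtime. */
int load_int_via_got(const got_t *got, size_t index)
{
    assert(index < GOT_SLOTS);
    const int *addr = (const int *)got->slot[index];
    return *addr;
}
```

Two processes can run the same load_int_via_got code with different tables: if one table's slot points at a variable in library B and the other's at a variable in library C, each process reads its own value without any instruction being patched.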

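The dynamic linker fills in the GOT when it loads the library, again using relocation table entries to mark which locations to modify and how. The relocation table does not care what lives at the location an offset points to; it just patches whatever is there, so an entry that points at a GOT element patches the GOT. Whether the target is a variable or a function only changes whether the GOT element holds a variable's or a function's address, although ELF does in fact treat variables and functions differently; that part is left for the next post.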
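
The "just patch whatever is at the offset" behavior can be sketched like this. toy_rela is a simplified stand-in for a RELA entry (real entries also carry a relocation type that selects the formula), and apply_relocs is a hypothetical name, not a function of any real loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a RELA entry. */
typedef struct {
    size_t  offset;      /* byte offset of the 8-byte field to patch */
    size_t  sym_index;   /* index into the resolved-symbol array */
    int64_t addend;
} toy_rela;

/* Write (symbol address + addend) into each location the table names.
 * The loop never asks whether the location is a GOT slot or anything
 * else; it only follows the offsets. */
void apply_relocs(unsigned char *image, size_t image_size,
                  const toy_rela *rel, size_t nrel,
                  const uint64_t *sym_addr, size_t nsym)
{
    for (size_t i = 0; i < nrel; i++) {
        assert(rel[i].sym_index < nsym);
        assert(rel[i].offset + sizeof(uint64_t) <= image_size);
        uint64_t value = sym_addr[rel[i].sym_index] + (uint64_t)rel[i].addend;
        memcpy(image + rel[i].offset, &value, sizeof value);
    }
}
```

If the offsets in the table happen to fall inside the GOT, running this loop is what "filling the GOT" means in the model.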

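The GOT makes PIC possible, but at the cost of slower symbol access, because the code must first find the GOT and then go through one more level of indirection.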

Example

foo.c

extern int sum;

int foo(int a, int b)
{
    static int* p = (int*)123;    // non-zero initializer keeps p out of .bss
    p = &sum;
    return a + b;
}

$ gcc -c foo.c
$ gcc -fPIC -c foo.c -o foo.o.pic

The difference with and without -fPIC:

$ objdump -d foo.o

foo.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <foo>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   89 7d fc                mov    %edi,-0x4(%rbp)
   7:   89 75 f8                mov    %esi,-0x8(%rbp)
   a:   48 c7 05 00 00 00 00    movq   $0x0,0x0(%rip)        # 15 <foo+0x15>
  11:   00 00 00 00 
  15:   8b 55 fc                mov    -0x4(%rbp),%edx
  18:   8b 45 f8                mov    -0x8(%rbp),%eax
  1b:   01 d0                   add    %edx,%eax
  1d:   5d                      pop    %rbp
  1e:   c3                      retq   


$ objdump -d foo.o.pic

foo.o.pic:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <foo>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   89 7d fc                mov    %edi,-0x4(%rbp)
   7:   89 75 f8                mov    %esi,-0x8(%rbp)
   a:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 11 <foo+0x11>
  11:   48 89 05 00 00 00 00    mov    %rax,0x0(%rip)        # 18 <foo+0x18>
  18:   8b 55 fc                mov    -0x4(%rbp),%edx
  1b:   8b 45 f8                mov    -0x8(%rbp),%eax
  1e:   01 d0                   add    %edx,%eax
  20:   5d                      pop    %rbp
  21:   c3                      retq

As a result, the object file compiled with -fPIC differs from the one compiled without it ~~(stating the obvious)~~, and the one built without -fPIC cannot be linked into a shared object.

$ gcc -shared foo.o -o foo.so
/usr/bin/ld: foo.o: relocation R_X86_64_32S against `sum' can not be used when making a shared object; recompile with -fPIC

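An object file whose relocation table contains an R_X86_64_32S relocation cannot be turned into a DSO: that relocation asks for a sign-extended 32-bit absolute address to be written into the instruction, which is not possible when the load address is only decided at runtime.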

$ readelf -r foo.o

Relocation section '.rela.text' at offset 0x240 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
00000000000d  000300000002 R_X86_64_PC32     0000000000000000 .data - 8
000000000011  000a0000000b R_X86_64_32S      0000000000000000 sum + 0

Relocation section '.rela.eh_frame' at offset 0x270 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000200000002 R_X86_64_PC32     0000000000000000 .text + 0

$ readelf -r foo.o.pic

Relocation section '.rela.text' at offset 0x278 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
00000000000d  000b00000009 R_X86_64_GOTPCREL 0000000000000000 sum - 4
000000000014  000300000002 R_X86_64_PC32     0000000000000000 .data - 4

Relocation section '.rela.eh_frame' at offset 0x2a8 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000200000002 R_X86_64_PC32     0000000000000000 .text + 0
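
The Info column in the readelf output packs two fields: on ELF64 the upper 32 bits are the symbol table index and the lower 32 bits are the relocation type. A small sketch that decodes the values shown above; the helper names are invented here, and the type numbers are the ones the x86-64 psABI assigns.

```c
#include <assert.h>
#include <stdint.h>

/* Relocation type numbers as assigned by the x86-64 psABI. */
enum {
    R_X86_64_PC32     = 2,
    R_X86_64_GOTPCREL = 9,
    R_X86_64_32S      = 11
};

/* ELF64 r_info layout: symbol index in the high half, type in the low half. */
uint32_t rela_sym(uint64_t info)  { return (uint32_t)(info >> 32); }
uint32_t rela_type(uint64_t info) { return (uint32_t)(info & 0xffffffffu); }

/* PC-relative types survive being loaded anywhere; 32S is an absolute address. */
int is_pc_relative(uint32_t type)
{
    return type == R_X86_64_PC32 || type == R_X86_64_GOTPCREL;
}

/* Decode the Info values from the two readelf listings. */
void check_readelf_examples(void)
{
    /* foo.o: "sum + 0" is symbol 10 with an absolute 32S relocation */
    assert(rela_sym(0x000a0000000bULL) == 10);
    assert(rela_type(0x000a0000000bULL) == R_X86_64_32S);
    assert(!is_pc_relative(rela_type(0x000a0000000bULL)));

    /* foo.o.pic: "sum - 4" is symbol 11, reached through the GOT */
    assert(rela_sym(0x000b00000009ULL) == 11);
    assert(rela_type(0x000b00000009ULL) == R_X86_64_GOTPCREL);
    assert(is_pc_relative(rela_type(0x000b00000009ULL)));

    /* both files: the .data reference is plain PC-relative */
    assert(rela_type(0x000300000002ULL) == R_X86_64_PC32);
}
```

Read this way, the only entry that is not PC-relative is the R_X86_64_32S one in foo.o, which is the relocation ld complains about.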

Next, look at the sections of the shared library; only the more relevant ones are listed.

$ gcc -fPIC -shared -o foo.so foo.c
$ readelf -S foo.so
There are 27 section headers, starting at offset 0x1170:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  ...
  [ 3] .dynsym           DYNSYM           00000000000001f8  000001f8
       0000000000000150  0000000000000018   A       4     2     8
  [ 4] .dynstr           STRTAB           0000000000000348  00000348
       00000000000000ab  0000000000000000   A       0     0     1
  ...
  [ 7] .rela.dyn         RELA             0000000000000430  00000430
       00000000000000d8  0000000000000018   A       3     0     8
  [ 8] .rela.plt         RELA             0000000000000508  00000508
       0000000000000030  0000000000000018  AI       3    10     8
  ...
  [10] .plt              PROGBITS         0000000000000560  00000560
       0000000000000030  0000000000000010  AX       0     0     16
  ...
  [18] .dynamic          DYNAMIC          0000000000200760  00000760
       00000000000001c0  0000000000000010  WA       4     0     8
  [19] .got              PROGBITS         0000000000200920  00000920
       0000000000000030  0000000000000008  WA       0     0     8
  [20] .got.plt          PROGBITS         0000000000200950  00000950
       0000000000000028  0000000000000008  WA       0     0     8
  ...
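
A quick way to read the table above is to divide each section's Size by its EntSize to get the number of entries. The sketch below does that arithmetic for the sections listed; entry_count and check_section_counts are names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of fixed-size entries in a section, from the Size and EntSize
 * columns of readelf -S. */
uint64_t entry_count(uint64_t size, uint64_t entsize)
{
    assert(entsize != 0 && size % entsize == 0);
    return size / entsize;
}

/* Sizes copied from the readelf -S output of foo.so. */
void check_section_counts(void)
{
    assert(entry_count(0x30, 0x08) == 6);   /* .got      */
    assert(entry_count(0x28, 0x08) == 5);   /* .got.plt  */
    assert(entry_count(0x30, 0x18) == 2);   /* .rela.plt */
    assert(entry_count(0xd8, 0x18) == 9);   /* .rela.dyn */
    assert(entry_count(0x30, 0x10) == 3);   /* .plt      */
}
```

The numbers are consistent with each other under the usual x86-64 convention, which is an assumption here rather than something this output proves: the first three .got.plt slots and the first .plt stub are reserved for the dynamic linker, leaving two slots and two stubs for the two .rela.plt entries.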

Lazy Binding

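Dynamic linking trades a little performance for flexibility in how modules are used. The slowdown comes from two places: the linking work done when the program starts and the indirect addressing introduced by the GOT.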

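A program may contain many functions that are never or rarely used during a run, such as error handling and seldom-used features. Linking every library function at startup is clearly somewhat wasteful, since time may be spent linking something that is never used. Binding a symbol (symbol lookup, relocation and so on) only when the function is first used speeds up program startup; this approach is called lazy binding.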

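ELF implements lazy binding with the Procedure Linkage Table (PLT). Without lazy binding, a function in another module is reached by an indirect jump through the GOT. With lazy binding the GOT is not completely filled in when a module is loaded, so a layer of PLT handling comes before the jump through the GOT: if the GOT element has not been filled in yet, the dynamic linker first finds the function's address, writes it into the GOT, and then jumps to the function. On later uses of the same function the GOT already holds the address, so the indirect jump can be taken directly.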
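
The control flow of lazy binding can be imitated in plain C with a table of function pointers. This is only a model: every name below is invented, resolve stands in for the dynamic linker's symbol lookup, and the NULL check is a simplification. On x86-64 the real PLT stub does not test anything; an unbound GOT slot initially points back into the stub, which then enters the resolver.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int, int);

enum { NSLOTS = 2 };               /* hypothetical number of imported functions */

static func_t got[NSLOTS];         /* NULL means "not bound yet" */
static int resolve_count;          /* how many times the slow path ran */

static int add(int a, int b) { return a + b; }
static int mul(int a, int b) { return a * b; }

/* Stand-in for the dynamic linker's symbol lookup. */
static func_t resolve(size_t slot)
{
    resolve_count++;
    return slot == 0 ? add : mul;
}

/* Stand-in for a PLT stub: bind on first use, then always jump through
 * the table entry. */
int call_via_plt(size_t slot, int a, int b)
{
    assert(slot < NSLOTS);
    if (got[slot] == NULL)
        got[slot] = resolve(slot);
    return got[slot](a, b);
}

int lazy_resolve_count(void) { return resolve_count; }
```

Calling call_via_plt(0, 2, 3) twice runs resolve only once, and a slot that is never called is never resolved at all, which is the startup saving lazy binding is after.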

Dynamic Linking Basic
Dynamic Linking Relocation
Shared Library Versioning
Use Shared Library in Linux
Explicit Runtime Linking
《程式設計師的自我修養》ch7
https://www.bottomupcs.com/chapter08.xhtml
你所不知道的 C 語言：動態連結器篇