I was looking at the symbols in the libc.a file and noticed that some of them are "ABS" symbols.
For example, there is the "_nl_current_LC_COLLATE_used" symbol.
Here is the output of readelf on the libc.a file.
The symbols:
File: libc.a(setlocale.o)
Symbol table '.symtab' contains 77 entries:
Num: Value Size Type Bind Vis Ndx Name
...
39: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _nl_current_LC_COLLATE_used
...
File: libc.a(uselocale.o)
Symbol table '.symtab' contains 34 entries:
Num: Value Size Type Bind Vis Ndx Name
...
6: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _nl_current_LC_COLLATE_used
...
File: libc.a(lc-collate.o)
Symbol table '.symtab' contains 5 entries:
Num: Value Size Type Bind Vis Ndx Name
...
1: 0000000000000002 0 NOTYPE GLOBAL DEFAULT ABS _nl_current_LC_COLLATE_used
...
The relocations:
File: libc.a(setlocale.o)
Relocation section '.rela.text' at offset 0x1b98 contains 124 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
...
00000000000009a3 0000002700000009 R_X86_64_GOTPCREL 0000000000000000 _nl_current_LC_COLLATE_used - 5
...
Relocation section '.rela.data.rel.ro' at offset 0x2738 contains 13 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
...
0000000000000098 0000002700000001 R_X86_64_64 0000000000000000 _nl_current_LC_COLLATE_used + 0
...
File: libc.a(uselocale.o)
Relocation section '.rela.text' at offset 0x838 contains 29 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
...
0000000000000029 0000000600000009 R_X86_64_GOTPCREL 0000000000000000 _nl_current_LC_COLLATE_used - 5
...
So, the "_nl_current_LC_COLLATE_used" symbol is the target of:
two R_X86_64_GOTPCREL relocations
one R_X86_64_64 relocation
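To check that I am reading the "Info" column correctly, here is a small C sketch of how I understand the field to be split on ELF64: the upper 32 bits are the symbol table index and the lower 32 bits are the relocation type. The helper and constant names (r_sym, r_type, RELOC_64, RELOC_GOTPCREL) are my own; the type numbers 1 and 9 are the ones the x86-64 psABI assigns to R_X86_64_64 and R_X86_64_GOTPCREL.

```c
#include <stdint.h>

/* Relocation type numbers as assigned by the x86-64 psABI.
   The enumerator names are hypothetical, chosen for this sketch. */
enum {
    RELOC_64       = 1,  /* R_X86_64_64 */
    RELOC_GOTPCREL = 9   /* R_X86_64_GOTPCREL */
};

/* Upper 32 bits of an ELF64 r_info field: index into .symtab. */
static uint32_t r_sym(uint64_t r_info)
{
    return (uint32_t)(r_info >> 32);
}

/* Lower 32 bits of an ELF64 r_info field: relocation type. */
static uint32_t r_type(uint64_t r_info)
{
    return (uint32_t)(r_info & 0xffffffffu);
}
```

With the values from the dump, r_sym(0x0000002700000009) is 39 and r_type is 9, which matches entry 39 of the setlocale.o symbol table; 0x0000000600000009 gives symbol 6 in uselocale.o, and 0x0000002700000001 gives symbol 39 with type 1.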
If I understand correctly, that means the address of the "_nl_current_LC_COLLATE_used" symbol is needed, so this symbol must be defined somewhere in the process memory.
But, this symbol is present in 3 files:
twice as a "WEAK UNDEFINED" symbol, which is fine: a definition should be found later by the linker
once as an "ABS" symbol with the value "2"
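The distinction I am drawing between these entries comes from the "Ndx" column (st_shndx). As a sketch of my reading, the classification could be written like this in C. The struct is a reduced stand-in for Elf64_Sym, and every name prefixed with demo_ or DEMO_ is hypothetical; the reserved index values 0, 0xfff1 and 0xfff2 are the ones the ELF specification gives for SHN_UNDEF, SHN_ABS and SHN_COMMON.

```c
#include <stdint.h>

/* Reserved section indexes from the ELF specification. */
#define DEMO_SHN_UNDEF  0x0000u
#define DEMO_SHN_ABS    0xfff1u
#define DEMO_SHN_COMMON 0xfff2u

/* Reduced stand-in for Elf64_Sym: only the fields used here. */
struct demo_sym {
    uint64_t value;  /* st_value, the "Value" column */
    uint16_t shndx;  /* st_shndx, the "Ndx" column */
};

enum demo_kind {
    DEMO_UNDEFINED,        /* UND: must be resolved from another object */
    DEMO_ABSOLUTE,         /* ABS: st_value is not tied to any section */
    DEMO_COMMON,           /* COM: unallocated common block */
    DEMO_SECTION_RELATIVE  /* st_value is an offset into section shndx */
};

static enum demo_kind demo_classify(const struct demo_sym *s)
{
    if (s->shndx == DEMO_SHN_UNDEF)
        return DEMO_UNDEFINED;
    if (s->shndx == DEMO_SHN_ABS)
        return DEMO_ABSOLUTE;
    if (s->shndx == DEMO_SHN_COMMON)
        return DEMO_COMMON;
    return DEMO_SECTION_RELATIVE;
}
```

Applied to the dump, the entries in setlocale.o and uselocale.o (value 0, Ndx UND) classify as undefined, and the entry in lc-collate.o (value 2, Ndx ABS) classifies as absolute.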
There is no other definition anywhere else, so when I compile a simple helloworld.c file and link it against libc.a, this definition of "_nl_current_LC_COLLATE_used" must be the one used to produce the final binary, right?
But there is no section associated with this symbol, only the value "2".
According to the ELF specification:
The SHN_ABS value specifies absolute values for the corresponding reference. This means that if a symbol references this section, it already has an absolute value and is not affected by relocation.
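If I take that rule literally, this is how I would expect a linker to compute the symbol value S and apply the R_X86_64_64 relocation, which the x86-64 psABI defines as S + A. This is only a sketch of my understanding, not how any real linker is written: the function names, the DEMO_SHN_ABS macro and the section_base parameter (the address the linker assigned to the defining section) are all hypothetical.

```c
#include <stdint.h>

#define DEMO_SHN_ABS 0xfff1u  /* SHN_ABS from the ELF specification */

/* Compute S, the final value of a defined symbol from a relocatable object.
   For an absolute symbol, st_value is used unchanged; otherwise st_value is
   an offset from the start of the defining section. */
static uint64_t demo_symbol_value(uint64_t st_value, uint16_t st_shndx,
                                  uint64_t section_base)
{
    if (st_shndx == DEMO_SHN_ABS)
        return st_value;              /* section_base is ignored */
    return section_base + st_value;
}

/* R_X86_64_64 stores the 64-bit result of S + A at the relocation offset. */
static uint64_t demo_apply_r_x86_64_64(uint64_t s, int64_t addend)
{
    return s + (uint64_t)addend;
}
```

Under these assumptions, demo_symbol_value(2, DEMO_SHN_ABS, base) is 2 for any base, so the R_X86_64_64 relocation at offset 0x98 of .data.rel.ro in setlocale.o (addend 0) would simply be filled with the 64-bit value 2.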
So what, is "2" the absolute address of this symbol? I don't think so.
Looking at the libc source code, "_nl_current_LC_COLLATE_used" is defined (via a macro and an asm directive) as a constant integer with the value 2.
What I don't understand:
In the final binary, where is the value "2" stored? Is it up to the linker to create a new data section to hold the values of all the absolute symbols?
If so, how does the linker know which size to use for each absolute symbol's value? Here, judging by the source code, the value "2" should be stored as a 64-bit integer, but the "Size" field in .symtab is set to 0.
How does the linker know whether the symbol should be stored in a read-only (R) or a read-write (RW) data section?