.. HTML version generated with LC_ALL=C rst2html -t README > README.html

.. |date| date:: %b %e, %Y

Bold - The Byte Optimized Linker
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Contact: <amand.tihon@alrj.org>
:Copyright: GNU GPL version 3 + Exception, see copyright file.

.. contents:: Table of contents

Bold is an ELF linker, currently only targeting x86_64 under Linux. Being
limited in capabilities, it should not be considered an all-purpose linker.

Bold's main purpose is to generate very small executable programs.

While ``ld`` from the GNU binutils can do almost anything anyone would ever
need, some specific goals need an awful lot of tweaking, or simply cannot be
achieved. Bold uses several tricks to reduce the size of the final executable

You can download the tarball from http://www.alrj.org/projects/bold
or get the latest development version with the following git command::

  git clone http://git.alrj.org/git/bold.git

A gitweb interface is also available at http://git.alrj.org/

Bold itself is entirely written in Python. There are no additional

The runtime library that contains the external symbols resolver is written
in assembler (Intel syntax). An assembler such as Nasm or Yasm is needed to
recompile the source code into an object file.

Go into Bold's directory, and run ::

Then, as root or using sudo, run ::

  python setup.py install

bold [options] objfile...

Bold combines a number of object files, relocates their data and resolves
their symbol references, in order to generate executable binaries.

Bold has only one, very specific purpose: making small executables.

Show program's version and exit.

Show help message and exit.

-e SYMBOL, --entry=SYMBOL
  Use SYMBOL as the explicit symbol for beginning execution of your program.
  If ``--raw`` is specified, it defaults to ``_start``.

-l LIBNAME, --library=LIBNAME
  Link against the shared library specified by LIBNAME. Bold relies on Python's
  ctypes module to find the libraries. This option may be used any number of
  times.

-L DIRECTORY, --library-path=DIRECTORY
  This option does nothing, and is present only for compatibility reasons. It
  MAY get implemented in the future, though. This option may be used any number
  of times.

-o FILE, --output=FILE
  Set the output file name (default value is a.out).

Don't include the builtin external symbols resolution code. This is
described in detail further in this document.

Make external symbols directly callable by C, without having to declare the
pointers on functions. This option adds 6 bytes for each externally defined
function. This is described in detail further in this document.

Align the wrappers for external symbols on an 8 byte boundary, to take
advantage of the RIP-relative addressing. This is described in detail
further in this document.

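
The ``-l`` option above is said to rely on Python's ctypes module to locate
libraries. The following is only a minimal sketch of what such a lookup can
look like, using ``ctypes.util.find_library`` from the standard library. The
helper name ``resolve_libraries`` is hypothetical, and this is not necessarily
how Bold calls ctypes internally.

```python
from ctypes.util import find_library


def resolve_libraries(names):
    """Map each library name (as it would be given to -l) to the file name
    reported by ctypes, or None when ctypes cannot find it.

    Hypothetical helper for illustration, not part of Bold.
    """
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    # "c" and "SDL" are example names; the result depends on the system.
    for name, found in resolve_libraries(["c", "SDL"]).items():
        print(name, "->", found)
```

A name that ctypes cannot resolve maps to ``None``, which a linker would
presumably report as an error.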
The ``LD_PRELOAD`` environment variable may not always work (as expected or

The ``main()`` function is called without any argument. Its return code is used
as exit code, though.

External symbols resolution
---------------------------

The "import by hash" method is from parapete, leblane, las, as described on
http://www.pouet.net/topic.php?which=5392

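
The general idea behind importing by hash can be sketched as follows: the
executable stores a hash of each symbol name instead of the name itself, and
the resolver hashes the names exported by the loaded libraries until it finds
a match. The hash function below is purely illustrative and is an assumption,
as is the dictionary standing in for a library's symbol table. Neither is
taken from Bold's runtime library.

```python
def name_hash(name):
    """Illustrative 32-bit hash of a symbol name.

    Assumption: this is NOT necessarily the hash used by Bold.
    """
    h = 0
    for byte in name.encode("ascii"):
        h = (h * 31 + byte) & 0xFFFFFFFF
    return h


def resolve_by_hash(wanted_hashes, exported_symbols):
    """Return the address matching each wanted hash, or None if not found.

    exported_symbols is a dict {name: address} standing in for the dynamic
    symbol table of a loaded library (hypothetical data).
    """
    by_hash = {name_hash(n): addr for n, addr in exported_symbols.items()}
    return [by_hash.get(h) for h in wanted_hashes]


if __name__ == "__main__":
    exports = {"SDL_Init": 0x1000, "SDL_SetVideoMode": 0x2000}
    # Only the hashes would be stored in the executable, not the strings.
    wanted = [name_hash("SDL_SetVideoMode"), name_hash("SDL_Init")]
    print([hex(a) for a in resolve_by_hash(wanted, exports)])
```

Storing a fixed-size hash per symbol rather than its name is what makes the
approach attractive for size-constrained executables.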
If you write your code in C and need to call the external symbols, you
basically have two options. The first one is to redefine them (or define new
ones) so that they are called through pointers. For instance, ::

  int (*SDL_Init)(int);

Repeat it for all functions, or write a tool to automate it (hint: look at
http://research.mercury-labs.org/ibh-i386-0.2.2.tar.gz for help).

There's a second possibility however, and it's the one used by Bold when you
specify the ``--ccall`` option: make the resolved symbol point, not to the
address of the function, but to a JMP instruction to the actual address: ::

  SDL_Init:                 jmp [rel _bold__SDL_Init]
  SDL_SetVideoMode:         jmp [rel _bold__SDL_SetVideoMode]

  _bold__SDL_Init           resq 1    ; Filled by the import by hash code
  _bold__SDL_SetVideoMode   resq 1

This approach takes 6 bytes (the JMP instruction) for each external function

The x86_64 architecture has this nice thing called "RIP-relative addressing".
If all the JMP instructions are in the same order as the pointers to the
functions they refer to, having them aligned with the pointers would result
in identical instructions. This is done with the ``--align`` option.

Adding two null bytes between each JMP enlarges the final executable by
2 x (number of functions - 1) bytes, and may seem to go against our goal.
However, the result is a repetition of the *same eight bytes*, something that
can improve compression a lot!

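
To see why alignment yields identical instructions, the sketch below encodes
a ``jmp [rel ...]`` stub for each function (opcode bytes ``FF 25`` followed by
a 32-bit displacement measured from the end of the 6-byte instruction). The
addresses and the helper name ``encode_stubs`` are made up for the
illustration. For simplicity the sketch also pads the last stub, whereas the
text above counts padding only between stubs.

```python
import struct

JMP_RIP_REL = b"\xff\x25"  # jmp qword [rip + disp32]


def encode_stubs(stub_base, pointer_base, count, stub_stride):
    """Encode `count` JMP stubs placed every `stub_stride` bytes.

    Stub i jumps through pointer i; pointers are 8 bytes apart.
    All addresses are hypothetical.
    """
    stubs = []
    for i in range(count):
        stub_addr = stub_base + i * stub_stride
        pointer_addr = pointer_base + i * 8
        # RIP-relative: displacement is counted from the next instruction.
        disp = pointer_addr - (stub_addr + 6)
        code = JMP_RIP_REL + struct.pack("<i", disp)
        stubs.append(code.ljust(stub_stride, b"\x00"))  # pad with null bytes
    return stubs


if __name__ == "__main__":
    packed = encode_stubs(0x400100, 0x600200, 4, stub_stride=6)
    aligned = encode_stubs(0x400100, 0x600200, 4, stub_stride=8)
    print("distinct packed stubs: ", len(set(packed)))
    print("distinct aligned stubs:", len(set(aligned)))
```

With a 6-byte stride the displacement grows by 2 from one stub to the next,
so every stub differs. With an 8-byte stride the stubs and the pointers
advance in lockstep, so the displacement is constant.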
Additional Trick 1: DT_DEBUG
----------------------------

Bold declares a global symbol named ``_dt_debug``, which points to the value of
the ``DT_DEBUG`` entry of the ``DYNAMIC`` table, for easy access. Just in case,
the ``DYNAMIC`` table can also be reached using the global ``_DYNAMIC`` symbol.

Additional Trick 2: Short DYNAMIC table
---------------------------------------

Executables generated by ``ld`` usually have a lot of entries in their
``DYNAMIC`` table. Bold includes only what is strictly necessary:

- One ``DT_NEEDED`` entry for each shared library to load (obviously).
- A ``DT_SYMTAB`` entry, with a null pointer. Without this one, the interpreter
- A ``DT_DEBUG`` entry, that will be used for symbol resolution.

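
As a rough illustration of how small such a table is, the sketch below packs
the entries listed above as 64-bit ELF dynamic entries (a signed 64-bit tag
followed by a 64-bit value). The tag values come from the ELF specification.
The terminating ``DT_NULL`` entry, the helper name ``build_dynamic`` and the
string-table offsets are assumptions made for the example, and the table Bold
actually emits may contain entries not shown here.

```python
import struct

# Tag values from the ELF specification.
DT_NULL, DT_NEEDED, DT_SYMTAB, DT_DEBUG = 0, 1, 6, 21


def build_dynamic(needed_name_offsets):
    """Pack a minimal DYNAMIC table (hypothetical helper).

    needed_name_offsets: string-table offsets of the library names,
    one DT_NEEDED entry each (example values, not real offsets).
    """
    entries = [(DT_NEEDED, off) for off in needed_name_offsets]
    entries.append((DT_SYMTAB, 0))  # null pointer, as described above
    entries.append((DT_DEBUG, 0))   # value is filled in at load time
    entries.append((DT_NULL, 0))    # assumed end-of-table marker
    return b"".join(struct.pack("<qQ", tag, val) for tag, val in entries)


if __name__ == "__main__":
    table = build_dynamic([1, 10])
    print(len(table), "bytes for", len(table) // 16, "entries")
```

Each entry takes 16 bytes, so the table size is a direct function of the
number of libraries linked.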
The ``examples/`` directory contains a port of the *flow2* intro
(http://www.pouet.net/prod.php?which=30589). Adding the dropper is left as an
exercise for the reader.