X-Git-Url: https://git.alrj.org/?p=bold.git;a=blobdiff_plain;f=doc%2FREADME.html;fp=doc%2FREADME.html;h=8e79cb0a23d0d71f4cdb0b2323d005742339a7e3;hp=0000000000000000000000000000000000000000;hb=0e847121f4fa9135e773c07213b0cbf3f73a2495;hpb=11e867f1fbccaa1d55e0406ba6c970492f22e56a diff --git a/doc/README.html b/doc/README.html new file mode 100644 index 0000000..8e79cb0 --- /dev/null +++ b/doc/README.html @@ -0,0 +1,530 @@ + + + +
Author: | Amand Tihon |
---|---|
Contact: | <amand.tihon@alrj.org> |
Version: | 0.1.0 |
Date: | Aug 8, 2009 |
Copyright: | GNU GPL version 3 + Exception, see copyright file. |
+Bold is an ELF linker that currently targets only x86_64 under Linux. Because its capabilities are limited, it should not be considered an all-purpose linker.
+Bold's main purpose is to generate very small executable programs.
+While ld from the GNU binutils can do almost anything anyone would ever need, some specific goals require an awful lot of tweaking or simply cannot be achieved with it. Bold uses several tricks to reduce the size of the final executable binary.
+You can download the tarball from http://www.alrj.org/projects/bold +or get the latest development version with the following git command:
++git clone http://git.alrj.org/git/bold.git ++
A gitweb interface is also available at http://git.alrj.org/
+Bold itself is written entirely in Python. There are no additional dependencies.
+The runtime library that contains the external symbol resolver is written in assembly (Intel syntax). An assembler such as Nasm or Yasm is needed to rebuild the source code into an object file.
+Go into Bold's directory, and run
++python setup.py build ++
Then, as root or using sudo, run
++python setup.py install ++
+bold [options] objfile...+
Bold combines a number of object files, relocates their data and resolves their symbol references in order to generate executable binaries.
+Bold has only one, very specific purpose: making small executables.
--version | Show program's version and exit. |
-h, --help | Show help message and exit. |
-e SYMBOL, --entry=SYMBOL | Use SYMBOL as the explicit symbol for beginning execution of your program. If --raw is specified, it defaults to _start. |
-l LIBNAME, --library=LIBNAME | Link against the shared library specified by LIBNAME. Bold relies on Python's ctypes module to find the libraries. This option may be used any number of times. |
-L DIRECTORY, --library-path=DIRECTORY | This option does nothing, and is present only for compatibility reasons. It MAY get implemented in the future, though. This option may be used any number of times. |
-o FILE, --output=FILE | Set the output file name (default value is a.out). |
--raw | Don't include the built-in external symbol resolution code. This is described in detail further in this document. |
-c, --ccall | Make external symbols directly callable from C, without having to declare pointers to functions. This option adds 6 bytes for each externally defined function. This is described in detail further in this document. |
-a, --align | Align the wrappers for external symbols on an 8-byte boundary, to take advantage of RIP-relative addressing. This is described in detail further in this document. |
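The library lookup behind -l can be pictured with Python's ctypes.util.find_library. This is only a sketch: the text above says that Bold relies on the ctypes module, but not which call it uses, and locate_library is a hypothetical helper name.

```python
from ctypes.util import find_library


def locate_library(libname):
    """Return the file name ctypes resolves for libname, or None.

    Assumption: this mirrors what `-l LIBNAME` does; Bold's actual
    lookup code may differ.
    """
    # find_library expects the bare name, without the "lib" prefix or any
    # suffix, e.g. "SDL" rather than "libSDL.so".
    return find_library(libname)


if __name__ == "__main__":
    for name in ("c", "SDL", "no_such_library_xyz"):
        print(name, "->", locate_library(name))
```

A result of None means the library could not be found on the current system, which is the case a linker would have to report as an error.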
The LD_PRELOAD environment variable may not work as expected, or at all.
+The main() function is called without any arguments. Its return value is still used as the exit code, though.
+The "import by hash" method is from parapete, leblane, las, as described on +http://www.pouet.net/topic.php?which=5392
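The general idea behind "import by hash" can be sketched as follows: instead of storing the name of each external symbol, the executable stores a small hash of it, and the resolver hashes the names exported by the loaded libraries until it finds a match. The hash function (toy_hash) and the in-memory export table below are illustrative assumptions; they are not the hash or the data structures Bold actually uses.

```python
def toy_hash(name):
    """32-bit rotate-and-xor hash of a symbol name (illustrative only)."""
    h = 0
    for byte in name.encode("ascii"):
        h = ((h << 5) | (h >> 27)) & 0xFFFFFFFF  # rotate left by 5 bits
        h ^= byte
    return h


def resolve_by_hash(wanted, exports):
    """Return the address of the first export whose name hashes to `wanted`.

    `exports` maps symbol names to addresses and stands in for the symbol
    tables of the loaded shared libraries.
    """
    for name, address in exports.items():
        if toy_hash(name) == wanted:
            return address
    return None


if __name__ == "__main__":
    # Made-up addresses standing in for a loaded library.
    exports = {"SDL_Init": 0x7F0000001000, "SDL_SetVideoMode": 0x7F0000002000}
    stored = toy_hash("SDL_Init")  # what the executable would carry
    print(hex(resolve_by_hash(stored, exports)))
```

Storing a 4-byte hash instead of a full symbol name is what makes the technique attractive for very small executables.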
+If you write your code in C and need to call external symbols, you basically have two options. The first one is to redefine them (or define new ones) as function pointers, so that calls go through the pointer. For instance,
++int SDL_Init(int); ++
would become:
++int (*SDL_Init)(int); ++
Repeat it for all functions, or write a tool to automate it (hint: look at +http://research.mercury-labs.org/ibh-i386-0.2.2.tar.gz for help).
+There is a second possibility, however, and it is the one Bold uses when you specify the --ccall option: make the resolved symbol point not to the address of the function, but to a JMP instruction that jumps to the actual address:
+global SDL_Init
+global SDL_SetVideoMode
+
+section .text
+
+SDL_Init:            jmp [rel _bold__SDL_Init]
+SDL_SetVideoMode:    jmp [rel _bold__SDL_SetVideoMode]
+
+section .bss
+
+_bold__SDL_Init           resq 1    ; Filled by the import by hash code
+_bold__SDL_SetVideoMode   resq 1
This approach costs 6 bytes (the size of the JMP instruction) for each external function used.
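The 6-byte figure follows from the encoding of a RIP-relative indirect jump: the opcode byte FF, the ModRM byte 25, and a 32-bit signed displacement measured from the end of the instruction. A small sketch in Python; encode_jmp_rip and the addresses are hypothetical and not taken from Bold's source.

```python
import struct


def encode_jmp_rip(jmp_address, pointer_address):
    """Encode `jmp [rel pointer]` placed at jmp_address (x86_64)."""
    # The displacement is relative to the address of the next instruction,
    # i.e. jmp_address + 6.
    displacement = pointer_address - (jmp_address + 6)
    return b"\xff\x25" + struct.pack("<i", displacement)


if __name__ == "__main__":
    code = encode_jmp_rip(0x400100, 0x600200)  # example addresses only
    print(len(code), code.hex())
```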
+The x86_64 architecture has this nice thing called "RIP-relative addressing". If all the JMP instructions are in the same order as the pointers to the functions they refer to, placing them 8 bytes apart, like the pointers, keeps the distance from each JMP to its pointer constant, so every instruction encodes to the same bytes. This is done with the --align option.
+Adding two null bytes between each JMP enlarges the final executable by 2 x (number of functions - 1) bytes, which may seem to go against our goal. However, the result is a repetition of the same eight bytes, something that can improve compression a lot!
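The effect of the alignment can be checked with a few lines of Python. This is a sketch under stated assumptions: the base addresses are made up, build_wrappers is a hypothetical helper, and the only facts relied upon are the 6-byte JMP encoding (FF 25 + 32-bit displacement) and the 8-byte pointer size.

```python
import struct

JMP_SIZE = 6      # FF 25 + 32-bit displacement
POINTER_SIZE = 8  # one 64-bit pointer per external function


def encode_jmp_rip(jmp_address, pointer_address):
    """Encode `jmp [rel pointer]` placed at jmp_address."""
    displacement = pointer_address - (jmp_address + JMP_SIZE)
    return b"\xff\x25" + struct.pack("<i", displacement)


def build_wrappers(text_base, bss_base, count, align):
    """Return the wrapper bytes for `count` functions, in pointer order."""
    stride = POINTER_SIZE if align else JMP_SIZE
    out = b""
    for i in range(count):
        out += encode_jmp_rip(text_base + i * stride, bss_base + i * POINTER_SIZE)
        if i != count - 1:
            # Aligned layout: two null bytes between consecutive JMPs.
            out += b"\x00" * (stride - JMP_SIZE)
    return out


if __name__ == "__main__":
    packed = build_wrappers(0x400100, 0x600200, 4, align=False)
    aligned = build_wrappers(0x400100, 0x600200, 4, align=True)
    print("size without/with alignment:", len(packed), len(aligned))
    distinct = {aligned[i:i + JMP_SIZE] for i in range(0, len(aligned), POINTER_SIZE)}
    print("distinct JMP encodings when aligned:", len(distinct))
```

Without alignment each JMP sits 6 bytes after the previous one while the pointers are 8 bytes apart, so each displacement grows by 2 and every instruction differs; with alignment the displacement is the same for all of them.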
+Bold declares a global symbol named _dt_debug that points to the value of the DT_DEBUG entry of the DYNAMIC table, for easy access. Just in case, the DYNAMIC table itself can also be reached through the global _DYNAMIC symbol.
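For reference, in a 64-bit ELF file each entry of the DYNAMIC table is a pair of 8-byte fields (a signed tag and a value), the DT_DEBUG tag is 21, and the table ends with a DT_NULL entry (tag 0). The sketch below locates the value field of a given tag in a raw copy of such a table; find_dynamic_entry is a hypothetical helper for illustration, not Bold's code.

```python
import struct

DT_NULL = 0    # marks the end of the DYNAMIC table
DT_NEEDED = 1  # name of a required shared library
DT_DEBUG = 21  # value is filled in by the dynamic loader at run time


def find_dynamic_entry(dynamic_bytes, wanted_tag):
    """Return (offset of the value field, value) for wanted_tag, or None.

    dynamic_bytes must be a whole number of 16-byte little-endian
    Elf64_Dyn entries.
    """
    entries = struct.iter_unpack("<qQ", dynamic_bytes)
    for index, (tag, value) in enumerate(entries):
        if tag == wanted_tag:
            return index * 16 + 8, value
        if tag == DT_NULL:
            break
    return None


if __name__ == "__main__":
    # A made-up three-entry table: DT_NEEDED, DT_DEBUG, DT_NULL.
    table = (struct.pack("<qQ", DT_NEEDED, 0x10)
             + struct.pack("<qQ", DT_DEBUG, 0)
             + struct.pack("<qQ", DT_NULL, 0))
    print(find_dynamic_entry(table, DT_DEBUG))
```

In this picture, _dt_debug would correspond to the address of the value field found at the returned offset.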
+Executables generated by ld usually have a lot of entries in their DYNAMIC table. Bold includes only what is strictly necessary:
+And that's it!
+The examples/ directory contains a port of the flow2 intro +(http://www.pouet.net/prod.php?which=30589). Adding the dropper is left as an +exercise for the reader.
+