To understand how a program makes use of shared objects, let's first see the format of an executable and then examine the steps that occur when the program starts.
In the following diagram, we show two views of an ELF file: the linking view and the execution view. The linking view, which is used when the program or library is linked, deals with sections within an object file. Sections contain the bulk of the object file information: data, instructions, relocation information, symbols, debugging information, and so on. The execution view, which is used when the program runs, deals with segments.
At linktime, the program or library is built by merging together sections with similar attributes into segments. Typically, all the executable and read-only data sections are combined into a single text segment, while the data and BSS s are combined into the data segment. These segments are called load segments, because they need to be loaded in memory at process creation. Other sections such as symbol information and debugging sections are merged into other, nonload segments.
BlackBerry 10 OS, however, doesn't rely at all on the COFF technique of loading sections. When developing our ELF implementation, we worked directly from the ELF spec and kept efficiency paramount. The ELF loader uses the execution view of the program. By using the execution view, the task of the loader is greatly simplified: all it has to do is copy to memory the load segments (usually two) of the program or library. As a result, process creation and library loading operations are much faster.
The diagram below shows the memory layout of a typical process. The process load segments (corresponding to text and data in the diagram) are loaded at the process's base address. The main stack is located just below and grows downwards. Any additional threads that are created have their own stacks, located below the main stack. Each of the stacks is separated by a guard page to detect stack overflows. The heap is located above the process and grows upwards.
In the middle of the process's address space, a large region is reserved for shared objects. Shared libraries are located at the top of the address space and grow downwards.
When a new process is created, the process manager first maps the two segments from the executable into memory. It then decodes the program's ELF header. If the program header indicates that the executable was linked against a shared library, the process manager extracts the name of the dynamic interpreter from the program header. The dynamic interpreter points to a shared library that contains the runtime linker code. The process manager loads this shared library in memory and then passes control to the runtime linker code in this library.
The runtime linker is invoked when a program that was linked against a shared object is started or when a program requests that a shared object be dynamically loaded. The runtime linker is contained within the C runtime library. The runtime linker performs several tasks when loading a shared library (.so file):
This dynamic section provides information to the linker about other libraries that this library was linked against. It also gives information about the relocations that need to be applied and the external symbols that need to be resolved. The runtime linker first loads any other required shared libraries (which may themselves reference other shared libraries). It then processes the relocations for each library. Some of these relocations are local to the library, while others require the runtime linker to resolve a global symbol. In the latter case, the runtime linker searches through the list of libraries for this symbol. In ELF files, hash tables are used for the symbol lookup, so they're very fast. The order in which libraries are searched for symbols is very important.
When all relocations have been applied, any initialization functions that have been registered in the shared library's init section are called. This is used in some implementations of C++ to call global constructors.
A process can load a shared library at runtime by using the dlopen() call, which instructs the runtime linker to load this library. When the library is loaded, the program can call any function within that library by using the dlsym() call to determine its address.
The program can also determine the symbol associated with a given address by using the dladdr() call. Finally, when the process no longer needs the shared library, it can call dlclose() to unload the library from memory.
When the runtime linker loads a shared library, the symbols within that library have to be resolved. The order and the scope of the symbol resolution are important. If a shared library calls a function that happens to exist by the same name in several libraries that the program has loaded, the order in which these libraries are searched for this symbol is critical. This is why the OS defines several options that can be used when loading libraries. All the objects (executables and libraries) that have global scope are stored on an internal list (the global list). Any global-scope object, by default, makes available all of its symbols to any shared library that gets loaded. The global list initially contains the executable and any libraries that are loaded at the program's startup.
By default, when a new shared library is loaded by using the dlopen() call, symbols within that library are resolved by searching in this order through:
The runtime linker's scoping behavior can be changed in two ways when dlopen()'ing a shared library: