C++ to ASM linkage in GCC: Difference between revisions
Prefer writing assembly instead of ASM and assembly files are not "scripts"
(Replace dead link with the local calling conventions article) |
Line 1:
A small note before we begin: GCC (the GNU Compiler Collection) is a very versatile compiler that has been around for a very long time, and it is pretty much the standard for OS development; it is even used (as you probably already know) to compile the Linux kernel. In fact, Linux is meant to be compiled with GCC. It has lots of useful extensions (such as <tt>__attribute__((...))</tt>) which ease development considerably, should you take the time to read about them.
Also, this article in itself is not really sufficient for a full understanding of linking to C++ methods within C or
We will assume the use of GCC for your high-level language (HLL) development, and the use of C++ (C wouldn't be that different) for your little HLL-to-assembly linkage escapade.
== C++ Name Mangling ==
Line 23 ⟶ 22:
The compiler uses the generated symbol name to encode information about the symbol. Read from left to right, it says: 'this is a mangled symbol' (<tt>_Z</tt>); 'the user-defined name is 8 characters long' (<tt>8</tt>); and those characters are <tt>getObjId</tt>. The <tt>E</tt> marks the end of a qualified (namespace- or class-nested) name, and after it GCC encodes the argument types: single letters for built-in types (<tt>v</tt> = void, <tt>i</tt> = int, <tt>j</tt> = unsigned int, ...) and length-prefixed names for namespaces and objects.
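As a rough illustration of the encoding described above, the sketch below asks the GCC runtime to decode a few mangled names. It assumes a GCC- or Clang-style toolchain that provides <tt>abi::__cxa_demangle</tt> in <tt>&lt;cxxabi.h&gt;</tt>. The helper <tt>demangle</tt>, the namespace <tt>Kernel</tt>, and the sample names are hypothetical and chosen only for this example; they are not taken from the article's own listing.

```cpp
#include <cassert>
#include <cstdlib>   // std::free
#include <string>
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)

// Hypothetical helper: returns the readable form of a mangled name,
// or an empty string if the runtime cannot decode it.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return std::string();
    }
    std::string result(text);
    std::free(text);  // the runtime allocated the buffer with malloc
    return result;
}

// Usage: the letters after the name encode the parameter list.
void demangleDemo() {
    assert(demangle("_Z8getObjIdv") == "getObjId()");
    assert(demangle("_Z8getObjIdi") == "getObjId(int)");
    assert(demangle("_Z8getObjIdj") == "getObjId(unsigned int)");
    // A nested name is wrapped in N ... E before the parameter list.
    assert(demangle("_ZN6Kernel8getObjIdEv") == "Kernel::getObjId()");
}
```

Running <tt>nm</tt> on an object file and piping the output through <tt>c++filt</tt> gives the same kind of translation without writing any code.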
After seeing this, you can tell that, in order to call a C++ function from
Note well that different compilers do ''NOT'' necessarily use the same mangling scheme. The C++ standard does not specify one, so each compiler (or platform ABI) is free to define its own mangling scheme as it sees fit.
Line 38 ⟶ 37:
In C and C++, you make a symbol global by defining it ''outside'' of any function. In C++, you may still hide the symbol by placing it inside a 'private' section of a global symbol. This is irrelevant to this article, though: anyone reading it should be attempting to develop an OS, so we assume you already understand your language.
In
<source lang="ASM">
Line 63 ⟶ 62:
</source>
To clarify, technically, the linker can 'see' all symbols. It just chooses to ignore linkage between files for symbols not explicitly declared global. If you were to write a second
<source lang="ASM">
Line 84 ⟶ 83:
So now we get back to the main question presented at the top of this section of the article: what is external linkage? External linkage is linking to a global symbol which is not defined in the file you're working in. The linker will therefore place the referred variable's address where it is referenced in the referencing
Line 99 ⟶ 98:
In other words, when you tell the compiler that you are linking to an external variable with "C" style linkage, you are telling the compiler: "I am linking to a symbol of name XYZ. This is EXACTLY how the symbol looks, and there is no name mangling." That is all 'extern "C"' means. It explicitly tells the compiler that there is nothing special about the symbol's name, and that it is to be taken exactly as you type it.
With this in mind, we now understand why, in C, there is no need to specify a linkage style: C only understands symbols as you type them. C does not have any kind of name mangling, and expects plain, absolute symbol names. So to link to
So <tt>extern "C" getObjId</tt> tells the compiler to insert references to a symbol of the exact name 'getObjId' within your output object file. The linker will see your references and look for a global symbol of that exact name. If it is found, and there are no duplicates, the linker simply places the address of that symbol wherever the compiler placed a reference to it.
Line 114 ⟶ 113:
So linking to 'extern "C"' symbols is the same as telling the compiler to just trust you, and place references to XYZ symbol name as is, even though it may never see the definition of that symbol within the ''current file''. The symbol is defined elsewhere. It may even be defined in a shared library, and be expected to be linked in by the OS at runtime (Dynamic Linking), although this usually requires a little more, and is usually handled by the compiler and linker as set up by the host OS (this is where OS libraries come in).
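A minimal sketch of a "C" linkage declaration follows. The return type and the stand-in definition are assumptions made so that the example builds on its own; in a real kernel the definition would typically live in a separate object file, and the compiler would only ever see the declaration.

```cpp
#include <cassert>

// Declaration: "refer to the symbol named exactly 'getObjId', unmangled".
// The int return type is an assumption made for this sketch.
extern "C" int getObjId(void);

// Stand-in definition so the sketch links by itself. Because it is also
// declared extern "C", the object file exports the plain name 'getObjId'
// rather than a mangled one.
extern "C" int getObjId(void) {
    return 42;  // arbitrary placeholder value
}

// Usage: the call site only needs the declaration above.
void externCDemo() {
    assert(getObjId() == 42);
}
```

Listing the resulting object file with <tt>nm</tt> should show the symbol as plain <tt>getObjId</tt>.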
''' "C++" Linkage '''
Line 142 ⟶ 140:
Then GCC is told to link to that symbol. It places a reference to a symbol name, all mangled in ''its'' own way. The linker is called on both object files. The second one is referencing a symbol which the linker sees nowhere. You are told the symbol referenced by the second object file does not exist.
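To see why a "C++" linkage reference has to match the definition's mangled name exactly, consider the small sketch below. The two overloads are hypothetical, and the symbol names in the comments are what GCC's usual (Itanium ABI) mangling would produce for them; another compiler may well generate different names.

```cpp
#include <cassert>

// With "C++" linkage, each overload becomes its own symbol, so both
// definitions can coexist in one program.
int getObjId() { return 0; }                 // GCC emits _Z8getObjIdv
int getObjId(int base) { return base + 1; }  // GCC emits _Z8getObjIdi

// Usage: the compiler picks the symbol to reference from the arguments.
void overloadDemo() {
    assert(getObjId() == 0);    // references _Z8getObjIdv
    assert(getObjId(41) == 42); // references _Z8getObjIdi
}
```

A reference mangled under any other scheme, or for a different parameter list, names a symbol that neither definition provides, which is exactly the 'symbol does not exist' situation described above.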
== C++ -
I'm sorry I took this long to get to this part of the article, but the facts given above are pertinent.
To link to a C/C++ symbol from an
To link to an
The <tt>this</tt> pointer issue is another thing altogether, and is actually a very serious consideration you should take into account when designing your kernel, or when choosing whether to use C++ at all. C makes library generation and linking easier. C++ makes design and restructuring (you will restructure your design many times, so this is a big plus) much easier.
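One common way to sidestep the <tt>this</tt> pointer question is to give non-C++ callers a free function with "C" linkage that takes the object explicitly. The sketch below is only an illustration of that idea: the class <tt>Object</tt>, its member, and the wrapper name <tt>Object_getObjId</tt> are all hypothetical.

```cpp
#include <cassert>

// Hypothetical kernel object; the name and layout are illustrative only.
class Object {
public:
    explicit Object(int id) : id_(id) {}
    int getObjId() const { return id_; }  // needs a 'this' pointer to run
private:
    int id_;
};

// Unmangled entry point: the caller passes the object as an ordinary
// argument, and the compiler hands it to the member function as 'this'.
extern "C" int Object_getObjId(const Object* self) {
    return self->getObjId();
}

// Usage: from the caller's side this is a plain function call.
void thisPointerDemo() {
    Object obj(7);
    assert(Object_getObjId(&obj) == 7);
}
```

The caller then only needs to know the plain symbol name and how a pointer argument is passed under the calling convention in use, not how the compiler passes <tt>this</tt> to member functions.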