Name Mangling and Function Overloading


C++ provides a feature called function overloading. It lets you write several functions with the same name, provided that their parameter lists differ. Lower-level languages (C or assembly) and tools (the linker) do not understand function overloading. For them, each function name must be unique, so that functions can be told apart by name alone.
C++ distinguishes functions by both name and parameters. The compiler generates a new name for each function from the original C++ function name and its parameter types. A given function name and parameter list always produce the same generated name, and different parameter lists produce different names: if the parameters change (in number, type, or order), the compiler generates another name even though the original C++ function name is the same. This process of encoding the function name is known as name mangling.
Name mangling is compiler dependent. Compilers can use different strategies, so the name produced by one compiler may not match the name produced by another.
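
As a small illustration of overloading itself, the sketch below declares four functions that share the name myFun and differ only in their parameter lists. The std::string return type is an assumption made here so that each call reports which overload was chosen; it is not part of the original examples.

```cpp
#include <string>

// Four overloads of one name. The compiler selects one per call
// from the number, types, and order of the arguments.
std::string myFun()          { return "myFun()"; }
std::string myFun(int)       { return "myFun(int)"; }
std::string myFun(int, char) { return "myFun(int, char)"; }
std::string myFun(char, int) { return "myFun(char, int)"; }
```

Calling myFun(1, 'a') and myFun('a', 1) selects two different functions, which is why the linker needs two different symbol names for them.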

Here are a few examples of mangled names produced by the g++ compiler:
Original name:   void myFun();      Mangled name:  _Z5myFunv
Original name:  void myFun(int a, char c);      Mangled name:  _Z5myFunic

Looking at the mangled names, we can observe a pattern. It looks like:
_Z numberOfCharsInOriginalFunctionName OriginalFunctionName parameter_types_encoded

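Parameter encoding scheme:
for primitive types:
char: c
int: i
long: l
void: v (used when the function takes no parameters)
for pointers (and array parameters, which are treated as pointers):
P is placed before the encoding of the pointed-to type, e.g. char* becomes Pc
for user-defined type names:
the length of the name is placed before the name, e.g. the class name myClass is encoded as 7myClass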

Let's take a few more examples:
void myFun(int a, char *p, int arr[]) ==> _Z5myFuniPcPi
void myFun(int a, MyClass c) ==> _Z5myFuni7MyClass
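
The pattern can be written down as a toy encoder. This is only a sketch of the subset described above, not how g++ is implemented: the function and helper names (mangleFreeFunction, pointerTo, userType) are hypothetical, and the real scheme has many more rules (qualifiers, references, templates, compression of repeated types) that are ignored here.

```cpp
#include <string>
#include <vector>

// Encoding of a pointer (or array parameter): 'P' before the pointee's code.
std::string pointerTo(const std::string& code) { return "P" + code; }

// Encoding of a user-defined type: its length, then its name.
std::string userType(const std::string& name) {
    return std::to_string(name.size()) + name;
}

// Toy encoder for a free (non-member) function:
// "_Z" + length of name + name + encoded parameter types.
std::string mangleFreeFunction(const std::string& name,
                               const std::vector<std::string>& paramCodes) {
    std::string out = "_Z" + std::to_string(name.size()) + name;
    if (paramCodes.empty()) {
        return out + "v";  // an empty parameter list is encoded as void
    }
    for (const std::string& code : paramCodes) {
        out += code;
    }
    return out;
}
```

For example, mangleFreeFunction("myFun", {"i", pointerTo("c"), pointerTo("i")}) reproduces the name shown for void myFun(int a, char *p, int arr[]).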

All the examples so far are functions that are not part of any class. If a function is a member of a class, the mangling is a little different. For example:

class MyClass
{
   void MyFunction(int a) {}
};
Here the mangled name of MyClass::MyFunction becomes _ZN7MyClass10MyFunctionEi.
Name mangling with a namespace:
namespace NS
{
    class myClass
    {
        public: void MyFunction(int a) {}
    };
}
Here the mangled name of the function becomes _ZN2NS7myClass10MyFunctionEi.
Now find out the pattern yourself.
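
For readers who want to check their answer, here is one way to express the pattern of the two examples above as a toy encoder. It is a sketch that only covers these simple cases; the function name mangleNestedFunction and its parameters are hypothetical, and the parameter types are passed in already encoded (for example "i" for int).

```cpp
#include <string>
#include <vector>

// Toy encoder for a function nested in namespaces and/or classes:
// "_ZN" + each enclosing scope (length + name) + function (length + name)
// + "E" + encoded parameter types.
std::string mangleNestedFunction(const std::vector<std::string>& scopes,
                                 const std::string& name,
                                 const std::string& paramCodes) {
    std::string out = "_ZN";
    for (const std::string& scope : scopes) {
        out += std::to_string(scope.size()) + scope;
    }
    out += std::to_string(name.size()) + name;
    out += "E";  // closes the nested name
    out += paramCodes.empty() ? "v" : paramCodes;
    return out;
}
```

Calling mangleNestedFunction({"NS", "myClass"}, "MyFunction", "i") yields the name shown for the namespace example.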
There is a GNU tool called c++filt that takes a mangled name and demangles it. Readers are encouraged to play with this tool and try to demangle all the names in the examples above.
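
Demangling can also be done from inside a program. The sketch below assumes a GCC or Clang toolchain, whose runtime provides abi::__cxa_demangle in the <cxxabi.h> header; other compilers may not offer it. The wrapper name demangle is hypothetical.

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of a mangled name, or the input unchanged
// if the runtime reports that it could not demangle it.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, the result is allocated with malloc.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);  // the caller owns the returned buffer
    return result;
}
```

With this wrapper, demangle("_Z5myFunic") gives back a readable signature for the second example in this article, much like passing the name to c++filt.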
The exact pattern of name mangling is not important to programmers. The important thing to understand is that a unique name is generated for each overload, so that it can be resolved (by the assembler and linker) without any ambiguity.
 

Interfacing with C

C++ mangles function names but C does not. What happens if you write mixed-mode code, where some source files are compiled by a C++ compiler, others by a C compiler, and the resulting object files are linked together?
Let's take an example: you are writing a library in a file called lib.cpp and consuming it in another file called use.cpp. The contents of these files are:

lib.cpp
int libFunction(int x)
{
    return x + 1;
}
use.cpp
#include <iostream>

int libFunction(int x);
int main()
{
    std::cout << libFunction(2);
}

Compile these two files separately with a C++ compiler and link them. This works fine. use.cpp has an unresolved reference to the function "int libFunction(int x)", lib.cpp provides it, and the linker resolves the reference. Because both files are compiled by the C++ compiler, the name is mangled: with g++, use.cpp has an unresolved reference to _Z11libFunctioni and lib.cpp provides the definition of _Z11libFunctioni.
Now think about what happens if lib.cpp is compiled by a C compiler and use.cpp by a C++ compiler. lib.cpp exports a function named libFunction, but use.cpp has an unresolved reference to _Z11libFunctioni. The reference cannot be satisfied, and the link fails with an error.
There is a way to tell the C++ compiler not to mangle the name of a function that is implemented elsewhere, so that it imports the function by its unmangled name. This is done with the extern "C" linkage specification. With it, use.cpp becomes:

use.cpp
#include <iostream>

extern "C" int libFunction(int x);
int main()
{
    std::cout << libFunction(2);
}

In this code the imported function name is libFunction rather than _Z11libFunctioni, so linking works when lib.cpp is compiled with a C compiler.
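
The same specification can be placed on a definition, which is useful in the opposite direction: a function written in a C++ file that C code should be able to call. The sketch below is an illustration under that assumption; libHelper is a hypothetical name added only to contrast the two kinds of linkage.

```cpp
// Defined in a C++ source file, but exported under the plain name
// libFunction because of the extern "C" linkage specification.
extern "C" int libFunction(int x) { return x + 1; }

// Ordinary C++ functions keep mangled names, so they can be overloaded.
int libHelper(int x)    { return libFunction(x); }
int libHelper(double x) { return libFunction(static_cast<int>(x)); }
```

Note the trade-off: because an extern "C" function has no mangled name, a second extern "C" function called libFunction with different parameters cannot be declared.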


Writing a header file which can be used by both C and C++ compilers
A typical way of writing C or C++ code is to declare functions in a header file and include that header in both the file that implements the functions and the file that uses them. If one of these files is C and the other is C++, a header containing extern "C" causes a compilation error in the C file, because the extern "C" linkage specification is not valid C (extern alone is a C keyword, but it cannot be followed by a string literal). A simple solution is conditional compilation. The header file that contains the libFunction declaration will look like this:

#ifdef __cplusplus
extern "C" int libFunction(int x);
#else
int libFunction(int x);
#endif

The __cplusplus macro is defined only when compiling as C++. So the declaration with extern "C" is seen only by the C++ compiler, and the plain declaration is seen only by the C compiler.
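
When a header declares many functions, a common variation wraps all of them in one extern "C" block instead of duplicating each declaration. The sketch below shows that idiom in a single listing; the file names in the comments and the include guard name LIB_H are assumptions for illustration, and in a real project the header and the definition would live in separate files.

```cpp
/* ---- lib.h (hypothetical file name) ---- */
#ifndef LIB_H
#define LIB_H

#ifdef __cplusplus
extern "C" {   /* seen only by a C++ compiler */
#endif

int libFunction(int x);   /* every declaration here gets C linkage */

#ifdef __cplusplus
}              /* closes the extern "C" block */
#endif

#endif /* LIB_H */

/* ---- lib.c or lib.cpp (hypothetical): the definition ---- */
/* In C++, this definition keeps the C linkage of the declaration above. */
int libFunction(int x) { return x + 1; }
```

The comments use the /* ... */ form so that the header part stays acceptable to older C compilers as well.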

