ATL Under the Hood Part 4

Zeeshan
June 7, 2002

Environment: ATL



(continued)






Until now, we haven't discussed assembly language at all. But we can't avoid it any longer if we really want to know what is going on under the hood of ATL, because ATL uses some low-level techniques as well as some inline assembly language to make it as small and as fast as possible. I assume that readers already have a basic knowledge of assembly language, so I will concentrate on my topic and not try to write another assembly language tutorial. If you don't know enough assembly language, I recommend Matt Pietrek's article "Under the Hood" in the February 1998 issue of Microsoft Systems Journal. It gives you enough information about assembly language to get started.

To start our tour, take a look at this simple program.

Program 55

void fun(int, int) {
}

int main() {
  fun(5, 10);
  return 0;
}

Now, compile it with the command-line compiler cl.exe, using the -FAs switch. For example, if the program's source file is prog55.cpp, compile it this way.

cl -FAs prog55.cpp

This generates a file with the same base name but an .asm extension, which contains the assembly language code for the program. Now, take a look at the generated output file. Let's discuss the calling of the function first. The assembly code to call this function is something like this.

  push  10                    ; 0000000aH
  push   5
  call   ?fun@@YAXHH@Z        ; fun

The parameters of the function are pushed onto the stack from right to left, and then the function is called. But the name of the function is a little different from the name we gave it. This is because the C++ compiler decorates the function name to support function overloading. Let's change the program a little and overload the function to see how the code behaves.

Program 56

void fun(int, int) {
}

void fun(int, int, int) {
}

int main() {
  fun(5, 10);
  fun(5, 10, 15);
  return 0;
}

Now, the assembly language for calling both of the functions looks something like this:

  push    10                    ; 0000000aH
  push     5
  call     ?fun@@YAXHH@Z                ; fun

  push    15                    ; 0000000fH
  push    10                    ; 0000000aH
  push     5
  call     ?fun@@YAXHHH@Z              ; fun

Take a look at the names of the functions. We wrote both functions with the same name, but the compiler decorated each one differently according to its parameter list, and that is how it tells the overloads apart.
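To make the pattern in the two decorated names above easier to see, here is a minimal sketch that rebuilds them. It is only a toy: decorate_void_int_fn is a hypothetical helper, and the meaning attached to each piece of the name ("YA", "X", "H") is my reading of the listings, limited to global functions that return void and take one or more int parameters.

```cpp
#include <cstddef>
#include <string>

// Toy model of the decorated names seen in the listings. It handles one narrow
// case only: a global function returning void, using the default C calling
// convention, with one or more int parameters.
// decorate_void_int_fn is a hypothetical helper, not a compiler API.
std::string decorate_void_int_fn(const std::string& name,
                                 std::size_t int_param_count) {
    std::string decorated = "?" + name + "@@";  // '?' prefix, name, end of scope
    decorated += "YA";                          // assumed: global function, C convention
    decorated += "X";                           // assumed: void return type
    decorated.append(int_param_count, 'H');     // assumed: one 'H' per int parameter
    decorated += "@Z";                          // end of the parameter list
    return decorated;
}
```

Calling decorate_void_int_fn("fun", 2) and decorate_void_int_fn("fun", 3) yields the two names that appear in the listing for Program 56; the only difference between them is the extra 'H' for the third int.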

If you don't want the compiler to decorate the function name, you can declare the function with extern "C". Let's make a small change to the program.

Program 57

extern "C" void fun(int, int) {
}

int main() {
  fun(5, 10);
  return 0;
}

The assembly language code of this function is:

  push     10                   ; 0000000aH
  push      5
  call     _fun

The name is no longer decorated with the parameter types, which means that you can't overload a function that has C linkage. Take a look at the following program.

Program 58

extern "C" void fun(int, int) {
}

extern "C" void fun(int, int, int) {
}

int main() {
  fun(5, 10);
  return 0;
}

This program gives a compilation error. Function overloading is not supported in the C language, and here you are defining two functions with the same name while telling the compiler to use C linkage rather than C++ linkage, so it cannot decorate the names to tell them apart.

Now, take a look at the code the compiler generates for our do-nothing function.

  push  ebp
  mov   ebp, esp
  pop   ebp
  ret   0

Before I go into further detail, take a look at the last instruction of the function: ret 0. Why is it 0? Can it be something other than 0? As we have seen, all the parameters that we pass to the function are in fact pushed onto the stack. What happens to the stack pointer (the ESP register) when you or the compiler pushes something onto the stack? Take a look at the following simple program to see this behavior. I use printf rather than cout to avoid the overhead of cout.

Program 59

#include <cstdio>

int g_iTemp;

int main() {

  // Copy ESP into a global variable so that printf can display it.
  _asm mov g_iTemp, esp
  printf("Before push %d\n", g_iTemp);

  _asm push eax
  _asm mov g_iTemp, esp
  printf("After push %d\n", g_iTemp);
  _asm pop eax

  return 0;
}

The output of this program is:

Before push 1244980
After push 1244976

This program displays the value of the ESP register before and after pushing a value onto the stack. The value drops by 4, which clearly shows that when you push something onto the stack, the stack grows downward in memory.
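The same arithmetic can be reproduced without inline assembly. The following is a minimal sketch, assuming a 32-bit stack where every push stores one 4-byte value; ToyStack and its members are hypothetical names used only for illustration, not anything from ATL or the compiler.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of a 32-bit stack: esp holds an address, and every push
// moves it 4 bytes toward lower addresses before storing the value.
struct ToyStack {
    std::uint32_t esp;                 // simulated stack pointer
    std::vector<std::uint32_t> slots;  // pushed values, most recent last

    explicit ToyStack(std::uint32_t initial_esp) : esp(initial_esp) {}

    void push(std::uint32_t value) {
        esp -= 4;                      // the stack grows downward
        slots.push_back(value);
    }

    // Precondition: at least one value has been pushed.
    std::uint32_t pop() {
        std::uint32_t value = slots.back();
        slots.pop_back();
        esp += 4;                      // popping releases the slot again
        return value;
    }
};
```

Starting a ToyStack at 1244980 and pushing one value leaves esp at 1244976, matching the output of Program 59; popping the value brings esp back to where it started.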

Now, a question arises: who restores the stack pointer after parameters have been pushed for a function call, the function itself or its caller? In fact, both are possible, and this is the difference between the standard calling convention and the C calling convention. Take a look at the instruction that immediately follows the function call.

  push  10                      ; 0000000aH
  push   5
  call  _fun
  add   esp, 8

Here, two parameters are passed to the function, so ESP has dropped by 8 bytes once the two values have been pushed onto the stack. In this program it is the responsibility of the function's caller to restore the stack pointer, which it does with add esp, 8. This is called the C calling convention. With this calling convention you can pass a variable number of arguments, because the caller knows how many parameters it passed to the function and can restore the stack pointer itself.

However, if the standard calling convention is selected, it is the responsibility of the callee to clean up the stack. In this case a variable number of arguments can't be passed to the function, because there is no way to tell the function how many parameters were pushed, so it can't adjust the stack pointer appropriately.
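A variable-argument function shows why this matters. The sketch below uses the standard <cstdarg> facilities; sum_ints is a hypothetical name chosen for this illustration. Notice that the function has no way to discover how many values were passed and has to be told through its first parameter, which is exactly why only the caller can know how much stack to release.

```cpp
#include <cstdarg>

// Sums `count` int arguments. The function cannot find out by itself how many
// values were passed; it relies on the caller to say so in `count`.
// sum_ints is a hypothetical name used only for this illustration.
int sum_ints(int count, ...) {
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);  // read the next int argument
    }
    va_end(args);
    return total;
}
```

For example, sum_ints(2, 5, 10) and sum_ints(3, 5, 10, 15) call the same function with different amounts of data on the stack, so the cleanup has to differ from one call site to the next.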

Take a look at the following program to see the behavior of the standard calling convention.

Program 60

extern "C" void _stdcall fun(int, int) {
}

int main() {

  fun(5, 10);
  return 0;
}

Now take a look at the calling of the function.

  push  10                   ; 0000000aH
  push   5
  call  _fun@8

Here, the @ after the function name shows that this function uses the standard calling convention, and 8 is the number of bytes pushed onto the stack for its parameters. So, when every parameter is 4 bytes wide, as here, the number of arguments can be calculated by dividing this number by 4.
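That division can be written down as a small helper. This is a minimal sketch, assuming every argument occupies exactly 4 bytes (which is not true for types such as double); stdcall_arg_count is a hypothetical function, not part of any toolchain.

```cpp
#include <cstddef>
#include <string>

// Given a decorated name such as "_fun@8", returns the number of 4-byte
// arguments it implies, or -1 if the name has no "@<bytes>" suffix.
// stdcall_arg_count is a hypothetical helper written for this illustration.
int stdcall_arg_count(const std::string& decorated) {
    std::size_t at = decorated.rfind('@');
    if (at == std::string::npos || at + 1 == decorated.size()) {
        return -1;                  // no '@', or nothing after it
    }
    int bytes = 0;
    for (std::size_t i = at + 1; i < decorated.size(); ++i) {
        char c = decorated[i];
        if (c < '0' || c > '9') {
            return -1;              // suffix is not a plain byte count
        }
        bytes = bytes * 10 + (c - '0');
    }
    return bytes / 4;               // assumes every argument occupies 4 bytes
}
```

For the name in the listing above, stdcall_arg_count("_fun@8") gives 2, while the undecorated C name "_fun" gives -1 because it carries no byte count.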

Here is the code of our do-nothing function.

  push  ebp
  mov   ebp, esp
  pop   ebp
  ret   8

This function restores the stack pointer itself: the "ret 8" instruction removes the 8 bytes of parameters from the stack as it returns. That also answers the earlier question about "ret 0": with the C calling convention the function removes nothing, because the caller does the cleanup.

Now, explore the rest of the code that the compiler generates for us. The compiler inserts this code to build a stack frame so that it can access parameters and local variables in a standard way. A stack frame is a memory area reserved for a function to hold its parameters, local variables, and return address. A stack frame is created whenever a function is called and destroyed when the function returns. On the x86 architecture, the EBP register holds the address of the current stack frame; it is often called the frame pointer or base pointer, whereas ESP is the stack pointer.

So, the compiler first saves the address of the previous stack frame and then creates a new stack frame by using the value of ESP. And, before the function returns, the address of the old stack frame is restored.
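The two cleanup schemes can be compared side by side in a minimal sketch that tracks nothing but the stack pointer. ToyCpu, call_cdecl, and call_stdcall are hypothetical names, and the model assumes two 4-byte arguments as in the listings above.

```cpp
#include <cstdint>

// Hypothetical model that tracks only the stack pointer.
struct ToyCpu {
    std::uint32_t esp;

    void push() { esp -= 4; }              // push one 4-byte value
    void call() { esp -= 4; }              // call pushes the return address
    void ret(std::uint32_t extra_bytes) {  // ret N pops the return address,
        esp += 4 + extra_bytes;            // then discards N more bytes
    }
};

// C convention: the callee executes "ret 0" and the caller adds 8 to esp.
std::uint32_t call_cdecl(ToyCpu cpu) {
    cpu.push();      // push 10
    cpu.push();      // push 5
    cpu.call();
    cpu.ret(0);      // callee leaves the arguments on the stack
    cpu.esp += 8;    // add esp, 8 in the caller
    return cpu.esp;
}

// Standard convention: the callee executes "ret 8"; the caller does nothing.
std::uint32_t call_stdcall(ToyCpu cpu) {
    cpu.push();      // push 10
    cpu.push();      // push 5
    cpu.call();
    cpu.ret(8);      // callee removes its own arguments
    return cpu.esp;
}
```

Both functions return the stack pointer they were given, so the stack is balanced either way; the only difference is which side of the call performs the 8-byte adjustment.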

Now, take a look at what is in the stack frame. The stack frame has all the parameters on the positive side of EBP and all the local variables on the negative side of EBP.

The address of the previous stack frame (the saved EBP) is stored at EBP, and the return address of the function is stored at EBP + 4, so the first parameter is found at EBP + 8. Now, take a look at an example that has two parameters and three local variables.

Program 61

extern "C" void fun(int a, int b) {
  int x = a;
  int y = b;
  int z = x + y;
  return;
}

int main() {
  fun(5, 10);
  return 0;
}

And now, take a look at the compiler-generated code of the function.

  push  ebp
  mov   ebp, esp
  sub   esp, 12                  ; 0000000cH

  ; int x = a;
  mov  eax, DWORD PTR _a$[ebp]
  mov  DWORD PTR _x$[ebp], eax

  ; int y = b;
  mov  ecx, DWORD PTR _b$[ebp]
  mov  DWORD PTR _y$[ebp], ecx

  ; int z = x + y;
  mov  edx, DWORD PTR _x$[ebp]
  add  edx, DWORD PTR _y$[ebp]
  mov  DWORD PTR _z$[ebp], edx

  mov  esp, ebp
  pop  ebp
  ret  0

Now, what are _x$, _y$, and so forth? They are constants defined just above the function definition, something like this:

_a$ = 8
_b$ = 12
_x$ = -4
_y$ = -8
_z$ = -12

This means you can read this code something like this:

  ; int x = a;
  mov  eax, DWORD PTR [ebp + 8]
  mov  DWORD PTR [ebp - 4], eax

  ; int y = b;
  mov  ecx, DWORD PTR [ebp + 12]
  mov  DWORD PTR [ebp - 8], ecx

  ; int z = x + y;
  mov  edx, DWORD PTR [ebp - 4]
  add  edx, DWORD PTR [ebp - 8]
  mov  DWORD PTR [ebp - 12], edx

This means the addresses of parameters a and b are EBP + 8 and EBP + 12, respectively. And, the values of x, y, and z are stored at memory locations EBP - 4, EBP - 8, and EBP - 12, respectively.

Now that you are armed with this knowledge, let's play a game with the parameters of a function. Take a look at this simple program.
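The frame layout can also be written out as a small model. This is a minimal sketch, not what the compiler does: Frame and run_fun are hypothetical names, and memory is represented as a map from byte offsets relative to EBP to the 32-bit values stored there.

```cpp
#include <cstdint>
#include <map>

// Hypothetical model of the stack frame of fun(a, b) from Program 61.
// Keys are byte offsets relative to EBP; values are the 32-bit contents.
using Frame = std::map<int, std::int32_t>;

Frame run_fun(std::int32_t a, std::int32_t b) {
    Frame frame;
    frame[12] = b;   // second parameter, pushed first  (_b$ = 12)
    frame[8] = a;    // first parameter, pushed last    (_a$ = 8)
    // [ebp + 4] holds the return address and [ebp + 0] the caller's EBP;
    // both are left out here because their values are not known.

    frame[-4] = frame[8];                 // int x = a;      (_x$ = -4)
    frame[-8] = frame[12];                // int y = b;      (_y$ = -8)
    frame[-12] = frame[-4] + frame[-8];   // int z = x + y;  (_z$ = -12)
    return frame;
}
```

Running run_fun(5, 10) leaves 5, 10, and 15 at offsets -4, -8, and -12, which is the same picture the disassembly gives: parameters above EBP, locals below it.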

Program 62

#include <cstdio>

extern "C" int fun(int a, int b) {
  return a + b;
}

int main() {

  printf("%d\n", fun(4, 5));
  return 0;
}

The output of this program is "9", as expected. Now, let's change the program a little.

Program 63

#include <cstdio>

extern "C" int fun(int a, int b) {
  _asm mov dword ptr[ebp+12], 15
  _asm mov dword ptr[ebp+8], 14
  return a + b;
}

int main() {

  printf("%d\n", fun(4, 5));
  return 0;
}

The output of this program is "29". We now know the addresses of the parameters, and in this program we overwrite their values on the stack. When a and b are added, the new values, 14 and 15, are used.

Visual C++ has a naked attribute for functions. If you declare a function as naked, the compiler won't generate prolog and epilog code for it. Now, what are prolog and epilog code? "Prolog" is an English word meaning "opening" (it is also the name of a programming language used in AI, but that language has nothing to do with the prolog code generated by the compiler). Prolog code is the code that the compiler automatically inserts at the start of a function to set up the stack frame. Take a look at the assembly language code generated for Program 61. At the beginning of the function, the compiler automatically inserted the following code to set up the stack frame.

  push  ebp
  mov   ebp, esp
  sub   esp, 12                 ; 0000000cH

This code is called prolog code. In the same way, the code inserted at the end of the function is called epilog code. In the same program, the epilog code generated by the compiler is:

  mov  esp, ebp
  pop  ebp
  ret  0

Now, take a look at the function with the naked attribute:

Program 64

extern "C" void _declspec(naked) fun() {
  _asm ret
}

int main() {

  fun();
  return 0;
}

The code that the compiler generates for the function fun contains nothing but the instruction we wrote ourselves.

  _asm ret

This means that there is no prolog or epilog code in this function. Naked functions come with rules. You can't declare an automatic variable in a naked function, because the compiler would have to generate code to reserve space for it, and in a naked function the compiler won't generate that code for you. You also have to write the ret instruction yourself; otherwise, the program will crash. You can't even write a return statement in a naked function. Why? Because when you return a value from a function, the compiler puts that value in the eax register, which means the compiler would have to generate code for your return statement. Let's take a look at this simple program to understand how a function's return value works.

Program 64

#include <cstdio>

extern "C" int sum(int a, int b) {
  return a + b;
}

int main() {

  int iRetVal;
  sum(3, 7);
  _asm mov iRetVal, eax
  printf("%d\n", iRetVal);
  return 0;
}

The output of this program is "10". Here we haven't directly used the return value of the function; instead, we copy the value of eax into the variable just after calling the function.

Now, write the whole function as a naked function, with hand-written prolog and epilog code, that adds its two parameters and returns the sum.

Program 65

#include <cstdio>

extern "C" int _declspec(naked) sum(int a, int b) {

  // prolog code
  _asm push ebp
  _asm mov ebp, esp

  // add the two parameters; the sum is left in eax as the return value
  _asm mov eax, dword ptr [ebp + 8]
  _asm add eax, dword ptr [ebp + 12]

  // epilog code
  _asm pop ebp
  _asm ret
}

int main() {

  int iRetVal;
  sum(3, 7);
  _asm mov iRetVal, eax
  printf("%d\n", iRetVal);
  return 0;
}

The output of this program is "10"; in other words, the sum of two parameters: 3 and 7.

This attribute is used in the ATLBASE.H file to implement the members of the _QIThunk structure. This structure is used to debug reference counting in an ATL program when _ATL_DEBUG_INTERFACES is defined.

I hope to explore some other mysteries of ATL in the next article.

About the Author
C++ Developer at Bechtel Corporation.
