Target Program State Access
in the Alamo Monitor Framework

Wenyi Zhou and Clinton L. Jeffery
Feb. 28, 1996
Technical Report CS-96-5







Abstract


In the Alamo Monitor Framework, execution monitors (EMs) reside in the same address space as the target program (TP). EMs can easily access the local variables and structures in the TP, using the debugging symbol table information from the TP's object file and those registers that are the keys to the TP's state information, such as the current program counter, the frame pointer, and the stack pointer. A set of library functions provides EM authors with direct access to the TP's symbol tables, scopes, addresses and values of variables, and stack frames.







Division of Computer Science
The University of Texas at San Antonio
San Antonio, TX 78255



This work was supported in part by the National Science Foundation under grant CCR-9409082.


Introduction

The Alamo Monitor Framework [Jeff96] consists of two major parts: a Configurable C Instrumentation tool (CCI) and the Alamo Monitor Executive (AME). CCI instruments target programs for event generation [Templ96]. The AME allows execution monitors (EMs) to gather extensive information from an executing target program (TP), starting with the information from instrumented events and supplementing that information with direct access to the TP state. Alamo's design is based on an execution monitoring framework developed for the Icon programming language [Jeff94] and generalizes that model to encompass compiled, systems programming languages. This report describes the design and the programming interface of the AME used by monitor authors, focusing on the topic of how to access the target program's state information.

The Alamo Monitor Executive dynamically loads the target program (TP) being monitored along with one or more execution monitors (EMs) as needed. When multiple EMs are executing, they are controlled by a special EM designated as a monitor coordinator (MC). For each TP event that is requested by one or more of the EMs, the TP suspends, and control is transferred to the MC, which forwards event information (and control) to those EMs interested in the event. To acquire further information, EMs require the ability to access the TP's variables and structure information while processing the events. This functionality enables the EM writer to extract more information from the running target program and understand the program behavior better.
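
The forwarding step described above can be pictured with a small sketch. Everything in it is hypothetical: the event codes, the Monitor record, and the two functions are illustrative names, not part of Alamo. It assumes that each EM registers a bit mask of the events it wants and that the coordinator hands each event only to the EMs whose mask includes it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event codes; the real event set is defined by CCI. */
enum { EV_CALL = 1 << 0, EV_RETURN = 1 << 1, EV_ASSIGN = 1 << 2 };

/* Hypothetical record the coordinator keeps for each monitor. */
typedef struct {
    unsigned mask;       /* events this EM asked for */
    int      delivered;  /* how many events it has received */
} Monitor;

/* Union of all requested events: the TP only needs to suspend for these. */
unsigned requested_events(const Monitor *ems, size_t n)
{
    unsigned all = 0;
    for (size_t i = 0; i < n; i++)
        all |= ems[i].mask;
    return all;
}

/* Forward one event to every interested EM; returns how many received it. */
int forward_event(Monitor *ems, size_t n, unsigned event)
{
    int count = 0;
    assert(ems != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ems[i].mask & event) {
            ems[i].delivered++;
            count++;
        }
    }
    return count;
}
```

In this sketch an event that no EM requested is never forwarded, which corresponds to the TP suspending only for requested events.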

EM access to TP state is provided by several means. Since EMs do not have compile-time knowledge of TP types, a generic value datatype and related service functions implement a run-time traversal mechanism for TP values including arrays, structures, and unions. EM global variables corresponding to critical pieces of TP state are automatically assigned during event transmission. EM library functions provide access to TP symbol tables, scopes, addresses and values of variables, and stack traversal.

Descriptors

In Alamo, EMs use descriptors to refer to TP variables. Descriptors include type information from the debugging symbol table sections in the TP object module. The descriptor structure is as follows:
   typedef struct {
       struct ctype *type;
       void * addr;
   } Desc;
Descriptors have two fields: type is a pointer to a structure that contains the type information. It is not used directly by EM authors; instead, service functions use this pointer to extract type information for a given variable. addr is a void pointer that holds the memory location where the variable is stored.
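
As a minimal sketch of how the two fields work together, the code below builds a descriptor by hand and reads a value through it. The contents of struct ctype are not given in this report, so the two-member version here is a hypothetical stand-in, and make_desc(), desc_is_null(), and read_int() are illustrative helpers rather than Alamo functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: the report does not show struct ctype's contents. */
struct ctype {
    const char *name;  /* e.g. "int" */
    size_t      size;  /* size in bytes */
};

typedef struct {
    struct ctype *type;
    void *addr;
} Desc;

/* Pair a type record with the address of a variable. */
Desc make_desc(struct ctype *type, void *addr)
{
    Desc d;
    d.type = type;
    d.addr = addr;
    return d;
}

/* Mirrors the documented IsNull test: both fields are NULL. */
int desc_is_null(Desc d)
{
    return d.type == NULL && d.addr == NULL;
}

/* Read an int through a descriptor; the caller must know the type is int. */
int read_int(Desc d)
{
    assert(d.addr != NULL);
    return *(int *)d.addr;
}
```

Because the descriptor stores an address rather than a copy, reading through it always yields the variable's current value.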

Target Program's Stack Frame

In Alamo's coroutine model of execution monitoring, the TP saves its registers and automatic variables on its stack before control is transferred to the EMs. Figure 1 shows the TP's stack at the moment control is transferred to the EM, for the program shown on the right side of the diagram. The main() function calls f1(), which in turn calls f2(). At some line in f2(), the instrumentation inserted by CCI generates an event, and this is the point where control is transferred to the EM. Service functions allow EM authors to walk up and down the TP's stack to inspect any stack frame of interest. These service functions use several global variables internally to track the stack frame that is currently being visited. EM authors can inspect the current stack frame and retrieve information from it, such as the values, types, or addresses of variables.

Figure 1: a view of the TP's stack frame when the control is transferred to the EM.

EM Global Variables

EMs have several global descriptors: Params, Locals, Local_Statics, Globals, and Statics. These descriptors are used directly by EM authors to get handles for the parameters, local variables, and static variables for the function being visited, global variables, and file scope static variables, respectively. They are accessed using the same notation that is used for target program struct values. The type field of the descriptors for these implicit structures contains the program counter value for the current stack frame, and the addr field contains the frame pointer value for the current stack frame.

C does not allow nested functions, but it does allow nested block structures, called compound statements, enclosed in pairs of curly braces. Variables declared inside a pair of curly braces are in a new scope, one level deeper than that of the variables declared outside. Scope is the conceptual term for the visibility of variables within one file or across several files. Each function that is still active has an associated stack frame (or activation record), as shown in Figure 1. A function can contain several scopes, and the local variables declared in each scope are stored in the stack frame while they are alive. Locals represents the local variables of the function being visited, at the scope indicated by the program counter value. An example C program is shown in Figure 2:

Figure 2: An example C function with several block structures defined inside it.

For the program shown in Figure 2, when the control is transferred at the assignment a = 9, Locals in EM refers to the variables declared inside block 3. Block 2 and block 1 are direct ancestors of block 3. Block 1 is also a direct ancestor of block 4. Block 2 and block 4 are siblings. When the program is inside the block 4, it cannot reference variables declared inside block 2.
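
For readers without the figure at hand, the block relationships just described can be sketched with a hypothetical function. This is not the original Figure 2; the function name, the variables other than a, and the values other than 9 are assumptions chosen only to match the description of blocks 1 through 4.

```c
#include <assert.h>

/* Hypothetical function; the block numbers match the discussion above. */
int blocks(int n)
{                               /* block 1: the function body */
    int total = n;
    {                           /* block 2 */
        int a = 1;
        {                       /* block 3 */
            int a;              /* hides the "a" of block 2 */
            a = 9;              /* an event here sees block 3's scope */
            total += a;
        }
        assert(a == 1);         /* block 2's "a" was never touched */
        total += a;
    }
    {                           /* block 4: sibling of block 2 */
        int b = 2;              /* "a" is not visible here */
        total += b;
    }
    return total;
}
```

While execution is inside block 3, both variables named a are alive in the stack frame, but only the inner one is visible by name.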

Traversing the TP's stack and accessing named variables

In order to access local variables in different invocations of functions in the target program, EMs must be able to traverse the TP's stack. There are two functions provided for moving the internal frame pointer and program counter which keep track of the stack frame being visited:

int UpStack();
This function moves EM's view of the TP stack "up" to the previous function call's stack frame. It returns 1 upon success, or 0 if the current frame is on the TP's main() function.

int DownStack();
This function moves EM's view of the TP stack "down" one stack frame. It returns 1 upon success, or 0 if the current frame is on the frame where the TP was suspended.

Desc Var(char *name);
Given the name of a variable, this function first looks up the symbol in the local symbol table for the current stack frame in the scope indicated by the program counter. If it is not found, the local symbol tables that are the direct ancestors of the current scope are searched in sequence. If it is still not found, the global symbol table is searched. This function returns a descriptor with the type and address where this variable is stored in the memory. If the symbol is not found in any of the symbol tables, this function returns a descriptor with both the addr and type fields being NULL indicating error.

Using these three functions, an EM author can obtain information about any named variable that is visible in the stack frame currently selected by the internal frame pointer and program counter.
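
The return-value conventions of UpStack() and DownStack() can be illustrated with a small mock. The code below is not Alamo's implementation: it replaces the real frame pointer and program counter with an index into a hypothetical array of frames for the call chain main(), f1(), f2(), and current_function() and visiting() are illustrative helpers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical frame record; real frames are found through the TP's
   frame pointer and program counter. */
typedef struct {
    const char *function;
} Frame;

/* Mock TP stack for main() -> f1() -> f2(), suspended in f2(). */
static const Frame frames[] = { { "main" }, { "f1" }, { "f2" } };
enum { NFRAMES = sizeof frames / sizeof frames[0] };

/* Index of the frame being visited; starts at the suspended frame. */
static int current = NFRAMES - 1;

/* Move toward main(); 0 means the view is already on main(). */
int UpStack(void)
{
    if (current == 0)
        return 0;
    current--;
    return 1;
}

/* Move toward the suspended frame; 0 means the view is already there. */
int DownStack(void)
{
    if (current == NFRAMES - 1)
        return 0;
    current++;
    return 1;
}

/* Hypothetical helper: name of the function whose frame is visited. */
const char *current_function(void)
{
    assert(current >= 0 && current < NFRAMES);
    return frames[current].function;
}

/* Hypothetical helper: is the visited frame the one for this function? */
int visiting(const char *name)
{
    return strcmp(current_function(), name) == 0;
}
```

Since both functions return 0 at the end of the stack, a loop of the form while (UpStack()) visits every caller up to main() and then stops.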

Accessing Component Values Using Descriptors

EMs do not in general know the names, or types, of TP variables and their components, such as struct or union fields, or array elements. Another set of functions produces the names and types of component values. The functions take a descriptor parameter. The descriptor can either be an implicit structure, such as Locals, Params, Local_Statics, Globals, Statics, or it can be an ordinary C structure, union, or array value. Below is a brief description of various functions that extract component information from a descriptor.
int Num(Desc var);
This function returns the number of components described by the descriptor var.

Desc Elem(Desc var, int i);
This function returns a descriptor for the ith component described by the descriptor var.

char *Name(Desc var, int i);
This function returns the name of the ith component described by the descriptor var.

For example, using the implicit structures as the parameter for the above functions:

   int i;
   Desc temp;
   char *name;

   /* get the number of local variables for the current scope */
   i = Num(Locals);

   /* get the 2nd parameter for the current function */
   temp = Elem(Params, 2);

   /* get the name of the 2nd global variable */
   name = Name(Globals, 2);

   ......

In addition to implicit structures, a descriptor can represent an ordinary C type. In the above example, temp, which holds the 2nd parameter of the current function, may itself be a structure. The same functions can then be used to extract component information from it. The code sample would be:
   int i;
   Desc d;
   Desc temp;
   char *string;

   /* 
    * extracts the 2nd parameter, which is a structure, and places it in
    * the descriptor temp.
    */

   temp = Elem(Params, 2);

   /* i contains the number of fields of the structure. */
   i = Num(temp);

   /* d holds the descriptor for the 2nd field of the structure */
   d = Elem(temp, 2);

   /* string gets the name of the 3rd field of the structure */
   string = Name(temp, 3);

   ....

The above three functions enable EMs to access variable information in the TP stack frame indicated by the internal frame pointer value. In some cases, the EM writer needs to access a variable that is hidden by a declaration in the current scope. For example, in Figure 2, when the target program gives control to the EM at the assignment a = 9, the EM may want to access the variable a that is declared in block 2. The function Desc WhereIs(char *name, int i); returns a descriptor for the variable named name in the scope i levels outside the current scope. When i is 0, this is the current scope; when i is 1, it is the immediately enclosing scope, and so on. The code example would be:

   Desc d1, d2, d3;

   /*
    * d1 holds the descriptor for the variable "a" in block 3.
    */
   d1 = WhereIs("a", 0);

   /* 
    * d2 holds the descriptor for the variable "a" in block 2.
    */
   d2 = WhereIs("a", 1);

   /* 
    * d3 holds the descriptor for the variable "a" in block 1,
    * which should be a NULL descriptor value indicating not found.
    */
   d3 = WhereIs("a", 2);

Note that a local variable is alive only if it is declared in a direct ancestor of the current scope. EM authors cannot ask for information about dead local variables, since their memory is reused by other variables.
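
The difference between the two lookup styles, a single scope at a given level as in WhereIs and the innermost enclosing declaration as in the local part of Var, can be sketched with a mock scope chain. The Scope record and the functions below are hypothetical and stand in for the real symbol tables; only the lookup order is taken from the descriptions above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4

/* Hypothetical scope record: a few names plus a link to the enclosing scope. */
typedef struct Scope {
    const struct Scope *parent;     /* NULL for the outermost block */
    const char *names[MAX_SYMS];    /* NULL-terminated unless full */
} Scope;

/* Scopes for the situation in the text: "a" lives in blocks 2 and 3. */
static const Scope block1 = { NULL,    { "total", NULL } };
static const Scope block2 = { &block1, { "a", NULL } };
static const Scope block3 = { &block2, { "a", NULL } };

/* Does this single scope declare the name? */
static int declares(const Scope *s, const char *name)
{
    for (int i = 0; i < MAX_SYMS && s->names[i] != NULL; i++)
        if (strcmp(s->names[i], name) == 0)
            return 1;
    return 0;
}

/* Like WhereIs: look only in the scope `level` steps outside `cur`.
   Returns that scope, or NULL if it is missing or lacks the name. */
const Scope *where_is(const Scope *cur, const char *name, int level)
{
    assert(level >= 0);
    while (cur != NULL && level-- > 0)
        cur = cur->parent;
    return (cur != NULL && declares(cur, name)) ? cur : NULL;
}

/* Like the local part of Var: innermost enclosing scope with the name. */
const Scope *innermost(const Scope *cur, const char *name)
{
    for (; cur != NULL; cur = cur->parent)
        if (declares(cur, name))
            return cur;
    return NULL;   /* Var would go on to search the global symbol table */
}
```

Sibling scopes never appear on the parent chain, so a variable of a sibling block cannot be reached by either lookup.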

Type, value, address, and size information for a variable

The following functions on descriptors give EM writers the ability to obtain information about the size of a variable, the type of a variable, and the value obtained by dereferencing a variable.

int Size(Desc var);
This function returns the size of a variable, similar to sizeof in C.

char * TypeString(Desc var);
This function returns a string description of the type of the variable. For a structure or union variable, it lists the type of each field, but only one level deep.

int Type(Desc var);
This function returns an integer value representing one of the predefined C variable types. The predefined variable types are: INT, CHAR, LONG_INT, UNSIGNED_INT, LONG_UNSIGNED_INT, SHORT_INT, SHORT_UNSIGNED_INT, SIGNED_CHAR, UNSIGNED_CHAR, LONG_DOUBLE, FLOAT, DOUBLE, VOID, STRUCT, UNION, ENUM, POINTER, FUNCTION, ARRAY.

long DerefI(Desc var);
This function returns a long for a variable of type CHAR, UNSIGNED_CHAR, SIGNED_CHAR, INT, LONG_INT, UNSIGNED_INT, LONG_UNSIGNED_INT, SHORT_INT, or SHORT_UNSIGNED_INT. 1-byte and 2-byte integer values are promoted to a 4-byte value. It is up to the EM writer to cast the result to the desired type.

long double DerefD(Desc var);
This function returns a long double value for a variable of type LONG_DOUBLE, DOUBLE, or FLOAT. 4-byte and 8-byte floating-point values are promoted to a 16-byte value. EM authors can cast the result to a float or double.

Desc DerefP(Desc var, int i);
This function returns a descriptor whose type field is the type of the object that the pointer points to, and whose addr field is the address of the ith such object.

char * Image(Desc var);
This function returns a string image for the value of a variable. It only works for the variable types of INT, CHAR, LONG_INT, UNSIGNED_INT, LONG_UNSIGNED_INT, SHORT_INT, SHORT_UNSIGNED_INT, SIGNED_CHAR, UNSIGNED_CHAR, LONG_DOUBLE, FLOAT, DOUBLE.

void *Addr(Desc var);
This function returns the address of the variable specified.
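
The widening done by DerefI and the address arithmetic behind DerefP can be sketched as follows. The functions are mocks under stated assumptions: they take a type code and a raw address instead of a Desc, the enum lists only a hypothetical subset of the type codes named above, and the element size is passed in explicitly because this sketch has no type records.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of type codes; names follow the list in the text. */
enum { CHAR, UNSIGNED_CHAR, SHORT_INT, INT, LONG_INT };

/* Mock of DerefI: read an integer of the given type at addr and widen
   it to long. The real function takes a Desc, not a code and address. */
long deref_int(int typecode, const void *addr)
{
    assert(addr != NULL);
    switch (typecode) {
    case CHAR:          return *(const char *)addr;
    case UNSIGNED_CHAR: return *(const unsigned char *)addr;
    case SHORT_INT:     return *(const short *)addr;
    case INT:           return *(const int *)addr;
    case LONG_INT:      return *(const long *)addr;
    default:            return 0;   /* not an integer type */
    }
}

/* Mock of DerefP's address computation: the ith object a pointer points
   to lives elem_size * i bytes past the pointer's value. */
void *deref_ptr_addr(void *pointer_value, size_t elem_size, int i)
{
    return (char *)pointer_value + (ptrdiff_t)elem_size * i;
}
```

Reading through the declared type before widening preserves the sign of short and signed char values, which a raw 4-byte read would not.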

Predicates

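The following predicate functions return 1 for true and 0 for false. The comparison predicates IsEQ, IsGT, IsGE, IsLE, and IsLT also return -1 to indicate an error when the types of their arguments differ.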
int IsMember(Desc var1, char *var2);
If var1 is of type structure or union, tests whether var2 is a field of var1.
If var1 is of type function, tests whether var2 is a local variable of var1.

int IsNull(Desc var);
Tests whether a descriptor's addr and type fields are NULL.

int IsSameType(Desc var1, Desc var2);
Tests if var1 and var2 have the same type fields.

int IsEQ(Desc var1, Desc var2);
This function first calls IsSameType to test the types of var1 and var2. If the types differ, it returns -1 to indicate an error. Otherwise, it returns 1 if var1 and var2 have the same addr fields, and 0 if not.

int IsGT(Desc var1, Desc var2);
This function first calls IsSameType to test whether var1 and var2 have the same type. If the types differ, it returns -1 to indicate an error. Otherwise, it returns 1 if var1's value is strictly greater than var2's value, and 0 if not.

int IsGE(Desc var1, Desc var2);
This function first calls IsSameType to test whether var1 and var2 have the same type. If the types differ, it returns -1 to indicate an error. Otherwise, it returns 1 if var1's value is greater than or equal to var2's value, and 0 if not.

int IsLE(Desc var1, Desc var2);
This function first calls IsSameType to test whether var1 and var2 have the same type. If the types differ, it returns -1 to indicate an error. Otherwise, it returns 1 if var1's value is less than or equal to var2's value, and 0 if not.

int IsLT(Desc var1, Desc var2);
This function first calls IsSameType to test whether var1 and var2 have the same type. If the types differ, it returns -1 to indicate an error. Otherwise, it returns 1 if var1's value is strictly less than var2's value, and 0 if not.
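
The three-valued result of the comparison predicates deserves care in calling code. The mock below is a sketch, not Alamo's implementation: MiniDesc replaces the struct ctype pointer with a hypothetical integer type code, and same_type() and is_gt() stand in for IsSameType and IsGT for two types only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified descriptor: an integer type code instead of
   the struct ctype pointer used by Alamo. */
typedef struct {
    int   typecode;
    void *addr;
} MiniDesc;

enum { T_INT, T_DOUBLE };

/* Same type code? (stands in for IsSameType) */
int same_type(MiniDesc a, MiniDesc b)
{
    return a.typecode == b.typecode;
}

/* Mock of IsGT: -1 on type mismatch, otherwise 1 or 0. */
int is_gt(MiniDesc a, MiniDesc b)
{
    if (!same_type(a, b))
        return -1;
    assert(a.addr != NULL && b.addr != NULL);
    if (a.typecode == T_INT)
        return *(int *)a.addr > *(int *)b.addr;
    return *(double *)a.addr > *(double *)b.addr;
}
```

Because -1 is nonzero, a test such as if (IsGT(x, y)) would treat a type mismatch as true; comparing the result with 1 explicitly avoids that.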

Location information for the TP program

void Loc(char *filename, int *line_num);
This function does a lookup using the internal program counter and frame pointer values, and writes the filename and current line number information into the parameters given.
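
This report does not describe how the lookup is carried out. One common arrangement, shown here purely as an assumption, is a table sorted by code address in which each entry gives the source line that starts at that address. LineEntry and line_for_pc() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical line-table entry: code from start_pc onward belongs to line. */
typedef struct {
    unsigned long start_pc;
    int           line;
} LineEntry;

/* Return the line for pc, or 0 if pc precedes the table.
   The table must be sorted by ascending start_pc. */
int line_for_pc(const LineEntry *table, size_t n, unsigned long pc)
{
    int line = 0;
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n && table[i].start_pc <= pc; i++)
        line = table[i].line;
    return line;
}
```

A linear scan keeps the sketch short; a binary search over the same sorted table would give the same answer.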

Examples of using these functions

An example of traversing the fields of a structure and indexing the elements of an array, printing the value of each field or element that has a simple type:

    Desc d, temp;
    int i, j, typecode;
    char *s;

    ......

    for ( i = 0; i < Num(Locals); i ++ ) {
       d = Elem(Locals, i);
       if ( Type(d) == STRUCT || Type(d) == ARRAY ) {
          for ( j = 0; j < Num(d); j ++ ) {

             ......

             /* get the jth field's or element's variable name */
             s = Name(d, j);

             /* get the jth field's or element's descriptor */
             temp = Elem(d, j);

             /*
              * Or using Var(s) to get the jth field's or
              * element's descriptor.
              *
              * temp = Var(s);
              */

             /* get the jth field's variable type */
             typecode = Type(temp);

             switch (typecode) { /* printing out the values */
                case INT:
                   /* DerefI returns a long */
                   printf("integer value: %ld \n", DerefI(temp));
                   break;
                case CHAR:
                   /* characters are also read with DerefI, then cast */
                   printf("char value: %c \n", (char) DerefI(temp));
                   break;
                case DOUBLE:
                   /* DerefD returns a long double; cast it down */
                   printf("double value: %f \n", (double) DerefD(temp));
                   break;
                case FLOAT:
                   printf("float value: %f \n", (float) DerefD(temp));
                   break;

                   /*
                    * Or we can use Image() to get the string image
                    * of the value for a variable of type INT, CHAR,
                    * DOUBLE, FLOAT, etc.
                    *
                    * printf("var value: %s \n", Image(temp));
                    */

                 ......
             }
             ......
          }
       }
    }
    ......

An example of following a linked list. In C, a statement such as for ( p = x; p; p = p->next ) is typical. An EM writer can write a similar loop, but usually does not know the structure of the variable "x" and so must first do a little more work to find the structure's "next" field. The code sample would be:

  
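   Desc d, node;
   int i, index = -1;

   /*
    * x is assumed to be a pointer to a list node. Find the field of
    * the node structure that has the same pointer type as x (the
    * "next" field), and assign that element's index to variable index.
    */

    d = Var("x");
    node = DerefP(d, 0);
    for ( i = 0; i < Num(node); i ++ ) {
       if ( IsSameType(d, Elem(node, i)) == 1 ) {
          index = i;
          break;
       }
    }

    /* follow the list until the pointer value is NULL */
    while ( index >= 0 && Addr(node) != NULL ) {

       ......

       node = DerefP(Elem(node, index), 0);
    }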

An example of querying information about a variable which is not in the current scope.
  
   char *s;
   int i, j;
   Desc d;

   .....
 
   for ( i = 0; i < Num(Locals); i++ ) {
      s = Name(Locals, i);
      for ( j = 1, d = WhereIs(s, j); !IsNull(d); 
		j++, d = WhereIs(s, j) ) {

	 ......
      
      } 

      ......
    }

An example of traversing the TP's stack, printing the filename and line number for each stack frame:

    /* Loc writes the filename here; the buffer size is an assumption */
    char filename[256];
    int line_num;

    ......

    /* get current filename and line number info */
    Loc(filename, &line_num);
    printf("filename: %s, line_number: %d \n", filename, line_num);

    /* UpStack() returns 0 once the current frame is main()'s */
    while ( UpStack() ) {
       ......

       Loc(filename, &line_num);
       printf("filename: %s, line_number: %d \n", filename, line_num);

       ......
    }

Bibliography

[Jeff96] Jeffery, C. L., Zhou, W., and Templer, K. S., "The Alamo Monitor Framework", Technical Report 96-7, Division of Computer Science, University of Texas at San Antonio, February, 1996.

[Templ96] Templer, K. S., "A Configurable C Instrumentation Tool", Technical Report 96-6, Division of Computer Science, University of Texas at San Antonio, February, 1996.

[Jeff94] Jeffery, C. L., "A Framework for Execution Monitoring in Icon", Software: Practice and Experience, November, 1994.