Target Program State Access
in the Alamo Monitor Framework
Wenyi Zhou and Clinton L. Jeffery
Feb. 28, 1996
Technical Report CS-96-5
Abstract
In the Alamo Monitor Framework, execution monitors (EMs) reside in the
same address space as the target program (TP). EMs can easily access the local
variables and structures in the TP, using the debugging symbol
table information from the TP's object file and those registers that are
the keys to the TP's state information, such as the current program counter,
the frame pointer, and the stack pointer. A set of library functions provides
EM authors with direct access to the TP's symbol tables, scopes, addresses
and values of variables, and stack frames.
Division of Computer Science
The University of Texas at San Antonio
San Antonio, TX 78255
This work was supported in part by the National Science Foundation under
grant CCR-9409082.
Introduction
The Alamo Monitor Framework [Jeff96] consists of two major parts:
a Configurable C Instrumentation tool (CCI) and the Alamo Monitor Executive
(AME). CCI instruments target programs for event generation [Templ96].
The AME allows execution monitors (EMs) to gather extensive information
from an executing target program (TP), starting with the information from
instrumented events and supplementing that information with direct access
to the TP state. Alamo's design is based on an execution monitoring framework
developed for the Icon programming language [Jeff94] and generalizes that model
to encompass compiled, systems programming languages.
This report describes the design and the programming
interface of the AME used by monitor authors, focusing on the topic of how
to access the target program's state information.
The Alamo Monitor Executive dynamically loads the target program (TP) being
monitored along with one or more execution monitors (EMs) as needed. When
multiple EMs are executing, they are controlled by a special EM designated
as a monitor coordinator (MC). For each TP event that is requested by one
or more of the EMs, the TP suspends, and control is transferred to the MC,
which forwards event information (and control) to those EMs interested in
the event. To acquire further information, EMs require the ability to
access the TP's variables and structure information while processing the
events. This functionality enables the EM writer to extract more information
from the running target program and understand the program behavior better.
EM access to TP state is provided by several means. Since EMs do not have
compile-time knowledge of TP types, a generic value datatype and
related service functions implement a run-time traversal mechanism for TP
values including arrays, structures, and unions. EM global variables
corresponding to critical pieces of TP state are automatically assigned
during event transmission. EM library functions provide access to TP symbol
tables, scopes, addresses and values of variables, and stack traversal.
Descriptors
In Alamo, EMs use descriptors to refer to TP variables.
Descriptors include type information from the debugging symbol
table sections in the TP object module.
The descriptor structure is as follows:
typedef struct {
    struct ctype *type;
    void *addr;
} Desc;
A descriptor has two fields. The type field is a pointer to a
structure that contains the type information. It is not used directly
by EM authors; instead, service functions use this pointer to extract
type information for a given variable.
The addr field is a void pointer that holds the memory location
where the variable is stored.
Target Program's Stack Frame
In Alamo's coroutine model of execution monitoring, the TP saves its registers
and automatic variables on its stack before control is transferred to the
EMs. Figure 1 shows the TP's stack at the moment control is transferred to
the EM, for the program shown on the right side of the diagram.
The main() function calls f1(), which in turn calls f2().
At some line in f2(), an event is generated by the CCI instrumentation,
and this is the point where control is transferred to the EM.
Service functions are provided for EM authors to walk
up and down the TP's stack to inspect any particular stack frame
of interest. These service functions use several global variables internally
to track the stack frame that is currently being visited.
EM authors can inspect the current stack frame and retrieve
information, such as values, types, or addresses of
variables from it.
Figure 1: a view of the TP's stack frame when the control
is transferred to the EM.
EM Global Variables
EMs have several global descriptors:
Params, Locals, Local_Statics, Globals,
and Statics. These descriptors are used directly by EM authors
to get handles for the parameters, local variables, and static variables for
the function being visited, global variables,
and file scope static variables, respectively. They are accessed using the
same notation that is used for target program struct values.
The type field of the descriptors for these implicit structures
contains the program counter value for the current stack frame,
and the addr field contains the frame pointer value for the
current stack frame.
C does not allow nested functions, but does allow nested block structures
called compound statements inside pairs of curly braces.
Variables declared inside each pair of curly braces are
in a new scope that is one level deeper than the variables declared outside.
Scope determines the visibility of variables
inside one file or between several files. For each function that is still
active, there is a stack frame (or activation record) associated with it, as
shown in Figure 1. A function can contain several scopes, and the
local variables declared in each scope are stored in the stack frame
while they are alive. Locals represents the local variables of the
function being visited, at the scope indicated by the program counter
value. An example C program is shown in Figure 2:
Figure 2: An example C function with several block structures
defined inside it.
For the program shown in Figure 2, when the control is transferred at the
assignment a = 9, Locals in EM refers to the variables declared
inside block 3. Block 2 and block 1 are direct ancestors of block 3. Block
1 is also a direct ancestor of block 4. Block 2 and block 4 are siblings.
When the program is inside block 4, it cannot reference variables
declared inside block 2.
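The relationship between a program counter value and the block it selects can be sketched as follows. This is only an illustration under stated assumptions: the Scope record, the helper names innermost_scope and is_visible_from, and the program counter ranges are all hypothetical, and none of them comes from Alamo's actual symbol table layout. Only the nesting of blocks 1 to 4 follows the description of Figure 2.

```c
#include <assert.h>

/* Hypothetical scope record; not Alamo's actual symbol table layout. */
typedef struct {
    unsigned long lo, hi;  /* half-open program counter range [lo, hi) */
    int parent;            /* index of the enclosing block, -1 for the outermost */
} Scope;

/* Blocks 1..4 of Figure 2 at indices 0..3; the pc ranges are made up. */
static const Scope scopes[] = {
    { 100, 200, -1 },  /* block 1: function body */
    { 110, 150,  0 },  /* block 2: nested in block 1 */
    { 120, 140,  1 },  /* block 3: nested in block 2 */
    { 160, 190,  0 },  /* block 4: nested in block 1, sibling of block 2 */
};
static const int nscopes = sizeof scopes / sizeof scopes[0];

/* Return the index of the innermost scope containing pc, or -1 if none. */
int innermost_scope(unsigned long pc)
{
    int best = -1;
    for (int i = 0; i < nscopes; i++) {
        if (pc >= scopes[i].lo && pc < scopes[i].hi) {
            /* A nested block covers a narrower range than its ancestors. */
            if (best < 0 ||
                scopes[i].hi - scopes[i].lo < scopes[best].hi - scopes[best].lo)
                best = i;
        }
    }
    return best;
}

/* Return 1 if scope anc is scope s itself or one of its enclosing blocks. */
int is_visible_from(int anc, int s)
{
    assert(anc >= 0 && anc < nscopes);
    for (; s >= 0; s = scopes[s].parent)
        if (s == anc)
            return 1;
    return 0;
}
```

With these made-up ranges, a program counter of 125 selects block 3, and is_visible_from(1, 3) is 0 because block 2 is not an ancestor of block 4.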
Traversing the TP's stack and accessing named variables
In order to access local variables in different invocations of functions in the
target program, EMs must be able to traverse the TP's
stack. Two functions move the internal frame pointer
and program counter, which keep track of the stack frame being visited,
and a third looks up a named variable relative to that frame:
- int UpStack();
- This function moves EM's view of the TP stack "up" to the previous
function call's stack frame. It returns 1 upon success,
or 0 if the current frame is on the TP's main() function.
- int DownStack();
- This function moves EM's view of the TP stack "down" one stack frame.
It returns 1 upon success, or 0 if the current
frame is on the frame where the TP was suspended.
- Desc Var(char *name);
- Given the name of a variable, this function first looks up the
symbol in the local symbol table for the current stack frame in
the scope indicated by the program counter. If it is not found,
the local symbol tables that are the direct ancestors
of the current scope are searched in sequence. If it is still not
found, the global symbol
table is searched. This function returns a descriptor with
the type and address where this variable is stored in the memory.
If the symbol is not found in any of the symbol tables, this function
returns a descriptor with both the addr
and type fields being NULL indicating error.
Using these three functions, an EM author can obtain information about any
named variable visible from the stack frame that the internal frame pointer and
program counter currently designate.
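The return conventions of the two traversal functions can be modeled with a small cursor over an array of frames. This is a sketch under stated assumptions: up_stack, down_stack, current_function, and depth_to_main are hypothetical stand-ins written for this illustration, and the three-frame stack mirrors the main(), f1(), f2() call chain of Figure 1. The real functions operate on the TP's memory, not on an array.

```c
#include <assert.h>

/* Hypothetical model of the TP call stack, innermost frame first. */
#define NFRAMES 3
static const char *frames[NFRAMES] = { "f2", "f1", "main" };
static int cur = 0;  /* frame being visited; 0 is where the TP was suspended */

/* Model of UpStack(): move toward main(); return 0 if already on main(). */
int up_stack(void)
{
    if (cur == NFRAMES - 1)
        return 0;
    cur++;
    return 1;
}

/* Model of DownStack(): move toward the suspended frame; 0 if already there. */
int down_stack(void)
{
    if (cur == 0)
        return 0;
    cur--;
    return 1;
}

/* Name of the function whose frame is currently being visited. */
const char *current_function(void)
{
    assert(cur >= 0 && cur < NFRAMES);
    return frames[cur];
}

/* Count the frames between the current one and main(), then restore the view. */
int depth_to_main(void)
{
    int ups = 0;
    while (up_stack())      /* 0 ends the loop once main() is reached */
        ups++;
    for (int i = 0; i < ups; i++)
        down_stack();
    return ups;
}
```

Because both functions report failure with 0, a traversal loop can test the return value directly, as depth_to_main does.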
Accessing Component Values Using Descriptors
EMs do not in general know the names, or types, of TP variables and their
components, such as struct or union fields, or array elements. Another set of
functions produces the names and types of component values.
The functions take a descriptor
parameter. The descriptor can either be an implicit structure, such as
Locals, Params, Local_Statics, Globals,
Statics, or it can be an ordinary C structure,
union, or array value. Below is a brief description of various functions that
extract component information from a descriptor.
- int Num(Desc var);
- This function returns the number of components described by the
descriptor var.
- Desc Elem(Desc var, int i);
- This function returns a descriptor for the ith component
described by the descriptor var.
- char *Name(Desc var, int i);
- This function returns the name of the ith component described by
the descriptor var.
For example, using the implicit structures as the parameter for the
above functions:
int i;
Desc temp;
char *name;
/* get the number of local variables for the current scope*/
i = Num(Locals);
/* get the 2nd parameter for the current function */
temp = Elem(Params, 2);
/* get the name for the 2nd global variables */
name = Name(Globals, 2);
......
In addition to implicit structures, a descriptor can represent
an ordinary C type. In the above example,
temp which retrieves the 2nd parameter
for the current function may be a structure itself. The above functions
can also be used to extract similar information from an ordinary C type.
The code sample would be:
int i;
Desc d;
Desc temp;
char *string;
/*
* extracts the 2nd parameter, which is a structure, and places it in
* the descriptor temp.
*/
temp = Elem(Params, 2);
/* i contains the number of fields of the structure. */
i = Num(temp);
/* d holds the descriptor for the 2nd field of the structure */
d = Elem(temp, 2);
/* string gets the name of the 3rd field of the structure */
string = Name(temp, 3);
....
The above three functions enable EMs to access variable information
in the stack frame indicated by the internal
frame pointer value. In some cases, the EM writer needs to access a
variable that is hidden by a declaration in the current scope. For example,
in Figure 2, when the target program gives control to the EM inside block 3,
the EM may want to access the variable a that is declared in block 2.
The function: Desc WhereIs(char *name, int i);
returns a descriptor for the variable named name in the scope
i levels outside the current scope. When i is 0,
this is the current scope. When i is 1,
this is the immediately enclosing
scope, and so on. The code example would be:
Desc d1, d2, d3;
/*
* d1 holds the descriptor for the variable "a" in block 3.
*/
d1 = WhereIs("a", 0);
/*
* d2 holds the descriptor for the variable "a" in block 2.
*/
d2 = WhereIs("a", 1);
/*
* d3 holds the descriptor for the variable "a" in block 1,
* which should be a NULL descriptor value indicating not found.
*/
d3 = WhereIs("a", 2);
Note that a local variable is alive only if it is declared
in a direct ancestor of the current scope. EM authors cannot ask for
information about dead local variables, since their memory is
reused by other variables.
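The difference between looking in one specific enclosing scope and searching outward through all of them can be sketched as below. This is an illustration only: the Sym and Scope records and the functions where_is and var are hypothetical, the symbols n and b and all values are invented, and the global symbol table step is omitted. Only the placement of the two declarations of a (in blocks 2 and 3, and none in block 1) follows the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol and scope records, for illustration only. */
typedef struct { const char *name; int value; } Sym;
typedef struct { const Sym *syms; int nsyms; int parent; } Scope;

static const Sym block1_syms[] = { { "n", 1 } };
static const Sym block2_syms[] = { { "a", 2 } };
static const Sym block3_syms[] = { { "a", 3 }, { "b", 4 } };

/* Blocks 1..3 of Figure 2 at indices 0..2; block 3 is the current scope. */
static const Scope scopes[] = {
    { block1_syms, 1, -1 },
    { block2_syms, 1,  0 },
    { block3_syms, 2,  1 },
};

/* Search a single scope for name; NULL if it is not declared there. */
static const Sym *find_in(int scope, const char *name)
{
    for (int i = 0; i < scopes[scope].nsyms; i++)
        if (strcmp(scopes[scope].syms[i].name, name) == 0)
            return &scopes[scope].syms[i];
    return NULL;
}

/* WhereIs-like: look only in the scope that is `levels` steps out from `scope`. */
const Sym *where_is(int scope, const char *name, int levels)
{
    assert(levels >= 0);
    while (scope >= 0 && levels-- > 0)
        scope = scopes[scope].parent;
    return scope >= 0 ? find_in(scope, name) : NULL;
}

/* Var-like: innermost declaration visible from `scope` (globals omitted). */
const Sym *var(int scope, const char *name)
{
    for (; scope >= 0; scope = scopes[scope].parent) {
        const Sym *s = find_in(scope, name);
        if (s != NULL)
            return s;
    }
    return NULL;
}
```

In this model where_is(2, "a", 1) finds the declaration in block 2 that var(2, "a") cannot reach, because the outward search stops at the first match in block 3.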
Type, value, address, and size information for a variable
The following functions on descriptors give EM writers the ability to
obtain information
about the size of a variable, the type of a variable, and the value
obtained by dereferencing a variable.
- int Size(Desc var);
- This function returns the size of a variable, similar to
sizeof in C.
- char * TypeString(Desc var);
- This function returns a string description of the type of the variable.
For a structure or union variable, it lists the type of each field,
but only one level deep.
- int Type(Desc var);
- This function returns an integer value representing one of the predefined
C variable types. The predefined variable types are: INT, CHAR, LONG_INT,
UNSIGNED_INT, LONG_UNSIGNED_INT, SHORT_INT, SHORT_UNSIGNED_INT,
SIGNED_CHAR, UNSIGNED_CHAR, LONG_DOUBLE, FLOAT, DOUBLE,
VOID, STRUCT, UNION, ENUM, POINTER, FUNCTION, ARRAY.
- long DerefI(Desc var);
- This function returns a long for variables of types CHAR,
UNSIGNED_CHAR, SIGNED_CHAR, INT, LONG_INT,
UNSIGNED_INT, LONG_UNSIGNED_INT, SHORT_INT, and SHORT_UNSIGNED_INT.
1-byte and 2-byte integer values are promoted to a
4-byte value. It is up to the EM writer to cast the result to the
desired type.
- long double DerefD(Desc var);
- This function returns a long double value for variables of types
LONG_DOUBLE, DOUBLE, and FLOAT. 4-byte and 8-byte
floating point values are promoted to a 16-byte value. EM authors can cast
the result to a float or double.
- Desc DerefP(Desc var, int i);
- This function returns a descriptor whose type field
is the type that the pointer points to and whose addr
field is the address of the ith object
that the pointer points to.
- char * Image(Desc var);
- This function returns a string image for the value of a variable.
It only works for the variable types of INT, CHAR, LONG_INT,
UNSIGNED_INT, LONG_UNSIGNED_INT, SHORT_INT, SHORT_UNSIGNED_INT,
SIGNED_CHAR, UNSIGNED_CHAR, LONG_DOUBLE, FLOAT, DOUBLE.
- void *Addr(Desc var);
- This function returns the address of the variable specified.
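The integer promotion described for DerefI can be sketched as follows. This is a minimal model, not Alamo's implementation: the report does not show the contents of struct ctype, so the kind field, the K_ type codes, and the function deref_int are assumptions made for this illustration. The sketch reads the stored value at its declared width and widens it to long.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type codes and type record; struct ctype is not shown in the report. */
enum { K_SIGNED_CHAR, K_UNSIGNED_CHAR, K_SHORT_INT, K_SHORT_UNSIGNED_INT, K_INT };
struct ctype { int kind; };

typedef struct {
    struct ctype *type;
    void *addr;
} Desc;

/* DerefI-like: read the integer at d.addr at its declared width, widen to long. */
long deref_int(Desc d)
{
    assert(d.type != NULL && d.addr != NULL);
    switch (d.type->kind) {
    case K_SIGNED_CHAR:
        { signed char v; memcpy(&v, d.addr, sizeof v); return v; }
    case K_UNSIGNED_CHAR:
        { unsigned char v; memcpy(&v, d.addr, sizeof v); return v; }
    case K_SHORT_INT:
        { short v; memcpy(&v, d.addr, sizeof v); return v; }
    case K_SHORT_UNSIGNED_INT:
        { unsigned short v; memcpy(&v, d.addr, sizeof v); return v; }
    case K_INT:
        { int v; memcpy(&v, d.addr, sizeof v); return v; }
    }
    assert(!"deref_int: descriptor is not an integer type");
    return 0;
}
```

Signed types are sign-extended and unsigned types are zero-extended by the ordinary C conversions, which is why the caller can safely cast the result back to the original type.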
Predicates
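The following predicate functions return 1 for true and
0 for false. The comparison predicates (IsEQ, IsGT, IsGE, IsLE, and IsLT)
additionally return -1 when their arguments do not have the same type.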
- int IsMember(Desc var1, char *var2);
- If var1 is of type structure or union, tests whether var2
is a field of var1.
If var1 is of type function, tests whether var2 is
a local variable of var1.
- int IsNull(Desc var);
- Tests whether a descriptor's addr and
type fields are NULL.
- int IsSameType(Desc var1, Desc var2);
- Tests if var1 and var2 have the same type fields.
- int IsEQ(Desc var1, Desc var2);
- This function first calls IsSameType to test the types of
var1 and var2.
If they differ, it returns -1 to indicate an error. Otherwise,
it tests whether var1 and var2 have the same addr
fields and returns 1 for true, 0 for false.
- int IsGT(Desc var1, Desc var2);
- This function first calls IsSameType to test whether var1
and var2 have the
same type. If they do not, it returns -1 to indicate an error. Otherwise,
it tests whether var1's value is strictly greater than var2's value
and returns 1 for true, 0 for false.
- int IsGE(Desc var1, Desc var2);
- This function first calls IsSameType to test whether var1
and var2 have the
same type. If they do not, it returns -1 to indicate an error. Otherwise,
it tests whether var1's value is greater than or equal to var2's value
and returns 1 for true, 0 for false.
- int IsLE(Desc var1, Desc var2);
- This function first calls IsSameType to test whether var1
and var2 have the
same type. If they do not, it returns -1 to indicate an error. Otherwise,
it tests whether var1's value is less than or equal to var2's value
and returns 1 for true, 0 for false.
- int IsLT(Desc var1, Desc var2);
- This function first calls IsSameType to test whether var1
and var2 have the
same type. If they do not, it returns -1 to indicate an error. Otherwise,
it tests whether var1's value is strictly less than var2's value
and returns 1 for true, 0 for false.
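One consequence of the -1 error return is worth spelling out: in C, -1 is a true value, so a bare test such as if (IsGT(a, b)) would treat a type mismatch as "greater". The sketch below illustrates the three-valued convention and a guarded caller. The Val record, the K_ codes, and the functions is_gt and definitely_gt are hypothetical stand-ins, not part of the Alamo library.

```c
#include <assert.h>

/* Hypothetical stand-in for a typed value; not Alamo's Desc. */
enum { K_INT, K_LONG_INT };
typedef struct { int kind; long value; } Val;

/* IsGT-like: -1 if the types differ, otherwise 1 or 0. */
int is_gt(Val a, Val b)
{
    if (a.kind != b.kind)
        return -1;
    return a.value > b.value;
}

/* Guarded caller: only an explicit 1 counts as "greater". */
int definitely_gt(Val a, Val b)
{
    int r = is_gt(a, b);
    assert(r == -1 || r == 0 || r == 1);
    return r == 1;
}
```

Comparing the result against 1 explicitly, as definitely_gt does, keeps an error from being mistaken for a successful comparison.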
Location information for the TP
- void Loc(char *filename, int *line_num);
- This function does a lookup using the internal program counter and frame pointer
values, and writes the filename and current line number information into the
parameters given.
Examples of using these functions
An example of traversing the fields of a structure and indexing
the elements of an array. The value of each field or element
is printed if it is of a simple type:
Desc d, temp;
int i, j, typecode;
char *s;
......
for ( i = 0; i < Num(Locals); i++ ) {
    d = Elem(Locals, i);
    if ( Type(d) == STRUCT || Type(d) == ARRAY ) {
        for ( j = 0; j < Num(d); j++ ) {
            ......
            /* get the jth field's or element's variable name */
            s = Name(d, j);
            /* get the jth field's or element's descriptor */
            temp = Elem(d, j);
            /*
             * Or using Var(s) to get the jth field's or
             * element's descriptor.
             *
             * temp = Var(s);
             */
            /* get the jth field's variable type */
            typecode = Type(temp);
            switch (typecode) { /* printing out the values */
            case INT:
                /* DerefI returns a long, so print it with %ld */
                printf("integer value: %ld \n", DerefI(temp));
                break;
            case CHAR:
                /* characters are also read with DerefI and cast back */
                printf("char value: %c \n", (char) DerefI(temp));
                break;
            case DOUBLE:
                /* DerefD returns a long double; cast it for %f */
                printf("double value: %f \n", (double) DerefD(temp));
                break;
            case FLOAT:
                /* floats are also read with DerefD */
                printf("float value: %f \n", (double) DerefD(temp));
                break;
            /*
             * Or we can use Image() to get the string image
             * of the value for a variable of type INT, CHAR,
             * DOUBLE, FLOAT, etc.
             *
             * printf("var value: %s \n", Image(temp));
             */
            ......
            }
            ......
        }
    }
}
......
An example of following a linked list. In C, a statement
like: for ( p = x; p; p = p->next ) is typical. An EM writer
can write similar statements for the same purpose. The difference is
that the EM writer usually does not know the structure of the variable "x",
and so must do a little more work
to find the "next" field of this structure. The code sample would
be:
Desc d;
int i, index = -1;   /* -1 means no link field has been found */
/*
 * find out the "next" field in the structure,
 * assign that element's index to variable index
 */
d = Var("x");
for ( i = 0; i < Num(d); i++ ) {
    if ( IsSameType(d, Elem(d, i)) ) {
        index = i;
        break;
    }
}
/* follow the list only if a link field was found */
if ( index >= 0 ) {
    for ( d = Var("x"); !IsNull(d); d = Elem(d, index) ) {
        ......
    }
}
An example of querying information about a variable which is not
in the current scope.
char *s;
int i, j;
Desc d;
.....
for ( i = 0; i < Num(Locals); i++ ) {
    s = Name(Locals, i);
    /* look for the same name in each enclosing scope, innermost first */
    for ( j = 1, d = WhereIs(s, j); !IsNull(d);
          j++, d = WhereIs(s, j) ) {
        ......
    }
    ......
}
An example of traversing the TP's stack frames. The
filename and line number are printed for each frame.
char filename[1024];   /* Loc() writes the name here; the buffer size is an assumption */
int line_num;
......
/* get current filename and line number info */
Loc(filename, &line_num);
printf("filename: %s, line_number: %d \n", filename, line_num);
/* UpStack() returns 0 once the view is on main()'s frame */
while ( UpStack() ) {
    ......
    Loc(filename, &line_num);
    printf("filename: %s, line_number: %d \n", filename, line_num);
    ......
}
Bibliography
[Jeff96] Jeffery, C. L., Zhou, W., and Templer, K. S., "The Alamo Monitor
Framework", Technical Report 96-7, Division of Computer Science,
University of Texas at San Antonio, February, 1996.
[Templ96] Templer, K. S., "A Configurable C Instrumentation Tool",
Technical Report 96-6, Division of Computer Science,
University of Texas at San Antonio, February, 1996.
[Jeff94] Jeffery, C. L., "A Framework for Execution Monitoring in Icon",
Software: Practice and Experience, November, 1994.