List of Passes

Click here to go to Parafrase-2 home page.

Available Passes

Click the name of the pass to see the description. Refer to Parafrase-2 User's Manual for detail. However, this page is likely to be more up-to-date than the User's Manual or man pages.

Description of the Passes

fixup
fixup fills in some information that is difficult to do when parsing, such as parent pointers and previous pointers in linked lists. It insures the integrity of the parsetree data structure. This should always be the first pass, and will probably be later eliminated and hard coded into Parafrase-2. The following information is set up in the fixup pass:
callgraph
callgraph builds a simple callgraph. As yet this callgraph does not take into account procedural parameters.
flow
flow builds the flow graph of the program. If the debug flag has been set to a positive number via the -d option then the flow graph and the corresponding connectivity matrix and also information about the dominator tree of the flow graph will be printed.
depend
depend generates the In/Out sets for each statement. In/Out sets for statements can be quite complicated due to the characteristics of C as variable names can be arbitrarily qualified. Elements of In/Out sets are variables (pointers to the symbol table) and the list of conditions which qualifies this variable. Thus, for instance, the name x[2].b[4,5].c is kept as the variable x along with the list (2 b [4,5] c). Whenever the In/Out sets cannot be computed an universal In/Out set is assumed. This is taken to mean that the statement could read/write conceivably to any memory location. Error messages and output can be written to a file whose name is an argument to the pass.
builddep
builddep builds the dependence graph using information generated by the depend pass. This involves finding all dependences between pairs of statemnets and constructin the appropriate dependence arcs. During this process, dependence direction vectors are also constructed. When the In/Out sets are universal sets (any memory location could possibly be read/written) all possible dependence edges are not created but the dependence node for the statement is marked with this fact. Output can be written to a specified file and can also be displayed graphically using the draw_depend pass.
codegen
codegen generates code for the current language. Everything in the syntax tree is output in the appropriate language. Codegen accepts one optional argument, the file to which the code is written. Code is written to stdout if there is no argument. The -l option will cause line numbers to be printed out in FORTRAN code in columns 73-80.

For Fortran, "IMPLICIT NONE" is generated in the declaration part by default (the same as the -I1 option). You can avoid this by the -I0 option (the same as -I). The -I2 option gives you SPARC Fortran syntax "IMPLICIT UNDEFINED(A-Z)". Note that "IMPLICIT NONE" is not Fortran77 but Fortran8x. Alliant Fx/Fortran, f77(DEC OSF/1 2.0) and f77(ULTRIX 4.3) are confirmed to accept "IMPLICIT NONE". As of Oct. 25, 1994, this paragraph is not applicable to sites other than CSRD.

test
Do nothing.
param_alias
param_alias generates aliasing information due to formal parameters and stores it in the symbol table. It will soon take into account global variables as well.
debug_symtab
debug_symtab is a debugging pass that will print the contents of the symbol table in a somewhat readable form.
debug_codegen
debug_codegen generates the source code in a pseudo-C language.
libsum
libsum reads in reference information about functions for which the source code is not available. The reference information is in terms of mod-use information for parameters and global variables. The reference information is read in from a file specified using the -i flag for the pass.
unresolved
If there are any unresolved function calls after parsing, this pass will attempt to resolve them as standard library calls and notify the user about those that could not be resolved.
draw_flow
draw_depend
draw_call
These three passes displays the call graph, dependence graph and the flow graph graphically, and they are the core of the graphics interface of Parafrase-2. In order to invoke these passes, the user must first be running Parafrase-2 on the X-windows environment.
spa
spa performs Static Program Analysis on the program and generates timing and overhead estimates. An input file that contains values for certain program and machine characteristics is required.
structure_walker
structure_walker is a debugging pass that allows the user to interactively walk through the data structures. After a starting point is selected (currently limited to a symbol table entry), the various fields are displayed and any that are pointers may be followed to the corresponding structure.
dotodoall
dotodoall is a simple pass which uses the information generated by the builddep pass. It examines all the loops in the program and checks whether they can be parallelized; this is equivalent to checking that dependences exist only in the = direction. This pass employs the ``outer first'' strategy so that outer loops are parallelized first. This also means that an inner loop with < or > dependences could still be parallelized if these directions can be satisfied by the known sequential nature of an outer loop. If a loop cannot be parallelized, all the dependeces obstructing the parallelization are pointed out. Output is written to a specified output file.
symdep
Hopefully, symdep will analyze symbolic dependency sometime in the future. This capability is currently not available.
donest
donest generates the nesting level of each statement. For each statement the field T_NEST(stmt) is modified to point to a list of Do loops in which the statement is nested. This information is used later in the depend pass to identify loop induction variables when they occur in array subscripts and in the builddep pass when actually building the dependence graph. Error messages and output can be written to a file whose name is an argument to the pass.
sumfcn
sumfcn analyzes and summarizes procedure reference information.
expand
expand performs in-line subroutine expansion. This pass currently performs in-line expansion on all possible functions. This behavior should be modified in the future.
vectorize
vectorize examines all the loops and transforms to vector statements if they can be vectorized. Currently this capability is not fully working.
mem_usage
mem_usage displays dynamic memory usage of Parafrase-2.
symstat
symstat displays symbolic reference counts.
constant
constant propagates the constants throughout the program. It does both constant folding and constant propagation. The propagation is done symbolically, i.e. cases in which values of variables in an expression are not constants but they result in a constant, will be handled accurately. Running this pass together with the induction pass results in a very precise constant propagation scheme.
blockloops
Do nothing.
whatsinit
Do nothing.
checkdecls
Do nothing.
deptest
Do nothing.
loop_interchange
loop_interchange performs loop interchange on the program. Each loop nest pair is analyzed to determine whether the two loops can be interchanged. If they can be safely swapped, the user is queried to see if interchange of the two loops should be carried out.
scalar
scalar performs scalar expansion.
distribute
distribute performs loop distribution on the program. The user may specify: innermost loop only (-I), outermost loop only (-O), or inside-out (default). The manual option (-M) will interactively ask the user how each loop nest should be handled.
induction
This pass basically does the symbolic analysis of Parafrase-2. It recognizes all the induction expressions in the program that can be expressed in terms of loop index variables and loop invariants. Basic induction variables will also be introduced for the loops detected from semantics of the program, which do not have syntactic loop constructs. This includes loops formed by combinations of statements such as IF and GOTO. All the induction expressions in these loops will be defined in terms of the introduced induction variable.

The user can control the effort of Parafrase-2 in finding induction expressions. This is done via a switch, "-v", that specifies the highest degree of interpolating polynomials for induction expressions. For example, "-v2" causes Parafrase-2 to find all induction expressions up to degree two. This includes loop invariant expressions, linear induction expressions, and quadratic induction expressions. The default is "-v1", i.e. only loop invariants and linear induction expressions are recognized by default. In some cases, however, Parafrase-2 can detect induction expressions of degrees higher than that specified by the "-v" switch, although its effort is restricted by the switch "-v".

This pass normalizes the loops bounds when this is more appropriate. It also forward substitutes all the values, when this is legal. Symbolic constant propagation is also done in this pass. Also, all array subscript expressions will be normalized by this pass. A dead code elimination is also performed by this pass to eliminate unused scalar computations. This elimination is necessary for elimination of the intermediate computations of induction variables which otherwise prevents loop parallelization.

hier
hier builds the hierarchical graph using information generated by the flow/induction/depend pass. This involves developing a hierarchy to represent the program. The hierarchy is based on loops which are identified a part of the induction pass. When the acyclic control flow graph at each level is recognized its post dominator tree, the control dependence graph and the data dependence graphs are also built. Redundant dependence are recognized by a heuristics in this pass. The hier pass recognizes the following command line options in the pass file:
-o
call original HIER pass. This is incompatible with any of the flags below. The orignial pass prints a lot of output used for Dr. Girkar's thesis.
-k
compute KILL sets, to get more accurate dependences. This allows more accurate computation of IN sets for compound nodes, because variables defined before used are not included in the compound node's IN set.
-l
find all variables local to each node, i.e. those that are always defined before used inside the node, and that don't have outgoing flow dependences. Delete dependences in the HTG.
-s
delete false loop-carried dependences due to scalars that are really local to the loop. This allows more DOALL loops to be found, although it does not actually change the code. This implies the -l option. It works on the dependence graph (in addition to the HTG).
-v
generate local variable declarations for DO loops. Implies -l option. The local variable declarations will appear in the output code if dotodoall changes the loop into a Cedar Fortran parallel loop.
These options allow scalar privatization to be performed to increase the number of parallel loops found by dotodoall. The following passes and options should be used. Note that builddep needs to be applied before hier, so hier can delete the offending arcs in the dependence graph:
        builddep
        hier -klsv
        dotodoall
reduction
Hopefully, reduction performs symbolic strength reduction sometime in the future. This capability is currently not available.
bbinst
bbinst instruments all basic blocks and loops in the program to obtain a profile or trace of the program's dynamic behavior. When used in HTG construction, no instrumentation code is actually generated, but identification numbers are assigned to basic blocks so that a correspondence can be maintained between the histogrammed basic blocks and the generated HTGL code.
gen3ad
gen3ad generates " 3-address code " from ordinary Fortran (or C) input. It simply splits compex expression statements into sequences of simple assignment statements which correspond roughly to single RISC assembly language statements. This pass also optionally performs some " instruction-level " optimizations.
deadcode
deadcode is a simple-minded dead code elimination pass. It eliminates assignments to variables that are assigned but never used. It does not eliminate assignments to subroutine parameters or common block variables. The current implementation uses the dependence graph, and looks for statements with no outgoing flow dependences. This has probably been superseded by the dead code elimination performed in the latest induction pass.
hcodegen
hcodegen generates HTGL format output from the internal HTG data structures generated by the hier pass. HTGL is a program modelling language used to describe the parallel structure of programs. Command line options allow the HTG to be massaged before printing it out, and to allow instruction type information to be included in the HTGL output, among other things. They also specify how much effort hcodegen should go to in trying to make the output syntactically and semantically correct (in general, some manual intervention on the output will be required to produce a valid HTGL program, for complex codes).
f2c
This pass is not avialable in most versions of Parafrase-2. It converts the internal parse tree from Fortran to something compatible with C. It tries to preserve the memory semantics of the original code, such as by reversing the dimensions of arrays and subtracting one from array index expressions. The purpose of this pass is to generate autoscheduling code, i.e. code that explicitly manages its own parallelism. This turns out to be more convenient to do in C, because structures and pointers can be used to create and manage a general cactus stack required in general parallel execution.
hier_opt
This pass optimizes an HTG constructed in the HIER pass, using the algorithms developed in Carl Beckmann's Ph.D. thesis. Currently it performs redundant data dependence edge removal and string finding in acyclic task graphs (the bodies of LOOP or COMPOUND nodes in the HTG). Eventually, it will also perform task merging.
show_hier [-t -v -s]
show_hier visualizes the hierarchical structure of the HTG for the input program. By default, it is interactive and uses "gs"(ghostscript) to draw a diagram (equivalent to "-s" option). The PostScript(tm) viewer can be switched to "ghostview" by "-v" option. Generated PostScript code is idraw compatible ("idraw" is a drawing tool for UNIX machines). When this pass is invoked with "-t" option, a text style representation of the HTG is printed out with no interaction.
Click here to see an example.
Click here for usage.

Last Updated: January 20, 1994
Questions? Send E-mail to Hideki Saito (saito@csrd.uiuc.edu)