Aceobj.pm
Aceobj.pm is a perl module for manipulating acedb objects as output by
the aceserver or tace with the 'show -perl' command. The object format
used in Aceobj preserves the tree structure of acedb objects, but most
of the implementation details are hidden behind the access methods
provided. Used in conjunction with Aceclient.pm, it lets a perl
program query acedb and manipulate the objects returned.
Creating objects:
Currently, the internal representation of an object is a collection of
associative arrays (hash tables), containing pointers which give the
object its tree structure. Since perl 5 has literal representations
for each of the constructs used in the object, acedb has been modified
to print out a string which is legal perl (with slight modifications).
Note that any constructed types have already been expanded in place.
All that needs to be done to create the object in perl is to eval this
string, which is done by the access method 'new'. Called without
arguments, 'new' returns an empty object; given an argument (usually
text output from acedb), it creates and initializes the object. For
example:
    $foo = new Aceobj;          # creates an empty object
    ($status, $reply) = askServer($handle, 'find locus "WX"');
    ($status, $reply) = askServer($handle, 'show -perl');
    $wx = new Aceobj $reply;    # $wx contains the acedb object locus WX
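As a rough illustration of the eval step described above, the following
sketch builds a small tree from a string of legal perl. The string is
hand-written for this example and is not real 'show -perl' output. The
package name SketchObj and the 'rt' key for right-hand children are
assumptions made here; ty, cl and va are the keys named under "Search
methods and object representation".

```perl
use strict;
use warnings;

package SketchObj;   # hypothetical stand-in for Aceobj

# Build an object from a string of perl source; with no argument,
# return an empty object.
sub new {
    my ($class, $text) = @_;
    my $self = {};
    if (defined $text) {
        $self = eval $text;                      # parse the literal
        die "could not eval object text: $@" if $@;
        die "object text did not yield a hash reference"
            unless ref $self eq 'HASH';
    }
    return bless $self, $class;
}

package main;

# Hand-written example text (NOT actual acedb output).  The leading '+'
# makes perl read the braces as an anonymous hash, not as a block.
my $text = q(+{ ty => 'Object', cl => 'Locus', va => 'WX',
                rt => [ { ty => 'Tag', va => 'Map' } ] });

my $empty = SketchObj->new;
my $wx    = SketchObj->new($text);
print "$wx->{cl} $wx->{va}\n";    # prints "Locus WX"
```

Checking $@ after the eval matters in practice: a truncated or
malformed reply from the server would otherwise yield an undefined
object without any error message.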
Access methods:
Once the object is created, other methods are available for extracting
or changing information in it, for traversing the tree, and for adding
or removing nodes or branches from the tree. The perl syntax for
calling these methods is flexible; for instance, if $obj is an
Aceobj, then the following are equivalent:
    &Aceobj::setValue($obj,'foo');
    setValue $obj 'foo';
    $obj->setValue('foo');
To get information about a particular node, the available methods are:
type (Tag, Int, Float, Object), value, title (for objects which use
-T), class (only for object references), isEmpty (for references to
empty objects), right (for a list of nodes to the right of the
current), left, and root (for a pointer to the root node of the tree).
For example:
    print $node->type;
will output "Object" if the node is an object reference, or "Tag" if
$node is pointing to a tag in the tree. Each of these basic methods
hides the underlying object representation (an associative array with
cryptic keys) and is preferred to accessing the object directly
(except for node matching; see below). The underlying representation
may change (particularly because hash tables are memory intensive),
but the access methods will remain the same. Correspondingly, there
are methods to set attributes of an object: setType, setValue,
setTitle, setClass, setEmpty.
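To illustrate how access methods insulate callers from the
representation, here is a minimal sketch of getter and setter pairs
over a hash-based node. The package name SketchNode is hypothetical,
and the method bodies are an assumption about how such methods might
look, not the module's actual code; only the method names and the ty
and va keys come from this document.

```perl
use strict;
use warnings;

package SketchNode;   # hypothetical stand-in, not the real Aceobj

sub new      { my ($class) = @_; return bless {}, $class }

# Getters: callers never see the 'ty' and 'va' keys.
sub type     { return $_[0]{ty} }
sub value    { return $_[0]{va} }

# Setters return the node so that calls can be chained (a choice made
# for this sketch only).
sub setType  { my ($self, $t) = @_; $self->{ty} = $t; return $self }
sub setValue { my ($self, $v) = @_; $self->{va} = $v; return $self }

package main;

my $node = SketchNode->new->setType('Tag')->setValue('Map');
print $node->type, " ", $node->value, "\n";   # prints "Tag Map"
```

If the hash were later replaced by a more compact structure, only the
bodies of these five subroutines would need to change.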
Manipulating trees:
Trees and nodes can be copied with the methods copyTree and copyNode.
If an argument is supplied, the tree or node is copied to it.
Otherwise, a new object is created. Thus, the following are
equivalent:
    $obj2 = new Aceobj;
    copyTree $obj1 $obj2;       # copy $obj1 to $obj2
    $obj2 = $obj1->copyTree;
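The optional-target behaviour described above can be sketched with a
recursive deep copy. The function name copy_tree and the 'rt' key
holding child nodes are assumptions made for this example; the real
copyTree may be organised quite differently.

```perl
use strict;
use warnings;

# Hypothetical deep copy: nodes are hash references, and children are
# assumed to live in an array reference under the made-up key 'rt'.
sub copy_tree {
    my ($src, $dest) = @_;
    $dest = {} unless defined $dest;      # no target given: make one
    %$dest = ();                          # discard any old contents
    for my $key (keys %$src) {
        next if $key eq 'rt';
        $dest->{$key} = $src->{$key};     # scalar attributes
    }
    # Recurse so that the copy shares no nodes with the original.
    $dest->{rt} = [ map { copy_tree($_) } @{ $src->{rt} } ]
        if $src->{rt};
    return $dest;
}

my $orig = { va => 'WX', rt => [ { va => 'Map' } ] };

my $copy1 = copy_tree($orig);             # new object created
my $copy2 = {};
copy_tree($orig, $copy2);                 # copied into $copy2

$copy1->{rt}[0]{va} = 'changed';
print $orig->{rt}[0]{va}, "\n";           # prints "Map"
```

The final two lines show why a deep copy is needed: changing a node in
the copy leaves the original tree untouched.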
Other useful operations for tree manipulation are grafting (using
the methods graft or graftAt) and pruning (via prune). For instance,
one could graft a copy of an object into another:
    $obj1->graft($obj2->copyTree);
Search methods and object representation:
Some methods require knowledge of the internal representation of an
object; these methods allow the programmer to specify matching
criteria for finding nodes (or paths) within an object. For instance,
the method findNodes finds all nodes in the tree matching a given
criterion. This criterion can either be a reference to a function or
a reference to a node description. For example, an anonymous function
could be used to search a tree for any references to locus Adh-1 as
follows:
    @list = $obj->findNodes(sub {
        my ($node) = @_;
        return ($node->class eq 'Locus'
                && $node->value eq 'Adh-1');
    });
Using a reference to a function as the matching criterion gives the
programmer unlimited flexibility for how to check a node: the routine
could involve regular expression matching, range checking, database
lookups, etc. However, for exact matching of specific criteria,
it is simpler to use the internal representation of an object,
which is a reference (pointer) to an associative array with certain
keys: ty (type), cl (class), ti (title), mt (empty), and va (value).
So the above query could be performed as:
    @list = $obj->findNodes( {'cl'=>'Locus', 'va'=>'Adh-1'} );
The corresponding values (here, Locus and Adh-1) can also be subroutine
references, which get passed the value as their sole argument; this
facilitates regular expression matching on one field, for example:
    @list = $obj->findNodes( {'cl'=>'Locus',
                              'va'=> sub { $_[0] =~ /^adh/i; }} );
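To make the two criterion styles concrete, the sketch below shows one
way a matcher could accept either a function reference or a node
description whose values are literals or subroutine references. The
names node_matches and find_nodes and the 'rt' key for child nodes are
assumptions made for this example; this is not the module's own code.

```perl
use strict;
use warnings;

# Does $node satisfy $crit?  $crit is either a code reference or a
# hash reference whose values are literals or code references.
sub node_matches {
    my ($node, $crit) = @_;
    return $crit->($node) ? 1 : 0 if ref $crit eq 'CODE';
    for my $key (keys %$crit) {
        my ($want, $have) = ($crit->{$key}, $node->{$key});
        if (ref $want eq 'CODE') {
            return 0 unless $want->($have);   # value is the sole argument
        }
        else {
            return 0 unless defined $have && $have eq $want;
        }
    }
    return 1;
}

# Depth-first walk; children are assumed to sit under the made-up key 'rt'.
sub find_nodes {
    my ($node, $crit) = @_;
    my @hits = node_matches($node, $crit) ? ($node) : ();
    push @hits, map { find_nodes($_, $crit) } @{ $node->{rt} || [] };
    return @hits;
}

my $tree = { cl => 'Locus', va => 'WX', rt => [
    { cl => 'Locus', va => 'Adh-1' },
    { cl => 'Locus', va => 'adh-2' },
] };

my @exact = find_nodes($tree, { cl => 'Locus', va => 'Adh-1' });
my @regex = find_nodes($tree,
    { cl => 'Locus', va => sub { $_[0] =~ /^adh/i } });
print scalar(@exact), " ", scalar(@regex), "\n";   # prints "1 2"
```

In this sketch a function criterion receives a plain hash reference;
with Aceobj it would receive a node on which methods such as class and
value can be called.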
The search methods include:
- findNodes : finds all nodes in a tree matching the given criterion
- findNodesRooted : only checks nodes immediately to the right of the
  given node
- findPaths : accepts a list of criteria, and returns a list of paths;
  each path is a pointer to a list of nodes
- findPathsRooted : finds paths beginning at the root node
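The path searches listed above could work along the following lines.
This sketch assumes that successive criteria must match successive
nodes moving rightwards through the tree, which is one plausible
reading of the descriptions rather than a documented fact. The names
paths_rooted and paths_anywhere and the 'rt' key for child nodes are
hypothetical.

```perl
use strict;
use warnings;

# Paths starting exactly at $node: criterion i must match the node at
# depth i.  Criteria are code references only, to keep the sketch
# short; children are assumed to sit under the made-up key 'rt'.
sub paths_rooted {
    my ($node, @crit) = @_;
    return () unless @crit;
    my $first = shift @crit;
    return () unless $first->($node);
    return ([$node]) unless @crit;            # last criterion matched
    return map { [ $node, @$_ ] }
           map { paths_rooted($_, @crit) } @{ $node->{rt} || [] };
}

# Paths starting at any node in the tree.
sub paths_anywhere {
    my ($node, @crit) = @_;
    return (paths_rooted($node, @crit),
            map { paths_anywhere($_, @crit) } @{ $node->{rt} || [] });
}

my $tree = { va => 'WX', rt => [
    { va => 'Map', rt => [ { va => 'ChrI' } ] },
    { va => 'Clone' },
] };

my @paths = paths_anywhere($tree,
    sub { $_[0]{va} eq 'Map' }, sub { 1 });
print join(' -> ', map { $_->{va} } @{ $paths[0] }), "\n";
# prints "Map -> ChrI"
```

Each element of @paths is a reference to a list of nodes, matching the
"pointer to a list of nodes" wording used for findPaths.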
These search methods form the basis of html markup rules used in
AceWWW.pm. As an example of a complex query possible through these
methods, consider the following rule from the Mendel database, which
selects ?GeneFamily nodes for markup pointing to AAtDB only if another
part of the tree refers to Arabidopsis:
    $obj->findNodes(sub {
        my ($node) = @_;
        return $node->class eq 'GeneFamily' &&
               $node->root->findNodes({'va'=>'Arabidopsis thaliana'});
    });
Note that this rule nests two queries, each of a different style,
the outer using a function reference, and the inner using node
key-value matching.
Output:
Currently, there are two output methods in Aceobj.pm: aceDump and
prettyPrint. Each routine returns a list of strings, one for each
line of the output. Typically, they would be invoked as follows:
    print join("\n", $obj->aceDump);
The prettyPrint routine is invoked similarly, and returns a
human-readable form of an ace object. The module AceWWW.pm contains
functions for formatting an object with html.
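The list-of-lines convention can be illustrated with a small dump
routine. The function name dump_lines, the indented layout and the
'rt' key for child nodes are assumptions made for this example; the
actual formats produced by aceDump and prettyPrint are not shown here.

```perl
use strict;
use warnings;

# Hypothetical dump routine: one string per node, indented by depth.
# Children are assumed to sit under the made-up key 'rt'.
sub dump_lines {
    my ($node, $depth) = @_;
    $depth ||= 0;
    return (('  ' x $depth) . $node->{va},
            map { dump_lines($_, $depth + 1) } @{ $node->{rt} || [] });
}

my $tree = { va => 'WX', rt => [ { va => 'Map' }, { va => 'Clone' } ] };
print join("\n", dump_lines($tree)), "\n";
```

Returning lines rather than printing them leaves the caller free to
choose the separator and the destination. Note that join("\n", ...)
alone emits no newline after the last line, so the sketch adds one
explicitly.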