Tace

John Morris
AAtDB Project
Department of Molecular Biology
Massachusetts General Hospital
Boston, MA USA

morrisjohn@frodo.mgh.harvard.edu

August, 1994

Introduction

The tace program provides text-only access to the ACEDB database. It employs the ACEDB query language to select information of interest to the user. As with the graphical version, the appropriate environment variables must be set before running the program. In particular, the variable ACEDB must be set to the parent of the wspec directory. If your current working directory is the parent of the wspec directory, you may issue the following commands to start tace (note the use of back quotes):
setenv ACEDB `pwd`
bin/tace
Alternatively, set ACEDB to the full path name of the parent directory of wspec and start the database, with something similar to:
setenv ACEDB /usr/home/acedb
bin/tace
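The same setup can be scripted. The Python sketch below is an illustration, not part of the ACEDB distribution. The helper names tace_environment and start_tace are hypothetical, and the code assumes only what the text states: ACEDB must name the parent of the wspec directory, and the tace binary is found at bin/tace under that directory, as in the commands above.

```python
import os
import subprocess
from pathlib import Path


def tace_environment(db_root):
    """Return a copy of the environment with ACEDB set to db_root.

    Raises FileNotFoundError if db_root does not contain a wspec
    directory, which is the condition the text requires.
    """
    root = Path(db_root).resolve()
    if not (root / "wspec").is_dir():
        raise FileNotFoundError(f"{root} is not the parent of a wspec directory")
    env = dict(os.environ)
    env["ACEDB"] = str(root)
    return env


def start_tace(db_root, tace_path="bin/tace"):
    """Start an interactive tace session and return its exit status.

    tace_path is taken relative to db_root (an assumption that mirrors
    the layout used in the commands above); adjust it for your site.
    """
    env = tace_environment(db_root)
    executable = str(Path(db_root).resolve() / tace_path)
    return subprocess.call([executable], cwd=str(db_root), env=env)
```

Calling start_tace("/usr/home/acedb") would correspond to the second pair of commands above. Checking for wspec first turns a mis-set ACEDB into an immediate error instead of a failed start.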
After the opening lines of the database description scroll by, you will be presented with the tace prompt:
acedb >
Typing ? followed by return provides a list of command options, as follows.
  Quit : Exits the program. Any command abbreviates, $starts a
         subshell, until empty line.

  Help : for further help.

  Classes : Give a list of all the visible class names and how many
            objects they contain.

  Find class [name]: New current list with all objects from class,
                     optionally matching name.

  Model class: Shows the model of the given class, useful before
               Follow or Where commands.

  Grep template: searches for *template* everywhere in the database.

  Undo : returns the current list to its previous state.

  List [template]: lists names of items in current list [matching
                   template].

  Show [name] : shows objects of current list matching name or all.

  Is template : keeps in list only those objects whose name match the
                template.

  Remove template : removes from list those objects whose name match
                    the template.

  Follow Tag : i.e. Tag Author gives the authors of the papers of the
               current list.

  Where query_string : performs a complex query - 'help query_syntax'
                       for further info.

  Write filename : acedump current list to file.

  Biblio : shows the associated bibliography.

  Dna : Fasta dump of related sequences.

  Array class:name format : formatted acedump of A class object.

  Parse file : parses an aceFile, or stdio,  until 

  Table-Maker file outputfile : Executes a table-maker.def command
                                file, outputs

  Read-models

Using tace commands

The tace commands are the means for accessing the database information. A command may be abbreviated to a single character if that character unambiguously designates it. For example, C or c suffices for the command Classes. Other commands require two characters: Wh (or wh) is required to unambiguously differentiate Where from the Write command.
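The abbreviation rule can be modelled in a few lines. The Python sketch below is a model of the rule as described in this section, not tace's own matching code. The command names are copied from the ? listing above, and the function name resolve is hypothetical.

```python
COMMANDS = [
    "Quit", "Help", "Classes", "Find", "Model", "Grep", "Undo", "List",
    "Show", "Is", "Remove", "Follow", "Where", "Write", "Biblio", "Dna",
    "Array", "Parse", "Table-Maker", "Read-models",
]


def resolve(abbreviation, commands=COMMANDS):
    """Return the single command that starts with abbreviation.

    Matching ignores case. Raises ValueError when no command, or more
    than one command, starts with the abbreviation.
    """
    prefix = abbreviation.lower()
    matches = [name for name in commands if name.lower().startswith(prefix)]
    if len(matches) != 1:
        raise ValueError(f"{abbreviation!r} matches {matches or 'nothing'}")
    return matches[0]
```

Under this model resolve("c") gives Classes and resolve("wh") gives Where, while resolve("w") is rejected because both Where and Write begin with w.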

acedb > Classes

The Classes command lists all of the classes in the database. This includes classes which would be visible or hidden on the main menu, and those which are empty. This is a good command to use to get an overview of the database. For AAtDB 3.3 the following list is produced.
acedb > C
These are the known classes and the number of objects in each class 
                             KeySet 1 
                              Model 40 
                             Action 0 
                              Image 235 
                             Author 4867 
                         Laboratory 0 
                            Journal 377 
                              Paper 3917 
                            Library 0 
                            Mutagen 51 
                             Method 8 
                           Sequence 16008 
                        Restriction 0 
                                Map 31 
                         Gene_Class 244 
                            Species 0 
                             Allele 1186 
                             Strain 0 
                           Interval 0 
                       2_point_data 3741 
                      Multi_pt_data 0 
                        Df_Dup_data 0 
                              Clone 19136 
                         Clone_Grid 6 
                               Pool 0 
                                ABI 0 
                              Locus 1488 
                           MultiMap 5 
                                YAC 0 
                           Fragment 0 
                         Chrom_Band 0 
                              Motif 343 
                             Enzyme 0 
                             GDB_id 0 
                               OMIM 0 
                              Probe 527 
                          Reference 0 
                           Homology 0 
                            Peptide 0 
                             Source 59 
                            Contact 1315 
                              Cross 0 
                        Segregation 256 
                       Gene_Product 687 
                     Map_population 26 
                          Qualifier 0 
                               Gene 0 
                       DNA_Resource 2869 
                 Germplasm_Resource 7299 
                       Lab_Location 0 
                              Match 0 
                      Interval_ends 0 
                       Multi_counts 0 
                           Balancer 0 
                       map_location 0 
                          map_error 0 

acedb > Quit

The Quit command exits the program. One may temporarily leave tace and enter a subshell using the $ character. See "Special Character Commands" below.

acedb > Help

The Help command displays the summary help information below.
acedb > help 
Commands are not case sensitive and can be abbreviated.  Lines
starting with @ will take the next word as a file name to read
commands from (include file), passing subsequent words on the line as
parameters.  Lines starting with $ execute the remainder of the line
in an interactive subshell.

Everything following // on a line is ignored (comment).
To escape any of these special characters use a backslash, eg \\@.

This program maintains an internal list of current objects. 
 You can List or Show or Write the content of the list.
You can change the list with the simple commands: 
       Grep, Find, Is, Follow, Biblio. 

Perform complex queries with:
       Where search-string', where 'search-string' follows the acedb
       query syntax.  

Print relational tables with Table table-command-file

To see a list of all possible commands type ?.

For further help on any of the following features type 'help feature'
   Tacedb : copyright etc
   Query_syntax for the find command
   Useful_sequences : SL1 etc
   Clone_types used in the physical map (vector etc)
   DNA_and_amino_acids_nomenclature : codes for bases/amino acids and
   translation table

acedb > Find

The Find command creates a new current list. It requires a Class name and may optionally contain an object name. For example, "Find Author" and "Find Author Morr*" are both valid commands. If the Find command is successful, the number of objects found is displayed. If no objects are found, the "acedb >" prompt is displayed and the previous current list of objects is retained. When designating object names, it is easy to miss the desired object because of associated punctuation or text in its name. Including wild-card characters generally solves the problem, as the first two examples below show.
acedb > find author morris
acedb > 

acedb > find author morris*

Found 6 objects in this class
acedb > list

KeySet : Answer_3
Author : "Morris, A."
Author : "Morris, P. C."
Author : "Morris, P. F."
Author : "Morris, Paul Francis"
Author : "Morris, R. O."
Author : "Morrison, A."

acedb > find contact *george*

Found 8 objects in this class
acedb > list

KeySet : Answer_1
Contact : "Coupland, George"
Contact : "Haughn, George"
Contact : "Jen, George"
Contact : "Karlin-Neumann, George"
Contact : "Mourad, George"
Contact : "Murphy, George J.P."
Contact : "Picard, George"
Contact : "Redei, George P."
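The effect of the wild-card in the examples above can be reproduced with ordinary pattern matching. The Python sketch below is an approximation for illustration only: it uses the standard fnmatch module, which also understands ? and [...] patterns, whereas only the * behaviour shown in this document is relied on here. The function name find_names is hypothetical, and case-insensitive matching is inferred from the "find author morris*" example.

```python
from fnmatch import fnmatchcase


def find_names(template, names):
    """Return the names that match template as a whole, ignoring case."""
    pattern = template.lower()
    return [name for name in names if fnmatchcase(name.lower(), pattern)]


# Object names taken from the listings in this document.
AUTHORS = [
    "Harris, J. I.",
    "Morris, A.",
    "Morris, P. C.",
    "Morris, P. F.",
    "Morris, Paul Francis",
    "Morris, R. O.",
    "Morrison, A.",
]
```

With these names, find_names("morris", AUTHORS) returns an empty list because no object is named exactly "morris", while find_names("morris*", AUTHORS) returns the six Morris and Morrison entries.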

acedb > Model

The Model command displays the model for the given class. Knowing the "tags" used in the model can help you refine searches. For instance, if you were looking for email addresses, you could consult the models and find that AAtDB keeps the email information in the Contact class and not in the Author class, as ACEDB does.
acedb > model author
?Author
        Paper ?Paper XREF Author

acedb > model contact
?Contact
                Profession Text
                Address Mail Text
                        Institution     Text
                        Address1 Text
                        Address2 Text
                        Address3 Text
                        Address4 Text
                        Address5 Text
                        City    Text
                        State   Text
                        Region  Text
                        PostalCode Text
                        Country Text
                        Phone Text
                        E_mail Text
                               Bitnet Text
                               Internet Text
                        Fax Text
                        Telex Text
                Research_interest ?Text
                Keyword ?Keyword
                Publishes_as ?Author XREF Full_name
                Associate ?Contact XREF Associate
                Member_of_lab ?Contact XREF Member_of_lab
                Last_update Text
                Obtained_from ?Source
                Hint ?LongText

acedb > Grep

The Grep command searches for text strings in the object names and text entries of the database (LongText objects and array objects such as DNA sequences are not searched). The wild-card character "*" is added to both ends of the string, allowing matches within longer strings such as titles of articles.
acedb > grep pyrophosphorylase
I search for texts or object names matching *pyrophosphorylase*

Found 35 objects
acedb > list

KeySet : Answer_2
Phenotype : "General-3"
Phenotype : "Greenbook-10"
Phenotype : "New-130"
Phenotype : "New-131"
Phenotype : "New-132"
Phenotype : "New-133"
Phenotype : "New-134"
Phenotype : "New-367"
Sequence : "ATRNAAPL1"
Sequence : "ATRNAAPL2"
Sequence : "ATRNAAPL3"
Sequence : "ATRNAAPS"
Sequence : "GenPept:VFAGPC_1"
Sequence : "GenPept:VFAGPP_1"
Sequence : "PIR:S12038"
Sequence : "PIR:S13380"
Sequence : "SwissProt:GALU_ECOLI"
Sequence : "SwissProt:GLG1_ORYSA"
Sequence : "SwissProt:GLG1_SOLTU"
Sequence : "SwissProt:GLG1_WHEAT"
Sequence : "SwissProt:GLG2_HORVU"
Sequence : "SwissProt:GLG2_SOLTU"
Sequence : "SwissProt:GLG2_WHEAT"
Sequence : "SwissProt:GLGS_WHEAT"
Sequence : "SwissProt:RFBM_SALTY"
Paper : "komed-1988-aadbz"
Paper : "li----1992-aahfx"
Paper : "lin---1987-aabri"
Paper : "lin---1988-aaavt"
Paper : "lin---1988-aadau"
Paper : "lin---1988-aadcr"
Paper : "neuha-1990-aadlh"
Paper : "preis-1987-aabqj"
Paper : "stark-1992-aahet"
Paper : "villa-1993-aahqk"
acedb > 

acedb > Undo

The Undo command serves to recover a previous list of objects. Only one undo command is effective. A second consecutive undo has no effect.
acedb > grep pyro
I search for texts or object names matching *pyro*

Found 111 objects
acedb > is toto
 0 objects kept 
acedb > undo

Recovered 111 objects
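The single level of undo can be pictured as one saved copy of the list. The Python sketch below is a toy model of the behaviour described in this section, not the tace implementation. The class CurrentList and its method names are hypothetical.

```python
from fnmatch import fnmatchcase


class CurrentList:
    """Toy model of the current list with one level of undo."""

    def __init__(self, names):
        self.names = list(names)
        self._previous = None  # the list as it was before the last change

    def keep(self, template):
        """Model of Is: keep only names matching template; return the count."""
        pattern = template.lower()
        self._previous = self.names
        self.names = [n for n in self.names if fnmatchcase(n.lower(), pattern)]
        return len(self.names)

    def undo(self):
        """Restore the saved list once; return False if nothing is saved."""
        if self._previous is None:
            return False
        self.names = self._previous
        self._previous = None
        return True
```

After keep("toto") empties the list, one undo() restores it. A second consecutive undo() finds nothing saved and leaves the list unchanged.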

acedb > List

List displays the object names in the current list. If a name template is supplied, only those objects matching the template are listed.
acedb > find clone 112*

Found 91 objects in this class
acedb > list 1129*

KeySet : Answer_3
Clone : "1129%"
Clone : "11290"
Clone : "11291"
Clone : "11292"
Clone : "11293"
Clone : "11294%"
Clone : "11296"
Clone : "11297"
Clone : "11298"
Clone : "11299"

acedb > Show

The Show command displays all the text data included with the objects in the current list.
acedb > find clone 11290

Found 1 objects in this class
acedb > show
Clone : "11290"


11290
  FingerPrint     Gel_Number      262
                  Approximate_Match_to    11568
                  Bands   116980      21

acedb > Is

The command Is prunes the current list to include only those objects which match the given template.
acedb > find clone 112*

Found 91 objects in this class
acedb > is 1129*
 10 objects kept 
acedb > is *6*
 1 objects kept 
acedb > list

KeySet : Answer_6
Clone : "11296"

acedb > Remove

The Remove command removes from the current object list those objects which match the given template.
acedb > find locus em*

Found 175 objects in this class
acedb > remove emb*
 1 objects kept 

acedb > Follow

To be successful, the Follow command requires a tag that links to another class in the database. For each item in the current list which contains the given tag, tace follows the link to the new object and adds that object to the newly created current list. For example, from a list of Clones, the command "Follow locus" would produce a new current list of loci which had been cloned.
acedb > find clone

Found 19136 objects in this class
acedb > follow locus
Autocompleting locus to Locus
Found 119 objects
acedb > follow sequence
Autocompleting sequence to Sequence
Found 11 objects
acedb > list

KeySet : Answer_8
Sequence : "ATABI3"
Sequence : "ATHADH"
Sequence : "ATHCHS"
Sequence : "ATHCPGAPBA"
Sequence : "ATHGAPBB"
Sequence : "ATNR2R"
Sequence : "ATU22"
Sequence : "ATU23"
Sequence : "ATU24"
Sequence : "ATU25"
Sequence : "ATU29"

acedb > Where

The Where command performs a complex query on the current object list. For example, one could search for prolific authors using COUNT paper in the query field. One might also search for recent papers with a given keyword.
acedb > Find Author

Found 4862 objects
acedb >  Where COUNT paper > 50

Found 10 objects

acedb > find paper

Found 3916 objects in this class
acedb > where Year >= 1992 AND Keyword = stress*

Found 5 objects
acedb > list

KeySet : Answer_3
Paper : "coghl-1992-aagba"
Paper : "lang--1992-aagdi"
Paper : "marrs-1993-aahcx"
Paper : "oster-1993-aahay"
Paper : "takah-1992-aahkc"

acedb > Write

The Write command requires a filename, and writes a file in acedump format containing the text information of the objects in the current list. The file is written to the directory from which tace was started, unless a path to another directory is given with the file name.
acedb > write stress.papers
I wrote 5 objects to file stress.papers

acedb > write /files/home/stress.papers
I wrote 5 objects to file /files/home/stress.papers

acedb > Biblio

The Biblio command searches the objects in the current list for the Paper tag, and automatically displays the associated citation information. Since Papers themselves do not have a tag "Paper", objects of class Paper that are in the current list are not displayed.
acedb > grep dehydrogenase
I search for texts or object names matching *dehydrogenase*

Found 509 objects
acedb > biblio
   Associated bibliography 
knoop-1991-aadyt : 
    Trans-splicing integrates an exon of 22 nucleotides into the nad5
    mRNA in higher plant mitochondria
   Knoop, V., Brennicke, A., Schuster, W., Wissinger, B..
    EMBO (European Molecular Biology Organization) Journal 10,
    3483-3493 (1991) 
brand-1992-aahan :
    The nad4L gene is encoded between exon c of nad5 and orf25 in the
    Arabidopsis mitochondrial genome.
   Knoop, V., Brennicke, A., Brandt, P., Sunkel, S., Unseld, M..
    Molecular & General Genetics 236, 33-38 (1992)
chang-1986-aacxn : 
    Molecular cloning and DNA sequence of the Arabidopsis thaliana
    alcohol dehydrogenase gene
   Chang, C., Meyerowitz, E. M..
    Proceedings of the National Academy of Sciences of the United
States of America 83, 1408-1412 (1986)
 ....

acedb > Dna

The Dna command is meant to write a file of associated DNA sequences in fasta format. However, it is not yet implemented. To get around this limitation, create a current list of DNA objects, and then use the Write command. In the example below, $cat [filename] allows one to view the created file without leaving tace.
acedb > find sequence atd*

Found 2 objects in this class
acedb > dna
Option not written yet

acedb > follow DNA
Found 2 objects
acedb > write /tmp/atd.seq
I wrote 2 objects to file /tmp/atd.seq
acedb > $cat /tmp/atd.seq
DNA : "ATDNABFS1"
         gaattcttatattaacttttgttccttcagttttatacatatagtcatat
         gacattgaaaaaccagcacaaactccttcagttgatcaccaatagaaaca
         acaaaatacagtataatattt...
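Because the workaround produces an acedump file and not fasta, a small converter can finish the job outside tace. The Python sketch below assumes the dump layout shown in the example above: a DNA : "name" header line followed by indented sequence lines. It has not been checked against other ACEDB versions. The function name ace_dna_to_fasta and the line width of 60 are illustrative choices.

```python
import re

# Header line of a DNA object as it appears in the dump above.
HEADER = re.compile(r'^DNA\s*:\s*"(?P<name>[^"]+)"\s*$')


def ace_dna_to_fasta(ace_text, width=60):
    """Convert acedump text holding DNA objects into fasta text."""
    records = []
    name, chunks = None, []
    for line in ace_text.splitlines():
        header = HEADER.match(line)
        if header:
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = header.group("name"), []
        elif line.lstrip().startswith("//"):
            continue  # assumption: skip comment lines if the dump has any
        elif name is not None and line.strip():
            chunks.append(line.strip())
    if name is not None:
        records.append((name, "".join(chunks)))

    out = []
    for name, sequence in records:
        out.append(">" + name)
        # Break the sequence into fixed-width lines.
        out.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(out) + ("\n" if out else "")
```

Reading /tmp/atd.seq and passing its contents to ace_dna_to_fasta would give one fasta record per DNA object, with the object name as the description line.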

acedb > Array

Array or A objects do not have models (unlike B or "tree" objects) and are stored in special formats by the program. You can dump A objects if you know the format. Array objects are not usually accessed directly except by programmers. Without an argument, this command causes a segmentation fault.
acedb > Array DNA:ATHCOLUMB c





u



....   
This is not a bug, DNA is stored in compressed format.

acedb > Parse

Parse reads in formatted ace files or input from the keyboard until a Control-d character. Your user name must be entered in the wspec/passwd.wrm file in order to use this command. Otherwise tace will not allow you write access and will quit. Data read from the keyboard must be in the format identified by the models.wrm file. Following a parse of data entered from the keyboard, tace immediately saves the data and exits. Parsing an ace file does not cause tace to exit.
acedb > parse rawdata/test.ace
!! 1 objects read with 0 errors

acedb > parse 
Contact "Joe Thaliana"
Institution "Massachusetts General Hospital"
PostalCode "02114"
Last_update "1-Jan-1990"

Contact "James Thaliana"
Institution "Massachusetts Institute of Technology"
Last_update "1-Jan-1990"

[control-d pressed at this point]

!! 2 objects read with 0 errors
acedb > Please type ? for a list of commands. 
acedb > 
$>
 A bientot 

acedb > Table-maker

The Table-maker command executes a predefined table-maker.def command (which is generally created using xace). The subtitle line found in the xace version of Table-maker is not printed out, nor is the column formatting of the xace version used.
acedb > table wquery/long.seq.def
AT2SALBGB       4274    
AT3RRNA 4310    
ATAAI   4395    
ATACCSYNG       5613    
ATATPGP1        6300    
ATATSGS 9647    
ATAUXIN 4966    
....

acedb > Read-models

The Read-models command allows you to read the models.wrm file found in the wspec directory. This command is not executable unless your user name is included in the wspec/passwd.wrm file. Because reading an improperly formatted models file may abolish one's access to the data, a warning is issued. After issuing the Quit command tace will prompt you for yes or no as to whether it should save the database in its current state.
acedb > Read
Watch out ! An erroneous modification of the models may screw up the
system.  Do you want to proceed? (y or n) y
!! 0 objects read with 0 errors
New models read, save your work before quitting
acedb > 
Please type ? for a list of commands. 
acedb > save
!! Keyword save does not match
Please type ? for a list of commands. 
acedb > quit

$>You did not save your work, should I ? (y or n) y

 A bientot 

Special Character Commands

There are two special character commands that may be used: $unix_command and @file_name. (Recall also that double slashes "//" precede a comment. This is useful in documenting scripts; see the examples below.)

The $ starts a subshell and is useful for listing file names and contents, for example listing the contents of the wquery directory when you want to run the Table-maker command, or displaying a command file (see below). Although the "acedb >" prompt does not reappear unless the return key is pressed, the next command is expected to be a tace command unless it is again preceded by the $ character.

acedb > $ls wquery
2pointData.def             findgenes.cmd
DfDupData.def              g2pmap.def
Select_strains.qry         gMap-dominant-genes.def
allele2clone.def           gMap-dominant-alleles.def
clones2chromo.def          lage1.def
coauthors.cmd              long.seq.def
command.examples.qry       map_data.1.def
contig.qry                 multimap.def
df_dup_1.def               restrict.sites
df_dup_2.def               sam-bug.def
df_dup_4.def               sam2.def
email1.def                 table1.def
examples.qry               table2.def
examples2.qry              types.qry
The @ character command is used preceding a filename and indicates that tace should execute commands listed in the file. This is useful if you have a set of queries that you perform frequently. Additionally, it is possible to pass parameters to the file in much the same manner as C shell scripts, by appending the parameters after the file name. Note that in the second example below the string "author morris*" is passed as a single parameter because the two words are enclosed within quotes. By default tace appends .cmd to the file name, unless the "." character is part of the filename.
acedb > @script.tace

Found 4867 objects in this class
acedb > 
Found 14 objects
acedb > 
KeySet : Answer_1
Author : "Harris, J. I."
Author : "Harris, L. M."
Author : "Harris, T. J. R."
Author : "Heslop Harrison, J. S."
....

acedb > @script2.tace "author morris*" paper

Found 6 objects in this class
acedb > Autocompleting paper to Paper
Found 10 objects
acedb > 
KeySet : Answer_2
Paper : "altma-1991-aachi"
Paper : "dunn--1993-aahmw"
Paper : "halft-1992-aaemw"
Paper : "laten-1993-aahor"
Paper : "morri-1987-aadyh"
Paper : "morri-1988-aaawf"
Paper : "morri-1989-aaaux"
Paper : "morri-1991-aachz"
Paper : "rober-1976-aadze"
Paper : "rober-1976-aaeet"
Here are the script files, displayed by using the tace "$" character and the unix cat command. Each contains three lines of text.
acedb > $cat script.tace
find author
where *rris*
list


acedb > $cat script2.tace
find %1
follow %2
list
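The parameter passing and the default .cmd extension described above can be modelled as plain text substitution. The Python sketch below is a model for illustration, not tace's parser. The function names are hypothetical, shell-style quoting is approximated with the standard shlex module, and leaving an unmatched %n untouched is an assumption, since this document does not say what tace does when a parameter is missing.

```python
import re
import shlex


def script_filename(name):
    """Append .cmd unless the name already contains a '.' character."""
    return name if "." in name else name + ".cmd"


def parse_invocation(line):
    """Split an '@file param ...' line into (filename, parameters)."""
    words = shlex.split(line.strip().lstrip("@"))
    return script_filename(words[0]), words[1:]


def expand(script_text, parameters):
    """Replace %1, %2, ... in script_text with the given parameters."""
    def substitute(match):
        index = int(match.group(1)) - 1
        if 0 <= index < len(parameters):
            return parameters[index]
        return match.group(0)  # assumption: leave unmatched %n as written

    return re.sub(r"%(\d+)", substitute, script_text)
```

For the second example above, parse_invocation('@script2.tace "author morris*" paper') yields the file name script2.tace and the two parameters "author morris*" and "paper". Expanding the script text then gives the commands find author morris*, follow paper, and list.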

Here is a longer example using script files to update the "Contact" class in a (very small) database and automatically keep track of the files that are read in. Note that tace displays the message "Please type ? for a list of commands." when it encounters a blank line (which is how it interprets the comment lines in the script file). The contents of the files are shown at the bottom of the example.
acedb > find contact

Found 5 objects in this class
acedb > @update.script newdata.ace
Please type ? for a list of commands. 
acedb > Please type ? for a list of commands. 
acedb > !! 3 objects read with 0 errors
acedb > Please type ? for a list of commands. 

acedb > find contact

Found 8 objects in this class
acedb > list

KeySet : Answer_3
Contact : "ACEDB Newsgroup"
Contact : "Boguski, Mark"
Contact : "Buchanan, Barbara"
Contact : "Cherry, J. Michael"
Contact : "GenBank"
Contact : "Morris, John"
Contact : "Thierry-Mieg, Jean"
Contact : "Tolstoshev, Carolyn"
acedb > @update.script olddata.ace
Please type ? for a list of commands. 
acedb > Please type ? for a list of commands. 
acedb > !! 3 objects read with 0 errors
acedb > Please type ? for a list of commands. 
acedb > find contact

Found 5 objects in this class
acedb > list

KeySet : Answer_4
Contact : "Boguski, Mark"
Contact : "Buchanan, Barbara"
Contact : "Cherry, J. Michael"
Contact : "GenBank"
Contact : "Tolstoshev, Carolyn"
Here are the contents of the files, as displayed by tace.

The update script

Note that the name of the file is passed using "%1" not only to the tace Parse command, but also to the UNIX echo command.
acedb > $cat update.script
// Update script for tace
// Modified Thu Aug  4 08:44:09 EDT 1994 JWM
$echo `date` >> update.log
$echo -n "Parsing file %1 ... " >> update.log
Parse %1
$echo "Done" >> update.log

The update.log file

$cat update.log
Thu Aug 4 09:48:36 EDT 1994
Parsing file newdata.ace ... Done
Thu Aug 4 09:49:42 EDT 1994
Parsing file olddata.ace ... Done

The newdata.ace file

This file reads new data into the Contact class.
$cat newdata.ace
Contact : "Morris, John"
Internet john.morris@frodo.mgh.harvard.edu

Contact : "Thierry-Mieg, Jean"
Internet mieg@kaa.cnrs-mop.fr

Contact : "ACEDB Newsgroup"
Internet acedb@net.bio.net

The olddata.ace file

In this case, the data just read in is removed. (Warning: if there were other data associated with these Contact objects, they would be removed too!)
$cat olddata.ace
-D Contact : "Morris, John"

-D Contact : "Thierry-Mieg, Jean"

-D Contact : "ACEDB Newsgroup"
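A deletion file of this kind can be derived mechanically from the file that added the objects. The Python sketch below assumes the ace layout shown in newdata.ace: objects separated by blank lines, each starting with a Class : "name" header. The function name deletion_ace is hypothetical. The warning above still applies: each -D line removes the whole object, not only the data that was added.

```python
import re

# First line of an object paragraph, for example: Contact : "Morris, John"
HEADER = re.compile(r'^(?P<cls>\w+)\s*:\s*"(?P<name>[^"]*)"\s*$')


def deletion_ace(ace_text):
    """Return ace text with one -D line per object header in ace_text."""
    lines = []
    at_paragraph_start = True
    for line in ace_text.splitlines():
        if not line.strip():
            at_paragraph_start = True  # a blank line ends the current object
            continue
        if at_paragraph_start:
            header = HEADER.match(line.strip())
            if header:
                lines.append(
                    '-D {} : "{}"'.format(header.group("cls"), header.group("name"))
                )
        at_paragraph_start = False
    return "\n\n".join(lines) + ("\n" if lines else "")
```

Applied to the text of newdata.ace, this produces the three -D lines shown in olddata.ace.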

An example for dumping the complete database.

First you need a file of all the Classes in the database. You may type this by hand or capture a list of the classes using the Unix script command to capture a tace session which displays the names as shown below.
/usr/home/acedb% script
Script started, file is typescript
~% cd /usr/home/acedb
/usr/home/acedb% bin/tace
**** acedb queryserver: Version 3.0 20 November 1993 ****
Authors: Richard Durbin (MRC, UK) rd@mrc-lmb.cam.ac.uk
         Jean Thierry-Mieg (CNRS, France) mieg@kaa.cnrs-mop.fr
etc....

acedb > Classes
These are the known classes and the number of objects in each class 
                             KeySet -1 
                              Model 17 
                           LongText 1 
                             Action 0 
                         Laboratory 0 
 etc ...

acedb > quit [quit tace]

$>
 A bientot 

/usr/home/acedb% exit [quit the Unix script]
exit
Script done, file is typescript
/usr/home/acedb% 

Now edit the "typescript" file to create a tace "dump.script" file that has a format similar to the following. Don't include Classes without any entries: when a Find command finds nothing, the current list retains the objects from the previous Find, and they will be written out again. Exclude the Models class, as this may cause problems if you read the file back in. Notice that at the bottom there is a Unix command to concatenate all the files into one large file. This can be easier to handle than many little files.
//tace script to dump database
Find LongText
Write LongText.dump
Find Laboratory
Write Laboratory.dump
Find Paper
Write Paper.dump
Find Contact
Write Contact.dump
.
.
.
Find map_error
Write map_error.dump
$cat *.dump > database.fulldump.ace
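The hand edit of the typescript file can also be done by a short program. The Python sketch below is one possible approach under stated assumptions: it treats any line consisting of exactly a name followed by an integer as a class line, as in the Classes output shown earlier, so stray lines of that shape in a captured session would be picked up too. The function name dump_script is hypothetical. Classes with no entries are skipped, as is the class listed as Model, following the advice above.

```python
import re

# A class line in the Classes output: a name followed by an object count.
CLASS_LINE = re.compile(r"^\s*(?P<name>\S+)\s+(?P<count>-?\d+)\s*$")


def dump_script(classes_output, skip=("Model",)):
    """Build the text of a tace dump script from captured Classes output."""
    lines = ["//tace script to dump database"]
    for line in classes_output.splitlines():
        found = CLASS_LINE.match(line)
        if not found:
            continue  # prompts, banners, and other session text
        name, count = found.group("name"), int(found.group("count"))
        if count <= 0 or name in skip:
            continue  # empty classes would re-dump the previous list
        lines.append("Find " + name)
        lines.append("Write " + name + ".dump")
    lines.append("$cat *.dump > database.fulldump.ace")
    return "\n".join(lines) + "\n"
```

Writing the returned text to dump.script gives a file in the format shown above, which can then be run from the tace prompt with @dump.script.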
When you run the dump script you get something like the session below. The contents may be examined in the database.fulldump.ace file.
acedb > @dump.script
Found 2 objects in this class
acedb > I wrote 2 objects to file LongText.dump
acedb > 
Found 1 objects in this class
acedb > I wrote 1 objects to file Laboratory.dump
acedb > 
Found 1 objects in this class
acedb > I wrote 1 objects to file Paper.dump
acedb > 
Found 8 objects in this class
acedb > I wrote 8 objects to file Contact.dump
acedb > 
.
.
.
// Modified Fri Aug 5 15:56:22 EDT 1994 //jmorris-aatdb