Illustration 1 shows a typical paper object, [cgc12].
Illustration 1
--------------

    [cgc12]  Reference  Title    Critical oxygen tension of C. elegans
             ---------  -----
                        Journal  Journal of Nematology
                        -------
                        Year     1977
                        ----
                        Volume   9
                        ------
                        Page     253 254
                        ----
             Author     Anderson GL
             ------
                        Dusenbery DB
             Type       ARTICLE
             ----
             Keyword    ENVIR CONDITIONS
             -------
                        OXYGEN
Each class has a "model", which provides the format for the contents of any object which belongs to that class.
There are two ways to see a class's model:
Illustration 2
--------------

    ?Paper  Reference  Title          UNIQUE ?Text
            ---------  -----
                       Journal        UNIQUE ?Journal XREF Paper
                       -------
                       Publisher      UNIQUE Text
                       ---------
                       Contained_in   ?Paper XREF Contains
                       ------------
                       Year           UNIQUE Int
                       ----
                       Volume         UNIQUE Int Text
                       ------
                       Page           UNIQUE Int UNIQUE Int
                       ----
            Author     ?Author XREF Paper
            ------
            Abstract   ?LongText
            --------
            Type       UNIQUE Text
            ----
            Contains   ?Paper XREF Contained_in
            --------
            Refers_to  Locus          ?Locus XREF Reference
            ---------  -----
                       Rearrangement  ?Rearrangement XREF Reference
                       -------------
                       Sequence       ?Sequence XREF Reference
                       --------
            Keyword    ?Keyword
            -------
Tags are positioned to the left of the data entries they label. Tags are underlined in the Illustrations and other examples, so that readers can clearly distinguish between tags and their entries.
Some tags are subtags of other tags. Illustration 3 shows the ?Author model, in which the Address tag contains several subtags. Illustration 4 shows a typical Author object, "Jones A".
Illustration 3
--------------

    ?Author  Full_name   Text
             ---------
             Laboratory  ?Laboratory XREF Staff
             ----------
             Address     Mail    Text
             -------     ----
                         E_mail  Text
                         ------
                         Phone   Text
                         -----
                         Fax     Text
                         ---
             Paper       ?Paper
             -----

Illustration 4
--------------

    Jones A  Full_name   Alan Jones
             ---------
             Laboratory  CB
             ----------
             Address     Mail    25 Green Road, Cambridge
             -------     ----
                         E_mail  alan@mrc-lmb
                         ------
                         Phone   30535
                         -----
                         Fax     30536
                         ---
             Paper       [cgc12]
             -----
                         [cgc120]
An object is displayed in a clearer and more orderly fashion in ACEDB when the system groups together items which have something in common, such as parts of an address, as subtag entries within a main tag.
Subtags also make deletion easier. If a group of tags are subtags of a main tag, we can delete the whole group (and hence their related data) by deleting the main tag, rather than deleting each tag individually. For instance, rather than deleting Mail, E_mail, Phone and Fax each time we want to delete address data in an Author object, we can simply delete Address. Deletion details will be explained in section C.
Models also indicate the nature of the data that can
be given for each tag entry.
An "Int" data field to the right of a tag indicates that only an integer can be given for that data field. "Float" means that a floating point number must be used.
If "Text" or "?Text" is stated, any ASCII characters can be entered. "?Text" data, unlike "Text" data, can be searched with the "Text Search" option in the main ACEDB window.
There might be a few Ints and/or (?)Texts for one tag entry. For instance, the ?Sequence model includes the following:

    promoter Int Int ?Text
    --------

A possible entry would be:

    promoter 2520 2526 TATA
    --------

If the word "UNIQUE" is used in a model for objects in a class, there can only be one item to the immediate right of the "UNIQUE" for every item to the left.
This makes sense. In the case of objects belonging to the Paper class, for instance, whereas it is important that the Author tag can include several entries, there should obviously be only one entry in the Journal or Volume tags.
Often "UNIQUE" is positioned between a tag and a data field. This means that there can only be one entry in that data field for that tag. For instance, the ?Paper model includes the following:

    Volume UNIQUE Int Text
    ------

This indicates that each Paper object can only contain one entry in the Int field of the Volume tag, though there can be several entries in the Text field.
"UNIQUE" can also be placed between two data fields in a model's tag. For instance, the ?Sequence model contains:

    misc-feature Int UNIQUE Int
    ------------

In this case, a misc-feature entry in a Sequence object can have only one integer in its second field for each one in its first.
Some tag entries are mutually exclusive (i.e. a multiple-choice type tag). The options are labelled with subtags. In the model there will be a UNIQUE between the tag and the subtags to indicate that in a given object a tag can only have one of the subtags (and its entry). The ?Allele model, for instance, includes:

    Source  UNIQUE  Gene        ?Locus
    ------          ----
                    Gene_class  ?Gene-Class
                    ----------

An Allele object might include the following:

    Source  Gene_Class  unc
    ------  ----------
                        let

or:

    Source  Gene  unc-33
    ------  ----
                  let-45

However, it should not contain the following:

    Source  Gene        unc
            Gene_Class  unc-33

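The UNIQUE rule described above can be sketched as a small check. This is only an illustration under stated assumptions, not how ACEDB itself validates data: the function name `violates_unique` and the representation of a tag's entries as tuples are hypothetical, and `unique_pos` is the number of fields standing to the left of the UNIQUE keyword in the model line.

```python
def violates_unique(rows, unique_pos):
    """Return True if some group of left-hand fields is followed by
    more than one distinct value at position `unique_pos`.

    rows       -- entries of one tag, each a tuple of fields (hypothetical layout)
    unique_pos -- how many fields sit to the left of UNIQUE in the model line
    """
    seen = {}
    for row in rows:
        if len(row) <= unique_pos:
            continue  # this entry has no field at the UNIQUE position
        prefix = tuple(row[:unique_pos])
        value = row[unique_pos]
        # Remember the first value seen for this prefix; a different one is a clash.
        if seen.setdefault(prefix, value) != value:
            return True
    return False


# "misc-feature Int UNIQUE Int": one second integer per first integer.
print(violates_unique([(10, 20), (10, 30)], unique_pos=1))  # clash
# "Source UNIQUE Gene ... Gene_class ...": only one of the subtags may be used.
print(violates_unique([("Gene", "unc-33"), ("Gene", "let-45")], unique_pos=0))
```

With `unique_pos=0` the same check covers both a UNIQUE directly after a tag and a UNIQUE in front of a set of mutually exclusive subtags, since in each case everything to the left of UNIQUE is the tag itself.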
A model may define a tag in the following form:

    Tag ?Class
    ---

(for instance, Author ?Author in the case of the Paper model)
This indicates that within an object with that model, any entry in the stated tag will be an object in the stated class.
What implications does the fact that a tag's entry is an object have for the user? An example will help to make this clear.
The Mapper tag for a Multi-pt-data object "ABC" contains the entry "Jones A". The Mapper tag section of the Multi-pt-data model states that Mapper entries are members of the Author class (i.e. Mapper ?Author).
As a result, when the text relating to the "ABC" Multi-pt-data object is displayed in a window, an entry in its Mapper tag, "Jones A", will be in bold face to indicate that it is an object in its own right. If "Jones A" is selected with the mouse, the "Jones A" object, together with its related tags and data, will be displayed in a new window.
Sometimes when tag entries are described in models as objects in a particular class, the word XREF and a tag name appear to the right of the class. For example, the Sequence model contains:

    Clone ?Clone XREF Sequence
    -----

This establishes a cross-reference from the tag entry, which is an object in its own right, back to the object in which it is a tag entry.
In the example just given, Clone tag entries are described in the ?Sequence model as "?Clone", that is, objects belonging to the Clone class. What exactly do the words "XREF Sequence" mean? They indicate that the model for Clone objects contains a Sequence tag, entries for which are defined as "?Sequence", that is, objects belonging to the Sequence class. If a Sequence object contains a Clone object in its Clone tag, then that Clone object will automatically have that Sequence object as an entry in its own Sequence tag. For example, if we have a Sequence object "ABC" whose Clone tag entry is "Clone35", then the Clone object "Clone35" will automatically include the Sequence object "ABC" in its own Sequence tag.
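The effect of XREF can be pictured with a toy in-memory store. This is a rough sketch and not ACEDB code: the class `TinyStore`, its `add` method and the `xrefs` table are all hypothetical names introduced for illustration. The table simply records, for a given class and tag, which class the entry belongs to and which tag in that class should receive the back-reference.

```python
from collections import defaultdict


class TinyStore:
    """Toy object store illustrating automatic cross-references (hypothetical)."""

    def __init__(self, xrefs):
        # xrefs maps (class, tag) -> (class of the entry, tag for the back-reference)
        self.xrefs = xrefs
        # data maps (class, object name) -> tag -> list of entries
        self.data = defaultdict(lambda: defaultdict(list))

    def add(self, cls, name, tag, value):
        entries = self.data[(cls, name)][tag]
        if value not in entries:
            entries.append(value)
        target = self.xrefs.get((cls, tag))
        if target is not None:
            target_cls, back_tag = target
            back = self.data[(target_cls, value)][back_tag]
            if name not in back:
                back.append(name)  # the automatic entry in the other object


# "Clone ?Clone XREF Sequence" in the Sequence model:
store = TinyStore({("Sequence", "Clone"): ("Clone", "Sequence")})
store.add("Sequence", "ABC", "Clone", "Clone35")
print(store.data[("Clone", "Clone35")]["Sequence"])
```

Adding "Clone35" to the Clone tag of Sequence "ABC" is the only explicit step; the Sequence entry in the Clone object appears as a side effect, which is the behaviour the text describes.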
If the word REPEAT is stated after a data field in a tag, then
that data field's entries can each contain
several items of a particular kind on a single line.
The Clone_Grid model includes the following:

    Row Int UNIQUE ?Clone XREF Gridded REPEAT

Looking at the second half of the line, we see that there is a REPEAT statement for ?Clone entries. Hence the ?Clone part of two row entries might contain the following:

    Clone1 Clone2 Clone3 Clone4 Clone5 Clone6 Clone7 Clone8 Clone9

Why have a UNIQUE before ?Clone and REPEAT after it? We learnt above that UNIQUE means there can only be one of what is to the right of UNIQUE for each element on the left. Since the Row tag entry begins with Int, there can only be a single-line group of clones for each integer given.
Hence you could have the following:

    Row  1  A1 A2 A3 A4
         2  A6 A7 A8 A9 A10

but not:

    Row  1  A1 A2 A3 A4
            A6 A7 A8 A9 A10
         2  B1 B3 B6 A5


    Description  Recessive
    -----------  ---------
                 Dominant
                 --------
                 Semi-dominant
                 -------------
                 Weak
                 ----

Adding/editing ACEDB data
There are two ways of editing data in ACEDB.
First, objects can be edited manually using the Add/Delete/Rename
option in the main ACEDB window menu and/or the Update option in
the text window for any object. Second, data can be
imported from files in "Ace" format.
What is an Ace file?
If a file is in Ace format, ACEDB can interpret the message it
contains. Ace files can be used to add, delete, and rename
objects, and to add data and comments to objects and
delete data from them. The files, which must have
the ".ace" extension, are read into ACEDB using
the "Read Ace Files" option in the menu of the main ACEDB window.
Ace File Operations
To add an object to the database, first state its class and
name (e.g. Journal Cell) on a single line in an Ace file.
Data can be attached to this object on immediately subsequent lines. The data must be prefixed with suitable tags, such as Author, Journal, Volume and Page in the case of Paper objects. An object's potential tags are listed in the model for the class of objects to which that object belongs.
The steps are the same when the object is already in ACEDB and we simply want to add to its data. If, for instance, we wanted to add an author and a volume number to the paper "[wbg6]", we could put the following:

    Paper [wbg6]
    Author "Jones A"
    ------
    Volume 39
    ------

Note: I have only underlined the tags in examples in order to make the examples clearer. There should not be any underlining in an Ace file.
If you want to add several entries for a particular tag in an object, each of the entries must be listed on a separate line and each line must begin with the tag in question. The order of the lines doesn't matter. If, for instance, we needed to add several Gene tag entries to the Paper object, [wbg6], we might put the following in an Ace file:

    Paper [wbg6]
    Gene let-31
    ----
    Gene let-32
    ----
    Gene let-45
    ----

As mentioned earlier, some tags have subtags. In this case only label the data with the subtags; there is no need to mention the tag containing the subtags, since ACEDB will know from the models which tag the subtags belong to.
If a subtag has no description to its right in a model, there should be no data to the right of that subtag when it is appended to an object in an Ace file. Your choice of a subtag or subtags is adequately descriptive in itself.
When adding data in an Ace file, it is important to have the correct data types (e.g. integer, floating-point) for each part of an object's tag entry, and to have those parts in the correct order. The format is defined in the object's class model (e.g. see the definitions given to the right of the tag entries shown in the Paper class model in Illustration 2).
If the word "UNIQUE" is in the model, you can only have one of what is on the right of the "UNIQUE" for each element on the left. The Paper model contains:

    Publisher UNIQUE Text
    ---------

This means that there can only be one entry for Publisher in a Paper object.
Usually in ACEDB when a new entry is added for a particular tag in an object (e.g. "Jones A" for the Author tag in a Paper object), the new tag entry will be added to any previous entries. However, if a tag can only contain one entry, the new entry will overwrite the old one.
For a full discussion of tags, data types, their order, "UNIQUE" etc, see section B above.
If mistakes are made in an Ace file, ACEDB will point them out when attempts are made to read in the file.
Renaming data
If you want to rename an object the Ace file syntax is as follows:

    -R Classname Oldobject Newobject

e.g.

    -R Author "Jones A" "Jones AB"

Deleting data
The Ace file syntax for deleting an object is as follows:

    -D Classname Objectname

e.g.

    -D Author "Jones A"

To delete data which is attached to an object, first give the object's class and name on a single line. Then on subsequent lines say "-D Tagname" for each of the tags whose entries you want to delete. You only need to say "-D Tagname" once for each tagname in an object no matter how many entries that tag has. Each "-D Tagname" should be on a separate line. For example, the following statements will delete all the papers and laboratories associated with the Author "Jones A":

    Author "Jones A"
    -D Paper
       -----
    -D Laboratory
       ----------

If you want to delete entries for a particular tag in an object and add in new entries for that tag, the order of the Ace file statements is important. The deletion statement should come before you add in the new data, since if the deletion comes after the new data is added, both the new and old data will be deleted. For instance, if, in addition to wanting to delete the current papers for the Author object "Jones A", you wanted to add in some new Paper tag entries, you could say:

    Author "Jones A"
    -D Paper
       -----
    Paper "Worm tales"
    -----
    Paper "Worm mysteries"
    -----

If it is stated in a model that there can only be one entry for a particular tag (e.g. "Title UNIQUE Text" in the Paper model; see the discussion of "UNIQUE" in section B above), and you want to replace an object's entry for that tag, there is no need to say "-D tagname", since the new entry will automatically overwrite the old.
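The delete-then-add ordering can be captured in a small helper that writes the statements in the safe order. This is a minimal sketch, not part of ACEDB: `quote_if_needed` and `replace_tag_entries` are hypothetical names, and the helper only applies the rule that items containing a space are enclosed in quotes.

```python
def quote_if_needed(item):
    """Enclose an item in quotes when it contains a space."""
    return f'"{item}"' if " " in item else item


def replace_tag_entries(cls, name, tag, new_entries):
    """Build Ace statements that clear a tag and then add new entries.

    The -D line is emitted first, so that the deletion cannot remove
    the entries that are added afterwards.
    """
    lines = [f"{cls} {quote_if_needed(name)}", f"-D {tag}"]
    lines += [f"{tag} {quote_if_needed(entry)}" for entry in new_entries]
    return "\n".join(lines)


print(replace_tag_entries("Author", "Jones A", "Paper",
                          ["Worm tales", "Worm mysteries"]))
```

For a tag that the model marks as UNIQUE the `-D` line is unnecessary, as noted above, because the new entry overwrites the old one.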
Sometimes a tag has subtags. How is data deleted when this is the case? If you only want to delete the data in some of a tag's subtags then the Ace file syntax is:

    Classname Objectname
    -D subtag1
       -------
    -D subtag3
       -------

There is no need to mention the tag which contains the subtags.
If, however, you want to get rid of the contents of all the subtags in a tag, rather than saying -D subtag for each of the subtags, you could have the simpler equivalent, "-D tag". e.g. in the case of Author objects you can have -D Address, rather than:

    -D Mail
       ----
    -D E_mail
       ------
    -D Phone
       -----
    -D Fax
       ---

Adding comments
Comments can be added to any tag entry for an object. These comments will be added into ACEDB.
In an Ace file the syntax is:

    Tag TagEntry -C Comment

e.g. we could have the following for a Paper object, [cgc12]:

    Paper [cgc12]
    Author "Jones A" -C "and the rest of the gang"
    ------
    Page 456 467 -C "C. elegans growth patterns"
    ----

In ACEDB comments are displayed with a dark background.
Ace file data that ACEDB does not see
If you want to put something in an Ace file which you don't want ACEDB to try to interpret, such as a description of the contents of the Ace file, prefix your note with "//". Anything following this to the end of the line will be ignored.
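For readers who pre-process Ace files with their own scripts, the "//" rule might be mimicked as follows. This is a sketch only: `strip_ace_comment` is a hypothetical helper, and leaving a "//" untouched when it occurs inside a quoted item is an assumption made here for safety, not something the text above states about ACEDB.

```python
def strip_ace_comment(line):
    """Drop everything from '//' to the end of the line.

    Assumption: a '//' inside double quotes is treated as data and kept.
    """
    in_quotes = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes and line.startswith("//", i):
            return line[:i].rstrip()
    return line.rstrip()


print(strip_ace_comment("Locus unc-32   // data from Bill"))
```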
General Syntax Rules
Objects listed in the Ace files should be separated by a blank line, as in the case of the Paper objects below.

    Paper [cgc12]
    Journal Nature
    -------
    Page 10 16
    ----

    Paper [cgc13]
    Journal Nature
    -------
    Page 17 24
    ----

A data item should be enclosed in quotes if it contains a space. Otherwise the data after the space will be lost. The following Author object is an example:

    Author "Jones A"
    Mail "6 Blackheath Park"
    ----
    Phone "044 656565"
    -----

If a tag entry has multiple parts there should only be quotes around individual parts, e.g. the Sequence model includes:

    promoter Int Int ?Text
    --------

In this case you might have the following:

    promoter 2520 2526 "TATA signal"
    --------

but you should not have:

    promoter "2520 2526 TATA signal"
    --------

Sentences in quotes can spread over several lines, as long as no carriage returns are included.
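When Ace files are generated by a script, the rules above (a blank line between objects, quotes around each individual part that contains a space) can be applied mechanically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and items that themselves contain a double quote are rejected rather than escaped, since the text does not say how ACEDB treats them.

```python
def ace_quote(item):
    """Quote a single item if it contains whitespace."""
    text = str(item)
    if '"' in text:
        raise ValueError("embedded quotes are not handled in this sketch")
    if any(ch.isspace() for ch in text):
        return '"' + text + '"'
    return text


def ace_line(tag, *parts):
    """One line: the tag followed by its parts, each quoted separately."""
    return " ".join([tag] + [ace_quote(part) for part in parts])


def ace_paragraph(cls, name, rows):
    """One object: class and name on the first line, then one line per entry."""
    lines = [ace_line(cls, name)]
    lines += [ace_line(tag, *parts) for tag, parts in rows]
    return "\n".join(lines)


def ace_file(paragraphs):
    """Objects are separated by a blank line."""
    return "\n\n".join(paragraphs) + "\n"


print(ace_file([
    ace_paragraph("Paper", "[cgc12]", [("Journal", ["Nature"]), ("Page", [10, 16])]),
    ace_paragraph("Author", "Jones A", [("Mail", ["6 Blackheath Park"])]),
]))
```

Quoting each part separately is what keeps an entry such as promoter 2520 2526 "TATA signal" from collapsing into a single quoted string.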
LongText and DNA
Some ACEDB classes have an array structure rather than the standard tree structure. Objects in these classes need special treatment in an ace file.
The LongText array class objects are long pieces of text (usually abstracts). The syntax is as follows:

    LongText [wbg11.1p68]
    This is an intricate worm family saga spanning several generations.
    It can contain blank lines, because there is a special symbol for
    the end of the entry.
    ***LongTextEnd***

DNA objects are pieces of DNA. The syntax is as follows:

    DNA CEMSA02.f
    TCGTTAAGAATTGGAAGTTCCGATGTTAGTGAAAATGAGA
    AGAAGGAGCTGAAGAAGAGAAAGCTTATCAGTGAAGTAAA
    CATCAAAGCATTGGTGGTTTCCAAGGGAACATCTTTCACC
    ACTAGTCTTGCAAAGCAGGAAGCTGATTTGACTCCGGAAA
    TGATTGCTTCTGGTTCATGGAAAGACATGCAATTCAAAAA
    GTATAATTTCGATTCACTCGGAGTTGTTCCGTCATCTGGG
    CATCTGCATCCATTAATGAAAGTGCGGTCTGAATTCCGAC
    AAATCTTCTTCTCAATGGGATTTTCTGAAATGGCGACAAA
    TCGATACGTGGAGTCGTCTTTCTGGAACTTTGATGCCCTT
    TTCCAACCTCAACAGCATCCTGCAAGAGATGCTCATGATA
    CTTTCTTCGGTTCTGATCCCGCGATTAGCACGAGTTCCCTG

Everything is taken up to the next blank line. Letters can either be upper or lower case, and should be IUPAC codes for nucleotides, e.g. A, C, G, T, N for anything, R for purine...
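Before a DNA entry is written to an ace file, its letters can be checked and the sequence wrapped into lines with no blank line in between. This is a sketch with hypothetical helper names. The set of letters used is the standard IUPAC nucleotide alphabet; whether ACEDB accepts every one of these letters (U, for instance) is not stated in the text and is assumed here.

```python
# Standard IUPAC nucleotide codes; acceptance of all of them by ACEDB is assumed.
IUPAC_CODES = set("ACGTURYSWKMBDHVN")


def invalid_bases(sequence):
    """Return the sorted letters in the sequence that are not IUPAC codes."""
    return sorted({ch for ch in sequence.upper()
                   if not ch.isspace() and ch not in IUPAC_CODES})


def dna_paragraph(name, sequence, width=40):
    """Format a DNA object: a header line, then rows of `width` letters."""
    bases = "".join(sequence.split())  # drop spaces and line breaks
    bad = invalid_bases(bases)
    if bad:
        raise ValueError("not IUPAC nucleotide codes: " + ", ".join(bad))
    rows = [bases[i:i + width] for i in range(0, len(bases), width)]
    return "\n".join([f"DNA {name}"] + rows)


print(dna_paragraph("CEMSA02.f", "TCGTTAAGAATTGGAAGTTCCGATGTTAGTGAAAATGAGAAGAAGG"))
```

Because everything up to the next blank line is read as sequence, the helper never emits an empty row, and the caller is expected to follow the paragraph with a blank line.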
DNA and LongText objects should be attached to other objects. The model for Paper objects includes the following:

    Abstract ?LongText

A Paper and its related LongText abstract might be listed in an ace file as follows:

    Paper [wbg11.1p68]
    Abstract [wbg11.1p68]
    Author "Jones J"
    Author "Smith T"

    LongText [wbg11.1p68]
    This is the full text of a fascinating article by Leon Avery
    on nuclear protein extracts. The only important thing about it
    is that the final line is as follows, exactly, without any
    spaces after the stars at the end.
    ***LongTextEnd***

The Paper and LongText objects are usually given the same name.
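Since the closing line must read exactly ***LongTextEnd*** with nothing after the stars, it can be worth generating LongText entries with a helper rather than by hand. The following is a minimal sketch; `longtext_paragraph` is a hypothetical name, and stripping trailing spaces from every line of the text (not only the last) is a choice made here for tidiness.

```python
END_MARKER = "***LongTextEnd***"


def longtext_paragraph(name, text):
    """Format a LongText object, ending with the exact end-of-entry line."""
    if END_MARKER in text:
        raise ValueError("the text itself must not contain the end marker")
    # Keep blank lines inside the text, but remove trailing spaces on each line.
    body = "\n".join(line.rstrip() for line in text.strip("\n").splitlines())
    return f"LongText {name}\n{body}\n{END_MARKER}"


print(longtext_paragraph("[wbg11.1p68]",
                         "An intricate worm family saga.\n\nIt spans generations."))
```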
Sequence objects include the following:

    DNA ?DNA

A sequence and its DNA might be listed in an ace file as follows:

    Sequence CEMSA02.f
    From_Author "Kerlavage AR"
    Library Genbank ? M79466
    Reference [cgc1567]
    DNA CEMSA02.f
    Related_sequence CEMSA02.r
    Brief_Identification "Phenylalanyl-tRNA synthetase beta"

    DNA CEMSA02.f
    TCGTTAAGAATTGGAAGTTCCGATGTTAGTGAAAATGAGA
    AGAAGGAGCTGAAGAAGAGAAAGCTTATCAGTGAAGTAAA
    CATCAAAGCATTGGTGGTTTCCAAGGGAACATCTTTCACC
    ACTAGTCTTGCAAAGCAGGAAGCTGATTTGACTCCGGAAA
    TGATTGCTTCTGGTTCATGGAAAGACATGCAATTCAAAAA
    GTATAATTTCGATTCACTCGGAGTTGTTCCGTCATCTGGG
    CATCTGCATCCATTAATGAAAGTGCGGTCTGAATTCCGAC
    AAATCTTCTTCTCAATGGGATTTTCTGAAATGGCGACAAA
    TCGATACGTGGAGTCGTCTTTCTGGAACTTTGATGCCCTT
    TTCCAACCTCAACAGCATCCTGCAAGAGATGCTCATGATA
    CTTTCTTCGGTTCTGATCCCGCGATTAGCACGAGTTCCCTG

The Sequence and DNA objects are usually given the same name.
KeySets
The other main array class is the KeySet class. The syntax for adding a KeySet via an ace file is as follows:

    KeySet nameofkeyset
    Classname Objectname    // repeated up to next blank line

The following is an example of an ace file keyset:

    KeySet GenesfromBill
    Locus unc-32
    Locus lin-19
    Locus dpy-40
    Locus cll-3

    KeySet DatafromBen
    Locus unc-31
    Clone C05G2
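The KeySet layout is simple enough to read back with a few lines of script, for instance to check a file before loading it. This is a rough sketch, not an ACEDB parser: `parse_keysets` is a hypothetical name, quoted names and "//" notes are not handled, and paragraphs that do not start with KeySet are simply skipped.

```python
import re


def parse_keysets(text):
    """Map each KeySet name to its list of (class, object name) pairs."""
    keysets = {}
    # Paragraphs are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if not lines:
            continue
        head = lines[0].split(None, 1)
        if len(head) != 2 or head[0] != "KeySet":
            continue  # not a KeySet paragraph
        keysets[head[1]] = [tuple(line.split(None, 1)) for line in lines[1:]]
    return keysets


example = """KeySet GenesfromBill
Locus unc-32
Locus lin-19

KeySet DatafromBen
Locus unc-31
Clone C05G2
"""
print(parse_keysets(example))
```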