The Segment Loader in ACEDB

Whaaat??? [Dave Matthews, 1997-07-30 (ca.)]

The Segment loader has been part of ACEDB for a long time, but has never really been used outside Sanger (and it has certainly not been documented before!). Most of its functionality is redundant: the same results can probably be achieved with .ace files. Thus you should only look into this if you're into heavy wizardry. An additional complication is that this function crashes the current 4.5 ace (the bug has been reported to the relevant authorities). It does work in 4.3, though.

The segment loader is only available in the X applications (xace and xaceclient) [???? how 'bout gifaceserver, would it be possible to get it called there? Probably not ...]

Look for the two commands on the dropdown menu in the Sequence (fMap) display. They should be Read Segments and Clear Segments. Now, very few ACEDB users are aware that they are there, and even fewer know what they do. This document will try to explain those commands.

Segments are how the ACEDB database manager thinks of parts of the Sequence, for instance when it has to put small markers for splice sites, or large boxes for certain features, on the display. Briefly, each segment has at least the following items:

  1. Sequence name == the Sequence object it applies to
  2. A start coordinate (which needs to be an Int)
  3. A stop coordinate (see above)
  4. A TYPE (see below)
  5. A Method by which it was produced (see below)
  6. And a score (which should be a Float)

The TYPE

The TYPE tells the Display a bit about how you want it to look. The Method tells even more ...

There are 4 different TYPEs, and they each come in two flavours.

  1. FEATURE and FEATURE_UP are drawn as a box. If the Method has the Strand_sensitive tag set, FEATURE will refer to the down strand, and FEATURE_UP will be on the up strand.
  2. SPLICE3 and SPLICE3_UP give small hooks, pointing up. In this case the segments are inherently strand-sensitive, and the down versions are by default drawn as red hooks, while the up versions are drawn in blue. The length of the stem is determined by the score.
  3. SPLICE5 and SPLICE5_UP work exactly as SPLICE3 and SPLICE3_UP, except that the hooks point down.
  4. HOMOL and HOMOL_UP are used for homologies. These take some extra data fields, namely a 'hit sequence' and two coordinates in that sequence.
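
The four TYPEs and their two flavours can be enumerated mechanically, which is handy when generating or checking segment data outside ACEDB. The following is only a sketch: the names BASE_TYPES, ALL_TYPES and split_type are made up for this illustration and are not part of ACEDB.

```python
# Hypothetical helper, not part of ACEDB: splits a TYPE keyword into its
# base kind and a flag saying whether it is the _UP flavour.
BASE_TYPES = ("FEATURE", "SPLICE3", "SPLICE5", "HOMOL")

# Every base TYPE exists in a plain and an _UP flavour: 8 keywords in all.
ALL_TYPES = [base + suffix for base in BASE_TYPES for suffix in ("", "_UP")]


def split_type(type_name):
    """Return (base, is_up) for a TYPE keyword, or raise ValueError."""
    is_up = type_name.endswith("_UP")
    base = type_name[:-3] if is_up else type_name
    if base not in BASE_TYPES:
        raise ValueError(f"unknown TYPE: {type_name!r}")
    return base, is_up
```

For example, split_type("SPLICE5_UP") returns ("SPLICE5", True).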

The Method

So, what is this business with the Method? Well, if the Method name matches a Method object already in the database, then the settings for that Method are used; otherwise a default behaviour is used. So if you already have a Method for, say, blastx, it may be good to use that as the method. However, it is not an error to give a method that does not exist: ace will warn about it, but then create a temporary Method object.

... and all together now!

The format of lines is the following (there are two line types):

TYPE                                        Format
HOMOL, HOMOL_UP                             SeqObj Start Stop TYPE MetObj Score SeqName Start Stop
SPLICE3, SPLICE3_UP, SPLICE5, SPLICE5_UP,
FEATURE, FEATURE_UP                         SeqObj Start Stop TYPE MetObj Score

See Footnote 1 for a slightly more formal description of the format
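
To make the two line types concrete, here is a minimal sketch of how such lines could be read outside ACEDB. It is not the loader's own code: the Segment class and parse_segment_line are hypothetical names, and the sketch assumes that object names contain no whitespace.

```python
from dataclasses import dataclass
from typing import Optional

HOMOL_TYPES = {"HOMOL", "HOMOL_UP"}
ORDINARY_TYPES = {"SPLICE3", "SPLICE3_UP", "SPLICE5", "SPLICE5_UP",
                  "FEATURE", "FEATURE_UP"}


@dataclass
class Segment:
    """One line of segment data (hypothetical container, not an ACEDB type)."""
    seq_obj: str
    start: int
    stop: int
    type: str
    method: str
    score: float
    # Only filled in for HOMOL and HOMOL_UP lines.
    hit_name: Optional[str] = None
    hit_start: Optional[int] = None
    hit_stop: Optional[int] = None


def parse_segment_line(line):
    """Parse one line; raise ValueError if it does not fit either line type."""
    fields = line.split()  # assumption: names never contain whitespace
    if len(fields) < 6:
        raise ValueError(f"too few fields: {line!r}")
    seg_type = fields[3]
    if seg_type in HOMOL_TYPES:
        expected = 9
    elif seg_type in ORDINARY_TYPES:
        expected = 6
    else:
        raise ValueError(f"unknown TYPE {seg_type!r}")
    if len(fields) != expected:
        raise ValueError(f"{seg_type} needs {expected} fields, got {len(fields)}")
    # int() and float() raise ValueError themselves on bad numbers.
    seg = Segment(fields[0], int(fields[1]), int(fields[2]),
                  seg_type, fields[4], float(fields[5]))
    if expected == 9:
        seg.hit_name = fields[6]
        seg.hit_start = int(fields[7])
        seg.hit_stop = int(fields[8])
    return seg
```

The TYPE in the fourth column decides how many fields the line must have, which is why it is checked before anything else is converted.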

The data needs to be saved in a file with the extension '.useg', as Read Segments filters the filenames ruthlessly.

Here is some test data [this needs adjusting to your database, especially the Sequence names]

MD0101	 5001  5002 SPLICE3    GF_splice   3.0
MD0101	 6001  6002 SPLICE3_UP GF_splice   3.0
MD0101	 7001  7002 SPLICE5    GF_splice   3.0
MD0101	 8001  8002 SPLICE5_UP GF_splice   3.0
MD0101	 5001 10000 FEATURE    MyFeat      3.0
MD0101	10001 15000 FEATURE_UP MyFeat      3.0
MD0101	 5001 10000 HOMOL      blastn    200.0 MD0102 2001 7000 
MD0101  10000  5001 HOMOL_UP   blastn    300.0 MD0103 2001 7000 
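
If you would rather generate such a file from a script than type it by hand, something along these lines should do. This is a sketch only: format_segment, render_useg and write_useg are hypothetical helpers, and separating the fields with single tabs is an assumption that relies on the format accepting any run of spaces and tabs between fields.

```python
def format_segment(seq_obj, start, stop, seg_type, method, score, hit=None):
    """Build one line; hit is (SeqName, Start, Stop) for HOMOL/HOMOL_UP only."""
    fields = [seq_obj, str(int(start)), str(int(stop)),
              seg_type, method, str(float(score))]
    if hit is not None:
        hit_name, hit_start, hit_stop = hit
        fields += [hit_name, str(int(hit_start)), str(int(hit_stop))]
    return "\t".join(fields)


def render_useg(rows):
    """Turn a list of argument tuples into the text of a segment file."""
    return "".join(format_segment(*row) + "\n" for row in rows)


def write_useg(path, rows):
    """Write rows to path, insisting on the extension Read Segments expects."""
    if not path.endswith(".useg"):
        raise ValueError("Read Segments only lists files ending in .useg")
    with open(path, "w") as handle:
        handle.write(render_useg(rows))
```

A call such as write_useg("test.useg", [("MD0101", 5001, 5002, "SPLICE3", "GF_splice", 3.0)]) would then produce a one-line file; the sequence and method names are of course placeholders for your own.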

Load 'em up!

Select Read Segments from the dropdown menu in the Sequence map window. A file selection dialog will pop up, showing only files with the .useg extension. Double-click on one of them to load it.

If you have got the format right, you should get a small window reporting on the progress, saying how many segments were loaded. If one of the lines is malformed, you'll get an error message, and the Segment loader will skip over that line.

... and get rid of them again

When you are done looking at your segments, you can clear them away: select Clear Segments from the dropdown menu, and answer the question about which method. You will have to repeat this for every method for which you loaded new segments (and if you have many, it may be easier to simply close the Sequence display and open it again). Note also that only segments you have loaded during the lifetime of the display are affected: other segments, even if they have the same method, are not removed.
Last edited 1997-08-06 /staffan
Footnote 1: A slightly more formal description of the .useg file format
File          = Line(\nLine)*
Line          = Ordinary-line | Homol-line
Homol-line    = ThisLocation <ws> Homol-type <ws> Method-part <ws> HitLocation
Ordinary-line = ThisLocation <ws> Ordinary-type <ws> Method-part
Method-part   = MetObj <ws> Score
ThisLocation  = SeqObj <ws> Location
HitLocation   = SeqName <ws> Location
Location      = Start <ws> Stop
Homol-type    = HOMOL | HOMOL_UP
Ordinary-type = SPLICE3 | SPLICE3_UP | SPLICE5 | SPLICE5_UP | FEATURE | FEATURE_UP
Score         = FLOAT
Start         = INT
Stop          = INT
<ws>          = [ \t]+
SeqName       = 'name of the hit sequence'
SeqObj        = 'name of the current Sequence object'
MetObj        = 'name of a Method object' /* does not need to be an existing object;
                                             it will be created with default behaviour */
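
Since the Segment loader skips malformed lines, it can be useful to check a file against this description before loading it. The sketch below translates the grammar into a regular expression. It is not derived from the ACEDB source, and it makes some assumptions: names are taken to be any run of non-whitespace characters, a Score written without a decimal point is accepted, and trailing whitespace on a line is tolerated (the test data above has some). LINE_RE and find_malformed are hypothetical names.

```python
import re

WS = r"[ \t]+"
INT = r"[+-]?\d+"
FLOAT = r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?"
NAME = r"\S+"  # assumption: object names contain no whitespace

LOCATION = rf"{INT}{WS}{INT}"
METHOD_PART = rf"{NAME}{WS}{FLOAT}"
ORDINARY_TYPE = r"(?:SPLICE3|SPLICE5|FEATURE)(?:_UP)?"
HOMOL_TYPE = r"HOMOL(?:_UP)?"

ORDINARY_LINE = rf"{NAME}{WS}{LOCATION}{WS}{ORDINARY_TYPE}{WS}{METHOD_PART}"
HOMOL_LINE = (rf"{NAME}{WS}{LOCATION}{WS}{HOMOL_TYPE}{WS}{METHOD_PART}"
              rf"{WS}{NAME}{WS}{LOCATION}")
LINE_RE = re.compile(rf"(?:{HOMOL_LINE}|{ORDINARY_LINE})")


def find_malformed(text):
    """Return (line number, line) pairs for lines that do not fit the grammar."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Trailing blanks are stripped before matching (an assumption).
        if not LINE_RE.fullmatch(line.rstrip()):
            bad.append((number, line))
    return bad
```

An empty result from find_malformed only means the lines fit the description above; whether the names refer to objects in your database is something only ACEDB can tell you.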