Building ACE Objects for the Traces Display

Data for ACEMBLY and the Traces Display are entered through Sequence objects. The main body of tags for the Sequence class is described elsewhere; the ACEMBLY-specific tags are given below.
          ABI      ABI_Date Text
                   Run_Start Text
                   Run_Stop Text
                   BaseCall ?BaseCall
                   Significant_bases Int UNIQUE Text
                   Clipping UNIQUE Int UNIQUE Int // present clipping
                   Old_Clipping UNIQUE Int UNIQUE Int // as imported from Ted
                   Trace_quality  Excellent_upto UNIQUE Int 
                                  Good_upto UNIQUE Int 
                                  Fair_upto UNIQUE Int 
                   Otto Text REPEAT
                   Stolen
                   ABI_Comment ?Text
                   ABI_Analysis Text
                   ABI_Machine Text
                   Sample Text
                   Author ?Author XREF Sequence
                   Allele Text
                   Primer ?Motif XREF PCR_Product
                   SCF_File UNIQUE Text
                   Fragment_of UNIQUE ?Sequence 
          Vector   // Set for Sequence Lorist6 etc.. -> Blue in acembly

Example entries

A shotgun sequencing project for a given clone might be named "a2". It is made up of contigs, which are listed under the Subsequence tag. The integers following each subsequence define the position of that subsequence on the ACEMBLY map display. A partial project Sequence object is shown below.

A Subsequence (a contig) is made up of individual overlapping sequence reads. The Subsequence has a consensus sequence, which is stored as a DNA object of a given length. The individual reads are Sequence objects listed under the Assembled_From tag. The integers following each sequence read identify its base-pair position within the subsequence (contig). The sequences of individual reads may disagree with the consensus of the Subsequence to which they belong. A partial Subsequence entry is shown below.

Information on each individual read is stored as a separate Sequence object. Each read has a corresponding DNA object, whose name and length follow the DNA tag. An individual read may be part of several Subsequences, each of which is listed following the Assembled_into tag. The ABI tag indicates that trace information is available for a given sequence read. Clipping indicates the region that is currently removed from assembly calculations; it changes dynamically with editing choices. Old_Clipping is read once from the SCF file and does not change dynamically. SCF_File designates the location of the original SCF file that is read by ACEDB. Two examples are given below.

In order to load a DNA sequence and view its trace, follow the steps below:

  • Set the environment variable SCF_DATA to the directory where the SCF files are stored.
  • Read in the sequence annotations as a Sequence object.
  • Read in the DNA sequence as a standard DNA .ace file.

    For example, if the SCF files were stored in /disk2/abi/SCF, use the command "setenv SCF_DATA /disk2/abi/SCF". In the example below, the value F46G11/bt09f09.r1 following the SCF_File tag indicates that the file can be found in a subdirectory named F46G11 of the SCF directory.
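
The lookup just described can be sketched in Python. This is only an illustration of how the SCF_DATA directory and the SCF_File value combine into a file path; it is not how ACEDB itself is implemented. The function name resolve_scf_path and its env parameter are hypothetical, and treating the "\/" that appears in the .ace entry as an escaped "/" is an assumption.

```python
import os
from pathlib import PurePosixPath


def resolve_scf_path(scf_file_value, env=None):
    """Combine SCF_DATA with the value stored under the SCF_File tag.

    Hypothetical helper: illustrates the path lookup only.
    """
    env = os.environ if env is None else env
    base = env.get("SCF_DATA")
    if base is None:
        raise RuntimeError("SCF_DATA is not set")
    # Assumption: a backslash before '/' in .ace text is an escape; drop it.
    relative = scf_file_value.replace("\\/", "/")
    return PurePosixPath(base) / relative


# Values taken from the text; the environment is passed in explicitly
# so the sketch does not depend on the caller's shell settings.
path = resolve_scf_path("F46G11\\/bt09f09.r1", {"SCF_DATA": "/disk2/abi/SCF"})
print(path)
```

Passing the environment as a dictionary keeps the sketch independent of the shell it runs in; omitting the second argument makes it read the real SCF_DATA variable instead.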

    Sequence : "F46G11.bt09f09.r1abi"
    DNA	 "F46G11.bt09f09.r1abi" 200
    Assembled_into	 "a12.1"
    Clipping	 9 395
    Old_Clipping	 5 369
    SCF_File	 "F46G11\/bt09f09.r1"
    
    DNA : "F46G11.bt09f09.r1abi"
             tcaaagtacctgaaaaatagttttagctaaaatcagcaacgaaagaccaa
             ccatttccgtttataagcaaattttctgctttgatatctcggtggatggc
             accgacggaatgcaaatatactagggctcttgttgtaaagtagattatgc
             gaacgacaaagtcaattgagacttctttgtacttgagaagcacatccgca
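
Before loading such a file, it can be useful to confirm that the length given after the DNA tag matches the sequence actually supplied in the DNA object. The following Python sketch does this for the entry above. It is a deliberately simplified reader, not the ACEDB parser: it assumes that objects are separated by blank lines, that the first line of each object has the form Class : "name", and that names contain no colon. The function names parse_ace and dna_lengths are hypothetical.

```python
ACE_TEXT = r"""
Sequence : "F46G11.bt09f09.r1abi"
DNA "F46G11.bt09f09.r1abi" 200
Assembled_into "a12.1"
Clipping 9 395
Old_Clipping 5 369
SCF_File "F46G11\/bt09f09.r1"

DNA : "F46G11.bt09f09.r1abi"
tcaaagtacctgaaaaatagttttagctaaaatcagcaacgaaagaccaa
ccatttccgtttataagcaaattttctgctttgatatctcggtggatggc
accgacggaatgcaaatatactagggctcttgttgtaaagtagattatgc
gaacgacaaagtcaattgagacttctttgtacttgagaagcacatccgca
"""


def parse_ace(text):
    """Group .ace lines into {(class, name): [body lines]}.

    Simplified: a blank line ends the current object.
    """
    objects = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None
            continue
        if current is None:
            cls, _, name = line.partition(":")
            current = (cls.strip(), name.strip().strip('"'))
            objects[current] = []
        else:
            objects[current].append(line)
    return objects


def dna_lengths(objects, read_name):
    """Return (declared length, actual length) for one read."""
    seq_lines = objects[("Sequence", read_name)]
    dna_line = next(l for l in seq_lines if l.split()[0] == "DNA")
    declared = int(dna_line.split()[-1])
    actual = sum(len(l) for l in objects[("DNA", read_name)])
    return declared, actual


declared, actual = dna_lengths(parse_ace(ACE_TEXT), "F46G11.bt09f09.r1abi")
print("declared:", declared, "actual:", actual, "match:", declared == actual)
```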
    
    
