The methods of DNA sequencing

In this article, I briefly describe the various methods of DNA sequencing and their applications.

DNA sequencing

DNA sequencing is the process of determining the order of nucleotides in DNA, that is, the order of the four bases adenine, guanine, thymine, and cytosine. The nucleotide sequence is the basis for understanding a gene or genome, as it contains the instructions for building an organism. DNA sequencing is used to determine the sequence of individual genes, clusters of genes, full chromosomes, or the entire genome of any organism. First-generation sequencing technology, which determines DNA sequences of several kb in length, comprises two methods: the chain-termination method and the chemical degradation method. The chain-termination method was developed by F. Sanger and A.R. Coulson in the UK. The chemical degradation method was developed by the American molecular biologists A. Maxam and W. Gilbert.

The Sanger-Coulson method of DNA sequencing

This is a chain-termination method and requires single-stranded DNA. In this method, DNA chains are synthesized on a template strand. However, when one of the four possible dideoxynucleotides, which lack a 3′ hydroxyl group, is incorporated, it prevents the addition of any further nucleotide. The main principle of the Sanger-Coulson method is therefore the use of dideoxynucleotide triphosphates as chain terminators.

The classical chain-termination method requires a single-stranded DNA template, a DNA primer, a DNA polymerase, normal deoxynucleotide triphosphates (dNTPs), and modified nucleotides (dideoxynucleotide triphosphates, ddNTPs) that terminate DNA strand elongation. The ddNTPs may also be radioactively or fluorescently labeled for detection in automated sequencing machines. The DNA sample is divided into four separate sequencing reactions, each containing all four of the standard deoxynucleotides (dATP, dGTP, dCTP, and dTTP) and the DNA polymerase (figure 1). To each reaction, only one of the four dideoxynucleotides (ddATP, ddGTP, ddCTP, or ddTTP) is added. These are the chain-terminating nucleotides: they lack the 3′-OH group required for the formation of a phosphodiester bond between two nucleotides, so their incorporation terminates DNA strand extension and results in DNA fragments of varying length.

Figure 1: Diagrammatic representation of the Sanger-Coulson method
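
The four reactions described above can be illustrated with a small Python sketch. It is a simplified model, not a simulation of real chemistry: it ignores the primer and assumes that every possible termination point is represented in each reaction. The function names and the example template are hypothetical and chosen only for illustration.

```python
# Simplified model of the four chain-termination reactions.
# Assumptions (not from the article): the primer is ignored, and each
# reaction contains a fragment for every position where its ddNTP
# could have been incorporated.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesized_strand(template_3to5: str) -> str:
    """Return the new strand (5'->3') copied from a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def termination_fragments(new_strand: str) -> dict[str, list[int]]:
    """For each ddNTP reaction, list the lengths of the terminated fragments.

    A fragment of length n ends with the dideoxynucleotide that was
    incorporated at position n of the new strand.
    """
    lanes: dict[str, list[int]] = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(new_strand, start=1):
        lanes[base].append(position)
    return lanes


strand = synthesized_strand("CGTAAC")
print(strand)                         # GCATTG
print(termination_fragments(strand))  # {'A': [3], 'C': [2], 'G': [1, 6], 'T': [4, 5]}
```

Each list holds the fragment lengths that would appear as bands in the corresponding reaction. Taken together, the four reactions cover every length from 1 to the end of the strand exactly once.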

Detection of DNA sequence

The newly synthesized and labeled DNA fragments are first denatured by heat treatment. They are then separated by size using gel electrophoresis on a denaturing polyacrylamide-urea gel. Each of the four reactions is run in its own lane (lanes A, T, G, C). The DNA bands are then visualized by autoradiography or UV light, and the DNA sequence can be read directly off the X-ray film or gel image. The dark bands correspond to DNA fragments of different lengths. A dark band in a lane indicates a DNA fragment that results from chain termination after incorporation of a dideoxynucleotide (ddATP, ddGTP, ddCTP, or ddTTP). The relative positions of the bands across the four lanes are then used to read the DNA sequence from bottom to top (figure 1).

In variations of chain-termination sequencing, nucleotides can be tagged with radioactive phosphorus for radiolabeling, or a primer labeled with a fluorescent dye at the 5′ end can be used. Dye-primer sequencing allows the result to be read by an optical system, which makes analysis faster and more economical and permits automation.

This method of DNA sequencing was later performed on automated sequencing machines, in which the chain-terminated DNA molecules, labeled with fluorescent tags, are separated by size within thin glass capillaries. Detection is done by laser excitation.

The Maxam-Gilbert method of DNA sequencing

In 1976-1977, Allan Maxam and Walter Gilbert developed a method of DNA sequencing based on the chemical modification of DNA and subsequent cleavage at specific bases. It gained popularity because purified DNA could be used directly. In this method, chemical reagents are used to cleave the existing DNA molecule, and each reagent acts specifically at a particular nucleotide. Because no new DNA is synthesized, this method does not require a primer.

First, the double-stranded DNA fragment to be sequenced is labeled by attaching a radioactive phosphorus group to the 5′ end of each strand. Then, dimethyl sulphoxide is added and the labeled DNA sample is heated to 90°C. This breaks the base pairing, so each DNA molecule dissociates into its two component strands. Gel electrophoresis can separate the two strands from one another because one of them probably contains more purine nucleotides than the other and is therefore slightly heavier. One strand is purified from the gel and divided into four samples. Each of the four samples is treated with one of the cleavage reagents.

The modification and cleavage reactions are carried out so that they result in only one breakage per strand. Only some of the cleaved fragments retain the 32P label at their 5′ ends. After electrophoresis, which follows the same principles as in chain-termination sequencing, the bands visualized by autoradiography represent these labeled fragments (figure 2). The nucleotide sequence can then be read from the autoradiograph in the same way as in the Sanger-Coulson method.

Figure 2: The Maxam-Gilbert method of DNA sequencing
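
The cleavage step can be sketched in Python. The article does not name the four cleavage reactions, so the sketch assumes the classical set (G, A+G, C+T, and C), and it assumes that cleavage at a base leaves a 5′-labeled fragment containing only the nucleotides in front of that base. All function names and the example strand are hypothetical.

```python
# Simplified model of Maxam-Gilbert cleavage and band decoding.
# Assumption (not stated in the article): the four reactions are the
# classical G, A+G, C+T and C reactions.

REACTIONS = {"G": "G", "A+G": "AG", "C+T": "CT", "C": "C"}


def cleavage_bands(strand_5to3: str) -> dict[str, list[int]]:
    """Return, per reaction lane, the lengths of the 5'-labeled fragments."""
    lanes: dict[str, list[int]] = {name: [] for name in REACTIONS}
    for index, base in enumerate(strand_5to3):
        if index == 0:
            continue  # cleavage at the first base leaves no nucleotides on the label
        for name, targets in REACTIONS.items():
            if base in targets:
                lanes[name].append(index)  # index = nucleotides 5' of the cut
    return lanes


def decode(lanes: dict[str, list[int]]) -> str:
    """Read the bases from the shortest labeled fragment to the longest."""
    lengths = sorted({n for bands in lanes.values() for n in bands})
    bases = []
    for n in lengths:
        if n in lanes["G"]:
            bases.append("G")  # band in both G and A+G lanes
        elif n in lanes["A+G"]:
            bases.append("A")  # band in A+G lane only
        elif n in lanes["C"]:
            bases.append("C")  # band in both C and C+T lanes
        else:
            bases.append("T")  # band in C+T lane only
    return "".join(bases)


lanes = cleavage_bands("TGCATTG")
print(lanes)          # {'G': [1, 6], 'A+G': [1, 3, 6], 'C+T': [2, 4, 5], 'C': [2]}
print(decode(lanes))  # GCATTG
```

In this model the base at the labeled 5′ end itself is not recovered, because cutting there leaves no nucleotides attached to the label.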

Reading the DNA sequence from the autoradiograph

First, the band that has moved the farthest is located. This represents the smallest piece of DNA, the strand terminated by incorporation of the dideoxynucleotide at the first position in the template. If the track in which this band is located is G, the first nucleotide in the sequence is G (figure 3).

The next most mobile band corresponds to a DNA molecule that is one nucleotide longer than the first. If this band is in track C, then the sequence so far is GC (figure 3). The process is continued along the autoradiograph until the individual bands become so bunched up that they cannot be separated from one another.

Figure 3: Reading a DNA sequence
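
The reading procedure amounts to sorting all bands by how far they have migrated and noting the track of each one. The following Python sketch shows this under simple assumptions: each band is given as a migration distance, no two bands share the same distance, and all bands are resolved. The function name and the distances are made up for illustration.

```python
def read_gel(tracks: dict[str, list[float]]) -> str:
    """Read a sequence from band positions.

    `tracks` maps each track (A, C, G, T) to the migration distances of
    its bands. The band that moved farthest is the shortest fragment,
    so bands are read in order of decreasing distance.
    """
    bands = [
        (distance, base)
        for base, distances in tracks.items()
        for distance in distances
    ]
    bands.sort(reverse=True)  # farthest-moving band first
    return "".join(base for _, base in bands)


# Hypothetical migration distances in millimetres.
tracks = {
    "G": [95.0, 40.0],
    "C": [88.0],
    "A": [80.0],
    "T": [71.0, 60.0],
}
print(read_gel(tracks))  # GCATTG
```

Here the farthest band (95.0) lies in track G and the next (88.0) in track C, so the sequence begins GC, as in the description above.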

Building up a long DNA sequence

A single sequencing experiment, whether by the Sanger-Coulson or the Maxam-Gilbert method, gives only about 400 nucleotides of sequence. Long DNA sequences are obtained by performing sequencing experiments on many different fragments, all derived from a single larger DNA molecule. These fragments should overlap, so that the individual DNA sequences also overlap. The overlaps can then be located by inspection and the master sequence gradually built up. One way to produce overlapping sequences is to cleave the DNA molecule with two different restriction endonucleases, which gives two different sets of fragments. The drawback of this approach is that the restriction sites may be inconveniently placed, and individual fragments may be too long to be sequenced completely.
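
Building a master sequence from overlapping reads can be sketched in Python as a greedy merge: repeatedly join the two sequences with the longest overlap. This is only an illustration, assuming error-free reads that all come from the same strand. The function names, the `min_overlap` parameter, and the example reads are hypothetical, and real assembly software is considerably more elaborate.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Greedily merge reads; returns the remaining sequence(s)."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        # Find the ordered pair with the longest suffix-prefix overlap.
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no remaining overlaps; leave the pieces separate
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [
            c for k, c in enumerate(contigs) if k not in (best_i, best_j)
        ] + [merged]
    return contigs


reads = ["GCATTGAC", "TGACCGTA", "CGTAGGCT"]
print(merge_reads(reads))  # ['GCATTGACCGTAGGCT']
```

If some reads share no overlap of at least `min_overlap` bases, the function returns several separate pieces, which corresponds to gaps in the master sequence.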

Next-generation DNA sequencing technology

The new DNA sequencing technologies are more advanced than the first-generation technologies. They allow many DNA fragments to be sequenced together, and they are faster and more cost-effective than earlier techniques. Advances in bioinformatics have supported the next-generation technologies by providing increased data storage and by making it possible to analyze and manipulate very large data sets, often in the gigabase range.

Applications of DNA sequencing

In molecular biology, DNA sequencing is used to study genomes and the proteins they encode. Researchers use sequence information to identify changes in genes and non-coding DNA, find associations with diseases and phenotypes, and identify potential drug targets.

In evolutionary biology, DNA sequencing is used to study how different organisms are related and how they evolved. In virology, sequencing is one of the main tools for identifying and studying viruses. The Sanger-Coulson method of DNA sequencing, along with traditional methods, is used to sequence viruses in basic and clinical research as well as for the diagnosis of new viral infections. During the 1990 avian influenza outbreak, viral sequencing showed that the influenza sub-type originated through reassortment between quail and poultry.

In the medical field, DNA sequencing is used to diagnose and treat rare diseases. Through DNA sequencing, clinicians can identify genetic diseases, improve disease management, and provide improved therapies. DNA sequencing can also allow more precise antibiotic treatments. For forensic identification and paternity testing, DNA sequencing may be used along with DNA profiling methods.

Conclusion

DNA sequencing determines the order of the four bases in DNA, i.e., adenine, guanine, thymine, and cytosine. The nucleotide sequence is the basis for understanding a gene or genome, as it contains the instructions for building an organism.

The Sanger-Coulson method of DNA sequencing is a chain-termination method and requires single-stranded DNA. In this method, DNA chains are synthesized on a template strand. The main principle of the Sanger-Coulson method is the use of dideoxynucleotide triphosphates as chain terminators.

In 1976-1977, Allan Maxam and Walter Gilbert developed a method of DNA sequencing, which is based on the chemical modification of DNA and subsequent cleavage at specific bases.

The new DNA sequencing technologies are more advanced than the first-generation technologies. They allow many DNA fragments to be sequenced together and are faster and more cost-effective than earlier techniques.

DNA sequencing has many applications in molecular biology, virology, evolutionary biology, and medicine. The Sanger-Coulson method of DNA sequencing, along with traditional methods, is used to sequence viruses in basic and clinical research as well as for the diagnosis of new viral infections. In the medical field, DNA sequencing is used to diagnose and treat rare diseases. Through DNA sequencing, clinicians can identify genetic diseases, improve disease management, and provide improved therapies.
