DNA Beyond the Double Helix: Structural Variants and Unusual Sequences

In this article, I will briefly describe the different structural variants of DNA and some unusual sequences beyond the double helix.

The DNA Double Helix

The journey of understanding DNA began in 1868, when Friedrich Miescher first isolated and characterized a substance he called nuclein, now known as DNA. In the late 1940s, Erwin Chargaff and his colleagues discovered that the four nucleotide bases (adenine, thymine, guanine, and cytosine) occur in different ratios in different organisms, yet the amount of adenine always equals that of thymine and the amount of guanine always equals that of cytosine. A breakthrough came in 1953, when James Watson and Francis Crick proposed the iconic double helix model of DNA structure (Figure 1, B-DNA). This model described DNA as a right-handed helix formed by two complementary strands held together by hydrogen bonds between base pairs and by base-stacking interactions. The negative charges on the phosphate backbone are neutralized by metal cations, which contributes to the overall stability of the molecule. Central to this structure is the concept of complementarity, which ensures accurate storage and transfer of genetic information.

The Different Three-Dimensional Forms of DNA

DNA displays remarkable structural flexibility, adopting alternative helical forms and unusual shapes depending on its sequence and surrounding conditions. The ability of the sugar-phosphate backbone to rotate around several types of bonds, combined with thermal fluctuations, can lead to bending, stretching, or even strand separation. While many structural deviations from the classic Watson-Crick model exist within cellular DNA, these changes typically do not alter the key properties defined by Watson and Crick: strand complementarity, antiparallel strands, and A=T and G≡C base pairing.

Structural variations in DNA arise mainly from three sources: the different conformations that the deoxyribose sugar can adopt, rotation around the contiguous bonds of the phosphodeoxyribose backbone, and free rotation around the C-1′-N-glycosyl bond connecting the base to the sugar. Because of steric constraints, purine bases in nucleotides can assume two stable positions relative to the sugar, known as the syn and anti conformations. In contrast, pyrimidines are usually confined to the anti conformation because of steric clashes between the sugar and the carbonyl oxygen at the C-2 position of the pyrimidine ring.

The B Form of DNA

B-DNA, or B-form DNA, is the most stable and predominant conformation of DNA under physiological conditions (e.g., normal salt and pH levels). It is the canonical form described by Watson and Crick in 1953, and it serves as the standard reference structure in most studies of DNA properties (Figure 1). B-DNA is a right-handed double helix with approximately 10.5 base pairs per turn, and it features major and minor grooves that are important for protein-DNA interactions.

A and Z Forms of DNA

The A-form of DNA is a right-handed double helix that is slightly wider than the B-form and contains 11 base pairs per turn (Figure 1). This form typically arises in environments with low water content. In A-DNA, the base pairs are tilted by about 20° compared to those in B-DNA, meaning they are not perfectly perpendicular to the helix axis. This tilt results in a deeper major groove and a shallower minor groove. Because the chemicals used for DNA crystallization often remove water, short DNA fragments usually crystallize in the A-form.

The Z-form of DNA is characterized by a left-handed helical twist, unlike the right-handed twists seen in A-DNA and B-DNA. It contains 12 base pairs per turn and has a slimmer, more extended appearance compared to the other forms. Its sugar-phosphate backbone forms a distinctive zigzag pattern (Figure 1). Sequences in which pyrimidines alternate with purines, especially alternating C and G residues, are the most likely to adopt the Z-DNA structure. In this form, purine bases shift to the syn conformation, alternating with pyrimidines in the anti conformation, to support the left-handed helix. The major groove is almost absent in the Z-form, while the minor groove is narrow and deep. Short segments of Z-DNA have been identified in both bacterial and eukaryotic cells. These regions may be involved in regulating gene expression or facilitating genetic recombination.

Figure 1: The A, B, and Z forms of DNA

The Properties of the Three Forms of DNA

The properties of the A, B, and Z forms of DNA are compared in the table below.

| Property | A form | B form | Z form |
| --- | --- | --- | --- |
| Base pairs per helical turn | 11 | 10.5 | 12 |
| Diameter | ~26 Å | ~20 Å | ~18 Å |
| Helical sense | Right-handed | Right-handed | Left-handed |
| Helix rise per base pair | 2.6 Å | 3.4 Å | 3.7 Å |
| Base tilt normal to the helix axis | 20° | 6° | 7° |
| Glycosyl bond conformation | Anti | Anti | Anti for pyrimidines; syn for purines |
| Sugar pucker conformation | C-3′ endo | C-2′ endo | C-2′ endo for pyrimidines; C-3′ endo for purines |
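
Two further quantities follow directly from the values in the table: the helical pitch (the length of one full turn, equal to base pairs per turn times rise per base pair) and the average twist per base pair (360° divided by base pairs per turn). The short Python sketch below works these out. The function and variable names are illustrative only, and the calculation assumes an ideal, perfectly straight helix.

```python
# Helix parameters taken from the table above:
# (base pairs per helical turn, helix rise per base pair in angstroms).
HELIX_PARAMS = {
    "A": (11.0, 2.6),
    "B": (10.5, 3.4),
    "Z": (12.0, 3.7),
}

def helical_pitch(form):
    """Length of one full helical turn, in angstroms."""
    bp_per_turn, rise = HELIX_PARAMS[form]
    return bp_per_turn * rise

def twist_per_base_pair(form):
    """Average rotation per base pair in degrees (magnitude only; handedness is ignored)."""
    bp_per_turn, _ = HELIX_PARAMS[form]
    return 360.0 / bp_per_turn

def duplex_length(form, n_base_pairs):
    """Approximate length in angstroms of an ideal straight helix of n_base_pairs."""
    _, rise = HELIX_PARAMS[form]
    return n_base_pairs * rise

for form in HELIX_PARAMS:
    print(f"{form}-DNA: pitch {helical_pitch(form):.1f} Å, "
          f"twist {twist_per_base_pair(form):.1f}° per base pair")
```

Under these assumptions, one turn of B-DNA spans about 35.7 Å, and the same 100-base-pair fragment would be roughly 260 Å long in the A-form, 340 Å in the B-form, and 370 Å in the Z-form.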

Unusual Structures of Certain DNA Sequences

DNA is not always a uniform, regular helix. Its sequence can cause significant structural variations. For instance, stretches of four or more consecutive adenosines (A-tracts) in one strand can induce localized bends in the DNA helix. Remarkably, a run of six adenosines can produce a bend of approximately 18°. These sequence-dependent bends are not random anomalies but may play crucial roles in biological processes, particularly in facilitating the binding of certain proteins to DNA. Such unusual sequences highlight the dynamic and flexible nature of DNA beyond the classic Watson-Crick model.
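
As a rough illustration, runs of four or more adenosines can be located with a simple pattern search. The Python sketch below is a minimal example rather than a bend predictor: the function name `find_a_tracts` is hypothetical, and it scans only the strand it is given.

```python
import re

def find_a_tracts(seq, min_len=4):
    """Return (start, length) for each run of at least min_len consecutive A's.

    Positions are 0-based and refer to the strand passed in.
    """
    pattern = re.compile("A{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]

# A made-up test sequence: one run of six A's, one run of three, one run of four.
print(find_a_tracts("GCAAAAAAGCTAAAGCAAAA"))
```

Only the runs of six and four adenosines are reported; the run of three falls below the threshold. To cover A-tracts on the complementary strand, the same search could be repeated for runs of T.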

A Palindrome

A common feature of DNA sequences is the palindrome: a stretch of nucleotides that reads the same in the 5′ to 3′ direction on one strand as in the 5′ to 3′ direction on the complementary strand (Figure 2). In molecular biology, the term refers to inverted repeats, regions in which a sequence on one strand is repeated in reverse orientation on the other strand, so that each strand is self-complementary. These palindromic sequences enable the formation of secondary structures such as hairpins in single-stranded DNA and cruciforms in double-stranded DNA.

Figure 2: Palindromes and mirror repeats in double-stranded DNA

In contrast, mirror repeats are sequences that exhibit symmetry within a single strand but lack self-complementarity. Therefore, they cannot form typical hairpin or cruciform structures. Both palindromic and mirror repeat motifs are widely distributed in genomic DNA and can span from a few base pairs to several thousand. Notably, self-complementary regions in single-stranded DNA can spontaneously fold into complex secondary structures, influencing biological functions such as replication, recombination, and gene regulation.
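
The difference between the two kinds of repeat can be stated as two simple tests on a single strand: a palindrome equals its own reverse complement, whereas a mirror repeat equals its own reverse. The Python sketch below is a minimal illustration; the function names are hypothetical, and only the four standard bases are handled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Sequence of the complementary strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def is_palindrome(seq):
    """True if the strand reads the same as its complementary strand (inverted repeat)."""
    seq = seq.upper()
    return seq == reverse_complement(seq)

def is_mirror_repeat(seq):
    """True if the strand is symmetric within itself (reads the same backwards)."""
    seq = seq.upper()
    return seq == seq[::-1]

# GAATTC is self-complementary; AGGTTGGA is symmetric but not self-complementary.
for s in ("GAATTC", "AGGTTGGA"):
    print(s, "palindrome:", is_palindrome(s), "mirror repeat:", is_mirror_repeat(s))
```

Here GAATTC passes only the palindrome test and AGGTTGGA passes only the mirror-repeat test, which is why the first could in principle pair with itself while the second cannot.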

Hairpins and Cruciforms

Hairpin and cruciform structures are non-standard DNA formations that result from the presence of inverted repeat sequences. A hairpin (Figure 3) forms when a single DNA strand folds back and pairs with itself, creating a stem made of complementary base pairs and a loop of unpaired bases. This structure is often seen in single-stranded DNA or RNA, especially during replication or transcription.

Figure 3: A hairpin DNA

A cruciform structure (Figure 4), on the other hand, develops in double-stranded DNA that contains an inverted repeat. Each strand folds back on itself into a hairpin, and the two opposing hairpins produce a cross-like shape. Cruciform structures usually form under stress conditions such as negative supercoiling and can affect DNA function by altering stability or interacting with specific proteins.

Figure 4: A cruciform of a duplex DNA
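
To make the link between inverted repeats and hairpins concrete, the Python sketch below searches one strand for a short segment followed, after a small spacer, by its reverse complement. It is a naive illustration only: the function name and the default stem and loop sizes are assumptions chosen for the example, only exact matches are considered, and no thermodynamic stability is evaluated.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Sequence of the complementary strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def find_hairpin_candidates(seq, stem_len=4, min_loop=3, max_loop=8):
    """Return (stem_start, stem, loop) for each exact inverted repeat.

    A hit means seq contains stem + loop + reverse_complement(stem),
    so the two stem segments could base-pair and leave the loop unpaired.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem_len - min_loop + 1):
        stem = seq[i:i + stem_len]
        target = reverse_complement(stem)
        for loop_len in range(min_loop, max_loop + 1):
            j = i + stem_len + loop_len  # start of the candidate pairing segment
            if j + stem_len > len(seq):
                break
            if seq[j:j + stem_len] == target:
                hits.append((i, stem, seq[i + stem_len:j]))
    return hits

# Made-up example: GGCA and TGCC can pair, leaving TTTT as the loop.
print(find_hairpin_candidates("GGCATTTTTGCC"))
```

In a duplex, the same inverted repeat is present on both strands, so a hit on one strand also marks a site where a cruciform could form.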

Hoogsteen Pairing

Several unusual DNA structures are formed from three or even four DNA strands. In triplex DNA, a third DNA strand binds to the major groove of a standard Watson–Crick base-paired duplex, forming additional hydrogen bonds. This interaction typically involves Hoogsteen or reverse Hoogsteen base pairing.

In such structures, the guanine (G) of a standard G≡C pair can form additional hydrogen bonds with a cytosine from the third strand, protonated at N3 (C⁺), resulting in a C⁺·G≡C triplet. Similarly, adenine (A) in an A=T pair can bind a thymine (T) from the third strand (Figure 5), forming a T·A=T triplet. In the figure, the Hoogsteen pairing in each case is shown in green colour.

Figure 5: The base pairing patterns in one form of triplex DNA

The N7, O6 (for guanine) and N7, N6 (for adenine) atoms on purines—located in the major groove—participate in these additional hydrogen bonds. These non-canonical base pairings are collectively referred to as Hoogsteen interactions, named after Karst Hoogsteen, who first described this alternative hydrogen-bonding geometry in 1963. Hoogsteen pairing allows the formation of triplex DNAs. Some triplex DNAs contain two pyrimidine strands and one purine strand, while others contain two purine strands and one pyrimidine strand.

The DNA Tetraplex

Four DNA strands can also pair to form a structure known as a tetraplex. This occurs readily only in DNA sequences that contain a very high proportion of guanosine residues (Figure 6). The resulting guanosine tetraplex, or G-tetraplex, is notably stable across a wide range of conditions.

Figure 6: The guanosine tetraplex

In cellular DNA, the binding sites for many sequence-specific DNA-binding proteins often appear as palindromic sequences. Additionally, polypurine or polypyrimidine stretches—capable of forming triple-helical DNA structures—are commonly found in regions that regulate the expression of certain eukaryotic genes. Theoretically, synthetic DNA strands could be engineered to bind to these specific sequences, forming triplex DNA and potentially interfering with gene expression.
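
The idea of designing a strand to target such a stretch can be sketched in a few lines of Python, using the triplets described earlier (T·A=T and C⁺·G≡C). Everything here is illustrative: the function names are hypothetical, the minimum tract length of 10 is an arbitrary threshold, and the sketch assumes a pyrimidine third strand written in the same 5′ to 3′ direction as the purine tract (a parallel orientation that the article itself does not specify). The need for cytosine protonation is not modelled.

```python
import re

# Third-strand partners from the triplets described above:
# T pairs with the A of an A=T pair; protonated C pairs with the G of a G≡C pair.
THIRD_STRAND_PARTNER = {"A": "T", "G": "C"}

def find_polypurine_tracts(seq, min_len=10):
    """Return (start, tract) for each run of at least min_len purines (A or G)."""
    pattern = re.compile("[AG]{%d,}" % min_len)
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]

def pyrimidine_third_strand(purine_tract):
    """Candidate third strand, written in the same direction as the purine tract
    (assumed parallel orientation)."""
    return "".join(THIRD_STRAND_PARTNER[base] for base in purine_tract.upper())

# Made-up sequence containing a 12-nucleotide purine run.
for start, tract in find_polypurine_tracts("CTAGGAGAAGGAGACT"):
    print(start, tract, "->", pyrimidine_third_strand(tract))
```

A real design would also have to account for pH, strand orientation, and binding specificity, none of which this sketch addresses.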

Conclusion

James Watson and Francis Crick proposed the iconic double helix model of DNA structure. This model described DNA as a right-handed helix formed by two complementary strands held together by hydrogen bonds between base pairs and base-stacking interactions.

DNA displays remarkable structural flexibility, adopting alternative helical forms and unusual shapes depending on its sequence and surrounding conditions. The ability of the sugar-phosphate backbone to rotate around several types of bonds, combined with thermal fluctuations, can lead to bending, stretching, or even strand separation. B-DNA, also known as the B-form DNA, is the most stable and predominant conformation of DNA under physiological conditions. The A-form of DNA is a right-handed double helix that is slightly wider than the B-form and contains 11 base pairs per turn. The Z-form of DNA is characterized by a left-handed helical twist, unlike the right-handed twists seen in A-DNA and B-DNA.

DNA is not always a uniform, regular helix. Its sequence can cause significant structural variations. A common feature of DNA sequences is the presence of palindromes—stretches of nucleotides that read the same in the 5′ to 3′ direction on one strand and in the 5′ to 3′ direction on the complementary strand. Several unusual DNA structures can form from three or even four strands. In triplex DNA, a third strand binds to the major groove of a standard Watson–Crick duplex through Hoogsteen or reverse Hoogsteen hydrogen bonding.
