
Eukaryotic Transcription


MN Editors

Eukaryotic Transcription Introduction

Although the basic mechanism of RNA synthesis is the same in prokaryotes and eukaryotes, the process is considerably more complex in eukaryotes. In eukaryotes, RNA is synthesized in the nucleus, and most RNAs that encode proteins must be transported to the cytoplasm to be translated on ribosomes. There is evidence that some translation occurs in the nucleus, but the vast majority of translation takes place in the cytoplasm.

Prokaryotic mRNAs often contain the coding regions of several genes; such mRNAs are said to be multigenic. In contrast, most of the eukaryotic transcripts that have been studied contain the coding region of a single gene (are monogenic). Nevertheless, up to one-fourth of the transcription units in the small worm Caenorhabditis elegans may be multigenic. Clearly, eukaryotic mRNAs can be either monogenic or multigenic. Five different RNA polymerases exist in eukaryotes, and each enzyme transcribes a particular class of genes. In addition, most eukaryotic primary transcripts of genes that encode polypeptides undergo three major modifications before they are transported to the cytoplasm for translation.

  1. 7-Methyl guanosine caps are added to the 5′ ends of the primary transcripts.
  2. Poly(A) tails are added to the 3′ ends of the transcripts; these ends are generated by cleavage rather than by termination of chain extension.
  3. When present, intron sequences are spliced out of the transcripts.

The 5′ cap found on most eukaryotic mRNAs is a 7-methyl guanosine residue joined to the first nucleoside of the transcript by a 5′-5′ triphosphate linkage. The 3′ poly(A) tail is a polyadenosine tract 20 to 200 nucleotides long. In eukaryotes, the population of primary transcripts in a nucleus is called heterogeneous nuclear RNA (hnRNA) because of the large variation in the sizes of the molecules present. Major portions of these hnRNAs are noncoding intron sequences, which are excised from the primary transcripts and degraded in the nucleus.


Thus, much of the hnRNA consists of pre-mRNA molecules undergoing various processing events before leaving the nucleus. In addition, eukaryotic RNA transcripts are coated with RNA-binding proteins during or immediately after their synthesis. These proteins protect gene transcripts from degradation by ribonucleases, enzymes that break down RNA molecules, during processing and transport to the cytoplasm. The average half-life of a gene transcript in eukaryotes is about five hours, compared with an average half-life of only a few minutes in E. coli. This enhanced stability of gene transcripts in eukaryotes is provided, at least in part, by their interactions with RNA-binding proteins.
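To make the effect of half-life concrete, the following minimal sketch computes how much of a transcript population survives over time. It assumes simple first-order (exponential) decay, which the text does not state, and the half-life values are illustrative placeholders rather than measurements. The function name fraction_remaining is hypothetical.

```python
def fraction_remaining(elapsed_hours: float, half_life_hours: float) -> float:
    """Fraction of transcripts left after elapsed_hours, assuming first-order decay."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed_hours / half_life_hours)


# Illustrative half-lives only (in hours); real values differ between transcripts.
LONG_LIVED_HALF_LIFE_H = 5.0          # a stable, protein-coated eukaryotic transcript
SHORT_LIVED_HALF_LIFE_H = 5.0 / 60.0  # a hypothetical transcript lasting five minutes

for label, half_life in [("long-lived", LONG_LIVED_HALF_LIFE_H),
                         ("short-lived", SHORT_LIVED_HALF_LIFE_H)]:
    print(label, fraction_remaining(1.0, half_life))
```

After one hour most of the long-lived population is still present, while the short-lived population has been through twelve half-lives and is almost entirely gone.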

RNA polymerase

Structure and molecular function of RNA polymerase

RNA polymerases I, II, and III are structurally very similar and are composed of many subunits. They contain two large catalytic subunits that are similar to the β and β′ subunits of bacterial RNA polymerase. The structures of RNA polymerases from several species have been determined by X-ray crystallography. There is a remarkable similarity between the bacterial enzyme and its eukaryotic counterparts. Figure 12.12a compares the structure of a bacterial RNA polymerase with that of RNA polymerase II from yeast.

As you can see, both enzymes have a very similar structure. This structure also provides a way to visualize how transcription works. As shown in Figure 12.12b, the DNA enters the enzyme through the jaw and lies on a surface within RNA polymerase called the bridge. The part of the enzyme called the clamp is thought to control the movement of the DNA through RNA polymerase. A wall in the enzyme forces the RNA-DNA hybrid to make a right-angle turn. This bend makes it easier for nucleotides to bind to the template strand. Mg2+ is located at the catalytic site, which lies precisely at the 3′ end of the growing RNA strand. Nucleoside triphosphates (NTPs) enter the catalytic site through a pore. The correct nucleotide binds to the template DNA and is covalently attached to the 3′ end of the RNA. As RNA polymerase slides along the template, a rudder, located about 9 bp from the 3′ end of the RNA, forces the RNA-DNA hybrid apart. The DNA and the single-stranded RNA then exit under a small lid.

Figure Description: Structure and molecular function of RNA polymerase. (a) A comparison of the crystal structures of a bacterial RNA polymerase (left) to a eukaryotic RNA polymerase II (right). The bacterial enzyme is from Thermus aquaticus. The eukaryotic enzyme is from Saccharomyces cerevisiae. (b) A mechanism for transcription based on the crystal structure. In this diagram, the direction of transcription is from left to right. The double-stranded DNA enters the polymerase along a bridge surface that is between the jaw and clamp. At a region termed the wall, the RNA-DNA hybrid is forced to make a right-angle turn, which enables nucleotides to bind to the template strand. Mg2+ is located at the catalytic site. Nucleoside triphosphates (NTPs) enter the catalytic site via a pore region and bind to the template DNA. At the catalytic site, the nucleotides are covalently attached to the 3′ end of the RNA. As RNA polymerase slides down the template, a small region of the protein termed the rudder separates the RNA-DNA hybrid. The DNA and single-stranded RNA then exit under a small lid.

Function of RNA polymerase

The genetic material in the nucleus of a eukaryotic cell is transcribed by three different RNA polymerase enzymes, designated RNA polymerase I, II, and III. What are the roles of these enzymes? Each of the three RNA polymerases transcribes a different category of genes.

  • RNA polymerase I: Transcribes all of the genes for ribosomal RNA (rRNA) except the 5S rRNA.
  • RNA polymerase II: Transcribes all protein-encoding genes and is therefore responsible for the synthesis of all mRNAs. It also transcribes the genes for most snRNAs, which are needed for RNA splicing. In addition, it transcribes several types of genes that produce other non-coding RNAs, including most long non-coding RNAs, microRNAs, and snoRNAs.
  • RNA polymerase III: Transcribes all tRNA genes and the 5S rRNA gene. To a lesser extent than RNA polymerase II, it also transcribes a few genes that produce other non-coding RNAs, such as snRNAs, microRNAs, long non-coding RNAs, and snoRNAs.

Components Required for Eukaryotic Transcription

RNA polymerase II

The enzyme that catalyzes the linkage of ribonucleotides in the 5′ to 3′ direction, using DNA as a template. Most eukaryotic RNA polymerase II proteins are composed of 12 subunits. The two largest subunits are structurally similar to the β and β′ subunits of E. coli RNA polymerase.
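As a rough illustration of template-directed synthesis, the sketch below builds an RNA in the 5′ to 3′ direction from a DNA template strand written 3′ to 5′. It models base pairing only, not the enzyme itself, and the function name transcribe and the example sequence are made up for illustration.

```python
# Watson-Crick pairing from a DNA template base to the RNA base it specifies.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') specified by a template strand written 3'->5'."""
    rna = []
    for base in template_3_to_5.upper():
        if base not in DNA_TO_RNA:
            raise ValueError(f"unexpected base: {base!r}")
        rna.append(DNA_TO_RNA[base])  # each new nucleotide is joined to the 3' end
    return "".join(rna)


print(transcribe("TACGGATTC"))  # AUGCCUAAG
```

The resulting RNA has the same sequence as the nontemplate DNA strand, with U in place of T.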

General transcription factors

  • TFIID: Composed of TATA-binding protein (TBP) and other TBP-associated factors (TAFs). It recognizes the TATA box of the promoters of eukaryotic protein-encoding genes.
  • TFIIB: Binds to TFIID and then enables RNA polymerase II to bind to the core promoter. It also promotes TFIIF binding.
  • TFIIF: Binds to RNA polymerase II and plays a role in its ability to bind to TFIIB and the core promoter. It also plays a role in the ability of TFIIE and TFIIH to bind to RNA polymerase II.
  • TFIIE: Plays a role in the formation or the maintenance (or both) of the open complex. It may exert its effects by facilitating the binding of TFIIH to RNA polymerase II and by regulating the activity of TFIIH.
  • TFIIH: A multisubunit protein that has multiple roles. First, certain subunits act as helicases and promote the formation of the open complex. Other subunits phosphorylate the carboxyl terminal domain (CTD) of RNA polymerase II, which releases its interaction with TFIIB and thereby allows RNA polymerase II to proceed to the elongation stage.


Mediator

A multisubunit complex that mediates the effects of regulatory transcription factors on the function of RNA polymerase II. Although mediator typically has certain core subunits, many of its subunits vary with the cell type and environmental conditions. The ability of mediator to affect the function of RNA polymerase II is thought to occur via the CTD of RNA polymerase II. Mediator can influence the ability of TFIIH to phosphorylate the CTD, and subunits within mediator itself are also able to phosphorylate the CTD. Because CTD phosphorylation is needed to release RNA polymerase II from TFIIB, mediator plays a key role in the ability of RNA polymerase II to switch from the initiation stage to the elongation stage of transcription.

Initiation of Eukaryotic Transcription

Unlike their prokaryotic counterparts, eukaryotic RNA polymerases cannot initiate transcription by themselves. All five eukaryotic RNA polymerases require the assistance of protein factors to start the synthesis of an RNA chain. These transcription factors must bind to a promoter region in the DNA and form an appropriate initiation complex before RNA polymerase can bind and begin transcription. Different promoters and transcription factors are used by the different RNA polymerases. In this section we focus on the initiation of pre-mRNA synthesis by RNA polymerase II, which transcribes the vast majority of eukaryotic genes.

Structure of a promoter recognized by RNA polymerase II. The TATA and CAAT boxes are located at about the same positions in the promoters of most nuclear genes encoding proteins. The GC and octamer boxes may be present or absent; when present, they occur at many different locations, either singly or in multiple copies. The sequences shown here are the consensus sequences for each of the promoter elements. The conserved promoter elements are shown at their locations in the mouse thymidine kinase gene.

In all cases, the initiation of transcription involves the formation of a locally unwound segment of DNA, which provides a DNA strand that is free to serve as a template for the synthesis of a complementary strand of RNA. Forming this locally unwound segment requires the interaction of several transcription factors with specific sequences in the promoter of the transcription unit. The promoters recognized by RNA polymerase II consist of short conserved elements, or modules, located upstream from the transcription start point.

Other promoters that are recognized by RNA polymerase II contain some, but not all, of these elements. The conserved element closest to the transcription start site (position +1) is called the TATA box. It has the consensus sequence TATAAAA (reading 5′ to 3′ on the nontemplate strand) and is centered at about position −30. The TATA box plays an important role in positioning the transcription start point. The second conserved element is called the CAAT box. It usually occurs near position −80 and has the consensus sequence GGCCAATCT. Two other conserved elements, the GC box (consensus GGGCGG) and the octamer box (consensus ATTTGCAT), are often present in RNA polymerase II promoters; they influence the efficiency with which a promoter initiates transcription.
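The consensus sequences above can be used to scan an upstream region for candidate promoter elements. The following is a minimal sketch that looks only for exact consensus matches on the nontemplate strand; real promoter elements often deviate from the consensus, so this is an illustration rather than a promoter-prediction method. The function name find_elements and the demo sequence are hypothetical.

```python
from typing import List, Tuple

# Consensus sequences quoted in the text (nontemplate strand, 5' -> 3').
PROMOTER_ELEMENTS = {
    "TATA box": "TATAAAA",
    "CAAT box": "GGCCAATCT",
    "GC box": "GGGCGG",
    "octamer box": "ATTTGCAT",
}


def find_elements(upstream: str) -> List[Tuple[str, int]]:
    """Locate exact consensus matches in the sequence just 5' of the start site.

    The last base of `upstream` is taken as position -1. Each hit is returned
    as (element name, position of the element's first base), sorted by position.
    """
    seq = upstream.upper()
    hits = []
    for name, consensus in PROMOTER_ELEMENTS.items():
        start = seq.find(consensus)
        while start != -1:
            hits.append((name, start - len(seq)))
            start = seq.find(consensus, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Made-up 100-nt upstream region with a CAAT box at -80 and a TATA box at -30.
demo = "C" * 20 + "GGCCAATCT" + "C" * 41 + "TATAAAA" + "C" * 23
print(find_elements(demo))  # [('CAAT box', -80), ('TATA box', -30)]
```

A scoring approach such as a position weight matrix would tolerate mismatches, but exact matching keeps the coordinate bookkeeping easy to follow.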


Try Solve It: Initiation of Transcription by RNA Polymerase II in Eukaryotes, to see how these conserved promoter sequences function in the human HBB (β-globin) gene.

The initiation of transcription by RNA polymerase II requires the assistance of several basal transcription factors. Additional transcription factors and regulatory sequences, called enhancers and silencers, modulate the efficiency of initiation. The basal transcription factors must interact with promoters in the correct order to initiate transcription effectively. Each basal transcription factor is denoted TFIIX (Transcription Factor X for RNA polymerase II, where X is a letter identifying the individual factor).

TFIID is the first basal transcription factor to interact with the promoter. It contains a TATA-binding protein (TBP) and several small TBP-associated proteins. Next, TFIIA joins the complex, followed by TFIIB. TFIIF first associates with RNA polymerase II, and then TFIIF and RNA polymerase II join the transcription initiation complex together. TFIIF has two subunits, one of which has DNA-unwinding activity. Thus, TFIIF probably catalyzes the localized unwinding of the DNA double helix that is required to initiate transcription. TFIIE then joins the initiation complex, binding to the DNA downstream from the transcription start point. Two other factors, TFIIH and TFIIJ, join the complex after TFIIE, but their positions within the complex remain unclear. TFIIH has helicase activity and travels with RNA polymerase II during elongation, unwinding the DNA strands in the region of transcription (the “transcription bubble”).

RNA polymerases I and III initiate transcription by processes that are similar to, but somewhat simpler than, the one used by polymerase II, whereas the processes used by RNA polymerases IV and V are still being investigated. The promoters of genes transcribed by polymerases I and III are quite different from those used by polymerase II, even though they sometimes contain some of the same regulatory elements. RNA polymerase I promoters are bipartite, with a core sequence extending from about −45 to +20 and an upstream control element extending from −180 to about −105. The two regions have similar sequences, and both are GC-rich. The core sequence is sufficient for initiation; however, the efficiency of initiation is strongly enhanced by the presence of the upstream control element. Interestingly, the promoters of most of the genes transcribed by RNA polymerase III are located within the transcription units, downstream from the transcription start points, rather than upstream as in the units transcribed by RNA polymerases I and II. The promoters of the other genes transcribed by polymerase III are located upstream of the transcription start site, just as for polymerases I and II. In fact, polymerase III promoters can be divided into three classes, two of which have promoters located within the transcription unit.

Elongation Of Eukaryotic Transcription And The Addition Of 5′ Methyl Guanosine Caps

Once eukaryotic RNA polymerases have been released from their initiation complexes, they catalyze RNA chain elongation by the same mechanism as the RNA polymerases of prokaryotes. Studies of the crystal structures of various RNA polymerases have provided a good picture of the key features of this important enzyme. Although the RNA polymerases of bacteria, archaea, and eukaryotes differ in their substructures, their main properties and mechanisms of action are remarkably similar.

The crystal structure of RNA polymerase II from S. cerevisiae has been determined at a resolution of 0.28 nm, and it supports a schematic picture of the structural features of an RNA polymerase and its interactions with DNA and the growing RNA transcript.

Early in the elongation process, the 5′ ends of eukaryotic pre-mRNAs are modified by the addition of 7-methyl guanosine (7-MG) caps. These 7-MG caps are added when the growing RNA chains are only about 30 nucleotides long.

The 7-MG cap contains an unusual 5′-5′ triphosphate linkage (see figure 11.12) and two or three methyl groups. These 5′ caps are added co-transcriptionally by a dedicated biosynthetic pathway. The 7-MG caps are recognized by protein factors involved in the initiation of translation, and they also help protect the growing RNA chains from degradation by nucleases.

Pathway of biosynthesis of the 7-MG cap.
7-Methyl guanosine (7-MG) caps are added to the 5′ ends of pre-mRNAs shortly after the elongation process begins.

Recall that eukaryotic genes are present in chromatin, which is organized into nucleosomes. How does RNA polymerase transcribe DNA that is packaged in nucleosomes? Do nucleosomes need to be disassembled before the DNA within them can be transcribed? Surprisingly, RNA polymerase II is able to move past nucleosomes with the help of a protein complex called FACT (facilitates chromatin transcription), which removes histone H2A/H2B dimers from the nucleosomes and leaves histone “hexasomes.”

After polymerase II moves past the nucleosome, FACT and other accessory proteins help redeposit the histone dimers, restoring the nucleosome structure. It is also worth noting that chromatin containing genes that are being actively transcribed has a less compact structure than chromatin containing inactive genes. Chromatin in which active genes are packaged tends to contain histones with large numbers of acetyl groups, whereas chromatin with inactive genes contains histones with fewer acetyl groups.

Termination of Eukaryotic Transcription By Chain Cleavage And The Addition Of 3′ Poly(A) Tails

The 3′ ends of the transcripts synthesized by RNA polymerase II are produced by endonucleolytic cleavage of the primary transcripts, not by the termination of transcription. The actual transcription termination events often occur at multiple sites located 1000 to 2000 nucleotides downstream of the site that will become the 3′ end of the mature transcript. That is, transcription proceeds beyond the site that will become the 3′ terminus, and the distal segment is removed by endonucleolytic cleavage. The cleavage event that produces the 3′ end of a transcript usually occurs at a site 11 to 30 nucleotides downstream of a conserved polyadenylation signal, consensus AAUAAA, and upstream of a GU-rich sequence located near the end of the transcript.

After cleavage, the enzyme poly(A) polymerase adds poly(A) tails, tracts of adenosine monophosphate residues about 200 nucleotides long, to the 3′ ends of the transcripts. The addition of poly(A) tails to eukaryotic mRNAs is called polyadenylation. To examine the polyadenylation signal of the human HBB (β-globin) gene, try Solve It: Formation of the 3′-Terminus of an RNA Polymerase II Transcript.
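The cleavage and polyadenylation steps can be sketched as a simple string operation: find the AAUAAA signal, cut a fixed distance downstream of it, and append a run of A residues. This is a toy model under stated assumptions. The cut offset is a hypothetical fixed value chosen from the 11 to 30 nucleotide window, the GU-rich element is not modeled, and the function name cleave_and_polyadenylate is made up.

```python
POLY_A_SIGNAL = "AAUAAA"


def cleave_and_polyadenylate(transcript: str, offset: int = 20,
                             tail_length: int = 200) -> str:
    """Cut `offset` nt downstream of the first AAUAAA and append a poly(A) tail.

    `offset` is an assumed fixed choice within the 11-30 nt window; in the cell
    the exact site also depends on the downstream GU-rich sequence.
    """
    if not 11 <= offset <= 30:
        raise ValueError("offset should lie in the 11-30 nt window")
    signal_start = transcript.find(POLY_A_SIGNAL)
    if signal_start == -1:
        raise ValueError("no AAUAAA signal found")
    cut = signal_start + len(POLY_A_SIGNAL) + offset
    if cut > len(transcript):
        raise ValueError("transcript ends before the cleavage site")
    # Everything distal to the cut, including the GU-rich region, is discarded.
    return transcript[:cut] + "A" * tail_length


# Made-up primary transcript that runs well past its future 3' end.
primary = "GCC" * 10 + "AAUAAA" + "C" * 20 + "GUGUGU" + "C" * 50
mature = cleave_and_polyadenylate(primary, tail_length=10)
print(len(primary), len(mature))  # 112 66
```

A short tail of 10 residues is used in the example only to keep the output readable; the default of 200 matches the approximate length given in the text.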

Poly(A) tails are added to the 3′ ends of transcripts by the enzyme poly(A) polymerase. The 3′-end substrates for poly(A) polymerase are produced by endonucleolytic cleavage of the transcript downstream from a polyadenylation signal, which has the consensus sequence AAUAAA.

The formation of poly(A) tails on transcripts requires a specificity component that recognizes and binds to the AAUAAA sequence, a stimulatory component that binds to the GU-rich sequence, an endonuclease, and poly(A) polymerase. These proteins form a multimeric complex that carries out both the cleavage and the polyadenylation in tightly coupled reactions. The poly(A) tails of eukaryotic mRNAs are thought to enhance their stability and to play an important role in their transport from the nucleus to the cytoplasm.

In contrast to RNA polymerase II, both RNA polymerase I and RNA polymerase III respond to discrete termination signals. RNA polymerase I terminates transcription in response to an 18-nucleotide sequence that is recognized by a terminator protein. RNA polymerase III responds to a termination signal that is similar to the rho-independent terminator in E. coli.

RNA processing

The eukaryotic primary transcript for mRNA is much larger than the mature mRNA and is located in the nucleus. It is also called heterogeneous nuclear RNA (hnRNA) or pre-mRNA. It goes through several processing steps to become mature mRNA.


Cleavage

Larger RNA precursors are cleaved to form smaller RNAs. The primary transcript is cut by a ribonuclease (an enzyme that cleaves RNA) to produce 7 tRNA precursors.

Capping and Tailing

First, a cap (consisting of 7-methyl guanosine, or 7-MG) is added at the 5′ end, and a poly(A) tail is added at the 3′ end. The cap is a chemically modified molecule of guanosine triphosphate (GTP).


Splicing

Eukaryotic primary mRNAs are made up of two kinds of segments: non-coding introns and coding exons. The introns are removed by a process known as RNA splicing, in which ATP is used to cut the RNA, releasing the introns and joining the adjacent exons to produce mature mRNA.
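At the sequence level, splicing amounts to deleting the intron intervals and joining what is left. The sketch below assumes the intron coordinates are already known and supplied as 0-based, half-open intervals; it does not model how the cell recognizes splice sites. The function name splice and the example sequence are hypothetical.

```python
from typing import List, Tuple


def splice(pre_mrna: str, introns: List[Tuple[int, int]]) -> str:
    """Remove introns given as (start, end) half-open intervals and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or end > len(pre_mrna) or start >= end:
            raise ValueError(f"bad intron interval: ({start}, {end})")
        exons.append(pre_mrna[position:start])  # exon lying before this intron
        position = end                          # resume after the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Toy pre-mRNA: exon 1, one 11-nt intron, exon 2.
pre = "AUGGCC" + "GUAAGUUUCAG" + "GAAUAA"
print(splice(pre, [(6, 17)]))  # AUGGCCGAAUAA
```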

Nucleotide Modifications

These modifications are most common in tRNA. They include methylation (e.g., methylcytosine, methylguanosine), deamination (e.g., inosine, derived from adenine), and the formation of dihydrouracil, pseudouracil, and so on. Post-transcriptional processing is necessary to convert the primary transcript into functional RNAs.

What Is RNA Editing?

According to the central dogma of molecular biology, genetic information flows from DNA to RNA to protein during gene expression. Normally, the genetic information is not altered in the mRNA intermediary. However, the discovery of RNA editing has shown that exceptions do occur. RNA editing processes alter the information content of gene transcripts in two ways:

  1. by changing the structures of individual bases and 
  2. by inserting or deleting uridine monophosphate residues.

RNA Editing by changing the structures of individual bases

The first type of RNA editing, which results in the substitution of one base for another, is rare. This type of editing was discovered in studies of the apolipoprotein-B (apo-B) genes and mRNAs in rabbits and humans. Apolipoproteins are blood proteins that transport certain types of fat molecules in the circulation. In the liver, the apo-B mRNA encodes a large protein that is 4563 amino acids long. In the intestine, the apo-B mRNA directs the synthesis of a protein that is only 2153 amino acids long. Here, a C residue in the pre-mRNA is converted to a U, which creates an internal UAA translation-termination codon and results in the truncated apolipoprotein.

UAA is one of the three codons that terminate polypeptide chains during translation. If a UAA codon is created within the coding region of an mRNA, it will prematurely terminate the polypeptide during translation, yielding an incomplete gene product. The C → U conversion is catalyzed by a sequence-specific RNA-binding protein that removes amino groups from cytosine residues. A similar example of RNA editing has been reported for an mRNA specifying a protein (the glutamate receptor) found in rat brain cells. More extensive mRNA editing of the C → U type occurs in the mitochondria of plants, where most of the gene transcripts are edited to some degree. Mitochondria have their own DNA genomes and protein-synthesizing machinery. In some transcripts present in plant mitochondria, most of the C’s are converted to U residues.
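The consequence of a single C → U edit can be shown with a toy reading frame. In the sketch below a CAA codon is edited to UAA, which shortens the number of codons read before the first stop. The sequence is made up and far shorter than the apo-B mRNA, and the helper names edit_c_to_u and codons_before_stop are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def edit_c_to_u(mrna: str, index: int) -> str:
    """Convert the C at 0-based `index` to U, mimicking deamination of cytosine."""
    if mrna[index] != "C":
        raise ValueError("editing site is not a C")
    return mrna[:index] + "U" + mrna[index + 1:]


def codons_before_stop(mrna: str) -> int:
    """Count in-frame codons, reading from position 0, before the first stop codon."""
    count = 0
    for i in range(0, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


unedited = "AUGGCACAAGGUUGA"           # AUG GCA CAA GGU UGA
edited = edit_c_to_u(unedited, 6)      # AUG GCA UAA GGU UGA
print(codons_before_stop(unedited))    # 4
print(codons_before_stop(edited))      # 2
```

The edited transcript is the same length as the original; only the position of the first in-frame stop codon changes, which is why the encoded protein is truncated.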

RNA Editing by inserting or deleting uridine monophosphate residues

A second, more complex type of RNA editing occurs in the mitochondria of trypanosomes (a group of flagellated protozoa that cause sleeping sickness in humans). In this case, uridine monophosphate residues are inserted into (and occasionally deleted from) gene transcripts, causing major changes in the polypeptides specified by the mRNA molecules. This editing is mediated by guide RNAs that are transcribed from distinct mitochondrial genes. The guide RNAs contain sequences that are partially complementary to the pre-mRNAs to be edited. Pairing between the guide RNAs and the pre-mRNAs leaves gaps with unpaired A residues in the guide RNAs. The guide RNAs then serve as templates for editing, as U’s are inserted into the gaps in the pre-mRNA molecules opposite the A’s in the guide RNAs.

Editing of the apolipoprotein-B mRNA in the intestines of mammals

Why do these RNA editing processes occur?

Why are the final nucleotide sequences of these mRNAs not specified by the sequences of the mitochondrial genes, as they are for most nuclear genes? So far, the answers to these intriguing questions are purely speculative. Trypanosomes are single-celled eukaryotes that diverged from other eukaryotes early in evolution. Some evolutionary biologists have suggested that RNA editing was common in ancient cells, in which many reactions are thought to have been catalyzed by RNA molecules rather than proteins. Another view is that RNA editing is a primitive mechanism for altering patterns of gene expression. Whatever the reason, RNA editing plays a major role in the expression of genes in the mitochondria of trypanosomes and plants.

Eukaryotic Genes Have a Core Promoter and Regulatory Elements

For transcription to occur at an appropriate rate, eukaryotic genes have two key features: a core promoter and regulatory elements. The figure shows a common pattern of sequences found in protein-encoding genes. The core promoter is a relatively short DNA sequence that is necessary for transcription to take place. It typically consists of a TATAAA sequence, called the TATA box, and the transcriptional start site, where transcription begins.

The TATA box, which is usually about 25 bp upstream from the transcriptional start site, is important in determining the precise starting point for transcription. If it is missing from the core promoter, the transcription start site becomes undefined, and transcription may begin at a variety of locations. The core promoter by itself produces only a low level of transcription. This is called basal transcription.

Regulatory elements are short DNA sequences that affect the ability of RNA polymerase to recognize the core promoter and begin transcription. They are recognized by transcription factors, proteins that influence the rate of transcription.

There are two categories of regulatory elements. Activating sequences, known as enhancers, are needed to stimulate transcription. In the absence of enhancer sequences, most eukaryotic genes have very low levels of basal transcription. Under certain conditions, it may also be necessary to prevent transcription of a given gene.

This occurs via silencers, DNA sequences that are recognized by transcription factors that inhibit transcription. A common location for regulatory elements is the −50 to −100 region. However, the locations of regulatory elements vary considerably among different eukaryotic genes. These elements can be far away from the core promoter and still strongly influence the ability of RNA polymerase to initiate transcription.

DNA sequences such as the TATA box, enhancers, and silencers exert their effects only on a particular gene. They are called cis-acting elements. The term “cis” comes from chemistry nomenclature and means “next to.” Cis-acting elements, although they may be far away from the core promoter, are always found on the same chromosome as the genes they regulate. By comparison, the regulatory transcription factors that bind to such elements are called trans-acting factors (the term trans means “across from”).

The transcription factors that control the expression of a gene are themselves encoded by genes. Regulatory genes that encode transcription factors may be far away from the genes they control, or even on a different chromosome. When a gene encoding a trans-acting factor is expressed, the transcription factor protein that is made can diffuse into the cell nucleus and bind to its appropriate cis-acting element. Now let’s turn our attention to the role of these proteins.

A common pattern for the promoter of protein-encoding genes recognized by RNA polymerase II. The start site usually occurs at adenine (A); two pyrimidines (Py: cytosine or thymine) and a cytosine (C) are to the left of this adenine, and five pyrimidines (Py) are to the right. A TATA box is approximately 25 bp upstream from the start site. However, the sequences that constitute eukaryotic promoters are quite diverse, and not all protein-encoding genes have a TATA box. Regulatory elements, such as GC or CAAT boxes, vary in their locations but are often found in the −50 to −100 region. The core promoters for RNA polymerase I and III are quite different. A single upstream regulatory element is involved in the binding of RNA polymerase I to its promoter, whereas two regulatory elements, called A and B boxes, facilitate the binding of RNA polymerase III.
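The start-site pattern described in the caption (two pyrimidines, C, A, then five pyrimidines) translates directly into a regular expression. The sketch below finds candidate start sites on the coding strand and reports how far upstream the nearest TATAAA lies. It is only an illustration of the pattern: exact matching is an assumption, many real promoters lack one or both elements, and the function name candidate_start_sites and the demo sequence are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Py Py C A Py Py Py Py Py, where the A is taken as the start site.
INITIATOR = re.compile(r"[CT]{2}CA[CT]{5}")
TATA_BOX = "TATAAA"


def candidate_start_sites(coding_strand: str) -> List[Tuple[int, Optional[int]]]:
    """Return (index of the start-site A, bp back to the nearest upstream TATAAA).

    The distance is measured from the first T of the TATA box to the A and is
    None when no TATA box precedes the site. Overlapping matches are not reported.
    """
    seq = coding_strand.upper()
    results = []
    for match in INITIATOR.finditer(seq):
        start = match.start() + 3           # the A is the 4th base of the pattern
        tata = seq.rfind(TATA_BOX, 0, start)
        distance = start - tata if tata != -1 else None
        results.append((start, distance))
    return results


# Made-up sequence with a TATA box 25 bp upstream of a matching start site.
demo = "GG" + "TATAAA" + "G" * 16 + "CTCATCCTT" + "GG"
print(candidate_start_sites(demo))  # [(27, 25)]
```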

