|Organism||Streptococcus pyogenes serotype M1|
|Location on chromosome||0.85 to 0.86Mb|
|Protein||CRISPR-associated endonuclease Cas9/Csn1|
|Biological process||Interference (defense response to phage); maintaining CRISPR repeat sequences|
|Functions||DNA and RNA binding; metal ion binding; 3’-5’ exonuclease activity|
- Cas9 is a nuclease that degrades phage DNA through RNA-guided cleavage of both DNA strands, combining DNA binding with nuclease activity.
- Cas9 is the prominent protein of bacterial type II CRISPR systems.
- It requires both crRNA and tracrRNA to function properly.
- Catalytic activity also requires a PAM sequence on the target DNA.
- Cas9 has been modified for a variety of functions, including gene activation and suppression of gene expression.
- Cas9’s significance in CRISPR-mediated gene editing and applications such as disease modelling, gene role research, therapeutic and gene expression investigations is well established.
- Using the PAM sequence as a marker, it locates, binds, and cleaves the target nucleic acid. To identify the target, the sgRNA (which combines crRNA and tracrRNA) seeks complementarity with the target location.
- However, this two-level authentication (sgRNA plus PAM) can significantly diminish in vitro gene editing efficiency. Therefore, customised Cas9 nucleases such as SpCas9, dCas9, SaCas9, and xCas9 are available.
What is Cas9 Protein?
- Cas9, also known as CRISPR-associated protein 9, is one of the well-studied, significant, and commercially available nucleases employed not only in bacterial systems, but also in in vitro gene-editing techniques.
- Cas9 is a DNA nuclease that cleaves dsDNA at a precisely defined site, and it is exclusive to type II CRISPR systems. The best-studied version comes from Streptococcus pyogenes, and it is referred to as a dual RNA-guided DNA endonuclease.
- To understand why Cas9 is so widely used for gene editing, it helps to understand the structure, function, and significance of the Cas9 protein, formerly known as Cas5, Csx12, and Csn1.
- S. pyogenes SpyCas9 is a large (1,368 amino acids), multidomain, and multifunctional DNA endonuclease.
- It uses its two unique nuclease domains to snip dsDNA 3 bp upstream of the PAM: an HNH-like nuclease domain that cleaves the DNA strand complementary to the guide RNA sequence (target strand), and a RuvC-like nuclease domain that cleaves the DNA strand opposite the complementary strand (nontarget strand).
- Cas9 also contributes to crRNA maturation and spacer acquisition, in addition to its essential role in CRISPR interference.
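The targeting rule described above (a 20-nt protospacer followed by a 5′-NGG-3′ PAM, with the cut 3 bp upstream of the PAM) can be sketched in a few lines of Python. This is a minimal illustration, not a guide-design tool: the function name `find_spcas9_sites` is hypothetical, only the strand passed in is scanned, and the sketch makes no attempt to score off-targets.

```python
import re

def find_spcas9_sites(seq):
    """Return (protospacer, pam, cut_index) for every 20-nt + NGG site on the given strand.

    cut_index is the 0-based index of the first base on the PAM side of the cut,
    i.e. the cut falls 3 bp upstream of the PAM.
    """
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping candidate sites are all reported.
    for match in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", seq):
        protospacer, pam = match.group(1), match.group(2)
        pam_start = match.start() + 20
        sites.append((protospacer, pam, pam_start - 3))
    return sites

# Hypothetical toy sequence: 20 A's, then a TGG PAM, then filler.
for protospacer, pam, cut in find_spcas9_sites("A" * 20 + "TGG" + "CCC"):
    print(protospacer, pam, cut)
```

To cover both strands, the same function could be run on the reverse complement of the sequence.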
Structure of Cas9
- Cas9 in its apo state has two different lobes: the alpha-helical recognition (REC) lobe and the nuclease (NUC) lobe, which contains the conserved HNH and split RuvC nuclease domains as well as the more variable C-terminal domain (CTD).
- Two linking segments join the two lobes, one created by the arginine-rich bridge helix and the other by a disordered linker (residues 712–717).
- The REC lobe consists of three alpha-helical domains (Hel-I, Hel-II, and Hel-III) and is structurally distinct from all other known proteins.
- The extended CTD has a Cas9-specific fold and contains PAM-interacting sites necessary for PAM interrogation. Nonetheless, this PAM-recognition region is highly disordered in the apo–Cas9 structure, indicating that the apo–Cas9 enzyme is maintained in an inactive state, unable to detect target DNA prior to binding to a guide RNA.
- This structural finding is consistent with so-called DNA curtains assays demonstrating that apo–Cas9 binds DNA nonspecifically and can be swiftly removed from nonspecific sites in the presence of a competitor (guide RNA or heparin).
- The structural superimposition of apo–Cas9 with sgRNA-bound and DNA-bound structures indicates further that the enzyme adopts a catalytically inactive conformation in its apo state, requiring RNA-induced structural activation for DNA recognition and cleavage.
- This structural result corroborates the biochemical findings that Cas9 enzymes are inactive as nucleases in the absence of bound guide RNAs and further supports their activity as RNA-guided endonucleases.
HNH and RuvC Nuclease Domains
- Comparing the structures of Cas9 nuclease domains to those of other DNA-bound nucleases shows that the Cas9 RuvC nuclease domain is similar to members of the retroviral integrase superfamily that have an RNase H fold. This suggests that RuvC probably uses a two-metal-ion catalytic mechanism to cut the nontarget DNA strand.
- The HNH nuclease domain, on the other hand, has the same ββα-metal fold as other HNH endonucleases and most likely uses a single metal ion to cut the target-strand DNA.
- A conserved general-base histidine is characteristic of one-metal-ion-dependent nucleic acid cleaving enzymes, and a conserved aspartate residue is characteristic of two-metal-ion-dependent ones.
- This is in line with Cas9 mutagenesis studies showing that mutating either the HNH domain (H840A) or the RuvC domain (D10A) turns Cas9 into a nickase, while mutating both nuclease domains (so-called “dead Cas9” or dCas9) preserves RNA-guided DNA binding but abolishes DNA cleavage.
- However, these proposed catalytic mechanisms still need to be confirmed experimentally.
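The mutagenesis logic above (H840A disables HNH, D10A disables RuvC, both together give dCas9) can be written as a small lookup. The sketch below is illustrative only; the function name `cas9_activity` and the returned labels are assumptions made for this example, and it considers only the two catalytic substitutions named in the text.

```python
def cas9_activity(mutations):
    """Classify a SpCas9 variant by which catalytic residues are mutated."""
    muts = set(mutations)
    ruvc_inactive = "D10A" in muts   # RuvC normally cuts the nontarget strand
    hnh_inactive = "H840A" in muts   # HNH normally cuts the target strand
    if ruvc_inactive and hnh_inactive:
        return "dCas9: binds DNA, cuts neither strand"
    if ruvc_inactive:
        return "nickase: cuts the target strand only (HNH active)"
    if hnh_inactive:
        return "nickase: cuts the nontarget strand only (RuvC active)"
    return "nuclease: cuts both strands"

for variant in ([], ["D10A"], ["H840A"], ["D10A", "H840A"]):
    print(variant, "->", cas9_activity(variant))
```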
Mechanism of action
- In the first step of bacterial interference, the REC lobe and the gRNA work together to form the ribonucleoprotein (RNP) complex. Then, the nuclease domains RuvC and HNH break two phosphodiester bonds, one on each DNA strand, which produces a double-strand break.
- An in-depth study shows that the HNH active site hydrolyzes the phosphodiester bond of the complementary strand, while the RuvC active site hydrolyzes the phosphodiester bond of the non-complementary strand. Hydrolysis is metal-ion dependent: RuvC uses two metal ions and HNH uses one (Tang H et al., 2021).
|Lobe||Domain||Residues||Function|
|REC||Bridge helix||60-93||Recognition of DNA|
|REC||REC1||94-179, 308-713||RNA-guided DNA targeting|
|NUC||RuvC (RuvCI, RuvCII and RuvCIII)||1-59, 718-769, 909-1098||RNase H-like fold; nuclease activity for the non-complementary (nontarget) strand|
|NUC||HNH||775-908||Nuclease activity for the complementary (target) strand|
|NUC||PAM-interacting domain||1099-1368||Finds the PAM sequence on the target DNA|
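The residue ranges in the table above can be turned into a simple lookup, which is a handy way to check where a residue of interest sits. The sketch below copies only the ranges listed in the table; the names `DOMAINS` and `domain_of` are hypothetical, and residues the table does not cover return `None`.

```python
# Residue ranges copied from the domain table above (SpCas9 numbering).
DOMAINS = {
    "Bridge helix": [(60, 93)],
    "REC1": [(94, 179), (308, 713)],
    "RuvC": [(1, 59), (718, 769), (909, 1098)],
    "HNH": [(775, 908)],
    "PAM-interacting domain": [(1099, 1368)],
}

def domain_of(residue):
    """Return the domain containing a residue number, or None if the table lists none."""
    for name, segments in DOMAINS.items():
        if any(start <= residue <= end for start, end in segments):
            return name
    return None

# The catalytic residues and PAM-contacting arginines mentioned in the text.
for residue in (10, 840, 1333, 1335):
    print(residue, domain_of(residue))
```

With these ranges, D10 falls in RuvC and H840 in HNH, which matches the nickase mutations described earlier.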
Types of Cas9 nucleases
There are different kinds of Cas9 nucleases that come from both nature and labs. They are put into groups based on their function or the species from which they came. I’ll list and explain a few of them here.
SpCas9
|Structure||Bilobed (REC and NUC)|
|Domains||NUC (nuclease lobe): HNH and RuvC; REC (recognition lobe): REC1, REC2 and REC3|
|Bacterial CRISPR system||Type II|
|PAM sequence||5’-NGG-3’ (N is any nucleotide)|
|sgRNA||Required (crRNA:tracrRNA)|
|Variants||SpCas9-NRRH, SpG, SpCas9-NRCH, SpCas9-NRTH|
- SpCas9 comes from Streptococcus pyogenes and is one of the most popular, well-studied, and widely used Cas9 nucleases in genetic engineering experiments.
- As noted above, it needs both crRNA and tracrRNA (combined as an sgRNA) and the PAM sequence to find the target.
- Once SpCas9 finds the PAM (5′-NGG-3′) sequence and the sgRNA pairs with the adjacent target region, SpCas9 cuts through both strands of DNA.
- The structure is similar to the general structure of Cas9, with the nuclease lobe for catalytic activity and the recognition lobe for recognising and identifying the target DNA.
Advantages of SpCas9
- Easy to obtain and well researched.
- Simple to isolate.
- Very efficient.
- Simple to use.
Disadvantages of SpCas9
- Requires a PAM sequence.
- Also recognises false PAMs, producing off-target effects.
- Can recognise non-canonical PAMs such as 5′-NAG-3′ and 5′-NGA-3′.
- Its large size makes it hard to deliver.
- Difficult to express.
Applications of SpCas9
As noted above, this system has been carefully studied and has a lot of supporting data. Because of this, it is popular in gene therapy. Among the most common uses are:
- Transcriptional repression
- Activation of transcription
- Epigenetic modulation
- Gene disruption
- Conversion of a single base pair
SaCas9
|Structure||Bilobed (REC and NUC)|
|Domains||NUC (nuclease lobe): HNH and RuvC; REC (recognition lobe): REC1, REC2 and REC3|
|Bacterial CRISPR system||Type II|
|PAM sequence||5’-NNGRRT-3’ (N is any nucleotide)|
|sgRNA||Required (crRNA:tracrRNA)|
|Variants||efSaCas9, KKHSaCas9 and SaCas9-HF|
- SaCas9 is another very popular Cas9 nuclease. Its structure is similar to that of SpCas9, but it is smaller, which is its main advantage. It can therefore be used in place of SpCas9.
- SaCas9 comes from the bacterium Staphylococcus aureus. It is made up of only 1053 amino acids, so its gene is about 1 kb shorter than that of SpCas9.
- It also needs a PAM sequence, 5′-NNGRRT-3′, to tell the difference between its own DNA and foreign DNA. On cleavage, it makes a double-strand break in the target DNA.
Advantages of SaCas9
- Small in size
- High accuracy
- Easy to package into a viral vector
Disadvantages of SaCas9
- Requires a PAM sequence
- Requires a longer sgRNA, and off-target effects can still be significant.
Applications of SaCas9
SaCas9 is widely used to edit plant genomes in studies of how plants and pests interact.
- Research on stress tolerance
- Research into pathogen resistance
- It can also be used to treat diseases that are caused by viruses or genes.
- Recently, a specialised SpCas9 was used to work out what role the myostatin gene plays in muscular atrophy.
ScCas9
|Species derived||Streptococcus canis|
|sgRNA requirement||Yes, as crRNA:tracrRNA|
- The ScCas9 nuclease comes from Streptococcus canis. It works with a slightly different, less restrictive PAM recognition site, 5′-NNG-3′ (instead of NGG).
- Its structure is similar to that of other Cas9 nucleases, but it is less widely used because it is less efficient.
- Plant genome editing has been done with ScCas9 and its variants, such as ScCas9++, ScCas9n++, and ScCas9+.
dCas9
|dCas9 fusion||Function|
|dCas9-TadA||Preserves adenosine deaminase activity; can repair a faulty or mutated resistance gene in bacteria.|
|dCas9-rAPOBEC1||preserves cytidine deaminase activity|
|dCas9-APOBEC3A||preserves cytidine deaminase activity|
|dCas9-AID||preserves cytidine deaminase activity|
|SunTag-VP64||transcriptional activator used to study the effect of overexpression.|
|dCas9-VPR||tripartite complex and transcription activator|
|dCas9-CBP||rearranging chromatin structure by histone acetyltransferase domain.|
|Falk-fused dCas9||transcriptional activator module|
- Why is dCas9 one of the most advanced, flexible, and unique versions of the Cas9 nuclease? Because it lacks nucleolytic activity, which is the main job of a nuclease. So, people call it the dead Cas9 system.
- When the catalytic residues of both nuclease domains are inactivated (D10A and H840A), Cas9 can still find the target DNA but can’t cut it. So, in practice, transcription factors fused to dCas9 can be delivered to a target location.
ThermoCas9
|Feature||SpCas9||ThermoCas9|
|PAM||NGG||CRAA (R=A or G)|
- Mougiakos et al. (2017) characterised a ThermoCas9 nuclease that works well at higher temperatures. It comes from the thermophilic bacterium Geobacillus thermodenitrificans T12.
- They also reported that it can delete genes and silence transcription at elevated temperature (55°C) without affecting the sensitivity or the need for a PAM. It is generally active between 20°C and 70°C.
- It is sometimes confused with GeoCas9, a separate thermostable Cas9 from Geobacillus stearothermophilus.
HypaCas9
- HypaCas9 is a hyper-accurate Cas9 variant that enhances genome-wide specificity without diminishing on-target activity in human and mouse cells.
- It thereby reduces off-target activity. HypaCas9 is created by introducing the SpCas9 mutations N692A, M694A, Q695A, and H698A.
eSpCas9
- Enhanced-specificity Cas9 (eSpCas9) is a mutant version of the natural SpCas9, carrying point mutations that reduce off-target activity.
- It is sometimes referred to as high-fidelity SpCas9 or highly specific Cas9.
xCas9
- xCas9 is a specialised, laboratory-evolved nuclease with a reduced off-target effect at both NGG and non-NGG PAMs.
- As noted above, Cas9 requires a PAM sequence in order to function, which boosts its specificity but restricts the sites that can be targeted.
- xCas9 can effectively recognise many PAM sequences, including NGG, GAA, and GAT.
- It therefore has a broader targeting range than SpCas9 or SaCas9 and substantially relaxes the PAM requirement (Hu et al., 2018).
|Cas9 type||Origin||PAM sequence (5’ to 3’)||Specialization|
|SpCas9||Streptococcus pyogenes||NGG||Cleaves dsDNA using the sgRNA|
|SaCas9||Staphylococcus aureus||NNGRRT or NNGRR(N)||Small off-targeting effect|
|ScCas9||Streptococcus canis||NNG||The PAM sequence can be altered depending upon the variant used.|
|ThermoCas9||Geobacillus thermodenitrificans T12||CRAA (R=A or G)||Can work efficiently at a higher temperature.|
|StCas9||Streptococcus thermophilus||NNAGAAW||High on-target cleavage activity|
|HypaCas9||Streptococcus pyogenes||NGG||Greater genome-wide specificity|
|eSpCas9||Streptococcus pyogenes||NGG||Enhanced SpCas9 that works more effectively than native SpCas9|
|NmCas9||Neisseria meningitidis||NNNNGATT||Needs a longer crRNA, which increases the accuracy|
|xCas9||Streptococcus pyogenes||NGG and non-NGG||An evolved Cas9 that works with both NGG and non-NGG PAMs|
|dCas9||Streptococcus pyogenes||NGG||Specialized Cas9 that lacks nuclease activity|
|Cas9-DD||Streptococcus pyogenes||NGG||Destabilized Cas9 prepared to increase the accuracy and efficiency.|
|SpCas9-VQR||Streptococcus pyogenes||NGA||Altered PAM recognition, expanding the sites SpCas9 can target|
|SpCas9-EQR||Streptococcus pyogenes||NGAG||Altered PAM recognition, expanding the sites SpCas9 can target|
|SpCas9-VRER||Streptococcus pyogenes||NGCG||Altered PAM recognition, expanding the sites SpCas9 can target|
|SpCas9-NG||Streptococcus pyogenes||NG||Altered PAM recognition, expanding the sites SpCas9 can target|
|SpCas9-HF1||Streptococcus pyogenes||NGG||High-fidelity variant with increased SpCas9 specificity|
|evoCas9||Streptococcus pyogenes||NGG||High-fidelity variant with increased SpCas9 specificity|
|Sniper-Cas9||Streptococcus pyogenes||NGG||High-fidelity variant with increased SpCas9 specificity|
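The PAM column above uses IUPAC ambiguity codes (N = any base, R = A or G, W = A or T). A minimal sketch of how such patterns can be checked against the bases immediately 3′ of a protospacer is shown below. The names `IUPAC`, `PAMS`, `pam_matches`, and `compatible_nucleases` are hypothetical, and only a subset of the table is included for brevity.

```python
# IUPAC nucleotide ambiguity codes needed for the PAMs in the table.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "N": "ACGT",
}

# A subset of the PAMs listed in the table above (5' to 3').
PAMS = {
    "SpCas9": "NGG",
    "SaCas9": "NNGRRT",
    "ScCas9": "NNG",
    "StCas9": "NNAGAAW",
    "NmCas9": "NNNNGATT",
    "SpCas9-VQR": "NGA",
    "SpCas9-NG": "NG",
}

def pam_matches(pattern, site):
    """True if the first len(pattern) bases of site satisfy the IUPAC pattern."""
    site = site.upper()
    if len(site) < len(pattern):
        return False
    return all(base in IUPAC[code] for code, base in zip(pattern, site))

def compatible_nucleases(downstream):
    """List nucleases whose PAM matches the sequence just 3' of the protospacer."""
    return [name for name, pam in PAMS.items() if pam_matches(pam, downstream)]

print(compatible_nucleases("TGGAAT"))
```

A PAM match is only one requirement: the guide must still pair with the adjacent protospacer, so this check narrows the candidate nucleases without predicting cleavage.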
CRISPR–Cas9 Effector Complex Assembly
- Cas9 must be associated with guide RNA (a natural crRNA–tracrRNA or a sgRNA) to create an active DNA surveillance complex for site-specific DNA recognition and cleavage.
- The 20-nt spacer sequence of crRNA confers DNA target selectivity, whereas tracrRNA is indispensable for Cas9 recruitment.
- Genetic and biochemical research has elucidated the significance of a so-called seed sequence of RNA nucleotides within the spacer region of crRNAs for target selectivity.
- In type II CRISPR systems, the seed region is described as the 10–12 nucleotides positioned at the 3′ end of the 20-nt spacer sequence that are closest to the PAM.
- Mismatches in this seed region severely impede or abrogate target DNA binding and cleavage, but close homology in the seed region frequently results in off-target binding events, even in the presence of numerous mismatches elsewhere.
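To make the seed concept concrete, the sketch below counts mismatches separately in the PAM-proximal seed and in the PAM-distal remainder of a spacer. It is a simplified illustration: the function name `mismatch_profile` is hypothetical, the spacer and the protospacer are both written 5′ to 3′ in the same sense, and the default seed length of 12 is simply the upper end of the 10–12 nt range given above.

```python
def mismatch_profile(spacer, protospacer, seed_len=12):
    """Count mismatches in the seed (3' end, next to the PAM) and in the rest of the spacer."""
    if len(spacer) != len(protospacer):
        raise ValueError("spacer and protospacer must have the same length")
    # Treat RNA and DNA letters alike so a crRNA spacer can be compared to DNA.
    spacer = spacer.upper().replace("U", "T")
    protospacer = protospacer.upper().replace("U", "T")
    mismatches = [i for i, (a, b) in enumerate(zip(spacer, protospacer)) if a != b]
    seed_start = len(spacer) - seed_len
    in_seed = sum(1 for i in mismatches if i >= seed_start)
    return {"seed": in_seed, "non_seed": len(mismatches) - in_seed}

spacer = "ACGUACGUACGUACGUACGU"                               # hypothetical 20-nt spacer
print(mismatch_profile(spacer, "TCGTACGTACGTACGTACGT"))   # mismatch at the 5' end
print(mismatch_profile(spacer, "ACGTACGTACGTACGTACGA"))   # mismatch next to the PAM
```

Per the text, a site with seed mismatches is far less likely to be bound or cut than one whose mismatches all lie in the PAM-distal region.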
Conformational Rearrangement Upon sgRNA Binding
- The sgRNA-bound crystal structure best illustrates the principles of Cas9–sgRNA assembly and the positioning of guide RNA prior to target recognition.
- Comparison of the sgRNA-bound structure to that of apo–Cas9 reveals precisely how guide RNA binding induces Cas9 to undergo a substantial structural rearrangement from an inactive conformation to a DNA recognition–competent conformation, as suggested by studies with lower resolution electron microscopy.
- Upon sgRNA binding, the most notable conformational shift occurs in the REC lobe, specifically in Hel-III, which moves ~65 Å toward the HNH domain.
- Cas9 exhibits much smaller conformational changes upon binding to target DNA and PAM sequence, indicating that the majority of the extensive structural rearrangements occur prior to target DNA binding and reinforcing the notion that guide RNA loading is an essential regulator of Cas9 enzyme function.
Interactions with sgRNA
- Cas9 interacts extensively with the sgRNA. It forms several direct interactions with the repeat–antirepeat duplex, stem loop 1, and the linker region between stem loops 1 and 2 via Hel-I, the arginine-rich bridge helix, and the CTD domain.
- Cas9 makes far fewer interactions with stem loop 2 of the sgRNA, mostly through its RuvC and CTD domains.
- Due to the absence of a 3′ tracrRNA tail in the sgRNA construct used for crystallography, no protein–RNA interaction was detected for stem loop 3 in the Cas9–sgRNA structure.
- However, the DNA-target-bound structures demonstrate that Cas9 has very few interactions with stem loop 3.
- According to biochemical investigations, sgRNAs lacking the linker region and stem loops 2 and 3 are still capable of inducing Cas9-mediated DNA cleavage, albeit with diminished efficiency, but stem loop 1 deletion entirely abolishes cleavage.
- Nevertheless, functional studies demonstrate that stem loops 2 and/or 3 are necessary for substantial Cas9 activation in vivo.
- These observations suggest that the repeat–antirepeat duplex and stem loop 1 are required for Cas9–sgRNA complex formation, whereas the linker, stem loop 2, and stem loop 3 are not required for function but may stabilise guide RNA binding to promote active complex formation, thereby enhancing catalytic efficiency in vivo.
Preordered Seed RNA and PAM-Interacting Cleft
- Cas9 creates extensive interactions with the ribose–phosphate backbone of the guide RNA, thereby establishing the A-form conformation of the 10-nt RNA seed sequence required for initial DNA interrogation.
- This preordering is assumed to be thermodynamically advantageous for target binding, similar to the positioning of guide RNA reported in other small regulatory RNA processes, such as the bacterial Hfq protein–RNA complex and eukaryotic Argonaute-mediated RNA silencing.
- Notably, in the type I CRISPR interference complex Cascade, the guide RNA is preordered throughout the entire crRNA, not just in the seed region. This is likely due to the helical assembly of the complex and the release of topological constraints by completely flipped-out nucleotides at every sixth position.
- The PAM-interacting sites R1333 and R1335, which are responsible for 5′-NGG-3′ PAM recognition and disordered in the apo structure, are prepositioned prior to establishing contact with target DNA, demonstrating that sgRNA loading permits Cas9 to form a DNA recognition-competent structure.
- Notably, despite the fact that the 5′ 10-nt nonseed RNA sequence is completely disordered in the sgRNA-bound crystal structure, the electron microscopy (EM) structure of SpyCas9 bound to a full-length sgRNA (EMD-3276) reveals that the 5′ end of the guide RNA lies within the cavity formed between the HNH and RuvC nuclease domains.
- This structural observation shows that the 5′ end of sgRNA is shielded from degradation and that an additional conformational change is necessary to liberate the 5′ distal end from constraint during target DNA binding.
References
- Jiang, F., & Doudna, J. A. (2017). CRISPR-Cas9 Structures and Mechanisms. Annual Review of Biophysics, 46, 505–529. https://doi.org/10.1146/annurev-biophys-062215-010822
- Wada, N., Ueta, R., Osakabe, Y. et al. Precision genome editing in plants: state-of-the-art in CRISPR/Cas9-based genome engineering. BMC Plant Biol 20, 234 (2020).
- Nishimasu, H. et al. Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 156(5), 935–949 (2014). https://doi.org/10.1016/j.cell.2014.02.001
- Zuo, Z., Liu, J. Structure and Dynamics of Cas9 HNH Domain Catalytic State. Sci Rep 7, 17271 (2017). https://doi.org/10.1038/s41598-017-17578-6
- Mougiakos, I., Mohanraju, P., Bosma, E.F. et al. Characterizing a thermostable Cas9 for bacterial genome editing and silencing. Nat Commun 8, 1647 (2017). https://doi.org/10.1038/s41467-017-01591-4
- Hu, J., Miller, S., Geurts, M. et al. Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature 556, 57–63 (2018). https://doi.org/10.1038/nature26155.
- Qi, L. S., Larson, M. H., Gilbert, L. A., Doudna, J. A., Weissman, J. S., Arkin, A. P., & Lim, W. A. (2013). Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell, 152(5), 1173–1183. https://doi.org/10.1016/j.cell.2013.02.022.