The CRISPR-Cas system
CRISPR in native systems
The CRISPR system exists naturally in many prokaryotes as a component of the organism’s adaptive immunity against foreign genetic elements (e.g. viruses and plasmids). Broadly speaking, it is a molecular mechanism by which prokaryotes exposed to foreign genetic material capture a sample of that material in order to help the cell fight off future attacks (similar to how antibodies in humans provide adaptive immunity to antigens to which the immune system has been exposed). The captured material is used as a guide for molecular machinery produced by the cell (i.e. CRISPR-associated or “Cas” proteins) to target any nucleic acid corresponding to the captured sequence, cleaving the target nucleic acid and thus flagging it for destruction. In the most commonly used CRISPR-Cas system (Class 2 type II, derived from S. pyogenes), the target sequence is approximately 20 bp long, and the guide RNA (CRISPR RNA or crRNA) must form a duplex with a further small RNA known as trans-activating crRNA (tracrRNA) as part of its maturation into the form which associates with, and guides, the Cas9 protein.
Importantly, in order to bind to and cleave target sites, the CRISPR-Cas protein complex relies on one further element in addition to the complementarity of the target site to the guide RNA. In the prokaryotic system, only foreign sequences immediately upstream of a short “protospacer adjacent motif” (PAM) region (5’-NGG-3’ for Cas9) are captured and, as it happens, the CRISPR-Cas complex will only bind to targets where an appropriate PAM is present. As the PAM is not part of the sequence captured and incorporated into the organism’s CRISPR region, this ensures that the prokaryote’s own DNA is not attacked by its own CRISPR defence system: only foreign DNA will have a PAM immediately downstream of a site complementary to the guide RNA. Different Cas proteins are associated with different PAMs, which range from 3 bp to 6 bp in length and require varying degrees of base specificity. In practice, the Cas9 PAM occurs so frequently throughout many genomes that this requirement is not a significant limitation on the use of the system.
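The PAM requirement described above can be illustrated with a short Python sketch that scans a DNA sequence for candidate Cas9 target sites, i.e. 20 bp stretches lying immediately upstream of an NGG motif. This is a minimal illustration only: the function names and the example sequence are hypothetical, and real guide design tools also weigh factors such as off-target matches, which are ignored here.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_cas9_targets(seq, protospacer_len=20):
    """List candidate sites on the strand given, as (start, protospacer, pam).

    A site qualifies when a 5'-NGG-3' PAM sits immediately downstream (3')
    of a stretch of `protospacer_len` bases. Call the function again on
    reverse_complement(seq) to scan the opposite strand.
    """
    seq = seq.upper()
    hits = []
    # The lookahead lets overlapping PAMs (e.g. within "GGGG") all be found.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        start = pam_start - protospacer_len
        if start < 0:
            continue  # not enough sequence upstream of this PAM
        hits.append((start, seq[start:pam_start], match.group(1)))
    return hits


# Hypothetical example sequence: 5 bp of padding, a 20 bp target, then a TGG PAM.
example = "AAAAA" + "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
for start, protospacer, pam in find_cas9_targets(example):
    print(start, protospacer, pam)
```

Scanning both the sequence and its reverse complement reflects the fact that a suitable PAM on either strand can be used.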
CRISPR as a molecular tool
It is the relative simplicity of the CRISPR targeting system, and the specificity of the nucleic acid cleavage, that make CRISPR such a powerful tool. With appropriately designed vectors and synthetic crRNA guide sequences taking the place of the captured foreign nucleic acid in the system, researchers can theoretically send a Cas protein to any location in the genome that lies immediately upstream of a PAM.
In engineered CRISPR-Cas systems a “single guide RNA” (sgRNA) is designed, which comprises both the crRNA guide and the tracrRNA minimal region required to interact with the Cas9 protein and form the mature protein/RNA complex. This simplifies the native process by cutting out the pre-crRNA processing and maturation step, and cuts down on the number of separate RNA transcripts required to be generated.
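As a rough sketch of the sgRNA layout described above, the following Python snippet joins a target-specific guide to a constant scaffold. It assumes the usual Cas9 arrangement in which the guide sits at the 5’ end of the transcript and the scaffold (the crRNA-derived and tracrRNA-derived regions joined by a linker) follows it. The function name is hypothetical and the scaffold shown is a deliberately non-biological placeholder, since the real scaffold sequence depends on the system being used.

```python
def build_sgrna(protospacer_dna, scaffold_rna):
    """Return an sgRNA string laid out as 5'-[guide]-[scaffold]-3'.

    The guide carries the same sequence as the DNA protospacer (the strand
    that sits next to the PAM), written as RNA with U in place of T.
    """
    guide = protospacer_dna.upper().replace("T", "U")
    unexpected = set(guide) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters in guide: {sorted(unexpected)}")
    return guide + scaffold_rna


# Placeholder only: substitute the scaffold sequence for the chosen Cas9 system.
PLACEHOLDER_SCAFFOLD = "<scaffold>"

sgrna = build_sgrna("ACGTACGTACGTACGTACGT", PLACEHOLDER_SCAFFOLD)
print(sgrna)
```

Only the guide portion changes from one target to the next, which is why a single construct design can be retargeted simply by swapping that sequence.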
Once the CRISPR-Cas complex is bound to the target site:
- the endonuclease activity of the Cas protein can be used to:
  - knock out a gene; or
  - edit a gene, by taking advantage of the cell’s natural DNA repair mechanisms (non-homologous end joining (NHEJ) or homology directed repair (HDR))
- Cas proteins whose nuclease activity has been neutralised (so-called “nuclease dead” Cas proteins) can be used to:
  - upregulate gene expression, where a transcription activator is attached; or
  - downregulate gene expression, where a transcription repressor is attached or the Cas complex interferes with an enhancer region at the binding site.
CRISPR system components
There are many different variants of the system, utilising different biochemical machinery and functional features. However, a number of characteristics and components are common to all, or most, of those in use today (see glossary for a full description of each):
- Cas protein (most commonly Cas9 from S. pyogenes (spCas9)), either:
  - with full wild type endonuclease activity, e.g. Cas9
  - without endonuclease activity, e.g. dCas9
  - with modified endonuclease activity to generate single stranded “nicks”, e.g. Cas9n
- sgRNA (comprising crRNA and tracrRNA)
Variants and various aspects of each of these components are a rich source of subject matter for the explosion of patent filings in the CRISPR field. According to analysis by Egelie et al in Nature Biotechnology,1 the vast majority of CRISPR-related patent families identified include subject matter relating to the components of the CRISPR system, especially the guide RNA. Most patent families also relate to potential applications for the technology, in particular molecular targeting, gene therapy and diagnosis. A number of patent applications also concern vectors and modes of delivery. The latter will undoubtedly be an area of increasing importance as R&D efforts come closer to the clinical trial and regulatory approval stage, and researchers grapple with how best to deliver CRISPR-based therapies to living patients/cells.
1 - Egelie et al. “The emerging patent landscape of CRISPR-Cas gene editing technology” 2016 Nature Biotechnology 34(10) 1025.
Variations on a theme
In addition to gene editing through the CRISPR-mediated cleavage of DNA and harnessing of natural DNA repair mechanisms, a number of notable variations of CRISPR have been developed, with different capabilities and modes of action. These variations (which some in the media have dubbed “CRISPR 2.0” and “CRISPR 3.0”) take advantage of the ability of Cas proteins to be targeted to specific sites of interest in the genome by the use of complementary gRNA sequences, but use other mechanisms to achieve gene editing or the modulation of gene expression.
“Base editors” are capable of effecting single nucleotide changes at a target site, and researchers have now developed a range of such editors which, between them, are able to correct all of the so-called “transition” mutations: C to T, T to C, A to G, or G to A. Base editor systems employ modified versions of Cas9 with nickase activity, which are linked to enzymes designed to chemically effect the single base change. The role of the CRISPR-Cas component in the base editor system is twofold: (1) to specifically target the desired location and open up access to the target site; and (2) to generate a nick in the strand complementary to the edited base, in order to induce repair of the DNA around the site.
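To make the coverage of the four transition mutations concrete, the following Python sketch works out which class of editor could, in principle, revert a given point mutation. It relies on two points that the text above does not spell out and that are stated here as assumptions of the sketch: the editors fall into two classes, commonly called cytosine base editors (C to T) and adenine base editors (A to G), and a change made on one DNA strand appears as the complementary change on the other. The function name is hypothetical, and practical constraints such as PAM availability and the position of the base within the target site are ignored.

```python
# The four transition mutations, written as (from_base, to_base).
TRANSITIONS = {("C", "T"), ("T", "C"), ("A", "G"), ("G", "A")}


def editor_for_correction(mutant_base, wild_type_base):
    """Suggest the editor class able to change mutant_base back to wild_type_base.

    Returns None for transversions (e.g. A to T), which fall outside the
    four transitions handled here.
    """
    change = (mutant_base.upper(), wild_type_base.upper())
    if change not in TRANSITIONS:
        return None
    # C to T on one strand reads as G to A on the complementary strand.
    if change in {("C", "T"), ("G", "A")}:
        return "cytosine base editor"
    # A to G on one strand reads as T to C on the complementary strand.
    return "adenine base editor"


# A disease-causing G to A mutation needs an A to G correction.
print(editor_for_correction("A", "G"))
```

Because each editor class also covers the complementary change on the opposite strand, two chemistries are enough to address all four transitions.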
These editors have already been used in attempts to correct single nucleotide mutations in the beta globin gene which cause beta-thalassemia. There are about 32,000 known disease-associated point mutations, half of which are caused by a base change from a G to an A (this high prevalence is due to the spontaneous deamination of C to U, resulting in the change from a G-C to an A-U or A-T base pairing), and 15% of which are caused by the opposite mutation, from A to G.
Researchers are also continuing to explore the potential for catalytically inactive dCas9 proteins (which are unable to cleave DNA), linked to activators or repressors of gene expression, to upregulate or downregulate gene expression at the epigenetic level (rather than effecting any permanent edit to the underlying DNA).