1. Introduction
CRISPR (Clustered Regularly
Interspaced Short Palindromic Repeats) is an
array of short repeated sequences separated by spacers with unique sequences. CRISPR
can be found on both chromosomal and plasmid DNA of bacteria. The spacers are
often derived from nucleic acid of viruses and plasmids, and are used as recognition
elements to find matching virus genomes and destroy them. These sequences play a key role in a bacterial immune system, and
form the basis of a genome editing technology known as CRISPR/Cas9 which allows
permanent modification of genes within organisms.
CRISPR activity requires the presence of a set
of CRISPR-associated (Cas) genes, usually found adjacent to CRISPR, which code
for proteins necessary for the immune response. Generally,
CRISPR-Cas systems and their elements are hypervariable and differ broadly
in terms of occurrence, genes, sequences, number, and size across genomes.
Indeed, CRISPR repeats vary widely in size (23–55 nucleotides), though they are
typically 28–37 nucleotides long, and their partially palindromic nature allows them
to form hairpin structures. Likewise, CRISPR spacers also vary in size
(21–72 nucleotides), though they are typically 32–38 nucleotides long (Barrangou and Marraffini, 2014). The
CRISPR locus, first observed in Escherichia coli, is present in about 90%
of sequenced archaeal genomes and 40% of sequenced bacterial genomes (Grissa et al., 2007).
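The repeat-spacer layout described above can be illustrated with a minimal Python sketch. It assumes that every repeat in the array is an identical copy (real arrays do not guarantee this), the sequences are toy values invented for illustration, and `split_array` and `in_range` are hypothetical helper names rather than functions from any CRISPR tool.

```python
# Length ranges quoted in the text (nucleotides).
REPEAT_RANGE = (23, 55)
SPACER_RANGE = (21, 72)


def split_array(array: str, repeat: str) -> list[str]:
    """Return the spacers lying between exact copies of `repeat`.

    Assumption: all repeats are identical, so a plain string split works.
    """
    parts = array.split(repeat)
    # parts[0] and parts[-1] are the sequences flanking the array.
    return parts[1:-1]


def in_range(length: int, bounds: tuple[int, int]) -> bool:
    """Check whether a length falls inside an inclusive (low, high) range."""
    low, high = bounds
    return low <= length <= high


# Toy sequences, made up for illustration only.
repeat = "GTTTTAGAGCTATGCTGTTTTGAATGGT"      # 28 nt
spacers = ["ACGT" * 8, "TTGACCA" * 5]        # 32 nt and 35 nt
array = "AAAA" + repeat + spacers[0] + repeat + spacers[1] + repeat + "TTTT"

found = split_array(array, repeat)
for spacer in found:
    print(len(spacer), in_range(len(spacer), SPACER_RANGE))
```

Dedicated tools such as CRISPRFinder have to tolerate degenerate repeats, so exact string splitting is only a teaching simplification.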
2. CRISPR/Cas Mechanism
CRISPR-Cas immunity can be divided
into three stages (Fig. 1). The first stage, adaptation (spacer acquisition), leads to insertion of new spacers in the
CRISPR locus. In the second stage, expression, the system gets ready for action
by expressing the Cas genes and transcribing the CRISPR into a long precursor
CRISPR RNA (pre-crRNA). The pre-crRNA is subsequently processed into mature
crRNA by Cas proteins and accessory factors. In the third stage, interference,
target nucleic acid is recognized and destroyed by the combined action of crRNA
and Cas proteins (Rath et al., 2015).
Fig. 1. The key
steps of CRISPR-Cas immunity. 1) Adaptation: insertion of new spacers into the
CRISPR locus. 2) Expression: transcription of the CRISPR locus and processing of
CRISPR RNA. 3) Interference: detection and degradation of mobile genetic
elements by CRISPR RNA and Cas protein(s) (Rath et al., 2015).
The first stage of the CRISPR-Cas system is adaptation
(spacer acquisition) (Fig. 2).
In the adaptation stage, short pieces of viral or plasmid DNA are
integrated into the CRISPR locus. Each integration event is followed by the
duplication of a repeat and thus creates a new spacer–repeat unit. The
selection of spacer precursors (protospacers) from the invading DNA appears to
be determined by the recognition of protospacer adjacent motifs (PAMs); PAMs
are usually only several nucleotides long and differ between variants of the
CRISPR-Cas system (Makarova et al., 2011).
Two models of spacer acquisition have been reported for type I systems: (1)
naive and (2) primed. During naive adaptation, the organism obtains a spacer
from a foreign DNA source. In contrast, primed acquisition relies on a
pre-existing (priming) spacer which enables a biased and enhanced uptake of new
spacers. Both models are based on the action of two key proteins, Cas1 and Cas2.
Naive adaptation requires only Cas1 and Cas2, whereas primed adaptation
additionally requires the type I interference complex Cascade
(CRISPR-associated complex for antiviral defense) and the Cas3 nuclease
(Sternberg et al., 2016).
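The two ideas in this stage, PAM-dependent protospacer selection and the addition of a new spacer-repeat unit, can be sketched in a few lines of Python. This is a toy model under stated assumptions: the PAM is taken to lie immediately after the protospacer on the strand shown, the new unit is added at the front of the array, and the PAM, lengths and sequences are invented. None of these details come from the text, and `find_protospacers` and `acquire_spacer` are hypothetical names.

```python
def find_protospacers(invader: str, pam: str, length: int) -> list[str]:
    """Return every `length`-nt stretch that directly precedes `pam`.

    Assumption: the PAM sits immediately after the protospacer.
    """
    hits = []
    start = invader.find(pam)
    while start != -1:
        if start >= length:
            hits.append(invader[start - length:start])
        start = invader.find(pam, start + 1)
    return hits


def acquire_spacer(array: list[str], repeat: str, spacer: str) -> list[str]:
    """Add a new spacer together with a duplicated repeat.

    `array` alternates repeat, spacer, repeat, ... and begins and ends
    with a repeat. Inserting at the front is an assumption of this sketch.
    """
    return [repeat, spacer] + array


# Toy values, invented for illustration.
repeat = "GTTTTAGA"
array = [repeat, "TTTTCCCC", repeat]
invader = "CCGTACGTTGCAAAGTTC"

candidates = find_protospacers(invader, pam="AAG", length=8)
new_array = acquire_spacer(array, repeat, candidates[0])
print(candidates)
print(new_array)
```

Each acquisition lengthens the list by exactly one spacer and one repeat, which mirrors the statement that every integration event is followed by duplication of a repeat.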
Fig. 2. Model of adaptation in the Type I-E system. There are two types of spacer
acquisition, naïve and primed. Both require the presence of a PAM and are
dependent on the Cas1-Cas2 complex. The Cas1-Cas2 complex recognizes the CRISPR
and likely prepares it for spacer integration. Naïve spacer acquisition occurs
when there is no previous information about the target in the CRISPR. Primed
spacer acquisition requires a spacer in the CRISPR locus that matches the
target DNA and the presence of Cas3 and the Cascade complex. Primed acquisition
results in insertion of more spacers from the same mobile genetic element (Rath et
al., 2015).
The second stage of the CRISPR-Cas system is
expression (Fig. 3). The CRISPR array is transcribed into a full-length pre-CRISPR RNA
(pre-crRNA) transcript, which is processed into mature crRNAs containing partial
CRISPR spacer sequences attached to partial CRISPR repeats, forming CRISPR
guide RNAs. The processing step is catalyzed by endoribonucleases that either
operate as a subunit of a larger complex (such as Cascade in Escherichia
coli) or as a single enzyme (such as Cas6 in the archaeon Pyrococcus furiosus)
and form a CRISPR ribonucleoprotein (crRNP) complex. Recently, another variant
was discovered in Streptococcus pyogenes in which a trans-encoded small
RNA (tracrRNA) acts as a guide for the processing of pre-crRNA, which in this
organism is catalyzed by RNase III in the presence of Csn1 (also known as
Cas9). In the case of the Cascade complex of type I CRISPR-Cas systems, the
mature crRNA remains associated with the complex after the initial endonuclease
cleavage, whereas in P. furiosus the crRNA, processed by Cas6, is
passed on to a distinct Cas protein complex (the Cascade complex of
type III systems, Cmr-type), where it is processed further at the 3′ end
by unknown nucleases (Makarova et al., 2011).
The third stage of the CRISPR-Cas system is
interference (Fig. 3). The principle of target interference by CRISPR-Cas
systems is that crRNA bound to Cas protein(s) locates the corresponding protospacer
to trigger degradation of the target. The crRNAs guide Cas nucleases towards
complementary nucleic acids for sequence-specific targeting and cleavage of
invasive genetic elements. Most CRISPR effector proteins initiate targeting by
interacting with a particular two- to four-nucleotide sequence motif, the PAM.
Once interaction with the PAM has been established, the crRNA guide loaded
within the Cas nuclease interrogates the flanking target DNA. The strength
and duration of the molecular interaction correlate with the level of
complementarity between the crRNA and target DNA, which drives conformational changes
in Cas effector proteins, such as Cas9 and Cascade, that eventually lead to a
cleavage-competent structural state. If complementarity between the guide RNA
and target DNA extends beyond the seed sequence, a DNA R-loop is directionally formed,
which triggers subsequent nicking by the Cas effector nucleases (e.g. Cas3,
Cas9, Cpf1) at particular locations defined by a ruler-anchor mechanism.
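The order of checks described above (PAM first, then the seed, then the rest of the guide) can be expressed as a small decision function. This is only a sketch under several assumptions that the text does not specify: a Cas9-like layout with the PAM directly after the protospacer, an 8-nucleotide seed at the PAM-proximal end, exact matching instead of binding energetics, and guide and target both written as DNA on the same strand. The names `pam_matches` and `interrogate` are hypothetical.

```python
def pam_matches(site: str, pam: str) -> bool:
    """Compare a flanking site with a PAM pattern; 'N' matches any base."""
    return len(site) == len(pam) and all(
        p in ("N", s) for p, s in zip(pam, site)
    )


def interrogate(protospacer: str, flanking: str, guide: str,
                pam: str = "NGG", seed_len: int = 8) -> str:
    """Classify a candidate target in the order PAM, seed, full guide.

    seed_len=8 and the PAM-proximal seed position are assumptions.
    """
    if len(protospacer) != len(guide):
        raise ValueError("protospacer and guide must have equal length")
    if not pam_matches(flanking, pam):
        return "no PAM"
    if protospacer[-seed_len:] != guide[-seed_len:]:
        return "seed mismatch"
    if protospacer != guide:
        return "partial match"
    return "full match"


guide = "ACGTACGTACGTACGTACGT"  # toy 20-nt guide, invented
print(interrogate(guide, "TGG", guide))
print(interrogate(guide, "TGA", guide))
```

Checking the PAM before any base pairing reflects the statement that PAM interaction precedes interrogation of the flanking DNA; a target without a PAM is rejected without comparing the guide at all.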
Fig. 3. Model
of crRNA processing and interference. (A) In Type I systems, the pre-crRNA is
processed by Cas5 or Cas6. DNA target interference requires Cas3 in addition to
Cascade and crRNA. (B) Type II systems use RNase III and tracrRNA for crRNA
processing together with an unknown additional factor that performs 5′ end
trimming. Cas9 targets DNA in a crRNA-guided manner. (C) The Type III systems
also use Cas6 for crRNA processing, but in addition an unknown factor performs
3′ end trimming. Here, the Type III Csm/Cmr complex is drawn as targeting DNA,
but RNA may also be targeted (Rath et al., 2015).
The Cas proteins are a highly diverse group.
Many of them are predicted or identified to interact with nucleic acids (e.g. as
nucleases, helicases, and RNA-binding proteins). The Cas1 and Cas2 proteins are
involved in the adaptation stage (spacer acquisition) and are virtually universal among CRISPR-Cas
systems. Other Cas proteins are only associated with certain types of
CRISPR-Cas systems. The most widely adopted classification identifies Type I, II and III
CRISPR-Cas systems, each with several subtypes. Different types of
CRISPR-Cas systems can co-exist in a single organism. Recently, a Type IV
system was proposed, which contains several Cascade genes but no CRISPR, cas1 or
cas2. The Type IV complex would be guided by protein-DNA interaction, not by crRNA,
and would constitute an innate immune system preset to attack certain sequences
(Rath et al., 2015).
The Type I systems are defined by the presence
of the signature protein Cas3, a protein with both helicase and DNase domains responsible
for degrading the target. Currently, six subtypes of the Type I system are
identified (Type I-A through Type I-F) which have a variable number of Cas
genes. Apart from cas1, cas2 and cas3, all Type I systems encode a Cascade-like
complex. Cascade binds crRNA and locates the target, and most variants are also
responsible for processing the crRNA. Cascade also enhances spacer acquisition in
some cases. In the Type I-A system, Cas3 is a part of the Cascade complex (Rath
et al., 2015).
The Type II CRISPR-Cas systems encode Cas1 and
Cas2, the Cas9 signature protein and sometimes a fourth protein (Csn2 or Cas4).
Cas9 assists in the adaptation stage (spacer
acquisition), participates in crRNA processing and cleaves
the target DNA, assisted by crRNA and an additional RNA called tracrRNA. Type II
systems have been divided into subtypes II-A and II-B, but recently a third,
II-C, has been suggested. The csn2 and cas4 genes, both encoding proteins
involved in the adaptation stage, are present in Type II-A and Type II-B, respectively,
while Type II-C lacks a fourth gene (Rath et al., 2015). The simplicity of the Type II CRISPR nuclease, with only three
required components (Cas9 along with the crRNA and tracrRNA), makes this system
amenable to adaptation for genome editing.
The Type III CRISPR-Cas systems contain the
signature protein Cas10, whose function is unclear. Most Cas proteins are destined
for the Csm (in Type III-A) or Cmr (in Type III-B) complexes, which are similar
to Cascade. Interestingly, while all Type I and II systems are known to target
DNA, Type III systems target DNA and/or RNA. So far, the Type II systems have
been exclusively found in bacteria while the Type I and Type III systems occur
both in bacteria and archaea (Rath et al., 2015).
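The classification rules given in this section (one signature gene per type, and the fourth gene deciding the Type II subtype) amount to a simple lookup, sketched below in Python. The function names are hypothetical, gene names are compared case-insensitively as a convenience, and the sketch deliberately ignores Type IV and every Type I and Type III subtype, so it is a simplification rather than a full classifier.

```python
# Signature genes as stated in the text.
SIGNATURE_GENES = {"cas3": "Type I", "cas9": "Type II", "cas10": "Type III"}


def classify_locus(genes: set[str]) -> list[str]:
    """Return every system type whose signature gene is present.

    A list is returned because different types can co-exist in one organism.
    """
    names = {g.lower() for g in genes}
    return sorted(SIGNATURE_GENES[g] for g in names if g in SIGNATURE_GENES)


def type_ii_subtype(genes: set[str]) -> str:
    """Apply the fourth-gene rule: csn2 -> II-A, cas4 -> II-B, neither -> II-C."""
    names = {g.lower() for g in genes}
    if "cas9" not in names:
        raise ValueError("not a Type II locus: cas9 is missing")
    if "csn2" in names:
        return "II-A"
    if "cas4" in names:
        return "II-B"
    return "II-C"


locus = {"cas1", "cas2", "cas9", "csn2"}  # toy gene set, invented
print(classify_locus(locus), type_ii_subtype(locus))
```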
3. Application of CRISPR/Cas System for Genome Editing
The recently developed CRISPR/Cas9 system has been used in diverse eukaryotes for
targeted genome editing (Cong et al., 2013). This system comprises the
Cas9 endonuclease of Streptococcus pyogenes and a synthetic guide RNA
(gRNA), which combines the functions of CRISPR RNA (crRNA) and trans-activating
crRNA (tracrRNA). The gRNA directs the Cas9 endonuclease to a target sequence
complementary to the 20 nucleotides preceding the protospacer adjacent motif
(PAM) (NGG) required for Cas9 activity. The specificity of the system, and the
fact that targeting is determined by the 20-nucleotide sequence of the gRNA,
allow for unprecedented, facile genome engineering. Furthermore, the
CRISPR/Cas9 system is amenable to making multiple, simultaneous targeted
modifications (multiplexing).
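The targeting rule described above (20 nucleotides immediately followed by an NGG PAM) can be turned into a simple site scan. The Python sketch below searches both strands of an uppercase DNA string; `find_targets` and `reverse_complement` are hypothetical helper names, the example sequence is invented, and no off-target or efficiency scoring is attempted.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_targets(seq: str, guide_len: int = 20) -> list[tuple[str, int, str, str]]:
    """List candidate sites as (strand, start, protospacer, pam).

    For the '-' strand, `start` is the position within the reverse
    complement of `seq`, not within `seq` itself.
    """
    # The lookahead lets overlapping sites be reported as well.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits


# Toy sequence, invented for illustration.
seq = "TT" + "ACGTACGTACGTACGTACGT" + "AGG" + "CA"
for hit in find_targets(seq):
    print(hit)
```

Multiplexing, as mentioned in the text, would correspond to picking several of the returned sites and supplying one gRNA for each.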
CRISPR/Cas9 can be used to develop a new strategy to generate virus resistance in cucumber
by targeting the eIF4E gene. The eIF4E gene encodes a plant cellular translation factor
essential for the Potyviridae life cycle, and natural point mutations in this
gene can confer resistance to potyviruses. In cucumber, two eIF4E genes have
been identified, eIF4E (accession no. XM_004147349) (236 amino acids) and
eIF(iso)4E (accession no. XM_004147116.2) (204 amino acids). Two regions of the
cucumber eIF4E gene that have no homology in the
eIF(iso)4E gene were targeted by Cas9/sgRNA (Chandrasekaran et al., 2016).
The
Cas9/sgRNA1 construct was designed to target the sequence in the first exon of
eIF4E (positions 65–86 in the coding region) (Fig. 4A). The Cas9/sgRNA2 construct
was designed to target the third exon (positions 517–540) in the coding region
to allow translation of approximately two-thirds of the protein, perhaps
without disrupting all of its functions (Fig. 4A). Five independent T0
transgenic lines were generated by Agrobacterium-mediated transformation. The
presence of the transgene (Cas9/sgRNA) was confirmed by kanamycin resistance
and polymerase chain reaction (PCR) using sgRNA-specific primers (Fig. 4B).
Three lines (1, 3, and 4) were identified as transgenic with Cas9/sgRNA1 (Chandrasekaran
et al., 2016).
To evaluate the types of mutations generated in sgRNA1 transgenic plants, PCR was
performed on T0 plants using primers flanking the sgRNA1 target, and the products were
digested with BmgBI (whose recognition site would disappear if Cas9 and non-homologous end joining (NHEJ)
were active at this location). In line 1, a distinct undigested fragment was
observed following BmgBI restriction (Fig. 4B). The partial digestion observed
indicated a heterozygous genome with both wild-type and mutant eIF4E alleles.
Cloning and sequencing of the uncut BmgBI fragment showed two types of
mutation: a 20-nucleotide deletion around the PAM sequence in seven colonies
and a one-nucleotide deletion 3 bp upstream of the PAM sequence in two colonies
(Fig. 4C). In lines 4 (Fig. 4B) and 3, the amplified PCR product was completely
digested by BmgBI (Chandrasekaran et al., 2016).
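The logic of this restriction-site-loss assay can be summarised in a short Python sketch: an allele that still carries the recognition site is cut, an edited allele that has lost it is not, and the mixture seen on the gel suggests the genotype. The recognition sequence and amplicons below are placeholders invented for illustration (not the real cucumber sequences), and `call_genotype` is a hypothetical name.

```python
def call_genotype(alleles: list[str], site: str) -> str:
    """Infer a genotype call from which amplicon alleles contain `site`.

    An allele containing the site is digested; one lacking it stays uncut.
    """
    if not alleles:
        raise ValueError("at least one allele is required")
    cut = [site in allele for allele in alleles]
    if all(cut):
        return "fully digested: no mutation detected"
    if any(cut):
        return "partially digested: wild-type and mutant alleles"
    return "undigested: no wild-type allele detected"


# Placeholder recognition site and toy amplicons, invented for illustration.
site = "CACGTC"
wild_type = "GGATCACGTCTTGA"
mutant = "GGATCACTCTTGA"  # one-nucleotide deletion that destroys the site

print(call_genotype([wild_type, wild_type], site))
print(call_genotype([wild_type, mutant], site))
print(call_genotype([mutant, mutant], site))
```

Two caveats follow from the design: an edit that leaves the recognition site intact would be scored as unedited, and partial digestion alone cannot distinguish a heritable heterozygous mutation from editing confined to somatic cells.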
Two additional Cas9/sgRNA2 transgenic plants (2 and 5) did not show genome editing
in T0 as determined by PCR and restriction analysis with BglII (Fig. 4B). The
experiment was then continued with two sgRNA1 lines (1 and 4), designated CEC1-1
and CEC1-4, respectively (Cas/sgRNA1-eIF4E-Cucumber), and one sgRNA2 line (line
5), designated CEC2-5.
Fig. 4.
Gene editing of eIF4E mediated by CRISPR/Cas9 in transgenic cucumber plants.
(A) Schematic representation of the cucumber eIF4E genomic map and the sgRNA1
and sgRNA2 target sites (red arrows). The target sequence is shown in red
letters together with the restriction site (underlined), and the protospacer
adjacent motif (PAM) is marked in bold underlined letters. The black arrows
indicate the primers flanking the target sites used to detect the mutations. (B)
Restriction analysis of T0 polymerase chain reaction (PCR) fragments of CEC1-1,
CEC1-4, and CEC2-5. (C) Alignment of nine colony sequences from the undigested
fragment of line 1 with the wild-type (wt) genome sequence. DNA deletions are
shown by red dashes and deletion sizes (nucleotides) are marked on the right
side of the sequence (Chandrasekaran et al., 2016).
Fig. 5.
Genotyping of eIF4E mutants in representative T1 progeny plants of the CEC1-1
line. (A) Polymerase chain reaction (PCR) restriction analysis of
Cas9/sgRNA1-mediated mutations (top panel) and transgene insertion (bottom
panel) in 10 representative T1 cucumber plants and non-mutant wild-type (wt).
(B) Alignment of four representative eIF4E mutant plants with the wild-type sequence.
The sequences of each plant represented clones from undigested fragments. The
target sequence is shown in red letters and the protospacer adjacent motif (PAM)
is marked by bold underlined letters. DNA deletions are marked with red dashes and
deletion sizes (nucleotides) are indicated on the right side of the sequence (Chandrasekaran
et al., 2016).
Fig. 6.
Genotyping of eIF4E mutants in representative T1 progeny plants of the CEC1-4
line. (A) Polymerase chain reaction (PCR) restriction analysis of Cas9/sgRNA1-mediated
mutations (top panel) and transgene insertion (bottom panel) in eight T1
cucumber plants and non-mutant wild-type (wt). (B) Alignment of three eIF4E
transgenic mutant plants 4, 5 and 6 with the wild-type sequence. Sequences of
each plant represent clones from undigested fragments. The target sequence is shown
in red letters and the protospacer adjacent motif (PAM) is marked in bold underlined
letters. DNA deletions are marked by red dashes and deletion sizes
(nucleotides) are indicated on the right side of the sequence (Chandrasekaran et
al., 2016).
Fig. 7.
Genotyping of the Cas9/sgRNA2-mediated mutation in T1 progeny plants of the
CEC2-5 line. (A) Polymerase chain reaction (PCR) restriction analysis of
Cas9/sgRNA2-mediated mutations (top panel) and the presence of the Cas9/sgRNA2
transgene (bottom panel) in eight representative T1 cucumber plants. (B)
Alignment of four representative eIF4E mutant plants with the wild-type
sequence. The target sequence is shown in red letters and the protospacer adjacent
motif (PAM) is marked in bold underlined letters. DNA deletions or insertions are
marked by red dashes and letters, and the sizes of the deletions or insertions (nucleotides)
are indicated on the right side of the sequence (Chandrasekaran et al., 2016).
Two Cas9/sgRNA constructs were designed to target the cucumber
eIF4E gene: 1) sgRNA1 was expected to disrupt the intact eIF4E protein, and 2) sgRNA2
to permit translation of two-thirds of the protein product. In
Agrobacterium-transformed T0 lines, deletions were found in the eIF4E target
gene in one line (CEC1-1) out of five (Fig. 4). In the CEC1-1 line, the same
mutations were observed in the T1 generation, which implies a heterozygous mono-allelic
CEC1-1 T0 plant. In the transgenic line CEC1-4, all of the progeny from the T1
and T2 generations showed partial cleavage activity of Cas9 (Fig. 6), which
implies that cleavage occurred only in somatic cells and not in the germ cell
line. This may be a result of the Cas9 transgene insertion site or of the
spatial specificity of the 35S promoter, either of which could cause a low
level of expression and activity (Chandrasekaran et al., 2016).
In the transgenic T0 generation of the CEC2-5 line, a mutation was
not detected by PCR or restriction analysis (Fig. 4B), although, in the T1
generation (Fig. 7), homozygous, heterozygous and non-mutant plants were
observed; mutations in homozygous plants were bi-allelic, with two mutations in
the same plant. It is possible that Cas9 activity in T0 was below the level of
detection although Cas9 was active in the germ cell line. Alternatively, the T0 plant was
chimeric, with expression of Cas9/sgRNA2 occurring in the germ cell line. The
differences in Cas9 targeting between the three T0 lines (CEC1-1, CEC1-4 and
CEC2-5) may be a result of differential activities of Cas9 in different
transgenic lines, depending on the transgene insertion site (Chandrasekaran et
al., 2016).
References
Barrangou, R. 2015. Diversity of CRISPR-Cas Immune Systems and
Molecular Machines. Genome
Biology. 16: 247.
Barrangou, R. and L.A. Marraffini. 2014. CRISPR-Cas
Systems: Prokaryotes Upgrade to Adaptive Immunity.
Molecular Cell. 54(2): 234-244.
Chandrasekaran, J., M. Brumin, D. Wolf, D. Leibman, C. Klap, M.
Pearlsman, A. Sherman, T. Arazi, and A. Gal-On. 2016. Development of Broad
Virus Resistance in Non-Transgenic Cucumber Using CRISPR/Cas9 Technology. Molecular
Plant Pathology. 17(7): 1140-1153.
Cong, L., F.A. Ran, D. Cox, S. Lin, R. Barretto, N. Habib, P.D. Hsu, X. Wu, W. Jiang,
L.A. Marraffini, and F. Zhang. 2013. Multiplex Genome Engineering Using CRISPR/Cas
Systems. Science. 339(6121): 819-823.
Grissa, I., G. Vergnaud, and C. Pourcel. 2007. CRISPRFinder: A Web Tool
to Identify Clustered Regularly Interspaced Short Palindromic Repeats. Nucleic
Acids Research. 35 (Web Server issue): W52-W57.
Makarova, K.S., D.H. Haft, R. Barrangou, S.J.J. Brouns,
E. Charpentier, P. Horvath, S. Moineau, F.J.M. Mojica, Y.I. Wolf, A.F. Yakunin,
J. van der Oost, and E.V. Koonin. 2011. Evolution and Classification of
the CRISPR-Cas Systems. Nature Reviews Microbiology. 9: 467-477.
Rath, D., L. Amlinger, A. Rath, and M. Lundgren. 2015. The
CRISPR-Cas Immune System: Biology, Mechanisms and Applications. Biochimie. 117: 119-128.
Sternberg, S.H., H. Richter, E. Charpentier, and U. Qimron. 2016. Adaptation
in CRISPR-Cas Systems. Molecular Cell. 61: 797-808.