Biology: CRISPR/Cas Systems and Their Application for Genome Editing

1. Introduction

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is an array of short repeated sequences separated by spacers with unique sequences. CRISPR arrays can be found on both chromosomal and plasmid DNA of bacteria. The spacers are often derived from the nucleic acids of viruses and plasmids, and are used as recognition elements to find matching virus genomes and destroy them. These sequences play a key role in a bacterial immune system and form the basis of a genome editing technology known as CRISPR/Cas9, which allows permanent modification of genes within organisms.

CRISPR activity requires the presence of a set of CRISPR-associated (Cas) genes, usually found adjacent to the CRISPR array, which code for proteins necessary for the immune response. Generally, CRISPR-Cas systems and their elements are hypervariable and differ broadly in terms of occurrence, genes, sequences, number, and size across genomes. Indeed, CRISPR repeats can vary widely in length (23–55 nucleotides), though they are typically 28–37 nucleotides long, and their partially palindromic nature allows them to form hairpin structures. Likewise, CRISPR spacers can also vary in size (21–72 nucleotides), though they are typically 32–38 nucleotides long (Barrangou and Marraffini, 2014). The CRISPR locus, first observed in Escherichia coli, is present in about 90% of sequenced archaeal genomes and 40% of sequenced bacterial genomes (Grissa et al., 2007).
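The repeat-spacer layout described above can be illustrated with a minimal Python sketch. It assumes that the repeat sequence is already known and that every copy in the array is identical, which is a simplification (real repeats can be degenerate). The function name and all sequences are hypothetical toy values, not taken from any genome.

```python
def extract_spacers(array_seq: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive exact copies of `repeat`."""
    parts = array_seq.upper().split(repeat.upper())
    # parts[0] precedes the first repeat and parts[-1] follows the last one;
    # only the inner pieces are flanked by a repeat on both sides.
    return [p for p in parts[1:-1] if p]


# Hypothetical toy array: leader, then repeat-spacer-repeat-spacer-repeat, then trailer.
repeat = "GTTCACTGCCGTACAGGCAGCTTAGAAA"            # 28 nt, illustrative only
spacer_1 = "ATGCATGCATGCATGCATGCATGCATGCATGC"      # 32 nt
spacer_2 = "TTGACCTTGACCTTGACCTTGACCTTGACCTTGA"    # 34 nt
array_seq = "CCTATA" + repeat + spacer_1 + repeat + spacer_2 + repeat + "TTT"

for spacer in extract_spacers(array_seq, repeat):
    print(len(spacer), spacer)
```

Exact string matching keeps the sketch short; a tool such as CRISPRFinder (Grissa et al., 2007) has to discover the repeats itself and tolerate variation between them.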

2. CRISPR/Cas Mechanism

The CRISPR-Cas immune response can be divided into three stages (Fig. 1). The first stage, adaptation (spacer acquisition), leads to the insertion of new spacers into the CRISPR locus. In the second stage, expression, the system gets ready for action by expressing the Cas genes and transcribing the CRISPR array into a long precursor CRISPR RNA (pre-crRNA). The pre-crRNA is subsequently processed into mature crRNA by Cas proteins and accessory factors. In the third stage, interference, the target nucleic acid is recognized and destroyed by the combined action of crRNA and Cas proteins (Rath et al., 2015).

Fig. 1. The key steps of CRISPR-Cas immunity. 1) Adaptation: insertion of new spacers into the CRISPR locus. 2) Expression: transcription of the CRISPR locus and processing of CRISPR RNA. 3) Interference: detection and degradation of mobile genetic elements by CRISPR RNA and Cas protein(s) (Rath et al., 2015).

The first stage of the CRISPR-Cas system is adaptation (spacer acquisition) (Fig. 2). In the adaptation stage, short pieces of viral or plasmid DNA are integrated into the CRISPR locus. Each integration event is accompanied by the duplication of a repeat and thus creates a new spacer-repeat unit. The selection of spacer precursors (protospacers) from the invading DNA appears to be determined by the recognition of protospacer adjacent motifs (PAMs); PAMs are usually only a few nucleotides long and differ between variants of the CRISPR-Cas system (Makarova et al., 2011).

Two models of spacer acquisition have been reported for type I systems: (1) naive and (2) primed. During naive adaptation, the organism obtains a spacer from a foreign DNA source. In contrast, primed acquisition relies on a pre-existing (priming) spacer which enables a biased and enhanced uptake of new spacers. Both models are based on the action of two key proteins, Cas1 and Cas2. Naive adaptation requires only Cas1 and Cas2, whereas primed adaptation additionally requires the type I interference complex Cascade (CRISPR-associated complex for antiviral defense) and the Cas3 nuclease (Sternberg et al., 2016).

Fig. 2. Model of adaptation in the Type I-E system. There are two types of spacer acquisition, naïve and primed. Both require the presence of a PAM and are dependent on the Cas1-Cas2 complex. The Cas1-Cas2 complex recognizes the CRISPR and likely prepares it for spacer integration. Naïve spacer acquisition occurs when there is no previous information about the target in the CRISPR. Primed spacer acquisition requires a spacer in the CRISPR locus that matches the target DNA and the presence of Cas3 and the Cascade complex. Primed acquisition results in the insertion of more spacers from the same mobile genetic element (Rath et al., 2015).

The second stage of the CRISPR-Cas system is expression (Fig. 3). The CRISPR array is transcribed into a full-length pre-CRISPR RNA (pre-crRNA) transcript, which is processed into mature crRNAs that contain partial CRISPR spacer sequences attached to partial CRISPR repeats and serve as CRISPR guide RNAs. The processing step is catalyzed by endoribonucleases that operate either as a subunit of a larger complex (such as Cascade in Escherichia coli) or as a single enzyme (such as Cas6 in the archaeon Pyrococcus furiosus) and form a CRISPR ribonucleoprotein (crRNP) complex. Another variant was discovered in Streptococcus pyogenes, in which a trans-encoded small RNA (tracrRNA) acts as a guide for the processing of pre-crRNA, which in this organism is catalyzed by RNase III in the presence of Csn1 (also known as Cas9). In the case of the Cascade complex of type I CRISPR-Cas systems, the mature crRNA remains associated with the complex after the initial endonuclease cleavage, whereas in P. furiosus the crRNA, processed by Cas6, is passed on to a distinct Cas protein complex (the Cascade complex of type III systems, Cmr-type), where it is processed further at the 3′ end by unknown nucleases (Makarova et al., 2011).

The third stage of the CRISPR-Cas system is interference (Fig. 3). The principle of target interference is that crRNA bound to Cas protein(s) locates the corresponding protospacer and triggers degradation of the target. The crRNAs guide Cas nucleases towards complementary nucleic acids for sequence-specific targeting and cleavage of invasive genetic elements. Most CRISPR effector proteins initiate targeting by interacting with a particular two- to four-nucleotide sequence motif, the PAM. Once interaction with the PAM has been established, the crRNA guide loaded within the Cas nuclease interrogates the flanking target DNA. The strength and duration of the molecular interaction correlate with the level of complementarity between the crRNA and the target DNA, which drives conformational changes in Cas effector proteins, such as Cas9 and Cascade, that eventually lead to a cleavage-competent structural state. If complementarity between the guide RNA and the target DNA extends beyond the seed sequence, a DNA R-loop is formed directionally, which triggers subsequent nicking by the Cas effector nucleases (e.g. Cas3, Cas9, Cpf1) at particular locations defined by a ruler-anchor mechanism (Barrangou, 2015).
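The order of checks in this paragraph (PAM first, then the seed, then the rest of the guide) can be sketched as a toy decision function. This is only an illustration under loud assumptions: a Cas9-like geometry with an NGG PAM directly 3′ of the protospacer, a PAM-proximal seed of 10 nucleotides, and a tolerance of two mismatches outside the seed. None of these numbers comes from the source, and real effector kinetics are not a simple threshold. The guide is written with DNA letters identical to the protospacer strand so that a match means identical characters.

```python
def pam_matches(pam_site: str, pam_pattern: str = "NGG") -> bool:
    """Compare a PAM-sized site with a pattern in which N stands for any base."""
    pam_site = pam_site.upper()
    return len(pam_site) == len(pam_pattern) and all(
        p == "N" or p == b for p, b in zip(pam_pattern, pam_site)
    )


def interrogate(guide: str, protospacer: str, pam_site: str,
                seed_len: int = 10, max_distal_mismatches: int = 2) -> str:
    """Toy model of target interrogation; all thresholds are assumed values."""
    if len(guide) != len(protospacer):
        raise ValueError("guide and protospacer must have the same length")
    if not pam_matches(pam_site):
        return "no_pam"                # the effector never engages the site
    mismatch = [g != t for g, t in zip(guide.upper(), protospacer.upper())]
    if any(mismatch[-seed_len:]):      # seed = PAM-proximal end (assumed)
        return "seed_mismatch"         # pairing is abandoned early
    if sum(mismatch[:-seed_len]) > max_distal_mismatches:
        return "partial_r_loop"        # pairing stalls before completion
    return "cleavage_competent"


guide = "GACGTTACCGGATCAAGCTT"         # hypothetical 20-nt guide
print(interrogate(guide, guide, "TGG"))
print(interrogate(guide, guide[:-1] + "A", "TGG"))
```

Returning a named state rather than a boolean mirrors the staged description in the text: a site can be rejected at the PAM, at the seed, or only after extended pairing.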

Fig. 3. Model of crRNA processing and interference. (A) In Type I systems, the pre-crRNA is processed by Cas5 or Cas6. DNA target interference requires Cas3 in addition to Cascade and crRNA. (B) Type II systems use RNase III and tracrRNA for crRNA processing, together with an unknown additional factor that performs 5′ end trimming. Cas9 targets DNA in a crRNA-guided manner. (C) The Type III systems also use Cas6 for crRNA processing, but in addition an unknown factor performs 3′ end trimming. Here, the Type III Csm/Cmr complex is drawn as targeting DNA, but RNA may also be targeted (Rath et al., 2015).

The Cas proteins are a highly diverse group. Many of them are predicted or known to interact with nucleic acids (e.g. as nucleases, helicases, and RNA-binding proteins). The Cas1 and Cas2 proteins are involved in the adaptation stage (spacer acquisition) and are virtually universal among CRISPR-Cas systems. Other Cas proteins are associated only with certain types of CRISPR-Cas systems. The most widely adopted classification identifies Type I, II and III CRISPR-Cas systems, each with several subtypes. Different types of CRISPR-Cas systems can co-exist in a single organism. A Type IV system has also been proposed, which contains several Cascade genes but no CRISPR, cas1 or cas2. The Type IV complex would be guided by protein-DNA interaction, not by crRNA, and would constitute an innate immune system preset to attack certain sequences (Rath et al., 2015).

The Type I systems are defined by the presence of the signature protein Cas3, a protein with both helicase and DNase domains responsible for degrading the target. Currently, six subtypes of the Type I system are identified (Type I-A through Type I-F) which have a variable number of Cas genes. Apart from cas1, cas2 and cas3, all Type I systems encode a Cascade-like complex. Cascade binds crRNA and locates the target, and most variants are also responsible for processing the crRNA. Cascade also enhances spacer acquisition in some cases. In the Type I-A system, Cas3 is a part of the Cascade complex (Rath et al., 2015).

The Type II CRISPR-Cas systems encode Cas1 and Cas2, the Cas9 signature protein and sometimes a fourth protein (Csn2 or Cas4). Cas9 assists in the adaptation stage (spacer acquisition), participates in crRNA processing and cleaves the target DNA, assisted by crRNA and an additional RNA called tracrRNA. Type II systems have been divided into subtypes II-A and II-B, and a third subtype, II-C, has been suggested. The csn2 and cas4 genes, both encoding proteins involved in the adaptation stage, are present in Type II-A and Type II-B, respectively, while Type II-C lacks a fourth gene (Rath et al., 2015). The simplicity of the Type II CRISPR nuclease, with only three required components (Cas9 along with the crRNA and tracrRNA), makes this system amenable to adaptation for genome editing.

The Type III CRISPR-Cas systems contain the signature protein Cas10, whose function is unclear. Most of their Cas proteins are components of the Csm (in Type III-A) or Cmr (in Type III-B) complexes, which are similar to Cascade. Interestingly, while all Type I and II systems are known to target DNA, Type III systems target DNA and/or RNA. So far, the Type II systems have been found exclusively in bacteria, while the Type I and Type III systems occur in both bacteria and archaea (Rath et al., 2015).

3. Application of CRISPR/Cas System for Genome Editing

The recently developed CRISPR/Cas9 system has been used in diverse eukaryotes for targeted genome editing (Cong et al., 2013). This system comprises the Cas9 endonuclease of Streptococcus pyogenes and a synthetic guide RNA (gRNA), which combines the functions of CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA). The gRNA directs the Cas9 endonuclease to a target sequence complementary to the 20 nucleotides preceding the protospacer adjacent motif (PAM) (NGG) required for Cas9 activity. The specificity of the system, and the fact that targeting is determined by the 20-nucleotide sequence of the gRNA, allow for unprecedented, facile genome engineering. Furthermore, the CRISPR/Cas9 system is amenable to making multiple, simultaneous targeted modifications (multiplexing).
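The targeting rule stated above (20 nucleotides immediately followed by an NGG PAM) can be turned into a short Python sketch that lists candidate sites on both strands of a DNA sequence. The function names and the example sequence are hypothetical, and the sketch only applies the sequence rule; it says nothing about off-target effects or cutting efficiency.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_cas9_sites(seq: str, guide_len: int = 20) -> list[tuple[str, int, str, str]]:
    """List (strand, start, protospacer, pam) for every guide_len-mer followed by NGG.

    `start` is 0-based on the strand that was scanned, so for "-" sites it
    refers to the reverse-complemented sequence.
    """
    # The lookahead makes overlapping candidate sites visible to finditer.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    sites = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            sites.append((strand, m.start(), m.group(1), m.group(2)))
    return sites


# Hypothetical example: one 20-nt protospacer followed by the PAM "AGG".
example = "TT" + "GACGTTACCGGATCAAGCTT" + "AGG" + "TT"
for site in find_cas9_sites(example):
    print(site)
```

Scanning the reverse complement as a separate pass keeps the pattern identical for both strands, at the cost of reporting minus-strand coordinates relative to the reversed sequence.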

CRISPR/Cas9 can be used to generate virus resistance in cucumber by targeting the eIF4E gene. eIF4E is a plant cellular translation factor essential for the Potyviridae life cycle, and natural point mutations in its gene can confer resistance to potyviruses. In cucumber, two eIF4E genes have been identified: eIF4E (accession no. XM_004147349) (236 amino acids) and eIF(iso)4E (accession no. XM_004147116.2) (204 amino acids). Two regions of the cucumber eIF4E gene that have no homology in the eIF(iso)4E gene are targeted by Cas9/sgRNA (Chandrasekaran et al., 2016).

The Cas9/sgRNA1 construct was designed to target a sequence in the first exon of eIF4E (positions 65–86 in the coding region) (Fig. 4A). The Cas9/sgRNA2 construct was designed to target the third exon (positions 517–540 in the coding region) to allow translation of approximately two-thirds of the protein, perhaps without disrupting all of its functions (Fig. 4A). Five independent T0 transgenic lines were generated by Agrobacterium-mediated transformation. The presence of the transgene (Cas9/sgRNA) was confirmed by kanamycin resistance and polymerase chain reaction (PCR) using sgRNA-specific primers (Fig. 4B). Three lines (1, 3, and 4) were identified as transgenic with Cas9/sgRNA1 (Chandrasekaran et al., 2016).

To evaluate the types of mutations generated in sgRNA1 transgenic plants, the region around the sgRNA1 target was amplified by PCR from T0 plants using flanking primers, and the product was digested with BmgBI (whose recognition site would be lost if Cas9 cleavage and non-homologous end joining (NHEJ) repair had occurred at this location). In line 1, a distinct undigested fragment was observed following BmgBI restriction (Fig. 4B). The partial digestion observed indicated a heterozygous genome with both wild-type and mutant eIF4E alleles. Cloning and sequencing of the uncut BmgBI fragment showed two types of mutation: a 20-nucleotide deletion around the PAM sequence in seven colonies and a one-nucleotide deletion 3 bp upstream of the PAM sequence in two colonies (Fig. 4C). In lines 4 (Fig. 4B) and 3, the amplified PCR product was completely digested by BmgBI (Chandrasekaran et al., 2016).
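The logic of this restriction-site-loss assay can be sketched in a few lines of Python: an allele is cut only if it still carries the recognition site, so a mixture of cut and uncut alleles predicts partial digestion. The amplicon sequences below are hypothetical toy strings, not the cucumber eIF4E sequence, and the recognition sequence CACGTC used for BmgBI is an assumption that should be checked against the enzyme supplier's documentation.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_site(amplicon: str, site: str) -> bool:
    """True if the recognition site occurs on either strand of the amplicon."""
    amplicon, site = amplicon.upper(), site.upper()
    return site in amplicon or reverse_complement(site) in amplicon


def predict_digest(alleles: list[str], site: str) -> str:
    """Predict the digestion pattern from the alleles present in a PCR product."""
    cut = [has_site(allele, site) for allele in alleles]
    if all(cut):
        return "fully digested"      # no edit detected at the site
    if not any(cut):
        return "undigested"          # every allele has lost the site
    return "partially digested"      # mix of wild-type and edited alleles


SITE = "CACGTC"                                  # assumed BmgBI site, verify before use
wild_type = "ATTGGACACGTCAGGTTCAA"               # hypothetical amplicon
edited = wild_type.replace("CACGTC", "CAGTC")    # toy 1-nt deletion inside the site

print(predict_digest([wild_type, edited], SITE))
```

Under this model, the partial digestion reported for line 1 corresponds to a mixture of wild-type and edited alleles, and the complete digestion reported for lines 3 and 4 to unedited alleles only.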

Two additional Cas9/sgRNA2 transgenic plants (2 and 5) did not show genome editing in T0, as determined by PCR and restriction analysis with BglII (Fig. 4B). The experiment was then continued with two sgRNA1 lines (1 and 4), designated CEC1-1 and CEC1-4, respectively (Cas/sgRNA1-eIF4E-Cucumber), and one sgRNA2 line (line 5), designated CEC2-5.

Fig. 4. Gene editing of eIF4E mediated by CRISPR/Cas9 in transgenic cucumber plants. (A) Schematic representation of the cucumber eIF4E genomic map and the sgRNA1 and sgRNA2 target sites (red arrows). The target sequence is shown in red letters together with the restriction site (underlined), and the protospacer adjacent motif (PAM) is marked in bold underlined letters. The black arrows indicate the primers flanking the target sites used to detect the mutations. (B) Restriction analysis of T0 polymerase chain reaction (PCR) fragments of CEC1-1, CEC1-4, and CEC2-5. (C) Alignment of nine colony sequences from the undigested fragment of line 1 with the wild-type (wt) genome sequence. DNA deletions are shown by red dashes and deletion sizes (nucleotides) are marked on the right side of the sequence (Chandrasekaran et al., 2016).

Fig. 5. Genotyping of eIF4E mutants in representative T1 progeny plants of the CEC1-1 line. (A) Polymerase chain reaction (PCR) restriction analysis of Cas9/sgRNA1-mediated mutations (top panel) and transgene insertion (bottom panel) in 10 representative T1 cucumber plants and non-mutant wild-type (wt). (B) Alignment of four representative eIF4E mutant plants with the wild-type sequence. The sequences of each plant represented clones from undigested fragments. The target sequence is shown in red letters and the protospacer adjacent motif (PAM) is marked by bold underlined letters. DNA deletions are marked with red dashes and deletion sizes (nucleotides) are indicated on the right side of the sequence (Chandrasekaran et al., 2016).

Fig. 6. Genotyping of eIF4E mutants in representative T1 progeny plants of the CEC1-4 line. (A) Polymerase chain reaction (PCR) restriction analysis of Cas9/sgRNA1-mediated mutations (top panel) and transgene insertion (bottom panel) in eight T1 cucumber plants and non-mutant wild-type (wt). (B) Alignment of three eif4e transgenic mutant plants 4, 5 and 6 with the wild-type sequence. Sequences of each plant represent clones from undigested fragments. The target sequence is shown in red letters and the protospacer adjacent motif (PAM) is marked in bold underlined letters. DNA deletions are marked by red dashes and deletion sizes (nucleotides) are indicated on the right side of the sequence (Chandrasekaran et al., 2016).

Fig. 7. Genotyping of the Cas9/sgRNA2-mediated mutation in T1 progeny plants of the CEC2-5 line. (A) Polymerase chain reaction (PCR) restriction analysis of Cas9/sgRNA2-mediated mutations (top panel) and the presence of the Cas9/sgRNA2 transgene (bottom panel) in eight representative T1 cucumber plants. (B) Alignment of four representative eIF4E mutant plants with the wild-type sequence. The target sequence is shown in red letters and the protospacer adjacent motif (PAM) is marked in bold underlined letters. DNA deletions or insertions are marked by red dashes and letters, and the sizes of the deletions or insertions (nucleotides) are indicated on the right side of the sequence (Chandrasekaran et al., 2016).

Two Cas9/sgRNA constructs were designed to target the cucumber eIF4E gene: 1) sgRNA1 was expected to disrupt the intact eIF4E protein, and 2) sgRNA2 was expected to permit translation of two-thirds of the protein product. In Agrobacterium-transformed T0 lines, deletions were found in the eIF4E target gene in one line (CEC1-1) out of five (Fig. 4). In the CEC1-1 line, the same mutations were observed in the T1 generation, which implies that the CEC1-1 T0 plant was heterozygous and mono-allelic. In the transgenic line CEC1-4, all of the progeny from the T1 and T2 generations showed partial cleavage activity of Cas9 (Fig. 6), which implies that cleavage occurred only in somatic cells and not in the germ cell line. Such phenomena may result from the transgene insertion site of Cas9 or from the spatial specificity of the 35S promoter in the plant genome, either of which could cause a low level of expression and activity (Chandrasekaran et al., 2016).

In the transgenic T0 generation of the CEC2-5 line, no mutation was detected by PCR and restriction analysis (Fig. 4B). In the T1 generation (Fig. 7), however, homozygous, heterozygous and non-mutant plants were observed; mutations in homozygous plants were bi-allelic, with two mutations in the same plant. It is possible that Cas9 was active in the germ cell line of the T0 plant although its activity was undetectable. Alternatively, the T0 plant may have been chimeric, with Cas9/sgRNA2 expressed in the germ cell line. The differences in Cas9 targeting between the three T0 lines (CEC1-1, CEC1-4 and CEC2-5) may result from differential activities of Cas9 in different transgenic lines, depending on the transgene insertion site (Chandrasekaran et al., 2016).

References

Barrangou, R. 2015. Diversity of CRISPR-Cas Immune Systems and Molecular Machines. Genome Biology. 16:247.

Barrangou, R. and L.A. Marraffini. 2014. CRISPR-Cas Systems: Prokaryotes Upgrade to Adaptive Immunity. Molecular Cell. 54(2): 234-244.

Chandrasekaran, J., M. Brumin, D. Wolf, D. Leibman, C. Klap, M. Pearlsman, A. Sherman, T. Arazi, and A. Gal-On. 2016. Development of Broad Virus Resistance in Non-Transgenic Cucumber Using CRISPR/Cas9 Technology. Molecular Plant Pathology. 17(7): 1140-1153.

Cong, L., F.A. Ran, D. Cox, S. Lin, R. Barretto, N. Habib, P.D. Hsu, X. Wu, W. Jiang, L.A. Marraffini, and F. Zhang. 2013. Multiplex Genome Engineering Using CRISPR/Cas Systems. Science. 339(6121): 819-823.

Grissa, I., G. Vergnaud, and C. Pourcel. 2007. CRISPRFinder: A Web Tool to Identify Clustered Regularly Interspaced Short Palindromic Repeats. Nucleic Acids Research. 35 (Web Server issue): W52-W57.

Makarova, K.S., D.H. Haft, R. Barrangou, S.J.J. Brouns, E. Charpentier, P. Horvath, S. Moineau, F.J.M. Mojica, Y.I. Wolf, A.F. Yakunin, J. van der Oost, and E.V. Koonin. 2011. Evolution and Classification of the CRISPR-Cas Systems. Nature Reviews Microbiology. 9: 467-477.

Rath, D., L. Amlinger, A. Rath, and M. Lundgren. 2015. The CRISPR-Cas Immune System: Biology, Mechanisms and Applications. Biochimie. 117: 119-128.

Sternberg, S.H., H. Richter, E. Charpentier, and U. Qimron. 2016. Adaptation in CRISPR-Cas Systems. Molecular Cell. 61: 797-808.

Friday, July 20, 2018
