Gene editing encompasses a powerful set of tools for biological research, but it also carries the risk of unintended changes to the host genome. Modifying DNA can trigger functional changes at sites both near and far from the target locus, potentially disrupting genes as well as their regulatory elements. Even when a new DNA segment is inserted successfully, cellular machinery may recognize it as foreign and silence its expression via methylation.1 Copy-number changes, translocations, unplanned single-nucleotide alterations, chromosomal rearrangements and generalized genomic instability are all possible experimental outcomes. Such changes can increase the risk of cell death and cancer. For all these reasons, there is a need to identify safe-harbor insertion sites: regions where exogenous DNA may be introduced without disrupting endogenous functions or triggering the host’s defensive mechanisms.

Research into safe harbor sites has been under way for several decades, with contributions from investigators such as Rudolf Jaenisch at the Whitehead Institute for Biomedical Research and the Massachusetts Institute of Technology (MIT), Philippe Soriano at the Icahn School of Medicine at Mount Sinai, and many others. Historically, identifying safe harbor sites has been laborious and slow, often involving work with murine models.2 At first, data were gathered in a wreck-and-check fashion. More recently, however, bioinformatics has enabled researchers in the field to attack the same problem with much faster methods.

In 2022, Erik Aznauryan, now at Harvard University’s Wyss Institute, and colleagues reported the in silico identification of nearly two thousand potential safe harbor loci.3 Five of those were selected for experimental validation via CRISPR/Cas9. Custom guide RNA (gRNA) sequences were designed to direct insertion at each locus in HEK293T cells and Jurkat cells. These cell types were chosen for their use in recombinant protein manufacturing and in engineered immune receptor research, respectively. Experimental outcomes were judged first on successful integration of a sequence encoding the fluorescent reporter protein mRuby. Cells expressing the protein were selected by fluorescence-activated cell sorting (FACS), and the stability of expression was then measured over weeks and months. Additionally, expression of known oncogenes was monitored over time to spot any clones showing signs of developing malignancy. By these methods, two safe harbor sites were reported and labeled Region Optimal for Gene Insertions 1 and 2 (Rogi1 and Rogi2).

Building on this and similar research efforts, Aashutosh Girish Boob at the University of Illinois Urbana-Champaign and colleagues have reported this year the development of CRISPR-COPIES, a “COmputational Pipeline for the Identification of CRISPR/Cas-facilitated intEgration Sites.”4 As the name suggests, CRISPR-COPIES streamlines the identification of compatible gRNA sequences and target intergenic loci. The pipeline is designed to work regardless of the cell type and the CRISPR/Cas system in use, which the team reports demonstrating experimentally in both eukaryotic and prokaryotic cells: Saccharomyces cerevisiae, Cupriavidus necator, and human embryonic kidney (HEK) 293T cells.
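
To make the idea of pairing gRNA sequences with intergenic loci concrete, here is a minimal Python sketch of one small step such a pipeline might include: scanning a stretch of intergenic sequence for candidate SpCas9 target sites, meaning a 20-nucleotide protospacer followed by an NGG PAM on either strand. This is not the CRISPR-COPIES implementation. The function and type names (`find_ngg_sites`, `GuideSite`) are hypothetical, the region coordinates are assumed to be supplied by the user, and a real tool would also score on-target activity, check for off-target matches across the genome, and apply whatever locus filters the authors describe.

```python
from typing import List, NamedTuple


class GuideSite(NamedTuple):
    """A candidate SpCas9 target site (hypothetical record type)."""
    start: int        # 0-based forward-strand coordinate where the protospacer begins
    strand: str       # "+" or "-"
    protospacer: str  # 20-nt target sequence, written 5'->3' on its own strand
    pam: str          # 3-nt PAM (NGG), written 5'->3' on its own strand


_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_ngg_sites(seq: str, region_start: int, region_end: int,
                   guide_len: int = 20) -> List[GuideSite]:
    """List protospacer + NGG sites lying entirely within [region_start, region_end).

    Assumes `seq` is a DNA string and that the half-open interval marks an
    intergenic region chosen elsewhere (for example, from a gene annotation).
    """
    seq = seq.upper()
    site_len = guide_len + 3  # protospacer plus 3-nt PAM
    sites: List[GuideSite] = []
    for i in range(region_start, region_end - site_len + 1):
        window = seq[i:i + site_len]
        if set(window) - set("ACGT"):
            continue  # skip windows containing ambiguous bases such as N
        # Forward strand: protospacer followed by NGG.
        if window.endswith("GG"):
            sites.append(GuideSite(i, "+", window[:guide_len], window[guide_len:]))
        # Reverse strand: appears on the forward strand as CCN followed by the protospacer.
        if window.startswith("CC"):
            rc = reverse_complement(window)
            sites.append(GuideSite(i + 3, "-", rc[:guide_len], rc[guide_len:]))
    return sites


if __name__ == "__main__":
    toy = "A" * 20 + "TGG"  # toy sequence, not real genomic data
    for site in find_ngg_sites(toy, 0, len(toy)):
        print(site)
```

The sketch only enumerates candidates. Deciding which of them sit in a locus that is actually safe for integration is the harder problem that the studies described here address.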

These reported advances in the predictive power of new bioinformatic tools bode well for the field of gene editing. The arrival of CRISPR-based systems was rightly hailed as a breakthrough moment for biotechnology, but the hurdle of off-target effects was not broadly appreciated at first. In the years since, a growing understanding of the problem has allowed experts to treat it increasingly as an engineering challenge, gaining ground iteratively with each experiment. This points to a future in which precise modification of genetic information becomes not just a possibility but a routine reality, offering new opportunities for scientific discovery and societal benefit.



  1. Bestor, T. H. (2000). Gene silencing as a threat to the success of gene therapy. The Journal of Clinical Investigation, 105(4), 409-411.
  2. Palmiter, R. D., & Brinster, R. L. (1986). Germ-line transformation of mice. Annual Review of Genetics, 20, 465-499.
  3. Aznauryan, E., Yermanos, A., Kinzina, E., et al. (2022). Discovery and validation of human genomic safe harbor sites for gene and cell therapies. Cell Reports Methods, 2(1), 100154.
  4. Boob, A. G., Zhu, Z., Intasian, P., et al. (2024). CRISPR-COPIES: An in silico platform for discovery of neutral integration sites for CRISPR/Cas-facilitated gene integration. Nucleic Acids Research, 52(6), e30.