DNA methylation - an introduction

Marta Nabais | Jun 28, 2023

Who is this post for

This post is an almost direct transcription of the first paragraphs of Chapter 1 of my PhD thesis. When I was writing it, I assumed the reader would have at least a basic knowledge of biology, and some familiarity with genetics and developmental biology. Some terms may be hard to digest if you’re not familiar with the science / history behind them. I hope this does not discourage you, but if it does, I have provided a Glossary at the end of this page to help guide you through it.

DNA methylation

DNA methylation is a covalent molecular modification by which methyl groups (i.e., chemical groups composed of one carbon atom bonded to three hydrogen atoms) are added to the 5th carbon position of cytosine.

Hold on: 5th carbon position of cytosine?…

What does that mean?

You’re right, let’s go back to basics, for a bit.

DNA is made of two linked strands that wind around each other, forming a double helix. Each strand has a backbone that includes carbohydrates (molecules composed of carbon, hydrogen and oxygen atoms), more commonly known as sugars. I bet you know a few. You may even be addicted to some of them! In the case of DNA, the sugar is deoxyribose. The 5’-end (pronounced “five prime end”) designates the end of the DNA strand that has the fifth carbon in the sugar-ring of the deoxyribose at its terminus.

Figure 1: A furanose (sugar-ring) molecule with carbon atoms labeled using standard notation. The 5’ is upstream; the 3’ is downstream. DNA and RNA are synthesized in the 5’-to-3’ direction. From Wikipedia.

Also part of the DNA backbone are phosphate groups. The phosphate residue is attached to the hydroxyl group of the 5’ carbon of one sugar and the hydroxyl group of the 3’ carbon of the sugar of the next nucleotide, which forms a 5’–3’ phosphodiester linkage. These are very important in providing stability to DNA.

Attached to each sugar is also one of four bases: adenine (A), cytosine (C), guanine (G) or thymine (T). They are bases due to the basic nature of their nitrogen functional groups.

If you look at the figure below on the left, you can see the enlarged structures of A, C, G & T. Do you notice how T & C (or A & G) resemble each other much more than A & T (or C & G)? Take a close look at their atomic composition. T and C are pyrimidines. A and G are purines. A purine always pairs with a pyrimidine (A with T, and G with C), which is known as complementary pairing. Try to make it a game and figure out the differences for yourself!

Figure 2: DNA molecular structure (on the left) and structure of a methyl group (on the right). Relative methyl group dimensions are amplified here!
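If you prefer to see the pairing rule as code, here is a minimal Python sketch. It is only an illustration of the purine/pyrimidine classes and the A–T, G–C pairs described above; the function names (`base_class`, `reverse_complement`) are my own and do not come from any particular library.

```python
# Base classes and pairing partners for the four DNA bases.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def base_class(base: str) -> str:
    """Return 'purine' or 'pyrimidine' for a single DNA base."""
    base = base.upper()
    if base in PURINES:
        return "purine"
    if base in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"not a DNA base: {base!r}")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, also written in the 5' to 3' direction."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


# Every base pairs with a partner from the other class.
for b in "ACGT":
    print(b, base_class(b), "pairs with", COMPLEMENT[b], base_class(COMPLEMENT[b]))

print(reverse_complement("AAGC"))  # GCTT
```

The strand is reversed before complementing because the two strands run in opposite directions, so reading the partner strand 5' to 3' means reading it backwards.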

Going back to DNA methylation itself: in 1975, two key independent studies suggested that methylation of cytosine residues in the context of cytosine-guanine (CpG) dinucleotides could act as an epigenetic regulator of gene expression in vertebrates (Holliday and Pugh 1975; Jones 2012; Riggs 2008). CpG sites are regions of DNA where a cytosine nucleotide is followed by a guanine nucleotide in the linear sequence of bases along the 5’ → 3’ direction (the “p” stands for the phosphate that links the two nucleotides).

In the vertebrate world (and eukaryotes in general), the most common methylation modification occurs at the fifth carbon of the pyrimidine ring (5mC) at Cytosine-Guanine (CpG) sites. The symmetrical presence of CpG methylation marks on both DNA strands allows the maintenance of DNA methylation patterns after mitosis and is therefore a key feature of epigenetic regulation (Holliday and Pugh 1975; Riggs 2008).
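To make the idea of symmetry concrete, here is a small Python sketch. It assumes a made-up 10-base sequence (`top`) and two helper functions I named myself; it simply shows that wherever one strand reads CG in the 5' to 3' direction, the opposite strand reads CG at the mirrored position too.

```python
def find_cpg_sites(seq: str) -> list[int]:
    """Return 0-based positions of the C in every CG dinucleotide (5' to 3')."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[b] for b in reversed(seq.upper()))


top = "ATCGGACGTT"                 # toy sequence, not real data
bottom = reverse_complement(top)   # "AACGTCCGAT"

top_sites = find_cpg_sites(top)
bottom_sites = find_cpg_sites(bottom)

# A CpG starting at position i on the top strand sits opposite a CpG
# starting at position len(top) - 2 - i on the bottom strand.
mirrored = sorted(len(top) - 2 - i for i in top_sites)

print(top_sites, bottom_sites, mirrored == bottom_sites)
```

Because "CG" is its own reverse complement, each CpG always has a partner CpG on the other strand. This is what lets a methylation mark on the old strand serve as a template for marking the newly made strand after replication.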

Most of the genome is depleted of CpGs because methylated cytosines are prone to spontaneous deamination, which converts them to thymine. Certain regions, known as CpG islands, retain the expected CpG levels (Smith and Meissner 2013). These occur mainly at the transcription start sites of housekeeping and developmental regulator genes (Deaton and Bird 2011), and are generally unmethylated. A transcription start site is the position at which a gene starts being transcribed into RNA, a single-stranded molecule that can be translated into a protein.

Figure 3: How methylation of CpG sites followed by spontaneous deamination leads to a lack of CpG sites in methylated DNA. As a result, residual CpG islands are created in areas where methylation is rare and CpG sites persist (or where a C to T mutation is highly detrimental). From Wikipedia.
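One common way to put a number on "depleted" versus "expected" CpG levels is an observed/expected ratio: the number of CpGs found, divided by the number you would expect from the C and G content alone. This ratio is not part of my thesis text; the Python sketch below is just an illustration, with made-up toy sequences and a deliberately crude `deaminate` helper that assumes every CpG on the strand shown is methylated and mutates to TpG.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0  # no CpG is possible without both C and G
    return seq.count("CG") * len(seq) / (n_c * n_g)


def deaminate(seq: str) -> str:
    """Crude model: every CpG on this strand loses its C to T (CG -> TG)."""
    return seq.upper().replace("CG", "TG")


before = "ACGTTCGGAC"       # toy sequence, not real data
after = deaminate(before)   # "ATGTTTGGAC"

print(round(cpg_obs_exp(before), 2))  # 2.22
print(cpg_obs_exp(after))             # 0.0
```

Real genomes lose CpGs gradually over evolutionary time rather than all at once, so the ratio in bulk DNA is low but not zero, while CpG islands stay close to the expected level.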

However, of the roughly 28 million CpGs in the human genome, 60-80% are methylated (Smith and Meissner 2013). Furthermore, most bulk genomic methylation patterns are stable across cell-types and throughout life, changing only in localized contexts – for example, due to disease-associated processes.

There are two exceptions to this, both during mammalian embryonic development. A first demethylation wave occurs during pre-implantation, enabling totipotency of the zygote. During this process, global demethylation of the paternal genome at fertilization is followed by a loss of methylation in both parental genomes, beginning in the zygote and continuing through the first few embryonic replication cycles (Cedar and Bergman 2012). Differentially methylated regions associated with imprinted genes, retrotransposons and centromeric heterochromatin are excepted from this loss. A wave of methylation then occurs from implantation onwards, throughout embryogenesis, allowing tissue-specific formation. Finally, a second demethylation wave takes place during the genesis of primordial germ cells, including imprinting erasure, which is once again vital to maintain totipotency (Hajkova et al. 2008; Surani, Hayashi, and Hajkova 2007).

Figure 4: Dynamics of DNA methylation during mouse embryonic development. E3.5-E6, etc., refer to days after fertilization. PGC: primordial germ cells. From Wikipedia.

In addition to methyl groups, hydroxymethyl groups (–CH2OH) have also been observed bound to cytosine nucleotides, forming 5-hydroxymethylcytosine (5hmC). 5hmC is produced when TET enzymes oxidise 5mC. Interestingly, 5hmC was found to be 10-fold more abundant in neurons than in other tissues or embryonic stem cells, suggesting that it might have a significant role in the brain (Sun et al. 2014).

Additionally, 5hmC is enriched in gene bodies, promoters, and transcription factor binding sites, and evidence from the literature suggests roles for 5hmC in regulating gene expression and controlling cell identity. DNA methylation is also found at non-CpG sites (mCHG and mCHH, where H = A, C or T) (Lister et al. 2009), a phenomenon first described in plant genomes (Gruenbaum et al. 1981; Lindroth et al. 2001), where it has a well-established functional role (Chan, Henderson, and Jacobsen 2005).
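The CG / CHG / CHH naming is easier to grasp with an example. The Python sketch below is my own illustration, not code from any methylation-calling tool: it looks at the one or two bases downstream of a cytosine and names its context, assuming the sequence contains only A, C, G and T and that only the strand shown is being considered.

```python
def cytosine_context(seq: str, pos: int) -> str:
    """Classify the cytosine at 0-based `pos` as 'CG', 'CHG' or 'CHH'.

    Returns 'unknown' when the cytosine is too close to the 3' end
    for the context to be determined.
    """
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError(f"position {pos} is {seq[pos]!r}, not a cytosine")
    downstream = seq[pos + 1:pos + 3]  # up to two bases after the C
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) < 2:
        return "unknown"
    # The first downstream base is H (A, C or T); the second decides the class.
    return "CHG" if downstream[1] == "G" else "CHH"


seq = "ACGTCAGCTTC"  # toy sequence, not real data
for i, base in enumerate(seq):
    if base == "C":
        print(i, cytosine_context(seq, i))
```

In this toy sequence the cytosines at positions 1, 4 and 7 come out as CG, CHG and CHH respectively, and the last cytosine is reported as "unknown" because nothing follows it.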

Animal studies show that non-CpG methylation is more frequent in cultured pluripotent stem cells, including human embryonic stem cells and induced pluripotent stem cells (Guo et al. 2014; Laurent et al. 2010; Lister et al. 2009, 2011; Ramsahoye et al. 2000; Stadler et al. 2011; Ziller et al. 2011), and in cells of the mouse germline (Guo et al. 2014; Ichiyanagi et al. 2013; Smith et al. 2012; Tomizawa et al. 2011), than in most somatic tissues (Guo et al. 2014; Ramsahoye et al. 2000; Ziller et al. 2011). However, several recent profiling studies have shown the presence of non-CpG methylation in the adult mouse dentate gyrus (Guo et al. 2014) and cortex (Guo et al. 2014; Lister et al. 2013; Xie et al. 2012), and in the human brain (Guo et al. 2014; Lister et al. 2013; Varley et al. 2013), with evidence suggesting clear distinctions between mCHs in the brain and those in pluripotent stem cells (Guo et al. 2014).

Most notably, non-CpG methylation accumulates significantly in neurons through early childhood and adolescence, becoming the dominant form of DNA methylation in mature human neurons (Lister et al. 2013). Indeed, several studies have suggested an independent epigenetic function for non-CpG methylation, particularly during neuronal maturation (Lister et al. 2013). For example, methylation at sites other than CpG dinucleotides can recruit methyl-CpG-binding protein 2 (MECP2), an important transcriptional repressor, particularly of long genes with neuronal functions (Chen et al. 2015; Gabel et al. 2015). It is therefore important to keep in mind that other chemical modifications to DNA (e.g., 5hmC) and methylation at non-CpG sites may also have an important (if not more important) functional role in brain tissue.

Glossary

  • 🧬centromeric heterochromatin - The centromere is the primary constriction observed in condensed chromosomes during mitosis (the process by which a cell replicates its chromosomes and then segregates them, producing two identical nuclei in preparation for cell division) (Bloom 2014) and provides the site of assembly for the kinetochore (large protein assemblies that connect chromosomes to microtubules of the mitotic and meiotic spindles in order to distribute the replicated genome from a mother cell to its daughters).
  • 🔗covalent bond - A covalent bond is a chemical bond that involves the sharing of electrons to form electron pairs between atoms. It is a stable, but reversible bond. Very useful in biology!
  • 🪼eukaryotes - any cell or organism that possesses a clearly defined nucleus. Eukaryotic cells have a nuclear membrane that surrounds the nucleus.
  • deamination - the removal of an amino group (a chemical group that contains a basic nitrogen atom with a lone pair of electrons) from an amino acid or other compound.
  • retrotransposons - evolutionarily widespread genetic elements that replicate through reverse transcription of an RNA copy and integrate the product DNA into new sites in the host genome.
  • 🪺zygote - fertilized egg cell that results from the union of a female gamete (egg, or ovum) with a male gamete (sperm).

References

Bloom, Kerry S. 2014. “Centromeric Heterochromatin: The Primordial Segregation Machine.” Annual Review of Genetics 48 (November): 457–84. https://doi.org/10.1146/annurev-genet-120213-092033.
Cedar, Howard, and Yehudit Bergman. 2012. “Programming of DNA methylation patterns.” Annual Review of Biochemistry 81: 97–117. https://doi.org/10.1146/annurev-biochem-052610-091920.
Chan, Simon W.-L., Ian R. Henderson, and Steven E. Jacobsen. 2005. “Gardening the Genome: DNA Methylation in Arabidopsis Thaliana.” Nature Reviews Genetics 6 (5): 351–60. https://doi.org/10.1038/nrg1601.
Chen, Lin, Kaifu Chen, Laura A. Lavery, Steven Andrew Baker, Chad A. Shaw, Wei Li, and Huda Y. Zoghbi. 2015. “MeCP2 binds to non-CG methylated DNA as neurons mature, influencing transcription and the timing of onset for Rett syndrome.” Proceedings of the National Academy of Sciences of the United States of America 112 (17): 5509–14. https://doi.org/10.1073/pnas.1505909112.
Deaton, Aimée M., and Adrian Bird. 2011. “CpG Islands and the Regulation of Transcription.” Genes & Development 25 (10): 1010–22. https://doi.org/10.1101/gad.2037511.
Gabel, Harrison W., Benyam Kinde, Hume Stroud, Caitlin S. Gilbert, David A. Harmin, Nathaniel R. Kastan, Martin Hemberg, Daniel H. Ebert, and Michael E. Greenberg. 2015. “Disruption of DNA-methylation-dependent long gene repression in Rett syndrome.” Nature 522 (7554): 89–93. https://doi.org/10.1038/nature14319.
Gruenbaum, Y., T. Naveh-Many, H. Cedar, and A. Razin. 1981. “Sequence specificity of methylation in higher plant DNA.” Nature 292 (5826): 860–62. https://doi.org/10.1038/292860a0.
Guo, Junjie U., Yijing Su, Joo Heon Shin, Jaehoon Shin, Hongda Li, Bin Xie, Chun Zhong, et al. 2014. “Distribution, recognition and regulation of non-CpG methylation in the adult mammalian brain.” Nature Neuroscience 17 (2): 215–22. https://doi.org/10.1038/nn.3607.
Hajkova, Petra, Katia Ancelin, Tanja Waldmann, Nicolas Lacoste, Ulrike C. Lange, Francesca Cesari, Caroline Lee, Genevieve Almouzni, Robert Schneider, and M. Azim Surani. 2008. “Chromatin dynamics during epigenetic reprogramming in the mouse germ line.” Nature 452 (7189): 877–81. https://doi.org/10.1038/nature06714.
Holliday, R., and J. E. Pugh. 1975. “DNA Modification Mechanisms and Gene Activity During Development.” Science 187 (4173): 226–32. https://doi.org/10.1126/science.187.4173.226.
Ichiyanagi, Tomoko, Kenji Ichiyanagi, Miho Miyake, and Hiroyuki Sasaki. 2013. “Accumulation and loss of asymmetric non-CpG methylation during male germ-cell development.” Nucleic Acids Research 41 (2): 738–45. https://doi.org/10.1093/nar/gks1117.
Jones, Peter A. 2012. “Functions of DNA Methylation: Islands, Start Sites, Gene Bodies and Beyond.” Nature Reviews Genetics 13 (7): 484–92. https://doi.org/10.1038/nrg3230.
Laurent, Louise, Eleanor Wong, Guoliang Li, Tien Huynh, Aristotelis Tsirigos, Chin Thing Ong, Hwee Meng Low, et al. 2010. “Dynamic changes in the human methylome during differentiation.” Genome Research 20 (3): 320–31. https://doi.org/10.1101/gr.101907.109.
Lindroth, A. M., X. Cao, J. P. Jackson, D. Zilberman, C. M. McCallum, S. Henikoff, and S. E. Jacobsen. 2001. “Requirement of CHROMOMETHYLASE3 for maintenance of CpXpG methylation.” Science (New York, N.Y.) 292 (5524): 2077–80. https://doi.org/10.1126/science.1059745.
Lister, Ryan, Eran A. Mukamel, Joseph R. Nery, Mark Urich, Clare A. Puddifoot, Nicholas D. Johnson, Jacinta Lucero, et al. 2013. “Global epigenomic reconfiguration during mammalian brain development.” Science (New York, N.Y.) 341 (6146): 1237905. https://doi.org/10.1126/science.1237905.
Lister, Ryan, Mattia Pelizzola, Robert H. Dowen, R. David Hawkins, Gary Hon, Julian Tonti-Filippini, Joseph R. Nery, et al. 2009. “Human DNA methylomes at base resolution show widespread epigenomic differences.” Nature 462 (7271): 315–22. https://doi.org/10.1038/nature08514.
Lister, Ryan, Mattia Pelizzola, Yasuyuki S. Kida, R. David Hawkins, Joseph R. Nery, Gary Hon, Jessica Antosiewicz-Bourget, et al. 2011. “Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells.” Nature 471 (7336): 68–73. https://doi.org/10.1038/nature09798.
Lyko, Frank. 2018. “The DNA Methyltransferase Family: A Versatile Toolkit for Epigenetic Regulation.” Nature Reviews Genetics 19 (2): 81–92. https://doi.org/10.1038/nrg.2017.80.
Ramsahoye, B. H., D. Biniszkiewicz, F. Lyko, V. Clark, A. P. Bird, and R. Jaenisch. 2000. “Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a.” Proceedings of the National Academy of Sciences of the United States of America 97 (10): 5237–42. https://doi.org/10.1073/pnas.97.10.5237.
Rasmussen, Kasper Dindler, and Kristian Helin. 2016. “Role of TET enzymes in DNA methylation, development, and cancer.” Genes & Development 30 (7): 733–50. https://doi.org/10.1101/gad.276568.115.
Riggs, A.D. 2008. “X Inactivation, Differentiation, and DNA Methylation.” Cytogenetics and Cell Genetics 14 (1): 9–25. https://doi.org/10.1159/000130315.
Smith, Zachary D., Michelle M. Chan, Tarjei S. Mikkelsen, Hongcang Gu, Andreas Gnirke, Aviv Regev, and Alexander Meissner. 2012. “A unique regulatory phase of DNA methylation in the early mammalian embryo.” Nature 484 (7394): 339–44. https://doi.org/10.1038/nature10960.
Smith, Zachary D., and Alexander Meissner. 2013. “DNA Methylation: Roles in Mammalian Development.” Nature Reviews Genetics 14 (3): 204–20. https://doi.org/10.1038/nrg3354.
Stadler, Michael B., Rabih Murr, Lukas Burger, Robert Ivanek, Florian Lienert, Anne Schöler, Erik van Nimwegen, et al. 2011. “DNA-binding factors shape the mouse methylome at distal regulatory regions.” Nature 480 (7378): 490–95. https://doi.org/10.1038/nature10716.
Sun, Wenjia, Liqun Zang, Qiang Shu, and Xuekun Li. 2014. “From Development to Diseases: The Role of 5hmC in Brain.” Genomics, 5-hydroxymethylation, 104 (5): 347–51. https://doi.org/10.1016/j.ygeno.2014.08.021.
Surani, M. Azim, Katsuhiko Hayashi, and Petra Hajkova. 2007. “Genetic and Epigenetic Regulators of Pluripotency.” Cell 128 (4): 747–62. https://doi.org/10.1016/j.cell.2007.02.010.
Tomizawa, Shin-ichi, Hisato Kobayashi, Toshiaki Watanabe, Simon Andrews, Kenichiro Hata, Gavin Kelsey, and Hiroyuki Sasaki. 2011. “Dynamic stage-specific changes in imprinted differentially methylated regions during early mammalian development and prevalence of non-CpG methylation in oocytes.” Development (Cambridge, England) 138 (5): 811–20. https://doi.org/10.1242/dev.061416.
Varley, Katherine E., Jason Gertz, Kevin M. Bowling, Stephanie L. Parker, Timothy E. Reddy, Florencia Pauli-Behn, Marie K. Cross, et al. 2013. “Dynamic DNA methylation across diverse human cell lines and tissues.” Genome Research 23 (3): 555–67. https://doi.org/10.1101/gr.147942.112.
Xie, Wei, Cathy L. Barr, Audrey Kim, Feng Yue, Ah Young Lee, James Eubanks, Emma L. Dempster, and Bing Ren. 2012. “Base-resolution analyses of sequence and parent-of-origin dependent DNA methylation in the mouse genome.” Cell 148 (4): 816–31. https://doi.org/10.1016/j.cell.2011.12.035.
Ziller, Michael J., Fabian Müller, Jing Liao, Yingying Zhang, Hongcang Gu, Christoph Bock, Patrick Boyle, et al. 2011. “Genomic distribution and inter-sample variation of non-CpG methylation across human cell types.” PLoS genetics 7 (12): e1002389. https://doi.org/10.1371/journal.pgen.1002389.