DNA, Genes & Protein Synthesis

DNA — The Blueprint of Life

Almost every cell in your body contains a complete copy of your DNA (mature red blood cells, which lose their nucleus, are the main exception). One set of human chromosomes is about 3 billion base pairs long, and it encodes the instructions needed to build and run a human being.

DNA structure (a quick reminder):

DNA is a double helix: two strands wound around each other like a twisted ladder. The "rungs" of the ladder are pairs of bases:
- Adenine (A) always pairs with Thymine (T)
- Guanine (G) always pairs with Cytosine (C)

This complementary base pairing is the key to everything: it is how DNA copies itself and how it is read.

From DNA to protein (the central dogma):

The flow of information in biology follows one fundamental rule, called the Central Dogma of Molecular Biology: DNA → RNA → Protein
- DNA stores the information (the master blueprint, kept safe in the nucleus)
- RNA carries a working copy of the instructions out of the nucleus
- Ribosomes read the RNA instructions and assemble the protein

Genes and non-coding DNA:

Only about 2% of your DNA codes for proteins, spread across roughly 20,000 genes. The other 98% was once dismissed as "junk DNA" but is now known to include regulatory sequences (controlling when genes are switched on), RNA-coding sequences, and structural elements. Understanding this non-coding DNA is one of the most exciting frontiers in medicine.
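
The base-pairing rule described above is mechanical enough to write down as a few lines of Python. The sketch below is only an illustration: the function names are made up for this example, and the reverse-complement step relies on the fact (not covered above) that the two strands of a helix run in opposite directions.

```python
# Pairing rules from the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own 5'->3' direction."""
    return complement(strand)[::-1]


print(complement("ATGCGT"))          # TACGCA
print(reverse_complement("ATGCGT"))  # ACGCAT
```

Because each base has exactly one partner, taking the complement twice gives back the original strand. This is why either strand alone carries enough information to rebuild the other.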

DNA Replication — Copying the Blueprint

Before a cell divides (during S phase of the cell cycle), it must copy its DNA precisely, all 3 billion base pairs of each genome copy. This process is called DNA replication.

How replication works (the key steps):

1. Unwinding: An enzyme called helicase breaks the hydrogen bonds between the base pairs and unzips the double helix, separating the two strands. The point where the helix opens is called the replication fork.
2. Priming: An enzyme called primase lays down a short RNA primer, which gives the new strand a starting point.
3. Building the new strand: The workhorse enzyme DNA polymerase reads along the original strand and adds new complementary bases (A pairs with T, G pairs with C). DNA polymerase can only add bases in one direction (5' to 3'), which means one new strand is made continuously (the leading strand) and the other is made in short pieces called Okazaki fragments (the lagging strand).
4. Proofreading: DNA polymerase checks its own work as it goes. If a wrong base is added, it removes it and replaces it, which reduces errors dramatically.
5. Sealing: An enzyme called DNA ligase joins the Okazaki fragments together into a continuous strand.

The result: two identical double helices, each containing one original strand and one new strand. This is called semi-conservative replication (each copy is "semi" new, "semi" old). Meselson and Stahl demonstrated it in 1958 using heavy nitrogen isotopes, in what is often described as one of the most elegant experiments in biology.

Error rate: Despite proofreading and further repair, about 1 error per billion bases slips through per replication. With 3 billion base pairs to copy, that works out to a few new mutations per cell division (1 in a billion × 3 billion ≈ 3). Most are harmless, but over a lifetime they accumulate.
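
Two ideas from this section can be checked with a short Python sketch: that each daughter helix keeps one parental strand, and that the quoted error rate gives a handful of mutations per genome copy. This is a toy model. The tuple representation, the "old"/"new" labels, and the function names are assumptions made for illustration, and priming, Okazaki fragments and ligation are deliberately left out.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def build_partner(template: str) -> str:
    """Stand-in for DNA polymerase: pair each template base with its complement."""
    return "".join(PAIRS[base] for base in template)


def replicate(helix):
    """Semi-conservative copy of a helix given as two (sequence, age) strands.

    Each daughter keeps one parental strand and gains one newly built strand.
    """
    (seq_a, _), (seq_b, _) = helix
    daughter_1 = ((seq_a, "old"), (build_partner(seq_a), "new"))
    daughter_2 = ((build_partner(seq_b), "new"), (seq_b, "old"))
    return daughter_1, daughter_2


def expected_errors(genome_bp: int, errors_per_billion: int = 1) -> float:
    """Expected uncorrected errors when copying genome_bp base pairs once."""
    return genome_bp * errors_per_billion / 1_000_000_000


parent = (("ATGC", "old"), ("TACG", "old"))
for daughter in replicate(parent):
    print(daughter)

print(expected_errors(3_000_000_000))  # 3.0
```

Both daughters carry the same sequence as the parent, and each contains exactly one "old" and one "new" strand, which is the pattern Meselson and Stahl observed.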

Transcription — Reading the Gene

Transcription is the first step of using a gene: making a working copy of the DNA instructions in the form of messenger RNA (mRNA).

RNA is similar to DNA but:
- Single-stranded (not a double helix)
- Uses Uracil (U) instead of Thymine (T)
- Uses ribose sugar instead of deoxyribose

How transcription works:

1. Initiation: A protein called a transcription factor recognises the promoter, a specific DNA sequence just before the gene that acts as a "start here" signal. It recruits the enzyme RNA polymerase to the start of the gene.
2. Elongation: RNA polymerase unwinds the DNA and reads along one strand (the template strand), assembling a complementary mRNA strand from A, U, G and C bases.
3. Termination: RNA polymerase reaches a terminator sequence and detaches, releasing the mRNA.
4. RNA processing (in eukaryotes): The raw mRNA transcript (pre-mRNA) is processed before leaving the nucleus:
   - A 5' cap and poly-A tail are added (protecting the mRNA from degradation)
   - Introns (non-coding sections within the gene) are removed
   - Exons (coding sections) are joined together; this is called splicing
   - The mature mRNA exits through nuclear pores to the cytoplasm

Why this matters clinically: Many antibiotics work by targeting bacterial transcription or translation (bacteria have slightly different enzymes from eukaryotes, allowing selective targeting). Rifampicin, used to treat tuberculosis, specifically blocks bacterial RNA polymerase.
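
Elongation and splicing, as described in the steps above, can be mimicked on a short made-up sequence. The Python sketch below is a simplification: the template sequence and the exon coordinates are invented for the example, the function names are hypothetical, and the 5' cap and poly-A tail are not modelled.

```python
# Template DNA base -> RNA base added opposite it (U replaces T in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Build mRNA (5'->3') complementary to a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def splice(pre_mrna: str, exons) -> str:
    """Join the exon slices, given as (start, end) pairs, and drop the introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)


template = "TACCGGCAGTTCATT"         # invented template strand
pre_mrna = transcribe(template)
print(pre_mrna)                      # AUGGCCGUCAAGUAA

# Hypothetical exon positions: bases 0-5 and 12-14; the middle is the intron.
mature = splice(pre_mrna, [(0, 6), (12, 15)])
print(mature)                        # AUGGCCUAA
```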

Translation — Building the Protein

Translation is the second step: ribosomes read the mRNA and assemble a chain of amino acids to make a protein.

The genetic code:

mRNA is read in groups of three bases called codons. Each codon specifies one amino acid (or a stop signal). There are 64 possible codons (4³ = 64) but only 20 amino acids, so most amino acids are coded by more than one codon (the code is degenerate, or redundant). This redundancy provides some protection against mutations: if a mutation changes the third base of a codon, it often still codes for the same amino acid.

Key codons:
- AUG: the start codon (codes for methionine, so every newly made polypeptide starts with methionine, although it is often removed later)
- UAA, UAG, UGA: stop codons (no amino acid; they signal the end of the protein)

How translation works:

1. Initiation: The ribosome (made of two subunits, small and large) assembles around the start codon (AUG) of the mRNA.
2. Elongation: Transfer RNA (tRNA) molecules bring amino acids to the ribosome. Each tRNA has an anticodon, three bases that are complementary to the mRNA codon. When a tRNA anticodon matches the mRNA codon, the ribosome joins that amino acid to the growing chain. The ribosome moves along the mRNA, adding one amino acid at a time.
3. Termination: When a stop codon is reached, no tRNA matches it. A release factor protein triggers the ribosome to release the completed polypeptide chain.
4. Post-translational modification: The new protein is often further processed: folded (often with help from chaperone proteins), modified (phosphorylation, glycosylation), or cut to its active form. Many proteins must travel to the ER and Golgi apparatus for final processing.

From gene to function (a concrete example):

The insulin gene in beta cells of the pancreas is transcribed into mRNA → the mRNA is translated into preproinsulin (a precursor protein) → preproinsulin is processed in the ER and Golgi → cleavage yields active insulin → insulin is secreted into the bloodstream. Expression of every protein-coding gene follows this same fundamental pathway.
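
Reading codons three bases at a time is easy to mirror in code. The Python sketch below builds the standard genetic code as a lookup table and translates a short invented mRNA. The one-letter amino acid symbols, the compact table string and the function name are choices made for this example, and the ribosome, tRNAs and release factors are reduced to a simple loop.

```python
from itertools import product

# Standard genetic code, one letter per amino acid, "*" for stop.
# Codons are ordered UUU, UUC, UUA, UUG, UCU, ... with the last base changing fastest.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is made
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: release the chain
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGGCCUGGUAACC"))  # MAW

# Degeneracy: four codons differing only in the third base all give alanine (A).
print(sorted(c for c, aa in CODON_TABLE.items() if aa == "A"))
```

In the example, translation starts at the AUG (M, methionine), reads GCC (A, alanine) and UGG (W, tryptophan), and stops at UAA.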

Gene Regulation — Switching Genes On and Off

Almost all of your cells contain essentially the same DNA. Yet a liver cell looks and functions completely differently from a neuron or a muscle cell. How? Gene regulation: different cells read different parts of the genome.

Transcription factors:

Transcription factors are proteins that bind to specific DNA sequences near genes (promoters and enhancers) and either switch genes on (activators) or off (repressors). Different cell types contain different transcription factors, which is what drives different patterns of gene expression.

Epigenetics (above the DNA):

"Epigenetics" literally means "above genetics." Epigenetic changes modify how DNA is read without changing the DNA sequence itself. Two key mechanisms:
- DNA methylation: adding a methyl group (–CH₃) to a cytosine base. Methylated genes are typically silenced (switched off). This is how cells "remember" which genes they use, because the pattern of methylation is copied when cells divide.
- Histone modification: DNA is wound around proteins called histones. Chemical modifications to histones can loosen the DNA (making genes accessible and active) or tighten it (making genes inaccessible and silent). Histone acetylation loosens → gene on. Histone deacetylation tightens → gene off.

Why epigenetics matters in medicine:
- Cancer cells often have abnormal methylation patterns that silence tumour suppressor genes
- Epigenetic drugs (HDAC inhibitors, DNA methyltransferase inhibitors) are being used or trialled in certain cancers
- Early life experiences, diet, and environment can alter epigenetic marks; this is one mechanism by which childhood adversity can have long-term health effects
- Identical twins start with identical DNA but diverge epigenetically over time, which helps explain why they can develop different diseases despite identical genomes
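
As a rough way to see how these layers combine, the on/off rules above can be written as a small truth-table model in Python. This is a deliberate oversimplification and not how any real gene is modelled: the class, its field names and the all-or-nothing logic are assumptions made for this sketch, whereas real regulation is graded and involves many factors at once.

```python
from dataclasses import dataclass


@dataclass
class GeneState:
    """Hypothetical on/off snapshot of the regulatory inputs for one gene."""
    activator_bound: bool
    repressor_bound: bool
    promoter_methylated: bool
    histones_acetylated: bool


def is_expressed(gene: GeneState) -> bool:
    """Toy rule: the DNA must be accessible AND the gene must be switched on."""
    # Acetylated histones loosen the DNA; methylation typically silences the gene.
    accessible = gene.histones_acetylated and not gene.promoter_methylated
    # An activator switches the gene on unless a repressor is also bound.
    switched_on = gene.activator_bound and not gene.repressor_bound
    return accessible and switched_on


active = GeneState(True, False, False, True)
silenced = GeneState(True, False, True, True)  # same factors, but methylated
print(is_expressed(active))    # True
print(is_expressed(silenced))  # False
```

The second example has the same transcription factors as the first but a methylated promoter, which is the situation described above for silenced tumour suppressor genes.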

🔑 Key Terms
Central Dogma
The fundamental principle: DNA → RNA → Protein. DNA is transcribed into mRNA, which is translated into protein. Information flows in this direction (with some exceptions like retroviruses).
DNA replication
The process of copying DNA before cell division. Helicase unzips the helix; DNA polymerase builds new complementary strands. Semi-conservative — each new helix has one old strand and one new strand.
DNA polymerase
The enzyme that builds new DNA strands during replication by adding complementary bases to a template strand. Also proofreads for errors.
Transcription
The process of making an mRNA copy of a gene. RNA polymerase reads the DNA template strand and assembles complementary mRNA (using U instead of T).
Translation
The process of making a protein from mRNA instructions at the ribosome. tRNA molecules bring amino acids; codons on mRNA are matched by anticodons on tRNA.
Codon
A three-base sequence on mRNA that codes for one amino acid (or a start/stop signal). AUG = start (methionine). UAA/UAG/UGA = stop.
Epigenetics
Changes in gene expression that do not involve changes to the DNA sequence. Mechanisms include DNA methylation (silencing genes) and histone modification (controlling DNA accessibility).
Splicing
The removal of introns (non-coding sequences) from pre-mRNA and the joining of exons (coding sequences) to produce mature mRNA. Alternative splicing, in which different combinations of exons are kept, allows one gene to produce multiple protein variants.