DNA, Genes & Protein Synthesis
DNA: The Blueprint of Life
Almost every cell in your body, from a skin cell to a neuron, contains the same complete set of instructions for building and running the entire human body. These instructions are stored in DNA (deoxyribonucleic acid), a molecule so long that the DNA from a single human cell would measure about 2 metres if stretched out. Yet it is coiled so tightly that it fits inside a nucleus just 6 micrometres across.

The structure of DNA: DNA is a double helix, two strands wound around each other like a twisted ladder. The "rails" of the ladder are made of alternating sugar (deoxyribose) and phosphate groups. The "rungs" are pairs of nitrogenous bases, the letters of the genetic code. There are four bases:
- Adenine (A)
- Thymine (T)
- Guanine (G)
- Cytosine (C)

Base pairing rules: A always pairs with T (held by 2 hydrogen bonds), and G always pairs with C (held by 3 hydrogen bonds). This specific pairing, called complementary base pairing, is fundamental to everything DNA does: replication, transcription, and repair.

The sequence matters enormously: The order of bases along the DNA strand encodes all genetic information. The human genome contains approximately 3 billion base pairs. Changing just one base (a point mutation) can completely alter or destroy a protein's function, as in sickle cell disease, where a single base change causes glutamate to be replaced by valine in haemoglobin.
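The base pairing rules above can be sketched in a few lines of Python. This is only an illustration: the function names and the six-base example sequence are made up for this sketch, and strand direction is ignored.

```python
# Pairing rules from the text: A-T (2 hydrogen bonds), G-C (3 hydrogen bonds).
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand: str) -> str:
    """Return the partner base for each position, in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand)


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding this strand to its partner."""
    return sum(H_BONDS[base] for base in strand)


example = "ATGCGT"  # hypothetical sequence, chosen only for illustration
print(complement(example))      # TACGCA
print(hydrogen_bonds(example))  # 15
```

Because each base has exactly one partner, either strand is enough to reconstruct the other, which is what replication and repair rely on.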
Genes and the Genome
A gene is a specific sequence of DNA that encodes the instructions for making one protein (or a functional RNA molecule). Humans have approximately 20,000–25,000 protein-coding genes, surprisingly few given the complexity of the body. However, only about 1.5% of the human genome actually codes for proteins. The remaining 98.5% was once dismissively called "junk DNA" but is now known to contain:
- Regulatory sequences: promoters, enhancers, and silencers that control when and where genes are expressed
- Introns: non-coding sequences within genes that are removed after transcription
- Repetitive sequences: tandem repeats used in DNA fingerprinting
- Non-coding RNAs: functional RNA molecules (microRNAs, lncRNAs) that regulate gene expression

Chromosomes and organisation: DNA is packaged into chromosomes. Humans have 46 chromosomes (23 pairs) in nearly every nucleated cell (egg and sperm cells carry 23). Each chromosome contains hundreds to thousands of genes. The DNA is wound around proteins called histones to form chromatin. This packaging also regulates gene expression: tightly wound DNA is generally switched off, while loosely wound DNA is accessible and can be switched on.

Clinical connection (chromosomal abnormalities):
- Down syndrome (trisomy 21): an extra copy of chromosome 21, causing intellectual disability and characteristic features
- Turner syndrome (45,X): only one X chromosome in females, causing short stature and ovarian failure
- Philadelphia chromosome: a translocation between chromosomes 9 and 22, creating the BCR-ABL oncogene that drives chronic myeloid leukaemia (CML). The drug imatinib (Gleevec) specifically inhibits BCR-ABL and was one of the first targeted cancer therapies
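To put the 1.5% figure in perspective, here is a back-of-envelope calculation using only the round numbers quoted in this lesson. The per-gene averages are a rough illustration rather than measured values, since real genes vary enormously in size, and the function name is made up for this sketch.

```python
GENOME_BP = 3_000_000_000  # approximate genome size quoted in the text
CODING_PER_MILLE = 15      # 1.5% written as 15 per 1,000 to keep the maths in integers


def coding_base_pairs(genome_bp: int, per_mille: int) -> int:
    """Number of base pairs that fall in protein-coding sequence."""
    return genome_bp * per_mille // 1000


coding_bp = coding_base_pairs(GENOME_BP, CODING_PER_MILLE)
print(f"{coding_bp:,}")              # 45,000,000
print(f"{GENOME_BP - coding_bp:,}")  # 2,955,000,000

# Rough average coding length per gene at each end of the quoted gene-count range.
for gene_count in (20_000, 25_000):
    print(gene_count, coding_bp // gene_count)  # 20000 2250, then 25000 1800
```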
DNA Replication: Copying the Code
Every time a cell divides, it must first make an exact copy of its entire DNA, all 3 billion base pairs, so that each daughter cell gets a complete genome. This process is called DNA replication and it is remarkably accurate: roughly one error per billion base pairs copied.

How replication works:
1. Unwinding: The enzyme helicase unwinds the double helix, breaking the hydrogen bonds between base pairs and creating a replication fork
2. Priming: Short RNA sequences called primers are laid down by primase to provide a starting point for the new strand
3. Elongation: DNA polymerase reads the template strand (3' to 5') and builds a new complementary strand (5' to 3'), adding bases according to the pairing rules (A→T, T→A, G→C, C→G)
4. Proofreading: DNA polymerase has a built-in error-checking function. It proofreads each added base and removes mistakes immediately
5. Sealing: One of the new strands is built in short fragments. Once the RNA primers have been replaced with DNA, DNA ligase joins the fragments together, completing the new strand

Semi-conservative replication: Each new DNA molecule consists of one original (parental) strand and one newly synthesised strand. This is called semi-conservative replication because half of the original molecule is conserved in each daughter molecule.

Clinical relevance (why replication errors cause cancer): Despite proofreading, errors occasionally slip through. Cells have additional repair mechanisms (base excision repair, nucleotide excision repair, mismatch repair). When these repair systems are themselves mutated, error rates soar. Lynch syndrome results from defective mismatch repair genes and carries a very high risk of bowel and uterine cancer. BRCA1 and BRCA2 mutations impair DNA double-strand break repair, dramatically increasing breast and ovarian cancer risk.
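Semi-conservative replication, as described above, can be modelled with a toy data structure. This sketch tags each strand with the round of replication that made it; the class names, function names, and eight-base sequence are all hypothetical, and strand direction, primers, and proofreading are left out.

```python
from typing import NamedTuple, Tuple

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


class Strand(NamedTuple):
    bases: str
    generation: int  # 0 = original parental strand, 1 = made in the first replication


class Duplex(NamedTuple):
    first: Strand
    second: Strand  # written base-by-base opposite `first` (direction ignored)


def synthesise(template: Strand, generation: int) -> Strand:
    """Build a new strand by pairing each template base with its partner."""
    return Strand("".join(PAIR[b] for b in template.bases), generation)


def replicate(duplex: Duplex, generation: int) -> Tuple[Duplex, Duplex]:
    """Separate the two strands and give each old strand a newly made partner."""
    return (
        Duplex(duplex.first, synthesise(duplex.first, generation)),
        Duplex(synthesise(duplex.second, generation), duplex.second),
    )


parent = Duplex(Strand("ATGCCGTA", 0), Strand("TACGGCAT", 0))
daughters = replicate(parent, generation=1)
for d in daughters:
    # Each daughter has the parent's sequence, with one old (0) and one new (1) strand.
    print(d.first.bases, d.second.bases, sorted([d.first.generation, d.second.generation]))
```

Both daughter molecules carry the same sequence as the parent, and each contains exactly one generation-0 strand.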
Transcription: DNA → RNA
DNA never leaves the nucleus: it is too valuable and too fragile to be used directly as a template for protein production. Instead, the cell makes a working copy of a gene in the form of messenger RNA (mRNA). This process is called transcription.

How transcription works:
1. Initiation: A protein called a transcription factor binds to the promoter, a specific DNA sequence just upstream of the gene. This recruits RNA polymerase to the start of the gene
2. Elongation: RNA polymerase unwinds the DNA and reads one strand (the template strand), synthesising a complementary RNA strand. RNA uses uracil (U) instead of thymine, so where DNA has A, RNA has U
3. Termination: RNA polymerase reaches a termination sequence and releases the new mRNA
4. mRNA processing (in eukaryotes): The initial transcript (pre-mRNA) is processed before leaving the nucleus:
   - 5' cap added: protects mRNA and aids ribosome binding
   - Poly-A tail added at 3' end: protects from degradation and helps export
   - Splicing: Introns (non-coding sequences) are cut out; exons (coding sequences) are joined together

Gene regulation: Not every gene is active in every cell. A liver cell and a neuron contain identical DNA, yet behave completely differently because different sets of genes are expressed. Transcription factors, epigenetic marks (methylation and histone modifications), and non-coding RNAs all control which genes are turned on or off in which cells at which times.

Clinical connections:
- Many cancer-causing mutations affect transcription factors or promoter regions, causing genes to be expressed at the wrong time, in the wrong cells, or at the wrong levels
- Gene expression profiling, which measures which mRNAs are present in a tumour, is used to classify cancers and predict prognosis
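The elongation and splicing steps described in this section can be sketched as two small functions. The template sequence, the exon coordinates, and the function names are all hypothetical, and the 5' cap and poly-A tail are not modelled.

```python
from typing import List, Tuple

# Template DNA base -> RNA base added opposite it (RNA uses U where DNA would use T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Read a template strand written 3'->5' and return the pre-mRNA written 5'->3'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def splice(pre_mrna: str, exons: List[Tuple[int, int]]) -> str:
    """Keep only the exon slices (start, end; end exclusive) and join them in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)


template = "TACAAACAGTCCCCATT"  # hypothetical template strand, written 3'->5'
pre_mrna = transcribe(template)
print(pre_mrna)  # AUGUUUGUCAGGGGUAA

# Hypothetical exon positions: bases 0-5 and 11-16 are exons, 6-10 form an intron.
mature = splice(pre_mrna, [(0, 6), (11, 17)])
print(mature)    # AUGUUUGGGUAA
```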
Translation: RNA → Protein
The mRNA exits the nucleus and travels to ribosomes in the cytoplasm (free or attached to the endoplasmic reticulum), where its sequence is decoded to build a protein. This is called translation.

The genetic code: The mRNA sequence is read in groups of three bases called codons. Each codon specifies one amino acid (or a start/stop signal). Since there are 4 bases, there are 4³ = 64 possible codons, enough to encode all 20 amino acids with room for redundancy (degeneracy). This redundancy means that many amino acids are encoded by multiple codons, so a single base change may still code for the same amino acid (a "silent" or synonymous mutation) and have no effect on the protein.

How translation works:
1. The ribosome binds to the mRNA at the start codon (AUG, which also codes for methionine, so newly made protein chains start with methionine)
2. Transfer RNA (tRNA) molecules act as adaptors: each tRNA carries a specific amino acid and has an anticodon that recognises the complementary mRNA codon
3. The ribosome has three sites (A, P, E): an incoming tRNA arrives at the A site, the growing chain held at the P site is transferred onto its amino acid, and the used tRNA exits from the E site
4. The ribosome moves along the mRNA codon by codon until it reaches a stop codon (UAA, UAG, or UGA)
5. The completed protein chain is released and folds into its functional 3D shape

Mutations and their consequences:
- Silent mutation: base change but same amino acid → no effect
- Missense mutation: base change causing wrong amino acid → protein may be non-functional (e.g. sickle cell, Glu→Val)
- Nonsense mutation: base change creating a premature stop codon → truncated, usually non-functional protein
- Frameshift mutation: insertion or deletion of a base, shifting the reading frame for all codons downstream → completely garbled protein (usually non-functional)

Antibiotics target translation: Many antibiotics kill bacteria or halt their growth by blocking their ribosomes. Since bacterial ribosomes (70S) are structurally different from human ribosomes (80S), these drugs act selectively on bacteria while largely sparing human cells. Examples: tetracyclines (block A site), aminoglycosides (cause misreading), macrolides (block elongation).
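The codon-by-codon reading described in this section, and the four mutation types, can be sketched as follows. The codon table is the standard genetic code in one-letter amino acid symbols (`*` marks a stop codon). The mRNA sequences and function names are hypothetical, and the classifier is deliberately simplified: it only compares sequence lengths and the resulting proteins.

```python
from itertools import product

# Standard genetic code: codons in UCAG order mapped to one-letter amino acids.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify_mutation(original: str, mutant: str) -> str:
    """Simplified classifier for the mutation types listed in the text."""
    if len(original) != len(mutant):
        shifted = (len(original) - len(mutant)) % 3 != 0
        return "frameshift" if shifted else "in-frame insertion/deletion"
    before, after = translate(original), translate(mutant)
    if before == after:
        return "silent"
    if len(after) < len(before):
        return "nonsense"
    return "missense"


normal = "AUGCCUGAGGAGAAGUAA"  # hypothetical mRNA: Met-Pro-Glu-Glu-Lys-stop
print(translate(normal))                                # MPEEK
print(classify_mutation(normal, "AUGCCUGAAGAGAAGUAA"))  # silent   (GAG -> GAA, still Glu)
print(classify_mutation(normal, "AUGCCUGUGGAGAAGUAA"))  # missense (GAG -> GUG, Glu -> Val)
print(classify_mutation(normal, "AUGCCUUAGGAGAAGUAA"))  # nonsense (GAG -> UAG, stop)
print(classify_mutation(normal, "AUGCUGAGGAGAAGUAA"))   # frameshift (one base deleted)
```

The missense example mirrors the Glu→Val change mentioned for sickle cell disease, but the surrounding sequence is invented for illustration.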
Mutations, Inheritance and Genetic Disease
Mutations are permanent changes in the DNA sequence. They are the raw material of evolution: occasionally beneficial, often neutral, sometimes harmful.

Sources of mutations:
- Replication errors that escape proofreading
- Chemical mutagens: tobacco smoke, alcohol metabolites (acetaldehyde)
- Radiation: UV light (causes thymine dimers → skin cancer), ionising radiation (X-rays and gamma rays, which cause double-strand breaks)
- Viruses: some viruses insert their own DNA into the host genome, disrupting genes

Types of inheritance:
- Autosomal dominant: one mutated copy is enough to cause disease (e.g. Huntington's disease, Marfan syndrome, familial hypercholesterolaemia). An affected parent has a 50% chance of passing it to each child
- Autosomal recessive: two mutated copies needed (one from each parent). Carriers are unaffected. Examples: cystic fibrosis, sickle cell disease, phenylketonuria (PKU). Each child has a 25% chance of being affected if both parents are carriers
- X-linked recessive: gene on the X chromosome. Males (XY) are affected if they inherit one mutated copy (no second X to compensate). Females (XX) are usually carriers. Examples: haemophilia A and B, Duchenne muscular dystrophy, colour blindness

The BRCA genes: BRCA1 and BRCA2 are tumour suppressor genes that help repair damaged DNA. Mutations in these genes are inherited in an autosomal dominant pattern. Carrying one mutated copy gives a lifetime risk often quoted as 70–80% for breast cancer and 40–60% for ovarian cancer, although the exact figures depend on which gene is affected (ovarian risk is lower with BRCA2 than with BRCA1). Testing is offered to women with strong family histories.

Pharmacogenomics (DNA guides drug treatment): Genetic variants affect how individuals metabolise drugs. CYP2D6 is a liver enzyme that metabolises many drugs (codeine, tamoxifen, antidepressants). The consequences depend on the drug. For drugs that CYP2D6 inactivates, such as many antidepressants, "poor metabolisers" (those with two non-functional CYP2D6 copies) have exaggerated drug effects, while "ultra-rapid metabolisers" may need higher doses. For prodrugs that CYP2D6 converts into their active form, such as codeine (converted to morphine), the pattern is reversed: poor metabolisers get little pain relief, and ultra-rapid metabolisers risk morphine toxicity. Genetic testing before prescribing is increasingly used to personalise treatment.
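The 50% and 25% figures quoted for autosomal inheritance follow from a Punnett square, which can be sketched as below. The allele letters and the function name are arbitrary choices for this illustration, and X-linked inheritance is not modelled.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from typing import Dict


def punnett(parent1: str, parent2: str) -> Dict[str, Fraction]:
    """Offspring genotype probabilities for two 2-letter genotypes such as 'Aa'."""
    # Each child gets one allele from each parent; sorting makes 'aA' and 'Aa' the same key.
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Autosomal recessive: 'a' is the mutated allele and both parents are carriers (Aa).
recessive = punnett("Aa", "Aa")
print(recessive["aa"])  # 1/4 of children affected
print(recessive["Aa"])  # 1/2 of children are carriers

# Autosomal dominant: 'D' is the mutated allele; one affected parent (Dd), one unaffected (dd).
dominant = punnett("Dd", "dd")
affected = sum(p for genotype, p in dominant.items() if "D" in genotype)
print(affected)         # 1/2 of children affected
```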