Chapter 3 Molecular Evolution

3.1 Why Molecular Evolution Matters

One of the central questions in origin-of-life research is how simple chemistry became capable of evolution.

Modern biological evolution depends on three fundamental ingredients:

  • Information
  • Replication
  • Variation

Today these functions are carried out primarily by DNA, RNA, and proteins. However, before modern cells existed, simpler molecular systems may have possessed primitive versions of these capabilities.

The study of molecular evolution asks how populations of molecules change through time when information is copied imperfectly and some variants leave more copies than others.

Understanding molecular evolution helps bridge the gap between prebiotic chemistry and the emergence of life.

3.2 From Prebiotic Chemistry to Evolution

The previous chapter introduced prebiotic molecular pools. These pools contain diverse molecules that may have formed through natural chemical processes on the early Earth.

Diversity alone, however, is not sufficient for evolution.

For evolution to occur:

  1. Molecules must differ from one another.
  2. Some molecules must persist or reproduce more successfully than others.
  3. Information must be transmitted between generations.
  4. New variants must appear.
  5. Successful variants must become more common through time.

When these conditions are satisfied, populations can evolve.

This transition from chemistry to evolution represents one of the most important steps in the origin of life.

3.3 The RNA World Hypothesis

One of the most influential ideas in origin-of-life research is the RNA World hypothesis.

The central idea is that early life may have been based primarily on RNA-like molecules.

RNA is particularly interesting because it can perform two critical functions:

  • Store information
  • Catalyze chemical reactions

Modern life separates these functions among DNA, RNA, and proteins. The RNA World hypothesis proposes that earlier biological systems may have relied on a single class of molecules capable of doing both.

If this hypothesis is correct, evolution may have begun before cells, DNA genomes, and proteins existed.

Although the RNA World remains one of the leading hypotheses, researchers continue to investigate alternative and complementary explanations involving metabolism, compartments, and autocatalytic networks.

3.4 Information and Heredity

Evolution requires information.

In biological systems, information refers to patterns that can be copied and transmitted.

For example:

AUGCAUGCAUGC

contains information because the arrangement of symbols matters.

When sequences are copied, information is inherited. When copying errors occur, variation is introduced.

The interaction between heredity and variation forms the foundation of evolution.

Without heredity, successful innovations would be lost. Without variation, populations could not adapt or explore new possibilities.
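
The claim that the arrangement of symbols matters can be checked directly. The short base R sketch below compares the example sequence with a second, made-up sequence that has the same composition in a different order, and counts how many distinct sequences of that length exist. The helper name base_counts is hypothetical and is not part of lifesimulatoR.

```r
# Two sequences with identical composition but a different order.
seq_a <- "AUGCAUGCAUGC"
seq_b <- "AAAUUUGGGCCC"  # made-up comparison sequence

# Hypothetical helper: count how often each symbol occurs in a sequence.
base_counts <- function(sequence) {
  symbols <- strsplit(sequence, "")[[1]]
  table(factor(symbols, levels = c("A", "U", "G", "C")))
}

# Same symbol counts, yet the strings are not the same molecule.
same_composition <- all(base_counts(seq_a) == base_counts(seq_b))
same_sequence <- identical(seq_a, seq_b)

# Number of distinct sequences of this length over a four-letter alphabet.
n_possible <- 4^nchar(seq_a)
```

Composition alone cannot tell the two sequences apart, but order can, which is why a sequence of even 12 symbols can specify one option out of more than 16 million.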

3.5 Conceptual Model

In lifesimulatoR, molecular evolution is represented using symbolic molecular sequences.

The model contains five major processes:

  1. Molecular populations
  2. Fitness evaluation
  3. Replication
  4. Mutation
  5. Selection

Together, these processes generate evolutionary dynamics.

Although simplified, the model captures the essential logic underlying evolutionary change.

3.6 Creating a Molecular Population

Evolution begins with a population of molecules.

pool <- create_prebiotic_pool(
  n_molecules = 50,
  alphabet = c("A", "U", "G", "C"),
  min_length = 5,
  max_length = 15,
  seed = 123
)

head(pool)
## [1] "UGUUUGA"         "UUAUGCAG"        "ACAAAGCUGUAUGCU" "GGACG"           "AGAAUGGCAGAGCU"  "UAACCGAUAAGAU"

Each sequence represents a symbolic molecule.

Although these molecules are simplified abstractions, they allow us to explore the logic of molecular evolution.
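
To make the idea of a symbolic pool concrete, here is a minimal base R sketch of how such a pool could be generated. It is an illustration only: random_pool is a hypothetical helper, not the implementation behind create_prebiotic_pool(), and it will not reproduce the sequences shown above even with the same seed.

```r
# Hypothetical helper, NOT the lifesimulatoR implementation.
# Draws a length for each molecule, then fills it with random symbols.
random_pool <- function(n_molecules, alphabet, min_length, max_length) {
  n_lengths <- max_length - min_length + 1
  # sample.int() stays correct even when min_length equals max_length.
  lengths <- min_length + sample.int(n_lengths, n_molecules, replace = TRUE) - 1
  vapply(
    lengths,
    function(len) paste(sample(alphabet, len, replace = TRUE), collapse = ""),
    character(1)
  )
}

set.seed(123)
toy_pool <- random_pool(
  n_molecules = 50,
  alphabet = c("A", "U", "G", "C"),
  min_length = 5,
  max_length = 15
)
```

Every symbol is drawn independently and uniformly here, which is the simplest possible assumption; a real prebiotic pool would be biased by chemistry.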

3.7 Molecular Fitness

Not all molecules are equally successful.

Fitness measures how likely a molecule is to persist or to contribute descendants to future generations.

In real chemistry, fitness may depend on factors such as:

  • Stability
  • Catalytic activity
  • Replication efficiency
  • Environmental conditions
  • Resource availability

In lifesimulatoR, fitness is represented using a simplified scoring function.

molecules <- c(
  "AUGC",
  "AAAAUUUU",
  "GCGCGC",
  "AUAUAUAUAUAU"
)

fitness <- molecule_fitness(molecules)

data.frame(
  molecule = molecules,
  fitness = fitness
)
##       molecule   fitness
## 1         AUGC 0.6993290
## 2     AAAAUUUU 1.0687308
## 3       GCGCGC 0.8876282
## 4 AUAUAUAUAUAU 1.2500000

3.8 Fitness Landscapes

Evolution is often visualized using the concept of a fitness landscape.

A fitness landscape maps molecular structures to fitness values.

  • High-fitness molecules occupy peaks.
  • Low-fitness molecules occupy valleys.
  • Mutation moves populations through the landscape.
  • Selection tends to push populations toward higher-fitness regions.

Although fitness landscapes are simplified conceptual tools, they help explain how populations can gradually become better adapted.

A useful way to think about evolution is as a search process moving through a landscape of possibilities.
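
The search metaphor can be sketched with a toy landscape in base R. The example assumes an arbitrary target sequence as the single fitness peak and scores a molecule by the fraction of positions that match it. The target, toy_fitness, and single_mutants are all hypothetical and unrelated to the scoring used by molecule_fitness().

```r
alphabet <- c("A", "U", "G", "C")
target <- "AUGCAU"  # hypothetical fitness peak, chosen arbitrarily

# Toy fitness: fraction of positions that match the target.
toy_fitness <- function(sequence) {
  mean(strsplit(sequence, "")[[1]] == strsplit(target, "")[[1]])
}

# All sequences that differ from `sequence` at exactly one position.
single_mutants <- function(sequence) {
  chars <- strsplit(sequence, "")[[1]]
  neighbors <- character(0)
  for (i in seq_along(chars)) {
    for (new_symbol in setdiff(alphabet, chars[i])) {
      variant <- chars
      variant[i] <- new_symbol
      neighbors <- c(neighbors, paste(variant, collapse = ""))
    }
  }
  neighbors
}

# Greedy uphill walk: move to the best one-step neighbor until none is better.
current <- "CCCCCC"
path <- current
repeat {
  neighbors <- single_mutants(current)
  scores <- vapply(neighbors, toy_fitness, numeric(1))
  if (max(scores) <= toy_fitness(current)) break
  current <- neighbors[which.max(scores)]
  path <- c(path, current)
}
```

Because this landscape has only one peak, the walk always reaches the target. On a rugged landscape with several peaks, the same greedy rule can stall on a local peak, which is one reason variation matters.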

3.9 Replication

Replication is the process by which molecules produce copies of themselves.

Without replication, successful molecules cannot become more common.

next_generation <- replicate_molecules(
  molecules = molecules,
  n_molecules = 20,
  selection_strength = 1
)

next_generation
##  [1] "AAAAUUUU"     "AAAAUUUU"     "AUGC"         "AUAUAUAUAUAU" "AUAUAUAUAUAU" "GCGCGC"       "AUAUAUAUAUAU"
##  [8] "AUAUAUAUAUAU" "AUAUAUAUAUAU" "GCGCGC"       "AUAUAUAUAUAU" "AUGC"         "AAAAUUUU"     "AAAAUUUU"    
## [15] "GCGCGC"       "AUAUAUAUAUAU" "AUAUAUAUAUAU" "AUGC"         "AAAAUUUU"     "GCGCGC"

Replication allows information to persist through time.

Every evolutionary process depends on some mechanism that preserves information from one generation to the next.
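
One common way to model replication with differential success is fitness-proportional sampling: each offspring picks a parent with probability proportional to that parent's fitness. The base R sketch below uses the fitness values from the table in Section 3.7. Whether replicate_molecules() works this way internally is an assumption, and sample_offspring is a hypothetical helper.

```r
molecules <- c("AUGC", "AAAAUUUU", "GCGCGC", "AUAUAUAUAUAU")
# Fitness values copied from the table in Section 3.7.
fitness <- c(0.6993290, 1.0687308, 0.8876282, 1.2500000)

# Hypothetical helper: draw offspring with probability proportional to fitness.
sample_offspring <- function(molecules, fitness, n_offspring) {
  sample(molecules, size = n_offspring, replace = TRUE, prob = fitness)
}

set.seed(1)
offspring <- sample_offspring(molecules, fitness, n_offspring = 10000)

# Compare observed offspring shares with the shares expected from fitness.
observed <- table(factor(offspring, levels = molecules)) / length(offspring)
expected <- fitness / sum(fitness)
```

With a large number of offspring, the observed shares approach the expected ones. With only 20 offspring, as in the example above, chance has a much larger effect on the outcome.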

3.10 Mutation

Mutation introduces novelty into a molecular population.

Without mutation, evolution would eventually stop because no new variants could appear.

original <- "AUGCAUGCAUGC"

mutated <- mutate_sequence(
  sequence = original,
  alphabet = c("A", "U", "G", "C"),
  mutation_rate = 0.05
)

data.frame(
  original = original,
  mutated = mutated
)
##       original      mutated
## 1 AUGCAUGCAUGC AUGGAUGCGUGC

Mutation generates diversity, but excessive mutation can also destroy information.
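
A per-symbol mutation step is simple to write in base R. The sketch below is a hypothetical stand-in, not the code behind mutate_sequence(): it assumes each position mutates independently with probability mutation_rate, and that a mutated position always changes to a different symbol.

```r
# Hypothetical helper, NOT the lifesimulatoR implementation.
point_mutate <- function(sequence, alphabet, mutation_rate) {
  chars <- strsplit(sequence, "")[[1]]
  # Decide independently for each position whether it mutates.
  hit <- runif(length(chars)) < mutation_rate
  for (i in which(hit)) {
    # ASSUMPTION: a hit always substitutes a different symbol.
    chars[i] <- sample(setdiff(alphabet, chars[i]), 1)
  }
  paste(chars, collapse = "")
}

set.seed(7)
alphabet <- c("A", "U", "G", "C")
unchanged <- point_mutate("AUGCAUGCAUGC", alphabet, mutation_rate = 0)
scrambled <- point_mutate("AUGCAUGCAUGC", alphabet, mutation_rate = 1)
```

The two extreme rates show both ends of the range: a rate of 0 returns a perfect copy, while a rate of 1 changes every position and keeps nothing of the original except its length.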

3.11 Mutation Rate and the Evolutionary Trade-Off

Mutation creates one of evolution’s most important trade-offs.

Low mutation rates:

  • Preserve information
  • Limit innovation

High mutation rates:

  • Generate novelty
  • Risk destroying successful structures

Evolution requires a balance between stability and innovation.

Too little mutation limits exploration. Too much mutation prevents successful information from being preserved.
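
The trade-off can be quantified under one simplifying assumption: every position is copied independently with the same error probability. In that case a sequence of length L is copied without any error with probability (1 - mutation rate)^L. The mutation rates in the base R sketch below are illustrative values, not package defaults.

```r
mutation_rates <- c(0.001, 0.01, 0.05, 0.2)  # illustrative values
sequence_length <- 12                        # length of "AUGCAUGCAUGC"

trade_off <- data.frame(
  mutation_rate = mutation_rates,
  # Average number of changed positions per copy.
  expected_mutations = mutation_rates * sequence_length,
  # Probability that a copy contains no errors at all.
  p_perfect_copy = (1 - mutation_rates)^sequence_length
)
```

At the rate of 0.05 used in Section 3.10, only about 54% of copies of a 12-symbol sequence are exact. Longer sequences fare worse at the same rate, because every extra position is another chance for an error.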

3.12 The Error Threshold

One of the most important concepts in molecular evolution is the error threshold.

If mutation rates become too high:

  • Successful sequences cannot be preserved.
  • Information is lost.
  • Selection becomes ineffective.

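This idea is closely associated with Manfred Eigen’s work on early replicators (Eigen, 1971), which showed that the longer a sequence is, the lower its per-symbol error rate must be for selection to maintain it.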

A major challenge for the earliest evolving systems may have been maintaining enough copying accuracy to preserve information while still generating useful variation.

The error threshold remains a central concept in studies of the origin of life.
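
In its simplest single-peak form, the error threshold can be written as a condition on sequence length. If the best sequence replicates faster than its competitors by a factor called the superiority, it is maintained only while superiority × (1 - mutation rate)^L > 1. The base R sketch below solves that inequality for L. It is a textbook simplification that assumes independent errors and a single favored sequence, and max_length is a hypothetical helper name.

```r
# Largest sequence length that selection can maintain, from
#   superiority * (1 - mutation_rate)^L > 1
#   =>  L < log(superiority) / -log(1 - mutation_rate)
max_length <- function(mutation_rate, superiority) {
  log(superiority) / -log(1 - mutation_rate)
}

# Illustrative values: a tenfold replication advantage at three error rates.
thresholds <- data.frame(
  mutation_rate = c(0.001, 0.01, 0.1),
  max_length = max_length(c(0.001, 0.01, 0.1), superiority = 10)
)
```

With a 1% error rate per symbol and a tenfold advantage, the bound is roughly 230 symbols. Because the bound depends only on the logarithm of the advantage, better copying accuracy raises it far more effectively than a larger advantage does.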

3.13 Selection

Selection occurs whenever some molecules contribute more descendants than others.

Selection is not a conscious process. It emerges automatically whenever certain variants persist or replicate more successfully.

In the simulation:

  • Weak selection produces nearly neutral evolution.
  • Strong selection favors fitter molecules more strongly.

Selection acts as a filter on variation.

Mutation generates possibilities. Selection determines which possibilities persist.
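
The difference between weak and strong selection can be illustrated by raising fitness to a power before normalizing. This exponent form is an assumption made for illustration and may differ from how lifesimulatoR interprets selection_strength. selection_probs is a hypothetical helper, and the fitness values are those from Section 3.7.

```r
# Fitness values copied from the table in Section 3.7.
fitness <- c(
  AUGC = 0.6993290,
  AAAAUUUU = 1.0687308,
  GCGCGC = 0.8876282,
  AUAUAUAUAUAU = 1.2500000
)

# ASSUMPTION: selection strength acts as an exponent on fitness.
selection_probs <- function(fitness, selection_strength) {
  weights <- fitness^selection_strength
  weights / sum(weights)
}

neutral <- selection_probs(fitness, selection_strength = 0)   # all equal
moderate <- selection_probs(fitness, selection_strength = 1)  # proportional
strong <- selection_probs(fitness, selection_strength = 5)    # skewed to the fittest
```

Under this assumption a strength of 0 gives every molecule the same chance of being copied, so only chance changes the population. Larger values concentrate reproduction on the fittest variant.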

3.14 Evolving One Generation

Mutation, replication, and selection can be combined into a single evolutionary step.

next_generation <- evolve_generation(
  molecules = pool,
  mutation_rate = 0.02,
  selection_strength = 1
)

head(next_generation)
## [1] "UUAUGCAG"     "GCCGUGC"      "UUCACUACCA"   "UAUGCAAC"     "UCUGAUCACUAC" "UUUGCGGUAUC"

This represents one generation of molecular evolution.

Repeated application of this process can produce long-term evolutionary change.

3.15 Simulating Molecular Evolution

The full simulation repeatedly applies mutation, replication, and selection across many generations.

sim <- simulate_abiogenesis(
  n_molecules = 100,
  generations = 200,
  mutation_rate = 0.02,
  selection_strength = 1,
  seed = 123
)

head(sim)
## # A tibble: 6 × 6
##   generation n_molecules mean_length mean_fitness diversity max_fitness
##        <int>       <int>       <dbl>        <dbl>     <int>       <dbl>
## 1          0         100        12.6         1.00       100        1.25
## 2          1         100        12.7         1.04        67        1.25
## 3          2         100        12.3         1.05        61        1.25
## 4          3         100        12.3         1.11        61        1.25
## 5          4         100        12.5         1.11        48        1.25
## 6          5         100        12.8         1.13        53        1.25
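
The loop behind such a simulation can be sketched in a few lines of base R. Everything in the example is a stand-in chosen for illustration: the toy fitness rule (1 plus the fraction of G or C symbols), the fixed sequence length, and the helper names toy_fitness and mutate_pool are assumptions, not the internals of simulate_abiogenesis().

```r
set.seed(42)
alphabet <- c("A", "U", "G", "C")
n_molecules <- 100
generations <- 50

# Toy fitness (arbitrary assumption): 1 plus the fraction of G or C symbols.
toy_fitness <- function(pool) {
  vapply(strsplit(pool, ""), function(ch) 1 + mean(ch %in% c("G", "C")), numeric(1))
}

# Per-symbol mutation; a hit may redraw the same symbol by chance.
mutate_pool <- function(pool, mutation_rate) {
  vapply(strsplit(pool, ""), function(ch) {
    hit <- runif(length(ch)) < mutation_rate
    if (any(hit)) ch[hit] <- sample(alphabet, sum(hit), replace = TRUE)
    paste(ch, collapse = "")
  }, character(1))
}

# Start from random sequences of length 10.
pool <- replicate(n_molecules, paste(sample(alphabet, 10, replace = TRUE), collapse = ""))

history <- data.frame(
  generation = 0:generations,
  mean_fitness = NA_real_,
  diversity = NA_integer_
)

for (g in 0:generations) {
  fit <- toy_fitness(pool)
  # Record summary statistics before producing the next generation.
  history$mean_fitness[g + 1] <- mean(fit)
  history$diversity[g + 1] <- length(unique(pool))
  # Selection and replication: parents are drawn in proportion to fitness.
  parents <- sample(pool, n_molecules, replace = TRUE, prob = fit)
  # Mutation: offspring are imperfect copies of their parents.
  pool <- mutate_pool(parents, mutation_rate = 0.02)
}
```

The history data frame has the same general shape as the summary table above, with one row per generation, so it can be inspected with head(history) or plotted in the same way.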

3.16 Visualizing Fitness Through Time

plot_simulation(
  sim,
  x = "generation",
  y = "mean_fitness"
)

This plot helps reveal whether fitter molecular variants become more common through time.

3.17 Visualizing Diversity Through Time

plot_simulation(
  sim,
  x = "generation",
  y = "diversity"
)

Diversity provides information about how broadly the population is exploring sequence space.
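
Because the diversity column in the simulation output holds integers, it plausibly counts distinct sequences, although that reading is an assumption. The base R sketch below computes that count for the 20 molecules printed in Section 3.9, along with Shannon entropy as an alternative measure that also reflects how evenly the variants are represented.

```r
# The 20 molecules printed in Section 3.9, grouped by variant.
population <- c(
  rep("AAAAUUUU", 5),
  rep("AUGC", 3),
  rep("GCGCGC", 4),
  rep("AUAUAUAUAUAU", 8)
)

# Richness: number of distinct sequences (assumed meaning of `diversity`).
n_distinct <- length(unique(population))

# Shannon entropy: highest when all variants are equally common.
shares <- as.numeric(table(population)) / length(population)
shannon <- -sum(shares * log(shares))
max_shannon <- log(n_distinct)
```

Richness treats a variant with one copy the same as a variant with fifty, so the two measures can disagree. Entropy falls as one variant takes over, even while rare variants keep the richness unchanged.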

3.18 Interpreting Simulation Results

Several outcomes are possible.

3.18.1 Increasing Fitness

Rising mean fitness may indicate:

  • Successful variants becoming more common
  • Selection acting effectively

3.18.2 Increasing Diversity

Rising diversity may indicate:

  • Exploration of new sequence space
  • Significant mutation activity

3.18.3 Decreasing Diversity

Falling diversity may indicate:

  • Dominance by a small number of successful variants
  • Strong selective pressures

3.18.4 Stable Values

Values that stay roughly constant may indicate:

  • Equilibrium under the assumptions of the model
  • Balance between mutation and selection
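
The patterns described in this section can be checked numerically by fitting a straight line to each summary statistic. The base R sketch below does this for the six rows printed in Section 3.15, with the values typed in by hand so that the example runs without the package.

```r
# First six generations, copied from the output in Section 3.15.
sim_head <- data.frame(
  generation = 0:5,
  mean_fitness = c(1.00, 1.04, 1.05, 1.11, 1.11, 1.13),
  diversity = c(100, 67, 61, 61, 48, 53)
)

# Slope of a least-squares line: average change per generation.
fitness_slope <- unname(
  coef(lm(mean_fitness ~ generation, data = sim_head))["generation"]
)
diversity_slope <- unname(
  coef(lm(diversity ~ generation, data = sim_head))["generation"]
)
```

A positive fitness slope together with a negative diversity slope matches the pattern of fitter variants spreading while others are lost. Six generations are too few to support firm conclusions, so the same check is more informative when applied to the full sim table.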

3.19 Evolution as Information Processing

Evolution can also be understood as a process that accumulates information.

Mutation generates possibilities.

Selection filters possibilities.

Replication preserves successful patterns.

Together, these processes can gradually increase organization within a population.

From this perspective, evolution can be viewed as a mechanism for discovering and preserving useful information.

3.20 Limitations of the Model

This model intentionally omits many aspects of real chemistry, including:

  • RNA folding
  • Catalytic mechanisms
  • Resource competition
  • Environmental fluctuations
  • Thermodynamics
  • Spatial organization
  • Membrane compartmentalization
  • Metabolic networks

The goal is conceptual clarity rather than chemical realism.

3.21 Connections to Later Chapters

Molecular evolution addresses how information-bearing molecules may change through time.

However, life requires more than information.

Living systems also require:

  • Compartments
  • Networks
  • Energy flow
  • Self-maintenance

The next chapters explore diversity, complexity, protocells, and autocatalytic networks to examine how these additional ingredients may have contributed to the emergence of life.

3.22 Key Takeaways

  • Molecular evolution may have preceded modern cellular life.
  • The RNA World hypothesis provides one possible framework for early evolution.
  • Replication, mutation, and selection generate evolutionary change.
  • Fitness landscapes help explain adaptation.
  • Beyond the error threshold, mutation destroys information faster than selection can preserve it.
  • Evolution can be viewed as a form of information processing.
  • Molecular populations can accumulate organization through repeated cycles of variation and selection.
  • lifesimulatoR provides simplified computational models for exploring these concepts.

3.23 Suggested Readings

  • Eigen, M. (1971). Self-Organization of Matter and the Evolution of Biological Macromolecules.
  • Gilbert, W. (1986). The RNA World.
  • Joyce, G. F. (2002). The Antiquity of RNA-Based Evolution.
  • Maynard Smith, J., & Szathmáry, E. (1999). The Origins of Life.
  • Kauffman, S. (1993). The Origins of Order.

3.24 Reflection Questions

  1. Why is information essential for evolution?
  2. Could evolution occur before cells existed?
  3. What limits the mutation rate of early replicators?
  4. How does the concept of a fitness landscape help explain adaptation?
  5. Is higher fitness always associated with greater complexity?
  6. Could molecular evolution begin before metabolism?
  7. What additional mechanisms are needed to transform evolving molecules into living systems?
  8. What aspects of molecular evolution are not captured by symbolic sequence models?
  9. Can selection create information, or only preserve it?
  10. What role might molecular evolution have played in the transition from chemistry to biology?