Chapter 4 Diversity, Entropy, and Complexity

4.1 Why Diversity Matters

One of the most important questions in origin-of-life research is how simple chemical systems became sufficiently complex to give rise to life.

A key ingredient in this process is diversity.

A system containing many different molecular types can explore more possibilities than one containing only a few. Diversity creates opportunities for new interactions, new structures, and potentially new functions.

However, diversity alone is not the same as life.

A completely random mixture of molecules may be highly diverse but possess no organization, no persistence, and no capacity for evolution.

This distinction is important:

Diversity creates possibilities, but organization creates life-like behaviour.

Understanding the relationship between diversity, entropy, and complexity helps us investigate how chemical systems may transition from randomness to biological organization.

4.2 Diversity in Origin-of-Life Research

Many origin-of-life theories rely on diversity as a source of innovation.

Examples include:

  • Diverse molecular pools exploring chemical space
  • Diverse catalytic networks generating new reactions
  • Diverse protocell populations competing for resources
  • Diverse replicators undergoing selection

Without variation, evolution cannot occur.

Without diversity, selection has nothing to act upon.

Consequently, diversity is often viewed as the raw material of evolutionary change.

4.3 What Is Entropy?

Entropy is a concept used in several scientific disciplines.

In thermodynamics, entropy is often associated with disorder and the number of possible microscopic arrangements of a system.

In information theory, entropy measures uncertainty or unpredictability.

Although these definitions are related, they are not identical.

This chapter focuses primarily on Shannon entropy, which is widely used to quantify diversity in populations.

4.4 Shannon Entropy

Claude Shannon introduced his information-theoretic entropy in 1948 as a measure of information and uncertainty.

Suppose a population contains several categories with different abundances.

If one category dominates, uncertainty is low because outcomes are predictable.

If abundances are evenly distributed, uncertainty is higher because outcomes are less predictable.

Shannon entropy quantifies this uncertainty.

Higher entropy generally indicates:

  • Greater diversity
  • More even distributions
  • Less predictability

Lower entropy generally indicates:

  • Less diversity
  • Dominance by a few categories
  • Greater predictability
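The verbal description above corresponds to the standard definition of Shannon entropy. As a sketch, for k categories with abundance counts n_i and total N, and assuming a base-2 logarithm (an assumption, though the outputs in Sections 4.7 and 4.8 are consistent with it):

```latex
H = -\sum_{i=1}^{k} p_i \log_2 p_i,
\qquad p_i = \frac{n_i}{N},
\qquad N = \sum_{i=1}^{k} n_i,
\qquad 0 \le H \le \log_2 k
```

H reaches 0 when a single category holds every individual and reaches its maximum, log_2 k, when all k categories are equally abundant.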

4.5 Conceptual Model

In lifesimulatoR, diversity can be explored through:

  • symbolic molecular populations,
  • abundance distributions,
  • summary statistics,
  • entropy calculations,
  • evolutionary simulations.

These tools help us examine how diversity changes through time and how diversity interacts with mutation and selection.

4.6 Summarizing a Molecular Population

Before measuring diversity, it is useful to summarize the molecular population.

library(lifesimulatoR)

# Five symbolic RNA-like molecules; "AUGC" appears twice
molecules <- c(
  "AUGC",
  "AUGC",
  "UUUU",
  "GCGCGC",
  "AUAUAUAUAU"
)

summarize_molecules(
  molecules = molecules,
  generation = 0
)
## # A tibble: 1 × 6
##   generation n_molecules mean_length mean_fitness diversity max_fitness
##        <dbl>       <int>       <dbl>        <dbl>     <int>       <dbl>
## 1          0           5         5.6        0.787         4        1.20

The summary may include information such as:

  • Number of molecules
  • Mean sequence length
  • Diversity (4 in this example, which matches the number of distinct sequences among the five molecules)
  • Mean fitness
  • Maximum fitness

These metrics provide a high-level description of the population.
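The sequence-based columns of the summary can be reproduced with base R. The following is a minimal sketch of what those columns appear to measure, not the package's implementation; the variable names are illustrative, and the fitness columns are left out because the fitness function is not described in this chapter.

```r
# Same example population as in the text
molecules <- c("AUGC", "AUGC", "UUUU", "GCGCGC", "AUAUAUAUAU")

n_molecules <- length(molecules)          # total number of molecules
mean_length <- mean(nchar(molecules))     # average sequence length
diversity   <- length(unique(molecules))  # number of distinct sequences

c(n_molecules = n_molecules, mean_length = mean_length, diversity = diversity)
```

The three values (5, 5.6, and 4) agree with the corresponding columns in the tibble above.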

4.7 Measuring Shannon Entropy

Entropy can be calculated directly from abundance counts.

# Abundance counts for three categories (for example, three molecular types)
counts <- c(10, 5, 1)

shannon_entropy(counts)
## [1] 1.198192

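The result represents the uncertainty associated with randomly selecting an individual from the population. The value is consistent with a base-2 logarithm, so it can be read as about 1.2 bits; three equally abundant categories would give the maximum of log2(3) ≈ 1.585 bits.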
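To make the calculation concrete, here is a minimal base R sketch of Shannon entropy. The function name `shannon_entropy_sketch` is hypothetical and this is not the package's source code; it assumes a base-2 logarithm and drops zero counts, following the usual convention that 0 * log(0) contributes nothing.

```r
# Hypothetical re-implementation for illustration only
shannon_entropy_sketch <- function(counts) {
  counts <- counts[counts > 0]   # empty categories contribute no uncertainty
  p <- counts / sum(counts)      # convert counts to proportions
  -sum(p * log2(p))              # entropy in bits
}

# Entropy of the abundance counts used in the text
entropy_counts <- shannon_entropy_sketch(c(10, 5, 1))

# Counts can also be derived from a vector of sequences with table()
molecules <- c("AUGC", "AUGC", "UUUU", "GCGCGC", "AUAUAUAUAU")
entropy_molecules <- shannon_entropy_sketch(as.numeric(table(molecules)))

entropy_counts
entropy_molecules
```

For the counts 10, 5, and 1 the sketch gives about 1.198, matching the output above.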

4.8 Comparing Low and High Diversity Systems

Consider two populations.

The first population is dominated by one category.

The second population is evenly distributed.

# One dominant category and three rare ones
low_diversity <- c(100, 1, 1, 1)

# Four equally abundant categories
high_diversity <- c(25, 25, 25, 25)

shannon_entropy(low_diversity)
## [1] 0.2361547
shannon_entropy(high_diversity)
## [1] 2

The evenly distributed population has higher entropy because outcomes are less predictable. Its value of 2 is the maximum possible for four categories, since log2(4) = 2.

This illustrates one of the most important properties of entropy:

Entropy increases when abundance is distributed more evenly among categories.
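Because the maximum entropy depends on the number of categories, it can help to rescale entropy to the range 0 to 1. One common choice, not part of this chapter's toolkit as far as the text shows, is Pielou's evenness: entropy divided by its maximum for the observed number of categories. The function names below are hypothetical and the sketch assumes base-2 logarithms.

```r
# Hypothetical helper: Shannon entropy in bits
entropy_bits <- function(counts) {
  counts <- counts[counts > 0]
  p <- counts / sum(counts)
  -sum(p * log2(p))
}

# Pielou's evenness: observed entropy relative to the maximum log2(S),
# where S is the number of categories with at least one individual
pielou_evenness <- function(counts) {
  s <- sum(counts > 0)
  if (s < 2) return(NA_real_)    # evenness is undefined for a single category
  entropy_bits(counts) / log2(s)
}

evenness_low  <- pielou_evenness(c(100, 1, 1, 1))
evenness_high <- pielou_evenness(c(25, 25, 25, 25))

evenness_low
evenness_high
```

The dominated population scores about 0.118 and the even population scores 1, so the two can be compared on the same scale even when populations differ in the number of categories.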

4.9 Diversity Versus Randomness

A common misconception is that high entropy automatically means high complexity.

This is not necessarily true.

Consider three systems:

4.9.1 Ordered System

AAAAAAAAAAAA

Characteristics:

  • Low diversity
  • Low entropy
  • Low complexity

4.9.2 Random System

AUGCGAUUGCGA

Characteristics:

  • High diversity
  • High entropy
  • Often low organization

4.9.3 Organized System

Functional replicator

Characteristics:

  • Moderate diversity
  • Structured information
  • Potentially high complexity

Life appears to occupy a middle ground between complete order and complete randomness.

This idea is central to many theories of emergence and self-organization.
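The gap between entropy and organization can be illustrated by computing Shannon entropy over the symbols within a sequence. This is a sketch under stated assumptions: `symbol_entropy` is a hypothetical helper, base-2 logarithms are assumed, and the periodic sequence "AUGCAUGCAUGC" is added here purely for illustration.

```r
# Hypothetical helper: entropy (in bits) of the symbol composition of one sequence
symbol_entropy <- function(sequence) {
  symbols <- strsplit(sequence, "")[[1]]   # split into single characters
  counts <- as.numeric(table(symbols))     # how often each symbol occurs
  p <- counts / sum(counts)
  -sum(p * log2(p))
}

h_ordered  <- symbol_entropy("AAAAAAAAAAAA")  # ordered system from the text
h_random   <- symbol_entropy("AUGCGAUUGCGA")  # random system from the text
h_periodic <- symbol_entropy("AUGCAUGCAUGC")  # illustrative repeating pattern

c(ordered = h_ordered, random = h_random, periodic = h_periodic)
```

The ordered sequence has entropy 0 and the random one about 1.96 bits, but the strictly repeating sequence reaches the maximum of 2 bits. Composition-based entropy ignores the order of symbols, so on its own it cannot separate a random sequence from a highly regular one, which is why high entropy should not be read as high complexity.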

4.10 Diversity During Evolution

Mutation and selection continually influence diversity.

Mutation tends to create new variants.

Selection tends to amplify successful variants, which usually lowers diversity as less fit variants are displaced.

The interaction between these forces determines how diversity changes through time.

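# Simulate 100 molecules for 100 generations under mutation and selection
sim <- simulate_abiogenesis(
  n_molecules = 100,
  generations = 100,
  mutation_rate = 0.02,
  selection_strength = 1,
  seed = 123  # fixed seed for reproducibility
)

# Show the first six rows (generations 0 to 5)
head(sim)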
## # A tibble: 6 × 6
##   generation n_molecules mean_length mean_fitness diversity max_fitness
##        <int>       <int>       <dbl>        <dbl>     <int>       <dbl>
## 1          0         100        12.6         1.00       100        1.25
## 2          1         100        12.7         1.04        67        1.25
## 3          2         100        12.3         1.05        61        1.25
## 4          3         100        12.3         1.11        61        1.25
## 5          4         100        12.5         1.11        48        1.25
## 6          5         100        12.8         1.13        53        1.25
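The rows printed above already hint at how diversity and fitness move together. The sketch below copies the rounded values of three columns from those six rows into a plain data frame, so it runs without the package; with the full `sim` object the same two lines of analysis could be applied to all generations.

```r
# Values copied (rounded as printed) from the first six rows of the output above
first_rows <- data.frame(
  generation   = 0:5,
  mean_fitness = c(1.00, 1.04, 1.05, 1.11, 1.11, 1.13),
  diversity    = c(100, 67, 61, 61, 48, 53)
)

# Generation-to-generation change in the number of distinct molecules
diversity_change <- diff(first_rows$diversity)

# Association between mean fitness and diversity over these generations
fitness_diversity_cor <- cor(first_rows$mean_fitness, first_rows$diversity)

diversity_change
fitness_diversity_cor
```

Diversity drops by 33 in the first generation and then changes more slowly, with one small rebound, while mean fitness rises. The correlation between the two columns is negative over these rows, though six generations from a single seeded run are too few to support a general conclusion.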

4.11 Visualizing Diversity Through Time

plot_simulation(
  sim,
  x = "generation",
  y = "diversity"
)

This plot allows us to examine how molecular diversity changes throughout the simulation.

4.13 Complexity and the Origin of Life

Complexity is one of the most challenging concepts in science.

A system may be considered complex when:

  • Many components interact
  • Patterns emerge across multiple scales
  • The whole exhibits properties not obvious from the parts

Life is often regarded as a complex system because:

  • Molecules interact in networks
  • Information is stored and transmitted
  • Feedback loops exist
  • Evolution generates adaptation

One of the central challenges in origin-of-life research is explaining how complexity emerged from simpler chemical systems.

4.14 Diversity Is Not Enough

Diversity is necessary for evolution, but it is not sufficient.

A highly diverse system may still lack:

  • Replication
  • Heredity
  • Selection
  • Compartmentalization
  • Self-maintenance

Many origin-of-life theories therefore focus on understanding how diversity became organized into increasingly integrated systems.

This transition from diversity to organization may represent one of the most important steps in the emergence of life.

4.15 Connections to Other Chapters

This chapter builds upon:

  • Prebiotic molecular pools
  • Molecular evolution

It also prepares the foundation for:

  • Protocells
  • Autocatalytic networks
  • Emergence
  • Information theory
  • Complexity science

These topics explore how diversity becomes structured and how organization emerges from interacting components.

4.16 Key Takeaways

  • Diversity provides the variation required for evolution.
  • Shannon entropy measures uncertainty and diversity.
  • High entropy does not necessarily imply high complexity.
  • Mutation tends to increase diversity.
  • Selection often reduces diversity.
  • Complexity emerges from interactions among components.
  • Life appears to occupy a region between complete order and complete randomness.
  • Understanding diversity is essential for understanding the origin of life.

4.17 Suggested Readings

  • Shannon, C. E. (1948). A Mathematical Theory of Communication.
  • Maynard Smith, J., & Szathmáry, E. (1999). The Origins of Life.
  • Kauffman, S. (1993). The Origins of Order.
  • Adami, C. (2016). What Is Information?
  • Walker, S. I. (2017). Origins of Life and Complexity.

4.18 Reflection Questions

  1. Is a highly diverse system always more life-like?
  2. Can selection reduce diversity while increasing organization?
  3. How is Shannon entropy different from thermodynamic entropy?
  4. Why is high entropy not necessarily equivalent to high complexity?
  5. Can complexity emerge without evolution?
  6. How much diversity is needed before selection becomes effective?
  7. Could life emerge in a system with very low diversity?
  8. What additional complexity metrics could be added to future simulations?
  9. Why does life appear to exist between complete order and complete randomness?
  10. How might diversity contribute to the emergence of new biological functions?