Note · 2026-04-30 · RESEARCH

Diversity is two things, not one

Most genetic-programming systems collapse all diversity into a single concept. Ufinq treats structural diversity and behavioural diversity as independent properties — because they fail in different ways, for different reasons, at different times.

Premature convergence is the canonical failure mode of evolutionary search: the population stops exploring the space and concentrates on a single local optimum. The standard mitigation is to encourage diversity in the population. The conventional way to do this is to compute one diversity metric — typically a structural one, like average tree-edit distance between candidates — and use it as a soft constraint or a niching pressure.

This is one diversity metric short of what is actually needed. Two candidates can be structurally distinct and behaviourally identical: two spellings of the same function. Two candidates can be structurally identical and behaviourally distinct: the same skeleton with different numeric constants that produce different predictions. Either case on its own is enough to hide the population’s real state from an algorithm that tracks only one of the two properties.
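The two cases can be made concrete with a toy example. The sketch below is not Ufinq code: the nested-tuple expression format and the helper names `evaluate` and `skeleton` are assumptions introduced purely for illustration.

```python
# Hypothetical toy representation: an expression is "x", a number,
# or a tuple (op, left, right) with op in {"add", "mul"}.

def evaluate(expr, x):
    """Evaluate a tiny expression tree at input x."""
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    a, b = evaluate(left, x), evaluate(right, x)
    return a + b if op == "add" else a * b

def skeleton(expr):
    """Keep only the shape: every numeric constant becomes the placeholder "C"."""
    if expr == "x":
        return "x"
    if isinstance(expr, (int, float)):
        return "C"
    op, left, right = expr
    return (op, skeleton(left), skeleton(right))

inputs = [0.0, 1.0, 2.0, 3.0]

def outputs(expr):
    """Output vector of an expression over the sample inputs."""
    return [evaluate(expr, x) for x in inputs]

# Case 1: structurally distinct, behaviourally identical (x + x versus 2 * x).
double_a = ("add", "x", "x")
double_b = ("mul", 2, "x")
case1_trees_differ = double_a != double_b
case1_outputs_match = outputs(double_a) == outputs(double_b)

# Case 2: same skeleton, behaviourally distinct (2 * x versus 5 * x).
scale_2 = ("mul", 2, "x")
scale_5 = ("mul", 5, "x")
case2_skeletons_match = skeleton(scale_2) == skeleton(scale_5)
case2_outputs_differ = outputs(scale_2) != outputs(scale_5)
```

A purely structural metric sees two distinct candidates in case 1 and a single candidate in case 2; a purely behavioural metric sees the opposite.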

The two gates

Ufinq tracks both properties as independent gates on the gene pool:

Structural diversity. Operates on the canonicalised computation graph. Two graphs are structurally equivalent if they reduce to the same canonical form under the simplification rule DSL plus the e-graph rewriter. Structural-diversity gates prevent the population from accumulating multiple spellings of the same hypothesis.
Behavioural diversity. Operates on the candidate’s output vector across a representative sample of inputs. Two candidates are behaviourally equivalent if their output vectors lie within a set tolerance of each other under a chosen distance metric. Behavioural-diversity gates prevent the population from accumulating different-looking candidates that compute the same thing.
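A minimal sketch of what the two gates might look like follows. Everything in it is an assumption made for illustration: the `canonical` function only sorts the operands of commutative operators and stands in for the simplification rule DSL plus e-graph rewriter, which it does not reproduce; the Euclidean distance and the `tolerance` value are placeholder choices for the behavioural metric, which the note leaves unspecified. The toy canonical form also keeps numeric constants, which may differ from how Ufinq treats them.

```python
import math

COMMUTATIVE = {"add", "mul"}  # hypothetical operator set

def canonical(expr):
    """Toy canonical form: recursively sort the operands of commutative ops."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [canonical(a) for a in args]
    if op in COMMUTATIVE:
        args.sort(key=repr)  # any fixed total order works for this toy
    return (op, *args)

def structural_gate(candidate, pool):
    """Pass only if no pool member shares the candidate's canonical form."""
    form = canonical(candidate)
    return all(canonical(member) != form for member in pool)

def behavioural_gate(outputs, pool_outputs, tolerance=1e-6):
    """Pass only if the output vector is farther than `tolerance`
    (Euclidean distance) from every output vector already in the pool."""
    return all(math.dist(outputs, other) > tolerance for other in pool_outputs)

pool = [("add", "x", 1)]
respelled_passes = structural_gate(("add", 1, "x"), pool)   # same form, reordered
new_shape_passes = structural_gate(("mul", "x", "x"), pool)

pool_outputs = [[1.0, 2.0, 3.0]]
duplicate_passes = behavioural_gate([1.0, 2.0, 3.0], pool_outputs)
distinct_passes = behavioural_gate([1.0, 4.0, 9.0], pool_outputs)
```

Here the reordered `1 + x` is rejected by the structural gate and the duplicate output vector is rejected by the behavioural gate, while the other two candidates pass their respective checks.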

Why both, and why independent

These two failure modes are not redundant: a structurally diverse population can still be behaviourally collapsed (many phrasings of the same function), and a behaviourally diverse population can still be structurally trivial (the same skeleton with many parameter settings). Each refined candidate has to pass both gates, checked separately, before it enters the gene pool.
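The admission rule can be sketched as follows, under stated assumptions: candidates are hypothetical dictionaries carrying an already-canonicalised `form` string and an `outputs` vector, the gate bodies are simplified placeholders, and `try_admit` is an illustrative name rather than a Ufinq API. The point of the sketch is that each gate is evaluated on its own and reports its own verdict, so a rejection always says which property collapsed.

```python
def structural_gate(candidate, pool):
    """Pass if no pool member has the same (already canonicalised) form."""
    return all(candidate["form"] != member["form"] for member in pool)

def behavioural_gate(candidate, pool, tolerance=1e-6):
    """Pass if the outputs differ from every pool member's by more than
    `tolerance` in at least one sample point (max absolute difference)."""
    def gap(a, b):
        return max(abs(x - y) for x, y in zip(a, b))
    return all(gap(candidate["outputs"], m["outputs"]) > tolerance for m in pool)

GATES = {"structural": structural_gate, "behavioural": behavioural_gate}

def try_admit(candidate, pool):
    """Evaluate every gate without short-circuiting; admit only if all pass."""
    verdicts = {name: gate(candidate, pool) for name, gate in GATES.items()}
    if all(verdicts.values()):
        pool.append(candidate)
    return verdicts

pool = [{"form": "C * x", "outputs": [0.0, 2.0, 4.0]}]

# Same skeleton, new constant: structurally redundant, behaviourally new.
retuned = try_admit({"form": "C * x", "outputs": [0.0, 5.0, 10.0]}, pool)

# A rephrasing the canonicaliser did not catch: new form, same behaviour.
rephrased = try_admit({"form": "x + x", "outputs": [0.0, 2.0, 4.0]}, pool)

# New in both respects: the only candidate that enters the pool.
novel = try_admit({"form": "x * x", "outputs": [0.0, 1.0, 4.0]}, pool)
```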

An open research question is how to combine the two metrics into a single scalar pressure without throwing away the property that makes them useful separately. So far, treating them as independent constraints rather than a weighted sum has held up across the workloads we benchmark. We expect this to remain the right design as the value substrate widens further.
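A small numeric illustration of why a weighted sum can lose the useful property: the novelty scores in [0, 1], the weight, and the thresholds below are all hypothetical values chosen for the example, not figures from Ufinq or its benchmarks.

```python
def weighted_sum_admits(structural, behavioural, weight=0.5, threshold=0.5):
    """Single scalar pressure: one blended novelty score against one threshold."""
    return weight * structural + (1 - weight) * behavioural >= threshold

def independent_admits(structural, behavioural, s_min=0.2, b_min=0.2):
    """Independent constraints: each novelty score must clear its own floor."""
    return structural >= s_min and behavioural >= b_min

# A new phrasing of a function already in the pool:
# maximal structural novelty, zero behavioural novelty.
rephrasing = (1.0, 0.0)
sum_accepts_rephrasing = weighted_sum_admits(*rephrasing)
gates_accept_rephrasing = independent_admits(*rephrasing)

# A candidate that is moderately new in both respects.
balanced = (0.5, 0.5)
sum_accepts_balanced = weighted_sum_admits(*balanced)
gates_accept_balanced = independent_admits(*balanced)
```

With these numbers the weighted sum admits the rephrasing because high structural novelty compensates for zero behavioural novelty, while the independent constraints reject it; both rules admit the balanced candidate.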