§ 2.6  ·  Research area

Search strategies

Navigating the universal function space — exploration vs exploitation, adaptive search spaces, and the dual diversity gates that hold populations open.

§ 1   Why search strategies

Navigating a universal function space

Symbolic regression searches an open-ended space. There is no fixed parameter vector, no bounded architecture, no closed set of candidates to enumerate. Every function the language can express is in scope, and the language is universal. Selection pressure alone is too coarse a tool for that space — a sharp pressure collapses the population onto the first plausible local optimum; a soft pressure drifts forever.

Ufinq treats search strategy as an explicit, named concern. Branches alternate strategies on succession so the population is repeatedly nudged off whatever attractor it has settled into. Each branch sees a search space that has been narrowed by what its predecessors learned, not the entire universal space. And the gene pool is held open by two diversity gates that operate on different properties — structural form and behavioral output — so neither one can be gamed by the other.

§ 2   Sub-areas

Four threads of work

§ 2.1

Strategy alternation across branches

  • Strict alternation on succession — when a branch enters the arena and its successors are spawned, the children's strategies alternate strictly. The population never runs two consecutive generations under the same strategy unless the alternation is intentional.
  • Per-strategy gene pools — diversity is measured per strategy, not population-wide. A strategy that compresses its sub-population is not penalised by another strategy's larger spread, and a winning strategy cannot quietly collapse the entire population's diversity.
  • SearchSpace.error — when a strategy commits to a search space that the data later contradicts, the contradiction is recorded on the search space rather than silently re-explored. Successor branches see the error and avoid re-entering the same dead end.
  • Open questions — strategy-pool sizing under heterogeneous datasets; the right point at which a strategy should be retired permanently versus temporarily; cross-branch error propagation policy.
§ 2.2

Adaptive search spaces

  • Per-branch SearchSpace — every branch carries its own search space, which encodes what is in scope: which functions, which variables, which structural shapes. The search space is not the entire language; it is the language minus what predecessors have ruled out.
  • Inheritance with subtraction — when a branch forks, children inherit the search space and may further constrain it based on what they learned. Constraints accumulate downward; no branch ever widens its search space against its parent.
  • Error-driven shrinkage — failed candidates contribute structure to the search space's error model. The same shape that produced a non-finite result in one generation gets harder to sample in the next.
  • Open questions — error-model expressiveness (how much of the shape needs to be captured to be useful); search-space serialization cost when delegated across the cluster; recovery when a constraint turns out to be wrong.
§ 2.3

Dual diversity gates

  • Structural diversity — measured on the canonical form of the refined candidate via Node.digest. Two candidates with the same digest occupy the same slot, regardless of how their constants differ. This is what makes meaningful diversity counts possible — refined populations cannot hide duplicates behind constant variation.
  • Behavioral diversity — measured on the candidate's output on the training data. Two candidates with different structure may behave identically; behavioral diversity catches the case structural diversity misses, and vice versa.
  • Post-refinement gating — both gates run after refinement, not before. A candidate that would be a duplicate post-refinement is filtered before it consumes a gene pool slot, even if its raw form looked unique.
  • Open questions — choice of behavioral-similarity metric under different output distributions; gate-interaction policy when a candidate is structurally novel but behaviorally identical to an incumbent (and the converse); calibration of gate strictness as the population matures.
§ 2.4

Selection over the Pareto front

  • No single best — there is no bestGenotype. The coordinating branch maintains a Pareto frontier across (accuracy, complexity, behavior) and selection samples from the frontier rather than picking a singular winner.
  • SearchStrategy as a first-class concept — each branch carries an explicit strategy that determines how it samples from the frontier and how it weights structure versus accuracy. Strategy is data, not code; new strategies are added by registering them, not by forking the search loop.
  • Frontier as the Hall of Fame — what the coordinating branch shows the user, exports, and uses to seed successors is a frontier, not a single curve. A model dossier surfaces the frontier explicitly so a deployer can choose the trade-off rather than inherit one.
  • Open questions — frontier compression under saturated runs; multi-objective ordering when objectives disagree by orders of magnitude; user-supplied preference functions over the frontier.
§ 3   Why it matters

Three consequences of this design

Local optima are escapableStrategy alternation forces the population off whatever attractor the previous strategy settled into. The population cannot quietly converge on a wrong answer because no single strategy ever runs long enough to exhaust diversity.
Failure is a signal, not noiseSearch-space error makes a failed candidate inform the rest of the search rather than be quietly retried. The same dead end is not re-explored across successors.
Diversity is dimensional, not scalarTwo distinct dimensions of diversity are gated independently. A population cannot satisfy diversity by being structurally varied while behaviorally identical, or vice versa.
§ 4   Notes & references

Related material

A short-form note on the two diversity dimensions lives at Diversity is two things, not one. The full lifecycle of a single branch — including how a strategy commits to a search space and how succession plays out — is described under Technology § 1 with Figure 1.

Adjacent areas: Genetic operators § 2.5 describes the operators a strategy uses to generate offspring; Symbolic refinement § 2.2 describes the canonicalization that makes the structural-diversity digest meaningful; Distributed orchestration § 2.1 describes how strategies travel across delegated branches.

Read

Continue reading

Other research areas are listed under Research. Published work lives on Papers; short-form notes on Notes.

Back to Research areas