§ 1   Technology

The Mathematical Colosseum

Ufinq discovers the source code of nature: small, readable programs that describe how data behaves. Built on symbolic regression, Ufinq treats the structure of the model itself as the unknown, not just its parameters, which opens a vast combinatorial search space of possible programs.

It is a mathematical colosseum. Generations of candidate programs enter the arena: some grow in complexity to capture more of the data; others stay small and elegant, trading detail for brevity. They compete on a frontier of accuracy and simplicity — where the fittest survive, recombine, and mutate, and where the weak fall away.

Ufinq scales the colosseum. Many evolutionary branches train in parallel — each a small population evolving in isolation until stagnation signals it is ready to enter the arena. There they meet equally mature opponents; their genetic material splits into an intensification path that exploits the strongest patterns and a diversification path that explores new ones. One arena, distributed across massive computational resources, where every node contributes to the same evolutionary contest.
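The lifecycle described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Ufinq's implementation: the names `is_stagnated` and `asymmetric_split`, the patience window, and the rule that the tight branch receives the top quarter of the merged pool are all hypothetical choices made for the example.

```python
from typing import List, Tuple

# Hypothetical representation: (program text, quality), higher quality is better.
Individual = Tuple[str, float]


def is_stagnated(best_history: List[float], patience: int = 5, eps: float = 1e-9) -> bool:
    """A branch counts as stagnated when its best quality has not improved
    by more than eps over the last `patience` generations."""
    if len(best_history) <= patience:
        return False
    return best_history[-1] - best_history[-1 - patience] <= eps


def asymmetric_split(
    branch_a: List[Individual],
    branch_b: List[Individual],
    tight_fraction: float = 0.25,
) -> Tuple[List[Individual], List[Individual]]:
    """Merge two mature branches and split the pool unevenly.

    The small, elite part plays the role of the tournament branch
    (intensification); the larger remainder plays the role of the
    advanced branch (diversification).
    """
    pool = sorted(branch_a + branch_b, key=lambda ind: ind[1], reverse=True)
    cut = max(1, int(len(pool) * tight_fraction))
    return pool[:cut], pool[cut:]


branch_a = [("x", 0.9), ("x+1", 0.5), ("x*x", 0.7), ("sin(x)", 0.2)]
branch_b = [("2*x", 0.8), ("x-1", 0.4), ("exp(x)", 0.6), ("1", 0.1)]
tournament, advanced = asymmetric_split(branch_a, branch_b)
```

Here `tournament` holds the two strongest programs of the merged pool and `advanced` holds the remaining six. A real system would presumably use richer criteria than a single quality score for both the stagnation test and the split.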

What survives the colosseum is code — an explicit, line-by-line program written in the language of mathematics. Where deep learning compresses your data into millions of opaque weights, Ufinq returns a model deployable on any machine and readable by any human.

Figure 1 The AHISTER lifecycle. Two evolutionary branches (Branch A and Branch B) mature in isolation, meet in the arena, and have their genetic material split asymmetrically into a tight tournament branch (intensification) and a wide advanced branch (diversification).
What survives the colosseum is code — not weights.
§ 2   Differentiators

What makes Ufinq different

§ 2.1

Algorithm

  • AHISTER orchestration — independent branches train in isolation, then meet in the arena; intensification and diversification happen simultaneously.
  • DEBOHAC distributed coordination — sister algorithm; one logical colosseum across heterogeneous compute clusters, with a real-time Hall of Fame.
  • Dual diversity preservation — structural and behavioral diversity, both actively gated. Populations do not collapse into local optima.
§ 2.2

Universal applicability

  • Every value type is first-class — bool, scalar, vector, matrix, tensor, text, set, tuple, record — all participate in the symbolic search.
  • 200+ function operators — arithmetic, comparison, conditional, set, conversion, text, vision — composable across all value types.
  • Multi-domain reach — regression, classification, system identification, simulation, control, computer vision — same symbolic substrate.
§ 2.3

Distributed evolutionary arena at scale

  • One logical colosseum across compute clusters — multi-cloud, multi-data-center, edge. The arena scales horizontally.
  • Modern concurrency end-to-end — virtual threads throughout; no polling, no callback hell.
  • Peer-local content-addressable datastore — population state moves between nodes without redundant transfers.
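The idea behind a content-addressable store can be illustrated with a minimal sketch. Everything here is an assumption made for the example (the `ContentStore` class, the `transfer` helper, JSON serialization, SHA-256 keys); the source does not describe how Ufinq's datastore is built.

```python
import hashlib
import json


class ContentStore:
    """Toy content-addressable store: the key of a blob is the hash of its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, obj) -> str:
        # Canonical serialization so that equal objects get equal keys.
        data = json.dumps(obj, sort_keys=True).encode("utf-8")
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)
        return key

    def has(self, key: str) -> bool:
        return key in self._blobs

    def get(self, key: str):
        return json.loads(self._blobs[key])


def transfer(src: ContentStore, dst: ContentStore, keys) -> int:
    """Copy only the blobs that dst is missing; return how many were sent."""
    sent = 0
    for key in keys:
        if not dst.has(key):
            dst._blobs[key] = src._blobs[key]
            sent += 1
    return sent


node_a, node_b = ContentStore(), ContentStore()
k1 = node_a.put({"expr": "x+1"})
k2 = node_a.put({"expr": "x*x"})
node_b.put({"expr": "x+1"})              # node_b already holds this individual
sent = transfer(node_a, node_b, [k1, k2])  # only the missing blob is copied
```

Because identical content always maps to the same key, a receiving node can tell from the key alone whether it already holds an individual, which is what makes skipping redundant transfers possible.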
§ 2.4

A refinement pipeline far beyond evolution alone

  • Continuous simplification — hundreds of declarative rules in a custom rule DSL normalize every population, every generation.
  • Research-grade numeric optimization — CMA-ES, L-BFGS, Nelder-Mead, Coordinate Descent.
  • Native symbolic differentiation — gradients of evolved expressions, computed natively. Unusual for symbolic regression.
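To make rule-based simplification and symbolic differentiation concrete, here is a minimal sketch on nested-tuple expressions. The expression encoding, the tiny operator set, and the four rewrite rules are assumptions for illustration only; they are not Ufinq's rule DSL or operator library.

```python
import math

# Hypothetical encoding: numbers are constants, strings are variables,
# tuples are function nodes such as ("add", a, b) or ("sin", a).


def simplify(e):
    """Apply a few identity rules bottom-up: constant folding, x+0, x*1, x*0."""
    if not isinstance(e, tuple):
        return e
    op, *args = e
    args = [simplify(a) for a in args]
    all_numbers = all(isinstance(a, (int, float)) for a in args)
    if op == "add":
        a, b = args
        if all_numbers:
            return a + b
        if a == 0:
            return b
        if b == 0:
            return a
    elif op == "mul":
        a, b = args
        if all_numbers:
            return a * b
        if a == 0 or b == 0:
            return 0
        if a == 1:
            return b
        if b == 1:
            return a
    return (op, *args)


def raw_diff(e, var):
    """Textbook derivative rules, without any simplification."""
    if isinstance(e, (int, float)):
        return 0
    if isinstance(e, str):
        return 1 if e == var else 0
    op = e[0]
    if op == "add":
        return ("add", raw_diff(e[1], var), raw_diff(e[2], var))
    if op == "mul":  # product rule
        return ("add", ("mul", raw_diff(e[1], var), e[2]),
                       ("mul", e[1], raw_diff(e[2], var)))
    if op == "sin":  # chain rule
        return ("mul", ("cos", e[1]), raw_diff(e[1], var))
    raise ValueError(f"no derivative rule for {op!r}")


def diff(e, var="x"):
    return simplify(raw_diff(e, var))


def evaluate(e, x):
    """Evaluate an expression in the single variable x."""
    if isinstance(e, (int, float)):
        return e
    if isinstance(e, str):
        return x
    op, *args = e
    vals = [evaluate(a, x) for a in args]
    if op == "add":
        return vals[0] + vals[1]
    if op == "mul":
        return vals[0] * vals[1]
    if op == "sin":
        return math.sin(vals[0])
    if op == "cos":
        return math.cos(vals[0])
    raise ValueError(f"unknown operator {op!r}")
```

For example, `diff(("add", ("mul", 3, "x"), 2))` reduces to the constant `3`, and `diff(("sin", ("mul", 2, "x")))` yields `("mul", ("cos", ("mul", 2, "x")), 2)`. The resulting gradient is itself an expression tree, so it can be simplified and evaluated like any other candidate.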
§ 2.5

Output

  • Code, not weights — every model is an explicit, line-by-line program; readable by humans, executable by machines.
  • A frontier of solutions, not a single answer — multiple candidates trading accuracy for simplicity. Pick the tradeoff that fits.
  • Generic substrate — symbolic regression on tabular data, time series, images, regions, and text — one technology.
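A frontier of solutions is commonly understood as a Pareto front: the set of candidates that no other candidate beats on both error and complexity. The sketch below shows that standard construction; the `Candidate` type and the sample numbers are made up for illustration and do not come from Ufinq.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    expr: str
    error: float      # lower is better
    complexity: int   # lower is better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that is not dominated by another one.

    A candidate is dominated if some other candidate is at least as good
    on both objectives and strictly better on at least one.
    """
    front = []
    for c in cands:
        dominated = any(
            o.error <= c.error
            and o.complexity <= c.complexity
            and (o.error < c.error or o.complexity < c.complexity)
            for o in cands
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: c.complexity)


candidates = [
    Candidate("c", 0.9, 1),
    Candidate("a*x", 0.3, 3),
    Candidate("sin(x)", 0.5, 4),
    Candidate("a*x+b", 0.1, 5),
    Candidate("a*x*x+b", 0.1, 7),
]
front = pareto_front(candidates)
```

In this example the front consists of `c`, `a*x`, and `a*x+b`. The other two candidates are dropped because a simpler expression achieves the same or a lower error.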
Figure 2 The Ufinq colosseum, system view. Many evolutionary branches at different lifecycle stages share one cluster and one central arena. Some are still maturing in isolation around the periphery, others (labelled Stagnated) converge on the arena from the left, and two branches emerge on the right after the asymmetric split: a tight tournament branch (intensification) and a spread-out advanced branch (diversification).
§ 3   Computation graphs

Underlying computation graphs

Solutions are symbolic expressions built from mathematical building blocks. Each solution is a computation graph containing one or more computation trees (abstract syntax trees), which are built from function nodes, reference nodes, and constant nodes. The value spaces of all nodes, references, constants, and functions (domains and codomains) are precisely defined. A computation graph can process arbitrarily complex features, states, and labels and is not restricted to real-valued vectors. It supports variables that hold intermediate results within a single computation, as well as persistent states that are shared across multiple computations. Computation trees are evaluated sequentially, so each tree can build on the results of the trees before it.
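The structure described above can be sketched as a small data model. This is an illustrative reading of the paragraph, not Ufinq's actual classes: `Const`, `Ref`, `Func`, and `Graph` are hypothetical names, and values are limited to floats to keep the example short.

```python
import operator
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple, Union


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Ref:
    name: str  # refers to a feature, a variable, or a persistent state


@dataclass(frozen=True)
class Func:
    fn: Callable[..., float]
    args: Tuple["Node", ...]


Node = Union[Const, Ref, Func]


def eval_tree(node: Node, env: Dict[str, float]) -> float:
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Ref):
        return env[node.name]
    return node.fn(*(eval_tree(a, env) for a in node.args))


@dataclass
class Graph:
    # (target name, tree) pairs, evaluated in order.
    trees: List[Tuple[str, Node]]
    # Persistent state that survives from one run() call to the next.
    state: Dict[str, float] = field(default_factory=dict)

    def run(self, features: Dict[str, float]) -> float:
        env = {**self.state, **features}
        for target, tree in self.trees:
            env[target] = eval_tree(tree, env)   # later trees can reference it
            if target in self.state:
                self.state[target] = env[target]  # write back persistent state
        return env[self.trees[-1][0]]


# A running sum kept in persistent state, then doubled by a second tree.
graph = Graph(
    trees=[
        ("acc", Func(operator.add, (Ref("acc"), Ref("x")))),
        ("y", Func(operator.mul, (Const(2.0), Ref("acc")))),
    ],
    state={"acc": 0.0},
)
first = graph.run({"x": 1.0})
second = graph.run({"x": 3.0})
```

The first call returns `2.0` and the second returns `8.0`, because `acc` persists between calls while `y` is recomputed each time from the tree before it.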

Value types

Elementary: Integer · Real · Bool · Text · Vector · Matrix · Tensor · Nil
Composite: Tuple · Map
§ 4   Function space

Universal Function Space

The algorithm searches for solutions in an approximated Universal Function Space, a space meant to cover approximately all possible functions. Ufinq builds this effective function space from a large, fixed set of predefined elementary functions and a comprehensive value system. Every elementary function is defined over value types and value spaces, with all of its characteristics (domain, codomain, …) specified. Computation graphs then compose these elementary functions into more complex ones.

Elementary functions

Arity: Univariate · Bivariate · Trivariate · Multivariate
Domains: Scalar · Vector · Matrix · Tensor · Boolean · Set · Comparison
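One way to picture elementary functions with declared domains and codomains is a typed registry that rejects ill-typed compositions before they are ever evaluated. The registry contents, the type names used as strings, and the `result_type` helper below are assumptions for illustration, not Ufinq's operator catalogue.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Elementary:
    name: str
    domain: Tuple[str, ...]  # one value type per argument
    codomain: str            # value type of the result


# Hypothetical mini-registry covering three arities and two value types.
REGISTRY: Dict[str, Elementary] = {
    "neg": Elementary("neg", ("Real",), "Real"),
    "add": Elementary("add", ("Real", "Real"), "Real"),
    "less": Elementary("less", ("Real", "Real"), "Bool"),
    "if": Elementary("if", ("Bool", "Real", "Real"), "Real"),
}


def result_type(expr, var_types: Dict[str, str]) -> str:
    """Return the value type of expr, or raise TypeError if it is ill-typed.

    Expressions are nested tuples: ("add", "x", 1.0). Strings are variables.
    """
    if isinstance(expr, str):
        return var_types[expr]
    if isinstance(expr, bool):  # check bool before int/float
        return "Bool"
    if isinstance(expr, (int, float)):
        return "Real"
    name, *args = expr
    f = REGISTRY[name]
    arg_types = tuple(result_type(a, var_types) for a in args)
    if arg_types != f.domain:
        raise TypeError(f"{name} expects {f.domain}, got {arg_types}")
    return f.codomain


well_typed = ("if", ("less", "x", 0.0), ("neg", "x"), "x")  # |x| as a program
ill_typed = ("add", ("less", "x", 0.0), 1.0)                # Bool + Real
```

`result_type(well_typed, {"x": "Real"})` returns `"Real"`, while the second expression raises `TypeError`. A search that only generates well-typed compositions never spends evaluations on candidates like the second one.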
§ 5   Objectives

Multiple objectives

Models are determined by searching the universal function space for functions that approximate the relationship between features and labels while best meeting the method's objectives. In addition to the error function, minimizing complexity is also an objective: solutions should be as accurate and as simple as possible. Simplicity is derived from the graph's complexity (the sum of node complexities, where different nodes carry different costs); accuracy is computed from prediction errors. Both metrics are normalized and combined into a quality value.
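The two metrics can be sketched as follows. The source states that complexity is a sum of node costs and that both metrics are normalized and combined, but it does not give the cost table, the normalization, or the combination rule. The cost values, the `1 / (1 + v)` normalization, and the weighted sum below are therefore assumptions chosen only to make the idea runnable.

```python
from typing import Sequence

# Hypothetical per-node costs: different nodes carry different costs.
NODE_COST = {"const": 1.0, "ref": 1.0, "add": 1.0, "mul": 1.0, "sin": 3.0}


def complexity(expr) -> float:
    """Sum of node costs over a nested-tuple expression."""
    if isinstance(expr, (int, float)):
        return NODE_COST["const"]
    if isinstance(expr, str):
        return NODE_COST["ref"]
    op, *args = expr
    return NODE_COST[op] + sum(complexity(a) for a in args)


def mean_squared_error(pred: Sequence[float], labels: Sequence[float]) -> float:
    return sum((p - y) ** 2 for p, y in zip(pred, labels)) / len(labels)


def quality(error: float, cplx: float, weight: float = 0.8) -> float:
    """Combine normalized accuracy and simplicity into one value in (0, 1].

    Assumed normalization: v -> 1 / (1 + v), so 0 maps to 1 and large
    values approach 0. Assumed combination: a weighted sum.
    """
    accuracy = 1.0 / (1.0 + error)
    simplicity = 1.0 / (1.0 + cplx)
    return weight * accuracy + (1.0 - weight) * simplicity


linear = ("add", ("mul", 2.0, "x"), 1.0)   # 2*x + 1
cost = complexity(linear)                   # add + mul + const + ref + const
```

With these costs `cost` is `5.0`. Under the assumed formula, a candidate with lower error or lower complexity always receives a higher quality value, which is the property the paragraph above asks for.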

§ 6   Refinement

Sophisticated refinement

Each potential solution goes through a refinement pipeline of verification, simplification, optimization, normalization, and validation before it joins the gene pool. Constants are optimized via CMA-ES, L-BFGS, Nelder-Mead, and Coordinate Descent. Expressions are simplified and normalized via a declarative rule DSL and an e-graph rewriter. All graphs, trees, and nodes are verified for correctness, and their value to the population is validated. The result is an evolution of local optima: every individual is already refined to its best form before it competes.
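As an illustration of the constant-optimization stage, the sketch below tunes the constants of a fixed expression with a simple derivative-free coordinate descent (a pattern-search variant with step halving). It is a minimal stand-in written for this example. The function name, step schedule, and sample data are assumptions, and it says nothing about how Ufinq implements or combines its optimizers.

```python
from typing import Callable, List, Sequence, Tuple


def coordinate_descent(
    loss: Callable[[List[float]], float],
    params: Sequence[float],
    step: float = 1.0,
    tol: float = 1e-6,
    max_iter: int = 10_000,
) -> Tuple[List[float], float]:
    """Minimize loss by nudging one constant at a time.

    Each sweep tries +step and -step on every coordinate and keeps any
    move that lowers the loss. If a full sweep finds no improvement,
    the step is halved. Stops when the step drops below tol.
    """
    params = list(params)
    best = loss(params)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = params.copy()
                trial[i] += delta
                value = loss(trial)
                if value < best:
                    params, best, improved = trial, value, True
                    break
        if not improved:
            step *= 0.5
    return params, best


# Structure found by the search: a*x + b. Only the constants are tuned here.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x - 1.0 for x in xs]  # synthetic data generated with a=2, b=-1


def squared_error(p: List[float]) -> float:
    a, b = p
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys))


fitted, final_loss = coordinate_descent(squared_error, [0.0, 0.0])
```

On this synthetic data the fitted constants end up close to `a = 2` and `b = -1`. The separation shown here mirrors the paragraph above: evolution proposes the structure, and a numeric optimizer then refines its constants before the candidate competes.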

Figure 3 The refinement pipeline. Every candidate passes through five sequential stages before it enters the gene pool and is allowed to compete: § 6.1 Verify (structural invariants), § 6.2 Simplify (rule DSL + e-graph), § 6.3 Optimise (CMA-ES, L-BFGS, Nelder-Mead, CD), § 6.4 Normalise (canonical form), and § 6.5 Validate (value to the population).
§ 7   Search

Evolution of local optima

The evolutionary process combines many search strategies and genetic operators with informed generation and modification of computation graphs. Many specialized functions integrate techniques from across mathematics (automatic differentiation, gradient descent, normalization, optimization, simplification, reasoning, logic, algebra, analysis, statistics, distributions) to improve search efficiency. Many of these techniques limit the effective search space or guide the search process. Because the universal function space is infinite, focused non-evolutionary methods are essential. Techniques from other ML methods (deep learning, gradient boosting, SVMs) are integrated as well, so Ufinq indirectly generalizes machine learning. The search is driven by a multi-objective function that considers prediction errors, solution complexity, adaptive search strategies, and neutral paths that help it escape local optima.

§ 8   Applicability

Universal applicability

Ufinq was developed for universal applicability: any data, in any structure, at any complexity. Structured concurrency makes full use of high-performance systems, and distributed computing spans multiple data centers and clouds. Together with vertical and horizontal scaling and platform independence, this lets Ufinq work with very large datasets and explore large and complex models.

§ 9   Use

Use Ufinq commercially

Ufinq is a closed technology distributed via Diafunc, our Symbolic AI platform. Diafunc bundles Ufinq with the computing resources, data infrastructure, and tooling you need to apply it directly in your projects.

Use Ufinq at Diafunc