Data-Oriented Design

Web Engine Dev adopts Data-Oriented Design (DOD) as its core architectural principle. This page explains what DOD is, why it matters for a web game engine, and how it manifests throughout the codebase.

ADR Reference

This design is formally documented in ADR-007: Data-Oriented Design.

Why Data-Oriented Design?

Traditional web game engines use object-oriented patterns: class hierarchies for game objects, scattered heap allocations, and per-object update loops. In JavaScript, this approach creates several problems:

| Problem | OOP Impact | DOD Solution |
| --- | --- | --- |
| Cache misses | Objects scattered across heap | Components in contiguous typed arrays |
| Entity capacity | ~1,000-5,000 objects | 50,000+ entities |
| Serialization | JSON.stringify per object (~20 ms per 1,000 entities) | Typed array bulk copy (~0.5 ms per 1,000 entities) |
| Parallelism | Shared mutable objects | Systems with non-overlapping data access |
| GPU uploads | Per-object uniform writes | Batch typed array uploads |

DOD flips the design: instead of organizing code around objects and their behaviors, you organize it around the data and the transformations applied to it.

Structure of Arrays (SoA) vs Array of Structures (AoS)

The key data layout decision in DOD is how component data is stored.

Array of Structures (AoS) -- Traditional OOP

typescript
// Each entity is an object with all its properties
const entities = [
  { x: 1.0, y: 0.0, vx: 0.5, vy: 0.0 },
  { x: 2.0, y: 1.0, vx: -1.0, vy: 0.5 },
  { x: 3.0, y: 0.0, vx: 0.0, vy: 9.8 },
];

const dt = 1 / 60; // frame delta in seconds (illustrative value)

// Updating positions loads entire objects into cache
for (const e of entities) {
  e.x += e.vx * dt;
  e.y += e.vy * dt;
}

Problem: In a real game object, which carries far more fields than this four-field example, iterating positions also pulls velocities, health, names, and every other field into the CPU cache. With 500+ byte objects, most of each cache line fetched is wasted on data the current system does not need.

Structure of Arrays (SoA) -- Data-Oriented

typescript
// Each field is a separate contiguous array
const positionX = new Float32Array([1.0, 2.0, 3.0]);
const positionY = new Float32Array([0.0, 1.0, 0.0]);
const velocityX = new Float32Array([0.5, -1.0, 0.0]);
const velocityY = new Float32Array([0.0, 0.5, 9.8]);

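const count = positionX.length; // number of live entities
const dt = 1 / 60;              // frame delta in seconds (illustrative value)

// Updating positions accesses only the data it needs
for (let i = 0; i < count; i++) {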
  positionX[i] += velocityX[i] * dt;
  positionY[i] += velocityY[i] * dt;
}

Benefit: The CPU prefetcher can predict sequential memory access. Only relevant data occupies cache lines. With typed arrays, the data is ~104 bytes per entity versus ~500+ bytes per object in OOP patterns.

ECS Archetype Storage

Web Engine Dev uses archetype-based storage (ADR-001) as the concrete implementation of SoA principles. Entities with the same set of components share an archetype, and each archetype stores data in columnar (SoA) layout.

How Archetype Storage Works

  1. Component signature -- Each unique combination of component types defines an archetype. Entities with [Position, Velocity] are in a different archetype than entities with [Position, Velocity, Health].

  2. Columnar tables -- Each archetype owns a table where each component field is stored as a separate Column (a typed array). Entities in the same archetype are stored contiguously.

  3. Archetype transitions -- When a component is added or removed, the entity moves to the appropriate archetype. An edge cache makes finding the target archetype for a repeated transition O(1); the entity's component data is still copied into the new table.

  4. Query matching -- Queries use bitmask matching to find all archetypes containing the required components. This is an O(1) operation per archetype.
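The signature, transition, and query steps above can be sketched in a few lines. This is a minimal illustration, not the engine's implementation: the names `ArchetypeGraph`, `withComponent`, and the `*_BIT` constants are hypothetical, and a single `number` mask limits the sketch to 32 component types.

```typescript
// Hypothetical component ids: each component type owns one bit.
const POSITION_BIT = 1 << 0;
const VELOCITY_BIT = 1 << 1;
const HEALTH_BIT = 1 << 2;

interface Archetype {
  mask: number;                  // component signature as a bitmask
  edges: Map<number, Archetype>; // cached "add component" transitions, keyed by component bit
}

class ArchetypeGraph {
  private byMask = new Map<number, Archetype>();

  // Returns the archetype for a signature, creating it on first use.
  get(mask: number): Archetype {
    let arch = this.byMask.get(mask);
    if (!arch) {
      arch = { mask, edges: new Map() };
      this.byMask.set(mask, arch);
    }
    return arch;
  }

  // Target archetype after adding `component`. The first call resolves it
  // through the mask table; later calls hit the cached edge directly.
  withComponent(from: Archetype, component: number): Archetype {
    let target = from.edges.get(component);
    if (!target) {
      target = this.get(from.mask | component);
      from.edges.set(component, target);
    }
    return target;
  }

  // One AND plus one compare per archetype: O(1) each, linear in archetype count.
  query(required: number): Archetype[] {
    const out: Archetype[] = [];
    for (const arch of this.byMask.values()) {
      if ((arch.mask & required) === required) out.push(arch);
    }
    return out;
  }
}

const graph = new ArchetypeGraph();
const moving = graph.get(POSITION_BIT | VELOCITY_BIT);
const mortal = graph.withComponent(moving, HEALTH_BIT);
const movingMasks = graph.query(POSITION_BIT | VELOCITY_BIT).map((a) => a.mask);
```

Both archetypes match the `[Position, Velocity]` query because each contains at least those two bits; only `mortal` would match a query that also requires Health.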

Benefits for Game Engines

| Capability | How SoA/Archetypes Enable It |
| --- | --- |
| Cache-friendly iteration | Components of matching entities are contiguous in memory |
| Fast queries | Bitmask matching tests each archetype in O(1), then iterates dense arrays |
| Parallel execution | Systems with non-overlapping component access can run concurrently |
| GPU batch uploads | Typed arrays can be uploaded to GPU buffers directly |
| Network delta encoding | Diff component arrays for minimal network traffic |
| Deterministic replay | Replay command stream against typed array state |

Systems as Data Transformations

In DOD, systems are plain functions that iterate over component data and read and write it through the World. There are no class hierarchies, no virtual dispatch, and no hidden this context:

typescript
const MovementSystem = defineSystem({
  query: [Position, Velocity],
  run(world, entities) {
    const dt = world.getResource(Time).delta;
    for (const entity of entities) {
      const pos = world.get(entity, Position);
      const vel = world.get(entity, Velocity);
      world.insert(entity, Position, {
        x: pos.x + vel.x * dt,
        y: pos.y + vel.y * dt,
      });
    }
  },
});

This design makes systems:

  • Testable -- Inputs and outputs are explicit component data, with no hidden state
  • Composable -- Systems combine freely without class coupling
  • Parallelizable -- Systems with non-overlapping component access can run on separate threads

Single Source of Truth

All game state lives in the ECS World. There are no parallel state stores, no shadow copies, and no synchronization bugs.

This means:

  • Network serialization reads component arrays directly for delta encoding
  • Save/load serializes the World state in a single pass
  • Deterministic replay replays a command stream against the World
  • Debugging inspects one data store, not scattered object graphs
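As an illustration of the first point, delta encoding over a component column can be as simple as comparing two snapshots of the same typed array. This is a sketch under assumptions: `Delta`, `encodeDelta`, and `applyDelta` are hypothetical names, and the engine's actual wire format is not described on this page.

```typescript
interface Delta {
  indices: Uint32Array; // positions in the column whose value changed
  values: Float32Array; // new values at those positions
}

// Compare two snapshots of one column and keep only the changed slots.
// Note: NaN never equals itself, so NaN slots are always reported as changed.
function encodeDelta(prev: Float32Array, next: Float32Array): Delta {
  if (prev.length !== next.length) throw new Error("column length mismatch");
  const indices: number[] = [];
  const values: number[] = [];
  for (let i = 0; i < next.length; i++) {
    if (prev[i] !== next[i]) {
      indices.push(i);
      values.push(next[i]);
    }
  }
  return { indices: Uint32Array.from(indices), values: Float32Array.from(values) };
}

// Patch a receiver-side copy of the column in place.
function applyDelta(target: Float32Array, delta: Delta): void {
  for (let k = 0; k < delta.indices.length; k++) {
    target[delta.indices[k]] = delta.values[k];
  }
}

const prevX = new Float32Array([1, 2, 3, 4]);
const nextX = new Float32Array([1, 2.5, 3, 4]);
const deltaX = encodeDelta(prevX, nextX);

const replicaX = prevX.slice(); // the receiver's copy of the column
applyDelta(replicaX, deltaX);
```

Only one index/value pair crosses the wire here, regardless of how many entities the column holds.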

Performance Implications

Memory Efficiency

| Metric | OOP (AoS) | DOD (SoA) |
| --- | --- | --- |
| Bytes per entity | ~500+ | ~104 |
| Cache utilization | ~20% (loads unused fields) | ~90%+ (only relevant data) |
| Max entity count (browser) | ~5,000 | 50,000+ |

Serialization Speed

| Operation | OOP (JSON.stringify) | DOD (typed array copy) |
| --- | --- | --- |
| 1,000 entities | 5-20 ms | 0.1-0.5 ms |
| 10,000 entities | 50-200 ms | 1-5 ms |
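The typed array column in the table refers to bulk copies rather than per-object traversal. A minimal sketch of that idea, assuming equally sized `Float32Array` columns, might look like the following; `snapshotColumns` and `restoreColumns` are hypothetical helpers, not engine API.

```typescript
// Pack the first `count` elements of each column back to back into one array.
// Each `set` call is a bulk memory copy, with no per-entity object allocation.
function snapshotColumns(columns: Float32Array[], count: number): Float32Array {
  const packed = new Float32Array(columns.length * count);
  columns.forEach((col, c) => packed.set(col.subarray(0, count), c * count));
  return packed;
}

// Split a packed snapshot back into independent columns.
function restoreColumns(packed: Float32Array, columnCount: number, count: number): Float32Array[] {
  const columns: Float32Array[] = [];
  for (let c = 0; c < columnCount; c++) {
    columns.push(packed.slice(c * count, (c + 1) * count)); // slice copies the range
  }
  return columns;
}

const posX = new Float32Array([1, 2, 3]);
const posY = new Float32Array([0, 1, 0]);

const snapshot = snapshotColumns([posX, posY], 3);
posX[0] = 99; // later mutation does not affect the snapshot
const [restoredX, restoredY] = restoreColumns(snapshot, 2, 3);
```

The packed array's underlying buffer can be written to storage or sent over the network as-is, which is where the gap to `JSON.stringify` comes from.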

GPU Integration

SoA layout enables batch GPU uploads. Instead of per-object uniform buffer writes, the renderer reads component arrays and uploads them as contiguous buffer data:

typescript
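// Transform data is already in the right format for GPU upload.
// With an ArrayBuffer as the data argument, offset and size are in bytes.
device.queue.writeBuffer(
  instanceBuffer,
  0,                            // byte offset into the GPU buffer
  worldMatrixArray.buffer,      // backing buffer of the Float32Array from the ECS archetype table
  worldMatrixArray.byteOffset,  // start of the view, in case it does not begin at byte 0
  entityCount * 64              // 64 bytes per mat4x4f (16 floats x 4 bytes)
);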

Tradeoffs

DOD is not without costs:

| Tradeoff | Impact | Mitigation |
| --- | --- | --- |
| Steeper learning curve | Developers familiar with OOP need to think differently | Comprehensive documentation, progressive disclosure API |
| Component add/remove overhead | Entity moves to a new archetype table | Edge caching makes the target lookup for repeated transitions O(1); batch transitions for 3+ components |
| Less intuitive debugging | Data in arrays, not named object properties | Debug bridge provides entity inspection via MCP tools |
| Third-party integration | Libraries expect objects, not typed arrays | Facade/adapter layers (ADR-004) |

DOD in Practice

The DOD philosophy extends beyond the ECS to influence the entire engine:

  • Math library -- Provides BatchMath for SIMD-friendly bulk Float32Array operations alongside the object-oriented Vec3/Mat4 API
  • Renderer -- Uses SoA layouts for GPU-driven indirect rendering, instanced draw batching, and frustum culling over contiguous position/bounds arrays
  • Physics -- Rapier (Rust/WASM) internally uses SoA storage; the adapter reads ECS component arrays directly
  • Netcode -- Delta encoding operates on typed array diffs rather than object comparison
  • Scheduler -- Topological system ordering with conflict detection enables parallel execution of systems with non-overlapping data access
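To make the scheduler point concrete, here is a rough sketch of conflict detection between systems based on declared read and write sets. Everything in it is an assumption for illustration: `SystemAccess`, `conflicts`, and `buildStages` are hypothetical, and the sketch uses declaration order in place of the explicit dependency graph a topological sort would consume.

```typescript
interface SystemAccess {
  name: string;
  reads: string[];  // component names the system only reads
  writes: string[]; // component names the system mutates
}

// Two systems conflict when either one writes a component the other touches.
function conflicts(a: SystemAccess, b: SystemAccess): boolean {
  const touches = (s: SystemAccess, c: string) => s.reads.includes(c) || s.writes.includes(c);
  return a.writes.some((c) => touches(b, c)) || b.writes.some((c) => touches(a, c));
}

// Greedy staging: each system goes into the stage right after the last stage
// that holds a conflicting system, so conflicting systems keep their declared
// order and systems within one stage never share written data.
function buildStages(systems: SystemAccess[]): SystemAccess[][] {
  const stages: SystemAccess[][] = [];
  for (const sys of systems) {
    let target = 0;
    stages.forEach((stage, i) => {
      if (stage.some((other) => conflicts(sys, other))) target = i + 1;
    });
    if (target === stages.length) stages.push([]);
    stages[target].push(sys);
  }
  return stages;
}

const stagePlan = buildStages([
  { name: "movement", reads: ["Velocity"], writes: ["Position"] },
  { name: "damage", reads: ["Damage"], writes: ["Health"] },
  { name: "render", reads: ["Position"], writes: [] },
]).map((stage) => stage.map((s) => s.name));
```

Here `movement` and `damage` land in the same stage because they touch disjoint components, while `render` waits for `movement` because it reads Position.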

Proprietary software. All rights reserved.