
Rules and Transforms

This page documents the configuration surfaces a schema exposes and how the current pipeline uses them. The same configuration is not necessarily enforced identically at runtime in every execution mode.


Validation Rules

Schemas define baseline field rules, with an optional validationConfig for additional constraints.

Rule Types

Field-level rules (from RowOpsSchemaField):

  • type: string | number | date | boolean
  • required
  • regex
  • enumValues

Validation config (validationConfig) can include:

  • fields: per-field rules (type, required, regex, enumValues, min, max, minLength, maxLength)
  • crossFieldRules: conditional rules such as "when X equals Y, then Z is required"
  • regexRulesEnabled: toggles regex evaluation

These structures are passed to the WASM validator. Rule support is determined by the engine version.
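As an illustration, a validationConfig using the rule types listed above might look like the sketch below. The type declarations are hypothetical and written only to make the example self-contained; in particular, the shape of a crossFieldRules entry (when/then) is an assumption, not the documented schema.

```typescript
// Hypothetical types: property names follow the lists above, but the exact
// shapes (especially CrossFieldRule) are assumptions for illustration.
type FieldType = "string" | "number" | "date" | "boolean";

interface FieldRule {
  type?: FieldType;
  required?: boolean;
  regex?: string;
  enumValues?: string[];
  min?: number;
  max?: number;
  minLength?: number;
  maxLength?: number;
}

// Assumed encoding of "when X equals Y, then Z is required".
interface CrossFieldRule {
  when: { field: string; equals: string };
  then: { field: string; required: boolean };
}

interface ValidationConfig {
  fields?: Record<string, FieldRule>;
  crossFieldRules?: CrossFieldRule[];
  regexRulesEnabled?: boolean;
}

const validationConfig: ValidationConfig = {
  fields: {
    email: { type: "string", required: true, regex: "^[^@\\s]+@[^@\\s]+$", maxLength: 254 },
    age: { type: "number", min: 0, max: 150 },
    country: { type: "string", enumValues: ["US", "CA", "MX"] },
  },
  crossFieldRules: [
    // When country equals "US", state becomes required.
    { when: { field: "country", equals: "US" }, then: { field: "state", required: true } },
  ],
  regexRulesEnabled: true,
};
```

Whether each of these rules is actually evaluated depends on the engine version, so confirm the ones you rely on against your deployed build.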

Rule Evaluation

  • Validation runs in WASM against Arrow IPC chunks (no JS row loops).
  • Invalid rows are recorded with row indices and error codes.
  • Regex rules can be disabled by tier; when disabled, regex rules are stripped or rejected.
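The "stripped or rejected" behavior for regex rules can be pictured with the sketch below. It is not the engine's implementation: gateRegexRules, its parameters, and the error message are hypothetical names chosen for illustration, and which of the two behaviors applies in a given tier is not specified here.

```typescript
// Hypothetical minimal rule shape for this sketch.
interface GatedRule {
  regex?: string;
  required?: boolean;
}

type GatedFields = Record<string, GatedRule>;

// Returns the field rules to pass on when regex evaluation may be disabled.
// "strip" silently drops regex rules; "reject" throws on the first one found.
function gateRegexRules(
  fields: GatedFields,
  regexAllowed: boolean,
  onDisabled: "strip" | "reject",
): GatedFields {
  if (regexAllowed) return fields;

  const out: GatedFields = {};
  for (const name of Object.keys(fields)) {
    const rule = fields[name];
    if (rule.regex !== undefined && onDisabled === "reject") {
      throw new Error(`regex rule on "${name}" is not available in this tier`);
    }
    // Copy the rule without its regex so the input is left unmodified.
    const copy: GatedRule = { ...rule };
    delete copy.regex;
    out[name] = copy;
  }
  return out;
}
```

The practical consequence is that a regex rule present in the schema may not be enforced at runtime, so do not treat it as your only check on a critical field.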

Transform Configuration

Transforms are defined in a Transform DSL and applied after masking.

Transform Types

Transform DSL operations include:

  • cast
  • rename
  • derive
  • filter
  • lookup
  • conditional

Transforms can be provided via:

  • schema.transformConfig
  • the transformPipeline prop (overrides schema)
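The precedence between the two sources can be sketched as follows. The TransformOp shape and the resolveTransforms helper are hypothetical; the sketch also assumes the prop replaces the schema's transforms wholesale rather than merging with them, which the text above suggests but does not spell out.

```typescript
// Hypothetical operation shape: only the kind names come from the list above.
type TransformKind = "cast" | "rename" | "derive" | "filter" | "lookup" | "conditional";

interface TransformOp {
  kind: TransformKind;
  [option: string]: unknown;
}

interface SchemaWithTransforms {
  transformConfig?: TransformOp[];
}

// Picks the transform list to run: the prop wins whenever it is provided,
// otherwise the schema's transformConfig is used, otherwise nothing runs.
function resolveTransforms(
  schema: SchemaWithTransforms,
  transformPipeline?: TransformOp[],
): TransformOp[] {
  return transformPipeline ?? schema.transformConfig ?? [];
}
```

Under this assumption, passing an empty array as the prop would also override the schema and disable its transforms.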

Transform Execution

  • Transforms execute in a worker on Arrow IPC bytes.
  • Tier gating restricts operation kinds, expression complexity, and operation count.
  • Transform errors fail the pipeline.

Row-Filtering Restriction

The import pipeline currently rejects any transform that changes the row count. Filter and delete operations therefore fail in the importer pipeline, even though the DSL can represent them.
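Because such operations fail only once the pipeline runs, it can help to check a transform list before submitting it. The helper below is a minimal sketch of such a pre-flight check, assuming each operation carries a kind string and that the row-count-changing kinds are named "filter" and "delete"; both are assumptions about the DSL's encoding.

```typescript
// Hypothetical operation shape for this sketch.
interface DslOp {
  kind: string;
}

// Assumed names of the operations that change row count.
const ROW_COUNT_CHANGING_KINDS = new Set<string>(["filter", "delete"]);

// Returns the positions of operations the importer would be expected to reject.
function findRowCountChangingOps(ops: DslOp[]): number[] {
  const offending: number[] = [];
  ops.forEach((op, index) => {
    if (ROW_COUNT_CHANGING_KINDS.has(op.kind)) {
      offending.push(index);
    }
  });
  return offending;
}
```

An empty result means none of the listed kinds is present; it does not prove the pipeline will accept the transforms, since other operations could in principle alter the row count as well.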


Masking Intent

Masking is driven by maskConfig in the schema or importer props.

Masking Strategies

The current mask executor registers these built-in strategies:

  • null, redact, hash, partial, truncate, email, phone, date, number, passthrough

If a strategy is not registered, the mask stage fails closed.
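Failing closed here means an unknown strategy stops the stage instead of letting values through unmasked. A minimal sketch of that lookup pattern follows; the registry, the MaskFn signature, and the three example implementations are hypothetical and do not reflect the executor's actual code or output formats.

```typescript
// Hypothetical strategy signature: takes a cell value, returns the masked value.
type MaskFn = (value: string) => string | null;

// Illustrative subset of strategies; the outputs shown are assumptions.
const maskRegistry = new Map<string, MaskFn>([
  ["null", () => null],
  ["redact", () => "[REDACTED]"],
  ["passthrough", (value) => value],
]);

// Fail closed: an unregistered name throws rather than defaulting to passthrough.
function resolveMaskStrategy(name: string): MaskFn {
  const strategy = maskRegistry.get(name);
  if (strategy === undefined) {
    throw new Error(`mask strategy "${name}" is not registered`);
  }
  return strategy;
}
```

The key design choice is that there is no silent fallback: a typo in a strategy name surfaces as an error, not as unmasked data reaching export or sync.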

Masking Execution

  • Masking runs only on valid rows.
  • Masking must run before export/sync; the pipeline enforces masked datasets for those steps.
  • Masking errors halt the pipeline (fail closed).

Note: The schema's MaskConfig includes fields such as autoDetectPII and per-type strategies (email, phone, card, ip, ssn, passport). There is no evidence that the current mask executor applies those fields; only columnRules and defaultStrategy are actively consumed. Validate any advanced masking behavior in your integration.


Replay Semantics

The pipeline aims to be deterministic: the same input, schema, and config should produce the same result. Replay can still diverge if:

  • You use external validation callbacks or sync targets with non-deterministic behavior.
  • You change schema versions, mappings, or transform config between runs.

Mode-Specific Enforcement

Dashboard-Assisted Mode

  • Publishable keys with domain locking.
  • License verification provides tier limits and enables feature gating.
  • Pipeline uses browser workers and IndexedDB chunk storage.

Headless Mode

  • Secret keys, no domain lock.
  • License modes: strict, demo.
  • Same pipeline stages, but storage is local to the execution environment.

These modes share the engine but differ in auth, licensing, and storage behavior. Validate critical rules in the mode you deploy.


What This Configuration Does Not Guarantee

  • It is not a substitute for backend validation of business logic.
  • It does not guarantee identical behavior across versions or execution modes.
  • It does not imply that unused schema fields are enforced by the engine.