Mask Module

The Mask module applies masking strategies to sensitive fields before downstream processing or export.

Masking is implemented in the core execution engine. Configuration support exists, but validate runtime enforcement in each execution mode you use.

Purpose

Apply masking strategies to sensitive fields, replacing or obscuring values before they are exported or delivered to external systems. Masking is destructive by design unless you opt into tokenization with a local vault.

When It Runs

Pipeline Position: After Validate (optional)

Parse -> Validate -> **Mask** -> Transform -> Profile

Masking occurs before transformation and export, ensuring sensitive data is protected before any downstream operations.

Inputs

Input	Type	Description
Data	Arrow IPC	Validated Arrow RecordBatch stream
Config	`MaskConfig`	Masking rules per field

Outputs

Output	Type	Description
Masked data	Arrow IPC	Masked Arrow RecordBatch stream

Configuration

MaskConfig Structure

type MaskConfig = {
  autoDetectPII?: boolean;
  defaultStrategy?: MaskStrategy;
  projectSecret?: string;
  emailStrategy?: MaskStrategy;
  phoneStrategy?: MaskStrategy;
  cardStrategy?: MaskStrategy;
  ipStrategy?: MaskStrategy;
  ssnStrategy?: MaskStrategy;
  passportStrategy?: MaskStrategy;
  columnRules?: Record<string, MaskStrategy | { strategy: MaskStrategy; options?: MaskRuleOptions }>;
};

type MaskRuleOptions = {
  first?: number;
  last?: number;
  fixed?: string;
  projectSecret?: string;
  pattern?: string;
  replacement?: string;
};
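As an illustration, a config that mixes the shorthand and object forms of `columnRules` might look like the sketch below. The types are redeclared locally (trimmed) so the snippet stands alone. The `MaskStrategy` union is assumed from the strategy names listed in this document, the column names are made up, and `resolveRule` is a hypothetical helper. Falling back to `defaultStrategy` for unlisted columns is an assumption, not documented behavior.

```typescript
// Assumption: MaskStrategy is the union of strategy names listed in this document.
type MaskStrategy =
  | "none" | "redact" | "hash" | "deterministic" | "partial" | "fixed"
  | "null" | "shuffle" | "email" | "phone" | "ssn" | "creditcard"
  | "regex" | "tokenize";

type MaskRuleOptions = {
  first?: number;
  last?: number;
  fixed?: string;
  projectSecret?: string;
  pattern?: string;
  replacement?: string;
};

type ColumnRule = MaskStrategy | { strategy: MaskStrategy; options?: MaskRuleOptions };

// Trimmed local copy of MaskConfig (only the fields used in this sketch).
type MaskConfig = {
  autoDetectPII?: boolean;
  defaultStrategy?: MaskStrategy;
  emailStrategy?: MaskStrategy;
  columnRules?: Record<string, ColumnRule>;
};

// Hypothetical column names, for illustration only.
const config: MaskConfig = {
  autoDetectPII: true,
  defaultStrategy: "redact",
  emailStrategy: "email",
  columnRules: {
    notes: "null", // shorthand form: strategy name only
    account_id: { strategy: "partial", options: { first: 2, last: 2 } },
    status_label: { strategy: "fixed", options: { fixed: "[HIDDEN]" } },
  },
};

// Hypothetical helper: normalize a column's rule to the object form.
// Assumed fallback order: column rule, then defaultStrategy, then "none".
function resolveRule(
  cfg: MaskConfig,
  column: string,
): { strategy: MaskStrategy; options?: MaskRuleOptions } {
  const rule = cfg.columnRules?.[column];
  if (rule === undefined) return { strategy: cfg.defaultStrategy ?? "none" };
  return typeof rule === "string" ? { strategy: rule } : rule;
}
```

Normalizing both forms up front keeps later code from branching on whether a rule is a string or an object.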

MaskStrategy Options

Basic Strategies

Strategy	Description	Example
`none`	No masking (pass-through)	`john@example.com` -> `john@example.com`
`redact`	Replace with static placeholder	`john@example.com` -> `****`
`hash`	One-way SHA-256 hash (optional secret)	`john@example.com` -> `a8d91c...`
`deterministic`	Hash using `projectSecret` (Pro+ tier)	`john@example.com` -> `a8d91c...`
`partial`	Mask portion of value (configurable)	`secret123` -> `se***23`
`fixed`	Replace with fixed string (Pro+ tier)	`secret` -> `[HIDDEN]`
`null`	Replace with null value	`john@example.com` -> `null`
`shuffle`	Deterministically shuffle characters (Scale+ tier)	`secret` -> `teserc`
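To make the table concrete, here is a minimal sketch of three of these strategies in plain TypeScript on Node.js. It is not the engine's implementation, and the function names are hypothetical. The fixed `***` filler in `partial` is inferred from the example above. Using HMAC when a secret is supplied is an assumption about how the optional secret is applied.

```typescript
import { createHash, createHmac } from "node:crypto";

// redact: every value becomes the same static placeholder.
function redact(_value: string): string {
  return "****";
}

// hash: one-way SHA-256 digest as hex.
// Assumption: when a secret is supplied, it keys an HMAC-SHA256.
function hashValue(value: string, secret?: string): string {
  return secret
    ? createHmac("sha256", secret).update(value).digest("hex")
    : createHash("sha256").update(value).digest("hex");
}

// partial: keep `first` leading and `last` trailing characters.
// Assumption: the hidden middle collapses to a fixed "***",
// matching the `secret123` -> `se***23` example.
function partial(value: string, first = 2, last = 2): string {
  // Values too short to keep both ends are hidden entirely.
  if (value.length <= first + last) return "***";
  return value.slice(0, first) + "***" + value.slice(value.length - last);
}

console.log(partial("secret123")); // se***23
```

A fixed-width filler also hides the original value's length, which a length-preserving mask would reveal.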

Semantic Strategies (Pro+ Tier)

These strategies automatically format output based on the data type:

Strategy	Description	Example
`email`	Mask local part, keep domain	`john@example.com` -> `j***@example.com`
`phone`	Keep last 4 digits	`555-123-4567` -> `***-***-4567`
`ssn`	SSN format with last 4 visible	`123-45-6789` -> `***-**-6789`
`creditcard`	Card format with last 4 visible	`4111111111111111` -> `****-****-****-1111`
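The format-preserving idea behind these strategies can be sketched as follows. Both functions are hypothetical illustrations, not the engine's code. In particular, the fallback for values that are not email-shaped is an assumption.

```typescript
// email: keep the first character of the local part and the full domain.
function maskEmail(value: string): string {
  const at = value.lastIndexOf("@");
  // Assumption: values without a local part are hidden entirely.
  if (at <= 0) return "****";
  return value.slice(0, 1) + "***" + value.slice(at);
}

// phone / ssn: replace every digit except the last four with "*",
// leaving separators such as "-" in place.
function maskKeepLast4(value: string): string {
  const totalDigits = value.replace(/\D/g, "").length;
  let seen = 0;
  return value.replace(/\d/g, (digit) => {
    seen += 1;
    return seen > totalDigits - 4 ? digit : "*";
  });
}

console.log(maskEmail("john@example.com")); // j***@example.com
console.log(maskKeepLast4("555-123-4567")); // ***-***-4567
console.log(maskKeepLast4("123-45-6789"));  // ***-**-6789
```

The `creditcard` strategy is left out because its example output also regroups the digits into blocks, and that formatting rule is not specified here.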

Regex Strategy (Scale+ Tier)

Strategy	Description
`regex`	Replace matches using pattern and replacement template

Regex strategy requires pattern and replacement options:

{
  strategy: "regex",
  options: {
    pattern: "(\\d{4})-(\\d{4})-(\\d{4})-(\\d{4})",
    replacement: "****-****-****-$4"
  }
}

Replacement templates support capture groups ($1, $2, etc.).
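For intuition, the same rule can be tried with JavaScript's built-in `String.prototype.replace`. This sketch assumes the engine's replacement template treats `$n` the way JavaScript does, which this page does not confirm. `applyRegexRule` is a hypothetical helper.

```typescript
// Hypothetical helper mirroring the `pattern` and `replacement` options.
function applyRegexRule(value: string, pattern: string, replacement: string): string {
  // The "g" flag replaces every match, not only the first.
  return value.replace(new RegExp(pattern, "g"), replacement);
}

const maskedCard = applyRegexRule(
  "card 4111-2222-3333-4444 on file",
  "(\\d{4})-(\\d{4})-(\\d{4})-(\\d{4})", // four capture groups, so $4 is the last block
  "****-****-****-$4",
);

console.log(maskedCard); // card ****-****-****-4444 on file
```

In JavaScript, `$4` only expands when the pattern defines a fourth capture group. Without parentheses in the pattern, the literal text `$4` ends up in the output.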

Specific strategy availability depends on tier and field type.

Tokenize Strategy (Scale+ Tier)

Strategy	Description
`tokenize`	Replace values with vault-backed tokens

Tokenize requires a projectSecret (or a vault secret provided by the host):

{
  strategy: "tokenize",
  options: {
    projectSecret: "import-session-secret"
  }
}

Tokens are deterministic per secret and recorded in the local token vault.
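The sketch below illustrates what "deterministic per secret" and "local vault" can mean in practice. It is an assumption-heavy model, not the engine's tokenizer. The HMAC-SHA256 derivation, the `tok_` prefix, the 24-character truncation, and the in-memory `Map` are all hypothetical choices.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical in-memory stand-in for the local token vault.
class LocalTokenVault {
  private readonly secret: string;
  private readonly byToken = new Map<string, string>();

  constructor(secret: string) {
    this.secret = secret;
  }

  // The same value and secret always yield the same token.
  tokenize(value: string): string {
    const digest = createHmac("sha256", this.secret).update(value).digest("hex");
    const token = "tok_" + digest.slice(0, 24);
    this.byToken.set(token, value); // record the mapping for later lookup
    return token;
  }

  // Reversal is only possible where the vault lives.
  detokenize(token: string): string | undefined {
    return this.byToken.get(token);
  }
}

const vault = new LocalTokenVault("import-session-secret");
const token = vault.tokenize("john@example.com");
console.log(vault.detokenize(token)); // john@example.com
```

Because the mapping is stored only in the vault, a system that receives tokens without the vault cannot recover the original values.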

Tier Behavior

Mask features are available across multiple tiers:

Feature	Free	Pro	Scale	Enterprise
Basic strategies (none/redact/hash/partial/null)	Yes	Yes	Yes	Yes
Deterministic + fixed	No	Yes	Yes	Yes
Semantic masking (email/phone/ssn/creditcard)	No	Yes	Yes	Yes
Shuffle + regex	No	No	Yes	Yes
Tokenization vault	No	No	Yes	Yes
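If you want to check a config against a tier before running a pipeline, the table above can be encoded as a lookup. This is a sketch under the assumption that tiers are strictly ordered (Free < Pro < Scale < Enterprise). `isStrategyAllowed` and the lowercase tier names are hypothetical.

```typescript
type Tier = "free" | "pro" | "scale" | "enterprise";

const TIER_RANK: Record<Tier, number> = { free: 0, pro: 1, scale: 2, enterprise: 3 };

// Minimum tier per strategy, transcribed from the table above.
const MIN_TIER: Record<string, Tier> = {
  none: "free", redact: "free", hash: "free", partial: "free", null: "free",
  deterministic: "pro", fixed: "pro",
  email: "pro", phone: "pro", ssn: "pro", creditcard: "pro",
  shuffle: "scale", regex: "scale", tokenize: "scale",
};

// Hypothetical pre-flight check; unknown strategy names are rejected.
function isStrategyAllowed(strategy: string, tier: Tier): boolean {
  const min = MIN_TIER[strategy];
  if (min === undefined) return false;
  return TIER_RANK[tier] >= TIER_RANK[min];
}

console.log(isStrategyAllowed("regex", "pro"));   // false
console.log(isStrategyAllowed("regex", "scale")); // true
```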

What This Module Does Not Do

Does not run ML-based PII detection: full detection lives in the pii-detect module; Mask uses config + heuristics
Does not guarantee identical enforcement across all execution modes: validate both worker and pipeline usage in your app
Does not provide server-side reidentification: tokenization is reversible only with the local vault
Does not mask based on content analysis alone: masking applies to configured fields and semantic hints

Constraints

Module Status

The public API exports type definitions. Actual masking implementation resides in the core execution engine.

Execution Mode Differences

Masking behavior may differ between execution modes:

Aspect	Dashboard-Assisted	Headless
Implementation	Engine layer	Pipeline layer
Configuration	Schema-embedded	Pipeline config
Enforcement	Should be validated	Should be validated

Runtime enforcement should be validated per execution mode. Do not assume identical behavior without testing.
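One way to test this is a parity check that runs the same sample values through both modes and reports where the outputs differ. The sketch below is a hypothetical harness. `MaskRunner` stands in for however your app invokes masking in each mode; it is not an API of this module.

```typescript
// Hypothetical adapter type: wraps one execution mode's masking call.
type MaskRunner = (values: string[]) => string[];

// Returns the indexes at which the two modes disagree.
function findMaskingMismatches(
  values: string[],
  dashboardAssisted: MaskRunner,
  headless: MaskRunner,
): number[] {
  const a = dashboardAssisted(values);
  const b = headless(values);
  const mismatches: number[] = [];
  for (let i = 0; i < values.length; i++) {
    if (a[i] !== b[i]) mismatches.push(i);
  }
  return mismatches;
}

// Stand-in runners for demonstration only.
const runnerA: MaskRunner = (vs) => vs.map(() => "****");
const runnerB: MaskRunner = (vs) => vs.map((v) => (v.includes("@") ? "****" : v));

console.log(findMaskingMismatches(["john@example.com", "555-123-4567"], runnerA, runnerB)); // [ 1 ]
```

An empty result for a representative sample is evidence of parity for that sample only.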

Fail-Closed Behavior

In tested paths, masking failures throw MaskingFailedError and halt the pipeline. No fallback to unmasked data is provided.

Failure Modes

Failure	Behavior
Configuration error	`MaskingFailedError` thrown; pipeline halts
Unsupported strategy	`MaskingFailedError` thrown; pipeline halts
Invalid field reference	Error thrown; pipeline halts

Masking failures are fail-closed: no partial or unmasked results are produced.
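The fail-closed contract can be sketched as follows: build the masked output separately and return it only if every field succeeded. The `MaskingFailedError` class here is a local stand-in with an assumed shape, and `maskRowsFailClosed` is a hypothetical function operating on plain objects rather than Arrow batches.

```typescript
// Local stand-in; the real error's shape is not documented on this page.
class MaskingFailedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "MaskingFailedError";
  }
}

type Row = Record<string, string | null>;
type MaskFn = (value: string) => string;

// Either returns fully masked rows or throws; never returns partial output.
function maskRowsFailClosed(rows: Row[], rules: Record<string, MaskFn>): Row[] {
  const out: Row[] = [];
  for (const row of rows) {
    const masked: Row = { ...row };
    for (const [column, mask] of Object.entries(rules)) {
      const value = row[column];
      if (value === undefined) {
        throw new MaskingFailedError(`Invalid field reference: ${column}`);
      }
      try {
        masked[column] = value === null ? null : mask(value);
      } catch {
        // No fallback to the unmasked value: abort the whole run.
        throw new MaskingFailedError(`Masking failed for column "${column}"`);
      }
    }
    out.push(masked);
  }
  return out;
}
```

Callers should treat a thrown `MaskingFailedError` as a reason to stop the export, not to retry with masking disabled.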

Observed Status

Fully implemented. Masking is executed via WASM in both the pipeline and engine layers, with:

Core strategies (none/redact/hash/deterministic/partial/fixed/null/shuffle)
Semantic strategies (email/phone/ssn/creditcard)
Regex-based masking with capture group support
Tokenization with a local token vault
Tier-gated feature access and fail-closed error handling

Type definitions are exported publicly from @rowops/schema.
