Mask Module

The Mask module applies masking strategies to sensitive fields before downstream processing or export.

Masking behavior is implemented in the core execution engine. Configuration support exists, but runtime enforcement should be validated per execution mode.


Purpose

Apply masking strategies to sensitive fields, replacing or obscuring values before they are exported or delivered to external systems. Masking is destructive by design unless you opt into tokenization with a local vault.


When It Runs

Pipeline Position: After Validate (optional)

Parse -> Validate -> **Mask** -> Transform -> Profile

Masking occurs before transformation and export, ensuring sensitive data is protected before any downstream operations.


Inputs

| Input | Type | Description |
| --- | --- | --- |
| Data | Arrow IPC | Validated Arrow RecordBatch stream |
| Config | MaskConfig | Masking rules per field |

Outputs

| Output | Type | Description |
| --- | --- | --- |
| Masked data | Arrow IPC | Masked Arrow RecordBatch stream |

Configuration

MaskConfig Structure

type MaskConfig = {
  autoDetectPII?: boolean;
  defaultStrategy?: MaskStrategy;
  projectSecret?: string;
  emailStrategy?: MaskStrategy;
  phoneStrategy?: MaskStrategy;
  cardStrategy?: MaskStrategy;
  ipStrategy?: MaskStrategy;
  ssnStrategy?: MaskStrategy;
  passportStrategy?: MaskStrategy;
  columnRules?: Record<string, MaskStrategy | { strategy: MaskStrategy; options?: MaskRuleOptions }>;
};

type MaskRuleOptions = {
  first?: number;
  last?: number;
  fixed?: string;
  projectSecret?: string;
  pattern?: string;
  replacement?: string;
};
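The sketch below shows how a config might be written against these types, and how the two accepted forms of a column rule (a bare strategy name, or an object with options) can be collapsed into one shape. The `MaskStrategy` union is an assumption assembled from the strategy tables in this document, and `normalizeRule`, `ColumnRule`, and the column names `notes` and `account_id` are hypothetical, not part of the published API.

```typescript
// Assumption: the union of strategy names listed in this document's tables.
type MaskStrategy =
  | "none" | "redact" | "hash" | "deterministic" | "partial" | "fixed" | "null" | "shuffle"
  | "email" | "phone" | "ssn" | "creditcard" | "regex" | "tokenize";

type MaskRuleOptions = {
  first?: number;
  last?: number;
  fixed?: string;
  projectSecret?: string;
  pattern?: string;
  replacement?: string;
};

// Hypothetical alias for the value type of columnRules.
type ColumnRule = MaskStrategy | { strategy: MaskStrategy; options?: MaskRuleOptions };

type MaskConfig = {
  autoDetectPII?: boolean;
  defaultStrategy?: MaskStrategy;
  projectSecret?: string;
  emailStrategy?: MaskStrategy;
  columnRules?: Record<string, ColumnRule>;
};

// Hypothetical helper: turn either rule form into { strategy, options }.
function normalizeRule(rule: ColumnRule): { strategy: MaskStrategy; options: MaskRuleOptions } {
  return typeof rule === "string"
    ? { strategy: rule, options: {} }
    : { strategy: rule.strategy, options: rule.options ?? {} };
}

// Illustrative config: column names are made up for this example.
const exampleConfig: MaskConfig = {
  defaultStrategy: "redact",
  emailStrategy: "email",
  columnRules: {
    notes: "null",
    account_id: { strategy: "partial", options: { first: 2, last: 2 } },
  },
};
```

Normalizing early means later code only has to handle one rule shape, regardless of which form the config author used.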

MaskStrategy Options

Basic Strategies

| Strategy | Description | Example |
| --- | --- | --- |
| none | No masking (pass-through) | john@example.com -> john@example.com |
| redact | Replace with static placeholder | john@example.com -> **** |
| hash | One-way SHA-256 hash (optional secret) | john@example.com -> a8d91c... |
| deterministic | Hash using projectSecret | john@example.com -> a8d91c... |
| partial | Mask portion of value (configurable) | secret123 -> se***23 |
| fixed | Replace with fixed string | secret -> [HIDDEN] |
| null | Replace with null value | john@example.com -> null |
| shuffle | Deterministically shuffle characters | secret -> teserc |
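To make the table concrete, here is a minimal sketch of how a few of these strategies could behave. It is not the engine's implementation: the function names, the default `first`/`last` values, the fixed three-asterisk filler (chosen to match the `se***23` example), and the way `shuffle` seeds itself from the input are all assumptions. The `hash` and `deterministic` strategies are left out.

```typescript
type PartialOptions = { first?: number; last?: number };

// redact: replace the whole value with a static placeholder.
function redact(_value: string): string {
  return "****";
}

// partial: keep `first` leading and `last` trailing characters,
// hiding everything in between behind a fixed "***" filler (assumed).
function partial(value: string, { first = 2, last = 2 }: PartialOptions = {}): string {
  if (value.length <= first + last) return "***"; // too short to reveal anything
  return value.slice(0, first) + "***" + value.slice(value.length - last);
}

// fixed: replace with a caller-supplied string.
function fixed(_value: string, replacement = "[HIDDEN]"): string {
  return replacement;
}

// null: drop the value entirely.
function toNull(_value: string): null {
  return null;
}

// shuffle: Fisher-Yates driven by a small PRNG seeded from the value itself,
// so the same input always yields the same permutation (seeding is assumed).
function shuffle(value: string): string {
  let seed = 0x811c9dc5; // FNV-1a offset basis
  for (let i = 0; i < value.length; i++) {
    seed ^= value.charCodeAt(i);
    seed = Math.imul(seed, 0x01000193) >>> 0;
  }
  const next = (): number => {
    // mulberry32-style generator returning a float in [0, 1)
    seed = (seed + 0x6d2b79f5) >>> 0;
    let t = seed;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
  const chars = Array.from(value);
  for (let i = chars.length - 1; i > 0; i--) {
    const j = Math.floor(next() * (i + 1));
    const tmp = chars[i] as string;
    chars[i] = chars[j] as string;
    chars[j] = tmp;
  }
  return chars.join("");
}
```

With these assumptions, `partial("secret123")` yields `se***23`, matching the table. The output of `shuffle` will not necessarily match the table's `teserc`, since the engine's actual permutation scheme is not documented here.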

Semantic Strategies (Pro+ Tier)

These strategies automatically format output based on the data type:

| Strategy | Description | Example |
| --- | --- | --- |
| email | Mask local part, keep domain | john@example.com -> j***@example.com |
| phone | Keep last 4 digits | 555-123-4567 -> ***-***-4567 |
| ssn | SSN format with last 4 visible | 123-45-6789 -> ***-**-6789 |
| creditcard | Card format with last 4 visible | 4111111111111111 -> ****-****-****-1111 |
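The examples in this table can be reproduced with a few lines of string handling. The sketch below is one way to do it, not the engine's code: all function names are hypothetical, and the handling of malformed input (for example an email with no `@`) is an assumption.

```typescript
// Keep the last `keep` digits; replace earlier digits with "*".
// Separators such as "-" are left untouched so the format is preserved.
function keepLastDigits(value: string, keep = 4): string {
  const totalDigits = (value.match(/\d/g) ?? []).length;
  let seen = 0;
  return value.replace(/\d/g, (digit) => {
    seen += 1;
    return seen > totalDigits - keep ? digit : "*";
  });
}

// email: keep the first character of the local part and the full domain.
function maskEmail(value: string): string {
  const at = value.lastIndexOf("@");
  if (at <= 0) return "***"; // no "@" or empty local part: hide everything (assumed)
  return value.charAt(0) + "***" + value.slice(at);
}

// phone and ssn share the same "last four digits visible" rule.
function maskPhone(value: string): string {
  return keepLastDigits(value);
}

function maskSsn(value: string): string {
  return keepLastDigits(value);
}

// creditcard: normalise to a grouped format with only the last four digits shown.
function maskCreditCard(value: string): string {
  const digits = value.replace(/\D/g, "");
  return "****-****-****-" + digits.slice(-4);
}
```

Applied to the table's inputs, these produce `j***@example.com`, `***-***-4567`, `***-**-6789`, and `****-****-****-1111` respectively.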

Regex Strategy (Scale+ Tier)

| Strategy | Description |
| --- | --- |
| regex | Replace matches using pattern and replacement template |

Regex strategy requires pattern and replacement options:

{
  strategy: "regex",
  options: {
    // Each block is captured so that $4 in the replacement refers to the last four digits.
    pattern: "(\\d{4})-(\\d{4})-(\\d{4})-(\\d{4})",
    replacement: "****-****-****-$4"
  }
}

Replacement templates support capture groups ($1, $2, etc.).
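A reference such as `$4` only resolves when the pattern actually defines a fourth capture group. The sketch below illustrates the idea using JavaScript's built-in `RegExp` and `String.prototype.replace`; the helper name `maskWithRegex` is hypothetical, and the engine's regex dialect may differ from JavaScript's in its details.

```typescript
type RegexOptions = { pattern: string; replacement: string };

// Replace every match of `pattern` using the `replacement` template.
// "$1", "$2", ... in the template refer to the pattern's capture groups.
function maskWithRegex(value: string, { pattern, replacement }: RegexOptions): string {
  const re = new RegExp(pattern, "g"); // "g" so all matches are replaced, not just the first
  return value.replace(re, replacement);
}

const maskedCard = maskWithRegex("card 4111-2222-3333-4444 on file", {
  // Four capture groups, one per block of digits.
  pattern: "(\\d{4})-(\\d{4})-(\\d{4})-(\\d{4})",
  replacement: "****-****-****-$4",
});
// maskedCard === "card ****-****-****-4444 on file"
```

Text outside the match is left untouched, so the regex strategy suits free-text columns where a sensitive value is embedded in a longer string.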

Specific strategy availability depends on tier and field type.

Tokenize Strategy (Scale+ Tier)

| Strategy | Description |
| --- | --- |
| tokenize | Replace values with vault-backed tokens |

Tokenize requires a projectSecret (or a vault secret provided by the host):

{
  strategy: "tokenize",
  options: {
    projectSecret: "import-session-secret"
  }
}

Tokens are deterministic per secret and recorded in the local token vault.
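As a rough illustration of "deterministic per secret and recorded in a local vault", the sketch below derives each token from an HMAC of the value and stores the token-to-value mapping in memory. This is an assumption about how such a vault could work, not the engine's actual scheme: the class name `LocalTokenVault`, the `tok_` prefix, the token length, and the use of HMAC-SHA-256 are all hypothetical. It relies on Node's built-in `node:crypto` module.

```typescript
import { createHmac } from "node:crypto";

class LocalTokenVault {
  private readonly secret: string;
  private readonly tokenToValue = new Map<string, string>();

  constructor(secret: string) {
    this.secret = secret;
  }

  // Same secret + same value always yields the same token.
  tokenize(value: string): string {
    const digest = createHmac("sha256", this.secret).update(value).digest("hex");
    const token = "tok_" + digest.slice(0, 24); // truncated for readability (assumed format)
    this.tokenToValue.set(token, value);
    return token;
  }

  // Reversal works only against this vault's own records.
  detokenize(token: string): string | undefined {
    return this.tokenToValue.get(token);
  }
}
```

Because the mapping lives only in the local vault, a token cannot be reversed by anyone who lacks that vault, even if they know the secret.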


Tier Behavior

Mask features are available across multiple tiers:

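| Feature | Free | Pro | Scale | Enterprise |
| --- | --- | --- | --- | --- |
| Basic strategies (none/redact/hash/partial/null) | Yes | Yes | Yes | Yes |
| Deterministic + fixed | No | Yes | Yes | Yes |
| Semantic masking (email/phone/ssn/creditcard) | No | Yes | Yes | Yes |
| Shuffle + regex | No | No | Yes | Yes |
| Tokenization vault | No | No | Yes | Yes |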
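The tier table can be expressed as a lookup from strategy to minimum tier, which is handy for checking a config before submitting it. The sketch below encodes the table as given here; the type names, `MIN_TIER`, and `isStrategyAllowed` are hypothetical and not part of the published API.

```typescript
type Tier = "free" | "pro" | "scale" | "enterprise";

// Assumption: the union of strategy names listed in this document.
type MaskStrategy =
  | "none" | "redact" | "hash" | "deterministic" | "partial" | "fixed" | "null" | "shuffle"
  | "email" | "phone" | "ssn" | "creditcard" | "regex" | "tokenize";

// Tiers ordered from least to most capable.
const TIER_RANK: Record<Tier, number> = { free: 0, pro: 1, scale: 2, enterprise: 3 };

// Minimum tier per strategy, transcribed from the tier table.
const MIN_TIER: Record<MaskStrategy, Tier> = {
  none: "free", redact: "free", hash: "free", partial: "free", null: "free",
  deterministic: "pro", fixed: "pro",
  email: "pro", phone: "pro", ssn: "pro", creditcard: "pro",
  shuffle: "scale", regex: "scale", tokenize: "scale",
};

function isStrategyAllowed(strategy: MaskStrategy, tier: Tier): boolean {
  return TIER_RANK[tier] >= TIER_RANK[MIN_TIER[strategy]];
}
```

For example, `isStrategyAllowed("email", "free")` is `false`, while `isStrategyAllowed("email", "pro")` is `true`.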

What This Module Does Not Do

  • Does not run ML-based PII detection: full detection lives in the pii-detect module; Mask uses config + heuristics
  • Does not guarantee identical enforcement across all execution modes: verify masking separately for worker and pipeline usage in your app
  • Does not provide server-side reidentification: tokenization is reversible only with the local vault
  • Does not mask based on content analysis alone: masking applies to configured fields and semantic hints

Constraints

Module Status

The public API exports type definitions. Actual masking implementation resides in the core execution engine.

Execution Mode Differences

Masking behavior may differ between execution modes:

| Aspect | Dashboard-Assisted | Headless |
| --- | --- | --- |
| Implementation | Engine layer | Pipeline layer |
| Configuration | Schema-embedded | Pipeline config |
| Enforcement | Should be validated | Should be validated |

Runtime enforcement should be validated per execution mode. Do not assume identical behavior without testing.
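One way to run that validation is to feed the same fixture through each mode and compare the masked output cell by cell. The sketch below is a generic comparison harness only; `MaskRunner`, `diffMaskedOutput`, and `checkParity` are hypothetical names, and wiring each runner to the real dashboard-assisted or headless path is left to your application.

```typescript
type Row = Record<string, string | null>;

// Hypothetical adapter: anything that takes rows and returns masked rows.
type MaskRunner = (rows: Row[]) => Row[];

// Returns human-readable differences; an empty list means the outputs agree.
function diffMaskedOutput(a: Row[], b: Row[]): string[] {
  const diffs: string[] = [];
  if (a.length !== b.length) diffs.push(`row count: ${a.length} vs ${b.length}`);
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    const rowA = a[i] ?? {};
    const rowB = b[i] ?? {};
    const columns = Array.from(new Set(Object.keys(rowA).concat(Object.keys(rowB))));
    for (const column of columns) {
      if (rowA[column] !== rowB[column]) {
        diffs.push(`row ${i}, column "${column}": ${String(rowA[column])} vs ${String(rowB[column])}`);
      }
    }
  }
  return diffs;
}

// Run the same fixture through both modes and report any disagreement.
function checkParity(fixture: Row[], dashboard: MaskRunner, headless: MaskRunner): string[] {
  return diffMaskedOutput(dashboard(fixture), headless(fixture));
}
```

A non-empty result pinpoints the row and column where the two modes diverge, which is usually enough to trace the difference back to a specific rule.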

Fail-Closed Behavior

In tested paths, masking failures throw MaskingFailedError and halt the pipeline. No fallback to unmasked data is provided.


Failure Modes

| Failure | Behavior |
| --- | --- |
| Configuration error | MaskingFailedError thrown; pipeline halts |
| Unsupported strategy | MaskingFailedError thrown; pipeline halts |
| Invalid field reference | Error thrown; pipeline halts |

Masking failures are fail-closed: no partial or unmasked results are produced.
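The fail-closed contract can be sketched as follows: any failure while masking aborts the whole batch, and the caller never receives a partially masked result. This is an illustration, not the engine's code. The `MaskingFailedError` class is re-declared locally as a stand-in for the real error type, and `maskRows`, `Row`, and `Masker` are hypothetical.

```typescript
// Local stand-in for the engine's error type (assumed shape).
class MaskingFailedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "MaskingFailedError";
  }
}

type Row = Record<string, string | null>;
type Masker = (value: string) => string | null;

// Mask every configured column in every row, or throw. Never returns partial output.
function maskRows(rows: Row[], rules: Record<string, Masker>): Row[] {
  const out: Row[] = [];
  for (const row of rows) {
    const masked: Row = { ...row };
    for (const [column, mask] of Object.entries(rules)) {
      if (!(column in row)) {
        // Invalid field reference: halt rather than skip the rule silently.
        throw new Error(`Unknown column "${column}"`);
      }
      const value = row[column];
      try {
        masked[column] = value == null ? null : mask(value);
      } catch (cause) {
        throw new MaskingFailedError(`Masking failed for column "${column}": ${String(cause)}`);
      }
    }
    out.push(masked);
  }
  return out; // reached only if every row was masked successfully
}
```

Because results are accumulated locally and returned only at the end, a failure on the last row still leaves the caller with nothing, which is the point of failing closed.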


Observed Status

Fully implemented. Masking is executed via WASM in both the pipeline and engine layers, with:

  • Core strategies (none/redact/hash/deterministic/partial/fixed/null/shuffle)
  • Semantic strategies (email/phone/ssn/creditcard)
  • Regex-based masking with capture group support
  • Tokenization with a local token vault
  • Tier-gated feature access and fail-closed error handling

Type definitions are exported publicly from @rowops/schema.