
lint API

Dataset-level invariant enforcement with tier-aware rules and a deterministic DSL.

npm install @rowops/lint

Use the pipeline API to lint data already ingested as Arrow IPC.

import type { ImportJob } from "@rowops/import-core";

async function runLint(job: ImportJob) {
  const result = await job.lint({
    rules: [
      {
        id: "unique_email",
        type: "uniqueness",
        severity: "error",
        config: { columns: ["email"] },
      },
      {
        id: "amount_range",
        type: "numeric_range",
        severity: "warning",
        config: { column: "amount", min: 0 },
      },
      {
        id: "end_after_start",
        type: "dsl",
        severity: "error",
        config: { expression: "end_date >= start_date" },
      },
    ],
    onProgress: (pct, message) => console.log(pct, message),
  });

  console.log(result.counts, result.totalViolations);
  return result;
}

LintRule

interface LintRule {
  id: string;
  type: string;
  severity: "error" | "warning" | "info";
  config?: Record<string, unknown>;
  /** Additional fields are allowed and merged into config if not present there. */
  [key: string]: unknown;
}

Additional top-level fields are merged into config. If the same key is set in both places, the value in config wins. Tier enforcement inspects config, so put rule parameters there rather than at the top level.
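The merge rule can be modelled in a few lines. This is a sketch for illustration only, not the library's implementation: `resolveRuleConfig` is a hypothetical helper, and treating `id`, `type`, `severity`, and `config` as the only reserved keys is an assumption.

```typescript
interface LintRule {
  id: string;
  type: string;
  severity: "error" | "warning" | "info";
  config?: Record<string, unknown>;
  [key: string]: unknown;
}

// Assumed set of keys that are never copied into config.
const RESERVED = new Set(["id", "type", "severity", "config"]);

// Hypothetical helper (not part of @rowops/lint): copies extra top-level
// fields into the effective config, letting config win on conflicts.
function resolveRuleConfig(rule: LintRule): Record<string, unknown> {
  const extras: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(rule)) {
    if (!RESERVED.has(key)) extras[key] = value;
  }
  // Spread extras first so that keys set in config override them.
  return { ...extras, ...(rule.config ?? {}) };
}

const merged = resolveRuleConfig({
  id: "amount_range",
  type: "numeric_range",
  severity: "warning",
  column: "amount", // top-level only: copied into the effective config
  min: 5,           // set in both places: config.min takes precedence
  config: { min: 0 },
});

console.log(merged); // merged is { column: "amount", min: 0 }
```

Keeping every parameter inside config avoids relying on this merge at all, which matters because tier enforcement reads config.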


LintResult and LintViolation

interface LintViolation {
  rule_id: string;
  severity: "error" | "warning" | "info";
  row_index?: number;
  message: string;
  details?: unknown;
}

interface LintResult {
  violations: LintViolation[];
  counts: { error: number; warning: number; info: number };
}
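A result in this shape is straightforward to post-process. The sketch below tallies violations per rule and applies a simple gate; `countByRule` and `hasBlockingErrors` are hypothetical helpers, the "block on any error" policy is an assumption, and the sample data is made up for illustration.

```typescript
interface LintViolation {
  rule_id: string;
  severity: "error" | "warning" | "info";
  row_index?: number;
  message: string;
  details?: unknown;
}

interface LintResult {
  violations: LintViolation[];
  counts: { error: number; warning: number; info: number };
}

// Hypothetical helper: number of violations reported for each rule id.
function countByRule(result: LintResult): Record<string, number> {
  const byRule: Record<string, number> = {};
  for (const violation of result.violations) {
    byRule[violation.rule_id] = (byRule[violation.rule_id] ?? 0) + 1;
  }
  return byRule;
}

// Hypothetical policy: treat any error-severity violation as blocking.
function hasBlockingErrors(result: LintResult): boolean {
  return result.counts.error > 0;
}

// Hand-written sample, not real library output.
const sample: LintResult = {
  violations: [
    { rule_id: "unique_email", severity: "error", row_index: 3, message: "Duplicate email" },
    { rule_id: "unique_email", severity: "error", row_index: 7, message: "Duplicate email" },
    { rule_id: "amount_range", severity: "warning", row_index: 2, message: "Amount below 0" },
  ],
  counts: { error: 2, warning: 1, info: 0 },
};

console.log(countByRule(sample));       // { unique_email: 2, amount_range: 1 }
console.log(hasBlockingErrors(sample)); // true
```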

Rule Types and Config

Uniqueness (uniqueness, unique)

{
  id: "unique_email",
  type: "uniqueness",
  severity: "error",
  config: {
    columns: ["email"],
    ignoreNull: true,
    caseInsensitive: false,
  },
}
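To make the options concrete, here is one plausible reading of them as plain code. It is a sketch under assumptions, not the engine's behavior: `findDuplicates` is a hypothetical function, `ignoreNull` is assumed to skip any row with a null in a key column, and `caseInsensitive` is assumed to lowercase values before comparison.

```typescript
type Row = Record<string, string | null>;

// Hypothetical sketch: returns the indices of rows whose key columns
// repeat a combination already seen in an earlier row.
function findDuplicates(
  rows: Row[],
  columns: string[],
  ignoreNull: boolean,
  caseInsensitive: boolean,
): number[] {
  const seen = new Set<string>();
  const duplicates: number[] = [];
  rows.forEach((row, index) => {
    const values = columns.map((column) => row[column] ?? null);
    // Assumption: a null in any key column exempts the row when ignoreNull is set.
    if (ignoreNull && values.some((value) => value === null)) return;
    const key = JSON.stringify(
      values.map((value) => (value !== null && caseInsensitive ? value.toLowerCase() : value)),
    );
    if (seen.has(key)) duplicates.push(index);
    else seen.add(key);
  });
  return duplicates;
}

const rows: Row[] = [
  { email: "a@test.com" },
  { email: "A@test.com" },
  { email: null },
  { email: null },
  { email: "a@test.com" },
];

console.log(findDuplicates(rows, ["email"], true, false));  // [4]
console.log(findDuplicates(rows, ["email"], true, true));   // [1, 4]
console.log(findDuplicates(rows, ["email"], false, false)); // [3, 4]
```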

Foreign Key (foreign_key, fk)

{
  id: "account_fk",
  type: "foreign_key",
  severity: "error",
  config: {
    column: "accountId",
    lookupValues: ["A", "B", "C"],
    allowNull: true,
    caseInsensitive: false,
  },
}

Numeric Range (numeric_range, range)

{
  id: "amount_range",
  type: "numeric_range",
  severity: "warning",
  config: {
    column: "amount",
    min: 0,
    max: 10000,
    minInclusive: true,
    maxInclusive: true,
    allowNull: true,
  },
}
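The bound options can be read as the following per-value check. This is an illustrative sketch, assuming that an inclusive bound accepts a value equal to it and that `allowNull` lets null values pass; `passesRange` and `RangeCheck` are hypothetical names, and the flags are required here so the sketch makes no claim about the library's defaults.

```typescript
// Hypothetical, fully specified version of the numeric_range options.
interface RangeCheck {
  min?: number;
  max?: number;
  minInclusive: boolean;
  maxInclusive: boolean;
  allowNull: boolean;
}

// Returns true when the value satisfies the range under the assumed semantics.
function passesRange(value: number | null, check: RangeCheck): boolean {
  if (value === null) return check.allowNull;
  if (check.min !== undefined) {
    const aboveMin = check.minInclusive ? value >= check.min : value > check.min;
    if (!aboveMin) return false;
  }
  if (check.max !== undefined) {
    const belowMax = check.maxInclusive ? value <= check.max : value < check.max;
    if (!belowMax) return false;
  }
  return true;
}

const inclusive: RangeCheck = {
  min: 0,
  max: 10000,
  minInclusive: true,
  maxInclusive: true,
  allowNull: true,
};
const exclusiveMin: RangeCheck = { ...inclusive, minInclusive: false };

console.log(passesRange(0, inclusive));     // true: equal to an inclusive min
console.log(passesRange(0, exclusiveMin));  // false: equal to an exclusive min
console.log(passesRange(10001, inclusive)); // false: above max
console.log(passesRange(null, inclusive));  // true: nulls allowed
```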

Window (window, date_window)

{
  id: "no_overlap",
  type: "window",
  severity: "error",
  config: {
    groupBy: ["accountId"],
    start: "startDate",
    end: "endDate",
    constraint: "noOverlap",
    allowEqualEdges: false,
  },
}
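A noOverlap check within each group could work roughly as sketched below. This is an assumption about the semantics rather than a description of the engine: `findOverlaps` is a hypothetical function, dates are simplified to numbers, and `allowEqualEdges: false` is taken to mean that an interval starting exactly where another ends counts as overlapping.

```typescript
interface Interval {
  group: string; // stands in for the groupBy key, e.g. accountId
  start: number; // dates simplified to numbers (e.g. epoch days)
  end: number;
}

// Hypothetical sketch: returns indices of rows that overlap an earlier-starting
// interval in the same group.
function findOverlaps(rows: Interval[], allowEqualEdges: boolean): number[] {
  // Collect row indices per group.
  const groups: Record<string, number[]> = {};
  rows.forEach((row, index) => {
    if (!(row.group in groups)) groups[row.group] = [];
    groups[row.group].push(index);
  });

  const flagged: number[] = [];
  for (const key of Object.keys(groups)) {
    // Sort by start, then sweep while tracking the latest end seen so far.
    const ordered = groups[key].slice().sort((a, b) => rows[a].start - rows[b].start);
    let maxEnd = -Infinity;
    for (const index of ordered) {
      const { start, end } = rows[index];
      const overlaps = allowEqualEdges ? start < maxEnd : start <= maxEnd;
      if (overlaps) flagged.push(index);
      maxEnd = Math.max(maxEnd, end);
    }
  }
  return flagged.sort((a, b) => a - b);
}

const intervals: Interval[] = [
  { group: "A", start: 1, end: 5 },
  { group: "A", start: 5, end: 8 },  // touches the previous interval at 5
  { group: "A", start: 9, end: 12 },
  { group: "B", start: 1, end: 3 },  // different group, never compared with A
];

console.log(findOverlaps(intervals, false)); // [1]
console.log(findOverlaps(intervals, true));  // []
```

Tracking the running maximum end, rather than comparing only adjacent intervals, also catches a long interval that contains several later ones.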

Group Constraint (group_constraint, group)

{
  id: "one_primary",
  type: "group_constraint",
  severity: "warning",
  config: {
    groupBy: ["accountId"],
    condition: "sum(isPrimary) <= 1",
    message: "Only one primary record allowed",
  },
}

DSL (dsl, expression)

{
  id: "active_adults",
  type: "dsl",
  severity: "error",
  config: {
    expression: "status == 'active' && age >= 18",
  },
}

See Lint DSL Reference for the full grammar.


Worker Factory (Advanced)

For custom pipelines or small, bounded datasets, you can run lint rules in a worker.

import { createLintWorker, createWorkerRequest } from "@rowops/lint";
import { resolveBrowserLicense } from "@rowops/import-core";

const worker = createLintWorker();
const { tierGateInit } = await resolveBrowserLicense({
  projectId: "proj_xxx",
  entitlementToken: "eyJ...",
});
const request = createWorkerRequest("START_LINT", {
  rows: [{ email: "a@test.com" }, { email: "b@test.com" }],
  rules: [
    {
      id: "unique_email",
      type: "uniqueness",
      severity: "error",
      config: { columns: ["email"] },
    },
  ],
  tierGate: tierGateInit,
});

worker.postMessage(request);

Tier Restrictions

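| Rule Type | Free | Pro | Scale | Enterprise |
| --- | --- | --- | --- | --- |
| Uniqueness (single column) | Yes | Yes | Yes | Yes |
| Uniqueness (multi column) | No | Yes | Yes | Yes |
| Foreign key | No | Yes | Yes | Yes |
| Numeric range | No | Yes | Yes | Yes |
| Window / group constraint | No | No | Yes | Yes |
| DSL rules | No | No | Yes | Yes |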
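For a client-side pre-check, the table above can be encoded as a minimum-tier lookup. This is only a sketch that mirrors the table: the `Tier` and `RuleKind` names, the `MIN_TIER` map, and `isAllowed` are hypothetical, and real enforcement happens in the library through the tier gate.

```typescript
type Tier = "free" | "pro" | "scale" | "enterprise";

type RuleKind =
  | "uniqueness_single"
  | "uniqueness_multi"
  | "foreign_key"
  | "numeric_range"
  | "window"
  | "group_constraint"
  | "dsl";

// Tiers ordered from least to most capable.
const TIER_ORDER: Tier[] = ["free", "pro", "scale", "enterprise"];

// Lowest tier on which each rule kind is available, per the table.
const MIN_TIER: Record<RuleKind, Tier> = {
  uniqueness_single: "free",
  uniqueness_multi: "pro",
  foreign_key: "pro",
  numeric_range: "pro",
  window: "scale",
  group_constraint: "scale",
  dsl: "scale",
};

// Hypothetical helper: true when the tier is at or above the rule's minimum.
function isAllowed(kind: RuleKind, tier: Tier): boolean {
  return TIER_ORDER.indexOf(tier) >= TIER_ORDER.indexOf(MIN_TIER[kind]);
}

console.log(isAllowed("uniqueness_single", "free")); // true
console.log(isAllowed("uniqueness_multi", "free"));  // false
console.log(isAllowed("dsl", "pro"));                // false
console.log(isAllowed("dsl", "scale"));              // true
```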
