lint API
Dataset-level invariant enforcement with tier-aware rules and a deterministic DSL.
npm install @rowops/lint
Recommended Usage (ImportJob)
Use the pipeline API to lint data already ingested as Arrow IPC.
import type { ImportJob } from "@rowops/import-core";
async function runLint(job: ImportJob) {
  const result = await job.lint({
    rules: [
      {
        id: "unique_email",
        type: "uniqueness",
        severity: "error",
        config: { columns: ["email"] },
      },
      {
        id: "amount_range",
        type: "numeric_range",
        severity: "warning",
        config: { column: "amount", min: 0 },
      },
      {
        id: "end_after_start",
        type: "dsl",
        severity: "error",
        config: { expression: "end_date >= start_date" },
      },
    ],
    onProgress: (pct, message) => console.log(pct, message),
  });
  console.log(result.counts, result.totalViolations);
  return result;
}
LintRule
interface LintRule {
  id: string;
  type: string;
  severity: "error" | "warning" | "info";
  config?: Record<string, unknown>;
  /** Additional fields are allowed and merged into config if not present there. */
  [key: string]: unknown;
}
Top-level fields other than id, type, severity, and config are merged into config. If a key is set in both places, the value in config wins.
Tier enforcement inspects config, so put rule parameters there rather than at the top level.
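The merge rule above can be sketched as follows. This is a minimal illustration, not the library's implementation: `effectiveConfig` and `RESERVED_KEYS` are hypothetical names, and the interface is a local copy of the documented `LintRule` shape.

```typescript
// Local copy of the documented LintRule shape.
interface LintRule {
  id: string;
  type: string;
  severity: "error" | "warning" | "info";
  config?: Record<string, unknown>;
  [key: string]: unknown;
}

// Fields that describe the rule itself and are never treated as parameters.
const RESERVED_KEYS = new Set(["id", "type", "severity", "config"]);

// Hypothetical helper (not part of @rowops/lint): folds extra top-level
// fields into config, letting config win when a key appears in both.
function effectiveConfig(rule: LintRule): Record<string, unknown> {
  const extras: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(rule)) {
    if (!RESERVED_KEYS.has(key)) extras[key] = value;
  }
  // Spread order matters: keys from config overwrite same-named extras.
  return { ...extras, ...(rule.config ?? {}) };
}

const merged = effectiveConfig({
  id: "amount_range",
  type: "numeric_range",
  severity: "warning",
  min: 5, // top-level value, overridden by config.min
  column: "amount", // top-level value, kept because config has no column
  config: { min: 0 },
});
console.log(merged); // { min: 0, column: "amount" }
```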
LintResult and LintViolation
interface LintViolation {
  rule_id: string;
  severity: "error" | "warning" | "info";
  row_index?: number;
  message: string;
  details?: unknown;
}

interface LintResult {
  violations: LintViolation[];
  counts: { error: number; warning: number; info: number };
}
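One way to consume a result is to count violations per rule and treat any error-severity violation as blocking. The sketch below works only from the two interfaces above; `summarize` is a hypothetical helper and the sample data is made up for illustration.

```typescript
interface LintViolation {
  rule_id: string;
  severity: "error" | "warning" | "info";
  row_index?: number;
  message: string;
  details?: unknown;
}

interface LintResult {
  violations: LintViolation[];
  counts: { error: number; warning: number; info: number };
}

// Hypothetical helper: counts violations per rule and flags the result as
// blocking when at least one error-severity violation was reported.
function summarize(result: LintResult): { blocking: boolean; byRule: Map<string, number> } {
  const byRule = new Map<string, number>();
  for (const violation of result.violations) {
    byRule.set(violation.rule_id, (byRule.get(violation.rule_id) ?? 0) + 1);
  }
  return { blocking: result.counts.error > 0, byRule };
}

// Made-up sample result.
const sampleResult: LintResult = {
  violations: [
    { rule_id: "unique_email", severity: "error", row_index: 4, message: "Duplicate email" },
    { rule_id: "amount_range", severity: "warning", row_index: 7, message: "amount below 0" },
    { rule_id: "unique_email", severity: "error", row_index: 9, message: "Duplicate email" },
  ],
  counts: { error: 2, warning: 1, info: 0 },
};

const summary = summarize(sampleResult);
console.log(summary.blocking, summary.byRule.get("unique_email")); // true 2
```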
Rule Types and Config
Uniqueness (uniqueness, unique)
{
  id: "unique_email",
  type: "uniqueness",
  severity: "error",
  config: {
    columns: ["email"],
    ignoreNull: true,
    caseInsensitive: false,
  },
}
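To make the options concrete, here is a rough sketch of what a uniqueness check over in-memory rows might look like. The semantics are assumptions inferred from the option names: `ignoreNull` is taken to skip rows with a null key column, and `caseInsensitive` to lowercase string values before comparing. `findDuplicateRows` is a hypothetical helper, not a library function.

```typescript
type Row = Record<string, unknown>;

interface UniquenessConfig {
  columns: string[];
  ignoreNull?: boolean;
  caseInsensitive?: boolean;
}

// Hypothetical helper: returns the indexes of rows whose key repeats an
// earlier row's key. The first occurrence is not reported.
function findDuplicateRows(rows: Row[], cfg: UniquenessConfig): number[] {
  const seen = new Set<string>();
  const duplicates: number[] = [];
  rows.forEach((row, index) => {
    const values = cfg.columns.map((column) => row[column] ?? null);
    // Assumption: with ignoreNull, a null in any key column exempts the row.
    if (cfg.ignoreNull && values.some((value) => value === null)) return;
    const normalized = values.map((value) =>
      cfg.caseInsensitive && typeof value === "string" ? value.toLowerCase() : value
    );
    const key = JSON.stringify(normalized);
    if (seen.has(key)) duplicates.push(index);
    else seen.add(key);
  });
  return duplicates;
}

const emailRows: Row[] = [
  { email: "a@test.com" },
  { email: "A@test.com" },
  { email: null },
  { email: null },
  { email: "a@test.com" },
];

console.log(findDuplicateRows(emailRows, { columns: ["email"], ignoreNull: true })); // [4]
console.log(
  findDuplicateRows(emailRows, { columns: ["email"], ignoreNull: true, caseInsensitive: true })
); // [1, 4]
```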
Foreign Key (foreign_key, fk)
{
  id: "account_fk",
  type: "foreign_key",
  severity: "error",
  config: {
    column: "accountId",
    lookupValues: ["A", "B", "C"],
    allowNull: true,
    caseInsensitive: false,
  },
}
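As an illustration of how these options might interact, the sketch below checks a column against a lookup list. It assumes `allowNull` lets null or missing values pass and `caseInsensitive` lowercases both sides before comparing; `findOrphanRows` is a hypothetical name, not part of the package.

```typescript
type Row = Record<string, unknown>;

interface ForeignKeyConfig {
  column: string;
  lookupValues: string[];
  allowNull?: boolean;
  caseInsensitive?: boolean;
}

// Hypothetical helper: returns indexes of rows whose value is not in lookupValues.
function findOrphanRows(rows: Row[], cfg: ForeignKeyConfig): number[] {
  const normalize = (text: string): string => (cfg.caseInsensitive ? text.toLowerCase() : text);
  const allowed = new Set(cfg.lookupValues.map(normalize));
  const orphans: number[] = [];
  rows.forEach((row, index) => {
    const value = row[cfg.column];
    if (value === null || value === undefined) {
      // Assumption: nulls are violations unless allowNull is set.
      if (!cfg.allowNull) orphans.push(index);
      return;
    }
    if (!allowed.has(normalize(String(value)))) orphans.push(index);
  });
  return orphans;
}

const accountRows: Row[] = [
  { accountId: "A" },
  { accountId: "a" },
  { accountId: null },
  { accountId: "Z" },
];

console.log(
  findOrphanRows(accountRows, {
    column: "accountId",
    lookupValues: ["A", "B", "C"],
    allowNull: true,
    caseInsensitive: false,
  })
); // [1, 3]
```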
Numeric Range (numeric_range, range)
{
  id: "amount_range",
  type: "numeric_range",
  severity: "warning",
  config: {
    column: "amount",
    min: 0,
    max: 10000,
    minInclusive: true,
    maxInclusive: true,
    allowNull: true,
  },
}
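The inclusivity flags decide whether a value sitting exactly on a bound passes. A minimal sketch of that comparison, assuming bounds are inclusive unless the flag is explicitly `false` and that nulls pass only when `allowNull` is `true` (the library's actual defaults are not stated here); `isInRange` is a hypothetical helper:

```typescript
interface RangeBounds {
  min?: number;
  max?: number;
  minInclusive?: boolean;
  maxInclusive?: boolean;
  allowNull?: boolean;
}

// Hypothetical helper: true when the value satisfies the configured bounds.
function isInRange(value: number | null, cfg: RangeBounds): boolean {
  if (value === null) return cfg.allowNull === true;
  if (Number.isNaN(value)) return false;
  if (cfg.min !== undefined) {
    // Exclusive bound only when minInclusive is explicitly false (assumption).
    const aboveMin = cfg.minInclusive === false ? value > cfg.min : value >= cfg.min;
    if (!aboveMin) return false;
  }
  if (cfg.max !== undefined) {
    const belowMax = cfg.maxInclusive === false ? value < cfg.max : value <= cfg.max;
    if (!belowMax) return false;
  }
  return true;
}

const bounds: RangeBounds = { min: 0, max: 10000, minInclusive: true, maxInclusive: true, allowNull: true };

console.log(isInRange(10000, bounds)); // true: the upper bound is inclusive
console.log(isInRange(10000, { ...bounds, maxInclusive: false })); // false
console.log(isInRange(-1, bounds)); // false
```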
Window (window, date_window)
{
  id: "no_overlap",
  type: "window",
  severity: "error",
  config: {
    groupBy: ["accountId"],
    start: "startDate",
    end: "endDate",
    constraint: "noOverlap",
    allowEqualEdges: false,
  },
}
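A `noOverlap` constraint can be pictured as a sweep over each group's intervals sorted by start. The sketch below is one plausible reading, not the library's algorithm: it assumes ISO date strings, and that `allowEqualEdges: true` lets one interval start exactly where another ends. `findOverlaps` is a hypothetical helper, and it reports each interval only against the latest-ending interval before it rather than every overlapping pair.

```typescript
type Row = Record<string, unknown>;

interface WindowConfig {
  groupBy: string[];
  start: string;
  end: string;
  allowEqualEdges?: boolean;
}

interface Span {
  index: number;
  start: number;
  end: number;
}

// Hypothetical helper: returns [earlierRowIndex, laterRowIndex] pairs that overlap.
function findOverlaps(rows: Row[], cfg: WindowConfig): Array<[number, number]> {
  // Bucket rows by their groupBy key.
  const groups = new Map<string, Span[]>();
  rows.forEach((row, index) => {
    const key = JSON.stringify(cfg.groupBy.map((column) => row[column] ?? null));
    const span: Span = {
      index,
      start: Date.parse(String(row[cfg.start])),
      end: Date.parse(String(row[cfg.end])),
    };
    const bucket = groups.get(key);
    if (bucket) bucket.push(span);
    else groups.set(key, [span]);
  });

  const pairs: Array<[number, number]> = [];
  groups.forEach((spans) => {
    spans.sort((a, b) => a.start - b.start);
    let latest: Span | undefined; // span with the greatest end seen so far
    for (const current of spans) {
      if (latest) {
        // With allowEqualEdges, touching edges do not count as an overlap.
        const overlaps = cfg.allowEqualEdges
          ? current.start < latest.end
          : current.start <= latest.end;
        if (overlaps) pairs.push([latest.index, current.index]);
      }
      if (!latest || current.end > latest.end) latest = current;
    }
  });
  return pairs;
}

const windowRows: Row[] = [
  { accountId: "A", startDate: "2024-01-01", endDate: "2024-01-31" },
  { accountId: "A", startDate: "2024-01-31", endDate: "2024-02-15" }, // touches row 0
  { accountId: "A", startDate: "2024-02-10", endDate: "2024-02-20" }, // overlaps row 1
  { accountId: "B", startDate: "2024-01-15", endDate: "2024-01-20" }, // other group
];

const windowConfig: WindowConfig = { groupBy: ["accountId"], start: "startDate", end: "endDate" };
console.log(findOverlaps(windowRows, { ...windowConfig, allowEqualEdges: false })); // [[0, 1], [1, 2]]
console.log(findOverlaps(windowRows, { ...windowConfig, allowEqualEdges: true })); // [[1, 2]]
```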
Group Constraint (group_constraint, group)
{
  id: "one_primary",
  type: "group_constraint",
  severity: "warning",
  config: {
    groupBy: ["accountId"],
    condition: "sum(isPrimary) <= 1",
    message: "Only one primary record allowed",
  },
}
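For intuition, the condition `sum(isPrimary) <= 1` can be read as "sum the column within each group and flag groups above the limit". The sketch below hand-codes that single case instead of parsing the condition string. It assumes booleans count as 1 and 0; `groupsExceeding` is a hypothetical helper.

```typescript
type Row = Record<string, unknown>;

interface GroupSum {
  key: unknown[];
  sum: number;
}

// Hypothetical helper: sums `column` per group and returns groups whose
// sum is greater than `limit` (that is, groups violating `sum(column) <= limit`).
function groupsExceeding(rows: Row[], groupBy: string[], column: string, limit: number): GroupSum[] {
  const sums = new Map<string, GroupSum>();
  for (const row of rows) {
    const key = groupBy.map((name) => row[name] ?? null);
    const id = JSON.stringify(key);
    const entry = sums.get(id) ?? { key, sum: 0 };
    // Assumption: true/false count as 1/0; non-numeric values count as 0.
    const numeric = Number(row[column]);
    entry.sum += Number.isFinite(numeric) ? numeric : 0;
    sums.set(id, entry);
  }
  return Array.from(sums.values()).filter((group) => group.sum > limit);
}

const primaryRows: Row[] = [
  { accountId: "A", isPrimary: true },
  { accountId: "A", isPrimary: true },
  { accountId: "B", isPrimary: true },
  { accountId: "B", isPrimary: false },
];

console.log(groupsExceeding(primaryRows, ["accountId"], "isPrimary", 1));
// [ { key: [ "A" ], sum: 2 } ]
```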
DSL (dsl, expression)
{
  id: "active_adults",
  type: "dsl",
  severity: "error",
  config: {
    expression: "status == 'active' && age >= 18",
  },
}
See Lint DSL Reference for the full grammar.
Worker Factory (Advanced)
For custom pipelines or small, bounded datasets, you can run lint rules in a worker.
import { createLintWorker, createWorkerRequest } from "@rowops/lint";
import { resolveBrowserLicense } from "@rowops/import-core";

const worker = createLintWorker();

// Resolve the license; tierGateInit is passed to the worker as tierGate.
const { tierGateInit } = await resolveBrowserLicense({
  projectId: "proj_xxx",
  entitlementToken: "eyJ...",
});

// Build a START_LINT request carrying the rows and rules in memory.
const request = createWorkerRequest("START_LINT", {
  rows: [{ email: "a@test.com" }, { email: "b@test.com" }],
  rules: [
    {
      id: "unique_email",
      type: "uniqueness",
      severity: "error",
      config: { columns: ["email"] },
    },
  ],
  tierGate: tierGateInit,
});

worker.postMessage(request);
Tier Restrictions
| Rule Type | Free | Pro | Scale | Enterprise |
|---|---|---|---|---|
| Uniqueness (single column) | Yes | Yes | Yes | Yes |
| Uniqueness (multi column) | No | Yes | Yes | Yes |
| Foreign key | No | Yes | Yes | Yes |
| Numeric range | No | Yes | Yes | Yes |
| Window / group constraint | No | No | Yes | Yes |
| DSL rules | No | No | Yes | Yes |
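The table can be encoded as a pre-flight check so that rules a plan does not cover are caught before a job is submitted. This is a sketch built only from the table and the type aliases listed in the section headings; `minimumTier` and `isRuleAllowed` are hypothetical helpers, and actual enforcement happens inside the library.

```typescript
type Tier = "free" | "pro" | "scale" | "enterprise";

// Tiers ordered from least to most capable, matching the table columns.
const TIER_ORDER: Tier[] = ["free", "pro", "scale", "enterprise"];

interface RuleLike {
  type: string;
  config?: Record<string, unknown>;
}

// Hypothetical helper: lowest tier on which the rule is available per the table.
function minimumTier(rule: RuleLike): Tier {
  switch (rule.type) {
    case "uniqueness":
    case "unique": {
      // Only config is inspected, mirroring how tier enforcement is described.
      const columns = rule.config?.columns;
      return Array.isArray(columns) && columns.length > 1 ? "pro" : "free";
    }
    case "foreign_key":
    case "fk":
    case "numeric_range":
    case "range":
      return "pro";
    case "window":
    case "date_window":
    case "group_constraint":
    case "group":
    case "dsl":
    case "expression":
      return "scale";
    default:
      throw new Error(`Unknown rule type: ${rule.type}`);
  }
}

// Hypothetical helper: true when the tier is at or above the rule's minimum.
function isRuleAllowed(tier: Tier, rule: RuleLike): boolean {
  return TIER_ORDER.indexOf(tier) >= TIER_ORDER.indexOf(minimumTier(rule));
}

console.log(isRuleAllowed("free", { type: "uniqueness", config: { columns: ["email"] } })); // true
console.log(isRuleAllowed("free", { type: "uniqueness", config: { columns: ["email", "tenant"] } })); // false
console.log(isRuleAllowed("pro", { type: "dsl", config: { expression: "a > b" } })); // false
```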