
Overview

The Workflow Engine executes multi-step workflows as explicit state machines with idempotency, resumability, and reconciliation against on-chain state.

Core Abstraction

Workflow = (States, Transitions, Context, Reconciler)

Where:
- States: finite set of named states
- Transitions: allowed state changes, each with a single action
- Context: immutable input + mutable progress data
- Reconciler: function that determines current state from on-chain truth
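
The four-part definition above can be sketched as TypeScript types. This is only an illustration: the names `Transition`, `WorkflowContext`, `Workflow`, and `validateWorkflow` are hypothetical and do not come from the engine itself.

```typescript
// Hypothetical type names; the source defines only the four-part tuple.
type StateName = string;

interface Transition<C> {
  from: StateName;
  to: StateName;
  // Exactly one action per transition, as the definition above requires.
  action: (ctx: C) => Promise<C>;
}

interface WorkflowContext<I, P> {
  readonly input: Readonly<I>; // immutable input
  progress: P;                 // mutable progress data
}

interface Workflow<I, P> {
  states: ReadonlySet<StateName>;
  transitions: ReadonlyArray<Transition<WorkflowContext<I, P>>>;
  context: WorkflowContext<I, P>;
  // Derives the current state from on-chain truth, not from local records.
  reconciler: (ctx: WorkflowContext<I, P>) => Promise<StateName>;
}

// Returns one message per transition endpoint that is not a declared state.
function validateWorkflow<I, P>(wf: Workflow<I, P>): string[] {
  const errors: string[] = [];
  for (const t of wf.transitions) {
    if (!wf.states.has(t.from)) errors.push(`unknown source state: ${t.from}`);
    if (!wf.states.has(t.to)) errors.push(`unknown target state: ${t.to}`);
  }
  return errors;
}
```

Validating a definition once at load time is one way to keep transitions explicit: a transition that names an undeclared state is rejected before any workflow runs.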

Workflow Principles

| Principle | Meaning |
| --- | --- |
| Single Responsibility | Each state has exactly one action to perform |
| Explicit Transitions | State changes are explicit, not implicit side effects |
| Idempotent Actions | Running an action twice has the same result |
| External Truth | Workflow reconciles against on-chain state, not its own |
| Crash Tolerance | Any state can be resumed after a crash |

Universal States

Every workflow shares these meta-states:

| State | Meaning | Transitions Out |
| --- | --- | --- |
| PENDING | Workflow created, not yet started | RUNNING |
| RUNNING | Actively executing steps | COMPLETED, FAILED, STALLED |
| STALLED | Waiting for external condition | RUNNING (when condition met) |
| COMPLETED | All steps finished successfully | (terminal) |
| FAILED | Unrecoverable error | (terminal) |
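
The meta-state table above maps directly onto a lookup structure. The sketch below assumes nothing beyond the table; the names `ALLOWED_TRANSITIONS`, `canTransition`, and `isTerminal` are illustrative.

```typescript
type WorkflowState = "PENDING" | "RUNNING" | "STALLED" | "COMPLETED" | "FAILED";

// One entry per row of the table: state -> states it may move to.
const ALLOWED_TRANSITIONS: Record<WorkflowState, readonly WorkflowState[]> = {
  PENDING: ["RUNNING"],
  RUNNING: ["COMPLETED", "FAILED", "STALLED"],
  STALLED: ["RUNNING"],
  COMPLETED: [], // terminal
  FAILED: [],    // terminal
};

function canTransition(from: WorkflowState, to: WorkflowState): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// A state is terminal when it has no outgoing transitions.
function isTerminal(state: WorkflowState): boolean {
  return ALLOWED_TRANSITIONS[state].length === 0;
}
```

Note that under this table a STALLED workflow cannot move to FAILED directly; it has to return to RUNNING first.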

FAILED vs STALLED

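The distinction determines how a workflow recovers: FAILED is terminal and needs manual intervention, while STALLED can be retried automatically or resumed manually.

| Category | Examples | Recovery |
| --- | --- | --- |
| FAILED | Contract revert, invalid input, authorization error | Manual intervention required |
| STALLED | RPC timeout, Arweave upload failed, rate limit | Auto-retry or manual resume |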
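
A minimal sketch of this classification follows. The error codes are hypothetical, one per example in the table; the source does not specify the engine's actual codes.

```typescript
type FailureCategory = "FAILED" | "STALLED";

// Hypothetical error codes mirroring the examples in the table above.
type ErrorCode =
  | "CONTRACT_REVERT"
  | "INVALID_INPUT"
  | "AUTHORIZATION_ERROR"
  | "RPC_TIMEOUT"
  | "ARWEAVE_UPLOAD_FAILED"
  | "RATE_LIMITED";

// Errors that no amount of retrying can fix.
const UNRECOVERABLE: ReadonlySet<ErrorCode> = new Set<ErrorCode>([
  "CONTRACT_REVERT",
  "INVALID_INPUT",
  "AUTHORIZATION_ERROR",
]);

function classifyFailure(code: ErrorCode): { category: FailureCategory; recoverable: boolean } {
  const recoverable = !UNRECOVERABLE.has(code);
  return { category: recoverable ? "STALLED" : "FAILED", recoverable };
}
```

The `recoverable` flag corresponds to the `error.recoverable` field of the workflow record. This coarse mapping is a default only: per-step rules can refine it, for example when a revert means the work is already on-chain.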

Workflow Record Schema

Each workflow instance is persisted with this structure:
interface WorkflowRecord {
  // === Identity ===
  id: string;           // UUID
  type: WorkflowType;   // "WorkSubmission" | "ScoreSubmission" | "CloseEpoch"
  created_at: number;
  updated_at: number;
  
  // === State ===
  state: WorkflowState; // PENDING | RUNNING | COMPLETED | FAILED | STALLED
  step: string;         // Current step (e.g., "UPLOAD_EVIDENCE")
  step_attempts: number;
  
  // === Immutable Context ===
  input: {
    studio_address: string;
    epoch: number;
    signer_address: string;
    data_hash: string;
    // ... workflow-specific fields
  };
  
  // === Mutable Progress ===
  progress: {
    arweave_tx_id?: string;
    arweave_confirmed?: boolean;
    onchain_tx_hash?: string;
    onchain_confirmed?: boolean;
    // ... step-specific progress
  };
  
  // === Failure Info ===
  error?: {
    step: string;
    message: string;
    code: string;
    recoverable: boolean;
  };
}

Persistence Guarantees

| Guarantee | Meaning |
| --- | --- |
| Write-ahead | State is persisted BEFORE action is taken |
| Atomic transition | State change and progress update are atomic |
| Immutable input | Input fields are never modified after creation |
| Append-only progress | Progress fields are set once, never cleared |
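
The write-ahead and append-only guarantees can be illustrated with a small sketch. Everything here is an assumption made for illustration: `Store`, `StepRecord`, `setProgress`, and `runStep` are hypothetical names, and the store is synchronous only to keep the example short (a real store would be asynchronous and durable).

```typescript
type ProgressValue = string | boolean;
type Progress = Readonly<Record<string, ProgressValue>>;

interface StepRecord {
  readonly step: string;
  readonly progress: Progress;
}

// Hypothetical persistence interface.
interface Store {
  save(record: StepRecord): void;
}

// Append-only: a field may be set once; re-setting the same value is a no-op,
// while changing it to a different value is rejected.
function setProgress(progress: Progress, key: string, value: ProgressValue): Progress {
  const existing = progress[key];
  if (existing !== undefined && existing !== value) {
    throw new Error(`progress.${key} is already set`);
  }
  return { ...progress, [key]: value };
}

// Write-ahead: the new step is persisted BEFORE its action runs, and the
// action's result is persisted as a single record update afterwards.
function runStep(
  store: Store,
  record: StepRecord,
  step: string,
  action: () => [string, ProgressValue],
): StepRecord {
  const started: StepRecord = { ...record, step };
  store.save(started);
  const [key, value] = action();
  const finished: StepRecord = { ...started, progress: setProgress(started.progress, key, value) };
  store.save(finished);
  return finished;
}
```

If the process crashes between the two `save` calls, the persisted record already names the step that was in flight, which is what lets the step be resumed.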

Reconciliation

Reconciliation ensures the workflow state matches on-chain reality.

When Reconciliation Runs

  1. On Gateway startup — For all workflows in RUNNING or STALLED state
  2. Before step execution — Optionally, to skip already-completed steps
  3. After timeout — When a step has been pending too long
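
The three triggers above could be expressed as a single predicate. This is a sketch under stated assumptions: `updated_at` is taken to be epoch milliseconds, and `stepTimeoutMs` and `reconcileBeforeStep` are hypothetical settings that the source does not name.

```typescript
type WorkflowState = "PENDING" | "RUNNING" | "STALLED" | "COMPLETED" | "FAILED";
type Trigger = "startup" | "before_step" | "timeout_check";

interface ReconcileCandidate {
  state: WorkflowState;
  updated_at: number; // assumed to be epoch milliseconds
}

interface ReconcileOptions {
  reconcileBeforeStep: boolean; // the "optionally" in trigger 2
  stepTimeoutMs: number;        // hypothetical threshold for "pending too long"
}

function shouldReconcile(
  wf: ReconcileCandidate,
  trigger: Trigger,
  nowMs: number,
  opts: ReconcileOptions,
): boolean {
  switch (trigger) {
    case "startup":
      // Only workflows that may have been interrupted mid-flight.
      return wf.state === "RUNNING" || wf.state === "STALLED";
    case "before_step":
      return opts.reconcileBeforeStep && wf.state === "RUNNING";
    case "timeout_check":
      return wf.state === "RUNNING" && nowMs - wf.updated_at > opts.stepTimeoutMs;
  }
}
```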

Reconciliation Algorithm

function reconcile(workflow: WorkflowRecord): WorkflowRecord {
  // Step 1: Query on-chain state
  const onchainState = queryOnchainState(workflow.input);
  
  // Step 2: Compare and determine true state
  if (onchainState.workSubmitted && workflow.state !== "COMPLETED") {
    // Work already on-chain, workflow should be complete
    return workflow.transitionTo("COMPLETED");
  }
  
  // Step 3: Check pending tx status
  if (workflow.progress.onchain_tx_hash) {
    const txStatus = getTxReceipt(workflow.progress.onchain_tx_hash);
    
    if (txStatus === "confirmed") {
      return workflow.advanceStep();
    }
    if (txStatus === "reverted") {
      return workflow.fail({ reason: "tx_reverted" });
    }
    if (txStatus === "not_found") {
      // Tx was never mined, safe to retry
      return workflow.retryStep();
    }
  }
  
  // Step 4: No changes needed
  return workflow;
}

Reconciliation Rules

| Rule | Description |
| --- | --- |
| On-chain wins | If on-chain says it’s done, it’s done |
| Tx hash is checkpoint | If we have a tx hash, check its fate before retrying |
| No tx hash = safe to retry | If no tx hash, the action never happened |
| Arweave is append-only | If we have an Arweave tx ID, the content exists |

Transaction Submission Flow

Key Points:
  • Lock is held for entire submit → confirm cycle
  • Only one tx per signer can be in flight
  • Crash between steps 4-5 → reconcile via nonce check
  • Crash between steps 5-8 → reconcile via persisted tx hash
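
The "one tx per signer in flight" rule can be sketched as a registry keyed by signer address. The class and method names are hypothetical, and the lock lives in memory only, so after a crash the in-flight state would have to be rebuilt through reconciliation rather than read from this structure.

```typescript
// Hypothetical in-memory registry: signer address -> workflow id holding the lock.
class SignerLocks {
  private readonly inFlight = new Map<string, string>();

  // Hex addresses differ only by checksum casing, so compare case-insensitively.
  private static key(signer: string): string {
    return signer.toLowerCase();
  }

  // Returns true if the lock was taken, false if another tx is already in flight.
  tryAcquire(signer: string, workflowId: string): boolean {
    const key = SignerLocks.key(signer);
    if (this.inFlight.has(key)) return false;
    this.inFlight.set(key, workflowId);
    return true;
  }

  // Only the workflow that holds the lock can release it.
  release(signer: string, workflowId: string): void {
    const key = SignerLocks.key(signer);
    if (this.inFlight.get(key) === workflowId) this.inFlight.delete(key);
  }
}
```

A caller would acquire the lock before submitting and release it only after confirmation, so the lock spans the whole submit → confirm cycle.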

Failure Handling

Retry Policy

const RetryPolicy = {
  max_attempts: 5,
  initial_delay_ms: 1000,
  max_delay_ms: 60000,
  backoff_multiplier: 2.0,
  jitter: true
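};

Nominal delays under this policy, before jitter is applied:

| Attempt | Delay |
| --- | --- |
| 1 | 1s |
| 2 | 2s |
| 3 | 4s |
| 4 | 8s |
| 5 | 16s |
| 6+ | → STALLED or FAILED |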
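
The delay schedule follows from the policy fields: each attempt multiplies the previous delay by `backoff_multiplier`, capped at `max_delay_ms`. The sketch below reproduces that arithmetic. The jitter strategy is an assumption: the source says only `jitter: true`, so "full jitter" (a uniform draw between 0 and the nominal delay) is used here as one common choice.

```typescript
interface RetryPolicyConfig {
  max_attempts: number;
  initial_delay_ms: number;
  max_delay_ms: number;
  backoff_multiplier: number;
  jitter: boolean;
}

const policy: RetryPolicyConfig = {
  max_attempts: 5,
  initial_delay_ms: 1000,
  max_delay_ms: 60000,
  backoff_multiplier: 2.0,
  jitter: true,
};

// Nominal delay for a 1-based attempt number, capped at max_delay_ms.
function nominalDelayMs(p: RetryPolicyConfig, attempt: number): number {
  const raw = p.initial_delay_ms * Math.pow(p.backoff_multiplier, attempt - 1);
  return Math.min(raw, p.max_delay_ms);
}

// Assumed "full jitter": uniform in [0, nominal). `random` is injectable for tests.
function delayWithJitterMs(
  p: RetryPolicyConfig,
  attempt: number,
  random: () => number = Math.random,
): number {
  const base = nominalDelayMs(p, attempt);
  return p.jitter ? Math.floor(random() * base) : base;
}

// Attempts beyond max_attempts are not retried; the workflow stalls or fails.
function retriesExhausted(p: RetryPolicyConfig, attempt: number): boolean {
  return attempt > p.max_attempts;
}
```

With this policy the cap of 60 s is never reached within five attempts; it would only matter if `max_attempts` were raised to seven or more.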

Per-Step Failure Handling

| Step | Failure | Action |
| --- | --- | --- |
| UPLOAD_EVIDENCE | Network error | Retry with backoff |
| UPLOAD_EVIDENCE | Arweave rejects (invalid) | → FAILED |
| UPLOAD_EVIDENCE | Arweave rejects (no funds) | → STALLED |
| SUBMIT_ONCHAIN | Nonce too low | Reconcile, retry |
| SUBMIT_ONCHAIN | Contract revert: “already submitted” | Reconcile → COMPLETED |
| SUBMIT_ONCHAIN | Contract revert: “epoch closed” | → FAILED |
| AWAIT_TX_CONFIRM | Tx pending | Keep polling |
| AWAIT_TX_CONFIRM | Tx not found after timeout | Retry submission |
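
The per-step table above lends itself to a data-driven lookup. In this sketch the step names come from the table, while the failure identifiers and action names are hypothetical labels invented for illustration.

```typescript
type Step = "UPLOAD_EVIDENCE" | "SUBMIT_ONCHAIN" | "AWAIT_TX_CONFIRM";

type FailureAction =
  | "RETRY_WITH_BACKOFF"
  | "FAIL"
  | "STALL"
  | "RECONCILE_THEN_RETRY"
  | "RECONCILE_TO_COMPLETED"
  | "KEEP_POLLING"
  | "RETRY_SUBMISSION";

// One entry per table row, keyed by "<step>:<failure>" (hypothetical identifiers).
const FAILURE_RULES = new Map<string, FailureAction>([
  ["UPLOAD_EVIDENCE:network_error", "RETRY_WITH_BACKOFF"],
  ["UPLOAD_EVIDENCE:arweave_rejected_invalid", "FAIL"],
  ["UPLOAD_EVIDENCE:arweave_rejected_no_funds", "STALL"],
  ["SUBMIT_ONCHAIN:nonce_too_low", "RECONCILE_THEN_RETRY"],
  ["SUBMIT_ONCHAIN:revert_already_submitted", "RECONCILE_TO_COMPLETED"],
  ["SUBMIT_ONCHAIN:revert_epoch_closed", "FAIL"],
  ["AWAIT_TX_CONFIRM:tx_pending", "KEEP_POLLING"],
  ["AWAIT_TX_CONFIRM:tx_not_found_after_timeout", "RETRY_SUBMISSION"],
]);

// Returns undefined for combinations the table does not cover, leaving the
// policy for unknown failures to the caller.
function resolveFailureAction(step: Step, failure: string): FailureAction | undefined {
  return FAILURE_RULES.get(`${step}:${failure}`);
}
```

Keeping the rules as data makes it easy to check that the same revert can lead to different outcomes depending on its reason: "already submitted" resolves to completion, while "epoch closed" is terminal.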

Invariants Summary

| # | Invariant | Enforcement |
| --- | --- | --- |
| 1 | Workflows are explicit state machines | Schema enforces states and transitions |
| 2 | Every step is idempotent | Actions check preconditions before acting |
| 3 | Every step is resumable | Write-ahead persistence |
| 4 | On-chain state is authoritative | Reconciliation queries on-chain |
| 5 | Crash tolerance | No in-memory-only state |
| 6 | Per-signer serialization | TX Queue holds lock per signer |
| 7 | Retries are bounded | Retry policy limits attempts |
| 8 | Progress is append-only | Once set, never cleared |