
© 2026 Jake Spencer Walklate. All rights reserved.

Open Source · Bioinformatics · Rust · Tauri

OpenBio Operating System

A master architecture and implementation guide for a "Biological Operating System" that bridges the gap between physical lab inventory, digital experimental notes, and computational analysis.

The Problem & Solution

Why the current bio-software landscape fails, and how OpenBio fixes it.

The "Air-Gapped" Reality

PROBLEM

Physical samples, protocols, and digital data live on systems that don't talk to each other. Six months later, a 50GB file is useless because you don't know which patient it came from.

The Unity Schema

SOLUTION

We force a hard link. A digital file implies an Experiment, which implies a Physical Sample. Clicking a data point traces all the way back to the freezer slot.
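The hard link can be sketched in Rust. The type and field names below are illustrative, not OpenBio's actual schema: the point is that a `DataFile` cannot exist without an `experiment_id`, and an `Experiment` cannot exist without a `sample_id`, so a lookup always terminates at a freezer slot.

```rust
// Illustrative sketch of the Unity Schema's hard links (type and field
// names are hypothetical, not OpenBio's real schema).
pub struct Sample {
    pub id: String,
    pub box_id: String,
    pub slot: String,
}

pub struct Experiment {
    pub id: u32,
    pub sample_id: String, // required: an Experiment implies a Sample
}

pub struct DataFile {
    pub name: String,
    pub experiment_id: u32, // required: a file implies an Experiment
}

/// Trace a data file back to the freezer slot of its source sample.
pub fn trace(file: &DataFile, experiments: &[Experiment], samples: &[Sample]) -> Option<String> {
    let exp = experiments.iter().find(|e| e.id == file.experiment_id)?;
    let sample = samples.iter().find(|s| s.id == exp.sample_id)?;
    Some(format!("{} -> Box {}, Slot {}", sample.id, sample.box_id, sample.slot))
}
```

Because the foreign keys are non-optional, "orphaned" files are unrepresentable by construction rather than merely discouraged.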

The "Bus Factor"

PROBLEM

Vital knowledge lives in people's heads. If "Steve who knows where the samples are" leaves, the lab grinds to a halt.

Database as Truth

SOLUTION

The Inventory Module is the source of truth, not memory. Git-backed protocols mean every change is timestamped and authored.

The "IT Barrier"

PROBLEM

Existing enterprise software is too expensive; open-source tools require a DevOps degree. Small labs are stuck on Excel.

Tauri Hub

SOLUTION

"Enterprise-grade" structure in a "Double-click install" executable. A PhD student can set up a fully traceable lab system in 5 minutes.

The "Black Box" of Analysis

PROBLEM

Biologists can't code Python/R. They wait weeks for bioinformaticians to generate static PDF plots they can't explore.

WASM Insight Module

SOLUTION

We wrap complex math in a friendly UI. Biologists can "Gate" and "Test" without writing code, empowering domain experts.
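The "Test" button hides standard statistics behind the UI. As a sketch of the kind of math involved (the function name and exact test choice are ours, not confirmed from the source), here is Welch's t statistic for two gated groups:

```rust
/// Welch's t statistic for two groups — a sketch of the math a "Test"
/// button could run behind the UI. Hypothetical helper, stdlib only.
pub fn welch_t(a: &[f64], b: &[f64]) -> f64 {
    let mean = |xs: &[f64]| xs.iter().sum::<f64>() / xs.len() as f64;
    // Sample variance (n - 1 denominator).
    let var = |xs: &[f64], m: f64| {
        xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / (xs.len() as f64 - 1.0)
    };
    let (ma, mb) = (mean(a), mean(b));
    let (va, vb) = (var(a, ma), var(b, mb));
    (ma - mb) / (va / a.len() as f64 + vb / b.len() as f64).sqrt()
}
```

The biologist only ever sees "Group A vs Group B: significant"; the arithmetic runs in the WASM engine.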

From Freezer to Insight

1. The Inventory (Morning)

Scan QR on tube (Sample P-405) and Box 4.
Link: Database knows Sample P-405 is in Box 4, Slot A1.

2. The Experiment Setup (Noon)

Create "New Experiment" and select Sample P-405.
Link: Experiment 505 contains Sample P-405.

3. The Lab Work (Afternoon)

Open Experiment 505 notebook. Note: "Used Protocol A, but added extra reagent. @Sample-P-405."
Link: Metadata snapshots preserved via @mentions.

4. The Data Haul (Next Day)

Sequencer finishes. Ingest Agent (or manual upload) detects file.
Link: File `run_data.fastq` belongs to Experiment 505.

5. The Processing (Nextflow)

OpenBio triggers the Nextflow wrapper, which turns raw sequence text into a count matrix (`matrix.mtx`).

6. The Insight (Visualization)

WASM Engine loads matrix. You hover a red dot: "High Insulin. Sample P-405 (Box 4). Note: 'extra reagent'."
Payoff: Full traceability.
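The `matrix.mtx` from step 5 is MatrixMarket coordinate format: comment lines starting with `%`, a dimensions line, then one `row col value` triplet per line. A minimal parser sketch (stdlib only; real single-cell matrices are large and sparse, so production code would stream rather than collect):

```rust
/// Minimal sketch of reading a MatrixMarket coordinate file such as a
/// `matrix.mtx` count matrix. Returns (rows, cols, triplets).
pub fn parse_mtx(src: &str) -> Option<(usize, usize, Vec<(usize, usize, u32)>)> {
    // Skip "%" comment/header lines and blanks.
    let mut lines = src.lines().filter(|l| !l.starts_with('%') && !l.trim().is_empty());
    let mut dims = lines.next()?.split_whitespace();
    let rows: usize = dims.next()?.parse().ok()?;
    let cols: usize = dims.next()?.parse().ok()?;
    let mut entries = Vec::new();
    for line in lines {
        let mut it = line.split_whitespace();
        let r: usize = it.next()?.parse().ok()?; // 1-based gene index
        let c: usize = it.next()?.parse().ok()?; // 1-based cell index
        let v: u32 = it.next()?.parse().ok()?;   // count
        entries.push((r, c, v));
    }
    Some((rows, cols, entries))
}
```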

Architecture

Technology Stack

  • The Client (Tauri Desktop App) Tauri v2, Vite + React + TS, TanStack Query, ShadCN/UI. WASM Engine (Rust) for Insight.
  • The Server (Rust API) Axum (HTTP), Prisma Client Rust (DB), Abstracted Storage (LocalFS/S3).
  • Deployment Embedded (runs inside Tauri for solo/small labs) or Headless (Docker for Enterprise).

Deployment Tiers

  • Tier 1: Solo Mode Single machine, offline, local SQLite.
  • Tier 2: Small Lab (Hub & Spoke) One machine acts as Hub (mDNS broadcast), others connect via LAN.
  • Tier 3: Enterprise Remote Docker API + Postgres + S3.
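The three tiers could reduce to a single enum at startup. This is a hypothetical sketch; the config strings and type names are illustrative, not OpenBio's actual settings keys:

```rust
/// Hypothetical tier selection from a config value (names are ours).
#[derive(Debug, PartialEq)]
pub enum DeploymentTier {
    Solo,       // Tier 1: single machine, offline, local SQLite
    Hub,        // Tier 2: one machine broadcasts via mDNS, others join over LAN
    Enterprise, // Tier 3: remote Docker API + Postgres + S3
}

pub fn parse_tier(s: &str) -> Option<DeploymentTier> {
    match s.to_ascii_lowercase().as_str() {
        "solo" => Some(DeploymentTier::Solo),
        "hub" | "small-lab" => Some(DeploymentTier::Hub),
        "enterprise" => Some(DeploymentTier::Enterprise),
        _ => None,
    }
}
```

Everything downstream (storage backend, discovery, auth) can then branch on one value instead of scattering mode checks through the codebase.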

Handling Large Files (50GB+)

How do we get data from a 50GB fastq file into a WebGL visualization without exhausting 64GB of RAM?

  • Memory Mapping (Mmap): Rust Core uses `memmap2` to map file on disk to virtual address space, avoiding full load.
  • SharedArrayBuffer Pipeline: Zero-copy transfer. Rust reads chunks → Web Worker (WASM) via IPC → SharedArrayBuffer.
  • React-less Compute: React only sends coordinates. WASM Worker iterates SAB and updates Selection Bitmask.
  • WebGL Renderer: Reads directly from SAB as Vertex Buffer Object.
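The "Selection Bitmask" step can be sketched in plain Rust. The function below marks every point inside a rectangular gate in a packed bitmask, one bit per point, exactly the shape of buffer a WASM worker could share with a renderer (names and signature are illustrative, stdlib only):

```rust
/// Sketch of the React-less compute step: given flat x/y coordinate
/// buffers (as they would sit in a SharedArrayBuffer), set the bit for
/// every point inside a rectangular gate. One bit per point.
pub fn gate_rect(xs: &[f32], ys: &[f32], x0: f32, x1: f32, y0: f32, y1: f32, mask: &mut [u8]) {
    for i in 0..xs.len() {
        let inside = xs[i] >= x0 && xs[i] <= x1 && ys[i] >= y0 && ys[i] <= y1;
        if inside {
            mask[i / 8] |= 1 << (i % 8); // set bit: point is selected
        } else {
            mask[i / 8] &= !(1 << (i % 8)); // clear bit: point is outside the gate
        }
    }
}
```

React never touches the million-point buffers; it only ships the four gate coordinates, and the renderer reads the same bitmask the worker wrote.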

Modules

Module A: The Freezer

Inventory & Identity. Polymorphic storage (Facility → Box).

Module B: Notebook

Git-backed Markdown protocols with smart linking (@Sample_ID).
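Smart linking amounts to pulling `@` tokens out of notebook text. A stdlib-only sketch (the exact token rules are a guess at the real linker's grammar):

```rust
/// Extract @mentions (e.g. "@Sample-P-405") from a notebook entry.
/// Token rules here are illustrative, not OpenBio's actual grammar.
pub fn mentions(text: &str) -> Vec<String> {
    text.split(|c: char| c.is_whitespace())
        .filter_map(|tok| {
            // Strip surrounding punctuation, keep @, -, _ and alphanumerics.
            let tok = tok.trim_matches(|c: char| !c.is_alphanumeric() && c != '@' && c != '-' && c != '_');
            tok.strip_prefix('@').map(|name| name.to_string())
        })
        .filter(|name| !name.is_empty())
        .collect()
}
```

Run against the note from the walkthrough, "Used Protocol A, but added extra reagent. @Sample-P-405.", this yields `["Sample-P-405"]`, which the Notebook can resolve against the Inventory Module.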

Module C: Pipeline Automator

Rust wrapper around Nextflow. Dynamic config & live streaming logs.
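The core of "live streaming logs" is just a piped child process read line by line. A stdlib sketch; in the test below `echo` stands in for a real `nextflow run` invocation:

```rust
use std::io::{BufRead, BufReader};
use std::process::{Command, Stdio};

/// Sketch of Module C's approach: spawn an external pipeline process and
/// stream its stdout line by line. Hypothetical helper, stdlib only.
pub fn run_and_stream(program: &str, args: &[&str]) -> std::io::Result<Vec<String>> {
    let mut child = Command::new(program)
        .args(args)
        .stdout(Stdio::piped())
        .spawn()?;
    let reader = BufReader::new(child.stdout.take().expect("stdout was piped"));
    let mut lines = Vec::new();
    for line in reader.lines() {
        let line = line?;
        println!("[pipeline] {line}"); // forward each log line to the UI as it arrives
        lines.push(line);
    }
    child.wait()?;
    Ok(lines)
}
```

Reading line by line means the UI shows progress immediately instead of waiting for the pipeline to exit.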

Module D: Ingest Agent

Separate binary on instrument PCs. Auto-uploads & tags files.

Module E: Insight

WASM/WebGL Single-Cell Explorer. Zero-copy rendering.

Module F: Library

Zotero-like reference manager. Auto-bibliography generation.

Development Roadmap

Phase 1: Client scaffold, Core server (Embedded/Hub), Config wizard.
Phase 2: Database (Prisma), Inventory schemas, Box Grid UI.
Phase 3: Networking & Discovery (mDNS), Connection Settings UI.
Phase 4: Pipelines (Nextflow wrapper), Ingest Agent CLI.
Phase 5: Insight analysis & t-test automation.
Phase 6: Docker image release and enterprise licensing.
Under Development · View source →