Bioinformatics tools
your lab will actually use.
We build practical software, pipelines, and workflows for research labs — shaped by the problems you bring to us, not by what looks good on a slide deck.
Research labs lose weeks on problems that should already be solved.
Most bioinformatics software is too complex, too fragile, or built for a slightly different problem than the one you have. Labs end up maintaining workarounds instead of doing science.
Pipelines break on your data
Tools built for model organisms or large cohorts fail silently — or throw cryptic errors — when your dataset doesn't fit the expected format.
Setup takes longer than the analysis
Conda environments, Docker images, config files, reference downloads — the overhead before any actual science can take days and requires expert knowledge.
Documentation assumes you already know
Academic tools are written by researchers who already understand the method. If you're new to the workflow, you're largely on your own with a sparse README.
Results are hard to interpret and share
Raw output files work fine for whoever ran them. Sharing results with collaborators, PIs, or clinicians usually means manual reformatting and a lot of explaining.
Tools aren't built for your organism or context
Non-model organisms, unusual experimental designs, or small clinical cohorts often require workarounds that quietly compromise result quality.
Reproducibility is an afterthought
Running the same analysis six months later — or on a different server — should not require detective work. But without disciplined tooling, it usually does.
Tools shaped by your work.
Not the other way around.
Pilot labs work directly with us to define what gets built. You bring the problem; we build something that actually solves it for your specific data, organism, and infrastructure.
Apply for the pilot program
Direct influence over what gets built
You describe your real workflow pain points. We design around your data, your organism, your computing environment — not a generic use case.
Early access to finished tools
Pilot participants receive working tools before public release and can use them for live research immediately. No waiting for a paper to accompany the code.
Tools designed to stay free for academic use
Our aim is to keep validated tools openly available to academic labs. If your lab helped shape a tool, you won't be charged to keep using it.
No commitment beyond an initial conversation
We start with a short discussion about what you need. If it's a good fit, we move forward together. If not, we'll say so and suggest alternatives.
Acknowledgment in publications and releases
Collaborating labs are credited in tool documentation, preprints, and any peer-reviewed releases. Substantive methodological contributions open the door to co-authorship.
Tools for problems that show up every week in real labs.
These are examples of the kinds of solutions we develop with and for research groups. Every project starts from a real need — not from a technology looking for an application.
Variant calling with automatic QC reports
End-to-end short-read variant calling from raw FASTQ to annotated VCF, with per-sample QC metrics and a shareable HTML report. Supports hg38 and custom references.
- BWA-MEM2 + GATK4 HaplotypeCaller
- VEP annotation with ClinVar integration
- Automated HTML + TSV output
- Reproducible with Snakemake or Nextflow
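As an illustration of the chain above, here is a minimal Python sketch that composes the per-sample commands: BWA-MEM2 alignment, sorting, and GVCF-mode calling with GATK4 HaplotypeCaller. File names, thread counts, and flags are hypothetical placeholders, and in practice the steps are orchestrated by Snakemake or Nextflow rather than assembled by hand.

```python
from pathlib import Path

def variant_calling_commands(sample: str, ref: str, fastq_dir: str = "fastq"):
    """Compose the core command chain for one sample: alignment, sorting,
    and germline variant calling. Paths and flags are illustrative
    placeholders, not a tuned production configuration."""
    fq1 = Path(fastq_dir) / f"{sample}_R1.fastq.gz"
    fq2 = Path(fastq_dir) / f"{sample}_R2.fastq.gz"
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.g.vcf.gz"
    return [
        # Align paired-end reads and sort the output in one pipe
        f"bwa-mem2 mem -t 8 {ref} {fq1} {fq2} | samtools sort -o {bam} -",
        f"samtools index {bam}",
        # Call variants per sample in GVCF mode for later joint genotyping
        f"gatk HaplotypeCaller -R {ref} -I {bam} -O {vcf} -ERC GVCF",
    ]

cmds = variant_calling_commands("S01", "GRCh38.fa")
```

Composing commands as data, rather than running them inline, is what lets a workflow engine cache, parallelize, and resume each step.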
Differential expression for non-model organisms
Bulk RNA-seq analysis that works when there's no reference genome — or only a poorly annotated one. De novo assembly, pseudo-alignment, and DESeq2/edgeR wrapped in a clean pipeline.
- Trinity de novo assembly or reference-based
- Salmon/kallisto pseudo-alignment
- DESeq2 and edgeR with automatic visualizations
- Gene ontology enrichment reports
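To make the core quantity concrete, here is a toy Python sketch of the per-gene log2 fold change that DESeq2 and edgeR estimate properly, with dispersion modeling, shrinkage, and multiple-testing correction, all of which this sketch deliberately omits. The gene names and counts are invented.

```python
import math

def log2_fold_changes(counts_a, counts_b, pseudocount=1.0):
    """Per-gene log2 fold change between two conditions, from mean
    normalized counts per replicate. A pseudocount keeps the ratio
    defined when a gene has zero counts in one condition."""
    lfc = {}
    for gene in counts_a:
        mean_a = sum(counts_a[gene]) / len(counts_a[gene])
        mean_b = sum(counts_b[gene]) / len(counts_b[gene])
        lfc[gene] = math.log2((mean_b + pseudocount) / (mean_a + pseudocount))
    return lfc

# Invented example: geneX roughly quadruples, geneY is unchanged
lfc = log2_fold_changes(
    {"geneX": [10, 12, 11], "geneY": [100, 90, 110]},
    {"geneX": [40, 44, 45], "geneY": [95, 105, 100]},
)
```

The real pipelines report this alongside standard errors and adjusted p-values, which is what makes the fold change interpretable.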
scRNA-seq analysis from count matrix to cell atlas
Structured single-cell analysis with Seurat or Scanpy, from raw count matrices through clustering, marker identification, and trajectory inference. Outputs structured data objects and clean figures.
- QC, normalization, dimensionality reduction
- Leiden/Louvain clustering + annotation
- Trajectory inference with PAGA / Monocle3
- Cell type deconvolution for bulk data
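As a flavor of the first QC step, here is a minimal pure-Python sketch of cell filtering on a cells-by-genes count matrix. The thresholds and the toy matrix are invented; on real data this is done with Seurat or Scanpy, and cutoffs are chosen per dataset.

```python
def qc_filter(matrix, min_genes=200, min_counts=500):
    """Return indices of cells passing basic QC on a cells x genes
    count matrix: enough detected genes and enough total counts.
    Default thresholds are illustrative, not recommendations."""
    kept = []
    for i, cell in enumerate(matrix):
        n_genes = sum(1 for c in cell if c > 0)  # genes with nonzero counts
        n_counts = sum(cell)                     # total counts in the cell
        if n_genes >= min_genes and n_counts >= min_counts:
            kept.append(i)
    return kept

# Tiny invented matrix: three cells, three genes
kept = qc_filter(
    [[5, 0, 3],
     [1, 0, 0],
     [4, 4, 4]],
    min_genes=2, min_counts=5,
)
```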
Lab data management and format conversion tooling
Command-line tools and lightweight web interfaces for turning messy lab data — mass spec output, plate reader files, sequencer manifests — into structured, reproducible data tables.
- Batch format conversion with validation
- Metadata linking and sample tracking
- Schema-based output compatible with ENA / NCBI SRA
- Runs locally, no cloud upload required
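A minimal sketch of what "conversion with validation" means in practice: parse a tab-separated export, check required columns, coerce types, and fail loudly with the offending line number. The column names here are hypothetical, not a real lab's schema.

```python
import csv
import io

REQUIRED = ["sample_id", "organism", "concentration_ng_ul"]  # hypothetical schema

def convert_plate_export(tsv_text: str):
    """Validate a tab-separated plate-reader export and return clean
    row dicts, with numeric fields coerced and errors reported by line."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = []
    for lineno, row in enumerate(reader, start=2):  # line 1 is the header
        if not row["sample_id"].strip():
            raise ValueError(f"line {lineno}: empty sample_id")
        row["concentration_ng_ul"] = float(row["concentration_ng_ul"])
        rows.append(row)
    return rows

rows = convert_plate_export(
    "sample_id\torganism\tconcentration_ng_ul\n"
    "S1\tE. coli\t12.5\n"
)
```

Validating at conversion time, rather than at analysis time, is what keeps a bad row from silently corrupting downstream results.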
Proteomics data processing and differential abundance
From MaxQuant / DIA-NN output to interpretable results. Handles missing value imputation, normalization strategies, and statistical testing with volcano plots and pathway enrichment.
- LFQ and TMT / iTRAQ support
- Perseus-compatible + R-based workflow option
- Protein complex and interaction enrichment
- Exportable report for clinical collaborators
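As one concrete example of the imputation-and-normalization step, here is a small Python sketch: per-sample median normalization, with missing intensities imputed at half the sample minimum. This is one common conservative choice, not the only strategy available, and the numbers in the example are invented.

```python
import statistics

def impute_and_normalize(intensities):
    """For each sample (a list of protein intensities, None = missing),
    impute missing values at half the sample minimum, then divide by
    the sample median so samples are comparable."""
    normalized = []
    for sample in intensities:
        observed = [v for v in sample if v is not None]
        med = statistics.median(observed)
        floor = min(observed) / 2  # conservative left-censored imputation
        normalized.append(
            [(v if v is not None else floor) / med for v in sample]
        )
    return normalized

# One invented sample with a single missing intensity
result = impute_and_normalize([[10.0, 20.0, None, 40.0]])
```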
Don't see your use case?
We build new tools from scratch. If you have a problem that doesn't fit existing software, tell us about it. That's exactly the kind of work we're here for.
Describe your problem
We're onboarding a small number of labs this quarter.
We work closely with each lab we take on, which means we keep the number small on purpose. If your lab has a real bioinformatics bottleneck and you want something built to fit your exact situation, now is the time to reach out.
No sales calls. No decks. Just a short form and a conversation.
Common questions.
If your question isn't here, just ask us directly.
What kinds of labs do you work with?
Primarily academic and clinical research labs working with sequencing, mass spectrometry, imaging, or other quantitative biological data. We've worked with groups ranging from two-person PhD labs to multi-PI core facilities. The common thread is a real, recurring data problem — not a one-off analysis request.
What does the pilot cost?
The pilot is currently free for academic labs. We're building the tools we want to exist, and working with real labs is the best way to make sure they actually work. In exchange, we ask for your time — a few hours of scoping, feedback during development, and a short validation pass when the tool is ready.
Who owns the code and the data?
We retain the code and release it under an open-source license for academic use. Commercial use requires a separate agreement. Your data, your results, and your methods remain entirely yours. We do not store, access, or use your research data for anything beyond building and testing the tool you asked for.
How long until we have something we can use?
It depends on the scope, but a typical first deliverable — something you can run on your own data — arrives within four to six weeks of the initial scoping session. More complex tools or multi-stage pipelines take longer. We break work into milestones so you're not waiting months to see anything.
What if the first version doesn't work for us?
We iterate. The whole point of working closely with a pilot lab is that we can adjust based on real results, not hypothetical ones. If a first version misses something important, we revise it. If the underlying problem turns out to be unsolvable with the approach we chose, we'll tell you honestly and we'll discuss alternatives together.
Do we need specific computing infrastructure?
No. We design tools to run on whatever you have — local workstation, HPC cluster, or cloud. If your lab uses SLURM, we support SLURM. If you prefer to run things on a laptop first and scale later, we make that possible. We'll ask about your environment during scoping and design accordingly.
Will our contribution be credited?
Yes, and the same applies in the other direction. Collaborating labs are always acknowledged in tool documentation and releases. If a lab contributes substantially to the methodological design — providing benchmarking data, expert domain feedback, or biological validation — we'll discuss co-authorship on any resulting preprint or publication. We approach this the same way a collaborative academic project would.
Tell us about your lab's problem.
We read every message and respond within two working days. You don't need a polished brief — a few sentences about what's blocking you is enough to start a conversation.