User Defined Open Source Nextflow Workflow

Steps for successfully running an open source GitHub repo workflow on the Form Bio platform

Create a Private Fork of a Public Repository

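Make a private fork of a public GitHub repo. GitHub does not allow a fork of a public repo to be private, so the "fork" here is a new private repo that mirrors the public one.

Download and install the GitHub CLI (gh) to create the private repo from the command line, or create a new repository from the GitHub web UI.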


gh repo create --private formbio/$GITHUB_REPO_NAME

git clone --bare https://github.com/$GITHUB_REPO_ORG/$GITHUB_REPO_NAME.git
cd $GITHUB_REPO_NAME.git
git push --mirror https://github.com/formbio/$GITHUB_REPO_NAME.git
cd ..

Then clone the private repo and add the original repo as an upstream remote, so you can pull changes from it and keep your fork in sync. Disabling the push URL guards against accidentally pushing to the original repo:

git clone https://github.com/formbio/$GITHUB_REPO_NAME.git
cd $GITHUB_REPO_NAME
git remote add upstream https://github.com/$GITHUB_REPO_ORG/$GITHUB_REPO_NAME.git
git remote set-url --push upstream DISABLE

# Fetch the latest changes from the original repo and rebase local commits on top of them
# (assumes the upstream default branch is named main)
git pull upstream main --rebase

Update Your Forked Repo

Create the correct structure

  1. Ensure main.nf and nextflow.config live in the root of the repo
  2. Create a “workflows” directory inside the repo, if one does not already exist.
  3. Inside that directory, create another directory with the same name as the ID of the workflow (if one does not already exist)
  4. Inside that directory, create a workflow.json file, and symlink any documentation that already exists in the repo under the names overview.md, citations.md, inputs.md, and outputs.md
    1. You can also create your own documentation files if you’d like
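The layout steps above can be sketched as a small shell helper. This is a minimal sketch, not part of the Form Bio tooling: the function names scaffold_workflow_dir and link_doc are hypothetical, and the documentation source paths you pass in depend on what the original repo actually contains.

```shell
#!/bin/sh
# Sketch only: scaffold workflows/<id>/ inside a fork.
# Function names and example paths are hypothetical.

# scaffold_workflow_dir REPO_ROOT WORKFLOW_ID
scaffold_workflow_dir() {
  repo_root="$1"
  workflow_id="$2"
  wf_dir="$repo_root/workflows/$workflow_id"

  # main.nf and nextflow.config are expected in the repo root.
  for f in main.nf nextflow.config; do
    [ -f "$repo_root/$f" ] || echo "warning: $f missing from repo root" >&2
  done

  mkdir -p "$wf_dir"
  # Empty placeholder; the converted schema replaces it later.
  [ -e "$wf_dir/workflow.json" ] || touch "$wf_dir/workflow.json"
}

# link_doc REPO_ROOT WORKFLOW_ID SOURCE_PATH TARGET_NAME
# Symlinks an existing doc into workflows/<id>/ under the expected name.
link_doc() {
  src="$1/$3"
  dest="$1/workflows/$2/$4"
  if [ -f "$src" ] && [ ! -e "$dest" ]; then
    # Relative target, resolved from inside workflows/<id>/.
    ln -s "../../$3" "$dest"
  fi
}

# Example usage from the root of the fork (names are placeholders):
#   scaffold_workflow_dir . my-workflow
#   link_doc . my-workflow README.md overview.md
```

The symlink target is relative so that it still resolves after the repo is cloned elsewhere. Docs that do not exist in the repo are skipped, and can be written by hand instead.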

Convert JSON schema

  1. Convert the repo’s existing JSON schema to a Form Bio workflow schema in workflow.json, using our Schema Guide as a reference
    1. If you have Node.js installed on your machine, a script in our workflow-schema repo can help with this process. To use it:
      1. Clone the repo to your local machine
      2. cd into the repo: workflow-schema/
      3. Run sh scripts/json-schemaToV3.sh <path to json-schema file>
      4. This outputs a workflow.json in our format in the json-schema file’s parent directory
        1. Go through the output carefully: fill in missing fields such as id, fix any field types the conversion script got wrong, and verify that everything looks correct
        2. Then move it to the correct workflows/[id] directory
  2. Some important notes:
    1. Ensure the value you give the id property matches the parent directory’s name
    2. If the schema includes a field for specifying an output directory, make sure to hide it
      1. Note the id of the output param; you will need it in later steps
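One of the notes above, that the id property must match the parent directory's name, is easy to check mechanically. The following is a rough sketch under stated assumptions: check_workflow_id is a hypothetical helper, and it assumes workflow.json is pretty-printed with the workflow's own "id" key appearing on its own line before any nested "id" keys. A JSON-aware tool would be more robust.

```shell
#!/bin/sh
# Sketch only: verify that the "id" in workflow.json matches the name of
# the directory that contains it. Heuristic text match, not a JSON parser.

# check_workflow_id PATH_TO_WORKFLOW_JSON
check_workflow_id() {
  wf_json="$1"
  # Name of the directory holding workflow.json, e.g. workflows/<id>/.
  dir_name=$(basename "$(cd "$(dirname "$wf_json")" && pwd)")
  # First line of the form  "id": "value"  in the file.
  json_id=$(sed -n 's/^[[:space:]]*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$wf_json" | head -n 1)

  if [ "$json_id" = "$dir_name" ]; then
    echo "ok: id '$json_id' matches directory name"
  else
    echo "mismatch: id '$json_id' vs directory '$dir_name'" >&2
    return 1
  fi
}

# Example usage (path is a placeholder):
#   check_workflow_id workflows/my-workflow/workflow.json
```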

Update nextflow.config

  1. Beneath the section where default params are defined, include a statement that remaps our output param to theirs:
    1. params.out_dir = "${params.output}", where “out_dir” is the original workflow’s output param name
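If you prefer to script this edit, the remap line can be appended to nextflow.config from the shell. This is a sketch under assumptions: add_output_remap is a hypothetical helper, and it appends the statement at the end of the file, which places it beneath the default params only if nothing later in the config reassigns the same param. Review the result by hand.

```shell
#!/bin/sh
# Sketch only: append the output-param remap to nextflow.config.
# PARAM_NAME is the original workflow's output param (e.g. out_dir).

# add_output_remap CONFIG_FILE PARAM_NAME
add_output_remap() {
  config="$1"
  param="$2"
  [ -f "$config" ] || { echo "error: $config not found" >&2; return 1; }

  # Builds the literal line: params.<param> = "${params.output}"
  line="params.$param = \"\${params.output}\""

  # Append only once so the helper is safe to re-run.
  grep -qF "$line" "$config" || printf '\n%s\n' "$line" >> "$config"
}

# Example usage from the root of the fork:
#   add_output_remap nextflow.config out_dir
```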

Import Workflow