- Creating a Form Bio Nextflow Workflow
- Form Bio Nextflow Runtime Specification
- Workflow Run Quotas / Limitations
- Nextflow Workflow outputs
- params.output - Form Bio Workflow output folder
- Custom workflow output parameter name
- Form Bio Reserved Nextflow Configuration values
- Form Bio Reserved Nextflow Param values
- Form Bio GitHub App Integration
- Importing a new or existing workflow
- Monitor
Creating a Form Bio Nextflow Workflow
Adding a workflow.json schema defining inputs and validation.
Form Bio Nextflow Runtime Specification
Workflow Run Quotas / Limitations
Two quotas currently apply: one limits how many Workflow runs an organization or project can execute concurrently, and the other limits how many parallel process/task VMs a single Workflow run can use.
Either quota can be increased on a per-organization or per-project basis by sending a service request to support@formbio.com that specifies the organization whose quota should be increased.
- Project Concurrent Workflow Runs (Default: 50) - Limits the number of concurrent Workflow runs for a given Form Bio Project (Project Quota increases can be applied across an Org, or on a specific Project).
- Nextflow concurrent processes/tasks per Workflow Run, executor.queueSize (Default: 50) - For a given Workflow run, specifies the maximum number of parallel process/task VMs that can run (Quota increases can be applied per Org, Project, or User Email). See https://www.nextflow.io/docs/latest/config.html#scope-executor
Nextflow Workflow outputs
params.output - Form Bio Workflow output folder
Nextflow outputs can be published to the Form Bio platform's workflow outputs using the reserved Nextflow parameter --output / params.output.
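A minimal sketch of a process that publishes into that folder is shown below. The process name SUMMARIZE, the output file name, and the local fallback value for params.output are hypothetical and only illustrate the pattern; publishDir is the standard Nextflow directive for copying declared outputs to a target path.

```groovy
// main.nf (illustrative sketch; names and the fallback default are assumptions)

// Fallback for local test runs only. On Form Bio the platform supplies params.output.
params.output = './results'

process SUMMARIZE {
    // Copy the declared outputs into the workflow output folder.
    publishDir params.output, mode: 'copy'

    output:
    path 'summary.txt'

    script:
    """
    echo "run complete" > summary.txt
    """
}

workflow {
    SUMMARIZE()
}
```

Only files declared in a process's output block are published, so intermediate files stay in the task work directory.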
Custom workflow output parameter name
Alternatively, if your workflow already has a defined output parameter, you can map it to the Form Bio workflow outputs using the following templated syntax:
# you can specify an existing nextflow parameter or set a default value in your workflow schema
# e.g. nf-core workflows usually use a `--outdir`
--outdir='{{formbio.params.output}}'
Form Bio Reserved Nextflow Configuration values
The following configuration is provided by the Form Bio platform at workflow runtime. Any values the workflow sets for these options in nextflow.config or main.nf are ignored.
The base Nextflow head node Docker container used to launch all workflows is defined here:
- formbio.config: generated at runtime by the Form Bio API; provides runtime configuration specific to a Form Bio project (e.g. customer project context).
- $HOME/nextflow.config: provided as default configuration in the Form Bio Nextflow Docker container base image, gls-nextflow.
See Nextflow Configuration documentation
| Nextflow Configuration | Value | Description |
| --- | --- | --- |
| process.executor | "google-lifesciences" OR "google-batch" | The Nextflow executor used to run the workflow; depends on whether the workflow runs via Google Life Sciences or Google Batch |
| process.time | 30d | Sets a new default timeout for a process (7 days is the default otherwise) |
| executor.queueSize | 50 (default) OR provided by the LaunchDarkly flag enable-larger-queue-size, which overrides it per Form Bio Project: https://app.launchdarkly.com/default/production/features/enable-larger-queue-size/targeting | The number of tasks (VMs) the executor will handle in parallel (the default for lifesciences is 1000) |
| google.storage.maxTransferAttempts | 10 | Increases retries for intermittent GCS API errors such as 503 Service Unavailable. Default is 0. See https://www.nextflow.io/docs/latest/google.html?highlight=maxtransferattempts |
| google.project | tundra-prd (default) OR the BYO BID GCP project ID | GCP project in which workflow tasks execute: the central production project tundra-prd, or the BYO BID GCP project for BYO BID projects |
| google.region | Google Batch Executor: us-central1 (or the BYO BID project's region). Batch only supports a single region to execute tasks in, but can use multiple zones: https://cloud.google.com/batch/docs/reference/rest/v1/projects.locations.jobs#LocationPolicy.FIELDS.allowed_locations. Google Life Sciences Executor (load balanced across US regions): ["us-central1", "us-west2", "us-west4", "us-east1", "us-west1"] | GCP region(s) that the workflow tasks will execute in |
| -work-dir | ./work | The local working directory for Nextflow tasks/steps for processes running via executor=local Docker |
| -bucket-dir | formbio://${org}/${project}/pipeline-outputs/${workflow-run-folder} | The working GCS directory for Nextflow tasks/steps, as well as intermediate staging of channel files and the cache for resuming failed workflows |
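Written in nextflow.config syntax, several of the reserved settings listed above would look roughly like the sketch below. This is only an illustration of the option names and values from the table for the Google Batch case with the default project; it is not the actual formbio.config, which the platform generates at runtime and which is not reproduced in this document.

```groovy
// Illustrative only: NOT the formbio.config generated by the Form Bio API.
// Values are copied from the table above (Google Batch, default project).

process {
    executor = 'google-batch'   // or 'google-lifesciences'
    time     = '30d'            // per-process timeout
}

executor {
    queueSize = 50              // max tasks (VMs) handled in parallel
}

google {
    project = 'tundra-prd'      // or the BYO BID GCP project ID
    storage {
        maxTransferAttempts = 10
    }
}
```

Because the platform provides these values at runtime, a workflow's own nextflow.config does not need to set them, and any values it does set for them are ignored.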
Form Bio Reserved Nextflow Param values
The following params are reserved and provided by the Form Bio platform and will be overwritten if provided by the workflow or as user-provided input params.
| Workflow Param | Value | Description |
| --- | --- | --- |
| --output / params.output | formbio://${org}/${project}/pipeline-outputs/output/ | Form Bio (GCS) path for output workflow results |
| --region / params.region | Google Life Sciences Executor: us-central1,us-west2,us-west4,us-east1,us-west1. Google Batch Executor: us-central1 | GCP region to execute the workflow in, based on the BYO BID project. (This may not be used, since the config regions are set as well.) |
| --bqLabels / params.bqLabels | Example: "formbio-org":"sbx-uat","formbio-project-id":"40c9baa0-9156-4e5e-a784-f3ffffd022ef","formbio-user-id":"d4312620-981b-4276-b619-567bb9518e07","formbio-operation":"workflow_run" | GCP resource labels provided by Form Bio, used to track and categorize cloud usage costs by Org/Project/Workflow/User |
| --registry / params.registry | gcr.io/bioinfo-devel | Docker container registry used by Form Bio managed workflows. (Likely not relevant to BYO WF user-defined workflows.) |
| --cloudprj / params.cloudprj | tundra-prd OR the BYO BID customer GCP project ID | Deprecated. It is unclear whether this is still used, since --registry is overridden; in Form Bio managed workflows the default is params.registry = "gcr.io/${params.cloudprj}" |
Form Bio GitHub App Integration
Our GitHub app provides faster, observable automation driven by pushes to workflow repositories on GitHub. Once a workflow is imported, any push to that repo triggers an upload of the workflow under the version corresponding to the branch that was pushed.
Importing a new or existing workflow
- Inside the Form Bio web app, navigate to the Manage section under Workflows
- Select Create New
- If you have not yet linked your GitHub account to the Form Bio GitHub App, click Sign in with GitHub and Authorize access
- Once authorized, you will be redirected to the Web App and see an Import Workflow from GitHub screen
- Select formbio under Select Account
- Select the repository containing the workflow you would like to import under Select Repository
  - If you do not see the repository you are looking for, you will need to install the App on it
  - To do this, open the Select Account dropdown again and select Add or Modify Installations
  - From there, select the formbio organization and select the repositories you would like to install the App on
  - Click Update Access
  - You will be redirected back to the Web App, where you can now select that repository
- The Select Workflow list should automatically detect the workflows located within the repo’s workflows/ directory that contain a workflow.json for you.
- Click Configure on the workflow you’d like to import
- The configure form should be pre-populated with the default values and paths. Changing any of these is strongly discouraged at this time.
- Click Import Workflow and you’ll be redirected to the main branch’s build in progress. See step 6 in Monitor.
Monitor
- Inside the Form Bio web app, navigate to the Manage section under Workflows (See first step in Import)
- You’ll see a list of cards representing all workflows that have been uploaded to the current project (whether via CLI or GitHub App)
- Workflows imported via GitHub App will be indicated by a GH logo and a View Deployments button
- Click View Deployments to see a record of all uploads since being imported. To see logs for a particular upload, click View Logs.
- You’ll be taken to a logs view, displaying data about the workflow, version, and build status.
- If an upload was successful, you’ll see a Go to Launch button that takes you to the docs page for that version, where you can launch the workflow.