- Overview
- nf-core Launch via API/SDK
- nf-core Launch via Web Walkthrough
- 1. Navigate to the nf-core launcher card.
- 2. Launcher Tabs
- 3. Go to Review & Submit
Overview
nf-core provides a collection of open source pipelines and workflows. These workflows can be launched via the API/SDK or via the Web Walkthrough.
nf-core Launch via API/SDK
You can launch nf-core workflows programmatically using the Form Bio CLI/SDK tool to call the Form Bio API.
- Upload any relevant input data/files
- To transfer large files, or to transfer from cloud providers such as AWS S3, Azure, or Box, use the Form Data Transfer Service
# Upload files to Form Bio project
$ formbio storage cp -r ./local-files/sequences formbio://${org}/${project}
- Create input parameters for a given nf-core workflow as a JSON params list
- You can use nf-core Launch pipeline to create JSON parameters.
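- Note: any input files should use the following URI scheme (organization first, then project, matching the upload command and the examples in this section):
formbio://${org}/${project}/${filepath}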
- You can then launch a workflow via the API using the Form Bio CLI/SDK:
- See docs on how to use the Form Bio CLI/SDK to run workflows via the API
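The formbio:// input URIs mentioned above can be assembled with a few lines of shell. The sketch below is only an illustration: formbio_uri is a hypothetical helper, not part of the Form Bio CLI, and it assumes the organization-then-project order used by the example commands in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Form Bio CLI): build a formbio:// URI
# from an organization, a project, and a file path inside that project.
formbio_uri() {
  local org="$1" project="$2" filepath="$3"
  # Strip one leading slash so the result has no double slash after the project.
  filepath="${filepath#/}"
  printf 'formbio://%s/%s/%s\n' "$org" "$project" "$filepath"
}

# Example: the samplesheet path used by the taxprofiler example in this section.
formbio_uri formbio formbio-workflows nf-core-data/taxprofiler/samplesheet.csv
```

The printed URI can be pasted into the JSON params as the value of an input field.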
For example, to launch nf-core/bamtofastq workflow:
# Run nf-core/bamtofastq workflow.
# The --params value carries the nf-core JSON input params.
$ formbio workflow run \
--run-name 'nf-core_bamtofastq_re-run_1' \
--org formbio \
--project formbio-workflows \
--repo nf-core \
--workflow formbio/formbio-workflows/nf-core \
--version main \
--execution-engine nextflow \
-- \
--outdir='{{formbio.params.output}}' \
--params='{
"input": "https://raw.githubusercontent.com/nf-core/test-datasets/bamtofastq/samplesheet/test_bam_samplesheet.csv"
}' \
--workflow='nf-core/bamtofastq' \
--workflowVersion='2.1.0'
Another example, launching nf-core/taxprofiler with Form Bio input files:
# The --params value carries the nf-core JSON input params.
$ formbio workflow run \
--run-name 'nf-core-tax-profiler' \
--org formbio \
--project formbio-workflows \
--repo nf-core \
--workflow formbio/formbio-workflows/nf-core \
--version main \
--execution-engine nextflow \
-- \
--outdir='{{formbio.params.output}}' \
--params='{
"input": "formbio://formbio/formbio-workflows/nf-core-data/taxprofiler/samplesheet.csv",
"databases": "formbio://formbio/formbio-workflows/nf-core-data/taxprofiler/database_v1.1.csv",
"perform_shortread_qc": true,
"perform_longread_qc": true,
"shortread_qc_mergepairs": true,
"perform_shortread_complexityfilter": true,
"perform_shortread_hostremoval": true,
"perform_longread_hostremoval": true,
"perform_runmerging": true,
"hostremoval_reference": "https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/genome.fasta",
"run_kaiju": true,
"run_kraken2": true,
"run_bracken": true,
"run_malt": false,
"run_metaphlan": true,
"run_centrifuge": true,
"run_diamond": true,
"run_krakenuniq": true,
"run_motus": false,
"run_ganon": true,
"run_krona": true,
"run_kmcp": true,
"kmcp_mode": 0,
"krona_taxonomy_directory": "https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/sarscov2/metagenome/krona_taxonomy.tab",
"malt_save_reads": true,
"kraken2_save_reads": true,
"centrifuge_save_reads": true,
"run_profile_standardisation": true
}' \
--workflow='nf-core/taxprofiler' \
--workflowVersion='1.1.2'
🔔 WATCH OUT for special characters!
Specific characters in parameter values can cause errors. In the nf-core ampliseq workflow, using the "=" symbol within the dada_ref_taxonomy parameter's value (e.g., coidb=221216) could crash the workflow.
- 👍 Replace the = character with its Unicode escape, \u003D
- 🙀 Put the below in a shell script and run it, passing the parameter value as the first argument
#!/usr/bin/env bash
# $1: the dada_ref_taxonomy value, e.g. coidb=221216
json_param="$1"
formbio workflow run \
--run-name 'ampliseq_nf_core_re-run_10' \
--org form-bio-customer-support \
--project onboarding-project \
--repo nf-core \
--workflow formbio/formbio-workflows/nf-core \
--version main \
--execution-engine nextflow \
-- \
--outdir='{{formbio.params.output}}' \
--params='{
"input":"formbio://form-bio-customer-support/onboarding-project/ampliseq_nf_core/samplesheet_ampliseq.csv",
"max_cpus": 2,
"max_memory":"6.GB",
"dada_ref_taxonomy": "'"${json_param//=/\\u003D}"'",
"skip_cutadapt": true
}' \
--workflow='nf-core/ampliseq' \
--workflowVersion='2.8.0'
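The escaping step can also be tried on its own before launching anything. The sketch below is a minimal illustration in bash: escape_equals is a hypothetical helper name, and the two-field JSON fragment is made up purely to show where the escaped value ends up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace every "=" in a value with the JSON Unicode escape \u003D.
escape_equals() {
  local value="$1"
  local escaped='\u003D'   # single quotes keep the backslash literal
  printf '%s\n' "${value//=/$escaped}"
}

# Embed the escaped value in a small JSON fragment (illustrative fields only).
taxonomy="$(escape_equals 'coidb=221216')"
printf '{"dada_ref_taxonomy": "%s", "skip_cutadapt": true}\n' "$taxonomy"
```

Because printf with %s does not interpret backslashes in its arguments, the \u003D sequence reaches the JSON text unchanged, and a JSON parser then decodes it back to "=".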
nf-core Launch via Web Walkthrough
1. Navigate to the nf-core launcher card.
- Find the nf-core workflow via either:
- Workflows: scroll down the categories and click nf-core to display it
- The search bar in the top right corner, after clicking Launch
2. Launcher Tabs
Three things to set before running an nf-core workflow:
a. Workflow name
- Search for the desired nf-core workflow. The nf-core workflow repository supports the variety of workflows available on nf-co.re.
b. Workflow version
- Search for the workflow version of the selected workflow
c. Parameters JSON
- Step 1: To determine the required parameters, go to the nf-co.re documentation for the selected workflow and version.
- Step 2: Check desired/required parameters
Navigate to the Parameters tab to see the list of parameters required for the JSON input.
- Compulsory parameters are marked required on the right side. These are needed for the workflow to run.
- Others are optional depending on your specific needs.
- outdir does not need to be defined since the Form Bio platform handles it automatically.
- Step 3: For each desired/required parameter, assign a value to it and put the parameters in JSON format
{
"input": "formbio://form-bio-customer-support/onboarding-project/rnaseq_nfcore/rnaseq_nf_core/samplesheet_test.csv",
"fasta": "formbio://form-bio-customer-support/onboarding-project/rnaseq_nfcore/rnaseq_nf_core/genome.fasta",
"gtf": "formbio://form-bio-customer-support/onboarding-project/rnaseq_nfcore/rnaseq_nf_core/genes_with_empty_tid.gtf.gz"
}
- Add the created JSON to Parameters JSON on the platform
🚧 Note:
- Any file or directory path should point to its formbio:// path
- Within a file, any path should be formatted as described below:
- The platform can handle different types of data paths, including those from public datasets (S3 buckets) and web addresses (HTTP).
- If your data is already uploaded onto the platform, paths must be in Google Cloud Storage format.
# Example of correctly formatted paths in a samplesheet.csv
patient,sample,lane,fastq_1,fastq_2
ID1,S1,L002,gs://formbio-production-26a68935-d9a6-4b3e-9b89-43dacfbf5e32/DNAseq/cancergenomics/SRR15663418_T1_1.fastq.gz,gs://formbio-production-26a68935-d9a6-4b3e-9b89-43dacfbf5e32/DNAseq/cancergenomics/SRR15663418_T1_2.fastq.gz
ID2,S2,L002,gs://formbio-production-26a68935-d9a6-4b3e-9b89-43dacfbf5e32/DNAseq/cancergenomics/SRR15663419_N1_1.fastq.gz,gs://formbio-production-26a68935-d9a6-4b3e-9b89-43dacfbf5e32/DNAseq/cancergenomics/SRR15663419_N1_2.fastq.gz
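A samplesheet can be checked for badly formatted paths before it is uploaded. The following is a rough sketch, not a Form Bio tool: check_samplesheet_paths is a hypothetical function, it treats any cell containing "/" as a path, and it assumes the accepted prefixes are gs://, s3://, http://, and https://.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list samplesheet cells that look like paths but do not
# start with gs://, s3://, http://, or https:// (assumed accepted schemes).
check_samplesheet_paths() {
  local sheet="$1"
  awk -F',' '
    NR == 1 { next }                     # skip the header row
    {
      for (i = 1; i <= NF; i++) {
        # Any cell containing "/" is treated as a path.
        if ($i ~ /\// && $i !~ /^(gs|s3|https?):\/\//) {
          printf "line %d, column %d: %s\n", NR, i, $i
          bad = 1
        }
      }
    }
    END { exit bad }                     # non-zero exit status if anything was flagged
  ' "$sheet"
}

# Usage: check_samplesheet_paths samplesheet.csv && echo "all paths look fine"
```

Under these assumptions, a formbio:// path inside the samplesheet is reported, which matches the note above that paths within a file must be in Google Cloud Storage format when the data is already on the platform.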
📖 To generate the GCS format path
🤏 Go to Data → Find desired input files → Click three-dots → Select Copy Download URL
👍 This generates a download URL:
https://storage.googleapis.com/formbio-production-26a68935-d9a6-4b3e-9b89-43dacfbf5e32/DNAseq/cancergenomics/SRR15663418_T1_1.fastq.gz?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=go-api%40tundra-prd.iam.gserviceaccount.com%2F20240220%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20240220T065959Z&X-Goog-Expires=604799&X-Goog-Signature=8b47d09eeda7d5c3d50ca44e93390f76acf5e3e576617a622686060a4b66c52d45fb293534ddeae725593281133542f1f048deda3c8fd328b84471c103a170d89e8f684b4f6154578d48fd39481d88bc1e04ffb604f87d6a8eed7c56171912c8ea21fcae409409397d0382589695a1965e4b838ba692068d90323bfa4edbfb0b26dae5ac1513b7ae628293368c13083859f375c9c917b4040e9fe38d711923153805045093cf4f2149ea7fcb61b13f65f9f5e6ab165be1e5d2718fd335f982feecb6b307b5bb588b56424b3b59b943df5ef91ea86532814f27cf28ec45e6fc45313fc9d2440d6d5725b6d88d35784ad91dc1e71169c9a60135adafcb49c4676c&X-Goog-SignedHeaders=host&response-content-disposition=attachment
- 👉 Replace https://storage.googleapis.com/ with gs://
- 👉 Remove the first ? and everything after it
🙀 The final result should be:
gs://formbio-production-26a68935-d9a6-4b3e-9b89-43dacfbf5e32/DNAseq/cancergenomics/SRR15663418_T1_1.fastq.gz
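The two replacement steps above can be scripted when many files need converting. This is a minimal sketch using bash parameter expansion; to_gs_uri is a hypothetical helper name, and the query string in the example call is shortened for readability.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a "Copy Download URL" link into a gs:// path.
to_gs_uri() {
  local url="$1"
  local prefix='https://storage.googleapis.com/'
  url="${url%%\?*}"   # drop the first "?" and everything after it
  case "$url" in
    "$prefix"*)
      # Swap the HTTPS prefix for gs://
      printf 'gs://%s\n' "${url#"$prefix"}"
      ;;
    *)
      printf 'not a storage.googleapis.com URL: %s\n' "$url" >&2
      return 1
      ;;
  esac
}

# Example with a shortened query string.
to_gs_uri 'https://storage.googleapis.com/formbio-production-26a68935-d9a6-4b3e-9b89-43dacfbf5e32/DNAseq/cancergenomics/SRR15663418_T1_1.fastq.gz?X-Goog-Algorithm=GOOG4-RSA-SHA256'
```

The function returns a non-zero status for URLs that do not start with the expected prefix, so it can be used safely inside a loop over many links.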
💯 Parameters JSON can be created using the nf-core website
- You can use nf-co.re Launch to map which fields to populate in the JSON input field for this workflow.
- For rnaseq, go to https://nf-co.re/rnaseq/3.14.0 → Click on Launch version <latest> (here it’s 3.14.0) to see which fields to populate in the JSON
- A red asterisk marks required fields, which must be filled in. Note: outdir must be omitted in the final result.
- Remove outdir and add the generated JSON to Parameters JSON
# Final JSON format
{
"input": "formbio:\/\/form-bio-customer-support\/onboarding-project\/rnaseq_nfcore\/rnaseq_nf_core\/samplesheet_test.csv",
"fasta": "formbio:\/\/form-bio-customer-support\/onboarding-project\/rnaseq_nfcore\/rnaseq_nf_core\/genome.fasta",
"gtf": "formbio:\/\/form-bio-customer-support\/onboarding-project\/rnaseq_nfcore\/rnaseq_nf_core\/genes_with_empty_tid.gtf.gz"
}
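Since outdir must not appear in the final JSON, a quick text check before pasting can catch a leftover entry. The sketch below is only a rough guard: params_has_no_outdir is a hypothetical name, and it does a plain text match rather than parsing the JSON.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: fail if the params JSON text still defines "outdir".
# This is a simple text match, not a real JSON parser.
params_has_no_outdir() {
  local json="$1"
  if printf '%s' "$json" | grep -Eq '"outdir"[[:space:]]*:'; then
    echo 'remove "outdir" before adding the JSON to Parameters JSON' >&2
    return 1
  fi
}

# Usage: params_has_no_outdir "$(cat params.json)" && echo "ready to paste"
```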
3. Go to Review & Submit
- ✍️ Add a run name
- ✅ Check all inputs again
- 😃 Click Run Workflow to execute