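Porting an nf-core workflow to CWL
Let's do this
First, run it with Nextflow
pull it
run it
check the log
nextflow log -l lists the log fields that can be specified
I want to know which containers and commands the workflow used, so specify the process hash, container, and script fields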
code:nextflow
$ nextflow pull nf-core/viralrecon
$ nextflow run nf-core/viralrecon -profile test_nanopore,docker --outdir test_nanopore -with-dag flow.html
$ nextflow log
TIMESTAMP DURATION RUN NAME STATUS REVISION ID SESSION ID COMMAND
2022-08-30 16:43:53 9m 34s nauseous_poincare OK 3ee1fe98fd f9a76113-6a5e-4d31-97c0-f2c9e30dfeb7 nextflow run nf-core/viralrecon -profile test_nanopore,docker --outdir test_nanopore
2022-08-30 16:58:30 30.9s grave_edison ERR 3ee1fe98fd f04246f2-7d2e-449c-b233-c4e520d16ff2 nextflow run nf-core/viralrecon -profile test_nanopore,docker --outdir test_nanopore -stub-run
2022-08-30 17:12:02 4m 20s condescending_torvalds OK 3ee1fe98fd 98a0e65d-a185-4e77-a79c-31beddeaf9ed nextflow run nf-core/viralrecon -profile test_nanopore,docker --outdir test_nanopore -with-dag flow.html
2022-08-30 17:49:04 4m 35s pedantic_nightingale OK 3ee1fe98fd 336cf0a5-62b5-443c-9a3d-042f880e7c78 nextflow run nf-core/viralrecon -profile test_nanopore,docker --outdir test_nanopore -with-dag
$ nextflow log -l
$ nextflow log pedantic_nightingale -f hash,container,script
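-with-dag doesn't seem to be taking effect... why...
From here, convert the dumped scripts by brute force
First, delete every version-recording part that starts with cat <<-END_VERSIONS
For the remaining scripts, remove the line breaks and join multiple commands with &&
Many steps run the same processing on multiple samples, so sort the lines
Wrap each command in double quotes
Delete the mysterious inline python scripts, since they don't seem to matter
etc.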
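The cleanup steps above could be sketched roughly as follows. This is only a minimal sketch and not the procedure actually used here (the conversion was done by hand). The function names `parse_records`, `clean_script`, and `convert` are hypothetical, and the header pattern (a `xx/xxxxxx` hash followed by a container image) is an assumption based on the shape of the log. Deleting the inline python scripts is not covered.

```python
import re

# Assumption: each record starts with "<2 hex>/<6 hex> <container image>".
HEADER = re.compile(r"^([0-9a-f]{2}/[0-9a-f]{6})\s+(\S+)\s*$")


def parse_records(text):
    """Split raw log text into (container, script) pairs."""
    records = []
    container, body = None, []
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            if container is not None:
                records.append((container, "\n".join(body)))
            container, body = m.group(2), []
        elif container is not None:
            body.append(line)
    if container is not None:
        records.append((container, "\n".join(body)))
    return records


def clean_script(script):
    """Drop the END_VERSIONS heredoc and join the rest into one line."""
    commands = []  # finished commands
    parts = []     # pieces of the command being assembled
    in_versions = False
    for raw in script.splitlines():
        line = raw.strip()
        if in_versions:
            # Skip the heredoc body up to and including its terminator.
            if line == "END_VERSIONS":
                in_versions = False
            continue
        if line.startswith("cat <<-END_VERSIONS"):
            in_versions = True
            continue
        continued = line.endswith("\\")  # shell line continuation
        if continued:
            line = line[:-1].strip()
        if line:
            parts.append(line)
        if not continued and parts:
            commands.append(" ".join(parts))
            parts = []
    if parts:
        commands.append(" ".join(parts))
    return " && ".join(commands)


def convert(text):
    """Return sorted '-c <container> "<command>"' lines for the raw log."""
    lines = []
    for container, script in parse_records(text):
        command = clean_script(script).replace('"', '\\"')
        lines.append(f'-c {container} "{command}"')
    return sorted(lines)
```

Calling `convert()` on the text of the raw log returns a list of one-line entries, one per process. Commands that rely on shell features beyond `\` continuations and the version heredoc would still need manual attention.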
With that, this
code:log.raw
ac/2a9e11 quay.io/biocontainers/python:3.9--1
check_samplesheet.py \
samplesheet_test_nanopore.csv \
samplesheet.valid.csv \
--platform nanopore
cat <<-END_VERSIONS > versions.yml
"NFCORE_VIRALRECON:NANOPORE:INPUT_CHECK:SAMPLESHEET_CHECK":
python: $(python --version | sed 's/Python //g')
END_VERSIONS
d5/70321f quay.io/biocontainers/python:3.9--1
collapse_primer_bed.py \
--left_primer_suffix _LEFT \
--right_primer_suffix _RIGHT \
nCoV-2019.primer.bed \
nCoV-2019.primer.collapsed.bed
cat <<-END_VERSIONS > versions.yml
"NFCORE_VIRALRECON:NANOPORE:PREPARE_GENOME:COLLAPSE_PRIMERS":
python: $(python --version | sed 's/Python //g')
END_VERSIONS
f1/ebc3d2 ubuntu:20.04
gunzip \
-f \
\
GCA_009858895.3_ASM985889v3_genomic.200409.gff.gz
cat <<-END_VERSIONS > versions.yml
"NFCORE_VIRALRECON:NANOPORE:PREPARE_GENOME:GUNZIP_GFF":
gunzip: $(echo $(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*$//')
END_VERSIONS
32/983449 quay.io/biocontainers/samtools:1.15.1--h1170115_0
samtools faidx nCoV-2019.reference.fasta
cut -f 1,2 nCoV-2019.reference.fasta.fai > nCoV-2019.reference.fasta.sizes
cat <<-END_VERSIONS > versions.yml
"NFCORE_VIRALRECON:NANOPORE:PREPARE_GENOME:CUSTOM_GETCHROMSIZES":
custom: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
END_VERSIONS
becomes this
Each line starts with -c so that it can be fed to zatsu-cwl-generator later
code:log_sorted
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "artic guppyplex --min-length 400 --max-length 700 --directory barcode01 --output SAMPLE_01.fastq && pigz -p 2 *.fastq"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "artic guppyplex --min-length 400 --max-length 700 --directory barcode02 --output SAMPLE_02.fastq && pigz -p 2 *.fastq"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "artic guppyplex --min-length 400 --max-length 700 --directory barcode03 --output SAMPLE_03.fastq && pigz -p 2 *.fastq"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "artic guppyplex --min-length 400 --max-length 700 --directory barcode04 --output SAMPLE_04.fastq && pigz -p 2 *.fastq"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "artic guppyplex --min-length 400 --max-length 700 --directory barcode05 --output SAMPLE_05.fastq && pigz -p 2 *.fastq"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "artic guppyplex --min-length 400 --max-length 700 --directory barcode06 --output SAMPLE_06.fastq && pigz -p 2 *.fastq"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "artic guppyplex --min-length 400 --max-length 700 --directory barcode07 --output SAMPLE_07.fastq && pigz -p 2 *.fastq"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "artic guppyplex --min-length 400 --max-length 700 --directory barcode08 --output SAMPLE_08.fastq && pigz -p 2 *.fastq"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "artic guppyplex --min-length 400 --max-length 700 --directory barcode09 --output SAMPLE_09.fastq && pigz -p 2 *.fastq"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "export HDF5_PLUGIN_PATH=/usr/local/lib/python3.6/site-packages/ont_fast5_api/vbz_plugin && artic minion --normalise 500 --minimap2 --threads 2 --read-file SAMPLE_01.fastq.gz --scheme-directory ./primer-schemes --scheme-version 3 --fast5-directory fast5_pass --sequencing-summary sequencing_summary.txt nCoV-2019 SAMPLE_01"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "export HDF5_PLUGIN_PATH=/usr/local/lib/python3.6/site-packages/ont_fast5_api/vbz_plugin && artic minion --normalise 500 --minimap2 --threads 2 --read-file SAMPLE_02.fastq.gz --scheme-directory ./primer-schemes --scheme-version 3 --fast5-directory fast5_pass --sequencing-summary sequencing_summary.txt nCoV-2019 SAMPLE_02"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "export HDF5_PLUGIN_PATH=/usr/local/lib/python3.6/site-packages/ont_fast5_api/vbz_plugin && artic minion --normalise 500 --minimap2 --threads 2 --read-file SAMPLE_03.fastq.gz --scheme-directory ./primer-schemes --scheme-version 3 --fast5-directory fast5_pass --sequencing-summary sequencing_summary.txt nCoV-2019 SAMPLE_03"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "export HDF5_PLUGIN_PATH=/usr/local/lib/python3.6/site-packages/ont_fast5_api/vbz_plugin && artic minion --normalise 500 --minimap2 --threads 2 --read-file SAMPLE_04.fastq.gz --scheme-directory ./primer-schemes --scheme-version 3 --fast5-directory fast5_pass --sequencing-summary sequencing_summary.txt nCoV-2019 SAMPLE_04"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "export HDF5_PLUGIN_PATH=/usr/local/lib/python3.6/site-packages/ont_fast5_api/vbz_plugin && artic minion --normalise 500 --minimap2 --threads 2 --read-file SAMPLE_05.fastq.gz --scheme-directory ./primer-schemes --scheme-version 3 --fast5-directory fast5_pass --sequencing-summary sequencing_summary.txt nCoV-2019 SAMPLE_05"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "export HDF5_PLUGIN_PATH=/usr/local/lib/python3.6/site-packages/ont_fast5_api/vbz_plugin && artic minion --normalise 500 --minimap2 --threads 2 --read-file SAMPLE_06.fastq.gz --scheme-directory ./primer-schemes --scheme-version 3 --fast5-directory fast5_pass --sequencing-summary sequencing_summary.txt nCoV-2019 SAMPLE_06"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "export HDF5_PLUGIN_PATH=/usr/local/lib/python3.6/site-packages/ont_fast5_api/vbz_plugin && artic minion --normalise 500 --minimap2 --threads 2 --read-file SAMPLE_07.fastq.gz --scheme-directory ./primer-schemes --scheme-version 3 --fast5-directory fast5_pass --sequencing-summary sequencing_summary.txt nCoV-2019 SAMPLE_07"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "export HDF5_PLUGIN_PATH=/usr/local/lib/python3.6/site-packages/ont_fast5_api/vbz_plugin && artic minion --normalise 500 --minimap2 --threads 2 --read-file SAMPLE_08.fastq.gz --scheme-directory ./primer-schemes --scheme-version 3 --fast5-directory fast5_pass --sequencing-summary sequencing_summary.txt nCoV-2019 SAMPLE_08"
-c quay.io/biocontainers/artic:1.2.2--pyhdfd78af_0 "export HDF5_PLUGIN_PATH=/usr/local/lib/python3.6/site-packages/ont_fast5_api/vbz_plugin && artic minion --normalise 500 --minimap2 --threads 2 --read-file SAMPLE_09.fastq.gz --scheme-directory ./primer-schemes --scheme-version 3 --fast5-directory fast5_pass --sequencing-summary sequencing_summary.txt nCoV-2019 SAMPLE_09"
This is fairly tedious
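For the later step of feeding these lines to zatsu-cwl-generator, one possible sketch is below. It assumes the executable is named `zatsu-cwl-generator`, that it accepts `-c <container> "<command>"` in exactly the form prepared above, and that it prints the generated CWL to standard output. None of this is verified here, so check the tool's help output first. `build_argv` and `generate_cwl` are hypothetical helper names.

```python
import shlex
import shutil
import subprocess


def build_argv(line, tool="zatsu-cwl-generator"):
    """Turn one '-c <container> "<command>"' line into an argument vector."""
    args = shlex.split(line)  # honours the double quotes around the command
    if len(args) != 3 or args[0] != "-c":
        raise ValueError(f"unexpected line: {line!r}")
    return [tool, *args]


def generate_cwl(line):
    """Run the tool and return its stdout, or None if it is not installed."""
    argv = build_argv(line)
    if shutil.which(argv[0]) is None:
        return None
    # Assumption: the generated CWL document is written to stdout.
    done = subprocess.run(argv, check=True, capture_output=True, text=True)
    return done.stdout
```

Passing the argument vector directly (no `shell=True`) keeps the `&&` and `*.fastq` inside the quoted command intact, so they reach the tool as a single argument instead of being interpreted by the calling shell.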