Contribute back to nf-core/modules #110

Open
LouisLeNezet opened this issue Feb 27, 2025 · 7 comments
Labels
enhancement New feature or request

Comments

@LouisLeNezet

Description of feature

There are a lot of local modules present in this pipeline.
It would be really nice to have them integrated into the nf-core/modules repository.

This could be part of a Hackathon 😉

@LouisLeNezet LouisLeNezet added the enhancement New feature or request label Feb 27, 2025
@nschan
Collaborator

nschan commented Feb 27, 2025

To make it easier, here is a list of local modules and my personal, subjective judgement on whether they should be in nf-core/modules, or whether an nf-core module already exists that I did not use:

| Module | What does it do? / Why not the nf-core module? | nf-core module candidate? |
|---|---|---|
| collect_reads | gunzips fastq.gz files into one fastq file; this is useful because ONT can create many files per library if the run is restarted, etc. | no |
| genomescope | runs genomescope; most important for pipeline functionality is that this estimates the haploid length, which gets passed on to flye | an nf-core module exists but did not suit my needs |
| gfa2fa | converts gfa to fa, essentially awk | maybe? |
| jellyfish | runs jellyfish to create inputs for genomescope (see above); split into COUNT, STATS, HISTO and DUMP | yes |
| links | runs LINKS for scaffolding | yes |
| longstitch | runs longstitch for scaffolding | yes |
| medaka | runs medaka for polishing. A medaka nf-core module exists, but runs an older version. My implementation here may need to be optimized, but the future of medaka is also not clear and it might be good to switch to dorado | probably / update nf-core module |
| nanoq | runs nanoq; an nf-core module exists, but does not emit something I need (median read length) | module exists |
| QUAST | Has an nf-core module. The nf-core module uses a biocontainer that is missing something (I forgot what exactly; there was some discussion about this with @jfy133 on Slack), so this local module uses a different container | module exists |
| ragtag | performs scaffolding to a reference with ragtag; there is currently no nf-core module for this | yes |
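
As an illustration of why gfa2fa is described as "essentially awk", here is a minimal sketch of such a conversion. It is not the pipeline's actual module; the function name and file names are hypothetical, and it assumes a tab-separated GFA file in which segment lines start with `S`, followed by the segment name and the sequence.

```shell
# Convert GFA segment (S) lines to FASTA records.
# Assumed S-line columns: record type, segment name, sequence, optional tags.
gfa2fa() {
    awk -F'\t' '$1 == "S" { print ">" $2; print $3 }' "$@"
}

# Usage (hypothetical file names):
#   gfa2fa assembly.gfa > assembly.fa
```

Header, link and path lines are skipped because only records whose first column is exactly `S` are printed.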

@jfy133
Member

jfy133 commented Feb 27, 2025

Quast was missing some dependency (an unversioned Java jar or something) for one particular function you want to use it for. It is doable, as there is some other dependency that has the same fix; I just haven't had the time to do it.

@jfy133
Member

jfy133 commented Feb 27, 2025

For genomescope, can you explain what you mean by it not suiting your needs? Modules should be designed so they work for everyone...

@nschan
Collaborator

nschan commented Feb 27, 2025

I am using genomescope to estimate the haploid length and I am not very clever about it, so in the script I do:

est_hap_len=\$(cat ${prefix}_genomescope.txt \\
        | grep 'Haploid Length' \\
        | sed 's@ bp@@g' \\
        | sed 's@,@@g' \\
        | awk '{printf "%i", (\$4+\$5)/2 }')

and in output:

        tuple val(meta), env(est_hap_len)           , emit: estimated_hap_len

Edit:
I know that this could be solved differently: using nf-core genomescope and then having a process that does the extraction. I guess that would still require a (different) local module, and it did not seem much better to me at the time, but maybe this is something that could be changed in the future?
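
For reference, the extraction step on its own could be sketched as a plain shell function, outside of Nextflow (so without the `\$` and `\\` escaping used in the process script). This is only a sketch: the function name is hypothetical, and it assumes the summary contains a line such as `Genome Haploid Length   1,200,000 bp   1,300,000 bp`, which is what the field positions `$4` and `$5` in the snippet above imply.

```shell
# Print the midpoint of the min and max haploid length estimates
# from a genomescope summary file, as an integer.
estimate_hap_len() {
    awk '/Haploid Length/ {
        gsub(/ bp|,/, "")              # drop the " bp" units and thousands separators
        printf "%d\n", ($4 + $5) / 2   # fields 4 and 5 hold the min and max estimates
    }' "$1"
}

# Usage (hypothetical file name):
#   est_hap_len=$(estimate_hap_len sample_genomescope.txt)
```

A single awk call replaces the `cat | grep | sed | sed | awk` chain; the result is the same truncated mean of the two estimates.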

@nschan
Collaborator

nschan commented Feb 27, 2025

> Quast was missing some dependency (an unversioned Java jar or something) for one particular function you want to use it for. It is doable, as there is some other dependency that has the same fix; I just haven't had the time to do it.

Sure, but currently I can't use the core module, so I made a local one. This is mainly to explain why I do not think it is necessary to contribute my local quast module back to nf-core, since it is a workaround for the current situation, which will be resolved in the future.

@jfy133
Member

jfy133 commented Feb 28, 2025

Sorry, yes - to clarify:

  1. Genomescope: if it's just that modification, that suggests to me using nf-core modules patch on your copy of the official nf-core module to add the extra command; it doesn't seem different enough to require an entirely separate local module.
  2. QUAST: given that you pinged me because you didn't remember the exact problem, I wanted to give more context in case someone else wants to contribute ;) Using a local module in this case is fine.

@nschan
Collaborator

nschan commented Feb 28, 2025

> Genomescope: if it's just that modification, that suggests to me using nf-core modules patch on your copy of the official nf-core module to add the extra command; it doesn't seem different enough to require an entirely separate local module.

Patching this is probably a good option. Since the release review is now done, I would prefer to not fiddle around with modules at this point, but I will put it on the list of things that should happen for future releases.

> QUAST: given that you pinged me because you didn't remember the exact problem, I wanted to give more context in case someone else wants to contribute ;) Using a local module in this case is fine.

Ok, good; once the nf-core quast module (biocontainer) gets the required update I will switch to it, probably on a similar timeline to patching everything that is local here and could use a core module :)


3 participants