Add toolsheet-related implementations #443
base: dev
…ve remainder:true
… ch_report_input_files
…generation of report input channel
Thanks @suzannejin - another big one. Will review thoroughly when I can, just give me some time :-)
Unfortunately, many parts of the pipeline had to be changed to use the toolsheet args properly; otherwise odd behaviours arise. However, if it makes the review process easier for you, I can split off a sub-PR with the changes related to params.differential_method and params.functional_method, which are more independent of the rest.
That's OK, I just need some dedicated time with it to understand and look for simplifications, as in the previous PRs.
Scheduling some time in hackathon hours next week. Just putting a block here so others know I really want a hand in this one :-)
@@ -224,6 +224,16 @@ To override the above options, you may also supply your own features table as a

By default, if you don't provide features, for non-array data the workflow will fall back to attempting to use the matrix itself as a source of feature annotations. For this to work you must make sure to set the `features_id_col`, `features_name_col` and `features_metadata_cols` parameters to the appropriate values, for example by setting them to 'gene_id' if that is the identifier column on the matrix. This will cause the gene ID to be used everywhere rather than more accessible gene symbols (as can be derived from the GTF), but the workflow should run. Please use this option for MaxQuant analysis, i.e. do not provide features.

## Toolsheet

We provide a set of toolsheet files that define the tools that make sense to run for a given study type. These files are in the `assets` directory of the pipeline.
It would be good to add a link to the directory on GitHub.
Each row defines a combination of differential analysis tool and functional analysis tool (optional), with the respective arguments. Note that the arguments defined in the toolsheet have highest priority, meaning that they will overwrite any other arguments defined in the command line or in the configuration files.
- Each row defines a combination of differential analysis tool and functional analysis tool (optional), with the respective arguments. Note that the arguments defined in the toolsheet have highest priority, meaning that they will overwrite any other arguments defined in the command line or in the configuration files.
+ Each row defines a combination of differential analysis tool and functional analysis tool (optional), with the respective arguments.
+ > [!WARNING]
+ > Note that the arguments defined in the toolsheet have highest priority, meaning that they will overwrite any other arguments defined in the command line or in the configuration files.
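For readers of the docs, the precedence rule in this warning can be pictured with plain Groovy map addition, where the right-hand map wins on a key collision. This is only a sketch under assumptions: the three maps and their values are made up for illustration, and the pipeline may combine its sources differently.

```groovy
// Hypothetical values, for illustration only; none are taken from the pipeline.
def configDefaults = [differential_min_fold_change: 2, differential_max_qval: 0.05]
def commandLine    = [differential_min_fold_change: 1.5]
def toolsheetArgs  = [differential_min_fold_change: 1]

// Map addition keeps the right-hand value when a key appears on both sides,
// so the last map in the chain has the highest priority.
def effective = configDefaults + commandLine + toolsheetArgs

assert effective.differential_min_fold_change == 1   // toolsheet value wins
assert effective.differential_max_qval == 0.05       // untouched default survives
println effective
```

If the toolsheet is meant to override everything else, it has to be the last operand in such a chain.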
@@ -90,9 +90,13 @@ params {
    exploratory_log2_assays = 'raw,normalised'
    exploratory_palette_name = 'Set1'

    // Tools options
    analysis_name = null
    toolsheet_custom = null
Could this be named only `toolsheet`? I think it is more intuitive. Or do you think it's confusing?
} else if (params.study_type == 'maxquant') {
    ch_toolsheet = Channel.fromList(samplesheetToList("${projectDir}/assets/toolsheet_maxquant.csv", "${projectDir}/assets/schema_tools.json"))
} else {
    error("Please make sure to mention the correct study_type")
- error("Please make sure to mention the correct study_type")
+ error("Please make sure to mention the correct study_type. The available options are: 'rnaseq', 'affy_array', 'geo_soft_file' or 'maxquant'")
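One way to keep such a message in sync with the accepted values is to build it from a single list. The sketch below is plain Groovy rather than pipeline code: `checkStudyType` and `validStudyTypes` are hypothetical names, and only the four study types come from the suggestion above.

```groovy
// The four study types named in the suggested error message.
def validStudyTypes = ['rnaseq', 'affy_array', 'geo_soft_file', 'maxquant']

// Hypothetical helper: returns the study type unchanged if it is known,
// otherwise fails with a message that lists every accepted option.
def checkStudyType(String studyType, List<String> valid) {
    if (!(studyType in valid)) {
        throw new IllegalArgumentException(
            "Unknown study_type '${studyType}'. The available options are: ${valid.join(', ')}")
    }
    return studyType
}

assert checkStudyType('maxquant', validStudyTypes) == 'maxquant'

try {
    checkStudyType('proteomics', validStudyTypes)
} catch (IllegalArgumentException e) {
    println e.message
}
```

With this shape, adding a new study type only means extending the list, and the error text follows automatically.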
def meta = [
    analysis_name: it[0].analysis_name,
    diff_method : it[0].diff_method,
    diff_args : getParams('differential', it[0].diff_method) + parseArgs(it[0].diff_args) + [differential_method: it[0].diff_method],
Could you explain what `getParams` is used for, and why you pass it the string 'differential' or 'functional'?
It's also not clear to me why you are adding the `[differential_method: it[0].diff_method]` map to the arguments.
Thanks!
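For anyone following this thread, here is one possible reading of how the three-part expression composes. It is not the author's answer. The bodies of `getParams` and `parseArgs` are not shown in this excerpt, so `parseArgsSketch` is a hypothetical stand-in that assumes arguments are written as `--key value` pairs, and the `defaults` map stands in for whatever `getParams` returns.

```groovy
// Hypothetical stand-in for the PR's parseArgs: turns "--key value" pairs into a map.
// The real toolsheet argument syntax may differ.
def parseArgsSketch(String raw) {
    if (!raw?.trim()) return [:]
    def tokens = raw.trim().split(/\s+/) as List
    tokens.collate(2).collectEntries { pair ->
        [(pair[0].replaceFirst(/^--/, '')): pair[1]]
    }
}

// Stands in for getParams('differential', method): assumed here to be a map of defaults.
def defaults  = [differential_min_fold_change: '2']
def fromSheet = parseArgsSketch('--differential_min_fold_change 1.5')

// Later operands override earlier ones, and the final map records the chosen method.
def diffArgs = defaults + fromSheet + [differential_method: 'limma']

assert diffArgs == [differential_min_fold_change: '1.5', differential_method: 'limma']
```

Under these assumptions, the trailing single-entry map simply guarantees that the method name travels with the arguments, whatever the defaults or the toolsheet row contain.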
// args_functional as a map to stay consistent. We also remove null values
// from files list.
def meta = (it[0].method_functional) ? it[0] : it[0] + [args_functional: [:]]
[it[0], it.tail().grep().flatten()]
Could you add a comment describing this channel, as you did with the ones before?
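As a starting point for such a comment, this plain Groovy sketch shows what the `tail().grep().flatten()` chain does to one element. The element itself is made up; the real channel contents are not shown in this excerpt.

```groovy
// Hypothetical channel element: a meta map followed by file slots,
// some of them empty (null) and one of them a nested list.
def element = [[id: 'contrast_1'], 'results.tsv', null, ['plot_a.png', 'plot_b.png'], null]

def meta  = element[0]
// tail()    drops the first item (the meta map)
// grep()    with no argument keeps only items that are true under Groovy truth,
//           so nulls are removed (empty strings and empty lists would be too)
// flatten() turns nested lists into a single flat list
def files = element.tail().grep().flatten()

assert meta  == [id: 'contrast_1']
assert files == ['results.tsv', 'plot_a.png', 'plot_b.png']
```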
def pattern_with_tools = params_pattern + "|${meta.method_differential}" + (meta.method_functional ? "|functional|${meta.method_functional}" : "")
// return params for report
params_with_tools.findAll{ k,v -> k.matches(~/(${pattern_with_tools}).*/) } +
    [report_file_names, files.collect{ f -> f.name}].transpose().collectEntries()
Please add a description of the channel.
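To help with that description, the two halves of the expression can be tried out in plain Groovy. All values below are hypothetical, and the sketch uses the `==~` full-match operator instead of the `matches` call from the diff.

```groovy
// Hypothetical inputs, for illustration only.
def paramsWithTools = [report_title: 'Demo', limma_ndups: 1, deseq2_test: 'Wald', gsea_permute: 'phenotype']
def paramsPattern   = 'report'
def meta            = [method_differential: 'limma', method_functional: null]

// Build an alternation of prefixes: always the base pattern and the differential method,
// plus the functional prefixes only when a functional method is set.
def pattern = paramsPattern + "|${meta.method_differential}" +
    (meta.method_functional ? "|functional|${meta.method_functional}" : "")

// Keep only the parameters whose name starts with one of those prefixes.
def selected = paramsWithTools.findAll { k, v -> k ==~ /(${pattern}).*/ }
assert selected == [report_title: 'Demo', limma_ndups: 1]

// Pair each report parameter name with the matching file name.
def reportFileNames = ['contrasts_file', 'features_file']
def fileNames       = ['contrasts.csv', 'features.tsv']
def named = [reportFileNames, fileNames].transpose().collectEntries()
assert named == [contrasts_file: 'contrasts.csv', features_file: 'features.tsv']

println selected + named
```

In this reading, each emitted item is a single map: the tool-relevant parameters plus one entry per report input file.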
Changes made to enable the usage of toolsheets:

- … `gsea_run` and `gprofiler2_run` flags into `functional_method` -> more consistent with the ch_tools behaviour
- … `differential_method` and `functional_method` flags

Nice to have, but left for next PRs (to not extend this one too much):

- … `analysis_name` flag

PR checklist

- … (`nf-core lint`).
- … (`nf-test test main.nf.test -profile test,docker`).
- … (`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`).
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).