-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add referencer utilities and constants for Enterobacteriaceae #201
base: rc4.2.0
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, apart from some concerns I have with the clarity of some of the code constructs. See below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it pretty much solved it, but I suggest using a list of tuples instead of a set of tuples, as sets are not sorted (if that matters), and can be a little more confusing for people not used to them.
"""Checks if the organism name is a proper Klebsiella.""" | ||
return any(species_name == member.value for member in Klebsiella) | ||
|
||
|
||
def get_reference_if_enterobacteriaceae(organism_name: str) -> str: | ||
"""Returns the reference for the organism name if it belongs to Enterobacteriaceae.""" | ||
species: str = get_species(organism_name) | ||
GENUS_REFERENCE_MAP = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The surrounding {}
also should probably preferably be changed, e.g. to []
to make it a list, since right now you actually got a "set" (consisting of tuples), since that is what you get with {}
containing single items.
Well, a set might technically work here, but I think they also are not guaranteed to be in sorted order, if that matters...
@@ -43,8 +43,8 @@ def test_is_klebsiella(species_name, expected_result): | |||
@pytest.mark.parametrize( | |||
"organism_name, expected_reference", | |||
[ | |||
(Escherichia.COLI, ESCHERICHIA_REFERENCE), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these changes related to the previous discussion, or was this something else?
(No opinions here, just curious)
Description
This pull request includes several changes to the
microSALT
project, focusing on improving the handling of organism references and restructuring the codebase for better organization. The most important changes include updating import paths, adding utility functions for organism reference checks, and modifying methods to use these new utilities.Codebase restructuring and improvements:
microSALT/utils/job_creator.py
: Updated the import path forReferencer
and added a new import forget_reference_if_enterobacteriaceae
. Modified thecreate_blast_search
method to use the new utility function for checking Enterobacteriaceae references. [1] [2]microSALT/utils/referencer/constants.py
: IntroducedStrEnum
and new enumerations forEscherichia
andKlebsiella
species, along with their corresponding references.microSALT/utils/referencer/referencer.py
: Renamed the file for better organization and updated theorganism2reference
method to streamline the reference matching process. [1] [2]microSALT/utils/referencer/utils.py
: Added utility functionsis_escherichia
,is_klebsiella
, andget_reference_if_enterobacteriaceae
to check organism names and return appropriate references.microSALT/utils/scraper.py
: Updated the import path forReferencer
and added a new import forget_reference_if_enterobacteriaceae
. Modified thescrape_blast
method to use the new utility function for checking Enterobacteriaceae references. [1] [2]Primary function of PR
Testing
If the update is a hotfix, it is sufficient to rely on the development testing along with the Travis self-test automatically applied to the PR.
Test routine to verify the stability of the PR:
bash /home/proj/production/servers/resources/hasta.scilifelab.se/install-microsalt-stage.sh BRANCHNAME
us
conda activate S_microSALT
export MICROSALT_CONFIG=/home/proj/dropbox/microSALT.json
microSALT analyse project MIC3109
microSALT analyse project MIC4107
microSALT analyse project MIC4109
microSALT analyse project ACC5551
Verify that the results for projects MIC3109, MIC4107, MIC4109 & ACC5551 are consistent with the results attached to AMSystem doc 1490, Microbial_WGS.xlsx
Test results
These are the results of the tests, and necessary conclusions, that prove the stability of the PR.
Sign-offs