ImportError after installing cudf that is calling dask_cudf which is looking for a dask.dataframe dependency and failing to find it #9019

Closed
jacobtomlinson opened this issue Mar 6, 2025 · 3 comments
Labels
gpu, needs info (Needs further information from the user), needs reproducer

Comments

@jacobtomlinson
Member

Dear dask community,

It is 2025 and I am running into an ImportError after installing cudf: cudf calls dask_cudf, which looks for a dask.dataframe dependency and fails to find it. I installed cudf and dask from the rapidsai and conda-forge channels using conda:
conda install -c rapidsai -c conda-forge -c nvidia cudf cuml 'cuda-version=12.6'
And that installed the following packages:

$ conda list 'dask|cuml|cudf|distributed'
# packages in environment at ~/anaconda3/envs/rnntf2:
#
# Name                    Version                   Build  Channel
cudf                      24.12.00        cuda12_py312_241211_gff41ecf473_0    rapidsai
cuml                      24.12.00        cuda12_py312_241211_ge79cd670a_0    rapidsai
dask                      2024.11.2          pyhff2d567_1    conda-forge
dask-core                 2024.11.2          pyhff2d567_1    conda-forge
dask-cuda                 24.12.00        py312_241211_g3b3b356_0    rapidsai
dask-cudf                 24.12.00        cuda12_py312_241211_gff41ecf473_0    rapidsai
dask-expr                 1.1.19             pyhd8ed1ab_0    conda-forge
distributed               2024.11.2          pyhff2d567_1    conda-forge
distributed-ucxx          0.41.00         py3.12_241211_gd355f9c_0    rapidsai
libcudf                   24.12.00        cuda12_241211_gff41ecf473_0    rapidsai
libcuml                   24.12.00        cuda12_241211_ge79cd670a_0    rapidsai
libcumlprims              24.12.00        cuda12_241211_g8df6c7e_0    rapidsai
pylibcudf                 24.12.00        cuda12_py312_241211_gff41ecf473_0    rapidsai
raft-dask                 24.12.00        cuda12_py312_241211_geaf9cc72_0    rapidsai
rapids-dask-dependency    24.12.00                   py_0    rapidsai

I plan to use dask on a Slurm-based HPC system, and I saw in the documentation that you recommend the dask-jobqueue package. I found two similarly named packages:

$ conda search -c conda-forge 'dask*jobqueue'
dask-gateway-server-jobqueue           0.9.0  py38h578d9bd_2  conda-forge         
dask-gateway-server-jobqueue           0.9.0  py39hf3d152e_2  conda-forge         
dask-gateway-server-jobqueue        2022.4.0      ha770c72_0  conda-forge         
dask-gateway-server-jobqueue        2022.6.1      ha770c72_0  conda-forge         
dask-gateway-server-jobqueue       2022.10.0      ha770c72_0  conda-forge         
dask-gateway-server-jobqueue        2023.1.0      ha770c72_0  conda-forge         
dask-gateway-server-jobqueue        2023.1.1      ha770c72_0  conda-forge         
dask-gateway-server-jobqueue        2023.9.0      ha770c72_0  conda-forge         
dask-gateway-server-jobqueue        2024.1.0      ha770c72_0  conda-forge         
dask-jobqueue                  0.8.0    pyhd8ed1ab_0  conda-forge         
dask-jobqueue                  0.8.1    pyhd8ed1ab_0  conda-forge         
dask-jobqueue                  0.8.2    pyhd8ed1ab_0  conda-forge         
dask-jobqueue                  0.8.5    pyhd8ed1ab_0  conda-forge         
dask-jobqueue                  0.9.0    pyhd8ed1ab_0  conda-forge         

What is the difference between them? Which one should I pick?

Thanks!

Originally posted by @ovalerio in #962

@jacobtomlinson
Member Author

jacobtomlinson commented Mar 6, 2025

> I am running into an ImportError after installing cudf that is calling dask_cudf which is looking for a dask.dataframe dependency and failing to find it

@ovalerio can you share the error you are getting when you import things?
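One way to collect that information is a small script that attempts each import and records the full traceback. This is only a sketch using the standard library; the module list is taken from the names mentioned in the report, and the helper names `check_imports` and `format_report` are made up for illustration.

```python
import importlib
import traceback

# Modules named in the report; adjust the list as needed.
MODULES = ["dask", "dask.dataframe", "cudf", "dask_cudf"]


def check_imports(modules):
    """Try each import and return {module_name: None or traceback text}."""
    results = {}
    for name in modules:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception:
            # Keep the whole traceback so it can be pasted into the issue.
            results[name] = traceback.format_exc()
    return results


def format_report(results):
    """Render the results as plain text, one status line per module."""
    lines = []
    for name, error in results.items():
        status = "OK" if error is None else "FAILED"
        lines.append(f"{name}: {status}")
        if error is not None:
            lines.append(error.rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(check_imports(MODULES)))
```

Running it inside the affected conda environment and pasting the output here, together with the `conda list` output above, should be enough to see which import fails and why.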

> What is the difference between them? Which one should I pick?

I expect you will want to use dask-jobqueue if you are looking to submit jobs to a SLURM HPC and you are able to run commands like srun and sbatch.

The dask-gateway-server-jobqueue package is part of dask-gateway which is used to provide Dask as a service on top of SLURM and is generally used by cluster admins to provide Dask access to a team or org who do not have direct access to submit SLURM jobs.
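For the dask-jobqueue route, a minimal sketch might look like the following. The queue name, core count, memory, and walltime are placeholder assumptions and not values from this thread, and the helper names `slurm_options` and `start_cluster` are hypothetical. `SLURMCluster`, its `scale(jobs=...)` method, and `Client` come from dask-jobqueue and distributed.

```python
def slurm_options(queue="gpu", cores=8, memory="32GB", walltime="01:00:00"):
    """Collect keyword arguments for dask_jobqueue.SLURMCluster.

    All defaults are placeholders; use the values for your own cluster.
    """
    return {
        "queue": queue,        # SLURM partition to submit to
        "cores": cores,        # cores per job
        "memory": memory,      # memory per job
        "walltime": walltime,  # time limit per job
    }


def start_cluster(options, jobs=2):
    """Submit `jobs` SLURM jobs and return a Client connected to them."""
    # Imported lazily so this file loads even without dask-jobqueue installed.
    from dask.distributed import Client
    from dask_jobqueue import SLURMCluster

    cluster = SLURMCluster(**options)
    cluster.scale(jobs=jobs)  # each job is submitted through sbatch
    return Client(cluster)
```

Calling `start_cluster(slurm_options(queue="my-partition"))` from a node where `sbatch` is available would then return a client whose workers run as SLURM jobs.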

jacobtomlinson added the needs info, gpu, and needs reproducer labels and removed the needs triage label on Mar 6, 2025
@jacobtomlinson
Member Author

Closing as a duplicate of rapidsai/dask-cuda#1455

@ovalerio

ovalerio commented Mar 6, 2025

Hey @jacobtomlinson,

Thanks for answering my question and for the amazing jupyterlab-nvdashboard. 🎩 🚀
