Installing dask and distributed packages can be confusing #962
Comments
I'm OK with option 1 for now, although it's suboptimal. I think #2, if you're up for the hassle, is the better situation long-term, especially since I anticipate that there will eventually crop up other optional things on top of dask (e.g. a more complete graphical admin UI package, or various scheduler compatibility libraries), which would be best to have depend on just |
Currently dask is just dask-core. Another alternative is to make a metapackage called dask-complete or something and point people to this. This has the same flaw of "few people read the docs" but it is easier than |
I don't have a strong opinion on which option is the best, as long as we don't have cyclic dependencies. This is actually what prompted this discussion, because I was asked to add |
Oh hey, it turns out that we already solve this problem on conda-forge. We just have dask/dask depend on the previous version of dask/distributed. We always create some buffer room when versioning (the two projects work with one version behind), at least with micro-version updates. So copying the recipes here might solve the problem today: |
Isn't that a bit confusing? If someone runs |
It's a >= dependency. I'm not sure what Conda's policy is here. Do they prefer newer versions to older ones if available? |
Conda only prefers newer versions for … Even though the projects depend on older versions of each other, they still depend on each other, which is confusing. Also the |
I understand the motivation here, but for development and dependency reasons they will likely remain separate projects. They develop at different rates, have different developer communities, and are depended on by different projects that sometimes only want dask, and not the distributed scheduler. I've had strong requests both to add and to remove the |
I've started a dask-core package on conda-forge here: conda-forge/staged-recipes#3820 |
Why is the Conda package for
|
Probably mostly because of history, and now because of inertia |
Hey guys, just a quick question since I am new to dask. I followed all the instructions given on the official website to install the package, but I am getting this error: ModuleNotFoundError: No module named 'dask.dataframe'; 'dask' is not a package. Any help on how to resolve this issue? |
I am experiencing the same issue as @yosopak2020, have tried multiple versions of dask, multiple versions of python (3.5, 3.6).
Exact error message for me is: "ModuleNotFoundError: No module named 'dask.dataframe'; 'dask' is not a package"
No issue importing other libs either (pandas or numpy), only dask appears to be affected. |
After several tries, I just restarted my computer and it worked. I don't know why, but you can try that. |
@yort how are you installing dask? If you're using pip, you need |
Hi! Despite using
when later trying How to solve this issue?? |
What's the output of |
The output after
    # In[6]: get_ipython().system('pip install dask[complete] distributed --upgrade')
is:
    Collecting dask[complete]
      Downloading https://files.pythonhosted.org/packages/1d/f1/700c604af030d9b256a6590adf56cadd174c30c8ac6f555daf0e3023d294/dask-0.17.2-py2.py3-none-any.whl (582kB)
    Collecting distributed
      Downloading https://files.pythonhosted.org/packages/39/e8/7453e61bbee910aa91936743d6782a2108c28d9945f5f61cf801b485b5fa/distributed-1.21.6-py2.py3-none-any.whl (458kB)
    Collecting numpy>=1.10.4; extra == "complete" (from dask[complete])
      Downloading https://files.pythonhosted.org/packages/76/4d/418dda252cf92bad00ab82d6b2a856e7843b47a5c2f084aed34b14b67d64/numpy-1.14.2-cp27-cp27mu-manylinux1_x86_64.whl (12.1MB)
    Collecting toolz>=0.7.3; extra == "complete" (from dask[complete])
    Requirement already up-to-date: pandas>=0.19.0; extra == "complete" in /usr/local/envs/py2env/lib/python2.7/site-packages (from dask[complete])
    Collecting cloudpickle>=0.2.1; extra == "complete" (from dask[complete])
      Downloading https://files.pythonhosted.org/packages/aa/18/514b557c4d8d4ada1f0454ad06c845454ad438fd5c5e0039ba51d6b032fe/cloudpickle-0.5.2-py2.py3-none-any.whl
    Collecting partd>=0.3.8; extra == "complete" (from dask[complete])
      Downloading https://files.pythonhosted.org/packages/4a/ca/207a28fd81111f6a88e79a006745ff432b9cae850fbafa27486e98d459da/partd-0.3.8-py2.py3-none-any.whl
    Collecting tblib (from distributed)
      Downloading https://files.pythonhosted.org/packages/4a/82/1b9fba6e93629a8557f9784cd8f1ae063c8762c26446367a6764edd328ce/tblib-1.3.2-py2.py3-none-any.whl
    Collecting six (from distributed)
      Downloading https://files.pythonhosted.org/packages/67/4b/141a581104b1f6397bfa78ac9d43d8ad29a7ca43ea90a2d863fe3056e86a/six-1.11.0-py2.py3-none-any.whl
    Requirement already up-to-date: click>=6.6 in /usr/local/envs/py2env/lib/python2.7/site-packages (from distributed)
    Collecting tornado>=4.5.1 (from distributed)
    Collecting sortedcontainers (from distributed)
      Downloading https://files.pythonhosted.org/packages/ea/67/c76c354ff30a689aeb2c75c4d383ae618c27fc2180d313f387f8918a3429/sortedcontainers-1.5.9-py2.py3-none-any.whl
    Requirement already up-to-date: singledispatch; python_version < "3.4" in /usr/local/envs/py2env/lib/python2.7/site-packages (from distributed)
    Collecting psutil (from distributed)
    Collecting zict>=0.1.3 (from distributed)
      Downloading https://files.pythonhosted.org/packages/5d/c9/eddd6c9a7ebd65fc799f9b87e56b45599a4e35d66e3da2722d7fc2a89f1f/zict-0.1.3-py2.py3-none-any.whl
    Collecting msgpack-python (from distributed)
    Collecting futures; python_version < "3.0" (from distributed)
      Downloading https://files.pythonhosted.org/packages/2d/99/b2c4e9d5a30f6471e410a146232b4118e697fa3ffc06d6a65efde84debd0/futures-3.2.0-py2-none-any.whl
    Collecting python-dateutil (from pandas>=0.19.0; extra == "complete"->dask[complete])
      Downloading https://files.pythonhosted.org/packages/0c/57/19f3a65bcf6d5be570ee8c35a5398496e10a0ddcbc95393b2d17f86aaaf8/python_dateutil-2.7.2-py2.py3-none-any.whl (212kB)
    Collecting pytz>=2011k (from pandas>=0.19.0; extra == "complete"->dask[complete])
      Downloading https://files.pythonhosted.org/packages/dc/83/15f7833b70d3e067ca91467ca245bae0f6fe56ddc7451aa0dc5606b120f2/pytz-2018.4-py2.py3-none-any.whl (510kB)
    Collecting locket (from partd>=0.3.8; extra == "complete"->dask[complete])
    Requirement already up-to-date: backports-abc>=0.4 in /usr/local/envs/py2env/lib/python2.7/site-packages (from tornado>=4.5.1->distributed)
    Collecting heapdict (from zict>=0.1.3->distributed)
    Installing collected packages: numpy, tblib, six, futures, tornado, cloudpickle, sortedcontainers, psutil, heapdict, zict, msgpack-python, toolz, distributed, locket, partd, dask, python-dateutil, pytz
      Found existing installation: numpy 1.14.0
        Uninstalling numpy-1.14.0:
          Successfully uninstalled numpy-1.14.0
      Found existing installation: six 1.10.0
        Uninstalling six-1.10.0:
          Successfully uninstalled six-1.10.0
      Found existing installation: futures 3.0.5
        Uninstalling futures-3.0.5:
          Successfully uninstalled futures-3.0.5
      Found existing installation: tornado 4.4.2
        Uninstalling tornado-4.4.2:
          Successfully uninstalled tornado-4.4.2
      Found existing installation: psutil 4.3.0
        Uninstalling psutil-4.3.0:
          Successfully uninstalled psutil-4.3.0
      Found existing installation: toolz 0.8.2
        Uninstalling toolz-0.8.2:
          Successfully uninstalled toolz-0.8.2
      Found existing installation: dask 0.17.1
        Uninstalling dask-0.17.1:
          Successfully uninstalled dask-0.17.1
      Found existing installation: python-dateutil 2.5.0
        Uninstalling python-dateutil-2.5.0:
          Successfully uninstalled python-dateutil-2.5.0
      Found existing installation: pytz 2016.7
        Uninstalling pytz-2016.7:
          Successfully uninstalled pytz-2016.7
    Successfully installed cloudpickle-0.5.2 dask-0.17.2 distributed-1.21.6 futures-3.2.0 heapdict-1.0.0 locket-0.2.0 msgpack-python-0.5.6 numpy-1.14.2 partd-0.3.8 psutil-5.4.5 python-dateutil-2.7.2 pytz-2018.4 six-1.11.0 sortedcontainers-1.5.9 tblib-1.3.2 toolz-0.9.0 tornado-5.0.2 zict-0.1.3
    You are using pip version 9.0.1, however version 10.0.0 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command.
Restart of the notebook/IPython results in kernel death. Any help is very valuable. |
I'm not sure what get_ipython().system does. Perhaps it would be worth activating your environment in a terminal and upgrading from there. |
It seems like we found a rather stable solution which doesn't interfere with the existing datalab dask installation. We run an external script doing:
    !pip install tornado==4.5.1 distributed==1.21 dask-ml[complete]
and restarting the current server on GCP and then verifying the installation. After that all packages needed for development are available. It's not the best way of solving the problem, but it works for us. |
Glad to hear it! |
@TomAugspurger sorry, ignore me, total noob mistake on my part. I had a testing file called dask.py being used which was being imported instead of actual dask! Lesson learned! |
We've all made that mistake at least once :)
|
I just did the same thing; thanks to your post I realized it. :) |
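For anyone else hitting `'dask' is not a package`: a stray `dask.py` in the working directory is importable as a plain module, so it shadows the real package. The sketch below is one way to check what the interpreter would actually load. It assumes Python 3, and the helper names `import_origin` and `looks_like_package` are hypothetical, for illustration only.

```python
import importlib.util


def import_origin(name):
    """Return the file path that `import name` would load, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def looks_like_package(name):
    """True if `name` resolves to a package (a directory), not a single .py file."""
    spec = importlib.util.find_spec(name)
    # Packages have submodule search locations; plain modules such as a
    # stray dask.py do not, which is what triggers "'dask' is not a package".
    return spec is not None and spec.submodule_search_locations is not None


if __name__ == "__main__":
    print("dask would be loaded from:", import_origin("dask"))
    print("dask is a package:", looks_like_package("dask"))
```

If the printed path points at a file in your own project rather than into site-packages, renaming that file (and deleting any stale `dask.pyc` or `__pycache__` entry) is the likely fix.
|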
Is there any update on how to solve this issue? Tried doing: which I don't think changed much since I am still getting the module not found error. |
@jsCoder020193 Hmm, that's odd. Are you sure that the pip you are using is in the same python environment? I'm not able to reproduce the same problem following the install docs:
or, if you only want distributed and the core parts of dask...
|
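On the question of whether pip and python belong to the same environment, a minimal sketch of one way to check: invoke pip through the running interpreter (`python -m pip`) instead of whatever `pip` is first on the PATH. The helper name `pip_command` is hypothetical; `sys.executable` and `python -m pip` are standard.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a pip invocation bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    # `pip --version` reports the site-packages directory pip installs into,
    # which should sit under the same environment as the interpreter above.
    subprocess.run(pip_command("--version"), check=False)
```

If the two paths point at different environments, installing with the command built by `pip_command("install", ...)` targets the interpreter that will later run the import.
|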
Today if you Between that, and the more detailed installation docs on the dask and distributed pages, I'm not sure there's much else to do here. Details:
|
Dear It is 2025 and I am running into an ImportError after installing
My idea is to use
What is the difference between them? Which one should I pick? Thanks! |
Our installation docs recommend that people do the following
or
However we shouldn't expect most people to do the proper diligence of reading installation docs. We all tend to just guess that conda install name-of-project works pretty well most of the time. Unfortunately, if you've heard that Dask does distributed computing, you run conda install dask, and then try out any distributed example, you're likely to receive an import error, which makes for a bad first impression.
There are a few ways that we could resolve this problem:
1. … dask.distributed thing. These would point them to installation docs. This wouldn't help if they just did import distributed, though I think that most of the public materials we produce at this point always import from dask.distributed.
2. … dask with a metapackage that included both dask and distributed. This would be foolproof in the conda case but would be a bit of an organizational hassle from a packaging perspective. We would rename the existing package dask-core (or something similar) and then switch in the dask metapackage. We would have to do this on conda-forge at the same time.
I'm in favor of starting with option 1, though would love to find a more thorough alternative.
cc @pzwang @ilanschnell
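To make option 1 concrete, here is a rough sketch of an import guard that turns a bare ImportError into one that points at the installation docs. It is an illustration only, not how dask implements it: the helper name `require` and the hint text are assumptions.

```python
import importlib


def require(module_name, install_hint):
    """Import `module_name`, or raise an ImportError that carries `install_hint`."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with guidance while keeping the original error as the cause.
        raise ImportError(
            f"{module_name!r} is not installed. {install_hint}"
        ) from exc
```

A `dask.distributed` shim could then call something like `require("distributed", "See the installation docs for how to install dask together with distributed.")`, so a user who only ran conda install dask gets a pointer instead of a plain import failure.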