Environment Setting

Nihal Soans edited this page Feb 2, 2018 · 1 revision

Environment Setup

Anaconda

Anaconda is a complete Python distribution that bundles the most common packages and makes it easy to install new ones.

Download and install Anaconda (https://www.continuum.io/downloads).
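
Once Anaconda is installed, a quick check of which interpreter is actually running can save some confusion later. The snippet below is only a minimal sketch: the function name describe_python is made up for this page, and it assumes that having a conda executable on the PATH is a reasonable hint that Anaconda is available (it does not prove the running interpreter belongs to Anaconda).

```python
import shutil
import sys


def describe_python():
    """Collect a few facts about the interpreter that is currently running."""
    return {
        # Full path of the running interpreter; for an Anaconda install this
        # usually points inside the Anaconda folder.
        "executable": sys.executable,
        # Version as major.minor.micro.
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Assumption: a `conda` command on the PATH hints at an Anaconda install.
        "conda_on_path": shutil.which("conda") is not None,
    }


if __name__ == "__main__":
    for key, value in describe_python().items():
        print(f"{key}: {value}")
```

If the executable path does not point into the Anaconda folder, the IDE or the shell is probably using another Python installation.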

PyCharm or IntelliJ Idea

IntelliJ Idea is a complete IDE with plugins for Java, Scala and Python, among others. PyCharm is an equivalent IDE that supports only Python, which makes it lighter.

Download one of these two IDEs (Community Edition).

If you choose IntelliJ Idea, you must install the Python plugin, which is not included by default.

Spark

Download the latest version of Spark, pre-built for Hadoop 2.6:

  • Go to http://spark.apache.org/downloads.html
  • Choose a release (take the latest)
  • Choose a package type: Pre-built for Hadoop 2.6 and later
  • Choose a download type: Direct Download
  • Click on the link in Step 4
  • Once downloaded, unzip the file and place it in a directory of your choice
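
After extracting the archive, it can help to confirm that the chosen directory really is the Spark folder before pointing an IDE at it. The sketch below is not part of the official setup: the helper names missing_spark_entries and set_spark_home are hypothetical, and the list of expected entries is an assumption based on the usual layout of a pre-built Spark package (bin/spark-submit, bin/pyspark and python/pyspark).

```python
import os
import sys
from pathlib import Path

# Assumption: entries normally found in an extracted pre-built Spark package.
EXPECTED_ENTRIES = ("bin/spark-submit", "bin/pyspark", "python/pyspark")


def missing_spark_entries(spark_home):
    """Return the expected entries that are absent under spark_home."""
    root = Path(spark_home)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


def set_spark_home(spark_home):
    """Set SPARK_HOME for the current process if the folder looks valid."""
    missing = missing_spark_entries(spark_home)
    if missing:
        raise FileNotFoundError(
            f"{spark_home} does not look like a Spark folder; missing: {missing}"
        )
    # Only affects this Python process and the programs it starts.
    os.environ["SPARK_HOME"] = str(Path(spark_home).resolve())
    return os.environ["SPARK_HOME"]


if __name__ == "__main__":
    # Usage: python check_spark.py <path to the extracted Spark folder>
    if len(sys.argv) != 2:
        sys.exit("usage: python check_spark.py <spark folder>")
    print("SPARK_HOME set to", set_spark_home(sys.argv[1]))
```

The variable is set only for the running process. To make it permanent, define SPARK_HOME in the operating system or in the run configuration of the IDE.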

See the WIKI tab for more details on setting up the IDE to run PySpark (IDE Setting for Pyspark).