Apache Spark Quick Standalone Installation Guide for Windows


In this article, we will go through the steps to install Apache Spark on Windows and test a word-count program in IntelliJ/Eclipse using Maven dependencies. The steps are as follows:

  1. Download spark-2.3.4-bin-hadoop2.7.tgz (Spark 2.3.4) from this link
    This package is pre-built for Hadoop 2.7, but do not install Hadoop: we will run Spark on its own in standalone mode.
  2. Create a new directory C:\Spark\ on partition C:
  3. Unpack the downloaded archive and move all files and subdirectories from the root directory of the archive into C:\Spark\ (so that README.md, bin\, examples\, etc. are directly under C:\Spark\)
  4. Create a new directory C:\winutils and a subdirectory C:\winutils\bin
  5. Download winutils.exe from this link (right-click the link and save the file)
  6. Move winutils.exe into C:\winutils\bin
  7. Set HADOOP_HOME to C:\winutils\ in the system environment variables. Add it under “System variables”, not “User variables”. Then restart your computer.
  8. Also make sure JAVA_HOME is set and points to an existing JDK 8 installation (e.g. C:\Program Files\Java\jdk-8). It must point to a JDK, not a JRE.
  9. Open a Windows command shell and change the directory to C:\Spark
  10. Enter .\bin\run-example SparkPi 10 to test your Spark installation. Somewhere in the output, you should see the line “Pi is roughly 3.14…”.
  11. Open IntelliJ/Eclipse and add the Maven dependencies to your pom.xml as follows:
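
A minimal sketch of the relevant pom.xml section. It assumes that spark-core is the only Spark module the word count needs, that the Scala 2.11 build is the right match for the Spark 2.3.4 binary distribution, and that the project compiles for Java 8; adjust the coordinates if your setup differs.

```xml
<!-- Assumption: compile for Java 8 to match the JDK from step 8 -->
<properties>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
</properties>

<dependencies>
    <!-- Assumption: Spark core for Scala 2.11, same version as the download in step 1 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.3.4</version>
    </dependency>
</dependencies>
```

Both elements go inside the top-level project element of pom.xml. After saving the file, let the IDE re-import the Maven project so the Spark classes become available.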

  12. Now create a class WordCount (file WordCount.java) under src/main/java in your project. Also specify the correct location of the file whose words you want to count.
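
One possible shape for WordCount.java, as a sketch and not necessarily the original author's code. It assumes Spark 2.3.4 on the classpath through Maven, a local master (`local[*]`) so it runs directly from the IDE, and a default input of C:\Spark\README.md from step 3; the `tokenize` helper and the `inputPath` variable are names introduced here for illustration.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

import java.util.Arrays;
import java.util.Iterator;

public class WordCount {

    // Illustrative helper: lowercase a line and split it on non-word characters.
    // Empty tokens (e.g. from leading punctuation) are filtered out later.
    static Iterator<String> tokenize(String line) {
        return Arrays.asList(line.toLowerCase().split("\\W+")).iterator();
    }

    public static void main(String[] args) {
        // Assumption: default to the README unpacked in step 3;
        // pass another path as the first program argument to override it.
        String inputPath = args.length > 0 ? args[0] : "C:\\Spark\\README.md";

        SparkConf conf = new SparkConf()
                .setAppName("WordCount")
                .setMaster("local[*]"); // assumption: run in this JVM on all cores

        // JavaSparkContext is Closeable, so try-with-resources stops it at the end.
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> lines = sc.textFile(inputPath);

            JavaPairRDD<String, Integer> counts = lines
                    .flatMap(WordCount::tokenize)             // line -> words
                    .filter(word -> !word.isEmpty())          // drop empty tokens
                    .mapToPair(word -> new Tuple2<>(word, 1)) // word -> (word, 1)
                    .reduceByKey(Integer::sum);               // add up per word

            // collect() brings all results to the driver; fine for a small file.
            for (Tuple2<String, Integer> entry : counts.collect()) {
                System.out.println(entry._1() + ": " + entry._2());
            }
        }
    }
}
```

Running main from the IDE without arguments counts the words in C:\Spark\README.md and prints one `word: count` line per distinct word, in no particular order.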

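If the SparkPi test or the word count fails at startup, the variables from steps 7 and 8 are worth checking first. The following is a minimal sketch of such a check using only the JDK; the class name `SparkEnvCheck` and the method `findProblems` are hypothetical, and treating the presence of bin\javac.exe as the sign of a JDK is an assumption of this sketch.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SparkEnvCheck {

    // Returns readable problem descriptions; an empty list means the
    // variables from steps 7 and 8 look plausible.
    // Note: lookups in the map are case-sensitive, so the variables must be
    // named exactly HADOOP_HOME and JAVA_HOME.
    static List<String> findProblems(Map<String, String> env) {
        List<String> problems = new ArrayList<>();

        String hadoopHome = env.get("HADOOP_HOME");
        if (hadoopHome == null || hadoopHome.trim().isEmpty()) {
            problems.add("HADOOP_HOME is not set");
        } else if (!new File(new File(hadoopHome, "bin"), "winutils.exe").isFile()) {
            problems.add("winutils.exe not found in the bin folder of " + hadoopHome);
        }

        String javaHome = env.get("JAVA_HOME");
        if (javaHome == null || javaHome.trim().isEmpty()) {
            problems.add("JAVA_HOME is not set");
        } else if (!new File(new File(javaHome, "bin"), "javac.exe").isFile()) {
            // Assumption: a JRE has no compiler, so javac.exe indicates a JDK.
            problems.add("JAVA_HOME does not look like a JDK (no bin\\javac.exe)");
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> problems = findProblems(System.getenv());
        if (problems.isEmpty()) {
            System.out.println("HADOOP_HOME and JAVA_HOME look fine.");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

Taking the environment as a map keeps the check easy to try with made-up values; main simply passes in the real environment of the current process.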
References: Spark Installation guide by Dr. Matthias Nickles