Prerequisites
- Windows 7/10 Operating System
- Java 8 (JDK 1.8 or a later version)
Walk-through
In this article, I am going to walk through the various steps to create a first Apache Spark application using IntelliJ IDEA in Windows 7/10.
Step 1: Download IntelliJ IDEA Community Edition
Search for "intellij idea download" on www.google.com
Click on the "IntelliJ IDEA Download" link in the search results
Click on the "Download" button for the Community Edition installer (.exe) under the Windows tab
Click on "Show in folder"
Step 2: Installation of IntelliJ IDEA Community Edition
Double click on "ideaIC-2019.2.4.exe" to start the installation
Click on "Yes" button
Click on "Next" button to continue
Change the "Destination Folder" if you like to change else click on "Next" button to continue
Click on different check box if you wish to select those options or else click on "Next" button to continue
Click on "Install" button to continue
Click on "Run IntelliJ IDEA Community Edition" check box and click on "Finish" button to complete the installation.
Step 3: Configure IntelliJ IDEA Community Edition
Choose "Do not import settings" and click on "OK" button to configure IntelliJ IDEA Community Edition
Choose "Set UI theme" either "Darcula" or "Light"
Click on "Next: Default plugins" button to continue
Click on "Next: Featured plugins"
Click on "Install" button under the "Scala" Featured plugins
Click on "Start using IntelliJ IDEA" button to continue
Step 4: Create sbt based Scala project for Apache Spark Application
Click on "Create New Project" button
Choose "Scala" from left section/menu
Choose "sbt" from right section and click on "Next" button to continue
Provide project name as "apachespark101", JDK as 1.8(Java 8), sbt version as 1.3.2 and Scala version as 2.12.8
Click on "Finish" button to continue
Click on "Close" button to continue
Click on "Project" explorer and open the "build.sbt" file
Add the Apache Spark dependency in the "build.sbt"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.4"
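For reference, after adding the dependency, the complete build.sbt should look roughly like this (the name, version, and Scala values below assume the choices made earlier in this walk-through):

name := "apachespark101"

version := "0.1"

scalaVersion := "2.12.8"

libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.4"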
Click on "sbt" tab from the right section
Click on "Refresh all sbt projects" button in the "sbt" tab from the right section
Expand the "src", "main" and "Scala"
Right click on the "Scala" folder and choose "New" and "Scala Class"
Provide class name "create_first_app_apachespark101_part_1" and choose "Object"
Note: Since a Scala object (a singleton) is created here, there is no need to instantiate anything; a main method defined in the object can be executed directly.
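For illustration, a minimal Scala object with a runnable main method looks like this (a generic sketch with a hypothetical name, separate from the Spark application below):

// A Scala object is a singleton: the runtime creates it once,
// so a main method defined inside it can be run directly.
object HelloExample {
  def main(args: Array[String]): Unit = {
    println("Hello from a Scala object")
  }
}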
Place the following code in the create_first_app_apachespark101_part_1.scala file:
package com.datamaking.apachespark101

import org.apache.spark.sql.SparkSession

object create_first_app_apachespark101_part_1 {
  def main(args: Array[String]): Unit = {
    println("Started ...")
    println("First Apache Spark 2.4.4 Application using IntelliJ IDEA in Windows 7/10 | Apache Spark 101 Tutorial | Scala API | Part 1")

    // Create a SparkSession that runs locally, using all available cores
    val spark = SparkSession
      .builder
      .appName("Apache Spark 101 Tutorial | Part 1")
      .master("local[*]")
      .getOrCreate()

    // Reduce console noise to errors only
    spark.sparkContext.setLogLevel("ERROR")

    // Distribute the list across 3 partitions and upper-case each element
    val tech_names_list = List("spark1", "spark2", "spark3", "hadoop1", "hadoop2", "spark4")
    val names_rdd = spark.sparkContext.parallelize(tech_names_list, 3)
    val names_upper_case_rdd = names_rdd.map(ele => ele.toUpperCase())

    // Bring the results back to the driver and print them
    names_upper_case_rdd.collect().foreach(println)

    spark.stop()
    println("Completed.")
  }
}
Run the Scala object (for example, right-click inside the editor and choose "Run")
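If everything is configured correctly, the Run console should print output along these lines (collect() returns the elements of a parallelized list in their original order, so the sequence below is expected):

Started ...
First Apache Spark 2.4.4 Application using IntelliJ IDEA in Windows 7/10 | Apache Spark 101 Tutorial | Scala API | Part 1
SPARK1
SPARK2
SPARK3
HADOOP1
HADOOP2
SPARK4
Completed.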
Summary
In this article, we have successfully installed IntelliJ IDEA Community Edition and run our first Apache Spark application. Please post your feedback and queries in the comments. Thank you. Happy Learning!!!