
Spark Streaming with Kafka

Suraj Ghimire

Let's use the covid19 dataset to build a small PoC on Spark Streaming with Kafka.



  1. Download the latest Kafka release from the Apache Kafka website.
  2. Extract it to a folder, say D:\kafka_2.13-2.5.0
  3. Navigate to D:\kafka_2.13-2.5.0\kafka_2.13-2.5.0\bin\windows
  4. Copy D:\kafka_2.13-2.5.0\kafka_2.13-2.5.0\config\ to kafka\bin\windows
  5. Start ZooKeeper, passing it the zookeeper.properties file from the copied config folder
    $ zookeeper-server-start.bat config\zookeeper.properties

  6. Start the Kafka server, passing it server.properties in the same way
    $ kafka-server-start.bat config\server.properties

  7. Create the Kafka topic
    $ kafka-topics.bat --zookeeper localhost:2181 --create --topic covid19india --partitions 2 --replication-factor 1

  8. Create a DB table to store the data.

    CREATE TABLE `covid19india` (
      `sno` varchar(100) DEFAULT NULL,
      `date_of_identification` varchar(100) DEFAULT NULL,
      `current_status` varchar(100) DEFAULT NULL,
      `state` varchar(100) DEFAULT NULL,
      `num_of_cases` int(11) DEFAULT NULL
    );

  9. Run the Kafka consumer code.
  10. Run the Kafka producer code.
  11. Check that your MySQL database table is getting populated.
  12. Visualize it using Tableau.
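
The producer code itself is not shown in the steps above, so here is a minimal sketch of what the "Run Kafka Producer Code" step could look like. It assumes the kafka-python package, a broker on the default localhost:9092, and a CSV file whose header names match the columns of the covid19india table; the function names row_to_message and publish are made up for illustration.

```python
import csv
import io
import json

TOPIC = "covid19india"  # topic name from the create-topic step
# Column names taken from the covid19india table definition
COLUMNS = ["sno", "date_of_identification", "current_status", "state", "num_of_cases"]


def row_to_message(row):
    """Turn one CSV row (a dict) into UTF-8 JSON bytes for Kafka."""
    record = {name: row.get(name) for name in COLUMNS}
    # num_of_cases is an int column in the table; the other columns stay strings
    cases = record["num_of_cases"]
    record["num_of_cases"] = int(cases) if cases not in (None, "") else None
    return json.dumps(record).encode("utf-8")


def publish(csv_text, bootstrap_servers="localhost:9092"):
    """Send every CSV row to the topic (assumes kafka-python is installed)."""
    # Imported lazily so row_to_message can be tried without a Kafka client
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    for row in csv.DictReader(io.StringIO(csv_text)):
        producer.send(TOPIC, value=row_to_message(row))
    producer.flush()  # block until buffered messages are delivered
    producer.close()
```

To try it, read the dataset file into a string and pass it to publish() while ZooKeeper and the Kafka server are running. Sending JSON keeps the messages easy to parse on the consumer side, but any format works as long as producer and consumer agree on it.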

Example usecase
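
The consumer side of this use case (read from the covid19india topic, write into the covid19india table) might be sketched as below. This is not the original consumer code: it uses PySpark Structured Streaming, which is an assumption, and it assumes the messages are JSON objects keyed by the table's column names. The JDBC url, user and password arguments are placeholders, and the helper names schema_ddl, jdbc_options and run are made up for illustration.

```python
TOPIC = "covid19india"  # topic created in the steps above
TABLE = "covid19india"  # MySQL table created in the steps above

# Field names and types mirror the table definition (varchar -> STRING, int -> INT)
FIELDS = [
    ("sno", "STRING"),
    ("date_of_identification", "STRING"),
    ("current_status", "STRING"),
    ("state", "STRING"),
    ("num_of_cases", "INT"),
]


def schema_ddl():
    """Build the DDL-style schema string that from_json accepts."""
    return ", ".join(f"{name} {kind}" for name, kind in FIELDS)


def jdbc_options(url, user, password):
    """Collect the JDBC writer options; url, user and password are placeholders."""
    return {"url": url, "dbtable": TABLE, "user": user, "password": password}


def run(url, user, password, bootstrap_servers="localhost:9092"):
    """Read the topic as a stream and append each micro-batch to MySQL."""
    # Imported lazily: needs pyspark plus the Kafka connector and a MySQL JDBC driver
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json

    spark = SparkSession.builder.appName("covid19india-consumer").getOrCreate()
    messages = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", TOPIC)
        .option("startingOffsets", "earliest")
        .load()
    )
    # Kafka values arrive as bytes: cast to string, then parse the JSON payload
    rows = messages.select(
        from_json(col("value").cast("string"), schema_ddl()).alias("r")
    ).select("r.*")

    def write_batch(batch_df, batch_id):
        # foreachBatch hands over a normal DataFrame, so the batch JDBC writer works
        opts = jdbc_options(url, user, password)
        batch_df.write.format("jdbc").options(**opts).mode("append").save()

    rows.writeStream.foreachBatch(write_batch).start().awaitTermination()
```

For this to run, the Spark Kafka connector (the spark-sql-kafka-0-10 package) and a MySQL JDBC driver need to be available to Spark. Start the consumer before the producer so you can watch the table fill up as messages arrive.
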

Published By : Suraj Ghimire

