Big Data Playground for Engineers: Dump a Twitter Stream into a Kafka Topic

Git:

Example Illustration

Creating Your Own Credentials for Twitter APIs
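Once the developer account has issued its keys, one way to keep them out of source control is to read them from environment variables. The sketch below is a minimal illustration under that assumption; the variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `load_twitter_credentials` are hypothetical and not taken from the original project.

```python
import os

# Hypothetical environment variable names for the four OAuth 1.0a values
# that Twitter issues for an app (consumer key/secret, access token/secret).
REQUIRED_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("Missing Twitter credentials: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Assuming the `auth` object used for streaming is a tweepy 3.x `OAuthHandler`, the returned values would be passed to `OAuthHandler(consumer_key, consumer_secret)` followed by `set_access_token(access_token, access_token_secret)`.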

Twitter Stream with Python

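from kafka import KafkaProducer  # kafka-python client
from tweepy import Stream  # tweepy 3.x streaming API (assumed from the Stream(auth, listener) call)

# Attach the Kafka-backed listener to the authenticated stream.
twitter_stream = Stream(auth, TweetsListener(kafka_addr=self._kafka_addr, topic=kafka_topic, is_ai=is_ai))
# https://developer.twitter.com/en/docs/tweets/filter-realtime/api-reference/post-statuses-filter
# https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/basic-stream-parameters
# Start streaming English tweets that match any of the keywords.
twitter_stream.filter(track=keywords, languages=["en"])
# Inside the listener: create the producer, then publish each raw tweet and wait up to 10 s for the broker's acknowledgement.
self._kafka_producer = KafkaProducer(bootstrap_servers='localhost:9092')
self._kafka_producer.send(kafka_topic, data.encode('utf-8')).get(timeout=10)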
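The fragments above come from two places: the driver that opens the stream and the listener that writes to Kafka. As a rough, self-contained sketch of the listener side only, the class below forwards each raw tweet to a topic. `TweetForwarder` is a hypothetical name and not the article's `TweetsListener`. The producer is passed in, so anything with a kafka-python style `send(topic, value)` that returns a future with `get(timeout=...)` will do. Skipping messages that lack a `text` field (keep-alives, limit and delete notices) is an assumption about what should reach the topic.

```python
import json


class TweetForwarder:
    """Forwards raw tweet payloads from a stream callback to a Kafka topic."""

    def __init__(self, producer, topic, timeout=10):
        self._producer = producer  # e.g. a kafka-python KafkaProducer
        self._topic = topic
        self._timeout = timeout
        self.sent = 0
        self.skipped = 0

    def on_data(self, data):
        """Handle one raw stream message; always return True to keep streaming."""
        try:
            message = json.loads(data)
        except (TypeError, ValueError):
            # Blank keep-alive lines and malformed payloads are not valid JSON.
            self.skipped += 1
            return True
        if not isinstance(message, dict) or "text" not in message:
            # Limit and delete notices carry no tweet text.
            self.skipped += 1
            return True
        payload = data if isinstance(data, bytes) else data.encode("utf-8")
        # Block until the broker acknowledges the record or the timeout expires.
        self._producer.send(self._topic, payload).get(timeout=self._timeout)
        self.sent += 1
        return True
```

With kafka-python installed and a broker on `localhost:9092`, this could be constructed as `TweetForwarder(KafkaProducer(bootstrap_servers='localhost:9092'), kafka_topic)`. Injecting the producer also makes the class easy to exercise with a stub when no broker is running.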


A simple guy in pursuit of AI and Deep Learning with Big Data tools :) @ https://www.linkedin.com/in/mageswaran1989/

Mageswaran D
