Installation

Dependencies

  • Apache Spark - You will need Spark installed (or downloaded and unpacked) on your machine to run this. Note: when running on a Mac, I had to downgrade to Java 8 to get Spark to run (otherwise even the Spark examples fail).
  • Apache Kafka - Our Twitter feed uses Apache Kafka to publish the stream of tweets. Download it and unpack the tarball.
  • SciPy - The SciPy toolkit is installed automatically by the steps below, but on Windows you need to make sure you have the Microsoft C++ libraries installed. Follow the instructions here.
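
Before installing, it can help to confirm that the external tools are visible. The following is a minimal sketch, not part of Twitter ML: the helper names parse_java_major and check_dependencies are hypothetical, and it assumes that java and spark-submit are on your PATH (an unpacked Spark download may not be, in which case the check reports it as missing even though Spark is present). The Java 8 expectation reflects the Mac note above, not a documented Spark requirement.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Legacy numbering: "1.8.0_292" means Java 8.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def check_dependencies():
    """Report which external tools are visible on PATH (hypothetical helper)."""
    found = {
        "java": shutil.which("java") is not None,
        # Assumption: Spark's bin directory has been added to PATH.
        "spark-submit": shutil.which("spark-submit") is not None,
    }
    java_major = None
    if found["java"]:
        # `java -version` normally writes its banner to stderr.
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
        java_major = parse_java_major(result.stderr or result.stdout)
    return found, java_major


if __name__ == "__main__":
    found, java_major = check_dependencies()
    for tool, present in found.items():
        print(f"{tool}: {'found' if present else 'not found on PATH'}")
    print(f"Java major version: {java_major}")
```

If the reported Java major version is not 8 and Spark fails to start, the downgrade described above is the first thing to try.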

Stable release

To install Twitter ML, run this command in your terminal:

$ pip install twitter_ml

This is the preferred method to install Twitter ML, as it will always install the most recent stable release.
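
To confirm that the install succeeded, you can ask Python which version of the distribution it sees. This is a sketch under assumptions: installed_version is a hypothetical helper, it needs Python 3.8 or later for importlib.metadata, and it assumes the distribution is registered under the same name used in the pip command above (twitter_ml).

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the distribution name matches `pip install twitter_ml`.
    version = installed_version("twitter_ml")
    if version is None:
        print("twitter_ml is not installed in this environment")
    else:
        print(f"twitter_ml {version} is installed")
```

Run it with the same Python interpreter (or virtual environment) that pip installed into; otherwise the package may be reported as missing.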

If you don’t have pip installed, this Python installation guide can walk you through the process.

From sources

The sources for Twitter ML can be downloaded from the GitHub repo.

You can either clone the public repository:

$ git clone https://github.com/paulknewton/twitter_ml

Or download the tarball:

$ curl -OJL https://github.com/paulknewton/twitter_ml/tarball/master

Once you have a copy of the source, you can install it with:

$ python setup.py install