Creating a realtime Tweet visualization in 3-D

Monday, 6 October 2014

The volume of Tweets available through Twitter’s streaming APIs is impressive. Although the public streaming APIs return only a subset of all Tweets, the amount of real-time data they deliver can still be overwhelming to process and visualize.

Thankfully, WebSockets is a great solution for sending large volumes of Tweets; when paired with WebGL, it allows for creative ways to visualize Twitter data.

If you’ve been to one of our recent meetups or hackathons in San Francisco, you may have seen the real-time 3-D visualization above. This in-browser application connects to a Twitter streaming API and plots geocoded Tweets on a 3-D globe in real time. While this graphic seems daunting at first, we can break down its development into three basic steps:

  1. Streaming Tweets with geo data
  2. Publishing Tweets to the browser with WebSockets
  3. Plotting the Tweets in 3-D

The full source code for this project is available on GitHub, and we suggest cloning the repo and following along with this tutorial on your own. If you are looking for a primer on using WebGL, you may want to check out this getting started guide before you proceed.

1. Streaming Tweets with geo data

For this step, the Twitter API endpoint we connect to is statuses/filter. This endpoint will provide a continuous stream of Tweets in real time. Connections to the stream are rate-limited, but since you only need to connect once, this will not be a constraining factor for this use case. To connect to the Twitter API, we will use Node.js and the Twit package in tweet-publisher/index.js.

// Connect to stream and filter by a geofence that is the size of the Earth
var stream = twitter.stream('statuses/filter', { locations: '-180,-90,180,90' });

stream.on('tweet', function (tweet) {

	// calculate sentiment with "sentiment" module
	tweet.sentiment = sentiment(tweet.text);

	// save the Tweet so that the very latest Tweet is available and can be published
	cachedTweet = tweet;
});

The statuses/filter stream can be further filtered by users, keywords and location. By specifying the locations parameter, we will only receive those Tweets that contain geo data. The locations parameter accepts longitude/latitude pairs that define a bounding box (geo fence): the southwest corner first, then the northeast corner. In this example, the geo fence includes the entire planet.
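As a rough illustration of how such a geo fence value is written, the sketch below builds the comma-separated string from named corners. The helper name toLocationsParam and the San Francisco coordinates are assumptions made for this example and are not part of the project.

```javascript
// Hypothetical helper: formats a bounding box as "west,south,east,north",
// i.e. the southwest corner followed by the northeast corner, longitude first.
function toLocationsParam(box) {
  if (box.west > box.east || box.south > box.north) {
    throw new RangeError('southwest corner must come before northeast corner');
  }
  return [box.west, box.south, box.east, box.north].join(',');
}

// Whole planet, the same value passed to the stream above
var world = toLocationsParam({ west: -180, south: -90, east: 180, north: 90 });

// Approximate box around San Francisco (illustrative coordinates)
var sanFrancisco = toLocationsParam({ west: -122.75, south: 36.8, east: -121.75, north: 37.8 });
```

Narrowing the box this way is one option for reducing the volume of Tweets before they ever reach the publisher.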

When the Tweet object is received through the stream, we perform some AFINN-based sentiment analysis using a Node.js package called sentiment. AFINN rates individual words on a scale from -5 to 5, and the result of the analysis is assigned to a sentiment property of the Tweet object. We also save the Tweet object to the variable cachedTweet, so the latest Tweet is always available for reference.
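To make the idea of word-list scoring concrete, here is a minimal sketch of AFINN-style analysis. It is not the sentiment package's implementation: the tiny lexicon, the scoreText name, and the shape of the returned object are all assumptions chosen for illustration.

```javascript
// Tiny hypothetical word list in the AFINN style (integer valence per word).
// The real AFINN list is far larger; these entries are for illustration only.
var lexicon = { love: 3, great: 3, good: 3, bad: -3, hate: -3, terrible: -3 };

// Hypothetical scorer: sums the valence of every known word in the text.
function scoreText(text) {
  // Lowercase, then split on anything that is not a letter
  var tokens = text.toLowerCase().split(/[^a-z]+/).filter(Boolean);

  var score = tokens.reduce(function (sum, token) {
    var known = Object.prototype.hasOwnProperty.call(lexicon, token);
    return sum + (known ? lexicon[token] : 0);
  }, 0);

  return {
    score: score,                                               // total valence
    comparative: tokens.length > 0 ? score / tokens.length : 0  // average per token
  };
}

var result = scoreText('I love this great city');
```

Here the five tokens contribute 3 for "love" and 3 for "great", so the total is 6 and the per-token average is 1.2.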

2. Publishing Tweets to the browser with WebSockets

Now that we have Tweets streaming in, we need to provide this data to a front-end JavaScript application running in the browser. One option would be a polling approach, where the front-end application repeatedly requests the latest Tweet via POST or GET every few milliseconds. This would work, but in practice it puts undue stress on the web server, stress that can easily be offloaded to a third party.

PubNub offers a WebSocket service that is easy to implement and can take on this load. This sample application uses the PubNub Node.js module. If you do not want to use a third-party service, Socket.io is a great alternative.

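function publishTweet (tweet) {

	// Skip Tweets that have already been published.
	// id_str is compared because numeric Tweet IDs can exceed the range
	// that JavaScript numbers represent exactly.
	if (tweet.id_str === lastPublishedTweetId) {
		return;
	}

	lastPublishedTweetId = tweet.id_str;

	pubnub.publish({
		post: false,
		channel: 'tweet_stream',
		message: tweet,
		callback: function (details) {
			// success
		}
	});
}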

The function above verifies that the Tweet has not already been sent by comparing IDs, and then publishes the Tweet as a message through PubNub. We could call this function every time we receive a Tweet, but the visualization would have a hard time keeping up with the volume of activity provided by the stream. To limit our flow of Tweets, we will set up an interval that calls this function every 100 milliseconds.

// This will provide a predictable and consistent flow of real-time Tweets
publishInterval = setInterval(function () {
	if (cachedTweet) {
		publishTweet(cachedTweet);
	}
}, 100); // Adjust the interval to increase or decrease the rate at which Tweets are sent to the clients

Every 100 milliseconds, the application will now publish the latest Tweet saved in the in-memory variable, which caps the flow at 10 Tweets per second. You can adjust this interval to increase or decrease the flow of Tweets as needed. Tweets that stream in between intervals are simply ignored. This allows us to predict the maximum number of Tweets visible and optimize the visualization.
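The sampling behavior described here can be sketched independently of Twitter and PubNub. In the example below, createSampler, push and tick are hypothetical names, and the tick is called by hand rather than from a timer so the outcome is easy to follow.

```javascript
// Hypothetical sketch of "latest value wins" sampling.
function createSampler(publish) {
  var latest = null;
  var lastPublishedId = null;

  return {
    // Called for every incoming item; overwrites whatever was cached before
    push: function (item) {
      latest = item;
    },
    // Called once per interval; publishes the cached item unless it was already sent
    tick: function () {
      if (latest === null || latest.id === lastPublishedId) {
        return false;
      }
      lastPublishedId = latest.id;
      publish(latest);
      return true;
    }
  };
}

var published = [];
var sampler = createSampler(function (item) { published.push(item.id); });

sampler.push({ id: 'a' });
sampler.push({ id: 'b' }); // replaces 'a' before the next tick
sampler.tick();            // publishes 'b'
sampler.tick();            // nothing new, so nothing is published
sampler.push({ id: 'c' });
sampler.tick();            // publishes 'c'
```

In a running application, the tick would be driven by a timer, for example setInterval(sampler.tick, 100). Item 'a' is never published, which is exactly the trade-off: items arriving between ticks are dropped in exchange for a predictable output rate.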

3. Plotting the Tweets in 3-D

Before we get to plotting the Tweets, we need to connect to the PubNub WebSocket channel on the client side. The 2-D HTML part of the visualization is an AngularJS app, so we will use PubNub’s AngularJS SDK in public/javascripts/TweetHud.js.

PubNub.init({ subscribe_key: pubnubConfig.subscribe_key });
PubNub.ngSubscribe({ channel: pubnubConfig.channel });

$rootScope.$on(PubNub.ngMsgEv(pubnubConfig.channel), function(event, payload) {
  
  // Add tweet to this hud
  addTweet(payload.message);

  // Add tweet to 3D globe
  TwtrGlobe.onTweet(payload.message);
});

The pubnubConfig object is defined in JavaScript rendered by the Node.js Express application in views/index.ejs. This keeps the config settings defined in one place on the server. PubNub events are broadcast through the AngularJS $rootScope. When a Tweet object is received, we update the 2-D view in addTweet and pass the Tweet object to onTweet in the TwtrGlobe namespace, which encapsulates the 3-D globe behavior.

TwtrGlobe.onTweet = function (tweet) {

	// extract a latlong from the Tweet object
	var latlong = {
		lat: tweet.coordinates.coordinates[1],
		lon: tweet.coordinates.coordinates[0]
	};
	
	var position = latLonToVector3(latlong.lat, latlong.lon);

	addBeacon(position, tweet);
};

function addBeacon (position, tweet) {
	
	var beacon = new TweetBeacon(tweet);
	beacon.position.x = position.x;
	beacon.position.y = position.y;
	beacon.position.z = position.z;
	beacon.lookAt(earthMesh.position);
	beaconHolder.add(beacon);

	// remove beacon from scene when it expires itself
	beacon.onHide(function () {
		beaconHolder.remove(beacon);
	});
}

To interface with WebGL, we use a library called Three.js. The onTweet function extracts the latitude and longitude from the Tweet object and converts those coordinates to a Three.js Vector3 instance, which contains X, Y, Z positions for placement in 3-D space. Tweet coordinates are stored longitude first, which is why the latitude is read from index 1 and the longitude from index 0.
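The conversion itself is not shown in this post, so the following is only a sketch of one common way to map latitude and longitude onto a sphere. It returns a plain object rather than a THREE.Vector3, and the function name latLonToPoint, the radius argument, and the axis convention are assumptions; the project's latLonToVector3 may orient the globe differently.

```javascript
// Hypothetical stand-in for latLonToVector3.
// Assumed convention: the y axis runs through the poles, latitude 0 / longitude 0
// sits on the positive z axis, and the sphere is centered at the origin.
function latLonToPoint(lat, lon, radius) {
  var phi = lat * Math.PI / 180;    // latitude in radians
  var lambda = lon * Math.PI / 180; // longitude in radians

  return {
    x: radius * Math.cos(phi) * Math.sin(lambda),
    y: radius * Math.sin(phi),
    z: radius * Math.cos(phi) * Math.cos(lambda)
  };
}

var northPole = latLonToPoint(90, 0, 1);         // lands on the +y axis
var nullIsland = latLonToPoint(0, 0, 1);         // lands on the +z axis
var sanFrancisco = latLonToPoint(37.77, -122.42, 1);
```

Whatever convention is used, every returned point lies at the given radius from the center, so a marker placed there sits on the surface of the globe.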

The addBeacon function adds a 3-D marker to the surface of the planet at the provided position. The marker is defined in the TweetBeacon class, which extends Object3D, the Three.js base class for 3-D objects in the scene. Within TweetBeacon.js, the lines representing Tweets are constructed and animated according to the sentiment score. After a beacon finishes its animation and expires, a callback is fired so that the beacon can be removed from the scene. This is important to free memory and keep the visualization rendering at close to 60 fps.
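The expire-then-remove pattern can be sketched without any rendering code. The example below is not the TweetBeacon implementation: ExpiringMarker, addMarker, the fixed lifetimes, and the simulated frame loop are all assumptions used to show how a hide callback keeps the list of live markers bounded.

```javascript
// Hypothetical, renderer-free marker that expires after a fixed lifetime.
function ExpiringMarker(lifetimeMs) {
  this.remainingMs = lifetimeMs;
  this.hideCallbacks = [];
}

ExpiringMarker.prototype.onHide = function (callback) {
  this.hideCallbacks.push(callback);
};

// Advance the marker by the time elapsed since the last frame
ExpiringMarker.prototype.update = function (elapsedMs) {
  if (this.remainingMs <= 0) {
    return; // already expired; callbacks fire only once
  }
  this.remainingMs -= elapsedMs;
  if (this.remainingMs <= 0) {
    var self = this;
    this.hideCallbacks.forEach(function (callback) { callback(self); });
  }
};

var markers = [];

function addMarker(lifetimeMs) {
  var marker = new ExpiringMarker(lifetimeMs);
  markers.push(marker);
  // Remove the marker from the live list as soon as it expires
  marker.onHide(function () {
    markers.splice(markers.indexOf(marker), 1);
  });
  return marker;
}

addMarker(1000);
addMarker(3000);

// Simulate two seconds of frames at roughly 60 fps (16 ms each)
for (var elapsed = 0; elapsed < 2000; elapsed += 16) {
  // Iterate over a copy, because expiring markers remove themselves
  markers.slice().forEach(function (marker) { marker.update(16); });
}
```

After the simulated two seconds, only the marker with the 3000 ms lifetime is still in the list. Combined with a fixed publish interval, this kind of cleanup puts an upper bound on how many markers exist at once.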

These are the basics to get you started. The Twitter streaming APIs offer a great source of data for visualizations, and existing technologies like WebSockets and WebGL allow for endless creative possibilities. If you would like to dig deeper into the code, remember to clone the repo on GitHub and play with the visualization animations as you see fit. For a primer on working with WebGL and Three.js, don’t forget to check out this getting started guide. Finally, if you are interested in working with Twitter data in AngularJS, please check out our previous post on Rendering Tweets with AngularJS and Node.js.