MAR 02

Experiments with Twitter

I’ve spent the last week experimenting with the Twitter API, creating various prototype feeds. These have been created with PHP 5, using a modified version of David Billingham’s PHP Twitter Library, updated/hacked in parts to take account of changes to the Twitter API (my hacked version is here).

Before describing each prototype, I’d first like to congratulate Twitter on creating such an easy-to-use, comprehensive REST API, which anyone can start playing with very easily. As it is REST-based, you can even start by typing most requests (which are just URLs) into a standard browser address bar, entering your Twitter account details as credentials, and using View Source on what you get back.

Even though it is rate-limited (100 requests per hour), Twitter are extremely generous with what falls outside the limit, such as unlimited requests to the Search API, and any POST methods such as updating status, following, or unfollowing.

If you have any questions about any of these prototypes, leave a comment below.

Free City

Want to know when companies are giving out free stuff near you (e.g. free chocolates or tickets)? Then this prototype is for you!

The prototype regularly (every minute) searches the Twitter Search API for mentions of the word ‘free’, together with a city name, and either ‘handing’, ‘giving’ or ‘tickets’. If it finds anything new since the last matching tweet, it re-tweets the results to a specific Twitter account.

Technically, this is fairly straightforward. The trickiest part is being ‘nice’ to the Twitter API by storing the most recent tweet ID on each update, so that the next search doesn’t have to re-search the entire database. This is complicated by the Search API, which doesn’t seem to return the ID in the format that it accepts it back again! So you have to hack the ID by removing the ‘tag:search.twitter.com,2005:’ from the start of it before sending it back with the next request.
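The ID clean-up described here can be sketched as a small function. This is only one possible body for a helper named translate_tweet_id(); the prefix is the one quoted above, and the choice to return an empty string for anything that isn't purely numeric is my own assumption.

```php
<?php
// Hypothetical body for translate_tweet_id(): a sketch, not the original helper.
function translate_tweet_id($atom_id)
{
    $prefix  = 'tag:search.twitter.com,2005:';
    $atom_id = trim((string) $atom_id);

    // Strip the Atom tag prefix if present, leaving only the numeric part.
    if (strpos($atom_id, $prefix) === 0) {
        $atom_id = substr($atom_id, strlen($prefix));
    }

    // Keep the ID as a string of digits; reject anything else (assumed behaviour).
    return ctype_digit($atom_id) ? $atom_id : '';
}

echo translate_tweet_id('tag:search.twitter.com,2005:1286981691'), "\n"; // 1286981691
```

Returning a string rather than casting to an integer avoids any trouble if IDs grow beyond what a 32-bit integer can hold.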

Another slightly tricky aspect is ensuring that your search query includes a ‘-RT’ in it (i.e. doesn’t pick-up any re-tweets), otherwise you could enter an infinite loop of re-tweeting your re-tweets.
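As a rough illustration, the search phrase for one city could be assembled like this. The function name is hypothetical, and I'm assuming the Search API's OR and minus operators behave as described in its documentation. Whether the query needs URL-encoding before it is handed over depends on the library you use.

```php
<?php
// Hypothetical helper, not part of the original script.
function build_free_city_query($city)
{
    $triggers = array('handing', 'giving', 'tickets');

    // 'free' and the city must both appear, plus any one of the trigger words.
    $query = 'free ' . $city . ' ' . implode(' OR ', $triggers);

    // Exclude re-tweets so the account never re-tweets its own output.
    return $query . ' -RT';
}

// Same 'name=value' array shape as the $api_terms used in this post.
$api_terms = array(
    'q=' . urlencode(build_free_city_query('london')),
    'rpp=100',
);

echo implode('&', $api_terms), "\n";
```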

I’ve included some of the pseudo-code below. Note that this has many limitations (it updates the since_id before it re-tweets; it is limited to re-tweeting 100 updates at one time, etc.), but you get the idea for a prototype.

    $num_tweets = 0;
    $api_terms  = array('q=' . $query,
                        'since_id=' . $since_id,
                        'rpp=100');
   
    $oTs = new twittersearch();
    $oTs->username = $twitter_username;
    $oTs->password = $twitter_password;
   
    $oXml = $oTs->search($api_terms);
 
    if ($oXml->entry)
    {
        $num_tweets = count($oXml->entry);
    }
   
    if ($num_tweets)
    {      
        // Work out the new 'since_id' (I know we should do this after re-tweeting, but it will do for a prototype)
        $update_with_since_id = translate_tweet_id($oXml->entry[0]->id);

        // Update database with since_id (escaped, as the value comes from an external feed)
        $sql = "UPDATE twitterQuery SET since_id = '" . mysql_real_escape_string($update_with_since_id, $db_link) . "'";
        mysql_query($sql, $db_link);

        // Re-tweet tweets
        for ($i = 0; $i < $num_tweets; $i++)
        {
            $oTweet = $oXml->entry[$i];
           
            $content   = strip_tags($oTweet->content);  // API auto-highlights search terms
            $authorUri = str_replace('http://twitter.com/', '@', $oTweet->author->uri);
       
            // POST IT HERE
            $status = "RT {$authorUri} $content";
            $oTs->update($status);
        }
    }


This is then called once per minute using the Linux crontab command, with the following style entry:

* * * * * /usr/local/bin/php -f /www/freecity/get_latest_tweets.php > /dev/null

Note that this will not use any of the 100-per-hour rate-limiter requests, as all API methods used (search, update status) fall outside the limiter.
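One of the limitations mentioned earlier is that since_id is saved before anything is re-tweeted. A possible way around that is sketched below: post the oldest tweet first and only record an ID once its re-tweet has gone out. Everything here is an assumption for illustration. The function name is made up, the two callables stand in for the status update and the database write, and I'm assuming the update call returns something truthy on success. The dry run uses closures, so it needs PHP 5.3 or later.

```php
<?php
// Hypothetical helper. $entries is newest-first, as the search results are
// assumed to be; $post and $save are callables supplied by the caller.
function retweet_oldest_first(array $entries, $post, $save)
{
    $sent = 0;

    // Walk the results backwards so the oldest tweet is posted first.
    foreach (array_reverse($entries) as $entry) {
        if (!call_user_func($post, $entry['status'])) {
            break; // stop here; the next run resumes from the saved since_id
        }

        // Only record the ID once the tweet has actually gone out.
        call_user_func($save, $entry['id']);
        $sent++;
    }

    return $sent;
}

// Dry run with in-memory stand-ins instead of Twitter and MySQL.
$posted   = array();
$since_id = '100';

$entries = array(
    array('id' => '103', 'status' => 'RT @c third'),
    array('id' => '102', 'status' => 'RT @b second'),
    array('id' => '101', 'status' => 'RT @a first'),
);

$sent = retweet_oldest_first(
    $entries,
    function ($status) use (&$posted) { $posted[] = $status; return true; },
    function ($id) use (&$since_id) { $since_id = $id; }
);

echo "$sent sent, since_id is now $since_id\n"; // 3 sent, since_id is now 103
```

If a post fails halfway through, the saved ID points at the last tweet that was successfully re-tweeted, so nothing is skipped on the next run.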

So far we’ve set this up for @freelondon, @freecardiff, @freenewyork and @freesanfran, and @jtrant has kindly helped set up @torontoFree.

Rebound Finder

Once I’d created a basic prototype that could search for any terms and re-tweet the results, other prototypes were easy. In this case, I set up a search for ‘dumped’ or ‘broken up’ with either ‘my boyfriend’ or ‘my girlfriend’. The results are then re-tweeted to @reboundfinder.

This is obviously not a serious use for the API, but an example of how a quick prototype can be easily re-used for other purposes.

Twitshrink

The prototype Twitshrink interface

Not really a use for the API, but I wanted to create a cross between TwitterKeys and TweetShrink, by automatically converting appropriate words in tweets to Unicode characters, to help squeeze more information into the limited tweet space.

I actually got as far as creating a working prototype (see screenshot) before sadly awakening to the fact that many of the popular Twitter clients (TweetDeck, Twhirl) do not support Unicode rendering by default.

If a significant number of people can’t read your tweets, there’s no point, so this idea was quickly scrapped.

This is one of the drawbacks of a popular API – you can’t guarantee the baseline support for any types of technology (such as Unicode), as these are dictated by whichever third parties produce the most successful layers on your API.

Solo London

This is a slightly different take on the ‘search and re-tweet’ aspect of Free City. Rather than searching for arbitrary phrases, this prototype searches for tweets that are tagged (#sololondon) or sent to a particular username (@sololondon). These are then re-tweeted from the account (first stripping the tag or @username from the tweet, so that the re-tweet doesn’t match the search again and cause a loop).
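The stripping step might look something like the sketch below. The function name and the regular expressions are my own illustration rather than the code running behind the account.

```php
<?php
// Hypothetical helper: removes the trigger hashtag and @username from a tweet.
function strip_triggers($text, $tag, $username)
{
    $patterns = array(
        '/#' . preg_quote($tag, '/') . '\b/i',
        '/@' . preg_quote($username, '/') . '\b/i',
    );

    $text = preg_replace($patterns, '', $text);

    // Collapse the gaps left behind by the removed markers.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo strip_triggers(
    '@sololondon anyone else going to the gig tonight? #SoloLondon',
    'sololondon',
    'sololondon'
), "\n";
```

The match is case-insensitive, so #SoloLondon and #sololondon are both removed.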

This is meant to demonstrate a more socially beneficial use of the Twitter API, by providing people with an easy means for meeting others (at a gig, etc) if they are by themselves. Follow @sololondon if you’re willing to help out people who may be by themselves, or if you may need someone to hang out with in the future.

Twitexperiment

As part of another blog post I’m writing, I wanted to quickly demonstrate that – as most Twitter users know – follower numbers are largely meaningless.

So I set up @twitexperiment to prove this, and also have an experimental account with which to ‘try things out’.

To start, I used some basic Linux commands (wget, grep) to extract the names of the top 1000 Twitter users who follow the most people. I then created a simple script which followed each of them, one per minute, to see how many auto-follows I could attract back.

It turns out that about 60% followed back, and (surprisingly to me), about 20% auto Direct Messaged on follow. I’m not someone who dislikes auto-following, but auto DM’ing feels a little too much, especially when the messages are not particularly worthwhile (“I am sincerely glad to have you as a follower”: how can this be sincere when the person has no idea who I am or why I am following them?).

I created some further simple scripts to quickly calculate total number of tweets for particular search phrases (as the current Twitter Search website doesn’t show result numbers), so that I could use the @twitexperiment account to at least provide some interesting tidbits of information.

The account has started to pick up new followers, so with Twollow out of action at the moment, I created a simple script to auto-follow new followers. Again, this is fairly limited, but seems to work well at relatively low follower numbers (I would probably switch to using the notification emails once the follower numbers increase over a few thousand). A code sample for auto-following is below:

    $oTs = new twitter();
    $oTs->username = $twitter_username;
    $oTs->password = $twitter_password;
   
    $oFollowers = $oTs->follower_ids();
    $oFriends   = $oTs->friend_ids();
   
    // Bit of a hack to put these into arrays...
    $num_friends = count($oFriends->id);
    $num_followers = count($oFollowers->id);
   
    $aFriend   = array();
    $aFollower = array();
   
    for ($i = 0; $i < $num_friends; $i++)
    {
        $aFriend[] = strval($oFriends->id[$i]);
    }
   
    for ($i = 0; $i < $num_followers; $i++)
    {
        $aFollower[] = strval($oFollowers->id[$i]);
    }
   
    $aNotFollowed = array_diff($aFollower, $aFriend);

    foreach ($aNotFollowed as $follow_id)
    {
        $oTs->followUser($follow_id);
    }


Interestingly, you can just swap the $aFollower and $aFriend in the array_diff() function, and change followUser() to leaveUser() if you instead want to use the script to unfollow any users who are not reciprocally following you.
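To make the swap concrete, here are both directions of array_diff() side by side, using made-up IDs in place of the real follower and friend lists.

```php
<?php
// Made-up IDs for illustration only.
$aFollower = array('11', '22', '33'); // people who follow the account
$aFriend   = array('22', '33', '44'); // people the account follows

// Followers we do not follow yet: candidates for followUser().
$aToFollow = array_values(array_diff($aFollower, $aFriend));

// People we follow who do not follow back: candidates for leaveUser().
$aToLeave = array_values(array_diff($aFriend, $aFollower));

echo 'follow: ', implode(',', $aToFollow), "\n"; // follow: 11
echo 'leave: ', implode(',', $aToLeave), "\n";   // leave: 44
```

array_diff() keeps the original keys, so array_values() is used here just to re-index the results from zero.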

Again, you may want to set up a crontab entry so that this script is called automatically (e.g. once per hour), auto-following users as they follow you.

Note that this script uses two of the rate-limited requests: one to get the follower IDs, and one to get the friend IDs. Following/unfollowing does not use any rate-limited requests.

Work in progress

I have two more projects ‘in progress’ (a Twitter-powered re-launch of http://www.fakeanimalfacts.com, and a #FollowFriday statistics table, as suggested by @jowyang), which I’ll write about in a follow-up blog item.

UPDATE (6 March 2009): The followfriday leaderboard is now live!

And Finally...

You can follow me on twitter at @zambonini, and Box UK at @boxuk.

 

Comments

12 comments

  1. Marty Thornley said... 5th Mar 2009, 01:04

    Dan, Great article. Thanks for sharing your experiments. I will be coming back soon to go over them all and when I develop a couple i am working on myself, I'll come back and add a link. Nice work!

  2. Micah Silverman said... 6th Mar 2009, 16:27

    Thanks for the article. I am doing a PHP based search program and I am running into issues with since_id. The Search API documentation is very sparse on its use. What format should it be in? When I do a search without a since_id, I can see id's in the format: tag:search.twitter.com,2005:1286981691 If I include a URL param with either the end number after the : or the whole string, I still get the entire search result back. How should the since_id be formatted?

  3. Dan Zambonini said... 7th Mar 2009, 17:35

    @Micah - you should use the ID after the ':', as you say (i.e. just the long number). If you're having problems, feel free to send me the code (dan [at] boxuk [dot] com) and I'll see if I can spot the problem.

  4. ScalpingMan said... 8th Mar 2009, 12:32

    Hello Dan, thanks for your interesting experiment with the twitter api! I have a problem using your "hacked version" of the php twitter libary: If I try to use the method isFriend it only returns a object like this: stdClass Object ( [scalar] => ) No matter if a friendship between two users exists or not. Do you got any idea what I'm doing wrong? Yesterday I've tried several hours to get this method working but didn't get it... The methods showUser, follow etc. are working fine. I'm very helpless with this problem... Thx, ScM

  5. Dan Zambonini said... 9th Mar 2009, 10:45

    @ScalpingMan Thanks for that - there was a bug in that method (I think it was in the original file I downloaded; I hadn't noticed as I hadn't used that method before). If you re-download the file (above), hopefully it should now work!

  6. Scalping Man said... 9th Mar 2009, 13:47

    Thank you very much for the fast update!

  7. watch halloween 2 2009 online said... 29th Aug 2009, 04:24

    I never really took to Twitter - either as a blogging platform, social site or for traffic (to my site). I think it is geared towards celebrities and their egos - ie. I have more followers and which celeb' is most popular.

  8. marine boot camp said... 7th Sep 2009, 18:44

    Twitter is so overrated IMO - although their API is quite nice... some nice ideas here... :)

  9. Plasterboard said... 8th Sep 2009, 06:53

    I think the trickiest part is being ‘nice’ to the Twitter API by storing the most recent tweet ID on each update, so that the next search doesn’t have to re-search the entire database. This is complicated by the Search API, which doesn’t seem to return the ID in the format that it accepts it back again!

  10. cufflinks said... 8th Sep 2009, 07:19

    Here is Another slightly tricky aspect is ensuring that your search query includes a ‘-RT’ in it (i.e. doesn’t pick-up any re-tweets), otherwise you could enter an infinite loop of re-tweeting your re-tweets.

  11. Research Paper Help said... 19th Sep 2009, 03:43

    This is complicated by the Search API, which doesnot seem to return the ID in the format that it accepts it back again!

  12. Joseph said... 4th Dec 2009, 19:05

    Nice. I have used pieces of your code to create a new one. Thanks for sharing this.


About The Author

Dan Zambonini

Dan Zambonini is the Technical Director of Box UK. He is the original architect of the Amaxus Content Management System, conceived clickdensity, has participated in industry-shaping think tanks, and has had articles featured in international websites and magazines. He is passionate about making use of the latest technologies in everyday life, and believes people and communities are key to innovation. For more, you can visit him on his personal website at danzambonini.com.

 
