17
Mar 09

Twitter / HTTP / REST API Invocation Infrastructure using data pipelines

This blog post is not about Twitter API programming as such, though that is the example it works with. It focuses on the intermediate-level infrastructure (i.e. higher than the HTTP/REST APIs exposed by other sites, but lower than the class libraries that wrap them) needed to work with the HTTP/REST-based APIs offered by various web sites.

I have been collecting and analysing Twitter data, which has required accessing Twitter continuously for a number of days at a stretch. I recently started refactoring my code, and this blog post details the results of that exercise. More specifically, in the context of invoking HTTP APIs, it deals with the following aspects (many of which were introduced specifically because of Twitter):

  • Ability to make HTTP calls and deal with HTTP error conditions
  • Ability to deal with connection and other failures, with automatic retries
  • Ability to make HTTP calls in bulk, in batches
  • Ability to spread HTTP calls across multiple threads
  • Ability to respect and deal with API rate limits

I have specifically focused on creating a pipelined design, which may be of interest to many in this situation. I have relied primarily on Python generators for this, though similar functionality could be built using iterators in other languages as well.
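
To give a rough feel for what a generator-based pipeline can look like, here is a minimal sketch. It is not the code from the full post: the stage names (`batched`, `rate_limited`, `with_retries`) and the `make_flaky_fetch` helper are hypothetical, and the HTTP call is simulated so that the example runs without a network connection. Each stage is a generator that consumes the previous one, so batching, rate limiting and retries can be stacked or reordered independently.

```python
import time
from itertools import islice


def batched(items, size):
    """Group an iterable into lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def rate_limited(items, min_interval, sleep=time.sleep, clock=time.monotonic):
    """Pass items through no faster than one per `min_interval` seconds."""
    last = None
    for item in items:
        if last is not None:
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item


def with_retries(items, call, max_attempts=3, backoff=1.0, sleep=time.sleep):
    """Apply `call` to each item, retrying on connection-style failures."""
    for item in items:
        for attempt in range(1, max_attempts + 1):
            try:
                result = call(item)
            except (ConnectionError, TimeoutError):
                if attempt == max_attempts:
                    raise  # give up after the last attempt
                sleep(backoff * attempt)  # linear backoff between attempts
            else:
                yield result
                break


def make_flaky_fetch():
    """Hypothetical stand-in for an HTTP call: each batch fails once, then succeeds."""
    seen = set()

    def fetch(batch):
        key = tuple(batch)
        if key not in seen:
            seen.add(key)
            raise ConnectionError("simulated dropped connection")
        return {"ids": batch, "status": 200}

    return fetch


# Wire the stages together: ids -> batches -> throttled batches -> responses.
user_ids = range(1, 8)
pipeline = with_retries(
    rate_limited(batched(user_ids, 3), min_interval=0.01),
    make_flaky_fetch(),
    backoff=0.0,
)
results = list(pipeline)
```

Nothing runs until the final `list(pipeline)` pulls items through, which is what makes the stages cheap to compose. A threaded stage could probably be slotted in the same way, for example by mapping the fetch function over batches with `concurrent.futures.ThreadPoolExecutor`, though that is left out of this sketch.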