Tag: design

Design Characteristics of REST / Resource Oriented Server Frameworks and Clients

Posted by – June 10, 2009

This post is the third part of continuing series of articles on REST. The first one was Why REST ? and the next one was REST is the DBMS of the internet with hopefully some more to follow in the coming weeks.

Struts, Django, Ruby on Rails. We’ve worked with these and many other similar frameworks. Some time back I started thinking of what would a completely new ground up REST / Resource oriented framework would look like (ground up to ensure it had no legacy design to deal with). Would such frameworks be similar to the ones dominantly used today ? What about the ecosystem that surrounds and interacts with them (client libraries) ? And finally what about the implications on the fine grained object model (assuming there is one) and its relationship with the resource model ? This post deals with some of the thoughts.

There are some specifics the post does not address and is agnostic about :

  • Language : I shall be avoiding language issues as much as possible. Wherever I do bring in code constructs these may be assumed to be in Java (or pseudo-Java)
  • Convention or Configuration : I think both are valid choices in their appropriate contexts, and I don’t specifically emphasise one over the other in this post

The frameworks mentioned above are not the only ones out there. There are many, and some actually are very REST specific eg. Apache CXF JAX-RS or Restlet. It would certainly be interesting to contrast my thoughts with these, but for reasons of insufficiently detailed knowledge about them, I shall choose to skip it (better to not make any statements than making incorrect ones).

I shall be assuming a HTTP connector with GET, PUT, POST and DELETE as the constant set of operations. These four operations shall be collectively referred to as Resource Operations.

We shall first start with the server side characteristics, and the term ROF shall refer to a Resource Oriented (server side) Framework

A ROF will have a resource oriented interface : Certainly not a profound statement, but it was important to lay that down upfront. So what is a Resource Oriented Interface. Given a particular resource, a Resource Oriented Software will support or consume end points which allow you GET, PUT, POST or DELETE the resource. There is one reason why this particular constraint is relaxed just a little bit. Modern browsers do not support all the four methods easily eg DELETE and make it just slightly hard to use the PUT method. Hence these methods can also be invoked by using a URI segment containing the method name eg. delete.

A ROF will have an abstraction to represent a resource as an end point : Again, that seems to be pretty obvious. But there is a reason why I make it explicitly. In many situations we see controllers acting as end points. To the extent a controller acts as an abstraction for a resource end point which essentially only has the resource operations as public methods, it would fit this requirement. However if I was using an Order as a resource and if I introduced an approve method on the OrderController that would not be consistent with this requirement. That would need to be modelled as an OrderApproval resource which may on successful completion, effect a state change on the Order resource to the status ‘approved’.

This is where a potential differences with conventional frameworks arise. If I was to think of it from an EJB like perspective, I would model a OrderController as a Session bean and a Order as an entity bean. In case of lightweight POJO based model, I would have an OrderController as the endpoint exposed by say using Struts and model the Order as a entity POJO and map it to the database using Hibernate. In other non java frameworks, I would have a class to represent an OrderController and another one to represent the order along ActiveRecord pattern. But I would argue this separation is not entirely necessary, since what we want is something that implements a single abstraction mapping onto a Resource which also support the primarily lifecycle methods or resource operations of GET, PUT, POST and DELETE. But there is an issue to be worked through here. These resource operations are actually class level and not object level methods. Thus if we have an abstraction to represent the resource instance, the class level methods cannot be defined in the same class except as class level (static) methods. This is a tricky problem, and I would submit the designer may make one of two choices (a) Implement the resource operations as class level methods on the Resource abstraction (ie. they will get or return the resource references as method parameters and not rely on the ‘this’ or ’self’ qualifier for getting access to the resource variables or (b) Implement the resource operations as methods on a separate one-to-one mapped class on the resource abstraction (eg. an OrderHome in case of an EJB like analogy)

Given consistent expectations of the Resource Operations these will actually be auto-magically implemented : Thats a bit of a turnaround from what I was just describing in the earlier paragraph. What I mean to suggest is that the class level methods I just referred to will be implemented within the framework. What the framework will allow are plugins to provide extended functionality at specific points. Thus a “public static Order Order.put(Order order)” method will be implicitly implemented by the framework. But before a put can be processed it needs to be validated. Thus the framework will allow the developer to plug in / override his own implementation for an Order.validate(Order order). There are multiple ways such plug-ins could be implemented. Depending upon the nature of abstraction, it could be an overridden method as I just described, or it could be a standalone method that is registered into the overall workflow (either through convention or configuration). The latter might be especially useful in case one wants to implement the functionality as stand alone methods or in case of functional programming languages. The plugin points provided could be framework specific. eg, One may want to validate a resource for consistency even at it is being read from the database. For the rest of the post I shall refer to these as plugins. In addition, framework will most certainly provide methods for for downstream handling of impact of PUT, POST or DELETE. This is covered in the next point. In case the framework chooses to not deal with persistence, it may choose to allow capabilities for integration with other persistence frameworks.

A ROF will provide capabilities to a developer to override or register methods to handle downstream impact of PUT, POST and DELETE : Before I get into the details of this, I encourage you to take a look at my earlier post REST is the DBMS of the internet in case you have not already done so. To summarise it quickly, I have drawn the analogy that a REST based system is like a DBMS where client applications can perform direct SQL such as SELECT, INSERT, UPDATE, DELETE (GET, PUT, POST, DELETE in case of HTTP/REST) on the Tables (Resources in case of REST), and the business logic is implemented as triggers. Thus the framework will need to allow the developer to define such triggers. Such methods will need to support ability to reject the request (in case of downstream validation failures), and update the resource state (to reflect the appropriate resource state after the completion of the downstream processing). It is also feasible to imagine scenarios where such methods are triggered asynchronously. Much of the logic of the traditional controllers which controlled interactions across multiple objects etc. is likely to now be shifted into these methods. I have no particularly good name for such methods. They could be referred to as triggers, event or message handlers, glue methods, extension points etc. For the rest of this post I shall refer to these methods specifically as ‘handlers’.

Note that the actual invocations to select, insert, update, delete the resource are *NOT* to be programmed by the developer. These are automatically handled by the framework. The developer essentially fills in the necessary logic to the plugin methods (eg. Order.validate) or handlers (eg. Order.onCreate)

A ROF will provide a mechanism to describe or map a resource abstraction to to the actual programming constructs : There are a number of ways this could be achieved. XML, YAML, DSL, Annotation – take your pick. Also the actual class could be defined (as in case of a POJO) and the resource characteristics mapped onto it, or the class may manifest itself at runtime based on metaprogramming around the metadata. Sample possibilities here are Hibernate like Resource-to-Object-to-Relation mapping (using either Annotations or XML) or a a completely metaprogrammed ActiveResource. One important aspect that the framework will need to cover is the situations where a Resource is a composite of many or partial underlying business objects. eg. an Order resource instance could theoretically span one Order instance and many OrderItem instances. Thus a one to one relationship between a resource and underlying business objects (or datastructures) is not assumed. What is assumed is that the framework will allow such relationships to be described or introspected.

A ROF will allow resources to be mapped onto URI or URI segments : This is too obvious an requirement to be explained and is mentioned here only for completeness.

A ROF will allow foreign keys across resources which manifest themselves as URIs to be mapped onto the underlying business object references : Resources refer to each other through URIs. The underlying business objects refer to each other through object references. Given the resource descriptions and URI mappings, the framework should allow for a transparent referencing/dereferencing between such URIs and the object references.

A ROF will allow factory methods for locating or allow injection of other resources / business objects : Within the handler functions, developers will need references to the associated resources or business objects. I say resources or business objects, since the developer may choose to interact with these at a coarse grained (resource) or fine grained (business object) level. The framework should allow the necessary support for such activities.

A ROF may provide additional support for typical aspects of lifecycle (eg. validation) : While I mentioned validate as a possible plugin function. However given the omnipresence of validations, the framework may provide additional support for such activities. Thus the framework may choose to automatically implement such capabilities using the resource descriptions.

A ROF may provide capabilities for domain specific extension of resource capabilities : Certain domains have standardised mechanisms of working with resources. As an example most banking systems based on the four eyes principle require approval activities. While this particular aspect is much tougher than it seems, a ROF may choose to allow extension of such capabilities using template like functions or mix ins. As an example in this situation, once an Order resource is defined, an OrderApproval resource will be automatically made available as will the GET and PUT methods on it (POST and DELETE in this particular case may not be relevant), as will the necessary and appropriate handler functions on OrderApproval.

A ROF will provide capabilities for automatically generating the resource representation from the resource and vice-versa : Resources manifest themselves in multiple possible formats eg. XML, JSON etc. An ROF will allow such transformations between the representation and the resource/business object instance automatically.

A ROF will provide capabilities for assembling more complex representations using templates : In many situations, especially when the representations are being composed for manual (browser based) consumption, additional resources may need to be pulled into a view. A ROF will allow for such assembly of resources to be composed into a final view using templates.

A ROF will allow for introduction of appropriate additional URIs in views using templates : Thanks to HATEOAS (I’ve really avoided it thus far :) ), the framework will need to allow some mechanism of describing what are the additional context specific URIs to be included in the final representation. The template logic should allow the developer to specify such URIs.

A ROF should allow for the resource / media-type descriptions to be shipped in band with the resource representation : Since REST allows media types to be auto discovered and auto described, the framework should allow for the metadata for such media types to be also presented to the client. While I think it is essential that such in band information should be conveyed on demand, the framework may also optionally support upfront interrogation for media types and their details, which will require such information to be shipped out of band as well. I am not aware of any specific standards around such interrogation APIs so the framework could implement a custom API for the same. The actual metadata could be represented using any of the typical appropriate standards such as RDFa, XML Schema snippets etc.

A ROF should optionally allow support for auto generation of bindings for clients : I really really cringe as I write this. I cringe because to me the great attraction of REST is the simplicity and the ease of introducing incremental integration. The client binding generation (especially if it is statically generated) flies in the face of many accepted lightweight design scenarios. However I think there are likely to be some situation where availability of such client side bindings would be helpful. When possible (eg. with dynamically typed, metaprogramming capable languages like Python or Ruby), such bindings should be dynamic. In such cases the client can automatically introspect the server side media types and make available the necessary client side objects on the fly. In cases where statically typed languages such as Java or Scala are used, the client side may choose to expose everything as generic datastructures (e.g trees of name value pairs) or may allow for generation and compilation of client side bindings. I have no specific thoughts around the API support needed on the client side, except that quite obviously this would include support for the resource construction, resource operations etc. and that they would allow the client to interact with the server using the underlying language constructs rather having to work at a raw HTTP level.

In addition to the characteristics described above, I suspect frameworks will have many other optional characteristics such as support for monitoring, auditing / logging, transaction management, object pooling etc. etc. However these are unlikely to be particularly interesting when focusing on the framework aspects especially from a resource oriented perspective, which is indeed the focus of this post.

Update : InfoQ covered this blog post here : Design Characteristics Of Resource Oriented Server Frameworks

So were Jeff / Joel / Uncle Bob discussing happiness and fitness ?

Posted by – February 15, 2009

For those who are still unaware of the Quality and Testing discussion, I would refer you to the first half of Do you wanna be the Picasso of programming? First learn the rules, and only after break them to first come upto speed on the events.

Subsequently Jeff Atwood wrote Real Ultimate Programming Power and I posted a comment on it which compared the issue to that related to fitness (search for fitness to reach my comment). Thats what also made me realise that if one really looked at the entire discussion as one between Happiness and Fitness, it just seemed so much easier to understand and comprehend.

So if one goes back to the first podcast #38 where Joel appears to take on software quality, and if one takes up an analogy where software usability and customer satisfaction are treated as happiness (of the software) and the quality is considered as fitness (again of the software) then what Jeff and Joel seem to be primarily saying is that (words are entirely mine)

In the overall scheme of things happiness is more important than fitness.


That would make a lot of sense that most would readily agree to. However one more statement in there says

Fitness doesn't matter so much



And thats what probably triggered off a whole bunch of reactions. It also seemed to offer an explanation of why there was such a storm raised.

The way I perceive it, Jeff and Joel were making an argument for happiness which probably would’ve gone unnoticed but for the fact that the portrayal of fitness was (if I may say so) a bit incendiary. It wasn’t so wrong as that it simply seemed to be sending out a completely wrong message. And if this was an opinion on some small blog it would still have gone unnoticed. But Jeff and Joel being the influential voices that they are were less than likely to be ignored especially when the message they were sending out was considered “dangerous” by the fitness community. Dangerous in the sense that it could lead to a whole bunch of people treat fitness with even lesser importance (especially in the context where general fitness levels were quite suspect). As I revisited the podcasts and the blog posts, the happiness / fitness analogy seemed to generally hold up.

This paragraph is a little speculative in the sense I don’t know that this is what actually happened and am speculating at my end. So Joe n Jeff got a little flustered about the fact that they couldn’t understand why they had kicked up a storm in the first place since they believed in what they had said about happiness. So they got together with Uncle Bob to sort things out in a subsequent podcast. To me it seemed like a rather uneasy and tepid podcast where they didn’t disagree with each other’s points of view but still continued to be uncomfortable with them. The multiple axes that were being referred to could’ve been happiness and fitness. Jeff still continued to defend their stance in a manner which perhaps only made matters worse. The Ferengi Programmer seemed to suggest that fitness regimens were bureaucratic steps (which cast them in a negative light) which were rather expensive to deal with and hence negotiable. The Real Ultimate Programming Power seemed to suggest that since most people wouldn’t worry about reading up on or attempting for better fitness, probably a much simpler set of rules along with a continuous thought to be more fit was what was really important.

I am convinced when I look at things in this perspective Jeff and Joel’s arguments make sense as do Uncle Bob’s. In my mind many of the differences can also be explained reasonably well by this analogy. It also explains why some of the statements seemed to invite so much ire. The issue probably lay in packaging of the arguments. If the statements are re-presented in a manner which does not seem to reduce the importance or desirability of fitness per se while continuing to emphasise the primary goal of happiness, it could be possible to close the discussion and move beyond the debate back into the real issues related to happiness and fitness, oops customer satisfaction and code quality.

An experienced programmer doesn’t use SOLID as a checklist – he internalises it.

Posted by – February 12, 2009

Reading The Ferengi Programmer by Jeff Atwood really made me quite concerned. Here’s clearly an opinion which to me seems not grounded in sustained experience in applying the principles and is likely poor message going out to junior programmers.

In the post the author treats SOLID principles by Bob Martin as a ruleset that programmers apply from time to time. Once you get yourself into that frame of mind it is difficult to then contest the rest of the post. I would want to re-present the same topic with a different frame of mind.

SOLID principles are principles that you learn in your early days as a designer. These are formative stages when you are honing your skills and attempting to review your designs in terms of specific checklists of items to go through as a mechanism of validating your design. But each time you apply them, you internalise a part of them, and soon in 3 or more years of regular application, their application becomes internalised and ingrained. At this stage you might even well forget that they exist, since you apply them subconsciously, day after day, time after time, and sometimes referring back to them only when debating or reviewing your designs with other designers.

The analogy to the painter is in the post referred to : Are You Following the Instructions on the Paint Can? is also quite instructive. The instructions on the paint can are for one time painters, hobbyists, amateurs etc. No experienced painter is likely to be reading them since he’s probably internalised them. But each seasoned painter would want every junior painter to learn the instructions and the costs of not following them before stepping up to deciding whether and when not to follow them. He is unlikely to teach a new painter in the making – follow the instructions that make sense.

Sure you do make tradeoffs at times in design. Most people tradeoff guidelines in all spheres. However the keyword is to understand that you are breaking a guideline and then do so explicitly knowing its costs fully well.

My big difficulty with the post is that it is an advice which may do more harm to junior programmers than good. It might encourage them to make tradeoffs before they learn the cost and implications of making the tradeoffs. And it might set themselves away from a path that requires careful and judicious application (which requires a lot of effort in the early days) and helps them internalise the principles.

I would not recommend the post I refer to to any junior and upcoming programmer. My advice is as follows. If you have grown to a stage where you are applying these rules implcitly – don’t worry, you have the experience on your side to generally make the right judgement calls and you are likely to anyway apply them under most of the cases. In such a situation, this post and the one it refers to are probably inconsequential to you. If you are at a stage where you still need to review your design with respect to the SOLID principles (or other appropriate design principles) – please take your time to apply the principles, learn if you are breaking them, understand the costs of doing so (I would recommend that involve a senior programmer / designer in the process) and then by all means make the best judgement. Principles distilled over time and experience should be adjusted preferably by those who understand the cost and implications of doing so, the rest should strive to reach that state first.

To clarify, the reason why I upfront made the statement “seems not grounded in sustained experience in applying the principles”, is that those who have internalised them hardly every feel the burden (if at all) of applying them, and are unlikely to ever ever treat it as an explicit checklist, and they seem like checklists to those who haven’t internalised them. Precisely the audience to whom you want to craft a more careful and nuanced message.

Factors influencing Cache Design

Posted by – August 6, 2008

A lot of posts talk about how caching can improve performance and how one can use tools such as memcached for the same. It is a great tool, and I am certain it does wonders, but I do wonder if it and its usage is getting too much press at the cost of other caching options.

This post talks about the various dimensions that should be carefully examined before deciding upon the caching strategy. I would like to argue that the most difficult part of caching is not necessarily learning or selecting the caching tool but the detailed analysis of the application, its architecture and its data flow, that actually precedes the caching tool selection. I also submit that once this analysis is conducted and the various dimensions as described below well understood, the actual cache design then becomes a relatively simpler and lesser risky activity. Before getting into the dimensions of cache design, there is some important groundwork to be done.

Note : I am referring to data caching. I am not referring to optimisation of opcodes / byte codes / instruction sets etc. which is a different class of caching.

Know your application well :

Let me emphasise this. If you want to decide on the most appropriate caching strategy, you need to have a sufficiently good understanding of your application. This is relatively easy if you are enhancing the caching capabilities in an existing application or are rebuilding an existing application. But if you are building a completely new application, some of the data listed below may not be easily available and you may need to project a little bit after understanding the requirements and a high level data / logic flow. One of the things you really need to have either a clear understanding or at least some reasonable projection of, are the various data elements in the application, how frequently will they be accessed and modified and given the processing flow, which elements make sense to be cached. You will also need to have a good understanding of how critical is it to have the most current data (eg will a 1 min. stale data be acceptable) and the transactional semantics around the data processing (are strict ACID semantics a must ?).

Note that for the rest of this post when I mention data, I shall be referring to cachable data.

Know your performance targets :

Unless you are clear about your performance targets, it will be quite difficult to work out the appropriate strategy. I would suggest you should have clear targets defined (for a specified hardware) before proceeding down this path. Some times the target specifies very ambitious hardware which you may not have access to in early stages of the development. Try to convert these targets into appropriate targets for the hardware you shall be working with (with some reasonable margin of safety).

Know your architectural guidelines / preferences :

Will you be using a single server or multiple ? When handling future growth would you prefer to scale up or scale out ? What languages will you be using ? Would you prefer to use a single database instance or would you want to create vertically partitioned shards ? As we shall see, these are extremely important considerations before getting into cache design. Sometimes the decision making is not sequential, so you may end up revisiting these topics once you get into your cache design.

Dimensions

We shall now explore some of the various dimensions that influence cache design.
More…