CRUD is not only good for, but is the only consistent way to build REST over HTTP

Posted by – August 14, 2009

This is to comment on a perception forming that REST encourages exposing basic data elements through CRUD and that it encourages development of dumb applications (applications with shallow business logic).

Apart from some tweets I saw on the topic and some twitter conversations, the blog posts which perhaps set off the thought were

The underlying fear and rationale for these posts make a lot of sense – the fear of creating really dumb, passive and shallow applications. I submit, however, that the problem is not CRUD; it is resource identification and scoping. CRUD is not only good for building intelligent, active and deep applications, it is the right way to build them.

CRUD supports Uniform Interface : The primary reason why CRUD gets used is because it supports a uniform interface. At the end of the day, a consistent Create/Read/Update/Delete or POST/GET/PUT/DELETE interface makes things easy. It makes things easy for the development team because of the consistency it introduces in their applications. It makes things easy for the clients who have a simple and consistent interface to deal with. At the interface level CRUD breeds consistency, and at the risk of broad generalisation, consistency is good.

So why do we at times end up creating shallow applications with REST? CRUD in general works with simple forms built on simple tables. Quite often this style of programming gets elevated into simple forms over simple domain objects. Standardised CRUD helps a lot at the lower end of application development, and most developers of database-driven applications are likely, at some stage early in their careers, to have attempted to build a small CRUD library or framework to help themselves. The reason we end up creating shallow applications is not that we apply CRUD, but that we continue to apply CRUD on tables or simple domain objects. And therein lie the distinctions:

  • REST is not about CRUD on tables – it’s about CRUD on resources
  • CRUD is the interface – not the implementation

I attempt to bring up the difference in the example that I detail below.

Simple Account Transfer Example

Let’s say we want to build the software to transfer amount X from account A into account B. Let’s further specify that a transfer is not effected immediately and requires one more explicit approval. Let’s also specify that while a transfer is waiting to be approved, it can be amended. That’s the simple scenario that we shall deal with.

In order to implement this, we shall define a data structure / table / object for Account which shall contain a field called balance. Further, there shall also be a Transfer table / object which shall contain the fields sourceAccount, destinationAccount, amount and status. The possible status values shall be Initiated and Completed.

In a simple service oriented application we shall perhaps have a transfer service. Ignoring error handling, SOA wrapping etc., the service interface will probably boil down to the following equivalent Java interface.

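import java.math.BigDecimal;

public interface TransferService
{
    // Initiates a transfer and returns the id of the new transfer.
    public Long transfer(Long sourceAccountId,
                         Long destinationAccountId,
                         BigDecimal amount);

    // Retrieves an existing transfer.
    public Transfer get(Long transferId);

    // Amends a transfer that is still waiting to be approved.
    public void amend(Long transferId,
                      Long sourceAccountId,
                      Long destinationAccountId,
                      BigDecimal amount);

    // Approves the transfer and processes it.
    public void approve(Long transferId);
}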

Let’s think about how these might get modeled in a REST environment. The important thing to remember is – don’t think about services or functions or methods – think about which resources you choose to expose using a simple CRUD interface.


# The following creates a new transfer. The returned data shall include
# the URI of the new transfer, and the URI to approve it

POST /transfer

# The following retrieves the status of a current transfer. If it has not
# been approved the returned data shall include the URI to approve it.

GET /transfer/${id}

# The following modifies the transfer. The returned data shall also
# include the URI to approve it

PUT /transfer/${id}

# The following approves and further processes the transfer. It shall
# return the URI for the transfer

POST /transfer/${id}/approve

While most of this seems all right, what sticks out like a sore thumb to me is the approve URI. It’s just so SOAish / RPCish. Plus, at least the way this particular interface has been implemented, there is no way to access the approval-specific information without actually accessing the transfer. Hence I suggest that we define a new resource, TransferApproval, to account for the same.


# The following creates a new approval. If successfully executed
# the transfer is complete and no future amendments or approvals
# are allowed. The returned data shall include the TransferApproval
# URI and the transfer URI

POST /transfer/${id}/approval

# The following gets an existing approval

GET /transfer/${id}/approval/${approvalId}

Please note that the “${approvalId}” in the approval URI simply isn’t required, since there exists a 1-1 relationship with the transfer. I just included it for easier understanding. If I had to implement the functionality as is, I would choose to skip it. However, if I knew I would very soon need to build in multi-stage approval (as in most banking systems), I would keep it so that each approval against a transfer can also be listed.

But the really interesting method above is the POST. This is seemingly a simple new insert (in RDBMS parlance) into the TransferApproval table. But if you are building a REST service, you might be tempted to encourage your clients to not only create the new TransferApproval resource, but also go back and update the Transfer table to set its status to Completed. That would be a smell. Once the POST on the approval is processed, all side effects on other tables should be handled while servicing the POST request. In other words, the POST request is not just an insert – it’s an insert with an associated trigger to conduct all the necessary downstream processing. And it’s essential that one looks at request servicing in this manner so that CRUD can be used effectively. Servers should be designed this way, clients should anticipate it, and then we should be on our way to building non-shallow applications.
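
The idea of “an insert with an associated trigger” can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the in-memory maps stand in for the Account, Transfer and TransferApproval tables, and the class and method names (TransferApprovalResource, post) are hypothetical, not part of any framework.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-ins for the Account and Transfer tables.
class Account {
    BigDecimal balance;
    Account(BigDecimal balance) { this.balance = balance; }
}

class Transfer {
    enum Status { INITIATED, COMPLETED }
    final long sourceAccountId;
    final long destinationAccountId;
    final BigDecimal amount;
    Status status = Status.INITIATED;

    Transfer(long sourceAccountId, long destinationAccountId, BigDecimal amount) {
        this.sourceAccountId = sourceAccountId;
        this.destinationAccountId = destinationAccountId;
        this.amount = amount;
    }
}

// Hypothetical handler behind POST /transfer/{id}/approval.
class TransferApprovalResource {
    final Map<Long, Account> accounts = new HashMap<>();
    final Map<Long, Transfer> transfers = new HashMap<>();
    final Map<Long, Long> approvals = new HashMap<>(); // approvalId -> transferId
    private long nextApprovalId = 1;

    // To the client this is a plain "create"; the server does the rest.
    String post(long transferId) {
        Transfer transfer = transfers.get(transferId);
        if (transfer == null) {
            throw new IllegalArgumentException("No such transfer: " + transferId);
        }
        if (transfer.status == Transfer.Status.COMPLETED) {
            throw new IllegalStateException("Transfer already approved: " + transferId);
        }
        // Downstream processing: the "trigger" part of the insert.
        Account source = accounts.get(transfer.sourceAccountId);
        Account destination = accounts.get(transfer.destinationAccountId);
        source.balance = source.balance.subtract(transfer.amount);
        destination.balance = destination.balance.add(transfer.amount);
        transfer.status = Transfer.Status.COMPLETED;

        // The insert itself: record the approval and return its URI.
        long approvalId = nextApprovalId++;
        approvals.put(approvalId, transferId);
        return "/transfer/" + transferId + "/approval/" + approvalId;
    }
}
```

The client only ever creates an approval; it never touches the balances or the transfer status itself. A second POST against the same transfer is rejected here, which reflects the rule that no further approvals are allowed once the transfer is complete.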

So finally – CRUD is good. It makes things easy for the clients. Stick to CRUD. Just remember that it is CRUD on resources and not on tables, and the resources shall handle all the downstream changes necessary so that you don’t have to. And finally CRUD is the interface, not the implementation.

Note: Some might notice that this post is just a much more detailed elucidation of one of my earlier posts – REST is the DBMS of the internet.

17 Comments on CRUD is not only good for, but is the only consistent way to build REST over HTTP

  1. Nice post, I completely agree. People say CRUD like it’s a bad thing, and if you have a CRUD frontend it probably is. But we’re not talking about frontends here. HTTP is all about CRUD, so if you want to create services on top of it, they should be pretty CRUDdy too if they are to make use of the scalability of the HTTP protocol. It’s then up to the applications you build on top of these services to add interactivity. Twitter is a great example of how you can do this.

    • @mendelt

      One of the surprising aspects of berating CRUD in the context of REST is that people forget what it came from. It comes from the Uniform Interface constraint, which is exactly how HTTP got designed as well. While REST doesn’t say it should be CRUD (it just says it should be uniform), the Uniform Interface exported by HTTP is very CRUD like. And while I didn’t explicitly state it in the post body itself, if one is not using CRUD especially when used with HTTP, then the RESTfulness of the interface could also be called into question.

      I think your observation about it not being CRUD for CRUD frontends is absolutely correct. I would go on to add that the interactivity you refer to can be added by either the applications on top or by proper resource identification and scoping.

  2. Thank you for the post!

    I have one concern. When we access resources over HTTP we have just a few HTTP methods to trigger behaviors on our resources. So how do we deal correctly with many different behaviors using this limited set of HTTP methods?

    An example (not from banking, but from an agile context, but should be understandable).

    We have a UserStory as a resource. To change the UserStory data we use PUT and send a new version of a resource. The change in the data can of course trigger some side effects on the server. That’s easy.

    How about splitting user story into two resources (common operation you can do with a UserStory)?

    This is in some way a “create” operation (new user story being added) and in other way an “update” (the old user story stays but we can move some tasks to the newly created one). We can also think of this as creating two new user stories, but anyway then we need a delete of an old one as a side effect. So what would it be:

    PUT /userstory/{id} with some data including a reference to a base user story that is to be split

    or maybe:

    POST /userstories

    or maybe something else?

    The other example is upgrading a Task (a part of a user story) to a UserStory. Again this is not a regular update, but more a transformation of a Resource to another one.

    The more general question is whether it is expected to use a single resource locator (here a URI like /userstory/{id}) with different data sent to it to trigger different behaviors (the server will recognize the type of data sent to the resource and will do what is required), or whether we should rather use a unique data structure when, for example, POSTing to that URI?

    Regards,
    Marcin

    • Marcin,

      This is a great example to further discuss introducing non-shallow functionality into REST applications using CRUD.

      I did mention earlier that resource identification and scoping actually turns out to be the aspect which requires some skill when designing REST interfaces. Based on my understanding of agile methodologies and tools, and based on the otherwise limited information I have about the context, I would probably go down the following path. You may with a slightly better understanding of the context choose a more appropriate path.

      A story being split is a resource – say StorySplit. When I split a story, I would need at least the following information :

      • The name of the new story
      • The URI of the story being split
      • The list of URIs of the tasks being moved over to the new story

      These would form the essential attributes for creating a StorySplit resource. There is one more attribute that will eventually become an attribute of this resource: the URI of the new story so formed.

      A POST on say /story/{story_id}/StorySplit would result in a creation of new story with a URI /story/{new_story_id}. The name of the new story will be as was requested in the StorySplit resource, and the tasks that were moved will now be visible on a GET of the new story, and will no longer be visible on a GET of the old story.

      A moot question would be would you want to support a GET on the StorySplit itself. Note that even if REST requires you to maintain a Uniform Interface – it does not require you to implement every one of the methods. Thus if it is not important to store the history of a story split, the same need not be supported at all. Consequently PUT and DELETE will also be irrelevant in case of a StorySplit – and the only method to be supported is POST. In such a situation, you may not be required to create a new domain object / table to reflect a story split.

      On the other hand if it is important to maintain the history of a StorySplit, then it might be helpful to define the domain object / table to reflect the story split. Moreover the StorySplit could be accessed using a GET /story/{story_id}/StorySplit/{story_split_id} or GET /storysplit/{id}. Even in this situation it is less than likely you might want to support a PUT or a DELETE operation on a StorySplit (though it is not unimaginable to do so).

      Similarly a POST /task/{id}/story_upgrade could result in the earlier task getting delinked from the earlier story, and getting attached to a new story that gets created (assuming one would want the same task to get moved to a new story).

      Finally, to the general question you refer to: I would not overload a single resource locator /userstory/{id} with multiple connotations based on different / unique data structures, beyond those clearly identified as separate based on HTTP methods. I would define a new resource type, support a POST on the new resource type, support a GET if history is important, and finally support a PUT and DELETE if it contextually makes sense. All of these would be on a new resource type, e.g. on /userstory/{id}/new_resource_type or on /new_resource_type (the latter would require that the ids for the new resource type are unique and do not require the userstory id to be resolved, i.e. that the id alone can function as a primary key for the new resource type).
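
      A rough Java sketch of the StorySplit handling described above, assuming the simpler variant where the split itself is not stored and only POST is supported. The in-memory map, the class names (Story, StorySplitResource) and the starting id of 100 are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory story holding the URIs of its tasks.
class Story {
    final String name;
    final List<String> taskUris = new ArrayList<>();
    Story(String name) { this.name = name; }
}

// Hypothetical handler behind POST /story/{story_id}/StorySplit.
class StorySplitResource {
    final Map<Long, Story> stories = new HashMap<>();
    private long nextStoryId = 100; // arbitrary starting id for new stories

    // Creates the new story, moves the listed tasks, returns the new story URI.
    String post(long storyId, String newStoryName, List<String> taskUrisToMove) {
        Story original = stories.get(storyId);
        if (original == null) {
            throw new IllegalArgumentException("No such story: " + storyId);
        }
        if (!original.taskUris.containsAll(taskUrisToMove)) {
            throw new IllegalArgumentException("Tasks must belong to the story being split");
        }
        Story created = new Story(newStoryName);
        original.taskUris.removeAll(taskUrisToMove); // no longer visible on the old story
        created.taskUris.addAll(taskUrisToMove);     // now visible on the new story

        long newStoryId = nextStoryId++;
        stories.put(newStoryId, created);
        return "/story/" + newStoryId;
    }
}
```

      Because nothing about the split is persisted, there is nothing for a GET, PUT or DELETE on the StorySplit to operate on in this variant.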

  3. I also thought about this approach with a new resource but for me this just introduces a set of “temporary” resources as they live only during POST operation (yes I don’t need to store StorySplit or TaskUpgrade anywhere).

    This is something that probably requires a mind shift for somebody still thinking more RPC style or more OO style :-) And maybe REST also requires thinking about resources as not necessarily something permanent/persistent.

    Generally in CRUD thinking the “U” is the problem :-) as we perceive most of the behaviors as hidden behind that UPDATE part. All the others are rather straightforward.

    • Yes, there is indeed a mind shift, and a pretty strong one at that. Whatever style one uses, the essential atomic elements of an API do not change – what REST changes is the way these API elements get represented. What it does cause a shift in is that instead of thinking of something as a (remote) method with a signature, it requires one to think of an HTTP Method + Resource pair. As a consequence many verbs in the former style become an HTTP Method + noun in the latter, e.g. story.split() became POST story/{id}/storysplit

      One of the positive consequences of this shift is that for the end user of the API, things become far easier, as Subbu Allamaraju portrays very eloquently in Describing RESTful applications.

  4. Manas Garg says:

    From the SNMP days, I remember that everything could be expressed in terms of get/set. Of course, in that world, there was no notion of create/delete because the resources were predefined by the vendors.

    SNMP has been successful for this reason and this is precisely the same reason for which people have hated SNMP. At the end of the day, more complicated things could not be expressed in get/set. It kept getting harder and harder to cleanly model resources in the system which could be get/set. Most of the MIBs for routers/switches look horrible today.

    When I want to block a MAC on a switch, my mind sees it as an action to the effect “go block a MAC”. But in SNMP world, first a resource had to be identified and then a state variable will be created and then one could set that variable.

    As I mentioned earlier, that’s what made SNMP so wildly successful because it was very easy to implement things. But at the same time, the switch vendors were forced into exposing CORBA or other interfaces because the modeling was non-intuitive most of the time.

    CRUD is fine. It may work well for 80% cases. May be even 85% cases. However, just like everything else, it has limitations. You can always model it in CRUD manner. But there are cases when introducing new verbs will be more graceful.

    To give a real world example, Trailofview has an object versioning system. You can access the latest version of an object in the form /verobj/obj_type/{obj_id} and you can also access an older version of the same obj /verobj/obj_type/{obj_id}/rev/{rev_id}.

    Now, there is a facility to revert an object to older revision. In my opinion, there are two ways to do it:

    1. First fetch the values in a specific revision GET “/verobj/obj_type/{obj_id}/rev/{rev_id}” and then update the resource by putting those values PUT “/verobj/obj_type/{obj_id}”

    2. Introduce a new verb and do REVERT “/verobj/obj_type/{obj_id}”. Since, there is nothing like REVERT in HTTP, I overload PUT and pass revert as a parameter to it.

    While #1 is possible and is more CRUDdy, #2 is more direct and sounds more natural to me.

    • “CRUD is fine. It may work well for 80% cases. May be even 85% cases. However, just like everything else, it has limitations”

      Sure, like any other style it will have some limitations, some peculiarities and some strengths. In a different post, Why REST?, I attempt to address in a great deal of detail not only the similarities but also how the REST style was influenced by web protocols such as HTTP and FTP.

      “Introduce a new verb and do REVERT ‘/verobj/obj_type/{obj_id}’. Since, there is nothing like REVERT in HTTP, I overload PUT and pass revert as a parameter to it.”

      That in essence can also be looked at as a feature of REST (just a matter of deciding what goggles one wants to put on). The Uniform Interface constraint of REST ensures that the API is consistent, and even though it may constrain you from adding a REVERT, it may in many cases make things easier for the clients due to the consistency it enforces. That consistency is a characteristic to be cherished.

      Would I overload PUT and pass a revert parameter? I would actually POST to /verobj/obj_type/{obj_id}/reversal. While constraining the methods, REST does give all of us a relatively free hand in designing the resource structure, and it’s through discussions like this that we collectively shall not only get better at it, but hopefully to some extent more consistent as well.
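
      A minimal Java sketch of what a POST on such a reversal resource might do. It assumes that a revert is recorded as a new revision carrying the content of the older one, which is my assumption and not necessarily how the versioning system mentioned above behaves. The class and method names (VersionedObject, postReversal) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical versioned object; the list index doubles as the rev_id.
class VersionedObject {
    private final List<String> revisions = new ArrayList<>();

    VersionedObject(String initialContent) { revisions.add(initialContent); }

    // GET /verobj/obj_type/{obj_id}: the latest revision.
    String get() { return revisions.get(revisions.size() - 1); }

    // GET /verobj/obj_type/{obj_id}/rev/{rev_id}: a specific revision.
    String get(int revId) { return revisions.get(revId); }

    // PUT /verobj/obj_type/{obj_id}: stores new content as a new revision.
    int put(String content) {
        revisions.add(content);
        return revisions.size() - 1;
    }

    // POST /verobj/obj_type/{obj_id}/reversal with the target rev_id in the payload.
    // Assumption: reverting appends a copy of the old revision, keeping history intact.
    int postReversal(int targetRevId) {
        String oldContent = revisions.get(targetRevId);
        revisions.add(oldContent);
        return revisions.size() - 1;
    }
}
```

      The client states the intent (a reversal) through the resource it POSTs to, and the server decides how the revision history is updated.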

  5. Hello Dhananjay.
    I agree with all reasons, but disagree a little bit about CRUD and also about the proposed API.

    First, CRUD. Create, Read, Update, Delete are the major operations for a data record (or data element) metaphor. It is not bad, but not necessarily THE semantics of HTTP.
    As you can see, in several discussions about how to implement things using HTTP, the majority of confusions comes when you think of HTTP operations as CRUD operations, given the work you have to do does not match the simple CRUD semantic, or given you are not sure how to simply create something (the C) or update (the U) using post/put.

    The problem there is HTTP is different to the simple four CRUD operations. That is why they are not called the same. POST is a multifunction thing, and a resource does not behave as a data element.

    Let’s see it this way. In CRUD you can Create, Read, Update and Delete a data element.
    In HTTP, you can POST a payload to a managing resource in a server, you can PUT a resource to a server, you can GET a resource representation from a server, and you can DELETE (or disable) a resource in a server.

    Similar? Yes, but not equal. If using the CRUD semantics, things like splitting the user story mean you have to delete the original one and insert two new ones, right?
    Using HTTP semantics, you POST to the userStory container resource a Split request (in the payload, not the URL), including data in the payload to indicate the container where to split, and the operation should return the data indicating the new resources created.

    Upgrading a Task is clearly a PUT, where you replace the actual resource with the one in the payload. Of course, it is rich enough to allow adding to the payload only the upgrade info, so you don’t send the full entity again.
    See the difference? If I think in HTTP semantics, it is easier for me to know how to do things. At least, I see it this way.

    Last, the URL. As per our discussions, things like /story/{story_id}/StorySplit should be used carefully. The StorySplit at the end is the identification of a resource contained or managed by the /story/{story_id} resource. It is a noun, not a verb. This is to avoid the action or method description in the URL. The idea of posting to that as a resource is totally correct, as a resource may be dynamic and may not be related to a “physical” entity, but just a concept.

    Note however, that HTTP provides metadata too in terms of headers. So, a GET to that StorySplit may return metadata in headers and no content! Same, DELETE to that StorySplit may mean to rollback the split, which we may not want to do.

    The thing I love from this is the richness of the discussions!

    Regards.

    • “As you can see, in several discussions about how to implement things using HTTP, the majority of confusions comes when you think of HTTP operations as CRUD operations, given the work you have to do does not match the simple CRUD semantic, or given you are not sure how to simply create something (the C) or update (the U) using post/put.”

      I have seen many of these discussions around PUT and POST, and the way I’ve internalised these is that if I am updating an existing resource it’s a PUT, and if I am creating a new resource it’s a POST. However, as an API designer it is my responsibility to decide what is the more appropriate design under the context.

      “The problem there is HTTP is different to the simple four CRUD operations. That is why they are not called the same. POST is a multifunction thing, and a resource does not behave as a data element.”

      The HTTP RFC refers to POST as “The POST method is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI” and PUT as “The PUT method requests that the enclosed entity be stored under the supplied Request-URI”. I’ve generally found these helpful guidelines under the circumstances.

      “Using HTTP semantics, you POST to the userStory container resource a Split request (in the payload, not the URL), including data in the payload to indicate the container where to split, and the operation should return the data indicating the new resources created.”

      The intent of the request, or the logical resource that is being created (the split), is being conveyed by the URI; it’s only the supporting data for the same that is being conveyed through the payload. That’s pretty consistent with HTTP imo.

      “Upgrading a Task is clearly a PUT, where you replace the actual resource with the one in the payload”

      I don’t think it is so clear. Again as a designer you make the choice. In this case a Task is being upgraded / converted to a story. I would feel very hard pressed to justify a PUT.

      While I did offer different opinions there, I think we perhaps do agree on the fact that using CRUD as a metaphor in the context of data records alone is perhaps bad. I suggest that given the choice of decrying / berating CRUD, and explaining that CRUD needs to be viewed in the context of resources, and also as an interface and not an implementation, I would choose the latter – which is what this blog post is all about.

  6. Manas Garg says:

    @dnene But reversal doesn’t sound like a resource, does it? :)

    I agree to everything that you say except the one-size-fits-all theory. CRUD is a tool that works very well in most of the cases. However, there are times when one should break the rules to keep things simple.

    Eventually, the goal is not CRUD, the goal is simplicity. And when one crosses that 85% threshold, one must break the CRUD rules to retain simplicity.

  7. Travis Dunn says:

    Just agreeing here with Manas Garg; REST is an abstraction for simplicity, and we should be free to reshape that abstraction if it gets us more simplicity. In fact, deciding when to violate REST is probably the central task of a developer implementing it in their application; monitoring the extent to which actions against resources push the boundary of design until they demand the extraction of a new resource.

    • I agree in principle that we should not become dogmatic about implementing a particular abstraction or rule. The only caveat I have is that so far I have discovered that what initially seems complex (having to identify a new resource rather than overload an existing one) actually turns out to be simpler once you work through the various paces that are required. However, that is an argument that will require a lot of detailed supporting evidence, something I propose to offer over the next week or two with another blog post focused on exactly the same topic.

  8. Very good post Dhananjay; I agree completely.

    While HTTP was originally designed, as you say, with the Uniform Interface constraint, unfortunately it has not stayed so.

    In my blog post at: http://www.internetfilter.com/w/articles/http_methods_and_proxies

    I list the current complete official and unofficial HTTP methods:

    ACL BASELINE-CONTROL BCOPY BDELETE BIND BMOVE BPROPFIND BPROPPATCH CHECKIN CHECKOUT CONNECT COPY DELETE GET HEAD LABEL LINK LOCK MERGE MKACTIVITY MKCALENDAR MKCOL MKREDIRECTREF MKWORKSPACE MOVE NOTIFY OPTIONS ORDERPATCH PATCH POLL POST PROPFIND PROPPATCH PUT REBIND REPORT SEARCH SUBSCRIBE TRACE UNBIND UNCHECKOUT UNLINK UNLOCK UNSUBSCRIBE UPDATE UPDATEREDIRECTREF VERSION-CONTROL X-MS-ENUMATTS

    This makes me sad. A few like “CONNECT” or “TRACE” are more network oriented so those are fine. But the group of people who designed these additional HTTP verbs really did not ‘get’ the original intent.

    For instance, why would they need “MKCALENDAR” or “MKACTIVITY” at all when they could just POST it? If people followed in this vein we ought to have “MKBLOG”, “MKBLOGENTRY”, “MKBLOGCOMMENT”, “DELBLOG”, … Thankfully, we don’t! That’s because we already had everything we need!

    Every function performed by all these HTTP verbs can be boiled down to one of “Create”, “Read”, “Update”, or “Delete”. Having these other verbs in the protocol makes the networking unnecessarily more complex, and this additional complexity makes for more bugs and possible security holes – which is why JavaScript is restricted from using most of the HTTP verbs.

    And as you say, the important thing here is that these operations do not have to be restricted for management of a single low level table at all. Of course one should not become dogmatic about a particular abstraction.

    My personal rule of thumb here is “… think more than twice if you think you need other methods. How is it going to be anything other than a Create, Read, Update, or Delete of some resource?”

    Regards,
    Jeff Koftinoff
    http://www.jdkoftinoff.com

  9. Yes, Dhananjay, the POST and PUT do say so in the HTTP.
    Actually, POST adds: “POST is designed to allow a uniform method to cover the following functions:
    – Annotation of existing resources;
    – Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles;
    – Providing a block of data, such as the result of submitting a form, to a data-handling process;
    – Extending a database through an append operation.
    The actual function performed by the POST method is determined by the server and is usually dependent on the Request-URI”
    And it includes a way to identify when to use POST and when PUT. It says in POST, the URI is the target resource that receives the payload entity as a subordinate, while in PUT the URI represents the actual payload entity!

    That is why, if you want to replace something, it is PUT. If you want to move something, from the task container to the story container for instance, you can use a POST to the Story URI and the payload can be the Task URI to be moved! That is simple.

    Now, about URIs, they are IDs in the form:
    (rfc2616, sec 3.2.2)
    http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
    Note that the URL is for identifying resources. So, in addition to the path, it may contain a query string to further identify the resource, but nothing more. Any other info, like properties, should be managed as metadata using headers. The same goes for changes to metadata or actual data: use headers or the payload. Adding extra things to the URL to denote an action or a payload breaks the idea. This is usually done to simplify the API notation so that it is completely contained in the URL, but HTTP is far more complex than that.
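
    As a small illustration of the parts named in that grammar, the standard java.net.URI class can pull a URL apart into host, port, path and query. The example URL and the helper name (describe) are made up for this sketch.

```java
import java.net.URI;

class UriPartsSketch {
    // Splits a URL into the parts named in the http_URL grammar.
    static String describe(String url) {
        URI uri = URI.create(url);
        return uri.getScheme() + " | " + uri.getHost() + " | " + uri.getPort()
                + " | " + uri.getPath() + " | " + uri.getQuery();
    }

    public static void main(String[] args) {
        // prints "http | example.com | 8080 | /story/42 | view=summary"
        System.out.println(describe("http://example.com:8080/story/42?view=summary"));
    }
}
```

    Everything the URL carries here serves to identify the resource; the method, headers and payload travel separately in the HTTP request.
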
    Sure, I agree with your take on CRUD, just saying be careful!

    Cheers.

  10. M2 says:

    I don’t completely agree with your choice of POST in some cases. The litmus test to choose between PUT and POST is idempotence. For idempotent (create or update) requests you should use PUT, otherwise you should use POST! If you want to create a new resource and you know the URI of the resource to be created before sending your request, then you should use PUT instead of POST. If you don’t know the URI (id) of the new resource and your request is sent to a factory resource, then it should be a POST. The reasoning is the same for modifying an existing resource. If your modification is idempotent, it should be a PUT, otherwise it should be a POST (for example, when you want to append some content to an existing resource).
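
    The idempotence test can be illustrated with a small Java sketch. The NoteStore class and the /notes URIs are hypothetical; the point is only that repeating a PUT to a known URI leaves the same state, while repeating a POST to a factory resource creates a new resource each time.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory store keyed by resource URI.
class NoteStore {
    final Map<String, String> notes = new LinkedHashMap<>();
    private int nextId = 1;

    // PUT /notes/{id}: the client names the URI, so repeating the call changes nothing.
    void put(String uri, String body) {
        notes.put(uri, body);
    }

    // POST /notes: the server picks the URI, so every call creates another resource.
    String post(String body) {
        String uri = "/notes/" + nextId++;
        notes.put(uri, body);
        return uri;
    }
}
```

    Sending the same PUT twice leaves one note behind, whereas sending the same POST twice leaves two.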

    • M2,

      I apologise for being unable to infer which set of my statements you were referring to. I did a quick review and couldn’t find anything which contradicts your litmus test. So a quick note to that effect would be quite helpful.

      There was one statement I did make as follows which I wonder was a candidate to cause some confusion.

      “A POST on say /story/{story_id}/StorySplit would result in a creation of new story with a URI /story/{new_story_id}. The name of the new story will be as was requested in the StorySplit resource”

      Just in case that was the issue allow me to clarify that I was requesting a “name” and not an “id” – so I did not know the URI upfront. If not, as I requested, a quick note to clarify what you were referring to would be greatly helpful.