Month: August 2009

Why should I switch to Scala?

Posted by – August 17, 2009

This post is a role-play and does not reflect my individual opinion about Scala accurately. I am convinced of the capabilities and features of Scala, along with the fact that it deserves the mantle of a long term replacement for Java. However, language adoption goes beyond technical capabilities, and this post is a speculation on what a typical manager might be dealing with when attempting to decide whether to switch to Scala.


So I have been reading a lot about Scala lately and even opinions about how it will be a long term replacement for Java. I’ve also read some interesting writeups about Scala adoption such as On Scala’s Future and A Tipping Point for Scala. While I used to code a lot, my responsibilities today require me to interact with and address a lot of issues including those faced by our customers, our development teams and also engage with my peers and superiors on many other difficulties bedeviling our organisation. This gives me little time to try out Scala. I know I should be doing that, but sincerely I do not have the time. So I rely on the feedback of my team, the trade journals and other influential architects within and outside my organisation.

I have heard about many developers switching from Java to Python or Ruby. However, I have heard of only a relatively small number of large Java shops making the shift; most of the switch stories I’ve heard involve smaller teams. I can feel the excitement Scala has generated amongst the development teams: the brevity, the introduction of the functional programming model, the exciting work being done on concurrency, and so on. I have no doubt that, given so much excitement, it must really be a good language.

To introduce my organisation: it is one of those shops which service many projects concurrently. Given the tremendous business and growth, I must confess we do not always have the luxury of being able to hire top-notch talent. We use Java for a lot of our projects, and that’s a language our customers are comfortable with. I’ve had some of the senior people check out Scala to get some feedback on the language. At this stage I am inclined to evaluate the shift, but not yet convinced enough to make it. I am sure that, if convinced, I could drive the change to Scala incrementally. However, my fear stems from the fact that if things don’t turn out well, despite all the great advice I’ve received, it’s going to be my rear end on the line. So here are some of my concerns about the shift to Scala. There are many of them, and some of you might be able to help me through this thought process.

  • Functional Programming : I’m sure in many ways it rocks. But my guys tell me they are not sure how to use it in the typical bread and butter applications which read from a database, do some processing and write back to the database. Does functional programming help me in this context? Will my team grow into being able to write functions with no side effects, assuming that’s a desirable goal? What if they tie themselves up in knots and my release to the customer is put at risk? I can’t afford that. Is functional programming even desirable in such contexts? If not, should I just ditch functional programming there and work with the normal imperative capabilities of Scala? I am so confused, and afraid.
  • Different Syntax : While Scala runs on the JRE, its syntax is very different from Java’s. From what I could gather, it is much easier for a Java programmer to read (make sense of) simple Python code than to read Scala code. Is that true? So even if I do get compatibility in terms of the runtime environment, would I be picking up a language so syntactically different that it involves a substantial relearning curve? I remember when we had to learn Java and Javascript. For better or for worse, their syntax was a relatively minor modification of C/C++ syntax, compared to what I sense as the syntactic shift between Java and Scala. Am I wrong? If so, could you point me to resources which help me understand that Scala code is not much different from Java?
  • Sample code : Guys, I need your help. I need to see some good sample code. Some code which reflects how a typical application is architected, designed and programmed in Scala. And I don’t need it for complex multi-threaded, actor-based processing. I just need to see simple J2EE server based departmental applications, maybe a simple recruitment tracking or library maintenance application. If I find a good one, I’ll just take it and give it to my team and say: there, that’s how we’re largely going to build it, and even if we make a few changes along the way we at least have a reasonable template that we can build from.
  • Dumbed down environment : I remember my great adventures with C and vi and make. But my team today is very different. They want great IDEs. They must have syntax highlighting, autocompletion and nice refactoring capabilities. If I ask them to move, some of them might be excited about the change and be willing to overcome these short term hurdles. But there are some of them who will not be keen to do so and may be disinclined to support such a shift. And at the end of the day my ability to conduct this shift is a function of my ability to carry a large proportion of them along with me. Even when I considered a shift from svn to git, the IDE support was a big issue even though quite obviously git capabilities were really exciting. I couldn’t push along that change, and in this case we are talking of changing the language.
  • Is this a good time to shift to Scala? I remember the early adopters of Java from 1996 through 2001. While they gained a lot of experience, the JRE and J2EE really matured only after JRE 1.3. Scala seems to be coming out with so many enhancements so fast that I am not sure it has stabilised. I am told a 2.8 is coming out in a few months. So if I train my team and Scala continues to change rapidly, will I have to keep retraining my team regularly? And what about the customers I take to production? Will the frequent upgrades mean I end up supporting multiple customers on multiple versions of Scala? Maybe Scala is stable, but it would be helpful for someone important enough to make a clear statement that no major shifts are anticipated anytime soon and that version changes are likely to come no faster than the JRE version upgrades (which were fast enough).
  • Support from peers and superiors : I remember the day I decided to shift to Java. What made the move easy for me was the sheer fact that Java was a big paradigm leap away from the then dominant C++. Not only was it cross platform with binary compatibility thrown in for good measure, but Sun also made all the right noises to appeal to the enterprise architects and the business managers. I see the senior developers in my team clamouring for the shift to Scala, but my peer managers and my superiors don’t display even a fraction of the enthusiasm they displayed during the Java shift. The implication for me is that the risk cover I get when I order the shift is far less than what I had when I made the move to Java. Which means if things don’t quite work out well, I’m really going to be screwed.
  • Business friendliness : I understand all the nice talk about the technical excellence of Scala. But I really need to translate all these great language features into a projected ROI that I can use to convince others. So I would like to see actual case studies of applications that were moved to Scala and the impact on time and cost, so that I can use them to compute my ROI. What scares me is that the learning curve may delay the initial applications long enough to push the breakeven point of shifting to Scala well beyond 12 months, perhaps even beyond 24 months. Things might not be that difficult, but in the absence of known studies I am likely to lean towards projecting a worst case scenario rather than an optimistic one.

So folks, I am asking for your help. And while a lot of you may think that people like us who balk at the thought of limited IDE support are wimps, please remember that 80% of us don’t fit into the top 20%. And if you would like Scala to be popular, you need us as much as we need you. And if you are not too sure, please remember Lisp and Smalltalk are great languages as well.

CRUD is not only good for, but is the only consistent way to build REST over HTTP

Posted by – August 14, 2009

This is to comment on a perception forming that REST encourages exposing basic data elements through CRUD and that it encourages development of dumb applications (applications with shallow business logic).

Apart from some tweets I saw on the topic and some twitter conversations, the blog posts which perhaps set off the thought were

The underlying fear and rationale for these posts makes a lot of sense: the fear of creating really dumb, passive and shallow applications. I submit, however, that the problem is not CRUD. The problem is resource identification and scoping, and CRUD is not only good for but is the right way to build intelligent, active and deep applications.

CRUD supports Uniform Interface : The primary reason why CRUD gets used is because it supports a uniform interface. At the end of the day, a consistent Create/Read/Update/Delete or POST/GET/PUT/DELETE interface makes things easy. It makes things easy for the development team because of the consistency it introduces in their applications. It makes things easy for the clients who have a simple and consistent interface to deal with. At the interface level CRUD breeds consistency, and at the risk of broad generalisation, consistency is good.
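To make the idea of a uniform interface concrete, here is a minimal sketch in Java. The names CrudResource and InMemoryResource are hypothetical and exist only for this illustration; the mapping of the four operations to POST/GET/PUT/DELETE is the one described above, and the in-memory map merely stands in for whatever a real resource would do behind the interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical uniform interface: every resource type exposes the same
// four operations, whatever it does internally.
interface CrudResource<T> {
    long create(T representation);              // POST   on the collection
    Optional<T> read(long id);                  // GET    on one resource
    boolean update(long id, T representation);  // PUT    on one resource
    boolean delete(long id);                    // DELETE on one resource
}

// Illustrative in-memory implementation; a real resource could hide any
// amount of business logic behind the same four methods.
class InMemoryResource<T> implements CrudResource<T> {
    private final Map<Long, T> store = new LinkedHashMap<>();
    private long nextId = 1;

    @Override
    public long create(T representation) {
        long id = nextId++;
        store.put(id, representation);
        return id;
    }

    @Override
    public Optional<T> read(long id) {
        return Optional.ofNullable(store.get(id));
    }

    @Override
    public boolean update(long id, T representation) {
        if (!store.containsKey(id)) {
            return false; // nothing to update
        }
        store.put(id, representation);
        return true;
    }

    @Override
    public boolean delete(long id) {
        return store.remove(id) != null;
    }
}

class UniformInterfaceSketch {
    public static void main(String[] args) {
        CrudResource<String> notes = new InMemoryResource<>();
        long id = notes.create("first draft");
        notes.update(id, "second draft");
        System.out.println(notes.read(id).orElse("missing"));
        notes.delete(id);
        System.out.println(notes.read(id).isPresent());
    }
}
```

A client that has learnt these four operations for one resource type already knows how to talk to every other one, which is the consistency argument made above.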

So why do we at times end up creating shallow applications with REST? CRUD in general works with simple forms built on simple tables. Quite often this style of programming gets elevated into simple forms over simple domain objects. Standardised CRUD helps a lot at the lower end of application development, and most developers of database driven applications are likely to have attempted, at some stage early in their careers, to build a small CRUD library or framework to help themselves. The reason we end up creating shallow applications is not that we apply CRUD, but that we continue to apply CRUD on tables or simple domain objects. And therein lie the distinctions:

  • REST is not about CRUD on tables – it’s about CRUD on resources
  • CRUD is the interface – not the implementation

I attempt to bring up the difference in the example that I detail below.

Simple Account Transfer Example

Let’s say we want to build the software to transfer amount X from account A into account B. Let’s further specify that a transfer is not effected immediately and requires one more explicit approval. Let’s also specify that while a transfer is waiting to be approved, it can be amended. That’s the simple scenario that we shall deal with.

In order to implement this, we shall define a data structure / table / object for Account which shall contain a field called balance. Further, there shall also be a Transfer table / object which shall contain the fields sourceAccount, destinationAccount, amount and status. The possible status values shall be Initiated and Completed.

In a simple service oriented application we shall perhaps have a transfer service. Ignoring error handling, SOA wrapping etc., the service interface will probably boil down to the following equivalent Java interface.

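import java.math.BigDecimal;

public interface TransferService
{
    // Initiates a transfer and returns the id of the new transfer
    public Long transfer(Long sourceAccountId,
                         Long destinationAccountId,
                         BigDecimal amount);
    // Retrieves a transfer, including its current status
    public Transfer get(Long transferId);
    // Amends a transfer that is still waiting to be approved
    public void amend(Long transferId,
                      Long sourceAccountId,
                      Long destinationAccountId,
                      BigDecimal amount);
    // Approves the transfer, which completes it
    public void approve(Long transferId);
}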

Let’s think about how these might get modeled in a REST environment. The important thing to remember is: don’t think about services or functions or methods. Think about which resources you choose to expose using a simple CRUD interface.


# The following creates a new transfer. The returned data shall include
# the URI of the new transfer, and the URI to approve it
POST /transfer

# The following retrieves the status of a current transfer. If it has not
# been approved the returned data shall include the URI to approve it.
GET /transfer/${id}

# The following modifies the transfer. The returned data shall also
# include the URI to approve it
PUT /transfer/${id}

# The following approves and further processes the transfer. It shall
# return the URI for the transfer
POST /transfer/${id}/approve

While most of this seems all right, what sticks out like a sore thumb to me is the approve URI. It’s just so SOAish / RPCish. Plus, at least the way this particular interface has been implemented, there is no way to access the approval specific information without actually accessing the transfer. Hence I suggest that we define a new resource, TransferApproval, to account for the same.


# The following creates a new approval. If successfully executed
# the transfer is complete and no future amendments or approvals
# are allowed. The returned data shall include the TransferApproval
# URI and the transfer URI
POST /transfer/${id}/approval

# The following gets an existing approval
GET /transfer/${id}/approval/${approvalId}

Please note that the “${approvalId}” in the approval URI isn’t strictly required, since there exists a 1-1 relationship with the transfer. I just included it for easier understanding. If I had to implement the functionality as is, I would choose to skip it. However, if I knew I would very soon need to build in multi-stage approval (as in most banking systems), I would keep it so that each approval against a transfer can also be listed.

But the really interesting method above is the POST. This is seemingly a simple insert (in RDBMS parlance) of a new row into the TransferApproval table. But if you are building a REST service, you might be tempted to have your clients not only create the new TransferApproval resource, but also go back and update the Transfer table to set its status to Approved. That would be a smell. Once the POST on the approval is processed, all side effects on other tables should be handled while servicing the POST request. In other words the POST request is not just an insert; it’s an insert with an associated trigger that conducts all the necessary downstream processing. And it’s essential that one looks at request servicing in this manner so that CRUD can be used effectively. Servers should be designed this way, clients should anticipate it, and then we should be on our way to building non-shallow applications.
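As a rough sketch of what "an insert with an associated trigger" could look like on the server side, consider the following Java code. Everything in it is an assumption made for illustration: the class TransferApprovalResource, its create method and the in-memory maps standing in for tables are hypothetical, and the status values are the Initiated and Completed values defined earlier for the Transfer object. It is one possible shape for the handler behind POST /transfer/${id}/approval, not a definitive implementation.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

class Account {
    BigDecimal balance;
    Account(BigDecimal balance) { this.balance = balance; }
}

class Transfer {
    final long sourceAccountId;
    final long destinationAccountId;
    final BigDecimal amount;
    String status = "Initiated"; // becomes "Completed" once approved

    Transfer(long sourceAccountId, long destinationAccountId, BigDecimal amount) {
        this.sourceAccountId = sourceAccountId;
        this.destinationAccountId = destinationAccountId;
        this.amount = amount;
    }
}

class TransferApproval {
    final long transferId;
    TransferApproval(long transferId) { this.transferId = transferId; }
}

// Hypothetical handler behind POST /transfer/${id}/approval.
// The maps stand in for the Account, Transfer and TransferApproval tables.
class TransferApprovalResource {
    final Map<Long, Account> accounts = new HashMap<>();
    final Map<Long, Transfer> transfers = new HashMap<>();
    final Map<Long, TransferApproval> approvals = new HashMap<>();

    TransferApproval create(long transferId) {
        Transfer transfer = transfers.get(transferId);
        if (transfer == null) {
            throw new IllegalArgumentException("unknown transfer " + transferId);
        }
        if (!"Initiated".equals(transfer.status)) {
            throw new IllegalStateException("transfer " + transferId + " is already completed");
        }
        Account source = accounts.get(transfer.sourceAccountId);
        Account destination = accounts.get(transfer.destinationAccountId);
        if (source == null || destination == null) {
            throw new IllegalStateException("transfer refers to a missing account");
        }

        // The "insert": the client only ever asks for this.
        TransferApproval approval = new TransferApproval(transferId);
        approvals.put(transferId, approval);

        // The "trigger": downstream changes the client never performs itself.
        source.balance = source.balance.subtract(transfer.amount);
        destination.balance = destination.balance.add(transfer.amount);
        transfer.status = "Completed";
        return approval;
    }
}
```

The client issues a single create on the approval resource; the balance movement and the status change happen as part of servicing that one request. A real implementation would also need a transaction around these steps and a check for sufficient funds, both of which are left out of this sketch.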

So finally – CRUD is good. It makes things easy for the clients. Stick to CRUD. Just remember that it is CRUD on resources and not on tables, and the resources shall handle all the downstream changes necessary so that you don’t have to. And finally CRUD is the interface, not the implementation.

Note: Some might notice that this post is just a much more detailed elucidation of one of my earlier posts – REST is the DBMS of the internet.

The Microsoft Word injunction has nothing to do with XML

Posted by – August 13, 2009

Obviously the development community is buzzing loudly about an injunction issued by a court against Microsoft, apparently disallowing it from marketing Word. However, what foxes me is that most articles give the impression that it is an issue with XML documents. As an example:

I also came across many other tweets bemoaning the verdict and expressing the opinion that it is a sad turn of events – it being related to storage of documents as XML. I did take a quick look at the said patent : Patent 5787449.

And therein, I think, there seems to be a big misunderstanding (at least to my lay reading). The patent itself has nothing to do with storing data or documents or XML. It has to do with a particular implementation of data storage which requires maintaining a metacode map. I shall use an example from the patent itself.

Let’s say the XML is as follows:

<Chapter><Title>The Secret Life of Data</Title><Para>Data is hostile.</Para>The End</Chapter>

The patent suggests a storage which would maintain a metacode map as follows :

Element Number   Element      Character Position
1                <Chapter>    0
2                <Title>      0
3                </Title>     23
4                <Para>       23
5                </Para>      39
6                </Chapter>   46

The metacode map essentially stores each tag along with the character position in the content at which it applies. You can also see the same clearly on page 15 of the patent document on Google Patents.

My reading suggests that the patent and alleged infringement if any has got nothing to do with storage of XML documents per se. In fact, that the tags being used are XML/SGML like is probably completely coincidental – these could very well be ${chapter} instead of <chapter>. And XML documents are stored with the tags embedded along with the content – this patent actually refers to maintaining a map of these tags and the positions they should be inserted into.
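To illustrate the distinction, here is a small Java sketch of the idea as I read it: the tags are pulled out of the text and kept in a separate list together with the position in the raw content where each one belongs. The class and method names (MetacodeEntry, MetacodeMapSketch, split, merge) are my own inventions for this illustration and do not come from the patent, and the parsing is deliberately naive. The input is written without a space before </Para>, so that the positions come out as 0, 0, 23, 23, 39 and 46, as in the table above.

```java
import java.util.ArrayList;
import java.util.List;

// One row of the map: a tag and the position in the raw content where it belongs.
class MetacodeEntry {
    final String tag;
    final int position;

    MetacodeEntry(String tag, int position) {
        this.tag = tag;
        this.position = position;
    }
}

class MetacodeMapSketch {
    final StringBuilder content = new StringBuilder();   // text with all tags removed
    final List<MetacodeEntry> map = new ArrayList<>();   // tags kept separately

    // Separates tags from content. Naive: anything between '<' and '>' is a tag.
    static MetacodeMapSketch split(String markup) {
        MetacodeMapSketch result = new MetacodeMapSketch();
        int i = 0;
        while (i < markup.length()) {
            if (markup.charAt(i) == '<') {
                int close = markup.indexOf('>', i);
                if (close < 0) {
                    throw new IllegalArgumentException("unterminated tag at " + i);
                }
                String tag = markup.substring(i, close + 1);
                result.map.add(new MetacodeEntry(tag, result.content.length()));
                i = close + 1;
            } else {
                result.content.append(markup.charAt(i));
                i++;
            }
        }
        return result;
    }

    // Rebuilds the marked-up text by inserting each tag at its recorded position.
    String merge() {
        StringBuilder out = new StringBuilder();
        int copied = 0;
        for (MetacodeEntry entry : map) {
            out.append(content, copied, entry.position);
            out.append(entry.tag);
            copied = entry.position;
        }
        out.append(content, copied, content.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String markup = "<Chapter><Title>The Secret Life of Data</Title>"
                + "<Para>Data is hostile.</Para>The End</Chapter>";
        MetacodeMapSketch doc = split(markup);
        for (MetacodeEntry entry : doc.map) {
            System.out.println(entry.tag + " " + entry.position);
        }
        System.out.println(doc.merge().equals(markup));
    }
}
```

The stored content here is just the plain text with no tags in it, which is quite unlike an ordinary XML file, where the tags sit inline with the text they mark up.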

Could I be wrong in my interpretation? Perhaps. I haven’t seen anyone else point this out, and if I am wrong I would prefer to be corrected. But the fact remains that as I write this post, I believe that XML and the storage of XML documents are completely orthogonal to the patent and the case around it. The case centers on a metacode map, and metacode maps are not a characteristic of typical XML storage at all. So there’s probably one big misunderstanding about what this case is about. If the injunction upsets you because you are against patents in principle, that’s fair. But if you are disappointed about the possibility that this somehow substantially impacts XML storage, then the way I interpret it there’s no such implication.

I shall keep my fingers crossed and hope no one points out a seemingly obvious flaw in my interpretation.