Jun 10

Seems like a nice term, doesn’t it - polyglot programming. Some recent posts related to it are Kill Java, Vol. 2 and Fractal Programming. It may seem hot, it may seem in, and it may make sense for you. However, please allow me to lay down the fine print.

Polyglot programming requires building bridges across computer languages. These bridges act like translators at the UN. If you are attempting to run the UN you need the translators, and you need to accept their cost. But make sure you understand that there is a cost, and justify to yourself that you really need to incur it.

Let me give you a case in point - JSP Tag Libraries. These were meant to help the presentation / UI guys write the cool UIs while allowing the Java programmers to focus on the other interesting stuff, which was much better done in Java. The Tag Library construct was thus a bridge between HTML and Java. Did you ever try to measure the runtime performance cost of tag libraries? I did, and it turned out that for all but the most non-trivial tag libraries the cost was too high. I was able to run the controller logic, look up the data from the cache, update the data in the cache (basically the entire lifecycle of a transaction except for the final updates) and sometimes finish the database updates as well, all in much less time than it took to move the data across the tag library when presenting it. I had a simple rule - if you need screaming concurrency and throughput from a small box, do not touch tag libraries. I did, however, use tag libraries when the performance demands weren’t so high.

Another example of such bridges being sold without proper warnings - EJB v 1.0. The fine print did talk about the remote API costs, but the general paradigm encouraged using the EJB APIs to communicate between the session beans and the entity beans. History does tell us this performance sucked too!

If you work with JNI you will soon realise that a lot of data transformation needs to happen when moving complex data structures between C and Java. Again, while this is much faster than exchanging XML documents over inter-process pipelines, it can be expensive too.

Suggestion: focus on the granularity and the triviality of what you are doing. If the communication across the different languages is fine-grained or too frequent, sit up, take notice and be careful. If the code in the other language (the language that is providing the service and is thus being called) is too small or too trivial, again ask yourself whether the bridge is really required.
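To make the granularity point concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Bridge` class is only a stand-in for a real language bridge (it marshals arguments and results through JSON and counts crossings), and `apply_discount` is an invented trivial rule. It is not a model of JSP tag libraries, EJB or JNI, just an illustration of fine-grained versus coarse-grained calls.

```python
import json


class Bridge:
    """Hypothetical stand-in for a language bridge: every call marshals
    its argument and result (here via JSON) and counts the crossing."""

    def __init__(self, func):
        self._func = func
        self.crossings = 0

    def call(self, arg):
        self.crossings += 1
        wire_in = json.dumps(arg)                  # marshal the argument
        result = self._func(json.loads(wire_in))   # run on the "other side"
        return json.loads(json.dumps(result))      # marshal the result back


def apply_discount(price_cents):
    """Invented trivial rule: 10% off, in integer cents."""
    return price_cents - price_cents // 10


def discount_all(prices_cents):
    """The same rule, exposed as one coarse-grained call over a batch."""
    return [apply_discount(p) for p in prices_cents]


prices = list(range(100, 1100))  # 1000 made-up prices

# Fine-grained: one bridge crossing per item.
fine = Bridge(apply_discount)
fine_result = [fine.call(p) for p in prices]

# Coarse-grained: one bridge crossing for the whole batch.
coarse = Bridge(discount_all)
coarse_result = coarse.call(prices)

print(fine.crossings, coarse.crossings)
```

Both variants compute the same result; only the number of crossings differs. Timing the two loops on your own machine (for example with `timeit`) is the kind of benchmark worth running before committing to a design.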

Like it or not, we are already polyglots. We do DHTML + CSS + JavaScript + (Java / Python / name your language). It makes sense to be polyglottish. Treat polyglottism as a tool in your toolbox and not as a fashion statement. Keep the kids in mind, especially when showing them how to move between static and dynamic programming languages - run your benchmarks and lay down the caveats. There is indeed a possibility that reckless application of the paradigm might lead you (or, more likely, your unsuspecting readers) to a crawling system - something the users will describe as a PolyClot. :)



Viewing 9 Comments

    • ^
    • v
    The concern about marshaling data is valid.

    But often with polyglot programming (in Java anyway) we are talking about generating bytecode. As far as the runtime is concerned, there is no difference whether the bytecode was generated by (say) a Scala compiler or a Java compiler.

    Now, if you are talking about a dynamic scripting language.... yes, performance will definitely be an issue.
    • ^
    • v
    Paul,

    I am not sure about the assumption that, because both languages compile to Java bytecode, there should be no difference at runtime. If you look at the first example I talked about - JSP Tag Libraries - these are compiled to Java bytecode, as are the calls to them. Yet the performance is radically different. This precisely is my concern.

    There just seems to be this assumption that given Java bytecode on both sides things will be hunky dory. That seems a rather facile assumption to me. I am sure it will be just fine for some use cases, but developers will trip on others. The trouble is that if they are not watching out for it they will have a harder fall - and that is what I am attempting to highlight.

    Dhananjay
    • ^
    • v
    While I agree that polyglot programming is over-hyped and dangerous, I'm not sure that your reasoning is sound. As a previous commenter pointed out, a lot of the "polyglot languages" being pushed these days already run on the JVM. Some of these languages (such as Scala, Clojure, Groovy, etc) even go so far as to use the same call-stack and internal runtime semantics as Java. For such languages, the intrinsic performance penalty is extremely minimal. Obviously Groovy imposes all the overhead of a dynamic language, but Scala is just as fast as Java (even faster for some operations). Interop between such languages really seems to be the only significant overhead (loading the extra support libs, etc), but because the call stack is shared, "polyglot calls" are really just the same as normal Java method invocations.

    Your arguments also make the assumption that developers are jumping into polyglotism blindly without considering the performance implications at all. I don't think this is the case. When I use Ruby for something, I'm well aware that there are much faster (in terms of runtime performance) options. Given the vociferous nature of most language proponents, it's hard *not* to be aware of the performance minutiae of modern languages, especially as they compare to old standby languages like Java.

    There are a lot of reasons to be wary of polyglot programming, but I don't think performance is one of them.
    • ^
    • v
    "While I agree that polyglot programming is over-hyped and dangerous, I’m not sure that your reasoning is sound."

    If you read me carefully - I have not reasoned it out. I have projected my past experience into the future, and done that quite explicitly.

    "For such languages, the intrinsic performance penalty is extremely minimal"

    I have already stated my specific experience with JSP Tag Libraries. The only reason I am unable to quote the figures is that the metrics are locked away in documents at a customer site - documents I no longer have access to since the customer engagement is over.

    "Your arguments also make the assumption that developers are jumping into polyglotism blindly without considering the performance implications at all"

    I still remember the days in the middle of 2000 when I used to look at EJB designs and go - this just can't work, this just can't scale. Even if a lot of developers do keep their eyes open, I have seen sufficient empirical evidence of developers and managers having their eyes closed.

    "There are a lot of reasons to be wary of polyglot programming, but I don’t think performance is one of them."

    If polyglottism becomes more prevalent and that turns out to be so 2-3 years from now, then I shall stand corrected. But I would rather stand corrected after sounding a cautionary note than stand correct without having sounded it.

    Performance, just like beauty, is in the eye of the beholder. I have had to deliver more than 50 transactions per second (many reads + a few writes per transaction) on a single desktop-class machine (hardware as of 2-3 years ago) on multiple occasions in the past few years. At that level of demand every millisecond counts. What is good enough for one set of performance targets may suck for another. So even if 9 out of 10 readers think I'm overreacting, if I am able to help the 1 remaining developer meet his performance targets, the objective of this post will be served.

    I am not worried about the large-grained calls. However, even a 100 nanosecond overhead on method calls in a deep inner loop of a high-throughput system will hurt, and hurt badly. The reason I wrote this note is not that I expect the individual method overhead to be too high. But if you have a core Java engine calling small business rules written in other languages (as some of the thinking seems to be heading towards), these are more likely to be fine-grained calls stuck in inner loops.
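A back-of-the-envelope sketch of that arithmetic, in Python. The 50 transactions per second and the 100 ns per-call overhead are the figures quoted above; the 100,000 bridged calls per transaction is a purely hypothetical number chosen for illustration, and the calculation assumes transactions are processed one after another on a single core.

```python
def overhead_fraction(tps, calls_per_txn, overhead_ns):
    """Fraction of the per-transaction time budget consumed by call
    overhead, assuming serial processing on one core (an assumption)."""
    budget_ns = 1_000_000_000 / tps          # time available per transaction
    return (calls_per_txn * overhead_ns) / budget_ns


# 50 TPS and 100 ns come from the discussion; 100_000 calls is hypothetical.
fraction = overhead_fraction(tps=50, calls_per_txn=100_000, overhead_ns=100)
print(f"{fraction:.0%} of the per-transaction budget")
```

Under those assumptions the bridge overhead alone takes half of the 20 ms available per transaction, before any real work is done. With only a handful of coarse-grained calls per transaction the same 100 ns would be negligible.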

    Having said that - I actually tried to benchmark something using JRuby. Unfortunately the sample code for JRuby didn't compile and I couldn't find the Javadoc API for the Ruby classes. So for now treat the caution as empirical and instinctive. If over the next few days I am able to test it out I will post the figures here.
    • ^
    • v
    The JSP tag libraries probably 'appeared' expensive because they were writing to the servlet output stream. There is a risk that the network time got added to your profiled tag times.
    • ^
    • v
    Ruby documentation: http://ruby-doc.org/core/

    The problem with benchmarking this sort of thing on JRuby is the platform has significant overhead, both in startup time and invocation cost. Ruby is an almost excessively dynamic language. It lends itself extremely nicely to metaprogramming and other nifty tricks, but this dynamic nature also precludes use of the native JVM call stack for most situations. For this reason, JRuby has quite a bit more overhead associated with a simple call than does Scala or Clojure. Also, JRuby has a fairly large runtime (including JIT) which must be init'd prior to doing anything. Groovy has something similar, but with Scala it's all just part of the JVM. Scala *literally* compiles down to Java-equivalent bytecode, there is no runtime or indirection layer other than the JVM itself.

    I think I agree with your central point, that the use of additional languages just to be "trendy" introduces unnecessary overhead and performance costs, but when judiciously applied, this overhead can be acceptably mitigated or even less-costly than the same algorithm handled in the single "main" language.
    • ^
    • v
    @mark

    I compared apples to apples. Simply replaced the tag libraries with raw java code.
    • ^
    • v
    So you can for example use a combination of Ruby, Java and external or internal DSLs:


    and
    Or you could use Clojure, Scala and JavaScript:


    and when talking about stable layer
    Languages in the stable layer can be Java, Scala or F#.


    and with regards to the dynamic layer

    The languages used in this layer are mostly external DSLs, but can also include extremely DSL-friendly languages like Ruby, Python or Groovy.


    Wonder what it will be like ....
    • ^
    • v
    Could tag lib performance be affected by synchronised pooling and the container in question?
    As for JNI - I would only use it if the native code was non-trivial and worth the reuse.
