Archive for the ‘ruby’ Category

How I ended up selecting Python for my latest project

Posted on June 9th, 2008 in Uncategorized, java, php, python, ruby, software | 25 Comments »

I have more than eight years of experience each with C++ and Java. I consider myself an expert in the latter, and a long time back I considered myself one in the former. For any programmer it is a difficult choice to move away from a platform and environment in which one has a substantial investment and into another where you essentially throw away years of experience and start as a novice (at least in pure programming terms).

This post should not be read as pro Python, anti Java, or pro/anti “name any language you wish”. I will gladly and gleefully go back to C++ / Java when using them makes sense in a given context. I am sharing my thoughts and not promoting or denigrating any languages or frameworks here.

Context :

Here’s the shift of context (not all aspects are necessarily relevant to the shift in language). There are many details I have deliberately left out, given the confidentiality that any commercial activity obviously requires.

  • From a relatively large programming team that I used to manage into a small one (yours truly only to begin with)
  • From building commercially owned closed software into open source software development (at least that is the intention at this stage, though it will take a few months to get there)
  • From customers always being large corporates who could often afford substantial hardware into customers who can range from individuals to large corporates
  • From mostly internal (intranet) facing to internet facing
  • From performance requirements of up to a few hundred thousand transactions per hour into a very wide range of requirements that depends on each customer
  • From a very high percentage of writes to a much smaller percentage (ie. read percentages are now much higher)

Initial Choice Set

Given that this application is intended to be hosted on the internet and is primarily a browser based application, my choices quickly narrowed down to Java / JEE (something I had long experience with), PHP (I had developed one web based application with it), Ruby and Python (I only had academic exposure to these). C/C++ did not figure in the choice set because of their obvious development overheads in the context of web applications.

I went through a fair degree of thought, creation of dummy applications and mental to and fro, and the process was not nearly as linear as I will describe below. It is simplified here so that the reader can get at least some insight into my mind, and my mind is made to seem a lot less confused than it actually is.

1st Elimination :

Java was the first one to get knocked off. One reason was that Java based applications typically either require a dedicated host or have to work under memory constraints on shared hosts. Simply put, Java scales exceptionally well but it has a minimum hardware / investment requirement which was not acceptable within this context. The application should be able to work in shared hosting environments. Another equally important reason was that the productivity of initial development, and that of making changes to Java applications, is much lower than with the other languages. I was only too acutely aware of the performance implications of this choice, but I believe the appropriate choice here is to scale out, especially since the read activity is so large compared to the writes. Scaling out does require more complex architectures (compared to simply scaling up) but that’s the way that is appropriate in this context.

Subsequent thought process

This one was much tougher. Let me delve for a moment into what I believed the key strengths of the various languages were.

Ruby : I really really loved the syntax. Compact, cute ‘n’ thoroughly OO. Strong metaprogramming capabilities
PHP : Massive developer base (especially important when the intention is to eventually open source the application). It is amongst the easiest languages to use. Another advantage in its favour is the ‘C’ness of the syntax which makes it easier for anyone coming in from the C/C++/Java world.
Python : The metaprogramming and OO were almost, but not quite, as good as Ruby’s. I really love the indentation driven syntax. I know many might differ but I really like it, and the neat paragraphs and lack of block braces make the code a lot more readable. Current production interpreters seem to perform the best compared to Ruby and PHP. I know Ruby 1.9 is getting much faster but I suspect it is unlikely to be enough to make it much faster than Python already is.

As I considered the languages, it was important to look at the frameworks. I looked at CodeIgniter, CakePHP and Zend for PHP, Rails for Ruby and Pylons and Django for Python.

I believe one of the important aspects of architecture decision making today is that you bring the available toolsets / frameworks into the decision making process even when you are attempting to select a language. You are in effect evaluating a package involving both the language and the framework. A specific issue in my context was that, regardless of what framework I chose, I was completely sure I would need to change it / extend it, because no well known existing framework supports some of the basic requirements of the application I am about to build. I am convinced this is not a “Not Built Here” syndrome that I am suffering from but simply a necessity of the domain I am attempting to work with.

In terms of ease of use for simpler applications I would rate Rails very highly. Not only does it make the actual programming simple, it gives you a nice set of tools around it to make a lot of typical activities really easy.

2nd elimination

Ruby and Rails went out. The reasons were as follows.

  • Ruby is just so well designed from a syntax and OO perspective, that coming from so many years of C++/Java background with a substantial grounding in Object Orientation as implemented by these languages, I really did not get the confidence that I could do it sufficient justice. My fear was that I would keep on finding ways of doing things in a better way in this language (and given my inherent compulsions would feel inclined to refactor code rather than focus on newer development).
  • I perceived that Rails was really focused on typical, much simpler use cases. What I intend to do requires getting into an immense amount of complexity and I felt fairly certain Rails wasn’t designed for that, and that I would spend a very substantial amount of time reinventing or hacking through Rails.
  • One of the strong features of Rails, “Convention over Configuration”, was something I couldn’t really come to terms with. I still can’t get over some of the implicitness in the environment.

3rd Elimination

Clearly PHP is such a widely used language, with so many developers who are already trained in using it, that using it for an application which is intended to eventually be open source should be a no brainer - Right ? Not in this case. Two reasons why PHP went out.
  • I intend to pull off some really complex programming. Given the better OO and metaprogramming capabilities of Python - I thought I would be able to keep my code much more concise, structured and readable if I was to use Python (this would’ve been true with Ruby as well!).
  • Django - This framework simply came closest to being the framework I would really like to end up with. Thus the gap between what it has vs. what I would like to have was the smallest, and the general design principles were those that I was extremely comfortable with.

In summary : Why these got eliminated

  • JEE : The difficulty of using JEE in shared hosting environments and the long development times required made it tough to use it.
  • Ruby : I like the language. I was a little intimidated by it. Rails however seemed limited to simpler use cases
  • PHP : Wonderful developer base. However it didn’t give me the same comfort in terms of its OO and metaprogramming capabilities, and I did not believe the resultant code would be as compact, concise, and readable.

In summary : Python and Django : What do I hope to get from them

Now that I have made the choice - what do I expect from these choices at this stage :

  • Excellent framework to start off with - provide the maximum initial boost
  • Ability to write concise, structured and readable code
  • Ability to make changes rapidly
  • Reasonably performant application

Other choices that I did not spend too much time on

During this process I also evaluated other languages such as C# and Groovy/Grails. I actually was quite impressed with the recent architecture trends coming from Microsoft and was actually tempted to spend more time on them. However the same reasons that eliminated JEE quickly made these choices unviable as well. I wish I had the luxury to consider other languages as well, but these were what were at the top of my mind, and I did not consider any other languages in the process given the time constraints.

I am missing JEE

I really really will miss JEE. I would love to have used it for the simple fact that I know it so well, whereas this choice will require me to move from the expert to the novice category. I will also miss it for the fact that with JEE I knew what it took to deliver exceptionally high performance. Even though I am feeling the JEE separation pangs, I believe Python is the right choice since it will allow for the fastest development, will allow for some really rapid changes (agility in coding), and will allow the developers who eventually join us to come up the curve much faster and deliver more features much faster.

Final thoughts

One thing I realised through the whole process was that there are two strong influencing factors in any architecture choice. The first one obviously is the context. There is no way to compare or contrast any architecture or design choices without putting a context around them. Secondly, and this was a little more of a surprise to me, you simply cannot remove personal proclivities from such a choice making process. Since I as an individual am more comfortable with some styles of design than others, and since I am likely to be substantially involved with the initial development, it only seems to make sense that the choice set gets evaluated in this subjective context of individual comfort and therefore the projected individual productivity and effectiveness as well. Note that while in this case I was evaluating it strictly from my own perspective, one could evaluate the choices in the context of team styles, culture and comfort as well.

Turbocharge your string keyed hashmaps

Posted on April 17th, 2008 in java, ruby, software | 13 Comments »

This post gives you a small tip which just might make a world of difference to your java hashmap’s performance. This trick has been inspired by the “symbol” construct in Ruby language.

I have often considered hash maps using Strings as keys to be quite expensive. And in many ways they often are. However if the keys used in your hashmap are a well known set, either at the time of writing the code or at least when the program starts up, the following is likely to help you make your map performance much much zippier.

In case you are not familiar with Ruby, it has a special construct called a symbol which is somewhat similar to a constant string. You can write the same symbol in as many places as you like, but the Ruby runtime will ensure that all symbols having the same character data refer to the same runtime instance.

The design of any key influences the performance of the hashmap primarily through the performance of its hashCode() and equals() methods. The java.util.HashMap implementation uses the result of hashCode() to narrow down the potential number of keys to be compared, and then compares the keys by first checking whether they are the same instance (ie. the same object in memory) and, in case they aren’t, by invoking the equals() method.

Thus if one wants to use Strings as keys, there are at least two optimizations that could potentially be targeted :
(a) The hashcode could be cached rather than having to be computed each time (Turns out this makes a positive but rather small difference)
(b) Ensure that the same instance of a string gets used for the same string data. (Turns out this does make a substantial difference).

The following two pieces of code indicate the difference.

Slower Code

    map.put(new String("mykey"),/* .. some value .. */);
    Object o = map.get(new String("mykey"));

Faster Code

    String key = "mykey";
    map.put(key,/* .. some value .. */);
    // Note : In this case the same instance of the key
    //            is used in both the get and the put
    Object o = map.get(key);

The big reason why this makes a difference is the following line of code in java.util.HashMap

if (e.hash == hash && ((k = e.key) == key || key.equals(k))) ...

Thus when comparing the keys, the code first tests whether they are identical and only then whether they are equal. Obviously the test for identity is substantially cheaper than that for equality. Thus the faster code shown above is faster since the keys are identical.
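
A minimal sketch of that identity shortcut in action. CountingKey and IdentityShortcutDemo are hypothetical names invented for this illustration; they are not part of the JDK or of the Symbol code in this post. The key type counts how often equals() gets invoked during a single get(). The counts reflect the java.util.HashMap implementation in the OpenJDK builds I am aware of and are not something the Map contract promises.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key type that counts how often equals() is invoked.
final class CountingKey {
    static int equalsCalls = 0;
    private final String text;

    CountingKey(String text) {
        this.text = text;
    }

    @Override
    public int hashCode() {
        return text.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        equalsCalls++;
        return other instanceof CountingKey
                && text.equals(((CountingKey) other).text);
    }
}

final class IdentityShortcutDemo {
    // Returns the number of equals() calls made by one get() on a
    // single-entry map, probing either with the stored instance itself
    // or with an equal-but-distinct copy.
    static int equalsCallsForLookup(boolean sameInstance) {
        Map<CountingKey, String> map = new HashMap<CountingKey, String>();
        CountingKey stored = new CountingKey("mykey");
        map.put(stored, "value");
        CountingKey probe = sameInstance ? stored : new CountingKey("mykey");
        CountingKey.equalsCalls = 0; // only count calls made by the get()
        map.get(probe);
        return CountingKey.equalsCalls;
    }
}
```

Probing with the stored instance should not need equals() at all, while probing with an equal copy falls through to it. That extra call is the cost the rest of this post tries to avoid.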

In order to be able to provide the same capability of ensuring that only one instance of a string key with a particular string data is constructed, while taking away the onus from the programmer of having to track the instance creation, a wrapper class called Symbol is used as shown below :

Update : I have updated the version to address Khalil’s and Dave’s concerns and suggestions. The important part of the modification is that the hashmap has been done away with and the getSymbol() method is modified. Earlier code which has been replaced has been commented out.

package com.dnene.utils.symbolmap;

/**
 * License : Based on BSD Template
 * 
 * Copyright (c) 2008, Dhananjay Nene
 * All rights reserved.
 * 
 * Redistribution and use in source and binary forms, 
 * with or without modification, are permitted provided 
 * that the following conditions are met:
 *
 *    * Redistributions of source code must retain the 
 *      above copyright notice, this list of conditions 
 *      and the following disclaimer.
 *    * Redistributions in binary form must reproduce 
 *      the above copyright notice, this list of 
 *      conditions and the following disclaimer in the 
 *      documentation and/or other materials provided 
 *      with the distribution.
 *    * The name of the Dhananjay Nene  may not be used 
 *      to endorse or promote products derived from this 
 *      software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND 
 * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, 
 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 
 * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 
 * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR 
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 
 * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; 
 * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY 
 * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT 
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 
 * POSSIBILITY OF SUCH DAMAGE.
 */

import java.util.HashMap;
import java.util.Map;

public final class Symbol
{
	private final String t;
	private final int hashcode;
	// not required after update
        // private static Map<String,Symbol> map = new HashMap<String,Symbol>();
	
        //  method simplified upon update
	// public static Symbol getSymbol(String t)
	// {
	// 	Symbol symbol = map.get(t);
	// 	if (symbol == null)
	// 	{
	// 		symbol = new Symbol(t);
	// 		map.put(t, symbol);
	// 	}
	// 	return symbol;
	// }
	
	public static Symbol getSymbol(String t)
	{
		return new Symbol(t.intern());
	}

	private Symbol(String t)
	{
		this.t = t;
		this.hashcode = t.hashCode();
	}
	
	public final String get()
	{
		return t;
	}

	@Override
	public int hashCode()
	{
		return this.hashcode;
	}


	@Override
	public final boolean equals(Object that)
	{
		return ((that instanceof Symbol) ? 
                           (this.t == (((Symbol)that).t)) : false);
	}

}

The code snippets shown earlier would be modified as follows to use Symbol

Symbol usage :

    Symbol key = Symbol.getSymbol("mykey");
    map.put(key,/* .. some value .. */);
    // Note : In this case another instance of the symbol is created
    // Note : You can also use key.get() in this case so long as you use it consistently
    Symbol key2 = Symbol.getSymbol("mykey");
    Object o = map.get(key2);

Update : As per Dave’s suggestion of using String.intern(), the following could alternatively be used. Note that the performance benefits are to be had only if the symbol construction or the intern() calls are made far less frequently than the calls to the getter on the map.

    // intern() returns a String, so the keys are plain Strings here
    String key = new String("mykey").intern();
    map.put(key,/* .. some value .. */);
    // ... elsewhere in code
    String key2 = new String("mykey").intern();
    Object o = map.get(key2);
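
To see what intern() buys here, the following sketch compares references directly. InternDemo is a hypothetical helper name used only for this illustration. It relies on the documented behaviour of String.intern(), which returns one canonical instance for all strings with the same contents.

```java
// Hypothetical helper showing reference identity with and without intern().
final class InternDemo {
    // Two separately constructed strings: equal contents, different objects.
    static boolean copiesAreIdentical() {
        String a = new String("mykey");
        String b = new String("mykey");
        return a == b;
    }

    // The same two strings after intern(): both references point to the
    // single canonical instance held in the JVM's string pool.
    static boolean internedAreIdentical() {
        String a = new String("mykey").intern();
        String b = new String("mykey").intern();
        return a == b;
    }
}
```

The interned pair passes the cheap identity check in HashMap, while the plain copies have to go through equals().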


Performance Difference

In my benchmarks involving a million keys, the get performance of maps using identical String keys and symbols was almost the same (the symbol based implementation ranged from roughly 3% slower to up to 10% faster, averaging about 4% faster). I did not benchmark the puts since I did not imagine they would change much. However the Symbol based implementation’s get method was consistently faster by at least 30% compared to the String map when the keys used during the ‘put’ were equal to but not identical to those used during the ‘get’. There is an overhead of creating a symbol instance compared to a string, but under most circumstances I believe this should be much more than offset by the gains.

What surprised me was that the Symbol based String keys beat the get performance of using Long keys quite handsomely and consistently. I am not sure why it works so and am not sure if I had made a mistake. However since using Long keys was not something I was particularly focused on - I did not hunt down the reason for the performance difference between Symbol and Long based keys.


Update : adding the output of a sample benchmarking run. The runs consist of hashmaps of a million entries each, with each key being 16 characters long, and a million lookups getting done. The times mentioned are in nanoseconds for each run of a million lookups.

=== Comparison of Symbol to Long ===
Long lookup time : 246225003
Symbol lookup time : 148041217
Performance Ratio : 1.6632193
Reduction in time : 39%
=== Comparison of Symbol to Identical Strings ===
Usual lookup time : 151010200
Symbol lookup time : 148041217
Performance Ratio : 1.0200552
Reduction in time : 1%
=== Comparison of Symbol to String Copies ===
Copy lookup time : 231829056
Symbol lookup time : 148041217
Performance Ratio : 1.5659764
Reduction in time : 36%
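
For readers who want to try a comparison along these lines, here is a naive timing sketch. It is not the driver program linked below. LookupTiming and all of its method names are hypothetical, and there is no warm-up or statistical treatment, so its numbers will be noisy. The two formulas at the end are my assumption of how the "Performance Ratio" and "Reduction in time" lines relate to the raw times; they do reproduce the sample figures above when the percentage is truncated.

```java
import java.util.Map;

// Hypothetical, deliberately naive timing helper (no warm-up, no JMH).
final class LookupTiming {
    // Builds n distinct keys, each 16 characters long ("key-" plus 12 digits).
    static String[] makeKeys(int n) {
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) {
            keys[i] = String.format("key-%012d", i);
        }
        return keys;
    }

    // Returns equal-but-not-identical copies of the given keys.
    static String[] copies(String[] keys) {
        String[] out = new String[keys.length];
        for (int i = 0; i < keys.length; i++) {
            out[i] = new String(keys[i]);
        }
        return out;
    }

    // Times one pass of lookups in nanoseconds.
    // hits[0] receives the number of probes that were found in the map.
    static long timeLookups(Map<String, Integer> map, String[] probes, int[] hits) {
        int found = 0;
        long start = System.nanoTime();
        for (String probe : probes) {
            if (map.get(probe) != null) {
                found++;
            }
        }
        long elapsed = System.nanoTime() - start;
        hits[0] = found;
        return elapsed;
    }

    // Assumed formula: baseline time divided by symbol time.
    static double performanceRatio(long baselineNanos, long symbolNanos) {
        return (double) baselineNanos / symbolNanos;
    }

    // Assumed formula: share of the baseline time saved, truncated to a whole percent.
    static int reductionPercent(long baselineNanos, long symbolNanos) {
        return (int) (100.0 * (baselineNanos - symbolNanos) / baselineNanos);
    }
}
```

A run would fill a HashMap from makeKeys(), then call timeLookups() once with the original key array and once with copies() of it, and compare the two elapsed times.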

Am adding the link to the source and the driver files for running your tests independently.
Symbol source and driver program

Field of use

This should be useful in a fairly large proportion of typical applications. In most situations the possible universe of keys in the hashmap is known upfront, either when writing the code or when starting up the application. Instead of using hard coded strings, or string key parameters read from say an XML file, just use a Symbol. The construction is a little expensive but the map lookups run much faster.


This solution is not limited to a String. It could actually be used for any data structure which has a high cost of either hash code computation and / or equality check. In fact the version I wrote for myself was a Symbol<T>. The code shown above is only a specialised version where T is a String.
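
A rough sketch of what such a generic version might look like. This is not the Symbol<T> I mentioned above; GenericSymbol, its registry and the of() factory are names made up for this illustration. Since there is no intern() for arbitrary types, the sketch assumes a global registry that hands out one canonical wrapper per distinct value, and it assumes T is immutable with a well behaved equals() / hashCode().

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical generic symbol: one canonical wrapper instance per distinct value.
final class GenericSymbol<T> {
    // Assumption: a single global registry. Entries are never released,
    // so this only suits a bounded, well known universe of keys.
    private static final ConcurrentMap<Object, GenericSymbol<?>> REGISTRY =
            new ConcurrentHashMap<Object, GenericSymbol<?>>();

    private final T value;
    private final int hash; // computed once, on construction

    private GenericSymbol(T value) {
        this.value = value;
        this.hash = value.hashCode();
    }

    // Returns the canonical symbol for the given value, creating it if needed.
    @SuppressWarnings("unchecked")
    static <T> GenericSymbol<T> of(T value) {
        GenericSymbol<?> existing = REGISTRY.get(value);
        if (existing == null) {
            GenericSymbol<T> created = new GenericSymbol<T>(value);
            existing = REGISTRY.putIfAbsent(value, created);
            if (existing == null) {
                existing = created; // this thread won the race
            }
        }
        return (GenericSymbol<T>) existing;
    }

    T get() {
        return value;
    }

    @Override
    public int hashCode() {
        return hash;
    }

    // equals() is deliberately not overridden: because instances are
    // canonical, the identity comparison inherited from Object is sufficient.
}
```

The expensive equals() and hashCode() of T are paid when a symbol is obtained through of(), and lookups in a map keyed by GenericSymbol<T> then only need the cached hash and a reference comparison.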

Java : if (compete with PHP / Ruby / Python) { stop fixing the syntax and start fixing the runtime }

Posted on April 14th, 2008 in java, php, python, ruby, software, web | 11 Comments »

A lot of people are wondering whether Java is under threat from a set of nimble languages - PHP / Ruby / Python / Perl. There is a flurry of activity relating to Whether Java is losing the battle for the modern web, a lot of which is being driven by the earlier post by Andi Gutman, Java is losing the battle for the modern Web. Can the JVM save the vendors?.

My recent activities had me visit the same question and the following is how I would summarise the situation and the mechanisms for Java to better compete with these nimble languages.

Current scenario

Yes, I think there is some threat to the scope of use of Java as a server side web application development language. Given the high acceptance of Java in corporate environments this threat will take some time to play itself out. The sales pitch of Java has often been supported heavily by the vendors, and this has led to less focus on making Java compete at the non-corporate-enterprise end. The nimble languages compete quite well in this space and, combined with the increasing power of hardware which makes them more and more relevant each day for high performance requirements, there is a clear threat of these languages, starting from the lower end, pushing Java out of even the middle end.

Opinion: Dynamism of syntax is not the real issue

What I am not yet fully on board with is the characterisation that the real issue is that Java is less dynamic than say PHP / Ruby / Python. Java has a pretty strong set of capabilities and while some may desire more from a syntax perspective I don’t know that that’s the real issue. The lack of closures, dynamic types and duck typing may make it difficult for Java to compete in some contexts, only to be at least partially offset in others.

Issue : Shared hosting of java applications

The first real issue is the overhead to develop and then start hosting Java applications. It is difficult for host providers to allow each user to run their own JVM processes in a sustained fashion. I remember the days when using PostgreSQL required you to have a dedicated server whereas you could use MySQL easily by just setting yourself up on a small multi-tenant plan on a web server out there. Today Java is like the PostgreSQL of those days. There is no easy way to simply set yourself up to run a small Java application in a shared tenant environment. Even if you did set yourself up, chances are that you would be far less than satisfied with the performance in a multi-tenant situation even though Java is actually a really really fast language. Java has endeared itself to corporate environments, with their funding for creating large infrastructure stacks, and simply hasn’t offered enough opportunities for small enterprises / individuals to create and host their web applications on a shared infrastructure. This takes away an entire community of supporters who, when they grew up, would’ve carried Java along into the larger of their applications.

Issue: Compile cycle

The second issue is the nastily long compile-build cycle. This leads to a scenario where developers need to twiddle their thumbs for half their development time waiting for ant / maven / javac to do their work. It destroys their rhythm and hurts their productivity. The traditional argument of the speed / scale and enterprise-ness of Java over the nimble languages just seems less and less relevant as these languages start having a nice stack of their own enterprise frameworks and hardware developments whittle away the performance disadvantage. When someone very recently asked me why I was less likely to choose Java for my next project I just said Java is not agile enough. The agility of the nimble languages (no compile-build cycle) coupled with their adequate performance profiles makes Java less capable of competing in this space. By the way one of the real strengths of eclipse is its incremental compilation, but unfortunately I cannot use it when making code changes remotely, the way I would be able to change say PHP.

So how can java compete ?

Assuming that java needs to compete with these languages on a better footing, some changes will be required (and some of them quite painful). These will be :

Make java easy to host on multi-tenant application servers

This would definitely require some changes to the JVM to reduce the startup / shutdown overheads and make each process really lightweight. It would also require some attention to how much memory gets utilised. Scenarios such as those I have used in the past to really make applications run faster by caching big time in the RAM are a no-no for multi-tenant hosting. We will thus need a version of JEE which loses the ApplicationContext. We will also need to be able to creatively work with other pools such as connection pools. Finally JEE application servers may need to be restructured to support rapidly dying processes (ie. process per web request). I am no JVM writer so I wouldn’t know if some changes might be required to the language, but I can’t think of any major ones.

Lose the compile cycle

Why can’t “javac” be conditionally executed implicitly at the beginning of the “java” process ? JSPs generate java on the fly. Similarly python creates the .pyc compiled files on the fly. If the deployment can be done using .java files instead of .class, it would eliminate the compile cycle and allow a developer to change the code in one window, save it, do an alt-tab and go press the reload button on the browser. Compared to all the attention that is being paid to language features like closures and duck typing, I think this is a really really big deal.
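
What I am asking for would have to live in the “java” launcher itself, but an application level approximation can be sketched with the standard javax.tools API (available since Java 6 when running on a JDK). CompileOnDemand and its method names are hypothetical, and the staleness check is a simple timestamp comparison in the spirit of how .pyc files get refreshed. Treat it as a sketch under those assumptions rather than a production class loader.

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper: recompile a .java file only when its .class is stale.
final class CompileOnDemand {
    // Returns true if the class file is up to date or compilation succeeded.
    static boolean compileIfStale(Path source) {
        String fileName = source.getFileName().toString();
        Path classFile = source.resolveSibling(fileName.replaceAll("\\.java$", ".class"));
        File src = source.toFile();
        File cls = classFile.toFile();
        if (cls.exists() && cls.lastModified() >= src.lastModified()) {
            return true; // compiled output is at least as new as the source
        }
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler (running on a JRE?)");
        }
        // Without a -d option the .class file is written next to the source file.
        return compiler.run(null, null, null, source.toString()) == 0;
    }

    // Compiles <dir>/<className>.java if needed, loads the class from dir
    // and invokes a public static no-argument method on it.
    static Object loadAndInvoke(Path dir, String className, String methodName)
            throws Exception {
        if (!compileIfStale(dir.resolve(className + ".java"))) {
            throw new IllegalStateException("Compilation failed for " + className);
        }
        URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() });
        try {
            Class<?> type = loader.loadClass(className);
            return type.getMethod(methodName).invoke(null);
        } finally {
            loader.close();
        }
    }
}
```

Calling loadAndInvoke() again after editing the source file picks up the change without a separate build step, which is the edit, save, reload rhythm described above.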

How will this help ?

These steps are not targeted towards attracting the developers working on building corporate enterprise applications (though I am certain they will break out into three cheers if the compile cycle was done away with - especially in an optional way - ie. the production deployment would still require a war consisting of .class and not .java files). More and more development going forward is going to be characterised by a larger number of smaller applications rather than the old 60s-70s days of a small number of large applications. Average applications will get smaller and these will communicate with each other more frequently using web service (REST?) calls. The trend will definitely be towards more in-place remote application reuse by remote invocation rather than through using class libraries. The nimble languages are well positioned to compete in this space. Java isn’t. These steps will help Java attract and retain developers in a space where currently it is only likely to lose them increasingly. Moreover as application hosting infrastructure starts getting more outsourced and cloud computing gets more prevalent (eg. Amazon Web Services / Google App Engine) Java can at least compete in that space, which it is otherwise likely to be locked out of.

OpenID for Intranets and Extranets

Posted on February 5th, 2008 in java, ruby, software | No Comments »

This post continues from OpenID or OpenAvataar ? UserID or AvataarID ?, and Implications of OpenID on software design and specifically looks at how the OpenID specification could be used within corporate intranets and extranets.

Why would a corporate even want to implement OpenID

The problem OpenID is attempting to solve is widely prevalent within corporates as well. There are multiple applications, databases, web sites etc. which seem to want to create their own userid / password combinations. There is an enormous amount of activity and effort that a corporate has to invest in identity management. Additionally there is today a compelling need for a user to transparently navigate across a wide variety of corporate applications, especially in situations where each application performs its identity management tasks independently. This is precisely one reason why identity management and Single Sign On are terms which are in many cases far more important to a software designer within a corporate context than on the public internet.

What is different about corporate environments

For starters, there is a much stronger need within corporate environments to be able to associate a person’s identity with the authentication mechanism (eg. OpenID). I argued earlier that OpenID should reflect avataars and not necessarily a specific person’s identity, especially within the public internet’s context. However many social interactions on the internet are relatively casual in nature and in most cases are likely to be sufficiently non-risky, at least when looked at strictly from a commercial transaction perspective. The internet is a very democratic environment where most people are treated as fairly equal to each other. Within a corporate environment however each person has varying roles and along with that comes a varying set of responsibilities. It is fairly unlikely that corporate environments would easily allow any arbitrary OpenIDs (such as one created by an employee from one of a plethora of Internet based OpenID providers or even by creating a self hosted provider on his desktop himself). Corporates will be compelled to define ACLs around various corporate resources and these will need to be based on user identities and not their avataars.

How could a corporate implement OpenID ?

First there would need to be sufficient conviction that this indeed helps solve the problem more effectively than many other Identity Management solutions out there (some of the competing strategies are based on LDAP and SAML). Assuming that one reaches that conclusion, the way forward would be to either identify specific public OpenID providers or more likely create an internal OpenID provider (which may in turn be a layer on top of the Directory Services). The URLs registered with / provided by this internal provider would serve as a mechanism for user identity presentment and verification.
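
One small piece of this can be sketched in code : an application inside the corporate accepting only identifiers issued by the approved provider(s). InternalOpenIdPolicy and the host name openid.corp.example are hypothetical, and the sketch only covers a policy gate on the claimed identifier URL. It does not implement the OpenID protocol, discovery or signature verification, all of which would still be required.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical policy gate: accept only identifiers hosted by approved providers.
final class InternalOpenIdPolicy {
    private final Set<String> trustedHosts = new HashSet<String>();

    InternalOpenIdPolicy(String... hosts) {
        for (String host : hosts) {
            trustedHosts.add(host.toLowerCase(Locale.ROOT));
        }
    }

    // True only for https identifiers whose host is on the approved list.
    boolean isAcceptable(String claimedIdentifier) {
        try {
            URI uri = new URI(claimedIdentifier);
            String host = uri.getHost();
            return "https".equalsIgnoreCase(uri.getScheme())
                    && host != null
                    && trustedHosts.contains(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return false; // not a well formed URL, so not acceptable
        }
    }
}
```

An identifier such as https://openid.corp.example/alice would pass this gate, while one issued by an arbitrary public provider or a self hosted provider on an employee's desktop would be rejected before any protocol exchange starts.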

Would there be a conviction that OpenID would be able to provide better identity verification solutions than the other solutions out there ? I suspect not always. However it is more likely in scenarios such as follows :

(a) It’s a highly decentralised and large corporate with independent identity management functions being carried out by a variety of sub units.
(b) There is a necessity to establish broader extranets and expose the corporate application to other partners or consume applications and services provided by various partners.

Even in both the above situations there are other identity management solutions that do exist. However I do believe that OpenID is better placed to find its place in these situations, given the relative simplicity of implementing it and more importantly the notion of standardisation that it brings with it. Moreover in a heterogeneous world, especially with all the various partner organisations and the identities that these spawn (which could be maintained in fairly diverse ways using different technology platforms) also starting to play a role, there will be a necessity for identity management, presentment and verification solutions to start talking one lingua franca, and OpenID just might be it. There are other claimants to already being the common language (eg. SAML, LDAP), but I suspect the advantage OpenID brings in terms of a standard, widely used specification and especially in terms of it riding primarily on HTTP will help it hold its own in many situations. It is important to note here that SAML perhaps provides a much more structured mechanism for data exchange and for providing more sophisticated assertions about the user identity, and it may be more appropriate in a given context. However I believe OpenID is likely to be used more often than SAML in most less intensive cases, primarily because of its simplicity and given the presumption that OpenID is far far more likely to be successful on the internet than SAML.

However OpenID only solves a part of the problem - ie. identity of the users. Within extranets, sometimes its important to establish the identity of the partner organisations as well. OpenID is unlike to be able to solve the same by itself. However there are other initiatives such as inames which are at least attempting to solve that piece of the puzzle though it would be important in such cases that the individual inames and OpenIDs be seamlessly integrated.

OpenID or OpenAvataar ? UserID or AvataarID ?

Posted on February 5th, 2008 in java, ruby, software, web | 2 Comments »

Introduction to OpenID

Lately, with the prolific activity around OpenID, and especially with biggies like Yahoo and AOL involved, I was getting curious about how this will influence identity management both on the public internet and on corporate intranets. A nice starting point for understanding OpenID is the "OpenID » What is OpenID" page, and for developers the "OpenID » developers" page.

How many OpenIDs per person ?

The "OpenID » What is OpenID" page says :

OpenID eliminates the need for multiple usernames across different websites, simplifying your online experience.

This is a wonderful thing to happen, since we have all been sufficiently bothered by trying to keep track of a whole bunch of userids and their passwords. The page goes on to add :

For businesses, this means a lower cost of password and account management, while drawing new web traffic. OpenID lowers user frustration by letting users have control of their login.

Some correlation is now being drawn between an OpenID and an “account”. However it is still not clear what the OpenID reflects. The page further adds :

For geeks, OpenID is an open, decentralized, free framework for user-centric digital identity

Now I start wondering : does my OpenID reflect my user-centric digital identity ? It turns out I can have as many OpenIDs as I want. So there seems to be a many-to-one relationship between my OpenIDs and me.

This is further reinforced in one of the articles referred to from the developer page, "A Recipe for OpenID-Enabling Your Site". That article contains the following text :

Here’s an overview of what you’re going to add to your site:
1. A new database table to map OpenIDs to your internal user IDs
* It’s a many-to-one relationship (each user can have multiple OpenIDs attached to their account, but a given OpenID can only be claimed by a single user)
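
The mapping table described in the recipe above can be sketched roughly as follows. This is only a minimal illustration using Python's built-in sqlite3 module, not the recipe's own code: the table name user_openids, the column names, the helper functions and the example.org URLs are all assumptions made up for this sketch. Making the OpenID URL the primary key is one way to ensure a given OpenID can be claimed by only one user, while a user can still attach several.

```python
import sqlite3


def create_schema(conn):
    """Create a users table and a (hypothetical) OpenID mapping table."""
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    # openid_url is the primary key, so each OpenID maps to at most one user.
    # user_id is not unique, so one user may own many OpenIDs (many-to-one).
    conn.execute(
        """CREATE TABLE user_openids (
               openid_url TEXT PRIMARY KEY,
               user_id    INTEGER NOT NULL REFERENCES users(id)
           )"""
    )


def attach_openid(conn, user_id, openid_url):
    """Attach an OpenID to a user.

    Returns True on success and False if the database rejects the row,
    for example because the OpenID is already claimed.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO user_openids (openid_url, user_id) VALUES (?, ?)",
                (openid_url, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def find_user(conn, openid_url):
    """Return the internal user id that owns this OpenID, or None."""
    row = conn.execute(
        "SELECT user_id FROM user_openids WHERE openid_url = ?", (openid_url,)
    ).fetchone()
    return row[0] if row else None


def openids_for(conn, user_id):
    """Return all OpenIDs attached to a user, sorted alphabetically."""
    rows = conn.execute(
        "SELECT openid_url FROM user_openids WHERE user_id = ? ORDER BY openid_url",
        (user_id,),
    ).fetchall()
    return [r[0] for r in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
    conn.execute("INSERT INTO users (id, name) VALUES (2, 'bob')")
    # One user, two OpenIDs (made-up URLs).
    attach_openid(conn, 1, "https://alice.example.org/")
    attach_openid(conn, 1, "https://work.example.org/alice")
    # A second user trying to claim an OpenID that is already taken.
    print(attach_openid(conn, 2, "https://alice.example.org/"))
    print(openids_for(conn, 1))
```

After a successful OpenID login the site would look up the verified identifier with something like find_user and proceed as the matching internal user. Note that the table itself says nothing about who the person is, only which account an identifier is attached to.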

The OpenID specification of course provides the most appropriate and insightful definition of OpenID, even though it kind of has us wondering what the ID in OpenID is (and leaves us loosely comfortable with the thought that the ID is simply an ID in the web space and has nothing to do with user identity).

OpenID Authentication provides a way to prove that an end user controls an Identifier.

So it seems sufficiently clear that an OpenID is not meant to reflect my identity the way my Tax Identification Number or Social Security Number or Driver's License Number does (ie. exactly one valid identifier per person at any point in time).

It is now clear that an OpenID reflects what I own (to the exclusion of others) rather than who I am. My OpenID is not a unique or exclusive reference to me or to any of my identifying characteristics; it simply asserts the fact that I control the ID and therefore others don't. But that still leaves me thinking even harder : how many OpenIDs do I really need ?

Why would I want to have multiple OpenIDs ?

Sure, having multiple ids is nice : I can use multiple providers, I am not tied to any particular one, I can have redundant ids, etc. The reasons are quite similar to why I might have different email ids. It turns out that, at least in my case, the most dominant reason for wanting different OpenIDs is the same as for wanting different email ids : I have different facets to my identity and I would like each to be reflected differently.

Thus I might want one OpenID to reflect my persona as a professional consultant, another to reflect me in my personal and individual capacity, and yet another to reflect my persona within the context of a particular client / project / organisation. This way I could use my personal OpenID across all my social networking sites, my professional one across a smaller number of professionally focused sites, and my organisation specific one for sites hosted by a particular organisation which really isn't focused on my global identity but wants to create and independently manage a single ID within that organisation.

The number of OpenIDs I would reasonably want to maintain is the number of personas or avataars I want to project on the web. Thus I probably need 3 avataars instead of 1 identity or the hundreds of site specific userid/password combinations I have today. I can probably understand the word OpenID better if, in my mind, I call it OpenAvataar. My submission here is that each person may have multiple OpenIDs and will probably use each one to reflect one avataar. This is likely to be the primary reason for having more than one.

Summary

  • An OpenID does not exclusively identify a particular user. It simply asserts that user's control over the OpenID.
  • A user may have multiple OpenIDs. My hypothesis is that each of these is likely to reflect one of his avataars.

Should Sun focus more on Java-Ruby or Java-Groovy integration

Posted on January 18th, 2008 in java, ruby, software | 10 Comments »

Rick Hightower presents an argument to encourage Sun to support Groovy rather than Ruby (Quit pimple pimping ruby) :

Can we just get some decent support for Groovy? No instead Sun invests in Ruby via JRuby. DOH! Groovy looks a lot like Java. It is much easier to get started with it. The syntax does not make developers want to hurl. Why is Sun investing so much money in JRuby?

The investment should be in Groovy. Developers who know Java can learn Groovy quickly and are more likely to do so if the tools support it. Ruby is a non-starter.

One of the arguments, based on a chart similar (but not identical) to the one below, is :

Here is another reason not to invest heavily in Ruby. For the color blind: RUBY COMES IN DEAD LAST!

Ruby comes in dead last. If there was going to be a revolution, it would have happened already. Ruby is a little long in the tooth to finish this poorly. Don’t you think?



[Chart: java, c#, php, pl/sql, ruby, python, c++, visual basic, groovy Job Trends graph]

Read the rest of this entry »