March, 2010


19
Mar 10

Double whammy. The state and dilemma of Indian IT

Every now and then I come across a blog post about outsourcing, and soon enough a commentary or discussion on Indian IT and a whole host of associated parameters comes up. Some of these posts start attracting a number of views and comments. And more often than not the comment stream attracts far more extreme views, some of which leave me recoiling.

No, I am not going to list them, link to them or comment on their content specifically. After some thought I decided that would simply shift the focus unnecessarily. Instead I thought that, given my reasonably long experience both in and outside India, it would be helpful to offer my perspective on Indian IT within the context of global IT and the Indian economy.

Indian IT and the role of exports

Indian IT (along with ITES and Gems & Jewellery) is one of the stranger segments of the Indian economy. Unlike the rest of the Indian economy, it is heavily export focused. The various numbers reflecting the state of Indian IT agree on at least one factor – at least two thirds of its revenues are based on exports.

Let us first look at some of the perceptions about Indian IT that I would like to comment upon.

You get what you pay for : In general that’s a very reasonable statement. But it breaks down when you realise that one Indian engineer + 1 H1B visa + 1 long flight results in (at least) a tripling of remuneration, without either the H1B visa or the long flight adding anything to the capability of the engineer or the quality of the output (well it could add a chip on the shoulder). Part of the difference is due to the gap between nominal exchange rates (which are driven by demand and supply of goods to/from the respective countries) and exchange rates based on purchasing power parity. Based on purchasing power parity Indian salaries in the IT space do not underperform the rest of the world substantially. In fact when compared on purchasing power parity the GDP of the Indian economy more than triples and it ranks as the fourth largest economy in the world, next only to the US, China and Japan.

Indian engineers are offering their services too cheaply. : Quite the contrary. The Indian IT engineer is too handsomely remunerated compared to the non IT engineers in India. I believe the high salaries in Indian IT are a problem. Many Indian innovations are today happening in areas outside of IT – primarily in the areas of making products and services affordable to the millions surviving on the fringe of the poverty line. And if recent trends in the automobile and telecom sectors are indicative, India is actually proving that it is the rest of the world which is way too expensive, by offering products and services at price points which are unfathomable elsewhere. My reading of the empirical evidence leads me to believe that Indian IT could underperform other sectors of the Indian economy in terms of both innovation and quality. And a lot of it is due to the fact that the export revenues offer cushy jobs without the really hard work it takes to compete within the Indian economy. Quite frankly this one stereotype must be deposited very quickly where it belongs. In the Trash Can. So let me repeat – Indian engineers and Indian software are too expensive, and containing their cost growth is most urgently required. While containing salary growth might be useful, investing in the ability to create high quality software upfront and eliminating the defect fixing cycles after the initial release will help bring the cost down.

Indian IT offers poor quality software (The alternative version is outsourcing leads to drop in quality) : This is too hard a statement to comment upon, given its utter and gross generalisation. I am not aware of specific quality benchmarks which could be used to assert or deny such claims, though there is a fair amount of empirical evidence which could be used to either assert or deny these claims. India like any other country has a range of software quality in different companies and products which span the entire spectrum from superb to utter crap. I tend to agree with the statement that in many cases Indian software output does leave a lot to be desired in terms of quality. I also believe that there is a curious dynamic at play here. It is well known that in software one can demand and expect any two of three parameters to be fulfilled – viz. Cost, Time and Quality. Any efforts to increase quality can in some contexts run into the business issue of the customer clearly prioritising cost and time to market. And remember two thirds of the revenues (and presumably the customers) of Indian IT are from overseas. I have often struggled hard on the quality aspects, and have generally found that it requires a very strong support not just from the engineers but from the business stakeholders to offer sustained high quality.

Indian engineers are not as skilled / Indian engineers are far superior to the rest of the world : This is an interesting stereotype which is obviously wrong at either of the two extremes at which it is observed. Some believe that Indian engineers are extremely competent and productive, but fail to realise that the engineers they have met come from the extreme top end of a very competitive educational system. Bring the comparison down to some reasonable way of comparing an average Indian and non Indian IT programmer and perhaps the comparison may not be so rosy for India. I think India does need to work much harder to strengthen the capabilities of its programmers at the middle and lower ends. And sometimes I blame the fact that too many projects being transferred to India has resulted in so much demand for programmers that a programmer with a 10th percentile performance can still make a wonderful living by just changing jobs every two years. This needs to change. But I am afraid as long as India continues to bill the rest of the world by the hour and not by capability and quality, this shall continue to be an uphill battle even as we shall continue to see islands of excellence.

Double whammy

The double whammy I refer to in the title of this post stems from the fact that due to heavy reliance on exports, Indian IT has been substantially based on the priorities dictated by her customers. Thus Indian IT is largely today what its customers asked it to be. And most of the customers are non Indian. The double whammy is the fact that Indian IT gets criticised for becoming exactly what its customers asked it to be – fast, low cost and high quality so long as the high quality doesn’t interfere with the fast or the low cost :) . To be fair I am aware of the same criteria getting applied within many non Indian IT companies as well – so it’s not an Indian innovation. To be fair I am also aware of some situations where the lack of focus on quality is not driven by customer prioritisation. Is there a way out of the double whammy? Frankly I can’t see one easily. Yet this post would perhaps go some way in enriching the reader’s perspective on some of these factors.

The role of the population

There is one aspect about India which separates her from many other IT exporting countries. Her population. Not only is it really large (the second largest in the world), but it is growing fast and expected to keep growing younger over the next two decades (dream demographics for an economist). Which means India stands unique in her ability to deploy masses. Which also means the problems which are easier to solve by deploying masses are more likely to find their way to the Indian shores. Combined with a high growth rate it also contributes to most Indian programmers getting promoted to management ranks (whatever that means) fairly soon. As we shall see in the next section this is an important factor in the Indian context.

Improving capability and quality

Independent of the validity of the stereotypes which I questioned above, I think it is a dead certainty that Indian IT can focus on improving both its capability and quality. It is not infrequent for me to feel demotivated after meeting some college students or fresh graduates and realising their priorities are really driven by what is the quickest way to make maximum money. This leads to an unhealthy focus on building skills with high resume value. (Yeah that may sound funny to some of you, but it’s not uncommon). And given the high growth I referred to in the earlier section – the role model for IT is – start as a programmer, change jobs every 2-3 years, become a team lead in 3 years, a manager in 6. And if you are particularly technically inclined you can become an architect guiding many projects and helping support many pre-sales efforts. What I haven’t seen Indian IT getting criticised for frequently enough is the fact that so few of her members contribute to open source. While dzone and reddit may attract a large number of readers from India, a disproportionately small proportion of the people writing the posts stay in India. And most good programmers have moved on to becoming a Tech Lead or a Project Manager, driving the average sustained technology experience lower and lower. So much for the “IT superpower” marketing that the hype manufacturing machinery creates internally. In a recent meeting with around 20 plus people in the room, I was one of only 2-3 people who believed India is not an IT superpower.

Change

I would like this to change. I would like this to change substantially. But I do not see the economic incentives in place for that to happen. Yet. However there is this strong feeling that something will indeed happen to make things change for the better. This is clearly something Indian IT will need to grapple with in the months and years to come. However there are at least a few factors which will lend themselves positively towards strengthening Indian IT.

Agile. Movement to agile requires a continuous focus on quality that cannot be wished away as easily. Projects that genuinely adopt agile methodologies will be implicitly driven toward being able to offer higher and sustained high quality.

SaaS. As more software development shifts from intra enterprise development (where it is a little easier to contain the impact of poor quality) into SaaS, there will be implicit pressures from the customers on the quality front. Another positive influence of SaaS is that, given the sharing of software across a much larger user base, the demand for developers in terms of numbers would come down. Even as it may influence the demand for Indian IT itself negatively, I think it would help drive Indian IT into focusing more on capability than on deploying masses.

Internal Growth. While much of the historic growth of Indian IT came from outsourcing, I anticipate that Indian industry itself will soon start driving a larger share of that growth (given the really high growth engine it finds itself in). In such scenarios, the portfolio of software assignments will include a higher number of strategic and critical projects at extremely challenging cost parameters. This portfolio readjustment will help influence the quality positively.

I’ve attempted to highlight that Indian IT as it exists today is (partially) a function of her customers’ expectations. And while there are some unfair stereotypes about it, there are clearly some things it can improve upon. It is with some trepidation I write this, since attempting to deal with perceptions in generalised / stereotyped scenarios can be quite risky. So allow me to end with the disclaimer. This is a documentation of my understanding – YMMV.


1
Mar 10

Functional Programming with Python – Part 2 – Useful python constructs

In Functional Programming with Python – Part 1, I focused on providing an overview. In this post I shall focus on the core python language constructs supporting functional programming. If you are an experienced pythonista, you may choose to skip this post (and wait for the next post in this series :) )

Sequences in python are not immutable :

When using sequences in python the thing to be noted is that most of the built in ones (lists, dictionaries and sets) are not immutable. This provides you with the following options.

a) Use immutable sequence types : Beyond tuple and frozenset, this is only possible by defining different types for sequences than the ones built into the language.
b) Ignore all methods on the sequences which modify them
c) Waive functional programming tenet of immutability

My preferred option when explicitly focusing on writing functional code is to use b)

Sequence types in python

The most commonly used sequence types in python are :

  • Tuple : Immutable, Ordered, Indexed collection represented as (val1, val2)
  • List : Ordered collection represented as [val1, val2]
  • Dictionary : A key indexed collection represented as { key1: val1, key2 : val2 }
  • Set : An unordered collection allowing no duplicates. Represented as set([val1,val2])
  • str and unicode : While primarily meant to serve as ascii and unicode strings, these data structures also act as sequences of characters. Represented as “abcd” or ‘abcd’

Of the above, only tuples and the two string types (str and unicode) are immutable.

There are other sequences used less often as well. An example is frozenset which is an immutable set.
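To make option b) concrete, here is a minimal sketch of what sticking to non mutating operations looks like, and of how the immutable built in types reject modification. The variable names are mine and purely illustrative; the snippet is written so that it should run unchanged on Python 2.6+ and Python 3.

```python
# Option b) in practice: only use operations that return a new object.
original = [3, 1, 2]

extended = original + [4]        # concatenation builds a new list
ordered = sorted(original)       # sorted() returns a new list (unlike list.sort())
reversed_copy = original[::-1]   # slicing also returns a new list

print(original)       # [3, 1, 2]  (untouched)
print(extended)       # [3, 1, 2, 4]
print(ordered)        # [1, 2, 3]
print(reversed_copy)  # [2, 1, 3]

# Immutable built-ins refuse modification outright.
point = (1, 2)
tuple_is_immutable = False
try:
    point[0] = 99                # item assignment is not supported on tuples
except TypeError:
    tuple_is_immutable = True

frozen = frozenset([1, 2, 2, 3])
frozen_has_add = hasattr(frozen, "add")   # a frozenset has no add() method

print(tuple_is_immutable)  # True
print(frozen_has_add)      # False
```

Nothing in the language enforces option b) on a list; it is purely a discipline of avoiding append(), sort() and friends.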

Simple iteration
The for construct allows you to loop through a sequence. eg.

one_to_five = [1,2,3,4,5]
for num in one_to_five :
    print num

Iterators on dictionaries

Unlike tuples, lists and sets where iterators essentially traverse through the sequence constituents, there are a number of different iterators on dictionaries. These are :

d = {1:"One", 2: "Two", 3:"Three"}
# keys
for val in d : print val
# an alternative for keys
for val in d.keys() : print val
# values
for val in d.values() : print val
# key, value tuples
for key,val in d.items() : print key, val

Slices

Slices allow creation of a subset on a sequence. They take the syntax [start:stop:step] with the caveat that a negative value for start or stop indicates an index from the end of the sequence measured backwards, whereas a negative step indicates a step in the reverse direction. The following should quickly indicate the use of slices

seq=[0,1,2,3,4,5,6,7]

#first 3
print seq[:3]
# last 3
print seq[-3:]
# 2nd to second last
print seq[1:-1]
# reverse the sequence
print seq[::-1]
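A few more slices, as a hedged sketch of how start, stop and step combine. The variable names are illustrative, and the snippet should run on both Python 2.6+ and Python 3. Note that a slice always returns a new object and leaves the original sequence untouched, which fits well with the "ignore the mutating methods" approach.

```python
seq = [0, 1, 2, 3, 4, 5, 6, 7]

every_other = seq[::2]          # step of 2 from the start
middle_reversed = seq[5:1:-1]   # from index 5 down to (not including) index 1
tail_of_string = "abcd"[-2:]    # slices work on strings too

print(every_other)      # [0, 2, 4, 6]
print(middle_reversed)  # [5, 4, 3, 2]
print(tail_of_string)   # cd
print(seq)              # [0, 1, 2, 3, 4, 5, 6, 7]  (unchanged)
```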

The iterator protocol
Since python is an object oriented language as well, it provides strong support for allowing iteration over an object’s internals. Any object in python can be iterated over like a sequence by providing the following :

a. It must implement the __iter__() method, which in turn returns an object supporting the iterator protocol
b. The iterator protocol requires the returned object, an iterator, to support two methods : an __iter__() method returning self, and a next() method which returns the next element in the sequence or raises a StopIteration in case the end of iteration is reached.

I’ve tested iterators which do not themselves have an __iter__() method and they still work in a for loop, but I still do not give in to the temptation of defining the next() method alone, since that would be inconsistent with the documented python specifications

Let us examine a simple class indicating a range. Note that this is just for demonstration since python itself has better constructs for a range.

# An iterator object to sequentially offer the next number
# in a sequence. Raises StopIteration on reaching the end
class RangeIterator(object):
    def __init__(self,begin,end):
        self._next = begin - 1
        self.end = end
    # As per the iterator protocol, an iterator returns itself
    def __iter__(self):
        return self
    def next(self):
        self._next += 1
        if self._next < self.end :
            return self._next
        else :
            raise StopIteration
           
# A class which allows sequential iteration over a range of
# values (begin inclusive, end exclusive)
class Range(object):
    def __init__(self,begin,end):
        self.begin = begin
        self.end = end
    def __iter__(self):
        return RangeIterator(self.begin, self.end)

oneToFive = Range(1,5)

In the above example, __iter__() on Range returns an instance of a RangeIterator.

The way this capability is used is as follows :

for number in oneToFive :
    print 'Next number is : %s' % number

print 'Is 2 in oneToFive', 2 in oneToFive
print 'Is 9 in oneToFive', 9 in oneToFive

The output you get with it is :

Next number is : 1
Next number is : 2
Next number is : 3
Next number is : 4
Is 2 in oneToFive True
Is 9 in oneToFive False
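To see what the for statement does with the iterator protocol, here is a rough sketch that drives an iterator by hand. Countdown is a hypothetical class of my own, not part of python; it defines __next__ and aliases it to next so that the same code should run on Python 2.6+ (which looks for next) and Python 3 (which looks for __next__).

```python
class Countdown(object):
    """Hypothetical iterator counting down from start to 1."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self            # an iterator returns itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    next = __next__            # the name Python 2 expects

# Roughly what "for item in Countdown(3)" does behind the scenes
iterator = iter(Countdown(3))      # calls __iter__()
seen = []
while True:
    try:
        item = next(iterator)      # calls next() / __next__()
    except StopIteration:
        break                      # the for loop swallows this silently
    seen = seen + [item]

print(seen)  # [3, 2, 1]
```

The in operator falls back on the same mechanism when a class defines no __contains__() : it iterates until it finds a match or hits StopIteration.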

Generators

In the above situation, the state necessary for the iteration (ie. the begin and end attributes) needs to be stored. This was stored in instance attributes. Python also provides a function based construct called a generator. In case of a generator, the function just does a yield instead of a return to hand back each value. Python implicitly provides a next() method which resumes where the last yield left off and implicitly raises a StopIteration when the function completes. So representing the above class as a function,

def my_range(begin, end):
    current = begin
    while current < end :
        yield current
        current += 1
       
for number in my_range(1,5) :
    print 'Next number is : %s' % number

print 'Is 2 in oneToFive', 2 in my_range(1,5)
print 'Is 9 in oneToFive', 9 in my_range(1,5)

I do not document the results since these are identical to the earlier code using a RangeIterator.

Note: I used my_range() and not range() since a range() function is already provided by python

An interesting point to realise is that in both the above situations, the next element to be returned in the sequence was being computed dynamically. To put it in terminology more consistent with functional programming, it was being lazily evaluated. While python has no general mechanism for lazy evaluation of functions, the ability of generators and iterator objects to lazily evaluate the values they return is adequate to get the benefits of lazy evaluation, at least from the perspective of using the cpu only on demand and keeping memory utilisation minimal.
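The laziness can be observed directly. The following is a small sketch, with a hypothetical squares() generator and a log list of my own that records when each value actually gets computed. It should run on both Python 2.6+ and Python 3.

```python
def squares(log):
    """Infinite generator of squares; log records each number as it is computed."""
    n = 0
    while True:
        log.append(n)      # side effect purely to make the laziness visible
        yield n * n
        n += 1

log = []
gen = squares(log)               # creating the generator runs none of its body
computed_before = list(log)

first = next(gen)                # each next() runs the body up to the next yield
second = next(gen)
third = next(gen)
computed_after = list(log)

print(computed_before)           # []
print([first, second, third])    # [0, 1, 4]
print(computed_after)            # [0, 1, 2]
```

Even though squares() describes an infinite sequence, only the three values that were asked for were ever computed or held in memory.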

List comprehensions

List comprehensions are one of the most powerful functional programming constructs in python. To quote from python documentation (even as I leave out a very important part of the full quote .. to be covered later),

Each list comprehension consists of an expression followed by a for clause, then zero or more for or if clauses. The result will be a list resulting from evaluating the expression in the context of the for and if clauses which follow it. If the expression would evaluate to a tuple, it must be parenthesized.

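A sample list comprehension is one which returns a list of even numbers between 0 and 9. Writing the same expression within parentheses instead of square brackets gives a generator expression, which computes its elements lazily :

# A list comprehension that returns even numbers between 0 and 9
even_comprehension = [num for num in range(10) if num % 2 == 0]
print type(even_comprehension)
print even_comprehension
# The same expression in parentheses is a generator expression
even_generator = (num for num in range(10) if num % 2 == 0)
print type(even_generator)
print tuple(even_generator)

The output one gets on running the code above is

<type 'list'>
[0, 2, 4, 6, 8]
<type 'generator'>
(0, 2, 4, 6, 8)

Note that the list comprehension returns a list straight away, whereas the generator expression returns a generator which then yields 0, 2, 4, 6 and 8 on demand.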

The for and if statements can be deeply nested.
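As a quick sketch of such nesting (the variable names are illustrative, and the code should run on both Python 2.6+ and Python 3), the for clauses are read left to right exactly as the equivalent nested loops would be written :

```python
# Two for clauses and an if: all pairs (x, y) with x < y
pairs = [(x, y) for x in range(3) for y in range(3) if x < y]
print(pairs)      # [(0, 1), (0, 2), (1, 2)]

# Flattening a list of lists while filtering
matrix = [[1, 2, 3], [4, 5, 6]]
flat_even = [value for row in matrix for value in row if value % 2 == 0]
print(flat_even)  # [2, 4, 6]
```

Note the parentheses around (x, y) : as the quoted documentation says, an expression that evaluates to a tuple must be parenthesized.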

lambda

Lambdas are anonymous functions with a constraint. Their body can only be a single expression which is also the return value of the function. I will demonstrate their usage in the next section.

map, filter and reduce

Three of the functional programming constructs which probably have aroused substantial discussions within the python community are map, filter and reduce.

map takes a function and a sequence, applies the function to each of the values in the sequence and returns a sequence of the results. Thus if one were to use map with a function which computes the square on a sequence, the result would be a sequence of the squares. Thus

def square(num): return num * num
print tuple(map(square,range(5)))

results in

(0, 1, 4, 9, 16)

I mentioned earlier I will demonstrate usage of a lambda. In this case I could use a lambda by defining the square function anonymously in place as follows :

print tuple(map(lambda num : num * num,range(5)))

filter takes a predicate and a sequence, and returns only the elements in the sequence for which the predicate evaluates to true.

Thus

print filter(lambda x : x % 2 == 0, range(5))

results in the following output

[0, 2, 4]

Finally the most controversial of them all – reduce. Starting with an initial value, reduce reduces the sequence down to a single value by applying a function on each of the elements in the sequence along with the current manifestation of the reduced value. Thus if I wanted to compute the sum of squares of numbers 0 through 4, I could use the following

print reduce(lambda reduced,num : reduced + (num * num), range(5), 0)

In the above example, 0 as the right most argument is the initial value. range(5) is the sequence of numbers from 0 thru 4. Finally the anonymous lambda takes two parameters – the first is always the current reduced value (or the initial value in case it is being invoked for the very first time) and the second parameter is the element in the sequence. The return value of the lambda becomes the new reduced value, which is passed to its next invocation along with the next element in the sequence. The value returned by the final invocation is then the value returned from reduce.
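To watch the reduced value evolve, here is a small sketch which replaces the lambda with a named function that records every invocation. add_square and steps are hypothetical names of mine. The functools import is redundant on Python 2 (where reduce is also a builtin) but is what makes the same code run on Python 3.

```python
from functools import reduce   # available from Python 2.6; required on Python 3

steps = []

def add_square(reduced, num):
    """Same logic as the lambda, but records (reduced, num, result) per call."""
    result = reduced + num * num
    steps.append((reduced, num, result))
    return result

total = reduce(add_square, range(5), 0)

for reduced, num, result in steps:
    print("reduced=%d num=%d -> %d" % (reduced, num, result))
print(total)  # 30
```

The recorded steps are (0, 0, 0), (0, 1, 1), (1, 2, 5), (5, 3, 14) and (14, 4, 30) : each result is fed back in as the reduced value of the following call.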

The functional programming folks are likely to find this an extremely natural expression. Yet reduce resulted in a substantial debate within the python community. With the result that reduce is no longer a builtin in python 3.0. (Strictly speaking it has been removed from python core but is just an import statement away as a part of the functools module). See The fate of reduce() in Python 3000. Why so controversial ? Simply because the usage of map, filter, reduce above could be rewritten as

#map
seq = []
for num in range(5) :
    seq = seq + [num * num]
print seq

#filter
seq = []
for num in range(5) :
    if num % 2 == 0 :
        seq = seq + [num]
print seq

#reduce
total = 0
for num in range(5) :
    total = total + (num * num)
print total

Just look at the two blocks of code and figure out which is easier to read (especially look at reduce). One of the areas where a large number of pythonistas and lispists are likely to disagree is the tradeoff between brevity and easy readability. Python code is often english like (well, at least to the extent that a programming language can be) and some pythonistas do not like the terse syntax of lisp that’s hard to follow.

I earlier mentioned that I left out a part of the description of list comprehensions from python documentation. Here’s that part of the quote.

List comprehensions provide a concise way to create lists without resorting to use of map(), filter() and/or lambda. The resulting list definition tends often to be clearer than lists built using those constructs.

Having said that I do believe many including myself will continue to use map, filter, reduce
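For comparison, the earlier map, filter and reduce examples could be sketched with comprehensions as below. The variable names are mine, and using sum() over a generator expression in place of reduce is my choice for this particular reduction; it only works because the operation happens to be addition. The code should run on both Python 2.6+ and Python 3.

```python
# map(lambda num : num * num, range(5))
squares = [num * num for num in range(5)]

# filter(lambda x : x % 2 == 0, range(5))
evens = [num for num in range(5) if num % 2 == 0]

# reduce(lambda reduced, num : reduced + (num * num), range(5), 0)
total = sum(num * num for num in range(5))

print(squares)  # [0, 1, 4, 9, 16]
print(evens)    # [0, 2, 4]
print(total)    # 30
```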

Other functions in python core that are helpful for functional programming are :

  • all : Returns True if all elements in a sequence are true
  • any : Returns True if any element in a sequence is true
  • enumerate : Returns a sequence containing tuples of element index and element
  • max : Returns the maximum value in a sequence
  • min : Returns the minimum value in a sequence
  • range : Return a sequence for given start, end and step values
  • sorted : Returns a sorted form of the sequence. It is possible to specify comparator functions, or key value for sorting, or change direction of sort
  • xrange : Same as range except it generates the next sequence element only on demand (lazy evaluation) thus helping conserve memory when working with very large ranges.
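A brief sketch of most of these helpers in action follows. The sample data and variable names are illustrative. xrange is left out because it exists only on Python 2; the rest should run on both Python 2.6+ and Python 3.

```python
words = ["pear", "fig", "banana"]

all_nonempty = all(len(w) > 0 for w in words)   # every word has a letter
any_long = any(len(w) > 5 for w in words)       # "banana" has 6 letters
indexed = list(enumerate(words))                # (index, element) tuples
longest = max(words, key=len)
shortest = min(words, key=len)
by_length_desc = sorted(words, key=len, reverse=True)
odds = list(range(1, 10, 2))                    # start, end (exclusive), step

print(all_nonempty)    # True
print(any_long)        # True
print(indexed)         # [(0, 'pear'), (1, 'fig'), (2, 'banana')]
print(longest)         # banana
print(shortest)        # fig
print(by_length_desc)  # ['banana', 'pear', 'fig']
print(odds)            # [1, 3, 5, 7, 9]
```

None of these modify the sequence passed in, so they sit comfortably with the approach of ignoring the mutating methods.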

We’ve seen many of the constructs that are typically useful for functional programming. I left out one big part – the itertools module. This is not a part of the python builtins (it’s a standard library module which is available with a default python installation). It’s a large library and substantially helps functional programming. That along with some more sample python programs shall be the focus of my next blog post in the series. At this point in time I anticipate at least a few more parts after that to focus on a) Immutability b) Concurrency and c) Sample usage.

Hope you found this useful and keep the feedback coming so that I can factor it into the subsequent posts.