August, 2010


30
Aug 10

Monads in an Object Oriented context

The other day I was referred to a blog post jQuery is a Monad. That is an interesting post which if you have any interest in Monads should read. My first thought was that jQuery was another implementation of Fluent Interface and it just did not strike me as a monadic construct.

There are blog posts which take an essential thought or a message and by putting together various arguments bring it to some logical conclusion. This post is not one of them. It just describes the results of my thought process which started off contesting the assertion, to accepting it and then realising that either way, it really didn’t matter a whole lot. I don’t claim it to be sufficiently accurate except to state that it is accurate only to the best of my belief and understanding. Pointers to any inaccuracies or alternate interpretations in the comments shall be gratefully received. And yes, this post in the end reaches a somewhat boring conclusiion.

Having got that little expectation management out of the way, lets get back to monads, jQuery and Object Orientation. Writing a tutorial on Monads is a rite of passage for many. Thats something I’ve never done. And I don’t wish to, but I cannot escape the fact that I’m going to have to introduce monads to those who are as challenged in their understanding as I am.

What are Monads ?

I’ll give you the loose idea. Whenever a function completes any computation, the results of the computations are often returned as the return value of the function. But every once in a while you need a container around that return value. A container to store the state of a computation, sometimes with additional metadata or historical annotations for use later. At its very basis, a monad is a container for storing computations and being the container to be carried between functions. Think of it as a shipping container if you will, which moves materials between factories – the factories being various functions which operate on the contents of that container.

But it takes more for a monad to be a monad than just being a container. Given any content, the monad should be able to construct itself by wrapping itself around the content. Thats what the source factory invokes when it fills up the container with goods. Some prefer to call it a type constructor – the type here being the container, or the monad (which may internally contain goods of various types, but I get ahead of myself).

In addition once the container reaches the destination factory, unlike typical container, this one does not just allow its contents to be unloaded. Nay, that would make it too simple. In this case, the container has to be taken to a factory where it allows a robotic arm of the factory to plug into a receptacle it provides, which the factory can use to extract the underlying values, often one at a time. The factory further processes these goods, which again come out at the other end of the factory as a different set of containers. And it is quite likely that the type of the goods in the new containers could’ve changed.

That in essence is a monad. Easy, right ? I’m sure not and I’m now only going to make it a bit harder.

The wikipedia, definition of monad Monad

I’m just going to let you read the definition as provided by wikipedia.

A monad is a construction that, given an underlying type system, embeds a corresponding type system (called the monadic type system) into it (that is, each monadic type acts as the underlying type). This monadic type system preserves all significant aspects of the underlying type system, while adding features particular to the monad.
The usual formulation of a monad for programming is known as a Kleisli triple, and has the following components:

  1. A type construction that defines, for every underlying type, how to obtain a corresponding monadic type. In Haskell’s notation, the name of the monad represents the type constructor. If M is the name of the monad and t is a data type, then “M t” is the corresponding type in the monad.
  2. A unit function that maps a value in an underlying type to a value in the corresponding monadic type. The result is the “simplest” value in the corresponding type that completely preserves the original value (simplicity being understood appropriately to the monad). In Haskell, this function is called return due to the way it is used in the do-notation described later. The unit function has the polymorphic type t→M t.
  3. A binding operation of polymorphic type (M t)→(t→M u)→(M u), which Haskell represents by the infix operator >>=. Its first argument is a value in a monadic type, its second argument is a function that maps from the underlying type of the first argument to another monadic type, and its result is in that other monadic type. The binding operation can be understood as having four stages:
    1. The monad-related structure on the first argument is “pierced” to expose any number of values in the underlying type t.
    2. The given function is applied to all of those values to obtain values of type (M u).
    3. The monad-related structure on those values is also pierced, exposing values of type u.
    4. Finally, the monad-related structure is reassembled over all of the results, giving a single value of type (M u).

In object-oriented programming terms, the type construction would correspond to the declaration of the monadic type, the unit function takes the role of a constructor method, and the binding operation contains the logic necessary to execute its registered callbacks (the monadic functions).

In practical terms, a monad (seen as special result values carried throughout the pipeline) stores function results and side-effect representations. This allows side effects to be propagated through the return values of functions without breaking the pure functional model.

The bind operation

So coming back to the container analogy, the container is the monad, the goods are the underlying types, and the factory is a function which takes a underlying type and spews out another monad which contains the newly manufactured goods. There is a difference to be noted, the container doesn’t go into the factory, its the factory thats plugged into the monad through the bind operation, which triggers the processing. Also note that the output of each such factory is not just goods, but containers of such goods it manufactures.

There, you can now imagine a series of containers and factories, or monads and functions which are sequenced to assemble a pipeline to produce the desired goods / computations.

Before we go on to how this works in OO, I would like to spend some more time on the function thats provided to the binding operation. This function accepts the individual contents of the incoming container and returns a container with the newly manufactured goods. A somewhat similar function is used when using map operations on containers eg. Lists. So if I had a list of integers, and I wrote a function which doubled their values, then the map operation would be written in python as :

1
2
3
4
5
def twice(val): return str(2 * val)

print map(twice,(1,2,3,4,5))

#Expected output is :['2', '4', '6', '8', '10']

This is a good example where the container (tuple) is unbundled inside the map operation, each constituent value of the list is passed to the twice function, and all the results of the twice function are reassembled into yet another container (list). The type of the constituents also changed from ints to str (string) along the way. To be a strictly monadic construct, the twice function should’ve returned not just the double of the value, but a double contained in some other container (either a tuple or a list), and the map function should’ve extract the individual strings from each of the individual tuples/lists and constructed a final tuple / list out of the same.

While unlikely to be confused as so, let me restate for clarity, the map function is not the monad. The tuple and the lists are the monads here. The map function is an example of the contextual capabilities that monads expect from their environment (eg. via and around the bind operator in haskell). And twice is the operation thats performed on the monad. Again to restate, the monadic chain is like a set of factories that spew out containers, which allow other factories to plug into them and in turn extract the contents and then spew out even more containers.

Monads and Object Orientation

So how does this work in an object oriented context ?

Lets take rules 1 and 2 of a monad. If I was to declare a class which had a constructor which took in a value and then wrapped it, then I would satisfy these two rules for being then be able to suggest that the class is a monad. However the rule 3 gets a little bit more interesting. Object oriented languages have the “.” operator which is somewhat analogous to the do block which can chain operations. So if I was to write o.foo() that would be equivalent of suggesting that I invoke the method foo() on the object which being a member method of the class has access to all the object internals and thus is able to access the wrapped value and do the necessary computations. Now if foo() were to return any object again of a class which satisfies these very rules, then I would be able to say that this class along with its member function foo() is a reasonable object oriented analogy of the monad. And I would be able to start chaining the methods as in o.foo().bar().baz()

Whoa! let me restate that again. If a class, has a constructor which takes a value and wraps that value, and has a number of additional member functions which each operate on these underlying values and return an instance of either the same object or another object of a class which follows the same rules, then I can say that that particular class is a monad. Incidentally thats exactly what fluent interfaces do, except that they do not have any specific expectation of wrapping. And a large number of classes may incidentally fit this description. And if jQuery is a monad, so are all of them.

Well, we did relax a few constraints along the way. First of all the functions are not stand alone functions. They are member methods which have direct access to the underlying wrapped value. Secondly they themselves don’t return a monad around an individual item in a collection. They return a monad around the entire collection of values. Thus the complexities and capabilities of the do block and the bind function are substantially simplified when using the “.” operator. And finally going back to the container analogy, the class defines and consists of both the containers and the factories. Seems like the threshold for stating a particular class is a monad is actually quite low. Turns out we reached a boring end. And it seems we are not much wiser at least in terms of any specific conclusions or insights. But every once in a while sometime the journey is more exciting than its end. For me this did seem like one.

Why are objects simple and monads complex ?

At least for me the above was true. Turns out objects collate related data and functions together into one class. Also the do block and bind operator on an object is very simple. So for many especially coming from the OO school, objects are well understood. On the other hand understanding the requirements of monadic constructs takes quite some time. So there’s a lot of gray cells that need to be exercised to start understanding what a monad is and how the various requirements for a monadic construct can be satisfied. And when mapping between monads, and objects which can also be seen to be monads, a lot of that complexity is either partially waived (eg. functions returning monads, the infrastructure around bind operators unbundling and bundling monads) or just simplified (functions and underlying values being colocated in the same class, the “.” operator being a far more simplified version of do block). But after some substantial headaches, it does start to seem that perhaps, just perhaps, monads aren’t so complicated after all :) .

So is jQuery a monad ? I believe one could choose to be very pedantic and point out minor issues with that assertion. Or one could accept that the intent of monadic sequences are well represented in jQuery chains and accept it as one. I started with the former and ended at the latter position. Would I express that jQuery is a monad ? To the monadically challenged – No. That obfuscates far more than it enlightens. To them I would say jQuery is a fluent interface which allows continuous chaining of operations on an underlying set of dom objects. And is there a big “Aha moment” when one realises that jQuery is a monad ? I couldn’t find one. While the journey of trying to understand monads and correlate it with monads was very exciting, at least the OO practitioner in me is unlikely to have missed much had I not known that. But I got to understand monads better – and thats well worth the time, and all the headspinning.


26
Aug 10

A case for non leaky dual abstractions.

A long long time ago, I worked on a fairly complex piece of design. And like any well behaved designer, I broke it down into a number of abstractions that made it manageable. I gave the abstractions funny sounding names. And before long I found those abstractions finding their way into the user interface. Abstractions which made no sense to the end user.

That experience taught me a good lesson. In attempting to deal with complexity, I was attempting to come up with the appropriate abstractions in a very bottom up way. So, strategic closure (of the Open Closed Principle), dependency inversion principle, et. al. all found their way into these abstractions I was modeling. These abstractions represented themselves via the various programmatic interfaces in the software design and their roles. At the same time there was a different perspective of the software. The way the end users would’ve preferred to see that software. The abstractions the way the end users saw the system were sometimes different than the underlying abstractions I modeled.

In a good design, the two abstractions should line up, and if there is an inconsistency, there is probably an issue with the design. Certainly this could be an issue in some cases. However in many cases the underlying backend might be built to offer far more capabilities than what are being exposed through the early version of the user interfaces. Probably there are multiple intents for which the software is being designed, and the particular user interface on table is just one of them. Probably, the underlying complexity of the back end is way too high which requires a very different nature of bottom up abstractions, which are different from the top down ones. Frankly, the reason doesn’t matter. Even after so many years, I look back and am comfortable with the thought that there needed to be two abstractions – one top down and one bottom up, one front end and one backend.

Fast forward many years. Another example helped me further attain some insight into the matter. We had a bunch of mobile smartphones floating around with the traditional PC UI abstractions being carried over into their design. Along came an iPhone – which rethought the interface the way users would’ve preferred to see the interface. Now iPhone was built on a traditional operating system. So it had the same abstractions that these operating systems have in the backend. However it decided to change some of the front end abstractions – in tune with the target market and the device / form factor peculiarities. Did the iPhone change the backend abstractions baked into its software – most likely not. However it changed the frontend abstractions, just enough to make the experience really simple and easy for its users.

As an engineer, I have lived with the regret of allowing backend abstractions to leak into the front end. But I have learnt something along the way.

  • Don’t let the inapplicable frontend abstractions leak into the backend. This is especially true for most reasonably complex software. Strictly top down design can lead to a lot of brittleness in the long run, requiring very substantial surgery eventually.
  • Don’t let the inapplicable backend abstractions leak into the frontend. In most cases this is a usability nightmare. ’nuff said.
  • Realise that you have to simultaneously service independent expectations ie. robustness of usability and robustness of modeling. Work with both the abstractions to mould each of them well: Work your frontend abstractions well enough to mould them to reflect the use cases of the software in the most usable manner. Work with the backend abstractions to let them emerge from the problem space you are trying to resolve at the backend. Now make sure you work with both of them to map them into each in a reasonably smooth fashion. This is easier said than done. But doing it is whats necessary.

PS: I am not referring to what is conventionally referred to as leaky abstractions, where the implementation leaks into an abstraction. I am referring to a set of frontend and backend abstractions leaking into each other. However if all the backend abstractions were to be treated as an implementation, then this would be a case of leaky abstractions as well.


13
Aug 10

Programming Languages should be Simple (or My ideal programming language)

I am disappointed with many of the newer languages which I earlier thought showed great promise of making programming easier, quicker, and more robust. And it boils down to one thing. Simplicity in learning. Having gone through substantial amounts of programming in C, C++, Java and Python, my quest for the “next” programming language remains unfulfilled.

Why ?

Programming should be simple. And it should be accessible. And when I mean accessible I mean people with IQ of approximately 100 should be able to write programs. I am disappointed that many of the trends seem to raise either the minimum IQ or the training time required to gain competency. And while that helps a community of the super brilliant, it does not make a substantial difference to programming in general. It remains esoteric and does not stoop to touch everybody.

What ?

So what are the features of my preferred programming language :

  • JRE support: Should run on the JRE with Java interop. Thats the dominant well engineered platform that runs across all classes of desktops, servers and devices. Additional support for CLR is a bonus but not mandatory.
  • Simplicity: Should be fairly simple to read, learn and understand. Python is a good example. PHP is a great example (at a simpler class of problems). C++ and Scala not good examples.
  • Multi paradigm : Should support both OO and FP constructs. eg. Scala and Python. Half hearted support to functional programming as with python discouraged. Ditto with passionate support for objects with second class treatment for functions as in Java/Ruby.
  • Multi core compatible : Should have good constructs for leveraging multiple cores eg. erlang, scala, clojure and many others.
  • Type inferencing : (and I am a python programmer :) ). Good type inferencing coupled with on the fly non intrusive/disruptive compilations as with say the eclipse on the fly compiler or play framework. Three cheers for Scala. One reason I prefer type inferencing to dynamic typing is the much superior performance even while maintaining brevity and removing boilerplate.
  • Constructs that are natural to humans not mathematics : This is actually a sub point to Simplicity. The constructs should be consistent with the normal average non mathematically trained brains. 2 + 3 is much simpler to understand than (+ 2 3). Python rocks. Lisp / Clojure or for that matter brainfuck dont.
  • Closures and code blocks : Love ruby for this.

Is this a pipe dream ?

For the moment seems so. Do you know of a language which helps meet these requirements ?

And to be very clear (because there is a substantial risk of the same) – this is no flame bait or an opportunity to trigger language wars. It is meant to highlight two things

  1. There is no ideal language out there, and
  2. When designing languages – make them simple to learn and use. ie. for a given problem statement a good language is one which requires the minimum talent or training to solve the problem