Performance Comparison – C++ / Java / Python / Ruby/ Jython / JRuby / Groovy

Posted by – July 8, 2008

Update (README CAREFULLY) : I am starting to see hyperlinks to his post with only some of the findings being treated as the link title (eg. X is 100 times faster than Y, X faster than Z). I emphasise once again that I have carefully indicated in the original post that this is but one of many possible microbenchmarks and that you should treat the results as one of many data points. Given the comments I’ve received and some of the links I’ve seen to this post, if I was to make this posting anew, I would choose to assign the title of this post as “Implementing an identical object oriented solution to the Josephus Problem in Java / C++ / Ruby / JRuby / Python / Jython / Groovy and measuring the performance results thereof.”

This post compares performance across various languages for a specific micro benchmark (actually it isn’t really a microbenchmark – it is simply a benchmark for a specific piece of logic – but thats the closest word I could think of).

Last week, while preparing for a presentation – Contrasting Java and Dynamic Languages, I came across this interesting Perl/Python/Ruby Comparison which focused on comparing the code style of different languages. I thought it would be interesting to use the same to get some actual benchmarks based on the same. Note that you could also use the code segments below to get a feel for different syntactic flavours. However since I have strived to keep the code as similar as possible to each other, some of the advanced syntactic sugar of the dynamic languages is not on display here.

Problem Statement

Quoting from the post linked to above :

Flavius Josephus was a roman historian of Jewish origin. During the Jewish-Roman wars of the first century AD, he was in a cave with fellow soldiers, 40 men in all, surrounded by enemy Roman troops. They decided to commit suicide by standing in a ring and counting off each third man. Each man so designated was to commit suicide…Josephus, not wanting to die, managed to place himself in the position of the last survivor.

In the general version of the problem, there are n soldiers numbered from 1 to n and each k-th soldier will be eliminated. The count starts from the first soldier. What is the number of the last survivor


Design

I actually changed the design of the solution as compared to the original post. Instead of using the deeply recursive calls as used in the earlier post, I decided to split the logic into two classes, and use loop iteration instead of recursion. It is my belief that we tend to do loop iterations far more frequently than recursions, and the resultant class design having two classes – one to indicate a Chain and one reflecting a Person seemed more appropriate to me.

Logic
The Chain object contains a reference to one person (first) who is but one member in a circular linked list. Each person object has a reference to its previous (prev) and next (next) person in the circle. When the kill loop starts, it sets a threshold (nth). The count starts with 1 from the first person. Each person when asked to shout, checks if the shout count (shout) is less than the threshold (nth). If less, the person just returns an incremented count. If the two are same, the person in effect commits suicide. In doing so the person, updates the next reference of its prev, and prev reference of its next to take himself off the circle and keep the circle consistent, finally returning a shout of 1 (which is what the next person in the list will shout).

The code does not have any comments (sorry!) and all the console outputs have been removed so that the benchmarking activity is not interfered with by the IO overheads.

The results

All the results are as observed on my notebook with the following config
OS : Ubuntu Gutsy Gibbon 7.10
Kernel : 2.6.22-15-generic
CPU : Intel(R) Core(TM) Duo CPU T2600 @ 2.16GHz
RAM : 2GB

Language Version Lines of Code Time per iteration (microseconds)
Java Sun JDK 1.6.0.03 10186 1.6
C++ 4.1.3 20070929 (prerelease)
(Ubuntu 4.1.2-16ubuntu2)
Compiled with optimisation -O3
86 3
gcc version 4.2.3
(Ubuntu 4.2.3-2ubuntu7)
Compiled with optimisation -O3
Alberto Bignotti’s modified code with customised memory reuse and management
124 approx 0
Ruby ruby 1.9.0 (2008-04-14 revision 16006) [i686-linux] 63 114 89
ruby 1.8.6 (2007-06-07 patchlevel 36) [i486-linux] 372 380
jruby : ruby 1.8.6 (2008-05-28 rev 6586) [i386-jruby1.1.2] 84 80
Python 2.5.1 41 225 192
2.5.1 with psyco 33
Jython 2.2.1 on JRE 1.6.0.03 884 632
Groovy Groovy Version: 1.5.6 JVM: 1.6.0_03-b05 uncompiled 81 363
Compiled to bytecode and run using java 360
UpdateGroovy Version: 1.6-beta-1 JVM: 1.6.0_03 104
PHP PHP 5.2.3-1ubuntu6.3 (cli) 85 593



Updates :

  • Ken suggested syntactic improvements (see comments below) which lead to even faster ruby execution times : jruby : 80 microseconds, ruby 1.9 : 89 microseconds, ruby 1.8.6 : 380 microseconds. The table above has been updated
  • Cato requested a run using Groovy 1.6 beta 1 – have updated the same. Big improvement
  • Nicholas Riley suggested introducing slots and using “is not” and “is” in the if conditions for the python code. Updated the results to reflect the figure of 192 and 632 micro seconds for CPython and Jython. The figure was 182 microseconds for CPython and 131 microseconds for Jython if I did not use the new style classes, however I did not reflect the same, since most new code is likely to be using new style classes. However this does indicate one possible performance optimisation if your code does not depend upon new style classes. Moreover makes me really interested in waiting for the Jython performance optimisations for new style classes that Nicholas suggests are on their shortlist.
  • Tim Fountain in a comment below indicates that on his hardware (core 2 Quad) with Ubuntu Hardy Heron, Ruby 1.8.6 (same version as above) performs somewhat faster (15%) whereas upgraded version of Python and PHP run much faster(63% for python and 83% for ruby). Another difference in config- he is running 64 bit.
    • python – version 2.5.2: – 138 microseconds
    • ruby – 1.8.6 (2007-09-24 patchlevel 111) [x86_64-linux] :321 microseconds
    • PHP – 5.2.4-2ubuntu5.1 with Suhosin-Patch 0.9.6.2 (cli): 323 microseconds.
  • Peter Lupo requested a reduction in line count for Java since the conventional way is to have the opening braces not on a separate independent line. Given the fact that it is a fair comment, I have reduced the line count of java to 86 (didn’t physically change the code – 86 = 101 – 15 opening curly braces).
  • Added another finding of using python with psyco
  • C++ – Added results using Alberto Bignotti’s alternative code with customised memory reuse management


Summarisation

The following are the results. Given the long code blocks, I am presenting the summarisation first followed by the code.

Note: This can only be treated as one particular benchmark. The results are a little atypical with respect to my general understanding. Advise caution against drawing broad conclusions based on this benchmark alone but would suggest that you could treat this as one data point amongst many. People better versed than me in the details of language runtimes might be able to suggest why some of the results seem surprising or atypical.

  • Java / C++ Rock : The performance of Java and C++ was head and shoulders beyond other languages (nearly 100 times faster). My thought is that while a difference of 10x was only to be expected – this difference was just way too massive
  • Java is faster than C++ : Though I had read about other microbenchmarks reaching the same conclusions, it is the first time I actually ran one where Java was faster. There are many others I have run where C++ beats Java quite handsomely. More importantly – the performance of C++ worsened by almost 40% once I added code which started freeing memory that was being allocated (there’s still a small memory leak in the code – there is no Chain destructor which will clean up first). I would later definitely want to look at the impact of garbage collection in this context, and whether the Java garbage collector simply was much faster than the hand crafted new – delete calls in C++. Update:Using the customised memory management (which is not used in any of the other examples) but the same algorithm as in the code written by Alberto, C++ is much faster than Java
  • Ruby 1.9 is twice as fast as Python : While it has been known for a while Ruby 1.9 is much faster than Ruby 1.8.6, heres one more supporting data point. I was expecting ruby 1.9 to give python a run for its performance money. But at least in this particular context it seems to be much much faster.
  • JRuby is faster than Ruby : Even ruby 1.9. Very interesting indeed.
  • Jython still has some catching up to do : Though in the ballpark as the other languages, it was the slowest in the pack.
  • Overhead of dynamism is dominant : I have no idea if JRuby ran much faster because of the java bytecode or because of its implementation (though its performance was not even remotely close to that of Java). However even after I compiled groovy code, to java bytecode, it still ran much slower than python and ruby. It seems the overhead of supporting dynamic constructs is much more dominant than any benefits that one gets out of compilation (whether to java byte code or to intermediate compiled files). I think the argument that because something compiles to java bytecode it is likely to be fast should be looked at a little carefully.
  • PHP stays at the rear end : Though I benchmarked PHP for the first time, I wasn’t completely surprised by the fact that PHP could only manage to be faster than Jython.

Update : There are many comments to this post including those from cwilbur who benchmarks perl using a idiomatic method, Paddy3118 who offers an optimised algorithm for python, and peter lawrey who offers an optimised algorithm for Java. I would like to state that each of their solutions offer superior performance than that what has been described here. However I believe any benchmark comparison should compare apples to apples. Should these contributions be taken into account and be reflected in the table above ? I certainly believe there is a case to do so as an exercise using a different algorithm. However to ensure that it is a fair comparison, one has to modify all the code in all other languages also to reflect the same algorithm. Only then can we get an apple to apples comparison. That is probably an exercise for another post. Is the algorithm I have chosen the fastest – No. However I believe it is a very readable algorithm and if one ignores the IO with networks and databases and files, it is probably close to the kind of code many programmers write (and maintain) on a day to day basis. It has been consistently implemented in all the languages. Readers should be aware that there are algorithms which will deliver much superior performance – but they will also make the performance superior in all the languages (perhaps to slightly differing extents and thus possibly somewhat different results).

The code

For all you who are either interested in running it for yourself or would like to perhaps explore this in more detail, .. here’s the code. Note that I am not equally competent across all languages. So if you believe there is something that could be more appropriate way to code the same, do post a comment. One of the things I have tried to do is to ensure that the code remains more or less similar across all languages. Also I have used getter – setters or skipped them based on my understanding of the generally accepted convention for users of the language.

Java :

package com.dnene.josephus;

public class Chain
{
	private Person first = null;

	public Chain(int size)
	{
		Person last = null;
		Person current = null;
		for (int i = 0 ; i < size ; i++)
		{
			current = new Person(i);
			if (first == null) first = current;
			if (last != null)
			{
				last.setNext(current);
				current.setPrev(last);
			}
			last = current;
		}
		first.setPrev(last);
		last.setNext(first);
	}

	public Person kill(int nth)
	{
		Person current = first;
		int shout = 1;
		while(current.getNext() != current)
		{
			shout = current.shout(shout, nth);
			current = current.getNext();
		}
		first = current;
		return current;
	}

	public Person getFirst()
	{
		return first;
	}
	public static void main(String[] args)
	{
		int ITER = 100000;
		long start = System.nanoTime();
		for (int i = 0 ; i < ITER ; i++)
		{
			Chain chain = new Chain(40);
			chain.kill(3);
		}
		long end = System.nanoTime();
		System.out.println("Time per iteration = " + ((end - start) / (ITER )) + " nanoseconds.");
	}
}
package com.dnene.josephus;

public class Person
{
	int count;
	private Person prev = null;
	private Person next = null;

	public Person(int count)
	{
		this.count = count;
	}

	public int shout(int shout, int deadif)
	{
		if (shout < deadif) return (shout + 1);
		this.getPrev().setNext(this.getNext());
		this.getNext().setPrev(this.getPrev());
		return 1;
	}

	public int getCount()
	{
		return this.count;
	}

	public Person getPrev()
	{
		return prev;
	}

	public void setPrev(Person prev)
	{
		this.prev = prev;
	}

	public Person getNext()
	{
		return next;
	}

	public void setNext(Person next)

	{
		this.next = next;
	}
}

C++

#include 
#include 
#include 
#include 

class Person
{

    public:

        Person(int count) : _next(NULL), _prev(NULL) { _count = count; }
        int shout(int shout, int nth)
        {
            if (shout < nth) return (shout + 1);
            _prev->_next = _next;

            _next->_prev = _prev;
            return 1;
        }
        int count() { return _count; }
        Person* next() { return _next; }
        void next(Person* person) { this->_next = person ; }
        Person* prev() { return _prev; }
        void prev(Person* person) { this->_prev = person; }
    private:
        int _count;
        Person* _next;
        Person* _prev;
};

class Chain
{
    public:
        Chain(int size) : _first(NULL)
        {
            Person* current = NULL;
            Person* last = NULL;
            for(int i = 0 ; i < size ; i++)
            {
                current = new Person(i);
                if (_first == NULL) _first = current;
                if (last != NULL)
                {
                    last->next(current);
                    current->prev(last);
                }
                last = current;
            }
            _first->prev(last);
            last->next(_first);
        }
        Person* kill(int nth)
        {
            Person* current = _first;
            int shout = 1;
            while(current->next() != current)

            {
                Person* tmp = current;
                shout = current->shout(shout,nth);
                current = current->next();
                if (shout == 1)
                {
                    delete tmp;
                }
            }
            _first = current;
            return current;
        }
        Person* first() { return _first; }
    private:
        Person* _first;
};

int main(int argc, char** argv)
{
    int ITER = 1000000;
    Chain* chain;
    struct timeval start, end;
    gettimeofday(&start,NULL);
    for(int i = 0 ; i < ITER ; i++)
    {
        chain = new Chain(40);
        chain->kill(3);

        delete chain;
    }
    gettimeofday(&end,NULL);
    fprintf(stdout,"Time per iteration = %d microsecondsnr", (((end.tv_sec - start.tv_sec) * 1000000) + (end.tv_usec - start.tv_usec)) / ITER);
    //fprintf(stdout,"Last man standing is %dnr", (chain->first()->count() + 1));
    return 0;
}

Python

class Person(object):
    def __init__(self,count):
        self.count = count;
        self.prev = None 

        self.next = None
    def shout(self,shout,deadif):
        if (shout < deadif): return (shout + 1)
        self.prev.next = self.next
        self.next.prev = self.prev
        return 1

class Chain(object):
    def __init__(self,size):
        self.first = None
        last = None
        for i in range(size):
            current = Person(i)
            if self.first == None : self.first = current
            if last != None :
                last.next = current
                current.prev = last
            last = current
        self.first.prev = last
        last.next = self.first
    def kill(self,nth):
        current = self.first
        shout = 1
        while current.next != current:
            shout = current.shout(shout,nth)
            current = current.next
        self.first = current
        return current

import time
ITER = 100000
start = time.time()
for i in range(ITER):
    chain = Chain(40)
    chain.kill(3)
end = time.time()
print 'Time per iteration = %s microseconds ' % ((end - start) * 1000000 / ITER)

Ruby

class Person
    attr_reader :count, :prev, :next
    attr_writer :count, :prev, :next

    def initialize(count)
        #puts 'Initializing person : ' + count.to_s()
        @count = count
        @prev = nil
        @next = nil
    end

    def shout(shout, deadif)
        if shout < deadif
            return shout + 1
        end
        @prev.next = @next
        @next.prev = @prev
        return 1
    end
end      

class Chain
    attr_reader :first
    attr_writer :first

    def initialize(size)
        @first = nil
        last = nil
        for i in (1..size)
            current = Person.new(i)
            if @first == nil
                @first = current
            end
            if last != nil
                last.next = current
                current.prev = last
            end
            last = current
        end
        @first.prev = last
        last.next = @first
    end

    def kill(nth)
        current = @first
        shout = 1
        while current.next != current
            shout = current.shout(shout,nth)
            current = current.next
        end
        @first = current
        return current
    end
end

ITER=100000
start = Time.now
ITER.times { |i|
chain = Chain.new(40)
chain.kill(3)
}
ends = Time.now
puts 'Time per iteration = ' + ((ends - start) * 1000000 / ITER).to_s() + " microseconds"

Groovy

class Chain
{
    def size
    def first

    def init(siz)
    {
        def last
        size = siz
        for(def i = 0 ; i < siz ; i++)
        {
            def current = new Person()
            current.count = i
            if (i == 0) first = current
            if (last != null)
            {
                last.next = current
            }
            current.prev = last
            last = current
        }
        first.prev = last
        last.next = first
    }

    def kill(nth)
    {
        def current = first
        def shout = 1
        while(current.next != current)
        {
            shout = current.shout(shout,nth)
            current = current.next
        }
        first = current
    }
}

class Person
{
    def count
    def prev
    def next

    def shout(shout,deadif)
    {
        if (shout < deadif)
        {
            return (shout + 1)
        }
        prev.next = next

        next.prev = prev
        return 1
    }
}

def main(args)
{
    println "Starting"
    def ITER = 100000
    def start = System.nanoTime()
    for(def i = 0 ; i < ITER ; i++)
    {
        def chain = new Chain()
        chain.init(40)
        chain.kill(3)
    }
    def end = System.nanoTime()
    println "Total time = " + ((end - start)/(ITER * 1000)) + " microseconds"
}

def ITER = 100000
def start = System.nanoTime()
for(def i = 0 ; i < ITER ; i++)
{
    def chain = new Chain()
    chain.init(40)
    chain.kill(3)
}
def end = System.nanoTime()
println "Time per iteration = " + ((end - start)/(ITER * 1000)) + " microseconds"

PHP

class Person
{
    function __construct($c)
    {
        $this->count = $c;
    }       

    function getPrev()
    {
        return $this->prev;
    }           

    function setPrev($pr)
    {
        $this->prev = $pr;
    }   

    function getNext()
    {
        return $this->next;
    }

    function setNext($nxt)
    {
        $this->next = $nxt;
    }

    function shout($shout, $nth)
    {
        if ($shout < $nth)
        {
            return $shout + 1;
        }
        $this->getPrev()->setNext($this->getNext());
        $this->getNext()->setPrev($this->getPrev());
        return 1;
    }
}

class Chain
{
    var $first;

    function __construct($size)
    {
        for($i = 0; $i < $size ; $i++)
        {
            $current = new Person($i);
            if ($this->first == null) $this->first = $current;
            if ($last != null)
            {
                $last->setNext($current);
                $current->setPrev($last);
            }
            $last = $current;
        }
        $this->first->setPrev($last);
        $last->setNext($this->first);
    }

    function kill($nth)
    {
        $current = $this->first;
        $shout = 1;
        while($current->getNext() !== $current)
        {
            $shout =  $current->shout($shout,$nth);
            $current = $current->getNext();
        }
        $this->first = $current;
    }
}

$start = microtime(true);
$ITER = 100000;
for($i = 0 ; $i < $ITER ; $i++)
{
    $chain = new Chain(40);
    $chain->kill(3);
}
$end = microtime(true);
printf("Time per iteration = %3.2f microsecondsnr",(($end -  $start) * 1000000 / $ITER));

Related posts: (Automatically Generated)

  1. Constructor / Method overloading in Python using Function Switching
  2. Improve your web based software development and maintenance ROI with dynamic programming languages

128 Comments on Performance Comparison – C++ / Java / Python / Ruby/ Jython / JRuby / Groovy

Closed

  1. @Kuba, I had originally run Java and C++ with 1000000 and the rest with 100000 (just because of the large timing difference), somewhere along the way (and I am not sure when exactly) one of the zeroes in the java program dropped off :) . I confirmed later this did not influence the results in any significant way.

  2. Nigel says:

    for a more representative comparison of the relative performance merits of languages try http://shootout.alioth.debian.org/

  3. Nigel says:

    for a more representative comparison of the relative performance merits of languages try http://shootout.alioth.debian.org/

  4. Paddy3118 says:

    I know I’m supposed to be following the class based approach of your static language examples but I do like the problem itself.

    There is more info on the problem here:
    http://www.research.att.com/~njas/sequences/A032434

    Including a recurrence relation that gives the answer.

    It translates into the following Python recursive function:

    >>> def T(n, k):
    ….if n == 1 : return 1
    ….tmp = (T(n-1, k)+k) % n
    ….if tmp == 0:
    ……..return n
    ….else:
    ……..return tmp

    ….
    >>> for n in range(1,8):
    ….print “\n”,n,”:”,
    ….for k in range(1, n+1): print T(n, k),

    1 : 1
    2 : 2 1
    3 : 3 3 2
    4 : 4 1 1 2
    5 : 5 3 4 1 2
    6 : 6 5 1 5 1 4
    7 : 7 7 4 2 6 3 5
    >>> T(40,3)
    28
    >>>

    If only I new enough maths to derive the above!

    Oh, here’s the timings:
    Time per iteration for T = 8.75 microseconds
    Time per iteration for findlast2 = 93.1200003624 microseconds
    Time per iteration for Chain = 221.560001373 microseconds

    I haven’t had so much fun with a sequence since Ackermanns function, and that was ages ago :-)

    - Paddy.

  5. Out of curiosity I ported the Java to C# and ran it against the .NET 2.0 runtime on a virtualized instance of WinXP Pro running inside VMware Fusion using one of the two Intel Core Duo 2.2Ghz processors and 1Gig of ram. The result was an average of 12microseconds per iteration.

    The code in case anyone cares to try it in a non-virtualized instance of WinXP and/or Vista:

    using System;
    using System.Diagnostics;

    public class Chain
    {
    private Person first = null;

    public Chain(int size)
    {
    Person last = null;
    Person current = null;
    for (int i = 0 ; i size ; i++)
    {
    current = new Person(i);
    if (first == null) first = current;
    if (last != null)
    {
    last.setNext(current);
    current.setPrev(last);
    }
    last = current;
    }
    first.setPrev(last);
    last.setNext(first);
    }

    public Person kill(int nth)
    {
    Person current = first;
    int shout = 1;
    while(current.getNext() != current)
    {
    shout = current.shout(shout, nth);
    current = current.getNext();
    }
    first = current;
    return current;
    }

    public Person getFirst()
    {
    return first;
    }
    public static void Main(String[] args)
    {
    int ITER = 100000;
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();
    for (int i = 0 ; i ITER ; i++)
    {
    Chain chain = new Chain(40);
    chain.kill(3);
    }
    stopwatch.Stop();
    long totalMicroseconds = stopwatch.ElapsedMilliseconds * 10000;
    Console.WriteLine(“Elapsed Time: {0}ms, Time per iteration: {1}ms”, totalMicroseconds, (totalMicroseconds / ITER));
    }
    }

    public class Person
    {
    int count;
    private Person prev = null;
    private Person next = null;

    public Person(int count)
    {
    this.count = count;
    }

    public int shout(int shout, int deadif)
    {
    if (shout deadif) return (shout + 1);
    this.getPrev().setNext(this.getNext());
    this.getNext().setPrev(this.getPrev());
    return 1;
    }

    public int getCount()
    {
    return this.count;
    }

    public Person getPrev()
    {
    return prev;
    }

    public void setPrev(Person prev)
    {
    this.prev = prev;
    }

    public Person getNext()
    {
    return next;
    }

    public void setNext(Person next)
    {
    this.next = next;
    }
    }

  6. Slight update to run the test multiple times yields a speed up of about 2 microseconds per iteration:

    PS Z:NuxleusPublicSourceCodeSamplesBenchmarkBenchmarkinRelease .Benchmark.exe
    Elapsed Time: 1070000ms, Time per iteration: 10ms
    Elapsed Time: 1080000ms, Time per iteration: 10ms
    Elapsed Time: 1050000ms, Time per iteration: 10ms
    Elapsed Time: 1080000ms, Time per iteration: 10ms
    Elapsed Time: 1050000ms, Time per iteration: 10ms

    The updated code:

    using System;
    using System.Diagnostics;

    public class Chain
    {
    private Person first = null;

    public Chain(int size)
    {
    Person last = null;
    Person current = null;
    for (int i = 0 ; i size ; i++)
    {
    current = new Person(i);
    if (first == null) first = current;
    if (last != null)
    {
    last.setNext(current);
    current.setPrev(last);
    }
    last = current;
    }
    first.setPrev(last);
    last.setNext(first);
    }

    public Person kill(int nth)
    {
    Person current = first;
    int shout = 1;
    while(current.getNext() != current)
    {
    shout = current.shout(shout, nth);
    current = current.getNext();
    }
    first = current;
    return current;
    }

    public Person getFirst()
    {
    return first;
    }
    public static void Main(String[] args)
    {
    int ITER = 100000;
    RunTest(ITER);
    RunTest(ITER);
    RunTest(ITER);
    RunTest(ITER);
    RunTest(ITER);
    }

    public static void RunTest(int ITER) {
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();
    for (int i = 0; i ITER; i++) {
    Chain chain = new Chain(40);
    chain.kill(3);
    }
    stopwatch.Stop();
    long totalMicroseconds = stopwatch.ElapsedMilliseconds * 10000;
    Console.WriteLine(“Elapsed Time: {0}ms, Time per iteration: {1}ms”, totalMicroseconds, (totalMicroseconds / ITER));
    }
    }

    public class Person
    {
    int count;
    private Person prev = null;
    private Person next = null;

    public Person(int count)
    {
    this.count = count;
    }

    public int shout(int shout, int deadif)
    {
    if (shout deadif) return (shout + 1);
    this.getPrev().setNext(this.getNext());
    this.getNext().setPrev(this.getPrev());
    return 1;
    }

    public int getCount()
    {
    return this.count;
    }

    public Person getPrev()
    {
    return prev;
    }

    public void setPrev(Person prev)
    {
    this.prev = prev;
    }

    public Person getNext()
    {
    return next;
    }

    public void setNext(Person next)
    {
    this.next = next;
    }
    }

  7. Out of curiosity I ported the Java to C# and ran it against the .NET 2.0 runtime on a virtualized instance of WinXP Pro running inside VMware Fusion using one of the two Intel Core Duo 2.2Ghz processors and 1Gig of ram. The result was an average of 12microseconds per iteration.

    The code in case anyone cares to try it in a non-virtualized instance of WinXP and/or Vista:

    using System;
    using System.Diagnostics;

    public class Chain
    {
    private Person first = null;

    public Chain(int size)
    {
    Person last = null;
    Person current = null;
    for (int i = 0 ; i < size ; i++)
    {
    current = new Person(i);
    if (first == null) first = current;
    if (last != null)
    {
    last.setNext(current);
    current.setPrev(last);
    }
    last = current;
    }
    first.setPrev(last);
    last.setNext(first);
    }

    public Person kill(int nth)
    {
    Person current = first;
    int shout = 1;
    while(current.getNext() != current)
    {
    shout = current.shout(shout, nth);
    current = current.getNext();
    }
    first = current;
    return current;
    }

    public Person getFirst()
    {
    return first;
    }
    public static void Main(String[] args)
    {
    int ITER = 100000;
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();
    for (int i = 0 ; i < ITER ; i++)
    {
    Chain chain = new Chain(40);
    chain.kill(3);
    }
    stopwatch.Stop();
    long totalMicroseconds = stopwatch.ElapsedMilliseconds * 10000;
    Console.WriteLine(“Elapsed Time: {0}ms, Time per iteration: {1}ms”, totalMicroseconds, (totalMicroseconds / ITER));
    }
    }

    public class Person
    {
    int count;
    private Person prev = null;
    private Person next = null;

    public Person(int count)
    {
    this.count = count;
    }

    public int shout(int shout, int deadif)
    {
    if (shout < deadif) return (shout + 1);
    this.getPrev().setNext(this.getNext());
    this.getNext().setPrev(this.getPrev());
    return 1;
    }

    public int getCount()
    {
    return this.count;
    }

    public Person getPrev()
    {
    return prev;
    }

    public void setPrev(Person prev)
    {
    this.prev = prev;
    }

    public Person getNext()
    {
    return next;
    }

    public void setNext(Person next)
    {
    this.next = next;
    }
    }

  8. Slight update to run the test multiple times yields a speed up of about 2 microseconds per iteration:

    PS Z:\Nuxleus\Public\Source\CodeSamples\Benchmark\Benchmark\bin\Release> .\Benchmark.exe
    Elapsed Time: 1070000ms, Time per iteration: 10ms
    Elapsed Time: 1080000ms, Time per iteration: 10ms
    Elapsed Time: 1050000ms, Time per iteration: 10ms
    Elapsed Time: 1080000ms, Time per iteration: 10ms
    Elapsed Time: 1050000ms, Time per iteration: 10ms

    The updated code:

    using System;
    using System.Diagnostics;

    public class Chain
    {
    private Person first = null;

    public Chain(int size)
    {
    Person last = null;
    Person current = null;
    for (int i = 0 ; i < size ; i++)
    {
    current = new Person(i);
    if (first == null) first = current;
    if (last != null)
    {
    last.setNext(current);
    current.setPrev(last);
    }
    last = current;
    }
    first.setPrev(last);
    last.setNext(first);
    }

    public Person kill(int nth)
    {
    Person current = first;
    int shout = 1;
    while(current.getNext() != current)
    {
    shout = current.shout(shout, nth);
    current = current.getNext();
    }
    first = current;
    return current;
    }

    public Person getFirst()
    {
    return first;
    }
    public static void Main(String[] args)
    {
    int ITER = 100000;
    RunTest(ITER);
    RunTest(ITER);
    RunTest(ITER);
    RunTest(ITER);
    RunTest(ITER);
    }

    public static void RunTest(int ITER) {
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();
    for (int i = 0; i < ITER; i++) {
    Chain chain = new Chain(40);
    chain.kill(3);
    }
    stopwatch.Stop();
    long totalMicroseconds = stopwatch.ElapsedMilliseconds * 10000;
    Console.WriteLine(“Elapsed Time: {0}ms, Time per iteration: {1}ms”, totalMicroseconds, (totalMicroseconds / ITER));
    }
    }

    public class Person
    {
    int count;
    private Person prev = null;
    private Person next = null;

    public Person(int count)
    {
    this.count = count;
    }

    public int shout(int shout, int deadif)
    {
    if (shout < deadif) return (shout + 1);
    this.getPrev().setNext(this.getNext());
    this.getNext().setPrev(this.getPrev());
    return 1;
    }

    public int getCount()
    {
    return this.count;
    }

    public Person getPrev()
    {
    return prev;
    }

    public void setPrev(Person prev)
    {
    this.prev = prev;
    }

    public Person getNext()
    {
    return next;
    }

    public void setNext(Person next)
    {
    this.next = next;
    }
    }

  9. sri says:

    hey dhananjay,
    This is sri. Thanks for signing up at enewss.
    Regarding your performance comparison between different languages…
    Most of the compilers do a very good job at optimizing the loops and hence not good for performance metrics. I would prefer to solve an algorithm for searching for the sake of comparing different languages.
    Again, to be frank, we are comparing apples to oranges because the languages you chose are invesnted for specific purpose and hence shouldn’t be compared..

    Your conclusion that performance of c++ language worsens with freeing up of memory is correct because of reasons you already know. In case of java, jvm handles all of the memory related services freeing up process to do its job, whereas in c++, the user process has to take care of it.

    However, there is one subtle difference between c++ and java, if programmed correctly, c++ can beat the shit out of java when it comes to memory allocation.
    in c++ objects can be created on stack, whereas in java everything is on the heap. So, in c++, objects can be created and destroyed faster if you make use of stack properly without the need of new() routine/heap_memory.

    Again, if programmed perfectly to the finest bit and byte level, c++ can perform very well in real time applications compared to java, as java allocates storage for program variables and structures with highest common denominator among all the platforms available. In c++, you have the control on storage size.

    Performance measurement is an art, industry always has fine tuned their applications only to come up with a biased report to almost always give high marks to their products/applications.

    Neverthless, your blog made me recollect things which i almost forbot about.
    Thanks
    Sri

  10. Peter Lupo says:

    Wrong line count! :-)
    There is no reason why you should write java methods like

    public int getCount()
    {
    return this.count;
    }

    And C methods like

    int count() { return _count; }

    This gives you a wrong line count.

    As you say, you should compare apples to apples, so you should use the same rule for code formatting on all the languages.

    BTW, Java Code Conventions advise us to write our Java methods with { at the end of the line, like:
    public int getCount() {
    and not in the beginning of the next line like you did.

    Please, reformat the codes and recount the lines.

  11. Mauricio Reale says:

    Very impressive test. Java outperforms C++ in performance in a VERY PARTICULAR way. Cannot understand why most very intensive apps (games is a good example) are not written in Java. Anyway, java still rocks.

  12. Peter Lupo says:

    Wrong line count! :-)
    There is no reason why you should write java methods like

    public int getCount()
    {
    return this.count;
    }

    And C methods like

    int count() { return _count; }

    This gives you a wrong line count.

    As you say, you should compare apples to apples, so you should use the same rule for code formatting on all the languages.

    BTW, Java Code Conventions advise us to write our Java methods with { at the end of the line, like:
    public int getCount() {
    and not in the beginning of the next line like you did.

    Please, reformat the codes and recount the lines.

  13. Mauricio Reale says:

    Very impressive test. Java outperforms C++ in performance in a VERY PARTICULAR way. Cannot understand why most very intensive apps (games is a good example) are not written in Java. Anyway, java still rocks.

  14. titanandrews says:

    “Cannot understand why most very intensive apps (games is a good example) are not written in Java”

    One reason is because games have LOTS of math and Java is still very slow at intensive math calculations in comparison to C++.

    This benchmark isn’t worth much because it doesn’t do anything useful.

  15. Please, fix your C++ code:

    Instead of

    int ITER = 1000000;

    use

    int ITER = 100000;

    as other codes.

  16. Please, fix your C++ code:

    Instead of

    int ITER = 1000000;

    use

    int ITER = 100000;

    as other codes.

  17. pedro says:

    One vote to # Peter Lupo’s comment

  18. and says:

    Why you don´t put COBOL. COBOL is TOO Performance Comparison – C++ / Java / Python / Ruby/ Jython / JRuby / Groovy

  19. and says:

    Why you don´t put COBOL. COBOL is TOO Performance Comparison – C++ / Java / Python / Ruby/ Jython / JRuby / Groovy

  20. @Peter Lupo, @Pedro – I updated the line counts

    @and – I don’t know COBOL (actually the last time I coded with it was in 1990).

    @Mauricio – I believe more and more games are starting to be written using Java. Games often require not just a fast gaming engine (which I believe Java has started contributing to) but also highly optimised Graphics APIs (last I checked there was no DirectX / OpenGL using Java directly. :) ). However I am sure C will in all likelihood continue to have an advantage over Java in games.

    @Vitor – I have already remarked above (in the comments) – the number of iterations do not make any significant difference to the results (though the fact that these are different is accidental and not intentional)

  21. @sri,

    I would offer a rather different view than what you stated.

    If all compilers were able to optimise all loops equally I would agree with you. However the fact of the matter is that they dont. Hence such a comparison is not entirely unwarranted.

    If users were attempting to use languages for the reasons they were invented and not for others I would go along with you. However Java was invented for set top boxes. But it comes up time and again as an option in networking, games, transactional processing, databases etc. etc. I think it is fair to compare them.

    In both c++ and java the user process has to handle the memory allocation and deallocation. The difference is that in Java *sometimes* the automatic garbage collector because it works on deallocation in a bulk mode, has an opportunity to optimise memory management far beyond what any finely crafted developer specified memory allocation can do. Hence *sometimes* java memory management can be faster than (I don’t know if it can beat the shit out of) hand crafted C++ code.

    C will allow you fine control on storage sizes down to the byte level. There are scenarios where this is really useful (eg. embedded) and there are scenarios where this may not be. It is precisely because of this, an architect requires to achieve the appropriate tradeoff between control / speed / ram (C++) vs. high developer producitivity and agility (Python / Ruby) with Java somewhere inbetween.

  22. You should obviously not talk about languages you don’t have a clue of. Your PHP code sux a lot. If you wanted to make it serious, than you should have had a groups of experts of each language to decide de best way of doing things.

    Also, PHP has an extra line between 8 and 9, so it would be only 84 lines. Also, you don’t need to close the script with ?> if the file has only PHP code, which would lead us to 83 lines. More to say: you don’t nees {} for 1 line ifs, which would reduce 2 more lines, leaving us with 81 lines.

    And I could keep fixing your code for days, but I gotta better things to do than see your PHP 4 code (yes, you did not used PHP 5 code! meaning the PHP engine also slows down to try toi execute the code as PHP 4!).

  23. Thiago Pojda says:

    Good tests. :)

    Your php code does not look good but that’s fine.

    I just want to corrent Pablo Sánchez saying that PHP code is not PHP5 code. You didn’t have __construct() and __destruct() in PHP4, they were introduced in PHP5.

    Oh, and PHP CLI version is slower AFAIK. Apache module is faster (but that way you’d be testing apache also, I see the problem).

  24. [...] can connect java with c,c++ or asm code. So yes it is possible I’ve read something intersting here then I read in another place (I lost the link) the “benchmark”… Java is not the same Old Java… [...]