Update (README CAREFULLY) : I am starting to see hyperlinks to his post with only some of the findings being treated as the link title (eg. X is 100 times faster than Y, X faster than Z). I emphasise once again that I have carefully indicated in the original post that this is but one of many possible microbenchmarks and that you should treat the results as one of many data points. Given the comments I’ve received and some of the links I’ve seen to this post, if I was to make this posting anew, I would choose to assign the title of this post as “Implementing an identical object oriented solution to the Josephus Problem in Java / C++ / Ruby / JRuby / Python / Jython / Groovy and measuring the performance results thereof.”
This post compares performance across various languages for a specific micro benchmark (actually it isn’t really a microbenchmark – it is simply a benchmark for a specific piece of logic – but thats the closest word I could think of).
Last week, while preparing for a presentation – Contrasting Java and Dynamic Languages, I came across this interesting Perl/Python/Ruby Comparison which focused on comparing the code style of different languages. I thought it would be interesting to use the same to get some actual benchmarks based on the same. Note that you could also use the code segments below to get a feel for different syntactic flavours. However since I have strived to keep the code as similar as possible to each other, some of the advanced syntactic sugar of the dynamic languages is not on display here.
Problem Statement
Quoting from the post linked to above :
Flavius Josephus was a roman historian of Jewish origin. During the Jewish-Roman wars of the first century AD, he was in a cave with fellow soldiers, 40 men in all, surrounded by enemy Roman troops. They decided to commit suicide by standing in a ring and counting off each third man. Each man so designated was to commit suicide…Josephus, not wanting to die, managed to place himself in the position of the last survivor.
In the general version of the problem, there are n soldiers numbered from 1 to n and each k-th soldier will be eliminated. The count starts from the first soldier. What is the number of the last survivor
Design
I actually changed the design of the solution as compared to the original post. Instead of using the deeply recursive calls as used in the earlier post, I decided to split the logic into two classes, and use loop iteration instead of recursion. It is my belief that we tend to do loop iterations far more frequently than recursions, and the resultant class design having two classes – one to indicate a Chain and one reflecting a Person seemed more appropriate to me.
Logic
The Chain object contains a reference to one person (first) who is but one member in a circular linked list. Each person object has a reference to its previous (prev) and next (next) person in the circle. When the kill loop starts, it sets a threshold (nth). The count starts with 1 from the first person. Each person when asked to shout, checks if the shout count (shout) is less than the threshold (nth). If less, the person just returns an incremented count. If the two are same, the person in effect commits suicide. In doing so the person, updates the next reference of its prev, and prev reference of its next to take himself off the circle and keep the circle consistent, finally returning a shout of 1 (which is what the next person in the list will shout).
The code does not have any comments (sorry!) and all the console outputs have been removed so that the benchmarking activity is not interfered with by the IO overheads.
The results
All the results are as observed on my notebook with the following config
OS : Ubuntu Gutsy Gibbon 7.10
Kernel : 2.6.22-15-generic
CPU : Intel(R) Core(TM) Duo CPU T2600 @ 2.16GHz
RAM : 2GB
| Language | Version | Lines of Code | Time per iteration (microseconds) |
|---|---|---|---|
| Java | Sun JDK 1.6.0.03 | 1.6 | |
| C++ | 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2) Compiled with optimisation -O3 |
86 | 3 |
| gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7) Compiled with optimisation -O3 Alberto Bignotti’s modified code with customised memory reuse and management |
124 | approx 0 | |
| Ruby | ruby 1.9.0 (2008-04-14 revision 16006) [i686-linux] | 63 | |
| ruby 1.8.6 (2007-06-07 patchlevel 36) [i486-linux] | |||
| jruby : ruby 1.8.6 (2008-05-28 rev 6586) [i386-jruby1.1.2] | |||
| Python | 2.5.1 | 41 | |
| 2.5.1 with psyco | 33 | ||
| Jython 2.2.1 on JRE 1.6.0.03 | |||
| Groovy | Groovy Version: 1.5.6 JVM: 1.6.0_03-b05 uncompiled | 81 | 363 |
| Compiled to bytecode and run using java | 360 | ||
| UpdateGroovy Version: 1.6-beta-1 JVM: 1.6.0_03 | 104 | ||
| PHP | PHP 5.2.3-1ubuntu6.3 (cli) | 85 | 593 |
Updates :
- Ken suggested syntactic improvements (see comments below) which lead to even faster ruby execution times : jruby : 80 microseconds, ruby 1.9 : 89 microseconds, ruby 1.8.6 : 380 microseconds. The table above has been updated
- Cato requested a run using Groovy 1.6 beta 1 – have updated the same. Big improvement
- Nicholas Riley suggested introducing slots and using “is not” and “is” in the if conditions for the python code. Updated the results to reflect the figure of 192 and 632 micro seconds for CPython and Jython. The figure was 182 microseconds for CPython and 131 microseconds for Jython if I did not use the new style classes, however I did not reflect the same, since most new code is likely to be using new style classes. However this does indicate one possible performance optimisation if your code does not depend upon new style classes. Moreover makes me really interested in waiting for the Jython performance optimisations for new style classes that Nicholas suggests are on their shortlist.
- Tim Fountain in a comment below indicates that on his hardware (core 2 Quad) with Ubuntu Hardy Heron, Ruby 1.8.6 (same version as above) performs somewhat faster (15%) whereas upgraded version of Python and PHP run much faster(63% for python and 83% for ruby). Another difference in config- he is running 64 bit.
- python – version 2.5.2: – 138 microseconds
- ruby – 1.8.6 (2007-09-24 patchlevel 111) [x86_64-linux] :321 microseconds
- PHP – 5.2.4-2ubuntu5.1 with Suhosin-Patch 0.9.6.2 (cli): 323 microseconds.
- Peter Lupo requested a reduction in line count for Java since the conventional way is to have the opening braces not on a separate independent line. Given the fact that it is a fair comment, I have reduced the line count of java to 86 (didn’t physically change the code – 86 = 101 – 15 opening curly braces).
- Added another finding of using python with psyco
- C++ – Added results using Alberto Bignotti’s alternative code with customised memory reuse management
Summarisation
The following are the results. Given the long code blocks, I am presenting the summarisation first followed by the code.
Note: This can only be treated as one particular benchmark. The results are a little atypical with respect to my general understanding. Advise caution against drawing broad conclusions based on this benchmark alone but would suggest that you could treat this as one data point amongst many. People better versed than me in the details of language runtimes might be able to suggest why some of the results seem surprising or atypical.
- Java / C++ Rock : The performance of Java and C++ was head and shoulders beyond other languages (nearly 100 times faster). My thought is that while a difference of 10x was only to be expected – this difference was just way too massive
- Java is faster than C++ : Though I had read about other microbenchmarks reaching the same conclusions, it is the first time I actually ran one where Java was faster. There are many others I have run where C++ beats Java quite handsomely. More importantly – the performance of C++ worsened by almost 40% once I added code which started freeing memory that was being allocated (there’s still a small memory leak in the code – there is no Chain destructor which will clean up first). I would later definitely want to look at the impact of garbage collection in this context, and whether the Java garbage collector simply was much faster than the hand crafted new – delete calls in C++. Update:Using the customised memory management (which is not used in any of the other examples) but the same algorithm as in the code written by Alberto, C++ is much faster than Java
- Ruby 1.9 is twice as fast as Python : While it has been known for a while Ruby 1.9 is much faster than Ruby 1.8.6, heres one more supporting data point. I was expecting ruby 1.9 to give python a run for its performance money. But at least in this particular context it seems to be much much faster.
- JRuby is faster than Ruby : Even ruby 1.9. Very interesting indeed.
- Jython still has some catching up to do : Though in the ballpark as the other languages, it was the slowest in the pack.
- Overhead of dynamism is dominant : I have no idea if JRuby ran much faster because of the java bytecode or because of its implementation (though its performance was not even remotely close to that of Java). However even after I compiled groovy code, to java bytecode, it still ran much slower than python and ruby. It seems the overhead of supporting dynamic constructs is much more dominant than any benefits that one gets out of compilation (whether to java byte code or to intermediate compiled files). I think the argument that because something compiles to java bytecode it is likely to be fast should be looked at a little carefully.
- PHP stays at the rear end : Though I benchmarked PHP for the first time, I wasn’t completely surprised by the fact that PHP could only manage to be faster than Jython.
Update : There are many comments to this post including those from cwilbur who benchmarks perl using a idiomatic method, Paddy3118 who offers an optimised algorithm for python, and peter lawrey who offers an optimised algorithm for Java. I would like to state that each of their solutions offer superior performance than that what has been described here. However I believe any benchmark comparison should compare apples to apples. Should these contributions be taken into account and be reflected in the table above ? I certainly believe there is a case to do so as an exercise using a different algorithm. However to ensure that it is a fair comparison, one has to modify all the code in all other languages also to reflect the same algorithm. Only then can we get an apple to apples comparison. That is probably an exercise for another post. Is the algorithm I have chosen the fastest – No. However I believe it is a very readable algorithm and if one ignores the IO with networks and databases and files, it is probably close to the kind of code many programmers write (and maintain) on a day to day basis. It has been consistently implemented in all the languages. Readers should be aware that there are algorithms which will deliver much superior performance – but they will also make the performance superior in all the languages (perhaps to slightly differing extents and thus possibly somewhat different results).
The code
For all you who are either interested in running it for yourself or would like to perhaps explore this in more detail, .. here’s the code. Note that I am not equally competent across all languages. So if you believe there is something that could be more appropriate way to code the same, do post a comment. One of the things I have tried to do is to ensure that the code remains more or less similar across all languages. Also I have used getter – setters or skipped them based on my understanding of the generally accepted convention for users of the language.
Java :
package com.dnene.josephus;
public class Chain
{
private Person first = null;
public Chain(int size)
{
Person last = null;
Person current = null;
for (int i = 0 ; i < size ; i++)
{
current = new Person(i);
if (first == null) first = current;
if (last != null)
{
last.setNext(current);
current.setPrev(last);
}
last = current;
}
first.setPrev(last);
last.setNext(first);
}
public Person kill(int nth)
{
Person current = first;
int shout = 1;
while(current.getNext() != current)
{
shout = current.shout(shout, nth);
current = current.getNext();
}
first = current;
return current;
}
public Person getFirst()
{
return first;
}
public static void main(String[] args)
{
int ITER = 100000;
long start = System.nanoTime();
for (int i = 0 ; i < ITER ; i++)
{
Chain chain = new Chain(40);
chain.kill(3);
}
long end = System.nanoTime();
System.out.println("Time per iteration = " + ((end - start) / (ITER )) + " nanoseconds.");
}
}
package com.dnene.josephus;
public class Person
{
int count;
private Person prev = null;
private Person next = null;
public Person(int count)
{
this.count = count;
}
public int shout(int shout, int deadif)
{
if (shout < deadif) return (shout + 1);
this.getPrev().setNext(this.getNext());
this.getNext().setPrev(this.getPrev());
return 1;
}
public int getCount()
{
return this.count;
}
public Person getPrev()
{
return prev;
}
public void setPrev(Person prev)
{
this.prev = prev;
}
public Person getNext()
{
return next;
}
public void setNext(Person next)
{
this.next = next;
}
}
C++
#include#include #include #include class Person { public: Person(int count) : _next(NULL), _prev(NULL) { _count = count; } int shout(int shout, int nth) { if (shout < nth) return (shout + 1); _prev->_next = _next; _next->_prev = _prev; return 1; } int count() { return _count; } Person* next() { return _next; } void next(Person* person) { this->_next = person ; } Person* prev() { return _prev; } void prev(Person* person) { this->_prev = person; } private: int _count; Person* _next; Person* _prev; }; class Chain { public: Chain(int size) : _first(NULL) { Person* current = NULL; Person* last = NULL; for(int i = 0 ; i < size ; i++) { current = new Person(i); if (_first == NULL) _first = current; if (last != NULL) { last->next(current); current->prev(last); } last = current; } _first->prev(last); last->next(_first); } Person* kill(int nth) { Person* current = _first; int shout = 1; while(current->next() != current) { Person* tmp = current; shout = current->shout(shout,nth); current = current->next(); if (shout == 1) { delete tmp; } } _first = current; return current; } Person* first() { return _first; } private: Person* _first; }; int main(int argc, char** argv) { int ITER = 1000000; Chain* chain; struct timeval start, end; gettimeofday(&start,NULL); for(int i = 0 ; i < ITER ; i++) { chain = new Chain(40); chain->kill(3); delete chain; } gettimeofday(&end,NULL); fprintf(stdout,"Time per iteration = %d microsecondsnr", (((end.tv_sec - start.tv_sec) * 1000000) + (end.tv_usec - start.tv_usec)) / ITER); //fprintf(stdout,"Last man standing is %dnr", (chain->first()->count() + 1)); return 0; }
Python
class Person(object):
def __init__(self,count):
self.count = count;
self.prev = None
self.next = None
def shout(self,shout,deadif):
if (shout < deadif): return (shout + 1)
self.prev.next = self.next
self.next.prev = self.prev
return 1
class Chain(object):
def __init__(self,size):
self.first = None
last = None
for i in range(size):
current = Person(i)
if self.first == None : self.first = current
if last != None :
last.next = current
current.prev = last
last = current
self.first.prev = last
last.next = self.first
def kill(self,nth):
current = self.first
shout = 1
while current.next != current:
shout = current.shout(shout,nth)
current = current.next
self.first = current
return current
import time
ITER = 100000
start = time.time()
for i in range(ITER):
chain = Chain(40)
chain.kill(3)
end = time.time()
print 'Time per iteration = %s microseconds ' % ((end - start) * 1000000 / ITER)
Ruby
class Person
attr_reader :count, :prev, :next
attr_writer :count, :prev, :next
def initialize(count)
#puts 'Initializing person : ' + count.to_s()
@count = count
@prev = nil
@next = nil
end
def shout(shout, deadif)
if shout < deadif
return shout + 1
end
@prev.next = @next
@next.prev = @prev
return 1
end
end
class Chain
attr_reader :first
attr_writer :first
def initialize(size)
@first = nil
last = nil
for i in (1..size)
current = Person.new(i)
if @first == nil
@first = current
end
if last != nil
last.next = current
current.prev = last
end
last = current
end
@first.prev = last
last.next = @first
end
def kill(nth)
current = @first
shout = 1
while current.next != current
shout = current.shout(shout,nth)
current = current.next
end
@first = current
return current
end
end
ITER=100000
start = Time.now
ITER.times { |i|
chain = Chain.new(40)
chain.kill(3)
}
ends = Time.now
puts 'Time per iteration = ' + ((ends - start) * 1000000 / ITER).to_s() + " microseconds"
Groovy
class Chain
{
def size
def first
def init(siz)
{
def last
size = siz
for(def i = 0 ; i < siz ; i++)
{
def current = new Person()
current.count = i
if (i == 0) first = current
if (last != null)
{
last.next = current
}
current.prev = last
last = current
}
first.prev = last
last.next = first
}
def kill(nth)
{
def current = first
def shout = 1
while(current.next != current)
{
shout = current.shout(shout,nth)
current = current.next
}
first = current
}
}
class Person
{
def count
def prev
def next
def shout(shout,deadif)
{
if (shout < deadif)
{
return (shout + 1)
}
prev.next = next
next.prev = prev
return 1
}
}
def main(args)
{
println "Starting"
def ITER = 100000
def start = System.nanoTime()
for(def i = 0 ; i < ITER ; i++)
{
def chain = new Chain()
chain.init(40)
chain.kill(3)
}
def end = System.nanoTime()
println "Total time = " + ((end - start)/(ITER * 1000)) + " microseconds"
}
def ITER = 100000
def start = System.nanoTime()
for(def i = 0 ; i < ITER ; i++)
{
def chain = new Chain()
chain.init(40)
chain.kill(3)
}
def end = System.nanoTime()
println "Time per iteration = " + ((end - start)/(ITER * 1000)) + " microseconds"
PHP
class Person
{
function __construct($c)
{
$this->count = $c;
}
function getPrev()
{
return $this->prev;
}
function setPrev($pr)
{
$this->prev = $pr;
}
function getNext()
{
return $this->next;
}
function setNext($nxt)
{
$this->next = $nxt;
}
function shout($shout, $nth)
{
if ($shout < $nth)
{
return $shout + 1;
}
$this->getPrev()->setNext($this->getNext());
$this->getNext()->setPrev($this->getPrev());
return 1;
}
}
class Chain
{
var $first;
function __construct($size)
{
for($i = 0; $i < $size ; $i++)
{
$current = new Person($i);
if ($this->first == null) $this->first = $current;
if ($last != null)
{
$last->setNext($current);
$current->setPrev($last);
}
$last = $current;
}
$this->first->setPrev($last);
$last->setNext($this->first);
}
function kill($nth)
{
$current = $this->first;
$shout = 1;
while($current->getNext() !== $current)
{
$shout = $current->shout($shout,$nth);
$current = $current->getNext();
}
$this->first = $current;
}
}
$start = microtime(true);
$ITER = 100000;
for($i = 0 ; $i < $ITER ; $i++)
{
$chain = new Chain(40);
$chain->kill(3);
}
$end = microtime(true);
printf("Time per iteration = %3.2f microsecondsnr",(($end - $start) * 1000000 / $ITER));
No related posts.


Out of curiosity I just ran some of the tests on my own machine using your code. My results (with versions):
python – version 2.5.2:
Time per iteration = 137.559359074 microseconds
ruby – 1.8.6 (2007-09-24 patchlevel 111) [x86_64-linux]
Time per iteration = 321.06325 microseconds
PHP – 5.2.4-2ubuntu5.1 with Suhosin-Patch 0.9.6.2 (cli):
Time per iteration = 322.66 microseconds
Obviously the numbers will be different on different setups, but it’s interesting that for me, results with ruby and python were similar to yours, but the PHP one was much faster, and comparable to the Ruby result.
I’m running Ubuntu Hardy (desktop), 2GB RAM, Intel Core 2 Quad Q6600.
I would LOVE to see the results using Groovy 1.6 beta/snapshot. The Groovy guys have implemented the same technology (call site caching) that gives jruby such an edge in this scenario here.
It would definetly be interesing to see those numbers.
Nice work so far.
@joe
From wikipedia : Cēterīs paribus is a Latin phrase, literally translated as “with other things the same.” It is commonly rendered in English as “all other things being equal.”
The exercise is an exercise simultaneously in benchmarking and comparison. It is not an exercise in absolute performance improvement.
This exercise in comparison would be pointless if the algorithms themselves are variable. Often the algorithm might in many cases may make a far bigger impact than the language. The results in case of such an exercise would tell you even lesser about comparative performance across languages than this exercise. One way to conduct this exercise is to take up the algorithm you’ve suggested and implement it across all languages.
BTW If performance alone was critical, I would actually never even attempt to benchmark after C++ and Java – it would’ve been an obviously worthless exercise. The important message one gets out of this kind of exercise is to get a “sense” of the comparative performance so that one can exercise more informed judgement, and I really fail to see how a highly sophisticated algorithm is likely to give you any better sense of comparative performance than this exercise does (though I am certain the results themselves will NOT be identical – which is why one should always use many microbenchmarks not just one).
On a separate note, BAD algo is a very subjective point of view. In fact I would argue that in most real life stuations you would not always be able to use a int array as you have suggested, because the person class may have many more attributes than its id alone. If you realise, the algo I have used is not only more readable and maintainable, but is lesser likely to get thrown out and replaced with a different one as the problem domain gets more complex.
Is this the only way to do a language performance comparison – Certainly not. But is this any more or less flawed way than doing a comparison based on a different algorithm – I dont think so.
ps:Also, you may want to not that the Bootup of Java is far more latent.
small python tip: use xrange instead of range,range return a list which need to be allocated and initialized,instead of xrange which return a iterator.
also a thing that kills dynamic languages is attribute lookup,this is do that most of dynamic language implement objects using dictionaries so using static functions with int array is the fastest way for dynamic languages.
about mod operator being slow,if mod is a power of 2 then x % 2^n = x (2^n-1) so 5 % 4 is 5 3.
I would LOVE to see the results using Groovy 1.6 beta/snapshot. The Groovy guys have implemented the same technology (call site caching) that gives jruby such an edge in this scenario here.
It would definetly be interesing to see those numbers.
Nice work so far.
D language: 8 microseconds. (Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz), not bad.
// compile using: gdmd -O -inline -release josephus.d
module josephus;
import std.stdio;
import std.date;
final class Chain {
private Person first_;
this(int size) {
Person last;
for (int i = 0; i size; i++) {
Person current = new Person(i);
if (first is null) {
first = current;
}
if (last !is null) {
last.next = current;
current.prev = last;
}
last = current;
}
first.prev = last;
last.next = first;
}
Person kill(int nth) {
Person current = first;
int shout = 1;
while (current.next !is current) {
shout = current.shout(shout, nth);
current = current.next;
}
first = current;
return current;
}
Person first() {
return first_;
}
void first(Person first__) {
first_ = first__;
}
}
final class Person {
int count_;
private Person prev_;
private Person next_;
this(int count__) {
count_ = count__;
}
int shout(int shout, int deadif) {
if (shout deadif) {
return shout + 1;
}
prev.next = next;
next.prev = prev;
return 1;
}
int count() { return count_; }
Person prev() { return prev_; }
void prev(Person prev__) { prev_ = prev__; }
Person next() { return next_; }
void next(Person next__) { next_ = next__; }
}
void main(char[][] args) {
int ITER = 100000;
auto start = getUTCtime();
for (int i = 0; i ITER; i++) {
auto chain = new Chain(40);
chain.kill(3);
}
auto end = getUTCtime();
writefln(“Time per iteration = %.3f ms”, 1.0e3f*(end – start)/ITER / TicksPerSecond);
}
Good article; Here are a few observations comparing Java and C++.
1) The performance tests do not take into account
“Software Scalable Solutions”. In short; assume you need
to perform these operations for 100′s of millions of iterations
in a Super Static State maintaing a Global Context; and Hot Deployment. How would you even be able to do this w/C++, Corba?
2) The Java Code is not optimized; To start :
——–Latency———-
Chain chain = new Chain(40);
——–should be———-
Chain chain =null;
for (int i = 0 ; i < ITER ; i++)
{
chain = new Chain(40);
——–and———-
This
{
if (first == null) first = current;
if (last != null)
{
last.setNext(current);
current.setPrev(last);
}
}
Is can be written w/far less cycles; (eg:continue;else,etc…_)
ps:Also, you may want to not that the Bootup of Java is far more latent.
Small “optimisation”: 6 microseconds after disabling garbage collector. It was straith Java-D port, nothing special to D there.
small python tip: use xrange instead of range,range return a list which need to be allocated and initialized,instead of xrange which return a iterator.
also a thing that kills dynamic languages is attribute lookup,this is do that most of dynamic language implement objects using dictionaries so using static functions with int array is the fastest way for dynamic languages.
about mod operator being slow,if mod is a power of 2 then x % 2^n = x & (2^n-1) so 5 % 4 is 5 & 3.
D language: 8 microseconds. (Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz), not bad.
// compile using: gdmd -O -inline -release josephus.d
module josephus;
import std.stdio;
import std.date;
final class Chain {
private Person first_;
this(int size) {
Person last;
for (int i = 0; i < size; i++) {
Person current = new Person(i);
if (first is null) {
first = current;
}
if (last !is null) {
last.next = current;
current.prev = last;
}
last = current;
}
first.prev = last;
last.next = first;
}
Person kill(int nth) {
Person current = first;
int shout = 1;
while (current.next !is current) {
shout = current.shout(shout, nth);
current = current.next;
}
first = current;
return current;
}
Person first() {
return first_;
}
void first(Person first__) {
first_ = first__;
}
}
final class Person {
int count_;
private Person prev_;
private Person next_;
this(int count__) {
count_ = count__;
}
int shout(int shout, int deadif) {
if (shout < deadif) {
return shout + 1;
}
prev.next = next;
next.prev = prev;
return 1;
}
int count() { return count_; }
Person prev() { return prev_; }
void prev(Person prev__) { prev_ = prev__; }
Person next() { return next_; }
void next(Person next__) { next_ = next__; }
}
void main(char[][] args) {
int ITER = 100000;
auto start = getUTCtime();
for (int i = 0; i < ITER; i++) {
auto chain = new Chain(40);
chain.kill(3);
}
auto end = getUTCtime();
writefln(“Time per iteration = %.3f ms”, 1.0e3f*(end – start)/ITER / TicksPerSecond);
}
Small “optimisation”: 6 microseconds after disabling garbage collector. It was straith Java->D port, nothing special to D there.
Very nice article, had no idea that java is that fast. I had the impression that is should be about 10 times slower than C++…
As I work mainly with Flash and PHP, I was very curious to see how ActionScript3 behaves. Good news, it’s pretty fast.
Can’t wait for an open source server that uses AS3. Maybe an apache module?
here are my results:
)
AS3 (Activex player 9,0,124,0, nondebug): 50 microseconds
AS3 (Stand-alone player, 9.0.115.0, debug): 150 microseconds
PHP (apache module): 600 microseconds
(on the same machine – Core2 Duo 2.4GHz, php running into vmware/debian. If I run the php version in windows, it gets worse
the AS3 version of the program is on my blog http://www.victordramba.com/?p=16
Hi again Dhananjay,
I made a slightly different version using izip() and count(). It is no fasterbut might be even easier to show how the python list is used as a circular list.
It also has the two lines needed for optimising with psyco.
..from itertools import izip, count
..import psyco
..psyco.full()
..def findlast2(chainlength = 40, kill = 3):
……n, c = 0, range(1,chainlength + 1)
……while len(c)1:
……….#print “n=%3i, (n+1)%%3=%i, c=%s” % (n, (n+1)%3, c)
……….c = [x for n,x in izip(count(n+1),c) if n % 3]
……#print “n=%3i, (n+1)%%3=%i, c=%s” % (n, (n+1)%3, c)
……return c
Again, if you un-comment the print statements and call findlast2() you get a better insight into the algorithm as it will return:
findlast2()
n= 0, (n+1)%3=1, c=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40]
n= 40, (n+1)%3=2, c=[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 34, 35, 37, 38, 40]
n= 67, (n+1)%3=2, c=[1, 4, 5, 8, 10, 13, 14, 17, 19, 22, 23, 26, 28, 31, 32, 35, 37, 40]
n= 85, (n+1)%3=2, c=[1, 5, 8, 13, 14, 19, 22, 26, 28, 32, 35, 40]
n= 97, (n+1)%3=2, c=[1, 8, 13, 19, 22, 28, 32, 40]
n=105, (n+1)%3=1, c=[1, 13, 19, 28, 32]
n=110, (n+1)%3=0, c=[1, 13, 28, 32]
n=114, (n+1)%3=1, c=[13, 28]
n=116, (n+1)%3=0, c=[13, 28]
n=118, (n+1)%3=2, c=[28]
[28]
Hi again Dhananjay,
I made a slightly different version using izip() and count(). It is no fasterbut might be even easier to show how the python list is used as a circular list.
It also has the two lines needed for optimising with psyco.
..from itertools import izip, count
..import psyco
..psyco.full()
..def findlast2(chainlength = 40, kill = 3):
……n, c = 0, range(1,chainlength + 1)
……while len(c)>1:
……….#print “n=%3i, (n+1)%%3=%i, c=%s” % (n, (n+1)%3, c)
……….c = [x for n,x in izip(count(n+1),c) if n % 3]
……#print “n=%3i, (n+1)%%3=%i, c=%s” % (n, (n+1)%3, c)
……return c
Again, if you un-comment the print statements and call findlast2() you get a better insight into the algorithm as it will return:
>>> findlast2()
n= 0, (n+1)%3=1, c=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40]
n= 40, (n+1)%3=2, c=[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 34, 35, 37, 38, 40]
n= 67, (n+1)%3=2, c=[1, 4, 5, 8, 10, 13, 14, 17, 19, 22, 23, 26, 28, 31, 32, 35, 37, 40]
n= 85, (n+1)%3=2, c=[1, 5, 8, 13, 14, 19, 22, 26, 28, 32, 35, 40]
n= 97, (n+1)%3=2, c=[1, 8, 13, 19, 22, 28, 32, 40]
n=105, (n+1)%3=1, c=[1, 13, 19, 28, 32]
n=110, (n+1)%3=0, c=[1, 13, 28, 32]
n=114, (n+1)%3=1, c=[13, 28]
n=116, (n+1)%3=0, c=[13, 28]
n=118, (n+1)%3=2, c=[28]
[28]
>>>
A shorter version of the Java code is.
public class Chain {
private List people = new ArrayList();
public Chain(int size) {
for(int i=1;i1)
for (int i = 0; i people.size(); i+=nth)
people.remove(i);
return people.get(0);
}
public static void main(String[] args) {
int ITER = 100000;
long start = System.nanoTime();
for (int i = 0; i ITER; i++) {
Chain chain = new Chain(40);
chain.kill(3);
}
long end = System.nanoTime();
System.out.println(“Time per iteration = ” + ((end – start) / (ITER)) + ” nanoseconds.”);
}
}
Its appears that gt and lt get manged as per usual.
Hi,
You concentrate on the speed of each implementation but speed is no good if it gives wrong results.
How do you know you are getting the right result?
I’ve walked through my algorithms partial results as you can see and know that I have to return 28. It was straightforward to do in Pythons shell.
On researching the problem further I found this page: http://mathworld.wolfram.com/JosephusProblem.html. Mathworld seems to be good at maths problems and I stuck their 41,3 sized problem into my algorithm with the print statements un-commented and found that my algo gave the same results of the last person being numbered 31 and the second to last being 16.
And on that note I then tried to replicate the triangular table of results numbered (3) on the Wolfram page and came a cropper. I have to withdraw findlast() and make the following modification to findlast2() to handle the special case of choosing every soldier on order:
..def findlast2(chainlength = 40, kill = 3):
……if kill == 1:
……….return [chainlength]
……n, c = 0, range(1,chainlength + 1)
……while len(c)>1:
……….#print “n=%3i, (n+1)%%3=%i, c=%s” % (n, (n+1)%3, c)
……….c = [x for n,x in izip(count(n+1),c) if n % kill]
……#print “n=%3i, (n+1)%%3=%i, c=%s” % (n, (n+1)%3, c)
……return c
With the following loops to create the table:
..mx = 10
..print”\nTable showing last positions”
..print “\n%12s” % “kill every:”,
..for kill in range(1,mx+1):
……print “%2i” % kill,
..print
..for chain in range(1,mx+1):
……print “\n%12s” % (“chain of %i:” % chain),
……for kill in range(1,chain+1):
……….print “%2i” % findlast2(chain, kill)[0],
..print
..
I then get the following output that tallies with the Wolfram table:
..Table showing last positions
..
.. kill every: 1 2 3 4 5 6 7 8 9 10
..
.. chain of 1: 1
.. chain of 2: 2 1
.. chain of 3: 3 3 2
.. chain of 4: 4 1 1 2
.. chain of 5: 5 3 4 1 2
.. chain of 6: 6 5 1 5 1 4
.. chain of 7: 7 7 4 2 6 3 5
.. chain of 8: 8 1 7 6 3 1 4 4
.. chain of 9: 9 3 1 1 8 7 2 3 8
..chain of 10: 10 5 4 5 3 3 9 1 7 8
A quick modification of the table printing loop to use:
print “%2i” % int(findlast2(chain, kill)[0] == Chain(chain).kill(kill).count+1)
Shows that by allowing for the off by one error, Your Python solution tallies with mine.
What to glean from the above?
* Verification counts.
* Verification and exploration is easy in an interpreted language that gives concise algorithms – it is easier to “throw one away” in the quest for quality.
- Paddy.
A shorter version of the Java code is.
public class Chain {
private List people = new ArrayList();
public Chain(int size) {
for(int i=1;i1)
for (int i = 0; i < people.size(); i+=nth)
people.remove(i);
return people.get(0);
}
public static void main(String[] args) {
int ITER = 100000;
long start = System.nanoTime();
for (int i = 0; i < ITER; i++) {
Chain chain = new Chain(40);
chain.kill(3);
}
long end = System.nanoTime();
System.out.println(“Time per iteration = ” + ((end – start) / (ITER)) + ” nanoseconds.”);
}
}
Its appears that gt and lt get manged as per usual.
Hey, nice comparison, but – quick question, why did you run java with
100000; ITERATION and C++ with
1000000;
@Kuba, I had originally run Java and C++ with 1000000 and the rest with 100000 (just because of the large timing difference), somewhere along the way (and I am not sure when exactly) one of the zeroes in the java program dropped off
. I confirmed later this did not influence the results in any significant way.
Hey, nice comparison, but – quick question, why did you run java with
100000; ITERATION and C++ with
1000000;
@Kuba, I had originally run Java and C++ with 1000000 and the rest with 100000 (just because of the large timing difference), somewhere along the way (and I am not sure when exactly) one of the zeroes in the java program dropped off
. I confirmed later this did not influence the results in any significant way.