regex-dna benchmark hypotheses non fingo

Match DNA 8-mers and substitute nucleotides for IUB codes (~5MB input, N=500,000).

Which programs use the least code? Which programs use highly optimised assembly code libraries? Which programs make use of all the processor cores?

    ×  Program & Logs            CPU secs  Elapsed secs  Memory KB  Code B  ~ CPU Load
  1.0  Tcl #2                        3.30                   25,004     373
  1.7  OCaml #2                      5.54                   45,364     615
  1.7  C++ GNU g++ #3                5.58                   12,704    1588
  1.8  Intel #2                      5.93                   13,088    1099
  1.8  Python CPython                6.04                   20,516     342
  1.8  GNU gcc #2                    6.09                   13,360    1099
  1.9  Pike #2                       6.32                   12,792     472
  1.9  Python Psyco                  6.43                   19,576     355
  2.0  BASIC FreeBASIC               6.48                   99,864    1106
  2.0  Lisaac                        6.55                   26,132    1299
  2.2  C++ GNU g++ #2                7.24                   19,832     635
  2.3  Digital Mars #2               7.54                   86,596     506
  2.3  Java 6 -Xms64m #4             7.60                   75,192     921
  2.4  Java 6 -server #4             7.81                   75,892     921
  2.5  Python IronPython #2          8.32                  156,072     314
  2.8  PHP #2                        9.20                  106,752     675
  2.8  Scheme PLT                    9.35                  119,724     835
  2.8  Ruby MRI                      9.37                   81,148     323
  3.1  CAL                          10.33                   98,312    1334
  3.2  Ruby 1.9                     10.43                   31,844     323
  3.2  Scheme PLT #2                10.59                  123,968     669
  3.3  Java 6 -client #4            10.86                   69,644     921
  3.3  Nice #2                      10.87                   94,196     637
  3.3  Lisp SBCL                    10.88                  276,452     586
  3.6  Java 1.4 -server             11.79                   73,376     657
  3.6  Perl #4                      11.81                   21,408     431
  3.8  Scala                        12.41                   92,956     663
  4.2  Perl #2                      13.71                   24,852     449
  4.3  Ada 2005 GNAT #4             14.08                   14,496    1352
  5.2  JavaScript SpiderMonkey      17.16                  180,280     365
  5.8  Forth bigForth #2            19.01                   54,800     762
  6.9  Groovy                       22.73                  112,176     366
  7.3  Eiffel SmartEiffel           24.07                   25,576     767
  7.6  C# Mono #3                   25.11                  124,284     607
  7.7  Python IronPython            25.29                  170,448     342
  7.7  C# Mono                      25.41                  157,420     624
  8.3  Ruby JRuby                   27.52                  186,520     323
   17  Ada 2005 GNAT #3             54.54                   42,444    1233
   20  Erlang HiPE #3               66.08                  116,636     687
   20  Smalltalk VisualWorks        66.74                   44,092     584
   39  Java 6 -Xint #4             130.26                   66,972     921
  115  Java GNU gcj                 6 min                  261,312     657
       CINT                        Failed                             1101
       C++ Intel #2                Failed                              635
       C++ Intel #3                Failed                             1588
       Digital Mars #3             Failed                             1022
       JavaScript Rhino            Failed                              592
       Mozart/Oz #2             Timed Out                              589
       Pike                     Timed Out                              892
       Scheme Chicken              Failed                              961
       Smalltalk GNU               Failed                              502
interesting alternative programs
    ×  Program & Logs            CPU secs  Elapsed secs  Memory KB  Code B  ~ CPU Load
       Pascal Free Pascal #2       Failed                             1074
       Fortran G95                 Failed                             2425
  0.3  Pascal Free Pascal #3         1.10          0.00     19,604    2932
  0.4  Perl #6                       1.23          0.00     22,120     471
  0.4  Perl #3                       1.26          0.00     31,584     440
  0.6  Icon #2                       1.95          0.00     51,244     770
  0.7  Perl #5                       2.36          0.00     24,760     479
  0.9  Python CPython #2             2.88          0.00     23,000     314
  1.0  Perl                          3.34          0.00     21,956     426
  1.3  Fortran Intel                 4.18          0.00     13,996    2425
  1.3  Pascal Free Pascal            4.37          0.00     11,436    1199
  1.4  CAL #2                        4.67          0.00     67,648    1471
  2.0  Digital Mars #4               6.62          0.00     71,692     488
  2.3  Ruby MRI #2                   7.62          0.00     42,824     396
  2.4  Lua LuaJIT #3                 7.88          0.00     34,368     427
  3.0  Lua #3                        9.89          0.00     32,908     427
  5.2  C# Mono #4                   17.31          0.00     64,708     657
missing programs
Clean No program
F# Mono No program
Forth GNU GForth No program
Fortran G95 No program
Fortran Intel No program
Haskell GHC No program
Icon No program
Io No program
Lua No program
Lua LuaJIT No program
Mercury No program
Oberon-2 OO2C No program
Pascal Free Pascal No program
Prolog SWI No program
Prolog YAP No program
Rebol No program
Scheme Ikarus No program
Smalltalk Squeak No program
SML MLton No program
SML SML/NJ No program
Zonnon Mono No program
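
The × column is not defined on this page. The figures are consistent with each program's CPU seconds divided by the 3.30 CPU seconds of the fastest main-table program (Tcl #2), rounded; that reading is an inference from the numbers, not something the page states. A minimal Python sketch of that assumed calculation, with a hypothetical helper name:

```python
# Assumption: the × column is CPU secs relative to the fastest
# main-table entry (Tcl #2 at 3.30 CPU secs). Inferred, not documented.
BEST_CPU_SECS = 3.30

def cpu_ratio(cpu_secs, best_secs=BEST_CPU_SECS):
    """Return cpu_secs as a multiple of best_secs, to one decimal place."""
    return round(cpu_secs / best_secs, 1)

print(cpu_ratio(5.54))  # OCaml #2 -> 1.7
print(cpu_ratio(1.10))  # Pascal Free Pascal #3 -> 0.3
```

The page shows larger ratios (17, 20, 39, 115) without a decimal place; this sketch does not reproduce that formatting.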

regex-dna benchmark: Match DNA 8-mers and substitute nucleotides for IUB codes

Diff your program's output for this 100KB input file (generated with the fasta program, N = 10000) against this output file to check that your program is correct before contributing.
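
The same check can be scripted instead of running a command-line diff. The sketch below uses Python's standard difflib; the function name and the "expected"/"actual" labels are hypothetical, and the two strings stand in for the contents of the reference output file and your program's output.

```python
import difflib

def output_diff(expected, actual):
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(difflib.unified_diff(
        expected.splitlines(keepends=True),
        actual.splitlines(keepends=True),
        fromfile="expected",
        tofile="actual",
    ))

# Usage: read both files into strings, then
#   problems = output_diff(reference_text, program_text)
# and treat any non-empty result as a failed check.
```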

We use FASTA files generated by the fasta benchmark as input for this benchmark. Note: the file may include both lowercase and uppercase codes.
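
Because the input can mix lowercase and uppercase codes, the 8-mer match has to ignore case. The Python sketch below illustrates the two operations named in the benchmark title. The specific 8-mer (with its reverse complement) and the function names are assumptions for illustration, since this page does not list the required patterns; the substitution table is a subset of the standard IUB nucleotide ambiguity codes.

```python
import re

# Hypothetical 8-mer and its reverse complement, matched case-insensitively.
VARIANT = re.compile(r"agggtaaa|tttaccct", re.IGNORECASE)

# Subset of the standard IUB ambiguity codes, expanded to alternations.
IUB = {
    "B": "(c|g|t)",
    "D": "(a|g|t)",
    "K": "(g|t)",
    "M": "(a|c)",
    "N": "(a|c|g|t)",
}
IUB_RE = re.compile("|".join(IUB))

def count_matches(seq):
    """Count non-overlapping occurrences of the 8-mer in either case."""
    return len(VARIANT.findall(seq))

def expand_iub(seq):
    """Replace each uppercase IUB code with its alternation in one pass."""
    return IUB_RE.sub(lambda m: IUB[m.group(0)], seq)

print(count_matches("ccAGGGTAAAggtttaccct"))  # 2
print(expand_iub("aBcN"))                     # a(c|g|t)c(a|c|g|t)
```

A single compiled alternation keeps the substitution to one pass over the sequence, so text inserted by an earlier replacement is never rescanned.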

Each program should

Revised BSD license