Database Benchmark

A third party created a well-made, GUI-based tool for comparing database performance and size. I used this tool for performance tuning of VelocityDB. The benchmarks are limited in scope, but for what they test they are very good, and they compare 11 different databases. VelocityDB performs nicely in comparison. Only the mysterious STSdb performs better in some cases, but STSdb has a very limited API, isn't distributable, and cannot directly store classes with relations to other classes. The benchmark lacks such cases and also doesn't test large objects. The databases in the benchmark include STSdb, Access 2007, db4objects, Firebird, H2, MS SQL Server, MySQL, Oracle BerkeleyDB, Perst, and SQLite. The official benchmark version 1.1 did not include VelocityDB, but you can test VelocityDB with this tool using the VelocityDB implementation of the benchmark tests. Benchmark version 1.1 did not use .NET 4, but you can make it work with .NET 4.0. VelocityDB files: VelocityDbBenchmarkCompressed.zip. The C# source code for the entire tool can be downloaded from here. The benchmark shown here is with a RecordCount of 250,000.

Adds [RecordCount] records to the database. The database must then allow direct access to the added records by their index - 0, 1, 2, ... RecordCount - 1.
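The contract this test expects can be sketched as follows. This is a minimal in-memory stand-in; the class and method names are illustrative assumptions, not the benchmark's actual interface:

```csharp
using System;
using System.Collections.Generic;

// Minimal sketch of the "append, then access by index" contract the test
// expects each database to satisfy. A real implementation would persist
// the records instead of holding them in a List.
public class AppendStore
{
  readonly List<string> records = new List<string>();

  // Appends a record; records receive consecutive indexes 0, 1, 2, ...
  public int Append(string record)
  {
    records.Add(record);
    return records.Count - 1; // index of the record just added
  }

  // Direct access by the index assigned at Append time.
  public string Get(int index) => records[index];

  public int Count => records.Count;
}
```

The benchmark then times RecordCount Append calls followed by direct accesses by index.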

As you can see, VelocityDB is much faster than these other database systems. Click here to see the results for the other tests in the third-party test suite.

Here a smaller bar is better! Not only is VelocityDB faster, it also produces smaller databases, especially when compression is enabled.

A user asked for a comparison with Eloquera. We implemented the first two tests using Eloquera; we can implement the remaining ones on request.

This time we used a faster PC for the benchmark, a ZT Affinity 7644Mi Desktop PC, and 5 million records were added. The source code for the Eloquera implementation is here.

Enumerate the records in the order they were added with the Append() method.


All of these database systems add the same type of Tick object; see the class definition below.

public class Tick
{
  public string Symbol { get; set; }
  public DateTime Timestamp { get; set; }
  public double Bid { get; set; }
  public double Ask { get; set; }
  public long Volume { get; set; }
  public string Exchange { get; set; }
}
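For illustration, here is a sketch of how sample Tick records might be generated and then enumerated in insertion order, as the Append/Enumerate tests require. The field values and the generator below are made up for this sketch and are not taken from the benchmark source:

```csharp
using System;
using System.Collections.Generic;

public class Tick
{
  public string Symbol { get; set; }
  public DateTime Timestamp { get; set; }
  public double Bid { get; set; }
  public double Ask { get; set; }
  public long Volume { get; set; }
  public string Exchange { get; set; }
}

public static class TickDemo
{
  // Hypothetical generator of sample ticks; the benchmark's actual
  // test data will differ.
  public static List<Tick> MakeTicks(int recordCount)
  {
    var ticks = new List<Tick>(recordCount);
    var start = new DateTime(2014, 1, 1);
    for (int i = 0; i < recordCount; i++)
      ticks.Add(new Tick
      {
        Symbol = "SYM" + (i % 100),
        Timestamp = start.AddSeconds(i), // strictly increasing timestamps
        Bid = 100.0 + i * 0.01,
        Ask = 100.05 + i * 0.01,
        Volume = 1000 + i,
        Exchange = "NYSE"
      });
    return ticks;
  }
}
```

Enumerating the returned list visits the ticks in the order they were appended, which is the behavior the enumeration test measures against each database.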

MongoDB Comparison

The official benchmark has changed, and MongoDB has been added. This test adds 1 million 8-byte keys with 32-byte values, sorted by key.
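The key/value shape of this test can be sketched like this. It is an in-memory stand-in using a SortedDictionary; the real benchmark stores the pairs in each database:

```csharp
using System;
using System.Collections.Generic;

public static class KeyValueDemo
{
  // Builds [count] pairs of 8-byte keys (represented as a ulong) and
  // 32-byte values, kept sorted by key. SortedDictionary is only an
  // in-memory stand-in for the database under test.
  public static SortedDictionary<ulong, byte[]> BuildPairs(int count)
  {
    var rnd = new Random(42); // fixed seed so runs are repeatable
    var pairs = new SortedDictionary<ulong, byte[]>();
    while (pairs.Count < count)
    {
      var keyBytes = new byte[8];
      rnd.NextBytes(keyBytes);
      ulong key = BitConverter.ToUInt64(keyBytes, 0);
      var value = new byte[32];
      rnd.NextBytes(value);
      pairs[key] = value; // duplicate keys simply overwrite
    }
    return pairs;
  }
}
```

Enumerating the dictionary yields the pairs in ascending key order, matching the "sorted by key" requirement of the test.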

New Benchmark 2.0 Comparisons

The benchmark group has added a few new databases to their benchmark. We tried a few of them between January 28 and 30, 2014, and updated the results on February 15, 2014, with VelocityDB 3.5 (improved). The following two charts are for 3,000,000 records.

A few more databases were added. The following two charts are for 1,000,000 records.

Triangle Counter

This test idea started with a Vertica comparison with Hadoop and PIG. Given 86,220,856 edge tuples, compute the number of triangles that can be formed from these edges. We have not yet been able to run the Vertica, Hadoop, or PIG solutions due to build issues (a missing build.xml). Note that VelocityDB needs only about 387 MB to store the data for this test, while Hadoop is claimed to need 160 GB! The same test was also performed using Oracle on a very expensive system, and after the Oracle publication came a Kognitio press release claiming to beat the other databases. We ran a VelocityDB version of this test on a Gateway SX2800 C2Q desktop PC purchased refurbished for $329 in 2009, plus a $30 memory upgrade in December 2011. Using this low-end hardware with Windows 7 and 8 GB of memory, we most likely outperform all the other databases. On the Gateway PC, VelocityDB counts the triangles between the 4,846,609 nodes in the test data in about 99 seconds, plus 30 seconds for loading the database using 4 threads. With the more powerful ZT Affinity 7644Mi desktop PC (made in the USA), we see improved numbers: 42 seconds plus 14 seconds to load the database. The same calculation on a very powerful server took Hadoop 13,140 s, PIG 9,540 s, and Vertica 287 s. We will test on a big server as soon as possible, and we are confident we will be better than or near the Oracle and Kognitio results when utilizing 96 GB of memory and 12 or more fast cores (CPUs). If you have access to such a powerful Windows machine, please help us out by running the VelocityDB version of the triangle counter test. We want to prove that we are #1!

using System;
using System.Runtime;
using System.Collections.Generic;
using System.Linq;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using VelocityDb;
using VelocityDb.Collection.BTree;
using VelocityDb.Session;

namespace TriangleCounter
{
  class TriangleCounter
  {
    static readonly string systemDir = System.IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments), "VelocityDB", "Databases", "TriangleCounter");

    static int discoverTrianglesSingleCore(BTreeMap<int, int[]> edges)
    {
      int triangles = 0;
      BTreeMapIterator<int, int[]> edgesItr = edges.Iterator();
      while (edgesItr.MoveNext())
      {
        int nodeId = edgesItr.CurrentKey();
        int[] edge = edgesItr.CurrentValue();
        int stop = edge.Length - 1;
        int i = stop;
        int edgeToStart, edgeTo;
        int pos;
        while (i >= 0)
        {
          int[] edgeInfo2;
          edgeToStart = edge[i--];
          if (nodeId < edgeToStart)
          {
            if (edges.TryGetValue(edgeToStart, out edgeInfo2))
            {
              for (int j = stop; j >= i; j--)
              {
                edgeTo = edge[j];
                if (edgeToStart < edgeTo)
                {
                  pos = Array.BinarySearch<int>(edgeInfo2, edgeTo);
                  if (pos >= 0)
                  { // we know this one is connected to edgeInfo.From because it is part of edgeInfo.To
                    triangles++;
                  }
                }
                else
                  break;
              }
            }
          }
          else
            break;
        }
      }
      return triangles;
    }

    static void Main(string[] args)
    {
      long triangles = 0;
      try
      {
        using (SessionNoServer session = new SessionNoServer(systemDir))
        {
          int numberOfWorkerThreads = -1; // -1 lets the system pick the number of threads
          if (args.Length > 0)
          {
            if (!int.TryParse(args[0], out numberOfWorkerThreads))
            {
              Console.WriteLine("First parameter is numberOfWorkerThreads which must be an Int32");
              numberOfWorkerThreads = -1; // TryParse sets it to 0 on failure; restore the default
            }
          }
          Placement btreePlace = new Placement(40, 1, 1, 10000, 65500, true);
          Placement edgeInfoPlace = new Placement(40, 10000, 1, 10000, 65500, true);
          session.BeginUpdate();
          BTreeMap<int, int[]> edges;
          BTreeMapIterator<int, int[]> edgesItr;
          int[] edge = null;
          Database edgeDb = session.OpenDatabase(40, false, false);
          if (edgeDb != null)
          {
            session.Commit();
            session.BeginRead();
            edges = (BTreeMap<int, int[]>)session.Open(40, 1, 1, false);
          }
          else
          {
            DatabaseLocation location = session.DatabaseLocations.Default();
            //location.CompressPages = false; // no compression should make it faster (?) 
            session.NewDatabase(40, 395, "Edges");
            edges = new BTreeMap<int, int[]>(null, session, 6000);
            edges.Persist(btreePlace, session, true);
            edgesItr = edges.Iterator();
            using (StreamReader stream = new StreamReader("c:\\edges.txt", true))
            {
              int a;
              int b;
              string line;
              string[] fields;
              while ((line = stream.ReadLine()) != null)
              {
                fields = line.Split(' ');
                if (!int.TryParse(fields[0], out a))
                  break;
                b = int.Parse(fields[1]);
                if (a != b)
                {
                  if (edgesItr.CurrentKey() == a || edgesItr.GoTo(a))
                  {
                    edge = edgesItr.CurrentValue();
                    Array.Resize(ref edge, edge.Length + 1);
                    edge[edge.Length - 1] = b;
                    edgesItr.ReplaceValue(ref edge); // we need to update the value in the BTreeMap
                  }
                  else
                  {
                    edge = new int[1];
                    edge[0] = b;
                    edges.Add(a, edge);
                  }
                }
              }
            }
            edgesItr = edges.Iterator();
            while (edgesItr.MoveNext())
            {
              edge = edgesItr.CurrentValue();
              Array.Sort(edge);
              edgesItr.ReplaceValue(ref edge);
            }
            session.Commit();
            session.BeginRead();
          }
          Console.WriteLine("Number of Nodes found: " + edges.Count);
          if (numberOfWorkerThreads > 0)           
            Console.WriteLine("Start of triangle discovery using " + numberOfWorkerThreads + " threads, time is " + DateTime.Now);
          else if (numberOfWorkerThreads < 0)
            Console.WriteLine("Start of triangle discovery using system automatically selected number of threads, time is " + DateTime.Now);
          else
            Console.WriteLine("Start of triangle discovery using main thread, time is " + DateTime.Now);

          // Start counting triangles !
          if (numberOfWorkerThreads != 0)
          {
            edgesItr = edges.Iterator();
            ParallelOptions pOptions = new ParallelOptions();
            pOptions.MaxDegreeOfParallelism = numberOfWorkerThreads;
            // First type parameter is the type of the source elements
            // Second type parameter is the type of the local data (subtotal)
            Parallel.ForEach<KeyValuePair<int, int[]>, long>(edges, // source collection
              pOptions,
              () => 0, // method to initialize the local variable
              (pair, loop, subtotal) => // method invoked by the loop on each iteration
              {
                int nodeId = pair.Key;
                int[] nodeTo = pair.Value;
                int stop = nodeTo.Length - 1;
                int i = stop;
                int edgeToStart, edgeTo;
                int pos;
                while (i >= 0)
                {
                  int[] edgeInfo2;
                  edgeToStart = nodeTo[i--];
                  if (nodeId < edgeToStart)
                  {
                    if (edges.TryGetValue(edgeToStart, out edgeInfo2))
                    {
                      for (int j = stop; j >= i; j--)
                      {
                        edgeTo = nodeTo[j];
                        if (edgeToStart < edgeTo)
                        {
                          pos = Array.BinarySearch<int>(edgeInfo2, edgeTo);
                          if (pos >= 0)
                          { // we know this one is connected to edgeInfo.From because it is part of edgeInfo.To
                            subtotal++;
                          }
                        }
                        else
                          break;
                      }
                    }
                  }
                  else
                    break;
                }
                return subtotal;
              },
              // Method to be executed when all loops have completed.
              // finalResult is the final value of subtotal. supplied by the ForEach method.
              (finalResult) => Interlocked.Add(ref triangles, finalResult));
          }
          else
            triangles = discoverTrianglesSingleCore(edges);

          session.Commit();

        }
        Console.WriteLine("Number of Triangles found: " + triangles);
      }
      catch (Exception e)
      {
        System.Console.WriteLine(e);
      }
    }
  }
}

Computing the Kevin Bacon Numbers

This sample shows a problem that is easy to solve with VelocityDB but may be difficult to solve using only SQL; in any case, the VelocityDB solution will be faster!

Other database vendors use this same example; see HPCC Systems and Neo4J.

It is a problem typically assigned to graph databases, but as you can see it is easily handled by VelocityDB, and the performance is unbeatable. We look forward to seeing numbers to compare against.
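At its core, a Kevin Bacon number is just the breadth-first shortest path from Kevin Bacon over the co-star graph. A minimal in-memory sketch of that computation (the adjacency list here is a toy example, not the movie data set, and the method names are our own):

```csharp
using System;
using System.Collections.Generic;

public static class BaconNumber
{
  // Breadth-first search from a start actor; returns the shortest number
  // of co-star hops (the "Bacon number") to each reachable actor.
  // graph maps each actor to the list of actors they share a movie with.
  public static Dictionary<string, int> Compute(
    Dictionary<string, List<string>> graph, string start)
  {
    var distance = new Dictionary<string, int> { [start] = 0 };
    var queue = new Queue<string>();
    queue.Enqueue(start);
    while (queue.Count > 0)
    {
      string actor = queue.Dequeue();
      foreach (string coStar in graph[actor])
        if (!distance.ContainsKey(coStar)) // first visit is the shortest path
        {
          distance[coStar] = distance[actor] + 1;
          queue.Enqueue(coStar);
        }
    }
    return distance;
  }
}
```

A persistent solution replaces the in-memory dictionary with database-backed collections so the graph need not fit in RAM; the traversal logic stays the same.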

See our sample page for further info.