Bombarded Database = Angry Database

A big challenge in designing DAQifi's desktop app is that we have to save a large amount of data, and we have to do it quickly.  Our Nyquist board has up to 20 input channels, and each channel can be sampled at 1000Hz.  That's 20,000 individual data samples per second streaming wirelessly from the board…over a million individual samples per minute!
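Spelled out (the constant names below are just for illustration):

const int Channels = 20;
const int SampleRateHz = 1000;
const int SamplesPerSecond = Channels * SampleRateHz; //20,000 samples every second
const int SamplesPerMinute = SamplesPerSecond * 60;   //1,200,000 samples every minute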

FAILED ATTEMPT #1

At first we took the naive approach of saving data to the database as soon as we received it.  Our logging code looked similar to the code below.

using (LoggingContext context = new LoggingContext())
{
     //Add data sample
     context.Samples.Add(channel.ActiveSample);
     
     //Commit changes to the database
     context.SaveChanges();
}

This worked well for low sampling rates, but completely fell apart at higher ones.  The problem was that the database didn't appreciate being bombarded with 20,000 transactions a second, and we don't blame it!  It simply couldn't keep up.

FAILED ATTEMPT #2

After much research into the trials and tribulations others have faced, a common theme emerged: it's best to do fewer but larger transactions.  Sure enough, the internet was right!  Instead of immediately saving to the database, we buffered data in memory and wrote to the database once a second.  This drastically improved our throughput.  We even found some tips and tricks about turning off AutoDetectChanges in Entity Framework, which helped speed things up.

if (_stopwatch.ElapsedMilliseconds > _lastUpdateMilliSeconds + 1000)
{
     using (LoggingContext context = new LoggingContext())
     {
          context.Configuration.AutoDetectChangesEnabled = false;
          
          //Add buffered data to the database
          context.Samples.AddRange(_buffer);
          
          //Clear the buffer
          _buffer.Clear();

          //Commit changes to the database
          context.SaveChanges();
          context.Configuration.AutoDetectChangesEnabled = true;
     }
     //Remember when we last flushed so we only write once a second
     _lastUpdateMilliSeconds = _stopwatch.ElapsedMilliseconds;
}

Unfortunately, this still wasn't good enough and had some pitfalls.  To begin with, data could be added to our buffer between writing to the database and clearing the buffer.  This means we could be losing data…definitely undesirable in a data acquisition system!  And although our performance had increased, it still couldn't keep up with the Nyquist at high speeds.
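To make that window concrete, here is the flush step from above with the hazard marked (assuming the acquisition thread keeps calling _buffer.Add() while this code runs):

//Add buffered data to the database
context.Samples.AddRange(_buffer);

//<-- a sample arriving from the acquisition thread lands in _buffer right here...

//Clear the buffer
_buffer.Clear(); //...and is thrown away before it was ever saved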

THIRD TIME’S A CHARM!

More research ensued, and we discovered a Bulk Insert library for Entity Framework.  Bulk insert drastically improved the commit time to the database.  We also re-architected our consumer thread a bit to give it its own temporary buffer.  Now the consumer takes data samples out of the main buffer and copies them into the temporary buffer.  This way, if data gets added to the main buffer while we're committing, we'll catch it the next time around.  Our final solution looks similar to this:

private void Consumer()
{
     var samples = new List<DataSample>(); //temporary buffer
     var count = _buffer.Count; //get a snapshot of main buffer
     using (LoggingContext context = new LoggingContext())
     {
          //Remove samples from the main buffer
          for (int i = 0; i < count; i++)
          {
               DataSample sample;
               if (_buffer.TryTake(out sample)) samples.Add(sample);
          }

          //Save temporary buffer to database
          context.BulkInsert(samples);
          
          //Clear out the temporary buffer
          samples.Clear();
     }
}
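For completeness, here's a rough sketch of how the producer side and the once-a-second trigger fit around that Consumer method.  The ConcurrentBag, timer, and method names here are illustrative rather than the exact DAQifi code; the important part is that _buffer is a thread-safe collection exposing TryTake:

using System.Collections.Concurrent;
using System.Threading;

//Shared buffer; ConcurrentBag<T> provides the TryTake used in Consumer()
private readonly ConcurrentBag<DataSample> _buffer = new ConcurrentBag<DataSample>();
private Timer _flushTimer;

//Producer side: called for every sample streamed in from the board
private void OnSampleReceived(DataSample sample)
{
     _buffer.Add(sample);
}

//Flush to the database once a second on a background timer
private void StartLogging()
{
     _flushTimer = new Timer(_ => Consumer(), null, dueTime: 1000, period: 1000);
}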

IN SUMMARY

Overall, we learned the following about dealing with database performance:

  1. Databases don't like to be harassed.  There is a limit to the number of transactions they can handle per second, so it's best to do fewer but larger transactions.  There is a balance, though!  If you do really large transactions, you'll also take a performance hit (see the sketch after this list).
  2. Don't reinvent the wheel.  There are some awesome libraries out there to help with inserting a lot of data into the database at once.
  3. Be careful with data integrity.  We implemented a double buffer system to ensure we don't lose any data from the data acquisition board.
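As a sketch of that balancing act from point 1 (the 10,000-row batch size and the method name below are placeholders, not numbers we benchmarked), one option is to chunk a large buffer into moderate-sized bulk inserts:

using System.Collections.Generic;
using System.Linq;

//Illustrative only: cap the size of any single transaction by chunking the samples
private void BulkInsertInBatches(IList<DataSample> samples, int batchSize = 10000)
{
     using (LoggingContext context = new LoggingContext())
     {
          for (int i = 0; i < samples.Count; i += batchSize)
          {
               var batch = samples.Skip(i).Take(batchSize).ToList();
               context.BulkInsert(batch);
          }
     }
}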
