Architecture - Bangdb, Embedded

Bangdb has been designed from scratch with the following as its main design goals:

  • Performance - fast key value store, highly concurrent
  • Robust - crash resistant, fault tolerant, auto recovery
  • Flavors - should have various configurable aspects
  • Pluggable - standard API, can be plugged in and unplugged easily
  • No Admin - easy to install and uninstall, self-monitored
  • Economy - runs on commodity hardware

The general high-level architectural information is given below:

Bangdb provides simple, standard APIs for clients to access the database; please see the API section for details. When enabled, bangdb creates a buffer pool of the size given by the user and then creates several data structures to manage the buffer: a hash table of buffer headers, an lru list, a dirty page list and a free header list. It also creates workers that handle the housekeeping for the buffer pool and flush dirty pages to disk asynchronously.

Bangdb decides at regular intervals, based on memory pressure and current requirements, how much of the buffer should be freed, how many pages should be flushed, and so on. The lru list helps decide which headers are flushed before others.
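The interplay of the header table, lru list, dirty page list and free list can be illustrated with a small sketch. This is not bangdb's implementation: the class `BufferPool`, its methods and the in-memory `disk` dictionary are hypothetical names chosen for illustration, and flushing is triggered by an explicit call rather than by background workers.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer pool (hypothetical, for illustration only)."""

    def __init__(self, capacity):
        self.free = list(range(capacity))  # free header slots
        self.headers = OrderedDict()       # page_id -> slot; order doubles as the lru list
        self.data = {}                     # slot -> page contents
        self.dirty = set()                 # page ids modified since their last flush
        self.disk = {}                     # stand-in for the data file

    def _evict(self):
        # The least recently used page sits at the front of the ordered dict.
        page_id, slot = self.headers.popitem(last=False)
        if page_id in self.dirty:
            self.disk[page_id] = self.data[slot]  # write back before reuse
            self.dirty.discard(page_id)
        del self.data[slot]
        self.free.append(slot)

    def _slot_for(self, page_id):
        if page_id in self.headers:
            self.headers.move_to_end(page_id)     # mark as most recently used
            return self.headers[page_id]
        if not self.free:
            self._evict()
        slot = self.free.pop()
        self.headers[page_id] = slot
        self.data[slot] = self.disk.get(page_id, b"")
        return slot

    def read(self, page_id):
        return self.data[self._slot_for(page_id)]

    def write(self, page_id, payload):
        self.data[self._slot_for(page_id)] = payload
        self.dirty.add(page_id)

    def flush(self, max_pages):
        """Write up to max_pages dirty pages, coldest first; return the count."""
        victims = [p for p in self.headers if p in self.dirty][:max_pages]
        for page_id in victims:
            self.disk[page_id] = self.data[self.headers[page_id]]
            self.dirty.discard(page_id)
        return len(victims)
```

In this sketch the `max_pages` argument plays the role of the periodic decision on how many pages to flush; a real system would derive it from memory pressure.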

Logically, bangdb consists mainly of the following components, as shown in the simple view below:

When enabled, the write-ahead log (wal) records each individual operation in the log file and rotates its buffer whenever the buffer fills up. The wal provides log checkpointing and replay, which help recover data when the db was not closed properly or when the db or machine crashed. Multiple wal workers keep checking for housekeeping jobs to do; one of them wakes up regularly and flushes the log if required. The log is flushed in bulk and, being a sequential write, the flush happens relatively quickly. All log flushes are synchronous and hence guarantee the persistence of the log data, which is critical.
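A minimal sketch of the append, synchronous flush and replay cycle follows. It is an illustration under stated assumptions, not bangdb's log format or API: the `WriteAheadLog` class, the `replay` function and the JSON-lines record layout are all hypothetical, and the flush is triggered when the buffer fills or on an explicit call instead of by a background worker.

```python
import json
import os


class WriteAheadLog:
    """Toy write-ahead log (hypothetical, for illustration only)."""

    def __init__(self, path, buffer_limit=4):
        self.path = path
        self.buffer_limit = buffer_limit  # records held in memory before a flush
        self.buffer = []
        self.file = open(path, "ab")

    def append(self, op, key, value=None):
        record = json.dumps({"op": op, "key": key, "value": value})
        self.buffer.append(record.encode("utf-8") + b"\n")
        if len(self.buffer) >= self.buffer_limit:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # One bulk sequential write, then fsync so the data is on disk
        # before flush() returns (a synchronous flush).
        self.file.write(b"".join(self.buffer))
        self.file.flush()
        os.fsync(self.file.fileno())
        self.buffer.clear()

    def close(self):
        self.flush()
        self.file.close()


def replay(path):
    """Rebuild a key-value state by re-applying the logged operations."""
    store = {}
    if not os.path.exists(path):
        return store
    with open(path, "rb") as f:
        for line in f:
            if not line.endswith(b"\n"):
                break  # incomplete final record, e.g. after a crash; ignore it
            rec = json.loads(line)
            if rec["op"] == "put":
                store[rec["key"]] = rec["value"]
            elif rec["op"] == "del":
                store.pop(rec["key"], None)
    return store
```

Operations still sitting in the in-memory buffer are lost on a crash in this sketch, which is why the flush interval and buffer size matter for durability.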

The high-level view of the wal is as follows:

The index, as shown above, is basically the implementation of the access methods for data. These access mechanisms are implemented as B+link Tree and Extended Hash. The choice of one over the other depends purely on the context and on the user's preference, as it is available as a configurable parameter.
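To give a feel for a hash-based access method, here is a small sketch of classic extendible hashing. It assumes that "Extended Hash" refers to a scheme of this family, which the text does not confirm; the `ExtendibleHash` and `Bucket` classes are hypothetical and do not reflect bangdb's actual index code.

```python
class Bucket:
    def __init__(self, depth, capacity):
        self.depth = depth        # local depth: low hash bits shared by all keys here
        self.capacity = capacity
        self.items = {}


class ExtendibleHash:
    """Toy extendible hash table (hypothetical, for illustration only).

    Keys whose hashes are fully identical are not handled; with more such
    keys than a bucket can hold, put() would keep splitting forever.
    """

    def __init__(self, bucket_capacity=2):
        self.global_depth = 1
        self.capacity = bucket_capacity
        self.directory = [Bucket(1, bucket_capacity), Bucket(1, bucket_capacity)]

    def _index(self, key):
        # Use the low global_depth bits of the hash as the directory slot.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._index(key)].items.get(key, default)

    def put(self, key, value):
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < bucket.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)   # bucket is full: split it and retry

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            # Double the directory; both halves point at the same buckets.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth, self.capacity)
        high_bit = 1 << (bucket.depth - 1)
        # Slots whose newly significant bit is set now point at the sibling.
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = sibling
        # Redistribute the old entries between the bucket and its sibling.
        old_items = bucket.items
        bucket.items = {}
        for k, v in old_items.items():
            self.directory[self._index(k)].items[k] = v
```

Only the overflowing bucket is split, so the structure grows incrementally without rehashing every key, which is the main appeal of this family of schemes for disk-backed indexes.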

The buffer pool consists of the headers hash table, lru list, dirty page list and free list. For example, this is how the buffer pool would look with ext hash as the index type:

Please check out the whitepaper for more details.