This performance analysis of BangDB under a few scenarios presents high-level measurements that should help users map their own use cases and understand what to expect from BangDB. The measurements were taken on commodity hardware without any customization.
These numbers will vary with the machine configuration, the OS, the key and value sizes, and other parameters, so users may see different results when they run on their own machines with different settings. Still, the benchmark shown here, together with the comparison against a few other dbs in the next section, should help users make an informed decision.
The following machine (commodity hardware) was used for the test:
The BangDB configuration:
Other parameters:
This configuration ensures that the db runs in conservative mode: if the process or the db crashes, the db recovers on restart to the point where it crashed. A number of workers keep the write-ahead logging and buffer pool mechanisms healthy. In the table, the Log = ON columns show the numbers for this configuration. If the log and its related machinery are switched off and everything else is left as it is, the numbers are those given in the Log = OFF columns.
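The log-then-recover behaviour described above can be illustrated with a minimal sketch. This is a generic write-ahead logging toy, not BangDB's implementation: the class `TinyWalStore`, its JSON-lines log format, and the helper `demo_crash_recovery` are all hypothetical names chosen for illustration.

```python
import json
import os
import tempfile


class TinyWalStore:
    """Toy key-value store: every write is logged to disk before it is applied.

    Hypothetical illustration only; it does not reflect BangDB internals.
    """

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # rebuild in-memory state from whatever the log recorded
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # a torn final record left by a crash is ignored
                self.data[record["k"]] = record["v"]

    def put(self, key, value):
        # Log first, then apply: this ordering is what makes recovery possible.
        self.log.write(json.dumps({"k": key, "v": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())  # force the record to stable storage
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    def close(self):
        self.log.close()


def demo_crash_recovery():
    """Write two records, drop the in-memory state, and recover from the log."""
    path = os.path.join(tempfile.mkdtemp(), "wal.log")
    store = TinyWalStore(path)
    store.put("session:1", "alice")
    store.put("session:2", "bob")
    store.close()  # stand-in for a crash: the in-memory dict is discarded
    recovered = TinyWalStore(path)
    result = dict(recovered.data)
    recovered.close()
    return result
```

The per-write flush and `fsync` in `put` is the kind of extra work that a Log = OFF run avoids, which is why write throughput is the figure most affected by the log setting.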
The tests are run under the following conditions:
The buffer pool is large enough that all reads and writes are served from the pool itself. The db does not go to the disk for any reason except the continuous log flush, so the performance data reflects the performance of the db itself rather than that of the disk. In the real world this condition holds when the db is used purely as an in-memory cache, for example for session data. For a persistent database it cannot be guaranteed.
The buffer pool is smaller than the data being operated on: it is around half the size of the data to be written or read. Because all operations are random and continuous, it is very difficult to flush the right set of pages and bring the right set of pages into the buffer pool. Real-world operations are less random and are not 100 percent continuous reads and writes, so this condition models the real world reasonably well while erring slightly on the conservative side. As the performance report shows, BangDB performs well in this condition too.
The buffer pool is much smaller (around 5-7 times smaller) than the data that is read from and written to the database continuously and in random fashion. Flushing dirty pages to reclaim free ones makes the db write pages to the hard disk almost continuously, and disk reads are also frequent for both read and write operations. It becomes very difficult to anticipate which pages to flush and which to read. BangDB takes extra care to ensure that performance does not degrade to the point where writing to disk halts the whole system. The results show that performance degrades gracefully as the ratio of data size to buffer pool capacity rises.
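To make the three conditions concrete, the sketch below simulates uniformly random page accesses against a pool of limited size and reports the fraction served from the pool. It is a rough model under stated assumptions, not a description of BangDB's buffer manager: the LRU eviction policy, the page counts, and the function names `simulate_hit_ratio` and `compare_conditions` are all chosen for illustration, and a ratio of 6 stands in for the "5-7 times" case.

```python
import random
from collections import OrderedDict


def simulate_hit_ratio(num_pages, pool_pages, num_ops, seed=0):
    """Fraction of uniformly random page accesses served from an LRU pool."""
    if pool_pages < 1:
        raise ValueError("pool_pages must be at least 1")
    rng = random.Random(seed)
    pool = OrderedDict()  # page id -> None, least recently used first
    hits = 0
    for _ in range(num_ops):
        page = rng.randrange(num_pages)
        if page in pool:
            hits += 1
            pool.move_to_end(page)  # mark as most recently used
        else:
            if len(pool) >= pool_pages:
                pool.popitem(last=False)  # evict the least recently used page
            pool[page] = None  # a miss stands in for a disk read
    return hits / num_ops


def compare_conditions(num_pages=1000, num_ops=100_000):
    """Hit ratio for a pool that fits all data, half of it, and one sixth of it."""
    return {
        "pool fits all data": simulate_hit_ratio(num_pages, num_pages, num_ops),
        "pool is half the data": simulate_hit_ratio(num_pages, num_pages // 2, num_ops),
        "pool is one sixth of the data": simulate_hit_ratio(num_pages, num_pages // 6, num_ops),
    }
```

With purely random access no eviction policy can predict the next page, so in this model the hit ratio settles near the ratio of pool size to data size. Every miss is a disk read, and often a dirty-page flush as well, which is the pressure the second and third conditions are designed to create.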
The sub-sections of the benchmark dig deeper into the performance analysis; see them for how BangDB performs under different conditions. A competitive analysis also compares BangDB with Oracle's Berkeley DB and Google's LevelDB. The aim of the comparison is to learn how each db performs under particular conditions and scenarios.
The average throughput values (ops/sec) over 100K to 1M operations are:
| Index (Access Method) | Log ON: Write (ops/sec) | Log ON: Read (ops/sec) | Log OFF: Write (ops/sec) | Log OFF: Read (ops/sec) |
|---|---|---|---|---|
| Btree | 475,000 | 1,025,000 | 685,000 | 1,045,000 |
| Hash | 500,000 | 1,690,000 | 790,000 | 1,675,000 |
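The cost of logging can be read directly off the table with a little arithmetic. The short sketch below does so; the figures are copied from the table, while the dictionary layout and the function names `log_cost_percent` and `microseconds_per_op` are hypothetical helpers introduced only for this calculation.

```python
# Average throughput (ops/sec) copied from the table above.
THROUGHPUT = {
    "Btree": {
        "log_on": {"write": 475_000, "read": 1_025_000},
        "log_off": {"write": 685_000, "read": 1_045_000},
    },
    "Hash": {
        "log_on": {"write": 500_000, "read": 1_690_000},
        "log_off": {"write": 790_000, "read": 1_675_000},
    },
}


def log_cost_percent(index, op):
    """Percent change in throughput with the log on, relative to the log off."""
    on = THROUGHPUT[index]["log_on"][op]
    off = THROUGHPUT[index]["log_off"][op]
    return 100.0 * (on - off) / off


def microseconds_per_op(ops_per_sec):
    """Average time per operation implied by a throughput figure."""
    return 1_000_000 / ops_per_sec
```

By this calculation, switching the log on lowers write throughput by roughly 31 percent for Btree and 37 percent for Hash, while read throughput changes by less than 2 percent either way. That is consistent with the log mainly adding work on the write path.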