project-proposal-2024

⚡️ The Very Cool Super Fast Database That Probably Won’t Lose Your Data (aka The VCSFDBTPWLYD)

Abstract

The VCSFDBTPWLYD is a state-of-the-art distributed in-memory key-value database. It aims to achieve three key principles: being very cool, being super fast, and probably not losing your data.

The Very Cool Super Fast Database That Probably Won’t Lose Your Data addresses a key need in today’s database market. In-memory databases such as Redis are a key component of modern caching layers, and the distributed nature of NoSQL databases such as MongoDB gives us the ability to scale happily to many millions of users. The VCSFDBTPWLYD addresses the market need for both fast caching AND scale via a unique distributed in-memory cluster design.

Team requirements: In order to evaluate both coolness and “probably won’t lose your data” ability, VCSFDBTPWLYD contributors must be able to remember the name of The VCSFDBTPWLYD.

Author

Name: Brook Hamilton Queree

Student number: 48423384

Functionality

The VCSFDBTPWLYD is a new in-memory database that will respond rapidly to your queries while also being resistant to failures within the system, providing high availability for all your database-ing needs. A VCSFDBTPWLYD cluster is a grouping of many happy node families, as we will define here:

FAMILY: A “family” is a group of N compute nodes: one parent node (the write node) and N-1 read replicas. Upon parent failure, one of the read replicas will be promoted to family parent.
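The promotion rule can be sketched in a few lines of Python. This is a minimal illustration, not the proposed implementation: the `Node` and `Family` classes, synchronous replication on `put`, and the "promote the first live replica" policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    alive: bool = True
    data: dict[str, str] = field(default_factory=dict)


class Family:
    """One parent (write node) plus N-1 read replicas."""

    def __init__(self, names: list[str]) -> None:
        if not names:
            raise ValueError("a family needs at least one node")
        nodes = [Node(name) for name in names]
        self.parent = nodes[0]
        self.replicas = nodes[1:]

    def put(self, key: str, value: str) -> None:
        # Writes go to the parent and are copied to every live replica.
        # Synchronous replication is an assumption of this sketch.
        self.parent.data[key] = value
        for replica in self.replicas:
            if replica.alive:
                replica.data[key] = value

    def fail_parent(self) -> None:
        """Simulate a parent failure and promote the first live replica."""
        self.parent.alive = False
        live = [replica for replica in self.replicas if replica.alive]
        if not live:
            raise RuntimeError("no live replica left to promote")
        self.parent = live[0]
        self.replicas = [r for r in self.replicas if r is not self.parent]
```

For a family built from `["a", "b", "c"]`, calling `fail_parent()` makes `b` the new parent, and any key written beforehand is still readable from it.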


CLUSTER: A cluster is a collection of families. Each cluster distributes its data across the families according to keyspace allocations.


Each family is responsible for keeping a subset of your data safe! Luckily, the mommy and/or daddy node has given birth to N-1 read replicas that will be there to save your data in the case of parental failures. Because the keyspace is distributed evenly across your VCSFDBTPWLYD cluster, we are able to scale to support high volumes of concurrent queries. We do this by making use of both data distribution across the cluster and child labour within the families! 🎉
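One way to picture an even keyspace allocation is to hash every key to a position and give each family a contiguous, near-equal slice of the positions. The sketch below assumes a 32-bit keyspace derived from SHA-256; `KEYSPACE_SIZE`, `key_position`, `even_allocations`, and `family_for` are hypothetical names chosen for illustration, not part of the proposal.

```python
import hashlib

KEYSPACE_SIZE = 2**32  # assumed size of the hash keyspace


def key_position(key: str) -> int:
    """Map a key to a stable position in [0, KEYSPACE_SIZE)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def even_allocations(family_count: int) -> list[range]:
    """Divide the keyspace into contiguous, near-equal ranges, one per family."""
    bounds = [KEYSPACE_SIZE * i // family_count for i in range(family_count + 1)]
    return [range(bounds[i], bounds[i + 1]) for i in range(family_count)]


def family_for(key: str, allocations: list[range]) -> int:
    """Return the index of the family whose range contains the key."""
    position = key_position(key)
    for index, allocation in enumerate(allocations):
        if position in allocation:
            return index
    raise LookupError("keyspace allocations do not cover this key")
```

Because the hash is stable, the same key always routes to the same family until the allocations change.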

Scope

The scope of The VCSFDBTPWLYD-MVP is designed to explicitly deliver the key features of being very cool, super fast, and probably not losing your data, while also adding key limitations to ensure feasibility:

The VCSFDBTPWLYD-MVP supports the following cluster operations: CREATE CLUSTER, SPLIT FAMILY (add nodes), MERGE FAMILIES (remove nodes)

A “SPLIT” operation splits the keyspace of an existing family, dividing it in two across the current family and a newly created family instance. A “MERGE” operation merges the keyspaces of two existing families. One of the families will transfer its current data to the other family, before heading off to the happy farm out of town where all old families are sent.
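A rough sketch of the two operations, assuming each family owns a half-open range of a 32-bit hash keyspace and that only adjacent ranges are merged. The `Family` record, the `position` helper, and the adjacency requirement are assumptions for illustration only.

```python
import hashlib
from dataclasses import dataclass, field


def position(key: str) -> int:
    """Stable position of a key in an assumed 32-bit keyspace."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:4], "big")


@dataclass
class Family:
    start: int  # inclusive lower bound of the owned keyspace
    stop: int   # exclusive upper bound of the owned keyspace
    data: dict[str, str] = field(default_factory=dict)


def split(family: Family) -> Family:
    """Halve the family's keyspace; return a new family owning the upper half."""
    if family.stop - family.start < 2:
        raise ValueError("keyspace too small to split")
    mid = (family.start + family.stop) // 2
    new_family = Family(mid, family.stop)
    # Move every key that now belongs to the upper half.
    for key in [k for k in family.data if position(k) >= mid]:
        new_family.data[key] = family.data.pop(key)
    family.stop = mid
    return new_family


def merge(survivor: Family, retiree: Family) -> None:
    """Move the retiree's keyspace and data into the survivor."""
    if survivor.stop != retiree.start and retiree.stop != survivor.start:
        raise ValueError("only adjacent keyspaces can be merged in this sketch")
    survivor.data.update(retiree.data)
    survivor.start = min(survivor.start, retiree.start)
    survivor.stop = max(survivor.stop, retiree.stop)
    retiree.data.clear()  # the retiree heads off to the happy farm
```

Splitting a family and then merging the two halves again should leave the original keyspace and every key intact, which is the property the growth and shrink tests are meant to check.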

The VCSFDBTPWLYD-MVP supports the following table operations: CREATE TABLE, GET (key), PUT (key)

Unsupported: DELETE (key). The VCSFDBTPWLYD strives to probably not lose your data. As such, deletes are unsupported.

The VCSFDBTPWLYD-MVP supports one data type for both Key and Value fields: STRING
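The table operations above could look roughly like this from a client's point of view. The `Database` class and its method names are hypothetical, and letting a PUT overwrite an existing key is an assumption the proposal does not state either way.

```python
from typing import Optional


class Database:
    """In-memory tables mapping string keys to string values."""

    def __init__(self) -> None:
        self._tables: dict[str, dict[str, str]] = {}

    def create_table(self, table: str) -> None:
        if table in self._tables:
            raise ValueError(f"table {table!r} already exists")
        self._tables[table] = {}

    def put(self, table: str, key: str, value: str) -> None:
        # STRING is the only supported type for both keys and values.
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("keys and values must be strings")
        self._tables[table][key] = value

    def get(self, table: str, key: str) -> Optional[str]:
        """Return the stored value, or None if the key is absent."""
        return self._tables[table].get(key)

    # There is deliberately no delete method: DELETE is unsupported.
```

A call such as `db.put("users", "id", 42)` raises `TypeError`, since only strings are accepted.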

The VCSFDBTPWLYD-MVP supports the following user operations: ADD USER, DROP USER, GRANT ROLE [READ, WRITE, CLUSTER_ADMIN], REVOKE ROLE
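As a small sketch of how the user operations might fit together, the registry below tracks a set of roles per user. The `UserRegistry` class, its method names, and the `allowed` check are hypothetical; only the operation and role names come from the list above.

```python
from enum import Enum


class Role(Enum):
    READ = "READ"
    WRITE = "WRITE"
    CLUSTER_ADMIN = "CLUSTER_ADMIN"


class UserRegistry:
    """Tracks which roles each user currently holds."""

    def __init__(self) -> None:
        self._roles: dict[str, set[Role]] = {}

    def add_user(self, name: str) -> None:
        if name in self._roles:
            raise ValueError(f"user {name!r} already exists")
        self._roles[name] = set()

    def drop_user(self, name: str) -> None:
        del self._roles[name]

    def grant(self, name: str, role: Role) -> None:
        self._roles[name].add(role)

    def revoke(self, name: str, role: Role) -> None:
        self._roles[name].discard(role)

    def allowed(self, name: str, role: Role) -> bool:
        """Unknown or dropped users hold no roles."""
        return role in self._roles.get(name, set())
```

In this sketch roles are independent of each other: granting WRITE does not imply READ, which is a simplifying assumption.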

Other automated Features:

In the adverse event of time constraints, the user and permissions model is a lower priority for The VCSFDBTPWLYD-MVP than the key database requirements of cluster and table operations.

Quality Attributes

Evaluation

The key database requirements will be evaluated using a number of important tests:

Basic Functionality Tests: These tests will ensure the basic functionality of the VCSFDBTPWLYD has been achieved. Namely, can the cluster respond to GET and PUT requests for keys, and is it able to grow and shrink according to user commands?

Cluster Growth and Shrink Tests: The VCSFDBTPWLYD will be tested on its ability to maintain 100% data retention after many cluster growth and shrink operations. It will also be tested on its ability to retain data when receiving PUT requests during both cluster growth and shrinkage.
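A retention test of this kind might be shaped like the sketch below. `ToyCluster` is a single-process stand-in that places keys by hash modulo the family count, not the keyspace-range design described earlier, and the test interleaves PUTs with resizes rather than issuing them truly concurrently; both are simplifications assumed for illustration.

```python
import random
import zlib


class ToyCluster:
    """Stand-in cluster: each family is a dict, keys are placed by hash modulo."""

    def __init__(self) -> None:
        self.families: list[dict[str, str]] = [{}]

    def _home(self, key: str) -> dict[str, str]:
        index = zlib.crc32(key.encode("utf-8")) % len(self.families)
        return self.families[index]

    def put(self, key: str, value: str) -> None:
        self._home(key)[key] = value

    def get(self, key: str):
        return self._home(key).get(key)

    def _resize(self, count: int) -> None:
        # Collect everything, rebuild the families, then re-place each key.
        items = [item for family in self.families for item in family.items()]
        self.families = [{} for _ in range(count)]
        for key, value in items:
            self.put(key, value)

    def grow(self) -> None:
        self._resize(len(self.families) + 1)

    def shrink(self) -> None:
        self._resize(max(1, len(self.families) - 1))


def retention_test(rounds: int = 200, seed: int = 0) -> bool:
    """Interleave random grow/shrink operations with PUTs, then verify every key."""
    rng = random.Random(seed)
    cluster = ToyCluster()
    expected: dict[str, str] = {}
    for i in range(rounds):
        rng.choice([cluster.grow, cluster.shrink])()
        key, value = f"key-{i}", str(rng.random())
        cluster.put(key, value)
        expected[key] = value
    return all(cluster.get(k) == v for k, v in expected.items())
```

Against the real cluster, the same loop would issue SPLIT FAMILY and MERGE FAMILIES commands while a separate client keeps sending PUT requests, and 100% retention would mean `retention_test` never returns `False`.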

High Load Environment Tests: The VCSFDBTPWLYD will be evaluated on its ability to respond reliably to GET and PUT requests when placed under an induced high-load environment.

Forced Node Shutdown Tests: The VCSFDBTPWLYD will be tested thoroughly on its ability to continue to respond to requests despite the forced shutdown of one or more cluster nodes.

Forced Network Failure Tests: A subset of the VCSFDBTPWLYD nodes will have their network connection deliberately disabled. During this period we will evaluate the cluster’s ability to continue to respond to requests.