project-proposal-2024

OpenStore

OpenStore: The S3 we have at home

Abstract

Although the cost of high-speed storage hardware has been falling rapidly, the prices of cloud storage services such as AWS S3 and Azure Blob Storage have stagnated. OpenStore is a scalable and reliable self-hosted object storage service designed for organisations that are tired of sky-high S3 bills. It offers S3-compatible APIs with a focus on deployability, extensibility, and reliability, giving organisations an alternative to the commercial cloud storage options currently available: an S3-like object store that they host themselves.

Author

Name: Thomas Day

Student number: 46986791

Functionality

OpenStore will implement the core functionality of AWS S3, with the added benefit of being self-hostable and open source. The key functionality includes:

S3-compatible APIs for object storage operations (PUT, GET, DELETE, etc.).
High availability.
Data redundancy/replication across nodes.
Data integrity verification (checksums, etc.).
Support for large-scale deployments.
A command-line tool that can act as an OpenStore client and allow for some basic configuration (similar to the s3 commands of the AWS CLI).
A web interface allowing for management and browsing of the storage cluster.

Scope

The Minimum Viable Product (MVP) will include the following functionality:

Basic CRUD (Create, Read, Update, Delete) operations for object storage.
Implementation of a subset of the S3 API endpoints to allow for the basic CRUD operations.
High availability.
Data integrity checks.
Implementation of a command-line tool for basic interactions with OpenStore.
Bonus functionality (optional): A Terraform provider for Infrastructure as Code (IaC) configuration.
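
To make the MVP scope concrete, the following is a minimal sketch of the CRUD operations combined with a data integrity check. It is an in-memory stand-in, not the planned implementation: the names ObjectStore and IntegrityError are hypothetical, and SHA-256 is an assumed choice of checksum.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when stored bytes no longer match their recorded checksum."""


class ObjectStore:
    """Hypothetical in-memory stand-in for a single OpenStore bucket."""

    def __init__(self):
        self._objects = {}  # key -> (data, SHA-256 hex digest)

    def put(self, key: str, data: bytes) -> str:
        # Create or overwrite (update) the object and record its checksum.
        digest = hashlib.sha256(data).hexdigest()
        self._objects[key] = (data, digest)
        return digest

    def get(self, key: str) -> bytes:
        data, digest = self._objects[key]  # KeyError if the key is absent
        # Re-compute the checksum on every read before serving the bytes.
        if hashlib.sha256(data).hexdigest() != digest:
            raise IntegrityError(key)
        return data

    def delete(self, key: str) -> None:
        # Idempotent: deleting a missing key is not treated as an error.
        self._objects.pop(key, None)
```

In this sketch an update is simply a second put to the same key, which matches how object stores usually treat overwrites. A real node would persist the digest alongside the object on disk rather than in a dictionary.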

Quality Attributes

Availability: OpenStore will be used to hold critical data in production; as such, it must be resilient to partial failures (e.g., a node crashing or a connectivity issue).
Reliability: OpenStore will be used to store critical information in production environments. Therefore, the service must guarantee the integrity of the information it serves to users.
Scalability: OpenStore will power critical production workloads whose demand varies over time. The cluster must therefore be able to scale to meet that demand.

Evaluation

Availability

Testing Strategies:

Repeated fault injection: Spin up a new cluster and write numerous files to it. Then, abruptly remove/kill one of the nodes before attempting to read back all the written files.
Simulated network failure: In an active cluster performing data operations, simulate a network disconnection between nodes. Verify that all files can still be read back in the event of network partitioning.
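
The repeated fault injection test above could be prototyped against a simulated cluster before any real nodes exist. The sketch below makes several assumptions that are not part of this proposal: the Cluster class is hypothetical, nodes are plain dictionaries, and replicas are placed on consecutive nodes chosen by hashing the key.

```python
import hashlib


class Cluster:
    """Hypothetical in-memory cluster; each node is a dict of key -> bytes."""

    def __init__(self, node_count: int, replicas: int):
        self.nodes = {i: {} for i in range(node_count)}
        self.alive = set(self.nodes)
        self.replicas = replicas

    def _placement(self, key: str):
        # Assumed placement rule: hash the key, then take consecutive nodes.
        start = int(hashlib.sha256(key.encode()).hexdigest(), 16) % len(self.nodes)
        return [(start + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key: str, data: bytes) -> None:
        for node in self._placement(key):
            if node in self.alive:
                self.nodes[node][key] = data

    def get(self, key: str) -> bytes:
        # Serve the object from the first live replica that holds it.
        for node in self._placement(key):
            if node in self.alive and key in self.nodes[node]:
                return self.nodes[node][key]
        raise KeyError(key)

    def kill(self, node: int) -> None:
        self.alive.discard(node)


def fault_injection_test(victim=0, node_count=3, replicas=2, files=100) -> bool:
    """Write files, kill one node, and report whether every file reads back."""
    cluster = Cluster(node_count, replicas)
    written = {f"file-{i}": f"payload-{i}".encode() for i in range(files)}
    for key, data in written.items():
        cluster.put(key, data)
    cluster.kill(victim)  # abruptly remove one node
    try:
        return all(cluster.get(key) == data for key, data in written.items())
    except KeyError:
        return False
```

Calling fault_injection_test once per victim node gives the repeated form of the test. With a replication factor of at least two, every object keeps one live copy after a single node failure, which is the property the real test is meant to confirm.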

Reliability

Testing Strategies:

File corruption: Write a file to the cluster, corrupt one of the instances of the file on disk, and then attempt to read the file through OpenStore. Through error detection and data replication, the correct uncorrupted file should be returned to the user.
Failure mode: Write a file to the cluster, then corrupt all copies of the file on the nodes' disks, and attempt to read the file back. Ensure that an error is returned rather than a corrupted file.
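
Both reliability tests exercise the same read path: verify each copy against a recorded checksum, fall back to another replica on a mismatch, and fail loudly if none verifies. A minimal sketch of that logic follows. The function name read_verified, the exception CorruptObjectError, and the use of SHA-256 are assumptions made for illustration.

```python
import hashlib
from typing import Sequence


class CorruptObjectError(Exception):
    """Raised when no replica matches the recorded checksum."""


def read_verified(replicas: Sequence[bytes], expected_sha256: str) -> bytes:
    """Return the first replica whose SHA-256 matches, otherwise raise."""
    for data in replicas:
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
    # Every copy failed verification: return an error, never corrupt bytes.
    raise CorruptObjectError("all replicas failed checksum verification")
```

For the file corruption test, one element of replicas is altered and the call should still return the original bytes. For the failure mode test, every element is altered and the call should raise CorruptObjectError.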

Scalability

Testing Strategies:

Load test: Using a tool like k6, apply increasing loads to the cluster and measure how latency and error rate change, to ensure that the cluster continues to serve requests under high traffic.
Inducting a new node: In a high-traffic situation, administrators may wish to expand the cluster. Ensure that a new node can join the OpenStore cluster within 5 minutes of provisioning.
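
k6 scripts are written in JavaScript, so the following is only a Python stand-in that shows the shape of a ramped load test: run a fixed number of requests at each concurrency level and record errors and tail latency. The functions run_stage and ramp, the stage sizes, and the operation callback are all hypothetical; in practice operation would issue a request against the cluster.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_stage(operation, workers: int, requests: int) -> dict:
    """Issue `requests` calls to `operation` using `workers` threads."""

    def timed_call(i):
        start = time.perf_counter()
        try:
            operation(i)
            ok = True
        except Exception:
            ok = False  # count any failure as an error for this stage
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(timed_call, range(requests)))

    latencies = sorted(elapsed for _, elapsed in results)
    return {
        "workers": workers,
        "requests": requests,
        "errors": sum(1 for ok, _ in results if not ok),
        # Approximate 95th percentile taken from the sorted latencies.
        "p95_seconds": latencies[int(0.95 * (len(latencies) - 1))],
    }


def ramp(operation, stages=(1, 4, 16), requests_per_stage=50) -> list:
    """Run one stage per concurrency level, in increasing order."""
    return [run_stage(operation, workers, requests_per_stage) for workers in stages]
```

Comparing the error count and p95 latency across stages shows where the cluster starts to degrade as concurrency increases.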