Bloom Filters Mathmania

So, you’ve come across Bloom filters and understand that, despite their probabilistic nature, they are a great fit for your use case. You’ve decided to integrate them into your system design, but you’re unsure about the optimal size and the number of hash functions needed for your…

Using Binomial Distribution to Model Data Durability

Durability requirements influence the choice of data protection mechanisms, such as replication, erasure coding, and RAID parity configurations. Achieving higher durability involves trade-offs between redundancy, storage usage ratio, and computational complexity. Replication achieves durability by creating multiple copies of data, which increases redundancy but reduces the storage usage ratio. In…

Concurrency Control, Part III: How Databases Finalise a Conflict-Free Schedule

In my last post, we saw how databases handle conflicts through different philosophies (prevention, validation, versioning, and observation), allowing transactions to run concurrently without violating serializability. But avoiding or resolving conflicts isn’t the end of the story. Even after all conflicts are managed, the database still has…

Concurrency Control, Part II: How Databases Handle Conflicts During Concurrency

In my last post, we saw how databases achieve concurrency by creating wiggle room for interleaving non-conflicting operations through conflict-free schedules. This ability to reorder independent operations without changing the final outcome is what gives serializable systems their performance edge. But we also saw that conflicting operations fix the relative…