Mar 26 2010

Structuring my storage

Category: Intellectual PursuitsJoeGeeky @ 00:04

In the late 90's, a term was born which spawned a whole range of non-relational storage facilities. What is this term? NoSql

NoSQL is a movement promoting a loosely defined class of non-relational data stores that break with a long history of relational databases and ACID guarantees.
- Wikipedia

There are a wide range of purpose-built solutions out there ranging from document storage systems to Tuple Space data grids. Each of these targets a specific niche sacrificing more traditional SQL-like patterns for other advantages (Ex. speed, portability, structure, etc...). Even with so many differences between them, these architectures generally share a number of common characteristics and while they may sounds a lot like traditional databases, their implementations can be quite different. Consider the following common Structured Storage components:

  • Store - This is the storage medium. This is commonly a file (or many files). However, in this modern distributed world of ours this can be virtualized in many different ways
  • Index - This can be one or many different representations of part or all of the stored data. This pattern is generally used to optimize search or data location routines. Any one store can have many different Indexes to facilitate its needs
  • Cursor - This is used to iterate through the store generally pointing at one record at a time. This is similar to an Enumerator or an Iterator although one store could have multiple cursors at any one time. The cursor is often the point from which data is written to the store or read from it

Understanding these basic principles can make it easy(ier) to create your own purpose-built store for meeting any specific needs you might have. I recently built a custom point-to-point queue and needed it to be durable. In this case, I wrote a store to guarantee delivery of queued messages on both the sending and receiving ends of the queue. In doing so, I was reminded of a few valuable lessons:

  • Technical Debt - To make a custom store suitable for highly available and/or performant applications, you will need to employ a number of advanced techniques. This includes asynchronous and possibly distributed processing, replication strategies for fail-over, etc... These issues are not trivial, and can require large investments in time and money to make them work correctly. If you have these needs then it may be better to go with an established technology
  • Disk thrash - It may not seem obvious, but high-performance persistence technologies need to be aware of what a storage medium; such as as disk; can and cannot do well. Think about how disk heads move. If your cursors, data readers, and/or data writers behave in a manner that would cause the disk heads to jump around, you will lose precious time just due to the mechanical limitations of the disk. Do a little research and you'll find patterns to help mitigate this kind of performance hit
  • Describe your data - When you're architecting your store, keep in mind that you need to store meta-data along with the data you intend to store. This could include data used to generate metrics, versioning, structure details, arbitrary headers, or whatever. While it may cost more, make sure you give yourself room to grow 
  • Familiarity - Take a sampling of developers and show them a database, tables, sprocs, etc... The vast majority will know exactly what to do if change is needed. Compare that to showing the same developers a proprietary storage solution. While they may be able to figure it out, it will take a great deal more time and energy to make change, isolate bugs, etc. Like it or not, most of us recognize the classic database model. Having something people recognize can be worth a lot, so don't underestimate the value of older patterns  

In today's complex environments, experience with the aforementioned patterns can really come in handy. Investing a little energy in this area can be worth it, even if it is just done as an intellectual pursuit. Just remember, this is all about purpose-built stores. Don't feel like you need to copy or replicate functions from other tool sets. If you are, then maybe you should just use those tools... Wink

You're welcome to take a look at an early version of one of my stores. Although its not the best example, it met my needs. Here is some SmellyStorage.

Tags: , , ,

Comments

1.
trackback Smelser.NET says:

SmellyStorage (Structured Storage)

SmellyStorage (Structured Storage)

2.
Angelica Langston Angelica Langston United States says:

Terrific article, thanks. Could you clarify the first paragraph in additional detail please?

Comments are closed