Thursday, December 4, 2008

Now I'm taking a look at Amazon SimpleDB, which went into public beta a few days ago. SimpleDB works in concert with S3. It's designed for storing smaller chunks of data, and it automatically indexes everything. It's just as dynamic as S3 in that you don't have to provide any kind of schema before storing data.

So SimpleDB adds one component that was missing from S3: indexing. However, there is still one more thing missing: concurrency handling. I see no mention of locking or even of whether operations are atomic. And there's a new problem, which Amazon calls "eventual consistency". It basically means that if you read back an item immediately after writing it, the result might not reflect your changes, because the changes haven't yet propagated across all the storage locations.

They say the changes should usually be seen within a few seconds, but any amount of time during which the data is inconsistent throws a big wrench into the works. Now I will have to maintain a sequence number on each item, increment the number when I make a change, and then POLL until a read returns the new number. Furthermore, I'm going to need to maintain an index somewhere else containing the latest sequence number of every item, so I can determine whether any given item is consistent. That's basically a showstopper for me.
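
Here's a rough sketch of what that workaround would look like, in Python. None of this is SimpleDB code. `LaggyStore` is a made-up toy that imitates the propagation delay by serving stale data for a fixed number of reads, and `expected_seq`, `write_item` and `read_consistent` are names I invented for illustration. The real delay would be measured in time, not in reads.

```python
import time


class LaggyStore:
    """Toy stand-in for an eventually consistent store (hypothetical).

    A write stays invisible for `lag_reads` reads of that item, then
    shows up on the next read.
    """

    def __init__(self, lag_reads=2):
        self.lag_reads = lag_reads
        self._visible = {}  # item name -> attributes readers currently see
        self._pending = {}  # item name -> [stale reads remaining, new attributes]

    def put(self, name, attrs):
        self._pending[name] = [self.lag_reads, dict(attrs)]

    def get(self, name):
        pending = self._pending.get(name)
        if pending is not None:
            if pending[0] <= 0:
                # The write has "propagated": make it visible.
                self._visible[name] = pending[1]
                del self._pending[name]
            else:
                pending[0] -= 1
        return dict(self._visible.get(name, {}))


# The separate index: latest sequence number written for each item.
# In practice this would have to live somewhere strongly consistent.
expected_seq = {}


def write_item(store, name, attrs):
    """Store attrs with an incremented sequence number and record it."""
    seq = expected_seq.get(name, 0) + 1
    store.put(name, {**attrs, "seq": seq})
    expected_seq[name] = seq
    return seq


def read_consistent(store, name, timeout=5.0, interval=0.1):
    """Poll until the item carries the latest recorded sequence number.

    Returns (item, number_of_polls); raises TimeoutError if the item
    is still stale once the timeout has passed.
    """
    deadline = time.monotonic() + timeout
    polls = 0
    while True:
        polls += 1
        item = store.get(name)
        if item.get("seq", 0) >= expected_seq.get(name, 0):
            return item, polls
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{name!r} still stale after {timeout}s")
        time.sleep(interval)
```

Every read that needs fresh data turns into a loop of reads plus a lookup in the side index, and the side index itself can't be kept in the eventually consistent store, or it would have the same problem.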
