Introduction to Amazon’s SimpleDB

Amazon’s SimpleDB is a NoSQL datastore with a whole lot of no: no SQL, no datatypes (except UTF-8 strings), no transactions, no joins, no indexes to define, no schema, no administration, and no cost for minimal usage. But when you Google it, you find Amazon’s docs, a lot of bold predictions about it from 2007 and 2008… and not much else. SimpleDB seems like an interesting solution in search of a problem, but its ease of use and lack of administration effort make it worth at least checking out.

SimpleDB (and Amazon’s other web services offerings) can be accessed through a number of different APIs. For Java, there is an Eclipse plugin, which presents an option screen when you create a new AWS project.

If you want a quick example of how to interact with SimpleDB through Java, check the Amazon SimpleDB Sample and you’ll get a runnable class that includes basic functions.

SimpleDB has ‘domains’ that are roughly like tables in a relational model, and ‘items’ within each domain, which are kind of like rows. Domains can’t be joined or related together at the database level. Each item has ‘attributes’, which are key/value pairs, but an item can have multiple attributes with the same key. For example, this is fine (using the Java SDK):

import com.amazonaws.services.simpledb.model.ReplaceableAttribute;
import com.amazonaws.services.simpledb.model.ReplaceableItem;

ReplaceableItem item = new ReplaceableItem("The Java Programming Language").withAttributes(
  new ReplaceableAttribute("category", "technical", true),
  new ReplaceableAttribute("title", "The Java Programming Language", true),
  new ReplaceableAttribute("price", "15", true),
  new ReplaceableAttribute("author", "Ken Arnold", true),
  new ReplaceableAttribute("author", "James Gosling", true),
  new ReplaceableAttribute("author", "David Holmes", true));

The ReplaceableItem constructor takes a name, which acts as the item’s key. Attribute ‘keys’ are not unique for a given item; only the combination of key and value has to be unique. That produces an odd situation where this query returns the item:

select * from `myDomain` where author = 'James Gosling'

So does this:

select * from `myDomain` where author = 'Ken Arnold'

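But this one returns nothing, because each comparison is tested against a single attribute value, and no single value equals both names: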

select * from `myDomain` where author = 'Ken Arnold' and author = 'James Gosling'

To select items with both these authors, you need:

select * from `myDomain` where author = 'Ken Arnold' intersection author = 'James Gosling'

To select items whose only author is the given one, use:

select * from `myDomain` where every(author) in ('Ken Arnold')

It’s not difficult, but it might have been clearer if they hadn’t used SQL-like syntax. There must be a nicer way to define a key that has a list for its value.
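To see why `and` and `intersection` behave differently, here is a toy in-memory model in plain Java. It is only a sketch and not the SimpleDB API: the class and method names are made up for illustration, an item is modeled as a map from attribute name to a set of values, and it assumes that `every()` does not match an item that lacks the attribute entirely.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Toy model only: none of these names come from the AWS SDK.
final class MultiValueQueryDemo {

    // One item's attributes: each name maps to a set of values.
    static final Map<String, Set<String>> BOOK = Map.of(
            "category", Set.of("technical"),
            "author", Set.of("Ken Arnold", "James Gosling", "David Holmes"));

    // A plain predicate matches when at least one value of the attribute passes the test.
    static boolean matchesAny(Map<String, Set<String>> item, String name, Predicate<String> test) {
        return item.getOrDefault(name, Set.of()).stream().anyMatch(test);
    }

    // every(name) matches when the attribute has values and all of them pass the test.
    // (Assumption: an item without the attribute does not match.)
    static boolean matchesEvery(Map<String, Set<String>> item, String name, Predicate<String> test) {
        Set<String> values = item.getOrDefault(name, Set.of());
        return !values.isEmpty() && values.stream().allMatch(test);
    }

    // where author = 'Ken Arnold' and author = 'James Gosling'
    // Both comparisons are applied to the same single value, so nothing can pass.
    static boolean andQuery(Map<String, Set<String>> item) {
        return matchesAny(item, "author",
                v -> v.equals("Ken Arnold") && v.equals("James Gosling"));
    }

    // where author = 'Ken Arnold' intersection author = 'James Gosling'
    // Each side is evaluated on its own, then the results are combined.
    static boolean intersectionQuery(Map<String, Set<String>> item) {
        return matchesAny(item, "author", v -> v.equals("Ken Arnold"))
                && matchesAny(item, "author", v -> v.equals("James Gosling"));
    }

    // where every(author) in ('Ken Arnold')
    static boolean everyQuery(Map<String, Set<String>> item) {
        return matchesEvery(item, "author", v -> v.equals("Ken Arnold"));
    }

    public static void main(String[] args) {
        System.out.println(andQuery(BOOK));          // false
        System.out.println(intersectionQuery(BOOK)); // true
        System.out.println(everyQuery(BOOK));        // false: the book has other authors too
    }
}
```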

You can compare and sort, but remember that everything is a string. Amazon has suggestions for dealing with numbers and dates, but they will not impress you terribly. Think of the tedious things you have to do in a regular relational database when you are stuck storing a number or date as a string for some reason.
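As a rough idea of what that looks like, the sketch below stores numbers as zero-padded, offset strings and dates as fixed-width ISO 8601 timestamps, so that string comparison agrees with numeric and chronological order. The offset, the width, and all the names here are assumptions chosen for illustration, not values prescribed by Amazon.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Locale;

// Illustrative helper; the class name, OFFSET and WIDTH are arbitrary choices.
final class SortableStrings {

    // Added to every number so that negative values become non-negative.
    static final long OFFSET = 1_000_000L;
    // Every encoded number is padded to this many digits.
    static final int WIDTH = 10;
    // Largest shifted value that still fits in WIDTH digits.
    static final long MAX_SHIFTED = 9_999_999_999L;

    // Encode a number so that lexicographic order matches numeric order.
    static String encodeNumber(long value) {
        long shifted = value + OFFSET;
        if (shifted < 0 || shifted > MAX_SHIFTED) {
            throw new IllegalArgumentException("value out of supported range: " + value);
        }
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", shifted);
    }

    // Reverse of encodeNumber.
    static long decodeNumber(String stored) {
        return Long.parseLong(stored) - OFFSET;
    }

    // Encode a timestamp as UTC ISO 8601, truncated to whole seconds
    // so that every encoded value has the same shape.
    static String encodeDate(Instant when) {
        return when.truncatedTo(ChronoUnit.SECONDS).toString();
    }

    public static void main(String[] args) {
        // Raw strings sort the wrong way round: "9" comes after "15".
        System.out.println("9".compareTo("15") > 0);                            // true
        // The encoded forms sort like the numbers they represent.
        System.out.println(encodeNumber(9).compareTo(encodeNumber(15)) < 0);    // true
        System.out.println(encodeNumber(15));                                   // 0001000015
    }
}
```

The cost is that every reader and writer of the domain has to agree on the same offset and width, which is exactly the kind of bookkeeping a typed column would normally do for you.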

You can do a count(), but Amazon has this note: ‘If the count request takes more than five seconds, Amazon SimpleDB returns the number of items that it could count and a next token to return additional results. The client is responsible for accumulating the partial counts.’ It’s a good reminder of the fact that you’re getting your data through a web service and need to plan accordingly. Likewise, there are also some limits you need to consider when you’re scoping out your requirements.
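The accumulation loop that the note describes might be sketched like this. The `CountPage` type and the `fetchPage` function are hypothetical stand-ins for whatever call actually issues the `select count(*)` request and reads back a partial count and next token; they are not SDK classes.

```java
import java.util.function.Function;

// Sketch of client-side accumulation of partial counts.
final class CountAccumulator {

    // Hypothetical holder for one response: a partial count plus the next token,
    // where a null token means there is nothing left to count.
    static final class CountPage {
        final long partialCount;
        final String nextToken;

        CountPage(long partialCount, String nextToken) {
            this.partialCount = partialCount;
            this.nextToken = nextToken;
        }
    }

    // fetchPage receives the token from the previous response (null on the first call)
    // and returns the next page of the count.
    static long countAll(Function<String, CountPage> fetchPage) {
        long total = 0;
        String token = null;
        do {
            CountPage page = fetchPage.apply(token);
            total += page.partialCount;
            token = page.nextToken;
        } while (token != null);
        return total;
    }
}
```

Every trip round the loop is another web service request, so a count over a large domain can take several round trips to complete.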

To insert, you perform a PutAttributes operation, or BatchPutAttributes to write several items at once. The batch form looks like this in the Java SDK:

sdb.batchPutAttributes(new BatchPutAttributesRequest(myDomain, listOfReplaceableItems));

Each ReplaceableAttribute can be defined with a boolean replace flag. If you have an item with an existing key/value pair of ‘category’/‘technical’ and do a put operation with a new pair with the same key but a different value, say ‘category’/‘programming’, it will replace the old pair if replace is true, or add an additional pair if false. Attribute keys, again, are not unique.
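The effect of the replace flag can be imitated with a small in-memory model. This is a sketch with made-up names, not SDK code, and it only models a put request that carries a single attribute.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy stand-in for one SimpleDB item; not part of the AWS SDK.
final class ReplaceFlagDemo {

    // Each attribute name maps to the set of values stored under it.
    private final Map<String, Set<String>> attributes = new HashMap<>();

    // Models one put request carrying a single attribute.
    void put(String name, String value, boolean replace) {
        if (replace) {
            // replace = true: discard every value already stored under this name.
            attributes.remove(name);
        }
        // Then add the new value (a no-op if the same name/value pair already exists).
        attributes.computeIfAbsent(name, k -> new LinkedHashSet<>()).add(value);
    }

    // Values currently stored under the given name (empty if none).
    Set<String> values(String name) {
        return attributes.getOrDefault(name, Set.of());
    }

    public static void main(String[] args) {
        ReplaceFlagDemo item = new ReplaceFlagDemo();
        item.put("category", "technical", false);
        item.put("category", "programming", false);
        System.out.println(item.values("category")); // [technical, programming]
        item.put("category", "reference", true);
        System.out.println(item.values("category")); // [reference]
    }
}
```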

Amazon helped pioneer the idea of eventual consistency and it’s still the default in SimpleDB, but they brought out the option of immediate consistency and conditional puts in 2010. They don’t seem to be available directly in the Java SDK, but they are in the web service APIs.

There are also some third-party libraries you can try out, like typica and SimpleJPA, which tries to wrangle SimpleDB into a JPA implementation, and even a simpledb-jdbc library. Overall, though, you wouldn’t want to treat SimpleDB like a normal database that you can access behind a typical Java interface. SimpleDB won’t replace most folks’ database, but it could still be right for any number of different situations.
