Wednesday, January 7, 2009

AMI's data structure

Although AMI uses PostgreSQL internally to store its data, the way it organizes that data is much closer to Amazon S3. Simply stated, it's a data space consisting of nodes. Each node has a unique ID and can have any number of named attributes. Attribute names are defined globally, so an attribute named "foo" attached to node 123 has the same characteristics as an attribute named "foo" attached to node 456.
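To make the node model concrete, here is a minimal in-memory sketch. It is not AMI's actual code: the class names, the `ATTRIBUTE_TYPES` registry and the `set` helper are all made up for illustration, and a plain Java class stands in for an attribute type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the node model; none of these names come from AMI.
public class NodeSketch {
    /** A node: a unique ID plus any number of named attributes. */
    static final class Node {
        final long id;
        final Map<String, Object> attributes = new HashMap<>();
        Node(long id) { this.id = id; }
    }

    /** Global registry: one definition per attribute name, shared by all nodes. */
    static final Map<String, Class<?>> ATTRIBUTE_TYPES = new HashMap<>();

    /** Sets an attribute, rejecting undefined names and values of the wrong type. */
    static void set(Node node, String name, Object value) {
        Class<?> type = ATTRIBUTE_TYPES.get(name);
        if (type == null) {
            throw new IllegalArgumentException("Undefined attribute: " + name);
        }
        if (!type.isInstance(value)) {
            throw new IllegalArgumentException(name + " expects " + type.getSimpleName());
        }
        node.attributes.put(name, value);
    }

    public static void main(String[] args) {
        ATTRIBUTE_TYPES.put("foo", String.class); // "foo" is defined once, globally

        Node a = new Node(123);
        Node b = new Node(456);
        set(a, "foo", "hello"); // same name, same type on both nodes
        set(b, "foo", "world");
        System.out.println(a.attributes.get("foo") + " " + b.attributes.get("foo"));
    }
}
```

Because the registry is keyed by name alone, "foo" cannot be a string on one node and a number on another.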

Associated with each attribute name is a type. Attribute types are defined in AMI by Java classes that implement a common interface. The interface defines a mechanism for serializing and deserializing the attribute data so it can be stored and retrieved. It also defines mechanisms for searching for attribute values.
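The shape of such an interface might look roughly like the sketch below. The interface name, its three methods and the string-based storage format are my assumptions for illustration, not AMI's real API; the search rule for timestamps (match anything at or after the query time) is likewise invented.

```java
import java.time.Instant;

// Hypothetical sketch of an attribute-type interface; names are not AMI's.
public class AttributeTypeSketch {
    interface AttributeType<T> {
        String serialize(T value);              // value -> stored form
        T deserialize(String stored);           // stored form -> value
        boolean matches(T value, String query); // search hook
    }

    /** A timestamp type stored as milliseconds since the epoch. */
    static final class TimestampType implements AttributeType<Instant> {
        @Override
        public String serialize(Instant value) {
            return Long.toString(value.toEpochMilli());
        }

        @Override
        public Instant deserialize(String stored) {
            return Instant.ofEpochMilli(Long.parseLong(stored));
        }

        /** Assumed search rule: the query is an ISO-8601 instant; match values at or after it. */
        @Override
        public boolean matches(Instant value, String query) {
            return !value.isBefore(Instant.parse(query));
        }
    }

    public static void main(String[] args) {
        TimestampType type = new TimestampType();
        Instant posted = Instant.parse("2009-01-07T00:00:00Z");

        String stored = type.serialize(posted);
        System.out.println(type.deserialize(stored).equals(posted));      // round trip
        System.out.println(type.matches(posted, "2009-01-01T00:00:00Z")); // search
    }
}
```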

AMI defines the usual collection of primitive types such as string, boolean, and integer. It also defines some more specialized types such as timestamp, some collection types such as SetOfTimestamp, and a reference type that allows one node to refer to another. This last type provides a lot of power and flexibility to the data structure.

Let's take the RSS reader daemon as an example. This daemon is defined by a node whose "handler" attribute contains the name of a Java class that implements the Daemon interface. AMI automatically starts a thread for the daemon, which periodically checks an RSS feed. The details of how and when the daemon is started, which URL to check, etc. are contained in the node's attributes. AMI users are also defined by nodes. Any given user can subscribe to the RSS feed by creating a subscription node that links the user node to the RSS daemon node via a couple of reference type attributes. This provides a many-to-many linkage between daemons and users.
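Here is a small sketch of that linkage, assuming references are stored as the target node's ID. Only the "handler" attribute name comes from the description above; the "subscriber" and "feed" attribute names, the handler class name and the helper methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of subscription nodes; not AMI's actual classes.
public class SubscriptionSketch {
    static final class Node {
        final long id;
        final Map<String, Object> attributes = new HashMap<>();
        Node(long id) { this.id = id; }
    }

    /** All nodes by ID; a TreeMap keeps iteration in ID order. */
    static final Map<Long, Node> NODES = new TreeMap<>();
    static long nextId = 1;

    static Node create() {
        Node node = new Node(nextId++);
        NODES.put(node.id, node);
        return node;
    }

    /** Creates a subscription node holding two references: one to the user, one to the daemon. */
    static Node subscribe(Node user, Node daemon) {
        Node subscription = create();
        subscription.attributes.put("subscriber", user.id); // assumed attribute name
        subscription.attributes.put("feed", daemon.id);     // assumed attribute name
        return subscription;
    }

    /** Finds the IDs of all users subscribed to the given daemon by scanning subscription nodes. */
    static List<Long> subscribersOf(Node daemon) {
        List<Long> result = new ArrayList<>();
        for (Node node : NODES.values()) {
            if (Long.valueOf(daemon.id).equals(node.attributes.get("feed"))) {
                result.add((Long) node.attributes.get("subscriber"));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Node daemon = create();
        daemon.attributes.put("handler", "com.example.RssReaderDaemon"); // made-up class name

        Node alice = create();
        Node bob = create();
        subscribe(alice, daemon);
        subscribe(bob, daemon);

        System.out.println(subscribersOf(daemon));
    }
}
```

Since each subscription is its own node, one user can hold many subscriptions and one daemon can have many subscribers without either node changing.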

As the RSS daemon reads the feed and finds new items, it creates nodes to contain them. Many of the XML attributes associated with an item translate directly into node attributes in AMI. If any kind of error occurs during the process, AMI creates a log node and links it to the daemon node.
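A rough sketch of that step is below, with feed parsing left out entirely. The `storeItem` helper, the "message" and "daemon" attribute names on the log node, and the rule that an item without a link counts as an error are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of turning feed items into nodes; not AMI's actual code.
public class FeedItemSketch {
    static final class Node {
        final long id;
        final Map<String, Object> attributes = new LinkedHashMap<>();
        Node(long id) { this.id = id; }
    }

    static final List<Node> NODES = new ArrayList<>();

    static Node create() {
        Node node = new Node(NODES.size() + 1);
        NODES.add(node);
        return node;
    }

    /** Stores one feed item as a node; on failure, creates a log node linked to the daemon. */
    static Node storeItem(Node daemon, Map<String, String> itemFields) {
        try {
            if (!itemFields.containsKey("link")) { // assumed validation rule
                throw new IllegalArgumentException("item has no link");
            }
            Node item = create();
            item.attributes.putAll(itemFields); // XML fields map straight onto attributes
            return item;
        } catch (RuntimeException e) {
            Node log = create();
            log.attributes.put("message", e.getMessage());
            log.attributes.put("daemon", daemon.id); // reference back to the daemon node
            return log;
        }
    }

    public static void main(String[] args) {
        Node daemon = create();

        Map<String, String> fields = new HashMap<>();
        fields.put("title", "AMI's data structure");
        fields.put("link", "http://example.com/post");
        System.out.println(storeItem(daemon, fields).attributes.get("title"));

        Map<String, String> broken = new HashMap<>();
        System.out.println(storeItem(daemon, broken).attributes.get("message"));
    }
}
```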

In summary, this data structure provides a great deal of power and flexibility and can be used for just about any kind of application. It does require more processing at run-time than a traditional SQL database, but it is more dynamic, more alive, more suited to collaborative applications and artificial intelligence.
