Flexible Storage System (FS2)

Store your data anywhere, and move it at any time, with no client code impact.

FS2 lets you manage arbitrary data objects using familiar operations such as create, read, update, and delete. Through a simple interface that abstracts the underlying persistence details, applications interact with their data much as you would use a terminal to work with a local filesystem. Behind the scenes, a storage provider does the heavy lifting, whether that means integrating with a cloud storage service like Google Cloud Storage or Amazon S3, a MongoDB database, a file share, or even Memcached.

Features at a glance

  • Create, read, update, and delete data objects without persistence-layer awareness
  • Hierarchical arrangement of objects, and list functionality with regular expression filters
  • Use object literals (JSON) to represent configuration and object metadata
  • High-level abstraction for data migration and backup
  • Stream-based API

Uses

  • Store and manage data objects anywhere you need to
  • Replace the native Java File API or Apache Commons IO FileUtils for filesystem manipulation
  • Share large data objects between appliances within an SOA environment by URI reference, downloading payloads only when necessary
  • Move easily from public to private clouds as your needs change
  • Chain several storage providers for data duplication and redundancy
  • Use as a lightweight alternative to JSR 170 implementations such as Apache Jackrabbit

Roadmap

  • Complete prioritized todos
  • Complete MongoDB implementation
  • Add compression and encryption functionality
  • Front with a RESTful web service
  • Add security features
  • Integrate with the boto project so that users can access FS2 from a terminal

Tip: Achieve a tighter development cycle by storing data in memory or on the file system during early development and unit testing, and move to a storage provider or local data center nearer to deployment time.

Why FS2?

There are similar technologies out there such as Apache Commons Virtual File System (VFS) and Apache Jackrabbit. FS2 distinguishes itself with the following:

  • FS2 Objects are relatively simple, having a URI, headers, and a blob. This makes it a natural fit to map and handle web requests.
  • Storage-agnostic API. Client code does not know about the underlying persistence store. For example, a VFS URL may look like jar:/a/b, exposing the backing store in its scheme, whereas an FS2 URI is simply fs2:/a/b/.
  • TDD friendly. Use the default (in-memory) FS2 repo while developing, so you can test easily without dealing with the complexities of database or filesystem stores. When the code is ready for prime time, simply flip a switch (i.e., change "mem" to "mongo") and objects will be persisted.
  • Lightweight dependencies. The core FS2 API code is lightweight, and for any given deployment scenario, you need only to include the concrete repository that will be used.
  • Built-in tests. It's easy to have confidence in a new concrete repository implementation when you can plug it right into an existing test framework.
  • Less config. By default, FS2 stores objects in memory, and there is zero configuration required.
  • Easier config. FS2 does not require heaps of XML files defining factories in order to work. Just override the default values you wish to change, either in code or by providing a properties file in JSON format.
  • Leaves the typical heavyweight feel of Java frameworks behind. For example, it uses JSON for configuration and object descriptors, and relies on default values so that the only configuration you write is override configuration.
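
The "override only what you change" idea from the list above can be sketched as follows. This is a minimal illustration, not FS2's actual configuration code: the class name, the property keys ("repository", "rootPath"), and their default values are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

class OverrideConfigSketch {
    // Hypothetical defaults; the real FS2 property names and values may differ.
    static final Map<String, String> DEFAULTS = Map.of(
            "repository", "mem",
            "rootPath", "/");

    // Returns the defaults with any supplied overrides layered on top.
    static Map<String, String> effectiveConfig(Map<String, String> overrides) {
        Map<String, String> merged = new HashMap<>(DEFAULTS);
        merged.putAll(overrides);
        return merged;
    }

    public static void main(String[] args) {
        // Flip the single "switch" and leave everything else at its default.
        Map<String, String> config = effectiveConfig(Map.of("repository", "mongo"));
        System.out.println(config.get("repository")); // overridden value
        System.out.println(config.get("rootPath"));   // default value is kept
    }
}
```

With no overrides at all, the effective configuration is simply the defaults, which is what makes a zero-configuration start possible.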

Interacting with the API

FS2 decouples meta-information such as object size, creation date, and compression from the object itself. The most common metadata, such as the creation date, is modeled within the meta class. Extended attributes are held in unmodeled key-value pairs called headers. Client code can interact with the FS2 API using either the meta object or the object's URI. Here is a simple CRUD example.

Basic CRUD

// get an instance of fs2 with default properties
FlexibleStorageSystem FS2 = FS2Factory.newInstance();

// create the object.  initially this object will have no data associated with it, just a name and some initial metadata.
FS2ObjectMeta foo = FS2.createObjectEntry("/foo");
    
// add a custom header field to object foo
FS2.addHeader(foo, "isText", "true");
    
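// add some contents to foo (an explicit charset avoids depending on the platform default)
InputStream is = new ByteArrayInputStream("hello world".getBytes(java.nio.charset.StandardCharsets.UTF_8));
FS2.writePayloadFromStream(foo, is);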
    
// delete
FS2.delete(foo);
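
To make the separation between metadata and payload more concrete, here is a small self-contained sketch. It does not use the FS2 classes: `ObjectMetaSketch`, its nested `Meta` type, and every method on them are hypothetical stand-ins that only mirror the idea of modeled fields, free-form headers, and a payload stored separately under the object's URI.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class ObjectMetaSketch {
    // Hypothetical metadata record: modeled fields plus unmodeled headers.
    static class Meta {
        final String uri;          // modeled attribute
        final long createdMillis;  // modeled attribute
        final Map<String, String> headers = new LinkedHashMap<>(); // extended attributes

        Meta(String uri, long createdMillis) {
            this.uri = uri;
            this.createdMillis = createdMillis;
        }
    }

    private final Map<String, Meta> metas = new HashMap<>();
    private final Map<String, byte[]> payloads = new HashMap<>();

    // Creates an entry with metadata only; no payload is associated yet.
    Meta createEntry(String uri) {
        Meta meta = new Meta(uri, System.currentTimeMillis());
        metas.put(uri, meta);
        return meta;
    }

    // Stores the payload separately, keyed by the object's URI.
    void writePayload(Meta meta, byte[] data) {
        payloads.put(meta.uri, data.clone());
    }

    boolean hasPayload(Meta meta) {
        return payloads.containsKey(meta.uri);
    }

    public static void main(String[] args) {
        ObjectMetaSketch store = new ObjectMetaSketch();
        Meta foo = store.createEntry("/foo");
        foo.headers.put("isText", "true");
        System.out.println(store.hasPayload(foo)); // false: metadata exists without data
        store.writePayload(foo, "hello world".getBytes(StandardCharsets.UTF_8));
        System.out.println(store.hasPayload(foo)); // true
    }
}
```

Keeping the payload out of the metadata record means headers can be read or updated without touching a potentially large blob.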

FS2 also supports the hierarchical arrangement of objects, as you would see in a filesystem or any tree-like data structure.

  • In the code below, we create five distinct objects, all belonging to a tree with root /foo
    • This is a convenience method, and objects can just as easily be created one by one.
    • The nodes are created in order of the parameter list.
    • Any nodes in a path that do not previously exist will be created implicitly.
    • If a specified node already exists, an exception will be thrown.
      • For example, if the first two parameters below were switched, creating "/foo" would fail because it was already created implicitly when "/foo/bar" was created.
    • The (meta) objects that were explicitly created will be returned by the method.
  • Then we get the reference to bar's metadata by accessing the returned array.
    • Alternatively, we can access bar by providing its URI directly to FS2. Note that this means an additional fetch.
  • Children and descendants can be listed with or without regular expression filters.
  • To delete a node with children, use deleteRecursive() or an exception will be thrown.

Working with trees

// create five nodes
FS2ObjectMeta[] nodes = FS2.createObjectEntries("/foo", "/foo/bar", "/foo/baz", "/foo/bar/bam", "/foo/bar/moo");

// can access meta like this
FS2ObjectMeta foo = nodes[0];
FS2ObjectMeta bar = nodes[1];

// sanity check that a node exists by refetching it from FS2. note that bar2 should equal() bar unless another thread updated it (e.g. its headers) between create and fetch
FS2ObjectMeta bar2 = FS2.fetchObject("/foo/bar");

// list descendants of foo whose names begin with "m" (expect moo)
// assumes the filter is a java.util.regex pattern matched against the full path
FS2.listDescendants(foo, ".*/m.*");
    
// delete bar and bar/bam
FS2.deleteRecursive(bar);

// get foo's remaining descendants.  (expect just baz)
FS2.listDescendants(nodes[0]);
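
The tree rules described above (creation in parameter order, implicit creation of missing ancestors, failure on a pre-existing node, regex-filtered listing, recursive delete) can be sketched with a tiny in-memory model. This is not the FS2 implementation: `TreeSemanticsSketch` and its methods are hypothetical, and the sketch assumes the filter is a `java.util.regex` pattern matched against the full path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.regex.Pattern;

class TreeSemanticsSketch {
    // Sorted set of node paths; a stand-in for a real storage provider.
    private final TreeSet<String> paths = new TreeSet<>();

    // Creates each path in order, implicitly creating missing ancestors.
    // Throws if an explicitly requested path already exists.
    List<String> createEntries(String... requested) {
        List<String> created = new ArrayList<>();
        for (String path : requested) {
            if (paths.contains(path)) {
                throw new IllegalStateException("already exists: " + path);
            }
            // Add every ancestor prefix, e.g. "/foo" for "/foo/bar".
            for (int i = path.indexOf('/', 1); i > 0; i = path.indexOf('/', i + 1)) {
                paths.add(path.substring(0, i));
            }
            paths.add(path);
            created.add(path); // only explicitly requested nodes are returned
        }
        return created;
    }

    // Lists descendants of parent whose full path matches the regex.
    List<String> listDescendants(String parent, String regex) {
        Pattern filter = Pattern.compile(regex);
        List<String> result = new ArrayList<>();
        for (String path : paths) {
            if (path.startsWith(parent + "/") && filter.matcher(path).matches()) {
                result.add(path);
            }
        }
        return result;
    }

    // Removes a node and everything beneath it.
    void deleteRecursive(String node) {
        paths.removeIf(p -> p.equals(node) || p.startsWith(node + "/"));
    }

    public static void main(String[] args) {
        TreeSemanticsSketch tree = new TreeSemanticsSketch();
        tree.createEntries("/foo", "/foo/bar", "/foo/baz", "/foo/bar/bam", "/foo/bar/moo");
        System.out.println(tree.listDescendants("/foo", ".*/m.*")); // [/foo/bar/moo]
        tree.deleteRecursive("/foo/bar");
        System.out.println(tree.listDescendants("/foo", ".*"));     // [/foo/baz]
    }
}
```

Calling `createEntries("/foo/bar", "/foo")` on a fresh instance throws, because "/foo" already exists implicitly by the time it is requested explicitly.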

Data migration

Migrating from one storage provider to another can be done with just a few lines of code. It requires obtaining two instances of FS2, one for each provider, and making a single copy call. Below is an example of migrating from the file storage provider to a Mongo-backed storage provider.

// copy all contents from the file repo to MongoDB
FlexibleStorageSystem fileRepo = FS2Factory.newInstance("file"); // override default and force an instance backed by the "file" storage provider
FlexibleStorageSystem mongoRepo = FS2Factory.newInstance("mongo"); // override default and force an instance backed by the "mongo" storage provider
fileRepo.copyTo(mongoRepo);

Note the monikers used to create the FS2 instances. These are mapped to fully qualified class names in a property set that FS2 reads at bootstrap. The configuration properties can be extended to map any number of FS2 storage provider implementations. Out of the box, file and mem are the only two that are always configured.
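
One way such a moniker lookup could work is sketched below. Everything here is an assumption for illustration: the `Provider` interface, the `MemProvider` class, the "backup" moniker, and the `copy` helper are hypothetical and are not taken from the FS2 code base. The sketch maps monikers to fully qualified class names, instantiates the provider reflectively, and copies every object from one provider to another.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class MonikerRegistrySketch {
    // Hypothetical provider contract; a real storage provider would be richer.
    interface Provider {
        void put(String uri, byte[] payload);
        Map<String, byte[]> all();
    }

    // Simple in-memory provider, used for both monikers in this sketch.
    public static class MemProvider implements Provider {
        private final Map<String, byte[]> store = new HashMap<>();
        public MemProvider() {}
        public void put(String uri, byte[] payload) { store.put(uri, payload); }
        public Map<String, byte[]> all() { return store; }
    }

    // Moniker -> fully qualified class name, as a bootstrap property set might hold.
    static final Map<String, String> PROVIDERS = new HashMap<>();
    static {
        PROVIDERS.put("mem", MemProvider.class.getName());
        PROVIDERS.put("backup", MemProvider.class.getName()); // hypothetical extra moniker
    }

    // Resolves the moniker to a class name and instantiates it reflectively.
    static Provider newInstance(String moniker) {
        String className = PROVIDERS.get(moniker);
        if (className == null) {
            throw new IllegalArgumentException("unknown moniker: " + moniker);
        }
        try {
            return (Provider) Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot create provider " + className, e);
        }
    }

    // Copies every object from one provider to the other.
    static void copy(Provider from, Provider to) {
        from.all().forEach(to::put);
    }

    public static void main(String[] args) {
        Provider source = newInstance("mem");
        source.put("/foo", "hello".getBytes(StandardCharsets.UTF_8));
        Provider target = newInstance("backup");
        copy(source, target);
        System.out.println(target.all().keySet()); // [/foo]
    }
}
```

Because client code only names a moniker, adding a new provider in this model means adding one entry to the map rather than changing any caller.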