September 30, 2011

Groovy Graphs

As a software developer you're probably no stranger to SQL databases. There's every chance you even had to sit through seriously boring courses on database normalisation and all forms of joins. In fact, I even taught some very basic SQL to biology and bio-engineering students (sincere apologies to them :).

What's more interesting is that the last decade or so has seen the rise of a NoSQL movement, which is trying out alternative database designs. These systems trade off some of the ACID properties in order to gain greater scalability, greater performance, etc. You might have heard of Google's BigTable, for instance, which is like a giant distributed dictionary (or HashMap, for the Java crowd).

One category of NoSQL database which interests me is graph databases. These focus more on the relationships between entities, and efficient matching of these. This could be useful if you wanted to, for instance, query complex syntax trees to do program analysis. Or maybe in a genealogy application where your users want to do complex queries on their family ancestry.
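To make the genealogy idea a bit more concrete, here is a minimal sketch of an ancestry query as a graph walk. It uses plain Java collections rather than any graph database, and the names FamilyGraph, addParents and ancestorsOf are made up for illustration. A real graph database would run this kind of traversal for you, over data on disk.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory family graph; plain collections, not a graph database API.
class FamilyGraph {
    // Each person maps to the list of their recorded parents (a "child of" edge).
    private final Map<String, List<String>> parents = new LinkedHashMap<>();

    void addParents(String child, String... parentNames) {
        parents.put(child, Arrays.asList(parentNames));
    }

    // Breadth-first walk along the parent edges, collecting every ancestor once.
    Set<String> ancestorsOf(String person) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>(parentsOf(person));
        while (!queue.isEmpty()) {
            String current = queue.removeFirst();
            if (seen.add(current)) {
                queue.addAll(parentsOf(current));
            }
        }
        return seen;
    }

    private List<String> parentsOf(String person) {
        return parents.getOrDefault(person, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        FamilyGraph family = new FamilyGraph();
        family.addParents("Alice", "Bob", "Carol");
        family.addParents("Bob", "Dave", "Erin");
        System.out.println(family.ancestorsOf("Alice")); // prints [Bob, Carol, Dave, Erin]
    }
}
```

The point is that the query follows relationships step by step instead of joining tables, which is the access pattern graph databases are built around.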

So I have been playing a little with one such graph database: Neo4j. It's open source, Java based, and has a very simple API. It integrates Lucene for indexing and quick retrieval of nodes, and has support for doing quite complex graph traversals. If there's one feature I miss, it's graph matching, where you could give it an example graph and let it find matching instances in the full one.

Neo4j gets even better when you add in a bit of Groovy. If you check the design guide, this is how Neo4j recommends implementing your business classes in Java:

import org.neo4j.graphdb.Node;

public class Student {
    // Every business object wraps the graph node that stores its data.
    private final Node node;
    public Student(Node node) { this.node = node; }

    public void setName(String name) {
        node.setProperty("name", name);
    }

    public String getName() {
        return (String) node.getProperty("name");
    }

    // And so on for other properties...
}

That's a lot of boilerplate for those properties. Groovy metaprogramming to the rescue!

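import org.neo4j.graphdb.PropertyContainer

// Route Groovy property access to Neo4j's own getProperty/setProperty.
PropertyContainer.metaClass.getProperty = {name -> delegate.getProperty(name)}
PropertyContainer.metaClass.setProperty = {name, val ->
    delegate.setProperty(name, val)}

(I found this here, btw.) With those few lines in place you can now do: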

def node = db.createNode()
node.name = 'John Doe'

So no more need for special business classes. You can now use nodes directly and read/write properties on them as if they were plain old objects. With a bit more Groovy magic you could even set this up so that assigning one node to a property of another node creates a relationship between those nodes instead. And Groovy gurus might even map GPath expressions to graph traversals. At that point you're creating and navigating graph databases as easily as you do object graphs.
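The "assign a node, get a relationship" idea boils down to one dispatch rule, which can be sketched in plain Java. This is only a stand-in: SketchNode, set, get and related are hypothetical names, nothing here touches the Neo4j API, and in Groovy the same check would live inside the setProperty hook so that ordinary assignment syntax triggers it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a graph node; it mimics the idea, not the Neo4j API.
class SketchNode {
    private final Map<String, Object> properties = new HashMap<>();
    private final Map<String, List<SketchNode>> relationships = new HashMap<>();

    // The dispatch rule: a node value becomes a relationship, anything else a property.
    void set(String name, Object value) {
        if (value instanceof SketchNode) {
            relationships.computeIfAbsent(name, k -> new ArrayList<>()).add((SketchNode) value);
        } else {
            properties.put(name, value);
        }
    }

    // Plain property lookup; returns null when the property was never set.
    Object get(String name) {
        return properties.get(name);
    }

    // Nodes reachable through relationships of the given name.
    List<SketchNode> related(String name) {
        return relationships.getOrDefault(name, Collections.<SketchNode>emptyList());
    }

    public static void main(String[] args) {
        SketchNode john = new SketchNode();
        SketchNode jane = new SketchNode();
        john.set("name", "John Doe"); // stored as a property
        john.set("spouse", jane);     // stored as a relationship
        System.out.println(john.get("name"));              // prints John Doe
        System.out.println(john.related("spouse").size()); // prints 1
    }
}
```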

Now that's cool.

1 comment:

ayeeson said...

Awesome, glad you like Neo4j. For any questions, be sure to check out the forums, we have a good group of superusers who are always able to help out!

-Neo4j team