
React, etc. Tech Stack

React, Flux, GraphQL, Hack, HHVM...? All of this and more!

Graph Query Languages: GraphQL, OpenCypher, Gremlin and SPARQL

Graph databases are popping up everywhere. During 2016 they're bound to come up even more and will probably reach mainstream acceptance, much as NoSQL document databases did a few years back. RDF and the Semantic Web have held the promise of powerful graph data for a long time, but have not become a staple for web developers.

Applications that traditionally relied on SQL now often have supporting search engines (such as Solr or Elasticsearch) or NoSQL document databases such as MongoDB. The common denominator between the various popular relational database systems is SQL (Structured Query Language), even though the line between MySQL and NoSQL storage is blurring.

There are specific flavours of SQL and not all features are supported on all servers, but the situation is much worse between the different flavours of document databases and search engines. Each burdens developers with its own query language syntax, which makes learning any one of them a heavy investment.

With the rise of Graph Databases in Web Content Management Systems and other mainstream web application domains, the inevitable race for the de facto product(s) is on. Neo4j was the earliest to gain popularity, but now ArangoDB and OrientDB are picking up too.

This brings up the question of language. Could we have a shared query language for all of these databases? Are these even comparable? Let's take a look at some options and usages.

GraphQL for edge communications

GraphQL describes itself as a Data Query Language and Runtime. It is not really an "SQL equivalent" for Graph Databases at all. It is aimed at communication between clients and server endpoints, typically over HTTP.

Like Facebook's Open Source products in general, GraphQL is extremely practical and has been used in-house since 2012. It allows easy access to (and manipulation of) resources: the request itself defines the structure of the expected JSON response.
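To make the idea of "the request defines the structure of the response" a little more tangible, here is a minimal Python sketch. It is not a GraphQL implementation; the `select` helper, the nested-dict query format and the sample data are all assumptions made up for illustration.

```python
# Toy illustration only: this is NOT a GraphQL implementation.
# `select`, `query` and `store` are hypothetical names for this sketch.

def select(data, selection):
    """Return only the fields of `data` named in `selection`.

    `selection` maps a field name to True (a leaf value) or to a
    nested selection dict (a sub-object or a list of sub-objects).
    """
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub in selection.items():
        value = data[field]
        result[field] = value if sub is True else select(value, sub)
    return result


# Roughly what the GraphQL query
#   { user { name friends { name } } }
# asks for, written here as a nested dict.
query = {"user": {"name": True, "friends": {"name": True}}}

store = {
    "user": {
        "id": 1,
        "name": "Alice",
        "email": "alice@example.com",
        "friends": [
            {"id": 2, "name": "Bob", "email": "bob@example.com"},
            {"id": 3, "name": "Carol", "email": "carol@example.com"},
        ],
    }
}

response = select(store, query)
print(response)
```

The response contains only the fields the client asked for, nested the same way as the request, which is the property that makes this style attractive for web APIs.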

GraphQL is a great solution for building web APIs on top of all types of data stores (SQL, NoSQL, graph databases...) and is worth keeping an eye on (or even starting to implement). As an open standard, GraphQL is a great choice for communicating with the outside world.

OpenCypher and Gremlin for the database

OpenCypher and Gremlin are the most comparable pair of the bunch. Both are directly comparable to what SQL offers in the relational database domain: they allow retrieval and manipulation of data directly at the graph database level.
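As a rough comparison, the sketch below asks the same question ("who does Alice know?") three ways. Only the SQL version is executed, using Python's built-in `sqlite3` module; the Cypher and Gremlin versions are shown as query text. The schema, labels, relationship types and property names are assumptions invented for this example.

```python
import sqlite3

# Hypothetical schema for this sketch: people plus a "knows" join table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE knows (from_id INTEGER, to_id INTEGER);
    INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO knows VALUES (1, 2), (1, 3), (2, 3);
""")

# Relational version: the relationship is resolved with two joins.
SQL = """
    SELECT friend.name
    FROM person AS p
    JOIN knows AS k ON k.from_id = p.id
    JOIN person AS friend ON friend.id = k.to_id
    WHERE p.name = ?
    ORDER BY friend.name
"""
friends = [row[0] for row in conn.execute(SQL, ("Alice",))]
print(friends)

# The same question as query text for a graph database (not executed here).
# Cypher describes the pattern to match; Gremlin describes the steps to walk.
CYPHER = "MATCH (p:Person {name: 'Alice'})-[:KNOWS]->(f:Person) RETURN f.name"
GREMLIN = "g.V().has('person', 'name', 'Alice').out('knows').values('name')"
```

In both graph languages the relationship is a first-class part of the query rather than something reconstructed through a join table.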

OpenCypher is a recent effort to open-source the popular (and widely liked) Cypher language used by the Neo4j Graph Database. Announced in October 2015, the initiative aims to make the Cypher language vendor-independent by providing a proven query language specification. Neo4j's strategy is evidently to drive adoption of the language while investing so that its own product remains the best implementation of it.

Gremlin describes itself as a graph traversal language. It has been in development since 2009 and is part of the Apache TinkerPop project. Gremlin has been adopted in multiple products, including OrientDB and Neo4j. It comes with great resources such as SQL2Gremlin for learning the language and is already widely supported, with drivers for PHP and other languages.

OpenCypher and Gremlin are currently the likely candidates to become the common languages for people working with Graph Databases. Gremlin has a head start and is already widely in use, but on the other hand the Cypher language from which OpenCypher stems is hugely popular among Neo4j users.

SPARQL for the Semantic Web

SPARQL is a language designed for querying RDF graphs. It is a W3C Recommendation and aims to drive the semantic web forward. It relies on the Resource Description Framework (RDF), a data model for the semantic web. The Gremlin documentation describes SPARQL like this:

SPARQL is a popular query language for RDF graphs. SPARQL is simple and intuitive, though it lacks various constructs for expressing any arbitrary graph query (e.g. looping and branching constructs).
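As a rough picture of what querying an RDF graph involves, the Python sketch below matches triple patterns against a small set of subject-predicate-object triples. It is a toy matcher, not a SPARQL engine; the data and the `match` and `query` helpers are hypothetical and exist only for this illustration.

```python
# Toy matcher, not a SPARQL engine. Data and helper names are hypothetical.
triples = {
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "carol"),
    ("bob", "name", "Bob"),
    ("carol", "name", "Carol"),
}


def match(pattern, triple, bindings):
    """Try to extend `bindings` so that `pattern` equals `triple`.

    Pattern terms starting with "?" are variables; anything else must
    match the triple exactly. Returns the new bindings, or None.
    """
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None  # variable already bound to something else
        elif term != value:
            return None
    return new


def query(patterns):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Comparable in spirit to the SPARQL query
#   SELECT ?name WHERE { :alice :knows ?friend . ?friend :name ?name }
rows = query([("alice", "knows", "?friend"), ("?friend", "name", "?name")])
names = sorted(row["?name"] for row in rows)
print(names)
```

Note that the query is a fixed list of patterns: following `knows` edges to an arbitrary depth would need a loop outside the query, which is the kind of construct the quote above says is missing.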

SPARQL does not seem to be in the race to become a widely adopted "SQL of Graphs". As a language it will definitely stay around, but it seems more likely to be used for querying large, open data sets than closed local ones: for example archived data, or data that crawlers have collected from RDF metadata markup on the web.

Written by Jorgé on Monday October 26, 2015

Tags: graphql, opencypher, SPARQL, gremlin
