TIQ Solutions - Spend Quality Time with your Data!

Tue, Jun 19th

Neo4j data integration with Pentaho Kettle

While testing the Neo4j JDBC driver, I wanted to find out how well an ETL tool like Pentaho Kettle can handle Neo4j’s Cypher queries to pull data out of the graph database.

Here are the steps for connecting Kettle to the Neo4j database.

First, copy the Neo4j JDBC driver into the Kettle JDBC folder:

data-integration/libext/JDBC

Next, we define a new generic JDBC data source, specifying the driver class and the connection URL:

[Screenshot: creating the Kettle connection to Neo4j]
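Since the connection dialog only needs a driver class and a URL, the same two values can be tried out in plain JDBC before configuring Kettle. The following is a minimal sketch, not the driver's documented API: the driver class name `org.neo4j.jdbc.Driver` and the URL form `jdbc:neo4j://host:port/` are assumptions that should be checked against the driver JAR you copied, and the class and method names are made up for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Sketch only: driver class and URL scheme are ASSUMPTIONS, verify them
// against the documentation of the Neo4j JDBC driver version in use.
class Neo4jJdbcSketch {

    // Assumed driver class name (the value for Kettle's "driver class" field).
    static final String DRIVER_CLASS = "org.neo4j.jdbc.Driver";

    // Builds the assumed connection URL, e.g. jdbc:neo4j://localhost:7474/
    static String buildUrl(String host, int port) {
        return "jdbc:neo4j://" + host + ":" + port + "/";
    }

    // Runs a Cypher query through standard JDBC calls and counts the rows.
    // Requires the driver JAR on the classpath and a running Neo4j server.
    static int countRows(String url, String cypher)
            throws SQLException, ClassNotFoundException {
        Class.forName(DRIVER_CLASS); // throws if the JAR is missing
        try (Connection con = DriverManager.getConnection(url);
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(cypher)) {
            int rows = 0;
            while (rs.next()) {
                rows++;
            }
            return rows;
        }
    }
}
```

If `countRows(Neo4jJdbcSketch.buildUrl("localhost", 7474), "START n=node(*) RETURN ID(n)")` succeeds outside Kettle, the same driver class and URL should be usable in the generic JDBC connection.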

Then we can build up a simple Kettle transformation like this:

[Screenshot: Kettle transformation]

In the first step, “Cypher Query”, we can enter the query and preview the data:

[Screenshot: Cypher query input]

[Screenshot: Cypher query data preview]

Then we propagate the input step fields to the output step, which writes them to an Excel file. This runs quite well.

Overall, this is a workable data integration solution. Performance seems to be OK so far, but I will do some more tests.

There is only one minor issue for now: returning a whole node or relationship as a string causes an exception. Queries like these do not work:

START n=node(*) RETURN n
START r=relationship(*) RETURN r

But who needs the JSON-like result string if you can access all the properties? I think this will be fixed soon anyway.
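Until then, the workaround is to return the properties explicitly instead of the node or relationship itself. Here is a small sketch of a helper that builds such a query string. The helper name and the property names `name` and `age` are hypothetical, and depending on the Cypher version, properties that are missing on some nodes may need extra handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: builds a Cypher query that returns the ID plus selected
// properties, avoiding "RETURN n". Property names are hypothetical.
class CypherPropertyQuerySketch {

    // Produces: START <var>=<lookup> RETURN ID(<var>), <var>.<p1>, <var>.<p2>, ...
    static String propertyQuery(String var, String lookup, List<String> properties) {
        List<String> columns = new ArrayList<>();
        columns.add("ID(" + var + ")");
        for (String p : properties) {
            columns.add(var + "." + p);
        }
        return "START " + var + "=" + lookup + " RETURN " + String.join(", ", columns);
    }

    public static void main(String[] args) {
        // prints: START n=node(*) RETURN ID(n), n.name, n.age
        System.out.println(propertyQuery("n", "node(*)", Arrays.asList("name", "age")));
    }
}
```

The resulting string can be pasted into the “Cypher Query” step, so each property arrives in Kettle as its own field.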