Get up and running with Tinkerpop 3 and PHP

TinkerPop 3 has just been released with a whole new set of features and a new version of the Gremlin language. These changes make working with graph databases a breeze in PHP. Let’s see how.

Advantages

There are multiple advantages to working with the TP3 stack. The most important one is simply that you can switch backend databases at will.
If you don’t know whether to go with Neo4j or Titan, don’t worry: you can simply start with one and switch to the other if needed.

The available backend databases are:

  • Neo4j
  • Titan
  • Elasticsearch
  • Giraph
  • Spark
  • TinkerGraph
  • OrientDB (coming soon)

How it works – Installing the server

You simply need to install gremlin-server. Below is how to do it on Linux. Just replace <INSTALLATION_DIR> with the directory you want to install into.

wget -P <INSTALLATION_DIR> https://www.apache.org/dist/incubator/tinkerpop/3.0.1-incubating/apache-gremlin-server-3.0.1-incubating-bin.zip
cd <INSTALLATION_DIR>
unzip apache-gremlin-server-3.0.1-incubating-bin.zip

The default configuration should be good enough to get you up and running. To run the server in a terminal, simply run:

cd <INSTALLATION_DIR>/apache-gremlin-server-3.0.1-incubating
bin/gremlin-server.sh conf/gremlin-server-modern.yaml
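
Before moving on, it can help to confirm that something is actually listening where the PHP driver will try to connect. Here is a minimal sketch of such a check. It assumes the server is on localhost and uses Gremlin Server's default port 8182 (check the YAML file if you changed it); isPortOpen() is a hypothetical helper written for this post, not part of any library.

```php
<?php
// Hypothetical helper: returns true when a TCP connection to $host:$port
// can be opened within $timeout seconds.
function isPortOpen(string $host, int $port, float $timeout = 1.0): bool
{
    // @ silences the warning fsockopen() raises on failure; we only want a boolean.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

// Assumption: the server listens on localhost:8182.
echo isPortOpen('localhost', 8182)
    ? "Something is listening on port 8182\n"
    : "Nothing is listening on port 8182\n";
```

Note that this only tells you a process accepts connections on that port, not that it is Gremlin Server. The real test is running a query as shown in the next section.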

Using the PHP driver

The PHP driver can be found here. It is best installed via Composer. If you want to know how to install Composer, check this link out.

The simplest way is to go to your project folder and run the following command:

php composer.phar require brightzone/gremlin-php "*"

Or add this to your composer.json file:

"brightzone/gremlin-php": "*"

Once this is done, you can easily connect to the server and run queries in PHP. For example:

require_once('vendor/autoload.php'); // depending on your project this may not be necessary
use \brightzone\rexpro\Connection;

$db = new Connection([
   'host' => 'localhost',
   'graph' => 'graph'
]);
$db->open();

$result = $db->send('5+5'); //result = [10]
//do something with result
$db->close();


Working with graphs

If you followed all the instructions in this post, your graph database should load with the modern graph. I’ve included this graph below for reference and to use in examples:

Modern graph

Given this graph, let’s return the names of all the people:


use \brightzone\rexpro\Connection;

$db = new Connection([
 'host' => 'localhost',
 'graph' => 'graph'
]);
$db->open();

$query = 'g.V().has(label, "person").values("name")';

$result = $db->send($query);
print_r($result);
$db->close();

This will return the following array:

Array ( 
    [0] => marko
    [1] => vadas
    [2] => josh
    [3] => peter
 )

If you wanted to get the full set of entries with all properties, you could have used g.V().has(label, "person") to get the following response:

Array (
    [0] => Array (
        [id] => 1
        [label] => person
        [type] => vertex
        [properties] => Array (
            [name] => Array (
                [0] => Array (
                    [id] => 0
                    [value] => marko
                )
            )
            [age] => Array (
                [0] => Array (
                    [id] => 2
                    [value] => 29
                )
            )
        )
    )
    ... 3 more entries ...
)

This response is a little verbose but it contains all the information you need.
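
If you would rather work with a flat array, you can collapse this structure yourself. The sketch below assumes each vertex has exactly the shape shown above (id, label, and a properties array whose entries are lists of id/value pairs). flattenVertex() is a hypothetical helper, not something the driver provides.

```php
<?php
// Hypothetical helper: turns one verbose vertex array into a flat associative array.
function flattenVertex(array $vertex): array
{
    $flat = ['id' => $vertex['id'], 'label' => $vertex['label']];
    foreach ($vertex['properties'] ?? [] as $key => $entries) {
        // Each property holds a list of ['id' => ..., 'value' => ...] entries.
        $values = array_column($entries, 'value');
        // A single value becomes a scalar; several values stay a list.
        $flat[$key] = count($values) === 1 ? $values[0] : $values;
    }
    return $flat;
}

// Sample data copied from the response above.
$marko = [
    'id' => 1,
    'label' => 'person',
    'type' => 'vertex',
    'properties' => [
        'name' => [['id' => 0, 'value' => 'marko']],
        'age'  => [['id' => 2, 'value' => 29]],
    ],
];
print_r(flattenVertex($marko));
```

You could apply it to a whole result with array_map('flattenVertex', $result). Be aware that a property literally named id or label would overwrite the vertex's own id or label in this sketch.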

Let’s run a second query. This time we would like to find all the software the person josh has created and return all properties:


use \brightzone\rexpro\Connection;

$db = new Connection([
    'host' => 'localhost',
    'graph' => 'graph'
]);
$db->open();

$query = 'g.V().has(label, "person")
               .has("name", "josh")
               .out("created")
               .valueMap()';

$result = $db->send($query);
print_r($result);
$db->close();

The result of this query is:

Array (
    [0] => Array (
        [name] => Array (
            [0] => ripple
        )
        [lang] => Array (
            [0] => java
        )
    )
    [1] => Array (
        [name] => Array (
            [0] => lop
        )
        [lang] => Array (
            [0] => java
        )
    )
)
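
As you can see, valueMap() wraps every property value in a list, even when there is only one. If that gets in the way, a small helper can unwrap the single values. This is a sketch that assumes the result has exactly the shape printed above; flattenValueMap() is a hypothetical name, not part of the driver.

```php
<?php
// Hypothetical helper: unwraps single-element lists in a valueMap() result.
function flattenValueMap(array $rows): array
{
    return array_map(function (array $row): array {
        // array_map() with a single array keeps the string keys (name, lang, ...).
        return array_map(function ($values) {
            // Unwrap lists holding exactly one value; leave everything else untouched.
            return is_array($values) && count($values) === 1 ? reset($values) : $values;
        }, $row);
    }, $rows);
}

// Sample data copied from the result above.
$result = [
    ['name' => ['ripple'], 'lang' => ['java']],
    ['name' => ['lop'], 'lang' => ['java']],
];
print_r(flattenValueMap($result));
```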

For more information on how to write Gremlin queries, you can check the TinkerPop graph traversal documentation.
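
The queries in this post hard-code the name josh. If the name comes from a variable instead, you have to be careful when pasting it into the query string. Below is a minimal sketch of one way to do it: it builds a single-quoted Groovy string literal and escapes backslashes, single quotes and line breaks. gremlinString() is a hypothetical helper and has not been vetted as a security measure. If your driver version lets you send bindings (parameters) along with the script, that is the safer route.

```php
<?php
// Hypothetical helper: wraps a PHP string in a single-quoted Groovy literal.
function gremlinString(string $value): string
{
    // strtr() with an array makes a single pass, so replacements are never re-escaped.
    $escaped = strtr($value, [
        '\\' => '\\\\', // backslash    -> two backslashes
        "'"  => "\\'",  // single quote -> backslash + quote
        "\n" => '\\n',  // newline      -> backslash + n
        "\r" => '\\r',  // carriage ret -> backslash + r
    ]);
    return "'" . $escaped . "'";
}

$name = 'josh';
$query = 'g.V().has(label, "person")'
       . '.has("name", ' . gremlinString($name) . ')'
       . '.out("created")'
       . '.valueMap()';

echo $query, "\n";
```

The resulting string can then be passed to $db->send($query) exactly as in the examples above.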

Conclusion

You’ll be up and running in five minutes, and a whole world of graph database manipulation will be at your fingertips.
There are many tips and tricks on, for example, how to install and use Neo4j and Cypher via TP3. I’ll write further posts on this specific topic.

Have fun guys.


Which Graph database for php

Update: the Tinkerpop3 stack has just been released and now uses Gremlin Server instead of Rexster. If you’re interested in using the Tinkerpop2 stack then by all means read on. If not, I suggest you read Get up and running with Tinkerpop 3 and PHP.

A while back I was confronted with the rewrite of a web application of ours. Why rewrite? Very simple: the code was ugly, it was poorly documented, hard to maintain, and on some of the most stressed platforms it was slow.

Slow can mean anything, but in this case it was quite obvious that the bottleneck was the database (MySQL). It wasn’t so much the DB as what was being done with it that led us to our ruin. In fact, some of you will scream when you hear that an Entity Attribute Value (EAV) table was the root of all problems. Nothing like an anti-pattern to kill your app!

Unfortunately, as ugly as an EAV patterned table may be, it was the only option to fit the functional requirements of the application… Or so thought the original writers.

In fact, if you ever find yourself in a situation where you need to use such a pattern for more than simply finding Attributes and Values for a given Entity (like say you wanted to find an Entity from a Value) then you can be certain that a relational database is not for you.

But don’t despair, there’s a whole world of schema-less NoSQL databases out there just ready to give you a hand. I could go on and on about your options, but today I will focus on graph databases such as Neo4j or OrientDB, and more precisely, their use within PHP.

Before I go any further, I will not explain what a graph database is in this blog. You can head over to neo4j.org and check Emil’s video or read about what a graph database is.

Which graph database to choose? OrientDb vs neo4j vs whatever…

There are more graph databases out there than you probably expect.
Unfortunately, the right choice will always depend on your application and your use case. There is no way of figuring this out other than laying out your limitations and use cases, then prototyping and benchmarking them. So instead of looking at things in this light, let me help you answer this in the simplest of ways:

Use all of them.

Chances are you don’t know all your limitations, or you don’t have the time to make prototypes for all your graph DB options. So you can’t make an educated decision. Besides, no one likes to be restricted.

It might sound a little unrealistic to expect to be able to use any (or almost) graph DB without limiting yourself to a sole choice. But thankfully the TinkerPop stack exists. And in the case of PHP, the Rexster project is where it’s at.

TinkerPop and Rexster

Without going into detail, Rexster is a graph DB server that allows you to plug and play graphs from many different Blueprints-enabled graph databases. In other words, you could load a Neo4j graph and/or an OrientDB graph into Rexster and access/modify them in the same way.
This means you can start developing your web app using Neo4j and have the simple option of switching to OrientDB later on if it suits your needs better. No need to decide on the database right away! Let your project grow and then decide.

Since this is aimed towards PHP users, I will be concentrating on getting you started with the rexpro-php library, but your favorite language is probably covered by other clients as well.

All scripts run against this server are to be written in Gremlin (the equivalent of SQL, if you will). More on this in the next section.

Getting started with rexpro-php

Before anything else, please make sure you have Rexster server 2.4.0 running. You can find installation instructions here or download the server directly here.

Get rexpro-php

git clone https://github.com/PommeVerte/rexpro-php.git

Once this is done and it is put into your project you can simply get started by using:

require_once 'path-to-rexpro-php/rexpro/Connection.php';
$db = new \rexpro\Connection;
//you can set $db->timeout = 0.5; if you wish
$db->open('localhost:8184','tinkergraph',null,null);
$db->script = 'g.v(2)';
$result = $db->runScript();
print_r($result);
$db->close();

This should output information on node/vertex #2 in the form of an array. You can find more about the scripting language used in the Gremlin wiki and at gremlindocs.

I’m going to leave things at that for now. I might post further clarifications in the future. Stay tuned. And please ask questions if anything wasn’t clear.