
I'm running two different graph database setups on separate Kubernetes clusters with the same configuration, and I'm seeing severe performance issues with JanusGraph compared to Neo4j. Here's the detailed setup:

Cluster 1 (AKS – Neo4j Setup):

  • Neo4j (Community Edition) running as a StatefulSet

Cluster 2 (AKS – JanusGraph Setup):

  • JanusGraph (latest version) running as a Deployment
  • Cassandra (2 pods) – storage backend
  • Elasticsearch (2 pods) – indexing backend

Problem:

I'm running two structurally similar queries — one on Neo4j, one on JanusGraph — intended to retrieve a connected subgraph with filters and traversal up to N levels.


Example Queries:

  1. Query A (1 hop): Filter on a label and property, then fetch the directly connected neighbours (this backs a feature that fetches data for rendering in the UI); a simplified sketch of the traversal follows below
  • Neo4j time: ~4.84 s
  • JanusGraph time: ~2.30 min
  • ~125 nodes returned

There are more examples, but you get the idea — as depth and node count increase, Neo4j scales better, while JanusGraph performance degrades sharply.
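
To make Query A concrete, on the JanusGraph side it is roughly of the shape sketched here (the label and property names are simplified placeholders, not my exact values; g is the traversal source obtained from the connection code further down, and the Cypher comment shows the equivalent shape on the Neo4j side):

# Rough shape of Query A in Gremlin (placeholder label/property names):
# filter on a vertex label + property, then pull the 1-hop neighbourhood.
nodes = (
    g.V()
    .hasLabel('Device')              # placeholder label
    .has('status', 'active')         # placeholder property filter
    .both()                          # directly connected neighbours (1 hop)
    .dedup()
    .valueMap(True)                  # properties plus id/label for the UI
    .toList()
)

# Equivalent shape in Cypher on the Neo4j side:
# MATCH (n:Device {status: 'active'})--(m) RETURN n, m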


Code: JanusGraph-Django Integration (Gremlin Connection Pooling)

import logging

from django.conf import settings
from django.views import View
from gremlin_python.driver import serializer
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

logger = logging.getLogger(__name__)


class BaseGremlinClass(View):
    # Class-level caches shared by all views, so each keyspace gets one
    # long-lived DriverRemoteConnection and traversal source per process.
    _connection_pool = {}
    _traversal_pool = {}

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Per-instance references into the shared pools.
        self.connections = {}
        self.traversals = {}

    def get_traversal(self, keyspace_name):
        if keyspace_name not in settings.JANUSGRAPH_KEYSPACES:
            raise ValueError(f"Keyspace {keyspace_name} not found in settings")

        # Already resolved on this instance.
        if keyspace_name in self.traversals:
            return self.traversals[keyspace_name]

        # Reuse a connection created earlier by another instance.
        if keyspace_name in self.__class__._traversal_pool:
            self.connections[keyspace_name] = self.__class__._connection_pool[keyspace_name]
            self.traversals[keyspace_name] = self.__class__._traversal_pool[keyspace_name]
            return self.traversals[keyspace_name]

        # Lazily create the connection on first use of this keyspace.
        try:
            config = settings.JANUSGRAPH_KEYSPACES[keyspace_name]
            connection = DriverRemoteConnection(
                config['url'],
                config['graph'],
                message_serializer=serializer.GraphSONSerializersV3d0(),
                timeout=30,
            )
            g = traversal().withRemote(connection)

            self.connections[keyspace_name] = connection
            self.traversals[keyspace_name] = g
            self.__class__._connection_pool[keyspace_name] = connection
            self.__class__._traversal_pool[keyspace_name] = g

            logger.info(f"Created new connection for keyspace {keyspace_name}")
            return g

        except Exception as e:
            logger.error(f"Error creating connection to {keyspace_name}: {e}")
            raise

    def close_connections(self, keyspace_name=None):
        # Drops only this instance's references; the pooled connections stay
        # open so other requests can keep reusing them.
        if keyspace_name and keyspace_name in self.connections:
            del self.connections[keyspace_name]
            del self.traversals[keyspace_name]
        else:
            self.connections.clear()
            self.traversals.clear()

    @classmethod
    def close_all_connections(cls):
        # Closes every pooled connection, e.g. on application shutdown.
        for keyspace, connection in cls._connection_pool.items():
            try:
                connection.close()
                logger.info(f"Closed pooled connection for keyspace {keyspace}")
            except Exception as e:
                logger.error(f"Error closing connection: {e}")
        cls._connection_pool.clear()
        cls._traversal_pool.clear()
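
For context, a view uses it roughly like this simplified sketch (the view name, keyspace key, and query are placeholders, not my real view; it assumes it lives in the same module as BaseGremlinClass):

from django.http import JsonResponse


class SubgraphView(BaseGremlinClass):
    # Hypothetical view showing how the pooled traversal source is obtained and used.

    def get(self, request):
        g = self.get_traversal('default')   # placeholder keyspace key from settings
        count = g.V().hasLabel('Device').count().next()
        return JsonResponse({'vertex_count': count})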

BaseGremlinClass handles lazy connection initialization and connection pooling between Django and JanusGraph over Gremlin, so I don't think the connection setup itself is causing the delays, but let me know if this implementation can be improved too.
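
To separate connection overhead from the traversal cost itself, I can also wrap the same query in Gremlin's profile() step and share the per-step metrics if that helps (again with placeholder filter values):

# profile() replaces the result set with server-side TraversalMetrics, whose
# per-step timings show whether the initial has() filter hits an index or scans.
metrics = (
    g.V()
    .hasLabel('Device')
    .has('status', 'active')
    .both()
    .dedup()
    .profile()
    .next()
)
print(metrics)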

If anyone has faced a similar performance gap or has experience optimizing JanusGraph in a containerized setup with Cassandra and Elasticsearch, your insights would be greatly appreciated! Even small tips or configuration flags that helped in your case would be valuable.

Thanks in advance for your help!

Comments:
  • Indeed, something must be wrong with the JanusGraph setup. 10 ms per vertex would be reasonable for a Java-based datastore with fast disk access, which would give retrieval times similar to the Neo4j setup. For older performance measurements, see yaaics.blogspot.com/2018/04/… (Commented May 18 at 20:15)
  • @HadoopMarc, thank you for your response. Regarding the performance issue, we are attempting to create the indexes; however, we are running into difficulties there as well: stackoverflow.com/q/79632303/6385767 (Commented May 22 at 12:41)
