Building a Scalable Graph RAG System for Enterprise Insights
I led the development of a Graph-based Retrieval-Augmented Generation (RAG) system integrating Neo4j, LangChain, and Azure OpenAI. Designed for enterprise-grade inventory and financial data analytics, the system enables natural language querying over knowledge graphs with structured Cypher output and dynamic result interpretation.
The problem
In 2025, we started the GraphRAG project to build a Retrieval-Augmented Generation system over a Neo4j knowledge graph, giving domain experts and business users natural language access to structured enterprise data. The legacy systems were rigid and siloed, with limited ability to connect insights across group companies, inventory hierarchies, and financial years. Our goals were to unify this fragmented data into a queryable knowledge graph, lower the barrier to data exploration using LLMs, and create a system that is intuitive for non-technical users while remaining scalable and robust for complex analytics.
The aero design system
To streamline development across data engineers, AI researchers, and product teams, we built a modular architecture combining graph databases, LLM pipelines, and prompt-driven orchestration. A flexible schema design and prompt framework laid the foundation for consistent query generation, natural language analysis, and scalable knowledge integration. This architecture informed not only the backend reasoning system but also the user experience across the application and supporting tools.
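As a rough illustration of how a schema description and a prompt framework can fit together, here is a minimal sketch in plain Python. It is not the project's actual code: the `GraphSchema` class, the prompt wording, and the `Company`, `InventoryItem`, and `FinancialYear` labels with their properties and relationship types are all assumptions made up for the example, loosely based on the entities mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphSchema:
    """Hypothetical, simplified description of the knowledge graph."""
    node_labels: dict[str, list[str]]          # label -> property names
    relationships: list[tuple[str, str, str]]  # (start label, type, end label)

    def describe(self) -> str:
        """Render the schema as text that can be embedded in a prompt."""
        lines = ["Node labels:"]
        for label, props in self.node_labels.items():
            lines.append(f"  (:{label}) properties: {', '.join(props)}")
        lines.append("Relationships:")
        for start, rel_type, end in self.relationships:
            lines.append(f"  (:{start})-[:{rel_type}]->(:{end})")
        return "\n".join(lines)


# Assumed prompt wording; a real template would be tuned per use case.
PROMPT_TEMPLATE = """You translate business questions into Cypher.
Use only the labels, properties, and relationships listed in the schema.
Return a single read-only Cypher query and nothing else.

Schema:
{schema}

Question: {question}
Cypher:"""


def build_cypher_prompt(schema: GraphSchema, question: str) -> str:
    """Combine the schema description and the user question into one prompt."""
    return PROMPT_TEMPLATE.format(schema=schema.describe(), question=question.strip())


# Illustrative schema only; the labels and properties are invented.
EXAMPLE_SCHEMA = GraphSchema(
    node_labels={
        "Company": ["name"],
        "InventoryItem": ["sku", "category"],
        "FinancialYear": ["year"],
    },
    relationships=[
        ("Company", "HOLDS", "InventoryItem"),
        ("Company", "REPORTED_IN", "FinancialYear"),
    ],
)

if __name__ == "__main__":
    print(build_cypher_prompt(EXAMPLE_SCHEMA, "Which companies hold inventory?"))
```

Keeping the schema in one structure and rendering it into every prompt is one way to get consistent query generation: when the graph changes, only the schema object needs updating.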
Design system docs
A system is only effective if teams know how to use it, so we built thorough documentation to guide contributors. It covers graph schema structure, prompt engineering principles, Cypher query patterns, LLM integration, and deployment guidelines—ensuring both developers and analysts can confidently work within the GraphRAG ecosystem.
Motion design
Interactivity and clarity were core principles in designing the query and result flow. Visual transitions and feedback help users understand how natural language inputs translate into Cypher queries and graph traversals—making the entire reasoning process feel intuitive and traceable.
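One way to make that flow traceable is to record each stage as the question moves through the pipeline. The sketch below is an assumption-laden example, not the system's implementation: it supposes the model replies with JSON of the form `{"cypher": ..., "params": {...}}`, and the `QueryTrace` class, `parse_generated_query` function, and keyword list are hypothetical names introduced here.

```python
import json
import re
from dataclasses import dataclass, field

# Clauses that modify the graph; a generated query containing any is rejected.
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|LOAD\s+CSV)\b", re.IGNORECASE
)


@dataclass
class QueryTrace:
    """Ordered record of what happened to one question (hypothetical helper)."""
    question: str
    steps: list[tuple[str, str]] = field(default_factory=list)

    def record(self, stage: str, detail: str) -> None:
        self.steps.append((stage, detail))


def parse_generated_query(raw_response: str, trace: QueryTrace) -> tuple[str, dict]:
    """Parse a JSON reply of the assumed form {"cypher": ..., "params": {...}}."""
    trace.record("llm_response", raw_response)
    payload = json.loads(raw_response)
    cypher = payload["cypher"].strip()
    params = payload.get("params", {})
    if WRITE_CLAUSES.search(cypher):
        trace.record("rejected", "query contains a write clause")
        raise ValueError("only read-only Cypher is allowed")
    trace.record("cypher", cypher)
    return cypher, params


if __name__ == "__main__":
    trace = QueryTrace("Which companies hold item X?")
    reply = (
        '{"cypher": "MATCH (c:Company)-[:HOLDS]->(i:InventoryItem {sku: $sku}) '
        'RETURN c.name", "params": {"sku": "X"}}'
    )
    cypher, params = parse_generated_query(reply, trace)
    for stage, detail in trace.steps:
        print(f"{stage}: {detail}")
```

The keyword check is deliberately coarse: it can reject a harmless query whose string literal contains a word like SET, and it does not catch writes made through procedure calls. It is a convenience filter for the UI, not a security boundary. Read-only database credentials would be the safer guard.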
Encouraging adaptivity
A major part of solving for collaboration was being able to visualize the learner experience in the editor. This was especially beneficial for subject matter experts and instructors who need to review and give feedback on the higher-level structure without having to dig through all of the adaptivity scenarios screen by screen.
An extensible plugin ecosystem usable by everyone
The most powerful aspect of the platform is the ability to create custom plugins for any content, whether it be a degree, course, lesson, screen, or interactive component. Out of the box these can be made configurable with minimal effort from developers. Learning designers can then edit everything using a common configuration interface.
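To make the idea of a common configuration interface concrete, here is a small sketch of a plugin that declares its settings so a generic editor can validate changes. It is purely illustrative and not the platform's API: the `Plugin` and `ConfigField` classes, the `screen-3d` name, and the `model_url` and `loop_animation` settings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ConfigField:
    """One declared setting: its expected Python type and default value."""
    kind: type
    default: Any


@dataclass(frozen=True)
class Plugin:
    """Hypothetical plugin that exposes its settings as declared fields."""
    name: str
    content_level: str  # e.g. "course", "lesson", "screen"
    fields: dict[str, ConfigField]

    def configure(self, overrides: dict[str, Any]) -> dict[str, Any]:
        """Return defaults merged with validated overrides."""
        config = {key: f.default for key, f in self.fields.items()}
        for key, value in overrides.items():
            if key not in self.fields:
                raise KeyError(f"{self.name} has no setting named {key!r}")
            expected = self.fields[key].kind
            if not isinstance(value, expected):
                raise TypeError(f"{key!r} must be {expected.__name__}")
            config[key] = value
        return config


# Invented example plugin; the setting names are assumptions.
screen_3d = Plugin(
    name="screen-3d",
    content_level="screen",
    fields={
        "model_url": ConfigField(str, ""),
        "loop_animation": ConfigField(bool, False),
    },
)

if __name__ == "__main__":
    print(screen_3d.configure({"loop_animation": True}))
```

Because every plugin declares its fields the same way, a single editing interface can render and validate settings for any of them without plugin-specific code.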
Next-generation learning experiences
The flexibility of the product allowed developers to build engaging interactive experiences as highly configurable plugins that learning designers could then use and adjust.
Bringing 3D into learning
One really cool example is the 3D screen plugin. Learning designers can load any model into it and then configure a camera position for each section, which the view animates to.
Interactivity
Learners can then be directed to specific parts of the model and shown labels. They’re also able to click and drag to orbit around and freely explore at any time.
Animation
Learning designers can pick an animation included in the model to play or loop for any section without having to use any complex animation tools.
Project outcomes
Ultimately, the project was a success: Smart Sparrow and the aero platform were acquired by Pearson in 2020 to become a foundation for their next-generation learning platform.




