KAG vs RAG

RAG, for those who don't know, stands for Retrieval-Augmented Generation: the user gives documents to an LLM, and it answers the user's questions based on the content of those documents.


RAG is all the rage right now, but there is something that may be even more interesting: KAG, or Knowledge Augmented Generation.

If we simplify RAG, the process is:

  • Extracting content from files
  • Splitting the content into chunks and encoding them with an embedding model (not the same thing as an LLM)
  • Encoding the question asked to the RAG with the same embedding model
  • Comparing the question with the embedded chunks and retrieving the most relevant chunks (based on distance)
  • Finally, feeding those chunks to an LLM and asking it to answer the question
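
To make these steps concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: `embed` is a toy bag-of-words counter standing in for a real embedding model, the function names are hypothetical, and the last step only builds the prompt instead of calling an actual LLM.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=8):
    """Split text into chunks of at most chunk_size words (an arbitrary split)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """ASSUMPTION: toy stand-in for an embedding model (word counts, not a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (1.0 = same direction)."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, top_k=2):
    """Encode the question the same way as the chunks and keep the closest ones."""
    question_vector = embed(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(question_vector, embed(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, relevant_chunks):
    """Final step: the retrieved chunks become the context handed to the LLM."""
    context = "\n".join(f"- {chunk}" for chunk in relevant_chunks)
    return f"Answer the question using only this context:\n{context}\n\nQuestion: {question}"
```

Calling `retrieve("when is the invoice due", chunks, top_k=1)` on a couple of chunks and passing the result to `build_prompt` gives the text that would be sent to the LLM. A real setup would swap `embed` for an actual embedding model and usually store the vectors in a vector database.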

RAG filters the information in the documents to retrieve only the most relevant parts. Or does it?

The problem is that embedding is a kind of black box, so there is no direct feedback on embedding quality. The result depends on the chunks provided (which are split arbitrarily), and it just compares document chunks with the question based on distance (meaning: do they look alike, or are they far from each other?).
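
A tiny sketch of the arbitrary-split issue, with a made-up sentence and a hypothetical `split_fixed` helper (real splitters are smarter, but the risk is the same):

```python
def split_fixed(text, words_per_chunk):
    """Cut the text every words_per_chunk words, ignoring meaning."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


# Made-up example text, for illustration only.
text = "Alice manages the Paris office. She reports to Bob."
chunks = split_fixed(text, 5)

# The second chunk says "She" but no longer says who "She" is.
second_chunk_mentions_alice = "Alice" in chunks[1]
```

Here the second chunk is "She reports to Bob.", so a question like "Who does Alice report to?" has to match a chunk that never mentions Alice.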

A lot of research is happening in this field right now to improve results. One possible solution is the idea of KAG: Knowledge Augmented Generation.

In KAG, we don’t simply compare things based on whether they look the same. Instead, the embedding phase is replaced by a phase where an LLM creates a graph representation of the data: it creates relations between parts of the provided documents. It can thus keep better context, which means it could provide better results.
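
To give a rough feel for the graph idea, here is a toy sketch in Python. Everything in it is an assumption for illustration: the triples are written by hand (in KAG an LLM would extract them from the documents), the names are hypothetical, and this is not how OpenSPG or any specific KAG engine works internally.

```python
from collections import defaultdict, deque

# ASSUMPTION: hand-written (subject, relation, object) triples standing in
# for what an LLM would extract from the documents.
TRIPLES = [
    ("Alice", "manages", "Paris office"),
    ("Alice", "reports_to", "Bob"),
    ("Bob", "leads", "Sales division"),
]


def build_graph(triples):
    """Index the triples by subject so we can follow relations from a node."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def collect_facts(graph, start, max_hops=2):
    """Breadth-first walk from start, returning every fact within max_hops."""
    facts = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand nodes that are already max_hops away
        for relation, neighbor in graph.get(node, []):
            facts.append((node, relation, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return facts


graph = build_graph(TRIPLES)
facts_about_alice = collect_facts(graph, "Alice", max_hops=2)
```

With two hops, the facts collected for "Alice" also include what is known about Bob, even though the two pieces of information could come from distant parts of the documents. Those facts would then be given to the LLM as context, in place of the closest-looking chunks.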

This is experimental stuff right now, but there are articles about a few initiatives here and there:

https://pub.towardsai.net/kag-graph-multimodal-rag-llm-agents-powerful-ai-reasoning-b3da38d31358

https://towardsdatascience.com/enterprise-ready-knowledge-graphs-96028d863e8c

Notably the OpenSPG engine, developed by Ant Group:

GitHub - OpenSPG/openspg: OpenSPG is a Knowledge Graph Engine developed by Ant Group in collaboration with OpenKG, based on the SPG (Semantic-enhanced Programmable Graph) framework. Core Capabilities: 1) domain model constrained knowledge modeling, 2) facts and logic fused representation, 3) natively support KAG...

It seems like an interesting route for processing internal documents and extracting information from them. What an exciting time to work on AI!