KAG vs RAG

RAG - for those who don't know - stands for Retrieval Augmented Generation: the user gives documents to an LLM, and it answers the user's questions based on the content of those documents.
RAG is all the rage right now, but there is something that may be more interesting: KAG, or Knowledge Augmented Generation.
If we simplify RAG, the process is:
- Extracting content from files
- Splitting the content into chunks and encoding them with an embedding model (not the same thing as an LLM)
- Encoding the question asked to the RAG with the same embedding model
- Comparing the embedded question with the embedded chunks and retrieving the most relevant chunks (based on distance)
- Finally, feeding those chunks to an LLM and asking it to answer the question
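To make these steps concrete, here is a minimal sketch in Python. It is only an illustration: the `embed` function is a toy bag-of-words counter standing in for a real embedding model, the function names are hypothetical, and the last step only builds the prompt instead of calling an actual LLM.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, size=12):
    """Split text into chunks of `size` words (an arbitrary cut, like naive splitters)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (1.0 = same direction)."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2):
    """Return the top_k chunks closest to the question."""
    question_vector = embed(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(question_vector, embed(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, chunks):
    """Assemble the text that would be sent to an LLM (the LLM call itself is omitted)."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [
    "The invoice is due on March 3.",
    "Our office is located in Lyon.",
    "The cafeteria serves lunch at noon.",
]
best = retrieve("When is the invoice due?", chunks, top_k=1)
print(build_prompt("When is the invoice due?", best))
```

Notice that the retrieval step only measures how much the question and a chunk overlap. A real embedding model captures meaning far better than word counts, but the principle is the same: closeness, not understanding.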
RAG filters the information in the documents to retrieve only the most relevant parts. Or does it?
The problem is that embedding is a kind of black box: there is no readable output that tells you how good an embedding is. The result depends on the chunks provided (often split arbitrarily), and retrieval just compares document chunks with the question based on distance (which means: do they look alike, or are they far from each other?).
A lot of research is happening in this field right now to improve results. One possible solution is the idea behind KAG: Knowledge Augmented Generation.
In KAG, we don't simply compare things based on whether they look alike. Instead, the embedding phase is replaced by a phase where an LLM creates a graph representation of the data: it creates relations between parts of the provided documents. This lets it keep more context, which could lead to better results.
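As a rough illustration of the graph idea, here is a small Python sketch. Everything in it is an assumption made for the example: the triples are hard-coded where a KAG pipeline would have an LLM extract them from the documents, and the entity and relation names are invented. It does not reflect how any particular KAG engine is implemented.

```python
from collections import defaultdict, deque

# Hypothetical (subject, relation, object) triples.
# In a KAG pipeline, an LLM would extract these from the documents.
TRIPLES = [
    ("Alice", "works_for", "Acme"),
    ("Acme", "located_in", "Lyon"),
    ("Bob", "works_for", "Globex"),
]


def build_graph(triples):
    """Index triples by subject so each entity points to its outgoing relations."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def related_facts(graph, start, max_hops=2):
    """Collect facts reachable from `start` by following at most `max_hops` relations."""
    facts = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for relation, target in graph.get(node, []):
            facts.append((node, relation, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return facts


graph = build_graph(TRIPLES)
for fact in related_facts(graph, "Alice"):
    print(fact)
```

A question such as "In which city does Alice work?" needs two facts that may sit in different chunks. Following relations in the graph connects them explicitly, whereas distance-based retrieval may only surface the chunk that mentions Alice.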
This is experimental stuff right now - but there are some articles about initiatives here and there:
https://pub.towardsai.net/kag-graph-multimodal-rag-llm-agents-powerful-ai-reasoning-b3da38d31358
https://towardsdatascience.com/enterprise-ready-knowledge-graphs-96028d863e8c
One notable initiative is the OpenSPG engine, developed by Ant Group.
It seems like an interesting route for processing internal documents and extracting information from them - what an exciting time to work on AI!