This tree search framework hits 98.7% on documents where vector search fails

via github.com


A new open-source framework called PageIndex takes on a long-standing problem in retrieval-augmented generation (RAG): handling very long documents. The classic RAG workflow (chunk the documents, compute embeddings, store them in a vector database, and retrieve the top matches by semantic similarity) works well for basic tasks such as Q&A over small documents. PageIndex […]
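To make the classic RAG workflow described above concrete, here is a minimal sketch of its four steps in plain Python. It is not PageIndex code and does not reflect any particular vector database. The bag-of-words `embed` function is a toy stand-in for a learned embedding model, and every name here (`chunk_text`, `embed`, `cosine`, `InMemoryVectorStore`) is hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Step 1: split text into overlapping chunks of up to chunk_size words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Step 2: toy embedding, a sparse word-count vector.

    Assumption: a real pipeline would call a learned embedding model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Step 3: hypothetical stand-in for a vector database."""

    def __init__(self):
        self.entries = []  # list of (chunk, vector) pairs

    def add(self, chunk):
        self.entries.append((chunk, embed(chunk)))

    def search(self, query, top_k=2):
        """Step 4: return the top_k (score, chunk) pairs by similarity."""
        query_vec = embed(query)
        scored = [(cosine(query_vec, vec), chunk) for chunk, vec in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


if __name__ == "__main__":
    document = (
        "Quarterly revenue grew by ten percent. "
        "The board approved a new share buyback program. "
        "Operating costs were flat compared with the prior year."
    )
    store = InMemoryVectorStore()
    for chunk in chunk_text(document, chunk_size=8, overlap=2):
        store.add(chunk)
    for score, chunk in store.search("How did revenue change?", top_k=2):
        print(f"{score:.2f}  {chunk}")
```

Note that each chunk is scored on its own, by similarity to the query alone. The sketch keeps no record of where a chunk sits in the document, so anything that depends on the document's overall structure is invisible to the retriever.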
