A new open-source framework called PageIndex addresses one of the long-standing problems of retrieval-augmented generation (RAG): handling very long documents. The classic RAG workflow (chunk documents, compute embeddings, store them in a vector database, and retrieve the top matches based on semantic…