adtotal.blogg.se

Unlocking Data With Generative AI and RAG PDF

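    from langchain.chat_models import ChatOpenAI

    # temperature=0 keeps the answer as deterministic as possible
    llm = ChatOpenAI(temperature=0)

    user_query = "What was the net profit in 2024?"

    # compressed_retriever and prompt_template are assumed to be defined earlier
    relevant_chunks = compressed_retriever.get_relevant_documents(user_query)

    response = llm.predict(prompt_template.format(
        chunks=relevant_chunks,
        query=user_query,
    ))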
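The retrieved chunks are document objects, so formatting them straight into a prompt may embed their object representation rather than their text. A minimal sketch of assembling the prompt from the chunk text follows. The `Doc` class, the `PROMPT` wording, and the `build_prompt` helper are hypothetical stand-ins for illustration (the original `prompt_template` is not shown), and the chunk strings are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document with a page_content field."""
    page_content: str


# Hypothetical template; the wording of the real prompt_template is not known.
PROMPT = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{chunks}\n\n"
    "Question: {query}"
)


def build_prompt(docs, query):
    # Join the text of each chunk so the prompt holds plain text, not object reprs.
    context = "\n---\n".join(d.page_content for d in docs)
    return PROMPT.format(chunks=context, query=query)


# Placeholder chunk text, for illustration only.
docs = [Doc("first chunk text"), Doc("second chunk text")]
prompt = build_prompt(docs, "What was the net profit in 2024?")
print(prompt)
```

Joining on a visible separator such as `---` makes chunk boundaries clear to the model; any separator would work.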

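    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import Chroma

    # Embed each chunk and store the vectors in a Chroma collection
    vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())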
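To make the idea behind a vector store concrete, here is a toy, dependency-free sketch of similarity search over embedded chunks. It is not how Chroma is implemented: the three-dimensional vectors, the chunk names, and the `cosine` and `top_k` helpers are made up for illustration, and cosine similarity is only one possible metric (the metric a real vector store uses is configurable and may differ).

```python
import math


def cosine(a, b):
    # Cosine similarity: dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    # Rank every stored vector by similarity to the query and keep the best k names.
    ranked = sorted(
        index.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Toy "embeddings"; real ones have hundreds or thousands of dimensions.
index = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.0, 1.0, 0.0],
    "chunk_c": [0.9, 0.1, 0.0],
}

result = top_k([1.0, 0.0, 0.0], index, k=2)
print(result)  # ['chunk_a', 'chunk_c']
```

A production store replaces the linear scan with an approximate nearest-neighbor index, but the ranking idea is the same.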

The next frontier is multimodal models that don't just read the text but also "see" the charts, diagrams, and formatting within the PDF, which should allow even deeper insights.

| Model | Dimensions | Best for |
|-------|------------|----------|
| text-embedding-3-small (OpenAI) | 1536 | General, cost-effective |
| all-MiniLM-L6-v2 (sentence-transformers) | 384 | Local, fast, lower accuracy |
| BAAI/bge-large-en-v1.5 | 1024 | High retrieval quality |
| voyage-2 | 1024 | Long documents, legal/financial PDFs |
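One practical consequence of the embedding dimension is storage cost. The sketch below estimates the raw vector size for each model, assuming 32-bit floats and ignoring index overhead and metadata. The figure of 100,000 chunks is an arbitrary example, and `index_bytes` is a hypothetical helper.

```python
# Dimensions taken from the table above.
DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "all-MiniLM-L6-v2": 384,
    "BAAI/bge-large-en-v1.5": 1024,
    "voyage-2": 1024,
}

BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as 32-bit floats


def index_bytes(num_chunks, dim):
    # Raw vector storage only: one float32 per dimension per chunk.
    return num_chunks * dim * BYTES_PER_FLOAT32


NUM_CHUNKS = 100_000  # arbitrary example corpus size
sizes = {model: index_bytes(NUM_CHUNKS, dim) for model, dim in DIMENSIONS.items()}

for model, size in sizes.items():
    print(f"{model}: {size / 1_000_000:.1f} MB")
```

Under these assumptions, the 384-dimensional model needs a quarter of the raw storage of the 1536-dimensional one, which is part of why smaller models suit local deployments.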