RAG with an Offline LLM
During my time as Head of Professional Services at a private genomics company, I spearheaded an initiative to use an open-source, offline LLM to analyse our Jira, Confluence and Salesforce documents for quick retrieval of process and technical information.
The purpose was twofold:
a) To create a common ‘search’ tool that would locate a source of truth on what was configured and where, and
b) To generate process documents quickly, in whole or in part, through generative AI.
As these were sensitive company documents, use of a hosted service such as ChatGPT was restricted, so we opted to trial PrivateGPT with Llama-2-7b, amongst other models.
We found RAG to be less than perfect, but it did reliably surface the right documents. Text generation fared better: we produced good initial templates for process guides, and fed in a full process document to generate a RASCI chart. We ran the pilot service on AWS EC2 infrastructure.
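The retrieval half of the pipeline can be illustrated with a deliberately simple sketch. This is not the embedding-based retrieval PrivateGPT actually performs; it stands in a bag-of-words cosine similarity over toy document chunks (the file names and contents below are invented for illustration) to show how a query is matched against exported Jira/Confluence/Salesforce text before the top chunks are handed to the LLM as context.

```python
from collections import Counter
import math


def vectorize(text: str) -> Counter:
    """Sparse term-count vector for a chunk of text."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of the k chunks most similar to the query."""
    qv = vectorize(query)
    ranked = sorted(docs, key=lambda name: cosine(qv, vectorize(docs[name])),
                    reverse=True)
    return ranked[:k]


# Toy corpus standing in for exported document chunks (hypothetical names).
docs = {
    "sf-config.md": "salesforce org configuration custom fields and flows",
    "onboarding.md": "customer onboarding process steps and owners",
    "jira-workflow.md": "jira workflow states transitions and permissions",
}

print(retrieve("where is the salesforce configuration", docs, k=1))
```

In the real pilot this ranking step was replaced by PrivateGPT's vector-store lookup, but the shape is the same: score every chunk against the query, keep the top few, and prepend them to the prompt.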