More value from Generative AI through a secure and scalable architecture
When talking to organisations, we notice that the value of Large Language Models (LLMs) is well understood, but key pieces are still missing to bring that value to the whole organisation in a responsible manner. Scalability, security, and privacy are concerns that currently available tooling does not fully address. In this article, we look at tools such as ChatGPT Enterprise and Microsoft Copilot, and give you an alternative (that you can build yourself) which can unlock the value of LLMs for your organisation.
ChatGPT & Copilot
We identify two major tools when it comes to applied Generative AI for organisations: Microsoft Copilot and ChatGPT Enterprise.
While these tools let you chat with an LLM such as GPT-4 and create value with it, they still lack key features that unlock value for your organisation, such as:
Integration with live data sources, such as HubSpot, Google Drive, Notion, and OneDrive. At the time of writing, Copilot still does not integrate with OneDrive, even though this was announced almost half a year ago. External sources like HubSpot or Notion might never see the light of day in Copilot.
Keeping the data that is shared with the model private. In the case of OpenAI, you need to upload your data to their servers, whereas you want to keep it within your own (managed) network. If you are serious about keeping your data secure, this is not an option.
Cost does not always scale with value. Licensing for ChatGPT Enterprise and Copilot runs at about €30 per person per month. This incurs heavy costs for an organisation that takes Generative AI seriously and makes it available to everyone. In reality, some people will use it a lot and others hardly at all. Here, a shared infrastructure that scales with usage makes more sense, as the back-of-envelope calculation below illustrates.
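As a rough illustration, here is a minimal back-of-envelope comparison. All figures (headcount, seat price, usage, and token price) are assumptions for the sake of the example, not quoted rates:

```python
# Back-of-envelope cost comparison: per-seat licensing vs shared,
# usage-based infrastructure. All figures are illustrative assumptions.

EMPLOYEES = 500
SEAT_PRICE_EUR = 30      # assumed per-person, per-month licence fee
ACTIVE_SHARE = 0.2       # assume only 20% of employees use the tool heavily

per_seat_monthly = EMPLOYEES * SEAT_PRICE_EUR

# Usage-based: assume heavy users send ~200 queries/month, each consuming
# ~2,000 tokens, at an assumed blended price of €0.01 per 1,000 tokens,
# plus a fixed amount for shared infrastructure (vector store, pipelines).
QUERIES_PER_USER = 200
TOKENS_PER_QUERY = 2_000
PRICE_PER_1K_TOKENS_EUR = 0.01
INFRA_FIXED_EUR = 1_000

heavy_users = EMPLOYEES * ACTIVE_SHARE
token_cost = (heavy_users * QUERIES_PER_USER * TOKENS_PER_QUERY
              / 1_000 * PRICE_PER_1K_TOKENS_EUR)
usage_monthly = token_cost + INFRA_FIXED_EUR

print(f"Per-seat licensing: €{per_seat_monthly:,.0f}/month")  # €15,000
print(f"Shared, usage-based: €{usage_monthly:,.0f}/month")    # €1,400
```

Under these assumptions the shared set-up is an order of magnitude cheaper, and the gap only grows if usage stays concentrated among a minority of employees.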
Building your own RAG
With these limitations in mind, we designed our own solution. The goal was a lightweight set-up that enables organisations to create value through LLMs with their own data. To do this, we combined multiple open-source components into a Retrieval-Augmented Generation (RAG) architecture that can run in your own cloud. The sketch below shows the retrieval core of such a pipeline.
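To make the idea concrete, here is a minimal sketch of the retrieval step, assuming the open-source sentence-transformers library for embeddings and a plain in-memory index. A production set-up would swap this for a proper vector database and the LLM of your choice; the documents and model name are illustrative:

```python
# Minimal RAG retrieval sketch: embed documents, find the most relevant
# ones for a question, and assemble a grounded prompt for the LLM.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "Our holiday policy grants 25 days of paid leave per year.",
    "Expense reports must be submitted within 30 days.",
    "The VPN endpoint for remote work is vpn.example.internal.",
]

# Embed once; in production these vectors live in a vector database
# inside your own network, shared across the organisation.
doc_vectors = model.encode(documents, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q = model.encode([question], normalize_embeddings=True)[0]
    scores = doc_vectors @ q  # cosine similarity (vectors are normalized)
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

question = "How many days of paid leave do I get?"
context = "\n".join(retrieve(question))

# The grounded prompt is then sent to the LLM of your choice,
# either a hosted API or a model running inside your own network.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```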
Added Value for your Organisation
Relevancy and accuracy of information. By making sure the LLM uses your organisation's own information, the answers it provides are grounded in your organisation's "truth". This reduces the number of so-called hallucinations and increases the relevancy of the answers for your use-case. We keep the data up to date by running data pipelines that check its freshness; a minimal sketch of such a pipeline follows this list.
Citing sources and references. By integrating different data sources and their metadata, this set-up can provide citations and links to the original piece of information, such as a PowerPoint deck or a Notion block (see the citation sketch after this list). This reduces the impact of so-called hallucinations and allows you to verify the source provided by the LLM.
A secure and private set-up. Embeddings and your data are stored within your network, and your prompts, completions, and embeddings are never used to re-train public models.
Ability to host your own model, within your network. This makes the set-up even more secure and private, although also more expensive and complex, depending on your use-case; the last sketch below shows the idea.
Cost efficiency. RAG architectures let you combine your organisation's information with the knowledge contained in the LLM without expensive re-training, and once computed, the embeddings are easily shared across the organisation.
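As a sketch of the freshness check mentioned above: a minimal pipeline that re-embeds only documents whose source changed after they were last indexed. The `fetch_documents` and `reindex` helpers are hypothetical stand-ins for your own connectors (HubSpot, Notion, Google Drive) and vector store:

```python
# Freshness pipeline sketch: re-embed only documents whose source changed
# since they were last indexed. `fetch_documents` and `reindex` are
# hypothetical stand-ins for your own connectors and vector store.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Document:
    doc_id: str
    last_modified: datetime  # timestamp reported by the source system
    indexed_at: datetime     # when we last embedded this document

def fetch_documents() -> list[Document]:
    """Placeholder: pull document metadata from your live sources."""
    return [
        Document("notion/policy", datetime(2024, 5, 1, tzinfo=timezone.utc),
                 datetime(2024, 4, 1, tzinfo=timezone.utc)),
        Document("drive/handbook", datetime(2024, 3, 1, tzinfo=timezone.utc),
                 datetime(2024, 4, 1, tzinfo=timezone.utc)),
    ]

def reindex(doc: Document) -> None:
    """Placeholder: re-chunk, re-embed, and upsert into the vector store."""
    print(f"re-embedding stale document: {doc.doc_id}")

for doc in fetch_documents():
    if doc.last_modified > doc.indexed_at:  # source changed after last index
        reindex(doc)
```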
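Citations fall out of the same retrieval step: each chunk carries metadata about where it came from, which is returned alongside the answer. The chunk structure, sources, and URLs here are illustrative:

```python
# Citation sketch: store source metadata next to each chunk so every
# answer can link back to the original document. Data is illustrative.
chunks = [
    {"text": "Holiday policy grants 25 days of paid leave.",
     "source": "HR Handbook.pptx, slide 12",
     "url": "https://drive.example.com/hr-handbook#slide=12"},
    {"text": "Expense reports are due within 30 days.",
     "source": "Finance wiki, 'Expenses' block",
     "url": "https://notion.example.com/finance/expenses"},
]

def answer_with_citations(question: str, retrieved: list[dict]) -> str:
    """Build a prompt that asks the LLM to cite the numbered sources."""
    context = "\n".join(f"[{i+1}] {c['text']}" for i, c in enumerate(retrieved))
    citations = "\n".join(f"[{i+1}] {c['source']} ({c['url']})"
                          for i, c in enumerate(retrieved))
    return (f"Answer using only the numbered context, citing sources:\n"
            f"{context}\n\nQuestion: {question}\n\nSources:\n{citations}")

print(answer_with_citations("How many days of leave do I get?", chunks))
```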
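Finally, hosting your own model typically means exposing an OpenAI-compatible endpoint inside your network (open-source servers such as vLLM or Ollama can do this) and pointing the same client code at it. The endpoint URL and model name below are assumptions:

```python
# Self-hosted model sketch: the official `openai` client pointed at an
# OpenAI-compatible server running inside your own network (for example
# vLLM or Ollama). URL, key, and model name are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://llm.internal.example:8000/v1",  # your in-network endpoint
    api_key="not-needed-for-internal-endpoint",
)

response = client.chat.completions.create(
    model="mistral-7b-instruct",  # whichever open model you serve internally
    messages=[{"role": "user", "content": "Summarise our holiday policy."}],
)
print(response.choices[0].message.content)
```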
Want to see this in action, or curious how to take the next step in implementing such an architecture yourself?
Don’t hesitate to contact me, or plan a free data & AI intake session here.