Architects: Jump In To Generative AI
We’ve been busy at Forrester on two parallel streams: generative AI and top emerging technologies. I’m pleased to announce that our new report, The Architect’s Guide To Generative AI, is one place where these two streams come together. Generative AI (genAI) was our top emerging technology for 2023 and will likely remain so in 2024. For each of our top 10, we are writing three reports: a “State Of,” then a “Future Of,” and an “Architect’s Guide.” We have been rolling out the “State Of” and “Future Of” reports for our top emerging technologies since the summer. We have just published our first Architect’s Guide, covering genAI.
The Architect’s Guide report is designed to go deeper on top emerging technologies for architects of all types and provide insight on the security and risk aspects of the technology. It follows a very recognizable architectural flow: We discuss current- and future-state trends, break down the emerging technology into capability building blocks, assemble the building blocks into solution patterns, and then end with a technical architecture along with governance, security, and risk considerations.
GenAI Architectures Go Far Beyond LLMs
Architects are caught between extreme executive enthusiasm and the need for speed. GenAI demonstrations via ChatGPT and others seem magical and have executives excited about the potential. Under the hood, however, implementations become challenging very quickly. Architects get stuck: Should you just wait for genAI to arrive in the software you already use, or do you build something with OpenAI or other providers? Or do you venture into the open-source world with models such as Llama and Mistral? We found that:
- Generative AI goes far beyond a single LLM. We identify four ways that genAI is entering organizations and look at who is going beyond “bring your own AI” or simply using genAI tools in existing software. As you can imagine, most firms are talking about using the genAI-infused enterprise software coming to market or experimenting with OpenAI, Microsoft, Google, and other public model-as-a-service providers. For the most advanced firms, most of the time and effort spent has little to do with the main LLM they select.
- RAG-focused solution architectures with pipelines, gates, and service layers work best. Incorporate retrieval-augmented generation (RAG) into your solution architecture. Intent and governance gates on either end of generative models are crucial: GenAI solutions are rolling out through pipelines that shape prompts on the front end and apply governance checks to what comes out. We find an emerging need to focus more on the front end: intent recognition, prompt handling, and model grounding via RAG (see the sketch after this list). Most firms that have tried to fine-tune model parameters tell us that prompt shaping and grounding work just as well and are cheaper and faster, at least initially.
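To make the pipeline pattern concrete, here is a minimal Python sketch of a RAG flow with an intent gate on the front end and a governance gate on the output. Everything in it is an illustrative assumption rather than the report’s reference architecture: the function names, the toy keyword retriever, the stubbed model call, and the example policies would all be replaced by your chosen intent classifier, retriever, LLM, and policy engine.

```python
# A minimal sketch, not a prescription: intent gate -> RAG grounding -> stubbed
# generation -> governance gate. All names and policies are illustrative assumptions.

BLOCKED_INTENTS = {"pii_request"}        # assumed example policy
BANNED_OUTPUT_TERMS = {"internal only"}  # assumed example policy


def classify_intent(prompt: str) -> str:
    """Front-end gate: stand-in for a real intent classifier."""
    return "pii_request" if "social security" in prompt.lower() else "general_question"


def retrieve_context(prompt: str, documents: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by keyword overlap with the prompt."""
    terms = set(prompt.lower().split())
    return sorted(documents, key=lambda d: len(terms & set(d.lower().split())), reverse=True)[:k]


def generate(prompt: str, context: list[str]) -> str:
    """Stub for the model call; a real implementation sends the grounded prompt to your LLM."""
    grounded_prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {prompt}"
    return f"[stubbed answer to a {len(grounded_prompt)}-character grounded prompt]"


def governance_gate(answer: str) -> str:
    """Back-end gate: block output that violates policy before it reaches the user."""
    if any(term in answer.lower() for term in BANNED_OUTPUT_TERMS):
        return "Response withheld by governance gate."
    return answer


def answer_question(prompt: str, documents: list[str]) -> str:
    """Pipeline: intent gate, retrieval, generation, then governance gate."""
    if classify_intent(prompt) in BLOCKED_INTENTS:
        return "Request declined by intent gate."
    context = retrieve_context(prompt, documents)
    return governance_gate(generate(prompt, context))


if __name__ == "__main__":
    docs = ["Our travel policy allows economy fares.", "Expense reports are due monthly."]
    print(answer_question("What does the travel policy allow?", docs))
```

The point of the structure is that the model sits between two gates, so prompt shaping, grounding, and output governance can evolve independently of whichever LLM you select.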
Don’t Lose Sight Of The Fundamentals
While genAI architectures may seem like moving targets, don’t lose sight of the fundamentals. Your technical architectures must be nimble for more: more models, more data, and more scale seem to be the order of the day. To be successful, your tech stack must be able to manage and version more models, monitor more outcomes and more engagements, and optimize costs as everything gets bigger (a small model-versioning sketch follows the list below). Furthermore, this isn’t just in the cloud; you need to plan for personal AI devices coming to your pocket soon! Focus on:
- Assembling well-managed data. Whether structured or unstructured, data that is consistent, well inventoried, and high quality is essential if your goal is to apply genAI to your own organization’s problems.
- Addressing governance, risk, and compliance stakeholder concerns. In the Architect’s Guide, we embrace the idea of building specific architectural models that address the concerns of risk-oriented stakeholders. We give you examples of how to address these concerns in our report.
- Engaging in governance discussions based on a set of principles. Do not try to establish standards; things are simply moving too fast. In the report, we identify three data and application principles and three infrastructure and operations principles to guide your work.
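As a companion to the “manage and version more models” point above, here is a minimal sketch of what versioned model management can look like in code. The class and field names are hypothetical and the models shown are placeholders, not recommendations; the idea is simply that callers resolve a model through a registry, so swapping, pinning, or rolling back a model is a configuration change rather than a code change.

```python
# A minimal sketch of versioned model management, with hypothetical names
# (ModelConfig, ModelRegistry) and placeholder models; it is not a vendor API.

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelConfig:
    name: str               # any hosted or open model, e.g. a Llama or Mistral variant
    version: str            # pin the exact version or snapshot you validated
    max_cost_per_1k: float  # cost guardrail used when optimizing spend


@dataclass
class ModelRegistry:
    """Keeps every registered config per use case so rollback stays possible."""
    models: dict[str, list[ModelConfig]] = field(default_factory=dict)

    def register(self, use_case: str, config: ModelConfig) -> None:
        self.models.setdefault(use_case, []).append(config)

    def current(self, use_case: str) -> ModelConfig:
        """Callers resolve the model here instead of hard-coding one."""
        return self.models[use_case][-1]


if __name__ == "__main__":
    registry = ModelRegistry()
    registry.register("support_chat", ModelConfig("llama-3-70b-instruct", "2024-01", 0.60))
    registry.register("support_chat", ModelConfig("mistral-large", "2024-03", 0.40))
    print(registry.current("support_chat"))  # swapping models was a registration, not a rewrite
```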
A big thanks to my colleagues Charlie Betz, Jeff Pollard, Alvin Nguyen, and Will McKeon-White, who accompanied me on this journey of discovery as we dug into the guts of the technology, as well as several other analysts who shared their experience in developing this deeper look at the technology. Thank you to the several architects and technology leaders who volunteered their time for this research, and a special thanks to our internal genAI team working on Izola, our client-facing research tool that is working its way through beta and coming soon to all our clients.