Coming out of The Crash, we understood the problem more deeply—because we’d lived it.
We had stretched a working bot to cover a new use case and data source—and watched it strain under the weight of unclear expectations, sprawling content, and retrieval blind spots. But those hard lessons gave us the clarity to regroup. We didn’t need to start over. We needed to design with intention and sharper focus.
This phase is where we stop assuming, start scoping tightly, and make intentional choices that reflect the realities of our data, users, and goals.
AI challenge: Construct a chatbot that can leverage constantly changing, unstructured go-to-market (GTM) content to reduce sales friction by providing brief and accurate answers to seller questions as well as links to more detailed information.
The build: We built this assistant on Red Hat OpenShift Platform Plus and Red Hat OpenShift AI, using Granite as the core model; OpenShift AI gave us enterprise-grade model serving and deployment. LangChain orchestrated the retrieval flow, and PGVector (an extension to the popular PostgreSQL database) handled vector storage. We used MongoDB to log interactions with the AI. To preserve context from long-form documents, we used structure-aware tools like Docling and experimented with Unstructured’s Python libraries to pull speaker notes from slides. While that code didn’t make it into production, the experiment revealed just how crucial structure and formatting are to successful retrieval; those lessons now guide our preprocessing efforts.
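To make the architecture concrete, here’s a minimal sketch of how those pieces could fit together. This is not our production code: the endpoint URL, model name, connection string, and collection names are all placeholders, and it assumes the Granite model is served behind an OpenAI-compatible endpoint on OpenShift AI.

```python
# Minimal sketch of the retrieval flow. Placeholders throughout; not production code.
from langchain_openai import ChatOpenAI
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import PGVector
from pymongo import MongoClient

# Granite served via an OpenAI-compatible endpoint on OpenShift AI (hypothetical URL)
llm = ChatOpenAI(
    base_url="https://granite.apps.example.com/v1",  # placeholder endpoint
    api_key="not-a-real-key",
    model="granite-3-8b-instruct",                   # placeholder model name
)

# PGVector stores the embedded GTM content (placeholder connection string)
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = PGVector(
    connection_string="postgresql+psycopg2://user:pass@db.example.com/vectors",
    embedding_function=embeddings,
    collection_name="gtm_content",
)

# MongoDB captures each interaction for later review
log = MongoClient("mongodb://mongo.example.com")["assistant"]["interactions"]

def answer(question: str) -> str:
    docs = store.similarity_search(question, k=4)  # retrieve candidate chunks
    context = "\n\n".join(d.page_content for d in docs)
    reply = llm.invoke(
        f"Answer briefly using only this context:\n{context}\n\nQuestion: {question}"
    )
    log.insert_one({"question": question, "answer": reply.content})  # audit trail
    return reply.content
```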
Phase 3: Iteration with intention
“Let’s build the right thing—with the right people, for the right problem”
With the knowledge of where things had broken, we designed a strategy to unlock the potential of our data sources.
When the project hit a new blocker, we didn’t abandon what we’d built—we took the time to understand it. The early bot wasn’t a failure. It was designed for a structured, curated content set and worked well within those boundaries. But when we applied it to a broader set of unstructured sources, its limitations became clear.
It surfaced gaps in our assumptions, revealed architectural constraints, and highlighted places where our organization wasn’t fully prepared. Progress slowed. Validation became manual and time-consuming, often adding a heavy lift for SMEs just to pinpoint where things were breaking down. The system relied solely on chunk-based retrieval using semantic similarity, without metadata filtering, signal weighting, or any real context retention between turns. Add nightly revectoring and the lack of human-in-the-loop validation, and we were spending more time navigating constraints than delivering value.
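To make that gap concrete, here’s a hedged sketch of the difference between the similarity-only retrieval we had and the metadata-filtered retrieval we lacked. It builds on the PGVector store from the earlier sketch; the metadata fields are hypothetical, since our chunks weren’t tagged at ingestion:

```python
# What we had: similarity-only retrieval across every chunk in the store
docs = store.similarity_search("How do I position OpenShift for telco?", k=4)

# What was missing: the same search scoped by metadata signals.
# The "audience" and "product" fields are hypothetical; they assume chunks
# were tagged at ingestion time, which ours were not.
docs = store.similarity_search(
    "How do I position OpenShift for telco?",
    k=4,
    filter={"audience": "seller", "product": "openshift"},
)
```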
Eventually, it became clear: we were adding complexity faster than impact. And with every adjustment, the long-term risks grew harder to ignore.
At the same time, we started to identify what had worked and leaned into it: our DIY low/no-code validation tools, SME-led evaluations, and targeted content evolution using Gemini and NotebookLM. SME teams have been exploring how to restructure materials so they perform better in AI workflows.
Example – SME-Led Rework
One SME team piloted this with a single topic, starting with a bank of materials and refining them into one stronger resource. They used Gemini and NotebookLM to test structure, prompts, and phrasing. The final version was tested inside Google’s NotebookLM and then prepped for ingestion into the chatbot. While early, this alpha approach shows promise for reducing volume and evolving toward materials designed around real questions.
So we stepped back: we narrowed the scope, aligned around a single use case, and committed to building with intention. Behind the scenes, we’re doing the real work of metadata cleanup, saturation analysis, and targeted content reduction. The new design is aligned with the types of unstructured assets we have, the user journeys we can realistically support, and workflows that minimize long-term debt.
Lessons playbook: Phase 3 – building with what you know now
- Narrowing focus increases success: Building for everyone means failing someone. When you define a use case, you unlock clarity, alignment, and impact.
- Feedback accelerates progress: Focused environments with test groups, like a proof of concept (POC) for a new Seller Experience, give faster, better insights into what performs and what doesn’t for users, creators, and the bot.
- Fix the signals: Good answers require good inputs. User role awareness, filters, and tagging are the silent levers that make smart search possible.
- AI needs audiences, not just documents: We’re not just feeding files into a system. We’re designing resources around what people need, how they ask, and how the system selects.
- Authoring mindset must shift: Small, fragmented assets often underperform, get lost in the shuffle, and can flood the system with similar messaging that is hard for AI to differentiate. Shifting toward richer, well-structured resources can improve both AI output and human clarity. This shift also aligns with the move toward chunk enrichment and multi-layered context in modern retrieval-augmented generation (RAG) systems (see the sketch after this list).
- Content strategy is retrieval strategy: Library audits, tag reviews, and signal checks aren’t busywork; they’re how you prep for AI.
- SMEs are the secret ingredient: AI accelerates a lot, but you still need someone who can scan quickly, know what’s off, and bridge product, content, and user context. Without a dedicated SME, you cannot validate outputs. It’s manual, exhausting, and essential.
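As promised above, here’s a minimal sketch of what chunk enrichment can look like: folding document-level context into each chunk before embedding, so retrieval can tell similar-sounding fragments apart. The field names and example values are illustrative, not our actual schema:

```python
# Chunk enrichment sketch: prepend document-level context to each chunk
# before embedding. Field names and values are hypothetical.
def enrich_chunk(chunk_text: str, doc_title: str, section: str, keywords: list[str]) -> str:
    """Build a context header so the embedding carries more than the raw fragment."""
    header = (
        f"Document: {doc_title}\n"
        f"Section: {section}\n"
        f"Keywords: {', '.join(keywords)}\n\n"
    )
    return header + chunk_text

enriched = enrich_chunk(
    chunk_text="...raw chunk text...",
    doc_title="OpenShift GTM battlecard",          # illustrative title
    section="Competitive positioning",
    keywords=["pricing", "telco", "objection handling"],
)
```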
Market reflection
When we started this build, traditional RAG architecture was still seen as the go-to approach. But we quickly ran into what others are now calling out too: RAG on unstructured enterprise content doesn’t cut it unless it’s paired with metadata, user signals, and clearly scoped use cases. We didn’t just read about it; we hit those limitations head-on.
By early 2025, the market began catching up.
- Microsoft put out guidance on chunk enrichment, showing how even basic metadata, like section titles, summaries, and keywords, can dramatically improve retrieval accuracy. (See article)
- Google, in a study covered by VentureBeat, introduced the idea of “sufficient context.” Their point? It’s not enough to retrieve something vaguely related; the chunk needs to carry enough context to actually answer the question. (Read more)
These weren’t revelations for us; they were confirmation. We learned it by doing, right as the documentation started to catch up. And now, things like hybrid retrieval, reranking, and chunk-level context checks feel less like “extras” and more like table stakes.
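For readers newer to those terms, here’s a hedged sketch of hybrid retrieval using LangChain’s EnsembleRetriever: keyword (BM25) and vector results blended by weight, with reranking as a natural next step. The weights and query are illustrative, not tuned values from our system:

```python
# Hybrid retrieval sketch: blend keyword (BM25) and semantic (vector) results.
# Assumes `docs` is the ingested document list and `store` is the PGVector
# store from the earlier sketch; the 0.4/0.6 weights are illustrative only.
from langchain_community.retrievers import BM25Retriever
from langchain.retrievers import EnsembleRetriever

bm25 = BM25Retriever.from_documents(docs)            # exact-term matching
bm25.k = 4
vector = store.as_retriever(search_kwargs={"k": 4})  # semantic similarity

hybrid = EnsembleRetriever(retrievers=[bm25, vector], weights=[0.4, 0.6])
results = hybrid.invoke("What discounts apply to multi-year OpenShift deals?")
# A reranker (e.g., a cross-encoder) would typically rescore `results` next.
```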
Red Hat reflection
This journey gave us the chance to share our story openly, struggles included, and connect with others who are facing similar challenges in today’s AI-in-business explosion. Not just the engineers and architects, but the people in the middle: the ones bridging strategy and execution, the business and the build. These are the folks driving AI adoption in the real world, often without clear roadmaps, doing their best to make emerging tools work inside complex systems. That’s where this story lives: in the practical application of AI, by teams navigating uncertainty together.
Looking ahead
We’re launching a pilot for a new Seller Experience platform. Once live, this integrated seller journey and AI Assistant POC will unlock several key outcomes. Seller feedback will help refine both the assistant experience and the collateral it draws from. A bank of seller questions will surface, offering insights into enablement gaps and real-world search patterns.
The real opportunity now is improving how we use our existing, often unstructured, content sources. The New Seller Experience POC requires us to build an assistant that can retrieve from context-rich documents—the kind that reflect how we actually speak, build, and sell. From our earlier project, we know how difficult that is without stronger structure, metadata, and content signals to guide retrieval. Those lessons are shaping how we approach this next build.
For content creators, it helps close the loop between publishing and performance. For stakeholders, it builds confidence in AI grounded in field use. And for the assistant itself, it creates a feedback loop that enables tuning based on actual interactions with working content.
Next week, in “Note to Fellow PMs (and Accidental Knowledge Strategists),” I’ll share what I wish I’d known before taking on this project, along with a project checklist.
Resource
Get started with AI for enterprise: A beginner’s guide
About the author
Andrea Hudson is a program manager focused on making AI tools useful in the messy world of enterprise content. At Red Hat since 2022, she helps teams untangle complexity, connect the dots, and turn good intentions into working systems. Most recently, she helped reshape an AI chatbot project that aimed to surface the right go-to-market content but ran into the chaos of unstructured data.
Her background spans product launches, enablement, product testing, data prep, and evolving content for the new AI era. As a systems thinker with early training in the U.S. Navy, she relies on what works in practice, not just in theory. She’s focused on building things that scale, reduce rework, and make people’s lives easier.
Andrea writes with honesty, sharing lessons from the projects that don’t go as planned. She believes transparency is key to growth and wants to give others a starting point by sharing the messy middle, not just the polished end.
When she’s not wrangling AI or metadata, you’ll find her tinkering with dashboards, learning low/no-code tools, playing on Red Hat’s charity eSports teams, recruiting teammates, or enjoying time with her family.