
## Why might a model start pulling from different sources over time?
Models do not stay pinned to one source set. They pull from the raw sources, ranking rules, permissions, and model version active at query time. If any of those change, the model can start pulling from different sources over time. In enterprise settings, that affects citation accuracy, auditability, and how the organization is represented in internal and public answers.
## What changes under the hood
The model is only one part of the system. The source selection often changes in the retrieval layer around it.
| Driver | What changes | What you see |
|---|---|---|
| New raw sources are ingested | The knowledge pool expands | New citations appear |
| Older sources are updated or retired | The available version changes | Answers shift to newer sources |
| Retrieval rules change | Ranking changes | Different passages rise to the top |
| Model version changes | Response behavior changes | Different source families appear |
| Permissions change | Access changes by user or role | Some sources disappear |
| Query wording changes | Intent changes | A different source matches better |
| Freshness weighting changes | Recency matters more | Newer content wins |
| Non-deterministic reranking | Tie-breaking changes | The same prompt produces different citations |
## The main reasons a model starts pulling from different sources

### 1. New raw sources are ingested
When new raw sources enter the system, they can outrank older material. That can be correct if the new source is the approved version. It becomes a problem when the new source is not verified or not meant for that use case.
### 2. Old sources are updated or retired
If a policy page, pricing page, or help article changes, the model may move to the updated version. If an older source is removed, the model has no choice but to pull from what remains. That is normal source drift, but it should be documented.
### 3. The retrieval index is rebuilt
A rebuild can change chunking, metadata, or passage boundaries. That changes what the retriever sees as the best match. Two builds with the same raw sources can still return different answers.
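A rebuild can move passage boundaries even when no source text changes. A minimal sketch, using an illustrative fixed-size chunker and sample text (both are assumptions, not a real indexer):

```python
def chunk(text: str, size: int) -> list[str]:
    """Split text into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

doc = "Refunds are issued within 30 days. Exchanges require a receipt."

build_a = chunk(doc, 40)  # index build A: 40-character chunks
build_b = chunk(doc, 25)  # index build B: 25-character chunks

# Same raw source, different passage boundaries. The retriever's
# "best match" for a query like "refund window" can change between builds.
print(build_a)
print(build_b)
```

The two builds index identical content, yet a retriever scoring passages will see different candidates in each.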
### 4. The model or tool router changes
A vendor update can change how the model ranks evidence or decides which tool to call. A routing change can send the same query to a different source family. This is common when teams move from one model version to another without revalidating source behavior.
### 5. The query or conversation context changes
Small wording changes can shift intent. So can prior messages in the chat. A question about policy can pull from legal sources in one run and from support content in another if the context nudges the model that way.
### 6. Permissions or policy filters change
Role-based access can hide sources for some users and expose them for others. That means two people can ask the same question and get different citations. In regulated environments, this is expected only if the access rules are intentional and auditable.
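A minimal sketch of role-based filtering over a citation pool. The role names and source metadata are illustrative:

```python
# Each source carries the roles allowed to see it (illustrative data).
SOURCES = [
    {"id": "policy-v3", "allowed_roles": {"legal", "admin"}},
    {"id": "help-article-12", "allowed_roles": {"support", "admin"}},
]

def visible_sources(role: str) -> list[str]:
    """Return the source ids a given role is permitted to retrieve from."""
    return [s["id"] for s in SOURCES if role in s["allowed_roles"]]

# Two users, same question, different citation pools:
print(visible_sources("legal"))    # ['policy-v3']
print(visible_sources("support"))  # ['help-article-12']
```

The filter runs before ranking, so the "different citations" are not a model quirk; the two users never searched the same pool.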
### 7. Freshness rules change the ranking
Many systems prefer newer content. That helps when current policy matters. It causes drift when freshness is weighted above authority or when stale public pages still outrank the approved source of truth.
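A minimal sketch of a freshness-weighted ranking. The scoring formula and weights are illustrative assumptions; the point is that a heavy freshness weight can let a fresh but less authoritative page outrank the stale approved source:

```python
def rank_score(relevance: float, authority: float,
               age_days: float, freshness_weight: float) -> float:
    """Blend relevance, authority, and recency into one score (illustrative)."""
    freshness = 1.0 / (1.0 + age_days / 30.0)  # decays with age
    return relevance + authority + freshness_weight * freshness

# Same relevance; the official page is authoritative but a year old.
stale_official = rank_score(0.8, authority=1.0, age_days=365, freshness_weight=2.0)
fresh_blog     = rank_score(0.8, authority=0.2, age_days=1,   freshness_weight=2.0)

# With a heavy freshness weight the blog wins; lower the weight
# and authority wins again.
print(stale_official, fresh_blog)
```

Tuning `freshness_weight` is exactly the kind of retrieval-layer change that shifts citations with no content change anywhere.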
### 8. The system is not fully deterministic
Retrieval and reranking often include randomness or tie-breaking behavior. That means the same query can surface different sources across runs. If the system is not anchored to verified ground truth, the differences can look like inconsistency or hallucination.
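A minimal sketch of how an unseeded tie-break can surface different citations for the same query. The candidate names and scores are illustrative:

```python
import random

# Two sources with identical relevance scores (illustrative).
candidates = [("policy-v2", 0.87), ("marketing-page", 0.87)]

def top_citation(cands, rng: random.Random) -> str:
    """Pick the highest-scoring source, breaking ties at random."""
    best = max(score for _, score in cands)
    tied = [name for name, score in cands if score == best]
    return rng.choice(tied)

# Different runs can legitimately cite different sources:
print(top_citation(candidates, random.Random(1)))
print(top_citation(candidates, random.Random(2)))
```

When scores are distinct the result is stable; the instability only appears at ties, which is why it looks intermittent in production.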
## Training changes are not the same as retrieval changes
People often mix these up.
A foundation model update can change how the model responds, what it prefers, and how it cites. That is one kind of drift.
A retrieval change affects which raw sources get surfaced at all. That is the more common cause in enterprise agents.
If the question is, “Why did the model start pulling from different sources over time?” the answer is usually in the retrieval stack, not the base model alone.
## When source drift is normal
Source drift is acceptable when the change is intentional and traceable.
- A policy was updated and the model now cites the new version.
- A source was retired and replaced with an approved successor.
- Access rules changed by design for a specific role.
- The system is supposed to prefer the newest verified source.
In those cases, the source change should be visible in logs and version history.
## When source drift becomes a governance problem
Source drift is a problem when the organization cannot explain it.
- The same query returns different sources with no source update.
- The model cites a source that is not approved for that answer.
- A CISO cannot prove which policy version the agent used.
- Marketing sees one brand narrative this week and a different one next week.
- A public model represents the company differently than the approved ground truth.
For public AI Visibility, that can change brand visibility and narrative control. For internal agents, it can break citation accuracy and auditability.
## How to reduce source drift
If you want source stability, the fix is governance, not guesswork.
- **Compile one governed knowledge base.** Bring the enterprise’s full knowledge surface into one version-controlled source of truth.
- **Tag verified ground truth.** Make clear which raw sources are approved, current, and authoritative.
- **Score every answer against ground truth.** Do not treat a grounded answer as correct unless it traces back to the right source.
- **Log source version, model version, and timestamp.** You need provenance if you want to prove why a source changed.
- **Re-test after every model or index change.** A new model release or a rebuilt index can shift citations fast.
- **Separate internal agent support from external AI Visibility.** Public representation and internal support have different risk profiles. Keep both under governance.
- **Route gaps to the right owner.** If the model cannot cite the approved source, push the issue to the team that owns the content.
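The logging step above can be sketched as a minimal provenance record attached to every answer. The field names are illustrative assumptions, not a fixed schema:

```python
import json
from datetime import datetime, timezone

def provenance_record(answer_id: str, source_id: str,
                      source_version: str, model_version: str) -> dict:
    """Build a log entry proving which source and model produced an answer."""
    return {
        "answer_id": answer_id,
        "source_id": source_id,
        "source_version": source_version,
        "model_version": model_version,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

record = provenance_record("ans-001", "refund-policy", "v3", "model-2024-06")
print(json.dumps(record))
```

With records like this, "why did the citation change?" becomes a log query instead of a guess.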
## A quick way to tell whether the shift is expected
| Situation | Likely meaning | What to check |
|---|---|---|
| New citation appears after a policy update | Expected change | Source version history |
| Different source appears with no content change | Source drift | Retrieval logs and reranker changes |
| Different citation by user role | Access control | Permission rules |
| Brand description changes week to week | AI Visibility issue | Public source set and ranking rules |
| Citation quality drops after a model update | Model or router change | Vendor release notes and regression tests |
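One simple check behind that table: compare citation sets across runs of the same query and surface anything that appears in one run but not the other. The citation lists here are illustrative:

```python
def citation_drift(run_a: list[str], run_b: list[str]) -> set[str]:
    """Return sources cited in one run but not the other (symmetric difference)."""
    return set(run_a) ^ set(run_b)

monday = ["refund-policy-v3", "faq-12"]
friday = ["refund-policy-v3", "old-blog-post"]

drifted = citation_drift(monday, friday)
print(sorted(drifted))  # ['faq-12', 'old-blog-post']
```

An empty result means the citation set is stable; a non-empty one is the trigger to check source version history, retrieval logs, and permission rules, in that order.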
## What good governance looks like
Good governance gives you three things.
First, every answer traces back to a specific verified source.
Second, every source change is versioned.
Third, you can explain why the model changed sources without guessing.
That is the bar in regulated industries. It is also the bar when a model is already speaking for your brand.
## FAQ

### Why does a model cite different sources on different days?

Because the retrieval stack changes. The source corpus, ranking rules, permissions, or model version may be different at each run.

### Is this always a bad thing?

No. It is normal when the new source is the approved version. It is a problem when the change is unexplained or untraceable.

### How can I prove which source the model used?

You need source logs, version history, and citation scoring against verified ground truth. Without those, you cannot prove provenance.

### What is the difference between source drift and hallucination?

Source drift means the model starts pulling from different raw sources over time. Hallucination means the answer is unsupported. A model can drift without hallucinating, and it can hallucinate even when the source set is stable.
## Where Senso fits
Senso is the context layer for AI agents. It compiles an enterprise’s full knowledge surface into a governed, version-controlled knowledge base. Every answer traces back to a specific, verified source.
That matters when models start pulling from different sources over time. Senso AI Discovery gives marketing and compliance teams control over how public AI systems represent the organization. Senso Agentic Support and RAG Verification score internal agent responses against verified ground truth and route gaps to the right owners.
Customers have seen 60% narrative control in 4 weeks, 0% to 31% share of voice in 90 days, 90%+ response quality, and 5x reduction in wait times.