Operator note: This is an opinion and operations piece from Fix I.T. Phill. The point is not that one country, class, or culture owns the problem. The point is that AI learns from human output, and a lot of public human output is low-signal, rushed, under-reviewed, or too far removed from the work it claims to explain.
AI models do not learn from magic. They learn from people.
That is the uncomfortable part of the conversation. When the public internet becomes training material, a model does not only absorb elite engineers, clear technical writers, disciplined operators, and careful researchers. It also absorbs rushed documentation, shallow tutorials, outsourced checklist content, social-media writing, SEO filler, support articles written by people who barely understood the product, and code examples copied from other people who were also guessing.
Then we act surprised when the model sounds fluent but loses the plot.
That is the real AI quality problem I keep running into: the model can write beautifully and still make poor operational decisions. It can sound more polished while becoming less dependable in constrained work. It can summarize the issue, praise the plan, invent a new plan, and still miss the actual function that needed to be fixed.
That is not useful intelligence. That is internet-shaped fluency.
The Internet Is Not A Clean Training Room
The open web is a mixed bag. Some of it is brilliant. Some of it is practical. Some of it is dead wrong. A lot of it is almost right, which is worse because it feels believable.
Technical documentation is especially messy. The best engineers are often busy building, debugging, responding to incidents, or carrying production. The person assigned to write the documentation may be excellent. They may also be new, overloaded, outsourced, underpaid, far from the implementation, or simply stuck translating half-explained engineering notes into a public article.
That does not make them bad people. It makes the output uneven.
And AI does not always know the difference between a battle-tested operator guide and a thin article written to close a ticket.
The Documentation Dumping Ground
In too many organizations, documentation becomes a dumping ground. Strong developers and infrastructure people are told to move fast, ship features, and solve production problems. The write-up gets handed to whoever is available. Sometimes that is a skilled technical communicator. Sometimes it is a junior person trying to reverse-engineer the product from Slack messages. Sometimes it is a contractor paid to make the docs look complete even when the workflow underneath is not fully understood.
That produces documentation with familiar weaknesses:
- It explains the happy path but not the failure path.
- It lists commands without explaining when those commands are unsafe.
- It shows a clean example that does not survive a real server.
- It names the feature but not the operational constraint.
- It hides the assumption that the writer never tested the edge case.
- It turns a production workflow into a brochure.
Now feed enough of that into a model. What do you get back?
You get a documentation-shaped answer instead of an operator-shaped answer.
That difference matters. Documentation-shaped answers describe the feature. Operator-shaped answers understand the job, the failure mode, the blast radius, the order of operations, and what not to touch.
Labor Quality Becomes Data Quality
This is where people get uncomfortable, but it needs to be said carefully: training data quality is connected to human labor quality.
Different labor markets solve staffing in different ways. Some environments lean hard on relationship-based staffing. Somebody brings in a cousin, a friend, a neighbor, or whoever is nearby and trusted. The job may be above that person’s current ability, but they are there, so they learn under pressure or fail under pressure.
Other environments, especially in corporate America, overcorrect in the opposite direction. They turn hiring into a six-month ritual. A role opens. A recruiter screens 150 people. A hiring manager gives weak feedback. The budget shifts. The role gets paused. No one is hired, nothing improves, and the remaining team keeps producing rushed work because the process ate the time it was supposed to save.
Both systems create bad knowledge artifacts.
One puts undertrained people into complex roles too quickly. The other blocks capable people from doing useful work at all. One creates trial-by-fire documentation. The other creates corporate paralysis documentation. Neither automatically produces clear technical truth.
AI learns from the result.
This is not about nationality. A bad tutorial written in Detroit is just as poisonous as a bad tutorial written overseas. A sloppy enterprise knowledge base from a Fortune 500 company can mislead a model just as easily as a copied blog post on a low-quality site. The issue is proximity to the work, competence, review, and incentives.
The Viral-Culture Effect
There is also a cultural layer. The internet rewards speed, confidence, attitude, repetition, and emotional punch. It does not reliably reward accuracy. Social platforms trained a generation of content creators to perform certainty instead of earning it.
That style leaks into everything. It leaks into marketing. It leaks into tutorials. It leaks into technical explainers. It leaks into leadership posts that sound smart and say nothing. It leaks into AI training material.
The result is a model that can imitate confidence better than it can maintain discipline. It can give you a great headline and a weak implementation. It can make a plan sound intelligent while ignoring the actual constraint. It can turn a one-function repair into a rewrite because the rewrite sounds more impressive.
That is the viral-culture effect: polished output with thin reasoning underneath.
Creative Fluency Is Not Logical Discipline
Creative work and logical work use different muscles.
A great writer is not automatically a great systems engineer. A great debugger is not automatically a great marketer. A great storyteller may see ten possible directions. A great operator has to know which nine directions are distractions.
AI inherits that split because AI is modeled from human output. A model trained and tuned for expression may become excellent at tone, summary, framing, and narrative flow. That same model may be weaker at constrained tasks where the correct move is boring: read the existing code, find the broken path, edit the smallest safe area, run the test, and stop.
In real technical work, that boring discipline is everything.
When a model goes off the rails, it often does not look like chaos at first. It looks like helpfulness. It wants to add a new helper. It wants to rewrite the flow. It wants to protect you from a problem you did not ask about. It wants to turn a defensive security article into a policy lecture. It wants to solve the surrounding universe instead of the task in front of it.
That is why model routing matters.
One Model Should Not Do Every Job
The answer is not to declare one model smart and another model dumb. The answer is to stop pretending that one model should do every job equally well.
Use creative models where creativity matters. Use coding and reasoning models where execution discipline matters. Pair them when the job benefits from both. Let one draft. Let another verify. Let one generate options. Let another enforce constraints.
For a real business workflow, the split might look like this:
- Creative model: headlines, article framing, plain-English summaries, social copy, tone exploration.
- Coding model: repository edits, test repair, function-level debugging, deployment scripts, file-by-file changes.
- Security-aware workflow: defensive impact, patch guidance, safe validation boundaries, publication safety review.
- Human operator: final judgment, production risk, customer context, business priority, source trust.
OpenAI’s own model-selection guidance says to optimize for accuracy first and build evaluation datasets around the use case before optimizing for cost and latency. That is the right direction. The question should not be, “Which model sounds smartest?” The question should be, “Which model reliably performs this job under this constraint?”
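To make the routing idea concrete, here is a minimal sketch of what task-based routing can look like. Everything in it is an assumption for illustration: the model names, the temperatures, and the call_model placeholder stand in for whatever stack you actually run.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical task router. Model names, temperatures, and the call_model
# placeholder are stand-ins, not any vendor's real API.
@dataclass
class Route:
    model: str                      # which model handles this task type
    temperature: float              # creative work gets room; repair work does not
    verifier: Optional[str] = None  # a second model, or "human", that checks output

ROUTES = {
    "copywriting":     Route(model="creative-large", temperature=0.9),
    "code_repair":     Route(model="coding-strict", temperature=0.1,
                             verifier="coding-strict"),
    "security_review": Route(model="coding-strict", temperature=0.0,
                             verifier="human"),  # a person signs off, always
}

def call_model(model: str, prompt: str, temperature: float) -> str:
    """Placeholder: wire this to whatever client your stack actually uses."""
    raise NotImplementedError

def handle(task_type: str, prompt: str) -> str:
    route = ROUTES[task_type]  # unknown task types fail loudly instead of guessing
    draft = call_model(route.model, prompt, temperature=route.temperature)
    if route.verifier == "human":
        return f"[QUEUED FOR HUMAN REVIEW]\n{draft}"
    if route.verifier:
        return call_model(route.verifier,
                          f"Check this output against the task.\nTask: {prompt}\nOutput: {draft}",
                          temperature=0.0)
    return draft
```

The design choice that matters is the verifier field: operational and security tasks get a second set of eyes by default, and the router refuses to guess on task types it does not know.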
The Security Publishing Example
Security content is where this gets obvious.
A defensive CVE article should tell people what product is affected, who needs to patch, what versions are exposed, whether active attacks are being reported, how to update safely, how to verify the fix, what logs or files to review, and what to tell customers.
It should not publish copy-paste attack instructions.
That line is not hard for an experienced security operator. But an AI model without enough operational context can blur it in either direction. It may refuse safe defensive work because the topic smells dangerous, or it may include too much attack detail because the article pattern it learned from included too much attack detail.
Both are failures.
The correct behavior is simple: protect people, do not teach abuse.
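Part of that boundary can even be checked mechanically before a human reads the draft. Below is a minimal sketch of a pre-publication lint, assuming hypothetical section names and red-flag patterns; a keyword gate is not a security review, but it catches the obvious failures in both directions.

```python
import re

# Hypothetical pre-publication lint for a defensive CVE article. The
# required sections and red-flag patterns are illustrative; a real policy
# comes from your security team, not from this sketch.
REQUIRED_SECTIONS = [
    "affected product", "affected versions", "how to update",
    "how to verify the fix", "what to review in logs",
]
RED_FLAGS = [
    re.compile(r"proof[- ]of[- ]concept", re.IGNORECASE),
    re.compile(r"exploit code", re.IGNORECASE),
]

def review_draft(article: str) -> list[str]:
    problems = []
    lowered = article.lower()
    for section in REQUIRED_SECTIONS:
        if section not in lowered:
            problems.append(f"missing defensive section: {section}")
    for pattern in RED_FLAGS:
        if pattern.search(article):
            problems.append(f"possible attack detail: {pattern.pattern}")
    return problems  # an empty list is a gate passed, not an approval
```

An empty result is not a sign-off. The human operator still makes the final call on what is safe to publish.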
Model Collapse Is The Academic Warning Sign
There is a research version of this concern too. A 2024 Nature paper, “AI models collapse when trained on recursively generated data,” warns that model collapse can set in when generated output keeps feeding the training of later models: over successive generations, models lose information about the original data distribution.
That research focuses heavily on synthetic data, but the practical business concern is broader. Low-quality human data can also pollute judgment. Synthetic slop is a problem. Human slop is also a problem. The model does not get wiser just because the bad sentence was typed by a person.
If the public web becomes a recycling loop of rushed human writing, AI-generated rewrites, engagement bait, shallow tutorials, copied documentation, and more AI-generated rewrites, the signal gets weaker. The model may still sound smooth. It may even sound smoother. But smooth is not the same as correct.
What Businesses Should Do
If you are using AI for real operations, do not treat model output as magic. Treat it like a powerful junior teammate with unusual strengths, strange blind spots, and no natural fear of production consequences.
Use a few hard rules:
- Route by task. Do not use the same model for prose, code repair, security triage, and production deployment without testing.
- Build evals. Measure whether the model completes your actual workflow, not whether it writes a nice answer. A minimal harness sketch follows this list.
- Prefer primary sources. Feed models vendor docs, code, tickets, logs, and internal runbooks instead of random web summaries.
- Keep humans near the work. The reviewer should understand the system, not just the grammar.
- Reward restraint. A model that changes less and verifies more is often more valuable than one that rewrites everything.
- Separate creative drafting from operational execution. The thing that writes the headline does not have to be the thing that edits the deployment script.
- Protect the knowledge base. Bad internal docs become bad AI context. Fix the docs before blaming the model for learning from them.
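As promised above, here is a minimal eval harness sketch. The run_task hook, the case names, and the pass checks are all assumptions; the point is that a case passes only when the workflow completes, not when the prose sounds good.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    name: str
    task_type: str                # drives routing (see the routing sketch above)
    prompt: str
    check: Callable[[str], bool]  # did the output finish the actual job?

def run_task(task_type: str, prompt: str) -> str:
    """Placeholder: call your routed model here."""
    raise NotImplementedError

def run_evals(cases: list[EvalCase]) -> None:
    passed = 0
    for case in cases:
        ok = case.check(run_task(case.task_type, case.prompt))
        passed += ok
        print(f"{'PASS' if ok else 'FAIL'}  {case.name}")
    print(f"{passed}/{len(cases)} workflows completed")

# Example case: the check rewards restraint, not eloquence. This hypothetical
# code-repair task fails the eval if the model's patch balloons past the
# small fix that was asked for.
CASES = [
    EvalCase(
        name="repair the broken parser without rewriting the module",
        task_type="code_repair",
        prompt="Fix the failing test in parser_test.py. Change as little as possible.",
        check=lambda patch: "def parse" in patch and len(patch.splitlines()) < 40,
    ),
]
```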
NIST’s AI Risk Management Framework and Generative AI Profile point in the same general direction: organizations need to manage AI quality, trustworthiness, measurement, monitoring, and human oversight as operational concerns, not vibes. OpenAI’s agent-evaluation guidance also emphasizes reproducible evaluations for measuring agent quality. That is the grown-up path: measure the work.
The Real Regression To Watch
The danger is not that AI stops being able to write.
The danger is that AI gets better at sounding right while getting worse at staying inside the task. Better prose can hide weaker discipline. Better confidence can hide worse judgment. Better formatting can hide the fact that the model ignored the existing system.
That is the regression businesses should watch for.
Not “Can it talk?”
Can it finish the job?
Can it follow constraints? Can it know when not to touch something? Can it tell the difference between defensive security guidance and attack instructions? Can it repair the function that broke instead of writing a new universe around it?
Those are the questions that matter.
Bottom Line
AI quality is not only a model problem. It is a human-output problem.
Weak documentation becomes weak training signal. Weak training signal becomes weak model behavior. Weak model behavior becomes expensive mistakes when businesses mistake fluent output for operational intelligence.
The fix is not to worship one model or rage at another. The fix is cleaner source material, better routing, stronger evals, expert review, and a culture that values correct work more than impressive language.
Because if the internet keeps rewarding noise, AI will keep learning noise.
And it will still say it beautifully.