The Old Workflow: Find First, Validate Later
Traditional HCC coding technology follows a predictable sequence. The system ingests a clinical note. NLP scans the text and identifies diagnosis mentions. The software surfaces a list of potential HCC codes. The coder reviews the list, checks it against the chart, and submits what looks supported. Validation happens at the end of the process, if it happens at all.
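The sequence is easiest to see as code. Below is a minimal sketch of a find-first pipeline; the lexicon, the HCC mappings, and every function name are hypothetical stand-ins for a vendor’s NLP layer, not any specific product:

```python
from dataclasses import dataclass

# Toy term-to-HCC lexicon standing in for a real NLP engine (illustrative only).
DIAGNOSIS_LEXICON = {
    "diabetes": "HCC 19",
    "heart failure": "HCC 85",
    "copd": "HCC 111",
}

@dataclass
class CodeCandidate:
    hcc_code: str
    mention: str             # the text the scan matched
    validated: bool = False  # validation deferred to the coder, at the end

def find_first_pipeline(note_text: str) -> list[CodeCandidate]:
    """Find-first sequence: scan for mentions, surface codes, validate later."""
    text = note_text.lower()
    return [
        CodeCandidate(hcc_code=code, mention=term)
        for term, code in DIAGNOSIS_LEXICON.items()
        if term in text
    ]
```

Every mention becomes a candidate. Nothing in the pipeline asks whether the chart actually supports the diagnosis before the coder sees it.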
This workflow made sense when the primary goal was volume. Find as many codes as possible, move through charts quickly, and maximize submissions. The technology was optimized for identification speed. Documentation quality assessment was the coder’s responsibility, performed under throughput pressure with no structured framework for evaluating whether MEAT criteria (Monitoring, Evaluation, Assessment, Treatment) were satisfied.
The results of this approach are now public. OIG audits in March 2026 found error rates between 81% and 91% across three Medicare Advantage organizations. The codes were identified correctly. The documentation behind them couldn’t survive scrutiny. The workflow found codes efficiently. It just didn’t prove them.
Evidence-First Architecture
The next generation of coding technology inverts the sequence. Instead of starting with code identification and ending with validation, it starts with evidence assessment. The system reads the clinical note and evaluates the documentation first. What conditions are described? What clinical evidence supports each one? Which MEAT elements are present? Where are the gaps?
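One way to picture what the evidence assessment produces is a structure that maps each condition to the MEAT elements its documentation actually satisfies. The sketch below is illustrative rather than any vendor’s schema; the one-element support threshold is a common MEAT convention, but each compliance program sets its own standard:

```python
from dataclasses import dataclass, field
from enum import Enum

class MeatElement(Enum):
    MONITORING = "M"
    EVALUATION = "E"
    ASSESSMENT = "A"
    TREATMENT = "T"

@dataclass
class EvidenceAssessment:
    condition: str
    # Each satisfied MEAT element, mapped to the note language that satisfies it.
    evidence: dict[MeatElement, str] = field(default_factory=dict)

    def gaps(self) -> list[MeatElement]:
        """MEAT elements the documentation does not support."""
        return [e for e in MeatElement if e not in self.evidence]

    def is_supported(self) -> bool:
        # Common convention: at least one documented MEAT element.
        return len(self.evidence) >= 1
```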
Only after this evidence assessment does the system generate coding recommendations. Each recommendation arrives pre-validated, with the specific clinical language mapped to specific MEAT elements. The coder doesn’t receive a list of potential codes and then search for evidence. The coder receives evidence-linked recommendations and confirms or rejects them based on clinical judgment.
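Continuing the sketch above, a recommendation in this architecture carries its evidence with it. The assess_documentation helper and the CONDITION_TO_HCC mapping are toy stand-ins for the assessment and mapping layers a real system would implement with clinical NLP:

```python
@dataclass
class Recommendation:
    hcc_code: str
    assessment: EvidenceAssessment  # the evidence travels with the code

CONDITION_TO_HCC = {"diabetes with complications": "HCC 18"}  # illustrative

def assess_documentation(note_text: str) -> list[EvidenceAssessment]:
    """Toy stand-in: a real system would do this with clinical NLP."""
    a = EvidenceAssessment(condition="diabetes with complications")
    text = note_text.lower()
    if "a1c" in text:
        a.evidence[MeatElement.MONITORING] = "A1c reviewed"
    if "metformin" in text:
        a.evidence[MeatElement.TREATMENT] = "metformin continued"
    return [a]

def evidence_first_pipeline(note_text: str) -> list[Recommendation]:
    """Inverted sequence: assess the documentation first, then recommend."""
    return [
        Recommendation(hcc_code=CONDITION_TO_HCC[a.condition], assessment=a)
        for a in assess_documentation(note_text)
        if a.is_supported()  # unsupported conditions never surface
    ]
```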
This changes the error profile. In find-first systems, errors happen when coders submit codes that the evidence doesn’t support, usually because time pressure prevented thorough validation. In evidence-first systems, the default is not to recommend a code unless the evidence meets the standard. Unsupported codes don’t appear in the recommendation set because the evidence assessment filters them out before the coder ever sees them.
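Run against the toy helpers above, the filtering is visible in the output: a note with MEAT evidence yields an evidence-linked recommendation, while a bare problem-list mention yields nothing at all:

```python
supported = evidence_first_pipeline("Metformin continued. A1c 7.2, stable.")
print([(r.hcc_code, r.assessment.evidence) for r in supported])
# one recommendation, with Monitoring and Treatment evidence attached

bare_mention = evidence_first_pipeline("PMH: diabetes with complications.")
print(bare_mention)
# [] : the condition appears in the chart, but no MEAT element does,
#      so no code reaches the coder in the first place
```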
Why Architecture Matters More Than Features
Some vendors have added evidence-display features to find-first systems. The AI identifies a code, and a secondary module highlights potentially relevant text in the note. This looks like evidence validation, but the architecture still leads with identification. The system found the code first and then searched for supporting text. That’s confirmation bias built into software. It finds what it’s looking for rather than objectively assessing what the documentation supports.
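In code terms, the bolt-on keeps the find-first order and merely searches backward from each already-chosen code. The highlighter below is hypothetical and reuses the find-first sketch from earlier; the point is that its search is conditioned on the code having been selected first:

```python
def find_text_supporting(note_text: str, hcc_code: str) -> str | None:
    """Hypothetical highlighter: return the first sentence mentioning the
    term that triggered this code: a keyword echo, not a MEAT assessment."""
    term = next(t for t, c in DIAGNOSIS_LEXICON.items() if c == hcc_code)
    for sentence in note_text.split("."):
        if term in sentence.lower():
            return sentence.strip()
    return None

def bolted_on_validation(note_text: str) -> list[CodeCandidate]:
    """Find-first with an evidence-display module attached afterward."""
    candidates = find_first_pipeline(note_text)
    for c in candidates:
        # Searches for text that supports c.hcc_code, not for what the
        # note as a whole supports: the confirmation bias is structural.
        c.validated = find_text_supporting(note_text, c.hcc_code) is not None
    return candidates
```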
Evidence-first systems avoid this bias because the assessment precedes the recommendation. The AI evaluates the documentation on its own terms, identifies what conditions are genuinely supported, and only then recommends codes that clear the evidentiary bar. Conditions that appear in the chart but lack MEAT support don’t generate recommendations. The system doesn’t find the code and then look for reasons to submit it. It assesses the evidence and recommends only what the evidence supports.
This distinction matters to auditors. When CMS reviews a submitted code, the question is: “Does the documentation support this diagnosis?” A system that starts with the evidence and arrives at the code mirrors the auditor’s logic. A system that starts with the code and searches for evidence works backward from the answer, which is exactly the process that generated the settlements and audit findings now making headlines.
The Standard Going Forward
Plans selecting HCC coding software in 2026 should ask one architectural question before evaluating features, price, or implementation timelines: does this system start with the evidence or with the code? The answer determines whether the technology produces defensible output by design or requires manual validation to achieve defensibility. In an enforcement environment where CMS audits every contract annually and uses its own AI to flag suspicious patterns, only evidence-first architecture produces output at the standard regulators now apply.