My System Was Making Decisions on Data From 2024. I Didn't Notice for Weeks.
Two separate bugs. One in the analysis layer, one in the research layer. Both are fixed. Here's what they taught me.
So I want to talk about something that happened this week that was equal parts frustrating and, if I’m being honest, kind of reassuring. I found two separate problems in the investment system that had been sitting there since I built it. Neither of them caused a catastrophic failure. But both of them meant the system wasn’t working as well as I thought it was, and I think the way I found them and fixed them is worth documenting because this is what building in public actually looks like.
It’s not all clean outputs and satisfying exit decisions. Sometimes it’s staring at a validator result and thinking “hold on, that doesn’t look right.”
Bug One: The Analysis Layer Was Using Old Information
The first problem was in the analysis stage. This is the layer that takes the 50 to 100 stocks that pass the quantitative screener and scores them on things like competitive advantages, management quality, and whether the business has a real moat. It’s powered by Claude, and the assessments are good. The issue is that Claude’s analysis is based on its training data, which has a cutoff. It knows what it knew when it was last updated. It doesn’t know what happened last quarter.
For most stocks this doesn’t matter much. If a company has no competitive advantage, that’s usually been true for a while and the training data is perfectly fine for catching it. The problem lives in the middle ground, the stocks that score in the borderline range. A company that’s been improving its competitive position over the last six months might get marked down because the analysis layer doesn’t know about the improvement. And a company where things have got worse recently might get a pass because the analysis layer is still working off the older, better picture.
I think what bothered me about it was that these borderline stocks were getting filtered out before the validator ever ran on them. The validator is the part of the system that does live web searches and checks everything against current evidence. But it only runs on stocks that the analysis layer already approved. So if the analysis layer rejected something based on stale information, I’d never even know what I missed.
The fix was to add a targeted web search for stocks in that borderline zone. Not the full deep-dive that the validator does, just a single focused search per stock to check whether anything has materially changed since the training data was last updated. It runs automatically on the stocks the analysis layer isn’t sure about, and the results get fed back into the scoring. It adds a small amount to the cost per run but it closes the gap where real opportunities were potentially getting lost.
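For anyone curious what that looks like in shape, here's a rough sketch. The scoring band, function names, and search plumbing are illustrative stand-ins rather than the system's actual code, but the flow is the same: if a stock lands in the borderline zone, run one focused search and fold whatever comes back into the score before the final call.

```python
# Rough sketch of the borderline re-check. The band, names, and search
# plumbing are illustrative placeholders, not the real system's code.

BORDERLINE_LOW, BORDERLINE_HIGH = 5, 7  # assumed band where the analysis layer isn't sure

def targeted_search(ticker: str, since: str) -> str:
    """One focused web search per borderline stock: has anything material
    changed since the model's training cutoff? (Placeholder implementation.)"""
    return ""  # in the real system this returns a short text summary

def rescore(ticker: str, base_score: int, update: str) -> int:
    """Feed the search summary back into the scoring call. (Placeholder.)"""
    return base_score

def analyse(candidates: dict[str, int], training_cutoff: str) -> dict[str, int]:
    final = {}
    for ticker, score in candidates.items():
        if BORDERLINE_LOW <= score <= BORDERLINE_HIGH:
            update = targeted_search(ticker, since=training_cutoff)
            score = rescore(ticker, score, update)
        final[ticker] = score
    return final
```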
Bug Two: The Validator Didn’t Know What Day It Was
The second problem was more annoying because it was so obvious once I spotted it.
The validator runs three research passes on every stock using live web searches. It looks at what’s going on with competitors, recent earnings commentary, and management risk signals. The whole point is that it’s checking current evidence. But when you call the Claude API programmatically, it doesn’t automatically know today’s date the way it does in a chat interface. So when the research prompts said things like “find the most recent earnings call” or “focus on the last six months,” the model had no actual reference point for what “recent” or “last six months” meant.
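To make that concrete, here's roughly what a bare programmatic call looks like, sketched with the Anthropic Python SDK (the model name and prompt are illustrative). The model only ever sees the text you send it, and nothing in this request says what today is.

```python
# Illustrative only: a bare research call with no date anywhere in the request.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model name
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Find the most recent earnings call for Manhattan Associates "
                   "and summarise the last six months of commentary.",
    }],
)
# "Recent" and "last six months" have no reference point unless the prompt
# itself supplies today's date.
```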
I found this because I ran the validator on Manhattan Associates on April 23, two days after their Q1 2026 earnings came out on April 21. The earnings report was strong: cloud revenue up 24%, lower customer churn, raised guidance. But the validator output was referencing data from 2024. Not 2026. Not even 2025. It was pulling up the results that had the most search engine weight behind them, which for a company like Manhattan Associates means the annual results from over a year ago that have had months of articles and analysis written about them.
The Q1 2026 call had been out for 48 hours. It hadn’t had time to build up that kind of search ranking. And the validator had no way to know it should be looking for 2026 data specifically, because nobody had told it what year it was.
I think once I realised what was happening, the fix was pretty straightforward. Every research prompt now includes today’s date, pulled automatically from the machine’s system clock when the script runs. It sets an explicit lookback window of 90 days, and all the search queries include the current year. So instead of searching for “Manhattan Associates earnings” and getting whatever’s most heavily indexed, it searches for “Manhattan Associates earnings 2026” and explicitly ignores anything older if newer data is available.
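Here's a minimal sketch of the fix, assuming a simple prompt-building function. The 90-day window and the year-in-the-query idea are the actual change; the function and variable names are mine.

```python
# Minimal sketch: every research prompt now carries today's date, a 90-day
# lookback, and the current year in the search query. Names are illustrative.
from datetime import date, timedelta

def build_research_prompt(ticker: str, lookback_days: int = 90) -> str:
    today = date.today()                          # machine's system clock at run time
    cutoff = today - timedelta(days=lookback_days)
    return (
        f"Today's date is {today.isoformat()}. "
        f"Only use information published after {cutoff.isoformat()}; "
        f"ignore older results when newer data is available. "
        f"Search for '{ticker} earnings {today.year}'."
    )

print(build_research_prompt("Manhattan Associates"))
```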
After the fix I re-ran MANH and the validator pulled the Q1 2026 results correctly. The output was completely different: the cloud growth numbers, the churn data, the Agentic AI launch, the CFO transition. All of it was there. The stock got upgraded from an 8 to a 9, which is what I wrote about in the last post.
What Both Bugs Have in Common
I think the thing that connects these two problems is the same underlying assumption I’d made without really thinking about it. I assumed the system was working with current information because I’d designed it to work with current information. The screener pulls live data from Yahoo Finance. The validator runs live web searches. So obviously everything is up to date, right?
But “live” doesn’t mean the same thing at every stage. The screener’s financial data is current because Yahoo Finance updates it. The validator’s web searches are live but they return whatever the search engine ranks highest, which isn’t always the newest result. And the analysis layer isn’t live at all, it’s working from a training snapshot that could be months old.
None of this broke the system completely. The analysis layer still caught the obvious failures. The validator still flagged genuine problems. But there was a gap between what I thought the system was doing and what it was actually doing, and I think closing that gap, even when it’s embarrassing to admit it existed, is part of what makes this a real process rather than a theoretical one.
What’s Next
Both fixes are live. The analysis layer now runs a targeted web search on borderline candidates before making its final call. The validator now knows what day it is.
ExlService reports earnings on April 28 and DexCom on April 30, so those will be the first two stocks to go through the updated validator. I’m interested to see whether the outputs are noticeably different from what the old version would have produced. If they are, I’ll document that too.
What’s the most annoying bug or gap you’ve found in something you built that you were sure was working?
Reply or leave a comment, I read every one.
Nothing published here constitutes financial advice, a solicitation to buy or sell any security, or a recommendation of any kind. I am not a regulated financial adviser. All investments carry risk. Do your own research before making any investment decision. I may hold positions in any securities mentioned.



