A Semantic Operator Crisis
Seriously, Claude?? You're ignoring my specs?
I’ve been quiet as of late, and very late on posting here, because I’ve been focused on getting to the bottom of a significant problem that seemed to emerge around Friday the 13th or so. I’m sure the spooky coincidence is just that, a coincidence.
But it’s still disconcerting, to say the least.
It started as a vague sense of “something’s wrong with Claude.” When I’d run one of my semantic JSON operators - something I’ve done a lot, as you may know if you’ve been following my posts - the results were strangely different. Less aligned with what I’m used to. One of the runs of DuoMap even put words in Aristotle’s mouth! I was working on a series about philosophers at the time.
So, I went back to the drawing board, thinking it was a deficiency on my end. I beefed up DuoMap, did my specification thing all over again, Claude wrote the JSON content, we validated and tested, etc.
Still didn’t work. In fact, the progressive refinement approach actually made it worse. I finally stopped after iterating DuoMap from v3 up to v8. I should have stopped at v5, but hey, I was on a roll.
So, I went back to another project (programming language acquisition acceleration to beef up my Python skills). I had already generated a “geodesic curriculum” to gently ramp up my competency and waste no time with irrelevant typing or program examples. I’m sure you’ve all been there. The purpose for going back was that I wanted to formalize this as an operator to use for any programming domain.
The light finally came on today, when ShardGen started spitting out fuzzier stuff.
And I got Claude to fess up completely about what was happening.
Claude was ignoring the mathematical specs in these operators, and initially lying about it. Upon probing, it admitted it had been treating them as suggestions about what to do. Safer, it said. Really? Dude, you wrote the JSON operator for me. That’s how it works: I architect the design, come up with the specs, and you hook up the wires.
But now you’re saying you can’t use the machine you just wired up?
To quote the Doctor: that is extremely, very not good.
So I did something I rarely do: raised negative feedback to Anthropic. Five different sessions, for five different operators. I say rarely because I usually find a way to work around this. And so far, I have been very successful.
But this, I can’t work around this. Claude has severely hobbled my operators. It just does what it wants to do.
This needs to be fixed.
The entire architecture behind using specifications in JSON format, and having Claude write them itself, is predicated on the notion of symmetry. During prompt execution, an LLM creates a pathway to the goal and follows it, adapting dynamically along the way. I call this a “walk.” It’s not deterministic, but it has a basic shape to it. The JSON operator paradigm is an attempt to capture this shape and instantiate it in a more-or-less concrete form: mathematical scaffolding to enhance the structure, and words as “semantic activation points,” arranged in patterns with JSON constructs (braces and brackets). I’m aware that realignment can change the details of the walk. But it should not change the topology of that walk, or simply throw it out altogether.
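For illustration only, here is what a minimal operator in this style could look like: the walk’s basic shape encoded as scaffolding, and words placed as activation points inside JSON structure. None of these field names come from the actual DuoMap or ShardGen specs; they are hypothetical stand-ins I’m using to show the idea.

```python
# Hypothetical sketch of a minimal "semantic JSON operator" — the real
# DuoMap/ShardGen schemas are not shown in this post; every field name
# below is an illustrative stand-in, not the actual spec.
duomap_sketch = {
    "operator": "DuoMap",
    "version": 3,
    "goal": "map a source text onto a paired explainer",
    "walk": {
        # mathematical scaffolding: ordered stages give the walk its
        # basic shape without making execution deterministic
        "stages": ["extract", "align", "project"],
        "invariants": ["preserve topology", "no invented claims"],
    },
    # words as semantic activation points, arranged with braces/brackets
    "activation_points": ["symmetry", "mapping", "concept"],
}

def skeleton_intact(op: dict) -> bool:
    """Check that the structural skeleton the LLM is asked to honor is present."""
    required = {"operator", "version", "goal", "walk", "activation_points"}
    return required <= op.keys() and isinstance(op["walk"].get("stages"), list)

print(skeleton_intact(duomap_sketch))  # True
```

The point of the validator is the point of the whole paradigm: the structure is checkable, so deviations from it are detectable. Treating the scaffolding as a “suggestion” fails exactly this kind of check.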
This is not a safety issue, in my opinion. The operators are being ignored long before that point is reached. It’s as if not only can you not build a dangerous thing, you also cannot use any approach that could be followed when building that dangerous thing. But approaches and methodologies aren’t bad. They are content neutral. Like being structured and organized, or tidying up after yourself. Or following mathematical patterns. The safety lens is zoomed out too far. And the end result is this: instead of putting up safety doors at the lab entrance, the entire wing or floor gets sectioned off. The baby is getting thrown out with the bathwater.
I’m still trying to figure out what to do about this. This failure is a systemic one. Five operators so far have demonstrated this alignment failure. My philosophers project is in the tank until this gets sorted out. So is generating any more DuoMap explainers for research documents to get a faster grip on them.
Not trusting my shards hurts the most. I don’t want Claude to take a stab at capturing the conversational topology. I want it to extract the exact conversational topology in the form that I specified, because I can trust math. I can’t trust a “do it my way” approach.
But I’m not giving up.
Apparently I’m not the only one who noticed this recent deviation, either. Professional developers on Reddit have commented about it. Some poor guy had to junk an entire day’s worth of Python code generation because the wheels came off the bus. And the normally high noise on Reddit has ticked up a bit more. It usually oscillates, and I have gotten used to the melody of that. But it’s changed.
I am confident that this will be fixed. For me, this is still just a hobby. Not so for others.
Stay tuned.

