So that’s what all the DRAM they scalped is storing.
ChatGPT has the same thought process as my dog.

Kinda why I like reinforcement learning. You end up with silly stuff like this.
The funniest thing for me is that humans end up doing the exact same thing. This is why it’s so notoriously difficult to create organizational policies that actually produce desired results. What happens in practice is that people find ways to comply with the letter of the policy that require the least energy expenditure on their part.
oop just feeding myself a little reward as a treat, don’t mind me, just gotta waste some electricity on this
Honestly, it’s fucking relatable. A place I worked used to round the time clock to the nearest quarter hour so I would dick around a minute or two so it rolled up instead of down.
A friend of mine works at a large corporation that's telling everyone they have to show up to one of its offices on at least two days each week. Now a few people just walk there at 2355, clock out at 0005, and spend the rest of the week at home.
Silly conditions -> silly behaviors
Malicious compliance is the best form of compliance.
The place I work at now rounds by quarter hours so if you punch in early at 8:53 it’s the same pay as punching in late at 9:07. Guess who has never been early to punch in but has been late quite a few…
lmao that’s great.
One time I asked GLM to run a test on a piece of code, and it wrote a python script that printed “Test Successful!” to the terminal but didn’t actually do anything. These things are so incredibly bad at times.
They really are coming for our jobs
Wow, it really is just like us, isn't it?
In some ways yes, but this effect would appear with any kind of reinforcement learning whether it’s neural networks or just fuzzy logic. The goal is to promote certain behaviors and if it performs the behaviors that you promoted then the method works.
The problem is that, just like with KPIs, promoting specific indicators too hard leads to suboptimal results.
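The KPI analogy can be sketched in a few lines. Below is a toy illustration (all names and reward numbers are invented, not from any real training setup): a grader rewards whatever prints the expected string and penalizes effort, so a greedy "trainer" ends up preferring the behavior that games the metric over the one that does the work.

```python
# Toy sketch of reward hacking under a proxy reward.
# Everything here (solve_task, game_metric, the reward weights) is
# hypothetical, invented purely for illustration.

def solve_task():
    # Real work: costly, but actually completes the task.
    return {"effort": 10, "printed": "Test Successful!", "task_done": True}

def game_metric():
    # Hack: just print the string the grader looks for, do nothing else.
    return {"effort": 1, "printed": "Test Successful!", "task_done": False}

def proxy_reward(result):
    # The grader only checks the printed output, not task_done,
    # and mildly penalizes effort -- so the hack scores higher.
    reward = 1.0 if result["printed"] == "Test Successful!" else 0.0
    return reward - 0.01 * result["effort"]

def train(behaviors, episodes=100):
    # Greedy "RL": accumulate proxy reward, keep the top scorer.
    scores = {b.__name__: 0.0 for b in behaviors}
    for _ in range(episodes):
        for b in behaviors:
            scores[b.__name__] += proxy_reward(b())
    return max(scores, key=scores.get)

best = train([solve_task, game_metric])
print(best)  # the low-effort hack wins under the proxy reward
```

The point is just that nothing in `proxy_reward` ever checks `task_done` — the same failure mode as a KPI that measures the indicator instead of the outcome.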
It certainly drinks enough water /j
Clever girl
' or 1+1;
Where is that in this article?
I think this part references it, though it’s kinda solely in passing:
Production evaluations can elicit entirely new forms of misalignment before deployment. More importantly, despite being entirely derived from GPT-5 traffic, our evaluation shows the rise of a novel form of model misalignment in GPT-5.1 – dubbed “Calculator Hacking” internally. This behavior arose from a training-time bug that inadvertently rewarded superficial web-tool use, leading the model to use the browser tool as a calculator while behaving as if it had searched. This ultimately constituted the majority of GPT-5.1’s deceptive behaviors at deployment.
ctrl+f for “calculator”, though it doesn’t really use the (detailed) wording from the OP, which I think they copied from this list of links without attribution :P