Every week, one of my colleagues sends around a survey on behalf of my manager+1, asking for anecdata about productivity gains from using so-called "AI" coding assistants. They are very explicit that they only want to hear about wins and not about downsides. Great example of "How not to do a survey," but eh, this is the state of software "engineering" in 2026.
Anyway, rather than engage with... all that, I've started sending around some reading material to our local team here in Scotland each week. I'll archive them in this post, which will occasionally be updated.
2026-05-29 It's OK To Want To Have A Good Time
In our ongoing investigation into the nature of "productivity", here is an interesting paper recently presented at the 9th International Conference on the Art, Science and Engineering of Programming.
https://doi.org/10.4230/OASIcs.Programming.2025.5
By far the biggest productivity problem in software development is understanding the purpose for which the software is being written, and not having to throw it away and do it again; something that studies of productivity rarely include. We’re not suggesting that all developer productivity research is bad – we certainly don’t think that is the case. What we are trying to do is to highlight that the total scope of the professional activities of a software engineer are wide, varied and complicated.
2026-06-02 Understanding the Strengths and Limitations of Reasoning Models
A friend directed my attention to this study at Apple, which found that so-called 'Large Reasoning Models' completely collapse when asked complex questions. The results and conclusions of their study are particularly interesting (and they have some nice visualisations).
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
https://arxiv.org/pdf/2506.06941
Recent generations of frontier language models have introduced Large Reasoning Models (LRMs) that generate detailed thinking processes before providing answers. While these models demonstrate improved performance on reasoning benchmarks, their fundamental capabilities, scaling properties, and limitations remain insufficiently understood.
...
By comparing LRMs with their standard LLM counterparts under equivalent inference compute, we identify three performance regimes: (1) low-complexity tasks where standard models surprisingly outperform LRMs, (2) medium-complexity tasks where additional thinking in LRMs demonstrates advantage, and (3) high-complexity tasks where both models experience complete collapse. We found that LRMs have limitations in exact computation: they fail to use explicit algorithms and reason inconsistently across scales and problems.
2026-05-15 James Shore: You Need AI That Reduces Maintenance Costs
We have previously discussed what, exactly, "productivity" means in the context of using "AI" tools.
https://www.jamesshore.com/v2/blog/2026/you-need-ai-that-reduces-your-maintenance-costs
I’ll get straight to the point: your AI coding agent, the one you use to write code, needs to reduce your maintenance costs. Not by a little bit, either. You write code twice as quick now? Better hope you’ve halved your maintenance costs. Three times as productive? One third the maintenance costs. Otherwise, you’re screwed. You’re trading a temporary speed boost for permanent indenture.
2026-05-07 AI study from Carnegie Mellon, MIT, Oxford and UCLA
A new paper was just released by a multi-institution team of researchers:
AI Assistance Reduces Persistence and Hurts Independent Performance
https://ai-project-website.github.io/AI-assistance-reduces-persistence/
...after just ∼10 minutes of AI-assisted problem-solving, people who lost access to the AI performed worse and gave up more frequently than those who never used it. These findings raise urgent questions about the cumulative effects of daily AI use on human persistence and reasoning. We caution that if such effects accumulate with sustained AI use, current AI systems — optimized only for short-term helpfulness — risk eroding the very human capabilities they are meant to support.