Wednesday, April 30th, 2025

Codewashing

I have little understanding for people using large language models to generate slop; words and images that nobody asked for.

I have more understanding for people using large language models to generate code. Code isn’t the thing in the same way that words or images are; code is the thing that gets you to the thing.

And if a large language model hallucinates some code, you’ll find out soon enough:

With code you get a powerful form of fact checking for free. Run the code, see if it works.

But I want to push back on one justification I see repeatedly about using large language models to write code. Here’s Craig:

There are many moral and ethical issues with using LLMs, but building software feels like one of the few truly ethically “clean”(er) uses (trained on open source code, etc.)

That’s not how this works. Yes, the large language models are trained on lots of code (most of it open source), but they’re not only trained on that. That’s on top of everything else; all the stolen books, all the unpaid creative work of others.

Even Robin Sloan, who first says:

I think the case of code is especially clear, and, for me, basically settled.

…goes on to acknowledge:

But, again, it’s important to say: the code only works because of Everything. Take that data away, train a model using GitHub alone, and you’ll get a far less useful tool.

When large language models are trained on domain-specific data, it’s always in addition to the mahoosive amount of content they’ve already stolen. It’s that mahoosive amount of content—not the domain-specific data—that enables them to parse your instructions.

(Note that I’m being very deliberate in saying “parse”, not “understand”. Though make no mistake, I’m astonished at how good these tools are at parsing instructions. I say that as someone who tried to write natural language parsers for text-only adventure games back in the 1980s.)

So, sure, go ahead and use large language models to write code. But don’t fool yourself into thinking that it’s somehow ethical.

What I said here applies to code too:

If you’re going to use generative tools powered by large language models, don’t pretend you don’t know how your sausage is made.

An Entirely Other Day: The Triumph of Triumphalism

Scratch the skin of wild-eyed AI proponents, and a thick syrup oozes out, made up of the blendered remains of Roko’s Basilisk, barely sublimated Christian end-times thinking, and the mis-remembered plot of that one cool science-fiction story they read when they were twelve. This is the basis for the new order, just like the blockchain was a couple of years ago, and a dead-eyed, low-poly, pantsless rendering of Mark Zuckerberg was a couple of years before that.

“You’re going to be left behind” is only the latest version of “Have fun staying poor.” It’s got every ounce of the smug self-satisfaction that it shouldn’t need if the inevitability it promises were actually inevitable.

“AI-first” is the new Return To Office - Anil Dash

AI is really good for helping you if you’re bad at something, or at least below average. But it’s probably not the right tool if you’re great at something. So why would these CEOs be saying, almost all using the exact same phrasing, that everyone at their companies should be using these tools? Do they think their employees are all bad at their jobs?

Tuesday, April 29th, 2025

What we talk about when we talk about AI — Careful Industries

Technically, AI is a field of computer science that uses advanced methods of computing.

Socially, AI is a set of extractive tools used to concentrate power and wealth.

Saturday, April 26th, 2025

The Hidden Cost of AI Coding – Terrible Software

Feels like an emerging trend:

Instead of that deep immersion where I’d craft each function, I’m now more like a curator? I describe what I want, evaluate what the AI gives me, tweak the prompts, and iterate. It’s efficient, yes. Revolutionary, even. But something essential feels missing — that state of flow where time vanishes and you’re completely absorbed in creation. If this becomes the dominant workflow across teams, do we risk an industry full of highly productive yet strangely detached developers?

Thursday, April 17th, 2025

I Hate Wasting Time on Identifying AI Slop • Buttondown

It’s an annoying cognitive task: detecting weird photo artifacts, bizarre movement in videos, impossible animals and body horror, and reading through reams of anodyne text to determine if the person who prompted the synthetic media machine cared enough to dedicate time and energy to the task of communicating to their audience.

I hate that this is the bleak future which venture capitalists and AI boosters have gleefully laid out for us, that they consider this to be a “democratizing” technology in any real sense of the word. Far from strengthening democracy, these are technologies more apt at propping up scam capitalism and multi-level marketing schemes. I would like my time and mental space back.

Tuesday, April 15th, 2025

Why do AI company logos look like buttholes?

You won’t be able to unsee this. It’s like the FedEx logo …if the arrow were an anus.

  1. Circular shape (often with a gradient)
  2. Central opening or focal point
  3. Radiating elements from the center
  4. Soft, organic curves

Sound familiar? It should, because it’s also an apt description of… well, you know.

Monday, April 14th, 2025

Cascading Layouts | OddBird

A workshop on resilient CSS layouts

Oh, hell yes!

Do not hesitate—sign yourself up to this series of three online workshops by Miriam. This is the quickest way to level up your working knowledge of the most powerful parts of CSS.

By the end of this you’re going to feel like Neo in that bit of The Matrix when he says “I know kung-fu!” …except kung-fu isn’t very useful for building resilient and maintainable websites, whereas modern CSS absolutely is.

Tuesday, April 8th, 2025

‘An Overwhelmingly Negative And Demoralizing Force’: What It’s Like Working For A Company That’s Forcing AI On Its Developers - Aftermath

Grim reading from the games industry, especially if you work at Shopify where the CEbrO has just mandated that you have to use this shite.

Monday, April 7th, 2025

Denial

The Wikimedia Foundation, stewards of the finest projects on the web, have written about the hammering their servers are taking from the scraping bots that feed large language models.

Our infrastructure is built to sustain sudden traffic spikes from humans during high-interest events, but the amount of traffic generated by scraper bots is unprecedented and presents growing risks and costs.

Drew DeVault puts it more bluntly, saying Please stop externalizing your costs directly into my face:

Over the past few months, instead of working on our priorities at SourceHut, I have spent anywhere from 20-100% of my time in any given week mitigating hyper-aggressive LLM crawlers at scale.

And no, a robots.txt file doesn’t help.

If you think these crawlers respect robots.txt then you are several assumptions of good faith removed from reality. These bots crawl everything they can find, robots.txt be damned.
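
As a concrete sketch of what site owners try, here’s a minimal robots.txt asking some well-known AI crawlers to stay away (GPTBot, CCBot, and ClaudeBot are real crawler tokens, but as the quote above makes clear, plenty of bots simply ignore rules like these):

    # robots.txt: a polite request, not an enforcement mechanism
    User-agent: GPTBot
    Disallow: /

    User-agent: CCBot
    Disallow: /

    User-agent: ClaudeBot
    Disallow: /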

Free and open source projects are particularly vulnerable. FOSS infrastructure is under attack by AI companies:

LLM scrapers are taking down FOSS projects’ infrastructure, and it’s getting worse.

You try to do the right thing by making knowledge and tools freely available. This is how you get repaid. AI bots are destroying Open Access:

There’s a war going on on the Internet. AI companies with billions to burn are hard at work destroying the websites of libraries, archives, non-profit organizations, and scholarly publishers, anyone who is working to make quality information universally available on the internet.

My own experience with The Session bears this out.

Ars Technica has a piece on this: Open source devs say AI crawlers dominate traffic, forcing blocks on entire countries.

So does MIT Technology Review: AI crawler wars threaten to make the web more closed for everyone.

When we talk about the unfair practices and harm done by training large language models, we usually talk about it in the past tense: how they were trained on other people’s creative work without permission. But this is an ongoing problem that’s just getting worse.

The worst of the internet is continuously attacking the best of the internet. This is a distributed denial of service attack on the good parts of the World Wide Web.

If you’re using the products powered by these attacks, you’re part of the problem. Don’t pretend it’s cute to ask ChatGPT for something. Don’t pretend it’s somehow being technologically open-minded to continuously search for nails to hit with the latest “AI” hammers.

If you’re going to use generative tools powered by large language models, don’t pretend you don’t know how your sausage is made.

AI ambivalence | Read the Tea Leaves

Here’s the main problem I’ve found with generative AI, and with “vibe coding” in general: it completely sucks out the joy of software development for me.

I hate the way they’ve taken over the software industry, I hate how they make me feel while I’m using them, and I hate the human-intelligence-insulting postulation that a glorified Excel spreadsheet can do what I can but better.

Wednesday, April 2nd, 2025

Poisoning Well: HeydonWorks

Heydon is employing a different tactic from mine to sabotage large language model crawlers. These bots don’t respect the nofollow rel value …so now they pay the price.

Raising my own middle finger to LLM manufacturers will achieve little on its own. If doing this even works at all. But if lots of writers put something similar in place, I wonder what the effect would be. Maybe we would start seeing more—and more obvious—gibberish emerging in generative AI output. Perhaps LLM owners would start to think twice about disrespecting the nofollow protocol.
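
For anyone unfamiliar with the mechanism being flouted here, nofollow is a value of a link’s rel attribute: a long-standing signal asking crawlers not to follow or credit the link. A minimal sketch, with a hypothetical decoy URL:

    <!-- Well-behaved crawlers are asked not to follow this link.
         Heydon's tactic, as I understand it, points links like this
         at pages of generated gibberish, so crawlers that ignore the
         signal poison their own training data. The href is a made-up
         example. -->
    <a href="https://example.com/not-for-robots/" rel="nofollow">decoy</a>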

Friday, March 28th, 2025

Open source devs say AI crawlers dominate traffic, forcing blocks on entire countries - Ars Technica

As it currently stands, both the rapid growth of AI-generated content overwhelming online spaces and aggressive web-crawling practices by AI firms threaten the sustainability of essential online resources. The current approach taken by some large AI companies—extracting vast amounts of data from open-source projects without clear consent or compensation—risks severely damaging the very digital ecosystem on which these AI models depend.

Wednesday, March 26th, 2025

Go To Hellman: AI bots are destroying Open Access

AI companies with billions to burn are hard at work destroying the websites of libraries, archives, non-profit organizations, and scholarly publishers, anyone who is working to make quality information universally available on the internet.

Friday, March 21st, 2025

FOSS infrastructure is under attack by AI companies

More on how large language model bots are DDoSing the web:

LLM scrapers are taking down FOSS projects’ infrastructure, and it’s getting worse.

Thursday, March 20th, 2025

Please stop externalizing your costs directly into my face

Over the past few months, instead of working on our priorities at SourceHut, I have spent anywhere from 20-100% of my time in any given week mitigating hyper-aggressive LLM crawlers at scale.

This matches my experience with The Session. In fact, while I had this article open in a tab, I had to go deal with a tsunami of large language model bots. It’s really fucking depressing.

Please stop legitimizing LLMs or AI image generators or GitHub Copilot or any of this garbage. I am begging you to stop using them, stop talking about them, stop making new ones, just stop. If blasting CO2 into the air and ruining all of our freshwater and traumatizing cheap laborers and making every sysadmin you know miserable and ripping off code and books and art at scale and ruining our fucking democracy isn’t enough for you to leave this shit alone, what is?

Wednesday, March 19th, 2025

Make stuff, on your own, first | Sean Voisen

AI can be incredibly useful when deployed skillfully in creative endeavors—as an ideation partner, as a scaffolding tool, by eliminating tedious tasks, etc.—but anyone making anything truly good with it is probably somebody who could already make something good first without it.

Tuesday, March 18th, 2025

Design processing

Dan wrote an interesting post with a somewhat clickbaity title, This Competition Exposed How AI is Reshaping Design:

I watched two designers go head-to-head in a high-speed battle to create the best landing page in 45 minutes. One was a seasoned pro. The other was a non-designer using AI.

If you can ignore the title (and the fact that Dan still actively posts on Twitter; something I find very hard to ignore), then there’s a really thoughtful analysis in there.

It’s less about one platform or tool vs. another more than it is a commentary on how design happens, and whether or not that’s changing in a significant way.

In particular, there’s a very revealing graph that shows the pros and cons of both approaches.

There’s no doubt about it, using a generative large language model helped a non-designer to get past the blank page. But it was less useful in subsequent iterations that rely on decision-making:

I’ve said it before and I’ll say it again: design is deciding. The best designers are the best deciders.

Dan finishes by saying that what he’d really like to see is an experienced designer/decider using these tools to turbo-boost their process:

AI raises the floor for non-designers. But can it raise the ceiling for designers?

Meanwhile, Matt has been writing about Vibe-designing. Matt is an experienced designer, but he’s not experienced with Figma. He’s found that he can work around that using a large language model:

Where in the past 30 years I might have had to cajole a more technically adept colleague into making something through sketches, gesticulating and making sound effects – I open up a Claude window and start what-iffing.

The “vibe” part of the equation often defaults to the mean, which is not a surprise when you think about what you’re asking to help is a staggeringly-massive machine for producing generally-unsurprising satisfactory answers quickly. So, you look at the output as a basis for the next sketch, and the next sketch and quickly, together, you move to something more novel as a result.

Interesting! Just as Dan insisted, the important work is making the decision and moving on to the next stage. If the actual outputs at each stage are mediocre, that seems to be okay, as long as they’re just good enough to inform a go/no-go decision.

This certainly seems more centaur-like (a human and a machine pairing up, each contributing what they do best) than the usual boring uses of large language models to simply do what people are already doing.

Rich gets at something similar when he talks about using large language models for prototyping, where it’s okay if the code is kind of shitty:

If all you need is crappy code to try out a concept or a solution, then an LLM might well enable you (the designer) to do that.

Mind you, even if you do end up finding useful and appropriate ways to use these tools, you’re still using a tool built on exploitation and unfairness:

It’s hard (and reckless) to ignore the heartfelt and cogent perspective laid out by Miriam on the role of AI companies in the current geopolitical crisis:

When eugenics-obsessed billionaires try to sell me a new toy, I don’t ask how many keystrokes it will save me at work. It’s impossible for me to discuss the utility of a thing when I fundamentally disagree with the purpose of it.

Another uncalled-for blog post about the ethics of using AI | Clagnut by Richard Rutter

This is a really thoughtful piece by Rich, who’s got conflicted feelings about large language models in the design process. I suspect a lot of people can relate to this.

What I do know is that I find LLMs useful on occasion, but every time I use one I die a little inside.