I've just checked the transcript for the answer to "Why is attribution optional?":
Lately, the company keeps saying "attribution is non-negotiable", and then using / endorsing systems that fundamentally cannot respect attribution. What's up with that?
… I'm still not entirely sure you get it. I don't say "fundamentally" for no reason! I'll copy the transcript here, inserting my colour commentary.
got it thank you wizz wizz for the question that's that's an important one I think the attribution is absolutely very important we consider it as is completely vital it is a non-negotiable as we engage with Partners but it's also a matter of you know us holding them accountable when they on their road map
Filler (usual for live Q&A).
and as they it does take some time for partners to incorporate that into their AI tools
Attribution isn't something that can just be "incorporated" into existing "AI tools". RAG does not count.
as we do these Partnerships with them and where they get licensed access to the data as an example so it's still in sort of an early infancy stage despite the fact we've struck multiple of these Partnerships
Filler.
they're all in the process of attributing so as an example one recent example is with Google Gemini Cloud assist that which is now attributing Stack Overflow content in the in their IDE right as you write code through that through that tool you will see the sources back to the Stack overflow Content that was was used
I haven't studied Google Gemini Cloud assist specifically, but I would be surprised if it weren't just RAG – which, as we know, isn't sufficient attribution.
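For reference, here is roughly what RAG amounts to — a toy sketch in Python, with made-up function names (none of this is Google's actual implementation). The point it illustrates: the "sources" are whatever the retriever happened to find, attached after the fact; whatever the underlying model absorbed during training is generated uncredited.

```python
# Toy sketch of RAG-style "attribution". Illustrative only: retrieve()
# and answer_with_sources() are hypothetical names, and real systems use
# embedding search rather than keyword overlap.

def retrieve(query, index):
    """Return up to three page URLs whose text overlaps the query."""
    scored = [(len(set(query.split()) & set(doc.split())), url)
              for url, doc in index.items()]
    scored.sort(reverse=True)
    return [url for score, url in scored if score > 0][:3]

def answer_with_sources(query, index, generate):
    """Generate an answer, then attach links to the *retrieved* pages.

    The citations cover only what the retriever found just now. The
    generator's training corpus — everything the model "knows" — gets
    no credit, no matter how heavily it shaped the output.
    """
    sources = retrieve(query, index)
    return generate(query, sources), sources
```

So a RAG system can truthfully say "here are links to Stack Overflow" while the attribution question for the model itself goes entirely unanswered.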
Humans are "few-shot learners" because we think about what we are shown, and develop habitual patterns of thought based on our analysis of what we are shown. We make links between it and other things, developing our own perspectives and honing our own learning skills, so even two people who learned from the same textbook can come away with radically different perspectives on a subject. There is a very real sense in which humans "own" what they have learned.
Humans can generalise their knowledge and understanding in unprecedented ways: not only in response to unprecedented situations, but developing unprecedented meaningful concepts as well! So there is a very real sense in which humans can be said to "own" their creative output.
Language model architectures such as GPT, PaLM and Llama do none of this. Their training process is fitting a statistical model to an observed dataset. Their output process is sampling from this statistical model. (Sometimes there is an additional step, refining the model to be more persuasive.) Everything the model "knows" is directly taken from somebody's work, but the creation of this model is a form of averaging that makes a notion of attribution difficult to define – in much the same way that money laundering makes a notion of "follow the money" difficult to define.
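To make the "averaging" point concrete, here's a toy bigram model — the same fit-a-model-then-sample-from-it structure as GPT-style training, scaled down absurdly (all names and corpora here are mine, purely for illustration):

```python
# Toy bigram model: fit counts to two tiny "source" corpora, then sample.
# The moment counts are merged, the model no longer records which author
# contributed which statistic — per-output attribution is undefined.
from collections import Counter, defaultdict

def fit_bigrams(corpora):
    """Fit a single bigram count table to all corpora combined."""
    counts = defaultdict(Counter)
    for text in corpora:
        words = text.split()
        for a, b in zip(words, words[1:]):
            counts[a][b] += 1   # author identity is discarded right here
    return counts

def most_likely_next(counts, word):
    """Greedy sampling from the fitted model."""
    return counts[word].most_common(1)[0][0]

model = fit_bigrams([
    "attribution is vital and attribution is required",   # "author A"
    "attribution is optional in practice",                # "author B"
])
```

Ask `most_likely_next(model, "attribution")` and you get "is" — a statistic drawn from both authors at once. There is no question "whose sentence is this?" to answer, because the fitting step erased that information, just as laundering erases where the money came from.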
you know originally anecdotally also I would just say that you know our head of marketing Eric Martin shared this with me earlier today is that the with open AI we now you begin to see things like sources within there and you can actually see the links again to Stack overflow if you're in and we have actually seen a nice increase in the traffic that comes to our site from from chatgpt actually
That's nice. It's completely orthogonal to proper attribution, though.
and you know obviously the opposite is happening on search for all content sites because that's just a it's a different world for search
Not all search. We can encourage the use of more respectful search systems, like the Marginalia search engine. See its results for "branch prediction fail" (image): the first six links (two boxes) are Stack Exchange, and that's organic!
but as gen AI and search trade off on you know you know getting access to users is sort of the primary screen that people spend time on to get access to information on the web
Incoherent. (Transcripts need to be written by actual people: the technology isn't there, and won't be until 2125 at the earliest.)
we're literally seeing that happen on our site where traffic seems to be coming now from both places people are coming definitely from Google but also our search engines and people are definitely coming from now our some of our knowledge as a service integration so it was interesting to see the chat GPT traffic through this attribution point
so it is a long way of saying wizz wizz it is a a work in progress and it is is happening as we keep working with each of these Partners to to hold them accountable to that requirement thank you
I understand "some is better than none" and "harm reduction", but you're not holding them accountable to that requirement. You're holding them accountable to a weaker requirement ("search for and link to relevant Stack Exchange pages"), which – while way better than nothing – is not the same thing as attribution.
DuckDuckGo has been shoving AI slop in my face. Usually I ignore it, but I wasn't concentrating hard enough, so I read this one. It's a good example!

The two "cited sources" are:
But the claim that "This idea is often explored in philosophy and psychology" – where does that come from? It's completely absent from the "sources", and I'm not sure it's even true. (It's often explored in literary analysis.) If this claim does come from somewhere… where?
This is not attribution.
Kaylee Williams, Sarah Grevy Gotfredsen, and Dhrumil Mehta of Columbia Journalism Review did an experiment, and found that “Content licensing deals with news sources provided no guarantee of accurate citation in chatbot responses.”
On a somewhat tangential note, I looked up the definition of money laundering. The bit about US law was interesting, but I've lost the tabs. The UK's Crown Prosecution Service's page on "Money Laundering Offences" says:
Money laundering is defined in the POCA as “the process by which the proceeds of crime are converted into assets which appear to have a legitimate origin, so that they can be retained permanently or recycled into further criminal enterprises”.
(quoting the Proceeds of Crime Act 2002 Explanatory Notes, not the law proper)
This definition is quite broad, so I dug a bit deeper, and… per sections 326, 327 and 340 of the Proceeds of Crime Act 2002, large-scale AI provenance-laundering might not just be *like* money laundering: it looks like it falls under the plain-English definition, hinging only on whether scraping CC-licensed texts satisfies the legalese "obtains an interest in it". (This'll probably never be relevant, but it's interesting.)