4
  • For me it's clearly the latter; what makes you think it means the former? They don't want the data to be used to train an LLM, be it a personal or a commercial one. Does it make sense? Sure, they're committed to a specific LLM by now. Is it fair or good? Not really. Commented Jul 27, 2024 at 6:19
  • 1
    @ShadowWizard Since this wording was born after the legal team had a look at the previous one, I expect the new wording to be more relaxed, not less (in this answer, the former is more relaxed, while the latter is less relaxed). Also note the second sentence, "should I distribute this file for purpose of LLM training..." - there's no penalty for non-personal use, only for LLM use. Commented Jul 27, 2024 at 9:05
  • 4
    Overall, I feel SE for some reason just can't fathom that anybody would need the dump for anything other than LLM training; they have this false dichotomy between "personal use" and "LLM training". Commented Jul 27, 2024 at 9:08
  • 6
    Either way, any extra limitations SE tries to tack onto the data dump are unenforceable, as CC-BY-SA expressly prohibits adding any limitations not already included in it. You can safely ignore the SE legal team's baseless scaremongering. Commented Jul 28, 2024 at 1:04