Timeline for Announcing a change to the data-dump process
Current License: CC BY-SA 4.0
18 events
| when | what | by | license | comment |
|---|---|---|---|---|
| Aug 23, 2024 at 2:54 | comment added | Journeyman Geek | | quick sidenote - modern copies of rclone appear to support the internet archive and between threading and other features, seem to be a lot more robust than the IA client - I'd suggest grabbing the latest version rather than the distro supplied version |
| Aug 5, 2024 at 21:50 | comment added | Franck Dernoncourt | | @Joshua If an agreement puts some condition on a file, does the condition also apply to its content? (the content itself is under CC BY-SA) |
| Jul 24, 2024 at 17:55 | history edited | Franck Dernoncourt | CC BY-SA 4.0 | added 595 characters in body |
| Jul 16, 2024 at 21:54 | comment added | Franck Dernoncourt | | @Joshua ok converting to JSON: {"SE dump": files.xml} :) |
| Jul 16, 2024 at 21:53 | comment added | Joshua | | @FranckDernoncourt: More like; replace the packaging schema with one of your own design. When I wrote my content I was imagining building an HTML render of the whole as flat files. |
| Jul 16, 2024 at 19:00 | comment added | Franck Dernoncourt | | @Joshua how much process is needed? eg is adding a space enough? |
| Jul 16, 2024 at 10:21 | comment added | Joshua | | Fun fact; the license only applies to the dump, not to the post contents within. If you were to process the dump and re-host all the content thereof; anybody hitting your site would not have inherited that license but only the CC-BY-SA license of the actual content. |
| Jul 16, 2024 at 9:19 | history edited | Franck Dernoncourt | CC BY-SA 4.0 | added 5 characters in body |
| Jul 16, 2024 at 9:14 | history edited | Franck Dernoncourt | CC BY-SA 4.0 | added 109 characters in body |
| Jul 16, 2024 at 9:08 | history edited | Franck Dernoncourt | CC BY-SA 4.0 | added 247 characters in body |
| Jul 15, 2024 at 1:38 | history edited | Franck Dernoncourt | CC BY-SA 4.0 | deleted 12 characters in body |
| Jul 14, 2024 at 23:30 | history edited | Franck Dernoncourt | CC BY-SA 4.0 | Thanks Zoe and AMtwo |
| Jul 14, 2024 at 13:55 | comment added | Zoe - Save the data dump | | "and it's for only 1 out of around 380 network sites+metas" - strictly speaking, it's for 2. From the question: "Please note that users will only be provided the data for the specific Stack Exchange site and its Meta site for the corresponding Stack Exchange site profile" -- so you'd be getting the main + meta dump in one go, so it's 1/183 or so instead (strictly speaking, 184, but area51 has never been included in the data dump). It's still really bad that one download becomes 183, but it's at least slightly better than ~365 or so |
| Jul 12, 2024 at 17:21 | comment added | anon | | Oh yeah. It's definitely not better.... and I am furious about it. I spent my entire lunch hour writing up a lengthy answer on this post. |
| Jul 12, 2024 at 17:20 | comment added | Franck Dernoncourt | | @AMtwo thanks, yes I know I created stack-exchange-images, just wanted to point out limitations of the new dump that SE Inc. claims is better than using archive.org |
| Jul 12, 2024 at 17:18 | comment added | anon | | The Data Dump has NEVER included images--It only includes the URLs to the images hosted elsewhere. The stack-exchange-images archive is created & maintained by data archivists, not by Stack Overflow. I suspect that after this change, the stackexchange archive will similarly continue to be maintained by data archivists without the help of the company. |
| Jul 12, 2024 at 17:06 | history edited | Franck Dernoncourt | CC BY-SA 4.0 | added 146 characters in body |
| Jul 12, 2024 at 16:42 | history answered | Franck Dernoncourt | CC BY-SA 4.0 | |