The Wayback Machine - https://web.archive.org/web/20160306094758/http://minoan.deaditerranean.com/resources/linear-b-ideograms/vir/
I’ve been doing some data analysis on the sign groups and adjuncts that most frequently appear in conjunction with VIR on the Knossos tablets. Here’s what appears most often in the context of VIR at Knossos (I’ve excluded single-instance appearances; please ignore the differences in case, which are just an artifact of my current data set):
[“TO-SO”, 13],
[“to-so”, 2],
=> 15 instances
[“KO-WO”, 7],
[“ko-wo”, 2],
=> 9 instances
[“TELA”, 8],
[“X”, 7],
[“MUL”, 7],
[“DA”, 6],
[“E-NE-KA”, 6],
[“KO-NO-SI-JA”, 2],
[“KO-NO-SI-JO”, 3],
=> 5 instances
[“O-PA”, 4],
[“KO-WA”, 4],
[“]JO”, 4],
[“]-JO”, 4],
[“PO”, 4],
[“DA-*22-TO”, 4],
[“]KE-RE”, 3],
[“]-U”, 3],
[“SE-TO-I-JA”, 3],
[“A-[“, 3],
[“E-RE-TA”, 3],
[“to-ko”, 3],
[“RI-ZO”, 3],
[“PO-KU-TA”, 3],
[“M”, 3],
[“KO-[“, 3],
[“A-RA-DA-JO”, 2],
[“A-TA-ZE-U”, 2],
[“PO-TO”, 2],
[“I-JE-[“, 2],
[“E-TE”, 2],
[“(JA)-SA-NO”, 2],
[“]KO-ME-NO”, 2],
[“PO-ME”, 2],
[“KA-KE”, 2],
[“PA2-SI-RE-WI-JA”, 2],
[“DO-E-RO”, 2],
[“]wi-ja”, 2],
[“SU-RI-MO”, 2],
[“di”, 2],
[“](TO)-SO”, 2],
[“]TO”, 2],
[“E-SO-TO”, 2],
[“RU-NA”, 2],
[“A-PE-O-TE”, 2],
[“DA-WI-JO”, 2],
[“PU-TO-RO”, 2],
[“TE”, 2],
[“E-QE-TA”, 2],
[“]-WO”, 2],
[“ko-wi-ro-wo-ko”, 2],
[“A-NU-TO”, 2],
[“(E)[“, 2],
[“QO-TE-RO”, 2],
[“PU-TE”, 2],
[“KA-RI-SE-U”, 2],
[“]E”, 2],
[“KI-TA-NE-TO”, 2],
[“]-NO”, 2],
[“PA-NA-RE-JO”, 2],
[“PI-RI-NO”, 2],
[“O-PI-SI-JO”, 2],
[“PI-JA-SE-ME”, 2],
[“PE-TE-KI-JA”, 2],
[“](TO)”, 2],
[“ku-su-to-ro-qa”, 2],
[“T”, 2],
[“QI-RI-JA-TO”, 2],
[“V”, 2],
[“U-RA-MO-NO”, 2],
[“E-KI-SI-JO”, 2],
[“KE”, 2],
[“WI-NA-TO”, 2],
[“A-PA-TA-WA-JO”, 2],
[“KO-PE-RE-U”, 2],
[“SU-KE-RE-O”, 2],
[“A-RA-NA-RO”, 2],
[“A-KO-RA-JO”, 2],
[“I-TE-U”, 2],
[“PI-JA-SI-RO”, 2],
[“WE-KE-I-JA”, 2],
[“PU-WO”, 2],
[“SI-TO”, 2],
[“A-MO-RA-MA”, 2],
[“LUNA”, 2],
[“TE-RE-TA”, 2],
[“DU-TO”, 2],
[“KO-NI-DA-JO”, 2],
[“KE-DO-SI-JA”, 2],
to do:
case insensitivity
run Pylos and compare once PY data entry is complete
auto-consolidate declensions and alternations
de-duplicate entries for a single inscription recorded from multiple sources
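
As a rough illustration of the case-insensitivity item, here is a minimal Python sketch that merges entries differing only in letter case. The sample pairs are copied from the list above; the name `merge_case_variants` is hypothetical, and the sketch assumes that case carries no meaning in this data set, as noted in the introduction. It is not the script that produced these counts.

```python
from collections import Counter

# A few (sign group, count) pairs copied from the list above.
raw_counts = [
    ("TO-SO", 13),
    ("to-so", 2),
    ("KO-WO", 7),
    ("ko-wo", 2),
    ("TELA", 8),
    ("to-ko", 3),
]


def merge_case_variants(pairs):
    """Sum the counts of entries that differ only in letter case.

    Assumption: case is not meaningful here, so every sign group is
    folded to upper case before counting.
    """
    merged = Counter()
    for sign_group, count in pairs:
        merged[sign_group.upper()] += count
    return merged


merged = merge_case_variants(raw_counts)

# Print the merged tallies, most frequent first.
for sign_group, count in merged.most_common():
    print(f"{sign_group}: {count}")
```

On the sample pairs this reproduces the manual consolidations shown above: TO-SO comes to 15 instances and KO-WO to 9.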