Okay, sounds like a database dump is definitely the way to go for time's sake. How much do you know about SHA1? I gave more thought to generating the combined revisions as part of the imported data, but I'm still having trouble figuring out the SHA1. I've searched online and found high-level descriptions and full implementations, but nothing in between. From the implementations I've found, it seems like the initial hash values are fixed by the spec (or at least there is a set that is commonly used), although none of the high-level descriptions I found actually say that. Is that the case, that the initial hash values are prescribed by the spec?
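
(For what it's worth, the reading I've done suggests the five 32-bit initial values are indeed fixed by FIPS 180: 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0. If that's right, any conformant implementation produces an identical digest for identical input bytes, which is easy to sanity-check against the published "abc" test vector. A minimal sketch using Node's built-in crypto module:)

```js
// Sketch: check a SHA1 implementation against the published "abc" test vector.
// Because FIPS 180 fixes the initial hash values, every conformant
// implementation must produce this exact digest for this input.
const crypto = require('crypto');

const digest = crypto.createHash('sha1').update('abc').digest('hex');
console.log(digest);
// Expected: a9993e364706816aba3e25717850c26c9cd0d89d
```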

If so, then I would assume any valid SHA1 generator should do, and that is where the issue occurs. I tested this by taking the contents of a revision of my user page here on CC and running it through an online SHA1 generator. That gave me the result in base-16, and I know MediaWiki uses base-36, so I found a converter and used it to get the base-36 SHA1. It didn't match. To me, this means either my understanding of how SHA1 works is incomplete or MediaWiki is using something other than the straight-up revision content as the input string.
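
(For reference, here's the pipeline I believe MediaWiki uses, sketched in Node.js. Treat the details as my unconfirmed reading of it: the raw UTF-8 revision text is hashed with no trailing newline, the hex digest is treated as one 160-bit integer, and the base-36 result is left-padded with zeros to 31 characters. A converter that chunks the hex into pieces, or skips the zero-padding, would explain a mismatch; so would an online generator silently appending a newline.)

```js
// Sketch of how MediaWiki seems to derive the <sha1> field in dumps:
// SHA1 the revision's raw UTF-8 text, then re-express the 160-bit digest
// in base-36, left-padded with zeros to 31 characters.
const crypto = require('crypto');

function mediaWikiSha1(text) {
  // Hash exactly the revision text as UTF-8 bytes -- no trailing newline.
  const hex = crypto.createHash('sha1').update(text, 'utf8').digest('hex');
  // Convert the whole digest as one 160-bit integer; converting it in
  // pieces produces a different (wrong) base-36 string.
  return BigInt('0x' + hex).toString(36).padStart(31, '0');
}

console.log(mediaWikiSha1('abc'));
// The base-36, zero-padded form of a9993e364706816aba3e25717850c26c9cd0d89d
```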

Also, apparently JS uses UTF-16 while MediaWiki, at least when it comes to retrieved revisions, uses UTF-8. Do you think this will cause any issues when manipulating the XML?
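
(My guess is it's fine as long as the string is encoded to UTF-8 bytes at the boundary where the bytes actually matter, i.e. hashing and writing the XML out; internally JS can keep its UTF-16 representation. A sketch of what I have in mind, assuming an environment with TextEncoder and the Web Crypto API, which both browsers and recent Node provide:)

```js
// Sketch: hash a JS (internally UTF-16) string as UTF-8 bytes, so the
// digest matches what MediaWiki computed over the same revision text.
async function sha1Utf8(text) {
  const bytes = new TextEncoder().encode(text); // UTF-16 string -> UTF-8 bytes
  const buffer = await crypto.subtle.digest('SHA-1', bytes);
  // Render the ArrayBuffer as the usual lowercase hex digest.
  return [...new Uint8Array(buffer)]
    .map((b) => b.toString(16).padStart(2, '0'))
    .join('');
}

sha1Utf8('abc').then(console.log);
// a9993e364706816aba3e25717850c26c9cd0d89d
```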