How to: Export all pages from each namespace

This method is shared in the hope that (i) someone else finds it useful and (ii) someone else might point me to an even better way of getting the same result.

Goal
Export all pages from a wiki for subsequent import into another wiki.
Problem
  1. Special:Export requires a list of pages with (i) one page per line and (ii) no trailing whitespace after each page name.
  2. There is no single category that all pages on a wiki belong to, and some pages do not belong to any category at all.
  3. Special:AllPages output is arranged in three columns and does not prepend each page name with its namespace prefix.
  4. As a result, the list produced by Special:AllPages cannot simply be copied and pasted into Special:Export to achieve the expected result.

One solution

  1. Repeat the following steps for each namespace at your wiki, appending each block of HTML as you go:
    1. Use Special:AllPages and select one namespace at a time.
    2. View the HTML source of the resulting list and copy just the HTML for the table of results. (This is very easy to do if you can use Firebug to select the <tbody> element and then "Copy inner HTML" to copy just the table of results.)
    3. Paste the copied HTML into a text editor.
  2. Save the concatenation of pasted HTML as a file named "inner-html" or similar.
  3. Copy the following awk script into a text editor and save it as a file named "a.awk" or similar:
    BEGIN { RS=" title="; FS="\"" }  # treat each " title=" attribute as a record separator and split fields on double quotes
    {print $2}                       # field 2 of each record is the title attribute's value: the page name with its namespace prefix
  4. At a shell command line, run: awk -f a.awk inner-html > export-list (a small worked example of steps 3-5 follows this list)
  5. Open the file "export-list" and copy its list of pages into Special:Export - this very nicely provides the namespace prefix as well as one page per line with no trailing whitespace
  6. Export to a file and then proceed with import as usual.
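
To make the awk step concrete, here is a small sanity check you can run before pasting a real page list into Special:Export. It assumes GNU awk, which accepts a multi-character record separator, and the sample HTML fragment is invented for illustration:

    # A made-up two-entry sample of the copied AllPages HTML, saved as "inner-html"
    printf '%s\n' \
      '<tr><td><a href="/wiki/Template:Infobox" title="Template:Infobox">Infobox</a></td>' \
      '<td><a href="/wiki/Template:Navbox" title="Template:Navbox">Navbox</a></td></tr>' \
      > inner-html

    # Steps 4 and 5, using the a.awk saved in step 3
    awk -f a.awk inner-html > export-list
    cat export-list
    # "Template:Infobox" and "Template:Navbox" each appear on their own line with no
    # trailing whitespace. (The text before the first " title=" also forms a record,
    # so the very first output line is a stray href fragment - easy to delete by hand.)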

enjoy! -- najevi 00:48, September 29, 2010 (UTC)

Of course, if you control the other wiki (e.g. if it's on your server), you can download the dump (Special:Statistics at the wiki in question) and directly import the whole dump. JohnBeckett 01:10, September 29, 2010 (UTC)
  export         - Export the current revisions of all given or generated pages
  exportnowrap   - Return the export XML without wrapping it in an XML result (same format as Special:Export). Can only be used with export

http://community.wikia.com/api.php?action=query&generator=allpages&gapnamespace=2&prop=revisions&export=true&exportnowrap=true&format=xml
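
Building on those api.php parameters, something like the following loop should fetch the current-revision export for several namespaces in one pass - treat it as an untested sketch rather than a recipe. The namespace numbers and output file names are placeholders, and each request returns at most gaplimit pages (500 for an ordinary account), so a large namespace would need to be paged using the query-continue value the API returns:

    # Rough sketch: export a few namespaces via api.php (0=main, 6=File, 10=Template, 14=Category)
    for NS in 0 6 10 14; do
      curl -s "http://community.wikia.com/api.php?action=query&generator=allpages&gapnamespace=${NS}&gaplimit=500&prop=revisions&export=true&exportnowrap=true&format=xml" \
        > "export-ns${NS}.xml"
    done
    # Each export-ns*.xml file is in the same format as a Special:Export download,
    # so it should be importable through Special:Import on the target wiki.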

OK, thanks for that idea. It was interesting to read about the database dump and import process. That research also led me to mw:Manual:Moving_a_wiki, which mentions the importance of matching MediaWiki versions between source and target wikis when using the database dump.
What I am planning to do is export all pages in all namespaces except the user, user_blog, blog and corresponding _talk/_comment namespaces from a Wikia-hosted MediaWiki v1.15.x wiki and then import those pages into a newly created MediaWiki v1.16.0 wiki.
  • I realize that none of the files/images will get transferred, but the page for each file/image should get created, and I reason that I can, albeit tediously, upload a file/image at each of the 93 pages in that File: namespace.
    1. Do you think it would make any sense to submit a request to Wikia staff (via Special:Contact) for a tar archive of the $IP/images directory?
    2. Will the two-level hashed directory structure for images be identical between source and target wikis? .... or different?
    3. I assume I need to make sure that the numeric key for each namespace is identical at source and target wiki. (One custom namespace as well as a few SMW related namespaces are used.)
  • Should I concern myself over the username and numeric user ID for whoever last edited each page being transferred? (I did not plan to copy the full history of each page.)
  • Can you anticipate any problems with this plan?
I have cPanel access to the target wiki's file system and MySQL database but I do not have access to a unix shell command line at the new host server. -- najevi 19:10, September 29, 2010 (UTC)