
Migrate from MediaWiki to GitBook

 1 year ago
source link: https://ooso.net/archives/778

Export data from MediaWiki

Use ‘Special:Export’ to export all pages from MediaWiki:

  • Go to Special:Allpages and select the articles or files you want to export.
  • Copy the list of page names into a text editor.
  • Put each page name on its own line. I did this with Vim.
  • Go to Special:Export and paste all your page names into the textbox, making sure there are no empty lines.
  • Click ‘Submit query’
  • Save the resulting XML to a file using your browser’s save facility.
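
The list clean-up in the steps above can also be scripted instead of done by hand in Vim. The following is a minimal sketch, not part of the original workflow. It assumes the names copied from Special:Allpages arrive either one per line or separated by tabs; `clean_page_names` is a hypothetical helper name.

```python
import re


def clean_page_names(raw: str) -> str:
    """Return one page name per line, without blank lines or duplicates.

    Assumption: names are separated by newlines or tabs. Spaces are kept,
    because MediaWiki page names may contain them.
    """
    seen = set()
    names = []
    for chunk in re.split(r"[\t\r\n]+", raw):
        name = chunk.strip()
        if name and name not in seen:
            seen.add(name)
            names.append(name)
    return "\n".join(names)
```

The returned string can be pasted straight into the Special:Export textbox, since it contains no empty lines.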

You can now use this XML file in the next step.

Convert MediaWiki documents to markdown format

We now have one large XML file that contains every page from MediaWiki, all in MediaWiki markup. The pandoc tool can convert MediaWiki markup to Markdown directly, but it cannot parse the XML export file.

I used the tool MediaWiki to Markdown for this job. It is written in PHP. It takes an XML export from MediaWiki and converts each wiki page to an individual Markdown file.

The tool is easy to use. For example, this command converts all pages to GitHub Flavored Markdown:

php convert.php --filename=mediawiki.xml --output=export --format=gfm 
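
To illustrate why an extra tool is needed, here is a rough sketch of the two stages involved: pulling each page's wikitext out of the XML export, then handing that wikitext to pandoc. This is not how MediaWiki to Markdown is implemented. The helper names `iter_pages` and `wikitext_to_gfm` are hypothetical, and the second function assumes pandoc is installed and on the PATH.

```python
import subprocess
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_text: str):
    """Yield (title, wikitext) for every <page> element in an export dump.

    If a page carries several revisions, the text of the last one wins.
    """
    root = ET.fromstring(xml_text)
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        title, text = None, ""
        for node in page.iter():
            name = local_name(node.tag)
            if name == "title":
                title = node.text
            elif name == "text":
                text = node.text or ""
        if title is not None:
            yield title, text


def wikitext_to_gfm(wikitext: str) -> str:
    """Convert one page of wikitext with pandoc (must be on the PATH)."""
    result = subprocess.run(
        ["pandoc", "--from=mediawiki", "--to=gfm"],
        input=wikitext,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Matching on the local tag name avoids hard-coding the export schema version, which differs between MediaWiki releases.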

Import data to GitBook

The previous step produced many md files.

  • Create two folders, en and zh, for multi-language support.
  • Copy all md files into both folders, then clean up the files for each language.
  • Copy Main_Page.md to SUMMARY.md, which is the index file for GitBook. I rewrote this file carefully.
  • Create the languages file for GitBook (LANGS.md):
* [English](en/)
* [Chinese](zh/)
  • Create the book file for GitBook. Contents of book.json:
{
    "plugins": [
        "language-selected",
        "expandable-chapters-small", 
        "-search", 
        "-livereload", 
        "-fontsettings", 
        "-lunr", 
        "-sharing"
    ]
}
  • Install GitBook and preview the book.
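
As an alternative starting point for SUMMARY.md, a flat index can be generated from the md files in a language folder and then edited by hand. This is only a sketch under stated assumptions: the author started from Main_Page.md instead, `build_summary` is a hypothetical helper, and the titles are guessed from file names by turning underscores into spaces.

```python
from pathlib import Path


def build_summary(lang_dir: str) -> str:
    """Return SUMMARY.md content that lists every page in one folder."""
    skip = {"SUMMARY.md", "README.md"}  # index files, not regular pages
    lines = ["# Summary", ""]
    for path in sorted(Path(lang_dir).glob("*.md")):
        if path.name in skip:
            continue
        # Assumption: file names use underscores for spaces (Main_Page.md).
        title = path.stem.replace("_", " ")
        lines.append(f"* [{title}]({path.name})")
    return "\n".join(lines) + "\n"
```

For example, `Path("en/SUMMARY.md").write_text(build_summary("en"), encoding="utf-8")` would write the generated index for the English folder.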

Rewrite rule for nginx

I wrote these rewrite rules for nginx. They keep the original MediaWiki URLs working on the new site.

        rewrite ^/wiki/Main_Page$ /_book/index.html last;
        rewrite ^/index\.html$ /_book/index.html last;
        rewrite ^/gitbook/(.+)$ /_book/gitbook/$1 last;
        rewrite ^/wiki/(.+)\.html$ /en/$1.html permanent;
        rewrite ^/wiki/(.+)$ /en/$1.html permanent;

        location /en/ {
                try_files $uri /_book$uri /_book$uri.html;
        }

        location /zh/ {
                try_files $uri /_book$uri /_book$uri.html;
        }
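
To check how old URLs map to new ones, the five rewrite rules can be mimicked with regular expressions. The sketch below is a simplification and does not reproduce nginx itself: the first matching rule wins, `last` stands for an internal rewrite, and `permanent` stands for a 301 redirect. It ignores the `location` blocks and `try_files`. The dots before `html` are escaped here so that they match only a literal dot, and `rewrite` is a hypothetical function name.

```python
import re

# (pattern, replacement, flag), in the same order as the nginx rules.
RULES = [
    (r"^/wiki/Main_Page$", "/_book/index.html", "last"),
    (r"^/index\.html$", "/_book/index.html", "last"),
    (r"^/gitbook/(.+)$", r"/_book/gitbook/\1", "last"),
    (r"^/wiki/(.+)\.html$", r"/en/\1.html", "permanent"),
    (r"^/wiki/(.+)$", r"/en/\1.html", "permanent"),
]


def rewrite(uri: str):
    """Return (new_uri, flag) for the first matching rule, else (uri, None)."""
    for pattern, replacement, flag in RULES:
        if re.search(pattern, uri):
            return re.sub(pattern, replacement, uri), flag
    return uri, None


for old in ["/wiki/Main_Page", "/wiki/Install", "/wiki/Install.html"]:
    print(old, "->", rewrite(old))
```

Old wiki URLs with and without the .html suffix end up at the same page under /en/, which is what keeps existing links working.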

Miscellaneous

I have many Markdown files from the original MediaWiki. Unfortunately, GitBook does not process Markdown files that are not listed in SUMMARY.md. Many other people report the same problem in the GitBook repository. I used a workaround to resolve it.
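
The workaround itself is not described here, but it helps to know which files are affected. The sketch below lists the md files in a language folder that SUMMARY.md does not link to. It is an illustration only: `unlisted_pages` is a hypothetical helper, and it assumes that SUMMARY.md uses ordinary Markdown links with paths relative to the same folder.

```python
import re
from pathlib import Path

# Captures the target of a Markdown link that points at an .md file.
LINK = re.compile(r"\]\(([^)#]+\.md)")


def unlisted_pages(lang_dir: str) -> list:
    """Return names of md files that SUMMARY.md does not reference."""
    root = Path(lang_dir)
    summary = (root / "SUMMARY.md").read_text(encoding="utf-8")
    listed = {target.strip() for target in LINK.findall(summary)}
    ignore = {"SUMMARY.md", "README.md"}  # index files, not regular pages
    return sorted(
        path.name
        for path in root.glob("*.md")
        if path.name not in listed and path.name not in ignore
    )
```

Every name this returns is a page that GitBook would skip until it is added to SUMMARY.md or handled by a workaround.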

