[Python] Insert Line With Matched Pattern

source link: http://siongui.github.io/2016/03/01/python-insert-line-with-matched-pattern/

March 01, 2016

Extract the URL from a reStructuredText link and insert it into the same file as rst metadata, using Python.

We extract the URL from the following link in an rst file (the link text 舊網頁 means "old web page"):

`舊網頁 <http://nanda.online-dhamma.net/Tipitaka/Post-Canon/Visuddhimagga/Visuddhimagga.htm>`_

Then we insert the URL back into the rst file as an :oldurl: metadata field, placed on the line just before the :tags: field.
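As a minimal sketch of the extraction step, assuming the link has the form shown above, a capture group can pull out the text between the angle brackets. The names LINK_RE and extract_url are illustrative and do not appear in the original script.

```python
# -*- coding: utf-8 -*-
import re

# Hypothetical names for illustration only.
# The group captures everything after "<" up to the first ">".
LINK_RE = re.compile(r'<(http[^>]+)>')


def extract_url(line):
    """Return the first http(s) URL wrapped in <...> on the line, or None."""
    match = LINK_RE.search(line)
    return match.group(1) if match else None


line = "`舊網頁 <http://nanda.online-dhamma.net/Tipitaka/Post-Canon/Visuddhimagga/Visuddhimagga.htm>`_"
print(extract_url(line))
```

Returning None for lines without a link lets the caller decide how to handle a missing URL instead of failing on an empty result list.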

#!/usr/bin/env python
# -*- coding:utf-8 -*-

import os
import re

# Match "<http...>" without running past the first closing ">".
oldlink = re.compile(r'<http[^>]+>')

def processFile(path):
  print("process " + path + " ...")
  tmpfilepath = os.path.join("/tmp", os.path.basename(path))
  with open(path, "r") as f, open(tmpfilepath, "w") as fo:
    # http://www.tutorialspoint.com/python/file_readlines.htm
    lines = f.readlines()
    insertAt = 0
    oldurl = ""
    for index, line in enumerate(lines):
      if ":tags:" in line:
        insertAt = index
      if "舊網頁" in line:
        result = oldlink.findall(line)
        oldurl = result[0][1:-1]

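    # Stop early if the file lacks a :tags: field or an old link.
    assert insertAt != 0, "no :tags: line found in " + path
    assert oldurl != "", "no old URL found in " + path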
    # http://www.tutorialspoint.com/python/list_insert.htm
    # http://stackoverflow.com/questions/10507230/insert-line-at-middle-of-file-with-python
    # http://stackoverflow.com/questions/11968998/remove-lines-that-contain-certain-string
    lines.insert(insertAt, ":oldurl: " + oldurl + "\n")
    fo.writelines(lines)

  with open(tmpfilepath, "r") as f, open(path, "w") as fo:
    fo.write(f.read())


def processDir(rootDir):
  # http://www.tutorialspoint.com/python/os_walk.htm
  for root, dirs, files in os.walk(rootDir):
    for name in files:
      path = os.path.join(root, name)
      processFile(path)


if __name__ == '__main__':
  # http://stackoverflow.com/questions/50499/how-do-i-get-the-path-and-name-of-the-file-that-is-currently-executing
  processDir(os.path.join(os.path.dirname(__file__), "../content/articles"))
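The insertion step inside processFile can also be isolated as a small function that works on a list of lines, which makes it easy to try without touching any files. This is only a sketch: insert_before_tags and the sample metadata are hypothetical, and unlike the script above it inserts before the first :tags: line rather than the last one, which is equivalent when a file has a single :tags: field.

```python
def insert_before_tags(lines, oldurl):
    """Return a new list with an :oldurl: field inserted before :tags:.

    Raises ValueError if no line contains ":tags:".
    """
    for index, line in enumerate(lines):
        if ":tags:" in line:
            new_field = ":oldurl: " + oldurl + "\n"
            return lines[:index] + [new_field] + lines[index:]
    raise ValueError("no :tags: line found")


# Hypothetical sample content, not taken from the original articles.
sample = [
    ":title: Example\n",
    ":tags: demo\n",
    "\n",
    "Body text.\n",
]
result = insert_before_tags(sample, "http://example.org/old.htm")
print("".join(result))
```

Because the function returns a new list instead of calling list.insert in place, the original lines stay unchanged if something goes wrong later.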

Tested on: Ubuntu Linux 15.10, Python 2.7.10.
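The script above targets Python 2.7. On Python 3, a comparable rewrite step might look like the sketch below, assuming the rst files are UTF-8 encoded. The helper name rewrite_lines is hypothetical. It writes to a temporary file in the same directory and then swaps it in with os.replace, instead of copying through /tmp as the original does.

```python
import os
import tempfile


def rewrite_lines(path, transform):
    """Read path as UTF-8, apply transform to its lines, and write the result back.

    transform is any function that takes a list of lines and returns a list of lines.
    """
    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()

    new_lines = transform(lines)

    # Create the temporary file next to the target so os.replace
    # stays on the same filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    with tempfile.NamedTemporaryFile(
        "w", encoding="utf-8", dir=directory, delete=False
    ) as tmp:
        tmp.writelines(new_lines)
        tmp_path = tmp.name

    # Replace the original file with the rewritten one.
    os.replace(tmp_path, path)
```

A caller would pass a function that inserts the :oldurl: line, so the file handling and the text transformation can be checked separately.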



