[Bash] Process HTML Files Format in Bulk

source link: http://siongui.github.io/2016/04/26/bash-process-html-files-format-in-volumn/

April 26, 2016

Convert the format of the HTML files in a directory (searched recursively) with a Bash script. For each file, the script converts Big5 encoding to UTF-8, removes DOS newlines, replaces the string big5 with UTF-8, and appends a UNIX newline to the end of the file if one is missing.

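#!/bin/bash

# $1 is the directory containing the files to be processed
: "${1:?usage: $0 DIRECTORY}"

# temporary file for the converted output, removed when the script exits
tmp=$(mktemp) || exit 1
trap 'rm -f "${tmp}"' EXIT

# -print0 with read -d '' keeps paths containing whitespace intact
find "$1" -type f -print0 | while IFS= read -r -d '' path
do
  echo -e "\033[92mProcessing ${path} ...\033[0m"
  # big5 to utf8
  if ! iconv -f big5 -t utf-8 "${path}" > "${tmp}"; then
    # fail to convert big5 to UTF-8, leave the file unchanged
    continue
  fi

  # remove dos newline (writing back with > keeps the file's permissions)
  tr -d '\015' < "${tmp}" > "${path}"

  # html meta big5 to UTF-8 (first match on each line; -i needs GNU sed)
  sed -i 's/big5/UTF-8/' "${path}"

  # append newline to end of file if it is missing
  sed -i -e '$a\' "${path}"
done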
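
After a bulk conversion it can be useful to confirm that each file ended up in the expected state. The following is a minimal sketch of such a check, not part of the original script. The function name check_converted is hypothetical, and the sketch assumes that iconv, grep, and tail are available. It tests the three properties the script is meant to establish: valid UTF-8, no DOS newlines, and a trailing newline.

```bash
#!/bin/bash

# check_converted FILE
# Hypothetical helper (an assumption, not from the original article):
# returns 0 only if FILE is valid UTF-8, contains no carriage returns,
# and ends with a newline.
check_converted() {
  local file=$1
  local cr
  # a literal carriage return (\015), the extra byte in a DOS newline
  cr=$(printf '\r')

  # iconv exits non-zero when the input is not valid UTF-8
  if ! iconv -f utf-8 -t utf-8 "${file}" > /dev/null 2>&1; then
    echo "${file}: not valid UTF-8" >&2
    return 1
  fi

  # any remaining carriage return means a DOS newline survived
  if grep -q "${cr}" "${file}"; then
    echo "${file}: still contains DOS newlines" >&2
    return 1
  fi

  # tail -c 1 prints the last byte; command substitution strips a trailing
  # newline, so a non-empty result means the file ends with something else
  if [ -n "$(tail -c 1 "${file}")" ]; then
    echo "${file}: missing newline at end of file" >&2
    return 1
  fi

  return 0
}
```

Calling check_converted on each path after the conversion loop would print one line to standard error for every file that still needs attention. A file on which iconv failed, and which the script therefore skipped, would typically be reported as not valid UTF-8.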


