Guess Metadata from HTML and Converted to reStructuredText

source link: https://siongui.github.io/2016/05/16/html-metadata-to-rst/

May 16, 2016

Guess metadata from an HTML webpage and convert it to reStructuredText format. Extraction of the following metadata is currently supported, when the page provides it:

  • title
  • keywords (tags)
  • description (summary)
  • author
  • og:image

Usage:

See the "guess metadata from HTML" commit in the html2rst repo for the source code.


Tested on: Ubuntu Linux 16.04, Go 1.6.2.


