SAVING WEB PAGES INTO MARKDOWN FILES

Some time ago I organized and migrated my notes into one big repository of text files (and images where really necessary), all of them version controlled by git and lightly formatted by Markdown.

I wrote about the setup and the reasoning behind it in the Note Taking Unchained article, and today I'm going to expand a bit on working with Markdown.

Markdown makes a text file easy to read even with the formatting codes visible inline (OK, a good text editor also makes a huge difference). And if you don't like reading it that way, most Markdown editors can export HTML generated from your text, for easy reading in a web browser.

But what about the other way around? Let's say I found a web article full of very useful information and I'd like to save it to my notes so it's available for searching and future reference. The options I have are:

  • do it manually: copy and paste the text, then add Markdown formatting by hand
  • use a script/web app that cleans and converts the useful content to Markdown while preserving links and formatting

Should be pretty obvious which one I use. Enter Marky the Markdownifier. I'll pick a random article from this blog as the test subject.

Diagram

Once you hit Go, the output will be the Markdown-formatted version of the page. For some inexplicable reason, it's best to use the Copy button in the top-right corner instead of selecting and copying the text yourself.

Diagram
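To give a rough idea of what an HTML-to-Markdown conversion involves, here is a minimal sketch using only Python's standard library. It is not how Marky works internally (that isn't covered here); the names `MiniMarkdownifier` and `html_to_markdown` are made up for illustration, and the toy only handles headings, paragraphs, links, bold and italics. A real converter also has to pick out the useful content and deal with lists, images, code and tables.

```python
import re
from html.parser import HTMLParser


class MiniMarkdownifier(HTMLParser):
    """Toy converter (hypothetical): headings, paragraphs, links, bold, italics."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
    INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []  # output fragments, joined at the end
        self.hrefs = []  # stack of link targets for open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.parts.append("\n\n" + self.HEADINGS[tag])
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag in self.INLINE:
            self.parts.append(self.INLINE[tag])
        elif tag == "a":
            self.hrefs.append(dict(attrs).get("href") or "")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.INLINE:
            self.parts.append(self.INLINE[tag])
        elif tag == "a" and self.hrefs:
            self.parts.append("](" + self.hrefs.pop() + ")")

    def handle_data(self, data):
        # Collapse whitespace runs, roughly as a browser does when rendering
        self.parts.append(re.sub(r"\s+", " ", data))


def html_to_markdown(html):
    parser = MiniMarkdownifier()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Blocks are separated by blank lines; trim each and drop the empty ones
    blocks = [block.strip() for block in text.split("\n\n")]
    return "\n\n".join(block for block in blocks if block)


sample = ('<h1>Title</h1>\n<p>Some <strong>bold</strong> text and a '
          '<a href="https://example.com">link</a>.</p>')
print(html_to_markdown(sample))
# # Title
#
# Some **bold** text and a [link](https://example.com).
```

Unknown tags are simply ignored and their text is kept, which is about the crudest possible form of "cleaning".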

Take that, drop it into a .md file and you can use your text editor or favorite script to convert it to HTML again for easy reading. You'll have to use GitHub Flavored Markdown (very commonly supported nowadays due to its popularity) for all the bits to work, since the original Markdown spec is a bit sparse in terms of features.

Diagram

And there we go, this is the result. Mind you, the pictures are kept as links to the original website, so they will be loaded from there every time. Saving and referencing them locally is something this particular converter does not do.
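If you want the pictures stored next to your notes, the missing step could look something like the sketch below. It is only an illustration under a few assumptions: the function name `localize_images` and the `images` folder are hypothetical, the regular expression only catches plain `![alt](http://...)` images (no titles, no reference-style links), and two images with the same file name would collide.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Matches Markdown images with an absolute http(s) URL: ![alt](https://host/pic.png)
IMAGE_LINK = re.compile(r"!\[([^\]]*)\]\((https?://[^)\s]+)\)")


def localize_images(markdown_text, image_dir="images"):
    """Point remote image links at local files (hypothetical helper).

    Returns the rewritten text plus a list of (url, local_path) pairs
    that still need to be downloaded.
    """
    downloads = []

    def rewrite(match):
        alt, url = match.group(1), match.group(2)
        # Use the last path segment of the URL as the local file name
        name = PurePosixPath(urlparse(url).path).name or "image"
        local_path = f"{image_dir}/{name}"
        downloads.append((url, local_path))
        return f"![{alt}]({local_path})"

    return IMAGE_LINK.sub(rewrite, markdown_text), downloads
```

The function deliberately does no network access; each returned pair could then be fetched with something like `urllib.request.urlretrieve(url, local_path)` once the target folder exists.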

What's left is to talk about how I bulk convert my notes to a web-accessible, searchable format, but that's for the future.

And, as always, thanks for reading.

