HTML tags in content will make the parsed result a mess

desk-user · January 24, 2018, 4:09am

Thanks for your great tool! When I use Push to Kindle to send a page from Safari Books Online, it does not handle HTML tags that appear as text in the article content. It seems to treat them as real HTML tags.

With a tool like egrep, it doesn’t seem particularly common or useful to simply match lines with HTML tags. But, exploring a regular expression that matches HTML tags exactly can be quite fruitful, especially when we delve into more advanced tools in the next chapter.

Looking at simple cases like ‘’ and ‘

’, we might think to try ⌈<.>⌋. This simplistic approach is a frequent first thought, but it’s certainly incorrect. Converting ⌈<.>⌋ into English reads “match a ‘<’

fivefilters · January 25, 2018, 12:55am

Hi there, could you please give us the URL of the page you’re trying to send so we can take a look?