Solution 1 :

You may try the below regex:

(?!<)^(https?://(?:www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_+.~#?&//=]*))(?!>)$

Explanation of the above regex:

  • (?!<) – A negative look-ahead that rejects the line if it starts with a <.

  • ^, $ – Match the start and end of a line respectively.

  • (https?://(?:www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_+.~#?&//=]*)) – Matches the URL itself (scheme, optional www., host, a dot-separated suffix of 1 to 6 characters, and an optional path/query part) and captures it as group 1.

  • (?!>) – A negative look-ahead that rejects the match if the URL is followed by a >.
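Since the question also mentions Python, here is a rough sketch of the same substitution using the standard re module. The names BARE_URL_LINE and wrap_bare_urls are made up for this illustration, and the duplicated / in the last character class is dropped because it has no effect. Because of the ^ and $ anchors, only URLs that sit alone on a line get wrapped; a URL in the middle of a sentence is left untouched.

```python
import re

# Same pattern as above; re.MULTILINE makes ^ and $ apply to every line
# instead of only to the start and end of the whole text.
BARE_URL_LINE = re.compile(
    r"(?!<)^(https?://(?:www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}"
    r"\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_+.~#?&/=]*))(?!>)$",
    re.MULTILINE,
)


def wrap_bare_urls(text: str) -> str:
    """Wrap every line that consists solely of a URL in angle brackets."""
    return BARE_URL_LINE.sub(r"<\1>", text)


if __name__ == "__main__":
    sample = (
        "https://asdf.com\n"
        "<https://asdf.com>\n"
        "[asdf](https://asdf.com)\n"
    )
    print(wrap_bare_urls(sample), end="")
```

Lines that already start with < or [ never match at ^, so existing autolinks and Markdown links pass through unchanged.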


A demo of the above regex is available here.


NOTE: I would also prefer perl for an implementation in bash. But if you are required to use sed, you can try the command below. Please note, however, that sed lacks several useful regex features, such as look-arounds and non-capturing groups.

sed -E 's@^[^<]?(https?://(www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_+.~#?&/=]*))[^>]?$@<\1>@gm'

A sample run of the perl and sed implementations is available here.

Solution 2 :

Use pandoc to convert between different Markdown flavors.

example:

pandoc -f gfm+hard_line_breaks -t markdown_strict in.md -o out.md

Here, gfm means GitHub-Flavored Markdown.

demo:

pandoc -f gfm+hard_line_breaks -t markdown_strict <<<$'
https://asdf.com
<https://asdf.com>
[asdf](https://asdf.com)
' | perl -pe 's/\n/¶\n/'

<https://asdf.com>  ¶
<https://asdf.com>  ¶
[asdf](https://asdf.com)¶

My example also converts from hard line breaks (\n renders as a line break) to soft line breaks (\n renders as a space). I added pilcrows (¶) to make the line endings visible in the output.


Problem :

I notice that some Markdown parsers and GitHub will auto-convert bare URLs to links, but others (like Kramdown) don’t. The standard Markdown syntax requires that URLs be wrapped in angle brackets, e.g. <https://www.google.com/>.

I have a number of documents with bare URLs that appear as desired, i.e. as hyperlinks, in my Markdown editor but are not getting rendered as links when I push them in Jekyll to GitHub Pages.

How can I write a script to surround bare URLs with angle brackets?
Preferably via shell scripting, standard command line tools (sed, awk) or Python. Or perhaps there’s already a Jekyll plugin for this?

I know that matching URLs is highly nontrivial, so wanted to ask here on SO before getting too deep into this.

Further difficulty: The solution should only change bare URLs, and leave alone URLs that have already been wrapped/encoded via standards-compliant Markdown or HTML.

(I expected this to be a common question, and it is in various GitHub-Issues posts for various packages, with no solutions… But tried searching for this question here and couldn’t find it already asked, nor any premade Jekyll solutions. I found many questions about matching when the angle brackets are already there, but not ones to add the angle brackets. Yet I’m imagining the solution has been implemented many, many times — in the very tools we use, such as GitHub and MathOverflow — so, not sure why the means to do this isn’t widely posted.)

Comments

Comment posted by this

Does

Comment posted by github/markup

GitHub documents their process in

Comment posted by Waylan

… Either way, note that this happens after the Markdown is converted to HTML. Presumably GitHub is passing the HTML into an HTML parser, stepping through the document tree, and running filters on the text. This makes it easy to avoid code blocks, existing links, etc.
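The post-processing idea in this comment can be sketched in Python with the standard html.parser module. This is only an illustration of the general approach, not GitHub's actual pipeline: the Linkifier class, the linkify_html function, the simplified URL pattern and the set of skipped tags are all assumptions made for this sketch. It also assumes an HTML fragment as produced by a Markdown renderer, without script or style elements.

```python
import re
from html import escape
from html.parser import HTMLParser

# Deliberately simple URL pattern (an assumption for this sketch only).
URL = re.compile(r"https?://[^\s<>\"']+")
# Elements whose text is left untouched (assumed set).
SKIP_TAGS = {"a", "code", "pre"}


class Linkifier(HTMLParser):
    """Re-emit HTML, turning bare URLs in ordinary text nodes into links."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True: text arrives unescaped
        self.out = []
        self.skip_depth = 0  # > 0 while inside <a>, <code> or <pre>

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        self.out.append(self.get_starttag_text())  # original tag text

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # e.g. <br/>

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_data(self, data):
        if self.skip_depth:
            # Inside a link or code: only re-escape, do not linkify.
            self.out.append(escape(data, quote=False))
            return
        pos = 0
        for m in URL.finditer(data):
            self.out.append(escape(data[pos:m.start()], quote=False))
            url = m.group(0)
            self.out.append(
                f'<a href="{escape(url)}">{escape(url, quote=False)}</a>'
            )
            pos = m.end()
        self.out.append(escape(data[pos:], quote=False))


def linkify_html(html: str) -> str:
    parser = Linkifier()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(linkify_html(
        "<p>See https://a.example/x and "
        '<a href="https://b.example">https://b.example</a></p>'
        "<pre>https://c.example</pre>"
    ))
```

Working on the parsed HTML rather than on the Markdown source is what makes it straightforward to leave existing links and code blocks alone, at the cost of running after the Markdown conversion instead of before it.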

Comment posted by sh37211

@Mandy8055 wow, that worked! It even leaves alone the strings I added containing normal HTML and Markdown links. Even a big-long complicated google search string with many unusual characters!

Comment posted by sh37211

I have confirmed that either the perl or the sed expressions work for what I need, so that’s sufficient for accepting your answer. The preference for sed over perl was just because I’m proposing a change to another person’s software suite, and whereas sed is always guaranteed to be installed, I’m not sure if he added perl to his docker image (or wants to), so that would be an additional dependency. For my personal use, perl is fine. Maybe perl is also ‘standard’, I just wanted to make it ‘easy to adopt’ my change by using sed.

Comment posted by sh37211

If you like, go for it! I have been unable to come up with a URL that ‘breaks’ the perl regex as is, but a more robust solution is always welcome!
