EDDYMENS

Matching Markdown And HTML Headings Using Regex (JS)

The JavaScript code below matches the different types of headings in a Markdown string. This includes HTML headings, since Markdown allows raw HTML to be embedded in a document.

The different types of headings the script matches are:

  • Markdown hash-based headings: #, ##, ###, ####, etc.
  • Alternative (Setext-style) Markdown headings: a line of text underlined with ==== or ------.
  • HTML heading tags: <h1>, <h2>, and so on up to <h6>.
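To make the three cases easier to see, the sketch below writes each heading style as its own pattern and then joins them into one alternation. The group names follow the script in this article; the variable names (hashHeading, htmlHeading, altHeading, combined, sample) are illustrative only.

```javascript
// Each heading style as a separate pattern (variable names are illustrative).
const hashHeading = /(?<headerTag>#+)\s+(?<headingText>[^"\n]*)/;
const htmlHeading = /<(?<HTMLHeaderTag>h[1-6]).*?>(?<HTMLHeadingText>.*?)<\/h[1-6]>/;
const altHeading = /(?<altMDHeadingText>.+)\n(?<altMDHeadingTag>-+|=+)/;

// Join the pattern sources with | so a single regex tries each style in turn.
const combined = new RegExp(
  [hashHeading, htmlHeading, altHeading].map((r) => r.source).join("|"),
  "gm"
);

const sample = "# Title\nIntro\n\nSub\n---\n<h2>More</h2>";
const found = [...sample.matchAll(combined)].map((m) => m[0]);
console.log(found); // [ '# Title', 'Sub\n---', '<h2>More</h2>' ]
```

Because every named group appears only once across the three patterns, joining the sources does not cause a duplicate-name error.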

Script


<script>
// One pattern with three alternatives: hash headings, HTML heading tags,
// and alternative (underlined) Markdown headings.
const regex = /(?<headerTag>#+)\s+(?<headingText>[^"\n]*)|(<(?<HTMLHeaderTag>h[1-6]).*?>(?<HTMLHeadingText>.*?)<\/h[1-6]>)|(((?<altMDHeadingText>.+)\n)(?<altMDHeadingTag>-+|=+))/gm;

const str = `# Overview
AWS S3 is a cloud storage service that caters to the storage needs of modern software applications. S3 buckets can be used to host static sites.

## Getting started
Once you have your AWS account all set up you can log in and then use the search bar up top to search for the S3 service.

### Third-level header
 Third-level header content goes here.

#### Forth-level header 
 Fourth-level content goes here.

Alternative Heading Level 1
===========================
Alternative heading 1 text.

Alternative Heading Level 2
---------------------------
Alternative heading 2 text.

<h1>   HTML Header 1  </h1>
Level 1 heading

<h2> HTML Header 2  </h2>
Level 2 heading.

<h6> HTML Header 6  </h6>
Level 6 heading.`;

let m;
const headings = [];
while ((m = regex.exec(str)) !== null) {
    // Guard against an infinite loop on zero-length matches.
    if (m.index === regex.lastIndex) regex.lastIndex++;
    // Only the groups of the alternative that matched are defined,
    // so ?? picks whichever tag and text were actually captured.
    headings.push({
        headingTag : m.groups.HTMLHeaderTag ?? m.groups.altMDHeadingTag ?? m.groups.headerTag,
        headingText : m.groups.HTMLHeadingText ?? m.groups.altMDHeadingText ?? m.groups.headingText
    });
}
console.log(headings);
</script>
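The pattern in the script is fairly permissive: #+ can match in the middle of a line and accepts any number of hashes, and the closing HTML tag does not have to match the opening one. If that matters for your input, one possible tightening is sketched below. This is not the article's pattern, just a variant under those assumptions; strictRegex, text, and strictHeadings are hypothetical names.

```javascript
// A stricter variant (an assumption, not the pattern from the script above):
//  - hash headings must start a line and use 1 to 6 hashes
//  - the closing HTML tag must match the opening one (named backreference)
//  - alternative headings must start a line and the underline must end it
const strictRegex = /^(?<headerTag>#{1,6})[ \t]+(?<headingText>.*)$|<(?<HTMLHeaderTag>h[1-6])\b[^>]*>(?<HTMLHeadingText>.*?)<\/\k<HTMLHeaderTag>>|^(?<altMDHeadingText>.+)\n(?<altMDHeadingTag>=+|-+)[ \t]*$/gm;

const text = [
  "Use C# daily",      // a hash inside a sentence: not a heading
  "# Real heading",
  "<h2>Good</h2>",
  "<h2>Bad</h3>",      // mismatched closing tag: skipped
  "Title",
  "=====",
].join("\n");

const strictHeadings = [...text.matchAll(strictRegex)].map(({ groups: g }) => ({
  headingTag: g.HTMLHeaderTag ?? g.altMDHeadingTag ?? g.headerTag,
  headingText: g.HTMLHeadingText ?? g.altMDHeadingText ?? g.headingText,
}));

console.log(strictHeadings);
// [
//   { headingTag: '#', headingText: 'Real heading' },
//   { headingTag: 'h2', headingText: 'Good' },
//   { headingTag: '=====', headingText: 'Title' }
// ]
```

The trade-off is that indented hash headings and HTML headings spanning several lines are not matched by this variant.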

Output


[
   {
      "headingTag":"#",
      "headingText":"Overview"
   },
   {
      "headingTag":"##",
      "headingText":"Getting started"
   },
   {
      "headingTag":"###",
      "headingText":"Third-level header"
   },
   {
      "headingTag":"####",
      "headingText":"Forth-level header "
   },
   {
      "headingTag":"===========================",
      "headingText":"Alternative Heading Level 1"
   },
   {
      "headingTag":"---------------------------",
      "headingText":"Alternative Heading Level 2"
   },
   {
      "headingTag":"h1",
      "headingText":"   HTML Header 1  "
   },
   {
      "headingTag":"h2",
      "headingText":" HTML Header 2  "
   },
   {
      "headingTag":"h6",
      "headingText":" HTML Header 6  "
   }
]
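The headingTag values above mix three notations (a run of hashes, a run of = or -, and an HTML tag name), and some headingText values keep surrounding whitespace. If a uniform shape is easier to work with, a small post-processing step could convert each entry to a numeric level and trimmed text. The following is a minimal sketch; toLevel and normalize are hypothetical helpers, not part of the original script.

```javascript
// Hypothetical helper: map a raw headingTag to a numeric level (1 to 6).
function toLevel(tag) {
  if (/^#+$/.test(tag)) return tag.length; // "#" is 1, "##" is 2, ...
  if (/^=+$/.test(tag)) return 1;          // underline of = marks level 1
  if (/^-+$/.test(tag)) return 2;          // underline of - marks level 2
  const html = /^h([1-6])$/i.exec(tag);    // "h1" to "h6"
  return html ? Number(html[1]) : null;    // null for anything unrecognized
}

// Hypothetical helper: uniform { level, text } shape with trimmed text.
function normalize(heading) {
  return {
    level: toLevel(heading.headingTag),
    text: heading.headingText.trim(),
  };
}

// A few entries copied from the output above.
const sampleHeadings = [
  { headingTag: "####", headingText: "Forth-level header " },
  { headingTag: "===========================", headingText: "Alternative Heading Level 1" },
  { headingTag: "h6", headingText: " HTML Header 6  " },
];

const normalized = sampleHeadings.map(normalize);
console.log(normalized);
// [
//   { level: 4, text: 'Forth-level header' },
//   { level: 1, text: 'Alternative Heading Level 1' },
//   { level: 6, text: 'HTML Header 6' }
// ]
```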
