Auto generate Heading Anchors using HTML AgilityPack DOM Manipulation
Niels Swimberghe - .NET
For very long documents it can be hard to share a specific segment with others. One way websites commonly solve this is by providing "Heading Anchors".
I'm not sure if "Heading Anchors" is the correct term, but it's the most descriptive name I've come across. A heading anchor is a hyperlink that an article provides for each of its headings so readers can deep link to that heading. When you browse to the link, the browser scrolls directly to the heading. Heading anchors are often implemented by adding the pound sign "#" as a hyperlink next to the heading. css-tricks.com is a nice example of this.
Manually adding an anchor to every heading would be a painful solution. So let's learn how we can achieve this by generating the Heading Anchors using the HTML AgilityPack .NET library.
Generate Heading Anchors using HTML AgilityPack #
HTML AgilityPack (HAP) is a .NET library for parsing, querying, and manipulating HTML, and those are exactly the operations we need here.
To generate the Heading Anchors we'll need to:
- Parse our HTML wherever the HTML is coming from (Database, CMS, etc.)
- Select our headings using XPath
- Add "#" anchors using DOM Manipulation
- Output the manipulated HTML
To follow along, you can use this GitHub repository containing all the sample code. There's more relevant code in the repository, but the important part is the following function:
    public string AddHeadingAnchorsToHtml(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // select all possible headings in the document
        var headings = doc.DocumentNode.SelectNodes("//h1 | //h2 | //h3 | //h4 | //h5 | //h6");

        // SelectNodes returns null (not an empty collection) when nothing matches
        if (headings != null)
        {
            foreach (var heading in headings)
            {
                var headingText = heading.InnerText;

                // if heading has id, use it
                string headingId = heading.Attributes["id"]?.Value;
                if (headingId == null)
                {
                    // if heading does not have an id, generate a safe id by creating a slug based on the heading text
                    // a slug is a URL/SEO friendly part of a URL, which makes it a good option for generating anchor fragments
                    // Source: http://predicatet.blogspot.com/2009/04/improved-c-slug-generator-or-how-to.html
                    // assumption: phrase should only contain standard a-z characters or numbers
                    headingId = ToSlug(headingText);

                    // for the fragment to work (jump to the relevant content), the heading id and fragment need to match
                    heading.Attributes.Append("id", headingId);
                }

                // use a non-breaking space to make sure the heading text and the #-sign don't end up on separate lines
                heading.InnerHtml += "&nbsp;";

                // create the heading anchor which points to the heading
                var headingAnchor = HtmlNode.CreateNode($"<a href=\"#{headingId}\" aria-label=\"Anchor for heading: {headingText}\">#</a>");

                // append the anchor after the heading text content
                heading.AppendChild(headingAnchor);
            }
        }

        return doc.DocumentNode.InnerHtml;
    }
In summary, the above code does the following:
- Parse the HTML by using the HtmlDocument.LoadHtml function
- Select all headings by passing an XPath query to the DocumentNode.SelectNodes function
- Iterate over each heading and
  - Generate an ID for each heading by slugifying the text in the heading. The ToSlug method is based on this article. If the heading already has an ID, we reuse it.
  - Create an HTML anchor and set its href attribute to a fragment URL generated from the heading ID
  - Append the anchor to the heading so the '#'-anchor shows up next to the heading text
- Return the manipulated HTML
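The function above calls ToSlug without showing it. The following is a minimal sketch of what such a helper could look like, not the implementation from the sample repository or the linked article. The class name SlugHelper is hypothetical, and the sketch assumes headings are mostly Latin text: accents are stripped, everything outside a-z, 0-9, whitespace, and hyphens is dropped, and runs of whitespace become single hyphens.

```csharp
using System;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper class; the name and exact rules are assumptions for illustration.
public static class SlugHelper
{
    public static string ToSlug(string phrase)
    {
        if (string.IsNullOrWhiteSpace(phrase))
        {
            return string.Empty;
        }

        // Decompose accented characters into base character + combining mark,
        // then drop the combining marks (for example "é" becomes "e").
        var decomposed = phrase.Normalize(NormalizationForm.FormD);
        var builder = new StringBuilder(decomposed.Length);
        foreach (var c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                builder.Append(c);
            }
        }

        var slug = builder.ToString().ToLowerInvariant();

        // Remove everything except a-z, 0-9, whitespace, and hyphens.
        slug = Regex.Replace(slug, @"[^a-z0-9\s-]", "");

        // Collapse runs of whitespace and hyphens into a single hyphen.
        slug = Regex.Replace(slug, @"[\s-]+", "-");

        // Avoid leading or trailing hyphens.
        return slug.Trim('-');
    }
}
```

With this sketch, a heading such as "Getting Started!" would get the ID "getting-started". Two headings with identical text would receive the same ID, so a real implementation may want to append a counter to keep IDs unique.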
If you play around with the sample, you'll see the HTML is coming from an HTML file stored on the server and the resulting HTML is returned directly to the browser. The result looks like this:
IMPORTANT NOTE: Parsing, querying, and manipulating the DOM is a resource-intensive task. Keep that in mind when using HTML AgilityPack, and cache the generated HTML if the same content is served repeatedly.
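As a rough illustration of the caching advice, here is one possible sketch that wraps any HTML transformation (such as AddHeadingAnchorsToHtml) in a simple in-memory cache. The class name CachedHtmlTransformer and the cacheKey parameter are hypothetical and not part of the sample repository. The sketch assumes the caller can supply a key that changes whenever the content changes, for example a page ID combined with a last-modified timestamp.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical wrapper; names are assumptions for illustration only.
public class CachedHtmlTransformer
{
    private readonly Func<string, string> transform;
    private readonly ConcurrentDictionary<string, string> cache =
        new ConcurrentDictionary<string, string>();

    public CachedHtmlTransformer(Func<string, string> transform)
    {
        this.transform = transform ?? throw new ArgumentNullException(nameof(transform));
    }

    // Returns the cached result for cacheKey, or runs the transformation once and stores it.
    // Note: under heavy contention the transformation may run more than once for the
    // same key, but only one result is kept.
    public string GetOrCreate(string cacheKey, string html)
    {
        return cache.GetOrAdd(cacheKey, _ => transform(html));
    }
}
```

A caller could construct it as new CachedHtmlTransformer(AddHeadingAnchorsToHtml) and call GetOrCreate for each request. This cache never evicts entries, so for anything beyond a small, fixed set of documents a cache with expiration is likely a better fit.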
Summary #
Using the HTML AgilityPack library, we parsed, queried, and manipulated HTML to generate Heading Anchors for a richer URL sharing experience.
BONUS: Using the scroll-behavior CSS property (for example, scroll-behavior: smooth on the html element), we can enable a smooth scroll animation in supporting browsers.