Generating Multiple Jekyll RSS Feeds

Whelp, my “resolution” to post to this blog every 90 days or so has failed. It lasted about a year, but a few months ago I got an email notification that my RSSby.email subscription had been automatically cancelled due to inactivity. I guess one good thing is that now I know I’ll get an unsubscribe notice… but sorry to all my RSSby.email followers who now have to go through the hassle of resubscribing.

One of the things that has kept me from posting is that my blog is being indexed by Planet Emacs. I only found out from a comment that someone left, but apparently it’s been in there since March 2020. Now I feel bad for all those non-emacs posts! My solution is to create an emacs-only RSS feed for my site.

Note: If you are using GitHub Pages to deploy your site, skip to Step 4: Slightly Less Generation.

Step 1: Create the File Generator

Jekyll has the capability to generate whole pages. That is, no .md file exists in the source tree, but a .html file is generated and appears on the site. This requires writing a bit of Ruby code (a Jekyll plugin), but an easy-to-use template is included in the Jekyll documentation on generators.

module GenerateTagRSS
  class TagRSSGenerator < Jekyll::Generator
    safe true

    # loop to generate each feed file
    def generate(site)
      site.tags.each do |tag, posts|
        Jekyll.logger.info "\tGenerating feed for tag: #{tag}"
        # call to TagRSSPage instantiation function
        site.pages << TagRSSPage.new(site, tag, posts)
      end
    end
  end

  # Subclass of `Jekyll::Page` with custom method definitions.
  # TagRSSPage instantiation
  # Definition of what an RSS file looks like
  class TagRSSPage < Jekyll::Page
    def initialize(site, tag, posts)
      @site = site             # the current site instance.
      @base = site.source      # path to the source directory.
      @dir  = 'blorg/feeds'    # the directory the page will reside in.

      @basename = tag          # filename without the extension.
      @ext      = '.xml'       # the extension.
      @name     = tag + '.xml' # basically @basename + @ext.


      # Initialize data hash with a key pointing to all posts under the current tag.
      # This allows accessing the list in a template via `page.linked_docs`.
      @data = {
        'linked_docs' => posts
      }

      # Look up front matter defaults scoped to type `tag` if the given key
      # doesn't exist in the `data` hash.
      data.default_proc = proc do |_, key|
        site.frontmatter_defaults.find(relative_path, :tag, key)
      end
    end

    # Placeholders that are used in constructing page URL.
    def url_placeholders
      {
        :path       => @dir,
        :basename   => basename,
        :output_ext => output_ext,
      }
    end
  end
end

File GenerateRSSTags.rb

Now I don’t understand that code entirely, but I have verified that it does work.
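
One part I can at least reason about is url_placeholders, which decides where each feed ends up. Here is a rough plain-Ruby sketch of the idea. It is not Jekyll's actual implementation: it assumes a URL template of the form /:path/:basename:output_ext, and expand_url is a hypothetical helper made up for illustration.

```ruby
# Hypothetical helper (not part of Jekyll): fill a URL template with the
# values from a placeholders hash like the one url_placeholders returns.
def expand_url(template, placeholders)
  template.gsub(/:([a-z_]+)/) do
    placeholders.fetch(Regexp.last_match(1).to_sym).to_s
  end
end

# Assumed template shape; the values mirror what TagRSSPage sets
# for a tag named "emacs".
placeholders = {
  path: 'blorg/feeds',
  basename: 'emacs',
  output_ext: '.xml'
}

feed_url = expand_url('/:path/:basename:output_ext', placeholders)
puts feed_url
```

Under those assumptions, the emacs tag ends up at the same path as the permalink used in Step 4.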

Step 2: Template for the Generated Files

The R community has a similar “R topics” RSS aggregator, with an explicit requirement that included feeds be limited to on-topic posts. This post guides you through doing that by hand, for a single Jekyll tag. It provides a basic template for an RSS XML file (only slightly adapted below).

  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
      <channel>
          <title>{{ site.title | xml_escape }}</title>
          <description>{{ site.description | xml_escape }}</description>
          <link>{{ site.url }}{{ site.baseurl }}/</link>
          <atom:link href="{{ page.url | prepend: site.baseurl | prepend: site.url }}" rel="self" type="application/rss+xml"/>
          <pubDate>{{ site.time | date_to_rfc822 }}</pubDate>
          <lastBuildDate>{{ site.time | date_to_rfc822 }}</lastBuildDate>
          <generator>Jekyll v{{ jekyll.version }}</generator>
          {% for post in page.linked_docs %}
              <item>
                  <title>{{ post.title | xml_escape }}</title>
                  <author>{{ site.email | xml_escape }} ({{ site.author | xml_escape }})</author>
                  <description>{{ post.content | xml_escape }}</description>
                  <pubDate>{{ post.date | date_to_rfc822 }}</pubDate>
                  <link>{{ post.url | prepend: site.baseurl | prepend: site.url }}</link>
                  <guid isPermaLink="true">{{ post.url | prepend: site.baseurl | prepend: site.url }}</guid>
                  {% for tag in post.tags %}
                  <category>{{ tag | xml_escape }}</category>
                  {% endfor %}
                  {% for cat in post.categories %}
                  <category>{{ cat | xml_escape }}</category>
                  {% endfor %}
              </item>
          {% endfor %}
      </channel>
  </rss>

File _layouts/rss.html

Some notes on this snippet:

  • This layout file has the .html extension. This is an assumption/requirement of Jekyll; our Ruby generator code is what sets the output file extension to .xml.
  • The posts which will be included are passed via the page.linked_docs variable.

Of course it’s important to make sure Jekyll knows that the generated files should default to using this layout. That is done in the _config.yml file.

defaults:
  - scope:
      type: tag
    values:
      layout: rss

Snippet from _config.yml
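
As far as I can tell, this config entry only takes effect because of the default_proc set in the generator: when a key such as layout is missing from the page's data hash, the lookup falls through to the defaults. The sketch below mimics that fallback in plain Ruby. FRONT_MATTER_DEFAULTS and build_page_data are hypothetical stand-ins, not Jekyll APIs.

```ruby
# Stand-in for the `defaults` section of _config.yml
# (assumption: a single scope, type "tag").
FRONT_MATTER_DEFAULTS = { tag: { 'layout' => 'rss' } }.freeze

# Build a page data hash that falls back to the defaults for missing keys,
# similar in spirit to the default_proc set in TagRSSPage#initialize.
def build_page_data(explicit, type)
  data = explicit.dup
  data.default_proc = proc do |_hash, key|
    FRONT_MATTER_DEFAULTS.fetch(type, {})[key]
  end
  data
end

data = build_page_data({ 'linked_docs' => %w[post-a post-b] }, :tag)

puts data['layout']              # resolved through the fallback
puts data['linked_docs'].length  # found directly in the hash
puts data.key?('layout')         # the fallback does not store the key
```

The fallback is computed on every lookup rather than copied into the hash, so changing the defaults changes what the page sees.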

Step 3: Getting it to Run

By default Jekyll will run any generators it finds in the _plugins directory. I included a nice log message in the generation loop, so that I have some indication that each tag is being generated. Building my site locally is a complete success!

The one slight issue is that GitHub Pages won’t run unsupported plugins, such as the generator we just wrote. So much of this post feels moot. The whole reason I’m still using the Jekyll stack is that GitHub takes care of generating the site for me.

Step 4: Slightly Less Generation

Since I couldn’t get GitHub Pages to actually generate the files from whole cloth, I had to do a bit by hand. This is relatively easy: given a list of the tags that exist on my site, I simply create the following markdown file for each tag:

---
layout: rss
permalink: /blorg/feeds/emacs.xml
tagname: emacs
---

Jekyll frontmatter for tag feed file

Then, in the RSS layout (from Step 2), instead of looping over the page.linked_docs variable, I use:

{% for post in site.posts %}
{% if post.tags contains page.tagname %}

Updated liquid loop for _layouts/rss.html file.

(Don’t forget to close your if statement with an {% endif %} just before the end of the loop.) This solution requires me to remember to create the tag markdown file whenever I use a new tag, but it works!
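
If remembering gets tedious, a small script run locally before pushing could write the stub files instead. The following is only a sketch under a few assumptions: posts live in _posts with YAML front matter, tags are given either as a YAML list or a space-separated string, and the stubs go into a directory of your choosing. collect_tags and write_feed_stubs are hypothetical names, not part of Jekyll.

```ruby
require 'yaml'
require 'fileutils'
require 'date'

# Collect every tag used in the front matter of the files under posts_dir.
def collect_tags(posts_dir)
  Dir.glob(File.join(posts_dir, '*')).sort.flat_map do |path|
    next [] unless File.file?(path)

    match = File.read(path).match(/\A---\s*\n(.*?)\n---\s*$/m)
    next [] unless match

    front_matter = YAML.safe_load(match[1], permitted_classes: [Date, Time]) || {}
    tags = front_matter['tags']
    tags = tags.split if tags.is_a?(String) # space-separated form
    Array(tags).map(&:to_s)
  end.uniq.sort
end

# Write one stub per tag, skipping stubs that already exist.
# Returns the list of files that were created.
def write_feed_stubs(tags, feeds_dir)
  FileUtils.mkdir_p(feeds_dir)
  tags.each_with_object([]) do |tag, created|
    stub = File.join(feeds_dir, "#{tag}.md")
    next if File.exist?(stub)

    File.write(stub, <<~STUB)
      ---
      layout: rss
      permalink: /blorg/feeds/#{tag}.xml
      tagname: #{tag}
      ---
    STUB
    created << stub
  end
end
```

Calling write_feed_stubs(collect_tags('_posts'), 'blorg/feeds') from the site root would then create any missing stubs; the blorg/feeds location is a guess, since the permalink determines the final URL no matter where the stub file lives.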

Step 5: Profit

The final step is to share these RSS feeds with the world. I’ve created a whole /blorg/feeds/ page, which uses the following Liquid to generate a nice list of topics with a link to each RSS feed (somewhat simplified here):

{% for tag in site.tags %}
  {% capture tagname %}{{ tag | first }}{% endcapture %}
  <a href="/blorg/feeds/{{ tagname }}.xml">{{ tagname }}</a>
{% endfor %}

Liquid snippet for generating a list of tags.
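
The `tag | first` step is needed because site.tags maps each tag name to its posts, so looping over it yields name-and-posts pairs rather than bare names. A plain-Ruby sketch of the same shape, where Post and group_by_tag are hypothetical stand-ins rather than Jekyll's own classes:

```ruby
# Hypothetical stand-in for a Jekyll post; only the fields used here.
Post = Struct.new(:title, :tags)

posts = [
  Post.new('Org-mode tricks', %w[emacs org]),
  Post.new('Multiple RSS feeds', %w[jekyll emacs]),
  Post.new('Sourdough notes', %w[baking])
]

# Group posts by tag, giving a hash shaped like site.tags: name => posts.
def group_by_tag(posts)
  grouped = Hash.new { |hash, tag| hash[tag] = [] }
  posts.each_with_object(grouped) do |post, tags|
    post.tags.each { |tag| tags[tag] << post }
  end
end

site_tags = group_by_tag(posts)

# Iterating a hash yields [name, posts] pairs, so pair.first is the tag name.
feed_links = site_tags.map do |pair|
  name = pair.first
  %(<a href="/blorg/feeds/#{name}.xml">#{name}</a>)
end

puts feed_links
```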

Of course, you can find all this code in my site’s GitHub repository.

