Chris Bailey

Integrating GitHub Issues as a Blogging Platform

Originally, when I started this blog, I threw together a static site generator as an ad hoc Elixir script.

This was really quick and fun to do, but as I started actually drafting posts I realised that there were a few things missing.

At a high level, all the original script did was concatenate HTML fragments, which was fine for a proof of concept but didn't scale well. I wanted to be able to write in something like Markdown, as well as have niceties such as automatic syntax highlighting.

Whilst it's completely possible to roll your own Markdown rendering and highlighting, doing so easily tripled the size of my generator script and made it slower to run. I had settled on getting Elixir to shell out to pandoc to compile Markdown into HTML and apply syntax highlighting, but at this point I started to worry about the future maintainability of the script.

If only there were a free service I used every day that handled rendering Markdown, with comprehensive syntax highlighting for all the languages I care about and more...

I felt as though I had a revelation—I'll implement my blog atop GitHub!

Implementation

I haven't had the chance to really use the GitHub API in the past, but I've only heard good things about it.

At my current project, we work extensively with GraphQL, crafting APIs for a whole range of clients to consume. One of the benefits of GraphQL is that clients can essentially build custom responses which return exactly the data they want in a small, optimal payload suited to their use case.

Because GitHub exposes a public GraphQL API, I thought this would be a great chance to jump in and experience what all the client-side hype was about.

The first iteration I explored simply fetched files that existed as part of the git repo I was pointing to. This had the advantage of letting me traverse directories, store arbitrary filetypes (so I could write in HTML if I ever wanted to), etc., but it came with one huge downside: git as a protocol doesn't store file metadata (e.g. file creation time), which meant I would have to supplement this somehow. You can get around the issue by taking the timestamp of a file's first commit, or by prepending metadata to the file itself (as is common with many static site generators anyway), but I didn't want to do either.

Looking for a subjectively purer and simpler solution, I realised that I could fetch repository issues with the GitHub API too. Doing so not only gives me almost everything that fetching files from the repository does, but it also gives me the ability to apply labels to my posts to implement taxonomies and filtering easily, as well as comments! Deciding this to be a fun approach, I started hacking around.

Using the GraphiQL explorer to play around with GitHub's graph, I was able to quickly throw together a query to fetch the issues I wanted:

query($owner: String!, $name: String!) { 
  repository(owner: $owner, name: $name) {
    nameWithOwner,
    issues(first: 10, orderBy: {field: CREATED_AT, direction: DESC}) {
      nodes {
        number
        bodyHTML
        createdAt
        updatedAt
        title
        labels(first: 10) {
          nodes {
            name
          }
        }     
      }
      totalCount
    }
  }
}

This snippet just gets the first ten issues from a given repository, ordered by CREATED_AT descending. If you're concerned that anyone can open issues in a given repo, you could further constrain the query to issues created by the repository owner with something like:

issues(filterBy: {createdBy: $owner}, first: 10, orderBy: {field: CREATED_AT, direction: DESC})

Parsing the result of running this query was pretty simple, since it just returns some (admittedly pretty nested) JSON.
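For example, digging the issues out of the decoded response is just a couple of get_in/2 calls. A rough sketch, assuming response is the string-keyed map you get back from decoding the result of the query above:

# `response` is the decoded JSON from the query above (string keys throughout).
issues = get_in(response, ["data", "repository", "issues", "nodes"]) || []

posts =
  for issue <- issues do
    %{
      number: issue["number"],
      title: issue["title"],
      body_html: issue["bodyHTML"],
      created_at: issue["createdAt"],
      labels: Enum.map(get_in(issue, ["labels", "nodes"]) || [], & &1["name"])
    }
  end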

I rolled my own simple GraphQL client because I didn't think it was going to be too complex. I'm sure libraries exist, but for reference, I arrived at something similar to the following:

defmodule GraphQL.Request do
  @moduledoc """
  Implements a GraphQL client which can be used as follows:

  > variables = %{}
  > GraphQL.Request.call("query { viewer { login } }", variables)
  
  %{"data" => %{"viewer" => %{"login" => ...}}, ...}
  """

  @github_url "https://api.github.com/graphql"
  # Read at compile time; a stand-in for however you actually store the token.
  @access_token System.fetch_env("GITHUB_ACCESS_TOKEN")

  @spec call(String.t(), map()) :: map() | {:error, term()}
  def call(query, variables \\ %{}) do
    with {:ok, payload} <- build_payload(query, variables),
         {:ok, access_token} <- build_access_token(),
         {:ok, %{status_code: 200} = response} <- do_request(payload, access_token),
         {:ok, result} <- process_response(response) do
      result
    else
      {:ok, %{status_code: status}} -> {:error, reason: :unexpected_status, status: status}
      error -> error
    end
  end

  defp build_payload(query, variables) do
    case Jason.encode(%{query: query, variables: variables}) do
      {:ok, payload} -> {:ok, payload}
      error -> {:error, reason: :encode_query_error, error: error}
    end
  end

  defp build_access_token() do
    case @access_token do
      {:ok, token} ->
        {:ok, "Bearer " <> token}

      error ->
        {:error, reason: :build_access_token_error, error: error}
    end
  end

  defp do_request(payload, access_token) do
    # GitHub's GraphQL endpoint expects the token in an Authorization header.
    headers = [{"Authorization", access_token}, {"Content-Type", "application/json"}]

    case HTTPoison.post(@github_url, payload, headers) do
      {:ok, %HTTPoison.Response{} = response} ->
        {:ok, response}

      error ->
        {:error, reason: :do_request_error, error: error}
    end
  end

  defp process_response(%HTTPoison.Response{body: body}) do
    case Jason.decode(body) do
      {:ok, result} ->
        {:ok, result}

      error ->
        {:error, reason: :process_response_error, error: error}
    end
  end
end

I simply parse the result of my GraphQL request on every page request, grab the rendered HTML out of the response and render it with Phoenix inside a nice template 😆
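Concretely, the controller ends up as something like the sketch below (BlogWeb.PostController, the owner/name values and the trimmed-down query are illustrative stand-ins rather than my exact code, and error handling is elided):

defmodule BlogWeb.PostController do
  use BlogWeb, :controller

  # A trimmed-down version of the query shown earlier.
  @issues_query """
  query($owner: String!, $name: String!) {
    repository(owner: $owner, name: $name) {
      issues(first: 10, orderBy: {field: CREATED_AT, direction: DESC}) {
        nodes { number title bodyHTML createdAt }
      }
    }
  }
  """

  def index(conn, _params) do
    result = GraphQL.Request.call(@issues_query, %{"owner" => "my-user", "name" => "my-blog-repo"})
    issues = get_in(result, ["data", "repository", "issues", "nodes"]) || []

    # bodyHTML comes back already rendered by GitHub, so the template just
    # needs to output it unescaped (e.g. with raw/1).
    render(conn, "index.html", issues: issues)
  end
end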

If you're currently reading this post, you can see that this approach works pretty well! You only get a few thousand API requests per hour, however, so in the future I might end up revisiting this a little and caching posts in an ETS table or something if that ever becomes a problem. You can see the source of this post if you're interested in that too.

There are a few nice things we can do from here on out:

  1. I'd be really interested in hacking together a comment system out of GitHub's issue comments. Commenters would need GitHub accounts, but given that I'm a software engineer and I intend to write about software engineering, that's probably fine.

  2. I've yet to implement labeling and taxonomies in any way. The GraphQL query snippet above already fetches labels, but I haven't really thought about how I'll actually use them; a first pass might look like the sketch just after this list.

  3. It might be a little tricky to add some subtext to posts (e.g. reading time) unless I add it to the contents of the issue itself. I could always do some postprocessing of GraphQL responses, but I want this approach to kind of just work...
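For the labeling point above, a first pass could just filter the already-fetched issues on the Elixir side. A rough sketch, assuming the string-keyed issue maps produced by the query earlier (the filter_by_label name is purely illustrative):

filter_by_label = fn issues, label ->
  # Keep only issues that carry the given label, e.g. to build a per-topic index.
  Enum.filter(issues, fn issue ->
    labels = get_in(issue, ["labels", "nodes"]) || []
    Enum.any?(labels, &(&1["name"] == label))
  end)
end

filter_by_label.(issues, "elixir")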

I'll continue playing around with my blog and iterating on this approach but all in all, retrospectively, this is a pretty comfortable blogging setup 😈

edit:
I'm still using this approach after a few months, but I've iterated on it a tonne. The main change has been caching issues in an ETS table, primarily to reduce latency; pages load much more quickly now than they did before. Doing something with pushes from GitHub might be nice, but I think pulling changes is much more resilient. Otherwise I've had nothing bad to say about this approach: it really does just work 😋
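The cache itself doesn't need to be anything clever. A minimal sketch of the sort of thing I mean (the module name, table name, refresh interval and trimmed-down query are illustrative rather than my exact implementation):

defmodule Blog.IssueCache do
  use GenServer

  @table :blog_issues
  @refresh_after :timer.minutes(10)

  def start_link(_opts), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  # Page requests read straight from ETS and never wait on GitHub.
  def get(number) do
    case :ets.lookup(@table, number) do
      [{^number, issue}] -> {:ok, issue}
      [] -> :not_found
    end
  end

  @impl true
  def init(nil) do
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])
    send(self(), :refresh)
    {:ok, nil}
  end

  @impl true
  def handle_info(:refresh, state) do
    for issue <- fetch_issues(), do: :ets.insert(@table, {issue["number"], issue})
    Process.send_after(self(), :refresh, @refresh_after)
    {:noreply, state}
  end

  # Wraps the GraphQL request from earlier; returns [] on failure so the
  # previously cached posts are kept rather than wiped.
  defp fetch_issues do
    query =
      "query($owner: String!, $name: String!) { repository(owner: $owner, name: $name) { issues(first: 50) { nodes { number title bodyHTML createdAt } } } }"

    case GraphQL.Request.call(query, %{"owner" => "my-user", "name" => "my-blog-repo"}) do
      %{"data" => _} = result -> get_in(result, ["data", "repository", "issues", "nodes"]) || []
      _error -> []
    end
  end
end

Started under the application's supervision tree, reads come straight out of ETS while the GenServer refreshes from GitHub in the background.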

I also don't know how hacky it is, but because I'm using CSS Grid for the layout of this site, I dealt with the problem of adding a subtitle to a post's h1:first-of-type by rearranging the layout of my posts with grid areas. I have the following snippet of CSS:

sub + h1 {
  grid-area: postHeaderAboveSub;
}

where postHeaderAboveSub would otherwise be empty, and sits above the rest of the post's normal contents.

