Integrating GitHub Issues as a Blogging Platform

Nov 06, 2019 13 min. read

One of the first projects I build whenever I try out a new backend or frontend technology is a simple blog. I built the prototype of this blog when I started learning Erlang/Elixir at university, as an ad-hoc escript that regenerated everything via a cronjob!

This was a really quick, dirty, and fun way of doing things. It started out as simply concatenating a bunch of handwritten .html files and building an index page, and eventually evolved into compiling .md files by shelling out to pandoc. As I actually started drafting posts, however, I realised that a few things were missing...

Post tagging was derived from the directory structure, so I couldn't give one post multiple tags; I couldn't get syntax highlighting working very well; and I wanted a platform where I could make edits from anywhere, including devices that don't necessarily have a command line or git. If I found a typo, I should be able to fix it right then and there.

If only there were a free service, with a nice web UI, that handled rendering markdown with excellent syntax highlighting support, really customisable tagging, pinning, and even supported comments for future extensibility... 🤔

And then it hit me: I would implement my blog by abusing GitHub as a backend!


GitHub is actually really open in terms of API; there are multiple different APIs you can choose from. Despite having never used GitHub's API for anything other than playing around with language tutorials, I expected a good experience since I'd heard only good things about it.

Out of the APIs they provide, I decided to use their GraphQL API since GraphQL is a technology I'm both very fond of and rather experienced in, having done a few consulting gigs with companies using the technology at scale.

The first iteration of this idea was essentially porting the logic I already had: posts were stored as files in my repo, and my backend server would recursively traverse directories based on the requested URL slug, i.e. localhost:4000/some/post would try requesting $MY_REPO/some/post.*.

This approach worked really well initially, but after playing around with it, I decided that querying files via their repository { object(...) } API was a little nasty 😅 It probably needs to be this way to avoid them having to do a lot of postprocessing on top of the underlying git protocol. Still, I wish I could query objects via repository { files("somePath") { ... }} instead. This would make my server logic a lot less obtuse to extend, since the object field needed me to pass instructions that git rev-parse could understand 🤮
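For reference, fetching a file through the object field looks something like the following. The branch and path here are made up for illustration; the expression syntax is whatever git rev-parse understands:

```graphql
query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    # "<ref>:<path>", e.g. a blob on master — illustrative path only
    object(expression: "master:posts/some-post.md") {
      ... on Blob {
        text
      }
    }
  }
}
```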

After exploring their graph a little bit, I saw that GitHub Issues could, however, be conveniently queried from the top level of the repository node and basically let me do whatever I wanted; thus, I scrapped everything I had built so far and started with this second iteration.

For those of you who are interested, GitHub's GraphiQL explorer is great both for getting a feel for GraphQL (if you're not familiar with the technology) and for getting to grips with what, exactly, my blog is doing in the background.

The basic query I'm doing is this, for example:

query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    issues(first: 10, orderBy: {field: CREATED_AT, direction: DESC}) {
      nodes {
        title
        bodyHTML
        createdAt
        labels(first: 10) {
          nodes {
            name
          }
        }
      }
    }
  }
}
This snippet gets the first ten issues from a given repository, ordered by CREATED_AT descending. If you're concerned that anyone can open issues in a given repo (and thus publish posts to your blog), you can constrain the query further with something like:

issues(filterBy: {createdBy: $owner}, first: 10, orderBy: {field: CREATED_AT, direction: DESC})

Absinthe is an absolutely amazing GraphQL server implementation for Elixir, but since my backend is the one making requests, what I needed was a client, not a server. Whenever I've worked on frontend code interacting with GraphQL, I've always used a popular library called Apollo, but (at least at the time of implementation) there was no Apollo-like library for Elixir.

I ended up rolling my own simple GraphQL client because I didn't think it would be too complex. I'm sure simple libraries will appear as time goes on, but honestly, it was easy enough to just POST to GitHub's API endpoint. For future reference, I ended up with something similar to the following:

defmodule GraphQL.Request do
  @moduledoc """
  Implements a GraphQL client which can be used as follows:

      > variables = %{}
      > GraphQL.Request.call("query { viewer { login } }", variables)
      {:ok, %{data: %{viewer: %{login: ...}}}}
  """

  @github_url "https://api.github.com/graphql"

  @spec call(String.t(), map()) :: {:ok, map()} | {:error, term()}
  def call(query, variables \\ %{}) do
    with {:ok, payload} <- build_payload(query, variables),
         {:ok, access_token} <- build_access_token(),
         {:ok, %{status_code: 200} = response} <- do_request(payload, access_token),
         {:ok, result} <- process_response(response) do
      {:ok, result}
    end
  end

  defp build_payload(query, variables) do
    case Jason.encode(%{query: query, variables: variables}) do
      {:ok, payload} -> {:ok, payload}
      error -> {:error, reason: :encode_query_error, error: error}
    end
  end

  defp build_access_token() do
    # Token comes from application config; the `:blog` key is illustrative.
    case Application.fetch_env(:blog, :github_access_token) do
      {:ok, token} ->
        {:ok, "Bearer " <> token}

      error ->
        {:error, reason: :build_access_token_error, error: error}
    end
  end

  defp do_request(payload, access_token) do
    case HTTPoison.post(@github_url, payload, [{"Authorization", access_token}]) do
      {:ok, %HTTPoison.Response{} = response} ->
        {:ok, response}

      error ->
        {:error, reason: :do_request_error, error: error}
    end
  end

  defp process_response(%HTTPoison.Response{body: body}) do
    case Jason.decode(body) do
      {:ok, result} ->
        {:ok, result}

      error ->
        {:error, reason: :process_response_error, error: error}
    end
  end
end
This gives me some HTML (and other metadata like tags) which I essentially insert ad-hoc into my own Phoenix templates, and it just works. If you're reading this post, you can see this approach works pretty well (I'm unlikely to change it going forwards 😜).
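Concretely, fetching a single post boils down to a query shaped roughly like this — the issue field is part of GitHub's schema, but the exact field selection here is my guess at what a single post needs:

```graphql
query($owner: String!, $name: String!, $number: Int!) {
  repository(owner: $owner, name: $name) {
    issue(number: $number) {
      title
      bodyHTML
      labels(first: 10) {
        nodes {
          name
        }
      }
    }
  }
}
```

bodyHTML is the key part: GitHub returns the issue body already rendered to HTML, syntax highlighting included, so the template only has to inject it.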

One downside I can see is GitHub's rate limiting: you're limited to a few thousand queries, so as time goes forwards, I'll probably cache posts in an ETS table or something so that I'm not wasting my quota whenever someone clicks on a post.
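The caching layer I have in mind is a small ETS wrapper along these lines. This is a minimal sketch, not my actual implementation: the module name, table name, and TTL are all assumptions:

```elixir
defmodule Blog.PostCache do
  # Sketch of caching rendered posts in ETS; table name and TTL are assumptions.
  @table :post_cache
  @ttl_seconds 300

  def start do
    :ets.new(@table, [:set, :public, :named_table])
  end

  # Returns the cached value for `slug` if it is fresh; otherwise runs `fun`
  # (e.g. the GraphQL request) and caches its result.
  def fetch(slug, fun) do
    now = System.system_time(:second)

    case :ets.lookup(@table, slug) do
      [{^slug, post, inserted_at}] when now - inserted_at < @ttl_seconds ->
        post

      _ ->
        post = fun.()
        :ets.insert(@table, {slug, post, now})
        post
    end
  end
end
```

Wrapping the request as Blog.PostCache.fetch(slug, fn -> GraphQL.Request.call(query, vars) end) means quota is only spent on a cache miss.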

Just for a bit of fun, you can see the source of this post. Features I'll definitely add going forwards are reactions to the main post, which I can display nicely on the index page, and comments!

I'll continue playing around with my blog and the technology driving it, as is tradition for software engineers, but all-in-all, this is a pretty comfy blogging setup 😈

edit: I'm still using this approach after a few months, but I definitely needed to implement caching. I woke up one night to find I'd exceeded my GraphQL API rate limits and couldn't do anything about it but wait! That's a good problem to have, though 😉

edit: One fancy thing I've seen is adding estimated reading times to blog posts. This proved a little hard because when I request blog contents from GitHub, I get them all as one monolithic block, which I inject into my templates...

To get it working, I end up processing the HTML returned by GitHub to calculate a reading time, injecting the reading time into the template, and maximally abusing CSS Grid to re-order the flow of the page as follows:

sub + h1 {
  grid-area: afterHeaderBeforeContent;
}

h1 + p:first-of-type {
  grid-area: afterContent;
}
edit: Updated the writing since it was full of typos. This draft should be much clearer; my writing has really improved thanks to this blog... I highly recommend anyone start their own blog, both for the fun and flexibility it enables and as a great place to improve your writing skills!

Back to Blog Index →