Easy Ergonomic Telemetry in Elixir with Sibyl
Feb 02, 2023
10 min. read
Elixir
Project
Refining Ecto Query Composition
Nov 08, 2022
13 min. read
Elixir
Absinthe
Ecto
Compositional Elixir—Ecto Query Patterns
Jun 10, 2021
18 min. read
Elixir
Ecto
Stuff I use
May 28, 2021
13 min. read
General
A minimal Nix development environment on WSL
Apr 04, 2021
13 min. read
WSL
Nix/NixOS
Elixir pet peeves—Ecto virtual fields
Dec 15, 2020
5 min. read
Elixir
Ecto
My usage of Nix, and Lorri + Direnv
Nov 26, 2020
5 min. read
Nix/NixOS
Build you a `:telemetry` for such learn!
Aug 26, 2020
19 min. read
Elixir
Erlang
Project
Drinking the NixOS kool aid
Jun 30, 2020
9 min. read
Nix/NixOS
Testing with tracing on the BEAM
May 20, 2020
8 min. read
Elixir
Compositional Elixir—Contexts, Error Handling, and, Middleware
Mar 06, 2020
13 min. read
Elixir
GraphQL
Phoenix
Absinthe
Taming my Vim configuration
Jan 14, 2020
7 min. read
(Neo)Vim
Integrating GitHub Issues as a Blogging Platform
Nov 06, 2019
6 min. read
Elixir
GraphQL
About Me
Nov 05, 2019
1 min. read
General

Testing with tracing on the BEAM

When I was writing some tests for a event broadcasting system (the one I outlined in this blog post I found that, as expected, the unit tests worked very easily but when it came to testing that the entire system worked as intended end to end, it was a little more difficult because I'd be trying to test completely asynchronous and pretty decoupled behaviour.

Testing with Mock/:meck

At this point in time, the test suites inside our code base use mocks very extensively which might be a topic for another day, because I don't think I believe that mocking code in tests in the best—nor most idiomatic—approach. One nice bit about the de-facto mocking library for Elixir is that it also comes with a few assertions you can import into your test suites; namely assert_called/1.

Using this library, we could write our tests, mocking the event handling functions in the event listeners we're interested in for the particular test and then use assert_called/n to determine whether or not that function was called as expected and optionally you can provide what looks roughly like an Erlang match spec to assert that the function was called with specific arguments.

Our tests end up looking roughly like the following:

defmodule E2ETests.EctoBroadcasterTest do
  use ExUnit.Case
  alias EventListener.EctoHandler
  import Mock

  setup_with_mocks(
    [
      {EctoHandler, [:passthrough],
       handle_event: fn {_ecto_action, _entity_type, _data} ->
         {:ok, :mocked}
       end}
    ],
    _tags
  ) do
    :ok
  end

  test "Creating entity foo is handled by EctoHandler" do
    assert {:ok, %MyApp.Foo{}} = MyApp.Foos.create_foo(...)
    assert_called EctoHandler.handle_event({:insert, MyApp.Foo, :_})
  end
end

This approach basically met all of our requirements except for the fact that mocking via :meck (the de-facto Erlang mocking library and used internally by Mock) does things in a pretty heavy handed way.

At a high level, all :meck does is replace the module you're trying to mock with a slightly edited one where the functions you want to mock are inserted in place of the original implementations. This comes with three disadvantages for our particular use case:

  1. Because we're essentially loading code into the BEAM VM, we can no longer run our tests asynchronously because we've patched code at the VM level and these patches are global
  2. Because we're replacing the function we want to call; we can't actually test that that particular function behaves as expected; only that it was called
  3. If we want to mock our functions purely for the assert_called/1 assertion (though this has annoyed us somewhat elsewhere too) we need to be careful and essentially re-implement any functionality that you actually expect to happen for that to be testable; which itself might lead itself to brittle mocks.

My main qualm was with the second problem; we ran into it pretty hard for a few edge cases (though we resolved the second case too!): say EctoHandler.handle_event({:delete, _, _}) needed to be implemented in a special way where it persists these actions into an ETS table and you want to assert that that actually happened, you'd be a little stuffed... image your source code looks like:

defmodule EventListener.EctoHandler do
  ...
  def handle_event(:delete, entity_type, data) do
    Logger.log("Got delete event for #{entity_type} at #{System.monotonic_time()}")
    data
    |> EventListener.DeletePool.write()
    |> process_delete_pool_response()
    |> case do
      {:ok, _} = response -> 
        response
      error ->
        add_to_retry_queue({:delete, entity_type, data})
        Logger.error("Failed to add to delete pool, adding to retry queue")
        error
    end
  end
  ...
end

To mock this function properly, you'd probably just have to implement your mock such that we essentially duplicate the implementation, but this of course will blow up when the mocks come out of sync with the implementation.

A less heavy handed approach

One of the coolest but possibly least used features of the BEAM is the built in ability to do tracing.

Tracing essentially lets you capture things such as messages being passed to particular processes, functions being called among other things. It does come with some disadvantages (i.e. if you're not careful you'll blow up your application because of the performance impact (be careful if you trace on a production deployment 😉)) and it's very low level but it works well and there exist several high level wrappers.

I wanted to experiment with tracing because I've never used it very much so I chose a pretty low level abstraction to do the work for me: the :dbg application which comes built into the normal Erlang OTP distribution.

I'm in the process of cleaning up the actual implementation, but if you wanted to build your own replacement for the assert_called/1 function (with a nice added feature) feel free to follow along!

You can start the :dbg application via the following:

def init() do
  # Flush :dbg in case it was already started prior to this
  :ok = :dbg.stop_clear()
  {:ok, _pid} = :dbg.start()

  # This is the important bit; by default :dbg will just log what gets traced on stdout
  # but we actually need to handle these traces because we want to capture that the
  # function was called programmatically rather than reading it off the shell/log file
  # This tells `:dbg` to call the `process_trace/2` function in the current running process
  # whenever we get a trace message
  {:ok, _pid} = :dbg.tracer(:process, {&process_trace/2, []})

  # And this sets up `:dbg` to handle function calls
  :dbg.p(:all, [:call])
end

Now that the tracer is started up; we need to tell it to start tracing certain functions. You can do this quite easily by passing an mfa into :dbg.tpl/4 as follows:

def trace(m, f, a) do
  # :dbg.tpl traces all function calls including private functions (which we needed for our use-case)
  # but I think it's a good default
  # 
  # The 4th parameter basically says to trace all invocations of the given mfa (`:_` means we pattern
  # match on everything), and we also pass in `{:return_trace}` to tell the tracer we want to receive
  # not just function invocation messages but also capture the return of those functions 
  :dbg.tpl(m, f, a, [{:_, [], [{:return_trace}]}])
end

Now provided a super simple stub function for process_trace/2, you should be able to see your traces come back looking all good:

# Using the following process_trace/2 implementation:
def process_trace({:trace, _pid, :call, {module, function, args}} = _message, _state) do
  IO.inspect(inspect({module, function, args}) <>" - call start")
end

def process_trace({:trace, _pid, :return_from, {module, function, arity}, return_value}, _state)
    when length(args) == arity do
  IO.inspect(inspect({module, function, arity}) <>" - returns: " <> inspect(return_value))
end
-------------------------------------------------------------------------------------------------
iex(1)> Assertions.init()
{:ok, [{:matched, :nonode@nohost, 64}]}
iex(2)> Assertions.trace(Enum, :map, 2)
{:ok, [{:matched, :nonode@nohost, 1}, {:saved, 1}]}
iex(3)> Enum.map([1, 2, 3], & &1 + 2)
"{Enum, :map, [[1, 2, 3], #Function<44.33894411/1 in :erl_eval.expr/5>]} - call start"
"{Enum, :map, 2} - returns: [3, 4, 5]"
[3, 4, 5]

This gets us most of the way to what we want! We can see what functions were called and capture the variables passed into it; as well as capturing what the functions returned.

But if you look closely, the message for :call and :return_from have a few issues:

  1. Ideally, I'd have preferred to have a single message rather than two, because we have some additional complexity here now...
  2. Especially because the :return_from message doesn't get the args passed into it but instead only the function's arity... which means we need to try our best to marry them up as part of the process_trace function...
  3. Just as a note; I don't think we can pattern match on lambdas easily on the BEAM (nor do things like send them over :rpc.calls so at least its consistent 🙂

We can solve the first two problems by amending the implementation of process_trace/2 quite easily though (it might not be a foolproof implementation but it worked well enough for our usecase that I'll just add the code here verbatim):

def process_trace({:trace, _pid, :call, {module, function, args}} = _message, state) do
  [{:call, {module, function, args}, :pending} | state]
end

def process_trace({:trace, _pid, :return_from, {module, function, arity}, return_value}, [
      {:call, {module, function, args}, :pending} | remaining_stack
    ])
    when length(args) == arity do
  IO.inspect(inspect({module, function, args}) <> " - returns: " <> inspect(return_value))

  remaining_stack
end
-------------------------------------------------------------------------------------------------
iex(1)> Assertions.init()
{:ok, [{:matched, :nonode@nohost, 64}]}
iex(2)> Assertions.trace(Enum, :map, 2)
{:ok, [{:matched, :nonode@nohost, 1}, {:saved, 1}]}
iex(3)> Enum.map([1, 2, 3], & &1 + 2)
"{Enum, :map, [[1, 2, 3], #Function<44.33894411/1 in :erl_eval.expr/5>]} - returns: [3, 4, 5]"
[3, 4, 5]
iex(4)> Enum.map([[1, 2, 3], [:a, :b, :c], []], fn list -> Enum.map(list, &IO.inspect/1) end)
1
2
3
:a
"{Enum, :map, [[1, 2, 3], &IO.inspect/1]} - returns: [1, 2, 3]"
:b
:c
"{Enum, :map, [[:a, :b, :c], &IO.inspect/1]} - returns: [:a, :b, :c]"
"{Enum, :map, [[], &IO.inspect/1]} - returns: []"
[[1, 2, 3], [:a, :b, :c], []]

Because whatever gets returned from process_trace/2 accumulates into future calls of process_state/2 in the second parameter, we can essentially do the above and treat it like a stack of function calls. We only emit the actual IO.inspect we care about if we end up popping the function call (because it's resolved) off the stack.

All I do from here is persist these functions into an :ets table of type :bag that we create when init/0 is called, and have process_trace({_, _, :return_from, _}, _) write to it whenever a function call succeeds.

For convenience I really hackily threw together the following macros (please don't judge me 🤣 -- this is going to be cleaned up a lot before I throw it up on hex.pm):

defp apply_clause(
       {{:., _, [{:__aliases__, _, mod}, fun]}, _, args},
       match \\ {:_, [], Elixir}
     ) do
  module = ["Elixir" | mod] |> Enum.map(&to_string/1) |> Enum.join(".") |> String.to_atom()

  arg_pattern =
    args
    |> Enum.map(fn
      {:_, _loc, _meta} -> :_
      other -> other
    end)

  quote do
    case :ets.select(unquote(@mod), [
           {{:_,
             {unquote(module), unquote(fun), unquote(length(arg_pattern)), unquote(arg_pattern),
              :_}}, [], [:"$_"]}
         ]) do
      [] ->
        false

      matches ->
        Enum.any?(matches, fn
          {_timestamp, {_mod, _fun, _arity, _args, unquote(match)}} -> true
          _ -> false
        end)
    end
  end
end

defp capture_clause({{:., _, [{:__aliases__, _, mod}, fun]}, _, args}) do
  module = ["Elixir" | mod] |> Enum.map(&to_string/1) |> Enum.join(".") |> String.to_atom()

  arg_pattern =
    args
    |> Enum.map(fn
      {:_, _loc, _meta} -> :_
      other -> other
    end)

  quote do
    case :ets.select(unquote(@mod), [
           {{:_,
             {unquote(module), unquote(fun), unquote(length(arg_pattern)), unquote(arg_pattern),
              :_}}, [], [:"$_"]}
         ]) do
      [] ->
        false

      [{_timestamp, {_mod, _fun, _arity, _args, val}} | _] ->
        val
    end
  end
end

defmacro called?({{:., _, _}, _, _} = ast) do
  apply_clause(ast)
end

defmacro called?({op, _, [lhs, {{:., _, _}, _, _} = rhs]}) when op in [:=, :"::"] do
  apply_clause(rhs, lhs)
end

defmacro called?({op, _, [{{:., _, _}, _, _} = lhs, rhs]}) when op in [:=, :"::"] do
  apply_clause(lhs, rhs)
end

defmacro capture({op, _, [lhs, {{:., _, _}, _, _} = rhs]}) when op in [:=, :<-] do
  captured_function = capture_clause(rhs)

  quote do
    var!(unquote(lhs)) = unquote(captured_function)
  end
end

Which lets us finally put it all together and do the following 🎉

iex(1)> require Assertions
Assertions
iex(2)> import Assertions
Assertions
iex(3)> Assertions.init()
{:ok, [{:matched, :nonode@nohost, 64}]}
iex(4)> Assertions.trace(Enum, :map, 2)
{:ok, [{:matched, :nonode@nohost, 1}, {:saved, 1}]}
iex(5)> called? Enum.map([:hi, "world!"], _) = [:atom, :string]
false
iex(6)> Enum.map([:hi, "world!"], & (if is_atom(&1), do: :atom, else: :string))
[:atom, :string]
iex(7)> called? Enum.map([:hi, "world!"], _) = [:atom, :string]
true
iex(8)> called? Enum.map([:hi, "world!"], _) = [_, :atom]
false
iex(9)> called? Enum.map([:hi, "world!"], _) = [_, _]
true
iex(10)> called? Enum.map([:hi, "world!"], _)
true
iex(11)> capture some_variable <- Enum.map([:hi, "world!"], _)
[:atom, :string]
iex(12)> [:integer | some_variable]
[:integer, :atom, :string]

So as you can see; these macros give us some nice ways to solve two of the three issues we had with Mock:

  1. We can assert that functions were called (passing in a real pattern match v.s. needing to pass in a match spec like DSL)
  2. And we can assert that the functions were called and returned a particular value which again can be any Elixir pattern match
  3. And as a bonus, you can bind whatever some function returned to a variable with my capture x <- function macro! 🤯

The remaining steps for this is to add a before_compile step to automatically parse invocations of called? and capture so that I can automate the Assertions.trace/3 function to make it completely transparent (it'll automatically trace the functions you care about) as well as extensively test some edge cases because we've run into a few already.

I'll also be looking into massively cleaning up these macros since they were thrown together literally in a couple of hours playing around with the ASTs I thought to test but I hope this blog post helped as an intro into tracing on the BEAM but also just a fun notepad of my iterative problem solving process 🙂