Application Code Upgrades in Elixir

 2 years ago
source link: https://blog.appsignal.com/2021/09/14/application-code-upgrades-in-elixir.html

In this third and final part of my series about production code upgrades in Elixir, we will look at what happens during an application upgrade.

Let’s get going!

Set-up: Create an Appup File in Elixir

As with a single module, we need new compiled code for a fresh version of an application.

But an application can consist of many modules and have running processes. So we need a scenario to specify what to upgrade and how to do it.

These scenarios are called application upgrade files (.appup files).

Let’s try to make an appup file for our application. Imagine that we want to upgrade our counters so that we can specify counter increment speed.

Prepare a New Version of the Application in Elixir

First, we’ll add our application to git and tag the old version:

git add .
git commit -m 'initial commit'
git tag 'v0.1.0'

Now we should make a new version.

In mix.exs, update the version to 0.1.1:

...
  def project do
    [
      app: :our_new_app,
      version: "0.1.1",
...

We want the tick interval to be an external parameter passed in at start-up, not a module attribute. Make the following changes in lib/our_new_app/counter.ex:

defmodule OurNewApp.Counter do
  ...
  def start_link({start_from, interval}) do
    GenServer.start_link(__MODULE__, {start_from, interval})
  end

  ...

  def init({start_from, interval}) do
    Process.flag(:trap_exit, true)

    st = %{
      current: start_from,
      timer: :erlang.start_timer(interval, self(), :tick),
      terminator: nil,
      interval: interval
    }

    {:ok, st}
  end

  ...

  def handle_info({:timeout, _timer_ref, :tick}, st) do
    :erlang.cancel_timer(st.timer)

    new_current = st.current + 1

    if st.terminator && rem(new_current, 10) == 0 do
      # we are terminating
      GenServer.reply(st.terminator, :ok)
      {:stop, :normal, %{st | current: new_current, timer: nil}}
    else
      new_timer = :erlang.start_timer(st.interval, self(), :tick)
      {:noreply, %{st | current: new_current, timer: new_timer}}
    end
  end

  ...
end

Now add a code_change callback to the same file:

defmodule OurNewApp.Counter do
  ...

  def code_change(_old_vsn, st, new_interval) do
    {:ok, Map.put(st, :interval, new_interval)}
  end
end
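
The third argument of code_change/3 is an "extra" term supplied by whoever triggers the code change. As a minimal sketch of that mechanism, outside of a real application upgrade, the :sys module can drive the callback by hand. SketchCounter is a hypothetical stand-in for OurNewApp.Counter (it has no timer), and :undefined is just a placeholder for the old version:

```elixir
defmodule SketchCounter do
  use GenServer

  @impl true
  def init(start_from), do: {:ok, %{current: start_from}}

  # Receives the old state plus the extra term and returns the migrated state.
  @impl true
  def code_change(_old_vsn, st, new_interval) do
    {:ok, Map.put(st, :interval, new_interval)}
  end
end

{:ok, pid} = GenServer.start_link(SketchCounter, 0)

# A process has to be suspended before its code_change callback can run.
:ok = :sys.suspend(pid)
:ok = :sys.change_code(pid, SketchCounter, :undefined, 250)
:ok = :sys.resume(pid)

# The state now carries the new :interval key next to :current.
state = :sys.get_state(pid)
IO.inspect(state)
```

The state map keeps its old keys and gains interval: 250, which is the same migration our real counters need.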

And change the child specs in the supervisor, lib/our_new_app/counter_sup.ex, so that each counter also receives an interval:

  @impl true
  def init(start_numbers) do
    children =
      for start_number <- start_numbers do
        # We can't just use `{OurNewApp.Counter, {start_number, 200}}`
        # because the children need different ids

        Supervisor.child_spec({OurNewApp.Counter, {start_number, 200}}, id: start_number)
      end

    Supervisor.init(children, strategy: :one_for_one)
  end
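
To make the effect of the id override concrete, here is a small sketch that builds one such child spec and inspects it. SketchWorker is a hypothetical module used only for illustration; it relies on the default child_spec/1 that `use GenServer` generates:

```elixir
defmodule SketchWorker do
  use GenServer

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  @impl true
  def init(arg), do: {:ok, arg}
end

# The default spec uses the module name as id; the override replaces it,
# so several children of the same module can live under one supervisor.
spec = Supervisor.child_spec({SketchWorker, {10_000, 200}}, id: 10_000)

IO.inspect(spec.id)
IO.inspect(spec.start)
```

The start tuple carries the whole `{start_number, interval}` argument, which is what the supervisor will use whenever it has to restart that child.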

Let’s construct the our_new_app.appup file. We need to update our supervision specs and pass a new tick interval (250) to the code_change callback:

{
    "0.1.1",
    [{"0.1.0", [
        {update, 'Elixir.OurNewApp.CounterSup', supervisor},
        {update, 'Elixir.OurNewApp.Counter', {advanced, 250}}
    ]}],
    [{"0.1.0", []}]
}.

Note that this file is written in Erlang syntax.
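
Because the file is a plain Erlang term, one way to sanity-check it before an upgrade is to parse it from Elixir with :file.consult/1. The sketch below writes the same content to a temporary file first so that it is self-contained; the temporary path is an assumption for illustration only:

```elixir
appup = """
{
    "0.1.1",
    [{"0.1.0", [
        {update, 'Elixir.OurNewApp.CounterSup', supervisor},
        {update, 'Elixir.OurNewApp.Counter', {advanced, 250}}
    ]}],
    [{"0.1.0", []}]
}.
"""

path = Path.join(System.tmp_dir!(), "our_new_app.appup")
File.write!(path, appup)

# Erlang strings arrive as charlists; quoted 'Elixir.…' atoms are Elixir aliases.
{:ok, [{vsn, up_from, down_to}]} = :file.consult(String.to_charlist(path))

IO.inspect(vsn)
IO.inspect(up_from)
IO.inspect(down_to)
```

The three elements are the target version, the upgrade instructions per source version, and the downgrade instructions per target version. A syntax error in the file would surface here as an `{:error, reason}` tuple instead of a successful match.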

Now tag your new version:

git add .
git commit -m 'added customizable intervals'
git tag 'v0.1.1'

Try to upgrade 0.1.0 to 0.1.1.

Check out, compile, and run version 0.1.0:

git checkout v0.1.0
iex -S mix

In a separate shell and a different directory, check out the new version (0.1.1), compile it, and put the appup file in the appropriate place:

...
git checkout v0.1.1
mix compile
cp our_new_app.appup _build/dev/lib/our_new_app/ebin

Run the Application Upgrade in Elixir

We are ready to upgrade the application. In the running iex session, do the following:

iex(1)> Application.spec(:our_new_app)
[
  description: 'our_new_app',
  id: [],
  vsn: '0.1.0',
  modules: [OurNewApp, OurNewApp.Application, OurNewApp.Counter,
   OurNewApp.CounterSup],
  maxP: :infinity,
  maxT: :infinity,
  registered: [],
  included_applications: [],
  applications: [:kernel, :stdlib, :elixir, :logger],
  mod: {OurNewApp.Application, []},
  start_phases: :undefined
]

You can see that the old application is running.

Run an “orphaned” counter outside the supervision tree (we will need it later):

iex(2)> {:ok, pid} = OurNewApp.Counter.start_link(30000)
{:ok, #PID<0.147.0>}

Check that your appup file is correct and that OTP knows how to upgrade your application:

iex(3)> :release_handler.upgrade_script(:our_new_app, '/path/to/new/version/of/our_new_app/_build/dev/lib/our_new_app/')
{:ok, '0.1.1',
 [
   {:load_object_code,
    {:our_new_app, '0.1.1', [OurNewApp.CounterSup, OurNewApp.Counter]}},
   :point_of_no_return,
   {:suspend, [OurNewApp.CounterSup]},
   {:load, {OurNewApp.CounterSup, :brutal_purge, :brutal_purge}},
   {:code_change, :up, [{OurNewApp.CounterSup, []}]},
   {:resume, [OurNewApp.CounterSup]},
   {:suspend, [OurNewApp.Counter]},
   {:load, {OurNewApp.Counter, :brutal_purge, :brutal_purge}},
   {:code_change, :up, [{OurNewApp.Counter, 250}]},
   {:resume, [OurNewApp.Counter]}
 ]}

Check your counter processes and their pids:

iex(4)> Supervisor.which_children(OurNewApp.CounterSup)
[
  {20000, #PID<0.143.0>, :worker, [OurNewApp.Counter]},
  {10000, #PID<0.142.0>, :worker, [OurNewApp.Counter]}
]

Upgrade the application!

iex(5)> :release_handler.upgrade_app(:our_new_app, '/path/to/new/version/of/our_new_app/_build/dev/lib/our_new_app/')
{:ok, []}
iex(6)>
02:46:48.286 [info]  terminating with {{:badkey, :interval, %{current: 33478, terminator: nil, timer: #Reference<0.948322908.3672375303.146224>}}, [{OurNewApp.Counter, :handle_info, 2, [file: 'lib/our_new_app/counter.ex', line: 52]}, {:gen_server, :try_dispatch, 4, [file: 'gen_server.erl', line: 680]}, {:gen_server, :handle_msg, 6, [file: 'gen_server.erl', line: 756]}, {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 226]}]}, counter is 33478

02:46:48.291 [error] GenServer #PID<0.147.0> terminating
...

Let’s see what happened:

iex(1)> Application.spec(:our_new_app)
[
  description: 'our_new_app',
  id: [],
  vsn: '0.1.1',
  modules: [OurNewApp, OurNewApp.Application, OurNewApp.Counter,
   OurNewApp.CounterSup],
  maxP: :infinity,
  maxT: :infinity,
  registered: [],
  included_applications: [],
  applications: [:kernel, :stdlib, :elixir, :logger],
  mod: {OurNewApp.Application, []},
  start_phases: :undefined
]
iex(2)> Supervisor.which_children(OurNewApp.CounterSup)
[
  {20000, #PID<0.143.0>, :worker, [OurNewApp.Counter]},
  {10000, #PID<0.142.0>, :worker, [OurNewApp.Counter]}
]
iex(3)> pids = for {_, pid, _, _} <- Supervisor.which_children(OurNewApp.CounterSup), do: pid
[#PID<0.143.0>, #PID<0.142.0>]
iex(4)> OurNewApp.Counter.get(Enum.at(pids, 0))
27225
iex(5)> OurNewApp.Counter.get(Enum.at(pids, 1))
17235
iex(6)> :sys.get_state(Enum.at(pids, 0))
%{
  current: 27468,
  interval: 250,
  terminator: nil,
  timer: #Reference<0.948322908.3672375311.146315>
}
iex(7)> :sys.get_state(Enum.at(pids, 1))
%{
  current: 17476,
  interval: 250,
  terminator: nil,
  timer: #Reference<0.948322908.3672375311.146330>
}
iex(8)> :sys.get_state(OurNewApp.CounterSup)
{:state, {:local, OurNewApp.CounterSup}, :one_for_one,
 {[20000, 10000],
  %{
    10000 => {:child, #PID<0.142.0>, 10000,
     {OurNewApp.Counter, :start_link, [{10000, 200}]}, :permanent, 5000,
     :worker, [OurNewApp.Counter]},
    20000 => {:child, #PID<0.143.0>, 20000,
     {OurNewApp.Counter, :start_link, [{20000, 200}]}, :permanent, 5000,
     :worker, [OurNewApp.Counter]}
  }}, :undefined, 3, 5, [], 0, OurNewApp.CounterSup, [10000, 20000]}

Our upgrade process was successfully completed.

You can see that:

  • The new version of the application is now running.
  • Our child counters are alive — they updated their state and are functioning.
  • The internal state of OurNewApp.CounterSup shows that the child specifications for our counters were updated too: each one now starts with a {start_number, 200} argument. If a counter dies now, it will restart properly.

But what about the error GenServer #PID<0.147.0> terminating?

Recall that #PID<0.147.0> is the pid of our “orphaned” counter, which was running outside of the supervision tree. The application upgrade process finds the processes to update by traversing the supervision tree, so the “orphaned” counter was never suspended and its state was never migrated. But the code of OurNewApp.Counter did update, so the “orphaned” counter process died on its next tick: the new code met the old state, which has no :interval key.
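
The failure can be reproduced in isolation. The sketch below uses a hand-written map shaped like the old counter state (the timer is replaced with nil for simplicity) and reads the key the new code expects:

```elixir
# State shaped like version 0.1.0: there is no :interval key yet.
old_state = %{current: 33_478, terminator: nil, timer: nil}

# The new handle_info/2 reads st.interval; on an old state that raises.
read_interval = fn st -> st.interval end

result =
  try do
    read_interval.(old_state)
  rescue
    e in KeyError -> {:error, e.key}
  end

# The same migration that code_change/3 applies to supervised counters.
migrated = Map.put(old_state, :interval, 250)

IO.inspect(result)
IO.inspect(read_interval.(migrated))
```

The first read fails on the missing key, while the migrated state answers with the interval. Any process that the upgrade does not reach stays in the first situation.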

We’ve seen how to upgrade a single running application. We needed only two special tools for that: an .appup file and the :release_handler.upgrade_app/2 function. It was also crucial for us to follow OTP principles, because only processes in the supervision tree were upgraded.

Wrap-up and What to Learn Next

I hope you’ve enjoyed this whirlwind ride through production code upgrades in Elixir! We started with my guide to hot code reloading, followed by the best use of supervisors when building applications.

This final article has demonstrated how following OTP principles can show us the way to powerful application code upgrades.

It’s worth noting that the application code upgrade I’ve demonstrated here still has disadvantages. The critical issue is that the upgrade only changes the running system: if the whole BEAM OS process restarts, it will load our application’s old code again (unless we take some action).

How can we handle this potential problem, I hear you ask? With so-called release upgrades. This awesome article from ‘Learn you some Erlang’ is a good starting point to dive into release upgrades.

Thanks again for taking the time to read this series — and happy coding!


Our guest author Ilya is an Elixir/Erlang/Python developer and a tech leader at FunBox. His main occupation is bootstrapping new projects from both human and technological perspectives. Reach out via his Twitter for interesting discussions or consultancy.

