Everett v1.0.3 released!

Wednesday October 28, 2020, Will Kahn-Greene

What is it?

Everett is a configuration library for Python apps.

Goals of Everett:

flexible configuration from multiple configured environments
easy testing with configuration
easy documentation of configuration for users

From that, Everett has the following features:

is composable and flexible
makes it easier to provide helpful error messages for users trying to configure your software
supports auto-documentation of configuration with a Sphinx autocomponent directive
has an API for testing configuration variations in your tests
can pull configuration from a variety of specified sources (environment, INI files, YAML files, dict, write-your-own)
supports parsing values (bool, int, lists of things, classes, write-your-own)
supports key namespaces
supports component architectures
works with whatever you're writing--command line tools, web sites, system daemons, etc
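To make the multi-source lookup and value parsing in that list concrete, here is a minimal standard-library sketch. It is not Everett's API: `LayeredConfig` and `parse_bool` are hypothetical names invented for illustration, and the real library does considerably more (namespaces, components, documentation).

```python
import os


def parse_bool(value):
    """Parse common truthy/falsy strings into a bool."""
    lowered = value.strip().lower()
    if lowered in ("1", "true", "yes", "on"):
        return True
    if lowered in ("0", "false", "no", "off"):
        return False
    raise ValueError("not a boolean: %r" % value)


class LayeredConfig:
    """Hypothetical stand-in for a config manager; NOT Everett's API."""

    def __init__(self, sources):
        # Sources are dict-like and checked in order; the first match wins.
        self.sources = sources

    def get(self, key, default=None, parser=str):
        for source in self.sources:
            if key in source:
                raw = source[key]
                break
        else:
            if default is None:
                raise KeyError("%s is required but not set in any source" % key)
            raw = default
        try:
            return parser(raw)
        except ValueError as exc:
            # Name the key and the bad value so the user can fix it.
            raise ValueError("%s=%r is invalid: %s" % (key, raw, exc))


# Process environment first, then hard-coded defaults for local development.
config = LayeredConfig([os.environ, {"DEBUG": "false", "PORT": "8000"}])
```

In tests, the same class can be handed plain dicts instead of `os.environ`, which is the kind of configuration-variation testing the feature list describes.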

v1.0.3 released!

This is a minor maintenance update that fixes a couple of minor bugs, addresses a Sphinx deprecation issue, drops support for Python 3.4 and 3.5, and adds support for Python 3.8 and 3.9 (largely adding those environments to the test suite).

Why you should take a look at Everett

At Mozilla, I'm using Everett for a variety of projects: Mozilla symbols server, Mozilla crash ingestion pipeline, and some other tooling. We use it in a bunch of other places at Mozilla, too.

Everett makes it easy to:

deal with different configurations between local development and server environments
test different configuration values
document configuration options

First-class docs. First-class configuration error help. First-class testing. This is why I created Everett.

If this sounds useful to you, take it for a spin. It's a drop-in replacement for python-decouple and os.environ.get('CONFIGVAR', 'default_value') style of configuration so it's easy to test out.

Enjoy!

Socorro Engineering: Half in Review 2020 h1

Friday September 11, 2020, Will Kahn-Greene

Summary

2020h1 was rough. Layoffs, re-org, Berlin All Hands, Covid-19, focused on MLS for a while, then I switched back to Socorro/Tecken full time, then virtual All Hands.

It's September now and 2020h1 ended a long time ago, but I'm only just getting a chance to catch up. Some things happened in 2020h1 that are important to divulge, and we don't tell anyone about Socorro events via any other medium.

Prepare to dive in!

RustConf 2020 thoughts

Friday August 28, 2020, Will Kahn-Greene

Last year, I went to RustConf 2019 in Portland. It was a lovely conference. Everyone I saw was so exuberantly happy to be there--it was just remarkable. It was my first RustConf. Plus while I've been sort-of learning Rust for a while and cursorily related to Rust things (I work on crash ingestion and debug symbols things), I haven't really done any Rust work. Still, it was a remarkable and very exciting conference.

RustConf 2020 was entirely online. I'm in UTC-4, so it occurred during my afternoon and evening. I spent the entire time watching the RustConf 2020 stream and skimming the channels on Discord. Everyone I saw on the channels was so exuberantly happy to be there and supportive of one another--it was just remarkable. Again! Even virtually!

I missed the in-person aspect of a conference a bit. I've still got this thing about conferences that I'm getting over, so I liked that it was virtual because of that and also it meant I didn't have to travel to go.

I enjoyed all of the sessions--they're all top-notch! They were all pretty different in their topics and difficulty level. The organizers should get gold stars for the children's programming between sessions. I really enjoyed the "CAT!" sightings in the channels--that was worth the entrance fee.

This is a summary of the talks I wrote notes for.

Experimenting with Symbolic

Tuesday April 28, 2020, Will Kahn-Greene

One of the things I work on is Tecken which runs Mozilla Symbols Server. It's a server that handles Breakpad symbols files upload, download, and stack symbolication.

Bug #1614928 covers adding line numbers to the symbolicated stack results for the symbolication API. The current code doesn't parse line records in Breakpad symbols files, so it doesn't know anything about line numbers. I spent some time looking at how much effort it'd take to improve the hand-written Breakpad symbol file parsing code to parse line records. That would require carrying those changes through to the caching layer and some related parts, and it seemed really tricky.

That's the point where I decided to go look at Symbolic which I had been meaning to look at since Jan wrote the Native Crash Reporting: Symbol Servers, PDBs, and SDK for C and C++ blog post a year ago.

What is symbolication?

There are lots of places where stacks are interesting. For example:

the stack of the crashing thread in a crash report
the stacks of the parent and child processes in a hung IPC channel
the stack of a thread being profiled
the stack of a thread at a given point in time for debugging

"The stack" is an array of addresses in memory corresponding to the value of the instruction pointer for each of those stack frames. You can use the module information to convert that array of memory offsets to an array of [module, module_offset] pairs. Something like this:

[ 3, 6516407 ],
[ 3, 12856365 ],
[ 3, 12899916 ],
[ 3, 13034426 ],
[ 3, 13581214 ],
[ 3, 13646510 ],
...

with modules:

[ "firefox.pdb", "5F84ACF1D63667F44C4C44205044422E1" ],
[ "mozavcodec.pdb", "9A8AF7836EE6141F4C4C44205044422E1" ],
[ "Windows.Media.pdb", "01B7C51B62E95FD9C8CD73A45B4446C71" ],
[ "xul.pdb", "09F9D7ECF31F60E34C4C44205044422E1" ],
...

That's neat, but hard to work with.

What you really want is a human-readable stack of function names and files and line numbers. Then you can go look at the code in question and start your debugging adventure.

When the program is compiled, the act of compiling produces a bunch of compiler debugging information. We use dump_syms to extract the symbol information and put it into the Breakpad symbols file format. Those files get uploaded to Mozilla Symbols Server where they join all the symbols files for all the builds for the last 2 years.

Symbolication takes the array of [module, module_offset] pairs, the list of modules in memory, and the Breakpad symbols files for those modules and looks up the symbols for the [module, module_offset] pairs producing symbolicated frames.

Then you get something nicer like this:

0  xul.pdb  mozilla::ConsoleReportCollector::FlushReportsToConsole(unsigned long long, nsIConsoleReportCollector::ReportAction)
1  xul.pdb  mozilla::net::HttpBaseChannel::MaybeFlushConsoleReports()
2  xul.pdb  mozilla::net::HttpChannelChild::OnStopRequest(nsresult const&, mozilla::net::ResourceTimingStructArgs const&, mozilla::net::nsHttpHeaderArray const&, nsTArray<mozilla::net::ConsoleReportCollected> const&)
3  xul.pdb  std::_Func_impl_no_alloc<`lambda at /builds/worker/checkouts/gecko/netwerk/protocol/http/HttpChannelChild.cpp:1001:11',void>::_Do_call()
...

Yay! Much rejoicing! Something we can do something with!
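The two steps described above (addresses to [module, module_offset] pairs, then offsets to function names) can be sketched roughly like this. This is a simplified illustration, not Tecken's or Symbolic's code: the module base addresses, sizes, and symbol tables are made up, and real symbols files also record function sizes, so gaps between functions are handled more carefully than here.

```python
import bisect


def to_module_offsets(addresses, modules):
    """Convert absolute addresses to [module_index, module_offset] pairs.

    modules is a list of (name, base_address, size) tuples.
    """
    pairs = []
    for address in addresses:
        for index, (_name, base, size) in enumerate(modules):
            if base <= address < base + size:
                pairs.append([index, address - base])
                break
        else:
            # The address is not inside any known module (illustrative
            # convention: index -1 and the raw address).
            pairs.append([-1, address])
    return pairs


def lookup_symbol(symbols, offset):
    """Return the name of the last function starting at or before offset.

    symbols is a list of (start_offset, function_name) sorted by start.
    """
    starts = [start for start, _name in symbols]
    position = bisect.bisect_right(starts, offset) - 1
    if position < 0:
        return None
    return symbols[position][1]


# Made-up data for illustration only.
modules = [("firefox.pdb", 0x400000, 0x10000), ("xul.pdb", 0x7000000, 0x2000000)]
xul_symbols = [(0, "Foo"), (24, "Bar"), (64, "Baz")]

pairs = to_module_offsets([0x7000020, 0x400010], modules)
name = lookup_symbol(xul_symbols, pairs[0][1])
```

The binary search is what makes lookups cheap once the symbol table is sorted; parsing and sorting the table is the expensive part, which is why caching matters.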

I wrote about this a bit in Crash pings and crash reports.

Tecken has a symbolication API, so you can send in a well-crafted HTTP POST and it'll symbolicate the stack for you and return it.
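As a rough sketch of what a "well-crafted" request body might look like, the helper below builds a JSON payload from the module list and stack shown earlier. The field names (`jobs`, `memoryMap`, `stacks`) reflect my understanding of the v5 symbolication API and should be treated as assumptions; check the Tecken documentation for the exact shape and endpoint before relying on it. The function name is hypothetical.

```python
import json


def build_symbolication_payload(modules, stack):
    """Build the JSON body for one symbolication job.

    modules: list of [debug_filename, debug_id] pairs
    stack: list of [module_index, module_offset] pairs
    """
    # Assumed payload shape; verify against the Tecken API docs.
    job = {"memoryMap": modules, "stacks": [stack]}
    return json.dumps({"jobs": [job]})


modules = [
    ["firefox.pdb", "5F84ACF1D63667F44C4C44205044422E1"],
    ["mozavcodec.pdb", "9A8AF7836EE6141F4C4C44205044422E1"],
    ["Windows.Media.pdb", "01B7C51B62E95FD9C8CD73A45B4446C71"],
    ["xul.pdb", "09F9D7ECF31F60E34C4C44205044422E1"],
]
stack = [[3, 6516407], [3, 12856365]]

payload = build_symbolication_payload(modules, stack)
```

The payload would then be sent as the body of an HTTP POST with a JSON content type.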

Quickstart with Symbolic in Python

Symbolic is a Rust crate with a Python library wrapper. The Sentry folks do a great job of generating wheels and uploading those to PyPI, so installing Symbolic is as easy as:

pip install symbolic

The Symbolic docs are terse. I found the following documentation:

That helped, but I had questions those didn't answer. I have an intrepid freshman understanding of Rust, so I ended up reading the code, tests, and examples.

The one big thing that tripped me up was that Symbolic can't parse Breakpad symbols files from a byte stream--they need to be files on disk. Tecken doesn't store Breakpad symbols files on disk--they're in AWS S3 buckets. So it downloads them and parses the byte stream. In order to use Symbolic, we'll have to adjust that to save the file to disk, then parse it, then delete the file afterwards. [1]

[1] If that's not true, please let me know.
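The save, parse, delete dance could look something like the sketch below. It uses only the standard library; `parse_from_bytes` and the `parse_path` callback are hypothetical names, and in practice the callback would be whatever function opens the file with Symbolic.

```python
import os
import tempfile


def parse_from_bytes(data, parse_path):
    """Write data to a temporary file, call parse_path(path), then clean up."""
    handle = tempfile.NamedTemporaryFile(suffix=".sym", delete=False)
    try:
        with handle:
            handle.write(data)
        # The file is closed (and flushed) before the parser opens it.
        return parse_path(handle.name)
    finally:
        # Always remove the file, even if writing or parsing failed.
        os.unlink(handle.name)
```

Closing the file before handing the path to the parser avoids reading a partially flushed file and also works on platforms that don't allow opening a file twice.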

Anyhow, here's some sample annotated code using Symbolic to do symbol lookups:

import symbolic

# This is a Breakpad symbols file I have on disk.
archive = symbolic.Archive.open("XUL/75A79CFA0E783A35810F8ADF2931659A0/XUL.sym")

# We do debug ids as all-uppercase with no hyphens. However, symbolic
# requires that they get normalized into the form it likes.
debug_id = symbolic.normalize_debug_id("75A79CFA0E783A35810F8ADF2931659A0")

# This parses the Breakpad symbols file and returns a symcache that we can
# look up addresses in.
obj = archive.get_object(debug_id=debug_id)
symcache = obj.make_symcache()

# Symbol lookup returns a list of LineInfo objects.
lineinfos = symcache.lookup(0xf5aa0)

print("line: %s symbol: %s" % (lineinfos[0].line, lineinfos[0].symbol))

Cool!

Symbolic parses Breakpad symbols files. It uses a cache format for fast symbol lookups. Loading the cache file is very fast.

Further, Symbolic parses files of a variety of other debug binary formats. This could be handy for skipping the intermediary Breakpad symbol file and using the debug binaries directly. More on that idea later.

Tecken is maintained by a team of two and we have other projects, so it spends a lot of time sitting in the corner feeling sad. Meanwhile, Symbolic is actively worked on by Sentry and a cadre of other contributors including Mozilla engineers because it's one of the cornerstone crates for the great Rust rewrite of Breakpad things. That's a big win for me.

So then I built a prototype

Today, I threw together a web app that does symbolication using Symbolic and called it Sherwin Syms.

https://github.com/willkg/sherwin-syms/

Building a separate prototype gives me something to tinker with that's not in production. I was able to add line number information pretty quickly. I can experiment with caching on disk. I can compare the symbolication API output for stacks between the prototype and what the Mozilla Symbols Server produces.

There's a lot of scaffolding in there. The Symbolic-using bits are in this file:

https://github.com/willkg/sherwin-syms/blob/master/src/sherwin_syms/symbols.py

Next steps

I need to integrate this into Tecken. I think that means writing a new v6 API view because the v4 and v5 code is tangled up with downloading and caching.

Markus and Gabriele suggested Tecken skip the Breakpad symbols files and use the debug binaries directly--Symbolic can handle those, too. The compelling reason for this is that Breakpad symbols files lose all the information for symbolicating inline functions correctly. I hope to look into that soon.

Summary

That summarizes the week I spent with Symbolic.

Switching from pyup to dependabot

Tuesday January 14, 2020, Will Kahn-Greene

I maintain a bunch of Python-based projects including some major projects like Crash Stats, Mozilla Symbols Server, and Mozilla Location Services. In order to keep up with dependency updates, we used pyup to monitor dependencies in those projects and create GitHub pull requests for updates.

pyup was pretty nice. It would create a single pull request with many dependency updates in it. I could then review the details, wait for CI to test everything, make adjustments as necessary, and then land the pull request and go do other things.

Starting in October of 2019, pyup stopped doing monthly updates. A co-worker of mine tried to contact them to no avail. I don't know what happened. I got tired of waiting for it to start working again.

Since my projects are all on GitHub, we had already switched to GitHub security alerts. Given that, I decided it was time to switch from pyup to dependabot (also owned by GitHub).

Switching from pyup to dependabot

I had to do a bunch of projects, so I ended up with a process along these lines:

1. Remove projects from pyup.

   All my projects are either in mozilla or mozilla-services organizations on GitHub.

   We had a separate service account configure pyup, so I'm not able to make changes to pyup myself.

   I had to ask Greg to remove my projects from pyup.

   I wouldn't suggest proceeding until your project has been removed from pyup. Otherwise, it's possible you'll get PRs from pyup and dependabot for the same updates.

2. Add dependabot configuration to repo.

   Then I added the required dependabot configuration to my repository and removed the pyup configuration.

   I used these resources:

   I created a pull request with these changes, reviewed it, and landed it.

3. Enable dependabot.

   For some reason, I couldn't enable dependabot for my projects. I had to ask Greg who I think asked Hal to enable dependabot for my projects.

   Once this was done, then dependabot created a plethora of pull requests.

While there are Mozilla-specific bits in here, it's probably generally helpful.

Dealing with incoming pull requests

dependabot isn't as nice as pyup was. It can only update one dependency per PR. That stinks for a bunch of reasons:

working through 30 PRs is extremely time consuming
every time you finish up work on one PR, it triggers dependabot to update the others and that triggers email notifications, CI builds, and a bunch of spam and resource usage
dependencies often depend on each other and need to get updated as a group

Since we hadn't been keeping up with Python dependencies, we ended up with between 20 and 60 pull requests to deal with per repository.

For Antenna, I rebased each PR, reviewed it, and merged it by hand. That took a day to do. It sucked. I can't imagine doing this four times every month.

While working on PRs for Socorro, I hit a case where I needed to update multiple dependencies at the same time. I decided to write a tool that combined pull requests.

Thus was born paul-mclendahand. Using this tool, I can combine pull requests. Using paul-mclendahand, I worked through 20 pull requests for Tecken in about an hour. This saves me tons of time!

My process goes like this:

create a new branch on my laptop based off of the main branch
list all open pull requests by running pmac listprs
make a list of pull requests to combine into it
for each pull request, I:
1. run pmac add PR
2. resolve any cherry-pick conflicts
3. (optional) rebuild my project and run tests
push the new branch to GitHub
create a pull request
run pmac prmsg and copy-and-paste the output as the pull request description

I can then review the pull request. It has links to the other pull requests and the data that dependabot puts together for each update. I can rebase, add additional commits, etc.

When I'm done, I merge it and that's it!
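For reference, the command sequence in that process can be written down as a small dry-run planner. This sketch only builds the list of commands and does not run anything. The `pmac` subcommands come from the steps above; the `git` invocations, the `origin` remote, and the function name are my assumptions. Manual steps (resolving conflicts, running tests, creating the pull request) are left out.

```python
def plan_combined_pr(branch, pr_numbers, base="main"):
    """Return the commands for combining pull requests, in order (dry run)."""
    commands = [
        # Assumed git usage: new branch based off the main branch.
        ["git", "checkout", "-b", branch, base],
        ["pmac", "listprs"],
    ]
    for number in pr_numbers:
        # One "pmac add" per pull request; conflicts get resolved by hand.
        commands.append(["pmac", "add", str(number)])
    # Assumes the GitHub remote is named "origin".
    commands.append(["git", "push", "origin", branch])
    commands.append(["pmac", "prmsg"])
    return commands


for command in plan_combined_pr("update-deps", [101, 102]):
    print(" ".join(command))
```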

paul-mclendahand v1.0.0

I released paul-mclendahand 1.0.0!

Install it with pipx:

pipx install paul-mclendahand

Install it with pip:

pip install paul-mclendahand

It doesn't just combine pull requests from dependabot--it's general and can work on any pull requests.

If you find any issues, please report them in the issue tracker.

I hope this helps you!

How to pick up a project with an audit

Tuesday January 7, 2020, Will Kahn-Greene

Over the last year, I was handed a bunch of projects in various states. One of the first things I do when getting a new project that I'm suddenly responsible for is to audit the project. That helps me figure out what I'm looking at and what I need to do with it next.

This blog post covers my process for auditing projects I'm suddenly the proud owner of.

Socorro Engineering: Year in Review 2019

Monday January 6, 2020, Will Kahn-Greene

Summary

Last year at about this time, I wrote a year in review blog post. Since I only worked on Socorro at the time, it was all about Socorro. In 2019, that changed, so this blog post covers the efforts of two people across a bunch of projects.

2019 was pretty nuts. We accomplished a lot, but picking up a bunch of new projects really threw a wrench in the wheel of ongoing work.

This year in review covers highlights, some numbers, and some things I took away.

Here's the list of projects we worked on over the year:

Crash stats: (aka Socorro) the Mozilla crash ingestion pipeline
Symbols server: (aka Tecken) the Mozilla symbols server
Buildhub and Buildhub2: indexes of builds of Mozilla products
PollBot and Delivery Dashboard: a system for showing release status
Mozilla Location Services: Mozilla's geolocation system

Markus v2.0.0 released! Better metrics API for Python projects.

Thursday September 19, 2019, Will Kahn-Greene

What is it?

Markus is a Python library for generating metrics.

Markus makes it easier to generate metrics in your program by:

providing multiple backends (Datadog statsd, statsd, logging, logging roll-up, and so on) for sending metrics data to different places
sending metrics to multiple backends at the same time
providing a testing framework for easy metrics generation testing
providing a decoupled architecture, similar to the Python logging module, so code can generate metrics without first making sure a metrics client has been created and configured

We use it at Mozilla on many projects.
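To illustrate the decoupled, logging-like design and the fan-out to multiple backends, here is a minimal standard-library sketch. It is not the Markus API: `MetricsInterface`, its `configure` method, and the list-based "backends" are hypothetical stand-ins chosen to keep the example self-contained.

```python
class MetricsInterface:
    """Hypothetical stand-in for a metrics interface; NOT the Markus API."""

    # Shared by every interface and configured once at application startup.
    _backends = []

    def __init__(self, prefix):
        self.prefix = prefix

    @classmethod
    def configure(cls, backends):
        cls._backends = list(backends)

    def incr(self, stat, value=1, tags=None):
        record = ("incr", "%s.%s" % (self.prefix, stat), value, tags or [])
        # Every configured backend receives the same record.
        for backend in self._backends:
            backend(record)


# Library code can create an interface at import time, before configuration.
metrics = MetricsInterface("myapp.views")

# Application startup decides where metrics go; here, two in-memory lists.
sent_a, sent_b = [], []
MetricsInterface.configure([sent_a.append, sent_b.append])

metrics.incr("page_hit", tags=["env:test"])
```

Because the interface looks up backends at emit time, modules that generate metrics never need to know how or whether metrics were configured, which is the same separation `logging.getLogger` gives you. Capturing records in a list is also a simple way to assert on emitted metrics in tests.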

v2.0.0 released!

I released v2.0.0 just now. Changes:

Features

Use time.perf_counter() if available. Thank you, Mike! (#34)
Support Python 3.7 officially.
Add filters for adjusting and dropping metrics getting emitted. See documentation for more details. (#40)

Backwards incompatible changes

tags now defaults to [] instead of None which may affect some expected test output.
Adjust internals to run .emit() on backends. If you wrote your own backend, you may need to adjust it.
Drop support for Python 3.4. (#39)
Drop support for Python 2.7.

If you're still using Python 2.7, you'll need to pin to <2.0.0. (#42)

Bug fixes

Document feature support in backends. (#47)
Fix MetricsMock.has_record() example. Thank you, John!

Socorro Engineering: July 2019 happenings and putting it on hold

Tuesday August 6, 2019, Will Kahn-Greene

Summary

Socorro Engineering team covers several projects:

Socorro is the crash ingestion pipeline and Crash Stats web service for Mozilla's products like Firefox.
Tecken is the symbols server for uploading, downloading, and symbolicating stacks.
Buildhub2 is the build information index.
Buildhub is the previous iteration of Buildhub2 that's currently deprecated and will get decommissioned soon.
PollBot and Delivery Dashboard are something something.

This blog post summarizes our activities in July.

Highlights of July

Socorro: Added modules_in_stack field to super search allowing people to search the set of module/debugid for functions that are in the stack of the crashing thread.

This lets us reprocess crash reports that have modules for which symbols were just uploaded.
Socorro: Added PHC related fields, dom_fission_enabled, and bug_1541161 to super search.
Socorro: Fixed some things further streamlining the local dev environment.
Socorro: Reformatted Python code with Black.
Socorro: Extracted supersearch and fetch-data commands as a separate Python library: https://github.com/willkg/crashstats-tools
Tecken: Upgraded to Python 3.7 and adjusted storage bucket code to work better for multiple storage providers.
Tecken: Added GCS emulator for local development environment.
PollBot: Updated to use Buildhub2.

Hiatus and project changes

In April, we picked up Tecken, Buildhub, Buildhub2, and PollBot in addition to working on Socorro. Since then, we've:

audited Tecken, Buildhub, Buildhub2, and PollBot
updated all projects, updated dependencies, and performed other necessary maintenance
documented deploy procedures and basic runbooks
deprecated Buildhub in favor of Buildhub2 and updated projects to use Buildhub2

Buildhub is decommissioned now and is being dismantled.

We're passing Buildhub2 and PollBot off to another team. They'll take ownership of those projects going forward.

Socorro and Tecken are switching to maintenance mode as of last week. All Socorro/Tecken related projects are on hold. We'll continue to maintain the two sites doing "keep the lights on" type things:

granting access to memory dumps
adding new products
adding fields to super search
making changes to signature generation and updating siggen library
responding to outages
fixing security issues

All other non-urgent work will be pushed off.

As of August 1st, we've switched to Mozilla Location Services. We'll be auditing that project, getting it back into a healthy state, and bringing it in line with current standards and practices.

Given that, this is the last Socorro Engineering status post for a while.

crashstats-tools v1.0.1 released! cli for Crash Stats.

Wednesday July 31, 2019, Will Kahn-Greene

What is it?

crashstats-tools is a set of command-line tools for working with Crash Stats (https://crash-stats.mozilla.org/).

crashstats-tools comes with two commands:

supersearch: for performing Crash Stats Super Search queries
fetch-data: for fetching raw crash, dumps, and processed crash data for specified crash ids

v1.0.1 released!

I extracted two commands we have in the Socorro local dev environment as a separate Python project. This allows anyone to use those two commands without having to set up a Socorro local dev environment.

The audience for this is pretty limited, but I think it'll help significantly for testing analysis tools.

Say I'm working on an analysis tool that looks at crash report minidump files and does some additional analysis on them. I could use the supersearch command to get a list of crash ids to download data for and the fetch-data command to download the requisite data.

$ export CRASHSTATS_API_TOKEN=foo
$ mkdir crashdata
$ supersearch --product=Firefox --num=10 | \
    fetch-data --raw --dumps --no-processed crashdata

Then I can run my tools on the dumps in crashdata/upload_file_minidump/.
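For example, an analysis script might start by collecting the downloaded dump files. This is a small sketch using only the standard library; `list_minidumps` is a hypothetical helper name, and the directory layout is the one shown above.

```python
from pathlib import Path


def list_minidumps(crashdata_dir):
    """Return the sorted paths of downloaded minidump files."""
    dump_dir = Path(crashdata_dir) / "upload_file_minidump"
    if not dump_dir.is_dir():
        # Nothing has been fetched yet.
        return []
    return sorted(path for path in dump_dir.iterdir() if path.is_file())


for dump_path in list_minidumps("crashdata"):
    print(dump_path)
```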

Be thoughtful about using data

Make sure to use these tools in compliance with our data policy:

https://crash-stats.mozilla.org/documentation/memory_dump_access/

Where to go for more

See the project on GitHub. The README covers everything about the project, including examples of usage, and the repository has the issue tracker and the source code:

https://github.com/willkg/crashstats-tools

Let me know whether this helps you!

Will's Blog
