Fine-Tuning Live Debugging with Conditional and Time-travel Tracepoints

 2 years ago
source link: https://oz-code.com/blog/production-debugging/fine-tuning-live-debugging-with-conditional-and-time-travel-tracepoints

Dynamic logging with tracepoints ushered in a new era of live debugging. Conditional and time-travel tracepoints are the next generation and take live debugging to new heights.

Developers have this love/hate relationship with logging. Logs are one of the pillars of observability and the first line of defense against production errors. But using them for debugging in production is very cumbersome. Dynamic logging is much more effective. We introduced Tracepoints into Ozcode Live Debugger several months ago, so you can use dynamic logging to debug elusive bugs and production issues – the ones that don’t throw an exception but make your application misbehave.

The imperfections of dynamic logging

Dynamic logs are a giant leap towards effective incident resolution compared to static logs. First and foremost, you can update logs on the fly without going through a complete CI/CD process. Storage is not a problem since dynamic logs can be switched off as soon as the issue at hand is resolved. Dynamic logs also never go stale. They are easily removed once they’re not needed, keeping your source code clean and focused on the actual business logic at hand.

However, you still need to address the haystack and the unknowns.

The haystack

In some cases, the error you are investigating only happens under a very particular set of conditions. If your dynamic log entry fires for every tracepoint hit, you may find yourself digging through many logs before you identify the relevant one. While this may feel familiar to those who are used to digging through mountains of static logs, it’s exactly what we’re trying to avoid.

The unknowns

While a snapshot of application data at a tracepoint is helpful, it is often not enough. The values of variables at the line of code where you placed the tracepoint provide some insights, but you still don’t know how those variables came to hold those values. How the code execution flow affected the application state, line by line, until it reached the tracepoint is still unknown. You have to mentally step back through the code to figure out which conditional execution paths were traversed and what the value of each variable was at every step.

Conditional tracepoints to the rescue

Welcome to the next generation of tracepoints in Ozcode Live Debugger. Conditional tracepoints give you a lot more control over when to capture a tracepoint and output a log. When setting a tracepoint, you can define a set of conditions based on any variable in scope to determine when that tracepoint should actually fire a log entry to the output stream.

Autocomplete helps you select the variables to output, and you can build complex conditional expressions using the AND and OR operators.
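To make the idea concrete, here is a minimal Python sketch of a condition-gated log entry. It is not Ozcode's API: the `ConditionalTracepoint` class, the `process_order` function, and the field names are all hypothetical, and a real live debugger would attach the tracepoint without touching the source. The sketch only shows the principle that a hit produces output when a condition over the variables in scope holds.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ConditionalTracepoint:
    """Hypothetical stand-in for a conditional tracepoint (not Ozcode's API)."""
    message: str                                  # template filled from variables in scope
    condition: Callable[[Dict[str, Any]], bool]   # decides whether a hit is logged
    hits: List[str] = field(default_factory=list)

    def hit(self, scope: Dict[str, Any]) -> None:
        # Emit a log entry only when the condition holds for this hit.
        if self.condition(scope):
            self.hits.append(self.message.format(**scope))


# Condition built from AND / OR over variables in scope (illustrative names).
tp = ConditionalTracepoint(
    message="order {order_id} for {customer}: total={total}",
    condition=lambda s: s["customer"] == "Acme"
    and (s["total"] < 0 or s["total"] > 10_000),
)


def process_order(order_id: int, customer: str, total: float) -> None:
    tp.hit(locals())  # the "tracepoint" location in this sketch


for args in [(1, "Acme", 120.0), (2, "Globex", -5.0), (3, "Acme", -5.0)]:
    process_order(*args)

print(tp.hits)
```

Of the three calls, only the third satisfies both the customer and the total condition, so it is the only one that ends up in the output.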

Time-travel data per tracepoint

What about understanding how a variable “develops” through the code execution flow? We’ve taken care of that too. You can now record time-travel debug information along with the stack trace for any tracepoint. It’s all about understanding “how we got here.” When examining a tracepoint hit, color-coded conditional statements clearly show how the code executed, and annotations display the value of each variable at every step of the way, from the beginning of the method down to the tracepoint. This step-by-step data provides deep insights into the chain of causality of the error in question.
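As a rough illustration of what step-by-step data within a method looks like, the following Python sketch records a snapshot of local variables before each line of a single function runs, using the standard `sys.settrace` hook. This is an assumption-laden toy, not how Ozcode captures time-travel data; `record_locals` and `apply_discount` are hypothetical names introduced only for this example.

```python
import sys
from typing import Any, Dict, List, Tuple


def record_locals(func, *args, **kwargs):
    """Run func, recording (line offset, locals) just before each line executes."""
    history: List[Tuple[int, Dict[str, Any]]] = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore frames that belong to other functions
        if event == "line":
            offset = frame.f_lineno - code.co_firstlineno
            history.append((offset, dict(frame.f_locals)))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the previous trace hook
    return result, history


def apply_discount(price, is_member):
    discount = 0.0
    if is_member:
        discount = 0.2
    total = price * (1 - discount)
    return total


result, history = record_locals(apply_discount, 100.0, True)
for offset, snapshot in history:
    print(offset, snapshot)
```

Each recorded entry pairs a line offset within the function with the variable values at that moment, so the output shows both which branch was taken and when `discount` changed.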

Let’s see how this can be helpful.

Debugging logic errors in microservices

Microservices are great. As small, distinct pieces of code, they’re relatively easy to develop. But once they’re deployed into your distributed architecture, things can get tricky. Logical bugs that only appear under special circumstances of your complex production environment can be tough to pin down. This is where conditional tracepoints with time-travel data can really help. Here are a few tactics you can use.

  • Use the stack trace to understand the code execution flow when an error occurs.
  • Add tracepoints at each level in the stack trace so you can monitor relevant application data to see when something goes wrong. The time-travel data within the scope of each tracepoint will be very helpful in showing how data changes along the execution flow that leads to the error.
  • Identify the conditions under which an error occurs and use conditional tracepoints to provide data – but only when the errors occur. No need to add straw to the haystack! Note that conditions can be based on contextual data such as customer name, machine name, etc., as well as the values of local variables.
  • Use the Agents filter to ignore any tracepoint hits from services that aren’t connected to the error you’re debugging.
  • Once you start homing in on the source of the error, you can add columns to the Tracepoint Hits panel and filter on the service causing the error.
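The last two tactics amount to filtering hits by the service that reported them and then projecting the captured data into columns. The Python sketch below mimics that workflow on in-memory records. The `TracepointHit` structure, the service names, and the column names are hypothetical and do not reflect the data model of the Tracepoint Hits panel.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Set


@dataclass
class TracepointHit:
    agent: str             # service/agent that reported the hit (illustrative)
    tracepoint: str        # hypothetical tracepoint label
    data: Dict[str, Any]   # captured variables and contextual data


def filter_by_agents(hits: Iterable[TracepointHit], agents: Set[str]) -> List[TracepointHit]:
    # Keep only hits reported by the services under investigation.
    return [h for h in hits if h.agent in agents]


def to_rows(hits: Iterable[TracepointHit], columns: List[str]) -> List[Dict[str, Any]]:
    # Project each hit into a row with the agent plus the chosen columns.
    return [{"agent": h.agent, **{c: h.data.get(c) for c in columns}} for h in hits]


hits = [
    TracepointHit("checkout-service", "apply_discount", {"customer": "Acme", "total": -5.0}),
    TracepointHit("search-service", "rank_results", {"query": "shoes"}),
    TracepointHit("pricing-service", "lookup_price", {"customer": "Acme", "total": 80.0}),
]

relevant = filter_by_agents(hits, {"checkout-service", "pricing-service"})
rows = to_rows(relevant, ["customer", "total"])
suspect = [r for r in rows if r["total"] is not None and r["total"] < 0]
print(suspect)
```

Filtering first and adding columns second keeps the unrelated service out of the view entirely, which leaves a single suspicious row to inspect.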

The real thing (almost)

Developers need ten fingers to write code, but only one to debug it in their IDEs. That’s right, F5 (Start debugger), F10 (Step over), and F11 (Step into). OK, not quite, but you get the point. Thing is, they want the same kind of experience when debugging in production. Digging through reams of static log files is unacceptable. Stepping through decompiled code to examine application state around tracepoints feels much more natural. You can do that with one finger.

Ozcode Live Debugger

Omer Raviv
