Getting local variable values in stack traces, in production

hendzen · on June 5, 2014

Glad to see the dynamic language folks reinventing core dumps ;)

bcantrill · on June 5, 2014

For whatever it's worth, we've done similar stuff with node.js, operating from an actual OS-generated (i.e., not reinvented) core dump.[1] The difference in approach here is that we don't require the app to load anything, and we don't change its behavior -- we perform all of the inference from the dump itself, using extensive knowledge of V8.[2] It's great to see other dynamic languages discover the merits of postmortem debugging; for us, it's been essential for developing node.js-based services.

[1] http://www.joyent.com/blog/mdb-and-node-js

[2] http://www.slideshare.net/bcantrill/goto2012

brianr · on June 5, 2014

That's pretty awesome, we'd love to have this in Node as well (which we also use). Will check it out.

zackmorris · on June 5, 2014

Does anyone know of a debugger that can be built into C apps during development, so that testers can log stack traces with values, or be able to continue on breakpoints?

I can't tell you how many times users have hit strange bugs, and it took me forever to reproduce them in my debugger. Giving them a limited debugger would have saved me countless hours. Remote debugging is not really an option (I'd like something that can be compiled in).

forgottenpass · on June 5, 2014

What you want are core dumps; they allow after-the-fact use of the debugger. Use ulimit to raise the core dump size limit above 0 (I just use unlimited) and you'll get a core dump whenever the process faults. More specifically, there are a handful of signals whose default action creates a core file (SIGQUIT, SIGABRT, and SIGSEGV among them), so you can also send one of those to the process with kill to get a dump on demand.
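
For a process that cannot rely on the shell's ulimit setting, the same limit can be raised from inside the program. A minimal sketch for a Python process on a Unix-like system, using the standard `resource` module; the function name `enable_core_dumps` is made up for illustration:

```python
import resource


def enable_core_dumps():
    """Raise this process's soft core-size limit up to its hard limit.

    Roughly what `ulimit -c` does in a shell, but applied from within the
    running process. Returns the resulting (soft, hard) pair.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```

If the hard limit is itself 0, this cannot help; the limit has to be raised by whatever launches the process. Where the dump ends up (or whether it is piped to a crash handler) depends on system configuration.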

zackmorris · on June 5, 2014

Haha I just had to post this. When I went to read about core dumps in iOS:

http://stackoverflow.com/questions/7353024/when-an-app-crash...

I got a StackOverflow core dump, of sorts, similar to this:

http://img215.imageshack.us/img215/2886/picturezb.png

Currently I am using an uncaught exception handler based on this:

http://www.cocoawithlove.com/2010/05/handling-unhandled-exce...

But unfortunately it doesn't give very descriptive stack traces, even with debugger symbols turned on in the project settings. I'm really looking for something that shows me a full view of the program's state, just as if I were in the debugger. It may be possible to extrapolate from the core dump, but I’m having a hard time figuring it out. This post summarizes how to do it with gdb:

http://stackoverflow.com/questions/5115613/core-dump-file-an...

But I’m thinking a huge opportunity has been lost here. This should be built into IDEs, and for mobile apps especially there should be a standard way of sending core dumps back to the developer when apps crash, particularly for ad hoc builds during testing.

js2 · on June 5, 2014

Crittercism, Bugsense, Crashlytics and HockeyApp are all commercial providers that capture application crashes and upload them to their backend. A few of them have built their SDKs on the open-source PLCrashReporter. You can also look at KSCrash and Google Breakpad. KSCrash may be the most advanced. Google Breakpad captures the closest thing to a core dump (it gathers only the stack and registers though).

brianr · on June 5, 2014

We've been using this internally for a few weeks now and it has been really, really awesome. Currently just Python, but it should be feasible for Ruby, PHP, and other dynamic languages too.

ianbicking · on June 5, 2014

A related trick that paste.exceptions implemented (it may also be in weberror; all of this was adopted from Zope): if you set the local variable __traceback_info__ to some value, that value is included in the traceback (the one that is emailed or whatever). And there are other __traceback_* variables that allow you to make more detailed additions to the report.
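
A rough sketch of the idea, not the actual paste.exceptions or Zope code: walk the traceback and, for each frame, print the value of a `__traceback_info__` local if one is set. The names `format_with_traceback_info`, `charge`, and `demo_report` are hypothetical.

```python
import sys
import traceback


def format_with_traceback_info(exc_type, exc, tb):
    """Format a traceback, adding each frame's __traceback_info__ local if set."""
    lines = ["Traceback (most recent call last):"]
    for frame, lineno in traceback.walk_tb(tb):
        code = frame.f_code
        lines.append(f'  File "{code.co_filename}", line {lineno}, in {code.co_name}')
        frame_locals = frame.f_locals
        if "__traceback_info__" in frame_locals:
            info = frame_locals["__traceback_info__"]
            lines.append(f"    __traceback_info__: {info!r}")
    # Finish with the usual "ExceptionType: message" line(s).
    for text in traceback.format_exception_only(exc_type, exc):
        lines.append(text.rstrip("\n"))
    return "\n".join(lines)


def charge(order_id, amount):
    # Annotate this frame with context worth seeing if anything below fails.
    __traceback_info__ = ("order", order_id)
    return amount / 0


def demo_report():
    try:
        charge(42, 10)
    except ZeroDivisionError:
        return format_with_traceback_info(*sys.exc_info())
```

Here the report returned by `demo_report()` carries the annotation from the `charge` frame alongside the usual file and line information.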

brianr · on June 5, 2014

That's a nice trick, though it does require knowing ahead of time what data you want (and having lots of __traceback_info__ and __traceback_supplement__ statements in the code). The nice thing about grabbing all locals is that the data is collected without having to think about it.
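
To illustrate what "grabbing all locals" can look like in CPython, here is a minimal sketch that reads `f_locals` from every frame of a traceback. It is an assumption about the general technique, not Rollbar's actual implementation; `frames_with_locals`, `divide`, and `capture` are illustrative names.

```python
import sys


def frames_with_locals(tb, max_repr=200):
    """Return one dict per traceback frame, including repr()'d local variables."""
    frames = []
    while tb is not None:
        frame = tb.tb_frame
        local_reprs = {}
        for name, value in frame.f_locals.items():
            try:
                text = repr(value)
            except Exception:
                # A broken __repr__ should not break error reporting.
                text = "<unrepresentable>"
            local_reprs[name] = text[:max_repr]  # cap the size of each value
        frames.append({
            "function": frame.f_code.co_name,
            "lineno": tb.tb_lineno,
            "locals": local_reprs,
        })
        tb = tb.tb_next
    return frames


def divide(numerator, denominator):
    return numerator / denominator


def capture():
    try:
        divide(1, 0)
    except ZeroDivisionError:
        return frames_with_locals(sys.exc_info()[2])
```

The innermost entry of `capture()` describes the `divide` frame, with `numerator` and `denominator` recorded without any annotation in the code that failed.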

helper · on June 5, 2014

One thing I missed switching from Perl to Ruby was stack traces that show the function arguments.

rbarrois · on June 5, 2014

This is already available, for a huge set of languages, with Sentry (originally designed for Python), too.

Including event aggregation, lib versions, and so on: https://www.getsentry.com/welcome/

albertzeyer · on June 5, 2014

I once wrote something similar: https://github.com/albertz/py_better_exchook

Example: https://gist.github.com/albertz/922622

It does not dump all local vars though, only those which appear in the line where the exception happened. And not only local but also global. But also including subfields, like `obj.field`. And it does that in a kind of hacky way, via some embedded simple Python parsing, but it works most of the way just fine.

brianr · on June 5, 2014

Very nice. Did you find globals to be useful often? We excluded those here because there can be a lot of them (e.g. imported modules), but they could be a good addition. `obj.field` is a nice touch too (our current approach will work if repr() shows the field, but that requires some code ahead of time).

albertzeyer · on June 5, 2014

That's why I only include those globals which are referred to in the line of the exception. And when they were referred to in the line, they often were also useful. Otherwise, you are right, there are way too many to show them all. I even found all the locals to be too many in many cases, which is why I used that simple heuristic. Also, I just made this a `sys.excepthook` replacement, so the output is plain text: you cannot simply fold the locals away, and with every local printed even a simple traceback would look too long and complicated.
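
A much-simplified sketch of that heuristic, assuming a regex instead of the real parsing py_better_exchook does: pick out the identifiers on the failing source line and look each one up in the frame's locals, then its globals. The helper name `vars_on_line` is hypothetical, and attribute lookups such as `obj.field` are deliberately skipped here.

```python
import keyword
import re

# Identifiers not preceded by a dot or word character, so the `field` part of
# `obj.field` is not mistaken for a separate variable.
IDENTIFIER = re.compile(r"(?<![\w.])[A-Za-z_]\w*")


def vars_on_line(source_line, local_vars, global_vars):
    """Return {name: repr(value)} for names on source_line, locals before globals."""
    found = {}
    for name in IDENTIFIER.findall(source_line):
        if keyword.iskeyword(name) or name in found:
            continue
        if name in local_vars:
            found[name] = repr(local_vars[name])
        elif name in global_vars:
            found[name] = repr(global_vars[name])
    return found
```

In an excepthook this would be called once per frame, with something like `linecache.getline(filename, lineno)` for the line and the frame's `f_locals` and `f_globals` for the two mappings. A regex will also match words inside string literals and comments, which is one reason a real implementation wants proper tokenizing.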

jwegan · on June 6, 2014

I open sourced something very similar a couple years ago. It is basically an error monitoring service that uses git blame to figure out which developer last touched the code that caused the exception. It then sends that developer a stack trace, along with all the values of all the variables in the stack frame and additionally the HTTP request if it is running as a django/pylons middleware.

https://github.com/shopkick/flawless

zhng · on June 5, 2014

Django includes local variables and contextual lines in stack traces. https://i.imgur.com/Iyk81mp.png

brianr · on June 5, 2014

Yep - the debug toolbars for Pyramid, Flask, etc. do as well, using the same facility as here. The cool thing about this feature is being able to see the local variables from real exceptions in production instead of just in your local dev environment.

consultutah · on June 5, 2014

Here is how to do it in .NET: http://stackoverflow.com/questions/3087595/values-of-local-v...

Figured it out a few years ago. Sadly requires binary rewriting...

shiftb · on June 5, 2014

I can't wait until this is available for Ruby!

Can we exclude certain arguments so we don't capture sensitive data?

brianr · on June 5, 2014

Ruby: absolutely. It's not quite as straightforward as Python, but it should be feasible using binding_of_caller. If any Ruby people want to help us figure this out, please drop us a line or stop by https://github.com/rollbar/rollbar-gem/issues/117

Excluding sensitive data: yep. This feature uses the same scrub_field list that's used to scrub sensitive data from the request (GET/POST/headers/etc).
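
As a rough illustration of name-based scrubbing (a sketch only, not Rollbar's code; the field list and the `scrub_locals` helper are made up for the example):

```python
# Hypothetical scrub list; a real one would come from configuration.
SCRUB_FIELDS = frozenset({"password", "passwd", "secret", "api_key"})


def scrub_locals(local_vars, scrub_fields=SCRUB_FIELDS, mask="******"):
    """Return a copy of local_vars with values of sensitive names masked.

    Matching is by variable name only, case-insensitively; values are not
    inspected, so a secret stored under an innocent name is not caught.
    """
    return {
        name: (mask if name.lower() in scrub_fields else value)
        for name, value in local_vars.items()
    }
```

Applied to a frame's locals before they are serialized, this keeps the variable names visible in the report while hiding the values.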

jinal · on June 5, 2014

Super excited about this! Can't wait to use it on our prod servers once it's available for Ruby!

100k · on June 5, 2014

Very nice! Congrats to Brian and the Rollbar team for releasing a super-useful feature.

andylash · on June 5, 2014

sweet feature, stoked to start using it