Make sidekiq report _all_ exceptions to Instrumental#35
Make sidekiq report _all_ exceptions to Instrumental#35esquivalient wants to merge 6 commits intomasterfrom
Conversation
… and not just StandardError and it's descendents. Why is this change needed? -------------------------- It came to our attention that some Metrician users have jobs that use code like `Exception.new`. That's not good practice, but reality is reality. In those cases, the Sidekiq middleware was not reporting those exceptions to Instrumental because `rescue` does not catch `Exception` by default. How does it address the issue? ------------------------------ Updates the Sidekiq middleware to rescue all exceptions, send the appropriate data to Instrumental, and the re-raise. On the way to doing this, several other changes were needed. - Our testing matrix was still looking at Ruby 2.1.5, 2.2.3, and 2.3.0. Those versions of Ruby are EOL, so they've been removed. - As a consequence, we increased the required Ruby version to 2.4.0... - And increased the Ruby version for developing Metrician to 2.6.0 - Also, we added all the newest versions of Ruby to the testing matrix - Which meant we needed to update delayed_job because the previous version depended on `yaml_as` that was part of Psych. Psych is no longer included in Ruby and newer versions of delayed_job handle that. - In other news, the testing matrix now requires installing two versions of Bundler. The default version of Bundler from gem is >2.0, but 2.0 will try to use the version of Bundler in the lock file. The lock files all contain <2.0 because the test Rails project still requires it. As a consequence, you'll get both versions of Bundler until all the lock files are updated. Any links to any relevant tickets, articles or other resources? --------------------------------------------------------------- https://stackoverflow.com/a/54088849/1417774 https://stackoverflow.com/questions/53655913/error-upgrading-ruby-version-undefined-method-yaml-as https://github.com/collectiveidea/delayed_job/blob/master/CHANGELOG.md#414---2017-12-29 Co-authored-by: Alex Overbeck <alex@expectedbehavior.com>
janxious
left a comment
There was a problem hiding this comment.
Given Rails 4.2 and Rails 5.0 are both EOL, perhaps we should also add Gemfile.6.0, Gemfile.5.2, and Gemfile.5.1?
|
Also should figure out what Travis is angry. Looks like something with bundler but I didn't dig in. |
Why is this change needed? -------------------------- Bundler 2.0 and greater attempts to use the version of Bundler in the lock file. If it doesn't find it, you get an error like `can't find gem bundler (>= 0.a) with executable bundle`. How does it address the issue? ------------------------------ We don't want to update all the lock files because the Rails project used for testing still requires Bundler < 2.0, so we're just making both the latest version of Bundler and the last version of Bundler compatible with the test version of Rails available. This worked in script/setup, so it'll probably work in CI. Any links to any relevant tickets, articles or other resources? --------------------------------------------------------------- https://travis-ci.org/Instrumental/metrician-ruby/jobs/602005821
@janxious Alex and I worked on this today. It turned into a totally deep hole and may have revealed that Metrician does not work correctly on newer versions of Rails. The saga continues on Monday... |
|
I am not shocked. Anything you need a hand with? |
Not yet. Mostly just shaving the yak. |
…m we support for each of those versions of Rails Why is this change needed? -------------------------- This is a gem for automatically instrumenting Rails and popular libraries used in Rails. We should test and make sure it works for some reasonable combination of versions. How does it address the issue? ------------------------------ Let me count the ways! 1) Adds script/generate_gemfiles This script automatically generates a gemfile for each valid combination of Rails, the required support gems, and the gems we automatically instrument. When run, it also provides the output that needs to be added to the .travis.yml file for those gemfiles to be run as part of the test suite. 2) Adds newly generated gemfiles for Rails 5.1.3, 5.2.0, and 6.0.0 Some gems we support (e.g. dalli/memcached) are mutually exclusive, so each version of Rails has more than one gemfile. 3) Fixes an issue with Bundler versions in Travis (and other places) If you want to install more than one version of Bundler and you don't specify a version for both, you only get one. That's bad, since some of the older gemfiles in this repo are still using Bundler 1.x. 4) Guards against infinite recursion Newer gemfiles have newer versions of rspec which have different loading behaviors that caused the active_record instrumentation code to load more than once. It turns out, that guard code doesn't work correctly and could cause infinite recursion. Here's the code comment, for the lazy You might wonder why this exception-based flow control is here instead of something more sane like instrumented_class.method_defined?(:log_without_metrician) The answer is that method_defined?, methods, instance_methods and other reasonable ways of answering this question do not work correctly in this circumstance. They return false even when log_without_metrician is defined on the class. The result is that the guard statement fails and runs this code twice. That makes log_without_metrician an alias of log_with_metrician and causes infinite recursion :( It seems like some kind of deep Ruby shenanigans avoud eigen classes and/or possibly a bug. It could maybe be improved by avoiding the use of class_eval, but we did not investigate on account of how many yaks we've already shaved this week. 5) Some test environment changes Newer versions of sidekiq don't automatically load sidekiq/testing, so that's no specifically required. Also, newer versions of HoneyBadger change the way they disabled data sending. That's conditionalized in the tests until we stop supporting the versions older than 3.3.1. Any links to any relevant tickets, articles or other resources? --------------------------------------------------------------- Co-authored-by: Alex Overbeck <alex@expectedbehavior.com>
Why is this change needed? -------------------------- We support a lot of different libraries across many ruby versions and rails versions. How does it address the issue? ------------------------------ There's some automation in place to make sure it's all done correctly, but now there's also words. Any links to any relevant tickets, articles or other resources? --------------------------------------------------------------- Co-authored-by: Alex Overbeck <alex@expectedbehavior.com>
Why is this change needed? -------------------------- We support the memcached gem. It's hard to test when there's no memcached running. How does it address the issue? ------------------------------ Installs the thing. Any links to any relevant tickets, articles or other resources? --------------------------------------------------------------- As requested in this PR: #35
Why is this change needed? -------------------------- Other people requested they be... more descriptive. How does it address the issue? ------------------------------ Now includes the specific differentiator of the two branches in the message. Any links to any relevant tickets, articles or other resources? --------------------------------------------------------------- As requested #35
jason-o-matic
left a comment
There was a problem hiding this comment.
Overall looks good. I commented about changing the method_defined? stuff. I also think we should add a test for that if it isn't super hard. Maybe something that explicitly tries to load the ActiveRecord instrumentation twice just to make sure it keeps working.
| already_defined = false | ||
| end | ||
|
|
||
| return if already_defined |
There was a problem hiding this comment.
It looks like private_method_defined? is the way to get at this particular method here, if we don't want to use exception based flow control:
39: rescue
40: already_defined = false
41: end
42: debugger if !instrumented_class.method_defined?(:log_without_metrician) && already_defined
43:
=> 44: return if already_defined
45: debugger
46: instrumented_class.class_eval do
47: debugger
48: alias_method :log_without_metrician, :log
(byebug) instrumented_class.method_defined?(:log_without_metrician)
false
(byebug) instrumented_class.private_method_defined?(:log_without_metrician)
true
Maybe the way that the log method gets marked as private has changed in Rails and our alias is ending up marked as private? I could see checking both method_defined? and private_method_defined? to be sure.
I'm pro switching this, but I'm happy to talk about it more.
There was a problem hiding this comment.
private_method_defined? does appear to work for some of the ruby/rails combos, but not all of them. I'm just doing a quick check, so it doesn't work for the oldest combo and it does work on the newest one.
Pros and cons
-
Attempting to get the unbound method definitely works across all versions. It will probably work forever because if you can't access a method through
methodthen there is no method. But, you have to use exception flow control to usemethod. -
We can try to figure out which methods work with which combos and conditionalize it. I think it's probably a pre-2.5/post-2.5 split.
-
We can try to verify that some combination of interrogation methods should cover every combo (e.g.
method_defined?andprivate_method_defined?)
Given that the outcome is infinite recursion I prefer the one that definitely can't be wrong, but I'm convincible. It's not like I think exception-based flow control is the bees knees.
There was a problem hiding this comment.
I don't feel super strongly either way. The fact that *method_defined? has changed over time does make me wonder if method will actually be trustworthy into the future, but probably. If we leave the exception based flow control we might want to move it into an already_defined? method just to encapsulate the badness.
In any case I think we should definitely add some kind of a test for this.
… and not just StandardError and it's descendents.
Why is this change needed?
It came to our attention that some Metrician users have jobs that use code like
Exception.new. That's not good practice, but reality is reality. In those cases, the Sidekiq middleware was not reporting those exceptions to Instrumental becauserescuedoes not catchExceptionby default.How does it address the issue?
Updates the Sidekiq middleware to rescue all exceptions, send the appropriate data to Instrumental, and the re-raise.
On the way to doing this, several other changes were needed.
yaml_asthat was part of Psych. Psych is no longer included in Ruby and newer versions of delayed_job handle that.Any links to any relevant tickets, articles or other resources?
https://stackoverflow.com/a/54088849/1417774
https://stackoverflow.com/questions/53655913/error-upgrading-ruby-version-undefined-method-yaml-as
https://github.com/collectiveidea/delayed_job/blob/master/CHANGELOG.md#414---2017-12-29
Co-authored-by: Alex Overbeck alex@expectedbehavior.com