I thought this mostly happened on the first try after a deploy, but when I went to reproduce it to post here, it happened every time I triggered the action that schedules the job. The job definitely exists and has run previously.
Any ideas?
EDIT:
This is really bugging me. I no longer think it’s related to deploys. Basically I have an action in the Avo gem that does this:
When I select a number of records in Avo to run that action on, I see the jobs get scheduled in Mission Control Jobs. Sometimes they all run with no problem, but other times some of them appear as failed, with the following error:
So you have seen errors for two constants? Errors for ImportGalleryStructureJob and errors for AssignObjectsToPhotosJob? Have you seen others? In your app, this only happens in production?
Yes, it doesn’t appear to matter which job it is. Sometimes it just says it’s an uninitialized constant. I can click retry, and sometimes it fails straight away with the same error and other times it completes successfully.
And yes, only in production. It’s like it just can’t find a job with that name. I’m very confused by it.
I encountered something like this before myself! One thing to check: there’s a setting, config.rake_eager_load, which defaults to false and overrides the config.eager_load value when the app is booted through rake. So if you are starting your job process with rake, you might want to try setting rake_eager_load to true in production as well.
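For reference, a minimal sketch of what that would look like, assuming the standard config/environments/production.rb layout of a Rails app. Whether it helps depends on the worker actually being started through rake.

```ruby
# config/environments/production.rb
Rails.application.configure do
  # Eager load when the app boots normally (server, runner, etc.).
  config.eager_load = true

  # Rake tasks use this value in place of config.eager_load.
  # It defaults to false, so a worker started via rake boots lazily
  # unless this is set.
  config.rake_eager_load = true
end
```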
@ziggycalyx could you elaborate? Did that happen to you using Zeitwerk?
I ask because in theory eager loading does not matter for this problem, since the autoloaders are configured anyway. That is, constants should be found with eager loading disabled the same way they are in development mode. Only the file system is not watched, and reloading is disabled.
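To illustrate the point, here is a small sketch in plain Ruby, outside Rails. It uses Ruby’s built-in `Module#autoload`, the mechanism Zeitwerk builds on, to show a constant resolving on first reference without any eager loading. `DemoJob` and the temp file are made up for the demo and are not from the app in question.

```ruby
require "tmpdir"

# Write a throwaway file defining a hypothetical job class.
dir  = Dir.mktmpdir
path = File.join(dir, "demo_job.rb")
File.write(path, <<~RUBY)
  class DemoJob
    def self.perform
      :done
    end
  end
RUBY

# Register the constant for lazy loading, roughly what an autoloader
# sets up at boot. Nothing is loaded yet.
Object.autoload(:DemoJob, path)

pending_before = Object.autoload?(:DemoJob) # path string while still pending
result         = DemoJob.perform            # first reference loads the file
pending_after  = Object.autoload?(:DemoJob) # nil once the constant is loaded
```

If the autoload is registered, the first reference finds the constant. So an uninitialized constant error suggests the process is running with a different setup or with stale code, rather than simply lacking eager loading.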
This was a few weeks ago, using the latest version of Rails with Zeitwerk and delayed_job. Now, I’ll concede, I was not going about it in the most scientific way, as I was working on a few things at the same time, and the problem went away without me figuring out exactly what caused it. I suspect it may have been because the delayed_job service was not restarting correctly after deploying new code, but while experimenting with running jobs, I was surprised to find different behavior between bin/delayed_job and rake jobs:work with regard to eager loading.