Hey, thank you for your answer. To give some context, I've added a summary below.
I don't think I completely follow what some of these questions have to do with my original question, since I am not asking how I should dimension my application server(s), nor how much of a priority this should be. But I'll gladly give some answers and rough (measured) numbers if that helps.
Yes, it is currently a problem, and of course it's growing. I'll give some context below.
We are talking about hundreds of thousands of individual users, some of whom have thousands of such relationships. Of course not all of them are returned at once; going through them is paginated, for example.
However, the issue is that for some requests, in extreme cases, many users (hundreds, approaching a thousand) are returned as part of some other objects, and having four additional queries made for each of them is quite a bit of overhead, even though the individual queries typically take around 1 ms (including network latency, etc.). That still means that in extreme cases a request ends up spending a whole second on processing. I want to avoid building a solution that only tackles these extreme cases.
My thought process therefore is to combine as much of this as possible into single queries. As mentioned, this is currently triggered by the view, and I am not sure what the best way (also in terms of idiomatic code) is to at least combine these four queries per user into one.
I have been thinking about having a method return a hash and then using that in the view. That would mean only one additional query per relationship type, but it looks very unidiomatic to me and still means this cost is incurred rather frequently.
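To illustrate what I mean, here is a minimal sketch in plain Ruby. The data and names are made up; in the real app each lookup would be a single grouped query (something like `SELECT user_id, COUNT(*) ... WHERE user_id IN (...) GROUP BY user_id`) instead of the in-memory stand-ins used here:

```ruby
# Sketch: instead of four queries per displayed user, run one batched
# lookup per relationship type for the whole page, keyed by user_id.
# The arrays below stand in for query results; in the app each would
# be the result of one grouped query.

def relationship_counts(rows, user_ids)
  counts = Hash.new(0)
  rows.each { |r| counts[r[:user_id]] += 1 if user_ids.include?(r[:user_id]) }
  counts
end

# Stand-in "tables" (hypothetical shape)
follows = [{ user_id: 1 }, { user_id: 1 }, { user_id: 2 }]
blocks  = [{ user_id: 2 }]

page_user_ids = [1, 2, 3]

# One pass per relationship type, instead of one query per user per type.
follow_counts = relationship_counts(follows, page_user_ids)
block_counts  = relationship_counts(blocks, page_user_ids)

page_user_ids.each do |id|
  puts "user #{id}: follows=#{follow_counts[id]}, blocks=#{block_counts[id]}"
end
```

The view would then only do hash lookups, and the number of queries per request stays constant regardless of how many users are rendered.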
My overall goal is to keep requests completing in well under a second without complicating my code too much, which is why I think moving all of this logic out of the view, preferably into the controller, and if possible somewhat "generically" into something I can re-use, would be great. I know that might not be all that easy, which is why I would be fine with at least significantly reducing the number of queries.
And for your last question: on the hardware side we are not currently limited. The database server (and its replicas) is currently overdimensioned, and the application servers are in the process of being scaled out. Since this is an older project there is some work needed to make that easier, but it's unrelated to this problem.
I hope that is concrete enough. So to sum it up:
There are hundreds of thousands of users. Since not all of them show the same activity or have the same popularity, making additional very small queries is at least okay for the majority of users, but for a growing number, that many very small queries becomes an issue. Throwing additional hardware at the problem wouldn't be an issue, but that doesn't currently look like the right approach to me, because we are not limited there, and it's individual requests that are slow.
Some side notes:
Caching isn't really working so well here, because we are talking about what a single user sees about other users in relation to themselves, so this data is both per-user and needs to be up to date.
I focused on the queries in the view here, since that is the main problem. But for better understanding I should mention that these relationships are also a form of permission system, so to speak (i.e. you can only view something when you are an accepted follower). Since in some situations this has to be checked upfront to decide what is shown, it is used for that as well. However, this is less of an issue, since it is usually done as part of whatever is queried for anyway. As mentioned, we are talking about sub-millisecond queries here.
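For the permission side, what I'd like to end up with is fetching the viewer's accepted relationships once per request and then checking membership in memory. Again a sketch with stubbed data and made-up names; in the real app `relationships` would be the result of one query for the current viewer:

```ruby
require 'set'

# Sketch: load the viewer's accepted-follow relationships once per
# request, so each permission check is an in-memory set lookup
# instead of its own query.
def visible_user_ids(rows, viewer_id)
  rows.select { |r| r[:follower_id] == viewer_id && r[:accepted] }
      .map { |r| r[:followee_id] }
      .to_set
end

# Stand-in relationship rows (hypothetical shape)
relationships = [
  { follower_id: 10, followee_id: 1, accepted: true  },
  { follower_id: 10, followee_id: 2, accepted: false },
]

visible = visible_user_ids(relationships, 10)
puts visible.include?(1)  # viewer 10 is an accepted follower of user 1
puts visible.include?(2)  # follow request for user 2 not accepted yet
```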