Still, do we want Wedge to be used only on small forums? Do we want to have people complain that we didn't see the bigger picture?
Let's just put that into context. sm.org has been around almost 10 years, and in that time hasn't even pushed an id of 200. As in it's exhausted 0.3% of capacity.
The most insane SMF board I have seen in the wild had 700 boards, the most insane I've ever made had 2000 boards. And believe me when I say that performance is screwed so very badly once you get into those realms.
Of course it's easy enough for an admin to change their field sizes... But I just wanted to do it now. Otherwise, we could just as well limit the id_member field to a smallint, because, well, most forums have a dozen members anyway...!
There's realistic scales for things. It is unlikely a site will generate that many boards. But it is entirely possible for a site to get over 65k members, especially given spam etc. - I have even seen sites that have almost exhausted the 4bn id for posts. But I have never seen a site even remotely approaching the board limit.
It's all about what is practical, and where it is likely to grow. I accept that boards are in the future more likely to grow than before, due to the intentions of using boards for albums and so on. But even at this point in time I cannot realistically expect sites to go over that. Yes, there are going to be sites that do, but I don't want to penalise everyone for the sake of the minority.
People are going to want to run Wedge on shitty hosting. It's a fact of life, and that no matter how shitty their hosting is, they're going to want to run Wedge on it. And it's going to mean people are going to run into issues. It's hard enough with the number of people who have trouble with SMF on shitty hosting - and we're going to have more issues, not less, by making it bigger.
I'm not sure a subselect would be any slower than a very long query, which Wedge always has to parse anyway...
The PHP side of the performance aspect is near enough irrelevant. I'm talking about the SQL execution of that query. IN() clauses are inefficient.
Then, if we start from this, there's also no way we can have privacy settings for topics, posts, etc...
No, it's a question of scale.
IN() is essentially a shortcut for OR clauses. column IN (1,2,3) is functionally equivalent to (column = 1 OR column = 2 OR column = 3) (brackets for the purposes of precedence etc.)
On a few values, like privacy, it's fine. But when you're talking about hundreds or thousands of values, it's going to suck however you do it. And not because of the parsing in the DB layer to get it into the query, it's going to suck once the thousands of rows are figured out in the subselect.
Not that complicated. Noisen does have that... Joins instead of subselects for privacy settings.
It just requires to rewrite all queries, which is annoying, but it wouldn't be a first.
Topic privacy is actually fine because of the limited number of them, because you're not throwing potentially thousands of values into an IN() clause.
It does have other consequences, too, not just rewriting all the queries which use query_see_board and all its friends, but on top of that you also have to consider the ball-ache it's going to create for modders on top. Yay.
So, last time we discussed join vs subselect, I think you came to the conclusion that performance benefits were not obvious..?
For the small numbers of values previously involved, it's pretty much a push. But when you're getting into the theoretically thousands of rows that you're expecting to deal with, it's much, much clearer.
Yup.