Nao

  • Dadman with a boy
  • Posts: 16,079
Re: Getting ready for an alpha release...
« Reply #15, on August 26th, 2012, 12:59 AM »
Quote from Arantor on August 26th, 2012, 12:23 AM
Quote
Couldn't we at least update the most crucial ones, like the id_group stuff..?
I really would leave it alone at this point if you're still planning on doing some kind of release this weekend.
What about we just set id_board to a mediumint, and same for id_group..? We can increase their size again later... No? Because I really don't see myself leaving id_group be like that... Increasing its size will be an incentive for early release of contact lists.
Quote
For everyone else I figure that alpha releases won't really be upgraded, and it's not like we can't just diff the install file to figure out what's changed - and convert that into something else that's easily usable (even here)
Hmm, yeah, yeah...
There are already lots of changes btw... I'm sure the old packman no longer works, here. I know that it crashed my local install because of a few missing fields.
Quote
Quote
That's why I want to be sure that it works as is, right now...
It's committed, just needs more testing really.
Nope, the PHP code isn't committed, only the YouTube fixes which I never committed in the first place. To be specific -- my Aeva-Sites.php file had the iframe code in it, and my Subs-Aeva-Sites.php didn't. Meaning that I'd updated the sitelist file, then tested, then was happy, then reverted manually before doing a commit at some point (because of the lack of test), and forgot to re-introduce the iframe code... You won't believe the number of times that kind of crap happened to me!
Quote
I didn't initially understand the consequences but I think the result is better for themers to do so, so go for it :)
I just need to figure out what to put into the URL...
i.e. I don't see a need to have a 'mobile' keyword in the URL, *but* if the current device is a mobile device, *AND* one of the files it needs has a mobile keyword in it, then I need to add that keyword to the URL too, and thus create an extra css file.
The goal being to avoid having the more 'generic' keywords when not needed... But then a new problem occurs. If I use $variable {mobile} = "This is a mobile device" in the CSS code, I'll suddenly find myself with the need to record somewhere that it's a mobile device...

See what I mean..?
I don't. Guess it's time for bed...

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Getting ready for an alpha release...
« Reply #16, on August 26th, 2012, 01:17 AM »
Quote
What about we just set id_board to a mediumint, and same for id_group..? We can increase their size again later... No? Because I really don't see myself leaving id_group be like that... Increasing its size will be an incentive for early release of contact lists.
Making id_board into anything bigger than smallint actually has much greater consequences than merely messing with the queries. query_see_board is still one of the key bottlenecks in more than one place. Making it potentially 1000x worse is not what I had in mind... though I have no idea how to make it better at this stage.
Quote
Hmm, yeah, yeah...
There are already lots of changes btw... I'm sure the old packman no longer works, here. I know that it crashed my local install because of a few missing fields.
I don't recall changing anything that should affect the old packman >_>

* Arantor is all out of ideas for the other stuff, going through the social services paperwork is thoroughly draining :(
When we unite against a common enemy that attacks our ethos, it nurtures group solidarity. Trolls are sensational, yes, but we keep everyone honest. | Game Memorial

Nao

  • Dadman with a boy
  • Posts: 16,079
Re: Getting ready for an alpha release...
« Reply #17, on August 26th, 2012, 09:05 AM »
Quote from Arantor on August 26th, 2012, 01:17 AM
Making id_board into anything bigger than smallint actually has much greater consequences than merely messing with the queries.
Leaving id_board as a smallint means limiting ourselves to a forum that doesn't have successful boards, successful blogs and successful galleries... (and especially, not all three of them!)
Quote
query_see_board is still one of the key bottlenecks in more than one place. Making it potentially 1000x worse is not what I had in mind... though I have no idea how to make it better at this stage.
Well... Again and again: what is wrong with using a subselect..?
Quote
I don't recall changing anything that should affect the old packman >_>
Just search for the ManagePlugins history (and install.sql), at some point you added three new fields (username, password and another) to the package_servers table. Don't ask me why...

Have a drink on me.

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Getting ready for an alpha release...
« Reply #18, on August 26th, 2012, 02:52 PM »
Quote
Leaving id_board as a smallint means limiting ourselves to a forum that doesn't have successful boards, successful blogs and successful galleries... (and especially, not all three of them!)
How many forums do you know that have 65535 boards? Even if we expand that, how many forums do you know that have 65535 boards, blogs and galleries all at once?

If they have that much stuff, they're likely going to run into other problems too.

Here's the fundamental question: do you make life easier for the select few truly huge sites at the cost of making it slower for every other user? You cannot have it both ways.
Quote
Well... Again and again: what is wrong with using a subselect..?
Because that wouldn't solve the problem and would probably be even worse.

Right now the clause evaluates to 1=1 for admins and WHERE b.id_board IN (...) AND b.id_board NOT IN (...) (the last part is not all the time, but it can be, put it that way)

IN() clauses are not efficient, because internally they are evaluated to WHERE value = 1 OR value = 2 OR value = 3 etc.

Now, wrapping it in a subselect won't solve anything at all because all you end up doing is making it WHERE b.id_board IN (SELECT ...) - which ultimately becomes the same thing, except that you're not even making use of any cache this time.

As far as I can figure out, the only way to make it more efficient is to rewrite all the queries not to use a subselect and to force the board stuff to be conventionally joined, and conventionally excluded (where you have deny), which makes every query that uses query_see_board and its friends a whole lot more complicated.
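
For illustration only, here is a minimal sketch of the two shapes being compared: precomputed IN()/NOT IN() lists versus a conventional join plus a conventional exclusion for deny. It uses Python with SQLite standing in for MySQL, and the board_access table, its deny flag, and the group ids are all hypothetical, not the real schema. It only shows that both forms pick the same boards; it says nothing about which one is faster on a real install.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE boards (id_board INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical access table: one allow (deny = 0) or deny (deny = 1) rule per row.
    CREATE TABLE board_access (id_board INTEGER, id_group INTEGER, deny INTEGER);
    INSERT INTO boards VALUES (1, 'General'), (2, 'Staff'), (3, 'Blog'), (4, 'Album');
    INSERT INTO board_access VALUES (1, 5, 0), (2, 5, 0), (3, 5, 0), (2, 7, 1), (4, 9, 0);
""")

def visible_via_lists(allowed, denied):
    """List style: board ids are precomputed and pasted into IN() / NOT IN()."""
    allow_marks = ",".join("?" * len(allowed))
    deny_marks = ",".join("?" * len(denied))
    sql = (
        f"SELECT b.id_board FROM boards AS b "
        f"WHERE b.id_board IN ({allow_marks}) AND b.id_board NOT IN ({deny_marks}) "
        f"ORDER BY b.id_board"
    )
    return [row[0] for row in conn.execute(sql, [*allowed, *denied])]

def visible_via_joins(groups):
    """Join style: allowed boards are joined in, denied boards are excluded."""
    group_marks = ",".join("?" * len(groups))
    sql = f"""
        SELECT DISTINCT b.id_board
        FROM boards AS b
        INNER JOIN board_access AS a
            ON a.id_board = b.id_board AND a.deny = 0 AND a.id_group IN ({group_marks})
        LEFT JOIN board_access AS d
            ON d.id_board = b.id_board AND d.deny = 1 AND d.id_group IN ({group_marks})
        WHERE d.id_board IS NULL
        ORDER BY b.id_board
    """
    # The group list is bound twice: once for the allow join, once for the deny join.
    return [row[0] for row in conn.execute(sql, [*groups, *groups])]

# A member in groups 5 and 7: group 5 allows boards 1-3, group 7 denies board 2.
list_result = visible_via_lists(allowed=[1, 2, 3], denied=[2])
join_result = visible_via_joins(groups=[5, 7])
print(list_result, join_result)
```

Note that the join version still has an IN() on the group ids, but that list is the member's handful of groups rather than potentially thousands of board ids.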
Quote
Just search for the ManagePlugins history (and install.sql), at some point you added three new fields (username, password and another) to the package_servers table. Don't ask me why...
The theory was, as I discussed many months ago, that I wanted to be able to have private repositories that it could call for updates on. But I ran into privacy issues.

Nao

  • Dadman with a boy
  • Posts: 16,079
Re: Getting ready for an alpha release...
« Reply #19, on August 26th, 2012, 03:05 PM »
Quote from Arantor on August 26th, 2012, 02:52 PM
How many forums do you know that have 65535 boards? Even if we expand that, how many forums do you know that have 65535 boards, blogs and galleries all at once?
Not a single one -- but add to that anything that might be created at some point, and then deleted (remember we're on auto_increment here... Although we could do an Optimize after deleting anything. Might help a bit in case of stuff you create and then immediately delete.)
Still, do we want Wedge to be used only on small forums? Do we want to have people complain that we didn't see the bigger picture?
Of course it's easy enough for an admin to change their field sizes... But I just wanted to do it now. Otherwise, we could just as well limit the id_member field to a smallint, because, well, most forums have a dozen members anyway...!
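
To illustrate the create-then-delete point, here is a small sketch in Python. It assumes SQLite's AUTOINCREMENT as a stand-in for MySQL's auto_increment (the two are not identical, and whether an Optimize hands ids back is not tested here). The helper names are made up for the example.

```python
import sqlite3

SMALLINT_UNSIGNED_MAX = 65535  # upper bound of a 2-byte unsigned id

conn = sqlite3.connect(":memory:")
# AUTOINCREMENT makes SQLite never reuse an id, even after the row is deleted.
conn.execute("CREATE TABLE boards (id_board INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")

def create_board(name):
    """Hypothetical helper: insert a board and return its new id."""
    return conn.execute("INSERT INTO boards (name) VALUES (?)", (name,)).lastrowid

def delete_board(id_board):
    """Hypothetical helper: delete a board by id."""
    conn.execute("DELETE FROM boards WHERE id_board = ?", (id_board,))

# Someone creates and immediately deletes 100 albums...
for i in range(100):
    delete_board(create_board(f"Album {i}"))

# ...then creates one they actually keep.
next_id = create_board("Keeper")
live_rows = conn.execute("SELECT COUNT(*) FROM boards").fetchone()[0]
ids_left = SMALLINT_UNSIGNED_MAX - next_id
print(next_id, live_rows, ids_left)
```

The table holds a single row, yet 101 ids out of the 65535 are already spent.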
Quote
Because that wouldn't solve the problem and would probably be even worse.
I'm not sure a subselect would be any slower than a very long query, which Wedge always has to parse anyway...
Quote
Now, wrapping it in a subselect won't solve anything at all because all you end up doing is making it WHERE b.id_board IN (SELECT ...) - which ultimately becomes the same thing, except that you're not even making use of any cache this time.
Parse time. I don't know...

Then, if we start from this, there's also no way we can have privacy settings for topics, posts, etc...
Quote
As far as I can figure out, the only way to make it more efficient is to rewrite all the queries not to use a subselect and to force the board stuff to be conventionally joined, and conventionally excluded (where you have deny), which makes every query that uses query_see_board and its friends a whole lot more complicated.
Not that complicated. Noisen does have that... Joins instead of subselects for privacy settings.
It just requires rewriting all the queries, which is annoying, but it wouldn't be a first.

So, last time we discussed join vs subselect, I think you came to the conclusion that performance benefits were not obvious..?
Quote
The theory was, as I discussed many months ago, that I wanted to be able to have private repositories that it could call for updates on. But I ran into privacy issues.
Privacy is so cool.

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Getting ready for an alpha release...
« Reply #20, on August 26th, 2012, 04:31 PM »
Quote
Still, do we want Wedge to be used only on small forums? Do we want to have people complain that we didn't see the bigger picture?
Let's just put that into context. sm.org has been around almost 10 years, and in that time hasn't even pushed an id of 200. As in it's exhausted 0.3% of capacity.

The most insane SMF board I have seen in the wild had 700 boards, the most insane I've ever made had 2000 boards. And believe me when I say that performance is screwed so very badly once you get into those realms.
Quote
Of course it's easy enough for an admin to change their field sizes... But I just wanted to do it now. Otherwise, we could just as well limit the id_member field to a smallint, because, well, most forums have a dozen members anyway...!
There are realistic scales for things. It is unlikely a site will generate that many boards. But it is entirely possible for a site to get over 65k members, especially given spam etc. - I have even seen sites that have almost exhausted the 4bn id for posts. But I have never seen a site even remotely approaching the board limit.

It's all about what is practical, and where it is likely to grow. I accept that boards are in the future more likely to grow than before, due to the intentions of using boards for albums and so on. But even at this point in time I cannot realistically expect sites to go over that. Yes, there are going to be sites that do, but I don't want to penalise everyone for the sake of the minority.

People are going to want to run Wedge on shitty hosting. It's a fact of life: no matter how shitty their hosting is, they're going to want to run Wedge on it, and that means people are going to run into issues. It's hard enough with the number of people who have trouble with SMF on shitty hosting - and we're going to have more issues, not fewer, by making it bigger.
Quote
I'm not sure a subselect would be any slower than a very long query, which Wedge always has to parse anyway...
The PHP side of the performance aspect is near enough irrelevant. I'm talking about the SQL execution of that query. IN() clauses are inefficient.
Quote
Then, if we start from this, there's also no way we can have privacy settings for topics, posts, etc...
No, it's a question of scale.

IN() is essentially a shortcut for OR clauses. column IN (1,2,3) is functionally equivalent to (column = 1 OR column = 2 OR column = 3) (brackets for the purposes of precedence etc.)

On a few values, like privacy, it's fine. But when you're talking about hundreds or thousands of values, it's going to suck however you do it. And not because of the parsing in the DB layer to get it into the query, it's going to suck once the thousands of rows are figured out in the subselect.
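
As a quick check of the "functionally equivalent" part, here is a minimal Python sketch using SQLite in place of MySQL, with a made-up ten-row boards table. It only confirms that the two WHERE forms return the same rows; it does not show how any particular engine executes them internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE boards (id_board INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO boards VALUES (?, ?)",
    [(i, f"Board {i}") for i in range(1, 11)],
)

# Shorthand form.
with_in = conn.execute(
    "SELECT id_board FROM boards WHERE id_board IN (1, 2, 3) ORDER BY id_board"
).fetchall()

# Spelled-out form, bracketed so the ORs group together.
with_or = conn.execute(
    "SELECT id_board FROM boards "
    "WHERE (id_board = 1 OR id_board = 2 OR id_board = 3) ORDER BY id_board"
).fetchall()

print(with_in == with_or, with_in)
```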
Quote
Not that complicated. Noisen does have that... Joins instead of subselects for privacy settings.
It just requires rewriting all the queries, which is annoying, but it wouldn't be a first.
Topic privacy is actually fine because of the limited number of them, because you're not throwing potentially thousands of values into an IN() clause.

It does have other consequences, too, not just rewriting all the queries which use query_see_board and all its friends, but on top of that you also have to consider the ball-ache it's going to create for modders on top. Yay.[1]
Quote
So, last time we discussed join vs subselect, I think you came to the conclusion that performance benefits were not obvious..?
For the small numbers of values previously involved, it's pretty much a push. But when you're getting into the theoretically thousands of rows that you're expecting to deal with, it's much, much clearer.
Quote
Privacy is so cool.
Yup.
 1. And no, this isn't just a case of 'The only winning move is not to play', not really. There are no good solutions. The current one is tolerable, leaning on it harder is going to cause a lot of trouble. I need to go away and think about what's really best here.

b4pjoe

  • @Powerbob - I can edit my thoughts both here and on my alpha install.
  • Posts: 54
Re: Getting ready for an alpha release...
« Reply #21, on August 26th, 2012, 09:11 PM »
I know it had been mentioned of maybe an Aug. 25th release date for the Alpha version of Wedge. Did it happen and I am just missing it or is it going to be later?

markham

  • Finally finished the Slideshow... phew!
  • Posts: 138
Re: Getting ready for an alpha release...
« Reply #22, on August 27th, 2012, 10:05 AM »
Quote from Arantor on August 23rd, 2012, 03:24 PM
Though it would be nice if we can get Aeva to use the no-cookies version of the YT embed code, and ideally using the iframe method rather than a Flash object
I agree completely about using the no-cookies version. As to iFrame versus Flash, I've noticed recently that more and more videos refuse to work if embedded in an iFrame, so maybe both mechanisms should be available?
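
For reference, a rough sketch of what the no-cookies iframe embed looks like, written in Python. This is not Aeva's sitelist code; the function name, default size and attributes are assumptions for the example. The only fixed part is the youtube-nocookie.com host with the usual /embed/ path.

```python
from html import escape

def youtube_iframe(video_id, nocookie=True, width=560, height=315):
    """Build an iframe embed; nocookie=True uses the youtube-nocookie.com host."""
    host = "www.youtube-nocookie.com" if nocookie else "www.youtube.com"
    # Escape the id so a stray quote cannot break out of the src attribute.
    src = f"https://{host}/embed/{escape(video_id, quote=True)}"
    return (
        f'<iframe width="{width}" height="{height}" src="{src}" '
        f'frameborder="0" allowfullscreen></iframe>'
    )

print(youtube_iframe("abc123"))
```

Switching hosts is a one-word change, so rewriting embeds by hand after posting should not be needed if the embed code does it up front.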

Nao

  • Dadman with a boy
  • Posts: 16,079
Re: Getting ready for an alpha release...
« Reply #23, on August 28th, 2012, 04:24 PM »
I don't see the point of using youtube-nocookie, really..? Aren't you being a bit paranoid? What's in that cookie anyway?

It's either iframe or flash, not both. Although I did keep the flash version in a comment and explained how to reset it... And you could just as well include a new sitelist entry with the flash version, so that you could choose from the admin area, but that would imply you can't have both enabled at the same time obviously, and that would be confusing to do by default.
Re: Getting ready for an alpha release...
« Reply #24, on August 28th, 2012, 04:31 PM »
Quote from b4pjoe on August 26th, 2012, 09:11 PM
I know it had been mentioned of maybe an Aug. 25th release date for the Alpha version of Wedge. Did it happen and I am just missing it or is it going to be later?
You didn't miss anything...
I said it was my target date. i.e. the date after which I decided to 'freeze' the code and only focus on making an alpha release viable. (Actually, I froze the codebase a couple of weeks ago already.)
You have no idea the amount of work it requires just to have something usable for everyone...

And frankly -- I've been busy IRL. Saw my little sister for the first time in over 2 years, met my niece for the first time, was ill at some point, went to the movies (I really, really needed to see TDKR that badly, sorry for taking some time off from time to time!), and having difficulties in other areas that I won't expand on.

So, yeah, Wedge has been in the works for over 2 years now, and I'd REALLY like for a version to be out for testing.
My current schedule is:
- finish the stuff I'm still working on (it'll probably take about a week, given that I've done about 2 thirds of the work so far, in the last couple of weeks),
- package the release, publish it somewhere on the private boards,
- wait for at least 5-6 people to start testing it, wait about a week, fix any bugs that have been reported,
- and release the alpha in public. So... That should hopefully happen in September. With a personal goal to release the stable version in late 2012 (but again, it's a personal goal -- not an official schedule. Our official schedule is "When it's done", as we said about a hundred times already.)

With a huge bold red warning saying that you shan't use it in production -- only as a test. I wouldn't even recommend creating plugins or skins with it -- I don't think Pete is finished with the plugin code, and I'm reserving the right to break the skin code because it has a couple of inconsistencies and might be a bit convoluted for proper use in some areas.
Re: Getting ready for an alpha release...
« Reply #25, on August 28th, 2012, 04:46 PM »
Quote from Arantor on August 26th, 2012, 04:31 PM
Let's just put that into context. sm.org has been around almost 10 years, and in that time hasn't even pushed an id of 200. As in it's exhausted 0.3% of capacity.
But it doesn't count the number of media albums, etc.
Noisen is a pretty 'small' website (a regular community of about a dozen users), and currently has 120 albums and 61 boards -- considering there are only a small handful of people who actually create (and use) blogs and forums over there. I didn't exactly make it 'obvious' that you can create blogs with a big bold button for that, and it's in French only.
So I can estimate that an English-speaking community of decent size could easily surpass the 64K limit in a few months' time. You just need a few people to keep creating and then deleting their own albums or blogs because 'they don't know how to manipulate this stuff' :P
Quote
The most insane SMF board I have seen in the wild had 700 boards, the most insane I've ever made had 2000 boards. And believe me when I say that performance is screwed so very badly once you get into those realms.
Well, I see...

Still, if you have lots of boards and albums, the first thing you want to do is buy more computing power... More RAM, more disk space, etc. You can't have everything for free. If your site is slow, then maybe your hosting isn't suitable for you. See what I mean...?

So, I'm keeping a low figure and suggesting to switch both 'id_board' and 'id_group' from smallint(5) unsigned to mediumint(8) unsigned. (Well, signed for id_group.)
Or is it still too much...?
We can obviously prevent people from creating more than, say, 100K boards, it's just that I don't want to be physically limited at 32K or 64K to begin with...
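
The 32K / 64K figures, and what mediumint buys, can be worked out from the storage sizes (MySQL stores smallint in 2 bytes, mediumint in 3, int in 4). A small Python sketch, with helper names made up for the example:

```python
def int_range(bits, signed):
    """Return (min, max) for an integer column of the given width."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

# Storage widths in bits for the MySQL integer types discussed here.
MYSQL_INT_BITS = {"smallint": 16, "mediumint": 24, "int": 32}

ranges = {
    (name, signed): int_range(bits, signed)
    for name, bits in MYSQL_INT_BITS.items()
    for signed in (False, True)
}

for (name, signed), (low, high) in ranges.items():
    label = "signed" if signed else "unsigned"
    print(f"{name:9} {label:8} {low:>14,} .. {high:>13,}")
```

So an unsigned mediumint tops out at 16,777,215 and a signed one at 8,388,607, both well above a 100K cap.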
Quote
It's all about what is practical, and where it is likely to grow. I accept that boards are in the future more likely to grow than before, due to the intentions of using boards for albums and so on. But even at this point in time I cannot realistically expect sites to go over that. Yes, there are going to be sites that do, but I don't want to penalise everyone for the sake of the minority.
I see.
(But members > 64K is a minority, too...)
Quote
Quote
I'm not sure a subselect would be any slower than a very long query, which Wedge always has to parse anyway...
The PHP side of the performance aspect is near enough irrelevant. I'm talking about the SQL execution of that query. IN() clauses are inefficient.
I was talking about the MySQL side. I mean, MySQL also has to parse the query before executing it... We're providing a string, not binary code ;)
Quote
Topic privacy is actually fine because of the limited number of them, because you're not throwing potentially thousands of values into an IN() clause.
Privacy settings will eventually be on their own tables anyway...? (I'm thinking of committing at least the table structures -- privacy_thoughts, privacy_boards and privacy_topics.)
This is the only way that we can realistically allow for multiple privacy settings on a single element, e.g. "allow my friends and my co-workers on my blog, but not my family." Well, other than having a comma-separated field of groups, of course... (?!)
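
A rough sketch of the one-row-per-setting idea, in Python with SQLite standing in for MySQL. Only the privacy_boards table name comes from the above; its columns, the list_members table, and all the ids are hypothetical and just there to make the example run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE boards (id_board INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical columns: one row per contact list allowed to see a board.
    CREATE TABLE privacy_boards (id_board INTEGER, id_list INTEGER);
    -- Hypothetical membership table: which contact lists a member belongs to.
    CREATE TABLE list_members (id_list INTEGER, id_member INTEGER);
    INSERT INTO boards VALUES (1, 'My blog');
    -- List 100 = friends, 200 = co-workers; family (300) gets no row.
    INSERT INTO privacy_boards VALUES (1, 100), (1, 200);
    INSERT INTO list_members VALUES (100, 7), (200, 8), (300, 9);
""")

def can_see(id_member, id_board):
    """True if the member is in at least one list that the board allows."""
    row = conn.execute(
        """
        SELECT 1
        FROM privacy_boards AS p
        INNER JOIN list_members AS l ON l.id_list = p.id_list
        WHERE p.id_board = ? AND l.id_member = ?
        LIMIT 1
        """,
        (id_board, id_member),
    ).fetchone()
    return row is not None

print(can_see(7, 1), can_see(8, 1), can_see(9, 1))
```

Adding a third allowed list is one more row in privacy_boards, with no string parsing of a comma-separated field.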
Quote
It does have other consequences, too, not just rewriting all the queries which use query_see_board and all its friends, but on top of that you also have to consider the ball-ache it's going to create for modders on top. Yay.
Err... Modders WILL have to rewrite their code entirely to fit Wedge anyway... Might as well ask them to re-learn everything at the same time, eh.
Quote
For the small numbers of values previously involved, it's pretty much a push. But when you're getting into the theoretically thousands of rows that you're expecting to deal with, it's much, much clearer.
To use a join, you mean?
And faster?

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Getting ready for an alpha release...
« Reply #26, on August 28th, 2012, 05:07 PM »
Quote
I don't see the point of using youtube-nocookie, really..? Aren't you being a bit paranoid? What's in that cookie anyway?
This is the internet where your privacy is being eroded every single day. I don't know about you but I don't like my privacy being eroded daily. I don't like the fact that if I watch a YouTube video, Google is tracking that fact and is able to track what I'm watching.
Quote
It's either iframe or flash, not both. Although I did keep the flash version in a comment and explained how to reset it... And you could just as well include a new sitelist entry with the flash version, so that you could choose from the admin area, but that would imply you can't have both enabled at the same time obviously, and that would be confusing to do by default.
Other than one instance where even the Flash version didn't work for a short time, I've not seen any problems with the iframe version. (As in: I once used the iframe version, it didn't work, but when I switched it for the Flash version it still didn't work)
Quote
sorry for taking some time off from time to time!
You have absolutely nothing to apologise for.
Quote
So I can estimate that an English-speaking community of decent size could easily surpass the 64K limit in a few months' time. You just need a few people to keep creating and then deleting their own albums or blogs because 'they don't know how to manipulate this stuff'
I'm not entirely convinced at this stage.
Quote
Still, if you have lots of boards and albums, the first thing you want to do is buy more computing power... More RAM, more disk space, etc. You can't have everything for free. If your site is slow, then maybe your hosting isn't suitable for you. See what I mean...?
I was doing experiments on my old PC - dual core Athlon x64 with 8GB RAM, running Windows. It was hardly well-optimised but if you can imagine I was seeing half-second load times under 2k boards...
Quote
So, I'm keeping a low figure and suggesting to switch both 'id_board' and 'id_group' from smallint(5) unsigned to mediumint(8) unsigned. (Well, signed for id_group.)
Or is it still too much...?
I can certainly get behind using mediumint for these, far more than I can with making them ints, especially given how many places use these things...
Quote
I was talking about the MySQL side. I mean, MySQL also has to parse the query before executing it... We're providing a string, not binary code
The parsing stage is pretty much irrelevant either way, actually.

Under the first stage you're dealing with:

SELECT fields
FROM table
INNER JOIN table
WHERE field IN (list of values)

vs

SELECT fields
FROM table
INNER JOIN table
WHERE field IN (SELECT value FROM table WHERE field = something)


From a pure parsing stage it almost doesn't make any difference, but it's because parsing is very quick anyway. The real grunt is how that query is actually executed, not how it is parsed.
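
To put concrete tables behind the two shapes, here is a minimal Python sketch with SQLite standing in for MySQL. The messages and board_access tables, their columns and the group id are hypothetical. It only shows that the literal list and the subselect select the same rows; the execution cost being discussed would have to be measured on a real MySQL install with realistic row counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE boards (id_board INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE messages (id_msg INTEGER PRIMARY KEY, id_board INTEGER, subject TEXT);
    -- Hypothetical table listing which group may see which board.
    CREATE TABLE board_access (id_board INTEGER, id_group INTEGER);
    INSERT INTO boards VALUES (1, 'General'), (2, 'Staff'), (3, 'Blog');
    INSERT INTO messages VALUES (10, 1, 'Hello'), (11, 2, 'Secret'), (12, 3, 'Post');
    INSERT INTO board_access VALUES (1, 5), (3, 5), (2, 1);
""")

# First shape: the list of board ids is worked out beforehand and inlined.
list_form = """
    SELECT m.id_msg
    FROM messages AS m
    INNER JOIN boards AS b ON b.id_board = m.id_board
    WHERE b.id_board IN (1, 3)
    ORDER BY m.id_msg
"""

# Second shape: the same list is produced by a subselect at query time.
subselect_form = """
    SELECT m.id_msg
    FROM messages AS m
    INNER JOIN boards AS b ON b.id_board = m.id_board
    WHERE b.id_board IN (SELECT id_board FROM board_access WHERE id_group = 5)
    ORDER BY m.id_msg
"""

rows_list = [row[0] for row in conn.execute(list_form)]
rows_sub = [row[0] for row in conn.execute(subselect_form)]
print(rows_list, rows_sub)
```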
Quote
Privacy settings will eventually be on their own tables anyway...? (I'm thinking of committing at least the table structures -- privacy_thoughts, privacy_boards and privacy_topics.)
This is the only way that we can realistically allow for multiple privacy settings on a single element, e.g. "allow my friends and my co-workers on my blog, but not my family." Well, other than having a comma-separated field of groups, of course... (?!)
Actually that's still likely to be fairly cheap.
Quote
Err... Modders WILL have to rewrite their code entirely to fit Wedge anyway... Might as well ask them to re-learn everything at the same time, eh.
That isn't what I was thinking about.

Telling modders that if they want to adhere to board privileges, they just have to use {query_enter_board} or whichever one it is, is nice and easy. Telling them that to do it with an extra join and whatnot is a lot more complicated to explain.
Quote
To use a join, you mean?
And faster?

Yes, a join will typically be faster when you're getting into thousands of rows.

b4pjoe

  • @Powerbob - I can edit my thoughts both here and on my alpha install.
  • Posts: 54
Re: Getting ready for an alpha release...
« Reply #27, on August 28th, 2012, 05:44 PM »
Quote from Nao on August 28th, 2012, 04:31 PM
Quote from b4pjoe on August 26th, 2012, 09:11 PM
I know it had been mentioned of maybe an Aug. 25th release date for the Alpha version of Wedge. Did it happen and I am just missing it or is it going to be later?
You didn't miss anything...
I said it was my target date. i.e. the date after which I decided to 'freeze' the code and only focus on making an alpha release viable. (Actually, I froze the codebase a couple of weeks ago already.)
You have no idea the amount of work it requires just to have something usable for everyone...

And frankly -- I've been busy IRL. Saw my little sister for the first time in over 2 years, met my niece for the first time, was ill at some point, went to the movies (I really, really needed to see TDKR that badly, sorry for taking some time off from time to time!), and having difficulties in other areas that I won't expand on.

So, yeah, Wedge has been in the works for over 2 years now, and I'd REALLY like for a version to be out for testing.
My current schedule is:
- finish the stuff I'm still working on (it'll probably take about a week, given that I've done about 2 thirds of the work so far, in the last couple of weeks),
- package the release, publish it somewhere on the private boards,
- wait for at least 5-6 people to start testing it, wait about a week, fix any bugs that have been reported,
- and release the alpha in public. So... That should hopefully happen in September. With a personal goal to release the stable version in late 2012 (but again, it's a personal goal -- not an official schedule. Our official schedule is "When it's done", as we said about a hundred times already.)

With a huge bold red warning saying that you shan't use it in production -- only as a test. I wouldn't even recommend creating plugins or skins with it -- I don't think Pete is finished with the plugin code, and I'm reserving the right to break the skin code because it has a couple of inconsistencies and might be a bit convoluted for proper use in some areas.
Thanks for the info and my apologies if I sounded like I was complaining. I had no intention of that. I understand real life should always come first.

godboko71

  • Fence accomplished!
  • Hello
  • Posts: 361
Re: Getting ready for an alpha release...
« Reply #28, on August 28th, 2012, 05:46 PM »
Nao, thanks for having a life, it makes some of us feel better. b4pjoe, tone can be hard to decipher. I don't think anyone took offence, though I can't be sure or in any way speak for anyone :)
Thank you,
Boko

markham

  • Finally finished the Slideshow... phew!
  • Posts: 138
Re: Getting ready for an alpha release...
« Reply #29, on August 28th, 2012, 07:39 PM »
Quote from Arantor on August 28th, 2012, 05:07 PM
Quote
I don't see the point of using youtube-nocookie, really..? Aren't you being a bit paranoid? What's in that cookie anyway?
This is the internet where your privacy is being eroded every single day. I don't know about you but I don't like my privacy being eroded daily. I don't like the fact that if I watch a YouTube video, Google is tracking that fact and is able to track what I'm watching.
If that's being paranoid, then you can put my name down for that club. The point is, we really don't know what exactly is in those cookies. Whenever anyone embeds a YouTube video, I do edit the post to change the URL to the non-cookie version. I'd rather not have to do that.