Wedge

Public area => The Pub => Topic started by: Nao on August 18th, 2012, 12:24 PM

Title: Getting ready for an alpha release...
Post by: Nao on August 18th, 2012, 12:24 PM
So... I'm looking into making this August 25 date a reality, right...? :edit: Obviously at a later date, now...

In order to achieve such a close release date, I needed to decide that some of the features I want for 1.0 will have to go through an upgrade script of my own. I hate the idea, but I'd rather postpone 20% of my features (and a stable release), than have Wedge remain vaporware forever... Plus, the interest we might generate could help get more developers onboard.

So, there are places in the code where "I'm not too sure" about what I've been doing, and here's what I'll do: I'll simply post patches and ask for feedback... This is better than withholding commits until I'm sure of what I'm actually doing.

Could you guys review the first patch? It's a simple one really -- it's something we discussed before, the default lengths for various database columns. I tend to forget what we talked about, especially when it comes to MySQL, so I don't know what is "right" and what isn't. And because we're going to have an upgrade script eventually, it also means we can safely change these field lengths later in the process, so I'm not even sure this patch needs applying at all.

Yet, there might be a few fields that warrant shortening or lengthening. You tell me. Thanks.

:edit: Removed attachment to make it possible to have the topic in public...
Title: Re: Getting ready for an alpha release...
Post by: live627 on August 18th, 2012, 07:19 PM
You really see a need for 4 million groups and 16 million boards? :o
Title: Re: Getting ready for an alpha release...
Post by: Arantor on August 18th, 2012, 08:40 PM
OK, let's recap something.

smallint ranges from -32768 to 32767 (0 to 65535 when unsigned)
mediumint has a range of +/- 8 million (16m when unsigned)
int has a range of +/- 2 billion (4 billion when unsigned)
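
A minimal sketch of where these ranges come from, assuming the usual MySQL storage sizes (2, 3 and 4 bytes for smallint, mediumint and int). The helper names `int_range` and `fits` are made up for illustration. It also checks the guest group id of -1 discussed in this thread against signed and unsigned columns.

```python
# Value ranges of MySQL integer types, derived from their storage size in bytes.
SIZES = {"tinyint": 1, "smallint": 2, "mediumint": 3, "int": 4, "bigint": 8}

def int_range(type_name, unsigned=False):
    """Return the (lowest, highest) value a column of this type can hold."""
    bits = SIZES[type_name] * 8
    if unsigned:
        return 0, 2**bits - 1
    return -(2**(bits - 1)), 2**(bits - 1) - 1

def fits(value, type_name, unsigned=False):
    """True if the value can be stored without overflowing the column."""
    low, high = int_range(type_name, unsigned)
    return low <= value <= high

for name in ("smallint", "mediumint", "int"):
    print(name, "signed:", int_range(name), "unsigned:", int_range(name, unsigned=True))

# The guest group is id -1, so it only fits in a signed column.
GUEST_GROUP = -1
print("guest fits signed smallint:", fits(GUEST_GROUP, "smallint"))
print("guest fits unsigned smallint:", fits(GUEST_GROUP, "smallint", unsigned=True))
```

Going unsigned doubles the positive range but makes -1 unstorable, which is why the signedness has to be decided per table rather than per id.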

You cannot assign that many groups to users, because if you have that many, the additional_groups will overflow IIRC as it only has so much space.

And, also, I'm still not comfortable expanding ids just for the sake of it - it WILL have a performance penalty on indexes on larger forums.

I can see the use for millions of groups - user created contact lists. (Though the group column MUST remain signed in most places e.g. with permissions, where permissions are required to apply to guests which are group -1. I can't consider the need for literally billions of groups, though even with a signed mediumint, that's still a column running into the millions, which I can imagine could be the case.)
Title: Re: Getting ready for an alpha release...
Post by: Arantor on August 19th, 2012, 02:13 AM
Oh, that reminds me, I need to remove the ban system as it stands so people don't get any funny ideas about using it because it'll be completely fucked up.
Title: Re: Getting ready for an alpha release...
Post by: Pandos on August 20th, 2012, 09:59 AM

Just took a short look at your install.sql and I have a few remarks:

boards table:
In the boards table there is no field named id_draft, yet you define PRIMARY KEY (id_draft).
Perhaps change the PRIMARY KEY to id_board?

What about the messages table?
Doesn't it need a PRIMARY KEY on id_msg?
Same for the topics table. No PRIMARY KEY?

board_members, group_moderators, log_boards, log_mark_read, moderators and subscriptions_groups:
ENGINE=MyISAM?

This is my first look. Will take a deeper look later.

THX for all your work!!!

Sven
Title: Re: Getting ready for an alpha release...
Post by: Arantor on August 20th, 2012, 02:37 PM
It's a patch, not the full install.sql file. You're only seeing the differences between what's in the SVN trunk and what Nao's suggesting, which means you're seeing a tiny tiny tiny fraction of it.
Title: Re: Getting ready for an alpha release...
Post by: Nao on August 23rd, 2012, 02:36 PM
@John> Are you implying that Wedge will never be installed on a successful forum? :P
Quote from Arantor on August 18th, 2012, 08:40 PM
smallint ranges from -32768 to 32767 (0 to 65535 when unsigned)
mediumint has a range of +/- 8 million (16m when unsigned)
int has a range of +/- 2 billion (4 billion when unsigned)
Additionally there are some fuck-ups...
For instance, let's take the group ID.
By design, SMF considers guests as group '-1'. Meaning that every time the guest group has no reason to be stored in the database (simple example: the members table, they can't belong to the guest group...), we can safely use int(10) unsigned, but when it needs to be taken into account (example: permission tables), we have to use int(10) signed. I'm saying int(10) and not int(11) because if the negative number is never going to be < -10, there's no reason to account for the minus sign, but it doesn't matter as it's only for representation purposes in the command line as you said.
So, I don't even know if it's worth storing unsigned at all, because there are going to be situations where id_group > maxint/2 and thus will be stored correctly in some places (members), and badly in others (perms).
Quote
You cannot assign that many groups to users, because if you have that many, the additional_groups will overflow IIRC as it only has so much space.
I thought we'd discussed that it'd be best to get rid of additional_groups and instead use a subselect..?
An alternative would simply be to limit the numbers of groups you can create, number of people you can put in them, and number of groups you can belong to. i.e. have to find some technical way of allowing you to join any group, but disallowing people from adding you to their groups after a while.
Of course, in a realistic environment, it's unlikely that more than 0.0000001% of all Wedge installs will ever need to have more than a few hundred spaces 'available' for storing groups and so on.
Quote
And, also, I'm still not comfortable expanding ids just for the sake of it - it WILL have a performance penalty on indexes on larger forums.
And these forums are precisely the ones that need the expanding...
And if they're big, they can afford the extra hardware to handle the extra weight :P

(And you can retort that they can also afford to hire us to augment their database field sizes and optimize everything even more. :lol:)

Also, it's worth noting that Dragooon's original database schema for SMG was... A little bit skewed towards integer fields. Many fields have an int(11) definition even for storing small numbers -- e.g. the width/height fields in media_files are integers when they could just as well be smallint(4), i.e. < 65536... I don't see any realistic scenario where an online uploaded image would be wider/taller than that.
Quote
I can see the use for millions of groups - user created contact lists. (Though the group column MUST remain signed in most places e.g. with permissions, where permissions are required to apply to guests which are group -1. I can't consider the need for literally billions of groups, though even with a signed mediumint, that's still a column running into the millions, which I can imagine could be the case.)
My point exactly. :)
Title: Re: Getting ready for an alpha release...
Post by: Nao on August 23rd, 2012, 02:51 PM
- I'm not sure whether mass database changes are ideal right now. But I also don't see myself making an upgrade script for all of that... Maybe we can write a script that would automatically parse install.sql, and compare it to your current one (maybe storing the latest used install.sql somewhere in the database in compressed form..?). It would just be a matter of doing a virtual diff of both files, finding the areas where the changes are done, and trying to do them in the database automatically.
But you wanna know...? I don't feel like I'm ever going to write that kind of script. This is where I'm getting crazy.

- I've added a 'fun' little experiment on the Stats page (go see it)... Since there was an empty area, I added a 'Top liked authors' box which has the benefit of showing more names than just Pete and I, ah ah! Well, it's funny because in the end, after all this time with likes enabled etc, it looks like we both have the same amount of posts (approx.), as well as the same number of likes! But I'm wondering whether it's fine to show this box. It's more of a philosophical matter. Do we want to encourage competition for people to get more likes..?

- could someone do tests on wedge.org's test board, posting various embeddable material and seeing if everything's going all right? (i.e. no double-encoding of links, no broken stuff, works everywhere, works when posting in quick reply, quick modify, reply and modify...) Because wedge.org has been running a custom Aeva-Embed.php and Subs-BBC.php for quite some time now (including the removal and addition of an AWFUL lot of code), and I think it's going to be time to commit it...

- shall I count theme users per skin or per theme in the Skin Selector page? It only involves replacing "id_theme" with "id_theme, id_skin" in the SQL query for that -- but it might not help when it comes to performance...

- shall the board icons be set to 'New' status if there are new topics in the corresponding board, or new topics in the corresponding board *and* its children...? Currently it's the latter, but I think the former would be best, if only because sub-boards have their own 'New' label...

- just for the record: I'm currently reworking (quite heavily) the layouts for the Search page and Media page (homepage mostly). And I'm totally not happy, and I totally need to commit before we go alpha... :-/

Aaaaand, that's all for now.
And I haven't even had a look into my to-do-list for months, really... What fun to be had!
Title: Re: Getting ready for an alpha release...
Post by: Arantor on August 23rd, 2012, 03:24 PM
I'd leave database changes for now, ultimately. You're right that it's very complicated.

As far as upgrading goes, I really wouldn't bother worrying about it for the alpha. Anyone that uses an alpha should be prepared for anything and everything, including manual DB changes, and shouldn't be using it on a production site anyway.

I'd suggest just committing the changes to Aeva and Subs-BBC. Most things work as expected at this point in time. (Though it would be nice if we can get Aeva to use the no-cookies version of the YT embed code, and ideally using the iframe method rather than a Flash object)
Quote
Do we want to encourage competition for people to get more likes..?
That is going to be dependent on the community. There was a thread on AAF discussing the fact that I had a lot less posts than Shawn and yet more likes. If the community notices enough to care, they'll figure out whether they want to make it competitive.
Quote
- shall I count theme users per skin or per theme in the Skin Selector page? It only involves replacing "id_theme" with "id_theme, id_skin" in the SQL query for that -- but it might not help when it comes to performance...
Change it, that's cool. We can worry about performance later.
Quote
- shall the board icons be set to 'New' status if there are new topics in the corresponding board, or new topics in the corresponding board *and* its children...? Currently it's the latter, but I think the former would be best, if only because sub-boards have their own 'New' label...
That makes sense. It would solve a lot of troubles.
Quote
- just for the record: I'm currently reworking (quite heavily) the layouts for the Search page and Media page (homepage mostly). And I'm totally not happy, and I totally need to commit before we go alpha...
That's the beauty of being an alpha, it doesn't have to be ready in any fashion.
Title: Re: Getting ready for an alpha release...
Post by: Nao on August 23rd, 2012, 07:45 PM
Quote from Arantor on August 23rd, 2012, 03:24 PM
I'd leave database changes for now, ultimately. You're right that it's very complicated.
Couldn't we at least update the most crucial ones, like the id_group stuff..?
Quote
As far as upgrading goes, I really wouldn't bother worrying about it for the alpha. Anyone that uses an alpha should be prepared for anything and everything, including manual DB changes, and shouldn't be using it on a production site anyway.
Well... We're using it on wedge.org, and will definitely have to update our tables one way or another... :^^;:

What could be done, sorta, is getting the new install.sql file, and doing a pseudo installation, i.e. running it with a setting of db_prefix = db_prefix + 'temp_install_whatever', and then compare the tables... (Or whatever.)
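
A rough sketch of the "diff two install.sql files" idea, assuming a much simplified schema format: one column per line, plain table names without a prefix placeholder, and index lines ignored. The names `parse_schema` and `diff_schemas` and the sample tables are hypothetical, and a real install.sql would need a sturdier parser than this regular expression.

```python
import re

# Matches "CREATE TABLE name ( ... ) ENGINE=xxx;" with the column list as group 2.
CREATE_TABLE = re.compile(
    r"CREATE TABLE\s+(\w+)\s*\((.*?)\)\s*(?:ENGINE=\w+)?\s*;", re.S | re.I
)

def parse_schema(sql):
    """Return {table: {column: definition}} for simple CREATE TABLE statements."""
    tables = {}
    for name, body in CREATE_TABLE.findall(sql):
        columns = {}
        for line in body.splitlines():
            line = line.strip().rstrip(",")
            # Skip blank lines and index definitions; only columns are compared.
            if not line or line.upper().startswith(("PRIMARY KEY", "UNIQUE", "KEY")):
                continue
            column, _, definition = line.partition(" ")
            columns[column] = definition.strip()
        tables[name] = columns
    return tables

def diff_schemas(old_sql, new_sql):
    """Build ALTER TABLE statements that turn the old schema into the new one."""
    old, new = parse_schema(old_sql), parse_schema(new_sql)
    statements = []
    for table, new_cols in new.items():
        old_cols = old.get(table)
        if old_cols is None:
            statements.append(f"-- new table {table}: run its CREATE TABLE as-is")
            continue
        for column, definition in new_cols.items():
            if column not in old_cols:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {column} {definition};")
            elif old_cols[column] != definition:
                statements.append(f"ALTER TABLE {table} MODIFY COLUMN {column} {definition};")
        for column in old_cols:
            if column not in new_cols:
                statements.append(f"ALTER TABLE {table} DROP COLUMN {column};")
    return statements

OLD = """
CREATE TABLE boards (
  id_board smallint(5) unsigned NOT NULL auto_increment,
  name varchar(255) NOT NULL default '',
  PRIMARY KEY (id_board)
) ENGINE=MyISAM;
"""
NEW = """
CREATE TABLE boards (
  id_board mediumint(8) unsigned NOT NULL auto_increment,
  name varchar(255) NOT NULL default '',
  PRIMARY KEY (id_board)
) ENGINE=MyISAM;
"""

for statement in diff_schemas(OLD, NEW):
    print(statement)
```

Comparing the parsed files this way avoids the temporary installation, but it only sees what the parser understands. Comparing real tables created under a temporary prefix would also catch index and engine changes.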
Quote
I'd suggest just committing the changes to Aeva and Subs-BBC.
There are a lot, still ;)
That's why I want to be sure that it works as is, right now...
If anything, I know that footnotes are still buggy in some respect. I mean, when you accumulate multiple quotes with multiple footnotes in them.. It starts to fail. Kind of thing I haven't worked on lately. Was too busy on CSS crap...
Quote
Most things work as expected at this point in time. (Though it would be nice if we can get Aeva to use the no-cookies version of the YT embed code, and ideally using the iframe method rather than a Flash object)
Hmmm... That's odd. I was pretty sure YT as an iframe was already implemented?! I remember working it into AeMe months ago... But it's nowhere to be seen in my current code (?).
Can you confirm this?
It's really easy to implement the new YT code, it's pretty much the same as in Google Maps (set type to html and use an iframe tag), only simpler :P
Quote
That is going to be dependent on the community. There was a thread on AAF discussing the fact that I had a lot less posts than Shawn and yet more likes. If the community notices enough to care, they'll figure out whether they want to make it competitive.
I've committed it... Feel free, and I mean it, feel free to revert it anytime!
Quote
Change it, that's cool. We can worry about performance later.
(Done.)
Quote
Quote
- shall the board icons be set to 'New' status if there are new topics in the corresponding board, or new topics in the corresponding board *and* its children...? Currently it's the latter, but I think the former would be best, if only because sub-boards have their own 'New' label...
That makes sense. It would solve a lot of troubles.
Good then :)
Quote
That's the beauty of being an alpha, it doesn't have to be ready in any fashion.
Our first users will hate us for it... :lol:
Title: Re: Getting ready for an alpha release...
Post by: Nao on August 24th, 2012, 12:38 AM
Just had a silly idea, but well...
Adding a 'mobile' variable to $context['browser'], similar to 'webkit', but being based upon $user_info['is_mobile'] instead. This would allow skinners to create index.mobile.css files for mobile devices.
However, all devices go through the mobile path, and thus are usually redirected to the Wireless skin, so it might be a bit overkill to additionally offer this keyword...
Anyway...
Title: Re: Getting ready for an alpha release...
Post by: Nao on August 25th, 2012, 12:25 AM
(Bumps! And this isn't a "Pete-only" series of questions, really, feel free to join in :P)
Title: Re: Getting ready for an alpha release...
Post by: godboko71 on August 25th, 2012, 06:33 PM
Well, the keyword might encourage skinners to make wireless versions of their theme, which is always welcome. Not that they can't do it with Wireless.
Title: Re: Getting ready for an alpha release...
Post by: Nao on August 25th, 2012, 11:59 PM
(A small bump for Pete, still! :lol:)

The issue with using an index.mobile.css, as opposed to <options><mobile>1</mobile></options> in skin.xml, is that mobile users get the extra CSS whatever their choice is. One of the features I like in Wedge is that I keep track of both desktop and mobile use, and serve a different skin, but you can also choose to ignore that and use the desktop skin on your smartphone, or the mobile skin in your desktop browser. If you add an index.mobile.css file, it will always (always!) execute on mobile phones/tablets, and never on desktop.

Which might very well be as intended, of course! So I'm not sure it's actually a drawback, or instead another proof that Wedge is freaking flexible. 8-)
Title: Re: Getting ready for an alpha release...
Post by: Arantor on August 26th, 2012, 12:23 AM
Quote
Couldn't we at least update the most crucial ones, like the id_group stuff..?
I really would leave it alone at this point if you're still planning on doing some kind of release this weekend.
Quote
What could be done, sorta, is getting the new install.sql file, and doing a pseudo installation, i.e. running it with a setting of db_prefix = db_prefix + 'temp_install_whatever', and then compare the tables... (Or whatever.)
For everyone else I figure that alpha releases won't really be upgraded, and it's not like we can't just diff the install file to figure out what's changed - and convert that into something else that's easily usable (even here)
Quote
That's why I want to be sure that it works as is, right now...
It's committed, just needs more testing really. I haven't noticed any funny problems but I haven't been trying to embed. (Unfortunately I'm just in the process of figuring out how to pay for my grandmother's healthcare, it's complicated)
Quote
This would allow skinners to create index.mobile.css files for mobile devices.
However, all devices go through the mobile path, and thus are usually redirected to the Wireless skin, so it might be a bit overkill to additionally offer this keyword...
I didn't initially understand the consequences but I think the result is better for themers to do so, so go for it :)
Title: Re: Getting ready for an alpha release...
Post by: Nao on August 26th, 2012, 12:59 AM
Quote from Arantor on August 26th, 2012, 12:23 AM
Quote
Couldn't we at least update the most crucial ones, like the id_group stuff..?
I really would leave it alone at this point if you're still planning on doing some kind of release this weekend.
What about we just set id_board to a mediumint, and same for id_group..? We can increase their size again later... No? Because I really don't see myself leaving id_group be like that... Increasing its size will be an incentive for early release of contact lists.
Quote
For everyone else I figure that alpha releases won't really be upgraded, and it's not like we can't just diff the install file to figure out what's changed - and convert that into something else that's easily usable (even here)
Hmm, yeah, yeah...
There are already lots of changes btw... I'm sure the old packman no longer works, here. I know that it crashed my local install because of a few missing fields.
Quote
Quote
That's why I want to be sure that it works as is, right now...
It's committed, just needs more testing really.
Nope, the PHP code isn't committed, only the YouTube fixes which I never committed in the first place. To be specific -- my Aeva-Sites.php file had the iframe code in it, and my Subs-Aeva-Sites.php didn't. Meaning that I'd updated the sitelist file, then tested, then was happy, then reverted manually before doing a commit at some point (because of the lack of test), and forgot to re-introduce the iframe code... You won't believe the number of times that kind of crap happened to me!
Quote
I didn't initially understand the consequences but I think the result is better for themers to do so, so go for it :)
I just need to figure out what to put into the URL...
i.e. I don't see a need to have a 'mobile' keyword in the URL, *but* if the current device is a mobile device, *AND* one of the files it needs has a mobile keyword in it, then I need to add that keyword to the URL too, and thus create an extra css file.
The goal being to avoid having the more 'generic' keywords when not needed... But then a new problem occurs. If I use $variable {mobile} = "This is a mobile device" in the CSS code, I'll suddenly find myself with the need to record somewhere that it's a mobile device...

See what I mean..?
I don't. Guess it's time for bed...
Title: Re: Getting ready for an alpha release...
Post by: Arantor on August 26th, 2012, 01:17 AM
Quote
What about we just set id_board to a mediumint, and same for id_group..? We can increase their size again later... No? Because I really don't see myself leaving id_group be like that... Increasing its size will be an incentive for early release of contact lists.
Making id_board into anything bigger than smallint actually has much greater consequences than merely messing with the queries. query_see_board is still one of the key bottlenecks in more than one place. Making it potentially 1000x worse is not what I had in mind... though I have no idea how to make it better at this stage.
Quote
Hmm, yeah, yeah...
There are already lots of changes btw... I'm sure the old packman no longer works, here. I know that it crashed my local install because of a few missing fields.
I don't recall changing anything that should affect the old packman >_>

/me is all out of ideas for the other stuff, going through the social services paperwork is thoroughly draining :(
Title: Re: Getting ready for an alpha release...
Post by: Nao on August 26th, 2012, 09:05 AM
Quote from Arantor on August 26th, 2012, 01:17 AM
Making id_board into anything bigger than smallint actually has much greater consequences than merely messing with the queries.
Leaving id_board as a smallint means limiting ourselves to a forum that doesn't have successful boards, successful blogs and successful galleries... (and especially, not all three of them!)
Quote
query_see_board is still one of the key bottlenecks in more than one place. Making it potentially 1000x worse is not what I had in mind... though I have no idea how to make it better at this stage.
Well... Again and again: what is wrong with using a subselect..?
Quote
I don't recall changing anything that should affect the old packman >_>
Just search for the ManagePlugins history (and install.sql), at some point you added three new fields (username, password and another) to the package_servers table. Don't ask me why...

Have a drink on me.
Title: Re: Getting ready for an alpha release...
Post by: Arantor on August 26th, 2012, 02:52 PM
Quote
Leaving id_board as a smallint means limiting ourselves to a forum that doesn't have successful boards, successful blogs and successful galleries... (and especially, not all three of them!)
How many forums do you know that have 65535 boards? Even if we expand that, how many forums do you know that have 65535 boards, blogs and galleries all at once?

If they have that much stuff, they're likely going to run into other problems too.

Here's the fundamental question: do you make life easier for the select few truly huge sites at the cost of making it slower for every other user? You cannot have it both ways.
Quote
Well... Again and again: what is wrong with using a subselect..?
Because that wouldn't solve the problem and would probably be even worse.

Right now the clause evaluates to 1=1 for admins and WHERE b.id_board IN (...) AND b.id_board NOT IN (...) (the last part is not all the time, but it can be, put it that way)

IN() clauses are not efficient, because internally they are evaluated to WHERE value = 1 OR value = 2 OR value = 3 etc.

Now, wrapping it in a subselect won't solve anything at all because all you end up doing is making it WHERE b.id_board IN (SELECT ...) - which ultimately becomes the same thing, except that you're not even making use of any cache this time.

As far as I can figure out, the only way to make it more efficient is to rewrite all the queries not to use a subselect and to force the board stuff to be conventionally joined, and conventionally excluded (where you have deny), which makes all queries that use query_see_board and its friends a whole lot more complicated.
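
To make the three query shapes concrete, here is a small sketch using SQLite as a stand-in for MySQL. The `board_access` table, its columns and the group ids 5 and 9 are invented for the example and are not the real Wedge schema. It only shows that a literal IN() list, a subselect and a join with a deny exclusion return the same topics. It says nothing about which is faster on MySQL, which would have to be measured.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE board_access (id_board INTEGER, id_group INTEGER, deny INTEGER);
CREATE TABLE topics (id_topic INTEGER PRIMARY KEY, id_board INTEGER);
-- Groups 5 and 9 may see board 1, group 5 may see board 3,
-- group 9 may see board 2 but group 5 is denied on it.
INSERT INTO board_access VALUES (1, 5, 0), (1, 9, 0), (3, 5, 0), (2, 9, 0), (2, 5, 1);
INSERT INTO topics VALUES (10, 1), (11, 2), (12, 3), (13, 1);
""")

# 1. Current style: board ids are resolved in PHP and pasted into the query.
IN_LIST = """
SELECT t.id_topic FROM topics AS t
WHERE t.id_board IN (1, 2, 3) AND t.id_board NOT IN (2)
ORDER BY t.id_topic
"""

# 2. Subselect style: the database resolves the board ids itself.
SUBSELECT = """
SELECT t.id_topic FROM topics AS t
WHERE t.id_board IN (
        SELECT id_board FROM board_access WHERE deny = 0 AND id_group IN (5, 9))
  AND t.id_board NOT IN (
        SELECT id_board FROM board_access WHERE deny = 1 AND id_group IN (5, 9))
ORDER BY t.id_topic
"""

# 3. Join style: allowed boards are joined in, denied boards are excluded
#    with a LEFT JOIN ... IS NULL. DISTINCT is needed because a board that
#    is allowed through two groups would otherwise appear twice.
JOINED = """
SELECT DISTINCT t.id_topic FROM topics AS t
JOIN board_access AS a
  ON a.id_board = t.id_board AND a.deny = 0 AND a.id_group IN (5, 9)
LEFT JOIN board_access AS d
  ON d.id_board = t.id_board AND d.deny = 1 AND d.id_group IN (5, 9)
WHERE d.id_board IS NULL
ORDER BY t.id_topic
"""

def rows(sql):
    return [row[0] for row in db.execute(sql)]

for label, sql in (("in list", IN_LIST), ("subselect", SUBSELECT), ("join", JOINED)):
    print(label, rows(sql))
```

The join version shows the extra complexity mentioned above: every query needs two additional joins plus a DISTINCT (or GROUP BY) to stay correct.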
Quote
Just search for the ManagePlugins history (and install.sql), at some point you added three new fields (username, password and another) to the package_servers table. Don't ask me why...
The theory was, as I discussed many months ago, that I wanted to be able to have private repositories that it could call for updates on. But I ran into privacy issues.
Title: Re: Getting ready for an alpha release...
Post by: Nao on August 26th, 2012, 03:05 PM
Quote from Arantor on August 26th, 2012, 02:52 PM
How many forums do you know that have 65535 boards? Even if we expand that, how many forums do you know that have 65535 boards, blogs and galleries all at once?
Not a single one -- but add to that anything that might be created at some point, and then deleted (remember we're on auto_increment here... Although we could do an Optimize after deleting anything. Might help a bit in case of stuff you create and then immediately delete.)
Still, do we want Wedge to be used only on small forums? Do we want to have people complain that we didn't see the bigger picture?
Of course it's easy enough for an admin to change their field sizes... But I just wanted to do it now. Otherwise, we could just as well limit the id_member field to a smallint, because, well, most forums have a dozen members anyway...!
Quote
Because that wouldn't solve the problem and would probably be even worse.
I'm not sure a subselect would be any slower than a very long query, which Wedge always has to parse anyway...
Quote
Now, wrapping it in a subselect won't solve anything at all because all you end up doing is making it WHERE b.id_board IN (SELECT ...) - which ultimately becomes the same thing, except that you're not even making use of any cache this time.
Parse time. I don't know...

Then, if we start from this, there's also no way we can have privacy settings for topics, posts, etc...
Quote
As far as I can figure out, the only way to make it more efficient is to rewrite all the queries not to use a subselect and to force the board stuff to be conventionally joined, and conventionally excluded (where you have deny), which makes all queries that use query_see_board and its friends a whole lot more complicated.
Not that complicated. Noisen does have that... Joins instead of subselects for privacy settings.
It just requires rewriting all queries, which is annoying, but it wouldn't be a first.

So, last time we discussed join vs subselect, I think you came to the conclusion that performance benefits were not obvious..?
Quote
The theory was, as I discussed many months ago, that I wanted to be able to have private repositories that it could call for updates on. But I ran into privacy issues.
Privacy is so cool.
Title: Re: Getting ready for an alpha release...
Post by: Arantor on August 26th, 2012, 04:31 PM
Quote
Still, do we want Wedge to be used only on small forums? Do we want to have people complain that we didn't see the bigger picture?
Let's just put that into context. sm.org has been around almost 10 years, and in that time hasn't even pushed an id of 200. As in it's exhausted 0.3% of capacity.

The most insane SMF board I have seen in the wild had 700 boards, the most insane I've ever made had 2000 boards. And believe me when I say that performance is screwed so very badly once you get into those realms.
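
For scale, a quick back-of-the-envelope check of those figures against an unsigned smallint. The sketch treats the board counts above as if they were the highest id in use, which is an assumption: a forum that creates and deletes boards will have ids running ahead of its board count. `capacity_used` is a made-up helper name.

```python
SMALLINT_UNSIGNED_MAX = 65535    # highest id_board an unsigned smallint can hold
MEDIUMINT_UNSIGNED_MAX = 16777215

def capacity_used(highest_id, max_id=SMALLINT_UNSIGNED_MAX):
    """Percentage of the id space consumed once highest_id has been handed out."""
    return highest_id / max_id * 100

for label, highest_id in (("id of 200", 200), ("700 boards", 700), ("2000 boards", 2000)):
    print(f"{label}: {capacity_used(highest_id):.1f}% of a smallint")

# How much more room a mediumint would give compared to a smallint.
print(f"mediumint / smallint: {MEDIUMINT_UNSIGNED_MAX / SMALLINT_UNSIGNED_MAX:.0f}x")
```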
Quote
Of course it's easy enough for an admin to change their field sizes... But I just wanted to do it now. Otherwise, we could just as well limit the id_member field to a smallint, because, well, most forums have a dozen members anyway...!
There are realistic scales for things. It is unlikely a site will generate that many boards. But it is entirely possible for a site to get over 65k members, especially given spam etc. - I have even seen sites that have almost exhausted the 4bn id for posts. But I have never seen a site even remotely approaching the board limit.

It's all about what is practical, and where it is likely to grow. I accept that boards are in the future more likely to grow than before, due to the intentions of using boards for albums and so on. But even at this point in time I cannot realistically expect sites to go over that. Yes, there are going to be sites that do, but I don't want to penalise everyone for the sake of the minority.

People are going to want to run Wedge on shitty hosting. It's a fact of life: no matter how shitty their hosting is, they're going to want to run Wedge on it. And it's going to mean people are going to run into issues. It's hard enough with the number of people who have trouble with SMF on shitty hosting - and we're going to have more issues, not less, by making it bigger.
Quote
I'm not sure a subselect would be any slower than a very long query, which Wedge always has to parse anyway...
The PHP side of the performance aspect is near enough irrelevant. I'm talking about the SQL execution of that query. IN() clauses are inefficient.
Quote
Then, if we start from this, there's also no way we can have privacy settings for topics, posts, etc...
No, it's a question of scale.

IN() is essentially a shortcut for OR clauses. column IN (1,2,3) is functionally equivalent to (column = 1 OR column = 2 OR column = 3) (brackets for the purposes of precedence etc.)
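
A small sketch of that equivalence, again using SQLite purely as a convenient stand-in. The helpers `in_clause` and `or_clause` are hypothetical. It only demonstrates that the two forms select the same rows. How a given MySQL version actually executes a long IN() list, and how that scales to thousands of values, is something to benchmark rather than assume.

```python
import sqlite3

def in_clause(column, values):
    """Build 'column IN (?, ?, ...)' with one placeholder per value."""
    placeholders = ", ".join("?" for _ in values)
    return f"{column} IN ({placeholders})"

def or_clause(column, values):
    """Build the expanded '(column = ? OR column = ? ...)' form."""
    return "(" + " OR ".join(f"{column} = ?" for _ in values) + ")"

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE boards (id_board INTEGER PRIMARY KEY)")
db.executemany("INSERT INTO boards VALUES (?)", [(n,) for n in range(1, 11)])

wanted = [1, 2, 3]

def select(where):
    """Run the same SELECT with the given WHERE clause and bound values."""
    sql = f"SELECT id_board FROM boards WHERE {where} ORDER BY id_board"
    return [row[0] for row in db.execute(sql, wanted)]

print(in_clause("id_board", wanted), "->", select(in_clause("id_board", wanted)))
print(or_clause("id_board", wanted), "->", select(or_clause("id_board", wanted)))
```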

On a few values, like privacy, it's fine. But when you're talking about hundreds or thousands of values, it's going to suck however you do it. And not because of the parsing in the DB layer to get it into the query, it's going to suck once the thousands of rows are figured out in the subselect.
Quote
Not that complicated. Noisen does have that... Joins instead of subselects for privacy settings.
It just requires rewriting all queries, which is annoying, but it wouldn't be a first.
Topic privacy is actually fine because of the limited number of them, because you're not throwing potentially thousands of values into an IN() clause.

It does have other consequences, too: not just rewriting all the queries which use query_see_board and all its friends, but you also have to consider the ball-ache it's going to create for modders on top. Yay.[1]
Quote
So, last time we discussed join vs subselect, I think you came to the conclusion that performance benefits were not obvious..?
For the small numbers of values previously involved, it's pretty much a push. But when you're getting into the theoretically thousands of rows that you're expecting to deal with, it's much, much clearer.
Quote
Privacy is so cool.
Yup.
 1. And no, this isn't just a case of 'The only winning move is not to play', not really. There are no good solutions. The current one is tolerable, leaning on it harder is going to cause a lot of trouble. I need to go away and think about what's really best here.
Title: Re: Getting ready for an alpha release...
Post by: b4pjoe on August 26th, 2012, 09:11 PM
I know it had been mentioned of maybe an Aug. 25th release date for the Alpha version of Wedge. Did it happen and I am just missing it or is it going to be later?
Title: Re: Getting ready for an alpha release...
Post by: markham on August 27th, 2012, 10:05 AM
Quote from Arantor on August 23rd, 2012, 03:24 PM
Though it would be nice if we can get Aeva to use the no-cookies version of the YT embed code, and ideally using the iframe method rather than a Flash object
I agree completely about using the no-cookies version. As to iFrame versus Flash, I've noticed recently that more and more videos refuse to work if embedded in an iFrame, so maybe both mechanisms should be available?
Title: Re: Getting ready for an alpha release...
Post by: Nao on August 28th, 2012, 04:24 PM
I don't see the point of using youtube-nocookie, really..? Aren't you being a bit paranoid? What's in that cookie anyway?

It's either iframe or flash, not both. Although I did keep the flash version in a comment and explained how to reset it... And you could just as well include a new sitelist entry with the flash version, so that you could choose from the admin area, but that would imply you can't have both enabled at the same time obviously, and that would be confusing to do by default.
Title: Re: Getting ready for an alpha release...
Post by: Nao on August 28th, 2012, 04:31 PM
Quote from b4pjoe on August 26th, 2012, 09:11 PM
I know it had been mentioned of maybe an Aug. 25th release date for the Alpha version of Wedge. Did it happen and I am just missing it or is it going to be later?
You didn't miss anything...
I said it was my target date. i.e. the date after which I decided to 'freeze' the code and only focus on making an alpha release viable. (Actually, I froze the codebase a couple of weeks ago already.)
You have no idea the amount of work it requires just to have something usable for everyone...

And frankly -- I've been busy IRL. Saw my little sister for the first time in over 2 years, met my niece for the first time, was ill at some point, went to the movies (I really, really needed to see TDKR that badly, sorry for taking some time off from time to time!), and having difficulties in other areas that I won't expand on.

So, yeah, Wedge has been in the works for over 2 years now, and I'd REALLY like for a version to be out for testing.
My current schedule is:
- finish the stuff I'm still working on (it'll probably take about a week, given that I've done about 2 thirds of the work so far, in the last couple of weeks),
- package the release, publish it somewhere on the private boards,
- wait for at least 5-6 people to start testing it, wait about a week, fix any bugs that have been reported,
- and release the alpha in public. So... That should hopefully happen in September. With a personal goal to release the stable version in late 2012 (but again, it's a personal goal -- not an official schedule. Our official schedule is "When it's done", as we said about a hundred times already.)

With a huge bold red warning saying that you shan't use it in production -- only as a test. I wouldn't even recommend to create plugins or skins with it -- I don't think Pete is finished with the plugin code, and I'm reserving the right to break the skin code because it has a couple of inconsistencies and might be a bit convoluted for proper use in some areas.
Title: Re: Getting ready for an alpha release...
Post by: Nao on August 28th, 2012, 04:46 PM
Quote from Arantor on August 26th, 2012, 04:31 PM
Let's just put that into context. sm.org has been around almost 10 years, and in that time hasn't even pushed an id of 200. As in it's exhausted 0.3% of capacity.
But it doesn't count the number of media albums, etc.
Noisen is a pretty 'small' website (a regular community of about a dozen users), and currently has 120 albums and 61 boards -- considering there are only a small handful of people who actually create (and use) blogs and forums over there. I didn't exactly make it 'obvious' that you can create blogs with a big bold button for that, and it's in French only.
So I can estimate that an English-speaking community of decent size could easily surpass the 64K limit in a few months time. You just need a few people to keep creating and then deleting their own albums or blogs because 'they don't know how to manipulate this stuff' :P
Quote
The most insane SMF board I have seen in the wild had 700 boards, the most insane I've ever made had 2000 boards. And believe me when I say that performance is screwed so very badly once you get into those realms.
Well, I see...

Still, if you have lots of boards and albums, the first thing you want to do is buy more computing power... More RAM, more disk space, etc. You can't have everything for free. If your site is slow, then maybe your hosting isn't suitable for you. See what I mean...?

So, I'm keeping a low figure and suggesting to switch both 'id_board' and 'id_group' from smallint(5) unsigned to mediumint(8) unsigned. (Well, signed for id_group.)
Or is it still too much...?
We can obviously prevent people from creating more than, say, 100K boards, it's just that I don't want to be physically limited at 32K or 64K to begin with...
Quote
It's all about what is practical, and where it is likely to grow. I accept that boards are in the future more likely to grow than before, due to the intentions of using boards for albums and so on. But even at this point in time I cannot realistically expect sites to go over that. Yes, there are going to be sites that do, but I don't want to penalise everyone for the sake of the minority.
I see.
(But members > 64K is a minority, too...)
Quote
Quote
I'm not sure a subselect would be any slower than a very long query, which Wedge always has to parse anyway...
The PHP side of the performance aspect is near enough irrelevant. I'm talking about the SQL execution of that query. IN() clauses are inefficient.
I was talking about the MySQL side. I mean, MySQL also has to parse the query before executing it... We're providing a string, not binary code ;)
Quote
Topic privacy is actually fine because of the limited number of them, because you're not throwing potentially thousands of values into an IN() clause.
Privacy settings will eventually be on their own tables anyway...? (I'm thinking of committing at least the table structures -- privacy_thoughts, privacy_boards and privacy_topics.)
This is the only way that we can realistically allow for multiple privacy settings on a single element, e.g. "allow my friends and my co-workers on my blog, but not my family." Well, other than having a comma-separated field of groups, of course... (?!)
Quote
It does have other consequences, too, not just rewriting all the queries which use query_see_board and all its friends, but on top of that you also have to consider the ball-ache it's going to create for modders on top. Yay.
Err... Modders WILL have to rewrite their code entirely to fit Wedge anyway... Might as well ask them to re-learn everything at the same time, eh.
Quote
For the small numbers of values previously involved, it's pretty much a push. But when you're getting into the theoretically thousands of rows that you're expecting to deal with, it's much, much clearer.
To use a join, you mean?
And faster?
Title: Re: Getting ready for an alpha release...
Post by: Arantor on August 28th, 2012, 05:07 PM
Quote
I don't see the point of using youtube-nocookie, really..? Aren't you being a bit paranoid? What's in that cookie anyway?
This is the internet where your privacy is being eroded every single day. I don't know about you but I don't like my privacy being eroded daily. I don't like the fact that if I watch a YouTube video, Google is tracking that fact and is able to track what I'm watching.
Quote
It's either iframe or flash, not both. Although I did keep the flash version in a comment and explained how to reset it... And you could just as well include a new sitelist entry with the flash version, so that you could choose from the admin area, but that would imply you can't have both enabled at the same time obviously, and that would be confusing to do by default.
Other than one instance where even the Flash version didn't work for a short time, I've not seen any problems with the iframe version. (As in: I once used the iframe version, it didn't work, but when I switched it for the Flash version it still didn't work)
Quote
sorry for taking some time off from time to time!
You have absolutely nothing to apologise for.
Quote
So I can estimate that an English-speaking community of decent size could easily surpass the 64K limit in a few months time. You just need a few people to keep creating and then deleting their own albums or blogs because 'they don't know how to manipulate this stuff'
I'm not entirely convinced at this stage.
Quote
Still, if you have lots of boards and albums, the first thing you want to do is buy more computing power... More RAM, more disk space, etc. You can't have everything for free. If your site is slow, then maybe your hosting isn't suitable for you. See what I mean...?
I was doing experiments on my old PC - dual core Athlon x64 with 8GB RAM, running Windows. It was hardly well-optimised but if you can imagine I was seeing half-second load times under 2k boards...
Quote
So, I'm keeping a low figure and suggesting to switch both 'id_board' and 'id_group' from smallint(5) unsigned to mediumint(8) unsigned. (Well, signed for id_group.)
Or is it still too much...?
I can certainly get behind using mediumint for these, far more than I can with making them ints, especially given how many places use these things...
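For reference, the limits being argued about follow directly from the storage width of each MySQL integer type. A minimal sketch in Python (the helper name int_range is made up for illustration; the bit widths are the standard ones for smallint, mediumint and int):

```python
# Bit widths of the MySQL integer types under discussion.
MYSQL_INT_BITS = {"smallint": 16, "mediumint": 24, "int": 32}


def int_range(bits, signed):
    """Return (min, max) for an integer column of the given width."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


for name, bits in MYSQL_INT_BITS.items():
    print(name, "signed:", int_range(bits, True), "unsigned:", int_range(bits, False))
```

So moving id_board from smallint unsigned to mediumint unsigned lifts the ceiling from 65535 to 16777215, at the cost of one extra byte per stored value.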
Quote
I was talking about the MySQL side. I mean, MySQL also has to parse the query before executing it... We're providing a string, not binary code
The parsing stage is pretty much irrelevant either way, actually.

Under the first stage you're dealing with:

SELECT fields
FROM table
INNER JOIN table
WHERE field IN (list of values)

vs

SELECT fields
FROM table
INNER JOIN table
WHERE field IN (SELECT value FROM table WHERE field = something)


From a pure parsing stage it almost doesn't make any difference, but it's because parsing is very quick anyway. The real grunt is how that query is actually executed, not how it is parsed.
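To make the two shapes concrete, here is a rough sketch using Python's sqlite3 module as a stand-in for MySQL. The board_access table and the sample rows are invented for illustration, and SQLite says nothing about how MySQL would actually execute either query; the only point is that both forms select the same rows, so the difference is purely in execution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE boards (id_board INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE topics (id_topic INTEGER PRIMARY KEY, id_board INTEGER, approved INTEGER);
-- Hypothetical table listing which boards a member may enter.
CREATE TABLE board_access (id_member INTEGER, id_board INTEGER);
INSERT INTO boards VALUES (1, 'News'), (2, 'Staff'), (3, 'Blogs');
INSERT INTO topics VALUES (10, 1, 1), (11, 2, 1), (12, 3, 1), (13, 3, 0);
INSERT INTO board_access VALUES (7, 1), (7, 3);
""")

# Form 1: the list of board ids is injected into the query text.
with_list = conn.execute("""
    SELECT t.id_topic, b.name
    FROM topics AS t
    INNER JOIN boards AS b ON b.id_board = t.id_board
    WHERE t.id_board IN (1, 3)
    ORDER BY t.id_topic
""").fetchall()

# Form 2: the list comes from a subselect evaluated by the database.
with_subselect = conn.execute("""
    SELECT t.id_topic, b.name
    FROM topics AS t
    INNER JOIN boards AS b ON b.id_board = t.id_board
    WHERE t.id_board IN (SELECT id_board FROM board_access WHERE id_member = ?)
    ORDER BY t.id_topic
""", (7,)).fetchall()

print(with_list == with_subselect)
```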
Quote
Privacy settings will eventually be on their own tables anyway...? (I'm thinking of committing at least the table structures -- privacy_thoughts, privacy_boards and privacy_topics.)
This is the only way that we can realistically allow for multiple privacy settings on a single element, e.g. "allow my friends and my co-workers on my blog, but not my family." Well, other than having a comma-separated field of groups, of course... (?!)
Actually that's still likely to be fairly cheap.
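As a rough illustration of why a separate table handles "friends and co-workers, but not family", here is a sketch with Python's sqlite3. Only the name privacy_boards comes from the discussion above; the column layout, the contact_lists and list_members tables, and the can_enter helper are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact_lists (id_list INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE list_members (id_list INTEGER, id_member INTEGER);
-- One row per (board, allowed list): several privacy settings per element.
CREATE TABLE privacy_boards (id_board INTEGER, id_list INTEGER,
                             PRIMARY KEY (id_board, id_list));
INSERT INTO contact_lists VALUES (1, 'friends'), (2, 'co-workers'), (3, 'family');
INSERT INTO list_members VALUES (1, 100), (2, 101), (3, 102);
-- Board 5 (a blog) is open to friends and co-workers only.
INSERT INTO privacy_boards VALUES (5, 1), (5, 2);
""")


def can_enter(id_member, id_board):
    """True if the member belongs to at least one list allowed on the board."""
    row = conn.execute("""
        SELECT 1
        FROM privacy_boards AS p
        INNER JOIN list_members AS m ON m.id_list = p.id_list
        WHERE p.id_board = ? AND m.id_member = ?
        LIMIT 1
    """, (id_board, id_member)).fetchone()
    return row is not None


for member in (100, 101, 102):
    print(member, can_enter(member, 5))
```

The check is a join between two tables on an indexed-style key rather than a long injected list, which is the cheap case.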
Quote
Err... Modders WILL have to rewrite their code entirely to fit Wedge anyway... Might as well ask them to re-learn everything at the same time, eh.
That isn't what I was thinking about.

Telling modders that if they want to adhere to board privileges, they just have to use {query_enter_board} or whichever one it is, is nice and easy. Telling them that to do it with an extra join and whatnot is a lot more complicated to explain.
Quote
To use a join, you mean?
And faster?

Yes, a join will typically be faster when you're getting into thousands of rows.
Title: Re: Getting ready for an alpha release...
Post by: b4pjoe on August 28th, 2012, 05:44 PM
Quote from Nao on August 28th, 2012, 04:31 PM
Quote from b4pjoe on August 26th, 2012, 09:11 PM
I know it had been mentioned of maybe an Aug. 25th release date for the Alpha version of Wedge. Did it happen and I am just missing it or is it going to be later?
You didn't miss anything...
I said it was my target date. i.e. the date after which I decided to 'freeze' the code and only focus on making an alpha release viable. (Actually, I froze the codebase a couple of weeks ago already.)
You have no idea the amount of work it requires just to have something usable for everyone...

And frankly -- I've been busy IRL. Saw my little sister for the first time in over 2 years, met my niece for the first time, was ill at some point, went to the movies (I really, really needed to see TDKR that badly, sorry for taking some time off from time to time!), and having difficulties in other areas that I won't expand on.

So, yeah, Wedge has been in the works for over 2 years now, and I'd REALLY like for a version to be out for testing.
My current schedule is:
- finish the stuff I'm still working on (it'll probably take about a week, given that I've done about 2 thirds of the work so far, in the last couple of weeks),
- package the release, publish it somewhere on the private boards,
- wait for at least 5-6 people to start testing it, wait about a week, fix any bugs that have been reported,
- and release the alpha in public. So... That should hopefully happen in September. With a personal goal to release the stable version in late 2012 (but again, it's a personal goal -- not an official schedule. Our official schedule is "When it's done", as we said about a hundred times already.)

With a huge bold red warning saying that you shan't use it in production -- only as a test. I wouldn't even recommend to create plugins or skins with it -- I don't think Pete is finished with the plugin code, and I'm reserving the right to break the skin code because it has a couple of inconsistencies and might be a bit convoluted for proper use in some areas.
Thanks for the info and my apologies if I sounded like I was complaining. I had no intention of that. I understand real life should always come first.
Title: Re: Getting ready for an alpha release...
Post by: godboko71 on August 28th, 2012, 05:46 PM
Nao, thanks for having a life; it makes some of us feel better. b4pjoe, tone can be hard to decipher. I don't think anyone took offence, though I can't be sure or in any way speak for anyone :)
Title: Re: Getting ready for an alpha release...
Post by: markham on August 28th, 2012, 07:39 PM
Quote from Arantor on August 28th, 2012, 05:07 PM
Quote
I don't see the point of using youtube-nocookie, really..? Aren't you being a bit paranoid? What's in that cookie anyway?
This is the internet where your privacy is being eroded every single day. I don't know about you but I don't like my privacy being eroded daily. I don't like the fact that if I watch a YouTube video, Google is tracking that fact and is able to track what I'm watching.
If that's being paranoid, then you can put my name down for that club. The point is, we really don't know what exactly is in those cookies. Whenever anyone embeds a YouTube video, I do edit the post to change the URL to the non-cookie version. I'd rather not have to do that.
Title: Re: Getting ready for an alpha release...
Post by: Nao on September 7th, 2012, 07:17 PM
Quote from Arantor on August 28th, 2012, 05:07 PM
This is the internet where your privacy is being eroded every single day. I don't know about you but I don't like my privacy being eroded daily. I don't like the fact that if I watch a YouTube video, Google is tracking that fact and is able to track what I'm watching.
(Implemented a few revs back.)
Quote
I was doing experiments on my old PC - dual core Athlon x64 with 8GB RAM, running Windows. It was hardly well-optimised but if you can imagine I was seeing half-second load times under 2k boards...
In what type of query?
Quote
I can certainly get behind using mediumint for these,
(Implemented a few revs back.)
Quote
far more than I can with making them ints, especially given how many places use these things...
Alrighty.
Quote
Telling modders that if they want to adhere to board privileges, they just have to use {query_enter_board} or whichever one it is, is nice and easy. Telling them that to do it with an extra join and whatnot is a lot more complicated to explain.
Hmm, no, I don't see the issue..?

SELECT something
FROM table
{query_enter_board}
WHERE condition

And {query_enter_board} = JOIN boards AS b ON (b.id_board IN {list_of_boards_I_can_enter}...
No? Isn't that just as simple as having WHERE condition AND {query_enter_board}...?
I know that I used something similar in Noisen (you still have the diff patch), and it felt natural to me.
Quote
Yes, a join will typically be faster when you're getting into thousands of rows.
That's good to know...
Title: Re: Getting ready for an alpha release...
Post by: Arantor on September 7th, 2012, 07:53 PM
Quote
In what type of query?
The board index.
Quote
Hmm, no, I don't see the issue..?

SELECT something
FROM table
{query_enter_board}
WHERE condition

And {query_enter_board} = JOIN boards AS b ON (b.id_board IN {list_of_boards_I_can_enter}...
No? Isn't that just as simple as having WHERE condition AND {query_enter_board}...?
I know that I used something similar in Noisen (you still have the diff patch), and it felt natural to me.
What you end up doing is an INNER JOIN to something else. OK, let's figure out what that would actually give you. Something simple like the last 5 topics in all the boards we can see. (Expanding it to actually encompass this above)

SELECT t.id_topic, b.name
FROM {db_prefix}topics AS t
INNER JOIN {db_prefix}boards AS b ON (b.id_board IN (1,2,3,4,5,6))
WHERE t.approved = 1 AND b.id_board = t.id_board
ORDER BY t.id_topic DESC
Quote
id   select_type   table   type   possible_keys   key   key_len   ref   rows   Extra
1   SIMPLE   t   ref   approved,id_board,last_message_pinned,board_news   approved   1   const   1   Using where; Using filesort
1   SIMPLE   b   eq_ref   board_id   board_id   3   wedge.t.id_board   1
As opposed to
SELECT t.id_topic, b.name
FROM {db_prefix}topics AS t
INNER JOIN {db_prefix}boards AS b ON (b.id_board = t.id_board)
WHERE b.id_board IN (1,2,3,4,5,6) AND t.approved = 1
ORDER BY t.id_topic DESC
Quote
id   select_type   table   type   possible_keys   key   key_len   ref   rows   Extra
1   SIMPLE   t   ref   approved,id_board,last_message_pinned,board_news   approved   1   const   1   Using where; Using filesort
1   SIMPLE   b   eq_ref   board_id   board_id   3   wedge.t.id_board   1
This is not a perfect example; performance-wise it's mostly a push because the join doesn't actually benefit you in any way whatsoever. The ONLY way that join will ever be a benefit (and the context in which I actually meant that comment) is if you're not doing it this way.

The bottleneck in these queries is not the method of parsing or injection. It's the fact you're still doing what amounts to WHERE id_board = 1 OR id_board = 2 OR id_board = 3 etc. It doesn't matter a pair of fetid dingo's kidneys that it's been rewritten to the JOIN because it still has to be evaluated as that to actually perform the join.

When I referred to JOINs being faster, they are faster when you're dealing with tables, rather than injected variables like that. Hell, even a join to a subquery might be slightly faster if evaluating the selection criteria there.
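A small sketch of the distinction, again with Python's sqlite3 standing in for MySQL. The member_boards table is hypothetical, and this only demonstrates that the three forms are equivalent in result; it is not a benchmark and makes no claim about MySQL's planner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (id_topic INTEGER PRIMARY KEY, id_board INTEGER);
-- Hypothetical table holding the boards each member can enter.
CREATE TABLE member_boards (id_member INTEGER, id_board INTEGER);
INSERT INTO topics VALUES (1, 1), (2, 2), (3, 3), (4, 2), (5, 4);
INSERT INTO member_boards VALUES (7, 1), (7, 2), (8, 4);
""")

# Injected list: the ids are pasted into the query text.
in_list = conn.execute(
    "SELECT id_topic FROM topics WHERE id_board IN (1, 2) ORDER BY id_topic"
).fetchall()

# What that list amounts to: a chain of OR comparisons.
or_chain = conn.execute(
    "SELECT id_topic FROM topics WHERE id_board = 1 OR id_board = 2 ORDER BY id_topic"
).fetchall()

# Table-to-table join: the ids stay in the database and are matched by the join.
table_join = conn.execute("""
    SELECT t.id_topic
    FROM topics AS t
    INNER JOIN member_boards AS mb ON mb.id_board = t.id_board
    WHERE mb.id_member = ?
    ORDER BY t.id_topic
""", (7,)).fetchall()

print(in_list == or_chain == table_join)
```

Only the third form gives the database a second table (and its indexes) to work with, which is the case where a join has a chance of being faster.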
Quote
Isn't that just as simple as having WHERE condition AND {query_enter_board}...?
No, it gets more complicated. Plugin authors cannot rely on {query_see_board} actually joining {db_prefix}boards AS b, because it wouldn't for admin users. Which means you either have to force that extra join for admin users (as opposed to bypassing it entirely for performance), or you have to join the table manually yourself to get things like the board names.

Which means that to reliably get board names you'd actually have to do this:
SELECT t.id_topic, b2.name
FROM {db_prefix}topics AS t
INNER JOIN {db_prefix}boards AS b ON (b.id_board IN (1,2,3,4,5,6))      <------- this is the query_see_board line
INNER JOIN {db_prefix}boards AS b2 ON (b2.id_board = t.id_board)
WHERE t.approved = 1 AND b.id_board = t.id_board
ORDER BY t.id_topic DESC

And then it starts to hurt, doing two joins to the same table instead of doing the correct WHERE clause.
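The admin problem can be sketched in a few lines. This is Python with sqlite3 rather than Wedge's actual code: the expand function, the way the placeholder is substituted, and the exact join condition are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE boards (id_board INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE topics (id_topic INTEGER PRIMARY KEY, id_board INTEGER, approved INTEGER);
INSERT INTO boards VALUES (1, 'News'), (2, 'Staff'), (3, 'Blogs');
INSERT INTO topics VALUES (10, 1, 1), (11, 2, 1), (12, 3, 1);
""")


def expand(sql, board_ids, is_admin):
    # Hypothetical expansion: a join for normal users, nothing for admins
    # (the performance bypass), so alias "b" only exists for non-admins.
    if is_admin:
        join = ""
    else:
        ids = ", ".join(str(int(i)) for i in board_ids)
        join = ("INNER JOIN boards AS b "
                f"ON (b.id_board = t.id_board AND b.id_board IN ({ids}))")
    return sql.replace("{query_see_board}", join)


def run(sql, board_ids=(), is_admin=False):
    try:
        return conn.execute(expand(sql, board_ids, is_admin)).fetchall()
    except sqlite3.OperationalError as exc:
        return "error: " + str(exc)


# Relies on the alias "b" that the placeholder may or may not create.
NAIVE = """SELECT t.id_topic, b.name FROM topics AS t
{query_see_board}
WHERE t.approved = 1 ORDER BY t.id_topic DESC"""

# Joins boards a second time as "b2" so the name is always available.
ROBUST = """SELECT t.id_topic, b2.name FROM topics AS t
{query_see_board}
INNER JOIN boards AS b2 ON (b2.id_board = t.id_board)
WHERE t.approved = 1 ORDER BY t.id_topic DESC"""

member_naive = run(NAIVE, board_ids=[1, 3])
admin_naive = run(NAIVE, is_admin=True)      # fails: "b" was never joined
member_robust = run(ROBUST, board_ids=[1, 3])
admin_robust = run(ROBUST, is_admin=True)

print(member_naive, admin_naive, member_robust, admin_robust, sep="\n")
```

The naive query works for a regular member but breaks for an admin, while the robust one works for both at the price of joining the same table twice for everyone else.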
Quote
That's good to know...
Like everything it's relative to context. Certain cases will outperform others.