Wedge

Nao · « Reply #**120**, on March 14th, 2012, 09:51 AM »

Quote from Arantor on March 8th, 2012, 03:31 PM

Wasn't there a t.approved = 1 test there as well? I don't remember. But it wouldn't surprise me if it were buggy.

No, it was just an example... I don't think Recent items are in danger of being seen.
But we'll definitely need to audit most of the queries.

I think I'll do it like I did on Noisen.com back when I implemented the feature... I'll create a few topics, add a hidden word in them, set privacy to myself on them, and tell everyone that if they ever find the word, they should contact me immediately and I'll reward them for helping protect the website's security settings.
It'll only be made more important by the fact that people are really gonna use these privacy settings...

Oh, while I'm at it.
I'm still working on database structure, getting close to the end thankfully. The major change these last few days is that I decided that, if topic privacy is going to be so specific ("member 34 and group 12..."), it would only make sense to use the same feature on board access. We'll need to keep in touch to make sure it doesn't conflict with board access/view permissions, obviously... (I'm thinking board privacy should be tested first, and then these permissions. Does Wedge do a {query_see_board} BEFORE the permissions are tested? In which case it should be fine.)

Now, my main concern is with the has_privacy field. It's a tinyint(1) field (i.e. bool), so it won't take any significant space in the DB, but I'm not sure that the field will be taken into account by the key selection system in MySQL...
For performance reasons, I *suppose* it's best to do the test for that bit as late as possible in the process...? i.e. if its index isn't used, might as well execute the test last, so that it doesn't waste time doing a filesort or something on the table? (Yeah, after all these years I'm still THAT bad at MySQL optimization...)
In which case it'd mean we have to move all query_see_board tests to the end of all queries. I don't know.
Now, the boards table has a member_groups key, all alone by itself, so maybe it's really not a problem and our has_privacy field will not make it slower... (Assuming that the board table has thousands of entries, which is ridiculous normally, but not when you add a feature to allow people to create their own blogs. Will do... Later.)

Anyway...
So, if has_privacy = 0, there's nothing else to do. If has_privacy = 1, we join the privacy_boards table (I've switched the two words so that all privacy boards are next to each other, easier debugging), and we test for availability.
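
A minimal sketch of that has_privacy test, in Python with SQLite standing in for MySQL. The `boards` and `privacy_boards` layouts, the column names and the `visible_boards` helper are assumptions made for illustration, not Wedge's actual schema. It uses `EXISTS` rather than a plain JOIN, so that a board matched by both a member row and a group row is only listed once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE boards (
        id_board INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        has_privacy INTEGER NOT NULL DEFAULT 0
    );
    -- Hypothetical layout: one row per member or group allowed into a board.
    CREATE TABLE privacy_boards (
        id_board INTEGER NOT NULL,
        id_member INTEGER,
        id_group INTEGER
    );
    INSERT INTO boards VALUES (1, 'Public', 0), (2, 'Private blog', 1), (3, 'Staff', 1);
    INSERT INTO privacy_boards VALUES (2, 34, NULL), (3, NULL, 12);
""")


def visible_boards(conn, id_member, group_ids):
    """Return the IDs of boards that this member may see."""
    # An empty group list becomes IN (NULL), which never matches.
    placeholders = ",".join("?" * len(group_ids)) or "NULL"
    sql = f"""
        SELECT b.id_board
        FROM boards AS b
        WHERE b.has_privacy = 0
           OR EXISTS (
                SELECT 1
                FROM privacy_boards AS p
                WHERE p.id_board = b.id_board
                  AND (p.id_member = ? OR p.id_group IN ({placeholders}))
           )
        ORDER BY b.id_board
    """
    return [row[0] for row in conn.execute(sql, [id_member, *group_ids])]
```

Here member 34 with no groups would get boards 1 and 2, while any member of group 12 would get boards 1 and 3. Whether MySQL actually skips the subquery for has_privacy = 0 rows is exactly the kind of thing that would need a query plan to confirm.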

Also, Noisen's topic privacy feature has a 'default' which uses its parent board privacy settings. (IIRC it'll always use the parent board's privacy and then apply its own on top.) I'm assuming that all 'sensitive' topic queries in SMF/Wedge already do a query_see_board on them, so it shouldn't be a problem, but what if it is...? We don't want to show, in a list of topics, some topic that has no privacy settings but is in a board that shouldn't be viewed... But I suppose this privacy issue would have been spotted first, right..?
Obviously we'd have to remove the member_groups field from wedge_boards and have TE include some code to import these entries into the privacy_boards table. Shouldn't be a problem...

NB: board privacy is not yet included, I only did the table in the sql file, I just want to discuss it and make sure it's the right direction... But I somehow suspect that if people are gonna be able to create their own boards (once we have a permission for them to do so), they'll be *thrilled* to know that they can fine-tune who gets to see them.
Noisen has such a feature already, but if I'm going to write a complex JavaScript UI for maintaining privacy lists, might as well use it everywhere...

NB2: Okay, I looked into board access... Oh my.
Wedge adds the ability to choose whether a group can KNOW (view) about a board, can ACCESS a board, and can be prevented from knowing or accessing a board regardless of other group memberships.
That is very fine, but if the board is created by a user who wants to prevent someone they dislike from viewing it, they won't be able to do it because they can't create membergroups... Unless we allow them to do just that (which would also make topic privacy and thought privacy simpler to manage, let's just say...)

Maybe I'm "in the wrong". Maybe our "contact lists" should be membergroups and so on. Maybe we need to silently create a membergroup that contains just users 5 and 47 if topic creator requests that only users 5 and 47 can read it. (And remove the membergroup if they change privacy settings for that topic and no other privacy area uses that group.)
I don't know...
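
One way to picture the "silent membergroup" idea: a registry that reuses a single hidden group per distinct set of members, and drops the group once no privacy setting references it any more. This is only a sketch in Python; the `ImplicitGroups` class, its method names and the starting ID are all hypothetical.

```python
class ImplicitGroups:
    """Keeps one hidden group per distinct set of member IDs."""

    def __init__(self, first_id=1000):
        self._group_of = {}    # frozenset of member IDs -> group ID
        self._members_of = {}  # group ID -> frozenset of member IDs
        self._uses = {}        # group ID -> number of privacy areas using it
        self._next_id = first_id

    def acquire(self, member_ids):
        """Return the group for this exact set of members, creating it if needed."""
        key = frozenset(member_ids)
        group_id = self._group_of.get(key)
        if group_id is None:
            group_id = self._next_id
            self._next_id += 1
            self._group_of[key] = group_id
            self._members_of[group_id] = key
            self._uses[group_id] = 0
        self._uses[group_id] += 1
        return group_id

    def release(self, group_id):
        """Drop one reference; delete the group when nothing uses it any more."""
        self._uses[group_id] -= 1
        if self._uses[group_id] == 0:
            key = self._members_of.pop(group_id)
            del self._group_of[key]
            del self._uses[group_id]

    def exists(self, group_id):
        return group_id in self._uses
```

Two topics restricted to users 5 and 47 would share one group, and the group would only disappear after both topics have changed their privacy settings.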

Oh my. It's never going to end, is it...?

Quote

Would need benchmarking and query plans to be sure, but AFAIK, if the first branch is matched, the rest aren't if a given row would be returned in the OR case (if x OR y causes a row to be returned, it will be returned as soon as x is matched, or if x isn't matched, then return if y is matched)

Then it's a bit surprising that m.approved tests being done at the end of a query still gather attention from big board owners...
(Or they really need to buy huge equipment and not bother. I don't know how Facebook does it...)

Quote

Quote
Ready to digest, or giving up?
I've spent the last 24 hours trying to make sense of FTP.

So, not digesting well... :P

Quote

I could not conceive of a more backwards-ass specification if I *tried*, and to a degree I've just been thinking about everything else rather than focusing on this >_> I keep getting feelings of having so much to do, you know?

Wedge just isn't fun to work on when it requires going through the entire codebase and rewiring everything...

I'm just a bit sad that I have so much desire for user freedom when writing privacy settings, and that it suddenly seems so unrealistic a goal to achieve with regards to database performance...

Arantor · « Reply #**122**, on March 14th, 2012, 12:54 PM »

That's the thing, it's about control, and who has what control.

The admin can set up who's in what groups, not the user. So it seems a bit odd to expect the user to use that setup, whereas if they're using buddy lists, they have control over who sees their topic.

Nao · « Reply #**123**, on March 14th, 2012, 02:39 PM »

Admin = can control ownerless groups
User = can only control own groups

No?

Arantor · « Reply #**124**, on March 14th, 2012, 02:49 PM »

Yes, and that's the point.

If a user wants to restrict who can see a topic themselves, they want control over it, i.e. setting the list of people who can see it. If they don't have control over that list, why would they want to use it?

Nao · « Reply #**125**, on March 14th, 2012, 02:53 PM »

Well... Friend list?

Member groups would get extra fields. I'm sure it'll be all right.
My main concern is that if you have thousands of friends, your membergroup field will overflow in the member table...

Arantor · « Reply #**126**, on March 14th, 2012, 02:53 PM »

Which is why it can't really be done that way :(

Nao · « Reply #**127**, on March 14th, 2012, 02:55 PM »

Except if we increase the text field size..?

Arantor · « Reply #**128**, on March 14th, 2012, 02:58 PM »

I guess 1MB[1] is a big enough container for most sites, but it's really not that useful when it's scaled up, because you just can't index that field in any fashion, which forces a table scan to query it.

1.	Yes, I know mediumtext is defined as 16MB, it's not a typo. In the real world, there is a cap on the query packet size, which on most hosts is set to 1MB. There are also other limits that may come into play too.

Nao · « Reply #**129**, on March 14th, 2012, 03:16 PM »

Yep, but even 1MB sounds quite big when it comes to doing tests like id_group IN (...) and then you start listing the entire 1MB field in it...

My idea is just that contact lists and membergroups are one and the same thing. They're groups of members. One member can be in multiple groups, and a group can hold multiple members, so it's a many-to-many relationship, and I don't really see why SMF's membergroup memberships are never kept anywhere but in the members table.
Look at ManageMembergroups: to retrieve a list of members for a specific group, it does a query on the members table with a find_in_set... Not exactly optimized.
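
To make the comparison concrete, here is a rough sketch of the two lookups, in Python with SQLite standing in for MySQL. SQLite has no FIND_IN_SET, so the comma-list match is emulated with LIKE; the `additional_groups` column name, the `memberships` table and both helper functions are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (
        id_member INTEGER PRIMARY KEY,
        additional_groups TEXT NOT NULL DEFAULT ''
    );
    -- Hypothetical relational table: one row per (group, member) pair.
    CREATE TABLE memberships (
        id_group INTEGER NOT NULL,
        id_member INTEGER NOT NULL,
        PRIMARY KEY (id_group, id_member)
    );
    INSERT INTO members VALUES (1, '2,12'), (2, '12'), (3, '1,120'), (4, '');
""")

# Fill the relational table from the comma-separated lists.
rows = conn.execute("SELECT id_member, additional_groups FROM members").fetchall()
for id_member, groups in rows:
    for group in filter(None, groups.split(",")):
        conn.execute("INSERT INTO memberships VALUES (?, ?)", (int(group), id_member))


def members_by_scan(conn, id_group):
    """Comma-list lookup: every row has to be read and string-matched."""
    sql = """
        SELECT id_member FROM members
        WHERE ',' || additional_groups || ',' LIKE ?
        ORDER BY id_member
    """
    return [row[0] for row in conn.execute(sql, (f"%,{id_group},%",))]


def members_by_table(conn, id_group):
    """Relational lookup: can be answered from the (id_group, id_member) key."""
    sql = "SELECT id_member FROM memberships WHERE id_group = ? ORDER BY id_member"
    return [row[0] for row in conn.execute(sql, (id_group,))]
```

Both functions return the same members; the difference is that the first cannot use an index on the text field, while the second is a plain key lookup.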

But even if we leave membergroups aside, and add contact lists in a similar fashion to how UltimateProfile did it (i.e. keep the buddy_list field in the members table for find_in_set searches, and add a relational table that holds lists of user IDs relative to a buddy list owner), the problem is exactly the same... Instead of holding membergroup IDs in the buddy_list field, you hold member IDs, and it's quite likely that there will be many more members than there are membergroups (eh), so if you have hundreds of 'buddies', the field will be overflowing before you know it... AFAIK at least.

So, basically:

With contact lists:
- a contact_lists table that holds a list of all existing contact lists (id_contact_list + id_owner)
- a contacts table that holds a list of all members associated with a specific contact list (id_contact_list + id_member + possibly id_owner)
- a buddy_list field in the members table that holds a list of all members who are in your own contact lists
- possibly replaced with 'simply' a comma-separated list of your contact lists. In which case, a JOIN will be needed to get the member list. Making the comma list a bit useless.

With membergroups:
- the existing membergroups table, with a new field (id_owner) indicating who created it, and maybe add a new type (contacts) or something.
- possibly a memberships table (similar to the contacts table), holding the id_member / id_membergroup relationship, and maybe a 'hidden' flag (i.e. whether or not this membership should be disclosed), things like that, and an id_owner to avoid having to join the membergroups table if you want to retrieve a list of someone's friends.
- the existing members table, with a comma-separated list of membergroups that you BELONG to. If you need to obtain the list of your friends, you would do a JOIN on the memberships table, as mentioned above.
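
The membergroups variant could look something like the sketch below (Python, with SQLite standing in for MySQL). Table and column names follow the list above where it gives them; the sample rows, the index and the `friends_of` helper are made up for illustration. Note how id_owner on the memberships table lets the friends query skip the membergroups table entirely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE membergroups (
        id_group INTEGER PRIMARY KEY,
        group_name TEXT NOT NULL,
        id_owner INTEGER NOT NULL DEFAULT 0  -- 0 = ownerless, i.e. admin-managed
    );
    CREATE TABLE memberships (
        id_group INTEGER NOT NULL,
        id_member INTEGER NOT NULL,
        id_owner INTEGER NOT NULL DEFAULT 0, -- copied from the group to avoid a join
        hidden INTEGER NOT NULL DEFAULT 0,   -- 1 = membership should not be disclosed
        PRIMARY KEY (id_group, id_member)
    );
    CREATE INDEX memberships_owner ON memberships (id_owner);

    INSERT INTO membergroups VALUES (1, 'Administrator', 0), (50, 'Friends', 7), (51, 'Family', 7);
    INSERT INTO memberships VALUES
        (1, 7, 0, 0),
        (50, 5, 7, 0), (50, 47, 7, 0),
        (51, 47, 7, 0), (51, 60, 7, 1);
""")


def friends_of(conn, id_owner, include_hidden=False):
    """List everyone in the groups owned by id_owner, without joining membergroups."""
    sql = "SELECT DISTINCT id_member FROM memberships WHERE id_owner = ?"
    if not include_hidden:
        sql += " AND hidden = 0"
    sql += " ORDER BY id_member"
    return [row[0] for row in conn.execute(sql, (id_owner,))]
```

Member 47 sits in both of user 7's groups but is only listed once, and the hidden membership of member 60 only shows up when explicitly requested.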

I think we already discussed the fact that find_in_set isn't very efficient and that maybe we should move membergroups to outside the members field... If we move it entirely out, then the field size is no longer an issue. Otherwise, it still is, but not for some time at least.

Meh...

Arantor · « Reply #**130**, on March 14th, 2012, 03:27 PM »

Quote

My idea is just that contact lists and membergroups are one and the same thing. They're groups of members. One member can be in multiple groups, and a group can hold multiple members, so it's a many-to-many relationship, and I don't really see why SMF's membergroup memberships are never kept anywhere but in the members table.

OK, last point first, that's mostly historical. MySQL until later on didn't have subselects at all, and we even removed the block on subselects in the code. Not only that, but there's also a performance consideration - going back, the belief was fewer queries = better, and if you can't do subselects, the best method is to denormalise the data somewhat and do what was done.

As far as membergroups vs contact lists go, yes, they're fundamentally the same thing - lists of users - but membergroups have a lot more to them. There's multiple types of membergroups, they have badges, they have PM limits and permissions and *stuff* attached to them. They also get the distinction of having three separate elements of the members table (primary group, post count group, other groups)

I have no real objections to making contact lists and membergroups basically the same thing, provided that proper protections are taken to ensure that the two are not permitted to overlap (e.g. you can't edit or view a contact list anywhere you're not supposed to be able to and vice versa)

In fact, that reminds me of something I considered... I considered the possibility of ditching post count groups as they are currently implemented and reimplementing it separately so that they're not physical groups, but instead a separate entity (that has badges etc.) and rearranging permissions so that they're set up based on being granted at different post counts rather than by groups. Needs more thought to explain what I have in mind but it would simplify this setup and move it closer to what you're looking at doing.
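
As a rough illustration of post count ranks that aren't physical groups, the rank could simply be derived from the post count whenever it's needed, so there is no per-member group ID to update when thresholds change. This is a Python sketch only; the thresholds, rank names and the `rank_for` helper are invented for the example.

```python
from bisect import bisect_right

# Illustrative thresholds only: (minimum posts, rank name), sorted by minimum posts.
RANKS = [
    (0, "Newbie"),
    (50, "Jr. Member"),
    (100, "Full Member"),
    (250, "Sr. Member"),
]


def rank_for(posts, ranks=RANKS):
    """Return the rank name for a post count, computed at read time."""
    thresholds = [minimum for minimum, _ in ranks]
    # bisect_right counts how many thresholds are <= posts.
    index = bisect_right(thresholds, posts) - 1
    return ranks[max(index, 0)][1]
```

Changing a threshold then only means editing the list; nothing has to be rewritten in the members table. Permissions granted at given post counts could be looked up the same way.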

Nao · « Reply #**131**, on March 14th, 2012, 04:03 PM »

Quote from Arantor on March 14th, 2012, 03:27 PM

OK, last point first, that's mostly historical. MySQL until later on didn't have subselects at all, and we even removed the block on subselects in the code. Not only that, but there's also a performance consideration - going back, the belief was fewer queries = better, and if you can't do subselects, the best method is to denormalise the data somewhat and do what was done.

Yes, I remember that it was mostly done because SMF is old... :P
So to me that's an extra reason to just drop these.

Quote

As far as membergroups vs contact lists go, yes, they're fundamentally the same thing - lists of users - but membergroups have a lot more to them. There's multiple types of membergroups, they have badges, they have PM limits and permissions and *stuff* attached to them.

But, because membergroups are not usually exclusive, if you don't apply any setting to a membergroup, Wedge won't even bother with it... Basically, if id_owner > 0, then Wedge should reject them from the admin area when it comes to applying special settings to them.

Quote

They also get the distinction of having three separate elements of the members table (primary group, post count group, other groups)

If id_owner > 0, then the post count system shouldn't even be taken into account... (i.e. members shouldn't be able to create a post group. Hence, min_posts would be at 0, and should incur no performance penalty.)

Quote

I have no real objections to making contact lists and membergroups basically the same thing, provided that proper protections are taken to ensure that the two are not permitted to overlap (e.g. you can't edit or view a contact list anywhere you're not supposed to be able to and vice versa)

Doesn't seem hard to me ;)

I'm more interested in determining whether you think that performance-wise, having a separate membership table would be better for membergroups overall (not only for contact lists.)

Quote

In fact, that reminds me of something I considered... I considered the possibility of ditching post count groups as they are currently implemented and reimplementing it separately so that they're not physical groups, but instead a separate entity (that has badges etc.) and rearranging permissions so that they're set up based on being granted at different post counts rather than by groups. Needs more thought to explain what I have in mind but it would simplify this setup and move it closer to what you're looking at doing.

Hmm... About that, I have a feeling it could be complicated to implement. And if it doesn't help with performance but only with UIs, I would recommend against it.

Arantor · « Reply #**132**, on March 14th, 2012, 04:12 PM »

Quote

But, because membergroups are not usually exclusive, if you don't apply any setting to a membergroup, Wedge won't even bother with it... Basically, if id_owner > 0, then Wedge should reject them from the admin area when it comes to applying special settings to them.

Cool, that works for me. As long as the manage membergroups area doesn't allow access to contact lists, and contact lists don't allow for editing regular membergroups, I'm happy with that.
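
The separation being agreed on here boils down to a small rule, sketched below in Python. The function name and parameters are hypothetical; the only thing taken from the discussion is that id_owner = 0 marks an ordinary, admin-managed membergroup and id_owner > 0 marks someone's contact list.

```python
def can_manage_group(id_owner, id_member, is_admin):
    """Decide who may edit a group under the ownerless/owned split."""
    if id_owner == 0:
        # Regular membergroup: admin area only.
        return is_admin
    # Contact list: only its owner, and never through the admin area.
    return id_owner == id_member
```

So an admin gets regular groups but not user 7's lists (unless the admin is user 7), and user 7 gets their own lists but no regular groups.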

Quote

I'm more interested in determining whether you think that performance-wise, having a separate membership table would be better for membergroups overall (not only for contact lists.)

It's a tough one to call, because there's more to it than topic privacy.

Namely, the way permissions are generally loaded would have to be rethought, as would whether we store the user's primary group in their user record too. What we'd end up doing is pushing the permissions check out of loading the member's main record, and pushing it off to loadPermissions, where we'd end up querying for the user's groups LEFT JOIN the permissions table for the permissions for those groups (since that's the only way to identify users in group 1, who don't have any entries in the permissions tables otherwise)

And of course you have to be careful not to join the contact lists for the purposes of permissions.
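
Something along these lines, as a sketch in Python with SQLite standing in for MySQL. The table layouts, the sample permission names and the `load_permissions` helper are assumptions for illustration and not the real loadPermissions code. The LEFT JOIN keeps group 1 in the result even though it has no permission rows, and the id_owner = 0 condition keeps contact lists out of the permission check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE membergroups (id_group INTEGER PRIMARY KEY, id_owner INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE memberships (id_member INTEGER NOT NULL, id_group INTEGER NOT NULL);
    CREATE TABLE permissions (id_group INTEGER NOT NULL, permission TEXT NOT NULL);

    -- Group 1 = admins (no permission rows), 2 = a regular group, 50 = a contact list.
    INSERT INTO membergroups VALUES (1, 0), (2, 0), (50, 7);
    INSERT INTO memberships VALUES (7, 1), (8, 2), (8, 50);
    INSERT INTO permissions VALUES (2, 'moderate_board'), (50, 'should_never_apply');
""")


def load_permissions(conn, id_member):
    """Collect a member's real groups and the permissions attached to them."""
    sql = """
        SELECT ms.id_group, p.permission
        FROM memberships AS ms
        JOIN membergroups AS mg
            ON mg.id_group = ms.id_group AND mg.id_owner = 0
        LEFT JOIN permissions AS p
            ON p.id_group = ms.id_group
        WHERE ms.id_member = ?
    """
    groups, permissions = set(), set()
    for id_group, permission in conn.execute(sql, (id_member,)):
        groups.add(id_group)
        if permission is not None:  # NULL means the group has no permission rows
            permissions.add(permission)
    return {"is_admin": 1 in groups, "groups": groups, "permissions": permissions}
```

With an inner join on permissions instead, member 7 would come back with no rows at all and never be recognised as an admin.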

That part is mostly not a *huge* deal in performance, but where it will make a difference is topic privacy as you suspect. The problem though is that I don't know *how* it's going to make a difference. Honestly I don't think there's any better way than to benchmark it and see what happens.

What I do know is that it will have a beneficial impact on handling users who want to implement shared forums (multiple forums with a single members table) to not have the groups tied explicitly to users in the members table.

Quote

Hmm... About that, I have a feeling it could be complicated to implement. And if it doesn't help with performance but only with UIs, I would recommend against it.

While the UI aspect is the primary aspect, there is a slight performance gain to be had, because you don't have to go through and update folks' post count groups if you change them, which is an ugly query. It also means you don't have the same management issues in other respects... people that don't want post count groups have a habit of deleting them.

That might not sound like a problem but I've seen WAY too many cases where people delete all the post count groups. That wouldn't be so bad if it weren't for the problem that results where it fucks groups up leading to any user anywhere having full admin access because of the way the post count assignation query works.

I'd almost argue that the UI simplification it would bring for setting permissions would actually be enough.

Nao · « Reply #**133**, on March 14th, 2012, 07:20 PM »

Still. Your post scared the shit out of me!
And I don't care if it isn't proper English :P


Would you like to have topic privacy options in Wedge?
