Splitting this topic into its own from the original Selectbox topic...
Please read the bold characters below and tell us your opinion!
Would you like to have topic privacy options in Wedge?
Chosen is slick, really slick. It even works great on iPad.
I'm not sure what the options on Noisen are.
Normal topic privacy setup would imply:
* topic starter only can see it
* topic starter and their contacts can see it
* topic starter and moderators
* anyone who can access the board
This should cover all the main cases. I don't think list of groups is needed, and I think that if it's down to the above, a simple number could deal with it.
Unfortunately, I tested it on my iPod and it has issues:
- because there's always a combo box, Safari will enlarge the page and focus on the input box. And because of that, you get the keyboard and you suddenly can't see the list of items at all... Good luck browsing it!
either a complete list of friends, or simply 'friend types', something I will probably add in the future (close friends, family, co-workers, etc.) I'm just not sure whether it'll be something that doesn't hurt performance, really.
- friend granularity would be hard, or even impossible, with a simple number. Unless we give friend groups a unique ID for everyone (like, I'm new to this site, and the first friend group I'll create will have id #2356 because there are already 2000+ other contact lists), and we start these IDs above 3 or 4 (used for default, logged in members, and all friends.)
It would probably make a lot of sense, and would certainly help with sql queries.
- Okay for moderators, you've convinced me they should be kept out. Still, I think it's best to leave that option aside -- just have 'Just Me'.
- I'd like to have some user opinions on contact lists. Who do you think does it best? Noisen.com (if you ever used it)? Facebook? Google+? SMF?
- Additionally, how would contact granularity hurt performance more than a general 'My Contacts' choice...? Considering we'll be hitting an extra table in every case?
No, I had a specific reason for 'just me + moderators'. It gives you a poor man's helpdesk, a place for discussing reasons for bans, a place for discussing 'application forms' on clan-type sites... The list goes on, and it's much easier for us to implement it there in the core than it would ever be to bolt it on later.
Quote: I'd like to have some user opinions on contact lists. Who do you think does it best? Noisen.com (if you ever used it)? Facebook? Google+? SMF?
Definitely not SMF.
Because it wouldn't just be an extra table hit. If you want to store anything other than a simple number, you have to either implode it and store it inline, or you have to store it in another table, which means that's *two* extra tables vs what we have now, not one.
And believe me, the notion of putting an imploded field in the topics table is a no-no, seeing how it would make the entire topics table an order of magnitude slower because right now there are no variable-size fields in it, which is a very, very good thing.
If it's kept as a simple number, it's possible to solve a touch more efficiently, because what you can then do is figure out who the users are who have the current user as a friend, and turn it into (where topic starter = me OR (privacy = friends AND topic starter IN (list of people who friended me))).
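To make the 'simple number' idea concrete, here's a minimal sketch in Python with an in-memory SQLite database standing in for MySQL. The table names (`topics`, `buddies`), the column names and the three privacy values are all assumptions made up for the illustration, not Wedge's schema. I've also added a 'privacy = everyone' branch, which the condition above leaves implicit.

```python
import sqlite3

# Hypothetical privacy values, for illustration only.
PRIVACY_EVERYONE, PRIVACY_FRIENDS, PRIVACY_JUST_ME = 0, 1, 2

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (id_topic INTEGER PRIMARY KEY, id_member_started INTEGER, privacy INTEGER);
CREATE TABLE buddies (id_owner INTEGER, id_buddy INTEGER, PRIMARY KEY (id_owner, id_buddy));
INSERT INTO topics VALUES (1, 1, 0), (2, 1, 1), (3, 1, 2), (4, 2, 1), (5, 3, 2);
INSERT INTO buddies VALUES (1, 3);  -- member 1 friended member 3
""")

def visible_topics(conn, me):
    # Step 1: find everyone who has `me` as a friend (one small query).
    friended_me = [r[0] for r in conn.execute(
        "SELECT id_owner FROM buddies WHERE id_buddy = ?", (me,))]
    # Step 2: inline that list into the topic query; IN (NULL) matches nothing.
    placeholders = ",".join("?" * len(friended_me)) or "NULL"
    sql = f"""
        SELECT id_topic FROM topics
        WHERE privacy = ?
           OR id_member_started = ?
           OR (privacy = ? AND id_member_started IN ({placeholders}))
        ORDER BY id_topic"""
    params = [PRIVACY_EVERYONE, me, PRIVACY_FRIENDS, *friended_me]
    return [r[0] for r in conn.execute(sql, params)]

print(visible_topics(conn, 3))  # [1, 2, 5]
```

Member 3 sees the public topic, member 1's friends-only topic and their own private topic. The ORs are still there, so this only shows the shape of the condition, not how it would perform on a large topics table.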
The one thing to realise about topic privacy, and this is quite important: it is going to suck compared to board privacy. It's unavoidable, because there's no way to do it in a way that adds extra conditions that can be evaluated without ORs (except in the just me or everyone cases) - ORs are bad for performance because they're an extra branch and often virtually a sub-query in their own right.
1. If a generic list like Friends, Family, etc., store {friends}, {family} and use $txt[trim($name, '{}')] or whatever at display time.
Hmm.. Okay, okay... But I don't see this being useful in Thoughts, for instance...
It was kind of rhetorical. Noisen has asynchronous contacts, plus the ability to hide selective contacts from viewers. That one will be in Wedge too, once I get to implementing contact lists...
Oh... I see what you're talking about -- selecting several contact lists for viewers.
For instance -- if you have a family-only post, you select your family. If you have a work-only post, you select your co-workers. If you have a friends-only post, you select your friends. Among which can be some of your family and co-worker list members, of course. The point is having the ability to put some people into multiple lists. Then, when a list is modified, the 'buddy_list' field in {db_prefix}members is updated to reflect the entire list of contacts.
Although I'm not sure we'll be using that field much in the future... But IIRC there are reasons to leave it in.
If you start setting privacy settings on everything in your profile for instance, it'll be a disaster if you have to set multiple contact lists in each. I think it's much smarter to encourage people to put their 'safe' friends into a special list, and give that list all permissions, and deny the rest to anyone else -- guests, members and contacts that aren't in the safe list.
So when we check for a topic's privacy validity, we just retrieve its privacy setting, if <10 (for instance) it's a special setting like 'guests' or 'members' or 'just me' or 'just me + mods' or anything else we can think of, if >=10 it's a contact list, so we INNER JOIN wedge_contacts AS c ON t.privacy = c.id_list AND c.id_member = {int:myself}, or something like that...
As I see it, it's (WHERE i'm_admin OR topic starter = me OR (privacy >= 10 AND me IN (SELECT id_member FROM wedge_contacts WHERE privacy = id_list)))
Does that make sense...?
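For what it's worth, the 'below 10 is a special setting, 10 and up is a contact list ID' scheme can be sketched like this, again in Python with SQLite standing in for MySQL. The numbering (1 = everyone, 2 = members, 3 = just me), the column names and the guest ID of 0 are assumptions for the example; the subselect is the one discussed above.

```python
import sqlite3

# Hypothetical numbering; values >= FIRST_LIST_ID are contact list IDs.
EVERYONE, MEMBERS, JUST_ME = 1, 2, 3
FIRST_LIST_ID = 10

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (id_topic INTEGER PRIMARY KEY, id_member_started INTEGER, privacy INTEGER);
CREATE TABLE contacts (id_member INTEGER, id_list INTEGER, PRIMARY KEY (id_member, id_list));
INSERT INTO topics VALUES (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 10), (5, 1, 11);
INSERT INTO contacts VALUES (2, 10), (3, 10), (3, 11);
""")

SEE_TOPIC = """
    SELECT t.id_topic FROM topics AS t
    WHERE :is_admin
       OR t.id_member_started = :me
       OR t.privacy = :everyone
       OR (t.privacy = :members AND :me > 0)
       OR (t.privacy >= :first_list AND :me IN (
              SELECT c.id_member FROM contacts AS c
              WHERE c.id_list = t.privacy))
    ORDER BY t.id_topic"""

def visible_topics(conn, me, is_admin=False):
    # me = 0 stands for a guest in this sketch.
    params = {"is_admin": int(is_admin), "me": me, "everyone": EVERYONE,
              "members": MEMBERS, "first_list": FIRST_LIST_ID}
    return [row[0] for row in conn.execute(SEE_TOPIC, params)]

print(visible_topics(conn, 2))  # [1, 2, 4]
print(visible_topics(conn, 3))  # [1, 2, 4, 5]
```

A single integer column is enough to carry both the special settings and the list IDs, as long as list IDs never drop below the threshold.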
I don't even know HOW exactly it's not KILLING performance, this one...
I'm not making advances when it comes to the select box, BTW... I'm still unsure where to start from!
Not so important for Thoughts, but it is important for topic privacy.
It was, yes, which is why I didn't really get into it. Facebook is now sort of asynchronous, Google is asynchronous by design and creates multiple lists for you to be asynchronous with. But I think that might be a bit too complex for what's needed in Wedge.
#
# Table structure for table `contact_lists`
#
CREATE TABLE {$db_prefix}contact_lists (
id_list mediumint(8) NOT NULL auto_increment,
id_owner mediumint(8) NOT NULL DEFAULT 0,
PRIMARY KEY (id_list),
KEY member (id_owner)
) AUTO_INCREMENT=10 ENGINE=MyISAM;
#
# Table structure for table `contacts`
#
CREATE TABLE {$db_prefix}contacts (
id_member mediumint(8) NOT NULL DEFAULT 0,
id_list mediumint(8) NOT NULL DEFAULT 0,
is_synchronous tinyint(1) unsigned NOT NULL DEFAULT 0,
position tinyint(4) NOT NULL DEFAULT 0,
updated int(11) NOT NULL DEFAULT 0,
hidden tinyint(1) unsigned NOT NULL DEFAULT 0,
PRIMARY KEY (id_member, id_list)
) ENGINE=MyISAM;
If buddy_list is a single list of comma-separated users, it's queryable without having to do an extra query. It'll be slow, but it'll be doable.
If it's *anything* else, we'll have to query it independently, decode it (I'm assuming unserialize), then build a query based on that. It's going to suck in performance terms.
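A rough illustration of the two storage options, in Python with SQLite standing in for MySQL. SQLite has no FIND_IN_SET, so the sketch emulates it with `instr()` on a comma-wrapped string. The `id_owner` column on `contacts` is an assumption for the example and is not in the table structure posted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (id_member INTEGER PRIMARY KEY, buddy_list TEXT NOT NULL DEFAULT '');
CREATE TABLE contacts (id_owner INTEGER, id_member INTEGER, PRIMARY KEY (id_member, id_owner));
INSERT INTO members VALUES (1, '3,12'), (2, '13,30'), (3, ''), (4, '3');
INSERT INTO contacts VALUES (1, 3), (1, 12), (2, 13), (2, 30), (4, 3);
""")

def friended_by_csv(conn, me):
    # Stand-in for FIND_IN_SET(me, buddy_list): wrap both sides in commas so
    # that 3 does not match 13 or 30. Every members row has to be inspected.
    sql = """SELECT id_member FROM members
             WHERE instr(',' || buddy_list || ',', ',' || ? || ',') > 0
             ORDER BY id_member"""
    return [r[0] for r in conn.execute(sql, (me,))]

def friended_by_table(conn, me):
    # Same question against a normalised table; the primary key starts with
    # id_member, so an index lookup is at least possible here.
    sql = "SELECT id_owner FROM contacts WHERE id_member = ? ORDER BY id_owner"
    return [r[0] for r in conn.execute(sql, (me,))]

print(friended_by_csv(conn, 3), friended_by_table(conn, 3))  # [1, 4] [1, 4]
```

Both return the same members. The difference is that the string match has to look at every row, while the table version gives the database something it can index.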
It's still multiple lists to manage, and I just don't think that's entirely necessary. On a social network like Facebook or Google, where you're inherently sharing information that may be suitable for some but not all people, it's important to have that granularity. On a forum, it just isn't necessary.
Don't inner join. I get where you're going but inner join is a bad place to be. All of the queries (or at least, all the *important* queries that rely on topic visibility; there are many but the important ones like topic display for example) rely on having only one result returned, and inner join will generate multiple rows in the result.
It's hurting but you probably wouldn't notice it until you got to really huge boards with many many many topics.
Quote: I'm not making advances when it comes to the select box, BTW... I'm still unsure where to start from!
You know you're going to end up designing your own in the end...
So we need to establish a list of privacy IDs and their corresponding meaning...
I'm starting with the basics:
1- everyone
2- members
3- just me (author & admins)
Now we'll need to update the Import tool to actually convert buddy lists to contact lists (and automatically create a default list for every user that has at least one buddy.) I don't know if it's best to do it from the importer tool, or from within Wedge if the table is empty etc... I'd say the importer.
Slower than its equivalent subselect with a secondary table?
The problem with subselects is that they often (always??) require a table scan to complete, even if done on the proper index...
This is something that doesn't happen with INNER JOINs.
The only thing we can/may/should/will/shall/would/whatever store in the data field of the member table is the list of contact lists you have. $contact_lists = unserialize($member['data']['contacts']) or something. Would need to be done on every page load (to get the list of friend groups for thought privacy), and other uses (such as users viewing a profile etc) can be done through a quick sql query.
So... You're suggesting no lists at all? Just plain contacts...?
I don't know about that. I think contact lists would have more pros than cons. And no one is forced to create multiple lists... It's just good to have them.
Heck... Either lists (id >= 10) or 'all contacts' is easily doable in a subselect. We either select id_members who are associated to our stored id_list (>= 10), or id_members who are associated with the id_member owner of the list. In which case it'd be best to store the list owner's id in the contacts table as well, to save time. Or just do a find_in_set on their buddy_list of course... But buddy lists are limited in size, unlike the contacts table.
Also, as I said above, from my experience, subselects don't use indexes. If you could help here, because you're the mysql specialist out of us both, it'd be nice to be able to use subselects, if only because it'd make life a hell of a lot easier when using {query_see_topic} in the code I'll eventually import from the Noisen diff...
My idea was to have such boards rely on Wedge...
Then again, it may never happen at all.
The only performance bottleneck I've been told about so far is the random list of items in the media homepage. That's because it retrieves all entries, randomizes them, and returns the first few. I still have an entry in my to-do list to add a pseudo-randomizer variable in each item...
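One possible reading of the 'pseudo-randomizer variable' idea, sketched in Python with SQLite: each item gets a random key stored once, and the homepage picks a random pivot and reads the next few keys from an index instead of shuffling the whole table. The `media` table, the `rand_key` column and the function name are hypothetical, and this may not be what was actually planned.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE media (id_media INTEGER PRIMARY KEY, rand_key REAL NOT NULL);
CREATE INDEX media_rand ON media (rand_key);
""")
key_source = random.Random(42)  # keys are assigned once, when an item is added
conn.executemany("INSERT INTO media VALUES (?, ?)",
                 [(i, key_source.random()) for i in range(1, 51)])

def pick_random_items(conn, count, rng=random):
    # Pick a random pivot and take the next `count` keys after it.
    pivot = rng.random()
    rows = conn.execute(
        "SELECT id_media FROM media WHERE rand_key >= ? ORDER BY rand_key LIMIT ?",
        (pivot, count)).fetchall()
    if len(rows) < count:
        # Ran off the end of the key range: wrap around to the start.
        rows += conn.execute(
            "SELECT id_media FROM media WHERE rand_key < ? ORDER BY rand_key LIMIT ?",
            (pivot, count - len(rows))).fetchall()
    return [r[0] for r in rows]

print(pick_random_items(conn, 5))
```

The trade-off is that items with neighbouring keys always show up together, so it's only pseudo-random; re-rolling the keys from time to time would shuffle the neighbourhoods.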
I'm hoping not.
Quote: So we need to establish a list of privacy IDs and their corresponding meaning...
I'm starting with the basics:
1- everyone
2- members
3- just me (author & admins)
Personally I'd rather have 0 = everyone, 1 = me, 2 = members (since 0 -> nothing to limit, 1 -> I'm the only 1) but I'm not particularly fussed.
The table structure seems straightforward enough to me.
As for when to do it, it should be done from the importer.
Quote: Slower than its equivalent subselect with a secondary table?
Oh hell yes. FIND_IN_SET is bad for a reason. Like the fact you cannot under any circumstances make a usable index on it.
You're right that it will cause a table scan, but only if the subquery is used IN () directly (this is not something I've done very often) but it's interesting to note that 3 years ago it was flagged as being solved in MySQL 6 as per http://bugs.mysql.com/bug.php?id=18826
But you need to be careful. INNER JOIN may be faster but you then have to process the results of a result-table that now has multiple rows, potentially many many rows you didn't want in the first place.
It's one of the reasons the board index query is fucked up, because it inner-joins the moderators table to the list of boards, so if you have 100 boards, each with 2 moderators, you get 200 rows back of which most of it is duplicated.
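The 100 boards with 2 moderators each case is easy to reproduce. Here's a small Python/SQLite sketch with made-up table and column names, not the real board index query. DISTINCT is shown as one possible way to collapse the duplicates, at the cost of dropping the moderator column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE boards (id_board INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE moderators (id_board INTEGER, id_member INTEGER,
                         PRIMARY KEY (id_board, id_member));
""")
conn.executemany("INSERT INTO boards VALUES (?, ?)",
                 [(b, f"Board {b}") for b in range(1, 101)])
conn.executemany("INSERT INTO moderators VALUES (?, ?)",
                 [(b, m) for b in range(1, 101) for m in (1, 2)])

# One row per (board, moderator) pair: the board data is repeated.
joined = conn.execute("""
    SELECT b.id_board, b.name, m.id_member
    FROM boards AS b
    INNER JOIN moderators AS m ON m.id_board = b.id_board""").fetchall()

# Same join, but only board columns and DISTINCT: back to one row per board.
distinct_boards = conn.execute("""
    SELECT DISTINCT b.id_board, b.name
    FROM boards AS b
    INNER JOIN moderators AS m ON m.id_board = b.id_board""").fetchall()

print(len(joined), len(distinct_boards))  # 200 100
```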
If the list's owner is stored in the table of contacts, why does it even need to be in the members table at all? Index the owner and you're golden.
Personally I just don't see the point. I can't think that many people are going to create topics that are visible to only a subset of a subset of friends.
Then again, I know it happens on LiveJournal which does make it a viable target for us (blogging context), I guess.
Consequently it must slow things down compared to a base SMF install,
Other possible privacy values..?
"18 years old or older"
"13 years old or older"
Etc...
I'd say it's possibly better than having a 'mature' flag on posts.
I was thinking maybe add a 'privacy' field for contact_lists as well... Although granularity for this would be hell -- what if I want my contact list to be able to browse the list they're in? Or if I don't want to? Shall I be able to select multiple viewing groups...?
Uh. v6 is a long time from now...
Is this just in an IN () situation?
I mean, could it work with something like SELECT t.id_member WHERE (SELECT TRUE FROM wedge_other AS o WHERE t.id_member = o.id_other).....? (Just off the top of my head.)
I'm not sure... If there's an IN (), we'll also be getting multiple entries just the same...?
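On the 'multiple entries' question, a quick way to check is a tiny test case. The sketch below (Python with SQLite; the schema, including `id_owner` on `contacts`, is invented for the example) puts the viewer in two of the topic starter's lists and compares a join, an IN () and an EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (id_topic INTEGER PRIMARY KEY, id_member_started INTEGER);
CREATE TABLE contacts (id_owner INTEGER, id_list INTEGER, id_member INTEGER,
                       PRIMARY KEY (id_member, id_list));
INSERT INTO topics VALUES (1, 1), (2, 1), (3, 2);
-- Member 7 sits in two of member 1's lists (10 and 11); member 2 lists nobody.
INSERT INTO contacts VALUES (1, 10, 7), (1, 11, 7);
""")
ME = 7

def ids(sql):
    return [r[0] for r in conn.execute(sql, {"me": ME})]

# The join returns one row per matching contact entry.
via_join = ids("""SELECT t.id_topic FROM topics AS t
    INNER JOIN contacts AS c
        ON c.id_owner = t.id_member_started AND c.id_member = :me
    ORDER BY t.id_topic""")

# IN () only asks whether a match exists, so each topic appears once.
via_in = ids("""SELECT t.id_topic FROM topics AS t
    WHERE t.id_member_started IN (
        SELECT c.id_owner FROM contacts AS c WHERE c.id_member = :me)
    ORDER BY t.id_topic""")

# Correlated EXISTS: same answer as IN ().
via_exists = ids("""SELECT t.id_topic FROM topics AS t
    WHERE EXISTS (SELECT 1 FROM contacts AS c
                  WHERE c.id_owner = t.id_member_started AND c.id_member = :me)
    ORDER BY t.id_topic""")

print(via_join, via_in, via_exists)  # [1, 1, 2, 2] [1, 2] [1, 2]
```

So IN () and EXISTS filter rows without multiplying them, whereas the join repeats a topic for every matching contact row. This says nothing about which form MySQL executes fastest; that would still need testing.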
Anyway -- query_see_topic does an inner join, and it's fucking ugly because it makes inserting the query_see_topic variable more complicated than, say, query_see_board. Having it use an IN() would make it much more elegant.
The ability to create a blog for your professional friends... And another for your drinking buddies. (Same goes for topics, although less important.)
LJ is definitely not 'the' popular blog platform these days, but they still have got 'something'. They're also the ones who have rotating avatars, which I like... (Although not THAT much, eh.)
(We could also offer to disable topic privacy settings...)
This seems like it's potentially *very* complicated and something I'm not entirely sure I want to get into, to be honest.
This sounds to me like overthinking for SCIENCE!
But the bug report was 3 *years* ago. I have no idea whether that's since been backported to 5.5 or 5.6 or 5.WTF any more. I think we need to try it sometime.
I believe it can be made to work like that but I'm not sure, I don't think in subselects very often.
It sort of depends, really. There are times it will, times it won't. It also depends on whether DISTINCT is present or not, which would solve the multi-row return case regardless of joins or in clauses.
Quote: The ability to create a blog for your professional friends... And another for your drinking buddies. (Same goes for topics, although less important.)
Hmmm. Part of me thinks that's a wonderful idea,
They're the only popular hosted blog platform I know of that still functions with a community aspect to it.
In the end, even as long as there is one line of code for it, it is going to be slower than SMF.
We can mitigate it (and given other things will be altered, it may even out in the end) but it must make a difference.
If anyone is reading this -- please tell us whether you think that it would be nice to be able to create contact lists (i.e. friends, family, work...) and whether you'd use the feature to fine-tune your topic/board privacy settings, or you just wouldn't bother yourself?
(New topic, perhaps?)
The first query does a full table scan and returns in X milliseconds. The second query does the exact same full table scan, and returns in X milliseconds (sometimes a tad more but they're all very variable). Finally, the last query does an index search, and returns in X*2 milliseconds. It's also the most 'stable' of all queries, but it's stable in that it's always slower than even the slowest return time for the first two queries...
But if we start thinking this way, then we might as well drop the concept of topic privacy entirely...?
Nope, not backported. From what I gather in the link you posted, subqueries are implemented in 5.x in a way that they can't use an index, and it works in 6.x not because they fixed that bug, but because they rewrote their subquery code.
I didn't, until we dropped support for MySQL 4.0... And, turns out, I always hated doing inner joins...
My girlfriend suggested I use DISTINCT for these just last night.
I'm currently helping her set up a SOAP client in PHP for a WSDL app at Oracle. Another clusterfuck, BTW... We have no idea what functions we're supposed to call (and AFAIK there's no way to request a list of available methods once the custom object is created), how we're supposed to identify, etc... Thank you Oracle for zero documentation. Plus it doesn't help that neither she nor I have any prior experience with SOAP...
I don't know why I'm mentioning that... Maybe there's a SOAP specialist around here.
Ah yes, you just reminded me why I'm looking up to them... LJ is the only blogging platform that actually encourages communication between blog authors, rather than having parallel blogs with their own comment authors and such. It's like LJ is a huge forum with boards set up as blogs, just like on Noisen. Back when I created Noisen (2007-2008), LJ was the only similar example, and I actually used them as an example of *why* it would eventually work. Noisen was pretty much supposed to be the 'French LJ'... Too bad it never really got momentum. I'm not very good at advertising my work. I prefer development work.
Because that's the thing here... When we work with normal-sized boards, there's no such thing as a slow query. Start adding bot posters or scraping or simply have a hugely successful site, and you get your first performance issues.
I'd love to see the results of EXPLAINs on those queries.
Quote: But if we start thinking this way, then we might as well drop the concept of topic privacy entirely...?
It's a tough call. How much is too much?
I thought 5.5 was actually a bastardisation of the 6.x branch anyway.
SOAP is... evil, and I'm not just saying that as a smelly hippie code hacker :P But if it's involving WSDL... WSDL is a language that indicates what services exist at a given URL, and what inputs are expected and what outputs will be given. It's sort of like XML-RPC but more convoluted IMO. (Yes, I've done SOAP work. It's not that exciting, but it should be manageable. It really depends whether you're doing it all by hand in PHP or using something like the Zend_Soap components.)
*nods* And given that LJ is really not working out so well at the moment (it's not been the same since it was sold off a bit back), maybe we should be pushing that harder.
Yup, but we can still try and optimise as best possible for those cases.