Messages - Arantor
5281
Features / Re: Privacy options
« on December 2nd, 2011, 01:08 AM »
It doesn't help that I've been wrapped up in a bastard funk mood lately that has just flatlined my enthusiasm for just about everything :( It's just like I've been so overloaded with *stuff*, both digitally and in real life.
Posted: December 2nd, 2011, 12:58 AM

Also...
Quote
Yes -- everyone, or just me (i.e. just the ability to write drafts...)
Is redundant, since you can formulate drafts quite happily...
5282
FAQs / [FAQ] Re: What is Wedge?
« on December 2nd, 2011, 01:01 AM »
Hmm. Such a thing is not simple, because the preparse step is called in different ways at different times (for instance, sendpm calls it internally, whereas createPost expects it to have been done already).

The times it's called with ENT_QUOTES are the ones where it expects to see a ' and/or a " to deal with. In posts, that is a valid situation to handle, especially since not doing so can manifest itself in a slightly disturbing way: it might potentially lead to a vulnerability. I'm not sufficiently comfortable with turning it into ENT_NOQUOTES for that reason.
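To make the quoting concern concrete, here is a minimal sketch. It is written in Python rather than PHP, so `html.escape` stands in for `htmlspecialchars`: `quote=True` plays roughly the role of ENT_QUOTES and `quote=False` that of ENT_NOQUOTES. The sample input and the `render_title_attr` template are made up for illustration and are not taken from the preparser.

```python
import html

# Hypothetical user input that tries to break out of an attribute value.
user_input = '" onmouseover="alert(1)'

# Rough analogue of ENT_QUOTES: both " and ' are escaped.
with_quotes = html.escape(user_input, quote=True)

# Rough analogue of ENT_NOQUOTES: only &, < and > are escaped.
without_quotes = html.escape(user_input, quote=False)


def render_title_attr(value):
    # Made-up template: the value lands inside a double-quoted attribute.
    return '<span title="' + value + '">post</span>'


safe_html = render_title_attr(with_quotes)
unsafe_html = render_title_attr(without_quotes)

print(safe_html)
print(unsafe_html)
```

With quotes left alone, the input closes the attribute early and injects an attribute of its own, which is the kind of risk that makes dropping the quote handling uncomfortable.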

Harmonisation, yes. Probably even moving it all into preparsecode, but with the caveat that it will require more than just moving a few calls, as sendpm will need updating.
5283
The Pub / Re: Logo Madness
« on December 2nd, 2011, 12:56 AM »
Sure I mean arse, but I was in a hurry earlier.
5284
The Pub / Re: Logo Madness
« on December 1st, 2011, 05:20 PM »
SMF kicks ass in its own ways too... Just not as hard as perhaps it might.
5285
FAQs / [FAQ] Re: What is Wedge?
« on December 1st, 2011, 05:18 PM »
There is already a shortened version of parse_bbc for such inline purposes, as it happens.

As for the rejection of regexp, yup, such things are called ReDoS situations, where a badly crafted post could tie the parser up, denying service to others. I don't actually dislike what Unknown has done in the original parse_bbc; I just wish it were faster. The real problem isn't the parser, though, but the preparser, which I think is ultimately going to have to be rewritten.
5286
Features / Re: Privacy options
« on December 1st, 2011, 10:33 AM »
Quote
The first query does a full table scan and returns in X milliseconds. The second query does the exact same full table scan, and returns in X milliseconds (sometimes a tad more but they're all very variable). Finally, the last query does an index search, and returns in X*2 milliseconds. It's also the most 'stable' of all queries, but it's stable in that it's always slower than even the slowest return time for the first two queries...
I'd love to see the results of EXPLAINs on those queries.
Quote
But if we start thinking this way, then we might as well drop the concept of topic privacy entirely...?
It's a tough call. How much is too much?
Quote
Nope, not backported. From what I gather in the link you posted, subqueries are implemented in 5.x in a way that they can't use an index, and it works in 6.x not because they fixed that bug, but because they rewrote their subquery code.
I thought 5.5 was actually a bastardisation of the 6.x branch anyway.
Quote
I didn't, until we dropped support for MySQL 4.0... And, turns out, I always hated doing inner joins...
Eh, I still don't.
Quote
My girlfriend suggested I use DISTINCT for these no later than last night.
Heh, can't say I'm surprised.
Quote
I'm currently helping her set up a SOAP client in PHP for a WSDL app at Oracle. Another clusterfuck, BTW... We have no idea what functions we're supposed to call (and AFAIK there's no way to request a list of available methods once the custom object is created), how we're supposed to identify, etc... Thank you Oracle for zero documentation. Plus it doesn't help that neither she nor I have any prior experience with SOAP...
I don't know why I'm mentioning that... Maybe there's a SOAP specialist around here
SOAP is... evil, and I'm not just saying that as a smelly hippie code hacker :P But if it's involving WSDL... WSDL is a language that indicates what services exist at a given URL, and what inputs are expected and what outputs will be given. It's sort of like XML-RPC but more convoluted IMO. (Yes, I've done SOAP work. It's not that exciting, but it should be manageable. It really depends whether you're doing it all by hand in PHP or using something like the Zend_Soap components.)
Quote
Ah yes, you just reminded me why I'm looking up to them... LJ is the only blogging platform that actually encourages communication between blog authors, rather than having parallel blogs with their own comment authors and such. It's like LJ is a huge forum with boards set up as blogs, just like on Noisen. Back when I created Noisen (2007-2008), LJ was the only similar example, and I actually used them as an example of *why* it would eventually work. Noisen was pretty much supposed to be the 'French LJ'... Too bad it never really got momentum. I'm not very good at advertising my work. I prefer development work.
*nods* And given that LJ is really not working out so well at the moment (it's not been the same since it was sold off a bit back), maybe we should be pushing that harder.
Quote
Because that's the thing here... When we work with normal-sized boards, there's no such thing as a slow query. Start adding bot posters or scraping or simply have a hugely successful site, and you get your first performance issues.
Yup, but we can still try and optimise as best possible for those cases.
5287
Features / Re: Privacy options
« on December 1st, 2011, 12:26 AM »
Quote
Other possible privacy values..?
"18 years old or older"
"13 years old or older"
Etc...
I'd say it's possibly better than having a 'mature' flag on posts.
This seems like it's potentially *very* complicated and something I'm not entirely sure I want to get into, to be honest.
Quote
I was thinking maybe add a 'privacy' field for contact_lists as well... Although granularity for this would be hell -- what if I want my contact list to be able to browse the list they're in? Or if I don't want to? Shall I be able to select multiple viewing groups...?
This sounds to me like overthinking for SCIENCE!
Quote
Uh. v6 is a long time from now...
Is this just in an IN () situation?
I mean, could it work with something like SELECT t.id_member WHERE (SELECT TRUE FROM wedge_other AS o WHERE t.id_member = o.id_other).....? (Just off the top of my head.)
But the bug report was 3 *years* ago. I have no idea whether that's since been backported to 5.5 or 5.6 or 5.WTF any more. I think we need to try it sometime.

I believe it can be made to work like that but I'm not sure, I don't think in subselects very often.
Quote
I'm not sure... If there's an IN (), we'll also be getting multiple entries just the same...?

Anyway -- query_see_topic does an inner join, and it's fucking ugly because it makes inserting the query_see_topic variable more complicated than, say, query_see_board. Having it use an IN() would make it much more elegant.
It sort of depends, really. There are times it will, times it won't. It also depends on whether DISTINCT is present or not, which would solve the multi-row return case regardless of joins or in clauses.
Quote
The ability to create a blog for your professional friends... And another for your drinking buddies. (Same goes for topics, although less important.)
Hmmm. Part of me thinks that's a wonderful idea, part of me thinks it's unnecessarily complicated. I'm not entirely convinced that people would use it that way. However I know it works for LiveJournal to do something like that, so there is that little niggling bit of my gut that says we should.
Quote
LJ is definitely not 'the' popular blog platform these days, but they still have got 'something'. They're also the ones who have rotating avatars, which I like... (Although not THAT much, eh.)
They're the only popular hosted blog platform I know of that still functions with a community aspect to it. Sure, WP.org is popular for blogging as is Blogger, but they distinctly lack the community factor - I know several people with LJ accounts who regularly refer to each other and talk amongst each other. (To a degree this is how I came to understand how its privacy filters worked)
Quote
(We could also offer to disable topic privacy settings...)
In the end, as long as there is even one line of code for it, it is going to be slower than SMF. We can mitigate it (and given that other things will be altered, it may even out in the end), but it must make a difference.


If anyone is reading this -- please tell us whether you think that it would be nice to be able to create contact lists (i.e. friends, family, work...) and whether you'd use the feature to fine-tune your topic/board privacy settings, or you just wouldn't bother yourself?

(New topic, perhaps?)
5288
Off-topic / Re: Doctor Who
« on November 30th, 2011, 10:09 PM »
It didn't work, but being written by 10 year olds and not being horrendous is a good sign for their future as writers...
5289
Off-topic / Re: Doctor Who
« on November 30th, 2011, 08:54 PM »
Which was written by a bunch of 10 year olds, and quite amusing to boot.
5290
Off-topic / Re: Doctor Who
« on November 30th, 2011, 08:48 PM »
Other than the top entry which is the 2011 Christmas special, most of the extra bits are just special extra scenes, like deleted scenes or extra scenes. They're all on the season 6 boxed set. (The special isn't... It hasn't even been shown here yet ;))
5291
Features / Re: Privacy options
« on November 30th, 2011, 04:32 PM »
Quote
So we need to establish a list of privacy IDs and their corresponding meaning...
I'm starting with the basics:
1- everyone
2- members
3- just me (author & admins)
Personally I'd rather have 0 = everyone, 1 = me, 2 = members (since 0 -> nothing to limit, 1 -> I'm the only 1) but I'm not particularly fussed.
Quote
Now we'll need to update the Import tool to actually convert buddy lists to contact lists (and automatically create a default list for every user that has at least one buddy.) I don't know if it's best to do it from the importer tool, or from within Wedge if the table is empty etc... I'd say the importer.
The table structure seems straightforward enough to me. As for when to do it, it should be done from the importer.
Quote
Slower than its equivalent subselect with a secondary table?
Oh hell yes. FIND_IN_SET is bad for a reason. Like the fact you cannot under any circumstances make a usable index on it.
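A rough way to see the indexing difference, sketched with Python's built-in sqlite3 rather than MySQL. SQLite has no FIND_IN_SET, so the comma-list lookup is emulated with `instr`; the `contacts` table, its `id_owner` column and the index name are assumptions made for the example, and MySQL's planner will not report things in the same words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Comma-separated list stored inline, as with buddy_list.
    CREATE TABLE members (id_member INTEGER PRIMARY KEY, buddy_list TEXT);
    -- Hypothetical secondary table: one row per (owner, contact) pair.
    CREATE TABLE contacts (id_owner INTEGER NOT NULL, id_member INTEGER NOT NULL);
    CREATE INDEX idx_contacts_member ON contacts (id_member, id_owner);
""")


def plan(sql, params=()):
    # Each plan row is a 4-tuple whose last item is the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# "Who lists member 42 as a buddy?" against the inline comma list.
list_plan = plan(
    "SELECT id_member FROM members "
    "WHERE instr(',' || buddy_list || ',', ',' || ? || ',') > 0",
    (42,),
)

# The same question against the secondary table.
table_plan = plan("SELECT id_owner FROM contacts WHERE id_member = ?", (42,))

print(list_plan)
print(table_plan)
```

The first plan has to walk every member row because the match happens inside a string, while the second can go straight to the matching index entries.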
Quote
The problem with subselects, is that they often (always??) require a table scan to complete, even if done on the proper index...
This is something that doesn't happen with INNER JOINs.
You're right that it will cause a table scan, but only if the subquery is used directly in an IN () (this is not something I've done very often). It's interesting to note that 3 years ago it was flagged as being solved in MySQL 6, as per http://bugs.mysql.com/bug.php?id=18826

But you need to be careful. INNER JOIN may be faster but you then have to process the results of a result-table that now has multiple rows, potentially many many rows you didn't want in the first place.

It's one of the reasons the board index query is fucked up, because it inner-joins the moderators table to the list of boards, so if you have 100 boards, each with 2 moderators, you get 200 rows back of which most of it is duplicated.
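The 100 boards with 2 moderators each case is easy to reproduce. This is a small sketch using Python's sqlite3 with made-up, cut-down `boards` and `moderators` tables; the real board index query selects far more columns than this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE boards (id_board INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE moderators (id_board INTEGER, id_member INTEGER);
""")

# 100 boards, each with the same 2 moderators.
conn.executemany(
    "INSERT INTO boards VALUES (?, ?)",
    [(b, "Board %d" % b) for b in range(1, 101)],
)
conn.executemany(
    "INSERT INTO moderators VALUES (?, ?)",
    [(b, m) for b in range(1, 101) for m in (1, 2)],
)

# Joining moderators onto boards repeats every board once per moderator.
joined = conn.execute(
    "SELECT b.id_board, b.name, m.id_member "
    "FROM boards AS b INNER JOIN moderators AS m ON m.id_board = b.id_board"
).fetchall()

# DISTINCT on the board columns collapses the duplicates again.
distinct_boards = conn.execute(
    "SELECT DISTINCT b.id_board, b.name "
    "FROM boards AS b INNER JOIN moderators AS m ON m.id_board = b.id_board"
).fetchall()

print(len(joined), len(distinct_boards))  # prints "200 100"
```

Every board column is carried twice in the joined result, which is the duplication that then has to be processed away on the PHP side.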
Quote
The only thing we can/may/should/will/shall/would/whatever store in the data field of the member table is the list of contact lists you have. $contact_lists = unserialize($member['data']['contacts']) or something. Would need to be done on every page load (to get the list of friend groups for thought privacy), and other uses (such as users viewing a profile etc) can be done through a quick sql query.
If the list's owner is stored in the table of contacts, why does it even need to be in the members table at all? Index the owner and you're golden.
Quote
So... You're suggesting no lists at all? Just plain contacts...?
I don't know about that. I think contact lists would have more pros than cons. And no one is forced to create multiple lists... It's just good to have them.
Personally I just don't see the point. I can't think that many people are going to create topics that are visible to only a subset of a subset of friends. Then again, I know it happens on LiveJournal which does make it a viable target for us (blogging context), I guess.
Quote
Heck... Either lists (id >= 10) or 'all contacts' is easily doable in a subselect. We either select id_members who are associated to our stored id_list (>= 10), or id_members who are associated with the id_member owner of the list. In which case it'd be best to store the list owner's id in the contacts table as well, to save time. Or just do a find_in_set on their buddy_list of course... But buddy lists are limited in size, unlike the contacts table.
FIND_IN_SET is the devil.
Quote
Also, as I said above, from my experience, subselects don't use indexes. If you could help here, because you're the mysql specialist out of us both, it'd be nice to be able to use subselects, if only because it'd make life a hell of a lot easier when using {query_see_topic} in the code I'll eventually import from the Noisen diff...
See above. They only don't if they're plugged into an IN () clause, not if they're other types of subselect.
Quote
My idea was to have such boards rely on Wedge...
Then again, it may never happen at all.
The only performance bottleneck I've been told so far, is the random list of items in the media homepage. That's because it retrieves all entries, randomizes them, and returns the first few. I still have an entry in my to-do-list to add a pseudo-randomizer variable in each item...
Not what I meant. I didn't mean *forum*, I specifically meant *board* as in a board within a site.

Right now, access is controlled at board level. Once you enter a board, you don't have any incremental performance concerns about access rights. A board with 1 topic has the same overhead as a board with 1m topics in it as far as assessing access to that board goes.

If you have topic-specific granularity, you have to do more work to assess it, specifically there's an extra overhead on the board index, message index, display, attachments... anywhere that has to assess topic access, which is more complex than board access (because you have to implicitly do both, though you can do it so that board access is evaluated first and if that's not going to let them in, you can skip the topic access checks)

Consequently it must slow things down compared to a base SMF install, but you probably wouldn't notice it on Noisen until you got to boards that had many many many topics. For the one-off cases like attachments or topics themselves, the incremental cost is not significantly higher; it's the cases where you're assessing a lot at once (like board index, message index) that suffer.
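The "board access first, topic access only if that passes" ordering can be sketched like this in Python. All of it is illustrative: the `Topic` fields, the privacy numbering (0 = everyone, 1 = just me, 2 = members) and the counters are assumptions for the example, not Wedge code.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    id_board: int
    id_starter: int
    privacy: int  # assumed scheme: 0 = everyone, 1 = just me, 2 = members


def can_see_topic(user_id, is_guest, allowed_boards, topic, counters):
    # Board access is the cheap test, so it runs first.
    counters["board_checks"] += 1
    if topic.id_board not in allowed_boards:
        return False  # topic privacy is never evaluated
    counters["topic_checks"] += 1
    if topic.privacy == 0:
        return True
    if topic.privacy == 1:
        return topic.id_starter == user_id
    if topic.privacy == 2:
        return not is_guest
    return False


counters = {"board_checks": 0, "topic_checks": 0}
topics = [Topic(1, 5, 0), Topic(1, 5, 1), Topic(2, 5, 0), Topic(1, 9, 2)]
results = [can_see_topic(9, False, {1}, t, counters) for t in topics]

print(results, counters)
```

The topic in board 2 is rejected on the board test alone, so the per-topic work is only paid for topics in boards the user can already enter.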
Quote
I'm hoping not.
Maybe not, but none of the ones you've seen thus far are ideal, so you're going to end up Frankensteining two or more together, and then putting your own spin on it anyway...
5292
Features / Re: New revs
« on November 30th, 2011, 03:25 PM »
(1 file, 1KB)

Revision: 1183
Author: arantor
Date: 30 November 2011 14:25:03
Message:
! Missing ; would cause the resultant minified code to choke. For once IE9 came to the rescue, telling me what character the error was on, rather than just the line number... (script.js)
----
Modified : /trunk/Themes/default/scripts/script.js
5293
Features / Re: Privacy options
« on November 30th, 2011, 12:26 PM »
Quote
Hmm.. Okay, okay... But I don't see this being useful in Thoughts, for instance...
Not so important for Thoughts, but it is important for topic privacy.
Quote
It was kind of rhetorical. Noisen has asynchronous contacts, plus the ability to hide selective contacts from viewers. That one will be in Wedge too, once I get to implementing contact lists...
It was, yes, which is why I didn't really get into it. Facebook is now sort of asynchronous, Google is asynchronous by design and creates multiple lists for you to be asynchronous with. But I think that might be a bit too complex for what's needed in Wedge.
Quote
Oh... I see what you're talking about -- selecting several contact lists for viewers.
Not even that. If it's a simple number, you can build it virtually into the main query so you only have to have an extra query to find out who has the current user as a friend. Doing anything extra requires another query on top of *that*.
Quote
For instance -- if you have a family-only post, you select your family. If you have a work-only post, you select your co-workers. If you have a friends-only post, you select your friends. Among which can be some of your family and co-worker list members, of course. The point is having the ability to put some people into multiple lists. Then, when a list is modified, the 'buddy_list' field in {db_prefix}members is updated to reflect the entire list of contacts.
If buddy_list is a single list of comma-separated users, it's queryable without having to do an extra query. It'll be slow, but it'll be doable. If it's *anything* else, we'll have to query it independently, decode it (I'm assuming unserialize), then build a query based on that. It's going to suck in performance terms.
Quote
Although I'm not sure we'll be using that field much in the future... But IIRC there are reasons to leave it in.
There are, but they're limited.
Quote
If you start setting privacy settings on everything in your profile for instance, it'll be a disaster if you have to set multiple contact lists in each. I think it's much smarter to encourage people to put their 'safe' friends into a special list, and give that list all permissions, and deny the rest to anyone else -- guests, members and contacts that aren't in the safe list.
It's still multiple lists to manage, and I just don't think that's entirely necessary. On a social network like Facebook or Google, where you're inherently sharing information that may be suitable for some but not all people, it's important to have that granularity. On a forum, it just isn't necessary. (Interestingly, LiveJournal makes this possible on a very granular level: you can create custom filters which work basically as discussed here, but they're created in such a way that it's not going to be that complicated... since every post is automatically put into a filter of sorts.)
Quote
So when we check for a topic's privacy validity, we just retrieve its privacy setting, if <10 (for instance) it's a special setting like 'guests' or 'members' or 'just me' or 'just me + mods' or anything else we can think of, if >=10 it's a contact list, so we INNER JOIN wedge_contacts AS c ON t.privacy = c.id_list AND c.id_member = {int:myself}, or something like that...
Don't inner join. I get where you're going but inner join is a bad place to be. All of the queries (or at least, all the *important* queries that rely on topic visibility; there are many but the important ones like topic display for example) rely on having only one result returned, and inner join will generate multiple rows in the result.
Quote
As I see it, it's WHERE (i'm_admin OR topic starter = me OR (privacy >= 10 AND me IN (SELECT id_member FROM wedge_contacts WHERE privacy = id_list)))
Does that make sense...?
Yes, and that's the way to do it. It's still going to hurt but probably hurt a bit less. Note that if it's an admin, we can safely not bother with this and define query_see_topic as 1=1 to avoid the whole fandango.
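A minimal sketch of how that condition could be assembled, again in Python with sqlite3 standing in for the real database layer. `build_query_see_topic`, the `topics` columns and the sample rows are all hypothetical; only `wedge_contacts`, `id_list`, `id_member` and the `>= 10` convention come from the discussion above.

```python
import sqlite3


def build_query_see_topic(user_id, is_admin):
    # Admins see everything, so the check collapses to a no-op condition.
    if is_admin:
        return "1=1"
    uid = int(user_id)  # coerce to int before embedding it in SQL text
    return (
        "(t.id_member_started = %d"
        " OR (t.privacy >= 10 AND %d IN"
        " (SELECT c.id_member FROM wedge_contacts AS c"
        " WHERE c.id_list = t.privacy)))" % (uid, uid)
    )


def visible_topics(conn, user_id, is_admin=False):
    sql = (
        "SELECT t.id_topic FROM topics AS t WHERE "
        + build_query_see_topic(user_id, is_admin)
        + " ORDER BY t.id_topic"
    )
    return [row[0] for row in conn.execute(sql)]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE topics (id_topic INTEGER PRIMARY KEY,
                         id_member_started INTEGER, privacy INTEGER);
    CREATE TABLE wedge_contacts (id_list INTEGER, id_member INTEGER);
    INSERT INTO topics VALUES (1, 7, 1), (2, 3, 10), (3, 3, 11), (4, 3, 1);
    INSERT INTO wedge_contacts VALUES (10, 7), (11, 8);
""")

print(visible_topics(conn, 7))        # prints "[1, 2]"
print(visible_topics(conn, 7, True))  # prints "[1, 2, 3, 4]"
```

Because the subselect sits in the WHERE clause rather than in a join, each topic comes back at most once however many contacts are in the list.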
Quote
I don't even know HOW exactly it's not KILLING performance, this one...
It's hurting but you probably wouldn't notice it until you got to really huge boards with many many many topics.
Quote
I'm not making advances when it comes to the select box, BTW... I'm still unsure where to start from!
You know you're going to end up designing your own in the end...
5294
Features / Re: Privacy options
« on November 30th, 2011, 02:31 AM »
Quote
- Okay for moderators, you've convinced me they should be kept out. Still, I think it's best to leave that option aside -- just have 'Just Me'.
No, I had a specific reason for 'just me + moderators'. There's the poor man's helpdesk, there's discussing reasons for bans, there's discussing 'application forms' on clan-type sites. The list goes on, and it's much easier for us to implement it there in the core than it would ever be to bolt it on later.
Quote
- I'd like to have some user opinions on contact lists. Who do you think does it best? Noisen.com (if you ever used it)? Facebook? Google+? SMF?
Definitely not SMF.
Quote
- Additionally, how would contact granularity hurt performance more than a general 'My Contacts' choice...? Considering we'll be hitting an extra table in every case?
Because it wouldn't just be an extra table hit. If you want to store anything other than a simple number, you have to either implode it and store it inline, or you have to store it in another table, which means that's *two* extra tables vs what we have now, not one. And believe me, the notion of putting an imploded field in the topics table is a no-no, seeing how it would make the entire topics table an order of magnitude slower because right now there are no variable-size fields in it, which is a very, very good thing.

If it's kept as a simple number, it's possible to solve a touch more efficiently, because what you can then do is figure out who the users are who have the current user as a friend, and turn it into (where topic starter = me OR (privacy = friends AND topic starter IN (list of people who friended me))).
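As a sketch of that two-step approach (one query to find who has friended the current user, then a plain IN list in the main query), in Python with sqlite3. The table layouts, the `PRIVACY_FRIENDS` value and the function name are assumptions for illustration, and only the two branches named above are covered.

```python
import sqlite3

PRIVACY_JUST_ME = 1
PRIVACY_FRIENDS = 3  # assumed value; the real numbering was still undecided


def visible_topic_ids(conn, me):
    # Query 1: who has the current user on their contact list?
    friended_me = [row[0] for row in conn.execute(
        "SELECT id_owner FROM contacts WHERE id_member = ?", (me,))]
    # Query 2: fold that list into the main query as a plain IN (...).
    placeholders = ", ".join("?" * len(friended_me)) or "NULL"
    sql = (
        "SELECT id_topic FROM topics "
        "WHERE id_member_started = ? "
        "OR (privacy = ? AND id_member_started IN (" + placeholders + ")) "
        "ORDER BY id_topic"
    )
    params = [me, PRIVACY_FRIENDS] + friended_me
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (id_owner INTEGER, id_member INTEGER);
    CREATE TABLE topics (id_topic INTEGER PRIMARY KEY,
                         id_member_started INTEGER, privacy INTEGER);
    INSERT INTO contacts VALUES (3, 7), (4, 8);
    INSERT INTO topics VALUES (1, 7, 1), (2, 3, 3), (3, 4, 3), (4, 3, 1);
""")

print(visible_topic_ids(conn, 7))  # prints "[1, 2]"
```

The list of people who friended the user only has to be fetched once per page load, so the main query pays for one OR and one IN rather than a per-topic lookup.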

The one thing to realise about topic privacy, and this is quite important: it is going to suck compared to board privacy. It's unavoidable, because there's no way to do it in a way that adds extra conditions that can be evaluated without ORs (except in the just me or everyone cases) - ORs are bad for performance because they're an extra branch and often virtually a sub-query in their own right.
5295
Off-topic / Re: AEVA
« on November 29th, 2011, 11:57 PM »
Not that I recognise. It just has the parameters in a different order.

It shouldn't be any different to:
Code:
http://www.youtube.com/watch?v=fL9RHimAv-g&feature=player_embedded
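A quick way to check that parameter order makes no difference, sketched with Python's standard `urllib.parse`. The reordered URL is an assumed variant built for the example; it is not the link that was originally asked about.

```python
from urllib.parse import urlsplit, parse_qs

original = "http://www.youtube.com/watch?v=fL9RHimAv-g&feature=player_embedded"
# Assumed variant with the same parameters in the opposite order.
reordered = "http://www.youtube.com/watch?feature=player_embedded&v=fL9RHimAv-g"


def query_params(url):
    # parse_qs returns a dict, so the order in the URL is irrelevant.
    return parse_qs(urlsplit(url).query)


same = query_params(original) == query_params(reordered)
video_id = query_params(reordered)["v"][0]

print(same, video_id)  # prints "True fL9RHimAv-g"
```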