Wedge

Topic started by: Nao on May 27th, 2012, 12:20 PM

Title: Optimizing parse_bbc() and Aeva
Post by: Nao on May 27th, 2012, 12:20 PM
(Split from 'BBC whitespace trimming'.)

Would be helpful to see external tests.

Eh, maybe one of my earlier micro-optimizations fixed it, haha.
Title: Re : Optimizing parse_bbc() and Aeva
Post by: Arantor on May 27th, 2012, 03:19 PM
I don't believe anyone's really done external tests.
Title: Re : Optimizing parse_bbc() and Aeva
Post by: Nao on May 27th, 2012, 03:29 PM
By external tests, I mean taking a few 'test' posts and seeing how parse_bbc() fares on them on a few different servers...
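
A rough sketch of what such an external test could look like, written in Python rather than PHP purely for illustration. The parser here (toy_parse_bbc) is a hypothetical stand-in that only handles [b] and [i]; the test posts and the time_parser helper are likewise assumptions, not anything from Wedge or SMF.

```python
import re
import time

def toy_parse_bbc(text: str) -> str:
    """Hypothetical stand-in for parse_bbc(): handles only [b] and [i]."""
    text = re.sub(r"\[b\](.*?)\[/b\]", r"<strong>\1</strong>", text, flags=re.S)
    text = re.sub(r"\[i\](.*?)\[/i\]", r"<em>\1</em>", text, flags=re.S)
    return text

def time_parser(parser, posts, repeats=200):
    """Return the best average time per post, in milliseconds, over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for post in posts:
            parser(post)
        best = min(best, time.perf_counter() - start)
    return best * 1000 / len(posts)

# A few made-up 'test' posts of different shapes.
TEST_POSTS = [
    "Plain text with no tags at all.",
    "[b]Bold[/b] and [i]italic[/i] in one line.",
    "[b]Nested [i]tags[/i][/b] " * 50,
]

if __name__ == "__main__":
    print(f"{time_parser(toy_parse_bbc, TEST_POSTS):.4f} ms per post")
```

Running the same script with the same posts on several machines would give comparable numbers, although for CPU-bound code the differences mostly reflect the hardware.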
Title: Re : Optimizing parse_bbc() and Aeva
Post by: Arantor on May 27th, 2012, 03:41 PM
I don't think that'll make much difference, really. It's all pure CPU (excluding Aeva and media processing) so all you'll be doing is establishing the relative performance of each server.
Title: Re : Optimizing parse_bbc() and Aeva
Post by: Nao on May 27th, 2012, 09:24 PM
I made a comparison between a Wedge and a SMF board, benchmark based on real use for parse_bbc() etc, and... I get pretty much the same results. A bit better for Wedge, but not by more than 20% or so, and we're talking about times less than a millisecond...

I'm really starting to wonder what the fuss was about parse_bbc performance, really. Or maybe it's because servers back in 2003-2007 were shitty compared to now... (or PHP.)
Title: Re : Optimizing parse_bbc() and Aeva
Post by: Arantor on May 27th, 2012, 09:27 PM
Because on bigger boards it's still a large part of processing effort, even if it's better than it was.
Title: Re : Optimizing parse_bbc() and Aeva
Post by: Nao on May 27th, 2012, 10:17 PM
Bah. Starting to think I wasted my time with all my tiny rewrites. And the db change. Maybe I should just revert locally...
Title: Re : Optimizing parse_bbc() and Aeva
Post by: Nao on May 30th, 2012, 01:40 PM
So... I removed the autolink calls from everywhere in Aeva embedding.

Here are my conclusions:

- They're not needed AT ALL at runtime. Because Wedge already autolinks anything that's not autolinked...
- They're needed at posting time, but for only one reason: so that Aeva can detect any URLs and start looking them up (to retrieve titles, etc.)

Now, here's the thing. It seems that EVEN that is broken right now... Aeva will return some incorrect stuff in the process.
I don't know where, but I really broke something when I implemented Aeva into Wedge. Something that won't be fixed by just magically adding more things to the autolink stuff...

So, I have two solutions right now.
- Either I try and fix the bug by comparing AeMe and Wedge codebases,
- Or I just give up on lookups altogether -- if only because most 'important' websites these days offer a video link & title in the thumbnail. The only advantage that Aeva has over these is that you can still see the video's title while watching it... Also, you can choose to simply copy the video URL instead of having to open the tab and then copy the URL. Things like that were important to me back then. I'm not so sure right now...

Opinions please?
Title: Re: Optimizing parse_bbc() and Aeva
Post by: godboko71 on May 30th, 2012, 02:49 PM
Titles are not super important to me, tbh. Like you said, most sites show it in the preview anyway.
Title: Re: Optimizing parse_bbc() and Aeva
Post by: Nao on May 30th, 2012, 04:09 PM
What I could do is show the URL in case it's a site that isn't known to provide a back link. I can't do anything for the title itself, though, if I'm not going to keep the lookup code...
But I'm actually in a situation where I think that it's silly to offer hundreds of sites to embed from. It was nice at a time when I was working on it and trying to fill Karl Benson's shoes, and he had a tendency to add anything that was asked of him... But it's now several years later, and I don't think any 'small' site ever really caught on. YouTube, Dailymotion and a few others are all the rage, and what matters is to ensure that THEY work... So I actually changed the embed code for YouTube to provide HTML5 support (e.g. mobile stuff), and I'm seriously considering dropping the non-major lists. Although Wedge already disables them by default, so as long as you don't screw up in the admin area, it doesn't matter that much...

What is missing in Wedge, too, is the 'Large | Normal' list of links to increase the size of a YouTube video. I have absolutely no idea what happened to that... :(
Posted: May 30th, 2012, 04:03 PM

Oh, and one of the drawbacks of the lookup code is that all URLs that are shown in plain view (http something) will be surrounded by url tags in the database, whereas SMF will leave them untouched.
It makes it faster to process these links from parse_bbc, though, so it can be seen as an advantage... But I don't think I could remove the 'smart' link detector from parse_bbc() if only because we'd have to turn them into url tags at SMF import time, and I don't know if it's a good thing to do... (Plus, what if there's a very large post filled with URLs? It may get broken in the process...)
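
To illustrate the posting-time wrapping described above, here is a minimal Python sketch (not the actual Aeva or Wedge code). The autolink name, the pattern and the rule that a bare URL runs until whitespace or a square bracket are all assumptions made for the example.

```python
import re

# Existing url tags are matched first and returned untouched, so a message
# that was already processed does not get wrapped a second time.
# Simplifications: trailing punctuation ends up inside the tag, and URLs
# inside other tags (img, code...) are not protected.
PATTERN = re.compile(
    r"(\[url(?:=[^\]]*)?\].*?\[/url\])"  # group 1: existing url tag, kept as-is
    r"|(https?://[^\s\[\]]+)",           # group 2: bare URL to wrap
    re.I | re.S,
)

def autolink(message: str) -> str:
    """Wrap bare http(s) URLs in [url] tags, leaving tagged ones alone."""
    def repl(match):
        if match.group(1):
            return match.group(1)
        return f"[url]{match.group(2)}[/url]"
    return PATTERN.sub(repl, message)
```

With the URLs stored this way, the display-time parser only has to look for url tags instead of scanning the whole text for anything that resembles a link.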
Posted: May 30th, 2012, 04:06 PM

Oh... Forgot about one thing. Titles are used for media items when you put an external video. Wedge will extract the title and use it in the album... :-/
Title: Re: Optimizing parse_bbc() and Aeva
Post by: Arantor on May 30th, 2012, 04:18 PM
Well, with something like this, I'd look at the alternatives - namely the youtube bbc or similar. None of those show the title after the video, nor do any of them have a convenient link either.

I'm not bothered with the title (and does this mean you can avoid the lookup aspect?) but the link to the video would be useful to have.

:edit: Ninja'd.

I'd note that vbgamer's mod supports about 60 sites these days and he does add new ones reasonably easily. What would be ideal is if it becomes possible to provide new auto embed options easily (i.e. as a plugin) and if that's doable we can farm all the smaller ones out to plugins and then create plugins as needed.

Honestly, the extra weight of the url surroundings in the database is small enough not to worry about, and the performance benefit is worth the effort, IMHO.

Hmm, it's complicated.
Title: Re: Optimizing parse_bbc() and Aeva
Post by: Nao on May 30th, 2012, 04:36 PM
Quote from Arantor on May 30th, 2012, 04:18 PM
Well, with something like this, I'd look at the alternatives - namely the youtube bbc or similar. None of those show the title after the video, nor do any of them have a convenient link either.

I'm not bothered with the title (and does this mean you can avoid the lookup aspect?)
Yes, it would.
But lookups aren't only for titles. They also fetch information such as whether embedding is allowed (for YouTube). This is the kind of information I like having around.
Quote
but the link to the video would be useful to have.
I could always show it on hover... (although I don't think hover works on a Flash item...? Or maybe it works if I put the hover on the container div... Will try that.)
Quote
I'd note that vbgamer's mod supports about 60 sites these days and he does add new ones reasonably easily.
It's just as easy for Aeva, unless you want to support the extra features.
Well, in terms of features SAVE is limited to doing a preg_replace on the URLs. Of course it also probably makes it a bit faster than Aeva, but it also prevents things like disabling embedding after X embeds on a page. (Easy way to crash browsers that don't push their Flash content into a separate process... Heck, it can even crash browsers that do.)
He can't support a noembed tag either... (Can probably be emulated through nobbc, but either way, it means these links wouldn't even be clickable...)
Or things like 'disable embed in quotes/sentences'.
Basically, it's limited. Not something I'd ever want to do!
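
For illustration only, one way to cap the number of embeds is a replacement callback that keeps a counter. This is a Python sketch, not how Aeva or SAVE actually do it: the URL pattern, the embed_videos name and the iframe markup are assumptions, and the cap here applies to a single message rather than a whole page.

```python
import re

# Hypothetical pattern for a single site; real site definitions are more involved.
YOUTUBE_URL = re.compile(
    r"\[url\]https?://(?:www\.)?youtube\.com/watch\?v=([\w-]{11})\[/url\]"
)

def embed_videos(message: str, max_embeds: int = 2) -> str:
    """Turn the first max_embeds video links into players; leave the rest as links."""
    count = 0

    def repl(match):
        nonlocal count
        if count >= max_embeds:
            return match.group(0)  # over the limit: keep the plain link
        count += 1
        return f'<iframe src="https://www.youtube.com/embed/{match.group(1)}"></iframe>'

    return YOUTUBE_URL.sub(repl, message)
```

Because the callback decides per match, the same hook could also skip links inside quotes or signatures, which a fixed replacement string cannot do.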
Quote
What would be ideal is if it becomes possible to provide new auto embed options easily (i.e. as a plugin) and if that's doable we can farm all the smaller ones out to plugins and then create plugins as needed.
I considered that, but (1) I don't have enough knowledge of the plugin system, (2) I couldn't care less about these sites anyway...
Quote
Honestly, the extra weight of the url surroundings in the database is small enough not to worry about, and the performance benefit is worth the effort, IMHO.
Okay, then.

So, I'm leaving autolinking in for now. But in both parse_bbc() and aeva_onposting (which means we both lose database space AND don't benefit from it at all), because I don't want to go through the process of reparsing posts through an importer. (Heck, that would also mean older Wedge.org URLs are broken...)
(Not that it's vital, though.)

One thing to note is that aeva_onposting would best be placed inside preparsecode(), but Post2.php doesn't always call it, as I mentioned in my thoughts.
Right now, my 'fix' was to call it as well from JSModify.php, because that's something that I forgot to import from Aeva 7's xml files -- believe it or not, I actually skipped that bit and this is the reason why quick edit never properly preparsed a video link...
Title: Re: Optimizing parse_bbc() and Aeva
Post by: Arantor on May 30th, 2012, 04:46 PM
Quote
But lookups aren't only for titles. They also fetch information such as whether embedding is allowed (for YouTube). This is the kind of information I like having around.
That's fair enough.
Quote
I could always show it on hover... (although I don't think hover works on a Flash item...? Or maybe it works if I put the hover on the container div... Will try that.)
I don't believe it does because of how Flash works.
Quote
It's just as easy for Aeva, unless you want to support the extra features.
I don't see why we can't leave the facilities around but the patterns to match and site specific stuff moved to its own plugin.
Quote
Well, in terms of features SAVE is limited to doing a preg_replace on the URLs. Of course it also probably makes it a bit faster than Aeva, but it also prevents things like disabling embedding after X embeds on a page.
Oh, he was very keen to point out that it was faster. But too many people realised the downsides related to the embedding limit and the fact it (still) doesn't filter for signatures.
Quote
Basically, it's limited. Not something I'd ever want to do!
We both knew that, right? :P
Quote
I considered that, but (1) I don't have enough knowledge of the plugin system, (2) I couldn't care less about these sites anyway...
Oh, I'm fine with doing it, I just need to understand how the system currently works, then I can do the rest.
Quote
One thing to note is that aeva_onposting would best be placed inside preparsecode(), but Post2.php doesn't always call it, as I mentioned in my thoughts.
Right now, my 'fix' was to call it as well from JSModify.php, because that's something that I forgot to import from Aeva 7's xml files -- believe it or not, I actually skipped that bit and this is the reason why quick edit never properly preparsed a video link...
That makes sense :)
Title: Re: Optimizing parse_bbc() and Aeva
Post by: Nao on May 30th, 2012, 07:38 PM
Quote from Arantor on May 30th, 2012, 04:46 PM
Quote
I could always show it on hover... (although I don't think hover works on a Flash item...? Or maybe it works if I put the hover on the container div... Will try that.)
I don't believe it does because of how Flash works.
Actually it does... I tested. It works on the parent table for the flash objects, as well as directly on the frames for iframe embedding (YouTube...)
So it's not too hard to implement some nice styling around videos to specify the URL, allow for resizing (if technically possible) and things like that..
Quote
Oh, he was very keen to point out that it was faster. But too many people realised the downsides related to the embedding limit and the fact it (still) doesn't filter for signatures.
Signature filtering is doable with his technique, too.
Quote
We both knew that, right? :P
It was a no-brainer, just like him :niark:
Quote
Oh, I'm fine with doing it, I just need to understand how the system currently works, then I can do the rest.
Never looked much into AeMe code eh? :P
Title: Re: Optimizing parse_bbc() and Aeva
Post by: Arantor on May 30th, 2012, 07:45 PM
Quote
Actually it does... I tested. It works on the parent table for the flash objects, as well as directly on the frames for iframe embedding (YouTube...)
So it's not too hard to implement some nice styling around videos to specify the URL, allow for resizing (if technically possible) and things like that..
I think that sort of depends on what browser you use as to whether that's the case. But it certainly would be nice :)
Quote
Signature filtering is doable with his technique, too.
I know that and you know that but he hasn't done so yet. Even now, after all this time, he still hasn't figured it out, and I'm enjoying not telling him how, after all the things that happened.
Quote
Never looked much into AeMe code eh?
I've just never had to. But if I can get a grip on how it works, I might be able to figure out a good way to pluginify some of the sites.
Title: Re: Optimizing parse_bbc() and Aeva
Post by: Nao on May 31st, 2012, 05:12 PM
Just fyi -- :hover on iframes and Flash objects is supported on Chrome, Firefox and Opera (at least the versions I tested). It didn't work in IE 9 or IE 10. (I don't really care much about that one...