Wedge

Public area => The Pub => Features => Topic started by: Arantor on April 29th, 2013, 12:07 AM

Title: return_raw buggy if enableCompressedOutput is on
Post by: Arantor on April 29th, 2013, 12:07 AM
I was surprised to find that every 90 odd seconds, I was seeing weird errors from the unread notifications lookup. Every time was the same, unexplained failure - no concrete reason, every request was simply left as 'pending'.

After some digging, I found there is one consistent reason for this: return_raw forces all the buffers to close. The problem is, doing that seems to cause it to screw up - in my case *every* request was failing because of this.

Bizarrely, if I don't close the output buffers, it works exactly as expected for me, behaves consistently regardless of setting. (All I did was comment out that line)

If the end-buffers line is left alone, it only works if enableCompressedOutput is disabled.

Curiouser and curiouser, I added the following debug code before the end-levels line:
Code: [Select]
global $settings;
trigger_error('Compressed output: ' . (!empty($settings['enableCompressedOutput']) ? 'ON' : 'OFF') . "\n" . print_r(ob_list_handlers(), true));

With it enabled:
Quote
1024: Compressed output: ON
Array
(
  • => default output handler
[1] => ob_gzhandler
)
With it disabled:
Quote
1024: Compressed output: OFF
Array
(
  • => default output handler
[1] => default output handler
)
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Nao on April 29th, 2013, 10:00 AM
Re: output buffers, nothing really surprising... I'll admit I was stumped at first, but then I remembered; ob_sessrewrite is only called at the start of the templating session (long after most of the PHP has been executed), meaning that it shouldn't be called at all if you're doing exit(), rather than obExit().
So, technically, I guess I could do without the ob_end_clean() calls... What mattered to me was not saving the cost of gzipping a few bytes (it's probably nearly free), but of running ob_sessrewrite on them, which would be potentially most costly.

I've double-checked other areas in Wedge that call ob_end_clean(), and they don't seem to generate any errors, so I'm *guessing* this is due to some codepath executed when in Ajax mode. The only two places where I get a crash, are in notification unread count, and thought retrieval. The first one probably generated the bug I mentioned yesterday (that the unread count cache wasn't invalidated when the number was changed), while the second never happened to me in the course of my testings, because thoughts are only ajaxively retrieved when they're not pure text (e.g. have a HTML tag in their rendition.)

I'm not going to delve too deep into this; just going to say I'm sorry for wasting your time with this, and just delete the ob_end_clean line...

I'm just wondering if that call to clean_ouput() REALLY is that necessary... After all, all it does is reset the output buffers. But at this point in the code, we haven't even started dealing with them in a proper fashion -- ob_sessrewrite isn't added to the output buffer list yet. Heck, suddenly it strikes me... Maybe it's going to run ob_sessrewrite twice instead..??!!
Just let me check... Nope, nope, I was a bit confused.
- clean_output() changes the output buffer list from "default, gzip" to "default" then to "default, gzip, ob_sessrewrite".[nb]Which is a problem, technically... It assumes that gzip is the only extra handler at this point. It should be doing the while() call, and then the rest..? I think. Warrants looking into.
- obExit() adds ob_sesswrite then shows the skeleton, but obExit(false) simply goes through obExit but skips both $do_start and $do_finish code-blocks, effectively NOT adding ob_sessrewrite to the list...
- so, technically, I could get away with just calling clean_output() and not calling obExit(false), so that Wedge doesn't execute the side-functions like sending mail or tracking stats... I guess, it would make more sense.

Well, what a complicated mess. To my defense, when I wrote the return_* function, I mainly copied-pasted code from somewhere else in our codebase, then tested it, and as it worked... Didn't really bother to see the magic it did. (Or needed to do.)
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Arantor on April 29th, 2013, 02:27 PM
How often did the @while(ob_end_clean()); line actually run before? Seems to me that the lines we had before have a very slightly different behaviour to them because this is closing even the default output handler when it shouldn't.
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Nao on April 29th, 2013, 04:15 PM
Quote from Arantor on April 29th, 2013, 02:27 PM
How often did the @while(ob_end_clean()); line actually run before?
As long as there's another output buffer around..? :^^;:
Quote
Seems to me that the lines we had before have a very slightly different behaviour to them because this is closing even the default output handler when it shouldn't.
But there are plenty of other places where it's used innocently in the same situation. Just look at the download code for both attachments and media items... :-/
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Arantor on April 29th, 2013, 04:45 PM
But I can't think of a time I've actually *triggered* those.

Dlattach:
* if Not-Modified-Since header is sent
* if If-None-Match header is sent
* if the attachment is larger than 4MB

Errors:
* in the event of untrappable DB error

Media files
* prior to sending large files

Security
* When there's been a session timeout with AJAX calls

Subs
* When preventing prefetch


Except I'm not sure I've actually hit any of these lately...
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Nao on April 29th, 2013, 06:51 PM
What a mess...
I still don't get why it's not working now, when it used to. It's very obviously due to the ob_gzip handler being removed from the handler list after being somewhat initialized internally, but I can't see where the issue is, really.

Would you suggest that we keep things as they are and only fix when finding errors (such as in return_raw...), or doing something more drastic..?
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Arantor on April 29th, 2013, 07:08 PM
Quote
I still don't get why it's not working now, when it used to
I'm not sure it's ever actually worked properly since it was 'optimised'.
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Nao on April 29th, 2013, 07:26 PM
So...? What do you suggest?

Also, I'm having the toughest time rewriting the CSS filename suffix parser. It's one of my to-do's... My goal was to allow users to do "index.(ie[-7] && !m1) || mod && can_flex, replace.css" or whatever crap they want... The same freedom as in CSS files themselves, really.
Turns out it's a PITA. My original code is a bit complex, and right now I'm not even sure whether I really needed all these $suffixes_to_keep and things like that... :-/

I'm torn between trying to fix it all, or just telling people that suffixes are only for simple tests, and deal with it, I'm not paid, if you're not happy then don't use it.™

Do you think it's worth it..?
Heck, I could even get rid of suffixes entirely, if I listened to myself, ah ah...

Main problem is due to the fact that Wess optimizes final filenames not to have crappy suffixes in them if they're not used at all, but it gets complicated because of the contents of the files themselves, so I'll probably have to always include all keywords, and shouldn't bother about this at all, I guess...!
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Arantor on April 29th, 2013, 08:57 PM
Quote
So...? What do you suggest?
Checking whether the old 'less optimised' code causes the same problem under the same conditions. I've been busy with RL stuff today but I'll check it tonight.

That and I'm fed up of the infractions code but I can't commit it in its present state because I don't want to have something in SVN that actively limits what we can do on here in the meantime. And all the other things I've been thinking of doing are not small projects >_<

I don't know what's best with suffixes.
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Nao on April 30th, 2013, 12:16 AM
Quote from Arantor on April 29th, 2013, 08:57 PM
Quote
So...? What do you suggest?
Checking whether the old 'less optimised' code causes the same problem under the same conditions. I've been busy with RL stuff today but I'll check it tonight.
Thank you; I myself was focusing on my suffix rewrite...
Quote
That and I'm fed up of the infractions code but I can't commit it in its present state because I don't want to have something in SVN that actively limits what we can do on here in the meantime. And all the other things I've been thinking of doing are not small projects >_<
I still don't get what infractions are compared to warnings (I'm sure I'll get it once I dive into your 70-post topics ;)), but I sure hope the extra kilobyte of language strings will end up in a less 'public' file than index.english.php soon... ;)

Oh, speaking about English, I really think we should add a $txt['parent_lang'] parameter. Because it's not only about English UK, or Brazilian Portuguese, or whatever... Every language has its own variations, and maybe the reason we're not seeing these variations around is precisely because it's so annoying to maintain a file for a minority..?
For instance, Québec people (around 10M, IIRC..?) speak French, but they have their own words for tons of things. I'm pretty sure that if they could do a Québecois language file, and set parent_lang to French, and just provide a few files with rewrites, and fall back to French for the rest, they'd be really happy to work on it.
Quote
I don't know what's best with suffixes.
Bugs fixed... For the most part ;) I guess I was, err... Too sentimental about them. I actually stripped away a lot of my code, and it keeps working eheh.
The 'thing' is that up until I implemented @if/@is, the only way to define a CSS block for some browser, was to add a new file and use a suffix for it. Then I figured people would want multiple browsers in a single file, so I did index.ie6,ie7.css, then I decided I wanted to have @if and wrote more generic code, but I never really thought of using the same codebase in both @if and suffixes, while it was very easy...

See, the reason why I kept track of 'found' suffixes, was that I wanted to limit the number of suffixes in the final filename. However, it's NOT possible to do that when any CSS file can have a specific variable test inside it. You DO have to keep track of all of them, and thus you need to store all possible suffixes in the filename, instead of just those that are useful.
Fortunately, my system was already smart enough that it only records suffixes conservatively, so it's mainly been about browser-name-OS-name-logged-status-date.
I even found a solution to manage the 'local' keyword in the same way.

There are a few issues remaining, though... Namely, the member ID, board ID and category ID stuff. Thing is -- if I record these for everyone, then it means one single CSS file per user, per board... Which is crazy, really. It's not possible. So... I could do it like I did so far, i.e. check suffixes and if a member Id or board Id is found in them, store these as well if they return true. But then I realized... Hey, what about @is and @if..?! What if we use "@if m1" to show some CSS only for member #1..? Well, you can't know that until you load the file.[1] Which doesn't happen until you cache it. And the point of caching it, is that you want to save time... So, no loads.
There's always the possibility to cache these entries into the database, and associate them with a corresponding skin + member ID + board ID etc, but I don't think it's very realistic...

So, I guess we could restrict use of these IDs to file suffixes only. Which isn't cool, because I'm struggling to have suffixes be interoperable with @if.

Or, just drop m1/b2/c3 from the list of features in Wess... :-/

I think I've been getting too technical in this post. I can't complain that I don't understand your feature explanations if I'm no better at exposing mine, ah ah... :^^;:
Anyway, I guess I'll have to make a decision about this...
 1. I probably thought about it, then forgot. Because Wedge never uses this kind of test in its default skin files, obviously the problem doesn't jump immediately to mind. And my only use of the 'm1' keyword is in a file suffix, so... Back to square one.
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Arantor on April 30th, 2013, 12:47 AM
Quote
I still don't get what infractions are compared to warnings (I'm sure I'll get it once I dive into your 70-post topics ;)), but I sure hope the extra kilobyte of language strings will end up in a less 'public' file than index.english.php soon... ;)
A complete and more powerful replacement.
Quote
but I sure hope the extra kilobyte of language strings will end up in a less 'public' file than index.english.php soon...
If you mean what's already committed, check what it is. It is not the warning system, it's the moderation centre which is sort of needed for the board index since that's what's happening, a new virtual category that contains virtual boards. I forget, I haven't committed half of this because of the ever sprawling changes.
Quote
Oh, speaking about English, I really think we should add a $txt['parent_lang'] parameter. Because it's not only about English UK, or Brazilian Portuguese, or whatever... Every language has its own variations, and maybe the reason we're not seeing these variations around is precisely because it's so annoying to maintain a file for a minority..?
Every language has its variations, which is precisely why we get into English US vs English UK, or Portuguese BR vs Portuguese PT and so on. Extra files are going to need to be maintained for those, it's not avoidable.

The only real difference as far as parent language goes is how much difference there really is. English UK vs English US has some major differences but it also has an awful lot of similarities, so it would make sense to have the parent language.

The parent language aspect does sort of require the other language to be installed. If you want to set up a forum for local events in Portugal (emphasis: Portuguese/PT), that's fine because presumably it's the parent language. But if you want to install a forum for local events in Brazil (emphasis: Portuguese/BR), you then need to install *two* language packs. This is not intuitive even if it is documented somewhere.

I don't know how significant the differences are - though I did ask for some feedback on this - but the impression I've been given is that the variations are big enough to warrant two sets of language files being maintained.
Quote
For instance, Québec people (around 10M, IIRC..?) speak French, but they have their own words for tons of things. I'm pretty sure that if they could do a Québecois language file, and set parent_lang to French, and just provide a few files with rewrites, and fall back to French for the rest, they'd be really happy to work on it.
Let's take that as a practical example. Yes, you're probably right in that it probably wouldn't need a complete set of files.

However, that does make language loading more complex (in trying to report missing files etc.) and more importantly: is it logical or appropriate to ask those forum cases to install *both* French and the Québecois language files? (Ignore, for the moment, that French is already bundled)

Technically, English UK should be a complete full set of language files, it only isn't because I'm essentially too lazy to do so and I know that English US is *always* loaded because there must be a fallback for error-prevention-in-incomplete-translation purposes.

If the situation were reversed, and French were the only bundled language, with English US being the primary and English UK being a secondary, if I were setting up a forum for *British* users (i.e. the minority speakers), I would not want to install both English US and English UK, even if I only made English UK selectable by users.


As far as suffixes go, I've always seen where it could be practical but I've always had the feeling it's going to be a feature that no-one but you is going to use anyway :(
Posted: April 30th, 2013, 12:35 AM

Back on topic, it seems that it is working as expected elsewhere with that code, so I don't know why it is in normal execution. (At least, attachments are working even when > 4MB)
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Nao on April 30th, 2013, 03:24 PM
Quote from Arantor on April 30th, 2013, 12:47 AM
If you mean what's already committed, check what it is. It is not the warning system, it's the moderation centre which is sort of needed for the board index since that's what's happening, a new virtual category that contains virtual boards. I forget, I haven't committed half of this because of the ever sprawling changes.
Ah, yes, I remember reading bits of this. I have no problems with moderation being part of the board index (it will, however, need to be presented with a small variation in styling, otherwise it might be confusing), but I'm thinking, maybe it's time to have a 'Boards' language file, where we could put all strings related to showing boards? Maybe also strings for MessageIndex, I don't know...

Not too important, though...!
Quote
The only real difference as far as parent language goes is how much difference there really is. English UK vs English US has some major differences but it also has an awful lot of similarities, so it would make sense to have the parent language.
Yep... And, same for Québécois (or Belgian or Swiss, if someone desperately wants to say "septante" in their files instead of "soixante-dix"... Which won't happen because we always say 70 anyway :P). And, I'm sure, same for dozens of other regional variations on other languages, that can't be done 'for now' because of the extra needless work it requires to maintain all files, when there are only a few differences in a few files...
Quote
The parent language aspect does sort of require the other language to be installed.
Yes, but it can be said on the download page, can't it..?
Quote
If you want to set up a forum for local events in Portugal (emphasis: Portuguese/PT), that's fine because presumably it's the parent language. But if you want to install a forum for local events in Brazil (emphasis: Portuguese/BR), you then need to install *two* language packs. This is not intuitive even if it is documented somewhere.
We could also put all PT language files into the BR package, not really a problem..?
Quote
I don't know how significant the differences are - though I did ask for some feedback on this - but the impression I've been given is that the variations are big enough to warrant two sets of language files being maintained.
Yeah, I don't think BR is a very good example, Québécois and British really are better examples in my mind.
Quote
However, that does make language loading more complex (in trying to report missing files etc.) and more importantly: is it logical or appropriate to ask those forum cases to install *both* French and the Québécois language files? (Ignore, for the moment, that French is already bundled)
I don't think I'd want French to be bundled in the default download... I may be a patriot, and have much love for my language, but I'm also a realist ;) I don't see us shipping anything else than English (US) by default, although I guess we could discuss this later...
And yes, it would make sense to ship FR with QC, because inevitably Québécois forums will receive visits from French visitors, and vice versa, so it's best to have both in there... (Well, French being crucial, Québécois being less important, but if I *had* access to such a package, I would definitely include it.)
Quote
If the situation were reversed, and French were the only bundled language, with English US being the primary and English UK being a secondary, if I were setting up a forum for *British* users (i.e. the minority speakers), I would not want to install both English US and English UK, even if I only made English UK selectable by users.
For instance, let's try to load a file in Québécois...

load language: English (US), e.g. mypage.english.php
then
load language: French (if file not found, error), e.g. mypage.french.php
then
load language: Québecois (if file not found, ignore if parent_lang is set, otherwise error), e.g. mypage.french-qc.php

I think that's pretty much what I suggested a few weeks or months ago... And I don't really see any fault in doing it that way, is there..?
Quote
As far as suffixes go, I've always seen where it could be practical but I've always had the feeling it's going to be a feature that no-one but you is going to use anyway :(
Yup, probably ahah...
Then, do you mind if I rewrite them the way I mentioned in the other, also unrelated[1] topic..?
Quote
Back on topic, it seems that it is working as expected elsewhere with that code, so I don't know why it is in normal execution. (At least, attachments are working even when > 4MB)
Then it probably needs no more changes, which is why I committed this morning; still, I'd like to know what caused this in the first place, eh... ;)
 1. I'm the world champion for corrupting topic subjects.
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Arantor on April 30th, 2013, 04:07 PM
Business first:
Quote
Yup, probably ahah...
Then, do you mind if I rewrite them the way I mentioned in the other, also unrelated[1] topic..?
Go nuts.
Quote
And yes, it would make sense to ship FR with QC, because inevitably Québécois forums will receive visits from French visitors, and vice versa, so it's best to have both in there... (Well, French being crucial, Québécois being less important, but if I *had* access to such a package, I would definitely include it.)
This is where I disagree. But I could really use input from people who would actually be using this in this fashion to see what their take on it is.

My gut reaction is that I don't see how it is logical or reasonable to enforce two languages being downloaded or installed (even if they are a single download). But I don't know how many sites use both together and make both available together for users (outside of sm.org which has basically everything installed)

Certainly having them bundled together is an improvement and does remove the other logical obstacle - of users who install the subsidiary pack without the parent pack.
Quote
but I'm thinking, maybe it's time to have a 'Boards' language file, where we could put all strings related to showing boards? Maybe also strings for MessageIndex, I don't know...
We can do that. I can certainly do it as part of the overhaul that's ongoing (just makes the impending commit bigger) >_< Unless you want to tackle it?
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Arantor on May 6th, 2013, 04:07 PM
I just did a very quick comparison of files between Portuguese PT and Portugese Brazilian, as well as Spanish Es vs Spanish Latin.

The changes are much more extensive than English US vs English UK is, to the point where the extra language pack is probably worth the effort. English UK modifies a tiny fraction, easily under 10% of the language codex as a whole - so 90% similarity, then. But I can't find any language file in either of those two comparisons that has anything like a 90% similarity, I'd be hard put to suggest there's a 50% similarity to some files (though I'm eye-balling it from WinMerge)

On the back of that, I'd actually suggest we don't do the language fallback.
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Nao on May 7th, 2013, 11:51 PM
Even for Québécois..?
Title: Re: return_raw buggy if enableCompressedOutput is on
Post by: Arantor on May 7th, 2013, 11:55 PM
There isn't an SMF language pack for that, so I have no basis for comparison.