Wedge

Public area => The Pub => Features => Topic started by: Nao on March 21st, 2013, 11:48 AM

Title: JavaScriptEscape headaches
Post by: Nao on March 21st, 2013, 11:48 AM
So, I don't know why, but today I decided to tackle JSE quirks...
If you'll remember (or not), six months ago I mentioned in rev 1767 that I was looking for a volunteer to go through the codebase with on[a-z]+="[^"]*JavaScriptEscape and ensure these calls are properly escaped.

Well, six months ago I didn't know about grepWin, which is a great tool (I would gladly replace my other search'n'replace tool if it supported both a history of previous searches, and a quick preview of matches 'in context'), and most importantly, supports multi-line regex searches.

So there are about 75 matches across Wedge + plugins (without counting some special cases like 'onclick' => 'return something' which I didn't test for), that's a lot but it's doable.
Now I'd like to discuss the JSE function itself, and the situations in which it can cause problems.

As $txt protections inside .js files

Nothing to say... It works well. JS files use single quotes everywhere, so it's well optimized, too. One has to remember, though, not to use a $txt inside a double-quoted string. No reason to do that, though..?

Inside an inline event

Ah, this...
echo 'onclick="return', JavaScriptEscape('I say "Hello!" and you say "Get lost!"...'), ';"'; isn't gonna work, because you can't escape a double-quote properly inside an inline event... So, that's why I added an extra bool param to JSE to tell the engine that we're within a HTML param, and that we need to turn any double quotes to "...

That's good.
Especially since I (somehow foolishly, but I stand by my decision) turned all quot entities to double-quotes in language files, to make them more readable. (I think the upside is better than the downside here.)

Now, the thing that REALLY matters, is that Wedge doesn't leave inline events as they are... You know that. It loads jQuery at the end of the page, so it moves inline events to the end, too. I was thinking, it's pretty silly to bother with that...?!
I can very, very easily allow for escaped double-quotes (\") in inline events. They'll just get moved out of the HTML anyway... But, does that mean I should escape all double-quotes inside JSE without distinction..? I don't think so. Outside of an inline event, the double-quotes will remain escaped, and I don't like the idea.
And yes, I need to escape double quotes, there's no other realistic option.
So, do I need to use the bool in every use of JSE inside an inline event..?

Inside footer JS

I think that's the last possibility..?!
This is where I've been looking today. Fact is, in a Wedge HTML page, all tag params are surrounded by double quotes. There are tons of double-quotes in every page. There are very little single quotes. So I did a quick test, and instead of JSE returning 'string', it returned "string', and guess what? I saved about 10 gzipped bytes just by doing that on the home page. It's not MUCH, and certainly not enough to warrant this topic, but... It's my hobby, after all.

Where is JSE used the most, then..?

Here's a solution I was thinking of... It's crazy, at least as much as I can be.
In JSE, return the string surrounded by chr(15) (or whatever unused character in the invisible ASCII range), and escape double-quotes in the string with chr(16) or whatever.
Then, at ob_sessrewrite time, once the event delaying process is completed, turn back chr(15) to double-quotes, and turn chr(16) to double-quotes only if they're not inside a string, otherwise turn them to " entities. I'm thinking of this as I'm typing, so it probably doesn't make a lot of sense... But I'd love being able to solve the need for the 'escape double quotes' boolean param. I think I'll do that *at least* for chr(16)...

Also, I think we could use double-quotes by default in JSE, and only use single-quotes when called within $txt protection code in JS files.

Okay, that's it... The proverbial headache is coming up. I can feel it.
Title: Re: JavaScriptEscape headaches
Post by: Nao on March 21st, 2013, 01:27 PM
Note to self: grepWin supports search history. Just needed to double-click the search box... :)
Still doesn't have a preview pane, though. Only offers one line of preview if choosing 'contents' in the bottom of the window. Better than nothing, I suppose...
Title: Re: JavaScriptEscape headaches
Post by: Arantor on March 21st, 2013, 05:17 PM
And this same behaviour is mirrored in the language editor area, so users will be able to edit stuff and quote marks are left *unchanged* there. (I cannot safely htmlspecialchars anything the user puts, given that we have various inline links and whatnot in the language strings.)

The only way around it is to have a separate language file where its contents will be loaded just for JS injections and then we can 'encourage' everything in that to have quot instead of bare double quotes.
Title: Re: JavaScriptEscape headaches
Post by: Nao on March 22nd, 2013, 05:12 PM
I don't understand your story about the behavio(u)r..?
Is there a problem with the chr() I've implemented to get rid of any problems for JSE? (It needs to be turned back manually whenever ob_sessrewrite isn't called, e.g. $txt handling in JS files.)

I'm not thrilled with the idea of a separate file for JS strings.
I would, however, be opened with a convention such as putting the js_ prefix to any string that should be in JS code, but even that is not needed with my trick... ^^
Title: Re: JavaScriptEscape headaches
Post by: Arantor on March 22nd, 2013, 05:17 PM
Anything added by the user in the language editor will be bare HTML, including quotes in language strings. So anything you do has to be invisible to the user. And they will always be straight quote marks, not quot entities, unless the user specifically enters them as such. That's one of the curses of the language editor.

It's more a case of extra evidence for your decision to have no quot entities in language files, really.

As far as having js_ prefixes go, that doesn't solve anything whatsoever because users can still do crazy stuff in the language editor (though, I guess, there I could detect the js_ prefix and force things)

That's the thing about having a given file where certain behaviour is required, if it's got one element that is unique to it (js_ prefix, or separate JS file), I can adapt the language editor to be aware of it and enforce that behaviour. Right now that's not an option - what the user enters in the language editor is what shows up in the page, WITHOUT ANY EDITING.

This does make me wonder whether we need to consider removing all links from language items, though.