Language editing inside Wedge

Arantor

Re: Language editing inside Wedge
« Reply #15, on September 23rd, 2012, 04:30 PM »
Quote
There are still at least 9x2 cases of string concatenation across 7x2 language files, I'm afraid
Then I'll fix that first. I am aware there are cases of concatenation in the language files for the purpose of encoding line endings into certain strings. This is standard practice for the editor, and it is already aware of this. But I'll check for any other cases and fix them.
Quote
Other than that, it's just on average twice the current amount of cached JS files, which is very, very small, especially when compared to the number of CSS files in the css cache... Just go have a look ;)
I count 9 JS files at present: admin, captcha, editor, mediadmin, register, script, script-sha1, suggest, topic. That's not necessarily definitive, but what I've encountered so far in general use.
Quote
Does it make sense at all?
Oh, it makes sense, but I'm not seeing that as a problem, actually.

add_js_file would load the file, you'd be calling loadLanguage or loadPluginLanguage anyway from there. That means, basically, you'll still fall into the process I'm talking about. There's no reason why the same process couldn't also contain the JS language file, though if I understand you correctly, you're almost talking about having multiple language JS files too...

Either way, I see no reason why an editor couldn't cope with being able to edit either and load from the DB, caching the results. It only has to worry about caching if people are daft enough to edit files directly, but then all bets are off anyway.

Nao

« Reply #16, on September 23rd, 2012, 11:27 PM »
(Re: latest rev -- should it be notewarn for new PMs..? IIRC I set the semantics to use notenice because a PM isn't usually a 'warning' but something neutral or maybe even 'good news'... Or maybe I was just a bit too thorough with these three notice states..?)
Quote from Arantor on September 23rd, 2012, 04:30 PM
I count 9 JS files at present: admin, captcha, editor, mediadmin, register, script, script-sha1, suggest, topic. That's not necessarily definitive, but what I've encountered so far in general use.
If you wander into the media area, you'll also get media and zoom files.
Quote
add_js_file would load the file, you'd be calling loadLanguage or loadPluginLanguage anyway from there.
Hmm it's called in wedge_cache_js, only at the time of caching, otherwise it'd defeat the purpose of it all... add_js_file doesn't call wedge_cache_js, unless the file isn't found.
Quote
That means, basically, you'll still fall into the process I'm talking about. There's no reason why the same process couldn't also contain the JS language file, though if I understand you correctly, you're almost talking about having multiple language JS files too...
(Process?)
script-123.js -> script.js with English strings (or whatever the default language is)
script-french-123.js -> script.js with French strings (or whatever the alternative language is)
Quote
Either way, I see no reason why an editor couldn't cope with being able to edit either and load from the DB, caching the results. It only has to worry about caching if people are daft enough to edit files directly, but then all bets are off anyway.
I'm not sure we're on the same line here..?

Anyway, to make things short:
We need to determine whether a language file has been modified. The easiest/quickest is to do it at loadLanguage time, because it's rather likely that PHP internally loads file data when including it (i.e. it knows the file modification date and doesn't waste time retrieving it again). If a language file is found to be updated, then we can do a simple call to clean_cache('js') to empty the JS folder. This means all JS files will have to be regenerated, even for a language file that never gets used in them -- but it's acceptable since (1) it's unlikely an admin is going to update their language files a lot...?, (2) these files only get generated when needed, thus they won't be regenerated all at the same time, and the extra CPU time is split over several minutes or hours.

Now, there are some potential problems once again...
- Imagine that script.js uses @language ManageTopics, and the admin never actually visits the admin area... It means the script file never gets updated, even if the corresponding language file is updated. It's unlikely, but still...?
- I still can't think of a 'correct' way to cache file dates. I'm thinking of something like this... Please give me your opinion?
A $settings['lang_dates'] array that contains keys like 'Admin_french' (as indicated in my previous post), and a value representing the modified date. The array is only filled with entries that are found in the JS scripts (@language entries), in the language requested. At compile time, the current language files and user language are used to store this data in the array, which is then saved via updateSettings(). When it comes to doing loadLanguage(), the language and filename are determined, and if isset($settings['lang_dates'][$filename . '_' . $lang]), we compare the actual filedate with that value... This would ensure that only the files actually cached in JS scripts will be tested for modifications. (Of course, if a plugin/feature is removed that used to include a certain language file, that entry will be wasting space in the database for nothing... But I don't have a solution for this, either.)
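
A minimal sketch of the check described above, written in JavaScript purely for illustration (the real check would sit in PHP, inside loadLanguage). `langDates` stands in for $settings['lang_dates']; `needsJsCachePurge` and its arguments are hypothetical names, and the timestamps are made up.

```javascript
// Stand-in for $settings['lang_dates']: only language files referenced by
// @language directives get an entry, keyed as '<file>_<language>'.
const langDates = {
  Admin_french: 1348400000,
  index_english: 1348300000,
};

// Hypothetical helper: true when a tracked language file is newer than the
// date recorded at JS compile time. Untracked files never trigger a purge.
function needsJsCachePurge(dates, filename, lang, actualMtime) {
  const key = filename + '_' + lang;
  if (!Object.prototype.hasOwnProperty.call(dates, key)) {
    return false;
  }
  return actualMtime > dates[key];
}

// A tracked file that changed after compile time would call for clean_cache('js').
const purge = needsJsCachePurge(langDates, 'Admin', 'french', 1348400001);
```

Only files that appear in the array cost a comparison; everything else is skipped after a single key lookup.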

Arantor

« Reply #17, on September 23rd, 2012, 11:46 PM »
Quote
(Re: latest rev -- should it be notewarn for new PMs..
You're not too thorough with the three states. The reason I don't use notenice in the menu is simply because the green doesn't sit well in the menu when it's the active menu item. To replicate, make sure you have an unread PM that won't be marked read when you're on action=pm, then go to action=pm, so there's at least a (1) still present, and you'll see that green just doesn't sit nicely because of a lack of contrast. But it's more prominent on the menu.
Quote
(Process?)
script-123.js -> script.js with English strings (or whatever the default language is)
script-french-123.js -> script.js with French strings (or whatever the alternative language is)
Then you started talking about @language also referring to multiple language files... which seems unnecessary to me.
Quote
We need to determine whether a language file has been modified.
No, we don't. That's the underlying point of what I've been saying. If you have the files as-is and the database containing deltas, you do not need to worry about file dates, because you just cache it whenever the database changes. You don't need to keep rechecking - you tell people to use the language editor, which deals with all that stuff.

Hence all my comments about people that are stupid enough to edit the files.
Quote
these files only get generated when needed, thus they won't be regenerated all at the same time, and the extra CPU time is split over several minutes or hours.
And if you know what files are changed you can safely just regenerate only the file that's changed, just when it's changed, job done.
Quote
- Imagine that script.js uses @language ManageTopics, and the admin never actually visits the admin area... It means the script file never gets updated, even if the corresponding language file is updated. It's unlikely, but still...?
I'm thoroughly confused. Are you saying that you want to push the entirety of a language file into a JS construct so they can be used client side?

A language set is approximately half an MB. Even allowing for savings for language constructs being more compact, you're still talking hundreds of KB that almost entirely aren't necessary!

To be brutally honest, the number of strings actually needed and referred to solely in JS is surprisingly small, assuming they can actually be pushed to JS (because not all of them can; for example, some of the strings used in the moderation filters area are added inline, but the bulk of them won't be known until a DB query has been run).

Better would simply be to find all the strings that are actually needed from JS and make a single file out of those which can be included as standard along with script.js. Then you get to simplify a lot of the architecture.
Quote
- I still can't think of a 'correct' way to cache file dates. I'm thinking of something like this... Please give me your opinion?
It's completely unnecessary under everything I've been saying...

Nao

« Reply #18, on September 24th, 2012, 12:18 AM »
Quote from Arantor on September 23rd, 2012, 11:46 PM
You're not too thorough with the three states. The reason I don't use notenice in the menu is simply because the green doesn't sit well in the menu when it's the active menu item. To replicate, make sure you have an unread PM that won't be marked read when you're on action=pm, then go to action=pm, so there's at least a (1) still present, and you'll see that green just doesn't sit nicely because of a lack of contrast. But it's more prominent on the menu.
Okay, I'll trust you on this, I try not to 'unread' my PMs because it tends not to work... (I use convo mode, and if I unread the latest convo, it just redirects me back to that convo and thus cancels the unread state... Well duh! We need to do something about this... Redirecting to the homepage if we're on the PM first page?)
Quote
Then you started talking about @language also referring to multiple language files... which seems unnecessary to me.
@language has a list of language file radixes, not a list of languages...?
Quote
No, we don't. That's the underlying point of what I've been saying. If you have the files as-is and the database containing deltas, you do not need to worry about file dates, because you just cache it whenever the database changes.
Oh... Right! Then that would be an excellent solution, yes!
Is it already written? I suppose not? What do we do in the meantime? (Because the code is already written, I'd simply postponed the language caching test code...)
Quote
And if you know what files are changed you can safely just regenerate only the file that's changed, just when it's changed, job done.
We'll still need to go through all JS files and search for @language then... (or search directly for $txt, which'll take more CPU power.)
Quote
Quote
- Imagine that script.js uses @language ManageTopics, and the admin never actually visits the admin area... It means the script file never gets updated, even if the corresponding language file is updated. It's unlikely, but still...?
I'm thoroughly confused. Are you saying that you want to push the entirety of a language file into a JS construct so they can be used client side?
No, no...
Okay, here's the beginning of the script.js file:

Code:
@language index;

var
	oThought,
	weEditors = [],
	_formSubmitted = false,

	we_loading = $txt['ajax_in_progress'],
	we_cancel = $txt['form_cancel'],
	we_delete = $txt['delete'],
	we_submit = $txt['form_submit'],
	we_ok = $txt['ok'],

See what I mean..?
I'm simply going to go through all $txt insertions in HTML files, and move them to the scripts themselves. It'll be mostly helpful in editor.js, because it's on many pages... Heck, maybe we could even cache stuff like the button list etc :P
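
To make the substitution concrete, here is a rough sketch of what the caching step could do with such a file. It is an illustration only, not the actual wedge_cache_js code: `injectLanguageStrings` is a hypothetical name, and the `strings` object stands in for the $txt entries loaded from the language file named by @language.

```javascript
// Hypothetical preprocessor: strips the '@language ...;' directive, then
// replaces each $txt['key'] with a quoted string literal taken from `strings`.
// Unknown keys are left untouched so they are easy to spot in the output.
function injectLanguageStrings(source, strings) {
  return source
    .replace(/^@language\s+[^;]+;\s*/m, '')
    .replace(/\$txt\['([^']+)'\]/g, (match, key) =>
      Object.prototype.hasOwnProperty.call(strings, key)
        ? JSON.stringify(strings[key])
        : match
    );
}

const source =
  "@language index;\n\nvar we_ok = $txt['ok'], we_cancel = $txt['form_cancel'];";
const french = { ok: 'OK', form_cancel: 'Annuler' };
const result = injectLanguageStrings(source, french);
```

Running the same source through a different `strings` object is what would yield the per-language files (script-123.js, script-french-123.js).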
Quote
Better would simply be to find all the strings that are actually needed from JS and make a single file out of those which can be included as standard along with script.js. Then you get to simplify a lot of the architecture.
It's a solution, but I'm not fond of it...

Arantor

« Reply #19, on September 24th, 2012, 12:35 AM »
Quote
Oh... Right! Then that would be an excellent solution, yes!
Is it already written? I suppose not? What do we do in the meantime? (Because the code is already written, I'd simply postponed the language caching test code...)
No, this was more of a 'this is what I think we could do' and wanting comments before I wrote it.
Quote
See what I mean..?
Oh..... so you load the index file, pull the $txt entries out and inject that into the result file that then gets cached. Now it makes sense to me.

And now I understand the concern about language files and caching. There's one entire class of cases that's screwed up in passing - plugins. While we could check the core files for updates, and force those ones to be updated from the ACP, we can't do the same with plugins. There are no requirements about where plugins must keep their files, so unless we were to go through every folder of every active plugin, looking for .js files, just to look for @language directives, we can't do that.

The alternative is to simply nuke the cache when doing such an update and immediately rebuilding the most common (script, editor, topic) if affected and let the others rebuild on demand.
Quote
Okay, I'll trust you on this, I try not to 'unread' my PMs because it tends not to work... (I use convo mode, and if I unread the latest convo, it just redirects me back to that convo and thus cancels the unread state... Well duh! We need to do something about this... Redirecting to the homepage if we're on the PM first page?)
I'm a stick-in-the-mud who uses one-at-a-time mode, so it's very clear. I can post a screenshot if that'd help?

Nao

« Reply #20, on September 24th, 2012, 09:31 AM »
Quote from Arantor on September 24th, 2012, 12:35 AM
No, this was more of a 'this is what I think we could do' and wanting comments before I wrote it.
Well, I suppose it's up to you... I'll just leave my code as is -- without any caching involved. For now, admins will have to empty their JS cache folder to update any language strings. I'd rather integrate cache flushing once your side of the code is done (or abandoned).
Quote
Oh..... so you load the index file, pull the $txt entries out and inject that into the result file that then gets cached. Now it makes sense to me.
:)
Quote
And now I understand the concern about language files and caching. There's one entire class of cases that's screwed up in passing - plugins. While we could check the core files for updates, and force those ones to be updated from the ACP, we can't do the same with plugins. There are no requirements about where plugins must keep their files, so unless we were to go through every folder of every active plugin, looking for .js files, just to look for @language directives, we can't do that.
I don't see it that way...?
If we see a plugin language file as updated, we empty the JS cache folder... (clean_cache('js'), once again.)
Quote
The alternative is to simply nuke the cache when doing such an update and immediately rebuilding the most common (script, editor, topic) if affected and let the others rebuild on demand.
They're always rebuilt on demand -- whether it be script.js or admin.js or whatever...
That's the beauty of it. It always does a file_exists before including the file. And it's fast enough that we don't have to worry about it. (CSS caching might be more of a problem, with the number of files, even though I've deleted a lot recently. It's still worth considering caching the filedates for a couple of minutes each time.)

Oh, slightly unrelated: I've decided to stop including common.css in all files. The only reason is that I forgot how Wess still supposedly supports the { } syntax, but not in mixed situations, so there's no point in adding a tabbed file to a bracketed file.

The alternative would be to do the bracket conversion on every file each at a time, rather than everything together, but back then I'd calculated it would be noticeably slower, or maybe not but I decided against it, I don't remember if it was a technical issue though...
Would you like me to look into it and maybe add support for mixing different CSS file types..? I don't think it's that necessary.
Quote
I'm a stick-in-the-mud who uses one-at-a-time mode, so it's very clear. I can post a screenshot if that'd help?
No, it's okay it's okay...

Arantor

« Reply #21, on September 24th, 2012, 12:38 PM »
Quote
I don't see it that way...?
If we see a plugin language file as updated, we empty the JS cache folder... (clean_cache('js'), once again.)
There are actually two plugin-related cases.

1) We update a plugin's own language files. We don't know if any given plugin has any JS files that are modified by those language files. But it would be conceivable that if we have a specific plugin being targeted, that we could brute-force search it for .js files and check those for @language directives. That said, it would theoretically be possible for a plugin to load another plugin's language files and use another plugin's JS, so that's not a reliable method.

2) We update a core language file which is used by multiple plugins of indeterminate use. Like above, short of checking every plugin, we can't really do a lot. Which means, we're either looking into caching the last update by language file as you suggested, or we clear the JS cache.

The problem with clearing the JS cache is that of course it means the next page load is going to hurt for someone. Would it not be better to purge the cache but *immediately* rebuild the files you know you're going to need for definite? The idea is that you'll hopefully move the performance hurt of regenerating cache files away from end users to the admin who is in the ACP and therefore should be expecting the occasional delay.
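
A rough sketch of that purge-then-rebuild idea, in JavaScript for illustration only. The `Map` stands in for the JS cache folder, `build` for whatever regenerates a file, and `purgeAndRebuild` is a hypothetical name; treating "affected" as "was in the cache before the purge" is also an assumption.

```javascript
// Hypothetical: empty the cache, then immediately rebuild only the common
// scripts that were cached before. Everything else rebuilds on demand later.
function purgeAndRebuild(cache, commonScripts, build) {
  const affected = commonScripts.filter((name) => cache.has(name));
  cache.clear();
  for (const name of affected) {
    cache.set(name, build(name));
  }
  return affected;
}

const cache = new Map([
  ['script', 'old script'],
  ['admin', 'old admin'],
]);
const rebuilt = purgeAndRebuild(
  cache,
  ['script', 'editor', 'topic'],
  (name) => 'built:' + name
);
```

The cost of regenerating the common files then lands on the admin's request in the ACP rather than on the next visitor.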

But we can definitely worry about that after the fact. I've got some stuff to do today so won't be able to look at this until tonight if I'm lucky.
Quote
Oh, slightly unrelated: I've decided to stop including common.css in all files.
Makes sense.
Quote
I don't think it's that necessary.
You're right, it's probably not that necessary.

Nao

« Reply #22, on September 25th, 2012, 12:22 PM »
Quote from Arantor on September 24th, 2012, 12:38 PM
2) We update a core language file which is used by multiple plugins of indeterminate use. Like above, short of checking every plugin, we can't really do a lot. Which means, we're either looking into caching the last update by language file as you suggested, or we clear the JS cache.
Clearing the JS cache really isn't much of a big deal... Packer is an efficient piece of software; it's only slow when you try using it in real time, obviously. The only 'big' chunk of code in Wedge is the jQuery library, and even that is never going to be packed manually (that's why I added the <jquery_cache> tags around the local copy of it, so that it's ignored by Packer later).
Quote
The problem with clearing the JS cache is that of course it means the next page load is going to hurt for someone.
One person is going to have a one-second delay in their page load... That's pretty much all. And that person is likely to be the admin, testing their own changes.
Quote
Would it not be better to purge the cache but *immediately* rebuild the files you know you're going to need for definite?
In any case, you're going to rebuild the files while loading a page, so I don't see what difference it makes whether a selection of files is rebuilt when the cache is purged or when the next user requests the cached files...

Well, now I think the best way to deal with the cache purge for now would be to cache an array of key/value pairs, where the key is the requested script file and the value is the list of language files needed. We can easily maintain that list when rebuilding the JS cache: if the file list has changed, then we update the database values. This saves us: (1) the need to go through the original JS files to retrieve a list of language files to test (this way, if we want to test whether a file needs to be rebuilt, we can directly test the JS filedate and then the language filedates, and take the most recent of all); (2) the need to search for 'script-french-1234.js.gz' if we know that a particular file doesn't have language entries, in which case we can directly look for 'script-1234.js.gz'.
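
Both savings can be sketched roughly as follows, in JavaScript for illustration only. `scriptLangMap` stands in for the cached key/value array, and `cachedFileName`, `isStale` and the `mtimes` lookup are hypothetical names; the assumption that the default language gets no suffix follows the script-123.js / script-french-123.js example earlier in the thread.

```javascript
// Stand-in for the proposed map: script id -> language files it needs.
const scriptLangMap = { script: ['index'], admin: ['Admin', 'ManageTopics'] };

// (2) Scripts with no language entries get a language-neutral file name.
function cachedFileName(map, script, lang, defaultLang, stamp) {
  const usesLang =
    Object.prototype.hasOwnProperty.call(map, script) && map[script].length > 0;
  const suffix = usesLang && lang !== defaultLang ? '-' + lang : '';
  return script + suffix + '-' + stamp + '.js.gz';
}

// (1) A cached file is stale if the script itself or any language file it
// uses is newer; the original JS never has to be re-scanned for @language.
function isStale(map, script, cachedMtime, mtimes) {
  const deps = [script + '.js'].concat(map[script] || []);
  const latest = Math.max(...deps.map((name) => mtimes[name] || 0));
  return latest > cachedMtime;
}

const frenchName = cachedFileName(scriptLangMap, 'script', 'french', 'english', 1234);
const neutralName = cachedFileName(scriptLangMap, 'topic', 'french', 'english', 1234);
```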

BTW, my code only supports loadLanguage for now... You'll have to build loadPluginLanguage into it if you want. :)

Arantor

« Reply #23, on September 25th, 2012, 02:24 PM »
Well, there's one very large problem to contend with: what happens if a theme (not a skin) decides to have its own language files?

There are all sorts of funky things tangled in the bowels of the code, such as the fact that themes can also have other themes as a dependent (so you can create one theme with a bunch of new code, and then create several full themes that depend on *that* theme)

I can fully see cases where people might push out their own actual theme, with their own templates and their own strings. For example, most of DzinerStudio's themes have a different board index layout to the standard, which also requires additional strings. Which can also be edited and will require caching.

Though I'm tempted to remove base_theme support, I don't believe anyone except Bloc ever used it and even then I'm not entirely sure about that. But I don't see that we can remove actual support for alternate themes in general.

But that makes caching potentially complicated, especially if people can choose between different themes as such, because even the JS caches might be different - if a theme declares its own index.english.php (which it is perfectly entitled to do) or replace any strings from index within its own ThemeStrings language file, that will need to be factored in, which means potentially you need to include the theme id in the cached filename.

Nao

« Reply #24, on September 25th, 2012, 03:55 PM »
Quote from Arantor on September 25th, 2012, 02:24 PM
Well, there's one very large problem to contend with: what happens if a theme (not a skin) decides to have its own language files?
Well, it goes through loadLanguage all the same, doesn't it..? And loadLanguage is the perfect place to monitor changes... Heck, I just checked -- it even does a file_exists on any language file you're going to load, meaning that the file stats are cached from that point on, meaning you can get the filemtime for free, meaning doing a check there is a no-brainer.
Of course it means that if you have a special JS file like JS.english.php that you only load for JS caching, it'll never be seen as out of date. Thus we could have some additional cache check somewhere else... (e.g. in add_js_file)
Quote
Though I'm tempted to remove base_theme support, I don't believe anyone except Bloc ever used it and even then I'm not entirely sure about that. But I don't see that we can remove actual support for alternate themes in general.
I don't know what base_theme is, but what I know is that I've never delved into alternate themes, had a few back in the day but now it's default only... I wouldn't know where it slows Wedge down, really. (Although there's quite a lot of code for alternate theme language files in loadLanguage...)
Quote
that will need to be factored in, which means potentially you need to include the theme id in the cached filename.
Hmm, yeah, I guess so...
Wess/Subs-Cache aren't tested for alternate themes anyway. I mean, it's pretty much as if theme support was broken/removed for me. For instance, if you have a 'Wine' folder in your alternate theme, the CSS cache will most certainly save its files in the same folders as the default theme's...

Anyway.

Arantor

« Reply #25, on September 25th, 2012, 06:09 PM »
Quote
Well, it goes through loadLanguage all the same, doesn't it..?
It does, but it's complicated. Basically, loadLanguage goes through a list of places to check and *stops* when it hits the first.
Quote
I don't know what base_theme is, but what I know is that I've never delved into alternate themes
Normally, in SMF, a theme says that it starts from its own folder and defers to the default theme for any templates and language files it doesn't have. base_theme is where a theme says it is based on another theme, so it attempts to load its own templates/language files, then defers to base_theme for templates and language files, and *then* defers to the default theme.
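
As a rough illustration of that lookup order (JavaScript, illustration only; `resolveLanguageFile`, the theme object and the `available` listing are hypothetical, not loadLanguage's actual data structures):

```javascript
// Hypothetical lookup order: the theme's own files, then its base_theme
// (if it declares one), then the default theme. The first hit wins.
function resolveLanguageFile(file, theme, available) {
  const chain = [theme.id, theme.baseTheme, 'default'].filter(Boolean);
  for (const id of chain) {
    if ((available[id] || []).includes(file)) {
      return id;
    }
  }
  return null;
}

const available = {
  default: ['index.english.php', 'Admin.english.php'],
  parent: ['ThemeStrings.english.php'],
  child: ['index.english.php'],
};
const child = { id: 'child', baseTheme: 'parent' };

const fromChild = resolveLanguageFile('index.english.php', child, available);
const fromParent = resolveLanguageFile('ThemeStrings.english.php', child, available);
const fromDefault = resolveLanguageFile('Admin.english.php', child, available);
```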

base_theme strikes me as something not really needed any more, and I have no objections to removing it.

However, the question of multiple physical themes is a difficult one, do we actually need to leave this in (and check it works)? It depends, really, on how flexible CSS can be and whether or not additional layout markup would be needed - if new markup is required, you're in either plugin or theme territory.

Nao

« Reply #26, on September 25th, 2012, 10:28 PM »
So, if it were up to you: when it comes to JS language caching, what would you store in the DB, and where in the code would you check whether updates are needed?

Arantor

« Reply #27, on September 25th, 2012, 10:57 PM »
It sort of IS up to me :P I sort of need to go through all of it mentally, so let's recap the whole thing end to end.

Ideally, I don't want anything in the DB that doesn't need to be there. The DB at a minimum needs to be able to contain what is different from the physical files, which means the DB needs to store enough to be able to work that out - which means, I guess: theme id, language file root, language of string, changed string. (Perhaps an additional column to indicate it is an array and should be unserialized on reading)

Then it's queried as necessary from loadLanguage.

From a cache perspective, we will likely need to recache language files, and to that end, I'm thinking we'll end up keying the cache on the theme id, plus the language root, plus the language - essentially caching something like '1-index-english' as the resultant content of that language file.
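
A minimal sketch of the delta-plus-cache-key idea, in JavaScript for illustration only. The row shape (`themeId`, `root`, `lang`, `key`, `value`, `isArray`) is an assumption based on the columns listed above, with `key` added as a guess at how a changed string would be identified, and `JSON.parse` standing in for PHP's unserialize.

```javascript
// Cache key built from theme id, language file root and language.
function languageCacheKey(themeId, root, lang) {
  return [themeId, root, lang].join('-');
}

// Overlay the DB deltas (changed strings only) on the strings read from the
// physical file. The file itself is never modified.
function mergeLanguage(fileStrings, deltaRows, themeId, root, lang) {
  const merged = Object.assign({}, fileStrings);
  for (const row of deltaRows) {
    if (row.themeId === themeId && row.root === root && row.lang === lang) {
      merged[row.key] = row.isArray ? JSON.parse(row.value) : row.value;
    }
  }
  return merged;
}

const fileStrings = { ok: 'OK', form_cancel: 'Cancel' };
const deltas = [
  { themeId: 1, root: 'index', lang: 'english', key: 'form_cancel', value: 'Abort', isArray: false },
  { themeId: 2, root: 'index', lang: 'english', key: 'ok', value: 'Fine', isArray: false },
];

const cacheKey = languageCacheKey(1, 'index', 'english');
const merged = mergeLanguage(fileStrings, deltas, 1, 'index', 'english');
```

The merged result is what would be stored under the cache key, and the JS cache would then be built from that rather than from the raw file.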

To me it seems that JS caching only needs to call on that combination (theme + language root + language) to find the file and then you have far fewer combinations. Clearing the cache when editing strings means the language cache, then the JS cache will be rebuilt. There should be no circumstance where files are edited manually and if users do do something stupid like that, it's going to be their own fault that it doesn't work properly. You don't have to worry about users that don't follow instructions - because they won't actively break anything by editing strings, it just won't propagate their changes.

That's for the worst case scenario of having multiple themes where the theme will also have to provide language strings itself, and we have to be able to make a distinction between two themes for the purposes of language files. It means you get a set of JS files per language per theme.

I'm just not seeing the need to store any cache values, unless there's something I'm missing...

Nao

« Reply #28, on September 28th, 2012, 11:52 PM »
Just to keep you posted...

- Fixed a surprising bug where strpos($del, $latest_date) would return false even if $latest_date was in $del... because $latest_date was an integer. I thought PHP did some typecasting before a strpos, but apparently not (an integer needle is treated as a character code rather than being cast to a string)...! So it means that this caching feature was pretty much broken since the beginning...
- The code doesn't (yet???) support (or care for) other themes. I figure you'll want to add support yourself :)
- Finished implementing per-language caching, i.e. generating 'script-french' and 'script' files depending on your language.
- Added support for plugin language files.
- Added a $settings['js_lang'] array. It is serialized in a similar fashion to $settings['registered_hooks'], which I'm not super-happy about, but it's still fast enough I guess. 'js_lang' is unserialized to an array of serialized arrays, so basically it looks like this in the end: array('script-' => array('index, Arantor:ThemeSelector:SkinSelector')) (<== SkinSelector being the language file inside the ThemeSelector plugin.)
- 'js_lang' is supposed to help in two ways: (1) if the script ID is missing from the array, it means there are no languages to be included, so we can directly include 'script-1234.js.gz' instead of adding the language string. (2) Supposedly, when we do loadLanguage, we can first test for file changes (I'm not recording this yet...), then if changed, go through js_lang and remove all JS files that use our language file.
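
The second use of 'js_lang' could look roughly like this, in JavaScript for illustration only. The map shape is simplified to plain arrays keyed by script id (the real setting is serialized and keyed slightly differently), and `scriptsToPurge` is a hypothetical name.

```javascript
// Simplified stand-in for $settings['js_lang']: script id -> language files
// it embeds. Plugin language files use an 'Author:Plugin:File' style id.
const jsLang = {
  script: ['index', 'Arantor:ThemeSelector:SkinSelector'],
  admin: ['Admin'],
};

// Given a language file seen as changed, list the cached scripts to remove.
function scriptsToPurge(map, changedFile) {
  return Object.keys(map).filter((script) => map[script].includes(changedFile));
}

const stale = scriptsToPurge(jsLang, 'index');
```

A script id that is missing from the map has no language entries at all, so it is never returned here and can be served under its language-neutral name.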

As indicated in an earlier post, the main issue with js_lang is that it makes it impossible to test for changes to a file like 'JS.french.php' if the file is never loaded otherwise. I'd recommend maintaining a database of changed files somewhere, and then checking for changes before doing the wedge_cache_js test/call.
So... What do we do?

Arantor

« Reply #29, on September 29th, 2012, 12:53 AM »
Quote
- The code doesn't (yet???) support (or care for) other themes. I figure you'll want to add support yourself :)
Well, this is sort of why I brought the issue up. Multiple theme support is tricky because by definition you need to be able to override what's in the default theme. loadLanguage does this just fine, loadPluginLanguage doesn't care (by design).
Quote
Supposedly, when we do loadLanguage, we can first test for file changes
If at any point we are testing for file changes of the language files, the whole exercise becomes a waste of time. The whole point is to NEVER change the actual language files, which is why I'm so adamant that if the files do get changed, we should NOT lift a finger to help it along, under ANY circumstances.

Unless you're testing for changes of the resultant JS in some fashion but even then it's all down to things being in the DB or not, nothing more.