Rewriting the skin file parser...

Nao

  • Dadman with a boy
  • Posts: 16,082
Rewriting the skin file parser...
« on May 14th, 2012, 07:47 AM »
Re: suffixes, I forgot to say...
I want to do a rewrite because the current implementation is a bit flawed (in at least one edge case).
My main goal with the rewrite would be to allow for comma-separated lists of suffixes in any file.
index.ie6,ie7,ie9,member1.css would thus target member #1 if he uses IE6, IE7 or IE9 (that is, IE except IE8).
To achieve this, I suspect that doing a glob() call would be enough. Then for each file -- explode the suffixes, and check whether they're in our list.
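
A minimal sketch of that matching step, written in Python for illustration rather than in the PHP that Wedge uses. The function names are hypothetical. The grouping rule is an assumption made here to reproduce the example above: suffixes are grouped by kind ('ie6' and 'ie7' are both kind 'ie'), a file applies when each kind it names has at least one active suffix, and a file with no suffixes always applies.

```python
import glob
import os
import re
from collections import defaultdict


def split_suffixes(filename):
    """Return the suffix list of a name like 'index.ie6,ie7,member1.css'."""
    parts = filename.split(".")
    if len(parts) < 3:  # 'index.css' carries no suffix list
        return []
    return parts[1].split(",")


def suffix_kind(suffix):
    """Assumed grouping rule: 'ie6' -> 'ie', 'member1' -> 'member'."""
    return re.sub(r"\d+$", "", suffix)


def file_applies(filename, active):
    """True if every kind of suffix named by the file has an active match."""
    active = set(active)
    by_kind = defaultdict(set)
    for suffix in split_suffixes(filename):
        by_kind[suffix_kind(suffix)].add(suffix)
    return all(group & active for group in by_kind.values())


def matching_files(folder, active):
    """One glob call per folder, then filter each file by its suffixes."""
    paths = sorted(glob.glob(os.path.join(folder, "*.css")))
    return [p for p in paths if file_applies(os.path.basename(p), active)]
```

With this rule, 'index.ie6,ie7,ie9,member1.css' is picked up for member #1 on IE7 but skipped for the same member on IE8.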

Only thing I'm not sure about -- performance. Is glob() optimized enough, does the folder list stay in memory as much as possible..? I suspect so, but I don't know for now.

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re : Rewriting the skin file parser...
« Reply #1, on May 14th, 2012, 11:58 AM »
I don't believe glob is cached as thoroughly as you're hoping - but it should be cached by the OS through the statcache if possible. IOW, it is cached, but not up in PHP, and I think you're going to end up doing one glob() call per theme folder (in the hierarchy) per page.
Re : Rewriting the skin file parser...
« Reply #2, on May 15th, 2012, 03:36 PM »
I'm not actually bothered by it, it is a link after all ;)

file_exists definitely uses the stat cache, glob I'm not so sure about but I can't see any reason it wouldn't be cached at the OS level. Benchmarking is useful - but benching it on Windows is less meaningful than benching it on a Linux server. Windows will give you the worst-case scenario for things like that.

Nao

  • Dadman with a boy
  • Posts: 16,082
Re : Rewriting the skin file parser...
« Reply #3, on May 15th, 2012, 03:54 PM »
It's not cached by stat...
http://www.php.net/manual/en/function.clearstatcache.php
There's a list of functions affected -- glob is not among them.

What we could do is add a link to the Admin menu where we would delete the cache. And then we could store the current (complete) skin file list in a regular Wedge cache file, with a relatively short latency (a couple of minutes?). What do you think...?
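
A rough sketch of that idea: keep the folder listing for a couple of minutes and offer a way to drop it on demand. It is in Python for illustration only. The class name, the 120-second default and the in-memory dictionary are all assumptions; the dictionary stands in for the regular Wedge cache file, which would persist between requests.

```python
import os
import time


class FileListCache:
    """Caches folder listings for a short time (hypothetical helper)."""

    def __init__(self, ttl_seconds=120, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # folder -> (expiry time, sorted file list)

    def get(self, folder, scan=os.listdir):
        """Return the cached listing, rescanning the folder once it expires."""
        now = self.clock()
        hit = self._entries.get(folder)
        if hit and hit[0] > now:
            return hit[1]
        files = sorted(scan(folder))
        self._entries[folder] = (now + self.ttl, files)
        return files

    def clear(self):
        """What the admin 'delete the cache' link would trigger."""
        self._entries.clear()
```

The short expiry bounds how long a newly added skin file can go unnoticed, and clear() covers the case where someone does not want to wait.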
Posted: May 15th, 2012, 03:50 PM

As for the main menu, the system may be a bit too complex to be usable in a "simple" mini-menu.
It's done pretty much this way: store parent items into a special span (not a big fan but...), and apply on hover a .hove class. And then create my menu inside the span (or div), this way the parent item still has the 'hove' class even though it's no longer actually hovered.

What I liked about the mini-menu thing is that by being a direct child of the anchor, it would automatically adapt to its position, as well as keep the :hover state on the parent link. But I'm not ready either to have some browsers apply :hover to every single child item, even when not hovered... Meh!
I'll look into the span, then...

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re : Rewriting the skin file parser...
« Reply #4, on May 15th, 2012, 03:59 PM »
Quote
What we could do is add a link to the Admin menu where we would delete the cache. And then we could store the current (complete) skin file list in a regular Wedge cache file, with a relatively short latency (a couple of minutes?). What do you think...?
We could certainly do that, we could also push it into the main cache and make clearing that more prominent (which would clear everything)?

Nao

  • Dadman with a boy
  • Posts: 16,082

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re : Rewriting the skin file parser...
« Reply #6, on May 15th, 2012, 04:28 PM »
Well, you're talking about caching it and having a button in order to clear that cache. I wasn't sure whether you were putting into the main data cache or somewhere else, especially as you were talking about having a separate way to clear that cache.

I figured we could use the main cache and then make it more prominent to clear the main cache (though that has other consequences on, say, accelerator-based caches)

Nao

  • Dadman with a boy
  • Posts: 16,082
Re : Rewriting the skin file parser...
« Reply #7, on May 15th, 2012, 05:41 PM »
Yup, I did mean using the 'regular' cache, and that the admin button for clearing the cache could possibly clear it all (the .php files, at the very least.)

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re : Rewriting the skin file parser...
« Reply #8, on May 17th, 2012, 12:57 PM »
Quote
After SMF 2.0 you mean?
Yup, as far as I know it was patched in their Git repo, along with the fact that the cache level and type are now globals and not in $settings. (So that the settings table can actually be cached and read properly from cache, regardless of which cache we're talking about)
Re : Rewriting the skin file parser...
« Reply #9, on May 17th, 2012, 03:16 PM »
Well, here's the thing, and it's actually the source of some of the 'SMF 2 is slower than SMF 1' theory. It's also incredibly complicated, but bear with me and hopefully I'll explain how it all works.

SMF 1 has no file cache. Thus no queries get cached to files, but it can use proper caches like memcache. That means that until such time as you're on a grown-up cache, you're swallowing the extra load attached to all the uncached queries.

The difference as compared to SMF 2 is though, when you're starting out, the cache miss frequency will be much higher due to low traffic, so in starting out (and, unsurprisingly, prime shared hosting territory) you're not only not benefitting much from the cache because it expires too quickly for the low traffic, you're also paying the overhead attached to it too.

So at the very earliest stages, you carry out extra work compared to SMF 1, and see little benefit - but as soon as you start getting anywhere serious (and by that I mean you start getting any real volume of traffic), the cache soon starts to kick in and benefit, and I suspect the reality is that you'd be able to run SMF 2 on shared hosting longer than you would for SMF 1 because of the caching taking the edge off the DB, and especially that most hosts target CPU as the prime measure of load and don't focus on I/O, so the cache definitely helps there.
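
The traffic effect can be illustrated with a toy model, sketched here in Python. It assumes evenly spaced requests for one cached item and a fixed expiry counted from the moment the item is written; the function name and the numbers are made up for illustration and are not measurements from SMF or Wedge.

```python
def hit_rate(interval, ttl, requests):
    """Fraction of requests served from cache for a single cached item.

    interval: seconds between consecutive requests
    ttl:      seconds a cached entry stays valid after being written
    """
    expires_at = None
    hits = 0
    for i in range(requests):
        now = i * interval
        if expires_at is not None and now < expires_at:
            hits += 1                 # entry still fresh
        else:
            expires_at = now + ttl    # miss: do the work, then write the cache
    return hits / requests


quiet_forum = hit_rate(interval=300, ttl=120, requests=100)
busy_forum = hit_rate(interval=1, ttl=120, requests=1200)
```

In this model a forum that gets one request every five minutes never hits a two-minute cache, so it pays for every write and gains nothing, while a forum with one request per second is served from cache for almost every request.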

I think the file cache is worth doing. The problems with the cache subsystem as they stand are:
1. It's totally I/O bound. As nend's experiments show, there's little you can do to optimise that. So the speed benefit is finite. There are advantages to using the SQLite system as nend suggested, though.

2. Things are not best using the cache, but that's not systemic to the cache itself; the fact $settings is not cached by things other than the file cache is not the fault of the file cache, it is the fault of the housekeeping that drives all of the caching system. Transferring the cache settings to be in Settings.php means it's possible to cache $settings, something not possible in Wedge currently except through the file cache (which is better than nothing)[1]

3. Clearing the non-file caches is difficult and unreliable. Clearing the file cache or even parts of the file cache is trivial by comparison.


Re the menu, the big problem there is the fact that changing the rest of the layout screws other things up too, which is a big con in my book. I know only too well that resizing the browser or other layout causes a lot of hassle when positioning is important (especially if the size of menu panels is dependent on the layout of the screen!) but to me that seems a deal breaker, especially as the complexity of fixing it by binding to the viewport's resize is going to add a fair bit more code.
[1] I'm still floating on whether this particular measure is ideal or not. It does make it a shade harder to force flush the cache because the time of last settings update is also in the settings table (and there is no safe way to put it anywhere else)

Nao

  • Dadman with a boy
  • Posts: 16,082
Re : Rewriting the skin file parser...
« Reply #10, on May 17th, 2012, 03:22 PM »
I'll deal with the OT later, but menus -- I don't understand what you're trying to say. Any real-life cases of a layout change that happens often enough for absolutely-positioned menus to be a big con for you...?
Re : Rewriting the skin file parser...
« Reply #11, on May 17th, 2012, 03:31 PM »
Quote from Arantor on May 17th, 2012, 03:16 PM
SMF 1 has no file cache.
(Which is why I switched to SMF 2 in the first place... and, because it was in private beta, how I got into the SMF community after a couple of years of just using SMF1 and posting on my own forums. Anyway -- the file cache never really helped performance on my old server and moving it to a new server helped. I've always been worried about performance in general, and it shows in Wedge...)
Quote
The difference as compared to SMF 2 is though, when you're starting out, the cache miss frequency will be much higher due to low traffic, so in starting out (and, unsurprisingly, prime shared hosting territory) you're not only not benefitting much from the cache because it expires too quickly for the low traffic, you're also paying the overhead attached to it too.
Yeah, that would explain a lot... (Although my forum already had 200k+ posts when I tried making the switch.)
Quote
I think the file cache is worth doing. The problems with the cache subsystem as they stand are:
1. It's totally I/O bound. As nend's experiments show, there's little you can do to optimise that.
It's pretty much up to PHP to keep things in memory for other simultaneous sessions, yeah.
Quote
So the speed benefit is finite. There are advantages to using the SQLite system as nend suggested, though.
But again, it's mostly I/O bound... (i.e. whether SQLite can deal with the disk access faster than a pure PHP call is up to the machine and setup, I guess...?)

Relying on SQLite, to me, is a bit like creating temp tables in MySQL that bear the names of the cache keys :) Eh, that could be an idea... :lol:
Quote
2. Things are not best using the cache, but that's not systemic to the cache itself; the fact $settings is not cached by things other than the file cache is not the fault of the file cache, it is the fault of the housekeeping that drives all of the caching system. Transferring the cache settings to be in Settings.php means it's possible to cache $settings, something not possible in Wedge currently except through the file cache (which is better than nothing)
You mean it's not possible without a similar rewrite, right..?
Quote
3. Clearing the non-file caches is difficult and unreliable. Clearing the file cache or even parts of the file cache is trivial by comparison.
Isn't it possible to send a global timeout notice or something to the cache handlers...?

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re : Rewriting the skin file parser...
« Reply #12, on May 17th, 2012, 04:36 PM »
Quote
Yeah, that would explain a lot... (Although my forum already had 200k+ posts when I tried making the switch.)
Yup, it starts out with less efficiency and will outpace SMF 1 after a while, but it's never clear exactly when the switch will occur.
Quote
It's pretty much up to PHP to keep things in memory for other simultaneous sessions, yeah.
Well... there's part of the problem. Especially on Apache, each PHP request is theoretically meant to be threadsafe and thus independent, so any request made can only use stuff that is outside PHP itself for persistence, whatever form that persistence takes. For big persistence obviously something like MySQL is going to be used, but for short term caching, you'd likely be using memcache or whatever.

Really, it's just about what is available to service the lower end hosting; anyone running Wedge etc. on any really large platform is going to be on a VPS and able to drop memcache or APC into place anyway.
Quote
But again, it's mostly I/O bound... (i.e. whether SQLite can deal with the disk access faster than a pure PHP call is up to the machine and setup, I guess...?)
Actually, at that point it's mostly a toss-up that comes down to pure CPU overhead. Either way it's still I/O bound, but the scale of writing means whether you write 1KB or 8KB is almost irrelevant, so it comes down to the file_put_contents overhead versus the overhead of SQLite, which is effectively also a native method. That was the thing nend's experiments came up with: at times one was faster than the other, but it was variable - and from what I can see the variation is ultimately going to be down to CPU and file overhead between the two. (The fact that the SQLite implementation was slower to start with is no surprise to me; any system with any kind of caching performs less efficiently from cold.)
Quote
You mean it's not possible without a similar rewrite, right..?
Something like that. If you want $settings cacheable with a proper cache, the underlying settings have to be moved out of $settings and available globally. We're only talking 2 variables IIRC, so it's not a massive deal.

But the question of expiry is a bit more tricky to resolve.
Quote
Isn't it possible to send a global timeout notice or something to the cache handlers...?
It sort of is, and that's sort of what the patches do, but if you have a site like that you may not wish to clear the entire cache, just parts of it.

Nao

  • Dadman with a boy
  • Posts: 16,082
Re : Rewriting the skin file parser...
« Reply #13, on May 20th, 2012, 01:09 PM »
Quote from Nao on May 15th, 2012, 03:54 PM
It's not cached by stat...
http://www.php.net/manual/en/function.clearstatcache.php
There's a list of functions affected -- glob is not among them.
So... Been doing my tests...
It's a local thing, so it's based on Windows 7 + i7 CPU (quad core, 3+ GHz...) + 8 GB RAM, and would probably be slower on a 'regular' server (even if running Linux).
Running the entire CSS caching routine takes about 0.4ms in the current SVN, and about 0.7ms in the glob() version. That's about 50% slower, and the file searching code itself is probably even twice as slow (because the entire function has more code than just the file search...), but still -- it's less than a millisecond. We're talking about web pages that always take at least 120 milliseconds to generate on my PC. It seems to be acceptable -- and if anything, it can be sped up by caching the file list...
I think I'll keep it this way. It still needs some (a lot of?) work to make it work across fallback folders, but other than that, it seems to be working.

Only issue, maybe: I'm using GLOB_BRACE to avoid using multiple glob() calls, but the php.net documentation says that it's not supported everywhere, giving Solaris as an example... I suspect that isn't that big of a deal, but still...
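
Where brace support is missing, one possible fallback is to expand the braces by hand and run one plain glob per resulting pattern. A small sketch of the expansion step, in Python for illustration; the function name and the sample pattern are hypothetical (the pattern actually used in the rewrite is not shown in this thread), and only flat, non-nested groups are handled.

```python
import re


def expand_braces(pattern):
    """Expand 'a{b,c}d' into ['abd', 'acd']; nested groups are not supported."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if not match:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for alternative in match.group(1).split(","):
        # Recurse so that any further groups in the tail are expanded too.
        expanded.extend(expand_braces(head + alternative + tail))
    return expanded


patterns = expand_braces("index{,.*}.css")
```

Each expanded pattern can then go to an ordinary glob call, at the cost of one call per pattern instead of a single one.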
Posted: May 20th, 2012, 12:59 PM

Doing some quick tests between one version and the other...
Old version: 0.4 to 0.7ms (average: 0.5-0.6ms)
New version: 0.5 to 0.8ms (average: 0.7-0.8ms)
So, it's not even 50% slower actually... :^^;:

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re : Rewriting the skin file parser...
« Reply #14, on May 20th, 2012, 03:21 PM »
That's the thing, I'd be interested in testing it on a Linux server because IIRC Windows doesn't have the same concepts of caching at the OS level that Linux does.