This topic was marked solved by its starter, on December 14th, 2012, 01:23 AM
SMF bug 4956 (slash in cache key causes cache to fail)

Nao

  • Dadman with a boy
  • Posts: 16,082

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: SMF bug 4956 (slash in cache key causes cache to fail)
« Reply #16, on April 11th, 2012, 12:15 AM »
I'm not sure it is caching anything. The overhead of caching would be a significant slowdown.

I think we do need to base64 it at that point, we can't safely replace that character, it's a collision risk.
When we unite against a common enemy that attacks our ethos, it nurtures group solidarity. Trolls are sensational, yes, but we keep everyone honest. | Game Memorial

nend

  • When is a theme, no longer what it was when installed?
  • Posts: 165
Re: SMF bug 4956 (slash in cache key causes cache to fail)
« Reply #17, on April 16th, 2012, 03:21 AM »
Bickering about a bug I reported to SMF, lol.

So far all I know about it is it affects file caching for sure, as to any other caching means, I am not sure. So if it doesn't affect any of the other systems then it would be appropriate not to make the key safe for them since they are able to handle it. This is speculation since I don't know how the other cache systems work.

However one could argue it is up to the developer to send safe keys to the cache but I am sure some will not follow.

IMHO base64 is just going to cause more problems, since your going to have to check the output of it to make sure that it didn't make illegal characters in the key, so basically it will put you back to where you started.
Re: SMF bug 4956 (slash in cache key causes cache to fail)
« Reply #18, on April 16th, 2012, 03:32 AM »
My test results, I added a few on there, it looks like strtr by itself would be allot better. :cool:

PHP 5.3 on grid/cloud hosting.
Quote
Carrying out 500000 iterations

String: `something short`
MD5: 0.34987902641296 seconds
SHA1: 0.48932695388794 seconds
B64: 0.16412401199341 seconds
B64/sstrtr: 0.57090711593628 seconds
B64/str_replace: 0.92343997955322 seconds
sstrtr: 0.41409516334534 seconds
str_replace: 0.85998892784119 seconds

String: `something much much much much much longer that we can use to benchmark these things.`
MD5: 0.48175501823425 seconds
SHA1: 0.70720291137695 seconds
B64: 0.23448801040649 seconds
B64/sstrtr: 0.6997401714325 seconds
B64/str_replace: 1.0725769996643 seconds
sstrtr: 0.48645901679993 seconds
str_replace: 0.88212299346924 seconds

String: `something impossibly and unrealistically long, after longer than the typical cache key will be but just in case it is worth having an idea of relative performance`
MD5: 0.52688503265381 seconds
SHA1: 0.93148612976074 seconds
B64: 0.32354092597961 seconds
B64/sstrtr: 0.91212201118469 seconds
B64/str_replace: 1.3229742050171 seconds
sstrtr: 0.60456418991089 seconds
str_replace: 1.0461859703064 seconds

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: SMF bug 4956 (slash in cache key causes cache to fail)
« Reply #19, on April 16th, 2012, 03:46 AM »
Quote
Bickering about a bug I reported to SMF, lol.
Oh, there's no argument that there's a bug. It's how best to fix it, knowing that caching should be as fast as possible while not compromising functionality.
Quote
So far all I know about it is it affects file caching for sure, as to any other caching means, I am not sure.
As far as I know, memcache(d) and APC do not care and neither uses file based storage in any fashion and as such it doesn't matter.
Quote
However one could argue it is up to the developer to send safe keys to the cache but I am sure some will not follow.
In the case of SMF+Aeva, that's a valid proposition, because it's essentially a non-compliant mod rather than a core feature, but in Wedge's case, it's a core feature that we have to contend with and it would be better to fix it at the cache level rather than the cache caller.
Quote
My test results, I added a few on there, it looks like strtr by itself would be allot better.
Except that we can't use it, because of the inherent risk of collision. If you have path/file.ext and strtr the / out to _ (or indeed any other character, but we'll use _ for the sake of argument), path/file.ext and path_file.ext both resolve to the same cache key, and since the root name used is the album name, users have some control over this process. I'm not saying that it's a direct vulnerability, but anything that a user can (relatively) trivially manipulate in this fashion is a bad idea.

On the other hand, MD5 is faster even than strtr and while it does carry a collision risk, I'd argue that it's actually much less likely than strtr especially as it can't be manipulated as easily by the user.

Nao

  • Dadman with a boy
  • Posts: 16,082
Re: SMF bug 4956 (slash in cache key causes cache to fail)
« Reply #20, on April 16th, 2012, 06:45 PM »
As I didn't read the original report, I didn't even know there was a problem with AeMe...
Blame any Aeva cache issues on Dragooon, he did it all :niark:

There's also Subs-Sounds.php that clearly uses slashes in the key...

I'll still cast my vote on base64_encode (alone), based on what was discussed.

Now... I'm looking into the actual caching code, and here's the thing.

Code: [Select]
$key = md5($boardurl . filemtime($sourcedir . '/Collapse.php')) . '-Wedge-' . strtr($key, ':', '-');

It's ALREADY doing both a md5 (on the prefix though!), and on the key (to replace ":").
So it can easily be turned into strtr($key, ':/', '--') for speed reasons...
However, it would make more sense to 'simply' put everything into the md5() call. I see no reason not to do so...
Also, I'm not exactly sure why it absolutely needs to retrieve filemtime here, is it for security reasons? (i.e. not allowing a user to find the correct URL for the cache...)
But we *already* have an htaccess file that prevents directly viewing php files...

(Plus, I've always been bothered by this 'touch' on Collapse.php -- previously on Load.php in SMF which is a VERY bad idea, because that file is critical and often modified...)

Suggesting this as replacement, then...

Code: [Select]
$key = base64_encode($key);

And voilà, problem solved... And supercharged cache ;)

Dragooon

  • I can code! Really!
  • polygon.com has to be one of the best sites I've seen recently.
  • Posts: 1,841
The way it's meant to be

Nao

  • Dadman with a boy
  • Posts: 16,082
Re: SMF bug 4956 (slash in cache key causes cache to fail)
« Reply #22, on April 16th, 2012, 07:57 PM »
Quote from Dragooon on April 16th, 2012, 07:22 PM
I'm fairly sure SMG had no cache when I made it...or did it?
What I know is that I had absolutely no knowledge of SMF2 caching until I started work on Wedge ;)
Well, I knew it existed (and was my prime reason to use SMF2), but I'd never looked into the code, it was too esoteric for me.

One could say, that basically when you were 13, you were way more knowledgeable about SMF programming than I was. And I'm 20 years older than you :P

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: SMF bug 4956 (slash in cache key causes cache to fail)
« Reply #23, on April 16th, 2012, 07:58 PM »
Quote
Also, I'm not exactly sure why it absolutely needs to retrieve filemtime here, is it for security reasons? (i.e. not allowing a user to find the correct URL for the cache...)
Forcing cache expiry. Cleaning the cache updates the filemtime on Collapse.php, so that all the file caches are invalidated at once in case the files themselves didn't get removed.
Quote
And voilà, problem solved... And supercharged cache
And you also have no reliable way to force cache expiry short of deleting files (which has been a source of contention in the past anyway)

Nao

  • Dadman with a boy
  • Posts: 16,082

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: SMF bug 4956 (slash in cache key causes cache to fail)
« Reply #25, on April 16th, 2012, 11:17 PM »
Well, the md5 does have a security aspect to it as well, because not all servers listen to the contents of the .htaccess files (like IIS, nginx, some configurations of Apache)

Ideally I'd prefer ot keep that aspect of it in place unless there's a good reason not to (and right now, the slight performance increase vs the security aspect isn't quite enough)

Nao

  • Dadman with a boy
  • Posts: 16,082
Re: SMF bug 4956 (slash in cache key causes cache to fail)
« Reply #26, on April 16th, 2012, 11:28 PM »
Instead of the filemtime, I'd really (really) rather we have a $settings variable that we increment when we ask for a cache flush...
I know that most file operations are cached in PHP for some time, so a filemtime shouldn't be expensive at all, but I'd rather not count on it. I'm still considering adding an option to disable file date checking for CSS and JS files, which I'm sure would save more than a few milliseconds on each page load...

I don't really see security being any better if using md5() over base64. What can happen? Someone trying to repeatedly run a check on a URL by decreasing the time from now to a week ago or something...? It's tantamount to brute-forcing a md5() as well, if you ask me...

In any case, I don't see the need for storing $boardurl in the file name. And the base64 encoding should replace the strtr, at least that is proven to be as fast, or faster.

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: SMF bug 4956 (slash in cache key causes cache to fail)
« Reply #27, on April 16th, 2012, 11:33 PM »
Quote
Instead of the filemtime, I'd really (really) rather we have a $settings variable that we increment when we ask for a cache flush...
There is, $settings['settings_updated']. But the settings are also cached...
Quote
I don't really see security being any better if using md5() over base64. What can happen?
I'm not saying md5 vs base64, I'm saying something vs nothing where the result is not trivially guessable by a hacker.
Quote
In any case, I don't see the need for storing $boardurl in the file name. And the base64 encoding should replace the strtr, at least that is proven to be as fast, or faster.
But the base64 can still generate characters that are invalid. AND there are other cases where invalid filename characters are sent anyway, like : which isn't valid on Windows. Now, the base64 will fix that but you will still occasionally get / in the resultant base64 encoding, so you don't save anything there either.

nend

  • When is a theme, no longer what it was when installed?
  • Posts: 165
Re: SMF bug 4956 (slash in cache key causes cache to fail)
« Reply #28, on April 17th, 2012, 06:12 AM »
How about convert the string to something that can't possibly be a collision risk and file name safe. I just thought of this while writing this post and ran a test in the middle of the post, bin2hex. From test I am getting real fast speeds from this function, the only problem I forsee though is filename limits.

Maybe key compression of some sort if the key is too long or maybe say a folder structure for long keys where the key can be continued over the span of multiple folders.

Test results
Quote
Carrying out 500000 iterations

String: `something short`
MD5: 0.32972002029419 seconds
SHA1: 0.47672295570374 seconds
B64: 0.17632007598877 seconds
B64/strtr: 0.57628583908081 seconds
B64/str_replace: 0.93950510025024 seconds
bin2hex: 0.16528415679932 seconds

String: `something much much much much much longer that we can use to benchmark these things.`
MD5: 0.42671203613281 seconds
SHA1: 0.69267106056213 seconds
B64: 0.22752213478088 seconds
B64/strtr: 0.70388197898865 seconds
B64/str_replace: 1.0751059055328 seconds
bin2hex: 0.25826096534729 seconds

String: `something impossibly and unrealistically long, after longer than the typical cache key will be but just in case it is worth having an idea of relative performance`
MD5: 0.51122808456421 seconds
SHA1: 0.90426898002625 seconds
B64: 0.30805087089539 seconds
B64/strtr: 0.89947199821472 seconds
B64/str_replace: 1.3907599449158 seconds
bin2hex: 0.33286499977112 seconds

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: SMF bug 4956 (slash in cache key causes cache to fail)
« Reply #29, on April 17th, 2012, 02:26 PM »
To be honest, bin2hex is only a percentage longer than base64 anyway, but it's certainly the fastest of the practical options. Filename length is only an issue with long keys - which is mostly related to media where this all started :/