This topic was marked solved by Arantor, on February 10th, 2013, 05:48 PM
Long cache keys make the cache fail.

Nao

  • Dadman with a boy
  • Posts: 16,082
Long cache keys make the cache fail.
« on April 30th, 2012, 12:53 PM »Last edited on May 1st, 2012, 09:44 AM
If you go to this page:
http://wedge.org/do/media/?sa=item;in=29

There will be a cache request. The key is generated as such:

aeva-embed-link-[url=http://www.youtube.com/watch?v=OgCjpA03mOI#ws]B+ Episode 5 - باسم يوسف شو (مع تامر من غمرة) الحلقة ٥[/url]

Funny eh..?
Well, it generates an error saying the filename is too long -- and indeed, with bin2hex being used on the key (two hex characters per byte), it does make it extra long...!
I don't know what would be the best fix here... Direct or indirect?
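
To see why the hex encoding pushes this particular key over the edge, here is a quick check. It is only a sketch: it assumes the file cache turns the hex-encoded key more or less directly into a file name, and that the filesystem caps a name at 255 bytes (common, but an assumption here, as is the `$maxNameLength` variable).

```php
<?php
// The cache key from the media item above: ASCII plus two-byte UTF-8 Arabic characters.
$key = 'aeva-embed-link-[url=http://www.youtube.com/watch?v=OgCjpA03mOI#ws]'
    . 'B+ Episode 5 - باسم يوسف شو (مع تامر من غمرة) الحلقة ٥[/url]';

// Assumed limit: many filesystems cap a single file name at 255 bytes.
$maxNameLength = 255;

// bin2hex() emits two hex digits per input byte, so the name doubles in size.
$hexKey = bin2hex($key);

printf("raw key: %d bytes, hex key: %d bytes\n", strlen($key), strlen($hexKey));
var_dump(strlen($hexKey) > $maxNameLength);
```

The raw key would fit on its own; it is the doubling that takes it past the assumed limit.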

PS: sorry I'm being so silent these days... I try to keep up with reading but I definitely can't post or work on Wedge (too much) -- hectic RL. It's likely that it'll be similar in the next few days, too... I hate that.

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Long cache keys make the cache fail.
« Reply #1, on April 30th, 2012, 02:29 PM »
I fully understand about hectic RL (I'm in a similar situation but for quite different reasons)

This sounds like the sort of thing where MD5 might work rather well? Perhaps MD5 + the YouTube ID?

Nao

  • Dadman with a boy
  • Posts: 16,082

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Long cache keys make the cache fail.
« Reply #3, on May 1st, 2012, 01:10 PM »
The point of using MD5 is to give you something reasonably unique without being overly long. Adding the video ID on top gives you sufficient uniqueness that it should work fine.
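
A minimal sketch of that idea, MD5 of the whole embed string plus the YouTube ID. The function name `embed_cache_key()` is made up for illustration, and it assumes the ID always sits in the `v` query parameter of a `[url=...]` tag, as in the example from the first post.

```php
<?php
// Hypothetical helper: build a short, fixed-format cache key from an embed string.
function embed_cache_key(string $embed): string
{
    $videoId = '';
    // Pull the URL out of a [url=...] BBCode tag, if there is one.
    if (preg_match('~\[url=([^]]+)~', $embed, $match)) {
        // Read the "v" query parameter, which holds the YouTube video ID.
        parse_str((string) parse_url($match[1], PHP_URL_QUERY), $query);
        $videoId = isset($query['v']) && is_string($query['v']) ? $query['v'] : '';
    }
    // 32 hex characters from MD5, plus the ID to tell videos apart at a glance.
    return 'aeva-embed-link-' . md5($embed) . '-' . $videoId;
}

echo embed_cache_key('[url=http://www.youtube.com/watch?v=OgCjpA03mOI#ws]B+ Episode 5[/url]'), "\n";
```

However long the title gets, the key stays at the prefix plus 32 characters plus the ID.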

Nao

  • Dadman with a boy
  • Posts: 16,082

CJ Jackson

  • I got myself a new iPad, a different world to the iPhone!
  • Posts: 241
Re: Long cache keys make the cache fail.
« Reply #5, on May 1st, 2012, 05:10 PM »
That was a simple mistake. Always use a hash for generating unique keys, unless simple ID numbers are practical to use.

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Long cache keys make the cache fail.
« Reply #6, on May 1st, 2012, 05:12 PM »
True, unless there's any kind of risk of hash collision; MD5 is not nearly as well distributed as its keyspace implies it should be.

PantsManUK

  • [me=PantsManUK]would dearly love to dump SMF 1.X at this juncture...[/me]
  • Posts: 174
Re: Long cache keys make the cache fail.
« Reply #7, on May 1st, 2012, 05:25 PM »
MD5 is terrible (from a cryptographic standpoint) and SHA1 isn't a lot better, but in non-security situations it's perfectly good enough; the collision risk is acceptable on both ('though MD5 is significantly cheaper in computing terms).

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Long cache keys make the cache fail.
« Reply #8, on May 1st, 2012, 05:41 PM »
Well, MD5's keyspace is theoretically 2^128, but vulnerabilities in the mathematics give it a collision window of 2^40 or so, while SHA1's keyspace is 2^160 with a collision window of around 2^51. While that may not mean much to most people, what it really means is that both are vulnerable for really sensitive stuff, and that colliding SHA1 hashes still takes a few orders of magnitude more work.

For what's required here, even an MD5 on its own would probably be OK. For password hashing, nothing less than SHA1 should be used (though that's a separate discussion in itself).

Mind you, while we're on the subject we might as well tackle it. SMF and Wedge (and most other forums) use a hash based on username and password, combined together then hashed for comparison purposes. SMF and Wedge also score slightly higher than most other forums by sending a user's password to the server hashed if possible. Anyway, the hash used in both cases is SHA1, and changing it has large consequences.

The biggest one, really, is about conversions, where users from all other environments (including Wedge itself during migration) would have to re-enter their password. If you're coming from a system with weaker protection, no harm, no foul, you get upgraded anyway. But if you're coming from SMF or similar, you will still have that extra step which may be off-putting to users.

In all other respects, performance included, using something like SHA256 (SHA-2) in place of SHA1 is no big deal. It doesn't even require a schema change in the database, because the column has been declared as varchar(64) for ages.
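
The column width point is easy to check: a hex-encoded SHA-1 digest is 40 characters and a hex-encoded SHA-256 digest is 64, so the latter fills a varchar(64) exactly. The input string below is just a placeholder, not the real hashing scheme.

```php
<?php
$input = 'username' . 'password'; // placeholder input, not any forum's real scheme

$sha1 = hash('sha1', $input);     // 160 bits -> 40 hex characters
$sha256 = hash('sha256', $input); // 256 bits -> 64 hex characters

printf("sha1: %d chars, sha256: %d chars\n", strlen($sha1), strlen($sha256));
```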

The one thing I do want to mention is what phpBB and WP do (when using portable hashes, anyway), you take the username and password and md5 it repeatedly, making it harder to find a brute force match. I do not like that method, I'm really not convinced it's a security benefit. But I can believe that it might be if you're only working off the username and trying to match the hash through the same process (though I can also believe there are rainbow tables for that too)

nend

  • When is a theme, no longer what it was when installed?
  • Posts: 165
Re: Long cache keys make the cache fail.
« Reply #9, on May 2nd, 2012, 12:39 AM »
This cache thing isn't as simple as it looks, long keys pass the file name character limit.

I was also thinking a while back about a master table that would be loaded once and saved once on every page load where the cache is used. Each cache file would be saved under, say, a numeric or alphanumeric file name.

The cache table will figure out when a key is requested which file to load. This works differently because the key has nothing to do with the file name.

When a key needs to be saved the key will be added to the cache table array and a numeric file reference will be generated. If a collision occurs then the system just needs to generate a new file name.

Code:
'some key here' => 'file1.php',
'some other key here' => 'file2.php',

In a perfect world this would work fine, but in practice it wouldn't. There's a chance the file may not be available when requested, so some system has to be in place to handle that.
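
A rough in-memory sketch of that master table idea. The class and method names (`CacheIndex`, `fileFor()`, `lookup()`) are invented for illustration; it leaves out loading and saving the table itself, and it uses a simple counter for the file names, so name collisions cannot happen in this version. A missing file is simply treated as a cache miss.

```php
<?php
// Hypothetical sketch: map arbitrary cache keys to short generated file names.
final class CacheIndex
{
    /** @var array<string, string> cache key => file name */
    private array $files = [];
    private int $next = 1;

    // Return the file name for a key, allocating a new one on first use.
    public function fileFor(string $key): string
    {
        if (!isset($this->files[$key])) {
            $this->files[$key] = 'file' . $this->next++ . '.php';
        }
        return $this->files[$key];
    }

    // Look up a key; a missing entry or missing file is treated as a cache miss.
    public function lookup(string $key, string $dir): ?string
    {
        $name = $this->files[$key] ?? null;
        if ($name === null || !is_file($dir . '/' . $name)) {
            return null;
        }
        return $name;
    }
}

$index = new CacheIndex();
echo $index->fileFor('some key here'), "\n";       // first key gets the first name
echo $index->fileFor('some other key here'), "\n"; // second key gets the next one
```

The key length no longer matters for the file name, at the price of having to keep the table itself loaded and in sync.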

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Long cache keys make the cache fail.
« Reply #10, on May 2nd, 2012, 12:53 AM »
Quote
This cache thing isn't as simple as it looks, long keys pass the file name character limit.
Yup (as the title says, heh)

The problem with using a cache table is that it's not efficient (especially as in an ideal world even something like $modSettings as was would be properly cached, which it isn't right now)

Also note that in the proper end of caching where you're using memcached or similar, there are longer (if any) limits applied to the key names, so this is really a matter just for the file cache to contend with.

Nao

  • Dadman with a boy
  • Posts: 16,082
Re: Long cache keys make the cache fail.
« Reply #11, on May 2nd, 2012, 08:09 AM »
My opinion on cache keys is that it's the responsibility of the programmer to ensure they're not too long.
Here we had a key that could definitely be way too long, and benefited from being md5'd. Thankfully, any key that's too long will be logged in the error log and thus it makes it easier to fix it. Worst case scenario, anyway, is that the cache isn't used for that particular key ;)

Here's my code... Any weaknesses? I'm just curious. Or maybe I should use \[url[]=]([^][]+) for the pattern...? Don't remember if these formats are used as well. I guess so...

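Code:
// Fall back to an empty string when there is no [url=...] tag, so $match[1] is never read unset.
$url = preg_match('~\[url=([^]]+)~', $item_data['embed_url'], $match) ? $match[1] : '';
$key = md5($url) . '-' . md5($item_data['embed_url']);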

(Originally, $key = $item_data['embed_url'], if you will.)
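
For what it's worth, here is a quick way to try the broader pattern from the post against both tag formats. The sample strings are made up for illustration; the point is only that the key comes out at 65 characters whatever the input length.

```php
<?php
// The broader pattern from the post: accepts both [url=LINK] and [url]LINK[/url].
$pattern = '~\[url[]=]([^][]+)~';

$samples = [
    '[url=http://www.youtube.com/watch?v=OgCjpA03mOI#ws]A very long title[/url]',
    '[url]http://www.youtube.com/watch?v=OgCjpA03mOI#ws[/url]',
];

foreach ($samples as $embed) {
    $url = preg_match($pattern, $embed, $match) ? $match[1] : '';
    // Two MD5 digests (32 hex characters each) joined by a dash: always 65 characters.
    $key = md5($url) . '-' . md5($embed);
    echo $url, ' => ', strlen($key), " chars\n";
}
```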

PantsManUK

  • [me=PantsManUK]would dearly love to dump SMF 1.X at this juncture...[/me]
  • Posts: 174
Re: Long cache keys make the cache fail.
« Reply #12, on May 2nd, 2012, 11:35 AM »
Quote from Arantor on May 1st, 2012, 05:41 PM
The one thing I do want to mention is what phpBB and WP do (when using portable hashes, anyway), you take the username and password and md5 it repeatedly, making it harder to find a brute force match. I do not like that method, I'm really not convinced it's a security benefit. But I can believe that it might be if you're only working off the username and trying to match the hash through the same process (though I can also believe there are rainbow tables for that too)
Yes, I also don't see a lot of mileage in doing it multiple times. It's OK for obscurity, but otherwise it's doubtful security. Far better to use a proven technique (a "bigger" method - SHA-2+, etc) and suck up any slight loss of performance.

Just to throw one more hat in the ring, can I just say "salting" and leave it at that...?

Nao

  • Dadman with a boy
  • Posts: 16,082
Re: Long cache keys make the cache fail.
« Reply #13, on May 2nd, 2012, 12:00 PM »
Salting is the process of adding a secret string to a password before encrypting it, making it impossible to brute force a password by dictionary or whatever, right..? (I'm trying to remember :P)

PantsManUK

  • [me=PantsManUK]would dearly love to dump SMF 1.X at this juncture...[/me]
  • Posts: 174
Re: Long cache keys make the cache fail.
« Reply #14, on May 2nd, 2012, 12:12 PM »
Pretty much, yes. Salting is adding a string (of some sort) to the password prior to the hashing process, to make things harder on folks attempting to reverse engineer the hashes. As long as the salt is complex enough and long enough, a rainbow table attack becomes impractical (computationally speaking) - http://en.wikipedia.org/wiki/Salt_(cryptography).

Theoretically, you can store the salt value alongside the hash without compromising security.
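
A bare-bones sketch of salting, for illustration only. The function names are made up, this is not what SMF or Wedge actually do, and the single SHA-256 pass is there just to show the salt at work. It assumes PHP 7 or later for `random_bytes()`.

```php
<?php
// Illustrative only: a per-user random salt mixed into the hash input.
function make_salted_hash(string $password): array
{
    $salt = bin2hex(random_bytes(16)); // 32 hex characters of random salt
    return ['salt' => $salt, 'hash' => hash('sha256', $salt . $password)];
}

// Recompute with the stored salt and compare in constant time.
function check_salted_hash(string $password, array $stored): bool
{
    return hash_equals($stored['hash'], hash('sha256', $stored['salt'] . $password));
}

$a = make_salted_hash('correct horse');
$b = make_salted_hash('correct horse');
var_dump($a['hash'] === $b['hash']);              // same password, different salts
var_dump(check_salted_hash('correct horse', $a));
var_dump(check_salted_hash('wrong guess', $a));
```

Because each record carries its own salt, two users with the same password end up with different hashes, which is what makes a precomputed table so much less useful.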