3751
Plugins / Re: Light URL Plugin Maybe?
« on April 16th, 2012, 11:48 AM »
Yay another URL shortener.
If you're planning to offer something whereby sites can generate their own shortened URLs that don't rely on l-url.com (so that site.com makes site.com/go/hash automatically), you're going to have *SO* much fun with all the mutant configurations out there that won't support your routing scheme properly, especially given how hard it is to actually set up site.com/go/hash instead of site.com/go.php?hash...
3752
Archived fixes / Re: SMF bug 4956 (slash in cache key causes cache to fail)
« on April 16th, 2012, 03:46 AM »
Bickering about a bug I reported to SMF, lol.
So far all I know is that it definitely affects file caching; I'm not sure about the other caching methods.
One could argue it's up to the developer to send safe keys to the cache, but I'm sure some won't.
My test results: I added a few on there, and it looks like strtr by itself would be a lot better.
On the other hand, MD5 is faster even than strtr and while it does carry a collision risk, I'd argue that it's actually much less likely than strtr especially as it can't be manipulated as easily by the user.
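To make the trade-off concrete, here is a minimal sketch of the two approaches being compared. The function names and the set of replaced characters are assumptions for illustration only; neither is taken from SMF or Wedge.

```php
<?php
// Hypothetical helpers: these names do not exist in SMF or Wedge.

// Option 1: replace characters that are unsafe in a file name with '-'.
// Distinct keys can collapse to the same name: 'a/b' and 'a:b' both become 'a-b'.
function cache_key_strtr(string $key): string
{
	return strtr($key, ':/\\', '---');
}

// Option 2: hash the key. The result is always 32 hexadecimal characters,
// so it is safe as a file name whatever the input contains.
function cache_key_md5(string $key): string
{
	return md5($key);
}
```

With the replacement approach, anyone who can influence the key can pick two inputs that map to the same file, which is the manipulation risk mentioned above. With the hash, a collision exists in theory but can't be produced just by swapping one separator for another.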
3753
Archived fixes / Re: Buggy Feed links
« on April 16th, 2012, 12:06 AM »
I would be tempted to say that we should remove that compatibility code...
// Are we going to need to parse the ; out?
if (strpos(@ini_get('arg_separator.input'), ';') === false && !empty($_SERVER['QUERY_STRING']))
{
	// Get rid of the old one! You don't know where it's been!
	$_GET = array();

	// Was this redirected? If so, get the REDIRECT_QUERY_STRING.
	$_SERVER['QUERY_STRING'] = urldecode(substr($_SERVER['QUERY_STRING'], 0, 5) === 'url=/' ? $_SERVER['REDIRECT_QUERY_STRING'] : $_SERVER['QUERY_STRING']);

	// Replace ';' with '&' and '&something&' with '&something=&'. (This is done for compatibility...)
	parse_str(preg_replace('/&(\w+)(?=&|$)/', '&$1=', strtr($_SERVER['QUERY_STRING'], array(';?' => '&', ';' => '&', '%00' => '', "\0" => ''))), $_GET);

	// Magic quotes still applies with parse_str - so clean it up.
	if (function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc() != 0 && empty($settings['integrate_magic_quotes']))
		$_GET = $removeMagicQuoteFunction($_GET);
}
elseif (strpos(@ini_get('arg_separator.input'), ';') !== false)
{
	if (function_exists('get_magic_quotes_gpc') && @get_magic_quotes_gpc() != 0 && empty($settings['integrate_magic_quotes']))
		$_GET = $removeMagicQuoteFunction($_GET);

	// Search engines will send action=profile%3Bu=1, which confuses PHP.
	foreach ($_GET as $k => $v)
	{
		if (is_string($v) && strpos($k, ';') !== false)
		{
			$temp = explode(';', $v);
			$_GET[$k] = $temp[0];

			for ($i = 1, $n = count($temp); $i < $n; $i++)
			{
				@list ($key, $val) = @explode('=', $temp[$i], 2);
				if (!isset($_GET[$key]))
					$_GET[$key] = $val;
			}
		}

		// This helps a lot with integration!
		if (strpos($k, '?') === 0)
		{
			$_GET[substr($k, 1)] = $v;
			unset($_GET[$k]);
		}
	}
}
So we have to leave that code in, unless you plan on fixing every URL to not use ; (and with all the other problems related to it). SMF and Wedge using ; is a definite oddity though it does solve so many problems.
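For anyone skimming, the core of the first branch quoted above can be pulled out into a standalone sketch. The function name is hypothetical, and this version leaves out the redirect handling, the explicit urldecode() and the magic quotes cleanup; it only shows how a ';'-separated query string ends up as an array.

```php
<?php
// Hypothetical helper name; the two transformations are the ones in the quoted code.
function parse_semicolon_query(string $query): array
{
	// Treat ';' (and ';?') as '&', and drop NUL bytes.
	$query = strtr($query, array(';?' => '&', ';' => '&', '%00' => '', "\0" => ''));

	// Turn a bare '&something' into '&something=' so it parses as an empty value.
	$query = preg_replace('/&(\w+)(?=&|$)/', '&$1=', $query);

	parse_str($query, $result);
	return $result;
}

// 'action=profile;u=1;sa' becomes action => 'profile', u => '1', sa => ''.
print_r(parse_semicolon_query('action=profile;u=1;sa'));
```

This is why the block can't simply be dropped: on a PHP install where arg_separator.input doesn't include ';', nothing else splits those URLs.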
And we should have dealt with that long ago. Seriously, I'm surprised we had the problem at all, because if you'll look at noisen.com, the phpsessid is never injected *at all* into feeds...
Added a context variable in Feed.php to say we don't want to insert the session ID. This is tested against in Subs-Template.php. The reason I was lazy about it is that (1) just testing for Feedfetcher-Google (or whatever it's called) isn't going to do any good for other feed reader bots, and (2) there is VERY little reason to have a session ID in a feed URL.
Note that SID containing a & is injected by SMF and Wedge, and I have no idea why ; wasn't used there.
If you have cookies disabled, then your session won't be active forever. Your feed reader will soon end up trying to access an incorrect session ID anyway.
The whole thing about using SID in URLs is an interesting one and one I've been unwilling to make a move on because I'm inclined to think that having 'probably accurate' stats about the number of guests is probably slightly more important than having an SEO benefit to it (though having a canonical URL should fix most issues)
If probably_robot were more thorough, we could be happier about leaving it in. On the other hand, note that disabling cookies will break other functionality anyway. It's a tough one to call :/
3754
The Pub / Re: The Cookie Law (in the UK at least)
« on April 15th, 2012, 11:11 PM »
It might not, but there is always the possibility that it *does*.
3755
The Pub / Re: The Cookie Law (in the UK at least)
« on April 15th, 2012, 10:09 PM »
It would seem that site owners may be responsible and have to obtain specific opt-ins before allowing their software to invite third-party cookies. But, as I said, ICO isn't giving any clear guidance on this (that satisfies lawyers).
3756
Archived fixes / Re: Buggy Feed links
« on April 15th, 2012, 05:58 PM »
However, Wedge does it this way... First of all, it determines whether the user is a robot. If it is, it will skip adding PHPSESSID to links.
So the reason why it broke in GR is that the URLs returned were not prettified.
Of course, there's still the problem of browsers that disable cookies entirely... i.e. if you're logged in and have cookies disabled like me right now, you absolutely need a phpsessid link in the URL.
I'm not sure why, but SMF and Wedge both add "&" at the end of the SID URL... Instead of simply using ";". I'm not sure SMF/Wedge would work *at all* if the installed PHP didn't support ";" as a separator.
(Jean Dujardin voice) Not so fast, Mr. Bond!
As pointed out above, SID isn't injected if $user_info['possibly_robot'] is true, which is always the case for Google Bot. (And probably Google Reader's bot as well.)
3757
Archived fixes / Re: Buggy Feed links
« on April 14th, 2012, 10:50 PM »
But technically would it be possible to forget phpsessid in the first page view?
Maybe we could enforce loading a second page like login before logging in is accepted.
The problem that we have to weigh up is accuracy of reporting vs. PHPSESSID in URLs. Specifically, it's simply about tracking how many 'probably unique' guests there are, given the requests being made, since the whole nature of cookies is a friggin' bolt on to the specification in the first place (as HTTP is specifically designed to be stateless)
As for feeds, are we sure phpsessid is used in these? If yes we should certainly ensure it isn't included...
3758
Archived fixes / Re: Buggy Feed links
« on April 14th, 2012, 09:17 PM »
Re Google... for *web searches*, yes it does - if you have Google Webmaster Tools and tell it not to do so. It doesn't do so automatically. And other services - like the feed reader that started this thread, no, it doesn't, because that's what causes it to repeatedly read in topics - because the URL changes.
The problem with that solution is that it's still not that reliable, especially for those who would actually trip it - we'd be better just accepting when it's wrong instead.
3759
Archived fixes / Re: Buggy Feed links
« on April 14th, 2012, 08:56 PM »
Well, it's a bit more complicated than that because cookies are used to handle sessions even for guests. Where it gets problematic is for tracking the number of unique guests, and without proper session support that just won't happen properly - and PHPSESSID is only ever sent when there isn't a cookie, which is where search engines use it.
It isn't just about not having cookie support, it is also about when there simply hasn't been a cookie, e.g. the very first visit, but search engines typically have 'new sessions', and you could very easily go from having '2 or 3' Google visits at a time to dozens where it can't properly handle the session.
That's why I haven't changed it, because I have a nasty feeling it would break the 'number of guests online at present'.
3760
Bug reports / Re: Pretty URL remarks
« on April 14th, 2012, 08:51 PM »
As long as it works, and works even for actions added to the list through hooks (e.g. plugins), changing to /do/something would be neat.
3761
Bug reports / Re: Pretty URL remarks
« on April 14th, 2012, 05:31 PM »
I like what you've done with it, it's pretty approachable and friendly in the scheme of things :)
3762
Archived fixes / Re: Buggy Feed links
« on April 14th, 2012, 05:16 PM »
Oh, there's several things wrong with PHPSESSID but I'm beginning to think the magic injection into every link on the page is actually more of a hindrance than a help. Certainly it screws up a lot of search engine stuff (including SEO) and feeds are no exception.
I have thought about dropping it entirely and relying on only cookies to handle sessions but that will confuse the tracking of max users on systems that can't properly handle cookies (like some search engines, paranoid guests), but not sure yet.
3763
Bug reports / Re: Pretty URL remarks
« on April 14th, 2012, 05:06 PM »
And it's currently broken, in redirectexit(), when trying to redirect back to a topic, since it redirects back to ?topic=x which causes a redirect loop... return-to-topic currently is broken for me, it saves but subsequently fails to load the page after (where it has returned to the topic)
3764
Archived fixes / Re: Buggy Feed links
« on April 14th, 2012, 05:04 PM »
Yes there is something to fix.
Every single time Google reads it, it will have PHPSESSID links in it, so each time it refetches the feed, it sees different items in it as a result (because whenever it hits, it has a different session id). I've already patched it so there's a new parameter ($context['no_phpsessid']), if empty it will inject PHPSESSID=whatever into the URLs if appropriate, or forcibly disable it if it's non-empty - there is absolutely no reason to submit PHPSESSIDs in feed items, and several reasons (like this) to expressly not do so.
There is still a bug with the XML not passing validation, though that won't cause this behaviour.
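As a rough sketch of how that flag behaves: the function below is hypothetical and is not the actual Subs-Template.php code. Only the $context['no_phpsessid'] key comes from the patch described above; the function name, its parameters and the choice of ';' as the separator are assumptions for illustration.

```php
<?php
// Hypothetical function: illustrates the flag's behaviour, not the real implementation.
function add_session_id(string $url, string $session_id, array $context, bool $possibly_robot = false): string
{
	// Skip injection when the page opted out (e.g. feeds) or the visitor looks like a bot.
	if (!empty($context['no_phpsessid']) || $possibly_robot)
		return $url;

	// Start a query string with '?' if there isn't one yet, otherwise append with ';'.
	$separator = strpos($url, '?') === false ? '?' : ';';
	return $url . $separator . 'PHPSESSID=' . $session_id;
}
```

With the flag set, every fetch of the feed returns the same item URLs, so a reader that identifies items by URL no longer sees them as new each time.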
3765
Archived fixes / Re: Buggy Feed links
« on April 14th, 2012, 02:00 PM »
OK, so I actually fixed the bigger issue quite easily, but now I need to go away and research the specification.