Buggy Feed links

nolsilang

  • Lurking
  • Posts: 106
Buggy Feed links
« on April 14th, 2012, 01:30 PM »
I do not consider this a bug yet, but today my Google Reader got bombarded by the Wedge Development Blog, as if something new were being posted there repeatedly[1].

On another note, I want to ask: does the date shown in the Development Blog correspond to the latest reply date rather than the date the thread was posted?

Thanks.
 1. picture attached

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Buggy Feed links
« Reply #1, on April 14th, 2012, 01:47 PM »
Well, you can look at the threads just as easily as I can: 'This. Is. Crazy' was 'Posted by Nao, on April 3rd, 11:00 PM', for example.

I do suspect there's a problem here though after pushing the Dev Blog's new-topics feed link through a validator:
http://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fwedge.org%2Fblog%2F%3Faction%3Dfeed%3Bsa%3Dnews

The first issue:
Quote
In addition, interoperability with the widest range of feed readers could be improved by implementing the following recommendation.
line 2, column 0: Missing atom:link with rel="self"
That I can fix. The other error, though, I can't fix so readily - and I suspect it's why the feed is broken (because it's including the session ID in the URL, making each one unique, and breaking each one).
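
For reference, the validator's recommendation amounts to adding one element to the feed. A minimal Python sketch, assuming an RSS 2.0 feed; the `add_self_link` helper and the placeholder title are made up for illustration, and the feed URL is simply the one decoded from the validator link above. This is not Wedge's actual code.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def add_self_link(channel, feed_url):
    """Append <atom:link rel="self"> to an RSS <channel> element (hypothetical helper)."""
    link = ET.SubElement(channel, "{%s}link" % ATOM_NS)
    link.set("href", feed_url)
    link.set("rel", "self")
    link.set("type", "application/rss+xml")
    return link


# Make the serializer use the conventional "atom" prefix.
ET.register_namespace("atom", ATOM_NS)

rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")
ET.SubElement(channel, "title").text = "Example feed"  # placeholder title
add_self_link(channel, "http://wedge.org/blog/?action=feed;sa=news")

xml_text = ET.tostring(rss, encoding="unicode")
print(xml_text)
```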
Re: Buggy Feed links
« Reply #2, on April 14th, 2012, 02:00 PM »
OK, so I actually fixed the bigger issue quite easily, but now I need to go away and research the specification.

nolsilang

  • Lurking
  • Posts: 106
Re: Buggy Feed links
« Reply #3, on April 14th, 2012, 02:07 PM »
I just got another in my reader :).

The title in the feed does not point to the thread URL, but to the Wedge homepage.

At first I thought that was the intended behaviour, so that subscribers would know whether any new replies had been made (from the frequency and date).

📎 Google Reader (4) - cropped1.png - 10.31 kB, 405x440, viewed 203 times.


Nao

  • Dadman with a boy
  • Posts: 16,082
Re: Buggy Feed links
« Reply #4, on April 14th, 2012, 02:34 PM »
There is nothing to fix.
As soon as Google retrieves the feed, it will try to find differences from the earlier feed. If anything changes, such as the URL, it will mark the item as new/unread.
Yesterday I made many tweaks to the pretty URL system, and as a result I broke feed links a few times. I think I fixed that no more than a couple of hours after it was broken, but it's enough for Google to mark something as new.

Oddly, though, I *thought* that with the system I'm using in Wedge (unique tags), it wouldn't bother with URL changes... That is one thing that really bothers me. But other than that -- I checked the current feed and it's all 'normal' to me...
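
To illustrate the point about unique tags: a reader that keys items on their link will treat a changed URL as a new item, while one that keys on a stable identifier will not. A minimal Python sketch; the item fields, the URLs, and the `new_items` helper are made up for illustration, and how Google Reader actually identifies items is not confirmed here.

```python
def new_items(seen, feed, key):
    """Return items whose identity (taken from `key`) has not been seen before."""
    fresh = [item for item in feed if item[key] not in seen]
    seen.update(item[key] for item in feed)
    return fresh


# The same post fetched twice; only its URL changed between fetches.
first_fetch = [{"guid": "topic-1", "link": "http://example.org/blog/topic-1/"}]
second_fetch = [{"guid": "topic-1", "link": "http://example.org/blog/1/some-title/"}]

# A reader keyed on the link sees the post as new again.
seen_by_link = set()
new_items(seen_by_link, first_fetch, "link")
again_by_link = new_items(seen_by_link, second_fetch, "link")

# A reader keyed on the stable identifier sees nothing new.
seen_by_guid = set()
new_items(seen_by_guid, first_fetch, "guid")
again_by_guid = new_items(seen_by_guid, second_fetch, "guid")

print(len(again_by_link), len(again_by_guid))
```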
Posted: April 14th, 2012, 02:31 PM

Note: a minor bug...
I was logged off when I posted my reply. It asked me to log in. I did my thing, and it said "Page not found" or something. Went back to the original page, clicked Submit again -- told me "Already submitted". Went back, copied my text, refreshed the page: it was NOT already submitted... So I just pasted the text and voilà.

So, two possible bugs:
- "already submitted" being said even when the post was actually refused...
- "page not found". I think it also does this in "View query" pages. Only admins can see that, though... (That bug may be due to my pretty URL tweaks though.)

nolsilang

  • Lurking
  • Posts: 106
Re: Buggy Feed links
« Reply #5, on April 14th, 2012, 02:36 PM »
Oh okay, I just reported what I thought was odd. :) That's why I said earlier that I don't think it's a bug yet, because I know Wedge is still a work in progress[1]

Thanks.
 1. Maybe I should wait more before reporting again

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Buggy Feed links
« Reply #6, on April 14th, 2012, 05:04 PM »
Yes there is something to fix.

Every single time Google reads the feed, it will have PHPSESSID links in it, so each time it refetches, it sees different items (because it gets a different session ID on every hit). I've already patched it so there's a new parameter ($context['no_phpsessid']): if it's empty, PHPSESSID=whatever is injected into the URLs where appropriate; if it's non-empty, injection is forcibly disabled. There is absolutely no reason to put PHPSESSIDs in feed items, and several reasons (like this one) to expressly not do so.
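
The behaviour described here can be sketched roughly as follows. This is a Python illustration, not the actual Wedge patch: the `with_session` function, its signature, and the example URL are assumptions, and only the flag name mirrors $context['no_phpsessid'].

```python
def with_session(url, session_id, no_phpsessid=False):
    """Append PHPSESSID to a URL unless injection is disabled (hypothetical helper)."""
    if no_phpsessid:
        # Feeds set the flag, so their links stay identical between fetches.
        return url
    separator = "&" if "?" in url else "?"
    return "%s%sPHPSESSID=%s" % (url, separator, session_id)


url = "http://example.org/blog/1/"

# Two fetches by a cookieless client get two different session IDs.
fetch_a = with_session(url, "aaa111")
fetch_b = with_session(url, "bbb222")

# With the flag set, both fetches yield the same link.
stable_a = with_session(url, "aaa111", no_phpsessid=True)
stable_b = with_session(url, "bbb222", no_phpsessid=True)

print(fetch_a != fetch_b, stable_a == stable_b)
```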

There is still a bug with the XML not passing validation, though that won't cause this behaviour.

Nao

  • Dadman with a boy
  • Posts: 16,082
Re: Buggy Feed links
« Reply #7, on April 14th, 2012, 05:12 PM »
Ah, yes, I remember our talk about phpsessid... :)

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Buggy Feed links
« Reply #8, on April 14th, 2012, 05:16 PM »
Oh, there's several things wrong with PHPSESSID but I'm beginning to think the magic injection into every link on the page is actually more of a hindrance than a help. Certainly it screws up a lot of search engine stuff (including SEO) and feeds are no exception.

I have thought about dropping it entirely and relying on only cookies to handle sessions, but that will confuse the tracking of max users on systems that can't properly handle cookies (like some search engines, paranoid guests), so I'm not sure yet.

MultiformeIngegno

  • Posts: 1,337
Re: Buggy Feed links
« Reply #9, on April 14th, 2012, 06:34 PM »
Quote from Arantor on April 14th, 2012, 05:16 PM
I have thought about dropping it entirely and relying on only cookies to handle sessions, but that will confuse the tracking of max users on systems that can't properly handle cookies (like some search engines, paranoid guests), so I'm not sure yet.
This seems interesting! ;)

Nao

  • Dadman with a boy
  • Posts: 16,082
Re: Buggy Feed links
« Reply #10, on April 14th, 2012, 07:24 PM »
Yeah, I'm pro-removing sessid too. And sort of determining that if the client won't accept cookies, and it's not the first page they load, then give them a message saying to fuck off :P

Then again I'm not sure we won't be getting genuine feedback about horror stories that arose from that design choice ;)

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Buggy Feed links
« Reply #11, on April 14th, 2012, 08:56 PM »
Well, it's a bit more complicated than that because cookies are used to handle sessions even for guests. Where it gets problematic is for tracking the number of unique guests, and without proper session support that just won't happen properly - and PHPSESSID is only ever sent when there isn't a cookie, which is where search engines use it.

It isn't just about not having cookie support; it is also about cases where there simply hasn't been a cookie yet, e.g. the very first visit. Search engines typically start a 'new session' each time, so you could very easily go from showing '2 or 3' Google visits at a time to dozens when the session can't be tracked properly.

That's why I haven't changed it, because I have a nasty feeling it would break the 'number of guests online at present'.
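
The concern about the guest count can be illustrated with a small simulation. A rough Python sketch under simplifying assumptions: a single crawler that never stores cookies and only follows links from the page it just fetched. The `simulate_crawler` function and the session naming are made up for illustration.

```python
def simulate_crawler(pageviews, links_carry_session):
    """Return how many guest sessions one cookieless crawler ends up creating."""
    sessions = set()
    current = None  # session ID the crawler presents on its next request, if any
    for n in range(pageviews):
        if current is None:
            current = "sess-%d" % n  # server starts a fresh session
        sessions.add(current)
        if not links_carry_session:
            # No cookie and no PHPSESSID in the links: the ID is lost.
            current = None
    return len(sessions)


with_sid = simulate_crawler(12, links_carry_session=True)
without_sid = simulate_crawler(12, links_carry_session=False)
print(with_sid, without_sid)
```

With the session ID carried in links, the twelve pageviews count as one guest; without it, each pageview counts as a separate guest.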

Nao

  • Dadman with a boy
  • Posts: 16,082
Re: Buggy Feed links
« Reply #12, on April 14th, 2012, 09:06 PM »
And google removes the phpsessid var from the URL iirc? Isn't it a default php name?
If it doesn't, we might want to look into whether it can remove anything.

Also my main issue with phpsessid is not with purls but with people posting links that contain it.
Re: Buggy Feed links
« Reply #13, on April 14th, 2012, 09:07 PM »
Wouldn't it work if we removed phpsessid for first visits (ip based), and restore it on the second visit if the cookie still isn't there..? (visit = pageview)
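
The idea above could be sketched like this. A minimal Python illustration, assuming the server keeps a set of IP addresses it has already served; the `should_add_phpsessid` function and the in-memory set are hypothetical.

```python
def should_add_phpsessid(ip, has_cookie, seen_ips):
    """Decide whether links on this pageview should carry PHPSESSID (hypothetical helper)."""
    first_pageview = ip not in seen_ips
    seen_ips.add(ip)
    if has_cookie:
        return False  # the cookie already carries the session
    # Omit it on the first pageview, restore it if the cookie is still missing later.
    return not first_pageview


seen = set()
first = should_add_phpsessid("203.0.113.5", has_cookie=False, seen_ips=seen)
second = should_add_phpsessid("203.0.113.5", has_cookie=False, seen_ips=seen)
returning = should_add_phpsessid("203.0.113.5", has_cookie=True, seen_ips=seen)
print(first, second, returning)
```

Note that several clients sharing one IP address would be treated as a single visitor by this check.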

Arantor

  • As powerful as possible, as complex as necessary.
  • Posts: 14,278
Re: Buggy Feed links
« Reply #14, on April 14th, 2012, 09:17 PM »
Re Google... for *web searches*, yes it does, if you have Google Webmaster Tools and tell it to do so. It doesn't do so automatically. Other services, like the feed reader that started this thread, don't, and that's what causes it to repeatedly read in topics: the URL changes.

The problem with that solution is that it's still not that reliable, especially for those who would actually trip it - we'd be better just accepting when it's wrong instead.