Wedge

Topic started by: Arantor on February 29th, 2012, 01:08 PM

Title: SMF bug 4859 (Text limit not necessarily enforced properly)
Post by: Arantor on February 29th, 2012, 01:08 PM
I can see what's being said here: if a post is below 64K as submitted but overflows 64K once it has been converted (to get its br tags etc.), it can be truncated at the limit of the text field.

There was talk at one point that we were going to make it mediumtext, but the SQL says we didn't >_> Hmm, need to look this one up.
Title: Re: SMF bug 4859 (Text limit not necessarily enforced properly)
Post by: Arantor on April 9th, 2012, 02:03 PM
I also need to check whether there's a performance impact when moving from text to mediumtext, but I don't believe there is.
Title: Re: SMF bug 4859 (Text limit not necessarily enforced properly)
Post by: Nao on April 9th, 2012, 02:54 PM
Me neither. :)
Title: Re: SMF bug 4859 (Text limit not necessarily enforced properly)
Post by: Arantor on April 9th, 2012, 03:04 PM
I can't find anything which suggests mediumtext is any more vicious than text. The one difference is that a mediumtext needs a 3-byte length prefix per row rather than 2, which means an extra byte per message, so you've got that bit of extra I/O impact.

But really, given that you can accidentally trigger the internal code that performs an ALTER TABLE on the messages table without being aware of it, I'd rather just make the switch. That would typically solve the problem: you then only hit the query packet limit rather than some shorter arbitrary limit (and it means that, to all intents and purposes, 0 really does mean no limit).
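
For a sense of scale on the extra byte, here is a rough Python sketch. It relies on MySQL's documented storage rule that a TEXT value takes its length plus 2 bytes and a MEDIUMTEXT value its length plus 3 bytes. The function names and the one-million-message figure are made up for illustration, not taken from the codebase.

```python
# Length-prefix sizes per MySQL's documented storage requirements:
# TEXT stores L + 2 bytes, MEDIUMTEXT stores L + 3 bytes.
TEXT_PREFIX_BYTES = 2
MEDIUMTEXT_PREFIX_BYTES = 3


def max_length(prefix_bytes: int) -> int:
    """Largest byte length that a length prefix of this size can describe."""
    return 2 ** (8 * prefix_bytes) - 1


def extra_bytes(message_count: int) -> int:
    """Extra storage across the table from the wider prefix (hypothetical helper)."""
    return message_count * (MEDIUMTEXT_PREFIX_BYTES - TEXT_PREFIX_BYTES)


if __name__ == "__main__":
    print(max_length(TEXT_PREFIX_BYTES))        # 65535
    print(max_length(MEDIUMTEXT_PREFIX_BYTES))  # 16777215
    print(extra_bytes(1_000_000))               # 1000000, i.e. about 1 MB
```

So on an (assumed) board of a million messages, the switch costs roughly a megabyte of prefixes in total.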
Title: Re: SMF bug 4859 (Text limit not necessarily enforced properly)
Post by: Nao on April 9th, 2012, 07:00 PM
Have you ever seen many posts grow over 64kb..?
Title: Re: SMF bug 4859 (Text limit not necessarily enforced properly)
Post by: Arantor on April 9th, 2012, 07:16 PM
Not *many*, no.

However, there are all sorts of fringe problems.

1) Set a post length limit of 64000 and create a post of 63999 characters. It will be checked and validated as being within the limit, then preparsed, whereupon it gets bigger in almost every case (e.g. the newline conversion), so it might no longer fit into the text field.

2) If you set a limit of 0, there is absolutely no check made on content length and posts will be truncated at 64K without any warning.

3) If you set the length to > 64K, the table will be adjusted without warning, which means a full table lock. There isn't even a reminder to put the forum in maintenance mode.

4) If you set the length to > 64K and then set it back afterwards, the table will be downsized without any validation of the posts' content, meaning that any posts > 64K will be truncated without warning.
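
Cases 1 and 2 can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the actual posting code: `preparse`, `passes_limit` and `store_in_text_column` are hypothetical stand-ins, preparsing is reduced to newline-to-br conversion only, and the column is assumed to truncate silently as described above.

```python
TEXT_MAX_BYTES = 65535  # capacity of a MySQL TEXT column


def preparse(body: str) -> str:
    """Hypothetical stand-in for preparsing: only converts newlines to br tags."""
    return body.replace("\n", "<br />")


def passes_limit(body: str, max_length: int) -> bool:
    """Length check made before preparsing; a limit of 0 means no check at all."""
    return max_length == 0 or len(body) <= max_length


def store_in_text_column(body: str) -> str:
    """Model a silent cut at the column's byte capacity (assumed behaviour)."""
    cut = body.encode("utf-8")[:TEXT_MAX_BYTES]
    # Ignore a multi-byte character that was split by the cut.
    return cut.decode("utf-8", errors="ignore")


if __name__ == "__main__":
    # 63999 characters made of short lines, so there are many newlines to expand.
    raw = ("x" * 9 + "\n") * 6399 + "x" * 9
    print(len(raw), passes_limit(raw, 64000))  # 63999 True

    parsed = preparse(raw)
    stored = store_in_text_column(parsed)
    print(len(parsed))                         # 95994
    print(len(parsed) - len(stored))           # 30459 characters lost
```

The post passes the check at 63999 characters, grows by 5 characters per newline during preparsing, and loses its tail on the way into the column. With the limit set to 0, `passes_limit` never rejects anything, so the same cut happens for any post over 64K.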


Now, we have two choices. We can accept the extra byte per message, or we can implement measures to remedy all of the above.
Title: Re: SMF bug 4859 (Text limit not necessarily enforced properly)
Post by: PantsManUK on April 13th, 2012, 11:51 AM
Don't forget option 3, do nothing...

For my "money", option 1, an extra byte per row, has least impact.

I prefer option 2, but that means work that you may not be able to afford at this time. Least upsetting might be 1 as a stop-gap measure until 2 can be implemented (i.e., do it right eventually).
Title: Re: SMF bug 4859 (Text limit not necessarily enforced properly)
Post by: Arantor on April 13th, 2012, 12:09 PM
That's the thing, if you carry out option 1, pretty much everything goes away, and doesn't need to be remedied. Option 3 of doing nothing isn't really viable, though ;)

Objection 1 becomes a non-issue in practical terms (because the limit essentially becomes upwards of 1MB, and if you have a 1MB forum post even before preparsing, it's going to be very, very, very ugly and fail for other reasons before you get there).

Objection 2 becomes a non-issue in practical terms too (for the same reason)

Objections 3 and 4 both become non-issues in any context, because if the field is already a mediumtext, no resizing is necessary.

Sure, we could fix them, but honestly I don't really see the point. All it means is doing more checks and more kickbacks for edge conditions that we can basically solve quickly by resizing the table and stripping the code.
Title: Re: SMF bug 4859 (Text limit not necessarily enforced properly)
Post by: Arantor on February 20th, 2013, 05:52 AM
Bump to remind myself. I'm going to convert the field to mediumtext, strip the edge logic for bumping the size around, and then it will work as it should.
Title: Re: SMF bug 4859 (Text limit not necessarily enforced properly)
Post by: Arantor on February 21st, 2013, 12:04 AM
Oh now I remember why I didn't do it. You can't use a fulltext index on mediumtext fields, though if you were using InnoDB you couldn't anyway[1] and invariably a custom index would outperform fulltext anyway.

I was going to remove fulltext but for some reason I didn't - http://wedge.org/code/6838/removing-fulltext-indexes/ - but it seems my concern was primarily about memory usage.

It does not reserve the full size per row (as proven with drafts which have always been mediumtext anyway but never had to be searchable)

So unless there are any objections, I am going to proceed with this and drop fulltext indexes as well. Even if InnoDB supports them going forward, it doesn't for the size of field we're using, and a custom index is better anyway. Note that fulltext is not even enabled by default, so no qualms about that.
[1] It's an option in MySQL 5.6.
Title: Re: SMF bug 4859 (Text limit not necessarily enforced properly)
Post by: Arantor on February 21st, 2013, 02:19 AM
All committed in r1936. Bye bye fulltext. Hello huge posts. Not a massive revert if we so desire but I'm comfortable without worrying about it personally.

Bad me, I should have waited for feedback but this is something I was eager to get done.