Wedge

Public area => The Pub => Features => Topic started by: Nao on July 10th, 2012, 11:27 AM

Title: Auto_increment madness...
Post by: Nao on July 10th, 2012, 11:27 AM
You all know I'm a bit odd from time to time... This is one of those days.

I was wondering if we shouldn't (couldn't) do something specific on each table row deletion.

Let's say we have a draft that we don't want to keep (or that doesn't need to remain on the server because we just saved the original post), so Wedge will delete it internally... So we go through Post2.php:904, and the draft is deleted. Along with the auto-incremented (likely) latest id_draft.
If we could simply call ALTER TABLE {db_prefix}drafts AUTO_INCREMENT=1, where the auto_increment value would be set to the earliest 'available' id, that means it would be much slower for the table to reach its maximum size. Granted, it's int in this case, but you get my point: by being too lazy about some tables, and not resetting the auto_increment when we could do it, we're basically wasting IDs.

So maybe we could have an optional feature in db_query that would check whether a table row is being deleted, and if it is, just set its auto_increment to 1, either by checking against a whitelist of tables with auto_increment fields, or just doing it automatically (I doubt MySQL would do anything else than ignore the request if it has no such field.)

Is it too crazy..? Too smart for its own good?

(Oh, did you notice how I changed the layout for the top of the post page? The icons are now inside the select box itself, and the select box was moved to the left of the title input... I like that. Hopefully you all do.)
Title: Re: Auto_increment madness...
Post by: Arantor on July 10th, 2012, 04:52 PM
Quote
If we could simply call ALTER TABLE {db_prefix}drafts AUTO_INCREMENT=1, where the auto_increment value would be set to the earliest 'available' id, that means it would be much slower for the table to reach its maximum size.
That's just begging for a race condition bug.

Consider this scenario:
draft 100 is deleted
select max(id_draft) from wedge_drafts -> returns 99
draft 101 is added in the meantime
table is updated for its auto-increment to 99

But now we have an entry after 99, and this sort of stuff is far more likely than you might imagine.
Title: Re: Auto_increment madness...
Post by: Nao on July 10th, 2012, 05:24 PM
Hmm... I think we already had this conversation long ago ;)

I *did* mean setting the auto_increment to 1, it wasn't just an example... MySQL will 'fix' that to use the earliest possible id, which is what we want. ;)

Believe me, I do it all the time in phpMyAdmin!
Title: Re: Auto_increment madness...
Post by: Arantor on July 10th, 2012, 05:35 PM
Does that hold true for InnoDB? Does it still hold true if you shut down MySQL, restart it, and then apply that alter instruction?

/meis very wary of relying on such behaviour.
Title: Re: Auto_increment madness...
Post by: Nao on July 11th, 2012, 12:41 AM
Seriously, I don't really see how it could be 'dangerous'..?
Everywhere I go, people suggest doing that.
Heck, it's even mentioned in the comments on the auto_increment doc page at mysql.com...
Title: Re: Auto_increment madness...
Post by: Arantor on July 11th, 2012, 12:57 AM
Yes, it's mentioned in that page. Did you notice the date of said comment? It's *nine years old*. That's around the time MySQL 4.0 left beta.

Almost everyone citing this doesn't even know which table engine they're using in on (and is repeating that comment from 2003), and in almost every case the advice predates MyISAM being deprecated as being the default. There are issues in InnoDB with it 'forgetting' the auto-increment value, which is specifically why I asked about it.

I also have no idea if it'll break replication sets either, though I'd hope it wouldn't. Changing table structures is never a safe procedure. It's usually safe but it's not certainly safe even if it's a soft non-blocking operation (in most cases) like this should be.
Title: Re: Auto_increment madness...
Post by: Nao on July 11th, 2012, 11:57 AM
Was looking for detail on ALTER IGNORE TABLE (apparently it doesn't play with auto_increment...), and found this:
http://dev.mysql.com/doc/refman/5.1/en/alter-table.html#c10750

Looks like the operation isn't 'free' on large tables, and I don't want to go with dummy rows (concurrent queries?), so if we ever implement that, it would have to be only on 'small' tables...
Quote from Arantor on July 11th, 2012, 12:57 AM
Almost everyone citing this doesn't even know which table engine they're using in on (and is repeating that comment from 2003), and in almost every case the advice predates MyISAM being deprecated as being the default. There are issues in InnoDB with it 'forgetting' the auto-increment value, which is specifically why I asked about it.
That's interesting. I just read about it, and to keep it simple: if the MySQL server is restarted, InnoDB will indeed recalculate (i.e. forget and recache) the auto_increment value for each table. Meaning that it basically does an Alter table auto_increment = 1 on each restart. I think...
http://dev.mysql.com/doc/refman/5.0/en/innodb-auto-increment-handling.html
Title: Re: Auto_increment madness...
Post by: Pandos on July 11th, 2012, 12:36 PM
One way is to join the table by itself and do a select statement before inserting.
Perhaps something like this:

Code: [Select]
SELECT A.ID+1 AS NEXTID FROM drafts LEFT JOIN drafts ON A.ID+1 = B.ID WHERE B.ID IS NULL ORDER BY A.ID LIMIT 1;
Title: Re: Auto_increment madness...
Post by: Nao on July 11th, 2012, 12:37 PM
Err... What for?
Title: Re: Auto_increment madness...
Post by: Pandos on July 11th, 2012, 12:41 PM
This will look for "empty" ID's and put it into the database. Soo you can besure to have always the right value in your databse.
Title: Re: Auto_increment madness...
Post by: Arantor on July 11th, 2012, 04:13 PM
Quote from Pandos on July 11th, 2012, 12:41 PM
This will look for "empty" ID's and put it into the database. Soo you can besure to have always the right value in your databse.
That is an instant race condition waiting to happen (which is why auto-increment even exists in the first place)
Title: Re: Auto_increment madness...
Post by: Nao on July 11th, 2012, 04:50 PM
So, err, no comment on my post above? ;)
Title: Re: Auto_increment madness...
Post by: Arantor on July 11th, 2012, 05:02 PM
Quote
Looks like the operation isn't 'free' on large tables, and I don't want to go with dummy rows (concurrent queries?), so if we ever implement that, it would have to be only on 'small' tables...
Dummy rows is a sure-fire way to cause race conditions because you just cannot guarantee the order in which things hit the server.

I honestly thought this was a free operation, I didn't realise that it was done with a complete table clone, the sort of thing only usually reserved for row-changing table changes (ones that change the physical properties of a column)

Seems like it would rule such things out as we can't judge what is a 'small' table and what is a 'big' table for any given host that Wedge is deployed on.
Title: Re: Auto_increment madness...
Post by: Nao on July 12th, 2012, 04:22 PM
Ah well, another good intention that comes down the toilet... :(

(And don't tell me "I know how you feel", I know you do :P)
Title: Re: Auto_increment madness...
Post by: Arantor on July 12th, 2012, 04:29 PM
Quote from Nao on July 12th, 2012, 04:22 PM
Ah well, another good intention that comes down the toilet... :(
It is a good intention but I suspect we'll be better served by dealing with phantom drafts anyway ;)
Quote from Nao on July 12th, 2012, 04:22 PM
(And don't tell me "I know how you feel", I know you do :P)
Actually when you said that, it made me think of Song Of Myself from Nightwish's album Imaginaerum:
Quote
How can you 'just be yourself'
when you don't know who you are?
Stop saying 'I know how you feel'
How could anyone know how another feels?
Title: Re: Auto_increment madness...
Post by: Nao on July 12th, 2012, 06:39 PM
Quote from Arantor on July 12th, 2012, 04:29 PM
Quote from Nao on July 12th, 2012, 04:22 PM
Ah well, another good intention that comes down the toilet... :(
It is a good intention but I suspect we'll be better served by dealing with phantom drafts anyway ;)
How do you envision it then..?
Title: Re: Auto_increment madness...
Post by: Arantor on July 12th, 2012, 08:44 PM
I envision that it isn't really a problem if we end up avoiding making rows we don't need - right now there are phantom rows being introduced that leak ids even faster. If we reduce that from happening, we reduce the problem anyway.

But even as I've typed this, I can see we're only onto #8556 out of how many thousands of millions...?
Title: Re: Auto_increment madness...
Post by: Nao on July 12th, 2012, 09:33 PM
I dont remember this story about phantom rows.
Really.

8500+ is a lot for a basically temp table.
We could add a maintenance task to flush the cache though. Like, "okay the forum is quiet these days, I can do that..."
Title: Re: Auto_increment madness...
Post by: Arantor on July 12th, 2012, 10:03 PM
Quote
I dont remember this story about phantom rows.
You're the one who reported it - where drafts get saved even when they shouldn't be - the result is phantom rows that shouldn't really be saved in the first place.
Quote
We could add a maintenance task to flush the cache though. Like, "okay the forum is quiet these days, I can do that..."
If you mean the regular cache, there should be no need to do that automatically most of the time, it should just work with older data just being purged over time.

If you mean cases like this, I'd suggest not doing that on an automatic fashion. I'm not sure it's ever going to be a problem.

As per http://stackoverflow.com/questions/6130672/mysql-auto-increment-primary-key-running-out there are notes about keeping the primary key as small as possible (especially on InnoDB), and that's important as that's now the default in MySQL. Let's leave it as is and we can deal with it if and when we actually get near to those limits.
Title: Re: Auto_increment madness...
Post by: Nao on July 13th, 2012, 12:02 AM
Quote from Arantor on July 12th, 2012, 10:03 PM
Quote
I dont remember this story about phantom rows.
You're the one who reported it - where drafts get saved even when they shouldn't be - the result is phantom rows that shouldn't really be saved in the first place.
Well, err... I thought you'd fixed that one...?
Once I find a bug of mine, I tend not to let go until it's fixed. (Just spent 3 hours on a stupid JS bug in playlist pages... My dear Opera, it was your fault... You failed to report that it was a declaration error.... -_-)
Quote
Quote
We could add a maintenance task to flush the cache though. Like, "okay the forum is quiet these days, I can do that..."
If you mean the regular cache,
No, I mean the draft table (the draft cache).
Quote
If you mean cases like this, I'd suggest not doing that on an automatic fashion. I'm not sure it's ever going to be a problem.
Well, if the table is filled, I guess we can always tell them to do a TRUNCATE on it...
Quote
As per http://stackoverflow.com/questions/6130672/mysql-auto-increment-primary-key-running-out there are notes about keeping the primary key as small as possible (especially on InnoDB), and that's important as that's now the default in MySQL. Let's leave it as is and we can deal with it if and when we actually get near to those limits.
Interesting.

Yeah, I just don't wanna be accused of thinking on the short term. But we've got a long way to go until Wedge is run on a billion-post forum!
Title: Re: Auto_increment madness...
Post by: Arantor on July 13th, 2012, 12:09 AM
Quote
Well, err... I thought you'd fixed that one...?
I fixed most of it but you since showed me that there are edge cases where it still happens which haven't yet been fixed.
Quote
Well, if the table is filled, I guess we can always tell them to do a TRUNCATE on it...
*nods*
Quote
Yeah, I just don't wanna be accused of thinking on the short term. But we've got a long way to go until Wedge is run on a billion-post forum!
I'll be happy with running it on a multi-million post forum. I've only seen one billion-post forum in the past and they started out on phpBB2 some years back (and have since rewritten it - that's Gaia Online), everyone else of that sort of scale (the likes of Slashdot et al) wrote their own.

Biggest SMF forum I've seen is 22m posts and growing daily.
Title: Re: Auto_increment madness...
Post by: live627 on July 13th, 2012, 01:25 AM
Quote
Biggest SMF forum I've seen is 22m posts and growing daily
WCRPG? It regularly gets around 50,000 posts PER DAY!
Title: Re: Auto_increment madness...
Post by: Arantor on July 13th, 2012, 01:32 AM
Yeah, that's the one, it's pretty insane and holds together by miracles I'm sure. It's still running on Apache for example >_> (Liroy needs to move that to nginx ASAP)
Title: Re: Auto_increment madness...
Post by: Nao on July 13th, 2012, 07:52 AM
Quote from Arantor on July 13th, 2012, 12:09 AM
Quote
Well, err... I thought you'd fixed that one...?
I fixed most of it but you since showed me that there are edge cases where it still happens which haven't yet been fixed.
Oh... The iPod thing...?
Crap... :(
I'd moved on. Never really had another opportunity to look at my drafts. I just looked at them now and sure thing, I have some drafts from this week that are quite obviously from iPod posts... :-/
Quote
I'll be happy with running it on a multi-million post forum. I've only seen one billion-post forum in the past and they started out on phpBB2 some years back (and have since rewritten it - that's Gaia Online), everyone else of that sort of scale (the likes of Slashdot et al) wrote their own.
It's been years, and I have yet to understand the point in Gaia Online...
Well, to each their own! If I could create my very own Gaia Online I wouldn't care what others think of it :P
Quote
Biggest SMF forum I've seen is 22m posts and growing daily.
And I don't understand its point, either...