Topic started by: Nao on January 11th, 2014, 06:18 PM
Okay, so this git rebase feature is a joke...
IIRC, the Wedge repo was built using git-svn, and then fully converted to git. Due to its SVN background, it has a fully linear history (except at the end, when I was experimenting with branches.)
Yesterday, I decided to go ahead and do the full rebase I wanted to do, to fix these problems:
- Many CRLF fixes in the first commits, which should be squashed into their parent commits, so that the CRLF problems disappear from the history.
- One more CRLF fix in late 2012. This one's been bothering me for a while, because it adds two useless steps to most Git Blame operations.
- An empty commit (entirely empty, really.)
- An empty commit message (this was a mistake on my part, but the message exists in the New Revs topic, it's rev 142.)
That was more than enough to justify the rebase.
So, I started off by rebasing the CRLF stuff early in the history. As a result, I kept getting 3-way merge errors, insisting that some files were missing, or other silly things. Given that it's a LINEAR history, and that the only work done was to convert CRLF line endings, there's absolutely no reason for these.
After several hours of failures, I decided to give up on that, and instead start fixing from the most recent commits, and then work my way back.
I got rid of the empty commit easily (all it took was git filter-branch --prune-empty), then did a rebase on the 2012 bug, which seems to be fixed. Okay, all good...
Now, onto that empty commit message. What I did was retrieve the SHA1 of the grandparent commit, do git rebase -i on that SHA1, then when the list shows up, change 'pick' to 'reword' on the line for the commit with the empty message. Then I save, it shows me a commit window, I paste my rev 142 content, quit, and it starts rebasing, all the way to the end. ALL GOOD.
Then I right-clicked the Wedge repo, Show Log, and... Funny. You can see the error message below.
It still shows me the master and all commits, but I can't access the branch list, or switch to any of the branches.
I then restarted from scratch (well, before the last rebase), and I still got the same. That reword rebase totally breaks my refs.
From what I could gather, between my broken Wedge's .git folder and the last valid one, there are a few differences. Broken has an empty refs folder, except for the 'stash' file. All its subfolders are there, but empty. In Working, all subfolders have a list of files corresponding to the expected refs (master, other branches, etc.)
OTOH, Working has an info folder with just an 'exclude' file, while Broken has an extra file in it, called 'refs', and which holds a list of all my branches, like this:
So, basically, I'm thinking I could just re-create the files manually in /refs/, but (1) it's taxing, (2) I'm never gonna know if I forgot to fix something else, (3) I'll probably have the same problem again when I try another rebase, (4) I don't even know why git (command line) doesn't complain (if I type git status, I get a list of branches), while TortoiseGit does.
Can anyone help..? (ema? :sob:)
Copying the ref files manually and updating the hash value for master worked.
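For what it's worth, an empty refs folder doesn't necessarily mean the refs are gone. git can store refs "packed" in a single .git/packed-refs file instead of one small file per branch, and command-line git reads both forms. Whether that is what tripped up TortoiseGit here is only a guess. A minimal sketch in a throwaway repo (the 'feature' branch and the demo identity are made up, and it assumes git's default files-based ref storage):

```shell
# Scratch repository, so nothing here touches a real project
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "first"
git branch feature

# Before packing: one loose file per branch under .git/refs/heads/
loose_before=$(ls .git/refs/heads | wc -l | tr -d ' ')

# Move every ref into .git/packed-refs (loose files are removed by default)
git pack-refs --all

loose_after=$(ls .git/refs/heads | wc -l | tr -d ' ')
packed=$(grep -c 'refs/heads/' .git/packed-refs)
visible=$(git for-each-ref refs/heads | wc -l | tr -d ' ')

echo "loose before: $loose_before, loose after: $loose_after"
echo "listed in packed-refs: $packed, still visible to git: $visible"
```

If a tool insists on loose files, `git update-ref refs/heads/master <sha>` should write one without hand-editing the .git folder.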
In the meantime I did one more rebase and it didn't trigger errors.
I'm currently doing an additional one (each rebase takes something like an hour), and so far so good.
Then I'll have about two rebases left to do. Hopefully I can pinpoint the one fixup that breaks the repo. Maybe nothing will break it. If something breaks it, I'll probably just leave it be. After all, it's only a stat-related rebase (i.e. I don't want line removal and addition stats to be influenced too much by CRLF commits that usually add thousands of lines to the stats.)
Still. Rebasing was so hard, I swear after that, I won't be tempted to do it again! Thankfully, as the repo will then be public, I would be crazy to rebase after that point.
My goal for public repo release is January 15 (and a public alpha a couple of weeks after that at most). Will probably come later if I lose too much time with rebasing.
Great! Last rebase worked fine, AND the current one (on which I decided to tackle one more commit than usual) is going fine, too.
Two rebases left, at most! Once I finish this one...
All done, yay!
Now, all that's left to do is filter-branch to remove all occurrences of the languages folder (don't need to waste repo space on something that's already duplicated in its own repo), and the 'other' folder (aka the stash), although that one will be harder, and might stay in it.
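Removing a folder from every commit is the textbook filter-branch use case. Here is a minimal sketch on a throwaway repo, not the exact command used on Wedge: the file names and the demo identity are invented, and only the 'languages' folder name comes from the post above.

```shell
# Scratch repository with a 'languages' folder in its history
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

mkdir languages
echo code > index.php
echo strings > languages/index.english.php   # hypothetical file name
g add .
g commit -q -m "code and language files"
echo more >> index.php
g commit -q -am "code only"

# Newer git versions print a long warning and pause before running; this skips it
export FILTER_BRANCH_SQUELCH_WARNING=1

# Drop 'languages' from the index of every commit; commits left empty are pruned
g filter-branch -f --prune-empty \
  --index-filter 'git rm -r -q --cached --ignore-unmatch languages' \
  -- --all >/dev/null 2>&1

remaining=$(git log master --format=%H -- languages | wc -l | tr -d ' ')
master_commits=$(git rev-list --count master)
backup_refs=$(git for-each-ref refs/original | wc -l | tr -d ' ')

echo "commits on master still touching languages/: $remaining"
echo "commits on master: $master_commits, backup refs under refs/original: $backup_refs"
```

Note the backup ref: filter-branch keeps the pre-rewrite history under refs/original/, so the old objects stay in the repo until that ref is deleted.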
The 'other' folder is gone entirely, in the end. It didn't make sense to keep it, especially as the entirety of it went into the stash history.
The 'languages' folders are all gone, except for /core/languages/, introduced recently. See the 'lang files' topic for a related question to users experienced with either github and/or git submodules. Thanks!
Amusingly, after all this cleanup, the repo is now twice its original size, even after a git gc --prune=now. I suspect it's because my other (unpushed) branches are all based on older versions of my history, meaning git is forced to keep all the previous commits around, too. Does anyone know how I could reintegrate the updated master into the other branches? I fear rebasing would end in failure, but I guess it's the 'easiest' solution. I don't even know if it's possible to take the master history, rebase another branch's tip on top of it, and then save the result to a new branch (or the original feature branch)..?
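Old branches are one thing that keeps rewritten commits alive, but not the only one: the reflog also references them, and git gc --prune=now won't delete anything the reflog can still reach. A minimal sketch of that effect in a throwaway repo, where an amend stands in for the rebase and the demo identity is made up:

```shell
# Scratch repository: one commit, then a rewrite of that commit
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

echo one > file.txt
g add file.txt
g commit -q -m "original commit"
old=$(git rev-parse HEAD)
g commit -q --amend -m "rewritten commit"   # same tree, new commit id

# First attempt: gc alone. The old commit is still referenced by the reflog.
git gc -q --prune=now
if git cat-file -e "$old" 2>/dev/null; then survived_gc=yes; else survived_gc=no; fi

# Second attempt: expire the reflog first, then gc.
# (After filter-branch, the refs under refs/original/ would need deleting too,
#  e.g. with git update-ref -d, and so would any stash or branch on the old history.)
git reflog expire --expire=now --all
git gc -q --prune=now
if git cat-file -e "$old" 2>/dev/null; then survived_expire=yes; else survived_expire=no; fi

echo "old commit after plain gc: $survived_gc"
echo "old commit after reflog expire + gc: $survived_expire"
```

Expiring the reflog throws away the safety net for undoing the rewrite, so it's probably best done on a copy first.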
I just want to make sure the Wedge repo is 100% solid before I push it to github, of course.
Ehhh... I understand.
Removing the CRLF from the history is a PAIN and I have to fight with them every time... No idea how to fix them "easily". The only way is to prevent them to begin with. I think there are a couple of options you can enable to ignore CRLF and force the repo to LF only.
Don't worry, I finished rebasing everything. There are only two instances of CRLF remaining: one in Admin.php, one in changelog.txt. I'm not 'fixup'ing their CRLF->LF fixes, because there were a few commits made before these files were fixed, and I don't want to merge unrelated fixes together, 'just' to get the LF issues right.
Still, that's a lot better than the 50+ fixed files I used to have, of course.
As for forcing the repo to LF only: yes and no. It fixes problems on your local side, but if your 'remote' repo has CRLF in a file, it will be re-committed as CRLF, so it needs to be specifically dealt with.
Provided the repository is clean of course.
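The options in question are most likely core.autocrlf and the text/eol attributes in .gitattributes. A minimal sketch of the attribute approach in a throwaway repo (the file name crlf.txt is made up), showing that a file saved with CRLF gets stored with LF:

```shell
# Scratch repository
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Rule: auto-detect text files, store them with LF, check them out with LF.
# (A per-user alternative is: git config core.autocrlf input)
printf '* text=auto eol=lf\n' > .gitattributes

# A file saved with Windows line endings: a CR LF b CR LF = 6 bytes
printf 'a\r\nb\r\n' > crlf.txt
git add .gitattributes crlf.txt 2>/dev/null   # silences the "CRLF will be replaced" warning

worktree_bytes=$(wc -c < crlf.txt | tr -d ' ')
staged_bytes=$(git cat-file -p :crlf.txt | wc -c | tr -d ' ')

echo "working tree copy: $worktree_bytes bytes, staged blob: $staged_bytes bytes"
```

This only affects what gets added from now on. Files already committed with CRLF stay that way until they are re-added, for instance with `git add --renormalize .` on git 2.16 or later, which matches the "provided the repository is clean" caveat.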
/meremembers the joy of finding CRLF commits...
We all went through that (for me, it was in August and September 2010, after which we never got any problems again, except in late 2012 because of a new program I used that rewrote all my files in CRLF, and I only realized it after committing.... and SVN doesn't allow history rewrites, so I was screwed at the time. git is too complicated of a beast, but at least it gives me a lot more flexibility in cases like this one.)
One other solution would be to run a git filter-branch that gets rid of any CRLF combos. And yes, that exists, and it's doable in one line, IIRC... :^^;:
ema, can you help with this?
After I rebased my master branch, the other branches were not updated, so they're all still referring to the original untouched master, and thus take more space in the .git folder...
I need to find out how to update the OTHER branches to use the MASTER branch as their 'original' branch. That is, basically, these branches all have a 'starting point' (the commit ID when they were created), I suppose I just need to move that commit ID to the new master branch. But how do I do that without rebasing the entire branch? Playing with reflogs?
Or is there another, even smarter way of updating these branches?
You should rebase them.
You enter the branch and
And that would be enough..?
Wouldn't git attempt to first take the entire master history, then apply whatever it doesn't find in the master history on top of it? That is, the entire 'alternate' (obsolete) history of the other branch?
Not that it matters much, because I didn't have enough diffs to justify re-doing these branches, but maybe for the future, I don't know...
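On the question of replaying the entire obsolete history: a plain rebase onto master would indeed try to replay every old commit it can't match to one in the new master. Giving rebase the old fork point with --onto limits it to the branch's own commits. A minimal sketch in a throwaway repo, where an amend stands in for the big rewrite and the branch and file names are made up:

```shell
# Scratch repository: master with one commit, a feature branch forked from it
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

echo base > base.txt
g add base.txt
g commit -q -m "old master commit"
old_base=$(git rev-parse HEAD)              # the commit the feature branch started from

git checkout -q -b feature
echo work > feature.txt
g add feature.txt
g commit -q -m "feature work"

git checkout -q master
g commit -q --amend -m "rewritten master commit"   # master now shares no commit with feature

# Replay only the commits in old_base..feature on top of the new master
g rebase -q --onto master "$old_base" feature

feature_parent=$(git rev-parse feature^)
new_master=$(git rev-parse master)
feature_commits=$(git rev-list --count feature)

echo "feature now sits on the new master: $feature_parent = $new_master"
echo "commits reachable from feature: $feature_commits"
```

To keep the original branch untouched, create a copy first (`git branch feature-new feature`) and rebase the copy instead.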