author | Elijah Newren <newren@gmail.com> | |
Fri, 12 Aug 2011 05:20:03 +0000 (23:20 -0600) | ||
committer | Junio C Hamano <gitster@pobox.com> | |
Sun, 14 Aug 2011 21:19:36 +0000 (14:19 -0700) | ||
commit | 531357a4cc2b93ec68099890b835f50e462ceab7 | |
tree | 0b23ebf066061e2e0174f1b35e65c887e72ca15b | tree | snapshot |
parent | b8ddf16424f2d4f29a73b89dabbb764763744abe | commit | diff |
merge-recursive: Fix deletion of untracked file in rename/delete conflicts
In the recursive case (o->call_depth > 0), we do not modify the working
directory. However, when o->call_depth==0, file renames can mean we need
to delete the old filename from the working copy. Since there have been
lots of changes and mistakes here, let's go through the details. Let's
start with a simple explanation of what we are trying to achieve:
Original goal: If a file is renamed on the side of history being merged
into head, the filename serving as the source of that rename needs to be
removed from the working directory.
The path to getting the above statement implemented in merge-recursive took
several steps. The relevant bits of code may be instructive to keep in
mind for the explanation, especially since an English-only description
involves double negatives that are hard to follow. These bits of code are:
int remove_file(..., const char *path, int no_wd)
{
...
int update_working_directory = !o->call_depth && !no_wd;
and
remove_file(o, 1, ren1_src, <expression>);
Where the choice for <expression> has morphed over time:
65ac6e9 (merge-recursive: adjust to loosened "working file clobbered"
check 2006-10-27), introduced the "no_wd" parameter to remove_file() and
used "1" for <expression>. This meant ren1_src was never deleted, leaving
it around in the working copy.
In 8371234 (Remove uncontested renamed files during merge. 2006-12-13),
<expression> was changed to "index_only" (where index_only ==
!!o->call_depth; see b7fa51da). This was equivalent to using "0" for
<expression> (due to the early logic in remove_file), and is orthogonal to
the condition we actually want to check at this point; it resulted in the
source file being removed except when index_only was false. This was
problematic because the file could have been renamed on the side of history
including head, in which case ren1_src could correspond to an untracked
file that should not be deleted.
In 183d797 (Keep untracked files not involved in a merge. 2007-02-04),
<expression> was changed to "index_only || stage == 3". While this gives
correct behavior, the "index_only ||" portion of <expression> is
unnecessary and makes the code slightly harder to follow.
There were also two further changes to this expression, though without
any change in behavior. First in b7fa51d (merge-recursive: get rid of the
index_only global variable 2008-09-02), it was changed to "o->call_depth
|| stage == 3". (index_only == !!o->call_depth). Later, in 41d70bd6
(merge-recursive: Small code clarification -- variable name and comments),
this was changed to "o->call_depth || renamed_stage == 2" (where stage was
renamed to other_stage and renamed_stage == other_stage ^ 1).
So we ended with <expression> being "o->call_depth || renamed_stage == 2".
But the "o->call_depth ||" piece was unnecessary. We can remove it,
leaving us with <expression> being "renamed_stage == 2". This doesn't
change behavior at all, but it makes the code clearer. Which is good,
because it's about to get uglier.
Corrected goal: If a file is renamed on the side of history being merged
into head, the filename serving as the source of that rename needs to be
removed from the working directory *IF* that file is tracked in head AND
the file tracked in head is related to the original file.
Note that the only difference between the original goal and the corrected
goal is the two extra conditions added at the end. The first condition is
relevant in a rename/delete conflict. If the file was deleted on the
HEAD side of the merge and an untracked file of the same name was added to
the working copy, then without that extra condition the untracked file
will be erroneously deleted. This changes <expression> to "renamed_stage
== 2 || !was_tracked(ren1_src)".
The second additional condition is relevant in two cases.
The first case the second condition can occur is when a file is deleted
and a completely different file is added with the same name. To my
knowledge, merge-recursive has no mechanism for detecting deleted-and-
replaced-by-different-file cases, so I am simply punting on this
possibility.
The second case for the second condition to occur is when there is a
rename/rename/add-source conflict. That is, when the original file was
renamed on both sides of history AND the original filename is being
re-used by some unrelated (but tracked) content. This case also presents
some additional difficulties for us since we cannot currently detect these
rename/rename/add-source conflicts; as long as the rename detection logic
"optimizes" by ignoring filenames that are present at both ends of the
diff, these conflicts will go unnoticed. However, rename/rename conflicts
are handled by an entirely separate codepath not being discussed here, so
this case is not relevant for the line of code under consideration.
In summary:
Change <expression> from "o->call_depth || renamed_stage == 2" to
"renamed_stage == 2 || !was_tracked(ren1_src)", in order to remove
unnecessary code and avoid deleting untracked files.
96 lines of explanation in the changelog to describe a one-line fix...
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the recursive case (o->call_depth > 0), we do not modify the working
directory. However, when o->call_depth==0, file renames can mean we need
to delete the old filename from the working copy. Since there have been
lots of changes and mistakes here, let's go through the details. Let's
start with a simple explanation of what we are trying to achieve:
Original goal: If a file is renamed on the side of history being merged
into head, the filename serving as the source of that rename needs to be
removed from the working directory.
The path to getting the above statement implemented in merge-recursive took
several steps. The relevant bits of code may be instructive to keep in
mind for the explanation, especially since an English-only description
involves double negatives that are hard to follow. These bits of code are:
int remove_file(..., const char *path, int no_wd)
{
...
int update_working_directory = !o->call_depth && !no_wd;
and
remove_file(o, 1, ren1_src, <expression>);
Where the choice for <expression> has morphed over time:
65ac6e9 (merge-recursive: adjust to loosened "working file clobbered"
check 2006-10-27), introduced the "no_wd" parameter to remove_file() and
used "1" for <expression>. This meant ren1_src was never deleted, leaving
it around in the working copy.
In 8371234 (Remove uncontested renamed files during merge. 2006-12-13),
<expression> was changed to "index_only" (where index_only ==
!!o->call_depth; see b7fa51da). This was equivalent to using "0" for
<expression> (due to the early logic in remove_file), and is orthogonal to
the condition we actually want to check at this point; it resulted in the
source file being removed except when index_only was false. This was
problematic because the file could have been renamed on the side of history
including head, in which case ren1_src could correspond to an untracked
file that should not be deleted.
In 183d797 (Keep untracked files not involved in a merge. 2007-02-04),
<expression> was changed to "index_only || stage == 3". While this gives
correct behavior, the "index_only ||" portion of <expression> is
unnecessary and makes the code slightly harder to follow.
There were also two further changes to this expression, though without
any change in behavior. First in b7fa51d (merge-recursive: get rid of the
index_only global variable 2008-09-02), it was changed to "o->call_depth
|| stage == 3". (index_only == !!o->call_depth). Later, in 41d70bd6
(merge-recursive: Small code clarification -- variable name and comments),
this was changed to "o->call_depth || renamed_stage == 2" (where stage was
renamed to other_stage and renamed_stage == other_stage ^ 1).
So we ended with <expression> being "o->call_depth || renamed_stage == 2".
But the "o->call_depth ||" piece was unnecessary. We can remove it,
leaving us with <expression> being "renamed_stage == 2". This doesn't
change behavior at all, but it makes the code clearer. Which is good,
because it's about to get uglier.
Corrected goal: If a file is renamed on the side of history being merged
into head, the filename serving as the source of that rename needs to be
removed from the working directory *IF* that file is tracked in head AND
the file tracked in head is related to the original file.
Note that the only difference between the original goal and the corrected
goal is the two extra conditions added at the end. The first condition is
relevant in a rename/delete conflict. If the file was deleted on the
HEAD side of the merge and an untracked file of the same name was added to
the working copy, then without that extra condition the untracked file
will be erroneously deleted. This changes <expression> to "renamed_stage
== 2 || !was_tracked(ren1_src)".
The second additional condition is relevant in two cases.
The first case the second condition can occur is when a file is deleted
and a completely different file is added with the same name. To my
knowledge, merge-recursive has no mechanism for detecting deleted-and-
replaced-by-different-file cases, so I am simply punting on this
possibility.
The second case for the second condition to occur is when there is a
rename/rename/add-source conflict. That is, when the original file was
renamed on both sides of history AND the original filename is being
re-used by some unrelated (but tracked) content. This case also presents
some additional difficulties for us since we cannot currently detect these
rename/rename/add-source conflicts; as long as the rename detection logic
"optimizes" by ignoring filenames that are present at both ends of the
diff, these conflicts will go unnoticed. However, rename/rename conflicts
are handled by an entirely separate codepath not being discussed here, so
this case is not relevant for the line of code under consideration.
In summary:
Change <expression> from "o->call_depth || renamed_stage == 2" to
"renamed_stage == 2 || !was_tracked(ren1_src)", in order to remove
unnecessary code and avoid deleting untracked files.
96 lines of explanation in the changelog to describe a one-line fix...
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
merge-recursive.c | diff | blob | history | |
t/t6042-merge-rename-corner-cases.sh | diff | blob | history |