All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: John Keeping <john@keeping.me.uk>
To: Junio C Hamano <gitster@pobox.com>
Cc: Florian Aspart <florian.aspart@gmail.com>,
	Michael J Gruber <git@drmicha.warpmail.net>,
	git@vger.kernel.org
Subject: Re: Using clean/smudge filters with difftool
Date: Thu, 18 Jun 2015 23:39:27 +0100	[thread overview]
Message-ID: <20150618223927.GP18226@serenity.lan> (raw)
In-Reply-To: <xmqqsi9oeqhn.fsf@gitster.dls.corp.google.com>

On Thu, Jun 18, 2015 at 01:00:36PM -0700, Junio C Hamano wrote:
> John Keeping <john@keeping.me.uk> writes:
> 
> > I think this is a difference between git-diff's internal and external
> > diff modes which is working correctly, although possibly not desirably
> > in this case.  The internal diff always uses clean files (so it runs the
> > working tree file through the "clean" filter before applying the diff
> > algorithm) but the external diff uses the working tree file so it
> > applies the "smudge" filter to any blobs that it needs to checkout.
> >
> > Commit 4e218f5 (Smudge the files fed to external diff and textconv,
> > 2009-03-21) was the source of this behaviour.
> 
> The fundamental design to use smudged version when interacting with
> external programs actually predates that particular commit, I think.
> 
> The caller of the function that was updated by that commit, i.e.
> prepare_temp_file(), reuses what is checked out to the working tree
> when we can (i.e. it hasn't been modified from what we think is
> checked out) and when it is beneficial to do so (i.e. on a system
> with FAST_WORKING_DIRECTORY defined), which means the temporary file
> given by the prepare_temp_file() that is used by the external tools
> (both --ext-diff program and textconv filter) are designed to be fed
> and work on the smudged version of the file.  4e218f5 did not change
> that fundamental design; it just made things more consistent between
> the case where we do create a new temporary file out of blob and we
> allow an unmodified checked out file to be reused.

When I started looking at this, I assumed the problem would be that
git-difftool wasn't smudging the non-working-tree files.  But actually
everything is working "correctly", I'm just not sure it's always what
the user wants (at least it isn't what was wanted in this case).

Currently, the behaviour is:

	internal diff: compare clean files
	external diff: compare smudged files

This makes sense for LF/CRLF conversion, where platform-specific tools
clearly want the platform's line ending but the internal diff machinery
doesn't care.

However, from the filter description in an earlier email, I think
Florian is using a clean filter to remove output from IPython notebook
files (it seems that IPython saves both the input and the output in the
same file [1] and the output is the equivalent of, for example, C object
files).  In this case, the filter is one-way and discards information
from the working tree file, producing a smaller and more readable diff
in the process.

I think the summary is that there are some scenarios where the external
diff tool should see the smudged version and others where the clean
version is more appropriate and Git should support both options.  It
seems this is a property of the filter, so I wonder if the best solution
is a new "filter.<name>.extdiff = [clean|smudge]" configuration
variable (there's probably a better name for the variable than
"extdiff").


[1] http://pascalbugnion.net/blog/ipython-notebooks-and-git.html

  reply	other threads:[~2015-06-18 22:39 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-16 14:11 Using clean/smudge filters with difftool Florian Aspart
2015-06-18 12:31 ` Michael J Gruber
2015-06-18 13:15   ` Florian Aspart
2015-06-18 13:26     ` John Keeping
2015-06-18 13:51       ` Florian Aspart
2015-06-18 14:11         ` John Keeping
2015-06-18 14:17           ` Florian Aspart
2015-06-18 14:28             ` John Keeping
2015-06-18 15:39               ` Florian Aspart
2015-06-18 16:01                 ` John Keeping
2015-06-18 20:00                   ` Junio C Hamano
2015-06-18 22:39                     ` John Keeping [this message]
2015-06-18 22:55                       ` Junio C Hamano
2015-06-19  8:57                         ` Michael J Gruber
2015-06-19  9:32                           ` John Keeping
2015-06-19 15:04                             ` Florian Aspart
2015-06-19 17:03                           ` Junio C Hamano
2015-06-21 19:29                             ` Michael J Gruber

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150618223927.GP18226@serenity.lan \
    --to=john@keeping.me.uk \
    --cc=florian.aspart@gmail.com \
    --cc=git@drmicha.warpmail.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.