All the mail mirrored from lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Patrick Palka <patrick@parcs.ath.cx>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [PATCH] Improve contrib/diff-highlight to highlight unevenly-sized hunks
Date: Thu, 18 Jun 2015 16:45:06 -0400	[thread overview]
Message-ID: <20150618204505.GD14550@peff.net> (raw)
In-Reply-To: <alpine.DEB.2.20.8.1506181536070.4322@idea>

On Thu, Jun 18, 2015 at 04:14:19PM -0400, Patrick Palka wrote:

> >in a test script becomes more clear. But some of the output is not so
> >great. For instance, the very commit under discussion has a
> >confusing and useless highlight. Or take a documentation patch like
> >5c31acfb, where I find the highlights actively distracting. We are saved
> >a little by the "if the whole line is different, do not highlight at
> >all" behavior of 097128d1bc.
> 
> To fix the useless highlights for both evenly and unevenly sized hunks
> (like when all but a semicolon on a line changes), one can loosen the
> criterion for not highlighting from "do not highlight if 0% of the
> before and after lines are common between them" to, say, "do not
> highlight if less than 10% of the before and after lines are common
> between them".  Then most of these useless highlights are gone for both
> evenly and unevenly sized hunks.

Yeah, this is an idea I had considered but never actually experimented
with. It does make some things better, but it also makes some a little
worse. For example, in 8dbf3eb, the hunk:

-               const char *plain = diff_get_color(ecb->color_diff,
-                                                  DIFF_PLAIN);
+               const char *context = diff_get_color(ecb->color_diff,
+                                                    DIFF_CONTEXT);

currently gets the plain/context change in the first line highlighted,
as well as the DIFF_PLAIN/DIFF_CONTEXT in the second line. With a 10%
limit, the second line isn't highlighted. That's correct by the
heuristic, but it's a bit harder to read, because the highlight draws
your eye to the first change, and it is easy to miss the second.

Still, I think this is probably a minority case, and it may be
outweighed by the improvements. The "real" solution is to consider the
hunk as a whole and do an LCS diff on it, which would show that yes,
it's worth highlighting both of those spots, as they are a small
percentage of the total hunk.

> Here is a patch that changes the criterion as mentioned.  Testing this
> change on the documentation patch 5c31acfb, only two pairs of lines are
> highlighted instead of six.  On my original patch, the useless highlight
> is gone.  The useless semicolon-related highlights on e.g. commit
> 99a2cfb are gone.

Nice, the ones like 99a2cfb are definitely wrong (I had though to fix
them eventually by treating some punctuation as uninteresting, but I
suspect the percentage heuristic covers that reasonably well in
practice).

> Of course, these patches are both hacks but they seem to be surprisingly
> effective hacks especially when paired together.

The whole script is a (surprisingly effective) hack. ;)

> >So I dunno. IMHO this does more harm than good, and I would not want to
> >use it myself. But it is somewhat a matter of taste; I am not opposed to
> >making it a configurable option.
> 
> That is something I can do :)

Coupled with the 10%-threshold patch, I think it would be OK to include
it unconditionally. So far we've just been diffing the two outputs and
micro-analyzing them. The real test to me will be using it in practice
and seeing if it's helpful or annoying.

-Peff

  reply	other threads:[~2015-06-18 20:45 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-15 17:20 [PATCH] Improve contrib/diff-highlight to highlight unevenly-sized hunks Patrick Palka
2015-06-18 15:50 ` Junio C Hamano
2015-06-18 16:28   ` Patrick Palka
2015-06-18 18:08     ` Junio C Hamano
2015-06-18 19:04       ` Jeff King
2015-06-18 20:14         ` Patrick Palka
2015-06-18 20:45           ` Jeff King [this message]
2015-06-18 21:23             ` Jeff King
2015-06-18 21:39               ` Junio C Hamano
2015-06-18 22:25               ` Patrick Palka
2015-06-19  3:54               ` Jeff King
2015-06-19  4:49                 ` Junio C Hamano
2015-06-19  5:32                   ` Jeff King
2015-06-19  7:34                     ` Jeff King
2015-06-19 11:38                       ` Jeff King
2015-06-19 17:20                     ` Junio C Hamano
2015-06-18 23:06             ` Patrick Palka
2015-06-18 20:23       ` Patrick Palka
2015-06-18 19:08     ` Jeff King
2015-06-18 20:27       ` Patrick Palka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150618204505.GD14550@peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=patrick@parcs.ath.cx \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.