Junio C Hamano <gitster@pobox.com> writes:

> Karthik Nayak <karthik.188@gmail.com> writes:
>
>> From: Karthik Nayak <karthik.188@gmail.com>
>>
>> In the previous commits, we added the required base for adding symref
>> commands to the '--stdin' mode provided by 'git-update-ref(1)'. Using
>> them, add a new 'symref-verify' command to verify symrefs.
>>
>> The 'symref-verify' command allows users to verify if a provided <ref>
>> contains the provided <old-target> without changing the <ref>. If
>> <old-target> is not provided, the command will verify that the <ref>
>> doesn't exist. Since we're checking for symbolic refs, this command will
>> only work with the 'no-deref' mode. This is because any dereferenced
>> symbolic ref will point to an object and not a ref and the regular
>> 'verify' command can be used in such situations.
>
> All makes sense, but a naïve reader may find it helpful if you
> explained why having "verify" command is a good idea in the first
> place ("I can just do 'git symoblic-ref' to read the current value,
> and see if it is what I expect").  Presumably the value of "verify"
> is that you can have it in a transaction and fail other operations
> in the same transaction if the symref moved from what you expected
> it to point at?
>

I would say none of the commits drive this point, and I would go ahead
and add something on these lines to each of them. I think it would add
good value to readers.

>> Add and use `ref_update_is_null_new_value`, a helper function which is
>> used to check if there is a new_value in a reference update. The new
>> value could either be a symref target `new_target` or a OID `new_oid`.
>> We also add tests to test the command in both the regular stdin mode and
>> also with the '-z' flag.
>
> This looks out of place, primarily because the helper function is
> *NOT* used in this step.  Without any actual user, and with the name
> that says only what it checks without hinting why a caller may want
> to check the condition it checks, it is hard to guess if it is a
> good idea to have such a helper.
>

I think over the revision, its usage from this commit was removed. It
makes sense to move it to a commit where its used, I'll do that.

> "If a ref_update object specifies no new-oid and no new-target, it
> is not about updating but just validating" is how the callers are
> expected to use it, then instead of is_null_new_value that says
> what it checks, something like is_verify_only that says what the
> caller may want to use it for would be a more friendly name for
> readers and future developers.

This is true for the old-oid and old-target. That is, when they are set
to null, we're validating.

With the new-oid and new-target, if they're null, it usually signifies
deletion. We could rename it to 'is_delete_only', but that would also
need checking the 'REF_HAVE_NEW' flag. So we could ideally change it to

```
int ref_update_is_delete_only(struct ref_update *update) {
	return (update->flags & REF_HAVE_NEW) && !update->new_target &&
is_null_oid(&update->new_oid);
}
```

I'm okay with making this change.

>> @@ -297,11 +320,47 @@ static void parse_cmd_verify(struct ref_transaction *transaction,
>>  		die("verify %s: extra input: %s", refname, next);
>>
>>  	if (ref_transaction_verify(transaction, refname, &old_oid,
>> -				   update_flags, &err))
>> +				   NULL, update_flags, &err))
>>
>>  	update_flags = default_flags;
>>  	free(refname);
>>  	strbuf_release(&err);
>>  }
>
> The only damage by this patch to parse_cmd_verify() is that
> ref_transaction_verify() gained another parameter NULL, but with the
> default "--diff-algorithm=myers" algorithm, it is very hard to see.
>
> The "--patience" algorithm does a much beter job on this hunk.
>
> And the following function is entirely new.
>
>> +static void parse_cmd_symref_verify(struct ref_transaction *transaction,
>> +				    const char *next, const char *end)
>> +{
>> +	struct strbuf err = STRBUF_INIT;
>> +	struct object_id old_oid;
>> +	char *refname, *old_target;
>> +
>> +	if (!(update_flags & REF_NO_DEREF))
>> +		die("symref-verify: cannot operate with deref mode");
>> +
>> +	refname = parse_refname(&next);
>> +	if (!refname)
>> +		die("symref-verify: missing <ref>");
>> +
>> +	/*
>> +	 * old_ref is optional, but we want to differentiate between
>> +	 * a NULL and zero value.
>> +	 */
>> +	old_target = parse_next_refname(&next);
>> +	if (!old_target)
>> +		old_oid = *null_oid();
>
> In many existing code paths, we do not do structure assignment like
> this. Instead we do
>
> 		oidcpy(&old_oid, null_oid());
>
> We can see an existing example in a common context in a hunk for
> refs.c in this patch.
>

Yeah, makes sense to switch this. Will do.

>> +	if (*next != line_termination)
>> +		die("symref-verify %s: extra input: %s", refname, next);
>> +
>> +	if (ref_transaction_verify(transaction, refname,
>> +				   old_target ? NULL : &old_oid,
>> +				   old_target, update_flags, &err))
>> +		die("%s", err.buf);
>
> Are static analyzers smart enough to notice that we will not be
> using old_oid uninitialized here?  Just wondering.

Yup, at least the clang LSP server seems to detect and not bug me about
it.

> Anyway.  This ensures ref_transaction_verify() gets either
> old_target or old_oid, but never both at the same time.  The caller
> to ref_transaction_verify() in the previous function passed NULL for
> old_target but it always had a non-NULL old_oid so that is perfectly
> fine.
>
>> +	update_flags = default_flags;
>> +	free(refname);
>> +	free(old_target);
>> +	strbuf_release(&err);
>> +}
>
>> diff --git a/refs.c b/refs.c
>> index 060a31616d..0e1013b5ab 100644
>> --- a/refs.c
>> +++ b/refs.c
>> @@ -1217,6 +1217,8 @@ void ref_transaction_free(struct ref_transaction *transaction)
>>
>>  	for (i = 0; i < transaction->nr; i++) {
>>  		free(transaction->updates[i]->msg);
>> +		free((void *)transaction->updates[i]->old_target);
>> +		free((void *)transaction->updates[i]->new_target);
>>  		free(transaction->updates[i]);
>>  	}
>>  	free(transaction->updates);
>> @@ -1247,9 +1249,13 @@ struct ref_update *ref_transaction_add_update(
>>
>>  	update->flags = flags;
>>
>> -	if (flags & REF_HAVE_NEW)
>> +	if (new_target)
>> +		update->new_target = xstrdup(new_target);
>> +	if (old_target)
>> +		update->old_target = xstrdup(old_target);
>
> Presumably "update" structure, when freshly initialized, has NULL in
> both of these _target members?  Otherwise ref_transaction_free()
> would get in trouble, so double checking.
>

This is a good point. My understanding was that FLEX_ALLOC_MEM should
set everything to 0.

>> +	if (new_oid && flags & REF_HAVE_NEW)
>>  		oidcpy(&update->new_oid, new_oid);
>> -	if (flags & REF_HAVE_OLD)
>> +	if (old_oid && flags & REF_HAVE_OLD)
>>  		oidcpy(&update->old_oid, old_oid);
>
> Since we can ask to work on a symbolic ref, new_oid / old_oid can be
> NULL when REF_HAVE_NEW / REF_HAVE_OLD bit is on for _target members.
>
> Makes me wonder if the code becomes easier to follow if the flag
> bits are split into four (_NEW -> _NEW_OID + _NEW_TARGET), but let's
> not worry about that for now.
>

The intersection of this is quite low currently, so I'm not really sure
if there's added benefit. I did start that way before, but perhaps with
the iterations in the last few version, maybe it makes the code simpler.

>> @@ -1286,6 +1292,7 @@ int ref_transaction_update(struct ref_transaction *transaction,
>>  	flags &= REF_TRANSACTION_UPDATE_ALLOWED_FLAGS;
>>
>>  	flags |= (new_oid ? REF_HAVE_NEW : 0) | (old_oid ? REF_HAVE_OLD : 0);
>> +	flags |= (new_target ? REF_HAVE_NEW : 0) | (old_target ? REF_HAVE_OLD : 0);
>
>> @@ -1325,14 +1332,17 @@ int ref_transaction_delete(struct ref_transaction *transaction,
>>  int ref_transaction_verify(struct ref_transaction *transaction,
>>  			   const char *refname,
>>  			   const struct object_id *old_oid,
>> +			   const char *old_target,
>>  			   unsigned int flags,
>>  			   struct strbuf *err)
>>  {
>> -	if (!old_oid)
>> -		BUG("verify called with old_oid set to NULL");
>> +	if (!old_target && !old_oid)
>> +		BUG("verify called with old_oid and old_target set to NULL");
>
> Is it normal if you get _both_ set, or is it equally a BUG()?
> The parse_*_verify() codepaths we saw earlier both made sure
> only one of the two is non-NULL, and it is unclear what should
> happen if both are non-NULL.
>

It is a bug and this is caught in `ref_transaction_add_update`.
Introduced in the first patch of the series.

>> +	if (old_target && !(flags & REF_NO_DEREF))
>> +		BUG("verify cannot operate on symrefs with deref mode");
>>  	return ref_transaction_update(transaction, refname,
>>  				      NULL, old_oid,
>> -				      NULL, NULL,
>> +				      NULL, old_target,
>>  				      flags, NULL, err);
>>  }
>
> So this queues an ref_update object whose .new_oid and .new_target
> are NULL, and .old_oid and .old_target are what the caller gave us
> to check.  The NULLs in .new* members hopefully do not mean "delete
> this thing" ;-)
>

So the 'new_oid' being set to zero should be the delete this thing
queue.

>> @@ -2349,6 +2359,12 @@ static int run_transaction_hook(struct ref_transaction *transaction,
>>  	for (i = 0; i < transaction->nr; i++) {
>>  		struct ref_update *update = transaction->updates[i];
>>
>> +		/*
>> +		 * Skip reference transaction for symbolic refs.
>> +		 */
>> +		if (update->new_target || update->old_target)
>> +			continue;
>
> Is that a final design, or will the hooks have a chance to interfere?
>

The last patch adds hook support.

>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>> index 2420dac2aa..53197fa3af 100644
>> --- a/refs/files-backend.c
>> +++ b/refs/files-backend.c
>> @@ -2425,6 +2425,37 @@ static const char *original_update_refname(struct ref_update *update)
>>  	return update->refname;
>>  }
>>
>> +/*
>> + * Check whether the REF_HAVE_OLD and old_target values stored in
>> + * update are consistent with ref, which is the symbolic reference's
>> + * current value. If everything is OK, return 0; otherwise, write an
>> + * error message to err and return -1.
>> + */
>> +static int check_old_target(struct ref_update *update, char *ref,
>> +			    struct strbuf *err)
>> +{
>> +	if (!(update->flags & REF_HAVE_OLD) ||
>> +	    !strcmp(update->old_target, ref))
>> +		return 0;
>
> Earlier on the assignment side for "update" structure we saw above,
> the guard was (old_target && flags & REF_HAVE_OLD), but here we
> assume old_target is valid, which feels a bit asymmetric.
>
> Yes, I can see that the caller does not call us when !old_target,
> but still...  Perhaps
>
> 	if ((update->flags & REF_HAVE_OLD) && !update->old_target)
> 		BUG(...);
>

I will add something like this.

> or something?  Or alternatively, perhaps !!update->old_target should
> be the only thing we should check and ignore REF_HAVE_OLD bit?  I am
> not sure, but it smells like that the non-NULL-ness of old_target is
> the only thing that matters (if it is not NULL, very early in the
> control flow somebody would have set REF_HAVE_OLD bit to flags, no?).
>
> It brings me back to my earlier question.  Does REF_HAVE_OLD bit
> serve a useful purpose in this code?
>

I checked and it doesn't, it can be removed from usage in this code.
Will cleanup this part.

>> +	if (!strcmp(update->old_target, ""))
>> +		strbuf_addf(err, "cannot lock ref '%s': "
>> +			    "reference already exists",
>> +			    original_update_refname(update));
>> +	else if (!strcmp(ref, ""))
>> +		strbuf_addf(err, "cannot lock ref '%s': "
>> +			    "reference is missing but expected %s",
>> +			    original_update_refname(update),
>> +			    update->old_target);
>
> So... for old_target and ref, an empty string is a special value?
> How?  Shouldn't that be documented in the comment before the
> function?
>
>> +	else
>> +		strbuf_addf(err, "cannot lock ref '%s': "
>> +			    "is at %s but expected %s",
>> +			    original_update_refname(update),
>> +			    ref, update->old_target);
>> +
>> +	return -1;
>> +}
>> +
>>  /*
>>   * Check whether the REF_HAVE_OLD and old_oid values stored in update
>>   * are consistent with oid, which is the reference's current value. If
>> @@ -2528,6 +2559,18 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>>  					ret = TRANSACTION_GENERIC_ERROR;
>>  					goto out;
>>  				}
>> +			}
>> +
>> +			/*
>> +			 * For symref verification, we need to check the reference value
>> +			 * rather than the oid. If we're dealing with regular refs or we're
>> +			 * verifying a dereferenced symref, we then check the oid.
>> +			 */
>> +			if (update->old_target) {
>> +				if (check_old_target(update, referent.buf, err)) {
>> +					ret = TRANSACTION_GENERIC_ERROR;
>> +					goto out;
>> +				}
>
> We come here only when update->type has REF_ISSYMREF bit on (we
> learned that value by calling lock_raw_ref()), and know referent.buf
> has the current "target" value.  That is consumed as "ref" parameter
> to check_old_target() we just saw.  OK.
>
>> diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
>> index 6104471199..a2474245aa 100644
>> --- a/refs/reftable-backend.c
>> +++ b/refs/reftable-backend.c
>> @@ -938,7 +938,26 @@ static int reftable_be_transaction_prepare(struct ref_store *ref_store,
>>  		 * individual refs. But the error messages match what the files
>>  		 * backend returns, which keeps our tests happy.
>>  		 */
>> -		if (u->flags & REF_HAVE_OLD && !oideq(&current_oid, &u->old_oid)) {
>> +		if ((u->flags & REF_HAVE_OLD) && u->old_target) {
>> +			if (strcmp(referent.buf, u->old_target)) {
>> +				if (!strcmp(u->old_target, ""))
>> +					strbuf_addf(err, "verifying symref target: '%s': "
>> +						    "provided target is empty",
>> +						    original_update_refname(u));
>> +				else if (!strcmp(referent.buf, ""))
>> +					strbuf_addf(err, "verifying symref target: '%s': "
>> +						    "reference is missing but expected %s",
>> +						    original_update_refname(u),
>> +						    u->old_target);
>> +				else
>> +					strbuf_addf(err, "verifying symref target: '%s': "
>> +						    "is at %s but expected %s",
>> +						    original_update_refname(u),
>> +						    referent.buf, u->old_target);
>> +				ret = -1;
>> +				goto done;
>> +			}
>
> Again, the puzzling "empty string"s are handled here.

For here and above, this too is dead code and no longer needed,
old_target being empty string is left over code from before we decided
to use zero_oid for deleting. I'll remove it. Thanks.