Git Mailing List Archive mirror
* [GSoC][PROPOSAL v1] Refactor git-bisect(1) to make its state self-contained
@ 2024-03-16 18:57 Aryan Gupta
  2024-03-20 11:06 ` Christian Couder
  2024-03-21 12:48 ` Patrick Steinhardt
  0 siblings, 2 replies; 9+ messages in thread
From: Aryan Gupta @ 2024-03-16 18:57 UTC
  To: git; +Cc: Patrick Steinhardt, karthik nayak, Christian Couder,
	Junio C Hamano

Hello everyone.

I hope you are doing well. I am writing this email to get a review for
my GSoC proposal.

Thanks in advance.

Regards
Aryan

Here it goes:

-- Refactor git-bisect(1) to make its state self-contained --
=============================================================

-- Personal Info --
====================

Full name :     Aryan Gupta
E-mail    :     garyan447@gmail.com
Tel       :     (+91) 7018273899
Education :     Graphic Era University, Dehradun, India
Year      :     Final Year
Major     :     Computer Science and Engineering
GitHub    :     [1]

-- About Me --
====================

I have been enthusiastic about open source from the very beginning of
my journey as a software developer. I have contributed to other
open-source projects and, though still a beginner, I am generally
familiar with the contribution process. The related experience is all
in the contribution graph on my GitHub profile page [1]. In the ZAP
Project [2] community, I have made over 50 PRs [3]. I also
participated in Google Summer of Code 2023 with the OWASP Foundation
and completed it successfully [4]. I have contributed to some other
small projects on GitHub as well.

I joined the Git community this year (around February). I have become
fairly comfortable with the contribution process by writing, replying
to, and reviewing different sorts of patches in the community.

With the patches done so far, I am getting more familiar with the Git
internals, project structure, commonly used APIs, test suites,
required tech stack, and coding guidelines. To understand Git better,
I read and reread the documentation a few times, including
‘MyFirstContribution.txt’, ‘MyFirstObjectWalk.txt’, and ‘Hacking Git’.
The book Pro Git also helped me to understand the Git internals
better.

Effective communication with mentors is a very important part of the
Google Summer of Code program. I know when to ask the right questions,
and I try everything I can before asking a question. This habit has
always proved very useful, and I believe it will help me complete this
GSoC project as well.


-- Before GSoC --
====================

-- Synopsis --
-----------------------

I’m picking the project idea “Refactor git-bisect(1) to make its state
self-contained” from the SoC 2024 Ideas page [5]. This idea aims to
make the state of git-bisect(1) self-contained. Its difficulty should
be medium, and the expected project size is between 175 and 350 hours.

The git-bisect(1) command is used to find the commit in a range of
commits that introduced a specific bug. Starting a bisection run
creates a set of state files in the Git repository, such as
“.git/BISECT_START”, which record various parameters. These files
look almost like refs due to their names being all-uppercase. This has
led to confusion with the new “reftable” backend because it wasn’t
quite clear whether those files are in fact refs or not.

As it turns out, they are not refs and should never be treated as
such. Overall, it has been concluded that the way those files are
currently stored is not ideal. Instead of having a proliferation of
files in the Git directory, it was discussed whether the bisect state
should be moved into its own “bisect-state” subdirectory. This would
make it more self-contained and thereby avoid future confusion. It is
also aligned with the sequencer state used by rebases, which is neatly
contained in the “rebase-apply” and “rebase-merge” directories.

The goal of this project would be to realize this change. While
rearchitecting the layout should be comparatively easy to do, the
harder part will be to hash out how to handle backwards compatibility.

-- Benefits to Community --
--------------------------------

By joining the community and working on this idea, I can collaborate
with my mentors and fellow community members to enhance the user
experience for those who use git-bisect(1). Additionally, creating a
clear distinction between the refs and the git-bisect state files
through this project would be highly beneficial for the community.

-- Why? --
-----------------

For me, the biggest question is always why I am doing something. The
reason I want to contribute to Git in this Google Summer of Code is
the learning I will gain while communicating with the mentors and the
Git community. I want to learn as much as I can in this phase. I also
want a kick start to my journey of contributing to Git for the long
term. I chose Git mainly because I wanted to explore something that is
used by almost every software developer.

-- Microproject --
------------------------------

- Modernize a test script [6]
  - Status: merged into master
  - Description: Modernize the formatting of the test script to align
with current standards and improve its overall readability.
  - Remarks: This was my first patch in git which helped me to
understand a lot about how the process of submitting a patch and
review works.

-- Other Patches --
----------------------------

Other than the required microproject, I’ve submitted a few other
patches when I stumbled upon potential improvements. These patches
are:

- Zero Count Optimization [7]
  - Status: Under Review
  - Description: Optimize for efficiency using trailing zeros for set
bit iteration
  - Remarks: This was a great learning task. I learned how tiny things
matter and small optimizations might make a big impact on performance.


-- In GSoC --
====================

-- Plan --
----------------

What is Git Bisect?

Git Bisect is a Git command that finds the commit which introduced a
bug into the codebase quickly by using binary search.
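
As a rough illustration of the binary search idea, here is a minimal
Python sketch over a linear history. It is only a model: Git itself is
written in C and bisects a commit graph rather than a flat list. The
function name `first_bad_commit` and the `is_bad` callback are
hypothetical and exist only for this example.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str],
                     is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in a history ordered oldest to newest.

    Assumptions for this sketch: there are at least two commits,
    commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # the culprit is at mid or earlier
        else:
            good = mid   # the culprit is after mid
    return commits[bad]
```

Each step halves the remaining range, so the number of commits that
have to be tested grows roughly with the logarithm of the range size.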


The project idea is relatively easy to understand from the description
itself. Git bisect stores some state files, such as BISECT_START and
BISECT_FIRST_PARENT, whose names look very similar to the naming
convention used for refs. Because of this similarity, unexpected
results can occur when these state files are mistaken for refs.

To fix this problem, we can create a new ".git/bisect-state"
directory, as proposed by Patrick, and store all the state files that
belong to the git-bisect(1) command in that directory as follows.


- BISECT_TERMS -> bisect-state/terms
- BISECT_LOG -> bisect-state/log
- BISECT_START -> bisect-state/start
- BISECT_RUN -> bisect-state/run
- BISECT_FIRST_PARENT -> bisect-state/first-parent
- BISECT_ANCESTORS_OK -> bisect-state/ancestors-ok
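
The renaming above can be modelled as a simple lookup table. The
Python below is a sketch for discussion only, not Git's C code; the
names `LEGACY_TO_NEW` and `new_state_path` are hypothetical, and the
table contains only the files listed above.

```python
from pathlib import Path

# Legacy file name in the Git directory -> proposed path relative to it.
LEGACY_TO_NEW = {
    "BISECT_TERMS": "bisect-state/terms",
    "BISECT_LOG": "bisect-state/log",
    "BISECT_START": "bisect-state/start",
    "BISECT_RUN": "bisect-state/run",
    "BISECT_FIRST_PARENT": "bisect-state/first-parent",
    "BISECT_ANCESTORS_OK": "bisect-state/ancestors-ok",
}


def new_state_path(git_dir: Path, legacy_name: str) -> Path:
    """Return where a legacy state file would live under the new layout."""
    return git_dir / LEGACY_TO_NEW[legacy_name]
```

Keeping the mapping in one place means every caller resolves paths the
same way, which is the property the real change would need as well.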

The change looks very simple, and it actually is. We will just update
all the paths from which these files are accessed. When I went through
the code of the bisect.c file, I found that the paths are easy to
change; we just need to modify them in a few places. The main problem
is to handle backward compatibility properly and to implement test
cases that keep existing behavior from breaking. I have already gone
through the bisect.c file a couple of times, reading and trying to
understand all the functions, and I now understand a lot about how it
works.

-- Challenges with Backward Compatibility --
--------------------------------------------

I plan to implement a migration mechanism that will detect whether
legacy state files such as BISECT_* are present during the
initialization of git bisect. This migration phase will automatically
move these files to the new “.git/bisect-state” directory, ensuring a
seamless transition for users while maintaining compatibility with
existing scripts and tools. The biggest challenge is not to break
existing functionality.
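
A minimal sketch of such a migration step is shown below, in Python
rather than Git's C, purely to make the idea concrete. The names
`LEGACY_TO_NEW` and `migrate_legacy_state` are hypothetical. The
sketch assumes that no other process is touching the files and it does
not handle a user switching back to an older Git version afterwards.

```python
from pathlib import Path
from typing import List

# Hypothetical mapping, mirroring the layout proposed in the plan.
LEGACY_TO_NEW = {
    "BISECT_TERMS": "bisect-state/terms",
    "BISECT_LOG": "bisect-state/log",
    "BISECT_START": "bisect-state/start",
    "BISECT_RUN": "bisect-state/run",
    "BISECT_FIRST_PARENT": "bisect-state/first-parent",
    "BISECT_ANCESTORS_OK": "bisect-state/ancestors-ok",
}


def migrate_legacy_state(git_dir: Path) -> List[str]:
    """Move legacy BISECT_* files into bisect-state/; return moved names."""
    moved = []
    for legacy_name, new_name in LEGACY_TO_NEW.items():
        old = git_dir / legacy_name
        if not old.is_file():
            continue  # nothing to migrate for this file
        new = git_dir / new_name
        new.parent.mkdir(parents=True, exist_ok=True)
        old.replace(new)  # rename; overwrites an existing target
        moved.append(legacy_name)
    return moved
```

Running the function a second time moves nothing, so it would be safe
to call on every invocation. Whether overwriting an existing file in
the new directory is the right policy is an open design question.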


-- Strategies for Handling Backward Compatibility --
---------------------------------------------------------

1. Before implementing any changes, a good understanding of how
git-bisect works and how it is structured is very important. So, I
will thoroughly analyze the existing usage patterns of git-bisect and
list down all the potential areas of impact.

2. To validate the codebase changes and ensure the stability and
backward compatibility of the git-bisect command, I will write unit
tests for all the required changes.

3. I will also try to create some real scenarios where I will be
manually testing the desired behavior.

4. We might also need some regression tests to test some of the
functionality which can't be tested using unit tests.

Here is a list of a few tests which I am able to think of as of now:

1. Verify correct relocation of state files during initialization.
2. Test compatibility with existing bisect scripts and tools.
3. Validate handling of edge cases, such as empty or corrupted state
files, to prevent unexpected behavior.
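
To make the third test more concrete, here is a small Python sketch of
validating one state file. It assumes that the terms file holds
exactly two non-empty lines, the term for the bad state followed by
the term for the good state; that layout is an assumption of this
sketch and should be checked against the actual code. The function
name `parse_terms` is hypothetical.

```python
from typing import Tuple


def parse_terms(text: str) -> Tuple[str, str]:
    """Parse the contents of a terms file into (bad_term, good_term).

    Raises ValueError for empty or malformed input instead of
    silently continuing with partial state.
    """
    lines = [line.strip() for line in text.splitlines()]
    if len(lines) != 2 or not all(lines):
        raise ValueError("expected exactly two non-empty lines")
    return lines[0], lines[1]
```

A test for the empty or corrupted case would then assert that the
error is raised rather than that some default terms are used.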

I am pretty sure I am still missing some of the problems I may face in
the future, but I will work through them as they arise.

-- Timeline --
------------------

I will be starting the project as soon as I receive the confirmation
of my proposal getting selected for GSoC.

Here is a list of phases during the GSoC period for this project.
These are only approximations; the actual timeline may vary because of
review iterations or unpredictable scenarios.

1. Understanding the existing implementation of the bisect command
    - This includes understanding how to efficiently test local
changes made to the bisect command
    - Getting more familiar with the coding style

2. Start working on the task to add the directory
    - Updating all the references wherever these bisect files are used
    - Writing/Updating tests to test all the changes made

3. Start with supporting backward compatibility
    - For this phase, I am thinking of adding a stage just after the
initialization of the git bisect command which will detect whether any
files named BISECT_* exist and migrate all those files into the
“.git/bisect-state” directory.

4. Adding Tests
    - Thorough testing for backward compatibility is very important so
that the changes don’t break existing behavior.

5. Documenting the changes
    - This might also go side by side if mentors need some weekly
documentation of the tasks done.


-- Availability --
----------------------

I will be able to contribute 2-3 hours on weekdays and a minimum of 6
hours per day on weekends, which gives me around 22-27 hours per week
to work on the project. I will work for at least 14 weeks (this can be
extended) during GSoC. I will try my best to keep this time commitment
and always be available through the community’s mailing list. I have
exams in the first week of June, so I will have limited availability
during that period. The project might take somewhere around 175-350
hours.

-- After GSoC --
---------------------

After GSoC I intend to remain a part of the community and keep
contributing to Git’s codebase. I am eager to learn a lot from the Git
community and enhance my skills. I also see myself making important
contributions to Git in the future.

When I first joined the community 2 months ago, the ancient way of
collaborating through a mailing list by sending diff patches was
really puzzling (GitHub was the only means that I knew about for
open-source collaboration). After sending a patch I got a little
comfortable with it and started loving it.

-- Closing remarks --
----------------------------

I am really enthusiastic to contribute to git. I really want to learn
a lot from my mentors and the whole git community. Everyone
contributes using git, why not contribute to git :).

In the end I really want to thank the whole community, especially
Christian and Junio, for their valuable time in reviewing my patches
and guiding me with the problems I faced.

-- References --
---------------------

[1]  Github Link - https://github.com/aryangupta701
[2]  ZAP Project - https://github.com/zaproxy
[3]  Pull Requests -
https://github.com/search?q=org%3Azaproxy+type%3Apr+sort%3Aupdated-asc++author%3Aaryangupta701&type=pullrequests
[4]  GSoC 2023 Project -
https://www.zaproxy.org/blog/2023-09-11-browser-recorder/
[5]  SoC Ideas - https://git.github.io/SoC-2024-Ideas/
[6]  Migrate Test Script -
https://lore.kernel.org/git/pull.1675.v2.git.1709243831190.gitgitgadget@gmail.com/T/#t
[7]  Zero Count Optimization -
https://lore.kernel.org/git/20240310162614.62691-2-garyan447@gmail.com/T/#u
[8]  Issue Raised - https://lore.kernel.org/git/Za-gF_Hp_lXViGWw@tanuki/
[9]  Git References -
https://git-scm.com/book/en/v2/Git-Internals-Git-References
[10] Git bisect - https://git-scm.com/docs/git-bisect
[11] Reftables - https://git-scm.com/docs/reftable


Thanks & Regards,
Aryan


* Re: [GSoC][PROPOSAL v1] Refactor git-bisect(1) to make its state self-contained
  2024-03-16 18:57 [GSoC][PROPOSAL v1] Refactor git-bisect(1) to make its state self-contained Aryan Gupta
@ 2024-03-20 11:06 ` Christian Couder
  2024-03-21 13:21   ` Aryan Gupta
  2024-03-23 16:37   ` Aryan Gupta
  2024-03-21 12:48 ` Patrick Steinhardt
  1 sibling, 2 replies; 9+ messages in thread
From: Christian Couder @ 2024-03-20 11:06 UTC
  To: Aryan Gupta; +Cc: git, Patrick Steinhardt, karthik nayak, Junio C Hamano

Hi,

On Sat, Mar 16, 2024 at 7:58 PM Aryan Gupta <garyan447@gmail.com> wrote:
>
> Hello everyone.
>
> I hope you are doing well. I am writing this email to get a review for
> my GSoC proposal.
>
> Thanks in advance.
>
> Regards
> Aryan
>
> Here it goes:
>
> -- Refactor git-bisect(1) to make its state self-contained --
> =============================================================
>
> -- Personal Info --
> ====================
>
> Full name :     Aryan Gupta
> E-mail    :     garyan447@gmail.com
> Tel       :     (+91) 7018273899
> Education :     Graphic Era University, Dehradun, India
> Year      :     Final Year
> Major     :     Computer Science and Engineering
> GitHub    :     [1]
>
> -- About Me --
> ====================
>
> I have been enthusiastic about open source from the very beginning of
> my journey as a software developer. I’ve contributed to other
> open-source projects, though still a beginner, I’m generally familiar
> with the process of contribution. The related experiences are all in
> the contribution graph on my GitHub profile page [1]. In the ZAP
> Project[2] community, I’ve made over 50 PRs [3]. I also participated
> in the Google Summer of Code 2023 with the OWASP Foundation and
> successfully completed it as well [4]. I have contributed to some
> other small projects as well on GitHub.

It's interesting to know that you have already participated in a GSoC.
Is there a single blog post about this or more?

> I came to participate in the Git community this year (around
> February). I got myself rather comfortable with the contribution
> process by writing, replying, and auditing different sorts of patches
> in the community.
>
> With the patches done so far, I’m getting more familiar with the Git
> internals, project structures, commonly used APIs, test suites,
> required tech stacks, and coding guidelines. For understanding better
> about Git, I read and reread the documentation a few times, including
> ‘MyFirstContribution.txt’, ‘MyFirstObjectWalk.txt’, and ‘Hacking Git’.
> The book Pro Git also helped me to understand the Git internals
> better.
>
> Effective communication with mentors is a very important part of
> Google Summer of Code Program. I know when to ask the right questions
> and I do every possible thing I can do before asking any question.
> This trait of mine always proved to be very useful and I believe it
> will be really beneficial for me for the completion of this GSoC
> project as well.

Ok.

> -- Before GSoC --
> ====================
>
> -- Synopsis --
> -----------------------
>
> I’m picking the project idea “Refactor git-bisect(1) to make its state
> self-contained” from the SoC 2024 Ideas page [5]. This idea aims to
> make git-bisect(1) self-contained. Its difficulty should be medium,
> and the expected project size takes somewhere between 175 hours to 350
> hours.
>
> The git-bisect(1) command is used to find a commit in a range of
> commits that introduced a specific bug. Starting a bisection run
> creates a set of state files into the Git repository which record
> various different parameters like “.git/BISECT_START”. These files
> look almost like refs due to their names being all-uppercase. This has
> led to confusion with the new “reftable” backend because it wasn’t
> quite clear whether those files are in fact refs or not.
>
> As it turns out they are not refs and should never be treated like
> one. Overall, it has been concluded that the way those files are
> currently stored is not ideal. Instead of having a proliferation of
> files in the Git directory, it was discussed whether the bisect state
> should be moved into its own “bisect-state” subdirectory. This would
> make it more self-contained and thereby avoid future confusion. It is
> also aligned with the sequencer state used by rebases, which is neatly
> contained in the “rebase-apply” and “rebase-merge” directories.
>
> The goal of this project would be to realize this change. While
> rearchitecting the layout should be comparatively easy to do, the
> harder part will be to hash out how to handle backwards compatibility.

It's Ok to copy the project description, or parts of it, from the idea
page, but please say that you have done so. This way we can just skip
this when we read your proposal.

> -- Benefits to Community --
> --------------------------------
>
> By joining the community and working on this idea, I can collaborate
> with my mentors and fellow community members to enhance the user
> experience for those who use git-bisect(1). Additionally, creating a
> clear distinction between the refs and the git-bisect state files
> through this project would be highly beneficial for the community.
>
> -- Why? --
> -----------------
>
> For me, the biggest question is always why I am doing something. So to
> answer why I want to contribute to Git in this Google Summer of Code
> is the learning I will get while communicating with the mentors and
> the git community. I want to learn as much as I could in this phrase.
> Also, I want a kick start to begin my journey of contributing to git
> for a long term. Why I chose git was mainly because I wanted to
> explore something which is being used by almost every software
> developer.
>
> -- Microproject --
> ------------------------------
>
> - Modernize a test script [6]
>   - Status: merged into master

Please add a link to (or the hash of) the commit that merges your
commit(s) into the master branch.

>   - Description: Modernize the formatting of the test script to align
> with current standards and improve its overall readability.
>   - Remarks: This was my first patch in git which helped me to
> understand a lot about how the process of submitting a patch and
> review works.
>
> -- Other Patches --
> ----------------------------
>
> Other than the required microproject, I’ve submitted a few other
> patches when I stumbled upon potential modifications, these patches
> are:
>
> - Zero Count Optimization [7]
>   - Status: Under Review

Looking at the thread, it seems to me that it has been reviewed and we
are expecting a reroll from you with at least a better commit message.

>   - Description: Optimize for efficiency using trailing zeros for set
> bit iteration
>   - Remarks: This was a great learning task. I learned how tiny things
> matter and small optimizations might make a big impact on performance.
>
>
> -- In GSoC --
> ====================
>
> -- Plan --
> ----------------
>
> What is Git Bisect?
>
> Git Bisect is a git command which helps to find which commit
> introduced the bug into the codebase at a faster rate by leveraging
> the power of binary search.
>
>
> The project idea is relatively easy to understand by the description
> itself. Git bisect stores some state files such as BISECT_START,
> BISECT_FIRST_PARENT etc which looks very similar to the naming
> convention used for creating the refs file. Due to this similar naming
> convention it sometimes causes unexpected results when these state
> files are confused as the ref files.

I don't think there are unexpected results. I think it's mostly a
clarification, standardization and cleanup issue.

> In order to fix this problem we can create a new ".git/bisect-state"
> directory and store all the state files which belong to the
> git-bisect(1) command in that directory as proposed by Patrick and
> store all the files as follows.
>
>
> - BISECT_TERMS -> bisect-state/terms
> - BISECT_LOG -> bisect-state/log
> - BISECT_START -> bisect-state/start
> - BISECT_RUN -> bisect-state/run
> - BISECT_FIRST_PARENT -> bisect-state/first-parent
> - BISECT_ANCESTORS_OK -> bisect-state/ancestors-ok

Yeah, please link to Patrick's email where he suggests this. And
please clarify what is quoted verbatim from his email. It is Ok to
copy paste from existing emails, but it helps reviewing proposals if
you are clear about this as we can then skip those parts.

> While the change looks very simple, it actually is. We will just
> update all the paths from where these files are being accessed. When I
> went through the code of bisect.c file I found that the path is pretty
> easy to configure we just need to modify it at a few places. The main
> problem is to efficiently handle the backward compatibility and
> implement proper test cases to not let the existing things break. I
> have already gone through the bisect.c file a couple of times just
> reading and trying to understand all the functions and was able to
> understand a lot of things about how it works.

There is "bisect.c" and there is "builtin/bisect.c". Please make it
clearer which file(s) you are talking about. Also code examples of how
you would update things could be useful.

> -- Challenges with Backward Compatibility --
> --------------------------------------------
>
> I plan to implement a migration mechanism that will detect whether the
> legacy stat file like BISECT_* are present or not during the

Maybe: s/stat file/status files/

> initialization of git bisect.

What do you mean by "initialization of git bisect"? Does that take
into account the fact that users could upgrade while they are
bisecting? What if they switch back to an older version after
upgrading?

Do you plan to always check for any "BISECT_*" file, or would a check
for the "bisect-state" directory first help?

> This migration phase will automatically
> move these files to the new “.git/bisect-state” directory, ensuring a
> seamless transition for users while maintaining the compatibility with
> existing scripts and tools. The biggest challenge is not to let the
> existing functionality break.
>
>
> -- Strategies for Handling Backward Compatibility --
> ---------------------------------------------------------
>
> 1. Before implementing any changes, a good understanding of how
> git-bisect works and how it is structured is very important. So, I
> will thoroughly analyze the existing usage patterns of git-bisect and
> list down all the potential areas of impact.
>
> 2. To validate the codebase changes and ensure the stability and the
> backward compatibility of git-bisect command. I will write unit tests
> for all the required changes.

Do you mean unit tests using the new unit test framework in C? Could
you show an example?

> 3. I will also try to create some real scenarios where I will be
> manually testing the desired behavior.
>
> 4. We might also need some regression tests to test some of the
> functionality which can't be tested using unit tests.

In which test script would you add them?

> Here is a list of a few tests which I am able to think of as of now:
>
> 1. Verify correct relocation of state files during initialization.
> 2. Test compatibility with existing bisect scripts and tools.
> 3. Validate handling of edge cases, such as empty or corrupted state
> files, to prevent unexpected behavior
>
> I am pretty sure I am still missing some of the problems which I may
> face in the future but I will figure them out too when they arrive.
>
> -- Timeline --
> ------------------
>
> I will be starting the project as soon as I receive the confirmation
> of my proposal getting selected for GSoC.
>
> Here is a list of phases during the GSoC period for this project.
> While these are just the approximations, the actual timeline may vary
> because of the review iterations or unpredictable scenarios.
>
> 1. Understanding the existing implementation of Bisect command
>     - This includes understanding how to test the changes locally
> which are made in the bisect command efficiently
>     - Get more familiar with the style of writing the code
>
> 2. Start working on the task to add the directory
>     - Updating all the references wherever these bisect files are used
>     - Writing/Updating tests to test all the changes made
>
> 3. Start with supporting backward compatibility
>     - For this phase what I am thinking is of adding a stage just
> after the initialization of git bisect command which will detect if
> there exist some files by the name BISECT_* and migrate all those
> files into the “.git/bisect” directory.
>
> 4. Adding Tests
>     - Through testing for backward compatibility is very very
> important so that the changes don’t break the existing changes.
>
> 5. Documenting the changes
>     - This might also go side by side if mentors need some weekly
> documentation of the tasks done.

Tests and documentation should be part of the patches that change the
behavior. So it doesn't really make sense to list them separately in
the timeline.

> -- Availability --
> ----------------------
>
> I will be able to contribute 2-3 hours on weekdays and minimum 6 hours
> on weekends. Which gives me around 22-27 hours per week to work on the
> project. I will work till a minimum span of 14 weeks (can be extended)
> during the GSoC. I will try my best to keep this time commitment and
> be always available through the community’s mailing list. Also, I have
> exams in the first week of June so I will have limited availability
> during that period. The project might take somewhere around 175-350
> hours.

Thanks for these details related to your availability. As this is your
final year, when are you expecting to graduate?

> -- After GSoC --
> ---------------------
>
> After GSoC I intend to be a part of the community and keep
> contributing to the git’s codebase. I eagerly want to learn a lot from
> the git community and enhance my skills. I also see myself making
> important contributions to git in the future.
>
> When I first joined the community 2 months ago, the ancient way of
> collaborating through a mailing list by sending diff patches was
> really puzzling (GitHub was the only means that I knew about for
> open-source collaboration). After sending a patch I got a little
> comfortable with it and started loving it.
>
> -- Closing remarks --
> ----------------------------
>
> I am really enthusiastic to contribute to git. I really want to learn
> a lot from my mentors and the whole git community. Everyone
> contributes using git, why not contribute to git :).
>
> In the end I really want to thank the whole community, especially
> Christian and Junio, for their valuable time in reviewing my patches
> and guiding me with the problems I faced.

Thanks for your proposal. Please make sure you submit it soon as a pdf
file to the GSoC website. It can then be updated by uploading a new
pdf until the April 2 1800 UTC deadline.


* Re: [GSoC][PROPOSAL v1] Refactor git-bisect(1) to make its state self-contained
  2024-03-16 18:57 [GSoC][PROPOSAL v1] Refactor git-bisect(1) to make its state self-contained Aryan Gupta
  2024-03-20 11:06 ` Christian Couder
@ 2024-03-21 12:48 ` Patrick Steinhardt
  2024-03-21 13:27   ` Aryan Gupta
  1 sibling, 1 reply; 9+ messages in thread
From: Patrick Steinhardt @ 2024-03-21 12:48 UTC
  To: Aryan Gupta; +Cc: git, karthik nayak, Christian Couder, Junio C Hamano


On Sat, Mar 16, 2024 at 07:57:49PM +0100, Aryan Gupta wrote:
[snip]
> -- Plan --
> ----------------
> 
> What is Git Bisect?
> 
> Git Bisect is a git command which helps to find which commit
> introduced the bug into the codebase at a faster rate by leveraging
> the power of binary search.
> 
> 
> The project idea is relatively easy to understand by the description
> itself. Git bisect stores some state files such as BISECT_START,
> BISECT_FIRST_PARENT etc which looks very similar to the naming
> convention used for creating the refs file. Due to this similar naming
> convention it sometimes causes unexpected results when these state
> files are confused as the ref files.
> 
> In order to fix this problem we can create a new ".git/bisect-state"
> directory and store all the state files which belong to the
> git-bisect(1) command in that directory as proposed by Patrick and
> store all the files as follows.
> 
> 
> - BISECT_TERMS -> bisect-state/terms
> - BISECT_LOG -> bisect-state/log
> - BISECT_START -> bisect-state/start
> - BISECT_RUN -> bisect-state/run
> - BISECT_FIRST_PARENT -> bisect-state/first-parent
> - BISECT_ANCESTORS_OK -> bisect-state/ancestors-ok
> 
> While the change looks very simple, it actually is. We will just
> update all the paths from where these files are being accessed. When I
> went through the code of bisect.c file I found that the path is pretty
> easy to configure we just need to modify it at a few places. The main
> problem is to efficiently handle the backward compatibility and
> implement proper test cases to not let the existing things break. I
> have already gone through the bisect.c file a couple of times just
> reading and trying to understand all the functions and was able to
> understand a lot of things about how it works.

As you also mention further down in the section about backwards
compatibility, the challenge of this project is not about doing those
changes. That indeed is the trivial part of this whole project.

The real challenge is figuring out with the community how to ensure that
the change is indeed backwards compatible, and that does not only
involve backwards compatibility with Git itself, but also with other,
third party tools. The biggest question will be whether the refactoring
is ultimately going to be worth it in the bigger picture, or whether we
should really just leave it be.

So personally, I see the biggest part of this project as finding a
good middle ground with the community. It's thus of a more "political"
nature overall.

I don't mean to discourage you with this. I just want to state up front
where you should expect difficulties so that you don't underestimate the
difficulty of this project overall. It could very well happen that the
whole idea gets shot down by the community in case we figure out that it
is simply too risky and/or not worth it in the long run.

If you want to stick with this idea then I would strongly recommend that
you mention this in your proposal.

Patrick


* Re: [GSoC][PROPOSAL v1] Refactor git-bisect(1) to make its state self-contained
  2024-03-20 11:06 ` Christian Couder
@ 2024-03-21 13:21   ` Aryan Gupta
  2024-03-26 10:29     ` Christian Couder
  2024-03-23 16:37   ` Aryan Gupta
  1 sibling, 1 reply; 9+ messages in thread
From: Aryan Gupta @ 2024-03-21 13:21 UTC (permalink / raw
  To: Christian Couder
  Cc: git, Patrick Steinhardt [ ], karthik nayak, Junio C Hamano

Hey Christian

Thank you for reviewing the proposal.

On Wed, Mar 20, 2024 at 12:06 PM Christian Couder
<christian.couder@gmail.com> wrote:
>
> Hi,
>
> On Sat, Mar 16, 2024 at 7:58 PM Aryan Gupta <garyan447@gmail.com> wrote:
> >
> > Hello everyone.
> >
> > I hope you are doing well. I am writing this email to get a review for
> > my GSoC proposal.
> >
> > Thanks in advance.
> >
> > Regards
> > Aryan
> >
> > Here it goes:
> >
> > -- Refactor git-bisect(1) to make its state self-contained --
> > =============================================================
> >
> > -- Personal Info --
> > ====================
> >
> > Full name :     Aryan Gupta
> > E-mail    :     garyan447@gmail.com
> > Tel       :     (+91) 7018273899
> > Education :     Graphic Era University, Dehradun, India
> > Year      :     Final Year
> > Major     :     Computer Science and Engineering
> > GitHub    :     [1]
> >
> > -- About Me --
> > ====================
> >
> > I have been enthusiastic about open source from the very beginning of
> > my journey as a software developer. I’ve contributed to other
> > open-source projects, though still a beginner, I’m generally familiar
> > with the process of contribution. The related experiences are all in
> > the contribution graph on my GitHub profile page [1]. In the ZAP
> > Project[2] community, I’ve made over 50 PRs [3]. I also participated
> > in the Google Summer of Code 2023 with the OWASP Foundation and
> > successfully completed it as well [4]. I have contributed to some
> > other small projects as well on GitHub.
>
> It's interesting to know that you have already participated in a GSoC.
> Is there a single blog post about this or more?

Here it is: https://www.zaproxy.org/blog/2023-09-11-browser-recorder/

> > I came to participate in the Git community this year (around
> > February). I got myself rather comfortable with the contribution
> > process by writing, replying, and auditing different sorts of patches
> > in the community.
> >
> > With the patches done so far, I’m getting more familiar with the Git
> > internals, project structures, commonly used APIs, test suites,
> > required tech stacks, and coding guidelines. For understanding better
> > about Git, I read and reread the documentation a few times, including
> > ‘MyFirstContribution.txt’, ‘MyFirstObjectWalk.txt’, and ‘Hacking Git’.
> > The book Pro Git also helped me to understand the Git internals
> > better.
> >
> > Effective communication with mentors is a very important part of
> > Google Summer of Code Program. I know when to ask the right questions
> > and I do every possible thing I can do before asking any question.
> > This trait of mine always proved to be very useful and I believe it
> > will be really beneficial for me for the completion of this GSoC
> > project as well.
>
> Ok.
>
> > -- Before GSoC --
> > ====================
> >
> > -- Synopsis --
> > -----------------------
> >
> > I’m picking the project idea “Refactor git-bisect(1) to make its state
> > self-contained” from the SoC 2024 Ideasb page [5]. This idea aims to
> > make git-bisect(1) self-contained. Its difficulty should be medium,
> > and the expected project size takes somewhere between 175 hours to 350
> > hours.
> >
> > The git-bisect(1) command is used to find a commit in a range of
> > commits that introduced a specific bug. Starting a bisection run
> > creates a set of state files into the Git repository which record
> > various different parameters like “.git/BISECT_START”. These files
> > look almost like refs due to their names being all-uppercase. This has
> > led to confusion with the new “reftable” backend because it wasn’t
> > quite clear whether those files are in fact refs or not.
> >
> > As it turns out they are not refs and should never be treated like
> > one. Overall, it has been concluded that the way those files are
> > currently stored is not ideal. Instead of having a proliferation of
> > files in the Git directory, it was discussed whether the bisect state
> > should be moved into its own “bisect-state” subdirectory. This would
> > make it more self-contained and thereby avoid future confusion. It is
> > also aligned with the sequencer state used by rebases, which is neatly
> > contained in the “rebase-apply” and “rebase-merge” directories.
> >
> > The goal of this project would be to realize this change. While
> > rearchitecting the layout should be comparatively easy to do, the
> > harder part will be to hash out how to handle backwards compatibility.
>
> It's Ok to copy the project description, or parts of it, from the idea
> page, but please say that you have done so. This way we can just skip
> this when we read your proposal.
>
Okay.

> > -- Benefits to Community --
> > --------------------------------
> >
> > By joining the community and working on this idea, I can collaborate
> > with my mentors and fellow community members to enhance the user
> > experience for those who use git-bisect(1). Additionally, creating a
> > clear distinction between the refs and the git-bisect state files
> > through this project would be highly beneficial for the community.
> >
> > -- Why? --
> > -----------------
> >
> > For me, the biggest question is always why I am doing something. So to
> > answer why I want to contribute to Git in this Google Summer of Code
> > is the learning I will get while communicating with the mentors and
> > the git community. I want to learn as much as I could in this phrase.
> > Also, I want a kick start to begin my journey of contributing to git
> > for a long term. Why I chose git was mainly because I wanted to
> > explore something which is being used by almost every software
> > developer.
> >
> > -- Microproject --
> > ------------------------------
> >
> > - Modernize a test script [6]
> >   - Status: merged into master
>
> Please add a link to (or the hash of) the commit that merges your
> commit(s) into the master branch.
>
Here it is: https://github.com/gitgitgadget/git/commit/06ac51898156b7bba0a42d36731285ec3c1975aa
I will add it to the next version as well.

> >   - Description: Modernize the formatting of the test script to align
> > with current standards and improve its overall readability.
> >   - Remarks: This was my first patch in git which helped me to
> > understand a lot about how the process of submitting a patch and
> > review works.
> >
> > -- Other Patches --
> > ----------------------------
> >
> > Other than the required microproject, I’ve submitted a few other
> > patches when I stumbled upon potential modifications, these patches
> > are:
> >
> > - Zero Count Optimization [7]
> >   - Status: Under Review
>
> Looking at the thread, it seems to me that it has been reviewed and we
> are expecting a reroll from you with at least a better commit message.

Yes, I am still working on what Junio suggested, which is to use the
builtin function only if the machine supports it.

>
> >   - Description: Optimize for efficiency using trailing zeros for set
> > bit iteration
> >   - Remarks: This was a great learning task. I learned how tiny things
> > matter and small optimizations might make a big impact on performance.
> >
> >
> > -- In GSoC --
> > ====================
> >
> > -- Plan --
> > ----------------
> >
> > What is Git Bisect?
> >
> > Git Bisect is a git command which helps to find which commit
> > introduced the bug into the codebase at a faster rate by leveraging
> > the power of binary search.
> >
> >
> > The project idea is relatively easy to understand by the description
> > itself. Git bisect stores some state files such as BISECT_START,
> > BISECT_FIRST_PARENT etc which looks very similar to the naming
> > convention used for creating the refs file. Due to this similar naming
> > convention it sometimes causes unexpected results when these state
> > files are confused as the ref files.
>
> I don't think there are unexpected results. I think it's mostly a
> clarification, standardization and cleanup issue.

Okay. Thank you for the clarification.
>
> > In order to fix this problem we can create a new ".git/bisect-state"
> > directory and store all the state files which belong to the
> > git-bisect(1) command in that directory as proposed by Patrick and
> > store all the files as follows.
> >
> >
> > - BISECT_TERMS -> bisect-state/terms
> > - BISECT_LOG -> bisect-state/log
> > - BISECT_START -> bisect-state/start
> > - BISECT_RUN -> bisect-state/run
> > - BISECT_FIRST_PARENT -> bisect-state/first-parent
> > - BISECT_ANCESTORS_OK -> bisect-state/ancestors-ok
>
> Yeah, please link to Patrick's email where he suggests this. And
> please clarify what is quoted verbatim from his email. It is Ok to
> copy paste from existing emails, but it helps reviewing proposals if
> you are clear about this as we can then skip those parts.
>
Okay sure.

> > While the change looks very simple, it actually is. We will just
> > update all the paths from where these files are being accessed. When I
> > went through the code of bisect.c file I found that the path is pretty
> > easy to configure we just need to modify it at a few places. The main
> > problem is to efficiently handle the backward compatibility and
> > implement proper test cases to not let the existing things break. I
> > have already gone through the bisect.c file a couple of times just
> > reading and trying to understand all the functions and was able to
> > understand a lot of things about how it works.
>
> There is "bisect.c" and there is "builtin/bisect.c". Please make it
> clearer which file(s) you are talking about. Also code examples of how
> you would update things could be useful.
>
I am talking about "bisect.c" here, but the path update needs to be
done in both files, i.e. "bisect.c" and "builtin/bisect.c", as shown
below. When I went through the codebase again, I also found that these
paths are used in some other places, such as the test files, the
wt-status.c file and a few more.

diff --git a/bisect.c b/bisect.c
index 60aae2fe50..d615113c73 100644
--- a/bisect.c
+++ b/bisect.c
@@ -472,13 +472,13 @@ static int read_bisect_refs(void)
        return for_each_ref_in("refs/bisect/", register_ref, NULL);
 }

-static GIT_PATH_FUNC(git_path_bisect_names, "BISECT_NAMES")
-static GIT_PATH_FUNC(git_path_bisect_ancestors_ok, "BISECT_ANCESTORS_OK")
-static GIT_PATH_FUNC(git_path_bisect_run, "BISECT_RUN")
-static GIT_PATH_FUNC(git_path_bisect_start, "BISECT_START")
-static GIT_PATH_FUNC(git_path_bisect_log, "BISECT_LOG")
-static GIT_PATH_FUNC(git_path_bisect_terms, "BISECT_TERMS")
-static GIT_PATH_FUNC(git_path_bisect_first_parent, "BISECT_FIRST_PARENT")
+static GIT_PATH_FUNC(git_path_bisect_names, "bisect-state/BISECT_NAMES")
+static GIT_PATH_FUNC(git_path_bisect_ancestors_ok, "bisect-state/BISECT_ANCESTORS_OK")
+static GIT_PATH_FUNC(git_path_bisect_run, "bisect-state/BISECT_RUN")
+static GIT_PATH_FUNC(git_path_bisect_start, "bisect-state/BISECT_START")
+static GIT_PATH_FUNC(git_path_bisect_log, "bisect-state/BISECT_LOG")
+static GIT_PATH_FUNC(git_path_bisect_terms, "bisect-state/BISECT_TERMS")
+static GIT_PATH_FUNC(git_path_bisect_first_parent, "bisect-state/BISECT_FIRST_PARENT")

Note that this is just an example. I will not be removing the existing
paths; instead I will be adding the new paths to keep both backward and
forward compatibility. Every write to a state file will go to both
paths. Every read will first check for the bisect-state directory: if it
is present, read from it; otherwise first migrate the files from the
.git directory and then read.
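
A minimal sketch of the read-side lookup described above, not a
definitive implementation. The helper names is_directory() and
bisect_state_path() are hypothetical and are not existing Git functions;
real code would go through Git's own path helpers such as the
GIT_PATH_FUNC definitions shown in the diff. For brevity the sketch only
falls back to the legacy location when the "bisect-state" directory is
missing; it does not perform the migration step.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if `path` exists and is a directory. */
static int is_directory(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/*
 * Hypothetical helper, not part of Git: build the path of one bisect
 * state file. Prefer "<git_dir>/bisect-state/<new_name>" when the
 * directory exists, otherwise fall back to "<git_dir>/<legacy_name>".
 * Returns 0 on success, -1 if a buffer is too small.
 */
int bisect_state_path(char *out, size_t outlen, const char *git_dir,
		      const char *legacy_name, const char *new_name)
{
	char dir[4096];
	int n;

	assert(out && git_dir && legacy_name && new_name);

	n = snprintf(dir, sizeof(dir), "%s/bisect-state", git_dir);
	if (n < 0 || (size_t)n >= sizeof(dir))
		return -1;

	if (is_directory(dir))
		n = snprintf(out, outlen, "%s/%s", dir, new_name);
	else
		n = snprintf(out, outlen, "%s/%s", git_dir, legacy_name);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A caller would pass the repository's Git directory together with both
names, for example bisect_state_path(buf, sizeof(buf), ".git",
"BISECT_LOG", "log"), and then open whichever path comes back.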

> > -- Challenges with Backward Compatibility --
> > --------------------------------------------
> >
> > I plan to implement a migration mechanism that will detect whether the
> > legacy stat file like BISECT_* are present or not during the
>
> Maybe: s/stat file/status files/
>
Oops, I meant the state files used to store the bisect state, which can be
found in the .git directory.

> > initialization of git bisect.
>
> What do you mean by "initialization of git bisect". Does that take
> into account the fact that users could upgrade while they are
> bisecting? What if they switch back to an older version after
> upgrading?
>

"Initialization" for now is an abstract concept for me in my mind,
which I think is the phase right after the user runs the git bisect
command and before the actual execution of git bisect. But thinking
about the scenario where a user upgrades while bisecting, I think that
won't be enough. So do you think we can run this migration step every
time the bisect state files are accessed? Also, to address the issue of
switching back to an older version, we would not delete the older files.
Instead, we would end up maintaining the state files in two places: one
in the .git directory and the other in .git/bisect-state.
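
A minimal sketch of that dual-write idea, under stated assumptions:
write_file() and write_bisect_state_both() are hypothetical helpers, not
Git APIs, and real code would use Git's own file and path routines. It
writes the same content to the new location and to the legacy one so
that an older Git version still finds its files.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: write `content` to `path`, replacing old content. */
static int write_file(const char *path, const char *content)
{
	FILE *fp = fopen(path, "w");

	if (!fp)
		return -1;
	if (fputs(content, fp) == EOF) {
		fclose(fp);
		return -1;
	}
	return fclose(fp) == 0 ? 0 : -1;
}

/*
 * Hypothetical helper, not part of Git: store one piece of bisect state
 * in both "<git_dir>/bisect-state/<new_name>" and the legacy
 * "<git_dir>/<legacy_name>". Returns 0 on success, -1 on error.
 */
int write_bisect_state_both(const char *git_dir, const char *legacy_name,
			    const char *new_name, const char *content)
{
	char path[4096];
	int n;

	assert(git_dir && legacy_name && new_name && content);

	/* Create the new directory if it does not exist yet. */
	n = snprintf(path, sizeof(path), "%s/bisect-state", git_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (mkdir(path, 0777) < 0 && errno != EEXIST)
		return -1;

	/* New location first. */
	n = snprintf(path, sizeof(path), "%s/bisect-state/%s", git_dir, new_name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (write_file(path, content) < 0)
		return -1;

	/* Legacy location, kept so that older Git versions still work. */
	n = snprintf(path, sizeof(path), "%s/%s", git_dir, legacy_name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	return write_file(path, content);
}
```

The two writes are not atomic as a pair, so a crash between them could
leave the two copies out of sync; deciding which copy wins in that case
is part of the compatibility question raised above.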

And do you have a solution for when we will be able to completely
remove the state files in the .git directory?

> Do you plan to always check for any "BISECT_*" file or would a check
> for the "bisect-state" first would help?
>

I think checking for bisect-state will be better because we will be maintaining
the files in both the directories.

> > This migration phase will automatically
> > move these files to the new “.git/bisect-state” directory, ensuring a
> > seamless transition for users while maintaining the compatibility with
> > existing scripts and tools. The biggest challenge is not to let the
> > existing functionality break.
> >
> >
> > -- Strategies for Handling Backward Compatibility --
> > ---------------------------------------------------------
> >
> > 1. Before implementing any changes, a good understanding of how
> > git-bisect works and how it is structured is very important. So, I
> > will thoroughly analyze the existing usage patterns of git-bisect and
> > list down all the potential areas of impact.
> >
> > 2. To validate the codebase changes and ensure the stability and the
> > backward compatibility of git-bisect command. I will write unit tests
> > for all the required changes.
>
> Do you mean unit tests using the new unit test framework in C. Could
> you show an example?
>

I am not sure what the standard way of writing unit tests in Git is.
Could you tell me a bit about this?

> > 3. I will also try to create some real scenarios where I will be
> > manually testing the desired behavior.
> >
> > 4. We might also need some regression tests to test some of the
> > functionality which can't be tested using unit tests.
>
> In which test script would you add them?

I will be writing these in bash and will add them to a new test file
inside the git/t directory, or maybe to the "t/t6030-bisect-porcelain.sh"
file. I will try mocking a scenario where the BISECT_* files are present
in the .git directory, then run git bisect from the command line and
check the expected results.

>
> > Here is a list of a few tests which I am able to think of as of now:
> >
> > 1. Verify correct relocation of state files during initialization.
> > 2. Test compatibility with existing bisect scripts and tools.
> > 3. Validate handling of edge cases, such as empty or corrupted state
> > files, to prevent unexpected behavior
> >
> > I am pretty sure I am still missing some of the problems which I may
> > face in the future but I will figure them out too when they arrive.
> >
> > -- Timeline --
> > ------------------
> >
> > I will be starting the project as soon as I receive the confirmation
> > of my proposal getting selected for GSoC.
> >
> > Here is a list of phases during the GSoC period for this project.
> > While these are just the approximations, the actual timeline may vary
> > because of the review iterations or unpredictable scenarios.
> >
> > 1. Understanding the existing implementation of Bisect command
> >     - This includes understanding how to test the changes locally
> > which are made in the bisect command efficiently
> >     - Get more familiar with the style of writing the code
> >
> > 2. Start working on the task to add the directory
> >     - Updating all the references wherever these bisect files are used
> >     - Writing/Updating tests to test all the changes made
> >
> > 3. Start with supporting backward compatibility
> >     - For this phase what I am thinking is of adding a stage just
> > after the initialization of git bisect command which will detect if
> > there exist some files by the name BISECT_* and migrate all those
> > files into the “.git/bisect” directory.
> >
> > 4. Adding Tests
> >     - Through testing for backward compatibility is very very
> > important so that the changes don’t break the existing changes.
> >
> > 5. Documenting the changes
> >     - This might also go side by side if mentors need some weekly
> > documentation of the tasks done.
>
> Tests and documentation should be part of the patches that change the
> behavior. So it doesn't really make sense to list them separately in
> the timeline.

Okay. Maybe a blog post could be a part of this?

>
> > -- Availability --
> > ----------------------
> >
> > I will be able to contribute 2-3 hours on weekdays and minimum 6 hours
> > on weekends. Which gives me around 22-27 hours per week to work on the
> > project. I will work till a minimum span of 14 weeks (can be extended)
> > during the GSoC. I will try my best to keep this time commitment and
> > be always available through the community’s mailing list. Also, I have
> > exams in the first week of June so I will have limited availability
> > during that period. The project might take somewhere around 175-350
> > hours.
>
> Thanks for these details related to your availability. As this is your
> final year, when are you expecting to graduate?

I will be graduating in July this year.

>
> > -- After GSoC --
> > ---------------------
> >
> > After GSoC I intend to be a part of the community and keep
> > contributing to the git’s codebase. I eagerly want to learn a lot from
> > the git community and enhance my skills. I also see myself making
> > important contributions to git in the future.
> >
> > When I first joined the community 2 months ago, the ancient way of
> > collaborating through a mailing list by sending diff patches was
> > really puzzling (GitHub was the only means that I knew about for
> > open-source collaboration). After sending a patch I got a little
> > comfortable with it and started loving it.
> >
> > -- Closing remarks --
> > ----------------------------
> >
> > I am really enthusiastic to contribute to git. I really want to learn
> > a lot from my mentors and the whole git community. Everyone
> > contributes using git, why not contribute to git :).
> >
> > In the end I really want to thank the whole community, especially
> > Christian and Junio, for their valuable time in reviewing my patches
> > and guiding me with the problems I faced.
>
> Thanks for your proposal. Please make sure you submit it soon as a pdf
> file to the GSoC website. It can then be updated by uploading a new
> pdf until the April 2 1800 UTC deadline.

Okay. Sure, I will share an updated proposal as soon as possible.


* Re: [GSoC][PROPOSAL v1] Refactor git-bisect(1) to make its state self-contained
  2024-03-21 12:48 ` Patrick Steinhardt
@ 2024-03-21 13:27   ` Aryan Gupta
  2024-03-21 14:39     ` Patrick Steinhardt
  0 siblings, 1 reply; 9+ messages in thread
From: Aryan Gupta @ 2024-03-21 13:27 UTC (permalink / raw
  To: Patrick Steinhardt; +Cc: git, karthik nayak, Christian Couder, Junio C Hamano

On Thu, Mar 21, 2024 at 1:48 PM Patrick Steinhardt <ps@pks.im> wrote:
>
> On Sat, Mar 16, 2024 at 07:57:49PM +0100, Aryan Gupta wrote:
> [snip]
> > -- Plan --
> > ----------------
> >
> > What is Git Bisect?
> >
> > Git Bisect is a git command which helps to find which commit
> > introduced the bug into the codebase at a faster rate by leveraging
> > the power of binary search.
> >
> >
> > The project idea is relatively easy to understand by the description
> > itself. Git bisect stores some state files such as BISECT_START,
> > BISECT_FIRST_PARENT etc which looks very similar to the naming
> > convention used for creating the refs file. Due to this similar naming
> > convention it sometimes causes unexpected results when these state
> > files are confused as the ref files.
> >
> > In order to fix this problem we can create a new ".git/bisect-state"
> > directory and store all the state files which belong to the
> > git-bisect(1) command in that directory as proposed by Patrick and
> > store all the files as follows.
> >
> >
> > - BISECT_TERMS -> bisect-state/terms
> > - BISECT_LOG -> bisect-state/log
> > - BISECT_START -> bisect-state/start
> > - BISECT_RUN -> bisect-state/run
> > - BISECT_FIRST_PARENT -> bisect-state/first-parent
> > - BISECT_ANCESTORS_OK -> bisect-state/ancestors-ok
> >
> > While the change looks very simple, it actually is. We will just
> > update all the paths from where these files are being accessed. When I
> > went through the code of bisect.c file I found that the path is pretty
> > easy to configure we just need to modify it at a few places. The main
> > problem is to efficiently handle the backward compatibility and
> > implement proper test cases to not let the existing things break. I
> > have already gone through the bisect.c file a couple of times just
> > reading and trying to understand all the functions and was able to
> > understand a lot of things about how it works.
>
> As you also mention further down in the section about backwards
> compatibility, the challenge of this project is not about doing those
> changes. That indeed is the trivial part of this whole project.
>
Yes.

> The real challenge is figuring out with the community how to ensure that
> the change is indeed backwards compatible, and that does not only
> involve backwards compatibility with Git itself, but also with other,
> third party tools. The biggest question will be whether the refactoring
> is ultimately going to be worth it in the bigger picture, or whether we
> should really just leave it be.
>
Yeah. As mentioned in my previous email, I am also afraid that we will
end up managing the bisect files in two places in order to ensure
backward compatibility.

> So personally, I rather see the biggest part of this project to find
> good middle ground with the community. It's thus of a more "political"
> nature overall.
>
Yes, I think we can figure out together as a community what should be
done to address this problem.

> I don't mean to discourage you with this. I just want to state up front
> where you should expect difficulties so that you don't underestimate the
> difficulty of this project overall. It could very well happen that the
> whole idea gets shot down by the community in case we figure out that it
> is simply too risky and/or not worth it in the long run.
>
Yes, it's true.

> If you want to stick with this idea then I would strongly recommend that
> you mention this in your proposal.
>
I am open to changing to another idea; I won't mind that. My ultimate
aim is to add something (even if it's a small patch) to git, so if this
project would never be used, I am willing to change it. Let me know
what you think about it.

> Patrick


* Re: [GSoC][PROPOSAL v1] Refactor git-bisect(1) to make its state self-contained
  2024-03-21 13:27   ` Aryan Gupta
@ 2024-03-21 14:39     ` Patrick Steinhardt
  0 siblings, 0 replies; 9+ messages in thread
From: Patrick Steinhardt @ 2024-03-21 14:39 UTC (permalink / raw
  To: Aryan Gupta; +Cc: git, karthik nayak, Christian Couder, Junio C Hamano

On Thu, Mar 21, 2024 at 02:27:56PM +0100, Aryan Gupta wrote:
> On Thu, Mar 21, 2024 at 1:48 PM Patrick Steinhardt <ps@pks.im> wrote:
[snip]
> > If you want to stick with this idea then I would strongly recommend that
> > you mention this in your proposal.
> >
> I am open to changing to another idea, I won't mind that. Because my
> ultimate aim to add something (even it's a small patch) to git and if
> this project will never be in use. I am willing to change it. Let me know
> what you think about it.

I'll leave it up to you to pick the project you're most interested in. I
only want to ensure that you have as much information as required to do
a proper assessment of the different proposed projects. So if you want
to stick to the idea then that's fine. If you want to switch now, then
that is fine, too.

Patrick


* Re: [GSoC][PROPOSAL v1] Refactor git-bisect(1) to make its state self-contained
  2024-03-20 11:06 ` Christian Couder
  2024-03-21 13:21   ` Aryan Gupta
@ 2024-03-23 16:37   ` Aryan Gupta
  1 sibling, 0 replies; 9+ messages in thread
From: Aryan Gupta @ 2024-03-23 16:37 UTC (permalink / raw
  To: Christian Couder
  Cc: git, Patrick Steinhardt [ ], karthik nayak, Junio C Hamano

Hello mentors

Here is the next version of my proposal. I had a problem changing the
subject of the mail while keeping the email in the same thread, so I am
sending it here.
Google doc link:
https://docs.google.com/document/d/1MRmGpJeUdrp1HfizyI7p1oSGVuOUQL2WEmV7x6YZ0Is/edit?usp=sharing

GSoC Proposal V2

*******************************************************
Refactor git-bisect(1) to make its state self-contained
******************************************************


*************
Personal Info
*************

Full name :     Aryan Gupta
E-mail    :     garyan447@gmail.com
Tel       :     (+91) 7018273899
Education :     Graphic Era University, Dehradun, India
Year      :     Final Year
Major     :     Computer Science and Engineering
GitHub    :     https://github.com/aryangupta701


********
About Me
********

I have been enthusiastic about open source from the very beginning of my
journey as a software developer. I’ve contributed to other open-source
projects and, though still a beginner, I’m generally familiar with the
process of contribution. The related experiences are all in the
contribution graph on my GitHub profile page[1]. In the ZAP Project[2]
community, I’ve made over 50 PRs[3]. I also participated in the Google
Summer of Code 2023 with the OWASP Foundation and successfully completed
it (Project Link [4]). I am still an active member of the ZAP extended
team. I have contributed to some small projects on GitHub as well.

I came to participate in the Git community this year (around February). I
got myself rather comfortable with the contribution process by writing,
replying, and auditing different sorts of patches in the community.

With the patches done so far, I’m getting more familiar with the Git
internals, project structures, commonly used APIs, test suites, required
tech stacks, and coding guidelines. To understand Git better, I
read and reread the documentation a few times, including
‘MyFirstContribution.txt’, ‘MyFirstObjectWalk.txt’, and ‘Hacking Git’. The
book Pro Git also helped me to understand the Git internals better.

Effective communication with mentors is a very important part of Google
Summer of Code Program. I know when to ask the right questions and I do
every possible thing I can do before asking any question. This trait of
mine always proves to be very useful and I believe it will be really
beneficial for me for completing this GSoC project as well.


***********
Before GSoC
***********

Synopsis (copied from the idea lists page)
*****************************************

I’m picking the project idea “Refactor git-bisect(1) to make its state
self-contained” from the SoC 2024 Ideas[5] page. This idea aims to make
git-bisect(1) self-contained. Its difficulty should be medium, and the
expected project size is somewhere between 175 and 350 hours.

The git-bisect(1) command is used to find a commit in a range of commits
that introduced a specific bug. Starting a bisection run creates a set of
state files into the Git repository which record various different
parameters like “.git/BISECT_START”. These files look almost like refs due
to their names being all-uppercase. This has led to confusion with the new
“reftable” backend because it wasn’t quite clear whether those files are
in fact refs or not.

As it turns out they are not refs and should never be treated like one.
Overall, it has been concluded that the way those files are currently
stored is not ideal. Instead of having a proliferation of files in the Git
directory, it was discussed whether the bisect state should be moved into
its own “bisect-state” subdirectory. This would make it more
self-contained and thereby avoid future confusion. It is also aligned with
the sequencer state used by rebases, which is neatly contained in the
“rebase-apply” and “rebase-merge” directories.

The goal of this project would be to realize this change. While
rearchitecting the layout should be comparatively easy to do, the harder
part will be to hash out how to handle backwards compatibility.

Benefits to Community
********************

By joining the community and working on this idea, I can collaborate with
my mentors and fellow community members to enhance the user experience for
those who use git-bisect(1). Additionally, creating a clear distinction
between the refs and the git-bisect state files through this project would
be highly beneficial for the community.

Why?
***

For me, the biggest question is always why I am doing something. I want
to contribute to Git in this Google Summer of Code because of what I
will learn while communicating with the mentors and the git community.
I want to learn as much as I can in this phase. I also want a kick
start for my journey of contributing to git in the long term. I chose
git mainly because I wanted to explore something that is used by almost
every software developer.

Microproject
***********

- Modernize a test script [6]
  - Status: merged into master (Hash Link [7])
  - Description: Modernize the formatting of the test script to align
    with current standards and improve its overall readability.
  - Remarks: This was my first patch in git, which helped me to
    understand a lot about how the process of submitting a patch and
    review works.

Other Patches
*************

Other than the required microproject, I’ve submitted a few other patches
when I stumbled upon potential modifications, these patches are:

- Zero Count Optimization [8]
  - Status: Work in progress for the second version
  - Description: Optimize for efficiency using trailing zeros for set
    bit iteration
  - Remarks: This was a great learning task. I learned how tiny things
    matter and small optimizations might make a big impact on
    performance.


*******
In GSoC
*******

Plan
****

What is Git Bisect?

Git Bisect is a git command that uses binary search to quickly find
the commit that introduced a bug into the codebase.


The project idea is relatively easy to understand from its description.
Git bisect stores some state files such as BISECT_START,
BISECT_FIRST_PARENT, etc., whose all-uppercase names look very similar
to the naming convention used for refs. This resemblance has caused
confusion, particularly with the introduction of the new “reftable”
backend, because it was not clear whether these files are in fact refs.
Hence, this project seeks to redefine how these files are stored within
the git repository.

In order to fix this problem we can create a new ".git/bisect-state"
directory, as proposed by Patrick (issue link [9]), and store all the
state files which belong to the git-bisect(1) command in that directory
as follows:



- BISECT_TERMS -> bisect-state/terms
- BISECT_LOG -> bisect-state/log
- BISECT_START -> bisect-state/start
- BISECT_RUN -> bisect-state/run
- BISECT_FIRST_PARENT -> bisect-state/first-parent
- BISECT_ANCESTORS_OK -> bisect-state/ancestors-ok

The change looks simple, and for the most part it is. We will just update
all the paths through which these files are accessed. When I went through
the code of “bisect.c” and “builtin/bisect.c”, I found that the paths are
easy to configure; we just need to modify them in a few places. These
paths are also used in other places, such as the test files, “wt-status.c”
and a few more. The main problem is to handle backward compatibility well
and to implement proper test cases so that existing behavior does not
break. I have already gone through bisect.c a couple of times, reading and
trying to understand all the functions, and was able to understand a lot
about how it works. Here is an example of the code changes I am talking
about.


diff --git a/bisect.c b/bisect.c
index 60aae2fe50..d615113c73 100644
--- a/bisect.c
+++ b/bisect.c
@@ -472,13 +472,13 @@ static int read_bisect_refs(void)
        return for_each_ref_in("refs/bisect/", register_ref, NULL);
 }

-static GIT_PATH_FUNC(git_path_bisect_names, "BISECT_NAMES")
-static GIT_PATH_FUNC(git_path_bisect_ancestors_ok, "BISECT_ANCESTORS_OK")
-static GIT_PATH_FUNC(git_path_bisect_run, "BISECT_RUN")
-static GIT_PATH_FUNC(git_path_bisect_start, "BISECT_START")
-static GIT_PATH_FUNC(git_path_bisect_log, "BISECT_LOG")
-static GIT_PATH_FUNC(git_path_bisect_terms, "BISECT_TERMS")
-static GIT_PATH_FUNC(git_path_bisect_first_parent, "BISECT_FIRST_PARENT")
+static GIT_PATH_FUNC(git_path_bisect_names, "bisect-state/BISECT_NAMES")
+static GIT_PATH_FUNC(git_path_bisect_ancestors_ok, "bisect-state/BISECT_ANCESTORS_OK")
+static GIT_PATH_FUNC(git_path_bisect_run, "bisect-state/BISECT_RUN")
+static GIT_PATH_FUNC(git_path_bisect_start, "bisect-state/BISECT_START")
+static GIT_PATH_FUNC(git_path_bisect_log, "bisect-state/BISECT_LOG")
+static GIT_PATH_FUNC(git_path_bisect_terms, "bisect-state/BISECT_TERMS")
+static GIT_PATH_FUNC(git_path_bisect_first_parent, "bisect-state/BISECT_FIRST_PARENT")


Note that this is just an example. I will not remove the existing paths;
instead I will add the new paths alongside them to keep both backward and
forward compatibility. Every write to a state file will go to both paths.
Every read will first check for the bisect-state directory: if it is
present, read from it; otherwise, first migrate the files from the .git
directory and then read.
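
The read side described above could look roughly like the following
sketch. It is only an illustration under stated assumptions: the helper
names copy_file() and bisect_state_read_path() are hypothetical, the
sketch uses POSIX calls directly instead of Git's internal path and file
helpers, and it checks for the individual file under bisect-state rather
than only for the directory.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy src to dst; returns 0 on success, -1 on error. */
static int copy_file(const char *src, const char *dst)
{
	FILE *in = fopen(src, "rb");
	FILE *out;
	int c, ret = 0;

	if (!in)
		return -1;
	out = fopen(dst, "wb");
	if (!out) {
		fclose(in);
		return -1;
	}
	while ((c = fgetc(in)) != EOF) {
		if (fputc(c, out) == EOF) {
			ret = -1;
			break;
		}
	}
	if (ferror(in))
		ret = -1;
	fclose(in);
	if (fclose(out) == EOF)
		ret = -1;
	return ret;
}

/*
 * Hypothetical helper, not part of Git. On success, returns 0 and
 * stores in `out` the path the caller should read. Prefers
 * <git_dir>/bisect-state/<new_name>; if that is missing but the legacy
 * <git_dir>/<legacy_name> exists, copies it over first (the legacy
 * file is kept). Returns -1 if neither file exists or on error.
 */
static int bisect_state_read_path(const char *git_dir,
				  const char *legacy_name,
				  const char *new_name,
				  char *out, size_t outlen)
{
	char legacy[4096], dir[4096];
	int n;

	assert(git_dir && legacy_name && new_name && out);

	n = snprintf(out, outlen, "%s/bisect-state/%s", git_dir, new_name);
	if (n < 0 || (size_t)n >= outlen)
		return -1;
	if (!access(out, F_OK))
		return 0;	/* already in the new location */

	n = snprintf(legacy, sizeof(legacy), "%s/%s", git_dir, legacy_name);
	if (n < 0 || (size_t)n >= sizeof(legacy) || access(legacy, F_OK))
		return -1;	/* nothing to migrate */

	n = snprintf(dir, sizeof(dir), "%s/bisect-state", git_dir);
	if (n < 0 || (size_t)n >= sizeof(dir))
		return -1;
	if (mkdir(dir, 0777) && errno != EEXIST)
		return -1;
	return copy_file(legacy, out);
}
```

Copying instead of renaming is what keeps an older Git version able to
continue the same bisection after a downgrade.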


***************************************
Challenges with Backward Compatibility
***************************************

I plan to implement a migration mechanism that will detect, on every
access to a git bisect state file, whether the new “bisect-state”
directory is present. If it is not, the migration will automatically copy
the legacy state files (BISECT_*) from the “.git” directory to the new
“.git/bisect-state” directory, ensuring a seamless transition for users
while maintaining compatibility with existing scripts and tools. The
biggest challenge is not to let the existing functionality break.

Consider the scenario where a user upgrades Git while bisecting. I think
migrating only at initialization won't be enough, so do you think we can
run this migration step every time the bisect state files are accessed?
Also, to address the issue of switching back to an older version, we
would not delete the older files. Instead, we would end up maintaining the
state files in two places: one in the .git directory and the other in
.git/bisect-state.

As a solution, I think we can start by managing the BISECT_* files in
both places. At a later stage, when the current version of Git is
deprecated, we can entirely remove the management of BISECT_* files in
the “.git” directory. We can also add a backward compatibility note to
the release notes to let users know about this change.
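
A rough sketch of what "managing the files in both places" could mean on
the write side is given below. The helper names write_file() and
bisect_state_write_both() are hypothetical and are not existing Git
functions, and the sketch deliberately uses only standard C and POSIX
calls rather than Git's internal file-writing helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Write `content` to `path`, replacing any previous content. */
static int write_file(const char *path, const char *content)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fputs(content, f) == EOF) {
		fclose(f);
		return -1;
	}
	return fclose(f) == EOF ? -1 : 0;
}

/*
 * Hypothetical helper, not part of Git: write one piece of bisect
 * state to both <git_dir>/<legacy_name> and
 * <git_dir>/bisect-state/<new_name>. Returns 0 on success, -1 on error.
 */
static int bisect_state_write_both(const char *git_dir,
				   const char *legacy_name,
				   const char *new_name,
				   const char *content)
{
	char path[4096];
	int n;

	assert(git_dir && legacy_name && new_name && content);

	/* Legacy location, still read by older Git versions. */
	n = snprintf(path, sizeof(path), "%s/%s", git_dir, legacy_name);
	if (n < 0 || (size_t)n >= sizeof(path) || write_file(path, content))
		return -1;

	/* New location; create the directory on first use. */
	n = snprintf(path, sizeof(path), "%s/bisect-state", git_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (mkdir(path, 0777) && errno != EEXIST)
		return -1;

	n = snprintf(path, sizeof(path), "%s/bisect-state/%s",
		     git_dir, new_name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	return write_file(path, content);
}
```

One design question this sketch leaves open is what to do when the first
write succeeds and the second fails, since the two copies would then
disagree.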

***********************************************
Strategies for Handling Backward Compatibility:
***********************************************

1. Before implementing any changes, a good understanding of how git-bisect
works and how it is structured is very important. So, I will thoroughly
analyze the existing usage patterns of git-bisect and list all the
potential areas of impact.

2. To validate the codebase changes and ensure the stability and backward
compatibility of the git-bisect command, I will write unit tests and
integration tests for all the required changes.

3. I will also try to create some real scenarios where I will manually
test the desired behavior.

4. We might also need some regression tests for functionality which can't
be covered by unit tests. I will write these as shell scripts and add them
either to a new test file inside the "t" directory or to
"t/t6030-bisect-porcelain.sh". I will try mocking a scenario where the
BISECT_* files are present in the .git directory, then run git bisect
from the command line and check the expected results.


Here is a list of a few tests which I am able to think of as of now:

1. Verify correct relocation of state files during initialization.
2. Test compatibility with existing bisect scripts and tools.
3. Validate handling of edge cases, such as empty or corrupted state
files, to prevent unexpected behavior.

I am pretty sure I am still missing some of the problems which I may face
in the future but I will figure them out too when they arrive.


********
Timeline
********

I will be starting the project as soon as I receive the confirmation of my
proposal getting selected for GSoC.


Here is a list of phases during the GSoC period for this project. These
are just approximations; the actual timeline may vary because of review
iterations or unpredictable scenarios.

1. Understanding the existing implementation of Bisect command
- This includes understanding how to efficiently test, locally, the
changes made to the bisect command
- Get more familiar with the style of writing the code

2. Start working on the task to add the directory
- Updating all the references wherever these bisect files are used
- Writing/Updating tests to test all the changes made

3. Start with supporting backward compatibility
- For this phase, I am thinking of adding a stage just after the
initialization of the git bisect command which will detect whether any
files named BISECT_* exist and migrate all of them into the
“.git/bisect-state” directory.

4. Adding Tests
- Thorough testing for backward compatibility is very important so that
the changes don’t break the existing behavior.


Availability

I will be able to contribute 2-3 hours on weekdays and a minimum of 6
hours on weekends, which gives me around 22-27 hours per week to work on
the project. I will work for a minimum of 14 weeks (can be extended)
during GSoC. I will try my best to keep this time commitment and always
be available through the community’s mailing list. Also, I have exams in
the first week of June, so I will have limited availability during that
period. The project might take somewhere around 175-350 hours. I have
taken all the factors into account, including my graduation month, which
is July 2024, when calculating the time I will be able to contribute.

After GSoC

After GSoC I intend to be a part of the community and keep contributing to
Git's codebase. I eagerly want to learn a lot from the Git community and
enhance my skills. I also see myself making important contributions to
Git in the future.

When I first joined the community 2 months ago, the ancient way of
collaborating through a mailing list by sending diff patches was really
puzzling (GitHub was the only means that I knew about for open-source
collaboration). After sending a patch I got a little comfortable with it
and started loving it.

Closing remarks

I am really enthusiastic to contribute to git. I really want to learn a
lot from my mentors and the whole git community. Everyone contributes
using git, why not contribute to git :).

In the end I really want to thank the whole community, especially
Christian and Junio, for their valuable time in reviewing my patches and
guiding me with the problems I faced.

References

[1]  GitHub Profile - https://github.com/aryangupta701
[2]  ZAP Project - https://github.com/zaproxy
[3]  Pull Requests -
https://github.com/search?q=org%3Azaproxy+type%3Apr+sort%3Aupdated-asc++author%3Aaryangupta701&type=pullrequests
[4]  GSoC 2023 Project -
https://www.zaproxy.org/blog/2023-09-11-browser-recorder/
[5]  SoC Ideas - https://git.github.io/SoC-2024-Ideas/
[6]  Migrate Test Script -
https://lore.kernel.org/git/pull.1675.v2.git.1709243831190.gitgitgadget@gmail.com/T/#t
[7]  Migrate Test Script Hash Link -
https://github.com/gitgitgadget/git/commit/06ac51898156b7bba0a42d36731285e
[8]  Zero Count Optimization -
https://lore.kernel.org/git/20240310162614.62691-2-garyan447@gmail.com/T/#u
[9]  Issue Raised - https://lore.kernel.org/git/Za-gF_Hp_lXViGWw@tanuki/
[10] Git References -
https://git-scm.com/book/en/v2/Git-Internals-Git-References
[11] Git bisect - https://git-scm.com/docs/git-bisect
[12] Reftables - https://git-scm.com/docs/reftable

Thanks & Regards,
Aryan
> Hi,
>
> On Sat, Mar 16, 2024 at 7:58 PM Aryan Gupta <garyan447@gmail.com> wrote:
> >
> > Hello everyone.
> >
> > I hope you are doing well. I am writing this email to get a review for
> > my GSoC proposal.
> >
> > Thanks in advance.
> >
> > Regards
> > Aryan
> >
> > Here it goes:
> >
> > -- Refactor git-bisect(1) to make its state self-contained --
> > =============================================================
> >
> > -- Personal Info --
> > ====================
> >
> > Full name :     Aryan Gupta
> > E-mail    :     garyan447@gmail.com
> > Tel       :     (+91) 7018273899
> > Education :     Graphic Era University, Dehradun, India
> > Year      :     Final Year
> > Major     :     Computer Science and Engineering
> > GitHub    :     [1]
> >
> > -- About Me --
> > ====================
> >
> > I have been enthusiastic about open source from the very beginning of
> > my journey as a software developer. I’ve contributed to other
> > open-source projects, though still a beginner, I’m generally familiar
> > with the process of contribution. The related experiences are all in
> > the contribution graph on my GitHub profile page [1]. In the ZAP
> > Project[2] community, I’ve made over 50 PRs [3]. I also participated
> > in the Google Summer of Code 2023 with the OWASP Foundation and
> > successfully completed it as well [4]. I have contributed to some
> > other small projects as well on GitHub.
>
> It's interesting to know that you have already participated in a GSoC.
> Is there a single blog post about this or more?
>
> > I came to participate in the Git community this year (around
> > February). I got myself rather comfortable with the contribution
> > process by writing, replying, and auditing different sorts of patches
> > in the community.
> >
> > With the patches done so far, I’m getting more familiar with the Git
> > internals, project structures, commonly used APIs, test suites,
> > required tech stacks, and coding guidelines. For understanding better
> > about Git, I read and reread the documentation a few times, including
> > ‘MyFirstContribution.txt’, ‘MyFirstObjectWalk.txt’, and ‘Hacking Git’.
> > The book Pro Git also helped me to understand the Git internals
> > better.
> >
> > Effective communication with mentors is a very important part of
> > Google Summer of Code Program. I know when to ask the right questions
> > and I do every possible thing I can do before asking any question.
> > This trait of mine always proved to be very useful and I believe it
> > will be really beneficial for me for the completion of this GSoC
> > project as well.
>
> Ok.
>
> > -- Before GSoC --
> > ====================
> >
> > -- Synopsis --
> > -----------------------
> >
> > I’m picking the project idea “Refactor git-bisect(1) to make its state
> > self-contained” from the SoC 2024 Ideas page [5]. This idea aims to
> > make git-bisect(1) self-contained. Its difficulty should be medium,
> > and the expected project size takes somewhere between 175 hours to 350
> > hours.
> >
> > The git-bisect(1) command is used to find a commit in a range of
> > commits that introduced a specific bug. Starting a bisection run
> > creates a set of state files into the Git repository which record
> > various different parameters like “.git/BISECT_START”. These files
> > look almost like refs due to their names being all-uppercase. This has
> > led to confusion with the new “reftable” backend because it wasn’t
> > quite clear whether those files are in fact refs or not.
> >
> > As it turns out they are not refs and should never be treated like
> > one. Overall, it has been concluded that the way those files are
> > currently stored is not ideal. Instead of having a proliferation of
> > files in the Git directory, it was discussed whether the bisect state
> > should be moved into its own “bisect-state” subdirectory. This would
> > make it more self-contained and thereby avoid future confusion. It is
> > also aligned with the sequencer state used by rebases, which is neatly
> > contained in the “rebase-apply” and “rebase-merge” directories.
> >
> > The goal of this project would be to realize this change. While
> > rearchitecting the layout should be comparatively easy to do, the
> > harder part will be to hash out how to handle backwards compatibility.
>
> It's Ok to copy the project description, or parts of it, from the idea
> page, but please say that you have done so. This way we can just skip
> this when we read your proposal.
>
> > -- Benefits to Community --
> > --------------------------------
> >
> > By joining the community and working on this idea, I can collaborate
> > with my mentors and fellow community members to enhance the user
> > experience for those who use git-bisect(1). Additionally, creating a
> > clear distinction between the refs and the git-bisect state files
> > through this project would be highly beneficial for the community.
> >
> > -- Why? --
> > -----------------
> >
> > For me, the biggest question is always why I am doing something. So to
> > answer why I want to contribute to Git in this Google Summer of Code
> > is the learning I will get while communicating with the mentors and
> > the git community. I want to learn as much as I could in this phrase.
> > Also, I want a kick start to begin my journey of contributing to git
> > for a long term. Why I chose git was mainly because I wanted to
> > explore something which is being used by almost every software
> > developer.
> >
> > -- Microproject --
> > ------------------------------
> >
> > - Modernize a test script [6]
> >   - Status: merged into master
>
> Please add a link to (or the hash of) the commit that merges your
> commit(s) into the master branch.
>
> >   - Description: Modernize the formatting of the test script to align
> > with current standards and improve its overall readability.
> >   - Remarks: This was my first patch in git which helped me to
> > understand a lot about how the process of submitting a patch and
> > review works.
> >
> > -- Other Patches --
> > ----------------------------
> >
> > Other than the required microproject, I’ve submitted a few other
> > patches when I stumbled upon potential modifications, these patches
> > are:
> >
> > - Zero Count Optimization [7]
> >   - Status: Under Review
>
> Looking at the thread, it seems to me that it has been reviewed and we
> are expecting a reroll from you with at least a better commit message.
>
> >   - Description: Optimize for efficiency using trailing zeros for set
> > bit iteration
> >   - Remarks: This was a great learning task. I learned how tiny things
> > matter and small optimizations might make a big impact on performance.
> >
> >
> > -- In GSoC --
> > ====================
> >
> > -- Plan --
> > ----------------
> >
> > What is Git Bisect?
> >
> > Git Bisect is a git command which helps to find which commit
> > introduced the bug into the codebase at a faster rate by leveraging
> > the power of binary search.
> >
> >
> > The project idea is relatively easy to understand by the description
> > itself. Git bisect stores some state files such as BISECT_START,
> > BISECT_FIRST_PARENT etc which looks very similar to the naming
> > convention used for creating the refs file. Due to this similar naming
> > convention it sometimes causes unexpected results when these state
> > files are confused as the ref files.
>
> I don't think there are unexpected results. I think it's mostly a
> clarification, standardization and cleanup issue.
>
> > In order to fix this problem we can create a new ".git/bisect-state"
> > directory and store all the state files which belong to the
> > git-bisect(1) command in that directory as proposed by Patrick and
> > store all the files as follows.
> >
> >
> > - BISECT_TERMS -> bisect-state/terms
> > - BISECT_LOG -> bisect-state/log
> > - BISECT_START -> bisect-state/start
> > - BISECT_RUN -> bisect-state/run
> > - BISECT_FIRST_PARENT -> bisect-state/first-parent
> > - BISECT_ANCESTORS_OK -> bisect-state/ancestors-ok
>
> Yeah, please link to Patrick's email where he suggests this. And
> please clarify what is quoted verbatim from his email. It is Ok to
> copy paste from existing emails, but it helps reviewing proposals if
> you are clear about this as we can then skip those parts.
>
> > While the change looks very simple, it actually is. We will just
> > update all the paths from where these files are being accessed. When I
> > went through the code of bisect.c file I found that the path is pretty
> > easy to configure we just need to modify it at a few places. The main
> > problem is to efficiently handle the backward compatibility and
> > implement proper test cases to not let the existing things break. I
> > have already gone through the bisect.c file a couple of times just
> > reading and trying to understand all the functions and was able to
> > understand a lot of things about how it works.
>
> There is "bisect.c" and there is "builtin/bisect.c". Please make it
> clearer which file(s) you are talking about. Also code examples of how
> you would update things could be useful.
>
> > -- Challenges with Backward Compatibility --
> > --------------------------------------------
> >
> > I plan to implement a migration mechanism that will detect whether the
> > legacy stat file like BISECT_* are present or not during the
>
> Maybe: s/stat file/status files/
>
> > initialization of git bisect.
>
> What do you mean by "initialization of git bisect". Does that take
> into account the fact that users could upgrade while they are
> bisecting? What if they switch back to an older version after
> upgrading?
>
> Do you plan to always check for any "BISECT_*" file or would a check
> for the "bisect-state" first would help?
>
> > This migration phase will automatically
> > move these files to the new “.git/bisect-state” directory, ensuring a
> > seamless transition for users while maintaining the compatibility with
> > existing scripts and tools. The biggest challenge is not to let the
> > existing functionality break.
> >
> >
> > -- Strategies for Handling Backward Compatibility --
> > ---------------------------------------------------------
> >
> > 1. Before implementing any changes, a good understanding of how
> > git-bisect works and how it is structured is very important. So, I
> > will thoroughly analyze the existing usage patterns of git-bisect and
> > list down all the potential areas of impact.
> >
> > 2. To validate the codebase changes and ensure the stability and the
> > backward compatibility of git-bisect command. I will write unit tests
> > for all the required changes.
>
> Do you mean unit tests using the new unit test framework in C. Could
> you show an example?
>
> > 3. I will also try to create some real scenarios where I will be
> > manually testing the desired behavior.
> >
> > 4. We might also need some regression tests to test some of the
> > functionality which can't be tested using unit tests.
>
> In which test script would you add them?
>
> > Here is a list of a few tests which I am able to think of as of now:
> >
> > 1. Verify correct relocation of state files during initialization.
> > 2. Test compatibility with existing bisect scripts and tools.
> > 3. Validate handling of edge cases, such as empty or corrupted state
> > files, to prevent unexpected behavior
> >
> > I am pretty sure I am still missing some of the problems which I may
> > face in the future but I will figure them out too when they arrive.
> >
> > -- Timeline --
> > ------------------
> >
> > I will be starting the project as soon as I receive the confirmation
> > of my proposal getting selected for GSoC.
> >
> > Here is a list of phases during the GSoC period for this project.
> > While these are just the approximations, the actual timeline may vary
> > because of the review iterations or unpredictable scenarios.
> >
> > 1. Understanding the existing implementation of Bisect command
> >     - This includes understanding how to test the changes locally
> > which are made in the bisect command efficiently
> >     - Get more familiar with the style of writing the code
> >
> > 2. Start working on the task to add the directory
> >     - Updating all the references wherever these bisect files are used
> >     - Writing/Updating tests to test all the changes made
> >
> > 3. Start with supporting backward compatibility
> >     - For this phase what I am thinking is of adding a stage just
> > after the initialization of git bisect command which will detect if
> > there exist some files by the name BISECT_* and migrate all those
> > files into the “.git/bisect” directory.
> >
> > 4. Adding Tests
> >     - Through testing for backward compatibility is very very
> > important so that the changes don’t break the existing changes.
> >
> > 5. Documenting the changes
> >     - This might also go side by side if mentors need some weekly
> > documentation of the tasks done.
>
> Tests and documentation should be part of the patches that change the
> behavior. So it doesn't really make sense to list them separately in
> the timeline.
>
> > -- Availability --
> > ----------------------
> >
> > I will be able to contribute 2-3 hours on weekdays and minimum 6 hours
> > on weekends. Which gives me around 22-27 hours per week to work on the
> > project. I will work till a minimum span of 14 weeks (can be extended)
> > during the GSoC. I will try my best to keep this time commitment and
> > be always available through the community’s mailing list. Also, I have
> > exams in the first week of June so I will have limited availability
> > during that period. The project might take somewhere around 175-350
> > hours.
>
> Thanks for these details related to your availability. As this is your
> final year, when are you expecting to graduate?
>
> > -- After GSoC --
> > ---------------------
> >
> > After GSoC I intend to be a part of the community and keep
> > contributing to the git’s codebase. I eagerly want to learn a lot from
> > the git community and enhance my skills. I also see myself making
> > important contributions to git in the future.
> >
> > When I first joined the community 2 months ago, the ancient way of
> > collaborating through a mailing list by sending diff patches was
> > really puzzling (GitHub was the only means that I knew about for
> > open-source collaboration). After sending a patch I got a little
> > comfortable with it and started loving it.
> >
> > -- Closing remarks --
> > ----------------------------
> >
> > I am really enthusiastic to contribute to git. I really want to learn
> > a lot from my mentors and the whole git community. Everyone
> > contributes using git, why not contribute to git :).
> >
> > In the end I really want to thank the whole community, especially
> > Christian and Junio, for their valuable time in reviewing my patches
> > and guiding me with the problems I faced.
>
> Thanks for your proposal. Please make sure you submit it soon as a pdf
> file to the GSoC website. It can then be updated by uploading a new
> pdf until the April 2 1800 UTC deadline.

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [GSoC][PROPOSAL v1] Refactor git-bisect(1) to make its state self-contained
  2024-03-21 13:21   ` Aryan Gupta
@ 2024-03-26 10:29     ` Christian Couder
  2024-03-27  0:23       ` Aryan Gupta
  0 siblings, 1 reply; 9+ messages in thread
From: Christian Couder @ 2024-03-26 10:29 UTC (permalink / raw
  To: Aryan Gupta; +Cc: git, Patrick Steinhardt [ ], karthik nayak, Junio C Hamano

Hi Aryan,

On Thu, Mar 21, 2024 at 2:21 PM Aryan Gupta <garyan447@gmail.com> wrote:

> On Wed, Mar 20, 2024 at 12:06 PM Christian Couder
> <christian.couder@gmail.com> wrote:

> > On Sat, Mar 16, 2024 at 7:58 PM Aryan Gupta <garyan447@gmail.com> wrote:

> > > -- About Me --
> > > ====================
> > >
> > > I have been enthusiastic about open source from the very beginning of
> > > my journey as a software developer. I’ve contributed to other
> > > open-source projects, though still a beginner, I’m generally familiar
> > > with the process of contribution. The related experiences are all in
> > > the contribution graph on my GitHub profile page [1]. In the ZAP
> > > Project[2] community, I’ve made over 50 PRs [3]. I also participated
> > > in the Google Summer of Code 2023 with the OWASP Foundation and
> > > successfully completed it as well [4]. I have contributed to some
> > > other small projects as well on GitHub.
> >
> > It's interesting to know that you have already participated in a GSoC.
> > Is there a single blog post about this or more?
>
> Here it is: https://www.zaproxy.org/blog/2023-09-11-browser-recorder/

I saw that there is one blog post, but I wanted to ask if there are
more blog posts. That's because we ask GSoC contributors to post on
their blog at least every 2 weeks and if possible every week.

> > > -- Strategies for Handling Backward Compatibility --
> > > ---------------------------------------------------------
> > >
> > > 1. Before implementing any changes, a good understanding of how
> > > git-bisect works and how it is structured is very important. So, I
> > > will thoroughly analyze the existing usage patterns of git-bisect and
> > > list down all the potential areas of impact.
> > >
> > > 2. To validate the codebase changes and ensure the stability and the
> > > backward compatibility of git-bisect command. I will write unit tests
> > > for all the required changes.
> >
> > Do you mean unit tests using the new unit test framework in C. Could
> > you show an example?
> >
>
> Not sure about what is the standard way that git uses for writing unit tests.
> Could you tell me a bit about this?

I was just asking if you planned to use the new unit test framework in
C. The "Move existing tests to a unit testing framework" project that
we propose on https://git.github.io/SoC-2024-Ideas/ is about moving
some unit tests to this new unit test framework. Please take a look at
that project to get more information about this subject.

> > > 3. I will also try to create some real scenarios where I will be
> > > manually testing the desired behavior.
> > >
> > > 4. We might also need some regression tests to test some of the
> > > functionality which can't be tested using unit tests.
> >
> > In which test script would you add them?
>
> I will be writing these in bash and will be adding it in a new test case file
> inside the git/t directory or maybe in "git\t\t6030-bisect-porcelain.sh"

It's better to always use the Unix notation for paths like
"t/t6030-bisect-porcelain.sh" starting at the root of the repo, rather
than mixing the Unix and Windows notations and adding "git/" before
the root.

> file. I will try mocking a scenario where we have the BISECT_* files available
> in the .git directory and then run git bisect using the command line and then
> check the expected results.

The t/*.sh test scripts are the right place for end-to-end tests
(sometimes called "black box" tests), but not for unit tests that
would test some C functions. Both unit tests and end-to-end tests
could be regression tests or performance tests.

> > > 4. Adding Tests
> > >     - Through testing for backward compatibility is very very
> > > important so that the changes don’t break the existing changes.
> > >
> > > 5. Documenting the changes
> > >     - This might also go side by side if mentors need some weekly
> > > documentation of the tasks done.
> >
> > Tests and documentation should be part of the patches that change the
> > behavior. So it doesn't really make sense to list them separately in
> > the timeline.
>
> Okay. Maybe a blog post could be a part of this?

We ask GSoC contributors to post weekly on their blog, so except for a
final blog post, most of the blogging should be also part of the
regular work.

Thanks for updating your proposal.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [GSoC][PROPOSAL v1] Refactor git-bisect(1) to make its state self-contained
  2024-03-26 10:29     ` Christian Couder
@ 2024-03-27  0:23       ` Aryan Gupta
  0 siblings, 0 replies; 9+ messages in thread
From: Aryan Gupta @ 2024-03-27  0:23 UTC (permalink / raw
  To: Christian Couder
  Cc: git, Patrick Steinhardt [ ], karthik nayak, Junio C Hamano

Hey Christian

Thank you for the response.

On Tue, Mar 26, 2024 at 11:30 AM Christian Couder
<christian.couder@gmail.com> wrote:
>
> Hi Aryan,
>
> On Thu, Mar 21, 2024 at 2:21 PM Aryan Gupta <garyan447@gmail.com> wrote:
>
> > On Wed, Mar 20, 2024 at 12:06 PM Christian Couder
> > <christian.couder@gmail.com> wrote:
>
> > > On Sat, Mar 16, 2024 at 7:58 PM Aryan Gupta <garyan447@gmail.com> wrote:
>
> > > > -- About Me --
> > > > ====================
> > > >
> > > > I have been enthusiastic about open source from the very beginning of
> > > > my journey as a software developer. I’ve contributed to other
> > > > open-source projects, though still a beginner, I’m generally familiar
> > > > with the process of contribution. The related experiences are all in
> > > > the contribution graph on my GitHub profile page [1]. In the ZAP
> > > > Project[2] community, I’ve made over 50 PRs [3]. I also participated
> > > > in the Google Summer of Code 2023 with the OWASP Foundation and
> > > > successfully completed it as well [4]. I have contributed to some
> > > > other small projects as well on GitHub.
> > >
> > > It's interesting to know that you have already participated in a GSoC.
> > > Is there a single blog post about this or more?
> >
> > Here it is: https://www.zaproxy.org/blog/2023-09-11-browser-recorder/
>
> I saw that there is one blog post, but I wanted to ask if there are
> more blog posts. That's because we ask GSoC contributors to post on
> their blog at least every 2 weeks and if possible every week.
>
Oh okay. No, the org didn't have any such requirement. We usually had a
Google meeting once a week where we discussed the week's progress and
the next week's plan. Yes, I can surely write a blog post every week or
every two weeks, whichever is preferred, if I contribute to Git.

> > > > -- Strategies for Handling Backward Compatibility --
> > > > ---------------------------------------------------------
> > > >
> > > > 1. Before implementing any changes, a good understanding of how
> > > > git-bisect works and how it is structured is very important. So, I
> > > > will thoroughly analyze the existing usage patterns of git-bisect and
> > > > list down all the potential areas of impact.
> > > >
> > > > 2. To validate the codebase changes and ensure the stability and
> > > > backward compatibility of the git-bisect command, I will write
> > > > unit tests for all the required changes.
> > >
> > > Do you mean unit tests using the new unit test framework in C? Could
> > > you show an example?
> > >
> >
> > I am not sure what the standard way of writing unit tests in Git is.
> > Could you tell me a bit about this?
>
> I was just asking if you planned to use the new unit test framework in
> C. The "Move existing tests to a unit testing framework" project that
> we propose on https://git.github.io/SoC-2024-Ideas/ is about moving
> some unit tests to this new unit test framework. Please take a look at
> that project to get more information about this subject.
>
Got it. I read up on this, and yes, I can use the new unit test
framework for writing the unit tests. I was also thinking of a backup
plan: if the community later (after the proposal is accepted) does not
agree with the project idea, I can work on migrating the existing tests
instead. Since there are so many tests, I think I could work on a few
of them if this happens. Here is an example:

#include "test-lib.h"
#include "bisect.h"

static void test_backward_compatibility(void)
{
	/*
	 * Will try to do some bisect operations while keeping
	 * the new way of storing bisect state files. I am not
	 * sure yet how the code will look because I haven't
	 * gone through git-bisect much, but I will do it for
	 * sure.
	 */
}

int cmd_main(int argc, const char **argv)
{
	TEST(test_backward_compatibility(),
	     "Test backward compatibility of git-bisect");

	return test_done();
}
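
A minimal, self-contained sketch of the kind of check such a test
might contain is below. It uses plain assert() instead of the Git unit
test framework so that it compiles on its own. Everything in it is an
assumption made for illustration: the "bisect-state" directory name,
the bisect_state_path() helper and the has_new_dir flag do not exist
in Git; only the BISECT_LOG file name is taken from the current layout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical directory for self-contained bisect state (not a real Git path). */
#define NEW_STATE_DIR "bisect-state"

/*
 * Hypothetical helper: build the path of a bisect state file such as
 * "BISECT_LOG". When has_new_dir is nonzero the file is placed in the
 * assumed new directory, otherwise at the legacy location directly
 * under git_dir. Returns 0 on success, -1 if buf is too small.
 */
static int bisect_state_path(const char *git_dir, const char *name,
			     int has_new_dir, char *buf, size_t len)
{
	int n;

	if (has_new_dir)
		n = snprintf(buf, len, "%s/%s/%s", git_dir, NEW_STATE_DIR, name);
	else
		n = snprintf(buf, len, "%s/%s", git_dir, name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Plain-assert stand-in for the checks a backward compatibility test might make. */
static void check_backward_compatibility(void)
{
	char buf[64];
	int ret;

	/* A repository that still uses the legacy layout keeps working. */
	ret = bisect_state_path(".git", "BISECT_LOG", 0, buf, sizeof(buf));
	assert(ret == 0);
	assert(!strcmp(buf, ".git/BISECT_LOG"));

	/* A repository with the assumed new layout uses the new directory. */
	ret = bisect_state_path(".git", "BISECT_LOG", 1, buf, sizeof(buf));
	assert(ret == 0);
	assert(!strcmp(buf, ".git/bisect-state/BISECT_LOG"));
}
```

Calling check_backward_compatibility() from a small main() is enough
to try it out. In a real patch the asserts would presumably become
checks from the unit test framework, and the lookup would go through
whatever function the refactoring ends up introducing.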


> > > > 3. I will also try to create some real scenarios where I will be
> > > > manually testing the desired behavior.
> > > >
> > > > 4. We might also need some regression tests to test some of the
> > > > functionality which can't be tested using unit tests.
> > >
> > > In which test script would you add them?
> >
> > I will be writing these in bash and will be adding it in a new test case file
> > inside the git/t directory or maybe in "git\t\t6030-bisect-porcelain.sh"
>
> It's better to always use the Unix notation for paths like
> "t/t6030-bisect-porcelain.sh" starting at the root of the repo, rather
> than mixing the Unix and Windows notations and adding "git/" before
> the root.

Okay.

>
> > file. I will try mocking a scenario where we have the BISECT_* files available
> > in the .git directory and then run git bisect using the command line and then
> > check the expected results.
>
> The t/*.sh test scripts are the right place for end-to-end tests
> (sometimes called "black box" tests), but not for unit tests that
> would test some C functions. Both unit tests and end-to-end tests
> could be regression tests or performance tests.
>
Okay

> > > > 4. Adding Tests
> > > >     - Thorough testing for backward compatibility is very
> > > > important so that the changes don’t break existing behavior.
> > > >
> > > > 5. Documenting the changes
> > > >     - This might also go side by side if mentors need some weekly
> > > > documentation of the tasks done.
> > >
> > > Tests and documentation should be part of the patches that change the
> > > behavior. So it doesn't really make sense to list them separately in
> > > the timeline.
> >
> > Okay. Maybe a blog post could be a part of this?
>
> We ask GSoC contributors to post weekly on their blog, so except for a
> final blog post, most of the blogging should be also part of the
> regular work.

Okay.

>
> Thanks for updating your proposal.


Thank you
Regards

Aryan Gupta


Thread overview: 9+ messages
2024-03-16 18:57 [GSoC][PROPOSAL v1] Refactor git-bisect(1) to make its state self-contained Aryan Gupta
2024-03-20 11:06 ` Christian Couder
2024-03-21 13:21   ` Aryan Gupta
2024-03-26 10:29     ` Christian Couder
2024-03-27  0:23       ` Aryan Gupta
2024-03-23 16:37   ` Aryan Gupta
2024-03-21 12:48 ` Patrick Steinhardt
2024-03-21 13:27   ` Aryan Gupta
2024-03-21 14:39     ` Patrick Steinhardt
