Understanding and Working with Submodules in Git
Modern software projects rely heavily on the work of other projects. If someone else has already written an excellent solution, reinventing the wheel in your own code would be a huge waste of time. This is why many projects use third-party code, such as libraries or modules.
Git, the most popular version control system in the world, provides an elegant and powerful way to manage these dependencies. Its concept of "submodules" allows us to include and manage third-party libraries while keeping them clearly separate from our own code.
This article explains why Git submodules are so useful, what exactly they are, and how they work.
Keep code separation
To illustrate why Git submodules are a valuable structure, let's look at a case without submodules. When you need to include third-party code, such as an open source library, you can take the easy way: just download the code from GitHub and put it somewhere in your project. Although this method is very fast, it is definitely not clean, for several reasons.
For one, by copying third-party code into your project, you are effectively mixing multiple projects into one. The line between your own project and someone else's (the library) begins to blur.
In addition, whenever you need to update the library code (because its maintainer has delivered a great new feature or fixed a serious bug), you have to download, copy and paste again. This quickly becomes a tedious process.
Of course, submodules are not the only solution to such problems. You can also use one of the "package manager" systems that many modern languages and frameworks provide. There is nothing wrong with doing this!
However, Git's submodule architecture has some advantages of its own.
The essence of Git submodules
Submodules in Git are really just standard Git repositories. There is no fancy innovation, just the same Git repositories we all know so well by now. This is also part of the power of submodules: they are robust and straightforward precisely because they are so unremarkable from a technical point of view, and so well tested.
The only thing that makes a Git repository a submodule is that it is located inside another, parent Git repository.
Other than that, a Git submodule is still a fully functional repository: you can do everything you already know from "normal" Git work, from modifying files to committing, pulling, and pushing. Everything is possible in a submodule.
Add a submodule
Let's take a classic example: suppose we want to add a third-party library to our project. Before we get any code, it makes sense to create a separate folder for this kind of content:
<code class="language-bash">$ mkdir lib
$ cd lib</code>
Now we are ready to import some third-party code into our project in an orderly manner, using submodules. Suppose we need a small "time zone converter" JavaScript library:
<code class="language-bash">$ git submodule add https://github.com/spencermountain/spacetime.git</code>
When we run this command, Git clones the repository into our project as a submodule:
<code>Cloning into 'carparts-website/lib/spacetime'...
remote: Enumerating objects: 7768, done.
remote: Counting objects: 100% (1066/1066), done.
remote: Compressing objects: 100% (445/445), done.
remote: Total 7768 (delta 615), reused 975 (delta 588), pack-reused 6702
Receiving objects: 100% (7768/7768), 4.02 MiB | 7.78 MiB/s, done.
Resolving deltas: 100% (5159/5159), done.</code>
If we look at our working copy folder, we can see that the library files have actually arrived in our project.
If we had just downloaded some files, thrown them into our project, and committed them like the rest of our files, they would have become part of the same Git repository. The submodule, however, ensures that the library files do not "leak" into the repository of our main project.
Let's see what else happened. A new .gitmodules file was created in the root folder of the main project. Here is its content:
<code>[submodule "lib/spacetime"]
    path = lib/spacetime
    url = https://github.com/spencermountain/spacetime.git</code>
This .gitmodules file is one of several places where Git keeps track of the submodules in our project. Another is .git/config, which now ends like this:
<code>[submodule "lib/spacetime"]
    url = https://github.com/spencermountain/spacetime.git
    active = true</code>
Finally, Git also keeps a copy of the .git repository of each submodule in the internal .git/modules folder.
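The three locations mentioned above can be inspected with a few commands. The following is a minimal sketch, not a definitive recipe: it assumes a throwaway directory and a local stand-in repository called library (both hypothetical, used instead of the spacetime repository so that the script runs without network access). The protocol.file.allow setting is only needed because the demo "remote" is a local path.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Throwaway workspace; all names below are hypothetical demo values.
work="$(mktemp -d)"
cd "$work"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Local stand-in for the third-party library repository.
git init -q library
git -C library commit -q --allow-empty -m "Initial library commit"

# Main project with the library added as a submodule.
git init -q main-project
cd main-project
# protocol.file.allow is only needed because the demo "remote" is a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/library" lib/library

# 1. The .gitmodules file in the project root.
modules_path="$(git config --file .gitmodules --get submodule.lib/library.path)"

# 2. The entry in the local .git/config.
config_url="$(git config --get submodule.lib/library.url)"

# 3. The submodule's own repository data under .git/modules.
if [ -d .git/modules/lib/library ]; then stored_repo=yes; else stored_repo=no; fi

echo "path in .gitmodules:        $modules_path"
echo "url in .git/config:         $config_url"
echo "repository in .git/modules: $stored_repo"
```

Reading the values through git config rather than opening the files by hand keeps the sketch in line with the advice not to touch the submodule configuration manually.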
All of these are technical details you don't have to remember. However, it may be helpful to understand that the internal maintenance of Git submodules is quite complex. That's why one thing is important to remember: don't modify the Git submodule configuration manually! If you want to move, delete or otherwise manipulate submodules, do yourself a favor and don't try this by hand. Use the appropriate Git commands or a Git desktop GUI like "Tower", which will handle these details for you.
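As an illustration of letting Git do the bookkeeping, here is a hedged sketch that moves and then removes a submodule with git mv and git rm. The repository names (library, main-project) and the new folder name lib/timelib are hypothetical, and the local stand-in repository exists only so that the script runs offline.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Throwaway workspace; all names below are hypothetical demo values.
work="$(mktemp -d)"
cd "$work"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Local stand-in for the third-party library repository.
git init -q library
git -C library commit -q --allow-empty -m "Initial library commit"

# Main project with the library committed as a submodule.
git init -q main-project
cd main-project
# protocol.file.allow is only needed because the demo "remote" is a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/library" lib/library
git commit -q -m "Add library as a submodule"

# Move the submodule: git mv also rewrites the path stored in .gitmodules.
git mv lib/library lib/timelib
moved_path="$(git config --file .gitmodules --get submodule.lib/library.path)"
git commit -q -m "Move the submodule"

# Remove the submodule: git rm deletes the working tree folder and the
# .gitmodules entry, and stages both changes.
git rm -q lib/timelib
git commit -q -m "Remove the submodule"
remaining="$(git config --file .gitmodules --get-regexp '^submodule\.' || true)"

echo "path after git mv: $moved_path"
echo "entries left in .gitmodules: ${remaining:-none}"
```

Note that git rm leaves the submodule's repository data under .git/modules in place, so removing a submodule this way does not delete its history from your local clone.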
Let's look at the status of the main project now that we have added the submodule:
<code class="language-bash">$ git status
On branch master
Changes to be committed:
  (use "git restore --staged &lt;file&gt;..." to unstage)
        new file:   .gitmodules
        new file:   lib/spacetime</code>
As you can see, Git treats adding a submodule like any other change. So we have to commit it like any other change:
<code class="language-bash">$ git commit -m "Add timezone converter library as a submodule"</code>
Clone a project containing Git submodules
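In the example above, we added a new submodule to an existing Git repository. But what about the other way around: what happens when you clone a repository that already contains submodules?
If we execute a normal git clone, Git downloads the main project, but the submodule folders stay empty until the submodules are initialized and updated, for example with git submodule update --init.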
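Both ways of cloning can be compared in a small, self-contained sketch. It assumes hypothetical local repositories (library, main-project) in a throwaway directory instead of a real remote, and it passes protocol.file.allow=always only because the submodule URL in this demo is a local path.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Throwaway workspace; all names below are hypothetical demo values.
work="$(mktemp -d)"
cd "$work"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Local stand-in for the third-party library, with one file in it.
git init -q library
echo "demo library" > library/README.md
git -C library add README.md
git -C library commit -q -m "Initial library commit"

# Main project that already contains the submodule.
git init -q main-project
cd main-project
git -c protocol.file.allow=always submodule --quiet add "$work/library" lib/library
git commit -q -m "Add library as a submodule"
cd "$work"

# 1. A plain clone creates the submodule folder but leaves it empty...
git clone -q main-project plain-clone
if [ -z "$(ls -A plain-clone/lib/library)" ]; then
  plain_state=empty
else
  plain_state=populated
fi

# ...until the submodules are initialized and updated.
git -C plain-clone -c protocol.file.allow=always \
  submodule --quiet update --init --recursive

# 2. Alternatively, clone and populate the submodules in a single step.
git -c protocol.file.allow=always clone -q --recurse-submodules \
  main-project full-clone

echo "submodule folder right after a plain clone: $plain_state"
ls plain-clone/lib/library full-clone/lib/library
```

The --recursive option of git submodule update also takes care of nested submodules, which is why it is often added by default.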
Checkout version
In "normal" Git repositories, we usually check out branches. By running git checkout with a branch name, we tell Git which branch is currently active, and as new commits are made on that branch, the branch pointer moves automatically to the latest commit. It's important to understand this, because Git submodules work differently!
In submodules, we always check out a specific version, not a branch! Even if you execute a command like git checkout main in a submodule, in the background the current latest commit on that branch is recorded, not the branch itself.
Of course, this behavior is not a mistake. Consider this: when you include a third-party library, you want full control over exactly which code is used in your main project. It's great when the library's maintainer releases a new version, but you don't necessarily want that new version to be used automatically in your project, because you don't know whether the new changes will break it!
If you want to find out which version your submodule is using, you can request this information in the main project:
<code class="language-bash">$ git submodule status
 ea703a7d557efd90ccae894db96368d750be93b6 lib/spacetime (6.16.3)</code>
This will return the version currently checked out by our lib/spacetime submodule. It also lets us know that this version is a tag called "6.16.3". It is common to use tags heavily when using Git submodules.
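To see that the parent project really records a commit rather than a branch, you can look at the tree entry for the submodule folder. The sketch below is an illustration under assumptions: the repositories (library, main-project) are hypothetical local stand-ins in a throwaway directory, used so that the script runs offline.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Throwaway workspace; all names below are hypothetical demo values.
work="$(mktemp -d)"
cd "$work"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Local stand-in for the third-party library repository.
git init -q library
git -C library commit -q --allow-empty -m "Initial library commit"

# Main project with the library committed as a submodule.
git init -q main-project
cd main-project
# protocol.file.allow is only needed because the demo "remote" is a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/library" lib/library
git commit -q -m "Add library as a submodule"

# The parent's tree entry for the submodule folder: mode 160000, type "commit".
entry="$(git ls-tree HEAD lib/library)"

# The commit hash recorded in the parent project...
recorded="$(git rev-parse HEAD:lib/library)"

# ...and the commit currently checked out inside the submodule.
checked_out="$(git -C lib/library rev-parse HEAD)"

echo "tree entry:  $entry"
echo "recorded:    $recorded"
echo "checked out: $checked_out"
```

When the two hashes match, git submodule status prints the hash without a leading + sign.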
Suppose you want your submodule to use an older version of the library, tagged "6.14.0". First, we have to change directories so that our Git commands are executed in the context of the submodule, not our main project. Then we can simply run git checkout with the tag name:
<code class="language-bash">$ cd lib/spacetime
$ git checkout 6.14.0</code>
If we now go back to our main project and execute git submodule status again, we will see our checkout:
<code class="language-bash">$ cd ../..
$ git submodule status
+&lt;SHA-1 hash of the 6.14.0 commit&gt; lib/spacetime (6.14.0)</code>
Look closely at the output: the + symbol before the SHA-1 hash tells us that the version checked out in the submodule differs from the version currently recorded in the parent repository. Since we just changed the checked-out version, this looks correct.
Calling git status in our main project now also informs us of this fact:
<code class="language-bash">$ git status
On branch master
Changes not staged for commit:
  (use "git add &lt;file&gt;..." to update what will be committed)
  (use "git restore &lt;file&gt;..." to discard changes in working directory)
        modified:   lib/spacetime (new commits)</code>
You can see that Git treats moving the submodule pointer like any other change: if we want to keep it, we have to commit it to the repository:
<code class="language-bash">$ git add lib/spacetime
$ git commit -m "Change checked-out revision of the spacetime submodule"
$ git push</code>
Update a Git submodule
In the steps above, we moved the submodule pointer ourselves: we were the ones who chose to check out a different version, commit it, and push it to our team's remote repository. But what if a colleague changed the submodule version, maybe because an interesting new version of the submodule was released and they decided to use it in our project (after thorough testing, of course)?
Let's execute a simple git pull in the main project, as we probably do quite often, to get new changes from the shared remote repository:
<code class="language-bash">$ git pull
...
 lib/spacetime | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)</code>
The penultimate line indicates that something in the submodule has changed. But let's take a closer look:
<code class="language-bash">$ git submodule status
+&lt;SHA-1 hash&gt; lib/spacetime (6.14.0)</code>
You surely remember that little + sign: it means that the submodule pointer has moved! To update our local checkout to the "official" version selected by our teammate, we can run the update command:
<code class="language-bash">$ git submodule update lib/spacetime</code>
Okay! Our submodule is now checked out at the version recorded in our main project repository!
Using Git submodule
We have covered the basic building blocks of working with Git submodules. The other workflows are quite standard!
For example, checking for new changes in a submodule works like in any other Git repository: you run git fetch inside the submodule repository and, if you do want to use the updates, you then run something like git pull origin main.
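Put together, fetching upstream changes in a submodule and recording the new version in the main project might look like the sketch below. It is an illustration under assumptions: the repositories (library, main-project) are hypothetical local stand-ins, the library's branch is named main, and --ff-only is added to git pull so that the demo never creates a merge commit.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Throwaway workspace; all names below are hypothetical demo values.
work="$(mktemp -d)"
cd "$work"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Local stand-in for the third-party library, on a branch called "main".
git init -q library
git -C library symbolic-ref HEAD refs/heads/main
git -C library commit -q --allow-empty -m "Initial library commit"

# Main project with the library committed as a submodule.
git init -q main-project
cd main-project
# protocol.file.allow is only needed because the demo "remote" is a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/library" lib/library
git commit -q -m "Add library as a submodule"
old_pointer="$(git rev-parse HEAD:lib/library)"

# Meanwhile, the library's maintainer publishes a new commit.
git -C "$work/library" commit -q --allow-empty -m "New library feature"

# Inside the submodule, it is ordinary Git: fetch, then pull if we want the update.
(
  cd lib/library
  git fetch -q origin
  git pull -q --ff-only origin main
)

# Back in the main project, the moved pointer is a change like any other.
git add lib/library
git commit -q -m "Update library submodule to the latest main"
new_pointer="$(git rev-parse HEAD:lib/library)"

echo "pointer before: $old_pointer"
echo "pointer after:  $new_pointer"
```

The final commit in the main project is what makes the new library version the "official" one for everyone else on the team.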
Making changes in a submodule may also be a use case for you, especially if you manage the library code yourself (because it is an internal library, not one from a third party). You can work with the submodule as you would with any other Git repository: you can make changes, commit them, push them, and so on.
Get the power of Git
Git has a lot of power under the hood. However, many of its advanced tools, such as Git submodules, are not well known. Many developers are missing out on a lot of powerful features, which is really a pity!
If you want to dig deeper into some other advanced Git techniques, I highly recommend the "Advanced Git Toolkit": a (free!) collection of short videos that introduces you to topics like Reflog, interactive rebase, cherry-picking and even branching strategies.
Have fun becoming a better developer!
Frequently Asked Questions about Git Submodules
What is a Git submodule? A Git submodule is a way to include another Git repository as a subdirectory of your own Git repository. It allows you to maintain a separate repository as a subproject within the main project.
Why use Git submodules? Git submodules are useful for incorporating external repositories into your project, especially if you want to keep their development history separate from that of the main project. This is very helpful for managing dependencies or including external libraries.
What information about a submodule is stored in the main project? The main project stores the submodule's URL and commit hash in a special entry in the parent repository. This allows anyone who clones the main project to clone the referenced submodules as well.
How do I clone a Git repository that contains submodules? When cloning a repository that contains submodules, you can initialize and clone the submodules automatically with the --recursive flag of the git clone command. Alternatively, you can run git submodule update --init after cloning.
Can I nest submodules? Yes, Git supports nested submodules, which means that a submodule can contain its own submodules. However, managing nested submodules can become complicated, and you must make sure that each submodule is properly initialized and updated.