Understanding and Working with Submodules in Git

Joseph Gordon-Levitt
2025-02-10 15:58:09

Modern software projects rarely stand alone: most of them build on the work of other projects. If someone else has already written an excellent solution, reinventing the wheel in your own code would be a huge waste of time. This is why many projects use third-party code, such as libraries or modules.

Git, the most popular version control system in the world, provides an elegant and powerful way to manage these dependencies. Its concept of "submodules" allows us to include and manage third-party libraries while keeping them clearly separate from our own code.

This article explains why Git submodules are so useful, what exactly they are, and how they work.

Key Points

  • Git submodules are a powerful and straightforward way to manage third-party libraries in a project and to keep them clearly separate from the main code base. They are standard Git repositories placed inside another, parent Git repository.
  • Adding a submodule typically starts with creating a separate folder, followed by running "git submodule add" with the URL of the desired library. This clones the repository into the project as a submodule and keeps it separate from the main project's repository.
  • When cloning a project that contains Git submodules, passing the "--recurse-submodules" option to "git clone" initializes and clones the submodules automatically. Without it, the submodule folders will be empty after cloning and need to be populated with "git submodule update --init --recursive".
  • In a Git submodule, a specific version is checked out, not a branch, which gives you complete control over exactly which code is used in the main project. To bring a submodule to the version recorded in the main project, run "git submodule update", followed by the submodule path.

Keeping code separate

To illustrate why Git submodules are valuable, let's look at a case without submodules. When you need to include third-party code, such as an open source library, you can take the easy way: just download the code from GitHub and put it somewhere in your project. Although this method is fast, it is unclean for two reasons:

  • By brute-force copying third-party code into your project, you are effectively mixing multiple projects into one. The line between your own project and someone else's (the library) begins to blur.
  • Whenever you need to update the library code (because its maintainer delivered a great new feature or fixed a serious bug), you have to download, copy and paste again. This quickly becomes a tedious process.

The general rule of "separating different things" in software development exists for good reason. This is especially true for managing third-party code in your own projects. Fortunately, Git's submodule concept is designed for exactly these situations.

Of course, submodules are not the only solution to this kind of problem. You can also use one of the various "package manager" systems that many modern languages and frameworks provide. There is nothing wrong with doing that!

However, Git's submodule architecture has a few advantages of its own:

  • Submodules provide a consistent and reliable interface, regardless of the language or framework you use. If you work with multiple technologies, each may have its own package manager with its own set of rules and commands. Submodules, on the other hand, always work the same way.
  • Not every piece of code is available through a package manager. Maybe you just want to share your own code between two projects; in this case, submodules may offer the simplest workflow.

    What a Git submodule really is

    Submodules in Git are really just standard Git repositories. There is no fancy innovation, just the same Git repositories we all know so well by now. This is also part of the power of submodules: they are so robust and straightforward because they are so "boring" from a technical point of view, and so well tested.

    The only thing that makes a Git repository a submodule is that it is placed inside another, parent Git repository.

    Other than that, the Git submodule is still a fully functional repository: you can do everything you already know from "normal" Git work - from modifying files to committing, pulling, and pushing. Everything in the submodule is possible.

    Add submodule

    Let's take the classic example: suppose we want to add a third-party library to our project. Before we fetch any code, it makes sense to create a separate folder to hold such content:

    <code class="language-bash">$ mkdir lib
    $ cd lib</code>
    Now we are ready to import some third-party code into our project using submodules in an orderly manner. Suppose we need a small "time zone converter" JavaScript library:

    <code class="language-bash">$ git submodule add https://github.com/spencermountain/spacetime.git</code>
    When we run this command, Git clones the repository into our project as a submodule:

    <code>Cloning into 'carparts-website/lib/spacetime'...
    remote: Enumerating objects: 7768, done.
    remote: Counting objects: 100% (1066/1066), done.
    remote: Compressing objects: 100% (445/445), done.
    remote: Total 7768 (delta 615), reused 975 (delta 588), pack-reused 6702
    Receiving objects: 100% (7768/7768), 4.02 MiB | 7.78 MiB/s, done.
    Resolving deltas: 100% (5159/5159), done.</code>
    If we look at our working copy folder, we can see that the library file has actually arrived in our project.

    You might ask, "What's the difference?" After all, the files of the third-party library are here, just as if we had copied and pasted them. The key difference is that they are contained in their own Git repository!

    If we had just downloaded some files, thrown them into our project and committed them, like the rest of our project's files, they would have become part of the same Git repository. The submodule, however, ensures that the library files do not "leak" into the repository of our main project.

    Let's see what else has happened: a new .gitmodules file was created in the root folder of the main project. Here is its content:

    <code>[submodule "lib/spacetime"]
      path = lib/spacetime
      url = https://github.com/spencermountain/spacetime.git</code>

    This .gitmodules file is one of several places where Git keeps track of the submodules in a project. Another is .git/config, which now ends like this:

    <code>[submodule "lib/spacetime"]
      url = https://github.com/spencermountain/spacetime.git
      active = true</code>

    Finally, Git also keeps a copy of the .git repository of each submodule in the internal .git/modules folder.
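The three locations mentioned above can be inspected in a throwaway setup. The following is a minimal sketch, not part of the original walkthrough: it assumes a small local repository (the hypothetical `library` folder) standing in for the real spacetime library, and it passes `protocol.file.allow=always` only because recent Git versions restrict submodule clones from local paths.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-in for the third-party library: a small local repository.
git init -q "$work/library"
echo "library code" > "$work/library/README"
git -C "$work/library" add README
git -C "$work/library" commit -q -m "Initial library commit"

# Main project with the library added as a submodule.
# protocol.file.allow=always is only needed because the "remote" is a local path.
git init -q "$work/main"
cd "$work/main"
git -c protocol.file.allow=always submodule --quiet add "$work/library" lib/spacetime

# 1. .gitmodules: versioned file that maps the submodule name to its path and URL.
path_in_gitmodules=$(git config -f .gitmodules --get submodule.lib/spacetime.path)
echo ".gitmodules path: $path_in_gitmodules"

# 2. .git/config: local, unversioned entry with the submodule URL.
url_in_config=$(git config --get submodule.lib/spacetime.url)
echo ".git/config url:  $url_in_config"

# 3. .git/modules: the submodule's own repository data lives here,
#    while lib/spacetime/.git is just a small file pointing to it.
ls -d .git/modules/lib/spacetime
cat lib/spacetime/.git
```

Reading the values through `git config` rather than opening the files by hand fits the advice that follows: inspect freely, but leave the editing to Git.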

    All of these are technical details you don't have to remember. However, it helps to understand that the internal bookkeeping of Git submodules is quite complex. That's why one thing is important to remember: don't mess with the Git submodule configuration manually! If you want to move, delete or otherwise manipulate a submodule, do yourself a favor and don't try it by hand. Use the appropriate Git commands or a desktop Git GUI like "Tower", which handles these details for you.
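As an illustration of "the appropriate Git commands", here is a hedged sketch of moving and then removing a submodule with `git mv`, `git submodule deinit` and `git rm`. The local `library` repository and the `vendor` folder are hypothetical names chosen for this example, and `protocol.file.allow=always` is only needed because the submodule is cloned from a local path.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-in for the third-party library.
git init -q "$work/library"
echo "library code" > "$work/library/README"
git -C "$work/library" add README
git -C "$work/library" commit -q -m "Initial library commit"

# Main project with the library added as a submodule.
git init -q "$work/main"
cd "$work/main"
git -c protocol.file.allow=always submodule --quiet add "$work/library" lib/spacetime
git commit -q -m "Add library as a submodule"

# Move: git mv relocates the folder and rewrites the path in .gitmodules.
mkdir vendor
git mv lib/spacetime vendor/spacetime
moved_path=$(git config -f .gitmodules --get submodule.lib/spacetime.path)
echo "path after move: $moved_path"
git commit -q -m "Move submodule to vendor/"

# Remove: deinit clears the local configuration and the working tree,
# git rm then drops the entry from the index and from .gitmodules.
git submodule --quiet deinit -f vendor/spacetime
git rm -q vendor/spacetime
git commit -q -m "Remove submodule"

# The repository data under .git/modules is still kept after the removal.
ls -d .git/modules/lib/spacetime
```

Note that the submodule keeps its original name (`lib/spacetime`) after the move; only its path changes.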

    Let's look at the status of the main project after adding the submodule:

    <code class="language-bash">$ git status
    On branch master
    Changes to be committed:
      (use "git restore --staged <file>..." to unstage)
      new file:   .gitmodules
      new file:   lib/spacetime</code>

    As you can see, Git treats adding a submodule as a change like any other. So we have to commit it like any other change:

    <code class="language-bash">$ git commit -m "Add timezone converter library as a submodule"</code>

    Cloning a project that contains Git submodules

    In our example above, we added a new submodule to an existing Git repository. But what about the other way around: what happens when you clone a repository that already contains submodules?

    If we run a plain git clone on the command line, we download the main project, but we will find that every submodule folder is empty! This is further proof that the submodule files are independent and are not included in their parent repository.

    In this case, to populate the submodules after cloning their parent repository, you can simply run git submodule update --init --recursive. An even better way is to add the --recurse-submodules option right when you call git clone in the first place.
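Both variants can be compared side by side. This is a minimal sketch under the assumption that a local repository (the hypothetical `library` folder) plays the role of the third-party library; `protocol.file.allow=always` is only there because the submodule URL is a local path.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-in for the third-party library.
git init -q "$work/library"
echo "library code" > "$work/library/README"
git -C "$work/library" add README
git -C "$work/library" commit -q -m "Initial library commit"

# A main project that already contains the submodule.
git init -q "$work/main"
git -C "$work/main" -c protocol.file.allow=always submodule --quiet add "$work/library" lib/spacetime
git -C "$work/main" commit -q -m "Add library as a submodule"

# A plain clone creates the submodule folder but leaves it empty.
git clone -q "$work/main" "$work/plain"
plain_listing=$(ls -A "$work/plain/lib/spacetime")
echo "entries after plain clone: ${plain_listing:-none}"

# Variant 1: populate the submodule after the fact.
git -C "$work/plain" -c protocol.file.allow=always submodule --quiet update --init --recursive
ls "$work/plain/lib/spacetime"

# Variant 2: clone the project and its submodules in one step.
git -c protocol.file.allow=always clone -q --recurse-submodules "$work/main" "$work/full"
ls "$work/full/lib/spacetime"
```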

    Checking out a version

    In "normal" Git repositories, we usually check out branches. By using git checkout or the newer git switch, we tell Git what our currently active branch should be. When a new commit is made on this branch, the HEAD pointer automatically moves to the latest commit. This is important to understand, because Git submodules work differently!

    In a submodule, we always check out a specific version, not a branch! Even if you run a command like git checkout main inside a submodule, the parent repository only records the commit that this branch currently points to, not the branch itself.

    This behavior is not a mistake, of course. Think about it: when you include a third-party library, you want full control over exactly which code is used in your main project. It is great when the library's maintainer releases a new version, but you don't necessarily want to use this new version in your project automatically, because you don't know whether the new changes will break it!
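To see that the parent repository really stores a commit rather than a branch, one can look at the tree entry for the submodule. The sketch below assumes a local stand-in repository (the hypothetical `library` folder) and uses `protocol.file.allow=always` only because of the local path.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-in for the third-party library.
git init -q "$work/library"
echo "library code" > "$work/library/README"
git -C "$work/library" add README
git -C "$work/library" commit -q -m "Initial library commit"

# Main project with the submodule, plus a fresh clone of it.
git init -q "$work/main"
git -C "$work/main" -c protocol.file.allow=always submodule --quiet add "$work/library" lib/spacetime
git -C "$work/main" commit -q -m "Add library as a submodule"
git -c protocol.file.allow=always clone -q --recurse-submodules "$work/main" "$work/clone"
cd "$work/clone"

# The parent stores the submodule as a "gitlink" entry (mode 160000)
# that points to one commit, not to a branch.
recorded=$(git ls-tree HEAD lib/spacetime)
echo "recorded in parent: $recorded"

# The submodule's working copy is checked out at exactly that commit...
checked_out=$(git -C lib/spacetime rev-parse HEAD)
echo "checked out:        $checked_out"

# ...with a detached HEAD: symbolic-ref fails when HEAD is not on a branch.
if git -C lib/spacetime symbolic-ref -q HEAD >/dev/null; then
  head_state="on a branch"
else
  head_state="detached"
fi
echo "submodule HEAD is $head_state"
```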

    If you want to find out which version your submodule is using, you can request this information in the main project:

    <code class="language-bash">$ git submodule status
       ea703a7d557efd90ccae894db96368d750be93b6 lib/spacetime (6.16.3)</code>

    This will return the version currently checked out by our lib/spacetime submodule. It also lets us know that this version is a tag called "6.16.3". It is common to use tags heavily when using Git submodules.

    Suppose you want your submodule to use an older version of the library, tagged "6.14.0". First, we have to change directories so that our Git commands are executed in the context of the submodule, not our main project. Then we can simply run git checkout with the tag name:

    <code class="language-bash">$ cd lib/spacetime/
    $ git checkout 6.14.0</code>
    If we now go back to our main project and run git submodule status again, we will see what we have checked out:

    <code class="language-bash">$ cd ../..
    $ git submodule status</code>
    Take a close look at the output: the + symbol in front of the SHA-1 hash tells us that the submodule is at a different version than the one currently stored in the parent repository. Since we just changed the checked-out version, this looks correct.

    Calling git status in our main project now informs us of this fact as well; it lists lib/spacetime as modified, with the note "(new commits)":

    <code class="language-bash">$ git status</code>
    You can see that Git treats moving a submodule pointer as a change like any other: if we want to keep it, we have to commit it to the repository:

    <code class="language-bash">$ git add lib/spacetime
    # (the commit message is only an example)
    $ git commit -m "Use spacetime 6.14.0"</code>
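The whole pointer move can be replayed in a sandbox. This is a sketch under stated assumptions: a local repository (the hypothetical `library` folder) stands in for spacetime, its two tags merely mirror the version numbers used in this article, and `protocol.file.allow=always` is only needed for the local path.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library with two tagged releases.
git init -q "$work/library"
echo "old release" > "$work/library/README"
git -C "$work/library" add README
git -C "$work/library" commit -q -m "Old release"
git -C "$work/library" tag 6.14.0
echo "new release" > "$work/library/README"
git -C "$work/library" commit -q -am "New release"
git -C "$work/library" tag 6.16.3

# Main project, initially using the newest library commit.
git init -q "$work/main"
cd "$work/main"
git -c protocol.file.allow=always submodule --quiet add "$work/library" lib/spacetime
git commit -q -m "Add library as a submodule"

# Check out the older tag inside the submodule.
git -C lib/spacetime checkout -q 6.14.0

# "+" prefix: the checked-out commit differs from the one recorded in the parent.
before=$(git submodule status lib/spacetime)
echo "$before"

# Stage and commit the moved pointer in the main project.
git add lib/spacetime
git commit -q -m "Use the older library release"

# Leading space: checked-out and recorded commit match again.
after=$(git submodule status lib/spacetime)
echo "$after"
```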

    Updating a Git submodule

    In the steps above, we moved the submodule pointer ourselves: we were the ones who chose to check out a different version, commit it, and push it to our team's remote repository. But what if a colleague changes the submodule version, perhaps because an interesting new version of the library was released and our colleague decided to use it in our project (after thorough testing, of course)?

    Let's run a simple git pull in the main project, as we probably do quite often, to get the new changes from the shared remote repository:

    <code class="language-bash">$ git pull</code>
    In the summary that git pull prints, lib/spacetime is listed among the changed files, which indicates that something in the submodule has changed. But let's take a closer look:

    <code class="language-bash">$ git submodule status</code>
    You probably remember that little + sign in front of the hash: it means that the submodule pointer has moved! To update our local checkout to the "official" version selected by our teammate, we can run the update command:

    <code class="language-bash">$ git submodule update lib/spacetime</code>
    Okay! Our submodules are now checked out to the version recorded in our main project repository!
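The teammate scenario can be simulated locally as well. In this hedged sketch, `main` plays the shared repository in which the colleague works, `mine` is our own clone, and the hypothetical `library` repository with its two tags stands in for spacetime; `protocol.file.allow=always` is only needed because everything lives in local paths.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical library with two tagged releases.
git init -q "$work/library"
echo "old release" > "$work/library/README"
git -C "$work/library" add README
git -C "$work/library" commit -q -m "Old release"
git -C "$work/library" tag 6.14.0
echo "new release" > "$work/library/README"
git -C "$work/library" commit -q -am "New release"
git -C "$work/library" tag 6.16.3

# The shared project initially pins the older release.
git init -q "$work/main"
git -C "$work/main" -c protocol.file.allow=always submodule --quiet add "$work/library" lib/spacetime
git -C "$work/main/lib/spacetime" checkout -q 6.14.0
git -C "$work/main" add lib/spacetime
git -C "$work/main" commit -q -m "Add library 6.14.0 as a submodule"

# Our own clone of the project.
git -c protocol.file.allow=always clone -q --recurse-submodules "$work/main" "$work/mine"

# The colleague moves the pointer to the newer release and commits.
git -C "$work/main/lib/spacetime" checkout -q 6.16.3
git -C "$work/main" add lib/spacetime
git -C "$work/main" commit -q -m "Upgrade library to 6.16.3"

# We pull: the recorded pointer moves, our submodule checkout does not.
cd "$work/mine"
git -c protocol.file.allow=always pull -q
after_pull=$(git submodule status lib/spacetime)
echo "$after_pull"

# git submodule update checks out the commit recorded in the main project.
git -c protocol.file.allow=always submodule --quiet update lib/spacetime
after_update=$(git submodule status lib/spacetime)
echo "$after_update"
```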

    Working with Git submodules

    We have now covered the basic building blocks of working with Git submodules. The remaining workflows are very standard!

    For example, checking for new changes in a submodule works like in any other Git repository: you run git fetch inside the submodule repository and, if you do want to use the updates, follow it with something like git pull origin main.

    Making changes in a submodule may also be a use case for you, especially if you manage the library code yourself (because it is an internal library rather than a third-party one). You can work with the submodule like with any other Git repository: you can make changes, commit them, push them, and so on.
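A sketch of the fetch-and-pull workflow, again in a sandbox: the hypothetical local `library` repository stands in for the upstream project, and the branch name is read from it instead of assuming `main`. `protocol.file.allow=always` is only needed when the submodule is first cloned from the local path.

```shell
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical upstream library.
git init -q "$work/library"
echo "library code" > "$work/library/README"
git -C "$work/library" add README
git -C "$work/library" commit -q -m "Initial library commit"
branch=$(git -C "$work/library" symbolic-ref --short HEAD)

# Main project with the library added as a submodule.
git init -q "$work/main"
git -C "$work/main" -c protocol.file.allow=always submodule --quiet add "$work/library" lib/spacetime
git -C "$work/main" commit -q -m "Add library as a submodule"

# Upstream publishes a new commit.
echo "new feature" >> "$work/library/README"
git -C "$work/library" commit -q -am "Add new feature"

# Inside the submodule: fetch, review what is new, then pull it in.
cd "$work/main/lib/spacetime"
git fetch -q origin
git log --oneline "HEAD..origin/$branch"
git pull -q origin "$branch"

# Back in the main project: record the new submodule commit.
cd "$work/main"
git add lib/spacetime
git commit -q -m "Update library to the latest upstream commit"
git submodule status lib/spacetime
```

The final commit in the main project is what makes the update visible to teammates; without it, only the local submodule checkout has moved.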

    Unlock the power of Git

    Git has a lot of power under the hood. However, many of its advanced tools, such as Git submodules, are not well known. Many developers are missing out on a lot of powerful features, which is really a pity!

    If you want to dig deeper into some other advanced Git techniques, I highly recommend the "Advanced Git Toolkit": a (free!) collection of short videos that introduces you to topics like the Reflog, Interactive Rebase, Cherry-Picking and even branching strategies.

    Have fun becoming a better developer!

    Frequently Asked Questions about Git Submodules

    What is a Git submodule? A Git submodule is a way to include another Git repository as a subdirectory of your own Git repository. It allows you to maintain a separate repository as a subproject within the main project.

    Why use Git submodules? Git submodules are useful for including external repositories in your project, especially if you want to keep their development history separate from the main project. This is very helpful for managing dependencies or including external libraries.

    What information does the main project store about a submodule? The main project stores the submodule's path and URL in the .gitmodules file and records the exact commit to check out as a special entry in its own history. This allows anyone cloning the main project to clone the referenced submodules as well.

    How do I clone a Git repository that contains submodules? When cloning a repository that contains submodules, you can initialize and clone the submodules automatically with the --recurse-submodules flag of the git clone command (--recursive is an older spelling of the same option). Alternatively, you can run git submodule update --init --recursive after cloning.

    Can I nest submodules? Yes, Git supports nested submodules, which means that submodules can contain their own submodules. However, managing nested submodules can become complicated, and you must ensure that each submodule is properly initialized and updated.
