an elegant way to fix user IDs in docker containers using docker_userid_fixer

王林

Aug 15, 2024 pm 06:42 PM

what is it about?

It's about a rather technical issue with docker containers that interact with the docker host computer, generally by using the host filesystem inside the container.
This happens in particular in a reproducible research context.
I developed an open-source utility that helps tackle that issue.

docker containers as execution environments

The initial and main use case of a docker container: a self-contained application that only interacts with the host system through some network ports.
Think of a web application: the docker container typically contains a web server and a web application, running for example on port 80 (inside the container). The container is then run on the host, by binding the container internal port 80 to a host port (e.g. 8000).
Then the only interaction between the containerized app and the host system is via this bound network port.

Containers as execution environments are completely different:

- instead of containerizing an application, it's the application build system that is containerized.
  - it could be a compiler, an IDE, a notebook engine, a Quarto publishing system...
- the goals are:
  - to have a standard, easy to install and share environment
    - imagine a complex build environment, with fixed versions of R, python and zillions of external packages. Installing everything with the right versions can be a very difficult and time-consuming task. Sharing a docker image that contains everything already installed and pre-configured is a real time-saver.
  - to have a reproducible environment
    - by using it, you are able to reproduce some analysis results, since you are using the very same controlled environment
    - you can also easily reproduce bugs, which is the first step to fixing them

But, in order to use those execution environments, those containers must have access to the host system, in particular to the host user filesystem.

docker containers and the host filesystem

Suppose you have containerized an IDE, e.g. RStudio.
Your RStudio is installed and running inside the docker container, but it needs to read and edit files in your project folder.

For that you bind mount your project folder (in your host filesystem) using the docker run --volume option.
Then your files are accessible from within the docker container.
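
As a minimal sketch of such a bind mount, the Python snippet below only assembles the docker run command line and prints it. The helper name bind_mount_command, the image name my-rstudio-image and the container path /project are illustrative assumptions, not values from this article.

```python
from pathlib import Path


def bind_mount_command(project_dir, image, container_dir="/project"):
    """Build a `docker run` argv that bind mounts project_dir into the container.

    image and container_dir are illustrative placeholders (assumptions).
    """
    # An absolute host path is unambiguously treated as a bind mount
    # (a bare name would be interpreted as a named volume).
    host_dir = Path(project_dir).resolve()
    return [
        "docker", "run", "--rm",
        "--volume", f"{host_dir}:{container_dir}",
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so docker is not required here.
    print(" ".join(bind_mount_command(".", "my-rstudio-image")))
```

Passing the resulting list to subprocess.run would actually start the container, provided docker and the image are available.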

The challenge now is the file permissions. Suppose your host user has userid 1001, and suppose that the user owning the RStudio process in the container has userid either 0 (root) or 1002.

If the container user is root, then it will have no issue reading your files.
But as soon as you edit some existing files, or produce new ones (e.g. pdf, html), these files will also belong to root on the host filesystem!
This means that your local host user will not be able to use them, or delete them, since they belong to root.

Now if the container userid is 1002, RStudio may not be able to read your files, edit them or produce new files.
Even if it can, by setting some very permissive permissions, your local host user may not be able to use them.

Of course one brute-force way of solving that issue is to run as root both on the host computer and within the docker container. This is not always possible and raises some obvious critical security concerns.
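
To observe the symptom on the host, one can compare a file's owner userid with the userid of the current user. The snippet below is a small sketch using only the Python standard library on a Unix-like system; owner_report is a hypothetical helper name.

```python
import os
import tempfile


def owner_report(path):
    """Return (owner_uid, current_uid, owned_by_me) for path.

    current_uid is the effective uid of this process, which is the uid
    that new files created by the process are assigned.
    """
    owner_uid = os.stat(path).st_uid
    current_uid = os.geteuid()
    return owner_uid, current_uid, owner_uid == current_uid


if __name__ == "__main__":
    # A file created by this very process is owned by the current user.
    with tempfile.NamedTemporaryFile() as tmp:
        print(owner_report(tmp.name))
```

Called on a file that a root container wrote into the bind-mounted folder, the same helper would be expected to report owner userid 0 and owned_by_me as False for a regular host user.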

solving the file owner issue part 1: the docker run --user option

Because we cannot know in advance what the host userid will be (here 1001), we cannot pre-configure
the userid of the docker container user.

docker run now provides a --user option that lets you run the container as a pseudo user with a userid
supplied at runtime. For example, docker run --user 1001 ... will create a docker container whose processes
belong to a user with userid 1001.
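
Since the host userid is only known at launch time, a launcher script can look it up and pass it to --user. The following is a sketch under assumptions: run_as_host_user_command and the image name are hypothetical, and the uid:gid form of --user is used to pass the group id as well.

```python
import os


def run_as_host_user_command(image, uid=None, gid=None):
    """Build a `docker run` argv that passes the host uid and gid via --user.

    image is an illustrative placeholder (assumption). When uid or gid is
    omitted, the ids of the current host process are used.
    """
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return ["docker", "run", "--rm", "--user", f"{uid}:{gid}", image]


if __name__ == "__main__":
    # Print the command instead of running it, so docker is not required here.
    print(" ".join(run_as_host_user_command("my-rstudio-image")))
```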

So why are we still discussing this issue? Isn't it solved?

Here are some quirks of that dynamically created user:

- it is a pseudo user
- it has no home directory (/home/xxx)
- it does not appear in /etc/passwd
- it cannot be pre-configured, e.g. with a bash profile, some environment variables, application defaults...
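
The missing /etc/passwd entry can be checked from inside a container with the standard pwd module. This is only an illustrative sketch; passwd_entry and describe are hypothetical helper names, and the lookup parameter exists solely so the behaviour can be exercised without a real container.

```python
import pwd


def passwd_entry(uid, lookup=pwd.getpwuid):
    """Return the passwd entry for uid, or None when the uid has no entry."""
    try:
        return lookup(uid)
    except KeyError:
        # pwd.getpwuid raises KeyError when the uid is not in the database.
        return None


def describe(uid, lookup=pwd.getpwuid):
    """Summarize what the system knows about uid."""
    entry = passwd_entry(uid, lookup)
    if entry is None:
        return f"uid {uid}: no passwd entry (no user name, no home directory)"
    return f"uid {uid}: user {entry.pw_name}, home {entry.pw_dir}"
```

Inside a container started with --user and a userid that the image does not define, describe(os.getuid()) would be expected to take the "no passwd entry" branch.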

We can work around those issues, but it can be tedious and frustrating. What we really want is to pre-configure our docker container user,
and be able to dynamically change its userid at runtime...

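solving the file owner issue part 2: enter docker_userid_fixer

docker_userid_fixer is an open-source utility intended to be used as a docker entrypoint to fix the userid issue I just raised.

Let's see how to use it: you set it as your docker ENTRYPOINT, specifying which user should be used and have its userid dynamically modified.

Let's be precise about the terminology:
- the target user is the pre-configured user specified to docker_userid_fixer in the ENTRYPOINT, called user1 in this example
- the requested user is the user provisioned by docker run, i.e. the user that (initially) owns the very first process (PID 1)

Then, when the container is created at runtime, there are two options:
- either the requested userid (already) matches the target userid, and there is nothing to fix
- or it does not. For example, the requested userid is 1001 and the target userid is 1000. Then docker_userid_fixer will fix the userid of the target user user1 from 1000 to 1001, directly in the main process of the container.

So in practice, this solves our issue:
- if you do not need to fix the userid of your container, just use docker run the usual way (without the --user option)
- or use the --user option: then not only will your main process run with your requested userid, but your pre-configured user will also be changed to that userid, so your container runs with your intended user and with your intended userid.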
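
The two options above can be summarized as a small decision function. This is a sketch of the logic as described in this article, not the utility's actual code (which is not shown here); plan_userid_fix is a hypothetical name and the function only returns a description of the steps without executing anything.

```python
def plan_userid_fix(requested_uid, target_uid):
    """Describe what an entrypoint like docker_userid_fixer has to do.

    requested_uid: uid provisioned by docker run (initial owner of PID 1).
    target_uid: current uid of the pre-configured target user.
    Returns a list of human-readable steps; nothing is executed.
    """
    steps = []
    if requested_uid != target_uid:
        # Only this case needs a fix: the target user takes the requested uid.
        steps.append(
            f"change the target user's uid from {target_uid} to {requested_uid}"
        )
    steps.append(
        f"run the main process as the target user with uid {requested_uid}"
    )
    return steps


if __name__ == "__main__":
    for step in plan_userid_fix(requested_uid=1001, target_uid=1000):
        print(step)
```

With the article's example values (requested 1001, target 1000) the plan contains both steps; when the two userids already match, only the final step remains.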

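You have to build or download the small executable (17k).

I put some short notes on how it works at https://github.com/kforner/docker_userid_fixer#how-it-works
but I'll try to rephrase.

The crux of the implementation is the setuid root of the docker_userid_fixer executable in the container. We need root privileges to change the userid, and that setuid enables privileged execution only for the docker_userid_fixer program, and only for a very short time.

As soon as the userid has been changed (if needed), docker_userid_fixer switches the main process to the requested user (and userid).

If you are interested in these topics (docker, reproducible research, R package development, algorithms, performance optimization, parallelism...), feel free to contact me to discuss job and business opportunities.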
