首页  >  文章  >  后端开发  >  位运算概述

位运算概述

WBOY
WBOY原创
2024-07-29 09:50:221116浏览

An Overview of Bitwise Operations

以下帖子取自我创建的存储库,旨在帮助学习(和教授)按位运算。您可以在这里找到该存储库,我建议您查看一下 - 那里有一些代码示例和解决方案。

介绍

此存储库的目标是让您熟悉按位运算,解释它们是什么、它们如何工作以及它们的用途。

第一章:一切都是二进制的

在 C(以及大多数高级语言)中,我们的变量具有 类型。这些类型表明了一些事情。当然,int 类型的变量将存储整数值,但理解这些按位运算的关键是要知道,在幕后,所有类型都以二进制形式存储在内存中(任何位置、堆栈、堆,无论何处)。以下是当您在 C:
中的堆栈上存储简单整数值时会发生的情况的示例

int main(int argc, char** argv) {
    int x = 2;
    return 0;
}

编译为汇编后,代码可能如下所示(我这里使用的是ARM汇编,并且我已经使用注释对代码进行了注释):

.section .text
.global main

main:
    ; Set up the stack for the function
    stp x29, x30 [sp, -16]! ; Save previous frame pointer & link register
    mov x29, sp ; Setup a new frame pointer

    ; Initialize x with 2
    ; IN C: int x = 2;
    mov w0, 2 ; Move 2 into the w0 register
    str w0, [sp, 16] ; Store the contents of w0 (2) at a 16-byte offset from the stack pointer
    ; Essentially, the above line stores 2 on the stack.

    mov w0, 0 ; Move 0 into w0, prepare for return

    ; Clear stack
    ldp x29, x30, [sp], 32 ; Restore frame pointer and link register
    ret ; Return

请注意,大多数编译器不会实际存储像我在堆栈上显示的那样的变量,因为它未被使用。但是,如果多次使用,它会像上面一样存储在堆栈中。

如果我们查看变量在堆栈中存储的位置(当然,它就在那里),我们会看到类似以下内容:

Memory Address Value Stored (Hex) Value Stored (Binary)
0x1000 0x02 00000010
0x1001 0x00 00000000
0x1002 0x00 00000000
0x1003 0x00 00000000

这假设您的系统是小尾数法。我不会在这里讨论字节顺序,但您可以在这里阅读更多相关信息。

关于上表,我希望您注意的关键一点是,尽管我们的整数只有 2 位长,但它占用了 4 个字节(32 位)的内存。现在,不要惊慌——这是正常现象,也是意料之中的。 C 和编译器所做的许多事情之一就是为您调用的类型设置标准。因此,当我创建 int 变量时,编译器知道分配 4 个字节(同样是 32 位)内存。我们甚至可以使用 C 中的 sizeof() 运算符来测试这一点。

sizeof() 运算符

sizeof() 不是实际的 C 函数。相反,在编译时,编译器会将表达式替换为给定数据类型的大小。您甚至可以将其与您自己的类型一起使用,例如 typedef 和/或结构:

#include <stdio.h> 

typedef struct {
  char name[64];
  int age;
} Person;

int main(int argc, char** argv) {
  printf("A Person is %lu bytes long.\n", sizeof(Person));
  return 0;
}

您可能会问的另一件事是如何存储负数。很好的问题。数字可以是有符号无符号,但默认情况下,它们是有符号的。如果一个整数是有符号的,它将牺牲其最高有效位作为“符号位”。如果符号位为1,则该数为负;否则它是积极的。精明的读者可能会意识到这里发生的变化在可能的数字范围内。考虑 8 位数字。有 256 个可能的数字可以表示(由 2^8 给出)。使用无符号8位整数,可以表示0-255的值;使用有符号的 8 位 int,可以表示 -128–127。

要获得二进制数的倒数,请使用二进制补码。我们来求二进制的 -5。

  1. 从5开始。在二进制中,5是0101。前导0是符号。
  2. 反转每一位。 0101→1010。
  3. 将此数字加 1(忽略任何可能的溢出)。 1010 + 0001 = 1011.

轮到你了!

  1. 通过对 -5 进行二进制补码得到 5 或 0101,确认 -5 是二进制的 1011。
  2. 编写一个 C 程序,以字节和位的形式打印 int 的大小。使用上面的代码作为起点。 提示:从字节转换为位,一个字节有多少位?
  3. 将不同类型的尺寸填写到下表中,修改您的程序来检查它们。
Type Size (bytes) Size (bits)
int
int64_t
int8_t
char
bool (you'll need to #include 7c6dd6636e138b73de36040ce8dc60d1)
long int
short int
long long int
double
double

Sample Responses

Question 1

  1. Start with -5: 1011.
  2. Invert each bit: 1011 → 0100.
  3. Add 1: 0100 + 0001 = 0101

Question 2

Here's an example of what your simple program might look like (you can also check it out at Chapter 1/SizeOfOperatorTest.c).

 #include ade979de5fc0e1ca0540f360a64c230b

 int main(int argc, char** argv) {
    printf("The size of an int is %lu bytes, or %lu bits.\n", sizeof(int), sizeof(int) * 8);
    return 0;
 }

Go ahead and compile it using gcc and check out its output:

cd Chapter\ 1
gcc -o sizeof SizeOfOperatorTest.c
./sizeof

Question 3

Type Size (bytes) Size (bits)
int 4 32
int64_t 8 64
int8_t 1 8
char 1 8
bool (you'll need to #include 7c6dd6636e138b73de36040ce8dc60d1) 1 8
long int 4 32
short int 2 16
long long int 8 64
double 4 32
double 8 64

Take This Home

The main point I'd like you to keep in mind is that with control of every bit, we can optimize our memory usage. Though that has little effect on modern systems, in the case of embedded computing, every byte counts. By manually reading and writing bits as opposed to typical variable values, we can harness more functionality from less storage.

Chapter 2: Operating on Bits

Now that we've covered data types and how data is stored, we're ready to introduce the idea of bitwise operations. Put simply, a bitwise operation is an operation done on each bit of a piece of data. Let's take a look at each bitwise operator. Then, we'll implement them in C.

And (&)

Written x & y in C. This operator takes the corresponding bits of each number and performs an and operation. An and operation returns 1 (true) if and only if both bits are 1. This means that two bits that are both 0 do not return 1—they return 0. The result is the number made up of the binary string of results from each and. It's easiest to see this in a truth table.

Consider the operation int result = 3 & 5. First, convert 3 and 5 to binary.
Now, we have int result = 011 & 101. Consider the following truth table:

Bit A Bit B AND
0 1 0
1 0 0
1 1 1

The result of the and operation is 001, which when converted to decimal is 1.

Or (|)

Written x | y in C. This operator takes the corresponding bits of each number and performs an or operation. An or operation returns 1 if either bit is 1. As with other bitwise operators, the result is the number made up of the binary string of results from each or.

Consider the operation int result = 3 | 5. First, convert 3 and 5 to binary.
Now, we have int result = 011 | 101. Consider the following truth table:

Bit A Bit B OR
0 1 1
1 0 1
1 1 1

The result of the or operation is 111, which when converted to decimal is 7.

Xor (^)

Written x ^ y in C. This operator takes the corresponding bits of each number and performs an xor (exclusive or) operation. An xor operation returns 1 if and only if one of the two bits is 1. As with other bitwise operators, the result is the number made up of the binary string of results from each or.

Consider the operation int result = 3 ^ 5. First, convert 3 and 5 to binary.
Now, we have int result = 011 ^ 101. Consider the following truth table:

Bit A Bit B XOR
0 1 1
1 0 1
1 1 0

异或运算的结果是110,转换成十进制后是6。

左移 (

写x

考虑 int result = 5

初始

1 0 1

一班后

0 1 0

结果

1 0 0

二进制结果是100,十进制是4。

右移 (>>)

写x>> amount 这个运算符就像左移一样,只是它的工作方向相反。给定数字中的每一位都会向右移动给定的数量。到达数字末尾的任何位都会被截断(并且零出现在左侧)。让我们来看一个例子。

考虑 int result = 3 >> 2. 如您所知,3 的二进制是 011。让我们来看看转变的每次迭代。

初始

0 1 1

一班后

0 0 1

结果

0 0 0

二进制结果是000,十进制就是0。

不是(~)

写~x。 not 运算符反转给定数字的所有位。我们将再次使用真值表来详细说明。

考虑 int result = ~5。如您所知,5 的二进制数是 101。

Bit A ~ Bit A
1 0
0 1
1 0

因此,非运算的结果是010,即二进制的2。

左移和右移限制

这些轮班操作有一些值得注意的限制。对于初学者来说,您不能将位移位负数 - 这根本没有意义!此外,移位超过分配给变量的位数将被视为未定义。你可以做到这一点,但它的输出不能保证对于给定值是恒定的。最后,虽然不是一个限制,但移位 0 次根本不会执行移位。

轮到你了!

  1. 完成以下每一项的真值表。考虑所有值都是无符号的。完成后转换为十进制。
  2. 8 & 2
  3. 6 | 3
  4. 7 ^ 5
  5. (5 & 2) & (4 & 3)
  6. (5 | 2) & (4 & 3)
  7. (5 & 2) ^ (4 | 3)

  8. 完成每项操作。考虑所有值都是无符号的,并且只要问题中的最长值需要是无符号的(即,如果最大值为 8,则处理 4 位)。完成后转换为十进制。

  9. ~6

  10. 9

  11. ~(7 & 8)

  12. (2 | 6) >> 1

  13. 8>> (~2)

  14. ~((3 >> 2) ^ ~7) & (~(7 >> 4) | 2)

回复示例

问题1

  • 8 & 2 → 1000 & 0010
Bit A Bit B AND
1 0 0
0 0 0
0 1 0
0 0 0

⇒ 0000,即十进制的 0。

  • 6 | 3 → 110 | 011
Bit A Bit B OR
1 0 1
1 1 1
0 1 1

⇒ 111,十进制为 7。

  • 7^5→111^101
Bit A Bit B XOR
1 1 0
1 0 1
1 1 0

⇒ 010,十进制 2。

  • (5 & 2) & (4 & 3) → (101 & 010) & (100 & 011)
Bit A Bit B A AND B Bit C Bit D C AND D (A AND B) AND (C AND D)
1 0 0 1 0 0 0
0 1 0 0 1 0 0
1 0 0 0 1 0 0

⇒ 000,即十进制的 0。

  • (5 | 2) & (4 & 3) → (101 | 010) & (100 & 011)
Bit A Bit B A OR B Bit C Bit D C AND D (A OR B) AND (C AND D)
1 0 1 1 0 0 0
0 1 1 0 1 0 0
1 0 1 0 1 0 0

⇒ 000,即十进制的 0。

  • (5 & 2) ^ (4 | 3) → (101 & 010) ^ (100 | 011)
Bit A Bit B A AND B Bit C Bit D C OR D (A AND B) XOR (C OR D)
1 0 0 1 0 1 1
0 1 0 0 1 1 1
1 0 0 0 1 1 1

⇒ 111, which is 7 in decimal.

Question 2

  • ~6 → ~110 ⇒ 011, which is 3 in decimal.

  • 9 882c1bd3602a338c72ccc257e908feba> 1 → (010 | 110) >> 1 → 110 >> 1 ⇒ 011, which is 3 in decimal.

  • 8 >> (~2) → 1000 >> ~(10) → 1000 >> (01) → 1000 >> 1 ⇒ 0100, which is 4 in decimal.

  • ~((3 >> 2) ^ ~7) & (~(7 >> 4) | 2)

To solve this, I suggest splitting into halves:

~((3 >> 2) ^ ~7) and (~(7 >> 4) | 2)

~((3 >> 2) ^ ~7) → ~((011 >> 2) ^ ~(111)) → ~((000) ^ ~(111)) → ~(000 ^ 000) → 111
(~(7 >> 4) | 2) → (~(111 >> 4) | 010) → (~(000) | 010) → (111 | 010) → 111

Hence, 111 & 111 ⇒ 111, which is 7 in decimal.

Chapter 3: Applying Bitwise Operations in C

This chapter is all about writing C code that utilizes bitwise operators. Before we get to doing bitwise operations, we should begin by writing a function that can write the binary equivalent of a given variable.

To do this, we use a mask. We initialize it to contain a 1 in the most significant (leftmost in little-endian systems) bit followed by zeros. With each iteration of a loop, we right shift the mask by 1, moving the 1 all the way "across" the mask. When we use the & operator on the pointer and the mask, any non-zero value means that a 1 occurred somewhere in the result. Since there's only one 1 in the mask, we know exactly where this occurred. Since the loop moves from left to right, we can just append the corresponding bit's character to the string. The string is one character longer than the size because it needs to contain the null character (\0). This is how printf knows to stop printing, and omitting it can lead to numerous errors and/or unexpected behaviors (like the printing of other data from in memory).

void printBinary(unsigned int decimal) {

    // To determine size (in bits), we multiply the maximum size of an unsigned int by 8 (to convert from bytes to bits).
    int size = sizeof(decimal) * 8;

    // This counts the leading zeros of the number, then subtracts them from the size.
    // Hence, we start at the first bit set to 1 (unless decimal == 0)
    size -= __builtin_clz(decimal);

    if(size == 0) size = 1; // Handle zero case, we need one space to display "0."

    int curr = 0;
    char binary[size + 1];
    for(unsigned int mask = 1 e360cd6bcd3c1b37977442911f0d6d09>= 1) {
        if((decimal & mask) != 0) {
            binary[curr] = '1';
        } else {
            binary[curr] = '0';
        }
        curr++;
    }

    binary[curr] = '\0';

    printf("%s", binary);

}

Bitwise Assignment Operators

All bitwise operators, except for not (~), can be used in the assignment fashion. This means you can add an equals sign right next to one of the bitwise operator. For example, in

int a = 2;
a &= 7;

a is both the first operand and the destination. In other words, the value of a & 7 is stored in a. This works for all bitwise operators aside from the not (~) operator.

Now, I'd like to provide a few case study like prompts for you to try. Sample responses are available.

Case Study 1: Unix File Permissions

One use case of bitwise operations is in the Unix permission system. You've probably run the command

chmod 777 some-file

But what do each of the numbers mean? And why 7? The reality is, binary is at play here, and 7 should tip you off. Recall that 7 in binary is 111. The number being passed here is acting as three flags or booleans glued together.

The three digits specified are for three classes of users:

  • The file owner;
  • Group members of the file owner;
  • and everyone else.

As I mentioned above, each digit is a conglomeration of three flags, each representing a permission. The flag (binary bit) in the fours place represents read permission, the twos place is for write permission, and the ones is for execute. So,

chmod 777 some-file

is doing this under the hood:

File Permissions: some-file

Group Read Write Execute Decimal
Owner 1 1 1 7
Owner's Group 1 1 1 7
Everyone Else 1 1 1 7

换句话说,所有的权限都是给所有人的。

任务

设计一个简单的权限检查器,它接受完整的文件权限值(一个三位数)并检查一组给定的特定权限(即所有者写入、每个人执行等)。例如,请查看第 3 章文件夹。

暗示

获取完整数字后,您需要将其转换为 int(从 char*)。然后,使用模运算来分解每个权限集。请记住,第一个数字代表所有者的权限,第二个数字代表所有者的用户组,第三个数字代表所有人。

检查特定权限是否出现在权限集中,按位以及给定的权限与集合。

案例 2:对网络进行子网划分

如果您曾经配置过路由器,您可能已经注意到配置“子网掩码”的选项。通常,该值设置为 255.255.255.0,但为什么呢?首先,我们需要记住 IPv4 地址的每个字节都用“.”分隔。当处理您最熟悉的网络类型(C 类网络)时,有 3 个字节专用于网络,最后一个字节用于主机。

由于子网掩码只是一个掩码,您可能会明白它的用途。对于从主机字节“借用”的每一位,都会创建两个子网。

网络地址

网络地址将所有主机位设置为0。这意味着任何位都被放弃以创建
子网仍然可以设置为 1。

阅读更多!

通过查看此网站了解有关子网的更多信息。

任务

用 C 语言编写一个程序,接收 IPv4 地址和子网掩码,查找并输出 IPv4 地址所在的网络地址。有关示例,请查看第 3 章文件夹。

暗示

您需要将地址和掩码的每个字节存储为数值。要查找网络地址,请考虑掩码和地址之间的哪种(按位)运算将产生预期的效果。

闭幕式

希望这个解释对您有用!我写它是因为我想自己学习按位运算。我已经检查过了,但有些事情可能是错误的,所以请随时通过拉取请求纠正我,甚至添加评论!另外,如果您有任何疑问,请发表评论!我等不及要和你聊天了!最后,我很高兴能为您提供此资源!

关于我

嗨!我是杰克逊,拉斐特学院计算机科学和法语专业的学生,​​也是一名有抱负的计算机科学研究员和教授。我目前对生物信息学和低级编程/系统领域感兴趣。要了解更多关于我的信息,请查看我的网站。

以上是位运算概述的详细内容。更多信息请关注PHP中文网其他相关文章!

声明:
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn