bitmap是很常用的数据结构,比如用于Bloom Filter中;用于无重复整数的排序等等。bitmap通常基于数组来实现,数组中每个元素可以看成是一系列二进制数,所有元素组成更大的二进制集合。对于Python来说,整数类型默认是有符号类型,所以一个整数的可用位数为31位。
bitmap实现思路
bitmap是用于对每一位进行操作。举例来说,一个Python数组包含4个32位有符号整型,则总共可用位为4 * 31 = 124位。如果要在第90个二进制位上操作,则要先获取到操作数组的第几个元素,再获取相应的位索引,然后执行操作。
上图所示为一个32位整型,在Python中默认是有符号类型,最高位为符号位,bitmap不能使用它。左边是高位,右边是低位,最低位为第0位。
bitmap是用于对每一位进行操作。举例来说,一个Python数组包含4个32位有符号整型,则总共可用位为4 * 31 = 124位。如果要在第90个二进制位上操作,则要先获取到操作数组的第几个元素,再获取相应的位索引,然后执行操作。
初始化bitmap
首先需要初始化bitmap。拿90这个整数来说,因为单个整型只能使用31位,所以90除以31并向上取整则可得知需要几个数组元素。代码如下:
#!/usr/bin/env python
#coding: utf8
class Bitmap(object):
def __init__(self, max):
self.size = int((max + 31 - 1) / 31) #向上取整
if __name__ == '__main__':
bitmap = Bitmap(90)
print '需要 %d 个元素。' % bitmap.size
$ python bitmap.py
需要 3 个元素。
计算在数组中的索引
计算在数组中的索引其实是跟之前计算数组大小是一样的。只不过之前是对最大数计算,现在换成任一需要存储的整数。但是有一点不同,计算在数组中的索引是向下取整,所以需要修改calcElemIndex方法的实现。代码改为如下:
#!/usr/bin/env python
#coding: utf8
class Bitmap(object):
def __init__(self, max):
self.size = self.calcElemIndex(max, True)
self.array = [0 for i in range(self.size)]
def calcElemIndex(self, num, up=False):
'''up为True则为向上取整, 否则为向下取整'''
if up:
return int((num + 31 - 1) / 31) #向上取整
return num / 31
if __name__ == '__main__':
bitmap = Bitmap(90)
print '数组需要 %d 个元素。' % bitmap.size
print '47 应存储在第 %d 个数组元素上。' % bitmap.calcElemIndex(47)
$ python bitmap.py
数组需要 3 个元素。
47 应存储在第 1 个数组元素上。
所以获取最大整数很重要,否则有可能创建的数组容纳不下某些数据。
计算在数组元素中的位索引
数组元素中的位索引可以通过取模运算来得到。令需存储的整数跟31取模即可得到位索引。代码改为如下:
#!/usr/bin/env python
#coding: utf8
class Bitmap(object):
def __init__(self, max):
self.size = self.calcElemIndex(max, True)
self.array = [0 for i in range(self.size)]
def calcElemIndex(self, num, up=False):
'''up为True则为向上取整, 否则为向下取整'''
if up:
return int((num + 31 - 1) / 31) #向上取整
return num / 31
def calcBitIndex(self, num):
return num % 31
if __name__ == '__main__':
bitmap = Bitmap(90)
print '数组需要 %d 个元素。' % bitmap.size
print '47 应存储在第 %d 个数组元素上。' % bitmap.calcElemIndex(47)
print '47 应存储在第 %d 个数组元素的第 %d 位上。' % (bitmap.calcElemIndex(47), bitmap.calcBitIndex(47),)
别忘了是从第0位算起哦。
置1操作
二进制位默认是0,将某位置1则表示在此位存储了数据。代码改为如下:
#!/usr/bin/env python
#coding: utf8
class Bitmap(object):
def __init__(self, max):
self.size = self.calcElemIndex(max, True)
self.array = [0 for i in range(self.size)]
def calcElemIndex(self, num, up=False):
'''up为True则为向上取整, 否则为向下取整'''
if up:
return int((num + 31 - 1) / 31) #向上取整
return num / 31
def calcBitIndex(self, num):
return num % 31
def set(self, num):
elemIndex = self.calcElemIndex(num)
byteIndex = self.calcBitIndex(num)
elem = self.array[elemIndex]
self.array[elemIndex] = elem | (1
if __name__ == '__main__':
bitmap = Bitmap(90)
bitmap.set(0)
print bitmap.array
因为从第0位算起,所以如需要存储0,则需要把第0位置1。
清0操作
将某位置0,也即丢弃已存储的数据。代码如下:
#!/usr/bin/env python
#coding: utf8
class Bitmap(object):
def __init__(self, max):
self.size = self.calcElemIndex(max, True)
self.array = [0 for i in range(self.size)]
def calcElemIndex(self, num, up=False):
'''up为True则为向上取整, 否则为向下取整'''
if up:
return int((num + 31 - 1) / 31) #向上取整
return num / 31
def calcBitIndex(self, num):
return num % 31
def set(self, num):
elemIndex = self.calcElemIndex(num)
byteIndex = self.calcBitIndex(num)
elem = self.array[elemIndex]
self.array[elemIndex] = elem | (1
def clean(self, i):
elemIndex = self.calcElemIndex(i)
byteIndex = self.calcBitIndex(i)
elem = self.array[elemIndex]
self.array[elemIndex] = elem & (~(1
if __name__ == '__main__':
bitmap = Bitmap(87)
bitmap.set(0)
bitmap.set(34)
print bitmap.array
bitmap.clean(0)
print bitmap.array
bitmap.clean(34)
print bitmap.array
清0和置1是互反操作。
测试某位是否为1
判断某位是否为1是为了取出之前所存储的数据。代码如下:
#!/usr/bin/env python
#coding: utf8
class Bitmap(object):
def __init__(self, max):
self.size = self.calcElemIndex(max, True)
self.array = [0 for i in range(self.size)]
def calcElemIndex(self, num, up=False):
'''up为True则为向上取整, 否则为向下取整'''
if up:
return int((num + 31 - 1) / 31) #向上取整
return num / 31
def calcBitIndex(self, num):
return num % 31
def set(self, num):
elemIndex = self.calcElemIndex(num)
byteIndex = self.calcBitIndex(num)
elem = self.array[elemIndex]
self.array[elemIndex] = elem | (1
def clean(self, i):
elemIndex = self.calcElemIndex(i)
byteIndex = self.calcBitIndex(i)
elem = self.array[elemIndex]
self.array[elemIndex] = elem & (~(1
def test(self, i):
elemIndex = self.calcElemIndex(i)
byteIndex = self.calcBitIndex(i)
if self.array[elemIndex] & (1 return True
return False
if __name__ == '__main__':
bitmap = Bitmap(90)
bitmap.set(0)
print bitmap.array
print bitmap.test(0)
bitmap.set(1)
print bitmap.test(1)
print bitmap.test(2)
bitmap.clean(1)
print bitmap.test(1)
$ python bitmap.py
[1, 0, 0]
True
True
False
False
接下来实现一个不重复数组的排序。已知一个无序非负整数数组的最大元素为879,请对其自然排序。代码如下:
#!/usr/bin/env python
#coding: utf8
class Bitmap(object):
def __init__(self, max):
self.size = self.calcElemIndex(max, True)
self.array = [0 for i in range(self.size)]
def calcElemIndex(self, num, up=False):
'''up为True则为向上取整, 否则为向下取整'''
if up:
return int((num + 31 - 1) / 31) #向上取整
return num / 31
def calcBitIndex(self, num):
return num % 31
def set(self, num):
elemIndex = self.calcElemIndex(num)
byteIndex = self.calcBitIndex(num)
elem = self.array[elemIndex]
self.array[elemIndex] = elem | (1
def clean(self, i):
elemIndex = self.calcElemIndex(i)
byteIndex = self.calcBitIndex(i)
elem = self.array[elemIndex]
self.array[elemIndex] = elem & (~(1
def test(self, i):
elemIndex = self.calcElemIndex(i)
byteIndex = self.calcBitIndex(i)
if self.array[elemIndex] & (1 return True
return False
if __name__ == '__main__':
MAX = 879
suffle_array = [45, 2, 78, 35, 67, 90, 879, 0, 340, 123, 46]
result = []
bitmap = Bitmap(MAX)
for num in suffle_array:
bitmap.set(num)
for i in range(MAX + 1):
if bitmap.test(i):
result.append(i)
print '原始数组为: %s' % suffle_array
print '排序后的数组为: %s' % result
bitmap实现了,则利用其进行排序就非常简单了。其它语言也同样可以实现bitmap,但对于静态类型语言来说,比如C/Golang这样的语言,因为可以直接声明无符号整型,所以可用位就变成32位,只需将上述代码中的31改成32即可,这点请大家注意。

Python is easier to learn and use, while C is more powerful but complex. 1. Python syntax is concise and suitable for beginners. Dynamic typing and automatic memory management make it easy to use, but may cause runtime errors. 2.C provides low-level control and advanced features, suitable for high-performance applications, but has a high learning threshold and requires manual memory and type safety management.

Python and C have significant differences in memory management and control. 1. Python uses automatic memory management, based on reference counting and garbage collection, simplifying the work of programmers. 2.C requires manual management of memory, providing more control but increasing complexity and error risk. Which language to choose should be based on project requirements and team technology stack.

Python's applications in scientific computing include data analysis, machine learning, numerical simulation and visualization. 1.Numpy provides efficient multi-dimensional arrays and mathematical functions. 2. SciPy extends Numpy functionality and provides optimization and linear algebra tools. 3. Pandas is used for data processing and analysis. 4.Matplotlib is used to generate various graphs and visual results.

Whether to choose Python or C depends on project requirements: 1) Python is suitable for rapid development, data science, and scripting because of its concise syntax and rich libraries; 2) C is suitable for scenarios that require high performance and underlying control, such as system programming and game development, because of its compilation and manual memory management.

Python is widely used in data science and machine learning, mainly relying on its simplicity and a powerful library ecosystem. 1) Pandas is used for data processing and analysis, 2) Numpy provides efficient numerical calculations, and 3) Scikit-learn is used for machine learning model construction and optimization, these libraries make Python an ideal tool for data science and machine learning.

Is it enough to learn Python for two hours a day? It depends on your goals and learning methods. 1) Develop a clear learning plan, 2) Select appropriate learning resources and methods, 3) Practice and review and consolidate hands-on practice and review and consolidate, and you can gradually master the basic knowledge and advanced functions of Python during this period.

Key applications of Python in web development include the use of Django and Flask frameworks, API development, data analysis and visualization, machine learning and AI, and performance optimization. 1. Django and Flask framework: Django is suitable for rapid development of complex applications, and Flask is suitable for small or highly customized projects. 2. API development: Use Flask or DjangoRESTFramework to build RESTfulAPI. 3. Data analysis and visualization: Use Python to process data and display it through the web interface. 4. Machine Learning and AI: Python is used to build intelligent web applications. 5. Performance optimization: optimized through asynchronous programming, caching and code

Python is better than C in development efficiency, but C is higher in execution performance. 1. Python's concise syntax and rich libraries improve development efficiency. 2.C's compilation-type characteristics and hardware control improve execution performance. When making a choice, you need to weigh the development speed and execution efficiency based on project needs.


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

ZendStudio 13.5.1 Mac
Powerful PHP integrated development environment

mPDF
mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),

MinGW - Minimalist GNU for Windows
This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Dreamweaver CS6
Visual web development tools