Python Modules

There are three main types of modules in Python:

those you write yourself
those you install from external sources
those that come preinstalled with Python (the standard library)

For example, the standard library includes:

Category	Modules
Runtime services	copy / pickle / sys / …
Mathematics	decimal / math / random / …
String processing	codecs / re / …
File handling	shutil / gzip / …
Operating system services	datetime / os / time / logging / io / argparse / …
Processes and threads	multiprocessing / threading / queue
Networking	ftplib / http / smtplib / urllib / socket / …
Web programming	cgi (removed in Python 3.13) / webbrowser
Data processing and encoding	base64 / csv / html.parser / json / xml / doctest / …

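In Python, every file is a module. Different modules may define functions with the same name; importing the module you want with the import keyword makes it clear which module's foo function is being called. For example: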

module1.py

def foo():
    print('hello, world!')

module2.py

def foo():
    print('goodbye, world!')

test.py

from module1 import foo

# prints: hello, world!
foo()

import module2 as m2

# prints: goodbye, world!
m2.foo()
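
The same disambiguation applies to standard-library modules. As a minimal sketch: math and cmath both define a function named sqrt, and the alias real_sqrt below is a name chosen only for this illustration.

```python
import cmath
import math
from math import sqrt as real_sqrt  # alias invented for this example

# Both modules define sqrt; the module prefix (or the alias)
# makes it explicit which one is being called.
real_root = math.sqrt(4)        # works on real numbers
complex_root = cmath.sqrt(-4)   # works on complex numbers, so -4 is fine

print(real_root)       # 2.0
print(complex_root)    # 2j
print(real_sqrt(9))    # 3.0
```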

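Besides function definitions, a module may contain top-level executable code, and that code runs whenever the module is imported. That is often not what we want, so executable code in a module is best placed under if __name__ == '__main__':. Code in that block does not run on import, because only the module that the Python interpreter executes directly has the name __main__.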

module3.py

def foo():
    print("This is a running module")

def bar():
    print("This is an imported module")

# __name__ is an implicit variable that holds the module's name
if __name__ == '__main__':
    print('call foo()')
    foo()
else:
    print('call bar()')
    bar()

Running module3.py directly prints:

call foo()
This is a running module

test3.py

from module3 import *

Running test3.py prints:

call bar()
This is an imported module

Itertools

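The itertools module in the standard library provides functions that are useful in functional programming. Some produce infinite iterators (count, cycle, repeat); others, such as takewhile, chain and accumulate, transform or combine iterables: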

Function	Description
count	counts up infinitely from a value.
cycle	iterates infinitely through an iterable (for instance a list or string), e.g. `itertools.cycle(('A', 'B', 'C'))`.
repeat	repeats an object, either infinitely or a specific number of times.
takewhile	takes items from an iterable while a predicate function remains true.
chain	combines several iterables into one long one.
accumulate	returns a running total of the values in an iterable.

from itertools import accumulate, chain, combinations, count, permutations, product, takewhile

for i in count(5):
    print(i, end=" ")
    if i >= 10:
        break
# 5 6 7 8 9 10

it1 = iter([1, 2, 3])
it2 = iter([4, 5, 6])
# chain returns a lazy iterator, so wrap it in list() to see the items
print(list(chain(it1, it2)))
# [1, 2, 3, 4, 5, 6]

nums = list(accumulate(range(20)))
print(list(takewhile(lambda x: x < 100, nums)))
# [0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78, 91]

letters = ["A", "B", "C", "D"]
# Cartesian product: every pairing of one item from each of the two inputs
print(list(product(letters, range(3))))
# [('A', 0), ('A', 1), ('A', 2), ('B', 0), ('B', 1), ('B', 2), ('C', 0), ('C', 1), ('C', 2), ('D', 0), ('D', 1), ('D', 2)]
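
The table also lists cycle and repeat, which the example above does not exercise. A minimal sketch follows; itertools.islice is not mentioned in the original and is used here only to cut the infinite iterator short.

```python
from itertools import cycle, islice, repeat

# cycle never ends, so islice takes just the first 7 items.
first_seven = list(islice(cycle("ABC"), 7))
print(first_seven)
# ['A', 'B', 'C', 'A', 'B', 'C', 'A']

# repeat with a count is finite; without one it is infinite.
three_x = list(repeat("x", 3))
print(three_x)
# ['x', 'x', 'x']

# A common use of repeat: supplying a constant second argument to map.
squares = list(map(pow, range(5), repeat(2)))
print(squares)
# [0, 1, 4, 9, 16]
```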

Permutations

# All orderings of the items in a single list
print(list(permutations(letters)))
# Equivalent to list(permutations('ABCD'))
# [('A', 'B', 'C', 'D'), ('A', 'B', 'D', 'C'), ('A', 'C', 'B', 'D'), ('A', 'C', 'D', 'B'), ('A', 'D', 'B', 'C'), ('A', 'D', 'C', 'B'), ('B', 'A', 'C', 'D'), ('B', 'A', 'D', 'C'), ('B', 'C', 'A', 'D'), ('B', 'C', 'D', 'A'), ('B', 'D', 'A', 'C'), ('B', 'D', 'C', 'A'), ('C', 'A', 'B', 'D'), ('C', 'A', 'D', 'B'), ('C', 'B', 'A', 'D'), ('C', 'B', 'D', 'A'), ('C', 'D', 'A', 'B'), ('C', 'D', 'B', 'A'), ('D', 'A', 'B', 'C'), ('D', 'A', 'C', 'B'), ('D', 'B', 'A', 'C'), ('D', 'B', 'C', 'A'), ('D', 'C', 'A', 'B'), ('D', 'C', 'B', 'A')]

Combinations

# All ways to choose 3 items from a single list, ignoring order
print(list(combinations(letters, 3)))
# [('A', 'B', 'C'), ('A', 'B', 'D'), ('A', 'C', 'D'), ('B', 'C', 'D')]
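
As a sanity check on these results, the number of items each function yields can be compared with the usual counting formulas. This is a sketch only; math.perm and math.comb are not part of the original text and require Python 3.8 or later.

```python
from itertools import combinations, permutations, product
from math import comb, factorial, perm  # comb and perm need Python 3.8+

letters = ["A", "B", "C", "D"]
n = len(letters)

# permutations(letters) yields n! orderings.
full_perms = len(list(permutations(letters)))
# permutations(letters, r) yields n! / (n - r)! ordered selections.
pair_perms = len(list(permutations(letters, 2)))
# combinations(letters, r) yields n! / (r! * (n - r)!) unordered selections.
triples = len(list(combinations(letters, 3)))
# product multiplies the sizes of its inputs.
pairs = len(list(product(letters, range(3))))

print(full_perms, factorial(n))   # 24 24
print(pair_perms, perm(n, 2))     # 12 12
print(triples, comb(n, 3))        # 4 4
print(pairs, n * 3)               # 12 12
```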

Functools

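The functools module is designed mainly for functional programming and provides tools that enhance functions. Two commonly used ones are described below.

functools.partial: creates a partial function, that is, a new callable that wraps an existing one with some of its arguments preset.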

import functools
bin2 = functools.partial(int, base=2)
print(bin2("101111111111111111101"))
# 1572861
# A preset keyword argument can still be overridden at call time
print(bin2("ABCDEF", base=16))
# 11259375
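
partial can preset positional arguments as well as keyword arguments, and the resulting object records what it wraps. A small sketch, where power is a hypothetical helper written only for this illustration:

```python
import functools

def power(base, exponent):
    """Hypothetical helper used only for this illustration."""
    return base ** exponent

# Preset the keyword argument exponent=2.
square = functools.partial(power, exponent=2)
# Preset the first positional argument base=2.
two_to_the = functools.partial(power, 2)

print(square(7))        # 49
print(two_to_the(10))   # 1024

# A partial object remembers the callable and the preset arguments.
print(square.func is power)   # True
print(square.keywords)        # {'exponent': 2}
print(two_to_the.args)        # (2,)
```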

functools.reduce: folds a sequence into a single value, reduce(function, sequence, startValue), where the optional startValue seeds the accumulation.

For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5).

from functools import reduce

def fn(x, y):
    return x * 10 + y

print(reduce(fn, [1, 3, 5, 7, 9]))
# 13579
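
The optional start value in the signature above is not used in that example. The following sketch shows what it changes; operator.mul is brought in only for brevity and is not part of the original text.

```python
from functools import reduce
from operator import mul

# With a start value, the fold begins from it: (((100+1)+2)+3)
total = reduce(lambda x, y: x + y, [1, 2, 3], 100)
print(total)          # 106

# For an empty sequence, the start value itself is returned.
empty_product = reduce(mul, [], 1)
print(empty_product)  # 1

# Without a start value, an empty sequence raises TypeError.
try:
    reduce(mul, [])
    raised = False
except TypeError:
    raised = True
print(raised)         # True
```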

Collections

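Commonly used utility classes:

namedtuple: a named tuple. It is a class factory that takes a type name and a list of attribute names and creates a class.
deque: a double-ended queue, an alternative to list. Python lists are backed by arrays, whereas deque is backed by a doubly linked list, so adding and removing elements at either end performs better, with $O(1)$ asymptotic time complexity.
Counter: a subclass of dict whose keys are elements and whose values are their counts. Its most_common() method returns the most frequent elements. In my view the inheritance relationship between Counter and dict is debatable: under the Composite/Aggregate Reuse Principle (CARP), an association between Counter and dict would be the more reasonable design.
OrderedDict: a subclass of dict that records the order in which key-value pairs were inserted, so it behaves partly like a dictionary and partly like a linked list. (Since Python 3.7, plain dict also preserves insertion order.)
defaultdict: similar to a dictionary, but a default factory function supplies the value for a missing key, which is more efficient than using the dictionary's setdefault() method.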
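
A brief sketch of the classes other than Counter, one short use each. The names Point, queue, od and groups are made up for this illustration.

```python
from collections import OrderedDict, defaultdict, deque, namedtuple

# namedtuple: build a lightweight class from a type name and field names.
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)
print(p.x + p.y, p)          # 7 Point(x=3, y=4)

# deque: appends and pops at both ends.
queue = deque([1, 2, 3])
queue.appendleft(0)
queue.append(4)
print(queue.popleft(), queue.pop(), list(queue))   # 0 4 [1, 2, 3]

# OrderedDict: move_to_end reorders keys explicitly.
od = OrderedDict(a=1, b=2, c=3)
od.move_to_end("a")
print(list(od))              # ['b', 'c', 'a']

# defaultdict: missing keys are created by the factory (here, list).
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)
print(dict(groups))          # {'a': ['apple', 'avocado'], 'b': ['banana']}
```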

Finding the most common elements in a sequence

from collections import Counter

words = [
 'look', 'into', 'my', 'eyes', 'look', 'into', 'my', 'eyes',
 'the', 'eyes', 'the', 'eyes', 'the', 'eyes', 'not', 'around',
 'the', 'eyes', "don't", 'look', 'around', 'the', 'eyes',
 'look', 'into', 'my', 'eyes', "you're", 'under'
]
counter = Counter(words)
print(counter.most_common(3))
# [('eyes', 8), ('the', 5), ('look', 4)]
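
Beyond most_common(), Counter supports a few operations that a plain dict does not. A short sketch, with input strings chosen arbitrarily for illustration:

```python
from collections import Counter

first = Counter("abracadabra")
second = Counter("alakazam")

print(first.most_common(1))      # [('a', 5)]
print(first["z"])                # 0 (missing keys count as zero)
print((first + second)["a"])     # 9 (counts are added)
print(first & second)            # Counter({'a': 4}) (minimum of each count)

# update() adds to the existing counts instead of replacing them.
first.update("aaa")
print(first["a"])                # 8
```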

Copy

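**Assignment:** simply creates another reference (an alias) to the same object.
**Shallow copy (copy):** copies the parent object but not the child objects it contains.
**Deep copy (deepcopy):** the copy module's deepcopy function copies the parent object together with all of its child objects.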

import copy
floras = {1: "Violet", 2: "Daisy", "3rd": {4: "Primrose",
          5: "Ivy", "6th": ["Tulip", "Marigold", "thorn"]}}
f1 = floras                  # alias: the same dict object
f2 = floras.copy()           # shallow copy: nested objects are shared
f3 = copy.deepcopy(floras)   # deep copy: nested objects are copied too
print(f1 == f2 == f3)
# True

floras[2] = "chrysanthemum"
floras["3rd"][5] = "rosemary"
floras["3rd"]["6th"].clear()
print(f1)
# {1: 'Violet', 2: 'chrysanthemum', '3rd': {4: 'Primrose', 5: 'rosemary', '6th': []}}
print(f2)
# {1: 'Violet', 2: 'Daisy', '3rd': {4: 'Primrose', 5: 'rosemary', '6th': []}}
print(f3)
# {1: 'Violet', 2: 'Daisy', '3rd': {4: 'Primrose', 5: 'Ivy', '6th': ['Tulip', 'Marigold', 'thorn']}}
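
The same three cases can be checked with identity tests (is), which show directly which objects are shared. A minimal sketch on a nested list, using copy.copy as the generic shallow-copy function:

```python
import copy

original = [[1, 2], [3, 4]]
alias = original                 # same object
shallow = copy.copy(original)    # new outer list, shared inner lists
deep = copy.deepcopy(original)   # new outer list, new inner lists

print(alias is original)             # True
print(shallow is original)           # False
print(shallow[0] is original[0])     # True
print(deep[0] is original[0])        # False

# A change to a shared inner list is visible through the shallow copy only.
original[0].append(99)
print(shallow[0])   # [1, 2, 99]
print(deep[0])      # [1, 2]
```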

Decimal

Python's decimal module supports exact decimal floating-point arithmetic. Unlike float, a Decimal represents decimal numbers such as 1.1 exactly.

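In decimal floating point, 0.1 + 0.1 + 0.1 - 0.3 is exactly equal to zero. In binary floating point, the result is 5.5511151231257827e-017. Although that is close to zero, the difference prevents reliable equality testing, and differences can accumulate. For this reason, decimal is preferred in accounting applications that have strict equality invariants.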
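
This comparison can be reproduced directly. Note that the Decimal values are built from strings so that they hold exactly 0.1 and 0.3:

```python
from decimal import Decimal

float_result = 0.1 + 0.1 + 0.1 - 0.3
decimal_result = (Decimal("0.1") + Decimal("0.1") + Decimal("0.1")
                  - Decimal("0.3"))

print(float_result)            # 5.551115123125783e-17
print(float_result == 0)       # False
print(decimal_result)          # 0.0
print(decimal_result == 0)     # True
```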

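The decimal module incorporates a notion of significant places, so 1.30 + 1.20 is 2.50. The trailing zero is kept to indicate significance. This is the customary presentation for monetary applications.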

Decimal includes special values such as NaN, which stands for "not a number", positive and negative Infinity, and -0.
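
A short sketch of both points, trailing zeros and special values. The variable names are illustrative only.

```python
from decimal import Decimal

# Trailing zeros are preserved through arithmetic.
total = Decimal("1.30") + Decimal("1.20")
print(total)                                # 2.50
print(Decimal("1.3") * Decimal("1.2"))      # 1.56
print(Decimal("1.30") * Decimal("1.20"))    # 1.5600

# Special values.
nan = Decimal("NaN")
print(nan == nan)                           # False
print(nan.is_nan())                         # True
print(Decimal("Infinity") > Decimal("1e999"))   # True

neg_zero = Decimal("-0")
print(neg_zero, neg_zero == 0, neg_zero.is_signed())   # -0 True True
```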

from decimal import Decimal, getcontext

Decimal(10)
# Decimal('10')
Decimal('3.14')
# Decimal('3.14')
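# Constructing from a float captures the float's exact binary value
Decimal(3.14)
# Decimal('3.140000000000000124344978758017532527446746826171875')
# Tuple form: (sign: 0 for +, 1 for -; digits (3, 1, 4); exponent -2)
Decimal((0, (3, 1, 4), -2))
# Decimal('3.14')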
Decimal(str(2.0 ** 0.5))
# Decimal('1.4142135623730951')
float(Decimal(2.0 ** 0.5))
# 1.4142135623730951
str(Decimal(2.0 ** 0.5))
# '1.4142135623730951454746218587388284504413604736328125'

1.1+1.3
# 2.4000000000000004
print(Decimal(1.1) + Decimal(1.3))
# 2.400000000000000133226762955
1.1-1.3
# -0.19999999999999996
print(Decimal(1.1) - Decimal(1.3))
# -0.1999999999999999555910790150
1.1*1.3
# 1.4300000000000002
print(Decimal(1.1) * Decimal(1.3))
# 1.430000000000000164313007645
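
The long digit strings above come from the float arguments, which already carry binary rounding error before Decimal sees them. Building the operands from strings instead gives the exact results one would expect on paper:

```python
from decimal import Decimal

exact_sum = Decimal("1.1") + Decimal("1.3")
exact_diff = Decimal("1.1") - Decimal("1.3")
exact_prod = Decimal("1.1") * Decimal("1.3")

print(exact_sum)    # 2.4
print(exact_diff)   # -0.2
print(exact_prod)   # 1.43
```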

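getcontext().prec sets the number of significant digits used for arithmetic results, while the quantize() method rounds a number to a fixed number of decimal places.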

getcontext()
# Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[], traps=[InvalidOperation, DivisionByZero, Overflow])
getcontext().prec = 7       # use 7 significant digits

print(Decimal(1.1) * Decimal(1.3))
# 1.430000

print((Decimal(1.1) * Decimal(1.3)).quantize(Decimal('0.0000')))
# 1.4300
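
For money, quantize() is usually combined with an explicit rounding mode. The sketch below assumes the default context (ROUND_HALF_EVEN) is in effect when it starts; localcontext is used so that the precision change does not leak into other code. Neither ROUND_HALF_UP nor localcontext appears in the original text.

```python
from decimal import ROUND_HALF_UP, Decimal, localcontext

amount = Decimal("2.665")
cent = Decimal("0.01")

# Default rounding (ROUND_HALF_EVEN) sends ties to the even digit.
half_even = amount.quantize(cent)
# ROUND_HALF_UP sends ties away from zero, as in everyday rounding.
half_up = amount.quantize(cent, rounding=ROUND_HALF_UP)
print(half_even)   # 2.66
print(half_up)     # 2.67

# localcontext() changes the precision only inside the with block.
with localcontext() as ctx:
    ctx.prec = 4
    third = Decimal(1) / Decimal(3)
print(third)       # 0.3333
```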

adjusted() returns the adjusted exponent, that is, the position of the most significant digit relative to the decimal point.

321e+5
# 32100000.0
Decimal('321e+5')
# Decimal('3.21E+7')
print(Decimal('321e+5').adjusted())
# 7

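as_integer_ratio() returns a pair (n, d) of integers that represent the given Decimal instance as a fraction, in lowest terms and with a positive denominator:

# Tip: 22/7 is the rough ratio (疏率) and 355/113 the close ratio (密率) for pi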
Decimal('-3.14').as_integer_ratio()
# (-157, 50)

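compare() compares the values of two Decimal instances and returns Decimal('1') if the first is greater, Decimal('-1') if it is smaller, and Decimal('0') if they are equal.

print(Decimal(3.3).compare(Decimal(1.1)))
# 1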

copy_abs() returns the absolute value of the number.

print(Decimal('-3.3').copy_abs())
# 3.3

Unicodedata

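import unicodedata

def is_number(s):
    # First try the built-in float parser
    try:
        float(s)
        return True
    except ValueError:
        pass

    # Then fall back to the Unicode numeric value of a single character
    try:
        unicodedata.numeric(s)
        return True
    except (TypeError, ValueError):
        pass

    return False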

# Arabic-Indic digit 5
print(is_number('٥'))  # True
# Thai digit 2
print(is_number('๒'))  # True
# Chinese numeral 4
print(is_number('四')) # True
# Copyright sign
print(is_number('©'))  # False
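
To see what the second branch of is_number() relies on, unicodedata.numeric() can be called directly. In this sketch the characters are written as escapes so the code points are unambiguous, and the labels are descriptive names chosen for the example; passing None as the second argument makes numeric() return None instead of raising ValueError.

```python
import unicodedata

samples = {
    "ASCII digit 5": "5",
    "Arabic-Indic digit 5": "\u0665",
    "Thai digit 2": "\u0e52",
    "CJK numeral 4": "\u56db",
    "vulgar fraction 1/2": "\u00bd",
    "copyright sign": "\u00a9",
}

values = {label: unicodedata.numeric(ch, None) for label, ch in samples.items()}
for label, value in values.items():
    print(label, "->", value)

# numeric() accepts exactly one character; longer strings raise TypeError,
# which is why is_number() catches TypeError as well as ValueError.
try:
    unicodedata.numeric("12")
    multi_char_ok = True
except TypeError:
    multi_char_ok = False
print(multi_char_ok)   # False
```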

Enum (enumerations)

from enum import Enum, unique

Month = Enum('Month', ('Jan', 'Feb', 'Mar', 'Apr', 'May',
                       'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'))
for name, member in Month.__members__.items():
    print(name, '=>', member, ',', member.value)

Jan => Month.Jan , 1
Feb => Month.Feb , 2
Mar => Month.Mar , 3
Apr => Month.Apr , 4
May => Month.May , 5
Jun => Month.Jun , 6
Jul => Month.Jul , 7
Aug => Month.Aug , 8
Sep => Month.Sep , 9
Oct => Month.Oct , 10
Nov => Month.Nov , 11
Dec => Month.Dec , 12

# The @unique decorator checks that no two members share the same value.
@unique
class Weekday(Enum):
    Sun = 0  # the value of Sun is set to 0
    Mon = 1
    Tue = 2
    Wed = 3
    Thu = 4
    Fri = 5
    Sat = 6
day1 = Weekday.Mon
print(day1)
# Weekday.Mon
print(Weekday['Tue'])
# Weekday.Tue
print(Weekday(1))
# Weekday.Mon
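
What @unique guards against can be shown with a small sketch. Color and StrictColor are made-up enums for this illustration: without the decorator, a repeated value silently becomes an alias; with it, the class definition fails.

```python
from enum import Enum, unique

class Color(Enum):
    RED = 1
    CRIMSON = 1   # same value: becomes an alias of RED
    GREEN = 2

print(Color.CRIMSON is Color.RED)      # True
print([c.name for c in Color])         # ['RED', 'GREEN'] (aliases are skipped)
print(list(Color.__members__))         # ['RED', 'CRIMSON', 'GREEN']

# With @unique, the same definition is rejected when the class is created.
try:
    @unique
    class StrictColor(Enum):
        RED = 1
        CRIMSON = 1
    strict_failed = False
except ValueError:
    strict_failed = True
print(strict_failed)                   # True
```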

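# Implementing an enum-like class with __getattr__
# (this class reuses the name Enum, so it shadows enum.Enum imported above)
class Enum(object):

    def __init__(self, *enums_list, **enums_dict):
        if enums_list:
            self.__enums = list(enums_list)
            self.__type = 'list'
        else:
            self.__enums = dict(enums_dict)
            self.__type = 'dict'
        print(self.__enums)

    def __getattr__(self, attr):
        # Called only when normal attribute lookup fails
        if attr in self.__enums:
            if self.__type == 'list':
                # Position in the list, counted from 1
                return self.__enums.index(attr) + 1
            elif self.__type == 'dict':
                return self.__enums[attr]
        # Unknown names raise AttributeError instead of returning None
        raise AttributeError(attr)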

l = ['JAN', 'FEB', 'MAR', 'APR', 'MAY', 'JUN',
     'JUL', 'AUG', 'SEP', 'OCT', 'NOV', 'DEC']
m = Enum(*l)
print('MAY = %d' % m.MAY)
# ['JAN', 'FEB', 'MAR', 'APR', 'MAY', 'JUN', 'JUL', 'AUG', 'SEP', 'OCT', 'NOV', 'DEC']
# MAY = 5

d = {'JAN': 1, 'FEB': 2, 'MAR': 3, 'APR': 4, 'MAY': 5, 'JUN': 6,
     'JUL': 7, 'AUG': 8, 'SEP': 9, 'OCT': 10, 'NOV': 11, 'DEC': 12}
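m = Enum(**d)   # ** passes the dict as keyword arguments, so its values are kept
print('MAY = %d' % m.MAY)
# {'JAN': 1, 'FEB': 2, 'MAR': 3, 'APR': 4, 'MAY': 5, 'JUN': 6, 'JUL': 7, 'AUG': 8, 'SEP': 9, 'OCT': 10, 'NOV': 11, 'DEC': 12}
# MAY = 5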