Broadcasting in PyTorch and NumPy

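The word "cast" is an almost mythical presence in English: it seems to turn up everywhere, yet wherever you meet it, it is hard to pin down exactly what it means. In C++ terminology you often see compounds such as static_cast, dynamic_cast, const_cast, and reinterpret_cast. There is also the film title Cast Away, and the word appears in contexts as varied as voting, acting, magic, shape-shifting, and projection. The "cast" in broadcast is similar to the one in static_cast and dynamic_cast: it carries the sense of a "magical transformation".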

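Many operations on multi-dimensional arrays (matrices) place requirements on the shapes of their operands, which in many cases restricts what can be computed. For example, sometimes we want to operate on two arrays, but one of them first has to be reshaped in some "obvious" way before the operation makes sense. So we agree on one such obvious reshaping rule, and two arrays that can be operated on normally after applying it are called broadcastable.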

So which arrays are broadcastable, and what is the "obvious" reshaping in each case?

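  1. Same shape. Two arrays with the same shape are of course broadcastable, because they can be operated on directly without any reshaping.
  2. Different shapes. Align the two shapes starting from the trailing dimension. If, in every dimension where the sizes differ, one of the two arrays has size 1 or does not have that dimension at all, then the two arrays are broadcastable.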

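There is also one special precondition, as stated in the PyTorch documentation: every array must have at least one dimension, so a 0-dimensional array is not covered by these rules.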
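The two rules above can be sketched in plain Python, working on shape tuples only. This is a minimal illustration, not the implementation used by PyTorch or NumPy; the function name broadcast_shape is made up for this example.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or None if the shapes are incompatible."""
    result = []
    # Walk both shapes from the trailing dimension; a missing dimension counts as 1.
    for dim_a, dim_b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            return None  # sizes differ and neither of them is 1
    return tuple(reversed(result))


print(broadcast_shape((2, 1, 2), (1,)))   # (2, 1, 2)
print(broadcast_shape((1, 2), (2, 2)))    # (2, 2)
print(broadcast_shape((2, 3), (2,)))      # None
```

The last call fails because the trailing sizes 3 and 2 differ and neither is 1.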

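How are the arrays reshaped under rule 2? Simply by repeating the existing data along the dimension that needs to be filled. For example, when adding a = tensor([[1,2]]) and b = tensor([[2,2], [1,2]]), a is expanded to tensor([[1,2], [1,2]]).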

Here is another example:

from torch import tensor

a = tensor([[[1, 2]], [[1, 1]]])  # shape = (2, 1, 2)
b = tensor([1])  # shape = (1,)
a + b
# tensor([[[2, 3]], [[2, 2]]])

Here b is expanded to [[[1, 1]], [[1, 1]]].
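The same expansion can be made visible in NumPy. The sketch below assumes NumPy is installed and uses np.broadcast_to only to show the shape that b takes during the addition; the addition itself does not need this explicit step.

```python
import numpy as np

a = np.array([[[1, 2]], [[1, 1]]])   # shape (2, 1, 2)
b = np.array([1])                    # shape (1,)

# broadcast_to returns a read-only view of b stretched to a's shape
b_expanded = np.broadcast_to(b, a.shape)
print(b_expanded.tolist())   # [[[1, 1]], [[1, 1]]]
print((a + b).tolist())      # [[[2, 3]], [[2, 2]]]
```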

Four uses of parentheses () in Python

In Python, parentheses have at least four uses: as a grouping operator, to define a tuple, to define a function, and to call a function.

The basic usages are not covered here; only the tricky cases are discussed.

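a = (1) and a = (1,) are completely different. In an assignment, if a comma appears inside the parentheses, Python treats them as a tuple definition. Otherwise the parentheses are only a grouping operator, so (1) is simply the integer 1.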
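A short check of the difference; the variable names are only for illustration.

```python
a = (1)     # parentheses only group the expression
b = (1,)    # the trailing comma makes a one-element tuple
c = 1,      # the comma alone is enough; parentheses are optional here

print(type(a).__name__)  # int
print(type(b).__name__)  # tuple
print(b == c)            # True
```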

Four meanings of the asterisk * in Python

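In real code the asterisk * has four meanings: ordinary multiplication, multiplication of a sequence (repeating the sequence), declaring variadic arguments, and unpacking sequence-like variables.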

When used in multiplication and power operations

You may already know of this case. Python supports the built-in power operations as well as multiplication.

>>> 2 * 3
6
>>> 2 ** 3
8
>>> 1.414 * 1.414
1.9993959999999997
>>> 1.414 ** 1.414
1.6320575353248798

For repeatedly extending the list-type containers

Python also supports multiplying a list-type container (including a tuple) by an int, which repeats the container's contents the given number of times.

# Initialize the zero-valued list with 100 length
zeros_list = [0] * 100

# Declare the zero-valued tuple with 100 length
zeros_tuple = (0,) * 100

# Extending the "vector_list" by 3 times
vector_list = [[1, 2, 3]]
for i, vector in enumerate(vector_list * 3):
    print("{0} scalar product of vector: {1}".format((i + 1), [(i + 1) * e for e in vector]))
# 1 scalar product of vector: [1, 2, 3]
# 2 scalar product of vector: [2, 4, 6]
# 3 scalar product of vector: [3, 6, 9]
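One thing worth keeping in mind with this form: multiplying a list copies references, not the objects they point to. The following small sketch (variable names are made up) shows the effect with nested lists.

```python
rows = [[0] * 3] * 2          # both entries refer to the same inner list
rows[0][0] = 1
print(rows)                   # [[1, 0, 0], [1, 0, 0]]

independent = [[0] * 3 for _ in range(2)]   # a fresh inner list per row
independent[0][0] = 1
print(independent)            # [[1, 0, 0], [0, 0, 0]]
```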

For using the variadic arguments

We often need variadic arguments (or parameters) for some functions. For example, we need them when we don’t know how many arguments will be passed, or when a function has to process an arbitrary number of arguments.

There are 2 kinds of arguments in Python: positional arguments and keyword arguments. The former are specified by their position; the latter are passed with a keyword, which is the name of the argument.

Before looking at variadic positional/keyword arguments, we’ll briefly go over positional and keyword arguments.

# A function that shows the results of running competitions consisting of 2 to 4 runners.
def save_ranking(first, second, third=None, fourth=None):
    rank = {}
    rank[1], rank[2] = first, second
    rank[3] = third if third is not None else 'Nobody'
    rank[4] = fourth if fourth is not None else 'Nobody'
    print(rank)

# Pass the 2 positional arguments
save_ranking('ming', 'alice')
# Pass the 2 positional arguments and 1 keyword argument
save_ranking('alice', 'ming', third='mike')
# Pass the 2 positional arguments and 2 keyword arguments (but one of them is passed like a positional argument)
save_ranking('alice', 'ming', 'mike', fourth='jim')

The function above has 2 positional arguments, first and second, and 2 keyword arguments, third and fourth. Positional arguments cannot be omitted, and you must pass every one of them in the position in which it was declared. For keyword arguments, however, you can set a default value when declaring the function; if you omit the argument, the default is used as its value. That is, keyword arguments can be omitted.

Because keyword arguments can be omitted, they cannot be declared before positional arguments. The following code therefore raises a SyntaxError:

def save_ranking(first, second=None, third, fourth=None): ...

In the third call, however, there are 3 positional arguments and 1 keyword argument. For keyword arguments, if the position at which a value is passed matches the declared position, the keyword can be left out and the value passed positionally. In the call above, 'mike' is assigned to third automatically.

So far we’ve covered the basics of arguments. One problem remains: the function cannot handle an arbitrary number of runners because it has a fixed number of arguments. For that we need variadic arguments. Both positional and keyword arguments can be variadic. Let’s look at the following examples.

When using only positional arguments

def save_ranking(*args):
    print(args)

save_ranking('ming', 'alice', 'tom', 'wilson', 'roy')
# ('ming', 'alice', 'tom', 'wilson', 'roy')

When using only keyword arguments

def save_ranking(**kwargs):
    print(kwargs)

save_ranking(first='ming', second='alice', fourth='wilson', third='tom', fifth='roy')
# {'first': 'ming', 'second': 'alice', 'fourth': 'wilson', 'third': 'tom', 'fifth': 'roy'}

When using both positional arguments and keyword arguments

def save_ranking(*args, **kwargs):
    print(args)
    print(kwargs)

save_ranking('ming', 'alice', 'tom', fourth='wilson', fifth='roy')
# ('ming', 'alice', 'tom')
# {'fourth': 'wilson', 'fifth': 'roy'}

Above, *args means accepting an arbitrary number of positional arguments, and **kwargs means accepting an arbitrary number of keyword arguments. Collecting arguments with *args and **kwargs is called packing.

As you can see above, we are passing the arguments which can hold arbitrary numbers of positional or keyword values. The arguments passed as positional are stored in a tuple called args, and the arguments passed as keyword are stored in a dict called kwargs.

As mentioned before, keyword arguments cannot be declared before positional arguments, so the following code raises a SyntaxError:

def save_ranking(**kwargs, *args): ...

Variadic arguments are a very commonly used feature and can be seen in many open source projects. Most projects use the conventional names *args and **kwargs, but you can of course choose your own names, such as *required or **optional. (However, if your project is open source and the variadic arguments have no special meaning, it is good to follow the convention of using *args and **kwargs.)

For unpacking the containers

The * can also be used for unpacking containers. The principle is similar to “For using the variadic arguments” above. The simplest example is when we have data in the form of a list, tuple, or dict, and a function that takes variable arguments:

from functools import reduce

primes = [2, 3, 5, 7, 11, 13]

def product(*numbers):
    p = reduce(lambda x, y: x * y, numbers)
    return p

product(*primes)
# 30030

product(primes)
# [2, 3, 5, 7, 11, 13]

Because product() takes variable arguments, we need to unpack our list and pass its elements to the function. If we pass primes as *primes, every element of the list is unpacked and then stored in a tuple called numbers. If we pass the list without unpacking, numbers contains just one element, the primes list itself, rather than the elements of primes.

For a tuple it works exactly the same as for a list, and for a dict, just use ** instead of *.

headers = {
    'Accept': 'text/plain',
    'Content-Length': 348,
    'Host': 'http://mingrammer.com'
}

def pre_process(**headers):
    content_length = headers['Content-Length']
    print('content length: ', content_length)

    host = headers['Host']
    if 'https' not in host:
        raise ValueError('You must use SSL for http communication')

pre_process(**headers)
# content length:  348
# Traceback (most recent call last):
#   File "", line 1, in
#   File "", line 7, in pre_process
# ValueError: You must use SSL for http communication

There is one more kind of unpacking. It is not for function calls; it unpacks list or tuple data into other variables dynamically.

numbers = [1, 2, 3, 4, 5, 6]

# The left side of unpacking should be list or tuple.
*a, = numbers
# a = [1, 2, 3, 4, 5, 6]

*a, b = numbers
# a = [1, 2, 3, 4, 5]
# b = 6

a, *b, = numbers
# a = 1
# b = [2, 3, 4, 5, 6]

a, *b, c = numbers
# a = 1
# b = [2, 3, 4, 5]
# c = 6

Here *a and *b pack the remaining values into a list, apart from the single values assigned to the other, ordinary variables when the list or tuple is unpacked. It is the same concept as packing for variadic arguments.

Conclusion

So far we’ve covered the asterisk (*) in Python. It is interesting that one operator can do so many different things, and most of the uses above are basics for writing Pythonic code. The “For using the variadic arguments” section is especially important, yet Python beginners are often confused by the concept, so if you are a beginner, I would like you to understand it well.

Next, I’ll cover more interesting things about Python. Thank you.

Uploading files to Strapi with Python's requests library

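The requests library has several "convenient" traps when handling multipart/form-data uploads. Calling post in the following way looks like the most convenient option: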

files = {"key": "value"}
requests.post('url_to_server', files=files, ...)

In reality, calling the post API this way is full of traps.

First, when a file is uploaded in this form, there is no way to specify the MIME type of the data. For example:

files = {"name": open("path_to_file", "rb")}  # the content-type of the data cannot be specified this way

It should be written like this:
files = {"name": ("filename", open("path_to_file", "rb"), "multipart/form-data")}

However, this still cannot handle the special case of several files under the same name. A more powerful form is the following:
files = [
    ("name", ("filename", open("path_to_file", "rb"), "application/octet-stream")),
    ("name", ("filename", open("path_to_file2", "rb"), "text/plain")),
]

Here the name of the two elements may be the same.

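There is an even more general case: in the samples above, the filename slot can also be None. This drops the filename attribute, which is useful when uploading data that is not a file.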

files = [
    ("name1", (None, "my_value")),
    ("name2", ("filename", open("path_to_file", "rb"), "multipart/form-data")),
]

The Python iter function

The same concept often comes up in functional programming in JavaScript. For example, the combination of yield and a generator function actually returns an iterable object.

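Python's iter function, as the full word "iteration" suggests, takes an object that can be iterated over and turns it into an iterator: an object with a __next__() method (.next() in Python 2) that returns one item each time next() is called on it.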

For example, iter([1, 2, 3]).
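A minimal sketch of how the returned iterator behaves in Python 3; the variable names are only for illustration.

```python
it = iter([1, 2, 3])

print(next(it))   # 1
print(next(it))   # 2
print(next(it))   # 3

try:
    next(it)
except StopIteration:
    print("exhausted")   # the iterator signals the end with StopIteration

# A for loop or comprehension calls iter() and next() behind the scenes
collected = [x for x in iter((10, 20))]
print(collected)   # [10, 20]
```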

The trap of importing modules by relative path in Python

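The trap here is the difference between using a module by importing it and running that same module directly as a Python script. Suppose the project has the following structure: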

/src
    /main
        module.py
        test.py

In module.py:


from .test import *


Then run the following commands in the src directory:
> python
> import main.module

This works without any problem.

However, if you run the following command
python main/module.py

it fails with an error saying that the .test module cannot be found!

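It turns out that when Python imports a module by relative path, it first converts the relative name into an absolute one, based on the importing module's __name__ (more precisely, on the package that name belongs to). In the first case __name__ is main.module, so the package is main, and the import is equivalent to running import main.test in the src directory, which of course works.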

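In the second case, however, __name__ equals '__main__', and that is the problem: there is no parent package, so it is roughly like running import __main__.test in the src directory, which of course fails.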
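The name resolution can be observed with importlib.util.resolve_name from the standard library, which mirrors how a relative name is turned into an absolute one. This is only an illustration of the two cases above, with the package names passed in by hand.

```python
from importlib.util import resolve_name

# Imported as main.module: the parent package is "main"
print(resolve_name(".test", "main"))   # main.test

# Run as a script: __name__ is "__main__" and there is no parent package
try:
    resolve_name(".test", "")
    resolved = True
except (ImportError, ValueError):      # the exception type depends on the Python version
    resolved = False
print(resolved)                        # False
```

Running the module with python -m main.module from the src directory keeps the package context, so the relative import resolves as in the first case.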