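NumPy
Contents: advantages of NumPy; ndarray attributes; basic operations (ndarray.func() and numpy.func()); ndarray operations (logical operations, statistical operations, operations between arrays); merging and splitting; IO and data processing (pandas is normally used for the latter).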
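Advantages of NumPy
NumPy (Numerical Python) is a Python library for numerical computing, used to process arrays of any dimension quickly. The name ndarray stands for N-dimensional array: n for any number, d for dimension. NumPy uses the ndarray object to handle multidimensional arrays; it is a fast and flexible container for large data sets. The ndarray type describes a collection of items that all have the same type.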
import numpy as np
score = np.array([[80, 89, 86, 67, 79],
                  [78, 97, 89, 67, 81],
                  [90, 94, 78, 67, 74],
                  [91, 91, 90, 67, 69],
                  [76, 87, 75, 67, 86],
                  [70, 79, 84, 67, 84],
                  [94, 92, 93, 67, 64],
                  [86, 85, 83, 67, 80]])
score
array([[80, 89, 86, 67, 79],
       [78, 97, 89, 67, 81],
       [90, 94, 78, 67, 74],
       [91, 91, 90, 67, 69],
       [76, 87, 75, 67, 86],
       [70, 79, 84, 67, 84],
       [94, 92, 93, 67, 64],
       [86, 85, 83, 67, 80]])
type(score)
numpy.ndarray
## Efficiency of ndarray compared with list
import random
import time
import numpy as np
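a = []
for i in range(5000000):
    a.append(random.random())
t1 = time.time()
sum1 = sum(a)
t2 = time.time()

b = np.array(a)
t4 = time.time()
sum3 = np.sum(b)
t5 = time.time()
print("Time to sum a native list:", t2 - t1, "\tTime to sum an ndarray:", t5 - t4)
Time to sum a native list: 0.03126645088195801 	Time to sum an ndarray: 0.0027697086334228516
In this run, summing the ndarray took roughly a tenth of the time needed to sum the native list. NumPy designs its operations specifically around ndarray, so its storage efficiency and input/output performance are far better than those of nested Python lists. There are three reasons for this.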
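First, memory layout: an ndarray must hold elements of a single type, so they can be stored contiguously. A list is more general and can hold values of different types, so it stores references to its elements.
Second, vectorized processing: ndarray supports operations that apply to the whole array at once, without a Python-level loop.
Third, implementation language: NumPy is implemented in C, and many of its operations release the GIL (the global interpreter lock).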
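A minimal sketch of the first two points, assuming nothing beyond NumPy itself (the variable names are made up for illustration). It shows that an array keeps one dtype in one contiguous buffer, that a list can mix types because it only holds references, and that array arithmetic needs no explicit loop.

```python
import numpy as np

values = [1.5, 2.5, 3.5, 4.5]
arr = np.array(values)

# One dtype for every element, stored in a single contiguous buffer.
same_type = arr.dtype == np.float64
contiguous = arr.flags["C_CONTIGUOUS"]
bytes_used = arr.nbytes  # 4 elements * 8 bytes each

# A list can mix types because it only stores references to objects.
mixed = [1, "two", 3.0]
mixed_types = [type(x).__name__ for x in mixed]

# Vectorized arithmetic: no explicit Python loop is needed.
doubled_arr = arr * 2
doubled_list = [v * 2 for v in values]

print(same_type, contiguous, bytes_used)
print(mixed_types)
print(doubled_arr.tolist() == doubled_list)
```

The list comprehension and the array expression give the same numbers; the difference is that the array version runs its loop inside compiled code.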
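ndarray attributes
Attributes
ndarray.shape: tuple of the array's dimensions
ndarray.ndim: number of dimensions
ndarray.size: number of elements in the array
ndarray.itemsize: size of one array element in bytes
ndarray.dtype: type of the array elements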
score
print(score.shape)     # (8, 5): 8 rows, 5 columns
print(score.ndim)      # 2
print(score.size)      # 40
print(score.itemsize)  # 4
print(score.dtype)     # int32 (the default integer type depends on the platform and NumPy version)
(8, 5)
2
40
4
int32
## Shape of an ndarray
import numpy as np
a = np.array([1, 2, 3, 4])
b = np.array([[1, 2, 3], [3, 4, 5]])
c = np.array([[[1, 3, 4], [3, 4, 5]], [[1, 5, 7], [4, 7, 8]]])
print(a.shape, b.shape, c.shape)
(4,) (2, 3) (2, 2, 3)
print(a, "\n\n", b, "\n\n", c)
[1 2 3 4]

 [[1 2 3]
 [3 4 5]]

 [[[1 3 4]
  [3 4 5]]

 [[1 5 7]
  [4 7 8]]]
data = np.array([1.1, 2.2, 3.3], dtype=np.float32)
data2 = np.array([1.2, 2.2, 3.2], dtype="float32")
print(data, data.dtype, data2, data2.dtype)
[1.1 2.2 3.3] float32 [1.2 2.2 3.2] float32
Generating arrays
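Arrays of zeros and ones: np.ones(shape[, dtype, order]), np.zeros(shape[, dtype, order])
np.ones(shape=(2, 3), dtype="int32")
np.zeros(shape=(2, 3), dtype=np.float32)
From an existing array: np.array(), np.copy(), np.asarray()
data1 = np.array(score)    ## deep copy
data2 = np.asarray(score)  ## shallow copy: no new data is allocated when score is already an ndarray
data3 = np.copy(score)     ## deep copy
Arrays over a fixed range: np.linspace(start, stop, num, endpoint, retstep, dtype), np.arange()
np.linspace(0, 10, 100)  ## 100 evenly spaced values over [0, 10]
np.arange(a, b, c)       ## values in [a, b) with step c
Random arrays:
np.random.rand(d0, d1, d2, ...) returns uniformly distributed values in [0.0, 1.0); d0, d1, d2 are the sizes of the dimensions.
np.random.uniform(low=0.0, high=1.0, size=None) draws from a uniform distribution over [low, high). An int size gives that many samples in a one-dimensional array; a tuple size gives an array of that shape.
np.random.normal(loc=0.0, scale=1.0, size=None) draws from a normal distribution with mean loc, standard deviation scale, and output shape size.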
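The difference between the deep and shallow copies above can be checked directly. The sketch below uses a small made-up array (`source`) rather than `score`, modifies it after copying, and looks at which results follow the change.

```python
import numpy as np

source = np.array([[1, 2], [3, 4]])

copied = np.array(source)      # new buffer (np.array copies by default)
aliased = np.asarray(source)   # the same object when the input is already an ndarray
cloned = np.copy(source)       # new buffer

source[0, 0] = 100

print(copied[0, 0], aliased[0, 0], cloned[0, 0])   # 1 100 1
print(aliased is source)                           # True
print(np.shares_memory(copied, source))            # False
```

Only the `np.asarray` result sees the update, because it refers to the same data as `source`.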
np.ones(shape=(2, 4))
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.]])
np.zeros((4, 3))
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])
data1 = np.array([1, 3, 4, 5])
data1
array([1, 3, 4, 5])
data2 = np.asarray(data1)
data2
array([1, 3, 4, 5])
data3 = np.copy(data1)
data3
array([1, 3, 4, 5])
np.linspace(0, 10, 100)
array([ 0.        ,  0.1010101 ,  0.2020202 ,  0.3030303 ,  0.4040404 ,
        0.50505051,  0.60606061,  0.70707071,  0.80808081,  0.90909091,
        1.01010101,  1.11111111,  1.21212121,  1.31313131,  1.41414141,
        1.51515152,  1.61616162,  1.71717172,  1.81818182,  1.91919192,
        2.02020202,  2.12121212,  2.22222222,  2.32323232,  2.42424242,
        2.52525253,  2.62626263,  2.72727273,  2.82828283,  2.92929293,
        3.03030303,  3.13131313,  3.23232323,  3.33333333,  3.43434343,
        3.53535354,  3.63636364,  3.73737374,  3.83838384,  3.93939394,
        4.04040404,  4.14141414,  4.24242424,  4.34343434,  4.44444444,
        4.54545455,  4.64646465,  4.74747475,  4.84848485,  4.94949495,
        5.05050505,  5.15151515,  5.25252525,  5.35353535,  5.45454545,
        5.55555556,  5.65656566,  5.75757576,  5.85858586,  5.95959596,
        6.06060606,  6.16161616,  6.26262626,  6.36363636,  6.46464646,
        6.56565657,  6.66666667,  6.76767677,  6.86868687,  6.96969697,
        7.07070707,  7.17171717,  7.27272727,  7.37373737,  7.47474747,
        7.57575758,  7.67676768,  7.77777778,  7.87878788,  7.97979798,
        8.08080808,  8.18181818,  8.28282828,  8.38383838,  8.48484848,
        8.58585859,  8.68686869,  8.78787879,  8.88888889,  8.98989899,
        9.09090909,  9.19191919,  9.29292929,  9.39393939,  9.49494949,
        9.5959596 ,  9.6969697 ,  9.7979798 ,  9.8989899 , 10.        ])
np.arange(0, 100, 2)
array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32,
       34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66,
       68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98])
np.random.uniform(1, 2, 20)
array([1.08186729, 1.14786875, 1.70033877, 1.21356519, 1.80826522,
       1.82539046, 1.2411259 , 1.94754535, 1.26016768, 1.95195603,
       1.83118684, 1.93096164, 1.42540342, 1.01900246, 1.00777939,
       1.94587154, 1.30147204, 1.85872718, 1.51138215, 1.72144173])
np.random.rand(2, 3)
array([[0.93695681, 0.54056962, 0.05346231],
       [0.25430123, 0.4679477 , 0.42365386]])
data4 = np.random.normal(0, 1, 10000000)
data4
array([-1.37843425,  0.43112438,  0.74566392, ...,  1.11031839,
       -0.35627334, -0.49286865])
import matplotlib.pyplot as plt
plt.figure(figsize=(15, 8), dpi=80)
plt.hist(data4, 1000)
plt.show()
Slicing and indexing arrays
import numpy as np
stock_change = np.random.normal(loc=0, scale=1, size=(8, 10))
stock_change
array([[-0.0128315 ,  1.36389291,  1.67468755, -1.63839812,  0.50246918,
         0.40632079,  0.5468709 , -1.51506239, -0.95175431,  0.79676231],
       [-0.29024725, -0.85783328, -2.88228976,  0.09475102,  0.26886068,
        -0.72337737,  0.32906655,  1.38442008,  0.22017286,  0.11595155],
       [-1.48797053, -0.34888996, -0.46878054,  0.06614233, -1.2163201 ,
        -0.12437208, -0.48048511,  0.92053831,  1.37148844,  0.4052761 ],
       [-0.68483909,  1.45441467,  0.32439071,  2.09266866, -1.40087978,
         0.21482243,  1.06350017, -1.12371055, -0.21362273, -0.86489608],
       [-0.8955743 , -2.80666246, -1.81775787, -0.64719575, -1.03749633,
        -0.09075791,  0.04027887,  0.88156425, -0.38851649,  0.4366844 ],
       [-0.6112534 ,  0.20743331, -1.10785011, -1.94937533,  0.79183302,
        -1.43629441, -0.39276676,  1.43465142, -0.77917209,  0.75375268],
       [-0.45255197,  0.21874378,  0.74356075,  0.89123163,  0.80052696,
         0.07645454,  1.18475498,  1.21210169, -2.57089921, -0.04719686],
       [ 1.49996354,  1.73125796,  0.35972564, -0.31768555, -0.23859956,
         0.14878977,  1.78480518, -0.157626  ,  0.52180221,  1.53564593]])
stock_change[0, 0:3]  # elements 0 up to (but not including) 3 of the first row; the slice is left-closed, right-open
array([-1.23848824,  1.80273454,  0.48612183])
a1 = np.array([[[1, 2, 3], [4, 5, 6]], [[12, 3, 4], [5, 6, 7]]])
a1
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[12,  3,  4],
        [ 5,  6,  7]]])
a1[1, 0, 2]  ## third element of the first 1-D array in the second 2-D array
4
Changing the shape
ndarray.reshape(shape)
ndarray.resize(shape)
ndarray.T
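A small sketch of how the three differ, using made-up arrays `grid` and `owner`: reshape returns a new array object and leaves the original shape alone, resize changes the array in place and returns None, and T swaps rows and columns, which is not the same rearrangement as reshape.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)      # [[0, 1, 2], [3, 4, 5]]

reshaped = grid.reshape(3, 2)          # new array object; grid keeps shape (2, 3)
shape_after_reshape = grid.shape

transposed = grid.T                    # [[0, 3], [1, 4], [2, 5]]
# reshape keeps the flat element order, transpose does not
same_as_transpose = np.array_equal(reshaped, transposed)

owner = np.arange(6)
result = owner.resize((3, 2))          # modifies owner in place and returns None

print(shape_after_reshape, reshaped.shape, transposed.shape)
print(same_as_transpose, result, owner.shape)
```

Both `reshaped` and `transposed` have shape (3, 2), but their contents differ, so reshape should not be used as a substitute for a transpose.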
print(stock_change.shape)
print(stock_change)
data = stock_change.reshape(10, 8)  ## returns a new array; stock_change itself keeps its shape
print(data.shape)
print(data)
(10, 8)
[[-1.23848824  1.80273454  0.48612183 -0.72560924  0.70273282  1.0001417  -1.50264292  0.07910228]
 [ 0.50097203 -0.30643765 -2.06606864  1.06603865 -0.24707909 -0.43582239  1.40507793  0.16617008]
 [ 0.90592803  0.42831191 -0.92043446 -0.86909989  1.86906101 -0.27504789 -0.85507962 -0.06812796]
 [-0.47386474 -0.12860694  0.78529739  0.6299527   1.35195163  0.52554048 -1.44443021 -0.30228474]
 [-2.00270709 -0.93547033 -1.91377025 -0.44282643  0.39398671 -1.15777911  1.06886255 -0.99258445]
 [ 1.46011953  0.02989662 -0.57156073  0.33255032  1.10206919  1.10728184 -0.2309872  -0.36046913]
 [ 0.6419396   0.45193213 -0.28647482  2.35270101 -1.36580147 -0.3416711  -0.68923525  0.40515396]
 [-0.65856583 -0.80067154  1.00151152 -0.59024112  1.72517446  0.99283299  0.32894163  0.29112266]
 [-0.02950995  1.00548516  0.28799688 -0.23560119 -0.27545952 -2.06756887  0.10599702  1.29010633]
 [ 0.10229354 -1.61937238 -2.19289266 -2.0243394  -1.584921    1.1576834   0.11722609  1.00201755]]
(10, 8)
[[-1.23848824  1.80273454  0.48612183 -0.72560924  0.70273282  1.0001417  -1.50264292  0.07910228]
 [ 0.50097203 -0.30643765 -2.06606864  1.06603865 -0.24707909 -0.43582239  1.40507793  0.16617008]
 [ 0.90592803  0.42831191 -0.92043446 -0.86909989  1.86906101 -0.27504789 -0.85507962 -0.06812796]
 [-0.47386474 -0.12860694  0.78529739  0.6299527   1.35195163  0.52554048 -1.44443021 -0.30228474]
 [-2.00270709 -0.93547033 -1.91377025 -0.44282643  0.39398671 -1.15777911  1.06886255 -0.99258445]
 [ 1.46011953  0.02989662 -0.57156073  0.33255032  1.10206919  1.10728184 -0.2309872  -0.36046913]
 [ 0.6419396   0.45193213 -0.28647482  2.35270101 -1.36580147 -0.3416711  -0.68923525  0.40515396]
 [-0.65856583 -0.80067154  1.00151152 -0.59024112  1.72517446  0.99283299  0.32894163  0.29112266]
 [-0.02950995  1.00548516  0.28799688 -0.23560119 -0.27545952 -2.06756887  0.10599702  1.29010633]
 [ 0.10229354 -1.61937238 -2.19289266 -2.0243394  -1.584921    1.1576834   0.11722609  1.00201755]]
stock_change.resize((10, 8))  ## returns nothing; changes stock_change in place
stock_change
array([[-1.23848824,  1.80273454,  0.48612183, -0.72560924,  0.70273282,
         1.0001417 , -1.50264292,  0.07910228],
       [ 0.50097203, -0.30643765, -2.06606864,  1.06603865, -0.24707909,
        -0.43582239,  1.40507793,  0.16617008],
       [ 0.90592803,  0.42831191, -0.92043446, -0.86909989,  1.86906101,
        -0.27504789, -0.85507962, -0.06812796],
       [-0.47386474, -0.12860694,  0.78529739,  0.6299527 ,  1.35195163,
         0.52554048, -1.44443021, -0.30228474],
       [-2.00270709, -0.93547033, -1.91377025, -0.44282643,  0.39398671,
        -1.15777911,  1.06886255, -0.99258445],
       [ 1.46011953,  0.02989662, -0.57156073,  0.33255032,  1.10206919,
         1.10728184, -0.2309872 , -0.36046913],
       [ 0.6419396 ,  0.45193213, -0.28647482,  2.35270101, -1.36580147,
        -0.3416711 , -0.68923525,  0.40515396],
       [-0.65856583, -0.80067154,  1.00151152, -0.59024112,  1.72517446,
         0.99283299,  0.32894163,  0.29112266],
       [-0.02950995,  1.00548516,  0.28799688, -0.23560119, -0.27545952,
        -2.06756887,  0.10599702,  1.29010633],
       [ 0.10229354, -1.61937238, -2.19289266, -2.0243394 , -1.584921  ,
         1.1576834 ,  0.11722609,  1.00201755]])
stock_change.T  ## transpose
array([[-1.23848824,  0.50097203,  0.90592803, -0.47386474, -2.00270709,
         1.46011953,  0.6419396 , -0.65856583, -0.02950995,  0.10229354],
       [ 1.80273454, -0.30643765,  0.42831191, -0.12860694, -0.93547033,
         0.02989662,  0.45193213, -0.80067154,  1.00548516, -1.61937238],
       [ 0.48612183, -2.06606864, -0.92043446,  0.78529739, -1.91377025,
        -0.57156073, -0.28647482,  1.00151152,  0.28799688, -2.19289266],
       [-0.72560924,  1.06603865, -0.86909989,  0.6299527 , -0.44282643,
         0.33255032,  2.35270101, -0.59024112, -0.23560119, -2.0243394 ],
       [ 0.70273282, -0.24707909,  1.86906101,  1.35195163,  0.39398671,
         1.10206919, -1.36580147,  1.72517446, -0.27545952, -1.584921  ],
       [ 1.0001417 , -0.43582239, -0.27504789,  0.52554048, -1.15777911,
         1.10728184, -0.3416711 ,  0.99283299, -2.06756887,  1.1576834 ],
       [-1.50264292,  1.40507793, -0.85507962, -1.44443021,  1.06886255,
        -0.2309872 , -0.68923525,  0.32894163,  0.10599702,  0.11722609],
       [ 0.07910228,  0.16617008, -0.06812796, -0.30228474, -0.99258445,
        -0.36046913,  0.40515396,  0.29112266,  1.29010633,  1.00201755]])
Changing the type and removing duplicates
ndarray.astype(type) converts the element type
Serializing an ndarray: ndarray.tostring() (deprecated), ndarray.tobytes()
np.unique() removes duplicates
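A short sketch of the three operations on a small made-up array (`prices`). It also shows one detail worth knowing: `tobytes()` stores only the raw buffer, so the dtype and shape have to be supplied again when reading the bytes back, here with `np.frombuffer`.

```python
import numpy as np

prices = np.array([[1.9, -1.9], [0.4, 2.0]])

# Converting float to int truncates toward zero: 1.9 -> 1, -1.9 -> -1.
as_int = prices.astype(np.int32)

# tobytes() keeps only the raw buffer: no shape or dtype is stored.
raw = prices.tobytes()
restored = np.frombuffer(raw, dtype=prices.dtype).reshape(prices.shape)

# unique() flattens the input and returns the sorted distinct values.
unique_values = np.unique(np.array([[1, 2, 3, 4], [3, 4, 5, 6]]))

print(as_int.tolist())                       # [[1, -1], [0, 2]]
print(len(raw))                              # 32 (4 elements * 8 bytes)
print(np.array_equal(restored, prices))      # True
print(unique_values.tolist())                # [1, 2, 3, 4, 5, 6]
```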
stock_change.astype(np.int32)
array([[ 0,  1,  1, -1,  0,  0,  0, -1,  0,  0],
       [ 0,  0, -2,  0,  0,  0,  0,  1,  0,  0],
       [-1,  0,  0,  0, -1,  0,  0,  0,  1,  0],
       [ 0,  1,  0,  2, -1,  0,  1, -1,  0,  0],
       [ 0, -2, -1,  0, -1,  0,  0,  0,  0,  0],
       [ 0,  0, -1, -1,  0, -1,  0,  1,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  1,  1, -2,  0],
       [ 1,  1,  0,  0,  0,  0,  1,  0,  0,  1]])
stock_change.tobytes()  ## older code used tostring() for this
b'\x10\x83d\xcbfG\x8a\xbf\x06\xcb\n_\x81\xd2\xf5?...'  (remaining bytes of the 640-byte string omitted)
temp = np.array([[1, 2, 3, 4], [3, 4, 5, 6]])
np.unique(temp)
array([1, 2, 3, 4, 5, 6])
ndarray operations
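Logical operations:
stock_change > 0.5 marks values greater than 0.5 as True and all others as False.
stock_change[stock_change > 0.5] returns all values greater than 0.5.
stock_change[stock_change > 0.5] = 1.1 sets all values greater than 0.5 to 1.1.
np.all(boolean array) returns True only when every element is True; a single False makes it False. np.all(stock_change[0:2, 0:5] > 0) checks whether every value in the slice is greater than 0.
np.any(boolean array) returns True when at least one element is True; it returns False only when all are False. np.any(stock_change[0:2, 0:5] > 0) checks whether any value in the slice is greater than 0.
Ternary operator: np.where(condition, value where True, value where False)
np.where(stock_change > 0, 1, 0) sets values greater than 0 to 1 and the rest to 0.
np.where(np.logical_and(stock_change > 0.5, stock_change < 1), 1, 0) sets values greater than 0.5 and less than 1 to 1, and the rest to 0.
np.where(np.logical_or(stock_change > 0.5, stock_change < -0.5), 1, 0) sets values greater than 0.5 or less than -0.5 to 1, and the rest to 0.
Statistical operations:
The statistics functions min, max, mean, median, var, and std take an axis parameter. For a 2-D array, axis=1 computes one result per row and axis=0 computes one result per column.
np.max(a, axis=1) / ndarray.max(axis=1) / np.max(a) / ndarray.max()
Position of the maximum or minimum: np.argmax(a, axis) / np.argmin(a, axis)
Operations between arrays:
Array and scalar: arr +, -, *, / and so on apply the same operation to every element of the array.
Array and array: the shapes must satisfy the broadcasting rules.
Broadcasting: when an operation involves two arrays, NumPy compares their shapes position by position, starting from the last dimension. The arrays are compatible only if, in every position, the sizes are equal or one of them is 1. The result takes the larger size in each position.
Matrix operations:
A matrix must be two-dimensional, whereas an array can be one-dimensional. np.mat() converts an array to a matrix (in NumPy 2.0 and later, use np.asmatrix()). There are two ways to store a matrix: a 2-D ndarray or the matrix data structure.
Matrix multiplication: (m, n) * (n, l) = (m, l), so the number of columns of the first matrix must equal the number of rows of the second.
np.matmul() computes the product of two matrices.
np.dot() computes the dot product of vectors; for 2-D arrays it gives the matrix product.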
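The broadcasting rule can be tried on a few shapes. The sketch below uses made-up arrays and `np.broadcast_shapes` (available in NumPy 1.20 and later) to show two compatible pairs and one pair that fails.

```python
import numpy as np

a = np.ones((2, 6))
b = np.array([[1], [3]])        # shape (2, 1): the 1 stretches to 6
c = np.arange(6)                # shape (6,): treated as (1, 6)

print((a * b).shape)            # (2, 6)
print((b + c).shape)            # (2, 6)

# Shapes are aligned from the right: (8, 1, 5) with (7, 1) gives (8, 7, 5).
print(np.broadcast_shapes((8, 1, 5), (7, 1)))

try:
    a + np.ones((3, 6))         # 2 vs 3 in the first dimension: incompatible
except ValueError as exc:
    print("broadcast error:", type(exc).__name__)
```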
Logical operations
import numpy as np
stock_change = np.random.normal(loc=0, scale=1, size=(8, 10))
stock_change > 0.5
array([[False, False, False, False, False, False,  True,  True, False, False],
       [False, False, False, False, False, False, False,  True, False, False],
       [False,  True, False, False, False, False,  True,  True,  True,  True],
       [False,  True,  True, False, False,  True, False, False, False, False],
       [False, False, False,  True, False, False, False,  True,  True, False],
       [False, False, False, False, False, False, False, False,  True, False],
       [False, False, False,  True, False,  True, False, False,  True,  True],
       [False, False, False, False,  True, False,  True, False,  True, False]])
stock_change[stock_change > 0.5]
array([1.36389291, 1.67468755, 0.50246918, 0.5468709 , 0.79676231,
       1.38442008, 0.92053831, 1.37148844, 1.45441467, 2.09266866,
       1.06350017, 0.88156425, 0.79183302, 1.43465142, 0.75375268,
       0.74356075, 0.89123163, 0.80052696, 1.18475498, 1.21210169,
       1.49996354, 1.73125796, 1.78480518, 0.52180221, 1.53564593])
stock_change[stock_change > 0.5] = 1.1
stock_change
array([[-0.0128315 ,  1.1       ,  1.1       , -1.63839812,  1.1       ,
         0.40632079,  1.1       , -1.51506239, -0.95175431,  1.1       ],
       [-0.29024725, -0.85783328, -2.88228976,  0.09475102,  0.26886068,
        -0.72337737,  0.32906655,  1.1       ,  0.22017286,  0.11595155],
       [-1.48797053, -0.34888996, -0.46878054,  0.06614233, -1.2163201 ,
        -0.12437208, -0.48048511,  1.1       ,  1.1       ,  0.4052761 ],
       [-0.68483909,  1.1       ,  0.32439071,  1.1       , -1.40087978,
         0.21482243,  1.1       , -1.12371055, -0.21362273, -0.86489608],
       [-0.8955743 , -2.80666246, -1.81775787, -0.64719575, -1.03749633,
        -0.09075791,  0.04027887,  1.1       , -0.38851649,  0.4366844 ],
       [-0.6112534 ,  0.20743331, -1.10785011, -1.94937533,  1.1       ,
        -1.43629441, -0.39276676,  1.1       , -0.77917209,  1.1       ],
       [-0.45255197,  0.21874378,  1.1       ,  1.1       ,  1.1       ,
         0.07645454,  1.1       ,  1.1       , -2.57089921, -0.04719686],
       [ 1.1       ,  1.1       ,  0.35972564, -0.31768555, -0.23859956,
         0.14878977,  1.1       , -0.157626  ,  1.1       ,  1.1       ]])
print(np.all(stock_change[0:2, 0:5] > 0))
print(np.any(stock_change[0:2, 0:5] > 0))
False
True
print(np.where(stock_change > 0, 1, 0))
[[0 1 1 0 0 0 1 1 1 0]
 [0 1 0 0 0 1 1 0 0 1]
 [0 1 0 0 0 0 0 1 0 0]
 [1 0 1 0 1 1 0 0 0 1]
 [1 0 0 1 0 1 0 0 1 0]
 [1 1 1 1 0 1 1 1 0 1]
 [0 1 0 1 0 0 1 0 1 0]
 [1 1 1 0 1 1 1 0 0 1]]
print(np.where(np.logical_and(stock_change > 0.5, stock_change < 1), 1, 0))
[[0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 1]
 [0 1 0 0 0 0 0 0 0 0]
 [0 0 0 0 1 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 1 0 0 0 1 1 0 0]
 [0 0 0 0 0 0 1 0 0 0]
 [0 0 0 0 0 0 0 0 0 1]]
print(np.where(np.logical_or(stock_change > 0.5, stock_change < -0.5), 1, 0))
[[1 0 0 0 1 0 0 1 1 0]
 [1 1 1 1 0 1 1 1 0 1]
 [0 1 1 1 1 0 1 0 1 0]
 [1 0 1 0 1 1 1 0 0 1]
 [0 1 0 0 0 1 0 1 1 0]
 [0 1 1 0 0 0 1 1 1 1]
 [1 0 0 0 0 0 1 1 0 0]
 [1 0 1 0 1 0 0 1 1 1]]
Statistical operations
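Because random data makes the axis convention hard to verify by eye, here is a small check on a made-up 2 x 3 array (`scores`): axis=0 gives one result per column, axis=1 gives one result per row, and argmax without an axis indexes the flattened array.

```python
import numpy as np

scores = np.array([[80, 89, 86],
                   [78, 97, 89]])

col_max = scores.max(axis=0)              # one value per column
row_max = scores.max(axis=1)              # one value per row
row_argmax = np.argmax(scores, axis=1)    # column index of each row's maximum
flat_argmax = np.argmax(scores)           # index into the flattened array

print(col_max.tolist())       # [80, 97, 89]
print(row_max.tolist())       # [89, 97]
print(row_argmax.tolist())    # [1, 1]
print(flat_argmax, np.unravel_index(flat_argmax, scores.shape))
```

`np.unravel_index` converts the flat position back into a (row, column) pair, which is often what is wanted after an argmax over the whole array.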
print(np.max(stock_change), stock_change.max())
2.837073584187165 2.837073584187165
print(np.mean(stock_change, axis=0), np.mean(stock_change, axis=1), np.mean(stock_change))
[-0.9652667  -0.15328082  0.08317861 -0.54300528 -0.42430401 -0.27689675 -0.03939256  0.58928582  0.11866925  0.06092911] [-0.24814861 -0.59923979  0.47094442  0.21607003 -0.15542244 -0.36903679 -0.12744662 -0.42778684] -0.15500833265906144
print(np.argmax(stock_change), np.argmax(stock_change, axis=1))
32 [7 7 7 2 3 8 5 8]
Array operations
Array and scalar operations
arr = np.array([[1, 2, 3, 2, 1, 4], [5, 6, 1, 2, 3, 1]])
arr
array([[1, 2, 3, 2, 1, 4],
       [5, 6, 1, 2, 3, 1]])
arr + 1
array([[2, 3, 4, 3, 2, 5],
       [6, 7, 2, 3, 4, 2]])
arr * 2
array([[ 2,  4,  6,  4,  2,  8],
       [10, 12,  2,  4,  6,  2]])
arr / 2
array([[0.5, 1. , 1.5, 1. , 0.5, 2. ],
       [2.5, 3. , 0.5, 1. , 1.5, 0.5]])
arr - 2
array([[-1,  0,  1,  0, -1,  2],
       [ 3,  4, -1,  0,  1, -1]])
Array and array operations
arr1 = np.array([[1, 2, 3, 2, 1, 4], [5, 6, 1, 2, 3, 1]])
arr2 = np.array([[1], [3]])
print(arr1, "\n\n", arr2)
[[1 2 3 2 1 4]
 [5 6 1 2 3 1]]

 [[1]
 [3]]
print(arr1 * arr2, "\n\n", arr1 / arr2)
[[ 1  2  3  2  1  4]
 [15 18  3  6  9  3]]

 [[1.         2.         3.         2.         1.         4.        ]
 [1.66666667 2.         0.33333333 0.66666667 1.         0.33333333]]
Matrix operations
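A minimal sketch, using a shortened made-up version of the score table, of how matrix multiplication looks with plain 2-D ndarrays. On ndarrays the `@` operator, `np.matmul`, and `np.dot` all give the matrix product for 2-D inputs, while `*` stays elementwise with broadcasting.

```python
import numpy as np

scores = np.array([[80, 86], [82, 80], [85, 78]])   # shape (3, 2)
weights = np.array([[0.3], [0.7]])                  # shape (2, 1)

final = scores @ weights                            # matrix product, shape (3, 1)
same_matmul = np.allclose(final, np.matmul(scores, weights))
same_dot = np.allclose(final, np.dot(scores, weights))

# On ndarrays, * multiplies element by element (with broadcasting).
elementwise_shape = (scores * weights.T).shape

print(final.shape, same_matmul, same_dot, elementwise_shape)
print(np.round(final.ravel(), 1).tolist())          # [84.2, 80.6, 80.1]
```

With the matrix type, `*` means matrix multiplication instead, which is one reason mixing the two types in the same code can be confusing.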
data = np.array([[80, 86], [82, 80], [85, 78], [90, 90], [86, 82], [82, 90], [78, 80], [92, 94]])
data
array([[80, 86],
       [82, 80],
       [85, 78],
       [90, 90],
       [86, 82],
       [82, 90],
       [78, 80],
       [92, 94]])
data2 = np.mat([[80, 86], [82, 80], [85, 78], [90, 90], [86, 82], [82, 90], [78, 80], [92, 94]])
print(data2, "\n\n", type(data2))
[[80 86]
 [82 80]
 [85 78]
 [90 90]
 [86 82]
 [82 90]
 [78 80]
 [92 94]]

 <class 'numpy.matrix'>
data3 = np.mat([[0.3], [0.7]])
data3
matrix([[0.3],
        [0.7]])
print(data2 * data3, "\n\n", data @ np.array([[0.3], [0.7]]))  ## compute the final grade: first column weighted 0.3, second column weighted 0.7
[[84.2]
 [80.6]
 [80.1]
 [90. ]
 [83.2]
 [87.6]
 [79.4]
 [93.4]]

 [[84.2]
 [80.6]
 [80.1]
 [90. ]
 [83.2]
 [87.6]
 [79.4]
 [93.4]]
print(np.matmul(data2, data3), "\n\n", np.dot(data2, data3))
[[84.2]
 [80.6]
 [80.1]
 [90. ]
 [83.2]
 [87.6]
 [79.4]
 [93.4]]

 [[84.2]
 [80.6]
 [80.1]
 [90. ]
 [83.2]
 [87.6]
 [79.4]
 [93.4]]
Merging and splitting
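Merging: arrays can be merged horizontally or vertically.
numpy.hstack(tup) joins a sequence of arrays horizontally (column-wise)
numpy.vstack(tup) joins a sequence of arrays vertically (row-wise)
numpy.concatenate((a1, a2, a3, ...), axis=0): for 2-D arrays, axis=1 merges horizontally and axis=0 merges vertically
Splitting: np.split(ary, indices_or_sections, axis=0)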
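For 2-D inputs the relation between these functions can be checked directly. The sketch below uses two made-up 2 x 2 arrays and confirms that hstack and vstack match concatenate along axis 1 and axis 0, and that np.split undoes the vertical merge.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

side_by_side = np.hstack((x, y))     # shape (2, 4)
stacked = np.vstack((x, y))          # shape (4, 2)

same_h = np.array_equal(side_by_side, np.concatenate((x, y), axis=1))
same_v = np.array_equal(stacked, np.concatenate((x, y), axis=0))

# Splitting into 2 equal parts along axis 0 undoes the vertical stack.
top, bottom = np.split(stacked, 2, axis=0)

print(side_by_side.shape, stacked.shape, same_h, same_v)
print(np.array_equal(top, x), np.array_equal(bottom, y))
```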
Merging
import numpy as np
a = np.array([1, 2, 3])
b = np.array([2, 3, 4])
np.hstack((a, b))
array([1, 2, 3, 2, 3, 4])
np.vstack((a, b))
array([[1, 2, 3],
       [2, 3, 4]])
np.concatenate((a, b), axis=0)
array([1, 2, 3, 2, 3, 4])
x = np.array([[1, 2], [3, 4]])
print(np.concatenate((x, x), axis=0))
print("\n\n", np.concatenate((x, x), axis=1))
[[1 2]
 [3 4]
 [1 2]
 [3 4]]


 [[1 2 1 2]
 [3 4 3 4]]
Splitting
x1 = np.arange(9.0)
np.split(x1, 3)
[array([0., 1., 2.]), array([3., 4., 5.]), array([6., 7., 8.])]
x1 = np.arange(8.0)
np.split(x1, [3, 5, 6, 8])  ## split at the given indices
[array([0., 1., 2.]),
 array([3., 4.]),
 array([5.]),
 array([6., 7.]),
 array([], dtype=float64)]
IO and data processing
Reading data with NumPy: np.genfromtxt(path, delimiter)  ## file path and delimiter
np.genfromtxt("tes.csv", delimiter=",")
import numpy as np
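data = np.genfromtxt("gh.csv", delimiter=",")
data
array([[  nan,   nan,   nan],
       [  12.,  213.,  321.],
       [ 123.,  345., 1241.],
       [  14.,   24.,  123.]])
The nan values in the array above have type float64. There are two common ways to handle them: delete the rows that contain nan, or fill each nan with the mean of its column.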
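Both approaches can be sketched in a few lines. The CSV text below is hypothetical (the contents of gh.csv are not shown in this post); it is read from an in-memory string so the example runs without a file, and its non-numeric header row is what produces the row of nan values.

```python
import io
import numpy as np

# Hypothetical CSV content; the header row is non-numeric, so it parses as nan.
csv_text = "a,b,c\n12,213,321\n123,345,1241\n14,24,123\n"
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# Option 1: drop every row that contains a nan.
dropped = data[~np.isnan(data).any(axis=1)]

# Option 2: replace each nan with the mean of its column, ignoring nan values.
col_means = np.nanmean(data, axis=0)
filled = np.where(np.isnan(data), col_means, data)

print(dropped.shape)
print(filled[0])
```

When the nan row is only a header, as in this sketch, dropping it is the sensible choice; filling with the column mean is more appropriate when individual measurements are missing.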
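Summary
Advantages of NumPy: memory layout (an ndarray stores elements of one type contiguously), vectorized operations, and a C implementation that can release the GIL
ndarray attributes: shape, dtype, ndim, size, itemsize
Basic operations: ndarray.method() and np.function()
Generating arrays: np.ones(shape), np.zeros(shape)
From an existing array: np.array(), np.copy(), np.asarray()
Arrays over a fixed range: np.linspace(a, b, c), np.arange(a, b, c)
Random numbers: uniform distribution np.random.uniform(), normal distribution np.random.normal()
Slicing and indexing
Changing the shape: ndarray.reshape((a, b)), ndarray.resize((a, b)), ndarray.T
Changing the type: ndarray.astype(type), ndarray.tobytes()
Removing duplicates: np.unique()
NumPy operations
Logical operations: boolean indexing, np.all(), np.any(), np.where(a, b, c), where a is the condition, b is the value used where it is True, and c is the value used where it is False
Statistical operations: the statistics max, min, mean, median, var, std; positions of the maximum and minimum with np.argmax() and np.argmin()
Operations between arrays: array and scalar operations; array and array operations, which must follow the broadcasting rules
Matrix operations: np.mat(), np.dot(), np.matmul()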