Python Data Science Handbook (Part 1): Getting Started with IPython

用户2936342

Published 2018-08-27 14:10:23

Collected in the column: nummy

Excerpted from the Python Data Science Handbook.

Getting help and documentation

Use ? to get help:

In [2]: len?
Type:        builtin_function_or_method
String form: <built-in function len>
Namespace:   Python builtin
Docstring:
len(object) -> integer

Return the number of items of a sequence or mapping.

Use ?? to view the source code. For objects implemented in C, such as len, no Python source is available, so ?? shows the same output as ?:

In [9]: len??
Type:        builtin_function_or_method
String form: <built-in function len>
Namespace:   Python builtin
Docstring:
len(object) -> integer

Return the number of items of a sequence or mapping.

Use the Tab key to complete commands:

In [10]: import <TAB>
Display all 399 possibilities? (y or n)
Crypto     dis        py_compile
Cython     distutils  pyclbr
...        ...        ...
difflib    pwd        zmq

In [10]: import h<TAB>
hashlib  hmac  http
heapq    html  husl

Wildcard matching:

In [10]: str.*find*?
str.find
str.rfind
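Outside IPython, similar information can be reached with the standard library. The following is a minimal sketch, not how IPython itself implements `?`, `??`, or wildcard matching; the helper names `wildcard_names` and `source_or_none` are made up for illustration.

```python
import fnmatch
import inspect


def wildcard_names(obj, pattern):
    """Return the attribute names of obj that match a shell-style pattern."""
    return sorted(name for name in dir(obj) if fnmatch.fnmatchcase(name, pattern))


def source_or_none(obj):
    """Return the Python source of obj, or None when it is unavailable."""
    try:
        return inspect.getsource(obj)
    except (TypeError, OSError):
        # Built-ins written in C, such as len, have no Python source.
        return None


# Roughly what `len?` reports: the docstring of the object.
print(inspect.getdoc(len))

# Roughly what `len??` could add: nothing, because len is implemented in C.
print(source_or_none(len))

# Roughly what `str.*find*?` lists (without the `str.` prefix).
print(wildcard_names(str, "*find*"))
```

The last call is the plain-Python counterpart of the wildcard query: it filters `dir(str)` with a shell-style pattern.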

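Keyboard shortcuts

Navigation shortcuts

Shortcut	Action
Ctrl-a	Move the cursor to the beginning of the line
Ctrl-e	Move the cursor to the end of the line
Ctrl-b or left arrow	Move the cursor back one character
Ctrl-f or right arrow	Move the cursor forward one character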

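Text entry shortcuts

Shortcut	Action
Backspace	Delete the previous character
Ctrl-d	Delete the next character
Ctrl-k	Cut the text from the cursor to the end of the line
Ctrl-u	Cut the text from the beginning of the line to the cursor
Ctrl-y	Paste (yank) the text that was previously cut
Ctrl-t	Transpose the previous two characters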

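Command history shortcuts

Shortcut	Action
Ctrl-p (or up arrow)	Access the previous command in the history
Ctrl-n (or down arrow)	Access the next command in the history
Ctrl-r	Reverse-search through the command history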

Miscellaneous shortcuts

Shortcut	Action
Ctrl-l	Clear the terminal screen
Ctrl-c	Interrupt the current command
Ctrl-d	Exit the IPython session

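Magic commands

%paste or %cpaste pastes code; use either one to paste multi-line code.
%run executes an external Python script.
%timeit times the execution of a piece of code.
%magic shows detailed information about all magic commands.
%lsmagic lists all available magic commands.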

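Input and output

Inputs and outputs are labeled with In/Out prompts. In and Out are in fact two variables: In is a list of the commands entered so far, and Out is a dictionary that maps an input number to its result, if there is one. Earlier inputs and outputs can be retrieved directly by index.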

In [4]: print(In)
['', 'import math', 'math.sin(2)', 'math.cos(2)', 'print(In)']

In [5]: Out
Out[5]: {2: 0.9092974268256817, 3: -0.4161468365471424}
In [6]: print(In[1])
import math

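The variable _ holds the previous output, __ the second-to-last output, and ___ the third-to-last output. In addition, _X (an underscore followed by a number) is shorthand for Out[X]; for example, _2 is the same as Out[2].

Ending a command with a semicolon (;) suppresses its output.

IPython also provides the magic command %history to retrieve earlier inputs.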

In [16]: %history -n 1-4
   1: import math
   2: math.sin(2)
   3: math.cos(2)
   4: print(In)
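The relationship between In and Out can be illustrated with a toy model. This is only a sketch of the idea, not IPython's actual implementation, and `run_cell` and `namespace` are hypothetical names: every input is appended to a list, and only inputs that evaluate to a non-None value get an entry in the output dictionary.

```python
In = [""]       # index 0 is a placeholder so that In[n] lines up with prompt n
Out = {}        # only prompts that produced a value get an entry
namespace = {}  # variables created by the recorded inputs live here


def run_cell(source):
    """Record one input and, if it is an expression with a value, its result."""
    In.append(source)
    number = len(In) - 1
    try:
        code = compile(source, "<cell>", "eval")
    except SyntaxError:
        # Statements such as `import math` have no value to store.
        exec(compile(source, "<cell>", "exec"), namespace)
        return None
    result = eval(code, namespace)
    if result is not None:
        Out[number] = result
    return result


run_cell("import math")
run_cell("math.sin(2)")
run_cell("math.cos(2)")
print(In)
print(Out)
```

In this model the import statement is recorded in In but leaves no entry in Out, which matches the session above, where Out only has keys 2 and 3.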

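Shell commands

Shell commands can be run directly in IPython by prefixing them with !. The output of a shell command can also be assigned to a Python variable. To pass a Python variable to the shell, wrap the variable name in {}.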

In [4]: contents = !ls

In [5]: print(contents)
['myproject.txt']

In [9]: message = "hello from Python"
In [10]: !echo {message}
hello from Python
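In a plain Python script, where the ! syntax is not available, the standard `subprocess` module covers the same two directions: capturing a command's output in a variable and passing a Python value to the command. The sketch below makes one assumption for portability: instead of `ls` or `echo` it starts the current Python interpreter as the child process. The helper name `run_lines` is made up for illustration.

```python
import subprocess
import sys


def run_lines(args):
    """Run a command and return its standard output as a list of lines."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout.splitlines()


message = "hello from Python"

# The child process is the current Python interpreter, used here in place of
# `echo` so that the sketch does not depend on a particular shell.
echo_args = [sys.executable, "-c", "import sys; print(sys.argv[1])", message]

contents = run_lines(echo_args)
print(contents)
```

Passing the arguments as a list avoids shell quoting issues; the value of `message` reaches the child process as a single argument.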

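Exceptions and debugging

Controlling exception output: the magic command %xmode controls how much information is printed when an exception is raised. It takes one argument, which can be Plain, Context, or Verbose. The default is Context; Plain is more compact, and Verbose is more detailed.

Debugging: after an exception has been raised, run %debug to inspect it. The command opens an ipdb prompt. Commonly used ipdb commands:

Command	Description
list	Show the current location in the file
h(elp)	Show help for the available commands
q(uit)	Quit the debugger
c(ontinue)	Quit the debugger and continue running the program
n(ext)	Go to the next step of the program
<enter>	Repeat the previous command
p(rint)	Print variables
s(tep)	Step into a subroutine
r(eturn)	Return out of a subroutine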
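Without IPython, a comparable workflow can be put together from the standard `traceback` and `pdb` modules. The sketch below is an assumption-laden stand-in rather than what %xmode or %debug do internally: `func1`, `func2`, and the `INTERACTIVE` flag are illustrative names, and the debugger is only opened when the flag is set so that the script also runs unattended.

```python
import pdb
import sys
import traceback


def func1(a, b):
    return a / b


def func2(x):
    a = x
    b = x - 1
    return func1(a, b)


INTERACTIVE = False  # set to True in a terminal to open the debugger

try:
    func2(1)  # b becomes 0, so func1 raises ZeroDivisionError
except ZeroDivisionError:
    # Formatted traceback as a string, similar in spirit to the default report.
    report = traceback.format_exc()
    print(report)
    if INTERACTIVE:
        # Post-mortem debugging at the frame where the exception was raised.
        pdb.post_mortem(sys.exc_info()[2])
```

With `INTERACTIVE = True`, the commands from the table above (for example p, n, and q) can be used at the pdb prompt.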

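Profiling and timing

IPython provides the following magic commands for timing and profiling code:

%time times the execution of a single statement
%timeit runs a single statement repeatedly to get a more accurate time
%prun runs code with the profiler
%lprun runs code with the line-by-line profiler
%memit measures the memory use of a single statement
%mprun runs code with the line-by-line memory profiler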

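%prun is built into IPython. The last three commands (%lprun, %memit, and %mprun) require the line_profiler and memory_profiler extensions to be installed.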

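For the same statement, %timeit usually reports a shorter time than %time. This is because %timeit takes steps to keep other activity from interfering with the measurement; for example, it suppresses garbage collection while timing.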
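The standard `timeit` module can be used for the same kind of measurement in a script. This is a minimal sketch; the statement being timed and the values of `number` and `repeat` are arbitrary choices for illustration. As documented for the module, garbage collection is turned off during timing by default and can be re-enabled through the setup string.

```python
import timeit

stmt = "sum(range(100))"
runs_per_measurement = 10_000

# Take 5 measurements, each executing the statement 10,000 times.
# Garbage collection is disabled during timing by default.
timings = timeit.repeat(stmt, number=runs_per_measurement, repeat=5)
best_per_loop = min(timings) / runs_per_measurement
print(f"best of 5: {best_per_loop * 1e6:.3f} us per loop")

# The same measurement with garbage collection re-enabled via the setup string.
timings_gc = timeit.repeat(
    stmt, setup="import gc; gc.enable()", number=runs_per_measurement, repeat=5
)
print(f"with gc:   {min(timings_gc) / runs_per_measurement * 1e6:.3f} us per loop")
```

Reporting the minimum of several measurements is a common choice, since slower runs are usually caused by other activity on the machine.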

Using %prun

First define a function:

def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

Then run it under %prun:

%prun sum_of_lists(1000000)

The output is:

14 function calls in 0.714 seconds

Ordered by: internal time

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     5    0.599    0.120    0.599    0.120 <ipython-input-19>:4(<listcomp>)
     5    0.064    0.013    0.064    0.013 {built-in method sum}
     1    0.036    0.036    0.699    0.699 <ipython-input-19>:1(sum_of_lists)
     1    0.014    0.014    0.714    0.714 <string>:1(<module>)
     1    0.000    0.000    0.714    0.714 {built-in method exec}

The result shows that the list comprehension (<listcomp>) takes the most time, so that is where optimization should start.
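A similar report can be produced outside IPython with the standard `cProfile` and `pstats` modules. This is a sketch under stated assumptions: the input size is reduced to 100,000 to keep the run short, and the exact rows in the report depend on the Python version (newer versions may not show a separate `<listcomp>` entry).

```python
import cProfile
import io
import pstats


def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total


# Profile a single call and keep its return value.
profiler = cProfile.Profile()
result = profiler.runcall(sum_of_lists, 100_000)

# Write the statistics to a string, ordered by internal time like %prun.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("tottime").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by "tottime" corresponds to the "Ordered by: internal time" line in the %prun output.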
Line-by-line profiling with %lprun

First install the library:

$ pip install line_profiler

Then load the line_profiler extension:

%load_ext line_profiler

Run the following command:

%lprun -f sum_of_lists sum_of_lists(5000)

The result is:

Timer unit: 1e-06 s

Total time: 0.009382 s
File: <ipython-input-19-fa2be176cc3e>
Function: sum_of_lists at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def sum_of_lists(N):
     2         1            2      2.0      0.0      total = 0
     3         6            8      1.3      0.1      for i in range(5):
     4         5         9001   1800.2     95.9          L = [j ^ (j >> i) for j in range(N)]
     5         5          371     74.2      4.0          total += sum(L)
     6         1            0      0.0      0.0      return total
Using %memit and %mprun

First install the library:

$ pip install memory_profiler

Load the extension:

%load_ext memory_profiler

Run %memit:

%memit sum_of_lists(1000000)

The output is:

peak memory: 100.08 MiB, increment: 61.36 MiB

This shows that memory use peaks at about 100 MiB while the function runs.

%mprun measures memory use line by line. It cannot profile code defined directly in the notebook; the function has to be imported from a module. So first create a module:

%%file mprun_demo.py
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
        del L # remove reference to L
    return total

Then profile it line by line:

from mprun_demo import sum_of_lists
%mprun -f sum_of_lists sum_of_lists(1000000)

The result is:

Filename: ./mprun_demo.py

Line #    Mem usage    Increment   Line Contents
================================================
     1     39.0 MiB      0.0 MiB   def sum_of_lists(N):
     2     39.0 MiB      0.0 MiB       total = 0
     3     46.5 MiB      7.5 MiB       for i in range(5):
     4     71.9 MiB     25.4 MiB           L = [j ^ (j >> i) for j in range(N)]
     5     71.9 MiB      0.0 MiB           total += sum(L)
     6     46.5 MiB    -25.4 MiB           del L # remove reference to L
     7     39.1 MiB     -7.4 MiB       return total
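When memory_profiler is not available, the standard `tracemalloc` module gives a rough idea of peak memory use. Note that this is a different technique from %memit: tracemalloc only counts memory allocated by Python while tracing is active, whereas %memit reports the memory of the whole process, so the numbers are not directly comparable. The input size of 100,000 is an arbitrary choice to keep the sketch fast.

```python
import tracemalloc


def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
        del L  # remove reference to L
    return total


# Trace Python-level allocations made while the function runs.
tracemalloc.start()
result = sum_of_lists(100_000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current traced memory: {current / 2**20:.2f} MiB")
print(f"peak traced memory:    {peak / 2**20:.2f} MiB")
```

The peak value mostly reflects the temporary list L, which is consistent with the line-by-line picture above, where the list comprehension accounts for the largest increment.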

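Other

pylab mode: in pylab mode, IPython automatically imports NumPy and Matplotlib into the interactive namespace. Without this mode, each required module has to be imported by hand.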

$ ipython --pylab

Originally published on the author's personal site/blog, 2017.08.31.
