English Share | The Good and the Bad of Python in 2018

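It has been a while since I last shared an English-language blog post with you. I hope your English reading skills have not slipped (not that anyone would admit it :)). A few days ago my feed was flooded with news of friends taking the CET-4 and CET-6 exams; I wonder how everyone did. Although I graduated several years ago and no longer need to study English for exams, I have come to realize that, alongside programming skills, English is a skill I cannot afford to lose. So I have started building a habit of reading English material (two weeks ago I even tried translating an article), and sharing English articles on this official account is another worthwhile experiment. A reader once left a comment saying that following this account also lets him practise English, which he thought was great. That reply greatly boosted my confidence, so these shares will continue. I will keep the frequency under control and mark the title as an English share so that it is easy to tell apart. Today’s share is a year-in-review of Python from Medium. The author covers several important modules and tools from two angles, Good and Bad, including JupyterLab, mypy, Pipfile and pipenv, f-strings, and more. I hope you find it helpful. (PS: the Python猫 reader group has been set up; see today’s second post for details.)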


Original title: State of Python in 2018

Author: Daniel Kats

Original: http://t.cn/E42RMi9 (abridged)


I love Python. I’ve been using Python for almost 10 years now, across projects both personal and professional. My work is equal parts data analytics and rapid prototyping, so Python is a natural fit. The great draw of Python is that it has packages for everything: machine learning, data exploration, reproducible research, visualization, cloud functionality, web APIs, and the kitchen sink.

However, as with any engineering effort, Python is a work-in-progress. Our perception of the language today is different than it was even five years ago, so things that may have seemed outlandish then are now not only possible, but logical. In this post, I want to lay out what I see as promising directions for the community, and how I would like to see it grow.

The Good

Many good things have either landed in 2018 in Pythonland, or have overcome their growing pains. Here are my personal favourites:

JupyterLab

A Jupyter Notebook is a web application for executing Python (and other languages) and viewing the results in-line, including graphs, prettified tables, and markdown-formatted prose. It also automatically saves intermediate results (similar to a REPL), allows exporting to many formats, and has a hundred other features. For a deeper dive, see my PyCon talk. Jupyter Notebooks are very widely used in the community, especially by those in research and scientific fields. The Jupyter team very justifiably won the 2017 ACM Software System Award.

JupyterLab is an exciting improvement over traditional Jupyter notebooks. It includes some compelling features like cell drag-and-drop, inline viewing of data files (like CSV), a tabbed environment, and a more command-centered interface. It definitely still feels like a beta, with some glitches in Reveal.js slide export functionality and cell collapse not working as expected. But on the whole it’s a perfect example of a good tool getting even better and growing to fit the sophistication of its users.

mypy

mypy, a static type checking tool for Python, has existed for a while. However, it has gotten really good this year, to the point where you can integrate it into your production project as part of git hooks or another CI flow. I find it an extremely helpful addition to all codebases, catching the vast majority of my mistakes before I write a single line of test code. It’s not without pitfalls, however. There are many cases where you have to make annotations that feel burdensome, such as

__init__(self, *args) -> None

and other behaviour which I view as just strange. The lack of typeshed files for many common modules such as:

  • flask
  • msgpack
  • coloredlogs
  • flask-restplus
  • sqlalchemy
  • nacl

continues to be an issue in integrating this into your CI system without significant configuration. The --ignore-missing-imports option becomes basically mandatory. In the future, I hope that it becomes a community standard to provide typeshed files for all modules intended to be used as libraries.
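As a rough illustration of the annotations discussed above, here is a minimal sketch built around a hypothetical `Counter` class (it does not appear in the original post). The `-> None` on `__init__` is the kind of annotation that can feel burdensome, and the commented-out call shows the sort of mistake a static checker such as mypy is expected to flag before any test is written.

```python
from typing import List


class Counter:
    """Hypothetical example class, used only to illustrate annotations."""

    def __init__(self, *args: int) -> None:
        # The explicit "-> None" tells the checker this method is annotated.
        self.values: List[int] = list(args)

    def total(self) -> int:
        return sum(self.values)


counter = Counter(1, 2, 3)
print(counter.total())

# A static checker should reject the next line, because the arguments are
# str rather than int. At runtime it would only fail later, inside total().
# Counter("1", "2").total()
```

Running a checker over a file like this is what surfaces the error in the commented-out call; the interpreter alone would not complain until `total()` is reached.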

Pipfile and pipenv

I’m really excited about Pipfiles! Pipfiles build on the dependency specifiers of PEP 508 and are proposed as a replacement for requirements.txt as a dependency-management format.

The top-level motivation is that dependency management with pip feels stale compared to similar systems in other languages like Rust and JavaScript. While the flaws with pip/requirements.txt seem to be well-known in the community, the closest article I’ve seen to an enumeration is this post. I recommend a read, but here is a TLDR:

  • There is no standard for requirements.txt: is it an enumeration of all primary and secondary dependencies, or just the strict requirements? Does it include pinned versions? Additionally, splitting out development-time requirements is very ad hoc. Different groups do different things, which makes reproducible builds a problem.
  • Keeping the list of dependencies up to date required pip install $package followed by pip freeze > requirements.txt, which was a really clunky workflow with a ton of problems.
  • The dependency-management ecosystem consists of three tools and standards (virtualenv, pip, and requirements.txt) which do not interoperate cleanly. Since you’re trying to accomplish a single task, why isn’t there a single tool to help?

Enter pipenv.

Pipenv creates a virtualenv automatically, installs and manages dependencies in that virtualenv, and keeps the Pipfile updated.

While the idea is great, using it is very cumbersome. I’ve run into many issues using it in practice and often have to fall back on the previous way of doing things, such as using an explicit virtualenv. I also found that locking is very slow (a problem partially stemming from the setup.py standard, which is the source of many other issues in the tooling ecosystem).

f-strings

f-strings are fantastic! Many others have written about the joy of f-strings, from their natural syntax to the performance improvements they bring. I see no reason to repeat these points; I just want to say that they are an amazing feature that I have been using regularly since they landed.

An annoyance they introduce is the dichotomy between writing print statements and logging statements. The logging module is great, and by default does not format strings if that log message is turned off. So you might write:

import logging

x = 3
logging.debug('x=%d', x)  # the format string and its argument are passed separately

This would print x=3 if the log-level is set to DEBUG, but would not even perform the string interpolation if the log-level is set higher. This is because logging.debug is a function, and the strings are passed as arguments. You can see how it works in the very readable CPython source code. However, this functionality disappears if you write the following:

x = 3
logging.debug(f'x={x}')  # the f-string is evaluated before logging.debug is called

The string interpolation happens regardless of log-level. This makes sense at a language-level, but the practical consequences are irritating in my natural workflow. I write print statements first when debugging my code, and when it looks like everything is right I transform them into logging statements later. So each print statement has to be manually rewritten to fit the different type of string interpolation. I don’t have a good idea of how to solve this problem, but I want to point it out as I haven’t seen anyone else write about this particular problem.
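The difference can be made visible with a small experiment. This is only a sketch: the `Expensive` class and the logger name `fstring_demo` are made up for illustration. The class counts how often it is converted to a string, which shows whether interpolation took place while DEBUG messages are disabled.

```python
import logging


class Expensive:
    """Hypothetical helper that counts conversions to str."""

    def __init__(self) -> None:
        self.str_calls = 0

    def __str__(self) -> str:
        self.str_calls += 1
        return "expensive"


logger = logging.getLogger("fstring_demo")
logger.setLevel(logging.WARNING)  # DEBUG messages are turned off

lazy = Expensive()
logger.debug("value=%s", lazy)  # argument passed separately, never formatted

eager = Expensive()
logger.debug(f"value={eager}")  # formatted before logger.debug is even called

print(lazy.str_calls, eager.str_calls)  # prints: 0 1
```

One commonly used workaround, not proposed in the original post, is to guard an expensive f-string call with `logger.isEnabledFor(logging.DEBUG)`, although that does not remove the need to rewrite print statements.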

The Bad

As with any project that has been around for as long as Python (wow, it’s as old as I am), there are modules and ideas which are showing their age. This is not meant to be a shade-throwing contest, but a way of throwing down the gauntlet and saying that we as a community can do better.

tox

Tox is still the best (or perhaps more accurately the de facto) test-runner we have in Pythonland, and it’s quite bad. Not only is the syntax for tox.ini files a bit unintuitive, the tool is also extremely slow. It’s not really tox’s fault, as the whole setup.py system is broken by design. Because these files declare package dependencies and at the same time can execute code, discovering dependencies is inherently slow. This leads to slowness in a number of tools. I believe this is something we should tackle as a community in 2019.
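To see why dependencies declared in executable code cannot simply be read off the file, consider the following sketch. The function name `compute_install_requires` and the dependency list are hypothetical; in a real setup.py the resulting list would typically be passed as the `install_requires` argument of `setuptools.setup()`.

```python
import sys


def compute_install_requires(version_info=sys.version_info):
    """Hypothetical helper: the dependency list depends on the interpreter."""
    requires = ["requests"]
    if version_info < (3, 7):
        # Older interpreters need the backport package.
        requires.append("dataclasses")
    return requires


# A tool has to run this code, in the right environment, to learn the answer.
print(compute_install_requires())
```

Because the list is only known after the code runs, a tool cannot discover it by parsing the file statically, which is the root of the slowness described above.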

As an aside, there is still no Pipfile support, which makes the value proposition of using it much lower. As with everything, it’s not just about how good the idea is, but the tooling support around it.

type annotations are for tools only

Quoting from PEP 484:

Using type hints for performance optimizations is left as an exercise for the reader.

This is understandable given the state of Python at the time that the PEP was written, but it’s now time to move on. We have successfully transitioned to Python 3, and 359/360 of the most commonly downloaded packages on PyPI are Python 3-compliant. Type hints are here to stay, and are well-loved by the community. Moving forward, Python type hints should carry additional benefits such as performance optimization and automatic runtime type assertions. I find runtime type assertions to be both extremely helpful (especially in libraries), and very cumbersome to write manually. With type hints, this is especially annoying as you have to maintain multiple sources of truth for types.
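As a sketch of what automatic runtime type assertions could look like, the decorator below derives its checks from the existing annotations, so the types have a single source of truth. The names `enforce_types` and `repeat` are hypothetical, and the check is deliberately limited: it only handles plain classes and skips generics and variadic parameters.

```python
import functools
import inspect
from typing import get_type_hints


def enforce_types(func):
    """Hypothetical decorator: assert plain-class annotations at call time."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            param = signature.parameters[name]
            if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
                continue  # *args and **kwargs are not checked in this sketch
            expected = hints.get(name)
            # Only plain classes are checked; generics like List[int] are skipped.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_types
def repeat(text: str, times: int) -> str:
    return text * times


print(repeat("ab", 2))  # prints: abab
```

Calling `repeat("ab", "2")` raises a `TypeError` from the decorator rather than from inside the function body, without any hand-written assertion.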

As others have written, Python 4 will probably have JIT as a first-class feature. This seems like a logical place to add performance optimization in response to type annotations.

variable mutability

One of my biggest gripes with Python right now is the lack of const or its equivalent. Of all the mistakes I make during coding, a solid 90% of them can be traced to either type-related mistakes (now mostly caught with mypy) or accidental reuse of a previous variable within the same function when I thought I was creating a new variable. I understand that there are packages for this, but I want const to be a first-class citizen.
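For context, a partial answer arrived after this post was written: Python 3.8 added `typing.Final`, which lets static checkers such as mypy flag reassignment. It is not a runtime `const`, so the interpreter itself does not stop the reassignment. A minimal sketch, with hypothetical names, assuming Python 3.8 or later:

```python
from typing import Final

MAX_RETRIES: Final = 3  # a checker treats this name as not reassignable


def retry_budget(extra: int) -> int:
    """Hypothetical helper that reads the constant without rebinding it."""
    return MAX_RETRIES + extra


print(retry_budget(2))

# A static checker should reject the next line; Python itself would allow it.
# MAX_RETRIES = 5
```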

nbconvert

The nbconvert project is, on the whole, amazing. It allows the conversion of Jupyter notebooks into various other formats including PDF, Reveal.js slides, or an executable script. I have used the last two extensively in the past couple of months, and they have honestly changed my workflow. I can put together a notebook, then at the last moment convert it into a presentation for a weekly meeting with my colleagues to show my progress. Similarly, I can develop an idea in a notebook, then convert it into a script and put it into production with minimal changes.

That’s the idea, anyway. The reality is that the scripts produced from any sizable notebook require so much manual effort to convert that it’s often worth it to write them from scratch using cut-and-paste. I heard from a few companies that they have created wrappers around nbconvert to make it a bit more wieldy. I encourage these folks to open-source these contributions, if only to alleviate my personal pain.

This article was shared from the WeChat official account Python猫 (python_cat).

Originally published: 2018-12-19
