# The emergence of consensus in Python
Julien Palard

PyCon Fr 2018

----

There should be one -- and preferably only one -- obvious way to do it.

(Tim Peters)

## The emergence of consensus in Python

This is a study about undocumented consensus in the Python community, so you don't have to do it.

# Julien Palard

- Python documentation translator
- Teaching Python at
  - Sup'Internet
  - CRI-Paris
  - Makina Corpus
  - …
- julien@python.org, @sizeof, https://mdk.fr
- Yes I write Python sometimes too…

## Julien Palard

![](static/emergence-language-switcher.png)

# Digression

In one year, we went from 25.7% translated to 30% translated, while the Japanese translation is at 79.2% and the Korean one at 14.6%.

Notes:
Want news about the translation?
Thanks to Christophe, Antoine, Glyg, HS-157, and 27 other translators!
PLZ HELP

# What did I do?

Crawled [pypi.org](https://pypi.org) to get some Python projects, cloned their GitHub repositories (around 4k repositories at the time of writing).

Then... played with the data ^^.

## But why?

To answer all those questions that neither a human nor a search engine can answer.

Notes:
- For my README, should I use `rst` or `md`?
- unittest, nose, or pytest?
- ``setup.py``, ``requirements.txt``, ``Pipfile``, ...?
- ...

## Is it data science?

Hell no! It's biased: I only crawled projects published on *pypi.org* AND hosted on *github.com*, so I'm hitting a very specific subset of the population.

Note:

### I mean

- me: Hey consensus is to use the MIT license!
- you: You crawled only open source projects...
- me: Oh wait...

## Digression

I used Jupyter. If you still haven't tried it, please take a look at it.

![](static/emergence-jupyterpreview.png)

## Meta-Digression

If you're using Jupyter Notebook and have never tried JupyterLab, please try it.

JupyterLab is meant to eventually replace the classic Jupyter Notebook, so maybe start using it.

```bash
pip install jupyterlab
jupyter-lab
```

Notes:
I know you like digressions, so I'm putting digressions in my digression so I can digress while I digress.
## Meta-Digression

![](https://jupyterlab.readthedocs.io/en/stable/_images/jupyterlab.png)

# 10 years of data

I do not have enough data further back, so the graphs tend to get messy.

```python
# raw_data is indexed by date: keep 2008 onwards, grouped in 6-month bins.
stats = raw_data.loc['2008-01-01':, :].resample('6M')
```

Notes:
Python itself is 28 years old (older than git, than GitHub, even than Java).

## Digression (again)

I used Pandas, if you never tried it...

![](static/emergence-pandas.png)

Notes:
It's a matrix of scatter plots.

# README files

```python
# Select several columns with a list (a bare tuple is rejected by recent pandas).
readmes = (stats[['file:README.rst',
                  'file:README.md',
                  'file:README',
                  'file:README.txt']].mean().plot())
```

## README files

![](static/emergence-readme.png)

Notes:

## Consensus

10 years ago, people used ``README`` and ``README.txt``.

It changed around 2011, now we use ``README.md`` and ``README.rst`` files.

``Markdown`` won. I bet it's for its simplicity and readability, and because people may already know it from elsewhere.

## Consensus

But pypi.python.org doesn't support Markdown!

Yes, but...

## Consensus

pypi.org does!

```python
long_description_content_type='text/markdown'
```

See: https://pypi.org/project/markdown-description-example/

So use ``README.md`` files!

# Requirements

```python
# Select several columns with a list (a bare tuple is rejected by recent pandas).
setups = stats[['file:setup.cfg',
                'file:setup.py',
                'file:requirements.txt',
                'file:Pipfile',
                'file:Pipfile.lock']].mean().plot()
```

## Requirements

![](static/emergence-requirements.png)

Notes:
Nothing really interesting here :(
We see the rise of Pipfile, but still can't say much about it...

## Requirements

For dependency management I've seen a lot of philosophies.
It really depends on "are you packaging", "is it an app or a library", …

## Digression

### The future

PEP 517 and PEP 518

```
[build-system]
requires = ["flit"]
build-backend = "flit.api:main"
```

Notes:
PEP 517 and PEP 518 introduce a way to completely remove setuptools and distutils from being a requirement; they become a choice.

# Tests

```python
tests = (raw_data.groupby('test_engine')
         .resample('Y')['test_engine']
         .size()
         .unstack()
         .T
         .fillna(0)
         # Turn each year's counts into percentages of that year's total.
         .apply(lambda line: 100 * line / float(line.sum()), axis=1)
         .plot())
```

## Tests

![](static/emergence-tests.png)

Notes:

## Sorry nose.

# Documentation directory

```python
# Select both columns with a list (a bare tuple is rejected by recent pandas).
docs = stats[['dir:doc/', 'dir:docs/']].mean().plot()
```

## Documentation directory

![](static/emergence-docs.png)

Note:
Some of you are not documenting at all!

Consensus emerged around 2011 towards **docs/** instead of **doc/**, let's stick to it (please, please, no **Docs/**, I see you, cpython).

# **src/** or not **src/**

```python
import pandas as pd

src = pd.DataFrame(stats['dir:src/'].mean()).plot()
```

## **src/** or not **src/**

![](static/emergence-src.png)

Notes:
This one was slow, but the consensus is to drop the use of a `src/` directory.

I used it a lot, convinced it would let me spot import bugs earlier ("." being in PYTHONPATH but not "src/"). But that's way overkill for a small win.

# **tests/** or **test/**?
```python
# Select both columns with a list (a bare tuple is rejected by recent pandas).
has_tests = stats[['dir:tests/', 'dir:test/']].mean().plot()
```

## **tests/** or **test/**?

![](static/emergence-testdir.png)

Note:
First thing I see... Not everyone is writing tests.

I'm glad the consensus is the same as for **docs/** and **doc/**: plural clearly wins. I bet it's semantically better, as the folder does not contain a test, but multiple tests.

pyproject.toml: to declare dependencies of your setup.py

# Shebangs

```python
# Count, per shebang column, how often it appears from 2008 onwards.
shebangs = (raw_data.loc['2008-01-01':,
                         raw_data.columns
                         .map(lambda col: col.startswith('shebang:'))].sum())
```

```python
# Keep the four most frequent shebangs.
top_shebangs = shebangs.sort_values().tail(4).index
```

```python
shebangs_plot = (raw_data.loc['2008-01-01':, top_shebangs]
                 .fillna(value=0).resample('6M').mean().plot())
```

## Shebangs

![](static/emergence-shebang.png)

Notes:
I'm glad there's not so much `#!/usr/bin/env python2.7` here. I'm not sure it's a good idea to specify a version in the shebang, but...

# Licenses

```python
# Keep the ten most frequent licenses.
top_licenses = raw_data.groupby('license').size().sort_values().tail(10)
licenses = (raw_data.groupby('license')
            .resample('Y')['license']
            .size()
            .unstack()
            .T
            .fillna(0)
            .loc[:, list(top_licenses.index)]
            # Turn each year's counts into percentages of that year's total.
            .apply(lambda line: 100 * line / float(line.sum()), axis=1)
            .plot())
```

## Licenses

![](static/emergence-licenses.png)

## Digression

https://choosealicense.com/

# Questions?

- julien@python.org
- Twitter @sizeof
- https://mdk.fr
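
# Appendix: collecting the columns

The crawler itself is not shown in these slides. As a rough idea of how the `file:`, `dir:` and `shebang:` columns could be computed for one cloned repository, here is a minimal sketch using only the standard library. The function names, the `FILES` and `DIRS` lists, and the choice to look for a shebang only in top-level `.py` files are all assumptions made for illustration, not the code used for the talk.

```python
from pathlib import Path

# Hypothetical lists: the slides only show these names as column labels.
FILES = ["README.rst", "README.md", "README", "README.txt",
         "setup.cfg", "setup.py", "requirements.txt",
         "Pipfile", "Pipfile.lock"]
DIRS = ["doc", "docs", "src", "tests", "test"]


def first_shebang(repo):
    """Return the shebang of the first top-level .py file that has one."""
    for path in sorted(repo.glob("*.py")):
        with path.open("rb") as handle:
            first_line = handle.readline().strip()
        if first_line.startswith(b"#!"):
            return first_line.decode("utf-8", errors="replace")
    return None


def repo_features(repo):
    """Map 'file:<name>', 'dir:<name>/' and 'shebang:<line>' keys to 1 or 0."""
    repo = Path(repo)
    # 1 if the file or directory exists at the repository root, else 0.
    features = {f"file:{name}": int((repo / name).is_file()) for name in FILES}
    features.update(
        {f"dir:{name}/": int((repo / name).is_dir()) for name in DIRS})
    # Only the shebang actually found gets a key (assumption: one per repo).
    shebang = first_shebang(repo)
    if shebang is not None:
        features[f"shebang:{shebang}"] = 1
    return features
```

Calling `repo_features` on each checkout and stacking the resulting dictionaries, one row per repository, would be one way to get a table shaped like the one the plots start from.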