Import: PyConFr 2018.

This commit is contained in:
Julien Palard 2023-02-10 15:48:57 +01:00
parent ee621768e2
commit af2344a0d3
Signed by: mdk
GPG Key ID: 0EFC1AC1006886F8
13 changed files with 368 additions and 0 deletions

2018-pycon-fr-emergence.md Normal file

@ -0,0 +1,368 @@
# The emergence of consensus in Python
<!-- .slide: data-background="static/background.jpg" -->
<br/>
<b>Julien Palard</b>
<tt>PyCon Fr 2018</tt>
----
There should be one
-- and preferably only one --
obvious way to do it. (Tim Peters)
## The emergence of consensus in Python
This is a study about undocumented consensus in the Python community,
so you don't have to do it.
# Julien Palard
- Python documentation translator
- Teaching Python at
- Sup'Internet
- CRI-Paris
- Makina Corpus
- …
- julien@python.org, @sizeof, https://mdk.fr
- Yes, I write Python sometimes too…
## Julien Palard
![](static/emergence-language-switcher.png)
# Digression
In one year, the French translation went from 25.7% to 30% translated!
Meanwhile, the Japanese translation is at 79.2% and the Korean one at 14.6%.
Notes: Want news about the translation?
Thanks to Christophe, Antoine, Glyg, HS-157, and 27 other translators!
PLZ HELP
# What did I do?
Crawled [pypi.org](https://pypi.org) to get some Python projects, then
cloned their GitHub repositories (around 4k repositories at the time of
writing).
Then... played with the data ^^.
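The crawler itself is not shown in these slides, so here is only a minimal sketch of how one cloned repository could be turned into boolean flags. The `file:` and `dir:` prefixes mirror the column names used later in the talk; the `scan_repository` helper and the two lists of tracked names are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical lists of paths to look for; the real list is not shown in the talk.
TRACKED_FILES = ["README.md", "README.rst", "setup.py", "requirements.txt"]
TRACKED_DIRS = ["docs", "doc", "tests", "test", "src"]


def scan_repository(repo_root):
    """Return a dict of boolean flags describing one cloned repository."""
    root = Path(repo_root)
    flags = {}
    for name in TRACKED_FILES:
        flags[f"file:{name}"] = (root / name).is_file()
    for name in TRACKED_DIRS:
        flags[f"dir:{name}/"] = (root / name).is_dir()
    return flags


# Demo on a throwaway directory standing in for a cloned repository.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "README.md").write_text("# demo\n")
    (Path(tmp) / "tests").mkdir()
    demo_flags = scan_repository(tmp)
```

One such dict per repository, plus a date, is enough to build a table like `raw_data`.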
## But why?
To answer all those questions that neither a human nor a search engine can answer.
Notes:
- For my README, should I use `rst` or `md`?
- unittest, nose, or pytest?
- ``setup.py``, ``requirements.txt``, ``Pipfile``, ...?
- ...
## Is it data science?
Hell no! It's biased, I only crawled projects published on *pypi.org* AND
hosted on *github.com*, so I'm hitting a very specific subset of the population.
Note:
### I mean
- me: Hey consensus is to use the MIT license!
- you: You crawled only open source projects...
- me: Oh wait...
## Digression
I used Jupyter. If you still haven't tried it, please take
a look at it.
![](static/emergence-jupyterpreview.png)
## Meta-Digression
If you're using Jupyter Notebook and have never tried JupyterLab, please try it.
JupyterLab will eventually replace the classic Jupyter Notebook, so maybe start using it.
```bash
pip install jupyterlab
jupyter-lab
```
Notes:
I know you like digressions, so I'm putting digressions in my
digression so I can digress while I digress.
## Meta-Digression
![](https://jupyterlab.readthedocs.io/en/stable/_images/jupyterlab.png)
# 10 years of data
I do not have enough data further back than this, so the graphs tend to get messy.
```python
stats = raw_data.loc['2008-01-01':,:].resample('6M')
```
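To make the idea behind `resample('6M')` followed by `.mean()` concrete, here is a plain-Python sketch (not the pandas call itself): group repositories into half-year buckets, then average a boolean flag, which gives the share of repositories having that file. The sample data is made up, which date the talk attaches to each repository is not stated, and the buckets here are calendar half-years while pandas picks its own bin edges.

```python
from collections import defaultdict
from datetime import date

# Made-up sample: (date attached to a repository, does it have a README.md?)
samples = [
    (date(2010, 2, 1), False),
    (date(2010, 5, 9), True),
    (date(2010, 9, 30), True),
    (date(2011, 1, 15), True),
]


def half_year(day):
    """Bucket a date into (year, 1) for Jan-Jun or (year, 2) for Jul-Dec."""
    return (day.year, 1 if day.month <= 6 else 2)


def share_per_half_year(rows):
    """Average a boolean flag per bucket: the share of repositories having it."""
    buckets = defaultdict(list)
    for day, flag in rows:
        buckets[half_year(day)].append(flag)
    return {key: sum(flags) / len(flags) for key, flags in sorted(buckets.items())}


shares = share_per_half_year(samples)
```

Plotting one such series per file name gives the kind of curves shown in the following slides.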
Notes: Meanwhile, Python is 28 years old (older than Git, than GitHub, even than Java).
## Digression (again)
I used Pandas. If you've never tried it...
![](static/emergence-pandas.png)
Notes:
It's a matrix of scatter plots.
# README files
```python
readmes = (stats[['file:README.rst',
                  'file:README.md',
                  'file:README',
                  'file:README.txt']].mean().plot())
```
# README files
![](static/emergence-readme.png)
Notes:
## Consensus
10 years ago, people used ``README`` and ``README.txt``.
It changed around 2011; now we use ``README.md`` and ``README.rst`` files.
``Markdown`` won. I bet it's thanks to its simplicity and readability, and
because people may already know it from elsewhere.
## Consensus
But pypi.python.org doesn't support Markdown!
Yes, but...
## Consensus
pypi.org does!
```python
long_description_content_type='text/markdown'
```
See:
https://pypi.org/project/markdown-description-example/
So use ``README.md`` files!
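As a minimal sketch of where that argument goes: the helper below builds the two `setup()` keyword arguments that carry a Markdown README. The helper name and the package name and version in the comment are placeholders, and it assumes reasonably recent versions of setuptools and of the upload tool.

```python
def markdown_description_kwargs(readme_text):
    """Build the setup() keyword arguments that carry a Markdown README."""
    return {
        "long_description": readme_text,
        "long_description_content_type": "text/markdown",
    }


# In a setup.py this could be used as (placeholder name and version):
#
#     from pathlib import Path
#     from setuptools import setup
#
#     setup(name="example-package",
#           version="0.1.0",
#           **markdown_description_kwargs(
#               Path("README.md").read_text(encoding="utf-8")))
```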
# Requirements
```python
setups = stats[['file:setup.cfg',
                'file:setup.py',
                'file:requirements.txt',
                'file:Pipfile',
                'file:Pipfile.lock']].mean().plot()
```
## Requirements
![](static/emergence-requirements.png)
Notes:
Nothing really interesting here :( We see the rise of Pipfile, but
still can't say much about it...
## Requirements
For dependency management I've seen a lot of philosophies, and it
really depends on "are you packaging?", "is it an app or a library?", …
## Digression
### The future
PEP 517 and PEP 518
```
[build-system]
requires = ["flit"]
build-backend = "flit.api:main"
```
Notes:
PEP 517 and PEP 518 introduce a way to completely remove
setuptools and distutils as a requirement: they become a
choice.
# Tests
```python
tests = (raw_data.groupby('test_engine')
         .resample('Y')['test_engine']
         .size()
         .unstack()
         .T
         .fillna(0)
         .apply(lambda line: 100 * line / float(line.sum()), axis=1)
         .plot())
```
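The last step of that chain turns yearly counts into percentages, so each year sums to 100. Here is the same idea as a plain-Python sketch with made-up data; `yearly_shares` is a hypothetical helper, not part of the talk's notebook.

```python
from collections import Counter, defaultdict

# Made-up sample: (year, test engine detected in a repository)
observations = [
    (2010, "unittest"), (2010, "nose"), (2010, "nose"), (2010, "unittest"),
    (2018, "pytest"), (2018, "pytest"), (2018, "pytest"), (2018, "unittest"),
]


def yearly_shares(rows):
    """Count engines per year, then scale each year so it sums to 100."""
    counts = defaultdict(Counter)
    for year, engine in rows:
        counts[year][engine] += 1
    shares = {}
    for year, counter in sorted(counts.items()):
        total = sum(counter.values())
        shares[year] = {engine: 100 * n / total for engine, n in counter.items()}
    return shares


engine_shares = yearly_shares(observations)
```

An engine absent from a given year is simply missing from that year's dict here, where the pandas version fills it with 0.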
## Tests
![](static/emergence-tests.png)
Notes:
## Sorry nose.
# Documentation directory
```python
docs = stats[['dir:doc/', 'dir:docs/']].mean().plot()
```
## Documentation directory
![](static/emergence-docs.png)
Note: Some of you are not documenting at all!
Consensus emerged around 2011 towards **docs/** instead of **doc/**, so let's stick to it (please, please, no **Docs/**, I see you, cpython).
# **src/** or not **src/**
```python
src = pd.DataFrame(stats['dir:src/'].mean()).plot()
```
## **src/** or not **src/**
![](static/emergence-src.png)
Notes:
This one was slow, but the consensus is to drop the use of a `src/` directory.
I used it a lot, convinced it would let me spot import bugs earlier ("." being in PYTHONPATH but not "src/"). But that's way overkill for a small win.
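A minimal sketch of that import argument, using throwaway package names (`demo_flat_pkg` and `demo_src_pkg` are made up): with the project root on the import path, a package sitting at the root is importable without being installed, while one under `src/` is not, so a broken install gets noticed.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as project_root:
    root = Path(project_root)
    # One package at the project root, one hidden under src/.
    for package_dir in (root / "demo_flat_pkg", root / "src" / "demo_src_pkg"):
        package_dir.mkdir(parents=True)
        (package_dir / "__init__.py").touch()

    # Simulate running Python from the project root.
    sys.path.insert(0, project_root)
    importlib.invalidate_caches()
    try:
        flat_importable = importlib.util.find_spec("demo_flat_pkg") is not None
        src_importable = importlib.util.find_spec("demo_src_pkg") is not None
    finally:
        sys.path.remove(project_root)
```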
# **tests/** or **test/**?
<br/>
```python
has_tests = stats[['dir:tests/', 'dir:test/']].mean().plot()
```
## **tests/** or **test/**?
![](static/emergence-testdir.png)
Note: First thing I see... Not everyone is writing tests.
I'm glad the consensus is the same as for **docs/** and **doc/**: plural clearly wins. I bet it's semantically better, as the folder does not contain a test, but multiple tests.
pyproject.toml: to declare dependencies of your setup.py
# Shebangs
```python
shebangs = (raw_data.loc['2008-01-01':,
                         raw_data.columns
                         .map(lambda col: col.startswith('shebang:'))].sum())
```
```python
top_shebangs = shebangs.sort_values().tail(4).index
```
```python
shebangs_plot = (raw_data.loc['2008-01-01':, top_shebangs]
                 .fillna(value=0).resample('6M').mean().plot())
```
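How the `shebang:` columns were collected is not shown in the slides. A minimal sketch, assuming only the first line of each script is inspected, could look like this; `read_shebang` and the sample lines are made up for illustration.

```python
from collections import Counter


def read_shebang(first_line):
    """Return the interpreter part of a shebang line, or None if there is none."""
    if not first_line.startswith("#!"):
        return None
    return first_line[2:].strip()


# Made-up first lines of a few Python files.
first_lines = [
    "#!/usr/bin/env python\n",
    "#!/usr/bin/env python3\n",
    "import os\n",
    "#!/usr/bin/env python\n",
]

found = [read_shebang(line) for line in first_lines]
shebang_counts = Counter(shebang for shebang in found if shebang is not None)
# Keep only the most frequent ones, like the tail(4) selection above.
top_two = [shebang for shebang, _ in shebang_counts.most_common(2)]
```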
## Shebangs
![](static/emergence-shebang.png)
Notes:
I'm glad there's not so much `#!/usr/bin/env python2.7` here.
I'm not sure it's a good idea to specify a version in the shebang, but...
# Licenses
```python
top_licenses = raw_data.groupby('license').size().sort_values().tail(10)
licenses = (raw_data.groupby('license')
            .resample('Y')['license']
            .size()
            .unstack()
            .T
            .fillna(0)
            .loc[:, list(top_licenses.index)]
            .apply(lambda line: 100 * line / float(line.sum()), axis=1)
            .plot())
```
## Licenses
![](static/emergence-licenses.png)
## Digression
https://choosealicense.com/
# Questions?
<br/><br/>
- julien@python.org
- Twitter @sizeof
- https://mdk.fr
