forked from fcode/delarte
Update readme and doc.
parent
c0feaa820a
commit
6057014598
306
README.md
|
@ -1,70 +1,288 @@
|
||||||
`delarte`
|
`delarte`
|
||||||
=========
|
=========
|
||||||
|
|
||||||
🚧 Du code a mettre au propre, dans le seul but de faire du python
|
🎬 ArteTV downloader
|
||||||
|
|
||||||
|
|
||||||
💡 Mais c’est quoi?
|
💡 What is it?
|
||||||
-------------------
|
---------------
|
||||||
|
|
||||||
Récupérer un flux vidéo dans un fichier local avec sous titres.
|
This is a toy/research project whose only goal is to become familiar with some of the technologies involved in multilingual video streaming. Using this program may violate the usage policy of the ArteTV website, and we do not recommend using it for any purpose other than studying the code.
|
||||||
|
|
||||||
|
ArteTV is a European public service channel dedicated to culture. Programmes are usually available with multiple audio and subtitle languages.
|
||||||
|
|
||||||
🚀 Chauffe Marcel!
|
🚀 Quick start
|
||||||
------------------
|
---------------
|
||||||
|
|
||||||
_(pour distribution de famille Debian, adapter les commandes sinon)_
|
_(for Debian-family Linux distributions; adapt the commands otherwise)_
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
git clone https://git.afpy.org/fcode/delarte.git && cd delarte
|
|
||||||
sudo apt install ffmpeg
|
sudo apt install ffmpeg
|
||||||
mkdir ~/.venvs && python3 -m venv ~/.venvs/delarte
|
mkdir ~/.venvs && python3 -m venv ~/.venvs/delarte
|
||||||
source ~/.venvs/delarte/bin/activate
|
source ~/.venvs/delarte/bin/activate
|
||||||
|
git clone https://gitlab.com/Barbagus/delarte.git && cd delarte
|
||||||
pip install -r requirements.txt
|
pip install -r requirements.txt
|
||||||
export PATH_FFMPEG=$(which ffmpeg)
|
export PATH_FFMPEG=$(which ffmpeg)
|
||||||
./delarte.py https://www.arte.tv/fr/videos/093644-001-A/l-incroyable-periple-de-magellan-1-4/
|
|
||||||
Available versions:
|
|
||||||
VF - Français
|
|
||||||
VO-STF - Version originale - ST français
|
|
||||||
VF-STMF - Français (sourds et malentendants)
|
|
||||||
VFAUD - Français (audiodescription)
|
|
||||||
VA-STA - Allemand
|
|
||||||
VA-STMA - Allemand (sourds et malentendants)
|
|
||||||
VAAUD - Allemand (audiodescription)
|
|
||||||
./delarte.py https://www.arte.tv/fr/videos/093644-001-A/l-incroyable-periple-de-magellan-1-4/ VO-STF
|
|
||||||
Available resolutions:
|
|
||||||
1080
|
|
||||||
720
|
|
||||||
432
|
|
||||||
360
|
|
||||||
216
|
|
||||||
$ ./delarte.py https://www.arte.tv/fr/videos/093644-001-A/l-incroyable-periple-de-magellan-1-4/ VO-STF 720
|
|
||||||
ffmpeg version 4.3.5-0+deb11u1 Copyright (c) 2000-2022 the FFmpeg developers
|
|
||||||
frame=78910 fps=1204 q=-1.0 Lsize= 738210kB time=00:52:36.45 bitrate=1915.9kbits/s speed=48.2x
|
|
||||||
video:685949kB audio:50702kB subtitle:9kB other streams:0kB global headers:0kB muxing overhead: 0.210475%
|
|
||||||
```
|
```
|
||||||
|
|
||||||
🔧 Tripoter sous le capot
|
```bash
|
||||||
-------------------------
|
./delarte.py <PROGRAM_PAGE_URL> <VERSION> <RESOLUTION>
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
### 🚀 Chauffe Marcel!
|
🔧 How it works
|
||||||
|
----------------
|
||||||
|
|
||||||
- `Python 3.10` à été utilisé
|
### 🏗️ The streaming infrastructure
|
||||||
- Code formaté avec [`black`](https://pypi.org/project/black) & [`pydocstyle`](https://pypi.org/project/pydocstyle/)
|
|
||||||
- Installation des outils de développement:
|
Every video program has a _program identifier_ visible in its web page URL:
|
||||||
* `pip install -r requirements-dev.txt`
|
|
||||||
- Un `Makefile` équipé: executer `make help` pour le détail
|
```
|
||||||
- Un _git hook_ de `pre-commit`
|
https://www.arte.tv/es/videos/110139-000-A/fromental-halevy-la-tempesta/
|
||||||
* `make init-pre_commit`
|
https://www.arte.tv/fr/videos/100204-001-A/esprit-d-hiver-1-3/
|
||||||
|
https://www.arte.tv/en/videos/104001-000-A/clint-eastwood/
|
||||||
|
```
|
||||||
|
|
||||||
|
That _program identifier_ enables us to query an API for the program's information.
|
||||||
|
|
||||||
|
##### The _config_ API
|
||||||
|
|
||||||
|
For the last example, the API call is:
|
||||||
|
|
||||||
|
```
|
||||||
|
https://api.arte.tv/api/player/v2/config/en/104001-000-A
|
||||||
|
```
|
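As a minimal sketch of that mapping, the snippet below derives the _config_ API URL from a program page URL. It assumes the page path always has the shape `<language>/videos/<program identifier>/<slug>` seen in the examples above; the function names are hypothetical and not taken from `delarte.py`.

```python
from urllib.parse import urlparse

# Base of the config API, as seen in the example call above.
CONFIG_API = "https://api.arte.tv/api/player/v2/config"


def parse_program_url(page_url):
    """Return (language, program_id) extracted from a program page URL."""
    parts = urlparse(page_url).path.strip("/").split("/")
    # Assumed path shape: <language>/videos/<program_id>/<slug>
    if len(parts) < 3 or parts[1] != "videos":
        raise ValueError(f"unexpected program page URL: {page_url}")
    return parts[0], parts[2]


def config_api_url(page_url):
    """Build the config API URL for a program page URL."""
    language, program_id = parse_program_url(page_url)
    return f"{CONFIG_API}/{language}/{program_id}"


print(config_api_url("https://www.arte.tv/en/videos/104001-000-A/clint-eastwood/"))
# prints https://api.arte.tv/api/player/v2/config/en/104001-000-A
```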
||||||
|
|
||||||
|
The response is a JSON object:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"id": "104001-000-A_en",
|
||||||
|
"type": "ConfigPlayer",
|
||||||
|
"attributes": {
|
||||||
|
"metadata": {
|
||||||
|
"providerId": "104001-000-A",
|
||||||
|
"language": "en",
|
||||||
|
"title": "Clint Eastwood",
|
||||||
|
"subtitle": "The Last Legend",
|
||||||
|
"description": "70 years of career in front of and behind the camera and still active at 90, Clint Eastwood is a Hollywood legend. A look back at his unique career through a portrait that explores the complexity of the Eastwood myth.",
|
||||||
|
"duration": { "seconds": 4652 },
|
||||||
|
...
|
||||||
|
},
|
||||||
|
"streams": [
|
||||||
|
{
|
||||||
|
"url": "https://.../104001-000-A_VOF-STE%5BANG%5D_XQ.m3u8",
|
||||||
|
"versions": [
|
||||||
|
{
|
||||||
|
"label": "English (Subtitles)",
|
||||||
|
"shortLabel": "OGsub-ANG",
|
||||||
|
"eStat": {
|
||||||
|
"ml5": "VOF-STE[ANG]"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
],
|
||||||
|
...
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"url": "https://.../104001-000-A_VOF-STF_XQ.m3u8",
|
||||||
|
"versions": [
|
||||||
|
{
|
||||||
|
"label": "French (Original)",
|
||||||
|
"shortLabel": "FR",
|
||||||
|
"eStat": {
|
||||||
|
"ml5": "VOF-STF"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
],
|
||||||
|
...
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"url": "https://.../104001-000-A_VOF-STMF_XQ.m3u8",
|
||||||
|
"versions": [
|
||||||
|
{
|
||||||
|
"label": "Original french version - closed captioning (FR)",
|
||||||
|
"shortLabel": "ccFR",
|
||||||
|
"eStat": {
|
||||||
|
"ml5": "VOF-STMF"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
],
|
||||||
|
...
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"url": "https://.../104001-000-A_VA-STA_XQ.m3u8",
|
||||||
|
"versions": [
|
||||||
|
{
|
||||||
|
"label": "German (Dubbed)",
|
||||||
|
"shortLabel": "DE",
|
||||||
|
"eStat": {
|
||||||
|
"ml5": "VA-STA"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
],
|
||||||
|
...
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"url": "https://.../104001-000-A_VA-STMA_XQ.m3u8",
|
||||||
|
"versions": [
|
||||||
|
{
|
||||||
|
"label": "German closed captioning ",
|
||||||
|
"shortLabel": "ccDE",
|
||||||
|
"eStat": {
|
||||||
|
"ml5": "VA-STMA"
|
||||||
|
}
|
||||||
|
}
|
||||||
|
],
|
||||||
|
...
|
||||||
|
}
|
||||||
|
],
|
||||||
|
...
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
Information about the program is detailed in `data.attributes.metadata`, and the list of available audio/subtitles combinations is in `data.attributes.streams`. In our code, such a combination is referred to as a _version_.
|
||||||
|
|
||||||
|
Every such _version_ has a reference to a _version index_ file in `.streams[i].url` and a description of the audio/subtitle combination in `.streams[i].versions[0]`.
|
||||||
|
|
||||||
|
We are using `.streams[i].versions[0].eStat.ml5` as our _version codes_:
|
||||||
|
|
||||||
|
- `VOF-STE[ANG]` English (Subtitles)
|
||||||
|
- `VOF-STF` French (Original)
|
||||||
|
- `VOF-STMF` Original french version - closed captioning (FR)
|
||||||
|
- `VA-STA` German (Dubbed)
|
||||||
|
- `VA-STMA` German closed captioning
|
||||||
|
- ...
|
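The lookup described above could be sketched as follows. The `sample` dictionary is a hand-trimmed copy of the response shown earlier (the elided URLs are kept elided), and `list_versions` is a hypothetical helper name, not necessarily what the project uses.

```python
def list_versions(config):
    """Map each version code to its (label, version index URL)."""
    versions = {}
    for stream in config["data"]["attributes"]["streams"]:
        description = stream["versions"][0]
        code = description["eStat"]["ml5"]
        versions[code] = (description["label"], stream["url"])
    return versions


# Trimmed-down copy of the config API response shown above.
sample = {
    "data": {
        "attributes": {
            "streams": [
                {
                    "url": "https://.../104001-000-A_VOF-STF_XQ.m3u8",
                    "versions": [
                        {"label": "French (Original)", "eStat": {"ml5": "VOF-STF"}}
                    ],
                },
                {
                    "url": "https://.../104001-000-A_VA-STA_XQ.m3u8",
                    "versions": [
                        {"label": "German (Dubbed)", "eStat": {"ml5": "VA-STA"}}
                    ],
                },
            ]
        }
    }
}

for code, (label, url) in list_versions(sample).items():
    print(f"{code} - {label}")
```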
||||||
|
|
||||||
|
##### The _version index_ file
|
||||||
|
|
||||||
|
The file is in [HTTP Live Streaming](https://www.rfc-editor.org/rfc/rfc8216) `.m3u8` format:
|
||||||
|
|
||||||
|
```
|
||||||
|
#EXTM3U
|
||||||
|
...
|
||||||
|
#EXT-X-STREAM-INF:BANDWIDTH=2335200,AVERAGE-BANDWIDTH=1123304,VIDEO-RANGE=SDR,CODECS="avc1.4d401e,mp4a.40.2",RESOLUTION=768x432,FRAME-RATE=25.000,AUDIO="program_audio_0",SUBTITLES="subs"
|
||||||
|
medias/104001-000-A_v432.m3u8
|
||||||
|
#EXT-X-STREAM-INF:BANDWIDTH=4534432,AVERAGE-BANDWIDTH=2124680,VIDEO-RANGE=SDR,CODECS="avc1.4d0028,mp4a.40.2",RESOLUTION=1920x1080,FRAME-RATE=25.000,AUDIO="program_audio_0",SUBTITLES="subs"
|
||||||
|
medias/104001-000-A_v1080.m3u8
|
||||||
|
#EXT-X-STREAM-INF:BANDWIDTH=4153392,AVERAGE-BANDWIDTH=1917840,VIDEO-RANGE=SDR,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=1280x720,FRAME-RATE=25.000,AUDIO="program_audio_0",SUBTITLES="subs"
|
||||||
|
medias/104001-000-A_v720.m3u8
|
||||||
|
#EXT-X-STREAM-INF:BANDWIDTH=1445432,AVERAGE-BANDWIDTH=726160,VIDEO-RANGE=SDR,CODECS="avc1.4d401e,mp4a.40.2",RESOLUTION=640x360,FRAME-RATE=25.000,AUDIO="program_audio_0",SUBTITLES="subs"
|
||||||
|
medias/104001-000-A_v360.m3u8
|
||||||
|
#EXT-X-STREAM-INF:BANDWIDTH=815120,AVERAGE-BANDWIDTH=429104,VIDEO-RANGE=SDR,CODECS="avc1.42e00d,mp4a.40.2",RESOLUTION=384x216,FRAME-RATE=25.000,AUDIO="program_audio_0",SUBTITLES="subs"
|
||||||
|
medias/104001-000-A_v216.m3u8
|
||||||
|
...
|
||||||
|
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="program_audio_0",LANGUAGE="fr",NAME="VOF",AUTOSELECT=YES,DEFAULT=YES,URI="medias/104001-000-A_aud_VOF.m3u8"
|
||||||
|
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",DEFAULT=YES,AUTOSELECT=YES,FORCED=NO,LANGUAGE="en",URI="medias/104001-000-A_st_VO-ANG.m3u8"
|
||||||
|
...
|
||||||
|
```
|
||||||
|
|
||||||
|
This can be parsed with the [m3u8](https://pypi.org/project/m3u8/) library.
|
||||||
|
|
||||||
|
This file shows a list of _video index_ URIs (one per video resolution). Each of them is linked to exactly one _audio index_ file and at most one _subtitles index_ file.
|
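To illustrate how the _video index_, _audio index_ and _subtitles index_ relate, here is a standard-library-only sketch that reads the tags shown above. It is a simplified stand-in for the `m3u8` library, not a full RFC 8216 parser, and the helper names are hypothetical.

```python
import re

# KEY=VALUE pairs; a VALUE is either a quoted string or runs to the next comma.
ATTRIBUTE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')

SAMPLE = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=2335200,CODECS="avc1.4d401e,mp4a.40.2",RESOLUTION=768x432,AUDIO="program_audio_0",SUBTITLES="subs"
medias/104001-000-A_v432.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=4534432,CODECS="avc1.4d0028,mp4a.40.2",RESOLUTION=1920x1080,AUDIO="program_audio_0",SUBTITLES="subs"
medias/104001-000-A_v1080.m3u8
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="program_audio_0",LANGUAGE="fr",NAME="VOF",URI="medias/104001-000-A_aud_VOF.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",LANGUAGE="en",URI="medias/104001-000-A_st_VO-ANG.m3u8"
"""


def parse_attributes(text):
    """Parse an HLS attribute list into a dict, dropping surrounding quotes."""
    return {key: value.strip('"') for key, value in ATTRIBUTE.findall(text)}


def parse_version_index(playlist):
    """Return (variants, media) found in a version index playlist."""
    variants, media = [], []
    lines = [line.strip() for line in playlist.splitlines() if line.strip()]
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF:"):
            attrs = parse_attributes(line.split(":", 1)[1])
            attrs["URI"] = lines[i + 1]  # the video index URI is on the next line
            variants.append(attrs)
        elif line.startswith("#EXT-X-MEDIA:"):
            media.append(parse_attributes(line.split(":", 1)[1]))
    return variants, media


def select_indexes(variants, media, height):
    """Return (video, audio, subtitles) URIs for a height; subtitles may be None."""
    by_group = {(m["TYPE"], m["GROUP-ID"]): m.get("URI") for m in media}
    for variant in variants:
        if int(variant["RESOLUTION"].split("x")[1]) == height:
            audio = by_group[("AUDIO", variant["AUDIO"])]
            subtitles = by_group.get(("SUBTITLES", variant.get("SUBTITLES")))
            return variant["URI"], audio, subtitles
    raise ValueError(f"no variant with height {height}")


variants, media = parse_version_index(SAMPLE)
print(select_indexes(variants, media, 1080))
```

The URIs in the file are relative, so they would still need to be resolved against the _version index_ URL (for instance with `urllib.parse.urljoin`) before being fetched.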
||||||
|
|
||||||
|
##### The _video index_ files
|
||||||
|
|
||||||
|
The file is also in [HTTP Live Streaming](https://www.rfc-editor.org/rfc/rfc8216) `.m3u8` format:
|
||||||
|
|
||||||
|
```
|
||||||
|
#EXTM3U
|
||||||
|
#EXT-X-TARGETDURATION:6
|
||||||
|
#EXT-X-VERSION:7
|
||||||
|
#EXT-X-MEDIA-SEQUENCE:1
|
||||||
|
#EXT-X-INDEPENDENT-SEGMENTS
|
||||||
|
#EXT-X-PLAYLIST-TYPE:VOD
|
||||||
|
#EXT-X-MAP:URI="104001-000-A_v1080.mp4",BYTERANGE="28792@0"
|
||||||
|
#EXTINF:6.000,
|
||||||
|
#EXT-X-BYTERANGE:1734621@28792
|
||||||
|
104001-000-A_v1080.mp4
|
||||||
|
#EXTINF:6.000,
|
||||||
|
#EXT-X-BYTERANGE:1575303@1763413
|
||||||
|
104001-000-A_v1080.mp4
|
||||||
|
#EXTINF:6.000,
|
||||||
|
#EXT-X-BYTERANGE:1603739@3338716
|
||||||
|
104001-000-A_v1080.mp4
|
||||||
|
#EXTINF:6.000,
|
||||||
|
#EXT-X-BYTERANGE:1333835@4942455
|
||||||
|
104001-000-A_v1080.mp4
|
||||||
|
...
|
||||||
|
```
|
||||||
|
|
||||||
|
This file shows the list of _video chunks_ the server expects to serve.
|
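Each `#EXT-X-BYTERANGE:<length>@<offset>` tag designates a slice of the single `.mp4` file. As a small illustration (not part of the project, since `ffmpeg` handles this itself), the slice can be translated into an HTTP `Range` header value. The sketch assumes the offset is always present, as it is in the listing above.

```python
def byterange_to_http_range(byterange):
    """Convert an HLS '<length>@<offset>' byte range to an HTTP Range value."""
    length, offset = (int(number) for number in byterange.split("@"))
    # HTTP ranges are inclusive on both ends, hence the "- 1".
    return f"bytes={offset}-{offset + length - 1}"


print(byterange_to_http_range("1734621@28792"))
# prints bytes=28792-1763412
```

The first chunk ends at byte 1763412, and the next chunk in the listing indeed starts at offset 1763413.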
||||||
|
|
||||||
|
##### The _audio index_ file
|
||||||
|
|
||||||
|
Similarly to the _video index_ file, it shows the list of _audio chunks_ the server expects to serve:
|
||||||
|
|
||||||
|
```
|
||||||
|
#EXTM3U
|
||||||
|
#EXT-X-TARGETDURATION:6
|
||||||
|
#EXT-X-VERSION:7
|
||||||
|
#EXT-X-MEDIA-SEQUENCE:1
|
||||||
|
#EXT-X-INDEPENDENT-SEGMENTS
|
||||||
|
#EXT-X-PLAYLIST-TYPE:VOD
|
||||||
|
#EXT-X-MAP:URI="104001-000-A_aud_VOF.mp4",BYTERANGE="28752@0"
|
||||||
|
#EXTINF:5.991,
|
||||||
|
#EXT-X-BYTERANGE:82445@28752
|
||||||
|
104001-000-A_aud_VOF.mp4
|
||||||
|
#EXTINF:5.991,
|
||||||
|
#EXT-X-BYTERANGE:99299@111197
|
||||||
|
104001-000-A_aud_VOF.mp4
|
||||||
|
#EXTINF:5.991,
|
||||||
|
#EXT-X-BYTERANGE:101640@210496
|
||||||
|
104001-000-A_aud_VOF.mp4
|
||||||
|
#EXTINF:5.991,
|
||||||
|
#EXT-X-BYTERANGE:102047@312136
|
||||||
|
104001-000-A_aud_VOF.mp4
|
||||||
|
...
|
||||||
|
```
|
||||||
|
|
||||||
|
##### The _subtitles index_ file
|
||||||
|
|
||||||
|
The file is also in [HTTP Live Streaming](https://www.rfc-editor.org/rfc/rfc8216) `.m3u8` format:
|
||||||
|
|
||||||
|
```
|
||||||
|
#EXTM3U
|
||||||
|
#EXT-X-VERSION:7
|
||||||
|
#EXT-X-TARGETDURATION:4650
|
||||||
|
#EXT-X-MEDIA-SEQUENCE:1
|
||||||
|
#EXT-X-PLAYLIST-TYPE:VOD
|
||||||
|
#EXTINF:4650,
|
||||||
|
104001-000-A_st_VO-ANG.vtt
|
||||||
|
#EXT-X-ENDLIST
|
||||||
|
```
|
||||||
|
|
||||||
|
This file shows the file(s) containing the subtitles data.
|
||||||
|
|
||||||
|
### ⚙️ The process
|
||||||
|
|
||||||
|
1. Get the _config_ API object for the _program identifier_
|
||||||
|
1.1 Figure out the _output filename_ from _metadata_.
|
||||||
|
1.2 Select a _version_.
|
||||||
|
2. Get the _version index_ file
|
||||||
|
2.1 Select a resolution _video index_ along with its _audio index_ and _subtitles index_
|
||||||
|
3. Get the subtitles in `vtt` format and convert them to `srt`
|
||||||
|
4. Feed the _video index_, _audio index_ and `srt` file to `ffmpeg`
|
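Step 3 of the process above can be pictured with the following standard-library sketch. The project itself loads the subtitles with `webvtt-py`; this stand-in only renumbers the cues and rewrites the timestamps, and it leaves any inline WebVTT markup in the cue text untouched. The function names are hypothetical.

```python
import re

# WebVTT timestamps are [hh:]mm:ss.ttt; SRT requires hh:mm:ss,ttt.
TIMESTAMP = re.compile(r"(\d{2}:)?(\d{2}:\d{2})\.(\d{3})")


def _srt_time(match):
    """Rewrite one WebVTT timestamp match as an SRT timestamp."""
    hours = match.group(1) or "00:"
    return f"{hours}{match.group(2)},{match.group(3)}"


def vtt_to_srt(vtt_text):
    """Convert WebVTT text to SRT text (cue numbering and timestamps only)."""
    cues = []
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.splitlines()
        # Find the timing line; blocks without one (header, NOTE, STYLE) are skipped.
        timing = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing is None:
            continue
        start, end = lines[timing].split("-->")
        end = end.split()[0]  # drop cue settings such as "line:90%"
        times = (
            f"{TIMESTAMP.sub(_srt_time, start.strip())} --> "
            f"{TIMESTAMP.sub(_srt_time, end)}"
        )
        cues.append("\n".join([str(len(cues) + 1), times, *lines[timing + 1:]]))
    return "\n\n".join(cues) + "\n"


example = "WEBVTT\n\n00:00:01.000 --> 00:00:04.000 line:90%\nHello\n\n00:05.500 --> 00:07.000\nWorld\n"
print(vtt_to_srt(example))
```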
||||||
|
|
||||||
|
### 📽️ FFMPEG
|
||||||
|
|
||||||
|
The actual build of the video file is handled by [ffmpeg](https://ffmpeg.org/). The script expects `ffmpeg` to be installed in the environment and calls it as a subprocess.
|
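For illustration only, such a subprocess call might be assembled as below. This is an assumption about what a suitable invocation could look like, not the command `delarte.py` actually runs: the explicit `-map` options, the stream copy (`-c copy`) and the Matroska output container are choices made for this sketch. The `PATH_FFMPEG` environment variable is the one exported in the quick start.

```python
import os
import subprocess


def build_ffmpeg_command(video_index_url, audio_index_url, srt_path, output_path):
    """Assemble an ffmpeg command muxing one video, one audio and one subtitle input."""
    ffmpeg = os.environ.get("PATH_FFMPEG", "ffmpeg")
    return [
        ffmpeg,
        "-i", video_index_url,  # input 0: video index playlist
        "-i", audio_index_url,  # input 1: audio index playlist
        "-i", srt_path,         # input 2: converted subtitles
        "-map", "0:v",
        "-map", "1:a",
        "-map", "2:s",
        "-c", "copy",           # copy streams without re-encoding
        output_path,
    ]


def run_ffmpeg(command):
    """Run the command and raise CalledProcessError if ffmpeg fails."""
    subprocess.run(command, check=True)


command = build_ffmpeg_command(
    "https://.../medias/104001-000-A_v720.m3u8",
    "https://.../medias/104001-000-A_aud_VOF.m3u8",
    "subtitles.srt",
    "output.mkv",
)
print(" ".join(command))
```

Building the argument list separately from running it keeps the command easy to inspect and avoids going through a shell.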
||||||
|
|
||||||
|
##### Why not use FFMPEG directly with the _version index_ URL?
|
||||||
|
|
||||||
|
So that we can select the video resolution ourselves rather than rely on stream mapping arguments in `ffmpeg`.
|
||||||
|
|
||||||
|
##### Why not use VTT subtitles directly?
|
||||||
|
|
||||||
|
Because it fails 😒.
|
||||||
|
|
||||||
|
|
||||||
### 📌 Dépendances
|
### 📌 Dependencies
|
||||||
|
|
||||||
Voir [`requirements.txt`](requirements.txt) & [`requirements-dev.txt`](requirements-dev.txt)
|
- [m3u8](https://pypi.org/project/m3u8/) to parse index files.
|
||||||
|
- [webvtt-py](https://pypi.org/project/webvtt-py/) to load `vtt` subtitles files.
|
||||||
|
|
||||||
|
### 🤝 Help
|
||||||
|
|
||||||
### 🤝 Filer un coup de main
|
For sure! The more the merrier.
|
||||||
|
|
||||||
- Question, suggestion ➡️ [_ticket du projet_](https://git.afpy.org/fcode/delarte/issues/new)
|
|
||||||
- Balance ton code ➡️ [_demande de fusion_](https://git.afpy.org/fcode/delarte/compare/devel)
|
|
||||||
|
|
|
@ -3,11 +3,11 @@
|
||||||
|
|
||||||
"""delarte.
|
"""delarte.
|
||||||
|
|
||||||
Retrieve video stream in a local file, including sub-titles
|
ArteTV downloader
|
||||||
|
|
||||||
Licence: GNU AGPL v3: http://www.gnu.org/licenses/
|
Licence: GNU AGPL v3: http://www.gnu.org/licenses/
|
||||||
|
|
||||||
This file is part of [`delarte`](https://git.afpy.org/fcode/delarte)
|
This file is part of [`delarte`](https://gitlab.com/Barbagus/delarte)
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|