Update README to reflect changes
parent
e1bed8b1be
commit
e4cba27bdd
213
README.md
|
@ -7,9 +7,9 @@
|
||||||
💡 What is it ?
|
💡 What is it ?
|
||||||
---------------
|
---------------
|
||||||
|
|
||||||
This is a toy/research project whose only goal is to become familiar with some of the technologies involved in multi-lingual video streaming. Using this program may violate the usage policy of the ArteTV website, and we do not recommend using it for any purpose other than studying the code.
|
This is a toy/research project whose primary goal is to become familiar with some of the technologies involved in multi-lingual video streaming. Using this program may violate the usage policy of the ArteTV website, and we do not recommend using it for any purpose other than studying the code.
|
||||||
|
|
||||||
ArteTV is a European public service channel dedicated to culture. Available programmes are usually available with multiple audio and subtitle languages.
|
ArteTV is a European public service channel dedicated to culture. Programmes are usually available with multiple audio and subtitle languages.
|
||||||
|
|
||||||
🚀 Quick start
|
🚀 Quick start
|
||||||
---------------
|
---------------
|
||||||
|
@ -59,7 +59,7 @@ usage: delarte [-h|--help] - print this message
|
||||||
🔧 How it works
|
🔧 How it works
|
||||||
----------------
|
----------------
|
||||||
|
|
||||||
### 🏗️ The streaming infrastructure
|
## 🏗️ The streaming infrastructure
|
||||||
|
|
||||||
Every video program has a _program identifier_ visible in its web page URL:
|
Every video program has a _program identifier_ visible in its web page URL:
|
||||||
|
|
||||||
|
@ -71,7 +71,7 @@ https://www.arte.tv/en/videos/104001-000-A/clint-eastwood/
|
||||||
|
|
||||||
That _program identifier_ enables us to query an API for the program's information.
|
That _program identifier_ enables us to query an API for the program's information.
|
||||||
|
|
||||||
##### The _config_ API
|
### The _config_ API
|
||||||
|
|
||||||
For the example above, the API call is:
|
For the example above, the API call is:
|
||||||
|
|
||||||
|
@ -79,217 +79,68 @@ For the last example the API call is as such:
|
||||||
https://api.arte.tv/api/player/v2/config/en/104001-000-A
|
https://api.arte.tv/api/player/v2/config/en/104001-000-A
|
||||||
```
|
```
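
The mapping from a program page URL to the _config_ API URL can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper name `config_api_url` and the regular expression are assumptions, inferred only from the two example URLs shown in this README.

```python
import re

# Assumed URL shape, inferred from the single example page URL in this README:
#   https://www.arte.tv/<lang>/videos/<program identifier>/<slug>/
PROGRAM_URL = re.compile(
    r"^https://www\.arte\.tv/(?P<lang>[a-z]{2})/videos/(?P<id>\d{6}-\d{3}-[A-Z])/"
)


def config_api_url(page_url):
    """Build the config API URL for a program page URL (hypothetical helper)."""
    match = PROGRAM_URL.match(page_url)
    if match is None:
        raise ValueError(f"not a recognised program page URL: {page_url}")
    return f"https://api.arte.tv/api/player/v2/config/{match['lang']}/{match['id']}"
```

Identifiers that do not follow the `NNNNNN-NNN-X` shape of the example would be rejected by this pattern, so it should be read as a starting point only.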
|
||||||
|
|
||||||
The response is a JSON object:
|
The response is a JSON object, a sample of which can be found [here](https://git.afpy.org/fcode/delarte/src/branch/stable/samples/api/config-105612-000-A.json).
|
||||||
|
|
||||||
```json
|
Information about the program is detailed in `$.data.attributes.metadata`, and the available audio/subtitles combinations are listed in `$.data.attributes.streams`. In our code such a combination is referred to as a _rendition_ (or _version_ in the CLI).
|
||||||
{
|
|
||||||
"data": {
|
|
||||||
"id": "104001-000-A_en",
|
|
||||||
"type": "ConfigPlayer",
|
|
||||||
"attributes": {
|
|
||||||
"metadata": {
|
|
||||||
"providerId": "104001-000-A",
|
|
||||||
"language": "en",
|
|
||||||
"title": "Clint Eastwood",
|
|
||||||
"subtitle": "The Last Legend",
|
|
||||||
"description": "70 years of career in front of and behind the camera and still active at 90, Clint Eastwood is a Hollywood legend. A look back at his unique career through a portrait that explores the complexity of the Eastwood myth.",
|
|
||||||
"duration": { "seconds": 4652 },
|
|
||||||
...
|
|
||||||
},
|
|
||||||
"streams": [
|
|
||||||
{
|
|
||||||
"url": "https://.../104001-000-A_VOF-STE%5BANG%5D_XQ.m3u8",
|
|
||||||
"versions": [
|
|
||||||
{
|
|
||||||
"label": "English (Subtitles)",
|
|
||||||
"shortLabel": "OGsub-ANG",
|
|
||||||
"eStat": {
|
|
||||||
"ml5": "VOF-STE[ANG]"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
],
|
|
||||||
...
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"url": "https://.../104001-000-A_VOF-STF_XQ.m3u8",
|
|
||||||
"versions": [
|
|
||||||
{
|
|
||||||
"label": "French (Original)",
|
|
||||||
"shortLabel": "FR",
|
|
||||||
"eStat": {
|
|
||||||
"ml5": "VOF-STF"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
],
|
|
||||||
...
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"url": "https://.../104001-000-A_VOF-STMF_XQ.m3u8",
|
|
||||||
"versions": [
|
|
||||||
{
|
|
||||||
"label": "Original french version - closed captioning (FR)",
|
|
||||||
"shortLabel": "ccFR",
|
|
||||||
"eStat": {
|
|
||||||
"ml5": "VOF-STMF"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
],
|
|
||||||
...
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"url": "https://.../104001-000-A_VA-STA_XQ.m3u8",
|
|
||||||
"versions": [
|
|
||||||
{
|
|
||||||
"label": "German (Dubbed)",
|
|
||||||
"shortLabel": "DE",
|
|
||||||
"eStat": {
|
|
||||||
"ml5": "VA-STA"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
],
|
|
||||||
...
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"url": "https://.../104001-000-A_VA-STMA_XQ.m3u8",
|
|
||||||
"versions": [
|
|
||||||
{
|
|
||||||
"label": "German closed captioning ",
|
|
||||||
"shortLabel": "ccDE",
|
|
||||||
"eStat": {
|
|
||||||
"ml5": "VA-STMA"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
],
|
|
||||||
...
|
|
||||||
}
|
|
||||||
],
|
|
||||||
...
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
Information about the program is detailed in `data.attributes.metadata`, and the available audio/subtitles combinations are listed in `data.attributes.streams`. In our code such a combination is referred to as a _rendition_ (or _version_ in the CLI).
|
|
||||||
|
|
||||||
Every such _rendition_ has a reference to a _master playlist_ file in `.streams[i].url` and description of the audio/subtitle combination in `.streams[i].versions[0]`.
|
Every such _rendition_ has a reference to a _master playlist_ file in `.streams[i].url`.
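
As an illustration of the structure described here, the sketch below walks a _config_ object and collects one entry per _rendition_. It is not the project's implementation: `list_renditions` is a hypothetical name, the sample dictionary is a heavily abridged copy of the JSON shown in this diff, and using `versions[0].eStat.ml5` as the key follows the removed lines of the README rather than anything confirmed for the new revision.

```python
def list_renditions(config):
    """Return (key, label, master playlist URL) for every stream (hypothetical helper)."""
    renditions = []
    for stream in config["data"]["attributes"]["streams"]:
        version = stream["versions"][0]
        renditions.append((version["eStat"]["ml5"], version["label"], stream["url"]))
    return renditions


# Abridged sample; the URL is elided exactly as in the README.
sample_config = {
    "data": {
        "attributes": {
            "metadata": {"providerId": "104001-000-A", "title": "Clint Eastwood"},
            "streams": [
                {
                    "url": "https://.../104001-000-A_VOF-STF_XQ.m3u8",
                    "versions": [
                        {
                            "label": "French (Original)",
                            "shortLabel": "FR",
                            "eStat": {"ml5": "VOF-STF"},
                        }
                    ],
                }
            ],
        }
    }
}

for key, label, url in list_renditions(sample_config):
    print(key, label, url)
```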
|
||||||
|
|
||||||
We are using `.streams[i].versions[0].eStat.ml5` as our _rendition_ key:
|
### The _master playlist_ file
|
||||||
|
|
||||||
- `VOF-STE[ANG]` English (Subtitles)
|
The _master playlist_ is defined in [HTTP Live Streaming](https://www.rfc-editor.org/rfc/rfc8216) (sample files can be found [here](https://git.afpy.org/fcode/delarte/src/branch/stable/samples/hls/master-105612-000-A_VOF-STMF_XQ.m3u8) or [here](https://git.afpy.org/fcode/delarte/src/branch/stable/samples/hls/master-105612-000-A_VA-STA_XQ.m3u8)). This file lists the video _variant_ URIs (one per video resolution). Each of them has:
|
||||||
- `VOF-STF` French (Original)
|
|
||||||
- `VOF-STMF` Original french version - closed captioning (FR)
|
|
||||||
- `VA-STA` German (Dubbed)
|
|
||||||
- `VA-STMA` German closed captioning
|
|
||||||
- ...
|
|
||||||
|
|
||||||
#### The _master playlist_
|
|
||||||
|
|
||||||
As defined in [HTTP Live Streaming](https://www.rfc-editor.org/rfc/rfc8216), for example:
|
|
||||||
|
|
||||||
```
|
|
||||||
#EXTM3U
|
|
||||||
...
|
|
||||||
#EXT-X-STREAM-INF:BANDWIDTH=2335200,AVERAGE-BANDWIDTH=1123304,VIDEO-RANGE=SDR,CODECS="avc1.4d401e,mp4a.40.2",RESOLUTION=768x432,FRAME-RATE=25.000,AUDIO="program_audio_0",SUBTITLES="subs"
|
|
||||||
medias/104001-000-A_v432.m3u8
|
|
||||||
#EXT-X-STREAM-INF:BANDWIDTH=4534432,AVERAGE-BANDWIDTH=2124680,VIDEO-RANGE=SDR,CODECS="avc1.4d0028,mp4a.40.2",RESOLUTION=1920x1080,FRAME-RATE=25.000,AUDIO="program_audio_0",SUBTITLES="subs"
|
|
||||||
medias/104001-000-A_v1080.m3u8
|
|
||||||
#EXT-X-STREAM-INF:BANDWIDTH=4153392,AVERAGE-BANDWIDTH=1917840,VIDEO-RANGE=SDR,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=1280x720,FRAME-RATE=25.000,AUDIO="program_audio_0",SUBTITLES="subs"
|
|
||||||
medias/104001-000-A_v720.m3u8
|
|
||||||
#EXT-X-STREAM-INF:BANDWIDTH=1445432,AVERAGE-BANDWIDTH=726160,VIDEO-RANGE=SDR,CODECS="avc1.4d401e,mp4a.40.2",RESOLUTION=640x360,FRAME-RATE=25.000,AUDIO="program_audio_0",SUBTITLES="subs"
|
|
||||||
medias/104001-000-A_v360.m3u8
|
|
||||||
#EXT-X-STREAM-INF:BANDWIDTH=815120,AVERAGE-BANDWIDTH=429104,VIDEO-RANGE=SDR,CODECS="avc1.42e00d,mp4a.40.2",RESOLUTION=384x216,FRAME-RATE=25.000,AUDIO="program_audio_0",SUBTITLES="subs"
|
|
||||||
medias/104001-000-A_v216.m3u8
|
|
||||||
...
|
|
||||||
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="program_audio_0",LANGUAGE="fr",NAME="VOF",AUTOSELECT=YES,DEFAULT=YES,URI="medias/104001-000-A_aud_VOF.m3u8"
|
|
||||||
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",DEFAULT=YES,AUTOSELECT=YES,FORCED=NO,LANGUAGE="en",URI="medias/104001-000-A_st_VO-ANG.m3u8"
|
|
||||||
...
|
|
||||||
```
|
|
||||||
|
|
||||||
This file lists the video _variant_ URIs (one per video resolution). Each of them has:
|
|
||||||
- exactly one video _media playlist_ reference
|
- exactly one video _media playlist_ reference
|
||||||
- exactly one audio _media playlist_ reference
|
- exactly one audio _media playlist_ reference
|
||||||
- at most one subtitles _media playlist_ reference
|
- at most one subtitles _media playlist_ reference
|
||||||
|
|
||||||
##### The video and audio _media playlist_
|
Audio and subtitles track references also include:
|
||||||
|
- a two-letter `language` code attribute (`mul` is used for audio with multiple languages)
|
||||||
|
- a free form `name` attribute that is used to detect an audio _original version_
|
||||||
|
- a coded `characteristics` attribute that is used to detect accessibility tracks (audio or textual description)
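
To make the _variant_ and track attributes concrete, here is a small standard-library sketch that pulls them out of a _master playlist_. The project itself relies on the `m3u8` package for this; the regular expression and the names `parse_attributes` and `parse_master` are assumptions made for illustration, and the sample is an abridged version of the playlist lines shown in this diff.

```python
import re

# Matches KEY=value pairs where the value is either quoted or runs to the next comma.
ATTRIBUTE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attributes(text):
    """Turn an HLS attribute list into a dict, dropping surrounding quotes."""
    return {key: value.strip('"') for key, value in ATTRIBUTE.findall(text)}


def parse_master(playlist):
    """Return (variants, tracks) found in a master playlist (hypothetical helper)."""
    variants, tracks = [], []
    lines = [line.strip() for line in playlist.splitlines() if line.strip()]
    for index, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF:"):
            attributes = parse_attributes(line.split(":", 1)[1])
            attributes["URI"] = lines[index + 1]  # the variant URI is on the next line
            variants.append(attributes)
        elif line.startswith("#EXT-X-MEDIA:"):
            tracks.append(parse_attributes(line.split(":", 1)[1]))
    return variants, tracks


sample_master = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=2335200,CODECS="avc1.4d401e,mp4a.40.2",RESOLUTION=768x432,AUDIO="program_audio_0",SUBTITLES="subs"
medias/104001-000-A_v432.m3u8
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="program_audio_0",LANGUAGE="fr",NAME="VOF",AUTOSELECT=YES,DEFAULT=YES,URI="medias/104001-000-A_aud_VOF.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="English",DEFAULT=YES,AUTOSELECT=YES,FORCED=NO,LANGUAGE="en",URI="medias/104001-000-A_st_VO-ANG.m3u8"
"""

variants, tracks = parse_master(sample_master)
```

The quoted alternative in the pattern matters because `CODECS` values contain commas; a naive split on `,` would cut them in two.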
|
||||||
|
|
||||||
As defined in [HTTP Live Streaming](https://www.rfc-editor.org/rfc/rfc8216), for example:
|
### The video and audio _media playlist_ file
|
||||||
|
|
||||||
```
|
The _media playlist_ is defined in [HTTP Live Streaming](https://www.rfc-editor.org/rfc/rfc8216) (sample files can be found [here](https://git.afpy.org/fcode/delarte/src/branch/stable/samples/hls/audio-105612-000-A_aud_VA.m3u8) or [here](https://git.afpy.org/fcode/delarte/src/branch/stable/samples/hls/video-105612-000-A_v1080.m3u8)). This file is essentially a list of _segments_ (HTTP ranges) that the client is supposed to download in sequence.
|
||||||
#EXTM3U
|
|
||||||
#EXT-X-TARGETDURATION:6
|
|
||||||
#EXT-X-VERSION:7
|
|
||||||
#EXT-X-MEDIA-SEQUENCE:1
|
|
||||||
#EXT-X-INDEPENDENT-SEGMENTS
|
|
||||||
#EXT-X-PLAYLIST-TYPE:VOD
|
|
||||||
#EXT-X-MAP:URI="104001-000-A_v1080.mp4",BYTERANGE="28792@0"
|
|
||||||
#EXTINF:6.000,
|
|
||||||
#EXT-X-BYTERANGE:1734621@28792
|
|
||||||
104001-000-A_v1080.mp4
|
|
||||||
#EXTINF:6.000,
|
|
||||||
#EXT-X-BYTERANGE:1575303@1763413
|
|
||||||
104001-000-A_v1080.mp4
|
|
||||||
#EXTINF:6.000,
|
|
||||||
#EXT-X-BYTERANGE:1603739@3338716
|
|
||||||
104001-000-A_v1080.mp4
|
|
||||||
#EXTINF:6.000,
|
|
||||||
#EXT-X-BYTERANGE:1333835@4942455
|
|
||||||
104001-000-A_v1080.mp4
|
|
||||||
...
|
|
||||||
```
|
|
||||||
|
|
||||||
This file shows the list of _segments_ the server expects to serve.
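
Each segment in the sample above is a byte range of a single `.mp4` file, written as `length@offset` in `#EXT-X-BYTERANGE`. A client would typically translate that into an HTTP `Range` header. The sketch below shows the arithmetic only; `http_range` is a hypothetical helper, and it handles just the explicit `length@offset` form (RFC 8216 also allows the offset to be omitted).

```python
def http_range(byterange):
    """Convert 'length@offset' into an HTTP Range header value (hypothetical helper)."""
    length, offset = (int(part) for part in byterange.split("@"))
    # HTTP ranges are inclusive on both ends, hence the "- 1".
    return f"bytes={offset}-{offset + length - 1}"


# Values taken from the first two segments of the sample playlist.
print(http_range("1734621@28792"))
print(http_range("1575303@1763413"))
```

The first range ends at byte 1763412, one byte before the second segment's offset of 1763413, which is what contiguous segments should look like.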
|
### The subtitles _media playlist_ file
|
||||||
|
|
||||||
|
The subtitles _media playlist_ is defined in [HTTP Live Streaming](https://www.rfc-editor.org/rfc/rfc8216) (a sample file can be found [here](https://git.afpy.org/fcode/delarte/src/branch/stable/samples/hls/subtitles-105612-000-A_st_VA-ALL.m3u8)). This file references the actual file containing the subtitles [VTT](https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API) data.
|
||||||
|
|
||||||
##### The subtitles _media playlist_
|
## ⚙️The process
|
||||||
|
|
||||||
As defined in [HTTP Live Streaming](https://www.rfc-editor.org/rfc/rfc8216), for example:
|
1. Figure out available _sources_ by:
|
||||||
|
- fetching the _config_ API object for the _program identifier_
|
||||||
|
- fetching all referenced _master playlists_.
|
||||||
|
2. Select the desired _source_ based on _renditions_ and _variants_ codes.
|
||||||
|
3. Figure out the _output filename_ from _source_ details.
|
||||||
|
|
||||||
```
|
4. Download video, audio and subtitles media content.
|
||||||
#EXTM3U
|
- convert `VTT` subtitles to `SRT`
|
||||||
#EXT-X-VERSION:7
|
|
||||||
#EXT-X-TARGETDURATION:4650
|
|
||||||
#EXT-X-MEDIA-SEQUENCE:1
|
|
||||||
#EXT-X-PLAYLIST-TYPE:VOD
|
|
||||||
#EXTINF:4650,
|
|
||||||
104001-000-A_st_VO-ANG.vtt
|
|
||||||
#EXT-X-ENDLIST
|
|
||||||
```
|
|
||||||
|
|
||||||
This file shows the file containing the subtitles data.
|
5. Feed all the media to `ffmpeg` for multiplexing (or _muxing_)
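
The `VTT` to `SRT` conversion mentioned in step 4 mostly comes down to numbering the cues and rewriting timestamps. The sketch below is a standard-library illustration under that assumption; the project loads subtitles with `webvtt-py`, and the names `srt_timestamp` and `to_srt`, as well as the `(start, end, text)` cue shape, are hypothetical.

```python
import re

# WebVTT timestamps use '.' before the milliseconds and may omit the hours.
VTT_TIMESTAMP = re.compile(r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})")


def srt_timestamp(vtt_timestamp):
    """Rewrite a WebVTT timestamp as an SRT one (hh:mm:ss,mmm)."""
    match = VTT_TIMESTAMP.fullmatch(vtt_timestamp)
    if match is None:
        raise ValueError(f"invalid WebVTT timestamp: {vtt_timestamp}")
    hours, minutes, seconds, millis = match.groups(default="0")
    return f"{int(hours):02d}:{minutes}:{seconds},{millis}"


def to_srt(cues):
    """Serialise (start, end, text) cues as SRT text (hypothetical cue shape)."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)
```

Styling and positioning information that WebVTT cues may carry is simply ignored here.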
|
||||||
|
|
||||||
### ⚙️The process
|
## 📽️ FFMPEG
|
||||||
|
|
||||||
1. Get the _config_ API object for the _program identifier_.
|
|
||||||
- Select a _rendition_.
|
|
||||||
2. Get the _master playlist_.
|
|
||||||
- Select a _variant_.
|
|
||||||
3. Download audio, video and subtitles media content.
|
|
||||||
- convert `VTT` subtitles to `SRT`
|
|
||||||
4. Figure out the _output filename_ from _metadata_.
|
|
||||||
5. Feed all the media to `ffmpeg` for _muxing_
|
|
||||||
|
|
||||||
### 📽️ FFMPEG
|
|
||||||
|
|
||||||
The multiplexing (_muxing_) of the video file is handled by [ffmpeg](https://ffmpeg.org/). The script expects [ffmpeg](https://ffmpeg.org/) to be installed in the environment and will call it as a subprocess.
|
The multiplexing (_muxing_) of the video file is handled by [ffmpeg](https://ffmpeg.org/). The script expects [ffmpeg](https://ffmpeg.org/) to be installed in the environment and will call it as a subprocess.
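
A subprocess call of this kind could look like the sketch below. The exact command line used by the script is not shown in this README, so everything here is an assumption: the helper name `ffmpeg_mux_command`, the stream-copy options, the `mov_text` subtitle codec (chosen on the assumption of an `.mp4` output) and the file names.

```python
import subprocess


def ffmpeg_mux_command(video, audio, subtitles, output):
    """Build an ffmpeg argument list that muxes the downloaded files (hypothetical helper)."""
    command = ["ffmpeg", "-i", video, "-i", audio]
    if subtitles is not None:
        command += ["-i", subtitles]
    # Copy the audio and video streams as they are instead of re-encoding them.
    command += ["-c:v", "copy", "-c:a", "copy"]
    if subtitles is not None:
        # Assumption: MP4 output, which stores text subtitles as mov_text.
        command += ["-c:s", "mov_text"]
    command.append(output)
    return command


def mux(video, audio, subtitles, output):
    """Run ffmpeg and raise if it exits with a non-zero status."""
    subprocess.run(ffmpeg_mux_command(video, audio, subtitles, output), check=True)
```

Passing the arguments as a list avoids shell quoting issues with titles that contain spaces or punctuation.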
|
||||||
|
|
||||||
#### Why not use FFMPEG direcly with the HLS _master playlist_ URL ?
|
### Why not use FFMPEG directly with the HLS _master playlist_ URL ?
|
||||||
|
|
||||||
So we can be more granular about which _renditions_ and _variants_ we want.
|
So we can be more granular about which _renditions_ and _variants_ we want.
|
||||||
|
|
||||||
#### Why not use `VTT` subtitles direcly ?
|
### Why not use `VTT` subtitles directly ?
|
||||||
|
|
||||||
Because it fails 😒.
|
Because it fails 😒.
|
||||||
|
|
||||||
#### Why not use FFMPEG direcly with the _media playalist_ URLs and let it do the download ?
|
### Why not use FFMPEG directly with the _media playlist_ URLs and let it do the download ?
|
||||||
|
|
||||||
Because some programs would randomly fail 😒. Probably due to invalid _segmentation_ on the server.
|
Because some programs would randomly fail 😒. Probably due to invalid _segmentation_ on the server.
|
||||||
|
|
||||||
|
|
||||||
### 📌 Dependences
|
## 📌 Dependencies
|
||||||
|
|
||||||
- [m3u8](https://pypi.org/project/m3u8/) to parse playlists.
|
- [m3u8](https://pypi.org/project/m3u8/) to parse playlists.
|
||||||
- [webvtt-py](https://pypi.org/project/webvtt-py/) to load `vtt` subtitles files.
|
- [webvtt-py](https://pypi.org/project/webvtt-py/) to load `vtt` subtitles files.
|
||||||
- [requests](https://pypi.org/project/requests/) to handle HTTP traffic.
|
- [requests](https://pypi.org/project/requests/) to handle HTTP traffic.
|
||||||
|
|
||||||
### 🤝 Help
|
## 🤝 Help
|
||||||
|
|
||||||
For sure ! The more the merrier.
|
For sure ! The more the merrier.
|
||||||
|
|