We were not (and probably won't be) using any worthwhile `requests`
features (besides `raise_for_status()`), and the `timeout` session
parameter propagation vs. adapter plugging "thing" in `requests` just
annoys me deeply (not that kind of "... Human (TM)")
- skipping the processing of an existing target output file
- skipping the download of an existing target stream file
- resuming the download of an existing target stream temporary file
  using an HTTP range request
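A resume along those lines can be sketched as follows, assuming a plain `urllib` request (the function name and signature are illustrative, not the project's actual code): the size of the partial temporary file becomes the offset of an HTTP `Range` header, so the server only sends the remaining bytes.

```python
import os
import urllib.request

def resume_request(url, partial_path):
    """Build a GET request that resumes after the bytes already on disk.

    `partial_path` is the target stream temporary file; if it exists,
    ask the server for the remaining bytes via an HTTP Range header.
    """
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if offset:
        # "bytes=N-" means: everything from byte N to the end.
        request.add_header("Range", f"bytes={offset}-")
    return request, offset
```

The caller would then open the request and append the response body to the temporary file (a server that ignores ranges answers 200 instead of 206, which the caller should check before appending).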
Significant rewrite after the model modification: introduce `*Sources`
objects that encapsulate metadata and fetch information (URLs,
protocols). The API (#20) is organized as pipe elements, with sources
being what flows through the pipe.
1. fetch program sources
2. fetch rendition sources
3. fetch variant sources
4. fetch targets
5. process (download+mux) targets
User selection filters or modifiers can then be applied at any step of
the pipe. Our `__main__.py` is one implementation of that scheme.
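The shape of that scheme can be sketched generically (this is an illustration of the pipe idea, not the package's actual code): each step consumes and produces sources, and a filter is just another step spliced in between.

```python
from functools import reduce

def pipe(*steps):
    """Compose pipeline steps left to right: the output of one step is
    the input of the next, so "sources" flow through the pipe."""
    def run(sources):
        return reduce(lambda acc, step: step(acc), steps, sources)
    return run

def keep(predicate):
    """A user selection filter, insertable between any two steps."""
    return lambda sources: [s for s in sources if predicate(s)]
```

A `__main__`-style driver would then be `pipe(fetch_step, keep(user_filter), process_step)(initial_sources)`, with the five fetch/process steps above as the fixed elements.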
Implied modifications include:
- Unsupported protocols now fail later: the check moved from `api` to
  `hls`, which leaves room to filter them out and/or support them
  later.
- Give up honoring HTTP ranges for MP4 downloads; stream-download them
  in fixed-size chunks instead.
- Clean up the `hls` module, moving the main download function to
  `__init__` and the format-specific (MP4/VTT) download functions to a
  new `download` module.
Side modifications include:
- The progress handler now shows download rates.
- The naming utilities now provide rendition and variant code
  insertion.
- Parts are downloaded to working directories, skipping unnecessary
  re-downloads after a failure.
This was a big change for a single commit... too big of a change maybe.
Change the way the program information is figured out: from URL parsing
to page content parsing.
A massive JSON object is shipped within the HTML of the page; that's
where we get what we need.
Side effects:
- drop `slug` from the program's info
- drop `slug` naming option
- no `Program` / `ProgramMeta` distinction
Includes some JSON samples.
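The page-content parsing described above can be sketched like this; the `<script type="application/json">` shape is an assumption for illustration, since the real pages may embed the object under a different tag or attribute:

```python
import json
import re

def extract_page_json(html):
    """Pull the embedded JSON object out of a page's HTML.

    The script-tag selector is hypothetical; adjust it to however the
    page actually ships its data.
    """
    match = re.search(
        r'<script[^>]*type="application/json"[^>]*>(.*?)</script>',
        html,
        re.DOTALL,
    )
    if match is None:
        raise ValueError("no embedded JSON found in page")
    return json.loads(match.group(1))
```

Working from the embedded object rather than the URL is what makes `slug` unnecessary: the program information comes straight from the page.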
Change/add/rename the model's data structures to provide a more useful
API (#20), introducing new structures:
- `Sources`: summarizing program, renditions and variants found
at a given ArteTV page URL
- `Target`: summarizing all required data for a download
And new functions:
- `fetch_sources()` to build the `Sources` from a URL
- `iter_[renditions|variants]()` to describe the available options for
  the `Sources`
- `select_[renditions|variants]()` to narrow down the desired options
  for the `Sources`
- `compile_sources()` to compute such a `Target` from `Sources`
- `download_target()` to download such a `Target`
Finally, this should make playlist handling (#7) easier (I know, I've
said that before).
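A plausible call sequence through those functions looks like the sketch below. The function names come from the list above, but the signatures and return shapes are assumptions; the module is passed in as a parameter so the sequence reads independently of the real package layout.

```python
def download_best_variant(artetv, url):
    """Sketch of the intended pipeline: fetch, narrow, compile, download.

    `artetv` stands in for the package; signatures are assumed, not
    taken from the actual API.
    """
    sources = artetv.fetch_sources(url)
    # Enumerate the options, then narrow the Sources down to one each.
    renditions = list(artetv.iter_renditions(sources))
    sources = artetv.select_renditions(sources, renditions[:1])
    variants = list(artetv.iter_variants(sources))
    sources = artetv.select_variants(sources, variants[:1])
    # Compile the narrowed Sources into a Target and download it.
    target = artetv.compile_sources(sources)
    artetv.download_target(target)
    return target
```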
Move all error definitions to `error` module
In `__init__`:
- Remove imports from global scope
- Import all from `model` module
- Import all from `error` module
Refactor: `fetch_sources()` to take the URL as argument
Coding style: import definitions from `error` and `model`
Implemented modules:
- api: deals with ArteTV JSON API
- hls: deals with HLS protocol
- muxing: deals with the stream multiplexing
- naming: deals with output file naming
- www: deals with ArteTV web interface
Download the audio and video channels to temporary files before calling
ffmpeg.
Although it might not be necessary, the download is made in "chunks",
as a client/player would do it.
Download progress feedback is printed to the terminal.
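The chunked copy with progress feedback can be sketched as follows (the chunk size and callback shape are assumptions; any file-like response object works):

```python
CHUNK_SIZE = 64 * 1024  # assumed size; any player-like chunk would do

def download_chunks(response, out_file, progress=None):
    """Copy a stream to `out_file` in fixed-size chunks, reporting the
    running byte count to an optional `progress` callback."""
    done = 0
    while True:
        chunk = response.read(CHUNK_SIZE)
        if not chunk:  # empty read signals end of stream
            break
        out_file.write(chunk)
        done += len(chunk)
        if progress is not None:
            progress(done)
    return done
```

With `out_file` pointing at the audio or video temporary file, the `progress` callback is where the terminal feedback hooks in; ffmpeg then muxes the completed temporary files.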