Significant rewrite after model modification: introducing `*Sources`
objects that encapsulate metadata and fetch information (urls,
protocols). The API (#20) is organized as pipe elements with sources
being what flows through the pipe.
1. fetch program sources
2. fetch rendition sources
3. fetch variant sources
4. fetch targets
5. process (download+mux) targets
User selection filters or modifiers can then be applied at any
step of the pipe. Our `__main__.py` is one implementation of that scheme.
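As a rough illustration of the pipe idea, here is a minimal sketch built
on generators. Every name, field and value in it (`ProgramSource`,
`only()`, the `fr`/`de` renditions, the output naming) is a hypothetical
stand-in and not the actual API; nothing is fetched from the network.

```python
from collections import namedtuple

# Hypothetical stand-ins for the `*Sources` objects; the real fields differ.
ProgramSource = namedtuple("ProgramSource", "program_id")
RenditionSource = namedtuple("RenditionSource", "program rendition")
VariantSource = namedtuple("VariantSource", "program rendition variant")
Target = namedtuple("Target", "program rendition variant output")


def fetch_program_sources(urls):
    # Step 1: one program source per URL (faked from the last path segment).
    for url in urls:
        yield ProgramSource(url.rsplit("/", 1)[-1])


def fetch_rendition_sources(program_sources):
    # Step 2: fan out each program into its renditions (hard-coded here).
    for program in program_sources:
        for rendition in ("fr", "de"):
            yield RenditionSource(program, rendition)


def fetch_variant_sources(rendition_sources):
    # Step 3: fan out each rendition into its variants (hard-coded here).
    for source in rendition_sources:
        for height in (720, 1080):
            yield VariantSource(source.program, source.rendition, height)


def fetch_targets(variant_sources):
    # Step 4: turn each remaining source into something downloadable.
    for s in variant_sources:
        name = f"{s.program.program_id}-{s.rendition}-{s.variant}.mp4"
        yield Target(s.program, s.rendition, s.variant, name)


def process(targets):
    # Step 5: would download+mux; this sketch only collects output names.
    return [target.output for target in targets]


def only(predicate, sources):
    # A user filter that can be inserted between any two steps.
    return (source for source in sources if predicate(source))


def run(urls):
    programs = fetch_program_sources(urls)
    renditions = only(lambda s: s.rendition == "fr",
                      fetch_rendition_sources(programs))
    variants = only(lambda s: s.variant == 1080,
                    fetch_variant_sources(renditions))
    return process(fetch_targets(variants))
```

Since each step only consumes and yields sources, a filter or modifier
is just one more generator placed between two steps.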
Implied modifications include:
- Unsupported protocols now fail later, in `hls` rather than in `api`.
This leaves room to filter them out or support them down the line.
- Stop honoring HTTP ranges for mp4 downloads; stream-download them in
fixed-size chunks instead.
- Clean up the `hls` module, moving the main download function to
`__init__` and the specific (mp4/vtt) download functions to a new
`download` module.
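For the fixed-chunk streaming mentioned in the list above, a minimal
sketch could look like the following. It works on any pair of binary
file-like objects (for instance an HTTP response body and an open file);
the function name and the 64 KiB chunk size are assumptions, the commit
does not state the actual values.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical value, not taken from the project


def stream_in_chunks(source, sink, chunk_size=CHUNK_SIZE):
    """Copy `source` to `sink` chunk by chunk and return the byte count."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the stream is exhausted
            break
        sink.write(chunk)
        total += len(chunk)
    return total


# Usage with in-memory streams standing in for the network and the disk.
payload = b"x" * 150_000
copied = io.BytesIO()
size = stream_in_chunks(io.BytesIO(payload), copied)
```

Reading by fixed chunks keeps memory usage bounded regardless of the
file size and gives a natural place to report progress.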
Side modifications include:
- The progress handler now shows download rates.
- The naming utilities now support rendition and variant code insertion.
- Parts are downloaded to working directories, so that unnecessary
re-downloads are skipped after a failure.
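One possible way to skip re-downloads is sketched below: write each part
under a temporary name in the working directory and rename it only once
complete, so that a finished part is recognized on the next run. The
function name, the `.part` suffix and the `fetch` callable are all
hypothetical; the actual implementation may work differently.

```python
import os


def download_part(name, workdir, fetch):
    """Store `fetch()` as `workdir/name`; return False if already there."""
    final_path = os.path.join(workdir, name)
    if os.path.exists(final_path):
        # A previous run completed this part: nothing to do.
        return False
    partial_path = final_path + ".part"
    with open(partial_path, "wb") as output:
        output.write(fetch())
    # The rename happens only after a complete write, so an interrupted
    # download never leaves a file under the final name.
    os.replace(partial_path, final_path)
    return True
```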
This was a big change for a single commit... too big of a change maybe.
Change the way the program information is figured out: from URL parsing
to page content parsing.
A massive JSON object is shipped within the HTML of the page; that's
where we get what we need from.
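A minimal sketch of pulling such an embedded JSON object out of a page,
using only the standard library. It assumes the object sits in a
`<script type="application/json">` element, which is a guess and not
something this commit states; the sample page and its keys are made up.

```python
import json
from html.parser import HTMLParser


class JSONScriptParser(HTMLParser):
    """Collect the text of the first JSON <script> element of a page."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._done = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # Assumption: the payload is marked with type="application/json".
        is_json = dict(attrs).get("type") == "application/json"
        if tag == "script" and is_json and not self._done:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script" and self._inside:
            self._inside = False
            self._done = True  # ignore any later JSON script

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    def payload(self):
        return json.loads("".join(self._chunks))


def extract_page_json(html):
    parser = JSONScriptParser()
    parser.feed(html)
    parser.close()
    return parser.payload()


# Made-up page content, for illustration only.
SAMPLE = (
    '<html><body><script type="application/json">'
    '{"program": {"id": "123-A", "title": "Demo"}}'
    "</script></body></html>"
)
info = extract_page_json(SAMPLE)
```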
Side effects:
- drop `slug` from the program's info
- drop `slug` naming option
- no `Program` / `ProgramMeta` distinction
Includes some JSON samples.
Change/add/rename model's data structures in order to provide a more
useful API #20, introducing new structures:
- `Sources`: summarizing program, renditions and variants found
at a given ArteTV page URL
- `Target`: summarizing all required data for a download
And new functions:
- `fetch_sources()` to build the `Sources` from a URL
- `iter_[renditions|variants]()` to describe the available options for
the `Sources`
- `select_[renditions|variants]()` to narrow down the desired options
for the `Sources`
- `compile_sources()` to compute a `Target` from `Sources`
- `download_target()` to download such a `Target`
Finally, this should make the playlist handling #7 easier (I know, I've
said that before)
A bunch of data structures to be used instead of the types imposed by
the underlying infrastructure, i.e. JSON for the API and M3U8 for HLS.
It should provide a stronger decoupling of the modules and pave the way
for #7 and #8.
The implementation uses `namedtuple`s as they compare by value and are
natively hashable (can be used in `set`s or as keys in `dict`s), which
is useful for deduping, for instance.
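A short sketch of those properties. The `Variant` type and its fields
are hypothetical and only serve the illustration.

```python
from collections import namedtuple

# Hypothetical structure; the real model defines its own fields.
Variant = namedtuple("Variant", ["code", "height"])

found = [Variant("v1", 720), Variant("v2", 1080), Variant("v1", 720)]

# Equality is by value, which keeps test assertions straightforward.
same = Variant("v1", 720) == Variant("v1", 720)

# Hashable, so usable as dict keys; dict.fromkeys() dedupes while
# preserving the order of first appearance.
unique = list(dict.fromkeys(found))

# Also usable in sets, e.g. for membership checks.
seen = set(found)
```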