We were not (and probably won't be) using any worthwhile `requests`
features (besides `raise_for_status()`), and the `timeout` session
parameter propagation vs. adapter plugging "thing" in requests just
annoys me deeply (not that kind of "... Human (TM)").
- skipping the processing of an existing target output file
- skipping the download of an existing target stream file
- resuming the download of an existing target stream temporary file
  using an HTTP range request (sketched below)
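In sketch form, the resume logic amounts to something like the
following; the helper name, the chunk size and the use of `urllib` are
illustrative, not the actual code:

```python
# Illustrative sketch only: resume a partial download via an HTTP Range request.
import os
import urllib.request


def resume_download(url, temp_path, chunk_size=64 * 1024):
    offset = os.path.getsize(temp_path) if os.path.exists(temp_path) else 0
    request = urllib.request.Request(url, headers={"Range": f"bytes={offset}-"})
    with urllib.request.urlopen(request) as response:
        # 206 Partial Content: the server honored the range, append to the file.
        # Anything else: the whole body is being resent, start from scratch.
        mode = "ab" if response.status == 206 else "wb"
        with open(temp_path, mode) as output:
            while chunk := response.read(chunk_size):
                output.write(chunk)
```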
TV series that list episodes through many `collection_subcollection_*`
zones (one per season):
- RC-023217__acquitted.json
- RC-022923__cry-wolf.json
Other collections that list items in a single `collection_videos_*` zone:
- RC-023013__l-incroyable-periple-de-magellan.json
- RC-023242__bandes-de-pirates.json
Significant rewrite after model modification: introducing `*Sources`
objects that encapsulate metadata and fetch information (urls,
protocols). The API (#20) is organized as pipe elements with sources
being what flows through the pipe.
1. fetch program sources
2. fetch rendition sources
3. fetch variant sources
4. fetch targets
5. process (download+mux) targets
Some user selection filters or modifiers could then be applied at any
step of the pipe (see the sketch below). Our `__main__.py` is an
implementation of that scheme.
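Very roughly, and with every name below made up for illustration, the
pipe reads like this:

```python
# Hypothetical sketch: it only shows how sources flow from one pipe element
# to the next; a user selection filter/modifier fits between any two steps.
def run_pipe(url, fetch_program_sources, fetch_rendition_sources,
             fetch_variant_sources, fetch_targets, process_targets,
             select=lambda sources: sources):
    program_sources = fetch_program_sources(url)                          # 1
    rendition_sources = select(fetch_rendition_sources(program_sources))  # 2
    variant_sources = select(fetch_variant_sources(rendition_sources))    # 3
    targets = fetch_targets(variant_sources)                              # 4
    process_targets(targets)                                              # 5: download + mux
```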
Implied modifications include:
- Failing later on unsupported protocols: the check used to live in
  `api` and is now in `hls`. This opens the possibility to filter them
  out and/or support them later.
- Giving up on honoring HTTP ranges for MP4 downloads; they are now
  stream-downloaded in fixed-size chunks instead.
- Cleaning up the `hls` module by moving the main download function to
  `__init__` and the specific (MP4/VTT) download functions to a new
  `download` module.
Side modifications include:
- The progress handler now shows download rates.
- The naming utilities now provide rendition and variant code insertion.
- Parts are now downloaded to working directories, and unnecessary
  re-downloads are skipped on failure.
This was a big change for a single commit... too big of a change maybe.
In order to catch errors related to the assumed JSON schema, regroup all
JSON data access under a context manager (sketched below) that catches
the related errors:
- KeyError
- IndexError
- ValueError
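A minimal sketch of the idea; the exception name and the wrapped
accesses are illustrative, not the actual code:

```python
from contextlib import contextmanager


class UnexpectedJSONError(Exception):
    """The JSON did not match the schema we assumed."""


@contextmanager
def json_access(description):
    # Any schema-related failure surfaces as a single, project-level error.
    try:
        yield
    except (KeyError, IndexError, ValueError) as error:
        raise UnexpectedJSONError(f"{description}: {error!r}") from error


# Usage: every access to the assumed JSON schema goes through the manager.
# with json_access("program metadata"):
#     title = data["attributes"]["metadata"]["title"]
```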
Changes the way the program information is figured out: from URL
parsing to page content parsing.
A massive JSON object is shipped within the HTML of the page; that's
where we get what we need.
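For illustration only, extracting it could look like the sketch below;
the `__NEXT_DATA__` script tag is an assumption about the page layout,
not something this change guarantees:

```python
# Hypothetical sketch: fetch the page and pull the embedded JSON out of it.
import json
import re
import urllib.request


def fetch_page_json(url):
    with urllib.request.urlopen(url) as response:
        html = response.read().decode("utf-8")
    match = re.search(
        r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>', html, re.DOTALL
    )
    if not match:
        raise ValueError("embedded JSON not found in page")
    return json.loads(match.group(1))
```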
Side effects:
- drop `slug` from the program's info
- drop `slug` naming option
- no `Program` / `ProgramMeta` distinction
Includes some JSON samples.
Change/add/rename model's data structures in order to provide a more
useful API #20, introducing new structures:
- `Sources`: summarizing program, renditions and variants found
at a given ArteTV page URL
- `Target`: summarizing all required data for a download
And new functions:
- `fetch_sources()` to build the `Sources` from a URL
- `iter_[renditions|variants]()` to describe the available options for
  the `Sources`
- `select_[renditions|variants]()` to narrow down the desired options
for the `Sources`
- `compile_sources()` to compute such a `Target` from `Sources`
- `download_target()` to download such a `Target`
Finally, this should make the playlist handling #7 easier (I know, I've
said that before).
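Chained together, the intended usage reads roughly like this sketch;
only the function names come from the list above, the arguments and
return shapes are assumptions:

```python
# Hedged sketch: `...` marks arguments whose exact shape is not decided here.
sources = fetch_sources(url)

for rendition in iter_renditions(sources):   # inspect the available renditions
    print(rendition)
for variant in iter_variants(sources):       # inspect the available variants
    print(variant)

sources = select_renditions(sources, ...)    # narrow down the renditions
sources = select_variants(sources, ...)      # narrow down the variants

target = compile_sources(sources, ...)       # everything needed for a download
download_target(target)
```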
Move all error definitions to the `error` module.
In `__init__`:
- Remove imports from global scope
- Import all from `model` module
- Import all from `error` module
Refactor: `fetch_sources()` to take the URL as an argument
Coding style: import definitions from `error` and `model`
Remove the dependency on `webvtt-py`, which was both too much and not
enough for our use case.
Implement a basic WebVTT-to-SRT converter tailored to ArteTV's usage of
WebVTT features.
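The converter boils down to something like the following sketch,
assuming plain cues (an optional cue id, a timing line, then text),
which is roughly what ArteTV serves:

```python
# Minimal sketch of a WebVTT-to-SRT conversion for plain, unstyled cues.
import re

TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s*-->\s*(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def webvtt_to_srt(vtt_text):
    blocks = vtt_text.replace("\r\n", "\n").strip().split("\n\n")
    cues = []
    for block in blocks:
        lines = block.splitlines()
        for index, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                break
        else:
            continue  # header, NOTE or STYLE block: skip it
        h1, m1, s1, ms1, h2, m2, s2, ms2 = match.groups()
        timing = "{}:{}:{},{} --> {}:{}:{},{}".format(
            h1 or "00", m1, s1, ms1, h2 or "00", m2, s2, ms2
        )
        text = "\n".join(lines[index + 1:])
        cues.append("{}\n{}\n{}".format(len(cues) + 1, timing, text))
    return "\n\n".join(cues) + "\n"
```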
- Rename variables and functions to reflect model names.
- Convert infrastructure data (JSON, M3U8) to model types.
- Change algorithms to produce/consume the `Source` model, in particular
  using generator functions to build a list of `Source`s rather than the
  opaque `rendition => variant => urls` mapping (this will make #7 very
  straightforward).
- Download all master playlists right after the API call, before
  selecting renditions/variants.
Motivation for the last point:
We used to offer rendition selection right after the API call, before
downloading the appropriate master playlist to figure out the available
variants.
The problem with that is that ArteTV's rendition codes (given by the
API) do not necessarily include complete language information when the
language is not French or German; for instance, an original audio track
in Portuguese would show up as `VOEU-` (as in "EUropean"). The actual
mention of Portuguese only shows up in the master playlist.
So, the new implementation actually downloads all master playlists
straight after the API call. This is a bit wasteful, but I figured it
was necessary to provide quality interaction with the user.
Bonus? Now, when we first prompt the user for a rendition choice, we
already know the available variants; maybe we will make use of that
fact in the future...
A bunch of data structures to be used instead of the infrastructure
types, i.e. JSON for the API and M3U8 for HLS.
It should provide a stronger decoupling of the modules and pave the way
for #7 and #8.
The implementation uses `namedtuple`s as they are trivial to test for
equality and natively hashable (they can be used in `set`s or as keys
in `dict`s), which is useful for deduping, for instance (see below).
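For instance (field names made up), this is the property we rely on:

```python
from collections import namedtuple

# Illustrative structure only; the real model fields differ.
Rendition = namedtuple("Rendition", ["code", "language", "label"])

a = Rendition("VOF", "fr", "original French")
b = Rendition("VOF", "fr", "original French")

assert a == b            # equality is by value, no __eq__ to write
assert len({a, b}) == 1  # hashable by value, so a set dedupes them
```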
Creation of a `common.Error` exception whose string representation is
taken from its docstring.
Creation of a `common.UnexpectedError` to serve as a base for exceptions
raised while checking assumptions on requests and responses.
The latter are handled by displaying a message inviting the user to
report the error to us, so we can correct our assumptions.
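In sketch form (class names from this change, docstrings and handler
made up):

```python
class Error(Exception):
    """A generic error of ours."""

    def __str__(self):
        # The user-facing message is simply the (subclass) docstring.
        return self.__doc__


class UnexpectedError(Error):
    """An assumption we made about a request or a response did not hold."""


# A top-level handler can then invite the user to report the problem:
# except UnexpectedError as error:
#     print(error)
#     print("Please report this so we can fix our assumptions.")
```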
- versions => renditions
- resolutions => variants
- ranges and/or chunks => segments
- version index => master playlist
- other index => media playlist url
For now, the CLI has not been updated with this terminology, only the
code.