delarte/src/delarte/download.py
Barbagus 56c1e8468a Split program/rendition/variant/target operations
Significant rewrite after model modification: introducing `*Sources`
objects that encapsulate metadata and fetch information (urls,
protocols). The API (#20) is organized as pipe elements with sources
being what flows through the pipe.
    1. fetch program sources
    2. fetch rendition sources
    3. fetch variant sources
    4. fetch targets
    5. process (download+mux) targets
User selection filters or modifiers can then be applied at any
step of the pipe. Our __main__.py is an implementation of that scheme.
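
The pipe described above can be sketched with plain generators. This is only an illustration of the idea, not delarte's actual API: every function name, the dictionary-based "sources" and the example URL are hypothetical stand-ins, and step 5 (download+mux) is left out.

```python
def fetch_program_sources(urls):
    # Step 1 (hypothetical): one program source per input URL.
    for url in urls:
        yield {"program": url}


def fetch_rendition_sources(program_sources):
    # Step 2 (hypothetical): each program exists in several renditions.
    for p in program_sources:
        for lang in ("fr", "de"):
            yield {**p, "rendition": lang}


def fetch_variant_sources(rendition_sources):
    # Step 3 (hypothetical): each rendition exists in several variants.
    for r in rendition_sources:
        for height in (720, 1080):
            yield {**r, "variant": height}


def fetch_targets(variant_sources):
    # Step 4 (hypothetical): attach an output name to each variant.
    for v in variant_sources:
        yield {**v, "output": f"{v['rendition']}-{v['variant']}.mkv"}


def only(predicate):
    """Build a filter stage that can sit between any two steps."""
    def stage(sources):
        return (s for s in sources if predicate(s))
    return stage


def run_pipe(initial, *stages):
    """Feed `initial` through each stage in order and collect the result."""
    flow = initial
    for stage in stages:
        flow = stage(flow)
    return list(flow)


targets = run_pipe(
    ["https://example.invalid/program"],
    fetch_program_sources,
    fetch_rendition_sources,
    only(lambda s: s["rendition"] == "fr"),   # user selection
    fetch_variant_sources,
    only(lambda s: s["variant"] == 1080),     # user selection
    fetch_targets,
)
```

Because each stage only consumes and yields sources, a selection filter can be dropped in at any point without the neighbouring stages knowing about it.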

Implied modifications include:
 - Unsupported protocols now fail later: the check used to be in `api`
   and is now in `hls`. This makes it possible to filter them out
   and/or add support for them later.
 - Give up honoring HTTP ranges for MP4 downloads; stream them in
   fixed-size chunks instead.
 - Clean up the `hls` module, moving the main download function to
   __init__ and the specific (mp4/vtt) download functions to a new
   `download` module.

Side modifications include:
 - The progress handler showing downloading rates.
 - The naming utilities providing rendition and variant code insertion.
 - Download parts to working directories and skip unnecessary
   re-downloads on failure.
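
As an illustration of a progress handler that shows download rates, here is a minimal sketch. It assumes only the `on_progress(file_name, current, total)` call shape used in `download.py`; the factory name `make_rate_progress` and its `clock`/`out` parameters are hypothetical and not part of delarte.

```python
import time


def make_rate_progress(clock=time.monotonic, out=print):
    """Return an on_progress callback that reports an average rate.

    `clock` and `out` are injectable (hypothetical parameters) so the
    handler can be exercised without real time or a terminal.
    """
    started = {}  # file_name -> time of the first call for that file

    def on_progress(file_name, current, total):
        now = clock()
        start = started.setdefault(file_name, now)
        elapsed = now - start
        # Average rate since the first call; 0 until some time has passed.
        rate = current / elapsed if elapsed > 0 else 0.0
        # `total` may be 0 (initial call), so guard the division.
        percent = 100 * current / total if total else 0.0
        out(f"{file_name}: {percent:5.1f}% at {rate / 1024:.1f} KiB/s")
        return rate

    return on_progress
```

The handler keeps one start time per file name, so it can be shared by several downloads as long as their file names differ.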

This was a big change for a single commit... too big of a change maybe.
2023-01-24 08:27:37 +01:00


# License: GNU AGPL v3: http://www.gnu.org/licenses/
# This file is part of `delarte` (https://git.afpy.org/fcode/delarte.git)
"""Provide download utilities."""
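import os

from . import subtitles

# Size (in bytes) of the chunks read from a streamed HTTP response.
_CHUNK = 64 * 1024


def download_mp4_media(url, file_name, http_session, on_progress):
    """Download an MP4 (video or audio) to the given file."""
    on_progress(file_name, 0, 0)

    # Skip files already completed by a previous run.
    if os.path.isfile(file_name):
        on_progress(file_name, 1, 1)
        return

    # Write to a temporary file so that an interrupted download never
    # leaves a partial file under the final name.
    temp_file = f"{file_name}.tmp"

    with open(temp_file, "w+b") as f:
        r = http_session.get(url, timeout=5, stream=True)
        r.raise_for_status()
        # The header may be absent (e.g. chunked responses); report 0 then.
        total = int(r.headers.get("content-length", 0))

        for content in r.iter_content(_CHUNK):
            f.write(content)
            on_progress(file_name, f.tell(), total)

    os.rename(temp_file, file_name)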


def download_vtt_media(url, file_name, http_session, on_progress):
    """Download a VTT and SRT-convert it to the given file."""
    on_progress(file_name, 0, 0)

    # Skip files already completed by a previous run.
    if os.path.isfile(file_name):
        on_progress(file_name, 1, 1)
        return

    temp_file = f"{file_name}.tmp"

    with open(temp_file, "w", encoding="utf-8") as f:
        r = http_session.get(url, timeout=5)
        r.raise_for_status()
        # Force UTF-8 decoding of the subtitle text.
        r.encoding = "utf-8"
        subtitles.convert(r.text, f)
        # Subtitles are written in one go: report completion.
        on_progress(file_name, f.tell(), f.tell())

    os.rename(temp_file, file_name)
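
Both functions share the same skip-if-present and write-to-`.tmp`-then-rename logic. A minimal sketch of how that pattern could be isolated is shown below; `fetch_once`, `write_body` and `demo` are hypothetical names that do not exist in delarte, and `os.replace` is swapped in for `os.rename` here as an assumption about the desired overwrite behaviour.

```python
import os
import tempfile


def fetch_once(file_name, write_body):
    """Run write_body(temp_path) unless file_name already exists.

    Returns True if the file was produced, False if it was skipped.
    """
    if os.path.isfile(file_name):
        return False

    temp_file = f"{file_name}.tmp"
    write_body(temp_file)
    # os.replace overwrites an existing destination on every platform,
    # whereas os.rename raises on Windows if the destination exists.
    os.replace(temp_file, file_name)
    return True


def demo():
    """Call fetch_once twice on the same target in a scratch directory."""
    calls = []

    def write_body(path):
        calls.append(path)
        with open(path, "wb") as f:
            f.write(b"data")

    with tempfile.TemporaryDirectory() as work_dir:
        target = os.path.join(work_dir, "video.mp4")
        first = fetch_once(target, write_body)
        second = fetch_once(target, write_body)
        return first, second, len(calls)
```

In `demo`, the second call finds the finished file and returns without invoking `write_body` again, which is the behaviour that lets a re-run after a failure skip the parts that already completed.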