Custom naming of the output file. #8
For now, the name of the output file is taken from the 'title' metadata.
In some cases this is not practical, and one might prefer a more elaborate scheme, such as a combination of 'title', 'subtitle', 'version', etc.
This could be approached by accepting a 'tagged' string as an option/argument to the script. Something like:
"{title}-{subtitle} ({version}, {resolution})"
That would result in, for example:
"Clint Eastwood - The Last Legend (VOF-STE[ANG], 720p).mkv"
There is also the issue of characters that are forbidden by the filesystem:
L'incroyable périple de Magellan (3/4) - Le royaume de Magellan.mkv
The slash ("/") character is a problem.
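A minimal sketch of the tagged-string idea, assuming the metadata is available as a plain dictionary. The function name render_name, the ".mkv" default and the choice of "-" as a stand-in for the slash are all hypothetical, picked only for illustration. The pattern here has spaces around the dash so that it reproduces the example filename above.

```python
from collections.abc import Mapping


def render_name(pattern: str, metadata: Mapping[str, str],
                extension: str = ".mkv", slash_replacement: str = "-") -> str:
    """Fill a '{tag}' pattern from metadata and neutralise the path separator.

    Hypothetical helper: raises KeyError if the pattern uses a tag that
    is missing from the metadata.
    """
    name = pattern.format_map(metadata)
    # "/" would be read as a directory separator, so swap it out.
    # The replacement character is an arbitrary choice for this sketch.
    name = name.replace("/", slash_replacement)
    return name + extension


metadata = {
    "title": "Clint Eastwood",
    "subtitle": "The Last Legend",
    "version": "VOF-STE[ANG]",
    "resolution": "720p",
}
print(render_name("{title} - {subtitle} ({version}, {resolution})", metadata))
```

With the Magellan example, "(3/4)" would come out as "(3-4)" under this assumption, which avoids the error but is not necessarily the nicest result.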
This will become more relevant with regard to issue #7, as it seems that the naming policy for titles and subtitles of playlist episodes is not consistent.
For instance, the playlist: https://www.arte.tv/fr/videos/RC-020741/corpus-christi/ has the following episodes:
Whereas this other playlist: https://www.arte.tv/fr/videos/RC-023013/l-incroyable-periple-de-magellan/
I am suggesting avoiding spaces too.
I would argue that falls under personal preference. Whereas slashes ("/") are actually forbidden by some (most? all?) filesystems, spaces often are not. Therefore I would be more inclined towards a solution that can be customized by the user.
One thing that comes to mind is the filters that are used in some template engines. Jinja2, for instance, allows things like:
name|lower|replace(" ", "_")
That would 1) convert to lower case, 2) replace spaces with underscores.
So there is the question of a syntax for a naming pattern.
$title - ($index of $total) - $subtitle
->Frankenstream - (1 of 4) - Ce monstre qui nous dévore
$title $index $total $subtitle|lower|underize
->frankenstream_1_4_ce_monstre_qui_nous_dévore
$title $index $total $subtitle|lower|underize|ascii
->frankenstream_1_4_ce_monstre_qui_nous_devore
Could be interesting to see how other programs do it (youtube-dl, ...)
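To make the pattern-plus-filters idea concrete, here is a rough sketch using only the standard library. It assumes the "$name" placeholders are handled by string.Template and that everything after the first "|" is a chain of filter names applied to the whole rendered string, as in the examples above (Jinja2 itself applies filters per variable). The names render and FILTERS are hypothetical, and "underize" is simply the filter name used in the examples, implemented here as "replace spaces with underscores".

```python
import unicodedata
from string import Template


def _to_ascii(text: str) -> str:
    # Decompose accented characters, then drop anything that is not ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Hypothetical filter registry; only the three filters from the examples.
FILTERS = {
    "lower": str.lower,
    "underize": lambda text: text.replace(" ", "_"),
    "ascii": _to_ascii,
}


def render(pattern: str, fields: dict) -> str:
    """Render '$name' placeholders, then apply the '|filter' chain in order."""
    template, *filter_names = pattern.split("|")
    result = Template(template).substitute(fields)
    for name in filter_names:
        result = FILTERS[name.strip()](result)  # KeyError on unknown filter
    return result


fields = {
    "title": "Frankenstream",
    "index": 1,
    "total": 4,
    "subtitle": "Ce monstre qui nous dévore",
}
print(render("$title - ($index of $total) - $subtitle", fields))
print(render("$title $index $total $subtitle|lower|underize|ascii", fields))
```

One consequence of this design is that "|" can no longer appear literally in a pattern, which is harmless here since it is among the characters some filesystems reject anyway.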
According to a quick internet search, the invalid characters for files/directories depend on the underlying file system and/or OS (not a surprise):
/
:
<
>
"
\
|
?
*
There are, however, some other things to take into consideration (ASCII control characters, Windows forbidden names like COM3, etc.). This seems a rather painful verification process.
One way to deal with it would be to create a temporary file with the intended filename as a suffix, for example, and fail early if we get an OS error.
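The "try it and fail early" approach could look roughly like the sketch below, assuming the check runs in the directory where the output file will be written. The helper name can_create is hypothetical. Note one limitation of using the name as a suffix: the random prefix added by tempfile means a reserved name such as COM3 would probably not be detected, since the resulting filename is no longer exactly the reserved one.

```python
import os
import tempfile


def can_create(filename: str, directory: str) -> bool:
    """Return True if a temp file ending in `filename` can be created.

    Hypothetical helper: the file is removed again straight away.
    """
    try:
        fd, path = tempfile.mkstemp(suffix=filename, dir=directory)
    except (OSError, ValueError):
        # OSError: rejected by the OS or filesystem (e.g. "/" in the name,
        # name too long). ValueError: embedded NUL character.
        return False
    os.close(fd)
    os.remove(path)
    return True


with tempfile.TemporaryDirectory() as workdir:
    for candidate in ("Le royaume de Magellan.mkv", "Magellan (3/4).mkv"):
        print(candidate, "->", can_create(candidate, workdir))
```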
HOWEVER, if we come up with a pattern syntax for naming files dynamically, chances are that we will give some characters a meaning in the syntax itself. Picking them from that list might have advantages?
Pathological example:
To me, would look more confusing than:
To me, a Jinja-like pattern is more "human-readable", so it should be the one exposed to the end user. Furthermore, as you already said, the slash is a path-special character for a CLI user. It already has a meaning, so using it for variable injection would be confusing.
Bonus suggestion: maybe the naming pattern could be an option saved by the user (not typed each time they want to download a video, but loaded from a config file saved at a sensible path), so they don't have to rename all their files if they don't use your way of naming files?
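For the config file suggestion, a small sketch with configparser might be enough. Everything named here is an assumption made for illustration: the "naming" section, the "pattern" key, the default pattern and the function load_pattern are not part of the actual program.

```python
import configparser
from pathlib import Path

DEFAULT_PATTERN = "{title}"  # hypothetical default, mirrors the current behaviour


def load_pattern(config_path: Path, default: str = DEFAULT_PATTERN) -> str:
    """Read the naming pattern from an INI file, or fall back to the default."""
    # interpolation=None keeps characters like "%" in the pattern literal.
    parser = configparser.ConfigParser(interpolation=None)
    # read() silently skips files that do not exist.
    parser.read(config_path, encoding="utf-8")
    return parser.get("naming", "pattern", fallback=default)
```

The config file would then contain a [naming] section with a single line such as "pattern = {title} - {subtitle}", and a command-line option could still override it for a one-off download.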
I suggest we do something simple to start:
<name> can be the program's ID, e.g. 094484-002-A. This is not pretty but guarantees a working solution. To be enabled with --name-with-id.
Or, <name> can be the slug part of the URL, e.g. acquitted-saison-2-1-8 or faire-l-histoire. This is not much prettier but may be better in some cases, as a fallback. To be enabled with --name-with-slug.
Or, <name> is <title>[<sep><subtitle>], depending on whether there is a subtitle (secondary title). In that case <sep> shall be configured with --name-separator=<sep> and defaults to "-".
Sometimes <title> and/or <subtitle> includes a string like (<seq>/<total>) to indicate that the program is part of a sequence.
The "/" character is problematic and there is no obvious replacement candidate, so we replace it by <seq_pfx><seq>. In that case <seq_pfx> shall be configured with --name-sequence-pfx=<seq_pfx> and defaults to "E" (for episode).
To prevent problems due to string/number ordering ("10" coming before "2"), <seq> is to be zero-padded, using <total> as a hint of how many zeroes are needed. A --name-sequence-no-pad option shall disable that behaviour.
The case of forbidden names or characters is handled by actually trying to create a file with the given name and seeing if it errors. Then we fall back to the slug.
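The title/subtitle scheme above can be sketched as follows. This is only an illustration under stated assumptions, not the merged implementation: the function names are hypothetical, the keyword arguments merely mirror the proposed options, and the whole "(<seq>/<total>)" group, parentheses included, is assumed to be what gets replaced.

```python
import re

# Matches a sequence marker such as "(3/4)" or "(2/10)".
SEQ_RE = re.compile(r"\((\d+)/(\d+)\)")


def fix_sequence(text: str, seq_pfx: str = "E", pad: bool = True) -> str:
    """Replace '(<seq>/<total>)' with '<seq_pfx><seq>', zero-padded by default."""
    def replace(match: re.Match) -> str:
        seq, total = match.group(1), match.group(2)
        if pad:
            # Pad to the number of digits in <total>, so "2" of "10" gives "02".
            seq = seq.zfill(len(total))
        return seq_pfx + seq
    return SEQ_RE.sub(replace, text)


def build_name(title: str, subtitle: str = "", sep: str = "-",
               seq_pfx: str = "E", pad: bool = True) -> str:
    """Join title and optional subtitle, then rewrite any sequence marker."""
    name = f"{title}{sep}{subtitle}" if subtitle else title
    return fix_sequence(name, seq_pfx=seq_pfx, pad=pad)


print(build_name("L'incroyable périple de Magellan (3/4)", "Le royaume de Magellan"))
print(build_name("Some programme (2/10)"))
```

The resulting name would still need to go through the "try to create the file" check before being used, with the slug as the fallback.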
These options have been added and merged into stable.
Considerations for the future: set appropriate tags (title, etc.) on the output file so this "fine" naming can be delegated to bulk-renaming software. This is a lot of thinking and hard decisions for something that is not really the core purpose of this software :)