Support for collections #7

Closed
opened 2022-12-10 07:32:07 +00:00 by Barbagus · 6 comments
Collaborator

Some programs are organized in playlists, for instance a documentary split into a few episodes.

For example: https://www.arte.tv/fr/videos/RC-020741/corpus-christi/

We propose to handle such playlists by allowing all episodes to be downloaded sequentially.

Barbagus added the
enhancement
label 2022-12-10 07:32:07 +00:00
Barbagus added reference stable 2022-12-10 07:36:59 +00:00
Author
Collaborator

If we query the config API with the playlist identifier (RC-020741), the ConfigPlayer object returned is that of the first "episode" of the list.
The program identifier (in .data.attributes.metadata.providerId) is then different from our playlist identifier.

This is our clue to understand that the program page URL is in fact a playlist page URL.

There is a playlist API entry point, in our example:

https://api.arte.tv/api/player/v2/playlist/fr/RC-020741

The response contains a reference to every episode of the list in order... So there is a way.
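A minimal sketch of the check described above, working on an already decoded ConfigPlayer object so that no network access is needed. The function names are hypothetical (they are not taken from delarte), and the endpoint template is only generalized from the single example URL above.

```python
from typing import Any, Dict

# Template inferred from the example URL in this comment (assumption: the
# language code and the identifier are the only variable parts).
PLAYLIST_API = "https://api.arte.tv/api/player/v2/playlist/{lang}/{identifier}"


def provider_id(config_player: Dict[str, Any]) -> str:
    """Return .data.attributes.metadata.providerId from a ConfigPlayer object."""
    return config_player["data"]["attributes"]["metadata"]["providerId"]


def looks_like_playlist(config_player: Dict[str, Any], url_identifier: str) -> bool:
    """True when the API answered with a program other than the one requested.

    Per the observation above, querying the config API with a playlist
    identifier returns the first episode, so the identifiers differ.
    """
    return provider_id(config_player) != url_identifier


def playlist_api_url(lang: str, identifier: str) -> str:
    """Build the playlist API URL for a language code and an identifier."""
    return PLAYLIST_API.format(lang=lang, identifier=identifier)
```

With a ConfigPlayer object whose providerId is some episode identifier, `looks_like_playlist(obj, "RC-020741")` would be true, and `playlist_api_url("fr", "RC-020741")` gives back the URL quoted above.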

Author
Collaborator

> The response contains a reference to every episode of the list in order

well, not always it seems.

https://api.arte.tv/api/player/v2/playlist/fr/RC-021792

...has an empty data.attributes.metadata.items
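Since the list can come back empty, any code reading it probably has to treat "no items" as a normal outcome. A small sketch, assuming only the data.attributes.metadata.items path mentioned above; the shape of each item is not known from this thread, so items are returned untouched, and the function name is hypothetical.

```python
from typing import Any, List


def playlist_items(playlist: Any) -> List[Any]:
    """Return data.attributes.metadata.items, or [] if missing, empty or malformed."""
    node = playlist
    for key in ("data", "attributes", "metadata", "items"):
        # Stop as soon as the expected nesting is not there.
        if not isinstance(node, dict):
            return []
        node = node.get(key)
    return list(node) if isinstance(node, list) else []
```

An empty result then means "fall back to another strategy" rather than "invalid playlist".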

freezed removed reference stable 2022-12-11 22:04:42 +00:00
freezed added this to the devel project 2022-12-11 22:07:03 +00:00
Owner

If needed, here is another playlist-related case:

(delarte) user@machine ~/tmp % delarte https://www.arte.tv/fr/videos/RC-023064/frankenstream 
Invalid program
zsh: exit 1     delarte https://www.arte.tv/fr/videos/RC-023064/frankenstream
Author
Collaborator

Yep, that is because, for now, one way we validate the ConfigPlayer object sent by the API is to verify that its .data.attributes.metadata.providerId is in fact the program identifier from the URL (RC-023064 in your case), which is not the case for playlist URLs.

It is indeed, I guess, one way to programmatically figure out that the user intends to grab a whole playlist and not just one episode.
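For reference, a sketch of how the language and the identifier could be pulled out of a page URL before that comparison is made. The path layout `/<lang>/videos/<identifier>/<slug>` is only inferred from the two URLs quoted in this thread, and `parse_page_url` is a hypothetical name, not delarte's actual function.

```python
from typing import Tuple
from urllib.parse import urlparse


def parse_page_url(url: str) -> Tuple[str, str]:
    """Return (language, identifier) from a page URL.

    Assumed path layout: /<lang>/videos/<identifier>/<slug>
    """
    parts = [part for part in urlparse(url).path.split("/") if part]
    if len(parts) < 3 or parts[1] != "videos":
        raise ValueError(f"unexpected page URL: {url}")
    return parts[0], parts[2]
```

The identifier returned here is what gets compared with providerId: a mismatch can then be reported as "this is a playlist" instead of "Invalid program".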

Author
Collaborator

> > The response contains a reference to every episode of the list in order
>
> well, not always it seems.
>
> https://api.arte.tv/api/player/v2/playlist/fr/RC-021792
>
> ...has an empty data.attributes.metadata.items

It seems there is some sort of linking between episodes that can be followed using a playlist API call with an episode identifier.

Barbagus reopened this issue 2022-12-12 07:44:18 +00:00
Barbagus referenced this issue from a commit 2023-01-09 18:48:46 +00:00
Author
Collaborator

> > > The response contains a reference to every episode of the list in order
> >
> > well, not always it seems.
> >
> > https://api.arte.tv/api/player/v2/playlist/fr/RC-021792
> >
> > ...has an empty data.attributes.metadata.items
>
> It seems there is some sort of linking between episodes that can be followed using a playlist API call with an episode identifier.

So far:

  • the items of the Playlist API object are not reliable,
  • the linking is rather terrible and requires two API calls (ConfigPlayer and Playlist) for each item.
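To make the cost of the linking approach concrete, here is a rough sketch of what following the links might look like. How the "next episode" is actually found in the API responses is not described in this thread, so that part is left to a caller-supplied function (`next_episode`, hypothetical) which would perform the two API calls for one item.

```python
from typing import Callable, Iterator, Optional


def walk_episodes(
    first_id: str,
    next_episode: Callable[[str], Optional[str]],
    limit: int = 1000,
) -> Iterator[str]:
    """Yield episode identifiers by following links from first_id.

    next_episode(identifier) returns the following identifier, or None at
    the end of the chain. Already seen identifiers stop the walk, so a
    circular chain cannot loop forever; limit caps the number of items.
    """
    seen = set()
    current: Optional[str] = first_id
    while current is not None and current not in seen and len(seen) < limit:
        seen.add(current)
        yield current
        current = next_episode(current)
```

Each step costs one call to `next_episode`, that is two HTTP requests per episode under the assumption above.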

Another approach could be to actually load the HTML page and do some digging. From the look of it, the site uses Next.js (https://nextjs.org/) and includes some JSON data that is used to build the HTML page. In that JSON we have everything we need (I checked).

The downside is that relying on the HTML implementation is likely less robust than just using their JSON API. To put it another way: a change in their web infrastructure would kill our script.
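A sketch of what the HTML approach could look like, using only the standard library. It assumes the JSON sits in a `<script id="__NEXT_DATA__">` element, which is the usual convention for Next.js pages but has not been confirmed for this site in the thread. Where the episode list lives inside that JSON is not stated here either, so the sketch stops at decoding it.

```python
import json
from html.parser import HTMLParser
from typing import Any, Optional


class _NextDataParser(HTMLParser):
    """Collect the text content of <script id="__NEXT_DATA__">."""

    def __init__(self) -> None:
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_next_data(html: str) -> Optional[Any]:
    """Return the decoded embedded JSON, or None if the script tag is absent."""
    parser = _NextDataParser()
    parser.feed(html)
    parser.close()
    raw = "".join(parser.chunks)
    return json.loads(raw) if raw else None
```

A `None` result (or a `json.JSONDecodeError`) would be the signal that the page layout changed, which is exactly the fragility mentioned above.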

I am torn.

Barbagus changed title from Support for playlists to Support for collections 2023-01-11 17:31:36 +00:00
Reference: fcode/delarte#7