mirror of https://github.com/yt-dlp/yt-dlp synced 2025-12-18 23:25:42 +07:00

Compare commits

18 Commits

Author SHA1 Message Date
github-actions[bot]
264044286d Release 2025.10.14
Created by: bashonly

:ci skip all
2025-10-14 23:29:27 +00:00
Robin
a98e7f9f58 [ie/idagio] Add extractors (#14586)
Closes #2624
Authored by: robin-mu
2025-10-15 01:23:13 +02:00
uoag
0ea5d5882d [ie/abc.net.au] Support listen URLs (#14389)
Authored by: uoag
2025-10-14 22:02:21 +02:00
CasualYouTuber31
cdc533b114 [ie/tiktok:user] Fix private account extraction (#14585)
Closes #14565
Authored by: CasualYT31
2025-10-14 19:42:36 +00:00
bashonly
c2e124881f [ie/slideslive] Fix extractor (#14619)
Closes #14518
Authored by: bashonly
2025-10-14 19:38:15 +00:00
bashonly
ad55bfcfb7 [ie/10play] Handle geo-restriction errors (#14618)
Authored by: bashonly
2025-10-14 19:36:17 +00:00
Josh Holmer
739125d40f [ie/xhamster] Fix extractor (#14446)
Closes #14395
Authored by: shssoichiro, dhwz, dirkf

Co-authored-by: dhwz <3697946+dhwz@users.noreply.github.com>
Co-authored-by: dirkf <1222880+dirkf@users.noreply.github.com>
2025-10-14 19:31:07 +00:00
Sean Ellingham
5f94f05490 [ie/vidyard] Extract chapters (#14478)
Closes #14477
Authored by: exterrestris
2025-10-14 13:53:54 +02:00
columndeeply
5d7678195a [ie/PrankCastPost] Rework extractor (#14445)
Authored by: columndeeply
2025-10-14 13:25:07 +02:00
sepro
eafedc2181 [ie/10play] Rework extractor (#14417)
Closes #14276
Authored by: seproDev, Sipherdrakon

Co-authored-by: Sipherdrakon <64430430+Sipherdrakon@users.noreply.github.com>
2025-10-13 00:54:26 +02:00
Ceci
8eb8695139 [ie/dropout] Update extractor for new domain (#14531)
Closes #14521
Authored by: cecilia-sanare
2025-10-12 23:53:53 +02:00
uoag
df160ab18d [ie/cbc.ca:listen] Add extractor (#14391)
Authored by: uoag
2025-10-12 23:42:39 +02:00
sepro
6d41aaf21c [ie/soundcloud] Support new API URLs (#14449)
Closes #14443
Authored by: seproDev
2025-10-12 22:21:34 +02:00
sepro
a6673a8e82 Fix prefer-vp9-sort compat option (#14603)
Closes #14602
Authored by: seproDev
2025-10-12 20:30:17 +02:00
sepro
87be1bb96a [ie/musescore] Fix extractor (#14598)
Closes #14485
Authored by: seproDev
2025-10-12 08:49:15 +02:00
coletdjnz
ccc25d6710 [ie/youtube:tab] Fix approximate timestamp extraction for feeds (#14539)
Authored by: coletdjnz
2025-10-12 08:29:06 +13:00
Vu Thanh Tai
5513036104 [ie/tiktok] Support browser impersonation (#14473)
Closes #10919, Closes #12574
Authored by: thanhtaivtt, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-10-01 06:53:19 +00:00
coletdjnz
bd5ed90419 [ie/youtube] Detect experiment binding GVS PO Token to video id (#14471)
Fixes https://github.com/yt-dlp/yt-dlp/issues/14421

Authored by: coletdjnz
2025-09-29 16:25:09 +13:00
25 changed files with 795 additions and 196 deletions

View File

@@ -811,3 +811,10 @@ zakaryan2004
 cdce8p
 nicolaasjan
 willsmillie
+CasualYT31
+cecilia-sanare
+dhwz
+robin-mu
+shssoichiro
+thanhtaivtt
+uoag

View File

@@ -4,6 +4,32 @@ # Changelog
 # To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
 -->

+### 2025.10.14
+
+#### Core changes
+- [Fix `prefer-vp9-sort` compat option](https://github.com/yt-dlp/yt-dlp/commit/a6673a8e82276ea529c1773ed09e5bc4a22e822a) ([#14603](https://github.com/yt-dlp/yt-dlp/issues/14603)) by [seproDev](https://github.com/seproDev)
+
+#### Extractor changes
+- **10play**
+    - [Handle geo-restriction errors](https://github.com/yt-dlp/yt-dlp/commit/ad55bfcfb700fbfc1364c04e3425761d6f95c0a7) ([#14618](https://github.com/yt-dlp/yt-dlp/issues/14618)) by [bashonly](https://github.com/bashonly)
+    - [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/eafedc21817bb0de20e9aaccd7151a1d4c4e1ebd) ([#14417](https://github.com/yt-dlp/yt-dlp/issues/14417)) by [seproDev](https://github.com/seproDev), [Sipherdrakon](https://github.com/Sipherdrakon)
+- **abc.net.au**: [Support listen URLs](https://github.com/yt-dlp/yt-dlp/commit/0ea5d5882def84415f946907cfc00ab431c18fed) ([#14389](https://github.com/yt-dlp/yt-dlp/issues/14389)) by [uoag](https://github.com/uoag)
+- **cbc.ca**: listen: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/df160ab18db523f6629f2e7e20123d7a3551df28) ([#14391](https://github.com/yt-dlp/yt-dlp/issues/14391)) by [uoag](https://github.com/uoag)
+- **dropout**: [Update extractor for new domain](https://github.com/yt-dlp/yt-dlp/commit/8eb8695139dece6351aac10463df63b87b45b000) ([#14531](https://github.com/yt-dlp/yt-dlp/issues/14531)) by [cecilia-sanare](https://github.com/cecilia-sanare)
+- **idagio**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/a98e7f9f58a9492d2cb216baa59c890ed8ce02f3) ([#14586](https://github.com/yt-dlp/yt-dlp/issues/14586)) by [robin-mu](https://github.com/robin-mu)
+- **musescore**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/87be1bb96ac47abaaa4cfc6d7dd651e511b74551) ([#14598](https://github.com/yt-dlp/yt-dlp/issues/14598)) by [seproDev](https://github.com/seproDev)
+- **prankcastpost**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/5d7678195a7d0c045a9fe0418383171a71a7ea43) ([#14445](https://github.com/yt-dlp/yt-dlp/issues/14445)) by [columndeeply](https://github.com/columndeeply)
+- **slideslive**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/c2e124881f9aa02097589e853b3d3505e78372c4) ([#14619](https://github.com/yt-dlp/yt-dlp/issues/14619)) by [bashonly](https://github.com/bashonly)
+- **soundcloud**: [Support new API URLs](https://github.com/yt-dlp/yt-dlp/commit/6d41aaf21c61a87e74564646abd0a8ee887e888d) ([#14449](https://github.com/yt-dlp/yt-dlp/issues/14449)) by [seproDev](https://github.com/seproDev)
+- **tiktok**
+    - [Support browser impersonation](https://github.com/yt-dlp/yt-dlp/commit/5513036104ed9710f624c537fb3644b07a0680db) ([#14473](https://github.com/yt-dlp/yt-dlp/issues/14473)) by [bashonly](https://github.com/bashonly), [thanhtaivtt](https://github.com/thanhtaivtt)
+    - user: [Fix private account extraction](https://github.com/yt-dlp/yt-dlp/commit/cdc533b114c35ceb8a2e9dd3eb9c172a8737ae5e) ([#14585](https://github.com/yt-dlp/yt-dlp/issues/14585)) by [CasualYT31](https://github.com/CasualYT31)
+- **vidyard**: [Extract chapters](https://github.com/yt-dlp/yt-dlp/commit/5f94f054907c12e68129cd9ac2508ed8aba1b223) ([#14478](https://github.com/yt-dlp/yt-dlp/issues/14478)) by [exterrestris](https://github.com/exterrestris)
+- **xhamster**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/739125d40f8ede3beb7be68fc4df55bec0d226fd) ([#14446](https://github.com/yt-dlp/yt-dlp/issues/14446)) by [dhwz](https://github.com/dhwz), [dirkf](https://github.com/dirkf), [shssoichiro](https://github.com/shssoichiro)
+- **youtube**
+    - [Detect experiment binding GVS PO Token to video id](https://github.com/yt-dlp/yt-dlp/commit/bd5ed90419eea18adfb2f0d8efa9d22b2029119f) ([#14471](https://github.com/yt-dlp/yt-dlp/issues/14471)) by [coletdjnz](https://github.com/coletdjnz)
+    - tab: [Fix approximate timestamp extraction for feeds](https://github.com/yt-dlp/yt-dlp/commit/ccc25d6710a4aa373b7e15c558e07f8a2ffae5f3) ([#14539](https://github.com/yt-dlp/yt-dlp/issues/14539)) by [coletdjnz](https://github.com/coletdjnz)
+
 ### 2025.09.26

 #### Extractor changes

View File

@@ -242,6 +242,7 @@ # Supported sites
 - **Canalsurmas**
 - **CaracolTvPlay**: [*caracoltv-play*](## "netrc machine")
 - **cbc.ca**
+- **cbc.ca:listen**
 - **cbc.ca:player**
 - **cbc.ca:player:playlist**
 - **CBS**: (**Currently broken**)
@@ -579,6 +580,11 @@ # Supported sites
 - **Hypem**
 - **Hytale**
 - **Icareus**
+- **IdagioAlbum**
+- **IdagioPersonalPlaylist**
+- **IdagioPlaylist**
+- **IdagioRecording**
+- **IdagioTrack**
 - **IdolPlus**
 - **iflix:episode**
 - **IflixSeries**

View File

@@ -45,3 +45,8 @@ def test_no_visitor_id(self, pot_request):
     def test_invalid_base64(self, pot_request):
         pot_request.visitor_data = 'invalid-base64'
         assert get_webpo_content_binding(pot_request, bind_to_visitor_id=True) == (pot_request.visitor_data, ContentBindingType.VISITOR_DATA)
+
+    def test_gvs_video_id_binding_experiment(self, pot_request):
+        pot_request.context = PoTokenContext.GVS
+        pot_request._gvs_bind_to_video_id = True
+        assert get_webpo_content_binding(pot_request) == ('example-video-id', ContentBindingType.VIDEO_ID)

View File

@@ -155,7 +155,7 @@ def set_default_compat(compat_name, opt_name, default=True, remove_compat=True):
     if 'format-sort' in opts.compat_opts:
         opts.format_sort.extend(FormatSorter.ytdl_default)
     elif 'prefer-vp9-sort' in opts.compat_opts:
-        opts.format_sort.extend(FormatSorter._prefer_vp9_sort)
+        FormatSorter.default = FormatSorter._prefer_vp9_sort

     if 'mtime-by-default' in opts.compat_opts:
         if opts.updatetime is None:

View File

@@ -337,6 +337,7 @@
     CBCGemIE,
     CBCGemLiveIE,
     CBCGemPlaylistIE,
+    CBCListenIE,
     CBCPlayerIE,
     CBCPlayerPlaylistIE,
 )
@@ -823,6 +824,13 @@
     IchinanaLiveIE,
     IchinanaLiveVODIE,
 )
+from .idagio import (
+    IdagioAlbumIE,
+    IdagioPersonalPlaylistIE,
+    IdagioPlaylistIE,
+    IdagioRecordingIE,
+    IdagioTrackIE,
+)
 from .idolplus import IdolPlusIE
 from .ign import (
     IGNIE,

View File

@@ -21,7 +21,7 @@
 class ABCIE(InfoExtractor):
     IE_NAME = 'abc.net.au'
-    _VALID_URL = r'https?://(?:www\.)?abc\.net\.au/(?:news|btn)/(?:[^/]+/){1,4}(?P<id>\d{5,})'
+    _VALID_URL = r'https?://(?:www\.)?abc\.net\.au/(?:news|btn|listen)/(?:[^/?#]+/){1,4}(?P<id>\d{5,})'

     _TESTS = [{
         'url': 'http://www.abc.net.au/news/2014-11-05/australia-to-staff-ebola-treatment-centre-in-sierra-leone/5868334',
@@ -53,8 +53,9 @@ class ABCIE(InfoExtractor):
         'info_dict': {
             'id': '6880080',
             'ext': 'mp3',
-            'title': 'NAB lifts interest rates, following Westpac and CBA',
+            'title': 'NAB lifts interest rates, following Westpac and CBA - ABC listen',
             'description': 'md5:f13d8edc81e462fce4a0437c7dc04728',
+            'thumbnail': r're:https://live-production\.wcms\.abc-cdn\.net\.au/2193d7437c84b25eafd6360c82b5fa21',
         },
     }, {
         'url': 'http://www.abc.net.au/news/2015-10-19/6866214',
@@ -64,8 +65,9 @@ class ABCIE(InfoExtractor):
         'info_dict': {
             'id': '10527914',
             'ext': 'mp4',
-            'title': 'WWI Centenary',
-            'description': 'md5:c2379ec0ca84072e86b446e536954546',
+            'title': 'WWI Centenary - Behind The News',
+            'description': 'md5:fa4405939ff750fade46ff0cd4c66a52',
+            'thumbnail': r're:https://live-production\.wcms\.abc-cdn\.net\.au/bcc3433c97bf992dff32ec5a768713c9',
         },
     }, {
         'url': 'https://www.abc.net.au/news/programs/the-world/2020-06-10/black-lives-matter-protests-spawn-support-for/12342074',
@@ -73,7 +75,8 @@ class ABCIE(InfoExtractor):
             'id': '12342074',
             'ext': 'mp4',
             'title': 'Black Lives Matter protests spawn support for Papuans in Indonesia',
-            'description': 'md5:2961a17dc53abc558589ccd0fb8edd6f',
+            'description': 'md5:625257209f2d14ce23cb4e3785da9beb',
+            'thumbnail': r're:https://live-production\.wcms\.abc-cdn\.net\.au/7ee6f190de6d7dbb04203e514bfae9ec',
         },
     }, {
         'url': 'https://www.abc.net.au/btn/newsbreak/btn-newsbreak-20200814/12560476',
@@ -93,7 +96,16 @@ class ABCIE(InfoExtractor):
             'title': 'Wagner Group retreating from Russia, leader Prigozhin to move to Belarus',
             'ext': 'mp4',
             'description': 'Wagner troops leave Rostov-on-Don and\xa0Yevgeny Prigozhin will move to Belarus under a deal brokered by Belarusian President Alexander Lukashenko to end the mutiny.',
-            'thumbnail': 'https://live-production.wcms.abc-cdn.net.au/0c170f5b57f0105c432f366c0e8e267b?impolicy=wcms_crop_resize&cropH=2813&cropW=5000&xPos=0&yPos=249&width=862&height=485',
+            'thumbnail': r're:https://live-production\.wcms\.abc-cdn\.net\.au/0c170f5b57f0105c432f366c0e8e267b',
+        },
+    }, {
+        'url': 'https://www.abc.net.au/listen/programs/the-followers-madness-of-two/presents-followers-madness-of-two/105697646',
+        'info_dict': {
+            'id': '105697646',
+            'title': 'INTRODUCING — The Followers: Madness of Two - ABC listen',
+            'ext': 'mp3',
+            'description': 'md5:2310cd0d440a4e01656abea15db8d1f3',
+            'thumbnail': r're:https://live-production\.wcms\.abc-cdn\.net\.au/90d7078214e5d66553ffb7fcf0da0cda',
         },
     }]
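
The widened `_VALID_URL` can be sanity-checked outside yt-dlp. The following is a minimal sketch that uses only the standard `re` module; the pattern is copied from the changed line, while the helper name `match_id` is hypothetical and only loosely mirrors how an extractor pulls the ID out of a URL.

```python
import re

# Pattern copied from the updated ABCIE._VALID_URL in this diff.
VALID_URL = r'https?://(?:www\.)?abc\.net\.au/(?:news|btn|listen)/(?:[^/?#]+/){1,4}(?P<id>\d{5,})'


def match_id(url):
    """Hypothetical helper: return the captured id, or None if the URL is not matched."""
    m = re.match(VALID_URL, url)
    return m.group('id') if m else None


LISTEN_URL = ('https://www.abc.net.au/listen/programs/the-followers-madness-of-two/'
              'presents-followers-madness-of-two/105697646')
NEWS_URL = 'http://www.abc.net.au/news/2015-10-19/6866214'
OTHER_URL = 'https://www.abc.net.au/radio/programs/example/12345'  # made-up path, not in the diff

print(match_id(LISTEN_URL))  # id of the new listen test case
print(match_id(NEWS_URL))    # existing news URLs keep matching
print(match_id(OTHER_URL))   # sections outside news|btn|listen are rejected
```

The `[^/?#]+` segment class keeps a query string or fragment from being consumed as a path segment, which the older `[^/]+` class would have allowed.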

View File

@@ -31,7 +31,7 @@
 class CBCIE(InfoExtractor):
     IE_NAME = 'cbc.ca'
-    _VALID_URL = r'https?://(?:www\.)?cbc\.ca/(?!player/)(?:[^/]+/)+(?P<id>[^/?#]+)'
+    _VALID_URL = r'https?://(?:www\.)?cbc\.ca/(?!player/|listen/|i/caffeine/syndicate/)(?:[^/?#]+/)+(?P<id>[^/?#]+)'
     _TESTS = [{
         # with mediaId
         'url': 'http://www.cbc.ca/22minutes/videos/clips-season-23/don-cherry-play-offs',
@@ -112,10 +112,6 @@ class CBCIE(InfoExtractor):
         'playlist_mincount': 6,
     }]

-    @classmethod
-    def suitable(cls, url):
-        return False if CBCPlayerIE.suitable(url) else super().suitable(url)
-
     def _extract_player_init(self, player_init, display_id):
         player_info = self._parse_json(player_init, display_id, js_to_json)
         media_id = player_info.get('mediaId')
@@ -913,3 +909,63 @@ def _real_extract(self, url):
                 'thumbnail': ('images', 'card', 'url'),
             }),
         }
+
+
+class CBCListenIE(InfoExtractor):
+    IE_NAME = 'cbc.ca:listen'
+    _VALID_URL = r'https?://(?:www\.)?cbc\.ca/listen/(?:cbc-podcasts|live-radio)/[\w-]+/[\w-]+/(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'https://www.cbc.ca/listen/cbc-podcasts/1353-the-naked-emperor/episode/16142603-introducing-understood-who-broke-the-internet',
+        'info_dict': {
+            'id': '16142603',
+            'title': 'Introducing Understood: Who Broke the Internet?',
+            'ext': 'mp3',
+            'description': 'md5:c605117500084e43f08a950adc6a708c',
+            'duration': 229,
+            'timestamp': 1745812800,
+            'release_timestamp': 1745827200,
+            'release_date': '20250428',
+            'upload_date': '20250428',
+        },
+    }, {
+        'url': 'https://www.cbc.ca/listen/live-radio/1-64-the-house/clip/16170773-should-canada-suck-stand-donald-trump',
+        'info_dict': {
+            'id': '16170773',
+            'title': 'Should Canada suck up or stand up to Donald Trump?',
+            'ext': 'mp3',
+            'description': 'md5:7385194f1cdda8df27ba3764b35e7976',
+            'duration': 3159,
+            'timestamp': 1758340800,
+            'release_timestamp': 1758254400,
+            'release_date': '20250919',
+            'upload_date': '20250920',
+        },
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        response = self._download_json(
+            f'https://www.cbc.ca/listen/api/v1/clips/{video_id}', video_id, fatal=False)
+        data = traverse_obj(response, ('data', {dict}))
+        if not data:
+            self.report_warning('API failed to return data. Falling back to webpage parsing')
+            webpage = self._download_webpage(url, video_id)
+            preloaded_state = self._search_json(
+                r'window\.__PRELOADED_STATE__\s*=', webpage, 'preloaded state',
+                video_id, transform_source=js_to_json)
+            data = traverse_obj(preloaded_state, (
+                ('podcastDetailData', 'showDetailData'), ..., 'episodes',
+                lambda _, v: str(v['clipID']) == video_id, any, {require('episode data')}))
+
+        return {
+            'id': video_id,
+            **traverse_obj(data, {
+                'url': (('src', 'url'), {url_or_none}, any),
+                'title': ('title', {str}),
+                'description': ('description', {str}),
+                'release_timestamp': ('releasedAt', {int_or_none(scale=1000)}),
+                'timestamp': ('airdate', {int_or_none(scale=1000)}),
+                'duration': ('duration', {int_or_none}),
+            }),
+        }
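
The removed `suitable()` override and the extended negative lookahead both serve the same goal: keeping the generic `cbc.ca` pattern from claiming URLs that belong to a more specific extractor. The sketch below illustrates that with the two patterns copied from this diff. The `pick_extractor` function is a hypothetical stand-in, not yt-dlp's actual dispatch logic, and the player URL is a made-up example.

```python
import re

# Patterns copied from CBCIE._VALID_URL (new version) and CBCListenIE._VALID_URL.
CBC_GENERIC = (r'https?://(?:www\.)?cbc\.ca/(?!player/|listen/|i/caffeine/syndicate/)'
               r'(?:[^/?#]+/)+(?P<id>[^/?#]+)')
CBC_LISTEN = r'https?://(?:www\.)?cbc\.ca/listen/(?:cbc-podcasts|live-radio)/[\w-]+/[\w-]+/(?P<id>\d+)'


def pick_extractor(url):
    """Hypothetical dispatcher: return (name, id) for the first matching pattern, else None."""
    for name, pattern in (('cbc.ca:listen', CBC_LISTEN), ('cbc.ca', CBC_GENERIC)):
        m = re.match(pattern, url)
        if m:
            return name, m.group('id')
    return None


podcast = ('https://www.cbc.ca/listen/cbc-podcasts/1353-the-naked-emperor/episode/'
           '16142603-introducing-understood-who-broke-the-internet')
generic = 'http://www.cbc.ca/22minutes/videos/clips-season-23/don-cherry-play-offs'
player = 'https://www.cbc.ca/player/play/example'  # made-up URL for illustration

print(pick_extractor(podcast))
print(pick_extractor(generic))
print(pick_extractor(player))          # neither pattern claims it
print(re.match(CBC_GENERIC, podcast))  # the lookahead excludes listen/ URLs
```

Because the lookahead already rejects `player/` and `listen/` prefixes, the generic pattern no longer needs to consult another extractor class to decide whether it is suitable.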

View File

@@ -5,18 +5,6 @@
 class CellebriteIE(VidyardBaseIE):
     _VALID_URL = r'https?://cellebrite\.com/(?:\w+)?/(?P<id>[\w-]+)'
     _TESTS = [{
-        'url': 'https://cellebrite.com/en/collect-data-from-android-devices-with-cellebrite-ufed/',
-        'info_dict': {
-            'id': 'ZqmUss3dQfEMGpauambPuH',
-            'display_id': '16025876',
-            'ext': 'mp4',
-            'title': 'Ask the Expert: Chat Capture - Collect Data from Android Devices in Cellebrite UFED',
-            'description': 'md5:dee48fe12bbae5c01fe6a053f7676da4',
-            'thumbnail': 'https://cellebrite.com/wp-content/uploads/2021/05/Chat-Capture-1024x559.png',
-            'duration': 455.979,
-            '_old_archive_ids': ['cellebrite 16025876'],
-        },
-    }, {
         'url': 'https://cellebrite.com/en/how-to-lawfully-collect-the-maximum-amount-of-data-from-android-devices/',
         'info_dict': {
             'id': 'QV1U8a2yzcxigw7VFnqKyg',

View File

@@ -18,15 +18,15 @@
 class DropoutIE(InfoExtractor):
-    _LOGIN_URL = 'https://www.dropout.tv/login'
+    _LOGIN_URL = 'https://watch.dropout.tv/login'
     _NETRC_MACHINE = 'dropout'
-    _VALID_URL = r'https?://(?:www\.)?dropout\.tv/(?:[^/]+/)*videos/(?P<id>[^/]+)/?$'
+    _VALID_URL = r'https?://(?:watch\.)?dropout\.tv/(?:[^/?#]+/)*videos/(?P<id>[^/?#]+)/?(?:[?#]|$)'
     _TESTS = [
         {
-            'url': 'https://www.dropout.tv/game-changer/season:2/videos/yes-or-no',
+            'url': 'https://watch.dropout.tv/game-changer/season:2/videos/yes-or-no',
             'note': 'Episode in a series',
-            'md5': '5e000fdfd8d8fa46ff40456f1c2af04a',
+            'md5': '4b76963f904f8bc4ba22dcf0e66ada06',
             'info_dict': {
                 'id': '738153',
                 'display_id': 'yes-or-no',
@@ -45,35 +45,35 @@ class DropoutIE(InfoExtractor):
                 'uploader_url': 'https://vimeo.com/user80538407',
                 'uploader': 'OTT Videos',
             },
-            'expected_warnings': ['Ignoring subtitle tracks found in the HLS manifest'],
+            'expected_warnings': ['Ignoring subtitle tracks found in the HLS manifest', 'Failed to parse XML: not well-formed'],
         },
         {
-            'url': 'https://www.dropout.tv/dimension-20-fantasy-high/season:1/videos/episode-1',
+            'url': 'https://watch.dropout.tv/tablepop-presents-megadungeon-live/season:1/videos/enter-through-the-gift-shop',
             'note': 'Episode in a series (missing release_date)',
-            'md5': '712caf7c191f1c47c8f1879520c2fa5c',
+            'md5': 'b08fb03050585ea25cd7ee092db9134c',
             'info_dict': {
-                'id': '320562',
-                'display_id': 'episode-1',
+                'id': '624270',
+                'display_id': 'enter-through-the-gift-shop',
                 'ext': 'mp4',
-                'title': 'The Beginning Begins',
-                'description': 'The cast introduces their PCs, including a neurotic elf, a goblin PI, and a corn-worshipping cleric.',
-                'thumbnail': 'https://vhx.imgix.net/chuncensoredstaging/assets/4421ed0d-f630-4c88-9004-5251b2b8adfa.jpg',
-                'series': 'Dimension 20: Fantasy High',
+                'title': 'Enter Through the Gift Shop',
+                'description': 'A new adventuring party explores a gift shop and runs into a friendly orc -- and some angry goblins.',
+                'thumbnail': 'https://vhx.imgix.net/chuncensoredstaging/assets/a1d876c3-3dee-4cd0-87c6-27a851b1d0ec.jpg',
+                'series': 'TablePop Presents: MEGADUNGEON LIVE!',
                 'season_number': 1,
                 'season': 'Season 1',
                 'episode_number': 1,
-                'episode': 'The Beginning Begins',
-                'duration': 6838,
+                'episode': 'Enter Through the Gift Shop',
+                'duration': 7101,
                 'uploader_id': 'user80538407',
                 'uploader_url': 'https://vimeo.com/user80538407',
                 'uploader': 'OTT Videos',
             },
-            'expected_warnings': ['Ignoring subtitle tracks found in the HLS manifest'],
+            'expected_warnings': ['Ignoring subtitle tracks found in the HLS manifest', 'Failed to parse XML: not well-formed'],
         },
         {
-            'url': 'https://www.dropout.tv/videos/misfits-magic-holiday-special',
+            'url': 'https://watch.dropout.tv/videos/misfits-magic-holiday-special',
             'note': 'Episode not in a series',
-            'md5': 'c30fa18999c5880d156339f13c953a26',
+            'md5': '1e6428f7756b02c93b573d39ddd789fe',
             'info_dict': {
                 'id': '1915774',
                 'display_id': 'misfits-magic-holiday-special',
@@ -87,7 +87,7 @@ class DropoutIE(InfoExtractor):
                 'uploader_url': 'https://vimeo.com/user80538407',
                 'uploader': 'OTT Videos',
             },
-            'expected_warnings': ['Ignoring subtitle tracks found in the HLS manifest'],
+            'expected_warnings': ['Ignoring subtitle tracks found in the HLS manifest', 'Failed to parse XML: not well-formed'],
         },
     ]
@@ -125,7 +125,7 @@ def _real_extract(self, url):
         display_id = self._match_id(url)
         webpage = None
-        if self._get_cookies('https://www.dropout.tv').get('_session'):
+        if self._get_cookies('https://watch.dropout.tv').get('_session'):
             webpage = self._download_webpage(url, display_id)
         if not webpage or '<div id="watch-unauthorized"' in webpage:
             login_err = self._login(display_id)
@@ -148,7 +148,7 @@ def _real_extract(self, url):
         return {
             '_type': 'url_transparent',
             'ie_key': VHXEmbedIE.ie_key(),
-            'url': VHXEmbedIE._smuggle_referrer(embed_url, 'https://www.dropout.tv'),
+            'url': VHXEmbedIE._smuggle_referrer(embed_url, 'https://watch.dropout.tv'),
             'id': self._search_regex(r'embed\.vhx\.tv/videos/(.+?)\?', embed_url, 'id'),
             'display_id': display_id,
             'title': title,
@@ -167,10 +167,10 @@ def _real_extract(self, url):
 class DropoutSeasonIE(InfoExtractor):
     _PAGE_SIZE = 24
-    _VALID_URL = r'https?://(?:www\.)?dropout\.tv/(?P<id>[^\/$&?#]+)(?:/?$|/season:(?P<season>[0-9]+)/?$)'
+    _VALID_URL = r'https?://(?:watch\.)?dropout\.tv/(?P<id>[^\/$&?#]+)(?:/?$|/season:(?P<season>[0-9]+)/?$)'
     _TESTS = [
         {
-            'url': 'https://www.dropout.tv/dimension-20-fantasy-high/season:1',
+            'url': 'https://watch.dropout.tv/dimension-20-fantasy-high/season:1',
             'note': 'Multi-season series with the season in the url',
             'playlist_count': 24,
             'info_dict': {
@@ -179,7 +179,7 @@ class DropoutSeasonIE(InfoExtractor):
             },
         },
         {
-            'url': 'https://www.dropout.tv/dimension-20-fantasy-high',
+            'url': 'https://watch.dropout.tv/dimension-20-fantasy-high',
             'note': 'Multi-season series with the season not in the url',
             'playlist_count': 24,
             'info_dict': {
@@ -188,7 +188,7 @@ class DropoutSeasonIE(InfoExtractor):
             },
         },
         {
-            'url': 'https://www.dropout.tv/dimension-20-shriek-week',
+            'url': 'https://watch.dropout.tv/dimension-20-shriek-week',
             'note': 'Single-season series',
             'playlist_count': 4,
             'info_dict': {
@@ -197,7 +197,7 @@ class DropoutSeasonIE(InfoExtractor):
             },
         },
         {
-            'url': 'https://www.dropout.tv/breaking-news-no-laugh-newsroom/season:3',
+            'url': 'https://watch.dropout.tv/breaking-news-no-laugh-newsroom/season:3',
             'note': 'Multi-season series with season in the url that requires pagination',
             'playlist_count': 25,
             'info_dict': {
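
Besides the domain switch, the new `DropoutIE._VALID_URL` also tightens how the video slug is captured. The sketch below compares the old and new patterns (both copied from this diff) with the standard `re` module. The query string `?utm=1` and the helper name `slug` are made up for illustration.

```python
import re

# Old and new DropoutIE._VALID_URL, copied from the diff.
OLD = r'https?://(?:www\.)?dropout\.tv/(?:[^/]+/)*videos/(?P<id>[^/]+)/?$'
NEW = r'https?://(?:watch\.)?dropout\.tv/(?:[^/?#]+/)*videos/(?P<id>[^/?#]+)/?(?:[?#]|$)'


def slug(pattern, url):
    """Hypothetical helper: return the captured slug, or None when the pattern does not match."""
    m = re.match(pattern, url)
    return m.group('id') if m else None


new_domain = 'https://watch.dropout.tv/game-changer/season:2/videos/yes-or-no'
with_query = new_domain + '?utm=1'                              # illustrative query string
old_with_query = 'https://www.dropout.tv/videos/yes-or-no?utm=1'

print(slug(NEW, new_domain))      # slug only
print(slug(NEW, with_query))      # the query string is not part of the slug
print(slug(OLD, new_domain))      # old pattern does not know the watch. subdomain
print(slug(OLD, old_with_query))  # old pattern swallowed the query string into the slug
```

Note that the new pattern, as written, accepts `watch.dropout.tv` and the bare domain but no longer accepts the `www.` prefix.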

233
yt_dlp/extractor/idagio.py Normal file
View File

@@ -0,0 +1,233 @@
+from .common import InfoExtractor
+from ..utils import int_or_none, unified_timestamp, url_or_none
+from ..utils.traversal import traverse_obj
+
+
+class IdagioTrackIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?app\.idagio\.com/recordings/\d+\?(?:[^#]+&)?trackId=(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'https://app.idagio.com/recordings/30576934?trackId=30576943',
+        'md5': '15148bd71804b2450a2508931a116b56',
+        'info_dict': {
+            'id': '30576943',
+            'ext': 'mp3',
+            'title': 'Theme. Andante',
+            'duration': 82,
+            'composers': ['Edward Elgar'],
+            'artists': ['Vasily Petrenko', 'Royal Liverpool Philharmonic Orchestra'],
+            'genres': ['Orchestral', 'Other Orchestral Music'],
+            'track': 'Theme. Andante',
+            'timestamp': 1554474370,
+            'upload_date': '20190405',
+        },
+    }, {
+        'url': 'https://app.idagio.com/recordings/20514467?trackId=20514478&utm_source=pcl',
+        'md5': '3acef2ea0feadf889123b70e5a1e7fa7',
+        'info_dict': {
+            'id': '20514478',
+            'ext': 'mp3',
+            'title': 'I. Adagio sostenuto',
+            'duration': 316,
+            'composers': ['Ludwig van Beethoven'],
+            'artists': [],
+            'genres': ['Keyboard', 'Sonata (Keyboard)'],
+            'track': 'I. Adagio sostenuto',
+            'timestamp': 1518076337,
+            'upload_date': '20180208',
+        },
+    }]
+
+    def _real_extract(self, url):
+        track_id = self._match_id(url)
+        track_info = self._download_json(
+            f'https://api.idagio.com/v2.0/metadata/tracks/{track_id}',
+            track_id, fatal=False, expected_status=406)
+        if traverse_obj(track_info, 'error_code') == 'idagio.error.blocked.location':
+            self.raise_geo_restricted()
+
+        content_info = self._download_json(
+            f'https://api.idagio.com/v1.8/content/track/{track_id}', track_id,
+            query={
+                'quality': '0',
+                'format': '2',
+                'client_type': 'web-4',
+            })
+
+        return {
+            'ext': 'mp3',
+            'vcodec': 'none',
+            'id': track_id,
+            'url': traverse_obj(content_info, ('url', {url_or_none})),
+            **traverse_obj(track_info, ('result', {
+                'title': ('piece', 'title', {str}),
+                'timestamp': ('recording', 'created_at', {int_or_none(scale=1000)}),
+                'location': ('recording', 'location', {str}),
+                'duration': ('duration', {int_or_none}),
+                'track': ('piece', 'title', {str}),
+                'artists': ('recording', ('conductor', ('ensembles', ...), ('soloists', ...)), 'name', {str}, filter),
+                'composers': ('piece', 'workpart', 'work', 'composer', 'name', {str}, filter, all, filter),
+                'genres': ('piece', 'workpart', 'work', ('genre', 'subgenre'), 'title', {str}, filter),
+            })),
+        }
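
Several fields in this diff are passed through `int_or_none(scale=1000)`, which suggests the APIs report times in milliseconds while yt-dlp's `timestamp` fields are in seconds. The following is a simplified, standalone stand-in for that conversion, not yt-dlp's own helper. The function name `int_or_none_scaled` is hypothetical, and the raw value `1554474370000` is an assumed API value chosen so that the result matches the `timestamp` in the first test case above.

```python
from datetime import datetime, timezone


def int_or_none_scaled(value, scale=1):
    """Hypothetical stand-in: integer-divide by `scale`, returning None for unusable input."""
    try:
        return int(value) // scale
    except (TypeError, ValueError):
        return None


created_at_ms = 1554474370000  # assumed raw millisecond value, for illustration only
timestamp = int_or_none_scaled(created_at_ms, scale=1000)
upload_date = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime('%Y%m%d')

print(timestamp)    # seconds since the epoch
print(upload_date)  # the YYYYMMDD date derived from the timestamp in UTC
print(int_or_none_scaled(None, scale=1000))  # missing values stay None instead of raising
```

Returning `None` for missing input is what lets a traversal simply drop the field rather than fail the whole extraction.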
+
+
+class IdagioPlaylistBaseIE(InfoExtractor):
+    """Subclasses must set _API_URL_TMPL and define _parse_playlist_metadata"""
+    _PLAYLIST_ID_KEY = 'id'  # vs. 'display_id'
+
+    def _entries(self, playlist_info):
+        for track_data in traverse_obj(playlist_info, ('tracks', lambda _, v: v['id'] and v['recording']['id'])):
+            track_id = track_data['id']
+            recording_id = track_data['recording']['id']
+            yield self.url_result(
+                f'https://app.idagio.com/recordings/{recording_id}?trackId={track_id}',
+                ie=IdagioTrackIE, video_id=track_id)
+
+    def _real_extract(self, url):
+        playlist_id = self._match_id(url)
+        playlist_info = self._download_json(
+            self._API_URL_TMPL.format(playlist_id), playlist_id)['result']
+
+        return {
+            '_type': 'playlist',
+            self._PLAYLIST_ID_KEY: playlist_id,
+            'entries': self._entries(playlist_info),
+            **self._parse_playlist_metadata(playlist_info),
+        }
+
+
+class IdagioRecordingIE(IdagioPlaylistBaseIE):
+    _VALID_URL = r'https?://(?:www\.)?app\.idagio\.com/recordings/(?P<id>\d+)(?![^#]*[&?]trackId=\d+)'
+    _TESTS = [{
+        'url': 'https://app.idagio.com/recordings/30576934',
+        'info_dict': {
+            'id': '30576934',
+            'title': 'Variations on an Original Theme op. 36',
+            'composers': ['Edward Elgar'],
+            'artists': ['Vasily Petrenko', 'Royal Liverpool Philharmonic Orchestra'],
+            'genres': ['Orchestral', 'Other Orchestral Music'],
+            'timestamp': 1554474370,
+            'modified_timestamp': 1554474370,
+            'modified_date': '20190405',
+            'upload_date': '20190405',
+        },
+        'playlist_count': 15,
+    }]
+    _API_URL_TMPL = 'https://api.idagio.com/v2.0/metadata/recordings/{}'
+
+    def _parse_playlist_metadata(self, playlist_info):
+        return traverse_obj(playlist_info, {
+            'title': ('work', 'title', {str}),
+            'timestamp': ('created_at', {int_or_none(scale=1000)}),
+            'modified_timestamp': ('created_at', {int_or_none(scale=1000)}),
+            'location': ('location', {str}),
+            'artists': (('conductor', ('ensembles', ...), ('soloists', ...)), 'name', {str}),
+            'composers': ('work', 'composer', 'name', {str}, all),
+            'genres': ('work', ('genre', 'subgenre'), 'title', {str}),
+            'tags': ('tags', ..., {str}),
+        })
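
`IdagioTrackIE` and `IdagioRecordingIE` share the `/recordings/<id>` path and are told apart only by the `trackId` query parameter: the track pattern requires it, and the recording pattern rejects it with a negative lookahead. The sketch below checks that the two patterns (copied from this diff) are mutually exclusive on the test URLs. The `classify` function is hypothetical, and the reordered-query URL is a made-up variant.

```python
import re

# Patterns copied from IdagioTrackIE._VALID_URL and IdagioRecordingIE._VALID_URL.
TRACK = r'https?://(?:www\.)?app\.idagio\.com/recordings/\d+\?(?:[^#]+&)?trackId=(?P<id>\d+)'
RECORDING = r'https?://(?:www\.)?app\.idagio\.com/recordings/(?P<id>\d+)(?![^#]*[&?]trackId=\d+)'


def classify(url):
    """Hypothetical helper: list every (kind, id) whose pattern matches the URL."""
    result = []
    for kind, pattern in (('track', TRACK), ('recording', RECORDING)):
        m = re.match(pattern, url)
        if m:
            result.append((kind, m.group('id')))
    return result


track_url = 'https://app.idagio.com/recordings/30576934?trackId=30576943'
tracked_with_extra = 'https://app.idagio.com/recordings/20514467?trackId=20514478&utm_source=pcl'
reordered = 'https://app.idagio.com/recordings/20514467?utm_source=pcl&trackId=20514478'  # made-up variant
recording_url = 'https://app.idagio.com/recordings/30576934'

for url in (track_url, tracked_with_extra, reordered, recording_url):
    print(classify(url))
```

The `[^#]*` inside the lookahead also absorbs any digits that the `\d+` group gives back while backtracking, so a shortened recording ID cannot slip past the `trackId` check.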
+
+
+class IdagioAlbumIE(IdagioPlaylistBaseIE):
+    _VALID_URL = r'https?://(?:www\.)?app\.idagio\.com/albums/(?P<id>[\w-]+)'
+    _TESTS = [{
+        'url': 'https://app.idagio.com/albums/elgar-enigma-variations-in-the-south-serenade-for-strings',
+        'info_dict': {
+            'id': 'a9f139b8-f70d-4b8a-a9a4-5fe8d35eaf9c',
+            'display_id': 'elgar-enigma-variations-in-the-south-serenade-for-strings',
+            'title': 'Elgar: Enigma Variations, In the South, Serenade for Strings',
+            'description': '',
+            'thumbnail': 'https://idagio-images.global.ssl.fastly.net/albums/880040420521/main.jpg',
+            'artists': ['Vasily Petrenko', 'Royal Liverpool Philharmonic Orchestra', 'Edward Elgar'],
+            'timestamp': 1553817600,
+            'upload_date': '20190329',
+            'modified_timestamp': 1562566559.0,
+            'modified_date': '20190708',
+        },
+        'playlist_count': 19,
+    }, {
+        'url': 'https://app.idagio.com/albums/brahms-ein-deutsches-requiem-3B403DF6-62D7-4A42-807B-47173F3E0192',
+        'info_dict': {
+            'id': '2862ad4e-4a61-45ad-9ce4-7fcf0c2626fe',
+            'display_id': 'brahms-ein-deutsches-requiem-3B403DF6-62D7-4A42-807B-47173F3E0192',
+            'title': 'Brahms: Ein deutsches Requiem',
+            'description': '',
+            'thumbnail': 'https://idagio-images.global.ssl.fastly.net/albums/3149020954522/main.jpg',
+            'tags': ['recent-release'],
+            'artists': ['Sabine Devieilhe', 'Stéphane Degout', 'Raphaël Pichon', 'Pygmalion', 'Johannes Brahms'],
+            'timestamp': 1760054400,
+            'upload_date': '20251010',
+            'modified_timestamp': 1760101611,
+            'modified_date': '20251010',
+        },
+        'playlist_count': 7,
+    }]
+    _API_URL_TMPL = 'https://api.idagio.com/v2.0/metadata/albums/{}'
+    _PLAYLIST_ID_KEY = 'display_id'
+
+    def _parse_playlist_metadata(self, playlist_info):
+        return traverse_obj(playlist_info, {
+            'id': ('id', {str}),
+            'title': ('title', {str}),
+            'timestamp': ('publishDate', {unified_timestamp}),
+            'modified_timestamp': ('lastModified', {unified_timestamp}),
+            'thumbnail': ('imageUrl', {url_or_none}),
+            'description': ('description', {str}),
+            'artists': ('participants', ..., 'name', {str}),
+            'tags': ('tags', ..., {str}),
+        })
+
+
+class IdagioPlaylistIE(IdagioPlaylistBaseIE):
+    _VALID_URL = r'https?://(?:www\.)?app\.idagio\.com/playlists/(?!personal/)(?P<id>[\w-]+)'
+    _TESTS = [{
+        'url': 'https://app.idagio.com/playlists/beethoven-the-most-beautiful-piano-music',
+        'info_dict': {
+            'id': '31652bec-8c5b-460e-a3f0-cf1f69817f53',
+            'display_id': 'beethoven-the-most-beautiful-piano-music',
+            'title': 'Beethoven: the most beautiful piano music',
+            'description': 'md5:d41bb04b8896bb69377f5c2cd9345ad1',
+            'thumbnail': r're:https://.+/playlists/31652bec-8c5b-460e-a3f0-cf1f69817f53/main\.jpg',
+            'creators': ['IDAGIO'],
+        },
+        'playlist_mincount': 16,  # one entry is geo-restricted
+    }]
+    _API_URL_TMPL = 'https://api.idagio.com/v2.0/playlists/{}'
+    _PLAYLIST_ID_KEY = 'display_id'
+
+    def _parse_playlist_metadata(self, playlist_info):
+        return traverse_obj(playlist_info, {
+            'id': ('id', {str}),
+            'title': ('title', {str}),
+            'thumbnail': ('imageUrl', {url_or_none}),
+            'description': ('description', {str}),
+            'creators': ('curator', 'name', {str}, all),
+        })
class IdagioPersonalPlaylistIE(IdagioPlaylistBaseIE):
_VALID_URL = r'https?://(?:www\.)?app\.idagio\.com/playlists/personal/(?P<id>[\da-f-]+)'
_TESTS = [{
'url': 'https://app.idagio.com/playlists/personal/99dad72e-7b3a-45a4-b216-867c08046ed8',
'info_dict': {
'id': '99dad72e-7b3a-45a4-b216-867c08046ed8',
'title': 'Test',
'creators': ['1a6f16a6-4514-4d0c-b481-3a9877835626'],
'thumbnail': r're:https://.+/artists/86371/main\.jpg',
'timestamp': 1602859138,
'modified_timestamp': 1755616667,
'upload_date': '20201016',
'modified_date': '20250819',
},
'playlist_count': 100,
}]
_API_URL_TMPL = 'https://api.idagio.com/v1.0/personal-playlists/{}'
def _parse_playlist_metadata(self, playlist_info):
return traverse_obj(playlist_info, {
'title': ('title', {str}),
'thumbnail': ('image_url', {url_or_none}),
'creators': ('user_id', {str}, all),
'timestamp': ('created_at', {int_or_none(scale=1000)}),
'modified_timestamp': ('updated_at', {int_or_none(scale=1000)}),
})


@@ -1,3 +1,5 @@
import hashlib
from .common import InfoExtractor from .common import InfoExtractor
@@ -9,10 +11,10 @@ class MuseScoreIE(InfoExtractor):
'id': '142975', 'id': '142975',
'ext': 'mp3', 'ext': 'mp3',
'title': 'WA Mozart Marche Turque (Turkish March fingered)', 'title': 'WA Mozart Marche Turque (Turkish March fingered)',
'description': 'md5:7ede08230e4eaabd67a4a98bb54d07be', 'description': 'md5:0ca4cf6b79d7f5868a1fee74097394ab',
'thumbnail': r're:https?://(?:www\.)?musescore\.com/.*\.png[^$]+', 'thumbnail': r're:https?://cdn\.ustatik\.com/musescore/.*\.jpg',
'uploader': 'PapyPiano', 'uploader': 'PapyPiano',
'creator': 'Wolfgang Amadeus Mozart', 'creators': ['Wolfgang Amadeus Mozart'],
}, },
}, { }, {
'url': 'https://musescore.com/user/36164500/scores/6837638', 'url': 'https://musescore.com/user/36164500/scores/6837638',
@@ -20,10 +22,10 @@ class MuseScoreIE(InfoExtractor):
'id': '6837638', 'id': '6837638',
'ext': 'mp3', 'ext': 'mp3',
'title': 'Sweet Child O\' Mine Guns N\' Roses sweet child', 'title': 'Sweet Child O\' Mine Guns N\' Roses sweet child',
'description': 'md5:4dca71191c14abc312a0a4192492eace', 'description': 'md5:2cd49bd6b4e48a75a3c469d4775d5079',
'thumbnail': r're:https?://(?:www\.)?musescore\.com/.*\.png[^$]+', 'thumbnail': r're:https?://cdn\.ustatik\.com/musescore/.*\.png',
'uploader': 'roxbelviolin', 'uploader': 'roxbelviolin',
'creator': 'Guns N´Roses Arr. Roxbel Violin', 'creators': ['Guns N´Roses Arr. Roxbel Violin'],
}, },
}, { }, {
'url': 'https://musescore.com/classicman/fur-elise', 'url': 'https://musescore.com/classicman/fur-elise',
@@ -31,22 +33,28 @@ class MuseScoreIE(InfoExtractor):
'id': '33816', 'id': '33816',
'ext': 'mp3', 'ext': 'mp3',
'title': 'Für Elise Beethoven', 'title': 'Für Elise Beethoven',
'description': 'md5:49515a3556d5ecaf9fa4b2514064ac34', 'description': 'md5:e37b241c0280b33e9ac25651b815d06e',
'thumbnail': r're:https?://(?:www\.)?musescore\.com/.*\.png[^$]+', 'thumbnail': r're:https?://cdn\.ustatik\.com/musescore/.*\.jpg',
'uploader': 'ClassicMan', 'uploader': 'ClassicMan',
'creator': 'Ludwig van Beethoven (1770–1827)', 'creators': ['Ludwig van Beethoven (1770–1827)'],
}, },
}, { }, {
'url': 'https://musescore.com/minh_cuteee/scores/6555384', 'url': 'https://musescore.com/minh_cuteee/scores/6555384',
'only_matching': True, 'only_matching': True,
}] }]
@staticmethod
def _generate_auth_token(video_id):
return hashlib.md5((video_id + 'mp30gs').encode()).hexdigest()[:4]
def _real_extract(self, url): def _real_extract(self, url):
webpage = self._download_webpage(url, None) webpage = self._download_webpage(url, None)
url = self._og_search_url(webpage) or url url = self._og_search_url(webpage) or url
video_id = self._match_id(url) video_id = self._match_id(url)
mp3_url = self._download_json(f'https://musescore.com/api/jmuse?id={video_id}&index=0&type=mp3&v2=1', video_id, mp3_url = self._download_json(
headers={'authorization': '63794e5461e4cfa046edfbdddfccc1ac16daffd2'})['info']['url'] 'https://musescore.com/api/jmuse', video_id,
headers={'authorization': self._generate_auth_token(video_id)},
query={'id': video_id, 'index': '0', 'type': 'mp3'})['info']['url']
formats = [{ formats = [{
'url': mp3_url, 'url': mp3_url,
'ext': 'mp3', 'ext': 'mp3',
@@ -57,7 +65,7 @@ def _real_extract(self, url):
'id': video_id, 'id': video_id,
'formats': formats, 'formats': formats,
'title': self._og_search_title(webpage), 'title': self._og_search_title(webpage),
'description': self._og_search_description(webpage), 'description': self._html_search_meta('description', webpage, 'description'),
'thumbnail': self._og_search_thumbnail(webpage), 'thumbnail': self._og_search_thumbnail(webpage),
'uploader': self._html_search_meta('musescore:author', webpage, 'uploader'), 'uploader': self._html_search_meta('musescore:author', webpage, 'uploader'),
'creator': self._html_search_meta('musescore:composer', webpage, 'composer'), 'creator': self._html_search_meta('musescore:composer', webpage, 'composer'),
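The MuseScore hunks above replace a hard-coded authorization header with a token derived from the score ID. A minimal standalone sketch of that derivation, assuming only what the diff shows (MD5 of the ID concatenated with the literal 'mp30gs', truncated to four hex characters). The free function names and the `build_request` helper are illustrative and not part of yt-dlp:

```python
import hashlib
from urllib.parse import urlencode


def generate_auth_token(video_id: str) -> str:
    # Same derivation as MuseScoreIE._generate_auth_token in the diff
    return hashlib.md5((video_id + 'mp30gs').encode()).hexdigest()[:4]


def build_request(video_id: str):
    # Hypothetical helper: assembles the URL and headers that the
    # extractor hands to _download_json via its query= and headers= arguments
    query = urlencode({'id': video_id, 'index': '0', 'type': 'mp3'})
    url = f'https://musescore.com/api/jmuse?{query}'
    return url, {'authorization': generate_auth_token(video_id)}


if __name__ == '__main__':
    url, headers = build_request('142975')
    print(url)
    print(headers)
```

Because the token depends on the ID, it has to be computed per request, which is presumably why the diff moves it into a static method instead of keeping a constant.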


@@ -1,8 +1,8 @@
import json import json
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import float_or_none, parse_iso8601, str_or_none, try_call from ..utils import float_or_none, parse_iso8601, str_or_none, try_call, url_or_none
from ..utils.traversal import traverse_obj from ..utils.traversal import traverse_obj, value
class PrankCastIE(InfoExtractor): class PrankCastIE(InfoExtractor):
@@ -100,9 +100,38 @@ class PrankCastPostIE(InfoExtractor):
'duration': 263.287, 'duration': 263.287,
'cast': ['despicabledogs'], 'cast': ['despicabledogs'],
'description': 'https://imgur.com/a/vtxLvKU', 'description': 'https://imgur.com/a/vtxLvKU',
'categories': [],
'upload_date': '20240104', 'upload_date': '20240104',
}, },
}, {
'url': 'https://prankcast.com/drtomservo/posts/11988-butteye-s-late-night-stank-episode-1-part-1-',
'info_dict': {
'id': '11988',
'ext': 'mp3',
'title': 'Butteye\'s Late Night Stank Episode 1 (Part 1)',
'display_id': 'butteye-s-late-night-stank-episode-1-part-1-',
'timestamp': 1754238686,
'uploader': 'DrTomServo',
'channel_id': '136',
'duration': 2176.464,
'cast': ['DrTomServo'],
'description': '',
'upload_date': '20250803',
},
}, {
'url': 'https://prankcast.com/drtomservo/posts/12105-butteye-s-late-night-stank-episode-08-16-2025-part-2',
'info_dict': {
'id': '12105',
'ext': 'mp3',
'title': 'Butteye\'s Late Night Stank Episode 08-16-2025 Part 2',
'display_id': 'butteye-s-late-night-stank-episode-08-16-2025-part-2',
'timestamp': 1755453505,
'uploader': 'DrTomServo',
'channel_id': '136',
'duration': 19018.392,
'cast': ['DrTomServo'],
'description': '',
'upload_date': '20250817',
},
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@@ -112,26 +141,28 @@ def _real_extract(self, url):
post = self._search_nextjs_data(webpage, video_id)['props']['pageProps']['ssr_data_posts'] post = self._search_nextjs_data(webpage, video_id)['props']['pageProps']['ssr_data_posts']
content = self._parse_json(post['post_contents_json'], video_id)[0] content = self._parse_json(post['post_contents_json'], video_id)[0]
uploader = post.get('user_name')
guests_json = traverse_obj(content, ('guests_json', {json.loads}, {dict})) or {}
return { return {
'id': video_id, 'id': video_id,
'title': post.get('post_title') or self._og_search_title(webpage),
'display_id': display_id, 'display_id': display_id,
'url': content.get('url'), 'title': self._og_search_title(webpage),
'timestamp': parse_iso8601(content.get('start_date') or content.get('crdate'), ' '), **traverse_obj(post, {
'uploader': uploader, 'title': ('post_title', {str}),
'channel_id': str_or_none(post.get('user_id')), 'description': ('post_body', {str}),
'duration': float_or_none(content.get('duration')), 'tags': ('post_tags', {lambda x: x.split(',')}, ..., {str.strip}, filter),
'cast': list(filter(None, [uploader, *traverse_obj(guests_json, (..., 'name'))])), 'channel_id': ('user_id', {int}, {str_or_none}),
'description': post.get('post_body'), 'uploader': ('user_name', {str}),
'categories': list(filter(None, [content.get('category')])), }),
'tags': try_call(lambda: list(filter('', post['post_tags'].split(',')))), **traverse_obj(content, {
'subtitles': { 'url': (('secure_url', 'url'), {url_or_none}, any),
'live_chat': [{ 'timestamp': ((
'url': f'https://prankcast.com/api/private/chat/select-broadcast?id={post["content_id"]}&cache=', (('start_date', 'crdate'), {parse_iso8601(delimiter=' ')}),
'ext': 'json', ('created_at', {parse_iso8601}),
}], ), any),
} if post.get('content_id') else None, 'duration': ('duration', {float_or_none}),
'categories': ('category', {str}, filter, all, filter),
'cast': ((
{value(post.get('user_name'))},
('guests_json', {json.loads}, ..., 'name'),
), {str}, filter),
}),
} }
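The reworked PrankCastPost return value builds `tags` and `cast` through `traverse_obj` paths. As a rough plain-Python approximation of what those two paths appear to compute (this is not the yt-dlp implementation; `parse_tags` and `parse_cast` are hypothetical names, and the sample inputs are made up):

```python
import json


def parse_tags(post_tags):
    # Split on commas, trim whitespace, drop empty entries
    if not isinstance(post_tags, str):
        return []
    return [tag for tag in (part.strip() for part in post_tags.split(',')) if tag]


def parse_cast(uploader, guests_json):
    # guests_json is assumed to be a JSON string holding a list (or dict)
    # of objects that each carry a 'name' key
    try:
        guests = json.loads(guests_json) if guests_json else []
    except (TypeError, ValueError):
        guests = []
    if isinstance(guests, dict):
        guests = list(guests.values())
    elif not isinstance(guests, list):
        guests = []
    names = [uploader, *(g.get('name') for g in guests if isinstance(g, dict))]
    # Keep only non-empty strings, uploader first
    return [name for name in names if isinstance(name, str) and name]


if __name__ == '__main__':
    print(parse_tags('prank, ,calls,'))
    print(parse_cast('DrTomServo', '[{"name": "guest1"}, {"name": ""}]'))
```

One difference worth noting: `traverse_obj` omits a key entirely when nothing matches, whereas this sketch returns an empty list.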


@@ -248,35 +248,17 @@ class SlidesLiveIE(InfoExtractor):
'skip_download': 'm3u8', 'skip_download': 'm3u8',
}, },
}, { }, {
# /v3/ slides, .jpg and .png, service_name = youtube # /v3/ slides, .jpg and .png, formerly service_name = youtube, now native
'url': 'https://slideslive.com/embed/38932460/', 'url': 'https://slideslive.com/embed/38932460/',
'info_dict': { 'info_dict': {
'id': 'RTPdrgkyTiE', 'id': '38932460',
'display_id': '38932460',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Active Learning for Hierarchical Multi-Label Classification', 'title': 'Active Learning for Hierarchical Multi-Label Classification',
'description': 'Watch full version of this video at https://slideslive.com/38932460.', 'duration': 941,
'channel': 'SlidesLive Videos - A', 'thumbnail': r're:https?://.+/.+\.(?:jpg|png)',
'channel_id': 'UC62SdArr41t_-_fX40QCLRw',
'channel_url': 'https://www.youtube.com/channel/UC62SdArr41t_-_fX40QCLRw',
'uploader': 'SlidesLive Videos - A',
'uploader_id': '@slideslivevideos-a6075',
'uploader_url': 'https://www.youtube.com/@slideslivevideos-a6075',
'upload_date': '20200903',
'timestamp': 1697805922,
'duration': 942,
'age_limit': 0,
'live_status': 'not_live',
'playable_in_embed': True,
'availability': 'unlisted',
'categories': ['People & Blogs'],
'tags': [],
'channel_follower_count': int,
'like_count': int,
'view_count': int,
'thumbnail': r're:^https?://.*\.(?:jpg|png|webp)',
'thumbnails': 'count:21',
'chapters': 'count:20', 'chapters': 'count:20',
'timestamp': 1708338974,
'upload_date': '20240219',
}, },
'params': { 'params': {
'skip_download': 'm3u8', 'skip_download': 'm3u8',
@@ -425,7 +407,7 @@ def _real_extract(self, url):
player_token = self._search_regex(r'data-player-token="([^"]+)"', webpage, 'player token') player_token = self._search_regex(r'data-player-token="([^"]+)"', webpage, 'player token')
player_data = self._download_webpage( player_data = self._download_webpage(
f'https://ben.slideslive.com/player/{video_id}', video_id, f'https://slideslive.com/player/{video_id}', video_id,
note='Downloading player info', query={'player_token': player_token}) note='Downloading player info', query={'player_token': player_token})
player_info = self._extract_custom_m3u8_info(player_data) player_info = self._extract_custom_m3u8_info(player_data)
@@ -525,7 +507,7 @@ def entries():
yield info yield info
service_data = self._download_json( service_data = self._download_json(
f'https://ben.slideslive.com/player/{video_id}/slides_video_service_data', f'https://slideslive.com/player/{video_id}/slides_video_service_data',
video_id, fatal=False, query={ video_id, fatal=False, query={
'player_token': player_token, 'player_token': player_token,
'videos': ','.join(video_slides), 'videos': ','.join(video_slides),


@@ -438,7 +438,7 @@ class SoundcloudIE(SoundcloudBaseIE):
(?P<title>[\w\d-]+) (?P<title>[\w\d-]+)
(?:/(?P<token>(?!(?:albums|sets|recommended))[^?]+?))? (?:/(?P<token>(?!(?:albums|sets|recommended))[^?]+?))?
(?:[?].*)?$) (?:[?].*)?$)
|(?:api(?:-v2)?\.soundcloud\.com/tracks/(?P<track_id>\d+) |(?:api(?:-v2)?\.soundcloud\.com/tracks/(?:soundcloud%3Atracks%3A)?(?P<track_id>\d+)
(?:/?\?secret_token=(?P<secret_token>[^&]+))?) (?:/?\?secret_token=(?P<secret_token>[^&]+))?)
) )
''' '''
@@ -692,6 +692,9 @@ class SoundcloudIE(SoundcloudBaseIE):
# Go+ (account with active subscription needed) # Go+ (account with active subscription needed)
'url': 'https://soundcloud.com/taylorswiftofficial/look-what-you-made-me-do', 'url': 'https://soundcloud.com/taylorswiftofficial/look-what-you-made-me-do',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://api.soundcloud.com/tracks/soundcloud%3Atracks%3A1083788353',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
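The SoundCloud change makes the API branch of `_VALID_URL` accept an optional URL-encoded `soundcloud:tracks:` prefix before the numeric track ID. A reduced sketch of just that branch, with the rest of the extractor's pattern left out (the `API_TRACK_RE` name and the second sample URL are illustrative assumptions):

```python
import re

# Only the api/api-v2 alternative from the diff, isolated for illustration
API_TRACK_RE = re.compile(r'''(?x)
    https?://api(?:-v2)?\.soundcloud\.com/tracks/
    (?:soundcloud%3Atracks%3A)?(?P<track_id>\d+)
    (?:/?\?secret_token=(?P<secret_token>[^&]+))?
''')


def parse_api_url(url):
    # Returns (track_id, secret_token) or None when the URL does not match
    mobj = API_TRACK_RE.match(url)
    if not mobj:
        return None
    return mobj.group('track_id'), mobj.group('secret_token')


if __name__ == '__main__':
    print(parse_api_url('https://api.soundcloud.com/tracks/soundcloud%3Atracks%3A1083788353'))
    print(parse_api_url('https://api-v2.soundcloud.com/tracks/123?secret_token=s-abc'))
```

Making the prefix an optional non-capturing group keeps the `track_id` group purely numeric, so downstream code that expects digits is unaffected.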


@@ -1,12 +1,20 @@
import base64
import datetime as dt
import itertools import itertools
import json
import re
import time
from .common import InfoExtractor from .common import InfoExtractor
from ..networking import HEADRequest from ..networking.exceptions import HTTPError
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
encode_data_uri,
filter_dict,
int_or_none, int_or_none,
update_url_query, jwt_decode_hs256,
url_or_none, url_or_none,
urlencode_postdata,
urljoin, urljoin,
) )
from ..utils.traversal import traverse_obj from ..utils.traversal import traverse_obj
@@ -90,7 +98,7 @@ class TenPlayIE(InfoExtractor):
'only_matching': True, 'only_matching': True,
}] }]
_GEO_BYPASS = False _GEO_BYPASS = False
_GEO_COUNTRIES = ['AU']
_AUS_AGES = { _AUS_AGES = {
'G': 0, 'G': 0,
'PG': 15, 'PG': 15,
@@ -100,31 +108,155 @@ class TenPlayIE(InfoExtractor):
'R': 18, 'R': 18,
'X': 18, 'X': 18,
} }
_TOKEN_CACHE_KEY = 'token_data'
_SEGMENT_BITRATE_RE = r'(?m)-(?:300|150|75|55)0000-(\d+(?:-[\da-f]+)?)\.ts$'
_refresh_token = None
_access_token = None
@staticmethod
def _filter_ads_from_m3u8(m3u8_doc):
out = []
for line in m3u8_doc.splitlines():
if line.startswith('https://redirector.googlevideo.com/'):
out.pop()
continue
out.append(line)
return '\n'.join(out)
@staticmethod
def _generate_xnetwork_ten_auth_token():
ts = dt.datetime.now(dt.timezone.utc).strftime('%Y%m%d%H%M%S')
return base64.b64encode(ts.encode()).decode()
@staticmethod
def _is_jwt_expired(token):
return jwt_decode_hs256(token)['exp'] - time.time() < 300
def _refresh_access_token(self):
try:
refresh_data = self._download_json(
'https://10.com.au/api/token/refresh', None, 'Refreshing access token',
headers={
'Content-Type': 'application/json',
}, data=json.dumps({
'accessToken': self._access_token,
'refreshToken': self._refresh_token,
}).encode())
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 400:
self._refresh_token = self._access_token = None
self.cache.store(self._NETRC_MACHINE, self._TOKEN_CACHE_KEY, [None, None])
self.report_warning('Refresh token has been invalidated; retrying with credentials')
self._perform_login(*self._get_login_info())
return
raise
self._access_token = refresh_data['accessToken']
self._refresh_token = refresh_data['refreshToken']
self.cache.store(self._NETRC_MACHINE, self._TOKEN_CACHE_KEY, [self._refresh_token, self._access_token])
def _perform_login(self, username, password):
if not self._refresh_token:
self._refresh_token, self._access_token = self.cache.load(
self._NETRC_MACHINE, self._TOKEN_CACHE_KEY, default=[None, None])
if self._refresh_token and self._access_token:
self.write_debug('Using cached refresh token')
return
try:
auth_data = self._download_json(
'https://10.com.au/api/user/auth', None, 'Logging in',
headers={
'Content-Type': 'application/json',
'X-Network-Ten-Auth': self._generate_xnetwork_ten_auth_token(),
'Referer': 'https://10.com.au/',
}, data=json.dumps({
'email': username,
'password': password,
}).encode())
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 400:
raise ExtractorError('Invalid username/password', expected=True)
raise
self._refresh_token = auth_data['jwt']['refreshToken']
self._access_token = auth_data['jwt']['accessToken']
self.cache.store(self._NETRC_MACHINE, self._TOKEN_CACHE_KEY, [self._refresh_token, self._access_token])
def _call_playback_api(self, content_id):
if self._access_token and self._is_jwt_expired(self._access_token):
self._refresh_access_token()
for is_retry in (False, True):
try:
return self._download_json_handle(
f'https://10.com.au/api/v1/videos/playback/{content_id}/', content_id,
note='Downloading video JSON', query={'platform': 'samsung'},
headers=filter_dict({
'TP-AcceptFeature': 'v1/fw;v1/drm',
'Authorization': f'Bearer {self._access_token}' if self._access_token else None,
}))
except ExtractorError as e:
if not is_retry and isinstance(e.cause, HTTPError) and e.cause.status == 403:
if self._access_token:
self.to_screen('Access token has expired; refreshing')
self._refresh_access_token()
continue
elif not self._get_login_info()[0]:
self.raise_login_required('Login required to access this video', method='password')
raise
def _real_extract(self, url): def _real_extract(self, url):
content_id = self._match_id(url) content_id = self._match_id(url)
data = self._download_json( try:
'https://10.com.au/api/v1/videos/' + content_id, content_id) data = self._download_json(f'https://10.com.au/api/v1/videos/{content_id}', content_id)
except ExtractorError as e:
if (
isinstance(e.cause, HTTPError) and e.cause.status == 403
and 'Error 54113' in e.cause.response.read().decode()
):
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
raise
video_data = self._download_json( video_data, urlh = self._call_playback_api(content_id)
f'https://vod.ten.com.au/api/videos/bcquery?command=find_videos_by_id&video_id={data["altId"]}', content_source_id = video_data['dai']['contentSourceId']
content_id, 'Downloading video JSON') video_id = video_data['dai']['videoId']
# Dash URL 404s, changing the m3u8 format works auth_token = urlh.get_header('x-dai-auth')
m3u8_url = self._request_webpage( if not auth_token:
HEADRequest(update_url_query(video_data['items'][0]['dashManifestUrl'], { raise ExtractorError('Failed to get DAI auth token')
'manifest': 'm3u',
})), dai_data = self._download_json(
content_id, 'Checking stream URL').url f'https://pubads.g.doubleclick.net/ondemand/hls/content/{content_source_id}/vid/{video_id}/streams',
if '10play-not-in-oz' in m3u8_url: content_id, note='Downloading DAI JSON',
self.raise_geo_restricted(countries=['AU']) data=urlencode_postdata({'auth-token': auth_token}))
if '10play_unsupported' in m3u8_url:
raise ExtractorError('Unable to extract stream') # Ignore subs to avoid ad break cleanup
# Attempt to get a higher quality stream formats, _ = self._extract_m3u8_formats_and_subtitles(
formats = self._extract_m3u8_formats( dai_data['stream_manifest'], content_id, 'mp4')
m3u8_url.replace(',150,75,55,0000', ',500,300,150,75,55,0000'),
content_id, 'mp4', fatal=False) already_have_1080p = False
if not formats: for fmt in formats:
formats = self._extract_m3u8_formats(m3u8_url, content_id, 'mp4') m3u8_doc = self._download_webpage(
fmt['url'], content_id, note='Downloading m3u8 information')
m3u8_doc = self._filter_ads_from_m3u8(m3u8_doc)
fmt['hls_media_playlist_data'] = m3u8_doc
if fmt.get('height') == 1080:
already_have_1080p = True
# Attempt format upgrade
if not already_have_1080p and m3u8_doc and re.search(self._SEGMENT_BITRATE_RE, m3u8_doc):
m3u8_doc = re.sub(self._SEGMENT_BITRATE_RE, r'-5000000-\1.ts', m3u8_doc)
m3u8_doc = re.sub(r'-(?:300|150|75|55)0000\.key"', r'-5000000.key"', m3u8_doc)
formats.append({
'format_id': 'upgrade-attempt-1080p',
'url': encode_data_uri(m3u8_doc.encode(), 'application/x-mpegurl'),
'hls_media_playlist_data': m3u8_doc,
'width': 1920,
'height': 1080,
'ext': 'mp4',
'protocol': 'm3u8_native',
'__needs_testing': True,
})
return { return {
'id': content_id, 'id': content_id,
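The 10play rework does two things to each HLS media playlist: it strips ad segments served from `redirector.googlevideo.com`, and it rewrites segment and key names to attempt a 5000000 bitrate rendition. The sketch below reproduces both steps outside the extractor. The regex and the filtering loop are copied from the diff; the function names, the sample playlist, and the `example.invalid` host are made up for illustration:

```python
import re

SEGMENT_BITRATE_RE = r'(?m)-(?:300|150|75|55)0000-(\d+(?:-[\da-f]+)?)\.ts$'


def filter_ads_from_m3u8(m3u8_doc):
    out = []
    for line in m3u8_doc.splitlines():
        if line.startswith('https://redirector.googlevideo.com/'):
            out.pop()  # drop the tag line that precedes the ad segment URL
            continue
        out.append(line)
    return '\n'.join(out)


def upgrade_bitrate(m3u8_doc):
    # Rewrite segment names, then the matching AES key names
    doc = re.sub(SEGMENT_BITRATE_RE, r'-5000000-\1.ts', m3u8_doc)
    return re.sub(r'-(?:300|150|75|55)0000\.key"', r'-5000000.key"', doc)


# Fabricated playlist: one ad segment followed by one content segment
SAMPLE = '\n'.join([
    '#EXTM3U',
    '#EXTINF:6.0,',
    'https://redirector.googlevideo.com/ad-segment.ts',
    '#EXT-X-KEY:METHOD=AES-128,URI="https://example.invalid/video-3000000.key"',
    '#EXTINF:6.0,',
    'https://example.invalid/video-3000000-1.ts',
])

if __name__ == '__main__':
    cleaned = filter_ads_from_m3u8(SAMPLE)
    print(upgrade_bitrate(cleaned))
```

Note that the filter assumes every ad URL is preceded by exactly one tag line. The extractor marks the rewritten format with `'__needs_testing': True`, which suggests the upgraded URLs are not guaranteed to exist.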


@@ -220,7 +220,7 @@ def _extract_aweme_app(self, aweme_id):
def _extract_web_data_and_status(self, url, video_id, fatal=True): def _extract_web_data_and_status(self, url, video_id, fatal=True):
video_data, status = {}, -1 video_data, status = {}, -1
res = self._download_webpage_handle(url, video_id, fatal=fatal, headers={'User-Agent': 'Mozilla/5.0'}) res = self._download_webpage_handle(url, video_id, fatal=fatal, impersonate=True)
if res is False: if res is False:
return video_data, status return video_data, status
@@ -1071,12 +1071,15 @@ def _real_extract(self, url):
webpage = self._download_webpage( webpage = self._download_webpage(
self._UPLOADER_URL_FORMAT % user_name, user_name, self._UPLOADER_URL_FORMAT % user_name, user_name,
'Downloading user webpage', 'Unable to download user webpage', 'Downloading user webpage', 'Unable to download user webpage',
fatal=False, headers={'User-Agent': 'Mozilla/5.0'}) or '' fatal=False, impersonate=True) or ''
detail = traverse_obj( detail = traverse_obj(
self._get_universal_data(webpage, user_name), ('webapp.user-detail', {dict})) or {} self._get_universal_data(webpage, user_name), ('webapp.user-detail', {dict})) or {}
if detail.get('statusCode') == 10222: video_count = traverse_obj(detail, ('userInfo', ('stats', 'statsV2'), 'videoCount', {int}, any))
if not video_count and detail.get('statusCode') == 10222:
self.raise_login_required( self.raise_login_required(
'This user\'s account is private. Log into an account that has access') 'This user\'s account is private. Log into an account that has access')
elif video_count == 0:
raise ExtractorError('This account does not have any videos posted', expected=True)
sec_uid = traverse_obj(detail, ('userInfo', 'user', 'secUid', {str})) sec_uid = traverse_obj(detail, ('userInfo', 'user', 'secUid', {str}))
if sec_uid: if sec_uid:
fail_early = not traverse_obj(detail, ('userInfo', 'itemList', ...)) fail_early = not traverse_obj(detail, ('userInfo', 'itemList', ...))
@@ -1520,7 +1523,7 @@ def _real_extract(self, url):
uploader, room_id = self._match_valid_url(url).group('uploader', 'id') uploader, room_id = self._match_valid_url(url).group('uploader', 'id')
if not room_id: if not room_id:
webpage = self._download_webpage( webpage = self._download_webpage(
format_field(uploader, None, self._UPLOADER_URL_FORMAT), uploader) format_field(uploader, None, self._UPLOADER_URL_FORMAT), uploader, impersonate=True)
room_id = traverse_obj( room_id = traverse_obj(
self._get_universal_data(webpage, uploader), self._get_universal_data(webpage, uploader),
('webapp.user-detail', 'userInfo', 'user', 'roomId', {str})) ('webapp.user-detail', 'userInfo', 'user', 'roomId', {str}))
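The TikTok user hunk changes when a private-account error is raised: status code 10222 now only triggers the login prompt if no video count is available, and an explicit count of zero produces a separate error. A plain-Python sketch of that decision, assuming the `detail` layout implied by the traversal paths in the diff (the function names and return labels are hypothetical):

```python
def first_video_count(detail):
    # Approximates ('userInfo', ('stats', 'statsV2'), 'videoCount', {int}, any):
    # the first integer found under 'stats' or 'statsV2' wins
    user_info = detail.get('userInfo')
    if not isinstance(user_info, dict):
        return None
    for key in ('stats', 'statsV2'):
        stats = user_info.get(key)
        if isinstance(stats, dict) and isinstance(stats.get('videoCount'), int):
            return stats['videoCount']
    return None


def classify_account(detail):
    video_count = first_video_count(detail)
    if not video_count and detail.get('statusCode') == 10222:
        return 'login_required'  # extractor calls raise_login_required here
    if video_count == 0:
        return 'no_videos'  # extractor raises an expected ExtractorError here
    return 'ok'


if __name__ == '__main__':
    print(classify_account({'statusCode': 10222}))
    print(classify_account({'statusCode': 10222, 'userInfo': {'stats': {'videoCount': 5}}}))
    print(classify_account({'statusCode': 0, 'userInfo': {'stats': {'videoCount': 0}}}))
```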


@@ -58,6 +58,20 @@ def _get_direct_subtitles(self, caption_json):
return subs return subs
def _get_additional_metadata(self, video_id):
additional_metadata = self._download_json(
f'https://play.vidyard.com/video/{video_id}', video_id,
note='Downloading additional metadata', fatal=False)
return traverse_obj(additional_metadata, {
'title': ('name', {str}),
'duration': ('seconds', {int_or_none}),
'thumbnails': ('thumbnailUrl', {'url': {url_or_none}}, all),
'chapters': ('videoSections', lambda _, v: float_or_none(v['milliseconds']) is not None, {
'title': ('title', {str}),
'start_time': ('milliseconds', {float_or_none(scale=1000)}),
}),
})
def _fetch_video_json(self, video_id): def _fetch_video_json(self, video_id):
return self._download_json( return self._download_json(
f'https://play.vidyard.com/player/{video_id}.json', video_id)['payload'] f'https://play.vidyard.com/player/{video_id}.json', video_id)['payload']
@@ -67,6 +81,7 @@ def _process_video_json(self, json_data, video_id):
self._merge_subtitles(self._get_direct_subtitles(json_data.get('captions')), target=subtitles) self._merge_subtitles(self._get_direct_subtitles(json_data.get('captions')), target=subtitles)
return { return {
**self._get_additional_metadata(json_data['facadeUuid']),
**traverse_obj(json_data, { **traverse_obj(json_data, {
'id': ('facadeUuid', {str}), 'id': ('facadeUuid', {str}),
'display_id': ('videoId', {int}, {str_or_none}), 'display_id': ('videoId', {int}, {str_or_none}),
@@ -113,6 +128,29 @@ class VidyardIE(VidyardBaseIE):
'thumbnail': 'https://cdn.vidyard.com/thumbnails/spacer.gif', 'thumbnail': 'https://cdn.vidyard.com/thumbnails/spacer.gif',
'duration': 41.186, 'duration': 41.186,
}, },
}, {
'url': 'https://share.vidyard.com/watch/wL237MtNgZUHo6e8WPiJbF',
'info_dict': {
'id': 'wL237MtNgZUHo6e8WPiJbF',
'display_id': '25926870',
'ext': 'mp4',
'title': 'Adding & Editing Video Chapters',
'thumbnail': 'https://cdn.vidyard.com/thumbnails/25926870/bvSEZS3dGY7DByQ_bzB57avIZ_hsvhr4_small.jpg',
'duration': 135.46,
'chapters': [{
'title': 'Adding new chapters',
'start_time': 0,
}, {
'title': 'Previewing your video',
'start_time': 74,
}, {
'title': 'Editing your chapters',
'start_time': 91,
}, {
'title': 'Share a link to a specific chapter',
'start_time': 105,
}],
},
}, { }, {
'url': 'https://embed.vidyard.com/share/oTDMPlUv--51Th455G5u7Q', 'url': 'https://embed.vidyard.com/share/oTDMPlUv--51Th455G5u7Q',
'info_dict': { 'info_dict': {
@@ -132,8 +170,8 @@ class VidyardIE(VidyardBaseIE):
'id': 'SyStyHtYujcBHe5PkZc5DL', 'id': 'SyStyHtYujcBHe5PkZc5DL',
'display_id': '41974005', 'display_id': '41974005',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Prepare the Frame and Track for Palm Beach Polysatin Shutters With BiFold Track', 'title': 'Install Palm Beach Shutters with a Bi-Fold Track System (Video 1 of 6)',
'description': r're:In this video, you will learn how to prepare the frame.+', 'description': r're:In this video, you will learn the first step.+',
'thumbnail': 'https://cdn.vidyard.com/thumbnails/41974005/IJw7oCaJcF1h7WWu3OVZ8A_small.png', 'thumbnail': 'https://cdn.vidyard.com/thumbnails/41974005/IJw7oCaJcF1h7WWu3OVZ8A_small.png',
'duration': 258.666, 'duration': 258.666,
}, },
@@ -147,42 +185,42 @@ class VidyardIE(VidyardBaseIE):
'id': 'SyStyHtYujcBHe5PkZc5DL', 'id': 'SyStyHtYujcBHe5PkZc5DL',
'display_id': '41974005', 'display_id': '41974005',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Prepare the Frame and Track for Palm Beach Polysatin Shutters With BiFold Track', 'title': 'Install Palm Beach Shutters with a Bi-Fold Track System (Video 1 of 6)',
'thumbnail': 'https://cdn.vidyard.com/thumbnails/41974005/IJw7oCaJcF1h7WWu3OVZ8A_small.png', 'thumbnail': 'https://cdn.vidyard.com/thumbnails/41974005/IJw7oCaJcF1h7WWu3OVZ8A_small.png',
'duration': 258.666, 'duration': 258.666,
}, { }, {
'id': '1Fw4B84jZTXLXWqkE71RiM', 'id': '1Fw4B84jZTXLXWqkE71RiM',
'display_id': '5861113', 'display_id': '5861113',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Palm Beach - Bi-Fold Track System "Frame Installation"', 'title': 'Install Palm Beach Shutters with a Bi-Fold Track System (Video 2 of 6)',
'thumbnail': 'https://cdn.vidyard.com/thumbnails/5861113/29CJ54s5g1_aP38zkKLHew_small.jpg', 'thumbnail': 'https://cdn.vidyard.com/thumbnails/5861113/29CJ54s5g1_aP38zkKLHew_small.jpg',
'duration': 167.858, 'duration': 167.858,
}, { }, {
'id': 'DqP3wBvLXSpxrcqpT5kEeo', 'id': 'DqP3wBvLXSpxrcqpT5kEeo',
'display_id': '41976334', 'display_id': '41976334',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Install the Track for Palm Beach Polysatin Shutters With BiFold Track', 'title': 'Install Palm Beach Shutters with a Bi-Fold Track System (Video 3 of 6)',
'thumbnail': 'https://cdn.vidyard.com/thumbnails/5861090/RwG2VaTylUa6KhSTED1r1Q_small.png', 'thumbnail': 'https://cdn.vidyard.com/thumbnails/5861090/RwG2VaTylUa6KhSTED1r1Q_small.png',
'duration': 94.229, 'duration': 94.229,
}, { }, {
'id': 'opfybfxpzQArxqtQYB6oBU', 'id': 'opfybfxpzQArxqtQYB6oBU',
'display_id': '41976364', 'display_id': '41976364',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Install the Panel for Palm Beach Polysatin Shutters With BiFold Track', 'title': 'Install Palm Beach Shutters with a Bi-Fold Track System (Video 4 of 6)',
'thumbnail': 'https://cdn.vidyard.com/thumbnails/5860926/JIOaJR08dM4QgXi_iQ2zGA_small.png', 'thumbnail': 'https://cdn.vidyard.com/thumbnails/5860926/JIOaJR08dM4QgXi_iQ2zGA_small.png',
'duration': 191.467, 'duration': 191.467,
}, { }, {
'id': 'rWrXvkbTNNaNqD6189HJya', 'id': 'rWrXvkbTNNaNqD6189HJya',
'display_id': '41976382', 'display_id': '41976382',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Adjust the Panels for Palm Beach Polysatin Shutters With BiFold Track', 'title': 'Install Palm Beach Shutters with a Bi-Fold Track System (Video 5 of 6)',
'thumbnail': 'https://cdn.vidyard.com/thumbnails/5860687/CwHxBv4UudAhOh43FVB4tw_small.png', 'thumbnail': 'https://cdn.vidyard.com/thumbnails/5860687/CwHxBv4UudAhOh43FVB4tw_small.png',
'duration': 138.155, 'duration': 138.155,
}, { }, {
'id': 'eYPTB521MZ9TPEArSethQ5', 'id': 'eYPTB521MZ9TPEArSethQ5',
'display_id': '41976409', 'display_id': '41976409',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Assemble and Install the Valance for Palm Beach Polysatin Shutters With BiFold Track', 'title': 'Install Palm Beach Shutters with a Bi-Fold Track System (Video 6 of 6)',
'thumbnail': 'https://cdn.vidyard.com/thumbnails/5861425/0y68qlMU4O5VKU7bJ8i_AA_small.png', 'thumbnail': 'https://cdn.vidyard.com/thumbnails/5861425/0y68qlMU4O5VKU7bJ8i_AA_small.png',
'duration': 148.224, 'duration': 148.224,
}], }],
@@ -191,6 +229,7 @@ class VidyardIE(VidyardBaseIE):
}, { }, {
# Non hubs.vidyard.com playlist # Non hubs.vidyard.com playlist
'url': 'https://salesforce.vidyard.com/watch/d4vqPjs7Q5EzVEis5QT3jd', 'url': 'https://salesforce.vidyard.com/watch/d4vqPjs7Q5EzVEis5QT3jd',
'skip': 'URL now 404s. Alternative non hubs.vidyard.com playlist not yet available',
'info_dict': { 'info_dict': {
'id': 'd4vqPjs7Q5EzVEis5QT3jd', 'id': 'd4vqPjs7Q5EzVEis5QT3jd',
'title': 'How To: Service Cloud: Import External Content in Lightning Knowledge', 'title': 'How To: Service Cloud: Import External Content in Lightning Knowledge',
@@ -300,6 +339,7 @@ class VidyardIE(VidyardBaseIE):
}, { }, {
# <script ... id="vidyard_embed_code_DXx2sW4WaLA6hTdGFz7ja8" src="//play.vidyard.com/DXx2sW4WaLA6hTdGFz7ja8.js? # <script ... id="vidyard_embed_code_DXx2sW4WaLA6hTdGFz7ja8" src="//play.vidyard.com/DXx2sW4WaLA6hTdGFz7ja8.js?
'url': 'http://videos.vivint.com/watch/DXx2sW4WaLA6hTdGFz7ja8', 'url': 'http://videos.vivint.com/watch/DXx2sW4WaLA6hTdGFz7ja8',
'skip': 'URL certificate expired 2025-09-10. Alternative script embed test case not yet available',
'info_dict': { 'info_dict': {
'id': 'DXx2sW4WaLA6hTdGFz7ja8', 'id': 'DXx2sW4WaLA6hTdGFz7ja8',
'display_id': '2746529', 'display_id': '2746529',
@@ -317,11 +357,12 @@ class VidyardIE(VidyardBaseIE):
'ext': 'mp4', 'ext': 'mp4',
'title': 'Lesson 1 - Opening an MT4 Account', 'title': 'Lesson 1 - Opening an MT4 Account',
'description': 'Never heard of MetaTrader4? Here\'s the 411 on the popular trading platform!', 'description': 'Never heard of MetaTrader4? Here\'s the 411 on the popular trading platform!',
'duration': 168, 'duration': 168.16,
'thumbnail': 'https://cdn.vidyard.com/thumbnails/20291/IM-G2WXQR9VBLl2Cmzvftg_small.jpg', 'thumbnail': 'https://cdn.vidyard.com/thumbnails/20291/IM-G2WXQR9VBLl2Cmzvftg_small.jpg',
}, },
}, { }, {
# <iframe ... src="//play.vidyard.com/d61w8EQoZv1LDuPxDkQP2Q/type/background?preview=1" # <iframe ... src="//play.vidyard.com/d61w8EQoZv1LDuPxDkQP2Q/type/background?preview=1"
'skip': 'URL changed embed method to \'class="vidyard-player-embed"\'. An alternative iframe embed test case is not yet available',
'url': 'https://www.avaya.com/en/', 'url': 'https://www.avaya.com/en/',
'info_dict': { 'info_dict': {
# These values come from the generic extractor and don't matter # These values come from the generic extractor and don't matter
@@ -354,46 +395,18 @@ class VidyardIE(VidyardBaseIE):
         }],
         'playlist_count': 2,
     }, {
-        # <div class="vidyard-player-embed" data-uuid="vpCWTVHw3qrciLtVY94YkS"
-        'url': 'https://www.gogoair.com/',
+        # <div class="vidyard-player-embed" data-uuid="pMk8eNCYzukzJaEPoo1Hgn"
+        # URL previously used iframe embeds and was used for that test case
+        'url': 'https://www.avaya.com/en/',
         'info_dict': {
-            # These values come from the generic extractor and don't matter
-            'id': str,
-            'title': str,
-            'description': str,
-            'age_limit': 0,
-        },
-        'playlist': [{
-            'info_dict': {
-                'id': 'vpCWTVHw3qrciLtVY94YkS',
-                'display_id': '40780699',
-                'ext': 'mp4',
-                'title': 'Upgrade to AVANCE 100% worth it - Jason Talley, Owner and Pilot, Testimonial',
-                'description': 'md5:f609824839439a51990cef55ffc472aa',
-                'duration': 70.737,
-                'thumbnail': 'https://cdn.vidyard.com/thumbnails/40780699/KzjfYZz5MZl2gHF_e-4i2c6ib1cLDweQ_small.jpg',
-            },
-        }, {
-            'info_dict': {
-                'id': 'xAmV9AsLbnitCw35paLBD8',
-                'display_id': '31130867',
-                'ext': 'mp4',
-                'title': 'Brad Keselowski goes faster with Gogo AVANCE inflight Wi-Fi',
-                'duration': 132.565,
-                'thumbnail': 'https://cdn.vidyard.com/thumbnails/31130867/HknyDtLdm2Eih9JZ4A5XLjhfBX_6HRw5_small.jpg',
-            },
-        }, {
-            'info_dict': {
-                'id': 'RkkrFRNxfP79nwCQavecpF',
-                'display_id': '39009815',
-                'ext': 'mp4',
-                'title': 'Live Demo of Gogo Galileo',
-                'description': 'md5:e2df497236f4e12c3fef8b392b5f23e0',
-                'duration': 112.128,
-                'thumbnail': 'https://cdn.vidyard.com/thumbnails/38144873/CWLlxfUbJ4Gh0ThbUum89IsEM4yupzMb_small.jpg',
-            },
-        }],
-        'playlist_count': 3,
+            'id': 'pMk8eNCYzukzJaEPoo1Hgn',
+            'display_id': '47074153',
+            'ext': 'mp4',
+            'title': 'Avaya Infinity Helps Redefine the Contact Center as Your Connection Center',
+            'description': r're:Our mission is to help you turn single engagements.+',
+            'duration': 81.55,
+            'thumbnail': 'https://cdn.vidyard.com/thumbnails/47074153/MZOLKhXdbiUWwp2ROnT5HaXL0oau6JtR_small.jpg',
+        },
     }]

     @classmethod


@@ -2,6 +2,7 @@
 import codecs
 import itertools
 import re
+import string

 from .common import InfoExtractor
 from ..utils import (
@@ -22,6 +23,47 @@
 )


+def to_signed_32(n):
+    return n % ((-1 if n < 0 else 1) * 2**32)
+
+
+class _ByteGenerator:
+    def __init__(self, algo_id, seed):
+        try:
+            self._algorithm = getattr(self, f'_algo{algo_id}')
+        except AttributeError:
+            raise ExtractorError(f'Unknown algorithm ID: {algo_id}')
+        self._s = to_signed_32(seed)
+
+    def _algo1(self, s):
+        # LCG (a=1664525, c=1013904223, m=2^32)
+        # Ref: https://en.wikipedia.org/wiki/Linear_congruential_generator
+        s = self._s = to_signed_32(s * 1664525 + 1013904223)
+        return s
+
+    def _algo2(self, s):
+        # xorshift32
+        # Ref: https://en.wikipedia.org/wiki/Xorshift
+        s = to_signed_32(s ^ (s << 13))
+        s = to_signed_32(s ^ ((s & 0xFFFFFFFF) >> 17))
+        s = self._s = to_signed_32(s ^ (s << 5))
+        return s
+
+    def _algo3(self, s):
+        # Weyl Sequence (k≈2^32*φ, m=2^32) + MurmurHash3 (fmix32)
+        # Ref: https://en.wikipedia.org/wiki/Weyl_sequence
+        #      https://commons.apache.org/proper/commons-codec/jacoco/org.apache.commons.codec.digest/MurmurHash3.java.html
+        s = self._s = to_signed_32(s + 0x9e3779b9)
+        s = to_signed_32(s ^ ((s & 0xFFFFFFFF) >> 16))
+        s = to_signed_32(s * to_signed_32(0x85ebca77))
+        s = to_signed_32(s ^ ((s & 0xFFFFFFFF) >> 13))
+        s = to_signed_32(s * to_signed_32(0xc2b2ae3d))
+        return to_signed_32(s ^ ((s & 0xFFFFFFFF) >> 16))
+
+    def __next__(self):
+        return self._algorithm(self._s) & 0xFF
+
+
 class XHamsterIE(InfoExtractor):
     _DOMAINS = r'(?:xhamster\.(?:com|one|desi)|xhms\.pro|xhamster\d+\.(?:com|desi)|xhday\.com|xhvid\.com)'
     _VALID_URL = rf'''(?x)
@@ -146,6 +188,12 @@ class XHamsterIE(InfoExtractor):
     _XOR_KEY = b'xh7999'

     def _decipher_format_url(self, format_url, format_id):
+        if all(char in string.hexdigits for char in format_url):
+            byte_data = bytes.fromhex(format_url)
+            seed = int.from_bytes(byte_data[1:5], byteorder='little', signed=True)
+            byte_gen = _ByteGenerator(byte_data[0], seed)
+            return bytearray(byte ^ next(byte_gen) for byte in byte_data[5:]).decode('latin-1')
+
         cipher_type, _, ciphertext = try_call(
             lambda: base64.b64decode(format_url).decode().partition('_')) or [None] * 3

@@ -164,6 +212,16 @@ def _decipher_format_url(self, format_url, format_id):
         self.report_warning(f'Skipping format "{format_id}": unsupported cipher type "{cipher_type}"')
         return None

+    def _fixup_formats(self, formats):
+        for f in formats:
+            if f.get('vcodec'):
+                continue
+            for vcodec in ('av1', 'h264'):
+                if any(f'.{vcodec}.' in f_url for f_url in (f['url'], f.get('manifest_url', ''))):
+                    f['vcodec'] = vcodec
+                    break
+        return formats
+
     def _real_extract(self, url):
         mobj = self._match_valid_url(url)
         video_id = mobj.group('id') or mobj.group('id_2')
@@ -312,7 +370,8 @@ def get_height(s):
                 'comment_count': int_or_none(video.get('comments')),
                 'age_limit': age_limit if age_limit is not None else 18,
                 'categories': categories,
-                'formats': formats,
+                'formats': self._fixup_formats(formats),
+                '_format_sort_fields': ('res', 'proto', 'tbr'),
             }

         # Old layout fallback
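
The hex-encoded branch added to `_decipher_format_url` in this file reads one algorithm byte, then a 4-byte little-endian signed seed, and XORs the remaining bytes with a generated keystream. A minimal round-trip sketch of that layout, restricted to the LCG variant (`_algo1`), could look like the following. The names `LcgByteGenerator`, `encode_url`, and `decode_url` are hypothetical and not part of the extractor. The encoder exists only to produce input for the decoder; it is an assumption about how such a payload could be built, not the site's actual code.

```python
def to_signed_32(n):
    # Same reduction as in the diff: keeps the sign of n, magnitude below 2**32.
    return n % ((-1 if n < 0 else 1) * 2**32)


class LcgByteGenerator:
    """Hypothetical stand-in for _ByteGenerator with algorithm ID 1 only."""

    def __init__(self, seed):
        self._s = to_signed_32(seed)

    def __next__(self):
        # LCG step (a=1664525, c=1013904223), then keep the low byte.
        self._s = to_signed_32(self._s * 1664525 + 1013904223)
        return self._s & 0xFF


def _xor_with_keystream(data, seed):
    gen = LcgByteGenerator(seed)
    return bytes(byte ^ next(gen) for byte in data)


def encode_url(url, seed):
    # Assumed layout: [algo_id: 1 byte][seed: 4 bytes LE, signed][ciphertext]
    header = bytes([1]) + seed.to_bytes(4, byteorder='little', signed=True)
    return (header + _xor_with_keystream(url.encode('latin-1'), seed)).hex()


def decode_url(hex_string):
    raw = bytes.fromhex(hex_string)
    if raw[0] != 1:
        raise ValueError(f'Unknown algorithm ID: {raw[0]}')
    seed = int.from_bytes(raw[1:5], byteorder='little', signed=True)
    return _xor_with_keystream(raw[5:], seed).decode('latin-1')


if __name__ == '__main__':
    sample = 'https://example.com/video.h264.mp4'
    print(decode_url(encode_url(sample, seed=-12345)) == sample)
```

Because XOR is its own inverse, the same keystream both encodes and decodes, which is why only the seed and algorithm ID need to travel in the header.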


@@ -1196,7 +1196,7 @@ def extract_relative_time(relative_time_text):
         except ValueError:
             return None

-    def _parse_time_text(self, text):
+    def _parse_time_text(self, text, report_failure=True):
         if not text:
             return
         dt_ = self.extract_relative_time(text)
@@ -1211,7 +1211,7 @@ def _parse_time_text(self, text):
                         (r'([a-z]+\s*\d{1,2},?\s*20\d{2})', r'(?:.+|^)(?:live|premieres|ed|ing)(?:\s*(?:on|for))?\s*(.+\d)'),
                         text.lower(), 'time text', default=None)))

-        if text and timestamp is None and self._preferred_lang in (None, 'en'):
+        if report_failure and text and timestamp is None and self._preferred_lang in (None, 'en'):
             self.report_warning(
                 f'Cannot parse localized time text "{text}"', only_once=True)
         return timestamp


@@ -341,7 +341,11 @@ def _extract_lockup_view_model(self, view_model):
                 'contentImage', *thumb_keys, 'thumbnailViewModel', 'image'), final_key='sources'),
             duration=traverse_obj(view_model, (
                 'contentImage', 'thumbnailViewModel', 'overlays', ..., 'thumbnailOverlayBadgeViewModel',
-                'thumbnailBadges', ..., 'thumbnailBadgeViewModel', 'text', {parse_duration}, any)))
+                'thumbnailBadges', ..., 'thumbnailBadgeViewModel', 'text', {parse_duration}, any)),
+            timestamp=(traverse_obj(view_model, (
+                'metadata', 'lockupMetadataViewModel', 'metadata', 'contentMetadataViewModel', 'metadataRows',
+                ..., 'metadataParts', ..., 'text', 'content', {lambda t: self._parse_time_text(t, report_failure=False)}, any))
+                if self._configuration_arg('approximate_date', ie_key=YoutubeTabIE) else None))

     def _rich_entries(self, rich_grid_renderer):
         if lockup_view_model := traverse_obj(rich_grid_renderer, ('content', 'lockupViewModel', {dict})):
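
The `timestamp=` change in this file runs `_parse_time_text` over every metadata part of a lockup view model and keeps the first value that parses. Since most parts are not dates at all, the new `report_failure=False` argument keeps those misses from producing warnings. A rough standalone sketch of that idea follows. The function names, the regular expression, and the fixed 30-day month and 365-day year are all assumptions made for illustration; yt-dlp's own relative-time parsing is implemented differently.

```python
import datetime as dt
import re

# Hypothetical unit table; months and years are deliberately approximate.
_UNIT_SECONDS = {
    'second': 1, 'minute': 60, 'hour': 3600, 'day': 86400,
    'week': 7 * 86400, 'month': 30 * 86400, 'year': 365 * 86400,
}
_RELATIVE_RE = re.compile(
    r'(\d+)\s*(second|minute|hour|day|week|month|year)s?\s+ago')


def approximate_timestamp(text, now, warn=None):
    """Return an approximate UNIX timestamp for text like '3 days ago'.

    `warn` is an optional callable; passing None mirrors report_failure=False.
    """
    match = _RELATIVE_RE.search(text.lower())
    if not match:
        if warn is not None:
            warn(f'Cannot parse time text "{text}"')
        return None
    count, unit = int(match.group(1)), match.group(2)
    return int(now.timestamp()) - count * _UNIT_SECONDS[unit]


def first_timestamp(parts, now):
    # Equivalent in spirit to traversing all parts and taking `any`.
    for part in parts:
        timestamp = approximate_timestamp(part, now)
        if timestamp is not None:
            return timestamp
    return None
```

For a row such as `['1.2M views', '3 days ago']`, the first part is skipped silently and the second yields the approximate upload time.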


@@ -2955,9 +2955,20 @@ def fetch_po_token(self, client='web', context: _PoTokenContext = _PoTokenContex
         # TODO(future): This validation should be moved into pot framework.
         #  Some sort of middleware or validation provider perhaps?

+        gvs_bind_to_video_id = False
+        experiments = traverse_obj(ytcfg, (
+            'WEB_PLAYER_CONTEXT_CONFIGS', ..., 'serializedExperimentFlags', {urllib.parse.parse_qs}))
+        if 'true' in traverse_obj(experiments, (..., 'html5_generate_content_po_token', -1)):
+            self.write_debug(
+                f'{video_id}: Detected experiment to bind GVS PO Token to video id.', only_once=True)
+            gvs_bind_to_video_id = True
+
         # GVS WebPO Token is bound to visitor_data / Visitor ID when logged out.
         # Must have visitor_data for it to function.
-        if player_url and context == _PoTokenContext.GVS and not visitor_data and not self.is_authenticated:
+        if (
+            player_url and context == _PoTokenContext.GVS
+            and not visitor_data and not self.is_authenticated and not gvs_bind_to_video_id
+        ):
             self.report_warning(
                 f'Unable to fetch GVS PO Token for {client} client: Missing required Visitor Data. '
                 f'You may need to pass Visitor Data with --extractor-args "youtube:visitor_data=XXX"', only_once=True)
@@ -2971,7 +2982,10 @@ def fetch_po_token(self, client='web', context: _PoTokenContext = _PoTokenContex
         config_po_token = self._get_config_po_token(client, context)
         if config_po_token:
             # GVS WebPO token is bound to data_sync_id / account Session ID when logged in.
-            if player_url and context == _PoTokenContext.GVS and not data_sync_id and self.is_authenticated:
+            if (
+                player_url and context == _PoTokenContext.GVS
+                and not data_sync_id and self.is_authenticated and not gvs_bind_to_video_id
+            ):
                 self.report_warning(
                     f'Got a GVS PO Token for {client} client, but missing Data Sync ID for account. Formats may not work.'
                     f'You may need to pass a Data Sync ID with --extractor-args "youtube:data_sync_id=XXX"')
@@ -2997,6 +3011,7 @@ def fetch_po_token(self, client='web', context: _PoTokenContext = _PoTokenContex
             video_id=video_id,
             video_webpage=webpage,
             required=required,
+            _gvs_bind_to_video_id=gvs_bind_to_video_id,
             **kwargs,
         )

@@ -3040,6 +3055,7 @@ def _fetch_po_token(self, client, **kwargs):
             data_sync_id=kwargs.get('data_sync_id'),
             video_id=kwargs.get('video_id'),
             request_cookiejar=self._downloader.cookiejar,
+            _gvs_bind_to_video_id=kwargs.get('_gvs_bind_to_video_id', False),

             # All requests that would need to be proxied should be in the
             # context of www.youtube.com or the innertube host
@@ -4094,7 +4110,9 @@ def is_bad_format(fmt):
                 else 'video'),
             'release_timestamp': live_start_time,
             '_format_sort_fields': (  # source_preference is lower for potentially damaged formats
-                'quality', 'res', 'fps', 'hdr:12', 'source', 'vcodec', 'channels', 'acodec', 'lang', 'proto'),
+                'quality', 'res', 'fps', 'hdr:12', 'source',
+                'vcodec:vp9.2' if 'prefer-vp9-sort' in self.get_param('compat_opts', []) else 'vcodec',
+                'channels', 'acodec', 'lang', 'proto'),
         }

         def get_lang_code(track):
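
The experiment check added to `fetch_po_token` in this file treats `serializedExperimentFlags` as a URL query string and looks at the last value given for `html5_generate_content_po_token`. A simplified sketch of the same test without `traverse_obj` is shown below. The helper name `binds_gvs_token_to_video_id` and the sample config key are hypothetical; only the two key names taken from the diff are real.

```python
import urllib.parse

EXPERIMENT_FLAG = 'html5_generate_content_po_token'


def binds_gvs_token_to_video_id(ytcfg):
    """Return True if any player context config enables the experiment flag."""
    configs = ytcfg.get('WEB_PLAYER_CONTEXT_CONFIGS')
    if not isinstance(configs, dict):
        return False
    for config in configs.values():
        if not isinstance(config, dict):
            continue
        serialized = config.get('serializedExperimentFlags')
        if not isinstance(serialized, str):
            continue
        # parse_qs maps each key to a list of values; the last one wins here.
        values = urllib.parse.parse_qs(serialized).get(EXPERIMENT_FLAG)
        if values and values[-1] == 'true':
            return True
    return False


if __name__ == '__main__':
    sample = {'WEB_PLAYER_CONTEXT_CONFIGS': {'example_config': {
        'serializedExperimentFlags': 'other_flag=1&html5_generate_content_po_token=true',
    }}}
    print(binds_gvs_token_to_video_id(sample))
```

When the flag is detected, the diff skips the Visitor Data and Data Sync ID warnings, since the token is then expected to be bound to the video ID instead.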


@@ -58,6 +58,8 @@ class PoTokenRequest:
     visitor_data: str | None = None
     data_sync_id: str | None = None
     video_id: str | None = None
+    # Internal, YouTube experiment on whether to bind GVS PO Token to video_id.
+    _gvs_bind_to_video_id: bool = False

     # Networking parameters
     request_cookiejar: YoutubeDLCookieJar = dataclasses.field(default_factory=YoutubeDLCookieJar)


@@ -42,6 +42,9 @@ def get_webpo_content_binding(
     if not client_name or client_name not in webpo_clients:
         return None, None

+    if request.context == PoTokenContext.GVS and request._gvs_bind_to_video_id:
+        return request.video_id, ContentBindingType.VIDEO_ID
+
     if request.context == PoTokenContext.GVS or client_name in ('WEB_REMIX', ):
         if request.is_authenticated:
             return request.data_sync_id, ContentBindingType.DATASYNC_ID


@@ -1,8 +1,8 @@
 # Autogenerated by devscripts/update-version.py

-__version__ = '2025.09.26'
+__version__ = '2025.10.14'

-RELEASE_GIT_HEAD = '12b57d2858845c0c7fb33bf9aa8ed7be6905535d'
+RELEASE_GIT_HEAD = 'a98e7f9f58a9492d2cb216baa59c890ed8ce02f3'

 VARIANT = None

@@ -12,4 +12,4 @@

 ORIGIN = 'yt-dlp/yt-dlp'

-_pkg_version = '2025.09.26'
+_pkg_version = '2025.10.14'