1
0
mirror of https://github.com/yt-dlp/yt-dlp synced 2025-12-17 22:55:42 +07:00

Compare commits

..

95 Commits

Author SHA1 Message Date
github-actions[bot]
b77e5a553a Release 2025.04.30
Created by: bashonly

:ci skip all
2025-04-30 23:24:48 +00:00
sepro
505b400795 [cleanup] Misc (#12844)
Authored by: seproDev, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-04-30 23:01:25 +00:00
bashonly
74fc2ae12c [ie/youtube] Do not strictly deprioritize missing_pot formats (#13061)
Deprioritization was redundant; they're already hidden behind an extractor-arg

Authored by: bashonly
2025-04-30 22:51:40 +00:00
InvalidUsernameException
7be14109a6 [ie/zdf] Fix extractors (#12779)
Closes #12647
Authored by: InvalidUsernameException, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-04-30 22:27:42 +00:00
bashonly
61c9a938b3 [ie/youtube] Cache signature timestamps (#13047)
Closes #12825
Authored by: bashonly
2025-04-30 01:15:17 +00:00
bashonly
fd8394bc50 [ie/youtube] Improve warning for SABR-only/SSAP player responses (#13049)
Ref: https://github.com/yt-dlp/yt-dlp/issues/12482

Authored by: bashonly
2025-04-30 01:13:35 +00:00
bashonly
22ac81a069 [ie/vimeo] Extract from mobile API (#13034)
Closes #12974
Authored by: bashonly
2025-04-29 16:45:54 +00:00
doe1080
25cd7c1ecb [ie/niconico] Fix login support (#13008)
Authored by: doe1080
2025-04-28 22:42:01 +00:00
bashonly
28f04e8a5e [ie/reddit] Support --ignore-no-formats-error (#12993)
Closes #12987
Authored by: bashonly
2025-04-28 22:31:34 +00:00
sepro
a3e91df30a [ie/TV2DK] Fix extractor (#12945)
Closes #10334
Authored by: seproDev, bashonly

Co-authored-by: bashonly <bashonly@protonmail.com>
2025-04-29 00:21:54 +02:00
bashonly
80736b9c90 [ie/bpb] Fix formats extraction (#13015)
Closes #13011
Authored by: bashonly
2025-04-28 22:20:39 +00:00
Sergei Zharkov
1ae6bff564 [ie/twitch:clips] Fix uploader metadata extraction (#13022)
Fix 61046c3161

Authored by: 1271
2025-04-28 22:19:14 +00:00
sepro
b37ff4de5b [ie/linkedin:events] Add extractor (#12926)
Authored by: seproDev, bashonly

Co-authored-by: bashonly <bashonly@protonmail.com>
2025-04-28 22:58:30 +02:00
Simon Sawicki
3690e91265 [ci] Add file mode test to code check (#13036)
Authored by: Grub4K
2025-04-28 21:21:06 +02:00
coletdjnz
8cb08028f5 [ie/youtube] Detect and warn when account cookies are rotated (#13014)
Related: https://github.com/yt-dlp/yt-dlp/issues/8227

Authored by: coletdjnz
2025-04-27 12:16:34 +12:00
bashonly
1cf39ddf3d [ie/twitter] Fix extraction when logged-in (#13024)
Closes #13010
Authored by: bashonly
2025-04-26 22:39:29 +00:00
bashonly
c2d6659d10 [ie/youtube] Detect player JS variants for any locale (#13003)
Authored by: bashonly
2025-04-26 22:08:34 +00:00
coletdjnz
26feac3dd1 [ie/youtube] Add context to video request rate limit error (#12958)
Related: https://github.com/yt-dlp/yt-dlp/issues/11426

Authored by: coletdjnz
2025-04-25 16:11:07 +12:00
doe1080
70599e53b7 [ie/twitter:spaces] Improve metadata extraction (#12911)
Authored by: doe1080
2025-04-25 03:42:17 +00:00
doe1080
8d127b18f8 [fd/NiconicoDmc] Remove downloader (#12916)
Authored by: doe1080
2025-04-24 15:20:25 -05:00
doe1080
7d05aa99c6 [ie/niconico] Remove DMC formats support (#12916)
Authored by: doe1080
2025-04-24 15:20:25 -05:00
bashonly
36da6360e1 [ie/mlbtv] Fix device ID caching (#12980)
Authored by: bashonly
2025-04-24 19:18:22 +00:00
bashonly
e7e3b7a55c [ie/dacast] Support tokenized URLs (#12979)
Authored by: bashonly
2025-04-24 19:10:34 +00:00
D Trombett
dce8234624 [ie/RaiPlay] Fix DRM detection (#12971)
Closes #12969
Authored by: DTrombett
2025-04-24 18:26:35 +00:00
sepro
2381881fe5 [ie/vk] Fix uploader extraction (#12985)
Closes #12967
Authored by: seproDev
2025-04-23 14:31:20 +00:00
Sergey B (Troex Nevelin)
741fd809bc [ie/GetCourseRu] Fix extractors (#12943)
Closes #12941
Authored by: troex
2025-04-23 00:14:42 +00:00
bashonly
34a061a295 [ie/generic] Fix MPD extraction for file:// URLs (#12978)
Fix 5086d4aed6
Authored by: bashonly
2025-04-23 00:06:35 +00:00
bashonly
9032f98136 [ie/cda] Fix formats extraction (#12975)
Closes #12962
Authored by: bashonly
2025-04-23 00:00:41 +00:00
bashonly
de271a06fd [ie/twitcasting] Fix livestream extraction (#12977)
Closes #12966
Authored by: bashonly
2025-04-22 23:54:41 +00:00
bashonly
d596824c2f [ie/vimeo] Fix API extraction (#12976)
Closes #12974
Authored by: bashonly
2025-04-22 23:47:38 +00:00
sepro
88eb1e7a9a Add --preset-alias option (#12839)
Authored by: seproDev, Grub4K

Co-authored-by: Simon Sawicki <contact@grub4k.xyz>
2025-04-19 22:08:34 +02:00
sepro
f5a37ea40e [ie/loco] Fix extractor (#12934)
Closes #12930
Authored by: seproDev
2025-04-19 02:02:09 +02:00
Florentin Le Moal
f07ee91c71 [ie/rtve] Rework extractors (#10388)
Closes #1346, Closes #5756
Authored by: meGAmeS1, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-04-19 01:47:14 +02:00
fries1234
ed8ad1b4d6 [ie/tvw:tvchannels] Add extractor (#12721)
Authored by: fries1234
2025-04-19 01:35:47 +02:00
Florentin Le Moal
839d643253 [ie/AtresPlayer] Rework extractor (#11424)
Closes #996, Closes #1165
Authored by: meGAmeS1, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-04-18 22:12:31 +02:00
香芋奶茶
f5736bb35b [ie/AbemaTV] Fix thumbnail extraction (#12859)
Closes #12858
Authored by: Kiritomo
2025-04-18 21:12:27 +02:00
sepro
9d26daa04a [ie/panopto] Fix formats extraction (#12925)
Closes #11042
Authored by: seproDev
2025-04-18 21:09:41 +02:00
sepro
73a26f9ee6 [ie/linkedin] Support feed URLs (#12927)
Closes #6104
Authored by: seproDev
2025-04-18 21:08:13 +02:00
sepro
4e69a626cc [ie/tvp:vod] Improve _VALID_URL (#12923)
Closes #12917
Authored by: seproDev
2025-04-18 21:05:01 +02:00
pj47x
77aa15e98f [ie/manyvids] Fix extractor (#10907)
Closes #8268
Authored by: pj47x
2025-04-18 18:38:58 +00:00
Michał Walenciak
cb271d445b [ie/CDAFolder] Extend _VALID_URL (#12919)
Closes #12918
Authored by: Kicer86, fireattack

Co-authored-by: fireattack <human.peng@gmail.com>
2025-04-18 18:32:38 +00:00
doe1080
ceab4d5ed6 [networking] Add PATCH request shortcut (#12884)
Authored by: doe1080
2025-04-18 11:46:19 +12:00
leeblackc
ed6c6d7eef [ie/youtube] Add extractor arg to skip "initial_data" request (#12865)
Closes https://github.com/yt-dlp/yt-dlp/issues/12826

Authored by: leeblackc
2025-04-18 11:42:08 +12:00
coletdjnz
f484c51599 [ie/youtube] Add warning on video captcha challenge (#12939)
Authored by: coletdjnz
2025-04-18 11:40:39 +12:00
coletdjnz
72ba487930 [ie/youtube:tab] Extract continuation from empty page (#12938)
Fixes https://github.com/yt-dlp/yt-dlp/issues/12933 https://github.com/yt-dlp/yt-dlp/issues/8206

Authored by: coletdjnz
2025-04-18 11:34:30 +12:00
Subrat Lima
74e90dd9b8 [ie/LRTRadio] Add extractor (#12801)
Closes #12745
Authored by: subrat-lima
2025-04-06 23:26:44 +00:00
Snack
1d45e30537 [ie/niconico:live] Fix extractor (#12809)
Closes #12365
Authored by: Snack-X
2025-04-06 23:24:58 +00:00
Frank Aurich
3c1c75ecb8 [ie/kika] Add playlist extractor (#12832)
Closes #3658
Authored by: 1100101
2025-04-06 21:04:24 +02:00
J.Luis
7faa18b83d [ie/ivoox] Add extractor (#12768)
Authored by: NeonMan, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-04-06 20:48:07 +02:00
doe1080
a473e59233 [utils] url_or_none: Support WebSocket URLs (#12848)
Authored by: doe1080
2025-04-06 20:46:08 +02:00
sepro
45f01de00e [utils] _yield_json_ld: Make function less fatal (#12855)
Authored by: seproDev
2025-04-06 20:31:00 +02:00
WouterGordts
db6d1f145a [ie/mixcloud] Refactor extractor (#12830)
Authored by: WouterGordts, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-04-06 19:51:08 +02:00
sepro
a3f2b54c25 [ie/dzen.ru] Rework extractors (#12852)
Closes #5523, Closes #10818, Closes #11385, Closes #11470
Authored by: seproDev
2025-04-06 17:41:48 +02:00
LN Liberda
91832111a1 [ie/TokFMPodcast] Fix formats extraction (#12842)
Authored by: selfisekai
2025-04-06 17:05:43 +02:00
Ben Faerber
425017531f [ie/parti] Add extractors (#12769)
Closes #11434
Authored by: benfaerber
2025-04-05 22:09:53 +02:00
sepro
58d0c83457 [ie/rumble] Improve format extraction (#12838)
Closes #12837
Authored by: seproDev
2025-04-05 20:29:57 +02:00
sepro
4ebf41309d [ie/CrowdBunker] Make format extraction non-fatal (#12836)
Authored by: seproDev
2025-04-05 19:49:51 +02:00
CasperMcFadden95
e1847535e2 [ie/RoyaLive] Add extractor (#12817)
Authored by: CasperMcFadden95
2025-04-03 21:02:24 +02:00
sepro
5361a7c6e2 [ie/vk] Fix chapters extraction (#12821)
Fix 05c8023a27

Authored by: seproDev
2025-04-03 19:55:36 +02:00
github-actions[bot]
349f36606f Release 2025.03.31
Created by: bashonly

:ci skip all
2025-03-31 21:54:27 +00:00
bashonly
5e457af57f [cleanup] Misc (#12802)
Authored by: bashonly
2025-03-31 21:38:21 +00:00
DmitryScaletta
61046c3161 [ie/twitch:clips] Extract portrait formats (#12763)
Authored by: DmitryScaletta
2025-03-31 21:21:14 +00:00
bashonly
07f04005e4 [ie/youtube] Add player_js_variant extractor-arg (#12767)
- Always distinguish between different JS variants' code/functions
- Change naming scheme for nsig and sigfuncs in disk cache

Authored by: bashonly
2025-03-31 19:45:48 +00:00
bashonly
e465b078ea [ie/on24] Support mainEvent URLs (#12800)
Closes #12782
Authored by: bashonly
2025-03-31 19:25:10 +00:00
bashonly
d63696f23a [ie/MicrosoftLearnEpisode] Extract more formats (#12799)
Closes #12798
Authored by: bashonly
2025-03-31 19:21:44 +00:00
Muhammad Labeeb
bb321cfdc3 [ie/francaisfacile] Add extractor (#12787)
Authored by: mlabeeb03
2025-03-31 19:06:33 +00:00
Miroslav Bendík
5fc521cbd0 [ie/stvr] Rename extractor from RTVS to STVR (#12788)
Authored by: mireq
2025-03-31 19:04:52 +00:00
bashonly
f033d86b96 [ie/mlbtv] Fix radio-only extraction (#12792)
Authored by: bashonly
2025-03-30 23:28:14 +00:00
bashonly
9a1ec1d36e [ie/generic] Validate response before checking m3u8 live status (#12784)
Closes #12744
Authored by: bashonly
2025-03-30 23:02:59 +00:00
bashonly
2956035912 [ie/sbs] Fix subtitles extraction (#12785)
Closes #12783
Authored by: bashonly
2025-03-30 22:54:55 +00:00
sepro
22e34adbd7 Add --compat-options 2024 (#12789)
Authored by: seproDev
2025-03-31 00:38:46 +02:00
coletdjnz
6a6d97b2cb [ie/youtube:tab] Fix playlist continuation extraction (#12777)
Fixes https://github.com/yt-dlp/yt-dlp/issues/12759

Authored by: coletdjnz
2025-03-29 11:13:09 +13:00
github-actions[bot]
3ddbebb3c6 Release 2025.03.27
Created by: bashonly

:ci skip all
2025-03-27 23:45:56 +00:00
bashonly
48be862b32 [ie/youtube] Make signature and nsig extraction more robust (#12761)
Authored by: bashonly, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-03-27 22:31:01 +00:00
bashonly
a8b9ff3c2a [jsinterp] Fix nested attributes and object extraction (#12760)
Authored by: bashonly, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-03-27 22:28:30 +00:00
github-actions[bot]
6eaa574c82 Release 2025.03.26
Created by: bashonly

:ci skip all
2025-03-26 00:04:51 +00:00
sepro
ecee97b4fa [ie/youtube] Only cache nsig code on successful decoding (#12750)
Authored by: seproDev, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-03-25 23:47:45 +00:00
sepro
a550dfc904 [ie/youtube] Fix signature and nsig extraction for player 4fcd6e4a (#12748)
Closes #12746
Authored by: seproDev
2025-03-25 23:40:58 +00:00
github-actions[bot]
336b33e72f Release 2025.03.25
Created by: bashonly

:ci skip all
2025-03-25 00:07:18 +00:00
sepro
9dde546e7e [cleanup] Misc (#12694)
Authored by: seproDev
2025-03-25 00:05:02 +00:00
Abdulmohsen
66e0bab814 [ie/TVer] Fix extractor (#12659)
Closes #12643, Closes #12282
Authored by: arabcoders, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-03-25 00:00:22 +00:00
doe1080
801afeac91 [ie/streaks] Add extractor (#12679)
Authored by: doe1080
2025-03-24 23:12:09 +00:00
bashonly
86ab79e1a5 [ie] Fix sorting of HLS audio formats by GROUP-ID (#12714)
Closes #11178
Authored by: bashonly
2025-03-24 22:38:22 +00:00
Subrat Lima
3396eb50dc [ie/17live:vod] Add extractor (#12723)
Closes #12570
Authored by: subrat-lima
2025-03-24 22:26:45 +00:00
fireattack
5086d4aed6 [ie/generic] Fix MPD base URL parsing (#12718)
Closes #12709
Authored by: fireattack
2025-03-24 22:24:09 +00:00
sepro
9491b44032 [utils] js_to_json: Make function less fatal (#12715)
Authored by: seproDev
2025-03-24 22:28:47 +01:00
doe1080
b7fbb5a0a1 [ie/vrsquare] Add extractors (#12515)
Authored by: doe1080
2025-03-24 22:28:09 +01:00
bashonly
4054a2b623 [ie/youtube] Fix PhantomJS nsig fallback (#12728)
Also fixes the NSigDeno plugin

Closes #12724
Authored by: bashonly
2025-03-24 21:22:25 +00:00
bashonly
b9c979461b [ie/youtube] Fix signature and nsig extraction for player 363db69b (#12725)
Closes #12724
Authored by: bashonly
2025-03-24 21:18:51 +00:00
bashonly
9d5e6de2e7 [ie/9now.com.au] Fix extractor (#12702)
Closes #12591
Authored by: bashonly
2025-03-23 16:35:46 +00:00
Simon Sawicki
9bf23902ce [rh:curl_cffi] Support curl_cffi 0.10.x (#12670)
Authored by: Grub4K
2025-03-23 00:15:20 +01:00
sepro
be5af3f9e9 [ie/deezer] Remove extractors (#12704)
Authored by: seproDev
2025-03-22 22:53:20 +01:00
sepro
fe4f14b836 [ie/viki] Remove extractors (#12703)
Closes #2907, Closes #2869
Authored by: seproDev
2025-03-22 22:34:07 +01:00
Simon Sawicki
b872ffec50 [core] Fix attribute error on failed VT init (#12696)
Authored by: Grub4K
2025-03-22 21:03:28 +01:00
bashonly
e2dfccaf80 [ie/chzzk:video] Fix extraction (#12692)
Closes #12487
Authored by: dirkf, bashonly

Co-authored-by: dirkf <fieldhouse@gmx.net>
2025-03-22 16:44:05 +00:00
92 changed files with 4122 additions and 2506 deletions

View File

@@ -6,7 +6,7 @@ on:
- devscripts/** - devscripts/**
- test/** - test/**
- yt_dlp/**.py - yt_dlp/**.py
- '!yt_dlp/extractor/*.py' - '!yt_dlp/extractor/**.py'
- yt_dlp/extractor/__init__.py - yt_dlp/extractor/__init__.py
- yt_dlp/extractor/common.py - yt_dlp/extractor/common.py
- yt_dlp/extractor/extractors.py - yt_dlp/extractor/extractors.py
@@ -16,7 +16,7 @@ on:
- devscripts/** - devscripts/**
- test/** - test/**
- yt_dlp/**.py - yt_dlp/**.py
- '!yt_dlp/extractor/*.py' - '!yt_dlp/extractor/**.py'
- yt_dlp/extractor/__init__.py - yt_dlp/extractor/__init__.py
- yt_dlp/extractor/common.py - yt_dlp/extractor/common.py
- yt_dlp/extractor/extractors.py - yt_dlp/extractor/extractors.py

View File

@@ -38,3 +38,5 @@ jobs:
run: ruff check --output-format github . run: ruff check --output-format github .
- name: Run autopep8 - name: Run autopep8
run: autopep8 --diff . run: autopep8 --diff .
- name: Check file mode
run: git ls-files --format="%(objectmode) %(path)" yt_dlp/ | ( ! grep -v "^100644" )

View File

@@ -757,3 +757,16 @@ rysson
somini somini
thedenv thedenv
vallovic vallovic
arabcoders
mireq
mlabeeb03
1271
CasperMcFadden95
Kicer86
Kiritomo
leeblackc
meGAmeS1
NeonMan
pj47x
troex
WouterGordts

View File

@@ -4,6 +4,148 @@ # Changelog
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master # To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
--> -->
### 2025.04.30
#### Important changes
- **New option `--preset-alias`/`-t` has been added**
This provides convenient predefined aliases for common use cases. Available presets include `mp4`, `mp3`, `mkv`, `aac`, and `sleep`. See [the README](https://github.com/yt-dlp/yt-dlp/blob/master/README.md#preset-aliases) for more details.
#### Core changes
- [Add `--preset-alias` option](https://github.com/yt-dlp/yt-dlp/commit/88eb1e7a9a2720ac89d653c0d0e40292388823bb) ([#12839](https://github.com/yt-dlp/yt-dlp/issues/12839)) by [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
- **utils**
- `_yield_json_ld`: [Make function less fatal](https://github.com/yt-dlp/yt-dlp/commit/45f01de00e1bc076b7f676a669736326178647b1) ([#12855](https://github.com/yt-dlp/yt-dlp/issues/12855)) by [seproDev](https://github.com/seproDev)
- `url_or_none`: [Support WebSocket URLs](https://github.com/yt-dlp/yt-dlp/commit/a473e592337edb8ca40cde52c1fcaee261c54df9) ([#12848](https://github.com/yt-dlp/yt-dlp/issues/12848)) by [doe1080](https://github.com/doe1080)
#### Extractor changes
- **abematv**: [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/f5736bb35bde62348caebf7b188668655e316deb) ([#12859](https://github.com/yt-dlp/yt-dlp/issues/12859)) by [Kiritomo](https://github.com/Kiritomo)
- **atresplayer**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/839d64325356310e6de6cd9cad28fb546619ca63) ([#11424](https://github.com/yt-dlp/yt-dlp/issues/11424)) by [meGAmeS1](https://github.com/meGAmeS1), [seproDev](https://github.com/seproDev)
- **bpb**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/80736b9c90818adee933a155079b8535bc06819f) ([#13015](https://github.com/yt-dlp/yt-dlp/issues/13015)) by [bashonly](https://github.com/bashonly)
- **cda**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/9032f981362ea0be90626fab51ec37934feded6d) ([#12975](https://github.com/yt-dlp/yt-dlp/issues/12975)) by [bashonly](https://github.com/bashonly)
- **cdafolder**: [Extend `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/cb271d445bc2d866c9a3404b1d8f59bcb77447df) ([#12919](https://github.com/yt-dlp/yt-dlp/issues/12919)) by [fireattack](https://github.com/fireattack), [Kicer86](https://github.com/Kicer86)
- **crowdbunker**: [Make format extraction non-fatal](https://github.com/yt-dlp/yt-dlp/commit/4ebf41309d04a6e196944f1c0f5f0154cff0055a) ([#12836](https://github.com/yt-dlp/yt-dlp/issues/12836)) by [seproDev](https://github.com/seproDev)
- **dacast**: [Support tokenized URLs](https://github.com/yt-dlp/yt-dlp/commit/e7e3b7a55c456da4a5a812b4fefce4dce8e6a616) ([#12979](https://github.com/yt-dlp/yt-dlp/issues/12979)) by [bashonly](https://github.com/bashonly)
- **dzen.ru**: [Rework extractors](https://github.com/yt-dlp/yt-dlp/commit/a3f2b54c2535d862de6efa9cfaa6ca9a2b2f7dd6) ([#12852](https://github.com/yt-dlp/yt-dlp/issues/12852)) by [seproDev](https://github.com/seproDev)
- **generic**: [Fix MPD extraction for `file://` URLs](https://github.com/yt-dlp/yt-dlp/commit/34a061a295d156934417c67ee98070b94943006b) ([#12978](https://github.com/yt-dlp/yt-dlp/issues/12978)) by [bashonly](https://github.com/bashonly)
- **getcourseru**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/741fd809bc4d301c19b53877692ae510334a6750) ([#12943](https://github.com/yt-dlp/yt-dlp/issues/12943)) by [troex](https://github.com/troex)
- **ivoox**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/7faa18b83dcfc74a1a1e2034e6b0369c495ca645) ([#12768](https://github.com/yt-dlp/yt-dlp/issues/12768)) by [NeonMan](https://github.com/NeonMan), [seproDev](https://github.com/seproDev)
- **kika**: [Add playlist extractor](https://github.com/yt-dlp/yt-dlp/commit/3c1c75ecb8ab352f422b59af46fff2be992e4115) ([#12832](https://github.com/yt-dlp/yt-dlp/issues/12832)) by [1100101](https://github.com/1100101)
- **linkedin**
- [Support feed URLs](https://github.com/yt-dlp/yt-dlp/commit/73a26f9ee68610e33c0b4407b77355f2ab7afd0e) ([#12927](https://github.com/yt-dlp/yt-dlp/issues/12927)) by [seproDev](https://github.com/seproDev)
- events: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/b37ff4de5baf4e4e70c6a0ec34e136a279ad20af) ([#12926](https://github.com/yt-dlp/yt-dlp/issues/12926)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
- **loco**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f5a37ea40e20865b976ffeeff13eeae60292eb23) ([#12934](https://github.com/yt-dlp/yt-dlp/issues/12934)) by [seproDev](https://github.com/seproDev)
- **lrtradio**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/74e90dd9b8f9c1a5c48a2515126654f4d398d687) ([#12801](https://github.com/yt-dlp/yt-dlp/issues/12801)) by [subrat-lima](https://github.com/subrat-lima)
- **manyvids**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/77aa15e98f34c4ad425aabf39dd1ee37b48f772c) ([#10907](https://github.com/yt-dlp/yt-dlp/issues/10907)) by [pj47x](https://github.com/pj47x)
- **mixcloud**: [Refactor extractor](https://github.com/yt-dlp/yt-dlp/commit/db6d1f145ad583e0220637726029f8f2fa6200a0) ([#12830](https://github.com/yt-dlp/yt-dlp/issues/12830)) by [seproDev](https://github.com/seproDev), [WouterGordts](https://github.com/WouterGordts)
- **mlbtv**: [Fix device ID caching](https://github.com/yt-dlp/yt-dlp/commit/36da6360e130197df927ee93409519ce3f4075f5) ([#12980](https://github.com/yt-dlp/yt-dlp/issues/12980)) by [bashonly](https://github.com/bashonly)
- **niconico**
- [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/25cd7c1ecbb6cbf21dd3a6e59608e4af94715ecc) ([#13008](https://github.com/yt-dlp/yt-dlp/issues/13008)) by [doe1080](https://github.com/doe1080)
- [Remove DMC formats support](https://github.com/yt-dlp/yt-dlp/commit/7d05aa99c65352feae1cd9a3ff8784b64bfe382a) ([#12916](https://github.com/yt-dlp/yt-dlp/issues/12916)) by [doe1080](https://github.com/doe1080)
- live: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/1d45e30537bf83e069184a440703e4c43b2e0198) ([#12809](https://github.com/yt-dlp/yt-dlp/issues/12809)) by [Snack-X](https://github.com/Snack-X)
- **panopto**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/9d26daa04ad5108257bc5e30f7f040c7f1fe7a5a) ([#12925](https://github.com/yt-dlp/yt-dlp/issues/12925)) by [seproDev](https://github.com/seproDev)
- **parti**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/425017531fbc3369becb5a44013e26f26efabf45) ([#12769](https://github.com/yt-dlp/yt-dlp/issues/12769)) by [benfaerber](https://github.com/benfaerber)
- **raiplay**: [Fix DRM detection](https://github.com/yt-dlp/yt-dlp/commit/dce82346245e35a46fda836ca2089805d2347935) ([#12971](https://github.com/yt-dlp/yt-dlp/issues/12971)) by [DTrombett](https://github.com/DTrombett)
- **reddit**: [Support `--ignore-no-formats-error`](https://github.com/yt-dlp/yt-dlp/commit/28f04e8a5e383ff531db646190b4be45554610d6) ([#12993](https://github.com/yt-dlp/yt-dlp/issues/12993)) by [bashonly](https://github.com/bashonly)
- **royalive**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/e1847535e28788414a25546a45bebcada2f34558) ([#12817](https://github.com/yt-dlp/yt-dlp/issues/12817)) by [CasperMcFadden95](https://github.com/CasperMcFadden95)
- **rtve**: [Rework extractors](https://github.com/yt-dlp/yt-dlp/commit/f07ee91c71920ab1187a7ea756720e81aa406a9d) ([#10388](https://github.com/yt-dlp/yt-dlp/issues/10388)) by [meGAmeS1](https://github.com/meGAmeS1), [seproDev](https://github.com/seproDev)
- **rumble**: [Improve format extraction](https://github.com/yt-dlp/yt-dlp/commit/58d0c83457b93b3c9a81eb6bc5a4c65f25e949df) ([#12838](https://github.com/yt-dlp/yt-dlp/issues/12838)) by [seproDev](https://github.com/seproDev)
- **tokfmpodcast**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/91832111a12d87499294a0f430829b8c2254c339) ([#12842](https://github.com/yt-dlp/yt-dlp/issues/12842)) by [selfisekai](https://github.com/selfisekai)
- **tv2dk**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/a3e91df30a45943f40759d2c1e0b6c2ca4b2a263) ([#12945](https://github.com/yt-dlp/yt-dlp/issues/12945)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
- **tvp**: vod: [Improve `_VALID_URL`](https://github.com/yt-dlp/yt-dlp/commit/4e69a626cce51428bc1d66dc606a56d9498b03a5) ([#12923](https://github.com/yt-dlp/yt-dlp/issues/12923)) by [seproDev](https://github.com/seproDev)
- **tvw**: tvchannels: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/ed8ad1b4d6b9d7a1426ff5192ff924f3371e4721) ([#12721](https://github.com/yt-dlp/yt-dlp/issues/12721)) by [fries1234](https://github.com/fries1234)
- **twitcasting**: [Fix livestream extraction](https://github.com/yt-dlp/yt-dlp/commit/de271a06fd6d20d4f55597ff7f90e4d913de0a52) ([#12977](https://github.com/yt-dlp/yt-dlp/issues/12977)) by [bashonly](https://github.com/bashonly)
- **twitch**: clips: [Fix uploader metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/1ae6bff564a65af41e94f1a4727892471ecdd05a) ([#13022](https://github.com/yt-dlp/yt-dlp/issues/13022)) by [1271](https://github.com/1271)
- **twitter**
- [Fix extraction when logged-in](https://github.com/yt-dlp/yt-dlp/commit/1cf39ddf3d10b6512daa7dd139e5f6c0dc548bbc) ([#13024](https://github.com/yt-dlp/yt-dlp/issues/13024)) by [bashonly](https://github.com/bashonly)
- spaces: [Improve metadata extraction](https://github.com/yt-dlp/yt-dlp/commit/70599e53b736bb75922b737e6e0d4f76e419bb20) ([#12911](https://github.com/yt-dlp/yt-dlp/issues/12911)) by [doe1080](https://github.com/doe1080)
- **vimeo**: [Extract from mobile API](https://github.com/yt-dlp/yt-dlp/commit/22ac81a0692019ac833cf282e4ef99718e9ef3fa) ([#13034](https://github.com/yt-dlp/yt-dlp/issues/13034)) by [bashonly](https://github.com/bashonly)
- **vk**
- [Fix chapters extraction](https://github.com/yt-dlp/yt-dlp/commit/5361a7c6e2933c919716e0cb1e3116c28c40419f) ([#12821](https://github.com/yt-dlp/yt-dlp/issues/12821)) by [seproDev](https://github.com/seproDev)
- [Fix uploader extraction](https://github.com/yt-dlp/yt-dlp/commit/2381881fe58a723853350a6ab750a5efc9f10c85) ([#12985](https://github.com/yt-dlp/yt-dlp/issues/12985)) by [seproDev](https://github.com/seproDev)
- **youtube**
- [Add context to video request rate limit error](https://github.com/yt-dlp/yt-dlp/commit/26feac3dd142536ad08ad1ed731378cb88e63602) ([#12958](https://github.com/yt-dlp/yt-dlp/issues/12958)) by [coletdjnz](https://github.com/coletdjnz)
- [Add extractor arg to skip "initial_data" request](https://github.com/yt-dlp/yt-dlp/commit/ed6c6d7eefbc78fa72e4e60ad6edaa3ee2acc715) ([#12865](https://github.com/yt-dlp/yt-dlp/issues/12865)) by [leeblackc](https://github.com/leeblackc)
- [Add warning on video captcha challenge](https://github.com/yt-dlp/yt-dlp/commit/f484c51599a6cd01eb078ea7dc9bbba942967774) ([#12939](https://github.com/yt-dlp/yt-dlp/issues/12939)) by [coletdjnz](https://github.com/coletdjnz)
- [Cache signature timestamps](https://github.com/yt-dlp/yt-dlp/commit/61c9a938b390b8334ee3a879fe2d93f714e30138) ([#13047](https://github.com/yt-dlp/yt-dlp/issues/13047)) by [bashonly](https://github.com/bashonly)
- [Detect and warn when account cookies are rotated](https://github.com/yt-dlp/yt-dlp/commit/8cb08028f5be2acb9835ce1670b196b9b077052f) ([#13014](https://github.com/yt-dlp/yt-dlp/issues/13014)) by [coletdjnz](https://github.com/coletdjnz)
- [Detect player JS variants for any locale](https://github.com/yt-dlp/yt-dlp/commit/c2d6659d1069f8cff97e1fd61d1c59e949e1e63d) ([#13003](https://github.com/yt-dlp/yt-dlp/issues/13003)) by [bashonly](https://github.com/bashonly)
- [Do not strictly deprioritize `missing_pot` formats](https://github.com/yt-dlp/yt-dlp/commit/74fc2ae12c24eb6b4e02c6360c89bd05f3c8f740) ([#13061](https://github.com/yt-dlp/yt-dlp/issues/13061)) by [bashonly](https://github.com/bashonly)
- [Improve warning for SABR-only/SSAP player responses](https://github.com/yt-dlp/yt-dlp/commit/fd8394bc50301ac5e930aa65aa71ab1b8372b8ab) ([#13049](https://github.com/yt-dlp/yt-dlp/issues/13049)) by [bashonly](https://github.com/bashonly)
- tab: [Extract continuation from empty page](https://github.com/yt-dlp/yt-dlp/commit/72ba4879304c2082fecbb472e6cc05ee2d154a3b) ([#12938](https://github.com/yt-dlp/yt-dlp/issues/12938)) by [coletdjnz](https://github.com/coletdjnz)
- **zdf**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/7be14109a6bd493a2e881da4f9e30adaf3e7e5d5) ([#12779](https://github.com/yt-dlp/yt-dlp/issues/12779)) by [bashonly](https://github.com/bashonly), [InvalidUsernameException](https://github.com/InvalidUsernameException)
#### Downloader changes
- **niconicodmc**: [Remove downloader](https://github.com/yt-dlp/yt-dlp/commit/8d127b18f81131453eaba05d3bb810d9b73adb75) ([#12916](https://github.com/yt-dlp/yt-dlp/issues/12916)) by [doe1080](https://github.com/doe1080)
#### Networking changes
- [Add PATCH request shortcut](https://github.com/yt-dlp/yt-dlp/commit/ceab4d5ed63a1f135a1816fe967c9d9a1ec7e6e8) ([#12884](https://github.com/yt-dlp/yt-dlp/issues/12884)) by [doe1080](https://github.com/doe1080)
#### Misc. changes
- **ci**: [Add file mode test to code check](https://github.com/yt-dlp/yt-dlp/commit/3690e91265d1d0bbeffaf6a9b8cc9baded1367bd) ([#13036](https://github.com/yt-dlp/yt-dlp/issues/13036)) by [Grub4K](https://github.com/Grub4K)
- **cleanup**: Miscellaneous: [505b400](https://github.com/yt-dlp/yt-dlp/commit/505b400795af557bdcfd9d4fa7e9133b26ef431c) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
### 2025.03.31
#### Core changes
- [Add `--compat-options 2024`](https://github.com/yt-dlp/yt-dlp/commit/22e34adbd741e1c7072015debd615dc3fb71c401) ([#12789](https://github.com/yt-dlp/yt-dlp/issues/12789)) by [seproDev](https://github.com/seproDev)
#### Extractor changes
- **francaisfacile**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/bb321cfdc3fd4400598ddb12a15862bc2ac8fc10) ([#12787](https://github.com/yt-dlp/yt-dlp/issues/12787)) by [mlabeeb03](https://github.com/mlabeeb03)
- **generic**: [Validate response before checking m3u8 live status](https://github.com/yt-dlp/yt-dlp/commit/9a1ec1d36e172d252714cef712a6d091e0a0c4f2) ([#12784](https://github.com/yt-dlp/yt-dlp/issues/12784)) by [bashonly](https://github.com/bashonly)
- **microsoftlearnepisode**: [Extract more formats](https://github.com/yt-dlp/yt-dlp/commit/d63696f23a341ee36a3237ccb5d5e14b34c2c579) ([#12799](https://github.com/yt-dlp/yt-dlp/issues/12799)) by [bashonly](https://github.com/bashonly)
- **mlbtv**: [Fix radio-only extraction](https://github.com/yt-dlp/yt-dlp/commit/f033d86b96b36f8c5289dd7c3304f42d4d9f6ff4) ([#12792](https://github.com/yt-dlp/yt-dlp/issues/12792)) by [bashonly](https://github.com/bashonly)
- **on24**: [Support `mainEvent` URLs](https://github.com/yt-dlp/yt-dlp/commit/e465b078ead75472fcb7b86f6ccaf2b5d3bc4c21) ([#12800](https://github.com/yt-dlp/yt-dlp/issues/12800)) by [bashonly](https://github.com/bashonly)
- **sbs**: [Fix subtitles extraction](https://github.com/yt-dlp/yt-dlp/commit/29560359120f28adaaac67c86fa8442eb72daa0d) ([#12785](https://github.com/yt-dlp/yt-dlp/issues/12785)) by [bashonly](https://github.com/bashonly)
- **stvr**: [Rename extractor from RTVS to STVR](https://github.com/yt-dlp/yt-dlp/commit/5fc521cbd0ce7b2410d0935369558838728e205d) ([#12788](https://github.com/yt-dlp/yt-dlp/issues/12788)) by [mireq](https://github.com/mireq)
- **twitch**: clips: [Extract portrait formats](https://github.com/yt-dlp/yt-dlp/commit/61046c31612b30c749cbdae934b7fe26abe659d7) ([#12763](https://github.com/yt-dlp/yt-dlp/issues/12763)) by [DmitryScaletta](https://github.com/DmitryScaletta)
- **youtube**
- [Add `player_js_variant` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/07f04005e40ebdb368920c511e36e98af0077ed3) ([#12767](https://github.com/yt-dlp/yt-dlp/issues/12767)) by [bashonly](https://github.com/bashonly)
- tab: [Fix playlist continuation extraction](https://github.com/yt-dlp/yt-dlp/commit/6a6d97b2cbc78f818de05cc96edcdcfd52caa259) ([#12777](https://github.com/yt-dlp/yt-dlp/issues/12777)) by [coletdjnz](https://github.com/coletdjnz)
#### Misc. changes
- **cleanup**: Miscellaneous: [5e457af](https://github.com/yt-dlp/yt-dlp/commit/5e457af57fae9645b1b8fa0ed689229c8fb9656b) by [bashonly](https://github.com/bashonly)
### 2025.03.27
#### Core changes
- **jsinterp**: [Fix nested attributes and object extraction](https://github.com/yt-dlp/yt-dlp/commit/a8b9ff3c2a0ae25735e580173becc78545b92572) ([#12760](https://github.com/yt-dlp/yt-dlp/issues/12760)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
#### Extractor changes
- **youtube**: [Make signature and nsig extraction more robust](https://github.com/yt-dlp/yt-dlp/commit/48be862b32648bff5b3e553e40fca4dcc6e88b28) ([#12761](https://github.com/yt-dlp/yt-dlp/issues/12761)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
### 2025.03.26
#### Extractor changes
- **youtube**
- [Fix signature and nsig extraction for player `4fcd6e4a`](https://github.com/yt-dlp/yt-dlp/commit/a550dfc904a02843a26369ae50dbb7c0febfb30e) ([#12748](https://github.com/yt-dlp/yt-dlp/issues/12748)) by [seproDev](https://github.com/seproDev)
- [Only cache nsig code on successful decoding](https://github.com/yt-dlp/yt-dlp/commit/ecee97b4fa90d51c48f9154c3a6d5a8ffe46cd5c) ([#12750](https://github.com/yt-dlp/yt-dlp/issues/12750)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
### 2025.03.25
#### Core changes
- [Fix attribute error on failed VT init](https://github.com/yt-dlp/yt-dlp/commit/b872ffec50fd50f790a5a490e006a369a28a3df3) ([#12696](https://github.com/yt-dlp/yt-dlp/issues/12696)) by [Grub4K](https://github.com/Grub4K)
- **utils**: `js_to_json`: [Make function less fatal](https://github.com/yt-dlp/yt-dlp/commit/9491b44032b330e05bd5eaa546187005d1e8538e) ([#12715](https://github.com/yt-dlp/yt-dlp/issues/12715)) by [seproDev](https://github.com/seproDev)
#### Extractor changes
- [Fix sorting of HLS audio formats by `GROUP-ID`](https://github.com/yt-dlp/yt-dlp/commit/86ab79e1a5182092321102adf6ca34195803b878) ([#12714](https://github.com/yt-dlp/yt-dlp/issues/12714)) by [bashonly](https://github.com/bashonly)
- **17live**: vod: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/3396eb50dcd245b49c0f4aecd6e80ec914095d16) ([#12723](https://github.com/yt-dlp/yt-dlp/issues/12723)) by [subrat-lima](https://github.com/subrat-lima)
- **9now.com.au**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/9d5e6de2e7a47226d1f72c713ad45c88ba01db68) ([#12702](https://github.com/yt-dlp/yt-dlp/issues/12702)) by [bashonly](https://github.com/bashonly)
- **chzzk**: video: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/e2dfccaf808b406d5bcb7dd04ae9ce420752dd6f) ([#12692](https://github.com/yt-dlp/yt-dlp/issues/12692)) by [bashonly](https://github.com/bashonly), [dirkf](https://github.com/dirkf)
- **deezer**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/be5af3f9e91747768c2b41157851bfbe14c663f7) ([#12704](https://github.com/yt-dlp/yt-dlp/issues/12704)) by [seproDev](https://github.com/seproDev)
- **generic**: [Fix MPD base URL parsing](https://github.com/yt-dlp/yt-dlp/commit/5086d4aed6aeb3908c62f49e2d8f74cc0cb05110) ([#12718](https://github.com/yt-dlp/yt-dlp/issues/12718)) by [fireattack](https://github.com/fireattack)
- **streaks**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/801afeac91f97dc0b58cd39cc7e8c50f619dc4e1) ([#12679](https://github.com/yt-dlp/yt-dlp/issues/12679)) by [doe1080](https://github.com/doe1080)
- **tver**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/66e0bab814e4a52ef3e12d81123ad992a29df50e) ([#12659](https://github.com/yt-dlp/yt-dlp/issues/12659)) by [arabcoders](https://github.com/arabcoders), [bashonly](https://github.com/bashonly)
- **viki**: [Remove extractors](https://github.com/yt-dlp/yt-dlp/commit/fe4f14b8369038e7c58f7de546d76de1ce3a91ce) ([#12703](https://github.com/yt-dlp/yt-dlp/issues/12703)) by [seproDev](https://github.com/seproDev)
- **vrsquare**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/b7fbb5a0a16a8e8d3e29c29e26ebed677d0d6ea3) ([#12515](https://github.com/yt-dlp/yt-dlp/issues/12515)) by [doe1080](https://github.com/doe1080)
- **youtube**
- [Fix PhantomJS nsig fallback](https://github.com/yt-dlp/yt-dlp/commit/4054a2b623bd1e277b49d2e9abc3d112a4b1c7be) ([#12728](https://github.com/yt-dlp/yt-dlp/issues/12728)) by [bashonly](https://github.com/bashonly)
- [Fix signature and nsig extraction for player `363db69b`](https://github.com/yt-dlp/yt-dlp/commit/b9c979461b244713bf42691a5bc02834e2ba4b2c) ([#12725](https://github.com/yt-dlp/yt-dlp/issues/12725)) by [bashonly](https://github.com/bashonly)
#### Networking changes
- **Request Handler**: curl_cffi: [Support `curl_cffi` 0.10.x](https://github.com/yt-dlp/yt-dlp/commit/9bf23902ceb948b9685ce1dab575491571720fc6) ([#12670](https://github.com/yt-dlp/yt-dlp/issues/12670)) by [Grub4K](https://github.com/Grub4K)
#### Misc. changes
- **cleanup**: Miscellaneous: [9dde546](https://github.com/yt-dlp/yt-dlp/commit/9dde546e7ee3e1515d88ee3af08b099351455dc0) by [seproDev](https://github.com/seproDev)
### 2025.03.21 ### 2025.03.21
#### Core changes #### Core changes

View File

@@ -386,6 +386,12 @@ ## General Options:
recursive options. As a safety measure, each recursive options. As a safety measure, each
alias may be triggered a maximum of 100 alias may be triggered a maximum of 100
times. This option can be used multiple times times. This option can be used multiple times
-t, --preset-alias PRESET Applies a predefined set of options. e.g.
--preset-alias mp3. The following presets
are available: mp3, aac, mp4, mkv, sleep.
See the "Preset Aliases" section at the end
for more info. This option can be used
multiple times
## Network Options: ## Network Options:
--proxy URL Use the specified HTTP/HTTPS/SOCKS proxy. To --proxy URL Use the specified HTTP/HTTPS/SOCKS proxy. To
@@ -1098,6 +1104,23 @@ ## Extractor Options:
can use this option multiple times to give can use this option multiple times to give
arguments for different extractors arguments for different extractors
## Preset Aliases:
-t mp3 -f 'ba[acodec^=mp3]/ba/b' -x --audio-format
mp3
-t aac -f
'ba[acodec^=aac]/ba[acodec^=mp4a.40.]/ba/b'
-x --audio-format aac
-t mp4 --merge-output-format mp4 --remux-video mp4
-S vcodec:h264,lang,quality,res,fps,hdr:12,a
codec:aac
-t mkv --merge-output-format mkv --remux-video mkv
-t sleep --sleep-subtitles 5 --sleep-requests 0.75
--sleep-interval 10 --max-sleep-interval 20
# CONFIGURATION # CONFIGURATION
You can configure yt-dlp by placing any supported command line option in a configuration file. The configuration is loaded from the following locations: You can configure yt-dlp by placing any supported command line option in a configuration file. The configuration is loaded from the following locations:
@@ -1770,7 +1793,7 @@ #### youtube
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes * `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube.py](https://github.com/yt-dlp/yt-dlp/blob/c26f9b991a0681fd3ea548d535919cec1fbbd430/yt_dlp/extractor/youtube.py#L381-L390) for list of supported content language codes
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively * `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_vr`, `tv` and `tv_embedded`. By default, `tv,ios,web` is used, or `tv,web` is used when authenticating with cookies. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios` * `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_vr`, `tv` and `tv_embedded`. By default, `tv,ios,web` is used, or `tv,web` is used when authenticating with cookies. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause some issues. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) for more details * `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player), `initial_data` (skip initial data/next ep request). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause issues such as missing formats or metadata. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) and [#12826](https://github.com/yt-dlp/yt-dlp/issues/12826) for more details
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp. * `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
* `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side) * `comment_sort`: `top` or `new` (default) - choose comment sorting mode (on YouTube's side)
* `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all` * `max_comments`: Limit the amount of comments to gather. Comma-separated list of integers representing `max-comments,max-parents,max-replies,max-replies-per-thread`. Default is `all,all,all,all`
@@ -1782,6 +1805,7 @@ #### youtube
* `data_sync_id`: Overrides the account Data Sync ID used in Innertube API requests. This may be needed if you are using an account with `youtube:player_skip=webpage,configs` or `youtubetab:skip=webpage` * `data_sync_id`: Overrides the account Data Sync ID used in Innertube API requests. This may be needed if you are using an account with `youtube:player_skip=webpage,configs` or `youtubetab:skip=webpage`
* `visitor_data`: Overrides the Visitor Data used in Innertube API requests. This should be used with `player_skip=webpage,configs` and without cookies. Note: this may have adverse effects if used improperly. If a session from a browser is wanted, you should pass cookies instead (which contain the Visitor ID) * `visitor_data`: Overrides the Visitor Data used in Innertube API requests. This should be used with `player_skip=webpage,configs` and without cookies. Note: this may have adverse effects if used improperly. If a session from a browser is wanted, you should pass cookies instead (which contain the Visitor ID)
* `po_token`: Proof of Origin (PO) Token(s) to use. Comma seperated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player=XXX,web_safari.gvs+YYY`. Context can be either `gvs` (Google Video Server URLs) or `player` (Innertube player request) * `po_token`: Proof of Origin (PO) Token(s) to use. Comma seperated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player=XXX,web_safari.gvs+YYY`. Context can be either `gvs` (Google Video Server URLs) or `player` (Innertube player request)
* `player_js_variant`: The player javascript variant to use for signature and nsig deciphering. The known variants are: `main`, `tce`, `tv`, `tv_es6`, `phone`, `tablet`. Only `main` is recommended as a possible workaround; the others are for debugging purposes. The default is to use what is prescribed by the site, and can be selected with `actual`
#### youtubetab (YouTube playlists, channels, feeds, etc.) #### youtubetab (YouTube playlists, channels, feeds, etc.)
* `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details) * `skip`: One or more of `webpage` (skip initial webpage download), `authcheck` (allow the download of playlists requiring authentication when no initial webpage is downloaded. This may cause unwanted behavior, see [#1122](https://github.com/yt-dlp/yt-dlp/pull/1122) for more details)
@@ -1798,9 +1822,6 @@ #### generic
#### vikichannel #### vikichannel
* `video_types`: Types of videos to download - one or more of `episodes`, `movies`, `clips`, `trailers` * `video_types`: Types of videos to download - one or more of `episodes`, `movies`, `clips`, `trailers`
#### niconico
* `segment_duration`: Segment duration in milliseconds for HLS-DMC formats. Use it at your own risk since this feature **may result in your account termination.**
#### youtubewebarchive #### youtubewebarchive
* `check_all`: Try to check more at the cost of more requests. One or more of `thumbnails`, `captures` * `check_all`: Try to check more at the cost of more requests. One or more of `thumbnails`, `captures`
@@ -1866,6 +1887,9 @@ #### bilibili
#### sonylivseries #### sonylivseries
* `sort_order`: Episode sort order for series extraction - one of `asc` (ascending, oldest first) or `desc` (descending, newest first). Default is `asc` * `sort_order`: Episode sort order for series extraction - one of `asc` (ascending, oldest first) or `desc` (descending, newest first). Default is `asc`
#### tver
* `backend`: Backend API to use for extraction - one of `streaks` (default) or `brightcove` (deprecated)
**Note**: These options may be changed/removed in the future without concern for backward compatibility **Note**: These options may be changed/removed in the future without concern for backward compatibility
<!-- MANPAGE: MOVE "INSTALLATION" SECTION HERE --> <!-- MANPAGE: MOVE "INSTALLATION" SECTION HERE -->
@@ -2149,7 +2173,7 @@ ### New features
* **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection than what is possible by simply using `--format` ([examples](#format-selection-examples)) * **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection than what is possible by simply using `--format` ([examples](#format-selection-examples))
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--write-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, playlist infojson etc. Note that NicoNico livestreams are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details. * **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--write-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, playlist infojson etc. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
* **YouTube improvements**: * **YouTube improvements**:
* Supports Clips, Stories (`ytstories:<channel UCID>`), Search (including filters)**\***, YouTube Music Search, Channel-specific search, Search prefixes (`ytsearch:`, `ytsearchdate:`)**\***, Mixes, and Feeds (`:ytfav`, `:ytwatchlater`, `:ytsubs`, `:ythistory`, `:ytrec`, `:ytnotif`) * Supports Clips, Stories (`ytstories:<channel UCID>`), Search (including filters)**\***, YouTube Music Search, Channel-specific search, Search prefixes (`ytsearch:`, `ytsearchdate:`)**\***, Mixes, and Feeds (`:ytfav`, `:ytwatchlater`, `:ytsubs`, `:ythistory`, `:ytrec`, `:ytnotif`)
@@ -2215,7 +2239,7 @@ ### Differences in default behavior
* Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading * Live chats (if available) are considered as subtitles. Use `--sub-langs all,-live_chat` to download all subtitles except live chat. You can also use `--compat-options no-live-chat` to prevent any live chat/danmaku from downloading
* YouTube channel URLs download all uploads of the channel. To download only the videos in a specific tab, pass the tab's URL. If the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections * YouTube channel URLs download all uploads of the channel. To download only the videos in a specific tab, pass the tab's URL. If the channel does not show the requested tab, an error will be raised. Also, `/live` URLs raise an error if there are no live videos instead of silently downloading the entire channel. You may use `--compat-options no-youtube-channel-redirect` to revert all these redirections
* Unavailable videos are also listed for YouTube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this * Unavailable videos are also listed for YouTube playlists. Use `--compat-options no-youtube-unavailable-videos` to remove this
* The upload dates extracted from YouTube are in UTC [when available](https://github.com/yt-dlp/yt-dlp/blob/89e4d86171c7b7c997c77d4714542e0383bf0db0/yt_dlp/extractor/youtube.py#L3898-L3900). Use `--compat-options no-youtube-prefer-utc-upload-date` to prefer the non-UTC upload date. * The upload dates extracted from YouTube are in UTC.
* If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this * If `ffmpeg` is used as the downloader, the downloading and merging of formats happen in a single step when possible. Use `--compat-options no-direct-merge` to revert this
* Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead * Thumbnail embedding in `mp4` is done with mutagen if possible. Use `--compat-options embed-thumbnail-atomicparsley` to force the use of AtomicParsley instead
* Some internal metadata such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this * Some internal metadata such as filenames are removed by default from the infojson. Use `--no-clean-infojson` or `--compat-options no-clean-infojson` to revert this
@@ -2234,9 +2258,10 @@ ### Differences in default behavior
* `--compat-options all`: Use all compat options (**Do NOT use this!**) * `--compat-options all`: Use all compat options (**Do NOT use this!**)
* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort` * `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
* `--compat-options youtube-dlc`: Same as `--compat-options all,-no-live-chat,-no-youtube-channel-redirect,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort` * `--compat-options youtube-dlc`: Same as `--compat-options all,-no-live-chat,-no-youtube-channel-redirect,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
* `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization,no-youtube-prefer-utc-upload-date` * `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization`
* `--compat-options 2022`: Same as `--compat-options 2023,playlist-match-filter,no-external-downloader-progress,prefer-legacy-http-handler,manifest-filesize-approx` * `--compat-options 2022`: Same as `--compat-options 2023,playlist-match-filter,no-external-downloader-progress,prefer-legacy-http-handler,manifest-filesize-approx`
* `--compat-options 2023`: Same as `--compat-options prefer-vp9-sort`. Use this to enable all future compat options * `--compat-options 2023`: Same as `--compat-options 2024,prefer-vp9-sort`
* `--compat-options 2024`: Currently does nothing. Use this to enable all future compat options
The following compat options restore vulnerable behavior from before security patches: The following compat options restore vulnerable behavior from before security patches:

View File

@@ -245,5 +245,14 @@
"when": "76ac023ff02f06e8c003d104f02a03deeddebdcd", "when": "76ac023ff02f06e8c003d104f02a03deeddebdcd",
"short": "[ie/youtube:tab] Improve shorts title extraction (#11997)", "short": "[ie/youtube:tab] Improve shorts title extraction (#11997)",
"authors": ["bashonly", "d3d9"] "authors": ["bashonly", "d3d9"]
},
{
"action": "add",
"when": "88eb1e7a9a2720ac89d653c0d0e40292388823bb",
"short": "[priority] **New option `--preset-alias`/`-t` has been added**\nThis provides convenient predefined aliases for common use cases. Available presets include `mp4`, `mp3`, `mkv`, `aac`, and `sleep`. See [the README](https://github.com/yt-dlp/yt-dlp/blob/master/README.md#preset-aliases) for more details."
},
{
"action": "remove",
"when": "d596824c2f8428362c072518856065070616e348"
} }
] ]

View File

@@ -55,8 +55,7 @@ default = [
"websockets>=13.0", "websockets>=13.0",
] ]
curl-cffi = [ curl-cffi = [
"curl-cffi==0.5.10; os_name=='nt' and implementation_name=='cpython'", "curl-cffi>=0.5.10,!=0.6.*,!=0.7.*,!=0.8.*,!=0.9.*,<0.11; implementation_name=='cpython'",
"curl-cffi>=0.5.10,!=0.6.*,<0.7.2; os_name!='nt' and implementation_name=='cpython'",
] ]
secretstorage = [ secretstorage = [
"cffi", "cffi",

View File

@@ -7,6 +7,7 @@ # Supported sites
- **17live** - **17live**
- **17live:clip** - **17live:clip**
- **17live:vod**
- **1News**: 1news.co.nz article videos - **1News**: 1news.co.nz article videos
- **1tv**: Первый канал - **1tv**: Первый канал
- **20min** - **20min**
@@ -200,7 +201,7 @@ # Supported sites
- **blogger.com** - **blogger.com**
- **Bloomberg** - **Bloomberg**
- **Bluesky** - **Bluesky**
- **BokeCC** - **BokeCC**: CC视频
- **BongaCams** - **BongaCams**
- **Boosty** - **Boosty**
- **BostonGlobe** - **BostonGlobe**
@@ -347,8 +348,6 @@ # Supported sites
- **daystar:clip** - **daystar:clip**
- **DBTV** - **DBTV**
- **DctpTv** - **DctpTv**
- **DeezerAlbum**
- **DeezerPlaylist**
- **democracynow** - **democracynow**
- **DestinationAmerica** - **DestinationAmerica**
- **DetikEmbed** - **DetikEmbed**
@@ -395,6 +394,8 @@ # Supported sites
- **dvtv**: http://video.aktualne.cz/ - **dvtv**: http://video.aktualne.cz/
- **dw**: (**Currently broken**) - **dw**: (**Currently broken**)
- **dw:article**: (**Currently broken**) - **dw:article**: (**Currently broken**)
- **dzen.ru**: Дзен (dzen) formerly Яндекс.Дзен (Yandex Zen)
- **dzen.ru:channel**
- **EaglePlatform** - **EaglePlatform**
- **EbaumsWorld** - **EbaumsWorld**
- **Ebay** - **Ebay**
@@ -473,6 +474,7 @@ # Supported sites
- **FoxNewsVideo** - **FoxNewsVideo**
- **FoxSports** - **FoxSports**
- **fptplay**: fptplay.vn - **fptplay**: fptplay.vn
- **FrancaisFacile**
- **FranceCulture** - **FranceCulture**
- **FranceInter** - **FranceInter**
- **francetv** - **francetv**
@@ -634,6 +636,7 @@ # Supported sites
- **ivi**: ivi.ru - **ivi**: ivi.ru
- **ivi:compilation**: ivi.ru compilations - **ivi:compilation**: ivi.ru compilations
- **ivideon**: Ivideon TV - **ivideon**: Ivideon TV
- **Ivoox**
- **IVXPlayer** - **IVXPlayer**
- **iwara**: [*iwara*](## "netrc machine") - **iwara**: [*iwara*](## "netrc machine")
- **iwara:playlist**: [*iwara*](## "netrc machine") - **iwara:playlist**: [*iwara*](## "netrc machine")
@@ -671,6 +674,7 @@ # Supported sites
- **Kicker** - **Kicker**
- **KickStarter** - **KickStarter**
- **Kika**: KiKA.de - **Kika**: KiKA.de
- **KikaPlaylist**
- **kinja:embed** - **kinja:embed**
- **KinoPoisk** - **KinoPoisk**
- **Kommunetv** - **Kommunetv**
@@ -723,6 +727,7 @@ # Supported sites
- **limelight:channel** - **limelight:channel**
- **limelight:channel_list** - **limelight:channel_list**
- **LinkedIn**: [*linkedin*](## "netrc machine") - **LinkedIn**: [*linkedin*](## "netrc machine")
- **linkedin:events**: [*linkedin*](## "netrc machine")
- **linkedin:learning**: [*linkedin*](## "netrc machine") - **linkedin:learning**: [*linkedin*](## "netrc machine")
- **linkedin:learning:course**: [*linkedin*](## "netrc machine") - **linkedin:learning:course**: [*linkedin*](## "netrc machine")
- **Liputan6** - **Liputan6**
@@ -738,6 +743,7 @@ # Supported sites
- **loom** - **loom**
- **loom:folder** - **loom:folder**
- **LoveHomePorn** - **LoveHomePorn**
- **LRTRadio**
- **LRTStream** - **LRTStream**
- **LRTVOD** - **LRTVOD**
- **LSMLREmbed** - **LSMLREmbed**
@@ -759,7 +765,7 @@ # Supported sites
- **ManotoTV**: Manoto TV (Episode) - **ManotoTV**: Manoto TV (Episode)
- **ManotoTVLive**: Manoto TV (Live) - **ManotoTVLive**: Manoto TV (Live)
- **ManotoTVShow**: Manoto TV (Show) - **ManotoTVShow**: Manoto TV (Show)
- **ManyVids**: (**Currently broken**) - **ManyVids**
- **MaoriTV** - **MaoriTV**
- **Markiza**: (**Currently broken**) - **Markiza**: (**Currently broken**)
- **MarkizaPage**: (**Currently broken**) - **MarkizaPage**: (**Currently broken**)
@@ -829,7 +835,7 @@ # Supported sites
- **MotherlessUploader** - **MotherlessUploader**
- **Motorsport**: motorsport.com (**Currently broken**) - **Motorsport**: motorsport.com (**Currently broken**)
- **MovieFap** - **MovieFap**
- **Moviepilot** - **moviepilot**: Moviepilot trailer
- **MoviewPlay** - **MoviewPlay**
- **Moviezine** - **Moviezine**
- **MovingImage** - **MovingImage**
@@ -946,7 +952,7 @@ # Supported sites
- **nickelodeonru** - **nickelodeonru**
- **niconico**: [*niconico*](## "netrc machine") ニコニコ動画 - **niconico**: [*niconico*](## "netrc machine") ニコニコ動画
- **niconico:history**: NicoNico user history or likes. Requires cookies. - **niconico:history**: NicoNico user history or likes. Requires cookies.
- **niconico:live**: ニコニコ生放送 - **niconico:live**: [*niconico*](## "netrc machine") ニコニコ生放送
- **niconico:playlist** - **niconico:playlist**
- **niconico:series** - **niconico:series**
- **niconico:tag**: NicoNico video tag URLs - **niconico:tag**: NicoNico video tag URLs
@@ -1053,6 +1059,8 @@ # Supported sites
- **Parler**: Posts on parler.com - **Parler**: Posts on parler.com
- **parliamentlive.tv**: UK parliament videos - **parliamentlive.tv**: UK parliament videos
- **Parlview**: (**Currently broken**) - **Parlview**: (**Currently broken**)
- **parti:livestream**
- **parti:video**
- **patreon** - **patreon**
- **patreon:campaign** - **patreon:campaign**
- **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! (WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC) - **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! (WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC)
@@ -1227,6 +1235,7 @@ # Supported sites
- **RoosterTeeth**: [*roosterteeth*](## "netrc machine") - **RoosterTeeth**: [*roosterteeth*](## "netrc machine")
- **RoosterTeethSeries**: [*roosterteeth*](## "netrc machine") - **RoosterTeethSeries**: [*roosterteeth*](## "netrc machine")
- **RottenTomatoes** - **RottenTomatoes**
- **RoyaLive**
- **Rozhlas** - **Rozhlas**
- **RozhlasVltava** - **RozhlasVltava**
- **RTBF**: [*rtbf*](## "netrc machine") (**Currently broken**) - **RTBF**: [*rtbf*](## "netrc machine") (**Currently broken**)
@@ -1247,12 +1256,10 @@ # Supported sites
- **RTVCKaltura** - **RTVCKaltura**
- **RTVCPlay** - **RTVCPlay**
- **RTVCPlayEmbed** - **RTVCPlayEmbed**
- **rtve.es:alacarta**: RTVE a la carta - **rtve.es:alacarta**: RTVE a la carta and Play
- **rtve.es:audio**: RTVE audio - **rtve.es:audio**: RTVE audio
- **rtve.es:infantil**: RTVE infantil
- **rtve.es:live**: RTVE.es live streams - **rtve.es:live**: RTVE.es live streams
- **rtve.es:television** - **rtve.es:television**
- **RTVS**
- **rtvslo.si** - **rtvslo.si**
- **rtvslo.si:show** - **rtvslo.si:show**
- **RudoVideo** - **RudoVideo**
@@ -1307,8 +1314,8 @@ # Supported sites
- **sejm** - **sejm**
- **Sen** - **Sen**
- **SenalColombiaLive**: (**Currently broken**) - **SenalColombiaLive**: (**Currently broken**)
- **SenateGov** - **senate.gov**
- **SenateISVP** - **senate.gov:isvp**
- **SendtoNews**: (**Currently broken**) - **SendtoNews**: (**Currently broken**)
- **Servus** - **Servus**
- **Sexu**: (**Currently broken**) - **Sexu**: (**Currently broken**)
@@ -1401,12 +1408,14 @@ # Supported sites
- **StoryFire** - **StoryFire**
- **StoryFireSeries** - **StoryFireSeries**
- **StoryFireUser** - **StoryFireUser**
- **Streaks**
- **Streamable** - **Streamable**
- **StreamCZ** - **StreamCZ**
- **StreetVoice** - **StreetVoice**
- **StretchInternet** - **StretchInternet**
- **Stripchat** - **Stripchat**
- **stv:player** - **stv:player**
- **stvr**: Slovak Television and Radio (formerly RTVS)
- **Subsplash** - **Subsplash**
- **subsplash:playlist** - **subsplash:playlist**
- **Substack** - **Substack**
@@ -1561,7 +1570,8 @@ # Supported sites
- **tvp:vod:series** - **tvp:vod:series**
- **TVPlayer** - **TVPlayer**
- **TVPlayHome** - **TVPlayHome**
- **Tvw** - **tvw**
- **tvw:tvchannels**
- **Tweakers** - **Tweakers**
- **TwitCasting** - **TwitCasting**
- **TwitCastingLive** - **TwitCastingLive**
@@ -1643,8 +1653,6 @@ # Supported sites
- **viewlift** - **viewlift**
- **viewlift:embed** - **viewlift:embed**
- **Viidea** - **Viidea**
- **viki**: [*viki*](## "netrc machine")
- **viki:channel**: [*viki*](## "netrc machine")
- **vimeo**: [*vimeo*](## "netrc machine") - **vimeo**: [*vimeo*](## "netrc machine")
- **vimeo:album**: [*vimeo*](## "netrc machine") - **vimeo:album**: [*vimeo*](## "netrc machine")
- **vimeo:channel**: [*vimeo*](## "netrc machine") - **vimeo:channel**: [*vimeo*](## "netrc machine")
@@ -1682,6 +1690,10 @@ # Supported sites
- **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **vqq:series** - **vqq:series**
- **vqq:video** - **vqq:video**
- **vrsquare**: VR SQUARE
- **vrsquare:channel**
- **vrsquare:search**
- **vrsquare:section**
- **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza - **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza
- **vrtmax**: [*vrtnu*](## "netrc machine") VRT MAX (formerly VRT NU) - **vrtmax**: [*vrtnu*](## "netrc machine") VRT MAX (formerly VRT NU)
- **VTM**: (**Currently broken**) - **VTM**: (**Currently broken**)
@@ -1818,14 +1830,12 @@ # Supported sites
- **ZattooLive**: [*zattoo*](## "netrc machine") - **ZattooLive**: [*zattoo*](## "netrc machine")
- **ZattooMovies**: [*zattoo*](## "netrc machine") - **ZattooMovies**: [*zattoo*](## "netrc machine")
- **ZattooRecordings**: [*zattoo*](## "netrc machine") - **ZattooRecordings**: [*zattoo*](## "netrc machine")
- **ZDF** - **zdf**
- **ZDFChannel** - **zdf:channel**
- **Zee5**: [*zee5*](## "netrc machine") - **Zee5**: [*zee5*](## "netrc machine")
- **zee5:series** - **zee5:series**
- **ZeeNews**: (**Currently broken**) - **ZeeNews**: (**Currently broken**)
- **ZenPorn** - **ZenPorn**
- **ZenYandex**
- **ZenYandexChannel**
- **ZetlandDKArticle** - **ZetlandDKArticle**
- **Zhihu** - **Zhihu**
- **zingmp3**: zingmp3.vn - **zingmp3**: zingmp3.vn

View File

@@ -136,7 +136,7 @@ def _iter_differences(got, expected, field):
return return
if op == 'startswith': if op == 'startswith':
if not val.startswith(got): if not got.startswith(val):
yield field, f'should start with {val!r}, got {got!r}' yield field, f'should start with {val!r}, got {got!r}'
return return

View File

@@ -638,6 +638,7 @@ def test_parse_m3u8_formats(self):
'img_bipbop_adv_example_fmp4', 'img_bipbop_adv_example_fmp4',
'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8', 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
[{ [{
# 60kbps (bitrate not provided in m3u8); sorted as worst because it's grouped with lowest bitrate video track
'format_id': 'aud1-English', 'format_id': 'aud1-English',
'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a1/prog_index.m3u8', 'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a1/prog_index.m3u8',
'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8', 'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
@@ -645,15 +646,9 @@ def test_parse_m3u8_formats(self):
'ext': 'mp4', 'ext': 'mp4',
'protocol': 'm3u8_native', 'protocol': 'm3u8_native',
'audio_ext': 'mp4', 'audio_ext': 'mp4',
'source_preference': 0,
}, { }, {
'format_id': 'aud2-English', # 192kbps (bitrate not provided in m3u8)
'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a2/prog_index.m3u8',
'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
'language': 'en',
'ext': 'mp4',
'protocol': 'm3u8_native',
'audio_ext': 'mp4',
}, {
'format_id': 'aud3-English', 'format_id': 'aud3-English',
'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a3/prog_index.m3u8', 'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a3/prog_index.m3u8',
'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8', 'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
@@ -661,6 +656,17 @@ def test_parse_m3u8_formats(self):
'ext': 'mp4', 'ext': 'mp4',
'protocol': 'm3u8_native', 'protocol': 'm3u8_native',
'audio_ext': 'mp4', 'audio_ext': 'mp4',
'source_preference': 1,
}, {
# 384kbps (bitrate not provided in m3u8); sorted as best because it's grouped with the highest bitrate video track
'format_id': 'aud2-English',
'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/a2/prog_index.m3u8',
'manifest_url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/master.m3u8',
'language': 'en',
'ext': 'mp4',
'protocol': 'm3u8_native',
'audio_ext': 'mp4',
'source_preference': 2,
}, { }, {
'format_id': '530', 'format_id': '530',
'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/v2/prog_index.m3u8', 'url': 'https://devstreaming-cdn.apple.com/videos/streaming/examples/img_bipbop_adv_example_fmp4/v2/prog_index.m3u8',

View File

@@ -118,6 +118,7 @@ def test_assignments(self):
self._test('function f(){var x = 20; x = 30 + 1; return x;}', 31) self._test('function f(){var x = 20; x = 30 + 1; return x;}', 31)
self._test('function f(){var x = 20; x += 30 + 1; return x;}', 51) self._test('function f(){var x = 20; x += 30 + 1; return x;}', 51)
self._test('function f(){var x = 20; x -= 30 + 1; return x;}', -11) self._test('function f(){var x = 20; x -= 30 + 1; return x;}', -11)
self._test('function f(){var x = 2; var y = ["a", "b"]; y[x%y["length"]]="z"; return y}', ['z', 'b'])
@unittest.skip('Not implemented') @unittest.skip('Not implemented')
def test_comments(self): def test_comments(self):
@@ -403,6 +404,8 @@ def test_split(self):
test_result = list('test') test_result = list('test')
tests = [ tests = [
'function f(a, b){return a.split(b)}', 'function f(a, b){return a.split(b)}',
'function f(a, b){return a["split"](b)}',
'function f(a, b){let x = ["split"]; return a[x[0]](b)}',
'function f(a, b){return String.prototype.split.call(a, b)}', 'function f(a, b){return String.prototype.split.call(a, b)}',
'function f(a, b){return String.prototype.split.apply(a, [b])}', 'function f(a, b){return String.prototype.split.apply(a, [b])}',
] ]
@@ -441,6 +444,9 @@ def test_slice(self):
self._test('function f(){return "012345678".slice(-1, 1)}', '') self._test('function f(){return "012345678".slice(-1, 1)}', '')
self._test('function f(){return "012345678".slice(-3, -1)}', '67') self._test('function f(){return "012345678".slice(-3, -1)}', '67')
def test_splice(self):
self._test('function f(){var T = ["0", "1", "2"]; T["splice"](2, 1, "0")[0]; return T }', ['0', '1', '0'])
def test_js_number_to_string(self): def test_js_number_to_string(self):
for test, radix, expected in [ for test, radix, expected in [
(0, None, '0'), (0, None, '0'),

View File

@@ -39,6 +39,7 @@
from yt_dlp.dependencies import brotli, curl_cffi, requests, urllib3 from yt_dlp.dependencies import brotli, curl_cffi, requests, urllib3
from yt_dlp.networking import ( from yt_dlp.networking import (
HEADRequest, HEADRequest,
PATCHRequest,
PUTRequest, PUTRequest,
Request, Request,
RequestDirector, RequestDirector,
@@ -614,7 +615,6 @@ def test_source_address(self, handler):
rh, Request(f'http://127.0.0.1:{self.http_port}/source_address')).read().decode() rh, Request(f'http://127.0.0.1:{self.http_port}/source_address')).read().decode()
assert source_address == data assert source_address == data
# Not supported by CurlCFFI
@pytest.mark.skip_handler('CurlCFFI', 'not supported by curl-cffi') @pytest.mark.skip_handler('CurlCFFI', 'not supported by curl-cffi')
def test_gzip_trailing_garbage(self, handler): def test_gzip_trailing_garbage(self, handler):
with handler() as rh: with handler() as rh:
@@ -1857,6 +1857,7 @@ def test_method(self):
def test_request_helpers(self): def test_request_helpers(self):
assert HEADRequest('http://example.com').method == 'HEAD' assert HEADRequest('http://example.com').method == 'HEAD'
assert PATCHRequest('http://example.com').method == 'PATCH'
assert PUTRequest('http://example.com').method == 'PUT' assert PUTRequest('http://example.com').method == 'PUT'
def test_headers(self): def test_headers(self):

View File

@@ -23,7 +23,6 @@
TedTalkIE, TedTalkIE,
ThePlatformFeedIE, ThePlatformFeedIE,
ThePlatformIE, ThePlatformIE,
VikiIE,
VimeoIE, VimeoIE,
WallaIE, WallaIE,
YoutubeIE, YoutubeIE,
@@ -331,20 +330,6 @@ def test_subtitles_array_key(self):
self.assertEqual(md5(subtitles['it']), '4b3264186fbb103508abe5311cfcb9cd') self.assertEqual(md5(subtitles['it']), '4b3264186fbb103508abe5311cfcb9cd')
@is_download_test
@unittest.skip('IE broken - DRM only')
class TestVikiSubtitles(BaseTestSubtitles):
url = 'http://www.viki.com/videos/1060846v-punch-episode-18'
IE = VikiIE
def test_allsubtitles(self):
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), {'en'})
self.assertEqual(md5(subtitles['en']), '53cb083a5914b2d84ef1ab67b880d18a')
@is_download_test @is_download_test
class TestThePlatformSubtitles(BaseTestSubtitles): class TestThePlatformSubtitles(BaseTestSubtitles):
# from http://www.3playmedia.com/services-features/tools/integrations/theplatform/ # from http://www.3playmedia.com/services-features/tools/integrations/theplatform/

View File

@@ -659,6 +659,8 @@ def test_url_or_none(self):
self.assertEqual(url_or_none('mms://foo.de'), 'mms://foo.de') self.assertEqual(url_or_none('mms://foo.de'), 'mms://foo.de')
self.assertEqual(url_or_none('rtspu://foo.de'), 'rtspu://foo.de') self.assertEqual(url_or_none('rtspu://foo.de'), 'rtspu://foo.de')
self.assertEqual(url_or_none('ftps://foo.de'), 'ftps://foo.de') self.assertEqual(url_or_none('ftps://foo.de'), 'ftps://foo.de')
self.assertEqual(url_or_none('ws://foo.de'), 'ws://foo.de')
self.assertEqual(url_or_none('wss://foo.de'), 'wss://foo.de')
def test_parse_age_limit(self): def test_parse_age_limit(self):
self.assertEqual(parse_age_limit(None), None) self.assertEqual(parse_age_limit(None), None)
@@ -1260,6 +1262,7 @@ def test_js_to_json_edgecases(self):
def test_js_to_json_malformed(self): def test_js_to_json_malformed(self):
self.assertEqual(js_to_json('42a1'), '42"a1"') self.assertEqual(js_to_json('42a1'), '42"a1"')
self.assertEqual(js_to_json('42a-1'), '42"a"-1') self.assertEqual(js_to_json('42a-1'), '42"a"-1')
self.assertEqual(js_to_json('{a: `${e("")}`}'), '{"a": "\\"e\\"(\\"\\")"}')
def test_js_to_json_template_literal(self): def test_js_to_json_template_literal(self):
self.assertEqual(js_to_json('`Hello ${name}`', {'name': '"world"'}), '"Hello world"') self.assertEqual(js_to_json('`Hello ${name}`', {'name': '"world"'}), '"Hello world"')

View File

@@ -83,6 +83,56 @@
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA', '2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'AAOAOq0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xx8j7vgpDL0QwbdV06sCIEzpWqMGkFR20CFOS21Tp-7vj_EMu-m37KtXJoOy1', 'AAOAOq0QJ8wRAIgXmPlOPSBkkUs1bYFYlJCfe29xx8j7vgpDL0QwbdV06sCIEzpWqMGkFR20CFOS21Tp-7vj_EMu-m37KtXJoOy1',
), ),
(
'https://www.youtube.com/s/player/363db69b/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpz2ICs6EVdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
),
(
'https://www.youtube.com/s/player/363db69b/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpz2ICs6EVdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'wAOAOq0QJ8ARAIgXmPlOPSBkkUs1bYFYlJCfe29xx8q7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'wAOAOq0QJ8ARAIgXmPlOPSBkkUs1bYFYlJCfe29xx8q7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/20830619/player_ias.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/20830619/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-phone-en_US.vflset/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-tablet-en_US.vflset/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'7AOq0QJ8wRAIgXmPlOPSBkkAs1bYFYlJCfe29xx8jOv1pDL0Q2bdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_EMu-m37KtXJoOySqa0qaw',
),
(
'https://www.youtube.com/s/player/8a8ac953/player_ias_tce.vflset/en_US/base.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'IAOAOq0QJ8wRAAgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_E2u-m37KtXJoOySqa0',
),
(
'https://www.youtube.com/s/player/8a8ac953/tv-player-es6.vflset/tv-player-es6.js',
'2aq0aqSyOoJXtK73m-uME_jv7-pT15gOFC02RFkGMqWpzEICs69VdbwQ0LDp1v7j8xx92efCJlYFYb1sUkkBSPOlPmXgIARw8JQ0qOAOAA',
'IAOAOq0QJ8wRAAgXmPlOPSBkkUs1bYFYlJCfe29xx8j7v1pDL0QwbdV96sCIEzpWqMGkFR20CFOg51Tp-7vj_E2u-m37KtXJoOySqa0',
),
] ]
_NSIG_TESTS = [ _NSIG_TESTS = [
@@ -234,6 +284,38 @@
'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js', 'https://www.youtube.com/s/player/643afba4/tv-player-ias.vflset/tv-player-ias.js',
'ir9-V6cdbCiyKxhr', '2PL7ZDYAALMfmA', 'ir9-V6cdbCiyKxhr', '2PL7ZDYAALMfmA',
), ),
(
'https://www.youtube.com/s/player/363db69b/player_ias.vflset/en_US/base.js',
'eWYu5d5YeY_4LyEDc', 'XJQqf-N7Xra3gg',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias.vflset/en_US/base.js',
'o_L251jm8yhZkWtBW', 'lXoxI3XvToqn6A',
),
(
'https://www.youtube.com/s/player/4fcd6e4a/player_ias_tce.vflset/en_US/base.js',
'o_L251jm8yhZkWtBW', 'lXoxI3XvToqn6A',
),
(
'https://www.youtube.com/s/player/20830619/tv-player-ias.vflset/tv-player-ias.js',
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-phone-en_US.vflset/base.js',
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
),
(
'https://www.youtube.com/s/player/20830619/player-plasma-ias-tablet-en_US.vflset/base.js',
'ir9-V6cdbCiyKxhr', '9YE85kNjZiS4',
),
(
'https://www.youtube.com/s/player/8a8ac953/player_ias_tce.vflset/en_US/base.js',
'MiBYeXx_vRREbiCCmh', 'RtZYMVvmkE0JE',
),
(
'https://www.youtube.com/s/player/8a8ac953/tv-player-es6.vflset/tv-player-es6.js',
'MiBYeXx_vRREbiCCmh', 'RtZYMVvmkE0JE',
),
] ]
@@ -284,33 +366,33 @@ def make_tfunc(url, sig_input, expected_sig):
test_id = re.sub(r'[/.-]', '_', m.group('id') or m.group('compat_id')) test_id = re.sub(r'[/.-]', '_', m.group('id') or m.group('compat_id'))
def test_func(self): def test_func(self):
basename = f'player-{name}-{test_id}.js' basename = f'player-{test_id}.js'
fn = os.path.join(self.TESTDATA_DIR, basename) fn = os.path.join(self.TESTDATA_DIR, basename)
if not os.path.exists(fn): if not os.path.exists(fn):
urllib.request.urlretrieve(url, fn) urllib.request.urlretrieve(url, fn)
with open(fn, encoding='utf-8') as testf: with open(fn, encoding='utf-8') as testf:
jscode = testf.read() jscode = testf.read()
self.assertEqual(sig_func(jscode, sig_input), expected_sig) self.assertEqual(sig_func(jscode, sig_input, url), expected_sig)
test_func.__name__ = f'test_{name}_js_{test_id}' test_func.__name__ = f'test_{name}_js_{test_id}'
setattr(TestSignature, test_func.__name__, test_func) setattr(TestSignature, test_func.__name__, test_func)
return make_tfunc return make_tfunc
def signature(jscode, sig_input): def signature(jscode, sig_input, player_url):
func = YoutubeIE(FakeYDL())._parse_sig_js(jscode) func = YoutubeIE(FakeYDL())._parse_sig_js(jscode, player_url)
src_sig = ( src_sig = (
str(string.printable[:sig_input]) str(string.printable[:sig_input])
if isinstance(sig_input, int) else sig_input) if isinstance(sig_input, int) else sig_input)
return func(src_sig) return func(src_sig)
def n_sig(jscode, sig_input): def n_sig(jscode, sig_input, player_url):
ie = YoutubeIE(FakeYDL()) ie = YoutubeIE(FakeYDL())
funcname = ie._extract_n_function_name(jscode) funcname = ie._extract_n_function_name(jscode, player_url=player_url)
jsi = JSInterpreter(jscode) jsi = JSInterpreter(jscode)
func = jsi.extract_function_from_code(*ie._fixup_n_function_code(*jsi.extract_function_code(funcname), jscode)) func = jsi.extract_function_from_code(*ie._fixup_n_function_code(*jsi.extract_function_code(funcname), jscode, player_url))
return func([sig_input]) return func([sig_input])

View File

@@ -654,19 +654,21 @@ def __init__(self, params=None, auto_init=True):
if not all_plugins_loaded.value: if not all_plugins_loaded.value:
load_all_plugins() load_all_plugins()
try:
windows_enable_vt_mode()
except Exception as e:
self.write_debug(f'Failed to enable VT mode: {e}')
stdout = sys.stderr if self.params.get('logtostderr') else sys.stdout stdout = sys.stderr if self.params.get('logtostderr') else sys.stdout
self._out_files = Namespace( self._out_files = Namespace(
out=stdout, out=stdout,
error=sys.stderr, error=sys.stderr,
screen=sys.stderr if self.params.get('quiet') else stdout, screen=sys.stderr if self.params.get('quiet') else stdout,
console=next(filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None),
) )
try:
windows_enable_vt_mode()
except Exception as e:
self.write_debug(f'Failed to enable VT mode: {e}')
# hehe "immutable" namespace
self._out_files.console = next(filter(supports_terminal_sequences, (sys.stderr, sys.stdout)), None)
if self.params.get('no_color'): if self.params.get('no_color'):
if self.params.get('color') is not None: if self.params.get('color') is not None:
self.params.setdefault('_warnings', []).append( self.params.setdefault('_warnings', []).append(
@@ -4150,7 +4152,7 @@ def _get_available_impersonate_targets(self):
(target, rh.RH_NAME) (target, rh.RH_NAME)
for rh in self._request_director.handlers.values() for rh in self._request_director.handlers.values()
if isinstance(rh, ImpersonateRequestHandler) if isinstance(rh, ImpersonateRequestHandler)
for target in rh.supported_targets for target in reversed(rh.supported_targets)
] ]
def _impersonate_target_available(self, target): def _impersonate_target_available(self, target):

View File

@@ -1021,8 +1021,9 @@ def _real_main(argv=None):
# List of simplified targets we know are supported, # List of simplified targets we know are supported,
# to help users know what dependencies may be required. # to help users know what dependencies may be required.
(ImpersonateTarget('chrome'), 'curl_cffi'), (ImpersonateTarget('chrome'), 'curl_cffi'),
(ImpersonateTarget('edge'), 'curl_cffi'),
(ImpersonateTarget('safari'), 'curl_cffi'), (ImpersonateTarget('safari'), 'curl_cffi'),
(ImpersonateTarget('firefox'), 'curl_cffi>=0.10'),
(ImpersonateTarget('edge'), 'curl_cffi'),
] ]
available_targets = ydl._get_available_impersonate_targets() available_targets = ydl._get_available_impersonate_targets()
@@ -1038,12 +1039,12 @@ def make_row(target, handler):
for known_target, known_handler in known_targets: for known_target, known_handler in known_targets:
if not any( if not any(
known_target in target and handler == known_handler known_target in target and known_handler.startswith(handler)
for target, handler in available_targets for target, handler in available_targets
): ):
rows.append([ rows.insert(0, [
ydl._format_out(text, ydl.Styles.SUPPRESS) ydl._format_out(text, ydl.Styles.SUPPRESS)
for text in make_row(known_target, f'{known_handler} (not available)') for text in make_row(known_target, f'{known_handler} (unavailable)')
]) ])
ydl.to_screen('[info] Available impersonate targets') ydl.to_screen('[info] Available impersonate targets')

View File

@@ -30,7 +30,7 @@ def get_suitable_downloader(info_dict, params={}, default=NO_DEFAULT, protocol=N
from .http import HttpFD from .http import HttpFD
from .ism import IsmFD from .ism import IsmFD
from .mhtml import MhtmlFD from .mhtml import MhtmlFD
from .niconico import NiconicoDmcFD, NiconicoLiveFD from .niconico import NiconicoLiveFD
from .rtmp import RtmpFD from .rtmp import RtmpFD
from .rtsp import RtspFD from .rtsp import RtspFD
from .websocket import WebSocketFragmentFD from .websocket import WebSocketFragmentFD
@@ -50,7 +50,6 @@ def get_suitable_downloader(info_dict, params={}, default=NO_DEFAULT, protocol=N
'http_dash_segments_generator': DashSegmentsFD, 'http_dash_segments_generator': DashSegmentsFD,
'ism': IsmFD, 'ism': IsmFD,
'mhtml': MhtmlFD, 'mhtml': MhtmlFD,
'niconico_dmc': NiconicoDmcFD,
'niconico_live': NiconicoLiveFD, 'niconico_live': NiconicoLiveFD,
'fc2_live': FC2LiveFD, 'fc2_live': FC2LiveFD,
'websocket_frag': WebSocketFragmentFD, 'websocket_frag': WebSocketFragmentFD,
@@ -67,7 +66,6 @@ def shorten_protocol_name(proto, simplify=False):
'rtmp_ffmpeg': 'rtmpF', 'rtmp_ffmpeg': 'rtmpF',
'http_dash_segments': 'dash', 'http_dash_segments': 'dash',
'http_dash_segments_generator': 'dashG', 'http_dash_segments_generator': 'dashG',
'niconico_dmc': 'dmc',
'websocket_frag': 'WSfrag', 'websocket_frag': 'WSfrag',
} }
if simplify: if simplify:

View File

@@ -2,60 +2,12 @@
import threading import threading
import time import time
from . import get_suitable_downloader
from .common import FileDownloader from .common import FileDownloader
from .external import FFmpegFD from .external import FFmpegFD
from ..networking import Request from ..networking import Request
from ..utils import DownloadError, str_or_none, try_get from ..utils import DownloadError, str_or_none, try_get
class NiconicoDmcFD(FileDownloader):
""" Downloading niconico douga from DMC with heartbeat """
def real_download(self, filename, info_dict):
from ..extractor.niconico import NiconicoIE
self.to_screen(f'[{self.FD_NAME}] Downloading from DMC')
ie = NiconicoIE(self.ydl)
info_dict, heartbeat_info_dict = ie._get_heartbeat_info(info_dict)
fd = get_suitable_downloader(info_dict, params=self.params)(self.ydl, self.params)
success = download_complete = False
timer = [None]
heartbeat_lock = threading.Lock()
heartbeat_url = heartbeat_info_dict['url']
heartbeat_data = heartbeat_info_dict['data'].encode()
heartbeat_interval = heartbeat_info_dict.get('interval', 30)
request = Request(heartbeat_url, heartbeat_data)
def heartbeat():
try:
self.ydl.urlopen(request).read()
except Exception:
self.to_screen(f'[{self.FD_NAME}] Heartbeat failed')
with heartbeat_lock:
if not download_complete:
timer[0] = threading.Timer(heartbeat_interval, heartbeat)
timer[0].start()
heartbeat_info_dict['ping']()
self.to_screen('[%s] Heartbeat with %d second interval ...' % (self.FD_NAME, heartbeat_interval))
try:
heartbeat()
if type(fd).__name__ == 'HlsFD':
info_dict.update(ie._extract_m3u8_formats(info_dict['url'], info_dict['id'])[0])
success = fd.real_download(filename, info_dict)
finally:
if heartbeat_lock:
with heartbeat_lock:
timer[0].cancel()
download_complete = True
return success
class NiconicoLiveFD(FileDownloader): class NiconicoLiveFD(FileDownloader):
""" Downloads niconico live without being stopped """ """ Downloads niconico live without being stopped """
@@ -85,6 +37,7 @@ def communicate_ws(reconnect):
'quality': live_quality, 'quality': live_quality,
'protocol': 'hls+fmp4', 'protocol': 'hls+fmp4',
'latency': live_latency, 'latency': live_latency,
'accessRightMethod': 'single_cookie',
'chasePlay': False, 'chasePlay': False,
}, },
'room': { 'room': {

View File

@@ -496,10 +496,6 @@
from .daystar import DaystarClipIE from .daystar import DaystarClipIE
from .dbtv import DBTVIE from .dbtv import DBTVIE
from .dctp import DctpTvIE from .dctp import DctpTvIE
from .deezer import (
DeezerAlbumIE,
DeezerPlaylistIE,
)
from .democracynow import DemocracynowIE from .democracynow import DemocracynowIE
from .detik import DetikEmbedIE from .detik import DetikEmbedIE
from .deuxm import ( from .deuxm import (
@@ -687,6 +683,7 @@
) )
from .foxsports import FoxSportsIE from .foxsports import FoxSportsIE
from .fptplay import FptplayIE from .fptplay import FptplayIE
from .francaisfacile import FrancaisFacileIE
from .franceinter import FranceInterIE from .franceinter import FranceInterIE
from .francetv import ( from .francetv import (
FranceTVIE, FranceTVIE,
@@ -843,6 +840,7 @@
from .ichinanalive import ( from .ichinanalive import (
IchinanaLiveClipIE, IchinanaLiveClipIE,
IchinanaLiveIE, IchinanaLiveIE,
IchinanaLiveVODIE,
) )
from .idolplus import IdolPlusIE from .idolplus import IdolPlusIE
from .ign import ( from .ign import (
@@ -905,6 +903,7 @@
IviIE, IviIE,
) )
from .ivideon import IvideonIE from .ivideon import IvideonIE
from .ivoox import IvooxIE
from .iwara import ( from .iwara import (
IwaraIE, IwaraIE,
IwaraPlaylistIE, IwaraPlaylistIE,
@@ -962,7 +961,10 @@
) )
from .kicker import KickerIE from .kicker import KickerIE
from .kickstarter import KickStarterIE from .kickstarter import KickStarterIE
from .kika import KikaIE from .kika import (
KikaIE,
KikaPlaylistIE,
)
from .kinja import KinjaEmbedIE from .kinja import KinjaEmbedIE
from .kinopoisk import KinoPoiskIE from .kinopoisk import KinoPoiskIE
from .kommunetv import KommunetvIE from .kommunetv import KommunetvIE
@@ -1040,6 +1042,7 @@
LimelightMediaIE, LimelightMediaIE,
) )
from .linkedin import ( from .linkedin import (
LinkedInEventsIE,
LinkedInIE, LinkedInIE,
LinkedInLearningCourseIE, LinkedInLearningCourseIE,
LinkedInLearningIE, LinkedInLearningIE,
@@ -1063,6 +1066,7 @@
from .lovehomeporn import LoveHomePornIE from .lovehomeporn import LoveHomePornIE
from .lrt import ( from .lrt import (
LRTVODIE, LRTVODIE,
LRTRadioIE,
LRTStreamIE, LRTStreamIE,
) )
from .lsm import ( from .lsm import (
@@ -1495,6 +1499,10 @@
) )
from .parler import ParlerIE from .parler import ParlerIE
from .parlview import ParlviewIE from .parlview import ParlviewIE
from .parti import (
PartiLivestreamIE,
PartiVideoIE,
)
from .patreon import ( from .patreon import (
PatreonCampaignIE, PatreonCampaignIE,
PatreonIE, PatreonIE,
@@ -1741,6 +1749,7 @@
RoosterTeethSeriesIE, RoosterTeethSeriesIE,
) )
from .rottentomatoes import RottenTomatoesIE from .rottentomatoes import RottenTomatoesIE
from .roya import RoyaLiveIE
from .rozhlas import ( from .rozhlas import (
MujRozhlasIE, MujRozhlasIE,
RozhlasIE, RozhlasIE,
@@ -1775,7 +1784,6 @@
from .rtve import ( from .rtve import (
RTVEALaCartaIE, RTVEALaCartaIE,
RTVEAudioIE, RTVEAudioIE,
RTVEInfantilIE,
RTVELiveIE, RTVELiveIE,
RTVETelevisionIE, RTVETelevisionIE,
) )
@@ -1989,6 +1997,7 @@
StoryFireSeriesIE, StoryFireSeriesIE,
StoryFireUserIE, StoryFireUserIE,
) )
from .streaks import StreaksIE
from .streamable import StreamableIE from .streamable import StreamableIE
from .streamcz import StreamCZIE from .streamcz import StreamCZIE
from .streetvoice import StreetVoiceIE from .streetvoice import StreetVoiceIE
@@ -2228,7 +2237,10 @@
TVPlayIE, TVPlayIE,
) )
from .tvplayer import TVPlayerIE from .tvplayer import TVPlayerIE
from .tvw import TvwIE from .tvw import (
TvwIE,
TvwTvChannelsIE,
)
from .tweakers import TweakersIE from .tweakers import TweakersIE
from .twentymin import TwentyMinutenIE from .twentymin import TwentyMinutenIE
from .twentythreevideo import TwentyThreeVideoIE from .twentythreevideo import TwentyThreeVideoIE
@@ -2352,10 +2364,6 @@
ViewLiftIE, ViewLiftIE,
) )
from .viidea import ViideaIE from .viidea import ViideaIE
from .viki import (
VikiChannelIE,
VikiIE,
)
from .vimeo import ( from .vimeo import (
VHXEmbedIE, VHXEmbedIE,
VimeoAlbumIE, VimeoAlbumIE,
@@ -2400,6 +2408,12 @@
VoxMediaIE, VoxMediaIE,
VoxMediaVolumeIE, VoxMediaVolumeIE,
) )
from .vrsquare import (
VrSquareChannelIE,
VrSquareIE,
VrSquareSearchIE,
VrSquareSectionIE,
)
from .vrt import ( from .vrt import (
VRTIE, VRTIE,
DagelijkseKostIE, DagelijkseKostIE,

View File

@@ -21,6 +21,7 @@
int_or_none, int_or_none,
time_seconds, time_seconds,
traverse_obj, traverse_obj,
update_url,
update_url_query, update_url_query,
) )
@@ -417,6 +418,10 @@ def _real_extract(self, url):
'is_live': is_live, 'is_live': is_live,
'availability': availability, 'availability': availability,
}) })
if thumbnail := update_url(self._og_search_thumbnail(webpage, default=''), query=None):
info['thumbnails'] = [{'url': thumbnail}]
return info return info

View File

@@ -146,7 +146,7 @@ class TokFMPodcastIE(InfoExtractor):
'url': 'https://audycje.tokfm.pl/podcast/91275,-Systemowy-rasizm-Czy-zamieszki-w-USA-po-morderstwie-w-Minneapolis-doprowadza-do-zmian-w-sluzbach-panstwowych', 'url': 'https://audycje.tokfm.pl/podcast/91275,-Systemowy-rasizm-Czy-zamieszki-w-USA-po-morderstwie-w-Minneapolis-doprowadza-do-zmian-w-sluzbach-panstwowych',
'info_dict': { 'info_dict': {
'id': '91275', 'id': '91275',
'ext': 'aac', 'ext': 'mp3',
'title': 'md5:a9b15488009065556900169fb8061cce', 'title': 'md5:a9b15488009065556900169fb8061cce',
'episode': 'md5:a9b15488009065556900169fb8061cce', 'episode': 'md5:a9b15488009065556900169fb8061cce',
'series': 'Analizy', 'series': 'Analizy',
@@ -164,23 +164,20 @@ def _real_extract(self, url):
raise ExtractorError('No such podcast', expected=True) raise ExtractorError('No such podcast', expected=True)
metadata = metadata[0] metadata = metadata[0]
formats = [] mp3_url = self._download_json(
for ext in ('aac', 'mp3'): 'https://api.podcast.radioagora.pl/api4/getSongUrl',
url_data = self._download_json( media_id, 'Downloading podcast mp3 URL', query={
f'https://api.podcast.radioagora.pl/api4/getSongUrl?podcast_id={media_id}&device_id={uuid.uuid4()}&ppre=false&audio={ext}', 'podcast_id': media_id,
media_id, f'Downloading podcast {ext} URL') 'device_id': str(uuid.uuid4()),
# prevents inserting the mp3 (default) multiple times 'ppre': 'false',
if 'link_ssl' in url_data and f'.{ext}' in url_data['link_ssl']: 'audio': 'mp3',
formats.append({ })['link_ssl']
'url': url_data['link_ssl'],
'ext': ext,
'vcodec': 'none',
'acodec': ext,
})
return { return {
'id': media_id, 'id': media_id,
'formats': formats, 'url': mp3_url,
'vcodec': 'none',
'ext': 'mp3',
'title': metadata.get('podcast_name'), 'title': metadata.get('podcast_name'),
'series': metadata.get('series_name'), 'series': metadata.get('series_name'),
'episode': metadata.get('podcast_name'), 'episode': metadata.get('podcast_name'),

View File

@@ -1,64 +1,105 @@
import urllib.parse
from .common import InfoExtractor from .common import InfoExtractor
from ..networking.exceptions import HTTPError from ..networking.exceptions import HTTPError
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none, int_or_none,
parse_age_limit,
url_or_none,
urlencode_postdata, urlencode_postdata,
) )
from ..utils.traversal import traverse_obj
class AtresPlayerIE(InfoExtractor): class AtresPlayerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?atresplayer\.com/[^/]+/[^/]+/[^/]+/[^/]+/(?P<display_id>.+?)_(?P<id>[0-9a-f]{24})' _VALID_URL = r'https?://(?:www\.)?atresplayer\.com/(?:[^/?#]+/){4}(?P<display_id>.+?)_(?P<id>[0-9a-f]{24})'
_NETRC_MACHINE = 'atresplayer' _NETRC_MACHINE = 'atresplayer'
_TESTS = [ _TESTS = [{
{ 'url': 'https://www.atresplayer.com/lasexta/programas/el-objetivo/clips/mbappe-describe-como-entrenador-a-carlo-ancelotti-sabe-cuando-tiene-que-ser-padre-jefe-amigo-entrenador_67f2dfb2fb6ab0e4c7203849/',
'url': 'https://www.atresplayer.com/antena3/series/pequenas-coincidencias/temporada-1/capitulo-7-asuntos-pendientes_5d4aa2c57ed1a88fc715a615/',
'info_dict': { 'info_dict': {
'id': '5d4aa2c57ed1a88fc715a615',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Capítulo 7: Asuntos pendientes', 'id': '67f2dfb2fb6ab0e4c7203849',
'description': 'md5:7634cdcb4d50d5381bedf93efb537fbc', 'display_id': 'md5:c203f8d4e425ed115ba56a1c6e4b3e6c',
'duration': 3413, 'title': 'Mbappé describe como entrenador a Carlo Ancelotti: "Sabe cuándo tiene que ser padre, jefe, amigo, entrenador..."',
'channel': 'laSexta',
'duration': 31,
'thumbnail': 'https://imagenes.atresplayer.com/atp/clipping/cmsimages02/2025/04/06/B02DBE1E-D59B-4683-8404-1A9595D15269/1920x1080.jpg',
'tags': ['Entrevista informativa', 'Actualidad', 'Debate informativo', 'Política', 'Economía', 'Sociedad', 'Cara a cara', 'Análisis', 'Más periodismo'],
'series': 'El Objetivo',
'season': 'Temporada 12',
'timestamp': 1743970079,
'upload_date': '20250406',
}, },
'skip': 'This video is only available for registered users', }, {
'url': 'https://www.atresplayer.com/antena3/programas/el-hormiguero/clips/revive-la-entrevista-completa-a-miguel-bose-en-el-hormiguero_67f836baa4a5b0e4147ca59a/',
'info_dict': {
'ext': 'mp4',
'id': '67f836baa4a5b0e4147ca59a',
'display_id': 'revive-la-entrevista-completa-a-miguel-bose-en-el-hormiguero',
'title': 'Revive la entrevista completa a Miguel Bosé en El Hormiguero',
'description': 'md5:c6d2b591408d45a7bc2986dfb938eb72',
'channel': 'Antena 3',
'duration': 2556,
'thumbnail': 'https://imagenes.atresplayer.com/atp/clipping/cmsimages02/2025/04/10/9076395F-F1FD-48BE-9F18-540DBA10EBAD/1920x1080.jpg',
'tags': ['Entrevista', 'Variedades', 'Humor', 'Entretenimiento', 'Te sigo', 'Buen rollo', 'Cara a cara'],
'series': 'El Hormiguero ',
'season': 'Temporada 14',
'timestamp': 1744320111,
'upload_date': '20250410',
}, },
{ }, {
'url': 'https://www.atresplayer.com/flooxer/series/biara-proyecto-lazarus/temporada-1/capitulo-3-supervivientes_67a6038b64ceca00070f4f69/',
'info_dict': {
'ext': 'mp4',
'id': '67a6038b64ceca00070f4f69',
'display_id': 'capitulo-3-supervivientes',
'title': 'Capítulo 3: Supervivientes',
'description': 'md5:65b231f20302f776c2b0dd24594599a1',
'channel': 'Flooxer',
'duration': 1196,
'thumbnail': 'https://imagenes.atresplayer.com/atp/clipping/cmsimages01/2025/02/14/17CF90D3-FE67-40C5-A941-7825B3E13992/1920x1080.jpg',
'tags': ['Juvenil', 'Terror', 'Piel de gallina', 'Te sigo', 'Un break', 'Del tirón'],
'series': 'BIARA: Proyecto Lázarus',
'season': 'Temporada 1',
'season_number': 1,
'episode': 'Episode 3',
'episode_number': 3,
'timestamp': 1743095191,
'upload_date': '20250327',
},
}, {
'url': 'https://www.atresplayer.com/lasexta/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_5ad08edf986b2855ed47adc4/', 'url': 'https://www.atresplayer.com/lasexta/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_5ad08edf986b2855ed47adc4/',
'only_matching': True, 'only_matching': True,
}, }, {
{
'url': 'https://www.atresplayer.com/antena3/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_5ad51046986b2886722ccdea/', 'url': 'https://www.atresplayer.com/antena3/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_5ad51046986b2886722ccdea/',
'only_matching': True, 'only_matching': True,
}, }]
]
_API_BASE = 'https://api.atresplayer.com/' _API_BASE = 'https://api.atresplayer.com/'
def _perform_login(self, username, password): def _perform_login(self, username, password):
self._request_webpage(
self._API_BASE + 'login', None, 'Downloading login page')
try: try:
target_url = self._download_json( self._download_webpage(
'https://account.atresmedia.com/api/login', None, 'https://account.atresplayer.com/auth/v1/login', None,
'Logging in', headers={ 'Logging in', 'Failed to log in', data=urlencode_postdata({
'Content-Type': 'application/x-www-form-urlencoded',
}, data=urlencode_postdata({
'username': username, 'username': username,
'password': password, 'password': password,
}))['targetUrl'] }))
except ExtractorError as e: except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 400: if isinstance(e.cause, HTTPError) and e.cause.status == 400:
raise ExtractorError('Invalid username and/or password', expected=True) raise ExtractorError('Invalid username and/or password', expected=True)
raise raise
self._request_webpage(target_url, None, 'Following Target URL')
def _real_extract(self, url): def _real_extract(self, url):
display_id, video_id = self._match_valid_url(url).groups() display_id, video_id = self._match_valid_url(url).groups()
metadata_url = self._download_json(
self._API_BASE + 'client/v1/url', video_id, 'Downloading API endpoint data',
query={'href': urllib.parse.urlparse(url).path})['href']
metadata = self._download_json(metadata_url, video_id)
try: try:
episode = self._download_json( video_data = self._download_json(metadata['urlVideo'], video_id, 'Downloading video data')
self._API_BASE + 'client/v1/player/episode/' + video_id, video_id)
except ExtractorError as e: except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 403: if isinstance(e.cause, HTTPError) and e.cause.status == 403:
error = self._parse_json(e.cause.response.read(), None) error = self._parse_json(e.cause.response.read(), None)
@@ -67,37 +108,45 @@ def _real_extract(self, url):
raise ExtractorError(error['error_description'], expected=True) raise ExtractorError(error['error_description'], expected=True)
raise raise
title = episode['titulo']
formats = [] formats = []
subtitles = {} subtitles = {}
for source in episode.get('sources', []): for source in traverse_obj(video_data, ('sources', lambda _, v: url_or_none(v['src']))):
src = source.get('src') src_url = source['src']
if not src:
continue
src_type = source.get('type') src_type = source.get('type')
if src_type == 'application/vnd.apple.mpegurl': if src_type in ('application/vnd.apple.mpegurl', 'application/hls+legacy', 'application/hls+hevc'):
formats, subtitles = self._extract_m3u8_formats( fmts, subs = self._extract_m3u8_formats_and_subtitles(
src, video_id, 'mp4', 'm3u8_native', src_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
m3u8_id='hls', fatal=False) elif src_type in ('application/dash+xml', 'application/dash+hevc'):
elif src_type == 'application/dash+xml': fmts, subs = self._extract_mpd_formats_and_subtitles(
formats, subtitles = self._extract_mpd_formats( src_url, video_id, mpd_id='dash', fatal=False)
src, video_id, mpd_id='dash', fatal=False) else:
continue
heartbeat = episode.get('heartbeat') or {} formats.extend(fmts)
omniture = episode.get('omniture') or {} self._merge_subtitles(subs, target=subtitles)
get_meta = lambda x: heartbeat.get(x) or omniture.get(x)
return { return {
'display_id': display_id, 'display_id': display_id,
'id': video_id, 'id': video_id,
'title': title,
'description': episode.get('descripcion'),
'thumbnail': episode.get('imgPoster'),
'duration': int_or_none(episode.get('duration')),
'formats': formats, 'formats': formats,
'channel': get_meta('channel'),
'season': get_meta('season'),
'episode_number': int_or_none(get_meta('episodeNumber')),
'subtitles': subtitles, 'subtitles': subtitles,
**traverse_obj(video_data, {
'title': ('titulo', {str}),
'description': ('descripcion', {str}),
'duration': ('duration', {int_or_none}),
'thumbnail': ('imgPoster', {url_or_none}, {lambda v: f'{v}1920x1080.jpg'}),
'age_limit': ('ageRating', {parse_age_limit}),
}),
**traverse_obj(metadata, {
'title': ('title', {str}),
'description': ('description', {str}),
'duration': ('duration', {int_or_none}),
'tags': ('tags', ..., 'title', {str}),
'age_limit': ('ageRating', {parse_age_limit}),
'series': ('format', 'title', {str}),
'season': ('currentSeason', 'title', {str}),
'season_number': ('currentSeason', 'seasonNumber', {int_or_none}),
'episode_number': ('numberOfEpisode', {int_or_none}),
'timestamp': ('publicationDate', {int_or_none(scale=1000)}),
'channel': ('channel', 'title', {str}),
}),
} }

View File

@@ -24,7 +24,7 @@ def _extract_bokecc_formats(self, webpage, video_id, format_id=None):
class BokeCCIE(BokeCCBaseIE): class BokeCCIE(BokeCCBaseIE):
_IE_DESC = 'CC视频' IE_DESC = 'CC视频'
_VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)' _VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)'
_TESTS = [{ _TESTS = [{

View File

@@ -7,6 +7,7 @@
join_nonempty, join_nonempty,
js_to_json, js_to_json,
mimetype2ext, mimetype2ext,
parse_resolution,
unified_strdate, unified_strdate,
url_or_none, url_or_none,
urljoin, urljoin,
@@ -110,24 +111,23 @@ def _parse_vue_attributes(self, name, string, video_id):
return attributes return attributes
@staticmethod def _process_source(self, source):
def _process_source(source):
url = url_or_none(source['src']) url = url_or_none(source['src'])
if not url: if not url:
return None return None
source_type = source.get('type', '') source_type = source.get('type', '')
extension = mimetype2ext(source_type) extension = mimetype2ext(source_type)
is_video = source_type.startswith('video') note = self._search_regex(r'[_-]([a-z]+)\.[\da-z]+(?:$|\?)', url, 'note', default=None)
note = url.rpartition('.')[0].rpartition('_')[2] if is_video else None
return { return {
'url': url, 'url': url,
'ext': extension, 'ext': extension,
'vcodec': None if is_video else 'none', 'vcodec': None if source_type.startswith('video') else 'none',
'quality': 10 if note == 'high' else 0, 'quality': 10 if note == 'high' else 0,
'format_note': note, 'format_note': note,
'format_id': join_nonempty(extension, note), 'format_id': join_nonempty(extension, note),
**parse_resolution(source.get('label')),
} }
def _real_extract(self, url): def _real_extract(self, url):

View File

@@ -13,16 +13,17 @@
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
OnDemandPagedList, OnDemandPagedList,
determine_ext,
float_or_none, float_or_none,
int_or_none, int_or_none,
merge_dicts, merge_dicts,
multipart_encode, multipart_encode,
parse_duration, parse_duration,
traverse_obj,
try_call, try_call,
try_get, url_or_none,
urljoin, urljoin,
) )
from ..utils.traversal import traverse_obj
class CDAIE(InfoExtractor): class CDAIE(InfoExtractor):
@@ -290,15 +291,16 @@ def extract_format(page, version):
if not video or 'file' not in video: if not video or 'file' not in video:
self.report_warning(f'Unable to extract {version} version information') self.report_warning(f'Unable to extract {version} version information')
return return
video_quality = video.get('quality')
qualities = video.get('qualities', {})
video_quality = next((k for k, v in qualities.items() if v == video_quality), video_quality)
if video.get('file'):
if video['file'].startswith('uggc'): if video['file'].startswith('uggc'):
video['file'] = codecs.decode(video['file'], 'rot_13') video['file'] = codecs.decode(video['file'], 'rot_13')
if video['file'].endswith('adc.mp4'): if video['file'].endswith('adc.mp4'):
video['file'] = video['file'].replace('adc.mp4', '.mp4') video['file'] = video['file'].replace('adc.mp4', '.mp4')
elif not video['file'].startswith('http'): elif not video['file'].startswith('http'):
video['file'] = decrypt_file(video['file']) video['file'] = decrypt_file(video['file'])
video_quality = video.get('quality')
qualities = video.get('qualities', {})
video_quality = next((k for k, v in qualities.items() if v == video_quality), video_quality)
info_dict['formats'].append({ info_dict['formats'].append({
'url': video['file'], 'url': video['file'],
'format_id': video_quality, 'format_id': video_quality,
@@ -310,14 +312,26 @@ def extract_format(page, version):
data = {'jsonrpc': '2.0', 'method': 'videoGetLink', 'id': 2, data = {'jsonrpc': '2.0', 'method': 'videoGetLink', 'id': 2,
'params': [video_id, cda_quality, video.get('ts'), video.get('hash2'), {}]} 'params': [video_id, cda_quality, video.get('ts'), video.get('hash2'), {}]}
data = json.dumps(data).encode() data = json.dumps(data).encode()
video_url = self._download_json( response = self._download_json(
f'https://www.cda.pl/video/{video_id}', video_id, headers={ f'https://www.cda.pl/video/{video_id}', video_id, headers={
'Content-Type': 'application/json', 'Content-Type': 'application/json',
'X-Requested-With': 'XMLHttpRequest', 'X-Requested-With': 'XMLHttpRequest',
}, data=data, note=f'Fetching {quality} url', }, data=data, note=f'Fetching {quality} url',
errnote=f'Failed to fetch {quality} url', fatal=False) errnote=f'Failed to fetch {quality} url', fatal=False)
if try_get(video_url, lambda x: x['result']['status']) == 'ok': if (
video_url = try_get(video_url, lambda x: x['result']['resp']) traverse_obj(response, ('result', 'status')) != 'ok'
or not traverse_obj(response, ('result', 'resp', {url_or_none}))
):
continue
video_url = response['result']['resp']
ext = determine_ext(video_url)
if ext == 'mpd':
info_dict['formats'].extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
elif ext == 'm3u8':
info_dict['formats'].extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
else:
info_dict['formats'].append({ info_dict['formats'].append({
'url': video_url, 'url': video_url,
'format_id': quality, 'format_id': quality,
@@ -353,7 +367,7 @@ def extract_format(page, version):
class CDAFolderIE(InfoExtractor): class CDAFolderIE(InfoExtractor):
_MAX_PAGE_SIZE = 36 _MAX_PAGE_SIZE = 36
_VALID_URL = r'https?://(?:www\.)?cda\.pl/(?P<channel>\w+)/folder/(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?cda\.pl/(?P<channel>[\w-]+)/folder/(?P<id>\d+)'
_TESTS = [ _TESTS = [
{ {
'url': 'https://www.cda.pl/domino264/folder/31188385', 'url': 'https://www.cda.pl/domino264/folder/31188385',
@@ -378,6 +392,9 @@ class CDAFolderIE(InfoExtractor):
'title': 'TESTY KOSMETYKÓW', 'title': 'TESTY KOSMETYKÓW',
}, },
'playlist_mincount': 139, 'playlist_mincount': 139,
}, {
'url': 'https://www.cda.pl/FILMY-SERIALE-ANIME-KRESKOWKI-BAJKI/folder/18493422',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):

View File

@@ -21,7 +21,7 @@ class CHZZKLiveIE(InfoExtractor):
'channel': '진짜도현', 'channel': '진짜도현',
'channel_id': 'c68b8ef525fb3d2fa146344d84991753', 'channel_id': 'c68b8ef525fb3d2fa146344d84991753',
'channel_is_verified': False, 'channel_is_verified': False,
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:https?://.+/.+\.jpg',
'timestamp': 1705510344, 'timestamp': 1705510344,
'upload_date': '20240117', 'upload_date': '20240117',
'live_status': 'is_live', 'live_status': 'is_live',
@@ -98,7 +98,7 @@ class CHZZKVideoIE(InfoExtractor):
'channel': '침착맨', 'channel': '침착맨',
'channel_id': 'bb382c2c0cc9fa7c86ab3b037fb5799c', 'channel_id': 'bb382c2c0cc9fa7c86ab3b037fb5799c',
'channel_is_verified': False, 'channel_is_verified': False,
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:https?://.+/.+\.jpg',
'duration': 15577, 'duration': 15577,
'timestamp': 1702970505.417, 'timestamp': 1702970505.417,
'upload_date': '20231219', 'upload_date': '20231219',
@@ -115,7 +115,7 @@ class CHZZKVideoIE(InfoExtractor):
'channel': '라디유radiyu', 'channel': '라디유radiyu',
'channel_id': '68f895c59a1043bc5019b5e08c83a5c5', 'channel_id': '68f895c59a1043bc5019b5e08c83a5c5',
'channel_is_verified': False, 'channel_is_verified': False,
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:https?://.+/.+\.jpg',
'duration': 95, 'duration': 95,
'timestamp': 1703102631.722, 'timestamp': 1703102631.722,
'upload_date': '20231220', 'upload_date': '20231220',
@@ -131,12 +131,30 @@ class CHZZKVideoIE(InfoExtractor):
'channel': '강지', 'channel': '강지',
'channel_id': 'b5ed5db484d04faf4d150aedd362f34b', 'channel_id': 'b5ed5db484d04faf4d150aedd362f34b',
'channel_is_verified': True, 'channel_is_verified': True,
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:https?://.+/.+\.jpg',
'duration': 4433, 'duration': 4433,
'timestamp': 1703307460.214, 'timestamp': 1703307460.214,
'upload_date': '20231223', 'upload_date': '20231223',
'view_count': int, 'view_count': int,
}, },
}, {
# video_status == 'NONE' but is downloadable
'url': 'https://chzzk.naver.com/video/6325166',
'info_dict': {
'id': '6325166',
'ext': 'mp4',
'title': '와이프 숙제빼주기',
'channel': '이 다',
'channel_id': '0076a519f147ee9fd0959bf02f9571ca',
'channel_is_verified': False,
'view_count': int,
'duration': 28167,
'thumbnail': r're:https?://.+/.+\.jpg',
'timestamp': 1742139216.86,
'upload_date': '20250316',
'live_status': 'was_live',
},
'params': {'skip_download': 'm3u8'},
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@@ -147,11 +165,7 @@ def _real_extract(self, url):
live_status = 'was_live' if video_meta.get('liveOpenDate') else 'not_live' live_status = 'was_live' if video_meta.get('liveOpenDate') else 'not_live'
video_status = video_meta.get('vodStatus') video_status = video_meta.get('vodStatus')
if video_status == 'UPLOAD': if video_status == 'ABR_HLS':
playback = self._parse_json(video_meta['liveRewindPlaybackJson'], video_id)
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
playback['media'][0]['path'], video_id, 'mp4', m3u8_id='hls')
elif video_status == 'ABR_HLS':
formats, subtitles = self._extract_mpd_formats_and_subtitles( formats, subtitles = self._extract_mpd_formats_and_subtitles(
f'https://apis.naver.com/neonplayer/vodplay/v1/playback/{video_meta["videoId"]}', f'https://apis.naver.com/neonplayer/vodplay/v1/playback/{video_meta["videoId"]}',
video_id, query={ video_id, query={
@@ -161,6 +175,13 @@ def _real_extract(self, url):
'cpl': 'en_US', 'cpl': 'en_US',
}) })
else: else:
fatal = video_status == 'UPLOAD'
playback = self._parse_json(video_meta['liveRewindPlaybackJson'], video_id, fatal=fatal)
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
traverse_obj(playback, ('media', 0, 'path')), video_id, 'mp4', m3u8_id='hls', fatal=fatal)
if formats and video_status != 'UPLOAD':
self.write_debug(f'Video found with status: "{video_status}"')
elif not formats:
self.raise_no_formats( self.raise_no_formats(
f'Unknown video status detected: "{video_status}"', expected=True, video_id=video_id) f'Unknown video status detected: "{video_status}"', expected=True, video_id=video_id)
formats, subtitles = [], {} formats, subtitles = [], {}

View File

@@ -78,6 +78,7 @@
parse_iso8601, parse_iso8601,
parse_m3u8_attributes, parse_m3u8_attributes,
parse_resolution, parse_resolution,
qualities,
sanitize_url, sanitize_url,
smuggle_url, smuggle_url,
str_or_none, str_or_none,
@@ -1569,6 +1570,8 @@ def _yield_json_ld(self, html, video_id, *, fatal=True, default=NO_DEFAULT):
"""Yield all json ld objects in the html""" """Yield all json ld objects in the html"""
if default is not NO_DEFAULT: if default is not NO_DEFAULT:
fatal = False fatal = False
if not fatal and not isinstance(html, str):
return
for mobj in re.finditer(JSON_LD_RE, html): for mobj in re.finditer(JSON_LD_RE, html):
json_ld_item = self._parse_json( json_ld_item = self._parse_json(
mobj.group('json_ld'), video_id, fatal=fatal, mobj.group('json_ld'), video_id, fatal=fatal,
@@ -2177,6 +2180,8 @@ def extract_media(x_media_line):
media_url = media.get('URI') media_url = media.get('URI')
if media_url: if media_url:
manifest_url = format_url(media_url) manifest_url = format_url(media_url)
is_audio = media_type == 'AUDIO'
is_alternate = media.get('DEFAULT') == 'NO' or media.get('AUTOSELECT') == 'NO'
formats.extend({ formats.extend({
'format_id': join_nonempty(m3u8_id, group_id, name, idx), 'format_id': join_nonempty(m3u8_id, group_id, name, idx),
'format_note': name, 'format_note': name,
@@ -2189,7 +2194,11 @@ def extract_media(x_media_line):
'preference': preference, 'preference': preference,
'quality': quality, 'quality': quality,
'has_drm': has_drm, 'has_drm': has_drm,
'vcodec': 'none' if media_type == 'AUDIO' else None, 'vcodec': 'none' if is_audio else None,
# Alternate audio formats (e.g. audio description) should be deprioritized
'source_preference': -2 if is_audio and is_alternate else None,
# Save this to assign source_preference based on associated video stream
'_audio_group_id': group_id if is_audio and not is_alternate else None,
} for idx in _extract_m3u8_playlist_indices(manifest_url)) } for idx in _extract_m3u8_playlist_indices(manifest_url))
def build_stream_name(): def build_stream_name():
@@ -2284,6 +2293,8 @@ def build_stream_name():
# ignore references to rendition groups and treat them # ignore references to rendition groups and treat them
# as complete formats. # as complete formats.
if audio_group_id and codecs and f.get('vcodec') != 'none': if audio_group_id and codecs and f.get('vcodec') != 'none':
# Save this to determine quality of audio formats that only have a GROUP-ID
f['_audio_group_id'] = audio_group_id
audio_group = groups.get(audio_group_id) audio_group = groups.get(audio_group_id)
if audio_group and audio_group[0].get('URI'): if audio_group and audio_group[0].get('URI'):
# TODO: update acodec for audio only formats with # TODO: update acodec for audio only formats with
@@ -2306,6 +2317,28 @@ def build_stream_name():
formats.append(http_f) formats.append(http_f)
last_stream_inf = {} last_stream_inf = {}
# Some audio-only formats only have a GROUP-ID without any other quality/bitrate/codec info
# Each audio GROUP-ID corresponds with one or more video formats' AUDIO attribute
# For sorting purposes, set source_preference based on the quality of the video formats they are grouped with
# See https://github.com/yt-dlp/yt-dlp/issues/11178
audio_groups_by_quality = orderedSet(f['_audio_group_id'] for f in sorted(
traverse_obj(formats, lambda _, v: v.get('vcodec') != 'none' and v['_audio_group_id']),
key=lambda x: (x.get('tbr') or 0, x.get('width') or 0)))
audio_quality_map = {
audio_groups_by_quality[0]: 'low',
audio_groups_by_quality[-1]: 'high',
} if len(audio_groups_by_quality) > 1 else None
audio_preference = qualities(audio_groups_by_quality)
for fmt in formats:
audio_group_id = fmt.pop('_audio_group_id', None)
if not audio_quality_map or not audio_group_id or fmt.get('vcodec') != 'none':
continue
# Use source_preference since quality and preference are set by params
fmt['source_preference'] = audio_preference(audio_group_id)
fmt['format_note'] = join_nonempty(
fmt.get('format_note'), audio_quality_map.get(audio_group_id), delim=', ')
return formats, subtitles return formats, subtitles
def _extract_m3u8_vod_duration( def _extract_m3u8_vod_duration(

View File

@@ -5,7 +5,9 @@
int_or_none, int_or_none,
try_get, try_get,
unified_strdate, unified_strdate,
url_or_none,
) )
from ..utils.traversal import traverse_obj
class CrowdBunkerIE(InfoExtractor): class CrowdBunkerIE(InfoExtractor):
@@ -44,16 +46,15 @@ def _real_extract(self, url):
'url': sub_url, 'url': sub_url,
}) })
mpd_url = try_get(video_json, lambda x: x['dashManifest']['url']) if mpd_url := traverse_obj(video_json, ('dashManifest', 'url', {url_or_none})):
if mpd_url: fmts, subs = self._extract_mpd_formats_and_subtitles(mpd_url, video_id, mpd_id='dash', fatal=False)
fmts, subs = self._extract_mpd_formats_and_subtitles(mpd_url, video_id)
formats.extend(fmts) formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs) self._merge_subtitles(subs, target=subtitles)
m3u8_url = try_get(video_json, lambda x: x['hlsManifest']['url'])
if m3u8_url: if m3u8_url := traverse_obj(video_json, ('hlsManifest', 'url', {url_or_none})):
fmts, subs = self._extract_m3u8_formats_and_subtitles(mpd_url, video_id) fmts, subs = self._extract_m3u8_formats_and_subtitles(m3u8_url, video_id, m3u8_id='hls', fatal=False)
formats.extend(fmts) formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs) self._merge_subtitles(subs, target=subtitles)
thumbnails = [{ thumbnails = [{
'url': image['url'], 'url': image['url'],

View File

@@ -9,6 +9,7 @@
ExtractorError, ExtractorError,
classproperty, classproperty,
float_or_none, float_or_none,
parse_qs,
traverse_obj, traverse_obj,
url_or_none, url_or_none,
) )
@@ -91,11 +92,15 @@ def _usp_signing_secret(self):
# Rotates every so often, but hardcode a fallback in case of JS change/breakage before rotation # Rotates every so often, but hardcode a fallback in case of JS change/breakage before rotation
return self._search_regex( return self._search_regex(
r'\bUSP_SIGNING_SECRET\s*=\s*(["\'])(?P<secret>(?:(?!\1).)+)', player_js, r'\bUSP_SIGNING_SECRET\s*=\s*(["\'])(?P<secret>(?:(?!\1).)+)', player_js,
'usp signing secret', group='secret', fatal=False) or 'odnInCGqhvtyRTtIiddxtuRtawYYICZP' 'usp signing secret', group='secret', fatal=False) or 'hGDtqMKYVeFdofrAfFmBcrsakaZELajI'
def _real_extract(self, url): def _real_extract(self, url):
user_id, video_id = self._match_valid_url(url).group('user_id', 'id') user_id, video_id = self._match_valid_url(url).group('user_id', 'id')
query = {'contentId': f'{user_id}-vod-{video_id}', 'provider': 'universe'} query = {
'contentId': f'{user_id}-vod-{video_id}',
'provider': 'universe',
**traverse_obj(url, ({parse_qs}, 'uss_token', {'signedKey': -1})),
}
info = self._download_json(self._API_INFO_URL, video_id, query=query, fatal=False) info = self._download_json(self._API_INFO_URL, video_id, query=query, fatal=False)
access = self._download_json( access = self._download_json(
'https://playback.dacast.com/content/access', video_id, 'https://playback.dacast.com/content/access', video_id,

View File

@@ -1,142 +0,0 @@
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
orderedSet,
)
class DeezerBaseInfoExtractor(InfoExtractor):
def get_data(self, url):
if not self.get_param('test'):
self.report_warning('For now, this extractor only supports the 30 second previews. Patches welcome!')
mobj = self._match_valid_url(url)
data_id = mobj.group('id')
webpage = self._download_webpage(url, data_id)
geoblocking_msg = self._html_search_regex(
r'<p class="soon-txt">(.*?)</p>', webpage, 'geoblocking message',
default=None)
if geoblocking_msg is not None:
raise ExtractorError(
f'Deezer said: {geoblocking_msg}', expected=True)
data_json = self._search_regex(
(r'__DZR_APP_STATE__\s*=\s*({.+?})\s*</script>',
r'naboo\.display\(\'[^\']+\',\s*(.*?)\);\n'),
webpage, 'data JSON')
data = json.loads(data_json)
return data_id, webpage, data
class DeezerPlaylistIE(DeezerBaseInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?deezer\.com/(../)?playlist/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://www.deezer.com/playlist/176747451',
'info_dict': {
'id': '176747451',
'title': 'Best!',
'uploader': 'anonymous',
'thumbnail': r're:^https?://(e-)?cdns-images\.dzcdn\.net/images/cover/.*\.jpg$',
},
'playlist_count': 29,
}
def _real_extract(self, url):
playlist_id, webpage, data = self.get_data(url)
playlist_title = data.get('DATA', {}).get('TITLE')
playlist_uploader = data.get('DATA', {}).get('PARENT_USERNAME')
playlist_thumbnail = self._search_regex(
r'<img id="naboo_playlist_image".*?src="([^"]+)"', webpage,
'playlist thumbnail')
entries = []
for s in data.get('SONGS', {}).get('data'):
formats = [{
'format_id': 'preview',
'url': s.get('MEDIA', [{}])[0].get('HREF'),
'preference': -100, # Only the first 30 seconds
'ext': 'mp3',
}]
artists = ', '.join(
orderedSet(a.get('ART_NAME') for a in s.get('ARTISTS')))
entries.append({
'id': s.get('SNG_ID'),
'duration': int_or_none(s.get('DURATION')),
'title': '{} - {}'.format(artists, s.get('SNG_TITLE')),
'uploader': s.get('ART_NAME'),
'uploader_id': s.get('ART_ID'),
'age_limit': 16 if s.get('EXPLICIT_LYRICS') == '1' else 0,
'formats': formats,
})
return {
'_type': 'playlist',
'id': playlist_id,
'title': playlist_title,
'uploader': playlist_uploader,
'thumbnail': playlist_thumbnail,
'entries': entries,
}
class DeezerAlbumIE(DeezerBaseInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?deezer\.com/(../)?album/(?P<id>[0-9]+)'
_TEST = {
'url': 'https://www.deezer.com/fr/album/67505622',
'info_dict': {
'id': '67505622',
'title': 'Last Week',
'uploader': 'Home Brew',
'thumbnail': r're:^https?://(e-)?cdns-images\.dzcdn\.net/images/cover/.*\.jpg$',
},
'playlist_count': 7,
}
def _real_extract(self, url):
album_id, webpage, data = self.get_data(url)
album_title = data.get('DATA', {}).get('ALB_TITLE')
album_uploader = data.get('DATA', {}).get('ART_NAME')
album_thumbnail = self._search_regex(
r'<img id="naboo_album_image".*?src="([^"]+)"', webpage,
'album thumbnail')
entries = []
for s in data.get('SONGS', {}).get('data'):
formats = [{
'format_id': 'preview',
'url': s.get('MEDIA', [{}])[0].get('HREF'),
'preference': -100, # Only the first 30 seconds
'ext': 'mp3',
}]
artists = ', '.join(
orderedSet(a.get('ART_NAME') for a in s.get('ARTISTS')))
entries.append({
'id': s.get('SNG_ID'),
'duration': int_or_none(s.get('DURATION')),
'title': '{} - {}'.format(artists, s.get('SNG_TITLE')),
'uploader': s.get('ART_NAME'),
'uploader_id': s.get('ART_ID'),
'age_limit': 16 if s.get('EXPLICIT_LYRICS') == '1' else 0,
'formats': formats,
'track': s.get('SNG_TITLE'),
'track_number': int_or_none(s.get('TRACK_NUMBER')),
'track_id': s.get('SNG_ID'),
'artist': album_uploader,
'album': album_title,
'album_artist': album_uploader,
})
return {
'_type': 'playlist',
'id': album_id,
'title': album_title,
'uploader': album_uploader,
'thumbnail': album_thumbnail,
'entries': entries,
}

View File

@@ -1,9 +1,15 @@
from .zdf import ZDFBaseIE from .zdf import ZDFBaseIE
from ..utils import (
int_or_none,
merge_dicts,
parse_iso8601,
)
from ..utils.traversal import require, traverse_obj
class DreiSatIE(ZDFBaseIE): class DreiSatIE(ZDFBaseIE):
IE_NAME = '3sat' IE_NAME = '3sat'
_VALID_URL = r'https?://(?:www\.)?3sat\.de/(?:[^/]+/)*(?P<id>[^/?#&]+)\.html' _VALID_URL = r'https?://(?:www\.)?3sat\.de/(?:[^/?#]+/)*(?P<id>[^/?#&]+)\.html'
_TESTS = [{ _TESTS = [{
'url': 'https://www.3sat.de/dokumentation/reise/traumziele-suedostasiens-die-philippinen-und-vietnam-102.html', 'url': 'https://www.3sat.de/dokumentation/reise/traumziele-suedostasiens-die-philippinen-und-vietnam-102.html',
'info_dict': { 'info_dict': {
@@ -12,40 +18,59 @@ class DreiSatIE(ZDFBaseIE):
'title': 'Traumziele Südostasiens (1/2): Die Philippinen und Vietnam', 'title': 'Traumziele Südostasiens (1/2): Die Philippinen und Vietnam',
'description': 'md5:26329ce5197775b596773b939354079d', 'description': 'md5:26329ce5197775b596773b939354079d',
'duration': 2625.0, 'duration': 2625.0,
'thumbnail': 'https://www.3sat.de/assets/traumziele-suedostasiens-die-philippinen-und-vietnam-100~2400x1350?cb=1699870351148', 'thumbnail': 'https://www.3sat.de/assets/traumziele-suedostasiens-die-philippinen-und-vietnam-100~original?cb=1699870351148',
'episode': 'Traumziele Südostasiens (1/2): Die Philippinen und Vietnam', 'episode': 'Traumziele Südostasiens (1/2): Die Philippinen und Vietnam',
'episode_id': 'POS_cc7ff51c-98cf-4d12-b99d-f7a551de1c95', 'episode_id': 'POS_cc7ff51c-98cf-4d12-b99d-f7a551de1c95',
'timestamp': 1738593000, 'timestamp': 1747920900,
'upload_date': '20250203', 'upload_date': '20250522',
}, },
}, { }, {
# Same as https://www.zdf.de/dokumentation/ab-18/10-wochen-sommer-102.html 'url': 'https://www.3sat.de/film/ab-18/ab-18---mein-fremdes-ich-100.html',
'url': 'https://www.3sat.de/film/ab-18/10-wochen-sommer-108.html', 'md5': 'f92638413a11d759bdae95c9d8ec165c',
'md5': '0aff3e7bc72c8813f5e0fae333316a1d',
'info_dict': { 'info_dict': {
'id': '141007_ab18_10wochensommer_film', 'id': '221128_mein_fremdes_ich2_ab18',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Ab 18! - 10 Wochen Sommer', 'title': 'Ab 18! - Mein fremdes Ich',
'description': 'md5:8253f41dc99ce2c3ff892dac2d65fe26', 'description': 'md5:cae0c0b27b7426d62ca0dda181738bf0',
'duration': 2660, 'duration': 2625.0,
'timestamp': 1608604200, 'thumbnail': 'https://www.3sat.de/assets/ab-18---mein-fremdes-ich-106~original?cb=1666081865812',
'upload_date': '20201222', 'episode': 'Ab 18! - Mein fremdes Ich',
'episode_id': 'POS_6225d1ca-a0d5-45e3-870b-e783ee6c8a3f',
'timestamp': 1695081600,
'upload_date': '20230919',
}, },
'skip': '410 Gone',
}, { }, {
'url': 'https://www.3sat.de/gesellschaft/schweizweit/waidmannsheil-100.html', 'url': 'https://www.3sat.de/gesellschaft/37-grad-leben/aus-dem-leben-gerissen-102.html',
'md5': 'a903eaf8d1fd635bd3317cd2ad87ec84',
'info_dict': { 'info_dict': {
'id': '140913_sendung_schweizweit', 'id': '250323_0903_sendung_sgl',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Waidmannsheil', 'title': 'Plötzlich ohne dich',
'description': 'md5:cce00ca1d70e21425e72c86a98a56817', 'description': 'md5:380cc10659289dd91510ad8fa717c66b',
'timestamp': 1410623100, 'duration': 1620.0,
'upload_date': '20140913', 'thumbnail': 'https://www.3sat.de/assets/37-grad-leben-106~original?cb=1645537156810',
'episode': 'Plötzlich ohne dich',
'episode_id': 'POS_faa7a93c-c0f2-4d51-823f-ce2ac3ee191b',
'timestamp': 1743162540,
'upload_date': '20250328',
}, },
'params': { }, {
'skip_download': True, # Video with chapters
'url': 'https://www.3sat.de/kultur/buchmesse/dein-buch-das-beste-von-der-leipziger-buchmesse-2025-teil-1-100.html',
'md5': '6b95790ce52e75f0d050adcdd2711ee6',
'info_dict': {
'id': '250330_dein_buch1_bum',
'ext': 'mp4',
'title': 'dein buch - Das Beste von der Leipziger Buchmesse 2025 - Teil 1',
'description': 'md5:bae51bfc22f15563ce3acbf97d2e8844',
'duration': 5399.0,
'thumbnail': 'https://www.3sat.de/assets/buchmesse-kerkeling-100~original?cb=1743329640903',
'chapters': 'count:24',
'episode': 'dein buch - Das Beste von der Leipziger Buchmesse 2025 - Teil 1',
'episode_id': 'POS_1ef236cc-b390-401e-acd0-4fb4b04315fb',
'timestamp': 1743327000,
'upload_date': '20250330',
}, },
'skip': '404 Not Found',
}, { }, {
# Same as https://www.zdf.de/filme/filme-sonstige/der-hauptmann-112.html # Same as https://www.zdf.de/filme/filme-sonstige/der-hauptmann-112.html
'url': 'https://www.3sat.de/film/spielfilm/der-hauptmann-100.html', 'url': 'https://www.3sat.de/film/spielfilm/der-hauptmann-100.html',
@@ -58,11 +83,42 @@ class DreiSatIE(ZDFBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
player = self._search_json(
r'data-zdfplayer-jsb=(["\'])', webpage, 'player JSON', video_id)
player_url = player['content']
api_token = f'Bearer {player["apiToken"]}'
webpage = self._download_webpage(url, video_id, fatal=False) content = self._call_api(player_url, video_id, 'video metadata', api_token)
if webpage:
player = self._extract_player(webpage, url, fatal=False)
if player:
return self._extract_regular(url, player, video_id)
return self._extract_mobile(video_id) video_target = content['mainVideoContent']['http://zdf.de/rels/target']
ptmd_path = traverse_obj(video_target, (
(('streams', 'default'), None),
('http://zdf.de/rels/streams/ptmd', 'http://zdf.de/rels/streams/ptmd-template'),
{str}, any, {require('ptmd path')}))
ptmd_url = self._expand_ptmd_template(player_url, ptmd_path)
aspect_ratio = self._parse_aspect_ratio(video_target.get('aspectRatio'))
info = self._extract_ptmd(ptmd_url, video_id, api_token, aspect_ratio)
return merge_dicts(info, {
**traverse_obj(content, {
'title': (('title', 'teaserHeadline'), {str}, any),
'episode': (('title', 'teaserHeadline'), {str}, any),
'description': (('leadParagraph', 'teasertext'), {str}, any),
'timestamp': ('editorialDate', {parse_iso8601}),
}),
**traverse_obj(video_target, {
'duration': ('duration', {int_or_none}),
'chapters': ('streamAnchorTag', {self._extract_chapters}),
}),
'thumbnails': self._extract_thumbnails(traverse_obj(content, ('teaserImageRef', 'layouts', {dict}))),
**traverse_obj(content, ('programmeItem', 0, 'http://zdf.de/rels/target', {
'series_id': ('http://zdf.de/rels/cmdm/series', 'seriesUuid', {str}),
'series': ('http://zdf.de/rels/cmdm/series', 'seriesTitle', {str}),
'season': ('http://zdf.de/rels/cmdm/season', 'seasonTitle', {str}),
'season_number': ('http://zdf.de/rels/cmdm/season', 'seasonNumber', {int_or_none}),
'season_id': ('http://zdf.de/rels/cmdm/season', 'seasonUuid', {str}),
'episode_number': ('episodeNumber', {int_or_none}),
'episode_id': ('contentId', {str}),
})),
})

View File

@@ -0,0 +1,87 @@
import urllib.parse
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
float_or_none,
url_or_none,
)
from ..utils.traversal import traverse_obj
class FrancaisFacileIE(InfoExtractor):
_VALID_URL = r'https?://francaisfacile\.rfi\.fr/[a-z]{2}/(?:actualit%C3%A9|podcasts/[^/#?]+)/(?P<id>[^/#?]+)'
_TESTS = [{
'url': 'https://francaisfacile.rfi.fr/fr/actualit%C3%A9/20250305-r%C3%A9concilier-les-jeunes-avec-la-lecture-gr%C3%A2ce-aux-r%C3%A9seaux-sociaux',
'md5': '4f33674cb205744345cc835991100afa',
'info_dict': {
'id': 'WBMZ58952-FLE-FR-20250305',
'display_id': '20250305-réconcilier-les-jeunes-avec-la-lecture-grâce-aux-réseaux-sociaux',
'title': 'Réconcilier les jeunes avec la lecture grâce aux réseaux sociaux',
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/05/6b6af52a-f9ba-11ef-a1f8-005056a97652.mp3',
'ext': 'mp3',
'description': 'md5:b903c63d8585bd59e8cc4d5f80c4272d',
'duration': 103.15,
'timestamp': 1741177984,
'upload_date': '20250305',
},
}, {
'url': 'https://francaisfacile.rfi.fr/fr/actualit%C3%A9/20250307-argentine-le-sac-d-un-alpiniste-retrouv%C3%A9-40-ans-apr%C3%A8s-sa-mort',
'md5': 'b8c3a63652d4ae8e8092dda5700c1cd9',
'info_dict': {
'id': 'WBMZ59102-FLE-FR-20250307',
'display_id': '20250307-argentine-le-sac-d-un-alpiniste-retrouvé-40-ans-après-sa-mort',
'title': 'Argentine: le sac d\'un alpiniste retrouvé 40 ans après sa mort',
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/07/8edf4082-fb46-11ef-8a37-005056bf762b.mp3',
'ext': 'mp3',
'description': 'md5:7fd088fbdf4a943bb68cf82462160dca',
'duration': 117.74,
'timestamp': 1741352789,
'upload_date': '20250307',
},
}, {
'url': 'https://francaisfacile.rfi.fr/fr/podcasts/un-mot-une-histoire/20250317-le-mot-de-david-foenkinos-peut-%C3%AAtre',
'md5': 'db83c2cc2589b4c24571c6b6cf14f5f1',
'info_dict': {
'id': 'WBMZ59441-FLE-FR-20250317',
'display_id': '20250317-le-mot-de-david-foenkinos-peut-être',
'title': 'Le mot de David Foenkinos: «peut-être» - Un mot, une histoire',
'url': 'https://aod-fle.akamaized.net/fle/sounds/fr/2025/03/17/4ca6cbbe-0315-11f0-a85b-005056a97652.mp3',
'ext': 'mp3',
'description': 'md5:3fe35fae035803df696bfa7af2496e49',
'duration': 198.96,
'timestamp': 1742210897,
'upload_date': '20250317',
},
}]
def _real_extract(self, url):
display_id = urllib.parse.unquote(self._match_id(url))
try: # yt-dlp's default user-agents are too old and blocked by the site
webpage = self._download_webpage(url, display_id, headers={
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:136.0) Gecko/20100101 Firefox/136.0',
})
except ExtractorError as e:
if not isinstance(e.cause, HTTPError) or e.cause.status != 403:
raise
# Retry with impersonation if hardcoded UA is insufficient
webpage = self._download_webpage(url, display_id, impersonate=True)
data = self._search_json(
r'<script[^>]+\bdata-media-id=[^>]+\btype="application/json"[^>]*>',
webpage, 'audio data', display_id)
return {
'id': data['mediaId'],
'display_id': display_id,
'vcodec': 'none',
'title': self._html_extract_title(webpage),
**self._search_json_ld(webpage, display_id, fatal=False),
**traverse_obj(data, {
'title': ('title', {str}),
'url': ('sources', ..., 'url', {url_or_none}, any),
'duration': ('sources', ..., 'duration', {float_or_none}, any),
}),
}

View File

@@ -37,6 +37,7 @@
unescapeHTML, unescapeHTML,
unified_timestamp, unified_timestamp,
unsmuggle_url, unsmuggle_url,
update_url,
update_url_query, update_url_query,
url_or_none, url_or_none,
urlhandle_detect_ext, urlhandle_detect_ext,
@@ -2213,10 +2214,21 @@ def hex_or_none(value):
if is_live is not None: if is_live is not None:
info['live_status'] = 'not_live' if is_live == 'false' else 'is_live' info['live_status'] = 'not_live' if is_live == 'false' else 'is_live'
return return
headers = m3u8_format.get('http_headers') or info.get('http_headers') headers = m3u8_format.get('http_headers') or info.get('http_headers') or {}
duration = self._extract_m3u8_vod_duration( display_id = info.get('id')
m3u8_format['url'], info.get('id'), note='Checking m3u8 live status', urlh = self._request_webpage(
errnote='Failed to download m3u8 media playlist', headers=headers) m3u8_format['url'], display_id, 'Checking m3u8 live status', errnote=False,
headers={**headers, 'Accept-Encoding': 'identity'}, fatal=False)
if urlh is False:
return
first_bytes = urlh.read(512)
if not first_bytes.startswith(b'#EXTM3U'):
return
m3u8_doc = self._webpage_read_content(
urlh, urlh.url, display_id, prefix=first_bytes, fatal=False, errnote=False)
if not m3u8_doc:
return
duration = self._parse_m3u8_vod_duration(m3u8_doc, display_id)
if not duration: if not duration:
info['live_status'] = 'is_live' info['live_status'] = 'is_live'
info['duration'] = info.get('duration') or duration info['duration'] = info.get('duration') or duration
@@ -2526,12 +2538,13 @@ def _real_extract(self, url):
return self.playlist_result( return self.playlist_result(
self._parse_xspf( self._parse_xspf(
doc, video_id, xspf_url=url, doc, video_id, xspf_url=url,
xspf_base_url=full_response.url), xspf_base_url=new_url),
video_id) video_id)
elif re.match(r'(?i)^(?:{[^}]+})?MPD$', doc.tag): elif re.match(r'(?i)^(?:{[^}]+})?MPD$', doc.tag):
info_dict['formats'], info_dict['subtitles'] = self._parse_mpd_formats_and_subtitles( info_dict['formats'], info_dict['subtitles'] = self._parse_mpd_formats_and_subtitles(
doc, doc,
mpd_base_url=full_response.url.rpartition('/')[0], # Do not use yt_dlp.utils.base_url here since it will raise on file:// URLs
mpd_base_url=update_url(new_url, query=None, fragment=None).rpartition('/')[0],
mpd_url=url) mpd_url=url)
info_dict['live_status'] = 'is_live' if doc.get('type') == 'dynamic' else None info_dict['live_status'] = 'is_live' if doc.get('type') == 'dynamic' else None
self._extra_manifest_info(info_dict, url) self._extra_manifest_info(info_dict, url)

View File

@@ -8,7 +8,7 @@
class GetCourseRuPlayerIE(InfoExtractor): class GetCourseRuPlayerIE(InfoExtractor):
_VALID_URL = r'https?://player02\.getcourse\.ru/sign-player/?\?(?:[^#]+&)?json=[^#&]+' _VALID_URL = r'https?://(?:player02\.getcourse\.ru|cf-api-2\.vhcdn\.com)/sign-player/?\?(?:[^#]+&)?json=[^#&]+'
_EMBED_REGEX = [rf'<iframe[^>]+\bsrc=[\'"](?P<url>{_VALID_URL}[^\'"]*)'] _EMBED_REGEX = [rf'<iframe[^>]+\bsrc=[\'"](?P<url>{_VALID_URL}[^\'"]*)']
_TESTS = [{ _TESTS = [{
'url': 'http://player02.getcourse.ru/sign-player/?json=eyJ2aWRlb19oYXNoIjoiMTkwYmRmOTNmMWIyOTczNTMwOTg1M2E3YTE5ZTI0YjMiLCJ1c2VyX2lkIjozNTk1MjUxODMsInN1Yl9sb2dpbl91c2VyX2lkIjpudWxsLCJsZXNzb25faWQiOm51bGwsImlwIjoiNDYuMTQyLjE4Mi4yNDciLCJnY19ob3N0IjoiYWNhZGVteW1lbC5vbmxpbmUiLCJ0aW1lIjoxNzA1NDQ5NjQyLCJwYXlsb2FkIjoidV8zNTk1MjUxODMiLCJ1aV9sYW5ndWFnZSI6InJ1IiwiaXNfaGF2ZV9jdXN0b21fc3R5bGUiOnRydWV9&s=354ad2c993d95d5ac629e3133d6cefea&vh-static-feature=zigzag', 'url': 'http://player02.getcourse.ru/sign-player/?json=eyJ2aWRlb19oYXNoIjoiMTkwYmRmOTNmMWIyOTczNTMwOTg1M2E3YTE5ZTI0YjMiLCJ1c2VyX2lkIjozNTk1MjUxODMsInN1Yl9sb2dpbl91c2VyX2lkIjpudWxsLCJsZXNzb25faWQiOm51bGwsImlwIjoiNDYuMTQyLjE4Mi4yNDciLCJnY19ob3N0IjoiYWNhZGVteW1lbC5vbmxpbmUiLCJ0aW1lIjoxNzA1NDQ5NjQyLCJwYXlsb2FkIjoidV8zNTk1MjUxODMiLCJ1aV9sYW5ndWFnZSI6InJ1IiwiaXNfaGF2ZV9jdXN0b21fc3R5bGUiOnRydWV9&s=354ad2c993d95d5ac629e3133d6cefea&vh-static-feature=zigzag',
@@ -20,6 +20,16 @@ class GetCourseRuPlayerIE(InfoExtractor):
'duration': 1693, 'duration': 1693,
}, },
'skip': 'JWT expired', 'skip': 'JWT expired',
}, {
'url': 'https://cf-api-2.vhcdn.com/sign-player/?json=example',
'info_dict': {
'id': '435735291',
'title': '8afd7c489952108e00f019590f3711f3',
'ext': 'mp4',
'thumbnail': 'https://preview-htz.vhcdn.com/preview/8afd7c489952108e00f019590f3711f3/preview.jpg?version=1682170973&host=vh-72',
'duration': 777,
},
'skip': 'JWT expired',
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@@ -168,7 +178,7 @@ def _real_extract(self, url):
playlist_id = self._search_regex( playlist_id = self._search_regex(
r'window\.(?:lessonId|gcsObjectId)\s*=\s*(\d+)', webpage, 'playlist id', default=display_id) r'window\.(?:lessonId|gcsObjectId)\s*=\s*(\d+)', webpage, 'playlist id', default=display_id)
title = self._og_search_title(webpage) or self._html_extract_title(webpage) title = self._og_search_title(webpage, default=None) or self._html_extract_title(webpage)
return self.playlist_from_matches( return self.playlist_from_matches(
re.findall(GetCourseRuPlayerIE._EMBED_REGEX[0], webpage), re.findall(GetCourseRuPlayerIE._EMBED_REGEX[0], webpage),

View File

@@ -6,7 +6,7 @@
) )
class HSEShowBaseInfoExtractor(InfoExtractor): class HSEShowBaseIE(InfoExtractor):
_GEO_COUNTRIES = ['DE'] _GEO_COUNTRIES = ['DE']
def _extract_redux_data(self, url, video_id): def _extract_redux_data(self, url, video_id):
@@ -28,7 +28,7 @@ def _extract_formats_and_subtitles(self, sources, video_id):
return formats, subtitles return formats, subtitles
class HSEShowIE(HSEShowBaseInfoExtractor): class HSEShowIE(HSEShowBaseIE):
_VALID_URL = r'https?://(?:www\.)?hse\.de/dpl/c/tv-shows/(?P<id>[0-9]+)' _VALID_URL = r'https?://(?:www\.)?hse\.de/dpl/c/tv-shows/(?P<id>[0-9]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.hse.de/dpl/c/tv-shows/505350', 'url': 'https://www.hse.de/dpl/c/tv-shows/505350',
@@ -64,7 +64,7 @@ def _real_extract(self, url):
} }
class HSEProductIE(HSEShowBaseInfoExtractor): class HSEProductIE(HSEShowBaseIE):
_VALID_URL = r'https?://(?:www\.)?hse\.de/dpl/p/product/(?P<id>[0-9]+)' _VALID_URL = r'https?://(?:www\.)?hse\.de/dpl/p/product/(?P<id>[0-9]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.hse.de/dpl/p/product/408630', 'url': 'https://www.hse.de/dpl/p/product/408630',

View File

@@ -1,5 +1,13 @@
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ExtractorError, str_or_none, traverse_obj, unified_strdate from ..utils import (
ExtractorError,
int_or_none,
str_or_none,
traverse_obj,
unified_strdate,
url_or_none,
)
class IchinanaLiveIE(InfoExtractor): class IchinanaLiveIE(InfoExtractor):
@@ -157,3 +165,51 @@ def _real_extract(self, url):
'description': view_data.get('caption'), 'description': view_data.get('caption'),
'upload_date': unified_strdate(str_or_none(view_data.get('createdAt'))), 'upload_date': unified_strdate(str_or_none(view_data.get('createdAt'))),
} }
class IchinanaLiveVODIE(InfoExtractor):
IE_NAME = '17live:vod'
_VALID_URL = r'https?://(?:www\.)?17\.live/ja/vod/[^/?#]+/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://17.live/ja/vod/27323042/2cf84520-e65e-4b22-891e-1d3a00b0f068',
'md5': '3299b930d7457b069639486998a89580',
'info_dict': {
'id': '2cf84520-e65e-4b22-891e-1d3a00b0f068',
'ext': 'mp4',
'title': 'md5:b5f8cbf497d54cc6a60eb3b480182f01',
'uploader': 'md5:29fb12122ab94b5a8495586e7c3085a5',
'uploader_id': '27323042',
'channel': '🌟オールナイトニッポン アーカイブ🌟',
'channel_id': '2b4f85f1-d61e-429d-a901-68d32bdd8645',
'like_count': int,
'view_count': int,
'thumbnail': r're:https?://.+/.+\.(?:jpe?g|png)',
'duration': 549,
'description': 'md5:116f326579700f00eaaf5581aae1192e',
'timestamp': 1741058645,
'upload_date': '20250304',
},
}, {
'url': 'https://17.live/ja/vod/27323042/0de11bac-9bea-40b8-9eab-0239a7d88079',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
json_data = self._download_json(f'https://wap-api.17app.co/api/v1/vods/{video_id}', video_id)
return traverse_obj(json_data, {
'id': ('vodID', {str}),
'title': ('title', {str}),
'formats': ('vodURL', {lambda x: self._extract_m3u8_formats(x, video_id)}),
'uploader': ('userInfo', 'displayName', {str}),
'uploader_id': ('userInfo', 'roomID', {int}, {str_or_none}),
'channel': ('userInfo', 'name', {str}),
'channel_id': ('userInfo', 'userID', {str}),
'like_count': ('likeCount', {int_or_none}),
'view_count': ('viewCount', {int_or_none}),
'thumbnail': ('imageURL', {url_or_none}),
'duration': ('duration', {int_or_none}),
'description': ('description', {str}),
'timestamp': ('createdAt', {int_or_none}),
})

78
yt_dlp/extractor/ivoox.py Normal file
View File

@@ -0,0 +1,78 @@
from .common import InfoExtractor
from ..utils import int_or_none, parse_iso8601, url_or_none, urljoin
from ..utils.traversal import traverse_obj
class IvooxIE(InfoExtractor):
_VALID_URL = (
r'https?://(?:www\.)?ivoox\.com/(?:\w{2}/)?[^/?#]+_rf_(?P<id>[0-9]+)_1\.html',
r'https?://go\.ivoox\.com/rf/(?P<id>[0-9]+)',
)
_TESTS = [{
'url': 'https://www.ivoox.com/dex-08x30-rostros-del-mal-los-asesinos-en-audios-mp3_rf_143594959_1.html',
'md5': '993f712de5b7d552459fc66aa3726885',
'info_dict': {
'id': '143594959',
'ext': 'mp3',
'timestamp': 1742731200,
'channel': 'DIAS EXTRAÑOS con Santiago Camacho',
'title': 'DEx 08x30 Rostros del mal: Los asesinos en serie que aterrorizaron España',
'description': 'md5:eae8b4b9740d0216d3871390b056bb08',
'uploader': 'Santiago Camacho',
'thumbnail': 'https://static-1.ivoox.com/audios/c/d/5/2/cd52f46783fe735000c33a803dce2554_XXL.jpg',
'upload_date': '20250323',
'episode': 'DEx 08x30 Rostros del mal: Los asesinos en serie que aterrorizaron España',
'duration': 11837,
'tags': ['españa', 'asesinos en serie', 'arropiero', 'historia criminal', 'mataviejas'],
},
}, {
'url': 'https://go.ivoox.com/rf/143594959',
'only_matching': True,
}, {
'url': 'https://www.ivoox.com/en/campodelgas-28-03-2025-audios-mp3_rf_144036942_1.html',
'only_matching': True,
}]
def _real_extract(self, url):
media_id = self._match_id(url)
webpage = self._download_webpage(url, media_id, fatal=False)
data = self._search_nuxt_data(
webpage, media_id, fatal=False, traverse=('data', 0, 'data', 'audio'))
direct_download = self._download_json(
f'https://vcore-web.ivoox.com/v1/public/audios/{media_id}/download-url', media_id, fatal=False,
note='Fetching direct download link', headers={'Referer': url})
download_paths = {
*traverse_obj(direct_download, ('data', 'downloadUrl', {str}, filter, all)),
*traverse_obj(data, (('downloadUrl', 'mediaUrl'), {str}, filter)),
}
formats = []
for path in download_paths:
formats.append({
'url': urljoin('https://ivoox.com', path),
'http_headers': {'Referer': url},
})
return {
'id': media_id,
'formats': formats,
'uploader': self._html_search_regex(r'data-prm-author="([^"]+)"', webpage, 'author', default=None),
'timestamp': parse_iso8601(
self._html_search_regex(r'data-prm-pubdate="([^"]+)"', webpage, 'timestamp', default=None)),
'channel': self._html_search_regex(r'data-prm-podname="([^"]+)"', webpage, 'channel', default=None),
'title': self._html_search_regex(r'data-prm-title="([^"]+)"', webpage, 'title', default=None),
'thumbnail': self._og_search_thumbnail(webpage, default=None),
'description': self._og_search_description(webpage, default=None),
**self._search_json_ld(webpage, media_id, default={}),
**traverse_obj(data, {
'title': ('title', {str}),
'description': ('description', {str}),
'thumbnail': ('image', {url_or_none}),
'timestamp': ('uploadDate', {parse_iso8601(delimiter=' ')}),
'duration': ('duration', {int_or_none}),
'tags': ('tags', ..., 'name', {str}),
}),
}

View File

@@ -1,3 +1,5 @@
import itertools
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
determine_ext, determine_ext,
@@ -124,3 +126,43 @@ def _extract_formats(self, media_info, video_id):
'vbr': ('bitrateVideo', {int_or_none}, {lambda x: None if x == -1 else x}), 'vbr': ('bitrateVideo', {int_or_none}, {lambda x: None if x == -1 else x}),
}), }),
} }
class KikaPlaylistIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?kika\.de/[\w-]+/(?P<id>[a-z-]+\d+)'
_TESTS = [{
'url': 'https://www.kika.de/logo/logo-die-welt-und-ich-562',
'info_dict': {
'id': 'logo-die-welt-und-ich-562',
'title': 'logo!',
'description': 'md5:7b9d7f65561b82fa512f2cfb553c397d',
},
'playlist_count': 100,
}]
def _entries(self, playlist_url, playlist_id):
for page in itertools.count(1):
data = self._download_json(playlist_url, playlist_id, note=f'Downloading page {page}')
for item in traverse_obj(data, ('content', lambda _, v: url_or_none(v['api']['url']))):
yield self.url_result(
item['api']['url'], ie=KikaIE,
**traverse_obj(item, {
'id': ('id', {str}),
'title': ('title', {str}),
'duration': ('duration', {int_or_none}),
'timestamp': ('date', {parse_iso8601}),
}))
playlist_url = traverse_obj(data, ('links', 'next', {url_or_none}))
if not playlist_url:
break
def _real_extract(self, url):
playlist_id = self._match_id(url)
brand_data = self._download_json(
f'https://www.kika.de/_next-api/proxy/v1/brands/{playlist_id}', playlist_id)
return self.playlist_result(
self._entries(brand_data['videoSubchannel']['videosPageUrl'], playlist_id),
playlist_id, title=brand_data.get('title'), description=brand_data.get('description'))

View File

@@ -1,4 +1,5 @@
import itertools import itertools
import json
import re import re
from .common import InfoExtractor from .common import InfoExtractor
@@ -9,12 +10,12 @@
int_or_none, int_or_none,
mimetype2ext, mimetype2ext,
srt_subtitles_timecode, srt_subtitles_timecode,
traverse_obj,
try_get, try_get,
url_or_none, url_or_none,
urlencode_postdata, urlencode_postdata,
urljoin, urljoin,
) )
from ..utils.traversal import find_elements, require, traverse_obj
class LinkedInBaseIE(InfoExtractor): class LinkedInBaseIE(InfoExtractor):
@@ -82,7 +83,10 @@ def _get_video_id(self, video_data, course_slug, video_slug):
class LinkedInIE(LinkedInBaseIE): class LinkedInIE(LinkedInBaseIE):
_VALID_URL = r'https?://(?:www\.)?linkedin\.com/posts/[^/?#]+-(?P<id>\d+)-\w{4}/?(?:[?#]|$)' _VALID_URL = [
r'https?://(?:www\.)?linkedin\.com/posts/[^/?#]+-(?P<id>\d+)-\w{4}/?(?:[?#]|$)',
r'https?://(?:www\.)?linkedin\.com/feed/update/urn:li:activity:(?P<id>\d+)',
]
_TESTS = [{ _TESTS = [{
'url': 'https://www.linkedin.com/posts/mishalkhawaja_sendinblueviews-toronto-digitalmarketing-ugcPost-6850898786781339649-mM20', 'url': 'https://www.linkedin.com/posts/mishalkhawaja_sendinblueviews-toronto-digitalmarketing-ugcPost-6850898786781339649-mM20',
'info_dict': { 'info_dict': {
@@ -106,6 +110,9 @@ class LinkedInIE(LinkedInBaseIE):
'like_count': int, 'like_count': int,
'subtitles': 'mincount:1', 'subtitles': 'mincount:1',
}, },
}, {
'url': 'https://www.linkedin.com/feed/update/urn:li:activity:7016901149999955968/?utm_source=share&utm_medium=member_desktop',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@@ -271,3 +278,110 @@ def _real_extract(self, url):
entries, course_slug, entries, course_slug,
course_data.get('title'), course_data.get('title'),
course_data.get('description')) course_data.get('description'))
class LinkedInEventsIE(LinkedInBaseIE):
IE_NAME = 'linkedin:events'
_VALID_URL = r'https?://(?:www\.)?linkedin\.com/events/(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://www.linkedin.com/events/7084656651378536448/comments/',
'info_dict': {
'id': '7084656651378536448',
'ext': 'mp4',
'title': '#37 Aprende a hacer una entrevista en inglés para tu próximo trabajo remoto',
'description': '¡Agarra para anotar que se viene tremendo evento!',
'duration': 1765,
'timestamp': 1689113772,
'upload_date': '20230711',
'release_timestamp': 1689174012,
'release_date': '20230712',
'live_status': 'was_live',
},
}, {
'url': 'https://www.linkedin.com/events/27-02energyfreedombyenergyclub7295762520814874625/comments/',
'info_dict': {
'id': '27-02energyfreedombyenergyclub7295762520814874625',
'ext': 'mp4',
'title': '27.02 Energy Freedom by Energy Club',
'description': 'md5:1292e6f31df998914c293787a02c3b91',
'duration': 6420,
'timestamp': 1739445333,
'upload_date': '20250213',
'release_timestamp': 1740657620,
'release_date': '20250227',
'live_status': 'was_live',
},
}]
def _real_initialize(self):
if not self._get_cookies('https://www.linkedin.com/').get('li_at'):
self.raise_login_required()
def _real_extract(self, url):
event_id = self._match_id(url)
webpage = self._download_webpage(url, event_id)
base_data = traverse_obj(webpage, (
{find_elements(tag='code', attr='style', value='display: none')}, ..., {json.loads}, 'included', ...))
meta_data = traverse_obj(base_data, (
lambda _, v: v['$type'] == 'com.linkedin.voyager.dash.events.ProfessionalEvent', any)) or {}
live_status = {
'PAST': 'was_live',
'ONGOING': 'is_live',
'FUTURE': 'is_upcoming',
}.get(meta_data.get('lifecycleState'))
if live_status == 'is_upcoming':
player_data = {}
if event_time := traverse_obj(meta_data, ('displayEventTime', {str})):
message = f'This live event is scheduled for {event_time}'
else:
message = 'This live event has not yet started'
self.raise_no_formats(message, expected=True, video_id=event_id)
else:
# TODO: Add support for audio-only live events
player_data = traverse_obj(base_data, (
lambda _, v: v['$type'] == 'com.linkedin.videocontent.VideoPlayMetadata',
any, {require('video player data')}))
formats, subtitles = [], {}
for prog_fmts in traverse_obj(player_data, ('progressiveStreams', ..., {dict})):
for fmt_url in traverse_obj(prog_fmts, ('streamingLocations', ..., 'url', {url_or_none})):
formats.append({
'url': fmt_url,
**traverse_obj(prog_fmts, {
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
'tbr': ('bitRate', {int_or_none(scale=1000)}),
'filesize': ('size', {int_or_none}),
'ext': ('mediaType', {mimetype2ext}),
}),
})
for m3u8_url in traverse_obj(player_data, (
'adaptiveStreams', lambda _, v: v['protocol'] == 'HLS', 'masterPlaylists', ..., 'url', {url_or_none},
)):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
m3u8_url, event_id, 'mp4', m3u8_id='hls', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
return {
'id': event_id,
'formats': formats,
'subtitles': subtitles,
'live_status': live_status,
**traverse_obj(meta_data, {
'title': ('name', {str}),
'description': ('description', 'text', {str}),
'timestamp': ('createdAt', {int_or_none(scale=1000)}),
# timeRange.start is available when the stream is_upcoming
'release_timestamp': ('timeRange', 'start', {int_or_none(scale=1000)}),
}),
**traverse_obj(player_data, {
'duration': ('duration', {int_or_none(scale=1000)}),
# liveStreamCreatedAt is only available when the stream is_live or was_live
'release_timestamp': ('liveStreamCreatedAt', {int_or_none(scale=1000)}),
}),
}

View File

@@ -1,5 +1,9 @@
import json
import random
import time
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import int_or_none, url_or_none from ..utils import int_or_none, jwt_decode_hs256, try_call, url_or_none
from ..utils.traversal import require, traverse_obj from ..utils.traversal import require, traverse_obj
@@ -55,13 +59,81 @@ class LocoIE(InfoExtractor):
'upload_date': '20250226', 'upload_date': '20250226',
'modified_date': '20250226', 'modified_date': '20250226',
}, },
}, {
# Requires video authorization
'url': 'https://loco.com/stream/ac854641-ae0f-497c-a8ea-4195f6d8cc53',
'md5': '0513edf85c1e65c9521f555f665387d5',
'info_dict': {
'id': 'ac854641-ae0f-497c-a8ea-4195f6d8cc53',
'ext': 'mp4',
'title': 'DUAS CONTAS DESAFIANTE, RUSH TOP 1 NO BRASIL!',
'description': 'md5:aa77818edd6fe00dd4b6be75cba5f826',
'uploader_id': '7Y9JNAZC3Q',
'channel': 'ayellol',
'channel_follower_count': int,
'comment_count': int,
'view_count': int,
'concurrent_view_count': int,
'like_count': int,
'duration': 1229,
'thumbnail': 'https://static.ivory.getloconow.com/default_thumb/f5aa678b-6d04-45d9-a89a-859af0a8028f.jpg',
'tags': ['Gameplay', 'Carry'],
'series': 'League of Legends',
'timestamp': 1741182253,
'upload_date': '20250305',
'modified_timestamp': 1741182419,
'modified_date': '20250305',
},
}] }]
# From _app.js
_CLIENT_ID = 'TlwKp1zmF6eKFpcisn3FyR18WkhcPkZtzwPVEEC3'
_CLIENT_SECRET = 'Kp7tYlUN7LXvtcSpwYvIitgYcLparbtsQSe5AdyyCdiEJBP53Vt9J8eB4AsLdChIpcO2BM19RA3HsGtqDJFjWmwoonvMSG3ZQmnS8x1YIM8yl82xMXZGbE3NKiqmgBVU'
def _is_jwt_expired(self, token):
return jwt_decode_hs256(token)['exp'] - time.time() < 300
def _get_access_token(self, video_id):
access_token = try_call(lambda: self._get_cookies('https://loco.com')['access_token'].value)
if access_token and not self._is_jwt_expired(access_token):
return access_token
access_token = traverse_obj(self._download_json(
'https://api.getloconow.com/v3/user/device_profile/', video_id,
'Downloading access token', fatal=False, data=json.dumps({
'platform': 7,
'client_id': self._CLIENT_ID,
'client_secret': self._CLIENT_SECRET,
'model': 'Mozilla',
'os_name': 'Win32',
'os_ver': '5.0 (Windows)',
'app_ver': '5.0 (Windows)',
}).encode(), headers={
'Content-Type': 'application/json;charset=utf-8',
'DEVICE-ID': ''.join(random.choices('0123456789abcdef', k=32)) + 'live',
'X-APP-LANG': 'en',
'X-APP-LOCALE': 'en-US',
'X-CLIENT-ID': self._CLIENT_ID,
'X-CLIENT-SECRET': self._CLIENT_SECRET,
'X-PLATFORM': '7',
}), 'access_token')
if access_token and not self._is_jwt_expired(access_token):
self._set_cookie('.loco.com', 'access_token', access_token)
return access_token
def _real_extract(self, url): def _real_extract(self, url):
video_type, video_id = self._match_valid_url(url).group('type', 'id') video_type, video_id = self._match_valid_url(url).group('type', 'id')
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
stream = traverse_obj(self._search_nextjs_data(webpage, video_id), ( stream = traverse_obj(self._search_nextjs_data(webpage, video_id), (
'props', 'pageProps', ('liveStreamData', 'stream'), {dict}, any, {require('stream info')})) 'props', 'pageProps', ('liveStreamData', 'stream', 'liveStream'), {dict}, any, {require('stream info')}))
if access_token := self._get_access_token(video_id):
self._request_webpage(
'https://drm.loco.com/v1/streams/playback/', video_id,
'Downloading video authorization', fatal=False, headers={
'authorization': access_token,
}, query={
'stream_uid': stream['uid'],
})
return { return {
'formats': self._extract_m3u8_formats(stream['conf']['hls'], video_id), 'formats': self._extract_m3u8_formats(stream['conf']['hls'], video_id),

View File

@@ -2,8 +2,11 @@
from ..utils import ( from ..utils import (
clean_html, clean_html,
merge_dicts, merge_dicts,
str_or_none,
traverse_obj, traverse_obj,
unified_timestamp,
url_or_none, url_or_none,
urljoin,
) )
@@ -80,7 +83,7 @@ class LRTVODIE(LRTBaseIE):
}] }]
def _real_extract(self, url): def _real_extract(self, url):
path, video_id = self._match_valid_url(url).groups() path, video_id = self._match_valid_url(url).group('path', 'id')
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
media_url = self._extract_js_var(webpage, 'main_url', path) media_url = self._extract_js_var(webpage, 'main_url', path)
@@ -106,3 +109,42 @@ def _real_extract(self, url):
} }
return merge_dicts(clean_info, jw_data, json_ld_data) return merge_dicts(clean_info, jw_data, json_ld_data)
class LRTRadioIE(LRTBaseIE):
_VALID_URL = r'https?://(?:www\.)?lrt\.lt/radioteka/irasas/(?P<id>\d+)/(?P<path>[^?#/]+)'
_TESTS = [{
# m3u8 download
'url': 'https://www.lrt.lt/radioteka/irasas/2000359728/nemarios-eiles-apie-pragarus-ir-skaistyklas-su-aiste-kiltinaviciute',
'info_dict': {
'id': '2000359728',
'ext': 'm4a',
'title': 'Nemarios eilės: apie pragarus ir skaistyklas su Aiste Kiltinavičiūte',
'description': 'md5:5eee9a0e86a55bf547bd67596204625d',
'timestamp': 1726143120,
'upload_date': '20240912',
'tags': 'count:5',
'thumbnail': r're:https?://.+/.+\.jpe?g',
'categories': ['Daiktiniai įrodymai'],
},
}, {
'url': 'https://www.lrt.lt/radioteka/irasas/2000304654/vakaras-su-knyga-svetlana-aleksijevic-cernobylio-malda-v-dalis?season=%2Fmediateka%2Faudio%2Fvakaras-su-knyga%2F2023',
'only_matching': True,
}]
def _real_extract(self, url):
video_id, path = self._match_valid_url(url).group('id', 'path')
media = self._download_json(
'https://www.lrt.lt/radioteka/api/media', video_id,
query={'url': f'/mediateka/irasas/{video_id}/{path}'})
return traverse_obj(media, {
'id': ('id', {int}, {str_or_none}),
'title': ('title', {str}),
'tags': ('tags', ..., 'name', {str}),
'categories': ('playlist_item', 'category', {str}, filter, all, filter),
'description': ('content', {clean_html}, {str}),
'timestamp': ('date', {lambda x: x.replace('.', '/')}, {unified_timestamp}),
'thumbnail': ('playlist_item', 'image', {urljoin('https://www.lrt.lt')}),
'formats': ('playlist_item', 'file', {lambda x: self._extract_m3u8_formats(x, video_id)}),
})

View File

@@ -1,31 +1,38 @@
import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
clean_html,
determine_ext, determine_ext,
extract_attributes,
int_or_none, int_or_none,
str_to_int, join_nonempty,
parse_count,
parse_duration,
parse_iso8601,
url_or_none, url_or_none,
urlencode_postdata,
) )
from ..utils.traversal import traverse_obj
class ManyVidsIE(InfoExtractor): class ManyVidsIE(InfoExtractor):
_WORKING = False
_VALID_URL = r'(?i)https?://(?:www\.)?manyvids\.com/video/(?P<id>\d+)' _VALID_URL = r'(?i)https?://(?:www\.)?manyvids\.com/video/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
# preview video # preview video
'url': 'https://www.manyvids.com/Video/133957/everthing-about-me/', 'url': 'https://www.manyvids.com/Video/530341/mv-tips-tricks',
'md5': '03f11bb21c52dd12a05be21a5c7dcc97', 'md5': '738dc723f7735ee9602f7ea352a6d058',
'info_dict': { 'info_dict': {
'id': '133957', 'id': '530341-preview',
'ext': 'mp4', 'ext': 'mp4',
'title': 'everthing about me (Preview)', 'title': 'MV Tips & Tricks (Preview)',
'uploader': 'ellyxxix', 'description': r're:I will take you on a tour around .{1313}$',
'thumbnail': r're:https://cdn5\.manyvids\.com/php_uploads/video_images/DestinyDiaz/.+\.jpg',
'uploader': 'DestinyDiaz',
'view_count': int, 'view_count': int,
'like_count': int, 'like_count': int,
'release_timestamp': 1508419904,
'tags': ['AdultSchool', 'BBW', 'SFW', 'TeacherFetish'],
'release_date': '20171019',
'duration': 3167.0,
}, },
'expected_warnings': ['Only extracting preview'],
}, { }, {
# full video # full video
'url': 'https://www.manyvids.com/Video/935718/MY-FACE-REVEAL/', 'url': 'https://www.manyvids.com/Video/935718/MY-FACE-REVEAL/',
@@ -34,129 +41,68 @@ class ManyVidsIE(InfoExtractor):
'id': '935718', 'id': '935718',
'ext': 'mp4', 'ext': 'mp4',
'title': 'MY FACE REVEAL', 'title': 'MY FACE REVEAL',
'description': 'md5:ec5901d41808b3746fed90face161612', 'description': r're:Today is the day!! I am finally taking off my mask .{445}$',
'thumbnail': r're:https://ods\.manyvids\.com/1001061960/3aa5397f2a723ec4597e344df66ab845/screenshots/.+\.jpg',
'uploader': 'Sarah Calanthe', 'uploader': 'Sarah Calanthe',
'view_count': int, 'view_count': int,
'like_count': int, 'like_count': int,
'release_date': '20181110',
'tags': ['EyeContact', 'Interviews', 'MaskFetish', 'MouthFetish', 'Redhead'],
'release_timestamp': 1541851200,
'duration': 224.0,
}, },
}] }]
_API_BASE = 'https://www.manyvids.com/bff/store/video'
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
video_data = self._download_json(f'{self._API_BASE}/{video_id}/private', video_id)['data']
formats, preview_only = [], True
real_url = f'https://www.manyvids.com/video/{video_id}/gtm.js' for format_id, path in [
try: ('preview', ['teaser', 'filepath']),
webpage = self._download_webpage(real_url, video_id) ('transcoded', ['transcodedFilepath']),
except Exception: ('filepath', ['filepath']),
# probably useless fallback ]:
webpage = self._download_webpage(url, video_id) format_url = traverse_obj(video_data, (*path, {url_or_none}))
if not format_url:
info = self._search_regex(
r'''(<div\b[^>]*\bid\s*=\s*(['"])pageMetaDetails\2[^>]*>)''',
webpage, 'meta details', default='')
info = extract_attributes(info)
player = self._search_regex(
r'''(<div\b[^>]*\bid\s*=\s*(['"])rmpPlayerStream\2[^>]*>)''',
webpage, 'player details', default='')
player = extract_attributes(player)
video_urls_and_ids = (
(info.get('data-meta-video'), 'video'),
(player.get('data-video-transcoded'), 'transcoded'),
(player.get('data-video-filepath'), 'filepath'),
(self._og_search_video_url(webpage, secure=False, default=None), 'og_video'),
)
def txt_or_none(s, default=None):
return (s.strip() or default) if isinstance(s, str) else default
uploader = txt_or_none(info.get('data-meta-author'))
def mung_title(s):
if uploader:
s = re.sub(rf'^\s*{re.escape(uploader)}\s+[|-]', '', s)
return txt_or_none(s)
title = (
mung_title(info.get('data-meta-title'))
or self._html_search_regex(
(r'<span[^>]+class=["\']item-title[^>]+>([^<]+)',
r'<h2[^>]+class=["\']h2 m-0["\'][^>]*>([^<]+)'),
webpage, 'title', default=None)
or self._html_search_meta(
'twitter:title', webpage, 'title', fatal=True))
title = re.sub(r'\s*[|-]\s+ManyVids\s*$', '', title) or title
if any(p in webpage for p in ('preview_videos', '_preview.mp4')):
title += ' (Preview)'
mv_token = self._search_regex(
r'data-mvtoken=(["\'])(?P<value>(?:(?!\1).)+)\1', webpage,
'mv token', default=None, group='value')
if mv_token:
# Sets some cookies
self._download_webpage(
'https://www.manyvids.com/includes/ajax_repository/you_had_me_at_hello.php',
video_id, note='Setting format cookies', fatal=False,
data=urlencode_postdata({
'mvtoken': mv_token,
'vid': video_id,
}), headers={
'Referer': url,
'X-Requested-With': 'XMLHttpRequest',
})
formats = []
for v_url, fmt in video_urls_and_ids:
v_url = url_or_none(v_url)
if not v_url:
continue continue
if determine_ext(v_url) == 'm3u8': if determine_ext(format_url) == 'm3u8':
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(format_url, video_id, 'mp4', m3u8_id=format_id))
v_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls'))
else: else:
formats.append({ formats.append({
'url': v_url, 'url': format_url,
'format_id': fmt, 'format_id': format_id,
'preference': -10 if format_id == 'preview' else None,
'quality': 10 if format_id == 'filepath' else None,
'height': int_or_none(
self._search_regex(r'_(\d{2,3}[02468])_', format_url, 'height', default=None)),
}) })
if format_id != 'preview':
preview_only = False
self._remove_duplicate_formats(formats) metadata = traverse_obj(
self._download_json(f'{self._API_BASE}/{video_id}', video_id, fatal=False), 'data')
title = traverse_obj(metadata, ('title', {clean_html}))
for f in formats: if preview_only:
if f.get('height') is None: title = join_nonempty(title, '(Preview)', delim=' ')
f['height'] = int_or_none( video_id += '-preview'
self._search_regex(r'_(\d{2,3}[02468])_', f['url'], 'video height', default=None)) self.report_warning(
if '/preview/' in f['url']: f'Only extracting preview. Video may be paid or subscription only. {self._login_hint()}')
f['format_id'] = '_'.join(filter(None, (f.get('format_id'), 'preview')))
f['preference'] = -10
if 'transcoded' in f['format_id']:
f['preference'] = f.get('preference', -1) - 1
def get_likes():
likes = self._search_regex(
rf'''(<a\b[^>]*\bdata-id\s*=\s*(['"]){video_id}\2[^>]*>)''',
webpage, 'likes', default='')
likes = extract_attributes(likes)
return int_or_none(likes.get('data-likes'))
def get_views():
return str_to_int(self._html_search_regex(
r'''(?s)<span\b[^>]*\bclass\s*=["']views-wrapper\b[^>]+>.+?<span\b[^>]+>\s*(\d[\d,.]*)\s*</span>''',
webpage, 'view count', default=None))
return { return {
'id': video_id, 'id': video_id,
'title': title, 'title': title,
'formats': formats, 'formats': formats,
'description': txt_or_none(info.get('data-meta-description')), **traverse_obj(metadata, {
'uploader': txt_or_none(info.get('data-meta-author')), 'description': ('description', {clean_html}),
'thumbnail': ( 'uploader': ('model', 'displayName', {clean_html}),
url_or_none(info.get('data-meta-image')) 'thumbnail': (('screenshot', 'thumbnail'), {url_or_none}, any),
or url_or_none(player.get('data-video-screenshot'))), 'view_count': ('views', {parse_count}),
'view_count': get_views(), 'like_count': ('likes', {parse_count}),
'like_count': get_likes(), 'release_timestamp': ('launchDate', {parse_iso8601}),
'duration': ('videoDuration', {parse_duration}),
'tags': ('tagList', ..., 'label', {str}, filter, all, filter),
}),
} }

View File

@@ -4,6 +4,7 @@
from ..utils import ( from ..utils import (
int_or_none, int_or_none,
parse_iso8601, parse_iso8601,
parse_resolution,
traverse_obj, traverse_obj,
unified_timestamp, unified_timestamp,
url_basename, url_basename,
@@ -83,8 +84,8 @@ def _sub_to_dict(subtitle_list):
subtitles.setdefault(sub.pop('tag', 'und'), []).append(sub) subtitles.setdefault(sub.pop('tag', 'und'), []).append(sub)
return subtitles return subtitles
def _extract_ism(self, ism_url, video_id): def _extract_ism(self, ism_url, video_id, fatal=True):
formats = self._extract_ism_formats(ism_url, video_id) formats = self._extract_ism_formats(ism_url, video_id, fatal=fatal)
for fmt in formats: for fmt in formats:
if fmt['language'] != 'eng' and 'English' not in fmt['format_id']: if fmt['language'] != 'eng' and 'English' not in fmt['format_id']:
fmt['language_preference'] = -10 fmt['language_preference'] = -10
@@ -218,9 +219,21 @@ class MicrosoftLearnEpisodeIE(MicrosoftMediusBaseIE):
'description': 'md5:7bbbfb593d21c2cf2babc3715ade6b88', 'description': 'md5:7bbbfb593d21c2cf2babc3715ade6b88',
'timestamp': 1676339547, 'timestamp': 1676339547,
'upload_date': '20230214', 'upload_date': '20230214',
'thumbnail': r're:https://learn\.microsoft\.com/video/media/.*\.png', 'thumbnail': r're:https://learn\.microsoft\.com/video/media/.+\.png',
'subtitles': 'count:14', 'subtitles': 'count:14',
}, },
}, {
'url': 'https://learn.microsoft.com/en-gb/shows/on-demand-instructor-led-training-series/az-900-module-1',
'info_dict': {
'id': '4fe10f7c-d83c-463b-ac0e-c30a8195e01b',
'ext': 'mp4',
'title': 'AZ-900 Cloud fundamentals (1 of 6)',
'description': 'md5:3c2212ce865e9142f402c766441bd5c9',
'thumbnail': r're:https://.+/.+\.jpg',
'timestamp': 1706605184,
'upload_date': '20240130',
},
'params': {'format': 'bv[protocol=https]'},
}] }]
def _real_extract(self, url): def _real_extract(self, url):
@@ -230,9 +243,32 @@ def _real_extract(self, url):
entry_id = self._html_search_meta('entryId', webpage, 'entryId', fatal=True) entry_id = self._html_search_meta('entryId', webpage, 'entryId', fatal=True)
video_info = self._download_json( video_info = self._download_json(
f'https://learn.microsoft.com/api/video/public/v1/entries/{entry_id}', video_id) f'https://learn.microsoft.com/api/video/public/v1/entries/{entry_id}', video_id)
formats = []
if ism_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoUrl', {url_or_none})):
formats.extend(self._extract_ism(ism_url, video_id, fatal=False))
if hls_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoHLSUrl', {url_or_none})):
formats.extend(self._extract_m3u8_formats(hls_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
if mpd_url := traverse_obj(video_info, ('publicVideo', 'adaptiveVideoDashUrl', {url_or_none})):
formats.extend(self._extract_mpd_formats(mpd_url, video_id, mpd_id='dash', fatal=False))
for key in ('low', 'medium', 'high'):
if video_url := traverse_obj(video_info, ('publicVideo', f'{key}QualityVideoUrl', {url_or_none})):
formats.append({
'url': video_url,
'format_id': f'video-http-{key}',
'acodec': 'none',
**parse_resolution(video_url),
})
if audio_url := traverse_obj(video_info, ('publicVideo', 'audioUrl', {url_or_none})):
formats.append({
'url': audio_url,
'format_id': 'audio-http',
'vcodec': 'none',
})
return { return {
'id': entry_id, 'id': entry_id,
'formats': self._extract_ism(video_info['publicVideo']['adaptiveVideoUrl'], video_id), 'formats': formats,
'subtitles': self._sub_to_dict(traverse_obj(video_info, ( 'subtitles': self._sub_to_dict(traverse_obj(video_info, (
'publicVideo', 'captions', lambda _, v: url_or_none(v['url']), { 'publicVideo', 'captions', lambda _, v: url_or_none(v['url']), {
'tag': ('language', {str}), 'tag': ('language', {str}),

View File

@@ -10,7 +10,9 @@
parse_iso8601, parse_iso8601,
strip_or_none, strip_or_none,
try_get, try_get,
url_or_none,
) )
from ..utils.traversal import traverse_obj
class MixcloudBaseIE(InfoExtractor): class MixcloudBaseIE(InfoExtractor):
@@ -37,7 +39,7 @@ class MixcloudIE(MixcloudBaseIE):
'ext': 'm4a', 'ext': 'm4a',
'title': 'Cryptkeeper', 'title': 'Cryptkeeper',
'description': 'After quite a long silence from myself, finally another Drum\'n\'Bass mix with my favourite current dance floor bangers.', 'description': 'After quite a long silence from myself, finally another Drum\'n\'Bass mix with my favourite current dance floor bangers.',
'uploader': 'Daniel Holbach', 'uploader': 'dholbach',
'uploader_id': 'dholbach', 'uploader_id': 'dholbach',
'thumbnail': r're:https?://.*\.jpg', 'thumbnail': r're:https?://.*\.jpg',
'view_count': int, 'view_count': int,
@@ -46,10 +48,11 @@ class MixcloudIE(MixcloudBaseIE):
'uploader_url': 'https://www.mixcloud.com/dholbach/', 'uploader_url': 'https://www.mixcloud.com/dholbach/',
'artist': 'Submorphics & Chino , Telekinesis, Porter Robinson, Enei, Breakage ft Jess Mills', 'artist': 'Submorphics & Chino , Telekinesis, Porter Robinson, Enei, Breakage ft Jess Mills',
'duration': 3723, 'duration': 3723,
'tags': [], 'tags': ['liquid drum and bass', 'drum and bass'],
'comment_count': int, 'comment_count': int,
'repost_count': int, 'repost_count': int,
'like_count': int, 'like_count': int,
'artists': list,
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
}, { }, {
@@ -67,7 +70,7 @@ class MixcloudIE(MixcloudBaseIE):
'upload_date': '20150203', 'upload_date': '20150203',
'uploader_url': 'https://www.mixcloud.com/gillespeterson/', 'uploader_url': 'https://www.mixcloud.com/gillespeterson/',
'duration': 2992, 'duration': 2992,
'tags': [], 'tags': ['jazz', 'soul', 'world music', 'funk'],
'comment_count': int, 'comment_count': int,
'repost_count': int, 'repost_count': int,
'like_count': int, 'like_count': int,
@@ -149,8 +152,6 @@ def _real_extract(self, url):
elif reason: elif reason:
raise ExtractorError('Track is restricted', expected=True) raise ExtractorError('Track is restricted', expected=True)
title = cloudcast['name']
stream_info = cloudcast['streamInfo'] stream_info = cloudcast['streamInfo']
formats = [] formats = []
@@ -182,47 +183,39 @@ def _real_extract(self, url):
self.raise_login_required(metadata_available=True) self.raise_login_required(metadata_available=True)
comments = [] comments = []
for edge in (try_get(cloudcast, lambda x: x['comments']['edges']) or []): for node in traverse_obj(cloudcast, ('comments', 'edges', ..., 'node', {dict})):
node = edge.get('node') or {}
text = strip_or_none(node.get('comment')) text = strip_or_none(node.get('comment'))
if not text: if not text:
continue continue
user = node.get('user') or {}
comments.append({ comments.append({
'author': user.get('displayName'),
'author_id': user.get('username'),
'text': text, 'text': text,
'timestamp': parse_iso8601(node.get('created')), **traverse_obj(node, {
'author': ('user', 'displayName', {str}),
'author_id': ('user', 'username', {str}),
'timestamp': ('created', {parse_iso8601}),
}),
}) })
tags = []
for t in cloudcast.get('tags'):
tag = try_get(t, lambda x: x['tag']['name'], str)
if not tag:
tags.append(tag)
get_count = lambda x: int_or_none(try_get(cloudcast, lambda y: y[x]['totalCount']))
owner = cloudcast.get('owner') or {}
return { return {
'id': track_id, 'id': track_id,
'title': title,
'formats': formats, 'formats': formats,
'description': cloudcast.get('description'),
'thumbnail': try_get(cloudcast, lambda x: x['picture']['url'], str),
'uploader': owner.get('displayName'),
'timestamp': parse_iso8601(cloudcast.get('publishDate')),
'uploader_id': owner.get('username'),
'uploader_url': owner.get('url'),
'duration': int_or_none(cloudcast.get('audioLength')),
'view_count': int_or_none(cloudcast.get('plays')),
'like_count': get_count('favorites'),
'repost_count': get_count('reposts'),
'comment_count': get_count('comments'),
'comments': comments, 'comments': comments,
'tags': tags, **traverse_obj(cloudcast, {
'artist': ', '.join(cloudcast.get('featuringArtistList') or []) or None, 'title': ('name', {str}),
'description': ('description', {str}),
'thumbnail': ('picture', 'url', {url_or_none}),
'timestamp': ('publishDate', {parse_iso8601}),
'duration': ('audioLength', {int_or_none}),
'uploader': ('owner', 'displayName', {str}),
'uploader_id': ('owner', 'username', {str}),
'uploader_url': ('owner', 'url', {url_or_none}),
'view_count': ('plays', {int_or_none}),
'like_count': ('favorites', 'totalCount', {int_or_none}),
'repost_count': ('reposts', 'totalCount', {int_or_none}),
'comment_count': ('comments', 'totalCount', {int_or_none}),
'tags': ('tags', ..., 'tag', 'name', {str}, filter, all, filter),
'artists': ('featuringArtistList', ..., {str}, filter, all, filter),
}),
} }
@@ -295,7 +288,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
'url': 'http://www.mixcloud.com/dholbach/', 'url': 'http://www.mixcloud.com/dholbach/',
'info_dict': { 'info_dict': {
'id': 'dholbach_uploads', 'id': 'dholbach_uploads',
'title': 'Daniel Holbach (uploads)', 'title': 'dholbach (uploads)',
'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b', 'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b',
}, },
'playlist_mincount': 36, 'playlist_mincount': 36,
@@ -303,7 +296,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
'url': 'http://www.mixcloud.com/dholbach/uploads/', 'url': 'http://www.mixcloud.com/dholbach/uploads/',
'info_dict': { 'info_dict': {
'id': 'dholbach_uploads', 'id': 'dholbach_uploads',
'title': 'Daniel Holbach (uploads)', 'title': 'dholbach (uploads)',
'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b', 'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b',
}, },
'playlist_mincount': 36, 'playlist_mincount': 36,
@@ -311,7 +304,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
'url': 'http://www.mixcloud.com/dholbach/favorites/', 'url': 'http://www.mixcloud.com/dholbach/favorites/',
'info_dict': { 'info_dict': {
'id': 'dholbach_favorites', 'id': 'dholbach_favorites',
'title': 'Daniel Holbach (favorites)', 'title': 'dholbach (favorites)',
'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b', 'description': 'md5:a3f468a60ac8c3e1f8616380fc469b2b',
}, },
# 'params': { # 'params': {
@@ -337,7 +330,7 @@ class MixcloudUserIE(MixcloudPlaylistBaseIE):
'title': 'First Ear (stream)', 'title': 'First Ear (stream)',
'description': 'we maraud for ears', 'description': 'we maraud for ears',
}, },
'playlist_mincount': 269, 'playlist_mincount': 267,
}] }]
_TITLE_KEY = 'displayName' _TITLE_KEY = 'displayName'
@@ -361,7 +354,7 @@ class MixcloudPlaylistIE(MixcloudPlaylistBaseIE):
'id': 'maxvibes_jazzcat-on-ness-radio', 'id': 'maxvibes_jazzcat-on-ness-radio',
'title': 'Ness Radio sessions', 'title': 'Ness Radio sessions',
}, },
'playlist_mincount': 59, 'playlist_mincount': 58,
}] }]
_TITLE_KEY = 'name' _TITLE_KEY = 'name'
_DESCRIPTION_KEY = 'description' _DESCRIPTION_KEY = 'description'

View File

@@ -365,13 +365,15 @@ def _real_initialize(self):
'All videos are only available to registered users', method='password') 'All videos are only available to registered users', method='password')
def _set_device_id(self, username): def _set_device_id(self, username):
if not self._device_id: if self._device_id:
self._device_id = self.cache.load( return
self._NETRC_MACHINE, 'device_ids', default={}).get(username) device_id_cache = self.cache.load(self._NETRC_MACHINE, 'device_ids', default={})
self._device_id = device_id_cache.get(username)
if self._device_id: if self._device_id:
return return
self._device_id = str(uuid.uuid4()) self._device_id = str(uuid.uuid4())
self.cache.store(self._NETRC_MACHINE, 'device_ids', {username: self._device_id}) device_id_cache[username] = self._device_id
self.cache.store(self._NETRC_MACHINE, 'device_ids', device_id_cache)
def _perform_login(self, username, password): def _perform_login(self, username, password):
try: try:
@@ -449,9 +451,7 @@ def _extract_formats_and_subtitles(self, broadcast, video_id):
if not (m3u8_url and token): if not (m3u8_url and token):
errors = '; '.join(traverse_obj(response, ('errors', ..., 'message', {str}))) errors = '; '.join(traverse_obj(response, ('errors', ..., 'message', {str})))
if 'not entitled' in errors: if errors: # Only warn when 'blacked out' or 'not entitled'; radio formats may be available
raise ExtractorError(errors, expected=True)
elif errors: # Only warn when 'blacked out' since radio formats are available
self.report_warning(f'API returned errors for {format_id}: {errors}') self.report_warning(f'API returned errors for {format_id}: {errors}')
else: else:
self.report_warning(f'No formats available for {format_id} broadcast; skipping') self.report_warning(f'No formats available for {format_id} broadcast; skipping')

View File

@@ -3,8 +3,8 @@
class MoviepilotIE(InfoExtractor): class MoviepilotIE(InfoExtractor):
_IE_NAME = 'moviepilot' IE_NAME = 'moviepilot'
_IE_DESC = 'Moviepilot trailer' IE_DESC = 'Moviepilot trailer'
_VALID_URL = r'https?://(?:www\.)?moviepilot\.de/movies/(?P<id>[^/]+)' _VALID_URL = r'https?://(?:www\.)?moviepilot\.de/movies/(?P<id>[^/]+)'
_TESTS = [{ _TESTS = [{

View File

@@ -16,7 +16,6 @@
determine_ext, determine_ext,
float_or_none, float_or_none,
int_or_none, int_or_none,
join_nonempty,
parse_duration, parse_duration,
parse_iso8601, parse_iso8601,
parse_qs, parse_qs,
@@ -24,22 +23,79 @@
qualities, qualities,
remove_start, remove_start,
str_or_none, str_or_none,
traverse_obj,
try_get, try_get,
unescapeHTML, unescapeHTML,
unified_timestamp,
update_url_query, update_url_query,
url_basename, url_basename,
url_or_none, url_or_none,
urlencode_postdata, urlencode_postdata,
urljoin, urljoin,
) )
from ..utils.traversal import find_element, traverse_obj
class NiconicoIE(InfoExtractor): class NiconicoBaseIE(InfoExtractor):
_GEO_BYPASS = False
_GEO_COUNTRIES = ['JP']
_LOGIN_BASE = 'https://account.nicovideo.jp'
_NETRC_MACHINE = 'niconico'
@property
def is_logged_in(self):
return bool(self._get_cookies('https://www.nicovideo.jp').get('user_session'))
def _raise_login_error(self, message, expected=True):
raise ExtractorError(f'Unable to login: {message}', expected=expected)
def _perform_login(self, username, password):
if self.is_logged_in:
return
self._request_webpage(
f'{self._LOGIN_BASE}/login', None, 'Requesting session cookies')
webpage = self._download_webpage(
f'{self._LOGIN_BASE}/login/redirector', None,
'Logging in', 'Unable to log in', headers={
'Content-Type': 'application/x-www-form-urlencoded',
'Referer': f'{self._LOGIN_BASE}/login',
}, data=urlencode_postdata({
'mail_tel': username,
'password': password,
}))
if self.is_logged_in:
return
elif err_msg := traverse_obj(webpage, (
{find_element(cls='notice error')}, {find_element(cls='notice__text')}, {clean_html},
)):
self._raise_login_error(err_msg or 'Invalid username or password')
elif 'oneTimePw' in webpage:
post_url = self._search_regex(
r'<form[^>]+action=(["\'])(?P<url>.+?)\1', webpage, 'post url', group='url')
mfa, urlh = self._download_webpage_handle(
urljoin(self._LOGIN_BASE, post_url), None,
'Performing MFA', 'Unable to complete MFA', headers={
'Content-Type': 'application/x-www-form-urlencoded',
}, data=urlencode_postdata({
'otp': self._get_tfa_info('6 digit number shown on app'),
}))
if self.is_logged_in:
return
elif 'error-code' in parse_qs(urlh.url):
err_msg = traverse_obj(mfa, ({find_element(cls='pageMainMsg')}, {clean_html}))
self._raise_login_error(err_msg or 'MFA session expired')
elif 'formError' in mfa:
err_msg = traverse_obj(mfa, (
{find_element(cls='formError')}, {find_element(tag='div')}, {clean_html}))
self._raise_login_error(err_msg or 'MFA challenge failed')
self._raise_login_error('Unexpected login error', expected=False)
class NiconicoIE(NiconicoBaseIE):
IE_NAME = 'niconico' IE_NAME = 'niconico'
IE_DESC = 'ニコニコ動画' IE_DESC = 'ニコニコ動画'
_GEO_COUNTRIES = ['JP']
_GEO_BYPASS = False
_TESTS = [{ _TESTS = [{
'url': 'http://www.nicovideo.jp/watch/sm22312215', 'url': 'http://www.nicovideo.jp/watch/sm22312215',
@@ -179,229 +235,6 @@ class NiconicoIE(InfoExtractor):
}] }]
_VALID_URL = r'https?://(?:(?:www\.|secure\.|sp\.)?nicovideo\.jp/watch|nico\.ms)/(?P<id>(?:[a-z]{2})?[0-9]+)' _VALID_URL = r'https?://(?:(?:www\.|secure\.|sp\.)?nicovideo\.jp/watch|nico\.ms)/(?P<id>(?:[a-z]{2})?[0-9]+)'
_NETRC_MACHINE = 'niconico'
_API_HEADERS = {
'X-Frontend-ID': '6',
'X-Frontend-Version': '0',
'X-Niconico-Language': 'en-us',
'Referer': 'https://www.nicovideo.jp/',
'Origin': 'https://www.nicovideo.jp',
}
def _perform_login(self, username, password):
login_ok = True
login_form_strs = {
'mail_tel': username,
'password': password,
}
self._request_webpage(
'https://account.nicovideo.jp/login', None,
note='Acquiring Login session')
page = self._download_webpage(
'https://account.nicovideo.jp/login/redirector?show_button_twitter=1&site=niconico&show_button_facebook=1', None,
note='Logging in', errnote='Unable to log in',
data=urlencode_postdata(login_form_strs),
headers={
'Referer': 'https://account.nicovideo.jp/login',
'Content-Type': 'application/x-www-form-urlencoded',
})
if 'oneTimePw' in page:
post_url = self._search_regex(
r'<form[^>]+action=(["\'])(?P<url>.+?)\1', page, 'post url', group='url')
page = self._download_webpage(
urljoin('https://account.nicovideo.jp', post_url), None,
note='Performing MFA', errnote='Unable to complete MFA',
data=urlencode_postdata({
'otp': self._get_tfa_info('6 digits code'),
}), headers={
'Content-Type': 'application/x-www-form-urlencoded',
})
if 'oneTimePw' in page or 'formError' in page:
err_msg = self._html_search_regex(
r'formError["\']+>(.*?)</div>', page, 'form_error',
default='There\'s an error but the message can\'t be parsed.',
flags=re.DOTALL)
self.report_warning(f'Unable to log in: MFA challenge failed, "{err_msg}"')
return False
login_ok = 'class="notice error"' not in page
if not login_ok:
self.report_warning('Unable to log in: bad username or password')
return login_ok
def _get_heartbeat_info(self, info_dict):
video_id, video_src_id, audio_src_id = info_dict['url'].split(':')[1].split('/')
dmc_protocol = info_dict['expected_protocol']
api_data = (
info_dict.get('_api_data')
or self._parse_json(
self._html_search_regex(
'data-api-data="([^"]+)"',
self._download_webpage('https://www.nicovideo.jp/watch/' + video_id, video_id),
'API data', default='{}'),
video_id))
session_api_data = try_get(api_data, lambda x: x['media']['delivery']['movie']['session'])
session_api_endpoint = try_get(session_api_data, lambda x: x['urls'][0])
def ping():
tracking_id = traverse_obj(api_data, ('media', 'delivery', 'trackingId'))
if tracking_id:
tracking_url = update_url_query('https://nvapi.nicovideo.jp/v1/2ab0cbaa/watch', {'t': tracking_id})
watch_request_response = self._download_json(
tracking_url, video_id,
note='Acquiring permission for downloading video', fatal=False,
headers=self._API_HEADERS)
if traverse_obj(watch_request_response, ('meta', 'status')) != 200:
self.report_warning('Failed to acquire permission for playing video. Video download may fail.')
yesno = lambda x: 'yes' if x else 'no'
if dmc_protocol == 'http':
protocol = 'http'
protocol_parameters = {
'http_output_download_parameters': {
'use_ssl': yesno(session_api_data['urls'][0]['isSsl']),
'use_well_known_port': yesno(session_api_data['urls'][0]['isWellKnownPort']),
},
}
elif dmc_protocol == 'hls':
protocol = 'm3u8'
segment_duration = try_get(self._configuration_arg('segment_duration'), lambda x: int(x[0])) or 6000
parsed_token = self._parse_json(session_api_data['token'], video_id)
encryption = traverse_obj(api_data, ('media', 'delivery', 'encryption'))
protocol_parameters = {
'hls_parameters': {
'segment_duration': segment_duration,
'transfer_preset': '',
'use_ssl': yesno(session_api_data['urls'][0]['isSsl']),
'use_well_known_port': yesno(session_api_data['urls'][0]['isWellKnownPort']),
},
}
if 'hls_encryption' in parsed_token and encryption:
protocol_parameters['hls_parameters']['encryption'] = {
parsed_token['hls_encryption']: {
'encrypted_key': encryption['encryptedKey'],
'key_uri': encryption['keyUri'],
},
}
else:
protocol = 'm3u8_native'
else:
raise ExtractorError(f'Unsupported DMC protocol: {dmc_protocol}')
session_response = self._download_json(
session_api_endpoint['url'], video_id,
query={'_format': 'json'},
headers={'Content-Type': 'application/json'},
note='Downloading JSON metadata for {}'.format(info_dict['format_id']),
data=json.dumps({
'session': {
'client_info': {
'player_id': session_api_data.get('playerId'),
},
'content_auth': {
'auth_type': try_get(session_api_data, lambda x: x['authTypes'][session_api_data['protocols'][0]]),
'content_key_timeout': session_api_data.get('contentKeyTimeout'),
'service_id': 'nicovideo',
'service_user_id': session_api_data.get('serviceUserId'),
},
'content_id': session_api_data.get('contentId'),
'content_src_id_sets': [{
'content_src_ids': [{
'src_id_to_mux': {
'audio_src_ids': [audio_src_id],
'video_src_ids': [video_src_id],
},
}],
}],
'content_type': 'movie',
'content_uri': '',
'keep_method': {
'heartbeat': {
'lifetime': session_api_data.get('heartbeatLifetime'),
},
},
'priority': session_api_data['priority'],
'protocol': {
'name': 'http',
'parameters': {
'http_parameters': {
'parameters': protocol_parameters,
},
},
},
'recipe_id': session_api_data.get('recipeId'),
'session_operation_auth': {
'session_operation_auth_by_signature': {
'signature': session_api_data.get('signature'),
'token': session_api_data.get('token'),
},
},
'timing_constraint': 'unlimited',
},
}).encode())
info_dict['url'] = session_response['data']['session']['content_uri']
info_dict['protocol'] = protocol
# get heartbeat info
heartbeat_info_dict = {
'url': session_api_endpoint['url'] + '/' + session_response['data']['session']['id'] + '?_format=json&_method=PUT',
'data': json.dumps(session_response['data']),
# interval, convert milliseconds to seconds, then halve to make a buffer.
'interval': float_or_none(session_api_data.get('heartbeatLifetime'), scale=3000),
'ping': ping,
}
return info_dict, heartbeat_info_dict
def _extract_format_for_quality(self, video_id, audio_quality, video_quality, dmc_protocol):
if not audio_quality.get('isAvailable') or not video_quality.get('isAvailable'):
return None
format_id = '-'.join(
[remove_start(s['id'], 'archive_') for s in (video_quality, audio_quality)] + [dmc_protocol])
vid_qual_label = traverse_obj(video_quality, ('metadata', 'label'))
return {
'url': 'niconico_dmc:{}/{}/{}'.format(video_id, video_quality['id'], audio_quality['id']),
'format_id': format_id,
'format_note': join_nonempty('DMC', vid_qual_label, dmc_protocol.upper(), delim=' '),
'ext': 'mp4', # Session API are used in HTML5, which always serves mp4
'acodec': 'aac',
'vcodec': 'h264',
**traverse_obj(audio_quality, ('metadata', {
'abr': ('bitrate', {float_or_none(scale=1000)}),
'asr': ('samplingRate', {int_or_none}),
})),
**traverse_obj(video_quality, ('metadata', {
'vbr': ('bitrate', {float_or_none(scale=1000)}),
'height': ('resolution', 'height', {int_or_none}),
'width': ('resolution', 'width', {int_or_none}),
})),
'quality': -2 if 'low' in video_quality['id'] else None,
'protocol': 'niconico_dmc',
'expected_protocol': dmc_protocol, # XXX: This is not a documented field
'http_headers': {
'Origin': 'https://www.nicovideo.jp',
'Referer': 'https://www.nicovideo.jp/watch/' + video_id,
},
}
def _yield_dmc_formats(self, api_data, video_id):
dmc_data = traverse_obj(api_data, ('media', 'delivery', 'movie'))
audios = traverse_obj(dmc_data, ('audios', ..., {dict}))
videos = traverse_obj(dmc_data, ('videos', ..., {dict}))
protocols = traverse_obj(dmc_data, ('session', 'protocols', ..., {str}))
if not all((audios, videos, protocols)):
return
for audio_quality, video_quality, protocol in itertools.product(audios, videos, protocols):
if fmt := self._extract_format_for_quality(video_id, audio_quality, video_quality, protocol):
yield fmt
def _yield_dms_formats(self, api_data, video_id): def _yield_dms_formats(self, api_data, video_id):
fmt_filter = lambda _, v: v['isAvailable'] and v['id'] fmt_filter = lambda _, v: v['isAvailable'] and v['id']
@@ -484,8 +317,8 @@ def _real_extract(self, url):
'needs_premium': ('isPremium', {bool}), 'needs_premium': ('isPremium', {bool}),
'needs_subscription': ('isAdmission', {bool}), 'needs_subscription': ('isAdmission', {bool}),
})) or {'needs_auth': True})) })) or {'needs_auth': True}))
formats = [*self._yield_dmc_formats(api_data, video_id),
*self._yield_dms_formats(api_data, video_id)] formats = list(self._yield_dms_formats(api_data, video_id))
if not formats: if not formats:
fail_msg = clean_html(self._html_search_regex( fail_msg = clean_html(self._html_search_regex(
r'<p[^>]+\bclass="fail-message"[^>]*>(?P<msg>.+?)</p>', r'<p[^>]+\bclass="fail-message"[^>]*>(?P<msg>.+?)</p>',
@@ -920,7 +753,7 @@ def _real_extract(self, url):
return self.playlist_result(self._entries(list_id), list_id) return self.playlist_result(self._entries(list_id), list_id)
class NiconicoLiveIE(InfoExtractor): class NiconicoLiveIE(NiconicoBaseIE):
IE_NAME = 'niconico:live' IE_NAME = 'niconico:live'
IE_DESC = 'ニコニコ生放送' IE_DESC = 'ニコニコ生放送'
_VALID_URL = r'https?://(?:sp\.)?live2?\.nicovideo\.jp/(?:watch|gate)/(?P<id>lv\d+)' _VALID_URL = r'https?://(?:sp\.)?live2?\.nicovideo\.jp/(?:watch|gate)/(?P<id>lv\d+)'
@@ -985,6 +818,7 @@ def _real_extract(self, url):
'quality': 'abr', 'quality': 'abr',
'protocol': 'hls+fmp4', 'protocol': 'hls+fmp4',
'latency': latency, 'latency': latency,
'accessRightMethod': 'single_cookie',
'chasePlay': False, 'chasePlay': False,
}, },
'room': { 'room': {
@@ -1005,6 +839,7 @@ def _real_extract(self, url):
if data.get('type') == 'stream': if data.get('type') == 'stream':
m3u8_url = data['data']['uri'] m3u8_url = data['data']['uri']
qualities = data['data']['availableQualities'] qualities = data['data']['availableQualities']
cookies = data['data']['cookies']
break break
elif data.get('type') == 'disconnect': elif data.get('type') == 'disconnect':
self.write_debug(recv) self.write_debug(recv)
@@ -1043,6 +878,11 @@ def _real_extract(self, url):
**res, **res,
}) })
for cookie in cookies:
self._set_cookie(
cookie['domain'], cookie['name'], cookie['value'],
expire_time=unified_timestamp(cookie['expires']), path=cookie['path'], secure=cookie['secure'])
formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4', live=True) formats = self._extract_m3u8_formats(m3u8_url, video_id, ext='mp4', live=True)
for fmt, q in zip(formats, reversed(qualities[1:])): for fmt, q in zip(formats, reversed(qualities[1:])):
fmt.update({ fmt.update({

View File

@@ -1,34 +1,46 @@
import json
import re
from .brightcove import BrightcoveNewIE
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
ExtractorError,
float_or_none, float_or_none,
int_or_none, int_or_none,
smuggle_url, parse_iso8601,
parse_resolution,
str_or_none, str_or_none,
try_get, url_or_none,
unified_strdate,
unified_timestamp,
) )
from ..utils.traversal import require, traverse_obj, value
class NineNowIE(InfoExtractor): class NineNowIE(InfoExtractor):
IE_NAME = '9now.com.au' IE_NAME = '9now.com.au'
_VALID_URL = r'https?://(?:www\.)?9now\.com\.au/(?:[^/]+/){2}(?P<id>[^/?#]+)' _VALID_URL = r'https?://(?:www\.)?9now\.com\.au/(?:[^/?#]+/){2}(?P<id>(?P<type>clip|episode)-[^/?#]+)'
_GEO_COUNTRIES = ['AU'] _GEO_BYPASS = False
_TESTS = [{ _TESTS = [{
# clip # clip
'url': 'https://www.9now.com.au/afl-footy-show/2016/clip-ciql02091000g0hp5oktrnytc', 'url': 'https://www.9now.com.au/today/season-2025/clip-cm8hw9h5z00080hquqa5hszq7',
'md5': '17cf47d63ec9323e562c9957a968b565',
'info_dict': { 'info_dict': {
'id': '16801', 'id': '6370295582112',
'ext': 'mp4', 'ext': 'mp4',
'title': 'St. Kilda\'s Joey Montagna on the potential for a player\'s strike', 'title': 'Would Karl Stefanovic be able to land a plane?',
'description': 'Is a boycott of the NAB Cup "on the table"?', 'description': 'The Today host\'s skills are put to the test with the latest simulation tech.',
'uploader_id': '4460760524001', 'uploader_id': '4460760524001',
'upload_date': '20160713', 'duration': 197.376,
'timestamp': 1468421266, 'tags': ['flights', 'technology', 'Karl Stefanovic'],
'season': 'Season 2025',
'season_number': 2025,
'series': 'TODAY',
'timestamp': 1742507988,
'upload_date': '20250320',
'release_timestamp': 1742507983,
'release_date': '20250320',
'thumbnail': r're:https?://.+/1920x0/.+\.jpg',
},
'params': {
'skip_download': 'HLS/DASH fragments and mp4 URLs are geo-restricted; only available in AU',
}, },
'skip': 'Only available in Australia',
}, { }, {
# episode # episode
'url': 'https://www.9now.com.au/afl-footy-show/2016/episode-19', 'url': 'https://www.9now.com.au/afl-footy-show/2016/episode-19',
@@ -41,7 +53,7 @@ class NineNowIE(InfoExtractor):
# episode of series # episode of series
'url': 'https://www.9now.com.au/lego-masters/season-3/episode-3', 'url': 'https://www.9now.com.au/lego-masters/season-3/episode-3',
'info_dict': { 'info_dict': {
'id': '6249614030001', 'id': '6308830406112',
'title': 'Episode 3', 'title': 'Episode 3',
'ext': 'mp4', 'ext': 'mp4',
'season_number': 3, 'season_number': 3,
@@ -50,72 +62,87 @@ class NineNowIE(InfoExtractor):
'uploader_id': '4460760524001', 'uploader_id': '4460760524001',
'timestamp': 1619002200, 'timestamp': 1619002200,
'upload_date': '20210421', 'upload_date': '20210421',
'duration': 3574.085,
'thumbnail': r're:https?://.+/1920x0/.+\.jpg',
'tags': ['episode'],
'series': 'Lego Masters',
'season': 'Season 3',
'episode': 'Episode 3',
'release_timestamp': 1619002200,
'release_date': '20210421',
}, },
'expected_warnings': ['Ignoring subtitle tracks'],
'params': { 'params': {
'skip_download': True, 'skip_download': 'HLS/DASH fragments and mp4 URLs are geo-restricted; only available in AU',
},
}, {
'url': 'https://www.9now.com.au/married-at-first-sight/season-12/episode-1',
'info_dict': {
'id': '6367798770112',
'ext': 'mp4',
'title': 'Episode 1',
'description': r're:The cultural sensation of Married At First Sight returns with our first weddings! .{90}$',
'uploader_id': '4460760524001',
'duration': 5415.079,
'thumbnail': r're:https?://.+/1920x0/.+\.png',
'tags': ['episode'],
'season': 'Season 12',
'season_number': 12,
'episode': 'Episode 1',
'episode_number': 1,
'series': 'Married at First Sight',
'timestamp': 1737973800,
'upload_date': '20250127',
'release_timestamp': 1737973800,
'release_date': '20250127',
},
'params': {
'skip_download': 'HLS/DASH fragments and mp4 URLs are geo-restricted; only available in AU',
}, },
}] }]
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/4460760524001/default_default/index.html?videoId=%s' BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/4460760524001/default_default/index.html?videoId={}'
# XXX: For parsing next.js v15+ data; see also yt_dlp.extractor.francetv and yt_dlp.extractor.goplay
def _find_json(self, s):
return self._search_json(
r'\w+\s*:\s*', s, 'next js data', None, contains_pattern=r'\[(?s:.+)\]', default=None)
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) display_id, video_type = self._match_valid_url(url).group('id', 'type')
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
page_data = self._parse_json(self._search_regex(
r'window\.__data\s*=\s*({.*?});', webpage,
'page data', default='{}'), display_id, fatal=False)
if not page_data:
page_data = self._parse_json(self._parse_json(self._search_regex(
r'window\.__data\s*=\s*JSON\.parse\s*\(\s*(".+?")\s*\)\s*;',
webpage, 'page data'), display_id), display_id)
for kind in ('episode', 'clip'): common_data = traverse_obj(
current_key = page_data.get(kind, {}).get( re.findall(r'<script[^>]*>\s*self\.__next_f\.push\(\s*(\[.+?\])\s*\);?\s*</script>', webpage),
f'current{kind.capitalize()}Key') (..., {json.loads}, ..., {self._find_json},
if not current_key: lambda _, v: v['payload'][video_type]['slug'] == display_id,
continue 'payload', any, {require('video data')}))
cache = page_data.get(kind, {}).get(f'{kind}Cache', {})
if not cache:
continue
common_data = {
'episode': (cache.get(current_key) or next(iter(cache.values())))[kind],
'season': (cache.get(current_key) or next(iter(cache.values()))).get('season', None),
}
break
else:
raise ExtractorError('Unable to find video data')
if not self.get_param('allow_unplayable_formats') and try_get(common_data, lambda x: x['episode']['video']['drm'], bool): if traverse_obj(common_data, (video_type, 'video', 'drm', {bool})):
self.report_drm(display_id) self.report_drm(display_id)
brightcove_id = try_get( brightcove_id = traverse_obj(common_data, (
common_data, lambda x: x['episode']['video']['brightcoveId'], str) or 'ref:{}'.format(common_data['episode']['video']['referenceId']) video_type, 'video', (
video_id = str_or_none(try_get(common_data, lambda x: x['episode']['video']['id'])) or brightcove_id ('brightcoveId', {str}),
('referenceId', {str}, {lambda x: f'ref:{x}' if x else None}),
title = try_get(common_data, lambda x: x['episode']['name'], str) ), any, {require('brightcove ID')}))
season_number = try_get(common_data, lambda x: x['season']['seasonNumber'], int)
episode_number = try_get(common_data, lambda x: x['episode']['episodeNumber'], int)
timestamp = unified_timestamp(try_get(common_data, lambda x: x['episode']['airDate'], str))
release_date = unified_strdate(try_get(common_data, lambda x: x['episode']['availability'], str))
thumbnails_data = try_get(common_data, lambda x: x['episode']['image']['sizes'], dict) or {}
thumbnails = [{
'id': thumbnail_id,
'url': thumbnail_url,
'width': int_or_none(thumbnail_id[1:]),
} for thumbnail_id, thumbnail_url in thumbnails_data.items()]
return { return {
'_type': 'url_transparent', '_type': 'url_transparent',
'url': smuggle_url( 'ie_key': BrightcoveNewIE.ie_key(),
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'url': self.BRIGHTCOVE_URL_TEMPLATE.format(brightcove_id),
{'geo_countries': self._GEO_COUNTRIES}), **traverse_obj(common_data, {
'id': video_id, 'id': (video_type, 'video', 'id', {int}, ({str_or_none}, {value(brightcove_id)}), any),
'title': title, 'title': (video_type, 'name', {str}),
'description': try_get(common_data, lambda x: x['episode']['description'], str), 'description': (video_type, 'description', {str}),
'duration': float_or_none(try_get(common_data, lambda x: x['episode']['video']['duration'], float), 1000), 'duration': (video_type, 'video', 'duration', {float_or_none(scale=1000)}),
'thumbnails': thumbnails, 'tags': (video_type, 'tags', ..., 'name', {str}, all, filter),
'ie_key': 'BrightcoveNew', 'series': ('tvSeries', 'name', {str}),
'season_number': season_number, 'season_number': ('season', 'seasonNumber', {int_or_none}),
'episode_number': episode_number, 'episode_number': ('episode', 'episodeNumber', {int_or_none}),
'timestamp': timestamp, 'timestamp': ('episode', 'airDate', {parse_iso8601}),
'release_date': release_date, 'release_timestamp': (video_type, 'availability', {parse_iso8601}),
'thumbnails': (video_type, 'image', 'sizes', {dict.items}, lambda _, v: url_or_none(v[1]), {
'id': 0,
'url': 1,
'width': (1, {parse_resolution}, 'width'),
}),
}),
} }

View File

@@ -11,12 +11,15 @@ class On24IE(InfoExtractor):
IE_NAME = 'on24' IE_NAME = 'on24'
IE_DESC = 'ON24' IE_DESC = 'ON24'
_VALID_URL = r'''(?x) _ID_RE = r'(?P<id>\d{7})'
https?://event\.on24\.com/(?: _KEY_RE = r'(?P<key>[0-9A-F]{32})'
wcc/r/(?P<id_1>\d{7})/(?P<key_1>[0-9A-F]{32})| _URL_BASE_RE = r'https?://event\.on24\.com'
eventRegistration/(?:console/EventConsoleApollo|EventLobbyServlet\?target=lobby30) _URL_QUERY_RE = rf'(?:[^#]*&)?eventid={_ID_RE}&(?:[^#]+&)?key={_KEY_RE}'
\.jsp\?(?:[^/#?]*&)?eventid=(?P<id_2>\d{7})[^/#?]*&key=(?P<key_2>[0-9A-F]{32}) _VALID_URL = [
)''' rf'{_URL_BASE_RE}/wcc/r/{_ID_RE}/{_KEY_RE}',
rf'{_URL_BASE_RE}/eventRegistration/console/(?:EventConsoleApollo\.jsp|apollox/mainEvent/?)\?{_URL_QUERY_RE}',
rf'{_URL_BASE_RE}/eventRegistration/EventLobbyServlet/?\?{_URL_QUERY_RE}',
]
_TESTS = [{ _TESTS = [{
'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?uimode=nextgeneration&eventid=2197467&sessionid=1&key=5DF57BE53237F36A43B478DD36277A84&contenttype=A&eventuserid=305999&playerwidth=1000&playerheight=650&caller=previewLobby&text_language_id=en&format=fhaudio&newConsole=false', 'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?uimode=nextgeneration&eventid=2197467&sessionid=1&key=5DF57BE53237F36A43B478DD36277A84&contenttype=A&eventuserid=305999&playerwidth=1000&playerheight=650&caller=previewLobby&text_language_id=en&format=fhaudio&newConsole=false',
@@ -34,12 +37,16 @@ class On24IE(InfoExtractor):
}, { }, {
'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?&eventid=2639291&sessionid=1&username=&partnerref=&format=fhvideo1&mobile=&flashsupportedmobiledevice=&helpcenter=&key=82829018E813065A122363877975752E&newConsole=true&nxChe=true&newTabCon=true&text_language_id=en&playerwidth=748&playerheight=526&eventuserid=338788762&contenttype=A&mediametricsessionid=384764716&mediametricid=3558192&usercd=369267058&mode=launch', 'url': 'https://event.on24.com/eventRegistration/console/EventConsoleApollo.jsp?&eventid=2639291&sessionid=1&username=&partnerref=&format=fhvideo1&mobile=&flashsupportedmobiledevice=&helpcenter=&key=82829018E813065A122363877975752E&newConsole=true&nxChe=true&newTabCon=true&text_language_id=en&playerwidth=748&playerheight=526&eventuserid=338788762&contenttype=A&mediametricsessionid=384764716&mediametricid=3558192&usercd=369267058&mode=launch',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://event.on24.com/eventRegistration/EventLobbyServlet?target=reg20.jsp&eventid=3543176&key=BC0F6B968B67C34B50D461D40FDB3E18&groupId=3143628',
'only_matching': True,
}, {
'url': 'https://event.on24.com/eventRegistration/console/apollox/mainEvent?&eventid=4843671&sessionid=1&username=&partnerref=&format=fhvideo1&mobile=&flashsupportedmobiledevice=&helpcenter=&key=4EAC9B5C564CC98FF29E619B06A2F743&newConsole=true&nxChe=true&newTabCon=true&consoleEarEventConsole=false&consoleEarCloudApi=false&text_language_id=en&playerwidth=748&playerheight=526&referrer=https%3A%2F%2Fevent.on24.com%2Finterface%2Fregistration%2Fautoreg%2Findex.html%3Fsessionid%3D1%26eventid%3D4843671%26key%3D4EAC9B5C564CC98FF29E619B06A2F743%26email%3D000a3e42-7952-4dd6-8f8a-34c38ea3cf02%2540platform%26firstname%3Ds%26lastname%3Ds%26deletecookie%3Dtrue%26event_email%3DN%26marketing_email%3DN%26std1%3D0642572014177%26std2%3D0642572014179%26std3%3D550165f7-a44e-4725-9fe6-716f89908c2b%26std4%3D0&eventuserid=745776448&contenttype=A&mediametricsessionid=640613707&mediametricid=6810717&usercd=745776448&mode=launch',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
mobj = self._match_valid_url(url) event_id, event_key = self._match_valid_url(url).group('id', 'key')
event_id = mobj.group('id_1') or mobj.group('id_2')
event_key = mobj.group('key_1') or mobj.group('key_2')
event_data = self._download_json( event_data = self._download_json(
'https://event.on24.com/apic/utilApp/EventConsoleCachedServlet', 'https://event.on24.com/apic/utilApp/EventConsoleCachedServlet',

View File

@@ -14,8 +14,9 @@
int_or_none, int_or_none,
parse_qs, parse_qs,
srt_subtitles_timecode, srt_subtitles_timecode,
traverse_obj, url_or_none,
) )
from ..utils.traversal import traverse_obj
class PanoptoBaseIE(InfoExtractor): class PanoptoBaseIE(InfoExtractor):
@@ -345,21 +346,16 @@ def _extract_streams_formats_and_subtitles(self, video_id, streams, **fmt_kwargs
subtitles = {} subtitles = {}
for stream in streams or []: for stream in streams or []:
stream_formats = [] stream_formats = []
http_stream_url = stream.get('StreamHttpUrl') for stream_url in set(traverse_obj(stream, (('StreamHttpUrl', 'StreamUrl'), {url_or_none}))):
stream_url = stream.get('StreamUrl')
if http_stream_url:
stream_formats.append({'url': http_stream_url})
if stream_url:
media_type = stream.get('ViewerMediaFileTypeName') media_type = stream.get('ViewerMediaFileTypeName')
if media_type in ('hls', ): if media_type in ('hls', ):
m3u8_formats, stream_subtitles = self._extract_m3u8_formats_and_subtitles(stream_url, video_id) fmts, subs = self._extract_m3u8_formats_and_subtitles(stream_url, video_id, m3u8_id='hls', fatal=False)
stream_formats.extend(m3u8_formats) stream_formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, stream_subtitles) self._merge_subtitles(subs, target=subtitles)
else: else:
stream_formats.append({ stream_formats.append({
'url': stream_url, 'url': stream_url,
'ext': media_type,
}) })
for fmt in stream_formats: for fmt in stream_formats:
fmt.update({ fmt.update({

101
yt_dlp/extractor/parti.py Normal file
View File

@@ -0,0 +1,101 @@
from .common import InfoExtractor
from ..utils import UserNotLive, int_or_none, parse_iso8601, url_or_none, urljoin
from ..utils.traversal import traverse_obj
class PartiBaseIE(InfoExtractor):
def _call_api(self, path, video_id, note=None):
return self._download_json(
f'https://api-backend.parti.com/parti_v2/profile/{path}', video_id, note)
class PartiVideoIE(PartiBaseIE):
IE_NAME = 'parti:video'
_VALID_URL = r'https?://(?:www\.)?parti\.com/video/(?P<id>\d+)'
_TESTS = [{
'url': 'https://parti.com/video/66284',
'info_dict': {
'id': '66284',
'ext': 'mp4',
'title': 'NOW LIVE ',
'upload_date': '20250327',
'categories': ['Gaming'],
'thumbnail': 'https://assets.parti.com/351424_eb9e5250-2821-484a-9c5f-ca99aa666c87.png',
'channel': 'ItZTMGG',
'timestamp': 1743044379,
},
'params': {'skip_download': 'm3u8'},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._call_api(f'get_livestream_channel_info/recent/{video_id}', video_id)
return {
'id': video_id,
'formats': self._extract_m3u8_formats(
urljoin('https://watch.parti.com', data['livestream_recording']), video_id, 'mp4'),
**traverse_obj(data, {
'title': ('event_title', {str}),
'channel': ('user_name', {str}),
'thumbnail': ('event_file', {url_or_none}),
'categories': ('category_name', {str}, filter, all),
'timestamp': ('event_start_ts', {int_or_none}),
}),
}
class PartiLivestreamIE(PartiBaseIE):
IE_NAME = 'parti:livestream'
_VALID_URL = r'https?://(?:www\.)?parti\.com/creator/(?P<service>[\w]+)/(?P<id>[\w/-]+)'
_TESTS = [{
'url': 'https://parti.com/creator/parti/Capt_Robs_Adventures',
'info_dict': {
'id': 'Capt_Robs_Adventures',
'ext': 'mp4',
'title': r"re:I'm Live on Parti \d{4}-\d{2}-\d{2} \d{2}:\d{2}",
'view_count': int,
'thumbnail': r're:https://assets\.parti\.com/.+\.png',
'timestamp': 1743879776,
'upload_date': '20250405',
'live_status': 'is_live',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://parti.com/creator/discord/sazboxgaming/0',
'only_matching': True,
}]
def _real_extract(self, url):
service, creator_slug = self._match_valid_url(url).group('service', 'id')
encoded_creator_slug = creator_slug.replace('/', '%23')
creator_id = self._call_api(
f'get_user_by_social_media/{service}/{encoded_creator_slug}',
creator_slug, note='Fetching user ID')
data = self._call_api(
f'get_livestream_channel_info/{creator_id}', creator_id,
note='Fetching user profile feed')['channel_info']
if not traverse_obj(data, ('channel', 'is_live', {bool})):
raise UserNotLive(video_id=creator_id)
channel_info = data['channel']
return {
'id': creator_slug,
'formats': self._extract_m3u8_formats(
channel_info['playback_url'], creator_slug, live=True, query={
'token': channel_info['playback_auth_token'],
'player_version': '1.17.0',
}),
'is_live': True,
**traverse_obj(data, {
'title': ('livestream_event_info', 'event_name', {str}),
'description': ('livestream_event_info', 'event_description', {str}),
'thumbnail': ('livestream_event_info', 'livestream_preview_file', {url_or_none}),
'timestamp': ('stream', 'start_time', {parse_iso8601}),
'view_count': ('stream', 'viewer_count', {int_or_none}),
}),
}

View File

@@ -1,5 +1,3 @@
import re
from .youtube import YoutubeIE from .youtube import YoutubeIE
from .zdf import ZDFBaseIE from .zdf import ZDFBaseIE
from ..utils import ( from ..utils import (
@@ -7,44 +5,27 @@
merge_dicts, merge_dicts,
try_get, try_get,
unified_timestamp, unified_timestamp,
urljoin,
) )
class PhoenixIE(ZDFBaseIE): class PhoenixIE(ZDFBaseIE):
IE_NAME = 'phoenix.de' IE_NAME = 'phoenix.de'
_VALID_URL = r'https?://(?:www\.)?phoenix\.de/(?:[^/]+/)*[^/?#&]*-a-(?P<id>\d+)\.html' _VALID_URL = r'https?://(?:www\.)?phoenix\.de/(?:[^/?#]+/)*[^/?#&]*-a-(?P<id>\d+)\.html'
_TESTS = [{ _TESTS = [{
# Same as https://www.zdf.de/politik/phoenix-sendungen/wohin-fuehrt-der-protest-in-der-pandemie-100.html 'url': 'https://www.phoenix.de/sendungen/dokumentationen/spitzbergen-a-893349.html',
'url': 'https://www.phoenix.de/sendungen/ereignisse/corona-nachgehakt/wohin-fuehrt-der-protest-in-der-pandemie-a-2050630.html', 'md5': 'a79e86d9774d0b3f2102aff988a0bd32',
'md5': '34ec321e7eb34231fd88616c65c92db0',
'info_dict': { 'info_dict': {
'id': '210222_phx_nachgehakt_corona_protest', 'id': '221215_phx_spitzbergen',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Wohin führt der Protest in der Pandemie?', 'title': 'Spitzbergen',
'description': 'md5:7d643fe7f565e53a24aac036b2122fbd', 'description': 'Film von Tilmann Bünz',
'duration': 1691, 'duration': 728.0,
'timestamp': 1613902500, 'timestamp': 1555600500,
'upload_date': '20210221', 'upload_date': '20190418',
'uploader': 'Phoenix', 'uploader': 'Phoenix',
'series': 'corona nachgehakt', 'thumbnail': 'https://www.phoenix.de/sixcms/media.php/21/Bergspitzen1.png',
'episode': 'Wohin führt der Protest in der Pandemie?', 'series': 'Dokumentationen',
}, 'episode': 'Spitzbergen',
}, {
# Youtube embed
'url': 'https://www.phoenix.de/sendungen/gespraeche/phoenix-streitgut-brennglas-corona-a-1965505.html',
'info_dict': {
'id': 'hMQtqFYjomk',
'ext': 'mp4',
'title': 'phoenix streitgut: Brennglas Corona - Wie gerecht ist unsere Gesellschaft?',
'description': 'md5:ac7a02e2eb3cb17600bc372e4ab28fdd',
'duration': 3509,
'upload_date': '20201219',
'uploader': 'phoenix',
'uploader_id': 'phoenix',
},
'params': {
'skip_download': True,
}, },
}, { }, {
'url': 'https://www.phoenix.de/entwicklungen-in-russland-a-2044720.html', 'url': 'https://www.phoenix.de/entwicklungen-in-russland-a-2044720.html',
@@ -90,8 +71,8 @@ def _real_extract(self, url):
content_id = details['tracking']['nielsen']['content']['assetid'] content_id = details['tracking']['nielsen']['content']['assetid']
info = self._extract_ptmd( info = self._extract_ptmd(
f'https://tmd.phoenix.de/tmd/2/ngplayer_2_3/vod/ptmd/phoenix/{content_id}', f'https://tmd.phoenix.de/tmd/2/android_native_6/vod/ptmd/phoenix/{content_id}',
content_id, None, url) content_id)
duration = int_or_none(try_get( duration = int_or_none(try_get(
details, lambda x: x['tracking']['nielsen']['content']['length'])) details, lambda x: x['tracking']['nielsen']['content']['length']))
@@ -101,20 +82,8 @@ def _real_extract(self, url):
str) str)
episode = title if details.get('contentType') == 'episode' else None episode = title if details.get('contentType') == 'episode' else None
thumbnails = []
teaser_images = try_get(details, lambda x: x['teaserImageRef']['layouts'], dict) or {} teaser_images = try_get(details, lambda x: x['teaserImageRef']['layouts'], dict) or {}
for thumbnail_key, thumbnail_url in teaser_images.items(): thumbnails = self._extract_thumbnails(teaser_images)
thumbnail_url = urljoin(url, thumbnail_url)
if not thumbnail_url:
continue
thumbnail = {
'url': thumbnail_url,
}
m = re.match('^([0-9]+)x([0-9]+)$', thumbnail_key)
if m:
thumbnail['width'] = int(m.group(1))
thumbnail['height'] = int(m.group(2))
thumbnails.append(thumbnail)
return merge_dicts(info, { return merge_dicts(info, {
'id': content_id, 'id': content_id,

View File

@@ -22,7 +22,7 @@
) )
class PolskieRadioBaseExtractor(InfoExtractor): class PolskieRadioBaseIE(InfoExtractor):
def _extract_webpage_player_entries(self, webpage, playlist_id, base_data): def _extract_webpage_player_entries(self, webpage, playlist_id, base_data):
media_urls = set() media_urls = set()
@@ -47,7 +47,7 @@ def _extract_webpage_player_entries(self, webpage, playlist_id, base_data):
yield entry yield entry
class PolskieRadioLegacyIE(PolskieRadioBaseExtractor): class PolskieRadioLegacyIE(PolskieRadioBaseIE):
# legacy sites # legacy sites
IE_NAME = 'polskieradio:legacy' IE_NAME = 'polskieradio:legacy'
_VALID_URL = r'https?://(?:www\.)?polskieradio(?:24)?\.pl/\d+/\d+/[Aa]rtykul/(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?polskieradio(?:24)?\.pl/\d+/\d+/[Aa]rtykul/(?P<id>\d+)'
@@ -127,7 +127,7 @@ def _real_extract(self, url):
return self.playlist_result(entries, playlist_id, title, description) return self.playlist_result(entries, playlist_id, title, description)
class PolskieRadioIE(PolskieRadioBaseExtractor): class PolskieRadioIE(PolskieRadioBaseIE):
# new next.js sites # new next.js sites
_VALID_URL = r'https?://(?:[^/]+\.)?(?:polskieradio(?:24)?|radiokierowcow)\.pl/artykul/(?P<id>\d+)' _VALID_URL = r'https?://(?:[^/]+\.)?(?:polskieradio(?:24)?|radiokierowcow)\.pl/artykul/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
@@ -519,7 +519,7 @@ def _real_extract(self, url):
} }
class PolskieRadioPodcastBaseExtractor(InfoExtractor): class PolskieRadioPodcastBaseIE(InfoExtractor):
_API_BASE = 'https://apipodcasts.polskieradio.pl/api' _API_BASE = 'https://apipodcasts.polskieradio.pl/api'
def _parse_episode(self, data): def _parse_episode(self, data):
@@ -539,7 +539,7 @@ def _parse_episode(self, data):
} }
class PolskieRadioPodcastListIE(PolskieRadioPodcastBaseExtractor): class PolskieRadioPodcastListIE(PolskieRadioPodcastBaseIE):
IE_NAME = 'polskieradio:podcast:list' IE_NAME = 'polskieradio:podcast:list'
_VALID_URL = r'https?://podcasty\.polskieradio\.pl/podcast/(?P<id>\d+)' _VALID_URL = r'https?://podcasty\.polskieradio\.pl/podcast/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
@@ -578,7 +578,7 @@ def get_page(page_num):
} }
class PolskieRadioPodcastIE(PolskieRadioPodcastBaseExtractor): class PolskieRadioPodcastIE(PolskieRadioPodcastBaseIE):
IE_NAME = 'polskieradio:podcast' IE_NAME = 'polskieradio:podcast'
_VALID_URL = r'https?://podcasty\.polskieradio\.pl/track/(?P<id>[a-f\d]{8}(?:-[a-f\d]{4}){4}[a-f\d]{8})' _VALID_URL = r'https?://podcasty\.polskieradio\.pl/track/(?P<id>[a-f\d]{8}(?:-[a-f\d]{4}){4}[a-f\d]{8})'
_TESTS = [{ _TESTS = [{

View File

@@ -321,6 +321,27 @@ class RaiPlayIE(RaiBaseIE):
'timestamp': 1348495020, 'timestamp': 1348495020,
'upload_date': '20120924', 'upload_date': '20120924',
}, },
}, {
# checking program_info gives false positive for DRM
'url': 'https://www.raiplay.it/video/2022/10/Ad-ogni-costo---Un-giorno-in-Pretura---Puntata-del-15102022-1dfd1295-ea38-4bac-b51e-f87e2881693b.html',
'md5': '572c6f711b7c5f2d670ba419b4ae3b08',
'info_dict': {
'id': '1dfd1295-ea38-4bac-b51e-f87e2881693b',
'ext': 'mp4',
'title': 'Ad ogni costo - Un giorno in Pretura - Puntata del 15/10/2022',
'alt_title': 'St 2022/23 - Un giorno in pretura - Ad ogni costo',
'description': 'md5:4046d97b2687f74f06a8b8270ba5599f',
'uploader': 'Rai 3',
'duration': 3773.0,
'thumbnail': 'https://www.raiplay.it/dl/img/2022/10/12/1665586539957_2048x2048.png',
'creators': ['Rai 3'],
'series': 'Un giorno in pretura',
'season': '2022/23',
'episode': 'Ad ogni costo',
'timestamp': 1665507240,
'upload_date': '20221011',
'release_year': 2025,
},
}, { }, {
'url': 'http://www.raiplay.it/video/2016/11/gazebotraindesi-efebe701-969c-4593-92f3-285f0d1ce750.html?', 'url': 'http://www.raiplay.it/video/2016/11/gazebotraindesi-efebe701-969c-4593-92f3-285f0d1ce750.html?',
'only_matching': True, 'only_matching': True,
@@ -340,8 +361,7 @@ def _real_extract(self, url):
media = self._download_json( media = self._download_json(
f'{base}.json', video_id, 'Downloading video JSON') f'{base}.json', video_id, 'Downloading video JSON')
if not self.get_param('allow_unplayable_formats'): if traverse_obj(media, ('rights_management', 'rights', 'drm')):
if traverse_obj(media, (('program_info', None), 'rights_management', 'rights', 'drm')):
self.report_drm(video_id) self.report_drm(video_id)
video = media['video'] video = media['video']

View File

@@ -388,7 +388,8 @@ def add_thumbnail(src):
}) })
if entries: if entries:
return self.playlist_result(entries, video_id, **info) return self.playlist_result(entries, video_id, **info)
raise ExtractorError('No media found', expected=True) self.raise_no_formats('No media found', expected=True, video_id=video_id)
return {**info, 'id': video_id}
# Check if media is hosted on reddit: # Check if media is hosted on reddit:
reddit_video = traverse_obj(data, ( reddit_video = traverse_obj(data, (

View File

@@ -12,7 +12,7 @@
) )
class RedGifsBaseInfoExtractor(InfoExtractor): class RedGifsBaseIE(InfoExtractor):
_FORMATS = { _FORMATS = {
'gif': 250, 'gif': 250,
'sd': 480, 'sd': 480,
@@ -113,7 +113,7 @@ def _paged_entries(self, ep, item_id, query, fields):
return page_fetcher(page) if page else OnDemandPagedList(page_fetcher, self._PAGE_SIZE) return page_fetcher(page) if page else OnDemandPagedList(page_fetcher, self._PAGE_SIZE)
class RedGifsIE(RedGifsBaseInfoExtractor): class RedGifsIE(RedGifsBaseIE):
_VALID_URL = r'https?://(?:(?:www\.)?redgifs\.com/(?:watch|ifr)/|thumbs2\.redgifs\.com/)(?P<id>[^-/?#\.]+)' _VALID_URL = r'https?://(?:(?:www\.)?redgifs\.com/(?:watch|ifr)/|thumbs2\.redgifs\.com/)(?P<id>[^-/?#\.]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.redgifs.com/watch/squeakyhelplesswisent', 'url': 'https://www.redgifs.com/watch/squeakyhelplesswisent',
@@ -172,7 +172,7 @@ def _real_extract(self, url):
return self._parse_gif_data(video_info['gif']) return self._parse_gif_data(video_info['gif'])
class RedGifsSearchIE(RedGifsBaseInfoExtractor): class RedGifsSearchIE(RedGifsBaseIE):
IE_DESC = 'Redgifs search' IE_DESC = 'Redgifs search'
_VALID_URL = r'https?://(?:www\.)?redgifs\.com/browse\?(?P<query>[^#]+)' _VALID_URL = r'https?://(?:www\.)?redgifs\.com/browse\?(?P<query>[^#]+)'
_PAGE_SIZE = 80 _PAGE_SIZE = 80
@@ -226,7 +226,7 @@ def _real_extract(self, url):
entries, query_str, tags, f'RedGifs search for {tags}, ordered by {order}') entries, query_str, tags, f'RedGifs search for {tags}, ordered by {order}')
class RedGifsUserIE(RedGifsBaseInfoExtractor): class RedGifsUserIE(RedGifsBaseIE):
IE_DESC = 'Redgifs user' IE_DESC = 'Redgifs user'
_VALID_URL = r'https?://(?:www\.)?redgifs\.com/users/(?P<username>[^/?#]+)(?:\?(?P<query>[^#]+))?' _VALID_URL = r'https?://(?:www\.)?redgifs\.com/users/(?P<username>[^/?#]+)(?:\?(?P<query>[^#]+))?'
_PAGE_SIZE = 80 _PAGE_SIZE = 80

43
yt_dlp/extractor/roya.py Normal file
View File

@@ -0,0 +1,43 @@
from .common import InfoExtractor
from ..utils.traversal import traverse_obj
class RoyaLiveIE(InfoExtractor):
_VALID_URL = r'https?://roya\.tv/live-stream/(?P<id>\d+)'
_TESTS = [{
'url': 'https://roya.tv/live-stream/1',
'info_dict': {
'id': '1',
'title': r're:Roya TV \d{4}-\d{2}-\d{2} \d{2}:\d{2}',
'ext': 'mp4',
'live_status': 'is_live',
},
}, {
'url': 'https://roya.tv/live-stream/21',
'info_dict': {
'id': '21',
'title': r're:Roya News \d{4}-\d{2}-\d{2} \d{2}:\d{2}',
'ext': 'mp4',
'live_status': 'is_live',
},
}, {
'url': 'https://roya.tv/live-stream/10000',
'only_matching': True,
}]
def _real_extract(self, url):
media_id = self._match_id(url)
stream_url = self._download_json(
f'https://ticket.roya-tv.com/api/v5/fastchannel/{media_id}', media_id)['data']['secured_url']
title = traverse_obj(
self._download_json('https://backend.roya.tv/api/v01/channels/schedule-pagination', media_id, fatal=False),
('data', 0, 'channel', lambda _, v: str(v['id']) == media_id, 'title', {str}, any))
return {
'id': media_id,
'formats': self._extract_m3u8_formats(stream_url, media_id, 'mp4', m3u8_id='hls', live=True),
'title': title,
'is_live': True,
}

View File

@@ -1,35 +1,142 @@
import base64 import base64
import io import io
import struct import struct
import urllib.parse
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
clean_html,
determine_ext, determine_ext,
float_or_none, float_or_none,
make_archive_id,
parse_iso8601,
qualities, qualities,
remove_end, url_or_none,
remove_start,
try_get,
) )
from ..utils.traversal import subs_list_to_dict, traverse_obj
class RTVEALaCartaIE(InfoExtractor): class RTVEBaseIE(InfoExtractor):
# Reimplementation of https://js2.rtve.es/pages/app-player/3.5.1/js/pf_video.js
@staticmethod
def _decrypt_url(png):
encrypted_data = io.BytesIO(base64.b64decode(png)[8:])
while True:
length_data = encrypted_data.read(4)
length = struct.unpack('!I', length_data)[0]
chunk_type = encrypted_data.read(4)
if chunk_type == b'IEND':
break
data = encrypted_data.read(length)
if chunk_type == b'tEXt':
data = bytes(filter(None, data))
alphabet_data, _, url_data = data.partition(b'#')
quality_str, _, url_data = url_data.rpartition(b'%%')
quality_str = quality_str.decode() or ''
alphabet = RTVEBaseIE._get_alphabet(alphabet_data)
url = RTVEBaseIE._get_url(alphabet, url_data)
yield quality_str, url
encrypted_data.read(4) # CRC
@staticmethod
def _get_url(alphabet, url_data):
url = ''
f = 0
e = 3
b = 1
for char in url_data.decode('iso-8859-1'):
if f == 0:
l = int(char) * 10
f = 1
else:
if e == 0:
l += int(char)
url += alphabet[l]
e = (b + 3) % 4
f = 0
b += 1
else:
e -= 1
return url
@staticmethod
def _get_alphabet(alphabet_data):
alphabet = []
e = 0
d = 0
for char in alphabet_data.decode('iso-8859-1'):
if d == 0:
alphabet.append(char)
d = e = (e + 1) % 4
else:
d -= 1
return alphabet
def _extract_png_formats_and_subtitles(self, video_id, media_type='videos'):
formats, subtitles = [], {}
q = qualities(['Media', 'Alta', 'HQ', 'HD_READY', 'HD_FULL'])
for manager in ('rtveplayw', 'default'):
png = self._download_webpage(
f'http://www.rtve.es/ztnr/movil/thumbnail/{manager}/{media_type}/{video_id}.png',
video_id, 'Downloading url information', query={'q': 'v2'}, fatal=False)
if not png:
continue
for quality, video_url in self._decrypt_url(png):
ext = determine_ext(video_url)
if ext == 'm3u8':
fmts, subs = self._extract_m3u8_formats_and_subtitles(
video_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
elif ext == 'mpd':
fmts, subs = self._extract_mpd_formats_and_subtitles(
video_url, video_id, 'dash', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
else:
formats.append({
'format_id': quality,
'quality': q(quality),
'url': video_url,
})
return formats, subtitles
def _parse_metadata(self, metadata):
return traverse_obj(metadata, {
'title': ('title', {str.strip}),
'alt_title': ('alt', {str.strip}),
'description': ('description', {clean_html}),
'timestamp': ('dateOfEmission', {parse_iso8601(delimiter=' ')}),
'release_timestamp': ('publicationDate', {parse_iso8601(delimiter=' ')}),
'modified_timestamp': ('modificationDate', {parse_iso8601(delimiter=' ')}),
'thumbnail': (('thumbnail', 'image', 'imageSEO'), {url_or_none}, any),
'duration': ('duration', {float_or_none(scale=1000)}),
'is_live': ('live', {bool}),
'series': (('programTitle', ('programInfo', 'title')), {clean_html}, any),
})
class RTVEALaCartaIE(RTVEBaseIE):
IE_NAME = 'rtve.es:alacarta' IE_NAME = 'rtve.es:alacarta'
IE_DESC = 'RTVE a la carta' IE_DESC = 'RTVE a la carta and Play'
_VALID_URL = r'https?://(?:www\.)?rtve\.es/(m/)?(alacarta/videos|filmoteca)/[^/]+/[^/]+/(?P<id>\d+)' _VALID_URL = [
r'https?://(?:www\.)?rtve\.es/(?:m/)?(?:(?:alacarta|play)/videos|filmoteca)/(?!directo)(?:[^/?#]+/){2}(?P<id>\d+)',
r'https?://(?:www\.)?rtve\.es/infantil/serie/[^/?#]+/video/[^/?#]+/(?P<id>\d+)',
]
_TESTS = [{ _TESTS = [{
'url': 'http://www.rtve.es/alacarta/videos/balonmano/o-swiss-cup-masculina-final-espana-suecia/2491869/', 'url': 'http://www.rtve.es/alacarta/videos/la-aventura-del-saber/aventuraentornosilla/3088905/',
'md5': '1d49b7e1ca7a7502c56a4bf1b60f1b43', 'md5': 'a964547824359a5753aef09d79fe984b',
'info_dict': { 'info_dict': {
'id': '2491869', 'id': '3088905',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Balonmano - Swiss Cup masculina. Final: España-Suecia', 'title': 'En torno a la silla',
'duration': 5024.566, 'duration': 1216.981,
'series': 'Balonmano', 'series': 'La aventura del Saber',
'thumbnail': 'https://img2.rtve.es/v/aventuraentornosilla_3088905.png',
}, },
'expected_warnings': ['Failed to download MPD manifest', 'Failed to download m3u8 information'],
}, { }, {
'note': 'Live stream', 'note': 'Live stream',
'url': 'http://www.rtve.es/alacarta/videos/television/24h-live/1694255/', 'url': 'http://www.rtve.es/alacarta/videos/television/24h-live/1694255/',
@@ -38,140 +145,88 @@ class RTVEALaCartaIE(InfoExtractor):
'ext': 'mp4', 'ext': 'mp4',
'title': 're:^24H LIVE [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$', 'title': 're:^24H LIVE [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'is_live': True, 'is_live': True,
'live_status': 'is_live',
'thumbnail': r're:https://img2\.rtve\.es/v/.*\.png',
}, },
'params': { 'params': {
'skip_download': 'live stream', 'skip_download': 'live stream',
}, },
}, { }, {
'url': 'http://www.rtve.es/alacarta/videos/servir-y-proteger/servir-proteger-capitulo-104/4236788/', 'url': 'http://www.rtve.es/alacarta/videos/servir-y-proteger/servir-proteger-capitulo-104/4236788/',
'md5': 'd850f3c8731ea53952ebab489cf81cbf', 'md5': 'f3cf0d1902d008c48c793e736706c174',
'info_dict': { 'info_dict': {
'id': '4236788', 'id': '4236788',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Servir y proteger - Capítulo 104', 'title': 'Episodio 104',
'duration': 3222.0, 'duration': 3222.8,
'thumbnail': r're:https://img2\.rtve\.es/v/.*\.png',
'series': 'Servir y proteger',
}, },
'expected_warnings': ['Failed to download MPD manifest', 'Failed to download m3u8 information'],
}, { }, {
'url': 'http://www.rtve.es/m/alacarta/videos/cuentame-como-paso/cuentame-como-paso-t16-ultimo-minuto-nuestra-vida-capitulo-276/2969138/?media=tve', 'url': 'http://www.rtve.es/m/alacarta/videos/cuentame-como-paso/cuentame-como-paso-t16-ultimo-minuto-nuestra-vida-capitulo-276/2969138/?media=tve',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'http://www.rtve.es/filmoteca/no-do/not-1-introduccion-primer-noticiario-espanol/1465256/', 'url': 'http://www.rtve.es/filmoteca/no-do/not-1-introduccion-primer-noticiario-espanol/1465256/',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.rtve.es/play/videos/saber-vivir/07-07-24/16177116/',
'md5': 'a5b24fcdfa3ff5cb7908aba53d22d4b6',
'info_dict': {
'id': '16177116',
'ext': 'mp4',
'title': 'Saber vivir - 07/07/24',
'thumbnail': r're:https://img2\.rtve\.es/v/.*\.png',
'duration': 2162.68,
'series': 'Saber vivir',
},
}, {
'url': 'https://www.rtve.es/infantil/serie/agus-lui-churros-crafts/video/gusano/7048976/',
'info_dict': {
'id': '7048976',
'ext': 'mp4',
'title': 'Gusano',
'thumbnail': r're:https://img2\.rtve\.es/v/.*\.png',
'duration': 292.86,
'series': 'Agus & Lui: Churros y Crafts',
'_old_archive_ids': ['rtveinfantil 7048976'],
},
}] }]
def _real_initialize(self): def _get_subtitles(self, video_id):
user_agent_b64 = base64.b64encode(self.get_param('http_headers')['User-Agent'].encode()).decode('utf-8') subtitle_data = self._download_json(
self._manager = self._download_json( f'https://api2.rtve.es/api/videos/{video_id}/subtitulos.json', video_id,
'http://www.rtve.es/odin/loki/' + user_agent_b64, 'Downloading subtitles info')
None, 'Fetching manager info')['manager'] return traverse_obj(subtitle_data, ('page', 'items', ..., {
'id': ('lang', {str}),
@staticmethod 'url': ('src', {url_or_none}),
def _decrypt_url(png): }, all, {subs_list_to_dict(lang='es')}))
encrypted_data = io.BytesIO(base64.b64decode(png)[8:])
while True:
length = struct.unpack('!I', encrypted_data.read(4))[0]
chunk_type = encrypted_data.read(4)
if chunk_type == b'IEND':
break
data = encrypted_data.read(length)
if chunk_type == b'tEXt':
alphabet_data, text = data.split(b'\0')
quality, url_data = text.split(b'%%')
alphabet = []
e = 0
d = 0
for l in alphabet_data.decode('iso-8859-1'):
if d == 0:
alphabet.append(l)
d = e = (e + 1) % 4
else:
d -= 1
url = ''
f = 0
e = 3
b = 1
for letter in url_data.decode('iso-8859-1'):
if f == 0:
l = int(letter) * 10
f = 1
else:
if e == 0:
l += int(letter)
url += alphabet[l]
e = (b + 3) % 4
f = 0
b += 1
else:
e -= 1
yield quality.decode(), url
encrypted_data.read(4) # CRC
def _extract_png_formats(self, video_id):
png = self._download_webpage(
f'http://www.rtve.es/ztnr/movil/thumbnail/{self._manager}/videos/{video_id}.png',
video_id, 'Downloading url information', query={'q': 'v2'})
q = qualities(['Media', 'Alta', 'HQ', 'HD_READY', 'HD_FULL'])
formats = []
for quality, video_url in self._decrypt_url(png):
ext = determine_ext(video_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
video_url, video_id, 'dash', fatal=False))
else:
formats.append({
'format_id': quality,
'quality': q(quality),
'url': video_url,
})
return formats
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
info = self._download_json( metadata = self._download_json(
f'http://www.rtve.es/api/videos/{video_id}/config/alacarta_videos.json', f'http://www.rtve.es/api/videos/{video_id}/config/alacarta_videos.json',
video_id)['page']['items'][0] video_id)['page']['items'][0]
if info['state'] == 'DESPU': if metadata['state'] == 'DESPU':
raise ExtractorError('The video is no longer available', expected=True) raise ExtractorError('The video is no longer available', expected=True)
title = info['title'].strip() formats, subtitles = self._extract_png_formats_and_subtitles(video_id)
formats = self._extract_png_formats(video_id)
subtitles = None self._merge_subtitles(self.extract_subtitles(video_id), target=subtitles)
sbt_file = info.get('sbtFile')
if sbt_file:
subtitles = self.extract_subtitles(video_id, sbt_file)
is_live = info.get('live') is True is_infantil = urllib.parse.urlparse(url).path.startswith('/infantil/')
return { return {
'id': video_id, 'id': video_id,
'title': title,
'formats': formats, 'formats': formats,
'thumbnail': info.get('image'),
'subtitles': subtitles, 'subtitles': subtitles,
'duration': float_or_none(info.get('duration'), 1000), **self._parse_metadata(metadata),
'is_live': is_live, '_old_archive_ids': [make_archive_id('rtveinfantil', video_id)] if is_infantil else None,
'series': info.get('programTitle'),
} }
def _get_subtitles(self, video_id, sub_file):
subs = self._download_json(
sub_file + '.json', video_id,
'Downloading subtitles info')['page']['items']
return dict(
(s['lang'], [{'ext': 'vtt', 'url': s['src']}])
for s in subs)
class RTVEAudioIE(RTVEBaseIE):
class RTVEAudioIE(RTVEALaCartaIE): # XXX: Do not subclass from concrete IE
IE_NAME = 'rtve.es:audio' IE_NAME = 'rtve.es:audio'
IE_DESC = 'RTVE audio' IE_DESC = 'RTVE audio'
_VALID_URL = r'https?://(?:www\.)?rtve\.es/(alacarta|play)/audios/[^/]+/[^/]+/(?P<id>[0-9]+)' _VALID_URL = r'https?://(?:www\.)?rtve\.es/(alacarta|play)/audios/(?:[^/?#]+/){2}(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.rtve.es/alacarta/audios/a-hombros-de-gigantes/palabra-ingeniero-codigos-informaticos-27-04-21/5889192/', 'url': 'https://www.rtve.es/alacarta/audios/a-hombros-de-gigantes/palabra-ingeniero-codigos-informaticos-27-04-21/5889192/',
@@ -180,9 +235,11 @@ class RTVEAudioIE(RTVEALaCartaIE): # XXX: Do not subclass from concrete IE
'id': '5889192', 'id': '5889192',
'ext': 'mp3', 'ext': 'mp3',
'title': 'Códigos informáticos', 'title': 'Códigos informáticos',
'thumbnail': r're:https?://.+/1598856591583.jpg', 'alt_title': 'Códigos informáticos - Escuchar ahora',
'duration': 349.440, 'duration': 349.440,
'series': 'A hombros de gigantes', 'series': 'A hombros de gigantes',
'description': 'md5:72b0d7c1ca20fd327bdfff7ac0171afb',
'thumbnail': 'https://img2.rtve.es/a/palabra-ingeniero-codigos-informaticos-270421_5889192.png',
}, },
}, { }, {
'url': 'https://www.rtve.es/play/audios/en-radio-3/ignatius-farray/5791165/', 'url': 'https://www.rtve.es/play/audios/en-radio-3/ignatius-farray/5791165/',
@@ -191,9 +248,11 @@ class RTVEAudioIE(RTVEALaCartaIE): # XXX: Do not subclass from concrete IE
'id': '5791165', 'id': '5791165',
'ext': 'mp3', 'ext': 'mp3',
'title': 'Ignatius Farray', 'title': 'Ignatius Farray',
'alt_title': 'En Radio 3 - Ignatius Farray - 13/02/21 - escuchar ahora',
'thumbnail': r're:https?://.+/1613243011863.jpg', 'thumbnail': r're:https?://.+/1613243011863.jpg',
'duration': 3559.559, 'duration': 3559.559,
'series': 'En Radio 3', 'series': 'En Radio 3',
'description': 'md5:124aa60b461e0b1724a380bad3bc4040',
}, },
}, { }, {
'url': 'https://www.rtve.es/play/audios/frankenstein-o-el-moderno-prometeo/capitulo-26-ultimo-muerte-victor-juan-jose-plans-mary-shelley/6082623/', 'url': 'https://www.rtve.es/play/audios/frankenstein-o-el-moderno-prometeo/capitulo-26-ultimo-muerte-victor-juan-jose-plans-mary-shelley/6082623/',
@@ -202,126 +261,101 @@ class RTVEAudioIE(RTVEALaCartaIE): # XXX: Do not subclass from concrete IE
'id': '6082623', 'id': '6082623',
'ext': 'mp3', 'ext': 'mp3',
'title': 'Capítulo 26 y último: La muerte de Victor', 'title': 'Capítulo 26 y último: La muerte de Victor',
'alt_title': 'Frankenstein o el moderno Prometeo - Capítulo 26 y último: La muerte de Victor',
'thumbnail': r're:https?://.+/1632147445707.jpg', 'thumbnail': r're:https?://.+/1632147445707.jpg',
'duration': 3174.086, 'duration': 3174.086,
'series': 'Frankenstein o el moderno Prometeo', 'series': 'Frankenstein o el moderno Prometeo',
'description': 'md5:4ee6fcb82ebe2e46d267e1d1c1a8f7b5',
}, },
}] }]
def _extract_png_formats(self, audio_id):
"""
This function retrieves media related png thumbnail which obfuscate
valuable information about the media. This information is decrypted
via base class _decrypt_url function providing media quality and
media url
"""
png = self._download_webpage(
f'http://www.rtve.es/ztnr/movil/thumbnail/{self._manager}/audios/{audio_id}.png',
audio_id, 'Downloading url information', query={'q': 'v2'})
q = qualities(['Media', 'Alta', 'HQ', 'HD_READY', 'HD_FULL'])
formats = []
for quality, audio_url in self._decrypt_url(png):
ext = determine_ext(audio_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
audio_url, audio_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
audio_url, audio_id, 'dash', fatal=False))
else:
formats.append({
'format_id': quality,
'quality': q(quality),
'url': audio_url,
})
return formats
def _real_extract(self, url): def _real_extract(self, url):
audio_id = self._match_id(url) audio_id = self._match_id(url)
info = self._download_json( metadata = self._download_json(
f'https://www.rtve.es/api/audios/{audio_id}.json', f'https://www.rtve.es/api/audios/{audio_id}.json', audio_id)['page']['items'][0]
audio_id)['page']['items'][0]
formats, subtitles = self._extract_png_formats_and_subtitles(audio_id, media_type='audios')
return { return {
'id': audio_id, 'id': audio_id,
'title': info['title'].strip(), 'formats': formats,
'thumbnail': info.get('thumbnail'), 'subtitles': subtitles,
'duration': float_or_none(info.get('duration'), 1000), **self._parse_metadata(metadata),
'series': try_get(info, lambda x: x['programInfo']['title']),
'formats': self._extract_png_formats(audio_id),
} }
class RTVEInfantilIE(RTVEALaCartaIE): # XXX: Do not subclass from concrete IE class RTVELiveIE(RTVEBaseIE):
IE_NAME = 'rtve.es:infantil'
IE_DESC = 'RTVE infantil'
_VALID_URL = r'https?://(?:www\.)?rtve\.es/infantil/serie/[^/]+/video/[^/]+/(?P<id>[0-9]+)/'
_TESTS = [{
'url': 'http://www.rtve.es/infantil/serie/cleo/video/maneras-vivir/3040283/',
'md5': '5747454717aedf9f9fdf212d1bcfc48d',
'info_dict': {
'id': '3040283',
'ext': 'mp4',
'title': 'Maneras de vivir',
'thumbnail': r're:https?://.+/1426182947956\.JPG',
'duration': 357.958,
},
'expected_warnings': ['Failed to download MPD manifest', 'Failed to download m3u8 information'],
}]
class RTVELiveIE(RTVEALaCartaIE): # XXX: Do not subclass from concrete IE
IE_NAME = 'rtve.es:live' IE_NAME = 'rtve.es:live'
IE_DESC = 'RTVE.es live streams' IE_DESC = 'RTVE.es live streams'
_VALID_URL = r'https?://(?:www\.)?rtve\.es/directo/(?P<id>[a-zA-Z0-9-]+)' _VALID_URL = [
r'https?://(?:www\.)?rtve\.es/directo/(?P<id>[a-zA-Z0-9-]+)',
r'https?://(?:www\.)?rtve\.es/play/videos/directo/[^/?#]+/(?P<id>[a-zA-Z0-9-]+)',
]
_TESTS = [{ _TESTS = [{
'url': 'http://www.rtve.es/directo/la-1/', 'url': 'http://www.rtve.es/directo/la-1/',
'info_dict': { 'info_dict': {
'id': 'la-1', 'id': 'la-1',
'ext': 'mp4', 'ext': 'mp4',
'title': 're:^La 1 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$', 'live_status': 'is_live',
'title': str,
'description': str,
'thumbnail': r're:https://img\d\.rtve\.es/resources/thumbslive/\d+\.jpg',
'timestamp': int,
'upload_date': str,
}, },
'params': { 'params': {'skip_download': 'live stream'},
'skip_download': 'live stream', }, {
'url': 'https://www.rtve.es/play/videos/directo/deportes/tdp/',
'info_dict': {
'id': 'tdp',
'ext': 'mp4',
'live_status': 'is_live',
'title': str,
'description': str,
'thumbnail': r're:https://img2\d\.rtve\.es/resources/thumbslive/\d+\.jpg',
'timestamp': int,
'upload_date': str,
}, },
'params': {'skip_download': 'live stream'},
}, {
'url': 'http://www.rtve.es/play/videos/directo/canales-lineales/la-1/',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
mobj = self._match_valid_url(url) video_id = self._match_id(url)
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
title = remove_end(self._og_search_title(webpage), ' en directo en RTVE.es')
title = remove_start(title, 'Estoy viendo ')
vidplayer_id = self._search_regex( data_setup = self._search_json(
(r'playerId=player([0-9]+)', r'<div[^>]+class="[^"]*videoPlayer[^"]*"[^>]*data-setup=\'',
r'class=["\'].*?\blive_mod\b.*?["\'][^>]+data-assetid=["\'](\d+)', webpage, 'data_setup', video_id)
r'data-id=["\'](\d+)'),
webpage, 'internal video ID') formats, subtitles = self._extract_png_formats_and_subtitles(data_setup['idAsset'])
return { return {
'id': video_id, 'id': video_id,
'title': title, **self._search_json_ld(webpage, video_id, fatal=False),
'formats': self._extract_png_formats(vidplayer_id), 'title': self._html_extract_title(webpage),
'formats': formats,
'subtitles': subtitles,
'is_live': True, 'is_live': True,
} }
class RTVETelevisionIE(InfoExtractor): class RTVETelevisionIE(InfoExtractor):
IE_NAME = 'rtve.es:television' IE_NAME = 'rtve.es:television'
_VALID_URL = r'https?://(?:www\.)?rtve\.es/television/[^/]+/[^/]+/(?P<id>\d+).shtml' _VALID_URL = r'https?://(?:www\.)?rtve\.es/television/[^/?#]+/[^/?#]+/(?P<id>\d+).shtml'
_TEST = { _TEST = {
'url': 'http://www.rtve.es/television/20160628/revolucion-del-movil/1364141.shtml', 'url': 'https://www.rtve.es/television/20091103/video-inedito-del-8o-programa/299020.shtml',
'info_dict': { 'info_dict': {
'id': '3069778', 'id': '572515',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Documentos TV - La revolución del móvil', 'title': 'Clase inédita',
'duration': 3496.948, 'duration': 335.817,
'thumbnail': r're:https://img2\.rtve\.es/v/.*\.png',
'series': 'El coro de la cárcel',
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
@@ -332,11 +366,8 @@ def _real_extract(self, url):
page_id = self._match_id(url) page_id = self._match_id(url)
webpage = self._download_webpage(url, page_id) webpage = self._download_webpage(url, page_id)
alacarta_url = self._search_regex( play_url = self._html_search_meta('contentUrl', webpage)
r'data-location="alacarta_videos"[^<]+url&quot;:&quot;(http://www\.rtve\.es/alacarta.+?)&', if play_url is None:
webpage, 'alacarta url', default=None) raise ExtractorError('The webpage doesn\'t contain any video', expected=True)
if alacarta_url is None:
raise ExtractorError(
'The webpage doesn\'t contain any video', expected=True)
return self.url_result(alacarta_url, ie=RTVEALaCartaIE.ie_key()) return self.url_result(play_url, ie=RTVEALaCartaIE.ie_key())

View File

@@ -9,7 +9,9 @@
class RTVSIE(InfoExtractor): class RTVSIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?rtvs\.sk/(?:radio|televizia)/archiv(?:/\d+)?/(?P<id>\d+)/?(?:[#?]|$)' IE_NAME = 'stvr'
IE_DESC = 'Slovak Television and Radio (formerly RTVS)'
_VALID_URL = r'https?://(?:www\.)?(?:rtvs|stvr)\.sk/(?:radio|televizia)/archiv(?:/\d+)?/(?P<id>\d+)/?(?:[#?]|$)'
_TESTS = [{ _TESTS = [{
# radio archive # radio archive
'url': 'http://www.rtvs.sk/radio/archiv/11224/414872', 'url': 'http://www.rtvs.sk/radio/archiv/11224/414872',
@@ -19,7 +21,7 @@ class RTVSIE(InfoExtractor):
'ext': 'mp3', 'ext': 'mp3',
'title': 'Ostrov pokladov 1 časť.mp3', 'title': 'Ostrov pokladov 1 časť.mp3',
'duration': 2854, 'duration': 2854,
'thumbnail': 'https://www.rtvs.sk/media/a501/image/file/2/0000/b1R8.rtvs.jpg', 'thumbnail': 'https://www.stvr.sk/media/a501/image/file/2/0000/rtvs-00009383.png',
'display_id': '135331', 'display_id': '135331',
}, },
}, { }, {
@@ -30,7 +32,7 @@ class RTVSIE(InfoExtractor):
'ext': 'mp4', 'ext': 'mp4',
'title': 'Amaro Džives - Náš deň', 'title': 'Amaro Džives - Náš deň',
'description': 'Galavečer pri príležitosti Medzinárodného dňa Rómov.', 'description': 'Galavečer pri príležitosti Medzinárodného dňa Rómov.',
'thumbnail': 'https://www.rtvs.sk/media/a501/image/file/2/0031/L7Qm.amaro_dzives_png.jpg', 'thumbnail': 'https://www.stvr.sk/media/a501/image/file/2/0031/L7Qm.amaro_dzives_png.jpg',
'timestamp': 1428555900, 'timestamp': 1428555900,
'upload_date': '20150409', 'upload_date': '20150409',
'duration': 4986, 'duration': 4986,
@@ -47,8 +49,11 @@ class RTVSIE(InfoExtractor):
'display_id': '307655', 'display_id': '307655',
'duration': 831, 'duration': 831,
'upload_date': '20211111', 'upload_date': '20211111',
'thumbnail': 'https://www.rtvs.sk/media/a501/image/file/2/0916/robin.jpg', 'thumbnail': 'https://www.stvr.sk/media/a501/image/file/2/0916/robin.jpg',
}, },
}, {
'url': 'https://www.stvr.sk/radio/archiv/11224/414872',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):

View File

@@ -7,7 +7,6 @@
ExtractorError, ExtractorError,
UnsupportedError, UnsupportedError,
clean_html, clean_html,
determine_ext,
extract_attributes, extract_attributes,
format_field, format_field,
get_element_by_class, get_element_by_class,
@@ -36,7 +35,7 @@ class RumbleEmbedIE(InfoExtractor):
'upload_date': '20191020', 'upload_date': '20191020',
'channel_url': 'https://rumble.com/c/WMAR', 'channel_url': 'https://rumble.com/c/WMAR',
'channel': 'WMAR', 'channel': 'WMAR',
'thumbnail': 'https://sp.rmbl.ws/s8/1/5/M/z/1/5Mz1a.qR4e-small-WMAR-2-News-Latest-Headline.jpg', 'thumbnail': r're:https://.+\.jpg',
'duration': 234, 'duration': 234,
'uploader': 'WMAR', 'uploader': 'WMAR',
'live_status': 'not_live', 'live_status': 'not_live',
@@ -52,7 +51,7 @@ class RumbleEmbedIE(InfoExtractor):
'upload_date': '20220217', 'upload_date': '20220217',
'channel_url': 'https://rumble.com/c/CyberTechNews', 'channel_url': 'https://rumble.com/c/CyberTechNews',
'channel': 'CTNews', 'channel': 'CTNews',
'thumbnail': 'https://sp.rmbl.ws/s8/6/7/i/9/h/7i9hd.OvCc.jpg', 'thumbnail': r're:https://.+\.jpg',
'duration': 901, 'duration': 901,
'uploader': 'CTNews', 'uploader': 'CTNews',
'live_status': 'not_live', 'live_status': 'not_live',
@@ -114,6 +113,22 @@ class RumbleEmbedIE(InfoExtractor):
'live_status': 'was_live', 'live_status': 'was_live',
}, },
'params': {'skip_download': True}, 'params': {'skip_download': True},
}, {
'url': 'https://rumble.com/embed/v6pezdb',
'info_dict': {
'id': 'v6pezdb',
'ext': 'mp4',
'title': '"Es war einmal ein Mädchen" Ein filmisches Zeitzeugnis aus Leningrad 1944',
'uploader': 'RT DE',
'channel': 'RT DE',
'channel_url': 'https://rumble.com/c/RTDE',
'duration': 309,
'thumbnail': 'https://1a-1791.com/video/fww1/dc/s8/1/n/z/2/y/nz2yy.qR4e-small-Es-war-einmal-ein-Mdchen-Ei.jpg',
'timestamp': 1743703500,
'upload_date': '20250403',
'live_status': 'not_live',
},
'params': {'skip_download': True},
}, { }, {
'url': 'https://rumble.com/embed/ufe9n.v5pv5f', 'url': 'https://rumble.com/embed/ufe9n.v5pv5f',
'only_matching': True, 'only_matching': True,
@@ -168,40 +183,42 @@ def _real_extract(self, url):
live_status = None live_status = None
formats = [] formats = []
for ext, ext_info in (video.get('ua') or {}).items(): for format_type, format_info in (video.get('ua') or {}).items():
if isinstance(ext_info, dict): if isinstance(format_info, dict):
for height, video_info in ext_info.items(): for height, video_info in format_info.items():
if not traverse_obj(video_info, ('meta', 'h', {int_or_none})): if not traverse_obj(video_info, ('meta', 'h', {int_or_none})):
video_info.setdefault('meta', {})['h'] = height video_info.setdefault('meta', {})['h'] = height
ext_info = ext_info.values() format_info = format_info.values()
for video_info in ext_info: for video_info in format_info:
meta = video_info.get('meta') or {} meta = video_info.get('meta') or {}
if not video_info.get('url'): if not video_info.get('url'):
continue continue
if ext == 'hls': # With default query params returns m3u8 variants which are duplicates, without returns tar files
if format_type == 'tar':
continue
if format_type == 'hls':
if meta.get('live') is True and video.get('live') == 1: if meta.get('live') is True and video.get('live') == 1:
live_status = 'post_live' live_status = 'post_live'
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
video_info['url'], video_id, video_info['url'], video_id,
ext='mp4', m3u8_id='hls', fatal=False, live=live_status == 'is_live')) ext='mp4', m3u8_id='hls', fatal=False, live=live_status == 'is_live'))
continue continue
timeline = ext == 'timeline' is_timeline = format_type == 'timeline'
if timeline: is_audio = format_type == 'audio'
ext = determine_ext(video_info['url'])
formats.append({ formats.append({
'ext': ext, 'acodec': 'none' if is_timeline else None,
'acodec': 'none' if timeline else None, 'vcodec': 'none' if is_audio else None,
'url': video_info['url'], 'url': video_info['url'],
'format_id': join_nonempty(ext, format_field(meta, 'h', '%sp')), 'format_id': join_nonempty(format_type, format_field(meta, 'h', '%sp')),
'format_note': 'Timeline' if timeline else None, 'format_note': 'Timeline' if is_timeline else None,
'fps': None if timeline else video.get('fps'), 'fps': None if is_timeline or is_audio else video.get('fps'),
**traverse_obj(meta, { **traverse_obj(meta, {
'tbr': 'bitrate', 'tbr': ('bitrate', {int_or_none}),
'filesize': 'size', 'filesize': ('size', {int_or_none}),
'width': 'w', 'width': ('w', {int_or_none}),
'height': 'h', 'height': ('h', {int_or_none}),
}, expected_type=lambda x: int(x) or None), }),
}) })
subtitles = { subtitles = {

View File

@@ -122,6 +122,15 @@ def _real_extract(self, url):
if traverse_obj(media, ('partOfSeries', {dict})): if traverse_obj(media, ('partOfSeries', {dict})):
media['epName'] = traverse_obj(media, ('title', {str})) media['epName'] = traverse_obj(media, ('title', {str}))
# Need to set different language for forced subs or else they have priority over full subs
fixed_subtitles = {}
for lang, subs in subtitles.items():
for sub in subs:
fixed_lang = lang
if sub['url'].lower().endswith('_fe.vtt'):
fixed_lang += '-forced'
fixed_subtitles.setdefault(fixed_lang, []).append(sub)
return { return {
'id': video_id, 'id': video_id,
**traverse_obj(media, { **traverse_obj(media, {
@@ -151,6 +160,6 @@ def _real_extract(self, url):
}), }),
}), }),
'formats': formats, 'formats': formats,
'subtitles': subtitles, 'subtitles': fixed_subtitles,
'uploader': 'SBSC', 'uploader': 'SBSC',
} }

View File

@@ -13,7 +13,7 @@
class SenateISVPIE(InfoExtractor): class SenateISVPIE(InfoExtractor):
_IE_NAME = 'senate.gov:isvp' IE_NAME = 'senate.gov:isvp'
_VALID_URL = r'https?://(?:www\.)?senate\.gov/isvp/?\?(?P<qs>.+)' _VALID_URL = r'https?://(?:www\.)?senate\.gov/isvp/?\?(?P<qs>.+)'
_EMBED_REGEX = [r"<iframe[^>]+src=['\"](?P<url>https?://www\.senate\.gov/isvp/?\?[^'\"]+)['\"]"] _EMBED_REGEX = [r"<iframe[^>]+src=['\"](?P<url>https?://www\.senate\.gov/isvp/?\?[^'\"]+)['\"]"]
@@ -137,7 +137,7 @@ def _real_extract(self, url):
class SenateGovIE(InfoExtractor): class SenateGovIE(InfoExtractor):
_IE_NAME = 'senate.gov' IE_NAME = 'senate.gov'
_SUBDOMAIN_RE = '|'.join(map(re.escape, ( _SUBDOMAIN_RE = '|'.join(map(re.escape, (
'agriculture', 'aging', 'appropriations', 'armed-services', 'banking', 'agriculture', 'aging', 'appropriations', 'armed-services', 'banking',
'budget', 'commerce', 'energy', 'epw', 'finance', 'foreign', 'help', 'budget', 'commerce', 'energy', 'epw', 'finance', 'foreign', 'help',

236
yt_dlp/extractor/streaks.py Normal file
View File

@@ -0,0 +1,236 @@
import json
import urllib.parse
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
filter_dict,
float_or_none,
join_nonempty,
mimetype2ext,
parse_iso8601,
unsmuggle_url,
update_url_query,
url_or_none,
)
from ..utils.traversal import traverse_obj
class StreaksBaseIE(InfoExtractor):
_API_URL_TEMPLATE = 'https://{}.api.streaks.jp/v1/projects/{}/medias/{}{}'
_GEO_BYPASS = False
_GEO_COUNTRIES = ['JP']
def _extract_from_streaks_api(self, project_id, media_id, headers=None, query=None, ssai=False):
try:
response = self._download_json(
self._API_URL_TEMPLATE.format('playback', project_id, media_id, ''),
media_id, 'Downloading STREAKS playback API JSON', headers={
'Accept': 'application/json',
'Origin': 'https://players.streaks.jp',
**self.geo_verification_headers(),
**(headers or {}),
})
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status in {403, 404}:
error = self._parse_json(e.cause.response.read().decode(), media_id, fatal=False)
message = traverse_obj(error, ('message', {str}))
code = traverse_obj(error, ('code', {str}))
if code == 'REQUEST_FAILED':
self.raise_geo_restricted(message, countries=self._GEO_COUNTRIES)
elif code == 'MEDIA_NOT_FOUND':
raise ExtractorError(message, expected=True)
elif code or message:
raise ExtractorError(join_nonempty(code, message, delim=': '))
raise
streaks_id = response['id']
live_status = {
'clip': 'was_live',
'file': 'not_live',
'linear': 'is_live',
'live': 'is_live',
}.get(response.get('type'))
formats, subtitles = [], {}
drm_formats = False
for source in traverse_obj(response, ('sources', lambda _, v: v['src'])):
if source.get('key_systems'):
drm_formats = True
continue
src_url = source['src']
is_live = live_status == 'is_live'
ext = mimetype2ext(source.get('type'))
if ext != 'm3u8':
self.report_warning(f'Unsupported stream type: {ext}')
continue
if is_live and ssai:
session_params = traverse_obj(self._download_json(
self._API_URL_TEMPLATE.format('ssai', project_id, streaks_id, '/ssai/session'),
media_id, 'Downloading session parameters',
headers={'Content-Type': 'application/json', 'Accept': 'application/json'},
data=json.dumps({'id': source['id']}).encode(),
), (0, 'query', {urllib.parse.parse_qs}))
src_url = update_url_query(src_url, session_params)
fmts, subs = self._extract_m3u8_formats_and_subtitles(
src_url, media_id, 'mp4', m3u8_id='hls', fatal=False, live=is_live, query=query)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
if not formats and drm_formats:
self.report_drm(media_id)
self._remove_duplicate_formats(formats)
for subs in traverse_obj(response, (
'tracks', lambda _, v: v['kind'] in ('captions', 'subtitles') and url_or_none(v['src']),
)):
lang = traverse_obj(subs, ('srclang', {str.lower})) or 'ja'
subtitles.setdefault(lang, []).append({'url': subs['src']})
return {
'id': streaks_id,
'display_id': media_id,
'formats': formats,
'live_status': live_status,
'subtitles': subtitles,
'uploader_id': project_id,
**traverse_obj(response, {
'title': ('name', {str}),
'description': ('description', {str}, filter),
'duration': ('duration', {float_or_none}),
'modified_timestamp': ('updated_at', {parse_iso8601}),
'tags': ('tags', ..., {str}),
'thumbnails': (('poster', 'thumbnail'), 'src', {'url': {url_or_none}}),
'timestamp': ('created_at', {parse_iso8601}),
}),
}
class StreaksIE(StreaksBaseIE):
_VALID_URL = [
r'https?://players\.streaks\.jp/(?P<project_id>[\w-]+)/[\da-f]+/index\.html\?(?:[^#]+&)?m=(?P<id>(?:ref:)?[\w-]+)',
r'https?://playback\.api\.streaks\.jp/v1/projects/(?P<project_id>[\w-]+)/medias/(?P<id>(?:ref:)?[\w-]+)',
]
_EMBED_REGEX = [rf'<iframe\s+[^>]*\bsrc\s*=\s*["\'](?P<url>{_VALID_URL[0]})']
_TESTS = [{
'url': 'https://players.streaks.jp/tipness/08155cd19dc14c12bebefb69b92eafcc/index.html?m=dbdf2df35b4d483ebaeeaeb38c594647',
'info_dict': {
'id': 'dbdf2df35b4d483ebaeeaeb38c594647',
'ext': 'mp4',
'title': '3shunenCM_edit.mp4',
'display_id': 'dbdf2df35b4d483ebaeeaeb38c594647',
'duration': 47.533,
'live_status': 'not_live',
'modified_date': '20230726',
'modified_timestamp': 1690356180,
'timestamp': 1690355996,
'upload_date': '20230726',
'uploader_id': 'tipness',
},
}, {
'url': 'https://players.streaks.jp/ktv-web/0298e8964c164ab384c07ef6e08c444b/index.html?m=ref:mycoffeetime_250317',
'info_dict': {
'id': 'dccdc079e3fd41f88b0c8435e2d453ab',
'ext': 'mp4',
'title': 'わたしの珈琲時間_250317',
'display_id': 'ref:mycoffeetime_250317',
'duration': 122.99,
'live_status': 'not_live',
'modified_date': '20250310',
'modified_timestamp': 1741586302,
'thumbnail': r're:https?://.+\.jpg',
'timestamp': 1741585839,
'upload_date': '20250310',
'uploader_id': 'ktv-web',
},
}, {
'url': 'https://playback.api.streaks.jp/v1/projects/ktv-web/medias/b5411938e1e5435dac71edf829dd4813',
'info_dict': {
'id': 'b5411938e1e5435dac71edf829dd4813',
'ext': 'mp4',
'title': 'KANTELE_SYUSEi_0630',
'display_id': 'b5411938e1e5435dac71edf829dd4813',
'live_status': 'not_live',
'modified_date': '20250122',
'modified_timestamp': 1737522999,
'thumbnail': r're:https?://.+\.jpg',
'timestamp': 1735205137,
'upload_date': '20241226',
'uploader_id': 'ktv-web',
},
}, {
# TVer Olympics: website already down, but api remains accessible
'url': 'https://playback.api.streaks.jp/v1/projects/tver-olympic/medias/ref:sp_240806_1748_dvr',
'info_dict': {
'id': 'c10f7345adb648cf804d7578ab93b2e3',
'ext': 'mp4',
'title': 'サッカー 男子 準決勝_dvr',
'display_id': 'ref:sp_240806_1748_dvr',
'duration': 12960.0,
'live_status': 'was_live',
'modified_date': '20240805',
'modified_timestamp': 1722896263,
'timestamp': 1722777618,
'upload_date': '20240804',
'uploader_id': 'tver-olympic',
},
}, {
# TBS FREE: 24-hour stream
'url': 'https://playback.api.streaks.jp/v1/projects/tbs/medias/ref:simul-02',
'info_dict': {
'id': 'c4e83a7b48f4409a96adacec674b4e22',
'ext': 'mp4',
'title': str,
'display_id': 'ref:simul-02',
'live_status': 'is_live',
'modified_date': '20241031',
'modified_timestamp': 1730339858,
'timestamp': 1705466840,
'upload_date': '20240117',
'uploader_id': 'tbs',
},
}, {
# DRM protected
'url': 'https://players.streaks.jp/sp-jbc/a12d7ee0f40c49d6a0a2bff520639677/index.html?m=5f89c62f37ee4a68be8e6e3b1396c7d8',
'only_matching': True,
}]
_WEBPAGE_TESTS = [{
'url': 'https://event.play.jp/playnext2023/',
'info_dict': {
'id': '2d975178293140dc8074a7fc536a7604',
'ext': 'mp4',
'title': 'PLAY NEXTキームービー本番',
'uploader_id': 'play',
'duration': 17.05,
'thumbnail': r're:https?://.+\.jpg',
'timestamp': 1668387517,
'upload_date': '20221114',
'modified_timestamp': 1739411523,
'modified_date': '20250213',
'live_status': 'not_live',
},
}, {
'url': 'https://wowshop.jp/Page/special/cooking_goods/?bid=wowshop&srsltid=AfmBOor_phUNoPEE_UCPiGGSCMrJE5T2US397smvsbrSdLqUxwON0el4',
'playlist_mincount': 2,
'info_dict': {
'id': '?bid=wowshop&srsltid=AfmBOor_phUNoPEE_UCPiGGSCMrJE5T2US397smvsbrSdLqUxwON0el4',
'title': 'ワンランク上の料理道具でとびきりの“おいしい”を食卓へwowshop',
'description': 'md5:914b5cb8624fc69274c7fb7b2342958f',
'age_limit': 0,
'thumbnail': 'https://wowshop.jp/Page/special/cooking_goods/images/ogp.jpg',
},
}]
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})
project_id, media_id = self._match_valid_url(url).group('project_id', 'id')
return self._extract_from_streaks_api(
project_id, media_id, headers=filter_dict({
'X-Streaks-Api-Key': smuggled_data.get('api_key'),
}))

View File

@@ -191,12 +191,12 @@ class TapTapAppIE(TapTapBaseIE):
}] }]
class TapTapIntlBase(TapTapBaseIE): class TapTapIntlBaseIE(TapTapBaseIE):
_X_UA = 'V=1&PN=WebAppIntl2&LANG=zh_TW&VN_CODE=115&VN=0.1.0&LOC=CN&PLT=PC&DS=Android&UID={uuid}&CURR=&DT=PC&OS=Windows&OSV=NT%208.0.0' _X_UA = 'V=1&PN=WebAppIntl2&LANG=zh_TW&VN_CODE=115&VN=0.1.0&LOC=CN&PLT=PC&DS=Android&UID={uuid}&CURR=&DT=PC&OS=Windows&OSV=NT%208.0.0'
_VIDEO_API = 'https://www.taptap.io/webapiv2/video-resource/v1/multi-get' _VIDEO_API = 'https://www.taptap.io/webapiv2/video-resource/v1/multi-get'
class TapTapAppIntlIE(TapTapIntlBase): class TapTapAppIntlIE(TapTapIntlBaseIE):
_VALID_URL = r'https?://www\.taptap\.io/app/(?P<id>\d+)' _VALID_URL = r'https?://www\.taptap\.io/app/(?P<id>\d+)'
_INFO_API = 'https://www.taptap.io/webapiv2/i/app/v5/detail' _INFO_API = 'https://www.taptap.io/webapiv2/i/app/v5/detail'
_DATA_PATH = 'app' _DATA_PATH = 'app'
@@ -227,7 +227,7 @@ class TapTapAppIntlIE(TapTapIntlBase):
}] }]
class TapTapPostIntlIE(TapTapIntlBase): class TapTapPostIntlIE(TapTapIntlBaseIE):
_VALID_URL = r'https?://www\.taptap\.io/post/(?P<id>\d+)' _VALID_URL = r'https?://www\.taptap\.io/post/(?P<id>\d+)'
_INFO_API = 'https://www.taptap.io/webapiv2/creation/post/v1/detail' _INFO_API = 'https://www.taptap.io/webapiv2/creation/post/v1/detail'
_INFO_QUERY_KEY = 'id_str' _INFO_QUERY_KEY = 'id_str'

View File

@@ -2,12 +2,13 @@
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from .jwplatform import JWPlatformIE
from ..utils import ( from ..utils import (
determine_ext, determine_ext,
extract_attributes,
js_to_json, js_to_json,
url_or_none, url_or_none,
) )
from ..utils.traversal import find_element, traverse_obj
class TV2DKIE(InfoExtractor): class TV2DKIE(InfoExtractor):
@@ -21,35 +22,46 @@ class TV2DKIE(InfoExtractor):
tv2fyn| tv2fyn|
tv2east| tv2east|
tv2lorry| tv2lorry|
tv2nord tv2nord|
tv2kosmopol
)\.dk/ )\.dk/
(:[^/]+/)* (?:[^/?#]+/)*
(?P<id>[^/?\#&]+) (?P<id>[^/?\#&]+)
''' '''
_TESTS = [{ _TESTS = [{
'url': 'https://www.tvsyd.dk/nyheder/28-10-2019/1930/1930-28-okt-2019?autoplay=1#player', 'url': 'https://www.tvsyd.dk/nyheder/28-10-2019/1930/1930-28-okt-2019?autoplay=1#player',
'info_dict': { 'info_dict': {
'id': '0_52jmwa0p', 'id': 'sPp5z21q',
'ext': 'mp4', 'ext': 'mp4',
'title': '19:30 - 28. okt. 2019', 'title': '19:30 - 28. okt. 2019',
'timestamp': 1572290248, 'description': '',
'thumbnail': 'https://cdn.jwplayer.com/v2/media/sPp5z21q/poster.jpg?width=720',
'timestamp': 1572287400,
'upload_date': '20191028', 'upload_date': '20191028',
'uploader_id': 'tvsyd',
'duration': 1347,
'view_count': int,
}, },
'add_ie': ['Kaltura'],
}, { }, {
'url': 'https://www.tv2lorry.dk/gadekamp/gadekamp-6-hoejhuse-i-koebenhavn', 'url': 'https://www.tv2lorry.dk/gadekamp/gadekamp-6-hoejhuse-i-koebenhavn',
'info_dict': { 'info_dict': {
'id': '1_7iwll9n0', 'id': 'oD9cyq0m',
'ext': 'mp4', 'ext': 'mp4',
'upload_date': '20211027',
'title': 'Gadekamp #6 - Højhuse i København', 'title': 'Gadekamp #6 - Højhuse i København',
'uploader_id': 'tv2lorry', 'description': '',
'timestamp': 1635345229, 'thumbnail': 'https://cdn.jwplayer.com/v2/media/oD9cyq0m/poster.jpg?width=720',
'timestamp': 1635348600,
'upload_date': '20211027',
}, },
'add_ie': ['Kaltura'], }, {
'url': 'https://www.tvsyd.dk/haderslev/x-factor-brodre-fulde-af-selvtillid-er-igen-hjemme-hos-mor-vores-diagnoser-har-vaeret-en-fordel',
'info_dict': {
'id': 'x-factor-brodre-fulde-af-selvtillid-er-igen-hjemme-hos-mor-vores-diagnoser-har-vaeret-en-fordel',
},
'playlist_count': 2,
}, {
'url': 'https://www.tv2ostjylland.dk/aarhus/dom-kan-fa-alvorlige-konsekvenser',
'info_dict': {
'id': 'dom-kan-fa-alvorlige-konsekvenser',
},
'playlist_count': 3,
}, { }, {
'url': 'https://www.tv2ostjylland.dk/artikel/minister-gaar-ind-i-sag-om-diabetes-teknologi', 'url': 'https://www.tv2ostjylland.dk/artikel/minister-gaar-ind-i-sag-om-diabetes-teknologi',
'only_matching': True, 'only_matching': True,
@@ -71,40 +83,22 @@ class TV2DKIE(InfoExtractor):
}, { }, {
'url': 'https://www.tv2nord.dk/artikel/dybt-uacceptabelt', 'url': 'https://www.tv2nord.dk/artikel/dybt-uacceptabelt',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.tv2kosmopol.dk/metropolen/chaufforer-beordres-til-at-kore-videre-i-ulovlige-busser-med-rode-advarselslamper',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
search_space = traverse_obj(webpage, {find_element(tag='article')}) or webpage
entries = [] player_ids = traverse_obj(
re.findall(r'x-data="(?:video_player|simple_player)\(({[^"]+})', search_space),
(..., {js_to_json}, {json.loads}, ('jwpMediaId', 'videoId'), {str}))
def add_entry(partner_id, kaltura_id): return self.playlist_from_matches(
entries.append(self.url_result( player_ids, video_id, getter=lambda x: f'jwplatform:{x}', ie=JWPlatformIE)
f'kaltura:{partner_id}:{kaltura_id}', 'Kaltura',
video_id=kaltura_id))
for video_el in re.findall(r'(?s)<[^>]+\bdata-entryid\s*=[^>]*>', webpage):
video = extract_attributes(video_el)
kaltura_id = video.get('data-entryid')
if not kaltura_id:
continue
partner_id = video.get('data-partnerid')
if not partner_id:
continue
add_entry(partner_id, kaltura_id)
if not entries:
kaltura_id = self._search_regex(
(r'entry_id\s*:\s*["\']([0-9a-z_]+)',
r'\\u002FentryId\\u002F(\w+)\\u002F'), webpage, 'kaltura id')
partner_id = self._search_regex(
(r'\\u002Fp\\u002F(\d+)\\u002F', r'/p/(\d+)/'), webpage,
'partner id')
add_entry(partner_id, kaltura_id)
if len(entries) == 1:
return entries[0]
return self.playlist_result(entries)
class TV2DKBornholmPlayIE(InfoExtractor): class TV2DKBornholmPlayIE(InfoExtractor):

View File

@@ -1,31 +1,70 @@
from .common import InfoExtractor from .streaks import StreaksBaseIE
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none,
join_nonempty, join_nonempty,
make_archive_id,
smuggle_url, smuggle_url,
str_or_none, str_or_none,
strip_or_none, strip_or_none,
traverse_obj,
update_url_query, update_url_query,
) )
from ..utils.traversal import require, traverse_obj
class TVerIE(InfoExtractor): class TVerIE(StreaksBaseIE):
_VALID_URL = r'https?://(?:www\.)?tver\.jp/(?:(?P<type>lp|corner|series|episodes?|feature)/)+(?P<id>[a-zA-Z0-9]+)' _VALID_URL = r'https?://(?:www\.)?tver\.jp/(?:(?P<type>lp|corner|series|episodes?|feature)/)+(?P<id>[a-zA-Z0-9]+)'
_GEO_COUNTRIES = ['JP']
_GEO_BYPASS = False
_TESTS = [{ _TESTS = [{
'skip': 'videos are only available for 7 days', # via Streaks backend
'url': 'https://tver.jp/episodes/ep83nf3w4p', 'url': 'https://tver.jp/episodes/epc1hdugbk',
'info_dict': { 'info_dict': {
'title': '家事ヤロウ!!! 売り場席巻のチーズSP財前直見×森泉親子の脱東京暮らし密着', 'id': 'epc1hdugbk',
'description': 'md5:dc2c06b6acc23f1e7c730c513737719b',
'series': '家事ヤロウ!!!',
'episode': '売り場席巻のチーズSP財前直見×森泉親子の脱東京暮らし密着',
'alt_title': '売り場席巻のチーズSP財前直見×森泉親子の脱東京暮らし密着',
'channel': 'テレビ朝日',
'id': 'ep83nf3w4p',
'ext': 'mp4', 'ext': 'mp4',
'display_id': 'ref:baeebeac-a2a6-4dbf-9eb3-c40d59b40068',
'title': '神回だけ見せます! #2 壮烈!車大騎馬戦(木曜スペシャル)',
'alt_title': '神回だけ見せます! #2 壮烈!車大騎馬戦(木曜スペシャル) 日テレ',
'description': 'md5:2726f742d5e3886edeaf72fb6d740fef',
'uploader_id': 'tver-ntv',
'channel': '日テレ',
'duration': 1158.024,
'thumbnail': 'https://statics.tver.jp/images/content/thumbnail/episode/xlarge/epc1hdugbk.jpg?v=16',
'series': '神回だけ見せます!',
'episode': '#2 壮烈!車大騎馬戦(木曜スペシャル)',
'episode_number': 2,
'timestamp': 1736486036,
'upload_date': '20250110',
'modified_timestamp': 1736870264,
'modified_date': '20250114',
'live_status': 'not_live',
'release_timestamp': 1651453200,
'release_date': '20220502',
'_old_archive_ids': ['brightcovenew ref:baeebeac-a2a6-4dbf-9eb3-c40d59b40068'],
}, },
'add_ie': ['BrightcoveNew'], }, {
# via Brightcove backend (deprecated)
'url': 'https://tver.jp/episodes/epc1hdugbk',
'info_dict': {
'id': 'ref:baeebeac-a2a6-4dbf-9eb3-c40d59b40068',
'ext': 'mp4',
'title': '神回だけ見せます! #2 壮烈!車大騎馬戦(木曜スペシャル)',
'alt_title': '神回だけ見せます! #2 壮烈!車大騎馬戦(木曜スペシャル) 日テレ',
'description': 'md5:2726f742d5e3886edeaf72fb6d740fef',
'uploader_id': '4394098882001',
'channel': '日テレ',
'duration': 1158.101,
'thumbnail': 'https://statics.tver.jp/images/content/thumbnail/episode/xlarge/epc1hdugbk.jpg?v=16',
'tags': [],
'series': '神回だけ見せます!',
'episode': '#2 壮烈!車大騎馬戦(木曜スペシャル)',
'episode_number': 2,
'timestamp': 1651388531,
'upload_date': '20220501',
'release_timestamp': 1651453200,
'release_date': '20220502',
},
'params': {'extractor_args': {'tver': {'backend': ['brightcove']}}},
}, { }, {
'url': 'https://tver.jp/corner/f0103888', 'url': 'https://tver.jp/corner/f0103888',
'only_matching': True, 'only_matching': True,
@@ -38,26 +77,7 @@ class TVerIE(InfoExtractor):
'id': 'srtxft431v', 'id': 'srtxft431v',
'title': '名探偵コナン', 'title': '名探偵コナン',
}, },
'playlist': [ 'playlist_mincount': 21,
{
'md5': '779ffd97493ed59b0a6277ea726b389e',
'info_dict': {
'id': 'ref:conan-1137-241005',
'ext': 'mp4',
'title': '名探偵コナン #1137「行列店、味変の秘密」',
'uploader_id': '5330942432001',
'tags': [],
'channel': '読売テレビ',
'series': '名探偵コナン',
'description': 'md5:601fccc1d2430d942a2c8068c4b33eb5',
'episode': '#1137「行列店、味変の秘密」',
'duration': 1469.077,
'timestamp': 1728030405,
'upload_date': '20241004',
'alt_title': '名探偵コナン #1137「行列店、味変の秘密」 読売テレビ 10月5日(土)放送分',
'thumbnail': r're:https://.+\.jpg',
},
}],
}, { }, {
'url': 'https://tver.jp/series/sru35hwdd2', 'url': 'https://tver.jp/series/sru35hwdd2',
'info_dict': { 'info_dict': {
@@ -70,7 +90,11 @@ class TVerIE(InfoExtractor):
'only_matching': True, 'only_matching': True,
}] }]
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s' BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s'
_HEADERS = {'x-tver-platform-type': 'web'} _HEADERS = {
'x-tver-platform-type': 'web',
'Origin': 'https://tver.jp',
'Referer': 'https://tver.jp/',
}
_PLATFORM_QUERY = {} _PLATFORM_QUERY = {}
def _real_initialize(self): def _real_initialize(self):
@@ -103,6 +127,9 @@ def _yield_episode_ids_for_series(self, series_id):
def _real_extract(self, url): def _real_extract(self, url):
video_id, video_type = self._match_valid_url(url).group('id', 'type') video_id, video_type = self._match_valid_url(url).group('id', 'type')
backend = self._configuration_arg('backend', ['streaks'])[0]
if backend not in ('brightcove', 'streaks'):
raise ExtractorError(f'Invalid backend value: {backend}', expected=True)
if video_type == 'series': if video_type == 'series':
series_info = self._call_platform_api( series_info = self._call_platform_api(
@@ -129,12 +156,6 @@ def _real_extract(self, url):
video_info = self._download_json( video_info = self._download_json(
f'https://statics.tver.jp/content/episode/{video_id}.json', video_id, 'Downloading video info', f'https://statics.tver.jp/content/episode/{video_id}.json', video_id, 'Downloading video info',
query={'v': version}, headers={'Referer': 'https://tver.jp/'}) query={'v': version}, headers={'Referer': 'https://tver.jp/'})
p_id = video_info['video']['accountID']
r_id = traverse_obj(video_info, ('video', ('videoRefID', 'videoID')), get_all=False)
if not r_id:
raise ExtractorError('Failed to extract reference ID for Brightcove')
if not r_id.isdigit():
r_id = f'ref:{r_id}'
episode = strip_or_none(episode_content.get('title')) episode = strip_or_none(episode_content.get('title'))
series = str_or_none(episode_content.get('seriesTitle')) series = str_or_none(episode_content.get('seriesTitle'))
@@ -161,17 +182,53 @@ def _real_extract(self, url):
] ]
] ]
return { metadata = {
'_type': 'url_transparent',
'title': title, 'title': title,
'series': series, 'series': series,
'episode': episode, 'episode': episode,
# an another title which is considered "full title" for some viewers # an another title which is considered "full title" for some viewers
'alt_title': join_nonempty(title, provider, onair_label, delim=' '), 'alt_title': join_nonempty(title, provider, onair_label, delim=' '),
'channel': provider, 'channel': provider,
'description': str_or_none(video_info.get('description')),
'thumbnails': thumbnails, 'thumbnails': thumbnails,
**traverse_obj(video_info, {
'description': ('description', {str}),
'release_timestamp': ('viewStatus', 'startAt', {int_or_none}),
'episode_number': ('no', {int_or_none}),
}),
}
brightcove_id = traverse_obj(video_info, ('video', ('videoRefID', 'videoID'), {str}, any))
if brightcove_id and not brightcove_id.isdecimal():
brightcove_id = f'ref:{brightcove_id}'
streaks_id = traverse_obj(video_info, ('streaks', 'videoRefID', {str}))
if streaks_id and not streaks_id.startswith('ref:'):
streaks_id = f'ref:{streaks_id}'
# Deprecated Brightcove extraction reachable w/extractor-arg or fallback; errors are expected
if backend == 'brightcove' or not streaks_id:
if backend != 'brightcove':
self.report_warning(
'No STREAKS ID found; falling back to Brightcove extraction', video_id=video_id)
if not brightcove_id:
raise ExtractorError('Unable to extract brightcove reference ID', expected=True)
account_id = traverse_obj(video_info, (
'video', 'accountID', {str}, {require('brightcove account ID', expected=True)}))
return {
**metadata,
'_type': 'url_transparent',
'url': smuggle_url( 'url': smuggle_url(
self.BRIGHTCOVE_URL_TEMPLATE % (p_id, r_id), {'geo_countries': ['JP']}), self.BRIGHTCOVE_URL_TEMPLATE % (account_id, brightcove_id),
{'geo_countries': ['JP']}),
'ie_key': 'BrightcoveNew', 'ie_key': 'BrightcoveNew',
} }
return {
**self._extract_from_streaks_api(video_info['streaks']['projectID'], streaks_id, {
'Origin': 'https://tver.jp',
'Referer': 'https://tver.jp/',
}),
**metadata,
'id': video_id,
'_old_archive_ids': [make_archive_id('BrightcoveNew', brightcove_id)] if brightcove_id else None,
}

View File

@@ -513,7 +513,7 @@ def _parse_video(self, video, with_url=True):
class TVPVODVideoIE(TVPVODBaseIE): class TVPVODVideoIE(TVPVODBaseIE):
IE_NAME = 'tvp:vod' IE_NAME = 'tvp:vod'
_VALID_URL = r'https?://vod\.tvp\.pl/(?P<category>[a-z\d-]+,\d+)/[a-z\d-]+(?<!-odcinki)(?:-odcinki,\d+/odcinek-\d+,S\d+E\d+)?,(?P<id>\d+)/?(?:[?#]|$)' _VALID_URL = r'https?://vod\.tvp\.pl/(?P<category>[a-z\d-]+,\d+)/[a-z\d-]+(?<!-odcinki)(?:-odcinki,\d+/odcinek--?\d+,S-?\d+E-?\d+)?,(?P<id>\d+)/?(?:[?#]|$)'
_TESTS = [{ _TESTS = [{
'url': 'https://vod.tvp.pl/dla-dzieci,24/laboratorium-alchemika-odcinki,309338/odcinek-24,S01E24,311357', 'url': 'https://vod.tvp.pl/dla-dzieci,24/laboratorium-alchemika-odcinki,309338/odcinek-24,S01E24,311357',
@@ -568,6 +568,9 @@ class TVPVODVideoIE(TVPVODBaseIE):
'live_status': 'is_live', 'live_status': 'is_live',
'thumbnail': 're:https?://.+', 'thumbnail': 're:https?://.+',
}, },
}, {
'url': 'https://vod.tvp.pl/informacje-i-publicystyka,205/konskie-2025-debata-przedwyborcza-odcinki,2028435/odcinek--1,S01E-1,2028419',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):

View File

@@ -1,13 +1,21 @@
import json import json
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import clean_html, remove_end, unified_timestamp, url_or_none from ..utils import (
from ..utils.traversal import traverse_obj clean_html,
extract_attributes,
parse_qs,
remove_end,
require,
unified_timestamp,
url_or_none,
)
from ..utils.traversal import find_element, traverse_obj
class TvwIE(InfoExtractor): class TvwIE(InfoExtractor):
IE_NAME = 'tvw'
_VALID_URL = r'https?://(?:www\.)?tvw\.org/video/(?P<id>[^/?#]+)' _VALID_URL = r'https?://(?:www\.)?tvw\.org/video/(?P<id>[^/?#]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://tvw.org/video/billy-frank-jr-statue-maquette-unveiling-ceremony-2024011211/', 'url': 'https://tvw.org/video/billy-frank-jr-statue-maquette-unveiling-ceremony-2024011211/',
'md5': '9ceb94fe2bb7fd726f74f16356825703', 'md5': '9ceb94fe2bb7fd726f74f16356825703',
@@ -115,3 +123,43 @@ def _real_extract(self, url):
'is_live': ('eventStatus', {lambda x: x == 'live'}), 'is_live': ('eventStatus', {lambda x: x == 'live'}),
}), }),
} }
class TvwTvChannelsIE(InfoExtractor):
IE_NAME = 'tvw:tvchannels'
_VALID_URL = r'https?://(?:www\.)?tvw\.org/tvchannels/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://tvw.org/tvchannels/air/',
'info_dict': {
'id': 'air',
'ext': 'mp4',
'title': r're:TVW Cable Channel Live Stream',
'thumbnail': r're:https?://.+/.+\.(?:jpe?g|png)$',
'live_status': 'is_live',
},
}, {
'url': 'https://tvw.org/tvchannels/tvw2/',
'info_dict': {
'id': 'tvw2',
'ext': 'mp4',
'title': r're:TVW-2 Broadcast Channel',
'thumbnail': r're:https?://.+/.+\.(?:jpe?g|png)$',
'live_status': 'is_live',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
m3u8_url = traverse_obj(webpage, (
{find_element(id='invintus-persistent-stream-frame', html=True)}, {extract_attributes},
'src', {parse_qs}, 'encoder', 0, {json.loads}, 'live247URI', {url_or_none}, {require('stream url')}))
return {
'id': video_id,
'formats': self._extract_m3u8_formats(m3u8_url, video_id, 'mp4', m3u8_id='hls', live=True),
'title': remove_end(self._og_search_title(webpage, default=None), ' - TVW'),
'thumbnail': self._og_search_thumbnail(webpage, default=None),
'is_live': True,
}

View File

@@ -14,12 +14,13 @@
parse_duration, parse_duration,
qualities, qualities,
str_to_int, str_to_int,
traverse_obj,
try_get, try_get,
unified_timestamp, unified_timestamp,
url_or_none,
urlencode_postdata, urlencode_postdata,
urljoin, urljoin,
) )
from ..utils.traversal import traverse_obj
class TwitCastingIE(InfoExtractor): class TwitCastingIE(InfoExtractor):
@@ -138,13 +139,7 @@ def _real_extract(self, url):
r'data-toggle="true"[^>]+datetime="([^"]+)"', r'data-toggle="true"[^>]+datetime="([^"]+)"',
webpage, 'datetime', None)) webpage, 'datetime', None))
stream_server_data = self._download_json(
f'https://twitcasting.tv/streamserver.php?target={uploader_id}&mode=client', video_id,
'Downloading live info', fatal=False)
is_live = any(f'data-{x}' in webpage for x in ['is-onlive="true"', 'live-type="live"', 'status="online"']) is_live = any(f'data-{x}' in webpage for x in ['is-onlive="true"', 'live-type="live"', 'status="online"'])
if not traverse_obj(stream_server_data, 'llfmp4') and is_live:
self.raise_login_required(method='cookies')
base_dict = { base_dict = {
'title': title, 'title': title,
@@ -165,28 +160,37 @@ def find_dmu(x):
return [data_movie_url] return [data_movie_url]
m3u8_urls = (try_get(webpage, find_dmu, list) m3u8_urls = (try_get(webpage, find_dmu, list)
or traverse_obj(video_js_data, (..., 'source', 'url')) or traverse_obj(video_js_data, (..., 'source', 'url')))
or ([f'https://twitcasting.tv/{uploader_id}/metastream.m3u8'] if is_live else None))
if not m3u8_urls:
raise ExtractorError('Failed to get m3u8 playlist')
if is_live: if is_live:
m3u8_url = m3u8_urls[0] stream_data = self._download_json(
formats = self._extract_m3u8_formats( 'https://twitcasting.tv/streamserver.php',
m3u8_url, video_id, ext='mp4', m3u8_id='hls', video_id, 'Downloading live info', query={
live=True, headers=self._M3U8_HEADERS) 'target': uploader_id,
'mode': 'client',
'player': 'pc_web',
})
if traverse_obj(stream_server_data, ('hls', 'source')): formats = []
formats.extend(self._extract_m3u8_formats( # low: 640x360, medium: 1280x720, high: 1920x1080
m3u8_url, video_id, ext='mp4', m3u8_id='source', qq = qualities(['low', 'medium', 'high'])
live=True, query={'mode': 'source'}, for quality, m3u8_url in traverse_obj(stream_data, (
note='Downloading source quality m3u8', 'tc-hls', 'streams', {dict.items}, lambda _, v: url_or_none(v[1]),
headers=self._M3U8_HEADERS, fatal=False)) )):
formats.append({
'url': m3u8_url,
'format_id': f'hls-{quality}',
'ext': 'mp4',
'quality': qq(quality),
'protocol': 'm3u8',
'http_headers': self._M3U8_HEADERS,
})
if websockets: if websockets:
qq = qualities(['base', 'mobilesource', 'main']) qq = qualities(['base', 'mobilesource', 'main'])
streams = traverse_obj(stream_server_data, ('llfmp4', 'streams')) or {} for mode, ws_url in traverse_obj(stream_data, (
for mode, ws_url in streams.items(): 'llfmp4', 'streams', {dict.items}, lambda _, v: url_or_none(v[1]),
)):
formats.append({ formats.append({
'url': ws_url, 'url': ws_url,
'format_id': f'ws-{mode}', 'format_id': f'ws-{mode}',
@@ -197,10 +201,15 @@ def find_dmu(x):
'protocol': 'websocket_frag', 'protocol': 'websocket_frag',
}) })
if not formats:
self.raise_login_required()
infodict = { infodict = {
'formats': formats, 'formats': formats,
'_format_sort_fields': ('source', ), '_format_sort_fields': ('source', ),
} }
elif not m3u8_urls:
raise ExtractorError('Failed to get m3u8 playlist')
elif len(m3u8_urls) == 1: elif len(m3u8_urls) == 1:
formats = self._extract_m3u8_formats( formats = self._extract_m3u8_formats(
m3u8_urls[0], video_id, 'mp4', headers=self._M3U8_HEADERS) m3u8_urls[0], video_id, 'mp4', headers=self._M3U8_HEADERS)

View File

@@ -14,19 +14,20 @@
dict_get, dict_get,
float_or_none, float_or_none,
int_or_none, int_or_none,
join_nonempty,
make_archive_id, make_archive_id,
parse_duration, parse_duration,
parse_iso8601, parse_iso8601,
parse_qs, parse_qs,
qualities, qualities,
str_or_none, str_or_none,
traverse_obj,
try_get, try_get,
unified_timestamp, unified_timestamp,
update_url_query, update_url_query,
url_or_none, url_or_none,
urljoin, urljoin,
) )
from ..utils.traversal import traverse_obj, value
class TwitchBaseIE(InfoExtractor): class TwitchBaseIE(InfoExtractor):
@@ -42,10 +43,10 @@ class TwitchBaseIE(InfoExtractor):
'CollectionSideBar': '27111f1b382effad0b6def325caef1909c733fe6a4fbabf54f8d491ef2cf2f14', 'CollectionSideBar': '27111f1b382effad0b6def325caef1909c733fe6a4fbabf54f8d491ef2cf2f14',
'FilterableVideoTower_Videos': 'a937f1d22e269e39a03b509f65a7490f9fc247d7f83d6ac1421523e3b68042cb', 'FilterableVideoTower_Videos': 'a937f1d22e269e39a03b509f65a7490f9fc247d7f83d6ac1421523e3b68042cb',
'ClipsCards__User': 'b73ad2bfaecfd30a9e6c28fada15bd97032c83ec77a0440766a56fe0bd632777', 'ClipsCards__User': 'b73ad2bfaecfd30a9e6c28fada15bd97032c83ec77a0440766a56fe0bd632777',
'ShareClipRenderStatus': 'f130048a462a0ac86bb54d653c968c514e9ab9ca94db52368c1179e97b0f16eb',
'ChannelCollectionsContent': '447aec6a0cc1e8d0a8d7732d47eb0762c336a2294fdb009e9c9d854e49d484b9', 'ChannelCollectionsContent': '447aec6a0cc1e8d0a8d7732d47eb0762c336a2294fdb009e9c9d854e49d484b9',
'StreamMetadata': 'a647c2a13599e5991e175155f798ca7f1ecddde73f7f341f39009c14dbf59962', 'StreamMetadata': 'a647c2a13599e5991e175155f798ca7f1ecddde73f7f341f39009c14dbf59962',
'ComscoreStreamingQuery': 'e1edae8122517d013405f237ffcc124515dc6ded82480a88daef69c83b53ac01', 'ComscoreStreamingQuery': 'e1edae8122517d013405f237ffcc124515dc6ded82480a88daef69c83b53ac01',
'VideoAccessToken_Clip': '36b89d2507fce29e5ca551df756d27c1cfe079e2609642b4390aa4c35796eb11',
'VideoPreviewOverlay': '3006e77e51b128d838fa4e835723ca4dc9a05c5efd4466c1085215c6e437e65c', 'VideoPreviewOverlay': '3006e77e51b128d838fa4e835723ca4dc9a05c5efd4466c1085215c6e437e65c',
'VideoMetadata': '49b5b8f268cdeb259d75b58dcb0c1a748e3b575003448a2333dc5cdafd49adad', 'VideoMetadata': '49b5b8f268cdeb259d75b58dcb0c1a748e3b575003448a2333dc5cdafd49adad',
'VideoPlayer_ChapterSelectButtonVideo': '8d2793384aac3773beab5e59bd5d6f585aedb923d292800119e03d40cd0f9b41', 'VideoPlayer_ChapterSelectButtonVideo': '8d2793384aac3773beab5e59bd5d6f585aedb923d292800119e03d40cd0f9b41',
@@ -1083,16 +1084,44 @@ class TwitchClipsIE(TwitchBaseIE):
'url': 'https://clips.twitch.tv/FaintLightGullWholeWheat', 'url': 'https://clips.twitch.tv/FaintLightGullWholeWheat',
'md5': '761769e1eafce0ffebfb4089cb3847cd', 'md5': '761769e1eafce0ffebfb4089cb3847cd',
'info_dict': { 'info_dict': {
'id': '42850523', 'id': '396245304',
'display_id': 'FaintLightGullWholeWheat', 'display_id': 'FaintLightGullWholeWheat',
'ext': 'mp4', 'ext': 'mp4',
'title': 'EA Play 2016 Live from the Novo Theatre', 'title': 'EA Play 2016 Live from the Novo Theatre',
'duration': 32,
'view_count': int,
'thumbnail': r're:^https?://.*\.jpg', 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1465767393, 'timestamp': 1465767393,
'upload_date': '20160612', 'upload_date': '20160612',
'creator': 'EA', 'creators': ['EA'],
'uploader': 'stereotype_', 'channel': 'EA',
'uploader_id': '43566419', 'channel_id': '25163635',
'channel_is_verified': False,
'channel_follower_count': int,
'uploader': 'EA',
'uploader_id': '25163635',
},
}, {
'url': 'https://www.twitch.tv/xqc/clip/CulturedAmazingKuduDatSheffy-TiZ_-ixAGYR3y2Uy',
'md5': 'e90fe616b36e722a8cfa562547c543f0',
'info_dict': {
'id': '3207364882',
'display_id': 'CulturedAmazingKuduDatSheffy-TiZ_-ixAGYR3y2Uy',
'ext': 'mp4',
'title': 'A day in the life of xQc',
'duration': 60,
'view_count': int,
'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1742869615,
'upload_date': '20250325',
'creators': ['xQc'],
'channel': 'xQc',
'channel_id': '71092938',
'channel_is_verified': True,
'channel_follower_count': int,
'uploader': 'xQc',
'uploader_id': '71092938',
'categories': ['Just Chatting'],
}, },
}, { }, {
# multiple formats # multiple formats
@@ -1116,16 +1145,14 @@ class TwitchClipsIE(TwitchBaseIE):
}] }]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) slug = self._match_id(url)
clip = self._download_gql( clip = self._download_gql(
video_id, [{ slug, [{
'operationName': 'VideoAccessToken_Clip', 'operationName': 'ShareClipRenderStatus',
'variables': { 'variables': {'slug': slug},
'slug': video_id,
},
}], }],
'Downloading clip access token GraphQL')[0]['data']['clip'] 'Downloading clip GraphQL')[0]['data']['clip']
if not clip: if not clip:
raise ExtractorError( raise ExtractorError(
@@ -1135,81 +1162,71 @@ def _real_extract(self, url):
'sig': clip['playbackAccessToken']['signature'], 'sig': clip['playbackAccessToken']['signature'],
'token': clip['playbackAccessToken']['value'], 'token': clip['playbackAccessToken']['value'],
} }
asset_default = traverse_obj(clip, ('assets', 0, {dict})) or {}
data = self._download_base_gql( asset_portrait = traverse_obj(clip, ('assets', 1, {dict})) or {}
video_id, {
'query': '''{
clip(slug: "%s") {
broadcaster {
displayName
}
createdAt
curator {
displayName
id
}
durationSeconds
id
tiny: thumbnailURL(width: 86, height: 45)
small: thumbnailURL(width: 260, height: 147)
medium: thumbnailURL(width: 480, height: 272)
title
videoQualities {
frameRate
quality
sourceURL
}
viewCount
}
}''' % video_id}, 'Downloading clip GraphQL', fatal=False) # noqa: UP031
if data:
clip = try_get(data, lambda x: x['data']['clip'], dict) or clip
formats = [] formats = []
for option in clip.get('videoQualities', []): default_aspect_ratio = float_or_none(asset_default.get('aspectRatio'))
if not isinstance(option, dict): formats.extend(traverse_obj(asset_default, ('videoQualities', lambda _, v: url_or_none(v['sourceURL']), {
continue 'url': ('sourceURL', {update_url_query(query=access_query)}),
source = url_or_none(option.get('sourceURL')) 'format_id': ('quality', {str}),
if not source: 'height': ('quality', {int_or_none}),
continue 'fps': ('frameRate', {float_or_none}),
'aspect_ratio': {value(default_aspect_ratio)},
})))
portrait_aspect_ratio = float_or_none(asset_portrait.get('aspectRatio'))
for source in traverse_obj(asset_portrait, ('videoQualities', lambda _, v: url_or_none(v['sourceURL']))):
formats.append({ formats.append({
'url': update_url_query(source, access_query), 'url': update_url_query(source['sourceURL'], access_query),
'format_id': option.get('quality'), 'format_id': join_nonempty('portrait', source.get('quality')),
'height': int_or_none(option.get('quality')), 'height': int_or_none(source.get('quality')),
'fps': int_or_none(option.get('frameRate')), 'fps': float_or_none(source.get('frameRate')),
'aspect_ratio': portrait_aspect_ratio,
'quality': -2,
}) })
thumbnails = [] thumbnails = []
for thumbnail_id in ('tiny', 'small', 'medium'): thumb_asset_default_url = url_or_none(asset_default.get('thumbnailURL'))
thumbnail_url = clip.get(thumbnail_id) if thumb_asset_default_url:
if not thumbnail_url: thumbnails.append({
continue 'id': 'default',
thumb = { 'url': thumb_asset_default_url,
'id': thumbnail_id, 'preference': 0,
'url': thumbnail_url, })
} if thumb_asset_portrait_url := url_or_none(asset_portrait.get('thumbnailURL')):
mobj = re.search(r'-(\d+)x(\d+)\.', thumbnail_url) thumbnails.append({
if mobj: 'id': 'portrait',
thumb.update({ 'url': thumb_asset_portrait_url,
'height': int(mobj.group(2)), 'preference': -1,
'width': int(mobj.group(1)), })
thumb_default_url = url_or_none(clip.get('thumbnailURL'))
if thumb_default_url and thumb_default_url != thumb_asset_default_url:
thumbnails.append({
'id': 'small',
'url': thumb_default_url,
'preference': -2,
}) })
thumbnails.append(thumb)
old_id = self._search_regex(r'%7C(\d+)(?:-\d+)?.mp4', formats[-1]['url'], 'old id', default=None) old_id = self._search_regex(r'%7C(\d+)(?:-\d+)?.mp4', formats[-1]['url'], 'old id', default=None)
return { return {
'id': clip.get('id') or video_id, 'id': clip.get('id') or slug,
'_old_archive_ids': [make_archive_id(self, old_id)] if old_id else None, '_old_archive_ids': [make_archive_id(self, old_id)] if old_id else None,
'display_id': video_id, 'display_id': slug,
'title': clip.get('title'),
'formats': formats, 'formats': formats,
'duration': int_or_none(clip.get('durationSeconds')),
'view_count': int_or_none(clip.get('viewCount')),
'timestamp': unified_timestamp(clip.get('createdAt')),
'thumbnails': thumbnails, 'thumbnails': thumbnails,
'creator': try_get(clip, lambda x: x['broadcaster']['displayName'], str), **traverse_obj(clip, {
'uploader': try_get(clip, lambda x: x['curator']['displayName'], str), 'title': ('title', {str}),
'uploader_id': try_get(clip, lambda x: x['curator']['id'], str), 'duration': ('durationSeconds', {int_or_none}),
'view_count': ('viewCount', {int_or_none}),
'timestamp': ('createdAt', {parse_iso8601}),
'creators': ('broadcaster', 'displayName', {str}, filter, all),
'channel': ('broadcaster', 'displayName', {str}),
'channel_id': ('broadcaster', 'id', {str}),
'channel_follower_count': ('broadcaster', 'followers', 'totalCount', {int_or_none}),
'channel_is_verified': ('broadcaster', 'isPartner', {bool}),
'uploader': ('curator', 'displayName', {str}),
'uploader_id': ('curator', 'id', {str}),
'categories': ('game', 'displayName', {str}, filter, all, filter),
}),
} }

View File

@@ -1221,20 +1221,10 @@ class TwitterIE(TwitterBaseIE):
}] }]
_MEDIA_ID_RE = re.compile(r'_video/(\d+)/') _MEDIA_ID_RE = re.compile(r'_video/(\d+)/')
_GRAPHQL_ENDPOINT = '2ICDjqPd81tulZcYrtpTuQ/TweetResultByRestId'
@property
def _GRAPHQL_ENDPOINT(self):
if self.is_logged_in:
return 'zZXycP0V6H7m-2r0mOnFcA/TweetDetail'
return '2ICDjqPd81tulZcYrtpTuQ/TweetResultByRestId'
def _graphql_to_legacy(self, data, twid): def _graphql_to_legacy(self, data, twid):
result = traverse_obj(data, ( result = traverse_obj(data, ('tweetResult', 'result', {dict})) or {}
'threaded_conversation_with_injections_v2', 'instructions', 0, 'entries',
lambda _, v: v['entryId'] == f'tweet-{twid}', 'content', 'itemContent',
'tweet_results', 'result', ('tweet', None), {dict},
), default={}, get_all=False) if self.is_logged_in else traverse_obj(
data, ('tweetResult', 'result', {dict}), default={})
typename = result.get('__typename') typename = result.get('__typename')
if typename not in ('Tweet', 'TweetWithVisibilityResults', 'TweetTombstone', 'TweetUnavailable', None): if typename not in ('Tweet', 'TweetWithVisibilityResults', 'TweetTombstone', 'TweetUnavailable', None):
@@ -1278,37 +1268,6 @@ def _graphql_to_legacy(self, data, twid):
def _build_graphql_query(self, media_id): def _build_graphql_query(self, media_id):
return { return {
'variables': {
'focalTweetId': media_id,
'includePromotedContent': True,
'with_rux_injections': False,
'withBirdwatchNotes': True,
'withCommunity': True,
'withDownvotePerspective': False,
'withQuickPromoteEligibilityTweetFields': True,
'withReactionsMetadata': False,
'withReactionsPerspective': False,
'withSuperFollowsTweetFields': True,
'withSuperFollowsUserFields': True,
'withV2Timeline': True,
'withVoice': True,
},
'features': {
'graphql_is_translatable_rweb_tweet_is_translatable_enabled': False,
'interactive_text_enabled': True,
'responsive_web_edit_tweet_api_enabled': True,
'responsive_web_enhance_cards_enabled': True,
'responsive_web_graphql_timeline_navigation_enabled': False,
'responsive_web_text_conversations_enabled': False,
'responsive_web_uc_gql_enabled': True,
'standardized_nudges_misinfo': True,
'tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled': False,
'tweetypie_unmention_optimization_enabled': True,
'unified_cards_ad_metadata_container_dynamic_card_content_query_enabled': True,
'verified_phone_label_enabled': False,
'vibe_api_enabled': True,
},
} if self.is_logged_in else {
'variables': { 'variables': {
'tweetId': media_id, 'tweetId': media_id,
'withCommunity': False, 'withCommunity': False,
@@ -1717,21 +1676,22 @@ class TwitterSpacesIE(TwitterBaseIE):
_VALID_URL = TwitterBaseIE._BASE_REGEX + r'i/spaces/(?P<id>[0-9a-zA-Z]{13})' _VALID_URL = TwitterBaseIE._BASE_REGEX + r'i/spaces/(?P<id>[0-9a-zA-Z]{13})'
_TESTS = [{ _TESTS = [{
'url': 'https://twitter.com/i/spaces/1RDxlgyvNXzJL', 'url': 'https://twitter.com/i/spaces/1OwxWwQOPlNxQ',
'info_dict': { 'info_dict': {
'id': '1RDxlgyvNXzJL', 'id': '1OwxWwQOPlNxQ',
'ext': 'm4a', 'ext': 'm4a',
'title': 'King Carlo e la mossa Kansas City per fare il Grande Centro', 'title': 'Everybody in: @mtbarra & @elonmusk discuss the future of EV charging',
'description': 'Twitter Space participated by annarita digiorgio, Signor Ernesto, Raffaello Colosimo, Simone M. Sepe', 'description': 'Twitter Space participated by Elon Musk',
'uploader': r're:Lucio Di Gaetano.*?',
'uploader_id': 'luciodigaetano',
'live_status': 'was_live', 'live_status': 'was_live',
'timestamp': 1659877956, 'release_date': '20230608',
'upload_date': '20220807', 'release_timestamp': 1686256230,
'release_timestamp': 1659904215, 'thumbnail': r're:https?://pbs\.twimg\.com/profile_images/.+',
'release_date': '20220807', 'timestamp': 1686254250,
'upload_date': '20230608',
'uploader': 'Mary Barra',
'uploader_id': 'mtbarra',
}, },
'skip': 'No longer available', 'params': {'skip_download': 'm3u8'},
}, { }, {
# post_live/TimedOut but downloadable # post_live/TimedOut but downloadable
'url': 'https://twitter.com/i/spaces/1vAxRAVQWONJl', 'url': 'https://twitter.com/i/spaces/1vAxRAVQWONJl',
@@ -1743,9 +1703,10 @@ class TwitterSpacesIE(TwitterBaseIE):
'uploader': 'Google Cloud', 'uploader': 'Google Cloud',
'uploader_id': 'googlecloud', 'uploader_id': 'googlecloud',
'live_status': 'post_live', 'live_status': 'post_live',
'thumbnail': r're:https?://pbs\.twimg\.com/profile_images/.+',
'timestamp': 1681409554, 'timestamp': 1681409554,
'upload_date': '20230413', 'upload_date': '20230413',
'release_timestamp': 1681839000, 'release_timestamp': 1681839082,
'release_date': '20230418', 'release_date': '20230418',
'protocol': 'm3u8', # ffmpeg is forced 'protocol': 'm3u8', # ffmpeg is forced
'container': 'm4a_dash', # audio-only format fixup is applied 'container': 'm4a_dash', # audio-only format fixup is applied
@@ -1762,6 +1723,9 @@ class TwitterSpacesIE(TwitterBaseIE):
'uploader': '息根とめる', 'uploader': '息根とめる',
'uploader_id': 'tomeru_ikinone', 'uploader_id': 'tomeru_ikinone',
'live_status': 'was_live', 'live_status': 'was_live',
'release_date': '20230601',
'release_timestamp': 1685617200,
'thumbnail': r're:https?://pbs\.twimg\.com/profile_images/.+',
'timestamp': 1685617198, 'timestamp': 1685617198,
'upload_date': '20230601', 'upload_date': '20230601',
'protocol': 'm3u8', # ffmpeg is forced 'protocol': 'm3u8', # ffmpeg is forced
@@ -1779,9 +1743,10 @@ class TwitterSpacesIE(TwitterBaseIE):
'uploader': 'Candace Owens', 'uploader': 'Candace Owens',
'uploader_id': 'RealCandaceO', 'uploader_id': 'RealCandaceO',
'live_status': 'was_live', 'live_status': 'was_live',
'thumbnail': r're:https?://pbs\.twimg\.com/profile_images/.+',
'timestamp': 1723931351, 'timestamp': 1723931351,
'upload_date': '20240817', 'upload_date': '20240817',
'release_timestamp': 1723932000, 'release_timestamp': 1723932056,
'release_date': '20240817', 'release_date': '20240817',
'protocol': 'm3u8_native', # not ffmpeg, detected as video space 'protocol': 'm3u8_native', # not ffmpeg, detected as video space
}, },
@@ -1861,18 +1826,21 @@ def _real_extract(self, url):
return { return {
'id': space_id, 'id': space_id,
'title': metadata.get('title'),
'description': f'Twitter Space participated by {participants}', 'description': f'Twitter Space participated by {participants}',
'uploader': traverse_obj(
metadata, ('creator_results', 'result', 'legacy', 'name')),
'uploader_id': traverse_obj(
metadata, ('creator_results', 'result', 'legacy', 'screen_name')),
'live_status': live_status,
'release_timestamp': try_call(
lambda: int_or_none(metadata['scheduled_start'], scale=1000)),
'timestamp': int_or_none(metadata.get('created_at'), scale=1000),
'formats': formats, 'formats': formats,
'http_headers': headers, 'http_headers': headers,
'live_status': live_status,
**traverse_obj(metadata, {
'title': ('title', {str}),
# started_at is None when stream is_upcoming so fallback to scheduled_start for --wait-for-video
'release_timestamp': (('started_at', 'scheduled_start'), {int_or_none(scale=1000)}, any),
'timestamp': ('created_at', {int_or_none(scale=1000)}),
}),
**traverse_obj(metadata, ('creator_results', 'result', 'legacy', {
'uploader': ('name', {str}),
'uploader_id': ('screen_name', {str_or_none}),
'thumbnail': ('profile_image_url_https', {lambda x: x.replace('_normal', '_400x400')}, {url_or_none}),
})),
} }

View File

@@ -51,6 +51,8 @@ class KnownDRMIE(UnsupportedInfoExtractor):
r'(?:watch|front)\.njpwworld\.com', r'(?:watch|front)\.njpwworld\.com',
r'qub\.ca/vrai', r'qub\.ca/vrai',
r'(?:beta\.)?crunchyroll\.com', r'(?:beta\.)?crunchyroll\.com',
r'viki\.com',
r'deezer\.com',
) )
_TESTS = [{ _TESTS = [{
@@ -160,6 +162,12 @@ class KnownDRMIE(UnsupportedInfoExtractor):
}, { }, {
'url': 'https://beta.crunchyroll.com/pt-br/watch/G8WUN8VKP/the-ruler-of-conspiracy', 'url': 'https://beta.crunchyroll.com/pt-br/watch/G8WUN8VKP/the-ruler-of-conspiracy',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.viki.com/videos/1175236v-choosing-spouse-by-lottery-episode-1',
'only_matching': True,
}, {
'url': 'http://www.deezer.com/playlist/176747451',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):

View File

@@ -1,346 +0,0 @@
import hashlib
import hmac
import json
import time
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
parse_age_limit,
parse_iso8601,
try_get,
)
class VikiBaseIE(InfoExtractor):
_VALID_URL_BASE = r'https?://(?:www\.)?viki\.(?:com|net|mx|jp|fr)/'
_API_URL_TEMPLATE = 'https://api.viki.io%s'
_DEVICE_ID = '112395910d'
_APP = '100005a'
_APP_VERSION = '6.11.3'
_APP_SECRET = 'd96704b180208dbb2efa30fe44c48bd8690441af9f567ba8fd710a72badc85198f7472'
_GEO_BYPASS = False
_NETRC_MACHINE = 'viki'
_token = None
_ERRORS = {
'geo': 'Sorry, this content is not available in your region.',
'upcoming': 'Sorry, this content is not yet available.',
'paywall': 'Sorry, this content is only available to Viki Pass Plus subscribers',
}
def _stream_headers(self, timestamp, sig):
return {
'X-Viki-manufacturer': 'vivo',
'X-Viki-device-model': 'vivo 1606',
'X-Viki-device-os-ver': '6.0.1',
'X-Viki-connection-type': 'WIFI',
'X-Viki-carrier': '',
'X-Viki-as-id': '100005a-1625321982-3932',
'timestamp': str(timestamp),
'signature': str(sig),
'x-viki-app-ver': self._APP_VERSION,
}
def _api_query(self, path, version=4, **kwargs):
path += '?' if '?' not in path else '&'
query = f'/v{version}/{path}app={self._APP}'
if self._token:
query += f'&token={self._token}'
return query + ''.join(f'&{name}={val}' for name, val in kwargs.items())
def _sign_query(self, path):
timestamp = int(time.time())
query = self._api_query(path, version=5)
sig = hmac.new(
self._APP_SECRET.encode('ascii'), f'{query}&t={timestamp}'.encode('ascii'), hashlib.sha1).hexdigest()
return timestamp, sig, self._API_URL_TEMPLATE % query
def _call_api(
self, path, video_id, note='Downloading JSON metadata', data=None, query=None, fatal=True):
if query is None:
timestamp, sig, url = self._sign_query(path)
else:
url = self._API_URL_TEMPLATE % self._api_query(path, version=4)
resp = self._download_json(
url, video_id, note, fatal=fatal, query=query,
data=json.dumps(data).encode() if data else None,
headers=({'x-viki-app-ver': self._APP_VERSION} if data
else self._stream_headers(timestamp, sig) if query is None
else None), expected_status=400) or {}
self._raise_error(resp.get('error'), fatal)
return resp
def _raise_error(self, error, fatal=True):
if error is None:
return
msg = f'{self.IE_NAME} said: {error}'
if fatal:
raise ExtractorError(msg, expected=True)
else:
self.report_warning(msg)
def _check_errors(self, data):
for reason, status in (data.get('blocking') or {}).items():
if status and reason in self._ERRORS:
message = self._ERRORS[reason]
if reason == 'geo':
self.raise_geo_restricted(msg=message)
elif reason == 'paywall':
if try_get(data, lambda x: x['paywallable']['tvod']):
self._raise_error('This video is for rent only or TVOD (Transactional Video On demand)')
self.raise_login_required(message)
self._raise_error(message)
def _perform_login(self, username, password):
self._token = self._call_api(
'sessions.json', None, 'Logging in', fatal=False,
data={'username': username, 'password': password}).get('token')
if not self._token:
self.report_warning('Login Failed: Unable to get session token')
@staticmethod
def dict_selection(dict_obj, preferred_key):
if preferred_key in dict_obj:
return dict_obj[preferred_key]
return (list(filter(None, dict_obj.values())) or [None])[0]
class VikiIE(VikiBaseIE):
IE_NAME = 'viki'
_VALID_URL = rf'{VikiBaseIE._VALID_URL_BASE}(?:videos|player)/(?P<id>[0-9]+v)'
_TESTS = [{
'note': 'Free non-DRM video with storyboards in MPD',
'url': 'https://www.viki.com/videos/1175236v-choosing-spouse-by-lottery-episode-1',
'info_dict': {
'id': '1175236v',
'ext': 'mp4',
'title': 'Choosing Spouse by Lottery - Episode 1',
'timestamp': 1606463239,
'age_limit': 13,
'uploader': 'FCC',
'upload_date': '20201127',
},
}, {
'url': 'http://www.viki.com/videos/1023585v-heirs-episode-14',
'info_dict': {
'id': '1023585v',
'ext': 'mp4',
'title': 'Heirs - Episode 14',
'uploader': 'SBS Contents Hub',
'timestamp': 1385047627,
'upload_date': '20131121',
'age_limit': 13,
'duration': 3570,
'episode_number': 14,
},
'skip': 'Blocked in the US',
}, {
# clip
'url': 'http://www.viki.com/videos/1067139v-the-avengers-age-of-ultron-press-conference',
'md5': '86c0b5dbd4d83a6611a79987cc7a1989',
'info_dict': {
'id': '1067139v',
'ext': 'mp4',
'title': "'The Avengers: Age of Ultron' Press Conference",
'description': 'md5:d70b2f9428f5488321bfe1db10d612ea',
'duration': 352,
'timestamp': 1430380829,
'upload_date': '20150430',
'uploader': 'Arirang TV',
'like_count': int,
'age_limit': 0,
},
'skip': 'Sorry. There was an error loading this video',
}, {
'url': 'http://www.viki.com/videos/1048879v-ankhon-dekhi',
'info_dict': {
'id': '1048879v',
'ext': 'mp4',
'title': 'Ankhon Dekhi',
'duration': 6512,
'timestamp': 1408532356,
'upload_date': '20140820',
'uploader': 'Spuul',
'like_count': int,
'age_limit': 13,
},
'skip': 'Blocked in the US',
}, {
# episode
'url': 'http://www.viki.com/videos/44699v-boys-over-flowers-episode-1',
'md5': '0a53dc252e6e690feccd756861495a8c',
'info_dict': {
'id': '44699v',
'ext': 'mp4',
'title': 'Boys Over Flowers - Episode 1',
'description': 'md5:b89cf50038b480b88b5b3c93589a9076',
'duration': 4172,
'timestamp': 1270496524,
'upload_date': '20100405',
'uploader': 'group8',
'like_count': int,
'age_limit': 13,
'episode_number': 1,
},
}, {
# youtube external
'url': 'http://www.viki.com/videos/50562v-poor-nastya-complete-episode-1',
'md5': '63f8600c1da6f01b7640eee7eca4f1da',
'info_dict': {
'id': '50562v',
'ext': 'webm',
'title': 'Poor Nastya [COMPLETE] - Episode 1',
'description': '',
'duration': 606,
'timestamp': 1274949505,
'upload_date': '20101213',
'uploader': 'ad14065n',
'uploader_id': 'ad14065n',
'like_count': int,
'age_limit': 13,
},
'skip': 'Page not found!',
}, {
'url': 'http://www.viki.com/player/44699v',
'only_matching': True,
}, {
# non-English description
'url': 'http://www.viki.com/videos/158036v-love-in-magic',
'md5': '41faaba0de90483fb4848952af7c7d0d',
'info_dict': {
'id': '158036v',
'ext': 'mp4',
'uploader': 'I Planet Entertainment',
'upload_date': '20111122',
'timestamp': 1321985454,
'description': 'md5:44b1e46619df3a072294645c770cef36',
'title': 'Love In Magic',
'age_limit': 13,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
video = self._call_api(f'videos/{video_id}.json', video_id, 'Downloading video JSON', query={})
self._check_errors(video)
title = try_get(video, lambda x: x['titles']['en'], str)
episode_number = int_or_none(video.get('number'))
if not title:
title = f'Episode {episode_number}' if video.get('type') == 'episode' else video.get('id') or video_id
container_titles = try_get(video, lambda x: x['container']['titles'], dict) or {}
container_title = self.dict_selection(container_titles, 'en')
title = f'{container_title} - {title}'
thumbnails = [{
'id': thumbnail_id,
'url': thumbnail['url'],
} for thumbnail_id, thumbnail in (video.get('images') or {}).items() if thumbnail.get('url')]
resp = self._call_api(
f'playback_streams/{video_id}.json?drms=dt3&device_id={self._DEVICE_ID}',
video_id, 'Downloading video streams JSON')['main'][0]
stream_id = try_get(resp, lambda x: x['properties']['track']['stream_id'])
subtitles = dict((lang, [{
'ext': ext,
'url': self._API_URL_TEMPLATE % self._api_query(
f'videos/{video_id}/auth_subtitles/{lang}.{ext}', stream_id=stream_id),
} for ext in ('srt', 'vtt')]) for lang in (video.get('subtitle_completions') or {}))
mpd_url = resp['url']
# 720p is hidden in another MPD which can be found in the current manifest content
mpd_content = self._download_webpage(mpd_url, video_id, note='Downloading initial MPD manifest')
mpd_url = self._search_regex(
r'(?mi)<BaseURL>(http.+.mpd)', mpd_content, 'new manifest', default=mpd_url)
if 'mpdhd_high' not in mpd_url and 'sig=' not in mpd_url:
# Modify the URL to get 1080p
mpd_url = mpd_url.replace('mpdhd', 'mpdhd_high')
formats = self._extract_mpd_formats(mpd_url, video_id)
return {
'id': video_id,
'formats': formats,
'title': title,
'description': self.dict_selection(video.get('descriptions', {}), 'en'),
'duration': int_or_none(video.get('duration')),
'timestamp': parse_iso8601(video.get('created_at')),
'uploader': video.get('author'),
'uploader_url': video.get('author_url'),
'like_count': int_or_none(try_get(video, lambda x: x['likes']['count'])),
'age_limit': parse_age_limit(video.get('rating')),
'thumbnails': thumbnails,
'subtitles': subtitles,
'episode_number': episode_number,
}
class VikiChannelIE(VikiBaseIE):
IE_NAME = 'viki:channel'
_VALID_URL = rf'{VikiBaseIE._VALID_URL_BASE}(?:tv|news|movies|artists)/(?P<id>[0-9]+c)'
_TESTS = [{
'url': 'http://www.viki.com/tv/50c-boys-over-flowers',
'info_dict': {
'id': '50c',
'title': 'Boys Over Flowers',
'description': 'md5:804ce6e7837e1fd527ad2f25420f4d59',
},
'playlist_mincount': 51,
}, {
'url': 'http://www.viki.com/tv/1354c-poor-nastya-complete',
'info_dict': {
'id': '1354c',
'title': 'Poor Nastya [COMPLETE]',
'description': 'md5:05bf5471385aa8b21c18ad450e350525',
},
'playlist_count': 127,
'skip': 'Page not found',
}, {
'url': 'http://www.viki.com/news/24569c-showbiz-korea',
'only_matching': True,
}, {
'url': 'http://www.viki.com/movies/22047c-pride-and-prejudice-2005',
'only_matching': True,
}, {
'url': 'http://www.viki.com/artists/2141c-shinee',
'only_matching': True,
}]
_video_types = ('episodes', 'movies', 'clips', 'trailers')
def _entries(self, channel_id):
params = {
'app': self._APP, 'token': self._token, 'only_ids': 'true',
'direction': 'asc', 'sort': 'number', 'per_page': 30,
}
video_types = self._configuration_arg('video_types') or self._video_types
for video_type in video_types:
if video_type not in self._video_types:
self.report_warning(f'Unknown video_type: {video_type}')
page_num = 0
while True:
page_num += 1
params['page'] = page_num
res = self._call_api(
f'containers/{channel_id}/{video_type}.json', channel_id, query=params, fatal=False,
note=f'Downloading {video_type.title()} JSON page {page_num}')
for video_id in res.get('response') or []:
yield self.url_result(f'https://www.viki.com/videos/{video_id}', VikiIE.ie_key(), video_id)
if not res.get('more'):
break
def _real_extract(self, url):
channel_id = self._match_id(url)
channel = self._call_api(f'containers/{channel_id}.json', channel_id, 'Downloading channel JSON')
self._check_errors(channel)
return self.playlist_result(
self._entries(channel_id), channel_id,
self.dict_selection(channel['titles'], 'en'),
self.dict_selection(channel['descriptions'], 'en'))

View File

@@ -39,6 +39,14 @@ class VimeoBaseInfoExtractor(InfoExtractor):
_NETRC_MACHINE = 'vimeo' _NETRC_MACHINE = 'vimeo'
_LOGIN_REQUIRED = False _LOGIN_REQUIRED = False
_LOGIN_URL = 'https://vimeo.com/log_in' _LOGIN_URL = 'https://vimeo.com/log_in'
_IOS_CLIENT_AUTH = 'MTMxNzViY2Y0NDE0YTQ5YzhjZTc0YmU0NjVjNDQxYzNkYWVjOWRlOTpHKzRvMmgzVUh4UkxjdU5FRW80cDNDbDhDWGR5dVJLNUJZZ055dHBHTTB4V1VzaG41bEx1a2hiN0NWYWNUcldSSW53dzRUdFRYZlJEZmFoTTArOTBUZkJHS3R4V2llYU04Qnl1bERSWWxUdXRidjNqR2J4SHFpVmtFSUcyRktuQw=='
_IOS_CLIENT_HEADERS = {
'Accept': 'application/vnd.vimeo.*+json; version=3.4.10',
'Accept-Language': 'en',
'User-Agent': 'Vimeo/11.10.0 (com.vimeo; build:250424.164813.0; iOS 18.4.1) Alamofire/5.9.0 VimeoNetworking/5.0.0',
}
_IOS_OAUTH_CACHE_KEY = 'oauth-token-ios'
_ios_oauth_token = None
@staticmethod @staticmethod
def _smuggle_referrer(url, referrer_url): def _smuggle_referrer(url, referrer_url):
@@ -88,13 +96,16 @@ def _get_video_password(self):
expected=True) expected=True)
return password return password
def _verify_video_password(self, video_id, password, token): def _verify_video_password(self, video_id):
video_password = self._get_video_password()
token = self._download_json(
'https://vimeo.com/_next/viewer', video_id, 'Downloading viewer info')['xsrft']
url = f'https://vimeo.com/{video_id}' url = f'https://vimeo.com/{video_id}'
try: try:
return self._download_webpage( self._request_webpage(
f'{url}/password', video_id, f'{url}/password', video_id,
'Submitting video password', data=json.dumps({ 'Submitting video password', data=json.dumps({
'password': password, 'password': video_password,
'token': token, 'token': token,
}, separators=(',', ':')).encode(), headers={ }, separators=(',', ':')).encode(), headers={
'Accept': '*/*', 'Accept': '*/*',
@@ -239,20 +250,39 @@ def _parse_config(self, config, video_id):
'_format_sort_fields': ('quality', 'res', 'fps', 'hdr:12', 'source'), '_format_sort_fields': ('quality', 'res', 'fps', 'hdr:12', 'source'),
} }
def _call_videos_api(self, video_id, jwt_token, unlisted_hash=None, **kwargs): def _fetch_oauth_token(self):
if not self._ios_oauth_token:
self._ios_oauth_token = self.cache.load(self._NETRC_MACHINE, self._IOS_OAUTH_CACHE_KEY)
if not self._ios_oauth_token:
self._ios_oauth_token = self._download_json(
'https://api.vimeo.com/oauth/authorize/client', None,
'Fetching OAuth token', 'Failed to fetch OAuth token',
headers={
'Authorization': f'Basic {self._IOS_CLIENT_AUTH}',
**self._IOS_CLIENT_HEADERS,
}, data=urlencode_postdata({
'grant_type': 'client_credentials',
'scope': 'private public create edit delete interact upload purchased stats',
}, quote_via=urllib.parse.quote))['access_token']
self.cache.store(self._NETRC_MACHINE, self._IOS_OAUTH_CACHE_KEY, self._ios_oauth_token)
return self._ios_oauth_token
def _call_videos_api(self, video_id, unlisted_hash=None, **kwargs):
return self._download_json( return self._download_json(
join_nonempty(f'https://api.vimeo.com/videos/{video_id}', unlisted_hash, delim=':'), join_nonempty(f'https://api.vimeo.com/videos/{video_id}', unlisted_hash, delim=':'),
video_id, 'Downloading API JSON', headers={ video_id, 'Downloading API JSON', headers={
'Authorization': f'jwt {jwt_token}', 'Authorization': f'Bearer {self._fetch_oauth_token()}',
'Accept': 'application/json', **self._IOS_CLIENT_HEADERS,
}, query={ }, query={
'fields': ','.join(( 'fields': ','.join((
'config_url', 'created_time', 'description', 'download', 'license', 'config_url', 'embed_player_config_url', 'player_embed_url', 'download', 'play',
'metadata.connections.comments.total', 'metadata.connections.likes.total', 'files', 'description', 'license', 'release_time', 'created_time', 'stats.plays',
'release_time', 'stats.plays')), 'metadata.connections.comments.total', 'metadata.connections.likes.total')),
}, **kwargs) }, **kwargs)
def _extract_original_format(self, url, video_id, unlisted_hash=None, jwt=None, api_data=None): def _extract_original_format(self, url, video_id, unlisted_hash=None, api_data=None):
# Original/source formats are only available when logged in # Original/source formats are only available when logged in
if not self._get_cookies('https://vimeo.com/').get('vimeo'): if not self._get_cookies('https://vimeo.com/').get('vimeo'):
return return
@@ -283,12 +313,8 @@ def _extract_original_format(self, url, video_id, unlisted_hash=None, jwt=None,
'quality': 1, 'quality': 1,
} }
jwt = jwt or traverse_obj(self._download_json(
'https://vimeo.com/_rv/viewer', video_id, 'Downloading jwt token', fatal=False), ('jwt', {str}))
if not jwt:
return
original_response = api_data or self._call_videos_api( original_response = api_data or self._call_videos_api(
video_id, jwt, unlisted_hash, fatal=False, expected_status=(403, 404)) video_id, unlisted_hash, fatal=False, expected_status=(403, 404))
for download_data in traverse_obj(original_response, ('download', ..., {dict})): for download_data in traverse_obj(original_response, ('download', ..., {dict})):
download_url = download_data.get('link') download_url = download_data.get('link')
if not download_url or download_data.get('quality') != 'source': if not download_url or download_data.get('quality') != 'source':
@@ -410,6 +436,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
'duration': 10, 'duration': 10,
'comment_count': int, 'comment_count': int,
'like_count': int, 'like_count': int,
'view_count': int,
'thumbnail': 'https://i.vimeocdn.com/video/440665496-b2c5aee2b61089442c794f64113a8e8f7d5763c3e6b3ebfaf696ae6413f8b1f4-d', 'thumbnail': 'https://i.vimeocdn.com/video/440665496-b2c5aee2b61089442c794f64113a8e8f7d5763c3e6b3ebfaf696ae6413f8b1f4-d',
}, },
'params': { 'params': {
@@ -500,15 +527,16 @@ class VimeoIE(VimeoBaseInfoExtractor):
'uploader': 'The DMCI', 'uploader': 'The DMCI',
'uploader_url': r're:https?://(?:www\.)?vimeo\.com/dmci', 'uploader_url': r're:https?://(?:www\.)?vimeo\.com/dmci',
'uploader_id': 'dmci', 'uploader_id': 'dmci',
'timestamp': 1324343742, 'timestamp': 1324361742,
'upload_date': '20111220', 'upload_date': '20111220',
'description': 'md5:ae23671e82d05415868f7ad1aec21147', 'description': 'md5:f37b4ad0f3ded6fa16f38ecde16c3c44',
'duration': 60, 'duration': 60,
'comment_count': int, 'comment_count': int,
'view_count': int, 'view_count': int,
'thumbnail': 'https://i.vimeocdn.com/video/231174622-dd07f015e9221ff529d451e1cc31c982b5d87bfafa48c4189b1da72824ee289a-d', 'thumbnail': 'https://i.vimeocdn.com/video/231174622-dd07f015e9221ff529d451e1cc31c982b5d87bfafa48c4189b1da72824ee289a-d',
'like_count': int, 'like_count': int,
'tags': 'count:11', 'release_timestamp': 1324361742,
'release_date': '20111220',
}, },
# 'params': {'format': 'Original'}, # 'params': {'format': 'Original'},
'expected_warnings': ['Failed to parse XML: not well-formed'], 'expected_warnings': ['Failed to parse XML: not well-formed'],
@@ -521,15 +549,18 @@ class VimeoIE(VimeoBaseInfoExtractor):
'id': '393756517', 'id': '393756517',
# 'ext': 'mov', # 'ext': 'mov',
'ext': 'mp4', 'ext': 'mp4',
'timestamp': 1582642091, 'timestamp': 1582660091,
'uploader_id': 'frameworkla', 'uploader_id': 'frameworkla',
'title': 'Straight To Hell - Sabrina: Netflix', 'title': 'Straight To Hell - Sabrina: Netflix',
'uploader': 'Framework Studio', 'uploader': 'Framework Studio',
'description': 'md5:f2edc61af3ea7a5592681ddbb683db73',
'upload_date': '20200225', 'upload_date': '20200225',
'duration': 176, 'duration': 176,
'thumbnail': 'https://i.vimeocdn.com/video/859377297-836494a4ef775e9d4edbace83937d9ad34dc846c688c0c419c0e87f7ab06c4b3-d', 'thumbnail': 'https://i.vimeocdn.com/video/859377297-836494a4ef775e9d4edbace83937d9ad34dc846c688c0c419c0e87f7ab06c4b3-d',
'uploader_url': 'https://vimeo.com/frameworkla', 'uploader_url': 'https://vimeo.com/frameworkla',
'comment_count': int,
'like_count': int,
'release_timestamp': 1582660091,
'release_date': '20200225',
}, },
# 'params': {'format': 'source'}, # 'params': {'format': 'source'},
'expected_warnings': ['Failed to parse XML: not well-formed'], 'expected_warnings': ['Failed to parse XML: not well-formed'],
@@ -630,7 +661,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
'description': str, # FIXME: Dynamic SEO spam description 'description': str, # FIXME: Dynamic SEO spam description
'upload_date': '20150209', 'upload_date': '20150209',
'timestamp': 1423518307, 'timestamp': 1423518307,
'thumbnail': 'https://i.vimeocdn.com/video/default', 'thumbnail': r're:https://i\.vimeocdn\.com/video/default',
'duration': 10, 'duration': 10,
'like_count': int, 'like_count': int,
'uploader_url': 'https://vimeo.com/user20132939', 'uploader_url': 'https://vimeo.com/user20132939',
@@ -667,6 +698,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
'like_count': int, 'like_count': int,
'uploader_url': 'https://vimeo.com/aliniamedia', 'uploader_url': 'https://vimeo.com/aliniamedia',
'release_date': '20160329', 'release_date': '20160329',
'view_count': int,
}, },
'params': {'skip_download': True}, 'params': {'skip_download': True},
'expected_warnings': ['Failed to parse XML: not well-formed'], 'expected_warnings': ['Failed to parse XML: not well-formed'],
@@ -678,18 +710,19 @@ class VimeoIE(VimeoBaseInfoExtractor):
# 'ext': 'm4v', # 'ext': 'm4v',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Eastnor Castle 2015 Firework Champions - The Promo!', 'title': 'Eastnor Castle 2015 Firework Champions - The Promo!',
'description': 'md5:5967e090768a831488f6e74b7821b3c1', 'description': 'md5:9441e6829ae94f380cc6417d982f63ac',
'uploader_id': 'fireworkchampions', 'uploader_id': 'fireworkchampions',
'uploader': 'Firework Champions', 'uploader': 'Firework Champions',
'upload_date': '20150910', 'upload_date': '20150910',
'timestamp': 1441901895, 'timestamp': 1441916295,
'thumbnail': 'https://i.vimeocdn.com/video/534715882-6ff8e4660cbf2fea68282876d8d44f318825dfe572cc4016e73b3266eac8ae3a-d', 'thumbnail': 'https://i.vimeocdn.com/video/534715882-6ff8e4660cbf2fea68282876d8d44f318825dfe572cc4016e73b3266eac8ae3a-d',
'uploader_url': 'https://vimeo.com/fireworkchampions', 'uploader_url': 'https://vimeo.com/fireworkchampions',
'tags': 'count:6',
'duration': 229, 'duration': 229,
'view_count': int, 'view_count': int,
'like_count': int, 'like_count': int,
'comment_count': int, 'comment_count': int,
'release_timestamp': 1441916295,
'release_date': '20150910',
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
@@ -820,7 +853,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
'uploader': 'Raja Virdi', 'uploader': 'Raja Virdi',
'uploader_id': 'rajavirdi', 'uploader_id': 'rajavirdi',
'uploader_url': 'https://vimeo.com/rajavirdi', 'uploader_url': 'https://vimeo.com/rajavirdi',
'duration': 309, 'duration': 300,
'thumbnail': r're:https://i\.vimeocdn\.com/video/1716727772-[\da-f]+-d', 'thumbnail': r're:https://i\.vimeocdn\.com/video/1716727772-[\da-f]+-d',
}, },
# 'params': {'format': 'source'}, # 'params': {'format': 'source'},
@@ -860,12 +893,9 @@ def _verify_player_video_password(self, url, video_id, headers):
return checked return checked
def _extract_from_api(self, video_id, unlisted_hash=None): def _extract_from_api(self, video_id, unlisted_hash=None):
viewer = self._download_json(
'https://vimeo.com/_next/viewer', video_id, 'Downloading viewer info')
for retry in (False, True): for retry in (False, True):
try: try:
video = self._call_videos_api(video_id, viewer['jwt'], unlisted_hash) video = self._call_videos_api(video_id, unlisted_hash)
break break
except ExtractorError as e: except ExtractorError as e:
if (not retry and isinstance(e.cause, HTTPError) and e.cause.status == 400 if (not retry and isinstance(e.cause, HTTPError) and e.cause.status == 400
@@ -873,15 +903,14 @@ def _extract_from_api(self, video_id, unlisted_hash=None):
self._webpage_read_content(e.cause.response, e.cause.response.url, video_id, fatal=False), self._webpage_read_content(e.cause.response, e.cause.response.url, video_id, fatal=False),
({json.loads}, 'invalid_parameters', ..., 'field'), ({json.loads}, 'invalid_parameters', ..., 'field'),
)): )):
self._verify_video_password( self._verify_video_password(video_id)
video_id, self._get_video_password(), viewer['xsrft'])
continue continue
raise raise
info = self._parse_config(self._download_json( info = self._parse_config(self._download_json(
video['config_url'], video_id), video_id) video['config_url'], video_id), video_id)
source_format = self._extract_original_format( source_format = self._extract_original_format(
f'https://vimeo.com/{video_id}', video_id, unlisted_hash, jwt=viewer['jwt'], api_data=video) f'https://vimeo.com/{video_id}', video_id, unlisted_hash, api_data=video)
if source_format: if source_format:
info['formats'].append(source_format) info['formats'].append(source_format)
@@ -1122,7 +1151,7 @@ class VimeoOndemandIE(VimeoIE): # XXX: Do not subclass from concrete IE
'description': 'md5:aeeba3dbd4d04b0fa98a4fdc9c639998', 'description': 'md5:aeeba3dbd4d04b0fa98a4fdc9c639998',
'upload_date': '20140906', 'upload_date': '20140906',
'timestamp': 1410032453, 'timestamp': 1410032453,
'thumbnail': 'https://i.vimeocdn.com/video/488238335-d7bf151c364cff8d467f1b73784668fe60aae28a54573a35d53a1210ae283bd8-d_1280', 'thumbnail': r're:https://i\.vimeocdn\.com/video/\d+-[\da-f]+-d',
'comment_count': int, 'comment_count': int,
'license': 'https://creativecommons.org/licenses/by-nc-nd/3.0/', 'license': 'https://creativecommons.org/licenses/by-nc-nd/3.0/',
'duration': 53, 'duration': 53,
@@ -1132,7 +1161,7 @@ class VimeoOndemandIE(VimeoIE): # XXX: Do not subclass from concrete IE
'params': { 'params': {
'format': 'best[protocol=https]', 'format': 'best[protocol=https]',
}, },
'expected_warnings': ['Unable to download JSON metadata'], 'expected_warnings': ['Failed to parse XML: not well-formed'],
}, { }, {
# requires Referer to be passed along with og:video:url # requires Referer to be passed along with og:video:url
'url': 'https://vimeo.com/ondemand/36938/126682985', 'url': 'https://vimeo.com/ondemand/36938/126682985',
@@ -1149,13 +1178,14 @@ class VimeoOndemandIE(VimeoIE): # XXX: Do not subclass from concrete IE
'duration': 121, 'duration': 121,
'comment_count': int, 'comment_count': int,
'view_count': int, 'view_count': int,
'thumbnail': 'https://i.vimeocdn.com/video/517077723-7066ae1d9a79d3eb361334fb5d58ec13c8f04b52f8dd5eadfbd6fb0bcf11f613-d_1280', 'thumbnail': r're:https://i\.vimeocdn\.com/video/\d+-[\da-f]+-d',
'like_count': int, 'like_count': int,
'tags': 'count:5',
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
'expected_warnings': ['Unable to download JSON metadata'], 'expected_warnings': ['Failed to parse XML: not well-formed'],
}, { }, {
'url': 'https://vimeo.com/ondemand/nazmaalik', 'url': 'https://vimeo.com/ondemand/nazmaalik',
'only_matching': True, 'only_matching': True,
@@ -1237,7 +1267,7 @@ class VimeoUserIE(VimeoChannelIE): # XXX: Do not subclass from concrete IE
_TESTS = [{ _TESTS = [{
'url': 'https://vimeo.com/nkistudio/videos', 'url': 'https://vimeo.com/nkistudio/videos',
'info_dict': { 'info_dict': {
'title': 'Nki', 'title': 'AKAMA',
'id': 'nkistudio', 'id': 'nkistudio',
}, },
'playlist_mincount': 66, 'playlist_mincount': 66,
@@ -1370,10 +1400,10 @@ class VimeoReviewIE(VimeoBaseInfoExtractor):
'uploader_id': 'user170863801', 'uploader_id': 'user170863801',
'uploader_url': 'https://vimeo.com/user170863801', 'uploader_url': 'https://vimeo.com/user170863801',
'duration': 30, 'duration': 30,
'thumbnail': 'https://i.vimeocdn.com/video/1912612821-09a43bd2e75c203d503aed89de7534f28fc4474a48f59c51999716931a246af5-d_1280', 'thumbnail': r're:https://i\.vimeocdn\.com/video/\d+-[\da-f]+-d',
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
'expected_warnings': ['Failed to parse XML'], 'expected_warnings': ['Failed to parse XML: not well-formed'],
}, { }, {
'url': 'https://vimeo.com/user21297594/review/75524534/3c257a1b5d', 'url': 'https://vimeo.com/user21297594/review/75524534/3c257a1b5d',
'md5': 'c507a72f780cacc12b2248bb4006d253', 'md5': 'c507a72f780cacc12b2248bb4006d253',
@@ -1423,12 +1453,8 @@ def _real_extract(self, url):
user, video_id, review_hash = self._match_valid_url(url).group('user', 'id', 'hash') user, video_id, review_hash = self._match_valid_url(url).group('user', 'id', 'hash')
data_url = f'https://vimeo.com/{user}/review/data/{video_id}/{review_hash}' data_url = f'https://vimeo.com/{user}/review/data/{video_id}/{review_hash}'
data = self._download_json(data_url, video_id) data = self._download_json(data_url, video_id)
viewer = {}
if data.get('isLocked') is True: if data.get('isLocked') is True:
video_password = self._get_video_password() self._verify_video_password(video_id)
viewer = self._download_json(
'https://vimeo.com/_rv/viewer', video_id)
self._verify_video_password(video_id, video_password, viewer['xsrft'])
data = self._download_json(data_url, video_id) data = self._download_json(data_url, video_id)
clip_data = data['clipData'] clip_data = data['clipData']
config_url = clip_data['configUrl'] config_url = clip_data['configUrl']
@@ -1436,7 +1462,7 @@ def _real_extract(self, url):
info_dict = self._parse_config(config, video_id) info_dict = self._parse_config(config, video_id)
source_format = self._extract_original_format( source_format = self._extract_original_format(
f'https://vimeo.com/{user}/review/{video_id}/{review_hash}/action', f'https://vimeo.com/{user}/review/{video_id}/{review_hash}/action',
video_id, unlisted_hash=clip_data.get('unlistedHash'), jwt=viewer.get('jwt')) video_id, unlisted_hash=clip_data.get('unlistedHash'))
if source_format: if source_format:
info_dict['formats'].append(source_format) info_dict['formats'].append(source_format)
info_dict['description'] = clean_html(clip_data.get('description')) info_dict['description'] = clean_html(clip_data.get('description'))
@@ -1528,20 +1554,22 @@ class VimeoProIE(VimeoBaseInfoExtractor):
'uploader_id': 'openstreetmapus', 'uploader_id': 'openstreetmapus',
'uploader': 'OpenStreetMap US', 'uploader': 'OpenStreetMap US',
'title': 'Andy Allan - Putting the Carto into OpenStreetMap Cartography', 'title': 'Andy Allan - Putting the Carto into OpenStreetMap Cartography',
'description': 'md5:2c362968038d4499f4d79f88458590c1', 'description': 'md5:8cf69a1a435f2d763f4adf601e9c3125',
'duration': 1595, 'duration': 1595,
'upload_date': '20130610', 'upload_date': '20130610',
'timestamp': 1370893156, 'timestamp': 1370907556,
'license': 'by', 'license': 'by',
'thumbnail': 'https://i.vimeocdn.com/video/440260469-19b0d92fca3bd84066623b53f1eb8aaa3980c6c809e2d67b6b39ab7b4a77a344-d_960', 'thumbnail': r're:https://i\.vimeocdn\.com/video/\d+-[\da-f]+-d',
'view_count': int, 'view_count': int,
'comment_count': int, 'comment_count': int,
'like_count': int, 'like_count': int,
'tags': 'count:1', 'release_timestamp': 1370907556,
'release_date': '20130610',
}, },
'params': { 'params': {
'format': 'best[protocol=https]', 'format': 'best[protocol=https]',
}, },
'expected_warnings': ['Failed to parse XML: not well-formed'],
}, { }, {
# password-protected VimeoPro page with Vimeo player embed # password-protected VimeoPro page with Vimeo player embed
'url': 'https://vimeopro.com/cadfem/simulation-conference-mechanische-systeme-in-perfektion', 'url': 'https://vimeopro.com/cadfem/simulation-conference-mechanische-systeme-in-perfektion',
@@ -1549,7 +1577,7 @@ class VimeoProIE(VimeoBaseInfoExtractor):
'id': '764543723', 'id': '764543723',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Mechanische Systeme in Perfektion: Realität erfassen, Innovation treiben', 'title': 'Mechanische Systeme in Perfektion: Realität erfassen, Innovation treiben',
'thumbnail': 'https://i.vimeocdn.com/video/1543784598-a1a750494a485e601110136b9fe11e28c2131942452b3a5d30391cb3800ca8fd-d_1280', 'thumbnail': r're:https://i\.vimeocdn\.com/video/\d+-[\da-f]+-d',
'description': 'md5:2a9d195cd1b0f6f79827107dc88c2420', 'description': 'md5:2a9d195cd1b0f6f79827107dc88c2420',
'uploader': 'CADFEM', 'uploader': 'CADFEM',
'uploader_id': 'cadfem', 'uploader_id': 'cadfem',
@@ -1561,6 +1589,7 @@ class VimeoProIE(VimeoBaseInfoExtractor):
'videopassword': 'Conference2022', 'videopassword': 'Conference2022',
'skip_download': True, 'skip_download': True,
}, },
'expected_warnings': ['Failed to parse XML: not well-formed'],
}] }]
def _real_extract(self, url): def _real_extract(self, url):

View File

@@ -300,6 +300,24 @@ class VKIE(VKBaseIE):
'upload_date': '20250130', 'upload_date': '20250130',
}, },
}, },
{
'url': 'https://vkvideo.ru/video-50883936_456244102',
'info_dict': {
'id': '-50883936_456244102',
'ext': 'mp4',
'title': 'Добивание Украины // Техник в коме // МОЯ ЗЛОСТЬ №140',
'description': 'md5:a9bc46181e9ebd0fdd82cef6c0191140',
'uploader': 'Стас Ай, Как Просто!',
'uploader_id': '-50883936',
'comment_count': int,
'like_count': int,
'duration': 4651,
'thumbnail': r're:https?://.+\.jpg',
'chapters': 'count:59',
'timestamp': 1743333869,
'upload_date': '20250330',
},
},
{ {
# live stream, hls and rtmp links, most likely already finished live # live stream, hls and rtmp links, most likely already finished live
# stream by the time you are reading this comment # stream by the time you are reading this comment
@@ -540,11 +558,11 @@ def _real_extract(self, url):
'title': ('md_title', {unescapeHTML}), 'title': ('md_title', {unescapeHTML}),
'description': ('description', {clean_html}, filter), 'description': ('description', {clean_html}, filter),
'thumbnail': ('jpg', {url_or_none}), 'thumbnail': ('jpg', {url_or_none}),
'uploader': ('md_author', {str}), 'uploader': ('md_author', {unescapeHTML}),
'uploader_id': (('author_id', 'authorId'), {str_or_none}, any), 'uploader_id': (('author_id', 'authorId'), {str_or_none}, any),
'duration': ('duration', {int_or_none}), 'duration': ('duration', {int_or_none}),
'chapters': ('time_codes', lambda _, v: isinstance(v['time'], int), { 'chapters': ('time_codes', lambda _, v: isinstance(v['time'], int), {
'title': ('text', {str}), 'title': ('text', {unescapeHTML}),
'start_time': 'time', 'start_time': 'time',
}), }),
}), }),

View File

@@ -0,0 +1,185 @@
import itertools
from .common import InfoExtractor
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
clean_html,
extract_attributes,
parse_duration,
parse_qs,
)
from ..utils.traversal import (
find_element,
find_elements,
traverse_obj,
)
class VrSquareIE(InfoExtractor):
IE_NAME = 'vrsquare'
IE_DESC = 'VR SQUARE'
_BASE_URL = 'https://livr.jp'
_VALID_URL = r'https?://livr\.jp/contents/(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://livr.jp/contents/P470896661',
'info_dict': {
'id': 'P470896661',
'ext': 'mp4',
'title': 'そこ曲がったら、櫻坂? 7年間お疲れ様!菅井友香の卒業を祝う会!前半 2022年11月6日放送分',
'description': 'md5:523726dc835aa8014dfe1e2b38d36cd1',
'duration': 1515.0,
'tags': 'count:2',
'thumbnail': r're:https?://media\.livr\.jp/vod/img/.+\.jpg',
},
}, {
'url': 'https://livr.jp/contents/P589523973',
'info_dict': {
'id': 'P589523973',
'ext': 'mp4',
'title': '薄闇に仰ぐ しだれ桜の妖艶',
'description': 'md5:a042f517b2cbb4ed6746707afec4d306',
'duration': 1084.0,
'tags': list,
'thumbnail': r're:https?://media\.livr\.jp/vod/img/.+\.jpg',
},
'skip': 'Paid video',
}, {
'url': 'https://livr.jp/contents/P316939908',
'info_dict': {
'id': 'P316939908',
'ext': 'mp4',
'title': '2024年5月16日 「今日は誰に恋をする?」公演 小栗有以 生誕祭',
'description': 'md5:2110bdcf947f28bd7d06ec420e51b619',
'duration': 8559.0,
'tags': list,
'thumbnail': r're:https?://media\.livr\.jp/vod/img/.+\.jpg',
},
'skip': 'Premium channel subscribers only',
}, {
# Accessible only in the VR SQUARE app
'url': 'https://livr.jp/contents/P126481458',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
status = self._download_json(
f'{self._BASE_URL}/webApi/contentsStatus/{video_id}',
video_id, 'Checking contents status', fatal=False)
if traverse_obj(status, 'result_code') == '40407':
self.raise_login_required('Unable to access this video')
try:
web_api = self._download_json(
f'{self._BASE_URL}/webApi/play/url/{video_id}', video_id)
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 500:
raise ExtractorError('VR SQUARE app-only videos are not supported', expected=True)
raise
return {
'id': video_id,
'title': self._html_search_meta(['og:title', 'twitter:title'], webpage),
'description': self._html_search_meta('description', webpage),
'formats': self._extract_m3u8_formats(traverse_obj(web_api, (
'urls', ..., 'url', any)), video_id, 'mp4', fatal=False),
'thumbnail': self._html_search_meta('og:image', webpage),
**traverse_obj(webpage, {
'duration': ({find_element(cls='layout-product-data-time')}, {parse_duration}),
'tags': ({find_elements(cls='search-tag')}, ..., {clean_html}),
}),
}
class VrSquarePlaylistBaseIE(InfoExtractor):
_BASE_URL = 'https://livr.jp'
def _fetch_vids(self, source, keys=()):
for url_path in traverse_obj(source, (
*keys, {find_elements(cls='video', html=True)}, ...,
{extract_attributes}, 'data-url', {str}, filter),
):
yield self.url_result(
f'{self._BASE_URL}/contents/{url_path.removeprefix("/contents/")}', VrSquareIE)
def _entries(self, path, display_id, query=None):
for page in itertools.count(1):
ajax = self._download_json(
f'{self._BASE_URL}{path}', display_id,
f'Downloading playlist JSON page {page}',
query={'p': page, **(query or {})})
yield from self._fetch_vids(ajax, ('contents_render_list', ...))
if not traverse_obj(ajax, (('has_next', 'hasNext'), {bool}, any)):
break
class VrSquareChannelIE(VrSquarePlaylistBaseIE):
IE_NAME = 'vrsquare:channel'
_VALID_URL = r'https?://livr\.jp/channel/(?P<id>\w+)'
_TESTS = [{
'url': 'https://livr.jp/channel/H372648599',
'info_dict': {
'id': 'H372648599',
'title': 'AKB48チャンネル',
},
'playlist_mincount': 502,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
return self.playlist_result(
self._entries(f'/ajax/channel/{playlist_id}', playlist_id),
playlist_id, self._html_search_meta('og:title', webpage))
class VrSquareSearchIE(VrSquarePlaylistBaseIE):
IE_NAME = 'vrsquare:search'
_VALID_URL = r'https?://livr\.jp/web-search/?\?(?:[^#]+&)?w=[^#]+'
_TESTS = [{
'url': 'https://livr.jp/web-search?w=%23%E5%B0%8F%E6%A0%97%E6%9C%89%E4%BB%A5',
'info_dict': {
'id': '#小栗有以',
},
'playlist_mincount': 60,
}]
def _real_extract(self, url):
search_query = parse_qs(url)['w'][0]
return self.playlist_result(
self._entries('/ajax/web-search', search_query, {'w': search_query}), search_query)
class VrSquareSectionIE(VrSquarePlaylistBaseIE):
IE_NAME = 'vrsquare:section'
_VALID_URL = r'https?://livr\.jp/(?:category|headline)/(?P<id>\w+)'
_TESTS = [{
'url': 'https://livr.jp/category/C133936275',
'info_dict': {
'id': 'C133936275',
'title': 'そこ曲がったら、櫻坂VR',
},
'playlist_mincount': 308,
}, {
'url': 'https://livr.jp/headline/A296449604',
'info_dict': {
'id': 'A296449604',
'title': 'AKB48 アフターVR',
},
'playlist_mincount': 22,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
return self.playlist_result(
self._fetch_vids(webpage), playlist_id, self._html_search_meta('og:title', webpage))

View File

@@ -11,7 +11,7 @@
) )
class WykopBaseExtractor(InfoExtractor): class WykopBaseIE(InfoExtractor):
def _get_token(self, force_refresh=False): def _get_token(self, force_refresh=False):
if not force_refresh: if not force_refresh:
maybe_cached = self.cache.load('wykop', 'bearer') maybe_cached = self.cache.load('wykop', 'bearer')
@@ -72,7 +72,7 @@ def _common_data_extract(self, data):
} }
class WykopDigIE(WykopBaseExtractor): class WykopDigIE(WykopBaseIE):
IE_NAME = 'wykop:dig' IE_NAME = 'wykop:dig'
_VALID_URL = r'https?://(?:www\.)?wykop\.pl/link/(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?wykop\.pl/link/(?P<id>\d+)'
@@ -128,7 +128,7 @@ def _real_extract(self, url):
} }
class WykopDigCommentIE(WykopBaseExtractor): class WykopDigCommentIE(WykopBaseIE):
IE_NAME = 'wykop:dig:comment' IE_NAME = 'wykop:dig:comment'
_VALID_URL = r'https?://(?:www\.)?wykop\.pl/link/(?P<dig_id>\d+)/[^/]+/komentarz/(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?wykop\.pl/link/(?P<dig_id>\d+)/[^/]+/komentarz/(?P<id>\d+)'
@@ -177,7 +177,7 @@ def _real_extract(self, url):
} }
class WykopPostIE(WykopBaseExtractor): class WykopPostIE(WykopBaseIE):
IE_NAME = 'wykop:post' IE_NAME = 'wykop:post'
_VALID_URL = r'https?://(?:www\.)?wykop\.pl/wpis/(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?wykop\.pl/wpis/(?P<id>\d+)'
@@ -228,7 +228,7 @@ def _real_extract(self, url):
} }
class WykopPostCommentIE(WykopBaseExtractor): class WykopPostCommentIE(WykopBaseIE):
IE_NAME = 'wykop:post:comment' IE_NAME = 'wykop:post:comment'
_VALID_URL = r'https?://(?:www\.)?wykop\.pl/wpis/(?P<post_id>\d+)/[^/#]+#(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?wykop\.pl/wpis/(?P<post_id>\d+)/[^/#]+#(?P<id>\d+)'

View File

@@ -2,15 +2,17 @@
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
bug_reports_message,
determine_ext, determine_ext,
extract_attributes,
int_or_none, int_or_none,
lowercase_escape, lowercase_escape,
parse_qs, parse_qs,
traverse_obj, qualities,
try_get, try_get,
update_url_query,
url_or_none, url_or_none,
) )
from ..utils.traversal import traverse_obj
class YandexVideoIE(InfoExtractor): class YandexVideoIE(InfoExtractor):
@@ -186,7 +188,22 @@ def _real_extract(self, url):
return self.url_result(data_json['video']['url']) return self.url_result(data_json['video']['url'])
class ZenYandexIE(InfoExtractor): class ZenYandexBaseIE(InfoExtractor):
def _fetch_ssr_data(self, url, video_id):
webpage = self._download_webpage(url, video_id)
redirect = self._search_json(
r'(?:var|let|const)\s+it\s*=', webpage, 'redirect', video_id, default={}).get('retpath')
if redirect:
video_id = self._match_id(redirect)
webpage = self._download_webpage(redirect, video_id, note='Redirecting')
return video_id, self._search_json(
r'(?:var|let|const)\s+_params\s*=\s*\(', webpage, 'metadata', video_id,
contains_pattern=r'{["\']ssrData.+}')['ssrData']
class ZenYandexIE(ZenYandexBaseIE):
IE_NAME = 'dzen.ru'
IE_DESC = 'Дзен (dzen) formerly Яндекс.Дзен (Yandex Zen)'
_VALID_URL = r'https?://(zen\.yandex|dzen)\.ru(?:/video)?/(media|watch)/(?:(?:id/[^/]+/|[^/]+/)(?:[a-z0-9-]+)-)?(?P<id>[a-z0-9-]+)' _VALID_URL = r'https?://(zen\.yandex|dzen)\.ru(?:/video)?/(media|watch)/(?:(?:id/[^/]+/|[^/]+/)(?:[a-z0-9-]+)-)?(?P<id>[a-z0-9-]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://zen.yandex.ru/media/id/606fd806cc13cb3c58c05cf5/vot-eto-focus-dedy-morozy-na-gidrociklah-60c7c443da18892ebfe85ed7', 'url': 'https://zen.yandex.ru/media/id/606fd806cc13cb3c58c05cf5/vot-eto-focus-dedy-morozy-na-gidrociklah-60c7c443da18892ebfe85ed7',
@@ -216,6 +233,7 @@ class ZenYandexIE(InfoExtractor):
'timestamp': 1573465585, 'timestamp': 1573465585,
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
'skip': 'The page does not exist',
}, { }, {
'url': 'https://zen.yandex.ru/video/watch/6002240ff8b1af50bb2da5e3', 'url': 'https://zen.yandex.ru/video/watch/6002240ff8b1af50bb2da5e3',
'info_dict': { 'info_dict': {
@@ -227,6 +245,9 @@ class ZenYandexIE(InfoExtractor):
'uploader': 'TechInsider', 'uploader': 'TechInsider',
'timestamp': 1611378221, 'timestamp': 1611378221,
'upload_date': '20210123', 'upload_date': '20210123',
'view_count': int,
'duration': 243,
'tags': ['опыт', 'эксперимент', 'огонь'],
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
}, { }, {
@@ -240,6 +261,9 @@ class ZenYandexIE(InfoExtractor):
'uploader': 'TechInsider', 'uploader': 'TechInsider',
'upload_date': '20210123', 'upload_date': '20210123',
'timestamp': 1611378221, 'timestamp': 1611378221,
'view_count': int,
'duration': 243,
'tags': ['опыт', 'эксперимент', 'огонь'],
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
}, { }, {
@@ -252,44 +276,56 @@ class ZenYandexIE(InfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id) video_id, ssr_data = self._fetch_ssr_data(url, video_id)
redirect = self._search_json(r'var it\s*=', webpage, 'redirect', id, default={}).get('retpath') video_data = ssr_data['videoMetaResponse']
if redirect:
video_id = self._match_id(redirect)
webpage = self._download_webpage(redirect, video_id, note='Redirecting')
data_json = self._search_json(
r'("data"\s*:|data\s*=)', webpage, 'metadata', video_id, contains_pattern=r'{["\']_*serverState_*video.+}')
serverstate = self._search_regex(r'(_+serverState_+video-site_[^_]+_+)', webpage, 'server state')
uploader = self._search_regex(r'(<a\s*class=["\']card-channel-link[^"\']+["\'][^>]+>)',
webpage, 'uploader', default='<a>')
uploader_name = extract_attributes(uploader).get('aria-label')
item_id = traverse_obj(data_json, (serverstate, 'videoViewer', 'openedItemId', {str}))
video_json = traverse_obj(data_json, (serverstate, 'videoViewer', 'items', item_id, {dict})) or {}
formats, subtitles = [], {} formats, subtitles = [], {}
for s_url in traverse_obj(video_json, ('video', 'streams', ..., {url_or_none})): quality = qualities(('4', '0', '1', '2', '3', '5', '6', '7'))
# Deduplicate stream URLs. The "dzen_dash" query parameter is present in some URLs but can be omitted
stream_urls = set(traverse_obj(video_data, (
'video', ('id', ('streams', ...), ('mp4Streams', ..., 'url'), ('oneVideoStreams', ..., 'url')),
{url_or_none}, {update_url_query(query={'dzen_dash': []})})))
for s_url in stream_urls:
ext = determine_ext(s_url) ext = determine_ext(s_url)
if ext == 'mpd': content_type = traverse_obj(parse_qs(s_url), ('ct', 0))
fmts, subs = self._extract_mpd_formats_and_subtitles(s_url, video_id, mpd_id='dash') if ext == 'mpd' or content_type == '6':
elif ext == 'm3u8': fmts, subs = self._extract_mpd_formats_and_subtitles(s_url, video_id, mpd_id='dash', fatal=False)
fmts, subs = self._extract_m3u8_formats_and_subtitles(s_url, video_id, 'mp4') elif ext == 'm3u8' or content_type == '8':
fmts, subs = self._extract_m3u8_formats_and_subtitles(s_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
elif content_type == '0':
format_type = traverse_obj(parse_qs(s_url), ('type', 0))
formats.append({
'url': s_url,
'format_id': format_type,
'ext': 'mp4',
'quality': quality(format_type),
})
continue
else:
self.report_warning(f'Unsupported stream URL: {s_url}{bug_reports_message()}')
continue
formats.extend(fmts) formats.extend(fmts)
subtitles = self._merge_subtitles(subtitles, subs) self._merge_subtitles(subs, target=subtitles)
return { return {
'id': video_id, 'id': video_id,
'title': video_json.get('title') or self._og_search_title(webpage),
'formats': formats, 'formats': formats,
'subtitles': subtitles, 'subtitles': subtitles,
'duration': int_or_none(video_json.get('duration')), **traverse_obj(video_data, {
'view_count': int_or_none(video_json.get('views')), 'title': ('title', {str}),
'timestamp': int_or_none(video_json.get('publicationDate')), 'description': ('description', {str}),
'uploader': uploader_name or data_json.get('authorName') or try_get(data_json, lambda x: x['publisher']['name']), 'thumbnail': ('image', {url_or_none}),
'description': video_json.get('description') or self._og_search_description(webpage), 'duration': ('video', 'duration', {int_or_none}),
'thumbnail': self._og_search_thumbnail(webpage) or try_get(data_json, lambda x: x['og']['imageUrl']), 'view_count': ('video', 'views', {int_or_none}),
'timestamp': ('publicationDate', {int_or_none}),
'tags': ('tags', ..., {str}),
'uploader': ('source', 'title', {str}),
}),
} }
class ZenYandexChannelIE(InfoExtractor): class ZenYandexChannelIE(ZenYandexBaseIE):
IE_NAME = 'dzen.ru:channel'
_VALID_URL = r'https?://(zen\.yandex|dzen)\.ru/(?!media|video)(?:id/)?(?P<id>[a-z0-9-_]+)' _VALID_URL = r'https?://(zen\.yandex|dzen)\.ru/(?!media|video)(?:id/)?(?P<id>[a-z0-9-_]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://zen.yandex.ru/tok_media', 'url': 'https://zen.yandex.ru/tok_media',
@@ -323,8 +359,8 @@ class ZenYandexChannelIE(InfoExtractor):
'url': 'https://zen.yandex.ru/jony_me', 'url': 'https://zen.yandex.ru/jony_me',
'info_dict': { 'info_dict': {
'id': 'jony_me', 'id': 'jony_me',
'description': 'md5:ce0a5cad2752ab58701b5497835b2cc5', 'description': 'md5:7c30d11dc005faba8826feae99da3113',
'title': 'JONY ', 'title': 'JONY',
}, },
'playlist_count': 18, 'playlist_count': 18,
}, { }, {
@@ -333,9 +369,8 @@ class ZenYandexChannelIE(InfoExtractor):
'url': 'https://zen.yandex.ru/tatyanareva', 'url': 'https://zen.yandex.ru/tatyanareva',
'info_dict': { 'info_dict': {
'id': 'tatyanareva', 'id': 'tatyanareva',
'description': 'md5:40a1e51f174369ec3ba9d657734ac31f', 'description': 'md5:92e56fa730a932ca2483ba5c2186ad96',
'title': 'Татьяна Рева', 'title': 'Татьяна Рева',
'entries': 'maxcount:200',
}, },
'playlist_mincount': 46, 'playlist_mincount': 46,
}, { }, {
@@ -348,43 +383,31 @@ class ZenYandexChannelIE(InfoExtractor):
'playlist_mincount': 657, 'playlist_mincount': 657,
}] }]
def _entries(self, item_id, server_state_json, server_settings_json): def _entries(self, feed_data, channel_id):
items = (traverse_obj(server_state_json, ('feed', 'items', ...))
or traverse_obj(server_settings_json, ('exportData', 'items', ...)))
more = (traverse_obj(server_state_json, ('links', 'more'))
or traverse_obj(server_settings_json, ('exportData', 'more', 'link')))
next_page_id = None next_page_id = None
for page in itertools.count(1): for page in itertools.count(1):
for item in items or []: for item in traverse_obj(feed_data, (
if item.get('type') != 'gif': (None, ('items', lambda _, v: v['tab'] in ('shorts', 'longs'))),
continue 'items', lambda _, v: url_or_none(v['link']),
video_id = traverse_obj(item, 'publication_id', 'publicationId') or '' )):
yield self.url_result(item['link'], ZenYandexIE, video_id.split(':')[-1]) yield self.url_result(item['link'], ZenYandexIE, item.get('id'), title=item.get('title'))
more = traverse_obj(feed_data, ('more', 'link', {url_or_none}))
current_page_id = next_page_id current_page_id = next_page_id
next_page_id = traverse_obj(parse_qs(more), ('next_page_id', -1)) next_page_id = traverse_obj(parse_qs(more), ('next_page_id', -1))
if not all((more, items, next_page_id, next_page_id != current_page_id)): if not all((more, next_page_id, next_page_id != current_page_id)):
break break
data = self._download_json(more, item_id, note=f'Downloading Page {page}') feed_data = self._download_json(more, channel_id, note=f'Downloading Page {page}')
items, more = data.get('items'), traverse_obj(data, ('more', 'link'))
def _real_extract(self, url): def _real_extract(self, url):
item_id = self._match_id(url) channel_id = self._match_id(url)
webpage = self._download_webpage(url, item_id) channel_id, ssr_data = self._fetch_ssr_data(url, channel_id)
redirect = self._search_json( channel_data = ssr_data['exportResponse']
r'var it\s*=', webpage, 'redirect', item_id, default={}).get('retpath')
if redirect:
item_id = self._match_id(redirect)
webpage = self._download_webpage(redirect, item_id, note='Redirecting')
data = self._search_json(
r'("data"\s*:|data\s*=)', webpage, 'channel data', item_id, contains_pattern=r'{\"__serverState__.+}')
server_state_json = traverse_obj(data, lambda k, _: k.startswith('__serverState__'), get_all=False)
server_settings_json = traverse_obj(data, lambda k, _: k.startswith('__serverSettings__'), get_all=False)
return self.playlist_result( return self.playlist_result(
self._entries(item_id, server_state_json, server_settings_json), self._entries(channel_data['feedData'], channel_id),
item_id, traverse_obj(server_state_json, ('channel', 'source', 'title')), channel_id, **traverse_obj(channel_data, ('channel', 'source', {
traverse_obj(server_state_json, ('channel', 'source', 'description'))) 'title': ('title', {str}),
'description': ('description', {str}),
})))

View File

@@ -227,7 +227,7 @@ def extract_tag_box(regex, title):
return result return result
class YouPornListBase(InfoExtractor): class YouPornListBaseIE(InfoExtractor):
def _get_next_url(self, url, pl_id, html): def _get_next_url(self, url, pl_id, html):
return urljoin(url, self._search_regex( return urljoin(url, self._search_regex(
r'''<a [^>]*?\bhref\s*=\s*("|')(?P<url>(?:(?!\1)[^>])+)\1''', r'''<a [^>]*?\bhref\s*=\s*("|')(?P<url>(?:(?!\1)[^>])+)\1''',
@@ -284,7 +284,7 @@ def _real_extract(self, url, html=None):
playlist_id=pl_id, playlist_title=title) playlist_id=pl_id, playlist_title=title)
class YouPornCategoryIE(YouPornListBase): class YouPornCategoryIE(YouPornListBaseIE):
IE_DESC = 'YouPorn category, with sorting, filtering and pagination' IE_DESC = 'YouPorn category, with sorting, filtering and pagination'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?://(?:www\.)?youporn\.com/ https?://(?:www\.)?youporn\.com/
@@ -319,7 +319,7 @@ class YouPornCategoryIE(YouPornListBase):
}] }]
class YouPornChannelIE(YouPornListBase): class YouPornChannelIE(YouPornListBaseIE):
IE_DESC = 'YouPorn channel, with sorting and pagination' IE_DESC = 'YouPorn channel, with sorting and pagination'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?://(?:www\.)?youporn\.com/ https?://(?:www\.)?youporn\.com/
@@ -349,7 +349,7 @@ def _get_title_from_slug(title_slug):
return re.sub(r'_', ' ', title_slug).title() return re.sub(r'_', ' ', title_slug).title()
class YouPornCollectionIE(YouPornListBase): class YouPornCollectionIE(YouPornListBaseIE):
IE_DESC = 'YouPorn collection (user playlist), with sorting and pagination' IE_DESC = 'YouPorn collection (user playlist), with sorting and pagination'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?://(?:www\.)?youporn\.com/ https?://(?:www\.)?youporn\.com/
@@ -394,7 +394,7 @@ def _real_extract(self, url):
return playlist return playlist
class YouPornTagIE(YouPornListBase): class YouPornTagIE(YouPornListBaseIE):
IE_DESC = 'YouPorn tag (porntags), with sorting, filtering and pagination' IE_DESC = 'YouPorn tag (porntags), with sorting, filtering and pagination'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?://(?:www\.)?youporn\.com/ https?://(?:www\.)?youporn\.com/
@@ -442,7 +442,7 @@ def _real_extract(self, url):
return super()._real_extract(url) return super()._real_extract(url)
class YouPornStarIE(YouPornListBase): class YouPornStarIE(YouPornListBaseIE):
IE_DESC = 'YouPorn Pornstar, with description, sorting and pagination' IE_DESC = 'YouPorn Pornstar, with description, sorting and pagination'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?://(?:www\.)?youporn\.com/ https?://(?:www\.)?youporn\.com/
@@ -493,7 +493,7 @@ def _real_extract(self, url):
} }
class YouPornVideosIE(YouPornListBase): class YouPornVideosIE(YouPornListBaseIE):
IE_DESC = 'YouPorn video (browse) playlists, with sorting, filtering and pagination' IE_DESC = 'YouPorn video (browse) playlists, with sorting, filtering and pagination'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?://(?:www\.)?youporn\.com/ https?://(?:www\.)?youporn\.com/

View File

@@ -417,6 +417,8 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
_NETRC_MACHINE = 'youtube' _NETRC_MACHINE = 'youtube'
_COOKIE_HOWTO_WIKI_URL = 'https://github.com/yt-dlp/yt-dlp/wiki/Extractors#exporting-youtube-cookies'
def ucid_or_none(self, ucid): def ucid_or_none(self, ucid):
return self._search_regex(rf'^({self._YT_CHANNEL_UCID_RE})$', ucid, 'UC-id', default=None) return self._search_regex(rf'^({self._YT_CHANNEL_UCID_RE})$', ucid, 'UC-id', default=None)
@@ -451,17 +453,15 @@ def _preferred_lang(self):
return preferred_lang return preferred_lang
def _initialize_consent(self): def _initialize_consent(self):
cookies = self._get_cookies('https://www.youtube.com/') if self._has_auth_cookies:
if cookies.get('__Secure-3PSID'):
return return
socs = cookies.get('SOCS') socs = self._youtube_cookies.get('SOCS')
if socs and not socs.value.startswith('CAA'): # not consented if socs and not socs.value.startswith('CAA'): # not consented
return return
self._set_cookie('.youtube.com', 'SOCS', 'CAI', secure=True) # accept all (required for mixes) self._set_cookie('.youtube.com', 'SOCS', 'CAI', secure=True) # accept all (required for mixes)
def _initialize_pref(self): def _initialize_pref(self):
cookies = self._get_cookies('https://www.youtube.com/') pref_cookie = self._youtube_cookies.get('PREF')
pref_cookie = cookies.get('PREF')
pref = {} pref = {}
if pref_cookie: if pref_cookie:
try: try:
@@ -472,8 +472,9 @@ def _initialize_pref(self):
self._set_cookie('.youtube.com', name='PREF', value=urllib.parse.urlencode(pref)) self._set_cookie('.youtube.com', name='PREF', value=urllib.parse.urlencode(pref))
def _initialize_cookie_auth(self): def _initialize_cookie_auth(self):
yt_sapisid, yt_1psapisid, yt_3psapisid = self._get_sid_cookies() self._passed_auth_cookies = False
if yt_sapisid or yt_1psapisid or yt_3psapisid: if self._has_auth_cookies:
self._passed_auth_cookies = True
self.write_debug('Found YouTube account cookies') self.write_debug('Found YouTube account cookies')
def _real_initialize(self): def _real_initialize(self):
@@ -492,8 +493,7 @@ def _perform_login(self, username, password):
@property @property
def _youtube_login_hint(self): def _youtube_login_hint(self):
return (f'{self._login_hint(method="cookies")}. Also see ' return (f'{self._login_hint(method="cookies")}. Also see {self._COOKIE_HOWTO_WIKI_URL} '
'https://github.com/yt-dlp/yt-dlp/wiki/Extractors#exporting-youtube-cookies '
'for tips on effectively exporting YouTube cookies') 'for tips on effectively exporting YouTube cookies')
def _check_login_required(self): def _check_login_required(self):
@@ -553,12 +553,16 @@ def _make_sid_authorization(scheme, sid, origin, additional_parts):
return f'{scheme} {"_".join(parts)}' return f'{scheme} {"_".join(parts)}'
@property
def _youtube_cookies(self):
return self._get_cookies('https://www.youtube.com')
def _get_sid_cookies(self): def _get_sid_cookies(self):
""" """
Get SAPISID, 1PSAPISID, 3PSAPISID cookie values Get SAPISID, 1PSAPISID, 3PSAPISID cookie values
@returns sapisid, 1psapisid, 3psapisid @returns sapisid, 1psapisid, 3psapisid
""" """
yt_cookies = self._get_cookies('https://www.youtube.com') yt_cookies = self._youtube_cookies
yt_sapisid = try_call(lambda: yt_cookies['SAPISID'].value) yt_sapisid = try_call(lambda: yt_cookies['SAPISID'].value)
yt_3papisid = try_call(lambda: yt_cookies['__Secure-3PAPISID'].value) yt_3papisid = try_call(lambda: yt_cookies['__Secure-3PAPISID'].value)
yt_1papisid = try_call(lambda: yt_cookies['__Secure-1PAPISID'].value) yt_1papisid = try_call(lambda: yt_cookies['__Secure-1PAPISID'].value)
@@ -595,6 +599,31 @@ def _get_sid_authorization_header(self, origin='https://www.youtube.com', user_s
return ' '.join(authorizations) return ' '.join(authorizations)
@property
def is_authenticated(self):
return self._has_auth_cookies
@property
def _has_auth_cookies(self):
yt_sapisid, yt_1psapisid, yt_3psapisid = self._get_sid_cookies()
# YouTube doesn't appear to clear 3PSAPISID when rotating cookies (as of 2025-04-26)
# But LOGIN_INFO is cleared and should exist if logged in
has_login_info = 'LOGIN_INFO' in self._youtube_cookies
return bool(has_login_info and (yt_sapisid or yt_1psapisid or yt_3psapisid))
def _request_webpage(self, *args, **kwargs):
response = super()._request_webpage(*args, **kwargs)
# Check that we are still logged-in and cookies have not rotated after every request
if getattr(self, '_passed_auth_cookies', None) and not self._has_auth_cookies:
self.report_warning(
'The provided YouTube account cookies are no longer valid. '
'They have likely been rotated in the browser as a security measure. '
f'For tips on how to effectively export YouTube cookies, refer to {self._COOKIE_HOWTO_WIKI_URL} .',
only_once=False)
return response
def _call_api(self, ep, query, video_id, fatal=True, headers=None, def _call_api(self, ep, query, video_id, fatal=True, headers=None,
note='Downloading API JSON', errnote='Unable to download API page', note='Downloading API JSON', errnote='Unable to download API page',
context=None, api_key=None, api_hostname=None, default_client='web'): context=None, api_key=None, api_hostname=None, default_client='web'):
@@ -695,10 +724,6 @@ def _extract_visitor_data(self, *args):
args, [('VISITOR_DATA', ('INNERTUBE_CONTEXT', 'client', 'visitorData'), ('responseContext', 'visitorData'))], args, [('VISITOR_DATA', ('INNERTUBE_CONTEXT', 'client', 'visitorData'), ('responseContext', 'visitorData'))],
expected_type=str) expected_type=str)
@functools.cached_property
def is_authenticated(self):
return bool(self._get_sid_authorization_header())
def extract_ytcfg(self, video_id, webpage): def extract_ytcfg(self, video_id, webpage):
if not webpage: if not webpage:
return {} return {}
@@ -803,12 +828,14 @@ def _extract_next_continuation_data(cls, renderer):
@classmethod @classmethod
def _extract_continuation_ep_data(cls, continuation_ep: dict): def _extract_continuation_ep_data(cls, continuation_ep: dict):
if isinstance(continuation_ep, dict): continuation_commands = traverse_obj(
continuation = try_get( continuation_ep, ('commandExecutorCommand', 'commands', ..., {dict}))
continuation_ep, lambda x: x['continuationCommand']['token'], str) continuation_commands.append(continuation_ep)
for command in continuation_commands:
continuation = traverse_obj(command, ('continuationCommand', 'token', {str}))
if not continuation: if not continuation:
return continue
ctp = continuation_ep.get('clickTrackingParams') ctp = command.get('clickTrackingParams')
return cls._build_api_continuation_query(continuation, ctp) return cls._build_api_continuation_query(continuation, ctp)
@classmethod @classmethod

View File

@@ -524,10 +524,16 @@ def _entries(self, tab, item_id, ytcfg, delegated_session_id, visitor_data):
response = self._extract_response( response = self._extract_response(
item_id=f'{item_id} page {page_num}', item_id=f'{item_id} page {page_num}',
query=continuation, headers=headers, ytcfg=ytcfg, query=continuation, headers=headers, ytcfg=ytcfg,
check_get_keys=('continuationContents', 'onResponseReceivedActions', 'onResponseReceivedEndpoints')) check_get_keys=(
'continuationContents', 'onResponseReceivedActions', 'onResponseReceivedEndpoints',
# Playlist recommendations may return with no data - ignore
('responseContext', 'serviceTrackingParams', ..., 'params', ..., lambda k, v: k == 'key' and v == 'GetRecommendedMusicPlaylists_rid'),
))
if not response: if not response:
break break
continuation = None
# Extracting updated visitor data is required to prevent an infinite extraction loop in some cases # Extracting updated visitor data is required to prevent an infinite extraction loop in some cases
# See: https://github.com/ytdl-org/youtube-dl/issues/28702 # See: https://github.com/ytdl-org/youtube-dl/issues/28702
visitor_data = self._extract_visitor_data(response) or visitor_data visitor_data = self._extract_visitor_data(response) or visitor_data
@@ -564,7 +570,13 @@ def _entries(self, tab, item_id, ytcfg, delegated_session_id, visitor_data):
yield from func(video_items_renderer) yield from func(video_items_renderer)
continuation = continuation_list[0] or self._extract_continuation(video_items_renderer) continuation = continuation_list[0] or self._extract_continuation(video_items_renderer)
if not video_items_renderer: # In the case only a continuation is returned, try to follow it.
# We extract this after trying to extract non-continuation items as otherwise this
# may be prioritized over other continuations.
# see: https://github.com/yt-dlp/yt-dlp/issues/12933
continuation = continuation or self._extract_continuation({'contents': [continuation_item]})
if not continuation and not video_items_renderer:
break break
@staticmethod @staticmethod
@@ -999,14 +1011,14 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'playlist_mincount': 94, 'playlist_mincount': 94,
'info_dict': { 'info_dict': {
'id': 'UCqj7Cz7revf5maW9g5pgNcg', 'id': 'UCqj7Cz7revf5maW9g5pgNcg',
'title': 'Igor Kleiner Ph.D. - Playlists', 'title': 'Igor Kleiner - Playlists',
'description': 'md5:15d7dd9e333cb987907fcb0d604b233a', 'description': 'md5:15d7dd9e333cb987907fcb0d604b233a',
'uploader': 'Igor Kleiner Ph.D.', 'uploader': 'Igor Kleiner ',
'uploader_id': '@IgorDataScience', 'uploader_id': '@IgorDataScience',
'uploader_url': 'https://www.youtube.com/@IgorDataScience', 'uploader_url': 'https://www.youtube.com/@IgorDataScience',
'channel': 'Igor Kleiner Ph.D.', 'channel': 'Igor Kleiner ',
'channel_id': 'UCqj7Cz7revf5maW9g5pgNcg', 'channel_id': 'UCqj7Cz7revf5maW9g5pgNcg',
'tags': ['критическое мышление', 'наука просто', 'математика', 'анализ данных'], 'tags': 'count:23',
'channel_url': 'https://www.youtube.com/channel/UCqj7Cz7revf5maW9g5pgNcg', 'channel_url': 'https://www.youtube.com/channel/UCqj7Cz7revf5maW9g5pgNcg',
'channel_follower_count': int, 'channel_follower_count': int,
}, },
@@ -1016,18 +1028,19 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'playlist_mincount': 94, 'playlist_mincount': 94,
'info_dict': { 'info_dict': {
'id': 'UCqj7Cz7revf5maW9g5pgNcg', 'id': 'UCqj7Cz7revf5maW9g5pgNcg',
'title': 'Igor Kleiner Ph.D. - Playlists', 'title': 'Igor Kleiner - Playlists',
'description': 'md5:15d7dd9e333cb987907fcb0d604b233a', 'description': 'md5:15d7dd9e333cb987907fcb0d604b233a',
'uploader': 'Igor Kleiner Ph.D.', 'uploader': 'Igor Kleiner ',
'uploader_id': '@IgorDataScience', 'uploader_id': '@IgorDataScience',
'uploader_url': 'https://www.youtube.com/@IgorDataScience', 'uploader_url': 'https://www.youtube.com/@IgorDataScience',
'tags': ['критическое мышление', 'наука просто', 'математика', 'анализ данных'], 'tags': 'count:23',
'channel_id': 'UCqj7Cz7revf5maW9g5pgNcg', 'channel_id': 'UCqj7Cz7revf5maW9g5pgNcg',
'channel': 'Igor Kleiner Ph.D.', 'channel': 'Igor Kleiner ',
'channel_url': 'https://www.youtube.com/channel/UCqj7Cz7revf5maW9g5pgNcg', 'channel_url': 'https://www.youtube.com/channel/UCqj7Cz7revf5maW9g5pgNcg',
'channel_follower_count': int, 'channel_follower_count': int,
}, },
}, { }, {
# TODO: fix channel_is_verified extraction
'note': 'playlists, series', 'note': 'playlists, series',
'url': 'https://www.youtube.com/c/3blue1brown/playlists?view=50&sort=dd&shelf_id=3', 'url': 'https://www.youtube.com/c/3blue1brown/playlists?view=50&sort=dd&shelf_id=3',
'playlist_mincount': 5, 'playlist_mincount': 5,
@@ -1066,22 +1079,23 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'url': 'https://www.youtube.com/c/ChristophLaimer/playlists', 'url': 'https://www.youtube.com/c/ChristophLaimer/playlists',
'only_matching': True, 'only_matching': True,
}, { }, {
# TODO: fix availability extraction
'note': 'basic, single video playlist', 'note': 'basic, single video playlist',
'url': 'https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc', 'url': 'https://www.youtube.com/playlist?list=PLt5yu3-wZAlSLRHmI1qNm0wjyVNWw1pCU',
'info_dict': { 'info_dict': {
'id': 'PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc', 'id': 'PLt5yu3-wZAlSLRHmI1qNm0wjyVNWw1pCU',
'title': 'youtube-dl public playlist', 'title': 'single video playlist',
'description': '', 'description': '',
'tags': [], 'tags': [],
'view_count': int, 'view_count': int,
'modified_date': '20201130', 'modified_date': '20250417',
'channel': 'Sergey M.', 'channel': 'cole-dlp-test-acc',
'channel_id': 'UCmlqkdCBesrv2Lak1mF_MxA', 'channel_id': 'UCiu-3thuViMebBjw_5nWYrA',
'channel_url': 'https://www.youtube.com/channel/UCmlqkdCBesrv2Lak1mF_MxA', 'channel_url': 'https://www.youtube.com/channel/UCiu-3thuViMebBjw_5nWYrA',
'availability': 'public', 'availability': 'public',
'uploader': 'Sergey M.', 'uploader': 'cole-dlp-test-acc',
'uploader_url': 'https://www.youtube.com/@sergeym.6173', 'uploader_url': 'https://www.youtube.com/@coletdjnz',
'uploader_id': '@sergeym.6173', 'uploader_id': '@coletdjnz',
}, },
'playlist_count': 1, 'playlist_count': 1,
}, { }, {
@@ -1171,11 +1185,11 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
}, },
'playlist_mincount': 17, 'playlist_mincount': 17,
}, { }, {
'note': 'Community tab', 'note': 'Posts tab',
'url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w/community', 'url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w/community',
'info_dict': { 'info_dict': {
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w', 'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
'title': 'lex will - Community', 'title': 'lex will - Posts',
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488', 'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
'channel': 'lex will', 'channel': 'lex will',
'channel_url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w', 'channel_url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w',
@@ -1188,30 +1202,14 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
}, },
'playlist_mincount': 18, 'playlist_mincount': 18,
}, { }, {
'note': 'Channels tab', # TODO: fix channel_is_verified extraction
'url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w/channels',
'info_dict': {
'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
'title': 'lex will - Channels',
'description': 'md5:2163c5d0ff54ed5f598d6a7e6211e488',
'channel': 'lex will',
'channel_url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w',
'channel_id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
'tags': ['bible', 'history', 'prophesy'],
'channel_follower_count': int,
'uploader_url': 'https://www.youtube.com/@lexwill718',
'uploader_id': '@lexwill718',
'uploader': 'lex will',
},
'playlist_mincount': 12,
}, {
'note': 'Search tab', 'note': 'Search tab',
'url': 'https://www.youtube.com/c/3blue1brown/search?query=linear%20algebra', 'url': 'https://www.youtube.com/c/3blue1brown/search?query=linear%20algebra',
'playlist_mincount': 40, 'playlist_mincount': 40,
'info_dict': { 'info_dict': {
'id': 'UCYO_jab_esuFRV4b17AJtAw', 'id': 'UCYO_jab_esuFRV4b17AJtAw',
'title': '3Blue1Brown - Search - linear algebra', 'title': '3Blue1Brown - Search - linear algebra',
'description': 'md5:4d1da95432004b7ba840ebc895b6b4c9', 'description': 'md5:602e3789e6a0cb7d9d352186b720e395',
'channel_url': 'https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw', 'channel_url': 'https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw',
'tags': ['Mathematics'], 'tags': ['Mathematics'],
'channel': '3Blue1Brown', 'channel': '3Blue1Brown',
@@ -1232,6 +1230,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'url': 'https://music.youtube.com/channel/UCmlqkdCBesrv2Lak1mF_MxA', 'url': 'https://music.youtube.com/channel/UCmlqkdCBesrv2Lak1mF_MxA',
'only_matching': True, 'only_matching': True,
}, { }, {
# TODO: fix availability extraction
'note': 'Playlist with deleted videos (#651). As a bonus, the video #51 is also twice in this list.', 'note': 'Playlist with deleted videos (#651). As a bonus, the video #51 is also twice in this list.',
'url': 'https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC', 'url': 'https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC',
'info_dict': { 'info_dict': {
@@ -1294,24 +1293,25 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
}, },
'playlist_mincount': 21, 'playlist_mincount': 21,
}, { }, {
# TODO: fix availability extraction
'note': 'Playlist with "show unavailable videos" button', 'note': 'Playlist with "show unavailable videos" button',
'url': 'https://www.youtube.com/playlist?list=UUTYLiWFZy8xtPwxFwX9rV7Q', 'url': 'https://www.youtube.com/playlist?list=PLYwq8WOe86_xGmR7FrcJq8Sb7VW8K3Tt2',
'info_dict': { 'info_dict': {
'title': 'Uploads from Phim Siêu Nhân Nhật Bản', 'title': 'The Memes Of 2010s.....',
'id': 'UUTYLiWFZy8xtPwxFwX9rV7Q', 'id': 'PLYwq8WOe86_xGmR7FrcJq8Sb7VW8K3Tt2',
'view_count': int, 'view_count': int,
'channel': 'Phim Siêu Nhân Nhật Bản', 'channel': "I'm Not JiNxEd",
'tags': [], 'tags': [],
'description': '', 'description': 'md5:44dc3b315ba69394feaafa2f40e7b2a1',
'channel_url': 'https://www.youtube.com/channel/UCTYLiWFZy8xtPwxFwX9rV7Q', 'channel_url': 'https://www.youtube.com/channel/UC5H5H85D1QE5-fuWWQ1hdNg',
'channel_id': 'UCTYLiWFZy8xtPwxFwX9rV7Q', 'channel_id': 'UC5H5H85D1QE5-fuWWQ1hdNg',
'modified_date': r're:\d{8}', 'modified_date': r're:\d{8}',
'availability': 'public', 'availability': 'public',
'uploader_url': 'https://www.youtube.com/@phimsieunhannhatban', 'uploader_url': 'https://www.youtube.com/@imnotjinxed1998',
'uploader_id': '@phimsieunhannhatban', 'uploader_id': '@imnotjinxed1998',
'uploader': 'Phim Siêu Nhân Nhật Bản', 'uploader': "I'm Not JiNxEd",
}, },
'playlist_mincount': 200, 'playlist_mincount': 150,
'expected_warnings': [r'[Uu]navailable videos (are|will be) hidden'], 'expected_warnings': [r'[Uu]navailable videos (are|will be) hidden'],
}, { }, {
'note': 'Playlist with unavailable videos in page 7', 'note': 'Playlist with unavailable videos in page 7',
@@ -1334,6 +1334,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'playlist_mincount': 1000, 'playlist_mincount': 1000,
'expected_warnings': [r'[Uu]navailable videos (are|will be) hidden'], 'expected_warnings': [r'[Uu]navailable videos (are|will be) hidden'],
}, { }, {
# TODO: fix availability extraction
'note': 'https://github.com/ytdl-org/youtube-dl/issues/21844', 'note': 'https://github.com/ytdl-org/youtube-dl/issues/21844',
'url': 'https://www.youtube.com/playlist?list=PLzH6n4zXuckpfMu_4Ff8E7Z1behQks5ba', 'url': 'https://www.youtube.com/playlist?list=PLzH6n4zXuckpfMu_4Ff8E7Z1behQks5ba',
'info_dict': { 'info_dict': {
@@ -1384,7 +1385,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
}, { }, {
'url': 'https://www.youtube.com/channel/UCoMdktPbSTixAyNGwb-UYkQ/live', 'url': 'https://www.youtube.com/channel/UCoMdktPbSTixAyNGwb-UYkQ/live',
'info_dict': { 'info_dict': {
'id': 'hGkQjiJLjWQ', # This will keep changing 'id': 'YDvsBbKfLPA', # This will keep changing
'ext': 'mp4', 'ext': 'mp4',
'title': str, 'title': str,
'upload_date': r're:\d{8}', 'upload_date': r're:\d{8}',
@@ -1409,6 +1410,8 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'uploader_id': '@SkyNews', 'uploader_id': '@SkyNews',
'uploader': 'Sky News', 'uploader': 'Sky News',
'channel_is_verified': True, 'channel_is_verified': True,
'media_type': 'livestream',
'timestamp': int,
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
@@ -1496,6 +1499,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'url': 'https://music.youtube.com/browse/UC1a8OFewdjuLq6KlF8M_8Ng', 'url': 'https://music.youtube.com/browse/UC1a8OFewdjuLq6KlF8M_8Ng',
'only_matching': True, 'only_matching': True,
}, { }, {
# TODO: fix availability extraction
'note': 'VLPL, should redirect to playlist?list=PL...', 'note': 'VLPL, should redirect to playlist?list=PL...',
'url': 'https://music.youtube.com/browse/VLPLRBp0Fe2GpgmgoscNFLxNyBVSFVdYmFkq', 'url': 'https://music.youtube.com/browse/VLPLRBp0Fe2GpgmgoscNFLxNyBVSFVdYmFkq',
'info_dict': { 'info_dict': {
@@ -1537,6 +1541,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
}, { }, {
# Destination channel with only a hidden self tab (tab id is UCtFRv9O2AHqOZjjynzrv-xg) # Destination channel with only a hidden self tab (tab id is UCtFRv9O2AHqOZjjynzrv-xg)
# Treat as a general feed # Treat as a general feed
# TODO: fix extraction
'url': 'https://www.youtube.com/channel/UCtFRv9O2AHqOZjjynzrv-xg', 'url': 'https://www.youtube.com/channel/UCtFRv9O2AHqOZjjynzrv-xg',
'info_dict': { 'info_dict': {
'id': 'UCtFRv9O2AHqOZjjynzrv-xg', 'id': 'UCtFRv9O2AHqOZjjynzrv-xg',
@@ -1560,21 +1565,21 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'expected_warnings': ['YouTube Music is not directly supported'], 'expected_warnings': ['YouTube Music is not directly supported'],
}, { }, {
'note': 'unlisted single video playlist', 'note': 'unlisted single video playlist',
'url': 'https://www.youtube.com/playlist?list=PLwL24UFy54GrB3s2KMMfjZscDi1x5Dajf', 'url': 'https://www.youtube.com/playlist?list=PLt5yu3-wZAlQLfIN0MMgp0wVV6MP3bM4_',
'info_dict': { 'info_dict': {
'id': 'PLwL24UFy54GrB3s2KMMfjZscDi1x5Dajf', 'id': 'PLt5yu3-wZAlQLfIN0MMgp0wVV6MP3bM4_',
'title': 'yt-dlp unlisted playlist test', 'title': 'unlisted playlist',
'availability': 'unlisted', 'availability': 'unlisted',
'tags': [], 'tags': [],
'modified_date': '20220418', 'modified_date': '20250417',
'channel': 'colethedj', 'channel': 'cole-dlp-test-acc',
'view_count': int, 'view_count': int,
'description': '', 'description': '',
'channel_id': 'UC9zHu_mHU96r19o-wV5Qs1Q', 'channel_id': 'UCiu-3thuViMebBjw_5nWYrA',
'channel_url': 'https://www.youtube.com/channel/UC9zHu_mHU96r19o-wV5Qs1Q', 'channel_url': 'https://www.youtube.com/channel/UCiu-3thuViMebBjw_5nWYrA',
'uploader_url': 'https://www.youtube.com/@colethedj1894', 'uploader_url': 'https://www.youtube.com/@coletdjnz',
'uploader_id': '@colethedj1894', 'uploader_id': '@coletdjnz',
'uploader': 'colethedj', 'uploader': 'cole-dlp-test-acc',
}, },
'playlist': [{ 'playlist': [{
'info_dict': { 'info_dict': {
@@ -1596,6 +1601,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'playlist_count': 1, 'playlist_count': 1,
'params': {'extract_flat': True}, 'params': {'extract_flat': True},
}, { }, {
# By default, recommended is always empty.
'note': 'API Fallback: Recommended - redirects to home page. Requires visitorData', 'note': 'API Fallback: Recommended - redirects to home page. Requires visitorData',
'url': 'https://www.youtube.com/feed/recommended', 'url': 'https://www.youtube.com/feed/recommended',
'info_dict': { 'info_dict': {
@@ -1603,7 +1609,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'title': 'recommended', 'title': 'recommended',
'tags': [], 'tags': [],
}, },
'playlist_mincount': 50, 'playlist_count': 0,
'params': { 'params': {
'skip_download': True, 'skip_download': True,
'extractor_args': {'youtubetab': {'skip': ['webpage']}}, 'extractor_args': {'youtubetab': {'skip': ['webpage']}},
@@ -1628,6 +1634,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
}, },
'skip': 'Query for sorting no longer works', 'skip': 'Query for sorting no longer works',
}, { }, {
# TODO: fix 'unviewable' issue with this playlist when reloading with unavailable videos
'note': 'API Fallback: Topic, should redirect to playlist?list=UU...', 'note': 'API Fallback: Topic, should redirect to playlist?list=UU...',
'url': 'https://music.youtube.com/browse/UC9ALqqC4aIeG5iDs7i90Bfw', 'url': 'https://music.youtube.com/browse/UC9ALqqC4aIeG5iDs7i90Bfw',
'info_dict': { 'info_dict': {
@@ -1654,11 +1661,12 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'url': 'https://www.youtube.com/channel/UCwVVpHQ2Cs9iGJfpdFngePQ', 'url': 'https://www.youtube.com/channel/UCwVVpHQ2Cs9iGJfpdFngePQ',
'only_matching': True, 'only_matching': True,
}, { }, {
# TODO: fix metadata extraction
'note': 'collaborative playlist (uploader name in the form "by <uploader> and x other(s)")', 'note': 'collaborative playlist (uploader name in the form "by <uploader> and x other(s)")',
'url': 'https://www.youtube.com/playlist?list=PLx-_-Kk4c89oOHEDQAojOXzEzemXxoqx6', 'url': 'https://www.youtube.com/playlist?list=PLx-_-Kk4c89oOHEDQAojOXzEzemXxoqx6',
'info_dict': { 'info_dict': {
'id': 'PLx-_-Kk4c89oOHEDQAojOXzEzemXxoqx6', 'id': 'PLx-_-Kk4c89oOHEDQAojOXzEzemXxoqx6',
'modified_date': '20220407', 'modified_date': '20250115',
'channel_url': 'https://www.youtube.com/channel/UCKcqXmCcyqnhgpA5P0oHH_Q', 'channel_url': 'https://www.youtube.com/channel/UCKcqXmCcyqnhgpA5P0oHH_Q',
'tags': [], 'tags': [],
'availability': 'unlisted', 'availability': 'unlisted',
@@ -1692,6 +1700,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'expected_warnings': ['Preferring "ja"'], 'expected_warnings': ['Preferring "ja"'],
}, { }, {
# XXX: this should really check flat playlist entries, but the test suite doesn't support that # XXX: this should really check flat playlist entries, but the test suite doesn't support that
# TODO: fix availability extraction
'note': 'preferred lang set with playlist with translated video titles', 'note': 'preferred lang set with playlist with translated video titles',
'url': 'https://www.youtube.com/playlist?list=PLt5yu3-wZAlQAaPZ5Z-rJoTdbT-45Q7c0', 'url': 'https://www.youtube.com/playlist?list=PLt5yu3-wZAlQAaPZ5Z-rJoTdbT-45Q7c0',
'info_dict': { 'info_dict': {
@@ -1714,6 +1723,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
}, { }, {
# shorts audio pivot for 2GtVksBMYFM. # shorts audio pivot for 2GtVksBMYFM.
'url': 'https://www.youtube.com/feed/sfv_audio_pivot?bp=8gUrCikSJwoLMkd0VmtzQk1ZRk0SCzJHdFZrc0JNWUZNGgsyR3RWa3NCTVlGTQ==', 'url': 'https://www.youtube.com/feed/sfv_audio_pivot?bp=8gUrCikSJwoLMkd0VmtzQk1ZRk0SCzJHdFZrc0JNWUZNGgsyR3RWa3NCTVlGTQ==',
# TODO: fix extraction
'info_dict': { 'info_dict': {
'id': 'sfv_audio_pivot', 'id': 'sfv_audio_pivot',
'title': 'sfv_audio_pivot', 'title': 'sfv_audio_pivot',
@@ -1751,6 +1761,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'playlist_mincount': 8, 'playlist_mincount': 8,
}, { }, {
# Should get three playlists for videos, shorts and streams tabs # Should get three playlists for videos, shorts and streams tabs
# TODO: fix channel_is_verified extraction
'url': 'https://www.youtube.com/channel/UCK9V2B22uJYu3N7eR_BT9QA', 'url': 'https://www.youtube.com/channel/UCK9V2B22uJYu3N7eR_BT9QA',
'info_dict': { 'info_dict': {
'id': 'UCK9V2B22uJYu3N7eR_BT9QA', 'id': 'UCK9V2B22uJYu3N7eR_BT9QA',
@@ -1758,7 +1769,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'channel_follower_count': int, 'channel_follower_count': int,
'channel_id': 'UCK9V2B22uJYu3N7eR_BT9QA', 'channel_id': 'UCK9V2B22uJYu3N7eR_BT9QA',
'channel_url': 'https://www.youtube.com/channel/UCK9V2B22uJYu3N7eR_BT9QA', 'channel_url': 'https://www.youtube.com/channel/UCK9V2B22uJYu3N7eR_BT9QA',
'description': 'md5:49809d8bf9da539bc48ed5d1f83c33f2', 'description': 'md5:01e53f350ab8ad6fcf7c4fedb3c1b99f',
'channel': 'Polka Ch. 尾丸ポルカ', 'channel': 'Polka Ch. 尾丸ポルカ',
'tags': 'count:35', 'tags': 'count:35',
'uploader_url': 'https://www.youtube.com/@OmaruPolka', 'uploader_url': 'https://www.youtube.com/@OmaruPolka',
@@ -1769,14 +1780,14 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'playlist_count': 3, 'playlist_count': 3,
}, { }, {
# Shorts tab with channel with handle # Shorts tab with channel with handle
# TODO: fix channel description # TODO: fix channel_is_verified extraction
'url': 'https://www.youtube.com/@NotJustBikes/shorts', 'url': 'https://www.youtube.com/@NotJustBikes/shorts',
'info_dict': { 'info_dict': {
'id': 'UC0intLFzLaudFG-xAvUEO-A', 'id': 'UC0intLFzLaudFG-xAvUEO-A',
'title': 'Not Just Bikes - Shorts', 'title': 'Not Just Bikes - Shorts',
'tags': 'count:10', 'tags': 'count:10',
'channel_url': 'https://www.youtube.com/channel/UC0intLFzLaudFG-xAvUEO-A', 'channel_url': 'https://www.youtube.com/channel/UC0intLFzLaudFG-xAvUEO-A',
'description': 'md5:5e82545b3a041345927a92d0585df247', 'description': 'md5:1d9fc1bad7f13a487299d1fe1712e031',
'channel_follower_count': int, 'channel_follower_count': int,
'channel_id': 'UC0intLFzLaudFG-xAvUEO-A', 'channel_id': 'UC0intLFzLaudFG-xAvUEO-A',
'channel': 'Not Just Bikes', 'channel': 'Not Just Bikes',
@@ -1797,7 +1808,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'channel_url': 'https://www.youtube.com/channel/UC3eYAvjCVwNHgkaGbXX3sig', 'channel_url': 'https://www.youtube.com/channel/UC3eYAvjCVwNHgkaGbXX3sig',
'channel': '中村悠一', 'channel': '中村悠一',
'channel_follower_count': int, 'channel_follower_count': int,
'description': 'md5:e744f6c93dafa7a03c0c6deecb157300', 'description': 'md5:e8fd705073a594f27d6d6d020da560dc',
'uploader_url': 'https://www.youtube.com/@Yuichi-Nakamura', 'uploader_url': 'https://www.youtube.com/@Yuichi-Nakamura',
'uploader_id': '@Yuichi-Nakamura', 'uploader_id': '@Yuichi-Nakamura',
'uploader': '中村悠一', 'uploader': '中村悠一',
@@ -1815,6 +1826,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'only_matching': True, 'only_matching': True,
}, { }, {
# No videos tab but has a shorts tab # No videos tab but has a shorts tab
# TODO: fix metadata extraction
'url': 'https://www.youtube.com/c/TKFShorts', 'url': 'https://www.youtube.com/c/TKFShorts',
'info_dict': { 'info_dict': {
'id': 'UCgJ5_1F6yJhYLnyMszUdmUg', 'id': 'UCgJ5_1F6yJhYLnyMszUdmUg',
@@ -1851,6 +1863,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
}, { }, {
# Shorts url result in shorts tab # Shorts url result in shorts tab
# TODO: Fix channel id extraction # TODO: Fix channel id extraction
# TODO: fix test suite, 208163447408c78673b08c172beafe5c310fb167 broke this test
'url': 'https://www.youtube.com/channel/UCiu-3thuViMebBjw_5nWYrA/shorts', 'url': 'https://www.youtube.com/channel/UCiu-3thuViMebBjw_5nWYrA/shorts',
'info_dict': { 'info_dict': {
'id': 'UCiu-3thuViMebBjw_5nWYrA', 'id': 'UCiu-3thuViMebBjw_5nWYrA',
@@ -1879,6 +1892,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'params': {'extract_flat': True}, 'params': {'extract_flat': True},
}, { }, {
# Live video status should be extracted # Live video status should be extracted
# TODO: fix test suite, 208163447408c78673b08c172beafe5c310fb167 broke this test
'url': 'https://www.youtube.com/channel/UCQvWX73GQygcwXOTSf_VDVg/live', 'url': 'https://www.youtube.com/channel/UCQvWX73GQygcwXOTSf_VDVg/live',
'info_dict': { 'info_dict': {
'id': 'UCQvWX73GQygcwXOTSf_VDVg', 'id': 'UCQvWX73GQygcwXOTSf_VDVg',
@@ -1907,6 +1921,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'playlist_mincount': 1, 'playlist_mincount': 1,
}, { }, {
# Channel renderer metadata. Contains number of videos on the channel # Channel renderer metadata. Contains number of videos on the channel
# TODO: channels tab removed, change this test to use another page with channel renderer
'url': 'https://www.youtube.com/channel/UCiu-3thuViMebBjw_5nWYrA/channels', 'url': 'https://www.youtube.com/channel/UCiu-3thuViMebBjw_5nWYrA/channels',
'info_dict': { 'info_dict': {
'id': 'UCiu-3thuViMebBjw_5nWYrA', 'id': 'UCiu-3thuViMebBjw_5nWYrA',
@@ -1940,7 +1955,9 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
}, },
}], }],
'params': {'extract_flat': True}, 'params': {'extract_flat': True},
'skip': 'channels tab removed',
}, { }, {
# TODO: fix channel_is_verified extraction
'url': 'https://www.youtube.com/@3blue1brown/about', 'url': 'https://www.youtube.com/@3blue1brown/about',
'info_dict': { 'info_dict': {
'id': '@3blue1brown', 'id': '@3blue1brown',
@@ -1950,7 +1967,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'channel_id': 'UCYO_jab_esuFRV4b17AJtAw', 'channel_id': 'UCYO_jab_esuFRV4b17AJtAw',
'channel': '3Blue1Brown', 'channel': '3Blue1Brown',
'channel_url': 'https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw', 'channel_url': 'https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw',
'description': 'md5:4d1da95432004b7ba840ebc895b6b4c9', 'description': 'md5:602e3789e6a0cb7d9d352186b720e395',
'uploader_url': 'https://www.youtube.com/@3blue1brown', 'uploader_url': 'https://www.youtube.com/@3blue1brown',
'uploader_id': '@3blue1brown', 'uploader_id': '@3blue1brown',
'uploader': '3Blue1Brown', 'uploader': '3Blue1Brown',
@@ -1976,6 +1993,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'playlist_count': 5, 'playlist_count': 5,
}, { }, {
# Releases tab, with rich entry playlistRenderers (same as Podcasts tab) # Releases tab, with rich entry playlistRenderers (same as Podcasts tab)
# TODO: fix channel_is_verified extraction
'url': 'https://www.youtube.com/@AHimitsu/releases', 'url': 'https://www.youtube.com/@AHimitsu/releases',
'info_dict': { 'info_dict': {
'id': 'UCgFwu-j5-xNJml2FtTrrB3A', 'id': 'UCgFwu-j5-xNJml2FtTrrB3A',
@@ -2015,6 +2033,7 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'playlist_mincount': 100, 'playlist_mincount': 100,
'expected_warnings': [r'[Uu]navailable videos (are|will be) hidden'], 'expected_warnings': [r'[Uu]navailable videos (are|will be) hidden'],
}, { }, {
# TODO: fix channel_is_verified extraction
'note': 'Tags containing spaces', 'note': 'Tags containing spaces',
'url': 'https://www.youtube.com/channel/UC7_YxT-KID8kRbqZo7MyscQ', 'url': 'https://www.youtube.com/channel/UC7_YxT-KID8kRbqZo7MyscQ',
'playlist_count': 3, 'playlist_count': 3,
@@ -2035,6 +2054,24 @@ class YoutubeTabIE(YoutubeTabBaseInfoExtractor):
'challenges', 'sketches', 'scary games', 'funny games', 'rage games', 'challenges', 'sketches', 'scary games', 'funny games', 'rage games',
'mark fischbach'], 'mark fischbach'],
}, },
}, {
# https://github.com/yt-dlp/yt-dlp/issues/12933
'note': 'streams tab, some scheduled streams. Empty intermediate response with only continuation - must follow',
'url': 'https://www.youtube.com/@sbcitygov/streams',
'playlist_mincount': 150,
'info_dict': {
'id': 'UCH6-qfQwlUgz9SAf05jvc_w',
'channel': 'sbcitygov',
'channel_id': 'UCH6-qfQwlUgz9SAf05jvc_w',
'title': 'sbcitygov - Live',
'channel_follower_count': int,
'description': 'md5:ca1a92059835c071e33b3db52f4a6d67',
'uploader_id': '@sbcitygov',
'uploader_url': 'https://www.youtube.com/@sbcitygov',
'uploader': 'sbcitygov',
'channel_url': 'https://www.youtube.com/channel/UCH6-qfQwlUgz9SAf05jvc_w',
'tags': [],
},
}] }]
@classmethod @classmethod

View File

@@ -34,6 +34,7 @@
clean_html, clean_html,
datetime_from_str, datetime_from_str,
filesize_from_tbr, filesize_from_tbr,
filter_dict,
float_or_none, float_or_none,
format_field, format_field,
get_first, get_first,
@@ -1760,6 +1761,16 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
}, },
] ]
_PLAYER_JS_VARIANT_MAP = {
'main': 'player_ias.vflset/en_US/base.js',
'tce': 'player_ias_tce.vflset/en_US/base.js',
'tv': 'tv-player-ias.vflset/tv-player-ias.js',
'tv_es6': 'tv-player-es6.vflset/tv-player-es6.js',
'phone': 'player-plasma-ias-phone-en_US.vflset/base.js',
'tablet': 'player-plasma-ias-tablet-en_US.vflset/base.js',
}
_INVERSE_PLAYER_JS_VARIANT_MAP = {v: k for k, v in _PLAYER_JS_VARIANT_MAP.items()}
@classmethod @classmethod
def suitable(cls, url): def suitable(cls, url):
from yt_dlp.utils import parse_qs from yt_dlp.utils import parse_qs
@@ -1939,6 +1950,21 @@ def _extract_player_url(self, *ytcfgs, webpage=None):
get_all=False, expected_type=str) get_all=False, expected_type=str)
if not player_url: if not player_url:
return return
requested_js_variant = self._configuration_arg('player_js_variant', [''])[0] or 'actual'
if requested_js_variant in self._PLAYER_JS_VARIANT_MAP:
player_id = self._extract_player_info(player_url)
original_url = player_url
player_url = f'/s/player/{player_id}/{self._PLAYER_JS_VARIANT_MAP[requested_js_variant]}'
if original_url != player_url:
self.write_debug(
f'Forcing "{requested_js_variant}" player JS variant for player {player_id}\n'
f' original url = {original_url}', only_once=True)
elif requested_js_variant != 'actual':
self.report_warning(
f'Invalid player JS variant name "{requested_js_variant}" requested. '
f'Valid choices are: {", ".join(self._PLAYER_JS_VARIANT_MAP)}', only_once=True)
return urljoin('https://www.youtube.com', player_url) return urljoin('https://www.youtube.com', player_url)
def _download_player_url(self, video_id, fatal=False): def _download_player_url(self, video_id, fatal=False):
@@ -1953,6 +1979,19 @@ def _download_player_url(self, video_id, fatal=False):
if player_version: if player_version:
return f'https://www.youtube.com/s/player/{player_version}/player_ias.vflset/en_US/base.js' return f'https://www.youtube.com/s/player/{player_version}/player_ias.vflset/en_US/base.js'
def _player_js_cache_key(self, player_url):
player_id = self._extract_player_info(player_url)
player_path = remove_start(urllib.parse.urlparse(player_url).path, f'/s/player/{player_id}/')
variant = self._INVERSE_PLAYER_JS_VARIANT_MAP.get(player_path) or next((
v for k, v in self._INVERSE_PLAYER_JS_VARIANT_MAP.items()
if re.fullmatch(re.escape(k).replace('en_US', r'[a-zA-Z0-9_]+'), player_path)), None)
if not variant:
self.write_debug(
f'Unable to determine player JS variant\n'
f' player = {player_url}', only_once=True)
variant = re.sub(r'[^a-zA-Z0-9]', '_', remove_end(player_path, '.js'))
return join_nonempty(player_id, variant)
def _signature_cache_id(self, example_sig): def _signature_cache_id(self, example_sig):
""" Return a string representation of a signature """ """ Return a string representation of a signature """
return '.'.join(str(len(part)) for part in example_sig.split('.')) return '.'.join(str(len(part)) for part in example_sig.split('.'))
@@ -1968,30 +2007,29 @@ def _extract_player_info(cls, player_url):
return id_m.group('id') return id_m.group('id')
def _load_player(self, video_id, player_url, fatal=True): def _load_player(self, video_id, player_url, fatal=True):
player_id = self._extract_player_info(player_url) player_js_key = self._player_js_cache_key(player_url)
if player_id not in self._code_cache: if player_js_key not in self._code_cache:
code = self._download_webpage( code = self._download_webpage(
player_url, video_id, fatal=fatal, player_url, video_id, fatal=fatal,
note='Downloading player ' + player_id, note=f'Downloading player {player_js_key}',
errnote=f'Download of {player_url} failed') errnote=f'Download of {player_js_key} failed')
if code: if code:
self._code_cache[player_id] = code self._code_cache[player_js_key] = code
return self._code_cache.get(player_id) return self._code_cache.get(player_js_key)
def _extract_signature_function(self, video_id, player_url, example_sig): def _extract_signature_function(self, video_id, player_url, example_sig):
player_id = self._extract_player_info(player_url)
# Read from filesystem cache # Read from filesystem cache
func_id = f'js_{player_id}_{self._signature_cache_id(example_sig)}' func_id = join_nonempty(
self._player_js_cache_key(player_url), self._signature_cache_id(example_sig))
assert os.path.basename(func_id) == func_id assert os.path.basename(func_id) == func_id
self.write_debug(f'Extracting signature function {func_id}') self.write_debug(f'Extracting signature function {func_id}')
cache_spec, code = self.cache.load('youtube-sigfuncs', func_id), None cache_spec, code = self.cache.load('youtube-sigfuncs', func_id, min_ver='2025.03.31'), None
if not cache_spec: if not cache_spec:
code = self._load_player(video_id, player_url) code = self._load_player(video_id, player_url)
if code: if code:
res = self._parse_sig_js(code) res = self._parse_sig_js(code, player_url)
test_string = ''.join(map(chr, range(len(example_sig)))) test_string = ''.join(map(chr, range(len(example_sig))))
cache_spec = [ord(c) for c in res(test_string)] cache_spec = [ord(c) for c in res(test_string)]
self.cache.store('youtube-sigfuncs', func_id, cache_spec) self.cache.store('youtube-sigfuncs', func_id, cache_spec)
@@ -2039,7 +2077,7 @@ def _genslice(start, end, step):
f' return {expr_code}\n') f' return {expr_code}\n')
self.to_screen('Extracted signature function:\n' + code) self.to_screen('Extracted signature function:\n' + code)
def _parse_sig_js(self, jscode): def _parse_sig_js(self, jscode, player_url):
# Examples where `sig` is funcname: # Examples where `sig` is funcname:
# sig=function(a){a=a.split(""); ... ;return a.join("")}; # sig=function(a){a=a.split(""); ... ;return a.join("")};
# ;c&&(c=sig(decodeURIComponent(c)),a.set(b,encodeURIComponent(c)));return a}; # ;c&&(c=sig(decodeURIComponent(c)),a.set(b,encodeURIComponent(c)));return a};
@@ -2063,12 +2101,9 @@ def _parse_sig_js(self, jscode):
r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('), r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('),
jscode, 'Initial JS player signature function name', group='sig') jscode, 'Initial JS player signature function name', group='sig')
varname, global_list = self._interpret_player_js_global_var(jscode, player_url)
jsi = JSInterpreter(jscode) jsi = JSInterpreter(jscode)
global_var_map = {} initial_function = jsi.extract_function(funcname, filter_dict({varname: global_list}))
_, varname, value = self._extract_player_js_global_var(jscode)
if varname:
global_var_map[varname] = jsi.interpret_expression(value, {}, allow_recursion=100)
initial_function = jsi.extract_function(funcname, global_var_map)
return lambda s: initial_function([s]) return lambda s: initial_function([s])
def _cached(self, func, *cache_id): def _cached(self, func, *cache_id):
@@ -2087,6 +2122,24 @@ def inner(*args, **kwargs):
return ret return ret
return inner return inner
def _load_player_data_from_cache(self, name, player_url):
cache_id = (f'youtube-{name}', self._player_js_cache_key(player_url))
if data := self._player_cache.get(cache_id):
return data
data = self.cache.load(*cache_id, min_ver='2025.03.31')
if data:
self._player_cache[cache_id] = data
return data
def _store_player_data_to_cache(self, name, player_url, data):
cache_id = (f'youtube-{name}', self._player_js_cache_key(player_url))
if cache_id not in self._player_cache:
self.cache.store(*cache_id, data)
self._player_cache[cache_id] = data
def _decrypt_signature(self, s, video_id, player_url): def _decrypt_signature(self, s, video_id, player_url):
"""Turn the encrypted s field into a working signature""" """Turn the encrypted s field into a working signature"""
extract_sig = self._cached( extract_sig = self._cached(
@@ -2127,9 +2180,31 @@ def _decrypt_nsig(self, s, video_id, player_url):
video_id=video_id, note='Executing signature code').strip() video_id=video_id, note='Executing signature code').strip()
self.write_debug(f'Decrypted nsig {s} => {ret}') self.write_debug(f'Decrypted nsig {s} => {ret}')
# Only cache nsig func JS code to disk if successful, and only once
self._store_player_data_to_cache('nsig', player_url, func_code)
return ret return ret
def _extract_n_function_name(self, jscode, player_url=None): def _extract_n_function_name(self, jscode, player_url=None):
varname, global_list = self._interpret_player_js_global_var(jscode, player_url)
if debug_str := traverse_obj(global_list, (lambda _, v: v.endswith('_w8_'), any)):
funcname = self._search_regex(
r'''(?xs)
[;\n](?:
(?P<f>function\s+)|
(?:var\s+)?
)(?P<funcname>[a-zA-Z0-9_$]+)\s*(?(f)|=\s*function\s*)
\((?P<argname>[a-zA-Z0-9_$]+)\)\s*\{
(?:(?!\}[;\n]).)+
\}\s*catch\(\s*[a-zA-Z0-9_$]+\s*\)\s*
\{\s*return\s+%s\[%d\]\s*\+\s*(?P=argname)\s*\}\s*return\s+[^}]+\}[;\n]
''' % (re.escape(varname), global_list.index(debug_str)),
jscode, 'nsig function name', group='funcname', default=None)
if funcname:
return funcname
self.write_debug(join_nonempty(
'Initial search was unable to find nsig function name',
player_url and f' player = {player_url}', delim='\n'), only_once=True)
# Examples (with placeholders nfunc, narray, idx): # Examples (with placeholders nfunc, narray, idx):
# * .get("n"))&&(b=nfunc(b) # * .get("n"))&&(b=nfunc(b)
# * .get("n"))&&(b=narray[idx](b) # * .get("n"))&&(b=narray[idx](b)
@@ -2159,7 +2234,7 @@ def _extract_n_function_name(self, jscode, player_url=None):
if not funcname: if not funcname:
self.report_warning(join_nonempty( self.report_warning(join_nonempty(
'Falling back to generic n function search', 'Falling back to generic n function search',
player_url and f' player = {player_url}', delim='\n')) player_url and f' player = {player_url}', delim='\n'), only_once=True)
return self._search_regex( return self._search_regex(
r'''(?xs) r'''(?xs)
;\s*(?P<name>[a-zA-Z0-9_$]+)\s*=\s*function\([a-zA-Z0-9_$]+\) ;\s*(?P<name>[a-zA-Z0-9_$]+)\s*=\s*function\([a-zA-Z0-9_$]+\)
@@ -2172,31 +2247,60 @@ def _extract_n_function_name(self, jscode, player_url=None):
rf'var {re.escape(funcname)}\s*=\s*(\[.+?\])\s*[,;]', jscode, rf'var {re.escape(funcname)}\s*=\s*(\[.+?\])\s*[,;]', jscode,
f'Initial JS player n function list ({funcname}.{idx})')))[int(idx)] f'Initial JS player n function list ({funcname}.{idx})')))[int(idx)]
def _extract_player_js_global_var(self, jscode): def _extract_player_js_global_var(self, jscode, player_url):
"""Returns tuple of strings: variable assignment code, variable name, variable value code""" """Returns tuple of strings: variable assignment code, variable name, variable value code"""
return self._search_regex( extract_global_var = self._cached(self._search_regex, 'js global array', player_url)
varcode, varname, varvalue = extract_global_var(
r'''(?x) r'''(?x)
\'use\s+strict\';\s* (?P<q1>["\'])use\s+strict(?P=q1);\s*
(?P<code> (?P<code>
var\s+(?P<name>[a-zA-Z0-9_$]+)\s*=\s* var\s+(?P<name>[a-zA-Z0-9_$]+)\s*=\s*
(?P<value>"(?:[^"\\]|\\.)+"\.split\("[^"]+"\)) (?P<value>
(?P<q2>["\'])(?:(?!(?P=q2)).|\\.)+(?P=q2)
\.split\((?P<q3>["\'])(?:(?!(?P=q3)).)+(?P=q3)\)
|\[\s*(?:(?P<q4>["\'])(?:(?!(?P=q4)).|\\.)*(?P=q4)\s*,?\s*)+\]
)
)[;,] )[;,]
''', jscode, 'global variable', group=('code', 'name', 'value'), default=(None, None, None)) ''', jscode, 'global variable', group=('code', 'name', 'value'), default=(None, None, None))
if not varcode:
self.write_debug(join_nonempty(
'No global array variable found in player JS',
player_url and f' player = {player_url}', delim='\n'), only_once=True)
return varcode, varname, varvalue
def _fixup_n_function_code(self, argnames, code, full_code): def _interpret_player_js_global_var(self, jscode, player_url):
global_var, varname, _ = self._extract_player_js_global_var(full_code) """Returns tuple of: variable name string, variable value list"""
if global_var: _, varname, array_code = self._extract_player_js_global_var(jscode, player_url)
self.write_debug(f'Prepending n function code with global array variable "{varname}"') jsi = JSInterpreter(array_code)
code = global_var + ', ' + code interpret_global_var = self._cached(jsi.interpret_expression, 'js global list', player_url)
return varname, interpret_global_var(array_code, {}, allow_recursion=10)
def _fixup_n_function_code(self, argnames, nsig_code, jscode, player_url):
varcode, varname, _ = self._extract_player_js_global_var(jscode, player_url)
if varcode and varname:
nsig_code = varcode + '; ' + nsig_code
_, global_list = self._interpret_player_js_global_var(jscode, player_url)
else: else:
self.write_debug('No global array variable found in player JS') varname = 'dlp_wins'
return argnames, re.sub( global_list = []
rf';\s*if\s*\(\s*typeof\s+[a-zA-Z0-9_$]+\s*===?\s*(?:(["\'])undefined\1|{varname}\[\d+\])\s*\)\s*return\s+{argnames[0]};',
';', code) undefined_idx = global_list.index('undefined') if 'undefined' in global_list else r'\d+'
fixed_code = re.sub(
rf'''(?x)
;\s*if\s*\(\s*typeof\s+[a-zA-Z0-9_$]+\s*===?\s*(?:
(["\'])undefined\1|
{re.escape(varname)}\[{undefined_idx}\]
)\s*\)\s*return\s+{re.escape(argnames[0])};
''', ';', nsig_code)
if fixed_code == nsig_code:
self.write_debug(join_nonempty(
'No typeof statement found in nsig function code',
player_url and f' player = {player_url}', delim='\n'), only_once=True)
return argnames, fixed_code
def _extract_n_function_code(self, video_id, player_url): def _extract_n_function_code(self, video_id, player_url):
player_id = self._extract_player_info(player_url) player_id = self._extract_player_info(player_url)
func_code = self.cache.load('youtube-nsig', player_id, min_ver='2025.03.21') func_code = self._load_player_data_from_cache('nsig', player_url)
jscode = func_code or self._load_player(video_id, player_url) jscode = func_code or self._load_player(video_id, player_url)
jsi = JSInterpreter(jscode) jsi = JSInterpreter(jscode)
@@ -2206,9 +2310,8 @@ def _extract_n_function_code(self, video_id, player_url):
func_name = self._extract_n_function_name(jscode, player_url=player_url) func_name = self._extract_n_function_name(jscode, player_url=player_url)
# XXX: Workaround for the global array variable and lack of `typeof` implementation # XXX: Workaround for the global array variable and lack of `typeof` implementation
func_code = self._fixup_n_function_code(*jsi.extract_function_code(func_name), jscode) func_code = self._fixup_n_function_code(*jsi.extract_function_code(func_name), jscode, player_url)
self.cache.store('youtube-nsig', player_id, func_code)
return jsi, player_id, func_code return jsi, player_id, func_code
def _extract_n_function_from_code(self, jsi, func_code): def _extract_n_function_from_code(self, jsi, func_code):
@@ -2233,23 +2336,27 @@ def _extract_signature_timestamp(self, video_id, player_url, ytcfg=None, fatal=F
Extract signatureTimestamp (sts) Extract signatureTimestamp (sts)
Required to tell API what sig/player version is in use. Required to tell API what sig/player version is in use.
""" """
sts = None if sts := traverse_obj(ytcfg, ('STS', {int_or_none})):
if isinstance(ytcfg, dict): return sts
sts = int_or_none(ytcfg.get('STS'))
if not sts: if not player_url:
# Attempt to extract from player error_msg = 'Cannot extract signature timestamp without player url'
if player_url is None:
error_msg = 'Cannot extract signature timestamp without player_url.'
if fatal: if fatal:
raise ExtractorError(error_msg) raise ExtractorError(error_msg)
self.report_warning(error_msg) self.report_warning(error_msg)
return return None
code = self._load_player(video_id, player_url, fatal=fatal)
if code: sts = self._load_player_data_from_cache('sts', player_url)
if sts:
return sts
if code := self._load_player(video_id, player_url, fatal=fatal):
sts = int_or_none(self._search_regex( sts = int_or_none(self._search_regex(
r'(?:signatureTimestamp|sts)\s*:\s*(?P<sts>[0-9]{5})', code, r'(?:signatureTimestamp|sts)\s*:\s*(?P<sts>[0-9]{5})', code,
'JS player signature timestamp', group='sts', fatal=fatal)) 'JS player signature timestamp', group='sts', fatal=fatal))
if sts:
self._store_player_data_to_cache('sts', player_url, sts)
return sts return sts
def _mark_watched(self, video_id, player_responses): def _mark_watched(self, video_id, player_responses):
@@ -3131,12 +3238,16 @@ def build_fragments(f):
fmt_url = url_or_none(try_get(sc, lambda x: x['url'][0])) fmt_url = url_or_none(try_get(sc, lambda x: x['url'][0]))
encrypted_sig = try_get(sc, lambda x: x['s'][0]) encrypted_sig = try_get(sc, lambda x: x['s'][0])
if not all((sc, fmt_url, player_url, encrypted_sig)): if not all((sc, fmt_url, player_url, encrypted_sig)):
self.report_warning( msg = f'Some {client_name} client https formats have been skipped as they are missing a url. '
f'Some {client_name} client https formats have been skipped as they are missing a url. ' if client_name == 'web':
f'{"Your account" if self.is_authenticated else "The current session"} may have ' msg += 'YouTube is forcing SABR streaming for this client. '
f'the SSAP (server-side ads) experiment which interferes with yt-dlp. ' else:
f'Please see https://github.com/yt-dlp/yt-dlp/issues/12482 for more details.', msg += (
video_id, only_once=True) f'YouTube may have enabled the SABR-only or Server-Side Ad Placement experiment for '
f'{"your account" if self.is_authenticated else "the current session"}. '
)
msg += 'See https://github.com/yt-dlp/yt-dlp/issues/12482 for more details'
self.report_warning(msg, video_id, only_once=True)
continue continue
try: try:
fmt_url += '&{}={}'.format( fmt_url += '&{}={}'.format(
@@ -3160,7 +3271,8 @@ def build_fragments(f):
if player_url: if player_url:
self.report_warning( self.report_warning(
f'nsig extraction failed: Some formats may be missing\n' f'nsig extraction failed: Some formats may be missing\n'
f' n = {query["n"][0]} ; player = {player_url}', f' n = {query["n"][0]} ; player = {player_url}\n'
f' {bug_reports_message(before="")}',
video_id=video_id, only_once=True) video_id=video_id, only_once=True)
self.write_debug(e, only_once=True) self.write_debug(e, only_once=True)
else: else:
@@ -3178,7 +3290,7 @@ def build_fragments(f):
is_damaged = try_call(lambda: format_duration < duration // 2) is_damaged = try_call(lambda: format_duration < duration // 2)
if is_damaged: if is_damaged:
self.report_warning( self.report_warning(
f'{video_id}: Some formats are possibly damaged. They will be deprioritized', only_once=True) 'Some formats are possibly damaged. They will be deprioritized', video_id, only_once=True)
po_token = fmt.get(STREAMING_DATA_INITIAL_PO_TOKEN) po_token = fmt.get(STREAMING_DATA_INITIAL_PO_TOKEN)
@@ -3222,8 +3334,8 @@ def build_fragments(f):
'width': int_or_none(fmt.get('width')), 'width': int_or_none(fmt.get('width')),
'language': join_nonempty(language_code, 'desc' if is_descriptive else '') or None, 'language': join_nonempty(language_code, 'desc' if is_descriptive else '') or None,
'language_preference': PREFERRED_LANG_VALUE if is_original else 5 if is_default else -10 if is_descriptive else -1, 'language_preference': PREFERRED_LANG_VALUE if is_original else 5 if is_default else -10 if is_descriptive else -1,
# Strictly de-prioritize broken, damaged and 3gp formats # Strictly de-prioritize damaged and 3gp formats
'preference': -20 if require_po_token else -10 if is_damaged else -2 if itag == '17' else None, 'preference': -10 if is_damaged else -2 if itag == '17' else None,
} }
mime_mobj = re.match( mime_mobj = re.match(
r'((?:[^/]+)/(?:[^;]+))(?:;\s*codecs="([^"]+)")?', fmt.get('mimeType') or '') r'((?:[^/]+)/(?:[^;]+))(?:;\s*codecs="([^"]+)")?', fmt.get('mimeType') or '')
@@ -3544,6 +3656,15 @@ def feed_entry(name):
if 'sign in' in reason.lower(): if 'sign in' in reason.lower():
reason = remove_end(reason, 'This helps protect our community. Learn more') reason = remove_end(reason, 'This helps protect our community. Learn more')
reason = f'{remove_end(reason.strip(), ".")}. {self._youtube_login_hint}' reason = f'{remove_end(reason.strip(), ".")}. {self._youtube_login_hint}'
elif get_first(playability_statuses, ('errorScreen', 'playerCaptchaViewModel', {dict})):
reason += '. YouTube is requiring a captcha challenge before playback'
elif "This content isn't available, try again later" in reason:
reason = (
f'{remove_end(reason.strip(), ".")}. {"Your account" if self.is_authenticated else "The current session"} '
f'has been rate-limited by YouTube for up to an hour. It is recommended to use `-t sleep` to add a delay '
f'between video requests to avoid exceeding the rate limit. For more information, refer to '
f'https://github.com/yt-dlp/yt-dlp/wiki/Extractors#this-content-isnt-available-try-again-later'
)
self.raise_no_formats(reason, expected=True) self.raise_no_formats(reason, expected=True)
keywords = get_first(video_details, 'keywords', expected_type=list) or [] keywords = get_first(video_details, 'keywords', expected_type=list) or []
@@ -3772,7 +3893,7 @@ def process_language(container, base_url, lang_code, sub_name, query):
if not traverse_obj(initial_data, 'contents'): if not traverse_obj(initial_data, 'contents'):
self.report_warning('Incomplete data received in embedded initial data; re-fetching using API.') self.report_warning('Incomplete data received in embedded initial data; re-fetching using API.')
initial_data = None initial_data = None
if not initial_data: if not initial_data and 'initial_data' not in self._configuration_arg('player_skip'):
query = {'videoId': video_id} query = {'videoId': video_id}
query.update(self._get_checkok_params()) query.update(self._get_checkok_params())
initial_data = self._extract_response( initial_data = self._extract_response(

File diff suppressed because it is too large Load Diff

View File

@@ -188,6 +188,7 @@ def js_number_to_string(val: float, radix: int = 10):
_NAME_RE = r'[a-zA-Z_$][\w$]*' _NAME_RE = r'[a-zA-Z_$][\w$]*'
_MATCHING_PARENS = dict(zip(*zip('()', '{}', '[]'))) _MATCHING_PARENS = dict(zip(*zip('()', '{}', '[]')))
_QUOTES = '\'"/' _QUOTES = '\'"/'
_NESTED_BRACKETS = r'[^[\]]+(?:\[[^[\]]+(?:\[[^\]]+\])?\])?'
class JS_Undefined: class JS_Undefined:
@@ -606,15 +607,18 @@ def dict_item(key, val):
m = re.match(fr'''(?x) m = re.match(fr'''(?x)
(?P<assign> (?P<assign>
(?P<out>{_NAME_RE})(?:\[(?P<index>[^\]]+?)\])?\s* (?P<out>{_NAME_RE})(?:\[(?P<index>{_NESTED_BRACKETS})\])?\s*
(?P<op>{"|".join(map(re.escape, set(_OPERATORS) - _COMP_OPERATORS))})? (?P<op>{"|".join(map(re.escape, set(_OPERATORS) - _COMP_OPERATORS))})?
=(?!=)(?P<expr>.*)$ =(?!=)(?P<expr>.*)$
)|(?P<return> )|(?P<return>
(?!if|return|true|false|null|undefined|NaN)(?P<name>{_NAME_RE})$ (?!if|return|true|false|null|undefined|NaN)(?P<name>{_NAME_RE})$
)|(?P<attribute>
(?P<var>{_NAME_RE})(?:
(?P<nullish>\?)?\.(?P<member>[^(]+)|
\[(?P<member2>{_NESTED_BRACKETS})\]
)\s*
)|(?P<indexing> )|(?P<indexing>
(?P<in>{_NAME_RE})\[(?P<idx>.+)\]$ (?P<in>{_NAME_RE})\[(?P<idx>.+)\]$
)|(?P<attribute>
(?P<var>{_NAME_RE})(?:(?P<nullish>\?)?\.(?P<member>[^(]+)|\[(?P<member2>[^\]]+)\])\s*
)|(?P<function> )|(?P<function>
(?P<fname>{_NAME_RE})\((?P<args>.*)\)$ (?P<fname>{_NAME_RE})\((?P<args>.*)\)$
)''', expr) )''', expr)
@@ -707,7 +711,7 @@ def eval_method():
if obj is NO_DEFAULT: if obj is NO_DEFAULT:
if variable not in self._objects: if variable not in self._objects:
try: try:
self._objects[variable] = self.extract_object(variable) self._objects[variable] = self.extract_object(variable, local_vars)
except self.Exception: except self.Exception:
if not nullish: if not nullish:
raise raise
@@ -847,7 +851,7 @@ def interpret_expression(self, expr, local_vars, allow_recursion):
raise self.Exception('Cannot return from an expression', expr) raise self.Exception('Cannot return from an expression', expr)
return ret return ret
def extract_object(self, objname): def extract_object(self, objname, *global_stack):
_FUNC_NAME_RE = r'''(?:[a-zA-Z$0-9]+|"[a-zA-Z$0-9]+"|'[a-zA-Z$0-9]+')''' _FUNC_NAME_RE = r'''(?:[a-zA-Z$0-9]+|"[a-zA-Z$0-9]+"|'[a-zA-Z$0-9]+')'''
obj = {} obj = {}
obj_m = re.search( obj_m = re.search(
@@ -869,7 +873,8 @@ def extract_object(self, objname):
for f in fields_m: for f in fields_m:
argnames = f.group('args').split(',') argnames = f.group('args').split(',')
name = remove_quotes(f.group('key')) name = remove_quotes(f.group('key'))
obj[name] = function_with_repr(self.build_function(argnames, f.group('code')), f'F<{name}>') obj[name] = function_with_repr(
self.build_function(argnames, f.group('code'), *global_stack), f'F<{name}>')
return obj return obj

View File

@@ -3,6 +3,7 @@
from .common import ( from .common import (
HEADRequest, HEADRequest,
PATCHRequest,
PUTRequest, PUTRequest,
Request, Request,
RequestDirector, RequestDirector,

View File

@@ -1,6 +1,7 @@
from __future__ import annotations from __future__ import annotations
import io import io
import itertools
import math import math
import re import re
import urllib.parse import urllib.parse
@@ -31,9 +32,9 @@
curl_cffi_version = tuple(map(int, re.split(r'[^\d]+', curl_cffi.__version__)[:3])) curl_cffi_version = tuple(map(int, re.split(r'[^\d]+', curl_cffi.__version__)[:3]))
if curl_cffi_version != (0, 5, 10) and not ((0, 7, 0) <= curl_cffi_version < (0, 7, 2)): if curl_cffi_version != (0, 5, 10) and not (0, 10) <= curl_cffi_version:
curl_cffi._yt_dlp__version = f'{curl_cffi.__version__} (unsupported)' curl_cffi._yt_dlp__version = f'{curl_cffi.__version__} (unsupported)'
raise ImportError('Only curl_cffi versions 0.5.10, 0.7.0 and 0.7.1 are supported') raise ImportError('Only curl_cffi versions 0.5.10 and 0.10.x are supported')
import curl_cffi.requests import curl_cffi.requests
from curl_cffi.const import CurlECode, CurlOpt from curl_cffi.const import CurlECode, CurlOpt
@@ -97,7 +98,7 @@ def read(self, amt=None):
return self.fp.read(amt) return self.fp.read(amt)
except curl_cffi.requests.errors.RequestsError as e: except curl_cffi.requests.errors.RequestsError as e:
if e.code == CurlECode.PARTIAL_FILE: if e.code == CurlECode.PARTIAL_FILE:
content_length = int_or_none(e.response.headers.get('Content-Length')) content_length = e.response and int_or_none(e.response.headers.get('Content-Length'))
raise IncompleteRead( raise IncompleteRead(
partial=self.fp.bytes_read, partial=self.fp.bytes_read,
expected=content_length - self.fp.bytes_read if content_length is not None else None, expected=content_length - self.fp.bytes_read if content_length is not None else None,
@@ -105,6 +106,51 @@ def read(self, amt=None):
raise TransportError(cause=e) from e raise TransportError(cause=e) from e
# See: https://github.com/lexiforest/curl_cffi?tab=readme-ov-file#supported-impersonate-browsers
# https://github.com/lexiforest/curl-impersonate?tab=readme-ov-file#supported-browsers
BROWSER_TARGETS: dict[tuple[int, ...], dict[str, ImpersonateTarget]] = {
(0, 5): {
'chrome99': ImpersonateTarget('chrome', '99', 'windows', '10'),
'chrome99_android': ImpersonateTarget('chrome', '99', 'android', '12'),
'chrome100': ImpersonateTarget('chrome', '100', 'windows', '10'),
'chrome101': ImpersonateTarget('chrome', '101', 'windows', '10'),
'chrome104': ImpersonateTarget('chrome', '104', 'windows', '10'),
'chrome107': ImpersonateTarget('chrome', '107', 'windows', '10'),
'chrome110': ImpersonateTarget('chrome', '110', 'windows', '10'),
'edge99': ImpersonateTarget('edge', '99', 'windows', '10'),
'edge101': ImpersonateTarget('edge', '101', 'windows', '10'),
'safari15_3': ImpersonateTarget('safari', '15.3', 'macos', '11'),
'safari15_5': ImpersonateTarget('safari', '15.5', 'macos', '12'),
},
(0, 7): {
'chrome116': ImpersonateTarget('chrome', '116', 'windows', '10'),
'chrome119': ImpersonateTarget('chrome', '119', 'macos', '14'),
'chrome120': ImpersonateTarget('chrome', '120', 'macos', '14'),
'chrome123': ImpersonateTarget('chrome', '123', 'macos', '14'),
'chrome124': ImpersonateTarget('chrome', '124', 'macos', '14'),
'safari17_0': ImpersonateTarget('safari', '17.0', 'macos', '14'),
'safari17_2_ios': ImpersonateTarget('safari', '17.2', 'ios', '17.2'),
},
(0, 9): {
'safari15_3': ImpersonateTarget('safari', '15.3', 'macos', '14'),
'safari15_5': ImpersonateTarget('safari', '15.5', 'macos', '14'),
'chrome119': ImpersonateTarget('chrome', '119', 'macos', '14'),
'chrome120': ImpersonateTarget('chrome', '120', 'macos', '14'),
'chrome123': ImpersonateTarget('chrome', '123', 'macos', '14'),
'chrome124': ImpersonateTarget('chrome', '124', 'macos', '14'),
'chrome131': ImpersonateTarget('chrome', '131', 'macos', '14'),
'chrome131_android': ImpersonateTarget('chrome', '131', 'android', '14'),
'chrome133a': ImpersonateTarget('chrome', '133', 'macos', '15'),
'firefox133': ImpersonateTarget('firefox', '133', 'macos', '14'),
'safari18_0': ImpersonateTarget('safari', '18.0', 'macos', '15'),
'safari18_0_ios': ImpersonateTarget('safari', '18.0', 'ios', '18.0'),
},
(0, 10): {
'firefox135': ImpersonateTarget('firefox', '135', 'macos', '14'),
},
}
@register_rh @register_rh
class CurlCFFIRH(ImpersonateRequestHandler, InstanceStoreMixin): class CurlCFFIRH(ImpersonateRequestHandler, InstanceStoreMixin):
RH_NAME = 'curl_cffi' RH_NAME = 'curl_cffi'
@@ -112,30 +158,21 @@ class CurlCFFIRH(ImpersonateRequestHandler, InstanceStoreMixin):
_SUPPORTED_FEATURES = (Features.NO_PROXY, Features.ALL_PROXY) _SUPPORTED_FEATURES = (Features.NO_PROXY, Features.ALL_PROXY)
_SUPPORTED_PROXY_SCHEMES = ('http', 'https', 'socks4', 'socks4a', 'socks5', 'socks5h') _SUPPORTED_PROXY_SCHEMES = ('http', 'https', 'socks4', 'socks4a', 'socks5', 'socks5h')
_SUPPORTED_IMPERSONATE_TARGET_MAP = { _SUPPORTED_IMPERSONATE_TARGET_MAP = {
**({ target: name if curl_cffi_version >= (0, 9) else curl_cffi.requests.BrowserType[name]
ImpersonateTarget('chrome', '124', 'macos', '14'): curl_cffi.requests.BrowserType.chrome124, for name, target in dict(sorted(itertools.chain.from_iterable(
ImpersonateTarget('chrome', '123', 'macos', '14'): curl_cffi.requests.BrowserType.chrome123, targets.items()
ImpersonateTarget('chrome', '120', 'macos', '14'): curl_cffi.requests.BrowserType.chrome120, for version, targets in BROWSER_TARGETS.items()
ImpersonateTarget('chrome', '119', 'macos', '14'): curl_cffi.requests.BrowserType.chrome119, if curl_cffi_version >= version
ImpersonateTarget('chrome', '116', 'windows', '10'): curl_cffi.requests.BrowserType.chrome116, ), key=lambda x: (
} if curl_cffi_version >= (0, 7, 0) else {}), # deprioritize mobile targets since they give very different behavior
ImpersonateTarget('chrome', '110', 'windows', '10'): curl_cffi.requests.BrowserType.chrome110, x[1].os not in ('ios', 'android'),
ImpersonateTarget('chrome', '107', 'windows', '10'): curl_cffi.requests.BrowserType.chrome107, # prioritize edge < firefox < safari < chrome
ImpersonateTarget('chrome', '104', 'windows', '10'): curl_cffi.requests.BrowserType.chrome104, ('edge', 'firefox', 'safari', 'chrome').index(x[1].client),
ImpersonateTarget('chrome', '101', 'windows', '10'): curl_cffi.requests.BrowserType.chrome101, # prioritize newest version
ImpersonateTarget('chrome', '100', 'windows', '10'): curl_cffi.requests.BrowserType.chrome100, float(x[1].version) if x[1].version else 0,
ImpersonateTarget('chrome', '99', 'windows', '10'): curl_cffi.requests.BrowserType.chrome99, # group by os name
ImpersonateTarget('edge', '101', 'windows', '10'): curl_cffi.requests.BrowserType.edge101, x[1].os,
ImpersonateTarget('edge', '99', 'windows', '10'): curl_cffi.requests.BrowserType.edge99, ), reverse=True)).items()
**({
ImpersonateTarget('safari', '17.0', 'macos', '14'): curl_cffi.requests.BrowserType.safari17_0,
} if curl_cffi_version >= (0, 7, 0) else {}),
ImpersonateTarget('safari', '15.5', 'macos', '12'): curl_cffi.requests.BrowserType.safari15_5,
ImpersonateTarget('safari', '15.3', 'macos', '11'): curl_cffi.requests.BrowserType.safari15_3,
ImpersonateTarget('chrome', '99', 'android', '12'): curl_cffi.requests.BrowserType.chrome99_android,
**({
ImpersonateTarget('safari', '17.2', 'ios', '17.2'): curl_cffi.requests.BrowserType.safari17_2_ios,
} if curl_cffi_version >= (0, 7, 0) else {}),
} }
def _create_instance(self, cookiejar=None): def _create_instance(self, cookiejar=None):

View File

@@ -505,6 +505,7 @@ def copy(self):
HEADRequest = functools.partial(Request, method='HEAD') HEADRequest = functools.partial(Request, method='HEAD')
PATCHRequest = functools.partial(Request, method='PATCH')
PUTRequest = functools.partial(Request, method='PUT') PUTRequest = functools.partial(Request, method='PUT')

View File

@@ -150,6 +150,15 @@ def format_option_strings(option):
return opts return opts
_PRESET_ALIASES = {
'mp3': ['-f', 'ba[acodec^=mp3]/ba/b', '-x', '--audio-format', 'mp3'],
'aac': ['-f', 'ba[acodec^=aac]/ba[acodec^=mp4a.40.]/ba/b', '-x', '--audio-format', 'aac'],
'mp4': ['--merge-output-format', 'mp4', '--remux-video', 'mp4', '-S', 'vcodec:h264,lang,quality,res,fps,hdr:12,acodec:aac'],
'mkv': ['--merge-output-format', 'mkv', '--remux-video', 'mkv'],
'sleep': ['--sleep-subtitles', '5', '--sleep-requests', '0.75', '--sleep-interval', '10', '--max-sleep-interval', '20'],
}
class _YoutubeDLOptionParser(optparse.OptionParser): class _YoutubeDLOptionParser(optparse.OptionParser):
# optparse is deprecated since Python 3.2. So assume a stable interface even for private methods # optparse is deprecated since Python 3.2. So assume a stable interface even for private methods
ALIAS_DEST = '_triggered_aliases' ALIAS_DEST = '_triggered_aliases'
@@ -215,6 +224,22 @@ def _match_long_opt(self, opt):
return e.possibilities[0] return e.possibilities[0]
raise raise
def format_option_help(self, formatter=None):
assert formatter, 'Formatter can not be None'
formatted_help = super().format_option_help(formatter=formatter)
formatter.indent()
heading = formatter.format_heading('Preset Aliases')
formatter.indent()
result = []
for name, args in _PRESET_ALIASES.items():
option = optparse.Option('-t', help=shlex.join(args))
formatter.option_strings[option] = f'-t {name}'
result.append(formatter.format_option(option))
formatter.dedent()
formatter.dedent()
help_lines = '\n'.join(result)
return f'{formatted_help}\n{heading}{help_lines}'
def create_parser(): def create_parser():
def _list_from_options_callback(option, opt_str, value, parser, append=True, delim=',', process=str.strip): def _list_from_options_callback(option, opt_str, value, parser, append=True, delim=',', process=str.strip):
@@ -317,6 +342,13 @@ def _alias_callback(option, opt_str, value, parser, opts, nargs):
parser.rargs[:0] = shlex.split( parser.rargs[:0] = shlex.split(
opts if value is None else opts.format(*map(shlex.quote, value))) opts if value is None else opts.format(*map(shlex.quote, value)))
def _preset_alias_callback(option, opt_str, value, parser):
if not value:
return
if value not in _PRESET_ALIASES:
raise optparse.OptionValueError(f'Unknown preset alias: {value}')
parser.rargs[:0] = _PRESET_ALIASES[value]
general = optparse.OptionGroup(parser, 'General Options') general = optparse.OptionGroup(parser, 'General Options')
general.add_option( general.add_option(
'-h', '--help', dest='print_help', action='store_true', '-h', '--help', dest='print_help', action='store_true',
@@ -500,7 +532,8 @@ def _alias_callback(option, opt_str, value, parser, opts, nargs):
'youtube-dlc': ['all', '-no-youtube-channel-redirect', '-no-live-chat', '-playlist-match-filter', '-manifest-filesize-approx', '-allow-unsafe-ext', '-prefer-vp9-sort'], 'youtube-dlc': ['all', '-no-youtube-channel-redirect', '-no-live-chat', '-playlist-match-filter', '-manifest-filesize-approx', '-allow-unsafe-ext', '-prefer-vp9-sort'],
'2021': ['2022', 'no-certifi', 'filename-sanitization'], '2021': ['2022', 'no-certifi', 'filename-sanitization'],
'2022': ['2023', 'no-external-downloader-progress', 'playlist-match-filter', 'prefer-legacy-http-handler', 'manifest-filesize-approx'], '2022': ['2023', 'no-external-downloader-progress', 'playlist-match-filter', 'prefer-legacy-http-handler', 'manifest-filesize-approx'],
'2023': ['prefer-vp9-sort'], '2023': ['2024', 'prefer-vp9-sort'],
'2024': [],
}, },
}, help=( }, help=(
'Options that can help keep compatibility with youtube-dl or youtube-dlc ' 'Options that can help keep compatibility with youtube-dl or youtube-dlc '
@@ -518,6 +551,15 @@ def _alias_callback(option, opt_str, value, parser, opts, nargs):
'Alias options can trigger more aliases; so be careful to avoid defining recursive options. ' 'Alias options can trigger more aliases; so be careful to avoid defining recursive options. '
f'As a safety measure, each alias may be triggered a maximum of {_YoutubeDLOptionParser.ALIAS_TRIGGER_LIMIT} times. ' f'As a safety measure, each alias may be triggered a maximum of {_YoutubeDLOptionParser.ALIAS_TRIGGER_LIMIT} times. '
'This option can be used multiple times')) 'This option can be used multiple times'))
general.add_option(
'-t', '--preset-alias',
metavar='PRESET', dest='_', type='str',
action='callback', callback=_preset_alias_callback,
help=(
'Applies a predefined set of options. e.g. --preset-alias mp3. '
f'The following presets are available: {", ".join(_PRESET_ALIASES)}. '
'See the "Preset Aliases" section at the end for more info. '
'This option can be used multiple times'))
network = optparse.OptionGroup(parser, 'Network Options') network = optparse.OptionGroup(parser, 'Network Options')
network.add_option( network.add_option(

View File

@@ -202,7 +202,7 @@ class UpdateInfo:
requested_version: str | None = None requested_version: str | None = None
commit: str | None = None commit: str | None = None
binary_name: str | None = _get_binary_name() # noqa: RUF009: Always returns the same value binary_name: str | None = _get_binary_name() # noqa: RUF009 # Always returns the same value
checksum: str | None = None checksum: str | None = None

View File

@@ -54,7 +54,7 @@
from ..dependencies import xattr from ..dependencies import xattr
from ..globals import IN_CLI from ..globals import IN_CLI
__name__ = __name__.rsplit('.', 1)[0] # noqa: A001: Pretend to be the parent module __name__ = __name__.rsplit('.', 1)[0] # noqa: A001 # Pretend to be the parent module
class NO_DEFAULT: class NO_DEFAULT:
@@ -2044,7 +2044,7 @@ def url_or_none(url):
if not url or not isinstance(url, str): if not url or not isinstance(url, str):
return None return None
url = url.strip() url = url.strip()
return url if re.match(r'(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?):)?//', url) else None return url if re.match(r'(?:(?:https?|rt(?:m(?:pt?[es]?|fp)|sp[su]?)|mms|ftps?|wss?):)?//', url) else None
def strftime_or_none(timestamp, date_format='%Y%m%d', default=None): def strftime_or_none(timestamp, date_format='%Y%m%d', default=None):
@@ -2767,6 +2767,7 @@ def process_escape(match):
def template_substitute(match): def template_substitute(match):
evaluated = js_to_json(match.group(1), vars, strict=strict) evaluated = js_to_json(match.group(1), vars, strict=strict)
if evaluated[0] == '"': if evaluated[0] == '"':
with contextlib.suppress(json.JSONDecodeError):
return json.loads(evaluated) return json.loads(evaluated)
return evaluated return evaluated

View File

@@ -1,8 +1,8 @@
# Autogenerated by devscripts/update-version.py # Autogenerated by devscripts/update-version.py
__version__ = '2025.03.21' __version__ = '2025.04.30'
RELEASE_GIT_HEAD = 'f36e4b6e65cb8403791aae2f520697115cb88dec' RELEASE_GIT_HEAD = '505b400795af557bdcfd9d4fa7e9133b26ef431c'
VARIANT = None VARIANT = None
@@ -12,4 +12,4 @@
ORIGIN = 'yt-dlp/yt-dlp' ORIGIN = 'yt-dlp/yt-dlp'
_pkg_version = '2025.03.21' _pkg_version = '2025.04.30'