mirror of https://github.com/yt-dlp/yt-dlp synced 2025-12-15 13:45:41 +07:00

Compare commits


46 Commits

Author SHA1 Message Date
github-actions[bot]
aa220d0aaa Release 2025.12.08
Created by: bashonly

:ci skip all
2025-12-08 00:06:43 +00:00
bashonly
7a52ff29d8 [cleanup] Misc (#15016)
Closes #15160, Closes #15184
Authored by: bashonly, seproDev, RezSat, oxyzenQ

Co-authored-by: sepro <sepro@sepr0.com>
Co-authored-by: Yehan Wasura <yehantest@gmail.com>
Co-authored-by: rezky_nightky <with.rezky@gmail.com>
2025-12-07 23:58:34 +00:00
bashonly
0c7e4cfcae [ie/youtube] Update ejs to 0.3.2 (#15267)
Authored by: bashonly
2025-12-07 23:51:49 +00:00
bashonly
29fe515d8d [devscripts] install_deps: Align options/terms with PEP 735 (#15200)
Authored by: bashonly
2025-12-07 23:39:05 +00:00
bashonly
1d43fa5af8 [ie/youtube] Improve message when no JS runtime is found (#15266)
Closes #15158
Authored by: bashonly
2025-12-07 23:37:03 +00:00
bashonly
fa16dc5241 [cookies] Fix --cookies-from-browser for new installs of Firefox 147+ (#15215)
Ref: https://bugzilla.mozilla.org/show_bug.cgi?id=259356

Authored by: bashonly, mbway

Co-authored-by: Matthew Broadway <mattdbway@gmail.com>
2025-12-07 23:20:02 +00:00
garret1317
04050be583 [pp/FFmpegMetadata] Add more tag mappings (#14654)
Authored by: garret1317
2025-12-07 23:04:03 +00:00
Simon Sawicki
7bd79d9296 [ie/youtube] Allow ejs patch version to differ (#15263)
Authored by: Grub4K
2025-12-07 22:10:53 +00:00
0x∅
29e2570378 [ie/xhamster] Fix extractor (#15252)
Closes #15239
Authored by: 0xvd
2025-12-06 22:12:38 +00:00
sepro
c70b57c03e [ie/Alibaba] Add extractor (#15253)
Closes #13774
Authored by: seproDev
2025-12-06 22:24:03 +01:00
bashonly
025191fea6 [ie/sporteurope] Support new domain (#15251)
Closes #15250
Authored by: bashonly
2025-12-06 21:16:05 +00:00
bashonly
36b29bb353 [ie/loom] Fix extractor (#15236)
Closes #15141
Authored by: bashonly
2025-12-05 23:18:02 +00:00
sepro
7ec6b9bc40 [ie/web.archive:youtube] Fix extractor (#15234)
Closes #15233
Authored by: seproDev
2025-12-04 18:15:09 +01:00
WhatAmISupposedToPutHere
f7acf3c1f4 [ie/youtube] Add use_ad_playback_context extractor-arg (#15220)
Closes #15144
Authored by: WhatAmISupposedToPutHere
2025-12-03 23:26:20 +00:00
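This extractor-arg can be toggled from the CLI with `--extractor-args "youtube:use_ad_playback_context=true"` or via the embedding API. A minimal sketch, assuming the documented `{'ie_key': {'arg': [values]}}` shape of the `extractor_args` parameter; the URL is just yt-dlp's usual test video:

```python
# Sketch: enabling the new extractor-arg through the embedding API
# (the extractor_args shape follows yt-dlp's documented parameter format)
import yt_dlp

ydl_opts = {
    'extractor_args': {'youtube': {'use_ad_playback_context': ['true']}},
}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
```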
bashonly
017d76edcf [ie/youtube] Revert 56ea3a00ea
Remove `request_no_ads` workaround (#15214)

Closes #15212
Authored by: bashonly
2025-12-01 05:01:22 +00:00
WhatAmISupposedToPutHere
56ea3a00ea [ie/youtube] Add request_no_ads extractor-arg (#15145)
Default is `true` for unauthenticated users.
Default is `false` if logged-in cookies have been passed to yt-dlp.
Using `true` results in a loss of premium formats.

Closes #15144
Authored by: WhatAmISupposedToPutHere
2025-12-01 01:02:58 +00:00
Zer0 Spectrum
2a777ecbd5 [ie/tubitv:series] Fix extractor (#15018)
Authored by: Zer0spectrum
2025-12-01 00:33:14 +00:00
thomasmllt
023e4db9af [ie/patreon:campaign] Fix extractor (#15108)
Closes #15094
Authored by: thomasmllt
2025-11-30 23:59:28 +00:00
Zer0 Spectrum
4433b3a217 [ie/fc2:live] Raise appropriate error when stream is offline (#15180)
Closes #15179
Authored by: Zer0spectrum
2025-11-30 23:54:17 +00:00
bashonly
419776ecf5 [ie/youtube] Extract all automatic caption languages (#15156)
Closes #14889, Closes #15150
Authored by: bashonly
2025-11-30 23:35:05 +00:00
bashonly
2801650268 [build] Bump PyInstaller minimum version requirement to 6.17.0 (#15199)
Ref: https://github.com/pyinstaller/pyinstaller/issues/9149

Authored by: bashonly
2025-11-29 21:18:49 +00:00
sepro
26c2545b87 [ie/S4C] Fix geo-restricted content (#15196)
Closes #15190
Authored by: seproDev
2025-11-28 23:14:03 +01:00
garret1317
12d411722a [ie/nhk] Fix extractors (#14528)
Closes #14223, Closes #14589
Authored by: garret1317
2025-11-24 11:27:43 +00:00
Simon Sawicki
e564b4a808 Respect PATHEXT when locating JS runtime on Windows (#15117)
Fixes #15043

Authored by: Grub4K
2025-11-24 01:56:43 +01:00
WhatAmISupposedToPutHere
715af0c636 [ie/youtube] Determine wait time from player response (#14646)
Closes #14645
Authored by: WhatAmISupposedToPutHere, bashonly

Co-authored-by: bashonly <88596187+bashonly@users.noreply.github.com>
2025-11-23 00:49:36 +00:00
Sojiroh
0c696239ef [ie/WistiaChannel] Fix extractor (#14218)
Closes #14204
Authored by: Sojiroh
2025-11-21 23:08:20 +00:00
putridambassador121
3cb5e4db54 [ie/AGalega] Add extractor (#15105)
Closes #14758
Authored by: putridambassador121
2025-11-21 20:07:07 +01:00
Elioo
6842620d56 [ie/Digiteka] Rework extractor (#14903)
Closes #12454
Authored by: beliote
2025-11-20 20:01:07 +01:00
Michael D.
20f83f208e [ie/netapp] Add extractors (#15122)
Closes #14902
Authored by: darkstar
2025-11-20 19:56:25 +01:00
sepro
c2e7e9cdb2 [ie/URPlay] Fix extractor (#15120)
Closes #13028
Authored by: seproDev
2025-11-20 16:22:45 +01:00
bashonly
2c9f0c3456 [ie/sproutvideo] Fix extractor (#15113)
Closes #15112
Authored by: bashonly
2025-11-19 18:17:29 +00:00
bashonly
0eed3fe530 [pp/ffmpeg] Fix uncaught error if bad --ffmpeg-location is given (#15104)
Revert 9f77e04c76

Closes #12829
Authored by: bashonly
2025-11-19 00:23:00 +00:00
sepro
a4c72acc46 [ie/MedalTV] Rework extractor (#15103)
Closes #15102
Authored by: seproDev
2025-11-19 00:52:55 +01:00
bashonly
9daba4f442 [ie/thisoldhouse] Fix login support (#15097)
Closes #14931
Authored by: bashonly
2025-11-18 23:08:21 +00:00
Mr Flamel
854fded114 [ie/TheChosen] Add extractors (#14183)
Closes #11246
Authored by: mrFlamel
2025-11-17 00:17:55 +01:00
Anton Larionov
5f66ac71f6 [ie/mave:channel] Add extractor (#14915)
Authored by: anlar
2025-11-17 00:05:44 +01:00
bashonly
4cb5e191ef [ie/youtube] Detect "super resolution" AI-upscaled formats (#15050)
Closes #14923
Authored by: bashonly
2025-11-16 22:39:22 +00:00
bashonly
6ee6a6fc58 [rh:urllib] Do not read after close (#15049)
Fix regression introduced in 5767fb4ab1

Closes #15017
Authored by: bashonly
2025-11-16 19:07:48 +00:00
bashonly
23f1ab3469 [fd] Fix playback wait time for ffmpeg downloads (#15066)
Authored by: bashonly
2025-11-16 18:15:16 +00:00
Haytam001
af285016d2 [ie/yfanefa] Add extractor (#15032)
Closes #14974
Authored by: Haytam001
2025-11-16 12:02:13 +01:00
sepro
1dd84b9d1c [ie/SoundcloudPlaylist] Support new API URLs (#15071)
Closes #15068
Authored by: seproDev
2025-11-16 00:35:00 +01:00
sepro
b333ef1b3f [ie/floatplane] Add subtitle support (#15069)
Authored by: seproDev
2025-11-15 17:22:17 +01:00
Pedro Ferreira
4e680db150 [ie/NowCanal] Add extractor (#14584)
Authored by: pferreir
2025-11-15 02:28:57 +01:00
sepro
45a3b42bb9 [ie/Bitmovin] Add extractor (#15064)
Authored by: seproDev
2025-11-15 01:43:53 +01:00
Omar Merroun
d6aa8c235d [ie/rinsefm] Fix extractors (#15020)
Closes #14626
Authored by: 1bnBattuta, seproDev

Co-authored-by: sepro <sepro@sepr0.com>
2025-11-14 20:17:30 +01:00
sepro
947e788340 [ie/jtbc] Fix extractor (#15047)
Authored by: seproDev
2025-11-14 18:42:18 +01:00
65 changed files with 1568 additions and 737 deletions

View File

@@ -196,7 +196,7 @@ jobs:
UPDATE_TO: yt-dlp/yt-dlp@2025.09.05
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
with:
fetch-depth: 0 # Needed for changelog
@@ -257,7 +257,7 @@ jobs:
SKIP_ONEFILE_BUILD: ${{ (!matrix.onefile && '1') || '' }}
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
- name: Cache requirements
if: matrix.cache_requirements
@@ -320,7 +320,7 @@ jobs:
UPDATE_TO: yt-dlp/yt-dlp@2025.09.05
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
# NB: Building universal2 does not work with python from actions/setup-python
- name: Cache requirements
@@ -343,14 +343,14 @@ jobs:
brew uninstall --ignore-dependencies python3
python3 -m venv ~/yt-dlp-build-venv
source ~/yt-dlp-build-venv/bin/activate
python3 devscripts/install_deps.py --only-optional-groups --include-group build
python3 devscripts/install_deps.py --print --include-group pyinstaller > requirements.txt
python3 devscripts/install_deps.py --omit-default --include-extra build
python3 devscripts/install_deps.py --print --include-extra pyinstaller > requirements.txt
# We need to ignore wheels otherwise we break universal2 builds
python3 -m pip install -U --no-binary :all: -r requirements.txt
# We need to fuse our own universal2 wheels for curl_cffi
python3 -m pip install -U 'delocate==0.11.0'
mkdir curl_cffi_whls curl_cffi_universal2
python3 devscripts/install_deps.py --print --only-optional-groups --include-group curl-cffi > requirements.txt
python3 devscripts/install_deps.py --print --omit-default --include-extra curl-cffi > requirements.txt
for platform in "macosx_11_0_arm64" "macosx_11_0_x86_64"; do
python3 -m pip download \
--only-binary=:all: \
@@ -422,23 +422,23 @@ jobs:
runner: windows-2025
python_version: '3.10'
platform_tag: win_amd64
pyi_version: '6.16.0'
pyi_tag: '2025.09.13.221251'
pyi_hash: b6496c7630c3afe66900cfa824e8234a8c2e2c81704bd7facd79586abc76c0e5
pyi_version: '6.17.0'
pyi_tag: '2025.11.29.054325'
pyi_hash: e28cc13e4ad0cc74330d832202806d0c1976e9165da6047309348ca663c0ed3d
- arch: 'x86'
runner: windows-2025
python_version: '3.10'
platform_tag: win32
pyi_version: '6.16.0'
pyi_tag: '2025.09.13.221251'
pyi_hash: 2d881843580efdc54f3523507fc6d9c5b6051ee49c743a6d9b7003ac5758c226
pyi_version: '6.17.0'
pyi_tag: '2025.11.29.054325'
pyi_hash: c00f600c17de3bdd589f043f60ab64fc34fcba6dd902ad973af9c8afc74f80d1
- arch: 'arm64'
runner: windows-11-arm
python_version: '3.13' # arm64 only has Python >= 3.11 available
platform_tag: win_arm64
pyi_version: '6.16.0'
pyi_tag: '2025.09.13.221251'
pyi_hash: 4250c9085e34a95c898f3ee2f764914fc36ec59f0d97c28e6a75fcf21f7b144f
pyi_version: '6.17.0'
pyi_tag: '2025.11.29.054325'
pyi_hash: a2033b18b4f7bc6108b5fd76a92c6c1de0a12ec4fe98a23396a9f978cb4b7d7b
env:
CHANNEL: ${{ inputs.channel }}
ORIGIN: ${{ needs.process.outputs.origin }}
@@ -450,7 +450,7 @@ jobs:
PYI_WHEEL: pyinstaller-${{ matrix.pyi_version }}-py3-none-${{ matrix.platform_tag }}.whl
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
with:
python-version: ${{ matrix.python_version }}
@@ -484,11 +484,11 @@ jobs:
mkdir /pyi-wheels
python -m pip download -d /pyi-wheels --no-deps --require-hashes "pyinstaller@${Env:PYI_URL}#sha256=${Env:PYI_HASH}"
python -m pip install --force-reinstall -U "/pyi-wheels/${Env:PYI_WHEEL}"
python devscripts/install_deps.py --only-optional-groups --include-group build
python devscripts/install_deps.py --omit-default --include-extra build
if ("${Env:ARCH}" -eq "x86") {
python devscripts/install_deps.py
} else {
python devscripts/install_deps.py --include-group curl-cffi
python devscripts/install_deps.py --include-extra curl-cffi
}
- name: Prepare

View File

@@ -35,7 +35,7 @@ jobs:
env:
QJS_VERSION: '2025-04-26' # Earliest version with rope strings
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v6
with:
@@ -67,7 +67,7 @@ jobs:
unzip quickjs.zip
- name: Install test requirements
run: |
python ./devscripts/install_deps.py --print --only-optional-groups --include-group test > requirements.txt
python ./devscripts/install_deps.py --print --omit-default --include-extra test > requirements.txt
python ./devscripts/install_deps.py --print -c certifi -c requests -c urllib3 -c yt-dlp-ejs >> requirements.txt
python -m pip install -U -r requirements.txt
- name: Run tests

View File

@@ -2,7 +2,7 @@ name: "CodeQL"
on:
push:
branches: [ 'master', 'gh-pages', 'release' ]
branches: [ 'master' ]
pull_request:
# The branches below must be a subset of the branches above
branches: [ 'master' ]
@@ -11,7 +11,7 @@ on:
jobs:
analyze:
name: Analyze
name: Analyze (${{ matrix.language }})
runs-on: ubuntu-latest
permissions:
actions: read
@@ -21,45 +21,19 @@ jobs:
strategy:
fail-fast: false
matrix:
language: [ 'python' ]
# CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby' ]
# Use only 'java' to analyze code written in Java, Kotlin or both
# Use only 'javascript' to analyze code written in JavaScript, TypeScript or both
# Learn more about CodeQL language support at https://aka.ms/codeql-docs/language-support
language: [ 'actions', 'javascript-typescript', 'python' ]
steps:
- name: Checkout repository
uses: actions/checkout@v5
uses: actions/checkout@v6
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
uses: github/codeql-action/init@v4
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.
# For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
# queries: security-extended,security-and-quality
# Autobuild attempts to build any compiled languages (C/C++, C#, Go, Java, or Swift).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@v3
# Command-line programs to run using the OS shell.
# 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
# If the Autobuild fails above, remove it and uncomment the following three lines.
# modify them (or add more) to build your code if your project, please refer to the EXAMPLE below for guidance.
# - run: |
# echo "Run, Build Application using script"
# ./location_of_script_within_repo/buildscript.sh
build-mode: none
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
uses: github/codeql-action/analyze@v4
with:
category: "/language:${{matrix.language}}"

View File

@@ -55,7 +55,7 @@ jobs:
- os: windows-latest
python-version: pypy-3.11
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
with:
fetch-depth: 0
- name: Set up Python ${{ matrix.python-version }}
@@ -63,7 +63,7 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
- name: Install test requirements
run: python ./devscripts/install_deps.py --include-group test --include-group curl-cffi
run: python ./devscripts/install_deps.py --include-extra test --include-extra curl-cffi
- name: Run tests
timeout-minutes: 15
continue-on-error: False

View File

@@ -9,13 +9,13 @@ jobs:
if: "contains(github.event.head_commit.message, 'ci run dl')"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: '3.10'
- name: Install test requirements
run: python ./devscripts/install_deps.py --include-group dev
run: python ./devscripts/install_deps.py --include-extra dev
- name: Run tests
continue-on-error: true
run: python ./devscripts/run_tests.py download
@@ -36,13 +36,13 @@ jobs:
- os: windows-latest
python-version: pypy-3.11
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v6
with:
python-version: ${{ matrix.python-version }}
- name: Install test requirements
run: python ./devscripts/install_deps.py --include-group dev
run: python ./devscripts/install_deps.py --include-extra dev
- name: Run tests
continue-on-error: true
run: python ./devscripts/run_tests.py download

View File

@@ -9,13 +9,13 @@ jobs:
if: "!contains(github.event.head_commit.message, 'ci skip all')"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
- name: Set up Python 3.10
uses: actions/setup-python@v6
with:
python-version: '3.10'
- name: Install test requirements
run: python ./devscripts/install_deps.py --only-optional-groups --include-group test
run: python ./devscripts/install_deps.py --omit-default --include-extra test
- name: Run tests
timeout-minutes: 15
run: |
@@ -26,12 +26,12 @@ jobs:
if: "!contains(github.event.head_commit.message, 'ci skip all')"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
with:
python-version: '3.10'
- name: Install dev dependencies
run: python ./devscripts/install_deps.py --only-optional-groups --include-group static-analysis
run: python ./devscripts/install_deps.py --omit-default --include-extra static-analysis
- name: Make lazy extractors
run: python ./devscripts/make_lazy_extractors.py
- name: Run ruff

View File

@@ -12,7 +12,7 @@ jobs:
outputs:
commit: ${{ steps.check_for_new_commits.outputs.commit }}
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
with:
fetch-depth: 0
- name: Check for new commits

View File

@@ -75,7 +75,7 @@ jobs:
head_sha: ${{ steps.get_target.outputs.head_sha }}
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
with:
fetch-depth: 0
@@ -170,7 +170,7 @@ jobs:
id-token: write # mandatory for trusted publishing
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
with:
fetch-depth: 0
- uses: actions/setup-python@v6
@@ -180,7 +180,7 @@ jobs:
- name: Install Requirements
run: |
sudo apt -y install pandoc man
python devscripts/install_deps.py --only-optional-groups --include-group build
python devscripts/install_deps.py --omit-default --include-extra build
- name: Prepare
env:
@@ -233,7 +233,7 @@ jobs:
VERSION: ${{ needs.prepare.outputs.version }}
HEAD_SHA: ${{ needs.prepare.outputs.head_sha }}
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
with:
fetch-depth: 0
- uses: actions/download-artifact@v5

View File

@@ -17,8 +17,8 @@ on:
permissions:
contents: read
env:
ACTIONLINT_VERSION: "1.7.8"
ACTIONLINT_SHA256SUM: be92c2652ab7b6d08425428797ceabeb16e31a781c07bc388456b4e592f3e36a
ACTIONLINT_VERSION: "1.7.9"
ACTIONLINT_SHA256SUM: 233b280d05e100837f4af1433c7b40a5dcb306e3aa68fb4f17f8a7f45a7df7b4
ACTIONLINT_REPO: https://github.com/rhysd/actionlint
jobs:
@@ -26,7 +26,7 @@ jobs:
name: Check workflows
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
with:
python-version: "3.10" # Keep this in sync with release.yml's prepare job
@@ -34,7 +34,7 @@ jobs:
env:
ACTIONLINT_TARBALL: ${{ format('actionlint_{0}_linux_amd64.tar.gz', env.ACTIONLINT_VERSION) }}
run: |
python -m devscripts.install_deps --only-optional-groups --include-group test
python -m devscripts.install_deps --omit-default --include-extra test
sudo apt -y install shellcheck
python -m pip install -U pyflakes
curl -LO "${ACTIONLINT_REPO}/releases/download/v${ACTIONLINT_VERSION}/${ACTIONLINT_TARBALL}"

View File

@@ -177,7 +177,7 @@ # DEVELOPER INSTRUCTIONS
```shell
# To only install development dependencies:
$ python -m devscripts.install_deps --include-group dev
$ python -m devscripts.install_deps --include-extra dev
# Or, for an editable install plus dev dependencies:
$ python -m pip install -e ".[default,dev]"
@@ -763,7 +763,7 @@ ### Use convenience conversion and parsing functions
Use `url_or_none` for safe URL processing.
Use `traverse_obj` and `try_call` (superseeds `dict_get` and `try_get`) for safe metadata extraction from parsed JSON.
Use `traverse_obj` and `try_call` (supersedes `dict_get` and `try_get`) for safe metadata extraction from parsed JSON.
Use `unified_strdate` for uniform `upload_date` or any `YYYYMMDD` meta field extraction, `unified_timestamp` for uniform `timestamp` extraction, `parse_filesize` for `filesize` extraction, `parse_count` for count meta fields extraction, `parse_resolution`, `parse_duration` for `duration` extraction, `parse_age_limit` for `age_limit` extraction.

View File

@@ -828,9 +828,18 @@ krystophny
matyb08
pha1n0q
PierceLBrooks
sepro
TheQWERTYCodr
thomasmllt
w4grfw
WeidiDeng
Zer0spectrum
0xvd
1bnBattuta
beliote
darkstar
Haytam001
mrFlamel
oxyzenQ
putridambassador121
RezSat
WhatAmISupposedToPutHere

View File

@@ -4,6 +4,64 @@ # Changelog
# To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
-->
### 2025.12.08
#### Core changes
- [Respect `PATHEXT` when locating JS runtime on Windows](https://github.com/yt-dlp/yt-dlp/commit/e564b4a8080cff48fa0c28f20272c05085ee6130) ([#15117](https://github.com/yt-dlp/yt-dlp/issues/15117)) by [Grub4K](https://github.com/Grub4K)
- **cookies**: [Fix `--cookies-from-browser` for new installs of Firefox 147+](https://github.com/yt-dlp/yt-dlp/commit/fa16dc5241ac1552074feee48e1c2605dc36d352) ([#15215](https://github.com/yt-dlp/yt-dlp/issues/15215)) by [bashonly](https://github.com/bashonly), [mbway](https://github.com/mbway)
#### Extractor changes
- **agalega**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/3cb5e4db54d44fe82d4eee94ae2f37cbce2e7dfc) ([#15105](https://github.com/yt-dlp/yt-dlp/issues/15105)) by [putridambassador121](https://github.com/putridambassador121)
- **alibaba**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/c70b57c03e0c25767a5166620798297a2a4878fb) ([#15253](https://github.com/yt-dlp/yt-dlp/issues/15253)) by [seproDev](https://github.com/seproDev)
- **bitmovin**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/45a3b42bb917e99b0b5c155c272ebf4a82a5bf66) ([#15064](https://github.com/yt-dlp/yt-dlp/issues/15064)) by [seproDev](https://github.com/seproDev)
- **digiteka**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/6842620d56e4c4e6affb90c2f8dff8a36dee852c) ([#14903](https://github.com/yt-dlp/yt-dlp/issues/14903)) by [beliote](https://github.com/beliote)
- **fc2**: live: [Raise appropriate error when stream is offline](https://github.com/yt-dlp/yt-dlp/commit/4433b3a217c9f430dc057643bfd7b6769eff4a45) ([#15180](https://github.com/yt-dlp/yt-dlp/issues/15180)) by [Zer0spectrum](https://github.com/Zer0spectrum)
- **floatplane**: [Add subtitle support](https://github.com/yt-dlp/yt-dlp/commit/b333ef1b3f961e292a8bf7052c54b54c81587a17) ([#15069](https://github.com/yt-dlp/yt-dlp/issues/15069)) by [seproDev](https://github.com/seproDev)
- **jtbc**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/947e7883406e5ea43687d6e4ff721cc0162c9664) ([#15047](https://github.com/yt-dlp/yt-dlp/issues/15047)) by [seproDev](https://github.com/seproDev)
- **loom**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/36b29bb3532e008a2aaf3d36d1c6fc3944137930) ([#15236](https://github.com/yt-dlp/yt-dlp/issues/15236)) by [bashonly](https://github.com/bashonly)
- **mave**: channel: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/5f66ac71f6637f768cd251509b0a932d0ce56427) ([#14915](https://github.com/yt-dlp/yt-dlp/issues/14915)) by [anlar](https://github.com/anlar)
- **medaltv**: [Rework extractor](https://github.com/yt-dlp/yt-dlp/commit/a4c72acc462668a938827370bd77084a1cd4733b) ([#15103](https://github.com/yt-dlp/yt-dlp/issues/15103)) by [seproDev](https://github.com/seproDev)
- **netapp**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/20f83f208eae863250b35e2761adad88e91d85a1) ([#15122](https://github.com/yt-dlp/yt-dlp/issues/15122)) by [darkstar](https://github.com/darkstar)
- **nhk**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/12d411722a3d7a0382d1d230a904ecd4e20298b6) ([#14528](https://github.com/yt-dlp/yt-dlp/issues/14528)) by [garret1317](https://github.com/garret1317)
- **nowcanal**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/4e680db1505dafb93313b1d42ffcd3f230fcc92a) ([#14584](https://github.com/yt-dlp/yt-dlp/issues/14584)) by [pferreir](https://github.com/pferreir)
- **patreon**: campaign: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/023e4db9afe0630c608621846856a1ca876d8bab) ([#15108](https://github.com/yt-dlp/yt-dlp/issues/15108)) by [thomasmllt](https://github.com/thomasmllt)
- **rinsefm**: [Fix extractors](https://github.com/yt-dlp/yt-dlp/commit/d6aa8c235d2e7d9374f79ec73af23a3859c76bea) ([#15020](https://github.com/yt-dlp/yt-dlp/issues/15020)) by [1bnBattuta](https://github.com/1bnBattuta), [seproDev](https://github.com/seproDev)
- **s4c**: [Fix geo-restricted content](https://github.com/yt-dlp/yt-dlp/commit/26c2545b87e2b22f134d1f567ed4d4b0b91c3253) ([#15196](https://github.com/yt-dlp/yt-dlp/issues/15196)) by [seproDev](https://github.com/seproDev)
- **soundcloudplaylist**: [Support new API URLs](https://github.com/yt-dlp/yt-dlp/commit/1dd84b9d1c33e50de49866b0d93c2596897ce506) ([#15071](https://github.com/yt-dlp/yt-dlp/issues/15071)) by [seproDev](https://github.com/seproDev)
- **sporteurope**: [Support new domain](https://github.com/yt-dlp/yt-dlp/commit/025191fea655ac879ca6dc68df358c26456a6e46) ([#15251](https://github.com/yt-dlp/yt-dlp/issues/15251)) by [bashonly](https://github.com/bashonly)
- **sproutvideo**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/2c9f0c3456057aff0631d9ea6d3eda70ffd8aabe) ([#15113](https://github.com/yt-dlp/yt-dlp/issues/15113)) by [bashonly](https://github.com/bashonly)
- **thechosen**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/854fded114f3b7b33693c2d3418575d04014aa4b) ([#14183](https://github.com/yt-dlp/yt-dlp/issues/14183)) by [mrFlamel](https://github.com/mrFlamel)
- **thisoldhouse**: [Fix login support](https://github.com/yt-dlp/yt-dlp/commit/9daba4f442139ee2537746398afc5ac30b51c28c) ([#15097](https://github.com/yt-dlp/yt-dlp/issues/15097)) by [bashonly](https://github.com/bashonly)
- **tubitv**: series: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/2a777ecbd598de19a4c691ba1f790ccbec9cdbc4) ([#15018](https://github.com/yt-dlp/yt-dlp/issues/15018)) by [Zer0spectrum](https://github.com/Zer0spectrum)
- **urplay**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/c2e7e9cdb2261adde01048d161914b156a3bad51) ([#15120](https://github.com/yt-dlp/yt-dlp/issues/15120)) by [seproDev](https://github.com/seproDev)
- **web.archive**: youtube: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/7ec6b9bc40ee8a21b11cce83a09a07a37014062e) ([#15234](https://github.com/yt-dlp/yt-dlp/issues/15234)) by [seproDev](https://github.com/seproDev)
- **wistiachannel**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/0c696239ef418776ac6ba20284bd2f3976a011b4) ([#14218](https://github.com/yt-dlp/yt-dlp/issues/14218)) by [Sojiroh](https://github.com/Sojiroh)
- **xhamster**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/29e257037862f3b2ad65e6e8d2972f9ed89389e3) ([#15252](https://github.com/yt-dlp/yt-dlp/issues/15252)) by [0xvd](https://github.com/0xvd)
- **yfanefa**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/af285016d2b14c4445109283e7c590b31542de88) ([#15032](https://github.com/yt-dlp/yt-dlp/issues/15032)) by [Haytam001](https://github.com/Haytam001)
- **youtube**
- [Add `use_ad_playback_context` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/f7acf3c1f42cc474927ecc452205d7877af36731) ([#15220](https://github.com/yt-dlp/yt-dlp/issues/15220)) by [WhatAmISupposedToPutHere](https://github.com/WhatAmISupposedToPutHere)
- [Allow `ejs` patch version to differ](https://github.com/yt-dlp/yt-dlp/commit/7bd79d92965fe9f84d7e1720eb6bb10fa9a10c77) ([#15263](https://github.com/yt-dlp/yt-dlp/issues/15263)) by [Grub4K](https://github.com/Grub4K)
- [Detect "super resolution" AI-upscaled formats](https://github.com/yt-dlp/yt-dlp/commit/4cb5e191efeebc3679f89c3c8ac819bcd511bb1f) ([#15050](https://github.com/yt-dlp/yt-dlp/issues/15050)) by [bashonly](https://github.com/bashonly)
- [Determine wait time from player response](https://github.com/yt-dlp/yt-dlp/commit/715af0c636b2b33fb3df1eb2ee37eac8262d43ac) ([#14646](https://github.com/yt-dlp/yt-dlp/issues/14646)) by [bashonly](https://github.com/bashonly), [WhatAmISupposedToPutHere](https://github.com/WhatAmISupposedToPutHere)
- [Extract all automatic caption languages](https://github.com/yt-dlp/yt-dlp/commit/419776ecf57269efb13095386a19ddc75c1f11b2) ([#15156](https://github.com/yt-dlp/yt-dlp/issues/15156)) by [bashonly](https://github.com/bashonly)
- [Improve message when no JS runtime is found](https://github.com/yt-dlp/yt-dlp/commit/1d43fa5af883f96af902a29544fc766f5e97fce6) ([#15266](https://github.com/yt-dlp/yt-dlp/issues/15266)) by [bashonly](https://github.com/bashonly)
- [Update ejs to 0.3.2](https://github.com/yt-dlp/yt-dlp/commit/0c7e4cfcaed95909d7c1c0a11b5a12881bcfdfd6) ([#15267](https://github.com/yt-dlp/yt-dlp/issues/15267)) by [bashonly](https://github.com/bashonly)
#### Downloader changes
- [Fix playback wait time for ffmpeg downloads](https://github.com/yt-dlp/yt-dlp/commit/23f1ab346927ab73ad510fd7ba105a69e5291c66) ([#15066](https://github.com/yt-dlp/yt-dlp/issues/15066)) by [bashonly](https://github.com/bashonly)
#### Postprocessor changes
- **ffmpeg**: [Fix uncaught error if bad --ffmpeg-location is given](https://github.com/yt-dlp/yt-dlp/commit/0eed3fe530d6ff4b668494c5b1d4d6fc1ade96f7) ([#15104](https://github.com/yt-dlp/yt-dlp/issues/15104)) by [bashonly](https://github.com/bashonly)
- **ffmpegmetadata**: [Add more tag mappings](https://github.com/yt-dlp/yt-dlp/commit/04050be583aae21f99932a674d1d2992ff016d5c) ([#14654](https://github.com/yt-dlp/yt-dlp/issues/14654)) by [garret1317](https://github.com/garret1317)
#### Networking changes
- **Request Handler**: urllib: [Do not read after close](https://github.com/yt-dlp/yt-dlp/commit/6ee6a6fc58d6254ef944bd311e6890e208a75e98) ([#15049](https://github.com/yt-dlp/yt-dlp/issues/15049)) by [bashonly](https://github.com/bashonly)
#### Misc. changes
- **build**: [Bump PyInstaller minimum version requirement to 6.17.0](https://github.com/yt-dlp/yt-dlp/commit/280165026886a1f1614ab527c34c66d71faa5d69) ([#15199](https://github.com/yt-dlp/yt-dlp/issues/15199)) by [bashonly](https://github.com/bashonly)
- **cleanup**: Miscellaneous: [7a52ff2](https://github.com/yt-dlp/yt-dlp/commit/7a52ff29d86efc8f3adeba977b2009ce40b8e52e) by [bashonly](https://github.com/bashonly), [oxyzenQ](https://github.com/oxyzenQ), [RezSat](https://github.com/RezSat), [seproDev](https://github.com/seproDev)
- **devscripts**: `install_deps`: [Align options/terms with PEP 735](https://github.com/yt-dlp/yt-dlp/commit/29fe515d8d3386b3406ff02bdabb967d6821bc02) ([#15200](https://github.com/yt-dlp/yt-dlp/issues/15200)) by [bashonly](https://github.com/bashonly)
### 2025.11.12
#### Important changes
@@ -64,7 +122,7 @@ #### Misc. changes
- **build**: [Bump musllinux Python version to 3.14](https://github.com/yt-dlp/yt-dlp/commit/646904cd3a79429ec5fdc43f904b3f57ae213f34) ([#14623](https://github.com/yt-dlp/yt-dlp/issues/14623)) by [bashonly](https://github.com/bashonly)
- **cleanup**
- Miscellaneous
- [c63b4e2](https://github.com/yt-dlp/yt-dlp/commit/c63b4e2a2b81cc78397c8709ef53ffd29bada213) by [bashonly](https://github.com/bashonly), [matyb08](https://github.com/matyb08), [sepro](https://github.com/sepro)
- [c63b4e2](https://github.com/yt-dlp/yt-dlp/commit/c63b4e2a2b81cc78397c8709ef53ffd29bada213) by [bashonly](https://github.com/bashonly), [matyb08](https://github.com/matyb08), [seproDev](https://github.com/seproDev)
- [335653b](https://github.com/yt-dlp/yt-dlp/commit/335653be82d5ef999cfc2879d005397402eebec1) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
- **devscripts**: [Improve `install_deps` script](https://github.com/yt-dlp/yt-dlp/commit/73922e66e437fb4bb618bdc119a96375081bf508) ([#14766](https://github.com/yt-dlp/yt-dlp/issues/14766)) by [bashonly](https://github.com/bashonly)
- **test**: [Skip flaky tests if source unchanged](https://github.com/yt-dlp/yt-dlp/commit/ade8c2b36ff300edef87d48fd1ba835ac35c5b63) ([#14970](https://github.com/yt-dlp/yt-dlp/issues/14970)) by [bashonly](https://github.com/bashonly), [Grub4K](https://github.com/Grub4K)

View File

@@ -8,9 +8,7 @@ ## Core Maintainers
Core Maintainers are responsible for reviewing and merging contributions, publishing releases, and steering the overall direction of the project.
**You can contact the core maintainers via `maintainers@yt-dlp.org`.**
This is **NOT** a support channel. [Open an issue](https://github.com/yt-dlp/yt-dlp/issues/new/choose) if you need help or want to report a bug.
**You can contact the core maintainers via `maintainers@yt-dlp.org`.** This email address is **NOT** a support channel. [Open an issue](https://github.com/yt-dlp/yt-dlp/issues/new/choose) if you need help or want to report a bug.
### [coletdjnz](https://github.com/coletdjnz)
@@ -18,6 +16,7 @@ ### [coletdjnz](https://github.com/coletdjnz)
* Overhauled the networking stack and implemented support for `requests` and `curl_cffi` (`--impersonate`) HTTP clients
* Reworked the plugin architecture to support installing plugins across all yt-dlp distributions (exe, pip, etc.)
* Implemented support for external JavaScript runtimes/engines
* Maintains support for YouTube
* Added and fixed support for various other sites
@@ -25,9 +24,10 @@ ### [bashonly](https://github.com/bashonly)
* Rewrote and maintains the build/release workflows and the self-updater: executables, automated/nightly/master releases, `--update-to`
* Overhauled external downloader cookie handling
* Helped in implementing support for external JavaScript runtimes/engines
* Added `--cookies-from-browser` support for Firefox containers
* Overhauled and maintains support for sites like Youtube, Vimeo, Twitter, TikTok, etc
* Added support for sites like Dacast, Kick, Loom, SproutVideo, Triller, Weverse, etc
* Maintains support for sites like YouTube, Vimeo, Twitter, TikTok, etc
* Added support for various sites
### [Grub4K](https://github.com/Grub4K)
@@ -37,12 +37,14 @@ ### [Grub4K](https://github.com/Grub4K)
* `--update-to`, self-updater rewrite, automated/nightly/master releases
* Reworked internals like `traverse_obj`, various core refactors and bugs fixes
* Implemented proper progress reporting for parallel downloads
* Implemented support for external JavaScript runtimes/engines
* Improved/fixed/added Bundestag, crunchyroll, pr0gramm, Twitter, WrestleUniverse etc
### [sepro](https://github.com/seproDev)
* UX improvements: Warn when ffmpeg is missing, warn when double-clicking exe
* Helped in implementing support for external JavaScript runtimes/engines
* Code cleanup: Remove dead extractors, mark extractors as broken, enable/apply ruff rules
* Improved/fixed/added ArdMediathek, DRTV, Floatplane, MagentaMusik, Naver, Nebula, OnDemandKorea, Vbox7 etc

View File

@@ -202,9 +202,9 @@ CONTRIBUTORS: Changelog.md
# The following EJS_-prefixed variables are auto-generated by devscripts/update_ejs.py
# DO NOT EDIT!
EJS_VERSION = 0.3.1
EJS_WHEEL_NAME = yt_dlp_ejs-0.3.1-py3-none-any.whl
EJS_WHEEL_HASH = sha256:a6e3548874db7c774388931752bb46c7f4642c044b2a189e56968f3d5ecab622
EJS_VERSION = 0.3.2
EJS_WHEEL_NAME = yt_dlp_ejs-0.3.2-py3-none-any.whl
EJS_WHEEL_HASH = sha256:f2dc6b3d1b909af1f13e021621b0af048056fca5fb07c4db6aa9bbb37a4f66a9
EJS_PY_FOLDERS = yt_dlp_ejs yt_dlp_ejs/yt yt_dlp_ejs/yt/solver
EJS_PY_FILES = yt_dlp_ejs/__init__.py yt_dlp_ejs/_version.py yt_dlp_ejs/yt/__init__.py yt_dlp_ejs/yt/solver/__init__.py
EJS_JS_FOLDERS = yt_dlp_ejs/yt/solver

View File

@@ -203,7 +203,7 @@ ## DEPENDENCIES
On Windows, [Microsoft Visual C++ 2010 SP1 Redistributable Package (x86)](https://download.microsoft.com/download/1/6/5/165255E7-1014-4D0A-B094-B6A430A6BFFC/vcredist_x86.exe) is also necessary to run yt-dlp. You probably already have this, but if the executable throws an error due to missing `MSVCR100.dll` you need to install it manually.
-->
While all the other dependencies are optional, `ffmpeg`, `ffprobe`, `yt-dlp-ejs` and a JavaScript runtime are highly recommended
While all the other dependencies are optional, `ffmpeg`, `ffprobe`, `yt-dlp-ejs` and a supported JavaScript runtime/engine are highly recommended
### Strongly recommended
@@ -215,7 +215,7 @@ ### Strongly recommended
* [**yt-dlp-ejs**](https://github.com/yt-dlp/ejs) - Required for deciphering YouTube n/sig values. Licensed under [Unlicense](https://github.com/yt-dlp/ejs/blob/main/LICENSE), bundles [MIT](https://github.com/davidbonnet/astring/blob/main/LICENSE) and [ISC](https://github.com/meriyah/meriyah/blob/main/LICENSE.md) components.
A JavaScript runtime like [**deno**](https://deno.land) (recommended), [**node.js**](https://nodejs.org), [**bun**](https://bun.sh), or [**QuickJS**](https://bellard.org/quickjs/) is also required to run yt-dlp-ejs. See [the wiki](https://github.com/yt-dlp/yt-dlp/wiki/EJS).
A JavaScript runtime/engine like [**deno**](https://deno.land) (recommended), [**node.js**](https://nodejs.org), [**bun**](https://bun.sh), or [**QuickJS**](https://bellard.org/quickjs/) is also required to run yt-dlp-ejs. See [the wiki](https://github.com/yt-dlp/yt-dlp/wiki/EJS).
### Networking
* [**certifi**](https://github.com/certifi/python-certifi)\* - Provides Mozilla's root certificate bundle. Licensed under [MPLv2](https://github.com/certifi/python-certifi/blob/master/LICENSE)
@@ -228,7 +228,7 @@ #### Impersonation
The following provide support for impersonating browser requests. This may be required for some sites that employ TLS fingerprinting.
* [**curl_cffi**](https://github.com/lexiforest/curl_cffi) (recommended) - Python binding for [curl-impersonate](https://github.com/lexiforest/curl-impersonate). Provides impersonation targets for Chrome, Edge and Safari. Licensed under [MIT](https://github.com/lexiforest/curl_cffi/blob/main/LICENSE)
* Can be installed with the `curl-cffi` group, e.g. `pip install "yt-dlp[default,curl-cffi]"`
* Can be installed with the `curl-cffi` extra, e.g. `pip install "yt-dlp[default,curl-cffi]"`
* Currently included in most builds *except* `yt-dlp` (Unix zipimport binary), `yt-dlp_x86` (Windows 32-bit) and `yt-dlp_musllinux_aarch64`
@@ -265,7 +265,7 @@ ### Standalone PyInstaller Builds
You can run the following commands:
```
python devscripts/install_deps.py --include-group pyinstaller
python devscripts/install_deps.py --include-extra pyinstaller
python devscripts/make_lazy_extractors.py
python -m bundle.pyinstaller
```
@@ -483,7 +483,7 @@ ## Geo-restriction:
two-letter ISO 3166-2 country code
## Video Selection:
-I, --playlist-items ITEM_SPEC Comma separated playlist_index of the items
-I, --playlist-items ITEM_SPEC Comma-separated playlist_index of the items
to download. You can specify a range using
"[START]:[STOP][:STEP]". For backward
compatibility, START-STOP is also supported.
@@ -1299,7 +1299,7 @@ # OUTPUT TEMPLATE
1. **Default**: A literal default value can be specified for when the field is empty using a `|` separator. This overrides `--output-na-placeholder`. E.g. `%(uploader|Unknown)s`
1. **More Conversions**: In addition to the normal format types `diouxXeEfFgGcrs`, yt-dlp additionally supports converting to `B` = **B**ytes, `j` = **j**son (flag `#` for pretty-printing, `+` for Unicode), `h` = HTML escaping, `l` = a comma separated **l**ist (flag `#` for `\n` newline-separated), `q` = a string **q**uoted for the terminal (flag `#` to split a list into different arguments), `D` = add **D**ecimal suffixes (e.g. 10M) (flag `#` to use 1024 as factor), and `S` = **S**anitize as filename (flag `#` for restricted)
1. **More Conversions**: In addition to the normal format types `diouxXeEfFgGcrs`, yt-dlp additionally supports converting to `B` = **B**ytes, `j` = **j**son (flag `#` for pretty-printing, `+` for Unicode), `h` = HTML escaping, `l` = a comma-separated **l**ist (flag `#` for `\n` newline-separated), `q` = a string **q**uoted for the terminal (flag `#` to split a list into different arguments), `D` = add **D**ecimal suffixes (e.g. 10M) (flag `#` to use 1024 as factor), and `S` = **S**anitize as filename (flag `#` for restricted)
1. **Unicode normalization**: The format type `U` can be used for NFC [Unicode normalization](https://docs.python.org/3/library/unicodedata.html#unicodedata.normalize). The alternate form flag (`#`) changes the normalization to NFD and the conversion flag `+` can be used for NFKC/NFKD compatibility equivalence normalization. E.g. `%(title)+.100U` is NFKC
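As an illustration, these conversions can be exercised without downloading anything via the `evaluate_outtmpl` helper that yt-dlp's own tests use (a sketch; treat the helper's exact signature as an assumption):

```python
# Sketch: evaluating output-template conversions against a fake info dict
import yt_dlp

info = {'title': 'Café: a/b test', 'id': 'x1', 'ext': 'mp4', 'tags': ['news', 'tech']}
with yt_dlp.YoutubeDL() as ydl:
    print(ydl.evaluate_outtmpl('%(title).50S [%(id)s].%(ext)s', info))  # S = sanitize as filename
    print(ydl.evaluate_outtmpl('%(tags)l', info))                       # l = comma-separated list
```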
@@ -1798,8 +1798,8 @@ # MODIFYING METADATA
`track` | `track_number`
`artist` | `artist`, `artists`, `creator`, `creators`, `uploader` or `uploader_id`
`composer` | `composer` or `composers`
`genre` | `genre` or `genres`
`album` | `album`
`genre` | `genre`, `genres`, `categories` or `tags`
`album` | `album` or `series`
`album_artist` | `album_artist` or `album_artists`
`disc` | `disc_number`
`show` | `series`
@@ -1852,7 +1852,7 @@ # EXTRACTOR ARGUMENTS
#### youtube
* `lang`: Prefer translated metadata (`title`, `description` etc) of this language code (case-sensitive). By default, the video primary language metadata is preferred, with a fallback to `en` translated. See [youtube/_base.py](https://github.com/yt-dlp/yt-dlp/blob/415b4c9f955b1a0391204bd24a7132590e7b3bdb/yt_dlp/extractor/youtube/_base.py#L402-L409) for the list of supported content language codes
* `skip`: One or more of `hls`, `dash` or `translated_subs` to skip extraction of the m3u8 manifests, dash manifests and [auto-translated subtitles](https://github.com/yt-dlp/yt-dlp/issues/4090#issuecomment-1158102032) respectively
* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_sdkless`, `android_vr`, `tv`, `tv_simply`, `tv_downgraded`, and `tv_embedded`. By default, `tv,android_sdkless,web` is used. If no JavaScript runtime is available, then `android_sdkless,web_safari,web` is used. If logged-in cookies are passed to yt-dlp, then `tv_downgraded,web_safari,web` is used for free accounts and `tv_downgraded,web_creator,web` is used for premium accounts. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `web_embedded` client is added for age-restricted videos but only works if the video is embeddable. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
* `player_client`: Clients to extract video data from. The currently available clients are `web`, `web_safari`, `web_embedded`, `web_music`, `web_creator`, `mweb`, `ios`, `android`, `android_sdkless`, `android_vr`, `tv`, `tv_simply`, `tv_downgraded`, and `tv_embedded`. By default, `tv,android_sdkless,web` is used. If no JavaScript runtime/engine is available, then `android_sdkless,web_safari,web` is used. If logged-in cookies are passed to yt-dlp, then `tv_downgraded,web_safari,web` is used for free accounts and `tv_downgraded,web_creator,web` is used for premium accounts. The `web_music` client is added for `music.youtube.com` URLs when logged-in cookies are used. The `web_embedded` client is added for age-restricted videos but only works if the video is embeddable. The `tv_embedded` and `web_creator` clients are added for age-restricted videos if account age-verification is required. Some clients, such as `web` and `web_music`, require a `po_token` for their formats to be downloadable. Some clients, such as `web_creator`, will only work with authentication. Not all clients support authentication via cookies. You can use `default` for the default clients, or you can use `all` for all clients (not recommended). You can prefix a client with `-` to exclude it, e.g. `youtube:player_client=default,-ios`
* `player_skip`: Skip some network requests that are generally needed for robust extraction. One or more of `configs` (skip client configs), `webpage` (skip initial webpage), `js` (skip js player), `initial_data` (skip initial data/next ep request). While these options can help reduce the number of requests needed or avoid some rate-limiting, they could cause issues such as missing formats or metadata. See [#860](https://github.com/yt-dlp/yt-dlp/pull/860) and [#12826](https://github.com/yt-dlp/yt-dlp/issues/12826) for more details
* `webpage_skip`: Skip extraction of embedded webpage data. One or both of `player_response`, `initial_data`. These options are for testing purposes and don't skip any network requests
* `player_params`: YouTube player parameters to use for player requests. Will overwrite any default ones set by yt-dlp.
@@ -1867,14 +1867,14 @@ #### youtube
* `raise_incomplete_data`: `Incomplete Data Received` raises an error instead of reporting a warning
* `data_sync_id`: Overrides the account Data Sync ID used in Innertube API requests. This may be needed if you are using an account with `youtube:player_skip=webpage,configs` or `youtubetab:skip=webpage`
* `visitor_data`: Overrides the Visitor Data used in Innertube API requests. This should be used with `player_skip=webpage,configs` and without cookies. Note: this may have adverse effects if used improperly. If a session from a browser is wanted, you should pass cookies instead (which contain the Visitor ID)
* `po_token`: Proof of Origin (PO) Token(s) to use. Comma seperated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player=XXX,web_safari.gvs+YYY`. Context can be any of `gvs` (Google Video Server URLs), `player` (Innertube player request) or `subs` (Subtitles)
* `po_token`: Proof of Origin (PO) Token(s) to use. Comma-separated list of PO Tokens in the format `CLIENT.CONTEXT+PO_TOKEN`, e.g. `youtube:po_token=web.gvs+XXX,web.player=XXX,web_safari.gvs+YYY`. Context can be any of `gvs` (Google Video Server URLs), `player` (Innertube player request) or `subs` (Subtitles)
* `pot_trace`: Enable debug logging for PO Token fetching. Either `true` or `false` (default)
* `fetch_pot`: Policy to use for fetching a PO Token from providers. One of `always` (always try fetch a PO Token regardless if the client requires one for the given context), `never` (never fetch a PO Token), or `auto` (default; only fetch a PO Token if the client requires one for the given context)
* `playback_wait`: Duration (in seconds) to wait inbetween the extraction and download stages in order to ensure the formats are available. The default is `6` seconds
* `jsc_trace`: Enable debug logging for JS Challenge fetching. Either `true` or `false` (default)
* `use_ad_playback_context`: Skip preroll ads to eliminate the mandatory wait period before download. Do NOT use this when passing premium account cookies to yt-dlp, as it will result in a loss of premium formats. Only effective with the `web`, `web_safari`, `web_music` and `mweb` player clients. Either `true` or `false` (default)
#### youtube-ejs
* `jitless`: Run suported Javascript engines in JIT-less mode. Supported runtimes are `deno`, `node` and `bun`. Provides better security at the cost of performance/speed. Do note that `node` and `bun` are still considered unsecure. Either `true` or `false` (default)
* `jitless`: Run supported Javascript engines in JIT-less mode. Supported runtimes are `deno`, `node` and `bun`. Provides better security at the cost of performance/speed. Do note that `node` and `bun` are still considered insecure. Either `true` or `false` (default)
#### youtubepot-webpo
* `bind_to_visitor_id`: Whether to use the Visitor ID instead of Visitor Data for caching WebPO tokens. Either `true` (default) or `false`

View File

@@ -15,12 +15,12 @@ function venvpy {
}
INCLUDES=(
--include-group pyinstaller
--include-group secretstorage
--include-extra pyinstaller
--include-extra secretstorage
)
if [[ -z "${EXCLUDE_CURL_CFFI:-}" ]]; then
INCLUDES+=(--include-group curl-cffi)
INCLUDES+=(--include-extra curl-cffi)
fi
runpy -m venv /yt-dlp-build-venv
@@ -28,7 +28,7 @@ runpy -m venv /yt-dlp-build-venv
source /yt-dlp-build-venv/bin/activate
# Inside the venv we use venvpy instead of runpy
venvpy -m ensurepip --upgrade --default-pip
venvpy -m devscripts.install_deps --only-optional-groups --include-group build
venvpy -m devscripts.install_deps --omit-default --include-extra build
venvpy -m devscripts.install_deps "${INCLUDES[@]}"
venvpy -m devscripts.make_lazy_extractors
venvpy devscripts/update-version.py -c "${CHANNEL}" -r "${ORIGIN}" "${VERSION}"

View File

@@ -319,5 +319,11 @@
"action": "add",
"when": "6224a3898821965a7d6a2cb9cc2de40a0fd6e6bc",
"short": "[priority] **An external JavaScript runtime is now required for full YouTube support**\nyt-dlp now requires users to have an external JavaScript runtime (e.g. Deno) installed in order to solve the JavaScript challenges presented by YouTube. [Read more](https://github.com/yt-dlp/yt-dlp/issues/15012)"
},
{
"action": "change",
"when": "c63b4e2a2b81cc78397c8709ef53ffd29bada213",
"short": "[cleanup] Misc (#14767)",
"authors": ["bashonly", "seproDev", "matyb08"]
}
]

View File

@@ -25,16 +25,16 @@ def parse_args():
'-e', '--exclude-dependency', metavar='DEPENDENCY', action='append',
help='exclude a dependency (can be used multiple times)')
parser.add_argument(
'-i', '--include-group', metavar='GROUP', action='append',
help='include an optional dependency group (can be used multiple times)')
'-i', '--include-extra', metavar='EXTRA', action='append',
help='include an extra/optional-dependencies list (can be used multiple times)')
parser.add_argument(
'-c', '--cherry-pick', metavar='DEPENDENCY', action='append',
help=(
'only include a specific dependency from the resulting dependency list '
'(can be used multiple times)'))
parser.add_argument(
'-o', '--only-optional-groups', action='store_true',
help='omit default dependencies unless the "default" group is specified with --include-group')
'-o', '--omit-default', action='store_true',
help='omit the "default" extra unless it is explicitly included (it is included by default)')
parser.add_argument(
'-p', '--print', action='store_true',
help='only print requirements to stdout')
@@ -51,27 +51,27 @@ def uniq(arg) -> dict[str, None]:
def main():
args = parse_args()
project_table = parse_toml(read_file(args.input))['project']
recursive_pattern = re.compile(rf'{project_table["name"]}\[(?P<group_name>[\w-]+)\]')
optional_groups = project_table['optional-dependencies']
recursive_pattern = re.compile(rf'{project_table["name"]}\[(?P<extra_name>[\w-]+)\]')
extras = project_table['optional-dependencies']
excludes = uniq(args.exclude_dependency)
only_includes = uniq(args.cherry_pick)
include_groups = uniq(args.include_group)
include_extras = uniq(args.include_extra)
def yield_deps(group):
for dep in group:
def yield_deps(extra):
for dep in extra:
if mobj := recursive_pattern.fullmatch(dep):
yield from optional_groups.get(mobj.group('group_name'), ())
yield from extras.get(mobj.group('extra_name'), ())
else:
yield dep
targets = {}
if not args.only_optional_groups:
if not args.omit_default:
# legacy: 'dependencies' is empty now
targets.update(dict.fromkeys(project_table['dependencies']))
targets.update(dict.fromkeys(yield_deps(optional_groups['default'])))
targets.update(dict.fromkeys(yield_deps(extras['default'])))
for include in filter(None, map(optional_groups.get, include_groups)):
for include in filter(None, map(extras.get, include_extras)):
targets.update(dict.fromkeys(yield_deps(include)))
def target_filter(target):
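The renamed resolution logic can be modeled in a few lines; a self-contained sketch with simplified, made-up extras rather than the real pyproject.toml contents:

```python
# Minimal model of the extras expansion after the PEP 735 terminology change
import re

PROJECT_NAME = 'yt-dlp'
extras = {
    'default': ['requests>=2.32.2,<3', 'yt-dlp-ejs==0.3.2'],
    'test': ['pytest~=8.1'],
    'static-analysis': ['ruff'],
    'dev': [f'{PROJECT_NAME}[static-analysis]', f'{PROJECT_NAME}[test]'],
}
recursive_pattern = re.compile(rf'{PROJECT_NAME}\[(?P<extra_name>[\w-]+)\]')

def yield_deps(extra):
    # Expand self-referential requirements like "yt-dlp[test]" into their extras
    for dep in extra:
        if mobj := recursive_pattern.fullmatch(dep):
            yield from extras.get(mobj.group('extra_name'), ())
        else:
            yield dep

print(list(yield_deps(extras['dev'])))  # ['ruff', 'pytest~=8.1']
```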

View File

@@ -251,7 +251,13 @@ class CommitRange:
''', re.VERBOSE | re.DOTALL)
EXTRACTOR_INDICATOR_RE = re.compile(r'(?:Fix|Add)\s+Extractors?', re.IGNORECASE)
REVERT_RE = re.compile(r'(?:\[[^\]]+\]\s+)?(?i:Revert)\s+([\da-f]{40})')
FIXES_RE = re.compile(r'(?i:(?:bug\s*)?fix(?:es)?(?:\s+bugs?)?(?:\s+in|\s+for)?|Improve)\s+([\da-f]{40})')
FIXES_RE = re.compile(r'''
(?i:
(?:bug\s*)?fix(?:es)?(?:
\s+(?:bugs?|regression(?:\s+introduced)?)
)?(?:\s+(?:in|for|from|by))?
|Improve
)\s+([\da-f]{40})''', re.VERBOSE)
UPSTREAM_MERGE_RE = re.compile(r'Update to ytdl-commit-([\da-f]+)')
def __init__(self, start, end, default_author=None):
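The broadened pattern now also matches messages like `Fix regression introduced in <sha>`; a quick demonstration (the 40-character sha is a made-up placeholder):

```python
import re

FIXES_RE = re.compile(r'''
    (?i:
        (?:bug\s*)?fix(?:es)?(?:
            \s+(?:bugs?|regression(?:\s+introduced)?)
        )?(?:\s+(?:in|for|from|by))?
        |Improve
    )\s+([\da-f]{40})''', re.VERBOSE)

msg = 'Fix regression introduced in 5767fb4ab1000000000000000000000000000000'
print(FIXES_RE.search(msg).group(1))  # prints the captured 40-char sha
```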

View File

@@ -56,7 +56,7 @@ default = [
"requests>=2.32.2,<3",
"urllib3>=2.0.2,<3",
"websockets>=13.0",
"yt-dlp-ejs==0.3.1",
"yt-dlp-ejs==0.3.2",
]
curl-cffi = [
"curl-cffi>=0.5.10,!=0.6.*,!=0.7.*,!=0.8.*,!=0.9.*,<0.14; implementation_name=='cpython'",
@@ -69,7 +69,7 @@ build = [
"build",
"hatchling>=1.27.0",
"pip",
"setuptools>=71.0.2,<81", # See https://github.com/pyinstaller/pyinstaller/issues/9149
"setuptools>=71.0.2",
"wheel",
]
dev = [
@@ -86,7 +86,7 @@ test = [
"pytest-rerunfailures~=14.0",
]
pyinstaller = [
"pyinstaller>=6.13.0", # Windows temp cleanup fixed in 6.13.0
"pyinstaller>=6.17.0", # 6.17.0+ needed for compat with setuptools 81+
]
[project.urls]

View File

@@ -50,8 +50,10 @@ # Supported sites
- **aenetworks:collection**
- **aenetworks:show**
- **AeonCo**
- **agalega:videos**
- **AirTV**
- **AitubeKZVideo**
- **Alibaba**
- **AliExpressLive**
- **AlJazeera**
- **Allocine**
@@ -190,6 +192,7 @@ # Supported sites
- **Biography**
- **BitChute**
- **BitChuteChannel**
- **Bitmovin**
- **BlackboardCollaborate**
- **BlackboardCollaborateLaunch**
- **BleacherReport**: (**Currently broken**)
@@ -731,7 +734,7 @@ # Supported sites
- **loc**: Library of Congress
- **Loco**
- **loom**
- **loom:folder**
- **loom:folder**: (**Currently broken**)
- **LoveHomePorn**
- **LRTRadio**
- **LRTStream**
@@ -762,7 +765,8 @@ # Supported sites
- **massengeschmack.tv**
- **Masters**
- **MatchTV**
- **Mave**
- **mave**
- **mave:channel**
- **MBN**: mbn.co.kr (매일방송)
- **MDR**: MDR.DE
- **MedalTV**
@@ -895,6 +899,8 @@ # Supported sites
- **NerdCubedFeed**
- **Nest**
- **NestClip**
- **NetAppCollection**
- **NetAppVideo**
- **netease:album**: 网易云音乐 - 专辑
- **netease:djradio**: 网易云音乐 - 电台
- **netease:mv**: 网易云音乐 - MV
@@ -962,6 +968,7 @@ # Supported sites
- **Nova**: TN.cz, Prásk.tv, Nova.cz, Novaplus.cz, FANDA.tv, Krásná.cz and Doma.cz
- **NovaEmbed**
- **NovaPlay**
- **NowCanal**
- **nowness**
- **nowness:playlist**
- **nowness:series**
@@ -1373,7 +1380,7 @@ # Supported sites
- **Spiegel**
- **Sport5**
- **SportBox**: (**Currently broken**)
- **SportDeutschland**
- **sporteurope**
- **Spreaker**
- **SpreakerShow**
- **SpringboardPlatform**
@@ -1461,6 +1468,8 @@ # Supported sites
- **TFO**: (**Currently broken**)
- **theatercomplextown:ppv**: [*theatercomplextown*](## "netrc machine")
- **theatercomplextown:vod**: [*theatercomplextown*](## "netrc machine")
- **TheChosen**
- **TheChosenGroup**
- **TheGuardianPodcast**
- **TheGuardianPodcastPlaylist**
- **TheHighWire**
@@ -1778,6 +1787,7 @@ # Supported sites
- **YapFiles**: (**Currently broken**)
- **Yappy**: (**Currently broken**)
- **YappyProfile**
- **yfanefa**
- **YleAreena**
- **YouJizz**
- **youku**: 优酷

View File

@@ -755,6 +755,17 @@ def test_partial_read_then_full_read(self, handler):
assert res.read(0) == b''
assert res.read() == b'<video src="/vid.mp4" /></html>'
def test_partial_read_greater_than_response_then_full_read(self, handler):
with handler() as rh:
for encoding in ('', 'gzip', 'deflate'):
res = validate_and_send(rh, Request(
f'http://127.0.0.1:{self.http_port}/content-encoding',
headers={'ytdl-encoding': encoding}))
assert res.headers.get('Content-Encoding') == encoding
assert res.read(512) == b'<html><video src="/vid.mp4" /></html>'
assert res.read(0) == b''
assert res.read() == b''
@pytest.mark.parametrize('handler', ['Urllib', 'Requests', 'CurlCFFI'], indirect=True)
@pytest.mark.handler_flaky('CurlCFFI', reason='segfaults')
@@ -920,6 +931,28 @@ def test_http_response_auto_close(self, handler):
assert res.fp.fp is None
assert res.closed
def test_data_uri_partial_read_then_full_read(self, handler):
with handler() as rh:
res = validate_and_send(rh, Request('data:text/plain,hello%20world'))
assert res.read(6) == b'hello '
assert res.read(0) == b''
assert res.read() == b'world'
# Should automatically close the underlying file object
assert res.fp.closed
assert res.closed
def test_data_uri_partial_read_greater_than_response_then_full_read(self, handler):
with handler() as rh:
res = validate_and_send(rh, Request('data:text/plain,hello%20world'))
assert res.read(512) == b'hello world'
# Response and its underlying file object should already be closed now
assert res.fp.closed
assert res.closed
assert res.read(0) == b''
assert res.read() == b''
assert res.fp.closed
assert res.closed
def test_http_error_returns_content(self, handler):
# urllib HTTPError will try close the underlying response if reference to the HTTPError object is lost
def get_response():
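The behavior pinned down by the read-after-close tests added above can be modeled with a tiny stand-in class (not yt-dlp's actual response implementation): once the body is exhausted, the wrapper closes itself, and later reads return empty bytes instead of touching the closed file object.

```python
import io

class GuardedResponse:
    """Toy model of 'do not read after close'; not the real yt-dlp class."""

    def __init__(self, fp):
        self.fp = fp
        self.closed = False

    def read(self, amt=None):
        if self.closed:
            return b''  # never touch the underlying fp once closed
        data = self.fp.read() if amt is None else self.fp.read(amt)
        # Close once exhausted: a full read, or a short read for the requested amount
        if amt is None or (amt > 0 and len(data) < amt):
            self.close()
        return data

    def close(self):
        self.closed = True
        self.fp.close()

res = GuardedResponse(io.BytesIO(b'hello world'))
assert res.read(512) == b'hello world'   # greater than response -> everything, then close
assert res.fp.closed and res.closed
assert res.read(0) == b'' and res.read() == b''
```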

View File

@@ -1403,6 +1403,9 @@ def test_version_tuple(self):
self.assertEqual(version_tuple('1'), (1,))
self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))
self.assertEqual(version_tuple('10.1-6'), (10, 1, 6)) # avconv style
self.assertEqual(version_tuple('invalid', lenient=True), (-1,))
self.assertEqual(version_tuple('1.2.3', lenient=True), (1, 2, 3))
self.assertEqual(version_tuple('12.34-something', lenient=True), (12, 34, -1))
def test_detect_exe_version(self):
self.assertEqual(detect_exe_version('''ffmpeg version 1.2.1
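A minimal sketch of parsing behaviour consistent with the lenient assertions above (an illustration under assumptions, not the actual yt-dlp implementation): split on '.' and '-', and in lenient mode map any non-numeric component to -1 instead of failing.

import re

def version_tuple(version_str, lenient=False):
    parts = re.split(r'[-.]', version_str)
    if lenient:
        return tuple(int(p) if p.isdigit() else -1 for p in parts)
    return tuple(int(p) for p in parts)

assert version_tuple('invalid', lenient=True) == (-1,)
assert version_tuple('12.34-something', lenient=True) == (12, 34, -1)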

View File

@@ -40,7 +40,7 @@
pytestmark = pytest.mark.handler_flaky(
'Websockets',
os.name != 'nt' and sys.implementation.name == 'pypy',
os.name == 'nt' or sys.implementation.name == 'pypy',
reason='segfaults',
)

View File

@@ -212,9 +212,16 @@ def _firefox_browser_dirs():
else:
yield from map(os.path.expanduser, (
# New installations of FF147+ respect the XDG base directory specification
# Ref: https://bugzilla.mozilla.org/show_bug.cgi?id=259356
os.path.join(_config_home(), 'mozilla/firefox'),
# Existing FF version<=146 installations
'~/.mozilla/firefox',
'~/snap/firefox/common/.mozilla/firefox',
# Flatpak XDG: https://docs.flatpak.org/en/latest/conventions.html#xdg-base-directories
'~/.var/app/org.mozilla.firefox/config/mozilla/firefox',
'~/.var/app/org.mozilla.firefox/.mozilla/firefox',
# Snap installations do not respect the XDG base directory specification
'~/snap/firefox/common/.mozilla/firefox',
))
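The _config_home() call above resolves the XDG config directory. A hypothetical minimal equivalent:

import os

def _config_home():
    # $XDG_CONFIG_HOME if set, otherwise ~/.config (XDG base directory spec)
    return os.environ.get('XDG_CONFIG_HOME') or os.path.expanduser('~/.config')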

View File

@@ -461,7 +461,8 @@ def download(self, filename, info_dict, subtitle=False):
min_sleep_interval = self.params.get('sleep_interval') or 0
max_sleep_interval = self.params.get('max_sleep_interval') or 0
if available_at := info_dict.get('available_at'):
requested_formats = info_dict.get('requested_formats') or [info_dict]
if available_at := max(f.get('available_at') or 0 for f in requested_formats):
forced_sleep_interval = available_at - int(time.time())
if forced_sleep_interval > min_sleep_interval:
sleep_note = 'as required by the site'
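The fix makes the forced sleep wait for the requested format that unlocks last, instead of only consulting the top-level info dict. Illustration with assumed values:

import time

requested_formats = [{'available_at': 1765000000}, {'available_at': 1765000030}]
available_at = max(f.get('available_at') or 0 for f in requested_formats)
# seconds until the later format becomes available (negative means no wait needed)
forced_sleep_interval = available_at - int(time.time())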

View File

@@ -457,6 +457,8 @@ class FFmpegFD(ExternalFD):
@classmethod
def available(cls, path=None):
# TODO: Fix path for ffmpeg
# Fixme: This may be wrong when --ffmpeg-location is used
return FFmpegPostProcessor().available
def on_process_started(self, proc, stdin):

View File

@@ -75,6 +75,7 @@
AfreecaTVLiveIE,
AfreecaTVUserIE,
)
from .agalega import AGalegaIE
from .agora import (
TokFMAuditionIE,
TokFMPodcastIE,
@@ -83,6 +84,7 @@
)
from .airtv import AirTVIE
from .aitube import AitubeKZVideoIE
from .alibaba import AlibabaIE
from .aliexpress import AliExpressLiveIE
from .aljazeera import AlJazeeraIE
from .allocine import AllocineIE
@@ -268,6 +270,7 @@
BitChuteChannelIE,
BitChuteIE,
)
from .bitmovin import BitmovinIE
from .blackboardcollaborate import (
BlackboardCollaborateIE,
BlackboardCollaborateLaunchIE,
@@ -690,6 +693,10 @@
FrontendMastersIE,
FrontendMastersLessonIE,
)
from .frontro import (
TheChosenGroupIE,
TheChosenIE,
)
from .fujitv import FujiTVFODPlus7IE
from .funk import FunkIE
from .funker530 import Funker530IE
@@ -1093,7 +1100,10 @@
from .massengeschmacktv import MassengeschmackTVIE
from .masters import MastersIE
from .matchtv import MatchTVIE
from .mave import MaveIE
from .mave import (
MaveChannelIE,
MaveIE,
)
from .mbn import MBNIE
from .mdr import MDRIE
from .medaltv import MedalTVIE
@@ -1276,6 +1286,10 @@
NestClipIE,
NestIE,
)
from .netapp import (
NetAppCollectionIE,
NetAppVideoIE,
)
from .neteasemusic import (
NetEaseMusicAlbumIE,
NetEaseMusicDjRadioIE,
@@ -1368,6 +1382,7 @@
NovaIE,
)
from .novaplay import NovaPlayIE
from .nowcanal import NowCanalIE
from .nowness import (
NownessIE,
NownessPlaylistIE,
@@ -2521,6 +2536,7 @@
YappyIE,
YappyProfileIE,
)
from .yfanefa import YfanefaIE
from .yle_areena import YleAreenaIE
from .youjizz import YouJizzIE
from .youku import (

View File

@@ -0,0 +1,91 @@
import json
import time
from .common import InfoExtractor
from ..utils import jwt_decode_hs256, url_or_none
from ..utils.traversal import traverse_obj
class AGalegaBaseIE(InfoExtractor):
_access_token = None
@staticmethod
def _jwt_is_expired(token):
return jwt_decode_hs256(token)['exp'] - time.time() < 120
def _refresh_access_token(self, video_id):
AGalegaBaseIE._access_token = self._download_json(
'https://www.agalega.gal/api/fetch-api/jwt/token', video_id,
note='Downloading access token',
data=json.dumps({
'username': None,
'password': None,
'client': 'crtvg',
'checkExistsCookies': False,
}).encode())['access']
def _call_api(self, endpoint, display_id, note, fatal=True, query=None):
if not AGalegaBaseIE._access_token or self._jwt_is_expired(AGalegaBaseIE._access_token):
self._refresh_access_token(endpoint)
return self._download_json(
f'https://api-agalega.interactvty.com/api/2.0/contents/{endpoint}', display_id,
note=note, fatal=fatal, query=query,
headers={'Authorization': f'jwtok {AGalegaBaseIE._access_token}'})
class AGalegaIE(AGalegaBaseIE):
IE_NAME = 'agalega:videos'
_VALID_URL = r'https?://(?:www\.)?agalega\.gal/videos/(?:detail/)?(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://www.agalega.gal/videos/288664-lr-ninguencheconta',
'md5': '04533a66c5f863d08dd9724b11d1c223',
'info_dict': {
'id': '288664',
'title': 'Roberto e Ángel Martín atenden consultas dos espectadores',
'description': 'O cómico ademais fai un repaso dalgúns momentos da súa traxectoria profesional',
'thumbnail': 'https://crtvg-bucket.flumotion.cloud/content_cards/2ef32c3b9f6249d9868fd8f11d389d3d.png',
'ext': 'mp4',
},
}, {
'url': 'https://www.agalega.gal/videos/detail/296152-pulso-activo-7',
'md5': '26df7fdcf859f38ad92d837279d6b56d',
'info_dict': {
'id': '296152',
'title': 'Pulso activo | 18-11-2025',
'description': 'Anxo, Noemí, Silvia e Estrella comparten as sensacións da clase de Eddy.',
'thumbnail': 'https://crtvg-bucket.flumotion.cloud/content_cards/a6bb7da6c8994b82bf961ac6cad1707b.png',
'ext': 'mp4',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
content_data = self._call_api(
f'content/{video_id}/', video_id, note='Downloading content data', fatal=False,
query={
'optional_fields': 'image,is_premium,short_description,has_subtitle',
})
resource_data = self._call_api(
f'content_resources/{video_id}/', video_id, note='Downloading resource data',
query={
'optional_fields': 'media_url',
})
formats = []
subtitles = {}
for m3u8_url in traverse_obj(resource_data, ('results', ..., 'media_url', {url_or_none})):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
m3u8_url, video_id, ext='mp4', m3u8_id='hls')
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
**traverse_obj(content_data, {
'title': ('name', {str}),
'description': (('description', 'short_description'), {str}, any),
'thumbnail': ('image', {url_or_none}),
}),
}
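The extractor refreshes its token two minutes before the JWT's exp claim lapses. A standalone sketch of that check, assuming jwt_decode_hs256 returns the decoded payload dict:

import time

def jwt_is_expired(payload, leeway=120):
    # payload as returned by jwt_decode_hs256(token), e.g. {'exp': 1765000000}
    return payload['exp'] - time.time() < leeway

assert jwt_is_expired({'exp': 0})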

View File

@@ -0,0 +1,42 @@
from .common import InfoExtractor
from ..utils import int_or_none, str_or_none, url_or_none
from ..utils.traversal import traverse_obj
class AlibabaIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?alibaba\.com/product-detail/[\w-]+_(?P<id>\d+)\.html'
_TESTS = [{
'url': 'https://www.alibaba.com/product-detail/Kids-Entertainment-Bouncer-Bouncy-Castle-Waterslide_1601271126969.html',
'info_dict': {
'id': '6000280444270',
'display_id': '1601271126969',
'ext': 'mp4',
'title': 'Kids Entertainment Bouncer Bouncy Castle Waterslide Juex Gonflables Commercial Inflatable Tropical Water Slide',
'duration': 30,
'thumbnail': 'https://sc04.alicdn.com/kf/Hc5bb391974454af18c7a4f91cbe4062bg.jpg_120x120.jpg',
},
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
product_data = self._search_json(
r'window\.detailData\s*=', webpage, 'detail data', display_id)['globalData']['product']
return {
**traverse_obj(product_data, ('mediaItems', lambda _, v: v['type'] == 'video' and v['videoId'], any, {
'id': ('videoId', {int}, {str_or_none}),
'duration': ('duration', {int_or_none}),
'thumbnail': ('videoCoverUrl', {url_or_none}),
'formats': ('videoUrl', lambda _, v: url_or_none(v['videoUrl']), {
'url': 'videoUrl',
'format_id': ('definition', {str_or_none}),
'tbr': ('bitrate', {int_or_none}),
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
'filesize': ('length', {int_or_none}),
}),
})),
'title': traverse_obj(product_data, ('subject', {str})),
'display_id': display_id,
}

View File

@@ -704,6 +704,24 @@ class YoutubeWebArchiveIE(InfoExtractor):
'thumbnail': 'https://web.archive.org/web/20160108040020if_/https://i.ytimg.com/vi/SQCom7wjGDs/maxresdefault.jpg',
'upload_date': '20160107',
},
}, {
# demuxed formats
'url': 'https://web.archive.org/web/20240922160632/https://www.youtube.com/watch?v=z7hzvTL3k1k',
'info_dict': {
'id': 'z7hzvTL3k1k',
'ext': 'webm',
'title': 'Praise the Lord and Pass the Ammunition (BARRXN REMIX)',
'description': 'md5:45dbf2c71c23b0734c8dfb82dd1e94b6',
'uploader': 'Barrxn',
'uploader_id': 'TheRockstar6086',
'uploader_url': 'https://www.youtube.com/user/TheRockstar6086',
'channel_id': 'UCjJPGUTtvR9uizmawn2ThqA',
'channel_url': 'https://www.youtube.com/channel/UCjJPGUTtvR9uizmawn2ThqA',
'duration': 125,
'thumbnail': r're:https?://.*\.(jpg|webp)',
'upload_date': '20201207',
},
'params': {'format': 'bv'},
}, {
'url': 'https://web.archive.org/web/http://www.youtube.com/watch?v=kH-G_aIBlFw',
'only_matching': True,
@@ -1060,6 +1078,19 @@ def _get_capture_dates(self, video_id, url_date):
capture_dates.extend([self._OLDEST_CAPTURE_DATE, self._NEWEST_CAPTURE_DATE])
return orderedSet(filter(None, capture_dates))
def _parse_fmt(self, fmt, extra_info=None):
format_id = traverse_obj(fmt, ('url', {parse_qs}, 'itag', 0))
return {
'format_id': format_id,
**self._FORMATS.get(format_id, {}),
**traverse_obj(fmt, {
'url': ('url', {lambda x: f'https://web.archive.org/web/2id_/{x}'}),
'ext': ('ext', {str}),
'filesize': ('url', {parse_qs}, 'clen', 0, {int_or_none}),
}),
**(extra_info or {}),
}
def _real_extract(self, url):
video_id, url_date, url_date_2 = self._match_valid_url(url).group('id', 'date', 'date2')
url_date = url_date or url_date_2
@@ -1090,17 +1121,14 @@ def _real_extract(self, url):
info['thumbnails'] = self._extract_thumbnails(video_id)
formats = []
for fmt in traverse_obj(video_info, ('formats', lambda _, v: url_or_none(v['url']))):
format_id = traverse_obj(fmt, ('url', {parse_qs}, 'itag', 0))
formats.append({
'format_id': format_id,
**self._FORMATS.get(format_id, {}),
**traverse_obj(fmt, {
'url': ('url', {lambda x: f'https://web.archive.org/web/2id_/{x}'}),
'ext': ('ext', {str}),
'filesize': ('url', {parse_qs}, 'clen', 0, {int_or_none}),
}),
})
if video_info.get('dmux'):
for vf in traverse_obj(video_info, ('formats', 'video', lambda _, v: url_or_none(v['url']))):
formats.append(self._parse_fmt(vf, {'acodec': 'none'}))
for af in traverse_obj(video_info, ('formats', 'audio', lambda _, v: url_or_none(v['url']))):
formats.append(self._parse_fmt(af, {'vcodec': 'none'}))
else:
for fmt in traverse_obj(video_info, ('formats', lambda _, v: url_or_none(v['url']))):
formats.append(self._parse_fmt(fmt))
info['formats'] = formats
return info
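The itag and clen lookups in _parse_fmt read query parameters straight off the archived googlevideo URL; for example (URL hypothetical):

from yt_dlp.utils import parse_qs

url = 'https://example.googlevideo.com/videoplayback?itag=22&clen=12345'
assert parse_qs(url)['itag'][0] == '22'
assert parse_qs(url)['clen'][0] == '12345'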

View File

@@ -0,0 +1,74 @@
import re
from .common import InfoExtractor
from ..utils.traversal import traverse_obj
class BitmovinIE(InfoExtractor):
_VALID_URL = r'https?://streams\.bitmovin\.com/(?P<id>\w+)'
_EMBED_REGEX = [r'<iframe\b[^>]+\bsrc=["\'](?P<url>(?:https?:)?//streams\.bitmovin\.com/(?P<id>\w+)[^"\']+)']
_TESTS = [{
'url': 'https://streams.bitmovin.com/cqkl1t5giv3lrce7pjbg/embed',
'info_dict': {
'id': 'cqkl1t5giv3lrce7pjbg',
'ext': 'mp4',
'title': 'Developing Osteopathic Residents as Faculty',
'thumbnail': 'https://streams.bitmovin.com/cqkl1t5giv3lrce7pjbg/poster',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://streams.bitmovin.com/cgl9rh94uvs51rqc8jhg/share',
'info_dict': {
'id': 'cgl9rh94uvs51rqc8jhg',
'ext': 'mp4',
'title': 'Big Buck Bunny (Streams Docs)',
'thumbnail': 'https://streams.bitmovin.com/cgl9rh94uvs51rqc8jhg/poster',
},
'params': {'skip_download': 'm3u8'},
}]
_WEBPAGE_TESTS = [{
# bitmovin-stream web component
'url': 'https://www.institutionalinvestor.com/article/2bsw1in1l9k68mp9kritc/video-war-stories-over-board-games/best-case-i-get-fired-war-stories',
'info_dict': {
'id': 'cuiumeil6g115lc4li3g',
'ext': 'mp4',
'title': '[media] War Stories over Board Games: “Best Case: I Get Fired” ',
'thumbnail': 'https://streams.bitmovin.com/cuiumeil6g115lc4li3g/poster',
},
'params': {'skip_download': 'm3u8'},
}, {
# iframe embed
'url': 'https://www.clearblueionizer.com/en/pool-ionizers/mineral-pool-vs-saltwater-pool/',
'info_dict': {
'id': 'cvpvfsm1pf7itg7cfvtg',
'ext': 'mp4',
'title': 'Pool Ionizer vs. Salt Chlorinator',
'thumbnail': 'https://streams.bitmovin.com/cvpvfsm1pf7itg7cfvtg/poster',
},
'params': {'skip_download': 'm3u8'},
}]
@classmethod
def _extract_embed_urls(cls, url, webpage):
yield from super()._extract_embed_urls(url, webpage)
for stream_id in re.findall(r'<bitmovin-stream\b[^>]*\bstream-id=["\'](?P<id>\w+)', webpage):
yield f'https://streams.bitmovin.com/{stream_id}'
def _real_extract(self, url):
video_id = self._match_id(url)
player_config = self._download_json(
f'https://streams.bitmovin.com/{video_id}/config', video_id)['sources']
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
player_config['hls'], video_id, 'mp4')
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
**traverse_obj(player_config, {
'title': ('title', {str}),
'thumbnail': ('poster', {str}),
}),
}
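Besides iframe embeds, the extractor now recognizes the <bitmovin-stream> web component; the added regex matches markup such as this assumed snippet:

import re

html = '<bitmovin-stream stream-id="cuiumeil6g115lc4li3g" autoplay></bitmovin-stream>'
assert re.findall(r'<bitmovin-stream\b[^>]*\bstream-id=["\'](?P<id>\w+)', html) == ['cuiumeil6g115lc4li3g']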

View File

@@ -1,5 +1,6 @@
from .common import InfoExtractor
from ..utils import int_or_none
from ..utils import int_or_none, url_or_none
from ..utils.traversal import traverse_obj
class DigitekaIE(InfoExtractor):
@@ -25,74 +26,56 @@ class DigitekaIE(InfoExtractor):
)/(?P<id>[\d+a-z]+)'''
_EMBED_REGEX = [r'<(?:iframe|script)[^>]+src=["\'](?P<url>(?:https?:)?//(?:www\.)?ultimedia\.com/deliver/(?:generic|musique)(?:/[^/]+)*/(?:src|article)/[\d+a-z]+)']
_TESTS = [{
# news
'url': 'https://www.ultimedia.com/default/index/videogeneric/id/s8uk0r',
'md5': '276a0e49de58c7e85d32b057837952a2',
'url': 'https://www.ultimedia.com/default/index/videogeneric/id/3x5x55k',
'info_dict': {
'id': 's8uk0r',
'id': '3x5x55k',
'ext': 'mp4',
'title': 'Loi sur la fin de vie: le texte prévoit un renforcement des directives anticipées',
'title': 'Il est passionné de DS',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 74,
'upload_date': '20150317',
'timestamp': 1426604939,
'uploader_id': '3fszv',
'duration': 89,
'upload_date': '20251012',
'timestamp': 1760285363,
'uploader_id': '3pz33',
},
}, {
# music
'url': 'https://www.ultimedia.com/default/index/videomusic/id/xvpfp8',
'md5': '2ea3513813cf230605c7e2ffe7eca61c',
'info_dict': {
'id': 'xvpfp8',
'ext': 'mp4',
'title': 'Two - C\'est La Vie (clip)',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 233,
'upload_date': '20150224',
'timestamp': 1424760500,
'uploader_id': '3rfzk',
},
}, {
'url': 'https://www.digiteka.net/deliver/generic/iframe/mdtk/01637594/src/lqm3kl/zone/1/showtitle/1/autoplay/yes',
'only_matching': True,
'params': {'skip_download': True},
}]
_IFRAME_MD_ID = '01836272' # One static ID working for Ultimedia iframes
def _real_extract(self, url):
mobj = self._match_valid_url(url)
video_id = mobj.group('id')
video_type = mobj.group('embed_type') or mobj.group('site_type')
if video_type == 'music':
video_type = 'musique'
video_id = self._match_id(url)
deliver_info = self._download_json(
f'http://www.ultimedia.com/deliver/video?video={video_id}&topic={video_type}',
video_id)
yt_id = deliver_info.get('yt_id')
if yt_id:
return self.url_result(yt_id, 'Youtube')
jwconf = deliver_info['jwconf']
video_info = self._download_json(
f'https://www.ultimedia.com/player/getConf/{self._IFRAME_MD_ID}/1/{video_id}', video_id,
note='Downloading player configuration')['video']
formats = []
for source in jwconf['playlist'][0]['sources']:
formats.append({
'url': source['file'],
'format_id': source.get('label'),
})
subtitles = {}
title = deliver_info['title']
thumbnail = jwconf.get('image')
duration = int_or_none(deliver_info.get('duration'))
timestamp = int_or_none(deliver_info.get('release_time'))
uploader_id = deliver_info.get('owner_id')
if hls_url := traverse_obj(video_info, ('media_sources', 'hls', 'hls_auto', {url_or_none})):
fmts, subs = self._extract_m3u8_formats_and_subtitles(
hls_url, video_id, 'mp4', m3u8_id='hls', fatal=False)
formats.extend(fmts)
self._merge_subtitles(subs, target=subtitles)
for format_id, mp4_url in traverse_obj(video_info, ('media_sources', 'mp4', {dict.items}, ...)):
if not mp4_url:
continue
formats.append({
'url': mp4_url,
'format_id': format_id,
'height': int_or_none(format_id.partition('_')[2]),
'ext': 'mp4',
})
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'duration': duration,
'timestamp': timestamp,
'uploader_id': uploader_id,
'formats': formats,
'subtitles': subtitles,
**traverse_obj(video_info, {
'title': ('title', {str}),
'thumbnail': ('image', {url_or_none}),
'duration': ('duration', {int_or_none}),
'timestamp': ('creationDate', {int_or_none}),
'uploader_id': ('ownerId', {str}),
}),
}
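The MP4 loop relies on {dict.items} inside a traversal path to turn the media_sources mapping into (format_id, url) pairs, then recovers the height from the format_id suffix. A self-contained example with assumed data:

from yt_dlp.utils.traversal import traverse_obj

video_info = {'media_sources': {'mp4': {'mp4_480': 'https://cdn.example/480.mp4'}}}
pairs = traverse_obj(video_info, ('media_sources', 'mp4', {dict.items}, ...))
assert pairs == [('mp4_480', 'https://cdn.example/480.mp4')]
assert int(pairs[0][0].partition('_')[2]) == 480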

View File

@@ -5,6 +5,7 @@
from ..networking import Request
from ..utils import (
ExtractorError,
UserNotLive,
js_to_json,
traverse_obj,
update_url_query,
@@ -205,6 +206,9 @@ def _real_extract(self, url):
'client_app': 'browser_hls',
'ipv6': '',
}), headers={'X-Requested-With': 'XMLHttpRequest'})
# A non-zero 'status' indicates the stream is not live, so check truthiness
if traverse_obj(control_server, ('status', {int})) and 'control_token' not in control_server:
raise UserNotLive(video_id=video_id)
self._set_cookie('live.fc2.com', 'l_ortkn', control_server['orz_raw'])
ws_url = update_url_query(control_server['url'], {'control_token': control_server['control_token']})
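A minimal illustration of the offline check, with assumed control-server response shapes:

offline = {'status': 8, 'orz_raw': '...'}        # not live: non-zero status, no token
live = {'status': 0, 'control_token': 'abc123'}  # live: zero status plus a token

def is_offline(control_server):
    return bool(control_server.get('status')) and 'control_token' not in control_server

assert is_offline(offline) and not is_offline(live)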

View File

@@ -109,6 +109,17 @@ def _real_extract(self, url):
'hls_media_playlist_data': m3u8_data,
'hls_aes': hls_aes or None,
})
subtitles = {}
automatic_captions = {}
for sub_data in traverse_obj(metadata, ('textTracks', lambda _, v: url_or_none(v['src']))):
sub_lang = sub_data.get('language') or 'en'
sub_entry = {'url': sub_data['src']}
if sub_data.get('generated'):
automatic_captions.setdefault(sub_lang, []).append(sub_entry)
else:
subtitles.setdefault(sub_lang, []).append(sub_entry)
items.append({
**common_info,
'id': media_id,
@@ -118,6 +129,8 @@ def _real_extract(self, url):
'thumbnail': ('thumbnail', 'path', {url_or_none}),
}),
'formats': formats,
'subtitles': subtitles,
'automatic_captions': automatic_captions,
})
post_info = {

yt_dlp/extractor/frontro.py Normal file
View File

@@ -0,0 +1,164 @@
import json
from .common import InfoExtractor
from ..utils import int_or_none, parse_iso8601, url_or_none
from ..utils.traversal import traverse_obj
class FrontroBaseIE(InfoExtractor):
def _get_auth_headers(self, url):
return traverse_obj(self._get_cookies(url), {
'authorization': ('frAccessToken', 'value', {lambda token: f'Bearer {token}' if token else None}),
})
class FrontroVideoBaseIE(FrontroBaseIE):
_CHANNEL_ID = None
def _real_extract(self, url):
video_id = self._match_id(url)
metadata = self._download_json(
'https://api.frontrow.cc/query', video_id, data=json.dumps({
'operationName': 'Video',
'variables': {'channelID': self._CHANNEL_ID, 'videoID': video_id},
'query': '''query Video($channelID: ID!, $videoID: ID!) {
video(ChannelID: $channelID, VideoID: $videoID) {
... on Video {title description updatedAt thumbnail createdAt duration likeCount comments views url hasAccess}
}
}''',
}).encode(), headers={
'content-type': 'application/json',
**self._get_auth_headers(url),
})['data']['video']
if not traverse_obj(metadata, 'hasAccess'):
self.raise_login_required()
formats, subtitles = self._extract_m3u8_formats_and_subtitles(metadata['url'], video_id)
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
**traverse_obj(metadata, {
'title': ('title', {str}),
'description': ('description', {str}),
'thumbnail': ('thumbnail', {url_or_none}),
'timestamp': ('createdAt', {parse_iso8601}),
'modified_timestamp': ('updatedAt', {parse_iso8601}),
'duration': ('duration', {int_or_none}),
'like_count': ('likeCount', {int_or_none}),
'comment_count': ('comments', {int_or_none}),
'view_count': ('views', {int_or_none}),
}),
}
class FrontroGroupBaseIE(FrontroBaseIE):
_CHANNEL_ID = None
_VIDEO_EXTRACTOR = None
_VIDEO_URL_TMPL = None
def _real_extract(self, url):
group_id = self._match_id(url)
metadata = self._download_json(
'https://api.frontrow.cc/query', group_id, note='Downloading playlist metadata',
data=json.dumps({
'operationName': 'PaginatedStaticPageContainer',
'variables': {'channelID': self._CHANNEL_ID, 'first': 500, 'pageContainerID': group_id},
'query': '''query PaginatedStaticPageContainer($channelID: ID!, $pageContainerID: ID!) {
pageContainer(ChannelID: $channelID, PageContainerID: $pageContainerID) {
... on StaticPageContainer { id title updatedAt createdAt itemRefs {edges {node {
id contentItem { ... on ItemVideo { videoItem: item {
id
}}}
}}}
}
}
}''',
}).encode(), headers={
'content-type': 'application/json',
**self._get_auth_headers(url),
})['data']['pageContainer']
entries = []
for video_id in traverse_obj(metadata, (
'itemRefs', 'edges', ..., 'node', 'contentItem', 'videoItem', 'id', {str}),
):
entries.append(self.url_result(
self._VIDEO_URL_TMPL % video_id, self._VIDEO_EXTRACTOR, video_id))
return {
'_type': 'playlist',
'id': group_id,
'entries': entries,
**traverse_obj(metadata, {
'title': ('title', {str}),
'timestamp': ('createdAt', {parse_iso8601}),
'modified_timestamp': ('updatedAt', {parse_iso8601}),
}),
}
class TheChosenIE(FrontroVideoBaseIE):
_CHANNEL_ID = '12884901895'
_VALID_URL = r'https?://(?:www\.)?watch\.thechosen\.tv/video/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://watch.thechosen.tv/video/184683594325',
'md5': '3f878b689588c71b38ec9943c54ff5b0',
'info_dict': {
'id': '184683594325',
'ext': 'mp4',
'title': 'Season 3 Episode 2: Two by Two',
'description': 'md5:174c373756ecc8df46b403f4fcfbaf8c',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 4212,
'thumbnail': r're:https://fastly\.frontrowcdn\.com/channels/12884901895/VIDEO_THUMBNAIL/184683594325/',
'timestamp': 1698954546,
'upload_date': '20231102',
'modified_timestamp': int,
'modified_date': str,
},
}, {
'url': 'https://watch.thechosen.tv/video/184683596189',
'md5': 'd581562f9d29ce82f5b7770415334151',
'info_dict': {
'id': '184683596189',
'ext': 'mp4',
'title': 'Season 4 Episode 8: Humble',
'description': 'md5:20a57bead43da1cf77cd5b0fe29bbc76',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 5092,
'thumbnail': r're:https://fastly\.frontrowcdn\.com/channels/12884901895/VIDEO_THUMBNAIL/184683596189/',
'timestamp': 1715019474,
'upload_date': '20240506',
'modified_timestamp': int,
'modified_date': str,
},
}]
class TheChosenGroupIE(FrontroGroupBaseIE):
_CHANNEL_ID = '12884901895'
_VIDEO_EXTRACTOR = TheChosenIE
_VIDEO_URL_TMPL = 'https://watch.thechosen.tv/video/%s'
_VALID_URL = r'https?://(?:www\.)?watch\.thechosen\.tv/group/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://watch.thechosen.tv/group/309237658592',
'info_dict': {
'id': '309237658592',
'title': 'Season 3',
'timestamp': 1746203969,
'upload_date': '20250502',
'modified_timestamp': int,
'modified_date': str,
},
'playlist_count': 8,
}]
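_get_auth_headers maps the site's frAccessToken cookie onto a Bearer header; equivalent standalone logic (cookie values assumed):

def auth_headers(cookies):
    token = cookies.get('frAccessToken')
    return {'authorization': f'Bearer {token}'} if token else {}

assert auth_headers({'frAccessToken': 'abc123'}) == {'authorization': 'Bearer abc123'}
assert auth_headers({}) == {}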

View File

@@ -98,7 +98,7 @@ def _real_extract(self, url):
formats = []
for stream_url in traverse_obj(playback_data, ('sources', 'HLS', ..., 'file', {url_or_none})):
stream_url = re.sub(r'/playlist(?:_pd\d+)?\.m3u8', '/index.m3u8', stream_url)
stream_url = re.sub(r'/playlist_pd\d+\.m3u8', '/playlist.m3u8', stream_url)
formats.extend(self._extract_m3u8_formats(stream_url, video_id, fatal=False))
metadata = self._download_json(

View File

@@ -8,12 +8,10 @@
ExtractorError,
determine_ext,
filter_dict,
get_first,
int_or_none,
parse_iso8601,
update_url,
url_or_none,
variadic,
)
from ..utils.traversal import traverse_obj
@@ -51,7 +49,7 @@ class LoomIE(InfoExtractor):
}, {
# m3u8 raw-url, mp4 transcoded-url, cdn url == raw-url, vtt sub and json subs
'url': 'https://www.loom.com/share/9458bcbf79784162aa62ffb8dd66201b',
'md5': '51737ec002969dd28344db4d60b9cbbb',
'md5': '7b6bfdef8181c4ffc376e18919a4dcc2',
'info_dict': {
'id': '9458bcbf79784162aa62ffb8dd66201b',
'ext': 'mp4',
@@ -71,12 +69,13 @@ class LoomIE(InfoExtractor):
'ext': 'webm',
'title': 'OMFG clown',
'description': 'md5:285c5ee9d62aa087b7e3271b08796815',
'uploader': 'MrPumkin B',
'uploader': 'Brailey Bragg',
'upload_date': '20210924',
'timestamp': 1632519618,
'duration': 210,
},
'params': {'skip_download': 'dash'},
'expected_warnings': ['Failed to parse JSON'], # transcoded-url no longer available
}, {
# password-protected
'url': 'https://www.loom.com/share/50e26e8aeb7940189dff5630f95ce1f4',
@@ -91,10 +90,11 @@ class LoomIE(InfoExtractor):
'duration': 35,
},
'params': {'videopassword': 'seniorinfants2'},
'expected_warnings': ['Failed to parse JSON'], # transcoded-url no longer available
}, {
# embed, transcoded-url endpoint sends empty JSON response, split video and audio HLS formats
'url': 'https://www.loom.com/embed/ddcf1c1ad21f451ea7468b1e33917e4e',
'md5': 'b321d261656848c184a94e3b93eae28d',
'md5': 'f983a0f02f24331738b2f43aecb05256',
'info_dict': {
'id': 'ddcf1c1ad21f451ea7468b1e33917e4e',
'ext': 'mp4',
@@ -119,11 +119,12 @@ class LoomIE(InfoExtractor):
'duration': 247,
'timestamp': 1676274030,
},
'skip': '404 Not Found',
}]
_GRAPHQL_VARIABLES = {
'GetVideoSource': {
'acceptableMimes': ['DASH', 'M3U8', 'MP4'],
'acceptableMimes': ['DASH', 'M3U8', 'MP4', 'WEBM'],
},
}
_GRAPHQL_QUERIES = {
@@ -192,6 +193,12 @@ class LoomIE(InfoExtractor):
id
nullableRawCdnUrl(acceptableMimes: $acceptableMimes, password: $password) {
url
credentials {
Policy
Signature
KeyPairId
__typename
}
__typename
}
__typename
@@ -240,9 +247,9 @@ class LoomIE(InfoExtractor):
}
}\n'''),
}
_APOLLO_GRAPHQL_VERSION = '0a1856c'
_APOLLO_GRAPHQL_VERSION = '45a5bd4'
def _call_graphql_api(self, operations, video_id, note=None, errnote=None):
def _call_graphql_api(self, operation_name, video_id, note=None, errnote=None, fatal=True):
password = self.get_param('videopassword')
return self._download_json(
'https://www.loom.com/graphql', video_id, note or 'Downloading GraphQL JSON',
@@ -252,7 +259,9 @@ def _call_graphql_api(self, operations, video_id, note=None, errnote=None):
'x-loom-request-source': f'loom_web_{self._APOLLO_GRAPHQL_VERSION}',
'apollographql-client-name': 'web',
'apollographql-client-version': self._APOLLO_GRAPHQL_VERSION,
}, data=json.dumps([{
'graphql-operation-name': operation_name,
'Origin': 'https://www.loom.com',
}, data=json.dumps({
'operationName': operation_name,
'variables': {
'videoId': video_id,
@@ -260,7 +269,7 @@ def _call_graphql_api(self, operations, video_id, note=None, errnote=None):
**self._GRAPHQL_VARIABLES.get(operation_name, {}),
},
'query': self._GRAPHQL_QUERIES[operation_name],
} for operation_name in variadic(operations)], separators=(',', ':')).encode())
}, separators=(',', ':')).encode(), fatal=fatal)
def _call_url_api(self, endpoint, video_id):
response = self._download_json(
@@ -275,7 +284,7 @@ def _call_url_api(self, endpoint, video_id):
}, separators=(',', ':')).encode())
return traverse_obj(response, ('url', {url_or_none}))
def _extract_formats(self, video_id, metadata, gql_data):
def _extract_formats(self, video_id, metadata, video_data):
formats = []
video_properties = traverse_obj(metadata, ('video_properties', {
'width': ('width', {int_or_none}),
@@ -330,7 +339,7 @@ def get_formats(format_url, format_id, quality):
transcoded_url = self._call_url_api('transcoded-url', video_id)
formats.extend(get_formats(transcoded_url, 'transcoded', quality=-1)) # transcoded quality
cdn_url = get_first(gql_data, ('data', 'getVideo', 'nullableRawCdnUrl', 'url', {url_or_none}))
cdn_url = traverse_obj(video_data, ('data', 'getVideo', 'nullableRawCdnUrl', 'url', {url_or_none}))
# cdn_url is usually a dupe, but the raw-url/transcoded-url endpoints could return errors
valid_urls = [update_url(url, query=None) for url in (raw_url, transcoded_url) if url]
if cdn_url and update_url(cdn_url, query=None) not in valid_urls:
@@ -338,10 +347,21 @@ def get_formats(format_url, format_id, quality):
return formats
def _get_subtitles(self, video_id):
subs_data = self._call_graphql_api(
'FetchVideoTranscript', video_id, 'Downloading GraphQL subtitles JSON', fatal=False)
return filter_dict({
'en': traverse_obj(subs_data, (
'data', 'fetchVideoTranscript',
('source_url', 'captions_source_url'), {
'url': {url_or_none},
})) or None,
})
def _real_extract(self, url):
video_id = self._match_id(url)
metadata = get_first(
self._call_graphql_api('GetVideoSSR', video_id, 'Downloading GraphQL metadata JSON'),
metadata = traverse_obj(
self._call_graphql_api('GetVideoSSR', video_id, 'Downloading GraphQL metadata JSON', fatal=False),
('data', 'getVideo', {dict})) or {}
if metadata.get('__typename') == 'VideoPasswordMissingOrIncorrect':
@@ -350,22 +370,19 @@ def _real_extract(self, url):
'This video is password-protected, use the --video-password option', expected=True)
raise ExtractorError('Invalid video password', expected=True)
gql_data = self._call_graphql_api(['FetchChapters', 'FetchVideoTranscript', 'GetVideoSource'], video_id)
video_data = self._call_graphql_api(
'GetVideoSource', video_id, 'Downloading GraphQL video JSON')
chapter_data = self._call_graphql_api(
'FetchChapters', video_id, 'Downloading GraphQL chapters JSON', fatal=False)
duration = traverse_obj(metadata, ('video_properties', 'duration', {int_or_none}))
return {
'id': video_id,
'duration': duration,
'chapters': self._extract_chapters_from_description(
get_first(gql_data, ('data', 'fetchVideoChapters', 'content', {str})), duration) or None,
'formats': self._extract_formats(video_id, metadata, gql_data),
'subtitles': filter_dict({
'en': traverse_obj(gql_data, (
..., 'data', 'fetchVideoTranscript',
('source_url', 'captions_source_url'), {
'url': {url_or_none},
})) or None,
}),
traverse_obj(chapter_data, ('data', 'fetchVideoChapters', 'content', {str})), duration) or None,
'formats': self._extract_formats(video_id, metadata, video_data),
'subtitles': self.extract_subtitles(video_id),
**traverse_obj(metadata, {
'title': ('name', {str}),
'description': ('description', {str}),
@@ -376,6 +393,7 @@ def _real_extract(self, url):
class LoomFolderIE(InfoExtractor):
_WORKING = False
IE_NAME = 'loom:folder'
_VALID_URL = r'https?://(?:www\.)?loom\.com/share/folder/(?P<id>[\da-f]{32})'
_TESTS = [{
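The switch to self.extract_subtitles(video_id) leans on InfoExtractor's naming convention: defining _get_subtitles() makes extract_subtitles() available, and the base class only invokes it when subtitles were actually requested. A simplified, assumed sketch of the pattern:

from yt_dlp.extractor.common import InfoExtractor

class ExampleIE(InfoExtractor):
    _VALID_URL = r'https?://example\.com/v/(?P<id>\w+)'  # hypothetical

    def _get_subtitles(self, video_id):
        # called lazily via self.extract_subtitles(video_id)
        return {'en': [{'url': f'https://example.com/{video_id}.vtt'}]}

    def _real_extract(self, url):
        video_id = self._match_id(url)
        return {'id': video_id, 'subtitles': self.extract_subtitles(video_id)}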

View File

@@ -1,7 +1,9 @@
import re
import functools
import math
from .common import InfoExtractor
from ..utils import (
InAdvancePagedList,
clean_html,
int_or_none,
parse_iso8601,
@@ -10,15 +12,64 @@
from ..utils.traversal import require, traverse_obj
class MaveIE(InfoExtractor):
_VALID_URL = r'https?://(?P<channel>[\w-]+)\.mave\.digital/(?P<id>ep-\d+)'
class MaveBaseIE(InfoExtractor):
_API_BASE_URL = 'https://api.mave.digital/v1/website'
_API_BASE_STORAGE_URL = 'https://store.cloud.mts.ru/mave/'
def _load_channel_meta(self, channel_id, display_id):
return traverse_obj(self._download_json(
f'{self._API_BASE_URL}/{channel_id}/', display_id,
note='Downloading channel metadata'), 'podcast')
def _load_episode_meta(self, channel_id, episode_code, display_id):
return self._download_json(
f'{self._API_BASE_URL}/{channel_id}/episodes/{episode_code}',
display_id, note='Downloading episode metadata')
def _create_entry(self, channel_id, channel_meta, episode_meta):
episode_code = traverse_obj(episode_meta, ('code', {int}, {require('episode code')}))
return {
'display_id': f'{channel_id}-{episode_code}',
'extractor_key': MaveIE.ie_key(),
'extractor': MaveIE.IE_NAME,
'webpage_url': f'https://{channel_id}.mave.digital/ep-{episode_code}',
'channel_id': channel_id,
'channel_url': f'https://{channel_id}.mave.digital/',
'vcodec': 'none',
**traverse_obj(episode_meta, {
'id': ('id', {str}),
'url': ('audio', {urljoin(self._API_BASE_STORAGE_URL)}),
'title': ('title', {str}),
'description': ('description', {clean_html}),
'thumbnail': ('image', {urljoin(self._API_BASE_STORAGE_URL)}),
'duration': ('duration', {int_or_none}),
'season_number': ('season', {int_or_none}),
'episode_number': ('number', {int_or_none}),
'view_count': ('listenings', {int_or_none}),
'like_count': ('reactions', lambda _, v: v['type'] == 'like', 'count', {int_or_none}, any),
'dislike_count': ('reactions', lambda _, v: v['type'] == 'dislike', 'count', {int_or_none}, any),
'age_limit': ('is_explicit', {bool}, {lambda x: 18 if x else None}),
'timestamp': ('publish_date', {parse_iso8601}),
}),
**traverse_obj(channel_meta, {
'series_id': ('id', {str}),
'series': ('title', {str}),
'channel': ('title', {str}),
'uploader': ('author', {str}),
}),
}
class MaveIE(MaveBaseIE):
IE_NAME = 'mave'
_VALID_URL = r'https?://(?P<channel_id>[\w-]+)\.mave\.digital/ep-(?P<episode_code>\d+)'
_TESTS = [{
'url': 'https://ochenlichnoe.mave.digital/ep-25',
'md5': 'aa3e513ef588b4366df1520657cbc10c',
'info_dict': {
'id': '4035f587-914b-44b6-aa5a-d76685ad9bc2',
'ext': 'mp3',
'display_id': 'ochenlichnoe-ep-25',
'display_id': 'ochenlichnoe-25',
'title': 'Между мной и миром: психология самооценки',
'description': 'md5:4b7463baaccb6982f326bce5c700382a',
'uploader': 'Самарский университет',
@@ -45,7 +96,7 @@ class MaveIE(InfoExtractor):
'info_dict': {
'id': '41898bb5-ff57-4797-9236-37a8e537aa21',
'ext': 'mp3',
'display_id': 'budem-ep-12',
'display_id': 'budem-12',
'title': 'Екатерина Михайлова: "Горе от ума" не про женщин написана',
'description': 'md5:fa3bdd59ee829dfaf16e3efcb13f1d19',
'uploader': 'Полина Цветкова+Евгения Акопова',
@@ -68,40 +119,72 @@ class MaveIE(InfoExtractor):
'upload_date': '20241230',
},
}]
_API_BASE_URL = 'https://api.mave.digital/'
def _real_extract(self, url):
channel_id, slug = self._match_valid_url(url).group('channel', 'id')
display_id = f'{channel_id}-{slug}'
webpage = self._download_webpage(url, display_id)
data = traverse_obj(
self._search_nuxt_json(webpage, display_id),
('data', lambda _, v: v['activeEpisodeData'], any, {require('podcast data')}))
channel_id, episode_code = self._match_valid_url(url).group(
'channel_id', 'episode_code')
display_id = f'{channel_id}-{episode_code}'
channel_meta = self._load_channel_meta(channel_id, display_id)
episode_meta = self._load_episode_meta(channel_id, episode_code, display_id)
return self._create_entry(channel_id, channel_meta, episode_meta)
class MaveChannelIE(MaveBaseIE):
IE_NAME = 'mave:channel'
_VALID_URL = r'https?://(?P<id>[\w-]+)\.mave\.digital/?(?:$|[?#])'
_TESTS = [{
'url': 'https://budem.mave.digital/',
'info_dict': {
'id': 'budem',
'title': 'Все там будем',
'description': 'md5:f04ae12a42be0f1d765c5e326b41987a',
},
'playlist_mincount': 15,
}, {
'url': 'https://ochenlichnoe.mave.digital/',
'info_dict': {
'id': 'ochenlichnoe',
'title': 'Очень личное',
'description': 'md5:ee36a6a52546b91b487fe08c552fdbb2',
},
'playlist_mincount': 20,
}, {
'url': 'https://geekcity.mave.digital/',
'info_dict': {
'id': 'geekcity',
'title': 'Мужчины в трико',
'description': 'md5:4164d425d60a0d97abdce9d1f6f8e049',
},
'playlist_mincount': 80,
}]
_PAGE_SIZE = 50
def _entries(self, channel_id, channel_meta, page_num):
page_data = self._download_json(
f'{self._API_BASE_URL}/{channel_id}/episodes', channel_id, query={
'view': 'all',
'page': page_num + 1,
'sort': 'newest',
'format': 'all',
}, note=f'Downloading page {page_num + 1}')
for ep in traverse_obj(page_data, ('episodes', lambda _, v: v['audio'] and v['id'])):
yield self._create_entry(channel_id, channel_meta, ep)
def _real_extract(self, url):
channel_id = self._match_id(url)
channel_meta = self._load_channel_meta(channel_id, channel_id)
return {
'display_id': display_id,
'channel_id': channel_id,
'channel_url': f'https://{channel_id}.mave.digital/',
'vcodec': 'none',
'thumbnail': re.sub(r'_\d+(?=\.(?:jpg|png))', '', self._og_search_thumbnail(webpage, default='')) or None,
**traverse_obj(data, ('activeEpisodeData', {
'url': ('audio', {urljoin(self._API_BASE_URL)}),
'id': ('id', {str}),
'_type': 'playlist',
'id': channel_id,
**traverse_obj(channel_meta, {
'title': ('title', {str}),
'description': ('description', {clean_html}),
'duration': ('duration', {int_or_none}),
'season_number': ('season', {int_or_none}),
'episode_number': ('number', {int_or_none}),
'view_count': ('listenings', {int_or_none}),
'like_count': ('reactions', lambda _, v: v['type'] == 'like', 'count', {int_or_none}, any),
'dislike_count': ('reactions', lambda _, v: v['type'] == 'dislike', 'count', {int_or_none}, any),
'age_limit': ('is_explicit', {bool}, {lambda x: 18 if x else None}),
'timestamp': ('publish_date', {parse_iso8601}),
})),
**traverse_obj(data, ('podcast', 'podcast', {
'series_id': ('id', {str}),
'series': ('title', {str}),
'channel': ('title', {str}),
'uploader': ('author', {str}),
})),
'description': ('description', {str}),
}),
'entries': InAdvancePagedList(
functools.partial(self._entries, channel_id, channel_meta),
math.ceil(channel_meta['episodes_count'] / self._PAGE_SIZE), self._PAGE_SIZE),
}
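InAdvancePagedList fits here because episodes_count is known from the channel metadata, so the page count can be computed up front and each page fetched lazily. A standalone sketch with assumed data:

import functools
import math
from yt_dlp.utils import InAdvancePagedList

PAGE_SIZE = 50
episodes_count = 123  # assumed, from channel metadata

def fetch_page(channel_id, page_num):  # page_num is 0-based
    # would request .../episodes with 'page': page_num + 1 and yield entries
    yield from ()

entries = InAdvancePagedList(
    functools.partial(fetch_page, 'budem'),
    math.ceil(episodes_count / PAGE_SIZE), PAGE_SIZE)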

View File

@@ -1,14 +1,9 @@
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
float_or_none,
format_field,
int_or_none,
str_or_none,
traverse_obj,
url_or_none,
)
from ..utils.traversal import traverse_obj
class MedalTVIE(InfoExtractor):
@@ -30,25 +25,8 @@ class MedalTVIE(InfoExtractor):
'view_count': int,
'like_count': int,
'duration': 13,
},
}, {
'url': 'https://medal.tv/games/cod-cold-war/clips/2mA60jWAGQCBH',
'md5': 'fc7a3e4552ae8993c1c4006db46be447',
'info_dict': {
'id': '2mA60jWAGQCBH',
'ext': 'mp4',
'title': 'Quad Cold',
'description': 'Medal,https://medal.tv/desktop/',
'uploader': 'MowgliSB',
'timestamp': 1603165266,
'upload_date': '20201020',
'uploader_id': '10619174',
'thumbnail': 'https://cdn.medal.tv/10619174/thumbnail-34934644-720p.jpg?t=1080p&c=202042&missing',
'uploader_url': 'https://medal.tv/users/10619174',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 23,
'thumbnail': r're:https://cdn\.medal\.tv/ugcp/content-thumbnail/.*\.jpg',
'tags': ['headshot', 'valorant', '4k', 'clutch', 'mornu'],
},
}, {
'url': 'https://medal.tv/games/cod-cold-war/clips/2um24TWdty0NA',
@@ -57,12 +35,12 @@ class MedalTVIE(InfoExtractor):
'id': '2um24TWdty0NA',
'ext': 'mp4',
'title': 'u tk me i tk u bigger',
'description': 'Medal,https://medal.tv/desktop/',
'uploader': 'Mimicc',
'description': '',
'uploader': 'zahl',
'timestamp': 1605580939,
'upload_date': '20201117',
'uploader_id': '5156321',
'thumbnail': 'https://cdn.medal.tv/5156321/thumbnail-36787208-360p.jpg?t=1080p&c=202046&missing',
'thumbnail': r're:https://cdn\.medal\.tv/source/.*\.png',
'uploader_url': 'https://medal.tv/users/5156321',
'comment_count': int,
'view_count': int,
@@ -70,91 +48,77 @@ class MedalTVIE(InfoExtractor):
'duration': 9,
},
}, {
'url': 'https://medal.tv/games/valorant/clips/37rMeFpryCC-9',
'only_matching': True,
}, {
# API requires auth
'url': 'https://medal.tv/games/valorant/clips/2WRj40tpY_EU9',
'md5': '6c6bb6569777fd8b4ef7b33c09de8dcf',
'info_dict': {
'id': '2WRj40tpY_EU9',
'ext': 'mp4',
'title': '1v5 clutch',
'description': '',
'uploader': 'adny',
'uploader_id': '6256941',
'uploader_url': 'https://medal.tv/users/6256941',
'comment_count': int,
'view_count': int,
'like_count': int,
'duration': 25,
'thumbnail': r're:https://cdn\.medal\.tv/source/.*\.jpg',
'timestamp': 1612896680,
'upload_date': '20210209',
},
'expected_warnings': ['Video formats are not available through API'],
}, {
'url': 'https://medal.tv/games/valorant/clips/37rMeFpryCC-9',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id, query={'mobilebypass': 'true'})
hydration_data = self._search_json(
r'<script[^>]*>[^<]*\bhydrationData\s*=', webpage,
'next data', video_id, end_pattern='</script>', fatal=False)
clip = traverse_obj(hydration_data, ('clips', ...), get_all=False)
if not clip:
raise ExtractorError(
'Could not find video information.', video_id=video_id)
title = clip['contentTitle']
source_width = int_or_none(clip.get('sourceWidth'))
source_height = int_or_none(clip.get('sourceHeight'))
aspect_ratio = source_width / source_height if source_width and source_height else 16 / 9
def add_item(container, item_url, height, id_key='format_id', item_id=None):
item_id = item_id or '%dp' % height
if item_id not in item_url:
return
container.append({
'url': item_url,
id_key: item_id,
'width': round(aspect_ratio * height),
'height': height,
})
content_data = self._download_json(
f'https://medal.tv/api/content/{video_id}', video_id,
headers={'Accept': 'application/json'})
formats = []
thumbnails = []
for k, v in clip.items():
if not (v and isinstance(v, str)):
continue
mobj = re.match(r'(contentUrl|thumbnail)(?:(\d+)p)?$', k)
if not mobj:
continue
prefix = mobj.group(1)
height = int_or_none(mobj.group(2))
if prefix == 'contentUrl':
add_item(
formats, v, height or source_height,
item_id=None if height else 'source')
elif prefix == 'thumbnail':
add_item(thumbnails, v, height, 'id')
error = clip.get('error')
if not formats and error:
if error == 404:
self.raise_no_formats(
'That clip does not exist.',
expected=True, video_id=video_id)
else:
self.raise_no_formats(
f'An unknown error occurred ({error}).',
video_id=video_id)
# Necessary because the id of the author is not known in advance.
# Won't raise an issue if no profile can be found as this is optional.
author = traverse_obj(hydration_data, ('profiles', ...), get_all=False) or {}
author_id = str_or_none(author.get('userId'))
author_url = format_field(author_id, None, 'https://medal.tv/users/%s')
if m3u8_url := url_or_none(content_data.get('contentUrlHls')):
formats.extend(self._extract_m3u8_formats(m3u8_url, video_id, 'mp4', m3u8_id='hls'))
if http_url := url_or_none(content_data.get('contentUrl')):
formats.append({
'url': http_url,
'format_id': 'http-source',
'ext': 'mp4',
'quality': 1,
})
formats = [fmt for fmt in formats if 'video/privacy-protected-guest' not in fmt['url']]
if not formats:
# Fallback, does not require auth
self.report_warning('Video formats are not available through API, falling back to social video URL')
urlh = self._request_webpage(
f'https://medal.tv/api/content/{video_id}/socialVideoUrl', video_id,
note='Checking social video URL')
formats.append({
'url': urlh.url,
'format_id': 'social-video',
'ext': 'mp4',
'quality': -1,
})
return {
'id': video_id,
'title': title,
'formats': formats,
'thumbnails': thumbnails,
'description': clip.get('contentDescription'),
'uploader': author.get('displayName'),
'timestamp': float_or_none(clip.get('created'), 1000),
'uploader_id': author_id,
'uploader_url': author_url,
'duration': int_or_none(clip.get('videoLengthSeconds')),
'view_count': int_or_none(clip.get('views')),
'like_count': int_or_none(clip.get('likes')),
'comment_count': int_or_none(clip.get('comments')),
**traverse_obj(content_data, {
'title': ('contentTitle', {str}),
'description': ('contentDescription', {str}),
'timestamp': ('created', {int_or_none(scale=1000)}),
'duration': ('videoLengthSeconds', {int_or_none}),
'view_count': ('views', {int_or_none}),
'like_count': ('likes', {int_or_none}),
'comment_count': ('comments', {int_or_none}),
'uploader': ('poster', 'displayName', {str}),
'uploader_id': ('poster', 'userId', {str}),
'uploader_url': ('poster', 'userId', {str}, filter, {lambda x: x and f'https://medal.tv/users/{x}'}),
'tags': ('tags', ..., {str}),
'thumbnail': ('thumbnailUrl', {url_or_none}),
}),
}
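Note the {int_or_none(scale=1000)} step: yt-dlp utils like int_or_none support partial application inside traversal paths, which here converts the millisecond created value into a Unix timestamp:

from yt_dlp.utils import int_or_none

assert int_or_none(scale=1000)(1605580939000) == 1605580939
assert int_or_none(scale=1000)(None) is None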

View File

@@ -0,0 +1,79 @@
from .brightcove import BrightcoveNewIE
from .common import InfoExtractor
from ..utils import parse_iso8601
from ..utils.traversal import require, traverse_obj
class NetAppBaseIE(InfoExtractor):
_BC_URL = 'https://players.brightcove.net/6255154784001/default_default/index.html?videoId={}'
@staticmethod
def _parse_metadata(item):
return traverse_obj(item, {
'title': ('name', {str}),
'description': ('description', {str}),
'timestamp': ('createdAt', {parse_iso8601}),
})
class NetAppVideoIE(NetAppBaseIE):
_VALID_URL = r'https?://media\.netapp\.com/video-detail/(?P<id>[0-9a-f-]+)'
_TESTS = [{
'url': 'https://media.netapp.com/video-detail/da25fc01-82ad-5284-95bc-26920200a222/seamless-storage-for-modern-kubernetes-deployments',
'info_dict': {
'id': '1843620950167202073',
'ext': 'mp4',
'title': 'Seamless storage for modern Kubernetes deployments',
'description': 'md5:1ee39e315243fe71fb90af2796037248',
'uploader_id': '6255154784001',
'duration': 2159.41,
'thumbnail': r're:https://house-fastly-signed-us-east-1-prod\.brightcovecdn\.com/image/.*\.jpg',
'tags': 'count:15',
'timestamp': 1758213949,
'upload_date': '20250918',
},
}, {
'url': 'https://media.netapp.com/video-detail/45593e5d-cf1c-5996-978c-c9081906e69f/unleash-ai-innovation-with-your-data-with-the-netapp-platform',
'only_matching': True,
}]
def _real_extract(self, url):
video_uuid = self._match_id(url)
metadata = self._download_json(
f'https://api.media.netapp.com/client/detail/{video_uuid}', video_uuid)
brightcove_video_id = traverse_obj(metadata, (
'sections', lambda _, v: v['type'] == 'Player', 'video', {str}, any, {require('brightcove video id')}))
video_item = traverse_obj(metadata, ('sections', lambda _, v: v['type'] == 'VideoDetail', any))
return self.url_result(
self._BC_URL.format(brightcove_video_id), BrightcoveNewIE, brightcove_video_id,
url_transparent=True, **self._parse_metadata(video_item))
class NetAppCollectionIE(NetAppBaseIE):
_VALID_URL = r'https?://media\.netapp\.com/collection/(?P<id>[0-9a-f-]+)'
_TESTS = [{
'url': 'https://media.netapp.com/collection/9820e190-f2a6-47ac-9c0a-98e5e64234a4',
'info_dict': {
'title': 'Featured sessions',
'id': '9820e190-f2a6-47ac-9c0a-98e5e64234a4',
},
'playlist_count': 4,
}]
def _entries(self, metadata):
for item in traverse_obj(metadata, ('items', lambda _, v: v['brightcoveVideoId'])):
brightcove_video_id = item['brightcoveVideoId']
yield self.url_result(
self._BC_URL.format(brightcove_video_id), BrightcoveNewIE, brightcove_video_id,
url_transparent=True, **self._parse_metadata(item))
def _real_extract(self, url):
collection_uuid = self._match_id(url)
metadata = self._download_json(
f'https://api.media.netapp.com/client/collection/{collection_uuid}', collection_uuid)
return self.playlist_result(self._entries(metadata), collection_uuid, playlist_title=metadata.get('name'))
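Both extractors defer playback to Brightcove through url_transparent results, so the NetApp-side metadata (title, description, timestamp) overrides whatever BrightcoveNewIE extracts. A condensed, hypothetical variant reusing the base class above:

class ExampleNetAppIE(NetAppBaseIE):  # hypothetical
    def _real_extract(self, url):
        bc_id = '1843620950167202073'  # in practice taken from the detail API
        return self.url_result(
            self._BC_URL.format(bc_id), BrightcoveNewIE, bc_id,
            url_transparent=True, title='NetApp title wins over Brightcove metadata')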

View File

@@ -23,96 +23,38 @@
class NhkBaseIE(InfoExtractor):
_API_URL_TEMPLATE = 'https://nwapi.nhk.jp/nhkworld/%sod%slist/v7b/%s/%s/%s/all%s.json'
_API_URL_TEMPLATE = 'https://api.nhkworld.jp/showsapi/v1/{lang}/{content_format}_{page_type}/{m_id}{extra_page}'
_BASE_URL_REGEX = r'https?://www3\.nhk\.or\.jp/nhkworld/(?P<lang>[a-z]{2})/'
def _call_api(self, m_id, lang, is_video, is_episode, is_clip):
content_format = 'video' if is_video else 'audio'
content_type = 'clips' if is_clip else 'episodes'
if not is_episode:
extra_page = f'/{content_format}_{content_type}'
page_type = 'programs'
else:
extra_page = ''
page_type = content_type
return self._download_json(
self._API_URL_TEMPLATE % (
'v' if is_video else 'r',
'clip' if is_clip else 'esd',
'episode' if is_episode else 'program',
m_id, lang, '/all' if is_video else ''),
m_id, query={'apikey': 'EJfK8jdS57GqlupFgAfAAwr573q01y6k'})['data']['episodes'] or []
def _get_api_info(self, refresh=True):
if not refresh:
return self.cache.load('nhk', 'api_info')
self.cache.store('nhk', 'api_info', {})
movie_player_js = self._download_webpage(
'https://movie-a.nhk.or.jp/world/player/js/movie-player.js', None,
note='Downloading stream API information')
api_info = {
'url': self._search_regex(
r'prod:[^;]+\bapiUrl:\s*[\'"]([^\'"]+)[\'"]', movie_player_js, None, 'stream API url'),
'token': self._search_regex(
r'prod:[^;]+\btoken:\s*[\'"]([^\'"]+)[\'"]', movie_player_js, None, 'stream API token'),
}
self.cache.store('nhk', 'api_info', api_info)
return api_info
def _extract_stream_info(self, vod_id):
for refresh in (False, True):
api_info = self._get_api_info(refresh)
if not api_info:
continue
api_url = api_info.pop('url')
meta = traverse_obj(
self._download_json(
api_url, vod_id, 'Downloading stream url info', fatal=False, query={
**api_info,
'type': 'json',
'optional_id': vod_id,
'active_flg': 1,
}), ('meta', 0))
stream_url = traverse_obj(
meta, ('movie_url', ('mb_auto', 'auto_sp', 'auto_pc'), {url_or_none}), get_all=False)
if stream_url:
formats, subtitles = self._extract_m3u8_formats_and_subtitles(stream_url, vod_id)
return {
**traverse_obj(meta, {
'duration': ('duration', {int_or_none}),
'timestamp': ('publication_date', {unified_timestamp}),
'release_timestamp': ('insert_date', {unified_timestamp}),
'modified_timestamp': ('update_date', {unified_timestamp}),
}),
'formats': formats,
'subtitles': subtitles,
}
raise ExtractorError('Unable to extract stream url')
self._API_URL_TEMPLATE.format(
lang=lang, content_format=content_format, page_type=page_type,
m_id=m_id, extra_page=extra_page),
join_nonempty(m_id, lang))
def _extract_episode_info(self, url, episode=None):
fetch_episode = episode is None
lang, m_type, episode_id = NhkVodIE._match_valid_url(url).group('lang', 'type', 'id')
is_video = m_type != 'audio'
if is_video:
episode_id = episode_id[:4] + '-' + episode_id[4:]
if fetch_episode:
episode = self._call_api(
episode_id, lang, is_video, True, episode_id[:4] == '9999')[0]
episode_id, lang, is_video, is_episode=True, is_clip=episode_id[:4] == '9999')
def get_clean_field(key):
return clean_html(episode.get(key + '_clean') or episode.get(key))
video_id = join_nonempty('id', 'lang', from_dict=episode)
title = get_clean_field('sub_title')
series = get_clean_field('title')
thumbnails = []
for s, w, h in [('', 640, 360), ('_l', 1280, 720)]:
img_path = episode.get('image' + s)
if not img_path:
continue
thumbnails.append({
'id': f'{h}p',
'height': h,
'width': w,
'url': 'https://www3.nhk.or.jp' + img_path,
})
title = episode.get('title')
series = traverse_obj(episode, (('video_program', 'audio_program'), any, 'title'))
episode_name = title
if series and title:
@@ -125,37 +67,52 @@ def get_clean_field(key):
episode_name = None
info = {
'id': episode_id + '-' + lang,
'id': video_id,
'title': title,
'description': get_clean_field('description'),
'thumbnails': thumbnails,
'series': series,
'episode': episode_name,
**traverse_obj(episode, {
'description': ('description', {str}),
'release_timestamp': ('first_broadcasted_at', {unified_timestamp}),
'categories': ('categories', ..., 'name', {str}),
'tags': ('tags', ..., 'name', {str}),
'thumbnails': ('images', lambda _, v: v['url'], {
'url': ('url', {urljoin(url)}),
'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}),
}),
'webpage_url': ('url', {urljoin(url)}),
}),
'extractor_key': NhkVodIE.ie_key(),
'extractor': NhkVodIE.IE_NAME,
}
if is_video:
vod_id = episode['vod_id']
info.update({
**self._extract_stream_info(vod_id),
'id': vod_id,
})
# XXX: We are assuming that 'video' and 'audio' are mutually exclusive
stream_info = traverse_obj(episode, (('video', 'audio'), {dict}, any)) or {}
if not stream_info.get('url'):
self.raise_no_formats('Stream not found; it has most likely expired', expected=True)
else:
if fetch_episode:
stream_url = stream_info['url']
if is_video:
formats, subtitles = self._extract_m3u8_formats_and_subtitles(stream_url, video_id)
info.update({
'formats': formats,
'subtitles': subtitles,
**traverse_obj(stream_info, ({
'duration': ('duration', {int_or_none}),
'timestamp': ('published_at', {unified_timestamp}),
})),
})
else:
# From https://www3.nhk.or.jp/nhkworld/common/player/radio/inline/rod.html
audio_path = remove_end(episode['audio']['audio'], '.m4a')
audio_path = remove_end(stream_url, '.m4a')
info['formats'] = self._extract_m3u8_formats(
f'{urljoin("https://vod-stream.nhk.jp", audio_path)}/index.m3u8',
episode_id, 'm4a', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False)
for f in info['formats']:
f['language'] = lang
else:
info.update({
'_type': 'url_transparent',
'ie_key': NhkVodIE.ie_key(),
'url': url,
})
return info
@@ -168,29 +125,29 @@ class NhkVodIE(NhkBaseIE):
# Content available only for a limited period of time. Visit
# https://www3.nhk.or.jp/nhkworld/en/ondemand/ for working samples.
_TESTS = [{
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2049126/',
'url': 'https://www3.nhk.or.jp/nhkworld/en/shows/2049165/',
'info_dict': {
'id': 'nw_vod_v_en_2049_126_20230413233000_01_1681398302',
'id': '2049165-en',
'ext': 'mp4',
'title': 'Japan Railway Journal - The Tohoku Shinkansen: Full Speed Ahead',
'description': 'md5:49f7c5b206e03868a2fdf0d0814b92f6',
'title': 'Japan Railway Journal - Choshi Electric Railway: Fighting to Get Back on Track',
'description': 'md5:ab57df2fca7f04245148c2e787bb203d',
'thumbnail': r're:https://.+/.+\.jpg',
'episode': 'The Tohoku Shinkansen: Full Speed Ahead',
'episode': 'Choshi Electric Railway: Fighting to Get Back on Track',
'series': 'Japan Railway Journal',
'modified_timestamp': 1707217907,
'timestamp': 1681428600,
'release_timestamp': 1693883728,
'duration': 1679,
'upload_date': '20230413',
'modified_date': '20240206',
'release_date': '20230905',
'duration': 1680,
'categories': ['Biz & Tech'],
'tags': ['Akita', 'Chiba', 'Trains', 'Transcript', 'All (Japan Navigator)'],
'timestamp': 1759055880,
'upload_date': '20250928',
'release_timestamp': 1758810600,
'release_date': '20250925',
},
}, {
# video clip
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/9999011/',
'md5': '153c3016dfd252ba09726588149cf0e7',
'info_dict': {
'id': 'lpZXIwaDE6_Z-976CPsFdxyICyWUzlT5',
'id': '9999011-en',
'ext': 'mp4',
'title': 'Dining with the Chef - Chef Saito\'s Family recipe: MENCHI-KATSU',
'description': 'md5:5aee4a9f9d81c26281862382103b0ea5',
@@ -198,24 +155,23 @@ class NhkVodIE(NhkBaseIE):
'series': 'Dining with the Chef',
'episode': 'Chef Saito\'s Family recipe: MENCHI-KATSU',
'duration': 148,
'upload_date': '20190816',
'release_date': '20230902',
'release_timestamp': 1693619292,
'modified_timestamp': 1707217907,
'modified_date': '20240206',
'timestamp': 1565997540,
'categories': ['Food'],
'tags': ['Washoku'],
'timestamp': 1548212400,
'upload_date': '20190123',
},
}, {
# radio
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/audio/livinginjapan-20231001-1/',
'url': 'https://www3.nhk.or.jp/nhkworld/en/shows/audio/livinginjapan-20240901-1/',
'info_dict': {
'id': 'livinginjapan-20231001-1-en',
'id': 'livinginjapan-20240901-1-en',
'ext': 'm4a',
'title': 'Living in Japan - Tips for Travelers to Japan / Ramen Vending Machines',
'title': 'Living in Japan - Weekend Hiking / Self-protection from crime',
'series': 'Living in Japan',
'description': 'md5:0a0e2077d8f07a03071e990a6f51bfab',
'description': 'md5:4d0e14ab73bdbfedb60a53b093954ed6',
'thumbnail': r're:https://.+/.+\.jpg',
'episode': 'Tips for Travelers to Japan / Ramen Vending Machines',
'episode': 'Weekend Hiking / Self-protection from crime',
'categories': ['Interactive'],
},
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2015173/',
@@ -256,96 +212,51 @@ class NhkVodIE(NhkBaseIE):
},
'skip': 'expires 2023-10-15',
}, {
# a one-off (single-episode series). title from the api is just '<p></p>'
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/3004952/',
# a one-off (single-episode series). title from the api is just null
'url': 'https://www3.nhk.or.jp/nhkworld/en/shows/3026036/',
'info_dict': {
'id': 'nw_vod_v_en_3004_952_20230723091000_01_1690074552',
'id': '3026036-en',
'ext': 'mp4',
'title': 'Barakan Discovers - AMAMI OSHIMA: Isson\'s Treasure Isla',
'description': 'md5:5db620c46a0698451cc59add8816b797',
'thumbnail': r're:https://.+/.+\.jpg',
'release_date': '20230905',
'timestamp': 1690103400,
'duration': 2939,
'release_timestamp': 1693898699,
'upload_date': '20230723',
'modified_timestamp': 1707217907,
'modified_date': '20240206',
'episode': 'AMAMI OSHIMA: Isson\'s Treasure Isla',
'series': 'Barakan Discovers',
'title': 'STATELESS: The Japanese Left Behind in the Philippines',
'description': 'md5:9a2fd51cdfa9f52baae28569e0053786',
'duration': 2955,
'thumbnail': 'https://www3.nhk.or.jp/nhkworld/en/shows/3026036/images/wide_l_QPtWpt4lzVhm3NzPAMIIF35MCg4CdNwcikPaTS5Q.jpg',
'categories': ['Documentary', 'Culture & Lifestyle'],
'tags': ['Transcript', 'Documentary 360', 'The Pursuit of PEACE'],
'timestamp': 1758931800,
'upload_date': '20250927',
'release_timestamp': 1758931800,
'release_date': '20250927',
},
}, {
# /ondemand/video/ url with alphabetical character in 5th position of id
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/9999a07/',
'info_dict': {
'id': 'nw_c_en_9999-a07',
'id': '9999a07-en',
'ext': 'mp4',
'episode': 'Mini-Dramas on SDGs: Ep 1 Close the Gender Gap [Director\'s Cut]',
'series': 'Mini-Dramas on SDGs',
'modified_date': '20240206',
'title': 'Mini-Dramas on SDGs - Mini-Dramas on SDGs: Ep 1 Close the Gender Gap [Director\'s Cut]',
'description': 'md5:3f9dcb4db22fceb675d90448a040d3f6',
'timestamp': 1621962360,
'duration': 189,
'release_date': '20230903',
'modified_timestamp': 1707217907,
'timestamp': 1621911600,
'duration': 190,
'upload_date': '20210525',
'thumbnail': r're:https://.+/.+\.jpg',
'release_timestamp': 1693713487,
'categories': ['Current Affairs', 'Entertainment'],
},
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/9999d17/',
'info_dict': {
'id': 'nw_c_en_9999-d17',
'id': '9999d17-en',
'ext': 'mp4',
'title': 'Flowers of snow blossom - The 72 Pentads of Yamato',
'description': 'Todays focus: Snow',
'release_timestamp': 1693792402,
'release_date': '20230904',
'upload_date': '20220128',
'timestamp': 1643370960,
'thumbnail': r're:https://.+/.+\.jpg',
'duration': 136,
'series': '',
'modified_date': '20240206',
'modified_timestamp': 1707217907,
},
}, {
# new /shows/ url format
'url': 'https://www3.nhk.or.jp/nhkworld/en/shows/2032307/',
'info_dict': {
'id': 'nw_vod_v_en_2032_307_20240321113000_01_1710990282',
'ext': 'mp4',
'title': 'Japanology Plus - 20th Anniversary Special Part 1',
'description': 'md5:817d41fc8e54339ad2a916161ea24faf',
'episode': '20th Anniversary Special Part 1',
'series': 'Japanology Plus',
'thumbnail': r're:https://.+/.+\.jpg',
'duration': 1680,
'timestamp': 1711020600,
'upload_date': '20240321',
'release_timestamp': 1711022683,
'release_date': '20240321',
'modified_timestamp': 1711031012,
'modified_date': '20240321',
},
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/en/shows/3020025/',
'info_dict': {
'id': 'nw_vod_v_en_3020_025_20230325144000_01_1679723944',
'ext': 'mp4',
'title': '100 Ideas to Save the World - Working Styles Evolve',
'description': 'md5:9e6c7778eaaf4f7b4af83569649f84d9',
'episode': 'Working Styles Evolve',
'series': '100 Ideas to Save the World',
'thumbnail': r're:https://.+/.+\.jpg',
'duration': 899,
'upload_date': '20230325',
'timestamp': 1679755200,
'release_date': '20230905',
'release_timestamp': 1693880540,
'modified_date': '20240206',
'modified_timestamp': 1707217907,
'categories': ['Culture & Lifestyle', 'Science & Nature'],
'tags': ['Nara', 'Temples & Shrines', 'Winter', 'Snow'],
'timestamp': 1643339040,
'upload_date': '20220128',
},
}, {
# new /shows/audio/ url format
@@ -373,6 +284,7 @@ class NhkVodProgramIE(NhkBaseIE):
'id': 'sumo',
'title': 'GRAND SUMO Highlights',
'description': 'md5:fc20d02dc6ce85e4b72e0273aa52fdbf',
'series': 'GRAND SUMO Highlights',
},
'playlist_mincount': 1,
}, {
@@ -381,6 +293,7 @@ class NhkVodProgramIE(NhkBaseIE):
'id': 'japanrailway',
'title': 'Japan Railway Journal',
'description': 'md5:ea39d93af7d05835baadf10d1aae0e3f',
'series': 'Japan Railway Journal',
},
'playlist_mincount': 12,
}, {
@@ -390,6 +303,7 @@ class NhkVodProgramIE(NhkBaseIE):
'id': 'japanrailway',
'title': 'Japan Railway Journal',
'description': 'md5:ea39d93af7d05835baadf10d1aae0e3f',
'series': 'Japan Railway Journal',
},
'playlist_mincount': 12,
}, {
@@ -399,17 +313,9 @@ class NhkVodProgramIE(NhkBaseIE):
'id': 'livinginjapan',
'title': 'Living in Japan',
'description': 'md5:665bb36ec2a12c5a7f598ee713fc2b54',
'series': 'Living in Japan',
},
'playlist_mincount': 12,
}, {
# /tv/ program url
'url': 'https://www3.nhk.or.jp/nhkworld/en/tv/designtalksplus/',
'info_dict': {
'id': 'designtalksplus',
'title': 'DESIGN TALKS plus',
'description': 'md5:47b3b3a9f10d4ac7b33b53b70a7d2837',
},
'playlist_mincount': 20,
'playlist_mincount': 11,
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/en/shows/10yearshayaomiyazaki/',
'only_matching': True,
@@ -430,9 +336,8 @@ def _real_extract(self, url):
program_id, lang, m_type != 'audio', False, episode_type == 'clip')
def entries():
for episode in episodes:
if episode_path := episode.get('url'):
yield self._extract_episode_info(urljoin(url, episode_path), episode)
for episode in traverse_obj(episodes, ('items', lambda _, v: v['url'])):
yield self._extract_episode_info(urljoin(url, episode['url']), episode)
html = self._download_webpage(url, program_id)
program_title = self._extract_meta_from_class_elements([
@@ -446,7 +351,7 @@ def entries():
'tAudioProgramMain__info', # /shows/audio/programs/
'p-program-description'], html) # /tv/
return self.playlist_result(entries(), program_id, program_title, program_description)
return self.playlist_result(entries(), program_id, program_title, program_description, series=program_title)
class NhkForSchoolBangumiIE(InfoExtractor):


@@ -0,0 +1,37 @@
from .brightcove import BrightcoveNewIE
from .common import InfoExtractor
class NowCanalIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?nowcanal\.pt(?:/[\w-]+)+/detalhe/(?P<id>[\w-]+)'
_TESTS = [{
'url': 'https://www.nowcanal.pt/ultimas/detalhe/pedro-sousa-hjulmand-pode-ter-uma-saida-limpa-do-sporting-daqui-a-um-ano',
'md5': '047f17cb783e66e467d703e704bbc95d',
'info_dict': {
'id': '6376598467112',
'ext': 'mp4',
'title': 'Pedro Sousa «Hjulmand pode ter uma saída limpa do Sporting daqui a um ano»',
'description': '',
'uploader_id': '6108484330001',
'duration': 65.237,
'thumbnail': r're:^https://.+\.jpg',
'timestamp': 1754440620,
'upload_date': '20250806',
'tags': ['now'],
},
}, {
'url': 'https://www.nowcanal.pt/programas/frente-a-frente/detalhe/frente-a-frente-eva-cruzeiro-ps-e-rita-matias-chega',
'only_matching': True,
}]
_BC_URL_TMPL = 'https://players.brightcove.net/6108484330001/chhIqzukMq_default/index.html?videoId={}'
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._search_json(
r'videoHandler\.addBrightcoveVideoWithJson\(\[',
webpage, 'video data', display_id)['brightcoveVideoId']
return self.url_result(self._BC_URL_TMPL.format(video_id), BrightcoveNewIE)
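# Illustrative only (assumed page markup, not from the diff): the webpage is
# expected to embed something like
#   videoHandler.addBrightcoveVideoWithJson([{"brightcoveVideoId": "6376598467112", ...}]);
# _search_json() recovers the first object inside that array, and the Brightcove
# id is interpolated into _BC_URL_TMPL for BrightcoveNewIE to process.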


@@ -598,7 +598,8 @@ def _real_extract(self, url):
'props', 'pageProps', 'bootstrapEnvelope', 'pageBootstrap', 'campaign', 'data', 'id', {str}))
if not campaign_id:
campaign_id = traverse_obj(self._search_nextjs_v13_data(webpage, vanity), (
lambda _, v: v['type'] == 'campaign', 'id', {str}, any, {require('campaign ID')}))
((..., 'value', 'campaign', 'data'), lambda _, v: v['type'] == 'campaign'),
'id', {str}, any, {require('campaign ID')}))
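# Illustrative only (hypothetical node shape): the new traversal expects flight
# data containing, at any depth, nodes like
#   {'value': {'campaign': {'data': {'type': 'campaign', 'id': '12345'}}}}
# and takes the 'id' of the first node whose type is 'campaign';
# require('campaign ID') raises if no such node is found.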
params = {
'json-api-use-default-includes': 'false',


@@ -3,12 +3,14 @@
MEDIA_EXTENSIONS,
determine_ext,
parse_iso8601,
traverse_obj,
url_or_none,
)
from ..utils.traversal import traverse_obj
class RinseFMBaseIE(InfoExtractor):
_API_BASE = 'https://rinse.fm/api/query/v1'
@staticmethod
def _parse_entry(entry):
return {
@@ -45,8 +47,10 @@ class RinseFMIE(RinseFMBaseIE):
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
entry = self._search_nextjs_data(webpage, display_id)['props']['pageProps']['entry']
entry = self._download_json(
f'{self._API_BASE}/episodes/{display_id}', display_id,
note='Downloading episode data from API')['entry']
return self._parse_entry(entry)
@@ -58,32 +62,35 @@ class RinseFMArtistPlaylistIE(RinseFMBaseIE):
'info_dict': {
'id': 'resources',
'title': '[re]sources',
'description': '[re]sources est un label parisien piloté par le DJ et producteur Tommy Kid.',
'description': 'md5:fd6a7254e8273510e6d49fbf50edf392',
},
'playlist_mincount': 40,
}, {
'url': 'https://rinse.fm/shows/ivy/',
'url': 'https://www.rinse.fm/shows/esk',
'info_dict': {
'id': 'ivy',
'title': '[IVY]',
'description': 'A dedicated space for DNB/Turbo House and 4x4.',
'id': 'esk',
'title': 'Esk',
'description': 'md5:5893d7c1d411ae8dea7fba12f109aa98',
},
'playlist_mincount': 7,
'playlist_mincount': 139,
}]
def _entries(self, data):
for episode in traverse_obj(data, (
'props', 'pageProps', 'episodes', lambda _, v: determine_ext(v['fileUrl']) in MEDIA_EXTENSIONS.audio),
'episodes', lambda _, v: determine_ext(v['fileUrl']) in MEDIA_EXTENSIONS.audio),
):
yield self._parse_entry(episode)
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
title = self._og_search_title(webpage) or self._html_search_meta('title', webpage)
description = self._og_search_description(webpage) or self._html_search_meta(
'description', webpage)
data = self._search_nextjs_data(webpage, playlist_id)
api_data = self._download_json(
f'{self._API_BASE}/shows/{playlist_id}', playlist_id,
note='Downloading show data from API')
return self.playlist_result(
self._entries(data), playlist_id, title, description=description)
self._entries(api_data), playlist_id,
**traverse_obj(api_data, ('entry', {
'title': ('title', {str}),
'description': ('description', {str}),
})))
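# Illustrative only (assumed API shapes): /episodes/<id> appears to return
#   {'entry': {...episode fields...}}
# while /shows/<id> returns show metadata under 'entry' plus a top-level
# 'episodes' list whose items carry a 'fileUrl' used to build the entries.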


@@ -15,14 +15,15 @@ class S4CIE(InfoExtractor):
'thumbnail': 'https://www.s4c.cymru/amg/1920x1080/Y_Swn_2023S4C_099_ii.jpg',
},
}, {
'url': 'https://www.s4c.cymru/clic/programme/856636948',
# Geo restricted to the UK
'url': 'https://www.s4c.cymru/clic/programme/886303048',
'info_dict': {
'id': '856636948',
'id': '886303048',
'ext': 'mp4',
'title': 'Am Dro',
'title': 'Pennod 1',
'description': 'md5:7e3f364b70f61fcdaa8b4cb4a3eb3e7a',
'duration': 2880,
'description': 'md5:100d8686fc9a632a0cb2db52a3433ffe',
'thumbnail': 'https://www.s4c.cymru/amg/1920x1080/Am_Dro_2022-23S4C_P6_4005.jpg',
'thumbnail': 'https://www.s4c.cymru/amg/1920x1080/Stad_2025S4C_P1_210053.jpg',
},
}]
@@ -51,7 +52,7 @@ def _real_extract(self, url):
'https://player-api.s4c-cdn.co.uk/streaming-urls/prod', video_id, query={
'mode': 'od',
'application': 'clic',
'region': 'WW',
'region': 'UK' if player_config.get('application') == 's4chttpl' else 'WW',
'extra': 'false',
'thirdParty': 'false',
'filename': player_config['filename'],


@@ -1064,7 +1064,7 @@ def _real_extract(self, url):
class SoundcloudPlaylistIE(SoundcloudPlaylistBaseIE):
_VALID_URL = r'https?://api(?:-v2)?\.soundcloud\.com/playlists/(?P<id>[0-9]+)(?:/?\?secret_token=(?P<token>[^&]+?))?$'
_VALID_URL = r'https?://api(?:-v2)?\.soundcloud\.com/playlists/(?:soundcloud(?:%3A|:)playlists(?:%3A|:))?(?P<id>[0-9]+)(?:/?\?secret_token=(?P<token>[^&]+?))?$'
IE_NAME = 'soundcloud:playlist'
_TESTS = [{
'url': 'https://api.soundcloud.com/playlists/4110309',
@@ -1079,6 +1079,12 @@ class SoundcloudPlaylistIE(SoundcloudPlaylistBaseIE):
'album': 'TILT Brass - Bowery Poetry Club, August \'03 [Non-Site SCR 02]',
},
'playlist_count': 6,
}, {
'url': 'https://api.soundcloud.com/playlists/soundcloud%3Aplaylists%3A1759227795',
'only_matching': True,
}, {
'url': 'https://api.soundcloud.com/playlists/soundcloud:playlists:2104769627?secret_token=s-wmpCLuExeYX',
'only_matching': True,
}]
def _real_extract(self, url):


@@ -8,10 +8,11 @@
class SportDeutschlandIE(InfoExtractor):
_VALID_URL = r'https?://(?:player\.)?sportdeutschland\.tv/(?P<id>(?:[^/?#]+/)?[^?#/&]+)'
IE_NAME = 'sporteurope'
_VALID_URL = r'https?://(?:player\.)?sporteurope\.tv/(?P<id>(?:[^/?#]+/)?[^?#/&]+)'
_TESTS = [{
# Single-part video, direct link
'url': 'https://sportdeutschland.tv/rostock-griffins/gfl2-rostock-griffins-vs-elmshorn-fighting-pirates',
'url': 'https://sporteurope.tv/rostock-griffins/gfl2-rostock-griffins-vs-elmshorn-fighting-pirates',
'md5': '35c11a19395c938cdd076b93bda54cde',
'info_dict': {
'id': '9f27a97d-1544-4d0b-aa03-48d92d17a03a',
@@ -19,9 +20,9 @@ class SportDeutschlandIE(InfoExtractor):
'title': 'GFL2: Rostock Griffins vs. Elmshorn Fighting Pirates',
'display_id': 'rostock-griffins/gfl2-rostock-griffins-vs-elmshorn-fighting-pirates',
'channel': 'Rostock Griffins',
'channel_url': 'https://sportdeutschland.tv/rostock-griffins',
'channel_url': 'https://sporteurope.tv/rostock-griffins',
'live_status': 'was_live',
'description': 'md5:60cb00067e55dafa27b0933a43d72862',
'description': r're:Video-Livestream des Spiels Rostock Griffins vs\. Elmshorn Fighting Pirates.+',
'channel_id': '9635f21c-3f67-4584-9ce4-796e9a47276b',
'timestamp': 1749913117,
'upload_date': '20250614',
@@ -29,16 +30,16 @@ class SportDeutschlandIE(InfoExtractor):
},
}, {
# Single-part video, embedded player link
'url': 'https://player.sportdeutschland.tv/9e9619c4-7d77-43c4-926d-49fb57dc06dc',
'url': 'https://player.sporteurope.tv/9e9619c4-7d77-43c4-926d-49fb57dc06dc',
'info_dict': {
'id': '9f27a97d-1544-4d0b-aa03-48d92d17a03a',
'ext': 'mp4',
'title': 'GFL2: Rostock Griffins vs. Elmshorn Fighting Pirates',
'display_id': '9e9619c4-7d77-43c4-926d-49fb57dc06dc',
'channel': 'Rostock Griffins',
'channel_url': 'https://sportdeutschland.tv/rostock-griffins',
'channel_url': 'https://sporteurope.tv/rostock-griffins',
'live_status': 'was_live',
'description': 'md5:60cb00067e55dafa27b0933a43d72862',
'description': r're:Video-Livestream des Spiels Rostock Griffins vs\. Elmshorn Fighting Pirates.+',
'channel_id': '9635f21c-3f67-4584-9ce4-796e9a47276b',
'timestamp': 1749913117,
'upload_date': '20250614',
@@ -47,7 +48,7 @@ class SportDeutschlandIE(InfoExtractor):
'params': {'skip_download': True},
}, {
# Multi-part video
'url': 'https://sportdeutschland.tv/rhine-ruhr-2025-fisu-world-university-games/volleyball-w-japan-vs-brasilien-halbfinale-2',
'url': 'https://sporteurope.tv/rhine-ruhr-2025-fisu-world-university-games/volleyball-w-japan-vs-brasilien-halbfinale-2',
'info_dict': {
'id': '9f63d737-2444-4e3a-a1ea-840df73fd481',
'display_id': 'rhine-ruhr-2025-fisu-world-university-games/volleyball-w-japan-vs-brasilien-halbfinale-2',
@@ -55,7 +56,7 @@ class SportDeutschlandIE(InfoExtractor):
'description': 'md5:0a17da15e48a687e6019639c3452572b',
'channel': 'Rhine-Ruhr 2025 FISU World University Games',
'channel_id': '9f5216be-a49d-470b-9a30-4fe9df993334',
'channel_url': 'https://sportdeutschland.tv/rhine-ruhr-2025-fisu-world-university-games',
'channel_url': 'https://sporteurope.tv/rhine-ruhr-2025-fisu-world-university-games',
'live_status': 'was_live',
},
'playlist_count': 2,
@@ -66,7 +67,7 @@ class SportDeutschlandIE(InfoExtractor):
'title': 'Volleyball w: Japan vs. Braslien - Halbfinale 2 Part 1',
'channel': 'Rhine-Ruhr 2025 FISU World University Games',
'channel_id': '9f5216be-a49d-470b-9a30-4fe9df993334',
'channel_url': 'https://sportdeutschland.tv/rhine-ruhr-2025-fisu-world-university-games',
'channel_url': 'https://sporteurope.tv/rhine-ruhr-2025-fisu-world-university-games',
'duration': 14773.0,
'timestamp': 1753085197,
'upload_date': '20250721',
@@ -79,16 +80,17 @@ class SportDeutschlandIE(InfoExtractor):
'title': 'Volleyball w: Japan vs. Braslien - Halbfinale 2 Part 2',
'channel': 'Rhine-Ruhr 2025 FISU World University Games',
'channel_id': '9f5216be-a49d-470b-9a30-4fe9df993334',
'channel_url': 'https://sportdeutschland.tv/rhine-ruhr-2025-fisu-world-university-games',
'channel_url': 'https://sporteurope.tv/rhine-ruhr-2025-fisu-world-university-games',
'duration': 14773.0,
'timestamp': 1753128421,
'upload_date': '20250721',
'live_status': 'was_live',
},
}],
'skip': '404 Not Found',
}, {
# Livestream
'url': 'https://sportdeutschland.tv/dtb/gymnastik-international-tag-1',
'url': 'https://sporteurope.tv/dtb/gymnastik-international-tag-1',
'info_dict': {
'id': '95d71b8a-370a-4b87-ad16-94680da18528',
'ext': 'mp4',
@@ -96,7 +98,7 @@ class SportDeutschlandIE(InfoExtractor):
'display_id': 'dtb/gymnastik-international-tag-1',
'channel_id': '936ecef1-2f4a-4e08-be2f-68073cb7ecab',
'channel': 'Deutscher Turner-Bund',
'channel_url': 'https://sportdeutschland.tv/dtb',
'channel_url': 'https://sporteurope.tv/dtb',
'description': 'md5:07a885dde5838a6f0796ee21dc3b0c52',
'live_status': 'is_live',
},
@@ -106,9 +108,9 @@ class SportDeutschlandIE(InfoExtractor):
def _process_video(self, asset_id, video):
is_live = video['type'] == 'mux_live'
token = self._download_json(
f'https://api.sportdeutschland.tv/api/web/personal/asset-token/{asset_id}',
f'https://api.sporteurope.tv/api/web/personal/asset-token/{asset_id}',
video['id'], query={'type': video['type'], 'playback_id': video['src']},
headers={'Referer': 'https://sportdeutschland.tv/'})['token']
headers={'Referer': 'https://sporteurope.tv/'})['token']
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
f'https://stream.mux.com/{video["src"]}.m3u8?token={token}', video['id'], live=is_live)
@@ -126,7 +128,7 @@ def _process_video(self, asset_id, video):
def _real_extract(self, url):
display_id = self._match_id(url)
meta = self._download_json(
f'https://api.sportdeutschland.tv/api/stateless/frontend/assets/{display_id}',
f'https://api.sporteurope.tv/api/stateless/frontend/assets/{display_id}',
display_id, query={'access_token': 'true'})
info = {
@@ -139,7 +141,7 @@ def _real_extract(self, url):
'channel_id': ('profile', 'id'),
'is_live': 'currently_live',
'was_live': 'was_live',
'channel_url': ('profile', 'slug', {lambda x: f'https://sportdeutschland.tv/{x}'}),
'channel_url': ('profile', 'slug', {lambda x: f'https://sporteurope.tv/{x}'}),
}, get_all=False),
}


@@ -101,8 +101,8 @@ def _real_extract(self, url):
webpage = self._download_webpage(
url, video_id, headers=traverse_obj(smuggled_data, {'Referer': 'referer'}))
data = self._search_json(
r'(?:var|const|let)\s+(?:dat|(?:player|video)Info|)\s*=\s*["\']', webpage, 'player info',
video_id, contains_pattern=r'[A-Za-z0-9+/=]+', end_pattern=r'["\'];',
r'(?:window\.|(?:var|const|let)\s+)(?:dat|(?:player|video)Info|)\s*=\s*["\']', webpage,
'player info', video_id, contains_pattern=r'[A-Za-z0-9+/=]+', end_pattern=r'["\'];',
transform_source=lambda x: base64.b64decode(x).decode())
# SproutVideo may send player info for 'SMPTE Color Monitor Test' [a791d7b71b12ecc52e]


@@ -1,18 +1,17 @@
import json
import urllib.parse
from .brightcove import BrightcoveNewIE
from .common import InfoExtractor
from .zype import ZypeIE
from ..networking import HEADRequest
from ..networking.exceptions import HTTPError
from ..utils import (
ExtractorError,
filter_dict,
parse_qs,
smuggle_url,
try_call,
urlencode_postdata,
)
from ..utils.traversal import traverse_obj
class ThisOldHouseIE(InfoExtractor):
@@ -77,46 +76,43 @@ class ThisOldHouseIE(InfoExtractor):
'only_matching': True,
}]
_LOGIN_URL = 'https://login.thisoldhouse.com/usernamepassword/login'
def _perform_login(self, username, password):
self._request_webpage(
HEADRequest('https://www.thisoldhouse.com/insider'), None, 'Requesting session cookies')
urlh = self._request_webpage(
'https://www.thisoldhouse.com/wp-login.php', None, 'Requesting login info',
errnote='Unable to login', query={'redirect_to': 'https://www.thisoldhouse.com/insider'})
login_page = self._download_webpage(
'https://www.thisoldhouse.com/insider-login', None, 'Downloading login page')
hidden_inputs = self._hidden_inputs(login_page)
response = self._download_json(
'https://www.thisoldhouse.com/wp-admin/admin-ajax.php', None, 'Logging in',
headers={
'Accept': 'application/json',
'X-Requested-With': 'XMLHttpRequest',
}, data=urlencode_postdata(filter_dict({
'action': 'onebill_subscriber_login',
'email': username,
'password': password,
'pricingPlanTerm': hidden_inputs['pricing_plan_term'],
'utm_parameters': hidden_inputs.get('utm_parameters'),
'nonce': hidden_inputs['mdcr_onebill_login_nonce'],
})))
try:
auth_form = self._download_webpage(
self._LOGIN_URL, None, 'Submitting credentials', headers={
'Content-Type': 'application/json',
'Referer': urlh.url,
}, data=json.dumps(filter_dict({
**{('client_id' if k == 'client' else k): v[0] for k, v in parse_qs(urlh.url).items()},
'tenant': 'thisoldhouse',
'username': username,
'password': password,
'popup_options': {},
'sso': True,
'_csrf': try_call(lambda: self._get_cookies(self._LOGIN_URL)['_csrf'].value),
'_intstate': 'deprecated',
}), separators=(',', ':')).encode())
except ExtractorError as e:
if isinstance(e.cause, HTTPError) and e.cause.status == 401:
message = traverse_obj(response, ('data', 'message', {str}))
if not response['success']:
if message and 'Something went wrong' in message:
raise ExtractorError('Invalid username or password', expected=True)
raise
self._request_webpage(
'https://login.thisoldhouse.com/login/callback', None, 'Completing login',
data=urlencode_postdata(self._hidden_inputs(auth_form)))
raise ExtractorError(message or 'Login was unsuccessful')
if message and 'Your subscription is not active' in message:
self.report_warning(
f'{self.IE_NAME} said your subscription is not active. '
f'If your subscription is active, this could be caused by too many sign-ins, '
f'and you should instead try using {self._login_hint(method="cookies")[4:]}')
else:
self.write_debug(f'{self.IE_NAME} said: {message}')
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
if 'To Unlock This content' in webpage:
self.raise_login_required(
'This video is only available for subscribers. '
'Note that --cookies-from-browser may not work due to this site using session cookies')
webpage, urlh = self._download_webpage_handle(url, display_id)
# If login response says inactive subscription, site redirects to frontpage for Insider content
if 'To Unlock This content' in webpage or urllib.parse.urlparse(urlh.url).path in ('', '/'):
self.raise_login_required('This video is only available for subscribers')
video_url, video_id = self._search_regex(
r'<iframe[^>]+src=[\'"]((?:https?:)?//(?:www\.)?thisoldhouse\.(?:chorus\.build|com)/videos/zype/([0-9a-f]{24})[^\'"]*)[\'"]',


@@ -182,13 +182,13 @@ def _entries(self, show_url, playlist_id, selected_season):
webpage = self._download_webpage(show_url, playlist_id)
data = self._search_json(
r'window\.__data\s*=', webpage, 'data', playlist_id,
transform_source=js_to_json)['video']
r'window\.__REACT_QUERY_STATE__\s*=', webpage, 'data', playlist_id,
transform_source=js_to_json)['queries'][0]['state']['data']
# v['number'] is already a decimal string, but stringify to protect against API changes
path = [lambda _, v: str(v['number']) == selected_season] if selected_season else [..., {dict}]
for season in traverse_obj(data, ('byId', lambda _, v: v['type'] == 's', 'seasons', *path)):
for season in traverse_obj(data, ('seasons', *path)):
season_number = int_or_none(season.get('number'))
for episode in traverse_obj(season, ('episodes', lambda _, v: v['id'])):
episode_id = episode['id']
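# Illustrative only (assumed page state shape): window.__REACT_QUERY_STATE__ is
# expected to look like
#   {'queries': [{'state': {'data': {'seasons': [{'number': '1', 'episodes': [{'id': ...}, ...]}]}}}]}
# and each matching season's episodes are walked for their 'id'.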


@@ -7,15 +7,15 @@
parse_age_limit,
try_get,
unified_timestamp,
url_or_none,
)
from ..utils.traversal import traverse_obj
from ..utils.traversal import require, traverse_obj
class URPlayIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?ur(?:play|skola)\.se/(?:program|Produkter)/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://urplay.se/program/203704-ur-samtiden-livet-universum-och-rymdens-markliga-musik-om-vetenskap-kritiskt-tankande-och-motstand',
'md5': '5ba36643c77cc3d34ffeadad89937d1e',
'info_dict': {
'id': '203704',
'ext': 'mp4',
@@ -31,6 +31,7 @@ class URPlayIE(InfoExtractor):
'episode': 'Om vetenskap, kritiskt tänkande och motstånd',
'age_limit': 15,
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://urplay.se/program/222967-en-foralders-dagbok-mitt-barn-skadar-sig-sjalv',
'info_dict': {
@@ -49,6 +50,7 @@ class URPlayIE(InfoExtractor):
'tags': 'count:7',
'episode': 'Mitt barn skadar sig själv',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'https://urskola.se/Produkter/190031-Tripp-Trapp-Trad-Sovkudde',
'info_dict': {
@@ -68,6 +70,27 @@ class URPlayIE(InfoExtractor):
'episode': 'Sovkudde',
'season': 'Säsong 1',
},
'params': {'skip_download': 'm3u8'},
}, {
# Only accessible through new media api
'url': 'https://urplay.se/program/242932-vulkanernas-krafter-fran-kraftfull-till-forgorande',
'info_dict': {
'id': '242932',
'ext': 'mp4',
'title': 'Vulkanernas krafter : Från kraftfull till förgörande',
'description': 'md5:742bb87048e7d5a7f209d28f9bb70ab1',
'age_limit': 15,
'duration': 2613,
'thumbnail': 'https://assets.ur.se/id/242932/images/1_hd.jpg',
'categories': ['Vetenskap & teknik'],
'tags': ['Geofysik', 'Naturvetenskap', 'Vulkaner', 'Vulkanutbrott'],
'series': 'Vulkanernas krafter',
'episode': 'Från kraftfull till förgörande',
'episode_number': 2,
'timestamp': 1763514000,
'upload_date': '20251119',
},
'params': {'skip_download': 'm3u8'},
}, {
'url': 'http://urskola.se/Produkter/155794-Smasagor-meankieli-Grodan-i-vida-varlden',
'only_matching': True,
@@ -88,21 +111,12 @@ def _real_extract(self, url):
webpage, 'urplayer data'), video_id)['accessibleEpisodes']
urplayer_data = next(e for e in accessible_episodes if e.get('id') == int_or_none(video_id))
episode = urplayer_data['title']
host = self._download_json('http://streaming-loadbalancer.ur.se/loadbalancer.json', video_id)['redirect']
formats = []
urplayer_streams = urplayer_data.get('streamingInfo', {})
for k, v in urplayer_streams.get('raw', {}).items():
if not (k in ('sd', 'hd', 'mp3', 'm4a') and isinstance(v, dict)):
continue
file_http = v.get('location')
if file_http:
formats.extend(self._extract_wowza_formats(
f'http://{host}/{file_http}playlist.m3u8',
video_id, skip_protocols=['f4m', 'rtmp', 'rtsp']))
subtitles = {}
sources = self._download_json(
f'https://media-api.urplay.se/config-streaming/v1/urplay/sources/{video_id}', video_id,
note='Downloading streaming information')
hls_url = traverse_obj(sources, ('sources', 'hls', {url_or_none}, {require('HLS URL')}))
formats, subtitles = self._extract_m3u8_formats_and_subtitles(
hls_url, video_id, 'mp4', m3u8_id='hls')
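# Illustrative only (assumed API shape): the streaming config endpoint should
# return JSON like {'sources': {'hls': 'https://.../index.m3u8', ...}};
# require('HLS URL') aborts if the HLS entry is missing, and the master
# playlist then supplies both formats and subtitles.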
def parse_lang_code(code):
"3-character language code or None (utils candidate)"


@@ -339,11 +339,20 @@ class WistiaChannelIE(WistiaBaseIE):
'title': 'The Roof S2: The Modern CRO',
'thumbnail': r're:https?://embed(?:-ssl)?\.wistia\.com/.+\.(?:jpg|png)',
'duration': 86.487,
'description': 'A sales leader on The Roof? Man, they really must be letting anyone up here this season.\n',
'description': 'A sales leader on The Roof? Man, they really must be letting anyone up here this season. ',
'timestamp': 1619790290,
'upload_date': '20210430',
},
'params': {'noplaylist': True, 'skip_download': True},
}, {
# Channel with episodes structure instead of videos
'url': 'https://fast.wistia.net/embed/channel/sapab9p6qd',
'info_dict': {
'id': 'sapab9p6qd',
'title': 'Credo: An RCIA Program',
'description': '\n',
},
'playlist_mincount': 80,
}]
_WEBPAGE_TESTS = [{
'url': 'https://www.profitwell.com/recur/boxed-out',
@@ -399,8 +408,7 @@ def _real_extract(self, url):
entries = [
self.url_result(f'wistia:{video["hashedId"]}', WistiaIE, title=video.get('name'))
for video in traverse_obj(series, ('sections', ..., 'videos', ...)) or []
if video.get('hashedId')
for video in traverse_obj(series, ('sections', ..., ('videos', 'episodes'), lambda _, v: v['hashedId']))
]
return self.playlist_result(


@@ -1,8 +1,6 @@
import base64
import codecs
import itertools
import re
import string
import urllib.parse
from .common import InfoExtractor
from ..utils import (
@@ -16,7 +14,6 @@
join_nonempty,
parse_duration,
str_or_none,
try_call,
try_get,
unified_strdate,
url_or_none,
@@ -32,7 +29,7 @@ def __init__(self, algo_id, seed):
try:
self._algorithm = getattr(self, f'_algo{algo_id}')
except AttributeError:
raise ExtractorError(f'Unknown algorithm ID: {algo_id}')
raise ExtractorError(f'Unknown algorithm ID "{algo_id}"')
self._s = to_signed_32(seed)
def _algo1(self, s):
@@ -216,32 +213,28 @@ class XHamsterIE(InfoExtractor):
'only_matching': True,
}]
_XOR_KEY = b'xh7999'
def _decipher_format_url(self, format_url, format_id):
if all(char in string.hexdigits for char in format_url):
byte_data = bytes.fromhex(format_url)
seed = int.from_bytes(byte_data[1:5], byteorder='little', signed=True)
byte_gen = _ByteGenerator(byte_data[0], seed)
return bytearray(byte ^ next(byte_gen) for byte in byte_data[5:]).decode('latin-1')
parsed_url = urllib.parse.urlparse(format_url)
cipher_type, _, ciphertext = try_call(
lambda: base64.b64decode(format_url).decode().partition('_')) or [None] * 3
if not cipher_type or not ciphertext:
self.report_warning(f'Skipping format "{format_id}": failed to decipher URL')
hex_string, path_remainder = self._search_regex(
r'^/(?P<hex>[0-9a-fA-F]{12,})(?P<rem>[/,].+)$', parsed_url.path, 'url components',
default=(None, None), group=('hex', 'rem'))
if not hex_string:
self.report_warning(f'Skipping format "{format_id}": unsupported URL format')
return None
if cipher_type == 'xor':
return bytes(
a ^ b for a, b in
zip(ciphertext.encode(), itertools.cycle(self._XOR_KEY))).decode()
byte_data = bytes.fromhex(hex_string)
seed = int.from_bytes(byte_data[1:5], byteorder='little', signed=True)
if cipher_type == 'rot13':
return codecs.decode(ciphertext, cipher_type)
try:
byte_gen = _ByteGenerator(byte_data[0], seed)
except ExtractorError as e:
self.report_warning(f'Skipping format "{format_id}": {e.msg}')
return None
self.report_warning(f'Skipping format "{format_id}": unsupported cipher type "{cipher_type}"')
return None
deciphered = bytearray(byte ^ next(byte_gen) for byte in byte_data[5:]).decode('latin-1')
return parsed_url._replace(path=f'/{deciphered}{path_remainder}').geturl()
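# Illustrative only (toy keystream, not the site's real generator): the
# deciphering above hex-decodes the path, seeds _ByteGenerator from bytes 1..4,
# and XORs the remaining payload; XOR with the same keystream is self-inverse:
#   import itertools
#   payload = b'hls/master.m3u8'
#   key = b'\x13\x37\x42'  # stand-in for the _ByteGenerator output
#   cipher = bytes(b ^ k for b, k in zip(payload, itertools.cycle(key)))
#   assert bytes(b ^ k for b, k in zip(cipher, itertools.cycle(key))) == payload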
def _fixup_formats(self, formats):
for f in formats:
@@ -364,8 +357,11 @@ def get_height(s):
'height': get_height(quality),
'filesize': format_sizes.get(quality),
'http_headers': {
'Referer': standard_url,
'Referer': urlh.url,
},
# HTTP formats return "Wrong key" error even when deciphered by site JS
# TODO: Remove this when resolved on the site's end
'__needs_testing': True,
})
categories_list = video.get('categories')
@@ -402,7 +398,8 @@ def get_height(s):
'age_limit': age_limit if age_limit is not None else 18,
'categories': categories,
'formats': self._fixup_formats(formats),
'_format_sort_fields': ('res', 'proto', 'tbr'),
# TODO: Revert to ('res', 'proto', 'tbr') when HTTP formats problem is resolved
'_format_sort_fields': ('res', 'proto:m3u8', 'tbr'),
}
# Old layout fallback


@@ -0,0 +1,67 @@
from .common import InfoExtractor
from ..utils import (
determine_ext,
int_or_none,
join_nonempty,
remove_end,
url_or_none,
)
from ..utils.traversal import traverse_obj
class YfanefaIE(InfoExtractor):
IE_NAME = 'yfanefa'
_VALID_URL = r'https?://(?:www\.)?yfanefa\.com/(?P<id>[^?#]+)'
_TESTS = [{
'url': 'https://www.yfanefa.com/record/2717',
'info_dict': {
'id': 'record-2717',
'ext': 'mp4',
'title': 'THE HALLAMSHIRE RIFLES LEAVING SHEFFIELD, 1914',
'duration': 5239,
'thumbnail': r're:https://media\.yfanefa\.com/storage/v1/file/',
},
}, {
'url': 'https://www.yfanefa.com/news/53',
'info_dict': {
'id': 'news-53',
'ext': 'mp4',
'title': 'Memory Bank: Bradford Launch',
'thumbnail': r're:https://media\.yfanefa\.com/storage/v1/file/',
},
}, {
'url': 'https://www.yfanefa.com/evaluating_nature_matters',
'info_dict': {
'id': 'evaluating_nature_matters',
'ext': 'mp4',
'title': 'Evaluating Nature Matters',
'thumbnail': r're:https://media\.yfanefa\.com/storage/v1/file/',
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
player_data = self._search_json(
r'iwPlayer\.options\["[\w.]+"\]\s*=', webpage, 'player options', video_id)
formats = []
video_url = join_nonempty(player_data['url'], player_data.get('signature'), delim='')
if determine_ext(video_url) == 'm3u8':
formats = self._extract_m3u8_formats(
video_url, video_id, 'mp4', m3u8_id='hls')
else:
formats = [{'url': video_url, 'ext': 'mp4'}]
return {
'id': video_id.strip('/').replace('/', '-'),
'title':
self._og_search_title(webpage, default=None)
or remove_end(self._html_extract_title(webpage), ' | Yorkshire Film Archive'),
'formats': formats,
**traverse_obj(player_data, {
'thumbnail': ('preview', {url_or_none}),
'duration': ('duration', {int_or_none}),
}),
}


@@ -104,6 +104,7 @@ class SubsPoTokenPolicy(BasePoTokenPolicy):
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 1,
'SUPPORTS_COOKIES': True,
'SUPPORTS_AD_PLAYBACK_CONTEXT': True,
**WEB_PO_TOKEN_POLICIES,
},
# Safari UA returns pre-merged video+audio 144p/240p/360p/720p/1080p HLS formats
@@ -117,6 +118,7 @@ class SubsPoTokenPolicy(BasePoTokenPolicy):
},
'INNERTUBE_CONTEXT_CLIENT_NAME': 1,
'SUPPORTS_COOKIES': True,
'SUPPORTS_AD_PLAYBACK_CONTEXT': True,
**WEB_PO_TOKEN_POLICIES,
},
'web_embedded': {
@@ -157,6 +159,7 @@ class SubsPoTokenPolicy(BasePoTokenPolicy):
),
},
'SUPPORTS_COOKIES': True,
'SUPPORTS_AD_PLAYBACK_CONTEXT': True,
},
# This client now requires sign-in for every video
'web_creator': {
@@ -313,6 +316,7 @@ class SubsPoTokenPolicy(BasePoTokenPolicy):
),
},
'SUPPORTS_COOKIES': True,
'SUPPORTS_AD_PLAYBACK_CONTEXT': True,
},
'tv': {
'INNERTUBE_CONTEXT': {
@@ -412,6 +416,7 @@ def build_innertube_clients():
ytcfg.setdefault('SUBS_PO_TOKEN_POLICY', SubsPoTokenPolicy())
ytcfg.setdefault('REQUIRE_AUTH', False)
ytcfg.setdefault('SUPPORTS_COOKIES', False)
ytcfg.setdefault('SUPPORTS_AD_PLAYBACK_CONTEXT', False)
ytcfg.setdefault('PLAYER_PARAMS', None)
ytcfg.setdefault('AUTHENTICATED_USER_AGENT', None)
ytcfg['INNERTUBE_CONTEXT']['client'].setdefault('hl', 'en')


@@ -76,7 +76,7 @@
STREAMING_DATA_PLAYER_TOKEN_PROVIDED = '__yt_dlp_player_token_provided'
STREAMING_DATA_INNERTUBE_CONTEXT = '__yt_dlp_innertube_context'
STREAMING_DATA_IS_PREMIUM_SUBSCRIBER = '__yt_dlp_is_premium_subscriber'
STREAMING_DATA_FETCHED_TIMESTAMP = '__yt_dlp_fetched_timestamp'
STREAMING_DATA_AVAILABLE_AT_TIMESTAMP = '__yt_dlp_available_at_timestamp'
PO_TOKEN_GUIDE_URL = 'https://github.com/yt-dlp/yt-dlp/wiki/PO-Token-Guide'
@@ -2629,16 +2629,23 @@ def _get_checkok_params():
return {'contentCheckOk': True, 'racyCheckOk': True}
@classmethod
def _generate_player_context(cls, sts=None):
def _generate_player_context(cls, sts=None, use_ad_playback_context=False):
context = {
'html5Preference': 'HTML5_PREF_WANTS',
}
if sts is not None:
context['signatureTimestamp'] = sts
playback_context = {
'contentPlaybackContext': context,
}
if use_ad_playback_context:
playback_context['adPlaybackContext'] = {
'pyv': True,
}
return {
'playbackContext': {
'contentPlaybackContext': context,
},
'playbackContext': playback_context,
**cls._get_checkok_params(),
}
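# Illustrative only: _generate_player_context(sts=12345, use_ad_playback_context=True)
# produces a query fragment shaped like:
#   {
#       'playbackContext': {
#           'contentPlaybackContext': {'html5Preference': 'HTML5_PREF_WANTS',
#                                      'signatureTimestamp': 12345},
#           'adPlaybackContext': {'pyv': True},
#       },
#       'contentCheckOk': True,
#       'racyCheckOk': True,
#   }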
@@ -2866,7 +2873,13 @@ def _extract_player_response(self, client, video_id, webpage_ytcfg, player_ytcfg
yt_query['serviceIntegrityDimensions'] = {'poToken': po_token}
sts = self._extract_signature_timestamp(video_id, player_url, webpage_ytcfg, fatal=False) if player_url else None
yt_query.update(self._generate_player_context(sts))
use_ad_playback_context = (
self._configuration_arg('use_ad_playback_context', ['false'])[0] != 'false'
and traverse_obj(INNERTUBE_CLIENTS, (client, 'SUPPORTS_AD_PLAYBACK_CONTEXT', {bool})))
yt_query.update(self._generate_player_context(sts, use_ad_playback_context))
return self._extract_response(
item_id=video_id, ep='player', query=yt_query,
ytcfg=player_ytcfg, headers=headers, fatal=True,
@@ -2901,10 +2914,10 @@ def _get_requested_clients(self, url, smuggled_data, is_premium_subscriber):
if not (requested_clients or excluded_clients) and default_clients == self._DEFAULT_JSLESS_CLIENTS:
self.report_warning(
f'No supported JavaScript runtime could be found. YouTube extraction without '
f'a JS runtime has been deprecated, and some formats may be missing. '
f'See {_EJS_WIKI_URL} for details on installing one. To silence this warning, '
f'you can use --extractor-args "youtube:player_client=default"', only_once=True)
f'No supported JavaScript runtime could be found. Only deno is enabled by default; '
f'to use another runtime add --js-runtimes RUNTIME[:PATH] to your command/config. '
f'YouTube extraction without a JS runtime has been deprecated, and some formats may be missing. '
f'See {_EJS_WIKI_URL} for details on installing one', only_once=True)
if not requested_clients:
requested_clients.extend(default_clients)
@@ -3032,7 +3045,6 @@ def append_client(*client_names):
elif pr:
# Save client details for introspection later
innertube_context = traverse_obj(player_ytcfg or self._get_default_ytcfg(client), 'INNERTUBE_CONTEXT')
fetched_timestamp = int(time.time())
sd = pr.setdefault('streamingData', {})
sd[STREAMING_DATA_CLIENT_NAME] = client
sd[STREAMING_DATA_FETCH_GVS_PO_TOKEN] = fetch_gvs_po_token_func
@@ -3040,7 +3052,7 @@ def append_client(*client_names):
sd[STREAMING_DATA_INNERTUBE_CONTEXT] = innertube_context
sd[STREAMING_DATA_FETCH_SUBS_PO_TOKEN] = fetch_subs_po_token_func
sd[STREAMING_DATA_IS_PREMIUM_SUBSCRIBER] = is_premium_subscriber
sd[STREAMING_DATA_FETCHED_TIMESTAMP] = fetched_timestamp
sd[STREAMING_DATA_AVAILABLE_AT_TIMESTAMP] = self._get_available_at_timestamp(pr, video_id, client)
for f in traverse_obj(sd, (('formats', 'adaptiveFormats'), ..., {dict})):
f[STREAMING_DATA_CLIENT_NAME] = client
f[STREAMING_DATA_FETCH_GVS_PO_TOKEN] = fetch_gvs_po_token_func
@@ -3150,6 +3162,9 @@ def _extract_formats_and_subtitles(self, video_id, player_responses, player_url,
self._downloader.deprecated_feature('[youtube] include_duplicate_formats extractor argument is deprecated. '
'Use formats=duplicate extractor argument instead')
def is_super_resolution(f_url):
return '1' in traverse_obj(f_url, ({parse_qs}, 'xtags', ..., {urllib.parse.parse_qs}, 'sr', ...))
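# Illustrative only: a format URL carrying e.g. '&xtags=sr%3D1' parses to
# {'sr': ['1']} here and the stream is flagged as AI-upscaled ("super resolution").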
def solve_sig(s, spec):
return ''.join(s[i] for i in spec)
@@ -3169,9 +3184,6 @@ def gvs_pot_required(policy, is_premium_subscriber, has_player_token):
# save pots per client to avoid fetching again
gvs_pots = {}
# For handling potential pre-playback required waiting period
playback_wait = int_or_none(self._configuration_arg('playback_wait', [None])[0], default=6)
def get_language_code_and_preference(fmt_stream):
audio_track = fmt_stream.get('audioTrack') or {}
display_name = audio_track.get('displayName') or ''
@@ -3196,13 +3208,13 @@ def get_language_code_and_preference(fmt_stream):
is_premium_subscriber = streaming_data[STREAMING_DATA_IS_PREMIUM_SUBSCRIBER]
player_token_provided = streaming_data[STREAMING_DATA_PLAYER_TOKEN_PROVIDED]
client_name = streaming_data.get(STREAMING_DATA_CLIENT_NAME)
available_at = streaming_data[STREAMING_DATA_FETCHED_TIMESTAMP] + playback_wait
available_at = streaming_data[STREAMING_DATA_AVAILABLE_AT_TIMESTAMP]
streaming_formats = traverse_obj(streaming_data, (('formats', 'adaptiveFormats'), ...))
def get_stream_id(fmt_stream):
return str_or_none(fmt_stream.get('itag')), traverse_obj(fmt_stream, 'audioTrack', 'id'), fmt_stream.get('isDrc')
def process_format_stream(fmt_stream, proto, missing_pot):
def process_format_stream(fmt_stream, proto, missing_pot, super_resolution=False):
itag = str_or_none(fmt_stream.get('itag'))
audio_track = fmt_stream.get('audioTrack') or {}
quality = fmt_stream.get('quality')
@@ -3253,10 +3265,13 @@ def process_format_stream(fmt_stream, proto, missing_pot):
dct = {
'asr': int_or_none(fmt_stream.get('audioSampleRate')),
'filesize': int_or_none(fmt_stream.get('contentLength')),
'format_id': f'{itag}{"-drc" if fmt_stream.get("isDrc") else ""}',
'format_id': join_nonempty(itag, (
'drc' if fmt_stream.get('isDrc')
else 'sr' if super_resolution
else None)),
'format_note': join_nonempty(
join_nonempty(audio_track.get('displayName'), audio_track.get('audioIsDefault') and '(default)', delim=' '),
name, fmt_stream.get('isDrc') and 'DRC',
name, fmt_stream.get('isDrc') and 'DRC', super_resolution and 'AI-upscaled',
try_get(fmt_stream, lambda x: x['projectionType'].replace('RECTANGULAR', '').lower()),
try_get(fmt_stream, lambda x: x['spatialAudioType'].replace('SPATIAL_AUDIO_TYPE_', '').lower()),
is_damaged and 'DAMAGED', missing_pot and 'MISSING POT',
@@ -3342,7 +3357,9 @@ def process_https_formats():
self.report_warning(msg, video_id, only_once=True)
continue
fmt = process_format_stream(fmt_stream, proto, missing_pot=require_po_token and not po_token)
fmt = process_format_stream(
fmt_stream, proto, missing_pot=require_po_token and not po_token,
super_resolution=is_super_resolution(fmt_url))
if not fmt:
continue
@@ -3645,6 +3662,36 @@ def _download_initial_webpage(self, webpage_url, webpage_client, video_id):
}))
return webpage
def _get_available_at_timestamp(self, player_response, video_id, client):
now = time.time()
wait_seconds = 0
for renderer in traverse_obj(player_response, (
'adSlots', lambda _, v: v['adSlotRenderer']['adSlotMetadata']['triggerEvent'] == 'SLOT_TRIGGER_EVENT_BEFORE_CONTENT',
'adSlotRenderer', 'fulfillmentContent', 'fulfilledLayout', 'playerBytesAdLayoutRenderer', 'renderingContent', (
None,
('playerBytesSequentialLayoutRenderer', 'sequentialLayouts', ..., 'playerBytesAdLayoutRenderer', 'renderingContent'),
), 'instreamVideoAdRenderer', {dict},
)):
duration = traverse_obj(renderer, ('playerVars', {urllib.parse.parse_qs}, 'length_seconds', -1, {int_or_none}))
ad = 'an ad' if duration is None else f'a {duration}s ad'
skip_time = traverse_obj(renderer, ('skipOffsetMilliseconds', {float_or_none(scale=1000)}))
if skip_time is not None:
# YT allows skipping this ad; use the wait-until-skip time instead of full ad duration
skip_time = skip_time if skip_time % 1 else int(skip_time)
ad += f' skippable after {skip_time}s'
duration = skip_time
if duration is not None:
self.write_debug(f'{video_id}: Detected {ad} for {client}')
wait_seconds += duration
if wait_seconds:
return math.ceil(now) + wait_seconds
return int(now)
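# Illustrative only: a player response advertising one 15s unskippable pre-roll
# plus one pre-roll skippable after 5s gives wait_seconds == 20, so formats are
# treated as available at ceil(now) + 20; with no BEFORE_CONTENT ad slots the
# function simply returns int(now).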
def _list_formats(self, video_id, microformats, video_details, player_responses, player_url, duration=None):
live_broadcast_details = traverse_obj(microformats, (..., 'liveBroadcastDetails'))
is_live = get_first(video_details, 'isLive')
@@ -3995,6 +4042,11 @@ def process_language(container, base_url, lang_code, sub_name, client_name, quer
STREAMING_DATA_CLIENT_NAME: client_name,
})
def set_audio_lang_from_orig_subs_lang(lang_code):
for f in formats:
if f.get('acodec') != 'none' and not f.get('language'):
f['language'] = lang_code
subtitles = {}
skipped_subs_clients = set()
@@ -4054,7 +4106,8 @@ def process_language(container, base_url, lang_code, sub_name, client_name, quer
orig_lang = qs.get('lang', [None])[-1]
lang_name = self._get_text(caption_track, 'name', max_runs=1)
if caption_track.get('kind') != 'asr':
is_manual_subs = caption_track.get('kind') != 'asr'
if is_manual_subs:
if not lang_code:
continue
process_language(
@@ -4065,16 +4118,14 @@ def process_language(container, base_url, lang_code, sub_name, client_name, quer
if not trans_code:
continue
orig_trans_code = trans_code
if caption_track.get('kind') != 'asr' and trans_code != 'und':
if is_manual_subs and trans_code != 'und':
if not get_translated_subs:
continue
trans_code += f'-{lang_code}'
trans_name += format_field(lang_name, None, ' from %s')
if lang_code == f'a-{orig_trans_code}':
# Set audio language based on original subtitles
for f in formats:
if f.get('acodec') != 'none' and not f.get('language'):
f['language'] = orig_trans_code
set_audio_lang_from_orig_subs_lang(orig_trans_code)
# Add an "-orig" label to the original language so that it can be distinguished.
# The subs are returned without "-orig" as well for compatibility
process_language(
@@ -4085,6 +4136,21 @@ def process_language(container, base_url, lang_code, sub_name, client_name, quer
automatic_captions, base_url, trans_code, trans_name, client_name,
pot_params if orig_lang == orig_trans_code else {'tlang': trans_code, **pot_params})
# Extract automatic captions when the language is not in 'translationLanguages'
# e.g. Cantonese [yue], see https://github.com/yt-dlp/yt-dlp/issues/14889
lang_code = remove_start(lang_code, 'a-')
if is_manual_subs or not lang_code or lang_code in automatic_captions:
continue
lang_name = remove_end(lang_name, ' (auto-generated)')
if caption_track.get('isTranslatable'):
# We can assume this is the original audio language
set_audio_lang_from_orig_subs_lang(lang_code)
process_language(
automatic_captions, base_url, f'{lang_code}-orig',
f'{lang_name} (Original)', client_name, pot_params)
process_language(
automatic_captions, base_url, lang_code, lang_name, client_name, pot_params)
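# Illustrative only: a video whose only caption track is Cantonese ASR ('a-yue',
# absent from 'translationLanguages') now yields automatic_captions under both
# 'yue-orig' and 'yue' instead of being skipped; when the track is marked
# translatable, 'yue' is also applied as the audio language of formats lacking one.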
# Avoid duplication if we've already got everything we need
need_subs_langs.difference_update(subtitles)
need_caps_langs.difference_update(automatic_captions)


@@ -21,6 +21,7 @@
)
from yt_dlp.extractor.youtube.pot._provider import configuration_arg
from yt_dlp.extractor.youtube.pot.provider import provider_bug_report_message
from yt_dlp.utils import version_tuple
from yt_dlp.utils._jsruntime import JsRuntimeInfo
if _has_ejs:
@@ -223,7 +224,8 @@ def _get_script(self, script_type: ScriptType, /) -> Script:
skipped_components.append(script)
continue
if not self.is_dev:
if script.version != self._SCRIPT_VERSION:
# Only major.minor must match; a script that also matches in patch version is expected to have an identical hash
if version_tuple(script.version, lenient=True)[:2] != version_tuple(self._SCRIPT_VERSION, lenient=True)[:2]:
self.logger.warning(
f'Challenge solver {script_type.value} script version {script.version} '
f'is not supported (source: {script.source.value}, variant: {script.variant}, supported version: {self._SCRIPT_VERSION})')
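# Illustrative only: with _SCRIPT_VERSION == '0.3.2', a cached '0.3.1' script now
# passes this check (major.minor match), while e.g. '0.2.9' still triggers the
# unsupported-version warning above.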


@@ -1,6 +1,6 @@
# This file is generated by devscripts/update_ejs.py. DO NOT MODIFY!
VERSION = '0.3.1'
VERSION = '0.3.2'
HASHES = {
'yt.solver.bun.lib.js': '6ff45e94de9f0ea936a183c48173cfa9ce526ee4b7544cd556428427c1dd53c8073ef0174e79b320252bf0e7c64b0032cc1cf9c4358f3fda59033b7caa01c241',
'yt.solver.core.js': '0cd96b2d3f319dfa62cae689efa7d930ef1706e95f5921794db5089b2262957ec0a17d73938d8975ea35d0309cbfb4c8e4418d5e219837215eee242890c8b64d',


@@ -305,6 +305,8 @@ def __init__(self, res: http.client.HTTPResponse | urllib.response.addinfourl):
status=getattr(res, 'status', None) or res.getcode(), reason=getattr(res, 'reason', None))
def read(self, amt=None):
if self.closed:
return b''
try:
data = self.fp.read(amt)
underlying = getattr(self.fp, 'fp', None)


@@ -689,7 +689,7 @@ def _preset_alias_callback(option, opt_str, value, parser):
'-I', '--playlist-items',
dest='playlist_items', metavar='ITEM_SPEC', default=None,
help=(
'Comma separated playlist_index of the items to download. '
'Comma-separated playlist_index of the items to download. '
'You can specify a range using "[START]:[STOP][:STEP]". For backward compatibility, START-STOP is also supported. '
'Use negative indices to count from the right and negative STEP to download in reverse order. '
'E.g. "-I 1:3,7,-5::2" used on a playlist of size 15 will download the items at index 1,2,3,7,11,13,15'))


@@ -192,7 +192,10 @@ def _probe_version(self):
@property
def available(self):
return bool(self._ffmpeg_location.get()) or self.basename is not None
# If we return that ffmpeg is available, then the `basename` property *must* be
# evaluated (as doing so has side effects), and its value can never be None
# See: https://github.com/yt-dlp/yt-dlp/issues/12829
return self.basename is not None
@property
def executable(self):
@@ -747,8 +750,8 @@ def add(meta_list, info_list=None):
add('track', 'track_number')
add('artist', ('artist', 'artists', 'creator', 'creators', 'uploader', 'uploader_id'))
add('composer', ('composer', 'composers'))
add('genre', ('genre', 'genres'))
add('album')
add('genre', ('genre', 'genres', 'categories', 'tags'))
add('album', ('album', 'series'))
add('album_artist', ('album_artist', 'album_artists'))
add('disc', 'disc_number')
add('show', 'series')
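# Illustrative only (assuming add() falls back through the listed keys in
# order): an info dict like
#   {'series': 'Japan Railway Journal', 'categories': ['Documentary']}
# now also fills the 'album' and 'genre' tags when no dedicated
# 'album'/'genre' fields are present.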


@@ -1,21 +1,56 @@
from __future__ import annotations
import abc
import dataclasses
import functools
import os.path
import sys
from ._utils import _get_exe_version_output, detect_exe_version, int_or_none
from ._utils import _get_exe_version_output, detect_exe_version, version_tuple
# NOT public API
def runtime_version_tuple(v):
# NB: will return (0,) if `v` is an invalid version string
return tuple(int_or_none(x, default=0) for x in v.split('.'))
_FALLBACK_PATHEXT = ('.COM', '.EXE', '.BAT', '.CMD')
def _find_exe(basename: str) -> str:
if os.name != 'nt':
return basename
paths: list[str] = []
# binary dir
if getattr(sys, 'frozen', False):
paths.append(os.path.dirname(sys.executable))
# cwd
paths.append(os.getcwd())
# PATH items
if path := os.environ.get('PATH'):
paths.extend(filter(None, path.split(os.path.pathsep)))
pathext = os.environ.get('PATHEXT')
if pathext is None:
exts = _FALLBACK_PATHEXT
else:
exts = tuple(ext for ext in pathext.split(os.pathsep) if ext)
visited = []
for path in map(os.path.realpath, paths):
normed = os.path.normcase(path)
if normed in visited:
continue
visited.append(normed)
for ext in exts:
binary = os.path.join(path, f'{basename}{ext}')
if os.access(binary, os.F_OK | os.X_OK) and not os.path.isdir(binary):
return binary
return basename
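# Illustrative only: on Windows, _find_exe('deno') probes the frozen-binary dir
# (if any), then the cwd, then each PATH entry, trying 'deno' plus each PATHEXT
# extension ('.COM', '.EXE', '.BAT', '.CMD' as the fallback list), and returns
# the bare name 'deno' if nothing matches.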
def _determine_runtime_path(path, basename):
if not path:
return basename
return _find_exe(basename)
if os.path.isdir(path):
return os.path.join(path, basename)
return path
@@ -52,7 +87,7 @@ def _info(self):
if not out:
return None
version = detect_exe_version(out, r'^deno (\S+)', 'unknown')
vt = runtime_version_tuple(version)
vt = version_tuple(version, lenient=True)
return JsRuntimeInfo(
name='deno', path=path, version=version, version_tuple=vt,
supported=vt >= self.MIN_SUPPORTED_VERSION)
@@ -67,7 +102,7 @@ def _info(self):
if not out:
return None
version = detect_exe_version(out, r'^(\S+)', 'unknown')
vt = runtime_version_tuple(version)
vt = version_tuple(version, lenient=True)
return JsRuntimeInfo(
name='bun', path=path, version=version, version_tuple=vt,
supported=vt >= self.MIN_SUPPORTED_VERSION)
@@ -82,7 +117,7 @@ def _info(self):
if not out:
return None
version = detect_exe_version(out, r'^v(\S+)', 'unknown')
vt = runtime_version_tuple(version)
vt = version_tuple(version, lenient=True)
return JsRuntimeInfo(
name='node', path=path, version=version, version_tuple=vt,
supported=vt >= self.MIN_SUPPORTED_VERSION)
@@ -100,7 +135,7 @@ def _info(self):
is_ng = 'QuickJS-ng' in out
version = detect_exe_version(out, r'^QuickJS(?:-ng)?\s+version\s+(\S+)', 'unknown')
vt = runtime_version_tuple(version.replace('-', '.'))
vt = version_tuple(version, lenient=True)
if is_ng:
return JsRuntimeInfo(
name='quickjs-ng', path=path, version=version, version_tuple=vt,


@@ -876,13 +876,19 @@ def __init__(self, args, *remaining, env=None, text=False, shell=False, **kwargs
kwargs.setdefault('encoding', 'utf-8')
kwargs.setdefault('errors', 'replace')
if shell and os.name == 'nt' and kwargs.get('executable') is None:
if not isinstance(args, str):
args = shell_quote(args, shell=True)
shell = False
# Set variable for `cmd.exe` newline escaping (see `utils.shell_quote`)
env['='] = '"^\n\n"'
args = f'{self.__comspec()} /Q /S /D /V:OFF /E:ON /C "{args}"'
if os.name == 'nt' and kwargs.get('executable') is None:
# Must apply shell escaping if we are trying to run a batch file
# These conditions should be very specific to limit impact
if not shell and isinstance(args, list) and args and args[0].lower().endswith(('.bat', '.cmd')):
shell = True
if shell:
if not isinstance(args, str):
args = shell_quote(args, shell=True)
shell = False
# Set variable for `cmd.exe` newline escaping (see `utils.shell_quote`)
env['='] = '"^\n\n"'
args = f'{self.__comspec()} /Q /S /D /V:OFF /E:ON /C "{args}"'
super().__init__(args, *remaining, env=env, shell=shell, **kwargs, startupinfo=self._startupinfo)
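# Illustrative only: Popen(['script.bat', 'arg']) on Windows now takes the shell
# path, producing a command line shaped like
#   cmd.exe /Q /S /D /V:OFF /E:ON /C "script.bat arg"
# so batch files get cmd.exe quoting rules instead of raw CreateProcess argv.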
@@ -2889,8 +2895,9 @@ def limit_length(s, length):
return s
def version_tuple(v):
return tuple(int(e) for e in re.split(r'[-.]', v))
def version_tuple(v, *, lenient=False):
parse = int_or_none(default=-1) if lenient else int
return tuple(parse(e) for e in re.split(r'[-.]', v))
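# Illustrative only:
#   version_tuple('2025.12.08')             == (2025, 12, 8)
#   version_tuple('1.2-rc1', lenient=True)  == (1, 2, -1)
#   version_tuple('1.2-rc1')                # raises ValueError on 'rc1'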
def is_outdated_version(version, limit, assume_new=True):


@@ -1,8 +1,8 @@
# Autogenerated by devscripts/update-version.py
__version__ = '2025.11.12'
__version__ = '2025.12.08'
RELEASE_GIT_HEAD = '335653be82d5ef999cfc2879d005397402eebec1'
RELEASE_GIT_HEAD = '7a52ff29d86efc8f3adeba977b2009ce40b8e52e'
VARIANT = None
@@ -12,4 +12,4 @@
ORIGIN = 'yt-dlp/yt-dlp'
_pkg_version = '2025.11.12'
_pkg_version = '2025.12.08'