mirror of https://github.com/yt-dlp/yt-dlp
synced 2025-12-16 22:25:40 +07:00
Compare commits
22 Commits
2021.01.07...2021.01.10
| Author | SHA1 | Date | |
|---|---|---|---|
| | 65156eba45 | | |
| | ba3c9477ee | | |
| | a3e26449cd | | |
| | 7267acd1ed | | |
| | f446cc6667 | | |
| | ebdd9275c3 | | |
| | b2f70ae74e | | |
| | 5ac2324460 | | |
| | 4084f235eb | | |
| | 6fd35a1101 | | |
| | f5b1bca913 | | |
| | d9eebbc747 | | |
| | c3e6ffba53 | | |
| | 8c04f0be96 | | |
| | ab8e5e516f | | |
| | 62d80ba17c | | |
| | e8273c86a3 | | |
| | e5bc03a6fa | | |
| | 034b6215b4 | | |
| | 00dd0cd573 | | |
| | 0c0ff18f7d | | |
| | a26c99ac13 | | |
6
.github/ISSUE_TEMPLATE/1_broken_site.md
vendored
@@ -21,7 +21,7 @@ ## Checklist

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
-- First of, make sure you are using the latest version of youtube-dlc. Run `youtube-dlc --version` and ensure your version is 2021.01.07. If it's not, see https://github.com/pukkandan/yt-dlc on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dlc. Run `youtube-dlc --version` and ensure your version is 2021.01.09. If it's not, see https://github.com/pukkandan/yt-dlc on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/pukkandan/yt-dlc.
 - Search the bugtracker for similar issues: https://github.com/pukkandan/yt-dlc. DO NOT post duplicates.
@@ -29,7 +29,7 @@ ## Checklist
 -->

 - [ ] I'm reporting a broken site support
-- [ ] I've verified that I'm running youtube-dlc version **2021.01.07**
+- [ ] I've verified that I'm running youtube-dlc version **2021.01.09**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
 - [ ] I've searched the bugtracker for similar issues including closed ones
@@ -44,7 +44,7 @@ ## Verbose log
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dlc version 2021.01.07
+[debug] youtube-dlc version 2021.01.09
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
@@ -21,7 +21,7 @@ ## Checklist

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
-- First of, make sure you are using the latest version of youtube-dlc. Run `youtube-dlc --version` and ensure your version is 2021.01.07. If it's not, see https://github.com/pukkandan/yt-dlc on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dlc. Run `youtube-dlc --version` and ensure your version is 2021.01.09. If it's not, see https://github.com/pukkandan/yt-dlc on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that site you are requesting is not dedicated to copyright infringement, see https://github.com/pukkandan/yt-dlc. youtube-dlc does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
 - Search the bugtracker for similar site support requests: https://github.com/pukkandan/yt-dlc. DO NOT post duplicates.
@@ -29,7 +29,7 @@ ## Checklist
 -->

 - [ ] I'm reporting a new site support request
-- [ ] I've verified that I'm running youtube-dlc version **2021.01.07**
+- [ ] I've verified that I'm running youtube-dlc version **2021.01.09**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that none of provided URLs violate any copyrights
 - [ ] I've searched the bugtracker for similar site support requests including closed ones
@@ -21,13 +21,13 @@ ## Checklist

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
-- First of, make sure you are using the latest version of youtube-dlc. Run `youtube-dlc --version` and ensure your version is 2021.01.07. If it's not, see https://github.com/pukkandan/yt-dlc on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dlc. Run `youtube-dlc --version` and ensure your version is 2021.01.09. If it's not, see https://github.com/pukkandan/yt-dlc on how to update. Issues with outdated version will be REJECTED.
 - Search the bugtracker for similar site feature requests: https://github.com/pukkandan/yt-dlc. DO NOT post duplicates.
 - Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
 -->

 - [ ] I'm reporting a site feature request
-- [ ] I've verified that I'm running youtube-dlc version **2021.01.07**
+- [ ] I've verified that I'm running youtube-dlc version **2021.01.09**
 - [ ] I've searched the bugtracker for similar site feature requests including closed ones
6
.github/ISSUE_TEMPLATE/4_bug_report.md
vendored
@@ -21,7 +21,7 @@ ## Checklist

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
-- First of, make sure you are using the latest version of youtube-dlc. Run `youtube-dlc --version` and ensure your version is 2021.01.07. If it's not, see https://github.com/pukkandan/yt-dlc on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dlc. Run `youtube-dlc --version` and ensure your version is 2021.01.09. If it's not, see https://github.com/pukkandan/yt-dlc on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/pukkandan/yt-dlc.
 - Search the bugtracker for similar issues: https://github.com/pukkandan/yt-dlc. DO NOT post duplicates.
@@ -30,7 +30,7 @@ ## Checklist
 -->

 - [ ] I'm reporting a broken site support issue
-- [ ] I've verified that I'm running youtube-dlc version **2021.01.07**
+- [ ] I've verified that I'm running youtube-dlc version **2021.01.09**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
 - [ ] I've searched the bugtracker for similar bug reports including closed ones
@@ -46,7 +46,7 @@ ## Verbose log
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dlc version 2021.01.07
+[debug] youtube-dlc version 2021.01.09
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
4
.github/ISSUE_TEMPLATE/5_feature_request.md
vendored
@@ -21,13 +21,13 @@ ## Checklist

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dlc:
-- First of, make sure you are using the latest version of youtube-dlc. Run `youtube-dlc --version` and ensure your version is 2021.01.07. If it's not, see https://github.com/pukkandan/yt-dlc on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dlc. Run `youtube-dlc --version` and ensure your version is 2021.01.09. If it's not, see https://github.com/pukkandan/yt-dlc on how to update. Issues with outdated version will be REJECTED.
 - Search the bugtracker for similar feature requests: https://github.com/pukkandan/yt-dlc. DO NOT post duplicates.
 - Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
 -->

 - [ ] I'm reporting a feature request
-- [ ] I've verified that I'm running youtube-dlc version **2021.01.07**
+- [ ] I've verified that I'm running youtube-dlc version **2021.01.09**
 - [ ] I've searched the bugtracker for similar feature requests including closed ones
16
.github/workflows/build.yml
vendored
@@ -161,3 +161,19 @@ jobs:
         asset_path: ./SHA2-256SUMS
         asset_name: SHA2-256SUMS
         asset_content_type: text/plain
+
+  update_version_badge:
+
+    runs-on: ubuntu-latest
+
+    needs: build_unix
+
+    steps:
+    - name: Create Version Badge
+      uses: schneegans/dynamic-badges-action@v1.0.0
+      with:
+        auth: ${{ secrets.GIST_TOKEN }}
+        gistID: c69cb23c3c5b3316248e52022790aa57
+        filename: version.json
+        label: Version
+        message: ${{ needs.build_unix.outputs.ytdlc_version }}
|
||||
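Note on the `update_version_badge` job: it passes a label and a message to `schneegans/dynamic-badges-action`, which writes them to `version.json` in the gist. As a rough sketch of what such a file could look like, assuming the badge is consumed through the shields.io endpoint schema (`schemaVersion`, `label`, `message`); the helper name `make_badge_payload` and the sample version string are illustrative and not part of this changeset:

```python
import json


def make_badge_payload(label, message):
    """Build a shields.io endpoint-style badge description (assumed schema)."""
    return {"schemaVersion": 1, "label": label, "message": message}


# Hypothetical value; in the workflow the message comes from
# needs.build_unix.outputs.ytdlc_version.
payload = make_badge_payload("Version", "2021.01.10")
print(json.dumps(payload, indent=2))
```

Keeping the version in a gist means the README badge can change on each release build without a commit to the repository itself.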
21
.github/workflows/ci.yml
vendored
@@ -1,4 +1,4 @@
-name: CI
+name: Full Test
 on: [push]
 jobs:
   tests:
@@ -7,10 +7,9 @@ jobs:
     strategy:
       fail-fast: true
       matrix:
-        os: [ubuntu-latest]
+        os: [ubuntu-18.04]
         # TODO: python 2.6
-        # 3.3, 3.4 are not running
-        python-version: [2.7, 3.5, 3.6, 3.7, 3.8, 3.9, pypy-2.7, pypy-3.6, pypy-3.7]
+        python-version: [2.7, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, pypy-2.7, pypy-3.6, pypy-3.7]
         python-impl: [cpython]
         ytdl-test-set: [core, download]
         run-tests-ext: [sh]
@@ -60,16 +59,4 @@ jobs:
       env:
         YTDL_TEST_SET: ${{ matrix.ytdl-test-set }}
       run: ./devscripts/run_tests.${{ matrix.run-tests-ext }}
-  flake8:
-    name: Linter
-    runs-on: ubuntu-latest
-    steps:
-    - uses: actions/checkout@v2
-    - name: Set up Python
-      uses: actions/setup-python@v2
-      with:
-        python-version: 3.9
-    - name: Install flake8
-      run: pip install flake8
-    - name: Run flake8
-      run: flake8 .
+  # flake8 has been moved to quick-test
31
.github/workflows/quick-test.yml
vendored
Normal file
@@ -0,0 +1,31 @@
+name: Core Test
+on: [push]
+jobs:
+  tests:
+    name: Core Tests
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v2
+    - name: Set up Python 3.9
+      uses: actions/setup-python@v2
+      with:
+        python-version: 3.9
+    - name: Install nose
+      run: pip install nose
+    - name: Run tests
+      env:
+        YTDL_TEST_SET: core
+      run: ./devscripts/run_tests.sh
+  flake8:
+    name: Linter
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v2
+    - name: Set up Python
+      uses: actions/setup-python@v2
+      with:
+        python-version: 3.9
+    - name: Install flake8
+      run: pip install flake8
+    - name: Run flake8
+      run: flake8 .
@@ -5,4 +5,10 @@ nixxo
 GreyAlien502
 kyuyeunk
 siikamiika
-jbruchon
+jbruchon
+alexmerkel
+glenn-slayden
+Unrud
+wporr
+mariuszskon
+ohnonot
82
Changelog.md
Normal file
@@ -0,0 +1,82 @@
|
||||
# Changelog
|
||||
|
||||
<!--
|
||||
# Instuctions for creating release
|
||||
|
||||
* Run `make doc`
|
||||
* Update Changelog.md and Authors-Fork
|
||||
* Commit to master as `Release <version>`
|
||||
* Push to origin/release - build task will now run
|
||||
* Update version.py and run `make issuetemplates`
|
||||
* Commit to master as `[version] update`
|
||||
* Push to origin/master
|
||||
|
||||
-->
|
||||
|
||||
### 2020.01.10
|
||||
* [archive.org] Fix extractor and add support for audio and playlists by @wporr
|
||||
* [Animelab] Added by @mariuszskon
|
||||
* [youtube:search] Fix view_count by @ohnonot
|
||||
* [youtube] Show if video is embeddable in info
|
||||
* Update version badge automatically in README
|
||||
* Enable `test_youtube_search_matching`
|
||||
* Create `to_screen` and similar functions in postprocessor/common
|
||||
|
||||
### 2020.01.09
|
||||
* [youtube] Fix bug in automatic caption extraction
|
||||
* Add `post_hooks` to YoutubeDL by @alexmerkel
|
||||
* Batch file enumeration improvements by @glenn-slayden
|
||||
* Stop immediately when reaching `--max-downloads` by @glenn-slayden
|
||||
* Fix incorrect ANSI sequence for restoring console-window title by @glenn-slayden
|
||||
* Kill child processes when yt-dlc is killed by @Unrud
|
||||
|
||||
### 2020.01.08
|
||||
* **Merge youtube-dl:** Upto [2020.01.08](https://github.com/ytdl-org/youtube-dl/commit/bf6a74c620bd4d5726503c5302906bb36b009026)
|
||||
* Extractor stitcher ([1](https://github.com/ytdl-org/youtube-dl/commit/bb38a1215718cdf36d73ff0a7830a64cd9fa37cc), [2](https://github.com/ytdl-org/youtube-dl/commit/a563c97c5cddf55f8989ed7ea8314ef78e30107f)) have not been merged
|
||||
* Moved changelog to seperate file
|
||||
|
||||
### 2021.01.07-1
|
||||
* [Akamai] fix by @nixxo
|
||||
* [Tiktok] merge youtube-dl tiktok extractor by @GreyAlien502
|
||||
* [vlive] add support for playlists by @kyuyeunk
|
||||
* [youtube_live_chat] make sure playerOffsetMs is positive by @siikamiika
|
||||
* Ignore extra data streams in ffmpeg by @jbruchon
|
||||
* Allow passing different arguments to different postprocessors using `--postprocessor-args`
|
||||
* Deprecated `--sponskrub-args`. The same can now be done using `--postprocessor-args "sponskrub:<args>"`
|
||||
* [CI] Split tests into core-test and full-test
|
||||
|
||||
### 2021.01.07
|
||||
* Removed priority of `av01` codec in `-S` since most devices don't support it yet
|
||||
* Added `duration_string` to be used in `--output`
|
||||
* Created First Release
|
||||
|
||||
### 2021.01.05-1
|
||||
* **Changed defaults:**
|
||||
* Enabled `--ignore`
|
||||
* Disabled `--video-multistreams` and `--audio-multistreams`
|
||||
* Changed default format selection to `bv*+ba/b` when `--audio-multistreams` is disabled
|
||||
* Changed default format sort order to `res,fps,codec,size,br,asr,proto,ext,has_audio,source,format_id`
|
||||
* Changed `webm` to be more preferable than `flv` in format sorting
|
||||
* Changed default output template to `%(title)s [%(id)s].%(ext)s`
|
||||
* Enabled `--list-formats-as-table`
|
||||
|
||||
### 2021.01.05
|
||||
* **Format Sort:** Added `--format-sort` (`-S`), `--format-sort-force` (`--S-force`) - See [Sorting Formats](README.md#sorting-formats) for details
|
||||
* **Format Selection:** See [Format Selection](README.md#format-selection) for details
|
||||
* New format selectors: `best*`, `worst*`, `bestvideo*`, `bestaudio*`, `worstvideo*`, `worstaudio*`
|
||||
* Changed video format sorting to show video only files and video+audio files together.
|
||||
* Added `--video-multistreams`, `--no-video-multistreams`, `--audio-multistreams`, `--no-audio-multistreams`
|
||||
* Added `b`,`w`,`v`,`a` as alias for `best`, `worst`, `video` and `audio` respectively
|
||||
* **Shortcut Options:** Added `--write-link`, `--write-url-link`, `--write-webloc-link`, `--write-desktop-link` by @h-h-h-h - See [Internet Shortcut Options]README.md(#internet-shortcut-options) for details
|
||||
* **Sponskrub integration:** Added `--sponskrub`, `--sponskrub-cut`, `--sponskrub-force`, `--sponskrub-location`, `--sponskrub-args` - See [SponSkrub Options](README.md#sponskrub-options-sponsorblock) for details
|
||||
* Added `--force-download-archive` (`--force-write-archive`) by by h-h-h-h
|
||||
* Added `--list-formats-as-table`, `--list-formats-old`
|
||||
* **Negative Options:** Makes it possible to negate boolean options by adding a `no-` to the switch
|
||||
* Added `--no-ignore-dynamic-mpd`, `--no-allow-dynamic-mpd`, `--allow-dynamic-mpd`, `--youtube-include-hls-manifest`, `--no-youtube-include-hls-manifest`, `--no-youtube-skip-hls-manifest`, `--no-download`, `--no-download-archive`, `--resize-buffer`, `--part`, `--mtime`, `--no-keep-fragments`, `--no-cookies`, `--no-write-annotations`, `--no-write-info-json`, `--no-write-description`, `--no-write-thumbnail`, `--youtube-include-dash-manifest`, `--post-overwrites`, `--no-keep-video`, `--no-embed-subs`, `--no-embed-thumbnail`, `--no-add-metadata`, `--no-include-ads`, `--no-write-sub`, `--no-write-auto-sub`, `--no-playlist-reverse`, `--no-restrict-filenames`, `--youtube-include-dash-manifest`, `--no-format-sort-force`, `--flat-videos`, `--no-list-formats-as-table`, `--no-sponskrub`, `--no-sponskrub-cut`, `--no-sponskrub-force`
|
||||
* Renamed: `--write-subs`, `--no-write-subs`, `--no-write-auto-subs`, `--write-auto-subs`. Note that these can still be used without the ending "s"
|
||||
* Relaxed validation for format filters so that any arbitrary field can be used
|
||||
* Fix for embedding thumbnail in mp3 by @pauldubois98
|
||||
* Make Twitch Video ID output from Playlist and VOD extractor same. This is only a temporary fix
|
||||
* **Merge youtube-dl:** Upto [2020.01.03](https://github.com/ytdl-org/youtube-dl/commit/8e953dcbb10a1a42f4e12e4e132657cb0100a1f8) - See [blackjack4494/yt-dlc#280](https://github.com/blackjack4494/yt-dlc/pull/280) for details
|
||||
* Extractors [tiktok](https://github.com/ytdl-org/youtube-dl/commit/fb626c05867deab04425bad0c0b16b55473841a2) and [hotstar](https://github.com/ytdl-org/youtube-dl/commit/bb38a1215718cdf36d73ff0a7830a64cd9fa37cc) have not been merged
|
||||
* Cleaned up the fork for public use
|
||||
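Note on the default sort order in the 2021.01.05-1 entry (`res,fps,codec,...`): one way to read such a list is as a lexicographic comparison, where formats are compared on the first field and later fields only break ties. A minimal sketch under that assumption, using made-up format dicts and only the numeric fields `res`, `fps` and `br`; it is not the fork's actual sorting code, and missing values are simply treated as lowest:

```python
SORT_FIELDS = ("res", "fps", "br")  # illustrative subset of the default order


def sort_key(fmt, fields=SORT_FIELDS):
    """Tuple key: earlier fields dominate, later ones only break ties."""
    return tuple(fmt.get(field) or 0 for field in fields)


def best_format(formats, fields=SORT_FIELDS):
    """Return the format that ranks highest under the given field order."""
    return max(formats, key=lambda fmt: sort_key(fmt, fields))


formats = [  # hypothetical data
    {"id": "a", "res": 720, "fps": 60, "br": 5000},
    {"id": "b", "res": 1080, "fps": 30, "br": 3000},
    {"id": "c", "res": 1080, "fps": 30, "br": 4500},
]
print(best_format(formats)["id"])           # c
print(best_format(formats, ("br",))["id"])  # a
```

Ranking on bitrate alone picks the 720p entry, while putting `res` first picks a 1080p one, which is the kind of difference a field-ordered sort is meant to produce.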
6
Makefile
@@ -10,7 +10,8 @@ PREFIX ?= /usr/local
 BINDIR ?= $(PREFIX)/bin
 MANDIR ?= $(PREFIX)/man
 SHAREDIR ?= $(PREFIX)/share
-PYTHON ?= /usr/bin/env python
+# make_supportedsites.py doesnot work correctly in python2
+PYTHON ?= /usr/bin/env python3

 # set SYSCONFDIR to /etc if PREFIX=/usr or PREFIX=/usr/local
 SYSCONFDIR = $(shell if [ $(PREFIX) = /usr -o $(PREFIX) = /usr/local ]; then echo /etc; else echo $(PREFIX)/etc; fi)
@@ -50,7 +51,8 @@ offlinetest: codetest
 		--exclude test_subtitles.py \
 		--exclude test_write_annotations.py \
 		--exclude test_youtube_lists.py \
-		--exclude test_youtube_signature.py
+		--exclude test_youtube_signature.py \
+		--exclude test_post_hooks.py

 tar: youtube-dlc.tar.gz
82
README.md
@@ -1,12 +1,14 @@
|
||||
[](https://github.com/pukkandan/yt-dlc/actions?query=workflow%3ACI)
|
||||
[](https://github.com/pukkandan/yt-dlc/releases/latest)
|
||||
<!-- See: https://github.com/marketplace/actions/dynamic-badges -->
|
||||
[](https://github.com/pukkandan/yt-dlc/releases/latest)
|
||||
[](https://github.com/pukkandan/yt-dlc/blob/master/LICENSE)
|
||||
[](https://github.com/pukkandan/yt-dlc/actions?query=workflow%3ACore)
|
||||
[](https://github.com/pukkandan/yt-dlc/actions?query=workflow%3AFull)
|
||||
|
||||
youtube-dlc - download videos from youtube.com and many other [video platforms](docs/supportedsites.md)
|
||||
|
||||
This is a fork of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) which is inturn a fork of [youtube-dl](https://github.com/ytdl-org/youtube-dl)
|
||||
|
||||
* [CHANGES FROM YOUTUBE-DLC](#changes)
|
||||
* [NEW FEATURES](#new-features)
|
||||
* [INSTALLATION](#installation)
|
||||
* [UPDATE](#update)
|
||||
* [COMPILE](#compile)
|
||||
@@ -27,7 +29,7 @@
|
||||
* [Authentication Options](#authentication-options)
|
||||
* [Adobe Pass Options](#adobe-pass-options)
|
||||
* [Post-processing Options](#post-processing-options)
|
||||
* [SponSkrub Options (SponsorBlock)](#sponskrub-options-sponsorblock)
|
||||
* [SponSkrub Options (SponsorBlock)](#sponSkrub-options-sponsorblock)
|
||||
* [Extractor Options](#extractor-options)
|
||||
* [CONFIGURATION](#configuration)
|
||||
* [Authentication with .netrc file](#authentication-with-netrc-file)
|
||||
@@ -42,52 +44,18 @@
|
||||
* [MORE](#more)
|
||||
|
||||
|
||||
# CHANGES
|
||||
See [commits](https://github.com/pukkandan/yt-dlc/commits) for more details
|
||||
# NEW FEATURES
|
||||
The major new features are:
|
||||
|
||||
### 2021.01.05
|
||||
* **Format Sort:** Added `--format-sort` (`-S`), `--format-sort-force` (`--S-force`) - See [Sorting Formats](#sorting-formats) for details
|
||||
* **Format Selection:** See [Format Selection](#format-selection) for details
|
||||
* New format selectors: `best*`, `worst*`, `bestvideo*`, `bestaudio*`, `worstvideo*`, `worstaudio*`
|
||||
* Changed video format sorting to show video only files and video+audio files together.
|
||||
* Added `--video-multistreams`, `--no-video-multistreams`, `--audio-multistreams`, `--no-audio-multistreams`
|
||||
* Added `b`,`w`,`v`,`a` as alias for `best`, `worst`, `video` and `audio` respectively
|
||||
* **Shortcut Options:** Added `--write-link`, `--write-url-link`, `--write-webloc-link`, `--write-desktop-link` by @h-h-h-h - See [Internet Shortcut Options](#internet-shortcut-options) for details
|
||||
* **Sponskrub integration:** Added `--sponskrub`, `--sponskrub-cut`, `--sponskrub-force`, `--sponskrub-location`, `--sponskrub-args` - See [SponSkrub Options](#sponskrub-options-sponsorblock) for details
|
||||
* Added `--force-download-archive` (`--force-write-archive`) by by h-h-h-h
|
||||
* Added `--list-formats-as-table`, `--list-formats-old`
|
||||
* **Negative Options:** Makes it possible to negate boolean options by adding a `no-` to the switch
|
||||
* Added `--no-ignore-dynamic-mpd`, `--no-allow-dynamic-mpd`, `--allow-dynamic-mpd`, `--youtube-include-hls-manifest`, `--no-youtube-include-hls-manifest`, `--no-youtube-skip-hls-manifest`, `--no-download`, `--no-download-archive`, `--resize-buffer`, `--part`, `--mtime`, `--no-keep-fragments`, `--no-cookies`, `--no-write-annotations`, `--no-write-info-json`, `--no-write-description`, `--no-write-thumbnail`, `--youtube-include-dash-manifest`, `--post-overwrites`, `--no-keep-video`, `--no-embed-subs`, `--no-embed-thumbnail`, `--no-add-metadata`, `--no-include-ads`, `--no-write-sub`, `--no-write-auto-sub`, `--no-playlist-reverse`, `--no-restrict-filenames`, `--youtube-include-dash-manifest`, `--no-format-sort-force`, `--flat-videos`, `--no-list-formats-as-table`, `--no-sponskrub`, `--no-sponskrub-cut`, `--no-sponskrub-force`
|
||||
* Renamed: `--write-subs`, `--no-write-subs`, `--no-write-auto-subs`, `--write-auto-subs`. Note that these can still be used without the ending "s"
|
||||
* Relaxed validation for format filters so that any arbitrary field can be used
|
||||
* Fix for embedding thumbnail in mp3 by @pauldubois98
|
||||
* Make Twitch Video ID output from Playlist and VOD extractor same. This is only a temporary fix
|
||||
* **Merge youtube-dl:** Upto [2020.01.03](https://github.com/ytdl-org/youtube-dl/commit/8e953dcbb10a1a42f4e12e4e132657cb0100a1f8) - See [blackjack4494/yt-dlc#280](https://github.com/blackjack4494/yt-dlc/pull/280) for details
|
||||
* Cleaned up the fork for public use
|
||||
* **[SponSkrub Integration](#sponSkrub-options-sponsorblock)** - You can use [SponSkrub](https://github.com/faissaloo/SponSkrub) to mark/remove sponsor sections in youtube videos by utilizing the [SponsorBlock](https://sponsor.ajay.app) API
|
||||
|
||||
### 2021.01.05-2
|
||||
* **Changed defaults:**
|
||||
* Enabled `--ignore`
|
||||
* Disabled `--video-multistreams` and `--audio-multistreams`
|
||||
* Changed default format selection to `bv*+ba/b` when `--audio-multistreams` is disabled
|
||||
* Changed default format sort order to `res,fps,codec,size,br,asr,proto,ext,has_audio,source,format_id`
|
||||
* Changed `webm` to be more preferable than `flv` in format sorting
|
||||
* Changed default output template to `%(title)s [%(id)s].%(ext)s`
|
||||
* Enabled `--list-formats-as-table`
|
||||
* **[Format Sorting](#sorting-format)** - The default format sorting options have been changed so that higher resolution and better codecs will be now prefered instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection that what is possible by simply using `--format` ([examples](#format-selection-examples))
|
||||
|
||||
### 2021.01.07
|
||||
* Removed priority of `av01` codec in `-S` since most devices don't support it yet
|
||||
* Added `duration_string` to be used in `--output`
|
||||
* Created First Release
|
||||
* Merged with youtube-dl **v2020.01.08** - You get the new features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494)
|
||||
|
||||
### 2021.01.07-2
|
||||
* [Akamai] fix by @nixxo
|
||||
* [Tiktok] fix extractor by @GreyAlien502
|
||||
* [vlive] add support for playlists by @kyuyeunk
|
||||
* [youtube_live_chat] make sure playerOffsetMs is positive by @siikamiika
|
||||
* Ignore extra data streams in ffmpeg by @jbruchon
|
||||
* Allow passing different arguments to different postprocessors using `--postprocessor-args`
|
||||
* Deprecated `--sponskrub-args`. The same can now be done using `--postprocessor-args "sponskrub:<args>"`
|
||||
* **New options** - `--list-formats-as-table`, `--write-link`, `--force-download-archive` etc
|
||||
|
||||
and many other features and patches. See [changelog](Changelog.md) or [commits](https://github.com/pukkandan/yt-dlc/commits) for the full list of changes
|
||||
|
||||
|
||||
# INSTALLATION
|
||||
@@ -480,8 +448,8 @@ ## Video Format Options:
|
||||
--no-audio-multistreams Only one audio stream is downloaded for
|
||||
each output file (default)
|
||||
--all-formats Download all available video formats
|
||||
--prefer-free-formats Prefer free video formats unless a specific
|
||||
one is requested
|
||||
--prefer-free-formats Prefer free video formats over non-free
|
||||
formats of same quality
|
||||
-F, --list-formats List all available formats of requested
|
||||
videos
|
||||
--list-formats-as-table Present the output of -F in a more tabular
|
||||
@@ -634,7 +602,6 @@ ## [SponSkrub](https://github.com/faissaloo/SponSkrub) Options ([SponsorBlock](h
|
||||
--sponskrub-location PATH Location of the sponskrub binary; either
|
||||
the path to the binary or its containing
|
||||
directory.
|
||||
--sponskrub-args None Give these arguments to sponskrub
|
||||
|
||||
## Extractor Options:
|
||||
--ignore-dynamic-mpd Do not process dynamic DASH manifests
|
||||
@@ -953,9 +920,17 @@ # and if it doesn't already have an audio stream, merge it with best audio-only
|
||||
# Same as above
|
||||
$ youtube-dlc
|
||||
|
||||
# Download the best video-only format and the best audio-only format without merging them
|
||||
# For this case, an output template should be used since
|
||||
# by default, bestvideo and bestaudio will have the same file name.
|
||||
$ youtube-dlc -f 'bv,ba' -o '%(title)s.f%(format_id)s.%(ext)s'
|
||||
|
||||
|
||||
# Download the worst video available
|
||||
|
||||
# The following examples show the old method (without -S) of format selection
|
||||
# and how to use -S to achieve a similar but better result
|
||||
|
||||
# Download the worst video available (old method)
|
||||
$ youtube-dlc -f 'wv*+wa/w'
|
||||
|
||||
# Download the best video available but with the smallest resolution
|
||||
@@ -1014,13 +989,6 @@ # (https/ftps > http/ftp > m3u8_native > m3u8 > http_dash_segments ...)
|
||||
|
||||
|
||||
|
||||
# Download the best video-only format and the best audio-only format without merging them
|
||||
# For this case, an output template should be used since
|
||||
# by default, bestvideo and bestaudio will have the same file name.
|
||||
$ youtube-dlc -f 'bv,ba' -o '%(title)s.f%(format_id)s.%(ext)s'
|
||||
|
||||
|
||||
|
||||
# Download the best video with h264 codec, or the best video if there is no such video
|
||||
$ youtube-dlc -f '(bv*+ba/b)[vcodec^=avc1] / (bv*+ba/b)'
|
||||
|
||||
|
||||
@@ -1,7 +1,7 @@
 @echo off

 rem Keep this list in sync with the `offlinetest` target in Makefile
-set DOWNLOAD_TESTS="age_restriction^|download^|iqiyi_sdk_interpreter^|socks^|subtitles^|write_annotations^|youtube_lists^|youtube_signature"
+set DOWNLOAD_TESTS="age_restriction^|download^|iqiyi_sdk_interpreter^|socks^|subtitles^|write_annotations^|youtube_lists^|youtube_signature^|post_hooks"

 if "%YTDL_TEST_SET%" == "core" (
     set test_set="-I test_("%DOWNLOAD_TESTS%")\.py"
||||
@@ -1,7 +1,7 @@
 #!/bin/bash

 # Keep this list in sync with the `offlinetest` target in Makefile
-DOWNLOAD_TESTS="age_restriction|download|iqiyi_sdk_interpreter|socks|subtitles|write_annotations|youtube_lists|youtube_signature"
+DOWNLOAD_TESTS="age_restriction|download|iqiyi_sdk_interpreter|socks|subtitles|write_annotations|youtube_lists|youtube_signature|post_hooks"

 test_set=""
 multiprocess_args=""
|
||||
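Note on `DOWNLOAD_TESTS`: the batch runner builds a file pattern of the form `test_(<DOWNLOAD_TESTS>)\.py` and passes it with `-I` when `YTDL_TEST_SET` is `core`; the shell runner presumably does the same, although that part is outside the hunk shown. The sketch below only illustrates how such a pattern would split test files into a download set and a core set. The helper `is_download_test` is hypothetical, and how nose itself applies the pattern is not modelled:

```python
import re

DOWNLOAD_TESTS = (
    "age_restriction|download|iqiyi_sdk_interpreter|socks|subtitles|"
    "write_annotations|youtube_lists|youtube_signature|post_hooks"
)
# Same shape as the pattern assembled in the scripts: test_(<alternation>)\.py
PATTERN = re.compile(r"test_(%s)\.py" % DOWNLOAD_TESTS)


def is_download_test(filename):
    """True if the whole file name matches the download-test pattern."""
    return PATTERN.fullmatch(filename) is not None


for name in ("test_post_hooks.py", "test_all_urls.py", "test_download.py"):
    print(name, "download" if is_download_test(name) else "core")
```

Under this reading, adding `post_hooks` to the list moves `test_post_hooks.py` out of the core set, in line with the new `--exclude test_post_hooks.py` in the Makefile's `offlinetest` target.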
@@ -48,6 +48,8 @@ # Supported sites
 - **AMCNetworks**
 - **AmericasTestKitchen**
 - **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
+- **AnimeLab**
+- **AnimeLabShows**
 - **AnimeOnDemand**
 - **Anvato**
 - **aol.com**
@@ -55,9 +57,10 @@ # Supported sites
 - **Aparat**
 - **AppleConnect**
 - **AppleDaily**: 臺灣蘋果日報
+- **ApplePodcasts**
 - **appletrailers**
 - **appletrailers:section**
-- **archive.org**: archive.org videos
+- **archive.org**: archive.org video and audio
 - **ArcPublishing**
 - **ARD**
 - **ARD:mediathek**
@@ -99,6 +102,10 @@ # Supported sites
 - **BellMedia**
 - **Bet**
 - **bfi:player**
+- **bfmtv**
+- **bfmtv:article**
+- **bfmtv:live**
+- **BibelTV**
 - **Bigflix**
 - **Bild**: Bild.de
 - **BiliBili**
@@ -346,6 +353,8 @@ # Supported sites
 - **Go**
 - **GodTube**
 - **Golem**
+- **google:podcasts**
+- **google:podcasts:feed**
 - **GoogleDrive**
 - **Goshgay**
 - **GPUTechConf**
@@ -381,6 +390,8 @@ # Supported sites
 - **HungamaSong**
 - **Hypem**
 - **ign.com**
+- **IHeartRadio**
+- **iheartradio:podcast**
 - **imdb**: Internet Movie Database trailers
 - **imdb:list**: Internet Movie Database lists
 - **Imgur**
@@ -706,7 +717,6 @@ # Supported sites
 - **Playwire**
 - **pluralsight**
 - **pluralsight:course**
-- **plus.google**: Google Plus
 - **podomatic**
 - **Pokemon**
 - **PokemonWatch**
@@ -1146,7 +1156,7 @@ # Supported sites
 - **WWE**
 - **XBef**
 - **XboxClips**
-- **XFileShare**: XFileShare based sites: ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, XVideoSharing
+- **XFileShare**: XFileShare based sites: Aparat, ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, XVideoSharing
 - **XHamster**
 - **XHamsterEmbed**
 - **XHamsterUser**
@@ -69,9 +69,9 @@ def test_youtube_feeds(self):
|
||||
self.assertMatch('https://www.youtube.com/feed/watch_later', ['youtube:tab'])
|
||||
self.assertMatch('https://www.youtube.com/feed/subscriptions', ['youtube:tab'])
|
||||
|
||||
# def test_youtube_search_matching(self):
|
||||
# self.assertMatch('http://www.youtube.com/results?search_query=making+mustard', ['youtube:search_url'])
|
||||
# self.assertMatch('https://www.youtube.com/results?baz=bar&search_query=youtube-dl+test+video&filters=video&lclk=video', ['youtube:search_url'])
|
||||
def test_youtube_search_matching(self):
|
||||
self.assertMatch('http://www.youtube.com/results?search_query=making+mustard', ['youtube:search_url'])
|
||||
self.assertMatch('https://www.youtube.com/results?baz=bar&search_query=youtube-dl+test+video&filters=video&lclk=video', ['youtube:search_url'])
|
||||
|
||||
def test_youtube_extract(self):
|
||||
assertExtractId = lambda url, id: self.assertEqual(YoutubeIE.extract_id(url), id)
|
||||
|
||||
68
test/test_post_hooks.py
Normal file
@@ -0,0 +1,68 @@
#!/usr/bin/env python

from __future__ import unicode_literals

import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from test.helper import get_params, try_rm
import youtube_dl.YoutubeDL
from youtube_dl.utils import DownloadError


class YoutubeDL(youtube_dl.YoutubeDL):
    def __init__(self, *args, **kwargs):
        super(YoutubeDL, self).__init__(*args, **kwargs)
        self.to_stderr = self.to_screen


TEST_ID = 'gr51aVj-mLg'
EXPECTED_NAME = 'gr51aVj-mLg'


class TestPostHooks(unittest.TestCase):
    def setUp(self):
        self.stored_name_1 = None
        self.stored_name_2 = None
        self.params = get_params({
            'skip_download': False,
            'writeinfojson': False,
            'quiet': True,
            'verbose': False,
            'cachedir': False,
        })
        self.files = []

    def test_post_hooks(self):
        self.params['post_hooks'] = [self.hook_one, self.hook_two]
        ydl = YoutubeDL(self.params)
        ydl.download([TEST_ID])
        self.assertEqual(self.stored_name_1, EXPECTED_NAME, 'Not the expected name from hook 1')
        self.assertEqual(self.stored_name_2, EXPECTED_NAME, 'Not the expected name from hook 2')

    def test_post_hook_exception(self):
        self.params['post_hooks'] = [self.hook_three]
        ydl = YoutubeDL(self.params)
        self.assertRaises(DownloadError, ydl.download, [TEST_ID])

    def hook_one(self, filename):
        self.stored_name_1, _ = os.path.splitext(os.path.basename(filename))
        self.files.append(filename)

    def hook_two(self, filename):
        self.stored_name_2, _ = os.path.splitext(os.path.basename(filename))
        self.files.append(filename)

    def hook_three(self, filename):
        self.files.append(filename)
        raise Exception('Test exception for \'%s\'' % filename)

    def tearDown(self):
        for f in self.files:
            try_rm(f)


if __name__ == '__main__':
    unittest.main()
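As a rough illustration of the contract that `test/test_post_hooks.py` exercises, the sketch below mimics how a list of post hooks might be run once per finished file: each hook receives the final filename as its only argument, and the first exception stops the chain and is reported. `run_post_hooks` and its `report_error` callback are hypothetical stand-ins written for this example, not functions from the codebase.

```python
import os


def run_post_hooks(hooks, filename, report_error):
    """Call every hook with the final filename; stop at the first failure.

    Hypothetical helper for illustration only.
    """
    try:
        for hook in hooks:
            hook(filename)
    except Exception as err:
        report_error('post hooks: %s' % str(err))
        return False
    return True


seen = []
errors = []


def remember_name(filename):
    # Same idea as hook_one/hook_two: keep the basename without its extension
    name, _ = os.path.splitext(os.path.basename(filename))
    seen.append(name)


def failing_hook(filename):
    # Same idea as hook_three: a hook that always raises
    raise Exception("Test exception for '%s'" % filename)


ok = run_post_hooks([remember_name, remember_name], 'out/gr51aVj-mLg.mp4', errors.append)
failed = run_post_hooks([failing_hook, remember_name], 'out/gr51aVj-mLg.mp4', errors.append)
```

In this sketch a failing hook prevents later hooks from running for that file, which is why `remember_name` is not called a third time.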
@@ -264,16 +264,24 @@ def test_allsubtitles(self):
|
||||
|
||||
|
||||
class TestRaiPlaySubtitles(BaseTestSubtitles):
|
||||
url = 'http://www.raiplay.it/video/2014/04/Report-del-07042014-cb27157f-9dd0-4aee-b788-b1f67643a391.html'
|
||||
IE = RaiPlayIE
|
||||
|
||||
def test_allsubtitles(self):
|
||||
def test_subtitles_key(self):
|
||||
self.url = 'http://www.raiplay.it/video/2014/04/Report-del-07042014-cb27157f-9dd0-4aee-b788-b1f67643a391.html'
|
||||
self.DL.params['writesubtitles'] = True
|
||||
self.DL.params['allsubtitles'] = True
|
||||
subtitles = self.getSubtitles()
|
||||
self.assertEqual(set(subtitles.keys()), set(['it']))
|
||||
self.assertEqual(md5(subtitles['it']), 'b1d90a98755126b61e667567a1f6680a')
|
||||
|
||||
def test_subtitles_array_key(self):
|
||||
self.url = 'https://www.raiplay.it/video/2020/12/Report---04-01-2021-2e90f1de-8eee-4de4-ac0e-78d21db5b600.html'
|
||||
self.DL.params['writesubtitles'] = True
|
||||
self.DL.params['allsubtitles'] = True
|
||||
subtitles = self.getSubtitles()
|
||||
self.assertEqual(set(subtitles.keys()), set(['it']))
|
||||
self.assertEqual(md5(subtitles['it']), '4b3264186fbb103508abe5311cfcb9cd')
|
||||
|
||||
|
||||
class TestVikiSubtitles(BaseTestSubtitles):
|
||||
url = 'http://www.viki.com/videos/1060846v-punch-episode-18'
|
||||
|
||||
@@ -21,6 +21,7 @@
|
||||
encode_base_n,
|
||||
caesar,
|
||||
clean_html,
|
||||
clean_podcast_url,
|
||||
date_from_str,
|
||||
DateRange,
|
||||
detect_exe_version,
|
||||
@@ -1497,6 +1498,10 @@ def test_iri_to_uri(self):
|
||||
iri_to_uri('http://导航.中国/'),
|
||||
'http://xn--fet810g.xn--fiqs8s/')
|
||||
|
||||
def test_clean_podcast_url(self):
|
||||
self.assertEqual(clean_podcast_url('https://www.podtrac.com/pts/redirect.mp3/chtbl.com/track/5899E/traffic.megaphone.fm/HSW7835899191.mp3'), 'https://traffic.megaphone.fm/HSW7835899191.mp3')
|
||||
self.assertEqual(clean_podcast_url('https://play.podtrac.com/npr-344098539/edge1.pod.npr.org/anon.npr-podcasts/podcast/npr/waitwait/2020/10/20201003_waitwait_wwdtmpodcast201003-015621a5-f035-4eca-a9a1-7c118d90bc3c.mp3'), 'https://edge1.pod.npr.org/anon.npr-podcasts/podcast/npr/waitwait/2020/10/20201003_waitwait_wwdtmpodcast201003-015621a5-f035-4eca-a9a1-7c118d90bc3c.mp3')
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
unittest.main()
|
||||
|
||||
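The `clean_podcast_url` implementation is not part of this hunk, so the following is only a sketch of the behaviour the two assertions in `test_clean_podcast_url` describe: analytics redirect segments are stripped so that the direct media URL remains. The function name `strip_tracking_prefixes` and the pattern are assumptions made for this example, and the pattern deliberately covers only the prefixes seen in those two test URLs.

```python
import re

# Hypothetical, deliberately incomplete pattern: it only covers the
# tracking prefixes that appear in the two test URLs.
_TRACKING_PREFIX_RE = re.compile(r'''(?x)
    (?:
        (?:chtbl\.com/track|play\.podtrac\.com)/[^/]+|   # host plus one campaign segment
        www\.podtrac\.com/pts/redirect\.[0-9a-z]{3,4}    # podtrac redirect endpoint
    )/''')


def strip_tracking_prefixes(url):
    """Remove known analytics redirect segments, leaving the direct media URL."""
    return _TRACKING_PREFIX_RE.sub('', url)


chained = strip_tracking_prefixes(
    'https://www.podtrac.com/pts/redirect.mp3/chtbl.com/track/5899E/'
    'traffic.megaphone.fm/HSW7835899191.mp3')
single = strip_tracking_prefixes(
    'https://play.podtrac.com/npr-344098539/edge1.pod.npr.org/anon.npr-podcasts/'
    'podcast/npr/waitwait/2020/10/'
    '20201003_waitwait_wwdtmpodcast201003-015621a5-f035-4eca-a9a1-7c118d90bc3c.mp3')
```

Because `re.sub` replaces every non-overlapping match, chained prefixes (podtrac followed by chtbl) are removed in a single pass.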
@@ -99,6 +99,7 @@
|
||||
YoutubeDLCookieProcessor,
|
||||
YoutubeDLHandler,
|
||||
YoutubeDLRedirectHandler,
|
||||
process_communicate_or_kill,
|
||||
)
|
||||
from .cache import Cache
|
||||
from .extractor import get_info_extractor, gen_extractor_classes, _LAZY_LOADER
|
||||
@@ -252,6 +253,9 @@ class YoutubeDL(object):
|
||||
youtube_dlc/postprocessor/__init__.py for a list.
|
||||
as well as any further keyword arguments for the
|
||||
postprocessor.
|
||||
post_hooks: A list of functions that get called as the final step
|
||||
for each video file, after all postprocessors have been
|
||||
called. The filename will be passed as the only argument.
|
||||
progress_hooks: A list of functions that get called on download
|
||||
progress, with a dictionary with the entries
|
||||
* status: One of "downloading", "error", or "finished".
|
||||
@@ -369,6 +373,7 @@ def __init__(self, params=None, auto_init=True):
|
||||
self._ies = []
|
||||
self._ies_instances = {}
|
||||
self._pps = []
|
||||
self._post_hooks = []
|
||||
self._progress_hooks = []
|
||||
self._download_retcode = 0
|
||||
self._num_downloads = 0
|
||||
@@ -472,6 +477,9 @@ def check_deprecated(param, option, suggestion):
|
||||
pp = pp_class(self, **compat_kwargs(pp_def))
|
||||
self.add_post_processor(pp)
|
||||
|
||||
for ph in self.params.get('post_hooks', []):
|
||||
self.add_post_hook(ph)
|
||||
|
||||
for ph in self.params.get('progress_hooks', []):
|
||||
self.add_progress_hook(ph)
|
||||
|
||||
@@ -524,6 +532,10 @@ def add_post_processor(self, pp):
|
||||
self._pps.append(pp)
|
||||
pp.set_downloader(self)
|
||||
|
||||
def add_post_hook(self, ph):
|
||||
"""Add the post hook"""
|
||||
self._post_hooks.append(ph)
|
||||
|
||||
def add_progress_hook(self, ph):
|
||||
"""Add the progress hook (currently only for the file downloader)"""
|
||||
self._progress_hooks.append(ph)
|
||||
@@ -578,7 +590,7 @@ def to_console_title(self, message):
|
||||
# already of type unicode()
|
||||
ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
|
||||
elif 'TERM' in os.environ:
|
||||
self._write_string('\033]0;%s\007' % message, self._screen_file)
|
||||
self._write_string('\033[0;%s\007' % message, self._screen_file)
|
||||
|
||||
def save_console_title(self):
|
||||
if not self.params.get('consoletitle', False):
|
||||
@@ -2199,10 +2211,19 @@ def compatible_formats(formats):
|
||||
except (PostProcessingError) as err:
|
||||
self.report_error('postprocessing: %s' % str(err))
|
||||
return
|
||||
try:
|
||||
for ph in self._post_hooks:
|
||||
ph(filename)
|
||||
except Exception as err:
|
||||
self.report_error('post hooks: %s' % str(err))
|
||||
return
|
||||
must_record_download_archive = True
|
||||
|
||||
if must_record_download_archive or self.params.get('force_write_download_archive', False):
|
||||
self.record_download_archive(info_dict)
|
||||
max_downloads = self.params.get('max_downloads')
|
||||
if max_downloads is not None and self._num_downloads >= int(max_downloads):
|
||||
raise MaxDownloadsReached()
|
||||
|
||||
def download(self, url_list):
|
||||
"""Download a given list of URLs."""
|
||||
@@ -2501,7 +2522,7 @@ def print_debug_header(self):
|
||||
['git', 'rev-parse', '--short', 'HEAD'],
|
||||
stdout=subprocess.PIPE, stderr=subprocess.PIPE,
|
||||
cwd=os.path.dirname(os.path.abspath(__file__)))
|
||||
out, err = sp.communicate()
|
||||
out, err = process_communicate_or_kill(sp)
|
||||
out = out.decode().strip()
|
||||
if re.match('[0-9a-f]+', out):
|
||||
self._write_string('[debug] Git HEAD: ' + out + '\n')
|
||||
|
||||
@@ -2896,6 +2896,7 @@ def _compat_add_option(self, *args, **kwargs):
|
||||
_terminal_size = collections.namedtuple('terminal_size', ['columns', 'lines'])
|
||||
|
||||
def compat_get_terminal_size(fallback=(80, 24)):
|
||||
from .utils import process_communicate_or_kill
|
||||
columns = compat_getenv('COLUMNS')
|
||||
if columns:
|
||||
columns = int(columns)
|
||||
@@ -2912,7 +2913,7 @@ def compat_get_terminal_size(fallback=(80, 24)):
|
||||
sp = subprocess.Popen(
|
||||
['stty', 'size'],
|
||||
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
|
||||
out, err = sp.communicate()
|
||||
out, err = process_communicate_or_kill(sp)
|
||||
_lines, _columns = map(int, out.split())
|
||||
except Exception:
|
||||
_columns, _lines = _terminal_size(*fallback)
|
||||
|
||||
@@ -22,6 +22,7 @@
|
||||
handle_youtubedl_headers,
|
||||
check_executable,
|
||||
is_outdated_version,
|
||||
process_communicate_or_kill,
|
||||
)
|
||||
|
||||
|
||||
@@ -104,7 +105,7 @@ def _call_downloader(self, tmpfilename, info_dict):
|
||||
|
||||
p = subprocess.Popen(
|
||||
cmd, stderr=subprocess.PIPE)
|
||||
_, stderr = p.communicate()
|
||||
_, stderr = process_communicate_or_kill(p)
|
||||
if p.returncode != 0:
|
||||
self.to_stderr(stderr.decode('utf-8', 'replace'))
|
||||
return p.returncode
|
||||
@@ -143,7 +144,7 @@ def _call_downloader(self, tmpfilename, info_dict):
|
||||
|
||||
# curl writes the progress to stderr so don't capture it.
|
||||
p = subprocess.Popen(cmd)
|
||||
p.communicate()
|
||||
process_communicate_or_kill(p)
|
||||
return p.returncode
|
||||
|
||||
|
||||
@@ -343,14 +344,17 @@ def _call_downloader(self, tmpfilename, info_dict):
|
||||
proc = subprocess.Popen(args, stdin=subprocess.PIPE, env=env)
|
||||
try:
|
||||
retval = proc.wait()
|
||||
except KeyboardInterrupt:
|
||||
except BaseException as e:
|
||||
# subprocess.run would send the SIGKILL signal to ffmpeg and the
|
||||
# mp4 file couldn't be played, but if we ask ffmpeg to quit it
|
||||
# produces a file that is playable (this is mostly useful for live
|
||||
# streams). Note that Windows is not affected and produces playable
|
||||
# files (see https://github.com/ytdl-org/youtube-dl/issues/8300).
|
||||
if sys.platform != 'win32':
|
||||
proc.communicate(b'q')
|
||||
if isinstance(e, KeyboardInterrupt) and sys.platform != 'win32':
|
||||
process_communicate_or_kill(proc, b'q')
|
||||
else:
|
||||
proc.kill()
|
||||
proc.wait()
|
||||
raise
|
||||
return retval
|
||||
|
||||
|
||||
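The hunks above replace direct `communicate()` calls with `process_communicate_or_kill`, whose definition is not shown here. Judging only from its name and call sites, a helper of this kind presumably wraps `communicate()` and makes sure the child process is killed and reaped if the call is interrupted. The sketch below is an assumption along those lines, using the hypothetical name `communicate_or_kill` and a fake process object so that it runs without spawning anything.

```python
def communicate_or_kill(proc, *args, **kwargs):
    """Run proc.communicate(); if anything interrupts it, kill and reap the child.

    Hypothetical sketch, not the project's actual helper.
    """
    try:
        return proc.communicate(*args, **kwargs)
    except BaseException:  # includes KeyboardInterrupt
        proc.kill()
        proc.wait()
        raise


class FakeProcess(object):
    """Stand-in for subprocess.Popen that is interrupted during communicate()."""

    def __init__(self):
        self.calls = []

    def communicate(self, input=None):
        self.calls.append('communicate')
        raise KeyboardInterrupt()

    def kill(self):
        self.calls.append('kill')

    def wait(self):
        self.calls.append('wait')


fake = FakeProcess()
try:
    communicate_or_kill(fake)
    interrupted = False
except KeyboardInterrupt:
    interrupted = True
```

Catching `BaseException` rather than `Exception` matters here because `KeyboardInterrupt` does not derive from `Exception`; the exception is re-raised after cleanup so callers still see the interruption.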
@@ -172,8 +172,12 @@ def is_ad_fragment_end(s):
|
||||
iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence)
|
||||
decrypt_info['KEY'] = decrypt_info.get('KEY') or self.ydl.urlopen(
|
||||
self._prepare_url(info_dict, info_dict.get('_decryption_key_url') or decrypt_info['URI'])).read()
|
||||
frag_content = AES.new(
|
||||
decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content)
|
||||
# Don't decrypt the content in tests since the data is explicitly truncated, and not to a valid block
# size (see https://github.com/ytdl-org/youtube-dl/pull/27660). Tests only care that the correct data is downloaded,
# not what it decrypts to.
|
||||
if not test:
|
||||
frag_content = AES.new(
|
||||
decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content)
|
||||
self._append_fragment(ctx, frag_content)
|
||||
# We only download the first fragment during the test
|
||||
if test:
|
||||
|
||||
@@ -89,11 +89,13 @@ def run_rtmpdump(args):
|
||||
self.to_screen('')
|
||||
cursor_in_new_line = True
|
||||
self.to_screen('[rtmpdump] ' + line)
|
||||
finally:
|
||||
if not cursor_in_new_line:
|
||||
self.to_screen('')
|
||||
return proc.wait()
|
||||
except BaseException: # Including KeyboardInterrupt
|
||||
proc.kill()
|
||||
proc.wait()
|
||||
if not cursor_in_new_line:
|
||||
self.to_screen('')
|
||||
return proc.returncode
|
||||
raise
|
||||
|
||||
url = info_dict['url']
|
||||
player_url = info_dict.get('player_url')
|
||||
|
||||
@@ -6,6 +6,7 @@
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
clean_html,
|
||||
clean_podcast_url,
|
||||
int_or_none,
|
||||
parse_iso8601,
|
||||
)
|
||||
@@ -17,7 +18,7 @@ def _extract_episode(self, episode, show_info):
|
||||
info = {
|
||||
'id': episode['id'],
|
||||
'display_id': episode.get('episodeUrl'),
|
||||
'url': episode['url'],
|
||||
'url': clean_podcast_url(episode['url']),
|
||||
'title': title,
|
||||
'description': clean_html(episode.get('description') or episode.get('summary')),
|
||||
'thumbnail': episode.get('image'),
|
||||
|
||||
285
youtube_dlc/extractor/animelab.py
Normal file
@@ -0,0 +1,285 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor

from ..utils import (
    ExtractorError,
    urlencode_postdata,
    int_or_none,
    str_or_none,
    determine_ext,
)

from ..compat import compat_HTTPError


class AnimeLabBaseIE(InfoExtractor):
    _LOGIN_REQUIRED = True
    _LOGIN_URL = 'https://www.animelab.com/login'
    _NETRC_MACHINE = 'animelab'

    def _login(self):
        def is_logged_in(login_webpage):
            return 'Sign In' not in login_webpage

        login_page = self._download_webpage(
            self._LOGIN_URL, None, 'Downloading login page')

        # Check if already logged in
        if is_logged_in(login_page):
            return

        (username, password) = self._get_login_info()
        if username is None and self._LOGIN_REQUIRED:
            self.raise_login_required('Login is required to access any AnimeLab content')

        login_form = {
            'email': username,
            'password': password,
        }

        try:
            response = self._download_webpage(
                self._LOGIN_URL, None, 'Logging in', 'Wrong login info',
                data=urlencode_postdata(login_form),
                headers={'Content-Type': 'application/x-www-form-urlencoded'})
        except ExtractorError as e:
            if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
                raise ExtractorError('Unable to log in (wrong credentials?)', expected=True)
            else:
                raise

        # if login was successful
        if is_logged_in(response):
            return

        raise ExtractorError('Unable to login (cannot verify if logged in)')

    def _real_initialize(self):
        self._login()


class AnimeLabIE(AnimeLabBaseIE):
    _VALID_URL = r'https?://(?:www\.)?animelab\.com/player/(?P<id>[^/]+)'

    # the following tests require authentication, but a free account will suffice
    # just set 'usenetrc' to true in test/local_parameters.json if you use a .netrc file
    # or you can set 'username' and 'password' there
    # the tests also select a specific format so that the same video is downloaded
    # regardless of whether the user is premium or not (needs testing on a premium account)
    _TEST = {
        'url': 'https://www.animelab.com/player/fullmetal-alchemist-brotherhood-episode-42',
        'md5': '05bde4b91a5d1ff46ef5b94df05b0f7f',
        'info_dict': {
            'id': '383',
            'ext': 'mp4',
            'display_id': 'fullmetal-alchemist-brotherhood-episode-42',
            'title': 'Fullmetal Alchemist: Brotherhood - Episode 42 - Signs of a Counteroffensive',
            'description': 'md5:103eb61dd0a56d3dfc5dbf748e5e83f4',
            'series': 'Fullmetal Alchemist: Brotherhood',
            'episode': 'Signs of a Counteroffensive',
            'episode_number': 42,
            'duration': 1469,
            'season': 'Season 1',
            'season_number': 1,
            'season_id': '38',
        },
        'params': {
            'format': '[format_id=21711_yeshardsubbed_ja-JP][height=480]',
        },
        'skip': 'All AnimeLab content requires authentication',
    }

    def _real_extract(self, url):
        display_id = self._match_id(url)

        # unfortunately we can get different URLs for the same formats
        # e.g. if we are using a "free" account so no dubs available
        # (so _remove_duplicate_formats is not effective)
        # so we use a dictionary as a workaround
        formats = {}
        for language_option_url in ('https://www.animelab.com/player/%s/subtitles',
                                    'https://www.animelab.com/player/%s/dubbed'):
            actual_url = language_option_url % display_id
            webpage = self._download_webpage(actual_url, display_id, 'Downloading URL ' + actual_url)

            video_collection = self._parse_json(self._search_regex(r'new\s+?AnimeLabApp\.VideoCollection\s*?\((.*?)\);', webpage, 'AnimeLab VideoCollection'), display_id)
            position = int_or_none(self._search_regex(r'playlistPosition\s*?=\s*?(\d+)', webpage, 'Playlist Position'))

            raw_data = video_collection[position]['videoEntry']

            video_id = str_or_none(raw_data['id'])

            # create a title from many sources (while grabbing other info)
            # TODO use more fallback sources to get some of these
            series = raw_data.get('showTitle')
            video_type = raw_data.get('videoEntryType', {}).get('name')
            episode_number = raw_data.get('episodeNumber')
            episode_name = raw_data.get('name')

            title_parts = (series, video_type, episode_number, episode_name)
            if None not in title_parts:
                title = '%s - %s %s - %s' % title_parts
            else:
                title = episode_name

            description = raw_data.get('synopsis') or self._og_search_description(webpage, default=None)

            duration = int_or_none(raw_data.get('duration'))

            thumbnail_data = raw_data.get('images', [])
            thumbnails = []
            for thumbnail in thumbnail_data:
                for instance in thumbnail['imageInstances']:
                    image_data = instance.get('imageInfo', {})
                    thumbnails.append({
                        'id': str_or_none(image_data.get('id')),
                        'url': image_data.get('fullPath'),
                        'width': image_data.get('width'),
                        'height': image_data.get('height'),
                    })

            season_data = raw_data.get('season', {}) or {}
            season = str_or_none(season_data.get('name'))
            season_number = int_or_none(season_data.get('seasonNumber'))
            season_id = str_or_none(season_data.get('id'))

            for video_data in raw_data['videoList']:
                current_video_list = {}
                current_video_list['language'] = video_data.get('language', {}).get('languageCode')

                is_hardsubbed = video_data.get('hardSubbed')

                for video_instance in video_data['videoInstances']:
                    httpurl = video_instance.get('httpUrl')
                    url = httpurl if httpurl else video_instance.get('rtmpUrl')
                    if url is None:
                        # this video format is unavailable to the user (not premium etc.)
                        continue

                    current_format = current_video_list.copy()

                    format_id_parts = []

                    format_id_parts.append(str_or_none(video_instance.get('id')))

                    if is_hardsubbed is not None:
                        if is_hardsubbed:
                            format_id_parts.append('yeshardsubbed')
                        else:
                            format_id_parts.append('nothardsubbed')

                    format_id_parts.append(current_format['language'])

                    format_id = '_'.join([x for x in format_id_parts if x is not None])

                    ext = determine_ext(url)
                    if ext == 'm3u8':
                        for format_ in self._extract_m3u8_formats(
                                url, video_id, m3u8_id=format_id, fatal=False):
                            formats[format_['format_id']] = format_
                        continue
                    elif ext == 'mpd':
                        for format_ in self._extract_mpd_formats(
                                url, video_id, mpd_id=format_id, fatal=False):
                            formats[format_['format_id']] = format_
                        continue

                    current_format['url'] = url
                    quality_data = video_instance.get('videoQuality')
                    if quality_data:
                        quality = quality_data.get('name') or quality_data.get('description')
                    else:
                        quality = None

                    height = None
                    if quality:
                        height = int_or_none(self._search_regex(r'(\d+)p?$', quality, 'Video format height', default=None))

                    if height is None:
                        self.report_warning('Could not get height of video')
                    else:
                        current_format['height'] = height
                    current_format['format_id'] = format_id

                    formats[current_format['format_id']] = current_format

        formats = list(formats.values())
        self._sort_formats(formats)

        return {
            'id': video_id,
            'display_id': display_id,
            'title': title,
            'description': description,
            'series': series,
            'episode': episode_name,
            'episode_number': int_or_none(episode_number),
            'thumbnails': thumbnails,
            'duration': duration,
            'formats': formats,
            'season': season,
            'season_number': season_number,
            'season_id': season_id,
        }


class AnimeLabShowsIE(AnimeLabBaseIE):
    _VALID_URL = r'https?://(?:www\.)?animelab\.com/shows/(?P<id>[^/]+)'

    _TEST = {
        'url': 'https://www.animelab.com/shows/attack-on-titan',
        'info_dict': {
            'id': '45',
            'title': 'Attack on Titan',
            'description': 'md5:989d95a2677e9309368d5cf39ba91469',
        },
        'playlist_count': 59,
        'skip': 'All AnimeLab content requires authentication',
    }

    def _real_extract(self, url):
        _BASE_URL = 'http://www.animelab.com'
        _SHOWS_API_URL = '/api/videoentries/show/videos/'
        display_id = self._match_id(url)

        webpage = self._download_webpage(url, display_id, 'Downloading requested URL')

        show_data_str = self._search_regex(r'({"id":.*}),\svideoEntry', webpage, 'AnimeLab show data')
        show_data = self._parse_json(show_data_str, display_id)

        show_id = str_or_none(show_data.get('id'))
        title = show_data.get('name')
        description = show_data.get('shortSynopsis') or show_data.get('longSynopsis')

        entries = []
        for season in show_data['seasons']:
            season_id = season['id']
            get_data = urlencode_postdata({
                'seasonId': season_id,
                'limit': 1000,
            })
            # despite using urlencode_postdata, we are sending a GET request
            target_url = _BASE_URL + _SHOWS_API_URL + show_id + "?" + get_data.decode('utf-8')
            response = self._download_webpage(
                target_url,
                None, 'Season id %s' % season_id)

            season_data = self._parse_json(response, display_id)

            for video_data in season_data['list']:
                entries.append(self.url_result(
                    _BASE_URL + '/player/' + video_data['slug'], 'AnimeLab',
                    str_or_none(video_data.get('id')), video_data.get('name')
                ))

        return {
            '_type': 'playlist',
            'id': show_id,
            'title': title,
            'description': description,
            'entries': entries,
        }

# TODO implement myqueue
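The `format_id` construction in `AnimeLabIE._real_extract` (instance id, an optional hardsub marker, then the language code, joined with underscores and skipping missing parts) can be isolated as a small function. The sketch below is a standalone restatement for illustration; `build_format_id` is a hypothetical name, and a plain `str()` conversion stands in for `str_or_none`.

```python
def build_format_id(instance_id, is_hardsubbed, language):
    """Join the available parts of a format id with underscores.

    Hypothetical helper mirroring the inline logic of the extractor.
    """
    parts = [None if instance_id is None else str(instance_id)]
    if is_hardsubbed is not None:
        parts.append('yeshardsubbed' if is_hardsubbed else 'nothardsubbed')
    parts.append(language)
    # Missing pieces (None) are dropped rather than rendered as 'None'
    return '_'.join(part for part in parts if part is not None)


full_id = build_format_id(21711, True, 'ja-JP')
soft_id = build_format_id(21711, False, 'en-US')
bare_id = build_format_id(21711, None, None)
```

With these inputs, `full_id` has the same shape as the `format_id` selected in the extractor's `_TEST` parameters (`21711_yeshardsubbed_ja-JP`).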
61
youtube_dlc/extractor/applepodcasts.py
Normal file
@@ -0,0 +1,61 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import (
    clean_podcast_url,
    int_or_none,
    parse_iso8601,
    try_get,
)


class ApplePodcastsIE(InfoExtractor):
    _VALID_URL = r'https?://podcasts\.apple\.com/(?:[^/]+/)?podcast(?:/[^/]+){1,2}.*?\bi=(?P<id>\d+)'
    _TESTS = [{
        'url': 'https://podcasts.apple.com/us/podcast/207-whitney-webb-returns/id1135137367?i=1000482637777',
        'md5': 'df02e6acb11c10e844946a39e7222b08',
        'info_dict': {
            'id': '1000482637777',
            'ext': 'mp3',
            'title': '207 - Whitney Webb Returns',
            'description': 'md5:13a73bade02d2e43737751e3987e1399',
            'upload_date': '20200705',
            'timestamp': 1593921600,
            'duration': 6425,
            'series': 'The Tim Dillon Show',
        }
    }, {
        'url': 'https://podcasts.apple.com/podcast/207-whitney-webb-returns/id1135137367?i=1000482637777',
        'only_matching': True,
    }, {
        'url': 'https://podcasts.apple.com/podcast/207-whitney-webb-returns?i=1000482637777',
        'only_matching': True,
    }, {
        'url': 'https://podcasts.apple.com/podcast/id1135137367?i=1000482637777',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        episode_id = self._match_id(url)
        webpage = self._download_webpage(url, episode_id)
        ember_data = self._parse_json(self._search_regex(
            r'id="shoebox-ember-data-store"[^>]*>\s*({.+?})\s*<',
            webpage, 'ember data'), episode_id)
        episode = ember_data['data']['attributes']
        description = episode.get('description') or {}

        series = None
        for inc in (ember_data.get('included') or []):
            if inc.get('type') == 'media/podcast':
                series = try_get(inc, lambda x: x['attributes']['name'])

        return {
            'id': episode_id,
            'title': episode['name'],
            'url': clean_podcast_url(episode['assetUrl']),
            'description': description.get('standard') or description.get('short'),
            'timestamp': parse_iso8601(episode.get('releaseDateTime')),
            'duration': int_or_none(episode.get('durationInMilliseconds'), 1000),
            'series': series,
        }
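To see how the `_VALID_URL` pattern of `ApplePodcastsIE` handles the URL shapes listed in `_TESTS`, the pattern can be tried with the standard `re` module alone. This is only a quick check outside the extractor framework; it assumes matching from the start of the string with `re.match`, and the last URL (without an `i=` parameter) is an extra example added here to show a non-match.

```python
import re

VALID_URL = r'https?://podcasts\.apple\.com/(?:[^/]+/)?podcast(?:/[^/]+){1,2}.*?\bi=(?P<id>\d+)'

urls = [
    'https://podcasts.apple.com/us/podcast/207-whitney-webb-returns/id1135137367?i=1000482637777',
    'https://podcasts.apple.com/podcast/207-whitney-webb-returns/id1135137367?i=1000482637777',
    'https://podcasts.apple.com/podcast/207-whitney-webb-returns?i=1000482637777',
    'https://podcasts.apple.com/podcast/id1135137367?i=1000482637777',
]

# The episode id comes from the `i` query parameter, not from the `id...` path segment
episode_ids = [re.match(VALID_URL, url).group('id') for url in urls]

# A show page without an episode parameter is not matched by this pattern
show_only = re.match(VALID_URL, 'https://podcasts.apple.com/us/podcast/id1135137367')
```

The optional country segment and the one-or-two path segments after `podcast` are what let all four URL shapes resolve to the same episode id.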
@@ -1,27 +1,43 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
import json
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_urllib_parse_unquote_plus
|
||||
from ..utils import (
|
||||
KNOWN_EXTENSIONS,
|
||||
|
||||
extract_attributes,
|
||||
unified_strdate,
|
||||
unified_timestamp,
|
||||
clean_html,
|
||||
dict_get,
|
||||
parse_duration,
|
||||
int_or_none,
|
||||
str_or_none,
|
||||
merge_dicts,
|
||||
)
|
||||
|
||||
|
||||
class ArchiveOrgIE(InfoExtractor):
|
||||
IE_NAME = 'archive.org'
|
||||
IE_DESC = 'archive.org videos'
|
||||
_VALID_URL = r'https?://(?:www\.)?archive\.org/(?:details|embed)/(?P<id>[^/?#]+)(?:[?].*)?$'
|
||||
IE_DESC = 'archive.org video and audio'
|
||||
_VALID_URL = r'https?://(?:www\.)?archive\.org/(?:details|embed)/(?P<id>[^?#]+)(?:[?].*)?$'
|
||||
_TESTS = [{
|
||||
'url': 'http://archive.org/details/XD300-23_68HighlightsAResearchCntAugHumanIntellect',
|
||||
'md5': '8af1d4cf447933ed3c7f4871162602db',
|
||||
'info_dict': {
|
||||
'id': 'XD300-23_68HighlightsAResearchCntAugHumanIntellect',
|
||||
'ext': 'ogg',
|
||||
'ext': 'ogv',
|
||||
'title': '1968 Demo - FJCC Conference Presentation Reel #1',
|
||||
'description': 'md5:da45c349df039f1cc8075268eb1b5c25',
|
||||
'upload_date': '19681210',
|
||||
'uploader': 'SRI International'
|
||||
}
|
||||
'release_date': '19681210',
|
||||
'timestamp': 1268695290,
|
||||
'upload_date': '20100315',
|
||||
'creator': 'SRI International',
|
||||
'uploader': 'laura@archive.org',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://archive.org/details/Cops1922',
|
||||
'md5': '0869000b4ce265e8ca62738b336b268a',
|
||||
@@ -29,37 +45,199 @@ class ArchiveOrgIE(InfoExtractor):
|
||||
'id': 'Cops1922',
|
||||
'ext': 'mp4',
|
||||
'title': 'Buster Keaton\'s "Cops" (1922)',
|
||||
'description': 'md5:89e7c77bf5d965dd5c0372cfb49470f6',
|
||||
}
|
||||
'description': 'md5:43a603fd6c5b4b90d12a96b921212b9c',
|
||||
'uploader': 'yorkmba99@hotmail.com',
|
||||
'timestamp': 1387699629,
|
||||
'upload_date': "20131222",
|
||||
},
|
||||
}, {
|
||||
'url': 'http://archive.org/embed/XD300-23_68HighlightsAResearchCntAugHumanIntellect',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://archive.org/details/Election_Ads',
|
||||
'md5': '284180e857160cf866358700bab668a3',
|
||||
'info_dict': {
|
||||
'id': 'Election_Ads/Commercial-JFK1960ElectionAdCampaignJingle.mpg',
|
||||
'title': 'Commercial-JFK1960ElectionAdCampaignJingle.mpg',
|
||||
'ext': 'mp4',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://archive.org/details/Election_Ads/Commercial-Nixon1960ElectionAdToughonDefense.mpg',
|
||||
'md5': '7915213ef02559b5501fe630e1a53f59',
|
||||
'info_dict': {
|
||||
'id': 'Election_Ads/Commercial-Nixon1960ElectionAdToughonDefense.mpg',
|
||||
'title': 'Commercial-Nixon1960ElectionAdToughonDefense.mpg',
|
||||
'ext': 'mp4',
|
||||
'timestamp': 1205588045,
|
||||
'uploader': 'mikedavisstripmaster@yahoo.com',
|
||||
'description': '1960 Presidential Campaign Election Commercials John F Kennedy, Richard M Nixon',
|
||||
'upload_date': '20080315',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://archive.org/details/gd1977-05-08.shure57.stevenson.29303.flac16',
|
||||
'md5': '7d07ffb42aba6537c28e053efa4b54c9',
|
||||
'info_dict': {
|
||||
'id': 'gd1977-05-08.shure57.stevenson.29303.flac16/gd1977-05-08d01t01.flac',
|
||||
'title': 'Turning',
|
||||
'ext': 'flac',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://archive.org/details/gd1977-05-08.shure57.stevenson.29303.flac16/gd1977-05-08d01t07.flac',
|
||||
'md5': 'a07cd8c6ab4ee1560f8a0021717130f3',
|
||||
'info_dict': {
|
||||
'id': 'gd1977-05-08.shure57.stevenson.29303.flac16/gd1977-05-08d01t07.flac',
|
||||
'title': 'Deal',
|
||||
'ext': 'flac',
|
||||
'timestamp': 1205895624,
|
||||
'uploader': 'mvernon54@yahoo.com',
|
||||
'description': 'md5:6a31f1996db0aa0fc9da6d6e708a1bb0',
|
||||
'upload_date': '20080319',
|
||||
'location': 'Barton Hall - Cornell University',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://archive.org/details/lp_the-music-of-russia_various-artists-a-askaryan-alexander-melik',
|
||||
'md5': '7cb019baa9b332e82ea7c10403acd180',
|
||||
'info_dict': {
|
||||
'id': 'lp_the-music-of-russia_various-artists-a-askaryan-alexander-melik/disc1/01.01. Bells Of Rostov.mp3',
|
||||
'title': 'Bells Of Rostov',
|
||||
'ext': 'mp3',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://archive.org/details/lp_the-music-of-russia_various-artists-a-askaryan-alexander-melik/disc1/02.02.+Song+And+Chorus+In+The+Polovetsian+Camp+From+%22Prince+Igor%22+(Act+2%2C+Scene+1).mp3',
|
||||
'md5': '1d0aabe03edca83ca58d9ed3b493a3c3',
|
||||
'info_dict': {
|
||||
'id': 'lp_the-music-of-russia_various-artists-a-askaryan-alexander-melik/disc1/02.02. Song And Chorus In The Polovetsian Camp From "Prince Igor" (Act 2, Scene 1).mp3',
|
||||
'title': 'Song And Chorus In The Polovetsian Camp From "Prince Igor" (Act 2, Scene 1)',
|
||||
'ext': 'mp3',
|
||||
'timestamp': 1569662587,
|
||||
'uploader': 'associate-joygen-odiongan@archive.org',
|
||||
'description': 'md5:012b2d668ae753be36896f343d12a236',
|
||||
'upload_date': '20190928',
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(
|
||||
'http://archive.org/embed/' + video_id, video_id)
|
||||
jwplayer_playlist = self._parse_json(self._search_regex(
|
||||
r"(?s)Play\('[^']+'\s*,\s*(\[.+\])\s*,\s*{.*?}\)",
|
||||
webpage, 'jwplayer playlist'), video_id)
|
||||
info = self._parse_jwplayer_data(
|
||||
{'playlist': jwplayer_playlist}, video_id, base_url=url)
|
||||
@staticmethod
|
||||
def _playlist_data(webpage):
|
||||
element = re.findall(r'''(?xs)
|
||||
<input
|
||||
(?:\s+[a-zA-Z0-9:._-]+(?:=[a-zA-Z0-9:._-]*|="[^"]*"|='[^']*'|))*?
|
||||
\s+class=['"]?js-play8-playlist['"]?
|
||||
(?:\s+[a-zA-Z0-9:._-]+(?:=[a-zA-Z0-9:._-]*|="[^"]*"|='[^']*'|))*?
|
||||
\s*/>
|
||||
''', webpage)[0]
|
||||
|
||||
def get_optional(metadata, field):
|
||||
return metadata.get(field, [None])[0]
|
||||
return json.loads(extract_attributes(element)['value'])
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = compat_urllib_parse_unquote_plus(self._match_id(url))
|
||||
identifier, entry_id = (video_id.split('/', 1) + [None])[:2]
|
||||
|
||||
# Archive.org metadata API doesn't clearly demarcate playlist entries
|
||||
# or subtitle tracks, so we get them from the embeddable player.
|
||||
embed_page = self._download_webpage(
|
||||
'https://archive.org/embed/' + identifier, identifier)
|
||||
playlist = self._playlist_data(embed_page)
|
||||
|
||||
entries = {}
|
||||
for p in playlist:
|
||||
# If the user specified a playlist entry in the URL, ignore the
|
||||
# rest of the playlist.
|
||||
if entry_id and p['orig'] != entry_id:
|
||||
continue
|
||||
|
||||
entries[p['orig']] = {
|
||||
'formats': [],
|
||||
'thumbnails': [],
|
||||
'artist': p.get('artist'),
|
||||
'track': p.get('title'),
|
||||
'subtitles': {}}
|
||||
|
||||
for track in p.get('tracks', []):
|
||||
if track['kind'] != 'subtitles':
|
||||
continue
|
||||
|
||||
entries[p['orig']][track['label']] = {
|
||||
'url': 'https://archive.org/' + track['file'].lstrip('/')}
|
||||
|
||||
metadata = self._download_json(
|
||||
'http://archive.org/details/' + video_id, video_id, query={
|
||||
'output': 'json',
|
||||
})['metadata']
|
||||
info.update({
|
||||
'title': get_optional(metadata, 'title') or info.get('title'),
|
||||
'description': clean_html(get_optional(metadata, 'description')),
|
||||
})
|
||||
if info.get('_type') != 'playlist':
|
||||
info.update({
|
||||
'uploader': get_optional(metadata, 'creator'),
|
||||
'upload_date': unified_strdate(get_optional(metadata, 'date')),
|
||||
})
|
||||
'http://archive.org/metadata/' + identifier, identifier)
|
||||
m = metadata['metadata']
|
||||
identifier = m['identifier']
|
||||
|
||||
info = {
|
||||
'id': identifier,
|
||||
'title': m['title'],
|
||||
'description': clean_html(m.get('description')),
|
||||
'uploader': dict_get(m, ['uploader', 'adder']),
|
||||
'creator': m.get('creator'),
|
||||
'license': m.get('licenseurl'),
|
||||
'release_date': unified_strdate(m.get('date')),
|
||||
'timestamp': unified_timestamp(dict_get(m, ['publicdate', 'addeddate'])),
|
||||
'webpage_url': 'https://archive.org/details/' + identifier,
|
||||
'location': m.get('venue'),
|
||||
'release_year': int_or_none(m.get('year'))}
|
||||
|
||||
for f in metadata['files']:
|
||||
if f['name'] in entries:
|
||||
entries[f['name']] = merge_dicts(entries[f['name']], {
|
||||
'id': identifier + '/' + f['name'],
|
||||
'title': f.get('title') or f['name'],
|
||||
'display_id': f['name'],
|
||||
'description': clean_html(f.get('description')),
|
||||
'creator': f.get('creator'),
|
||||
'duration': parse_duration(f.get('length')),
|
||||
'track_number': int_or_none(f.get('track')),
|
||||
'album': f.get('album'),
|
||||
'discnumber': int_or_none(f.get('disc')),
|
||||
'release_year': int_or_none(f.get('year'))})
|
||||
entry = entries[f['name']]
|
||||
elif f.get('original') in entries:
|
||||
entry = entries[f['original']]
|
||||
else:
|
||||
continue
|
||||
|
||||
if f.get('format') == 'Thumbnail':
|
||||
entry['thumbnails'].append({
|
||||
'id': f['name'],
|
||||
'url': 'https://archive.org/download/' + identifier + '/' + f['name'],
|
||||
'width': int_or_none(f.get('width')),
|
||||
'height': int_or_none(f.get('height')),
|
||||
'filesize': int_or_none(f.get('size'))})
|
||||
|
||||
extension = (f['name'].rsplit('.', 1) + [None])[1]
|
||||
if extension in KNOWN_EXTENSIONS:
|
||||
entry['formats'].append({
|
||||
'url': 'https://archive.org/download/' + identifier + '/' + f['name'],
|
||||
'format': f.get('format'),
|
||||
'width': int_or_none(f.get('width')),
|
||||
'height': int_or_none(f.get('height')),
|
||||
'filesize': int_or_none(f.get('size')),
|
||||
'protocol': 'https'})
|
||||
|
||||
# Sort available formats by filesize
|
||||
for entry in entries.values():
|
||||
entry['formats'] = list(sorted(entry['formats'], key=lambda x: x.get('filesize', -1)))
|
||||
|
||||
if len(entries) == 1:
|
||||
# If there's only one item, use it as the main info dict
|
||||
only_video = entries[list(entries.keys())[0]]
|
||||
if entry_id:
|
||||
info = merge_dicts(only_video, info)
|
||||
else:
|
||||
info = merge_dicts(info, only_video)
|
||||
else:
|
||||
# Otherwise, we have a playlist.
|
||||
info['_type'] = 'playlist'
|
||||
info['entries'] = list(entries.values())
|
||||
|
||||
if metadata.get('reviews'):
|
||||
info['comments'] = []
|
||||
for review in metadata['reviews']:
|
||||
info['comments'].append({
|
||||
'id': review.get('review_id'),
|
||||
'author': review.get('reviewer'),
|
||||
'text': str_or_none(review.get('reviewtitle'), '') + '\n\n' + review.get('reviewbody'),
|
||||
'timestamp': unified_timestamp(review.get('createdate')),
|
||||
'parent': 'root'})
|
||||
|
||||
return info
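The identifier/entry split used in `_real_extract` above can be sketched in isolation. This is a minimal illustration, assuming Python 3's `urllib.parse.unquote_plus` behaves like `compat_urllib_parse_unquote_plus` here; the helper name `split_archive_id` is hypothetical and not part of the extractor.

```python
from urllib.parse import unquote_plus


def split_archive_id(path):
    """Split 'identifier[/entry]' into (identifier, entry_id or None).

    Hypothetical helper mirroring the idiom in the diff: padding the split
    result with [None] guarantees two values even without a '/'.
    """
    video_id = unquote_plus(path)
    identifier, entry_id = (video_id.split('/', 1) + [None])[:2]
    return identifier, entry_id


whole_item = split_archive_id('gd1977-05-08.shure57.stevenson.29303.flac16')
single_track = split_archive_id(
    'gd1977-05-08.shure57.stevenson.29303.flac16/gd1977-05-08d01t07.flac')
encoded_track = split_archive_id(
    'lp_the-music-of-russia_various-artists-a-askaryan-alexander-melik/disc1/'
    '02.02.+Song+And+Chorus+In+The+Polovetsian+Camp+From+%22Prince+Igor%22'
    '+(Act+2%2C+Scene+1).mp3')
```

Splitting with a maximum of one split keeps nested paths such as `disc1/...` inside the entry part, which is what lets the entry be compared against the playlist's `orig` field.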
|
||||
|
||||
103
youtube_dlc/extractor/bfmtv.py
Normal file
@@ -0,0 +1,103 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import extract_attributes
|
||||
|
||||
|
||||
class BFMTVBaseIE(InfoExtractor):
|
||||
_VALID_URL_BASE = r'https?://(?:www\.)?bfmtv\.com/'
|
||||
_VALID_URL_TMPL = _VALID_URL_BASE + r'(?:[^/]+/)*[^/?&#]+_%s[A-Z]-(?P<id>\d{12})\.html'
|
||||
_VIDEO_BLOCK_REGEX = r'(<div[^>]+class="video_block"[^>]*>)'
|
||||
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/%s_default/index.html?videoId=%s'
|
||||
|
||||
def _brightcove_url_result(self, video_id, video_block):
|
||||
account_id = video_block.get('accountid') or '876450612001'
|
||||
player_id = video_block.get('playerid') or 'I2qBTln4u'
|
||||
return self.url_result(
|
||||
self.BRIGHTCOVE_URL_TEMPLATE % (account_id, player_id, video_id),
|
||||
'BrightcoveNew', video_id)
|
||||
|
||||
|
||||
class BFMTVIE(BFMTVBaseIE):
|
||||
IE_NAME = 'bfmtv'
|
||||
_VALID_URL = BFMTVBaseIE._VALID_URL_TMPL % 'V'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.bfmtv.com/politique/emmanuel-macron-l-islam-est-une-religion-qui-vit-une-crise-aujourd-hui-partout-dans-le-monde_VN-202010020146.html',
|
||||
'info_dict': {
|
||||
'id': '6196747868001',
|
||||
'ext': 'mp4',
|
||||
'title': 'Emmanuel Macron: "L\'Islam est une religion qui vit une crise aujourd’hui, partout dans le monde"',
|
||||
'description': 'Le Président s\'exprime sur la question du séparatisme depuis les Mureaux, dans les Yvelines.',
|
||||
'uploader_id': '876450610001',
|
||||
'upload_date': '20201002',
|
||||
'timestamp': 1601629620,
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
bfmtv_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, bfmtv_id)
|
||||
video_block = extract_attributes(self._search_regex(
|
||||
self._VIDEO_BLOCK_REGEX, webpage, 'video block'))
|
||||
return self._brightcove_url_result(video_block['videoid'], video_block)
|
||||
|
||||
|
||||
class BFMTVLiveIE(BFMTVIE):
|
||||
IE_NAME = 'bfmtv:live'
|
||||
_VALID_URL = BFMTVBaseIE._VALID_URL_BASE + '(?P<id>(?:[^/]+/)?en-direct)'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.bfmtv.com/en-direct/',
|
||||
'info_dict': {
|
||||
'id': '5615950982001',
|
||||
'ext': 'mp4',
|
||||
'title': r're:^le direct BFMTV WEB \d{4}-\d{2}-\d{2} \d{2}:\d{2}$',
|
||||
'uploader_id': '876450610001',
|
||||
'upload_date': '20171018',
|
||||
'timestamp': 1508329950,
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
}, {
|
||||
'url': 'https://www.bfmtv.com/economie/en-direct/',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
|
||||
class BFMTVArticleIE(BFMTVBaseIE):
|
||||
IE_NAME = 'bfmtv:article'
|
||||
_VALID_URL = BFMTVBaseIE._VALID_URL_TMPL % 'A'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.bfmtv.com/sante/covid-19-un-responsable-de-l-institut-pasteur-se-demande-quand-la-france-va-se-reconfiner_AV-202101060198.html',
|
||||
'info_dict': {
|
||||
'id': '202101060198',
|
||||
'title': 'Covid-19: un responsable de l\'Institut Pasteur se demande "quand la France va se reconfiner"',
|
||||
'description': 'md5:947974089c303d3ac6196670ae262843',
|
||||
},
|
||||
'playlist_count': 2,
|
||||
}, {
|
||||
'url': 'https://www.bfmtv.com/international/pour-bolsonaro-le-bresil-est-en-faillite-mais-il-ne-peut-rien-faire_AD-202101060232.html',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.bfmtv.com/sante/covid-19-oui-le-vaccin-de-pfizer-distribue-en-france-a-bien-ete-teste-sur-des-personnes-agees_AN-202101060275.html',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
bfmtv_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, bfmtv_id)
|
||||
|
||||
entries = []
|
||||
for video_block_el in re.findall(self._VIDEO_BLOCK_REGEX, webpage):
|
||||
video_block = extract_attributes(video_block_el)
|
||||
video_id = video_block.get('videoid')
|
||||
if not video_id:
|
||||
continue
|
||||
entries.append(self._brightcove_url_result(video_id, video_block))
|
||||
|
||||
return self.playlist_result(
|
||||
entries, bfmtv_id, self._og_search_title(webpage, fatal=False),
|
||||
self._html_search_meta(['og:description', 'description'], webpage))
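The way a `video_block` element is turned into a Brightcove player URL can be sketched without the extractor framework. This is an approximation under stated assumptions: the standard library's `HTMLParser` stands in for `extract_attributes`, and the names `_FirstTagAttrs` and `brightcove_url` are hypothetical. The fallback account and player IDs are the ones shown in `_brightcove_url_result` above.

```python
from html.parser import HTMLParser

BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/%s_default/index.html?videoId=%s'


class _FirstTagAttrs(HTMLParser):
    """Collect the attributes of the first start tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        if self.attrs is None:
            self.attrs = dict(attrs)


def brightcove_url(video_block_el):
    """Build the player URL for one video block, or None without a video ID."""
    parser = _FirstTagAttrs()
    parser.feed(video_block_el)
    block = parser.attrs or {}
    video_id = block.get('videoid')
    if not video_id:
        return None  # the article extractor skips such blocks
    # Fall back to the defaults when the page omits these attributes.
    account_id = block.get('accountid') or '876450612001'
    player_id = block.get('playerid') or 'I2qBTln4u'
    return BRIGHTCOVE_URL_TEMPLATE % (account_id, player_id, video_id)


default_url = brightcove_url('<div class="video_block" videoid="6196747868001">')
custom_url = brightcove_url(
    '<div class="video_block" accountid="1" playerid="abc" videoid="42">')
missing = brightcove_url('<div class="video_block">')
```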
|
||||
30
youtube_dlc/extractor/bibeltv.py
Normal file
@@ -0,0 +1,30 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class BibelTVIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?bibeltv\.de/mediathek/videos/(?:crn/)?(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.bibeltv.de/mediathek/videos/329703-sprachkurs-in-malaiisch',
|
||||
'md5': '252f908192d611de038b8504b08bf97f',
|
||||
'info_dict': {
|
||||
'id': 'ref:329703',
|
||||
'ext': 'mp4',
|
||||
'title': 'Sprachkurs in Malaiisch',
|
||||
'description': 'md5:3e9f197d29ee164714e67351cf737dfe',
|
||||
'timestamp': 1608316701,
|
||||
'uploader_id': '5840105145001',
|
||||
'upload_date': '20201218',
|
||||
}
|
||||
}, {
|
||||
'url': 'https://www.bibeltv.de/mediathek/videos/crn/326374',
|
||||
'only_matching': True,
|
||||
}]
|
||||
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/5840105145001/default_default/index.html?videoId=ref:%s'
|
||||
|
||||
def _real_extract(self, url):
|
||||
crn_id = self._match_id(url)
|
||||
return self.url_result(
|
||||
self.BRIGHTCOVE_URL_TEMPLATE % crn_id, 'BrightcoveNew')
|
||||
@@ -7,12 +7,12 @@
|
||||
from .gigya import GigyaBaseIE
|
||||
from ..compat import compat_HTTPError
|
||||
from ..utils import (
|
||||
extract_attributes,
|
||||
ExtractorError,
|
||||
strip_or_none,
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
merge_dicts,
|
||||
parse_iso8601,
|
||||
str_or_none,
|
||||
url_or_none,
|
||||
)
|
||||
@@ -37,6 +37,7 @@ class CanvasIE(InfoExtractor):
|
||||
'url': 'https://mediazone.vrt.be/api/v1/canvas/assets/mz-ast-5e5f90b6-2d72-4c40-82c2-e134f884e93e',
|
||||
'only_matching': True,
|
||||
}]
|
||||
_GEO_BYPASS = False
|
||||
_HLS_ENTRY_PROTOCOLS_MAP = {
|
||||
'HLS': 'm3u8_native',
|
||||
'HLS_AES': 'm3u8',
|
||||
@@ -47,29 +48,34 @@ def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
site_id, video_id = mobj.group('site_id'), mobj.group('id')
|
||||
|
||||
# Old API endpoint, serves more formats but may fail for some videos
|
||||
data = self._download_json(
|
||||
'https://mediazone.vrt.be/api/v1/%s/assets/%s'
|
||||
% (site_id, video_id), video_id, 'Downloading asset JSON',
|
||||
'Unable to download asset JSON', fatal=False)
|
||||
data = None
|
||||
if site_id != 'vrtvideo':
|
||||
# Old API endpoint, serves more formats but may fail for some videos
|
||||
data = self._download_json(
|
||||
'https://mediazone.vrt.be/api/v1/%s/assets/%s'
|
||||
% (site_id, video_id), video_id, 'Downloading asset JSON',
|
||||
'Unable to download asset JSON', fatal=False)
|
||||
|
||||
# New API endpoint
|
||||
if not data:
|
||||
headers = self.geo_verification_headers()
|
||||
headers.update({'Content-Type': 'application/json'})
|
||||
token = self._download_json(
|
||||
'%s/tokens' % self._REST_API_BASE, video_id,
|
||||
'Downloading token', data=b'',
|
||||
headers={'Content-Type': 'application/json'})['vrtPlayerToken']
|
||||
'Downloading token', data=b'', headers=headers)['vrtPlayerToken']
|
||||
data = self._download_json(
|
||||
'%s/videos/%s' % (self._REST_API_BASE, video_id),
|
||||
video_id, 'Downloading video JSON', fatal=False, query={
|
||||
video_id, 'Downloading video JSON', query={
|
||||
'vrtPlayerToken': token,
|
||||
'client': '%s@PROD' % site_id,
|
||||
}, expected_status=400)
|
||||
message = data.get('message')
|
||||
if message and not data.get('title'):
|
||||
if data.get('code') == 'AUTHENTICATION_REQUIRED':
|
||||
self.raise_login_required(message)
|
||||
raise ExtractorError(message, expected=True)
|
||||
if not data.get('title'):
|
||||
code = data.get('code')
|
||||
if code == 'AUTHENTICATION_REQUIRED':
|
||||
self.raise_login_required()
|
||||
elif code == 'INVALID_LOCATION':
|
||||
self.raise_geo_restricted(countries=['BE'])
|
||||
raise ExtractorError(data.get('message') or code, expected=True)
|
||||
|
||||
title = data['title']
|
||||
description = data.get('description')
|
||||
@@ -205,20 +211,24 @@ def _real_extract(self, url):
|
||||
|
||||
class VrtNUIE(GigyaBaseIE):
|
||||
IE_DESC = 'VrtNU.be'
|
||||
_VALID_URL = r'https?://(?:www\.)?vrt\.be/(?P<site_id>vrtnu)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?vrt\.be/vrtnu/a-z/(?:[^/]+/){2}(?P<id>[^/?#&]+)'
|
||||
_TESTS = [{
|
||||
# Available via old API endpoint
|
||||
'url': 'https://www.vrt.be/vrtnu/a-z/postbus-x/1/postbus-x-s1a1/',
|
||||
'url': 'https://www.vrt.be/vrtnu/a-z/postbus-x/1989/postbus-x-s1989a1/',
|
||||
'info_dict': {
|
||||
'id': 'pbs-pub-2e2d8c27-df26-45c9-9dc6-90c78153044d$vid-90c932b1-e21d-4fb8-99b1-db7b49cf74de',
|
||||
'id': 'pbs-pub-e8713dac-899e-41de-9313-81269f4c04ac$vid-90c932b1-e21d-4fb8-99b1-db7b49cf74de',
|
||||
'ext': 'mp4',
|
||||
'title': 'De zwarte weduwe',
|
||||
'description': 'md5:db1227b0f318c849ba5eab1fef895ee4',
|
||||
'title': 'Postbus X - Aflevering 1 (Seizoen 1989)',
|
||||
'description': 'md5:b704f669eb9262da4c55b33d7c6ed4b7',
|
||||
'duration': 1457.04,
|
||||
'thumbnail': r're:^https?://.*\.jpg$',
|
||||
'season': 'Season 1',
|
||||
'season_number': 1,
|
||||
'series': 'Postbus X',
|
||||
'season': 'Seizoen 1989',
|
||||
'season_number': 1989,
|
||||
'episode': 'De zwarte weduwe',
|
||||
'episode_number': 1,
|
||||
'timestamp': 1595822400,
|
||||
'upload_date': '20200727',
|
||||
},
|
||||
'skip': 'This video is only available for registered users',
|
||||
'params': {
|
||||
@@ -300,69 +310,25 @@ def _login(self):
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
|
||||
webpage, urlh = self._download_webpage_handle(url, display_id)
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
attrs = extract_attributes(self._search_regex(
|
||||
r'(<nui-media[^>]+>)', webpage, 'media element'))
|
||||
video_id = attrs['videoid']
|
||||
publication_id = attrs.get('publicationid')
|
||||
if publication_id:
|
||||
video_id = publication_id + '$' + video_id
|
||||
|
||||
page = (self._parse_json(self._search_regex(
|
||||
r'digitalData\s*=\s*({.+?});', webpage, 'digital data',
|
||||
default='{}'), video_id, fatal=False) or {}).get('page') or {}
|
||||
|
||||
info = self._search_json_ld(webpage, display_id, default={})
|
||||
|
||||
# title is optional here since it may be extracted by extractor
|
||||
# that is delegated from here
|
||||
title = strip_or_none(self._html_search_regex(
|
||||
r'(?ms)<h1 class="content__heading">(.+?)</h1>',
|
||||
webpage, 'title', default=None))
|
||||
|
||||
description = self._html_search_regex(
|
||||
r'(?ms)<div class="content__description">(.+?)</div>',
|
||||
webpage, 'description', default=None)
|
||||
|
||||
season = self._html_search_regex(
|
||||
[r'''(?xms)<div\ class="tabs__tab\ tabs__tab--active">\s*
|
||||
<span>seizoen\ (.+?)</span>\s*
|
||||
</div>''',
|
||||
r'<option value="seizoen (\d{1,3})" data-href="[^"]+?" selected>'],
|
||||
webpage, 'season', default=None)
|
||||
|
||||
season_number = int_or_none(season)
|
||||
|
||||
episode_number = int_or_none(self._html_search_regex(
|
||||
r'''(?xms)<div\ class="content__episode">\s*
|
||||
<abbr\ title="aflevering">afl</abbr>\s*<span>(\d+)</span>
|
||||
</div>''',
|
||||
webpage, 'episode_number', default=None))
|
||||
|
||||
release_date = parse_iso8601(self._html_search_regex(
|
||||
r'(?ms)<div class="content__broadcastdate">\s*<time\ datetime="(.+?)"',
|
||||
webpage, 'release_date', default=None))
|
||||
|
||||
# If there's a ? or a # in the URL, remove them and everything after
|
||||
clean_url = urlh.geturl().split('?')[0].split('#')[0].strip('/')
|
||||
securevideo_url = clean_url + '.mssecurevideo.json'
|
||||
|
||||
try:
|
||||
video = self._download_json(securevideo_url, display_id)
|
||||
except ExtractorError as e:
|
||||
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
|
||||
self.raise_login_required()
|
||||
raise
|
||||
|
||||
# We are dealing with a '../<show>.relevant' URL
|
||||
redirect_url = video.get('url')
|
||||
if redirect_url:
|
||||
return self.url_result(self._proto_relative_url(redirect_url, 'https:'))
|
||||
|
||||
# There is only one entry, but with an unknown key, so just get
|
||||
# the first one
|
||||
video_id = list(video.values())[0].get('videoid')
|
||||
|
||||
return merge_dicts(info, {
|
||||
'_type': 'url_transparent',
|
||||
'url': 'https://mediazone.vrt.be/api/v1/vrtvideo/assets/%s' % video_id,
|
||||
'ie_key': CanvasIE.ie_key(),
|
||||
'id': video_id,
|
||||
'display_id': display_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'season': season,
|
||||
'season_number': season_number,
|
||||
'episode_number': episode_number,
|
||||
'release_date': release_date,
|
||||
'season_number': int_or_none(page.get('episode_season')),
|
||||
})
|
||||
|
||||
@@ -17,7 +17,12 @@
|
||||
class DPlayIE(InfoExtractor):
|
||||
_VALID_URL = r'''(?x)https?://
|
||||
(?P<domain>
|
||||
(?:www\.)?(?P<host>dplay\.(?P<country>dk|fi|jp|se|no))|
|
||||
(?:www\.)?(?P<host>d
|
||||
(?:
|
||||
play\.(?P<country>dk|fi|jp|se|no)|
|
||||
iscoveryplus\.(?P<plus_country>dk|es|fi|it|se|no)
|
||||
)
|
||||
)|
|
||||
(?P<subdomain_country>es|it)\.dplay\.com
|
||||
)/[^/]+/(?P<id>[^/]+/[^/?#]+)'''
|
||||
|
||||
@@ -126,6 +131,24 @@ class DPlayIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'https://www.dplay.jp/video/gold-rush/24086',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.discoveryplus.se/videos/nugammalt-77-handelser-som-format-sverige/nugammalt-77-handelser-som-format-sverige-101',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.discoveryplus.dk/videoer/ted-bundy-mind-of-a-monster/ted-bundy-mind-of-a-monster',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.discoveryplus.no/videoer/i-kongens-klr/sesong-1-episode-7',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.discoveryplus.it/videos/biografie-imbarazzanti/luigi-di-maio-la-psicosi-di-stanislawskij',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.discoveryplus.es/videos/la-fiebre-del-oro/temporada-8-episodio-1',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'https://www.discoveryplus.fi/videot/shifting-gears-with-aaron-kaufman/episode-16',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _get_disco_api_info(self, url, display_id, disco_host, realm, country):
|
||||
@@ -241,7 +264,7 @@ def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
display_id = mobj.group('id')
|
||||
domain = mobj.group('domain').lstrip('www.')
|
||||
country = mobj.group('country') or mobj.group('subdomain_country')
|
||||
host = 'disco-api.' + domain if domain.startswith('dplay.') else 'eu2-prod.disco-api.com'
|
||||
country = mobj.group('country') or mobj.group('subdomain_country') or mobj.group('plus_country')
|
||||
host = 'disco-api.' + domain if domain[0] == 'd' else 'eu2-prod.disco-api.com'
|
||||
return self._get_disco_api_info(
|
||||
url, display_id, host, 'dplay' + country, country)
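How the updated pattern maps a URL to an API host, realm and country can be sketched standalone. This is an illustration only: `_URL_RE` copies the pattern from the diff, `resolve_api` is a hypothetical name, and the `es.dplay.com` URL in the usage lines is made up for demonstration. The sketch removes the `www.` prefix with an explicit `startswith` check, an assumption about intent, since `str.lstrip` strips a set of characters rather than a prefix.

```python
import re

# Copy of the DPlayIE pattern after the change (verbose mode).
_URL_RE = re.compile(r'''(?x)https?://
    (?P<domain>
        (?:www\.)?(?P<host>d
            (?:
                play\.(?P<country>dk|fi|jp|se|no)|
                iscoveryplus\.(?P<plus_country>dk|es|fi|it|se|no)
            )
        )|
        (?P<subdomain_country>es|it)\.dplay\.com
    )/[^/]+/(?P<id>[^/]+/[^/?#]+)''')


def resolve_api(url):
    """Return (api_host, realm, country, display_id) for a supported URL."""
    mobj = _URL_RE.match(url)
    if mobj is None:
        raise ValueError('unsupported URL: %s' % url)
    domain = mobj.group('domain')
    if domain.startswith('www.'):
        domain = domain[len('www.'):]
    country = (mobj.group('country') or mobj.group('subdomain_country')
               or mobj.group('plus_country'))
    # dplay.* and discoveryplus.* both start with 'd'; xx.dplay.com does not.
    host = 'disco-api.' + domain if domain[0] == 'd' else 'eu2-prod.disco-api.com'
    return host, 'dplay' + country, country, mobj.group('id')


plus = resolve_api('https://www.discoveryplus.no/videoer/i-kongens-klr/sesong-1-episode-7')
classic = resolve_api('https://www.dplay.jp/video/gold-rush/24086')
subdomain = resolve_api('https://es.dplay.com/dmax/some-show/some-episode')
```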
|
||||
|
||||
@@ -46,6 +46,10 @@
|
||||
AluraCourseIE
|
||||
)
|
||||
from .amcnetworks import AMCNetworksIE
|
||||
from .animelab import (
|
||||
AnimeLabIE,
|
||||
AnimeLabShowsIE,
|
||||
)
|
||||
from .americastestkitchen import AmericasTestKitchenIE
|
||||
from .animeondemand import AnimeOnDemandIE
|
||||
from .anvato import AnvatoIE
|
||||
@@ -59,6 +63,7 @@
|
||||
AppleTrailersIE,
|
||||
AppleTrailersSectionIE,
|
||||
)
|
||||
from .applepodcasts import ApplePodcastsIE
|
||||
from .archiveorg import ArchiveOrgIE
|
||||
from .arcpublishing import ArcPublishingIE
|
||||
from .arkena import ArkenaIE
|
||||
@@ -104,6 +109,12 @@
|
||||
from .beatport import BeatportIE
|
||||
from .bet import BetIE
|
||||
from .bfi import BFIPlayerIE
|
||||
from .bfmtv import (
|
||||
BFMTVIE,
|
||||
BFMTVLiveIE,
|
||||
BFMTVArticleIE,
|
||||
)
|
||||
from .bibeltv import BibelTVIE
|
||||
from .bigflix import BigflixIE
|
||||
from .bild import BildIE
|
||||
from .bilibili import (
|
||||
@@ -442,7 +453,10 @@
|
||||
from .godtube import GodTubeIE
|
||||
from .golem import GolemIE
|
||||
from .googledrive import GoogleDriveIE
|
||||
from .googleplus import GooglePlusIE
|
||||
from .googlepodcasts import (
|
||||
GooglePodcastsIE,
|
||||
GooglePodcastsFeedIE,
|
||||
)
|
||||
from .googlesearch import GoogleSearchIE
|
||||
from .goshgay import GoshgayIE
|
||||
from .gputechconf import GPUTechConfIE
|
||||
@@ -484,6 +498,10 @@
|
||||
OneUPIE,
|
||||
PCMagIE,
|
||||
)
|
||||
from .iheart import (
|
||||
IHeartRadioIE,
|
||||
IHeartRadioPodcastIE,
|
||||
)
|
||||
from .imdb import (
|
||||
ImdbIE,
|
||||
ImdbListIE
|
||||
|
||||
@@ -1,73 +0,0 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
import codecs
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import unified_strdate
|
||||
|
||||
|
||||
class GooglePlusIE(InfoExtractor):
|
||||
IE_DESC = 'Google Plus'
|
||||
_VALID_URL = r'https?://plus\.google\.com/(?:[^/]+/)*?posts/(?P<id>\w+)'
|
||||
IE_NAME = 'plus.google'
|
||||
_TEST = {
|
||||
'url': 'https://plus.google.com/u/0/108897254135232129896/posts/ZButuJc6CtH',
|
||||
'info_dict': {
|
||||
'id': 'ZButuJc6CtH',
|
||||
'ext': 'flv',
|
||||
'title': '嘆きの天使 降臨',
|
||||
'upload_date': '20120613',
|
||||
'uploader': '井上ヨシマサ',
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
# Step 1, Retrieve post webpage to extract further information
|
||||
webpage = self._download_webpage(url, video_id, 'Downloading entry webpage')
|
||||
|
||||
title = self._og_search_description(webpage).splitlines()[0]
|
||||
upload_date = unified_strdate(self._html_search_regex(
|
||||
r'''(?x)<a.+?class="o-U-s\s[^"]+"\s+style="display:\s*none"\s*>
|
||||
([0-9]{4}-[0-9]{2}-[0-9]{2})</a>''',
|
||||
webpage, 'upload date', fatal=False, flags=re.VERBOSE))
|
||||
uploader = self._html_search_regex(
|
||||
r'rel="author".*?>(.*?)</a>', webpage, 'uploader', fatal=False)
|
||||
|
||||
# Step 2, Simulate clicking the image box to launch video
|
||||
DOMAIN = 'https://plus.google.com/'
|
||||
video_page = self._search_regex(
|
||||
r'<a href="((?:%s)?photos/.*?)"' % re.escape(DOMAIN),
|
||||
webpage, 'video page URL')
|
||||
if not video_page.startswith(DOMAIN):
|
||||
video_page = DOMAIN + video_page
|
||||
|
||||
webpage = self._download_webpage(video_page, video_id, 'Downloading video page')
|
||||
|
||||
def unicode_escape(s):
|
||||
decoder = codecs.getdecoder('unicode_escape')
|
||||
return re.sub(
|
||||
r'\\u[0-9a-fA-F]{4,}',
|
||||
lambda m: decoder(m.group(0))[0],
|
||||
s)
|
||||
|
||||
# Extract video links all sizes
|
||||
formats = [{
|
||||
'url': unicode_escape(video_url),
|
||||
'ext': 'flv',
|
||||
'width': int(width),
|
||||
'height': int(height),
|
||||
} for width, height, video_url in re.findall(
|
||||
r'\d+,(\d+),(\d+),"(https?://[^.]+\.googleusercontent\.com.*?)"', webpage)]
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'uploader': uploader,
|
||||
'upload_date': upload_date,
|
||||
'formats': formats,
|
||||
}
|
||||
88
youtube_dlc/extractor/googlepodcasts.py
Normal file
@@ -0,0 +1,88 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import json
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
clean_podcast_url,
|
||||
int_or_none,
|
||||
try_get,
|
||||
urlencode_postdata,
|
||||
)
|
||||
|
||||
|
||||
class GooglePodcastsBaseIE(InfoExtractor):
|
||||
_VALID_URL_BASE = r'https?://podcasts\.google\.com/feed/'
|
||||
|
||||
def _batch_execute(self, func_id, video_id, params):
|
||||
return json.loads(self._download_json(
|
||||
'https://podcasts.google.com/_/PodcastsUi/data/batchexecute',
|
||||
video_id, data=urlencode_postdata({
|
||||
'f.req': json.dumps([[[func_id, json.dumps(params), None, '1']]]),
|
||||
}), transform_source=lambda x: self._search_regex(r'(?s)(\[.+\])', x, 'data'))[0][2])
|
||||
|
||||
def _extract_episode(self, episode):
|
||||
return {
|
||||
'id': episode[4][3],
|
||||
'title': episode[8],
|
||||
'url': clean_podcast_url(episode[13]),
|
||||
'thumbnail': episode[2],
|
||||
'description': episode[9],
|
||||
'creator': try_get(episode, lambda x: x[14]),
|
||||
'timestamp': int_or_none(episode[11]),
|
||||
'duration': int_or_none(episode[12]),
|
||||
'series': episode[1],
|
||||
}
|
||||
|
||||
|
||||
class GooglePodcastsIE(GooglePodcastsBaseIE):
|
||||
IE_NAME = 'google:podcasts'
|
||||
_VALID_URL = GooglePodcastsBaseIE._VALID_URL_BASE + r'(?P<feed_url>[^/]+)/episode/(?P<id>[^/?&#]+)'
|
||||
_TEST = {
|
||||
'url': 'https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5ucHIub3JnLzM0NDA5ODUzOS9wb2RjYXN0LnhtbA/episode/MzBlNWRlN2UtOWE4Yy00ODcwLTk2M2MtM2JlMmUyNmViOTRh',
|
||||
'md5': 'fa56b2ee8bd0703e27e42d4b104c4766',
|
||||
'info_dict': {
|
||||
'id': '30e5de7e-9a8c-4870-963c-3be2e26eb94a',
|
||||
'ext': 'mp3',
|
||||
'title': 'WWDTM New Year 2021',
|
||||
'description': 'We say goodbye to 2020 with Christine Baranksi, Doug Jones, Jonna Mendez, and Kellee Edwards.',
|
||||
'upload_date': '20210102',
|
||||
'timestamp': 1609606800,
|
||||
'duration': 2901,
|
||||
'series': "Wait Wait... Don't Tell Me!",
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
b64_feed_url, b64_guid = re.match(self._VALID_URL, url).groups()
|
||||
episode = self._batch_execute(
|
||||
'oNjqVe', b64_guid, [b64_feed_url, b64_guid])[1]
|
||||
return self._extract_episode(episode)
|
||||
|
||||
|
||||
class GooglePodcastsFeedIE(GooglePodcastsBaseIE):
|
||||
IE_NAME = 'google:podcasts:feed'
|
||||
_VALID_URL = GooglePodcastsBaseIE._VALID_URL_BASE + r'(?P<id>[^/?&#]+)/?(?:[?#&]|$)'
|
||||
_TEST = {
|
||||
'url': 'https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5ucHIub3JnLzM0NDA5ODUzOS9wb2RjYXN0LnhtbA',
|
||||
'info_dict': {
|
||||
'title': "Wait Wait... Don't Tell Me!",
|
||||
'description': "NPR's weekly current events quiz. Have a laugh and test your news knowledge while figuring out what's real and what we've made up.",
|
||||
},
|
||||
'playlist_mincount': 20,
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
b64_feed_url = self._match_id(url)
|
||||
data = self._batch_execute('ncqJEe', b64_feed_url, [b64_feed_url])
|
||||
|
||||
entries = []
|
||||
for episode in (try_get(data, lambda x: x[1][0]) or []):
|
||||
entries.append(self._extract_episode(episode))
|
||||
|
||||
feed = try_get(data, lambda x: x[3]) or []
|
||||
return self.playlist_result(
|
||||
entries, playlist_title=try_get(feed, lambda x: x[0]),
|
||||
playlist_description=try_get(feed, lambda x: x[2]))
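The two path components in the test URL above look like unpadded base64, which would explain the variable names `b64_feed_url` and `b64_guid`. The following sketch decodes them for inspection, assuming URL-safe base64 without padding; the extractor itself passes the encoded values straight to the API, and `decode_component` is a hypothetical helper.

```python
import base64


def decode_component(b64_value):
    """Decode an unpadded URL-safe base64 URL component to text."""
    padding = '=' * (-len(b64_value) % 4)  # restore the stripped padding
    return base64.urlsafe_b64decode(b64_value + padding).decode('utf-8')


feed_url = decode_component(
    'aHR0cHM6Ly9mZWVkcy5ucHIub3JnLzM0NDA5ODUzOS9wb2RjYXN0LnhtbA')
guid = decode_component('MzBlNWRlN2UtOWE4Yy00ODcwLTk2M2MtM2JlMmUyNmViOTRh')
```

The decoded GUID matches the `id` expected in the test's `info_dict`, which is consistent with `_extract_episode` reading the episode ID from the API response rather than from the URL.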
|
||||
97
youtube_dlc/extractor/iheart.py
Normal file
@@ -0,0 +1,97 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
clean_html,
|
||||
clean_podcast_url,
|
||||
int_or_none,
|
||||
str_or_none,
|
||||
)
|
||||
|
||||
|
||||
class IHeartRadioBaseIE(InfoExtractor):
|
||||
def _call_api(self, path, video_id, fatal=True, query=None):
|
||||
return self._download_json(
|
||||
'https://api.iheart.com/api/v3/podcast/' + path,
|
||||
video_id, fatal=fatal, query=query)
|
||||
|
||||
def _extract_episode(self, episode):
|
||||
return {
|
||||
'thumbnail': episode.get('imageUrl'),
|
||||
'description': clean_html(episode.get('description')),
|
||||
'timestamp': int_or_none(episode.get('startDate'), 1000),
|
||||
'duration': int_or_none(episode.get('duration')),
|
||||
}
|
||||
|
||||
|
||||
class IHeartRadioIE(IHeartRadioBaseIE):
|
||||
IE_NAME = 'iheartradio'
|
||||
_VALID_URL = r'(?:https?://(?:www\.)?iheart\.com/podcast/[^/]+/episode/(?P<display_id>[^/?&#]+)-|iheartradio:)(?P<id>\d+)'
|
||||
_TEST = {
|
||||
'url': 'https://www.iheart.com/podcast/105-behind-the-bastards-29236323/episode/part-one-alexander-lukashenko-the-dictator-70346499/?embed=true',
|
||||
'md5': 'c8609c92c8688dcb69d8541042b8abca',
|
||||
'info_dict': {
|
||||
'id': '70346499',
|
||||
'ext': 'mp3',
|
||||
'title': 'Part One: Alexander Lukashenko: The Dictator of Belarus',
|
||||
'description': 'md5:96cc7297b3a5a9ebae28643801c96fae',
|
||||
'timestamp': 1597741200,
|
||||
'upload_date': '20200818',
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
episode_id = self._match_id(url)
|
||||
episode = self._call_api(
|
||||
'episodes/' + episode_id, episode_id)['episode']
|
||||
info = self._extract_episode(episode)
|
||||
info.update({
|
||||
'id': episode_id,
|
||||
'title': episode['title'],
|
||||
'url': clean_podcast_url(episode['mediaUrl']),
|
||||
})
|
||||
return info
|
||||
|
||||
|
||||
class IHeartRadioPodcastIE(IHeartRadioBaseIE):
|
||||
IE_NAME = 'iheartradio:podcast'
|
||||
_VALID_URL = r'https?://(?:www\.)?iheart(?:podcastnetwork)?\.com/podcast/[^/?&#]+-(?P<id>\d+)/?(?:[?#&]|$)'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.iheart.com/podcast/1119-it-could-happen-here-30717896/',
|
||||
'info_dict': {
|
||||
'id': '30717896',
|
||||
'title': 'It Could Happen Here',
|
||||
'description': 'md5:5842117412a967eb0b01f8088eb663e2',
|
||||
},
|
||||
'playlist_mincount': 11,
|
||||
}, {
|
||||
'url': 'https://www.iheartpodcastnetwork.com/podcast/105-stuff-you-should-know-26940277',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
podcast_id = self._match_id(url)
|
||||
path = 'podcasts/' + podcast_id
|
||||
episodes = self._call_api(
|
||||
path + '/episodes', podcast_id, query={'limit': 1000000000})['data']
|
||||
|
||||
entries = []
|
||||
for episode in episodes:
|
||||
episode_id = str_or_none(episode.get('id'))
|
||||
if not episode_id:
|
||||
continue
|
||||
info = self._extract_episode(episode)
|
||||
info.update({
|
||||
'_type': 'url',
|
||||
'id': episode_id,
|
||||
'title': episode.get('title'),
|
||||
'url': 'iheartradio:' + episode_id,
|
||||
'ie_key': IHeartRadioIE.ie_key(),
|
||||
})
|
||||
entries.append(info)
|
||||
|
||||
podcast = self._call_api(path, podcast_id, False) or {}
|
||||
|
||||
return self.playlist_result(
|
||||
entries, podcast_id, podcast.get('title'), podcast.get('description'))
|
||||
@@ -2,92 +2,71 @@

from .canvas import CanvasIE
from .common import InfoExtractor
from ..compat import compat_urllib_parse_unquote
from ..utils import (
    int_or_none,
    parse_iso8601,
)


class KetnetIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?ketnet\.be/(?:[^/]+/)*(?P<id>[^/?#&]+)'
    _VALID_URL = r'https?://(?:www\.)?ketnet\.be/(?P<id>(?:[^/]+/)*[^/?#&]+)'
    _TESTS = [{
        'url': 'https://www.ketnet.be/kijken/zomerse-filmpjes',
        'md5': '6bdeb65998930251bbd1c510750edba9',
        'url': 'https://www.ketnet.be/kijken/n/nachtwacht/3/nachtwacht-s3a1-de-greystook',
        'md5': '37b2b7bb9b3dcaa05b67058dc3a714a9',
        'info_dict': {
            'id': 'zomerse-filmpjes',
            'id': 'pbs-pub-aef8b526-115e-4006-aa24-e59ff6c6ef6f$vid-ddb815bf-c8e7-467b-8879-6bad7a32cebd',
            'ext': 'mp4',
            'title': 'Gluur mee op de filmset en op Pennenzakkenrock',
            'description': 'Gluur mee met Ghost Rockers op de filmset',
            'title': 'Nachtwacht - Reeks 3: Aflevering 1',
            'description': 'De Nachtwacht krijgt te maken met een parasiet',
            'thumbnail': r're:^https?://.*\.jpg$',
        }
    }, {
        # mzid in playerConfig instead of sources
        'url': 'https://www.ketnet.be/kijken/nachtwacht/de-greystook',
        'md5': '90139b746a0a9bd7bb631283f6e2a64e',
        'info_dict': {
            'id': 'md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
            'display_id': 'md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
            'ext': 'flv',
            'title': 'Nachtwacht: De Greystook',
            'description': 'md5:1db3f5dc4c7109c821261e7512975be7',
            'thumbnail': r're:^https?://.*\.jpg$',
            'duration': 1468.03,
            'duration': 1468.02,
            'timestamp': 1609225200,
            'upload_date': '20201229',
            'series': 'Nachtwacht',
            'season': 'Reeks 3',
            'episode': 'De Greystook',
            'episode_number': 1,
        },
        'expected_warnings': ['is not a supported codec', 'Unknown MIME type'],
    }, {
        'url': 'https://www.ketnet.be/kijken/karrewiet/uitzending-8-september-2016',
        'only_matching': True,
    }, {
        'url': 'https://www.ketnet.be/achter-de-schermen/sien-repeteert-voor-stars-for-life',
        'only_matching': True,
    }, {
        # mzsource, geo restricted to Belgium
        'url': 'https://www.ketnet.be/kijken/nachtwacht/de-bermadoe',
        'url': 'https://www.ketnet.be/themas/karrewiet/jaaroverzicht-20200/karrewiet-het-jaar-van-black-mamba',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
        display_id = self._match_id(url)

        webpage = self._download_webpage(url, video_id)
        video = self._download_json(
            'https://senior-bff.ketnet.be/graphql', display_id, query={
                'query': '''{
  video(id: "content/ketnet/nl/%s.model.json") {
    description
    episodeNr
    imageUrl
    mediaReference
    programTitle
    publicationDate
    seasonTitle
    subtitleVideodetail
    titleVideodetail
  }
}''' % display_id,
            })['data']['video']

        config = self._parse_json(
            self._search_regex(
                r'(?s)playerConfig\s*=\s*({.+?})\s*;', webpage,
                'player config'),
            video_id)

        mzid = config.get('mzid')
        if mzid:
            return self.url_result(
                'https://mediazone.vrt.be/api/v1/ketnet/assets/%s' % mzid,
                CanvasIE.ie_key(), video_id=mzid)

        title = config['title']

        formats = []
        for source_key in ('', 'mz'):
            source = config.get('%ssource' % source_key)
            if not isinstance(source, dict):
                continue
            for format_id, format_url in source.items():
                if format_id == 'hls':
                    formats.extend(self._extract_m3u8_formats(
                        format_url, video_id, 'mp4',
                        entry_protocol='m3u8_native', m3u8_id=format_id,
                        fatal=False))
                elif format_id == 'hds':
                    formats.extend(self._extract_f4m_formats(
                        format_url, video_id, f4m_id=format_id, fatal=False))
                else:
                    formats.append({
                        'url': format_url,
                        'format_id': format_id,
                    })
        self._sort_formats(formats)
        mz_id = compat_urllib_parse_unquote(video['mediaReference'])

        return {
            'id': video_id,
            'title': title,
            'description': config.get('description'),
            'thumbnail': config.get('image'),
            'series': config.get('program'),
            'episode': config.get('episode'),
            'formats': formats,
            '_type': 'url_transparent',
            'id': mz_id,
            'title': video['titleVideodetail'],
            'url': 'https://mediazone.vrt.be/api/v1/ketnet/assets/' + mz_id,
            'thumbnail': video.get('imageUrl'),
            'description': video.get('description'),
            'timestamp': parse_iso8601(video.get('publicationDate')),
            'series': video.get('programTitle'),
            'season': video.get('seasonTitle'),
            'episode': video.get('subtitleVideodetail'),
            'episode_number': int_or_none(video.get('episodeNr')),
            'ie_key': CanvasIE.ie_key(),
        }
@@ -61,6 +61,23 @@ class MotherlessIE(InfoExtractor):
        # no keywords
        'url': 'http://motherless.com/8B4BBC1',
        'only_matching': True,
    }, {
        # see https://motherless.com/videos/recent for recent videos with
        # uploaded date in "ago" format
        'url': 'https://motherless.com/3C3E2CF',
        'info_dict': {
            'id': '3C3E2CF',
            'ext': 'mp4',
            'title': 'a/ Hot Teens',
            'categories': list,
            'upload_date': '20210104',
            'uploader_id': 'yonbiw',
            'thumbnail': r're:https?://.*\.jpg',
            'age_limit': 18,
        },
        'params': {
            'skip_download': True,
        },
    }]

    def _real_extract(self, url):
@@ -85,20 +102,28 @@ def _real_extract(self, url):
            or 'http://cdn4.videos.motherlessmedia.com/videos/%s.mp4?fs=opencloud' % video_id)
        age_limit = self._rta_search(webpage)
        view_count = str_to_int(self._html_search_regex(
            (r'>(\d+)\s+Views<', r'<strong>Views</strong>\s+([^<]+)<'),
            (r'>([\d,.]+)\s+Views<', r'<strong>Views</strong>\s+([^<]+)<'),
            webpage, 'view count', fatal=False))
        like_count = str_to_int(self._html_search_regex(
            (r'>(\d+)\s+Favorites<', r'<strong>Favorited</strong>\s+([^<]+)<'),
            (r'>([\d,.]+)\s+Favorites<',
             r'<strong>Favorited</strong>\s+([^<]+)<'),
            webpage, 'like count', fatal=False))

        upload_date = self._html_search_regex(
            (r'class=["\']count[^>]+>(\d+\s+[a-zA-Z]{3}\s+\d{4})<',
             r'<strong>Uploaded</strong>\s+([^<]+)<'), webpage, 'upload date')
        if 'Ago' in upload_date:
            days = int(re.search(r'([0-9]+)', upload_date).group(1))
            upload_date = (datetime.datetime.now() - datetime.timedelta(days=days)).strftime('%Y%m%d')
        else:
            upload_date = unified_strdate(upload_date)
        upload_date = unified_strdate(self._search_regex(
            r'class=["\']count[^>]+>(\d+\s+[a-zA-Z]{3}\s+\d{4})<', webpage,
            'upload date', default=None))
        if not upload_date:
            uploaded_ago = self._search_regex(
                r'>\s*(\d+[hd])\s+[aA]go\b', webpage, 'uploaded ago',
                default=None)
            if uploaded_ago:
                delta = int(uploaded_ago[:-1])
                _AGO_UNITS = {
                    'h': 'hours',
                    'd': 'days',
                }
                kwargs = {_AGO_UNITS.get(uploaded_ago[-1]): delta}
                upload_date = (datetime.datetime.utcnow() - datetime.timedelta(**kwargs)).strftime('%Y%m%d')

        comment_count = webpage.count('class="media-comment-contents"')
        uploader_id = self._html_search_regex(
@@ -223,12 +223,12 @@ def call_playback_api(item, query=None):
        legal_age = try_get(
            data, lambda x: x['legalAge']['body']['rating']['code'], compat_str)
        # https://en.wikipedia.org/wiki/Norwegian_Media_Authority
        if legal_age == 'A':
            age_limit = 0
        elif legal_age.isdigit():
            age_limit = int_or_none(legal_age)
        else:
            age_limit = None
        age_limit = None
        if legal_age:
            if legal_age == 'A':
                age_limit = 0
            elif legal_age.isdigit():
                age_limit = int_or_none(legal_age)

        is_series = try_get(data, lambda x: x['_links']['series']['name']) == 'series'
@@ -298,6 +298,14 @@ class NRKTVIE(InfoExtractor):
            'description': 'md5:46923a6e6510eefcce23d5ef2a58f2ce',
            'duration': 2223.44,
            'age_limit': 6,
            'subtitles': {
                'nb-nor': [{
                    'ext': 'vtt',
                }],
                'nb-ttv': [{
                    'ext': 'vtt',
                }]
            },
        },
    }, {
        'url': 'https://tv.nrk.no/serie/20-spoersmaal-tv/MUHH48000314/23-05-2014',
@@ -17,6 +17,7 @@
    get_exe_version,
    is_outdated_version,
    std_headers,
    process_communicate_or_kill,
)
@@ -226,7 +227,7 @@ def get(self, url, html=None, video_id=None, note=None, note2='Executing JS on w
            self.exe, '--ssl-protocol=any',
            self._TMP_FILES['script'].name
        ], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, err = p.communicate()
        out, err = process_communicate_or_kill(p)
        if p.returncode != 0:
            raise ExtractorError(
                'Executing JS failed\n:' + encodeArgument(err))
@@ -103,22 +103,28 @@ def _extract_relinker_info(self, relinker_url, video_id):
        }.items() if v is not None)

    @staticmethod
    def _extract_subtitles(url, subtitle_url):
    def _extract_subtitles(url, video_data):
        STL_EXT = 'stl'
        SRT_EXT = 'srt'
        subtitles = {}
        if subtitle_url and isinstance(subtitle_url, compat_str):
            subtitle_url = urljoin(url, subtitle_url)
            STL_EXT = '.stl'
            SRT_EXT = '.srt'
            subtitles['it'] = [{
                'ext': 'stl',
                'url': subtitle_url,
            }]
            if subtitle_url.endswith(STL_EXT):
                srt_url = subtitle_url[:-len(STL_EXT)] + SRT_EXT
                subtitles['it'].append({
                    'ext': 'srt',
                    'url': srt_url,
        subtitles_array = video_data.get('subtitlesArray') or []
        for k in ('subtitles', 'subtitlesUrl'):
            subtitles_array.append({'url': video_data.get(k)})
        for subtitle in subtitles_array:
            sub_url = subtitle.get('url')
            if sub_url and isinstance(sub_url, compat_str):
                sub_lang = subtitle.get('language') or 'it'
                sub_url = urljoin(url, sub_url)
                sub_ext = determine_ext(sub_url, SRT_EXT)
                subtitles.setdefault(sub_lang, []).append({
                    'ext': sub_ext,
                    'url': sub_url,
                })
                if STL_EXT == sub_ext:
                    subtitles[sub_lang].append({
                        'ext': SRT_EXT,
                        'url': sub_url[:-len(STL_EXT)] + SRT_EXT,
                    })
        return subtitles
@@ -138,6 +144,9 @@ class RaiPlayIE(RaiBaseIE):
            'duration': 6160,
            'series': 'Report',
            'season': '2013/14',
            'subtitles': {
                'it': 'count:2',
            },
        },
        'params': {
            'skip_download': True,
@@ -145,6 +154,10 @@ class RaiPlayIE(RaiBaseIE):
    }, {
        'url': 'http://www.raiplay.it/video/2016/11/gazebotraindesi-efebe701-969c-4593-92f3-285f0d1ce750.html?',
        'only_matching': True,
    }, {
        # subtitles at 'subtitlesArray' key (see #27698)
        'url': 'https://www.raiplay.it/video/2020/12/Report---04-01-2021-2e90f1de-8eee-4de4-ac0e-78d21db5b600.html',
        'only_matching': True,
    }]

    def _real_extract(self, url):
@@ -171,7 +184,7 @@ def _real_extract(self, url):
        if date_published and time_published:
            date_published += ' ' + time_published

        subtitles = self._extract_subtitles(url, video.get('subtitles'))
        subtitles = self._extract_subtitles(url, video)

        program_info = media.get('program_info') or {}
        season = media.get('season')
@@ -325,6 +338,22 @@ class RaiIE(RaiBaseIE):
        'params': {
            'skip_download': True,
        },
    }, {
        # ContentItem in iframe (see #12652) and subtitle at 'subtitlesUrl' key
        'url': 'http://www.presadiretta.rai.it/dl/portali/site/puntata/ContentItem-3ed19d13-26c2-46ff-a551-b10828262f1b.html',
        'info_dict': {
            'id': '1ad6dc64-444a-42a4-9bea-e5419ad2f5fd',
            'ext': 'mp4',
            'title': 'Partiti acchiappavoti - Presa diretta del 13/09/2015',
            'description': 'md5:d291b03407ec505f95f27970c0b025f4',
            'upload_date': '20150913',
            'subtitles': {
                'it': 'count:2',
            },
        },
        'params': {
            'skip_download': True,
        },
    }, {
        # Direct MMS URL
        'url': 'http://www.rai.it/dl/RaiTV/programmi/media/ContentItem-b63a4089-ac28-48cf-bca5-9f5b5bc46df5.html',
@@ -365,7 +394,7 @@ def _extract_from_content_id(self, content_id, url):
                    'url': compat_urlparse.urljoin(url, thumbnail_url),
                })

        subtitles = self._extract_subtitles(url, media.get('subtitlesUrl'))
        subtitles = self._extract_subtitles(url, media)

        info = {
            'id': content_id,
@@ -402,7 +431,8 @@ def _real_extract(self, url):
                r'''(?x)
                    (?:
                        (?:initEdizione|drawMediaRaiTV)\(|
                        <(?:[^>]+\bdata-id|var\s+uniquename)=
                        <(?:[^>]+\bdata-id|var\s+uniquename)=|
                        <iframe[^>]+\bsrc=
                    )
                    (["\'])
                    (?:(?!\1).)*\bContentItem-(?P<id>%s)
@@ -10,7 +10,7 @@

class SBSIE(InfoExtractor):
    IE_DESC = 'sbs.com.au'
    _VALID_URL = r'https?://(?:www\.)?sbs\.com\.au/(?:ondemand|news)/video/(?:single/)?(?P<id>[0-9]+)'
    _VALID_URL = r'https?://(?:www\.)?sbs\.com\.au/(?:ondemand(?:/video/(?:single/)?|.*?\bplay=)|news/(?:embeds/)?video/)(?P<id>[0-9]+)'

    _TESTS = [{
        # Original URL is handled by the generic IE which finds the iframe:
@@ -18,7 +18,7 @@ class SBSIE(InfoExtractor):
        'url': 'http://www.sbs.com.au/ondemand/video/single/320403011771/?source=drupal&vertical=thefeed',
        'md5': '3150cf278965eeabb5b4cea1c963fe0a',
        'info_dict': {
            'id': '320403011771',
            'id': '_rFBPRPO4pMR',
            'ext': 'mp4',
            'title': 'Dingo Conservation (The Feed)',
            'description': 'md5:f250a9856fca50d22dec0b5b8015f8a5',
@@ -34,6 +34,15 @@ class SBSIE(InfoExtractor):
    }, {
        'url': 'http://www.sbs.com.au/news/video/471395907773/The-Feed-July-9',
        'only_matching': True,
    }, {
        'url': 'https://www.sbs.com.au/ondemand/?play=1836638787723',
        'only_matching': True,
    }, {
        'url': 'https://www.sbs.com.au/ondemand/program/inside-windsor-castle?play=1283505731842',
        'only_matching': True,
    }, {
        'url': 'https://www.sbs.com.au/news/embeds/video/1840778819866',
        'only_matching': True,
    }]

    def _real_extract(self, url):
@@ -8,13 +8,17 @@
    compat_str,
    float_or_none,
    int_or_none,
    smuggle_url,
    str_or_none,
    try_get,
)


class STVPlayerIE(InfoExtractor):
    IE_NAME = 'stv:player'
    _VALID_URL = r'https?://player\.stv\.tv/(?P<type>episode|video)/(?P<id>[a-z0-9]{4})'
    _TEST = {
    _TESTS = [{
        # shortform
        'url': 'https://player.stv.tv/video/4gwd/emmerdale/60-seconds-on-set-with-laura-norton/',
        'md5': '5adf9439c31d554f8be0707c7abe7e0a',
        'info_dict': {
@@ -27,7 +31,11 @@ class STVPlayerIE(InfoExtractor):
            'uploader_id': '1486976045',
        },
        'skip': 'this resource is unavailable outside of the UK',
    }
    }, {
        # episodes
        'url': 'https://player.stv.tv/episode/4125/jennifer-saunders-memory-lane',
        'only_matching': True,
    }]
    BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1486976045/default_default/index.html?videoId=%s'
    _PTYPE_MAP = {
        'episode': 'episodes',
@@ -36,11 +44,31 @@ class STVPlayerIE(InfoExtractor):

    def _real_extract(self, url):
        ptype, video_id = re.match(self._VALID_URL, url).groups()
        resp = self._download_json(
            'https://player.api.stv.tv/v1/%s/%s' % (self._PTYPE_MAP[ptype], video_id),
            video_id)

        result = resp['results']
        webpage = self._download_webpage(url, video_id, fatal=False) or ''
        props = (self._parse_json(self._search_regex(
            r'<script[^>]+id="__NEXT_DATA__"[^>]*>({.+?})</script>',
            webpage, 'next data', default='{}'), video_id,
            fatal=False) or {}).get('props') or {}
        player_api_cache = try_get(
            props, lambda x: x['initialReduxState']['playerApiCache']) or {}

        api_path, resp = None, {}
        for k, v in player_api_cache.items():
            if k.startswith('/episodes/') or k.startswith('/shortform/'):
                api_path, resp = k, v
                break
        else:
            episode_id = str_or_none(try_get(
                props, lambda x: x['pageProps']['episodeId']))
            api_path = '/%s/%s' % (self._PTYPE_MAP[ptype], episode_id or video_id)

        result = resp.get('results')
        if not result:
            resp = self._download_json(
                'https://player.api.stv.tv/v1' + api_path, video_id)
            result = resp['results']

        video = result['video']
        video_id = compat_str(video['id'])
@@ -57,7 +85,7 @@ def _real_extract(self, url):
        return {
            '_type': 'url_transparent',
            'id': video_id,
            'url': self.BRIGHTCOVE_URL_TEMPLATE % video_id,
            'url': smuggle_url(self.BRIGHTCOVE_URL_TEMPLATE % video_id, {'geo_countries': ['GB']}),
            'description': result.get('summary'),
            'duration': float_or_none(video.get('length'), 1000),
            'subtitles': subtitles,
@@ -9,7 +9,6 @@

from .common import InfoExtractor
from ..compat import (
    compat_kwargs,
    compat_parse_qs,
    compat_str,
    compat_urlparse,
@@ -42,30 +41,16 @@ class TwitchBaseIE(InfoExtractor):
    _CLIENT_ID = 'kimne78kx3ncx6brgo4mv6wki5h1ko'
    _NETRC_MACHINE = 'twitch'

    def _handle_error(self, response):
        if not isinstance(response, dict):
            return
        error = response.get('error')
        if error:
            raise ExtractorError(
                '%s returned error: %s - %s' % (self.IE_NAME, error, response.get('message')),
                expected=True)

    def _call_api(self, path, item_id, *args, **kwargs):
        headers = kwargs.get('headers', {}).copy()
        headers.update({
            'Accept': 'application/vnd.twitchtv.v5+json; charset=UTF-8',
            'Client-ID': self._CLIENT_ID,
        })
        kwargs.update({
            'headers': headers,
            'expected_status': (400, 410),
        })
        response = self._download_json(
            '%s/%s' % (self._API_BASE, path), item_id,
            *args, **compat_kwargs(kwargs))
        self._handle_error(response)
        return response
    _OPERATION_HASHES = {
        'CollectionSideBar': '27111f1b382effad0b6def325caef1909c733fe6a4fbabf54f8d491ef2cf2f14',
        'FilterableVideoTower_Videos': 'a937f1d22e269e39a03b509f65a7490f9fc247d7f83d6ac1421523e3b68042cb',
        'ClipsCards__User': 'b73ad2bfaecfd30a9e6c28fada15bd97032c83ec77a0440766a56fe0bd632777',
        'ChannelCollectionsContent': '07e3691a1bad77a36aba590c351180439a40baefc1c275356f40fc7082419a84',
        'StreamMetadata': '1c719a40e481453e5c48d9bb585d971b8b372f8ebb105b17076722264dfa5b3e',
        'ComscoreStreamingQuery': 'e1edae8122517d013405f237ffcc124515dc6ded82480a88daef69c83b53ac01',
        'VideoPreviewOverlay': '3006e77e51b128d838fa4e835723ca4dc9a05c5efd4466c1085215c6e437e65c',
        'VideoMetadata': '226edb3e692509f727fd56821f5653c05740242c82b0388883e0c0e75dcbf687',
    }

    def _real_initialize(self):
        self._login()
@@ -151,13 +136,46 @@ def _prefer_source(self, formats):
                    })
        self._sort_formats(formats)

    def _download_access_token(self, channel_name):
        return self._call_api(
            'api/channels/%s/access_token' % channel_name, channel_name,
            'Downloading access token JSON')
    def _download_base_gql(self, video_id, ops, note, fatal=True):
        return self._download_json(
            'https://gql.twitch.tv/gql', video_id, note,
            data=json.dumps(ops).encode(),
            headers={
                'Content-Type': 'text/plain;charset=UTF-8',
                'Client-ID': self._CLIENT_ID,
            }, fatal=fatal)

    def _extract_channel_id(self, token, channel_name):
        return compat_str(self._parse_json(token, channel_name)['channel_id'])
    def _download_gql(self, video_id, ops, note, fatal=True):
        for op in ops:
            op['extensions'] = {
                'persistedQuery': {
                    'version': 1,
                    'sha256Hash': self._OPERATION_HASHES[op['operationName']],
                }
            }
        return self._download_base_gql(video_id, ops, note)

    def _download_access_token(self, video_id, token_kind, param_name):
        method = '%sPlaybackAccessToken' % token_kind
        ops = {
            'query': '''{
              %s(
                %s: "%s",
                params: {
                  platform: "web",
                  playerBackend: "mediaplayer",
                  playerType: "site"
                }
              )
              {
                value
                signature
              }
            }''' % (method, param_name, video_id),
        }
        return self._download_base_gql(
            video_id, ops,
            'Downloading %s access token GraphQL' % token_kind)['data'][method]


class TwitchVodIE(TwitchBaseIE):
@@ -170,8 +188,6 @@ class TwitchVodIE(TwitchBaseIE):
                        )
                        (?P<id>\d+)
                    '''
    _ITEM_TYPE = 'vod'
    _ITEM_SHORTCUT = 'v'

    _TESTS = [{
        'url': 'http://www.twitch.tv/riotgames/v/6528877?t=5m10s',
@@ -181,7 +197,7 @@ class TwitchVodIE(TwitchBaseIE):
            'title': 'LCK Summer Split - Week 6 Day 1',
            'thumbnail': r're:^https?://.*\.jpg$',
            'duration': 17208,
            'timestamp': 1435131709,
            'timestamp': 1435131734,
            'upload_date': '20150624',
            'uploader': 'Riot Games',
            'uploader_id': 'riotgames',
@@ -230,10 +246,20 @@ class TwitchVodIE(TwitchBaseIE):
    }]

    def _download_info(self, item_id):
        return self._extract_info(
            self._call_api(
                'kraken/videos/%s' % item_id, item_id,
                'Downloading video info JSON'))
        data = self._download_gql(
            item_id, [{
                'operationName': 'VideoMetadata',
                'variables': {
                    'channelLogin': '',
                    'videoID': item_id,
                },
            }],
            'Downloading stream metadata GraphQL')[0]['data']
        video = data.get('video')
        if video is None:
            raise ExtractorError(
                'Video %s does not exist' % item_id, expected=True)
        return self._extract_info_gql(video, item_id)

    @staticmethod
    def _extract_info(info):
@@ -272,13 +298,33 @@ def _extract_info(info):
            'is_live': is_live,
        }

    @staticmethod
    def _extract_info_gql(info, item_id):
        vod_id = info.get('id') or item_id
        # id backward compatibility for download archives
        if vod_id[0] != 'v':
            vod_id = 'v%s' % vod_id
        thumbnail = url_or_none(info.get('previewThumbnailURL'))
        if thumbnail:
            for p in ('width', 'height'):
                thumbnail = thumbnail.replace('{%s}' % p, '0')
        return {
            'id': vod_id,
            'title': info.get('title') or 'Untitled Broadcast',
            'description': info.get('description'),
            'duration': int_or_none(info.get('lengthSeconds')),
            'thumbnail': thumbnail,
            'uploader': try_get(info, lambda x: x['owner']['displayName'], compat_str),
            'uploader_id': try_get(info, lambda x: x['owner']['login'], compat_str),
            'timestamp': unified_timestamp(info.get('publishedAt')),
            'view_count': int_or_none(info.get('viewCount')),
        }

    def _real_extract(self, url):
        vod_id = self._match_id(url)

        info = self._download_info(vod_id)
        access_token = self._call_api(
            'api/vods/%s/access_token' % vod_id, vod_id,
            'Downloading %s access token' % self._ITEM_TYPE)
        access_token = self._download_access_token(vod_id, 'video', 'id')

        formats = self._extract_m3u8_formats(
            '%s/vod/%s.m3u8?%s' % (
@@ -289,8 +335,8 @@ def _real_extract(self, url):
                    'allow_spectre': 'true',
                    'player': 'twitchweb',
                    'playlist_include_framerate': 'true',
                    'nauth': access_token['token'],
                    'nauthsig': access_token['sig'],
                    'nauth': access_token['value'],
                    'nauthsig': access_token['signature'],
                })),
            vod_id, 'mp4', entry_protocol='m3u8_native')
@@ -333,37 +379,7 @@ def _make_video_result(node):
    }


class TwitchGraphQLBaseIE(TwitchBaseIE):
    _PAGE_LIMIT = 100

    _OPERATION_HASHES = {
        'CollectionSideBar': '27111f1b382effad0b6def325caef1909c733fe6a4fbabf54f8d491ef2cf2f14',
        'FilterableVideoTower_Videos': 'a937f1d22e269e39a03b509f65a7490f9fc247d7f83d6ac1421523e3b68042cb',
        'ClipsCards__User': 'b73ad2bfaecfd30a9e6c28fada15bd97032c83ec77a0440766a56fe0bd632777',
        'ChannelCollectionsContent': '07e3691a1bad77a36aba590c351180439a40baefc1c275356f40fc7082419a84',
        'StreamMetadata': '1c719a40e481453e5c48d9bb585d971b8b372f8ebb105b17076722264dfa5b3e',
        'ComscoreStreamingQuery': 'e1edae8122517d013405f237ffcc124515dc6ded82480a88daef69c83b53ac01',
        'VideoPreviewOverlay': '3006e77e51b128d838fa4e835723ca4dc9a05c5efd4466c1085215c6e437e65c',
    }

    def _download_gql(self, video_id, ops, note, fatal=True):
        for op in ops:
            op['extensions'] = {
                'persistedQuery': {
                    'version': 1,
                    'sha256Hash': self._OPERATION_HASHES[op['operationName']],
                }
            }
        return self._download_json(
            'https://gql.twitch.tv/gql', video_id, note,
            data=json.dumps(ops).encode(),
            headers={
                'Content-Type': 'text/plain;charset=UTF-8',
                'Client-ID': self._CLIENT_ID,
            }, fatal=fatal)


class TwitchCollectionIE(TwitchGraphQLBaseIE):
class TwitchCollectionIE(TwitchBaseIE):
    _VALID_URL = r'https?://(?:(?:www|go|m)\.)?twitch\.tv/collections/(?P<id>[^/]+)'

    _TESTS = [{
@@ -400,7 +416,9 @@ def _real_extract(self, url):
            entries, playlist_id=collection_id, playlist_title=title)


class TwitchPlaylistBaseIE(TwitchGraphQLBaseIE):
class TwitchPlaylistBaseIE(TwitchBaseIE):
    _PAGE_LIMIT = 100

    def _entries(self, channel_name, *args):
        cursor = None
        variables_common = self._make_variables(channel_name, *args)
@@ -440,49 +458,6 @@ def _entries(self, channel_name, *args):
            if not cursor or not isinstance(cursor, compat_str):
                break

    # Deprecated kraken v5 API
    def _entries_kraken(self, channel_name, broadcast_type, sort):
        access_token = self._download_access_token(channel_name)
        channel_id = self._extract_channel_id(access_token['token'], channel_name)
        offset = 0
        counter_override = None
        for counter in itertools.count(1):
            response = self._call_api(
                'kraken/channels/%s/videos/' % channel_id,
                channel_id,
                'Downloading video JSON page %s' % (counter_override or counter),
                query={
                    'offset': offset,
                    'limit': self._PAGE_LIMIT,
                    'broadcast_type': broadcast_type,
                    'sort': sort,
                })
            videos = response.get('videos')
            if not isinstance(videos, list):
                break
            for video in videos:
                if not isinstance(video, dict):
                    continue
                video_url = url_or_none(video.get('url'))
                if not video_url:
                    continue
                yield {
                    '_type': 'url_transparent',
                    'ie_key': TwitchVodIE.ie_key(),
                    'id': video.get('_id'),
                    'url': video_url,
                    'title': video.get('title'),
                    'description': video.get('description'),
                    'timestamp': unified_timestamp(video.get('published_at')),
                    'duration': float_or_none(video.get('length')),
                    'view_count': int_or_none(video.get('views')),
                    'language': video.get('language'),
                }
            offset += self._PAGE_LIMIT
            total = int_or_none(response.get('_total'))
            if total and offset >= total:
                break


class TwitchVideosIE(TwitchPlaylistBaseIE):
    _VALID_URL = r'https?://(?:(?:www|go|m)\.)?twitch\.tv/(?P<id>[^/]+)/(?:videos|profile)'
@@ -724,7 +699,7 @@ def _real_extract(self, url):
            playlist_title='%s - Collections' % channel_name)


class TwitchStreamIE(TwitchGraphQLBaseIE):
class TwitchStreamIE(TwitchBaseIE):
    IE_NAME = 'twitch:stream'
    _VALID_URL = r'''(?x)
                    https?://
@@ -814,8 +789,9 @@ def _real_extract(self, url):
        if not stream:
            raise ExtractorError('%s is offline' % channel_name, expected=True)

        access_token = self._download_access_token(channel_name)
        token = access_token['token']
        access_token = self._download_access_token(
            channel_name, 'stream', 'channelName')
        token = access_token['value']

        stream_id = stream.get('id') or channel_name
        query = {
@@ -826,7 +802,7 @@ def _real_extract(self, url):
            'player': 'twitchweb',
            'playlist_include_framerate': 'true',
            'segment_preference': '4',
            'sig': access_token['sig'].encode('utf-8'),
            'sig': access_token['signature'].encode('utf-8'),
            'token': token.encode('utf-8'),
        }
        formats = self._extract_m3u8_formats(
@@ -912,8 +888,8 @@ class TwitchClipsIE(TwitchBaseIE):
    def _real_extract(self, url):
        video_id = self._match_id(url)

        clip = self._download_json(
            'https://gql.twitch.tv/gql', video_id, data=json.dumps({
        clip = self._download_base_gql(
            video_id, {
                'query': '''{
  clip(slug: "%s") {
    broadcaster {
@@ -937,10 +913,7 @@ def _real_extract(self, url):
    }
    viewCount
  }
}''' % video_id,
            }).encode(), headers={
                'Client-ID': self._CLIENT_ID,
            })['data']['clip']
}''' % video_id}, 'Downloading clip GraphQL')['data']['clip']

        if not clip:
            raise ExtractorError(
@@ -251,10 +251,10 @@ class TwitterIE(TwitterBaseIE):
        'info_dict': {
            'id': '700207533655363584',
            'ext': 'mp4',
            'title': 'simon vetugo - BEAT PROD: @suhmeduh #Damndaniel',
            'title': 'simon vertugo - BEAT PROD: @suhmeduh #Damndaniel',
            'description': 'BEAT PROD: @suhmeduh https://t.co/HBrQ4AfpvZ #Damndaniel https://t.co/byBooq2ejZ',
            'thumbnail': r're:^https?://.*\.jpg',
            'uploader': 'simon vetugo',
            'uploader': 'simon vertugo',
            'uploader_id': 'simonvertugo',
            'duration': 30.0,
            'timestamp': 1455777459,
@@ -312,6 +312,7 @@ class TwitterIE(TwitterBaseIE):
|
||||
'timestamp': 1492000653,
|
||||
'upload_date': '20170412',
|
||||
},
|
||||
'skip': 'Account suspended',
|
||||
}, {
|
||||
'url': 'https://twitter.com/i/web/status/910031516746514432',
|
||||
'info_dict': {
|
||||
@@ -380,6 +381,14 @@ class TwitterIE(TwitterBaseIE):
|
||||
# promo_video_website card
|
||||
'url': 'https://twitter.com/GunB1g/status/1163218564784017422',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# promo_video_convo card
|
||||
'url': 'https://twitter.com/poco_dandy/status/1047395834013384704',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
# appplayer card
|
||||
'url': 'https://twitter.com/poco_dandy/status/1150646424461176832',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
@@ -462,7 +471,30 @@ def get_binding_value(k):
|
||||
return try_get(o, lambda x: x[x['type'].lower() + '_value'])
|
||||
|
||||
card_name = card['name'].split(':')[-1]
|
||||
if card_name in ('amplify', 'promo_video_website'):
|
||||
if card_name == 'player':
|
||||
info.update({
|
||||
'_type': 'url',
|
||||
'url': get_binding_value('player_url'),
|
||||
})
|
||||
elif card_name == 'periscope_broadcast':
|
||||
info.update({
|
||||
'_type': 'url',
|
||||
'url': get_binding_value('url') or get_binding_value('player_url'),
|
||||
'ie_key': PeriscopeIE.ie_key(),
|
||||
})
|
||||
elif card_name == 'broadcast':
|
||||
info.update({
|
||||
'_type': 'url',
|
||||
'url': get_binding_value('broadcast_url'),
|
||||
'ie_key': TwitterBroadcastIE.ie_key(),
|
||||
})
|
||||
elif card_name == 'summary':
|
||||
info.update({
|
||||
'_type': 'url',
|
||||
'url': get_binding_value('card_url'),
|
||||
})
|
||||
# amplify, promo_video_website, promo_video_convo, appplayer, ...
|
||||
else:
|
||||
is_amplify = card_name == 'amplify'
|
||||
vmap_url = get_binding_value('amplify_url_vmap') if is_amplify else get_binding_value('player_stream_url')
|
||||
content_id = get_binding_value('%s_content_id' % (card_name if is_amplify else 'player'))
|
||||
@@ -488,25 +520,6 @@ def get_binding_value(k):
|
||||
'duration': int_or_none(get_binding_value(
|
||||
'content_duration_seconds')),
|
||||
})
|
||||
elif card_name == 'player':
|
||||
info.update({
|
||||
'_type': 'url',
|
||||
'url': get_binding_value('player_url'),
|
||||
})
|
||||
elif card_name == 'periscope_broadcast':
|
||||
info.update({
|
||||
'_type': 'url',
|
||||
'url': get_binding_value('url') or get_binding_value('player_url'),
|
||||
'ie_key': PeriscopeIE.ie_key(),
|
||||
})
|
||||
elif card_name == 'broadcast':
|
||||
info.update({
|
||||
'_type': 'url',
|
||||
'url': get_binding_value('broadcast_url'),
|
||||
'ie_key': TwitterBroadcastIE.ie_key(),
|
||||
})
|
||||
else:
|
||||
raise ExtractorError('Unsupported Twitter Card.')
|
||||
else:
|
||||
expanded_url = try_get(status, lambda x: x['entities']['urls'][0]['expanded_url'])
|
||||
if not expanded_url:
|
||||
|
||||
@@ -45,6 +45,7 @@ def aa_decode(aa_code):
|
||||
|
||||
class XFileShareIE(InfoExtractor):
|
||||
_SITES = (
|
||||
(r'aparat\.cam', 'Aparat'),
|
||||
(r'clipwatching\.com', 'ClipWatching'),
|
||||
(r'gounlimited\.to', 'GoUnlimited'),
|
||||
(r'govid\.me', 'GoVid'),
|
||||
@@ -78,6 +79,9 @@ class XFileShareIE(InfoExtractor):
|
||||
'title': 'sample',
|
||||
'thumbnail': r're:http://.*\.jpg',
|
||||
},
|
||||
}, {
|
||||
'url': 'https://aparat.cam/n4d6dh0wvlpr',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
@staticmethod
|
||||
|
||||
@@ -1686,11 +1686,12 @@ def extract_embedded_config(embed_webpage, video_id):
|
||||
if embedded_config:
|
||||
return embedded_config
|
||||
|
||||
video_info = {}
|
||||
player_response = {}
|
||||
ytplayer_config = None
|
||||
embed_webpage = None
|
||||
|
||||
# Get video info
|
||||
video_info = {}
|
||||
embed_webpage = None
|
||||
if (self._og_search_property('restrictions:age', video_webpage, default=None) == '18+'
|
||||
or re.search(r'player-age-gate-content">', video_webpage) is not None):
|
||||
cookie_keys = self._get_cookies('https://www.youtube.com').keys()
|
||||
@@ -1816,6 +1817,9 @@ def extract_unavailable_message():
|
||||
if not isinstance(video_info, dict):
|
||||
video_info = {}
|
||||
|
||||
playable_in_embed = try_get(
|
||||
player_response, lambda x: x['playabilityStatus']['playableInEmbed'])
|
||||
|
||||
video_details = try_get(
|
||||
player_response, lambda x: x['videoDetails'], dict) or {}
|
||||
|
||||
@@ -2537,6 +2541,7 @@ def decrypt_sig(mobj):
|
||||
'release_date': release_date,
|
||||
'release_year': release_year,
|
||||
'subscriber_count': subscriber_count,
|
||||
'playable_in_embed': playable_in_embed,
|
||||
}
|
||||
|
||||
|
||||
@@ -3619,8 +3624,8 @@ def _entries(self, query, n):
|
||||
description = try_get(video, lambda x: x['descriptionSnippet']['runs'][0]['text'], compat_str)
|
||||
duration = parse_duration(try_get(video, lambda x: x['lengthText']['simpleText'], compat_str))
|
||||
view_count_text = try_get(video, lambda x: x['viewCountText']['simpleText'], compat_str) or ''
|
||||
view_count = int_or_none(self._search_regex(
|
||||
r'^(\d+)', re.sub(r'\s', '', view_count_text),
|
||||
view_count = str_to_int(self._search_regex(
|
||||
r'^([\d,]+)', re.sub(r'\s', '', view_count_text),
|
||||
'view count', default=None))
|
||||
uploader = try_get(video, lambda x: x['ownerText']['runs'][0]['text'], compat_str)
|
||||
total += 1
|
||||
|
||||
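The hunk above widens the view-count regex from `^(\d+)` to `^([\d,]+)` and swaps `int_or_none` for `str_to_int`, presumably so that counts with thousands separators are no longer truncated at the first comma. A minimal standalone sketch of that parsing step follows; `parse_view_count` is a hypothetical helper written for illustration and is not part of the codebase, and it only approximates what `str_to_int` does with commas.

```python
import re


def parse_view_count(text):
    """Hypothetical stand-in for str_to_int(self._search_regex(...))."""
    # Drop all whitespace first, as the extractor does with re.sub(r'\s', '', ...)
    compact = re.sub(r'\s', '', text or '')
    # Accept digits and thousands separators at the start of the string
    m = re.match(r'^([\d,]+)', compact)
    if not m:
        return None
    digits = m.group(1).replace(',', '')
    # Guard against a match that contained only commas
    return int(digits) if digits else None


print(parse_view_count('1,234,567 views'))
print(parse_view_count('No views'))
```

With the older `^(\d+)` pattern the first call would have captured only the leading `1`.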
@@ -466,7 +466,7 @@ def _comma_separated_values_options_callback(option, opt_str, value, parser):
|
||||
video_format.add_option(
|
||||
'--prefer-free-formats',
|
||||
action='store_true', dest='prefer_free_formats', default=False,
|
||||
help='Prefer free video formats unless a specific one is requested')
|
||||
help='Prefer free video formats over non-free formats of same quality')
|
||||
video_format.add_option(
|
||||
'-F', '--list-formats',
|
||||
action='store_true', dest='listformats',
|
||||
|
||||
@@ -37,7 +37,25 @@ def __init__(self, downloader=None):
|
||||
self.PP_NAME = self.__class__.__name__[:-2]
|
||||
|
||||
def to_screen(self, text, *args, **kwargs):
|
||||
return self._downloader.to_screen('[%s] %s' % (self.PP_NAME, text), *args, **kwargs)
|
||||
if self._downloader:
|
||||
return self._downloader.to_screen('[%s] %s' % (self.PP_NAME, text), *args, **kwargs)
|
||||
|
||||
def report_warning(self, text, *args, **kwargs):
|
||||
if self._downloader:
|
||||
return self._downloader.report_warning(text, *args, **kwargs)
|
||||
|
||||
def report_error(self, text, *args, **kwargs):
|
||||
if self._downloader:
|
||||
return self._downloader.report_error(text, *args, **kwargs)
|
||||
|
||||
def write_debug(self, text, *args, **kwargs):
|
||||
if self.get_param('verbose', False):
|
||||
return self._downloader.to_screen('[debug] %s' % text, *args, **kwargs)
|
||||
|
||||
def get_param(self, name, default=None, *args, **kwargs):
|
||||
if self._downloader:
|
||||
return self._downloader.params.get(name, default, *args, **kwargs)
|
||||
return default
|
||||
|
||||
def set_downloader(self, downloader):
|
||||
"""Sets the downloader for this PP."""
|
||||
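The new `to_screen`, `report_warning`, `report_error`, `write_debug` and `get_param` wrappers in this hunk all follow one pattern: delegate to the downloader when one is attached, and otherwise do nothing or return the default. A rough sketch of that pattern, assuming a simplified `FakeDownloader` (a hypothetical test double, not the real downloader class) and a made-up `DemoPP` subclass:

```python
class FakeDownloader:
    """Hypothetical stand-in for the downloader; records messages in a list."""

    def __init__(self, params=None):
        self.params = params or {}
        self.messages = []

    def to_screen(self, text):
        self.messages.append(text)

    def report_warning(self, text):
        self.messages.append('WARNING: ' + text)


class PostProcessor:
    def __init__(self, downloader=None):
        self._downloader = downloader
        # Strip the trailing 'PP' from the class name, as in the diff
        self.PP_NAME = self.__class__.__name__[:-2]

    def to_screen(self, text):
        if self._downloader:
            return self._downloader.to_screen('[%s] %s' % (self.PP_NAME, text))

    def report_warning(self, text):
        if self._downloader:
            return self._downloader.report_warning(text)

    def get_param(self, name, default=None):
        if self._downloader:
            return self._downloader.params.get(name, default)
        return default

    def write_debug(self, text):
        # Safe without a downloader: get_param then returns the default (False)
        if self.get_param('verbose', False):
            return self._downloader.to_screen('[debug] %s' % text)


class DemoPP(PostProcessor):
    pass


detached = DemoPP()            # no downloader attached
detached.report_warning('x')   # silently ignored instead of raising AttributeError

attached = DemoPP(FakeDownloader({'verbose': True}))
attached.to_screen('hello')
attached.write_debug('some command line')
```

The design choice is that call sites such as `self.report_warning(...)` no longer need their own `if self._downloader:` checks, which is what the later hunks in this comparison remove.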
@@ -64,10 +82,10 @@ def try_utime(self, path, atime, mtime, errnote='Cannot update utime of file'):
|
||||
try:
|
||||
os.utime(encodeFilename(path), (atime, mtime))
|
||||
except Exception:
|
||||
self._downloader.report_warning(errnote)
|
||||
self.report_warning(errnote)
|
||||
|
||||
def _configuration_args(self, default=[]):
|
||||
args = self._downloader.params.get('postprocessor_args', {})
|
||||
args = self.get_param('postprocessor_args', {})
|
||||
if isinstance(args, list): # for backward compatibility
|
||||
args = {'default': args, 'sponskrub': []}
|
||||
return cli_configuration_args(args, self.PP_NAME.lower(), args.get('default', []))
|
||||
|
||||
@@ -14,7 +14,8 @@
|
||||
PostProcessingError,
|
||||
prepend_extension,
|
||||
replace_extension,
|
||||
shell_quote
|
||||
shell_quote,
|
||||
process_communicate_or_kill,
|
||||
)
|
||||
|
||||
|
||||
@@ -40,8 +41,7 @@ def run(self, info):
|
||||
thumbnail_filename = info['thumbnails'][-1]['filename']
|
||||
|
||||
if not os.path.exists(encodeFilename(thumbnail_filename)):
|
||||
self._downloader.report_warning(
|
||||
'Skipping embedding the thumbnail because the file is missing.')
|
||||
self.report_warning('Skipping embedding the thumbnail because the file is missing.')
|
||||
return [], info
|
||||
|
||||
def is_webp(path):
|
||||
@@ -124,11 +124,10 @@ def is_webp(path):
|
||||
|
||||
self.to_screen('Adding thumbnail to "%s"' % filename)
|
||||
|
||||
if self._downloader.params.get('verbose', False):
|
||||
self._downloader.to_screen('[debug] AtomicParsley command line: %s' % shell_quote(cmd))
|
||||
self.write_debug('AtomicParsley command line: %s' % shell_quote(cmd))
|
||||
|
||||
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
|
||||
stdout, stderr = p.communicate()
|
||||
stdout, stderr = process_communicate_or_kill(p)
|
||||
|
||||
if p.returncode != 0:
|
||||
msg = stderr.decode('utf-8', 'replace').strip()
|
||||
@@ -139,7 +138,7 @@ def is_webp(path):
|
||||
# for formats that don't support thumbnails (like 3gp) AtomicParsley
|
||||
# won't create the temporary file
|
||||
if b'No changes' in stdout:
|
||||
self._downloader.report_warning('The file format doesn\'t support embedding a thumbnail')
|
||||
self.report_warning('The file format doesn\'t support embedding a thumbnail')
|
||||
else:
|
||||
os.remove(encodeFilename(filename))
|
||||
os.rename(encodeFilename(temp_filename), encodeFilename(filename))
|
||||
|
||||
@@ -21,6 +21,7 @@
|
||||
dfxp2srt,
|
||||
ISO639Utils,
|
||||
replace_extension,
|
||||
process_communicate_or_kill,
|
||||
)
|
||||
|
||||
|
||||
@@ -67,8 +68,7 @@ def check_version(self):
|
||||
self._versions[self.basename], required_version):
|
||||
warning = 'Your copy of %s is outdated, update %s to version %s or newer if you encounter any errors.' % (
|
||||
self.basename, self.basename, required_version)
|
||||
if self._downloader:
|
||||
self._downloader.report_warning(warning)
|
||||
self.report_warning(warning)
|
||||
|
||||
@staticmethod
|
||||
def get_versions(downloader=None):
|
||||
@@ -98,11 +98,11 @@ def get_ffmpeg_version(path):
|
||||
self._paths = None
|
||||
self._versions = None
|
||||
if self._downloader:
|
||||
prefer_ffmpeg = self._downloader.params.get('prefer_ffmpeg', True)
|
||||
location = self._downloader.params.get('ffmpeg_location')
|
||||
prefer_ffmpeg = self.get_param('prefer_ffmpeg', True)
|
||||
location = self.get_param('ffmpeg_location')
|
||||
if location is not None:
|
||||
if not os.path.exists(location):
|
||||
self._downloader.report_warning(
|
||||
self.report_warning(
|
||||
'ffmpeg-location %s does not exist! '
|
||||
'Continuing without avconv/ffmpeg.' % (location))
|
||||
self._versions = {}
|
||||
@@ -110,7 +110,7 @@ def get_ffmpeg_version(path):
|
||||
elif not os.path.isdir(location):
|
||||
basename = os.path.splitext(os.path.basename(location))[0]
|
||||
if basename not in programs:
|
||||
self._downloader.report_warning(
|
||||
self.report_warning(
|
||||
'Cannot identify executable %s, its basename should be one of %s. '
|
||||
'Continuing without avconv/ffmpeg.' %
|
||||
(location, ', '.join(programs)))
|
||||
@@ -176,13 +176,11 @@ def get_audio_codec(self, path):
|
||||
encodeFilename(self.executable, True),
|
||||
encodeArgument('-i')]
|
||||
cmd.append(encodeFilename(self._ffmpeg_filename_argument(path), True))
|
||||
if self._downloader.params.get('verbose', False):
|
||||
self._downloader.to_screen(
|
||||
'[debug] %s command line: %s' % (self.basename, shell_quote(cmd)))
|
||||
self.write_debug('%s command line: %s' % (self.basename, shell_quote(cmd)))
|
||||
handle = subprocess.Popen(
|
||||
cmd, stderr=subprocess.PIPE,
|
||||
stdout=subprocess.PIPE, stdin=subprocess.PIPE)
|
||||
stdout_data, stderr_data = handle.communicate()
|
||||
stdout_data, stderr_data = process_communicate_or_kill(handle)
|
||||
expected_ret = 0 if self.probe_available else 1
|
||||
if handle.wait() != expected_ret:
|
||||
return None
|
||||
@@ -227,10 +225,9 @@ def run_ffmpeg_multiple_files(self, input_paths, out_path, opts):
|
||||
+ [encodeArgument(o) for o in opts]
|
||||
+ [encodeFilename(self._ffmpeg_filename_argument(out_path), True)])
|
||||
|
||||
if self._downloader.params.get('verbose', False):
|
||||
self._downloader.to_screen('[debug] ffmpeg command line: %s' % shell_quote(cmd))
|
||||
self.write_debug('ffmpeg command line: %s' % shell_quote(cmd))
|
||||
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
|
||||
stdout, stderr = p.communicate()
|
||||
stdout, stderr = process_communicate_or_kill(p)
|
||||
if p.returncode != 0:
|
||||
stderr = stderr.decode('utf-8', 'replace')
|
||||
msg = stderr.strip().split('\n')[-1]
|
||||
@@ -565,8 +562,7 @@ def can_merge(self):
|
||||
'youtube-dlc will download single file media. '
|
||||
'Update %s to version %s or newer to fix this.') % (
|
||||
self.basename, self.basename, required_version)
|
||||
if self._downloader:
|
||||
self._downloader.report_warning(warning)
|
||||
self.report_warning(warning)
|
||||
return False
|
||||
return True
|
||||
|
||||
@@ -655,7 +651,7 @@ def run(self, info):
|
||||
new_file = subtitles_filename(filename, lang, new_ext, info.get('ext'))
|
||||
|
||||
if ext in ('dfxp', 'ttml', 'tt'):
|
||||
self._downloader.report_warning(
|
||||
self.report_warning(
|
||||
'You have requested to convert dfxp (TTML) subtitles into another format, '
|
||||
'which results in style information loss')
|
||||
|
||||
|
||||
@@ -46,16 +46,16 @@ def run(self, information):
|
||||
self.to_screen('Skipping sponskrub since it is not a YouTube video')
|
||||
return [], information
|
||||
if self.cutout and not self.force and not information.get('__real_download', False):
|
||||
self._downloader.to_screen(
|
||||
'[sponskrub] Skipping sponskrub since the video was already downloaded. '
|
||||
self.report_warning(
|
||||
'Skipping sponskrub since the video was already downloaded. '
|
||||
'Use --sponskrub-force to run sponskrub anyway')
|
||||
return [], information
|
||||
|
||||
self.to_screen('Trying to %s sponsor sections' % ('remove' if self.cutout else 'mark'))
|
||||
if self.cutout:
|
||||
self._downloader.to_screen('WARNING: Cutting out sponsor segments will cause the subtitles to go out of sync.')
|
||||
self.report_warning('Cutting out sponsor segments will cause the subtitles to go out of sync.')
|
||||
if not information.get('__real_download', False):
|
||||
self._downloader.to_screen('WARNING: If sponskrub is run multiple times, unintended parts of the video could be cut out.')
|
||||
self.report_warning('If sponskrub is run multiple times, unintended parts of the video could be cut out.')
|
||||
|
||||
filename = information['filepath']
|
||||
temp_filename = filename + '.' + self._temp_ext + os.path.splitext(filename)[1]
|
||||
@@ -68,8 +68,7 @@ def run(self, information):
|
||||
cmd += ['--', information['id'], filename, temp_filename]
|
||||
cmd = [encodeArgument(i) for i in cmd]
|
||||
|
||||
if self._downloader.params.get('verbose', False):
|
||||
self._downloader.to_screen('[debug] sponskrub command line: %s' % shell_quote(cmd))
|
||||
self.write_debug('sponskrub command line: %s' % shell_quote(cmd))
|
||||
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
|
||||
stdout, stderr = p.communicate()
|
||||
|
||||
|
||||
@@ -57,16 +57,16 @@ def run(self, info):
|
||||
return [], info
|
||||
|
||||
except XAttrUnavailableError as e:
|
||||
self._downloader.report_error(str(e))
|
||||
self.report_error(str(e))
|
||||
return [], info
|
||||
|
||||
except XAttrMetadataError as e:
|
||||
if e.reason == 'NO_SPACE':
|
||||
self._downloader.report_warning(
|
||||
self.report_warning(
|
||||
'There\'s no disk space left, disk quota exceeded or filesystem xattr limit exceeded. '
|
||||
+ (('Some ' if num_written else '') + 'extended attributes are not written.').capitalize())
|
||||
elif e.reason == 'VALUE_TOO_LONG':
|
||||
self._downloader.report_warning(
|
||||
self.report_warning(
|
||||
'Unable to write extended attributes due to too long values.')
|
||||
else:
|
||||
msg = 'This filesystem doesn\'t support extended attributes. '
|
||||
@@ -74,5 +74,5 @@ def run(self, info):
|
||||
msg += 'You need to use NTFS.'
|
||||
else:
|
||||
msg += '(You may have to enable them in your /etc/fstab)'
|
||||
self._downloader.report_error(msg)
|
||||
self.report_error(msg)
|
||||
return [], info
|
||||
|
||||
@@ -2215,6 +2215,15 @@ def unescapeHTML(s):
|
||||
r'&([^&;]+;)', lambda m: _htmlentity_transform(m.group(1)), s)
|
||||
|
||||
|
||||
def process_communicate_or_kill(p, *args, **kwargs):
|
||||
try:
|
||||
return p.communicate(*args, **kwargs)
|
||||
except BaseException: # Including KeyboardInterrupt
|
||||
p.kill()
|
||||
p.wait()
|
||||
raise
|
||||
|
||||
|
||||
def get_subprocess_encoding():
|
||||
if sys.platform == 'win32' and sys.getwindowsversion()[0] >= 5:
|
||||
# For subprocess calls, encode with locale encoding
|
||||
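`process_communicate_or_kill`, added in the hunk above, wraps `Popen.communicate` so that any exception, including `KeyboardInterrupt`, kills and reaps the child before propagating. The sketch below repeats the helper and exercises it through `run_and_capture`, a hypothetical wrapper introduced only for this example; the `timeout` argument assumes Python 3.

```python
import subprocess
import sys


def process_communicate_or_kill(p, *args, **kwargs):
    # Same shape as the helper added in the diff
    try:
        return p.communicate(*args, **kwargs)
    except BaseException:  # Including KeyboardInterrupt
        p.kill()
        p.wait()
        raise


def run_and_capture(cmd, timeout=None):
    """Hypothetical wrapper: run cmd and return (returncode, stdout, stderr)."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process_communicate_or_kill(p, timeout=timeout)
    return p.returncode, stdout, stderr


rc, out, err = run_and_capture([sys.executable, '-c', 'print("ok")'])
print(rc, out.strip())
```

If the call is interrupted or times out, the child does not linger as an orphaned process, which appears to be the motivation for replacing the bare `p.communicate()` calls elsewhere in this comparison.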
@@ -3730,7 +3739,8 @@ def check_executable(exe, args=[]):
|
||||
""" Checks if the given binary is installed somewhere in PATH, and returns its name.
|
||||
args can be a list of arguments for a short output (like -version) """
|
||||
try:
|
||||
subprocess.Popen([exe] + args, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()
|
||||
process_communicate_or_kill(subprocess.Popen(
|
||||
[exe] + args, stdout=subprocess.PIPE, stderr=subprocess.PIPE))
|
||||
except OSError:
|
||||
return False
|
||||
return exe
|
||||
@@ -3744,10 +3754,10 @@ def get_exe_version(exe, args=['--version'],
|
||||
# STDIN should be redirected too. On UNIX-like systems, ffmpeg triggers
|
||||
# SIGTTOU if youtube-dlc is run in the background.
|
||||
# See https://github.com/ytdl-org/youtube-dl/issues/955#issuecomment-209789656
|
||||
out, _ = subprocess.Popen(
|
||||
out, _ = process_communicate_or_kill(subprocess.Popen(
|
||||
[encodeArgument(exe)] + args,
|
||||
stdin=subprocess.PIPE,
|
||||
stdout=subprocess.PIPE, stderr=subprocess.STDOUT).communicate()
|
||||
stdout=subprocess.PIPE, stderr=subprocess.STDOUT))
|
||||
except OSError:
|
||||
return False
|
||||
if isinstance(out, bytes): # Python 2.x
|
||||
@@ -3892,13 +3902,16 @@ def read_batch_urls(batch_fd):
|
||||
def fixup(url):
|
||||
if not isinstance(url, compat_str):
|
||||
url = url.decode('utf-8', 'replace')
|
||||
BOM_UTF8 = '\xef\xbb\xbf'
|
||||
if url.startswith(BOM_UTF8):
|
||||
url = url[len(BOM_UTF8):]
|
||||
url = url.strip()
|
||||
if url.startswith(('#', ';', ']')):
|
||||
BOM_UTF8 = ('\xef\xbb\xbf', '\ufeff')
|
||||
for bom in BOM_UTF8:
|
||||
if url.startswith(bom):
|
||||
url = url[len(bom):]
|
||||
url = url.lstrip()
|
||||
if not url or url.startswith(('#', ';', ']')):
|
||||
return False
|
||||
return url
|
||||
# "#" cannot be stripped out since it is part of the URI
|
||||
# However, it can be safely stripped out if following a whitespace
|
||||
return re.split(r'\s#', url, 1)[0].rstrip()
|
||||
|
||||
with contextlib.closing(batch_fd) as fd:
|
||||
return [url for url in map(fixup, fd) if url]
|
||||
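The reworked `fixup` in this hunk strips either form of the UTF-8 BOM, skips blank and comment lines, and drops a trailing comment only when `#` follows whitespace, so URL fragments survive. A simplified sketch for text input follows; it omits the bytes-decoding branch, and the URLs are placeholders.

```python
import io
import re


def fixup(url):
    # Strip a BOM in either its mis-decoded or its proper Unicode form
    for bom in ('\xef\xbb\xbf', '\ufeff'):
        if url.startswith(bom):
            url = url[len(bom):]
    url = url.lstrip()
    # Skip empty lines and full-line comments
    if not url or url.startswith(('#', ';', ']')):
        return False
    # '#' preceded by whitespace starts a comment; '#' inside the URL is kept
    return re.split(r'\s#', url, maxsplit=1)[0].rstrip()


batch = io.StringIO(
    '\ufeffhttps://example.com/a#fragment\n'
    '# a full-line comment\n'
    '\n'
    '  https://example.com/b   # trailing comment\n')
urls = [u for u in map(fixup, batch) if u]
print(urls)
```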
@@ -5703,7 +5716,7 @@ def write_xattr(path, key, value):
|
||||
cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE)
|
||||
except EnvironmentError as e:
|
||||
raise XAttrMetadataError(e.errno, e.strerror)
|
||||
stdout, stderr = p.communicate()
|
||||
stdout, stderr = process_communicate_or_kill(p)
|
||||
stderr = stderr.decode('utf-8', 'replace')
|
||||
if p.returncode != 0:
|
||||
raise XAttrMetadataError(p.returncode, stderr)
|
||||
@@ -5819,3 +5832,20 @@ def format_field(obj, field, template='%s', ignore=(None, ''), default='', func=
|
||||
if func and val not in ignore:
|
||||
val = func(val)
|
||||
return template % val if val not in ignore else default
|
||||
|
||||
|
||||
def clean_podcast_url(url):
|
||||
return re.sub(r'''(?x)
|
||||
(?:
|
||||
(?:
|
||||
chtbl\.com/track|
|
||||
media\.blubrry\.com| # https://create.blubrry.com/resources/podcast-media-download-statistics/getting-started/
|
||||
play\.podtrac\.com
|
||||
)/[^/]+|
|
||||
(?:dts|www)\.podtrac\.com/(?:pts/)?redirect\.[0-9a-z]{3,4}| # http://analytics.podtrac.com/how-to-measure
|
||||
flex\.acast\.com|
|
||||
pd(?:
|
||||
cn\.co| # https://podcorn.com/analytics-prefix/
|
||||
st\.fm # https://podsights.com/docs/
|
||||
)/e
|
||||
)/''', '', url)
|
||||
|
||||
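`clean_podcast_url` removes known analytics and tracking prefixes from a podcast media URL with a single verbose-mode regex. The sketch below copies the pattern from the hunk, with indentation restored by hand and the inline source-link comments omitted, and applies it to an illustrative URL with two chained prefixes.

```python
import re


def clean_podcast_url(url):
    # Each alternative matches one tracking prefix, including its trailing slash
    return re.sub(r'''(?x)
        (?:
            (?:
                chtbl\.com/track|
                media\.blubrry\.com|
                play\.podtrac\.com
            )/[^/]+|
            (?:dts|www)\.podtrac\.com/(?:pts/)?redirect\.[0-9a-z]{3,4}|
            flex\.acast\.com|
            pd(?:
                cn\.co|
                st\.fm
            )/e
        )/''', '', url)


tracked = ('https://www.podtrac.com/pts/redirect.mp3/chtbl.com/track/5899E/'
           'traffic.megaphone.fm/HSW7835899191.mp3')
print(clean_podcast_url(tracked))
```

Because `re.sub` replaces every non-overlapping match, both the podtrac redirect and the chtbl prefix are removed in one pass.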
@@ -1,3 +1,3 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
__version__ = '2021.01.07'
|
||||
__version__ = '2021.01.09'
|
||||
|
||||