mirror of https://github.com/yt-dlp/yt-dlp synced 2025-12-18 23:25:42 +07:00

Compare commits

3 Commits

Author | SHA1 | Message | Date
pukkandan | d392c66fb4 | Release 2021.03.21 | 2021-03-22 02:56:39 +05:30
pukkandan | a31a3a791c | Update to ytdl-commit-7e79ba7 (7e79ba7dd6): [vimeo:album] Fix extraction for albums with number of videos multiple to page size | 2021-03-22 02:56:38 +05:30
Matthew | 89a0d0c071 | [youtube] Show if video is private, unlisted etc in new field availability (Authored by: colethedj, pukkandan) | 2021-03-22 02:53:34 +05:30
26 changed files with 385 additions and 647 deletions

View File

@@ -21,7 +21,7 @@ ## Checklist
<!-- <!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp: Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.24. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED. - First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.15. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser. - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/yt-dlp/yt-dlp. - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/yt-dlp/yt-dlp.
- Search the bugtracker for similar issues: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates. - Search the bugtracker for similar issues: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
@@ -29,7 +29,7 @@ ## Checklist
--> -->
- [ ] I'm reporting a broken site support - [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running yt-dlp version **2021.03.24** - [ ] I've verified that I'm running yt-dlp version **2021.03.15**
- [ ] I've checked that all provided URLs are alive and playable in a browser - [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones - [ ] I've searched the bugtracker for similar issues including closed ones
@@ -44,7 +44,7 @@ ## Verbose log
[debug] User config: [] [debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj'] [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] yt-dlp version 2021.03.24 [debug] yt-dlp version 2021.03.15
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {} [debug] Proxy map: {}

View File

@@ -21,7 +21,7 @@ ## Checklist
<!-- <!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp: Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.24. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED. - First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.15. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser. - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://github.com/yt-dlp/yt-dlp. yt-dlp does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights. - Make sure that site you are requesting is not dedicated to copyright infringement, see https://github.com/yt-dlp/yt-dlp. yt-dlp does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates. - Search the bugtracker for similar site support requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
@@ -29,7 +29,7 @@ ## Checklist
--> -->
- [ ] I'm reporting a new site support request - [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running yt-dlp version **2021.03.24** - [ ] I've verified that I'm running yt-dlp version **2021.03.15**
- [ ] I've checked that all provided URLs are alive and playable in a browser - [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights - [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones - [ ] I've searched the bugtracker for similar site support requests including closed ones

View File

@@ -21,13 +21,13 @@ ## Checklist
<!-- <!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp: Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.24. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED. - First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.15. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates. - Search the bugtracker for similar site feature requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
- Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space) - Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
--> -->
- [ ] I'm reporting a site feature request - [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running yt-dlp version **2021.03.24** - [ ] I've verified that I'm running yt-dlp version **2021.03.15**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones - [ ] I've searched the bugtracker for similar site feature requests including closed ones

View File

@@ -21,7 +21,7 @@ ## Checklist
<!-- <!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp: Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.24. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED. - First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.15. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser. - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/yt-dlp/yt-dlp. - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in https://github.com/yt-dlp/yt-dlp.
- Search the bugtracker for similar issues: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates. - Search the bugtracker for similar issues: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
@@ -30,7 +30,7 @@ ## Checklist
--> -->
- [ ] I'm reporting a broken site support issue - [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running yt-dlp version **2021.03.24** - [ ] I've verified that I'm running yt-dlp version **2021.03.15**
- [ ] I've checked that all provided URLs are alive and playable in a browser - [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones - [ ] I've searched the bugtracker for similar bug reports including closed ones
@@ -46,7 +46,7 @@ ## Verbose log
[debug] User config: [] [debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj'] [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] yt-dlp version 2021.03.24 [debug] yt-dlp version 2021.03.15
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {} [debug] Proxy map: {}

View File

@@ -21,13 +21,13 @@ ## Checklist
<!-- <!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp: Carefully read and work through this check list in order to prevent the most common mistakes and misuse of yt-dlp:
- First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.24. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED. - First of, make sure you are using the latest version of yt-dlp. Run `yt-dlp --version` and ensure your version is 2021.03.15. If it's not, see https://github.com/yt-dlp/yt-dlp on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates. - Search the bugtracker for similar feature requests: https://github.com/yt-dlp/yt-dlp. DO NOT post duplicates.
- Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space) - Finally, put x into all relevant boxes like this [x] (Dont forget to delete the empty space)
--> -->
- [ ] I'm reporting a feature request - [ ] I'm reporting a feature request
- [ ] I've verified that I'm running yt-dlp version **2021.03.24** - [ ] I've verified that I'm running yt-dlp version **2021.03.15**
- [ ] I've searched the bugtracker for similar feature requests including closed ones - [ ] I've searched the bugtracker for similar feature requests including closed ones

View File

@@ -31,7 +31,3 @@ DennyDai
codeasashu codeasashu
teesid teesid
kevinoconnor7 kevinoconnor7
damianoamatruda
2ShedsJackson
CXwudi
xtkoba

View File

@@ -17,23 +17,9 @@ # Instuctions for creating release
--> -->
### 2021.03.24.1
* Revert [commit/8562218](https://github.com/ytdl-org/youtube-dl/commit/8562218350a79d4709da8593bb0c538aa0824acf)
### 2021.03.24
* Merge youtube-dl: Upto [commit/8562218](https://github.com/ytdl-org/youtube-dl/commit/8562218350a79d4709da8593bb0c538aa0824acf)
* Parse metadata from multiple fields using `--parse-metadata`
* Ability to load playlist infojson using `--load-info-json`
* Write current epoch to infojson when using `--no-clean-infojson`
* [youtube_live_chat] fix bug when trying to set cookies
* [niconico] Fix for when logged in by: @CXwudi and @xtkoba
* [linuxacadamy] Fix login
### 2021.03.21 ### 2021.03.21
* Merge youtube-dl: Upto [commit/7e79ba7](https://github.com/ytdl-org/youtube-dl/commit/7e79ba7dd6e6649dd2ce3a74004b2044f2182881) * Merge youtube-dl: Upto [commit/7e79ba7](https://github.com/ytdl-org/youtube-dl/commit/7e79ba7dd6e6649dd2ce3a74004b2044f2182881)
* Option `--no-clean-infojson` to keep private keys in the infojson * Option `--clean-infojson` to keep private keys in the infojson
* [aria2c] Support retry/abort unavailable fragments by [damianoamatruda](https://github.com/damianoamatruda) * [aria2c] Support retry/abort unavailable fragments by [damianoamatruda](https://github.com/damianoamatruda)
* [aria2c] Better default arguments * [aria2c] Better default arguments
* [movefiles] Fix bugs and make more robust * [movefiles] Fix bugs and make more robust

View File

@@ -3,7 +3,7 @@ # YT-DLP
[![Release version](https://img.shields.io/github/v/release/yt-dlp/yt-dlp?color=brightgreen&label=Release)](https://github.com/yt-dlp/yt-dlp/releases/latest) [![Release version](https://img.shields.io/github/v/release/yt-dlp/yt-dlp?color=brightgreen&label=Release)](https://github.com/yt-dlp/yt-dlp/releases/latest)
[![License: Unlicense](https://img.shields.io/badge/License-Unlicense-blue.svg)](LICENSE) [![License: Unlicense](https://img.shields.io/badge/License-Unlicense-blue.svg)](LICENSE)
[![CI Status](https://github.com/yt-dlp/yt-dlp/workflows/Core%20Tests/badge.svg?branch=master)](https://github.com/yt-dlp/yt-dlp/actions) [![CI Status](https://github.com/yt-dlp/yt-dlp/workflows/Core%20Tests/badge.svg?branch=master)](https://github.com/yt-dlp/yt-dlp/actions)
[![Discord](https://img.shields.io/discord/807245652072857610?color=blue&label=discord&logo=discord)](https://discord.gg/H5MNcFW63r) [![Discord](https://img.shields.io/discord/807245652072857610?color=blue&label=discord&logo=discord)](https://discord.gg/S75JaBna)
[![Commits](https://img.shields.io/github/commit-activity/m/yt-dlp/yt-dlp?label=commits)](https://github.com/yt-dlp/yt-dlp/commits) [![Commits](https://img.shields.io/github/commit-activity/m/yt-dlp/yt-dlp?label=commits)](https://github.com/yt-dlp/yt-dlp/commits)
[![Last Commit](https://img.shields.io/github/last-commit/yt-dlp/yt-dlp/master)](https://github.com/yt-dlp/yt-dlp/commits) [![Last Commit](https://img.shields.io/github/last-commit/yt-dlp/yt-dlp/master)](https://github.com/yt-dlp/yt-dlp/commits)
@@ -58,7 +58,7 @@ # NEW FEATURES
* **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection that what is possible by simply using `--format` ([examples](#format-selection-examples)) * **[Format Sorting](#sorting-formats)**: The default format sorting options have been changed so that higher resolution and better codecs will be now preferred instead of simply using larger bitrate. Furthermore, you can now specify the sort order using `-S`. This allows for much easier format selection that what is possible by simply using `--format` ([examples](#format-selection-examples))
* **Merged with youtube-dl v2021.03.25**: You get all the latest features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494/yt-dlc) * **Merged with youtube-dl v2021.03.14**: You get all the latest features and patches of [youtube-dl](https://github.com/ytdl-org/youtube-dl) in addition to all the features of [youtube-dlc](https://github.com/blackjack4494/yt-dlc)
* **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--get-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, Playlist infojson etc. Note that the NicoNico improvements are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details. * **Merged with animelover1984/youtube-dl**: You get most of the features and improvements from [animelover1984/youtube-dl](https://github.com/animelover1984/youtube-dl) including `--get-comments`, `BiliBiliSearch`, `BilibiliChannel`, Embedding thumbnail in mp4/ogg/opus, Playlist infojson etc. Note that the NicoNico improvements are not available. See [#31](https://github.com/yt-dlp/yt-dlp/pull/31) for details.
@@ -670,24 +670,18 @@ ## Post-Processing Options:
--add-metadata Write metadata to the video file --add-metadata Write metadata to the video file
--no-add-metadata Do not write metadata (default) --no-add-metadata Do not write metadata (default)
--parse-metadata FIELD:FORMAT Parse additional metadata like title/artist --parse-metadata FIELD:FORMAT Parse additional metadata like title/artist
from other fields. Give a template or field from other fields. Give field name to
name to extract data from and the format to extract data from, and format of the field
interpret it as, seperated by a ":". Either seperated by a ":". Either regular
regular expression with named capture expression with named capture groups or a
groups or a similar syntax to the output similar syntax to the output template can
template can be used for the FORMAT. also be used. The parsed parameters replace
Similarly, the syntax for output template any existing values and can be use in
can be used for FIELD to parse the data output template. This option can be used
from multiple fields. The parsed parameters multiple times. Example: --parse-metadata
replace any existing values and can be used "title:%(artist)s - %(title)s" matches a
in output templates. This option can be title like "Coldplay - Paradise". Example
used multiple times. Example: --parse- (regex): --parse-metadata
metadata "title:%(artist)s - %(title)s"
matches a title like "Coldplay - Paradise".
Example: --parse-metadata "%(series)s
%(episode_number)s:%(title)s" sets the
title using series and episode number.
Example (regex): --parse-metadata
"description:Artist - (?P<artist>.+?)" "description:Artist - (?P<artist>.+?)"
--xattrs Write metadata to the video file's xattrs --xattrs Write metadata to the video file's xattrs
(using dublin core and xdg standards) (using dublin core and xdg standards)
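
The template form of `--parse-metadata` in the help text above ("title:%(artist)s - %(title)s" matching "Coldplay - Paradise") can be pictured as a conversion from placeholders to named capture groups. The following is a minimal sketch of that idea only; `template_to_regex` and `parse_field` are hypothetical helper names chosen for illustration and are not claimed to be yt-dlp's internals.

```python
import re


def template_to_regex(fmt):
    """Turn '%(name)s' placeholders into named capture groups (illustrative only)."""
    regex = ''
    last_pos = 0
    for match in re.finditer(r'%\((\w+)\)s', fmt):
        # Literal text between placeholders must match exactly
        regex += re.escape(fmt[last_pos:match.start()])
        regex += r'(?P<%s>.+)' % match.group(1)
        last_pos = match.end()
    regex += re.escape(fmt[last_pos:])
    return regex


def parse_field(info, field, fmt):
    """Return a copy of info updated with the groups parsed from info[field]."""
    mobj = re.search(template_to_regex(fmt), info.get(field, ''))
    result = dict(info)
    if mobj:
        # Parsed values replace any existing values, as the help text describes
        result.update(mobj.groupdict())
    return result


# Hypothetical metadata, mirroring the example in the help text
info = {'title': 'Coldplay - Paradise'}
parsed = parse_field(info, 'title', '%(artist)s - %(title)s')
```

Under these assumptions, `parsed` would carry both an `artist` and a replaced `title` value, which could then be used in an output template.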

View File

@@ -97,8 +97,7 @@ # Supported sites
- **bbc**: BBC - **bbc**: BBC
- **bbc.co.uk**: BBC iPlayer - **bbc.co.uk**: BBC iPlayer
- **bbc.co.uk:article**: BBC articles - **bbc.co.uk:article**: BBC articles
- **bbc.co.uk:iplayer:episodes** - **bbc.co.uk:iplayer:playlist**
- **bbc.co.uk:iplayer:group**
- **bbc.co.uk:playlist** - **bbc.co.uk:playlist**
- **BBVTV** - **BBVTV**
- **Beatport** - **Beatport**
@@ -1252,6 +1251,5 @@ # Supported sites
- **zee5:series** - **zee5:series**
- **Zhihu** - **Zhihu**
- **zingmp3**: mp3.zing.vn - **zingmp3**: mp3.zing.vn
- **zingmp3:album**
- **zoom** - **zoom**
- **Zype** - **Zype**

View File

@@ -60,14 +60,12 @@
encode_compat_str, encode_compat_str,
encodeFilename, encodeFilename,
error_to_compat_str, error_to_compat_str,
EntryNotInPlaylist,
ExistingVideoReached, ExistingVideoReached,
expand_path, expand_path,
ExtractorError, ExtractorError,
float_or_none, float_or_none,
format_bytes, format_bytes,
format_field, format_field,
FORMAT_RE,
formatSeconds, formatSeconds,
GeoRestrictedError, GeoRestrictedError,
int_or_none, int_or_none,
@@ -773,93 +771,95 @@ def parse_outtmpl(self):
'Put from __future__ import unicode_literals at the top of your code file or consider switching to Python 3.x.') 'Put from __future__ import unicode_literals at the top of your code file or consider switching to Python 3.x.')
return outtmpl_dict return outtmpl_dict
def prepare_outtmpl(self, outtmpl, info_dict, sanitize=None):
""" Make the template and info_dict suitable for substitution (outtmpl % info_dict)"""
template_dict = dict(info_dict)
# duration_string
template_dict['duration_string'] = ( # %(duration>%H-%M-%S)s is wrong if duration > 24hrs
formatSeconds(info_dict['duration'], '-')
if info_dict.get('duration', None) is not None
else None)
# epoch
template_dict['epoch'] = int(time.time())
# autonumber
autonumber_size = self.params.get('autonumber_size')
if autonumber_size is None:
autonumber_size = 5
template_dict['autonumber'] = self.params.get('autonumber_start', 1) - 1 + self._num_downloads
# resolution if not defined
if template_dict.get('resolution') is None:
if template_dict.get('width') and template_dict.get('height'):
template_dict['resolution'] = '%dx%d' % (template_dict['width'], template_dict['height'])
elif template_dict.get('height'):
template_dict['resolution'] = '%sp' % template_dict['height']
elif template_dict.get('width'):
template_dict['resolution'] = '%dx?' % template_dict['width']
if sanitize is None:
sanitize = lambda k, v: v
template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
for k, v in template_dict.items()
if v is not None and not isinstance(v, (list, tuple, dict)))
na = self.params.get('outtmpl_na_placeholder', 'NA')
template_dict = collections.defaultdict(lambda: na, template_dict)
# For fields playlist_index and autonumber convert all occurrences
# of %(field)s to %(field)0Nd for backward compatibility
field_size_compat_map = {
'playlist_index': len(str(template_dict['n_entries'])),
'autonumber': autonumber_size,
}
FIELD_SIZE_COMPAT_RE = r'(?<!%)%\((?P<field>autonumber|playlist_index)\)s'
mobj = re.search(FIELD_SIZE_COMPAT_RE, outtmpl)
if mobj:
outtmpl = re.sub(
FIELD_SIZE_COMPAT_RE,
r'%%(\1)0%dd' % field_size_compat_map[mobj.group('field')],
outtmpl)
numeric_fields = list(self._NUMERIC_FIELDS)
# Format date
FORMAT_DATE_RE = FORMAT_RE.format(r'(?P<key>(?P<field>\w+)>(?P<format>.+?))')
for mobj in re.finditer(FORMAT_DATE_RE, outtmpl):
conv_type, field, frmt, key = mobj.group('type', 'field', 'format', 'key')
if key in template_dict:
continue
value = strftime_or_none(template_dict.get(field), frmt, na)
if conv_type in 'crs': # string
value = sanitize(field, value)
else: # number
numeric_fields.append(key)
value = float_or_none(value, default=None)
if value is not None:
template_dict[key] = value
# Missing numeric fields used together with integer presentation types
# in format specification will break the argument substitution since
# string NA placeholder is returned for missing fields. We will patch
# output template for missing fields to meet string presentation type.
for numeric_field in numeric_fields:
if numeric_field not in template_dict:
outtmpl = re.sub(
FORMAT_RE.format(re.escape(numeric_field)),
r'%({0})s'.format(numeric_field), outtmpl)
return outtmpl, template_dict
def _prepare_filename(self, info_dict, tmpl_type='default'): def _prepare_filename(self, info_dict, tmpl_type='default'):
try: try:
template_dict = dict(info_dict)
template_dict['duration_string'] = ( # %(duration>%H-%M-%S)s is wrong if duration > 24hrs
formatSeconds(info_dict['duration'], '-')
if info_dict.get('duration', None) is not None
else None)
template_dict['epoch'] = int(time.time())
autonumber_size = self.params.get('autonumber_size')
if autonumber_size is None:
autonumber_size = 5
template_dict['autonumber'] = self.params.get('autonumber_start', 1) - 1 + self._num_downloads
if template_dict.get('resolution') is None:
if template_dict.get('width') and template_dict.get('height'):
template_dict['resolution'] = '%dx%d' % (template_dict['width'], template_dict['height'])
elif template_dict.get('height'):
template_dict['resolution'] = '%sp' % template_dict['height']
elif template_dict.get('width'):
template_dict['resolution'] = '%dx?' % template_dict['width']
sanitize = lambda k, v: sanitize_filename( sanitize = lambda k, v: sanitize_filename(
compat_str(v), compat_str(v),
restricted=self.params.get('restrictfilenames'), restricted=self.params.get('restrictfilenames'),
is_id=(k == 'id' or k.endswith('_id'))) is_id=(k == 'id' or k.endswith('_id')))
template_dict = dict((k, v if isinstance(v, compat_numeric_types) else sanitize(k, v))
for k, v in template_dict.items()
if v is not None and not isinstance(v, (list, tuple, dict)))
na = self.params.get('outtmpl_na_placeholder', 'NA')
template_dict = collections.defaultdict(lambda: na, template_dict)
outtmpl = self.outtmpl_dict.get(tmpl_type, self.outtmpl_dict['default']) outtmpl = self.outtmpl_dict.get(tmpl_type, self.outtmpl_dict['default'])
outtmpl, template_dict = self.prepare_outtmpl(outtmpl, info_dict, sanitize) force_ext = OUTTMPL_TYPES.get(tmpl_type)
# For fields playlist_index and autonumber convert all occurrences
# of %(field)s to %(field)0Nd for backward compatibility
field_size_compat_map = {
'playlist_index': len(str(template_dict['n_entries'])),
'autonumber': autonumber_size,
}
FIELD_SIZE_COMPAT_RE = r'(?<!%)%\((?P<field>autonumber|playlist_index)\)s'
mobj = re.search(FIELD_SIZE_COMPAT_RE, outtmpl)
if mobj:
outtmpl = re.sub(
FIELD_SIZE_COMPAT_RE,
r'%%(\1)0%dd' % field_size_compat_map[mobj.group('field')],
outtmpl)
# As of [1] format syntax is:
# %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
# 1. https://docs.python.org/2/library/stdtypes.html#string-formatting
FORMAT_RE = r'''(?x)
(?<!%)
%
\({0}\) # mapping key
(?:[#0\-+ ]+)? # conversion flags (optional)
(?:\d+)? # minimum field width (optional)
(?:\.\d+)? # precision (optional)
[hlL]? # length modifier (optional)
(?P<type>[diouxXeEfFgGcrs%]) # conversion type
'''
numeric_fields = list(self._NUMERIC_FIELDS)
# Format date
FORMAT_DATE_RE = FORMAT_RE.format(r'(?P<key>(?P<field>\w+)>(?P<format>.+?))')
for mobj in re.finditer(FORMAT_DATE_RE, outtmpl):
conv_type, field, frmt, key = mobj.group('type', 'field', 'format', 'key')
if key in template_dict:
continue
value = strftime_or_none(template_dict.get(field), frmt, na)
if conv_type in 'crs': # string
value = sanitize(field, value)
else: # number
numeric_fields.append(key)
value = float_or_none(value, default=None)
if value is not None:
template_dict[key] = value
# Missing numeric fields used together with integer presentation types
# in format specification will break the argument substitution since
# string NA placeholder is returned for missing fields. We will patch
# output template for missing fields to meet string presentation type.
for numeric_field in numeric_fields:
if numeric_field not in template_dict:
outtmpl = re.sub(
FORMAT_RE.format(re.escape(numeric_field)),
r'%({0})s'.format(numeric_field), outtmpl)
# expand_path translates '%%' into '%' and '$$' into '$' # expand_path translates '%%' into '%' and '$$' into '$'
# correspondingly that is not what we want since we need to keep # correspondingly that is not what we want since we need to keep
@@ -874,7 +874,6 @@ def _prepare_filename(self, info_dict, tmpl_type='default'):
# title "Hello $PATH", we don't want `$PATH` to be expanded. # title "Hello $PATH", we don't want `$PATH` to be expanded.
filename = expand_path(outtmpl).replace(sep, '') % template_dict filename = expand_path(outtmpl).replace(sep, '') % template_dict
force_ext = OUTTMPL_TYPES.get(tmpl_type)
if force_ext is not None: if force_ext is not None:
filename = replace_extension(filename, force_ext, template_dict.get('ext')) filename = replace_extension(filename, force_ext, template_dict.get('ext'))
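
The backward-compatibility rewrite in the hunk above, which turns `%(autonumber)s` and `%(playlist_index)s` into zero-padded integer fields, can be tried in isolation. The sketch below reuses the `FIELD_SIZE_COMPAT_RE` pattern from the diff; the wrapper `pad_compat_fields` is a hypothetical name, and unlike the diffed code (which applies the width of the first matched field to every match) it looks up the width per field.

```python
import re

# Pattern copied from the diff: an unescaped %(autonumber)s or %(playlist_index)s
FIELD_SIZE_COMPAT_RE = r'(?<!%)%\((?P<field>autonumber|playlist_index)\)s'


def pad_compat_fields(outtmpl, n_entries, autonumber_size=5):
    """Rewrite %(field)s to %(field)0Nd for the two compat fields (illustrative)."""
    field_size = {
        'playlist_index': len(str(n_entries)),
        'autonumber': autonumber_size,
    }
    return re.sub(
        FIELD_SIZE_COMPAT_RE,
        # Assumption: width chosen per matched field, not from the first match only
        lambda m: '%%(%s)0%dd' % (m.group('field'), field_size[m.group('field')]),
        outtmpl)


padded = pad_compat_fields('%(autonumber)s-%(title)s.%(ext)s', n_entries=120)
filename = padded % {'autonumber': 7, 'title': 'demo', 'ext': 'mp4'}
```

The negative lookbehind `(?<!%)` leaves an escaped `%%(autonumber)s` untouched, so literal percent signs in a template survive the rewrite.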
@@ -1181,16 +1180,48 @@ def __process_playlist(self, ie_result, download):
playlist = ie_result.get('title') or ie_result.get('id') playlist = ie_result.get('title') or ie_result.get('id')
self.to_screen('[download] Downloading playlist: %s' % playlist) self.to_screen('[download] Downloading playlist: %s' % playlist)
if 'entries' not in ie_result: if self.params.get('allow_playlist_files', True):
raise EntryNotInPlaylist() ie_copy = {
incomplete_entries = bool(ie_result.get('requested_entries')) 'playlist': playlist,
if incomplete_entries: 'playlist_id': ie_result.get('id'),
def fill_missing_entries(entries, indexes): 'playlist_title': ie_result.get('title'),
ret = [None] * max(*indexes) 'playlist_uploader': ie_result.get('uploader'),
for i, entry in zip(indexes, entries): 'playlist_uploader_id': ie_result.get('uploader_id'),
ret[i - 1] = entry 'playlist_index': 0
return ret }
ie_result['entries'] = fill_missing_entries(ie_result['entries'], ie_result['requested_entries']) ie_copy.update(dict(ie_result))
if self.params.get('writeinfojson', False):
infofn = self.prepare_filename(ie_copy, 'pl_infojson')
if not self._ensure_dir_exists(encodeFilename(infofn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(infofn)):
self.to_screen('[info] Playlist metadata is already present')
else:
playlist_info = dict(ie_result)
# playlist_info['entries'] = list(playlist_info['entries']) # Entries is a generator which shouldnot be resolved here
self.to_screen('[info] Writing playlist metadata as JSON to: ' + infofn)
try:
write_json_file(self.filter_requested_info(playlist_info, self.params.get('clean_infojson', True)), infofn)
except (OSError, IOError):
self.report_error('Cannot write playlist metadata to JSON file ' + infofn)
if self.params.get('writedescription', False):
descfn = self.prepare_filename(ie_copy, 'pl_description')
if not self._ensure_dir_exists(encodeFilename(descfn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(descfn)):
self.to_screen('[info] Playlist description is already present')
elif ie_result.get('description') is None:
self.report_warning('There\'s no playlist description to write.')
else:
try:
self.to_screen('[info] Writing playlist description to: ' + descfn)
with io.open(encodeFilename(descfn), 'w', encoding='utf-8') as descfile:
descfile.write(ie_result['description'])
except (OSError, IOError):
self.report_error('Cannot write playlist description file ' + descfn)
return
playlist_results = [] playlist_results = []
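
The `fill_missing_entries` helper shown on the left side of the hunk above rebuilds a sparse playlist from the entries that were actually requested. A standalone sketch of the same idea follows; it assumes 1-based indexes as in the diff, and it uses `max(indexes)` instead of the diff's `max(*indexes)`, because the unpacked form raises a `TypeError` in Python when only one index is given.

```python
def fill_missing_entries(entries, indexes):
    """Place each entry at its 1-based playlist position, padding gaps with None."""
    ret = [None] * max(indexes)
    for i, entry in zip(indexes, entries):
        ret[i - 1] = entry
    return ret


# Hypothetical data: only items 2 and 5 of a playlist were requested
restored = fill_missing_entries(['second', 'fifth'], [2, 5])
```

The `None` placeholders mark positions that were never extracted, which is presumably what the later `any(entry is None ...)` check in the diff guards against.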
@@ -1217,20 +1248,25 @@ def iter_playlistitems(format):
def make_playlistitems_entries(list_ie_entries): def make_playlistitems_entries(list_ie_entries):
num_entries = len(list_ie_entries) num_entries = len(list_ie_entries)
for i in playlistitems: return [
if -num_entries < i <= num_entries: list_ie_entries[i - 1] for i in playlistitems
yield list_ie_entries[i - 1] if -num_entries <= i - 1 < num_entries]
elif incomplete_entries:
raise EntryNotInPlaylist() def report_download(num_entries):
self.to_screen(
'[%s] playlist %s: Downloading %d videos' %
(ie_result['extractor'], playlist, num_entries))
if isinstance(ie_entries, list): if isinstance(ie_entries, list):
n_all_entries = len(ie_entries) n_all_entries = len(ie_entries)
if playlistitems: if playlistitems:
entries = list(make_playlistitems_entries(ie_entries)) entries = make_playlistitems_entries(ie_entries)
else: else:
entries = ie_entries[playliststart:playlistend] entries = ie_entries[playliststart:playlistend]
n_entries = len(entries) n_entries = len(entries)
msg = 'Collected %d videos; downloading %d of them' % (n_all_entries, n_entries) self.to_screen(
'[%s] playlist %s: Collected %d video ids (downloading %d of them)' %
(ie_result['extractor'], playlist, n_all_entries, n_entries))
elif isinstance(ie_entries, PagedList): elif isinstance(ie_entries, PagedList):
if playlistitems: if playlistitems:
entries = [] entries = []
@@ -1242,73 +1278,25 @@ def make_playlistitems_entries(list_ie_entries):
entries = ie_entries.getslice( entries = ie_entries.getslice(
playliststart, playlistend) playliststart, playlistend)
n_entries = len(entries) n_entries = len(entries)
msg = 'Downloading %d videos' % n_entries report_download(n_entries)
else: # iterable else: # iterable
if playlistitems: if playlistitems:
entries = list(make_playlistitems_entries(list(itertools.islice( entries = make_playlistitems_entries(list(itertools.islice(
ie_entries, 0, max(playlistitems))))) ie_entries, 0, max(playlistitems))))
else: else:
entries = list(itertools.islice( entries = list(itertools.islice(
ie_entries, playliststart, playlistend)) ie_entries, playliststart, playlistend))
n_entries = len(entries) n_entries = len(entries)
msg = 'Downloading %d videos' % n_entries report_download(n_entries)
if any((entry is None for entry in entries)):
raise EntryNotInPlaylist()
if not playlistitems and (playliststart or playlistend):
playlistitems = list(range(1 + playliststart, 1 + playliststart + len(entries)))
ie_result['entries'] = entries
ie_result['requested_entries'] = playlistitems
if self.params.get('allow_playlist_files', True):
ie_copy = {
'playlist': playlist,
'playlist_id': ie_result.get('id'),
'playlist_title': ie_result.get('title'),
'playlist_uploader': ie_result.get('uploader'),
'playlist_uploader_id': ie_result.get('uploader_id'),
'playlist_index': 0
}
ie_copy.update(dict(ie_result))
if self.params.get('writeinfojson', False):
infofn = self.prepare_filename(ie_copy, 'pl_infojson')
if not self._ensure_dir_exists(encodeFilename(infofn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(infofn)):
self.to_screen('[info] Playlist metadata is already present')
else:
self.to_screen('[info] Writing playlist metadata as JSON to: ' + infofn)
try:
write_json_file(self.filter_requested_info(ie_result, self.params.get('clean_infojson', True)), infofn)
except (OSError, IOError):
self.report_error('Cannot write playlist metadata to JSON file ' + infofn)
if self.params.get('writedescription', False):
descfn = self.prepare_filename(ie_copy, 'pl_description')
if not self._ensure_dir_exists(encodeFilename(descfn)):
return
if not self.params.get('overwrites', True) and os.path.exists(encodeFilename(descfn)):
self.to_screen('[info] Playlist description is already present')
elif ie_result.get('description') is None:
self.report_warning('There\'s no playlist description to write.')
else:
try:
self.to_screen('[info] Writing playlist description to: ' + descfn)
with io.open(encodeFilename(descfn), 'w', encoding='utf-8') as descfile:
descfile.write(ie_result['description'])
except (OSError, IOError):
self.report_error('Cannot write playlist description file ' + descfn)
return
if self.params.get('playlistreverse', False): if self.params.get('playlistreverse', False):
entries = entries[::-1] entries = entries[::-1]
if self.params.get('playlistrandom', False): if self.params.get('playlistrandom', False):
random.shuffle(entries) random.shuffle(entries)
x_forwarded_for = ie_result.get('__x_forwarded_for_ip') x_forwarded_for = ie_result.get('__x_forwarded_for_ip')
self.to_screen('[%s] playlist %s: %s' % (ie_result['extractor'], playlist, msg))
for i, entry in enumerate(entries, 1): for i, entry in enumerate(entries, 1):
self.to_screen('[download] Downloading video %s of %s' % (i, n_entries)) self.to_screen('[download] Downloading video %s of %s' % (i, n_entries))
# This __x_forwarded_for_ip thing is a bit ugly but requires # This __x_forwarded_for_ip thing is a bit ugly but requires
@@ -1322,7 +1310,7 @@ def make_playlistitems_entries(list_ie_entries):
'playlist_title': ie_result.get('title'), 'playlist_title': ie_result.get('title'),
'playlist_uploader': ie_result.get('uploader'), 'playlist_uploader': ie_result.get('uploader'),
'playlist_uploader_id': ie_result.get('uploader_id'), 'playlist_uploader_id': ie_result.get('uploader_id'),
'playlist_index': playlistitems[i - 1] if playlistitems else i, 'playlist_index': playlistitems[i - 1] if playlistitems else i + playliststart,
'extractor': ie_result['extractor'], 'extractor': ie_result['extractor'],
'webpage_url': ie_result['webpage_url'], 'webpage_url': ie_result['webpage_url'],
'webpage_url_basename': url_basename(ie_result['webpage_url']), 'webpage_url_basename': url_basename(ie_result['webpage_url']),
@@ -2536,10 +2524,10 @@ def download_with_info_file(self, info_filename):
[info_filename], mode='r', [info_filename], mode='r',
openhook=fileinput.hook_encoded('utf-8'))) as f: openhook=fileinput.hook_encoded('utf-8'))) as f:
# FileInput doesn't have a read method, we can't call json.load # FileInput doesn't have a read method, we can't call json.load
info = self.filter_requested_info(json.loads('\n'.join(f)), self.params.get('clean_infojson', True)) info = self.filter_requested_info(json.loads('\n'.join(f)))
try: try:
self.process_ie_result(info, download=True) self.process_ie_result(info, download=True)
except (DownloadError, EntryNotInPlaylist): except DownloadError:
webpage_url = info.get('webpage_url') webpage_url = info.get('webpage_url')
if webpage_url is not None: if webpage_url is not None:
self.report_warning('The info failed to download, trying with "%s"' % webpage_url) self.report_warning('The info failed to download, trying with "%s"' % webpage_url)
@@ -2551,10 +2539,9 @@ def download_with_info_file(self, info_filename):
@staticmethod @staticmethod
def filter_requested_info(info_dict, actually_filter=True): def filter_requested_info(info_dict, actually_filter=True):
if not actually_filter: if not actually_filter:
info_dict['epoch'] = int(time.time())
return info_dict return info_dict
exceptions = { exceptions = {
'remove': ['requested_formats', 'requested_subtitles', 'requested_entries', 'filepath', 'entries'], 'remove': ['requested_formats', 'requested_subtitles', 'filepath', 'entries'],
'keep': ['_type'], 'keep': ['_type'],
} }
keep_key = lambda k: k in exceptions['keep'] or not (k.startswith('_') or k in exceptions['remove']) keep_key = lambda k: k in exceptions['keep'] or not (k.startswith('_') or k in exceptions['remove'])


@@ -79,7 +79,8 @@ def download_and_parse_fragment(url, frag_index, request_data):
self._prepare_and_start_frag_download(ctx) self._prepare_and_start_frag_download(ctx)
success, raw_fragment = dl_fragment(info_dict['url']) success, raw_fragment = dl_fragment(
'https://www.youtube.com/watch?v={}'.format(video_id))
if not success: if not success:
return False return False
try: try:


@@ -1,22 +1,17 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import functools
import itertools import itertools
import json
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import ( from ..compat import (
compat_etree_Element, compat_etree_Element,
compat_HTTPError, compat_HTTPError,
compat_parse_qs,
compat_urllib_parse_urlparse,
compat_urlparse, compat_urlparse,
) )
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
OnDemandPagedList,
clean_html, clean_html,
dict_get, dict_get,
float_or_none, float_or_none,
@@ -816,7 +811,7 @@ class BBCIE(BBCCoUkIE):
@classmethod @classmethod
def suitable(cls, url): def suitable(cls, url):
EXCLUDE_IE = (BBCCoUkIE, BBCCoUkArticleIE, BBCCoUkIPlayerEpisodesIE, BBCCoUkIPlayerGroupIE, BBCCoUkPlaylistIE) EXCLUDE_IE = (BBCCoUkIE, BBCCoUkArticleIE, BBCCoUkIPlayerPlaylistIE, BBCCoUkPlaylistIE)
return (False if any(ie.suitable(url) for ie in EXCLUDE_IE) return (False if any(ie.suitable(url) for ie in EXCLUDE_IE)
else super(BBCIE, cls).suitable(url)) else super(BBCIE, cls).suitable(url))
@@ -1343,149 +1338,21 @@ def _real_extract(self, url):
playlist_id, title, description) playlist_id, title, description)
class BBCCoUkIPlayerPlaylistBaseIE(InfoExtractor): class BBCCoUkIPlayerPlaylistIE(BBCCoUkPlaylistBaseIE):
_VALID_URL_TMPL = r'https?://(?:www\.)?bbc\.co\.uk/iplayer/%%s/(?P<id>%s)' % BBCCoUkIE._ID_REGEX IE_NAME = 'bbc.co.uk:iplayer:playlist'
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/iplayer/(?:episodes|group)/(?P<id>%s)' % BBCCoUkIE._ID_REGEX
@staticmethod _URL_TEMPLATE = 'http://www.bbc.co.uk/iplayer/episode/%s'
def _get_default(episode, key, default_key='default'): _VIDEO_ID_TEMPLATE = r'data-ip-id=["\'](%s)'
return try_get(episode, lambda x: x[key][default_key])
def _get_description(self, data):
synopsis = data.get(self._DESCRIPTION_KEY) or {}
return dict_get(synopsis, ('large', 'medium', 'small'))
def _fetch_page(self, programme_id, per_page, series_id, page):
elements = self._get_elements(self._call_api(
programme_id, per_page, page + 1, series_id))
for element in elements:
episode = self._get_episode(element)
episode_id = episode.get('id')
if not episode_id:
continue
thumbnail = None
image = self._get_episode_image(episode)
if image:
thumbnail = image.replace('{recipe}', 'raw')
category = self._get_default(episode, 'labels', 'category')
yield {
'_type': 'url',
'id': episode_id,
'title': self._get_episode_field(episode, 'subtitle'),
'url': 'https://www.bbc.co.uk/iplayer/episode/' + episode_id,
'thumbnail': thumbnail,
'description': self._get_description(episode),
'categories': [category] if category else None,
'series': self._get_episode_field(episode, 'title'),
'ie_key': BBCCoUkIE.ie_key(),
}
def _real_extract(self, url):
pid = self._match_id(url)
qs = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
series_id = qs.get('seriesId', [None])[0]
page = qs.get('page', [None])[0]
per_page = 36 if page else self._PAGE_SIZE
fetch_page = functools.partial(self._fetch_page, pid, per_page, series_id)
entries = fetch_page(int(page) - 1) if page else OnDemandPagedList(fetch_page, self._PAGE_SIZE)
playlist_data = self._get_playlist_data(self._call_api(pid, 1))
return self.playlist_result(
entries, pid, self._get_playlist_title(playlist_data),
self._get_description(playlist_data))
class BBCCoUkIPlayerEpisodesIE(BBCCoUkIPlayerPlaylistBaseIE):
IE_NAME = 'bbc.co.uk:iplayer:episodes'
_VALID_URL = BBCCoUkIPlayerPlaylistBaseIE._VALID_URL_TMPL % 'episodes'
_TESTS = [{ _TESTS = [{
'url': 'http://www.bbc.co.uk/iplayer/episodes/b05rcz9v', 'url': 'http://www.bbc.co.uk/iplayer/episodes/b05rcz9v',
'info_dict': { 'info_dict': {
'id': 'b05rcz9v', 'id': 'b05rcz9v',
'title': 'The Disappearance', 'title': 'The Disappearance',
'description': 'md5:58eb101aee3116bad4da05f91179c0cb', 'description': 'French thriller serial about a missing teenager.',
}, },
'playlist_mincount': 8, 'playlist_mincount': 6,
'skip': 'This programme is not currently available on BBC iPlayer',
}, { }, {
# all seasons
'url': 'https://www.bbc.co.uk/iplayer/episodes/b094m5t9/doctor-foster',
'info_dict': {
'id': 'b094m5t9',
'title': 'Doctor Foster',
'description': 'md5:5aa9195fad900e8e14b52acd765a9fd6',
},
'playlist_mincount': 10,
}, {
# explicit season
'url': 'https://www.bbc.co.uk/iplayer/episodes/b094m5t9/doctor-foster?seriesId=b094m6nv',
'info_dict': {
'id': 'b094m5t9',
'title': 'Doctor Foster',
'description': 'md5:5aa9195fad900e8e14b52acd765a9fd6',
},
'playlist_mincount': 5,
}, {
# all pages
'url': 'https://www.bbc.co.uk/iplayer/episodes/m0004c4v/beechgrove',
'info_dict': {
'id': 'm0004c4v',
'title': 'Beechgrove',
'description': 'Gardening show that celebrates Scottish horticulture and growing conditions.',
},
'playlist_mincount': 37,
}, {
# explicit page
'url': 'https://www.bbc.co.uk/iplayer/episodes/m0004c4v/beechgrove?page=2',
'info_dict': {
'id': 'm0004c4v',
'title': 'Beechgrove',
'description': 'Gardening show that celebrates Scottish horticulture and growing conditions.',
},
'playlist_mincount': 1,
}]
_PAGE_SIZE = 100
_DESCRIPTION_KEY = 'synopsis'
def _get_episode_image(self, episode):
return self._get_default(episode, 'image')
def _get_episode_field(self, episode, field):
return self._get_default(episode, field)
@staticmethod
def _get_elements(data):
return data['entities']['results']
@staticmethod
def _get_episode(element):
return element.get('episode') or {}
def _call_api(self, pid, per_page, page=1, series_id=None):
variables = {
'id': pid,
'page': page,
'perPage': per_page,
}
if series_id:
variables['sliceId'] = series_id
return self._download_json(
'https://graph.ibl.api.bbc.co.uk/', pid, headers={
'Content-Type': 'application/json'
}, data=json.dumps({
'id': '5692d93d5aac8d796a0305e895e61551',
'variables': variables,
}).encode('utf-8'))['data']['programme']
@staticmethod
def _get_playlist_data(data):
return data
def _get_playlist_title(self, data):
return self._get_default(data, 'title')
class BBCCoUkIPlayerGroupIE(BBCCoUkIPlayerPlaylistBaseIE):
IE_NAME = 'bbc.co.uk:iplayer:group'
_VALID_URL = BBCCoUkIPlayerPlaylistBaseIE._VALID_URL_TMPL % 'group'
_TESTS = [{
# Available for over a year unlike 30 days for most other programmes # Available for over a year unlike 30 days for most other programmes
'url': 'http://www.bbc.co.uk/iplayer/group/p02tcc32', 'url': 'http://www.bbc.co.uk/iplayer/group/p02tcc32',
'info_dict': { 'info_dict': {
@@ -1494,56 +1361,14 @@ class BBCCoUkIPlayerGroupIE(BBCCoUkIPlayerPlaylistBaseIE):
'description': 'md5:683e901041b2fe9ba596f2ab04c4dbe7', 'description': 'md5:683e901041b2fe9ba596f2ab04c4dbe7',
}, },
'playlist_mincount': 10, 'playlist_mincount': 10,
}, {
# all pages
'url': 'https://www.bbc.co.uk/iplayer/group/p081d7j7',
'info_dict': {
'id': 'p081d7j7',
'title': 'Music in Scotland',
'description': 'Perfomances in Scotland and programmes featuring Scottish acts.',
},
'playlist_mincount': 47,
}, {
# explicit page
'url': 'https://www.bbc.co.uk/iplayer/group/p081d7j7?page=2',
'info_dict': {
'id': 'p081d7j7',
'title': 'Music in Scotland',
'description': 'Perfomances in Scotland and programmes featuring Scottish acts.',
},
'playlist_mincount': 11,
}] }]
_PAGE_SIZE = 200
_DESCRIPTION_KEY = 'synopses'
def _get_episode_image(self, episode): def _extract_title_and_description(self, webpage):
return self._get_default(episode, 'images', 'standard') title = self._search_regex(r'<h1>([^<]+)</h1>', webpage, 'title', fatal=False)
description = self._search_regex(
def _get_episode_field(self, episode, field): r'<p[^>]+class=(["\'])subtitle\1[^>]*>(?P<value>[^<]+)</p>',
return episode.get(field) webpage, 'description', fatal=False, group='value')
return title, description
@staticmethod
def _get_elements(data):
return data['elements']
@staticmethod
def _get_episode(element):
return element
def _call_api(self, pid, per_page, page=1, series_id=None):
return self._download_json(
'http://ibl.api.bbc.co.uk/ibl/v1/groups/%s/episodes' % pid,
pid, query={
'page': page,
'per_page': per_page,
})['group_episodes']
@staticmethod
def _get_playlist_data(data):
return data['group']
def _get_playlist_title(self, data):
return data.get('title')
class BBCCoUkPlaylistIE(BBCCoUkPlaylistBaseIE): class BBCCoUkPlaylistIE(BBCCoUkPlaylistBaseIE):


@@ -108,8 +108,7 @@
from .bbc import ( from .bbc import (
BBCCoUkIE, BBCCoUkIE,
BBCCoUkArticleIE, BBCCoUkArticleIE,
BBCCoUkIPlayerEpisodesIE, BBCCoUkIPlayerPlaylistIE,
BBCCoUkIPlayerGroupIE,
BBCCoUkPlaylistIE, BBCCoUkPlaylistIE,
BBCIE, BBCIE,
) )
@@ -1674,14 +1673,9 @@
ZattooLiveIE, ZattooLiveIE,
) )
from .zdf import ZDFIE, ZDFChannelIE from .zdf import ZDFIE, ZDFChannelIE
from .zee5 import (
Zee5IE,
Zee5SeriesIE,
)
from .zhihu import ZhihuIE from .zhihu import ZhihuIE
from .zingmp3 import ( from .zingmp3 import ZingMp3IE
ZingMp3IE, from .zee5 import Zee5IE
ZingMp3AlbumIE, from .zee5 import Zee5SeriesIE
)
from .zoom import ZoomIE from .zoom import ZoomIE
from .zype import ZypeIE from .zype import ZypeIE


@@ -2965,7 +2965,7 @@ def _real_extract(self, url):
webpage) webpage)
if not mobj: if not mobj:
mobj = re.search( mobj = re.search(
r'data-video-link=["\'](?P<url>http://m\.mlb\.com/video/[^"\']+)', r'data-video-link=["\'](?P<url>http://m.mlb.com/video/[^"\']+)',
webpage) webpage)
if mobj is not None: if mobj is not None:
return self.url_result(mobj.group('url'), 'MLB') return self.url_result(mobj.group('url'), 'MLB')


@@ -112,7 +112,7 @@ def random_string():
'client_id': self._CLIENT_ID, 'client_id': self._CLIENT_ID,
'redirect_uri': self._ORIGIN_URL, 'redirect_uri': self._ORIGIN_URL,
'tenant': 'lacausers', 'tenant': 'lacausers',
'connection': 'Username-Password-ACG-Proxy', 'connection': 'Username-Password-Authentication',
'username': username, 'username': username,
'password': password, 'password': password,
'sso': 'true', 'sso': 'true',


@@ -340,7 +340,7 @@ class MTVServicesEmbeddedIE(MTVServicesInfoExtractor):
@staticmethod @staticmethod
def _extract_url(webpage): def _extract_url(webpage):
mobj = re.search( mobj = re.search(
r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//media\.mtvnservices\.com/embed/.+?)\1', webpage) r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//media.mtvnservices.com/embed/.+?)\1', webpage)
if mobj: if mobj:
return mobj.group('url') return mobj.group('url')


@@ -492,12 +492,13 @@ def get_video_info_xml(items):
self._sort_formats(formats) self._sort_formats(formats)
# Start extracting information # Start extracting information
title = ( title = get_video_info_web('originalTitle')
get_video_info_web(['originalTitle', 'title']) if not title:
or self._og_search_title(webpage, default=None) title = self._og_search_title(webpage, default=None)
or self._html_search_regex( if not title:
title = self._html_search_regex(
r'<span[^>]+class="videoHeaderTitle"[^>]*>([^<]+)</span>', r'<span[^>]+class="videoHeaderTitle"[^>]*>([^<]+)</span>',
webpage, 'video title')) webpage, 'video title')
watch_api_data_string = self._html_search_regex( watch_api_data_string = self._html_search_regex(
r'<div[^>]+id="watchAPIDataContainer"[^>]+>([^<]+)</div>', r'<div[^>]+id="watchAPIDataContainer"[^>]+>([^<]+)</div>',


@@ -143,10 +143,7 @@ def _real_extract(self, url):
props_data = try_get(json_data, lambda x: x['props'], expected_type=dict) props_data = try_get(json_data, lambda x: x['props'], expected_type=dict)
# Check statusCode for success # Check statusCode for success
status = props_data.get('pageProps').get('statusCode') if props_data.get('pageProps').get('statusCode') == 0:
if status == 0:
return self._extract_aweme(props_data, webpage, url) return self._extract_aweme(props_data, webpage, url)
elif status == 10216:
raise ExtractorError('This video is private', expected=True)
raise ExtractorError('Video not available', video_id=video_id) raise ExtractorError('Video not available', video_id=video_id)


@@ -23,8 +23,6 @@ class VGTVIE(XstreamIE):
'fvn.no/fvntv': 'fvntv', 'fvn.no/fvntv': 'fvntv',
'aftenposten.no/webtv': 'aptv', 'aftenposten.no/webtv': 'aptv',
'ap.vgtv.no/webtv': 'aptv', 'ap.vgtv.no/webtv': 'aptv',
'tv.aftonbladet.se': 'abtv',
# obsolete URL schemas, kept in order to save one HTTP redirect
'tv.aftonbladet.se/abtv': 'abtv', 'tv.aftonbladet.se/abtv': 'abtv',
'www.aftonbladet.se/tv': 'abtv', 'www.aftonbladet.se/tv': 'abtv',
} }
@@ -142,10 +140,6 @@ class VGTVIE(XstreamIE):
'url': 'http://www.vgtv.no/#!/video/127205/inside-the-mind-of-favela-funk', 'url': 'http://www.vgtv.no/#!/video/127205/inside-the-mind-of-favela-funk',
'only_matching': True, 'only_matching': True,
}, },
{
'url': 'https://tv.aftonbladet.se/video/36015/vulkanutbrott-i-rymden-nu-slapper-nasa-bilderna',
'only_matching': True,
},
{ {
'url': 'http://tv.aftonbladet.se/abtv/articles/36015', 'url': 'http://tv.aftonbladet.se/abtv/articles/36015',
'only_matching': True, 'only_matching': True,


@@ -1947,7 +1947,7 @@ def feed_entry(name):
f['format_id'] = itag f['format_id'] = itag
formats.append(f) formats.append(f)
if self._downloader.params.get('youtube_include_dash_manifest', True): if self._downloader.params.get('youtube_include_dash_manifest'):
dash_manifest_url = streaming_data.get('dashManifestUrl') dash_manifest_url = streaming_data.get('dashManifestUrl')
if dash_manifest_url: if dash_manifest_url:
for f in self._extract_mpd_formats( for f in self._extract_mpd_formats(
@@ -2150,7 +2150,6 @@ def process_language(container, base_url, lang_code, query):
# This will error if there is no livechat # This will error if there is no livechat
initial_data['contents']['twoColumnWatchNextResults']['conversationBar']['liveChatRenderer']['continuations'][0]['reloadContinuationData']['continuation'] initial_data['contents']['twoColumnWatchNextResults']['conversationBar']['liveChatRenderer']['continuations'][0]['reloadContinuationData']['continuation']
info['subtitles']['live_chat'] = [{ info['subtitles']['live_chat'] = [{
'url': 'https://www.youtube.com/watch?v=%s' % video_id, # url is needed to set cookies
'video_id': video_id, 'video_id': video_id,
'ext': 'json', 'ext': 'json',
'protocol': 'youtube_live_chat_replay', 'protocol': 'youtube_live_chat_replay',


@@ -1,94 +1,93 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none, int_or_none,
update_url_query,
) )
class ZingMp3BaseIE(InfoExtractor): class ZingMp3BaseInfoExtractor(InfoExtractor):
_VALID_URL_TMPL = r'https?://(?:mp3\.zing|zingmp3)\.vn/(?:%s)/[^/]+/(?P<id>\w+)\.html'
_GEO_COUNTRIES = ['VN']
def _extract_item(self, item, fatal): def _extract_item(self, item, page_type, fatal=True):
item_id = item['id'] error_message = item.get('msg')
title = item.get('name') or item['title'] if error_message:
formats = []
for k, v in (item.get('source') or {}).items():
if not v:
continue
if k in ('mp4', 'hls'):
for res, video_url in v.items():
if not video_url:
continue
if k == 'hls':
formats.extend(self._extract_m3u8_formats(
video_url, item_id, 'mp4',
'm3u8_native', m3u8_id=k, fatal=False))
elif k == 'mp4':
formats.append({
'format_id': 'mp4-' + res,
'url': video_url,
'height': int_or_none(self._search_regex(
r'^(\d+)p', res, 'resolution', default=None)),
})
else:
formats.append({
'ext': 'mp3',
'format_id': k,
'tbr': int_or_none(k),
'url': self._proto_relative_url(v),
'vcodec': 'none',
})
if not formats:
if not fatal: if not fatal:
return return
msg = item['msg'] raise ExtractorError(
if msg == 'Sorry, this content is not available in your country.': '%s returned error: %s' % (self.IE_NAME, error_message),
self.raise_geo_restricted(countries=self._GEO_COUNTRIES) expected=True)
raise ExtractorError(msg, expected=True)
self._sort_formats(formats)
subtitles = None formats = []
lyric = item.get('lyric') for quality, source_url in zip(item.get('qualities') or item.get('quality', []), item.get('source_list') or item.get('source', [])):
if lyric: if not source_url or source_url == 'require vip':
subtitles = { continue
'origin': [{ if not re.match(r'https?://', source_url):
'url': lyric, source_url = '//' + source_url
}], source_url = self._proto_relative_url(source_url, 'http:')
quality_num = int_or_none(quality)
f = {
'format_id': quality,
'url': source_url,
} }
if page_type == 'video':
f.update({
'height': quality_num,
'ext': 'mp4',
})
else:
f.update({
'abr': quality_num,
'ext': 'mp3',
})
formats.append(f)
album = item.get('album') or {} cover = item.get('cover')
return { return {
'id': item_id, 'title': (item.get('name') or item.get('title')).strip(),
'title': title,
'formats': formats, 'formats': formats,
'thumbnail': item.get('thumbnail'), 'thumbnail': 'http:/' + cover if cover else None,
'subtitles': subtitles, 'artist': item.get('artist'),
'duration': int_or_none(item.get('duration')),
'track': title,
'artist': item.get('artists_names'),
'album': album.get('name') or album.get('title'),
'album_artist': album.get('artists_names'),
} }
def _real_extract(self, url): def _extract_player_json(self, player_json_url, id, page_type, playlist_title=None):
page_id = self._match_id(url) player_json = self._download_json(player_json_url, id, 'Downloading Player JSON')
webpage = self._download_webpage( items = player_json['data']
url.replace('://zingmp3.vn/', '://mp3.zing.vn/'), if 'item' in items:
page_id, query={'play_song': 1}) items = items['item']
data_path = self._search_regex(
r'data-xml="([^"]+)', webpage, 'data path') if len(items) == 1:
return self._process_data(self._download_json( # one single song
'https://mp3.zing.vn/xhr' + data_path, page_id)['data']) data = self._extract_item(items[0], page_type)
data['id'] = id
return data
else:
# playlist of songs
entries = []
for i, item in enumerate(items, 1):
entry = self._extract_item(item, page_type, fatal=False)
if not entry:
continue
entry['id'] = '%s-%d' % (id, i)
entries.append(entry)
return {
'_type': 'playlist',
'id': id,
'title': playlist_title,
'entries': entries,
}
class ZingMp3IE(ZingMp3BaseIE): class ZingMp3IE(ZingMp3BaseInfoExtractor):
_VALID_URL = ZingMp3BaseIE._VALID_URL_TMPL % 'bai-hat|video-clip' _VALID_URL = r'https?://mp3\.zing\.vn/(?:bai-hat|album|playlist|video-clip)/[^/]+/(?P<id>\w+)\.html'
_TESTS = [{ _TESTS = [{
'url': 'http://mp3.zing.vn/bai-hat/Xa-Mai-Xa-Bao-Thy/ZWZB9WAB.html', 'url': 'http://mp3.zing.vn/bai-hat/Xa-Mai-Xa-Bao-Thy/ZWZB9WAB.html',
'md5': 'ead7ae13693b3205cbc89536a077daed', 'md5': 'ead7ae13693b3205cbc89536a077daed',
@@ -96,66 +95,49 @@ class ZingMp3IE(ZingMp3BaseIE):
'id': 'ZWZB9WAB', 'id': 'ZWZB9WAB',
'title': 'Xa Mãi Xa', 'title': 'Xa Mãi Xa',
'ext': 'mp3', 'ext': 'mp3',
'thumbnail': r're:^https?://.+\.jpg', 'thumbnail': r're:^https?://.*\.jpg$',
'subtitles': {
'origin': [{
'ext': 'lrc',
}]
},
'duration': 255,
'track': 'Xa Mãi Xa',
'artist': 'Bảo Thy',
'album': 'Special Album',
'album_artist': 'Bảo Thy',
}, },
}, { }, {
'url': 'https://mp3.zing.vn/video-clip/Suong-Hoa-Dua-Loi-K-ICM-RYO/ZO8ZF7C7.html', 'url': 'http://mp3.zing.vn/video-clip/Let-It-Go-Frozen-OST-Sungha-Jung/ZW6BAEA0.html',
'md5': 'e9c972b693aa88301ef981c8151c4343', 'md5': '870295a9cd8045c0e15663565902618d',
'info_dict': { 'info_dict': {
'id': 'ZO8ZF7C7', 'id': 'ZW6BAEA0',
'title': 'Sương Hoa Đưa Lối', 'title': 'Let It Go (Frozen OST)',
'ext': 'mp4', 'ext': 'mp4',
'thumbnail': r're:^https?://.+\.jpg',
'duration': 207,
'track': 'Sương Hoa Đưa Lối',
'artist': 'K-ICM, RYO',
}, },
}, { }, {
'url': 'https://zingmp3.vn/bai-hat/Xa-Mai-Xa-Bao-Thy/ZWZB9WAB.html', 'url': 'http://mp3.zing.vn/album/Lau-Dai-Tinh-Ai-Bang-Kieu-Minh-Tuyet/ZWZBWDAF.html',
'info_dict': {
'_type': 'playlist',
'id': 'ZWZBWDAF',
'title': 'Lâu Đài Tình Ái - Bằng Kiều,Minh Tuyết | Album 320 lossless',
},
'playlist_count': 10,
'skip': 'removed at the request of the owner',
}, {
'url': 'http://mp3.zing.vn/playlist/Duong-Hong-Loan-apollobee/IWCAACCB.html',
'only_matching': True, 'only_matching': True,
}] }]
IE_NAME = 'zingmp3' IE_NAME = 'zingmp3'
IE_DESC = 'mp3.zing.vn' IE_DESC = 'mp3.zing.vn'
def _process_data(self, data): def _real_extract(self, url):
return self._extract_item(data, True) page_id = self._match_id(url)
webpage = self._download_webpage(url, page_id)
class ZingMp3AlbumIE(ZingMp3BaseIE): player_json_url = self._search_regex([
_VALID_URL = ZingMp3BaseIE._VALID_URL_TMPL % 'album|playlist' r'data-xml="([^"]+)',
_TESTS = [{ r'&amp;xmlURL=([^&]+)&'
'url': 'http://mp3.zing.vn/album/Lau-Dai-Tinh-Ai-Bang-Kieu-Minh-Tuyet/ZWZBWDAF.html', ], webpage, 'player xml url')
'info_dict': {
'_type': 'playlist',
'id': 'ZWZBWDAF',
'title': 'Lâu Đài Tình Ái',
},
'playlist_count': 10,
}, {
'url': 'http://mp3.zing.vn/playlist/Duong-Hong-Loan-apollobee/IWCAACCB.html',
'only_matching': True,
}, {
'url': 'https://zingmp3.vn/album/Lau-Dai-Tinh-Ai-Bang-Kieu-Minh-Tuyet/ZWZBWDAF.html',
'only_matching': True,
}]
IE_NAME = 'zingmp3:album'
def _process_data(self, data): playlist_title = None
def entries(): page_type = self._search_regex(r'/(?:html5)?xml/([^/-]+)', player_json_url, 'page type')
for item in (data.get('items') or []): if page_type == 'video':
entry = self._extract_item(item, False) player_json_url = update_url_query(player_json_url, {'format': 'json'})
if entry: else:
yield entry player_json_url = player_json_url.replace('/xml/', '/html5xml/')
info = data.get('info') or {} if page_type == 'album':
return self.playlist_result( playlist_title = self._og_search_title(webpage)
entries(), info.get('id'), info.get('name') or info.get('title'))
return self._extract_player_json(player_json_url, page_id, page_type, playlist_title)


@@ -1,68 +1,82 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
int_or_none, int_or_none,
js_to_json, url_or_none,
parse_filesize, parse_filesize,
urlencode_postdata, urlencode_postdata
) )
class ZoomIE(InfoExtractor): class ZoomIE(InfoExtractor):
IE_NAME = 'zoom' IE_NAME = 'zoom'
_VALID_URL = r'(?P<base_url>https?://(?:[^.]+\.)?zoom.us/)rec(?:ording)?/(?:play|share)/(?P<id>[A-Za-z0-9_.-]+)' _VALID_URL = r'https://(?:.*).?zoom.us/rec(?:ording)?/(play|share)/(?P<id>[A-Za-z0-9\-_.]+)'
_TEST = { _TEST = {
'url': 'https://economist.zoom.us/rec/play/dUk_CNBETmZ5VA2BwEl-jjakPpJ3M1pcfVYAPRsoIbEByGsLjUZtaa4yCATQuOL3der8BlTwxQePl_j0.EImBkXzTIaPvdZO5', 'url': 'https://zoom.us/recording/play/SILVuCL4bFtRwWTtOCFQQxAsBQsJljFtm9e4Z_bvo-A8B-nzUSYZRNuPl3qW5IGK',
'md5': 'ab445e8c911fddc4f9adc842c2c5d434',
'info_dict': { 'info_dict': {
'id': 'dUk_CNBETmZ5VA2BwEl-jjakPpJ3M1pcfVYAPRsoIbEByGsLjUZtaa4yCATQuOL3der8BlTwxQePl_j0.EImBkXzTIaPvdZO5', 'md5': '031a5b379f1547a8b29c5c4c837dccf2',
'ext': 'mp4', 'title': "GAZ Transformational Tuesdays W/ Landon & Stapes",
'title': 'China\'s "two sessions" and the new five-year plan', 'id': "SILVuCL4bFtRwWTtOCFQQxAsBQsJljFtm9e4Z_bvo-A8B-nzUSYZRNuPl3qW5IGK",
'ext': "mp4"
} }
} }
def _real_extract(self, url): def _real_extract(self, url):
base_url, play_id = re.match(self._VALID_URL, url).groups() display_id = self._match_id(url)
webpage = self._download_webpage(url, play_id) webpage = self._download_webpage(url, display_id)
try: password_protected = self._search_regex(r'<form[^>]+?id="(password_form)"', webpage, 'password field', fatal=False, default=None)
form = self._form_hidden_inputs('password_form', webpage) if password_protected is not None:
except ExtractorError: self._verify_video_password(url, display_id, webpage)
form = None webpage = self._download_webpage(url, display_id)
if form:
password = self._downloader.params.get('videopassword')
if not password:
raise ExtractorError(
'This video is protected by a passcode, use the --video-password option', expected=True)
is_meeting = form.get('useWhichPasswd') == 'meeting'
validation = self._download_json(
base_url + 'rec/validate%s_passwd' % ('_meet' if is_meeting else ''),
play_id, 'Validating passcode', 'Wrong passcode', data=urlencode_postdata({
'id': form[('meet' if is_meeting else 'file') + 'Id'],
'passwd': password,
'action': form.get('action'),
}))
if not validation.get('status'):
raise ExtractorError(validation['errorMessage'], expected=True)
webpage = self._download_webpage(url, play_id)
data = self._parse_json(self._search_regex( video_url = self._search_regex(r"viewMp4Url: \'(.*)\'", webpage, 'video url')
r'(?s)window\.__data__\s*=\s*({.+?});', title = self._html_search_regex([r"topic: \"(.*)\",", r"<title>(.*) - Zoom</title>"], webpage, 'title')
webpage, 'data'), play_id, js_to_json) viewResolvtionsWidth = self._search_regex(r"viewResolvtionsWidth: (\d*)", webpage, 'res width', fatal=False)
viewResolvtionsHeight = self._search_regex(r"viewResolvtionsHeight: (\d*)", webpage, 'res height', fatal=False)
fileSize = parse_filesize(self._search_regex(r"fileSize: \'(.+)\'", webpage, 'fileSize', fatal=False))
urlprefix = url.split("zoom.us")[0] + "zoom.us/"
formats = []
formats.append({
'url': url_or_none(video_url),
'width': int_or_none(viewResolvtionsWidth),
'height': int_or_none(viewResolvtionsHeight),
'http_headers': {'Accept': 'video/webm,video/ogg,video/*;q=0.9,application/ogg;q=0.7,audio/*;q=0.6,*/*;q=0.5',
'Referer': urlprefix},
'ext': "mp4",
'filesize_approx': int_or_none(fileSize)
})
self._sort_formats(formats)
return { return {
'id': play_id, 'id': display_id,
'title': data['topic'], 'title': title,
'url': data['viewMp4Url'], 'formats': formats
'width': int_or_none(data.get('viewResolvtionsWidth')),
'height': int_or_none(data.get('viewResolvtionsHeight')),
'http_headers': {
'Referer': base_url,
},
'filesize_approx': parse_filesize(data.get('fileSize')),
} }
def _verify_video_password(self, url, video_id, webpage):
password = self._downloader.params.get('videopassword')
if password is None:
raise ExtractorError('This video is protected by a password, use the --video-password option', expected=True)
meetId = self._search_regex(r'<input[^>]+?id="meetId" value="([^\"]+)"', webpage, 'meetId')
data = urlencode_postdata({
'id': meetId,
'passwd': password,
'action': "viewdetailedpage",
'recaptcha': ""
})
validation_url = url.split("zoom.us")[0] + "zoom.us/rec/validate_meet_passwd"
validation_response = self._download_json(
validation_url, video_id,
note='Validating Password...',
errnote='Wrong password?',
data=data)
if validation_response['errorCode'] != 0:
raise ExtractorError('Login failed, %s said: %r' % (self.IE_NAME, validation_response['errorMessage']))
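
Note on the password check above: the right-hand version derives the validation endpoint from the recording URL and posts a small form to it. The sketch below only builds that endpoint and form body and sends nothing. `build_validation_request` is a hypothetical helper name, the example URL and values are placeholders, and `urllib.parse.urlencode(...).encode('ascii')` is assumed here as a stand-in for `urlencode_postdata`.

```python
from urllib.parse import urlencode


def build_validation_request(url, meet_id, password):
    """Return (endpoint, body) for the passcode check; no network access."""
    # Same prefix logic as in the diff: keep everything up to "zoom.us".
    base = url.split('zoom.us')[0] + 'zoom.us/'
    # Field names copied from the diff; the values are caller-supplied.
    body = urlencode({
        'id': meet_id,
        'passwd': password,
        'action': 'viewdetailedpage',
        'recaptcha': '',
    }).encode('ascii')
    return base + 'rec/validate_meet_passwd', body


endpoint, body = build_validation_request(
    'https://example.zoom.us/rec/play/abc', '123', 'secret')
```

Splitting on the literal string "zoom.us" keeps any subdomain in the prefix, which is presumably why both the Referer header and the validation URL are built this way.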


@@ -1147,18 +1147,13 @@ def _dict_from_multiple_values_options_callback(
metavar='FIELD:FORMAT', dest='metafromfield', action='append', metavar='FIELD:FORMAT', dest='metafromfield', action='append',
help=( help=(
'Parse additional metadata like title/artist from other fields. ' 'Parse additional metadata like title/artist from other fields. '
'Give a template or field name to extract data from and the ' 'Give field name to extract data from, and format of the field separated by a ":". '
'format to interpret it as, separated by a ":". '
'Either regular expression with named capture groups or a ' 'Either regular expression with named capture groups or a '
'similar syntax to the output template can be used for the FORMAT. ' 'similar syntax to the output template can also be used. '
'Similarly, the syntax for output template can be used for FIELD ' 'The parsed parameters replace any existing values and can be used in output template. '
'to parse the data from multiple fields. '
'The parsed parameters replace any existing values and can be used in output templates. '
'This option can be used multiple times. ' 'This option can be used multiple times. '
'Example: --parse-metadata "title:%(artist)s - %(title)s" matches a title like ' 'Example: --parse-metadata "title:%(artist)s - %(title)s" matches a title like '
'"Coldplay - Paradise". ' '"Coldplay - Paradise". '
'Example: --parse-metadata "%(series)s %(episode_number)s:%(title)s" '
'sets the title using series and episode number. '
'Example (regex): --parse-metadata "description:Artist - (?P<artist>.+?)"')) 'Example (regex): --parse-metadata "description:Artist - (?P<artist>.+?)"'))
postproc.add_option( postproc.add_option(
'--xattrs', '--xattrs',


@@ -4,10 +4,11 @@
from .common import PostProcessor from .common import PostProcessor
from ..compat import compat_str from ..compat import compat_str
from ..utils import str_or_none
class MetadataFromFieldPP(PostProcessor): class MetadataFromFieldPP(PostProcessor):
regex = r'(?P<in>.+):(?P<out>.+)$' regex = r'(?P<field>\w+):(?P<format>.+)$'
def __init__(self, downloader, formats): def __init__(self, downloader, formats):
PostProcessor.__init__(self, downloader) PostProcessor.__init__(self, downloader)
@@ -18,20 +19,11 @@ def __init__(self, downloader, formats):
match = re.match(self.regex, f) match = re.match(self.regex, f)
assert match is not None assert match is not None
self._data.append({ self._data.append({
'in': match.group('in'), 'field': match.group('field'),
'out': match.group('out'), 'format': match.group('format'),
'tmpl': self.field_to_template(match.group('in')), 'regex': self.format_to_regex(match.group('format'))})
'regex': self.format_to_regex(match.group('out')),
})
@staticmethod def format_to_regex(self, fmt):
def field_to_template(tmpl):
if re.match(r'\w+$', tmpl):
return '%%(%s)s' % tmpl
return tmpl
@staticmethod
def format_to_regex(fmt):
r""" r"""
Converts a string like Converts a string like
'%(title)s - %(artist)s' '%(title)s - %(artist)s'
@@ -45,7 +37,7 @@ def format_to_regex(fmt):
# replace %(..)s with regex group and escape other string parts # replace %(..)s with regex group and escape other string parts
for match in re.finditer(r'%\((\w+)\)s', fmt): for match in re.finditer(r'%\((\w+)\)s', fmt):
regex += re.escape(fmt[lastpos:match.start()]) regex += re.escape(fmt[lastpos:match.start()])
regex += r'(?P<%s>[^\r\n]+)' % match.group(1) regex += r'(?P<' + match.group(1) + r'>[^\r\n]+)'
lastpos = match.end() lastpos = match.end()
if lastpos < len(fmt): if lastpos < len(fmt):
regex += re.escape(fmt[lastpos:]) regex += re.escape(fmt[lastpos:])
@@ -53,16 +45,22 @@ def format_to_regex(fmt):
def run(self, info): def run(self, info):
for dictn in self._data: for dictn in self._data:
tmpl, info_copy = self._downloader.prepare_outtmpl(dictn['tmpl'], info) field, regex = dictn['field'], dictn['regex']
data_to_parse = tmpl % info_copy if field not in info:
self.write_debug('Searching for r"%s" in %s' % (dictn['regex'], tmpl)) self.report_warning('Video does not have a %s' % field)
match = re.search(dictn['regex'], data_to_parse) continue
data_to_parse = str_or_none(info[field])
if data_to_parse is None:
self.report_warning('Field %s cannot be parsed' % field)
continue
self.write_debug('Searching for r"%s" in %s' % (regex, field))
match = re.search(regex, data_to_parse)
if match is None: if match is None:
self.report_warning('Could not interpret video %s as "%s"' % (dictn['in'], dictn['out'])) self.report_warning('Could not interpret video %s as "%s"' % (field, dictn['format']))
continue continue
for attribute, value in match.groupdict().items(): for attribute, value in match.groupdict().items():
info[attribute] = value info[attribute] = value
self.to_screen('parsed %s from "%s": %s' % (attribute, dictn['in'], value if value is not None else 'NA')) self.to_screen('parsed %s from %s: %s' % (attribute, field, value if value is not None else 'NA'))
return [], info return [], info
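
Note on `format_to_regex` above: the hunks show only the loop that turns each `%(name)s` placeholder into a named capture group. The standalone sketch below fills in the parts the diff does not show (the initialisation, the return, and an early return for strings without placeholders), so those parts are assumptions rather than the project's exact code.

```python
import re


def format_to_regex(fmt):
    """Convert '%(title)s - %(artist)s' style text into a named-group regex."""
    # Assumption: a string with no %(..)s fields is already a regex.
    if not re.search(r'%\(\w+\)s', fmt):
        return fmt
    lastpos = 0
    regex = ''
    # Replace %(..)s with a regex group and escape the other string parts.
    for match in re.finditer(r'%\((\w+)\)s', fmt):
        regex += re.escape(fmt[lastpos:match.start()])
        regex += r'(?P<%s>[^\r\n]+)' % match.group(1)
        lastpos = match.end()
    if lastpos < len(fmt):
        regex += re.escape(fmt[lastpos:])
    return regex


# The example from the --parse-metadata help text.
pattern = format_to_regex('%(artist)s - %(title)s')
parsed = re.search(pattern, 'Coldplay - Paradise').groupdict()
```

Here `parsed` maps `artist` to `Coldplay` and `title` to `Paradise`, which are the values `run` would copy into the info dict.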


@@ -2423,15 +2423,6 @@ def __init__(self, msg, exc_info=None):
self.exc_info = exc_info self.exc_info = exc_info
class EntryNotInPlaylist(YoutubeDLError):
"""Entry not in playlist exception.
This exception will be thrown by YoutubeDL when a requested entry
is not found in the playlist info_dict
"""
pass
class SameFileError(YoutubeDLError): class SameFileError(YoutubeDLError):
"""Same File exception. """Same File exception.
@@ -4205,20 +4196,6 @@ def q(qid):
'pl_infojson': 'info.json', 'pl_infojson': 'info.json',
} }
# As of [1] format syntax is:
# %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
# 1. https://docs.python.org/2/library/stdtypes.html#string-formatting
FORMAT_RE = r'''(?x)
(?<!%)
%
\({0}\) # mapping key
(?:[#0\-+ ]+)? # conversion flags (optional)
(?:\d+)? # minimum field width (optional)
(?:\.\d+)? # precision (optional)
[hlL]? # length modifier (optional)
(?P<type>[diouxXeEfFgGcrs%]) # conversion type
'''
def limit_length(s, length): def limit_length(s, length):
""" Add ellipses to overly long strings """ """ Add ellipses to overly long strings """
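
Note on `FORMAT_RE` above: the `{0}` slot is filled with a mapping key through `str.format` before the pattern is compiled. The sketch below reproduces the pattern from the hunk and shows one way it might be used. `conversion_type` is a hypothetical helper, not a function from the project.

```python
import re

# Pattern copied from the hunk above; {0} is the mapping-key slot.
FORMAT_RE = r'''(?x)
    (?<!%)
    %
    \({0}\)  # mapping key
    (?:[#0\-+ ]+)?  # conversion flags (optional)
    (?:\d+)?  # minimum field width (optional)
    (?:\.\d+)?  # precision (optional)
    [hlL]?  # length modifier (optional)
    (?P<type>[diouxXeEfFgGcrs%])  # conversion type
'''


def conversion_type(template, key):
    """Return the conversion type used for `key` in `template`, or None."""
    match = re.search(FORMAT_RE.format(re.escape(key)), template)
    return match.group('type') if match else None


numeric = conversion_type('%(view_count)05d - %(title)s', 'view_count')
escaped = conversion_type('100%%(title)s', 'title')
```

The negative lookbehind `(?<!%)` is what keeps an escaped `%%` from being read as the start of a field, so `escaped` is `None` while `numeric` is `'d'`.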


@@ -1,3 +1,3 @@
from __future__ import unicode_literals from __future__ import unicode_literals
__version__ = '2021.03.24' __version__ = '2021.03.15'