release 2018.11.23

[ChangeLog] Actualize
[ci skip]
2026-06-13 08:00:11 +00:00 · 2018-11-23 00:16:45 +07:00 · 2018-11-23 00:14:43 +07:00 · 2018-11-21 23:25:38 +01:00 · 2018-11-21 23:21:05 +01:00 · 2018-11-22 02:34:35 +07:00
20 changed files with 304 additions and 144 deletions
@@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.11.18*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.11.18**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2018.11.23*. If it's not, read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2018.11.23**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through the [README](https://github.com/rg3/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -36,7 +36,7 @@ Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2018.11.18
+[debug] youtube-dl version 2018.11.23
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
@@ -15,6 +15,18 @@ env:
  - YTDL_TEST_SET=download
 matrix:
  include:
+    - python: 3.7
+      dist: xenial
+      env: YTDL_TEST_SET=core
+    - python: 3.7
+      dist: xenial
+      env: YTDL_TEST_SET=download
+    - python: 3.8-dev
+      dist: xenial
+      env: YTDL_TEST_SET=core
+    - python: 3.8-dev
+      dist: xenial
+      env: YTDL_TEST_SET=download
    - env: JYTHON=true; YTDL_TEST_SET=core
    - env: JYTHON=true; YTDL_TEST_SET=download
  fast_finish: true
@@ -1,3 +1,23 @@
+version 2018.11.23
+
+Core
+ [setup.py] Add more relevant classifiers
+
+Extractors
+* [mixcloud] Fallback to hardcoded decryption key (#18016)
+* [nbc:news] Fix article extraction (#16194)
+* [foxsports] Fix extraction (#17543)
+* [loc] Relax regular expression and improve formats extraction
+ [ciscolive] Add support for ciscolive.cisco.com (#17984)
+* [nzz] Relax kaltura regex (#18228)
+* [sixplay] Fix formats extraction
+* [bitchute] Improve title extraction
+* [kaltura] Limit requested MediaEntry fields
+ [americastestkitchen] Add support for zype embeds (#18225)
+ [pornhub] Add pornhub.net alias
+* [nova:embed] Fix extraction (#18222)
+
+
 version 2018.11.18

 Extractors
@@ -163,6 +163,8 @@
 - **chirbit**
 - **chirbit:profile**
 - **Cinchcast**
+ - **CiscoLiveSearch**
+ - **CiscoLiveSession**
 - **CJSW**
 - **cliphunter**
 - **Clippit**
@@ -124,6 +124,8 @@ setup(
        'Development Status :: 5 - Production/Stable',
        'Environment :: Console',
        'License :: Public Domain',
+        'Programming Language :: Python',
+        'Programming Language :: Python :: 2',
        'Programming Language :: Python :: 2.6',
        'Programming Language :: Python :: 2.7',
        'Programming Language :: Python :: 3',
@@ -132,6 +134,13 @@ setup(
        'Programming Language :: Python :: 3.4',
        'Programming Language :: Python :: 3.5',
        'Programming Language :: Python :: 3.6',
+        'Programming Language :: Python :: 3.7',
+        'Programming Language :: Python :: 3.8',
+        'Programming Language :: Python :: Implementation',
+        'Programming Language :: Python :: Implementation :: CPython',
+        'Programming Language :: Python :: Implementation :: IronPython',
+        'Programming Language :: Python :: Implementation :: Jython',
+        'Programming Language :: Python :: Implementation :: PyPy',
    ],

    cmdclass={'build_lazy_extractors': build_lazy_extractors},
@@ -43,10 +43,6 @@ class AmericasTestKitchenIE(InfoExtractor):

        webpage = self._download_webpage(url, video_id)

-        partner_id = self._search_regex(
-            r'src=["\'](?:https?:)?//(?:[^/]+\.)kaltura\.com/(?:[^/]+/)*(?:p|partner_id)/(\d+)',
-            webpage, 'kaltura partner id')
-
        video_data = self._parse_json(
            self._search_regex(
                r'window\.__INITIAL_STATE__\s*=\s*({.+?})\s*;\s*</script>',
@@ -58,7 +54,18 @@ class AmericasTestKitchenIE(InfoExtractor):
            (lambda x: x['episodeDetail']['content']['data'],
             lambda x: x['videoDetail']['content']['data']), dict)
        ep_meta = ep_data.get('full_video', {})
-        external_id = ep_data.get('external_id') or ep_meta['external_id']
+
+        zype_id = ep_meta.get('zype_id')
+        if zype_id:
+            embed_url = 'https://player.zype.com/embed/%s.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ' % zype_id
+            ie_key = 'Zype'
+        else:
+            partner_id = self._search_regex(
+                r'src=["\'](?:https?:)?//(?:[^/]+\.)kaltura\.com/(?:[^/]+/)*(?:p|partner_id)/(\d+)',
+                webpage, 'kaltura partner id')
+            external_id = ep_data.get('external_id') or ep_meta['external_id']
+            embed_url = 'kaltura:%s:%s' % (partner_id, external_id)
+            ie_key = 'Kaltura'

        title = ep_data.get('title') or ep_meta.get('title')
        description = clean_html(ep_meta.get('episode_description') or ep_data.get(
@@ -72,8 +79,8 @@ class AmericasTestKitchenIE(InfoExtractor):

        return {
            '_type': 'url_transparent',
-            'url': 'kaltura:%s:%s' % (partner_id, external_id),
-            'ie_key': 'Kaltura',
+            'url': embed_url,
+            'ie_key': ie_key,
            'title': title,
            'description': description,
            'thumbnail': thumbnail,
@@ -37,7 +37,7 @@ class BitChuteIE(InfoExtractor):
                'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.57 Safari/537.36',
            })

-        title = self._search_regex(
+        title = self._html_search_regex(
            (r'<[^>]+\bid=["\']video-title[^>]+>([^<]+)', r'<title>([^<]+)'),
            webpage, 'title', default=None) or self._html_search_meta(
            'description', webpage, 'title',
@@ -0,0 +1,142 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import itertools
+
+from .common import InfoExtractor
+from ..compat import (
+    compat_parse_qs,
+    compat_urllib_parse_urlparse,
+)
+from ..utils import (
+    clean_html,
+    float_or_none,
+    int_or_none,
+    try_get,
+    urlencode_postdata,
+)
+
+
+class CiscoLiveBaseIE(InfoExtractor):
+    # These appear to be constant across all Cisco Live presentations
+    # and are not tied to any user session or event
+    RAINFOCUS_API_URL = 'https://events.rainfocus.com/api/%s'
+    RAINFOCUS_API_PROFILE_ID = 'Na3vqYdAlJFSxhYTYQGuMbpafMqftalz'
+    RAINFOCUS_WIDGET_ID = 'n6l4Lo05R8fiy3RpUBm447dZN8uNWoye'
+    BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/5647924234001/SyK2FdqjM_default/index.html?videoId=%s'
+
+    HEADERS = {
+        'Origin': 'https://ciscolive.cisco.com',
+        'rfApiProfileId': RAINFOCUS_API_PROFILE_ID,
+        'rfWidgetId': RAINFOCUS_WIDGET_ID,
+    }
+
+    def _call_api(self, ep, rf_id, query, referrer, note=None):
+        headers = self.HEADERS.copy()
+        headers['Referer'] = referrer
+        return self._download_json(
+            self.RAINFOCUS_API_URL % ep, rf_id, note=note,
+            data=urlencode_postdata(query), headers=headers)
+
+    def _parse_rf_item(self, rf_item):
+        event_name = rf_item.get('eventName')
+        title = rf_item['title']
+        description = clean_html(rf_item.get('abstract'))
+        presenter_name = try_get(rf_item, lambda x: x['participants'][0]['fullName'])
+        bc_id = rf_item['videos'][0]['url']
+        bc_url = self.BRIGHTCOVE_URL_TEMPLATE % bc_id
+        duration = float_or_none(try_get(rf_item, lambda x: x['times'][0]['length']))
+        location = try_get(rf_item, lambda x: x['times'][0]['room'])
+
+        if duration:
+            duration = duration * 60
+
+        return {
+            '_type': 'url_transparent',
+            'url': bc_url,
+            'ie_key': 'BrightcoveNew',
+            'title': title,
+            'description': description,
+            'duration': duration,
+            'creator': presenter_name,
+            'location': location,
+            'series': event_name,
+        }
+
+
+class CiscoLiveSessionIE(CiscoLiveBaseIE):
+    _VALID_URL = r'https?://ciscolive\.cisco\.com/on-demand-library/\??[^#]*#/session/(?P<id>[^/?&]+)'
+    _TEST = {
+        'url': 'https://ciscolive.cisco.com/on-demand-library/?#/session/1423353499155001FoSs',
+        'md5': 'c98acf395ed9c9f766941c70f5352e22',
+        'info_dict': {
+            'id': '5803694304001',
+            'ext': 'mp4',
+            'title': '13 Smart Automations to Monitor Your Cisco IOS Network',
+            'description': 'md5:ec4a436019e09a918dec17714803f7cc',
+            'timestamp': 1530305395,
+            'upload_date': '20180629',
+            'uploader_id': '5647924234001',
+            'location': '16B Mezz.',
+        },
+    }
+
+    def _real_extract(self, url):
+        rf_id = self._match_id(url)
+        rf_result = self._call_api('session', rf_id, {'id': rf_id}, url)
+        return self._parse_rf_item(rf_result['items'][0])
+
+
+class CiscoLiveSearchIE(CiscoLiveBaseIE):
+    _VALID_URL = r'https?://ciscolive\.cisco\.com/on-demand-library/'
+    _TESTS = [{
+        'url': 'https://ciscolive.cisco.com/on-demand-library/?search.event=ciscoliveus2018&search.technicallevel=scpsSkillLevel_aintroductory&search.focus=scpsSessionFocus_designAndDeployment#/',
+        'info_dict': {
+            'title': 'Search query',
+        },
+        'playlist_count': 5,
+    }, {
+        'url': 'https://ciscolive.cisco.com/on-demand-library/?search.technology=scpsTechnology_applicationDevelopment&search.technology=scpsTechnology_ipv6&search.focus=scpsSessionFocus_troubleshootingTroubleshooting#/',
+        'only_matching': True,
+    }]
+
+    @classmethod
+    def suitable(cls, url):
+        return False if CiscoLiveSessionIE.suitable(url) else super(CiscoLiveSearchIE, cls).suitable(url)
+
+    @staticmethod
+    def _check_bc_id_exists(rf_item):
+        return int_or_none(try_get(rf_item, lambda x: x['videos'][0]['url'])) is not None
+
+    def _entries(self, query, url):
+        query['size'] = 50
+        query['from'] = 0
+        for page_num in itertools.count(1):
+            results = self._call_api(
+                'search', None, query, url,
+                'Downloading search JSON page %d' % page_num)
+            sl = try_get(results, lambda x: x['sectionList'][0], dict)
+            if sl:
+                results = sl
+            items = results.get('items')
+            if not items or not isinstance(items, list):
+                break
+            for item in items:
+                if not isinstance(item, dict):
+                    continue
+                if not self._check_bc_id_exists(item):
+                    continue
+                yield self._parse_rf_item(item)
+            size = int_or_none(results.get('size'))
+            if size is not None:
+                query['size'] = size
+            total = int_or_none(results.get('total'))
+            if total is not None and query['from'] + query['size'] > total:
+                break
+            query['from'] += query['size']
+
+    def _real_extract(self, url):
+        query = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
+        query['type'] = 'session'
+        return self.playlist_result(
+            self._entries(query, url), playlist_title='Search query')
@@ -194,6 +194,10 @@ from .chirbit import (
    ChirbitProfileIE,
 )
 from .cinchcast import CinchcastIE
+from .ciscolive import (
+    CiscoLiveSessionIE,
+    CiscoLiveSearchIE,
+)
 from .cjsw import CJSWIE
 from .cliphunter import CliphunterIE
 from .clippit import ClippitIE
@@ -1,43 +1,33 @@
 from __future__ import unicode_literals

 from .common import InfoExtractor
-from ..utils import (
-    smuggle_url,
-    update_url_query,
-)


 class FoxSportsIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?foxsports\.com/(?:[^/]+/)*(?P<id>[^/]+)'
+    _VALID_URL = r'https?://(?:www\.)?foxsports\.com/(?:[^/]+/)*video/(?P<id>\d+)'

    _TEST = {
        'url': 'http://www.foxsports.com/tennessee/video/432609859715',
        'md5': 'b49050e955bebe32c301972e4012ac17',
        'info_dict': {
-            'id': 'bwduI3X_TgUB',
+            'id': '432609859715',
            'ext': 'mp4',
            'title': 'Courtney Lee on going up 2-0 in series vs. Blazers',
            'description': 'Courtney Lee talks about Memphis being focused.',
-            'upload_date': '20150423',
-            'timestamp': 1429761109,
+            # TODO: fix timestamp
+            'upload_date': '19700101',  # '20150423',
+            # 'timestamp': 1429761109,
            'uploader': 'NEWA-FNG-FOXSPORTS',
        },
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        },
        'add_ie': ['ThePlatform'],
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)

-        webpage = self._download_webpage(url, video_id)
-
-        config = self._parse_json(
-            self._html_search_regex(
-                r"""class="[^"]*(?:fs-player|platformPlayer-wrapper)[^"]*".+?data-player-config='([^']+)'""",
-                webpage, 'data player config'),
-            video_id)
-
-        return self.url_result(smuggle_url(update_url_query(
-            config['releaseURL'], {
-                'mbr': 'true',
-                'switch': 'http',
-            }), {'force_smil_url': True}))
+        return self.url_result(
+            'https://feed.theplatform.com/f/BKQ29B/foxsports-all?byId=' + video_id, 'ThePlatformFeed')
@@ -192,6 +192,8 @@ class KalturaIE(InfoExtractor):
                'entryId': video_id,
                'service': 'baseentry',
                'ks': '{1:result:ks}',
+                'responseProfile:fields': 'createdAt,dataUrl,duration,name,plays,thumbnailUrl,userId',
+                'responseProfile:type': 1,
            },
            {
                'action': 'getbyentryid',
@@ -16,16 +16,15 @@ from ..utils import (
 class LibraryOfCongressIE(InfoExtractor):
    IE_NAME = 'loc'
    IE_DESC = 'Library of Congress'
-    _VALID_URL = r'https?://(?:www\.)?loc\.gov/(?:item/|today/cyberlc/feature_wdesc\.php\?.*\brec=)(?P<id>[0-9]+)'
+    _VALID_URL = r'https?://(?:www\.)?loc\.gov/(?:item/|today/cyberlc/feature_wdesc\.php\?.*\brec=)(?P<id>[0-9a-z_.]+)'
    _TESTS = [{
        # embedded via <div class="media-player"
        'url': 'http://loc.gov/item/90716351/',
-        'md5': '353917ff7f0255aa6d4b80a034833de8',
+        'md5': '6ec0ae8f07f86731b1b2ff70f046210a',
        'info_dict': {
            'id': '90716351',
            'ext': 'mp4',
            'title': "Pa's trip to Mars",
-            'thumbnail': r're:^https?://.*\.jpg$',
            'duration': 0,
            'view_count': int,
        },
@@ -57,6 +56,12 @@ class LibraryOfCongressIE(InfoExtractor):
        'params': {
            'skip_download': True,
        },
+    }, {
+        'url': 'https://www.loc.gov/item/ihas.200197114/',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.loc.gov/item/afc1981005_afs20503/',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
@@ -67,12 +72,13 @@ class LibraryOfCongressIE(InfoExtractor):
            (r'id=(["\'])media-player-(?P<id>.+?)\1',
             r'<video[^>]+id=(["\'])uuid-(?P<id>.+?)\1',
             r'<video[^>]+data-uuid=(["\'])(?P<id>.+?)\1',
-             r'mediaObjectId\s*:\s*(["\'])(?P<id>.+?)\1'),
+             r'mediaObjectId\s*:\s*(["\'])(?P<id>.+?)\1',
+             r'data-tab="share-media-(?P<id>[0-9A-F]{32})"'),
            webpage, 'media id', group='id')

        data = self._download_json(
            'https://media.loc.gov/services/v1/media?id=%s&context=json' % media_id,
-            video_id)['mediaObject']
+            media_id)['mediaObject']

        derivative = data['derivatives'][0]
        media_url = derivative['derivativeUrl']
@@ -89,25 +95,29 @@ class LibraryOfCongressIE(InfoExtractor):
        if ext not in ('mp4', 'mp3'):
            media_url += '.mp4' if is_video else '.mp3'

-        if 'vod/mp4:' in media_url:
-            formats = [{
-                'url': media_url.replace('vod/mp4:', 'hls-vod/media/') + '.m3u8',
+        formats = []
+        if '/vod/mp4:' in media_url:
+            formats.append({
+                'url': media_url.replace('/vod/mp4:', '/hls-vod/media/') + '.m3u8',
                'format_id': 'hls',
                'ext': 'mp4',
                'protocol': 'm3u8_native',
                'quality': 1,
-            }]
-        elif 'vod/mp3:' in media_url:
-            formats = [{
-                'url': media_url.replace('vod/mp3:', ''),
-                'vcodec': 'none',
-            }]
+            })
+        http_format = {
+            'url': re.sub(r'(://[^/]+/)(?:[^/]+/)*(?:mp4|mp3):', r'\1', media_url),
+            'format_id': 'http',
+            'quality': 1,
+        }
+        if not is_video:
+            http_format['vcodec'] = 'none'
+        formats.append(http_format)

        download_urls = set()
        for m in re.finditer(
                r'<option[^>]+value=(["\'])(?P<url>.+?)\1[^>]+data-file-download=[^>]+>\s*(?P<id>.+?)(?:(?:&nbsp;|\s+)\((?P<size>.+?)\))?\s*<', webpage):
            format_id = m.group('id').lower()
-            if format_id == 'gif':
+            if format_id in ('gif', 'jpeg'):
                continue
            download_url = m.group('url')
            if download_url in download_urls:
@@ -161,11 +161,17 @@ class MixcloudIE(InfoExtractor):
            stream_info = info_json['streamInfo']
            formats = []

+            def decrypt_url(f_url):
+                for k in (key, 'IFYOUWANTTHEARTISTSTOGETPAIDDONOTDOWNLOADFROMMIXCLOUD'):
+                    decrypted_url = self._decrypt_xor_cipher(k, f_url)
+                    if re.search(r'^https?://[0-9a-z.]+/[0-9A-Za-z/.?=&_-]+$', decrypted_url):
+                        return decrypted_url
+
            for url_key in ('url', 'hlsUrl', 'dashUrl'):
                format_url = stream_info.get(url_key)
                if not format_url:
                    continue
-                decrypted = self._decrypt_xor_cipher(key, compat_b64decode(format_url))
+                decrypted = decrypt_url(compat_b64decode(format_url))
                if not decrypted:
                    continue
                if url_key == 'hlsUrl':
@@ -9,10 +9,8 @@ from .theplatform import ThePlatformIE
 from .adobepass import AdobePassIE
 from ..compat import compat_urllib_parse_unquote
 from ..utils import (
-    find_xpath_attr,
    smuggle_url,
    try_get,
-    unescapeHTML,
    update_url_query,
    int_or_none,
 )
@@ -269,27 +267,14 @@ class CSNNEIE(InfoExtractor):


 class NBCNewsIE(ThePlatformIE):
-    _VALID_URL = r'''(?x)https?://(?:www\.)?(?:nbcnews|today|msnbc)\.com/
-        (?:video/.+?/(?P<id>\d+)|
-        ([^/]+/)*(?:.*-)?(?P<mpx_id>[^/?]+))
-        '''
+    _VALID_URL = r'(?x)https?://(?:www\.)?(?:nbcnews|today|msnbc)\.com/([^/]+/)*(?:.*-)?(?P<id>[^/?]+)'

    _TESTS = [
-        {
-            'url': 'http://www.nbcnews.com/video/nbc-news/52753292',
-            'md5': '47abaac93c6eaf9ad37ee6c4463a5179',
-            'info_dict': {
-                'id': '52753292',
-                'ext': 'flv',
-                'title': 'Crew emerges after four-month Mars food study',
-                'description': 'md5:24e632ffac72b35f8b67a12d1b6ddfc1',
-            },
-        },
        {
            'url': 'http://www.nbcnews.com/watch/nbcnews-com/how-twitter-reacted-to-the-snowden-interview-269389891880',
            'md5': 'af1adfa51312291a017720403826bb64',
            'info_dict': {
-                'id': 'p_tweet_snow_140529',
+                'id': '269389891880',
                'ext': 'mp4',
                'title': 'How Twitter Reacted To The Snowden Interview',
                'description': 'md5:65a0bd5d76fe114f3c2727aa3a81fe64',
@@ -313,7 +298,7 @@ class NBCNewsIE(ThePlatformIE):
            'url': 'http://www.nbcnews.com/nightly-news/video/nightly-news-with-brian-williams-full-broadcast-february-4-394064451844',
            'md5': '73135a2e0ef819107bbb55a5a9b2a802',
            'info_dict': {
-                'id': 'nn_netcast_150204',
+                'id': '394064451844',
                'ext': 'mp4',
                'title': 'Nightly News with Brian Williams Full Broadcast (February 4)',
                'description': 'md5:1c10c1eccbe84a26e5debb4381e2d3c5',
@@ -326,7 +311,7 @@ class NBCNewsIE(ThePlatformIE):
            'url': 'http://www.nbcnews.com/business/autos/volkswagen-11-million-vehicles-could-have-suspect-software-emissions-scandal-n431456',
            'md5': 'a49e173825e5fcd15c13fc297fced39d',
            'info_dict': {
-                'id': 'x_lon_vwhorn_150922',
+                'id': '529953347624',
                'ext': 'mp4',
                'title': 'Volkswagen U.S. Chief:\xa0 We Have Totally Screwed Up',
                'description': 'md5:c8be487b2d80ff0594c005add88d8351',
@@ -339,7 +324,7 @@ class NBCNewsIE(ThePlatformIE):
            'url': 'http://www.today.com/video/see-the-aurora-borealis-from-space-in-stunning-new-nasa-video-669831235788',
            'md5': '118d7ca3f0bea6534f119c68ef539f71',
            'info_dict': {
-                'id': 'tdy_al_space_160420',
+                'id': '669831235788',
                'ext': 'mp4',
                'title': 'See the aurora borealis from space in stunning new NASA video',
                'description': 'md5:74752b7358afb99939c5f8bb2d1d04b1',
@@ -352,7 +337,7 @@ class NBCNewsIE(ThePlatformIE):
            'url': 'http://www.msnbc.com/all-in-with-chris-hayes/watch/the-chaotic-gop-immigration-vote-314487875924',
            'md5': '6d236bf4f3dddc226633ce6e2c3f814d',
            'info_dict': {
-                'id': 'n_hayes_Aimm_140801_272214',
+                'id': '314487875924',
                'ext': 'mp4',
                'title': 'The chaotic GOP immigration vote',
                'description': 'The Republican House votes on a border bill that has no chance of getting through the Senate or signed by the President and is drawing criticism from all sides.',
@@ -374,60 +359,22 @@ class NBCNewsIE(ThePlatformIE):
    ]

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-        if video_id is not None:
-            all_info = self._download_xml('http://www.nbcnews.com/id/%s/displaymode/1219' % video_id, video_id)
-            info = all_info.find('video')
-
-            return {
-                'id': video_id,
-                'title': info.find('headline').text,
-                'ext': 'flv',
-                'url': find_xpath_attr(info, 'media', 'type', 'flashVideo').text,
-                'description': info.find('caption').text,
-                'thumbnail': find_xpath_attr(info, 'media', 'type', 'thumbnail').text,
-            }
-        else:
-            # "feature" and "nightly-news" pages use theplatform.com
-            video_id = mobj.group('mpx_id')
+        video_id = self._match_id(url)
+        if not video_id.isdigit():
            webpage = self._download_webpage(url, video_id)

-            filter_param = 'byId'
-            bootstrap_json = self._search_regex(
-                [r'(?m)(?:var\s+(?:bootstrapJson|playlistData)|NEWS\.videoObj)\s*=\s*({.+});?\s*$',
-                 r'videoObj\s*:\s*({.+})', r'data-video="([^"]+)"',
-                 r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);'],
-                webpage, 'bootstrap json', default=None)
-            if bootstrap_json:
-                bootstrap = self._parse_json(
-                    bootstrap_json, video_id, transform_source=unescapeHTML)
+            data = self._parse_json(self._search_regex(
+                r'window\.__data\s*=\s*({.+});', webpage,
+                'bootstrap json'), video_id)
+            video_id = data['article']['content'][0]['primaryMedia']['video']['mpxMetadata']['id']

-                info = None
-                if 'results' in bootstrap:
-                    info = bootstrap['results'][0]['video']
-                elif 'video' in bootstrap:
-                    info = bootstrap['video']
-                elif 'msnbcVideoInfo' in bootstrap:
-                    info = bootstrap['msnbcVideoInfo']['meta']
-                elif 'msnbcThePlatform' in bootstrap:
-                    info = bootstrap['msnbcThePlatform']['videoPlayer']['video']
-                else:
-                    info = bootstrap
-
-                if 'guid' in info:
-                    video_id = info['guid']
-                    filter_param = 'byGuid'
-                elif 'mpxId' in info:
-                    video_id = info['mpxId']
-
-            return {
-                '_type': 'url_transparent',
-                'id': video_id,
-                # http://feed.theplatform.com/f/2E2eJC/nbcnews also works
-                'url': update_url_query('http://feed.theplatform.com/f/2E2eJC/nnd_NBCNews', {filter_param: video_id}),
-                'ie_key': 'ThePlatformFeed',
-            }
+        return {
+            '_type': 'url_transparent',
+            'id': video_id,
+            # http://feed.theplatform.com/f/2E2eJC/nbcnews also works
+            'url': update_url_query('http://feed.theplatform.com/f/2E2eJC/nnd_NBCNews', {'byId': video_id}),
+            'ie_key': 'ThePlatformFeed',
+        }


 class NBCOlympicsIE(InfoExtractor):
@@ -35,7 +35,7 @@ class NovaEmbedIE(InfoExtractor):

        bitrates = self._parse_json(
            self._search_regex(
-                r'(?s)bitrates\s*=\s*({.+?})\s*;', webpage, 'formats'),
+                r'(?s)(?:src|bitrates)\s*=\s*({.+?})\s*;', webpage, 'formats'),
            video_id, transform_source=js_to_json)

        QUALITIES = ('lq', 'mq', 'hq', 'hd')
@@ -11,20 +11,27 @@ from ..utils import (

 class NZZIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?nzz\.ch/(?:[^/]+/)*[^/?#]+-ld\.(?P<id>\d+)'
-    _TEST = {
+    _TESTS = [{
        'url': 'http://www.nzz.ch/zuerich/gymizyte/gymizyte-schreiben-schueler-heute-noch-diktate-ld.9153',
        'info_dict': {
            'id': '9153',
        },
        'playlist_mincount': 6,
-    }
+    }, {
+        'url': 'https://www.nzz.ch/video/nzz-standpunkte/cvp-auf-der-suche-nach-dem-mass-der-mitte-ld.1368112',
+        'info_dict': {
+            'id': '1368112',
+        },
+        'playlist_count': 1,
+    }]

    def _real_extract(self, url):
        page_id = self._match_id(url)
        webpage = self._download_webpage(url, page_id)

        entries = []
-        for player_element in re.findall(r'(<[^>]+class="kalturaPlayer"[^>]*>)', webpage):
+        for player_element in re.findall(
+                r'(<[^>]+class="kalturaPlayer[^"]*"[^>]*>)', webpage):
            player_params = extract_attributes(player_element)
            if player_params.get('data-type') not in ('kaltura_singleArticle',):
                self.report_warning('Unsupported player type')
@@ -27,7 +27,7 @@ class PornHubIE(InfoExtractor):
    _VALID_URL = r'''(?x)
                    https?://
                        (?:
-                            (?:[^/]+\.)?pornhub\.com/(?:(?:view_video\.php|video/show)\?viewkey=|embed/)|
+                            (?:[^/]+\.)?pornhub\.(?:com|net)/(?:(?:view_video\.php|video/show)\?viewkey=|embed/)|
                            (?:www\.)?thumbzilla\.com/video/
                        )
                        (?P<id>[\da-z]+)
@@ -121,6 +121,9 @@ class PornHubIE(InfoExtractor):
    }, {
        'url': 'http://www.pornhub.com/video/show?viewkey=648719015',
        'only_matching': True,
+    }, {
+        'url': 'https://www.pornhub.net/view_video.php?viewkey=203640933',
+        'only_matching': True,
    }]

    @staticmethod
@@ -340,7 +343,7 @@ class PornHubPlaylistBaseIE(InfoExtractor):


 class PornHubPlaylistIE(PornHubPlaylistBaseIE):
-    _VALID_URL = r'https?://(?:[^/]+\.)?pornhub\.com/playlist/(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:[^/]+\.)?pornhub\.(?:com|net)/playlist/(?P<id>\d+)'
    _TESTS = [{
        'url': 'http://www.pornhub.com/playlist/4667351',
        'info_dict': {
@@ -355,7 +358,7 @@ class PornHubPlaylistIE(PornHubPlaylistBaseIE):


 class PornHubUserVideosIE(PornHubPlaylistBaseIE):
-    _VALID_URL = r'https?://(?:[^/]+\.)?pornhub\.com/(?:(?:user|channel)s|model|pornstar)/(?P<id>[^/]+)/videos'
+    _VALID_URL = r'https?://(?:[^/]+\.)?pornhub\.(?:com|net)/(?:(?:user|channel)s|model|pornstar)/(?P<id>[^/]+)/videos'
    _TESTS = [{
        'url': 'http://www.pornhub.com/users/zoe_ph/videos/public',
        'info_dict': {
@@ -64,7 +64,7 @@ class SixPlayIE(InfoExtractor):
        for asset in clip_data['assets']:
            asset_url = asset.get('full_physical_path')
            protocol = asset.get('protocol')
-            if not asset_url or protocol == 'primetime' or asset_url in urls:
+            if not asset_url or protocol == 'primetime' or asset.get('type') == 'usp_hlsfp_h264' or asset_url in urls:
                continue
            urls.append(asset_url)
            container = asset.get('video_container')
@@ -81,19 +81,17 @@ class SixPlayIE(InfoExtractor):
                        if not urlh:
                            continue
                        asset_url = urlh.geturl()
-                    asset_url = re.sub(r'/([^/]+)\.ism/[^/]*\.m3u8', r'/\1.ism/\1.m3u8', asset_url)
-                    formats.extend(self._extract_m3u8_formats(
-                        asset_url, video_id, 'mp4', 'm3u8_native',
-                        m3u8_id='hls', fatal=False))
-                    formats.extend(self._extract_f4m_formats(
-                        asset_url.replace('.m3u8', '.f4m'),
-                        video_id, f4m_id='hds', fatal=False))
-                    formats.extend(self._extract_mpd_formats(
-                        asset_url.replace('.m3u8', '.mpd'),
-                        video_id, mpd_id='dash', fatal=False))
-                    formats.extend(self._extract_ism_formats(
-                        re.sub(r'/[^/]+\.m3u8', '/Manifest', asset_url),
-                        video_id, ism_id='mss', fatal=False))
+                    for i in range(3, 0, -1):
+                        asset_url = asset_url = asset_url.replace('_sd1/', '_sd%d/' % i)
+                        m3u8_formats = self._extract_m3u8_formats(
+                            asset_url, video_id, 'mp4', 'm3u8_native',
+                            m3u8_id='hls', fatal=False)
+                        formats.extend(m3u8_formats)
+                        formats.extend(self._extract_mpd_formats(
+                            asset_url.replace('.m3u8', '.mpd'),
+                            video_id, mpd_id='dash', fatal=False))
+                        if m3u8_formats:
+                            break
                else:
                    formats.extend(self._extract_m3u8_formats(
                        asset_url, video_id, 'mp4', 'm3u8_native',
@@ -343,7 +343,7 @@ class ThePlatformFeedIE(ThePlatformBaseIE):
    def _extract_feed_info(self, provider_id, feed_id, filter_query, video_id, custom_fields=None, asset_types_query={}, account_id=None):
        real_url = self._URL_TEMPLATE % (self.http_scheme(), provider_id, feed_id, filter_query)
        entry = self._download_json(real_url, video_id)['entries'][0]
-        main_smil_url = 'http://link.theplatform.com/s/%s/media/guid/%d/%s' % (provider_id, account_id, entry['guid']) if account_id else None
+        main_smil_url = 'http://link.theplatform.com/s/%s/media/guid/%d/%s' % (provider_id, account_id, entry['guid']) if account_id else entry.get('plmedia$publicUrl')

        formats = []
        subtitles = {}
@@ -356,7 +356,8 @@ class ThePlatformFeedIE(ThePlatformBaseIE):
            if first_video_id is None:
                first_video_id = cur_video_id
                duration = float_or_none(item.get('plfile$duration'))
-            for asset_type in item['plfile$assetTypes']:
+            file_asset_types = item.get('plfile$assetTypes') or compat_parse_qs(compat_urllib_parse_urlparse(smil_url).query)['assetTypes']
+            for asset_type in file_asset_types:
                if asset_type in asset_types:
                    continue
                asset_types.append(asset_type)
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2018.11.18'
+__version__ = '2018.11.23'
Author	SHA1	Message	Date
Sergey M․	d861a9d581	release 2018.11.23	2018-11-23 00:16:45 +07:00
Sergey M․	66173211c4	[ChangeLog] Actualize [ci skip]	2018-11-23 00:14:43 +07:00
Remita Amine	6f2883a2df	[mixcloud] base64 decode before decryption	2018-11-21 23:25:38 +01:00
Remita Amine	560020da30	[mixcloud] fallback to hardcoded decryption key(closes #18016 )	2018-11-21 23:21:05 +01:00
Sergey M․	305ce767d5	[travis] Add python 3.8-dev build	2018-11-22 02:34:35 +07:00
Sergey M․	157eef3e63	[setup.py] Add python 3.8 classifier	2018-11-22 02:08:41 +07:00
Sergey M․	bd2d553c7b	[travis] Add python 3.7 build	2018-11-22 02:01:39 +07:00
Sergey M․	af60e81e3c	[setup.py] Add more relevant classifiers	2018-11-22 02:01:39 +07:00
Remita Amine	a843464a7e	[nbc] fix NBCNews article extraction(closes #16194 )	2018-11-21 12:10:06 +01:00
Remita Amine	6866f24494	[foxsports] update test	2018-11-21 12:08:46 +01:00
Remita Amine	4e33e0792a	[loc] update test	2018-11-21 12:00:50 +01:00
Remita Amine	35328915b5	[foxsports] fix extraction(closes #17543 )	2018-11-21 09:46:36 +01:00
Remita Amine	6c882aa899	[loc] relax _VALID_URL regex and improve formats extraction	2018-11-21 09:46:36 +01:00
Sergey M․	183417a50f	[ciscolive:search] Add support for pagination	2018-11-21 06:10:43 +07:00
Sergey M․	6a6d7f0641	[ciscolive] Fix issues and improve extraction (closes #17984 )	2018-11-21 06:10:39 +07:00
Austin de Coup-Crank	05bd5e9c77	[ciscolive] Add extractor	2018-11-21 06:10:30 +07:00
Alexander Seiler	15ed5a2784	[nzz] Relax kaltura regex	2018-11-21 02:50:40 +07:00
Remita Amine	2e1280ed43	[sixplay] fix format extraction	2018-11-19 18:15:51 +01:00
Remita Amine	8578ea4dcb	[bitchute] use _html_search_regex for title extraction	2018-11-18 16:15:27 +01:00
Remita Amine	9b27a78a88	[kaltura] limit requested MediaEntry fields	2018-11-18 16:15:27 +01:00
Sergey M․	964b989dc8	[americastestkitchen] Add support for zype embeds (closes #18225 )	2018-11-18 20:45:25 +07:00
Sergey M․	f97c099131	[pornhub] Move test to correct place	2018-11-18 11:14:46 +07:00
Sergey M․	1febf99da1	[pornhub] Add pornhub.net alias	2018-11-18 06:26:08 +07:00
Sergey M․	4167148fa4	[nova:embed] Fix extraction (closes #18222 )	2018-11-18 01:11:10 +07:00