release 2015.02.09.2

[generic] Improve SBS detection (Fixes #4899 )
release 2015.02.09.1
2026-06-11 23:20:15 +00:00 · 2015-02-09 14:46:30 +01:00 · 2015-02-09 14:46:10 +01:00 · 2015-02-09 10:49:10 +01:00 · 2015-02-09 10:47:19 +01:00 · 2015-02-09 10:44:55 +01:00
95 changed files with 2399 additions and 615 deletions
@@ -106,3 +106,7 @@ Johan K. Jensen
 Yen Chi Hsuan
 Enam Mijbah Noor
 David Luhmer
+Shaya Goldberg
+Paul Hartmann
+Frans de Jonge
+Robin de Rooij
@@ -1,4 +1,6 @@
-Please include the full output of the command when run with `--verbose`. The output (including the first lines) contain important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
+**Please include the full output of youtube-dl when run with `-v`**.
+
+The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.

 Please re-read your issue once again to avoid a couple of common mistakes (you can and should use this as a checklist):

@@ -122,7 +124,7 @@ If you want to add support for a new site, you can follow this quick list (assum
 5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
 7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L38). Add tests and code for as many as you want.
-8. If you can, check the code with [pyflakes](https://pypi.python.org/pypi/pyflakes) (a good idea) and [pep8](https://pypi.python.org/pypi/pep8) (optional, ignore E501).
+8. If you can, check the code with [flake8](https://pypi.python.org/pypi/flake8).
 9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:

        $ git add youtube_dl/extractor/__init__.py
@@ -1,10 +1,7 @@
 all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites

 clean:
-	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json CONTRIBUTING.md.tmp
-
-cleanall: clean
-	rm -f youtube-dl youtube-dl.exe
+	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe

 PREFIX ?= /usr/local
 BINDIR ?= $(PREFIX)/bin
@@ -292,18 +292,20 @@ which means you can modify it, redistribute it or use it however you like.
                                     video results by putting a condition in
                                     brackets, as in -f "best[height=720]" (or
                                     -f "[filesize>10M]").  This works for
-                                     filesize, height, width, tbr, abr, and vbr
-                                     and the comparisons <, <=, >, >=, =, != .
-                                     Formats for which the value is not known
-                                     are excluded unless you put a question mark
-                                     (?) after the operator. You can combine
-                                     format filters, so  -f "[height <=?
-                                     720][tbr>500]" selects up to 720p videos
-                                     (or videos where the height is not known)
-                                     with a bitrate of at least 500 KBit/s. By
-                                     default, youtube-dl will pick the best
-                                     quality. Use commas to download multiple
-                                     audio formats, such as -f
+                                     filesize, height, width, tbr, abr, vbr,
+                                     asr, and fps and the comparisons <, <=, >,
+                                     >=, =, != and for ext, acodec, vcodec,
+                                     container, and protocol and the comparisons
+                                     =, != . Formats for which the value is not
+                                     known are excluded unless you put a
+                                     question mark (?) after the operator. You
+                                     can combine format filters, so  -f "[height
+                                     <=? 720][tbr>500]" selects up to 720p
+                                     videos (or videos where the height is not
+                                     known) with a bitrate of at least 500
+                                     KBit/s. By default, youtube-dl will pick
+                                     the best quality. Use commas to download
+                                     multiple audio formats, such as -f
                                     136/137/mp4/bestvideo,140/m4a/bestaudio.
                                     You can merge the video and audio of two
                                     formats into a single file using -f <video-
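
The bracketed format filters described in this hunk can be illustrated with a small Python sketch. It is not youtube-dl's implementation: `matches_filters` and `FILTER_REX` are hypothetical names, only numeric fields are handled, and size suffixes such as `10M` and the string comparisons on `ext`, `acodec`, etc. are deliberately left out.

```python
import operator
import re

# Comparison operators accepted inside the brackets.
OPERATORS = {
    '<': operator.lt,
    '<=': operator.le,
    '>': operator.gt,
    '>=': operator.ge,
    '=': operator.eq,
    '!=': operator.ne,
}

# Try longer operators first so that '<=' is not read as '<' followed by '='.
_OP_PATTERN = '|'.join(
    re.escape(op) for op in sorted(OPERATORS, key=len, reverse=True))
FILTER_REX = re.compile(
    r'\[\s*(?P<key>\w+)\s*(?P<op>%s)\s*(?P<unknown_ok>\?)?\s*'
    r'(?P<value>[0-9.]+)\s*\]' % _OP_PATTERN)


def matches_filters(fmt, spec):
    """Return True if the format dict `fmt` passes every [filter] in `spec`."""
    for m in FILTER_REX.finditer(spec):
        actual = fmt.get(m.group('key'))
        if actual is None:
            # Unknown values are excluded unless '?' follows the operator.
            if m.group('unknown_ok'):
                continue
            return False
        if not OPERATORS[m.group('op')](actual, float(m.group('value'))):
            return False
    return True
```

With the spec `[height <=? 720][tbr>500]`, a format with no known height but a `tbr` of 800 passes, while a 1080p format or one with an unknown `tbr` is rejected.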
@@ -368,11 +370,11 @@ which means you can modify it, redistribute it or use it however you like.
    --add-metadata                   write metadata to the video file
    --xattrs                         write metadata to the video file's xattrs
                                     (using dublin core and xdg standards)
-    --fixup POLICY                   (experimental) Automatically correct known
-                                     faults of the file. One of never (do
-                                     nothing), warn (only emit a warning),
-                                     detect_or_warn(check whether we can do
-                                     anything about it, warn otherwise
+    --fixup POLICY                   Automatically correct known faults of the
+                                     file. One of never (do nothing), warn (only
+                                     emit a warning), detect_or_warn(the
+                                     default; fix file if we can, warn
+                                     otherwise)
    --prefer-avconv                  Prefer avconv over ffmpeg for running the
                                     postprocessors (default)
    --prefer-ffmpeg                  Prefer ffmpeg over avconv for running the
@@ -525,9 +527,24 @@ From then on, after restarting your shell, you will be able to access both youtu

 Use the `-o` to specify an [output template](#output-template), for example `-o "/home/user/videos/%(title)s-%(id)s.%(ext)s"`. If you want this for all of your downloads, put the option into your [configuration file](#configuration).

+### How do I download a video starting with a `-` ?
+
+Either prepend `http://www.youtube.com/watch?v=` or separate the ID from the options with `--`:
+
+    youtube-dl -- -wNyEUrxzFU
+    youtube-dl "http://www.youtube.com/watch?v=-wNyEUrxzFU"
+
+### Can you add support for this anime video site, or site which shows current movies for free?
+
+As a matter of policy (as well as legality), youtube-dl does not include support for services that specialize in infringing copyright. As a rule of thumb, if you cannot easily find a video that the service is quite obviously allowed to distribute (i.e. that has been uploaded by the creator, the creator's distributor, or is published under a free license), the service is probably unfit for inclusion in youtube-dl.
+
+A note on the service stating that it does not host the infringing content, but merely links to those who do, is evidence that the service should **not** be included in youtube-dl. The same goes for any DMCA note when the whole front page of the service is filled with videos they are not allowed to distribute. A "fair use" note is equally unconvincing if the service shows copyright-protected videos in full without authorization.
+
+Support requests for services that **do** purchase the rights to distribute their content are perfectly fine though. If in doubt, you can simply include a source that mentions the legitimate purchase of content.
+
 ### How can I detect whether a given URL is supported by youtube-dl?

-For one, have a look at the [list of supported sites](docs/supportedsites). Note that it can sometimes happen that the site changes its URL scheme (say, from http://example.com/v/1234567 to http://example.com/v/1234567 ) and youtube-dl reports an URL of a service in that list as unsupported. In that case, simply report a bug.
+For one, have a look at the [list of supported sites](docs/supportedsites.md). Note that it can sometimes happen that the site changes its URL scheme (say, from http://example.com/video/1234567 to http://example.com/v/1234567 ) and youtube-dl reports a URL of a service in that list as unsupported. In that case, simply report a bug.

 It is *not* possible to detect whether a URL is supported or not. That's because youtube-dl contains a generic extractor which matches **all** URLs. You may be tempted to disable, exclude, or remove the generic extractor, but the generic extractor not only allows users to extract videos from lots of websites that embed a video from another service, but may also be used to extract video from a service that it's hosting itself. Therefore, we neither recommend nor support disabling, excluding, or removing the generic extractor.

@@ -721,7 +738,7 @@ In particular, every site support request issue should only pertain to services

 ###  Is anyone going to need the feature?

-Only post features that you (or an incapicated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.
+Only post features that you (or an incapacitated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.

 ###  Is your question about youtube-dl?

@@ -35,7 +35,7 @@ if [ ! -z "$useless_files" ]; then echo "ERROR: Non-.py files in youtube_dl: $us
 if [ ! -f "updates_key.pem" ]; then echo 'ERROR: updates_key.pem missing'; exit 1; fi

 /bin/echo -e "\n### First of all, testing..."
-make cleanall
+make clean
 if $skip_tests ; then
    echo 'SKIPPING TESTS'
 else
@@ -45,9 +45,9 @@ fi
 /bin/echo -e "\n### Changing version in version.py..."
 sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py

-/bin/echo -e "\n### Committing README.md and youtube_dl/version.py..."
-make README.md
-git add README.md youtube_dl/version.py
+/bin/echo -e "\n### Committing documentation and youtube_dl/version.py..."
+make README.md CONTRIBUTING.md supportedsites
+git add README.md CONTRIBUTING.md docs/supportedsites.md youtube_dl/version.py
 git commit -m "release $version"

 /bin/echo -e "\n### Now tagging, signing and pushing..."
@@ -9,16 +9,21 @@
 - **8tracks**
 - **9gag**
 - **abc.net.au**
+ - **Abc7News**
 - **AcademicEarth:Course**
 - **AddAnime**
 - **AdobeTV**
 - **AdultSwim**
+ - **Aftenposten**
 - **Aftonbladet**
 - **AlJazeera**
 - **Allocine**
+ - **AlphaPorno**
 - **anitube.se**
 - **AnySex**
 - **Aparat**
+ - **AppleDailyAnimationNews**
+ - **AppleDailyRealtimeNews**
 - **AppleTrailers**
 - **archive.org**: archive.org videos
 - **ARD**
@@ -30,8 +35,10 @@
 - **arte.tv:ddc**
 - **arte.tv:embed**
 - **arte.tv:future**
+ - **AtresPlayer**
+ - **ATTTechChannel**
 - **audiomack**
- - **AUEngine**
+ - **audiomack:album**
 - **Azubu**
 - **bambuser**
 - **bambuser:channel**
@@ -71,8 +78,10 @@
 - **cmt.com**
 - **CNET**
 - **CNN**
+ - **CNNArticle**
 - **CNNBlogs**
 - **CollegeHumor**
+ - **CollegeRama**
 - **ComCarCoff**
 - **ComedyCentral**
 - **ComedyCentralShows**: The Daily Show / The Colbert Report
@@ -82,23 +91,27 @@
 - **Crunchyroll**
 - **crunchyroll:playlist**
 - **CSpan**: C-SPAN
+ - **CtsNews**
 - **culturebox.francetvinfo.fr**
 - **dailymotion**
 - **dailymotion:playlist**
 - **dailymotion:user**
 - **daum.net**
 - **DBTV**
+ - **DctpTv**
 - **DeezerPlaylist**
 - **defense.gouv.fr**
 - **Discovery**
 - **divxstage**: DivxStage
 - **Dotsub**
+ - **DRBonanza**
 - **Dropbox**
 - **DrTuber**
 - **DRTV**
 - **Dump**
 - **dvtv**: http://video.aktualne.cz/
 - **EbaumsWorld**
+ - **EchoMsk**
 - **eHow**
 - **Einthusan**
 - **eitb.tv**
@@ -108,6 +121,7 @@
 - **EMPFlix**
 - **Engadget**
 - **Eporner**
+ - **EroProfile**
 - **Escapist**
 - **EveryonesMixtape**
 - **exfm**: ex.fm
@@ -143,6 +157,7 @@
 - **GDCVault**
 - **generic**: Generic downloader that works on some sites
 - **GiantBomb**
+ - **Giga**
 - **Glide**: Glide mobile video messages (glide.me)
 - **Globo**
 - **GodTube**
@@ -153,9 +168,14 @@
 - **Grooveshark**
 - **Groupon**
 - **Hark**
+ - **HearThisAt**
 - **Heise**
+ - **HellPorno**
 - **Helsinki**: helsinki.fi
 - **HentaiStigma**
+ - **HistoricFilms**
+ - **hitbox**
+ - **hitbox:live**
 - **HornBunny**
 - **HostingBulk**
 - **HotNewHipHop**
@@ -182,6 +202,7 @@
 - **jpopsuki.tv**
 - **Jukebox**
 - **Kankan**
+ - **Karaoketv**
 - **keek**
 - **KeezMovies**
 - **KhanAcademy**
@@ -195,6 +216,7 @@
 - **LiveLeak**
 - **livestream**
 - **livestream:original**
+ - **LnkGo**
 - **lrt.lt**
 - **lynda**: lynda.com videos
 - **lynda:course**: lynda.com online courses
@@ -235,6 +257,7 @@
 - **MySpass**
 - **myvideo**
 - **MyVidster**
+ - **n-tv.de**
 - **Naver**
 - **NBA**
 - **NBC**
@@ -242,11 +265,16 @@
 - **ndr**: NDR.de - Mediathek
 - **NDTV**
 - **NerdCubedFeed**
+ - **Nerdist**
+ - **Netzkino**
 - **Newgrounds**
 - **Newstube**
+ - **NextMedia**
+ - **NextMediaActionNews**
 - **nfb**: National Film Board of Canada
 - **nfl.com**
 - **nhl.com**
+ - **nhl.com:news**: NHL news
 - **nhl.com:videocenter**: NHL videocenter category
 - **niconico**: ニコニコ動画
 - **NiconicoPlaylist**
@@ -257,18 +285,20 @@
 - **Nowness**
 - **nowvideo**: NowVideo
 - **npo.nl**
+ - **npo.nl:live**
 - **NRK**
 - **NRKTV**
- - **NTV**
+ - **ntv.ru**
 - **Nuvid**
 - **NYTimes**
 - **ocw.mit.edu**
 - **OktoberfestTV**
 - **on.aol.com**
 - **Ooyala**
+ - **OpenFilm**
+ - **orf:fm4**: radio FM4
 - **orf:oe1**: Radio Österreich 1
 - **orf:tvthek**: ORF TVthek
- - **ORFFM4**: radio FM4
 - **parliamentlive.tv**: UK parliament videos
 - **Patreon**
 - **PBS**
@@ -290,6 +320,7 @@
 - **Pyvideo**
 - **QuickVid**
 - **radio.de**
+ - **radiobremen**
 - **radiofrance**
 - **Rai**
 - **RBMARadio**
@@ -300,6 +331,8 @@
 - **RottenTomatoes**
 - **Roxwel**
 - **RTBF**
+ - **Rte**
+ - **RTL2**
 - **RTLnow**
 - **rtlxl.nl**
 - **RTP**
@@ -309,6 +342,7 @@
 - **RUHD**
 - **rutube**: Rutube videos
 - **rutube:channel**: Rutube channels
+ - **rutube:embed**: Rutube embedded videos
 - **rutube:movie**: Rutube movies
 - **rutube:person**: Rutube person videos
 - **RUTV**: RUTV.RU
@@ -351,11 +385,12 @@
 - **Sport5**
 - **SportBox**
 - **SportDeutschland**
- - **SRMediathek**: Süddeutscher Rundfunk
+ - **SRMediathek**: Saarländischer Rundfunk
 - **stanfordoc**: Stanford Open ClassRoom
 - **Steam**
 - **streamcloud.eu**
 - **StreamCZ**
+ - **StreetVoice**
 - **SunPorno**
 - **SWRMediathek**
 - **Syfy**
@@ -375,7 +410,9 @@
 - **TeleBruxelles**
 - **telecinco.es**
 - **TeleMB**
+ - **TeleTask**
 - **TenPlay**
+ - **TestTube**
 - **TF1**
 - **TheOnion**
 - **ThePlatform**
@@ -403,8 +440,16 @@
 - **tv.dfb.de**
 - **tvigle**: Интернет-телевидение Tvigle.ru
 - **tvp.pl**
+ - **tvp.pl:Series**
 - **TVPlay**: TV3Play and related services
- - **Twitch**
+ - **Tweakers**
+ - **twitch:bookmarks**
+ - **twitch:chapter**
+ - **twitch:past_broadcasts**
+ - **twitch:profile**
+ - **twitch:stream**
+ - **twitch:video**
+ - **twitch:vod**
 - **Ubu**
 - **udemy**
 - **udemy:course**
@@ -433,6 +478,8 @@
 - **videoweed**: VideoWeed
 - **Vidme**
 - **Vidzi**
+ - **vier**
+ - **vier:videos**
 - **viki**
 - **vimeo**
 - **vimeo:album**
@@ -460,11 +507,13 @@
 - **WDR**
 - **wdr:mobile**
 - **WDRMaus**: Sendung mit der Maus
+ - **WebOfStories**
 - **Weibo**
 - **Wimp**
 - **Wistia**
 - **WorldStarHipHop**
 - **wrzuta.pl**
+ - **WSJ**: Wall Street Journal
 - **XBef**
 - **XboxClips**
 - **XHamster**
@@ -472,7 +521,9 @@
 - **XNXX**
 - **XTube**
 - **XTubeUser**: XTube user profile
+ - **Xuite**
 - **XVideos**
+ - **XXXYMovies**
 - **Yahoo**: Yahoo screen and movies
 - **YesJapan**
 - **Ynet**
@@ -491,7 +542,6 @@
 - **youtube:search_url**: YouTube.com search URLs
 - **youtube:show**: YouTube.com (multi-season) shows
 - **youtube:subscriptions**: YouTube.com subscriptions feed, "ytsubs" keyword (requires authentication)
- - **youtube:toplist**: YouTube.com top lists, "yttoplist:{channel}:{list title}" (Example: "yttoplist:music:Top Tracks")
 - **youtube:user**: YouTube.com user videos (URL or "ytuser" keyword)
 - **youtube:watch_later**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
 - **ZDF**
@@ -103,6 +103,16 @@ def expect_info_dict(self, got_dict, expected_dict):
            self.assertTrue(
                match_rex.match(got),
                'field %s (value: %r) should match %r' % (info_field, got, match_str))
+        elif isinstance(expected, compat_str) and expected.startswith('startswith:'):
+            got = got_dict.get(info_field)
+            start_str = expected[len('startswith:'):]
+            self.assertTrue(
+                isinstance(got, compat_str),
+                'Expected a %s object, but got %s for field %s' % (
+                    compat_str.__name__, type(got).__name__, info_field))
+            self.assertTrue(
+                got.startswith(start_str),
+                'field %s (value: %r) should start with %r' % (info_field, got, start_str))
        elif isinstance(expected, type):
            got = got_dict.get(info_field)
            self.assertTrue(isinstance(got, expected),
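
The `startswith:` convention added to `expect_info_dict` in this hunk can be sketched in isolation. The helper below is a simplified stand-in, not the test helper itself: `check_expected` is a hypothetical name, plain `assert` replaces the `unittest` assertions, and `str` replaces `compat_str` (so it assumes Python 3).

```python
def check_expected(field, got, expected):
    """Compare one info_dict field against its expected value.

    An expected value of the form 'startswith:PREFIX' only requires that
    the actual value is a string beginning with PREFIX; anything else is
    compared for equality.
    """
    prefix = 'startswith:'
    if isinstance(expected, str) and expected.startswith(prefix):
        start_str = expected[len(prefix):]
        assert isinstance(got, str), (
            'Expected a str object, but got %s for field %s'
            % (type(got).__name__, field))
        assert got.startswith(start_str), (
            'field %s (value: %r) should start with %r'
            % (field, got, start_str))
    else:
        assert got == expected, (
            'field %s (value: %r) should equal %r' % (field, got, expected))
```

A test definition could then use `'description': 'startswith:Hello'` to pin only the beginning of a long or frequently edited field.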
@@ -148,9 +158,15 @@ def expect_info_dict(self, got_dict, expected_dict):
                return "'%s'" % v.replace('\\', '\\\\').replace("'", "\\'").replace('\n', '\\n')
            else:
                return repr(v)
-        info_dict_str = ''.join(
-            '    %s: %s,\n' % (_repr(k), _repr(v))
-            for k, v in test_info_dict.items())
+        info_dict_str = ''
+        if len(missing_keys) != len(expected_dict):
+            info_dict_str += ''.join(
+                '    %s: %s,\n' % (_repr(k), _repr(v))
+                for k, v in test_info_dict.items() if k not in missing_keys)
+            info_dict_str += '\n'
+        info_dict_str += ''.join(
+            '    %s: %s,\n' % (_repr(k), _repr(test_info_dict[k]))
+            for k in missing_keys)
        write_string(
            '\n\'info_dict\': {\n' + info_dict_str + '}\n', out=sys.stderr)
        self.assertFalse(
@@ -13,6 +13,7 @@ import copy
 from test.helper import FakeYDL, assertRegexpMatches
 from youtube_dl import YoutubeDL
 from youtube_dl.extractor import YoutubeIE
+from youtube_dl.postprocessor.common import PostProcessor


 class YDL(FakeYDL):
@@ -370,5 +371,35 @@ class TestFormatSelection(unittest.TestCase):
            'vbr': 10,
        }), '^\s*10k$')

+    def test_postprocessors(self):
+        filename = 'post-processor-testfile.mp4'
+        audiofile = filename + '.mp3'
+
+        class SimplePP(PostProcessor):
+            def run(self, info):
+                with open(audiofile, 'wt') as f:
+                    f.write('EXAMPLE')
+                info['filepath']
+                return False, info
+
+        def run_pp(params):
+            with open(filename, 'wt') as f:
+                f.write('EXAMPLE')
+            ydl = YoutubeDL(params)
+            ydl.add_post_processor(SimplePP())
+            ydl.post_process(filename, {'filepath': filename})
+
+        run_pp({'keepvideo': True})
+        self.assertTrue(os.path.exists(filename), '%s doesn\'t exist' % filename)
+        self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
+        os.unlink(filename)
+        os.unlink(audiofile)
+
+        run_pp({'keepvideo': False})
+        self.assertFalse(os.path.exists(filename), '%s exists' % filename)
+        self.assertTrue(os.path.exists(audiofile), '%s doesn\'t exist' % audiofile)
+        os.unlink(audiofile)
+
+
 if __name__ == '__main__':
    unittest.main()
@@ -89,7 +89,7 @@ def generator(test_case):

        for tc in test_cases:
            info_dict = tc.get('info_dict', {})
-            if not tc.get('file') and not (info_dict.get('id') and info_dict.get('ext')):
+            if not (info_dict.get('id') and info_dict.get('ext')):
                raise Exception('Test definition incorrect. The output file cannot be known. Are both \'id\' and \'ext\' keys present?')

        if 'skip' in test_case:
@@ -116,7 +116,7 @@ def generator(test_case):
        expect_warnings(ydl, test_case.get('expected_warnings', []))

        def get_tc_filename(tc):
-            return tc.get('file') or ydl.prepare_filename(tc.get('info_dict', {}))
+            return ydl.prepare_filename(tc.get('info_dict', {}))

        res_dict = None

@@ -0,0 +1,72 @@
+#!/usr/bin/env python
+from __future__ import unicode_literals
+
+# Allow direct execution
+import os
+import sys
+import unittest
+sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+
+from youtube_dl import YoutubeDL
+from youtube_dl.compat import compat_http_server
+import ssl
+import threading
+
+TEST_DIR = os.path.dirname(os.path.abspath(__file__))
+
+
+class HTTPTestRequestHandler(compat_http_server.BaseHTTPRequestHandler):
+    def log_message(self, format, *args):
+        pass
+
+    def do_GET(self):
+        if self.path == '/video.html':
+            self.send_response(200)
+            self.send_header('Content-Type', 'text/html; charset=utf-8')
+            self.end_headers()
+            self.wfile.write(b'<html><video src="/vid.mp4" /></html>')
+        elif self.path == '/vid.mp4':
+            self.send_response(200)
+            self.send_header('Content-Type', 'video/mp4')
+            self.end_headers()
+            self.wfile.write(b'\x00\x00\x00\x00\x20\x66\x74[video]')
+        else:
+            assert False
+
+
+class FakeLogger(object):
+    def debug(self, msg):
+        pass
+
+    def warning(self, msg):
+        pass
+
+    def error(self, msg):
+        pass
+
+
+class TestHTTP(unittest.TestCase):
+    def setUp(self):
+        certfn = os.path.join(TEST_DIR, 'testcert.pem')
+        self.httpd = compat_http_server.HTTPServer(
+            ('localhost', 0), HTTPTestRequestHandler)
+        self.httpd.socket = ssl.wrap_socket(
+            self.httpd.socket, certfile=certfn, server_side=True)
+        self.port = self.httpd.socket.getsockname()[1]
+        self.server_thread = threading.Thread(target=self.httpd.serve_forever)
+        self.server_thread.daemon = True
+        self.server_thread.start()
+
+    def test_nocheckcertificate(self):
+        if sys.version_info >= (2, 7, 9):  # No certificate checking anyways
+            ydl = YoutubeDL({'logger': FakeLogger()})
+            self.assertRaises(
+                Exception,
+                ydl.extract_info, 'https://localhost:%d/video.html' % self.port)
+
+        ydl = YoutubeDL({'logger': FakeLogger(), 'nocheckcertificate': True})
+        r = ydl.extract_info('https://localhost:%d/video.html' % self.port)
+        self.assertEqual(r['url'], 'https://localhost:%d/vid.mp4' % self.port)
+
+if __name__ == '__main__':
+    unittest.main()
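
For context on what the `nocheckcertificate` test above exercises, here is a minimal sketch of turning certificate verification off with the standard `ssl` module. It is an illustration only, not the code youtube-dl uses: `make_ssl_context` and its `check_certificate` parameter are hypothetical names, and `ssl.create_default_context` is assumed to be available (Python 2.7.9+ or 3.4+).

```python
import ssl


def make_ssl_context(check_certificate=True):
    """Build an SSL context, optionally with certificate checks disabled."""
    context = ssl.create_default_context()
    if not check_certificate:
        # check_hostname has to be switched off first; setting verify_mode
        # to CERT_NONE while it is still enabled raises ValueError.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

A context built with `check_certificate=False` would accept the self-signed `testcert.pem` served by the test server, whereas the default context would reject it.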
@@ -0,0 +1,95 @@
+#!/usr/bin/env python
+
+from __future__ import unicode_literals
+
+# Allow direct execution
+import os
+import sys
+import unittest
+sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+
+from youtube_dl.jsinterp import JSInterpreter
+
+
+class TestJSInterpreter(unittest.TestCase):
+    def test_basic(self):
+        jsi = JSInterpreter('function x(){;}')
+        self.assertEqual(jsi.call_function('x'), None)
+
+        jsi = JSInterpreter('function x3(){return 42;}')
+        self.assertEqual(jsi.call_function('x3'), 42)
+
+    def test_calc(self):
+        jsi = JSInterpreter('function x4(a){return 2*a+1;}')
+        self.assertEqual(jsi.call_function('x4', 3), 7)
+
+    def test_empty_return(self):
+        jsi = JSInterpreter('function f(){return; y()}')
+        self.assertEqual(jsi.call_function('f'), None)
+
+    def test_morespace(self):
+        jsi = JSInterpreter('function x (a) { return 2 * a + 1 ; }')
+        self.assertEqual(jsi.call_function('x', 3), 7)
+
+        jsi = JSInterpreter('function f () { x =  2  ; return x; }')
+        self.assertEqual(jsi.call_function('f'), 2)
+
+    def test_strange_chars(self):
+        jsi = JSInterpreter('function $_xY1 ($_axY1) { var $_axY2 = $_axY1 + 1; return $_axY2; }')
+        self.assertEqual(jsi.call_function('$_xY1', 20), 21)
+
+    def test_operators(self):
+        jsi = JSInterpreter('function f(){return 1 << 5;}')
+        self.assertEqual(jsi.call_function('f'), 32)
+
+        jsi = JSInterpreter('function f(){return 19 & 21;}')
+        self.assertEqual(jsi.call_function('f'), 17)
+
+        jsi = JSInterpreter('function f(){return 11 >> 2;}')
+        self.assertEqual(jsi.call_function('f'), 2)
+
+    def test_array_access(self):
+        jsi = JSInterpreter('function f(){var x = [1,2,3]; x[0] = 4; x[0] = 5; x[2] = 7; return x;}')
+        self.assertEqual(jsi.call_function('f'), [5, 2, 7])
+
+    def test_parens(self):
+        jsi = JSInterpreter('function f(){return (1) + (2) * ((( (( (((((3)))))) )) ));}')
+        self.assertEqual(jsi.call_function('f'), 7)
+
+        jsi = JSInterpreter('function f(){return (1 + 2) * 3;}')
+        self.assertEqual(jsi.call_function('f'), 9)
+
+    def test_assignments(self):
+        jsi = JSInterpreter('function f(){var x = 20; x = 30 + 1; return x;}')
+        self.assertEqual(jsi.call_function('f'), 31)
+
+        jsi = JSInterpreter('function f(){var x = 20; x += 30 + 1; return x;}')
+        self.assertEqual(jsi.call_function('f'), 51)
+
+        jsi = JSInterpreter('function f(){var x = 20; x -= 30 + 1; return x;}')
+        self.assertEqual(jsi.call_function('f'), -11)
+
+    def test_comments(self):
+        jsi = JSInterpreter('''
+        function x() {
+            var x = /* 1 + */ 2;
+            var y = /* 30
+            * 40 */ 50;
+            return x + y;
+        }
+        ''')
+        self.assertEqual(jsi.call_function('x'), 52)
+
+    def test_precedence(self):
+        jsi = JSInterpreter('''
+        function x() {
+            var a = [10, 20, 30, 40, 50];
+            var b = 6;
+            a[0]=a[b%a.length];
+            return a;
+        }''')
+        self.assertEqual(jsi.call_function('x'), [20, 20, 30, 40, 50])
+
+
+if __name__ == '__main__':
+    unittest.main()
@@ -156,6 +156,9 @@ class TestUtil(unittest.TestCase):
        self.assertEqual(
            unified_strdate('11/26/2014 11:30:00 AM PST', day_first=False),
            '20141126')
+        self.assertEqual(
+            unified_strdate('2/2/2015 6:47:40 PM', day_first=False),
+            '20150202')

    def test_find_xpath_attr(self):
        testxml = '''<root>
@@ -238,6 +241,8 @@ class TestUtil(unittest.TestCase):
        self.assertEqual(parse_duration('5 s'), 5)
        self.assertEqual(parse_duration('3 min'), 180)
        self.assertEqual(parse_duration('2.5 hours'), 9000)
+        self.assertEqual(parse_duration('02:03:04'), 7384)
+        self.assertEqual(parse_duration('01:02:03:04'), 93784)

    def test_fix_xml_ampersands(self):
        self.assertEqual(
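
The two colon-separated cases added to the `parse_duration` tests above follow from simple positional arithmetic, sketched below. This is not the `parse_duration` from `youtube_dl.utils`: `parse_clock_duration` is a hypothetical helper that handles only the `[[DD:]HH:]MM:SS` form with integer fields.

```python
def parse_clock_duration(s):
    """Convert '[[DD:]HH:]MM:SS' to seconds, or return None if malformed."""
    parts = s.strip().split(':')
    if not 1 <= len(parts) <= 4 or not all(p.isdigit() for p in parts):
        return None
    # Seconds, minutes, hours, days, matched from the rightmost field.
    multipliers = (1, 60, 3600, 86400)
    return sum(int(p) * m for p, m in zip(reversed(parts), multipliers))
```

For `'02:03:04'` this gives 2 * 3600 + 3 * 60 + 4 = 7384, and for `'01:02:03:04'` it gives 86400 + 7200 + 180 + 4 = 93784, matching the expected values in the new assertions.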
@@ -371,6 +376,16 @@ class TestUtil(unittest.TestCase):
        on = js_to_json('{"abc": true}')
        self.assertEqual(json.loads(on), {'abc': True})

+        # Ignore JavaScript code as well
+        on = js_to_json('''{
+            "x": 1,
+            y: "a",
+            z: some.code
+        }''')
+        d = json.loads(on)
+        self.assertEqual(d['x'], 1)
+        self.assertEqual(d['y'], 'a')
+
    def test_clean_html(self):
        self.assertEqual(clean_html('a:\nb'), 'a: b')
        self.assertEqual(clean_html('a:\n   "b"'), 'a:    "b"')
@@ -0,0 +1,52 @@
+-----BEGIN PRIVATE KEY-----
+MIIEvQIBADANBgkqhkiG9w0BAQEFAASCBKcwggSjAgEAAoIBAQDMF0bAzaHAdIyB
+HRmnIp4vv40lGqEePmWqicCl0QZ0wsb5dNysSxSa7330M2QeQopGfdaUYF1uTcNp
+Qx6ECgBSfg+RrOBI7r/u4F+sKX8MUXVaf/5QoBUrGNGSn/pp7HMGOuQqO6BVg4+h
+A1ySSwUG8mZItLRry1ISyErmW8b9xlqfd97uLME/5tX+sMelRFjUbAx8A4CK58Ev
+mMguHVTlXzx5RMdYcf1VScYcjlV/qA45uzP8zwI5aigfcmUD+tbGuQRhKxUhmw0J
+aobtOR6+JSOAULW5gYa/egE4dWLwbyM6b6eFbdnjlQzEA1EW7ChMPAW/Mo83KyiP
+tKMCSQulAgMBAAECggEALCfBDAexPjU5DNoh6bIorUXxIJzxTNzNHCdvgbCGiA54
+BBKPh8s6qwazpnjT6WQWDIg/O5zZufqjE4wM9x4+0Zoqfib742ucJO9wY4way6x4
+Clt0xzbLPabB+MoZ4H7ip+9n2+dImhe7pGdYyOHoNYeOL57BBi1YFW42Hj6u/8pd
+63YCXisto3Rz1YvRQVjwsrS+cRKZlzAFQRviL30jav7Wh1aWEfcXxjj4zhm8pJdk
+ITGtq6howz57M0NtX6hZnfe8ywzTnDFIGKIMA2cYHuYJcBh9bc4tCGubTvTKK9UE
+8fM+f6UbfGqfpKCq1mcgs0XMoFDSzKS9+mSJn0+5JQKBgQD+OCKaeH3Yzw5zGnlw
+XuQfMJGNcgNr+ImjmvzUAC2fAZUJLAcQueE5kzMv5Fmd+EFE2CEX1Vit3tg0SXvA
+G+bq609doILHMA03JHnV1npO/YNIhG3AAtJlKYGxQNfWH9mflYj9mEui8ZFxG52o
+zWhHYuifOjjZszUR+/eio6NPzwKBgQDNhUBTrT8LIX4SE/EFUiTlYmWIvOMgXYvN
+8Cm3IRNQ/yyphZaXEU0eJzfX5uCDfSVOgd6YM/2pRah+t+1Hvey4H8e0GVTu5wMP
+gkkqwKPGIR1YOmlw6ippqwvoJD7LuYrm6Q4D6e1PvkjwCq6lEndrOPmPrrXNd0JJ
+XO60y3U2SwKBgQDLkyZarryQXxcCI6Q10Tc6pskYDMIit095PUbTeiUOXNT9GE28
+Hi32ziLCakk9kCysNasii81MxtQ54tJ/f5iGbNMMddnkKl2a19Hc5LjjAm4cJzg/
+98KGEhvyVqvAo5bBDZ06/rcrD+lZOzUglQS5jcIcqCIYa0LHWQ/wJLxFzwKBgFcZ
+1SRhdSmDfUmuF+S4ZpistflYjC3IV5rk4NkS9HvMWaJS0nqdw4A3AMzItXgkjq4S
+DkOVLTkTI5Do5HAWRv/VwC5M2hkR4NMu1VGAKSisGiKtRsirBWSZMEenLNHshbjN
+Jrpz5rZ4H7NT46ZkCCZyFBpX4gb9NyOedjA7Via3AoGARF8RxbYjnEGGFuhnbrJB
+FTPR0vaL4faY3lOgRZ8jOG9V2c9Hzi/y8a8TU4C11jnJSDqYCXBTd5XN28npYxtD
+pjRsCwy6ze+yvYXPO7C978eMG3YRyj366NXUxnXN59ibwe/lxi2OD9z8J1LEdF6z
+VJua1Wn8HKxnXMI61DhTCSo=
+-----END PRIVATE KEY-----
+-----BEGIN CERTIFICATE-----
+MIIEEzCCAvugAwIBAgIJAK1haYi6gmSKMA0GCSqGSIb3DQEBCwUAMIGeMQswCQYD
+VQQGEwJERTEMMAoGA1UECAwDTlJXMRQwEgYDVQQHDAtEdWVzc2VsZG9yZjEbMBkG
+A1UECgwSeW91dHViZS1kbCBwcm9qZWN0MRkwFwYDVQQLDBB5b3V0dWJlLWRsIHRl
+c3RzMRIwEAYDVQQDDAlsb2NhbGhvc3QxHzAdBgkqhkiG9w0BCQEWEHBoaWhhZ0Bw
+aGloYWcuZGUwIBcNMTUwMTMwMDExNTA4WhgPMjExNTAxMDYwMTE1MDhaMIGeMQsw
+CQYDVQQGEwJERTEMMAoGA1UECAwDTlJXMRQwEgYDVQQHDAtEdWVzc2VsZG9yZjEb
+MBkGA1UECgwSeW91dHViZS1kbCBwcm9qZWN0MRkwFwYDVQQLDBB5b3V0dWJlLWRs
+IHRlc3RzMRIwEAYDVQQDDAlsb2NhbGhvc3QxHzAdBgkqhkiG9w0BCQEWEHBoaWhh
+Z0BwaGloYWcuZGUwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDMF0bA
+zaHAdIyBHRmnIp4vv40lGqEePmWqicCl0QZ0wsb5dNysSxSa7330M2QeQopGfdaU
+YF1uTcNpQx6ECgBSfg+RrOBI7r/u4F+sKX8MUXVaf/5QoBUrGNGSn/pp7HMGOuQq
+O6BVg4+hA1ySSwUG8mZItLRry1ISyErmW8b9xlqfd97uLME/5tX+sMelRFjUbAx8
+A4CK58EvmMguHVTlXzx5RMdYcf1VScYcjlV/qA45uzP8zwI5aigfcmUD+tbGuQRh
+KxUhmw0JaobtOR6+JSOAULW5gYa/egE4dWLwbyM6b6eFbdnjlQzEA1EW7ChMPAW/
+Mo83KyiPtKMCSQulAgMBAAGjUDBOMB0GA1UdDgQWBBTBUZoqhQkzHQ6xNgZfFxOd
+ZEVt8TAfBgNVHSMEGDAWgBTBUZoqhQkzHQ6xNgZfFxOdZEVt8TAMBgNVHRMEBTAD
+AQH/MA0GCSqGSIb3DQEBCwUAA4IBAQCUOCl3T/J9B08Z+ijfOJAtkbUaEHuVZb4x
+5EpZSy2ZbkLvtsftMFieHVNXn9dDswQc5qjYStCC4o60LKw4M6Y63FRsAZ/DNaqb
+PY3jyCyuugZ8/sNf50vHYkAcF7SQYqOQFQX4TQsNUk2xMJIt7H0ErQFmkf/u3dg6
+cy89zkT462IwxzSG7NNhIlRkL9o5qg+Y1mF9eZA1B0rcL6hO24PPTHOd90HDChBu
+SZ6XMi/LzYQSTf0Vg2R+uMIVlzSlkdcZ6sqVnnqeLL8dFyIa4e9sj/D4ZCYP8Mqe
+Z73H5/NNhmwCHRqVUTgm307xblQaWGhwAiDkaRvRW2aJQ0qGEdZK
+-----END CERTIFICATE-----
@@ -25,6 +25,7 @@ if os.name == 'nt':
    import ctypes

 from .compat import (
+    compat_basestring,
    compat_cookiejar,
    compat_expanduser,
    compat_http_client,
@@ -543,6 +544,11 @@ class YoutubeDL(object):
            outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
            tmpl = compat_expanduser(outtmpl)
            filename = tmpl % template_dict
+            # Temporary fix for #4787
+            # 'Treat' all problem characters by passing filename through preferredencoding
+            # to work around encoding issues with subprocess on Python 2 on Windows
+            if sys.version_info < (3, 0) and sys.platform == 'win32':
+                filename = encodeFilename(filename, True).decode(preferredencoding())
            return filename
        except ValueError as err:
            self.report_error('Error in output template: ' + str(err) + ' (encoding: ' + repr(preferredencoding()) + ')')
@@ -820,27 +826,44 @@ class YoutubeDL(object):
            '!=': operator.ne,
        }
        operator_rex = re.compile(r'''(?x)\s*\[
-            (?P<key>width|height|tbr|abr|vbr|filesize)
+            (?P<key>width|height|tbr|abr|vbr|asr|filesize|fps)
            \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?\s*
            (?P<value>[0-9.]+(?:[kKmMgGtTpPeEzZyY]i?[Bb]?)?)
            \]$
            ''' % '|'.join(map(re.escape, OPERATORS.keys())))
        m = operator_rex.search(format_spec)
+        if m:
+            try:
+                comparison_value = int(m.group('value'))
+            except ValueError:
+                comparison_value = parse_filesize(m.group('value'))
+                if comparison_value is None:
+                    comparison_value = parse_filesize(m.group('value') + 'B')
+                if comparison_value is None:
+                    raise ValueError(
+                        'Invalid value %r in format specification %r' % (
+                            m.group('value'), format_spec))
+            op = OPERATORS[m.group('op')]
+
+        if not m:
+            STR_OPERATORS = {
+                '=': operator.eq,
+                '!=': operator.ne,
+            }
+            str_operator_rex = re.compile(r'''(?x)\s*\[
+                \s*(?P<key>ext|acodec|vcodec|container|protocol)
+                \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
+                \s*(?P<value>[a-zA-Z0-9_-]+)
+                \s*\]$
+                ''' % '|'.join(map(re.escape, STR_OPERATORS.keys())))
+            m = str_operator_rex.search(format_spec)
+            if m:
+                comparison_value = m.group('value')
+                op = STR_OPERATORS[m.group('op')]
+
        if not m:
            raise ValueError('Invalid format specification %r' % format_spec)

-        try:
-            comparison_value = int(m.group('value'))
-        except ValueError:
-            comparison_value = parse_filesize(m.group('value'))
-            if comparison_value is None:
-                comparison_value = parse_filesize(m.group('value') + 'B')
-            if comparison_value is None:
-                raise ValueError(
-                    'Invalid value %r in format specification %r' % (
-                        m.group('value'), format_spec))
-        op = OPERATORS[m.group('op')]
-
        def _filter(f):
            actual_value = f.get(m.group('key'))
            if actual_value is None:
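
Review note: the hunk above first tries a numeric comparison (now also accepting `asr` and `fps`) and falls back to a string comparison on `ext`, `acodec`, `vcodec`, `container` or `protocol`. A minimal standalone sketch of that two-stage parse follows. The names `NUM_OPS`, `STR_OPS`, `_alternation` and `parse_filter` are hypothetical, and size suffixes such as `10M` (handled by `parse_filesize` in the real code) are deliberately left out.

```python
import operator
import re

# Hypothetical, simplified stand-in for the filter parsing in this diff.
# Only plain integers are accepted as numeric values here.
NUM_OPS = {
    '<': operator.lt, '<=': operator.le,
    '>': operator.gt, '>=': operator.ge,
    '=': operator.eq, '!=': operator.ne,
}
STR_OPS = {'=': operator.eq, '!=': operator.ne}


def _alternation(ops):
    # Longest operators first so that "<=" is tried before "<".
    return '|'.join(re.escape(o) for o in sorted(ops, key=len, reverse=True))


NUM_RE = re.compile(
    r'\[\s*(?P<key>width|height|tbr|abr|vbr|asr|filesize|fps)'
    r'\s*(?P<op>%s)(?P<none_inclusive>\s*\?)?\s*(?P<value>[0-9]+)\s*\]$'
    % _alternation(NUM_OPS))
STR_RE = re.compile(
    r'\[\s*(?P<key>ext|acodec|vcodec|container|protocol)'
    r'\s*(?P<op>%s)(?P<none_inclusive>\s*\?)?\s*(?P<value>[a-zA-Z0-9_-]+)\s*\]$'
    % _alternation(STR_OPS))


def parse_filter(spec):
    """Return a predicate over format dicts for a spec like '[height<=720]'."""
    m = NUM_RE.search(spec)
    if m:
        op, value = NUM_OPS[m.group('op')], int(m.group('value'))
    else:
        # Fall back to string comparison, as the diff does.
        m = STR_RE.search(spec)
        if not m:
            raise ValueError('Invalid format specification %r' % spec)
        op, value = STR_OPS[m.group('op')], m.group('value')
    key, none_inclusive = m.group('key'), bool(m.group('none_inclusive'))

    def _filter(f):
        actual = f.get(key)
        if actual is None:
            # A trailing "?" after the operator keeps formats lacking the key.
            return none_inclusive
        return op(actual, value)
    return _filter
```

With this sketch, `parse_filter('[ext=mp4]')` accepts a format dict whose `ext` is `mp4`, and `parse_filter('[height<=?720]')` also accepts formats with no known height.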
@@ -932,6 +955,9 @@ class YoutubeDL(object):
            def has_header(self, h):
                return h in self.headers

+            def get_header(self, h, default=None):
+                return self.headers.get(h, default)
+
        pr = _PseudoRequest(info_dict['url'])
        self.cookiejar.add_cookie_header(pr)
        return pr.headers.get('Cookie')
@@ -953,14 +979,16 @@ class YoutubeDL(object):
        if thumbnails is None:
            thumbnail = info_dict.get('thumbnail')
            if thumbnail:
-                thumbnails = [{'url': thumbnail}]
+                info_dict['thumbnails'] = thumbnails = [{'url': thumbnail}]
        if thumbnails:
            thumbnails.sort(key=lambda t: (
                t.get('preference'), t.get('width'), t.get('height'),
                t.get('id'), t.get('url')))
-            for t in thumbnails:
+            for i, t in enumerate(thumbnails):
                if 'width' in t and 'height' in t:
                    t['resolution'] = '%dx%d' % (t['width'], t['height'])
+                if t.get('id') is None:
+                    t['id'] = '%d' % i

        if thumbnails and 'thumbnail' not in info_dict:
            info_dict['thumbnail'] = thumbnails[-1]['url']
@@ -1068,8 +1096,10 @@ class YoutubeDL(object):
                                else self.params['merge_output_format'])
                            selected_format = {
                                'requested_formats': formats_info,
-                                'format': rf,
-                                'ext': formats_info[0]['ext'],
+                                'format': '%s+%s' % (formats_info[0].get('format'),
+                                                     formats_info[1].get('format')),
+                                'format_id': '%s+%s' % (formats_info[0].get('format_id'),
+                                                        formats_info[1].get('format_id')),
                                'width': formats_info[0].get('width'),
                                'height': formats_info[0].get('height'),
                                'resolution': formats_info[0].get('resolution'),
@@ -1130,7 +1160,7 @@ class YoutubeDL(object):

        self._num_downloads += 1

-        filename = self.prepare_filename(info_dict)
+        info_dict['_filename'] = filename = self.prepare_filename(info_dict)

        # Forced printings
        if self.params.get('forcetitle', False):
@@ -1155,10 +1185,7 @@ class YoutubeDL(object):
        if self.params.get('forceformat', False):
            self.to_stdout(info_dict['format'])
        if self.params.get('forcejson', False):
-            info_dict['_filename'] = filename
            self.to_stdout(json.dumps(info_dict))
-        if self.params.get('dump_single_json', False):
-            info_dict['_filename'] = filename

        # Do nothing else if in simulate mode
        if self.params.get('simulate', False):
@@ -1555,7 +1582,7 @@ class YoutubeDL(object):
        # urllib chokes on URLs with non-ASCII characters (see http://bugs.python.org/issue3991)
        # To work around the aforementioned issue, we replace the request's original URL
        # with a percent-encoded one
-        req_is_string = isinstance(req, basestring if sys.version_info < (3, 0) else compat_str)
+        req_is_string = isinstance(req, compat_basestring)
        url = req if req_is_string else req.get_full_url()
        url_escaped = escape_url(url)

@@ -361,7 +361,9 @@ def _real_main(argv=None):
                sys.exit()

            ydl.warn_if_short_id(sys.argv[1:] if argv is None else argv)
-            parser.error('you must provide at least one URL')
+            parser.error(
+                'You must provide at least one URL.\n'
+                'Type youtube-dl --help to see a list of all options.')

        try:
            if opts.load_info_filename is not None:
@@ -71,6 +71,11 @@ try:
 except ImportError:
    compat_subprocess_get_DEVNULL = lambda: open(os.path.devnull, 'w')

+try:
+    import http.server as compat_http_server
+except ImportError:
+    import BaseHTTPServer as compat_http_server
+
 try:
    from urllib.parse import unquote as compat_urllib_parse_unquote
 except ImportError:
@@ -109,6 +114,26 @@ except ImportError:
            string += pct_sequence.decode(encoding, errors)
        return string

+try:
+    compat_str = unicode  # Python 2
+except NameError:
+    compat_str = str
+
+try:
+    compat_basestring = basestring  # Python 2
+except NameError:
+    compat_basestring = str
+
+try:
+    compat_chr = unichr  # Python 2
+except NameError:
+    compat_chr = chr
+
+try:
+    from xml.etree.ElementTree import ParseError as compat_xml_parse_error
+except ImportError:  # Python 2.6
+    from xml.parsers.expat import ExpatError as compat_xml_parse_error
+

 try:
    from urllib.parse import parse_qs as compat_parse_qs
@@ -118,7 +143,7 @@ except ImportError:  # Python 2

    def _parse_qsl(qs, keep_blank_values=False, strict_parsing=False,
                   encoding='utf-8', errors='replace'):
-        qs, _coerce_result = qs, unicode
+        qs, _coerce_result = qs, compat_str
        pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
        r = []
        for name_value in pairs:
@@ -157,21 +182,6 @@ except ImportError:  # Python 2
                parsed_result[name] = [value]
        return parsed_result

-try:
-    compat_str = unicode  # Python 2
-except NameError:
-    compat_str = str
-
-try:
-    compat_chr = unichr  # Python 2
-except NameError:
-    compat_chr = chr
-
-try:
-    from xml.etree.ElementTree import ParseError as compat_xml_parse_error
-except ImportError:  # Python 2.6
-    from xml.parsers.expat import ExpatError as compat_xml_parse_error
-
 try:
    from shlex import quote as shlex_quote
 except ImportError:  # Python < 3.3
@@ -357,6 +367,7 @@ def workaround_optparse_bug9161():

 __all__ = [
    'compat_HTTPError',
+    'compat_basestring',
    'compat_chr',
    'compat_cookiejar',
    'compat_expanduser',
@@ -365,6 +376,7 @@ __all__ = [
    'compat_html_entities',
    'compat_html_parser',
    'compat_http_client',
+    'compat_http_server',
    'compat_kwargs',
    'compat_ord',
    'compat_parse_qs',
@@ -45,6 +45,12 @@ class ExternalFD(FileDownloader):
    def supports(cls, info_dict):
        return info_dict['protocol'] in ('http', 'https', 'ftp', 'ftps')

+    def _source_address(self, command_option):
+        source_address = self.params.get('source_address')
+        if source_address is None:
+            return []
+        return [command_option, source_address]
+
    def _call_downloader(self, tmpfilename, info_dict):
        """ Either override this or implement _make_cmd """
        cmd = self._make_cmd(tmpfilename, info_dict)
@@ -72,6 +78,7 @@ class CurlFD(ExternalFD):
        cmd = [self.exe, '-o', tmpfilename]
        for key, val in info_dict['http_headers'].items():
            cmd += ['--header', '%s: %s' % (key, val)]
+        cmd += self._source_address('--interface')
        cmd += ['--', info_dict['url']]
        return cmd

@@ -81,6 +88,7 @@ class WgetFD(ExternalFD):
        cmd = [self.exe, '-O', tmpfilename, '-nv', '--no-cookies']
        for key, val in info_dict['http_headers'].items():
            cmd += ['--header', '%s: %s' % (key, val)]
+        cmd += self._source_address('--bind-address')
        cmd += ['--', info_dict['url']]
        return cmd

@@ -96,6 +104,7 @@ class Aria2cFD(ExternalFD):
        cmd += ['--out', os.path.basename(tmpfilename)]
        for key, val in info_dict['http_headers'].items():
            cmd += ['--header', '%s: %s' % (key, val)]
+        cmd += self._source_address('--interface')
        cmd += ['--', info_dict['url']]
        return cmd
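
Review note: the three hunks above share one helper that yields either an empty list or an option/value pair, so each external downloader can append it unconditionally. A minimal sketch of that pattern outside the class hierarchy, assuming a plain `params` dict; `source_address_args` and `make_wget_cmd` are hypothetical names, and the command is only built, never executed.

```python
def source_address_args(params, command_option):
    """Return [option, address] if a source address is configured, else []."""
    source_address = params.get('source_address')
    if source_address is None:
        return []
    return [command_option, source_address]


def make_wget_cmd(params, tmpfilename, url, headers):
    # Mirrors the shape of WgetFD._make_cmd in the diff; 'wget' is only
    # named here as the first argv element.
    cmd = ['wget', '-O', tmpfilename, '-nv', '--no-cookies']
    for key, val in headers.items():
        cmd += ['--header', '%s: %s' % (key, val)]
    # Appending an empty list is a no-op, so no branching is needed here.
    cmd += source_address_args(params, '--bind-address')
    cmd += ['--', url]
    return cmd
```

Returning a list rather than a single string keeps the caller free of `if` checks and avoids passing an empty argument to the external program.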

@@ -230,6 +230,23 @@ class F4mFD(FileDownloader):
    A downloader for f4m manifests or AdobeHDS.
    """

+    def _get_unencrypted_media(self, doc):
+        media = doc.findall(_add_ns('media'))
+        if not media:
+            self.report_error('No media found')
+        for e in (doc.findall(_add_ns('drmAdditionalHeader')) +
+                  doc.findall(_add_ns('drmAdditionalHeaderSet'))):
+            # If id attribute is missing it's valid for all media nodes
+            # without drmAdditionalHeaderId or drmAdditionalHeaderSetId attribute
+            if 'id' not in e.attrib:
+                self.report_error('Missing ID in f4m DRM')
+        media = list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib and
+                                      'drmAdditionalHeaderSetId' not in e.attrib,
+                            media))
+        if not media:
+            self.report_error('Unsupported DRM')
+        return media
+
    def real_download(self, filename, info_dict):
        man_url = info_dict['url']
        requested_bitrate = info_dict.get('tbr')
@@ -248,7 +265,8 @@ class F4mFD(FileDownloader):
        )

        doc = etree.fromstring(manifest)
-        formats = [(int(f.attrib.get('bitrate', -1)), f) for f in doc.findall(_add_ns('media'))]
+        formats = [(int(f.attrib.get('bitrate', -1)), f)
+                   for f in self._get_unencrypted_media(doc)]
        if requested_bitrate is None:
            # get the best format
            formats = sorted(formats, key=lambda f: f[0])
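
Review note: `_get_unencrypted_media` drops every `media` node that references a DRM header, so only clear streams reach the bitrate selection. A standalone sketch of just the filtering step, assuming the f4m 1.0 namespace string below; `F4M_NS`, `unencrypted_media` and the sample manifest are illustrative, not taken from the project.

```python
import xml.etree.ElementTree as etree

F4M_NS = '{http://ns.adobe.com/f4m/1.0}'  # assumed f4m 1.0 namespace


def unencrypted_media(doc):
    """Keep only media nodes that do not reference a DRM header."""
    return [
        e for e in doc.findall(F4M_NS + 'media')
        if 'drmAdditionalHeaderId' not in e.attrib
        and 'drmAdditionalHeaderSetId' not in e.attrib]


# Made-up manifest with one clear and one DRM-protected stream.
MANIFEST = '''<manifest xmlns="http://ns.adobe.com/f4m/1.0">
  <media bitrate="500" url="clear"/>
  <media bitrate="1500" url="protected" drmAdditionalHeaderId="drm1"/>
</manifest>'''

doc = etree.fromstring(MANIFEST)
# Same (bitrate, node) ordering idea as real_download, reduced to URLs.
formats = sorted(
    (int(m.attrib.get('bitrate', -1)), m.attrib['url'])
    for m in unencrypted_media(doc))
```

The sketch omits the error reporting from the diff (`No media found`, `Missing ID in f4m DRM`, `Unsupported DRM`).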
@@ -11,6 +11,7 @@ from ..compat import (
    compat_urllib_request,
 )
 from ..utils import (
+    encodeArgument,
    encodeFilename,
 )

@@ -21,23 +22,22 @@ class HlsFD(FileDownloader):
        self.report_destination(filename)
        tmpfilename = self.temp_name(filename)

-        args = [
-            '-y', '-i', url, '-f', 'mp4', '-c', 'copy',
-            '-bsf:a', 'aac_adtstoasc',
-            encodeFilename(tmpfilename, for_subprocess=True)]
-
        ffpp = FFmpegPostProcessor(downloader=self)
        program = ffpp._executable
        if program is None:
            self.report_error('m3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
            return False
        ffpp.check_version()
-        cmd = [program] + args

-        retval = subprocess.call(cmd)
+        args = [
+            encodeArgument(opt)
+            for opt in (program, '-y', '-i', url, '-f', 'mp4', '-c', 'copy', '-bsf:a', 'aac_adtstoasc')]
+        args.append(encodeFilename(tmpfilename, True))
+
+        retval = subprocess.call(args)
        if retval == 0:
            fsize = os.path.getsize(encodeFilename(tmpfilename))
-            self.to_screen('\r[%s] %s bytes' % (cmd[0], fsize))
+            self.to_screen('\r[%s] %s bytes' % (args[0], fsize))
            self.try_rename(tmpfilename, filename)
            self._hook_progress({
                'downloaded_bytes': fsize,
@@ -3,6 +3,9 @@ from __future__ import unicode_literals
 import os
 import time

+from socket import error as SocketError
+import errno
+
 from .common import FileDownloader
 from ..compat import (
    compat_urllib_request,
@@ -99,6 +102,11 @@ class HttpFD(FileDownloader):
                            resume_len = 0
                            open_mode = 'wb'
                            break
+            except SocketError as e:
+                if e.errno != errno.ECONNRESET:
+                    # Connection reset is no problem, just retry
+                    raise
+
            # Retry
            count += 1
            if count <= retries:
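
Review note: the new `except SocketError` clause swallows only `ECONNRESET` and lets the existing retry counter take over; any other socket error is re-raised. A minimal sketch of that control flow, assuming a generic callable instead of the real download loop; `call_with_retries` is a hypothetical name.

```python
import errno
import socket


def call_with_retries(func, retries):
    """Call func(); retry only when the peer resets the connection."""
    count = 0
    while True:
        try:
            return func()
        except socket.error as e:
            if e.errno != errno.ECONNRESET:
                # Anything other than a connection reset is a real failure.
                raise
            count += 1
            if count > retries:
                # Out of retries: surface the last reset to the caller.
                raise
```

On Python 3, `socket.error` is an alias of `OSError`, so the same clause also catches `ConnectionResetError`.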
@@ -104,6 +104,7 @@ class RtmpFD(FileDownloader):
        live = info_dict.get('rtmp_live', False)
        conn = info_dict.get('rtmp_conn', None)
        protocol = info_dict.get('rtmp_protocol', None)
+        real_time = info_dict.get('rtmp_real_time', False)
        no_resume = info_dict.get('no_resume', False)
        continue_dl = info_dict.get('continuedl', False)

@@ -143,6 +144,8 @@ class RtmpFD(FileDownloader):
            basic_args += ['--conn', conn]
        if protocol is not None:
            basic_args += ['--protocol', protocol]
+        if real_time:
+            basic_args += ['--realtime']

        args = basic_args
        if not no_resume and continue_dl and not live:
@@ -6,6 +6,7 @@ from .academicearth import AcademicEarthCourseIE
 from .addanime import AddAnimeIE
 from .adobetv import AdobeTVIE
 from .adultswim import AdultSwimIE
+from .aftenposten import AftenpostenIE
 from .aftonbladet import AftonbladetIE
 from .aljazeera import AlJazeeraIE
 from .alphaporno import AlphaPornoIE
@@ -82,6 +83,7 @@ from .crunchyroll import (
    CrunchyrollShowPlaylistIE
 )
 from .cspan import CSpanIE
+from .ctsnews import CtsNewsIE
 from .dailymotion import (
    DailymotionIE,
    DailymotionPlaylistIE,
@@ -89,6 +91,7 @@ from .dailymotion import (
 )
 from .daum import DaumIE
 from .dbtv import DBTVIE
+from .dctp import DctpTvIE
 from .deezer import DeezerPlaylistIE
 from .dfb import DFBIE
 from .dotsub import DotsubIE
@@ -180,6 +183,7 @@ from .heise import HeiseIE
 from .hellporno import HellPornoIE
 from .helsinki import HelsinkiIE
 from .hentaistigma import HentaiStigmaIE
+from .historicfilms import HistoricFilmsIE
 from .hitbox import HitboxIE, HitboxLiveIE
 from .hornbunny import HornBunnyIE
 from .hostingbulk import HostingBulkIE
@@ -282,11 +286,22 @@ from .ndr import NDRIE
 from .ndtv import NDTVIE
 from .netzkino import NetzkinoIE
 from .nerdcubed import NerdCubedFeedIE
+from .nerdist import NerdistIE
 from .newgrounds import NewgroundsIE
 from .newstube import NewstubeIE
+from .nextmedia import (
+    NextMediaIE,
+    NextMediaActionNewsIE,
+    AppleDailyRealtimeNewsIE,
+    AppleDailyAnimationNewsIE
+)
 from .nfb import NFBIE
 from .nfl import NFLIE
-from .nhl import NHLIE, NHLVideocenterIE
+from .nhl import (
+    NHLIE,
+    NHLNewsIE,
+    NHLVideocenterIE,
+)
 from .niconico import NiconicoIE, NiconicoPlaylistIE
 from .ninegag import NineGagIE
 from .noco import NocoIE
@@ -304,7 +319,8 @@ from .nrk import (
    NRKIE,
    NRKTVIE,
 )
-from .ntv import NTVIE
+from .ntvde import NTVDeIE
+from .ntvru import NTVRuIE
 from .nytimes import NYTimesIE
 from .nuvid import NuvidIE
 from .oktoberfesttv import OktoberfestTVIE
@@ -460,6 +476,7 @@ from .tutv import TutvIE
 from .tvigle import TvigleIE
 from .tvp import TvpIE, TvpSeriesIE
 from .tvplay import TVPlayIE
+from .tweakers import TweakersIE
 from .twentyfourvideo import TwentyFourVideoIE
 from .twitch import (
    TwitchVideoIE,
@@ -539,6 +556,7 @@ from .wimp import WimpIE
 from .wistia import WistiaIE
 from .worldstarhiphop import WorldStarHipHopIE
 from .wrzuta import WrzutaIE
+from .wsj import WSJIE
 from .xbef import XBefIE
 from .xboxclips import XboxClipsIE
 from .xhamster import XHamsterIE
@@ -546,6 +564,7 @@ from .xminus import XMinusIE
 from .xnxx import XNXXIE
 from .xvideos import XVideosIE
 from .xtube import XTubeUserIE, XTubeIE
+from .xuite import XuiteIE
 from .xxxymovies import XXXYMoviesIE
 from .yahoo import (
    YahooIE,
@@ -0,0 +1,103 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    parse_iso8601,
+    xpath_with_ns,
+    xpath_text,
+    find_xpath_attr,
+)
+
+
+class AftenpostenIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?aftenposten\.no/webtv/([^/]+/)*(?P<id>[^/]+)-\d+\.html'
+
+    _TEST = {
+        'url': 'http://www.aftenposten.no/webtv/serier-og-programmer/sweatshopenglish/TRAILER-SWEATSHOP---I-cant-take-any-more-7800835.html?paging=&section=webtv_serierogprogrammer_sweatshop_sweatshopenglish',
+        'md5': 'fd828cd29774a729bf4d4425fe192972',
+        'info_dict': {
+            'id': '21039',
+            'ext': 'mov',
+            'title': 'TRAILER: "Sweatshop" - I can´t take any more',
+            'description': 'md5:21891f2b0dd7ec2f78d84a50e54f8238',
+            'timestamp': 1416927969,
+            'upload_date': '20141125',
+        }
+    }
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, display_id)
+
+        video_id = self._html_search_regex(
+            r'data-xs-id="(\d+)"', webpage, 'video id')
+
+        data = self._download_xml(
+            'http://frontend.xstream.dk/ap/feed/video/?platform=web&id=%s' % video_id, video_id)
+
+        NS_MAP = {
+            'atom': 'http://www.w3.org/2005/Atom',
+            'xt': 'http://xstream.dk/',
+            'media': 'http://search.yahoo.com/mrss/',
+        }
+
+        entry = data.find(xpath_with_ns('./atom:entry', NS_MAP))
+
+        title = xpath_text(
+            entry, xpath_with_ns('./atom:title', NS_MAP), 'title')
+        description = xpath_text(
+            entry, xpath_with_ns('./atom:summary', NS_MAP), 'description')
+        timestamp = parse_iso8601(xpath_text(
+            entry, xpath_with_ns('./atom:published', NS_MAP), 'upload date'))
+
+        formats = []
+        media_group = entry.find(xpath_with_ns('./media:group', NS_MAP))
+        for media_content in media_group.findall(xpath_with_ns('./media:content', NS_MAP)):
+            media_url = media_content.get('url')
+            if not media_url:
+                continue
+            tbr = int_or_none(media_content.get('bitrate'))
+            mobj = re.search(r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<playpath>.+)$', media_url)
+            if mobj:
+                formats.append({
+                    'url': mobj.group('url'),
+                    'play_path': 'mp4:%s' % mobj.group('playpath'),
+                    'app': mobj.group('app'),
+                    'ext': 'flv',
+                    'tbr': tbr,
+                    'format_id': 'rtmp-%d' % tbr if tbr is not None else 'rtmp',
+                })
+            else:
+                formats.append({
+                    'url': media_url,
+                    'tbr': tbr,
+                })
+        self._sort_formats(formats)
+
+        link = find_xpath_attr(
+            entry, xpath_with_ns('./atom:link', NS_MAP), 'rel', 'original')
+        if link is not None:
+            formats.append({
+                'url': link.get('href'),
+                'format_id': link.get('rel'),
+            })
+
+        thumbnails = [{
+            'url': splash.get('url'),
+            'width': int_or_none(splash.get('width')),
+            'height': int_or_none(splash.get('height')),
+        } for splash in media_group.findall(xpath_with_ns('./xt:splash', NS_MAP))]
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'timestamp': timestamp,
+            'formats': formats,
+            'thumbnails': thumbnails,
+        }
@@ -1,8 +1,6 @@
 # encoding: utf-8
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor


@@ -21,9 +19,7 @@ class AftonbladetIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.search(self._VALID_URL, url)
-
-        video_id = mobj.group('video_id')
+        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)

        # find internal video meta data
@@ -20,6 +20,7 @@ class AparatIE(InfoExtractor):
            'id': 'wP8On',
            'ext': 'mp4',
            'title': 'تیم گلکسی 11 - زومیت',
+            'age_limit': 0,
        },
        # 'skip': 'Extremely unreliable',
    }
@@ -34,7 +35,8 @@ class AparatIE(InfoExtractor):
                     video_id + '/vt/frame')
        webpage = self._download_webpage(embed_url, video_id)

-        video_urls = re.findall(r'fileList\[[0-9]+\]\s*=\s*"([^"]+)"', webpage)
+        video_urls = [video_url.replace('\\/', '/') for video_url in re.findall(
+            r'(?:fileList\[[0-9]+\]\s*=|"file"\s*:)\s*"([^"]+)"', webpage)]
        for i, video_url in enumerate(video_urls):
            req = HEADRequest(video_url)
            res = self._request_webpage(
@@ -46,7 +48,7 @@ class AparatIE(InfoExtractor):

        title = self._search_regex(r'\s+title:\s*"([^"]+)"', webpage, 'title')
        thumbnail = self._search_regex(
-            r'\s+image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)
+            r'image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)

        return {
            'id': video_id,
@@ -54,4 +56,5 @@ class AparatIE(InfoExtractor):
            'url': video_url,
            'ext': 'mp4',
            'thumbnail': thumbnail,
+            'age_limit': self._family_friendly_search(webpage),
        }
@@ -122,7 +122,6 @@ class AppleTrailersIE(InfoExtractor):
            playlist.append({
                '_type': 'video',
                'id': video_id,
-                'title': title,
                'formats': formats,
                'title': title,
                'duration': duration,
@@ -23,13 +23,7 @@ class ARDMediathekIE(InfoExtractor):

    _TESTS = [{
        'url': 'http://mediathek.daserste.de/sendungen_a-z/328454_anne-will/22429276_vertrauen-ist-gut-spionieren-ist-besser-geht',
-        'file': '22429276.mp4',
-        'md5': '469751912f1de0816a9fc9df8336476c',
-        'info_dict': {
-            'title': 'Vertrauen ist gut, Spionieren ist besser - Geht so deutsch-amerikanische Freundschaft?',
-            'description': 'Das Erste Mediathek [ARD]: Vertrauen ist gut, Spionieren ist besser - Geht so deutsch-amerikanische Freundschaft?, Anne Will, Über die Spionage-Affäre diskutieren Clemens Binninger, Katrin Göring-Eckardt, Georg Mascolo, Andrew B. Denison und Constanze Kurz.. Das Video zur Sendung Anne Will am Mittwoch, 16.07.2014',
-        },
-        'skip': 'Blocked outside of Germany',
+        'only_matching': True,
    }, {
        'url': 'http://www.ardmediathek.de/tv/Tatort/Das-Wunder-von-Wolbeck-Video-tgl-ab-20/Das-Erste/Video?documentId=22490580&bcastId=602916',
        'info_dict': {
@@ -10,7 +10,7 @@ from ..compat import compat_HTTPError
 class BBCCoUkIE(SubtitlesInfoExtractor):
    IE_NAME = 'bbc.co.uk'
    IE_DESC = 'BBC iPlayer'
-    _VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:(?:(?:programmes|iplayer/(?:episode|playlist))/)|music/clips[/#])(?P<id>[\da-z]{8})'
+    _VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:(?:(?:programmes|iplayer(?:/[^/]+)?/(?:episode|playlist))/)|music/clips[/#])(?P<id>[\da-z]{8})'

    _TESTS = [
        {
@@ -118,6 +118,9 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
        }, {
            'url': 'http://www.bbc.co.uk/music/clips#p02frcc3',
            'only_matching': True,
+        }, {
+            'url': 'http://www.bbc.co.uk/iplayer/cbeebies/episode/b0480276/bing-14-atchoo',
+            'only_matching': True,
        }
    ]

@@ -108,7 +108,7 @@ class BrightcoveIE(InfoExtractor):
        """

        # Fix up some stupid HTML, see https://github.com/rg3/youtube-dl/issues/1553
-        object_str = re.sub(r'(<param name="[^"]+" value="[^"]+")>',
+        object_str = re.sub(r'(<param(?:\s+[a-zA-Z0-9_]+="[^"]*")*)>',
                            lambda m: m.group(1) + '/>', object_str)
        # Fix up some stupid XML, see https://github.com/rg3/youtube-dl/issues/1608
        object_str = object_str.replace('<--', '<!--')
@@ -28,12 +28,10 @@ class CinchcastIE(InfoExtractor):
            item, './{http://developer.longtailvideo.com/trac/}date')
        upload_date = unified_strdate(date_str, day_first=False)
        # duration is present but wrong
-        formats = []
-        formats.append({
+        formats = [{
            'format_id': 'main',
-            'url': item.find(
-                './{http://search.yahoo.com/mrss/}content').attrib['url'],
-        })
+            'url': item.find('./{http://search.yahoo.com/mrss/}content').attrib['url'],
+        }]
        backup_url = xpath_text(
            item, './{http://developer.longtailvideo.com/trac/}backupContent')
        if backup_url:
@@ -49,7 +49,9 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
                              |(watch/(?P<date>[^/]*)/(?P<tdstitle>.*))
                          )|
                          (?P<interview>
-                              extended-interviews/(?P<interID>[0-9a-z]+)/(?:playlist_tds_extended_)?(?P<interview_title>.*?)(/.*?)?)))
+                              extended-interviews/(?P<interID>[0-9a-z]+)/
+                              (?:playlist_tds_extended_)?(?P<interview_title>[^/?#]*?)
+                              (?:/[^/?#]?|[?#]|$))))
                     '''
    _TESTS = [{
        'url': 'http://thedailyshow.cc.com/watch/thu-december-13-2012/kristen-stewart',
@@ -62,6 +64,38 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
            'uploader': 'thedailyshow',
            'title': 'thedailyshow kristen-stewart part 1',
        }
+    }, {
+        'url': 'http://thedailyshow.cc.com/extended-interviews/b6364d/sarah-chayes-extended-interview',
+        'info_dict': {
+            'id': 'sarah-chayes-extended-interview',
+            'description': 'Carnegie Endowment Senior Associate Sarah Chayes discusses how corrupt institutions function throughout the world in her book "Thieves of State: Why Corruption Threatens Global Security."',
+            'title': 'thedailyshow Sarah Chayes Extended Interview',
+        },
+        'playlist': [
+            {
+                'info_dict': {
+                    'id': '0baad492-cbec-4ec1-9e50-ad91c291127f',
+                    'ext': 'mp4',
+                    'upload_date': '20150129',
+                    'description': 'Carnegie Endowment Senior Associate Sarah Chayes discusses how corrupt institutions function throughout the world in her book "Thieves of State: Why Corruption Threatens Global Security."',
+                    'uploader': 'thedailyshow',
+                    'title': 'thedailyshow sarah-chayes-extended-interview part 1',
+                },
+            },
+            {
+                'info_dict': {
+                    'id': '1e4fb91b-8ce7-4277-bd7c-98c9f1bbd283',
+                    'ext': 'mp4',
+                    'upload_date': '20150129',
+                    'description': 'Carnegie Endowment Senior Associate Sarah Chayes discusses how corrupt institutions function throughout the world in her book "Thieves of State: Why Corruption Threatens Global Security."',
+                    'uploader': 'thedailyshow',
+                    'title': 'thedailyshow sarah-chayes-extended-interview part 2',
+                },
+            },
+        ],
+        'params': {
+            'skip_download': True,
+        },
    }, {
        'url': 'http://thedailyshow.cc.com/extended-interviews/xm3fnq/andrew-napolitano-extended-interview',
        'only_matching': True,
@@ -230,6 +264,7 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):

        return {
            '_type': 'playlist',
+            'id': epTitle,
            'entries': entries,
            'title': show_name + ' ' + title,
            'description': description,
@@ -89,7 +89,8 @@ class InfoExtractor(object):
                    * player_url SWF Player URL (used for rtmpdump).
                    * protocol   The protocol that will be used for the actual
                                 download, lower-case.
-                                 "http", "https", "rtsp", "rtmp", "m3u8" or so.
+                                 "http", "https", "rtsp", "rtmp", "rtmpe",
+                                 "m3u8", or "m3u8_native".
                    * preference Order number of this format. If this field is
                                 present and not None, the formats get sorted
                                 by this field, regardless of all other values.
@@ -144,6 +145,7 @@ class InfoExtractor(object):
    thumbnail:      Full URL to a video thumbnail image.
    description:    Full video description.
    uploader:       Full name of the video uploader.
+    creator:        The main artist who created the video.
    timestamp:      UNIX timestamp of the moment the video became available.
    upload_date:    Video upload date (YYYYMMDD).
                    If not explicitly set, calculated from timestamp.
@@ -654,6 +656,21 @@ class InfoExtractor(object):
        }
        return RATING_TABLE.get(rating.lower(), None)

+    def _family_friendly_search(self, html):
+        # See http://schema.org/VideoObject
+        family_friendly = self._html_search_meta('isFamilyFriendly', html)
+
+        if not family_friendly:
+            return None
+
+        RATING_TABLE = {
+            '1': 0,
+            'true': 0,
+            '0': 18,
+            'false': 18,
+        }
+        return RATING_TABLE.get(family_friendly.lower(), None)
+
    def _twitter_search_player(self, html):
        return self._html_search_meta('twitter:player', html,
                                      'twitter card player')
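
The mapping introduced by `_family_friendly_search` can be tried outside the extractor class. The following is a minimal standalone sketch, assuming the meta value has already been pulled out of the page; the function name `family_friendly_to_age_limit` is hypothetical and not part of youtube-dl.

```python
def family_friendly_to_age_limit(value):
    """Map a schema.org isFamilyFriendly meta value to an age limit.

    Hypothetical standalone helper mirroring the table in the hunk above.
    """
    if not value:
        # Meta tag missing or empty: the age limit is unknown.
        return None
    rating_table = {
        '1': 0,
        'true': 0,
        '0': 18,
        'false': 18,
    }
    # Unrecognised values also yield None instead of a guessed limit.
    return rating_table.get(value.lower(), None)


for raw in ('true', 'False', '0', 'maybe', None):
    print(repr(raw), '->', family_friendly_to_age_limit(raw))
```

Unlike the code removed from GoshgayIE later in this diff, a missing or unrecognised value gives `None` here and is not treated as 18.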
@@ -703,11 +720,11 @@ class InfoExtractor(object):
                preference,
                f.get('language_preference') if f.get('language_preference') is not None else -1,
                f.get('quality') if f.get('quality') is not None else -1,
+                f.get('tbr') if f.get('tbr') is not None else -1,
+                f.get('vbr') if f.get('vbr') is not None else -1,
                f.get('height') if f.get('height') is not None else -1,
                f.get('width') if f.get('width') is not None else -1,
                ext_preference,
-                f.get('tbr') if f.get('tbr') is not None else -1,
-                f.get('vbr') if f.get('vbr') is not None else -1,
                f.get('abr') if f.get('abr') is not None else -1,
                audio_ext_preference,
                f.get('fps') if f.get('fps') is not None else -1,
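
The effect of moving `tbr` and `vbr` ahead of `height` and `width` in the sort key can be illustrated with a reduced key. This is only a sketch: it keeps four of the fields, omits `ext_preference` and the others, and assumes the list is sorted from worst to best so that the last entry is the preferred one. The names `old_key`, `new_key` and the sample formats are hypothetical.

```python
def _or_minus_one(value):
    # Missing fields sort below any real value, as in the hunk above.
    return value if value is not None else -1


def old_key(f):
    # Reduced version of the previous ordering: resolution before bitrate.
    return (
        _or_minus_one(f.get('height')),
        _or_minus_one(f.get('width')),
        _or_minus_one(f.get('tbr')),
        _or_minus_one(f.get('vbr')),
    )


def new_key(f):
    # Reduced version of the new ordering: bitrate before resolution.
    return (
        _or_minus_one(f.get('tbr')),
        _or_minus_one(f.get('vbr')),
        _or_minus_one(f.get('height')),
        _or_minus_one(f.get('width')),
    )


formats = [
    {'format_id': 'tall-low-bitrate', 'height': 720, 'tbr': 800},
    {'format_id': 'short-high-bitrate', 'height': 480, 'tbr': 1500},
]

best_old = sorted(formats, key=old_key)[-1]['format_id']
best_new = sorted(formats, key=new_key)[-1]['format_id']
print(best_old, best_new)
```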
@@ -763,7 +780,7 @@ class InfoExtractor(object):
        self.to_screen(msg)
        time.sleep(timeout)

-    def _extract_f4m_formats(self, manifest_url, video_id):
+    def _extract_f4m_formats(self, manifest_url, video_id, preference=None, f4m_id=None):
        manifest = self._download_xml(
            manifest_url, video_id, 'Downloading f4m manifest',
            'Unable to download f4m manifest')
@@ -776,26 +793,28 @@ class InfoExtractor(object):
            media_nodes = manifest.findall('{http://ns.adobe.com/f4m/2.0}media')
        for i, media_el in enumerate(media_nodes):
            if manifest_version == '2.0':
-                manifest_url = '/'.join(manifest_url.split('/')[:-1]) + '/' + media_el.attrib.get('href')
+                manifest_url = ('/'.join(manifest_url.split('/')[:-1]) + '/'
+                                + (media_el.attrib.get('href') or media_el.attrib.get('url')))
            tbr = int_or_none(media_el.attrib.get('bitrate'))
-            format_id = 'f4m-%d' % (i if tbr is None else tbr)
            formats.append({
-                'format_id': format_id,
+                'format_id': '-'.join(filter(None, [f4m_id, 'f4m-%d' % (i if tbr is None else tbr)])),
                'url': manifest_url,
                'ext': 'flv',
                'tbr': tbr,
                'width': int_or_none(media_el.attrib.get('width')),
                'height': int_or_none(media_el.attrib.get('height')),
+                'preference': preference,
            })
        self._sort_formats(formats)

        return formats

    def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
-                              entry_protocol='m3u8', preference=None):
+                              entry_protocol='m3u8', preference=None,
+                              m3u8_id=None):

        formats = [{
-            'format_id': 'm3u8-meta',
+            'format_id': '-'.join(filter(None, [m3u8_id, 'm3u8-meta'])),
            'url': m3u8_url,
            'ext': ext,
            'protocol': 'm3u8',
@@ -831,9 +850,8 @@ class InfoExtractor(object):
                    formats.append({'url': format_url(line)})
                    continue
                tbr = int_or_none(last_info.get('BANDWIDTH'), scale=1000)
-
                f = {
-                    'format_id': 'm3u8-%d' % (tbr if tbr else len(formats)),
+                    'format_id': '-'.join(filter(None, [m3u8_id, 'm3u8-%d' % (tbr if tbr else len(formats))])),
                    'url': format_url(line.strip()),
                    'tbr': tbr,
                    'ext': ext,
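
Both the f4m and m3u8 helpers now build `format_id` with `'-'.join(filter(None, [...]))`, which drops the optional prefix when it is `None`. A small sketch of that idiom in isolation, where `build_format_id` is a hypothetical name used only for illustration:

```python
def build_format_id(prefix, base):
    """Join an optional prefix and a base id with '-', skipping empty parts."""
    # filter(None, ...) removes None and empty strings before joining.
    return '-'.join(filter(None, [prefix, base]))


print(build_format_id(None, 'm3u8-meta'))
print(build_format_id('hls', 'm3u8-%d' % 1500))
```

Callers that do not pass `m3u8_id` or `f4m_id` therefore keep the format ids they had before this change.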
@@ -859,10 +877,13 @@ class InfoExtractor(object):
        return formats

    # TODO: improve extraction
-    def _extract_smil_formats(self, smil_url, video_id):
+    def _extract_smil_formats(self, smil_url, video_id, fatal=True):
        smil = self._download_xml(
            smil_url, video_id, 'Downloading SMIL file',
-            'Unable to download SMIL file')
+            'Unable to download SMIL file', fatal=fatal)
+        if smil is False:
+            assert not fatal
+            return []

        base = smil.find('./head/meta').get('base')

@@ -0,0 +1,93 @@
+# -*- coding: utf-8 -*-
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import parse_iso8601, ExtractorError
+
+
+class CtsNewsIE(InfoExtractor):
+    # https connection failed (Connection reset)
+    _VALID_URL = r'http://news\.cts\.com\.tw/[a-z]+/[a-z]+/\d+/(?P<id>\d+)\.html'
+    _TESTS = [{
+        'url': 'http://news.cts.com.tw/cts/international/201501/201501291578109.html',
+        'md5': 'a9875cb790252b08431186d741beaabe',
+        'info_dict': {
+            'id': '201501291578109',
+            'ext': 'mp4',
+            'title': '以色列.真主黨交火 3人死亡',
+            'description': 'md5:95e9b295c898b7ff294f09d450178d7d',
+            'timestamp': 1422528540,
+            'upload_date': '20150129',
+        }
+    }, {
+        # News count does not appear on page but is still available in database
+        'url': 'http://news.cts.com.tw/cts/international/201309/201309031304098.html',
+        'md5': '3aee7e0df7cdff94e43581f54c22619e',
+        'info_dict': {
+            'id': '201309031304098',
+            'ext': 'mp4',
+            'title': '韓國31歲童顏男 貌如十多歲小孩',
+            'description': 'md5:f183feeba3752b683827aab71adad584',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'timestamp': 1378205880,
+            'upload_date': '20130903',
+        }
+    }, {
+        # With Youtube embedded video
+        'url': 'http://news.cts.com.tw/cts/money/201501/201501291578003.html',
+        'md5': '1d842c771dc94c8c3bca5af2cc1db9c5',
+        'add_ie': ['Youtube'],
+        'info_dict': {
+            'id': 'OVbfO7d0_hQ',
+            'ext': 'mp4',
+            'title': 'iPhone6熱銷 蘋果財報亮眼',
+            'description': 'md5:f395d4f485487bb0f992ed2c4b07aa7d',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'upload_date': '20150128',
+            'uploader_id': 'TBSCTS',
+            'uploader': '中華電視公司',
+        }
+    }]
+
+    def _real_extract(self, url):
+        news_id = self._match_id(url)
+        page = self._download_webpage(url, news_id)
+
+        if self._search_regex(r'(CTSPlayer2)', page, 'CTSPlayer2 identifier', default=None):
+            feed_url = self._html_search_regex(
+                r'(http://news\.cts\.com\.tw/action/mp4feed\.php\?news_id=\d+)',
+                page, 'feed url')
+            video_url = self._download_webpage(
+                feed_url, news_id, note='Fetching feed')
+        else:
+            self.to_screen('Not CTSPlayer video, trying Youtube...')
+            youtube_url = self._search_regex(
+                r'src="(//www\.youtube\.com/embed/[^"]+)"', page, 'youtube url',
+                default=None)
+            if not youtube_url:
+                raise ExtractorError('The news includes no videos!', expected=True)
+
+            return {
+                '_type': 'url',
+                'url': youtube_url,
+                'ie_key': 'Youtube',
+            }
+
+        description = self._html_search_meta('description', page)
+        title = self._html_search_meta('title', page)
+        thumbnail = self._html_search_meta('image', page)
+
+        datetime_str = self._html_search_regex(
+            r'(\d{4}/\d{2}/\d{2} \d{2}:\d{2})', page, 'date and time')
+        # Transform into ISO 8601 format with timezone info
+        datetime_str = datetime_str.replace('/', '-') + ':00+0800'
+        timestamp = parse_iso8601(datetime_str, delimiter=' ')
+
+        return {
+            'id': news_id,
+            'url': video_url,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
+        }
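
The date handling in `CtsNewsIE._real_extract` turns a page-local `YYYY/MM/DD HH:MM` string into a UNIX timestamp by appending seconds and a fixed `+0800` offset. The sketch below reproduces that transformation with the standard library instead of `parse_iso8601`. The function name is hypothetical, and the input string is an assumption chosen to match the `timestamp` in the first test case above.

```python
from datetime import datetime


def cts_datetime_to_timestamp(datetime_str):
    """Convert 'YYYY/MM/DD HH:MM' (assumed to be UTC+8) to a UNIX timestamp."""
    # Same string rewrite as in the extractor: ISO-style date plus offset.
    iso_like = datetime_str.replace('/', '-') + ':00+0800'
    parsed = datetime.strptime(iso_like, '%Y-%m-%d %H:%M:%S%z')
    return int(parsed.timestamp())


print(cts_datetime_to_timestamp('2015/01/29 18:49'))
```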
@@ -0,0 +1,57 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..compat import compat_str
+
+
+class DctpTvIE(InfoExtractor):
+    _VALID_URL = r'http://www\.dctp\.tv/(#/)?filme/(?P<id>.+?)/$'
+    _TEST = {
+        'url': 'http://www.dctp.tv/filme/videoinstallation-fuer-eine-kaufhausfassade/',
+        'info_dict': {
+            'id': '1324',
+            'display_id': 'videoinstallation-fuer-eine-kaufhausfassade',
+            'ext': 'flv',
+            'title': 'Videoinstallation für eine Kaufhausfassade'
+        }
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        base_url = 'http://dctp-ivms2-restapi.s3.amazonaws.com/'
+        version_json = self._download_json(
+            base_url + 'version.json',
+            video_id, note='Determining file version')
+        version = version_json['version_name']
+        info_json = self._download_json(
+            '{0}{1}/restapi/slugs/{2}.json'.format(base_url, version, video_id),
+            video_id, note='Fetching object ID')
+        object_id = compat_str(info_json['object_id'])
+        meta_json = self._download_json(
+            '{0}{1}/restapi/media/{2}.json'.format(base_url, version, object_id),
+            video_id, note='Downloading metadata')
+        uuid = meta_json['uuid']
+        title = meta_json['title']
+        wide = meta_json['is_wide']
+        if wide:
+            ratio = '16x9'
+        else:
+            ratio = '4x3'
+        play_path = 'mp4:{0}_dctp_0500_{1}.m4v'.format(uuid, ratio)
+
+        servers_json = self._download_json(
+            'http://www.dctp.tv/streaming_servers/',
+            video_id, note='Downloading server list')
+        url = servers_json[0]['endpoint']
+
+        return {
+            'id': object_id,
+            'title': title,
+            'format': 'rtmp',
+            'url': url,
+            'play_path': play_path,
+            'rtmp_real_time': True,
+            'ext': 'flv',
+            'display_id': video_id
+        }
@@ -1,40 +1,38 @@
 from __future__ import unicode_literals

-import re
-import json
-
 from .common import InfoExtractor


 class DefenseGouvFrIE(InfoExtractor):
    IE_NAME = 'defense.gouv.fr'
-    _VALID_URL = (r'http://.*?\.defense\.gouv\.fr/layout/set/'
-                  r'ligthboxvideo/base-de-medias/webtv/(.*)')
+    _VALID_URL = r'http://.*?\.defense\.gouv\.fr/layout/set/ligthboxvideo/base-de-medias/webtv/(?P<id>[^/?#]*)'

    _TEST = {
        'url': 'http://www.defense.gouv.fr/layout/set/ligthboxvideo/base-de-medias/webtv/attaque-chimique-syrienne-du-21-aout-2013-1',
-        'file': '11213.mp4',
        'md5': '75bba6124da7e63d2d60b5244ec9430c',
-        "info_dict": {
-            "title": "attaque-chimique-syrienne-du-21-aout-2013-1"
+        'info_dict': {
+            'id': '11213',
+            'ext': 'mp4',
+            'title': 'attaque-chimique-syrienne-du-21-aout-2013-1'
        }
    }

    def _real_extract(self, url):
-        title = re.match(self._VALID_URL, url).group(1)
+        title = self._match_id(url)
        webpage = self._download_webpage(url, title)
+
        video_id = self._search_regex(
            r"flashvars.pvg_id=\"(\d+)\";",
            webpage, 'ID')

        json_url = ('http://static.videos.gouv.fr/brightcovehub/export/json/'
                    + video_id)
-        info = self._download_webpage(json_url, title,
-                                      'Downloading JSON config')
-        video_url = json.loads(info)['renditions'][0]['url']
+        info = self._download_json(json_url, title, 'Downloading JSON config')
+        video_url = info['renditions'][0]['url']

-        return {'id': video_id,
-                'ext': 'mp4',
-                'url': video_url,
-                'title': title,
-                }
+        return {
+            'id': video_id,
+            'ext': 'mp4',
+            'url': video_url,
+            'title': title,
+        }
@@ -6,7 +6,7 @@ from ..utils import parse_iso8601


 class DRTVIE(SubtitlesInfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?dr\.dk/tv/se/(?:[^/]+/)+(?P<id>[\da-z-]+)(?:[/#?]|$)'
+    _VALID_URL = r'https?://(?:www\.)?dr\.dk/tv/se/(?:[^/]+/)*(?P<id>[\da-z-]+)(?:[/#?]|$)'

    _TEST = {
        'url': 'http://www.dr.dk/tv/se/partiets-mand/partiets-mand-7-8',
@@ -25,9 +25,15 @@ class DRTVIE(SubtitlesInfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)

-        programcard = self._download_json(
-            'http://www.dr.dk/mu/programcard/expanded/%s' % video_id, video_id, 'Downloading video JSON')
+        webpage = self._download_webpage(url, video_id)

+        video_id = self._search_regex(
+            r'data-(?:material-identifier|episode-slug)="([^"]+)"',
+            webpage, 'video id')
+
+        programcard = self._download_json(
+            'http://www.dr.dk/mu/programcard/expanded/%s' % video_id,
+            video_id, 'Downloading video JSON')
        data = programcard['Data'][0]

        title = data['Title']
@@ -1,77 +1,69 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import json
-import re
-
 from .common import InfoExtractor
 from ..compat import (
-    compat_parse_qs,
    compat_urlparse,
 )
+from ..utils import (
+    determine_ext,
+    int_or_none,
+)


 class FranceCultureIE(InfoExtractor):
-    _VALID_URL = r'(?P<baseurl>http://(?:www\.)?franceculture\.fr/)player/reecouter\?play=(?P<id>[0-9]+)'
+    _VALID_URL = r'https?://(?:www\.)?franceculture\.fr/player/reecouter\?play=(?P<id>[0-9]+)'
    _TEST = {
        'url': 'http://www.franceculture.fr/player/reecouter?play=4795174',
        'info_dict': {
            'id': '4795174',
            'ext': 'mp3',
            'title': 'Rendez-vous au pays des geeks',
+            'alt_title': 'Carnet nomade | 13-14',
            'vcodec': 'none',
-            'uploader': 'Colette Fellous',
            'upload_date': '20140301',
-            'duration': 3601,
            'thumbnail': r're:^http://www\.franceculture\.fr/.*/images/player/Carnet-nomade\.jpg$',
-            'description': 'Avec :Jean-Baptiste Péretié pour son documentaire sur Arte "La revanche des « geeks », une enquête menée aux Etats-Unis dans la S ...',
+            'description': 'startswith:Avec :Jean-Baptiste Péretié pour son documentaire sur Arte "La revanche des « geeks », une enquête menée aux Etats',
+            'timestamp': 1393700400,
        }
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-        baseurl = mobj.group('baseurl')
-
+        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
-        params_code = self._search_regex(
-            r"<param name='movie' value='/sites/all/modules/rf/rf_player/swf/loader.swf\?([^']+)' />",
-            webpage, 'parameter code')
-        params = compat_parse_qs(params_code)
-        video_url = compat_urlparse.urljoin(baseurl, params['urlAOD'][0])
+
+        video_path = self._search_regex(
+            r'<a id="player".*?href="([^"]+)"', webpage, 'video path')
+        video_url = compat_urlparse.urljoin(url, video_path)
+        timestamp = int_or_none(self._search_regex(
+            r'<a id="player".*?data-date="([0-9]+)"',
+            webpage, 'upload date', fatal=False))
+        thumbnail = self._search_regex(
+            r'<a id="player".*?>\s+<img src="([^"]+)"',
+            webpage, 'thumbnail', fatal=False)

        title = self._html_search_regex(
-            r'<h1 class="title[^"]+">(.+?)</h1>', webpage, 'title')
+            r'<span class="title-diffusion">(.*?)</span>', webpage, 'title')
+        alt_title = self._html_search_regex(
+            r'<span class="title">(.*?)</span>',
+            webpage, 'alt_title', fatal=False)
+        description = self._html_search_regex(
+            r'<span class="description">(.*?)</span>',
+            webpage, 'description', fatal=False)
+
        uploader = self._html_search_regex(
            r'(?s)<div id="emission".*?<span class="author">(.*?)</span>',
-            webpage, 'uploader', fatal=False)
-        thumbnail_part = self._html_search_regex(
-            r'(?s)<div id="emission".*?<img src="([^"]+)"', webpage,
-            'thumbnail', fatal=False)
-        if thumbnail_part is None:
-            thumbnail = None
-        else:
-            thumbnail = compat_urlparse.urljoin(baseurl, thumbnail_part)
-        description = self._html_search_regex(
-            r'(?s)<p class="desc">(.*?)</p>', webpage, 'description')
-
-        info = json.loads(params['infoData'][0])[0]
-        duration = info.get('media_length')
-        upload_date_candidate = info.get('media_section5')
-        upload_date = (
-            upload_date_candidate
-            if (upload_date_candidate is not None and
-                re.match(r'[0-9]{8}$', upload_date_candidate))
-            else None)
+            webpage, 'uploader', default=None)
+        vcodec = 'none' if determine_ext(video_url.lower()) == 'mp3' else None

        return {
            'id': video_id,
            'url': video_url,
-            'vcodec': 'none' if video_url.lower().endswith('.mp3') else None,
-            'duration': duration,
+            'vcodec': vcodec,
            'uploader': uploader,
-            'upload_date': upload_date,
+            'timestamp': timestamp,
            'title': title,
+            'alt_title': alt_title,
            'thumbnail': thumbnail,
            'description': description,
        }
@@ -230,12 +230,13 @@ class FranceTVIE(FranceTVBaseInfoExtractor):

 class GenerationQuoiIE(InfoExtractor):
    IE_NAME = 'france2.fr:generation-quoi'
-    _VALID_URL = r'https?://generation-quoi\.france2\.fr/portrait/(?P<name>.*)(\?|$)'
+    _VALID_URL = r'https?://generation-quoi\.france2\.fr/portrait/(?P<id>[^/?#]+)'

    _TEST = {
        'url': 'http://generation-quoi.france2.fr/portrait/garde-a-vous',
-        'file': 'k7FJX8VBcvvLmX4wA5Q.mp4',
        'info_dict': {
+            'id': 'k7FJX8VBcvvLmX4wA5Q',
+            'ext': 'mp4',
            'title': 'Génération Quoi - Garde à Vous',
            'uploader': 'Génération Quoi',
        },
@@ -243,14 +244,12 @@ class GenerationQuoiIE(InfoExtractor):
            # It uses Dailymotion
            'skip_download': True,
        },
-        'skip': 'Only available from France',
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        name = mobj.group('name')
-        info_url = compat_urlparse.urljoin(url, '/medias/video/%s.json' % name)
-        info_json = self._download_webpage(info_url, name)
+        display_id = self._match_id(url)
+        info_url = compat_urlparse.urljoin(url, '/medias/video/%s.json' % display_id)
+        info_json = self._download_webpage(info_url, display_id)
        info = json.loads(info_json)
        return self.url_result('http://www.dailymotion.com/video/%s' % info['id'],
                               ie='Dailymotion')
@@ -1,41 +1,67 @@
+# coding: utf-8
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
+from ..utils import (
+    xpath_text,
+    xpath_with_ns,
+)


 class GamekingsIE(InfoExtractor):
-    _VALID_URL = r'http://www\.gamekings\.tv/videos/(?P<name>[0-9a-z\-]+)'
-    _TEST = {
+    _VALID_URL = r'http://www\.gamekings\.tv/(?:videos|nieuws)/(?P<id>[^/]+)'
+    _TESTS = [{
        'url': 'http://www.gamekings.tv/videos/phoenix-wright-ace-attorney-dual-destinies-review/',
        # MD5 is flaky, seems to change regularly
        # 'md5': '2f32b1f7b80fdc5cb616efb4f387f8a3',
        'info_dict': {
-            'id': '20130811',
+            'id': 'phoenix-wright-ace-attorney-dual-destinies-review',
            'ext': 'mp4',
            'title': 'Phoenix Wright: Ace Attorney \u2013 Dual Destinies Review',
            'description': 'md5:36fd701e57e8c15ac8682a2374c99731',
-        }
-    }
+            'thumbnail': r're:^https?://.*\.jpg$',
+        },
+    }, {
+        # vimeo video
+        'url': 'http://www.gamekings.tv/videos/the-legend-of-zelda-majoras-mask/',
+        'md5': '12bf04dfd238e70058046937657ea68d',
+        'info_dict': {
+            'id': 'the-legend-of-zelda-majoras-mask',
+            'ext': 'mp4',
+            'title': 'The Legend of Zelda: Majora’s Mask',
+            'description': 'md5:9917825fe0e9f4057601fe1e38860de3',
+            'thumbnail': r're:^https?://.*\.jpg$',
+        },
+    }, {
+        'url': 'http://www.gamekings.tv/nieuws/gamekings-extra-shelly-en-david-bereiden-zich-voor-op-de-livestream/',
+        'only_matching': True,
+    }]

    def _real_extract(self, url):
+        video_id = self._match_id(url)

-        mobj = re.match(self._VALID_URL, url)
-        name = mobj.group('name')
-        webpage = self._download_webpage(url, name)
-        video_url = self._og_search_video_url(webpage)
+        webpage = self._download_webpage(url, video_id)

-        video = re.search(r'[0-9]+', video_url)
-        video_id = video.group(0)
+        playlist_id = self._search_regex(
+            r'gogoVideo\(\s*\d+\s*,\s*"([^"]+)', webpage, 'playlist id')

-        # Todo: add medium format
-        video_url = video_url.replace(video_id, 'large/' + video_id)
+        playlist = self._download_xml(
+            'http://www.gamekings.tv/wp-content/themes/gk2010/rss_playlist.php?id=%s' % playlist_id,
+            video_id)
+
+        NS_MAP = {
+            'jwplayer': 'http://rss.jwpcdn.com/'
+        }
+
+        item = playlist.find('./channel/item')
+
+        thumbnail = xpath_text(item, xpath_with_ns('./jwplayer:image', NS_MAP), 'thumbnail')
+        video_url = item.find(xpath_with_ns('./jwplayer:source', NS_MAP)).get('file')

        return {
            'id': video_id,
-            'ext': 'mp4',
            'url': video_url,
            'title': self._og_search_title(webpage),
            'description': self._og_search_description(webpage),
+            'thumbnail': thumbnail,
        }
@@ -140,6 +140,19 @@ class GenericIE(InfoExtractor):
            },
            'add_ie': ['Ooyala'],
        },
+        # multiple ooyala embeds on SBN network websites
+        {
+            'url': 'http://www.sbnation.com/college-football-recruiting/2015/2/3/7970291/national-signing-day-rationalizations-itll-be-ok-itll-be-ok',
+            'info_dict': {
+                'id': 'national-signing-day-rationalizations-itll-be-ok-itll-be-ok',
+                'title': '25 lies you will tell yourself on National Signing Day - SBNation.com',
+            },
+            'playlist_mincount': 3,
+            'params': {
+                'skip_download': True,
+            },
+            'add_ie': ['Ooyala'],
+        },
        # google redirect
        {
            'url': 'http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CCUQtwIwAA&url=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DcmQHVoWB5FY&ei=F-sNU-LLCaXk4QT52ICQBQ&usg=AFQjCNEw4hL29zgOohLXvpJ-Bdh2bils1Q&bvm=bv.61965928,d.bGE',
@@ -498,6 +511,32 @@ class GenericIE(InfoExtractor):
                'uploader': 'www.abc.net.au',
                'title': 'Game of Thrones with dice - Dungeons and Dragons fantasy role-playing game gets new life - 19/01/2015',
            }
+        },
+        # embedded viddler video
+        {
+            'url': 'http://deadspin.com/i-cant-stop-watching-john-wall-chop-the-nuggets-with-th-1681801597',
+            'info_dict': {
+                'id': '4d03aad9',
+                'ext': 'mp4',
+                'uploader': 'deadspin',
+                'title': 'WALL-TO-GORTAT',
+                'timestamp': 1422285291,
+                'upload_date': '20150126',
+            },
+            'add_ie': ['Viddler'],
+        },
+        # jwplayer YouTube
+        {
+            'url': 'http://media.nationalarchives.gov.uk/index.php/webinar-using-discovery-national-archives-online-catalogue/',
+            'info_dict': {
+                'id': 'Mrj4DVp2zeA',
+                'ext': 'mp4',
+                'upload_date': '20150204',
+                'uploader': 'The National Archives UK',
+                'description': 'md5:a236581cd2449dd2df4f93412f3f01c6',
+                'uploader_id': 'NationalArchives08',
+                'title': 'Webinar: Using Discovery, The National Archives’ online catalogue',
+            },
        }
    ]

@@ -860,12 +899,28 @@ class GenericIE(InfoExtractor):
        if mobj is not None:
            return self.url_result(mobj.group('url'))

+        # Look for embedded Viddler player
+        mobj = re.search(
+            r'<(?:iframe[^>]+?src|param[^>]+?value)=(["\'])(?P<url>(?:https?:)?//(?:www\.)?viddler\.com/(?:embed|player)/.+?)\1',
+            webpage)
+        if mobj is not None:
+            return self.url_result(mobj.group('url'))
+
        # Look for Ooyala videos
-        mobj = (re.search(r'player.ooyala.com/[^"?]+\?[^"]*?(?:embedCode|ec)=(?P<ec>[^"&]+)', webpage) or
-                re.search(r'OO.Player.create\([\'"].*?[\'"],\s*[\'"](?P<ec>.{32})[\'"]', webpage))
+        mobj = (re.search(r'player\.ooyala\.com/[^"?]+\?[^"]*?(?:embedCode|ec)=(?P<ec>[^"&]+)', webpage) or
+                re.search(r'OO\.Player\.create\([\'"].*?[\'"],\s*[\'"](?P<ec>.{32})[\'"]', webpage) or
+                re.search(r'SBN\.VideoLinkset\.ooyala\([\'"](?P<ec>.{32})[\'"]\)', webpage))
        if mobj is not None:
            return OoyalaIE._build_url_result(mobj.group('ec'))

+        # Look for multiple Ooyala embeds on SBN network websites
+        mobj = re.search(r'SBN\.VideoLinkset\.entryGroup\((\[.*?\])', webpage)
+        if mobj is not None:
+            embeds = self._parse_json(mobj.group(1), video_id, fatal=False)
+            if embeds:
+                return _playlist_from_matches(
+                    embeds, getter=lambda v: OoyalaIE._url_for_embed_code(v['provider_video_id']), ie='Ooyala')
+
        # Look for Aparat videos
        mobj = re.search(r'<iframe .*?src="(http://www\.aparat\.com/video/[^"]+)"', webpage)
        if mobj is not None:
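
The SBN `entryGroup` detection added above captures a JSON array from the page and reads `provider_video_id` from each entry. A standalone sketch of that step follows, using `json.loads` in place of `self._parse_json`; the function name and the sample markup are hypothetical. Because the pattern is non-greedy, it stops at the first `]`, so an entry that itself contains an array would be cut short and fail to parse.

```python
import json
import re


def sbn_embed_codes(webpage):
    """Return the Ooyala embed codes listed in an SBN entryGroup call."""
    mobj = re.search(r'SBN\.VideoLinkset\.entryGroup\((\[.*?\])', webpage)
    if mobj is None:
        return []
    try:
        embeds = json.loads(mobj.group(1))
    except ValueError:
        # Mirrors fatal=False: unparsable data yields no embeds.
        return []
    return [entry['provider_video_id'] for entry in embeds]


sample = (
    '<script>SBN.VideoLinkset.entryGroup('
    '[{"provider_video_id": "aaa"}, {"provider_video_id": "bbb"}]'
    ');</script>'
)
print(sbn_embed_codes(sample))
```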
@@ -992,7 +1047,12 @@ class GenericIE(InfoExtractor):

        # Look for embedded sbs.com.au player
        mobj = re.search(
-            r'<iframe[^>]+?src=(["\'])(?P<url>https?://(?:www\.)sbs\.com\.au/ondemand/video/single/.+?)\1',
+            r'''(?x)
+            (?:
+                <meta\s+property="og:video"\s+content=|
+                <iframe[^>]+?src=
+            )
+            (["\'])(?P<url>https?://(?:www\.)?sbs\.com\.au/ondemand/video/.+?)\1''',
            webpage)
        if mobj is not None:
            return self.url_result(mobj.group('url'), 'SBS')
@@ -1023,6 +1083,8 @@ class GenericIE(InfoExtractor):
            return self.url_result(mobj.group('url'), 'Livestream')

        def check_video(vurl):
+            if YoutubeIE.suitable(vurl):
+                return True
            vpath = compat_urlparse.urlparse(vurl).path
            vext = determine_ext(vpath)
            return '.' in vpath and vext not in ('swf', 'png', 'jpg', 'srt', 'sbv', 'sub', 'vtt', 'ttml')
@@ -1040,7 +1102,8 @@ class GenericIE(InfoExtractor):
                    JWPlayerOptions|
                    jwplayer\s*\(\s*["'][^'"]+["']\s*\)\s*\.setup
                )
-                .*?file\s*:\s*["\'](.*?)["\']''', webpage))
+                .*?
+                ['"]?file['"]?\s*:\s*["\'](.*?)["\']''', webpage))
        if not found:
            # Broaden the search a little bit
            found = filter_video(re.findall(r'[^A-Za-z0-9]?(?:file|source)=(http[^\'"&]*)', webpage))
@@ -1053,7 +1116,7 @@ class GenericIE(InfoExtractor):
            found = filter_video(re.findall(r'''(?xs)
                flowplayer\("[^"]+",\s*
                    \{[^}]+?\}\s*,
-                    \s*{[^}]+? ["']?clip["']?\s*:\s*\{\s*
+                    \s*\{[^}]+? ["']?clip["']?\s*:\s*\{\s*
                        ["']?url["']?\s*:\s*["']([^"']+)["']
            ''', webpage))
        if not found:
@@ -70,6 +70,19 @@ class GloboIE(InfoExtractor):
                'like_count': int,
            }
        },
+        {
+            'url': 'http://globotv.globo.com/canal-brasil/sangue-latino/t/todos-os-videos/v/ator-e-diretor-argentino-ricado-darin-fala-sobre-utopias-e-suas-perdas/3928201/',
+            'md5': 'c1defca721ce25b2354e927d3e4b3dec',
+            'info_dict': {
+                'id': '3928201',
+                'ext': 'mp4',
+                'title': 'Ator e diretor argentino, Ricado Darín fala sobre utopias e suas perdas',
+                'duration': 1472.906,
+                'uploader': 'Canal Brasil',
+                'uploader_id': 705,
+                'like_count': int,
+            }
+        },
    ]

    class MD5():
@@ -381,11 +394,16 @@ class GloboIE(InfoExtractor):
            signed_md5 = self.MD5.b64_md5(received_md5 + compat_str(sign_time) + padding)
            signed_hash = hash_code + compat_str(received_time) + received_random + compat_str(sign_time) + padding + signed_md5

-            formats.append({
-                'url': '%s?h=%s&k=%s' % (resource['url'], signed_hash, 'flash'),
-                'format_id': resource_id,
-                'height': resource['height']
-            })
+            resource_url = resource['url']
+            signed_url = '%s?h=%s&k=%s' % (resource_url, signed_hash, 'flash')
+            if resource_id.endswith('m3u8') or resource_url.endswith('.m3u8'):
+                formats.extend(self._extract_m3u8_formats(signed_url, resource_id, 'mp4'))
+            else:
+                formats.append({
+                    'url': signed_url,
+                    'format_id': resource_id,
+                    'height': resource.get('height'),
+                })

        self._sort_formats(formats)

@@ -34,8 +34,6 @@ class GoshgayIE(InfoExtractor):
        duration = parse_duration(self._html_search_regex(
            r'<span class="duration">\s*-?\s*(.*?)</span>',
            webpage, 'duration', fatal=False))
-        family_friendly = self._html_search_meta(
-            'isFamilyFriendly', webpage, default='false')

        flashvars = compat_parse_qs(self._html_search_regex(
            r'<embed.+?id="flash-player-embed".+?flashvars="([^"]+)"',
@@ -49,5 +47,5 @@ class GoshgayIE(InfoExtractor):
            'title': title,
            'thumbnail': thumbnail,
            'duration': duration,
-            'age_limit': 0 if family_friendly == 'true' else 18,
+            'age_limit': self._family_friendly_search(webpage),
        }
@@ -83,7 +83,7 @@ class GroovesharkIE(InfoExtractor):
        return compat_urlparse.urlunparse((uri.scheme, uri.netloc, obj['attrs']['data'], None, None, None))

    def _transform_bootstrap(self, js):
-        return re.split('(?m)^\s*try\s*{', js)[0] \
+        return re.split(r'(?m)^\s*try\s*\{', js)[0] \
                 .split(' = ', 1)[1].strip().rstrip(';')

    def _transform_meta(self, js):
@@ -0,0 +1,46 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import parse_duration
+
+
+class HistoricFilmsIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?historicfilms\.com/(?:tapes/|play)(?P<id>\d+)'
+    _TEST = {
+        'url': 'http://www.historicfilms.com/tapes/4728',
+        'md5': 'd4a437aec45d8d796a38a215db064e9a',
+        'info_dict': {
+            'id': '4728',
+            'ext': 'mov',
+            'title': 'Historic Films: GP-7',
+            'description': 'md5:1a86a0f3ac54024e419aba97210d959a',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'duration': 2096,
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        tape_id = self._search_regex(
+            r'class="tapeId">([^<]+)<', webpage, 'tape id')
+
+        title = self._og_search_title(webpage)
+        description = self._og_search_description(webpage)
+        thumbnail = self._html_search_meta(
+            'thumbnailUrl', webpage, 'thumbnails') or self._og_search_thumbnail(webpage)
+        duration = parse_duration(self._html_search_meta(
+            'duration', webpage, 'duration'))
+
+        video_url = 'http://www.historicfilms.com/video/%s_%s_web.mov' % (tape_id, video_id)
+
+        return {
+            'id': video_id,
+            'url': video_url,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'duration': duration,
+        }
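Review note: the new HistoricFilms extractor builds the media URL from two values, the numeric id taken from the page URL and a tape id scraped from the page. A minimal standalone sketch of that step, assuming the tape id has already been scraped (`'GP-7'` below is a hypothetical value used only for illustration, and `build_video_url` is not part of the patch):

```python
import re

# Same pattern as HistoricFilmsIE._VALID_URL in the hunk above.
VALID_URL = r'https?://(?:www\.)?historicfilms\.com/(?:tapes/|play)(?P<id>\d+)'


def build_video_url(page_url, tape_id):
    """Return the direct .mov URL for a page URL and an already-scraped tape id."""
    mobj = re.match(VALID_URL, page_url)
    if mobj is None:
        raise ValueError('unsupported URL: %s' % page_url)
    video_id = mobj.group('id')
    # Same URL template as in _real_extract above.
    return 'http://www.historicfilms.com/video/%s_%s_web.mov' % (tape_id, video_id)


# 'GP-7' is a hypothetical tape id, not a value confirmed by the patch.
print(build_video_url('http://www.historicfilms.com/tapes/4728', 'GP-7'))
```

In the extractor itself the id comes from `self._match_id(url)`; the helper here only mirrors that behaviour so the sketch runs without youtube-dl installed.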
@@ -16,7 +16,7 @@ from ..utils import (
 class IviIE(InfoExtractor):
    IE_DESC = 'ivi.ru'
    IE_NAME = 'ivi'
-    _VALID_URL = r'https?://(?:www\.)?ivi\.ru/(?:watch/(?:[^/]+/)?|video/player\?.*?videoId=)(?P<videoid>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?ivi\.ru/(?:watch/(?:[^/]+/)?|video/player\?.*?videoId=)(?P<id>\d+)'

    _TESTS = [
        # Single movie
@@ -63,29 +63,34 @@ class IviIE(InfoExtractor):
        return int(m.group('commentcount')) if m is not None else 0

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('videoid')
+        video_id = self._match_id(url)

        api_url = 'http://api.digitalaccess.ru/api/json/'

-        data = {'method': 'da.content.get',
-                'params': [video_id, {'site': 's183',
-                                      'referrer': 'http://www.ivi.ru/watch/%s' % video_id,
-                                      'contentid': video_id
-                                      }
-                           ]
+        data = {
+            'method': 'da.content.get',
+            'params': [
+                video_id, {
+                    'site': 's183',
+                    'referrer': 'http://www.ivi.ru/watch/%s' % video_id,
+                    'contentid': video_id
                }
+            ]
+        }

        request = compat_urllib_request.Request(api_url, json.dumps(data))

-        video_json_page = self._download_webpage(request, video_id, 'Downloading video JSON')
+        video_json_page = self._download_webpage(
+            request, video_id, 'Downloading video JSON')
        video_json = json.loads(video_json_page)

        if 'error' in video_json:
            error = video_json['error']
            if error['origin'] == 'NoRedisValidData':
                raise ExtractorError('Video %s does not exist' % video_id, expected=True)
-            raise ExtractorError('Unable to download video %s: %s' % (video_id, error['message']), expected=True)
+            raise ExtractorError(
+                'Unable to download video %s: %s' % (video_id, error['message']),
+                expected=True)

        result = video_json['result']

@@ -80,9 +80,6 @@ class IzleseneIE(InfoExtractor):
            r'comment_count\s*=\s*\'([^\']+)\';',
            webpage, 'comment_count', fatal=False)

-        family_friendly = self._html_search_meta(
-            'isFamilyFriendly', webpage, 'age limit', fatal=False)
-
        content_url = self._html_search_meta(
            'contentURL', webpage, 'content URL', fatal=False)
        ext = determine_ext(content_url, 'mp4')
@@ -120,6 +117,6 @@ class IzleseneIE(InfoExtractor):
            'duration': duration,
            'view_count': int_or_none(view_count),
            'comment_count': int_or_none(comment_count),
-            'age_limit': 18 if family_friendly == 'False' else 0,
+            'age_limit': self._family_friendly_search(webpage),
            'formats': formats,
        }
@@ -13,17 +13,17 @@ class KankanIE(InfoExtractor):

    _TEST = {
        'url': 'http://yinyue.kankan.com/vod/48/48863.shtml',
-        'file': '48863.flv',
        'md5': '29aca1e47ae68fc28804aca89f29507e',
        'info_dict': {
+            'id': '48863',
+            'ext': 'flv',
            'title': 'Ready To Go',
        },
        'skip': 'Only available from China',
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)

        title = self._search_regex(r'(?:G_TITLE=|G_MOVIE_TITLE = )[\'"](.+?)[\'"]', webpage, 'video title')
@@ -7,10 +7,6 @@ from .common import InfoExtractor
 from ..compat import (
    compat_urllib_parse_urlparse,
    compat_urllib_request,
-    compat_urllib_parse,
-)
-from ..aes import (
-    aes_decrypt_text
 )


@@ -18,9 +14,10 @@ class KeezMoviesIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?keezmovies\.com/video/.+?(?P<id>[0-9]+)(?:[/?&]|$)'
    _TEST = {
        'url': 'http://www.keezmovies.com/video/petite-asian-lady-mai-playing-in-bathtub-1214711',
-        'file': '1214711.mp4',
        'md5': '6e297b7e789329923fcf83abb67c9289',
        'info_dict': {
+            'id': '1214711',
+            'ext': 'mp4',
            'title': 'Petite Asian Lady Mai Playing In Bathtub',
            'age_limit': 18,
        }
@@ -39,11 +36,10 @@ class KeezMoviesIE(InfoExtractor):
            embedded_url = mobj.group(1)
            return self.url_result(embedded_url)

-        video_title = self._html_search_regex(r'<h1 [^>]*>([^<]+)', webpage, 'title')
-        video_url = compat_urllib_parse.unquote(self._html_search_regex(r'video_url=(.+?)&amp;', webpage, 'video_url'))
-        if 'encrypted=true' in webpage:
-            password = self._html_search_regex(r'video_title=(.+?)&amp;', webpage, 'password')
-            video_url = aes_decrypt_text(video_url, password, 32).decode('utf-8')
+        video_title = self._html_search_regex(
+            r'<h1 [^>]*>([^<]+)', webpage, 'title')
+        video_url = self._html_search_regex(
+            r'(?s)html5VideoPlayer = .*?src="([^"]+)"', webpage, 'video URL')
        path = compat_urllib_parse_urlparse(video_url).path
        extension = os.path.splitext(path)[1][1:]
        format = path.split('/')[4].split('_')[:2]
@@ -1,7 +1,5 @@
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
 from ..utils import (
    parse_duration,
@@ -20,9 +18,10 @@ class LA7IE(InfoExtractor):

    _TEST = {
        'url': 'http://www.la7.tv/richplayer/?assetid=50355319',
-        'file': '50355319.mp4',
        'md5': 'ec7d1f0224d20ba293ab56cf2259651f',
        'info_dict': {
+            'id': '50355319',
+            'ext': 'mp4',
            'title': 'IL DIVO',
            'description': 'Un film di Paolo Sorrentino con Toni Servillo, Anna Bonaiuto, Giulio Bosetti  e Flavio Bucci',
            'duration': 6254,
@@ -31,9 +30,7 @@ class LA7IE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-
+        video_id = self._match_id(url)
        xml_url = 'http://www.la7.tv/repliche/content/index.php?contentId=%s' % video_id
        doc = self._download_xml(xml_url, video_id)

@@ -6,13 +6,12 @@ import re
 from .common import InfoExtractor
 from ..utils import (
    int_or_none,
-    js_to_json,
    unified_strdate,
 )


 class LnkGoIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?lnkgo\.alfa\.lt/visi\-video/(?P<show>[^/]+)/ziurek\-(?P<display_id>[A-Za-z0-9\-]+)'
+    _VALID_URL = r'https?://(?:www\.)?lnkgo\.alfa\.lt/visi-video/(?P<show>[^/]+)/ziurek-(?P<id>[A-Za-z0-9-]+)'
    _TESTS = [{
        'url': 'http://lnkgo.alfa.lt/visi-video/yra-kaip-yra/ziurek-yra-kaip-yra-162',
        'info_dict': {
@@ -51,8 +50,7 @@ class LnkGoIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        display_id = mobj.group('display_id')
+        display_id = self._match_id(url)

        webpage = self._download_webpage(
            url, display_id, 'Downloading player webpage')
@@ -61,6 +59,8 @@ class LnkGoIE(InfoExtractor):
            r'data-ep="([^"]+)"', webpage, 'video ID')
        title = self._og_search_title(webpage)
        description = self._og_search_description(webpage)
+        upload_date = unified_strdate(self._search_regex(
+            r'class="[^"]*meta-item[^"]*air-time[^"]*">.*?<strong>([^<]+)</strong>', webpage, 'upload date', fatal=False))

        thumbnail_w = int_or_none(
            self._og_search_property('image:width', webpage, 'thumbnail width', fatal=False))
@@ -75,39 +75,28 @@ class LnkGoIE(InfoExtractor):
                'height': thumbnail_h,
            })

-        upload_date = unified_strdate(self._search_regex(
-            r'class="meta-item\sair-time">.*?<strong>([^<]+)</strong>', webpage, 'upload date', fatal=False))
-        duration = int_or_none(self._search_regex(
-            r'VideoDuration = "([^"]+)"', webpage, 'duration', fatal=False))
+        config = self._parse_json(self._search_regex(
+            r'episodePlayer\((\{.*?\}),\s*\{', webpage, 'sources'), video_id)

-        pg_rating = self._search_regex(
-            r'pgrating="([^"]+)"', webpage, 'PG rating', fatal=False, default='')
-        age_limit = self._AGE_LIMITS.get(pg_rating.upper(), 0)
+        if config.get('pGeo'):
+            self.report_warning(
+                'This content might not be available in your country due to copyright reasons')

-        sources_js = self._search_regex(
-            r'(?s)sources:\s(\[.*?\]),', webpage, 'sources')
-        sources = self._parse_json(
-            sources_js, video_id, transform_source=js_to_json)
+        formats = [{
+            'format_id': 'hls',
+            'ext': 'mp4',
+            'url': config['EpisodeVideoLink_HLS'],
+        }]

-        formats = []
-        for source in sources:
-            if source.get('provider') == 'rtmp':
-                m = re.search(r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<play_path>.+)$', source['file'])
-                if not m:
-                    continue
-                formats.append({
-                    'format_id': 'rtmp',
-                    'ext': 'flv',
-                    'url': m.group('url'),
-                    'play_path': m.group('play_path'),
-                    'page_url': url,
-                })
-            elif source.get('file').endswith('.m3u8'):
-                formats.append({
-                    'format_id': 'hls',
-                    'ext': source.get('type', 'mp4'),
-                    'url': source['file'],
-                })
+        m = re.search(r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<play_path>.+)$', config['EpisodeVideoLink'])
+        if m:
+            formats.append({
+                'format_id': 'rtmp',
+                'ext': 'flv',
+                'url': m.group('url'),
+                'play_path': m.group('play_path'),
+                'page_url': url,
+            })

        self._sort_formats(formats)
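Review note: the RTMP branch above splits `config['EpisodeVideoLink']` into a connection URL and a play path with a single regex. A small sketch of how that pattern behaves, using a made-up link (`example.com` and the path are assumptions, and `split_rtmp` is a hypothetical helper, not part of the extractor):

```python
import re

# Same pattern as in the LnkGoIE hunk above.
RTMP_RE = r'^(?P<url>rtmp://[^/]+/(?P<app>[^/]+))/(?P<play_path>.+)$'


def split_rtmp(link):
    """Split an RTMP link into url/app/play_path, or return None if it does not match."""
    m = re.search(RTMP_RE, link)
    if not m:
        return None
    return {
        'url': m.group('url'),
        'app': m.group('app'),
        'play_path': m.group('play_path'),
    }


# Hypothetical link for illustration only.
print(split_rtmp('rtmp://example.com/vod/mp4:shows/ep1.mp4'))
```

A link that is not RTMP yields `None`, which corresponds to the `if m:` guard that leaves only the HLS format in the list.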

@@ -117,8 +106,8 @@ class LnkGoIE(InfoExtractor):
            'title': title,
            'formats': formats,
            'thumbnails': [thumbnail],
-            'duration': duration,
+            'duration': int_or_none(config.get('VideoTime')),
            'description': description,
-            'age_limit': age_limit,
+            'age_limit': self._AGE_LIMITS.get(config.get('PGRating'), 0),
            'upload_date': upload_date,
        }
@@ -1,7 +1,5 @@
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
 from ..utils import ExtractorError

@@ -13,21 +11,22 @@ class MacGameStoreIE(InfoExtractor):

    _TEST = {
        'url': 'http://www.macgamestore.com/mediaviewer.php?trailer=2450',
-        'file': '2450.m4v',
        'md5': '8649b8ea684b6666b4c5be736ecddc61',
        'info_dict': {
+            'id': '2450',
+            'ext': 'm4v',
            'title': 'Crow',
        }
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(
+            url, video_id, 'Downloading trailer page')

-        webpage = self._download_webpage(url, video_id, 'Downloading trailer page')
-
-        if re.search(r'>Missing Media<', webpage) is not None:
-            raise ExtractorError('Trailer %s does not exist' % video_id, expected=True)
+        if '>Missing Media<' in webpage:
+            raise ExtractorError(
+                'Trailer %s does not exist' % video_id, expected=True)

        video_title = self._html_search_regex(
            r'<title>MacGameStore: (.*?) Trailer</title>', webpage, 'title')
@@ -9,7 +9,7 @@ from ..compat import (
 from ..utils import (
    ExtractorError,
    HEADRequest,
-    int_or_none,
+    str_to_int,
    parse_iso8601,
 )

@@ -18,7 +18,7 @@ class MixcloudIE(InfoExtractor):
    _VALID_URL = r'^(?:https?://)?(?:www\.)?mixcloud\.com/([^/]+)/([^/]+)'
    IE_NAME = 'mixcloud'

-    _TEST = {
+    _TESTS = [{
        'url': 'http://www.mixcloud.com/dholbach/cryptkeeper/',
        'info_dict': {
            'id': 'dholbach-cryptkeeper',
@@ -33,7 +33,20 @@ class MixcloudIE(InfoExtractor):
            'view_count': int,
            'like_count': int,
        },
-    }
+    }, {
+        'url': 'http://www.mixcloud.com/gillespeterson/caribou-7-inch-vinyl-mix-chat/',
+        'info_dict': {
+            'id': 'gillespeterson-caribou-7-inch-vinyl-mix-chat',
+            'ext': 'm4a',
+            'title': 'Electric Relaxation vol. 3',
+            'description': 'md5:2b8aec6adce69f9d41724647c65875e8',
+            'uploader': 'Daniel Drumz',
+            'uploader_id': 'gillespeterson',
+            'thumbnail': 're:https?://.*\.jpg',
+            'view_count': int,
+            'like_count': int,
+        },
+    }]

    def _get_url(self, track_id, template_url):
        server_count = 30
@@ -60,7 +73,7 @@ class MixcloudIE(InfoExtractor):
        webpage = self._download_webpage(url, track_id)

        preview_url = self._search_regex(
-            r'\s(?:data-preview-url|m-preview)="(.+?)"', webpage, 'preview url')
+            r'\s(?:data-preview-url|m-preview)="([^"]+)"', webpage, 'preview url')
        song_url = preview_url.replace('/previews/', '/c/originals/')
        template_url = re.sub(r'(stream\d*)', 'stream%d', song_url)
        final_song_url = self._get_url(track_id, template_url)
@@ -85,15 +98,17 @@ class MixcloudIE(InfoExtractor):
        uploader_id = self._search_regex(
            r'\s+"profile": "([^"]+)",', webpage, 'uploader id', fatal=False)
        description = self._og_search_description(webpage)
-        like_count = int_or_none(self._search_regex(
-            r'<meta itemprop="interactionCount" content="UserLikes:([0-9]+)"',
+        like_count = str_to_int(self._search_regex(
+            [r'<meta itemprop="interactionCount" content="UserLikes:([0-9]+)"',
+             r'/favorites/?">([0-9]+)<'],
            webpage, 'like count', fatal=False))
-        view_count = int_or_none(self._search_regex(
-            r'<meta itemprop="interactionCount" content="UserPlays:([0-9]+)"',
+        view_count = str_to_int(self._search_regex(
+            [r'<meta itemprop="interactionCount" content="UserPlays:([0-9]+)"',
+             r'/listeners/?">([0-9,.]+)</a>'],
            webpage, 'play count', fatal=False))
        timestamp = parse_iso8601(self._search_regex(
            r'<time itemprop="dateCreated" datetime="([^"]+)">',
-            webpage, 'upload date'))
+            webpage, 'upload date', default=None))

        return {
            'id': track_id,
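Review note: the switch from `int_or_none` to `str_to_int` matters because the new fallback pattern `([0-9,.]+)` can capture counts with thousands separators. The sketch below is an approximation of that behaviour written from scratch, not the library code; both function names are hypothetical stand-ins:

```python
import re


def str_to_int_sketch(value):
    """Approximate stand-in for utils.str_to_int: strip ',', '.' and '+' before converting."""
    if value is None:
        return None
    return int(re.sub(r'[,.+]', '', value))


def plain_int_or_none(value):
    """Plain conversion without separator handling, for comparison."""
    return None if value is None else int(value)


print(str_to_int_sketch('12,345'))
try:
    plain_int_or_none('12,345')
except ValueError as err:
    print('plain int() conversion fails:', err)
```

Both helpers pass `None` through, which is what allows the surrounding `_search_regex(..., fatal=False)` calls to return nothing without raising.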
@@ -1,21 +1,19 @@
 from __future__ import unicode_literals

-import json
-import re
-
 from .common import InfoExtractor
 from ..utils import int_or_none


 class MporaIE(InfoExtractor):
-    _VALID_URL = r'^https?://(www\.)?mpora\.(?:com|de)/videos/(?P<id>[^?#/]+)'
+    _VALID_URL = r'https?://(www\.)?mpora\.(?:com|de)/videos/(?P<id>[^?#/]+)'
    IE_NAME = 'MPORA'

    _TEST = {
        'url': 'http://mpora.de/videos/AAdo8okx4wiz/embed?locale=de',
-        'file': 'AAdo8okx4wiz.mp4',
        'md5': 'a7a228473eedd3be741397cf452932eb',
        'info_dict': {
+            'id': 'AAdo8okx4wiz',
+            'ext': 'mp4',
            'title': 'Katy Curd -  Winter in the Forest',
            'duration': 416,
            'uploader': 'Peter Newman Media',
@@ -23,14 +21,12 @@ class MporaIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        m = re.match(self._VALID_URL, url)
-        video_id = m.group('id')
-
+        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
+
        data_json = self._search_regex(
            r"new FM\.Player\('[^']+',\s*(\{.*?)\).player;", webpage, 'json')
-
-        data = json.loads(data_json)
+        data = self._parse_json(data_json, video_id)

        uploader = data['info_overlay'].get('username')
        duration = data['video']['duration'] // 1000
@@ -2,10 +2,11 @@ from __future__ import unicode_literals

 import re

-from .common import InfoExtractor
+from .subtitles import SubtitlesInfoExtractor
 from ..compat import (
    compat_urllib_parse,
    compat_urllib_request,
+    compat_str,
 )
 from ..utils import (
    ExtractorError,
@@ -22,7 +23,7 @@ def _media_xml_tag(tag):
    return '{http://search.yahoo.com/mrss/}%s' % tag


-class MTVServicesInfoExtractor(InfoExtractor):
+class MTVServicesInfoExtractor(SubtitlesInfoExtractor):
    _MOBILE_TEMPLATE = None

    @staticmethod
@@ -78,17 +79,42 @@ class MTVServicesInfoExtractor(InfoExtractor):
            try:
                _, _, ext = rendition.attrib['type'].partition('/')
                rtmp_video_url = rendition.find('./src').text
-                formats.append({'ext': ext,
-                                'url': self._transform_rtmp_url(rtmp_video_url),
-                                'format_id': rendition.get('bitrate'),
-                                'width': int(rendition.get('width')),
-                                'height': int(rendition.get('height')),
-                                })
+                if rtmp_video_url.endswith('siteunavail.png'):
+                    continue
+                formats.append({
+                    'ext': ext,
+                    'url': self._transform_rtmp_url(rtmp_video_url),
+                    'format_id': rendition.get('bitrate'),
+                    'width': int(rendition.get('width')),
+                    'height': int(rendition.get('height')),
+                })
            except (KeyError, TypeError):
                raise ExtractorError('Invalid rendition field.')
        self._sort_formats(formats)
        return formats

+    def _extract_subtitles(self, mdoc, mtvn_id):
+        subtitles = {}
+        FORMATS = {
+            'scc': 'cea-608',
+            'eia-608': 'cea-608',
+            'xml': 'ttml',
+        }
+        subtitles_format = FORMATS.get(
+            self._downloader.params.get('subtitlesformat'), 'ttml')
+        for transcript in mdoc.findall('.//transcript'):
+            if transcript.get('kind') != 'captions':
+                continue
+            lang = transcript.get('srclang')
+            for typographic in transcript.findall('./typographic'):
+                captions_format = typographic.get('format')
+                if captions_format == subtitles_format:
+                    subtitles[lang] = compat_str(typographic.get('src'))
+                    break
+        if self._downloader.params.get('listsubtitles', False):
+            self._list_available_subtitles(mtvn_id, subtitles)
+        return self.extract_subtitles(mtvn_id, subtitles)
+
    def _get_video_info(self, itemdoc):
        uri = itemdoc.find('guid').text
        video_id = self._id_from_uri(uri)
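Review note: `_extract_subtitles` maps the user-requested subtitle format onto the names used in the mediagen document and then keeps the first matching `typographic` entry per caption language. A self-contained sketch of that selection logic follows; the sample XML is an assumed shape invented for illustration, and `pick_subtitles` is a hypothetical helper:

```python
import xml.etree.ElementTree as ET

# Same mapping as FORMATS in the hunk above.
FORMATS = {
    'scc': 'cea-608',
    'eia-608': 'cea-608',
    'xml': 'ttml',
}

# Hypothetical mediagen fragment; the real document layout may differ.
SAMPLE = '''<package><video><item>
<transcript kind="captions" srclang="en">
  <typographic format="cea-608" src="http://example.com/en.scc"/>
  <typographic format="ttml" src="http://example.com/en.xml"/>
</transcript>
<transcript kind="descriptions" srclang="en">
  <typographic format="ttml" src="http://example.com/desc.xml"/>
</transcript>
</item></video></package>'''


def pick_subtitles(mdoc, requested_format=None):
    """Return {lang: url} for caption transcripts in the wanted format (default ttml)."""
    wanted = FORMATS.get(requested_format, 'ttml')
    subtitles = {}
    for transcript in mdoc.findall('.//transcript'):
        if transcript.get('kind') != 'captions':
            continue  # skip non-caption transcripts
        lang = transcript.get('srclang')
        for typographic in transcript.findall('./typographic'):
            if typographic.get('format') == wanted:
                subtitles[lang] = typographic.get('src')
                break  # first match per language wins
    return subtitles


doc = ET.fromstring(SAMPLE)
print(pick_subtitles(doc))
print(pick_subtitles(doc, 'scc'))
```

An unrecognised format name falls back to `ttml`, mirroring the `FORMATS.get(..., 'ttml')` default in the patch.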
@@ -135,6 +161,7 @@ class MTVServicesInfoExtractor(InfoExtractor):
        return {
            'title': title,
            'formats': self._extract_video_formats(mediagen_doc, mtvn_id),
+            'subtitles': self._extract_subtitles(mediagen_doc, mtvn_id),
            'id': video_id,
            'thumbnail': self._get_thumbnail_url(uri, itemdoc),
            'description': description,
@@ -167,7 +194,11 @@ class MTVServicesInfoExtractor(InfoExtractor):
            mgid = self._search_regex(
                [r'data-mgid="(.*?)"', r'swfobject.embedSWF\(".*?(mgid:.*?)"'],
                webpage, 'mgid')
-        return self._get_videos_info(mgid)
+
+        videos_info = self._get_videos_info(mgid)
+        if self._downloader.params.get('listsubtitles', False):
+            return
+        return videos_info


 class MTVServicesEmbeddedIE(MTVServicesInfoExtractor):
@@ -212,25 +243,14 @@ class MTVIE(MTVServicesInfoExtractor):
    _TESTS = [
        {
            'url': 'http://www.mtv.com/videos/misc/853555/ours-vh1-storytellers.jhtml',
-            'file': '853555.mp4',
            'md5': '850f3f143316b1e71fa56a4edfd6e0f8',
            'info_dict': {
+                'id': '853555',
+                'ext': 'mp4',
                'title': 'Taylor Swift - "Ours (VH1 Storytellers)"',
                'description': 'Album: Taylor Swift performs "Ours" for VH1 Storytellers at Harvey Mudd College.',
            },
        },
-        {
-            'add_ie': ['Vevo'],
-            'url': 'http://www.mtv.com/videos/taylor-swift/916187/everything-has-changed-ft-ed-sheeran.jhtml',
-            'file': 'USCJY1331283.mp4',
-            'md5': '73b4e7fcadd88929292fe52c3ced8caf',
-            'info_dict': {
-                'title': 'Everything Has Changed',
-                'upload_date': '20130606',
-                'uploader': 'Taylor Swift',
-            },
-            'skip': 'VEVO is only available in some countries',
-        },
    ]

    def _get_thumbnail_url(self, uri, itemdoc):
@@ -244,8 +264,8 @@ class MTVIE(MTVServicesInfoExtractor):
            webpage = self._download_webpage(url, video_id)

            # Some videos come from Vevo.com
-            m_vevo = re.search(r'isVevoVideo = true;.*?vevoVideoId = "(.*?)";',
-                               webpage, re.DOTALL)
+            m_vevo = re.search(
+                r'(?s)isVevoVideo = true;.*?vevoVideoId = "(.*?)";', webpage)
            if m_vevo:
                vevo_id = m_vevo.group(1)
                self.to_screen('Vevo video detected: %s' % vevo_id)
@@ -11,6 +11,7 @@ class NerdCubedFeedIE(InfoExtractor):
    _TEST = {
        'url': 'http://www.nerdcubed.co.uk/feed.json',
        'info_dict': {
+            'id': 'nerdcubed-feed',
            'title': 'nerdcubed.co.uk feed',
        },
        'playlist_mincount': 1300,
@@ -0,0 +1,80 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+
+from ..utils import (
+    determine_ext,
+    parse_iso8601,
+    xpath_text,
+)
+
+
+class NerdistIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?nerdist\.com/vepisode/(?P<id>[^/?#]+)'
+    _TEST = {
+        'url': 'http://www.nerdist.com/vepisode/exclusive-which-dc-characters-w',
+        'md5': '3698ed582931b90d9e81e02e26e89f23',
+        'info_dict': {
+            'display_id': 'exclusive-which-dc-characters-w',
+            'id': 'RPHpvJyr',
+            'ext': 'mp4',
+            'title': 'Your TEEN TITANS Revealed! Who\'s on the show?',
+            'thumbnail': 're:^https?://.*/thumbs/.*\.jpg$',
+            'description': 'Exclusive: Find out which DC Comics superheroes will star in TEEN TITANS Live-Action TV Show on Nerdist News with Jessica Chobot!',
+            'uploader': 'Eric Diaz',
+            'upload_date': '20150202',
+            'timestamp': 1422892808,
+        }
+    }
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+
+        video_id = self._search_regex(
+            r'''(?x)<script\s+(?:type="text/javascript"\s+)?
+                src="https?://content\.nerdist\.com/players/([a-zA-Z0-9_]+)-''',
+            webpage, 'video ID')
+        timestamp = parse_iso8601(self._html_search_meta(
+            'shareaholic:article_published_time', webpage, 'upload date'))
+        uploader = self._html_search_meta(
+            'shareaholic:article_author_name', webpage, 'article author')
+
+        doc = self._download_xml(
+            'http://content.nerdist.com/jw6/%s.xml' % video_id, video_id)
+        video_info = doc.find('.//item')
+        title = xpath_text(video_info, './title', fatal=True)
+        description = xpath_text(video_info, './description')
+        thumbnail = xpath_text(
+            video_info, './{http://rss.jwpcdn.com/}image', 'thumbnail')
+
+        formats = []
+        for source in video_info.findall('./{http://rss.jwpcdn.com/}source'):
+            vurl = source.attrib['file']
+            ext = determine_ext(vurl)
+            if ext == 'm3u8':
+                formats.extend(self._extract_m3u8_formats(
+                    vurl, video_id, entry_protocol='m3u8_native', ext='mp4',
+                    preference=0))
+            elif ext == 'smil':
+                formats.extend(self._extract_smil_formats(
+                    vurl, video_id, fatal=False
+                ))
+            else:
+                formats.append({
+                    'format_id': ext,
+                    'url': vurl,
+                })
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
+            'formats': formats,
+            'uploader': uploader,
+        }
@@ -0,0 +1,163 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import parse_iso8601
+
+
+class NextMediaIE(InfoExtractor):
+    _VALID_URL = r'http://hk\.apple\.nextmedia\.com/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'http://hk.apple.nextmedia.com/realtime/news/20141108/53109199',
+        'md5': 'dff9fad7009311c421176d1ac90bfe4f',
+        'info_dict': {
+            'id': '53109199',
+            'ext': 'mp4',
+            'title': '【佔領金鐘】50外國領事議員撐場 讚學生勇敢香港有希望',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'description': 'md5:28222b9912b6665a21011b034c70fcc7',
+            'timestamp': 1415456273,
+            'upload_date': '20141108',
+        }
+    }]
+
+    _URL_PATTERN = r'\{ url: \'(.+)\' \}'
+
+    def _real_extract(self, url):
+        news_id = self._match_id(url)
+        page = self._download_webpage(url, news_id)
+        return self._extract_from_nextmedia_page(news_id, url, page)
+
+    def _extract_from_nextmedia_page(self, news_id, url, page):
+        title = self._fetch_title(page)
+        video_url = self._search_regex(self._URL_PATTERN, page, 'video url')
+
+        attrs = {
+            'id': news_id,
+            'title': title,
+            'url': video_url,  # ext can be inferred from url
+            'thumbnail': self._fetch_thumbnail(page),
+            'description': self._fetch_description(page),
+        }
+
+        timestamp = self._fetch_timestamp(page)
+        if timestamp:
+            attrs['timestamp'] = timestamp
+        else:
+            attrs['upload_date'] = self._fetch_upload_date(url)
+
+        return attrs
+
+    def _fetch_title(self, page):
+        return self._og_search_title(page)
+
+    def _fetch_thumbnail(self, page):
+        return self._og_search_thumbnail(page)
+
+    def _fetch_timestamp(self, page):
+        dateCreated = self._search_regex('"dateCreated":"([^"]+)"', page, 'created time')
+        return parse_iso8601(dateCreated)
+
+    def _fetch_upload_date(self, url):
+        return self._search_regex(self._VALID_URL, url, 'upload date', group='date')
+
+    def _fetch_description(self, page):
+        return self._og_search_property('description', page)
+
+
+class NextMediaActionNewsIE(NextMediaIE):
+    _VALID_URL = r'http://hk\.dv\.nextmedia\.com/actionnews/[^/]+/(?P<date>\d+)/(?P<id>\d+)/\d+'
+    _TESTS = [{
+        'url': 'http://hk.dv.nextmedia.com/actionnews/hit/20150121/19009428/20061460',
+        'md5': '05fce8ffeed7a5e00665d4b7cf0f9201',
+        'info_dict': {
+            'id': '19009428',
+            'ext': 'mp4',
+            'title': '【壹週刊】細10年男友偷食　50歲邵美琪再失戀',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'description': 'md5:cd802fad1f40fd9ea178c1e2af02d659',
+            'timestamp': 1421791200,
+            'upload_date': '20150120',
+        }
+    }]
+
+    def _real_extract(self, url):
+        news_id = self._match_id(url)
+        actionnews_page = self._download_webpage(url, news_id)
+        article_url = self._og_search_url(actionnews_page)
+        article_page = self._download_webpage(article_url, news_id)
+        return self._extract_from_nextmedia_page(news_id, url, article_page)
+
+
+class AppleDailyRealtimeNewsIE(NextMediaIE):
+    _VALID_URL = r'http://(www|ent)\.appledaily\.com\.tw/(realtimenews|enews)/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)(/.*)?'
+    _TESTS = [{
+        'url': 'http://ent.appledaily.com.tw/enews/article/entertainment/20150128/36354694',
+        'md5': 'a843ab23d150977cc55ef94f1e2c1e4d',
+        'info_dict': {
+            'id': '36354694',
+            'ext': 'mp4',
+            'title': '周亭羽走過摩鐵陰霾2男陪吃 九把刀孤寒看醫生',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'description': 'md5:b23787119933404ce515c6356a8c355c',
+            'upload_date': '20150128',
+        }
+    }, {
+        'url': 'http://www.appledaily.com.tw/realtimenews/article/strange/20150128/550549/%E4%B8%8D%E6%BB%BF%E8%A2%AB%E8%B8%A9%E8%85%B3%E3%80%80%E5%B1%B1%E6%9D%B1%E5%85%A9%E5%A4%A7%E5%AA%BD%E4%B8%80%E8%B7%AF%E6%89%93%E4%B8%8B%E8%BB%8A',
+        'md5': '86b4e9132d158279c7883822d94ccc49',
+        'info_dict': {
+            'id': '550549',
+            'ext': 'mp4',
+            'title': '不滿被踩腳　山東兩大媽一路打下車',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'description': 'md5:2648aaf6fc4f401f6de35a91d111aa1d',
+            'upload_date': '20150128',
+        }
+    }]
+
+    _URL_PATTERN = r'\{url: \'(.+)\'\}'
+
+    def _fetch_title(self, page):
+        return self._html_search_regex(r'<h1 id="h1">([^<>]+)</h1>', page, 'news title')
+
+    def _fetch_thumbnail(self, page):
+        return self._html_search_regex(r"setInitialImage\(\'([^']+)'\)", page, 'video thumbnail', fatal=False)
+
+    def _fetch_timestamp(self, page):
+        return None
+
+
+class AppleDailyAnimationNewsIE(AppleDailyRealtimeNewsIE):
+    _VALID_URL = r'http://www\.appledaily\.com\.tw/animation/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)(/.*)?'
+    _TESTS = [{
+        'url': 'http://www.appledaily.com.tw/animation/realtimenews/new/20150128/5003671',
+        'md5': '03df296d95dedc2d5886debbb80cb43f',
+        'info_dict': {
+            'id': '5003671',
+            'ext': 'mp4',
+            'title': '20正妹熱舞　《刀龍傳說Online》火辣上市',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'description': 'md5:23c0aac567dc08c9c16a3161a2c2e3cd',
+            'upload_date': '20150128',
+        }
+    }, {
+        # No thumbnail
+        'url': 'http://www.appledaily.com.tw/animation/realtimenews/new/20150128/5003673/',
+        'md5': 'b06182cd386ea7bc6115ec7ff0f72aeb',
+        'info_dict': {
+            'id': '5003673',
+            'ext': 'mp4',
+            'title': '半夜尿尿　好像會看到___',
+            'description': 'md5:61d2da7fe117fede148706cdb85ac066',
+            'upload_date': '20150128',
+        },
+        'expected_warnings': [
+            'video thumbnail',
+        ]
+    }]
+
+    def _fetch_title(self, page):
+        return self._html_search_meta('description', page, 'news title')
+
+    def _fetch_description(self, page):
+        return self._html_search_meta('description', page, 'news description')
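Review note: `_extract_from_nextmedia_page` prefers a parsed timestamp and otherwise falls back to the date segment of the URL, which is why the Apple Daily subclasses can return `None` from `_fetch_timestamp`. A minimal sketch of that fallback, assuming the timestamp has already been parsed elsewhere (`date_fields` is a hypothetical helper):

```python
import re

# URL pattern modelled on NextMediaIE._VALID_URL (with the host dots escaped).
VALID_URL = r'http://hk\.apple\.nextmedia\.com/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)'


def date_fields(url, timestamp=None):
    """Return the date-related info fields: timestamp if known, else upload_date from the URL."""
    if timestamp:
        return {'timestamp': timestamp}
    mobj = re.match(VALID_URL, url)
    if mobj is None:
        raise ValueError('unsupported URL: %s' % url)
    return {'upload_date': mobj.group('date')}


url = 'http://hk.apple.nextmedia.com/realtime/news/20141108/53109199'
print(date_fields(url, timestamp=1415456273))
print(date_fields(url))
```

Only one of the two keys is set in each case; when `timestamp` is present, `upload_date` is left for the downloader core to derive.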
@@ -46,7 +46,18 @@ class NFLIE(InfoExtractor):
                'timestamp': 1388354455,
                'thumbnail': 're:^https?://.*\.jpg$',
            }
-        }
+        },
+        {
+            'url': 'http://www.nfl.com/news/story/0ap3000000467586/article/patriots-seahawks-involved-in-lategame-skirmish',
+            'info_dict': {
+                'id': '0ap3000000467607',
+                'ext': 'mp4',
+                'title': 'Frustrations flare on the field',
+                'description': 'Emotions ran high at the end of the Super Bowl on both sides of the ball after a dramatic finish.',
+                'timestamp': 1422850320,
+                'upload_date': '20150202',
+            },
+        },
    ]

    @staticmethod
@@ -80,7 +91,11 @@ class NFLIE(InfoExtractor):
        webpage = self._download_webpage(url, video_id)

        config_url = NFLIE.prepend_host(host, self._search_regex(
-            r'(?:config|configURL)\s*:\s*"([^"]+)"', webpage, 'config URL'))
+            r'(?:config|configURL)\s*:\s*"([^"]+)"', webpage, 'config URL',
+            default='static/content/static/config/video/config.json'))
+        # For articles, the id in the url is not the video id
+        video_id = self._search_regex(
+            r'contentId\s*:\s*"([^"]+)"', webpage, 'video id', default=video_id)
        config = self._download_json(config_url, video_id,
                                     note='Downloading player config')
        url_template = NFLIE.prepend_host(
@@ -20,6 +20,12 @@ class NHLBaseInfoExtractor(InfoExtractor):
    def _fix_json(json_string):
        return json_string.replace('\\\'', '\'')

+    def _real_extract_video(self, video_id):
+        json_url = 'http://video.nhl.com/videocenter/servlets/playlist?ids=%s&format=json' % video_id
+        data = self._download_json(
+            json_url, video_id, transform_source=self._fix_json)
+        return self._extract_video(data[0])
+
    def _extract_video(self, info):
        video_id = info['id']
        self.report_extraction(video_id)
@@ -54,7 +60,7 @@ class NHLBaseInfoExtractor(InfoExtractor):

 class NHLIE(NHLBaseInfoExtractor):
    IE_NAME = 'nhl.com'
-    _VALID_URL = r'https?://video(?P<team>\.[^.]*)?\.nhl\.com/videocenter/console(?:\?(?:.*?[?&])?)id=(?P<id>[-0-9a-zA-Z]+)'
+    _VALID_URL = r'https?://video(?P<team>\.[^.]*)?\.nhl\.com/videocenter/(?:console)?(?:\?(?:.*?[?&])?)id=(?P<id>[-0-9a-zA-Z]+)'

    _TESTS = [{
        'url': 'http://video.canucks.nhl.com/videocenter/console?catid=6?id=453614',
@@ -92,15 +98,41 @@ class NHLIE(NHLBaseInfoExtractor):
    }, {
        'url': 'http://video.flames.nhl.com/videocenter/console?id=630616',
        'only_matching': True,
+    }, {
+        'url': 'http://video.nhl.com/videocenter/?id=736722',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-        json_url = 'http://video.nhl.com/videocenter/servlets/playlist?ids=%s&format=json' % video_id
-        data = self._download_json(
-            json_url, video_id, transform_source=self._fix_json)
-        return self._extract_video(data[0])
+        video_id = self._match_id(url)
+        return self._real_extract_video(video_id)
+
+
+class NHLNewsIE(NHLBaseInfoExtractor):
+    IE_NAME = 'nhl.com:news'
+    IE_DESC = 'NHL news'
+    _VALID_URL = r'https?://(?:www\.)?nhl\.com/ice/news\.html?(?:\?(?:.*?[?&])?)id=(?P<id>[-0-9a-zA-Z]+)'
+
+    _TEST = {
+        'url': 'http://www.nhl.com/ice/news.htm?id=750727',
+        'md5': '4b3d1262e177687a3009937bd9ec0be8',
+        'info_dict': {
+            'id': '736722',
+            'ext': 'mp4',
+            'title': 'Cal Clutterbuck has been fined $2,000',
+            'description': 'md5:45fe547d30edab88b23e0dd0ab1ed9e6',
+            'duration': 37,
+            'upload_date': '20150128',
+        },
+    }
+
+    def _real_extract(self, url):
+        news_id = self._match_id(url)
+        webpage = self._download_webpage(url, news_id)
+        video_id = self._search_regex(
+            [r'pVid(\d+)', r"nlid\s*:\s*'(\d+)'"],
+            webpage, 'video id')
+        return self._real_extract_video(video_id)


 class NHLVideocenterIE(NHLBaseInfoExtractor):
@@ -1,8 +1,6 @@
 # encoding: utf-8
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor

 from ..utils import (
@@ -11,7 +9,7 @@ from ..utils import (


 class NormalbootsIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?normalboots\.com/video/(?P<videoid>[0-9a-z-]*)/?$'
+    _VALID_URL = r'http://(?:www\.)?normalboots\.com/video/(?P<id>[0-9a-z-]*)/?$'
    _TEST = {
        'url': 'http://normalboots.com/video/home-alone-games-jontron/',
        'md5': '8bf6de238915dd501105b44ef5f1e0f6',
@@ -30,19 +28,22 @@ class NormalbootsIE(InfoExtractor):
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('videoid')
-
+        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
-        video_uploader = self._html_search_regex(r'Posted\sby\s<a\shref="[A-Za-z0-9/]*">(?P<uploader>[A-Za-z]*)\s</a>',
-                                                 webpage, 'uploader')
-        raw_upload_date = self._html_search_regex('<span style="text-transform:uppercase; font-size:inherit;">[A-Za-z]+, (?P<date>.*)</span>',
-                                                  webpage, 'date')
-        video_upload_date = unified_strdate(raw_upload_date)

-        player_url = self._html_search_regex(r'<iframe\swidth="[0-9]+"\sheight="[0-9]+"\ssrc="(?P<url>[\S]+)"', webpage, 'url')
+        video_uploader = self._html_search_regex(
+            r'Posted\sby\s<a\shref="[A-Za-z0-9/]*">(?P<uploader>[A-Za-z]*)\s</a>',
+            webpage, 'uploader', fatal=False)
+        video_upload_date = unified_strdate(self._html_search_regex(
+            r'<span style="text-transform:uppercase; font-size:inherit;">[A-Za-z]+, (?P<date>.*)</span>',
+            webpage, 'date', fatal=False))
+
+        player_url = self._html_search_regex(
+            r'<iframe\swidth="[0-9]+"\sheight="[0-9]+"\ssrc="(?P<url>[\S]+)"',
+            webpage, 'player url')
        player_page = self._download_webpage(player_url, video_id)
-        video_url = self._html_search_regex(r"file:\s'(?P<file>[^']+\.mp4)'", player_page, 'file')
+        video_url = self._html_search_regex(
+            r"file:\s'(?P<file>[^']+\.mp4)'", player_page, 'file')

        return {
            'id': video_id,
@@ -1,6 +1,6 @@
 from __future__ import unicode_literals

-from .common import InfoExtractor
+from .subtitles import SubtitlesInfoExtractor
 from ..utils import (
    fix_xml_ampersands,
    parse_duration,
@@ -11,7 +11,7 @@ from ..utils import (
 )


-class NPOBaseIE(InfoExtractor):
+class NPOBaseIE(SubtitlesInfoExtractor):
    def _get_token(self, video_id):
        token_page = self._download_webpage(
            'http://ida.omroep.nl/npoplayer/i.js',
@@ -161,6 +161,16 @@ class NPOIE(NPOBaseIE):

        self._sort_formats(formats)

+        subtitles = {}
+        if metadata.get('tt888') == 'ja':
+            subtitles['nl'] = 'http://e.omroep.nl/tt888/%s' % video_id
+
+        if self._downloader.params.get('listsubtitles', False):
+            self._list_available_subtitles(video_id, subtitles)
+            return
+
+        subtitles = self.extract_subtitles(video_id, subtitles)
+
        return {
            'id': video_id,
            'title': metadata['titel'],
@@ -169,6 +179,7 @@ class NPOIE(NPOBaseIE):
            'upload_date': unified_strdate(metadata.get('gidsdatum')),
            'duration': parse_duration(metadata.get('tijdsduur')),
            'formats': formats,
+            'subtitles': subtitles,
        }


@@ -0,0 +1,68 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    js_to_json,
+    parse_duration,
+)
+
+
+class NTVDeIE(InfoExtractor):
+    IE_NAME = 'n-tv.de'
+    _VALID_URL = r'https?://(?:www\.)?n-tv\.de/mediathek/videos/[^/?#]+/[^/?#]+-article(?P<id>.+)\.html'
+
+    _TESTS = [{
+        'url': 'http://www.n-tv.de/mediathek/videos/panorama/Schnee-und-Glaette-fuehren-zu-zahlreichen-Unfaellen-und-Staus-article14438086.html',
+        'md5': '6ef2514d4b1e8e03ca24b49e2f167153',
+        'info_dict': {
+            'id': '14438086',
+            'ext': 'mp4',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'title': 'Schnee und Glätte führen zu zahlreichen Unfällen und Staus',
+            'alt_title': 'Winterchaos auf deutschen Straßen',
+            'description': 'Schnee und Glätte sorgen deutschlandweit für einen chaotischen Start in die Woche: Auf den Straßen kommt es zu kilometerlangen Staus und Dutzenden Glätteunfällen. In Düsseldorf und München wirbelt der Schnee zudem den Flugplan durcheinander. Dutzende Flüge landen zu spät, einige fallen ganz aus.',
+            'duration': 4020,
+            'timestamp': 1422892797,
+            'upload_date': '20150202',
+        },
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        info = self._parse_json(self._search_regex(
+            r'(?s)ntv\.pageInfo\.article =\s(\{.*?\});', webpage, 'info'),
+            video_id, transform_source=js_to_json)
+        timestamp = int_or_none(info.get('publishedDateAsUnixTimeStamp'))
+        vdata = self._parse_json(self._search_regex(
+            r'(?s)\$\(\s*"\#player"\s*\)\s*\.data\(\s*"player",\s*(\{.*?\})\);',
+            webpage, 'player data'),
+            video_id, transform_source=js_to_json)
+        duration = parse_duration(vdata.get('duration'))
+        formats = [{
+            'format_id': 'flash',
+            'url': 'rtmp://fms.n-tv.de/' + vdata['video'],
+        }, {
+            'format_id': 'mobile',
+            'url': 'http://video.n-tv.de' + vdata['videoMp4'],
+            'tbr': 400,  # estimation
+        }]
+        m3u8_url = 'http://video.n-tv.de' + vdata['videoM3u8']
+        formats.extend(self._extract_m3u8_formats(
+            m3u8_url, video_id, ext='mp4',
+            entry_protocol='m3u8_native', preference=0))
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': info['headline'],
+            'description': info.get('intro'),
+            'alt_title': info.get('kicker'),
+            'timestamp': timestamp,
+            'thumbnail': vdata.get('html5VideoPoster'),
+            'duration': duration,
+            'formats': formats,
+        }
@@ -1,15 +1,14 @@
 # encoding: utf-8
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
 from ..utils import (
    unescapeHTML
 )


-class NTVIE(InfoExtractor):
+class NTVRuIE(InfoExtractor):
+    IE_NAME = 'ntv.ru'
    _VALID_URL = r'http://(?:www\.)?ntv\.ru/(?P<id>.+)'

    _TESTS = [
@@ -92,9 +91,7 @@ class NTVIE(InfoExtractor):
    ]

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-
+        video_id = self._match_id(url)
        page = self._download_webpage(url, video_id)

        video_id = self._html_search_regex(self._VIDEO_ID_REGEXES, page, 'video id')
@@ -6,12 +6,13 @@ from .common import InfoExtractor


 class RingTVIE(InfoExtractor):
-    _VALID_URL = r'(?:http://)?(?:www\.)?ringtv\.craveonline\.com/(?P<type>news|videos/video)/(?P<id>[^/?#]+)'
+    _VALID_URL = r'http://(?:www\.)?ringtv\.craveonline\.com/(?P<type>news|videos/video)/(?P<id>[^/?#]+)'
    _TEST = {
        "url": "http://ringtv.craveonline.com/news/310833-luis-collazo-says-victor-ortiz-better-not-quit-on-jan-30",
-        "file": "857645.mp4",
        "md5": "d25945f5df41cdca2d2587165ac28720",
        "info_dict": {
+            'id': '857645',
+            'ext': 'mp4',
            "title": 'Video: Luis Collazo says Victor Ortiz "better not quit on Jan. 30" - Ring TV',
            "description": 'Luis Collazo is excited about his Jan. 30 showdown with fellow former welterweight titleholder Victor Ortiz at Barclays Center in his hometown of Brooklyn. The SuperBowl week fight headlines a Golden Boy Live! card on Fox Sports 1.',
        }
@@ -10,8 +10,9 @@ class RottenTomatoesIE(VideoDetectiveIE):

    _TEST = {
        'url': 'http://www.rottentomatoes.com/m/toy_story_3/trailers/11028566/',
-        'file': '613340.mp4',
        'info_dict': {
+            'id': '613340',
+            'ext': 'mp4',
            'title': 'TOY STORY 3',
            'description': 'From the creators of the beloved TOY STORY films, comes a story that will reunite the gang in a whole new way.',
        },
@@ -91,6 +91,15 @@ class RTLnowIE(InfoExtractor):
            },
        },
        {
+            'url': 'http://rtl-now.rtl.de/der-bachelor/folge-4.php?film_id=188729&player=1&season=5',
+            'info_dict': {
+                'id': '188729',
+                'ext': 'flv',
+                'upload_date': '20150204',
+                'description': 'md5:5e1ce23095e61a79c166d134b683cecc',
+                'title': 'Der Bachelor - Folge 4',
+            }
+        }, {
            'url': 'http://www.n-tvnow.de/deluxe-alles-was-spass-macht/thema-ua-luxushotel-fuer-vierbeiner.php?container_id=153819&player=1&season=0',
            'only_matching': True,
        },
@@ -134,9 +143,18 @@ class RTLnowIE(InfoExtractor):
                    'player_url': video_page_url + 'includes/vodplayer.swf',
                }
            else:
-                fmt = {
-                    'url': filename.text,
-                }
+                mobj = re.search(r'.*/(?P<hoster>[^/]+)/videos/(?P<play_path>.+)\.f4m', filename.text)
+                if mobj:
+                    fmt = {
+                        'url': 'rtmpe://fmspay-fra2.rtl.de/' + mobj.group('hoster'),
+                        'play_path': 'mp4:' + mobj.group('play_path'),
+                        'page_url': url,
+                        'player_url': video_page_url + 'includes/vodplayer.swf',
+                    }
+                else:
+                    fmt = {
+                        'url': filename.text,
+                    }
            fmt.update({
                'width': int_or_none(filename.get('width')),
                'height': int_or_none(filename.get('height')),
@@ -1,16 +1,16 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import json
+import re

 from .common import InfoExtractor
-from ..utils import js_to_json


 class RTPIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?rtp\.pt/play/p(?P<program_id>[0-9]+)/(?P<id>[^/?#]+)/?'
    _TESTS = [{
        'url': 'http://www.rtp.pt/play/p405/e174042/paixoes-cruzadas',
+        'md5': 'e736ce0c665e459ddb818546220b4ef8',
        'info_dict': {
            'id': 'e174042',
            'ext': 'mp3',
@@ -18,9 +18,6 @@ class RTPIE(InfoExtractor):
            'description': 'As paixões musicais de António Cartaxo e António Macedo',
            'thumbnail': 're:^https?://.*\.jpg',
        },
-        'params': {
-            'skip_download': True,  # RTMP download
-        },
    }, {
        'url': 'http://www.rtp.pt/play/p831/a-quimica-das-coisas',
        'only_matching': True,
@@ -37,20 +34,48 @@ class RTPIE(InfoExtractor):

        player_config = self._search_regex(
            r'(?s)RTPPLAY\.player\.newPlayer\(\s*(\{.*?\})\s*\)', webpage, 'player config')
-        config = json.loads(js_to_json(player_config))
+        config = self._parse_json(player_config, video_id)

        path, ext = config.get('file').rsplit('.', 1)
        formats = [{
+            'format_id': 'rtmp',
+            'ext': ext,
+            'vcodec': 'none' if config.get('type') == 'audio' else None,
+            'preference': -2,
+            'url': 'rtmp://{streamer:s}/{application:s}'.format(**config),
            'app': config.get('application'),
            'play_path': '{ext:s}:{path:s}'.format(ext=ext, path=path),
            'page_url': url,
-            'url': 'rtmp://{streamer:s}/{application:s}'.format(**config),
            'rtmp_live': config.get('live', False),
-            'ext': ext,
-            'vcodec': config.get('type') == 'audio' and 'none' or None,
            'player_url': 'http://programas.rtp.pt/play/player.swf?v3',
+            'rtmp_real_time': True,
        }]

+        # Construct regular HTTP download URLs
+        replacements = {
+            'audio': {
+                'format_id': 'mp3',
+                'pattern': r'^nas2\.share/wavrss/',
+                'repl': 'http://rsspod.rtp.pt/podcasts/',
+                'vcodec': 'none',
+            },
+            'video': {
+                'format_id': 'mp4_h264',
+                'pattern': r'^nas2\.share/h264/',
+                'repl': 'http://rsspod.rtp.pt/videocasts/',
+                'vcodec': 'h264',
+            },
+        }
+        r = replacements.get(config.get('type'))
+        if r and re.match(r['pattern'], config['file']) is not None:
+            formats.append({
+                'format_id': r['format_id'],
+                'url': re.sub(r['pattern'], r['repl'], config['file']),
+                'vcodec': r['vcodec'],
+            })
+
+        self._sort_formats(formats)
+
        return {
            'id': video_id,
            'title': title,
@@ -6,12 +6,14 @@ import re
 from .common import InfoExtractor
 from ..compat import (
    compat_str,
+    compat_urllib_parse_urlparse,
 )
 from ..utils import (
    int_or_none,
    parse_duration,
    parse_iso8601,
    unescapeHTML,
+    xpath_text,
 )


@@ -159,11 +161,27 @@ class RTSIE(InfoExtractor):
            return int_or_none(self._search_regex(
                r'-([0-9]+)k\.', url, 'bitrate', default=None))

-        formats = [{
-            'format_id': fid,
-            'url': furl,
-            'tbr': extract_bitrate(furl),
-        } for fid, furl in info['streams'].items()]
+        formats = []
+        for format_id, format_url in info['streams'].items():
+            if format_url.endswith('.f4m'):
+                token = self._download_xml(
+                    'http://tp.srgssr.ch/token/akahd.xml?stream=%s/*' % compat_urllib_parse_urlparse(format_url).path,
+                    video_id, 'Downloading %s token' % format_id)
+                auth_params = xpath_text(token, './/authparams', 'auth params')
+                if not auth_params:
+                    continue
+                formats.extend(self._extract_f4m_formats(
+                    '%s?%s&hdcore=3.4.0&plugin=aasp-3.4.0.132.66' % (format_url, auth_params),
+                    video_id, f4m_id=format_id))
+            elif format_url.endswith('.m3u8'):
+                formats.extend(self._extract_m3u8_formats(
+                    format_url, video_id, 'mp4', m3u8_id=format_id))
+            else:
+                formats.append({
+                    'format_id': format_id,
+                    'url': format_url,
+                    'tbr': extract_bitrate(format_url),
+                })

        if 'media' in info:
            formats.extend([{
@@ -57,7 +57,7 @@ def _decrypt_url(png):
 class RTVEALaCartaIE(InfoExtractor):
    IE_NAME = 'rtve.es:alacarta'
    IE_DESC = 'RTVE a la carta'
-    _VALID_URL = r'http://www\.rtve\.es/alacarta/videos/[^/]+/[^/]+/(?P<id>\d+)'
+    _VALID_URL = r'http://www\.rtve\.es/(m/)?alacarta/videos/[^/]+/[^/]+/(?P<id>\d+)'

    _TESTS = [{
        'url': 'http://www.rtve.es/alacarta/videos/balonmano/o-swiss-cup-masculina-final-espana-suecia/2491869/',
@@ -74,7 +74,11 @@ class RTVEALaCartaIE(InfoExtractor):
            'id': '1694255',
            'ext': 'flv',
            'title': 'TODO',
-        }
+        },
+        'skip': 'The f4m manifest can\'t be used yet',
+    }, {
+        'url': 'http://www.rtve.es/m/alacarta/videos/cuentame-como-paso/cuentame-como-paso-t16-ultimo-minuto-nuestra-vida-capitulo-276/2969138/?media=tve',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
@@ -86,6 +90,18 @@ class RTVEALaCartaIE(InfoExtractor):
        png_url = 'http://www.rtve.es/ztnr/movil/thumbnail/default/videos/%s.png' % video_id
        png = self._download_webpage(png_url, video_id, 'Downloading url information')
        video_url = _decrypt_url(png)
+        if not video_url.endswith('.f4m'):
+            auth_url = video_url.replace(
+                'resources/', 'auth/resources/'
+            ).replace('.net.rtve', '.multimedia.cdn.rtve')
+            video_path = self._download_webpage(
+                auth_url, video_id, 'Getting video url')
+            # Use mvod.akcdn instead of flash.akamaihd.multimedia.cdn to get
+            # the right Content-Length header and the mp4 format
+            video_url = (
+                'http://mvod.akcdn.rtve.es/{0}&v=2.6.8'
+                '&fp=MAC%2016,0,0,296&r=MRUGG&g=OEOJWFXNFGCP'.format(video_path)
+            )

        return {
            'id': video_id,
@@ -162,10 +162,8 @@ class RUTVIE(InfoExtractor):
                        'vbr': int(quality),
                    }
                elif transport == 'm3u8':
-                    fmt = {
-                        'url': url,
-                        'ext': 'mp4',
-                    }
+                    formats.extend(self._extract_m3u8_formats(url, video_id, 'mp4'))
+                    continue
                else:
                    fmt = {
                        'url': url
@@ -1,7 +1,5 @@
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
 from ..utils import (
    int_or_none,
@@ -13,10 +11,15 @@ class ServingSysIE(InfoExtractor):

    _TEST = {
        'url': 'http://bs.serving-sys.com/BurstingPipe/adServer.bs?cn=is&c=23&pl=VAST&pli=5349193&PluID=0&pos=7135&ord=[timestamp]&cim=1?',
+        'info_dict': {
+            'id': '5349193',
+            'title': 'AdAPPter_Hyundai_demo',
+        },
        'playlist': [{
-            'file': '29955898.flv',
            'md5': 'baed851342df6846eb8677a60a011a0f',
            'info_dict': {
+                'id': '29955898',
+                'ext': 'flv',
                'title': 'AdAPPter_Hyundai_demo (1)',
                'duration': 74,
                'tbr': 1378,
@@ -24,9 +27,10 @@ class ServingSysIE(InfoExtractor):
                'height': 400,
            },
        }, {
-            'file': '29907998.flv',
            'md5': '979b4da2655c4bc2d81aeb915a8c5014',
            'info_dict': {
+                'id': '29907998',
+                'ext': 'flv',
                'title': 'AdAPPter_Hyundai_demo (2)',
                'duration': 34,
                'width': 854,
@@ -37,14 +41,13 @@ class ServingSysIE(InfoExtractor):
        'params': {
            'playlistend': 2,
        },
-        'skip': 'Blocked in the US [sic]',
+        '_skip': 'Blocked in the US [sic]',
    }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        pl_id = mobj.group('id')
-
+        pl_id = self._match_id(url)
        vast_doc = self._download_xml(url, pl_id)
+
        title = vast_doc.find('.//AdTitle').text
        media = vast_doc.find('.//MediaFile').text
        info_url = self._search_regex(r'&adData=([^&]+)&', media, 'info URL')
@@ -11,7 +11,7 @@ from ..compat import (


 class SinaIE(InfoExtractor):
-    _VALID_URL = r'''https?://(.*?\.)?video\.sina\.com\.cn/
+    _VALID_URL = r'''(?x)https?://(.*?\.)?video\.sina\.com\.cn/
                        (
                            (.+?/(((?P<pseudo_id>\d+).html)|(.*?(\#|(vid=)|b/)(?P<id>\d+?)($|&|\-))))
                            |
@@ -23,9 +23,10 @@ class SinaIE(InfoExtractor):
    _TESTS = [
        {
            'url': 'http://video.sina.com.cn/news/vlist/zt/chczlj2013/?opsubject_id=top12#110028898',
-            'file': '110028898.flv',
            'md5': 'd65dd22ddcf44e38ce2bf58a10c3e71f',
            'info_dict': {
+                'id': '110028898',
+                'ext': 'flv',
                'title': '《中国新闻》 朝鲜要求巴拿马立即释放被扣船员',
            }
        },
@@ -39,10 +40,6 @@ class SinaIE(InfoExtractor):
        },
    ]

-    @classmethod
-    def suitable(cls, url):
-        return re.match(cls._VALID_URL, url, flags=re.VERBOSE) is not None
-
    def _extract_video(self, video_id):
        data = compat_urllib_parse.urlencode({'vid': video_id})
        url_doc = self._download_xml('http://v.iask.com/v_play.php?%s' % data,
@@ -59,7 +56,7 @@ class SinaIE(InfoExtractor):
                }

    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url, flags=re.VERBOSE)
+        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('id')
        if mobj.group('token') is not None:
            # The video id is in the redirected url
@@ -108,7 +108,7 @@ class SmotriIE(InfoExtractor):
        # swf player
        {
            'url': 'http://pics.smotri.com/scrubber_custom8.swf?file=v9188090500',
-            'md5': '4d47034979d9390d14acdf59c4935bc2',
+            'md5': '31099eeb4bc906712c5f40092045108d',
            'info_dict': {
                'id': 'v9188090500',
                'ext': 'mp4',
@@ -139,9 +139,6 @@ class SmotriIE(InfoExtractor):
    def _search_meta(self, name, html, display_name=None):
        if display_name is None:
            display_name = name
-        return self._html_search_regex(
-            r'<meta itemprop="%s" content="([^"]+)" />' % re.escape(name),
-            html, display_name, fatal=False)
        return self._html_search_meta(name, html, display_name)

    def _real_extract(self, url):
@@ -1,80 +0,0 @@
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..utils import (
-    HEADRequest,
-    urlhandle_detect_ext,
-)
-
-
-class SoulAnimeWatchingIE(InfoExtractor):
-    IE_NAME = "soulanime:watching"
-    IE_DESC = "SoulAnime video"
-    _TEST = {
-        'url': 'http://www.soul-anime.net/watching/seirei-tsukai-no-blade-dance-episode-9/',
-        'md5': '05fae04abf72298098b528e98abf4298',
-        'info_dict': {
-            'id': 'seirei-tsukai-no-blade-dance-episode-9',
-            'ext': 'mp4',
-            'title': 'seirei-tsukai-no-blade-dance-episode-9',
-            'description': 'seirei-tsukai-no-blade-dance-episode-9'
-        }
-    }
-    _VALID_URL = r'http://[w.]*soul-anime\.(?P<domain>[^/]+)/watch[^/]*/(?P<id>[^/]+)'
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-        domain = mobj.group('domain')
-
-        page = self._download_webpage(url, video_id)
-
-        video_url_encoded = self._html_search_regex(
-            r'<div id="download">[^<]*<a href="(?P<url>[^"]+)"', page, 'url')
-        video_url = "http://www.soul-anime." + domain + video_url_encoded
-
-        ext_req = HEADRequest(video_url)
-        ext_handle = self._request_webpage(
-            ext_req, video_id, note='Determining extension')
-        ext = urlhandle_detect_ext(ext_handle)
-
-        return {
-            'id': video_id,
-            'url': video_url,
-            'ext': ext,
-            'title': video_id,
-            'description': video_id
-        }
-
-
-class SoulAnimeSeriesIE(InfoExtractor):
-    IE_NAME = "soulanime:series"
-    IE_DESC = "SoulAnime Series"
-
-    _VALID_URL = r'http://[w.]*soul-anime\.(?P<domain>[^/]+)/anime./(?P<id>[^/]+)'
-
-    _EPISODE_REGEX = r'<option value="(/watch[^/]*/[^"]+)">[^<]*</option>'
-
-    _TEST = {
-        'url': 'http://www.soul-anime.net/anime1/black-rock-shooter-tv/',
-        'info_dict': {
-            'id': 'black-rock-shooter-tv'
-        },
-        'playlist_count': 8
-    }
-
-    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        series_id = mobj.group('id')
-        domain = mobj.group('domain')
-
-        pattern = re.compile(self._EPISODE_REGEX)
-
-        page = self._download_webpage(url, series_id, "Downloading series page")
-        mobj = pattern.findall(page)
-
-        entries = [self.url_result("http://www.soul-anime." + domain + obj) for obj in mobj]
-
-        return self.playlist_result(entries, series_id)
@@ -246,6 +246,7 @@ class SoundcloudSetIE(SoundcloudIE):
    _TESTS = [{
        'url': 'https://soundcloud.com/the-concept-band/sets/the-royal-concept-ep',
        'info_dict': {
+            'id': '2284613',
            'title': 'The Royal Concept EP',
        },
        'playlist_mincount': 6,
@@ -279,7 +280,7 @@ class SoundcloudSetIE(SoundcloudIE):
        return {
            '_type': 'playlist',
            'entries': [self._extract_info_dict(track, secret_token=token) for track in info['tracks']],
-            'id': info['id'],
+            'id': '%s' % info['id'],
            'title': info['title'],
        }

@@ -1,14 +1,12 @@
 from __future__ import unicode_literals

-import re
-
 from .mtv import MTVServicesInfoExtractor


 class SpikeIE(MTVServicesInfoExtractor):
    _VALID_URL = r'''(?x)https?://
-        (www\.spike\.com/(video-clips|episodes)/.+|
-         m\.spike\.com/videos/video.rbml\?id=(?P<mobile_id>[^&]+))
+        (?:www\.spike\.com/(?:video-clips|(?:full-)?episodes)/.+|
+         m\.spike\.com/videos/video\.rbml\?id=(?P<id>[^&]+))
        '''
    _TEST = {
        'url': 'http://www.spike.com/video-clips/lhtu8m/auction-hunters-can-allen-ride-a-hundred-year-old-motorcycle',
@@ -25,8 +23,7 @@ class SpikeIE(MTVServicesInfoExtractor):
    _MOBILE_TEMPLATE = 'http://m.spike.com/videos/video.rbml?id=%s'

    def _real_extract(self, url):
-        mobj = re.search(self._VALID_URL, url)
-        mobile_id = mobj.group('mobile_id')
-        if mobile_id is not None:
+        mobile_id = self._match_id(url)
+        if mobile_id:
            url = 'http://www.spike.com/video-clips/%s' % mobile_id
        return super(SpikeIE, self)._real_extract(url)
@@ -8,7 +8,7 @@ from ..utils import js_to_json


 class SRMediathekIE(InfoExtractor):
-    IE_DESC = 'Süddeutscher Rundfunk'
+    IE_DESC = 'Saarländischer Rundfunk'
    _VALID_URL = r'https?://sr-mediathek\.sr-online\.de/index\.php\?.*?&id=(?P<id>[0-9]+)'

    _TEST = {
@@ -10,19 +10,23 @@ class TeamcocoIE(InfoExtractor):
    _TESTS = [
        {
            'url': 'http://teamcoco.com/video/80187/conan-becomes-a-mary-kay-beauty-consultant',
-            'file': '80187.mp4',
            'md5': '3f7746aa0dc86de18df7539903d399ea',
            'info_dict': {
+                'id': '80187',
+                'ext': 'mp4',
                'title': 'Conan Becomes A Mary Kay Beauty Consultant',
-                'description': 'Mary Kay is perhaps the most trusted name in female beauty, so of course Conan is a natural choice to sell their products.'
+                'description': 'Mary Kay is perhaps the most trusted name in female beauty, so of course Conan is a natural choice to sell their products.',
+                'age_limit': 0,
            }
        }, {
            'url': 'http://teamcoco.com/video/louis-ck-interview-george-w-bush',
-            'file': '19705.mp4',
            'md5': 'cde9ba0fa3506f5f017ce11ead928f9a',
            'info_dict': {
+                'id': '19705',
+                'ext': 'mp4',
                "description": "Louis C.K. got starstruck by George W. Bush, so what? Part one.",
-                "title": "Louis C.K. Interview Pt. 1 11/3/11"
+                "title": "Louis C.K. Interview Pt. 1 11/3/11",
+                'age_limit': 0,
            }
        }
    ]
@@ -36,7 +40,7 @@ class TeamcocoIE(InfoExtractor):
        video_id = mobj.group("video_id")
        if not video_id:
            video_id = self._html_search_regex(
-                r'data-node-id="(\d+?)"',
+                r'<div\s+class="player".*?data-id="(\d+?)"',
                webpage, 'video id')

        data_url = 'http://teamcoco.com/cvp/2.0/%s.xml' % video_id
@@ -81,4 +85,5 @@ class TeamcocoIE(InfoExtractor):
            'title': self._og_search_title(webpage),
            'thumbnail': self._og_search_thumbnail(webpage),
            'description': self._og_search_description(webpage),
+            'age_limit': self._family_friendly_search(webpage),
        }
@@ -11,6 +11,7 @@ class TeleTaskIE(InfoExtractor):
    _TEST = {
        'url': 'http://www.tele-task.de/archive/video/html5/26168/',
        'info_dict': {
+            'id': '26168',
            'title': 'Duplicate Detection',
        },
        'playlist': [{
@@ -34,7 +35,6 @@ class TeleTaskIE(InfoExtractor):

    def _real_extract(self, url):
        lecture_id = self._match_id(url)
-
        webpage = self._download_webpage(url, lecture_id)

        title = self._html_search_regex(
@@ -16,8 +16,9 @@ class TouTvIE(InfoExtractor):

    _TEST = {
        'url': 'http://www.tou.tv/30-vies/S04E41',
-        'file': '30-vies_S04E41.mp4',
        'info_dict': {
+            'id': '30-vies_S04E41',
+            'ext': 'mp4',
            'title': '30 vies Saison 4 / Épisode 41',
            'description': 'md5:da363002db82ccbe4dafeb9cab039b09',
            'age_limit': 8,
@@ -1,6 +1,8 @@
 # encoding: utf-8
 from __future__ import unicode_literals

+import re
+
 from .common import InfoExtractor
 from ..utils import (
    float_or_none,
@@ -11,7 +13,7 @@ from ..utils import (
 class TvigleIE(InfoExtractor):
    IE_NAME = 'tvigle'
    IE_DESC = 'Интернет-телевидение Tvigle.ru'
-    _VALID_URL = r'http://(?:www\.)?tvigle\.ru/(?:[^/]+/)+(?P<id>[^/]+)/$'
+    _VALID_URL = r'https?://(?:www\.)?(?:tvigle\.ru/(?:[^/]+/)+(?P<display_id>[^/]+)/$|cloud\.tvigle\.ru/video/(?P<id>\d+))'

    _TESTS = [
        {
@@ -38,16 +40,22 @@ class TvigleIE(InfoExtractor):
                'duration': 186.080,
                'age_limit': 0,
            },
-        },
+        }, {
+            'url': 'https://cloud.tvigle.ru/video/5267604/',
+            'only_matching': True,
+        }
    ]

    def _real_extract(self, url):
-        display_id = self._match_id(url)
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+        display_id = mobj.group('display_id')

-        webpage = self._download_webpage(url, display_id)
-
-        video_id = self._html_search_regex(
-            r'<li class="video-preview current_playing" id="(\d+)">', webpage, 'video id')
+        if not video_id:
+            webpage = self._download_webpage(url, display_id)
+            video_id = self._html_search_regex(
+                r'<li class="video-preview current_playing" id="(\d+)">',
+                webpage, 'video id')

        video_data = self._download_json(
            'http://cloud.tvigle.ru/api/play/video/%s/' % video_id, display_id)
@@ -0,0 +1,65 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    xpath_text,
+    xpath_with_ns,
+    int_or_none,
+    float_or_none,
+)
+
+
+class TweakersIE(InfoExtractor):
+    _VALID_URL = r'https?://tweakers\.net/video/(?P<id>\d+)'
+    _TEST = {
+        'url': 'https://tweakers.net/video/9926/new-nintendo-3ds-xl-op-alle-fronten-beter.html',
+        'md5': '1b5afa817403bb5baa08359dca31e6df',
+        'info_dict': {
+            'id': '9926',
+            'ext': 'mp4',
+            'title': 'New Nintendo 3DS XL - Op alle fronten beter',
+            'description': 'md5:f97324cc71e86e11c853f0763820e3ba',
+            'thumbnail': 're:^https?://.*\.jpe?g$',
+            'duration': 386,
+        }
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        playlist = self._download_xml(
+            'https://tweakers.net/video/s1playlist/%s/playlist.xspf' % video_id,
+            video_id)
+
+        NS_MAP = {
+            'xspf': 'http://xspf.org/ns/0/',
+            's1': 'http://static.streamone.nl/player/ns/0',
+        }
+
+        track = playlist.find(xpath_with_ns('./xspf:trackList/xspf:track', NS_MAP))
+
+        title = xpath_text(
+            track, xpath_with_ns('./xspf:title', NS_MAP), 'title')
+        description = xpath_text(
+            track, xpath_with_ns('./xspf:annotation', NS_MAP), 'description')
+        thumbnail = xpath_text(
+            track, xpath_with_ns('./xspf:image', NS_MAP), 'thumbnail')
+        duration = float_or_none(
+            xpath_text(track, xpath_with_ns('./xspf:duration', NS_MAP), 'duration'),
+            1000)
+
+        formats = [{
+            'url': location.text,
+            'format_id': location.get(xpath_with_ns('s1:label', NS_MAP)),
+            'width': int_or_none(location.get(xpath_with_ns('s1:width', NS_MAP))),
+            'height': int_or_none(location.get(xpath_with_ns('s1:height', NS_MAP))),
+        } for location in track.findall(xpath_with_ns('./xspf:location', NS_MAP))]
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'duration': duration,
+            'formats': formats,
+        }
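
The namespace handling in the new Tweakers extractor can be tried in isolation. The following is a minimal sketch, not the extractor itself: `expand_ns` is a hypothetical stand-in for what `xpath_with_ns` is used for above, and the sample playlist and its `example.com` URLs are made up for illustration. Only the two namespace URIs are taken from the diff.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from NS_MAP in the hunk above.
NS_MAP = {
    'xspf': 'http://xspf.org/ns/0/',
    's1': 'http://static.streamone.nl/player/ns/0',
}


def expand_ns(path, ns_map):
    """Hypothetical helper: rewrite 'prefix:tag' components into
    ElementTree's '{uri}tag' notation."""
    parts = []
    for component in path.split('/'):
        if ':' in component:
            prefix, tag = component.split(':', 1)
            parts.append('{%s}%s' % (ns_map[prefix], tag))
        else:
            parts.append(component)
    return '/'.join(parts)


# Made-up document shaped like the playlist the extractor expects.
SAMPLE = '''<playlist xmlns="http://xspf.org/ns/0/"
          xmlns:s1="http://static.streamone.nl/player/ns/0">
  <trackList>
    <track>
      <title>Example title</title>
      <duration>386000</duration>
      <location s1:label="sd" s1:width="640" s1:height="360">http://example.com/sd.mp4</location>
      <location s1:label="hd" s1:width="1280" s1:height="720">http://example.com/hd.mp4</location>
    </track>
  </trackList>
</playlist>'''

playlist = ET.fromstring(SAMPLE)
track = playlist.find(expand_ns('./xspf:trackList/xspf:track', NS_MAP))
title = track.find(expand_ns('./xspf:title', NS_MAP)).text
# The duration is given in milliseconds, hence the division by 1000.
duration = float(track.find(expand_ns('./xspf:duration', NS_MAP)).text) / 1000
formats = [{
    'url': location.text,
    # Namespaced attributes are looked up with the same '{uri}name' form.
    'format_id': location.get(expand_ns('s1:label', NS_MAP)),
    'width': int(location.get(expand_ns('s1:width', NS_MAP))),
    'height': int(location.get(expand_ns('s1:height', NS_MAP))),
} for location in track.findall(expand_ns('./xspf:location', NS_MAP))]
```

The path is split on `/` before the prefixes are expanded, so the slashes inside the namespace URIs do not interfere with the path components.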
@@ -9,6 +9,7 @@ from ..compat import (
 )
 from ..utils import (
    ExtractorError,
+    int_or_none,
 )


@@ -192,9 +193,29 @@ class VevoIE(InfoExtractor):
        # Download via HLS API
        formats.extend(self._download_api_formats(video_id))

+        # Download SMIL
+        smil_blocks = sorted((
+            f for f in video_info['videoVersions']
+            if f['sourceType'] == 13),
+            key=lambda f: f['version'])
+        smil_url = '%s/Video/V2/VFILE/%s/%sr.smil' % (
+            self._SMIL_BASE_URL, video_id, video_id.lower())
+        if smil_blocks:
+            smil_url_m = self._search_regex(
+                r'url="([^"]+)"', smil_blocks[-1]['data'], 'SMIL URL',
+                default=None)
+            if smil_url_m is not None:
+                smil_url = smil_url_m
+        if smil_url:
+            smil_xml = self._download_webpage(
+                smil_url, video_id, 'Downloading SMIL info', fatal=False)
+            if smil_xml:
+                formats.extend(self._formats_from_smil(smil_xml))
+
        self._sort_formats(formats)
-        timestamp_ms = int(self._search_regex(
-            r'/Date\((\d+)\)/', video_info['launchDate'], 'launch date'))
+        timestamp_ms = int_or_none(self._search_regex(
+            r'/Date\((\d+)\)/',
+            video_info['launchDate'], 'launch date', fatal=False))

        return {
            'id': video_id,
@@ -5,27 +5,58 @@ from ..utils import (
    float_or_none,
    int_or_none,
 )
+from ..compat import (
+    compat_urllib_request
+)


 class ViddlerIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?viddler\.com/(?:v|embed|player)/(?P<id>[a-z0-9]+)'
-    _TEST = {
-        "url": "http://www.viddler.com/v/43903784",
+    _TESTS = [{
+        'url': 'http://www.viddler.com/v/43903784',
        'md5': 'ae43ad7cb59431ce043f0ff7fa13cbf4',
        'info_dict': {
            'id': '43903784',
            'ext': 'mp4',
-            "title": "Video Made Easy",
-            'description': 'You don\'t need to be a professional to make high-quality video content. Viddler provides some quick and easy tips on how to produce great video content with limited resources. ',
-            "uploader": "viddler",
+            'title': 'Video Made Easy',
+            'description': 'md5:6a697ebd844ff3093bd2e82c37b409cd',
+            'uploader': 'viddler',
            'timestamp': 1335371429,
            'upload_date': '20120425',
-            "duration": 100.89,
+            'duration': 100.89,
            'thumbnail': 're:^https?://.*\.jpg$',
            'view_count': int,
+            'comment_count': int,
            'categories': ['video content', 'high quality video', 'video made easy', 'how to produce video with limited resources', 'viddler'],
        }
-    }
+    }, {
+        'url': 'http://www.viddler.com/v/4d03aad9/',
+        'md5': 'faa71fbf70c0bee7ab93076fd007f4b0',
+        'info_dict': {
+            'id': '4d03aad9',
+            'ext': 'mp4',
+            'title': 'WALL-TO-GORTAT',
+            'upload_date': '20150126',
+            'uploader': 'deadspin',
+            'timestamp': 1422285291,
+            'view_count': int,
+            'comment_count': int,
+        }
+    }, {
+        'url': 'http://www.viddler.com/player/221ebbbd/0/',
+        'md5': '0defa2bd0ea613d14a6e9bd1db6be326',
+        'info_dict': {
+            'id': '221ebbbd',
+            'ext': 'mp4',
+            'title': 'LETeens-Grammar-snack-third-conditional',
+            'description': ' ',
+            'upload_date': '20140929',
+            'uploader': 'BCLETeens',
+            'timestamp': 1411997190,
+            'view_count': int,
+            'comment_count': int,
+        }
+    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
@@ -33,14 +64,17 @@ class ViddlerIE(InfoExtractor):
        json_url = (
            'http://api.viddler.com/api/v2/viddler.videos.getPlaybackDetails.json?video_id=%s&key=v0vhrt7bg2xq1vyxhkct' %
            video_id)
-        data = self._download_json(json_url, video_id)['video']
+        headers = {'Referer': 'http://static.cdn-ec.viddler.com/js/arpeggio/v2/embed.html'}
+        request = compat_urllib_request.Request(json_url, None, headers)
+        data = self._download_json(request, video_id)['video']

        formats = []
        for filed in data['files']:
            if filed.get('status', 'ready') != 'ready':
                continue
+            format_id = filed.get('profile_id') or filed['profile_name']
            f = {
-                'format_id': filed['profile_id'],
+                'format_id': format_id,
                'format_note': filed['profile_name'],
                'url': self._proto_relative_url(filed['url']),
                'width': int_or_none(filed.get('width')),
@@ -53,16 +87,15 @@ class ViddlerIE(InfoExtractor):

            if filed.get('cdn_url'):
                f = f.copy()
-                f['url'] = self._proto_relative_url(filed['cdn_url'])
-                f['format_id'] = filed['profile_id'] + '-cdn'
+                f['url'] = self._proto_relative_url(filed['cdn_url'], 'http:')
+                f['format_id'] = format_id + '-cdn'
                f['source_preference'] = 1
                formats.append(f)

            if filed.get('html5_video_source'):
                f = f.copy()
-                f['url'] = self._proto_relative_url(
-                    filed['html5_video_source'])
-                f['format_id'] = filed['profile_id'] + '-html5'
+                f['url'] = self._proto_relative_url(filed['html5_video_source'])
+                f['format_id'] = format_id + '-html5'
                f['source_preference'] = 0
                formats.append(f)
        self._sort_formats(formats)
@@ -71,7 +104,6 @@ class ViddlerIE(InfoExtractor):
            t.get('text') for t in data.get('tags', []) if 'text' in t]

        return {
-            '_type': 'video',
            'id': video_id,
            'title': data['title'],
            'formats': formats,
@@ -81,5 +113,6 @@ class ViddlerIE(InfoExtractor):
            'uploader': data.get('author'),
            'duration': float_or_none(data.get('length')),
            'view_count': int_or_none(data.get('view_count')),
+            'comment_count': int_or_none(data.get('comment_count')),
            'categories': categories,
        }
@@ -501,9 +501,10 @@ class VimeoReviewIE(InfoExtractor):
    _VALID_URL = r'https?://vimeo\.com/[^/]+/review/(?P<id>[^/]+)'
    _TESTS = [{
        'url': 'https://vimeo.com/user21297594/review/75524534/3c257a1b5d',
-        'file': '75524534.mp4',
        'md5': 'c507a72f780cacc12b2248bb4006d253',
        'info_dict': {
+            'id': '75524534',
+            'ext': 'mp4',
            'title': "DICK HARDWICK 'Comedian'",
            'uploader': 'Richard Hardwick',
        }
@@ -1,3 +1,4 @@
+# coding: utf-8
 from __future__ import unicode_literals

 import re
@@ -11,9 +12,10 @@ from ..utils import (

 class WashingtonPostIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?washingtonpost\.com/.*?/(?P<id>[^/]+)/(?:$|[?#])'
-    _TEST = {
+    _TESTS = [{
        'url': 'http://www.washingtonpost.com/sf/national/2014/03/22/sinkhole-of-bureaucracy/',
        'info_dict': {
+            'id': 'sinkhole-of-bureaucracy',
            'title': 'Sinkhole of bureaucracy',
        },
        'playlist': [{
@@ -40,15 +42,38 @@ class WashingtonPostIE(InfoExtractor):
                'upload_date': '20140322',
                'uploader': 'The Washington Post',
            },
+        }],
+    }, {
+        'url': 'http://www.washingtonpost.com/blogs/wonkblog/wp/2014/12/31/one-airline-figured-out-how-to-make-sure-its-airplanes-never-disappear/',
+        'info_dict': {
+            'id': 'one-airline-figured-out-how-to-make-sure-its-airplanes-never-disappear',
+            'title': 'One airline figured out how to make sure its airplanes never disappear',
+        },
+        'playlist': [{
+            'md5': 'a7c1b5634ba5e57a6a82cdffa5b1e0d0',
+            'info_dict': {
+                'id': '0e4bb54c-9065-11e4-a66f-0ca5037a597d',
+                'ext': 'mp4',
+                'description': 'Washington Post transportation reporter Ashley Halsey III explains why a plane\'s black box needs to be recovered from a crash site instead of having its information streamed in real time throughout the flight.',
+                'upload_date': '20141230',
+                'uploader': 'The Washington Post',
+                'timestamp': 1419974765,
+                'title': 'Why black boxes don’t transmit data in real time',
+            }
        }]
-    }
+    }]

    def _real_extract(self, url):
        page_id = self._match_id(url)
        webpage = self._download_webpage(url, page_id)

        title = self._og_search_title(webpage)
-        uuids = re.findall(r'data-video-uuid="([^"]+)"', webpage)
+
+        uuids = re.findall(r'''(?x)
+            (?:
+                <div\s+class="posttv-video-embed[^>]*?data-uuid=|
+                data-video-uuid=
+            )"([^"]+)"''', webpage)
        entries = []
        for i, uuid in enumerate(uuids, start=1):
            vinfo_all = self._download_json(
@@ -75,10 +100,11 @@ class WashingtonPostIE(InfoExtractor):
                'filesize': s.get('fileSize'),
                'url': s.get('url'),
                'ext': 'mp4',
+                'preference': -100 if s.get('type') == 'smil' else None,
                'protocol': {
                    'MP4': 'http',
                    'F4F': 'f4m',
-                }.get(s.get('type'))
+                }.get(s.get('type')),
            } for s in vinfo.get('streams', [])]
            source_media_url = vinfo.get('sourceMediaURL')
            if source_media_url:
@@ -71,6 +71,9 @@ class WDRIE(InfoExtractor):
        {
            'url': 'http://www1.wdr.de/mediathek/video/sendungen/quarks_und_co/filterseite-quarks-und-co100.html',
            'playlist_mincount': 146,
+            'info_dict': {
+                'id': 'mediathek/video/sendungen/quarks_und_co/filterseite-quarks-und-co100',
+            }
        }
    ]

@@ -0,0 +1,89 @@
+# encoding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    unified_strdate,
+)
+
+
+class WSJIE(InfoExtractor):
+    _VALID_URL = r'https?://video-api\.wsj\.com/api-video/player/iframe\.html\?guid=(?P<id>[a-zA-Z0-9-]+)'
+    IE_DESC = 'Wall Street Journal'
+    _TEST = {
+        'url': 'http://video-api.wsj.com/api-video/player/iframe.html?guid=1BD01A4C-BFE8-40A5-A42F-8A8AF9898B1A',
+        'md5': '9747d7a6ebc2f4df64b981e1dde9efa9',
+        'info_dict': {
+            'id': '1BD01A4C-BFE8-40A5-A42F-8A8AF9898B1A',
+            'ext': 'mp4',
+            'upload_date': '20150202',
+            'uploader_id': 'bbright',
+            'creator': 'bbright',
+            'categories': list,  # a long list
+            'duration': 90,
+            'title': 'Bills Coach Rex Ryan Updates His Old Jets Tattoo',
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        bitrates = [128, 174, 264, 320, 464, 664, 1264]
+        api_url = (
+            'http://video-api.wsj.com/api-video/find_all_videos.asp?'
+            'type=guid&count=1&query=%s&'
+            'fields=hls,adZone,thumbnailList,guid,state,secondsUntilStartTime,'
+            'author,description,name,linkURL,videoStillURL,duration,videoURL,'
+            'adCategory,catastrophic,linkShortURL,doctypeID,youtubeID,'
+            'titletag,rssURL,wsj-section,wsj-subsection,allthingsd-section,'
+            'allthingsd-subsection,sm-section,sm-subsection,provider,'
+            'formattedCreationDate,keywords,keywordsOmniture,column,editor,'
+            'emailURL,emailPartnerID,showName,omnitureProgramName,'
+            'omnitureVideoFormat,linkRelativeURL,touchCastID,'
+            'omniturePublishDate,%s') % (
+                video_id, ','.join('video%dkMP4Url' % br for br in bitrates))
+        info = self._download_json(api_url, video_id)['items'][0]
+
+        # Thumbnails are conveniently in the correct format already
+        thumbnails = info.get('thumbnailList')
+        creator = info.get('author')
+        uploader_id = info.get('editor')
+        categories = info.get('keywords')
+        duration = int_or_none(info.get('duration'))
+        upload_date = unified_strdate(
+            info.get('formattedCreationDate'), day_first=False)
+        title = info.get('name', info.get('titletag'))
+
+        formats = [{
+            'format_id': 'f4m',
+            'format_note': 'f4m (meta URL)',
+            'url': info['videoURL'],
+        }]
+        if info.get('hls'):
+            formats.extend(self._extract_m3u8_formats(
+                info['hls'], video_id, ext='mp4',
+                preference=0, entry_protocol='m3u8_native'))
+        for br in bitrates:
+            field = 'video%dkMP4Url' % br
+            if info.get(field):
+                formats.append({
+                    'format_id': 'mp4-%d' % br,
+                    'container': 'mp4',
+                    'tbr': br,
+                    'url': info[field],
+                })
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'formats': formats,
+            'thumbnails': thumbnails,
+            'creator': creator,
+            'uploader_id': uploader_id,
+            'duration': duration,
+            'upload_date': upload_date,
+            'title': title,
+            'categories': categories,
+        }
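
The per-bitrate handling in the WSJ extractor boils down to generating one field name per bitrate and keeping only the fields the API actually filled in. A small sketch under that reading follows; `bitrate_fields` and `mp4_formats` are hypothetical helper names, and the sample `info` dictionary is invented rather than a real API response.

```python
# Bitrate list copied from the hunk above.
BITRATES = [128, 174, 264, 320, 464, 664, 1264]


def bitrate_fields(bitrates):
    """Field names requested from the API, one per bitrate."""
    return ['video%dkMP4Url' % br for br in bitrates]


def mp4_formats(info, bitrates):
    """Build one format entry per bitrate field that is present and non-empty."""
    formats = []
    for br in bitrates:
        url = info.get('video%dkMP4Url' % br)
        if url:
            formats.append({
                'format_id': 'mp4-%d' % br,
                'container': 'mp4',
                'tbr': br,
                'url': url,
            })
    return formats


# Invented sample: only two of the seven bitrates are available.
sample_info = {
    'video264kMP4Url': 'http://example.com/264.mp4',
    'video1264kMP4Url': 'http://example.com/1264.mp4',
    'video464kMP4Url': '',
}
fields = ','.join(bitrate_fields(BITRATES))
formats = mp4_formats(sample_info, BITRATES)
```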
@@ -0,0 +1,142 @@
+# -*- coding: utf-8 -*-
+from __future__ import unicode_literals
+
+import base64
+
+from .common import InfoExtractor
+from ..compat import compat_urllib_parse_unquote
+from ..utils import (
+    ExtractorError,
+    parse_iso8601,
+    parse_duration,
+)
+
+
+class XuiteIE(InfoExtractor):
+    _REGEX_BASE64 = r'(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?'
+    _VALID_URL = r'https?://vlog\.xuite\.net/(?:play|embed)/(?P<id>%s)' % _REGEX_BASE64
+    _TESTS = [{
+        # Audio
+        'url': 'http://vlog.xuite.net/play/RGkzc1ZULTM4NjA5MTQuZmx2',
+        'md5': '63a42c705772aa53fd4c1a0027f86adf',
+        'info_dict': {
+            'id': '3860914',
+            'ext': 'mp3',
+            'title': '孤單南半球-歐德陽',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'duration': 247.246,
+            'timestamp': 1314932940,
+            'upload_date': '20110902',
+            'uploader': '阿能',
+            'uploader_id': '15973816',
+            'categories': ['個人短片'],
+        },
+    }, {
+        # Video with only one format
+        'url': 'http://vlog.xuite.net/play/TkRZNjhULTM0NDE2MjkuZmx2',
+        'md5': 'c45737fc8ac5dc8ac2f92ecbcecf505e',
+        'info_dict': {
+            'id': '3441629',
+            'ext': 'mp4',
+            'title': '孫燕姿 - 眼淚成詩',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'duration': 217.399,
+            'timestamp': 1299383640,
+            'upload_date': '20110306',
+            'uploader': 'Valen',
+            'uploader_id': '10400126',
+            'categories': ['影視娛樂'],
+        },
+    }, {
+        # Video with two formats
+        'url': 'http://vlog.xuite.net/play/bWo1N1pLLTIxMzAxMTcwLmZsdg==',
+        'md5': '1166e0f461efe55b62e26a2d2a68e6de',
+        'info_dict': {
+            'id': '21301170',
+            'ext': 'mp4',
+            'title': '暗殺教室 02',
+            'description': '字幕:【極影字幕社】',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'duration': 1384.907,
+            'timestamp': 1421481240,
+            'upload_date': '20150117',
+            'uploader': '我只是想認真點',
+            'uploader_id': '242127761',
+            'categories': ['電玩動漫'],
+        },
+    }, {
+        'url': 'http://vlog.xuite.net/play/S1dDUjdyLTMyOTc3NjcuZmx2/%E5%AD%AB%E7%87%95%E5%A7%BF-%E7%9C%BC%E6%B7%9A%E6%88%90%E8%A9%A9',
+        'only_matching': True,
+    }]
+
+    def _extract_flv_config(self, media_id):
+        base64_media_id = base64.b64encode(media_id.encode('utf-8')).decode('utf-8')
+        flv_config = self._download_xml(
+            'http://vlog.xuite.net/flash/player?media=%s' % base64_media_id,
+            'flv config')
+        prop_dict = {}
+        for prop in flv_config.findall('./property'):
+            prop_id = base64.b64decode(prop.attrib['id']).decode('utf-8')
+            # CDATA may be empty in flv config
+            if not prop.text:
+                continue
+            encoded_content = base64.b64decode(prop.text).decode('utf-8')
+            prop_dict[prop_id] = compat_urllib_parse_unquote(encoded_content)
+        return prop_dict
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        error_msg = self._search_regex(
+            r'<div id="error-message-content">([^<]+)',
+            webpage, 'error message', default=None)
+        if error_msg:
+            raise ExtractorError(
+                '%s returned error: %s' % (self.IE_NAME, error_msg),
+                expected=True)
+
+        video_id = self._html_search_regex(
+            r'data-mediaid="(\d+)"', webpage, 'media id')
+        flv_config = self._extract_flv_config(video_id)
+
+        FORMATS = {
+            'audio': 'mp3',
+            'video': 'mp4',
+        }
+
+        formats = []
+        for format_tag in ('src', 'hq_src'):
+            video_url = flv_config.get(format_tag)
+            if not video_url:
+                continue
+            format_id = self._search_regex(
+                r'\bq=(.+?)\b', video_url, 'format id', default=format_tag)
+            formats.append({
+                'url': video_url,
+                'ext': FORMATS.get(flv_config['type'], 'mp4'),
+                'format_id': format_id,
+                'height': int(format_id) if format_id.isnumeric() else None,
+            })
+        self._sort_formats(formats)
+
+        timestamp = flv_config.get('publish_datetime')
+        if timestamp:
+            timestamp = parse_iso8601(timestamp + ' +0800', ' ')
+
+        category = flv_config.get('category')
+        categories = [category] if category else []
+
+        return {
+            'id': video_id,
+            'title': flv_config['title'],
+            'description': flv_config.get('description'),
+            'thumbnail': flv_config.get('thumb'),
+            'timestamp': timestamp,
+            'uploader': flv_config.get('author_name'),
+            'uploader_id': flv_config.get('author_id'),
+            'duration': parse_duration(flv_config.get('duration')),
+            'categories': categories,
+            'formats': formats,
+        }
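
The decoding done by `_extract_flv_config` can be illustrated without network access. This is a sketch assuming the layout the code above implies: each `property` element has a base64-encoded `id` and a base64-encoded, percent-encoded text body. The `build_sample_config` helper and the `config` root element name are made up for the example, and the code targets Python 3 only.

```python
import base64
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote


def b64(text):
    """Base64-encode a text string and return the result as text."""
    return base64.b64encode(text.encode('utf-8')).decode('ascii')


def build_sample_config(props):
    """Hypothetical helper: build an XML tree shaped like the flv config."""
    root = ET.Element('config')
    for key, value in props.items():
        prop = ET.SubElement(root, 'property', {'id': b64(key)})
        if value is not None:
            prop.text = b64(quote(value))
    return root


def decode_config(config):
    """Mirror of the decoding loop: skip empty properties, then
    base64-decode and percent-decode the rest."""
    result = {}
    for prop in config.findall('./property'):
        if not prop.text:
            continue
        key = base64.b64decode(prop.attrib['id']).decode('utf-8')
        value = base64.b64decode(prop.text).decode('utf-8')
        result[key] = unquote(value)
    return result


sample = build_sample_config({
    'title': '暗殺教室 02',
    'type': 'video',
    'description': None,  # empty property, dropped by decode_config
})
decoded = decode_config(sample)
```

Skipping empty properties is what lets the extractor use `flv_config.get(...)` for optional fields such as `description`.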
@@ -809,6 +809,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor, SubtitlesInfoExtractor):
            player_url = None

        # Get video info
+        embed_webpage = None
        if re.search(r'player-age-gate-content">', video_webpage) is not None:
            age_gate = True
            # We simulate the access to the video from www.youtube.com/v/{video_id}
@@ -1016,10 +1017,21 @@ class YoutubeIE(YoutubeBaseInfoExtractor, SubtitlesInfoExtractor):
                    url += '&signature=' + url_data['sig'][0]
                elif 's' in url_data:
                    encrypted_sig = url_data['s'][0]
+                    ASSETS_RE = r'"assets":.+?"js":\s*("[^"]+")'

                    jsplayer_url_json = self._search_regex(
-                        r'"assets":.+?"js":\s*("[^"]+")',
-                        embed_webpage if age_gate else video_webpage, 'JS player URL')
+                        ASSETS_RE,
+                        embed_webpage if age_gate else video_webpage,
+                        'JS player URL (1)', default=None)
+                    if not jsplayer_url_json and not age_gate:
+                        # We need the embed website after all
+                        if embed_webpage is None:
+                            embed_url = proto + '://www.youtube.com/embed/%s' % video_id
+                            embed_webpage = self._download_webpage(
+                                embed_url, video_id, 'Downloading embed webpage')
+                        jsplayer_url_json = self._search_regex(
+                            ASSETS_RE, embed_webpage, 'JS player URL')
+
                    player_url = json.loads(jsplayer_url_json)
                    if player_url is None:
                        player_url_json = self._search_regex(
@@ -1148,6 +1160,7 @@ class YoutubePlaylistIE(YoutubeBaseInfoExtractor):
    }, {
        'url': 'https://www.youtube.com/playlist?list=PLtPgu7CB4gbZDA7i_euNxn75ISqxwZPYx',
        'info_dict': {
+            'id': 'PLtPgu7CB4gbZDA7i_euNxn75ISqxwZPYx',
            'title': 'YDL_Empty_List',
        },
        'playlist_count': 0,
@@ -1156,6 +1169,7 @@ class YoutubePlaylistIE(YoutubeBaseInfoExtractor):
        'url': 'https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC',
        'info_dict': {
            'title': '29C3: Not my department',
+            'id': 'PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC',
        },
        'playlist_count': 95,
    }, {
@@ -1163,6 +1177,7 @@ class YoutubePlaylistIE(YoutubeBaseInfoExtractor):
        'url': 'PLBB231211A4F62143',
        'info_dict': {
            'title': '[OLD]Team Fortress 2 (Class-based LP)',
+            'id': 'PLBB231211A4F62143',
        },
        'playlist_mincount': 26,
    }, {
@@ -1170,12 +1185,14 @@ class YoutubePlaylistIE(YoutubeBaseInfoExtractor):
        'url': 'https://www.youtube.com/playlist?list=UUBABnxM4Ar9ten8Mdjj1j0Q',
        'info_dict': {
            'title': 'Uploads from Cauchemar',
+            'id': 'UUBABnxM4Ar9ten8Mdjj1j0Q',
        },
        'playlist_mincount': 799,
    }, {
        'url': 'PLtPgu7CB4gbY9oDN3drwC3cMbJggS7dKl',
        'info_dict': {
            'title': 'YDL_safe_search',
+            'id': 'PLtPgu7CB4gbY9oDN3drwC3cMbJggS7dKl',
        },
        'playlist_count': 2,
    }, {
@@ -1184,6 +1201,7 @@ class YoutubePlaylistIE(YoutubeBaseInfoExtractor):
        'playlist_count': 4,
        'info_dict': {
            'title': 'JODA15',
+            'id': 'PL6IaIsEjSbf96XFRuNccS_RuEXwNdsoEu',
        }
    }, {
        'note': 'Embedded SWF player',
@@ -1191,12 +1209,14 @@ class YoutubePlaylistIE(YoutubeBaseInfoExtractor):
        'playlist_count': 4,
        'info_dict': {
            'title': 'JODA7',
+            'id': 'YN5VISEtHet5D4NEvfTd0zcgFk84NqFZ',
        }
    }, {
        'note': 'Buggy playlist: the webpage has a "Load more" button but it doesn\'t have more videos',
        'url': 'https://www.youtube.com/playlist?list=UUXw-G3eDE9trcvY2sBMM_aA',
        'info_dict': {
-                'title': 'Uploads from Interstellar Movie',
+            'title': 'Uploads from Interstellar Movie',
+            'id': 'UUXw-G3eDE9trcvY2sBMM_aA',
        },
        'playlist_mincount': 21,
    }]
@@ -1302,6 +1322,9 @@ class YoutubeChannelIE(InfoExtractor):
        'note': 'paginated channel',
        'url': 'https://www.youtube.com/channel/UCKfVa3S1e4PHvxWcwyMMg8w',
        'playlist_mincount': 91,
+        'info_dict': {
+            'id': 'UCKfVa3S1e4PHvxWcwyMMg8w',
+        }
    }]

    def extract_videos_from_page(self, page):
@@ -1688,6 +1711,7 @@ class YoutubeTruncatedURLIE(InfoExtractor):
            feature=[a-z_]+|
            annotation_id=annotation_[^&]+|
            x-yt-cl=[0-9]+|
+            hl=[^&]*|
        )?
        |
            attribution_link\?a=[^&]+
@@ -1707,6 +1731,9 @@ class YoutubeTruncatedURLIE(InfoExtractor):
    }, {
        'url': 'https://www.youtube.com/watch?feature=foo',
        'only_matching': True,
+    }, {
+        'url': 'https://www.youtube.com/watch?hl=en-GB',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
@@ -1,59 +1,122 @@
 from __future__ import unicode_literals

 import json
+import operator
 import re

 from .utils import (
    ExtractorError,
 )

+_OPERATORS = [
+    ('|', operator.or_),
+    ('^', operator.xor),
+    ('&', operator.and_),
+    ('>>', operator.rshift),
+    ('<<', operator.lshift),
+    ('-', operator.sub),
+    ('+', operator.add),
+    ('%', operator.mod),
+    ('/', operator.truediv),
+    ('*', operator.mul),
+]
+_ASSIGN_OPERATORS = [(op + '=', opfunc) for op, opfunc in _OPERATORS]
+_ASSIGN_OPERATORS.append(('=', lambda cur, right: right))
+
+_NAME_RE = r'[a-zA-Z_$][a-zA-Z_$0-9]*'
+

 class JSInterpreter(object):
-    def __init__(self, code):
-        self.code = code
+    def __init__(self, code, objects=None):
+        if objects is None:
+            objects = {}
+        self.code = self._remove_comments(code)
        self._functions = {}
-        self._objects = {}
+        self._objects = objects

-    def interpret_statement(self, stmt, local_vars, allow_recursion=20):
+    def _remove_comments(self, code):
+        return re.sub(r'(?s)/\*.*?\*/', '', code)
+
+    def interpret_statement(self, stmt, local_vars, allow_recursion=100):
        if allow_recursion < 0:
            raise ExtractorError('Recursion limit reached')

-        if stmt.startswith('var '):
-            stmt = stmt[len('var '):]
-        ass_m = re.match(r'^(?P<out>[a-z]+)(?:\[(?P<index>[^\]]+)\])?' +
-                         r'=(?P<expr>.*)$', stmt)
-        if ass_m:
-            if ass_m.groupdict().get('index'):
-                def assign(val):
-                    lvar = local_vars[ass_m.group('out')]
-                    idx = self.interpret_expression(
-                        ass_m.group('index'), local_vars, allow_recursion)
-                    assert isinstance(idx, int)
-                    lvar[idx] = val
-                    return val
-                expr = ass_m.group('expr')
-            else:
-                def assign(val):
-                    local_vars[ass_m.group('out')] = val
-                    return val
-                expr = ass_m.group('expr')
-        elif stmt.startswith('return '):
-            assign = lambda v: v
-            expr = stmt[len('return '):]
+        should_abort = False
+        stmt = stmt.lstrip()
+        stmt_m = re.match(r'var\s', stmt)
+        if stmt_m:
+            expr = stmt[len(stmt_m.group(0)):]
        else:
-            # Try interpreting it as an expression
-            expr = stmt
-            assign = lambda v: v
+            return_m = re.match(r'return(?:\s+|$)', stmt)
+            if return_m:
+                expr = stmt[len(return_m.group(0)):]
+                should_abort = True
+            else:
+                # Try interpreting it as an expression
+                expr = stmt

        v = self.interpret_expression(expr, local_vars, allow_recursion)
-        return assign(v)
+        return v, should_abort

    def interpret_expression(self, expr, local_vars, allow_recursion):
+        expr = expr.strip()
+
+        if expr == '':  # Empty expression
+            return None
+
+        if expr.startswith('('):
+            parens_count = 0
+            for m in re.finditer(r'[()]', expr):
+                if m.group(0) == '(':
+                    parens_count += 1
+                else:
+                    parens_count -= 1
+                    if parens_count == 0:
+                        sub_expr = expr[1:m.start()]
+                        sub_result = self.interpret_expression(
+                            sub_expr, local_vars, allow_recursion)
+                        remaining_expr = expr[m.end():].strip()
+                        if not remaining_expr:
+                            return sub_result
+                        else:
+                            expr = json.dumps(sub_result) + remaining_expr
+                        break
+            else:
+                raise ExtractorError('Premature end of parens in %r' % expr)
+
+        for op, opfunc in _ASSIGN_OPERATORS:
+            m = re.match(r'''(?x)
+                (?P<out>%s)(?:\[(?P<index>[^\]]+?)\])?
+                \s*%s
+                (?P<expr>.*)$''' % (_NAME_RE, re.escape(op)), expr)
+            if not m:
+                continue
+            right_val = self.interpret_expression(
+                m.group('expr'), local_vars, allow_recursion - 1)
+
+            if m.groupdict().get('index'):
+                lvar = local_vars[m.group('out')]
+                idx = self.interpret_expression(
+                    m.group('index'), local_vars, allow_recursion)
+                assert isinstance(idx, int)
+                cur = lvar[idx]
+                val = opfunc(cur, right_val)
+                lvar[idx] = val
+                return val
+            else:
+                cur = local_vars.get(m.group('out'))
+                val = opfunc(cur, right_val)
+                local_vars[m.group('out')] = val
+                return val
+
        if expr.isdigit():
            return int(expr)

-        if expr.isalpha():
-            return local_vars[expr]
+        var_m = re.match(
+            r'(?!if|return|true|false)(?P<name>%s)$' % _NAME_RE,
+            expr)
+        if var_m:
+            return local_vars[var_m.group('name')]

        try:
            return json.loads(expr)
@@ -61,7 +124,7 @@ class JSInterpreter(object):
            pass

        m = re.match(
-            r'^(?P<var>[$a-zA-Z0-9_]+)\.(?P<member>[^(]+)(?:\(+(?P<args>[^()]*)\))?$',
+            r'(?P<var>%s)\.(?P<member>[^(]+)(?:\(+(?P<args>[^()]*)\))?$' % _NAME_RE,
            expr)
        if m:
            variable = m.group('var')
@@ -114,23 +177,31 @@ class JSInterpreter(object):
            return obj[member](argvals)

        m = re.match(
-            r'^(?P<in>[a-z]+)\[(?P<idx>.+)\]$', expr)
+            r'(?P<in>%s)\[(?P<idx>.+)\]$' % _NAME_RE, expr)
        if m:
            val = local_vars[m.group('in')]
            idx = self.interpret_expression(
                m.group('idx'), local_vars, allow_recursion - 1)
            return val[idx]

-        m = re.match(r'^(?P<a>.+?)(?P<op>[%])(?P<b>.+?)$', expr)
-        if m:
-            a = self.interpret_expression(
-                m.group('a'), local_vars, allow_recursion)
-            b = self.interpret_expression(
-                m.group('b'), local_vars, allow_recursion)
-            return a % b
+        for op, opfunc in _OPERATORS:
+            m = re.match(r'(?P<x>.+?)%s(?P<y>.+)' % re.escape(op), expr)
+            if not m:
+                continue
+            x, abort = self.interpret_statement(
+                m.group('x'), local_vars, allow_recursion - 1)
+            if abort:
+                raise ExtractorError(
+                    'Premature left-side return of %s in %r' % (op, expr))
+            y, abort = self.interpret_statement(
+                m.group('y'), local_vars, allow_recursion - 1)
+            if abort:
+                raise ExtractorError(
+                    'Premature right-side return of %s in %r' % (op, expr))
+            return opfunc(x, y)

        m = re.match(
-            r'^(?P<func>[a-zA-Z$]+)\((?P<args>[a-z0-9,]+)\)$', expr)
+            r'^(?P<func>%s)\((?P<args>[a-zA-Z0-9_$,]+)\)$' % _NAME_RE, expr)
        if m:
            fname = m.group('func')
            argvals = tuple([
@@ -139,6 +210,7 @@ class JSInterpreter(object):
            if fname not in self._functions:
                self._functions[fname] = self.extract_function(fname)
            return self._functions[fname](argvals)
+
        raise ExtractorError('Unsupported JS expression %r' % expr)

    def extract_object(self, objname):
@@ -162,9 +234,11 @@ class JSInterpreter(object):

    def extract_function(self, funcname):
        func_m = re.search(
-            (r'(?:function %s|[{;]%s\s*=\s*function)' % (
-                re.escape(funcname), re.escape(funcname))) +
-            r'\((?P<args>[a-z,]+)\){(?P<code>[^}]+)}',
+            r'''(?x)
+                (?:function\s+%s|[{;]%s\s*=\s*function)\s*
+                \((?P<args>[^)]*)\)\s*
+                \{(?P<code>[^}]+)\}''' % (
+                re.escape(funcname), re.escape(funcname)),
            self.code)
        if func_m is None:
            raise ExtractorError('Could not find JS function %r' % funcname)
@@ -172,10 +246,16 @@ class JSInterpreter(object):

        return self.build_function(argnames, func_m.group('code'))

+    def call_function(self, funcname, *args):
+        f = self.extract_function(funcname)
+        return f(args)
+
    def build_function(self, argnames, code):
        def resf(args):
            local_vars = dict(zip(argnames, args))
            for stmt in code.split(';'):
-                res = self.interpret_statement(stmt, local_vars)
+                res, abort = self.interpret_statement(stmt, local_vars)
+                if abort:
+                    break
            return res
        return resf
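
Note on the hunks above: `interpret_statement` now returns a `(value, should_abort)` pair, and `build_function` stops at the first statement that sets the flag. The following is a minimal sketch of that control flow only, not the project's interpreter. The names `toy_interpret_statement` and `toy_run` are hypothetical, and the toy version understands only integer literals and plain variable names.

```python
import re


def toy_interpret_statement(stmt, local_vars):
    """Return (value, should_abort) for one toy statement (hypothetical helper)."""
    stmt = stmt.lstrip()
    should_abort = False
    # Same pattern as in the diff: 'return' followed by whitespace or end of string
    return_m = re.match(r'return(?:\s+|$)', stmt)
    if return_m:
        expr = stmt[len(return_m.group(0)):]
        should_abort = True
    else:
        expr = stmt
    expr = expr.strip()
    if expr == '':  # empty expression, e.g. a bare 'return'
        return None, should_abort
    if expr.isdigit():
        return int(expr), should_abort
    # Assumption: anything else is a variable name
    return local_vars[expr], should_abort


def toy_run(code, local_vars):
    """Run ';'-separated statements and stop at the first 'return'."""
    res = None
    for stmt in code.split(';'):
        res, abort = toy_interpret_statement(stmt, local_vars)
        if abort:
            break
    return res


print(toy_run('1;return x;2', {'x': 7}))
```

Because the flag travels alongside the value, a statement after `return` is never evaluated, which the old `res = self.interpret_statement(...)` loop could not guarantee.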
@@ -297,8 +297,10 @@ def parseOpts(overrideArguments=None):
            ' You can filter the video results by putting a condition in'
            ' brackets, as in -f "best[height=720]"'
            ' (or -f "[filesize>10M]"). '
-            ' This works for filesize, height, width, tbr, abr, and vbr'
-            ' and the comparisons <, <=, >, >=, =, != .'
+            ' This works for filesize, height, width, tbr, abr, vbr, asr, and fps'
+            ' and the comparisons <, <=, >, >=, =, !='
+            ' and for ext, acodec, vcodec, container, and protocol'
+            ' and the comparisons =, != .'
            ' Formats for which the value is not known are excluded unless you'
            ' put a question mark (?) after the operator.'
            ' You can combine format filters, so  '
@@ -698,10 +700,9 @@ def parseOpts(overrideArguments=None):
    postproc.add_option(
        '--fixup',
        metavar='POLICY', dest='fixup', default='detect_or_warn',
-        help='(experimental) Automatically correct known faults of the file. '
+        help='Automatically correct known faults of the file. '
             'One of never (do nothing), warn (only emit a warning), '
-             'detect_or_warn(check whether we can do anything about it, warn '
-             'otherwise')
+             'detect_or_warn(the default; fix file if we can, warn otherwise)')
    postproc.add_option(
        '--prefer-avconv',
        action='store_false', dest='prefer_ffmpeg',
@@ -166,14 +166,13 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
        if filecodec is None:
            raise PostProcessingError('WARNING: unable to obtain file audio codec with ffprobe')

-        uses_avconv = self._uses_avconv()
        more_opts = []
        if self._preferredcodec == 'best' or self._preferredcodec == filecodec or (self._preferredcodec == 'm4a' and filecodec == 'aac'):
            if filecodec == 'aac' and self._preferredcodec in ['m4a', 'best']:
                # Lossless, but in another container
                acodec = 'copy'
                extension = 'm4a'
-                more_opts = ['-bsf:a' if uses_avconv else '-absf', 'aac_adtstoasc']
+                more_opts = ['-bsf:a', 'aac_adtstoasc']
            elif filecodec in ['aac', 'mp3', 'vorbis', 'opus']:
                # Lossless if possible
                acodec = 'copy'
@@ -189,9 +188,9 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
                more_opts = []
                if self._preferredquality is not None:
                    if int(self._preferredquality) < 10:
-                        more_opts += ['-q:a' if uses_avconv else '-aq', self._preferredquality]
+                        more_opts += ['-q:a', self._preferredquality]
                    else:
-                        more_opts += ['-b:a' if uses_avconv else '-ab', self._preferredquality + 'k']
+                        more_opts += ['-b:a', self._preferredquality + 'k']
        else:
            # We convert the audio (lossy)
            acodec = {'mp3': 'libmp3lame', 'aac': 'aac', 'm4a': 'aac', 'opus': 'opus', 'vorbis': 'libvorbis', 'wav': None}[self._preferredcodec]
@@ -200,13 +199,13 @@ class FFmpegExtractAudioPP(FFmpegPostProcessor):
            if self._preferredquality is not None:
                # The opus codec doesn't support the -aq option
                if int(self._preferredquality) < 10 and extension != 'opus':
-                    more_opts += ['-q:a' if uses_avconv else '-aq', self._preferredquality]
+                    more_opts += ['-q:a', self._preferredquality]
                else:
-                    more_opts += ['-b:a' if uses_avconv else '-ab', self._preferredquality + 'k']
+                    more_opts += ['-b:a', self._preferredquality + 'k']
            if self._preferredcodec == 'aac':
                more_opts += ['-f', 'adts']
            if self._preferredcodec == 'm4a':
-                more_opts += ['-bsf:a' if uses_avconv else '-absf', 'aac_adtstoasc']
+                more_opts += ['-bsf:a', 'aac_adtstoasc']
            if self._preferredcodec == 'vorbis':
                extension = 'ogg'
            if self._preferredcodec == 'wav':
@@ -511,8 +510,9 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
            metadata['artist'] = info['uploader_id']
        if info.get('description') is not None:
            metadata['description'] = info['description']
+            metadata['comment'] = info['description']
        if info.get('webpage_url') is not None:
-            metadata['comment'] = info['webpage_url']
+            metadata['purl'] = info['webpage_url']

        if not metadata:
            self._downloader.to_screen('[ffmpeg] There isn\'t any metadata to add')
@@ -32,6 +32,7 @@ import xml.etree.ElementTree
 import zlib

 from .compat import (
+    compat_basestring,
    compat_chr,
    compat_getenv,
    compat_html_entities,
@@ -140,7 +141,7 @@ else:
    def find_xpath_attr(node, xpath, key, val):
        # Here comes the crazy part: In 2.6, if the xpath is a unicode,
        # .//node does not match if a node is a direct child of . !
-        if isinstance(xpath, unicode):
+        if isinstance(xpath, compat_str):
            xpath = xpath.encode('ascii')

        for f in node.findall(xpath):
@@ -654,9 +655,14 @@ class YoutubeDLHTTPSHandler(compat_urllib_request.HTTPSHandler):
        self._params = params

    def https_open(self, req):
+        kwargs = {}
+        if hasattr(self, '_context'):  # python > 2.6
+            kwargs['context'] = self._context
+        if hasattr(self, '_check_hostname'):  # python 3.x
+            kwargs['check_hostname'] = self._check_hostname
        return self.do_open(functools.partial(
            _create_http_connection, self, self._https_conn_class, True),
-            req)
+            req, **kwargs)


 def parse_iso8601(date_str, delimiter='T'):
@@ -695,7 +701,7 @@ def unified_strdate(date_str, day_first=True):
    # %z (UTC offset) is only supported in python>=3.2
    date_str = re.sub(r' ?(\+|-)[0-9]{2}:?[0-9]{2}$', '', date_str)
    # Remove AM/PM + timezone
-    date_str = re.sub(r'(?i)\s*(?:AM|PM)\s+[A-Z]+', '', date_str)
+    date_str = re.sub(r'(?i)\s*(?:AM|PM)(?:\s+[A-Z]+)?', '', date_str)

    format_expressions = [
        '%d %B %Y',
@@ -1257,7 +1263,7 @@ def float_or_none(v, scale=1, invscale=1, default=None):


 def parse_duration(s):
-    if not isinstance(s, basestring if sys.version_info < (3, 0) else compat_str):
+    if not isinstance(s, compat_basestring):
        return None

    s = s.strip()
@@ -1269,7 +1275,10 @@ def parse_duration(s):
            (?P<only_hours>[0-9.]+)\s*(?:hours?)|

            (?:
-                (?:(?P<hours>[0-9]+)\s*(?:[:h]|hours?)\s*)?
+                (?:
+                    (?:(?P<days>[0-9]+)\s*(?:[:d]|days?)\s*)?
+                    (?P<hours>[0-9]+)\s*(?:[:h]|hours?)\s*
+                )?
                (?P<mins>[0-9]+)\s*(?:[:m]|mins?|minutes?)\s*
            )?
            (?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?\s*(?:s|secs?|seconds?)?
@@ -1287,6 +1296,8 @@ def parse_duration(s):
        res += int(m.group('mins')) * 60
    if m.group('hours'):
        res += int(m.group('hours')) * 60 * 60
+    if m.group('days'):
+        res += int(m.group('days')) * 24 * 60 * 60
    if m.group('ms'):
        res += float(m.group('ms'))
    return res
@@ -1421,7 +1432,7 @@ def uppercase_escape(s):

 def escape_rfc3986(s):
    """Escape non-ASCII characters as suggested by RFC 3986"""
-    if sys.version_info < (3, 0) and isinstance(s, unicode):
+    if sys.version_info < (3, 0) and isinstance(s, compat_str):
        s = s.encode('utf-8')
    return compat_urllib_parse.quote(s, b"%/;:@&=+$,!~*'()?#[]")

@@ -1537,7 +1548,7 @@ def js_to_json(code):
    res = re.sub(r'''(?x)
        "(?:[^"\\]*(?:\\\\|\\")?)*"|
        '(?:[^'\\]*(?:\\\\|\\')?)*'|
-        [a-zA-Z_][a-zA-Z_0-9]*
+        [a-zA-Z_][.a-zA-Z_0-9]*
        ''', fix_kv, code)
    res = re.sub(r',(\s*\])', lambda m: m.group(1), res)
    return res
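
Note on the `parse_duration` hunks above: the new optional `days` group is nested inside the `hours` group, so a day count is only recognized when an hour count follows it. The following is a rough sketch of that behavior under stated assumptions. `parse_duration_sketch` is a hypothetical name, the regex is trimmed to the part shown in the diff (the `only_mins` and `only_hours` alternatives are left out), and the trailing `$` anchor is an assumption, since the end of the pattern is not visible in the hunk.

```python
import re

# Trimmed copy of the pattern from the hunk; the trailing '$' is assumed.
_DURATION_RE = re.compile(r'''(?x)
    (?:
        (?:
            (?:(?P<days>[0-9]+)\s*(?:[:d]|days?)\s*)?
            (?P<hours>[0-9]+)\s*(?:[:h]|hours?)\s*
        )?
        (?P<mins>[0-9]+)\s*(?:[:m]|mins?|minutes?)\s*
    )?
    (?P<secs>[0-9]+)(?P<ms>\.[0-9]+)?\s*(?:s|secs?|seconds?)?$''')


def parse_duration_sketch(s):
    """Return the duration in seconds, or None if the string does not match."""
    m = _DURATION_RE.match(s.strip())
    if not m:
        return None
    res = int(m.group('secs'))
    if m.group('mins'):
        res += int(m.group('mins')) * 60
    if m.group('hours'):
        res += int(m.group('hours')) * 60 * 60
    if m.group('days'):
        res += int(m.group('days')) * 24 * 60 * 60
    if m.group('ms'):
        res += float(m.group('ms'))
    return res


print(parse_duration_sketch('1:02:03:04'))
```

With four colon-separated fields the first one is read as days; with three fields the regex backtracks and the `days` group stays empty.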
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2015.01.25'
+__version__ = '2015.02.09.2'