Compare commits

..

1 Commit

Author SHA1 Message Date
3c96c02394 wip 2022-08-08 13:55:33 +02:00
24 changed files with 205 additions and 599 deletions

View File

@@ -3,26 +3,6 @@ twitch-dl changelog
<!-- Do not edit. This file is automatically generated from changelog.yaml.-->
### [2.0.0 (2022-08-18)](https://github.com/ihabunek/twitch-dl/releases/tag/2.0.0)
This release switches from using `requests` to `httpx` for making http requests,
and from threads to `asyncio` for concurrency. This enables easier
implementation of new features, but has no breaking changes for the CLI.
* **BREAKING**: Require Python 3.7 or later.
* Add `--rate-limit` option to `download` for limiting maximum bandwidth when
downloading.
* Add `--compact` option to `download` for displaying one video per line.
* Allow passing multiple video ids to `download` to download multiple videos
successively.
* Improved progress meter, updates on each chunk downloaded, instead of each VOD
downloaded.
* Improved speed estimate, displays recent speed instead of average speed since
the start of download.
* Decreased default concurrent downloads to 5. This seems to be enough to
saturate the download link in most cases. You can override this by setting the
`-w` option. Please test and report back if this works for you.
### [1.22.0 (2022-06-25)](https://github.com/ihabunek/twitch-dl/releases/tag/1.22.0)
* Add support for downloading subscriber-only VODs (#48, thanks @cemiu)
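The 2.0.0 entry above describes switching from `requests` to `httpx` for HTTP requests. For context, a minimal sketch of the two streaming-download styles, mirroring the before/after variants of `_download` that appear later in this diff; the URL and chunk size are placeholders, not values from this repository:

```
import httpx
import requests

URL = "https://example.com/segment.ts"  # placeholder, not a real VOD URL
CHUNK_SIZE = 1024


def fetch_with_requests(path):
    """Streaming download in the style used before the switch (requests)."""
    size = 0
    response = requests.get(URL, stream=True, timeout=5)
    with open(path, "wb") as target:
        for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
            target.write(chunk)
            size += len(chunk)
    return size


def fetch_with_httpx(path):
    """Equivalent streaming download in the style used after the switch (httpx)."""
    size = 0
    with httpx.stream("GET", URL, timeout=5) as response:
        with open(path, "wb") as target:
            for chunk in response.iter_bytes(chunk_size=CHUNK_SIZE):
                target.write(chunk)
                size += len(chunk)
    return size
```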

View File

@@ -8,7 +8,7 @@ dist :
clean :
find . -name "*pyc" | xargs rm -rf $1
rm -rf build dist bundle MANIFEST htmlcov deb_dist twitch-dl.*.pyz twitch-dl.1.man twitch_dl.egg-info
rm -rf build dist bundle MANIFEST htmlcov deb_dist twitch-dl.*.pyz twitch-dl.1.man
bundle:
mkdir bundle

View File

@@ -17,7 +17,7 @@ Resources
Requirements
------------
* Python 3.7 or later
* Python 3.5 or later
* [ffmpeg](https://ffmpeg.org/download.html), installed and on the system path
Quick start

View File

@@ -1,8 +0,0 @@
TODO
====
Some ideas what to do next.
* gracefully handle aborting the download with Ctrl+C, now it prints out an error stack
* add keyboard control for e.g. pausing a download
* test how worker count affects download speeds on low and high-bandwidth links (see https://github.com/ihabunek/twitch-dl/issues/104), adjust default worker count

View File

@@ -1,22 +1,3 @@
2.0.0:
date: 2022-08-18
description: |
This release switches from using `requests` to `httpx` for making http
requests, and from threads to `asyncio` for concurrency. This enables
easier implementation of new features, but has no breaking changes for the
CLI.
changes:
- "**BREAKING**: Require Python 3.7 or later."
- "Add `--rate-limit` option to `download` for limiting maximum bandwidth when downloading."
- "Add `--compact` option to `download` for displaying one video per line."
- "Allow passing multiple video ids to `download` to download multiple videos successively."
- "Improved progress meter, updates on each chunk downloaded, instead of each VOD downloaded."
- "Improved speed estimate, displays recent speed instead of average speed since the start of download."
- |
Decreased default concurrent downloads to 5. This seems to be enough to
saturate the download link in most cases. You can override this by setting
the `-w` option. Please test and report back if this works for you.
1.22.0:
date: 2022-06-25
changes:

View File

@@ -3,26 +3,6 @@ twitch-dl changelog
<!-- Do not edit. This file is automatically generated from changelog.yaml.-->
### [2.0.0 (2022-08-18)](https://github.com/ihabunek/twitch-dl/releases/tag/2.0.0)
This release switches from using `requests` to `httpx` for making http requests,
and from threads to `asyncio` for concurrency. This enables easier
implementation of new features, but has no breaking changes for the CLI.
* **BREAKING**: Require Python 3.7 or later.
* Add `--rate-limit` option to `download` for limiting maximum bandwidth when
downloading.
* Add `--compact` option to `download` for displaying one video per line.
* Allow passing multiple video ids to `download` to download multiple videos
successively.
* Improved progress meter, updates on each chunk downloaded, instead of each VOD
downloaded.
* Improved speed estimate, displays recent speed instead of average speed since
the start of download.
* Decreased default concurrent downloads to 5. This seems to be enough to
saturate the download link in most cases. You can override this by setting the
`-w` option. Please test and report back if this works for you.
### [1.22.0 (2022-06-25)](https://github.com/ihabunek/twitch-dl/releases/tag/1.22.0)
* Add support for downloading subscriber-only VODs (#48, thanks @cemiu)

View File

@@ -1,12 +1,12 @@
<!-- ------------------- generated docs start ------------------- -->
# twitch-dl download
Download videos or clips.
Download a video or clip.
### USAGE
```
twitch-dl download <videos> [FLAGS] [OPTIONS]
twitch-dl download <video> [FLAGS] [OPTIONS]
```
### ARGUMENTS
@@ -14,8 +14,8 @@ twitch-dl download <videos> [FLAGS] [OPTIONS]
<table>
<tbody>
<tr>
<td class="code">&lt;videos&gt;</td>
<td>One or more video ID, clip slug or twitch URL to download.</td>
<td class="code">&lt;video&gt;</td>
<td>Video ID, clip slug, or URL</td>
</tr>
</tbody>
</table>
@@ -47,7 +47,7 @@ twitch-dl download <videos> [FLAGS] [OPTIONS]
<tbody>
<tr>
<td class="code">-w, --max-workers</td>
<td>Number of workers for downloading vods concurrently (default 5)</td>
<td>Maximal number of threads for downloading vods concurrently (default 20)</td>
</tr>
<tr>
@@ -79,11 +79,6 @@ twitch-dl download <videos> [FLAGS] [OPTIONS]
<td class="code">-o, --output</td>
<td>Output file name template. See docs for details.</td>
</tr>
<tr>
<td class="code">-r, --rate-limit</td>
<td>Limit the maximum download speed in bytes per second. Use &#x27;k&#x27; and &#x27;m&#x27; suffixes for kbps and mbps.</td>
</tr>
</tbody>
</table>
@@ -116,12 +111,6 @@ Setting quality to `audio_only` will download only audio:
twitch-dl download -q audio_only 221837124
```
Download multiple videos one after the other:
```
twitch-dl download 1559928295 1557034274 1555157293 -q source
```
### Overriding the target file name
The target filename can be defined by passing the `--output` option followed by
@@ -183,4 +172,4 @@ download command:
```
twitch-dl download 221837124 --auth-token iduetx4i107rn4b9wrgctf590aiktv
```
```
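As a rough illustration of the `--output` template documented above (not the actual implementation), here is a minimal sketch of rendering the default template from `console.py` with `str.format`; the field values are examples based on the video metadata embedded elsewhere in this diff, and the slug and format values are assumptions:

```
# Default template as shown in the console.py hunk further down; slug and
# container format below are illustrative assumptions.
template = "{date}_{id}_{channel_login}_{title_slug}.{format}"

filename = template.format(
    date="2022-08-07",
    id="1555108011",
    channel_login="bananasaurus_rex",
    title_slug="cult-of-the-lamb",
    format="mkv",
)
print(filename)  # 2022-08-07_1555108011_bananasaurus_rex_cult-of-the-lamb.mkv
```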

View File

@@ -33,11 +33,6 @@ twitch-dl videos <channel_name> [FLAGS] [OPTIONS]
<td class="code">-j, --json</td>
<td>Show results as JSON. Ignores <code>--pager</code>.</td>
</tr>
<tr>
<td class="code">-c, --compact</td>
<td>Show videos in compact mode, one line per video</td>
</tr>
</tbody>
</table>
@@ -52,7 +47,7 @@ twitch-dl videos <channel_name> [FLAGS] [OPTIONS]
<tr>
<td class="code">-l, --limit</td>
<td>Number of videos to fetch. Defaults to 40 in compact mode, 10 otherwise.</td>
<td>Number of videos to fetch. Defaults to 10.</td>
</tr>
<tr>

View File

@@ -1,6 +1,6 @@
# Installation
twitch-dl requires **Python 3.7** or later.
twitch-dl requires **Python 3.5** or later.
## Prerequisite: FFmpeg

View File

@@ -21,13 +21,6 @@ for version in data.keys():
changes = data[version]["changes"]
print(f"### [{version} ({date})](https://github.com/ihabunek/twitch-dl/releases/tag/{version})")
print()
if "description" in data[version]:
description = data[version]["description"].strip()
for line in textwrap.wrap(description, 80):
print(line)
print()
for c in changes:
lines = textwrap.wrap(c, 78)
initial = True

View File

@@ -44,18 +44,14 @@ if dist_version != version:
release_date = changelog_item["date"]
changes = changelog_item["changes"]
description = changelog_item["description"] if "description" in changelog_item else None
if not isinstance(release_date, date):
print(f"Release date not set for version `{version}` in the changelog.", file=sys.stderr)
sys.exit(1)
commit_message = f"twitch-dl {version}\n\n"
if description:
lines = textwrap.wrap(description.strip(), 72)
commit_message += "\n".join(lines) + "\n\n"
for c in changes:
lines = textwrap.wrap(c, 69)
lines = textwrap.wrap(c, 70)
initial = True
for line in lines:
lead = " *" if initial else " "

View File

@@ -10,33 +10,35 @@ makes it faster.
"""
setup(
name="twitch-dl",
version="2.0.0",
description="Twitch downloader",
name='twitch-dl',
version='1.22.0',
description='Twitch downloader',
long_description=long_description.strip(),
author="Ivan Habunek",
author_email="ivan@habunek.com",
url="https://github.com/ihabunek/twitch-dl/",
author='Ivan Habunek',
author_email='ivan@habunek.com',
url='https://github.com/ihabunek/twitch-dl/',
project_urls={
"Documentation": "https://twitch-dl.bezdomni.net/"
},
keywords="twitch vod video download",
license="GPLv3",
keywords='twitch vod video download',
license='GPLv3',
classifiers=[
"Development Status :: 5 - Production/Stable",
"Environment :: Console",
"License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
"Programming Language :: Python :: 3",
'Development Status :: 5 - Production/Stable',
'License :: OSI Approved :: GNU General Public License v3 (GPLv3)',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
],
packages=find_packages(),
python_requires=">=3.7",
python_requires='>=3.5',
install_requires=[
"m3u8>=1.0.0,<2.0.0",
"httpx>=0.17.0,<1.0.0",
"requests>=2.13,<3.0",
],
entry_points={
"console_scripts": [
"twitch-dl=twitchdl.console:main",
'console_scripts': [
'twitch-dl=twitchdl.console:main',
],
}
)

View File

@@ -1,102 +0,0 @@
from twitchdl.progress import Progress
def test_initial_values():
progress = Progress(10)
assert progress.downloaded == 0
assert progress.estimated_total is None
assert progress.progress_perc == 0
assert progress.remaining_time is None
assert progress.speed is None
assert progress.vod_count == 10
assert progress.vod_downloaded_count == 0
def test_downloaded():
progress = Progress(3)
progress.start(1, 300)
progress.start(2, 300)
progress.start(3, 300)
assert progress.downloaded == 0
assert progress.progress_bytes == 0
assert progress.progress_perc == 0
progress.advance(1, 100)
assert progress.downloaded == 100
assert progress.progress_bytes == 100
assert progress.progress_perc == 11
progress.advance(2, 200)
assert progress.downloaded == 300
assert progress.progress_bytes == 300
assert progress.progress_perc == 33
progress.advance(3, 150)
assert progress.downloaded == 450
assert progress.progress_bytes == 450
assert progress.progress_perc == 50
progress.advance(1, 50)
assert progress.downloaded == 500
assert progress.progress_bytes == 500
assert progress.progress_perc == 55
progress.abort(2)
assert progress.downloaded == 500
assert progress.progress_bytes == 300
assert progress.progress_perc == 33
progress.start(2, 300)
progress.advance(1, 150)
progress.advance(2, 300)
progress.advance(3, 150)
assert progress.downloaded == 1100
assert progress.progress_bytes == 900
assert progress.progress_perc == 100
progress.end(1)
progress.end(2)
progress.end(3)
assert progress.downloaded == 1100
assert progress.progress_bytes == 900
assert progress.progress_perc == 100
def test_estimated_total():
progress = Progress(3)
assert progress.estimated_total is None
progress.start(1, 12000)
assert progress.estimated_total == 12000 * 3
progress.start(2, 11000)
assert progress.estimated_total == 11500 * 3
progress.start(3, 10000)
assert progress.estimated_total == 11000 * 3
def test_vod_downloaded_count():
progress = Progress(3)
progress.start(1, 100)
progress.start(2, 100)
progress.start(3, 100)
assert progress.vod_downloaded_count == 0
progress.advance(1, 100)
progress.end(1)
assert progress.vod_downloaded_count == 1
progress.advance(2, 100)
progress.end(2)
assert progress.vod_downloaded_count == 2
progress.advance(3, 100)
progress.end(3)
assert progress.vod_downloaded_count == 3

View File

@@ -1,3 +1,3 @@
__version__ = "2.0.0"
__version__ = "1.22.0"
CLIENT_ID = "kimne78kx3ncx6brgo4mv6wki5h1ko"

View File

@@ -1,21 +1,17 @@
import asyncio
import httpx
import m3u8
import os
import re
import requests
import shutil
import subprocess
import tempfile
from os import path
from pathlib import Path
from typing import OrderedDict
from urllib.parse import urlparse, urlencode
from twitchdl import twitch, utils
from twitchdl.download import download_file
from twitchdl.download import download_file, download_files
from twitchdl.exceptions import ConsoleError
from twitchdl.http import download_all
from twitchdl.output import print_out
@@ -166,20 +162,15 @@ def _crete_temp_dir(base_uri):
def download(args):
for video in args.videos:
download_one(video, args)
def download_one(video, args):
video_id = utils.parse_video_identifier(video)
video_id = utils.parse_video_identifier(args.video)
if video_id:
return _download_video(video_id, args)
clip_slug = utils.parse_clip_identifier(video)
clip_slug = utils.parse_clip_identifier(args.video)
if clip_slug:
return _download_clip(clip_slug, args)
raise ConsoleError("Invalid input: {}".format(video))
raise ConsoleError("Invalid input: {}".format(args.video))
def _get_clip_url(clip, quality):
@@ -292,7 +283,7 @@ def _download_video(video_id, args):
else _select_playlist_interactive(playlists))
print_out("<dim>Fetching playlist...</dim>")
response = httpx.get(playlist_uri)
response = requests.get(playlist_uri)
response.raise_for_status()
playlist = m3u8.loads(response.text)
@@ -308,15 +299,11 @@ def _download_video(video_id, args):
print_out("\nDownloading {} VODs using {} workers to {}".format(
len(vod_paths), args.max_workers, target_dir))
sources = [base_uri + path for path in vod_paths]
targets = [os.path.join(target_dir, "{:05d}.ts".format(k)) for k, _ in enumerate(vod_paths)]
asyncio.run(download_all(sources, targets, args.max_workers, rate_limit=args.rate_limit))
path_map = download_files(base_uri, target_dir, vod_paths, args.max_workers)
# Make a modified playlist which references downloaded VODs
# Keep only the downloaded segments and skip the rest
org_segments = playlist.segments.copy()
path_map = OrderedDict(zip(vod_paths, targets))
playlist.segments.clear()
for segment in org_segments:
if segment.uri in path_map:

View File

@@ -2,17 +2,13 @@ import sys
from twitchdl import twitch
from twitchdl.exceptions import ConsoleError
from twitchdl.output import print_out, print_paged_videos, print_video, print_json, print_video_compact
from twitchdl.output import print_out, print_paged_videos, print_video, print_json
def videos(args):
game_ids = _get_game_ids(args.game)
# Set different defaults for limit for compact display
limit = args.limit or (40 if args.compact else 10)
# Ignore --limit if --pager or --all are given
max_videos = sys.maxsize if args.all or args.pager else limit
max_videos = sys.maxsize if args.all or args.pager else args.limit
total_count, generator = twitch.channel_videos_generator(
args.channel_name, max_videos, args.sort, args.type, game_ids=game_ids)
@@ -36,11 +32,8 @@ def videos(args):
count = 0
for video in generator:
if args.compact:
print_video_compact(video)
else:
print_out()
print_video(video)
print_out()
print_video(video)
count += 1
print_out()

View File

@@ -2,7 +2,6 @@
import logging
import sys
import re
from argparse import ArgumentParser, ArgumentTypeError
from collections import namedtuple
@@ -47,24 +46,6 @@ def pos_integer(value):
return value
def rate(value):
match = re.search(r"^([0-9]+)(k|m|)$", value, flags=re.IGNORECASE)
if not match:
raise ArgumentTypeError("must be an integer, followed by an optional 'k' or 'm'")
amount = int(match.group(1))
unit = match.group(2)
if unit == "k":
return amount * 1024
if unit == "m":
return amount * 1024 * 1024
return amount
COMMANDS = [
Command(
name="videos",
@@ -80,8 +61,9 @@ COMMANDS = [
"type": str,
}),
(["-l", "--limit"], {
"help": "Number of videos to fetch. Defaults to 40 in copmpact mode, 10 otherwise.",
"help": "Number of videos to fetch. Defaults to 10.",
"type": pos_integer,
"default": 10,
}),
(["-a", "--all"], {
"help": "Fetch all videos, overrides --limit",
@@ -111,11 +93,6 @@ COMMANDS = [
"nargs": "?",
"const": 10,
}),
(["-c", "--compact"], {
"help": "Show videos in compact mode, one line per video",
"action": "store_true",
"default": False,
}),
],
),
Command(
@@ -162,17 +139,17 @@ COMMANDS = [
),
Command(
name="download",
description="Download videos or clips.",
description="Download a video or clip.",
arguments=[
(["videos"], {
"help": "One or more video ID, clip slug or twitch URL to download.",
(["video"], {
"help": "Video ID, clip slug, or URL",
"type": str,
"nargs": "+",
}),
(["-w", "--max-workers"], {
"help": "Number of workers for downloading vods concurrently (default 5)",
"help": "Maximal number of threads for downloading vods "
"concurrently (default 20)",
"type": int,
"default": 5,
"default": 20,
}),
(["-s", "--start"], {
"help": "Download video from this time (hh:mm or hh:mm:ss)",
@@ -220,12 +197,7 @@ COMMANDS = [
"help": "Output file name template. See docs for details.",
"type": str,
"default": "{date}_{id}_{channel_login}_{title_slug}.{format}"
}),
(["-r", "--rate-limit"], {
"help": "Limit the maximum download speed in bytes per second. "
"Use 'k' and 'm' suffixes for kbps and mbps.",
"type": rate,
}),
})
],
),
Command(
@@ -309,7 +281,7 @@ def main():
print_err(e)
sys.exit(1)
except KeyboardInterrupt:
print_err("\nOperation canceled")
print_err("Operation canceled")
sys.exit(1)
except GQLError as e:
print_err(e)
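The removed `rate` argument type above maps human-friendly `--rate-limit` values to bytes per second. A small worked example with illustrative values:

```
# Illustrative values only: plain integers are bytes per second, a 'k' suffix
# multiplies by 1024, an 'm' suffix by 1024 * 1024 (matching the rate() helper
# shown in the hunk above).
examples = {
    "100000": 100_000,        # plain bytes per second
    "500k": 500 * 1024,       # 512000 bytes/s
    "2m": 2 * 1024 * 1024,    # 2097152 bytes/s
}
for value, bytes_per_second in examples.items():
    print(f"--rate-limit {value} -> {bytes_per_second} bytes/s")
```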

View File

@@ -1,5 +1,14 @@
import os
import httpx
import requests
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import datetime
from functools import partial
from requests.exceptions import RequestException
from twitchdl.output import print_out
from twitchdl.utils import format_size, format_duration
CHUNK_SIZE = 1024
CONNECT_TIMEOUT = 5
@@ -12,12 +21,12 @@ class DownloadFailed(Exception):
def _download(url, path):
tmp_path = path + ".tmp"
response = requests.get(url, stream=True, timeout=CONNECT_TIMEOUT)
size = 0
with httpx.stream("GET", url, timeout=CONNECT_TIMEOUT) as response:
with open(tmp_path, "wb") as target:
for chunk in response.iter_bytes(chunk_size=CHUNK_SIZE):
target.write(chunk)
size += len(chunk)
with open(tmp_path, 'wb') as target:
for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
target.write(chunk)
size += len(chunk)
os.rename(tmp_path, path)
return size
@@ -32,7 +41,63 @@ def download_file(url, path, retries=RETRY_COUNT):
for _ in range(retries):
try:
return (_download(url, path), from_disk)
except httpx.RequestError:
except RequestException:
pass
raise DownloadFailed(":(")
def _print_progress(futures):
downloaded_count = 0
downloaded_size = 0
max_msg_size = 0
start_time = datetime.now()
total_count = len(futures)
current_download_size = 0
current_downloaded_count = 0
for future in as_completed(futures):
size, from_disk = future.result()
downloaded_count += 1
downloaded_size += size
# If we find something on disk, we don't want to take it into account in
# the speed calculation
if not from_disk:
current_download_size += size
current_downloaded_count += 1
percentage = 100 * downloaded_count // total_count
est_total_size = int(total_count * downloaded_size / downloaded_count)
duration = (datetime.now() - start_time).seconds
speed = current_download_size // duration if duration else 0
remaining = (total_count - downloaded_count) * duration / current_downloaded_count \
if current_downloaded_count else 0
msg = " ".join([
"Downloaded VOD {}/{}".format(downloaded_count, total_count),
"({}%)".format(percentage),
"<cyan>{}</cyan>".format(format_size(downloaded_size)),
"of <cyan>~{}</cyan>".format(format_size(est_total_size)),
"at <cyan>{}/s</cyan>".format(format_size(speed)) if speed > 0 else "",
"remaining <cyan>~{}</cyan>".format(format_duration(remaining)) if remaining > 0 else "",
])
max_msg_size = max(len(msg), max_msg_size)
print_out("\r" + msg.ljust(max_msg_size), end="")
def download_files(base_url, target_dir, vod_paths, max_workers):
"""
Downloads a list of VODs defined by a common `base_url` and a list of
`vod_paths`, returning a dict which maps the paths to the downloaded files.
"""
urls = [base_url + path for path in vod_paths]
targets = [os.path.join(target_dir, "{:05d}.ts".format(k)) for k, _ in enumerate(vod_paths)]
partials = (partial(download_file, url, path) for url, path in zip(urls, targets))
with ThreadPoolExecutor(max_workers=max_workers) as executor:
futures = [executor.submit(fn) for fn in partials]
_print_progress(futures)
return OrderedDict(zip(vod_paths, targets))
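A hypothetical call to the `download_files()` helper added above, mirroring its use in `twitchdl/commands/download.py`; the base URL, target directory and VOD paths are placeholders:

```
from twitchdl.download import download_files

# Placeholders - a real run gets these from the selected m3u8 playlist.
base_uri = "https://example.cloudfront.net/chunked/"
vod_paths = ["0.ts", "1.ts", "2.ts"]
target_dir = "/tmp/twitch-dl/example"  # assumed to already exist

# Returns an OrderedDict mapping playlist paths to downloaded files, e.g.
# {"0.ts": "/tmp/twitch-dl/example/00000.ts", ...}
path_map = download_files(base_uri, target_dir, vod_paths, max_workers=20)
```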

twitchdl/ffmpeg.py (new file, 76 lines added)
View File

@@ -0,0 +1,76 @@
import asyncio
import json
import logging
import re
from asyncio.subprocess import PIPE
from pprint import pprint
from typing import Optional
from twitchdl.output import print_out
async def join_vods(playlist_path: str, target: str, overwrite: bool, video: dict):
command = [
"ffmpeg",
"-i", playlist_path,
"-c", "copy",
"-metadata", "artist={}".format(video["creator"]["displayName"]),
"-metadata", "title={}".format(video["title"]),
"-metadata", "encoded_by=twitch-dl",
"-stats",
"-loglevel", "warning",
f"file:{target}",
]
if overwrite:
command.append("-y")
# command = ["ls", "-al"]
print_out("<dim>{}</dim>".format(" ".join(command)))
process = await asyncio.create_subprocess_exec(*command, stdout=PIPE, stderr=PIPE)
assert process.stderr is not None
await asyncio.gather(
# _read_stream("stdout", process.stdout),
_print_progress("stderr", process.stderr),
process.wait()
)
print(process.returncode)
async def _read_stream(name: str, stream: Optional[asyncio.StreamReader]):
if stream:
async for line in readlines(stream):
print(name, ">", line)
async def _print_progress(stream: asyncio.StreamReader):
async for line in readlines(stream):
print(name, ">", line)
pattern = re.compile(br"[\r\n]+")
async def readlines(stream: asyncio.StreamReader):
data = bytearray()
while not stream.at_eof():
lines = pattern.split(data)
data[:] = lines.pop(-1)
for line in lines:
yield line
data.extend(await stream.read(1024))
if __name__ == "__main__":
# logging.basicConfig(level=logging.DEBUG)
video = json.loads('{"id": "1555108011", "title": "Cult of the Lamb", "publishedAt": "2022-08-07T17:00:30Z", "broadcastType": "ARCHIVE", "lengthSeconds": 17948, "game": {"name": "Cult of the Lamb"}, "creator": {"login": "bananasaurus_rex", "displayName": "Bananasaurus_Rex"}, "playlists": [{"bandwidth": 8446533, "resolution": [1920, 1080], "codecs": "avc1.64002A,mp4a.40.2", "video": "chunked", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/chunked/index-dvr.m3u8"}, {"bandwidth": 3432426, "resolution": [1280, 720], "codecs": "avc1.4D0020,mp4a.40.2", "video": "720p60", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/720p60/index-dvr.m3u8"}, {"bandwidth": 1445268, "resolution": [852, 480], "codecs": "avc1.4D001F,mp4a.40.2", "video": "480p30", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/480p30/index-dvr.m3u8"}, {"bandwidth": 215355, "resolution": null, "codecs": "mp4a.40.2", "video": "audio_only", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/audio_only/index-dvr.m3u8"}, {"bandwidth": 705523, "resolution": [640, 360], "codecs": "avc1.4D001E,mp4a.40.2", "video": "360p30", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/360p30/index-dvr.m3u8"}, {"bandwidth": 285614, "resolution": [284, 160], "codecs": "avc1.4D000C,mp4a.40.2", "video": "160p30", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/160p30/index-dvr.m3u8"}]}')
playlist_path = "/tmp/twitch-dl/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/160p30/playlist_downloaded.m3u8"
asyncio.run(join_vods(playlist_path, "out.mkv", True, video), debug=True)
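The `readlines()` generator in the new file above splits on both `\r` and `\n` because ffmpeg's `-stats` output rewrites its progress line with carriage returns. A minimal sketch of that splitting behaviour on made-up bytes:

```
import re

pattern = re.compile(br"[\r\n]+")

# Made-up sample of what ffmpeg might write to stderr with -stats enabled.
sample = b"frame=  100 fps=25 time=00:00:04.00\rframe=  200 fps=25 time=00:00:08.00\nDone\n"
print(pattern.split(sample))
# [b'frame=  100 fps=25 time=00:00:04.00', b'frame=  200 fps=25 time=00:00:08.00', b'Done', b'']
```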

View File

@@ -1,129 +0,0 @@
import asyncio
import httpx
import logging
import os
import time
from typing import List, Optional, Union
from twitchdl.progress import Progress
logger = logging.getLogger(__name__)
KB = 1024
CHUNK_SIZE = 256 * KB
"""How much of a VOD to download in each iteration"""
RETRY_COUNT = 5
"""Number of times to retry failed downloads before aborting."""
TIMEOUT = 30
"""
Number of seconds to wait before aborting when there is no network activity.
https://www.python-httpx.org/advanced/#timeout-configuration
"""
class TokenBucket:
"""Limit the download speed by strategically inserting sleeps."""
def __init__(self, rate: int, capacity: Optional[int] = None):
self.rate: int = rate
self.capacity: int = capacity or rate * 2
self.available: int = 0
self.last_refilled: float = time.time()
def advance(self, size: int):
"""Called every time a chunk of data is downloaded."""
self._refill()
if self.available < size:
deficit = size - self.available
time.sleep(deficit / self.rate)
self.available -= size
def _refill(self):
"""Increase available capacity according to elapsed time since last refill."""
now = time.time()
elapsed = now - self.last_refilled
refill_amount = int(elapsed * self.rate)
self.available = min(self.available + refill_amount, self.capacity)
self.last_refilled = now
class EndlessTokenBucket:
"""Used when download speed is not limited."""
def advance(self, size):
pass
AnyTokenBucket = Union[TokenBucket, EndlessTokenBucket]
async def download(
client: httpx.AsyncClient,
task_id: int,
source: str,
target: str,
progress: Progress,
token_bucket: AnyTokenBucket,
):
# Download to a temp file first, then rename to the target when done, to avoid
# keeping partially saved chunks around if the download is canceled or --keep is used
tmp_target = f"{target}.tmp"
with open(tmp_target, "wb") as f:
async with client.stream("GET", source) as response:
size = int(response.headers.get("content-length"))
progress.start(task_id, size)
async for chunk in response.aiter_bytes(chunk_size=CHUNK_SIZE):
f.write(chunk)
size = len(chunk)
token_bucket.advance(size)
progress.advance(task_id, size)
progress.end(task_id)
os.rename(tmp_target, target)
async def download_with_retries(
client: httpx.AsyncClient,
semaphore: asyncio.Semaphore,
task_id: int,
source: str,
target: str,
progress: Progress,
token_bucket: AnyTokenBucket,
):
async with semaphore:
if os.path.exists(target):
size = os.path.getsize(target)
progress.already_downloaded(task_id, size)
return
for n in range(RETRY_COUNT):
try:
return await download(client, task_id, source, target, progress, token_bucket)
except httpx.RequestError:
logger.exception("Task {task_id} failed. Retrying. Maybe.")
progress.abort(task_id)
if n + 1 >= RETRY_COUNT:
raise
raise Exception("Should not happen")
async def download_all(
sources: List[str],
targets: List[str],
workers: int,
/, *,
rate_limit: Optional[int] = None
):
progress = Progress(len(sources))
token_bucket = TokenBucket(rate_limit) if rate_limit else EndlessTokenBucket()
async with httpx.AsyncClient(timeout=TIMEOUT) as client:
semaphore = asyncio.Semaphore(workers)
tasks = [download_with_retries(client, semaphore, task_id, source, target, progress, token_bucket)
for task_id, (source, target) in enumerate(zip(sources, targets))]
await asyncio.gather(*tasks)
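A back-of-the-envelope illustration (made-up numbers) of how the `TokenBucket` above throttles downloads: when a chunk is larger than the tokens currently available, the worker sleeps for `deficit / rate` seconds before continuing.

```
rate = 512 * 1024          # e.g. --rate-limit 512k -> 524288 bytes/s
available = 100 * 1024     # tokens refilled since the last chunk (assumed)
chunk_size = 256 * 1024    # CHUNK_SIZE used by the async downloader

deficit = chunk_size - available
sleep_seconds = deficit / rate
print(f"sleep for {sleep_seconds:.3f}s")  # ~0.305s, keeping the link under 512 KiB/s
```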

View File

@@ -48,13 +48,6 @@ def strip_tags(text):
return text
def truncate(string, length):
if len(string) > length:
return string[:length - 1] + "…"
return string
def print_out(*args, **kwargs):
args = [colorize(a) if USE_ANSI_COLOR else strip_tags(a) for a in args]
print(*args, **kwargs)
@@ -96,14 +89,6 @@ def print_video(video):
print_out("<i>{}</i>".format(url))
def print_video_compact(video):
id = video["id"]
date = video["publishedAt"][:10]
game = video["game"]["name"] if video["game"] else ""
title = truncate(video["title"], 80).ljust(80)
print_out(f'<b>{id}</b> {date} <green>{title}</green> <blue>{game}</blue>')
def print_paged_videos(generator, page_size, total_count):
iterator = iter(generator)
page = list(islice(iterator, page_size))

View File

@@ -1,137 +0,0 @@
import logging
import time
from collections import deque
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, NamedTuple, Optional, Deque
from twitchdl.output import print_out
from twitchdl.utils import format_size, format_time
logger = logging.getLogger(__name__)
TaskId = int
@dataclass
class Task:
id: TaskId
size: int
downloaded: int = 0
def advance(self, size):
self.downloaded += size
class Sample(NamedTuple):
downloaded: int
timestamp: float
@dataclass
class Progress:
vod_count: int
downloaded: int = 0
estimated_total: Optional[int] = None
last_printed: float = field(default_factory=time.time)
progress_bytes: int = 0
progress_perc: int = 0
remaining_time: Optional[int] = None
speed: Optional[float] = None
start_time: float = field(default_factory=time.time)
tasks: Dict[TaskId, Task] = field(default_factory=dict)
vod_downloaded_count: int = 0
samples: Deque[Sample] = field(default_factory=lambda: deque(maxlen=100))
def start(self, task_id: int, size: int):
if task_id in self.tasks:
raise ValueError(f"Task {task_id}: cannot start, already started")
self.tasks[task_id] = Task(task_id, size)
self._calculate_total()
self._calculate_progress()
self.print()
def advance(self, task_id: int, size: int):
if task_id not in self.tasks:
raise ValueError(f"Task {task_id}: cannot advance, not started")
self.downloaded += size
self.progress_bytes += size
self.tasks[task_id].advance(size)
self.samples.append(Sample(self.downloaded, time.time()))
self._calculate_progress()
self.print()
def already_downloaded(self, task_id: int, size: int):
if task_id in self.tasks:
raise ValueError(f"Task {task_id}: cannot mark as downloaded, already started")
self.tasks[task_id] = Task(task_id, size)
self.progress_bytes += size
self.vod_downloaded_count += 1
self._calculate_total()
self._calculate_progress()
self.print()
def abort(self, task_id: int):
if task_id not in self.tasks:
raise ValueError(f"Task {task_id}: cannot abort, not started")
del self.tasks[task_id]
self.progress_bytes = sum(t.downloaded for t in self.tasks.values())
self._calculate_total()
self._calculate_progress()
self.print()
def end(self, task_id: int):
if task_id not in self.tasks:
raise ValueError(f"Task {task_id}: cannot end, not started")
task = self.tasks[task_id]
if task.size != task.downloaded:
logger.warn(f"Taks {task_id} ended with {task.downloaded}b downloaded, expected {task.size}b.")
self.vod_downloaded_count += 1
self.print()
def _calculate_total(self):
self.estimated_total = int(mean(t.size for t in self.tasks.values()) * self.vod_count) if self.tasks else None
def _calculate_progress(self):
self.speed = self._calculate_speed()
self.progress_perc = int(100 * self.progress_bytes / self.estimated_total) if self.estimated_total else 0
self.remaining_time = int((self.estimated_total - self.progress_bytes) / self.speed) if self.estimated_total and self.speed else None
def _calculate_speed(self):
if len(self.samples) < 2:
return None
first_sample = self.samples[0]
last_sample = self.samples[-1]
size = last_sample.downloaded - first_sample.downloaded
duration = last_sample.timestamp - first_sample.timestamp
return size / duration
def print(self):
now = time.time()
# Don't print more often than 10 times per second
if now - self.last_printed < 0.1:
return
progress = " ".join([
f"Downloaded {self.vod_downloaded_count}/{self.vod_count} VODs",
f"<blue>{self.progress_perc}%</blue>",
f"of <blue>~{format_size(self.estimated_total)}</blue>" if self.estimated_total else "",
f"at <blue>{format_size(self.speed)}/s</blue>" if self.speed else "",
f"ETA <blue>{format_time(self.remaining_time)}</blue>" if self.remaining_time is not None else "",
])
print_out(f"\r{progress} ", end="")
self.last_printed = now
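The recent-speed estimate in `_calculate_speed()` above uses only the oldest and newest of the last 100 samples, rather than averaging since the start of the download. A small worked example with made-up samples:

```
from collections import deque

# (downloaded bytes, timestamp) pairs, as appended by Progress.advance()
samples = deque(maxlen=100)
samples.append((1_000_000, 10.0))
samples.append((1_400_000, 10.5))
samples.append((2_000_000, 11.0))

size = samples[-1][0] - samples[0][0]      # 1_000_000 bytes
duration = samples[-1][1] - samples[0][1]  # 1.0 second
speed = size / duration
print(f"{speed / 1024:.0f} KiB/s")         # 977 KiB/s
```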

View File

@@ -2,8 +2,9 @@
Twitch API access.
"""
import httpx
import requests
from requests.exceptions import HTTPError
from twitchdl import CLIENT_ID
from twitchdl.exceptions import ConsoleError
@@ -17,7 +18,7 @@ class GQLError(Exception):
def authenticated_get(url, params={}, headers={}):
headers['Client-ID'] = CLIENT_ID
response = httpx.get(url, params=params, headers=headers)
response = requests.get(url, params, headers=headers)
if 400 <= response.status_code < 500:
data = response.json()
# TODO: this does not look nice in the console since data["message"]
@@ -32,7 +33,7 @@ def authenticated_post(url, data=None, json=None, headers={}):
def authenticated_post(url, data=None, json=None, headers={}):
headers['Client-ID'] = CLIENT_ID
response = httpx.post(url, data=data, json=json, headers=headers)
response = requests.post(url, data=data, json=json, headers=headers)
if response.status_code == 400:
data = response.json()
raise ConsoleError(data["message"])
@@ -329,7 +330,7 @@ def get_access_token(video_id, auth_token=None):
try:
response = gql_query(query, headers=headers)
return response["data"]["videoPlaybackAccessToken"]
except httpx.HTTPStatusError as error:
except HTTPError as error:
# Provide a more useful error message when server returns HTTP 401
# Unauthorized while using a user-provided auth token.
if error.response.status_code == 401:
@@ -350,7 +351,7 @@ def get_playlists(video_id, access_token):
"""
url = "http://usher.twitch.tv/vod/{}".format(video_id)
response = httpx.get(url, params={
response = requests.get(url, params={
"nauth": access_token['value'],
"nauthsig": access_token['signature'],
"allow_audio_only": "true",

View File

@@ -40,19 +40,6 @@ def format_duration(total_seconds):
return "{} sec".format(seconds)
def format_time(total_seconds):
total_seconds = int(total_seconds)
hours = total_seconds // 3600
remainder = total_seconds % 3600
minutes = remainder // 60
seconds = total_seconds % 60
if hours:
return f"{hours:02}:{minutes:02}:{seconds:02}"
return f"{minutes:02}:{seconds:02}"
def read_int(msg, min, max, default):
msg = msg + " [default {}]: ".format(default)