Compare commits

30 Commits

Author SHA1 Message Date
33b5cc0a01 Bump version, add changelog 2022-09-09 08:11:46 +02:00
c9ab6237e8 Don't rename the file while it's still open
issue #111
2022-09-09 08:05:03 +02:00
662ce72195 Accept newer versions of m3u8 lib 2022-08-20 13:25:00 +02:00
a4b2434735 Start adding types 2022-08-20 13:25:00 +02:00
280a284fb2 Expand clips tests 2022-08-20 11:16:47 +02:00
235b13c257 Remove unused function 2022-08-20 11:10:47 +02:00
8e3a41e415 Expand tests 2022-08-19 09:35:56 +02:00
cacf921923 Increase default limit for compact mode 2022-08-18 10:06:23 +02:00
4d19f09065 Add compact videos display 2022-08-18 10:04:04 +02:00
f289c93305 Bump version, set release date 2022-08-18 10:04:04 +02:00
b43c9dc9b9 Use double quotes please 2022-08-18 09:36:56 +02:00
a0e808660a Require python 3.7 2022-08-18 09:36:46 +02:00
a14ce57f95 Decrease default number of workers to 5 2022-08-18 09:30:35 +02:00
c4f4935b96 Enable downloading multiple videos successively 2022-08-18 09:12:25 +02:00
c8d38b5512 Update changelog 2022-08-17 11:07:04 +02:00
8be0aba95d Add support for version description in changelog 2022-08-17 11:07:04 +02:00
71ae2bf906 Add TODO 2022-08-17 09:22:16 +02:00
c8a6d67822 Improve speed tracking
Instead of calculating the average speed for the whole download,
consider only the last 100 chunks.
2022-08-17 08:35:57 +02:00
5c380084ba Update changelog 2022-08-15 07:31:25 +02:00
51a35ab494 Remove overly verbose logging 2022-08-15 07:14:53 +02:00
7ca71ddeca Delete egg-info on clean 2022-08-15 07:13:02 +02:00
f40fd290f7 Replace requests with httpx, remove unused code 2022-08-15 07:12:10 +02:00
b03c19dac1 Improve visuals
I never liked cyan anyway
2022-08-14 11:33:38 +02:00
cd445674e5 Download chunks to a temp file first 2022-08-14 11:33:23 +02:00
721d78377e Add rate limiting to download 2022-08-14 11:13:11 +02:00
ac07006ae7 Limit number of prints per second 2022-08-14 11:04:53 +02:00
32a68395d5 Use async downloader 2022-08-14 11:02:29 +02:00
81846764a1 Don't download already downloaded files 2022-08-14 10:21:38 +02:00
23f1a74aa6 Add new asyncio downloader code with rate limiting 2022-08-13 11:41:13 +02:00
85631c8ce5 Extract progress tracking 2022-08-13 09:40:18 +02:00
25 changed files with 664 additions and 241 deletions

View File

@ -3,6 +3,31 @@ twitch-dl changelog
<!-- Do not edit. This file is automatically generated from changelog.yaml.-->
### [2.0.1 (2022-09-09)](https://github.com/ihabunek/twitch-dl/releases/tag/2.0.1)
* Fix an issue where a temp vod file would be renamed while still being open,
which caused an exception on Windows (#111)
### [2.0.0 (2022-08-18)](https://github.com/ihabunek/twitch-dl/releases/tag/2.0.0)
This release switches from using `requests` to `httpx` for making http requests,
and from threads to `asyncio` for concurrency. This enables easier
implementation of new features, but has no breaking changes for the CLI.
* **BREAKING**: Require Python 3.7 or later.
* Add `--rate-limit` option to `download` for limiting maximum bandwidth when
downloading.
* Add `--compact` option to `download` for displaying one video per line.
* Allow passing multiple video ids to `download` to download multiple videos
successively.
* Improved progress meter, updates on each chunk downloaded, instead of each VOD
downloaded.
* Improved speed estimate, displays recent speed instead of average speed since
the start of download.
* Decreased default concurrent downloads to 5. This seems to be enough to
saturate the download link in most cases. You can override this by setting the
`-w` option. Please test and report back if this works for you.
### [1.22.0 (2022-06-25)](https://github.com/ihabunek/twitch-dl/releases/tag/1.22.0)
* Add support for downloading subscriber-only VODs (#48, thanks @cemiu)
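
The "recent speed" estimate described in the 2.0.0 notes can be illustrated with a sliding window over the most recent chunks. This is a minimal sketch, not the project's actual code: the `SpeedTracker` name, its methods and the injectable `now` argument are assumptions made for illustration. Only the window of 100 chunks comes from the commit message in the list above.

```python
import time
from collections import deque
from typing import Deque, Optional, Tuple


class SpeedTracker:
    """Hypothetical helper: estimates speed from the last `window` chunks only."""

    def __init__(self, window: int = 100):
        # Each sample is (timestamp, chunk size in bytes); old samples fall off.
        self.samples: Deque[Tuple[float, int]] = deque(maxlen=window)

    def add(self, size: int, now: Optional[float] = None) -> None:
        """Record a downloaded chunk; `now` can be injected to make tests deterministic."""
        timestamp = time.monotonic() if now is None else now
        self.samples.append((timestamp, size))

    def speed(self) -> Optional[float]:
        """Return bytes per second over the window, or None if it cannot be computed."""
        if len(self.samples) < 2:
            return None
        elapsed = self.samples[-1][0] - self.samples[0][0]
        if elapsed <= 0:
            return None
        # Bytes received after the first sample arrived, over the time since then.
        received = sum(size for _, size in list(self.samples)[1:])
        return received / elapsed


tracker = SpeedTracker(window=3)
tracker.add(100, now=0.0)
tracker.add(100, now=1.0)
tracker.add(100, now=2.0)
steady = tracker.speed()
tracker.add(400, now=3.0)  # the sample from t=0.0 is evicted here
burst = tracker.speed()
```

Because old samples are evicted, a slow start or a long stall early in the download stops influencing the displayed figure once enough newer chunks have arrived.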

View File

@ -8,7 +8,7 @@ dist :
clean :
find . -name "*pyc" | xargs rm -rf $1
rm -rf build dist bundle MANIFEST htmlcov deb_dist twitch-dl.*.pyz twitch-dl.1.man
rm -rf build dist bundle MANIFEST htmlcov deb_dist twitch-dl.*.pyz twitch-dl.1.man twitch_dl.egg-info
bundle:
mkdir bundle

View File

@ -17,7 +17,7 @@ Resources
Requirements
------------
* Python 3.5 or later
* Python 3.7 or later
* [ffmpeg](https://ffmpeg.org/download.html), installed and on the system path
Quick start

8
TODO.md Normal file
View File

@ -0,0 +1,8 @@
TODO
====
Some ideas for what to do next.
* gracefully handle aborting the download with Ctrl+C; currently it prints a stack trace
* add keyboard control for e.g. pausing a download
* test how worker count affects download speeds on low and high-bandwidth links (see https://github.com/ihabunek/twitch-dl/issues/104), adjust default worker count

View File

@ -1,3 +1,27 @@
2.0.1:
date: 2022-09-09
changes:
- "Fix an issue where a temp vod file would be renamed while still being open, which caused an exception on Windows (#111)"
2.0.0:
date: 2022-08-18
description: |
This release switches from using `requests` to `httpx` for making http
requests, and from threads to `asyncio` for concurrency. This enables
easier implementation of new features, but has no breaking changes for the
CLI.
changes:
- "**BREAKING**: Require Python 3.7 or later."
- "Add `--rate-limit` option to `download` for limiting maximum bandwidth when downloading."
- "Add `--compact` option to `download` for displaying one video per line."
- "Allow passing multiple video ids to `download` to download multiple videos successively."
- "Improved progress meter, updates on each chunk downloaded, instead of each VOD downloaded."
- "Improved speed estimate, displays recent speed instead of average speed since the start of download."
- |
Decreased default concurrent downloads to 5. This seems to be enough to
saturate the download link in most cases. You can override this by setting
the `-w` option. Please test and report back if this works for you.
1.22.0:
date: 2022-06-25
changes:

View File

@ -3,6 +3,31 @@ twitch-dl changelog
<!-- Do not edit. This file is automatically generated from changelog.yaml.-->
### [2.0.1 (2022-09-09)](https://github.com/ihabunek/twitch-dl/releases/tag/2.0.1)
* Fix an issue where a temp vod file would be renamed while still being open,
which caused an exception on Windows (#111)
### [2.0.0 (2022-08-18)](https://github.com/ihabunek/twitch-dl/releases/tag/2.0.0)
This release switches from using `requests` to `httpx` for making http requests,
and from threads to `asyncio` for concurrency. This enables easier
implementation of new features, but has no breaking changes for the CLI.
* **BREAKING**: Require Python 3.7 or later.
* Add `--rate-limit` option to `download` for limiting maximum bandwidth when
downloading.
* Add `--compact` option to `download` for displaying one video per line.
* Allow passing multiple video ids to `download` to download multiple videos
successively.
* Improved progress meter, updates on each chunk downloaded, instead of each VOD
downloaded.
* Improved speed estimate, displays recent speed instead of average speed since
the start of download.
* Decreased default concurrent downloads to 5. This seems to be enough to
saturate the download link in most cases. You can override this by setting the
`-w` option. Please test and report back if this works for you.
### [1.22.0 (2022-06-25)](https://github.com/ihabunek/twitch-dl/releases/tag/1.22.0)
* Add support for downloading subscriber-only VODs (#48, thanks @cemiu)

View File

@ -1,12 +1,12 @@
<!-- ------------------- generated docs start ------------------- -->
# twitch-dl download
Download a video or clip.
Download videos or clips.
### USAGE
```
twitch-dl download <video> [FLAGS] [OPTIONS]
twitch-dl download <videos> [FLAGS] [OPTIONS]
```
### ARGUMENTS
@ -14,8 +14,8 @@ twitch-dl download <video> [FLAGS] [OPTIONS]
<table>
<tbody>
<tr>
<td class="code">&lt;video&gt;</td>
<td>Video ID, clip slug, or URL</td>
<td class="code">&lt;videos&gt;</td>
<td>One or more video IDs, clip slugs or Twitch URLs to download.</td>
</tr>
</tbody>
</table>
@ -47,7 +47,7 @@ twitch-dl download <video> [FLAGS] [OPTIONS]
<tbody>
<tr>
<td class="code">-w, --max-workers</td>
<td>Maximal number of threads for downloading vods concurrently (default 20)</td>
<td>Number of workers for downloading vods concurrently (default 5)</td>
</tr>
<tr>
@ -79,6 +79,11 @@ twitch-dl download <video> [FLAGS] [OPTIONS]
<td class="code">-o, --output</td>
<td>Output file name template. See docs for details.</td>
</tr>
<tr>
<td class="code">-r, --rate-limit</td>
<td>Limit the maximum download speed in bytes per second. Use &#x27;k&#x27; and &#x27;m&#x27; suffixes for kbps and mbps.</td>
</tr>
</tbody>
</table>
@ -111,6 +116,12 @@ Setting quality to `audio_only` will download only audio:
twitch-dl download -q audio_only 221837124
```
Download multiple videos one after the other:
```
twitch-dl download 1559928295 1557034274 1555157293 -q source
```
### Overriding the target file name
The target filename can be defined by passing the `--output` option followed by
@ -172,4 +183,4 @@ download command:
```
twitch-dl download 221837124 --auth-token iduetx4i107rn4b9wrgctf590aiktv
```
```

View File

@ -33,6 +33,11 @@ twitch-dl videos <channel_name> [FLAGS] [OPTIONS]
<td class="code">-j, --json</td>
<td>Show results as JSON. Ignores <code>--pager</code>.</td>
</tr>
<tr>
<td class="code">-c, --compact</td>
<td>Show videos in compact mode, one line per video</td>
</tr>
</tbody>
</table>
@ -47,7 +52,7 @@ twitch-dl videos <channel_name> [FLAGS] [OPTIONS]
<tr>
<td class="code">-l, --limit</td>
<td>Number of videos to fetch. Defaults to 10.</td>
<td>Number of videos to fetch. Defaults to 40 in compact mode, 10 otherwise.</td>
</tr>
<tr>

View File

@ -1,6 +1,6 @@
# Installation
twitch-dl requires **Python 3.5** or later.
twitch-dl requires **Python 3.7** or later.
## Prerequisite: FFmpeg

View File

@ -21,6 +21,13 @@ for version in data.keys():
changes = data[version]["changes"]
print(f"### [{version} ({date})](https://github.com/ihabunek/twitch-dl/releases/tag/{version})")
print()
if "description" in data[version]:
description = data[version]["description"].strip()
for line in textwrap.wrap(description, 80):
print(line)
print()
for c in changes:
lines = textwrap.wrap(c, 78)
initial = True
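
The wrapping behaviour in this hunk can be tried in isolation. The following is a rough sketch under stated assumptions: `render_entry` is a hypothetical function, and it uses `textwrap`'s `initial_indent` and `subsequent_indent` arguments instead of the script's manual `initial` flag, so it is a simplification and not the script itself.

```python
import textwrap
from typing import List, Optional


def render_entry(version: str, date: str, changes: List[str],
                 description: Optional[str] = None) -> str:
    """Hypothetical renderer for one changelog entry as wrapped markdown."""
    url = f"https://github.com/ihabunek/twitch-dl/releases/tag/{version}"
    lines = [f"### [{version} ({date})]({url})", ""]

    if description:
        # textwrap.wrap collapses newlines, so a YAML block scalar is reflowed.
        lines += textwrap.wrap(description.strip(), 80)
        lines.append("")

    for change in changes:
        # First line gets the bullet, continuation lines are indented to match.
        lines += textwrap.wrap(change, 80, initial_indent="* ", subsequent_indent="  ")

    return "\n".join(lines)


entry = render_entry(
    "2.0.1",
    "2022-09-09",
    ["Fix an issue where a temp vod file would be renamed while still being "
     "open, which caused an exception on Windows (#111)"],
)
```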

View File

@ -44,14 +44,18 @@ if dist_version != version:
release_date = changelog_item["date"]
changes = changelog_item["changes"]
description = changelog_item["description"] if "description" in changelog_item else None
if not isinstance(release_date, date):
print(f"Release date not set for version `{version}` in the changelog.", file=sys.stderr)
sys.exit(1)
commit_message = f"twitch-dl {version}\n\n"
if description:
lines = textwrap.wrap(description.strip(), 72)
commit_message += "\n".join(lines) + "\n\n"
for c in changes:
lines = textwrap.wrap(c, 70)
lines = textwrap.wrap(c, 69)
initial = True
for line in lines:
lead = " *" if initial else " "

View File

@ -10,35 +10,33 @@ makes it faster.
"""
setup(
name='twitch-dl',
version='1.22.0',
description='Twitch downloader',
name="twitch-dl",
version="2.0.1",
description="Twitch downloader",
long_description=long_description.strip(),
author='Ivan Habunek',
author_email='ivan@habunek.com',
url='https://github.com/ihabunek/twitch-dl/',
author="Ivan Habunek",
author_email="ivan@habunek.com",
url="https://github.com/ihabunek/twitch-dl/",
project_urls={
"Documentation": "https://twitch-dl.bezdomni.net/"
},
keywords='twitch vod video download',
license='GPLv3',
keywords="twitch vod video download",
license="GPLv3",
classifiers=[
'Development Status :: 5 - Production/Stable',
'License :: OSI Approved :: GNU General Public License v3 (GPLv3)',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
"Development Status :: 5 - Production/Stable",
"Environment :: Console",
"License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
"Programming Language :: Python :: 3",
],
packages=find_packages(),
python_requires='>=3.5',
python_requires=">=3.7",
install_requires=[
"m3u8>=1.0.0,<2.0.0",
"requests>=2.13,<3.0",
"m3u8>=1.0.0,<4.0.0",
"httpx>=0.17.0,<1.0.0",
],
entry_points={
'console_scripts': [
'twitch-dl=twitchdl.console:main',
"console_scripts": [
"twitch-dl=twitchdl.console:main",
],
}
)

View File

@ -2,7 +2,10 @@
These tests depend on the channel having some videos and clips published.
"""
import httpx
import m3u8
from twitchdl import twitch
from twitchdl.commands.download import _parse_playlists, get_clip_authenticated_url
TEST_CHANNEL = "bananasaurus_rex"
@ -16,6 +19,21 @@ def test_get_videos():
video = twitch.get_video(video_id)
assert video["id"] == video_id
access_token = twitch.get_access_token(video_id)
assert "signature" in access_token
assert "value" in access_token
playlists = twitch.get_playlists(video_id, access_token)
assert playlists.startswith("#EXTM3U")
name, res, url = next(_parse_playlists(playlists))
playlist = httpx.get(url).text
assert playlist.startswith("#EXTM3U")
playlist = m3u8.loads(playlist)
vod_path = playlist.segments[0].uri
assert vod_path == "0.ts"
def test_get_clips():
"""
@ -25,6 +43,8 @@ def test_get_clips():
assert clips["pageInfo"]
assert len(clips["edges"]) > 0
clip_slug = clips["edges"][0]["node"]["slug"]
clip = twitch.get_clip(clip_slug)
assert clip["slug"] == clip_slug
slug = clips["edges"][0]["node"]["slug"]
clip = twitch.get_clip(slug)
assert clip["slug"] == slug
assert get_clip_authenticated_url(slug, "source")

102
tests/test_progress.py Normal file
View File

@ -0,0 +1,102 @@
from twitchdl.progress import Progress
def test_initial_values():
progress = Progress(10)
assert progress.downloaded == 0
assert progress.estimated_total is None
assert progress.progress_perc == 0
assert progress.remaining_time is None
assert progress.speed is None
assert progress.vod_count == 10
assert progress.vod_downloaded_count == 0
def test_downloaded():
progress = Progress(3)
progress.start(1, 300)
progress.start(2, 300)
progress.start(3, 300)
assert progress.downloaded == 0
assert progress.progress_bytes == 0
assert progress.progress_perc == 0
progress.advance(1, 100)
assert progress.downloaded == 100
assert progress.progress_bytes == 100
assert progress.progress_perc == 11
progress.advance(2, 200)
assert progress.downloaded == 300
assert progress.progress_bytes == 300
assert progress.progress_perc == 33
progress.advance(3, 150)
assert progress.downloaded == 450
assert progress.progress_bytes == 450
assert progress.progress_perc == 50
progress.advance(1, 50)
assert progress.downloaded == 500
assert progress.progress_bytes == 500
assert progress.progress_perc == 55
progress.abort(2)
assert progress.downloaded == 500
assert progress.progress_bytes == 300
assert progress.progress_perc == 33
progress.start(2, 300)
progress.advance(1, 150)
progress.advance(2, 300)
progress.advance(3, 150)
assert progress.downloaded == 1100
assert progress.progress_bytes == 900
assert progress.progress_perc == 100
progress.end(1)
progress.end(2)
progress.end(3)
assert progress.downloaded == 1100
assert progress.progress_bytes == 900
assert progress.progress_perc == 100
def test_estimated_total():
progress = Progress(3)
assert progress.estimated_total is None
progress.start(1, 12000)
assert progress.estimated_total == 12000 * 3
progress.start(2, 11000)
assert progress.estimated_total == 11500 * 3
progress.start(3, 10000)
assert progress.estimated_total == 11000 * 3
def test_vod_downloaded_count():
progress = Progress(3)
progress.start(1, 100)
progress.start(2, 100)
progress.start(3, 100)
assert progress.vod_downloaded_count == 0
progress.advance(1, 100)
progress.end(1)
assert progress.vod_downloaded_count == 1
progress.advance(2, 100)
progress.end(2)
assert progress.vod_downloaded_count == 2
progress.advance(3, 100)
progress.end(3)
assert progress.vod_downloaded_count == 3
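
The `twitchdl/progress.py` module itself is not part of this excerpt, so the bookkeeping these tests describe can only be sketched. The class below is an assumption reconstructed from the assertions above, not the project's implementation: the `ProgressSketch` and `Task` names are hypothetical, and speed and remaining-time tracking are left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Task:
    size: int            # expected size of the VOD in bytes
    downloaded: int = 0  # bytes received for the current attempt


class ProgressSketch:
    """Hypothetical stand-in for Progress, covering only the byte accounting."""

    def __init__(self, vod_count: int):
        self.vod_count = vod_count
        self.vod_downloaded_count = 0
        self.downloaded = 0      # every byte received, including aborted attempts
        self.progress_bytes = 0  # bytes that still count toward completion
        self.tasks: Dict[int, Task] = {}

    def start(self, task_id: int, size: int) -> None:
        self.tasks[task_id] = Task(size)

    def advance(self, task_id: int, size: int) -> None:
        self.tasks[task_id].downloaded += size
        self.downloaded += size
        self.progress_bytes += size

    def abort(self, task_id: int) -> None:
        # An aborted attempt is retried from scratch, so its bytes no longer
        # count as progress even though they were transferred.
        task = self.tasks.pop(task_id)
        self.progress_bytes -= task.downloaded

    def end(self, task_id: int) -> None:
        self.vod_downloaded_count += 1

    @property
    def estimated_total(self) -> Optional[int]:
        # Extrapolate from the average size of the VODs seen so far.
        if not self.tasks:
            return None
        mean_size = sum(t.size for t in self.tasks.values()) / len(self.tasks)
        return int(mean_size * self.vod_count)

    @property
    def progress_perc(self) -> int:
        total = self.estimated_total
        return int(100 * self.progress_bytes / total) if total else 0


progress = ProgressSketch(3)
for task_id in (1, 2, 3):
    progress.start(task_id, 300)
progress.advance(1, 100)
progress.advance(2, 200)
progress.abort(2)
```

Keeping `downloaded` and `progress_bytes` separate is what lets the tests assert that an abort lowers the percentage without rewriting how much data was actually transferred.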

View File

@ -1,3 +1,3 @@
__version__ = "1.22.0"
__version__ = "2.0.1"
CLIENT_ID = "kimne78kx3ncx6brgo4mv6wki5h1ko"

View File

@ -1,17 +1,21 @@
import asyncio
import httpx
import m3u8
import os
import re
import requests
import shutil
import subprocess
import tempfile
from os import path
from pathlib import Path
from typing import List, Optional, OrderedDict
from urllib.parse import urlparse, urlencode
from twitchdl import twitch, utils
from twitchdl.download import download_file, download_files
from twitchdl.download import download_file
from twitchdl.exceptions import ConsoleError
from twitchdl.http import download_all
from twitchdl.output import print_out
@ -133,7 +137,7 @@ def _clip_target_filename(clip, args):
raise ConsoleError("Invalid key {} used in --output. Supported keys are: {}".format(e, supported))
def _get_vod_paths(playlist, start, end):
def _get_vod_paths(playlist, start: Optional[int], end: Optional[int]) -> List[str]:
"""Extract unique VOD paths for download from playlist."""
files = []
vod_start = 0
@ -153,7 +157,7 @@ def _get_vod_paths(playlist, start, end):
return files
def _crete_temp_dir(base_uri):
def _crete_temp_dir(base_uri: str) -> str:
"""Create a temp dir to store downloads if it doesn't exist."""
path = urlparse(base_uri).path.lstrip("/")
temp_dir = Path(tempfile.gettempdir(), "twitch-dl", path)
@ -162,15 +166,20 @@ def _crete_temp_dir(base_uri):
def download(args):
video_id = utils.parse_video_identifier(args.video)
for video_id in args.videos:
download_one(video_id, args)
def download_one(video: str, args):
video_id = utils.parse_video_identifier(video)
if video_id:
return _download_video(video_id, args)
clip_slug = utils.parse_clip_identifier(args.video)
clip_slug = utils.parse_clip_identifier(video)
if clip_slug:
return _download_clip(clip_slug, args)
raise ConsoleError("Invalid input: {}".format(args.video))
raise ConsoleError("Invalid input: {}".format(video))
def _get_clip_url(clip, quality):
@ -218,7 +227,7 @@ def get_clip_authenticated_url(slug, quality):
return "{}?{}".format(url, query)
def _download_clip(slug, args):
def _download_clip(slug: str, args) -> None:
print_out("<dim>Looking up clip...</dim>")
clip = twitch.get_clip(slug)
game = clip["game"]["name"] if clip["game"] else "Unknown"
@ -251,7 +260,7 @@ def _download_clip(slug, args):
print_out("Downloaded: <blue>{}</blue>".format(target))
def _download_video(video_id, args):
def _download_video(video_id, args) -> None:
if args.start and args.end and args.end <= args.start:
raise ConsoleError("End time must be greater than start time")
@ -283,7 +292,7 @@ def _download_video(video_id, args):
else _select_playlist_interactive(playlists))
print_out("<dim>Fetching playlist...</dim>")
response = requests.get(playlist_uri)
response = httpx.get(playlist_uri)
response.raise_for_status()
playlist = m3u8.loads(response.text)
@ -299,11 +308,15 @@ def _download_video(video_id, args):
print_out("\nDownloading {} VODs using {} workers to {}".format(
len(vod_paths), args.max_workers, target_dir))
path_map = download_files(base_uri, target_dir, vod_paths, args.max_workers)
sources = [base_uri + path for path in vod_paths]
targets = [os.path.join(target_dir, "{:05d}.ts".format(k)) for k, _ in enumerate(vod_paths)]
asyncio.run(download_all(sources, targets, args.max_workers, rate_limit=args.rate_limit))
# Make a modified playlist which references downloaded VODs
# Keep only the downloaded segments and skip the rest
org_segments = playlist.segments.copy()
path_map = OrderedDict(zip(vod_paths, targets))
playlist.segments.clear()
for segment in org_segments:
if segment.uri in path_map:

View File

@ -2,13 +2,17 @@ import sys
from twitchdl import twitch
from twitchdl.exceptions import ConsoleError
from twitchdl.output import print_out, print_paged_videos, print_video, print_json
from twitchdl.output import print_out, print_paged_videos, print_video, print_json, print_video_compact
def videos(args):
game_ids = _get_game_ids(args.game)
# Set different defaults for limit for compact display
limit = args.limit or (40 if args.compact else 10)
# Ignore --limit if --pager or --all are given
max_videos = sys.maxsize if args.all or args.pager else args.limit
max_videos = sys.maxsize if args.all or args.pager else limit
total_count, generator = twitch.channel_videos_generator(
args.channel_name, max_videos, args.sort, args.type, game_ids=game_ids)
@ -32,8 +36,11 @@ def videos(args):
count = 0
for video in generator:
print_out()
print_video(video)
if args.compact:
print_video_compact(video)
else:
print_out()
print_video(video)
count += 1
print_out()

View File

@ -2,9 +2,10 @@
import logging
import sys
import re
from argparse import ArgumentParser, ArgumentTypeError
from collections import namedtuple
from typing import NamedTuple, List, Tuple, Any, Dict
from twitchdl.exceptions import ConsoleError
from twitchdl.output import print_err
@ -12,12 +13,19 @@ from twitchdl.twitch import GQLError
from . import commands, __version__
Command = namedtuple("Command", ["name", "description", "arguments"])
Argument = Tuple[List[str], Dict[str, Any]]
class Command(NamedTuple):
name: str
description: str
arguments: List[Argument]
CLIENT_WEBSITE = 'https://github.com/ihabunek/twitch-dl'
def time(value):
def time(value: str) -> int:
"""Parse a time string (hh:mm or hh:mm:ss) to number of seconds."""
parts = [int(p) for p in value.split(":")]
@ -34,16 +42,34 @@ def time(value):
return hours * 3600 + minutes * 60 + seconds
def pos_integer(value):
def pos_integer(value: str) -> int:
try:
value = int(value)
parsed = int(value)
except ValueError:
raise ArgumentTypeError("must be an integer")
if value < 1:
if parsed < 1:
raise ArgumentTypeError("must be positive")
return value
return parsed
def rate(value: str) -> int:
match = re.search(r"^([0-9]+)(k|m|)$", value, flags=re.IGNORECASE)
if not match:
raise ArgumentTypeError("must be an integer, followed by an optional 'k' or 'm'")
amount = int(match.group(1))
unit = match.group(2).lower()
if unit == "k":
return amount * 1024
if unit == "m":
return amount * 1024 * 1024
return amount
COMMANDS = [
@ -61,9 +87,8 @@ COMMANDS = [
"type": str,
}),
(["-l", "--limit"], {
"help": "Number of videos to fetch. Defaults to 10.",
"help": "Number of videos to fetch. Defaults to 40 in compact mode, 10 otherwise.",
"type": pos_integer,
"default": 10,
}),
(["-a", "--all"], {
"help": "Fetch all videos, overrides --limit",
@ -93,6 +118,11 @@ COMMANDS = [
"nargs": "?",
"const": 10,
}),
(["-c", "--compact"], {
"help": "Show videos in compact mode, one line per video",
"action": "store_true",
"default": False,
}),
],
),
Command(
@ -139,17 +169,17 @@ COMMANDS = [
),
Command(
name="download",
description="Download a video or clip.",
description="Download videos or clips.",
arguments=[
(["video"], {
"help": "Video ID, clip slug, or URL",
(["videos"], {
"help": "One or more video IDs, clip slugs or Twitch URLs to download.",
"type": str,
"nargs": "+",
}),
(["-w", "--max-workers"], {
"help": "Maximal number of threads for downloading vods "
"concurrently (default 20)",
"help": "Number of workers for downloading vods concurrently (default 5)",
"type": int,
"default": 20,
"default": 5,
}),
(["-s", "--start"], {
"help": "Download video from this time (hh:mm or hh:mm:ss)",
@ -197,7 +227,12 @@ COMMANDS = [
"help": "Output file name template. See docs for details.",
"type": str,
"default": "{date}_{id}_{channel_login}_{title_slug}.{format}"
})
}),
(["-r", "--rate-limit"], {
"help": "Limit the maximum download speed in bytes per second. "
"Use 'k' and 'm' suffixes for kbps and mbps.",
"type": rate,
}),
],
),
Command(
@ -281,7 +316,7 @@ def main():
print_err(e)
sys.exit(1)
except KeyboardInterrupt:
print_err("Operation canceled")
print_err("\nOperation canceled")
sys.exit(1)
except GQLError as e:
print_err(e)
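
To see how the new `videos` positional (`nargs="+"`) and the `rate` type interact with `argparse`, here is a small self-contained sketch. The parser below is a stripped-down stand-in with a hypothetical `prog` name, not the project's `COMMANDS` table, and this copy of `rate` lower-cases the suffix so that `K` and `M` scale the same way as `k` and `m`, which is an assumption of the sketch.

```python
import re
from argparse import ArgumentParser, ArgumentTypeError


def rate(value: str) -> int:
    """Parse a rate such as '100', '512k' or '2M' into bytes per second."""
    match = re.search(r"^([0-9]+)(k|m|)$", value, flags=re.IGNORECASE)
    if not match:
        raise ArgumentTypeError("must be an integer, followed by an optional 'k' or 'm'")

    amount = int(match.group(1))
    # Normalise the suffix, since the pattern is matched case-insensitively.
    unit = match.group(2).lower()

    if unit == "k":
        return amount * 1024
    if unit == "m":
        return amount * 1024 * 1024
    return amount


# Hypothetical, minimal parser mirroring two of the download arguments.
parser = ArgumentParser(prog="download-sketch")
parser.add_argument("videos", type=str, nargs="+")
parser.add_argument("-r", "--rate-limit", type=rate)

args = parser.parse_args(["1559928295", "1557034274", "-r", "512k"])
```

With `nargs="+"`, `args.videos` is always a list, which is why `download()` can loop over it even when a single ID is given.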

View File

@ -1,14 +1,5 @@
import os
import requests
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import datetime
from functools import partial
from requests.exceptions import RequestException
from twitchdl.output import print_out
from twitchdl.utils import format_size, format_duration
import httpx
CHUNK_SIZE = 1024
CONNECT_TIMEOUT = 5
@ -19,20 +10,20 @@ class DownloadFailed(Exception):
pass
def _download(url, path):
def _download(url: str, path: str):
tmp_path = path + ".tmp"
response = requests.get(url, stream=True, timeout=CONNECT_TIMEOUT)
size = 0
with open(tmp_path, 'wb') as target:
for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
target.write(chunk)
size += len(chunk)
with httpx.stream("GET", url, timeout=CONNECT_TIMEOUT) as response:
with open(tmp_path, "wb") as target:
for chunk in response.iter_bytes(chunk_size=CHUNK_SIZE):
target.write(chunk)
size += len(chunk)
os.rename(tmp_path, path)
return size
def download_file(url, path, retries=RETRY_COUNT):
def download_file(url: str, path: str, retries: int = RETRY_COUNT):
if os.path.exists(path):
from_disk = True
return (os.path.getsize(path), from_disk)
@ -41,63 +32,7 @@ def download_file(url, path, retries=RETRY_COUNT):
for _ in range(retries):
try:
return (_download(url, path), from_disk)
except RequestException:
except httpx.RequestError:
pass
raise DownloadFailed(":(")
def _print_progress(futures):
downloaded_count = 0
downloaded_size = 0
max_msg_size = 0
start_time = datetime.now()
total_count = len(futures)
current_download_size = 0
current_downloaded_count = 0
for future in as_completed(futures):
size, from_disk = future.result()
downloaded_count += 1
downloaded_size += size
# If we find something on disk, we don't want to take it in account in
# the speed calculation
if not from_disk:
current_download_size += size
current_downloaded_count += 1
percentage = 100 * downloaded_count // total_count
est_total_size = int(total_count * downloaded_size / downloaded_count)
duration = (datetime.now() - start_time).seconds
speed = current_download_size // duration if duration else 0
remaining = (total_count - downloaded_count) * duration / current_downloaded_count \
if current_downloaded_count else 0
msg = " ".join([
"Downloaded VOD {}/{}".format(downloaded_count, total_count),
"({}%)".format(percentage),
"<cyan>{}</cyan>".format(format_size(downloaded_size)),
"of <cyan>~{}</cyan>".format(format_size(est_total_size)),
"at <cyan>{}/s</cyan>".format(format_size(speed)) if speed > 0 else "",
"remaining <cyan>~{}</cyan>".format(format_duration(remaining)) if remaining > 0 else "",
])
max_msg_size = max(len(msg), max_msg_size)
print_out("\r" + msg.ljust(max_msg_size), end="")
def download_files(base_url, target_dir, vod_paths, max_workers):
"""
Downloads a list of VODs defined by a common `base_url` and a list of
`vod_paths`, returning a dict which maps the paths to the downloaded files.
"""
urls = [base_url + path for path in vod_paths]
targets = [os.path.join(target_dir, "{:05d}.ts".format(k)) for k, _ in enumerate(vod_paths)]
partials = (partial(download_file, url, path) for url, path in zip(urls, targets))
with ThreadPoolExecutor(max_workers=max_workers) as executor:
futures = [executor.submit(fn) for fn in partials]
_print_progress(futures)
return OrderedDict(zip(vod_paths, targets))

View File

@ -1,76 +0,0 @@
import asyncio
import json
import logging
import re
from asyncio.subprocess import PIPE
from pprint import pprint
from typing import Optional
from twitchdl.output import print_out
async def join_vods(playlist_path: str, target: str, overwrite: bool, video: dict):
command = [
"ffmpeg",
"-i", playlist_path,
"-c", "copy",
"-metadata", "artist={}".format(video["creator"]["displayName"]),
"-metadata", "title={}".format(video["title"]),
"-metadata", "encoded_by=twitch-dl",
"-stats",
"-loglevel", "warning",
f"file:{target}",
]
if overwrite:
command.append("-y")
# command = ["ls", "-al"]
print_out("<dim>{}</dim>".format(" ".join(command)))
process = await asyncio.create_subprocess_exec(*command, stdout=PIPE, stderr=PIPE)
assert process.stderr is not None
await asyncio.gather(
# _read_stream("stdout", process.stdout),
_print_progress("stderr", process.stderr),
process.wait()
)
print(process.returncode)
async def _read_stream(name: str, stream: Optional[asyncio.StreamReader]):
if stream:
async for line in readlines(stream):
print(name, ">", line)
async def _print_progress(stream: asyncio.StreamReader):
async for line in readlines(stream):
print(name, ">", line)
pattern = re.compile(br"[\r\n]+")
async def readlines(stream: asyncio.StreamReader):
data = bytearray()
while not stream.at_eof():
lines = pattern.split(data)
data[:] = lines.pop(-1)
for line in lines:
yield line
data.extend(await stream.read(1024))
if __name__ == "__main__":
# logging.basicConfig(level=logging.DEBUG)
video = json.loads('{"id": "1555108011", "title": "Cult of the Lamb", "publishedAt": "2022-08-07T17:00:30Z", "broadcastType": "ARCHIVE", "lengthSeconds": 17948, "game": {"name": "Cult of the Lamb"}, "creator": {"login": "bananasaurus_rex", "displayName": "Bananasaurus_Rex"}, "playlists": [{"bandwidth": 8446533, "resolution": [1920, 1080], "codecs": "avc1.64002A,mp4a.40.2", "video": "chunked", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/chunked/index-dvr.m3u8"}, {"bandwidth": 3432426, "resolution": [1280, 720], "codecs": "avc1.4D0020,mp4a.40.2", "video": "720p60", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/720p60/index-dvr.m3u8"}, {"bandwidth": 1445268, "resolution": [852, 480], "codecs": "avc1.4D001F,mp4a.40.2", "video": "480p30", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/480p30/index-dvr.m3u8"}, {"bandwidth": 215355, "resolution": null, "codecs": "mp4a.40.2", "video": "audio_only", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/audio_only/index-dvr.m3u8"}, {"bandwidth": 705523, "resolution": [640, 360], "codecs": "avc1.4D001E,mp4a.40.2", "video": "360p30", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/360p30/index-dvr.m3u8"}, {"bandwidth": 285614, "resolution": [284, 160], "codecs": "avc1.4D000C,mp4a.40.2", "video": "160p30", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/160p30/index-dvr.m3u8"}]}')
playlist_path = "/tmp/twitch-dl/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/160p30/playlist_downloaded.m3u8"
asyncio.run(join_vods(playlist_path, "out.mkv", True, video), debug=True)

129
twitchdl/http.py Normal file
View File

@ -0,0 +1,129 @@
import asyncio
import httpx
import logging
import os
import time
from typing import List, Optional, Union
from twitchdl.progress import Progress
logger = logging.getLogger(__name__)
KB = 1024
CHUNK_SIZE = 256 * KB
"""How much of a VOD to download in each iteration"""
RETRY_COUNT = 5
"""Number of times to retry failed downloads before aborting."""
TIMEOUT = 30
"""
Number of seconds to wait before aborting when there is no network activity.
https://www.python-httpx.org/advanced/#timeout-configuration
"""
class TokenBucket:
"""Limit the download speed by strategically inserting sleeps."""
def __init__(self, rate: int, capacity: Optional[int] = None):
self.rate: int = rate
self.capacity: int = capacity or rate * 2
self.available: int = 0
self.last_refilled: float = time.time()
def advance(self, size: int):
"""Called every time a chunk of data is downloaded."""
self._refill()
if self.available < size:
deficit = size - self.available
time.sleep(deficit / self.rate)
self.available -= size
def _refill(self):
"""Increase available capacity according to elapsed time since last refill."""
now = time.time()
elapsed = now - self.last_refilled
refill_amount = int(elapsed * self.rate)
self.available = min(self.available + refill_amount, self.capacity)
self.last_refilled = now
class EndlessTokenBucket:
"""Used when download speed is not limited."""
def advance(self, size: int):
pass
AnyTokenBucket = Union[TokenBucket, EndlessTokenBucket]
async def download(
    client: httpx.AsyncClient,
    task_id: int,
    source: str,
    target: str,
    progress: Progress,
    token_bucket: AnyTokenBucket,
):
    # Download to a temp file first, then rename it to target once complete,
    # to avoid leaving partial chunks which may persist if canceled or --keep is used
    tmp_target = f"{target}.tmp"
    with open(tmp_target, "wb") as f:
        async with client.stream("GET", source) as response:
            size = int(response.headers.get("content-length"))
            progress.start(task_id, size)
            async for chunk in response.aiter_bytes(chunk_size=CHUNK_SIZE):
                f.write(chunk)
                size = len(chunk)
                token_bucket.advance(size)
                progress.advance(task_id, size)
            progress.end(task_id)
    os.rename(tmp_target, target)
async def download_with_retries(
    client: httpx.AsyncClient,
    semaphore: asyncio.Semaphore,
    task_id: int,
    source: str,
    target: str,
    progress: Progress,
    token_bucket: AnyTokenBucket,
):
    async with semaphore:
        if os.path.exists(target):
            size = os.path.getsize(target)
            progress.already_downloaded(task_id, size)
            return

        for n in range(RETRY_COUNT):
            try:
                return await download(client, task_id, source, target, progress, token_bucket)
            except httpx.RequestError:
                logger.exception(f"Task {task_id} failed. Retrying. Maybe.")
                progress.abort(task_id)
                if n + 1 >= RETRY_COUNT:
                    raise

        raise Exception("Should not happen")
async def download_all(
    sources: List[str],
    targets: List[str],
    workers: int,
    /, *,
    rate_limit: Optional[int] = None
):
    progress = Progress(len(sources))
    token_bucket = TokenBucket(rate_limit) if rate_limit else EndlessTokenBucket()
    async with httpx.AsyncClient(timeout=TIMEOUT) as client:
        semaphore = asyncio.Semaphore(workers)
        tasks = [download_with_retries(client, semaphore, task_id, source, target, progress, token_bucket)
                 for task_id, (source, target) in enumerate(zip(sources, targets))]
        await asyncio.gather(*tasks)
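
The combination of a shared semaphore, a bounded retry loop and `asyncio.gather` used by `download_all` and `download_with_retries` can be tried without any network access. The sketch below is an illustration under assumptions, not the project's code: `FlakyError`, `make_flaky_job`, `with_retries` and `run_all` are hypothetical names, and the "download" is a coroutine that fails a fixed number of times before succeeding.

```python
import asyncio
from typing import Awaitable, Callable, List

RETRY_COUNT = 5  # same value as in the diff above


class FlakyError(Exception):
    """Stands in for httpx.RequestError in this sketch (assumption)."""


def make_flaky_job(name: str, failures: int) -> Callable[[], Awaitable[str]]:
    """Build a coroutine function that fails `failures` times, then returns `name`."""
    remaining = failures

    async def job() -> str:
        nonlocal remaining
        await asyncio.sleep(0)  # yield to the event loop, like real I/O would
        if remaining > 0:
            remaining -= 1
            raise FlakyError(name)
        return name

    return job


async def with_retries(semaphore: asyncio.Semaphore, job: Callable[[], Awaitable[str]]) -> str:
    # The semaphore caps how many jobs run at once; retries happen while holding it.
    async with semaphore:
        for attempt in range(RETRY_COUNT):
            try:
                return await job()
            except FlakyError:
                if attempt + 1 >= RETRY_COUNT:
                    raise  # out of attempts: let gather() see the error
        raise RuntimeError("unreachable when RETRY_COUNT > 0")


async def run_all(failure_counts: List[int], workers: int) -> List[str]:
    semaphore = asyncio.Semaphore(workers)
    jobs = [make_flaky_job(f"task-{i}", n) for i, n in enumerate(failure_counts)]
    # gather() returns results in submission order, regardless of completion order.
    return await asyncio.gather(*(with_retries(semaphore, job) for job in jobs))


print(asyncio.run(run_all([0, 2, 4], workers=2)))
```

A job that fails four times still succeeds on its fifth attempt, while one that fails five times exhausts `RETRY_COUNT` and the exception propagates out of `gather`.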


@@ -6,6 +6,7 @@ import re
 from itertools import islice
 from twitchdl import utils
+from typing import Any, Match
 START_CODES = {
@@ -29,31 +30,38 @@ END_PATTERN = "</(" + "|".join(START_CODES.keys()) + ")>"
 USE_ANSI_COLOR = "--no-color" not in sys.argv
-def start_code(match):
+def start_code(match: Match[str]) -> str:
     name = match.group(1)
     return START_CODES[name]
-def colorize(text):
+def colorize(text: str) -> str:
     text = re.sub(START_PATTERN, start_code, text)
     text = re.sub(END_PATTERN, END_CODE, text)
     return text
-def strip_tags(text):
+def strip_tags(text: str) -> str:
     text = re.sub(START_PATTERN, '', text)
     text = re.sub(END_PATTERN, '', text)
     return text
+def truncate(string: str, length: int) -> str:
+    if len(string) > length:
+        return string[:length - 1] + "…"
+    return string
 def print_out(*args, **kwargs):
     args = [colorize(a) if USE_ANSI_COLOR else strip_tags(a) for a in args]
     print(*args, **kwargs)
-def print_json(data):
+def print_json(data: Any):
     print(json.dumps(data))
@@ -89,6 +97,14 @@ def print_video(video):
     print_out("<i>{}</i>".format(url))
+def print_video_compact(video):
+    id = video["id"]
+    date = video["publishedAt"][:10]
+    game = video["game"]["name"] if video["game"] else ""
+    title = truncate(video["title"], 80).ljust(80)
+    print_out(f'<b>{id}</b> {date} <green>{title}</green> <blue>{game}</blue>')
 def print_paged_videos(generator, page_size, total_count):
     iterator = iter(generator)
     page = list(islice(iterator, page_size))

137
twitchdl/progress.py Normal file

@@ -0,0 +1,137 @@
import logging
import time
from collections import deque
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, NamedTuple, Optional, Deque
from twitchdl.output import print_out
from twitchdl.utils import format_size, format_time
logger = logging.getLogger(__name__)
TaskId = int
@dataclass
class Task:
    id: TaskId
    size: int
    downloaded: int = 0

    def advance(self, size):
        self.downloaded += size


class Sample(NamedTuple):
    downloaded: int
    timestamp: float
@dataclass
class Progress:
    vod_count: int
    downloaded: int = 0
    estimated_total: Optional[int] = None
    last_printed: float = field(default_factory=time.time)
    progress_bytes: int = 0
    progress_perc: int = 0
    remaining_time: Optional[int] = None
    speed: Optional[float] = None
    start_time: float = field(default_factory=time.time)
    tasks: Dict[TaskId, Task] = field(default_factory=dict)
    vod_downloaded_count: int = 0
    samples: Deque[Sample] = field(default_factory=lambda: deque(maxlen=100))
    def start(self, task_id: int, size: int):
        if task_id in self.tasks:
            raise ValueError(f"Task {task_id}: cannot start, already started")

        self.tasks[task_id] = Task(task_id, size)
        self._calculate_total()
        self._calculate_progress()
        self.print()

    def advance(self, task_id: int, size: int):
        if task_id not in self.tasks:
            raise ValueError(f"Task {task_id}: cannot advance, not started")

        self.downloaded += size
        self.progress_bytes += size
        self.tasks[task_id].advance(size)
        self.samples.append(Sample(self.downloaded, time.time()))
        self._calculate_progress()
        self.print()

    def already_downloaded(self, task_id: int, size: int):
        if task_id in self.tasks:
            raise ValueError(f"Task {task_id}: cannot mark as downloaded, already started")

        self.tasks[task_id] = Task(task_id, size)
        self.progress_bytes += size
        self.vod_downloaded_count += 1
        self._calculate_total()
        self._calculate_progress()
        self.print()

    def abort(self, task_id: int):
        if task_id not in self.tasks:
            raise ValueError(f"Task {task_id}: cannot abort, not started")

        del self.tasks[task_id]
        self.progress_bytes = sum(t.downloaded for t in self.tasks.values())
        self._calculate_total()
        self._calculate_progress()
        self.print()

    def end(self, task_id: int):
        if task_id not in self.tasks:
            raise ValueError(f"Task {task_id}: cannot end, not started")

        task = self.tasks[task_id]
        if task.size != task.downloaded:
            logger.warning(f"Task {task_id} ended with {task.downloaded}b downloaded, expected {task.size}b.")

        self.vod_downloaded_count += 1
        self.print()
    def _calculate_total(self):
        self.estimated_total = int(mean(t.size for t in self.tasks.values()) * self.vod_count) if self.tasks else None

    def _calculate_progress(self):
        self.speed = self._calculate_speed()
        self.progress_perc = int(100 * self.progress_bytes / self.estimated_total) if self.estimated_total else 0
        self.remaining_time = int((self.estimated_total - self.progress_bytes) / self.speed) if self.estimated_total and self.speed else None

    def _calculate_speed(self):
        if len(self.samples) < 2:
            return None

        first_sample = self.samples[0]
        last_sample = self.samples[-1]

        size = last_sample.downloaded - first_sample.downloaded
        duration = last_sample.timestamp - first_sample.timestamp

        # Two samples can share a timestamp on coarse clocks; avoid dividing by zero
        return size / duration if duration > 0 else None
    def print(self):
        now = time.time()

        # Don't print more often than 10 times per second
        if now - self.last_printed < 0.1:
            return

        progress = " ".join([
            f"Downloaded {self.vod_downloaded_count}/{self.vod_count} VODs",
            f"<blue>{self.progress_perc}%</blue>",
            f"of <blue>~{format_size(self.estimated_total)}</blue>" if self.estimated_total else "",
            f"at <blue>{format_size(self.speed)}/s</blue>" if self.speed else "",
            f"ETA <blue>{format_time(self.remaining_time)}</blue>" if self.remaining_time is not None else "",
        ])

        print_out(f"\r{progress} ", end="")
        self.last_printed = now
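
The speed tracking in `Progress` relies on `deque(maxlen=100)` silently dropping the oldest sample, so the reported speed covers only a recent window rather than the whole download. A small sketch of that idea follows, assuming a window of 3 instead of 100 to keep the numbers readable. `window_speed` is a hypothetical helper, and `Sample` is redefined here only to keep the block self-contained.

```python
from collections import deque
from typing import Deque, NamedTuple, Optional


class Sample(NamedTuple):
    downloaded: int   # cumulative bytes downloaded so far
    timestamp: float  # seconds


def window_speed(samples: Deque[Sample]) -> Optional[float]:
    """Bytes per second between the oldest and newest sample still in the window."""
    if len(samples) < 2:
        return None
    first, last = samples[0], samples[-1]
    duration = last.timestamp - first.timestamp
    if duration <= 0:
        return None
    return (last.downloaded - first.downloaded) / duration


samples: Deque[Sample] = deque(maxlen=3)  # the diff uses maxlen=100
for downloaded, timestamp in [(0, 0.0), (100, 1.0), (200, 2.0), (600, 3.0)]:
    samples.append(Sample(downloaded, timestamp))

# The first sample has been evicted, so only the last two seconds count.
print(window_speed(samples))  # 250.0
```

Over the whole download the average would be 600 bytes in 3 seconds, or 200 bytes/s. The windowed figure of 250 bytes/s reacts faster to the recent speed-up, which matches the intent stated in the "Improve speed tracking" commit message.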


@@ -2,9 +2,9 @@
 Twitch API access.
 """
-import requests
+import httpx
-from requests.exceptions import HTTPError
+from typing import Dict
 from twitchdl import CLIENT_ID
 from twitchdl.exceptions import ConsoleError
@@ -15,25 +15,10 @@ class GQLError(Exception):
         self.errors = errors
-def authenticated_get(url, params={}, headers={}):
-    headers['Client-ID'] = CLIENT_ID
-    response = requests.get(url, params, headers=headers)
-    if 400 <= response.status_code < 500:
-        data = response.json()
-        # TODO: this does not look nice in the console since data["message"]
-        # can contain a JSON encoded object.
-        raise ConsoleError(data["message"])
-    response.raise_for_status()
-    return response
 def authenticated_post(url, data=None, json=None, headers={}):
     headers['Client-ID'] = CLIENT_ID
-    response = requests.post(url, data=data, json=json, headers=headers)
+    response = httpx.post(url, data=data, json=json, headers=headers)
     if response.status_code == 400:
         data = response.json()
         raise ConsoleError(data["message"])
@@ -53,7 +38,7 @@ def gql_post(query):
     return response
-def gql_query(query, headers={}):
+def gql_query(query: str, headers: Dict[str, str] = {}):
     url = "https://gql.twitch.tv/gql"
     response = authenticated_post(url, json={"query": query}, headers=headers).json()
@@ -330,7 +315,7 @@ def get_access_token(video_id, auth_token=None):
     try:
         response = gql_query(query, headers=headers)
         return response["data"]["videoPlaybackAccessToken"]
-    except HTTPError as error:
+    except httpx.HTTPStatusError as error:
         # Provide a more useful error message when server returns HTTP 401
         # Unauthorized while using a user-provided auth token.
         if error.response.status_code == 401:
@@ -351,7 +336,7 @@ def get_playlists(video_id, access_token):
     """
     url = "http://usher.twitch.tv/vod/{}".format(video_id)
-    response = requests.get(url, params={
+    response = httpx.get(url, params={
         "nauth": access_token['value'],
         "nauthsig": access_token['signature'],
         "allow_audio_only": "true",


@@ -40,6 +40,19 @@ def format_duration(total_seconds):
     return "{} sec".format(seconds)
+def format_time(total_seconds):
+    total_seconds = int(total_seconds)
+    hours = total_seconds // 3600
+    remainder = total_seconds % 3600
+    minutes = remainder // 60
+    seconds = total_seconds % 60
+    if hours:
+        return f"{hours:02}:{minutes:02}:{seconds:02}"
+    return f"{minutes:02}:{seconds:02}"
 def read_int(msg, min, max, default):
     msg = msg + " [default {}]: ".format(default)