Compare commits

34 Commits

Author SHA1 Message Date
c0a7ea1f27 wip 2022-08-27 15:26:54 +02:00
afe38b84cd Extract Game dataclass 2022-08-27 12:51:28 +02:00
aed0b993a7 Add dataclasses for clips 2022-08-27 12:01:23 +02:00
599b7783d0 Some more types 2022-08-27 11:59:50 +02:00
9cc7c05d8a Expand tests 2022-08-27 11:58:19 +02:00
98d2ce0bc7 Add dependency on pytest-cov 2022-08-27 11:57:57 +02:00
662ce72195 Accept newer versions of m3u8 lib 2022-08-20 13:25:00 +02:00
a4b2434735 Start adding types 2022-08-20 13:25:00 +02:00
280a284fb2 Expand clips tests 2022-08-20 11:16:47 +02:00
235b13c257 Remove unused function 2022-08-20 11:10:47 +02:00
8e3a41e415 Expand tests 2022-08-19 09:35:56 +02:00
cacf921923 Increase default limit for compact mode 2022-08-18 10:06:23 +02:00
4d19f09065 Add compact videos display 2022-08-18 10:04:04 +02:00
f289c93305 Bump version, set release date 2022-08-18 10:04:04 +02:00
b43c9dc9b9 Use double quotes please 2022-08-18 09:36:56 +02:00
a0e808660a Require python 3.7 2022-08-18 09:36:46 +02:00
a14ce57f95 Decrease default number of workers to 5 2022-08-18 09:30:35 +02:00
c4f4935b96 Enable downloading multiple videos successively 2022-08-18 09:12:25 +02:00
c8d38b5512 Update changelog 2022-08-17 11:07:04 +02:00
8be0aba95d Add support for version description in changelog 2022-08-17 11:07:04 +02:00
71ae2bf906 Add TODO 2022-08-17 09:22:16 +02:00
c8a6d67822 Improve speed tracking
Instead of calculating the average speed for the whole download,
consider only the last 100 chunks.
2022-08-17 08:35:57 +02:00
5c380084ba Update changelog 2022-08-15 07:31:25 +02:00
51a35ab494 Remove overly verbose logging 2022-08-15 07:14:53 +02:00
7ca71ddeca Delete egg-info on clean 2022-08-15 07:13:02 +02:00
f40fd290f7 Replace requests with httpx, remove unused code 2022-08-15 07:12:10 +02:00
b03c19dac1 Improve visuals
I never liked cyan anyway
2022-08-14 11:33:38 +02:00
cd445674e5 Download chunks to a temp file first 2022-08-14 11:33:23 +02:00
721d78377e Add rate limiting to download 2022-08-14 11:13:11 +02:00
ac07006ae7 Limit number of prints per second 2022-08-14 11:04:53 +02:00
32a68395d5 Use async downloader 2022-08-14 11:02:29 +02:00
81846764a1 Don't download already downloaded files 2022-08-14 10:21:38 +02:00
23f1a74aa6 Add new asyncio downloader code with rate limiting 2022-08-13 11:41:13 +02:00
85631c8ce5 Extract progress tracking 2022-08-13 09:40:18 +02:00
30 changed files with 968 additions and 398 deletions

View File

@ -3,6 +3,26 @@ twitch-dl changelog
<!-- Do not edit. This file is automatically generated from changelog.yaml.-->
### [2.0.0 (2022-08-18)](https://github.com/ihabunek/twitch-dl/releases/tag/2.0.0)
This release switches from using `requests` to `httpx` for making http requests,
and from threads to `asyncio` for concurrency. This enables easier
implementation of new features, but has no breaking changes for the CLI.
* **BREAKING**: Require Python 3.7 or later.
* Add `--rate-limit` option to `download` for limiting maximum bandwidth when
downloading.
* Add `--compact` option to `download` for displaying one video per line.
* Allow passing multiple video ids to `download` to download multiple videos
successively.
* Improved progress meter, updates on each chunk downloaded, instead of each VOD
downloaded.
* Improved speed estimate, displays recent speed instead of average speed since
the start of download.
* Decreased default concurrent downloads to 5. This seems to be enough to
saturate the download link in most cases. You can override this by setting the
`-w` option. Please test and report back if this works for you.
### [1.22.0 (2022-06-25)](https://github.com/ihabunek/twitch-dl/releases/tag/1.22.0)
* Add support for downloading subscriber-only VODs (#48, thanks @cemiu)
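
The improved speed estimate listed in the 2.0.0 notes (recent speed instead of the average since the start of the download) can be illustrated with a sliding window over the most recent chunks. This is a minimal sketch rather than the code from this changeset: the class name `RecentSpeed`, its methods and the injectable `now` argument are hypothetical, and only the window of 100 chunks is taken from the commit message.

```python
import time
from collections import deque
from typing import Deque, Optional, Tuple


class RecentSpeed:
    """Estimate download speed from the last `window` chunks only (hypothetical helper)."""

    def __init__(self, window: int = 100):
        # Each sample is (timestamp, chunk_size); old samples fall off automatically.
        self._samples: Deque[Tuple[float, int]] = deque(maxlen=window)

    def add(self, chunk_size: int, now: Optional[float] = None) -> None:
        timestamp = time.monotonic() if now is None else now
        self._samples.append((timestamp, chunk_size))

    def speed(self) -> Optional[float]:
        """Return bytes per second over the window, or None if it cannot be computed yet."""
        if len(self._samples) < 2:
            return None
        elapsed = self._samples[-1][0] - self._samples[0][0]
        if elapsed <= 0:
            return None
        # Count only bytes that arrived after the first sample's timestamp.
        received = sum(size for _, size in list(self._samples)[1:])
        return received / elapsed
```

Because old samples drop out of the window, a stall or a burst early in the download stops influencing the displayed speed once 100 newer chunks have arrived.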

View File

@ -8,7 +8,7 @@ dist :
clean :
find . -name "*pyc" | xargs rm -rf $1
rm -rf build dist bundle MANIFEST htmlcov deb_dist twitch-dl.*.pyz twitch-dl.1.man
rm -rf build dist bundle MANIFEST htmlcov deb_dist twitch-dl.*.pyz twitch-dl.1.man twitch_dl.egg-info
bundle:
mkdir bundle

View File

@ -17,7 +17,7 @@ Resources
Requirements
------------
* Python 3.5 or later
* Python 3.7 or later
* [ffmpeg](https://ffmpeg.org/download.html), installed and on the system path
Quick start

8
TODO.md Normal file
View File

@ -0,0 +1,8 @@
TODO
====
Some ideas for what to do next.
* gracefully handle aborting the download with Ctrl+C; currently it prints out an error stack trace
* add keyboard control for e.g. pausing a download
* test how worker count affects download speeds on low and high-bandwidth links (see https://github.com/ihabunek/twitch-dl/issues/104), adjust default worker count

View File

@ -1,3 +1,22 @@
2.0.0:
date: 2022-08-18
description: |
This release switches from using `requests` to `httpx` for making http
requests, and from threads to `asyncio` for concurrency. This enables
easier implementation of new features, but has no breaking changes for the
CLI.
changes:
- "**BREAKING**: Require Python 3.7 or later."
- "Add `--rate-limit` option to `download` for limiting maximum bandwidth when downloading."
- "Add `--compact` option to `download` for displaying one video per line."
- "Allow passing multiple video ids to `download` to download multiple videos successively."
- "Improved progress meter, updates on each chunk downloaded, instead of each VOD downloaded."
- "Improved speed estimate, displays recent speed instead of average speed since the start of download."
- |
Decreased default concurrent downloads to 5. This seems to be enough to
saturate the download link in most cases. You can override this by setting
the `-w` option. Please test and report back if this works for you.
1.22.0:
date: 2022-06-25
changes:

View File

@ -3,6 +3,26 @@ twitch-dl changelog
<!-- Do not edit. This file is automatically generated from changelog.yaml.-->
### [2.0.0 (2022-08-18)](https://github.com/ihabunek/twitch-dl/releases/tag/2.0.0)
This release switches from using `requests` to `httpx` for making http requests,
and from threads to `asyncio` for concurrency. This enables easier
implementation of new features, but has no breaking changes for the CLI.
* **BREAKING**: Require Python 3.7 or later.
* Add `--rate-limit` option to `download` for limiting maximum bandwidth when
downloading.
* Add `--compact` option to `download` for displaying one video per line.
* Allow passing multiple video ids to `download` to download multiple videos
successively.
* Improved progress meter, updates on each chunk downloaded, instead of each VOD
downloaded.
* Improved speed estimate, displays recent speed instead of average speed since
the start of download.
* Decreased default concurrent downloads to 5. This seems to be enough to
saturate the download link in most cases. You can override this by setting the
`-w` option. Please test and report back if this works for you.
### [1.22.0 (2022-06-25)](https://github.com/ihabunek/twitch-dl/releases/tag/1.22.0)
* Add support for downloading subscriber-only VODs (#48, thanks @cemiu)
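
The move from threads to `asyncio` with a bounded number of workers, described in the 2.0.0 notes, can be sketched with a semaphore. This is an illustration under stated assumptions, not the `download_all` implementation (which is not shown in this diff): `run_limited` and `FakeDownloader` are hypothetical names, and the network request is replaced by a short sleep so the example runs offline.

```python
import asyncio
import functools
from typing import Awaitable, Callable, List


async def run_limited(jobs: List[Callable[[], Awaitable[int]]], max_workers: int) -> List[int]:
    """Run all jobs concurrently, but never more than `max_workers` at the same time."""
    semaphore = asyncio.Semaphore(max_workers)

    async def worker(job: Callable[[], Awaitable[int]]) -> int:
        async with semaphore:
            return await job()

    # gather() returns results in the same order as the jobs were given.
    return await asyncio.gather(*(worker(job) for job in jobs))


class FakeDownloader:
    """Stand-in for a real HTTP download; records peak concurrency."""

    def __init__(self) -> None:
        self.active = 0
        self.peak = 0

    async def fetch(self, size: int) -> int:
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.01)  # pretend to wait for the network
        self.active -= 1
        return size


def demo():
    downloader = FakeDownloader()
    jobs = [functools.partial(downloader.fetch, n) for n in range(20)]
    sizes = asyncio.run(run_limited(jobs, max_workers=5))
    return sizes, downloader.peak
```

With 20 jobs and `max_workers=5`, the recorded peak concurrency should not exceed 5, which mirrors the new default worker count.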

View File

@ -1,12 +1,12 @@
<!-- ------------------- generated docs start ------------------- -->
# twitch-dl download
Download a video or clip.
Download videos or clips.
### USAGE
```
twitch-dl download <video> [FLAGS] [OPTIONS]
twitch-dl download <videos> [FLAGS] [OPTIONS]
```
### ARGUMENTS
@ -14,8 +14,8 @@ twitch-dl download <video> [FLAGS] [OPTIONS]
<table>
<tbody>
<tr>
<td class="code">&lt;video&gt;</td>
<td>Video ID, clip slug, or URL</td>
<td class="code">&lt;videos&gt;</td>
<td>One or more video IDs, clip slugs or Twitch URLs to download.</td>
</tr>
</tbody>
</table>
@ -47,7 +47,7 @@ twitch-dl download <video> [FLAGS] [OPTIONS]
<tbody>
<tr>
<td class="code">-w, --max-workers</td>
<td>Maximal number of threads for downloading vods concurrently (default 20)</td>
<td>Number of workers for downloading vods concurrently (default 5)</td>
</tr>
<tr>
@ -79,6 +79,11 @@ twitch-dl download <video> [FLAGS] [OPTIONS]
<td class="code">-o, --output</td>
<td>Output file name template. See docs for details.</td>
</tr>
<tr>
<td class="code">-r, --rate-limit</td>
<td>Limit the maximum download speed in bytes per second. Use &#x27;k&#x27; and &#x27;m&#x27; suffixes for kilobytes and megabytes per second.</td>
</tr>
</tbody>
</table>
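
One common way to enforce a limit like `--rate-limit` is a token bucket. The sketch below is an assumption about how such a limit could work, not the downloader code from this changeset (which is not part of this diff); `TokenBucket` and `delay_for` are hypothetical names, and the clock is passed in explicitly so the behaviour is deterministic.

```python
from typing import Optional


class TokenBucket:
    """Token bucket limiter: `rate` bytes per second, bursts up to `capacity` bytes."""

    def __init__(self, rate: int, capacity: Optional[int] = None):
        self.rate = rate
        self.capacity = capacity if capacity is not None else rate
        self.tokens = float(self.capacity)
        self.updated: Optional[float] = None

    def delay_for(self, size: int, now: float) -> float:
        """Consume `size` bytes at time `now`; return how many seconds the caller should wait."""
        if self.updated is not None:
            # Refill in proportion to elapsed time, capped at the bucket capacity.
            elapsed = now - self.updated
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        self.tokens -= size
        if self.tokens >= 0:
            return 0.0
        # The bucket is overdrawn: wait until the deficit has been refilled.
        return -self.tokens / self.rate
```

In an async downloader the caller would typically do something like `await asyncio.sleep(bucket.delay_for(len(chunk), time.monotonic()))` after each chunk.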
@ -111,6 +116,12 @@ Setting quality to `audio_only` will download only audio:
twitch-dl download -q audio_only 221837124
```
Download multiple videos one after the other:
```
twitch-dl download 1559928295 1557034274 1555157293 -q source
```
### Overriding the target file name
The target filename can be defined by passing the `--output` option followed by
@ -172,4 +183,4 @@ download command:
```
twitch-dl download 221837124 --auth-token iduetx4i107rn4b9wrgctf590aiktv
```
```

View File

@ -33,6 +33,11 @@ twitch-dl videos <channel_name> [FLAGS] [OPTIONS]
<td class="code">-j, --json</td>
<td>Show results as JSON. Ignores <code>--pager</code>.</td>
</tr>
<tr>
<td class="code">-c, --compact</td>
<td>Show videos in compact mode, one line per video</td>
</tr>
</tbody>
</table>
@ -47,7 +52,7 @@ twitch-dl videos <channel_name> [FLAGS] [OPTIONS]
<tr>
<td class="code">-l, --limit</td>
<td>Number of videos to fetch. Defaults to 10.</td>
<td>Number of videos to fetch. Defaults to 40 in compact mode, 10 otherwise.</td>
</tr>
<tr>

View File

@ -1,6 +1,6 @@
# Installation
twitch-dl requires **Python 3.5** or later.
twitch-dl requires **Python 3.7** or later.
## Prerequisite: FFmpeg

View File

@ -1,4 +1,5 @@
pytest
pytest-cov
twine
wheel
pyyaml

View File

@ -21,6 +21,13 @@ for version in data.keys():
changes = data[version]["changes"]
print(f"### [{version} ({date})](https://github.com/ihabunek/twitch-dl/releases/tag/{version})")
print()
if "description" in data[version]:
description = data[version]["description"].strip()
for line in textwrap.wrap(description, 80):
print(line)
print()
for c in changes:
lines = textwrap.wrap(c, 78)
initial = True

View File

@ -44,14 +44,18 @@ if dist_version != version:
release_date = changelog_item["date"]
changes = changelog_item["changes"]
description = changelog_item["description"] if "description" in changelog_item else None
if not isinstance(release_date, date):
print(f"Release date not set for version `{version}` in the changelog.", file=sys.stderr)
sys.exit(1)
commit_message = f"twitch-dl {version}\n\n"
if description:
lines = textwrap.wrap(description.strip(), 72)
commit_message += "\n".join(lines) + "\n\n"
for c in changes:
lines = textwrap.wrap(c, 70)
lines = textwrap.wrap(c, 69)
initial = True
for line in lines:
lead = " *" if initial else " "
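
The wrapping logic in these scripts (wrap each change to a reduced width, then prefix the first line with a bullet and the rest with spaces) can be shown in isolation. This is a small sketch of the same idea, with a hypothetical function name `format_change` and an assumed total width of 72 columns.

```python
import textwrap
from typing import List


def format_change(change: str, width: int = 72) -> List[str]:
    """Wrap one changelog entry as a bullet with a hanging indent."""
    lead_first = " * "
    lead_rest = "   "
    # Reserve room for the lead so the final lines stay within `width`.
    lines = textwrap.wrap(change, width - len(lead_first))
    return [
        (lead_first if index == 0 else lead_rest) + line
        for index, line in enumerate(lines)
    ]


if __name__ == "__main__":
    entry = (
        "Add `--rate-limit` option to `download` for limiting maximum "
        "bandwidth when downloading."
    )
    print("\n".join(format_change(entry)))
```

Subtracting the lead width before wrapping is what keeps the bulleted lines within the target column count.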

View File

@ -10,35 +10,33 @@ makes it faster.
"""
setup(
name='twitch-dl',
version='1.22.0',
description='Twitch downloader',
name="twitch-dl",
version="2.0.0",
description="Twitch downloader",
long_description=long_description.strip(),
author='Ivan Habunek',
author_email='ivan@habunek.com',
url='https://github.com/ihabunek/twitch-dl/',
author="Ivan Habunek",
author_email="ivan@habunek.com",
url="https://github.com/ihabunek/twitch-dl/",
project_urls={
"Documentation": "https://twitch-dl.bezdomni.net/"
},
keywords='twitch vod video download',
license='GPLv3',
keywords="twitch vod video download",
license="GPLv3",
classifiers=[
'Development Status :: 5 - Production/Stable',
'License :: OSI Approved :: GNU General Public License v3 (GPLv3)',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
"Development Status :: 5 - Production/Stable",
"Environment :: Console",
"License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
"Programming Language :: Python :: 3",
],
packages=find_packages(),
python_requires='>=3.5',
python_requires=">=3.7",
install_requires=[
"m3u8>=1.0.0,<2.0.0",
"requests>=2.13,<3.0",
"m3u8>=1.0.0,<4.0.0",
"httpx>=0.17.0,<1.0.0",
],
entry_points={
'console_scripts': [
'twitch-dl=twitchdl.console:main',
"console_scripts": [
"twitch-dl=twitchdl.console:main",
],
}
)

View File

@ -2,29 +2,57 @@
These tests depend on the channel having some videos and clips published.
"""
import httpx
import m3u8
from twitchdl import twitch
from twitchdl.commands.download import _parse_playlists, get_clip_authenticated_url
from twitchdl.models import Game, VideosPage
TEST_CHANNEL = "bananasaurus_rex"
def test_get_videos():
videos = twitch.get_channel_videos(TEST_CHANNEL, 3, "time")
assert videos["pageInfo"]
assert len(videos["edges"]) > 0
page = twitch.get_channel_videos(TEST_CHANNEL, 3, "time")
assert isinstance(page, VideosPage)
assert len(page.videos) > 0
video_id = videos["edges"][0]["node"]["id"]
video_id = page.videos[0].id
video = twitch.get_video(video_id)
assert video["id"] == video_id
assert video and video.id == video_id
access_token = twitch.get_access_token(video_id)
assert "signature" in access_token
assert "value" in access_token
playlists = twitch.get_playlists(video_id, access_token)
assert playlists.startswith("#EXTM3U")
_, _, url = next(_parse_playlists(playlists))
playlist = httpx.get(url).text
assert playlist.startswith("#EXTM3U")
playlist = m3u8.loads(playlist)
vod_path = playlist.segments[0].uri
assert vod_path == "0.ts"
def test_get_clips():
"""
This test depends on the channel having some clips published.
"""
clips = twitch.get_channel_clips(TEST_CHANNEL, "all_time", 3)
assert clips["pageInfo"]
assert len(clips["edges"]) > 0
page = twitch.get_channel_clips(TEST_CHANNEL, "all_time", 3)
assert len(page.clips) > 0
clip_slug = clips["edges"][0]["node"]["slug"]
clip = twitch.get_clip(clip_slug)
assert clip["slug"] == clip_slug
slug = page.clips[0].slug
clip = twitch.get_clip(slug)
assert clip.slug == slug
assert get_clip_authenticated_url(slug, "source").startswith("https")
def test_get_game():
game = twitch.find_game("The Witness")
assert isinstance(game, Game)
assert game.id == "17324"
assert game.name == "The Witness"
assert game.description
game = twitch.find_game("Does Not Exist Hopefully")
assert game is None
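
The tests above rely on attribute access (`game.id`, `game.name`) instead of dictionary lookups. The following sketch shows how a dataclass could wrap a raw API response; it is not the contents of `twitchdl.models`, which is not included in this diff. The names `GameSketch` and `game_from_json`, the JSON keys and the presence of a `raw` field on the game are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class GameSketch:
    """Hypothetical typed wrapper around a game entry returned by the API."""
    id: str
    name: str
    description: str
    raw: Dict[str, Any]  # original payload, kept for JSON output


def game_from_json(data: Optional[Dict[str, Any]]) -> Optional[GameSketch]:
    """Convert a raw dict to a GameSketch, or return None when nothing was found."""
    if not data:
        return None
    return GameSketch(
        id=data["id"],
        name=data["name"],
        description=data.get("description", ""),
        raw=data,
    )
```

Keeping the original payload alongside the typed fields is one way to support calls such as `print_json(clip.raw)` while the rest of the code uses attributes.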

102
tests/test_progress.py Normal file
View File

@ -0,0 +1,102 @@
from twitchdl.progress import Progress
def test_initial_values():
progress = Progress(10)
assert progress.downloaded == 0
assert progress.estimated_total is None
assert progress.progress_perc == 0
assert progress.remaining_time is None
assert progress.speed is None
assert progress.vod_count == 10
assert progress.vod_downloaded_count == 0
def test_downloaded():
progress = Progress(3)
progress.start(1, 300)
progress.start(2, 300)
progress.start(3, 300)
assert progress.downloaded == 0
assert progress.progress_bytes == 0
assert progress.progress_perc == 0
progress.advance(1, 100)
assert progress.downloaded == 100
assert progress.progress_bytes == 100
assert progress.progress_perc == 11
progress.advance(2, 200)
assert progress.downloaded == 300
assert progress.progress_bytes == 300
assert progress.progress_perc == 33
progress.advance(3, 150)
assert progress.downloaded == 450
assert progress.progress_bytes == 450
assert progress.progress_perc == 50
progress.advance(1, 50)
assert progress.downloaded == 500
assert progress.progress_bytes == 500
assert progress.progress_perc == 55
progress.abort(2)
assert progress.downloaded == 500
assert progress.progress_bytes == 300
assert progress.progress_perc == 33
progress.start(2, 300)
progress.advance(1, 150)
progress.advance(2, 300)
progress.advance(3, 150)
assert progress.downloaded == 1100
assert progress.progress_bytes == 900
assert progress.progress_perc == 100
progress.end(1)
progress.end(2)
progress.end(3)
assert progress.downloaded == 1100
assert progress.progress_bytes == 900
assert progress.progress_perc == 100
def test_estimated_total():
progress = Progress(3)
assert progress.estimated_total is None
progress.start(1, 12000)
assert progress.estimated_total == 12000 * 3
progress.start(2, 11000)
assert progress.estimated_total == 11500 * 3
progress.start(3, 10000)
assert progress.estimated_total == 11000 * 3
def test_vod_downloaded_count():
progress = Progress(3)
progress.start(1, 100)
progress.start(2, 100)
progress.start(3, 100)
assert progress.vod_downloaded_count == 0
progress.advance(1, 100)
progress.end(1)
assert progress.vod_downloaded_count == 1
progress.advance(2, 100)
progress.end(2)
assert progress.vod_downloaded_count == 2
progress.advance(3, 100)
progress.end(3)
assert progress.vod_downloaded_count == 3
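
The behaviour these tests pin down can be summarised in a small model. The class below is a sketch reconstructed only from the assertions above, not the real `twitchdl.progress.Progress` (whose source is not shown here); the name `ProgressSketch` and the private attributes are hypothetical, and speed and remaining-time tracking are left out.

```python
from typing import Dict, Optional


class ProgressSketch:
    """Minimal model of the byte and percentage bookkeeping exercised by the tests."""

    def __init__(self, vod_count: int):
        self.vod_count = vod_count
        self.downloaded = 0            # every byte received, including aborted attempts
        self.progress_bytes = 0        # bytes that count toward completion
        self.vod_downloaded_count = 0
        self._sizes: Dict[int, int] = {}     # vod id -> expected size
        self._received: Dict[int, int] = {}  # vod id -> bytes in the current attempt

    @property
    def estimated_total(self) -> Optional[int]:
        # Extrapolate from the average size of the VODs started so far.
        if not self._sizes:
            return None
        return int(sum(self._sizes.values()) / len(self._sizes) * self.vod_count)

    @property
    def progress_perc(self) -> int:
        total = self.estimated_total
        return int(100 * self.progress_bytes / total) if total else 0

    def start(self, vod_id: int, size: int) -> None:
        self._sizes[vod_id] = size
        self._received[vod_id] = 0

    def advance(self, vod_id: int, size: int) -> None:
        self.downloaded += size
        self.progress_bytes += size
        self._received[vod_id] = self._received.get(vod_id, 0) + size

    def abort(self, vod_id: int) -> None:
        # Bytes from the failed attempt no longer count as progress.
        self.progress_bytes -= self._received.pop(vod_id, 0)

    def end(self, vod_id: int) -> None:
        self.vod_downloaded_count += 1
```

The distinction between `downloaded` and `progress_bytes` is what lets an aborted VOD roll the percentage back without hiding the bandwidth already spent.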

5
tests/test_twitch.py Normal file
View File

@ -0,0 +1,5 @@
from twitchdl.twitch import channel_clips_generator
# def test_clips_generator():
# channel_clips_generator("foo", "bar", 100)

View File

@ -1,3 +1,3 @@
__version__ = "1.22.0"
__version__ = "2.0.0"
CLIENT_ID = "kimne78kx3ncx6brgo4mv6wki5h1ko"

View File

@ -7,6 +7,7 @@ from os import path
from twitchdl import twitch, utils
from twitchdl.commands.download import get_clip_authenticated_url
from twitchdl.download import download_file
from twitchdl.models import Clip, ClipGenerator
from twitchdl.output import print_out, print_clip, print_json
@ -17,13 +18,12 @@ def clips(args):
generator = twitch.channel_clips_generator(args.channel_name, args.period, limit)
if args.json:
return print_json(list(generator))
return print_json([c.raw for c in generator])
if args.download:
return _download_clips(generator)
if args.pager:
print(args)
return _print_paged(generator, args.pager)
return _print_all(generator, args)
@ -40,38 +40,41 @@ def _continue():
return True
def _target_filename(clip):
url = clip["videoQualities"][0]["sourceURL"]
def _target_filename(clip: Clip):
url = clip.video_qualities[0].source_url
_, ext = path.splitext(url)
ext = ext.lstrip(".")
match = re.search(r"^(\d{4})-(\d{2})-(\d{2})T", clip["createdAt"])
match = re.search(r"^(\d{4})-(\d{2})-(\d{2})T", clip.created_at)
if not match:
raise ValueError(f"Invalid date: {clip.created_at}")
date = "".join(match.groups())
name = "_".join([
date,
clip["id"],
clip["broadcaster"]["login"],
utils.slugify(clip["title"]),
clip.id,
clip.broadcaster.login,
utils.slugify(clip.title),
])
return "{}.{}".format(name, ext)
def _download_clips(generator):
for clip in generator:
def _download_clips(clips: ClipGenerator):
for clip in clips:
target = _target_filename(clip)
if path.exists(target):
print_out("Already downloaded: <green>{}</green>".format(target))
else:
url = get_clip_authenticated_url(clip["slug"], "source")
url = get_clip_authenticated_url(clip.slug, "source")
print_out("Downloading: <yellow>{}</yellow>".format(target))
download_file(url, target)
def _print_all(generator, args):
for clip in generator:
def _print_all(clips: ClipGenerator, args):
for clip in clips:
print_out()
print_clip(clip)
@ -82,8 +85,8 @@ def _print_all(generator, args):
)
def _print_paged(generator, page_size):
iterator = iter(generator)
def _print_paged(clips: ClipGenerator, page_size: int):
iterator = iter(clips)
page = list(islice(iterator, page_size))
first = 1

View File

@ -1,17 +1,22 @@
import asyncio
import httpx
import m3u8
import os
import re
import requests
import shutil
import subprocess
import tempfile
from os import path
from pathlib import Path
from typing import List, Optional, OrderedDict
from urllib.parse import urlparse, urlencode
from twitchdl import twitch, utils
from twitchdl.download import download_file, download_files
from twitchdl.download import download_file
from twitchdl.exceptions import ConsoleError
from twitchdl.http import download_all
from twitchdl.models import Clip, Video
from twitchdl.output import print_out
@ -56,13 +61,13 @@ def _select_playlist_interactive(playlists):
return uri
def _join_vods(playlist_path, target, overwrite, video):
def _join_vods(playlist_path: str, target: str, overwrite: bool, video: Video):
command = [
"ffmpeg",
"-i", playlist_path,
"-c", "copy",
"-metadata", "artist={}".format(video["creator"]["displayName"]),
"-metadata", "title={}".format(video["title"]),
"-metadata", f"artist={video.creator.display_name}",
"-metadata", f"title={video.title}",
"-metadata", "encoded_by=twitch-dl",
"-stats",
"-loglevel", "warning",
@ -78,22 +83,22 @@ def _join_vods(playlist_path, target, overwrite, video):
raise ConsoleError("Joining files failed")
def _video_target_filename(video, args):
date, time = video['publishedAt'].split("T")
game = video["game"]["name"] if video["game"] else "Unknown"
def _video_target_filename(video: Video, args) -> str:
date, time = video.published_at.split("T")
game = video.game.name if video.game else "Unknown"
subs = {
"channel": video["creator"]["displayName"],
"channel_login": video["creator"]["login"],
"channel": video.creator.display_name,
"channel_login": video.creator.login,
"date": date,
"datetime": video["publishedAt"],
"datetime": video.published_at,
"format": args.format,
"game": game,
"game_slug": utils.slugify(game),
"id": video["id"],
"id": video.id,
"time": time,
"title": utils.titlify(video["title"]),
"title_slug": utils.slugify(video["title"]),
"title": utils.titlify(video.title),
"title_slug": utils.slugify(video.title),
}
try:
@ -103,27 +108,27 @@ def _video_target_filename(video, args):
raise ConsoleError("Invalid key {} used in --output. Supported keys are: {}".format(e, supported))
def _clip_target_filename(clip, args):
date, time = clip["createdAt"].split("T")
game = clip["game"]["name"] if clip["game"] else "Unknown"
def _clip_target_filename(clip: Clip, args) -> str:
date, time = clip.created_at.split("T")
game = clip.game.name if clip.game else "Unknown"
url = clip["videoQualities"][0]["sourceURL"]
url = clip.video_qualities[0].source_url
_, ext = path.splitext(url)
ext = ext.lstrip(".")
subs = {
"channel": clip["broadcaster"]["displayName"],
"channel_login": clip["broadcaster"]["login"],
"channel": clip.broadcaster.display_name,
"channel_login": clip.broadcaster.login,
"date": date,
"datetime": clip["createdAt"],
"datetime": clip.created_at,
"format": ext,
"game": game,
"game_slug": utils.slugify(game),
"id": clip["id"],
"slug": clip["slug"],
"id": clip.id,
"slug": clip.slug,
"time": time,
"title": utils.titlify(clip["title"]),
"title_slug": utils.slugify(clip["title"]),
"title": utils.titlify(clip.title),
"title_slug": utils.slugify(clip.title),
}
try:
@ -133,7 +138,7 @@ def _clip_target_filename(clip, args):
raise ConsoleError("Invalid key {} used in --output. Supported keys are: {}".format(e, supported))
def _get_vod_paths(playlist, start, end):
def _get_vod_paths(playlist, start: Optional[int], end: Optional[int]) -> List[str]:
"""Extract unique VOD paths for download from playlist."""
files = []
vod_start = 0
@ -153,7 +158,7 @@ def _get_vod_paths(playlist, start, end):
return files
def _crete_temp_dir(base_uri):
def _crete_temp_dir(base_uri: str) -> str:
"""Create a temp dir to store downloads if it doesn't exist."""
path = urlparse(base_uri).path.lstrip("/")
temp_dir = Path(tempfile.gettempdir(), "twitch-dl", path)
@ -162,18 +167,23 @@ def _crete_temp_dir(base_uri):
def download(args):
video_id = utils.parse_video_identifier(args.video)
for video_id in args.videos:
download_one(video_id, args)
def download_one(video: str, args):
video_id = utils.parse_video_identifier(video)
if video_id:
return _download_video(video_id, args)
clip_slug = utils.parse_clip_identifier(args.video)
clip_slug = utils.parse_clip_identifier(video)
if clip_slug:
return _download_clip(clip_slug, args)
raise ConsoleError("Invalid input: {}".format(args.video))
raise ConsoleError("Invalid input: {}".format(video))
def _get_clip_url(clip, quality):
def _get_clip_url(clip, quality) -> str:
qualities = clip["videoQualities"]
# Quality given as an argument
@ -201,7 +211,7 @@ def _get_clip_url(clip, quality):
return selected_quality["sourceURL"]
def get_clip_authenticated_url(slug, quality):
def get_clip_authenticated_url(slug: str, quality: str) -> str:
print_out("<dim>Fetching access token...</dim>")
access_token = twitch.get_clip_access_token(slug)
@ -218,19 +228,19 @@ def get_clip_authenticated_url(slug, quality):
return "{}?{}".format(url, query)
def _download_clip(slug, args):
def _download_clip(slug: str, args):
print_out("<dim>Looking up clip...</dim>")
clip = twitch.get_clip(slug)
game = clip["game"]["name"] if clip["game"] else "Unknown"
if not clip:
raise ConsoleError("Clip '{}' not found".format(slug))
game = clip.game.name if clip.game else "Unknown"
print_out("Found: <green>{}</green> by <yellow>{}</yellow>, playing <blue>{}</blue> ({})".format(
clip["title"],
clip["broadcaster"]["displayName"],
clip.title,
clip.broadcaster.display_name,
game,
utils.format_duration(clip["durationSeconds"])
utils.format_duration(clip.duration_seconds)
))
target = _clip_target_filename(clip, args)
@ -251,7 +261,7 @@ def _download_clip(slug, args):
print_out("Downloaded: <blue>{}</blue>".format(target))
def _download_video(video_id, args):
def _download_video(video_id, args) -> None:
if args.start and args.end and args.end <= args.start:
raise ConsoleError("End time must be greater than start time")
@ -261,8 +271,8 @@ def _download_video(video_id, args):
if not video:
raise ConsoleError("Video {} not found".format(video_id))
print_out("Found: <blue>{}</blue> by <yellow>{}</yellow>".format(
video['title'], video['creator']['displayName']))
creator = f" by <yellow>{video.creator.display_name}</yellow>" if video.creator else ""
print_out(f"Found: <blue>{video.title}</blue>{creator}")
target = _video_target_filename(video, args)
print_out("Output: <blue>{}</blue>".format(target))
@ -283,7 +293,7 @@ def _download_video(video_id, args):
else _select_playlist_interactive(playlists))
print_out("<dim>Fetching playlist...</dim>")
response = requests.get(playlist_uri)
response = httpx.get(playlist_uri)
response.raise_for_status()
playlist = m3u8.loads(response.text)
@ -299,11 +309,15 @@ def _download_video(video_id, args):
print_out("\nDownloading {} VODs using {} workers to {}".format(
len(vod_paths), args.max_workers, target_dir))
path_map = download_files(base_uri, target_dir, vod_paths, args.max_workers)
sources = [base_uri + path for path in vod_paths]
targets = [os.path.join(target_dir, "{:05d}.ts".format(k)) for k, _ in enumerate(vod_paths)]
asyncio.run(download_all(sources, targets, args.max_workers, rate_limit=args.rate_limit))
# Make a modified playlist which references downloaded VODs
# Keep only the downloaded segments and skip the rest
org_segments = playlist.segments.copy()
path_map = OrderedDict(zip(vod_paths, targets))
playlist.segments.clear()
for segment in org_segments:
if segment.uri in path_map:

View File

@ -2,6 +2,7 @@ import m3u8
from twitchdl import utils, twitch
from twitchdl.exceptions import ConsoleError
from twitchdl.models import Clip, Video
from twitchdl.output import print_video, print_clip, print_json, print_out, print_log
@ -35,7 +36,7 @@ def info(args):
raise ConsoleError("Clip {} not found".format(clip_slug))
if args.json:
print_json(clip)
print_json(clip.raw)
else:
clip_info(clip)
return
@ -43,7 +44,7 @@ def info(args):
raise ConsoleError("Invalid input: {}".format(args.video))
def video_info(video, playlists):
def video_info(video: Video, playlists):
print_out()
print_video(video)
@ -53,10 +54,11 @@ def video_info(video, playlists):
print_out("<b>{}</b> {}".format(p.stream_info.video, p.uri))
def video_json(video, playlists):
def video_json(video: Video, playlists):
playlists = m3u8.loads(playlists).playlists
json = video.raw
video["playlists"] = [
json["playlists"] = [
{
"bandwidth": p.stream_info.bandwidth,
"resolution": p.stream_info.resolution,
@ -66,14 +68,14 @@ def video_json(video, playlists):
} for p in playlists
]
print_json(video)
print_json(json)
def clip_info(clip):
def clip_info(clip: Clip):
print_out()
print_clip(clip)
print_out()
print_out("Download links:")
for q in clip["videoQualities"]:
print_out("<b>{quality}p{frameRate}</b> {sourceURL}".format(**q))
for q in clip.video_qualities:
print_out(f"<b>{q.quality}p{q.frame_rate}</b> {q.source_url}")

View File

@ -2,13 +2,17 @@ import sys
from twitchdl import twitch
from twitchdl.exceptions import ConsoleError
from twitchdl.output import print_out, print_paged_videos, print_video, print_json
from twitchdl.output import print_out, print_paged_videos, print_video, print_json, print_video_compact
def videos(args):
game_ids = _get_game_ids(args.game)
# Set different defaults for limit for compact display
limit = args.limit or (40 if args.compact else 10)
# Ignore --limit if --pager or --all are given
max_videos = sys.maxsize if args.all or args.pager else args.limit
max_videos = sys.maxsize if args.all or args.pager else limit
total_count, generator = twitch.channel_videos_generator(
args.channel_name, max_videos, args.sort, args.type, game_ids=game_ids)
@ -18,7 +22,7 @@ def videos(args):
print_json({
"count": len(videos),
"totalCount": total_count,
"videos": videos
"videos": [v.raw for v in videos]
})
return
@ -32,8 +36,11 @@ def videos(args):
count = 0
for video in generator:
print_out()
print_video(video)
if args.compact:
print_video_compact(video)
else:
print_out()
print_video(video)
count += 1
print_out()
@ -53,10 +60,10 @@ def _get_game_ids(names):
game_ids = []
for name in names:
print_out("<dim>Looking up game '{}'...</dim>".format(name))
game_id = twitch.get_game_id(name)
if not game_id:
raise ConsoleError("Game '{}' not found".format(name))
game_ids.append(int(game_id))
print_out(f"<dim>Looking up game '{name}'...</dim>")
game = twitch.find_game(name)
if not game:
raise ConsoleError(f"Game '{name}' not found")
game_ids.append(int(game.id))
return game_ids

View File

@ -2,9 +2,10 @@
import logging
import sys
import re
from argparse import ArgumentParser, ArgumentTypeError
from collections import namedtuple
from typing import NamedTuple, List, Tuple, Any, Dict
from twitchdl.exceptions import ConsoleError
from twitchdl.output import print_err
@ -12,12 +13,19 @@ from twitchdl.twitch import GQLError
from . import commands, __version__
Command = namedtuple("Command", ["name", "description", "arguments"])
Argument = Tuple[List[str], Dict[str, Any]]
class Command(NamedTuple):
name: str
description: str
arguments: List[Argument]
CLIENT_WEBSITE = 'https://github.com/ihabunek/twitch-dl'
def time(value):
def time(value: str) -> int:
"""Parse a time string (hh:mm or hh:mm:ss) to number of seconds."""
parts = [int(p) for p in value.split(":")]
@ -34,16 +42,34 @@ def time(value):
return hours * 3600 + minutes * 60 + seconds
def pos_integer(value):
def pos_integer(value: str) -> int:
try:
value = int(value)
parsed = int(value)
except ValueError:
raise ArgumentTypeError("must be an integer")
if value < 1:
if parsed < 1:
raise ArgumentTypeError("must be positive")
return value
return parsed
def rate(value: str) -> int:
match = re.search(r"^([0-9]+)(k|m|)$", value, flags=re.IGNORECASE)
if not match:
raise ArgumentTypeError("must be an integer, followed by an optional 'k' or 'm'")
amount = int(match.group(1))
unit = match.group(2).lower()  # the regex is case-insensitive, so normalise "K" and "M"
if unit == "k":
return amount * 1024
if unit == "m":
return amount * 1024 * 1024
return amount
COMMANDS = [
@ -61,9 +87,8 @@ COMMANDS = [
"type": str,
}),
(["-l", "--limit"], {
"help": "Number of videos to fetch. Defaults to 10.",
"help": "Number of videos to fetch. Defaults to 40 in compact mode, 10 otherwise.",
"type": pos_integer,
"default": 10,
}),
(["-a", "--all"], {
"help": "Fetch all videos, overrides --limit",
@ -93,6 +118,11 @@ COMMANDS = [
"nargs": "?",
"const": 10,
}),
(["-c", "--compact"], {
"help": "Show videos in compact mode, one line per video",
"action": "store_true",
"default": False,
}),
],
),
Command(
@ -139,17 +169,17 @@ COMMANDS = [
),
Command(
name="download",
description="Download a video or clip.",
description="Download videos or clips.",
arguments=[
(["video"], {
"help": "Video ID, clip slug, or URL",
(["videos"], {
"help": "One or more video IDs, clip slugs or Twitch URLs to download.",
"type": str,
"nargs": "+",
}),
(["-w", "--max-workers"], {
"help": "Maximal number of threads for downloading vods "
"concurrently (default 20)",
"help": "Number of workers for downloading vods concurrently (default 5)",
"type": int,
"default": 20,
"default": 5,
}),
(["-s", "--start"], {
"help": "Download video from this time (hh:mm or hh:mm:ss)",
@ -197,7 +227,12 @@ COMMANDS = [
"help": "Output file name template. See docs for details.",
"type": str,
"default": "{date}_{id}_{channel_login}_{title_slug}.{format}"
})
}),
(["-r", "--rate-limit"], {
"help": "Limit the maximum download speed in bytes per second. "
"Use 'k' and 'm' suffixes for kilobytes and megabytes per second.",
"type": rate,
}),
],
),
Command(
@ -281,7 +316,7 @@ def main():
print_err(e)
sys.exit(1)
except KeyboardInterrupt:
print_err("Operation canceled")
print_err("\nOperation canceled")
sys.exit(1)
except GQLError as e:
print_err(e)
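
The `rate` argument type above can be sketched in isolation as follows. This is a minimal sketch, not the project's code: `parse_rate` and `UNITS` are hypothetical names, and the sketch assumes the suffix should be treated case-insensitively, since the pattern is matched with `re.IGNORECASE`.

```python
import re
from argparse import ArgumentTypeError

# Hypothetical lookup table: binary multipliers, matching the 1024 factors in the diff.
UNITS = {"": 1, "k": 1024, "m": 1024 * 1024}


def parse_rate(value: str) -> int:
    """Hypothetical stand-alone variant of `rate`: returns bytes per second."""
    match = re.fullmatch(r"([0-9]+)([km]?)", value, flags=re.IGNORECASE)
    if not match:
        raise ArgumentTypeError("must be an integer, followed by an optional 'k' or 'm'")
    amount = int(match.group(1))
    # Lowercase the suffix so that "1M" and "1m" select the same multiplier.
    unit = match.group(2).lower()
    return amount * UNITS[unit]
```

Passed as `type=parse_rate` to `argparse`, a function like this turns `-r 2k` into `2048`, while invalid input surfaces as a regular argument error.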


@ -1,14 +1,5 @@
import os
import requests
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import datetime
from functools import partial
from requests.exceptions import RequestException
from twitchdl.output import print_out
from twitchdl.utils import format_size, format_duration
import httpx
CHUNK_SIZE = 1024
CONNECT_TIMEOUT = 5
@ -19,20 +10,20 @@ class DownloadFailed(Exception):
pass
def _download(url, path):
def _download(url: str, path: str):
tmp_path = path + ".tmp"
response = requests.get(url, stream=True, timeout=CONNECT_TIMEOUT)
size = 0
with open(tmp_path, 'wb') as target:
for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
target.write(chunk)
size += len(chunk)
with httpx.stream("GET", url, timeout=CONNECT_TIMEOUT) as response:
with open(tmp_path, "wb") as target:
for chunk in response.iter_bytes(chunk_size=CHUNK_SIZE):
target.write(chunk)
size += len(chunk)
os.rename(tmp_path, path)
return size
def download_file(url, path, retries=RETRY_COUNT):
def download_file(url: str, path: str, retries: int = RETRY_COUNT):
if os.path.exists(path):
from_disk = True
return (os.path.getsize(path), from_disk)
@ -41,63 +32,7 @@ def download_file(url, path, retries=RETRY_COUNT):
for _ in range(retries):
try:
return (_download(url, path), from_disk)
except RequestException:
except httpx.RequestError:
pass
raise DownloadFailed(":(")
def _print_progress(futures):
downloaded_count = 0
downloaded_size = 0
max_msg_size = 0
start_time = datetime.now()
total_count = len(futures)
current_download_size = 0
current_downloaded_count = 0
for future in as_completed(futures):
size, from_disk = future.result()
downloaded_count += 1
downloaded_size += size
# If we find something on disk, we don't want to take it in account in
# the speed calculation
if not from_disk:
current_download_size += size
current_downloaded_count += 1
percentage = 100 * downloaded_count // total_count
est_total_size = int(total_count * downloaded_size / downloaded_count)
duration = (datetime.now() - start_time).seconds
speed = current_download_size // duration if duration else 0
remaining = (total_count - downloaded_count) * duration / current_downloaded_count \
if current_downloaded_count else 0
msg = " ".join([
"Downloaded VOD {}/{}".format(downloaded_count, total_count),
"({}%)".format(percentage),
"<cyan>{}</cyan>".format(format_size(downloaded_size)),
"of <cyan>~{}</cyan>".format(format_size(est_total_size)),
"at <cyan>{}/s</cyan>".format(format_size(speed)) if speed > 0 else "",
"remaining <cyan>~{}</cyan>".format(format_duration(remaining)) if remaining > 0 else "",
])
max_msg_size = max(len(msg), max_msg_size)
print_out("\r" + msg.ljust(max_msg_size), end="")
def download_files(base_url, target_dir, vod_paths, max_workers):
"""
Downloads a list of VODs defined by a common `base_url` and a list of
`vod_paths`, returning a dict which maps the paths to the downloaded files.
"""
urls = [base_url + path for path in vod_paths]
targets = [os.path.join(target_dir, "{:05d}.ts".format(k)) for k, _ in enumerate(vod_paths)]
partials = (partial(download_file, url, path) for url, path in zip(urls, targets))
with ThreadPoolExecutor(max_workers=max_workers) as executor:
futures = [executor.submit(fn) for fn in partials]
_print_progress(futures)
return OrderedDict(zip(vod_paths, targets))


@ -1,76 +0,0 @@
import asyncio
import json
import logging
import re
from asyncio.subprocess import PIPE
from pprint import pprint
from typing import Optional
from twitchdl.output import print_out
async def join_vods(playlist_path: str, target: str, overwrite: bool, video: dict):
command = [
"ffmpeg",
"-i", playlist_path,
"-c", "copy",
"-metadata", "artist={}".format(video["creator"]["displayName"]),
"-metadata", "title={}".format(video["title"]),
"-metadata", "encoded_by=twitch-dl",
"-stats",
"-loglevel", "warning",
f"file:{target}",
]
if overwrite:
command.append("-y")
# command = ["ls", "-al"]
print_out("<dim>{}</dim>".format(" ".join(command)))
process = await asyncio.create_subprocess_exec(*command, stdout=PIPE, stderr=PIPE)
assert process.stderr is not None
await asyncio.gather(
# _read_stream("stdout", process.stdout),
_print_progress("stderr", process.stderr),
process.wait()
)
print(process.returncode)
async def _read_stream(name: str, stream: Optional[asyncio.StreamReader]):
if stream:
async for line in readlines(stream):
print(name, ">", line)
async def _print_progress(stream: asyncio.StreamReader):
async for line in readlines(stream):
print(name, ">", line)
pattern = re.compile(br"[\r\n]+")
async def readlines(stream: asyncio.StreamReader):
data = bytearray()
while not stream.at_eof():
lines = pattern.split(data)
data[:] = lines.pop(-1)
for line in lines:
yield line
data.extend(await stream.read(1024))
if __name__ == "__main__":
# logging.basicConfig(level=logging.DEBUG)
video = json.loads('{"id": "1555108011", "title": "Cult of the Lamb", "publishedAt": "2022-08-07T17:00:30Z", "broadcastType": "ARCHIVE", "lengthSeconds": 17948, "game": {"name": "Cult of the Lamb"}, "creator": {"login": "bananasaurus_rex", "displayName": "Bananasaurus_Rex"}, "playlists": [{"bandwidth": 8446533, "resolution": [1920, 1080], "codecs": "avc1.64002A,mp4a.40.2", "video": "chunked", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/chunked/index-dvr.m3u8"}, {"bandwidth": 3432426, "resolution": [1280, 720], "codecs": "avc1.4D0020,mp4a.40.2", "video": "720p60", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/720p60/index-dvr.m3u8"}, {"bandwidth": 1445268, "resolution": [852, 480], "codecs": "avc1.4D001F,mp4a.40.2", "video": "480p30", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/480p30/index-dvr.m3u8"}, {"bandwidth": 215355, "resolution": null, "codecs": "mp4a.40.2", "video": "audio_only", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/audio_only/index-dvr.m3u8"}, {"bandwidth": 705523, "resolution": [640, 360], "codecs": "avc1.4D001E,mp4a.40.2", "video": "360p30", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/360p30/index-dvr.m3u8"}, {"bandwidth": 285614, "resolution": [284, 160], "codecs": "avc1.4D000C,mp4a.40.2", "video": "160p30", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/160p30/index-dvr.m3u8"}]}')
playlist_path = "/tmp/twitch-dl/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/160p30/playlist_downloaded.m3u8"
asyncio.run(join_vods(playlist_path, "out.mkv", True, video), debug=True)

twitchdl/http.py (new file, 129 lines)

@ -0,0 +1,129 @@
import asyncio
import httpx
import logging
import os
import time
from typing import List, Optional, Union
from twitchdl.progress import Progress
logger = logging.getLogger(__name__)
KB = 1024
CHUNK_SIZE = 256 * KB
"""How much of a VOD to download in each iteration"""
RETRY_COUNT = 5
"""Number of times to retry failed downloads before aborting."""
TIMEOUT = 30
"""
Number of seconds to wait before aborting when there is no network activity.
https://www.python-httpx.org/advanced/#timeout-configuration
"""
class TokenBucket:
"""Limit the download speed by strategically inserting sleeps."""
def __init__(self, rate: int, capacity: Optional[int] = None):
self.rate: int = rate
self.capacity: int = capacity or rate * 2
self.available: int = 0
self.last_refilled: float = time.time()
def advance(self, size: int):
"""Called every time a chunk of data is downloaded."""
self._refill()
if self.available < size:
deficit = size - self.available
time.sleep(deficit / self.rate)
self.available -= size
def _refill(self):
"""Increase available capacity according to elapsed time since last refill."""
now = time.time()
elapsed = now - self.last_refilled
refill_amount = int(elapsed * self.rate)
self.available = min(self.available + refill_amount, self.capacity)
self.last_refilled = now
class EndlessTokenBucket:
"""Used when download speed is not limited."""
def advance(self, size: int):
pass
AnyTokenBucket = Union[TokenBucket, EndlessTokenBucket]
async def download(
client: httpx.AsyncClient,
task_id: int,
source: str,
target: str,
progress: Progress,
token_bucket: AnyTokenBucket,
):
# Download to a temp file first, then rename it to the target once complete.
# This avoids leaving partially saved chunks behind if canceled or --keep is used.
tmp_target = f"{target}.tmp"
with open(tmp_target, "wb") as f:
async with client.stream("GET", source) as response:
size = int(response.headers.get("content-length"))
progress.start(task_id, size)
async for chunk in response.aiter_bytes(chunk_size=CHUNK_SIZE):
f.write(chunk)
size = len(chunk)
token_bucket.advance(size)
progress.advance(task_id, size)
progress.end(task_id)
os.rename(tmp_target, target)
async def download_with_retries(
client: httpx.AsyncClient,
semaphore: asyncio.Semaphore,
task_id: int,
source: str,
target: str,
progress: Progress,
token_bucket: AnyTokenBucket,
):
async with semaphore:
if os.path.exists(target):
size = os.path.getsize(target)
progress.already_downloaded(task_id, size)
return
for n in range(RETRY_COUNT):
try:
return await download(client, task_id, source, target, progress, token_bucket)
except httpx.RequestError:
logger.exception(f"Task {task_id} failed. Retrying. Maybe.")
progress.abort(task_id)
if n + 1 >= RETRY_COUNT:
raise
raise Exception("Should not happen")
async def download_all(
sources: List[str],
targets: List[str],
workers: int,
/, *,
rate_limit: Optional[int] = None
):
progress = Progress(len(sources))
token_bucket = TokenBucket(rate_limit) if rate_limit else EndlessTokenBucket()
async with httpx.AsyncClient(timeout=TIMEOUT) as client:
semaphore = asyncio.Semaphore(workers)
tasks = [download_with_retries(client, semaphore, task_id, source, target, progress, token_bucket)
for task_id, (source, target) in enumerate(zip(sources, targets))]
await asyncio.gather(*tasks)
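
The rate limiting in `TokenBucket` can be exercised without real waiting by injecting the clock and the sleep function. The sketch below is an illustration under assumptions, not the code from the diff: the `clock` and `sleep` parameters, `FakeClock`, and `simulate` are hypothetical additions made for testability.

```python
import time
from typing import Callable, List, Optional


class TokenBucket:
    """Token bucket as in the diff, with injectable clock and sleep (hypothetical)."""

    def __init__(
        self,
        rate: int,
        capacity: Optional[int] = None,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ):
        self.rate = rate
        self.capacity = capacity or rate * 2
        self.available = 0
        self.clock = clock
        self.sleep = sleep
        self.last_refilled = clock()

    def advance(self, size: int) -> None:
        self._refill()
        if self.available < size:
            # Sleep just long enough for the missing tokens to accumulate.
            self.sleep((size - self.available) / self.rate)
        self.available -= size

    def _refill(self) -> None:
        now = self.clock()
        refill = int((now - self.last_refilled) * self.rate)
        self.available = min(self.available + refill, self.capacity)
        self.last_refilled = now


class FakeClock:
    """Hypothetical test double: sleeping only moves a counter forward."""

    def __init__(self) -> None:
        self.now = 0.0
        self.slept: List[float] = []

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.slept.append(seconds)
        self.now += seconds


def simulate(rate: int, chunks: List[int]) -> List[float]:
    """Return the sleeps a bucket would insert while consuming `chunks`."""
    fake = FakeClock()
    bucket = TokenBucket(rate, clock=fake.time, sleep=fake.sleep)
    for size in chunks:
        bucket.advance(size)
    return fake.slept
```

With a rate of 100 bytes per second, two 50-byte chunks cost one second of sleeping in total. Note that a blocking `time.sleep` inside a coroutine pauses the whole event loop, which may be acceptable here because the limit applies to all downloads together.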

twitchdl/models.py (new file, 141 lines)

@ -0,0 +1,141 @@
from typing import Any, Dict, List, Optional, Generator
from dataclasses import dataclass
Json = Dict[str, Any]
GameID = str
@dataclass(frozen=True)
class Broadcaster():
login: str
display_name: str
@staticmethod
def from_json(data: Json) -> "Broadcaster":
return Broadcaster(data["login"], data["displayName"])
@dataclass(frozen=True)
class VideoQuality():
frame_rate: int
quality: str
source_url: str
@staticmethod
def from_json(data: Json) -> "VideoQuality":
return VideoQuality(data["frameRate"], data["quality"], data["sourceURL"])
@dataclass(frozen=True)
class Game():
id: str
name: str
description: str
@staticmethod
def from_json(data: Json) -> "Game":
return Game(data["id"], data["name"], data["description"])
@dataclass(frozen=True)
class Clip():
id: str
slug: str
title: str
created_at: str
view_count: int
duration_seconds: int
url: str
game: Optional[Game]
broadcaster: Broadcaster
video_qualities: List[VideoQuality]
raw: Json
@staticmethod
def from_json(data: Json) -> "Clip":
game = Game.from_json(data["game"]) if data["game"] else None
broadcaster = Broadcaster.from_json(data["broadcaster"])
video_qualities = [VideoQuality.from_json(q) for q in data["videoQualities"]]
return Clip(
data["id"],
data["slug"],
data["title"],
data["createdAt"],
data["viewCount"],
data["durationSeconds"],
data["url"],
game,
broadcaster,
video_qualities,
data
)
@dataclass(frozen=True)
class ClipsPage():
cursor: str
has_next_page: bool
has_previous_page: bool
clips: List[Clip]
@staticmethod
def from_json(data: Json) -> "ClipsPage":
return ClipsPage(
data["edges"][-1]["cursor"],
data["pageInfo"]["hasNextPage"],
data["pageInfo"]["hasPreviousPage"],
[Clip.from_json(c["node"]) for c in data["edges"]]
)
@dataclass(frozen=True)
class Video():
id: str
title: str
published_at: str
broadcast_type: str
length_seconds: int
game: Optional[Game]
creator: Broadcaster
raw: Json
@staticmethod
def from_json(data: Json) -> "Video":
game = Game.from_json(data["game"]) if data["game"] else None
creator = Broadcaster.from_json(data["creator"])
return Video(
data["id"],
data["title"],
data["publishedAt"],
data["broadcastType"],
data["lengthSeconds"],
game,
creator,
data
)
@dataclass(frozen=True)
class VideosPage():
cursor: str
has_next_page: bool
has_previous_page: bool
total_count: int
videos: List[Video]
@staticmethod
def from_json(data: Json) -> "VideosPage":
return VideosPage(
data["edges"][-1]["cursor"],
data["pageInfo"]["hasNextPage"],
data["pageInfo"].get("hasPreviousPage"),
data["totalCount"],
[Video.from_json(c["node"]) for c in data["edges"]]
)
ClipGenerator = Generator[Clip, None, None]
VideoGenerator = Generator[Video, None, None]
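
The `from_json` pattern used by these dataclasses can be tried on a small payload. The following is a reduced sketch: the classes carry fewer fields than the ones in the diff, `SAMPLE` is a hypothetical payload, and the game ID in it is made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

Json = Dict[str, Any]


@dataclass(frozen=True)
class Game:
    id: str
    name: str

    @staticmethod
    def from_json(data: Json) -> "Game":
        return Game(data["id"], data["name"])


@dataclass(frozen=True)
class Video:
    id: str
    title: str
    length_seconds: int
    game: Optional[Game]

    @staticmethod
    def from_json(data: Json) -> "Video":
        # The API may return null for the game, so map that to None.
        game = Game.from_json(data["game"]) if data.get("game") else None
        return Video(data["id"], data["title"], data["lengthSeconds"], game)


# Hypothetical payload; the game ID "123" is a placeholder, not a real Twitch ID.
SAMPLE: Json = {
    "id": "1555108011",
    "title": "Cult of the Lamb",
    "lengthSeconds": 17948,
    "game": {"id": "123", "name": "Cult of the Lamb"},
}

video = Video.from_json(SAMPLE)
```

Because the dataclasses are frozen, callers get attribute access (`video.game.name`) in place of nested dictionary lookups, and accidental mutation raises an error.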


@ -6,6 +6,8 @@ import re
from itertools import islice
from twitchdl import utils
from twitchdl.models import Clip, Video
from typing import Any, Match
START_CODES = {
@ -29,31 +31,38 @@ END_PATTERN = "</(" + "|".join(START_CODES.keys()) + ")>"
USE_ANSI_COLOR = "--no-color" not in sys.argv
def start_code(match):
def start_code(match: Match[str]) -> str:
name = match.group(1)
return START_CODES[name]
def colorize(text):
def colorize(text: str) -> str:
text = re.sub(START_PATTERN, start_code, text)
text = re.sub(END_PATTERN, END_CODE, text)
return text
def strip_tags(text):
def strip_tags(text: str) -> str:
text = re.sub(START_PATTERN, '', text)
text = re.sub(END_PATTERN, '', text)
return text
def truncate(string: str, length: int) -> str:
if len(string) > length:
return string[:length - 1] + "…"
return string
def print_out(*args, **kwargs):
args = [colorize(a) if USE_ANSI_COLOR else strip_tags(a) for a in args]
print(*args, **kwargs)
def print_json(data):
def print_json(data: Any):
print(json.dumps(data))
@ -69,24 +78,31 @@ def print_log(*args, **kwargs):
print(*args, file=sys.stderr, **kwargs)
def print_video(video):
published_at = video["publishedAt"].replace("T", " @ ").replace("Z", "")
length = utils.format_duration(video["lengthSeconds"])
def print_video(video: Video):
published_at = video.published_at.replace("T", " @ ").replace("Z", "")
length = utils.format_duration(video.length_seconds)
channel = "<blue>{}</blue>".format(video["creator"]["displayName"]) if video["creator"] else ""
playing = "playing <blue>{}</blue>".format(video["game"]["name"]) if video["game"] else ""
channel = f"<blue>{video.creator.display_name}</blue>" if video.creator else ""
playing = f"playing <blue>{video.game.name}</blue>" if video.game else ""
# Can't find URL in video object, strange
url = "https://www.twitch.tv/videos/{}".format(video["id"])
url = f"https://www.twitch.tv/videos/{video.id}"
print_out("<b>Video {}</b>".format(video["id"]))
print_out("<green>{}</green>".format(video["title"]))
print_out(f"<b>Video {video.id}</b>")
print_out(f"<green>{video.title}</green>")
if channel or playing:
print_out(" ".join([channel, playing]))
print_out("Published <blue>{}</blue> Length: <blue>{}</blue> ".format(published_at, length))
print_out("<i>{}</i>".format(url))
print_out(f"Published <blue>{published_at}</blue> Length: <blue>{length}</blue>")
print_out(f"<i>{url}</i>")
def print_video_compact(video):
date = video.published_at[:10]
game = video.game.name if video.game else ""
title = truncate(video.title, 80).ljust(80)
print_out(f"<b>{video.id}</b> {date} <green>{title}</green> <blue>{game}</blue>")
def print_paged_videos(generator, page_size, total_count):
@ -117,23 +133,23 @@ def print_paged_videos(generator, page_size, total_count):
break
def print_clip(clip):
published_at = clip["createdAt"].replace("T", " @ ").replace("Z", "")
length = utils.format_duration(clip["durationSeconds"])
channel = clip["broadcaster"]["displayName"]
def print_clip(clip: Clip):
published_at = clip.created_at.replace("T", " @ ").replace("Z", "")
length = utils.format_time(clip.duration_seconds)
channel = clip.broadcaster.display_name
playing = (
"playing <blue>{}</blue>".format(clip["game"]["name"])
if clip["game"] else ""
"playing <blue>{}</blue>".format(clip.game.name)
if clip.game else ""
)
print_out("Clip <b>{}</b>".format(clip["slug"]))
print_out("<green>{}</green>".format(clip["title"]))
print_out("Clip <b>{}</b>".format(clip.slug))
print_out("<green>{}</green>".format(clip.title))
print_out("<blue>{}</blue> {}".format(channel, playing))
print_out(
"Published <blue>{}</blue>"
" Length: <blue>{}</blue>"
" Views: <blue>{}</blue>".format(published_at, length, clip["viewCount"]))
print_out("<i>{}</i>".format(clip["url"]))
f"Published: <blue>{published_at}</blue>"
f" Length: <blue>{length}</blue>"
f" Views: <blue>{clip.view_count}</blue>")
print_out(f"<i>{clip.url}</i>")
def _continue():
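
The compact display relies on truncating and then padding the title so that columns line up. Here is a minimal sketch of that idea, assuming the truncation marker is a single-character ellipsis; `compact_line` and the `suffix` and `width` parameters are hypothetical names.

```python
def truncate(text: str, length: int, suffix: str = "…") -> str:
    """Shorten `text` to `length` characters, marking the cut with `suffix`."""
    if len(text) <= length:
        return text
    # Reserve room for the suffix so the result is exactly `length` long
    # (assuming `length` is at least as large as the suffix).
    return text[: max(length - len(suffix), 0)] + suffix


def compact_line(video_id: str, date: str, title: str, width: int = 80) -> str:
    # Pad after truncating so every title occupies exactly `width` columns.
    return f"{video_id} {date} {truncate(title, width).ljust(width)}"
```

Truncating before padding keeps each row the same width regardless of title length, which is what makes the one-line-per-video layout readable.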

twitchdl/progress.py (new file, 137 lines)

@ -0,0 +1,137 @@
import logging
import time
from collections import deque
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, NamedTuple, Optional, Deque
from twitchdl.output import print_out
from twitchdl.utils import format_size, format_time
logger = logging.getLogger(__name__)
TaskId = int
@dataclass
class Task:
id: TaskId
size: int
downloaded: int = 0
def advance(self, size):
self.downloaded += size
class Sample(NamedTuple):
downloaded: int
timestamp: float
@dataclass
class Progress:
vod_count: int
downloaded: int = 0
estimated_total: Optional[int] = None
last_printed: float = field(default_factory=time.time)
progress_bytes: int = 0
progress_perc: int = 0
remaining_time: Optional[int] = None
speed: Optional[float] = None
start_time: float = field(default_factory=time.time)
tasks: Dict[TaskId, Task] = field(default_factory=dict)
vod_downloaded_count: int = 0
samples: Deque[Sample] = field(default_factory=lambda: deque(maxlen=100))
def start(self, task_id: int, size: int):
if task_id in self.tasks:
raise ValueError(f"Task {task_id}: cannot start, already started")
self.tasks[task_id] = Task(task_id, size)
self._calculate_total()
self._calculate_progress()
self.print()
def advance(self, task_id: int, size: int):
if task_id not in self.tasks:
raise ValueError(f"Task {task_id}: cannot advance, not started")
self.downloaded += size
self.progress_bytes += size
self.tasks[task_id].advance(size)
self.samples.append(Sample(self.downloaded, time.time()))
self._calculate_progress()
self.print()
def already_downloaded(self, task_id: int, size: int):
if task_id in self.tasks:
raise ValueError(f"Task {task_id}: cannot mark as downloaded, already started")
self.tasks[task_id] = Task(task_id, size)
self.progress_bytes += size
self.vod_downloaded_count += 1
self._calculate_total()
self._calculate_progress()
self.print()
def abort(self, task_id: int):
if task_id not in self.tasks:
raise ValueError(f"Task {task_id}: cannot abort, not started")
del self.tasks[task_id]
self.progress_bytes = sum(t.downloaded for t in self.tasks.values())
self._calculate_total()
self._calculate_progress()
self.print()
def end(self, task_id: int):
if task_id not in self.tasks:
raise ValueError(f"Task {task_id}: cannot end, not started")
task = self.tasks[task_id]
if task.size != task.downloaded:
logger.warning(f"Task {task_id} ended with {task.downloaded}b downloaded, expected {task.size}b.")
self.vod_downloaded_count += 1
self.print()
def _calculate_total(self):
self.estimated_total = int(mean(t.size for t in self.tasks.values()) * self.vod_count) if self.tasks else None
def _calculate_progress(self):
self.speed = self._calculate_speed()
self.progress_perc = int(100 * self.progress_bytes / self.estimated_total) if self.estimated_total else 0
self.remaining_time = int((self.estimated_total - self.progress_bytes) / self.speed) if self.estimated_total and self.speed else None
def _calculate_speed(self):
if len(self.samples) < 2:
return None
first_sample = self.samples[0]
last_sample = self.samples[-1]
size = last_sample.downloaded - first_sample.downloaded
duration = last_sample.timestamp - first_sample.timestamp
return size / duration if duration > 0 else None
def print(self):
now = time.time()
# Don't print more often than 10 times per second
if now - self.last_printed < 0.1:
return
progress = " ".join([
f"Downloaded {self.vod_downloaded_count}/{self.vod_count} VODs",
f"<blue>{self.progress_perc}%</blue>",
f"of <blue>~{format_size(self.estimated_total)}</blue>" if self.estimated_total else "",
f"at <blue>{format_size(self.speed)}/s</blue>" if self.speed else "",
f"ETA <blue>{format_time(self.remaining_time)}</blue>" if self.remaining_time is not None else "",
])
print_out(f"\r{progress} ", end="")
self.last_printed = now
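
The speed calculation above uses a bounded deque, so only the most recent samples contribute. A stand-alone sketch of that sliding window follows; `window_speed` is a hypothetical helper, and the guard against a zero duration is an addition that assumes two samples can share a timestamp on coarse clocks.

```python
from collections import deque
from typing import Deque, NamedTuple, Optional


class Sample(NamedTuple):
    downloaded: int   # cumulative bytes downloaded when the sample was taken
    timestamp: float  # seconds


def window_speed(samples: Deque[Sample]) -> Optional[float]:
    """Average bytes per second between the oldest and newest sample."""
    if len(samples) < 2:
        return None
    first, last = samples[0], samples[-1]
    duration = last.timestamp - first.timestamp
    if duration <= 0:
        return None
    return (last.downloaded - first.downloaded) / duration


# A window of 3 samples: appending a fourth silently drops the oldest one.
window: Deque[Sample] = deque(maxlen=3)
for downloaded, timestamp in [(0, 0.0), (100, 1.0), (300, 2.0), (600, 3.0)]:
    window.append(Sample(downloaded, timestamp))
```

Here the first sample has been evicted, so the speed is computed from the samples at 1.0 s and 3.0 s only. This is what lets the reported speed follow recent throughput instead of the average over the whole download.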


@ -2,11 +2,12 @@
Twitch API access.
"""
import requests
import httpx
from requests.exceptions import HTTPError
from twitchdl import CLIENT_ID
from twitchdl.exceptions import ConsoleError
from twitchdl.models import Clip, ClipsPage, ClipGenerator, Game, Video, VideoGenerator, VideosPage
from typing import Dict, Optional, Tuple
class GQLError(Exception):
@ -15,25 +16,10 @@ class GQLError(Exception):
self.errors = errors
def authenticated_get(url, params={}, headers={}):
headers['Client-ID'] = CLIENT_ID
response = requests.get(url, params, headers=headers)
if 400 <= response.status_code < 500:
data = response.json()
# TODO: this does not look nice in the console since data["message"]
# can contain a JSON encoded object.
raise ConsoleError(data["message"])
response.raise_for_status()
return response
def authenticated_post(url, data=None, json=None, headers={}):
headers['Client-ID'] = CLIENT_ID
response = requests.post(url, data=data, json=json, headers=headers)
response = httpx.post(url, data=data, json=json, headers=headers)
if response.status_code == 400:
data = response.json()
raise ConsoleError(data["message"])
@ -53,7 +39,7 @@ def gql_post(query):
return response
def gql_query(query, headers={}):
def gql_query(query: str, headers: Dict[str, str] = {}):
url = "https://gql.twitch.tv/gql"
response = authenticated_post(url, json={"query": query}, headers=headers).json()
@ -63,23 +49,29 @@ def gql_query(query, headers={}):
return response
VIDEO_FIELDS = """
GAME_FIELDS = """
id
name
description
"""
VIDEO_FIELDS = f"""
id
title
publishedAt
broadcastType
lengthSeconds
game {
name
}
creator {
game {{
{GAME_FIELDS}
}}
creator {{
login
displayName
}
}}
"""
CLIP_FIELDS = """
CLIP_FIELDS = f"""
id
slug
title
@ -87,23 +79,22 @@ CLIP_FIELDS = """
viewCount
durationSeconds
url
videoQualities {
videoQualities {{
frameRate
quality
sourceURL
}
game {
id
name
}
broadcaster {
displayName
}}
game {{
{GAME_FIELDS}
}}
broadcaster {{
login
}
displayName
}}
"""
def get_video(video_id):
def get_video(video_id: str) -> Optional[Video]:
query = """
{{
video(id: "{video_id}") {{
@ -115,10 +106,11 @@ def get_video(video_id):
query = query.format(video_id=video_id, fields=VIDEO_FIELDS)
response = gql_query(query)
return response["data"]["video"]
if response["data"]["video"]:
return Video.from_json(response["data"]["video"])
def get_clip(slug):
def get_clip(slug: str) -> Clip:
query = """
{{
clip(slug: "{}") {{
@ -128,7 +120,7 @@ def get_clip(slug):
"""
response = gql_query(query.format(slug, fields=CLIP_FIELDS))
return response["data"]["clip"]
return Clip.from_json(response["data"]["clip"])
def get_clip_access_token(slug):
@ -151,7 +143,7 @@ def get_clip_access_token(slug):
return response["data"]["clip"]
def get_channel_clips(channel_id, period, limit, after=None):
def get_channel_clips(channel_id, period, limit, after=None) -> ClipsPage:
"""
List channel clips.
@ -193,50 +185,47 @@ def get_channel_clips(channel_id, period, limit, after=None):
if not user:
raise ConsoleError("Channel {} not found".format(channel_id))
return response["data"]["user"]["clips"]
return ClipsPage.from_json(response["data"]["user"]["clips"])
def channel_clips_generator(channel_id, period, limit):
def _generator(clips, limit):
for clip in clips["edges"]:
def channel_clips_generator(channel_id: str, period, limit: int) -> ClipGenerator:
def _generator(page: ClipsPage, limit: int):
for clip in page.clips:
if limit < 1:
return
yield clip["node"]
yield clip
limit -= 1
has_next = clips["pageInfo"]["hasNextPage"]
if limit < 1 or not has_next:
if limit < 1 or not page.has_next_page:
return
req_limit = min(limit, 100)
cursor = clips["edges"][-1]["cursor"]
clips = get_channel_clips(channel_id, period, req_limit, cursor)
yield from _generator(clips, limit)
next_page = get_channel_clips(channel_id, period, req_limit, page.cursor)
yield from _generator(next_page, limit)
req_limit = min(limit, 100)
clips = get_channel_clips(channel_id, period, req_limit)
return _generator(clips, limit)
page = get_channel_clips(channel_id, period, req_limit)
return _generator(page, limit)
def channel_clips_generator_old(channel_id, period, limit):
cursor = ""
while True:
clips = get_channel_clips(
channel_id, period, limit, after=cursor)
page = get_channel_clips(channel_id, period, limit, after=cursor)
if not clips["edges"]:
if not page.clips:
break
has_next = clips["pageInfo"]["hasNextPage"]
cursor = clips["edges"][-1]["cursor"] if has_next else None
has_next = page.has_next_page
cursor = page.cursor if has_next else None
yield clips, has_next
yield page.clips, has_next
if not cursor:
break
def get_channel_videos(channel_id, limit, sort, type="archive", game_ids=[], after=None):
def get_channel_videos(channel_id, limit, sort, type="archive", game_ids=[], after=None) -> VideosPage:
query = """
{{
user(login: "{channel_id}") {{
@ -279,29 +268,27 @@ def get_channel_videos(channel_id, limit, sort, type="archive", game_ids=[], aft
if not response["data"]["user"]:
raise ConsoleError("Channel {} not found".format(channel_id))
return response["data"]["user"]["videos"]
return VideosPage.from_json(response["data"]["user"]["videos"])
def channel_videos_generator(channel_id, max_videos, sort, type, game_ids=None):
def _generator(videos, max_videos):
for video in videos["edges"]:
def channel_videos_generator(channel_id, max_videos, sort, type, game_ids=None) -> Tuple[int, VideoGenerator]:
def _generator(page, max_videos):
for video in page.videos:
if max_videos < 1:
return
yield video["node"]
yield video
max_videos -= 1
has_next = videos["pageInfo"]["hasNextPage"]
if max_videos < 1 or not has_next:
if max_videos < 1 or not page.has_next_page:
return
limit = min(max_videos, 100)
cursor = videos["edges"][-1]["cursor"]
videos = get_channel_videos(channel_id, limit, sort, type, game_ids, cursor)
videos = get_channel_videos(channel_id, limit, sort, type, game_ids, page.cursor)
yield from _generator(videos, max_videos)
limit = min(max_videos, 100)
videos = get_channel_videos(channel_id, limit, sort, type, game_ids)
return videos["totalCount"], _generator(videos, max_videos)
page = get_channel_videos(channel_id, limit, sort, type, game_ids)
return page.total_count, _generator(page, max_videos)
def get_access_token(video_id, auth_token=None):
@ -330,7 +317,7 @@ def get_access_token(video_id, auth_token=None):
try:
response = gql_query(query, headers=headers)
return response["data"]["videoPlaybackAccessToken"]
except HTTPError as error:
except httpx.HTTPStatusError as error:
# Provide a more useful error message when server returns HTTP 401
# Unauthorized while using a user-provided auth token.
if error.response.status_code == 401:
@ -351,7 +338,7 @@ def get_playlists(video_id, access_token):
"""
url = "http://usher.twitch.tv/vod/{}".format(video_id)
response = requests.get(url, params={
response = httpx.get(url, params={
"nauth": access_token['value'],
"nauthsig": access_token['signature'],
"allow_audio_only": "true",
@ -362,16 +349,15 @@ def get_playlists(video_id, access_token):
return response.content.decode('utf-8')
def get_game_id(name):
query = """
def find_game(name: str) -> Optional[Game]:
query = f"""
{{
game(name: "{}") {{
id
game(name: "{name.strip()}") {{
{GAME_FIELDS}
}}
}}
"""
response = gql_query(query.format(name.strip()))
game = response["data"]["game"]
if game:
return game["id"]
response = gql_query(query)
if response["data"]["game"]:
return Game.from_json(response["data"]["game"])
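
The clip and video generators above page through results by cursor until a limit is reached. The sketch below shows the same idea with a loop in place of the recursive `_generator`. Everything in it is hypothetical: `Page`, `paged_items`, and `fake_fetch` stand in for the GraphQL calls, and no network access is involved.

```python
from dataclasses import dataclass
from typing import Callable, Generator, List, Optional


@dataclass(frozen=True)
class Page:
    items: List[int]
    cursor: Optional[str]
    has_next_page: bool


FetchPage = Callable[[int, Optional[str]], Page]


def paged_items(fetch_page: FetchPage, limit: int, page_size: int = 100) -> Generator[int, None, None]:
    """Yield up to `limit` items, requesting at most `page_size` per call."""
    remaining = limit
    cursor: Optional[str] = None
    while remaining > 0:
        page = fetch_page(min(remaining, page_size), cursor)
        for item in page.items:
            if remaining < 1:
                return
            yield item
            remaining -= 1
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.items or not page.has_next_page:
            return
        cursor = page.cursor


def fake_fetch(data: List[int]) -> FetchPage:
    """Hypothetical in-memory backend; the cursor is the next index as a string."""
    def fetch(limit: int, cursor: Optional[str]) -> Page:
        start = int(cursor) if cursor else 0
        end = min(start + limit, len(data))
        return Page(data[start:end], str(end), end < len(data))
    return fetch
```

A loop avoids growing the call stack by one generator per page, which may matter when fetching all videos of a large channel; the recursive form in the diff yields the same items.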


@ -24,7 +24,7 @@ def format_size(bytes_, digits=1):
return _format_size(mega / 1024, digits, "GB")
def format_duration(total_seconds):
def format_duration(total_seconds: int) -> str:
total_seconds = int(total_seconds)
hours = total_seconds // 3600
remainder = total_seconds % 3600
@ -40,6 +40,19 @@ def format_duration(total_seconds):
return "{} sec".format(seconds)
def format_time(total_seconds: int) -> str:
total_seconds = int(total_seconds)
hours = total_seconds // 3600
remainder = total_seconds % 3600
minutes = remainder // 60
seconds = total_seconds % 60
if hours:
return f"{hours:02}:{minutes:02}:{seconds:02}"
return f"{minutes:02}:{seconds:02}"
def read_int(msg, min, max, default):
msg = msg + " [default {}]: ".format(default)
@ -54,14 +67,14 @@ def read_int(msg, min, max, default):
pass
def slugify(value):
def slugify(value: str) -> str:
value = unicodedata.normalize('NFKC', str(value))
value = re.sub(r'[^\w\s_-]', '', value)
value = re.sub(r'[\s_-]+', '_', value)
return value.strip("_").lower()
def titlify(value):
def titlify(value: str) -> str:
value = unicodedata.normalize('NFKC', str(value))
value = re.sub(r'[^\w\s\[\]().-]', '', value)
value = re.sub(r'\s+', ' ', value)
@ -80,7 +93,7 @@ CLIP_PATTERNS = [
]
def parse_video_identifier(identifier):
def parse_video_identifier(identifier: str) -> str:
"""Given a video ID or URL, returns the video ID, or None if not matched."""
for pattern in VIDEO_PATTERNS:
match = re.match(pattern, identifier)
@ -88,7 +101,7 @@ def parse_video_identifier(identifier):
return match.group("id")
def parse_clip_identifier(identifier):
def parse_clip_identifier(identifier: str) -> str:
"""Given a clip slug or URL, returns the clip slug, or None if not matched."""
for pattern in CLIP_PATTERNS:
match = re.match(pattern, identifier)