Compare commits


1 Commit

Author SHA1 Message Date
3c96c02394 wip 2022-08-08 13:55:33 +02:00
26 changed files with 258 additions and 810 deletions

View File

@ -3,41 +3,6 @@ twitch-dl changelog
<!-- Do not edit. This file is automatically generated from changelog.yaml.-->
### [2.1.1 (2022-11-20)](https://github.com/ihabunek/twitch-dl/releases/tag/2.1.1)
* Fix Python 3.7 compatibility (#117, thanks @eliduvid)
* Fix default value for game_ids (#102, thanks @FunnyPocketBook)
### [2.1.0 (2022-11-20)](https://github.com/ihabunek/twitch-dl/releases/tag/2.1.0)
* Add chapter list to `info` command
* Add `download --chapter` option for downloading a single chapter
### [2.0.1 (2022-09-09)](https://github.com/ihabunek/twitch-dl/releases/tag/2.0.1)
* Fix an issue where a temp vod file would be renamed while still being open,
which caused an exception on Windows (#111)
### [2.0.0 (2022-08-18)](https://github.com/ihabunek/twitch-dl/releases/tag/2.0.0)
This release switches from using `requests` to `httpx` for making http requests,
and from threads to `asyncio` for concurrency. This enables easier
implementation of new features, but has no breaking changes for the CLI.
* **BREAKING**: Require Python 3.7 or later.
* Add `--rate-limit` option to `download` for limiting maximum bandwidth when
downloading.
* Add `--compact` option to `download` for displaying one video per line.
* Allow passing multiple video ids to `download` to download multiple videos
successively.
* Improved progress meter, updates on each chunk downloaded, instead of each VOD
downloaded.
* Improved speed estimate, displays recent speed instead of average speed since
the start of download.
* Decreased default concurrent downloads to 5. This seems to be enough to
saturate the download link in most cases. You can override this by setting the
`-w` option. Please test and report back if this works for you.
### [1.22.0 (2022-06-25)](https://github.com/ihabunek/twitch-dl/releases/tag/1.22.0)
* Add support for downloading subscriber-only VODs (#48, thanks @cemiu)
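The 2.0.0 entry above describes moving from `requests` with threads to `httpx` with `asyncio` for concurrent VOD downloads. A minimal sketch of that model, assuming hypothetical `sources` and `targets` lists; the real implementation (with retries, progress reporting and rate limiting) lives in `twitchdl/http.py` further down in this diff:

```
import asyncio
from typing import List

import httpx


async def fetch(client: httpx.AsyncClient, semaphore: asyncio.Semaphore, url: str, target: str):
    # The semaphore caps how many VODs are downloaded at the same time.
    async with semaphore:
        async with client.stream("GET", url) as response:
            with open(target, "wb") as f:
                async for chunk in response.aiter_bytes():
                    f.write(chunk)


async def fetch_all(sources: List[str], targets: List[str], workers: int = 5):
    semaphore = asyncio.Semaphore(workers)
    async with httpx.AsyncClient() as client:
        await asyncio.gather(*(
            fetch(client, semaphore, url, target)
            for url, target in zip(sources, targets)
        ))
```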

View File

@ -8,7 +8,7 @@ dist :
clean :
find . -name "*pyc" | xargs rm -rf $1
rm -rf build dist bundle MANIFEST htmlcov deb_dist twitch-dl.*.pyz twitch-dl.1.man twitch_dl.egg-info
rm -rf build dist bundle MANIFEST htmlcov deb_dist twitch-dl.*.pyz twitch-dl.1.man
bundle:
mkdir bundle

View File

@ -17,7 +17,7 @@ Resources
Requirements
------------
* Python 3.7 or later
* Python 3.5 or later
* [ffmpeg](https://ffmpeg.org/download.html), installed and on the system path
Quick start

View File

@ -1,8 +0,0 @@
TODO
====
Some ideas what to do next.
* gracefully handle aborting the download with Ctrl+C; currently it prints an error stack trace
* add keyboard control for e.g. pausing a download
* test how worker count affects download speeds on low and high-bandwidth links (see https://github.com/ihabunek/twitch-dl/issues/104), adjust default worker count

View File

@ -1,39 +1,3 @@
2.1.1:
date: 2022-11-20
changes:
- "Fix Python 3.7 compatibility (#117, thanks @eliduvid)"
- "Fix default value for game_ids (#102, thanks @FunnyPocketBook)"
2.1.0:
date: 2022-11-20
changes:
- "Add chapter list to `info` command"
- "Add `download --chapter` option for downloading a single chapter"
2.0.1:
date: 2022-09-09
changes:
- "Fix an issue where a temp vod file would be renamed while still being open, which caused an exception on Windows (#111)"
2.0.0:
date: 2022-08-18
description: |
This release switches from using `requests` to `httpx` for making http
requests, and from threads to `asyncio` for concurrency. This enables
easier implementation of new features, but has no breaking changes for the
CLI.
changes:
- "**BREAKING**: Require Python 3.7 or later."
- "Add `--rate-limit` option to `download` for limiting maximum bandwidth when downloading."
- "Add `--compact` option to `download` for displaying one video per line."
- "Allow passing multiple video ids to `download` to download multiple videos successively."
- "Improved progress meter, updates on each chunk downloaded, instead of each VOD downloaded."
- "Improved speed estimate, displays recent speed instead of average speed since the start of download."
- |
Decreased default concurrent downloads to 5. This seems to be enough to
saturate the download link in most cases. You can override this by setting
the `-w` option. Please test and report back if this works for you.
1.22.0:
date: 2022-06-25
changes:

View File

@ -3,41 +3,6 @@ twitch-dl changelog
<!-- Do not edit. This file is automatically generated from changelog.yaml.-->
### [2.1.1 (2022-11-20)](https://github.com/ihabunek/twitch-dl/releases/tag/2.1.1)
* Fix Python 3.7 compatibility (#117, thanks @eliduvid)
* Fix default value for game_ids (#102, thanks @FunnyPocketBook)
### [2.1.0 (2022-11-20)](https://github.com/ihabunek/twitch-dl/releases/tag/2.1.0)
* Add chapter list to `info` command
* Add `download --chapter` option for downloading a single chapter
### [2.0.1 (2022-09-09)](https://github.com/ihabunek/twitch-dl/releases/tag/2.0.1)
* Fix an issue where a temp vod file would be renamed while still being open,
which caused an exception on Windows (#111)
### [2.0.0 (2022-08-18)](https://github.com/ihabunek/twitch-dl/releases/tag/2.0.0)
This release switches from using `requests` to `httpx` for making http requests,
and from threads to `asyncio` for concurrency. This enables easier
implementation of new features, but has no breaking changes for the CLI.
* **BREAKING**: Require Python 3.7 or later.
* Add `--rate-limit` option to `download` for limiting maximum bandwidth when
downloading.
* Add `--compact` option to `download` for displaying one video per line.
* Allow passing multiple video ids to `download` to download multiple videos
successively.
* Improved progress meter, updates on each chunk downloaded, instead of each VOD
downloaded.
* Improved speed estimate, displays recent speed instead of average speed since
the start of download.
* Decreased default concurrent downloads to 5. This seems to be enough to
saturate the download link in most cases. You can override this by setting the
`-w` option. Please test and report back if this works for you.
### [1.22.0 (2022-06-25)](https://github.com/ihabunek/twitch-dl/releases/tag/1.22.0)
* Add support for downloading subscriber-only VODs (#48, thanks @cemiu)

View File

@ -1,12 +1,12 @@
<!-- ------------------- generated docs start ------------------- -->
# twitch-dl download
Download videos or clips.
Download a video or clip.
### USAGE
```
twitch-dl download <videos> [FLAGS] [OPTIONS]
twitch-dl download <video> [FLAGS] [OPTIONS]
```
### ARGUMENTS
@ -14,8 +14,8 @@ twitch-dl download <videos> [FLAGS] [OPTIONS]
<table>
<tbody>
<tr>
<td class="code">&lt;videos&gt;</td>
<td>One or more video ID, clip slug or twitch URL to download.</td>
<td class="code">&lt;video&gt;</td>
<td>Video ID, clip slug, or URL</td>
</tr>
</tbody>
</table>
@ -47,7 +47,7 @@ twitch-dl download <videos> [FLAGS] [OPTIONS]
<tbody>
<tr>
<td class="code">-w, --max-workers</td>
<td>Number of workers for downloading vods concurrently (default 5)</td>
<td>Maximum number of threads for downloading vods concurrently (default 20)</td>
</tr>
<tr>
@ -79,16 +79,6 @@ twitch-dl download <videos> [FLAGS] [OPTIONS]
<td class="code">-o, --output</td>
<td>Output file name template. See docs for details.</td>
</tr>
<tr>
<td class="code">-r, --rate-limit</td>
<td>Limit the maximum download speed in bytes per second. Use &#x27;k&#x27; and &#x27;m&#x27; suffixes for kbps and mbps.</td>
</tr>
<tr>
<td class="code">-c, --chapter</td>
<td>Download a single chapter of the video. Specify the chapter number or use the flag without a number to display a chapter select prompt.</td>
</tr>
</tbody>
</table>
@ -121,12 +111,6 @@ Setting quality to `audio_only` will download only audio:
twitch-dl download -q audio_only 221837124
```
Download multiple videos one after the other:
```
twitch-dl download 1559928295 1557034274 1555157293 -q source
```
### Overriding the target file name
The target filename can be defined by passing the `--output` option followed by
@ -188,4 +172,4 @@ download command:
```
twitch-dl download 221837124 --auth-token iduetx4i107rn4b9wrgctf590aiktv
```
```
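The `--output` template is expanded from video metadata. A rough sketch of that expansion using `str.format`, with a hypothetical `fields` dict whose keys come from the default template shown elsewhere in this diff:

```
template = "{date}_{id}_{channel_login}_{title_slug}.{format}"

fields = {
    "date": "2022-08-07",
    "id": "1555108011",
    "channel_login": "bananasaurus_rex",
    "title_slug": "cult_of_the_lamb",
    "format": "mkv",
}

try:
    print(template.format(**fields))
    # -> 2022-08-07_1555108011_bananasaurus_rex_cult_of_the_lamb.mkv
except KeyError as e:
    # Unknown template keys raise KeyError, which the CLI reports as
    # "Invalid key ... used in --output".
    print(f"Invalid key {e} used in --output")
```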

View File

@ -33,11 +33,6 @@ twitch-dl videos <channel_name> [FLAGS] [OPTIONS]
<td class="code">-j, --json</td>
<td>Show results as JSON. Ignores <code>--pager</code>.</td>
</tr>
<tr>
<td class="code">-c, --compact</td>
<td>Show videos in compact mode, one line per video</td>
</tr>
</tbody>
</table>
@ -52,7 +47,7 @@ twitch-dl videos <channel_name> [FLAGS] [OPTIONS]
<tr>
<td class="code">-l, --limit</td>
<td>Number of videos to fetch. Defaults to 40 in compact mode, 10 otherwise.</td>
<td>Number of videos to fetch. Defaults to 10.</td>
</tr>
<tr>

View File

@ -1,6 +1,6 @@
# Installation
twitch-dl requires **Python 3.7** or later.
twitch-dl requires **Python 3.5** or later.
## Prerequisite: FFmpeg

View File

@ -21,13 +21,6 @@ for version in data.keys():
changes = data[version]["changes"]
print(f"### [{version} ({date})](https://github.com/ihabunek/twitch-dl/releases/tag/{version})")
print()
if "description" in data[version]:
description = data[version]["description"].strip()
for line in textwrap.wrap(description, 80):
print(line)
print()
for c in changes:
lines = textwrap.wrap(c, 78)
initial = True

View File

@ -44,18 +44,14 @@ if dist_version != version:
release_date = changelog_item["date"]
changes = changelog_item["changes"]
description = changelog_item["description"] if "description" in changelog_item else None
if not isinstance(release_date, date):
print(f"Release date not set for version `{version}` in the changelog.", file=sys.stderr)
sys.exit(1)
commit_message = f"twitch-dl {version}\n\n"
if description:
lines = textwrap.wrap(description.strip(), 72)
commit_message += "\n".join(lines) + "\n\n"
for c in changes:
lines = textwrap.wrap(c, 69)
lines = textwrap.wrap(c, 70)
initial = True
for line in lines:
lead = " *" if initial else " "
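The wrapping logic above prefixes the first wrapped line of each change with ` *` and indents continuation lines. A small sketch of that step, with a hypothetical `change` string and a plain `print` standing in for however the script emits the lines:

```
import textwrap

change = (
    "Decreased default concurrent downloads to 5. This seems to be enough "
    "to saturate the download link in most cases."
)

initial = True
for line in textwrap.wrap(change, 70):
    # First wrapped line gets the bullet, continuation lines are indented.
    lead = " *" if initial else "  "
    print(f"{lead} {line}")
    initial = False
```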

View File

@ -10,33 +10,35 @@ makes it faster.
"""
setup(
name="twitch-dl",
version="2.1.1",
description="Twitch downloader",
name='twitch-dl',
version='1.22.0',
description='Twitch downloader',
long_description=long_description.strip(),
author="Ivan Habunek",
author_email="ivan@habunek.com",
url="https://github.com/ihabunek/twitch-dl/",
author='Ivan Habunek',
author_email='ivan@habunek.com',
url='https://github.com/ihabunek/twitch-dl/',
project_urls={
"Documentation": "https://twitch-dl.bezdomni.net/"
},
keywords="twitch vod video download",
license="GPLv3",
keywords='twitch vod video download',
license='GPLv3',
classifiers=[
"Development Status :: 5 - Production/Stable",
"Environment :: Console",
"License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
"Programming Language :: Python :: 3",
'Development Status :: 5 - Production/Stable',
'License :: OSI Approved :: GNU General Public License v3 (GPLv3)',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
],
packages=find_packages(),
python_requires=">=3.7",
python_requires='>=3.5',
install_requires=[
"m3u8>=1.0.0,<4.0.0",
"httpx>=0.17.0,<1.0.0",
"m3u8>=1.0.0,<2.0.0",
"requests>=2.13,<3.0",
],
entry_points={
"console_scripts": [
"twitch-dl=twitchdl.console:main",
'console_scripts': [
'twitch-dl=twitchdl.console:main',
],
}
)

View File

@ -2,10 +2,7 @@
These tests depend on the channel having some videos and clips published.
"""
import httpx
import m3u8
from twitchdl import twitch
from twitchdl.commands.download import _parse_playlists, get_clip_authenticated_url
TEST_CHANNEL = "bananasaurus_rex"
@ -19,21 +16,6 @@ def test_get_videos():
video = twitch.get_video(video_id)
assert video["id"] == video_id
access_token = twitch.get_access_token(video_id)
assert "signature" in access_token
assert "value" in access_token
playlists = twitch.get_playlists(video_id, access_token)
assert playlists.startswith("#EXTM3U")
name, res, url = next(_parse_playlists(playlists))
playlist = httpx.get(url).text
assert playlist.startswith("#EXTM3U")
playlist = m3u8.loads(playlist)
vod_path = playlist.segments[0].uri
assert vod_path == "0.ts"
def test_get_clips():
"""
@ -43,8 +25,6 @@ def test_get_clips():
assert clips["pageInfo"]
assert len(clips["edges"]) > 0
slug = clips["edges"][0]["node"]["slug"]
clip = twitch.get_clip(slug)
assert clip["slug"] == slug
assert get_clip_authenticated_url(slug, "source")
clip_slug = clips["edges"][0]["node"]["slug"]
clip = twitch.get_clip(clip_slug)
assert clip["slug"] == clip_slug

View File

@ -1,102 +0,0 @@
from twitchdl.progress import Progress
def test_initial_values():
progress = Progress(10)
assert progress.downloaded == 0
assert progress.estimated_total is None
assert progress.progress_perc == 0
assert progress.remaining_time is None
assert progress.speed is None
assert progress.vod_count == 10
assert progress.vod_downloaded_count == 0
def test_downloaded():
progress = Progress(3)
progress.start(1, 300)
progress.start(2, 300)
progress.start(3, 300)
assert progress.downloaded == 0
assert progress.progress_bytes == 0
assert progress.progress_perc == 0
progress.advance(1, 100)
assert progress.downloaded == 100
assert progress.progress_bytes == 100
assert progress.progress_perc == 11
progress.advance(2, 200)
assert progress.downloaded == 300
assert progress.progress_bytes == 300
assert progress.progress_perc == 33
progress.advance(3, 150)
assert progress.downloaded == 450
assert progress.progress_bytes == 450
assert progress.progress_perc == 50
progress.advance(1, 50)
assert progress.downloaded == 500
assert progress.progress_bytes == 500
assert progress.progress_perc == 55
progress.abort(2)
assert progress.downloaded == 500
assert progress.progress_bytes == 300
assert progress.progress_perc == 33
progress.start(2, 300)
progress.advance(1, 150)
progress.advance(2, 300)
progress.advance(3, 150)
assert progress.downloaded == 1100
assert progress.progress_bytes == 900
assert progress.progress_perc == 100
progress.end(1)
progress.end(2)
progress.end(3)
assert progress.downloaded == 1100
assert progress.progress_bytes == 900
assert progress.progress_perc == 100
def test_estimated_total():
progress = Progress(3)
assert progress.estimated_total is None
progress.start(1, 12000)
assert progress.estimated_total == 12000 * 3
progress.start(2, 11000)
assert progress.estimated_total == 11500 * 3
progress.start(3, 10000)
assert progress.estimated_total == 11000 * 3
def test_vod_downloaded_count():
progress = Progress(3)
progress.start(1, 100)
progress.start(2, 100)
progress.start(3, 100)
assert progress.vod_downloaded_count == 0
progress.advance(1, 100)
progress.end(1)
assert progress.vod_downloaded_count == 1
progress.advance(2, 100)
progress.end(2)
assert progress.vod_downloaded_count == 2
progress.advance(3, 100)
progress.end(3)
assert progress.vod_downloaded_count == 3

View File

@ -1,3 +1,3 @@
__version__ = "2.1.1"
__version__ = "1.22.0"
CLIENT_ID = "kimne78kx3ncx6brgo4mv6wki5h1ko"

View File

@ -1,21 +1,17 @@
import asyncio
import httpx
import m3u8
import os
import re
import requests
import shutil
import subprocess
import tempfile
from os import path
from pathlib import Path
from typing import List, Optional, OrderedDict
from urllib.parse import urlparse, urlencode
from twitchdl import twitch, utils
from twitchdl.download import download_file
from twitchdl.download import download_file, download_files
from twitchdl.exceptions import ConsoleError
from twitchdl.http import download_all
from twitchdl.output import print_out
@ -51,9 +47,9 @@ def _select_playlist_interactive(playlists):
print_out("\nAvailable qualities:")
for n, (name, resolution, uri) in enumerate(playlists):
if resolution:
print_out("{}) <b>{}</b> <dim>({})</dim>".format(n + 1, name, resolution))
print_out("{}) {} [{}]".format(n + 1, name, resolution))
else:
print_out("{}) <b>{}</b>".format(n + 1, name))
print_out("{}) {}".format(n + 1, name))
no = utils.read_int("Choose quality", min=1, max=len(playlists) + 1, default=1)
_, _, uri = playlists[no - 1]
@ -137,7 +133,7 @@ def _clip_target_filename(clip, args):
raise ConsoleError("Invalid key {} used in --output. Supported keys are: {}".format(e, supported))
def _get_vod_paths(playlist, start: Optional[int], end: Optional[int]) -> List[str]:
def _get_vod_paths(playlist, start, end):
"""Extract unique VOD paths for download from playlist."""
files = []
vod_start = 0
@ -157,7 +153,7 @@ def _get_vod_paths(playlist, start: Optional[int], end: Optional[int]) -> List[s
return files
def _crete_temp_dir(base_uri: str) -> str:
def _crete_temp_dir(base_uri):
"""Create a temp dir to store downloads if it doesn't exist."""
path = urlparse(base_uri).path.lstrip("/")
temp_dir = Path(tempfile.gettempdir(), "twitch-dl", path)
@ -166,20 +162,15 @@ def _crete_temp_dir(base_uri: str) -> str:
def download(args):
for video_id in args.videos:
download_one(video_id, args)
def download_one(video: str, args):
video_id = utils.parse_video_identifier(video)
video_id = utils.parse_video_identifier(args.video)
if video_id:
return _download_video(video_id, args)
clip_slug = utils.parse_clip_identifier(video)
clip_slug = utils.parse_clip_identifier(args.video)
if clip_slug:
return _download_clip(clip_slug, args)
raise ConsoleError("Invalid input: {}".format(video))
raise ConsoleError("Invalid input: {}".format(args.video))
def _get_clip_url(clip, quality):
@ -227,7 +218,7 @@ def get_clip_authenticated_url(slug, quality):
return "{}?{}".format(url, query)
def _download_clip(slug: str, args) -> None:
def _download_clip(slug, args):
print_out("<dim>Looking up clip...</dim>")
clip = twitch.get_clip(slug)
game = clip["game"]["name"] if clip["game"] else "Unknown"
@ -260,7 +251,7 @@ def _download_clip(slug: str, args) -> None:
print_out("Downloaded: <blue>{}</blue>".format(target))
def _download_video(video_id, args) -> None:
def _download_video(video_id, args):
if args.start and args.end and args.end <= args.start:
raise ConsoleError("End time must be greater than start time")
@ -282,9 +273,6 @@ def _download_video(video_id, args) -> None:
raise ConsoleError("Aborted")
args.overwrite = True
# Chapter select or manual offset
start, end = _determine_time_range(video_id, args)
print_out("<dim>Fetching access token...</dim>")
access_token = twitch.get_access_token(video_id, auth_token=args.auth_token)
@ -295,13 +283,13 @@ def _download_video(video_id, args) -> None:
else _select_playlist_interactive(playlists))
print_out("<dim>Fetching playlist...</dim>")
response = httpx.get(playlist_uri)
response = requests.get(playlist_uri)
response.raise_for_status()
playlist = m3u8.loads(response.text)
base_uri = re.sub("/[^/]+$", "/", playlist_uri)
target_dir = _crete_temp_dir(base_uri)
vod_paths = _get_vod_paths(playlist, start, end)
vod_paths = _get_vod_paths(playlist, args.start, args.end)
# Save playlists for debugging purposes
with open(path.join(target_dir, "playlists.m3u8"), "w") as f:
@ -311,15 +299,11 @@ def _download_video(video_id, args) -> None:
print_out("\nDownloading {} VODs using {} workers to {}".format(
len(vod_paths), args.max_workers, target_dir))
sources = [base_uri + path for path in vod_paths]
targets = [os.path.join(target_dir, "{:05d}.ts".format(k)) for k, _ in enumerate(vod_paths)]
asyncio.run(download_all(sources, targets, args.max_workers, rate_limit=args.rate_limit))
path_map = download_files(base_uri, target_dir, vod_paths, args.max_workers)
# Make a modified playlist which references downloaded VODs
# Keep only the downloaded segments and skip the rest
org_segments = playlist.segments.copy()
path_map = OrderedDict(zip(vod_paths, targets))
playlist.segments.clear()
for segment in org_segments:
if segment.uri in path_map:
@ -344,40 +328,3 @@ def _download_video(video_id, args) -> None:
shutil.rmtree(target_dir)
print_out("\nDownloaded: <green>{}</green>".format(target))
def _determine_time_range(video_id, args):
if args.start or args.end:
return args.start, args.end
if args.chapter is not None:
print_out("<dim>Fetching chapters...</dim>")
chapters = twitch.get_video_chapters(video_id)
if not chapters:
raise ConsoleError("This video has no chapters")
if args.chapter == 0:
chapter = _choose_chapter_interactive(chapters)
else:
try:
chapter = chapters[args.chapter - 1]
except IndexError:
raise ConsoleError(f"Chapter {args.chapter} does not exist. This video has {len(chapters)} chapters.")
print_out(f'Chapter selected: <blue>{chapter["description"]}</blue>\n')
start = chapter["positionMilliseconds"] // 1000
duration = chapter["durationMilliseconds"] // 1000
return start, start + duration
return None, None
def _choose_chapter_interactive(chapters):
print_out("\nChapters:")
for index, chapter in enumerate(chapters):
duration = utils.format_time(chapter["durationMilliseconds"] // 1000)
print_out(f'{index + 1}) <b>{chapter["description"]}</b> <dim>({duration})</dim>')
index = utils.read_int("Select a chapter", 1, len(chapters))
chapter = chapters[index - 1]
return chapter

View File

@ -20,14 +20,12 @@ def info(args):
print_log("Fetching playlists...")
playlists = twitch.get_playlists(video_id, access_token)
print_log("Fetching chapters...")
chapters = twitch.get_video_chapters(video_id)
if args.json:
video_json(video, playlists, chapters)
else:
video_info(video, playlists, chapters)
return
if video:
if args.json:
video_json(video, playlists)
else:
video_info(video, playlists)
return
clip_slug = utils.parse_clip_identifier(args.video)
if clip_slug:
@ -45,7 +43,7 @@ def info(args):
raise ConsoleError("Invalid input: {}".format(args.video))
def video_info(video, playlists, chapters):
def video_info(video, playlists):
print_out()
print_video(video)
@ -54,16 +52,8 @@ def video_info(video, playlists, chapters):
for p in m3u8.loads(playlists).playlists:
print_out("<b>{}</b> {}".format(p.stream_info.video, p.uri))
if chapters:
print_out()
print_out("Chapters:")
for chapter in chapters:
start = utils.format_time(chapter["positionMilliseconds"] // 1000, force_hours=True)
duration = utils.format_time(chapter["durationMilliseconds"] // 1000)
print_out(f'{start} <b>{chapter["description"]}</b> ({duration})')
def video_json(video, playlists, chapters):
def video_json(video, playlists):
playlists = m3u8.loads(playlists).playlists
video["playlists"] = [
@ -76,8 +66,6 @@ def video_json(video, playlists, chapters):
} for p in playlists
]
video["chapters"] = chapters
print_json(video)

View File

@ -2,17 +2,13 @@ import sys
from twitchdl import twitch
from twitchdl.exceptions import ConsoleError
from twitchdl.output import print_out, print_paged_videos, print_video, print_json, print_video_compact
from twitchdl.output import print_out, print_paged_videos, print_video, print_json
def videos(args):
game_ids = _get_game_ids(args.game)
# Set different defaults for limit for compact display
limit = args.limit or (40 if args.compact else 10)
# Ignore --limit if --pager or --all are given
max_videos = sys.maxsize if args.all or args.pager else limit
max_videos = sys.maxsize if args.all or args.pager else args.limit
total_count, generator = twitch.channel_videos_generator(
args.channel_name, max_videos, args.sort, args.type, game_ids=game_ids)
@ -36,11 +32,8 @@ def videos(args):
count = 0
for video in generator:
if args.compact:
print_video_compact(video)
else:
print_out()
print_video(video)
print_out()
print_video(video)
count += 1
print_out()

View File

@ -2,10 +2,9 @@
import logging
import sys
import re
from argparse import ArgumentParser, ArgumentTypeError
from typing import NamedTuple, List, Tuple, Any, Dict
from collections import namedtuple
from twitchdl.exceptions import ConsoleError
from twitchdl.output import print_err
@ -13,19 +12,12 @@ from twitchdl.twitch import GQLError
from . import commands, __version__
Argument = Tuple[List[str], Dict[str, Any]]
Command = namedtuple("Command", ["name", "description", "arguments"])
CLIENT_WEBSITE = 'https://github.com/ihabunek/twitch-dl'
class Command(NamedTuple):
name: str
description: str
arguments: List[Argument]
CLIENT_WEBSITE = "https://twitch-dl.bezdomni.net/"
def time(value: str) -> int:
def time(value):
"""Parse a time string (hh:mm or hh:mm:ss) to number of seconds."""
parts = [int(p) for p in value.split(":")]
@ -42,34 +34,16 @@ def time(value: str) -> int:
return hours * 3600 + minutes * 60 + seconds
def pos_integer(value: str) -> int:
def pos_integer(value):
try:
parsed = int(value)
value = int(value)
except ValueError:
raise ArgumentTypeError("must be an integer")
if parsed < 1:
if value < 1:
raise ArgumentTypeError("must be positive")
return parsed
def rate(value: str) -> int:
match = re.search(r"^([0-9]+)(k|m|)$", value, flags=re.IGNORECASE)
if not match:
raise ArgumentTypeError("must be an integer, followed by an optional 'k' or 'm'")
amount = int(match.group(1))
unit = match.group(2)
if unit == "k":
return amount * 1024
if unit == "m":
return amount * 1024 * 1024
return amount
return value
COMMANDS = [
@ -87,8 +61,9 @@ COMMANDS = [
"type": str,
}),
(["-l", "--limit"], {
"help": "Number of videos to fetch. Defaults to 40 in copmpact mode, 10 otherwise.",
"help": "Number of videos to fetch. Defaults to 10.",
"type": pos_integer,
"default": 10,
}),
(["-a", "--all"], {
"help": "Fetch all videos, overrides --limit",
@ -118,11 +93,6 @@ COMMANDS = [
"nargs": "?",
"const": 10,
}),
(["-c", "--compact"], {
"help": "Show videos in compact mode, one line per video",
"action": "store_true",
"default": False,
}),
],
),
Command(
@ -169,17 +139,17 @@ COMMANDS = [
),
Command(
name="download",
description="Download videos or clips.",
description="Download a video or clip.",
arguments=[
(["videos"], {
"help": "One or more video ID, clip slug or twitch URL to download.",
(["video"], {
"help": "Video ID, clip slug, or URL",
"type": str,
"nargs": "+",
}),
(["-w", "--max-workers"], {
"help": "Number of workers for downloading vods concurrently (default 5)",
"help": "Maximal number of threads for downloading vods "
"concurrently (default 20)",
"type": int,
"default": 5,
"default": 20,
}),
(["-s", "--start"], {
"help": "Download video from this time (hh:mm or hh:mm:ss)",
@ -227,19 +197,7 @@ COMMANDS = [
"help": "Output file name template. See docs for details.",
"type": str,
"default": "{date}_{id}_{channel_login}_{title_slug}.{format}"
}),
(["-r", "--rate-limit"], {
"help": "Limit the maximum download speed in bytes per second. "
"Use 'k' and 'm' suffixes for kbps and mbps.",
"type": rate,
}),
(["-c", "--chapter"], {
"help": "Download a single chapter of the video. Specify the chapter number or "
"use the flag without a number to display a chapter select prompt.",
"type": int,
"nargs": "?",
"const": 0
}),
})
],
),
Command(
@ -323,7 +281,7 @@ def main():
print_err(e)
sys.exit(1)
except KeyboardInterrupt:
print_err("\nOperation canceled")
print_err("Operation canceled")
sys.exit(1)
except GQLError as e:
print_err(e)

View File

@ -1,5 +1,14 @@
import os
import httpx
import requests
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import datetime
from functools import partial
from requests.exceptions import RequestException
from twitchdl.output import print_out
from twitchdl.utils import format_size, format_duration
CHUNK_SIZE = 1024
CONNECT_TIMEOUT = 5
@ -10,20 +19,20 @@ class DownloadFailed(Exception):
pass
def _download(url: str, path: str):
def _download(url, path):
tmp_path = path + ".tmp"
response = requests.get(url, stream=True, timeout=CONNECT_TIMEOUT)
size = 0
with httpx.stream("GET", url, timeout=CONNECT_TIMEOUT) as response:
with open(tmp_path, "wb") as target:
for chunk in response.iter_bytes(chunk_size=CHUNK_SIZE):
target.write(chunk)
size += len(chunk)
with open(tmp_path, 'wb') as target:
for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
target.write(chunk)
size += len(chunk)
os.rename(tmp_path, path)
return size
def download_file(url: str, path: str, retries: int = RETRY_COUNT):
def download_file(url, path, retries=RETRY_COUNT):
if os.path.exists(path):
from_disk = True
return (os.path.getsize(path), from_disk)
@ -32,7 +41,63 @@ def download_file(url: str, path: str, retries: int = RETRY_COUNT):
for _ in range(retries):
try:
return (_download(url, path), from_disk)
except httpx.RequestError:
except RequestException:
pass
raise DownloadFailed(":(")
def _print_progress(futures):
downloaded_count = 0
downloaded_size = 0
max_msg_size = 0
start_time = datetime.now()
total_count = len(futures)
current_download_size = 0
current_downloaded_count = 0
for future in as_completed(futures):
size, from_disk = future.result()
downloaded_count += 1
downloaded_size += size
# If we find something on disk, we don't want to take it into account in
# the speed calculation
if not from_disk:
current_download_size += size
current_downloaded_count += 1
percentage = 100 * downloaded_count // total_count
est_total_size = int(total_count * downloaded_size / downloaded_count)
duration = (datetime.now() - start_time).seconds
speed = current_download_size // duration if duration else 0
remaining = (total_count - downloaded_count) * duration / current_downloaded_count \
if current_downloaded_count else 0
msg = " ".join([
"Downloaded VOD {}/{}".format(downloaded_count, total_count),
"({}%)".format(percentage),
"<cyan>{}</cyan>".format(format_size(downloaded_size)),
"of <cyan>~{}</cyan>".format(format_size(est_total_size)),
"at <cyan>{}/s</cyan>".format(format_size(speed)) if speed > 0 else "",
"remaining <cyan>~{}</cyan>".format(format_duration(remaining)) if remaining > 0 else "",
])
max_msg_size = max(len(msg), max_msg_size)
print_out("\r" + msg.ljust(max_msg_size), end="")
def download_files(base_url, target_dir, vod_paths, max_workers):
"""
Downloads a list of VODs defined by a common `base_url` and a list of
`vod_paths`, returning a dict which maps the paths to the downloaded files.
"""
urls = [base_url + path for path in vod_paths]
targets = [os.path.join(target_dir, "{:05d}.ts".format(k)) for k, _ in enumerate(vod_paths)]
partials = (partial(download_file, url, path) for url, path in zip(urls, targets))
with ThreadPoolExecutor(max_workers=max_workers) as executor:
futures = [executor.submit(fn) for fn in partials]
_print_progress(futures)
return OrderedDict(zip(vod_paths, targets))

twitchdl/ffmpeg.py Normal file (76 lines added)
View File

@ -0,0 +1,76 @@
import asyncio
import json
import logging
import re
from asyncio.subprocess import PIPE
from pprint import pprint
from typing import Optional
from twitchdl.output import print_out
async def join_vods(playlist_path: str, target: str, overwrite: bool, video: dict):
command = [
"ffmpeg",
"-i", playlist_path,
"-c", "copy",
"-metadata", "artist={}".format(video["creator"]["displayName"]),
"-metadata", "title={}".format(video["title"]),
"-metadata", "encoded_by=twitch-dl",
"-stats",
"-loglevel", "warning",
f"file:{target}",
]
if overwrite:
command.append("-y")
# command = ["ls", "-al"]
print_out("<dim>{}</dim>".format(" ".join(command)))
process = await asyncio.create_subprocess_exec(*command, stdout=PIPE, stderr=PIPE)
assert process.stderr is not None
await asyncio.gather(
# _read_stream("stdout", process.stdout),
_print_progress("stderr", process.stderr),
process.wait()
)
print(process.returncode)
async def _read_stream(name: str, stream: Optional[asyncio.StreamReader]):
if stream:
async for line in readlines(stream):
print(name, ">", line)
async def _print_progress(name: str, stream: asyncio.StreamReader):
async for line in readlines(stream):
print(name, ">", line)
pattern = re.compile(br"[\r\n]+")
async def readlines(stream: asyncio.StreamReader):
data = bytearray()
while not stream.at_eof():
lines = pattern.split(data)
data[:] = lines.pop(-1)
for line in lines:
yield line
data.extend(await stream.read(1024))
if __name__ == "__main__":
# logging.basicConfig(level=logging.DEBUG)
video = json.loads('{"id": "1555108011", "title": "Cult of the Lamb", "publishedAt": "2022-08-07T17:00:30Z", "broadcastType": "ARCHIVE", "lengthSeconds": 17948, "game": {"name": "Cult of the Lamb"}, "creator": {"login": "bananasaurus_rex", "displayName": "Bananasaurus_Rex"}, "playlists": [{"bandwidth": 8446533, "resolution": [1920, 1080], "codecs": "avc1.64002A,mp4a.40.2", "video": "chunked", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/chunked/index-dvr.m3u8"}, {"bandwidth": 3432426, "resolution": [1280, 720], "codecs": "avc1.4D0020,mp4a.40.2", "video": "720p60", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/720p60/index-dvr.m3u8"}, {"bandwidth": 1445268, "resolution": [852, 480], "codecs": "avc1.4D001F,mp4a.40.2", "video": "480p30", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/480p30/index-dvr.m3u8"}, {"bandwidth": 215355, "resolution": null, "codecs": "mp4a.40.2", "video": "audio_only", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/audio_only/index-dvr.m3u8"}, {"bandwidth": 705523, "resolution": [640, 360], "codecs": "avc1.4D001E,mp4a.40.2", "video": "360p30", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/360p30/index-dvr.m3u8"}, {"bandwidth": 285614, "resolution": [284, 160], "codecs": "avc1.4D000C,mp4a.40.2", "video": "160p30", "uri": "https://d1m7jfoe9zdc1j.cloudfront.net/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/160p30/index-dvr.m3u8"}]}')
playlist_path = "/tmp/twitch-dl/278bcbd011d28f96b856_bananasaurus_rex_40035345017_1659891626/160p30/playlist_downloaded.m3u8"
asyncio.run(join_vods(playlist_path, "out.mkv", True, video), debug=True)

View File

@ -1,129 +0,0 @@
import asyncio
import httpx
import logging
import os
import time
from typing import List, Optional, Union
from twitchdl.progress import Progress
logger = logging.getLogger(__name__)
KB = 1024
CHUNK_SIZE = 256 * KB
"""How much of a VOD to download in each iteration"""
RETRY_COUNT = 5
"""Number of times to retry failed downloads before aborting."""
TIMEOUT = 30
"""
Number of seconds to wait before aborting when there is no network activity.
https://www.python-httpx.org/advanced/#timeout-configuration
"""
class TokenBucket:
"""Limit the download speed by strategically inserting sleeps."""
def __init__(self, rate: int, capacity: Optional[int] = None):
self.rate: int = rate
self.capacity: int = capacity or rate * 2
self.available: int = 0
self.last_refilled: float = time.time()
def advance(self, size: int):
"""Called every time a chunk of data is downloaded."""
self._refill()
if self.available < size:
deficit = size - self.available
time.sleep(deficit / self.rate)
self.available -= size
def _refill(self):
"""Increase available capacity according to elapsed time since last refill."""
now = time.time()
elapsed = now - self.last_refilled
refill_amount = int(elapsed * self.rate)
self.available = min(self.available + refill_amount, self.capacity)
self.last_refilled = now
class EndlessTokenBucket:
"""Used when download speed is not limited."""
def advance(self, size: int):
pass
AnyTokenBucket = Union[TokenBucket, EndlessTokenBucket]
async def download(
client: httpx.AsyncClient,
task_id: int,
source: str,
target: str,
progress: Progress,
token_bucket: AnyTokenBucket,
):
# Download to a temp file first, then rename to the target when done, to avoid
# leaving behind partial chunks which may persist if canceled or --keep is used
tmp_target = f"{target}.tmp"
with open(tmp_target, "wb") as f:
async with client.stream("GET", source) as response:
size = int(response.headers.get("content-length"))
progress.start(task_id, size)
async for chunk in response.aiter_bytes(chunk_size=CHUNK_SIZE):
f.write(chunk)
size = len(chunk)
token_bucket.advance(size)
progress.advance(task_id, size)
progress.end(task_id)
os.rename(tmp_target, target)
async def download_with_retries(
client: httpx.AsyncClient,
semaphore: asyncio.Semaphore,
task_id: int,
source: str,
target: str,
progress: Progress,
token_bucket: AnyTokenBucket,
):
async with semaphore:
if os.path.exists(target):
size = os.path.getsize(target)
progress.already_downloaded(task_id, size)
return
for n in range(RETRY_COUNT):
try:
return await download(client, task_id, source, target, progress, token_bucket)
except httpx.RequestError:
logger.exception(f"Task {task_id} failed. Retrying. Maybe.")
progress.abort(task_id)
if n + 1 >= RETRY_COUNT:
raise
raise Exception("Should not happen")
async def download_all(
sources: List[str],
targets: List[str],
workers: int,
*,
rate_limit: Optional[int] = None
):
progress = Progress(len(sources))
token_bucket = TokenBucket(rate_limit) if rate_limit else EndlessTokenBucket()
async with httpx.AsyncClient(timeout=TIMEOUT) as client:
semaphore = asyncio.Semaphore(workers)
tasks = [download_with_retries(client, semaphore, task_id, source, target, progress, token_bucket)
for task_id, (source, target) in enumerate(zip(sources, targets))]
await asyncio.gather(*tasks)

View File

@ -6,7 +6,6 @@ import re
from itertools import islice
from twitchdl import utils
from typing import Any, Match
START_CODES = {
@ -30,38 +29,31 @@ END_PATTERN = "</(" + "|".join(START_CODES.keys()) + ")>"
USE_ANSI_COLOR = "--no-color" not in sys.argv
def start_code(match: Match[str]) -> str:
def start_code(match):
name = match.group(1)
return START_CODES[name]
def colorize(text: str) -> str:
def colorize(text):
text = re.sub(START_PATTERN, start_code, text)
text = re.sub(END_PATTERN, END_CODE, text)
return text
def strip_tags(text: str) -> str:
def strip_tags(text):
text = re.sub(START_PATTERN, '', text)
text = re.sub(END_PATTERN, '', text)
return text
def truncate(string: str, length: int) -> str:
if len(string) > length:
return string[:length - 1] + "…"
return string
def print_out(*args, **kwargs):
args = [colorize(a) if USE_ANSI_COLOR else strip_tags(a) for a in args]
print(*args, **kwargs)
def print_json(data: Any):
def print_json(data):
print(json.dumps(data))
@ -97,14 +89,6 @@ def print_video(video):
print_out("<i>{}</i>".format(url))
def print_video_compact(video):
id = video["id"]
date = video["publishedAt"][:10]
game = video["game"]["name"] if video["game"] else ""
title = truncate(video["title"], 80).ljust(80)
print_out(f'<b>{id}</b> {date} <green>{title}</green> <blue>{game}</blue>')
def print_paged_videos(generator, page_size, total_count):
iterator = iter(generator)
page = list(islice(iterator, page_size))

View File

@ -1,137 +0,0 @@
import logging
import time
from collections import deque
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, NamedTuple, Optional, Deque
from twitchdl.output import print_out
from twitchdl.utils import format_size, format_time
logger = logging.getLogger(__name__)
TaskId = int
@dataclass
class Task:
id: TaskId
size: int
downloaded: int = 0
def advance(self, size):
self.downloaded += size
class Sample(NamedTuple):
downloaded: int
timestamp: float
@dataclass
class Progress:
vod_count: int
downloaded: int = 0
estimated_total: Optional[int] = None
last_printed: float = field(default_factory=time.time)
progress_bytes: int = 0
progress_perc: int = 0
remaining_time: Optional[int] = None
speed: Optional[float] = None
start_time: float = field(default_factory=time.time)
tasks: Dict[TaskId, Task] = field(default_factory=dict)
vod_downloaded_count: int = 0
samples: Deque[Sample] = field(default_factory=lambda: deque(maxlen=100))
def start(self, task_id: int, size: int):
if task_id in self.tasks:
raise ValueError(f"Task {task_id}: cannot start, already started")
self.tasks[task_id] = Task(task_id, size)
self._calculate_total()
self._calculate_progress()
self.print()
def advance(self, task_id: int, size: int):
if task_id not in self.tasks:
raise ValueError(f"Task {task_id}: cannot advance, not started")
self.downloaded += size
self.progress_bytes += size
self.tasks[task_id].advance(size)
self.samples.append(Sample(self.downloaded, time.time()))
self._calculate_progress()
self.print()
def already_downloaded(self, task_id: int, size: int):
if task_id in self.tasks:
raise ValueError(f"Task {task_id}: cannot mark as downloaded, already started")
self.tasks[task_id] = Task(task_id, size)
self.progress_bytes += size
self.vod_downloaded_count += 1
self._calculate_total()
self._calculate_progress()
self.print()
def abort(self, task_id: int):
if task_id not in self.tasks:
raise ValueError(f"Task {task_id}: cannot abort, not started")
del self.tasks[task_id]
self.progress_bytes = sum(t.downloaded for t in self.tasks.values())
self._calculate_total()
self._calculate_progress()
self.print()
def end(self, task_id: int):
if task_id not in self.tasks:
raise ValueError(f"Task {task_id}: cannot end, not started")
task = self.tasks[task_id]
if task.size != task.downloaded:
logger.warn(f"Task {task_id} ended with {task.downloaded}b downloaded, expected {task.size}b.")
self.vod_downloaded_count += 1
self.print()
def _calculate_total(self):
self.estimated_total = int(mean(t.size for t in self.tasks.values()) * self.vod_count) if self.tasks else None
def _calculate_progress(self):
self.speed = self._calculate_speed()
self.progress_perc = int(100 * self.progress_bytes / self.estimated_total) if self.estimated_total else 0
self.remaining_time = int((self.estimated_total - self.progress_bytes) / self.speed) if self.estimated_total and self.speed else None
def _calculate_speed(self):
if len(self.samples) < 2:
return None
first_sample = self.samples[0]
last_sample = self.samples[-1]
size = last_sample.downloaded - first_sample.downloaded
duration = last_sample.timestamp - first_sample.timestamp
return size / duration
def print(self):
now = time.time()
# Don't print more often than 10 times per second
if now - self.last_printed < 0.1:
return
progress = " ".join([
f"Downloaded {self.vod_downloaded_count}/{self.vod_count} VODs",
f"<blue>{self.progress_perc}%</blue>",
f"of <blue>~{format_size(self.estimated_total)}</blue>" if self.estimated_total else "",
f"at <blue>{format_size(self.speed)}/s</blue>" if self.speed else "",
f"ETA <blue>{format_time(self.remaining_time)}</blue>" if self.remaining_time is not None else "",
])
print_out(f"\r{progress} ", end="")
self.last_printed = now

View File

@ -2,10 +2,9 @@
Twitch API access.
"""
import httpx
import json
import requests
from typing import Dict
from requests.exceptions import HTTPError
from twitchdl import CLIENT_ID
from twitchdl.exceptions import ConsoleError
@ -16,10 +15,25 @@ class GQLError(Exception):
self.errors = errors
def authenticated_get(url, params={}, headers={}):
headers['Client-ID'] = CLIENT_ID
response = requests.get(url, params, headers=headers)
if 400 <= response.status_code < 500:
data = response.json()
# TODO: this does not look nice in the console since data["message"]
# can contain a JSON encoded object.
raise ConsoleError(data["message"])
response.raise_for_status()
return response
def authenticated_post(url, data=None, json=None, headers={}):
headers['Client-ID'] = CLIENT_ID
response = httpx.post(url, data=data, json=json, headers=headers)
response = requests.post(url, data=data, json=json, headers=headers)
if response.status_code == 400:
data = response.json()
raise ConsoleError(data["message"])
@ -39,7 +53,7 @@ def gql_post(query):
return response
def gql_query(query: str, headers: Dict[str, str] = {}):
def gql_query(query, headers={}):
url = "https://gql.twitch.tv/gql"
response = authenticated_post(url, json={"query": query}, headers=headers).json()
@ -268,7 +282,7 @@ def get_channel_videos(channel_id, limit, sort, type="archive", game_ids=[], aft
return response["data"]["user"]["videos"]
def channel_videos_generator(channel_id, max_videos, sort, type, game_ids=[]):
def channel_videos_generator(channel_id, max_videos, sort, type, game_ids=None):
def _generator(videos, max_videos):
for video in videos["edges"]:
if max_videos < 1:
@ -316,7 +330,7 @@ def get_access_token(video_id, auth_token=None):
try:
response = gql_query(query, headers=headers)
return response["data"]["videoPlaybackAccessToken"]
except httpx.HTTPStatusError as error:
except HTTPError as error:
# Provide a more useful error message when server returns HTTP 401
# Unauthorized while using a user-provided auth token.
if error.response.status_code == 401:
@ -337,7 +351,7 @@ def get_playlists(video_id, access_token):
"""
url = "http://usher.twitch.tv/vod/{}".format(video_id)
response = httpx.get(url, params={
response = requests.get(url, params={
"nauth": access_token['value'],
"nauthsig": access_token['signature'],
"allow_audio_only": "true",
@ -361,32 +375,3 @@ def get_game_id(name):
game = response["data"]["game"]
if game:
return game["id"]
def get_video_chapters(video_id):
query = {
"operationName": "VideoPlayer_ChapterSelectButtonVideo",
"variables":
{
"includePrivate": False,
"videoID": video_id
},
"extensions":
{
"persistedQuery":
{
"version": 1,
"sha256Hash": "8d2793384aac3773beab5e59bd5d6f585aedb923d292800119e03d40cd0f9b41"
}
}
}
response = gql_post(json.dumps(query))
return list(_chapter_nodes(response["data"]["video"]["moments"]))
def _chapter_nodes(collection):
for edge in collection["edges"]:
node = edge["node"]
del node["moments"]
yield node

View File

@ -40,29 +40,13 @@ def format_duration(total_seconds):
return "{} sec".format(seconds)
def format_time(total_seconds, force_hours=False):
total_seconds = int(total_seconds)
hours = total_seconds // 3600
remainder = total_seconds % 3600
minutes = remainder // 60
seconds = total_seconds % 60
if hours or force_hours:
return f"{hours:02}:{minutes:02}:{seconds:02}"
return f"{minutes:02}:{seconds:02}"
def read_int(msg, min, max, default=None):
if default:
msg = msg + f" [default {default}]"
msg += ": "
def read_int(msg, min, max, default):
msg = msg + " [default {}]: ".format(default)
while True:
try:
val = input(msg)
if default and not val:
if not val:
return default
if min <= int(val) <= max:
return int(val)