Compare commits

70 Commits

Author SHA1 Message Date
b982cba566 Bump version, changelog 2020-09-03 12:56:57 +02:00
f4d442c118 Document installation instructions 2020-09-03 12:56:08 +02:00
eecd098f18 Add www before twitch.tv since that's what twitch uses 2020-09-03 12:55:45 +02:00
041689bee9 Construct paths using path libs
Fixes issues with paths on windows.

issue #35
2020-09-03 12:26:28 +02:00
a245ffb6a4 Fix issue with partial downloads
When using --start or --end, only keep the segments which have been
downloaded, and skip the rest.
2020-09-03 11:59:44 +02:00
ac37f179ef Improve bundle command
* directly save to desired file name
* add version number to file name
* remove __pycache__ folders before bundling
* compress the archive
2020-09-03 11:09:38 +02:00
5b200a2cb7 Add a splash of color 2020-09-03 10:38:29 +02:00
bf2a4558f4 Improve VOD joining logic
Instead of creating a file list, create a modified playlist which
references the downloaded files, and give this as input to ffmpeg. Since
ffmpeg handles M3U8 playlists, this means options such as
`EXT-X-BYTERANGE` are supported.

issue #35
2020-09-03 10:38:29 +02:00
772faa5901 Log ffmpeg command and handle errors better 2020-09-03 10:38:28 +02:00
04ddadef26 Add bundling with zipapp 2020-08-13 13:47:03 +02:00
bbed398cf6 Fix version number in init file, bump version 2020-08-11 18:10:21 +02:00
d8863ac695 Bump version, changelog 2020-08-09 12:04:48 +02:00
4edf299780 Handle more gracefully when video/clip not found 2020-08-09 11:55:40 +02:00
78295a492c Improve parsing inputs for download command
* Fix overzealous regex which caused video patterns to be identified as
  clips by testing for videos before clips.
* Allow URLs with or without `www.`.
* Add tests which verify this actually works.

fixes #28
2020-08-09 11:45:39 +02:00
f456d04de6 Bump version, update changelog and readme 2020-08-07 16:38:55 +02:00
8222df3670 Allow numbers in clip slugs
issue #24
2020-08-07 16:35:56 +02:00
65663d3505 Fix url to video shown when listing videos 2020-08-07 16:24:23 +02:00
706e42d197 Make quality selectable for clip download 2020-08-07 16:23:08 +02:00
3cfa05a3ee Make quality selectable for video download 2020-08-07 16:23:00 +02:00
4f62a26c30 Bump version, changelog 2020-06-10 12:07:59 +02:00
2171a9e08e Allow unicode values in slugs
Otherwise non-ascii characters get stripped which is not good for
e.g. titles in cyrillic script.
2020-06-10 10:54:28 +02:00
15ca684286 Don't unpack options
This makes it more readable as option count increases.
2020-05-30 10:21:19 +02:00
fd56a16c41 Fix option to use kebab case like the rest 2020-05-30 09:48:31 +02:00
4885c6a3b7 Add requirements to readme 2020-05-29 13:57:02 +02:00
2cf66c022c Don't break if game is None 2020-05-29 13:55:54 +02:00
717f634dda Remove unused code 2020-05-29 13:51:51 +02:00
169f15ca30 Add --game example to README 2020-05-17 14:46:08 +02:00
58458553bc Bump version 2020-05-17 14:42:55 +02:00
cabc8ff327 Improve paging 2020-05-17 14:41:11 +02:00
d22fd74357 Add filtering videos by game 2020-05-17 14:35:33 +02:00
4241ab5d67 Make less important messages dim 2020-05-17 14:32:37 +02:00
94e9f6aa80 Extract graphql query function 2020-05-17 13:48:48 +02:00
b014d94366 Blue is nicer than cyan 2020-05-17 13:48:16 +02:00
ea01ef3d99 Add paging to videos command 2020-05-17 13:41:34 +02:00
2118cd8825 Use graphql to fetch channel videos
The old helix endpoint returns HTTP 401

fixes #18
2020-05-17 11:57:16 +02:00
6c28dd2f5e Bump version 2020-04-25 20:06:02 +02:00
e3dde90870 Specify broadcast type when listing videos
issue #13
2020-04-25 20:04:21 +02:00
c628757ac0 Fix error message 2020-04-12 11:44:01 +02:00
5e97b439a7 Bump version, changelog 2020-04-11 20:57:43 +02:00
07f3a2fa48 Implement downloading clips
issue #15
2020-04-11 16:07:17 +02:00
96f13e9cf7 Bump version, changelog 2020-04-11 14:07:14 +02:00
c9547435df Nicer output while downloading VODs, bright colors 2020-04-11 14:05:23 +02:00
042d35ba1e Override local file names for downloaded vods
Sometimes the playlists contain more than just file names which can
break the ffmpeg join, so just name downloaded vods sequentially.

fixes #12
2020-04-11 13:20:59 +02:00
ebc754072d Reorganise code 2020-04-11 13:08:42 +02:00
cb00accd6a Better long description 2020-04-10 16:42:35 +02:00
64157c1ef6 Bump version 2020-04-10 16:34:37 +02:00
6a8da3b01b Don't print error messages when retrying
Only die if all retries fail.
2020-04-10 16:22:15 +02:00
e29d42e9ef Use Twitch's client ID
Fetching access token with own client ID no longer works.

Everybody else in the world seems to be doing it:
https://github.com/search?p=2&q=kimne78kx3ncx6brgo4mv6wki5h1ko&type=Code
2020-04-10 16:21:10 +02:00
100aa53b84 Bump version 2019-08-23 13:08:57 +02:00
e384f26444 Save playlists to temp dir for debugging 2019-08-23 13:08:35 +02:00
000754af8c Use m3u8 lib to parse playlists 2019-08-23 12:36:05 +02:00
6813bb51b4 Add option not to delete downloaded VODs 2019-08-23 10:16:49 +02:00
34b0592cf3 Fix usage of deprecated v3 API
related #8
2019-08-23 09:03:33 +02:00
e72f8e24ea Bump version 2019-08-13 12:40:00 +02:00
3c99e9975b Bump version, changelog 2019-08-13 12:31:44 +02:00
f807d4324b Style the url in video list 2019-08-13 12:29:42 +02:00
68a8b70948 Add offset and sort options to videos command
fixes #7
2019-08-13 12:25:25 +02:00
932b7750b9 Print video URL 2019-08-12 15:21:57 +02:00
aa5f17cbdb Show errors returned via HTTP 400 2019-08-12 15:14:13 +02:00
0fff0a4de1 Exit with nonzero code on error 2019-08-12 13:47:48 +02:00
9dc67a7ff1 Bump version, add changelog 2019-07-05 13:15:59 +02:00
3e7f310e36 Add --version option to print program version 2019-07-05 13:14:22 +02:00
cbb0d6cfbd Allow specifying the output format
i.e. the output file extension passed to ffmpeg
2019-07-05 13:04:09 +02:00
46d2654cfa Bump to stable 2019-06-06 11:48:28 +02:00
0f54c527be Remove redundant bdist_wheel setting
py3 is the default
2019-06-06 11:47:04 +02:00
8133d93436 Don't make universal wheels (py2 not supported) 2019-06-06 11:44:56 +02:00
9345dd966f Bump version, add changelog 2019-06-06 11:10:21 +02:00
e9bd706194 Allow limiting download by start and end time 2019-06-06 11:06:33 +02:00
357379a6a1 Update Makefile 2019-06-06 09:28:23 +02:00
0c88de3862 Bump version 2019-04-30 13:47:16 +02:00
19 changed files with 954 additions and 239 deletions
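
Note on commit bf2a4558f4: the joining step now hands ffmpeg a rewritten playlist instead of a concat file list. The diff below does this with the `m3u8` library; the following is only a minimal stdlib sketch of the same idea, assuming each segment is a run of `#EXTINF` / `#EXT-X-BYTERANGE` tag lines followed by one URI line. `rewrite_playlist` and `path_map` (segment URI to local path) are hypothetical names used for illustration.

```python
def rewrite_playlist(playlist_text, path_map):
    """Return playlist text that keeps only the segments found in path_map.

    Hypothetical helper: segment URIs are replaced with local paths, and
    segments which were not downloaded are dropped together with their tags.
    """
    output = []
    pending_tags = []  # segment-level tags waiting for their URI line

    for line in playlist_text.splitlines():
        if not line.strip():
            continue
        if line.startswith(("#EXTINF", "#EXT-X-BYTERANGE")):
            pending_tags.append(line)
        elif line.startswith("#"):
            output.append(line)  # playlist-level tag, kept unchanged
        else:
            # A URI line closes the current segment
            if line in path_map:
                output.extend(pending_tags)
                output.append(path_map[line])
            pending_tags = []

    return "\n".join(output) + "\n"
```

Because segment tags travel with their URI, a byte-range tag on a kept segment survives the rewrite, which is the property the commit message relies on.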

2
.gitignore vendored

@ -11,3 +11,5 @@ tmp/
/htmlcov
/twitch-dl-*.tar.gz
/twitch-dl.1.man
/bundle
/*.pyz

102
CHANGELOG.md Normal file

@ -0,0 +1,102 @@
Twitch Downloader change log
============================
1.11.0 (2020-09-03)
-------------------
* Make downloading more robust, fixes issues with some VODs (#35)
* Bundle twitch-dl to a standalone archive, simplifying installation, see
installation instructions in README
1.10.2 (2020-08-11)
-------------------
* Fix version number displayed by `twitch-dl --version` (#29)
1.10.1 (2020-08-09)
-------------------
* Fix videos incorrectly identified as clips (#28)
* Make download command work with video URLs lacking "www" before "twitch.tv"
* Print an error when video or clip is not found instead of an exception trace
1.10.0 (2020-08-07)
-------------------
* Add `--quality` option to `download` command, allows specifying the video
quality to download. In this case, twitch-dl will require no user input. (#22)
* Fix download of clips which contain numbers in their slug (#24)
* Fix URL to video displayed by `videos` command (it was missing /videos/)
1.9.0 (2020-06-10)
------------------
* **Breaking**: wrongly named `--max_workers` option changed to `--max-workers`.
The shorthand option `-w` remains the same.
* Fix bug where `videos` command would crash if there was no game info (#21)
* Allow unicode characters in filenames, no longer strips e.g. cyrillic script
1.8.0 (2020-05-17)
------------------
* Fix videos command (#18)
* **Breaking**: `videos` command no longer takes the `--offset` parameter due to
API changes
* Add paging to `videos` command to replace offset
* Add `--game` option to `videos` command to filter by game
1.7.0 (2020-04-25)
------------------
* Support for specifying broadcast type when listing videos (#13)
1.6.0 (2020-04-11)
------------------
* Support for downloading clips (#15)
1.5.1 (2020-04-11)
------------------
* Fix VOD naming issue (#12)
* Nice console output while downloading
1.5.0 (2020-04-10)
------------------
* Fix video downloads after Twitch deprecated access token access
* Don't print errors when retrying download, only if all fails
1.4.0 (2019-08-23)
------------------
* Fix usage of deprecated v3 API
* Use m3u8 lib for parsing playlists
* Add `--keep` option to preserve downloaded VODs
1.3.1 (2019-08-13)
------------------
* No changes, bumped to fix issue with pypi
1.3.0 (2019-08-13)
------------------
* Add `--sort` and `--offset` options to `videos` command, allows paging (#7)
* Show video URL in `videos` command output
1.2.0 (2019-07-05)
------------------
* Add `--format` option to `download` command for specifying the output format (#6)
* Add `--version` option for printing program version
1.1.0 (2019-06-06)
------------------
* Allow limiting download by start and end time
1.0.0 (2019-04-30)
------------------
* Initial release

View File

@ -1,22 +1,26 @@
default : clean dist
dist :
@echo "\nMaking source"
@echo "-------------"
@python setup.py sdist
@echo "\nMaking wheel"
@echo "-------------"
@python setup.py bdist_wheel --universal
@echo "\nDone."
python setup.py sdist --formats=gztar,zip
python setup.py bdist_wheel --python-tag=py3
clean :
find . -name "*pyc" | xargs rm -rf $1
rm -rf build dist MANIFEST htmlcov deb_dist twitch-dl*.tar.gz twitch-dl.1.man
rm -rf build dist bundle MANIFEST htmlcov deb_dist twitch-dl*.tar.gz twitch-dl.1.man
bundle:
mkdir bundle
cp twitchdl/__main__.py bundle
pip install . --target=bundle
rm -rf bundle/*.dist-info
find bundle/ -type d -name "__pycache__" -exec rm -rf {} +
python -m zipapp \
--python "/usr/bin/env python3" \
--output twitch-dl.`git describe`.pyz bundle \
--compress
publish :
twine upload dist/*
twine upload dist/*.tar.gz dist/*.whl
coverage:
py.test --cov=twitchdl --cov-report html tests/
@ -26,3 +30,6 @@ deb:
man:
scdoc < twitch-dl.1.scd > twitch-dl.1.man
test:
pytest
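
Note on the `bundle` target: it shells out to `python -m zipapp`. For reference, the same archive can probably be produced from Python with the standard library's `zipapp.create_archive`. This is a sketch only; `build_bundle` is a hypothetical helper that is not part of this change, and it assumes the source directory already contains `__main__.py`. The `compressed` argument needs Python 3.7 or newer.

```python
import zipapp
from pathlib import Path


def build_bundle(source_dir, target, interpreter="/usr/bin/env python3"):
    """Pack source_dir (which must contain __main__.py) into a .pyz archive.

    Hypothetical helper, not part of twitch-dl.
    """
    zipapp.create_archive(
        source_dir,
        target=target,
        interpreter=interpreter,  # written as the archive's shebang line
        compressed=True,          # same effect as --compress (Python 3.7+)
    )
    return Path(target)
```

Usage would look like `build_bundle("bundle", "twitch-dl.pyz")`, after which `python3 twitch-dl.pyz --help` runs the archive.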

View File

@ -1,10 +1,10 @@
Twitch Downloader
=================
A simple CLI tool for downloading videos from Twitch.
CLI tool for downloading videos from twitch.tv
Inspired by youtube-dl but improves upon it by using multiple concurrent
connections to make the download faster.
Inspired by [youtube-dl](https://youtube-dl.org/) but improves upon it by using
multiple concurrent connections to make the download faster.
Resources
---------
@ -13,6 +13,69 @@ Resources
* Issues: https://github.com/ihabunek/twitch-dl/issues
* Python package: https://pypi.org/project/twitch-dl/
Requirements
------------
* Python 3.5+
* [ffmpeg](https://ffmpeg.org/download.html), installed and on the system path
Installation
------------
### Download standalone archive
Go to the [latest release](https://github.com/ihabunek/twitch-dl/releases/latest)
and download the `twitch-dl.<version>.pyz` archive.
Run the archive by either:
a) passing it to python:
```
python3 twitch-dl.1.10.2.pyz --help
```
b) making it executable and invoking it directly (Linux specific):
```
chmod +x twitch-dl.1.10.2.pyz
./twitch-dl.1.10.2.pyz --help
```
Feel free to rename the archive to something more manageable, like `twitch-dl`.
To upgrade to a newer version, repeat the process with the newer release.
### From PyPI using pipx
**pipx** installs Python apps into isolated environments, which avoids
dependency conflicts with other packages, so it is the suggested way to install
twitch-dl from PyPI.
Install pipx as described in
[pipx install docs](https://pipxproject.github.io/pipx/installation/).
Install twitch-dl:
```
pipx install twitch-dl
```
Check installation worked:
```
twitch-dl --help
```
If twitch-dl executable is not found, check that the pipx binary location (by
default `~/.local/bin`) is in your PATH.
To upgrade twitch-dl to the latest version:
```
pipx upgrade twitch-dl
```
Usage
-----
@ -43,6 +106,12 @@ Bananasaurus_Rex playing Dead Space
Published 2018-01-21 @ 05:47:03 Length: 5h 7min
```
Use the `--game` option to specify one or more games to show:
```
twitch-dl videos --game "doom eternal" --game "cave story" bananasaurus_rex
```
Download a stream by ID or URL:
```
@ -50,6 +119,27 @@ twitch-dl download 221837124
twitch-dl download https://www.twitch.tv/videos/221837124
```
Specify video quality to download:
```
twitch-dl download -q 720p 221837124
```
Download a clip by slug or URL:
```
twitch-dl download VenomousTameWormHumbleLife
twitch-dl download https://www.twitch.tv/bananasaurus_rex/clip/VenomousTameWormHumbleLife
```
Specify clip quality to download:
```
twitch-dl download -q 720 VenomousTameWormHumbleLife
```
Note that Twitch's names for clip qualities have no trailing "p".
Man page
--------
@ -64,6 +154,6 @@ make man
License
-------
Copyright 2018 Ivan Habunek <ivan@habunek.com>
Copyright 2018-2020 Ivan Habunek <ivan@habunek.com>
Licensed under the GPLv3: http://www.gnu.org/licenses/gpl-3.0.html

View File

@ -1,5 +1,3 @@
pytest-cov~=2.4.0
pytest~=3.0.0
stdeb~=0.8.5
twine~=1.8.1
wheel~=0.29.0
pytest
twine
wheel

View File

@ -1,2 +0,0 @@
[bdist_wheel]
universal=1

View File

@ -2,27 +2,35 @@
from setuptools import setup
long_description = """
Quickly download videos from twitch.tv.
Works similarly to youtube-dl but downloads multiple VODs in parallel which
makes it faster.
"""
setup(
name='twitch-dl',
version='0.1.0',
version='1.11.0',
description='Twitch downloader',
long_description="A simple script for downloading videos from Twitch",
long_description=long_description.strip(),
author='Ivan Habunek',
author_email='ivan@habunek.com',
url='https://github.com/ihabunek/twitch-dl/',
keywords='twitch vod video download',
license='GPLv3',
classifiers=[
'Development Status :: 4 - Beta',
'Development Status :: 5 - Production/Stable',
'License :: OSI Approved :: GNU General Public License v3 (GPLv3)',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
],
packages=['twitchdl'],
python_requires='>=3.5',
install_requires=[
"m3u8>=0.3.12,<0.4",
"requests>=2.13,<3.0",
],
entry_points={

View File

@ -1,3 +1,3 @@
[DEFAULT]
X-Python3-Version: >= 3.3
X-Python3-Version: >= 3.5
Copyright-File: LICENSE

45
tests/test_download.py Normal file

@ -0,0 +1,45 @@
import pytest
from unittest.mock import patch
from twitchdl.commands import download
from collections import namedtuple
Args = namedtuple("args", ["video"])
TEST_VIDEO_PATTERNS = [
("702689313", "702689313"),
("702689313", "https://twitch.tv/videos/702689313"),
("702689313", "https://www.twitch.tv/videos/702689313"),
]
TEST_CLIP_PATTERNS = {
("AbrasivePlayfulMangoMau5", "AbrasivePlayfulMangoMau5"),
("AbrasivePlayfulMangoMau5", "https://clips.twitch.tv/AbrasivePlayfulMangoMau5"),
("AbrasivePlayfulMangoMau5", "https://www.twitch.tv/dracul1nx/clip/AbrasivePlayfulMangoMau5"),
("AbrasivePlayfulMangoMau5", "https://twitch.tv/dracul1nx/clip/AbrasivePlayfulMangoMau5"),
("HungryProudRadicchioDoggo", "HungryProudRadicchioDoggo"),
("HungryProudRadicchioDoggo", "https://clips.twitch.tv/HungryProudRadicchioDoggo"),
("HungryProudRadicchioDoggo", "https://www.twitch.tv/bananasaurus_rex/clip/HungryProudRadicchioDoggo?filter=clips&range=7d&sort=time"),
("HungryProudRadicchioDoggo", "https://twitch.tv/bananasaurus_rex/clip/HungryProudRadicchioDoggo?filter=clips&range=7d&sort=time"),
}
@patch("twitchdl.commands._download_clip")
@patch("twitchdl.commands._download_video")
@pytest.mark.parametrize("expected,input", TEST_VIDEO_PATTERNS)
def test_video_patterns(video_dl, clip_dl, expected, input):
args = Args(video=input)
download(args)
video_dl.assert_called_once_with(expected, args)
clip_dl.assert_not_called()
@patch("twitchdl.commands._download_clip")
@patch("twitchdl.commands._download_video")
@pytest.mark.parametrize("expected,input", TEST_CLIP_PATTERNS)
def test_clip_patterns(video_dl, clip_dl, expected, input):
args = Args(video=input)
download(args)
clip_dl.assert_called_once_with(expected, args)
video_dl.assert_not_called()
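
Note on what these tests pin down: a purely numeric input also satisfies the bare clip-slug pattern, so video patterns have to be tried first. The sketch below reproduces that ordering in isolation. `classify` is a hypothetical stand-in for the `download` command's dispatch, and the patterns are a slightly stricter variant of the ones in this change (dots escaped, empty input rejected).

```python
import re

VIDEO_PATTERNS = [
    r"^(?P<id>\d+)$",
    r"^https://(www\.)?twitch\.tv/videos/(?P<id>\d+)(\?.+)?$",
]

CLIP_PATTERNS = [
    r"^(?P<slug>[A-Za-z0-9]+)$",
    r"^https://(www\.)?twitch\.tv/\w+/clip/(?P<slug>[A-Za-z0-9]+)(\?.+)?$",
    r"^https://clips\.twitch\.tv/(?P<slug>[A-Za-z0-9]+)(\?.+)?$",
]


def classify(value):
    """Return ("video", id) or ("clip", slug), or None if nothing matches.

    Hypothetical helper; videos are tested before clips on purpose.
    """
    for pattern in VIDEO_PATTERNS:
        match = re.match(pattern, value)
        if match:
            return "video", match.group("id")

    for pattern in CLIP_PATTERNS:
        match = re.match(pattern, value)
        if match:
            return "clip", match.group("slug")

    return None
```

Swapping the two loops would classify `702689313` as a clip, which is the regression fixed in #28.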

View File

@ -24,18 +24,45 @@ List recent videos from bananasaurus\_rex's channel:
twitch-dl videos bananasaurus_rex
```
Download by URL:
Download video by URL:
```
twitch-dl download https://www.twitch.tv/videos/377220226
```
Download by ID:
Download video by ID:
```
twitch-dl download 377220226
```
Specify output format:
```
twitch-dl download --format=avi 377220226
```
Partial download by setting start and end time (hh:mm or hh:mm:ss):
```
twitch-dl download --start=00:10 --end=02:15 377220226
```
Download clip by URL:
```
twitch-dl download https://www.twitch.tv/bananasaurus_rex/clip/VenomousTameWormHumbleLife
```
Download clip by slug:
```
twitch-dl download VenomousTameWormHumbleLife
```
Note that clips are a single download, and don't benefit from the parallelism
used when downloading videos.
# SEE ALSO
youtube-dl(1)

View File

@ -1,3 +1,3 @@
__version__ = "0.1.0"
__version__ = "1.11.0"
CLIENT_ID = "miwy5zk23vh2he94san0bzj5ks1r0p"
CLIENT_ID = "kimne78kx3ncx6brgo4mv6wki5h1ko"

3
twitchdl/__main__.py Normal file

@ -0,0 +1,3 @@
from twitchdl.console import main
main()

View File

@ -1,147 +1,121 @@
import os
import pathlib
import m3u8
import re
import requests
import shutil
import subprocess
import tempfile
from datetime import datetime
from concurrent.futures import ThreadPoolExecutor, as_completed
from functools import partial
from os import path
from pathlib import Path
from urllib.parse import urlparse
from twitchdl import twitch
from twitchdl.download import download_file
from twitchdl import twitch, utils
from twitchdl.download import download_file, download_files
from twitchdl.exceptions import ConsoleError
from twitchdl.output import print_out
from twitchdl.utils import slugify
from twitchdl.output import print_out, print_video
def read_int(msg, min, max, default):
msg = msg + " [default {}]: ".format(default)
def _continue():
print_out(
"\nThere are more videos. "
"Press <green><b>Enter</green> to continue, "
"<yellow><b>Ctrl+C</yellow> to break."
)
while True:
try:
val = input(msg)
if not val:
return default
if min <= int(val) <= max:
return int(val)
except ValueError:
pass
try:
input()
except KeyboardInterrupt:
return False
return True
def format_size(bytes_):
if bytes_ < 1024:
return str(bytes_)
def _get_game_ids(names):
if not names:
return []
kilo = bytes_ / 1024
if kilo < 1024:
return "{:.1f}K".format(kilo)
game_ids = []
for name in names:
print_out("<dim>Looking up game '{}'...</dim>".format(name))
game_id = twitch.get_game_id(name)
if not game_id:
raise ConsoleError("Game '{}' not found".format(name))
game_ids.append(int(game_id))
mega = kilo / 1024
if mega < 1024:
return "{:.1f}M".format(mega)
return "{:.1f}G".format(mega / 1024)
return game_ids
def format_duration(total_seconds):
total_seconds = int(total_seconds)
hours = total_seconds // 3600
remainder = total_seconds % 3600
minutes = remainder // 60
seconds = total_seconds % 60
def videos(args):
game_ids = _get_game_ids(args.game)
if hours:
return "{} h {} min".format(hours, minutes)
print_out("<dim>Loading videos...</dim>")
generator = twitch.channel_videos_generator(
args.channel_name, args.limit, args.sort, args.type, game_ids=game_ids)
if minutes:
return "{} min {} sec".format(minutes, seconds)
first = 1
return "{} sec".format(seconds)
for videos, has_more in generator:
count = len(videos["edges"]) if "edges" in videos else 0
total = videos["totalCount"]
last = first + count - 1
print_out("-" * 80)
print_out("<yellow>Showing videos {}-{} of {}</yellow>".format(first, last, total))
for video in videos["edges"]:
print_video(video["node"])
if not has_more or not _continue():
break
first += count
else:
print_out("<yellow>No videos found</yellow>")
def _print_video(video):
published_at = video['published_at'].replace('T', ' @ ').replace('Z', '')
length = format_duration(video['length'])
name = video['channel']['display_name']
def _parse_playlists(playlists_m3u8):
playlists = m3u8.loads(playlists_m3u8)
print_out("\n<bold>{}</bold>".format(video['_id'][1:]))
print_out("<green>{}</green>".format(video["title"]))
print_out("<cyan>{}</cyan> playing <cyan>{}</cyan>".format(name, video['game']))
print_out("Published <cyan>{}</cyan> Length: <cyan>{}</cyan> ".format(published_at, length))
for p in playlists.playlists:
name = p.media[0].name if p.media else ""
resolution = "x".join(str(r) for r in p.stream_info.resolution)
yield name, resolution, p.uri
def videos(channel_name, **kwargs):
videos = twitch.get_channel_videos(channel_name)
def _get_playlist_by_name(playlists, quality):
for name, _, uri in playlists:
if name == quality:
return uri
print("Found {} videos".format(videos["_total"]))
for video in videos['videos']:
_print_video(video)
available = ", ".join([name for (name, _, _) in playlists])
msg = "Quality '{}' not found. Available qualities are: {}".format(quality, available)
raise ConsoleError(msg)
def _select_quality(playlists):
def _select_playlist_interactive(playlists):
print_out("\nAvailable qualities:")
for no, v in playlists.items():
print_out("{}) {}".format(no, v[0]))
for n, (name, resolution, uri) in enumerate(playlists):
print_out("{}) {} [{}]".format(n + 1, name, resolution))
keys = list(playlists.keys())
no = read_int("Choose quality", min=min(keys), max=max(keys), default=keys[0])
return playlists[no]
no = utils.read_int("Choose quality", min=1, max=len(playlists), default=1)
_, _, uri = playlists[no - 1]
return uri
def _print_progress(futures):
counter = 1
total = len(futures)
total_size = 0
start_time = datetime.now()
for future in as_completed(futures):
size = future.result()
percentage = 100 * counter // total
total_size += size
duration = (datetime.now() - start_time).seconds
speed = total_size // duration if duration else 0
remaining = (total - counter) * duration / counter
msg = "Downloaded VOD {}/{} ({}%) total <cyan>{}B</cyan> at <cyan>{}B/s</cyan> remaining <cyan>{}</cyan>".format(
counter, total, percentage, format_size(total_size), format_size(speed), format_duration(remaining))
print_out("\r" + msg.ljust(80), end='')
counter += 1
def _download_files(base_url, directory, filenames, max_workers):
urls = [base_url.format(f) for f in filenames]
paths = ["/".join([directory, f]) for f in filenames]
partials = (partial(download_file, url, path) for url, path in zip(urls, paths))
with ThreadPoolExecutor(max_workers=max_workers) as executor:
futures = [executor.submit(fn) for fn in partials]
_print_progress(futures)
return paths
def _join_vods(directory, paths, target):
input_path = "{}/files.txt".format(directory)
with open(input_path, 'w') as f:
for path in paths:
f.write('file {}\n'.format(os.path.basename(path)))
result = subprocess.run([
def _join_vods(playlist_path, target):
command = [
"ffmpeg",
"-f", "concat",
"-i", input_path,
"-i", playlist_path,
"-c", "copy",
target,
"-stats",
"-loglevel", "warning",
])
]
result.check_returncode()
print_out("<dim>{}</dim>".format(" ".join(command)))
result = subprocess.run(command)
if result.returncode != 0:
raise ConsoleError("Joining files failed")
def _video_target_filename(video, format):
@ -152,57 +126,182 @@ def _video_target_filename(video, format):
date,
video['_id'][1:],
video['channel']['name'],
slugify(video['title']),
utils.slugify(video['title']),
])
return name + "." + format
def parse_video_id(video_id):
"""This can be either a integer ID or an URL to the video on twitch."""
if re.search(r"^\d+$", video_id):
return int(video_id)
def _get_vod_paths(playlist, start, end):
"""Extract unique VOD paths for download from playlist."""
files = []
vod_start = 0
for segment in playlist.segments:
vod_end = vod_start + segment.duration
match = re.search(r"^https://www.twitch.tv/videos/(\d+)(\?.+)?$", video_id)
if match:
return int(match.group(1))
# `vod_end > start` is used here because it's better to download a bit
# more than a bit less, similar for the end condition
start_condition = not start or vod_end > start
end_condition = not end or vod_start < end
raise ConsoleError("Invalid video ID given, expected integer ID or Twitch URL")
if start_condition and end_condition and segment.uri not in files:
files.append(segment.uri)
vod_start = vod_end
return files
def download(video_id, max_workers, format='mkv', **kwargs):
video_id = parse_video_id(video_id)
def _crete_temp_dir(base_uri):
"""Create a temp dir to store downloads if it doesn't exist."""
path = urlparse(base_uri).path.lstrip("/")
temp_dir = Path(tempfile.gettempdir(), "twitch-dl", path)
temp_dir.mkdir(parents=True, exist_ok=True)
return temp_dir
print_out("Looking up video...")
VIDEO_PATTERNS = [
r"^(?P<id>\d+)$",
r"^https://(www\.)?twitch\.tv/videos/(?P<id>\d+)(\?.+)?$",
]
CLIP_PATTERNS = [
r"^(?P<slug>[A-Za-z0-9]+)$",
r"^https://(www\.)?twitch\.tv/\w+/clip/(?P<slug>[A-Za-z0-9]+)(\?.+)?$",
r"^https://clips\.twitch\.tv/(?P<slug>[A-Za-z0-9]+)(\?.+)?$",
]
def download(args):
for pattern in VIDEO_PATTERNS:
match = re.match(pattern, args.video)
if match:
video_id = match.group('id')
return _download_video(video_id, args)
for pattern in CLIP_PATTERNS:
match = re.match(pattern, args.video)
if match:
clip_slug = match.group('slug')
return _download_clip(clip_slug, args)
raise ConsoleError("Invalid video: {}".format(args.video))
def _get_clip_url(clip, args):
qualities = clip["videoQualities"]
# Quality given as an argument
if args.quality:
selected_quality = args.quality.rstrip("p") # allow 720p as well as 720
for q in qualities:
if q["quality"] == selected_quality:
return q["sourceURL"]
available = ", ".join([str(q["quality"]) for q in qualities])
msg = "Quality '{}' not found. Available qualities are: {}".format(args.quality, available)
raise ConsoleError(msg)
# Ask user to select quality
print_out("\nAvailable qualities:")
for n, q in enumerate(qualities):
print_out("{}) {} [{} fps]".format(n + 1, q["quality"], q["frameRate"]))
print_out()
no = utils.read_int("Choose quality", min=1, max=len(qualities), default=1)
selected_quality = qualities[no - 1]
return selected_quality["sourceURL"]
def _download_clip(slug, args):
print_out("<dim>Looking up clip...</dim>")
clip = twitch.get_clip(slug)
if not clip:
raise ConsoleError("Clip '{}' not found".format(slug))
print_out("Found: <green>{}</green> by <yellow>{}</yellow>, playing <blue>{}</blue> ({})".format(
clip["title"],
clip["broadcaster"]["displayName"],
clip["game"]["name"],
utils.format_duration(clip["durationSeconds"])
))
url = _get_clip_url(clip, args)
print_out("<dim>Selected URL: {}</dim>".format(url))
url_path = urlparse(url).path
extension = Path(url_path).suffix
filename = "{}_{}{}".format(
clip["broadcaster"]["login"],
utils.slugify(clip["title"]),
extension
)
print_out("Downloading clip...")
download_file(url, filename)
print_out("Downloaded: {}".format(filename))
def _download_video(video_id, args):
if args.start and args.end and args.end <= args.start:
raise ConsoleError("End time must be greater than start time")
print_out("<dim>Looking up video...</dim>")
video = twitch.get_video(video_id)
print_out("Found: <blue>{}</blue> by <yellow>{}</yellow>".format(
video['title'], video['channel']['display_name']))
print_out("Fetching access token...")
print_out("<dim>Fetching access token...</dim>")
access_token = twitch.get_access_token(video_id)
print_out("Fetching playlists...")
playlists = twitch.get_playlists(video_id, access_token)
quality, playlist_url = _select_quality(playlists)
print_out("<dim>Fetching playlists...</dim>")
playlists_m3u8 = twitch.get_playlists(video_id, access_token)
playlists = list(_parse_playlists(playlists_m3u8))
playlist_uri = (_get_playlist_by_name(playlists, args.quality) if args.quality
else _select_playlist_interactive(playlists))
print_out("\nFetching playlist...")
base_url, filenames = twitch.get_playlist_urls(playlist_url)
print_out("<dim>Fetching playlist...</dim>")
response = requests.get(playlist_uri)
response.raise_for_status()
playlist = m3u8.loads(response.text)
# Create a temp dir to store downloads if it doesn't exist
directory = '{}/twitch-dl/{}/{}'.format(tempfile.gettempdir(), video_id, quality)
pathlib.Path(directory).mkdir(parents=True, exist_ok=True)
print_out("Download dir: {}".format(directory))
base_uri = re.sub("/[^/]+$", "/", playlist_uri)
target_dir = _crete_temp_dir(base_uri)
vod_paths = _get_vod_paths(playlist, args.start, args.end)
print_out("Downloading VODs with {} workers...".format(max_workers))
paths = _download_files(base_url, directory, filenames, max_workers)
# Save playlists for debugging purposes
with open(path.join(target_dir, "playlists.m3u8"), "w") as f:
f.write(playlists_m3u8)
with open(path.join(target_dir, "playlist.m3u8"), "w") as f:
f.write(response.text)
print_out("\nDownloading {} VODs using {} workers to {}".format(
len(vod_paths), args.max_workers, target_dir))
path_map = download_files(base_uri, target_dir, vod_paths, args.max_workers)
# Make a modified playlist which references downloaded VODs
# Keep only the downloaded segments and skip the rest
org_segments = playlist.segments.copy()
playlist.segments.clear()
for segment in org_segments:
if segment.uri in path_map:
segment.uri = path_map[segment.uri]
playlist.segments.append(segment)
playlist_path = path.join(target_dir, "playlist_downloaded.m3u8")
playlist.dump(playlist_path)
print_out("\n\nJoining files...")
target = _video_target_filename(video, format)
_join_vods(directory, paths, target)
target = _video_target_filename(video, args.format)
_join_vods(playlist_path, target)
print_out("\nDeleting vods...")
for path in paths:
os.unlink(path)
if args.keep:
print_out("\n<dim>Temporary files not deleted: {}</dim>".format(target_dir))
else:
print_out("\n<dim>Deleting temporary files...</dim>")
shutil.rmtree(target_dir)
print_out("\nDownloaded: {}".format(target))
print_out("\nDownloaded: <green>{}</green>".format(target))
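
Note on partial downloads: `_get_vod_paths` keeps every segment that overlaps the requested `--start`/`--end` window, erring on the side of downloading slightly more. The sketch below shows the same overlap test on plain `(uri, duration)` tuples rather than on `m3u8` segment objects. `select_segments` is a hypothetical name used for illustration.

```python
def select_segments(segments, start=None, end=None):
    """Pick unique segment URIs overlapping the [start, end] window.

    segments: list of (uri, duration_seconds) tuples in playlist order.
    start, end: offsets in seconds, or None for an open-ended window.
    """
    selected = []
    segment_start = 0

    for uri, duration in segments:
        segment_end = segment_start + duration

        # A segment is kept if any part of it lies inside the window
        after_start = not start or segment_end > start
        before_end = not end or segment_start < end

        if after_start and before_end and uri not in selected:
            selected.append(uri)

        segment_start = segment_end

    return selected
```

With four 10-second segments and a window of 15 to 25 seconds, the second and third segments are kept, since both overlap the window even though neither lies fully inside it.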

View File

@ -1,17 +1,51 @@
# -*- coding: utf-8 -*-
from argparse import ArgumentParser
import sys
from argparse import ArgumentParser, ArgumentTypeError
from collections import namedtuple
from twitchdl.exceptions import ConsoleError
from twitchdl.output import print_err
from . import commands
from twitchdl.twitch import GQLError
from . import commands, __version__
Command = namedtuple("Command", ["name", "description", "arguments"])
CLIENT_WEBSITE = 'https://github.com/ihabunek/twitch-dl'
def time(value):
"""Parse a time string (hh:mm or hh:mm:ss) to number of seconds."""
parts = [int(p) for p in value.split(":")]
if not 2 <= len(parts) <= 3:
raise ArgumentTypeError()
hours = parts[0]
minutes = parts[1]
seconds = parts[2] if len(parts) > 2 else 0
if hours < 0 or not (0 <= minutes <= 59) or not (0 <= seconds <= 59):
raise ArgumentTypeError()
return hours * 3600 + minutes * 60 + seconds
def limit(value):
"""Validates the number of videos to fetch."""
try:
value = int(value)
except ValueError:
raise ArgumentTypeError("must be an integer")
if not 1 <= int(value) <= 100:
raise ArgumentTypeError("must be between 1 and 100")
return value
COMMANDS = [
Command(
name="videos",
@ -21,21 +55,69 @@ COMMANDS = [
"help": "channel name",
"type": str,
}),
(["-g", "--game"], {
"help": "Show videos of given game (can be given multiple times)",
"action": "append",
"type": str,
}),
(["-l", "--limit"], {
"help": "Number of videos to fetch (default 10, max 100)",
"type": limit,
"default": 10,
}),
(["-s", "--sort"], {
"help": "Sorting order of videos. (default: time)",
"type": str,
"choices": ["views", "time"],
"default": "time",
}),
(["-t", "--type"], {
"help": "Broadcast type. (default: archive)",
"type": str,
"choices": ["archive", "highlight", "upload"],
"default": "archive",
}),
],
),
Command(
name="download",
description="Download a video",
arguments=[
(["video_id"], {
"help": "video ID",
(["video"], {
"help": "video ID, clip slug, or URL",
"type": str,
}),
(["-w", "--max_workers"], {
"help": "maximal number of threads for downloading vods concurrently (default 5)",
(["-w", "--max-workers"], {
"help": "maximal number of threads for downloading vods "
"concurrently (default 20)",
"type": int,
"default": 20,
}),
(["-s", "--start"], {
"help": "Download video from this time (hh:mm or hh:mm:ss)",
"type": time,
"default": None,
}),
(["-e", "--end"], {
"help": "Download video up to this time (hh:mm or hh:mm:ss)",
"type": time,
"default": None,
}),
(["-f", "--format"], {
"help": "Video format to convert into, passed to ffmpeg as the "
"target file extension (default: mkv)",
"type": str,
"default": "mkv",
}),
(["-k", "--keep"], {
"help": "Don't delete downloaded VODs and playlists after merging.",
"action": "store_true",
"default": False,
}),
(["-q", "--quality"], {
"help": "Video quality.",
"type": str,
}),
],
),
]
@ -58,6 +140,8 @@ def get_parser():
description = "A script for downloading videos from Twitch"
parser = ArgumentParser(prog='twitch-dl', description=description, epilog=CLIENT_WEBSITE)
parser.add_argument("--version", help="show version number", action='store_true')
subparsers = parser.add_subparsers(title="commands")
for command in COMMANDS:
@ -76,11 +160,21 @@ def main():
parser = get_parser()
args = parser.parse_args()
if args.version:
print("twitch-dl v{}".format(__version__))
return
if "func" not in args:
parser.print_help()
return
try:
args.func(**args.__dict__)
args.func(args)
except ConsoleError as e:
print_err(e)
sys.exit(1)
except GQLError as e:
print_err(e)
for err in e.errors:
print_err("*", err["message"])
sys.exit(1)
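
As a reading aid for the flattened hunk above, the following is a minimal sketch of the `time` type function with its indentation restored and wired into an `ArgumentParser`. The name `parse_time`, the `demo` program name and the error messages are assumptions made for this sketch (the diff raises `ArgumentTypeError()` without a message); it is not the project's exact code.

```python
from argparse import ArgumentParser, ArgumentTypeError


def parse_time(value):
    """Parse "hh:mm" or "hh:mm:ss" into a number of seconds."""
    try:
        parts = [int(p) for p in value.split(":")]
    except ValueError:
        # Message is an assumption of this sketch, not taken from the diff.
        raise ArgumentTypeError("expected hh:mm or hh:mm:ss")

    if not 2 <= len(parts) <= 3:
        raise ArgumentTypeError("expected hh:mm or hh:mm:ss")

    hours, minutes = parts[0], parts[1]
    seconds = parts[2] if len(parts) > 2 else 0

    if hours < 0 or not 0 <= minutes <= 59 or not 0 <= seconds <= 59:
        raise ArgumentTypeError("time component out of range")

    return hours * 3600 + minutes * 60 + seconds


# argparse calls the type function on the raw string and stores the result.
parser = ArgumentParser(prog="demo")
parser.add_argument("-s", "--start", type=parse_time, default=None)
parser.add_argument("-e", "--end", type=parse_time, default=None)

args = parser.parse_args(["--start", "00:01:30", "--end", "01:30"])
print(args.start, args.end)
```

Note that a two-part value is read as hours and minutes, so `01:30` means ninety minutes, not ninety seconds.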

@ -1,11 +1,18 @@
import os
import requests
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import datetime
from functools import partial
from requests.exceptions import RequestException
from twitchdl.output import print_out
from twitchdl.utils import format_size, format_duration
CHUNK_SIZE = 1024
CONNECT_TIMEOUT = 5
RETRY_COUNT = 5
class DownloadFailed(Exception):
@ -25,14 +32,61 @@ def _download(url, path):
return size
def download_file(url, path, retries=3):
def download_file(url, path, retries=RETRY_COUNT):
if os.path.exists(path):
return 0
return os.path.getsize(path)
for _ in range(retries):
try:
return _download(url, path)
except RequestException as e:
print("Download failed: {}".format(e))
except RequestException:
pass
raise DownloadFailed(":(")
def _print_progress(futures):
downloaded_count = 0
downloaded_size = 0
max_msg_size = 0
start_time = datetime.now()
total_count = len(futures)
for future in as_completed(futures):
size = future.result()
downloaded_count += 1
downloaded_size += size
percentage = 100 * downloaded_count // total_count
est_total_size = int(total_count * downloaded_size / downloaded_count)
duration = (datetime.now() - start_time).seconds
speed = downloaded_size // duration if duration else 0
remaining = (total_count - downloaded_count) * duration / downloaded_count
msg = " ".join([
"Downloaded VOD {}/{}".format(downloaded_count, total_count),
"({}%)".format(percentage),
"<cyan>{}</cyan>".format(format_size(downloaded_size)),
"of <cyan>~{}</cyan>".format(format_size(est_total_size)),
"at <cyan>{}/s</cyan>".format(format_size(speed)) if speed > 0 else "",
"remaining <cyan>~{}</cyan>".format(format_duration(remaining)) if speed > 0 else "",
])
max_msg_size = max(len(msg), max_msg_size)
print_out("\r" + msg.ljust(max_msg_size), end="")
def download_files(base_url, target_dir, vod_paths, max_workers):
"""
Downloads a list of VODs defined by a common `base_url` and a list of
`vod_paths`, returning a dict which maps the paths to the downloaded files.
"""
urls = [base_url + path for path in vod_paths]
targets = [os.path.join(target_dir, "{:05d}.ts".format(k)) for k, _ in enumerate(vod_paths)]
partials = (partial(download_file, url, path) for url, path in zip(urls, targets))
with ThreadPoolExecutor(max_workers=max_workers) as executor:
futures = [executor.submit(fn) for fn in partials]
_print_progress(futures)
return OrderedDict(zip(vod_paths, targets))
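
The pattern used by `download_files` and `_print_progress` above (submit one job per segment, then consume results with `as_completed`) can be sketched without any network access. `fake_download` and `download_all` are hypothetical names introduced here; `fake_download` merely returns a pretend size so the example runs offline.

```python
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor, as_completed
from functools import partial


def fake_download(url, path):
    """Stand-in for download_file: returns a pretend size in bytes."""
    return len(url)


def download_all(base_url, vod_paths, max_workers=4, download=fake_download):
    urls = [base_url + p for p in vod_paths]
    # Segments are stored under zero-padded sequential names, as in the diff.
    targets = ["{:05d}.ts".format(k) for k, _ in enumerate(vod_paths)]
    jobs = [partial(download, url, target) for url, target in zip(urls, targets)]

    total_size = 0
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(job) for job in jobs]
        # as_completed yields futures in completion order, not submit order.
        for done, future in enumerate(as_completed(futures), start=1):
            total_size += future.result()  # re-raises any worker exception
            print("Downloaded {}/{}".format(done, len(futures)))

    # Map each playlist path to the local file it was saved as.
    return OrderedDict(zip(vod_paths, targets)), total_size


paths = ["0.ts", "1.ts", "2.ts"]
mapping, size = download_all("https://example.invalid/", paths)
print(list(mapping.items()), size)
```

Because `future.result()` re-raises exceptions from the worker, a single failed segment aborts the progress loop, which matches how `DownloadFailed` would surface in the code above.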

@ -3,14 +3,20 @@
import sys
import re
from twitchdl import utils
START_CODES = {
'bold': '\033[1m',
'red': '\033[31m',
'green': '\033[32m',
'yellow': '\033[33m',
'blue': '\033[34m',
'magenta': '\033[35m',
'cyan': '\033[36m',
'b': '\033[1m',
'dim': '\033[2m',
'i': '\033[3m',
'u': '\033[4m',
'red': '\033[91m',
'green': '\033[92m',
'yellow': '\033[93m',
'blue': '\033[94m',
'magenta': '\033[95m',
'cyan': '\033[96m',
}
END_CODE = '\033[0m'
@ -49,3 +55,22 @@ def print_err(*args, **kwargs):
args = ["<red>{}</red>".format(a) for a in args]
args = [colorize(a) if USE_ANSI_COLOR else strip_tags(a) for a in args]
print(*args, file=sys.stderr, **kwargs)
def print_video(video):
published_at = video["publishedAt"].replace("T", " @ ").replace("Z", "")
length = utils.format_duration(video["lengthSeconds"])
channel = video["creator"]["channel"]["displayName"]
playing = (
" playing <blue>{}</blue>".format(video["game"]["name"])
if video["game"] else ""
)
# Can't find URL in video object, strange
url = "https://www.twitch.tv/videos/{}".format(video["id"])
print_out("\n<b>{}</b>".format(video["id"]))
print_out("<green>{}</green>".format(video["title"]))
print_out("<blue>{}</blue> {}".format(channel, playing))
print_out("Published <blue>{}</blue> Length: <blue>{}</blue> ".format(published_at, length))
print_out("<i>{}</i>".format(url))
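
The bodies of `colorize` and `strip_tags` are not part of this hunk. As an illustration only, here is one way such tag-to-ANSI substitution could work with the `START_CODES` table from the new side of the diff; the regular expressions and both function bodies are assumptions of this sketch, not the project's implementation.

```python
import re

START_CODES = {
    'b': '\033[1m',
    'dim': '\033[2m',
    'i': '\033[3m',
    'u': '\033[4m',
    'red': '\033[91m',
    'green': '\033[92m',
    'yellow': '\033[93m',
    'blue': '\033[94m',
    'magenta': '\033[95m',
    'cyan': '\033[96m',
}
END_CODE = '\033[0m'

# Only tags named in START_CODES are recognised; other angle-bracket text is kept.
_NAMES = "|".join(re.escape(name) for name in START_CODES)
_OPEN = re.compile(r"<({})>".format(_NAMES))
_CLOSE = re.compile(r"</({})>".format(_NAMES))


def colorize(text):
    """Replace <tag> with its ANSI start code and </tag> with the reset code."""
    text = _OPEN.sub(lambda m: START_CODES[m.group(1)], text)
    return _CLOSE.sub(END_CODE, text)


def strip_tags(text):
    """Drop known tags, leaving plain text for terminals without colour."""
    return _CLOSE.sub("", _OPEN.sub("", text))


print(repr(colorize("<green>ok</green>")))
print(strip_tags("<b>123</b> <i>url</i>"))
```

Since every closing tag emits a full reset, nested tags lose the outer style after the inner tag closes in this sketch.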

@ -1,28 +0,0 @@
import re
from collections import OrderedDict
def parse_playlists(data):
media_pattern = re.compile(r'^#EXT-X-MEDIA:TYPE=VIDEO,GROUP-ID="(?P<group>\w+)",NAME="(?P<name>\w+)"')
playlists = OrderedDict()
n = 1
name = None
for line in data.split():
match = re.match(media_pattern, line)
if match:
name = match.group('name')
elif line.startswith('http'):
playlists[n] = (name, line)
n += 1
return playlists
def parse_playlist(url, data):
base_url = re.sub("/[^/]+$", "/{}", url)
filenames = [line for line in data.split() if re.match(r"\d+\.ts", line)]
return base_url, filenames

@ -1,46 +1,182 @@
"""
Twitch API access.
"""
import requests
from twitchdl import CLIENT_ID
from twitchdl.parse import parse_playlists, parse_playlist
from twitchdl.exceptions import ConsoleError
def authenticated_get(url, params={}):
headers = {'Client-ID': CLIENT_ID}
class GQLError(Exception):
def __init__(self, errors):
super().__init__("GraphQL query failed")
self.errors = errors
def authenticated_get(url, params={}, headers={}):
headers['Client-ID'] = CLIENT_ID
response = requests.get(url, params, headers=headers)
if 400 <= response.status_code < 500:
data = response.json()
# TODO: this does not look nice in the console since data["message"]
# can contain a JSON encoded object.
raise ConsoleError(data["message"])
response.raise_for_status()
return response
def authenticated_post(url, data=None, json=None, headers={}):
headers['Client-ID'] = CLIENT_ID
response = requests.post(url, data=data, json=json, headers=headers)
if response.status_code == 400:
data = response.json()
raise ConsoleError(data["message"])
response.raise_for_status()
return response
def kraken_get(url, params={}, headers={}):
"""
Add accept header required by kraken API v5.
see: https://discuss.dev.twitch.tv/t/change-in-access-to-deprecated-kraken-twitch-apis/22241
"""
headers["Accept"] = "application/vnd.twitchtv.v5+json"
return authenticated_get(url, params, headers)
def gql_query(query):
url = "https://gql.twitch.tv/gql"
response = authenticated_post(url, json={"query": query}).json()
if "errors" in response:
raise GQLError(response["errors"])
return response
def get_video(video_id):
"""
https://dev.twitch.tv/docs/v5/reference/videos#get-video
"""
url = "https://api.twitch.tv/kraken/videos/%d" % video_id
url = "https://api.twitch.tv/kraken/videos/{}".format(video_id)
return authenticated_get(url).json()
return kraken_get(url).json()
def get_channel_videos(channel_name, limit=20):
def get_clip(slug):
query = """
{{
clip(slug: "{}") {{
title
durationSeconds
game {{
name
}}
broadcaster {{
login
displayName
}}
videoQualities {{
frameRate
quality
sourceURL
}}
}}
}}
"""
https://dev.twitch.tv/docs/v5/reference/channels#get-channel-videos
"""
url = "https://api.twitch.tv/kraken/channels/%s/videos" % channel_name
return authenticated_get(url, {
"broadcast_type": "archive",
response = gql_query(query.format(slug))
return response["data"]["clip"]
def get_channel_videos(channel_id, limit, sort, type="archive", game_ids=[], after=None):
query = """
{{
user(login: "{channel_id}") {{
videos(
first: {limit},
type: {type},
sort: {sort},
after: "{after}",
options: {{
gameIDs: {game_ids}
}}
) {{
totalCount
pageInfo {{
hasNextPage
}}
edges {{
cursor
node {{
id
title
publishedAt
broadcastType
lengthSeconds
game {{
name
}}
creator {{
channel {{
displayName
}}
}}
}}
}}
}}
}}
}}
"""
query = query.format(**{
"channel_id": channel_id,
"game_ids": game_ids,
"after": after,
"limit": limit,
}).json()
"sort": sort.upper(),
"type": type.upper(),
})
response = gql_query(query)
return response["data"]["user"]["videos"]
def channel_videos_generator(channel_id, limit, sort, type, game_ids=None):
cursor = None
while True:
videos = get_channel_videos(
channel_id, limit, sort, type, game_ids=game_ids, after=cursor)
if not videos["edges"]:
break
has_next = videos["pageInfo"]["hasNextPage"]
cursor = videos["edges"][-1]["cursor"] if has_next else None
yield videos, has_next
if not cursor:
break
def get_access_token(video_id):
url = "https://api.twitch.tv/api/vods/%d/access_token" % video_id
url = "https://api.twitch.tv/api/vods/{}/access_token".format(video_id)
return authenticated_get(url).json()
def get_playlists(video_id, access_token):
"""
For a given video return a playlist which contains possible video qualities.
"""
url = "http://usher.twitch.tv/vod/{}".format(video_id)
response = requests.get(url, params={
@ -50,16 +186,19 @@ def get_playlists(video_id, access_token):
"player": "twitchweb",
})
response.raise_for_status()
data = response.content.decode('utf-8')
return parse_playlists(data)
return response.content.decode('utf-8')
def get_playlist_urls(url):
response = requests.get(url)
response.raise_for_status()
def get_game_id(name):
query = """
{{
game(name: "{}") {{
id
}}
}}
"""
data = response.content.decode('utf-8')
return parse_playlist(url, data)
response = gql_query(query.format(name.strip()))
game = response["data"]["game"]
if game:
return game["id"]
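
The cursor loop in `channel_videos_generator` above can be exercised against an in-memory stand-in for the API. In this sketch `paginate`, `fake_fetch` and `ITEMS` are hypothetical names; only the response shape (`edges`, `cursor`, `pageInfo.hasNextPage`) is taken from the query in the diff.

```python
def paginate(fetch_page):
    """Yield pages until the server reports that no next page exists.

    `fetch_page(after)` is a hypothetical callable returning a dict shaped
    like the `videos` object: {"edges": [...], "pageInfo": {...}}.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        if not page["edges"]:
            break
        yield page
        if not page["pageInfo"]["hasNextPage"]:
            break
        # Resume after the last item of the page just received.
        cursor = page["edges"][-1]["cursor"]


ITEMS = ["a", "b", "c", "d", "e"]


def fake_fetch(after, page_size=2):
    """In-memory stand-in for the API; the cursor is the item's index."""
    start = 0 if after is None else int(after) + 1
    chunk = ITEMS[start:start + page_size]
    edges = [{"cursor": str(start + i), "node": item} for i, item in enumerate(chunk)]
    has_next = start + page_size < len(ITEMS)
    return {"edges": edges, "pageInfo": {"hasNextPage": has_next}}


nodes = [edge["node"] for page in paginate(fake_fetch) for edge in page["edges"]]
print(nodes)
```

Writing the loop as a generator lets the caller stop early, for example after prompting the user between pages, without fetching pages that are never shown.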

@ -2,10 +2,62 @@ import re
import unicodedata
def _format_size(value, digits, unit):
if digits > 0:
return "{{:.{}f}}{}".format(digits, unit).format(value)
else:
return "{{:d}}{}".format(unit).format(value)
def format_size(bytes_, digits=1):
if bytes_ < 1024:
return _format_size(bytes_, digits, "B")
kilo = bytes_ / 1024
if kilo < 1024:
return _format_size(kilo, digits, "kB")
mega = kilo / 1024
if mega < 1024:
return _format_size(mega, digits, "MB")
return _format_size(mega / 1024, digits, "GB")
def format_duration(total_seconds):
total_seconds = int(total_seconds)
hours = total_seconds // 3600
remainder = total_seconds % 3600
minutes = remainder // 60
seconds = total_seconds % 60
if hours:
return "{} h {} min".format(hours, minutes)
if minutes:
return "{} min {} sec".format(minutes, seconds)
return "{} sec".format(seconds)
def read_int(msg, min, max, default):
msg = msg + " [default {}]: ".format(default)
while True:
try:
val = input(msg)
if not val:
return default
if min <= int(val) <= max:
return int(val)
except ValueError:
pass
def slugify(value):
re_pattern = re.compile(r'[^\w\s-]', flags=re.U)
re_spaces = re.compile(r'[-\s]+', flags=re.U)
value = str(value)
value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore').decode('ascii')
value = unicodedata.normalize('NFKC', value)
value = re_pattern.sub('', value).strip().lower()
return re_spaces.sub('-', value)
return re_spaces.sub('_', value)
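
To show what the formatting helpers added above produce, here is an indentation-restored sketch with a few sample calls. The `int(value)` conversion in the `digits == 0` branch is an addition of this sketch: the `{:d}` format code rejects float values, so the branch as written in the diff would raise `ValueError` for sizes of 1024 bytes or more.

```python
def _format_size(value, digits, unit):
    if digits > 0:
        # Builds a template such as "{:.1f}kB", then applies it to the value.
        return "{{:.{}f}}{}".format(digits, unit).format(value)
    # int() is an addition in this sketch; "{:d}" cannot format floats.
    return "{{:d}}{}".format(unit).format(int(value))


def format_size(bytes_, digits=1):
    if bytes_ < 1024:
        return _format_size(bytes_, digits, "B")
    kilo = bytes_ / 1024
    if kilo < 1024:
        return _format_size(kilo, digits, "kB")
    mega = kilo / 1024
    if mega < 1024:
        return _format_size(mega, digits, "MB")
    return _format_size(mega / 1024, digits, "GB")


def format_duration(total_seconds):
    total_seconds = int(total_seconds)
    hours = total_seconds // 3600
    minutes = (total_seconds % 3600) // 60
    seconds = total_seconds % 60
    if hours:
        return "{} h {} min".format(hours, minutes)
    if minutes:
        return "{} min {} sec".format(minutes, seconds)
    return "{} sec".format(seconds)


print(format_size(512))               # 512.0B
print(format_size(1536))              # 1.5kB
print(format_size(5 * 1024 * 1024))   # 5.0MB
print(format_size(1536, digits=0))    # 1kB
print(format_duration(3725))          # 1 h 2 min
print(format_duration(125))           # 2 min 5 sec
```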