veloren

mirror of https://gitlab.com/veloren/veloren.git synced 2024-08-30 18:12:32 +00:00

Author	SHA1	Message	Date
tygyh	5e5698249b	Remove unnecessarily qualified paths	2022-07-15 14:49:46 +02:00
Marcel Märtens	50d85940d8	implement an event channel that posts regular information on events for Participants	2022-07-03 21:21:59 +02:00
Marcel Märtens	b5d0ee22e4	deactivate some features again and only keep the internal code for now to reuse it in automatic reconnect code	2022-06-30 22:14:24 +02:00
Marcel Märtens	f3e4f022cb	Rather than letting the api/Participant handle the cleanup, we had to move it over to the bParticipant::shutdownmanager, because the old approach didn't account for the Scheduler being dropped while the Participant was kept around. Shutdown behavior is quite easy now: bParticipant sends a oneshot, and when that hits we stop. Also, when the Participant gets dropped we still stop, as there would be no sense in continuing to run the report_mgr ...	2022-06-30 22:14:24 +02:00
Marcel Märtens	5b63035506	Add a new/unstable functionality, report_channel. This will ask the bparticipant for a list of all channels and their respective connection arguments. With that one could probably reach the remote side. The data is gathered by the scheduler (or the channel for the listener code). It requires some read logs, so we shouldn't abuse that function call. In bparticipant we have a new manager that also properly shuts down, as the Participant holds the sender to the respective receiver. The sender is always dropped: inside the Mutex via disconnect and outside via Drop (we need 2 Options, as otherwise we would create a runtime inside async context implicitly o.O). (Also, I didn't like the alternative of just overwriting the sender with a fake one; I want a proper Option that can be taken.) The code might also come in handy in the future when we implement an auto-reconnect feature in the bparticipant.	2022-06-30 22:14:24 +02:00
Marcel Märtens	e194a2e334	export underlying errors via network crate, to generate more detailed logs	2022-05-23 02:00:56 +02:00
Avi Weinstock	5f8957d8ef	Globally allow the clippy lints `{new_without_default, many_single_char_names, identity_op, type_complexity, too_many_arguments}`.	2022-01-30 20:16:20 +01:00
Jonathan Berglin	596307c9b7	Remove unused clippy suppressions	2021-12-05 17:59:02 +00:00
Marcel Märtens	2a82405df2	update toolchain to `nightly-2021-09-24`	2021-09-24 23:18:07 +02:00
Marcel Märtens	9b3b21f368	fix clippy warnings	2021-07-12 12:09:09 +02:00
Marcel Märtens	cf3188b412	remove Protocol from Quic, cleanup code, fix some log spam	2021-05-21 10:41:19 +02:00
Joshua Yanovski	e7587c4d9d	Added non-admin moderators and timed bans. The security model has been updated to reflect this change (for example, moderators cannot revert a ban by an administrator). Ban history is also now recorded in the ban file, and much more information about the ban is stored (whitelists and administrators also have extra information). To support the new information without losing important information, this commit also introduces a new migration path for editable settings (both from legacy to the new format, and between versions). Examples of how to do this correctly, and migrate to new versions of a settings file, are in the settings/ subdirectory. As part of this effort, editable settings have been revamped to guarantee atomic saves (due to the increased amount of information in each file), some latent bugs in networking were fixed, and server-cli has been updated to go through StructOpt for both calls through TUI and argv, greatly simplifying parsing logic.	2021-05-09 21:19:16 -07:00
Marcel Märtens	68d326c817	revert Client drop to be correct again and also stop network properly, reduce timeout to 10s	2021-05-04 22:34:19 +02:00
Marcel Märtens	df7b65289d	fix error handling in networking and switch to hashbrown, fixing #1118	2021-05-04 15:29:42 +02:00
Marcel Märtens	653fb065e0	extract protocol specific listen code from scheduler and move it to channel.rs	2021-04-29 17:51:52 +02:00
Marcel Märtens	383482a36e	Quic: We had the following problem: - locally we open a stream, our local Drain is sending OpenStream - remote Sink will know this and notify remote Drain - remote side sends a message - local sink does not know about the Stream, as there is no (and CAN'T be a) way to notify local Sink from local Drain (it could introduce race conditions). One of the possible solutions was that the remote drain will copy the OpenStream Msg ON the Quic::stream before the first data is sent. This would work but is complicated. Instead we now just mark such streams as "potentially open" and we listen for the first DataHeader to get its SID. add support for unreliable messages in quic protocol, benchmarks	2021-04-29 15:58:23 +02:00
Marcel Märtens	ea5a02d7cd	change some Ordering::Relaxed to Ordering::SeqCst when we do not want to have it moved/or taken effects from other threads. some id increases are kept Relaxed, SeqCst shouldn't be necessary there. Not sure about the bool checks in api.rs	2021-04-07 23:17:09 +02:00
Marcel Märtens	7ca2f3b9d6	make a panic a error! and improve logging	2021-04-03 19:58:36 +02:00
Marcel Märtens	aea52d8b54	implement Upload Bandwidth prediction. It's available to `api` and `metrics` and can be used to slow down msg send in veloren. It uses a tokio::watch for now, as I plan to have a watch job in the scheduler that recalculates prio on change. Also cleaning up participant metrics after a disconnect	2021-03-26 08:58:03 +01:00
Marcel Märtens	034d0f0c5d	fix a bug that some closes could get lost	2021-03-26 08:57:56 +01:00
Marcel Märtens	8dccc21125	preparation for multiple-channel participants. When a stream is opened we search for the best (currently) available channel. The stream will then be kept on that channel. Adjusted the rest of the algorithms so that they now respect this rule. improved a HashMap for Pids to be based on a Vec. Also using this for the Sid -> Cid relation, which is more performance critical. WARN: our current send()? error handling allows some close_stream messages to get lost.	2021-03-26 08:57:50 +01:00
Marcel Märtens	01c82b70ab	network scheduler and rawmsg cleanup	2021-03-26 08:57:42 +01:00
Marcel Märtens	9084ac48f1	defer some trace, so that we won't spam the log.	2021-03-22 09:16:07 +01:00
Marcel Märtens	514d5db038	Update Network Protocol - now last digit version is compatible 0.6.0 will connect to 0.6.1 - the TCP DATA Frames no longer contain START field, as it's not needed - the TCP OPENSTREAM Frames will now contain the BANDWIDTH field - MID is not Protocol internal Update network - update API with Bandwidth Update veloren - introduce better runtime and `async` things that are IO bound. - Remove `uvth` and instead use `tokio::runtime::Runtime::spawn_blocking` - remove futures_execute from client and server use tokio::runtime::Runtime instead - give threads a Name	2021-02-22 17:34:55 +01:00
Marcel Märtens	5a48bffcb0	fix main thread blocking which was a bad combination of - a channel was stale and wasn't shut down properly AS WELL AS - the msg ingoing pipe was bounded, so it could fill up To mitigate this we a) unbounded the pipe b) stopped spamming the log in the no channel case c) instead of ever reaching the "no channel" state we actually shut down the participant d) when send_mgr is closed it will no longer be able to SEND on streams	2021-02-18 20:00:07 +01:00
Marcel Märtens	03af9937cf	Stabilize Network again: - completely switch to Bytes, even in api. speed up TCP by a factor of 2 - improve benchmarks - speed up mpsc metrics - gracefully handle shutdown by interpreting Ok(0) as tokio::tcpstream closed now. - fix hotloop in participants by adding `Some(n)` to fix endless handing. - fix closing bug by closing streams after `recv_mgr` is shutdown even if now shutdown is triggered locally. - fix prometheus - no longer throw when a `Stream` is dropped while participant still receives a msg for it. - fix the bandwidth handling, TCP network send speed is up to 1.5GiB/s while recv is 150MiB/s - add documentation - tmp require rt-multi-threaded in client for tokio, to not fail cargo check this is prob stable, i tested over 1 hour. after that some optimisations in priomgr. and impl. proper bandwidth. Speed is up to 2GB/s write and 150MB/s recv on a single core sync add documentation	2021-02-17 19:37:48 +01:00
Marcel Märtens	ea8ab1ce7a	Great improvements to the codebase: - better logging in network - we now notify the send of what happened in recv in participant. - works with veloren master servers - works in singleplayer, using a actual mid. - add `mpsc` in whole stack incl tests - speed up internal read/write with `Bytes` crate - use `prometheus-hyper` for metrics - use a metrics cache	2021-02-17 16:15:00 +01:00
Marcel Märtens	9884019963	COMPLETE REDESIGN of network crate - Implementing an async non-io protocol crate a) no tokio / no channels b) I/O is based on abstraction Sink/Drain c) different Protocols can have a different Drain Type This allows MPSC to send its content without splitting up messages at all! It allows UDP to have internal extra frames to care for security It allows better abstraction for tests Allows benchmarks on the mpsc variant Custom Handshakes to allow sth like Quic protocol easily - reduce the participant managers to 4: channel creations, send, recv and shutdown. keeping the `mut data` in one manager removes the need for all RwLocks. reducing complexity and parallel access problems - more strategic participant shutdown. first send. then wait for remote side to notice recv stop, then remote side will stop send, then local side can stop recv. - metrics are internally abstracted to fit protocol and network layer - in this commit network/protocol tests work and network tests work someway, veloren compiles but does not work - handshake compatible to async_std	2021-02-17 12:39:47 +01:00
Marcel Märtens	3f85506761	fix most unittests (not all) by a) dropping network/participant BEFORE runtime and by transferring an expect into a warn! in the protocol	2021-02-17 12:38:58 +01:00
Marcel Märtens	5aa1940ef8	get rid of `async_std::channel`, switch to `tokio` and the `async_channel` crate. I wanted to do tokio first, but it doesn't feature Sender::close(), thus I included async_channel. Got rid of `futures` and only need `futures_core` and `futures_util`. Tokio does not support `Stream` and `StreamExt`, so for now I need to use `tokio-stream`; I think this will go in `std` in the future. Created `b2b_close_stream_opened_sender_r`, as the shutdown procedure does not need a copy of a Sender, it just needs to stop it. Various adjustments, e.g. for `select!`, which now requires a `&mut` for oneshots. Future things to do: - Use some better signalling than oneshot<()> in some cases. - Use a Watch for the Prio propagation (impl. it ofc) - Use Bounded Channels in order to improve performance - adjust tests coding bring tests to work	2021-02-17 12:38:53 +01:00
Marcel Märtens	1b77b6dc41	Initial switch to tokio for network, minimum working example.	2021-02-17 12:37:59 +01:00
Marcel Märtens	7a7c1f6f50	I would expect this to be implicitly done by the `drop`, though it doesn't hurt here, as this channel is dropped anyway a line later. But I have the feeling that maybe something with the channel is wrong which leads to this behavior (or maybe I made a copy somewhere, though I doubt this). Again, not sure if this is a fix, but I think it doesn't hurt	2020-11-27 10:47:01 +01:00
Marcel Märtens	ba1299e670	apparently span doesn't work for async, so I replaced it by an instrument version	2020-10-14 17:54:01 +02:00
Marcel Märtens	e914c29728	FIX for hanging participant deletion. There is a rare bug that recently got triggered more often with the release of xMAC94x/netfixA. If the bug triggers, a Participant never gets cleaned up gracefully. Reason: When `participant_shutdown_mgr` was called it stopped all managers at once, especially stream_close_mgr and send_mgr. The problem with stream_close_mgr is that it's responsible for gracefully flushing streams when the Participant is dropped locally. So when it was interrupted, self.streams were no longer flushed gracefully. The next problem was with send_mgr. It triggers the PrioManager, and the PrioManager is responsible for notifying once a stream is completely flushed. This led to the problem that a stream flush could be requested but was never actually executed (as send_mgr was already down). Solution: 1. when stream_close_mgr is stopped it MUST flush all remaining streams 2. wait for stream_close_mgr to finish before shutting down the send_mgr 3. no longer delete streams when closing the API (this also wasn't tracked in metrics so far) Additionally I added a dependency, so that the network/examples compile again, and fixed some spelling. I created a `delete_stream` fn that basically just moved the code over.	2020-10-14 15:03:49 +02:00
Marcel Märtens	24af657fd5	quickfix for closing participants more reliable	2020-10-13 20:06:20 +02:00
Ben Wallis	b3dd8e8a02	Added #![deny(clippy::clone_on_ref_ptr)] to all crates and fixed resulting lint errors	2020-09-27 17:25:33 +01:00
Marcel Märtens	a7b7ae3a2c	fix compiling with metrics	2020-08-27 09:35:06 +02:00
Imbris	dcce5641f7	Fix broken features and avoid panic if the client leaves before character data loads	2020-08-26 20:47:39 -04:00
Marcel Märtens	9170622611	reduce load on metrics by A LOT! - first remove participant AND channel in same metric. this caused a matrix full of 0 values which bloated a lot. - then made the cid cache lazy loading to no longer generate that many 0 values - it would also be possible to no longer keep metrics for INIT, HANDSHAKE and PARTICIPANTID	2020-08-27 01:55:13 +02:00
notoria	2be4202d01	Corrected some spelling errors	2020-08-25 12:21:25 +00:00
Marcel Märtens	5fe7c05d9c	Redefine Close behavior: - When Participant A was closed by the remote side, then a `disconnect` on `A` shall return Ok() (instead of ParticipantDisconected) IF: A was already flushed and no data needs to be sent any more. So a `disconnect` doesn't differ whether or not the other side initiated the disconnect before. It tries to clean things up and returns Ok(()) if both sides agree	2020-08-24 16:22:12 +02:00
Marcel Märtens	91d296cda1	Fixed bug in tcp protocol.rs - It was possible for an end_receiver to be triggered at the moment when a frame was started but not finished. This removed bytes from the stream, which were then lost. This was almost always followed by a RAW frame, as the TCP stream was now invalid. The TCP stream was then detected by participant or caused one or multiple failures - introduces some simplifications, removed a macro, reuse code	2020-08-24 16:22:06 +02:00
Marcel Märtens	d37ca02913	using Locks a more sensitive way. - replace RwLock by Mutex if it's only accessed for insert/delete - use RwLock<HashMap<Mutex>> pattern otherwise in order to allow concurrent `.read()` - fixed a deadlock O.o	2020-08-23 21:43:17 +02:00
Marcel Märtens	1eb126736d	workaround for impossible RAW msg	2020-08-22 01:09:07 +02:00
Marcel Märtens	926d334082	Fixed the unclean disconnecting of participants. Till now, we just dropped the TCP connection and registered this as a clean shutdown. The protocol reader interpreted this and sent a Frame::Shutdown frame to its local processor. This is ofc wrong. So now the protocol reader will detect a Frame::Shutdown frame and send it over. If the TCP connection gets closed it will return an Error up. The processor will then pick up this error, request an unclear shutdown and notify the user. Also, when doing a clean shutdown we now send a Frame::Shutdown to the remote side to trigger this behavior. Before, we wrongly added the feature of only using a `select` in channel. This is WRONG, as it could mean that the write maybe fails, but the read still had some Frames buffered which then get dropped. It's fixed now by the clean shutdown mechanism defined before. Also, when a channel is closed inside a participant we now close the whole participant as a protection. However, we must not close the recv channel as the `handle_frames_mgr` might still be working on them, so we only stop writing/sending. Debugging this also let me introduce some smaller fixes: - PID in tests are now 0 and 1+164+164*64+... this makes the traces appear as AAAAAA and BBBBBB instead of ABAAAA and ACAAAA - veloren client now better separates between clean shutdown and unclear shutdown. - added a new type: C2pFrame for `(cid, Result<Frame, ()>)` - wrong frames inside the handshake are not counted in metrics	2020-08-21 18:00:28 +02:00
Marcel Märtens	42141b3aa3	remove some `trace!` in network which a) was only spam and b) could be replaced by a metric way better. added a span for disconnecting on the gameserver side. also added more debug! tracing there Just keeping a trace! all 10000ms active to have a keep alive feeling.	2020-08-21 18:00:14 +02:00
Marcel Märtens	12b46250f5	protocols no longer send a Close Frame in case the read fails. They just fail, let participant handle this! Participant will now handle a close in the `create_channel_mgr` rather than the `send` fn. It's the better place, which makes a HashMap better for delete lookup. Since tcp_read now broke but tcp_write didn't, and the Participant wasn't updated till both were broken, we changed CHANNEL tcp_read and tcp_write in protocols to be a `select` rather than a `join`. However, only do this in the CHANNEL, not in the HANDSHAKE; it fails if you try to. Also the handshake will take care of any failed read or write manually and will handle a clear teardown in this case.	2020-08-21 18:00:07 +02:00
Marcel Märtens	b59fc2ff0c	improve tracing and spans in network crate	2020-08-21 18:00:00 +02:00
Marcel Märtens	e618eeb386	Fix an issue that might occur when a participant is dropped while the remote wants to open a Stream and we hit a bad timing condition. increase the slowloris timeout. For some reason it seems to trigger a lot more often since commit `75c1d440`, but I have no idea why. My guess would be that the initial sync now sends way more data, which slows down TCP to be > 10ms and triggers it. Note: the fix might cause small lags when slow people try to connect to the server	2020-08-13 12:06:53 +02:00
Marcel Märtens	dd581bc6c0	Participant closure was immediate, even in the case where a new participant was connected, sent a MSG and was then dropped immediately. The remote side should see it connect, be open for 1 single stream and read the message before it's notified that the participant is actually closed. This caused the failure in one of our API tests (in lib, with client and server), where it was possible that all messages were sent and one side was dropped before the other side asked for the opened stream. Also introduce better error detection in participant (and scheduler) by removing the std_async::Result and introducing `Result<(),ParticipantError>` instead	2020-07-22 09:18:15 +02:00

73 Commits