veloren

mirror of https://gitlab.com/veloren/veloren.git synced 2024-08-30 18:12:32 +00:00

Author	SHA1	Message	Date
Marcel Märtens	258da1bedf	those sleeps cannot easily be included in the code, as they simulate 2 participants on 2 different computers. Increase the timeouts from 1000 -> 3000 ms if some internal messages are sent (e.g. network_a closed, send on stream_a). Increase the timeouts from 1000 -> 5000 ms if actual networking is involved (e.g. stream_a send, stream_b recv)	2021-11-19 09:36:39 +01:00
Marcel Märtens	760c382ed9	protocoladdr change for listen and connect (remove a loop in quic protocol which wasn't an actual loop)	2021-04-29 15:58:34 +02:00
Marcel Märtens	9028578bc8	Change the way Network is dropped. Instead of keeping the Runtime and manually spawning a task on `drop`, this task is spawned at start and waits to be triggered. The `drop` methods then wait for completion, UNLESS they are in an async context; then they MUST NOT BLOCK (deadlock potential), so they defer it to the Runtime and HOPE for the runtime to exist long enough. This gets rid of the weird `block_in_place`, which is only accessible with `rt-multi-threaded` and has some disadvantages. We also won't require the runtime to be active all the time, though it's needed for a clean shutdown	2021-03-03 11:28:40 +01:00
Marcel Märtens	514d5db038	Update Network Protocol - now the last digit of the version is compatible: 0.6.0 will connect to 0.6.1 - the TCP DATA Frames no longer contain the START field, as it's not needed - the TCP OPENSTREAM Frames will now contain the BANDWIDTH field - MID is not Protocol internal Update network - update API with Bandwidth Update veloren - introduce better runtime and `async` things that are IO bound. - Remove `uvth` and instead use `tokio::runtime::Runtime::spawn_blocking` - remove futures_execute from client and server, use tokio::runtime::Runtime instead - give threads a Name	2021-02-22 17:34:55 +01:00
Marcel Märtens	03af9937cf	Stabilize Network again: - completely switch to Bytes, even in api. speed up TCP by factor 2 - improve benchmarks - speed up mpsc metrics - gracefully handle shutdown by interpreting Ok(0) as tokio::tcpstream closed now. - fix hotloop in participants by adding `Some(n)` to fix endless handing. - fix closing bug by closing streams after `recv_mgr` is shutdown even if now shutdown is triggered locally. - fix prometheus - no longer throw when a `Stream` is dropped while participant still receives a msg for it. - fix the bandwidth handling, TCP network send speed is up to 1.5GiB/s while recv is 150MiB/s - add documentation - tmp require rt-multi-threaded in client for tokio, to not fail cargo check this is probably stable, I tested over 1 hour. after that some optimisations in priomgr. and impl. proper bandwidth. Speed is up to 2GB/s write and 150MB/s recv on a single core sync add documentation	2021-02-17 19:37:48 +01:00
Marcel Märtens	3f85506761	fix most unittests (not all) by a) dropping network/participant BEFORE runtime and by transferring an expect into a warn! in the protocol	2021-02-17 12:38:58 +01:00
Marcel Märtens	5aa1940ef8	get rid of `async_std::channel`, switch to `tokio` and the `async_channel` crate. I wanted to do tokio first, but it doesn't feature Sender::close(), thus I included async_channel. Got rid of `futures` and only need `futures_core` and `futures_util`. Tokio does not support `Stream` and `StreamExt` so for now I need to use `tokio-stream`, I think this will go in `std` in the future. Created `b2b_close_stream_opened_sender_r` as the shutdown procedure does not need a copy of a Sender, it just needs to stop it. Various adjustments, e.g. for `select!` which now requires a `&mut` for oneshots. Future things to do: - Use some better signalling than oneshot<()> in some cases. - Use a Watch for the Prio propagation (impl. it ofc) - Use Bounded Channels in order to improve performance - adjust tests coding bring tests to work	2021-02-17 12:38:53 +01:00
Marcel Märtens	b5f48014a9	Streams no longer panic when `recv` on a StreamClosed Stream. Panicking is a "feature" of `futures::channel`. Refactor the `send_raw` and `recv_raw` completely. We now expose `Message`, which has a public `serialize` and `deserialize` fn for the first time. This makes using the `raw` methods of a stream much easier and is a requirement for using "copy_less" sending to multiple streams	2020-10-19 10:23:30 +02:00
Marcel Märtens	144f88f811	Proper Compression support of network. - Compression is no longer always enabled but can now be enabled per Stream. If a Stream is Compression enabled it will compress and decompress all msg (except for `raw` access) before handling them internally. You need to handle compression yourself for `raw` fn. - added a new feature to the network crate to enable or disable the compression - switched to `lz-fear` instead of `lz4-compression` - use `bitflags` to represent the `Promises` struct	2020-08-25 23:55:27 +02:00
notoria	2be4202d01	Corrected some spelling errors	2020-08-25 12:21:25 +00:00
Marcel Märtens	5fe7c05d9c	Redefine Close behavior: - When Participant A was closed by the remote side, then a `disconnect` on `A` shall return Ok() (instead of ParticipantDisconnected) IF: A was already flushed and no data needs to be sent any more. so a `disconnect` doesn't differ if the other side initiated the disconnect before or not. it tries to clean things up and returns Ok(()) if both sides agree	2020-08-24 16:22:12 +02:00
Marcel Märtens	91d296cda1	Fixed bug in tcp protocol.rs - It was possible for an end_receiver to be triggered in the moment while a frame was started but not finished. This removed bytes from the stream with them getting lost. this was almost always followed by a RAW frame as the TCP stream was now invalid. The TCP stream was then detected by participant or caused one or multiple failures - introduces some simplifications, removed a macro, reuse code	2020-08-24 16:22:06 +02:00
Marcel Märtens	34a4c72a73	Fix scheduler not really shutting down when it was listening on a Port. Add a separate test for this. - 1000ms sleep isn't enough in tracing anyway, so remove it	2020-08-21 18:00:34 +02:00
Marcel Märtens	926d334082	Fixed the unclean disconnecting of participants. Till now, we just dropped the TCP connection and registered this as a clean shutdown. The protocol reader interpreted this and sent a Frame::Shutdown frame to its local processor. This is ofc wrong. So now the protocol reader will detect a Frame::Shutdown frame and send it over. if the Tcp connection gets closed it will return an Error up. The processor will then pick up this error, request an unclean shutdown and notify the user. Also when doing a clean shutdown we are sending a Frame::Shutdown now to the remote side to trigger this behavior. Before we wrongly added the feature of only using a `select` in channel. This is WRONG, as it could mean that the write maybe fails, but the read still had some Frames buffered which then get dropped. It's fixed now by the clean shutdown mechanism defined before. Also when a channel is closed now inside a participant we are closing the whole participant as a protection. However, we must not close the recv channel as the `handle_frames_mgr` might still be working on them, so we only stop writing/sending. Debugging this also let me introduce some smaller fixes: - PID in tests are now 0 and 1+164+164*64+... this makes the traces appear as AAAAAA and BBBBBB instead of ABAAAA and ACAAAA - veloren client now better separates between clean shutdown and unclean shutdown. - added a new type: C2pFrame for `(cid, Result<Frame, ()>)` - wrong frames inside the handshake are not counted in metrics	2020-08-21 18:00:28 +02:00
Marcel Märtens	12b46250f5	protocols no longer send a Close Frame in case the read fails. They just fail, let participant handle this! Participant will now handle a close in the `create_channel_mgr` rather than the `send` fn. It's the better place, which makes a HashMap better for delete lookup. Since tcp_read now broke but tcp_write didn't and the Participant wasn't updated till both were broken, we changed CHANNEL tcp_read and tcp_write in protocols to be a `select` rather than a `join`. However only do this in the CHANNEL, but not in the HANDSHAKE. it fails if you try to. Also the handshake will take care of any failed read or write manually and will handle a clear teardown in this case.	2020-08-21 18:00:07 +02:00
Marcel Märtens	dd581bc6c0	Participant closure was immediate, even in case a new participant was connected, sent a MSG and was then dropped immediately. The remote side should see it connect, be open for 1 single stream and read the message before it's notified that the participant is actually closed. This caused the failure in one of our API tests (in lib, with client and server), where it was possible that all messages were sent and one side was dropped before the other side asked for the opened stream. Also introduce better error detection in participant (and scheduler) by removing the std_async::Result and introducing `Result<(),ParticipantError>` instead	2020-07-22 09:18:15 +02:00
Marcel Märtens	6db9c6f91b	fix a followup bug, after a protocol fail now Participant is closed, including all streams, so we get the stream errors. We MUST handle them and we are not allowed to act on a stream after it failed. as I am too lazy to change the structure to ensure the client is immediately dropped, I added an AtomicBool to it.	2020-07-13 13:03:35 +02:00
Marcel Märtens	187ec42aa2	fix Participant shutdown - we had the problem that Participants couldn't shut down themselves, only by scheduler, which was controlled by api. it's needed e.g. to handle the Shutdown Frame - my initial solution did a full shutdown, which was a problem if in parallel a 2nd shutdown was requested, no possibility of getting the error - new solution will only deactivate Participant and Stream. and then still functions correctly, till the api closes the participant and calls the scheduler which then calls the bparticipant again - I experimented with a Mutex<oneshot> or 2 and a `select` but it didn't prove that well - also adjusted the Error messages to now either Disconnected when gracefully shutdown or ProtocolFailed when some msg couldn't be delivered (note: the latter might not be 100% returned correctly yet)	2020-07-13 13:03:30 +02:00
Marcel Märtens	041349be48	Switch API to return Participant rather than Arc<Participant> - API behavior switched! - the `Network` no longer holds a copy of participant, thus if the return of `connect` (before `Arc<Participant>`, now `Participant`) got dropped, the `Participant::Drop` is triggered! - you can close a Participant async via `Participant::disconnect()`, no more need to know the network at this point - the `Network::Drop` will check and drop not yet disconnected Participants. - you can compare Participants via PartialEq, if they are equal they point to the same endpoint (it checks remote_pid) - Note: multiple Participants are only supported in theory, won't work yet Additionally: - fix some `debug!` - veloren-client will now drop the participant gracefully on shutdown - rename `error` to `debug` when 2 times Bparticipant shutdown is called, as it is to be expected in an async runtime	2020-07-13 13:03:14 +02:00
Joshua Barretto	dd2a81b1f3	Increased network test timeouts	2020-07-05 19:56:06 +01:00
Marcel Märtens	11e7b1f922	increase network sleep in order to fix flaky tests	2020-06-29 14:32:20 +02:00
Marcel Märtens	3324c08640	Fixing the DEADLOCK in handshake -> channel creation - this bug was initially called imbris bug, as it happened on his runners and I couldn't reproduce it locally at first :) - When in a Handshake a separate mpsc::Channel was created for (Cid, Frame) transport, however the protocol could already catch non-handshake data and push it into this mpsc::Channel. Then this channel got dropped and a fresh one was created for the network::Channel. These dropped Frames are ofc a BUG! I tried multiple things to solve this: - don't create a new mpsc::Channel, but instead bind it to the Protocol itself and always use 1. This would work theoretically, but on the bParticipant side we are using 1 mpsc::Channel<(Cid, Frame)> to handle ALL the network::channel. If now every Protocol had its own, and with that every network::Channel had its own, it would no longer work out. Bad Idea... - using the first method but creating the mpsc::Channel inside the scheduler instead of the protocol doesn't work either, as the scheduler doesn't know the remote_pid yet - I don't want a hack to say the protocol only listens to 2 messages and then stops no matter what. So I switched over to the simple method now: - Do everything like before with 2 mpsc::Channels - after the handshake, close the receiver and listen for all remaining (cid, frame) combinations - when starting the channel, reapply them to the new sender/listener combination - added tracing - switched Protocol RwLock to Mutex, as it's only ever 1 - Additionally changed the layout and introduced the c2w_frame_s and w2s_cid_frame_s name schema - Fixed a bug in scheduler which WOULD cause a DEADLOCK if handshake would fail - fixed a bug in api_send_send_main, I need to store the stream_p otherwise it's immediately closed and a stream_a.send() isn't guaranteed - add extra test to verify that a sent message is received even if the Stream is already closed - changed OutGoing to Outgoing - fixed a bug that `metrics.tick()` was never called - removed 2 unused nightly features and added `deny_code`	2020-06-09 01:24:21 +02:00
Marcel Märtens	2a7c5807ff	overall cleanup, more tests, fixing clashes, removing unwraps, hardening against protocol errors, prepare prio mgr to take commands from scheduler, fix async_recv and double block_on panic on Network::drop and participant::drop, include Cargo.lock from all examples. Found a bug on imbris runners with doc tests of `stream::send` and `stream::recv`. As neither a backtrace nor tracing on runners in the doc tests seems to help, I disable them and add them as unit tests	2020-06-09 01:24:16 +02:00
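
Commit 03af9937cf mentions handling shutdown gracefully by interpreting `Ok(0)` from a read as "connection closed". The following is only a minimal sketch of that idea using the blocking `std::io::Read` trait rather than tokio; the function name `read_until_closed` and the 4-byte buffer are assumptions made for illustration and are not taken from the veloren code.

```rust
use std::io::{self, Read};

/// Hypothetical helper: collects bytes until the reader reports `Ok(0)`.
/// For a non-empty buffer, `Ok(0)` means end of stream, which this sketch
/// treats as a clean close by the peer instead of an error.
fn read_until_closed<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut received = Vec::new();
    let mut buf = [0u8; 4]; // deliberately tiny so several reads are needed
    loop {
        match reader.read(&mut buf) {
            // Peer closed the connection: stop reading, keep what arrived.
            Ok(0) => return Ok(received),
            // Got `n` bytes: store them and keep reading.
            Ok(n) => received.extend_from_slice(&buf[..n]),
            // Interrupted reads are retried, other errors are passed up.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

Any `Read` implementor can stand in for the socket, for example `std::io::Cursor::new(b"hello world".to_vec())`, which returns `Ok(0)` once its bytes are exhausted.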

23 Commits
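
Commit 3324c08640 describes keeping 2 channels, collecting the (cid, frame) pairs that are still queued once the handshake is done, and reapplying them to the new sender. A rough sketch of that replay step with `std::sync::mpsc` is shown here; `Cid`, `Frame` and `replay_leftover_frames` are hypothetical stand-ins, not the veloren types. Note that `std::sync::mpsc::Receiver` has no `close()`, so this sketch drains first and closes the receiver by dropping it afterwards, which differs from the order in the commit message.

```rust
use std::sync::mpsc::{Receiver, Sender};

/// Hypothetical channel id, for illustration only.
type Cid = u64;

/// Hypothetical frame type carrying raw bytes, for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Frame(Vec<u8>);

/// Forwards every frame still queued in the handshake receiver to the
/// sender of the freshly created channel and returns how many were moved.
/// The receiver is taken by value, so it is dropped (closed) on return.
fn replay_leftover_frames(
    handshake_rx: Receiver<(Cid, Frame)>,
    channel_tx: &Sender<(Cid, Frame)>,
) -> usize {
    let mut replayed = 0;
    // `try_iter` yields the pending items without blocking.
    for item in handshake_rx.try_iter() {
        if channel_tx.send(item).is_err() {
            break; // the new channel is already gone, nothing left to do
        }
        replayed += 1;
    }
    replayed
}
```

Because the frames are forwarded in the order they were queued, the new channel sees them exactly as the protocol produced them, and later sends on the old handshake sender fail instead of silently losing data.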