Commit Graph

9 Commits

Author SHA1 Message Date
Marcel Märtens
2e3d5f87db StreamError::Deserialize is now triggered when recv fails because of wrong type
- added PartialEq to StreamError for test purposes (only yet!)
 - removed async_recv example as it's no longer for any use.
   It was created before the COMPLETE REWRITE in order to verify that my own async interface on top of mio works.
   However it's now guaranteed by async-std and futures. no need for a special test
 - remove uvth from dependencies and replace it with a `FnOnce`
 - fix ALL clippy (network) lints
 - basic fix for a channel drop scenario:
   TODO: this needs some further fixes
   up to know only destruction of participant by api was covered correctly.
   we had an issue when the underlying channels got dropped. So now we have a participant without channels.
   We need to buffer the requests and try to reopen a channel ASAP!
   If no channel could be reopened we need to close the Participant, while
    a) leaving the BParticipant in takt, knowing that it only waits for a propper close by scheduler
    b) close the BParticipant gracefully. Notifying the scheduler to remove its stuff (either scheduler schould detect a stopped BParticipant or BParticipant will send Scheduler it's own destruction, and then Scheduler just does the same like when API forces a close)
       Keep the Participant alive and wait for the api to acces BParticipant to notice it's closed and then wait for a disconnect which isn't doing anything as it was already cleaned up in the background
2020-06-09 13:16:39 +02:00
Marcel Märtens
3324c08640 Fixing the DEADLOCK in handshake -> channel creation
- this bug was initially called imbris bug, as it happened on his runners and i couldn't reproduce it locally at fist :)
- When in a Handshake a seperate mpsc::Channel was created for (Cid, Frame) transport
  however the protocol could already catch non handshake data any more and push in into this
  mpsc::Channel.
  Then this channel got dropped and a fresh one was created for the network::Channel.
  These droped Frames are ofc a BUG!
  I tried multiple things to solve this:
   - dont create a new mpsc::Channel, but instead bind it to the Protocol itself and always use 1.
     This would work theoretically, but in bParticipant side we are using 1 mpsc::Channel<(Cid, Frame)>
     to handle ALL the network::channel.
     If now ever Protocol would have it's own, and with that every network::Channel had it's own it would no longer work out
     Bad Idea...
   - using the first method but creating the mpsc::Channel inside the scheduler instead protocol neither works, as the
     scheduler doesnt know the remote_pid yet
   - i dont want a hack to say the protocol only listen to 2 messages and then stop no matter what
  So i switched over to the simply method now:
   - Do everything like before with 2 mpsc::Channels
   - after the handshake. close the receiver and listen for all remaining (cid, frame) combinations
   - when starting the channel, reapply them to the new sender/listener combination
   - added tracing
- switched Protocol RwLock to Mutex, as it's only ever 1
- Additionally changed the layout and introduces the c2w_frame_s and w2s_cid_frame_s name schema
- Fixed a bug in scheduler which WOULD cause a DEADLOCK if handshake would fail
- fixd a but in api_send_send_main, i need to store the stream_p otherwise it's immeadiatly closed and a stream_a.send() isn't guaranteed
- add extra test to verify that a send message is received even if the Stream is already closed
- changed OutGoing to Outgoing
- fixed a bug that `metrics.tick()` was never called
- removed 2 unused nightly features and added `deny_code`
2020-06-09 01:24:21 +02:00
Marcel Märtens
2a7c5807ff overall cleanup, more tests, fixing clashes, removing unwraps, hardening against protocol errors, prepare prio mgr to take commands from scheduler
fix async_recv and double block_on panic on Network::drop and participant::drop
include Cargo.lock from all examples
Found a bug on imbris runners with doc tests of `stream::send` and `stream::recv`
As neither a backtrace, nor tracing on runners in the doc tests seems to help, i disable them and add them as unit tests
2020-06-09 01:24:16 +02:00
Marcel Märtens
a86cfbae65 add new tests and increase coverage 2020-06-09 01:24:07 +02:00
Marcel Märtens
6e776e449f fixing all tests and doc tests including some deadlocks 2020-06-09 01:24:05 +02:00
Marcel Märtens
9550da87b8 speeding up metrics by reducing string generation and Hashmap access with a metrics cache for msg/send and msg/recv 2020-06-09 01:24:01 +02:00
Marcel Märtens
a8f1bc178a Experiments with a prometheus bug which actually worked as designed because i had client and server running at the same time
- https://github.com/tikv/rust-prometheus/issues/321
 - split up channel into a hanshake part and channel part.
   The handshake part is non endless and ends when its either done or aborted.
   If its okay i will send a request to the BParticipant which then opens a channel on the existing TCP or UDP connection.
   this streamlines the command chain alot. also the channel is almost empty now, thinking about removing it completly.
   isnt perfect, as shutdown and udp doesnt work yet
 - make PID to print as Base64
 - replace rouille with tiny_http
2020-06-09 01:23:49 +02:00
Marcel Märtens
661060808d switch from serde to manually for speed, remove async_serde
- removing async_serde as it seems to be not usefull
  the idea was because deserialising is slow parallising it could speed up.
  Whoever we need to keep the order of frames, (at least for controlframes) so serialising in threads would be quite complicated.
  Also serialisation is quite fast, about 1 Gbit/s such speed is enough for messaging, it's more important to serve parallel streams better.
  Thats why i am removing async serde coding for now
- frames are no longer serialized by serde, by byte by byte manually, increadible speed upgrade
- more metrics
- switch channel_creator into for_each_concurrent
- removing some pool.spwan_ok() as they dont allow me to use self
- reduce features needed
2020-06-09 01:23:42 +02:00
Marcel Märtens
2ee18b1fd8 Examples, HUGE fixes, test, make it alot smother
- switch `listen` to async in oder to verify if the bind was successful
- Introduce the following examples
  - network speed
  - chat
  - fileshare
- add additional tests
- fix dropping stream before last messages can be handled bug, when dropping a stream, BParticipant will wait for prio to be empty before dropping the stream and sending the signal
- correct closing of stream and participant
- move tcp to protocols and create udp front and backend
- tracing and fixing a bug that is caused by not waiting for configuration after receiving a frame
- fix a bug in network-speed, but there is still a bug if trace=warn after 2.000.000 messages the server doesnt get that client has shut down and seems to lock somewhere. hard to reproduce

open tasks
[ ] verify UDP works correctly, especcially the connect!
[ ] implements UDP shutdown correctly, the one created in connect!
[ ] unify logging
[ ] fill metrics
[ ] fix dropping stream before last messages can be handled bug
[ ] add documentation
[ ] add benchmarks
[ ] remove async_serde???
[ ] add mpsc
2020-06-09 01:23:37 +02:00