The biggest issue causing the restart cascade was that crash detection was built into the check_running method.
That method is called everywhere, so when several calls overlapped (sometimes 5 at the same time), each one tried to restart the server, over and over.
I created a new detect_crash method that now handles crash detection, and removed all crash detection from check_running.
Also added a remove_watcher_thread method to remove the old scheduled task that was still watching the older server.
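
A rough sketch of the split described above, not the actual implementation. Only the method names check_running, detect_crash and remove_watcher_thread come from this change. The ServerSketch class, its attributes, start_watcher, and the use of a threading.Event in place of the real scheduled task are all assumptions made for illustration.

```python
import threading


class ServerSketch:
    """Hypothetical stand-in for the server wrapper (not the project's class)."""

    def __init__(self):
        self.process_alive = False   # assumed flag standing in for a real process check
        self.restart_count = 0
        self._crash_lock = threading.Lock()
        self._watcher = None         # current watcher thread, if any
        self._watcher_stop = None    # Event used to stop that watcher

    def start(self):
        self.process_alive = True

    def check_running(self):
        # Status query only: no restart logic, so it is safe to call
        # from anywhere, any number of times.
        return self.process_alive

    def detect_crash(self):
        # The only place that restarts the server. The lock makes
        # overlapping calls collapse into a single restart.
        with self._crash_lock:
            if self.check_running():
                return False
            self.restart_count += 1
            self.start()
            return True

    def start_watcher(self, interval=0.05):
        # Drop any watcher left over from an older server first.
        self.remove_watcher_thread()
        stop = threading.Event()

        def watch():
            # Event.wait returns False on timeout, True once stop is set.
            while not stop.wait(interval):
                self.detect_crash()

        thread = threading.Thread(target=watch, daemon=True)
        self._watcher, self._watcher_stop = thread, stop
        thread.start()

    def remove_watcher_thread(self):
        # Stop and forget the old watcher so it cannot trigger restarts.
        if self._watcher_stop is not None:
            self._watcher_stop.set()
            self._watcher.join()
            self._watcher = self._watcher_stop = None
```

With this shape, five overlapping check_running calls on a crashed server change nothing, and five overlapping detect_crash calls produce one restart instead of five.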