Add ws reconnection#499
Conversation
| defer c.reconnectMx.Unlock() | ||
| } else { | ||
| // already reconnecting, try again later | ||
| time.Sleep(50 * time.Millisecond) |
There was a problem hiding this comment.
If one thread hangs on reconnect, another will create 1 stack frame every 50ms, because e.g. writeMessage uses recursion after reconnect returns. Can this lead to stack overflow eventually?
There was a problem hiding this comment.
I guess in theory we could run out of memory but it will only happen if L1 is not accessible for a really long time, which means that wasp will not be working anyway. Besides, golang's stack is more flexible and can grow dynamically. So I'm not sure if it's a major issue
| log: log, | ||
| } | ||
|
|
||
| c.reconnect(ctx) |
There was a problem hiding this comment.
Minor thing, not a required fix: Maybe check an error? If context is canceled, if would be nice to return error, especially when we already have error returned from the contructor.
| err := c.CallContext(ctx, &subID, method, args...) | ||
| if err != nil { | ||
| c.log.Errorf("failed to resubscribe to %s: %s", method, err) | ||
| continue |
There was a problem hiding this comment.
Maybe we should return here with error? Why would we support subscribing for part of all subscriptions? Wouldn't hta be a faulty state?
There was a problem hiding this comment.
not sure. I though it makes sense to resubscribe to what we can anyway. Worst case scenario is that it just does not work
| } | ||
|
|
||
| if c.conn != nil { | ||
| c.conn.Close() |
There was a problem hiding this comment.
If read and write both will need to reconnect, would that mean that eventually they both will call this function?
I mean:
- read fail and tried to reconnect
- write fails and tries to reconnect but is required to wait for read to reconnect
- read reconnects -> resubscribes and recreated pending calls
- write "unfreezes" and is now also able to reconnect, and thus deletes again the connect (which might again trigger reconnect for read), and again resubscribes and recreates calls
I might be missing something
There was a problem hiding this comment.
I think this scenario is possible but is very unlikely, it could only happen if second flow (write in your case) calls reconnect RIGHT after the first reconnect finishes and the lock is released. In most cases second connection will just be waiting 50ms and then reconnect using new connection
|
will address all questions/suggestions if we actually end up using this solution tomorrow |
No description provided.