All the mail mirrored from lore.kernel.org
* [Ocfs2-devel] [RFC] make ocfs2/o2net reliable
@ 2017-11-16  9:49 Changwei Ge
  2017-11-16 10:04 ` Gang He
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Changwei Ge @ 2017-11-16  9:49 UTC
  To: ocfs2-devel

Hi all,
As far as we know, ocfs2/o2net is not a reliable messaging mechanism.
Messages can be lost when a TCP socket connection shuts down suddenly.
The only user of o2net is ocfs2/dlm, so this can make ocfs2/dlm hang
(missing AST and ASSERT MASTER). Sometimes it also leaves ocfs2/dlm
waiting forever for DLM recovery to complete. That recovery never
happens, because the target node is still heartbeating, so no DLM
recovery procedure is launched.

I think the cases above are reason enough to make ocfs2/o2net more
reliable. I already have a draft design for it, and it does require
changing o2net's behavior.

To accomplish this, we tag each o2net message with a sequence number,
::msg_seq, so that the receiver can tell whether an incoming message is
a duplicate. ::msg_seq also serves as the key for looking up the
structure described below in a red-black tree.

A brand-new structure named *Message Holder* is added to o2net. It is
responsible for storing _handle_status_.

When TCP has to shut down or reset for an unknown reason, we lose the
packets in the send or receive buffer, but o2net still tracks those
messages. This gives o2net a chance to re-send them once the TCP
connection is established again.

The diagram below shows how it works:

SEND					RECV
send message				
tag message header with ::msg_seq	
					search for Message Holder with
					  ::msg_seq
					NOT FOUND - insert one
					(FOUND - means a duplicated one)
					handle message
					store status into Message Holder
					send back status
instruct RECV to remove MH
					notify SEND that MH is already
					  removed
return to caller
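
The RECV column of the diagram can be sketched in user space roughly as
follows. This is only an illustration in Python, not o2net code: the
Receiver class, its method names and the handler callback are all
hypothetical, and a plain dict stands in for the red-black tree keyed by
::msg_seq.

```python
class Receiver:
    """Receive side: one Message Holder (MH) per msg_seq not yet released."""

    def __init__(self, handler):
        self.handler = handler   # hypothetical handler: payload -> status
        self.holders = {}        # msg_seq -> stored status (dict stands in for the rb-tree)
        self.handled = 0         # number of times the handler really ran

    def on_message(self, msg_seq, payload):
        if msg_seq in self.holders:
            # FOUND: duplicate, reply with the stored status without handling again
            return self.holders[msg_seq]
        # NOT FOUND: handle once, store the status in a new MH, send it back
        status = self.handler(payload)
        self.handled += 1
        self.holders[msg_seq] = status
        return status

    def on_remove_mh(self, msg_seq):
        # SEND instructs RECV to drop the MH; the return value is the notification
        self.holders.pop(msg_seq, None)
        return True


recv = Receiver(handler=len)
first = recv.on_message(1, b"abc")    # handled normally
replay = recv.on_message(1, b"abc")   # same msg_seq re-sent after a reconnect
recv.on_remove_mh(1)
```

In this sketch the handler runs once even though the message arrives
twice, and the holder disappears only after the explicit remove step. A
duplicate that arrives while the first copy is still being handled is not
covered here.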

I look forward to your comments, especially from @Mark, @Joseph and @Junxiao.

Thanks,
Changwei.


* [Ocfs2-devel] [RFC] make ocfs2/o2net reliable
  2017-11-16  9:49 [Ocfs2-devel] [RFC] make ocfs2/o2net reliable Changwei Ge
@ 2017-11-16 10:04 ` Gang He
  2017-11-17  1:48   ` Changwei Ge
  2017-11-16 23:02 ` Wengang Wang
  2017-11-17  3:04 ` jiangyiwen
  2 siblings, 1 reply; 11+ messages in thread
From: Gang He @ 2017-11-16 10:04 UTC
  To: ocfs2-devel

Hello Changwei,

Based on your description, this makes sense.
I use the fs/dlm kernel module, and it looks stable.
Have you compared the two DLM implementations? Maybe they can learn from each other.


Thanks
Gang



* [Ocfs2-devel] [RFC] make ocfs2/o2net reliable
  2017-11-16  9:49 [Ocfs2-devel] [RFC] make ocfs2/o2net reliable Changwei Ge
  2017-11-16 10:04 ` Gang He
@ 2017-11-16 23:02 ` Wengang Wang
  2017-11-17  1:38   ` Changwei Ge
  2017-11-17  3:04 ` jiangyiwen
  2 siblings, 1 reply; 11+ messages in thread
From: Wengang Wang @ 2017-11-16 23:02 UTC
  To: ocfs2-devel



On 2017/11/16 1:49, Changwei Ge wrote:
> Hi all,
> As far as we know, ocfs2/o2net is not a reliable message mechanism.
> Messages might get lost due to a sudden TCP socket connection shutdown.
> And the only customer of o2net is ocfs2/dlm, so this may cause ocfs2/dlm
> hang(missing AST and ASSERT MASTER). Sometimes it also causes
> ocfs2/dlm's infinite wait for accomplishment of DLM recovery. But that
> won't happen since target node is still heartbeating and no dlm recovery
> procedure will be launched.
>
> So I think above cases drive us to improve current ocfs2/o2net making it
> more reliable. I already have a draft design for it. And we indeed need
> to change o2net behavior.
>
> To accomplish this goal, we tag each o2net message with a sequence
> ::msg_seq to let receiver tell if the newly coming message is a
> duplicated one or not and ::msg_seq will work as a key value for
> searching a following key structure in a red-black tree.
>
> A brandy new structure is added to o2net named as *Message Holder*, it
> is responsible for _handle_status_ storing.
>
> When TCP has to shutdown or reset due to unknown reason, although we
> lose the packets in send or receive buffer, o2net still manages those
> messages. This gives a chance to o2net to re-send the messages once TCP
> connection is established again.
This sounds like a good idea. Some questions.

So the sender keeps the pending messages (to send) and re-sends them when
necessary.

> Below diagram demonstrates how it works:
>
> SEND					RECV
> send message				
> tag message header with ::msg_seq	
> 					search for Message Holder with
> 					  ::msg_seq
> 					NOT FOUND - insert one
> 					(FOUND - means a duplicated one)
> 					handle message
> 					store status into Message Holder
> 					send back status
I'm not clear about the receiver's response.
What if the Message Holder is FOUND? Does the saved status still apply at that point? Why?
For example:

1. The sender sends a message asking which node owns a lock.
2. The receiver handles the message; the response is node X.
3. A network issue occurs and the sender doesn't get the response.
4. Ownership of that lock migrates to node X2.
5. The network recovers.
6. The sender re-sends the message.
7. The receiver sends back node X, but the owner is now X2.

I am not quite sure whether the above example can happen, but you may need to
prove that the stale status still applies.

This is the biggest concern.
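
To make the concern concrete, here is a tiny user-space sketch in Python.
It is only an illustration under assumed names (the owner table, the
holders dict and handle_owner_query are all hypothetical, not dlm or o2net
code): a receiver that replays a stored status answers a re-sent query with
the owner recorded at first handling.

```python
owner = {"LOCK": "X"}   # current lock owner as the receiver sees it
holders = {}            # msg_seq -> status stored at first handling


def handle_owner_query(msg_seq, lock):
    if msg_seq in holders:
        # Duplicate: replay the stored answer, do not look at current state
        return holders[msg_seq]
    holders[msg_seq] = owner[lock]
    return holders[msg_seq]


first = handle_owner_query(7, "LOCK")     # this reply is lost on the wire
owner["LOCK"] = "X2"                      # ownership migrates in the meantime
replayed = handle_owner_query(7, "LOCK")  # same msg_seq re-sent after recovery
```

Here the replayed answer is still "X" although the table now says "X2",
which is the stale status the design would need to justify.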


> instruct RECV to remove MH
> 					notify SEND that MH is already
> 					  removed

So another round of network messages? What if sending the instruction
fails due to a network issue?
And this will almost double the network overhead.

thanks,
wengang


* [Ocfs2-devel] [RFC] make ocfs2/o2net reliable
  2017-11-16 23:02 ` Wengang Wang
@ 2017-11-17  1:38   ` Changwei Ge
  0 siblings, 0 replies; 11+ messages in thread
From: Changwei Ge @ 2017-11-17  1:38 UTC
  To: ocfs2-devel

Hi Wengang,
Thanks for your comments and inspiration.

On 2017/11/17 7:05, Wengang Wang wrote:
> 
> 
> On 2017/11/16 1:49, Changwei Ge wrote:
>> Hi all,
>> As far as we know, ocfs2/o2net is not a reliable message mechanism.
>> Messages might get lost due to a sudden TCP socket connection shutdown.
>> And the only customer of o2net is ocfs2/dlm, so this may cause ocfs2/dlm
>> hang(missing AST and ASSERT MASTER). Sometimes it also causes
>> ocfs2/dlm's infinite wait for accomplishment of DLM recovery. But that
>> won't happen since target node is still heartbeating and no dlm recovery
>> procedure will be launched.
>>
>> So I think above cases drive us to improve current ocfs2/o2net making it
>> more reliable. I already have a draft design for it. And we indeed need
>> to change o2net behavior.
>>
>> To accomplish this goal, we tag each o2net message with a sequence
>> ::msg_seq to let receiver tell if the newly coming message is a
>> duplicated one or not and ::msg_seq will work as a key value for
>> searching a following key structure in a red-black tree.
>>
>> A brandy new structure is added to o2net named as *Message Holder*, it
>> is responsible for _handle_status_ storing.
>>
>> When TCP has to shutdown or reset due to unknown reason, although we
>> lose the packets in send or receive buffer, o2net still manages those
>> messages. This gives a chance to o2net to re-send the messages once TCP
>> connection is established again.
> This sounds a good idea. some questions.
> 
> So the sender keeps the pending messages (to send) and re-send them when
> necessary.

1. When to keep pending messages:
If o2net (in o2net_send_message_vec) gets no response from the receiver
and is instead woken up by a connection shutdown event
(o2net_set_nn_state), it keeps the pending messages and waits for the
connection to be re-established.

2. When to re-send them:
Once the connection is re-established, o2net tries to re-send them.
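
The two points above can be sketched in user space like this. It is a rough
Python illustration only: the Sender class, its callbacks and the `wire`
list (which stands in for the TCP socket) are hypothetical names, not
o2net's API.

```python
import itertools


class Sender:
    def __init__(self):
        self._seq = itertools.count(1)
        self.pending = {}      # msg_seq -> payload, kept until a status arrives
        self.connected = True
        self.wire = []         # stands in for the socket: (msg_seq, payload) pairs

    def send(self, payload):
        seq = next(self._seq)
        self.pending[seq] = payload          # remember it before it hits the wire
        if self.connected:
            self.wire.append((seq, payload))
        return seq

    def on_status(self, seq, status):
        self.pending.pop(seq, None)          # answered, nothing left to re-send
        return status

    def on_disconnect(self):
        self.connected = False
        self.wire.clear()                    # whatever sat in the socket buffers is lost

    def on_reconnect(self):
        self.connected = True
        for seq in sorted(self.pending):     # re-send unanswered messages in order
            self.wire.append((seq, self.pending[seq]))


s = Sender()
a = s.send("x")
b = s.send("y")
s.on_status(a, 0)      # "x" was answered before the connection dropped
s.on_disconnect()
s.on_reconnect()
```

After the reconnect only the unanswered message is on the wire again, with
its original sequence number, so the receiver can recognise it as a
possible duplicate.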

> 
>> Below diagram demonstrates how it works:
>>
>> SEND					RECV
>> send message				
>> tag message header with ::msg_seq	
>> 					search for Message Holder with
>> 					  ::msg_seq
>> 					NOT FOUND - insert one
>> 					(FOUND - means a duplicated one)
>> 					handle message
>> 					store status into Message Holder
>> 					send back status
> I didn't get clear about the receiver's response.
> what if FOUND?? the saved status still apply currently? why?

Um, yes. If the Message Holder is found, the message is a duplicate, so
it is not handled again; the status stored in the Message Holder is sent
back to the sender directly. Otherwise the sender might do some
overlapping work, which may cause a system bug.

> For example,
> 
> sender sends the message asking which node is the owner of a lock;
> receiver handles the message and the response is node X;
> network issue happened and sender didn't get the response
> The owner of that lock migrated to node X2
> network recovered
> the sender resend the message
> receiver send back it's node X, but actually it's now X2.
> 
> I am quite sure if the above example can happen, but you may need to
> prove the stale status still apply now.
> 
> This is the biggest concern.

I agree with your concern here; the scenario truly exists.
But I suppose the same issue also exists in the current o2net/dlm
implementation.
For example:
1. The sender asks which node is the owner of LOCK.
2. The receiver finds out it is node X.
3. The response is put into the TCP send buffer, waiting for the TCP
layer to transmit it to the sender.
4. The owner migrates to node X2.
5. The sender still obtains a stale owner. :(

But I think we still have a solution for that. Perhaps we need more work
in the o2net user/application. I am afraid that o2net can hardly solve
this alone.

> 
> 
>> instruct RECV to remove MH
>> 					notify SEND that MH is already
>> 					  removed
> 
> So another round of network message? What if sending the instrument
> failed due to network issue.

It will retry each time a timer expires, until it is sure that the
Message Holder has been removed from the receiver.
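
As a rough user-space illustration of that retry loop (Python, with
hypothetical names: release_holder, send_remove and wait_for_timer are not
o2net functions), assuming send_remove returns True only once the receiver
has confirmed the removal:

```python
import itertools


def release_holder(send_remove, msg_seq, wait_for_timer=lambda: None):
    """Keep asking the receiver to drop the MH for msg_seq until it confirms."""
    for attempt in itertools.count(1):
        if send_remove(msg_seq):   # True once the receiver confirms removal
            return attempt
        wait_for_timer()           # stands in for waiting until the retry timer expires


# A link that loses the first two removal requests:
results = iter([False, False, True])
attempts = release_holder(lambda seq: next(results), msg_seq=7)
```

In this run the third attempt succeeds. The loop has no upper bound, which
matches "again and again until it makes sure"; a real implementation would
also have to decide what happens if the peer is fenced in the meantime.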

> And this will almost double the network overhead.

I agree, but if we want to make it reliable we have to sacrifice 
something and I think it is worthwhile.

Thanks,
Changwei


* [Ocfs2-devel] [RFC] make ocfs2/o2net reliable
  2017-11-16 10:04 ` Gang He
@ 2017-11-17  1:48   ` Changwei Ge
  2017-11-17  2:23     ` Gang He
  0 siblings, 1 reply; 11+ messages in thread
From: Changwei Ge @ 2017-11-17  1:48 UTC
  To: ocfs2-devel

On 2017/11/16 18:05, Gang He wrote:
> Hello Changwei,
> 
> Base on your description, it looks make sense.
> Since I uses fs/dlm kernel module, it looks stable.
> Do you compare both dlm implementation? maybe can learn from each other.
> 
> 
> Thanks
> Gang

Hi Gang,
Actually, I have studied some of the fs/dlm code and I don't think it can
handle such an exception scenario. But I don't have a test environment
with fs/dlm set up. Can you run some tests, for example by configuring a
duplicate IP address on a host?
I think it is easy to reproduce.

Thanks,
Changwei


* [Ocfs2-devel] [RFC] make ocfs2/o2net reliable
  2017-11-17  1:48   ` Changwei Ge
@ 2017-11-17  2:23     ` Gang He
  2017-11-17  3:45       ` Changwei Ge
  0 siblings, 1 reply; 11+ messages in thread
From: Gang He @ 2017-11-17  2:23 UTC
  To: ocfs2-devel




>>> 
> On 2017/11/16 18:05, Gang He wrote:
>> Hello Changwei,
>> 
>> Base on your description, it looks make sense.
>> Since I uses fs/dlm kernel module, it looks stable.
>> Do you compare both dlm implementation? maybe can learn from each other.
Do you have detailed steps to reproduce this problem? I think the problem does exist.
Maybe the idea can be used by both DLM modules.
Second, if you add this message-id and red-black tree mechanism, you also need to
add a monitor kernel thread to check whether the set of messages in the red-black tree keeps growing (which would lead to a memory leak).
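
A user-space sketch of what such a monitor could check (Python, purely
illustrative; HolderTable, max_age and the injectable clock are assumed
names, and a dict again stands in for the red-black tree): record when each
entry was created and periodically list the ones that have outlived a
threshold.

```python
import time


class HolderTable:
    def __init__(self, max_age, clock=time.monotonic):
        self.max_age = max_age   # seconds an entry may live before it looks leaked
        self.clock = clock       # injectable so the check can be tested
        self.holders = {}        # msg_seq -> (status, created_at)

    def insert(self, msg_seq, status):
        self.holders[msg_seq] = (status, self.clock())

    def stale(self):
        """Return the msg_seq values a monitor thread would flag."""
        now = self.clock()
        return sorted(seq for seq, (_, created) in self.holders.items()
                      if now - created > self.max_age)


now = [0.0]                                  # fake clock for the example
table = HolderTable(max_age=10, clock=lambda: now[0])
table.insert(1, 0)
now[0] = 5.0
table.insert(2, 0)
now[0] = 12.0
flagged = table.stale()
```

Only the first entry is older than the threshold at that point. What the
monitor should do with flagged entries (log, drop, or force a retry) is
left open here.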

Thanks
Gang



* [Ocfs2-devel] [RFC] make ocfs2/o2net reliable
  2017-11-16  9:49 [Ocfs2-devel] [RFC] make ocfs2/o2net reliable Changwei Ge
  2017-11-16 10:04 ` Gang He
  2017-11-16 23:02 ` Wengang Wang
@ 2017-11-17  3:04 ` jiangyiwen
  2017-11-17  3:53   ` Changwei Ge
  2 siblings, 1 reply; 11+ messages in thread
From: jiangyiwen @ 2017-11-17  3:04 UTC
  To: ocfs2-devel

On 2017/11/16 17:49, Changwei Ge wrote:
> Hi all,
> As far as we know, ocfs2/o2net is not a reliable message mechanism. 
> Messages might get lost due to a sudden TCP socket connection shutdown. 
Hi Changwei,

Junxiao has already solved the situation you mentioned.
Since commit c43c363def04cdaed0d9e26dae846081f55714e7, o2net doesn't shut
down the connection until the node is fenced, so I don't understand the
TCP socket connection shutdown scenario you mentioned. Can you give
a specific description? Thank you.

In addition, as far as I know, TCP is reliable and trustworthy: TCP
re-sends messages after a certain retransmit time. So as long as
o2net doesn't actively shut down the socket, TCP will re-send the
messages for us.

Thanks,
Yiwen Jiang.

* [Ocfs2-devel] [RFC] make ocfs2/o2net reliable
  2017-11-17  2:23     ` Gang He
@ 2017-11-17  3:45       ` Changwei Ge
  0 siblings, 0 replies; 11+ messages in thread
From: Changwei Ge @ 2017-11-17  3:45 UTC
  To: ocfs2-devel

Hi Gang

On 2017/11/17 10:24, Gang He wrote:
> 
> 
> 
>>>>
>> On 2017/11/16 18:05, Gang He wrote:
>>> Hello Changwei,
>>>
>>> Base on your description, it looks make sense.
>>> Since I uses fs/dlm kernel module, it looks stable.
>>> Do you compare both dlm implementation? maybe can learn from each other.
> Do you have a detailed steps to reproduce this problem? I think the problem should exist,

I assume you have four hosts: A, B, C, D.

The ocfs2 cluster includes A, B and C (3 nodes in total):
A.IP = 172.20.50.1
B.IP = 172.20.50.2
C.IP = 172.20.50.3

1. Start the cluster and generate some workload on all of its nodes.

2. Configure node D's IP address to 172.20.50.3 (the same as node C).

3. Ping node A from node D.


> Maybe the idea can be referenced by both dlm modules.
> Second, if you add this message-id and red-black tree mechanism, you also need to
> add a monitor kernel-thread, to see if these messages in red-black tree will become more and more bigger (this will lead to memory leak).

This is a good idea! I will take it into account.

Thanks,
Changwei


* [Ocfs2-devel] [RFC] make ocfs2/o2net reliable
  2017-11-17  3:04 ` jiangyiwen
@ 2017-11-17  3:53   ` Changwei Ge
  2017-11-17  5:50     ` jiangyiwen
  0 siblings, 1 reply; 11+ messages in thread
From: Changwei Ge @ 2017-11-17  3:53 UTC
  To: ocfs2-devel

Hi Yiwen,

On 2017/11/17 11:06, jiangyiwen wrote:
> On 2017/11/16 17:49, Changwei Ge wrote:
>> Hi all,
>> As far as we know, ocfs2/o2net is not a reliable message mechanism.
>> Messages might get lost due to a sudden TCP socket connection shutdown.
> Hi Changwei,
> 
> Junxiao has already solved the situation about you mentioned.
> in commit(c43c363def04cdaed0d9e26dae846081f55714e7), it don't shutdown
> connection until node is fenced, so I don't understand the scenario
> what you mentioned about TCP socket connection shutdown, can you give
> a specific description? thank you.

I'm afraid Junxiao's patch can't cover all scenarios. It addresses the
o2net timeout scenario but not the TCP socket reset case.

> 
> In addition, as far as I know, TCP is reliable and trustworthy, TCP
> will resend messages in a certain retransmit time. So as long as
> o2net didn't active shutdown socket, TCP will resend message for
> us.
> 
> Thanks,
> Yiwen Jiang.

Actually, TCP doesn't even begin to send the packets in its send buffer
before it is closed for some unknown underlying reason. So we lose them.


Thanks,
Changwei


* [Ocfs2-devel] [RFC] make ocfs2/o2net reliable
  2017-11-17  3:53   ` Changwei Ge
@ 2017-11-17  5:50     ` jiangyiwen
  2017-11-17  6:03       ` Changwei Ge
  0 siblings, 1 reply; 11+ messages in thread
From: jiangyiwen @ 2017-11-17  5:50 UTC
  To: ocfs2-devel

On 2017/11/17 11:53, Changwei Ge wrote:
> Hi Yiwen,
> 
> On 2017/11/17 11:06, jiangyiwen wrote:
>> On 2017/11/16 17:49, Changwei Ge wrote:
>>> Hi all,
>>> As far as we know, ocfs2/o2net is not a reliable message mechanism.
>>> Messages might get lost due to a sudden TCP socket connection shutdown.
>> Hi Changwei,
>>
>> Junxiao has already solved the situation about you mentioned.
>> in commit(c43c363def04cdaed0d9e26dae846081f55714e7), it don't shutdown
>> connection until node is fenced, so I don't understand the scenario
>> what you mentioned about TCP socket connection shutdown, can you give
>> a specific description? thank you.
> 
> I'm afraid Juxiao's patch can't cover all scenarios. It addresses o2net 
> timeout scenario but not tcp socket resetting case.
> 
>>
>> In addition, as far as I know, TCP is reliable and trustworthy, TCP
>> will resend messages in a certain retransmit time. So as long as
>> o2net didn't active shutdown socket, TCP will resend message for
>> us.
>>
>> Thanks,
>> Yiwen Jiang.
> 
> Actually, TCP event doesn't begin to send packets from its send buffer 
> but closed due to underlying unknown reason. So we lose them.
> 
> 
> Thanks,
> Changwei
> 

I think we should first find out why the TCP socket is reset/closed,
i.e. the unknown underlying reason you mentioned above; maybe it is a
TCP bug. If the analysis shows that it is normal for TCP to be closed
under certain conditions, then we can discuss the solution.

Thanks,
Yiwen Jiang.

>>> And the only customer of o2net is ocfs2/dlm, so this may cause ocfs2/dlm
>>> hang(missing AST and ASSERT MASTER). Sometimes it also causes
>>> ocfs2/dlm's infinite wait for accomplishment of DLM recovery. But that
>>> won't happen since target node is still heartbeating and no dlm recovery
>>> procedure will be launched.
>>>
>>> So I think above cases drive us to improve current ocfs2/o2net making it
>>> more reliable. I already have a draft design for it. And we indeed need
>>> to change o2net behavior.
>>>
>>> To accomplish this goal, we tag each o2net message with a sequence
>>> ::msg_seq to let receiver tell if the newly coming message is a
>>> duplicated one or not and ::msg_seq will work as a key value for
>>> searching a following key structure in a red-black tree.
>>>
>>> A brandy new structure is added to o2net named as *Message Holder*, it
>>> is responsible for _handle_status_ storing.
>>>
>>> When TCP has to shutdown or reset due to unknown reason, although we
>>> lose the packets in send or receive buffer, o2net still manages those
>>> messages. This gives a chance to o2net to re-send the messages once TCP
>>> connection is established again.
>>>
>>> Below diagram demonstrates how it works:
>>>
>>> SEND					RECV
>>> send message				
>>> tag message header with ::msg_seq	
>>> 					search for Message Holder with
>>> 					  ::msg_seq
>>> 					NOT FOUND - insert one
>>> 					(FOUND - means a duplicated one)
>>> 					handle message
>>> 					store status into Message Holder
>>> 					send back status
>>> instruct RECV to remove MH
>>> 					notify SEND that MH is already
>>> 					  removed
>>> return to caller
>>>
>>> I am expecting your comments especially from @Mark, @Joseph and @Junxiao.
>>>
>>> Thanks,
>>> Changwei.
>>>


* [Ocfs2-devel] [RFC] make ocfs2/o2net reliable
  2017-11-17  5:50     ` jiangyiwen
@ 2017-11-17  6:03       ` Changwei Ge
  0 siblings, 0 replies; 11+ messages in thread
From: Changwei Ge @ 2017-11-17  6:03 UTC (permalink / raw
  To: ocfs2-devel

On 2017/11/17 13:51, jiangyiwen wrote:
> On 2017/11/17 11:53, Changwei Ge wrote:
>> Hi Yiwen,
>>
>> On 2017/11/17 11:06, jiangyiwen wrote:
>>> On 2017/11/16 17:49, Changwei Ge wrote:
>>>> Hi all,
>>>> As far as we know, ocfs2/o2net is not a reliable message mechanism.
>>>> Messages might get lost due to a sudden TCP socket connection shutdown.
>>> Hi Changwei,
>>>
>>> Junxiao has already solved the situation about you mentioned.
>>> in commit(c43c363def04cdaed0d9e26dae846081f55714e7), it don't shutdown
>>> connection until node is fenced, so I don't understand the scenario
>>> what you mentioned about TCP socket connection shutdown, can you give
>>> a specific description? thank you.
>>
>> I'm afraid Junxiao's patch can't cover all scenarios. It addresses the o2net
>> timeout scenario but not the TCP socket reset case.
>>
>>>
>>> In addition, as far as I know, TCP is reliable and trustworthy, TCP
>>> will resend messages in a certain retransmit time. So as long as
>>> o2net didn't active shutdown socket, TCP will resend message for
>>> us.
>>>
>>> Thanks,
>>> Yiwen Jiang.
>>
>> Actually, TCP may be closed for some underlying unknown reason before it
>> even begins to send the packets in its send buffer, so we lose them.
>>
>>
>> Thanks,
>> Changwei
>>
> 
> I think firstly we should find the reason why tcp socket is reset/closed,
> that is the underlying unknown reason you mentioned above, maybe it is
> TCP bug. After analyzing, it is normal that tcp is closed in certain
> condition, then we discuss the solution.

Um, I am a little confused. Do you mean we have to find the root cause of
why TCP shuts down an existing connection?
I think we should enhance o2net's reliability, making it like other reliable
message mechanisms.
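
To make the idea concrete, here is a minimal user-space sketch of the
duplicate detection described in the RFC, not the actual o2net code. It
assumes a plain dict in place of the red-black tree, and the names
Receiver, Sender, on_message, on_remove and drop_next_reply are all
hypothetical, chosen only for illustration. A lost status reply stands in
for the TCP reset.

```python
class Receiver:
    """RECV side: keeps one Message Holder (MH) per in-flight msg_seq."""

    def __init__(self, handler):
        self.handler = handler  # hypothetical message handler callback
        self.holders = {}       # msg_seq -> stored handler status (dict, not rbtree)
        self.handled = 0        # number of times the handler really ran

    def on_message(self, msg_seq, payload):
        if msg_seq in self.holders:
            # FOUND: a duplicate caused by a resend, replay the stored status.
            return self.holders[msg_seq]
        # NOT FOUND: handle the message once and store its status in a new MH.
        status = self.handler(payload)
        self.handled += 1
        self.holders[msg_seq] = status
        return status

    def on_remove(self, msg_seq):
        # SEND has received the status, so the MH is no longer needed.
        self.holders.pop(msg_seq, None)
        return True


class Sender:
    """SEND side: tags each message with msg_seq and resends on a lost reply."""

    def __init__(self, receiver):
        self.receiver = receiver
        self.next_seq = 0
        self.drop_next_reply = False  # test hook: pretend the socket was reset

    def send(self, payload, max_tries=3):
        msg_seq = self.next_seq
        self.next_seq += 1
        for _ in range(max_tries):
            status = self.receiver.on_message(msg_seq, payload)
            if self.drop_next_reply:
                # The status was lost with the connection: resend the same msg_seq.
                self.drop_next_reply = False
                continue
            self.receiver.on_remove(msg_seq)  # instruct RECV to remove the MH
            return status
        raise ConnectionError("no status received for msg_seq %d" % msg_seq)
```

In this sketch a resend reuses the same msg_seq, so the receiver replays
the stored status instead of running the handler a second time.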

Thanks,
Changwei

> 
> Thanks,
> Yiwen Jiang.
> 
>>>> And the only customer of o2net is ocfs2/dlm, so this may cause ocfs2/dlm
>>>> hang(missing AST and ASSERT MASTER). Sometimes it also causes
>>>> ocfs2/dlm's infinite wait for accomplishment of DLM recovery. But that
>>>> won't happen since target node is still heartbeating and no dlm recovery
>>>> procedure will be launched.
>>>>
>>>> So I think above cases drive us to improve current ocfs2/o2net making it
>>>> more reliable. I already have a draft design for it. And we indeed need
>>>> to change o2net behavior.
>>>>
>>>> To accomplish this goal, we tag each o2net message with a sequence
>>>> ::msg_seq to let receiver tell if the newly coming message is a
>>>> duplicated one or not and ::msg_seq will work as a key value for
>>>> searching a following key structure in a red-black tree.
>>>>
>>>> A brandy new structure is added to o2net named as *Message Holder*, it
>>>> is responsible for _handle_status_ storing.
>>>>
>>>> When TCP has to shutdown or reset due to unknown reason, although we
>>>> lose the packets in send or receive buffer, o2net still manages those
>>>> messages. This gives a chance to o2net to re-send the messages once TCP
>>>> connection is established again.
>>>>
>>>> Below diagram demonstrates how it works:
>>>>
>>>> SEND					RECV
>>>> send message				
>>>> tag message header with ::msg_seq	
>>>> 					search for Message Holder with
>>>> 					  ::msg_seq
>>>> 					NOT FOUND - insert one
>>>> 					(FOUND - means a duplicated one)
>>>> 					handle message
>>>> 					store status into Message Holder
>>>> 					send back status
>>>> instruct RECV to remove MH
>>>> 					notify SEND that MH is already
>>>> 					  removed
>>>> return to caller
>>>>
>>>> I am expecting your comments especially from @Mark, @Joseph and @Junxiao.
>>>>
>>>> Thanks,
>>>> Changwei.
>>>>


Thread overview: 11+ messages
2017-11-16  9:49 [Ocfs2-devel] [RFC] make ocfs2/o2net reliable Changwei Ge
2017-11-16 10:04 ` Gang He
2017-11-17  1:48   ` Changwei Ge
2017-11-17  2:23     ` Gang He
2017-11-17  3:45       ` Changwei Ge
2017-11-16 23:02 ` Wengang Wang
2017-11-17  1:38   ` Changwei Ge
2017-11-17  3:04 ` jiangyiwen
2017-11-17  3:53   ` Changwei Ge
2017-11-17  5:50     ` jiangyiwen
2017-11-17  6:03       ` Changwei Ge
