#4 Infinite loop to get new gnode

Open
opened 6 years ago by nas · 1 comments
nas commented 6 years ago

Nodes can get stuck in an infinite loop, repeating the following message:

+ Radar: New node found: 10.0.0.174, ext: 0, level: 0
+ I've seen 1 hooking nodes around us, and one of them is becoming a new gnode.
We wait, then we'll restart the hook.

This message comes from the function `void hook_first_radar_scan(map_gnode * hook_gnode, int hook_level, quadro_group * old_quadg)` in the file `hook.c`, line 1166.

This can happen when two nodes try to elect a new gnode in lockstep. A random delay could break the tie; however, one may already exist:

usleep(rand_range(0, 1024));    /* ++entropy, thx to katolaz :) */ 
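Since `usleep()` takes microseconds, `rand_range(0, 1024)` adds at most about 1 ms of jitter, which may be too small to desynchronize two nodes when the main wait is a `sleep()` of whole seconds. A minimal sketch of a wider, growing random window follows. `jitter_range` and `backoff_delay_us` are hypothetical helpers, not functions from `hook.c`, and the doubling policy is an assumption, not something the project does.

```c
#include <stdlib.h>

/* Hypothetical helper (not from hook.c): pseudo-random integer in [lo, hi]. */
static unsigned int jitter_range(unsigned int lo, unsigned int hi)
{
    unsigned int span;

    if (hi <= lo)
        return lo;
    span = hi - lo;
    /* rand() only yields values up to RAND_MAX; avoid overflowing span + 1. */
    if (span >= (unsigned int)RAND_MAX)
        return lo + (unsigned int)rand();
    return lo + (unsigned int)rand() % (span + 1u);
}

/*
 * Hypothetical helper: random delay in microseconds for a given retry.
 * The window starts at base_us and doubles with each attempt, capped at
 * max_us, so two nodes that keep colliding drift apart more and more.
 */
static unsigned int backoff_delay_us(unsigned int attempt,
                                     unsigned int base_us,
                                     unsigned int max_us)
{
    unsigned int window = base_us;
    unsigned int k;

    /* Doubling only while window <= max_us / 2 keeps window * 2 in range. */
    for (k = 0; k < attempt && window <= max_us / 2; k++)
        window *= 2;
    if (window > max_us)
        window = max_us;
    return jitter_range(0, window);
}
```

The result could be passed to `usleep()` in place of the fixed `rand_range(0, 1024)`, with the attempt number being the count of consecutive collisions seen so far.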
nas commented 6 years ago
Poster

The radar scan has a maximum number of attempts (`MAX_FIRST_RADAR_SCANS`), but this limit is ignored during this “collision”:

else if (hook_retry) {
                        /*
                         * There are only hooking nodes, but we started the hooking
                         * after them, so we wait until some of them create the new
                         * gnode.
                         */
                        loginfo
                                ("I've seen %d hooking nodes around us, and one of them "
                                 "is becoming a new gnode.\n"
                                 "  We wait, then we'll restart the hook.\n",
                                 total_hooking_nodes);
                        loginfo("trying in : %d", sleepi);
                        usleep(rand_range(0, 10240)); /* ++entropy, thx to katolaz :) */
                        sleep(MAX_RADAR_WAIT);
                        i--;
                } else

Removing this `i--` makes the hook report success, but no link is actually established.

It may be related to the state. However, replacing the retry on rhook_retry with a loop around the whole function is not sufficient either:

{
        int total_hooking_nodes, i;
        _Bool init_hook = 1;

       while(init_hook){
               init_hook = 0;
               […];
               sleep(MAX_RADAR_WAIT);
               init_hook = 1 ;
           } else
               […]
       }
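One direction that might keep the "wait for the other node" behaviour while still guaranteeing termination is to cap the waits with a counter of their own, separate from the scan counter. The sketch below is only an illustration under that assumption. Every name in it (`scan_result`, `hook_outcome`, `bounded_first_scan`, `stub_scan`, `max_waits`) is hypothetical and not taken from `hook.c`, and it has not been tried against the real hook code.

```c
/* Hypothetical outcomes of a single radar scan. */
enum scan_result { SCAN_FOUND_GNODE, SCAN_ONLY_HOOKING_NODES, SCAN_NOTHING };

/* Hypothetical results of the whole first-scan phase. */
enum hook_outcome { HOOK_OK, HOOK_SCANS_EXHAUSTED, HOOK_GIVE_UP_WAITING };

typedef enum scan_result (*scan_fn)(void *ctx);

/*
 * Runs at most max_scans counted scans. A scan that only sees hooking nodes
 * is not counted (same effect as the i-- in the original), but such waits
 * are limited to max_waits, so the loop always terminates.
 */
static enum hook_outcome bounded_first_scan(scan_fn scan, void *ctx,
                                            int max_scans, int max_waits)
{
    int i, waits = 0;

    for (i = 0; i < max_scans; i++) {
        switch (scan(ctx)) {
        case SCAN_FOUND_GNODE:
            return HOOK_OK;
        case SCAN_ONLY_HOOKING_NODES:
            if (++waits > max_waits)
                return HOOK_GIVE_UP_WAITING;
            /* The real code would sleep here, ideally with random jitter. */
            i--;    /* do not charge this wait against max_scans */
            break;
        case SCAN_NOTHING:
            break;
        }
    }
    return HOOK_SCANS_EXHAUSTED;
}

/* Stub for experiments: sees only hooking nodes *ctx times, then a gnode. */
static enum scan_result stub_scan(void *ctx)
{
    int *collisions_left = ctx;

    if (*collisions_left > 0) {
        (*collisions_left)--;
        return SCAN_ONLY_HOOKING_NODES;
    }
    return SCAN_FOUND_GNODE;
}
```

What the caller should do on `HOOK_GIVE_UP_WAITING` (for example, restart the hook from scratch or try to become the gnode itself) is left open here, since it depends on the state handling mentioned above.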