Sir, have you tried a later firmware release? When were the last update and configuration change made?
I've been using 6.35 on the 1072 and it's been really stable. The only time I've seen OSPF play up without a config change is on wireless links if the signal degrades; the remote router seems to need a reboot to reconnect for some reason.
supout.rif is the equivalent of a show tech in the Cisco world. You can log into your account and view the contents, as well as send it to MikroTik with a ticket.
Sir, in Cisco you have a "sh tech" command whose output we can actually analyze. Does MikroTik have any similar command? I'm a newbie with MikroTik, and I was hoping I could check something apart from the normal "log" files that would give me a clue as to what is triggering the sudden, random drops of OSPF and BGP.
Hi Sir, thanks for your response. So far I've upgraded both my 1072s to 6.37.5, and I will try to upgrade the 2 x 1036s this weekend to the same bugfix version. I was able to generate a supout.rif file and open it in the supout viewer. My question is: when is the best time to generate the file? Right after the unusual behavior is encountered? The log files are deleted every time you reboot the router, and in my case a reboot is what resolves the issue, temporarily that is. I guess I was hoping for a means of finding out what triggers the behavior. Thanks again.
As far as RouterOS version goes, I advise all of my clients to run bugfix code, as it is much more stable in production. One other practice that can contribute to OSPF/BGP instability is running a lot of mismatched versions on the routers. 6.37.5 bugfix has worked well for a lot of our clients that depend on BGP/OSPF.
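On the point above about logs being lost at reboot: RouterOS can send log entries to a remote syslog target, so one option is to keep a copy off the router. The following is only a minimal sketch of a UDP receiver that could run on a spare host, not a production collector. The file name `routeros-remote.log`, the port 5514, and the function names are assumptions made up for illustration; the router's logging action would have to be pointed at whatever address and port you actually choose.

```python
import socketserver
from datetime import datetime, timezone

LOG_PATH = "routeros-remote.log"  # hypothetical file name, pick your own


def format_record(source_ip: str, payload: bytes, now: datetime) -> str:
    """Turn one received datagram into a single timestamped log line."""
    text = payload.decode("utf-8", errors="replace").strip()
    return f"{now.isoformat()} {source_ip} {text}"


class SyslogHandler(socketserver.BaseRequestHandler):
    """Appends every datagram to LOG_PATH so it survives a router reboot."""

    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        data, _sock = self.request
        line = format_record(
            self.client_address[0], data, datetime.now(timezone.utc)
        )
        with open(LOG_PATH, "a", encoding="utf-8") as f:
            f.write(line + "\n")


def serve(host: str = "0.0.0.0", port: int = 5514) -> None:
    """Listen until interrupted. Port 5514 is an assumption (no root needed)."""
    with socketserver.UDPServer((host, port), SyslogHandler) as server:
        server.serve_forever()
```

Calling `serve()` starts the listener. With something like this running, the entries leading up to the OSPF/BGP drop would still be on the collector after the router is rebooted, which may help when deciding what to attach to a ticket alongside the supout.rif.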
Thank you sir, though we don't have any wireless features enabled on either 1072. The last thing I did was delete files on my HDD, since I'd noticed it had reached 80% utilization, which leaves me 20% free space. Could that be a factor? I mean, could 80% utilization on my HDD cause the router to hang or stop working? As I've noticed, every time it happens the uptime doesn't reset, so technically the router is still UP; it's only my BGP and OSPF neighbors that break, and they recover after the reboot.
Those errors are exactly what I see on CCR1009/1016 in the access network when the wireless links cause the neighbours to drop. On one side the neighbour comes up in 'Full' state, but the other cycles through the OSPF FSM in the way you've shown. I have to reboot the one that thinks it's Full to bring the neighbours back up correctly. As you don't have wireless links I can't say why it might be happening, but the symptoms seem identical.
Again, no config changes on our CCRs before this happens. If the wireless signals are tuned to a strong level, the problem disappears. That suggests to me the cause is a bad link, but the CCR must have a bug somewhere that stops OSPF from forming correctly again. I've tried using different OSPF link types (broadcast, nbma, ptp, ptmp) and none of them have solved the issue. I've resorted to a script to automatically reboot the router that thinks it's 'Full', but in the core/edge I don't see how you could do this.
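To make the "reboot the side that thinks it's Full" idea concrete, here is a rough sketch of just the decision logic, written as if a monitoring host were polling both ends of the adjacency. It is not the script mentioned above. The function name, the sample format, and the threshold of five polls are all assumptions; how the neighbour states are collected from the routers, and how the reboot is triggered, are left out entirely.

```python
FULL = "Full"


def stuck_full_side(samples, min_samples=5):
    """Pick the side that looks wedged in Full, if any.

    samples: list of (state_seen_by_a, state_seen_by_b) tuples, oldest
    first, where each state is an OSPF neighbour state name such as
    "Init", "ExStart" or "Full".

    Returns "a" or "b" when that side has reported Full in each of the
    last min_samples polls while the other side never reached Full in
    the same window; otherwise returns None.
    """
    if len(samples) < min_samples:
        return None  # not enough history to judge

    recent = samples[-min_samples:]
    a_states = [a for a, _ in recent]
    b_states = [b for _, b in recent]

    if all(s == FULL for s in a_states) and all(s != FULL for s in b_states):
        return "a"
    if all(s == FULL for s in b_states) and all(s != FULL for s in a_states):
        return "b"
    return None
```

Requiring several consecutive polls is meant to avoid rebooting during a normal adjacency bring-up, where one side can briefly reach Full before the other. In the core/edge you would probably want this to raise an alert rather than reboot anything.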
In ROS v7 BGP and OSPF are separate processes, so that may improve things a bit.