I have a mining rig that ran non-stop for 2 days without any problem. Then this problem suddenly started: whenever I turn the rig on, it works fine for 15 to 30 minutes, and then mining stops by itself and doesn’t resume. The GPU fans and lights stay on, and from the outside everything else looks like it is working.
After checking the log files and using remote debugging tools, I could see that most of the GPUs (I have Sapphire Nitro+ OC Radeon RX 570, 4GB GDDR5, Elpida memory) would freeze, including the one my monitor is connected to (via HDMI cable).
1. If it never worked, not even for a day since you bought it, then it might be a power supply issue. Check that your PSU provides enough watts for your GPUs and other components.
2. If it worked fine for some days and the problem started suddenly, you can do 2 things:
i. Remove ALL the cables. When reconnecting, make sure you do not connect more than 2 devices to a single SATA power cable.
ii. Use your regular vacuum cleaner to clean all the slots (RAM slots, PCI slots, etc.) and remove the dust properly. Then reconnect all the cables and components.
1. Do not blow air, because it can sometimes introduce moisture. Use suction instead to pull the dust off the motherboard and out of the slots.
2. Do not use any fiber or wiping cloths, as these might damage the small solder joints. Some fiber cleaning materials also carry a static charge, so avoid those too.
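The power supply and SATA cable checks from the list above come down to simple arithmetic, so here is a rough sketch of them in Python. The wattage figures, the 80% safety margin, and the cable names are my assumptions for illustration only. Replace them with the numbers from your own PSU label, GPU specs, and wiring.

```python
# Rough PSU headroom check. All wattage figures used here are assumed
# placeholders, not measurements; substitute your own numbers.

def psu_headroom(psu_watts, gpu_watts, gpu_count, other_watts, safety_margin=0.8):
    """Return (estimated_draw, usable_capacity, ok).

    safety_margin is an assumed rule of thumb: only plan to use this
    fraction of the PSU's rated output.
    """
    estimated_draw = gpu_watts * gpu_count + other_watts
    usable_capacity = psu_watts * safety_margin
    return estimated_draw, usable_capacity, estimated_draw <= usable_capacity


def overloaded_sata_cables(devices_per_cable, limit=2):
    """Return the names of SATA power cables feeding more than `limit` devices."""
    return [name for name, count in devices_per_cable.items() if count > limit]


if __name__ == "__main__":
    # Hypothetical rig: 6 GPUs at an assumed 150 W each, 100 W for the rest.
    draw, usable, ok = psu_headroom(psu_watts=1000, gpu_watts=150,
                                    gpu_count=6, other_watts=100)
    print(f"estimated draw: {draw} W, usable: {usable:.0f} W, enough: {ok}")

    # Hypothetical wiring: cable name -> number of devices plugged into it.
    print("overloaded cables:", overloaded_sata_cables({"sata_1": 2, "sata_2": 3}))
```

If the check says the PSU is not enough, or a cable shows up as overloaded, fix that first before suspecting the GPUs or risers.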
1. Turn off the rig and unplug the power cord when it is raining with lightning and thunder.
2. Turn off the rig when there are voltage fluctuations.
3. Increase the fan speed and provide proper ventilation if the GPUs are getting too hot. Also set the rig to shut down if the GPU temperature reaches a certain point.
4. Always use branded, high-quality cables and power cords.
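The shutdown-on-temperature precaution above can be automated. Below is a minimal sketch, assuming the rig runs Linux with the amdgpu driver, which normally exposes GPU temperatures in millidegrees Celsius under the sysfs path used here. The 80 °C limit and the 30-second polling interval are values I picked for illustration, not recommendations from any vendor. Many mining tools and operating systems have their own temperature limit settings, so treat this only as one possible approach.

```python
import glob
import subprocess
import time

TEMP_LIMIT_C = 80   # assumed threshold; choose one suited to your cards
POLL_SECONDS = 30   # assumed polling interval

# Assumed sysfs location of amdgpu temperature sensors on Linux.
SENSOR_GLOB = "/sys/class/drm/card*/device/hwmon/hwmon*/temp1_input"


def read_gpu_temps(pattern=SENSOR_GLOB):
    """Read every matching sensor file and return temperatures in deg C."""
    temps = []
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as sensor:
                temps.append(int(sensor.read().strip()) / 1000.0)
        except (OSError, ValueError):
            continue  # skip sensors that cannot be read right now
    return temps


def should_shut_down(temps, limit=TEMP_LIMIT_C):
    """Return True if any GPU is at or above the limit."""
    return any(temp >= limit for temp in temps)


def watch():
    """Poll the sensors and power the rig off once the limit is reached."""
    while True:
        if should_shut_down(read_gpu_temps()):
            # Needs root privileges; powers the machine off immediately.
            subprocess.run(["shutdown", "-h", "now"], check=False)
            return
        time.sleep(POLL_SECONDS)
```

To use it, call `watch()` from a script started as root when the rig boots. The decision logic is kept in `should_shut_down` so you can try it with made-up temperatures before trusting it with a real shutdown.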
If the problem persists, remove all the GPUs, connect only 1 GPU, and check the rig. Repeat this, connecting the GPUs one by one. This way you can identify the faulty GPU or riser and fix the problem by fixing that component.
My rig started working properly again after I removed all the cables, blew some air from my vacuum cleaner to clean the dust (from the RAM slot and PCI slots), and reconnected all the components.