APT Big Daddy

Part 2: Card 1 – Daddy’s got a brand-new Card

Updated: Sep 21, 2022


Nvidia 2080 Super Founders Edition

Price Paid: $200

Low voltage on the 5v Rail

Current Project Cost: $200


Issue reported by the seller: “I think I shorted one of the MOSFETs by accident when I was cleaning my case. It was previously cooled with an AIO nzxt g12 bracket before I put it back together. It does not get detected by windows and the fans are not spinning.”

Ok, card number 1, or at least what I'm calling card number 1. This card was an interesting one, as I’d never had to do any sort of soldering with chips as small as these in the past, so I knew I was going to learn a lot. Given what the seller had stated, I knew there was a high likelihood of a voltage issue somewhere, but I initially assumed it would be in the common voltage areas like the 12 volt rails. Amazingly, those rails fail a lot more often than we would hope for something so expensive.


So, as with any repair project, the first thing I did was attempt to replicate the issue. I popped the card into the test bench, and it did exactly as the seller said: no fan spin and not detected by Windows. Now, this could be a multitude of issues, so I wanted to see if the test bench BIOS was even seeing the PCI-e slot being occupied by the card. I restarted, booted into the BIOS, and to my surprise, the BIOS was showing the slot as empty.


This made my heart sink a bit, because if the card wasn’t being recognized it could be an issue with one of the PCI-e lanes being shorted or something of that nature, and I had no clue how to troubleshoot something like that at the time. Either way, I knew it was time to take apart the card.


When I did, it was quite obvious the previous owner had tried to clean it up and cool it down with “aftermarket” thermal pads and paste. This board was so caked with thermal paste and pad residue that I swear I went through almost a quarter of my bottle of 99% Iso (isopropyl alcohol) cleaning it. The paste was literally coming off in chunks. Look, I get it, better safe than sorry, but you need good contact with the heatsink, not 3 inches of paste between it and the chip. Anyways, once I got it all cleaned up it didn’t look too bad.



It wasn’t exactly fully cleaned, but it was a lot prettier than it was. After a full inspection of the card under a microscope turned up no obvious physical damage to any of the components, it was time to move on to the troubleshooting steps. Since the card wasn’t being detected in the slot, I wanted to make sure there wasn’t a short to ground on the 12 volt fingers (pins 1-3) or the 3.3v finger (pin 8) on the PCI-e connection. To test this, I set my multimeter to continuity mode and placed my black lead on the shielding of the HDMI port (which is ground) and then the red lead on each of the PCI-e fingers, 1 through 3 and then 8. None of them gave me a beep, so thankfully there wasn’t a fault to ground here. Then, just to make sure my multimeter was good, I put the red lead on pin 4 (a ground finger) and got that satisfying beep.
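If it helps to see that check written down, here is a minimal sketch of the logic in Python. The function name and the dictionary of beep results are made up for illustration; the pin numbers are the ones from the test above, and the readings are typed in by hand, not read from a meter.

```python
# Power fingers checked above: pins 1-3 carry 12v, pin 8 carries 3.3v.
POWER_FINGERS = {1: "12v", 2: "12v", 3: "12v", 8: "3.3v"}
GROUND_REFERENCE_PIN = 4  # known ground finger, used to sanity-check the meter


def shorted_fingers(beeps):
    """Return the power fingers that showed continuity to ground.

    `beeps` maps a finger number to True if the meter beeped on it.
    (Hypothetical helper, not part of any real tool.)
    """
    # If the ground finger does not beep, the meter or leads are suspect,
    # so none of the other readings can be trusted.
    if not beeps.get(GROUND_REFERENCE_PIN, False):
        raise ValueError("no beep on the ground pin: check the meter or leads first")
    return [pin for pin in POWER_FINGERS if beeps.get(pin, False)]


# What I saw on this card: no beeps on 1-3 or 8, a beep on pin 4.
readings = {1: False, 2: False, 3: False, 8: False, 4: True}
print(shorted_fingers(readings))
```

An empty list means no power finger is shorted to ground, which is what this card showed.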



That was a relief for the time being, but that also meant there was a bigger issue on the board itself. So as a quick test to see if the board was even powering up, I plugged the GPU back into the test bench and left the heatsink off. I wanted to see if the GPU even got hot. But when I turned on the bench it was not heating up at all, which should have been something that occurred almost immediately. Even at idle these things sit close to 38C with a heatsink on, and that’s at boot, so I should have felt something on my fingertips.


That little test narrowed our issue down tremendously to only a few areas, though, and was possibly a good sign that the GPU itself wasn’t damaged by the previous owner. To understand why, I should explain how NVIDIA cards, specifically the Turing generation, boot up. First, the whole card is continuously powered through the 12v rails, which are supplied by the PCI-e slot and the additional 6 and 8-pin connectors at the top. This rail supplies power to most of the card and to the other rails. In addition to the 12v rail there is the 3.3v rail, which is the 8th pin on the first section and pins 9 and 10 on the second section of the PCI-e connector. This rail is used to enable voltage to some of the integrated circuits (ICs) like the card's BIOS, so to my understanding it’s always applied as well when the card is plugged in. When the card boots, the 12v rail provides power to the 5v rail, which is responsible for turning on the rest of the card (it also supplies power to the USB-C controller, which is only on some cards). The 5v rail first powers the 1.8v rail, which is responsible for low-level logic inside the core and for powering up the BIOS chip(s). Once that’s booted, power is supplied to the vCore, then to the vMem, and finally the PEX. Once that’s all completed, the board is booted.
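The boot order described above can be sketched as a simple ordered list. This is only my understanding of the sequence written out in Python, not anything from NVIDIA documentation, and the function name is made up for illustration.

```python
# Power-up order as described above (my understanding, not an official spec).
BOOT_ORDER = ["12v", "5v", "1.8v", "vCore", "vMem", "PEX"]


def rails_blocked_by(failed_rail):
    """List the rails that never come up if `failed_rail` fails.

    Each rail depends on the one before it, so everything later in the
    sequence stays dead.
    """
    idx = BOOT_ORDER.index(failed_rail)
    return BOOT_ORDER[idx + 1:]


print(rails_blocked_by("5v"))
```

So a dead 5v rail would leave the 1.8v, vCore, vMem, and PEX rails with nothing, which lines up with a core that never gets warm.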



Since we weren’t getting any activity to the core, as far as the heat was concerned, and we weren’t getting any detection from the BIOS there was a strong possibility that the issue existed between the 12v input/rail and the 5v rail.


To figure out where the issue was, there were two ways I could go about it. The hard way was using my DC variable power supply: I would inject 12 volts into the 12 volt rail and then trace the voltage through the rail with my multimeter to find where the power wasn’t getting past. I don't know if you have ever had more than 2 leads on your workbench before, but it’s an ADHD nightmare. Option two: I could just plug the card back into the test bench and measure the rail voltages there, where we already know the power is clean and the connection is stable. I already knew that the GPU wasn’t getting hot, so there wasn’t really any extra risk in going down this route.


So, I plugged it back into the bench, turned on the machine, and started to measure voltage to test my theory on why the card was not turning on.


First is the voltage on the 12v rail.


12v-EXT 1 coil = 12v – Ok this was a good sign the power was getting 12v to the first 12v coil

12v-EXT 2 coil = 12v – Alright here we go good sign, 12 volts from the 6 and 8-pin connectors seem to be good to go

12v-BUS coil = 12v – Hey! Great sign the 12v seems to be good to go.


Now onto the 5v rail


5v coil = 1.2v – Hey, this is confirming my theory: something is going on here that is preventing the card from powering on


Just to make sure I was right, I had to check the 1.8v, vRam, vCore, and Pex


1.8v coil = No Voltage

vRam = No Voltage

vCore = No voltage

Pex = No voltage


Alright, alright, alright, we found the probable issue.


Now that I had narrowed the issue down to the 5v rail, I was going to have to do some digging on what exactly the 5v rail's parts were because, as with just about any other electronic component, I had no idea how to look up the numbers printed on them.
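Written out, the walk down the rails looks something like the sketch below. The measured values are the ones from my readings; the 10% tolerance and the function names are my own assumptions for illustration. I only put nominal voltages on the rails whose values are in their names, and for the rest I just check whether any voltage is present at all.

```python
TOLERANCE = 0.10  # assumed acceptance band, not a spec value

# (rail, measured volts, nominal volts or None if only "any voltage" is checked)
READINGS = [
    ("12v-EXT 1", 12.0, 12.0),
    ("12v-EXT 2", 12.0, 12.0),
    ("12v-BUS", 12.0, 12.0),
    ("5v", 1.2, 5.0),
    ("1.8v", 0.0, 1.8),
    ("vRam", 0.0, None),
    ("vCore", 0.0, None),
    ("Pex", 0.0, None),
]


def rail_ok(measured, nominal):
    """A rail passes if it is within tolerance of nominal, or simply live."""
    if nominal is None:
        return measured > 0.0
    return abs(measured - nominal) <= TOLERANCE * nominal


def first_bad_rail(readings):
    """Return the first rail, in the order measured, that fails its check."""
    for name, measured, nominal in readings:
        if not rail_ok(measured, nominal):
            return name
    return None


print(first_bad_rail(READINGS))
```

The first rail to fail is the one worth chasing, since everything after it never gets the chance to power up.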



Seriously, if anyone knows how to look up chip parts like the ones marked here, let me know, because I really had to dig hard to find information. Anyways, after that digging, I found that the chip left of the coil marked BFRJ 720 (sounds like a skate trick if I’m being honest) was an MPS MP28167 Integrated Buck-Boost Converter with fixed 5v Output. That's a mouthful and I’m too lazy to type it again, so I’m going to call it the controller chip. Thankfully the controller chip’s datasheet, www.farnell.com/datasheets/3180384.pdf, was as informative as it should be, and that let me test the chip to see if there was a fault with it or if I had to dig elsewhere.


Looking at the datasheet, I formulated a plan to find out which pin on the controller was faulted, if any. The idea here was to check the supply voltage, aka IN, Pin 1, which should be at 12 volts; the on/off switch, aka EN, Pin 3, which should be at around 2 volts while running; and finally the output, aka Vcc, Pin 9, which should have a constant voltage of 5 volts.



So once again I booted the bench back up and began testing.


Pin 1 IN = 12v

Pin 3 EN = 2v

Pin 9 Vcc = 1.2v


Ayyyyyyyy we found the issue. This bad boy probably got burnt out somehow. As a quick check, I tested all the capacitors around the 5v rail just to be sure none were shorted to ground, and it all came back green. So now it was time to go out and buy a new controller chip.

I ended up buying 5, which cost me $15.85. I’ll round that up to $16 so I don’t have to do a whole lot of math later to figure out the cost. Why 5? No clue. Hopefully I don’t run into this issue again, but just in case.
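For anyone who wants the reasoning spelled out, here is a rough sketch of how I got from three pin readings to "it's the chip". The voltage thresholds and the function name are my own illustrative guesses, not values from the datasheet.

```python
def diagnose(vin, en, vout, output_shorted):
    """Rough decision logic for the 5v controller chip.

    Thresholds are illustrative assumptions, not datasheet limits.
    """
    if vin < 11.0:
        return "input supply problem"
    if en < 1.5:
        return "chip is not being enabled"
    if vout >= 4.75:
        return "converter looks healthy"
    if output_shorted:
        return "short on the 5v rail is pulling the output down"
    return "converter chip is the likely fault"


# My readings: IN = 12v, EN = 2v, Vcc = 1.2v, and no shorted caps on the rail.
print(diagnose(12.0, 2.0, 1.2, output_shorted=False))
```

Good input, good enable, bad output, and nothing shorted on the rail leaves the chip itself as the most likely culprit.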


Once I got the chips in, I needed to remove the one currently in place. Now, I’ve had a hot air gun (a Quick 261dw to be exact) for a while, but I’ve never really used it, so this was a fun little experiment. Though I’ll tell you what, the people doing repairs on YouTube who say “I’m not going to tell you my settings because it won’t help you” are a bunch of gatekeepers. I found that setting my heat gun to 385C with a speed of 100 for 55 seconds was enough to melt the solder and get ICs off an old motherboard for practice, but when I thought I'd had enough practice and went to do the actual repair, let me tell you, I hadn't.


Right off the bat, with the hot air set to 385C at 100 speed for 1 minute, I noticed that this controller chip was not budging. Not even a micrometer. I figured maybe I needed to keep the heat on it for longer, so I ran over the chip for 2 minutes. Still absolutely no movement. I decided to go back to the drawing board of "YouTube University" and see if anyone had suggestions. After some video watching, I determined that I needed to up the temperature and make sure not to keep it there for long. I pushed the bad boy up to 400C, still at a speed of 100, and after a minute my tweezers slipped and my heart sank. Not only did the chip not budge, but the damn thing now had a gash on the top, meaning I was going way too hot, AND one of the SMD capacitors was gone. This was the most “oh fuck” mistake I could have made. Since I didn’t want to risk the missing SMD capacitor becoming another failure point, I took one from near a similar chip on my practice motherboard and put that bad boy in its place.



Now that I had totally fucked the 5v buck converter and had no real idea how to remove it, I went back to “YouTube University” to watch some more videos, more specifically from Learn Electronics Repair, whom I should have just watched in the first place. By the way, someone give that man a medal. He is so knowledgeable and shares everything he knows for free, it's great. Love that dude (https://www.youtube.com/c/LearnElectronicsRepair). Anyways, one of the tools he recommended for a stuck IC was a pre-heater. Specifically, something like https://www.amazon.com/gp/product/B07H3NGNND/ref=ppx_yo_dt_b_asin_image_o00_s00?ie=UTF8&psc=1. That’s another $63 being added to the budget.


Worse comes to worst I’ll have a spare parts card, something I was trying to avoid, but it doesn’t appear I really screwed anything up aside from that SMD capacitor and the already screwed up controller chip. So, for now, while I wait for the pre-heater I’m going to put this one back in the ESD bag and move on to the next card.


Issue: Low voltage on the 5v rail
Resolution: To be continued
JunkRat’s Current Cost: $279
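
For anyone checking my math, the running total works out like the sketch below. The item labels are just my own shorthand, and each line is rounded up to the nearest dollar the same way I rounded the chips.

```python
import math

# Spend so far on this card, in dollars (labels are just shorthand).
costs = {
    "2080 Super card": 200.00,
    "5x controller chips": 15.85,
    "pre-heater": 63.00,
}

# Round each line up to the next whole dollar, then add it all together.
total = sum(math.ceil(amount) for amount in costs.values())
print(total)  # 279
```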