github.com/wbrown/gpt_bpe@v0.0.0-20250709161131-1571a6e8ad2d/resources/test_references/753.txt 1 *** 2 [Name: #hardware; Description: None; Guild: KoboldAI] 3 *** 4 Riehl: [File URL attached] 5 Sweet: Pinned a message. 6 gold9: I built a machine to run GPT-J and GPT-2 XL through KoboldAI. The motherboard I used was the HUANANZHI X99 F8. I tunneled air to the card using a washing machine tube and a 3D printed socket with a 120mm turbo blower fan, along with a Tesla M40, Xeon 2650, and 128GB REG ECC. Display: Display outputs are not available on Tesla cards, which are designed for datacenters. My first attempt involved pairing a Tesla M40 (24GB) with a GeForce 710 (2GB). Due to the Nvidia ecosystem driver division, these cards do not operate on the same platform. GeForce drivers may seem to work alongside datacenter drivers, but there will be fringe issues. When I tried to boot the Tesla with a GeForce 710, I encountered all kinds of errors. After installing the Tesla with the 710, I could only boot once; on the second boot it switched to the Tesla and I got an error message from the X99 F8. In order to prevent any conflict between the driver and motherboard, I chose the ATI 4600 from 2008. The job can, however, be done with any GPU from AMD, Intel, or Nvidia Quadro. Cooling: In my experiment, I used a 120mm blower solution that is suited for burst loads only. It was perfect for working with KoboldAI. I was able to run the fans at slower speeds due to their larger size, and the pitch of a 120mm is much quieter than direct-mount twin 36mm server blower fans. Later I will probably switch back to the twin 36mm configuration with a PWM speed controller to make training and deployment easier. 7 gold9: [Image attached] 8 Sweet: Sounds promising for me then, i got the Vega 64 so an M40 should play along 9 gold9: it was a really bad attempt to pair it with a GeForce! 
Vega 64 is great 10 Sweet: I might yolo it 😛 11 Sweet: Time to shine a flashlight in my case to see if it fits 12 gold9: The GPU market price is skyrocketing at the moment. Vega 64 costs 700 last time I checked 13 Sweet: I bought the Vega 64 for 650 but last time i checked its selling over $1000 now 14 Sweet: But i bought it new for 650 15 Sweet: For the M40 it should work with 'Above 4G Crypto Mining support' right? 16 gold9: You can hunt them from Aliexpress. I wouldn't do it but you can 🤣 17 Sweet: Can't access my bios, the runtime 7z is still uploading 18 gold9: I didn't test it yet but it should 19 gold9: What do you have for a motherboard ? 20 Sweet: [File URL attached] 21 gold9: 3000 series Ryzen ? 22 Sweet: 1000 23 Sweet: Its from 2017 24 Sweet: 3000 is supported but not in it 25 gold9: They are still way better than an outdated Xeon. I only got them because they support high REG ECC 26 Sweet: And its overclocked to 3,9GHz 27 Sweet: First gen ryzens have terrible stock clocks but a lot of overclocking headroom 28 Sweet: I could go from 3,4GHz to 3,9GHz without increasing the voltage 29 Sweet: So the difference with the 2nd gen is smaller 30 Sweet: These days Ryzens kinda auto overclock, so they run much faster but barely any overclocking gains 31 gold9: They are a very efficient node. I will always choose the AMD AM4 socket over the Intel 2011v3 any day. 
I bought two motherboards as 2011v3 had a lot of issues 32 Sweet: My motherboard has a lot of issues too unfortunately 33 Sweet: All Ryzens did 34 Sweet: But the problem got so noticeable and terrible at the 5000 series they actually fixed it 35 gold9: I have another machine with B450 but it is an OEM Asus motherboard 36 gold9: I have issues with USB 37 gold9: Razer keyboard weirdly disconnected during boot 38 Sweet: USB + Audio for me 39 gold9: I tested it with another machine but I read that Ryzen X370, B450 and B550 have issues with USB 40 Sweet: Some USB disconnects happen for me if things are turning on, and in Linux the audio controller is all wrong despite working fine on intel 41 Sweet: Then they released that Ryzen 5000 bios update and those users reported it was fixed 42 Sweet: So it seems like its not the USB thats the issue, but the PCI controller itself 43 gold9: I thought Ryzen had the USB controller built inside the Ryzen processor 44 Sweet: Only for some ports 45 Sweet: Those are typically the ones people advise to use 46 gold9: I will try testing with different ports to see if it resolves the issue 47 Sweet: Typically the ones closest to the CPU 48 Sweet: Use a tool like HWInfo to check which controller they are on 49 gold9: I am using a USB switch to switch the USB on and off again 🤣 50 Sweet: For me with some USB drive enclosures the external HDD resets when my monitor comes on or when certain setup programs are run 51 gold9: I can not see USB listed in HWInfo 52 Sweet: Its under ports 53 Sweet: [Image attached] 54 Sweet: For me the USB 3.10 is the motherboard one 55 Sweet: I mean CPU 56 gold9: I don't see the USB option 57 Sweet: Thats the sensor tab 58 Sweet: You want the main menu 59 gold9: How to open the main menu ? 
There are only settings with the same options listed in it 60 Sweet: Open HWInfo then don't check any of the sensors-only checkboxes and stuff 61 Sweet: It will open a ton of screens 62 Sweet: Including the main screen 63 Sweet: [Image attached] 64 gold9: oh thx yeah, I see it now 65 Sweet: Hard to tell in your case 66 Sweet: But one of them is the CPU so if one is giving issues try the other one haha 67 Sweet: I assume the top one is the CPU for you 68 Sweet: In my case its only like 4 ports on the back and the front ports 69 gold9: I have 6 ports. Two are USB 3.1 and the others USB 3 70 Sweet: This says all of them are USB 3.1 71 Sweet: But your GPU apparently has USB ports too, so you could always try plugging the razer into your GPU 72 gold9: Probably 3.2 and 3.1 then haha They are color coded differently in the back, blue and green 73 gold9: I am kind of lost with all the new USB color codes 74 Sweet: Maybe to differentiate which ones are on the CPU? 75 Sweet: Green isn't anything i know of 76 Sweet: Blue is USB3.X 77 gold9: It is an Asus GL10DH with the B450 chip 78 gold9: I assumed it had the default port option of B450 79 gold9: It said "AMD Turbo USB 3.2 GEN2" 80 gold9: [Image attached] 81 gold9: They are all 3.2 but some Gen 1 and others Gen 2 82 Sweet: Decided to roll with it and try my luck on this 83 Sweet: Prices already went up slightly and i don't think that trend is going to stop as i am noticing the M40's go pretty rapidly on ebay 84 Sweet: So i bought myself an M40 😄 85 Sweet: Motherboard supports PCI Gen2, and Above 4G support 86 gold9: I paid 360$ with shipping for the Tesla M40 24GB. I am not in the US right now, but you can get it on eBay for ~270$. The Tesla M40 performs similarly to the 1060 super and 1070. There are a lot of people using it to play games as a cheaper alternative with all the crazy pricing. 
So, it will always be worth something and is a good investment 87 gold9: [File URL attached] 88 Sweet: Yup thats the one i bought 89 Sweet: I did have to pay around $100 in shipping + import tax though 90 gold9: You should get https://www.aliexpress.com/item/4000505282893.html?spm=a2g0s.9042311.0.0.67a44c4dbJGrHc 91 gold9: The original GPU server bracket did not fit into my Corsair case so I had to replace it 92 gold9: You will also need an 8pin to Dual 8pin Power Cable: https://www.aliexpress.com/item/32692411159.html?spm=a2g0s.9042311.0.0.67a44c4dtGt6h6 93 Carena: Saving up for my own hardware... Stuff is expensive 😦 94 gold9: world-wide shortage 💀 The same build would cost me no more than 600 USD in 2019. There were a lot of Intel Xeon workstations selling for 200 and 300 with 128GB ram and a Tesla M40; the total cost would be around 700$ max, as everyone was switching to AMD Ryzen. To this day, a 650W Corsair power supply, case, motherboard, 256GB SSD, and old Haswell Intel Xeon 2650 with 10 cores will cost you 300. The prices are only high on GPU and memory. I paid 480$ for 128GB of ECC REG RAM, and 360$ for the GPU. It was possible to find ECC REG RAM on ebay for 280$ and the GPU for 200$ with shipping. 95 gold9: Best way to do it is by getting an old workstation or PC and pairing it with an M40 96 Carena: I have a server, but that one is just a nono when it comes to hardware. 97 Carena: Also have an Nvidia Jetson AGX that works really well for inference 🙂 98 Carena: I use it to run colab on 😄 99 gold9: I was planning to get an Nvidia Jetson AGX but they cost as much as a whole build right now 🤣 100 gold9: that 30W is just amazing and great for 24/7 operation. I have to make a box for my build 101 Carena: I know, one of the reasons why I said I wanted to have a "cloud"-based Colab 😛 102 Sweet: Do i need that power cable for the M40? 
Since it needs 8-pin + 6-pin 103 Sweet: Looks like it, ordering one 104 gold9: Yes, I thought of retrofitting it, but the 8 pin cables that came with the power supply are a bit thick and the 3d printed fan mount is in a tight space to reroute the cable under it. You would have to get slim fit cables. Retrofitting was not worth the trouble. 105 gold9: What type of power supply do you have? When using the Vega 64 and M40 on the same machine, you may require a beefy power supply upgrade! 106 Sweet: Ill be fine 😄 107 Sweet: The Vega 64 uses 3W in desktop mode 108 Sweet: And i won't be gaming while running the tesla 109 Sweet: I assume the tesla also only draws power when its used 110 gold9: When used with KoboldAI the Tesla draws ~30W 111 Sweet: Thats nothing then 112 Sweet: How about idle? 113 gold9: 15W 114 Sweet: Thats pretty high idle draw then 115 Sweet: But nothing for me to worry about i got plenty of headroom 116 gold9: If I recall correctly 117 Sweet: The PSU is 850w 118 Sweet: And its not a budget PSU either 119 gold9: I just lost a machine to a junky power supply 120 gold9: but yeah, you will probably be fine! 850W is enough at max 121 Sweet: For sure 122 Sweet: My Vega is a hungry card 123 Sweet: It can peak at 300W 124 gold9: 300W for the tesla, 300W for the Vega 64 and the 1700x will probably max at 85W 125 Sweet: I originally had some system crashes with 650W (Turned out to be heat not power) so i one upped 126 Sweet: Ill max the 1700X and check 127 gold9: Is it overclocked ? 128 Sweet: Yup 129 Sweet: But not overvolted 130 gold9: Does B350 support 5000 series ? 
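As a side note on the idle-draw numbers above (~15W idle, ~30W under KoboldAI), here is a minimal sketch of what a constant draw adds up to over a month of 24/7 operation. This is pure arithmetic using the wattages quoted in the chat; the 30-day month is an assumption for illustration.

```python
# Convert a constant power draw (watts) into monthly energy use (kWh),
# assuming the machine runs 24/7 for a 30-day month.
def monthly_kwh(watts, hours=24 * 30):
    return watts * hours / 1000  # Wh -> kWh

print(monthly_kwh(15))  # -> 10.8  kWh/month at the M40's ~15W idle
print(monthly_kwh(30))  # -> 21.6  kWh/month at ~30W under KoboldAI
```

Multiply by your local electricity rate to see why low idle draw matters for an always-on box.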
131 Sweet: Package power 163w 132 Sweet: You will have to check your specific motherboard 133 Sweet: But generally speaking no 134 Sweet: But yeah CPU package power is 166W 135 Sweet: So fans, soc, cpu, etc 136 gold9: I recall some post about some B350 motherboards supporting it 137 gold9: 166W(CPU)+300W(Tesla)+300W(Vega) = 766W 138 Sweet: Plus a little bit for everything else of course 139 Sweet: But at that point my cooling would probably give out 140 gold9: Hopefully when I upgrade, the Nvidia Quadro RTX 8000 will be available for the same price as the Tesla M40 🤣 141 gold9: Did you order fans for it ? 142 Sweet: Nope 143 Sweet: Going to run it without a fan first 144 Sweet: Especially if it only draws 30w when using kobold 145 Sweet: For context i got a high airflow case 146 Sweet: [Image attached] 147 Sweet: So with a bit of luck those front fans are enough 148 Sweet: They produce a lot of airflow when maxed out 149 gold9: I am booting it to check what the temp is without the fan 150 gold9: [Image attached] 151 gold9: With the GPT-J model 152 gold9: But the temp rises up gradually 153 Sweet: Do they also go down gradually? 154 gold9: I don't see it going down 155 gold9: [Image attached] 156 gold9: Check the hot spot 157 Sweet: Its not a really bad temp though 158 Sweet: Hot spot yes, but general gpu temp no 159 gold9: I think it is fine, its rated for up to 90C 160 Sweet: Wonder why it draws 60w doing this though 161 Sweet: Its not really doing anything 162 gold9: nothing 163 Sweet: Really bad power draw for a home PC 164 Sweet: But oh well 165 gold9: I feel if I leave it with no fan it will reach 90C and trigger the temp safety in the board 166 Sweet: Seems to be the memory controller though 167 Sweet: Probably in your case 168 Sweet: But in my case the heat will rise through my PC and possibly trigger the front fans 169 Sweet: Can you exit kobold and see what happens? 
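The PSU-budget arithmetic a few messages up (166W + 300W + 300W = 766W against an 850W supply) can be sketched as a tiny helper. The function name and the fixed 50W overhead for fans/drives/board are my own illustrative choices, not from the chat; the load figures are the ones quoted.

```python
# Remaining PSU capacity after summing the big component loads plus a
# fixed overhead for "everything else" (fans, drives, motherboard).
def psu_headroom(psu_watts, loads, overhead_watts=50):
    return psu_watts - sum(loads) - overhead_watts

loads = [166, 300, 300]  # 1700X package power, Tesla M40 peak, Vega 64 peak
print(sum(loads))                # -> 766, matching the chat's sum
print(psu_headroom(850, loads))  # -> 34 W left over
```

With only ~34W of modeled headroom at simultaneous peak, the chat's conclusion (fine, as long as gaming and the Tesla aren't maxed at the same time) follows directly.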
170 gold9: [Image attached] 171 gold9: It drops to 20W 172 Sweet: Much better 173 Sweet: So its the memory 174 Sweet: I bet the temps will become a lot better now 175 gold9: you can tunnel the air out of one of the fans to the card 176 Sweet: Yup but all the airflow in the case is as wide as the case and very direct 177 Sweet: So i think just having the big fans on when i use it should be enough 178 Sweet: How much is it dropping for you now kobold's off? 179 gold9: Just reopened Kobold with the fan at max to check the maximum consumption when I connect to the client 180 gold9: Even with my setup, it started to heat up when I opened the client and cooled down when Kobold was idle 181 gold9: I am using a Corsair 4000D which is well ventilated 182 Sweet: How did you rig yours up? 183 gold9: [Image attached] 184 Sweet: Fans on the front or solid panel on the front? 185 gold9: One fan in the front and one in the back while the tesla uses a 120mm blower to cool down 186 gold9: But the whole top of the case is open 187 Sweet: Mine is more directed airflow then 188 Sweet: Given my giant fans on the front 189 Sweet: It will be blowing directly against the card 190 gold9: the case being open at the top causes the airflow to be less directed. So, I am kind of forced to use direct airflow with the card. 191 Sweet: Mine is also open at the top 192 Sweet: But i'm pretty sure it can also be closed 193 Sweet: But since its two fans ill probably be able to get away with it 194 Sweet: Its going to be hit by the bottom fan, while the top fan handles the CPU area 195 gold9: After I receive my 36mm fan I will check another setup this week 196 gold9: I got the chance to run GPT-J with the current setup to write me some HTML, C, Java and Kotlin. It is unlikely that it will require a heavy cooling system. 197 gold9: They look like 140mm fans 198 Sweet: I think they are 199 Sweet: 200mm 200 gold9: Is this full tower ? 
201 Sweet: Yup 202 gold9: damn those are humongous fans lmao 203 Sweet: Which is why i didn't order fans for the card haha 204 Sweet: When these are on full blast you literally hear the air move 205 gold9: Would love to see the final build! The 36mm fans are super loud considering the high RPM. You have the advantage of running lower RPM with higher air flow 206 Sweet: Not just that 207 Sweet: On full blast the fans aren't even that loud 208 Sweet: You just hear wind 209 Sweet: Its 1226rpm peak 210 Sweet: Although that might have been my back fan 211 Sweet: I don't think these report an rpm 212 gold9: I was considering two 200mm noctua fans for the top ventilation 213 Sweet: Hows your K80 @Valerian ? 214 Riehl: still waiting on the power cable that combines two PCI power cables. >:/ 215 Sweet: Also ordered one of those at the same time with the card 216 Riehl: and I need a GPU that fits an X1 slot.. 217 Riehl: I did two, but they were from different sellers. 218 Riehl: got your M40 running? 219 Sweet: Its in the mail 220 Sweet: Anything in europe was $700 221 Sweet: So it will take a while 222 Riehl: the motherboard i have doesn't have integrated video. 223 Riehl: oh lord. 224 Sweet: So beginning of october most likely 225 Riehl: :hype: 226 Riehl: another issue is .. apparently the Tesla series drivers are kind of garbage. 227 Sweet: Ordered one of those PSU adapters on ali, so hopefully they arrive around the same time 228 Sweet: Drivers i have no fear in 229 Riehl: i'm researching the issue - apparently there are other drivers that play nicer with the Tesla GPUs 230 Sweet: I work as an IT tech support 231 Sweet: / system administrator 232 Sweet: Drivers are kind of my thing haha 233 Riehl: oh sweet. 234 Riehl: what would you recommend doing, then? 
235 Sweet: So if it takes some tinkering thats fun 236 Sweet: From what i saw there are multiple different cuda versions bundled in the driver 237 Sweet: So my first step will be matching what KoboldAI uses with whats in the driver 238 Sweet: And not having an existing nvidia card will help 239 Sweet: So my AMD should allow me to use it 240 Riehl: I uh.. 241 Riehl: [Image attached] 242 Riehl: *thought* about doing this to one of my existing low-end GPUs.. 243 Riehl: to make it fit an X1 PCI slot. 244 Riehl: it supposedly works? ... but is highly not recommended. 245 Sweet: I saw RX550's sometimes have single slot variants 246 Sweet: For KoboldAI currently the cuda version is locked on 11.1 247 Riehl: doing driver research? 248 Sweet: Yup 249 Riehl: it's awesome you work in tech support. 🙂 250 Sweet: Looks like they don't have that 251 Sweet: Its my work in tech / sys admin that allowed me to build those install scripts much easier 252 Sweet: Figuring out how to install stuff is my thing 253 Riehl: [Image attached] 254 Riehl: this was the X1 GPU I found .. but it requires an adapter for my monitors to work. 255 Sweet: What output do you need? 256 Riehl: just basic video. 257 Sweet: VGA? 258 Sweet: Or HDMI 259 Riehl: HDMI or DVI adapters 260 Sweet: Isn't that output DVI? 261 Riehl: i have converters for both DVI and VGA plugs 262 Sweet: It looks like DVI 263 Sweet: Then don't go with that 264 Riehl: It's a DMS-59 pin port. 265 Riehl: buuut.. 266 Sweet: Its easier to just go for a different card 267 Riehl: there is an adapter 268 Riehl: i was trying to go for cheap. 269 Sweet: HD 5450 for example 270 Sweet: Not sure how good compatibility is 271 Sweet: But windows 10 does have drivers for it 272 Sweet: [File URL attached] 273 Sweet: Looks like less of a headache than that firepro would be haha 274 Sweet: Since this seemingly has Displayport/HDMI, DVI and VGA 275 Sweet: No adapters 😄 276 Sweet: And you'd be using consumer drivers not firepro drivers 277 Riehl: mm. 
but it's not an X1 slot. 😦 278 Riehl: @Henky!! whatcha doing for cooling solutions? 279 Riehl: nevermind - just scrolled up and saw what you're doing. 280 Riehl: 👍 281 Sweet: I hope it will work 282 Sweet: But those 200mm's have a ton of cooling power 283 Sweet: @Valerian Wait, did you mean something other than PCI, instead of 1-slot occupying? 284 Sweet: Then i misunderstood 285 Sweet: Do you seek PCI or PCIe @Valerian ? 286 Sweet: Or wait i can see PCIe from the screenshot 287 Riehl: I'm cooling the K80 with a set up like this .. 288 Riehl: [Image attached] 289 Riehl: But the triple fans will block my secondary PCIe 4 slot. 290 Riehl: I have a few PCI X1 slots near the bottom. 291 Sweet: PCI or PCIe? 292 Sweet: Because for a retro PC i will soon be installing my Trio64 GPU 293 Sweet: And that PC is capable of windows 10 294 Sweet: So if it works on Windows 10 you'd be able to get that as a PCI card 295 Sweet: They are absolutely ancient cards though 296 Riehl: [Image attached] 297 Sweet: Doubt windows would have a driver for it even 298 Riehl: The very last one .. the X1 slot. mega small 299 Sweet: So the PCIe variant then 300 Riehl: yes 301 Sweet: I'm not far off with my 5450 302 Sweet: They do have one 303 Sweet: The question is would you be able to obtain one 304 Sweet: [File URL attached] 305 Riehl: 🤔 306 Sweet: Judging by ebay the firepro + adapter is probably your best bet though 307 Riehl: I mean- 308 Sweet: Whats that cooler though @Valerian that looks sick 309 Riehl: not what I bought, lol 310 Riehl: [Image attached] 311 Sweet: Interesting but yours looks really cool 312 Riehl: i think the cool looking fans are like $60 ... 313 Sweet: I like the LED's on it 314 Desberg: my question is how the air's gonna get past the plastic 315 Sweet: No fins? 316 Riehl: [File URL attached] 317 Desberg: there are fins but you gotta actually remove the casing 318 Riehl: I take a hex-key to the side screws, open the case, and peel the plastic shield off that covers the heat sinks. 
319 Desberg: so the shield's on when not in use? gotcha 320 Riehl: er, i was just gonna leave the plastic off the top. 321 Sweet: In my case i just hope the front fan will be enough 322 Sweet: And i hope the card stays cool when not in use 323 Riehl: [Image attached] 324 Riehl: that way, the heat sinks are exposed 325 Desberg: I see 326 Desberg: I made a 3D printed attachment that attaches to the screws on the front of the card, which funnels the intake from a fan attached to it 327 Sweet: Saw those on ebay as well 328 Sweet: Like fan to the side kind of deal 329 Riehl: kinda like how a data center fan actually works 330 Desberg: yeah since I have a 3D printer I just made my own, but they're fairly easy to attach 331 Sweet: In my case there is a 200mm fan in the front of my case which would blow directly against the card when active 332 Sweet: So i am hoping the M40 stays cool enough when not in use to keep my passive build 333 Riehl: Henky is using a big-boy fan 334 Sweet: But then when i want to use it i can turn that fan to a high speed and then that should do hopefully 😄 335 Riehl: instead of tiny screamy jet engines 336 Sweet: Yup 337 Sweet: The case is also quite affordable 338 Sweet: At least if you consider $100 for a case affordable 339 Sweet: To me most good quality cases are around that price 340 Riehl: i am very likely going to have to come to you or Henky about setting up linux to run this thing if we can't find windows drivers that play nice with it 341 Sweet: [File URL attached] 342 Sweet: @Valerian I am going to have Linux on mine either way since i want to be fully on linux by 2026 343 Sweet: So already dual booting 344 Desberg: and that is why I don't like running my K80 lol 345 Riehl: this is the case I was *thinking* of putting it in.. 346 Sweet: How hot does your K80 get idle when the fan's not on? 347 Riehl: [Image attached] 348 Riehl: opinions? 
349 Desberg: dunno, the fan's got a molex connector so it's not like I can very easily control that 350 Desberg: they're pretty much just always on 351 Sweet: @Valerian Looks server rack enough that you might get away with just the front fans if its near that area 352 Sweet: @WAUthethird In my case i can use motherboard software to control the output power of the motherboard ports, so if i went for a setup like that i'd try to hook it up to those 353 Sweet: But if its on the PSU then rip speed control indeed 354 Desberg: yup, psu 355 Sweet: Yeah thats not fun 356 Sweet: Think my 200mm fan will work? 357 Desberg: how are you attaching it? or is it in a server box 358 Sweet: Its in my desktop PC 359 Sweet: The fan is on the front 360 Sweet: If you check the amazon link i posted you get the idea 361 Sweet: Those on full blast produce a lot of airflow 362 Desberg: seems like it'd be a bit challenging to get all that into the fins though 363 Desberg: unless you've got a fancy rack thing like @Valerian 364 Sweet: Its basically case wide airflow 365 Desberg: ah yeah, that's what I've got 366 Desberg: should be fine 367 Sweet: And i am hoping that like the server variants it goes in through the side 368 Sweet: Not sure how they line up in the servers but i am assuming they are in a row and then air goes from the side 369 Sweet: Since blade servers typically have this large fan array of super tiny fans 370 Desberg: yeah, in one way and out the other so you've got one area of the server room that's a bit windy, and another area that's pretty warm 371 Sweet: Once was in one where the aircon failed 372 Desberg: ouch 373 Sweet: That was insanely warm in the entire room 374 Sweet: They had dying harddrives for months 375 Desberg: not surprised, why didn't they make fixing that a priority? 
376 Sweet: They did 377 Sweet: But you don't notice it until the next day 378 Sweet: They ended up taking the windows out so they could have the room cool down 379 Sweet: It was not safe for humans in there 380 Desberg: and by that time it was too late, that's not fun 381 Sweet: Pretty sure it happened on the weekend as well 382 Sweet: And it wasn't the main server room either 383 Sweet: The company had 2 384 Sweet: So it was just this regular room with servers and an aircon 385 Desberg: the main one didn't fail, right 386 Sweet: Nope 387 Sweet: That one was much more proper haha 388 Sweet: But i remember trying to go inside it hours after they already were getting the air out for the backup check 389 Desberg: can't have been a cheap fix-up 390 Sweet: And i was not able to go in yet 391 Sweet: Too warm 392 Sweet: I would not be surprised if it ended up near 80 degrees celsius in there 393 Riehl: holy wow... 394 Desberg: hot damn 395 Sweet: Imagine how hot the inside of your computer gets if you had no cool air anymore 396 Sweet: Then have an entire room of them, and no more external air cooling or ventilation 397 Sweet: The fact that the company took the windows out already says a lot haha 398 Desberg: sounds like a good reason to have backup a/c 399 Sweet: We made daily backups on tape which were in different rooms entirely and some outside the building 400 Sweet: And the other server room did not get hit 401 Sweet: So no data was lost but they had to replace almost all the drives over the course of a year 402 Desberg: sounds pricey 403 Sweet: For sure 404 Sweet: I remember them being very concerned about their raids 405 Riehl: how often do things burn out in situations like that? 406 Sweet: Burn out as in catch fire? 407 Sweet: Or just failure 408 Riehl: both.. ? 409 Riehl: Ehm.. 410 Riehl: [Image attached] 411 Riehl: @Henky!! these were the drivers that were suggested ... 
412 Riehl: i dunno if those were the ones you looked at already or not 413 Sweet: Its one i am researching yes 414 Sweet: But i do see one annoyance with them 415 Sweet: Conda has no pytorch version compatible with it 416 Sweet: Pytorch's repository basically has 11.0 and 11.1 417 Sweet: So i am starting to doubt if using the official one is a good idea 418 Sweet: conda-forge seems to have 11.2 419 Sweet: So i am messing with kobold's install scripts a bit 420 Sweet: Or more specifically the finetuneanon.yml 421 Sweet: Otherwise 11.0 might be better for you and me, since we can grab a download for 11.0 422 Sweet: @Valerian From what i can tell 11.2 is good for now 423 Sweet: But i can't test anything until one of us has the card running and we can learn how strict it is on the requirements 424 Nevlin: Doesn't Colab have K80s 425 Sweet: They do yes 426 Sweet: We are talking windows drivers at the moment 427 Sweet: They have separate drivers for the different cuda versions and i am wondering why thats the case 428 Sweet: The one version pytorch uses (11.1) is not provided 429 Sweet: 10.2 would work though 430 Carena: @Henky!! welcome to my world of compilation hell 431 Riehl: @Henky!! 432 Riehl: would this help? 433 Sweet: Try the 11.2 first 434 Sweet: Because 10.2 will not work on KoboldAI before changing a dependency file 435 Sweet: If 11.2 works out of the box we don't need to mess with it 436 gold9: 11.2 works fine with the M40 but that GPU is monolithic, which means the driver will interface with it as a single GPU. I don't know much about the K80 but if I recall correctly it is a dual GPU on a single board. Does the cuda library interface with them as a single board or as two gpus? 437 Sweet: I think two gpu's 438 Sweet: VE mentioned before he might look into adding multi gpu support 439 gold9: Nice! 440 Nevlin: I already did, I'm just waiting for Sionnach to finish setting up 441 Sweet: Oh neat, so thats going into united soon i assume? 
442 Nevlin: I have it in a branch 443 Sweet: Nice 😄 444 gold9: Does the multiple GPU support allow using multiple VRAMs as a single one? 445 Nevlin: It just does some computations on one GPU and the rest on another 446 Nevlin: you can have some of the model on one GPU's VRAM and the rest on the other 447 gold9: So I'll be limited by the biggest VRAM in the system for model size ? 448 Nevlin: No, you are limited by the sum of the VRAM in all your graphics cards combined 449 Sweet: Is it like breakmodel or does it auto split? 450 Nevlin: I just have it set to do an even split right now 451 Nevlin: I'm trying to think of how to get the user interface to work with multiple GPUs and CPU 452 gold9: oh thats neat! 453 Sweet: How about you give them the total, and ask per device how many blocks they want to commit to it, with the remaining blocks also being shown in each prompt? 454 Sweet: And then auto commit the remaining ones to the last device 455 Sweet: That way in a single GPU setup you get the CPU question and nothing else 456 Sweet: But in a multi GPU setup you can choose how much per GPU 457 Sweet: Because if people add their old GPU's on to it they may have preferences 458 Nevlin: I think for the multiple GPUs it'd be less confusing to ask for how many layers for each GPU and then commit the rest to CPU memory 459 Nevlin: But that'd also require me to change the existing breakmodel thing to ask for how many layers to commit to GPU 460 Sweet: Thats a good idea 461 Sweet: I don't think that would be too big of an issue if we are clear about it 462 Sweet: But ideally we'd automatically detect it 463 Sweet: If we had a means of detecting the total size 464 Sweet: Then you could recommend the amount to commit 465 gold9: I mean there are a limited amount of memory sizes out there: 2, 4, 6, 8, 11, 12, 16, 24, and they can be read by the NVidia driver, assuming that its gonna be an NVidia only solution. I didn't test AMD / pytorch on linux yet 466 Carena: Don't forget I run 32Gb... 
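The layer-split idea discussed above (model size bounded by the *sum* of VRAM across cards, with a configurable number of layers per GPU) can be sketched roughly like this. This is a hypothetical illustration of the concept, not KoboldAI's actual breakmodel code; the function name and the VRAM-proportional policy are my own.

```python
# Assign each GPU a share of the model's layers proportional to its VRAM,
# so no single card's capacity caps the model size on its own.
def split_layers(num_layers, vram_per_gpu):
    total_vram = sum(vram_per_gpu)
    shares = [num_layers * v // total_vram for v in vram_per_gpu]
    # Hand any integer-rounding remainder to the largest card.
    shares[vram_per_gpu.index(max(vram_per_gpu))] += num_layers - sum(shares)
    return shares

# A K80 presents to CUDA as two 12GB GPUs; GPT-J has 28 transformer layers.
print(split_layers(28, [12, 12]))  # -> [14, 14], the "even split" case
```

An uneven pair like a 24GB M40 plus a 12GB card would get roughly a 2:1 layer split under the same policy, which is what makes "ask per GPU" UI prompts useful for people mixing old cards.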
467 Carena: And it's a CPU/GPU shared memory 468 Carena: This however is something that is not yet fully understood by the pytorch devs... 469 Carena: The GPU only needs a pointer to the memory, not a full copy. 470 gold9: hahha I think the Xavier AGX is a special case. But most likely future systems from ARM and RISC-V are gonna share memory with just a pointer. But for now, most x86 systems are limited with memory. 471 Carena: It's not a special case, I have also seen GPUs with 48Gb... 472 gold9: Talking about CPU/GPU based systems 473 gold9: other than APUs for ML 474 gold9: Though I would love a V100, the price is going to be astronomical 475 gold9: or RTX 8000 476 Carena: A100, 80Gb VRAM 477 gold9: yeah and the rtx8000 is 48. Does anyone here have an A100 ? 478 Carena: Vast.ai? I think you won't need anything bigger than 24Gb VRAM to be honest at this moment 479 Carena: HFJ in half size is 12Gb, for training you need twice that... 480 Sweet: Imagine being able to tune a model in chunks 481 Sweet: Would be so nice 482 gold9: I think the reloading of each chunk will take forever 483 Carena: Deepspeed? 484 Sweet: Probably 485 Sweet: Deepspeed is going to be something ill look in to when i have the m40 486 Carena: Deepspeed is only handy if you have multiple gpus 487 gold9: What is Deepspeed ? 488 Sweet: Can't it tune J on a single 3090? 489 Carena: 12Gb * 2... Doubtful. 490 Sweet: [File URL attached] 491 Carena: Freezes 25% of layers to fit 492 Sweet: So wouldn't produce good results? 493 Carena: As i said: doubtful 494 Carena: It might, but unsure at what cost 495 Carena: Ah, it uses finetuneanon's wonderful junk 496 Carena: I tried that, it won't work properly 497 Sweet: To be fair you did not try that with an m40 😛 498 Carena: 2x 3090, and it OOMed at 24Gb 499 Carena: Want to try my most devilish idea ever, and that is 6B-Ramsay 500 Sweet: As your first 6B? 
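The "12Gb * 2" remark above has a simple back-of-envelope behind it, sketched here under stated assumptions: 2 bytes per parameter in half precision, and at minimum a second parameter-sized buffer for gradients before any optimizer state or activations are counted. This is a rough lower bound for illustration, not a precise memory model.

```python
# Estimate model weight memory in decimal GB, assuming half precision
# (2 bytes per parameter).
def weights_gb(params_billions, bytes_per_param=2):
    return params_billions * 1e9 * bytes_per_param / 1e9

w = weights_gb(6)  # a 6B-parameter model like GPT-J
print(w)           # -> 12.0 GB just for fp16 weights
print(w * 2)       # -> 24.0 GB with a same-sized gradient buffer
```

That floor already fills a 24GB card before optimizer state, which is consistent with the 2x 3090 OOM mentioned and with why techniques like freezing a fraction of the layers get used.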
501 Carena: Yup 502 Carena: At least I will know how well it performs 503 Carena: I know that if I do Shinen on 1.3B on my system, it takes around a month... 504 Carena: I do have a paid 2080Ti at my disposal but the last times I tried, that broke on deepspeed 505 Riehl: it is time 506 Riehl: for operation 507 Riehl: mutilate spare GPU. 508 Riehl: wish me luck, i shall post my results. I *won't* be testing it on any good mobos or anything of value - just a garbage rig kept in storage. 509 Riehl: oh and i sent the gpu that arrived in the mail back; that's money i could spend on something else. 510 Sweet: And thus #hardware became a horror channel haha 511 Sweet: Hopefully they survive it Sio 512 Riehl: well the good news is nothing caught fire 513 Riehl: the GPU powered up, etc. but no post. 514 Riehl: wasn't expecting it to work - but i have a bunch of old junk parts. 515 Riehl: on the plus side - i've got a franken rig to test parts on now at least. 😅 516 Sweet: The M40 arrived 😄 517 Sweet: Now i will need to wait a couple of weeks for the PSU cable to show up 518 Nevlin: Wow that was fast 519 Sweet: Yeah its a week earlier 520 Sweet: Thats like a one week delivery for something from the US 521 Sweet: The cable will take much longer 522 gold9: They were fast when they shipped the M40. I had to wait a week to get cables even though I ordered them before the card arrived 523 Sweet: I have no illusion in regards to the cable, i expect it in the second half of next month 524 Sweet: China shipping is always super slow 525 Sweet: If it takes too long i might just try and buy another one locally 526 gold9: They have direct aliexpress shipping that usually takes less time. But I was on the verge of ordering it for a bit extra from Amazon just to be sure nothing was wrong with the M40 527 Sweet: I didn't go with the direct one for such a cheap component 528 gold9: They are free after a certain amount. 
I was building server rack cable management and I got them with other stuff 529 Sweet: I didn't need anything else, so it would have been $13 shipping for a $4 part 530 Sweet: Looks like i am very lucky, randomly got a track and trace alert despite using the free shipping option. "The cable arrived in my country" (Not really but its in the neighboring country close to the border). So it will probably be here very soon :D 531 Ralf: What courier is it with? 532 Sweet: Their free global shipping 533 Sweet: My tracking app can't figure out the real tracking number of the couriers like it often can 534 Ralf: It should be with a local carrier rn, no? 535 Sweet: Yup 536 Ralf: Are you using 17Track? 537 Sweet: Nope, parcelsapp which is way better 538 Ralf: I used Parcelsapp a long time, but shipments from China were just not accurate for me. 539 Sweet: Interesting since i never found anything it did not accurately track 540 Sweet: Tried like 5 of them on a chinese package and only parcelsapp managed to get details 541 Carena: depends, if it's aliexpress it might 542 Sweet: But it also is reliable outside of china where 17track fails on anything thats not international 543 Sweet: For this specific package i only have cainiao tracking 544 Ralf: Not using Aliexpress. 545 Carena: Aliexpress also ships with Cainiao 546 Ralf: 17Track isn't working for you then? I use it on all my local DHL and DPD shipments. 547 Carena: Aliexpress using 4 types: - EMS - Aliexpress (DHL & PostNL) - Cainiao - China Post 548 Sweet: @Ralf Didn't work on PostNL 549 Sweet: For me no matter the type i always got cainiao tracking 550 Ralf: Oh I remember EMS 551 Ralf: Not gud 552 Sweet: This one is Cainiao Super Economy Global 553 Sweet: What was EMS like? 554 Ralf: Ass. 555 Ralf: Took a whole month to get my shipment to Germany. Ripped off my customs declaration papers in the process. 556 Ralf: German Customs or "Der gute Zoll" seized it and it took EMS another whole month to get back to me.
557 Sweet: For me the china post one is the trash one 558 Sweet: With dealextreme it typically took me 2 months to get anything 559 Ralf: I usually go with DPD now when getting something from China. 560 Ralf: Customs doesn't even care about it. 561 Sweet: Nice, these days PostNL has a system where on behalf of customs they deliver it to the post office and make you collect it there 562 Sweet: But the post office is trash so i am trying to avoid them as much as possible 563 Ralf: Yes, same in Germany with DHL and the Deutsche Post, but in general I pay customs in advance. 564 Carena: @Henky!! Note that if you have an app, you can pay customs in advance 565 Sweet: I don't have their app 566 Sweet: But it happened with a package i had no track and trace for 567 Sweet: With the M40 ebay handled all the customs stuff. Shipping and customs was a ridiculous $100 though, they should not do that on second hand goods 568 Sweet: It was a third of the price of the card 569 Riehl: @Henky!! K80 is powered on, system is running ... trying to find drivers. :/ 570 Riehl: getting 11.2 571 Riehl: @VE FORBRYDERNE I should be able to give you a screenshot of my GPUs soon. getting drivers set up. 572 Riehl: also getting fan controllers configured. 573 Riehl: hmm.. 11.2 may not work with Kepler devices.. 574 Riehl: [File URL attached] 575 Nevlin: colab somehow has both CUDA 10.2 and 11.2 installed for K80s 576 Nevlin: I think at least one of those two should work 577 Riehl: i tried to install KAI -- but finetuneanon's transformers failed to create a process. 578 Nevlin: I've seen that happen before when trying to use conda in a path that has spaces in it 579 Riehl: ah! 580 Riehl: lemme look at that. 581 Nevlin: if you're using the K drive option then this isn't the problem though 582 Riehl: i think that is the issue.. my user account has a space. 583 Riehl: lemme try the other option 584 Riehl: [Image attached] 585 Riehl: ooof. 586 Riehl: i guess i have to use 11.2 then?
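The CUDA-version juggling above comes down to each card's compute capability. A small lookup sketch; the values are from memory (and the CUDA 12 cutoff for Kepler postdates this chat), so verify against NVIDIA's own compatibility tables before relying on them:

```python
# (architecture, compute capability, last CUDA major release that
# still targets it) -- recollections, not authoritative.
tesla_support = {
    "K80": ("Kepler", "3.7", 11),   # dropped in CUDA 12; hence 11.2 working
    "M40": ("Maxwell", "5.2", 12),
    "V100": ("Volta", "7.0", 12),
}

def max_cuda_major(card: str) -> int:
    """Newest CUDA major assumed to still support the card."""
    return tesla_support[card][2]
```

This matches what colab does: shipping CUDA 10.2 and 11.2 side by side keeps both old and new Tesla cards covered.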
587 Riehl: also, weirdly enough ... no GPUs appearing in Task Manager... 588 Riehl: [Image attached] 589 Riehl: though I know the devices are there. 590 Riehl: restarted with 11.2, installing KAI again 591 Nevlin: If you get the drivers installed correctly, you can run commandline.bat and type in this command to check how many GPUs PyTorch recognizes ``` python -c "print(__import__('torch').cuda.device_count())" ``` 592 Riehl: can do! 593 Riehl: I also found this -- 594 Riehl: [Image attached] 595 Riehl: does that help you at all? 596 Nevlin: what am I looking at lol 597 Riehl: uh, I think registry entries for installed GPUs. 598 Riehl: [Image attached] 599 Riehl: and that command showed this ... 600 Riehl: wait 601 Riehl: wrong cmd 602 Nevlin: if it prints nothing it means PyTorch didn't recognize your cuda drivers 603 Nevlin: I think 604 Riehl: [Image attached] 605 Riehl: it said .. "2" 606 Nevlin: Your display GPU is from AMD right? 607 Riehl: correct. 608 Nevlin: Ok I'll just assume those two are your K80 cores 609 Riehl: so it sees both cores? 610 Nevlin: hold on 611 Nevlin: yeah I guess so 612 Riehl: want me to try to run a model? 613 Nevlin: sure 614 Riehl: kapow. 615 Riehl: i'm guessing the models also cannot have spaces in the command line either, huh? 616 Nevlin: I think they can 617 Riehl: not sure what borked up then 618 Riehl: [Image attached] 619 Riehl: if that provides more details or not? 620 Riehl: wait.. 621 Riehl: those aren't complete models. that's my fault. 622 Riehl: they didn't transfer over from my external hard drive completely; this motherboard has kinda borky USB ports on the front. 623 Riehl: transferring 6B over. 624 Nevlin: You should probably try to load a 2.7B model first 625 Riehl: yep -- trying a neo model first. 626 Riehl: 🤞 627 Riehl: and here we go~ 628 Riehl: holy shit 629 Riehl: [Image attached] 630 Riehl: I can run GPT neo without break model mode. It replies in like, a few seconds. Not instant..
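The spaces-in-path failure Nevlin diagnosed above ("my user account has a space") is easy to pre-check before installing. A hypothetical helper; the function name and behaviour are my own illustration, not part of KoboldAI:

```python
from pathlib import Path

def conda_path_ok(install_dir: str) -> bool:
    """Conda (and some batch scripts) misbehave when any path
    component contains a space, e.g. C:\\Users\\John Doe\\KoboldAI."""
    return " " not in str(Path(install_dir))

# A user-profile directory with a space is exactly the failure mode above.
assert not conda_path_ok(r"C:\Users\John Doe\KoboldAI")
assert conda_path_ok(r"C:\KoboldAI")
```

Installing to a top-level directory (or the "K drive option" mentioned above) sidesteps the problem entirely.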
631 Nevlin: Around 4-5 seconds? 632 Riehl: Yup! 633 Nevlin: I have a version of KoboldAI here you can use to try to load 6B https://github.com/VE-FORBRYDERNE/KoboldAI/tree/k80-test 634 Riehl: Temps -- 635 Nevlin: Is that good? 636 Riehl: I think? 637 Nevlin: I know nothing about hardware 638 Riehl: I just looked up what is "hot" ... and the interwebs says something like 65C ... 639 Riehl: i've got a manual fan controller set up just in case .. (and a fire extinguisher) 640 Riehl: i do have to get to bed. but i'm really excited. 641 Riehl: I downloaded your special version -- going to set it up in the morning. 642 Riehl: spent the evening building this thing. lol 643 Riehl: it looks like ... a train wreck; the proper case hasn't shown up yet. 644 Nevlin: That's what most stuff I build looks like 645 Nevlin: Anyway, good night! 646 Riehl: good night! talk to you tomorrow! thank you for the help! I'm sure Henky will be excited, too! 647 Riehl: i guess I'm only using 12 VRAM right now? or is it using both? I'm not sure.. ? 648 Riehl: doesn't NEO require 16vram? 649 Nevlin: No only the official ones do 650 Nevlin: The finetuned ones need 8 GB 651 Nevlin: We halved the memory requirements 652 Riehl: I see! 653 Riehl: before, I had to use break model mode to even run horni-li at full tokens 654 Nevlin: Didn't you have an 8 GB card already? 655 Nevlin: I guess maybe it needs more at max tokens then 656 Riehl: I did! 657 Riehl: head to sleep! being poked. 658 Riehl: again, thank you! 659 Carena: Anything between 60 and 65 is good. Beyond 70 you might start experiencing issues, and beyond 85 your card will throttle. It can go up to 130c, but then its thermal paste will melt. 660 Riehl: thank you! testing K80 mode now! 661 Riehl: @VE FORBRYDERNE 662 Riehl: it appears to be working. both cores are operational - I'm running 6B Skein. Replies take a few moments to return 663 Riehl: [Image attached] 664 Carena: @Valerian you running on full GPU utilisation? 665 Riehl: I think so?
there's a monitor in the upper left of the screen - both cores seem to be showing activity 666 Riehl: @mr_seeker 667 Carena: I mean like: Full GPU usage as in RAM 😉 668 Riehl: oh! 669 Riehl: let me see what the ram is doing.. 670 Riehl: [Image attached] 671 Riehl: I guess that's like 100% utilization? 672 Riehl: [Image attached] 673 Riehl: it's weird -- the GPU does not show up like a normal GPU in the performance tab. There are literally only two options in the NVIDIA settings: "Dev Mode enable" and "GPU utilization" 674 Riehl: it doesn't give me any easy readouts of VRAM usage. 675 Carena: Check nvidia-smi 676 Riehl: Most of the information is very, very scant - with only a "Data center GPU detected" comment 677 Carena: nvcc and nvidia-smi can see much more than windows 😉 678 Riehl: [Image attached] 679 Carena: Okay, it seems it's split between 2 K80? 680 Carena: 6Gb + 6Gb 🙂 681 Riehl: [Image attached] 682 Riehl: seems to be running at least pretty cool- 683 Carena: means multi-GPU works 😄 684 Riehl: yes! 685 Riehl: Hardware people! I have updated my Pinned message at the top to include cooling options - you can remove the cover from the GPU using hex keys and cool the GPU's heat sinks directly with traditional fans. 11.2 NVIDIA drivers for the Tesla K80 are confirmed good; unsure for M40 or other Tesla cards. As you get results, I'd be happy to add to the pinned messages for others who want to build their own home rig. 686 Carena: I would love to get a new rig for training purposes... 687 Sweet: @Valerian By default the K80 behaves as a compute card not a gpu 688 Sweet: Can be switched but this mode is going to be the best for kobold 689 Riehl: if I could learn how to do it - I would be happy to train stuff on my rig! 690 Riehl: multi-tasking and stuff. i might be capable of doing that now 691 Riehl: yeah .. i saw that. but it's kind of a debate on if it's even useful as a graphics card. some people swear by it, others say it's kinda garbage as a gaming card.
I'm not really going to try at the moment. might be an interesting experiment later. 692 Riehl: on another note .. **all** my old stories are absolutely effing amazing now. 693 Riehl: at least a good few of them. 694 Riehl: 6B is incredible. 695 Carena: I tried it, but the 2.7B on 11Gb of RAM seemed to have caused OOM, so never followed up on it 696 Riehl: ah. so i can't use all 24 for training i guess 697 Sweet: @Valerian On skein i imagine? 698 Riehl: yep! was running Skein locally this morning. 699 Riehl: about to tinker with it a bit more. 700 Riehl: trying to get my Zotac drivers installed so I can have two monitor support. 701 Riehl: @Henky!! your M40 show up yet?? 🙂 702 Sweet: M40 yes, cable no 703 Liggitt: google coral products aren't compatible with anything kobold-ai related, are they? 704 Liggitt: [File URL attached] 705 Liggitt: seems like these are more geared towards picture, video and sound recognition 🤔 706 Nevlin: All you really need for running most neural networks is really fast matrix multiplication and lots of fast memory so technically it could be done, but these things don't look like they have enough memory 707 Nevlin: Also they use tensorflow and we use pytorch and jax 708 Liggitt: they're out of stock too 😫 709 Liggitt: i hate the chip shortage 710 Liggitt: 25 bucks wouldn't have been too much for a little coral device to experiment 711 Liggitt: *no stock* 712 Liggitt: but yeah with how much VRAM pytorch gobbles up, these things would definitely be out-of-spec 713 Desberg: I've got the USB coral accelerator 714 Desberg: works amazingly well 715 Carena: They might work with a bit of tinkering 716 Carena: One issue though: coral is an 8-bit system. 717 Sweet: So you need 8bit models? 718 Carena: Yup, looks like. Also restrictions in what can be used 719 Sweet: Finally landed up on my doorstep, when i got the time and energy this week ill be taking that tesla for a spin. 
I just really hope i can keep it cool otherwise ill need to figure out what the best fan setup is 720 Carena: Wondering if deepspeed can split up the Train model over multiple gpus, and how to do that... 721 Sweet: Isn't that the purpose of GPTNeoX? 722 Carena: Might get my hands on a cheap 2080Ti 723 Carena: Cluster of 4 724 Sweet: Not gonna join the M40 gang? 725 Carena: Rented 726 Sweet: Ah 727 Carena: If I go full M40, I need a new pc 728 Carena: And with that the hardware to finetune 20B 729 Carena: Chip shortage needs to end first 730 Sweet: For sure 731 Carena: AMD Threadripper Pro 64 core 3995WX 256gb 3200 ram 7x 3090 FEs. All running at pcie gen4 16x 2tb Sabrent nvme 2x Corsair ax1600i 1x Corsair ax1200i 480mm EKWB XE Radiator 240mm EKWB XE Radiator 7x Noctua 3000rpm industrial fans dual D5 EKWB pumps Corsair 3090 waterblocks with heatsinks and fans on the backplates All EKWB fittings and tubing. 12/16 That is a rig for vast.ai 732 Sweet: Ran into an issue slotting my M40 733 Sweet: The plate on the back doesn't fit my case, it seems nonstandard compared to some of the online pictures 734 Carena: Nonstandard? 735 Carena: Is it a server based one? 736 Sweet: Seems like it 737 Sweet: The screw is too high 738 Sweet: [Image attached] 739 Sweet: Yup thats a nogo 740 Sweet: @Valerian This will be good for you to include in the guide 741 Sweet: There are multiple backplates and models and if you get one like mine you can't use it in a desktop case at all 742 Sweet: The correct bracket seems to be called : PK3RJ 743 Sweet: [File URL attached] 744 Riehl: I think you can replace it? 745 Sweet: Yes, its with screws 746 Sweet: But people need to account for the fact they may need to buy this 747 Sweet: Its something you can really only tell from the picture in the listing 748 Sweet: Ill now need to wait another month before i have the bracket, otherwise i'd have ordered one haha 749 Lavinia: Time to break out the zip ties.
750 Sweet: I didn't wanna risk it with the slot 751 Sweet: But would have gone further if i was more certain about my cooling 752 Carena: [Image attached] 753 Carena: To give an idea on what phone you might need for AI-related tasks (higher is better) 754 Fowle: What kinda bracket does an M40 need to fit in a normal dual slot? 755 Sweet: PK3RJ 756 Sweet: Best way i can describe the look is that it has teeth at the bottom 757 Sweet: If its flat you got the server version 758 Fowle: Thanks 759 gold9: It fits perfectly on the M40: https://www.aliexpress.com/item/4000505282893.html?spm=a2g0s.9042311.0.0.27424c4dSliEoa 760 Singband: I eventually need to try to get an Nvidia card. Or find out some new method to do without. One of the two. 761 Sweet: That one fits too, but the PK3RJ has the original mesh 762 Sweet: So the one i linked here is similarly priced and more original : https://www.ebay.com/itm/183756406633 763 Riehl: @Henky!! lemme know which parts work - maybe we can pin a M40 how-to 764 Riehl: or make an addendum 765 Sweet: The PK3RJ most likely applies to both the M40 and K80 so it could go in the universal guide. Its not on my card yet but ill know soon enough if it fits 766 Sweet: My M40 adventure may have come to an end 767 Sweet: Not sure if its going to work in my PC 768 Sweet: Getting insufficient system resources 769 Sweet: @Valerian Any ideas? 770 Riehl: Insufficient system resources? 771 Riehl: like what kind of errors are coming up / symptoms? 772 Riehl: hmm.. 773 Riehl: [File URL attached] 774 Sweet: I managed to fix it 775 Sweet: I needed to disable CSM mode 776 Sweet: But its still not looking good 777 Riehl: how so? 778 Sweet: I am having multiple device conflicts 779 Sweet: I got the tesla working now 780 Sweet: But my network and usb controller are broken 781 Riehl: which ones are conflicting? hurray! 782 Sweet: So it may not have enough PCI lanes for all of them 783 Riehl: interesting.
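The lane worry above can be sketched with simple arithmetic. Mainstream AM4 CPUs expose roughly 20 usable CPU lanes (16 for graphics plus 4 for NVMe), and X370/B450-class boards bifurcate the x16 block into x8/x8 when both full-length slots are populated; the numbers below are the commonly quoted ones, treated here as assumptions:

```python
CPU_GPU_LANES = 16   # the x16 block shared by the two full-length slots
NVME_LANES = 4       # dedicated CPU lanes for one M.2 drive

def slot_widths(n_gpus: int) -> list[int]:
    """How an X370-style board splits the x16 block (assumed behaviour)."""
    return [16] if n_gpus == 1 else [8] * 2

# One card gets the full x16; two cards (e.g. Vega 64 + M40) run x8/x8.
# Note that neither case exceeds the 16-lane block, so raw lane count
# alone doesn't forbid running two full-length cards on such a board.
```

The x8/x8 split costs some bandwidth but is normally harmless for inference, where the model sits in VRAM after loading.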
784 Sweet: At least thats my theory 785 Sweet: My main windows doesn't start anymore 786 Riehl: the machine i built also started having USB issues; front ports just stopped working flat out 787 Sweet: Windows To Go running from USB starts and it has the card 788 Sweet: But 2 of my other devices are down 789 Riehl: using windows 10? 790 Sweet: Yes, but a much newer one on the usb stick 791 Sweet: The card was getting warm since i can't use my cooler software 792 Sweet: So i just shut everything down 793 Riehl: Pfff... 794 Sweet: Once it had a while to cool off ill do one reinstall of windows 795 Sweet: So it can detect and install everything fresh 796 Sweet: If it works hurray! 797 Riehl: like ... I know the M40 is more advanced / newer than the K80 798 Sweet: If not, then this board is not capable 799 Sweet: I think its oversaturating my PCI 800 Riehl: what cards do you have installed besides the M40? 801 Sweet: Vega 64 802 Sweet: So i assume that one too uses quite a lot 803 Sweet: And both of them eat up too much 804 Riehl: i got myself a baby display controller 805 Sweet: If i had known this i would not have bought that $50 board, i'd just have updated my PC when the old one died 806 Sweet: Its also my gaming PC so i'd remove the M40 before installing a shitty GPU 807 Riehl: OH oh no 808 Riehl: see, i had spare parts given to me 809 Sweet: I'd need a spare PC to put it in to 810 Riehl: i literally had an old motherboard my brother gifted to me when he built a new mining machine 811 Sweet: And my spare PC is certainly not going to handle this 812 Riehl: Hm. 813 Riehl: lemme look up... 814 Sweet: So i am either going to shelve the M40 or sell it if i can't get this stable 815 Sweet: Because chances are on Ryzen 3000 or 5000 it would work 816 Riehl: [File URL attached] 817 Riehl: this is what he gave me 818 Sweet: Thats yours?
819 Sweet: Because mine should be beefier than that 820 Riehl: i put a little baby small slot GPU in the most bottom slot at the very bottom 821 Riehl: it's not used for anything else other than KoboldAI. 822 Riehl: literally dedicated to that entirely 823 Sweet: I think i know why its a gameover 824 Riehl: ? 825 Sweet: It may not support dual X16 cards 826 Sweet: But getting conflicting information on that 827 Riehl: i wouldn't want you to cook your gaming machine. :/ 828 Sweet: Max PCI lanes is 20 829 Riehl: i also have a ryzen, too 830 Sweet: Which one? 831 Riehl: uh, off the top of my head, can't recall.. 832 Riehl: I *think* a 5000 series? 833 Riehl: it doesn't have an integrated GPU - that's why i needed the small graphics card 834 Sweet: I am worried i am PCI lanes short basically 835 Sweet: Which would not make much sense 836 Sweet: Why support 2 PCIx16 cards on the board? 837 Riehl: if i remember correctly with the B450 .. if you put two cards into the PCIx16 slots, it reduces their speed by 1/2 838 Sweet: Mine is a x370 which is one generation older but a higher end chipset 839 Sweet: [File URL attached] 840 Riehl: interesting.. 841 Sweet: It would indeed run it at 8x 8x 842 Sweet: But 843 Sweet: What i am worried about 844 Sweet: Is that it would make that a 16X and then i am short on the rest of the PCI devices 845 Sweet: But we'll see after i reinstall 846 Riehl: [File URL attached] 847 Riehl: i found this - haven't watched it -- but it *wince* talks about modifying the bios? 848 Sweet: Another aspect is that mine is the 24GB version 849 Riehl: oh 850 Sweet: Not wrecking another board haha 851 Sweet: The bios chip is soldered to the board 852 Sweet: And i already bought a new board this week because of a bad bios flash haha 853 Riehl: well we've confirmed the K80 *works* 854 Sweet: Well, not entirely 855 Riehl: perhaps you could sell the M40 and try a K80? 
856 Sweet: We have not confirmed the K80 works 857 Sweet: Because the M40 works 858 Sweet: Its my other PCI devices that don't 859 Riehl: I'm running a K80. :/ 860 Sweet: Yeah but you mentioned the front port issue 861 Sweet: So to confirm it works you need to check if all your devices in device manager are working correctly 862 Sweet: Because the M40 is usable 863 Sweet: Its the rest of the system that isn't 864 Riehl: i see- 865 Sweet: So i could theoretically rig my USB network adapter to the PC and use it 866 Sweet: It would most likely work 867 Sweet: But thats not a way for me to use my PC 868 Sweet: I'd still be down the usb controller i need for my external drive, and the good ethernet 869 Sweet: Currently installing windows back 870 Sweet: @Valerian Is yours working with all the PCI devices in your system? 871 Sweet: Or do you also have other devices not working? 872 Sweet: Like one of the usb controllers 873 Riehl: when i was building it - i was reading that the motherboard itself has some issues with USBs refusing to work if certain SSDs / PCI cards are installed - regardless of the GPU. 874 Riehl: it's kind of a silly board. 875 Riehl: There's two 'thumb stick' ssd ports on the mobo itself, but if you use one of them - it switches off the PCI card next to it -- unless you go into the bios and override it. 876 Riehl: The front USBs work (infrequently) -- my external hd would display. I started copying data off it - and suddenly the usb stopped working. 877 Riehl: the rear USBs function, as does the network card 878 Riehl: i have the secondary graphics card installed - but i have not installed drivers for it for fear of it conflicting with the K80's drivers. 879 Riehl: supposedly I can install two different sets of graphics drivers -- but the options to do so don't seem available or windows 10 is kinda silly about doing it -- like, it doesn't recognize the drivers when I try to manually select them for that device.
880 Riehl: I *can't* use two monitors - despite having two display ports for some reason. i imagine because the drivers are running only the Tesla drivers. 881 Riehl: I also don't have sound on that machine. 882 Riehl: likely - again - a driver issue - as I imagine data center drivers aren't designed for any audio output because of their purposes 883 Riehl: in terms of temps -- it's been remarkably cool - and i haven't had any crashes yet. 884 Riehl: but also - the machine is simply built around the K80. I don't do any gaming or anything else on it besides run KoboldAI. :/ 885 Sweet: Yeah 886 Sweet: I think i got my conclusion 887 Sweet: I can't support all the components in this build 888 Sweet: Its most likely the Vega 64 + M40 combination that eats up too many lanes 889 Sweet: Found out more information 890 Sweet: Its NOT the M40 that causes my issues 891 Sweet: The reason i only noticed it while i had the M40 in was because Above 4G support was never properly enabled for me 892 Sweet: Its why initially the M40 didn't initialize 893 Sweet: Its the above 4G support that breaks my network adapter and usb controller on its own 894 Sweet: That means that prior to buying one not only do you need to check if its present in the bios, but if everything on your system still works when its enabled. And you have to make sure CSM support is disabled when you test that. 895 Sweet: Its most likely an issue with the motherboard i have and not the M40 896 Sweet: Especially since its happening without the M40 slotted in my system 897 Sweet: And stops happening the moment i turn above 4g off 898 Riehl: Huh... 899 Sweet: Looks like a motherboard quirk sio 900 Riehl: *nodnods!* 901 Sweet: Which i wish i knew before i bought the same board again xD 902 Sweet: I bought it because it had Gen2 and Above 4G support 903 Sweet: But turns out the Above 4G support is only partial 904 Riehl: How can it only be "partial?"
that's kinda weird 905 Sweet: Unless i can somehow solve it with drivers but i have not been able to solve it yet 906 Sweet: Like Above 4G support works 907 Sweet: But then my network adapter and one of the usb controllers does not 908 Riehl: but it messes with other controllers. 909 Riehl: hm. bios update maybe? 910 Sweet: I can't 911 Sweet: Its the second highest bios and the newer one introduces stability issues 912 Sweet: This is the highest stable bios you can put on the board 913 Sweet: And i am not repeating the downgrade again haha 914 Sweet: I don't want to throw a second one in the trash xD 915 Sweet: Not that downgrading would work, because lower versions don't have the Gen2 support 916 Riehl: [File URL attached] 917 Riehl: Are the PCI slots configured for "Gen 2"? 918 Riehl: i'm looking through bios issues / bugs / etc 919 Sweet: They are, originally i forgot and it didn't boot at all 920 Sweet: For reference, i am running the beta NV bios 921 Sweet: Its the second newest bios 922 Sweet: The newest beta bios breaks legacy boot completely 923 Sweet: And i like to use legacy boot when testing stuff 924 Riehl: @Henky!! any luck? :/ 925 Jacinta: Hi guys, anyone know why the k80s are so cheap right now? 926 Jacinta: If there even is a reason 927 Carena: Because they are being dumped by miners? 928 Jacinta: Eh, fair enough, because of ETH2.0 incoming? 929 Carena: Could be. 930 Sweet: K80's (and to some extent M40's) are a dead end in terms of support so datacenters don't want them. And performance wise they are weaker than a 1080 so miners don't want them. They are stupidly difficult to use at home and have no video outputs so gamers don't want them. Which leaves us. 931 Jacinta: Makes tons of sense, thank you 932 Carena: [File URL attached] 933 Sweet: Isn't that one that you have? 934 Carena: Nope, I have the Xavier 935 Sweet: Would it be any good for us / affordable?
936 Carena: It depends 937 Carena: Xavier was pricey 938 Carena: It's ampere technology, but I had difficulty getting the 2.7B to train already 939 Sweet: Looks like AMD users just got a much harder time getting it to work 940 Sweet: And as a result i am out of GPU testing capabilities since i now don't have a working M40 and i also no longer have a working GPU at all 941 Sweet: I am almost certain its the 4.5 ROCm update that broke compatibility 942 Sweet: But naturally AMD is terrible so they did not update the other dependencies 943 Sweet: On the bright side they may have finally added RDNA support 944 Sweet: So when the dependency hell can be resolved more people might be able to attempt to run it 945 Fowle: @Henky!! How could I try to run Kobold locally on my machine? 946 Sweet: Which hardware do you have? 947 Fowle: Since GPU is a bust 948 Sweet: If you're going for the CPU route, how much ram do you have? 949 Fowle: I have 40GB 950 Sweet: Lol, alright xD 951 Sweet: In that case if you are fine with longer generation times you can run the CPU version 952 Sweet: In your case the development version will probably work the best 953 Fowle: does the CPU matter?
954 Sweet: [File URL attached] 955 Sweet: It matters in how slow it will be 956 Sweet: But it will be able to run it 957 Fowle: 2600 958 Sweet: My 1700X handles it fine 959 Fowle: 3.8ghz boost 960 Sweet: Mine is 3,9Ghz 16 threads 961 Fowle: 6 cores 12 threads 962 Sweet: Yours might take a little longer but should be relatively close 963 Sweet: Expect to wait a minute 964 Fowle: ok thanks 965 Sweet: Longer once the stories go further 966 Sweet: So if you want to use it for slower more casual play with your ram it will work solid with the 2.7B's 967 Fowle: ok 968 Sweet: 6B expect a multi-minute wait 969 Sweet: You can still do it since you got enough ram 970 Sweet: You will need the HFJ models though, the HF don't work CPU only 971 Fowle: Does ram speed matter 972 Sweet: Nah 973 Sweet: Just ram size 974 Fowle: ok 975 Fowle: Thank you henky for your help 976 Sweet: No problem 😄 977 Fowle: @Henky!! One more thing, which script do I execute? 978 Sweet: install_requirements.bat as admin 979 Sweet: Then later play.bat 980 Fowle: I have linux 981 Sweet: Oh 982 Sweet: In that case you will either need to set everything up manually or use docker 983 Sweet: I recommend play-cuda.sh 984 Sweet: With docker and docker-compose installed it should get you going 985 Fowle: What if I have an AMD GPU 986 Sweet: Which one? 987 Fowle: 5500xt 988 Sweet: Don't do anything 989 Sweet: Don't even attempt it 990 Sweet: I tried it today and AMD broke their stuff 991 Fowle: oof 992 Sweet: if you try it now the chance of it working is 0 993 Sweet: The chance to get it to work later would be low too 994 Sweet: And your GPU was never supported 995 Fowle: So nothing has changed, oof 996 Sweet: However, with a bit of luck it might be in 4.5 997 Sweet: So its worth trying once its working again on my supported GPU 998 Fowle: So forget CPU play then?
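The "40GB is plenty" sizing above checks out with the same weight arithmetic as earlier in the channel, assuming CPU inference holds the model in fp32 (an assumption; exact overhead varies by framework):

```python
def cpu_ram_gb(n_params: float, bytes_per_param: int = 4) -> float:
    """System RAM needed just for fp32 weights on the CPU path."""
    return n_params * bytes_per_param / 1024**3

neo_27b = cpu_ram_gb(2.7e9)   # ~10 GiB -> comfortable in 40 GB
gptj_6b = cpu_ram_gb(6e9)     # ~22 GiB -> fits, with room for the OS
```

Which is why 2.7B runs casually while 6B is possible but leaves far less headroom, and why RAM size matters while RAM speed mostly doesn't.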
999 Sweet: With play-cuda.sh 1000 Fowle: ok 1001 Sweet: Just forget about the AMD side for now haha 1002 Sweet: You can still install docker and use the docker for nvidia 1003 Sweet: Although that might not support CPU play well 1004 Sweet: Let me look 1005 Sweet: Doesn't support CPU play properly 1006 Sweet: Can be easily changed to support CPU play instead though 1007 Sweet: You will have to make a tiny modification to it 1008 Fowle: What is the mod? 1009 Sweet: In play-cuda.sh it copies the finetune.yml file 1010 Sweet: That one only supports GPU usage 1011 Sweet: You will want to change that to the huggingface.yml file 1012 Sweet: So it downloads that one instead 1013 Fowle: what about env.yml? 1014 Sweet: That is where it copies it to 1015 Fowle: Ah ok 1016 Sweet: You may need to modify more 1017 Sweet: But we will see if it works 1018 Fowle: > ./play-cuda.sh: 4: docker-compose: not found 1019 Sweet: You need docker and docker-compose installed 1020 Sweet: Getting docker working is your own responsibility with your distro haha 1021 Sweet: Docker has good guides for pretty much all distro's 1022 Sweet: Which one do you have? 1023 Fowle: popos 1024 Sweet: So ubuntu 1025 Fowle: basically 1026 Sweet: Which one? 20.04 or 21? 
1027 Fowle: Pop!_OS 20.04 LTS 1028 Sweet: [File URL attached] 1029 Sweet: Thats how to get the proper docker engine 1030 Sweet: Once you have that docker-compose should be apt-get installable 1031 Sweet: But in summary since they list so many instructions 1032 Sweet: ```
sudo apt-get update
sudo apt-get install \
    ca-certificates \
    curl \
    gnupg \
    lsb-release
``` 1033 Sweet: ```
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
``` 1034 Sweet: ```
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
``` 1035 Sweet: ```
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
``` 1036 Sweet: Those ones from their manual are the relevant ones 1037 Sweet: Once you have all that it should be something like ```sudo apt-get install docker-compose``` 1038 Fowle: > ERROR: The Compose file './docker-compose.yml' is invalid because: > services.koboldai.deploy.resources.reservations value Additional properties are not allowed ('devices' was unexpected) > Unsupported config option for services.koboldai: 'group_add' 1039 Sweet: I wonder if you ended up getting an old version 1040 Sweet: Alright try this ```apt remove docker-compose``` 1041 Sweet: To get rid of it again 1042 Fowle: ok 1043 Fowle: removed 1044 Sweet: docker-compose giving the not found error again?
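The "'devices' was unexpected" error above is characteristic of a docker-compose that predates the GPU reservation schema, which is why the Ubuntu-packaged 1.25.0 fails where a pip-installed release works. A quick version sanity check; the 1.28.0 threshold is from memory of Docker's GPU-access documentation and should be treated as an assumption:

```python
MIN_COMPOSE = (1, 28, 0)  # assumed first version understanding
                          # deploy.resources.reservations.devices

def parse_version(v: str) -> tuple[int, ...]:
    """Turn '1.25.0' into (1, 25, 0) for tuple comparison."""
    return tuple(int(x) for x in v.split("."))

def supports_gpu_reservations(v: str) -> bool:
    return parse_version(v) >= MIN_COMPOSE

# Ubuntu 20.04's apt package is too old; pip pulls a recent release.
assert not supports_gpu_reservations("1.25.0")
```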
1045 Fowle: ./play-cuda.sh: 4: docker-compose: not found 1046 Fowle: yep 1047 Fowle: Now what 1048 Fowle: Trying again 1049 Fowle: But if I install it again the other error occurs 1050 Sweet: pip3 install docker-compose 1051 Sweet: So you have the new one 1052 Sweet: It has to be uninstalled though 1053 Sweet: It is 3am for me so i am not the most responsive atm xD 1054 Fowle: oof sorry 1055 Sweet: They think its a good idea to do construction at night 1056 Sweet: So ill be kept up until they finally stop 1057 Fowle: > Requirement already satisfied: docker-compose in /usr/lib/python3/dist-packages (1.25.0) 1058 Sweet: Did you uninstall it when you tried it again? 1059 Fowle: Ok now it's installing thanks 1060 Fowle: Error just now > ERROR: Service 'koboldai' failed to build : The command '/bin/bash -c apt update && apt install xorg -y' returned a non-zero code: 100 1061 Sweet: Don't know 1062 Sweet: I have not touched this on ubuntu in months 1063 Sweet: Neither do i normally use the cuda version, someone else made that one 1064 Sweet: Try something like 1065 Sweet: systemctl status docker 1066 Sweet: To see if its even running 1067 Sweet: You must also be a member of the docker group otherwise you will need to run the file with sudo 1068 Fowle: running 1069 Sweet: Try running it with sudo then 1070 Sweet: Maybe that will make it start 1071 Fowle: Full error message > rexommendation@pop-os:~/KoboldAI-united$ sudo ./play-cuda.sh > non-network local connections being added to access control list > Building koboldai > Step 1/5 : FROM mambaorg/micromamba > ---> 71c065a6981f > Step 2/5 : WORKDIR /content/ > ---> Using Cache > ---> c24add7fe95d > Step 3/5 : COPY env.yml /home/micromamba/env.yml > ---> Using Cache > ---> b0fd3c767a63 > Step 4/5 : RUN apt update && apt install xorg -y > ---> Running in 2f4ea236e366 > > WARNING: apt does not have a stable CLI interface. Use with caution in scripts. > > Reading package lists... 
> E: List directory /var/lib/apt/lists/partial is missing. - Acquire (13: Permission denied) > ERROR: Service 'koboldai' failed to build : The command '/bin/bash -c apt update && apt install xorg -y' returned a non-zero code: 100 1072 Sweet: Ill see if i get the same on manjaro 1073 Fowle: Ok let me know how that goes, I'll just use colab for now. Thanks for your support. 1074 Sweet: Getting the same thing here so something broke that docker environment 1075 Fowle: I guess I will just stick to the colab for now 1076 Sweet: I do have a fix for that, but currently having merge conflicts and i am not used to github's way of dealing with merging it all back 1077 Sweet: I don't want to create a branch on the main one 1078 Sweet: I just want it to apply the changes to ours 1079 Nevlin: If you have a local git client you can resolve merge conflicts that way 1080 Sweet: I got github.com and the github desktop client 1081 Sweet: The merge itself is obvious, its your version bump it doesn't like 1082 Sweet: But i don't want to clutter the real one 1083 Nevlin: You are trying to merge the official branch into united, right? 1084 Sweet: Yup 1085 Sweet: I fixed the cuda docker 1086 Sweet: So i need to merge it with your united changes intact, without those landing in a public branch 1087 Nevlin: ok let me do it 1088 Sweet: And i assume your dynamic world info merge was safe 1089 Sweet: I didn't get a good chance to test it, but if others here have i can allow that one to 1090 Sweet: Merged the dynamic world scan first 1091 Nevlin: opened a pull request 1092 Sweet: Was that effectively you cloning main on your end and then pushing it back into mine? 
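The merge flow being discussed can be sketched as below; branch names are assumed from the conversation (united as the development branch, main as the official one):

```shell
# Sketch only: merge the official branch into united locally,
# so nothing experimental lands in the public main branch.
git checkout united
git merge main            # conflicts stop here if the histories disagree
# ...resolve any conflicts in your editor, then:
git add -A
git commit                # completes the merge commit
git push origin united    # united gets the fix, main stays untouched
```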
1093 Nevlin: I did a `git checkout` to united, then `git merge`d the main branch into it, then manually fixed the merge conflicts in Visual Studio Code 1094 Sweet: @ReXommendation If you get the latest version you will have that file fixed its failing on 1095 Fowle: Ok thanks 1096 Sweet: It will still fail since you don't have an nvidia gpu 1097 Sweet: In the docker-cuda/docker-compose.yml file 1098 Sweet: Remove anything starting at devices: and lower 1099 Sweet: You only need the first half of the file 1100 Sweet: That will stop it from trying to look for a GPU that does not exist 1101 Sweet: Then you should have a CPU version instead of a cuda version 1102 Sweet: And don't forget to mod that first file to get the huggingface again if your updating all the files 1103 Fowle: Now do I just wait? > rexommendation@pop-os:~/KoboldAI-united$ sudo ./play-cuda.sh > non-network local connections being added to access control list > Building koboldai > Step 1/6 : FROM mambaorg/micromamba > ---> 71c065a6981f > Step 2/6 : WORKDIR /content/ > ---> Using Cache > ---> c24add7fe95d > Step 3/6 : COPY env.yml /home/micromamba/env.yml > ---> Using Cache > ---> b0fd3c767a63 > Step 4/6 : RUN micromamba install -y -n base -f /home/micromamba/env.yml > ---> Running in 5de05ffaeb4f > > __ > __ ______ ___ ____ _____ ___ / /_ ____ _ > / / / / __ `__ \/ __ `/ __ `__ \/ __ \/ __ `/ > / /_/ / / / / / / /_/ / / / / / / /_/ / /_/ / > / .___/_/ /_/ /_/\__,_/_/ /_/ /_/_.___/\__,_/ > /_/ 1104 Sweet: Yup 1105 Sweet: The display doesn't really work well but let it sit 1106 Fowle: Looks fine but copy and paste makes it look bad 1107 Sweet: In my case most of the install process got cut out from view by docker 1108 Sweet: But it downloaded a mini version of ubuntu inside the docker with the tools we use on windows to deploy all the dependencies 1109 Sweet: Right now its downloading all the python dependencies and setting it up for you 1110 Sweet: And then lastly it adds X11 to the docker so it can 
show you the file open dialogs 1111 Fowle: Ah ok 1112 Sweet: After that it will attempt to launch it 1113 Sweet: If you modified all the files correctly you get the main menu 1114 Fowle: How long will it take to download and install 1115 Sweet: Depends on your internet 1116 Sweet: Its around 10GB on windows 1117 Sweet: KoboldAI needs a LOT of dependencies haha 1118 Sweet: But this cuda docker is really nice 1119 Sweet: It uses the distribution tools i use for the windows side of things too 1120 Sweet: Its the AMD side that is annoying 1121 Sweet: They limit it to python 3.6 1122 Fowle: It seems frozen 1123 Sweet: Like i said, the display of it is bad 1124 Sweet: But its still going 1125 Fowle: ok 1126 Sweet: You can check how much CPU its using if you're unsure 1127 Sweet: It should be hogging your CPU pretty decently 1128 Fowle: Lol I can hear its fan 1129 Fowle: So in this version can I use softprompting? 1130 Sweet: Yup 1131 Fowle: Yee 1132 Sweet: The henk717 / united version has that feature 1133 Fowle: Still frozen 1134 Sweet: Whats your CPU usage? 1135 Fowle: bouncing 22-27 1136 Sweet: Which process? 1137 Fowle: total 1138 Sweet: Yeah but which process is causing it? 1139 Fowle: track-extract is 8 percent 1140 Sweet: I don't recall that one 1141 Sweet: Did it do a whole lot of lines that say linking? 1142 Sweet: Like what is your terminal saying? 1143 Fowle: lots of Finished blas (00m:00s) 1 KB 5 B/s Finished pyasn1 (00m:00s) 53 KB 226 B/s Finished tenso 1144 Fowle: unmoving 1145 Fowle: do I terminate it and retry? 
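When the build output looks frozen, the container itself can be checked from another terminal; a sketch (container names and IDs will differ per machine):

```shell
# Is the intermediate build container still running?
docker ps
# One-shot snapshot of CPU/memory use per container; a busy
# micromamba process here usually means it is still installing.
docker stats --no-stream
```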
1146 Sweet: The moving part of that one is out of sight 1147 Sweet: Its most likely still downloading 1148 Nevlin: The last one usually takes a really long time 1149 Sweet: If you terminate and retry on Linux it will automatically delete all your progress and you have to do it all again 1150 Sweet: Its safe to do so, it just means all your progress is lost 1151 Fowle: It's still going 1152 Fowle: I'm just waiting 1153 Fowle: Oh wait micromamba is taking 2% 1154 Sweet: 2% and probably all your bandwith haha 1155 Fowle: lol 1156 Fowle: 60 megabits lol 1157 Fowle: Linking 1158 Fowle: How long does it take for transformers to install? 1159 Fowle: or flask 1160 Nevlin: Did it finish downloading everything yet 1161 Nevlin: It took my computer like 30 minutes to install 1162 Fowle: > Successfully installed filelock-3.3.2 flask-cloudflared-0.0.5 huggingface-hub-0.1.2 joblib-1.1.0 packaging-21.2 pyparsing-2.4.7 pyyaml-6.0 regex-2021.11.10 sacremoses-0.0.46 tokenizers-0.10.3 tqdm-4.62.3 transformers-4.12.0.dev0 It's just waiting 1163 Nevlin: Well I've never installed via Docker before 1164 Nevlin: Can you leave it on for like 15 more minutes just to see what happens 1165 Fowle: Sure 1166 Sweet: Once it begins linking its in the final phase, depends a lot on your storage speed 1167 Fowle: It's doing nothing now 1168 Sweet: pytorch takes a while to link 1169 Sweet: If it fails you exit back to the terminal 1170 Fowle: I'm past the linking 1171 Sweet: Got an error? 
1172 Fowle: nope > Transaction finished 1173 Sweet: Whats most likely happening is that it finished creating the image and is packing it up 1174 Fowle: also accidentally pressed ctrl+c 1175 Sweet: Hopefully its not cancelling it then 1176 Fowle: trying to copy paste 1177 Fowle: it did 1178 Fowle: rerunning 1179 Sweet: Hopefully it at least completed mamba 1180 Sweet: Otherwise you have to do all that again xD 1181 Sweet: Either way i am going to call it a night, just be patient i am sure you will get far since i know the installation process should succeed at least beyond the steps that take long 1182 Fowle: Memory usage is getting kinda high 1183 Fowle: Cache 1184 Sweet: From the updates or from the program itself? 1185 Fowle: Just Cache is 1186 Sweet: Cache is supposed to be high 1187 Sweet: If Cache is high thats a good thing 1188 Fowle: ok 1189 Sweet: It means files it has in memory for faster access 1190 Sweet: Linux free's it up if you need actual ram 1191 Fowle: ok gn 1192 Sweet: This is what mine shows like 1193 Sweet: [Image attached] 1194 Fowle: interesting 1195 Jacinta: Found out that I can make an hacky modification to my server to allow beefy GPU support but I have to remove the motherboard to do it D: 1196 Jacinta: Does a k80 heat much by the way? They are fanless right? 1197 Carena: They need to be cooled 1198 Jacinta: Idea discarded then, the PCIs are outside the fan shroud. The official GPU kit adds 4 fan as well but apparently it can't be installed, you either have it already or you don't. 1199 Sweet: Got good news for AMD users, i got my system back up and running 1200 Sweet: Not with docker though, i am still uncertain what is going wrong on that part 1201 Sweet: If you want to run KoboldAI on AMD i recommend installing it manually with the requirements.txt that is bundled 1202 Sweet: Then, go to the pytorch website, find the pip install command for the rocm torch. And install this version on top. 
That replaces the regular pytorch with the rocm version. Then if you manage to resolve the other distro specific dependency hell (For me all i needed to do was install tk as a package) it will work. 1203 Sweet: Meanwhile in my M40 saga, i got my USB network adapter yesterday and it has been working wonderfully! Its a USB hub + network adapter in one so i can now afford to lose both my network card and the 2 USB ports i lose when i enable the Above 4G crypto 1204 Tmas: Here's an excel report on all the GPU options geared toward kobold ai models/maybe gaming with my recommendations. Feel free to suggest changes. 1205 Sweet: @Tmas 3060's also have 12GB of vram, and for people who want to go for a cheap "rig it together and pray that it works" solution you could go with K80's or M40's 1206 Fretwell: i checked the price for a k80 its around 3-5k euros 1207 Fretwell: i might have looked at the wrong thing i think 1208 Fretwell: the one i saw was an nvidia quadro tesla80 1209 Carena: New, yes... 1210 Fretwell: isn't it a bad idea to buy already used gpu's? You don't know how much they will last. 1211 Tmas: I missed that. I'll revise my report. 1212 Tmas: Depends. Server grade GPUs most likely were heavily used, put in storage for x amount of time, then perhaps sold second hand. Consumer cards are pretty safe in my experience with buying them. 1213 Fretwell: would a k80 fall under Server grade or Consumer Grade? 1214 Sweet: Server grade 1215 Sweet: So no fan, needs mods to be cool enough and a good motherboard that can handle it 1216 Tmas: Correct 1217 Fretwell: idk if my motherboard can use it 1218 Fretwell: i don't even think i can add more ram to it 1219 Tmas: Should be a standard x16 lane 1220 Tmas: Ram or a GPU? 1221 Fretwell: from what i read the max ram my motherboard can use is 16GB 1222 Sweet: These K80's and M40's you need Above 4G decoding support 1223 Sweet: And PCI Gen2 1224 Fretwell: baseboard is the motherboard right? 
1225 Sweet: Would be yes 1226 Fretwell: for some reason i can't find my motherboard online, there is the normal one but mine is a cf one 1227 Fretwell: the gen2 is PCI express 2.0? 1228 Sweet: I don't think so 1229 Sweet: Because mine is PCI Express 3.0 and its a seperate option in the bios 1230 Fretwell: i found the specs for mine 1231 Sweet: Which board is it? 1232 Fretwell: [Image attached] 1233 Fretwell: i also ordered a new hdd so i don't overstress my ssd 1234 Sweet: @Tmas RX6900's are flatout a bad recommendation in your list. They don't support any acceleration at all. ROCm is not supported on those either. 1235 Tmas: Noted. Wasn't sure about AMD cards so I tried to give them the benefit of the doubt. 1236 Sweet: [File URL attached] 1237 Sweet: For consumer cards only the Vega's can run with 8GB but its not really worth the hassle of having people use Linux 1238 Sweet: I don't think we should recommend people to buy AMD until they improve ROCm support 1239 Riehl: [File URL attached] 1240 Riehl: $195 1241 Yup: >Ready for resale 1242 Yup: [Image attached] 1243 Riehl: they absolutely need to be cooled. There are a variety of ways to cool it-- 1. You can purchase a 3d-printed mount for the back and mount a blower 2. You can unscrew the hex-screws on the top of the card and remove the case - allowing you to either use a large fan (Like Henky does) or you can use an independant 2-3 fan "card" which sits right next to the K80 and blows air directly onto it. Some folks even use thermal metal 'tape' to press the fans directly to the heat sinks - though I'm not sure that's a good idea. 1244 Riehl: I'm using a used K80 that I bought for $150. The K80 is an out-dated mining card which isn't useful for bitcoin mining at this point. There are warehouses full of the things. 1245 Riehl: I mean, it's a risk? But i've built several machines for half the price of new parts. It's a gamble. 
:/ 1246 Yup: Still, that's more on the "used" side than "unused" 1247 Riehl: Brand New, you're looking at 300-400 dollars, ... typically from China, 1248 Yup: anyway my location is so remote that in here the mining farm sold is, as I just checked that ad, made of 6 x 1070 1249 Riehl: ngl, ... they're *probably* used. But in good-looking shape. 1250 Yup: "Need money urgently" 1251 Yup: No point to even joke anymore 1252 Riehl: oof 1253 Yup: Server-usage could have less used cards, but those are less resold and more used till death 1254 Yup: Somewhat funny that my quite obsolete 750Ti *still* has more CUDA cores than the recently purchased 940MX (on a laptop) 1255 Yup: they're on the same screen 1256 Yup: (on that note, what the hell is "GRID"? apparently something about virtual desktops from 2015 and so on) 1257 Sweet: GRID is what you use to power virtual machines in case you want to make your own geforce now or need it for VDI. Because Nvidia is scummy they need that expensive card on top of a license which i think is a monthly paid one. Makes it completely out of budget for most smaller businesses. 1258 Fretwell: i got my new hdd but the pc doesn't detect it 1259 Riehl: is it an SSD? 1260 Fretwell: a 2T hdd 1261 Fretwell: ssd's don't have as long a lifespan as an hdd 1262 Fretwell: atm i have the new hdd inside the computer case while the old one i have it on an external docking station 1263 Sweet: Right click on the startmenu and choose the disk management, if its detected you will get a popup to initialize it (I recommend GPT due to its size). Then you can create a partition there. HDD's have no drive letters by default. 1264 Sweet: SSD's these days have a longer lifespan 1265 Fretwell: this is everything in the disk management 1266 Sweet: Hows the part look on the bottom? 
1267 Fretwell: only these 2 appear 1268 Sweet: No not that haha 1269 Sweet: In disk management on the bottom 1270 Sweet: It should show the disks 1271 Fretwell: [Image attached] 1272 Sweet: Not a third one? 1273 Fretwell: nope 1274 Sweet: Then its not detected on a hardware level 1275 Sweet: Got it connected to the motherboard and power? 1276 Fretwell: used same wire as for the old one 1277 Fretwell: atm the old one is connected on a docking rack on the pc 1278 Sweet: What if you put the new one on that? 1279 Fretwell: is already in 1280 Sweet: The docking rack or in the PC? 1281 Fretwell: in the pc 1282 Sweet: The PC is not seeing it though 1283 Sweet: The tricky part is even if it does see it you won't get a drive letter 1284 Fretwell: do i need to manually install the drive for the hdd? 1285 Sweet: You need to keep checking in disk management 1286 Sweet: And then make a partition once it shows up 1287 Fretwell: i just plugged the hard disk in after i unpacked it 1288 Tmas: Revised list based on @Henky!!'s suggestion. 1289 Sweet: Don't have the time and energy to really revise it further, but i made some quick changes. AMD RX is a terrible recommendation and should only ever be done if people have them since its not officially supported at all and in terms of the higher RX's they are not supported in general. Vega's if people want to use Linux and are super cheap or if they already have them could be used but i didn't add them. I did add my M40 to the mix since its better than a K80 but shares its issues. 1290 Riehl: Is your harddrive formated for Legacy Bios or UEFI bios? ... Also -- the life of an SSD is largely based on the maker of the ssd drive. 1291 Riehl: [Image attached] 1292 Riehl: If your BIOS is set to Legacy boot -- (and likely the OS is installed in Legacy mode ... which is bad news in itself) ... the SSD won't be recognized. 
1293 Fretwell: it's an hdd 1294 Fretwell: from stargate 1295 Fretwell: i didn't format it or anything 1296 Fretwell: just unpacked it and plugged it in the pc 1297 Riehl: But nothing is showing up, huh. 1298 Riehl: well - if you have another computer available, you can try connecting it and seeing if that computer reads it. If it does, the hdd is at least good and it's something going on with communicating with the computer you're putting it in. 1299 Fretwell: i did connect a verry old hard disk from 2017 and it detected it 1300 Fretwell: is there a limit to hard disk size from the motherboard? 1301 Sweet: Limits from the motherboard should not be a thing if you format it in the GPT layout. UEFI motherboards can just handle it, and if you got something ancient like my retro PC windows would. 1302 Fretwell: i didn't format the hdd 1303 Fretwell: oh and also seems nvidia released a new gpu, an rtx 2060 with 12 gb of ram 1304 Fretwell: and it's almost the same price as a 3060 1305 Sweet: So its no use haha 1306 Sweet: People might as well buy the 3060 1307 Fretwell: for runing an ai it's better to buy the 2060 12gb ram because it has more vram 1308 Sweet: 3060 also has 12 1309 Sweet: The Ti has 8 because logic but the regular has 12 1310 Fretwell: isn't the Ti supposed to be better? 1311 Sweet: The 3000 range is completely illogical 1312 Sweet: The 3060 has 12gb of vram 1313 Sweet: 3070 has 8 1314 Sweet: 3060Ti has 8 1315 Fretwell: that makes no sense 1316 Fretwell: a higher model having less vram 1317 Fretwell: and quite a lot less aswell 1318 Sweet: 3080 has 10 1319 Sweet: 3080ti has 12 1320 Sweet: I think its because of the shortages 1321 Fretwell: 3070 is almost twice the price of a 3060 1322 Sweet: 3060 is probably a gimped version of something better that was to broken to be that 1323 Fretwell: whi would you buy a 3070? 
its only 58% better 1324 Fretwell: if its twice the price just buy 2 3060, you get way more vram that way 1325 Sweet: For most people speed is more important than VRAM 1326 Fretwell: the 3090 seems to have 24 gb of vram apparently 1327 Sweet: Yup, that one is the best consumer GPU for us but its very expensive 1328 Fretwell: i think if i were to choose between a new tesla k80 or an nvidia 3090 i would go with the 3090, it is much cheaper than a new tesla k80 1329 Fretwell: i think the 3090 is faster than a tesla k80 1330 Fretwell: yes it is way faster the bandwidth is more than 3x faster 1331 Fretwell: also the base clock speed is much faster 1332 Sweet: 3090 beats the K80 in everything 1333 Sweet: Unless the K80 is significantly cheaper the 3090 is the way to go 1334 Fretwell: the k80 has a much higher process size 1335 Sweet: Which is bad 1336 Fretwell: rtx 3090 has an 8nm process size while the k80 has 28nm 1337 Fretwell: the 3090 has gddr6x memory while the k80 has gddr5 1338 Fretwell: also the 3090 supports a way newer version of cuda 1339 Fretwell: the left one is the 3090 and the right one is the k80 1340 Masry: The last time i checked the k80 was way cheaper. 400 dollars for a new k80 vs 2000 for a new 3090 1341 Lavinia: I'm sad the 3090ti is still 24gb. Was hoping for 48. Maybe even 32. Ram is faster so go gaming performance... 1342 Lavinia: I played Icarus the other night and cranked up the textures and the texture pool to max. Managed to use 20gb of VRAM. 1343 Carlock: If someone could share their experience running let's say 2.7b AID on a 3060 12gb I would appreciate their thoughts on how it is. What the response time is like, etc. Thanks! 1344 Sweet: I got a weaker GPU than that and mine you can expect around 2-10 second response times 1345 Sweet: You're going to have a very good experience with 2.7B's 1346 Carlock: Ohh I was just asking because a german site supposedly has it in stock, no idea if they actually ship it but I'm getting tempted. Thanks! 
1347 Fretwell: finally it detects it 1348 Riehl: what was the issue? 1349 Fretwell: i didn't defragment the hdd 1350 Fretwell: had to put it in a docking station and detect it from disk management 1351 Sweet: Format / partition you mean 1352 Sweet: But thats why i advised disk management as the check if it was connected properly 1353 Carlock: Ok, so anyone who is interested in upgrading their GPU's in these dark times: I got a 3060 12GB in a Ryzen 3600 + 32 GB (kinda meh, kinda slow RAM) system. I'm playing with the model that has the folder name of: "gpt-neo-2.7B-aid", I think it's linked in Kobold AI's wiki. I'm not able to run this purely on GPU, however I'm able to load it in GPU+CPU mode with 0 layers allocated to system ram. With this setup it takes around 6 seconds to generate a response which is kind of crazy. 1354 Fretwell: for me it takes 12 seconds to generate on 2.7B 1355 Carena: 12Gb should be enough to run a 2.7B model, unsure why you can't load a full model in there, unless you use Windows... 1356 Sweet: Even on windows it should have fit 1357 Sweet: Unless he uses the official transformers on 0.16 then it would be very inefficient and a tight fit 1358 Carlock: Ohh yeah, I didn't reinstall and I was using the official transformers, reinstalled now and that's lowered the response time to 4 seconds! Thanks! 
1359 Sweet: If you switch to our development version http://github.com/henk717/koboldai and the official transformers that one installs you get an even better version of KoboldAI 😄 1360 Sweet: Its nearing the end of its development cycle 1361 Emelda: Perhaps a bit too hypothetical, but it would be nice if we could get a version of KoboldAI to run on Maker/Hacker SBCs, like the Jetson Nano: https://developer.nvidia.com/embedded/jetson-nano-developer-kit 1362 Sweet: That specific one they show is to weak 1363 Sweet: 4GB of shared memory won't get you enough to run any good model 1364 Sweet: Your regular PC will be more suitable 1365 Emelda: That would be the "too hypothetical" part. 1366 Sweet: And for context, KoboldAI runs on anything that has the python dependencies we need. KoboldAI itself can't be ported to anything since it does not need to be as its platform agnostic, people would need to port the required dependencies. 1367 Sweet: So if people had a version of pytorch and transformers for that specific platform chances are KoboldAI will work on it 1368 Emelda: I know platforms like that have Python, I don't know about specific things like pytorch, though. 1369 Mirelle: Go look at the Tesla M40. They come in a 12 and a 24 model and can be had for 150-275. Their compute is old so slower than a 3090, but faster than a K80 (and you don't have to worry about the dual GPU and memory partitioning of a K40). I've got one and can run GPT-J-6B at max everything and still have a TINY amount of vram left 🙂 1370 Sweet: I got one to, but not yet hooked up 1371 Sweet: Cooler for it will arrive somewhere next month 1372 Sweet: Its been a bit of a painful journey though xD 1373 Mirelle: I getto'ed it with duct tape (the silver reflecting kind) to make a shroud. 
I'm using it with a GPU miner external connect so it's not too bad to be geto there 1374 Sweet: In my case i bought one and tried it out 1375 Sweet: Then i no longer had working internet and lost a USB controller 1376 Sweet: Now i got a substitute with a really good USB hub with built in ethernet that has been shockingly good 1377 Sweet: But the cooling will also be tricky 1378 Sweet: So the one i ordered is a kit 1379 Sweet: Its a 3D printed mount 1380 Sweet: And on top of that is a server fan 1381 Mirelle: [Image attached] 1382 Sweet: Thats so ghetto xD 1383 Sweet: But will be quieter than mine 1384 Sweet: Is that even inside your case? 1385 Mirelle: Nope. Sitting on a shelf in my rack. Using one of the miner 1x riser cards with a USB A-A cable long enough to go out the back of the server case and up to the shelf. 1386 Mirelle: Super dusty in the garage, but racks are loud, so banned from the house it is. 1387 Sweet: Yeah i'm also worried for how loud my PC will become once i have it in 1388 Sweet: I have such a silent PC right now 1389 Sweet: I just hope the card won't overheat if i don't cool it and don't use it 1390 Fretwell: how loud can it get? 1391 Sweet: 60DB 1392 Sweet: Which is why i am trying to go for a fan i can control with my motherboard 1393 Fretwell: mine become a bit loud when it the ussage goes up a bit 1394 Sweet: Since i would not want a permanent 60db in my room 1395 Sweet: Your card has no server fans right? 1396 Fretwell: i do hear a small grinding sound idk if it's the cpu or gpu 1397 Sweet: Because we are talking this kind of loud : https://www.youtube.com/watch?v=qgXcYp6rn_0 1398 Fretwell: my card has 2 fans on it which came with the gpu 1399 Sweet: GPU fans have nothing on server fans xD 1400 Sweet: My GPU can get loud to, but not this loud 1401 Fretwell: are the server fans any good? 1402 Mirelle: That setup is quite quiet if I turn the fan speed down. When it's generating it's no louder than what I'd expect from a CPU fan. 
1403 Sweet: They are way faster than desktop fans and much smaller 1404 Mirelle: The rack though, I've got an infiniband switch, and it is super loud and high pitched 1405 Fretwell: wait can i control the fan speed using the nvidia control panel? 1406 Mirelle: Server fans prioritize air movement over everything, so they are loud and high pitched 1407 Sweet: Possibly, otherwise afterburner can probably do it 1408 Sweet: That and a small size 1409 Mirelle: I hooked mine up to a spare fan controller port on my motherboard, then used some software (don't recall what) to tie that to the gpu temp 1410 Fretwell: i don't think i have issues with cooling my gpu never goes 80 degrees 1411 Sweet: They also had a blower fan version which looked neater, but that one had a molex connector 1412 Sweet: And i refuse to have the blower fan be as loud as my retro PC 1413 Sweet: My retro PC also has a blower fan at 100% at all times 1414 Mirelle: Ya, a blower would probably work well, but would be louder 1415 Fretwell: what about water cooling? 1416 Sweet: And its fine when i want to use my retro PC, but its not fine to have that on permanently in the room every day 1417 Sweet: I don't do water cooling 1418 Sweet: Never liked the idea of having water in my PC and its also less quiet than my current setup 1419 Mirelle: I've done lots of water cooling (current daily is GPU water cooled), but water blocking an M40 would be difficult 1420 Sweet: But i'm basically hoping the card will stay cool at idle with minimal airflow 1421 Sweet: Then i can program my bios to just never turn that fan on 1422 Sweet: And when i use it i can then use a tool in Windows to turn it up 1423 Fretwell: wouln't it be easyer to but the gpu inside a fridge? 
1424 Mirelle: The M40 stays cool under idle with almost no air-flow for me 1425 Sweet: Great to hear since i was planning to install it in my system soon 1426 Sweet: Even before the fan arrives 1427 Sweet: Kinda depends on how long shipping will take though 1428 Mirelle: If you're thinking idle with no fan, then it'll get tosty 1429 Sweet: Idle with a tiny little bit of front fan? 1430 Mirelle: but idle with like 500rpm fan would probably be OK 1431 Mirelle: Don't know. Depends on the airflow in the case. I tried it in my server case (three 120s blowing through the back) and it couldn't handle a load. Didn't try an idle 1432 Fretwell: i m thinking of getting a 3090 which has 12 or 24vram 1433 Sweet: Ill spare myself the fanless attempt then 1434 Sweet: 3090 with 12 would be a waste of money 1435 Sweet: Then just get a 3060 1436 Mirelle: Unless you want to game and sometimes do AI 1437 Fretwell: mostly for gaming 1438 Mirelle: But for AI 12G 3090 is a bit of a waste 1439 Sweet: Still, 3090 kind of budget with only 12GB vram isn't worth it 1440 Sweet: Go all the way or go cheaper xD 1441 Mirelle: Ya, if you're dropping that kind of cash (assuming you don't go crazy with scalpers).... 1442 Fretwell: but for a while a barrely played any games, just listening to music and watching videos or reading stuff 1443 Fretwell: for now i might just get a better cpu and 16 more ram 1444 Fretwell: i mostly play mc and i need a better cpu so i can render chunks faster 1445 Sweet: Ram first 1446 Sweet: Unless your new CPU requires a new board 1447 Fretwell: 1.18.1 is a ram hog, i get ram issues with only 100 mods 1448 Fretwell: and constant freezes 1449 Sweet: My modpack was a ram hog to 1450 Sweet: And thats on 1.70 1451 Fretwell: 1.7.10 barrely took any ram 1452 Fretwell: i was able to run some pretty huge modpacks with 8 ram 1453 Fretwell: and it had over 400+ mods 1454 Mirelle: I'll show my ignorance, what's mc? 1455 Fretwell: minecraft 1456 Mirelle: Ahhhh. 
Should have guessed 1457 Fretwell: this is how much ram i have allocated 1458 Fretwell: on fabric i have no issues with ram i have over 250+ mods and the ram barelly goes over 12 but mostly noticed that mc is mostly dependent on the cpu and ram 1459 Sweet: I'd still need to remake my minecraft launcher so i can't easily launch my mod 1460 Sweet: It uses mojang accounts 1461 Sweet: But it was effectively minecraft on steroids 1462 Sweet: But made to feel like old minecraft and with a persistant inventory 1463 Sweet: So no parrying or anything like that, no food bars 1464 Sweet: Still the spam click combat system 1465 Sweet: Minecraft as i used to enjoy it but with much more in it 1466 Fretwell: now they forced account migration 1467 Sweet: Yeah, so nobody can use my mod pack xD 1468 Fretwell: i personally like both combats as much 1469 Fretwell: for pvp the spam click one is better but for pve the new one is more rewarding 1470 Sweet: Given the toughness of everything in the world the spam clicking is desirable 1471 Sweet: The mod pack is full of stuff 1472 Sweet: Many different worlds 1473 Sweet: Difficult creatures 1474 Sweet: Boss mobs 1475 Sweet: Etc 1476 Fretwell: well the only thing that the spam click would be good is agains the ice and fire dragons 1477 Sweet: Or these for example 1478 Sweet: [Image attached] 1479 Fretwell: that looks alot like the warden ngl 1480 Sweet: They can ignite stuff and do quite a decent bit of knocking damage 1481 Sweet: And they can jump up things 1482 Sweet: So you can't just get on a tree or roof 1483 Fretwell: are they from advent of ascension? 
1484 Sweet: Yup 1485 Sweet: Advent of ascention is in the pack 1486 Fretwell: its a pretty good rpg mod ngl 1487 Sweet: Old version of Lycanites Mobs is in there to 1488 Sweet: Same with generation raidable dungeons 1489 Sweet: And that one that spawns like raidable casles and fortresses 1490 Sweet: But also the good tech mods, farm mods, fishing mods 1491 Sweet: Quake movement 1492 Sweet: So you can do things like air strafing and bunny hopping 1493 Fretwell: seems the mod won't get updates anymore, there seems that something happened to the mod creator 1494 Fretwell: if you like the raidable castles and fortresses you should take a look at when dungeons arise it has quite a few massive structures that are acually pretty well built in terms of design and they are also challanging 1495 Sweet: There is one version of minecraft i'd love to play but does not exist 1496 Sweet: Let me see if i can find it 1497 Fretwell: which one? 1498 Sweet: To many minecraft content on youtube to find it back 1499 Sweet: But it basically turned minecraft into a super good RPG, where you could really befriend the villagers, have kingdoms in the world, etc 1500 Sweet: And then use them to do raids on other ones, protect your community, etc 1501 Fretwell: oh 1502 Fretwell: i think i have that modpack 1503 Fretwell: i think it was called mineshadts and monsters 1504 Fretwell: and it is for 1.16.4 1505 Sweet: Looks neat, but i'm currently not going to redo an entire modpack from scratch xD 1506 Fretwell: i think the world i m tryng to generate is broken lol 1507 Sweet: [File URL attached] 1508 Sweet: @Valerian I finally have the M40 up and running 😄 1509 Sweet: And you should definately update your cuda driver 1510 Sweet: Anything higher is compatible, anything lower isn't so updating to 11.6 saves you hassle 1511 Sweet: So far mine has been keeping steady with the fan on 50% when in use 1512 Sweet: And i might get away with the fan off entirely if its not recently been used 1513 Sweet: 
Otherwise a very low fan speed is enough to hold it, but not enough to run multiple generations in a row
Sweet: So i am still trying to find the sweet spot
Sweet: Oddly i'm noticing far longer loading times now that i have the GPU
Sweet: Even before it hits the GPU
Sweet: In WDDM mode i do indeed lose 2GB of VRAM, but with the benefit that Argus Monitor can detect the temp
Sweet: So now i have dynamic fan speed for the gpu in software 😄
Riehl: You can also train locally now, too!
Riehl: I think I need to talk to VE about getting the LS scanner working, the LUA keeps breaking. 😦
Sweet: Training shat itself, but running so far goes great 😄
Sweet: Very pleased with my final setup, it runs passive when i am not using it, and when it hits a certain temperature it automatically turns the loud fan on
Riehl: If you do local soft prompt training, you have to use a special "break model" that VE made.
Riehl: but happy to hear!
Riehl: 6B awesome. Pity I can't run any of the newer models. *sadness*
Sweet: I don't think breakmodel benefits me, mine is a single GPU
Sweet: United has breakmodel for those now, needs testing
Riehl: I could try? which ones would you recommend?
Sweet: Fairseq 6B perhaps?
Sweet: --model KoboldAI/fairseq-dense-6.7B
Mirelle: @Henky!! I found, at least on linux, probably windows too, the M40 sits at ~70 watts of power with KoboldAI even when KoboldAI is idle. If you run this command (nvidia-smi -ac 405,324) it reduces that power to 16 watts (and helps with heat). Turns out that if something is loaded in VRAM, the mem clock stays at max, even though nothing is happening. Running that command sets it down to 405, but it still spikes up to the previous value when in use.
Carena: That command does not work with every application, mind you. 
I don't have nvidia-smi ;)
Sweet: It's bundled with our M40 drivers, not all nvidia cards would have that indeed
Sweet: Neat, i indeed noticed it could be high
Sweet: I tried nvidia-smi -ac 3004,1114 so it would be allowed to have high clocks, and the power savings seem to persist
Sweet: Looks like it could be a hardware / driver bug that gets fixed once you set something
Sweet: My command seems placebo
Sweet: Mine just sometimes stays at 69 watts for a little while, but always goes back even on stock settings
Fretwell: i'm planning on getting a new gpu, is it possible to run kobold with an rtx 3060 and 2060 at the same time?
Carena: It is. It just splits the layers
Fretwell: like, are the cuda versions compatible?
Fretwell: an rtx 3060 is slightly more expensive than my current gpu
Carena: CUDA support for the 2060 might expire sooner than the 3060
Fretwell: would 14 gb of vram be enough for 6B?
Carena: 14GB on Linux maybe, but note that Microsoft requires 20% of the GPU for its own purposes
Fretwell: if it's 2gb less than enough i should be able to get 10 layers full tokens
Mirelle: Huh. On mine it's definitely real. I moved the GPU in my case and my ghetto-fan method didn't work. If it's sitting at the 60W load it hits 89 degrees (new fan is coming today). If I run that command, it stays at 17W or so and runs at like 30-40 degrees.
Carena: I am in the process of getting accurate numbers on the fine-tuning speed, and having said that, I might also be able to calculate the speed at which you fine-tune. I do need some data though, which is the speed at which your graphics card can fine-tune a 125M model.
Jacinta: @mr_seeker Don't know how much data you managed to collect, but if you want I can make an ML performance model to fill some gaps
Jacinta: Just did that for another project, although accuracy on extrapolation is not that great
Carena: ML performance model? 
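[Editor's note: the VRAM question a few messages up (whether 14 GB is enough for a 6B model once Windows/WDDM reserves ~20% of each card) can be sketched with rough arithmetic. The figures below — 28 transformer layers and ~12 GB of fp16 weights for a GPT-J-class 6B model — are ballpark assumptions for illustration, not numbers from this chat.]

```python
# Rough sketch: how many layers of a 6B model fit in a given VRAM budget.
# All model numbers are ballpark assumptions (GPT-J-class), not measurements.

def usable_vram(total_gb, wddm_reserved=0.20):
    """VRAM left for the model; WDDM on Windows reserves roughly 20% per GPU."""
    return total_gb * (1.0 - wddm_reserved)

def layers_that_fit(vram_gb, n_layers=28, weights_gb=12.0):
    """Layers that fit, assuming weights are spread evenly across layers."""
    per_layer_gb = weights_gb / n_layers
    return min(n_layers, int(vram_gb // per_layer_gb))

for total in (14, 24):
    free = usable_vram(total)
    print(f"{total} GB total: ~{free:.1f} GB usable, "
          f"~{layers_that_fit(free)} of 28 layers fit")
```

With these assumptions, 14 GB minus the WDDM reserve leaves about 11 GB, i.e. most but not all of the layers on GPU, which matches the "2gb less than enough" estimate in the chat.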
Jacinta: Machine learning, so we can get an estimate on how it would perform on a card for which we have no data
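[Editor's note: the "ML performance model" idea above — fit a model on cards you have measured, then estimate a card you have no data for — can be sketched minimally as a least-squares fit of fine-tune speed against a card's FP16 TFLOPS. The data points below are made-up placeholders, not measurements from this chat.]

```python
# Minimal sketch of a performance model: fit fine-tune speed (tokens/sec on
# a 125M model) against a card's FP16 TFLOPS, then extrapolate to an
# unmeasured card. The data points are hypothetical placeholders.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# (fp16 tflops, measured tokens/sec) -- hypothetical training data
data = [(6.8, 11.0), (13.4, 21.0), (29.8, 45.5)]
a, b = fit_line([t for t, _ in data], [s for _, s in data])

def predict(tflops):
    return a * tflops + b

print(f"estimated speed on a 40-TFLOPS card: {predict(40.0):.1f} tok/s")
```

As Jacinta notes in the chat, extrapolation accuracy from a model like this is limited, since real throughput also depends on memory bandwidth, VRAM, and driver behaviour.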