Build farm fried

Started by Endolf, March 14, 2010, 08:19:04


Endolf

Just a quick message to let you know the master node in the build farm fried its disk some time yesterday. I'm on the case, but I'm not sure how long it will take to resurrect; it depends on whether I can find a spare disk and some backups :)

Endolf

princec

I guess we can wait until there's some OpenGL4 drivers out ;)

Cas :)

Endolf

Ok, looks like I've managed to rescue it. It only took 5 hours, and that includes upgrading VirtualBox on one system and upgrading Hudson. Nowhere near as bad as I feared.

Happy CI + Nightly builds everyone :)

Endolf

elias4444

Quote
I guess we can wait until there's some OpenGL4 drivers out
What do you mean? I'm still waiting for OpenGL 3 drivers to come out! (and in some cases, even OpenGL 2!!)
=-=-=-=-=-======-=-=-=-=-=-
http://www.tommytwisters.com

spasi

Endolf, letting you know that I've added timestamp checking to the Generator, just like ant is able to skip recompiling .java files that haven't changed. A template is now only generated when it has changed since the last build, or if anything in the Generator classes has changed. It should lighten up the builds a lot.
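For anyone curious what that kind of check looks like, here is a minimal sketch. It is not the actual Generator code; the class name `StaleCheck` and the methods `isStale`, `newestIn` and `needsRegeneration` are made up for illustration, and it assumes each template maps to a single output file. The idea is just to compare modification times: regenerate when the output is missing, or when the template or anything under the Generator classes directory is newer than the output.

```java
import java.io.File;

public class StaleCheck {

    // Output is stale if it does not exist (File.lastModified() returns 0L
    // for a missing file) or if any input is newer than it.
    static boolean isStale(long outputModified, long... inputsModified) {
        if (outputModified == 0L) {
            return true;
        }
        for (long input : inputsModified) {
            if (input > outputModified) {
                return true;
            }
        }
        return false;
    }

    // Newest modification time of a file, or of anything under a directory.
    static long newestIn(File file) {
        if (!file.isDirectory()) {
            return file.lastModified();
        }
        long newest = 0L;
        File[] children = file.listFiles();
        if (children != null) {
            for (File child : children) {
                newest = Math.max(newest, newestIn(child));
            }
        }
        return newest;
    }

    // Hypothetical entry point: one template, one generated output, and the
    // directory (or jar) holding the compiled generator classes.
    static boolean needsRegeneration(File template, File generatorClasses, File output) {
        return isStale(output.lastModified(),
                       template.lastModified(),
                       newestIn(generatorClasses));
    }

    public static void main(String[] args) {
        if (args.length != 3) {
            System.out.println("usage: StaleCheck <template> <generatorClasses> <output>");
            return;
        }
        boolean stale = needsRegeneration(new File(args[0]), new File(args[1]), new File(args[2]));
        System.out.println(stale ? "regenerate" : "up to date");
    }
}
```

Treating the generator's own classes as an input is what covers the "anything in the Generator classes has changed" case: a change to the generator makes every output stale at once.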

Quote from: elias4444 on March 14, 2010, 18:30:45
What do you mean? I'm still waiting for OpenGL 3 drivers to come out! (and in some cases, even OpenGL 2!!)

What's wrong with the current 3.2 drivers? :P

Endolf

Quote from: spasi on March 16, 2010, 19:13:36
A template is now only generated when it has changed since the last build, or if anything in the Generator classes has changed. It should lighten up the builds a lot.

Cool, sounds good, that should mean the build farm is idle for 23 hours 45 mins a day instead of 23 hours 40 ;p

Unless you plan on committing often and having CI builds run multiple times a day :)

Endolf

princec

Quote from: spasi on March 16, 2010, 19:13:36
What's wrong with the current 3.2 drivers? :P
Apple, that's what's wrong :/

Cas :)

spasi

Quote from: Endolf on March 16, 2010, 20:03:17
Cool, sounds good, that should mean the build farm is idle for 23 hours 45 mins a day instead of 23 hours 40 ;p

Unless you plan on committing often and having CI builds run multiple times a day :)

Hehe, ok, wasn't sure what else was running on that server. I did this mostly for me anyway; 3+ minute compile times were too long every time I made a small change.

Quote from: princec on March 17, 2010, 09:44:26
Apple, that's what's wrong :/

Yeah, Macs are stuck on 2.1, aren't they? But I believe they do expose quite a few extensions that cover 3.0+ functionality? Anyway, let's hope Valve's move to port Source to MacOS will be a wake-up call for Apple.

Endolf

Quote from: spasi on March 17, 2010, 14:18:37
Hehe, ok, wasn't sure what else was running on that server.

There are multiple machines in the build farm. They do CI builds for LWJGL, and nightlies for LWJGL and JInput (and jutils, but only as a dependency). I also do private CI builds for Ardor3D and a bunch of my own libraries and projects. Mostly those servers just turn electricity into noise and heat, but for 30-60 mins a day they compile stuff too :)

Endolf

Endolf

Sorry folks, it's gone pop again. It looks like a drive is failing; it has taken out the win32 virtual machine this time (it took out the linux boot area last time). I'm going to have to order a new drive, I think. I will see what I can rescue in the meantime.

Endolf

Endolf

Quote from: Endolf on March 17, 2010, 20:49:36
Sorry folks, it's gone pop again
Quick update: I'm in the process of rebuilding 2 of the nodes in the farm, the linux and windows 32 bit machines. I've lost a few builds, so build numbers will be odd for a few hours and builds will be failing, but because the farm is badly configured, not because of the code (I hope :)). I'm hoping that I can get it all back to normal within the next few hours.

Endolf

Endolf

Right

It's all up and running again. The server that had the failed disk has been retired, so this shouldn't happen again unless another server fries itself. I've rebuilt the win32 and linux32 virtual machines on another node in the cluster. Builds might be a little slower than before.

Sorry for the 3 days with no builds.

Endolf

Matzon

np - just happy you're providing the service!

Endolf

Looks like something in the new setup isn't happy. Last night's builds failed, for the second night in a row. The failures weren't from compiling the code, though, but from archiving the artefacts afterwards or similar clean-up issues. I've kicked them off sequentially this morning and they have gone through, so I'm not quite sure what's going on. Will have to have a prod at it during this week.

Endolf

Endolf

I've had another poke at the farm and moved the master node VM onto another physical machine. I kicked off some manual builds and they ran fine; we'll see what happens in a couple of hours when the nightlies run. It was the nightlies and CI builds that failed, while manually started ones were working. Most odd.

I'll let you know.

Endolf