Cluster Efficiency

Discussion in 'Business & Enterprise Computing' started by redav, Feb 12, 2007.

  1. redav

    redav Member

    Joined:
    Nov 7, 2003
    Messages:
    4,371
    Location:
    Brisbane
    I've had a bit of a search and nothing here helps much.

    How much more efficient are three machines in a cluster compared with three standalone machines?

    We've been rendering on several standalone machines over the years and IT now wishes to set up a cluster. Sounds great, but how much more efficient / quicker is clustering? They want to start with 3 machines, but I can't imagine 3 machines being quicker at executing the render tasks than, say, the ten standalone machines on the current render farm. Would that be true?

    I mean, I'm all for adopting a better system, but when they're about to roll out possibly 30 or 40+ machines that I could have the render software installed on, isn't it crazy to step back to a fraction of the number of computers?
     
  2. infiltraitor

    infiltraitor Member

    Joined:
    Sep 7, 2002
    Messages:
    3,801
    Location:
    Melbourne
    Donated: $133.70
    If all you are doing is rendering, then I'd assume it would depend on how your program handles the distribution of frames/images to render.
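    Purely for illustration, the dumbest possible version of that distribution is a static split of the frame range across a few boxes. This is a hypothetical Python sketch (machine names and frame counts made up), not any particular renderer's behaviour:

[code]
# Purely illustrative: a naive, static split of a frame range across a few
# standalone render boxes. Machine names and frame counts are made up.
FRAMES = list(range(1, 301))            # frames 1-300 of the scene
MACHINES = ["box01", "box02", "box03"]  # hypothetical standalone machines

def split_frames(frames, machines):
    """Give each machine a contiguous, roughly equal chunk of frames."""
    per_box = -(-len(frames) // len(machines))   # ceiling division
    return {box: frames[i * per_box:(i + 1) * per_box]
            for i, box in enumerate(machines)}

for box, chunk in split_frames(FRAMES, MACHINES).items():
    # On a manual farm you walk over and set this up on each box yourself;
    # a manager (Backburner etc.) does the equivalent dispatch automatically.
    print(box, "renders frames", chunk[0], "to", chunk[-1])
[/code]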
     
  3. OP
    redav

    redav Member

    Joined:
    Nov 7, 2003
    Messages:
    4,371
    Location:
    Brisbane
    Well, it's also to do other number crunching. I have no idea how the rendering software will go with it. I just reckon that bulk numbers will beat any smart system they come up with.
     
  4. stalin

    stalin (Taking a Break)

    Joined:
    Jun 26, 2001
    Messages:
    4,581
    Location:
    On the move
    pelvis esley will be your man for this one.

    If you were to compare 3 identical boxes in a clustered/rendering solution to 3 individual ones, I strongly suspect you would find the clustered approach marginally faster. The reason is that it would not require human interaction to split the loads. The actual rendering would take about the same time on either incarnation of the system.

    If you can post more info, I'm sure elvis could give you as good an answer as anyone else ever could.

    Edit: if it's to do tasks other than straight rendering (i.e. number crunching, as you say), you would need a multi-process system, so something like openMosix would again be helpful. Again, performance would be nearly identical, except the time required to spread tasks would be less with a clustered approach.
     
  5. OP
    redav

    redav Member

    Joined:
    Nov 7, 2003
    Messages:
    4,371
    Location:
    Brisbane
    Yeah, I fully agree. The thing is that IT want to start off with 3 systems in a cluster and slowly build. The issue I have is that I presently have 10 systems on the render farm, so it's going to be heaps faster as is. There's also another 20 I'd love to have on the farm, plus there's another 30+ that will be rolled out to the office which could also get allocated. These are all desktops throughout the day but could be mine at night.

    So it's going to be far better than anything IT can throw at the cluster and I seriously think that management would never allow an equivalent number of machines allocated to a cluster as they will basically be idle machines half the time.

    And the other issue is that with a cluster, other people will want to use it too. I don't have an issue with sharing time, I mean, it's not my toy. But you will get to a time where two parties want CPU time, and if I'm using existing infrastructure then I'm removed from the equation.
     
  6. Kermalius

    Kermalius Member

    Joined:
    Mar 15, 2002
    Messages:
    870
    Depending on the cluster setup/license agreements: PXE boot Linux on the desktops overnight + a good job manager.
     
  7. stalin

    stalin (Taking a Break)

    Joined:
    Jun 26, 2001
    Messages:
    4,581
    Location:
    On the move
    In that case do as Kermalius suggested. You can power cycle all your machines at a given time at night, at which point each one gets a TFTP-provided Linux image via PXE/DHCP, boots it, connects to your cluster head and starts processing away. There you have all the power your company can provide at night, and during the day you may have to make do with 3 systems for quick tasks and postpone big tasks till the evening.

    There should be more grunt than you need, so system sharing is only an issue if you both want ALL the grunt on the same night. But that scenario would be unlikely, I would guess.

    Just make sure that if you have no tasks to complete, you don't leave the boxes running all night, because of the power wastage and bills.
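    For the "bring them up at night" side of it, Wake-on-LAN is the usual trick. A minimal Python sketch of the standard magic packet is below (MAC addresses made up; assumes WOL is enabled in the BIOS/NIC and that your security policy allows it, which it may not):

[code]
# Minimal Wake-on-LAN sketch. Assumes WOL is enabled in each box's BIOS/NIC
# (and that site security policy allows it). MAC addresses are made up.
import socket

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the standard WOL 'magic packet': 6 x 0xFF, then the MAC repeated 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    packet = b"\xff" * 6 + mac_bytes * 16
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))

# Fire this from a scheduled task at night for every desktop you can borrow.
for mac in ["00:11:22:33:44:55", "00:11:22:33:44:56"]:
    wake(mac)
[/code]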
     
  8. elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    40,421
    Location:
    Brisbane
    1) What software are you using currently?

    2) What software are you considering adding to manage the "cluster"?

    Renderfarm management tools are all about efficiency. You currently waste a few minutes every time you want to set up a machine to render a scene. You need to fire up the rendering software, tell it what scene to render, tell it which frames it is responsible for, then hit the render button.

    Then you have to sit and wait to see how it renders, what frame or position in the frame it's up to, whether the render was successful or failed, etc.

    And if the computer dies mid render? Well then you need to figure out which frame it was up to, and restart the job somewhere else, etc.

    Render management software does all this for you. Most of them present you with an administration panel that tells you what frame is being rendered where. You can set queue priority, frame priority, job priority. You can see in an instant which frames or jobs are rendering where, how they're doing, what the performance of each machine is comparatively, whether or not a job has failed, and if it failed then will it auto-migrate to another box, or do you want to control that, etc, etc.

    3-4 machines is child's play. But think about YOUR time spent running back and forth between them. Times that by ten. Are you going to be wasting rendering time setting up jobs on machines? The 10 minutes it takes you to set up a job on 10 machines is 10 minutes of wasted rendering time on the other 30 machines in your cluster.

    From an administration, co-ordination and logging point of view, render farm management software is mandatory. At the end of the day, the performance boost is the SECONDARY goal. Yes, it's important. But more important is being able to find out the EXACT status of your render farm down to the last detail by issuing a single command. Not by running around like a lunatic across your network trying to find out what's happening and where.
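    As a toy picture of what a manager buys you (purely illustrative, nothing to do with Backburner's actual internals): hand out frames, notice failures, and put failed frames back on the queue so another box picks them up:

[code]
# Toy illustration only (not Backburner's real internals): what a render
# manager does for you - hand out frames, notice failures, and put failed
# frames back on the queue so another box picks them up.
import queue
import random

frames = queue.Queue()
for f in range(1, 101):          # 100 frames to render
    frames.put(f)

status = {}                      # frame -> human-readable status string
machines = ["nodeA", "nodeB", "nodeC"]

while not frames.empty():
    for box in machines:
        if frames.empty():
            break
        frame = frames.get()
        ok = random.random() > 0.05          # pretend ~5% of renders die mid-job
        if ok:
            status[frame] = "done on " + box
        else:
            status[frame] = "failed on " + box + ", requeued"
            frames.put(frame)                # "auto-migrate": another box retries it

print(sum(1 for s in status.values() if s.startswith("done")), "frames rendered")
[/code]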
     
  9. OP
    redav

    redav Member

    Joined:
    Nov 7, 2003
    Messages:
    4,371
    Location:
    Brisbane
    3D Studio MAX is the renderer, using Backburner as the render farm manager. Simple program, but for the most part it works.

    I'm not involved with the cluster. I'm getting told by IT that they don't want to grow the machines on the farm, in fact they want to remove it and make me use the cluster when it's up and running.

    But this is the thing: with the next rollout of machines, I could potentially have access to 20 times the number of machines that they could put on a cluster from the start, and I can't see management / IT growing the size of that cluster at a rate that's going to match even the 10 machines I have access to now.
     
  10. elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    40,421
    Location:
    Brisbane
    Backburner is more than ample for up to about 500 machines. After that you need some real software (and a real renderer to match). I'm assuming you're using the bog-standard scanline renderer, and not Mental Ray, VRay, or something more advanced?

    This new "cluster" that's going in - what OS will be running? If it's an all-Windows setup, then there's no harm putting the Backburner Server Service on each machine that will have the 3DSMax renderer going on it as well.

    You don't need anything more than Backburner to manage what you are talking about there. But you do need SOMETHING. Manually running jobs from each machine is just silly.
     
  11. OP
    redav

    redav Member

    Joined:
    Nov 7, 2003
    Messages:
    4,371
    Location:
    Brisbane
    Yeah, just the scanline renderer. I've used Brazil and Mental Ray for myself, but as much as I want to use something nicer, it won't happen here.

    I'd be expecting that it's going to be a Windows-based system. They're anti anything else... except for the odd Linux box. Yup, got the server service running here at the moment. I even wanted Wake-on-LAN ability but apparently it goes against our security protocol :(

    Oh, I haven't run machines individually for a good 5 years. But my concern wasn't about a render manager anyway, it's about having the render farm removed and being told that I've got to use a cluster.
     
  12. elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    40,421
    Location:
    Brisbane
    I think you're playing terminology soup here.

    A "renderfarm" is a nice wanky word for a bunch of network connected machines. A "cluster" is a nice wanky word for a bunch of network connected machines.

    If all these machines are running Windows and scanline 3dsmax, nothing will change for you. Just slap the backburner2 serversvc on each one and your "cluster" will look like a big "renderfarm".

    Don't get too caught up in labels when it comes to clustering. Far too many people use big words to make themselves look important when it comes to these things. Buggers me why.
     
  13. stalin

    stalin (Taking a Break)

    Joined:
    Jun 26, 2001
    Messages:
    4,581
    Location:
    On the move
    bigwordtomakemelookimportant!!!

    :leet:
     
  14. OP
    redav

    redav Member

    Joined:
    Nov 7, 2003
    Messages:
    4,371
    Location:
    Brisbane
    Thanks but I'm not, I know the difference.

    I currently have 10 machines available to the render farm, but our IT department is telling me that this will stop; however, I will have access to a computer cluster instead. Their description was "instead of having several separate machines looking after several tasks individually and inefficiently, you'll have 'one big computer' doing each task quicker, where each machine's resources are shared and therefore more efficient".
     
  15. Ice Czar

    Ice Czar Member

    Joined:
    Jan 18, 2004
    Messages:
    345
    Location:
    Colorado
    terminology soup it is then :p


    Their description sounds like a High Performance Cluster, but those are more typical of scientific clusters (where nodes actively communicate with each other as intermediate computations determined on one node are transferred to another node and employed in its calculations).

    But that is still just a bunch of network-connected machines.

    Or it's interactive parallel rendering (sort-first, sort-last or DPlex), but those are more typical of a monitor wall or visualization.

    However, they are still just a bunch of network-connected machines.

    Or grid computing.
    Again, just a bunch of network-connected machines.

    Whether it's Embarrassingly Parallel Computation, Explicitly Parallel Computation, interactive parallel rendering, or the "Kilauea" Massively Parallel Ray Tracer :p
    For all practical purposes they are basically the same as your current renderfarm, just more of them and hopefully more efficiently managed for the overall rendering needs.
    If you had the specifics of the implementation you'd likely get a better assessment from elvis ;)

    Sometimes, to make things clearer, muddying the soup is called for :p
     
    Last edited: Feb 21, 2007
  16. elvis

    elvis Old school old fool

    Joined:
    Jun 27, 2001
    Messages:
    40,421
    Location:
    Brisbane
    You're not getting it.

    Tell me what software they are running on "the cluster", both at an application and operating system level. Tell me what process and memory management systems they are using to distribute tasks. Tell me what compilation process and libraries you need to compile against to make your software "cluster aware".

    Clusters aren't magic. Much work goes into clusters to make them do what they need to do.

    3DSMax is a third party application. If they are using some sort of HPC MPI to do thread-based migration, 3DSMax WILL NOT WORK on the "cluster". 3DSMax is a "batch and send" clustering system. Any sort of high-performance system works in an ENTIRELY different manner.

    People need to understand that the word "cluster" is a generic term. It refers to around 50 different types of technology that can be used to spread a task across a large collection of computers.

    What you have told me above is meaningless. It's terminology soup. It tells me nothing of the underlying components of the system, and how they will work with your 3rd party software.

    3DSMax WILL NOT WORK with your "one big computer". And by the sounds of it, you need to start figuring out exactly how this "one big computer" works because currently all you have is a bunch of words that could mean 50 different things.

    I do this sort of thing for a living. I work on "clusters" that are built with about 10 different types of software and technology, all of which are incompatible with each other. There's still the other 40+ that I've never used, and I'm what most folks would consider more experienced in this than the average fella.

    Get back to me with the answers to the questions in the first paragraph, and I can give you some real advice to aid you in choosing a direction of travel.
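    To make the distinction concrete, "cluster aware" code in the MPI/HPC world looks roughly like the sketch below (assumes mpi4py is installed; run under mpiexec). The point is that cooperating processes explicitly exchange data mid-computation, which a closed batch renderer simply cannot take part in:

[code]
# Sketch of MPI-style "cluster aware" code (assumes mpi4py is installed;
# run with e.g. `mpiexec -n 4 python this_script.py`). Cooperating processes
# exchange data mid-computation - something a closed batch renderer can't do.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each rank computes a partial result, then the results are combined explicitly.
partial = sum(range(rank * 1000, (rank + 1) * 1000))
total = comm.reduce(partial, op=MPI.SUM, root=0)

if rank == 0:
    print("combined result from", size, "cooperating processes:", total)
[/code]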
     
  17. OP
    redav

    redav Member

    Joined:
    Nov 7, 2003
    Messages:
    4,371
    Location:
    Brisbane
    They haven't set anything up yet. It was primarily for them to be able to do flood analysis, traffic simulations and fluid dynamics analysis, but then they decided to put my stuff in the pot too.

    Okay, that's good then. I didn't think what they were telling me was going to be of any use to me for what I want to use it for but I didn't know.

    So thanks :thumbup:

    Turns out that it's not going to happen for a month or so and seeing as though we've just landed more work, I'm just going to carry on as usual and try and collar more machines :D
     
    Last edited: Feb 21, 2007
  18. Ice Czar

    Ice Czar Member

    Joined:
    Jan 18, 2004
    Messages:
    345
    Location:
    Colorado
    + Windows = Windows Compute Cluster Server 2003
    (or an older version, still a form of High Performance Cluster)
     
  19. stalin

    stalin (Taking a Break)

    Joined:
    Jun 26, 2001
    Messages:
    4,581
    Location:
    On the move
  20. flarets

    flarets Member

    Joined:
    Jun 27, 2001
    Messages:
    29
    Location:
    planet earth
    From what I've done with parallel computing of CFD, the hardest thing is splitting up the work of one computation evenly for each processor (or, harder still, getting the program to do it automatically). If processors need to exchange data during the computation, you don't want them waiting around for it.

    SETI and protein folding would be examples of "embarrassingly parallel" computing, but they work very well.

    So, clusters will definitely be faster if you write the software specifically for them. Otherwise, you'll be able to do something at the same speed, only multiple times at once.
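    Frame rendering sits at the easy, embarrassingly parallel end of that spectrum: the work units are independent, so a pool with dynamic dispatch keeps every CPU busy even when per-frame times vary. A minimal Python sketch (illustrative only, the "render" is just a sleep):

[code]
# Sketch of the "embarrassingly parallel" case: work units are independent,
# so a pool with dynamic dispatch keeps every CPU busy even when some items
# take much longer than others. Illustrative only - the "render" is a sleep.
from multiprocessing import Pool
import random
import time

def render_frame(frame):
    time.sleep(random.uniform(0.01, 0.1))   # stand-in for uneven per-frame cost
    return frame

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # chunksize=1: hand out one frame at a time as workers free up, which
        # is what avoids workers "waiting around" when loads are uneven.
        done = list(pool.imap_unordered(render_frame, range(1, 101), chunksize=1))
    print(len(done), "frames done")
[/code]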
     
