
System Uptime Records

Discussion in 'Business & Enterprise Computing' started by DeVo, Mar 23, 2010.

  1. Smegger

    Smegger Member

    Joined:
    Jul 24, 2001
    Messages:
    2,729
    Location:
    Adelaide, with joy.
    There's always one, isn't there? :lol:
     
  2. phrosty-boi

    phrosty-boi Member

    Joined:
    Jun 27, 2003
    Messages:
    1,102
    Location:
    Altona North
    Knew I'd see a mention of NetWare in there somewhere.

    Best I've seen is an old 6.5 server with around 400-odd days. We rarely have to reboot these things except for service packs and maintenance.
     
  3. figrin

    figrin Member

    Joined:
    Jun 26, 2001
    Messages:
    2,966
    Location:
    Sydney
    Yep, I fully agree with you, and with everything else you've said. It's just that far too often uptime is treated as a byword for stability, when in actual fact the uncertainty about whether the box will come back up is so great that no one wants to touch it. Uptime is really meaningless, unless your max uptime is low and you're on some forced reboot schedule, in which case it's a problem.
     
  4. Psycronic

    Psycronic Member

    Joined:
    Sep 23, 2003
    Messages:
    347
    Location:
    Penrith
    When I worked at Optus, a couple of the Alpha servers running VMS had been up for over 1000 days.
     
  5. oli

    oli Member

    Joined:
    Jun 29, 2002
    Messages:
    7,263
    Location:
    The Internet
    I remember hearing about an educational institution here in Adelaide that had two Windows mail servers which basically ran the whole mail system for the organisation. They had been running for years; then one went down, it took the other one down with it, and lo and behold, nothing would start up again afterwards...

    So I can see where figrin is coming from but I think it also depends on the role that the device/system is playing.

    A router running for 5 years is different to a Windows server running for 1. :)
     
  6. figrin

    figrin Member

    Joined:
    Jun 26, 2001
    Messages:
    2,966
    Location:
    Sydney
    Yep, no doubt. Risk should be balanced against business needs, the required level of business continuity, and the cost of implementing the risk mitigations.

    Yeah, I have seen too many of those. I have even seen machines that don't physically come back up after a reboot (PSU, fan or BIOS problems), but in most cases there's a lot of manual work involved in bringing the software side back online after a reboot, and that can be bad.

    But yes, a big router is a bit different, though there should still be some failover testing ;)
     
  7. 192.168.0.1

    192.168.0.1 Member

    Joined:
    Nov 18, 2004
    Messages:
    1,540
    Location:
    Postcode: 2528
    figrin is right on the money with this.

    Most sysadmins are scared to take machines down for the simple fact that they just don't understand them, or don't want to deal with the consequences if they don't come back up.
     
  8. 4wardtristan

    4wardtristan Member

    Joined:
    Apr 9, 2008
    Messages:
    1,181
    Location:
    brisbane
    (In regard to rebooting machines once a fortnight)

    Not necessarily. It depends what your machines are doing, and what results you're seeing from them.

    ALL of the XenApp machines in our farm (around 25 or so) get rebooted every night for various reasons.
     
  9. oli

    oli Member

    Joined:
    Jun 29, 2002
    Messages:
    7,263
    Location:
    The Internet
    Has someone actually done some research to see if there is a definite correlation between length of uptime and likelihood of problems upon reboot?

    What is the actual link? I mean why exactly are there so many cases of systems not coming back up correctly if they've been running for a long time?
     
  10. Bangers

    Bangers Member

    Joined:
    Dec 25, 2001
    Messages:
    7,254
    Location:
    Silicon Valley
    No need for research. Assuming there is a correlation (which I don't believe), the short answer is that most of the bugs would be timing-related, in code that was never expected to keep running for so long. A trivial example would be storing the time since epoch in an 8-bit register in the system controller.
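
    For illustration only (a made-up Python sketch, not anything from a real system controller), here's the kind of wrap-around that example describes: a days-up counter squeezed into 8 bits silently rolls over once it passes 255.

    Code:
    # Hypothetical 8-bit counter: it can only hold 0-255, so it wraps silently.
    days_up = 0
    for _ in range(300):                 # pretend the box has been up 300 days
        days_up = (days_up + 1) & 0xFF   # & 0xFF mimics an 8-bit register
    print(days_up)                       # prints 44, not 300 - it wrapped at day 256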
     
  11. Gecko

    Gecko Member

    Joined:
    Jul 3, 2004
    Messages:
    2,715
    Location:
    Sydney
    Got a couple that have been online for ages in my environment:

    Code:
    A pfSense router box (400 day party for it next week :) ):
     7:34AM  up 393 days, 21:54, 2 users, load averages: 0.06, 0.13, 0.09
    
    Linux boxes (looks like they don't get much weekend use....)
     07:33:05 up 288 days, 21:45,  1 user,  load average: 0.03, 0.01, 0.00
     07:36:32 up 235 days, 18:33,  1 user,  load average: 0.07, 0.02, 0.00
     07:37:01 up 237 days, 20:23,  1 user,  load average: 0.15, 0.03, 0.01
     07:40:43 up 198 days, 21:44,  1 user,  load average: 0.00, 0.00, 0.00
     07:38:10 up 235 days, 20:20,  1 user,  load average: 0.00, 0.00, 0.00
     07:38:45 up 186 days, 10:27,  1 user,  load average: 0.00, 0.00, 0.00
     07:39:54 up 225 days,  2:47,  1 user,  load average: 0.00, 0.00, 0.00
    
    Those are all anomalies - normally a server manages a couple of months before it gets rebooted for an update. Looking at those, my Monday should probably be spent doing a bit of updating...
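
    If you wanted to flag those overdue boxes automatically rather than eyeballing uptime output, something like this minimal Python sketch would do it on a Linux host (the 60-day threshold is just my guess at "a couple of months"; collecting it from multiple hosts is left out, and nothing here is from Gecko's actual environment):

    Code:
    # Read this box's uptime from /proc/uptime (Linux only) and flag it
    # if it has been up longer than an assumed 60-day patch window.
    PATCH_WINDOW_DAYS = 60
    
    with open("/proc/uptime") as f:
        uptime_days = float(f.read().split()[0]) / 86400
    
    if uptime_days > PATCH_WINDOW_DAYS:
        print(f"up {uptime_days:.0f} days - overdue for patching/reboot")
    else:
        print(f"up {uptime_days:.0f} days - within the patch window")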
     
  12. oli

    oli Member

    Joined:
    Jun 29, 2002
    Messages:
    7,263
    Location:
    The Internet
    Fair enough. I guess the other thing is that even if there is no correlation, it can look like there is, because downtime feels more significant the longer the machine has been up. What I mean is that if a system is down for 3 minutes every week but runs fine for a year on that schedule, the couple of hours a year it's down isn't significant; but if a machine has been up for 3 years and is then down for over an hour, it suddenly seems far more dramatic. :p
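
    Rough numbers for that comparison (all hypothetical, just to show the scale):

    Code:
    # Hypothetical figures: 3 minutes of planned downtime per weekly reboot,
    # versus one dramatic hour-long outage on the long-uptime machine.
    planned_per_year = 3 * 52            # 156 minutes a year, spread out and expected
    planned_3_years = planned_per_year * 3
    unplanned_once = 60                  # one 60-minute outage after 3 years up
    print(planned_3_years, "minutes planned vs", unplanned_once, "minutes unplanned")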

    IACSecurity: Fair enough. Makes perfect sense (like all of your posts, hehe). :p
     
  13. maddhatter

    maddhatter Member

    Joined:
    Jun 27, 2001
    Messages:
    4,797
    Location:
    Mackay, QLD.
    I've discovered that Windows servers restarted nightly generally behave a whole lot better than ones going for uptime records. Servers I've provisioned generally have shutdown -r written into the backup script; clean slate for the next day :)
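
    The general shape of that idea, as a hedged sketch only (the backup command, path and 60-second delay are placeholders, not maddhatter's actual script):

    Code:
    # Nightly job: run the backup, then restart the box for a clean slate.
    import subprocess
    
    # Placeholder backup step - substitute whatever the real nightly backup is.
    subprocess.run(["ntbackup", "backup", "systemstate", "/f", r"D:\backups\state.bkf"],
                   check=True)   # check=True: skip the reboot if the backup fails
    
    # Restart in 60 seconds (Windows shutdown.exe accepts -r/-t switches).
    subprocess.run(["shutdown", "-r", "-t", "60"], check=True)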
     
  14. fR33z3

    fR33z3 Member

    Joined:
    Jul 16, 2001
    Messages:
    2,164
    Location:
    Perth
    In today's environments, the whole "OMG how awesome is my uptime" thing is a moot point. It's now all about *SERVICE* availability, not *NODE* availability. If you need HA, you invest in an appropriate architecture to start with: one that allows boxes to come down for maintenance, allows for hardware refreshes, and allows dependent infrastructure like networks and storage to have their necessary maintenance done.
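
    To put the *SERVICE* vs *NODE* point in concrete terms: a box that answers ping tells you nothing about whether the service on it is usable, so checks and failover should probe the service address rather than an individual node. A minimal Python sketch (the hostname and port are made-up placeholders):

    Code:
    import socket
    
    def service_up(host, port, timeout=2.0):
        """Crude service probe: can we actually open a TCP connection to the service?"""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False
    
    # Probe the service address (VIP / load balancer), not an individual node.
    print(service_up("mail.example.com", 25))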
     
  15. DeVo (OP)

    DeVo Member

    Joined:
    Jan 3, 2002
    Messages:
    344
    Location:
    Bendigo
    So what you're really saying is that you don't have any cool uptimes to share??

    :)
     
  16. Reginald85

    Reginald85 Member

    Joined:
    May 26, 2006
    Messages:
    295
    Location:
    Bendigo
    I DO I DO!


    [uptime screenshot]


    Server 2003 DC
     
  17. Nyarghnia

    Nyarghnia (Taking a Break)

    Joined:
    Aug 5, 2008
    Messages:
    1,274
    IBM E-Series ATM/EFTPOS interchange server...

    Running Windows 2000.

    Install date: December 2002.

    The first time it was rebooted or shut down since installation was for power maintenance on the building, in April 2010.

    I'm not even going to calculate the days.

    I typically see HP-UX servers running for 4 or 5 years without needing a reboot.

    -NyarghNia
     
  18. GiantGuineaPig

    GiantGuineaPig Member

    Joined:
    Oct 23, 2006
    Messages:
    4,027
    Location:
    Adelaide
    Surely there were some updates that needed a reboot since install? This doesn't seem like a good thing to brag about...
     
  19. Nyarghnia

    Nyarghnia (Taking a Break)

    Joined:
    Aug 5, 2008
    Messages:
    1,274
    They're not really my systems; they're supplied by a major interchange organisation, they don't live on an IP network, and up until they were replaced they communicated via RS232 comms.

    In that situation, leaving the systems in a known and stable state is common practice; they are in effect fixed-use appliances. Before I get lampooned and abused: these were not 'my' servers, and the policy of the interchange organisation was to leave them alone. They didn't want them changed, not even patched... which is pretty hard to do anyway when they have no IP configs... but I thought I'd simply post the stats out of curiosity...

    -NyarghNia
     
    Last edited: Aug 23, 2010
  20. GiantGuineaPig

    GiantGuineaPig Member

    Joined:
    Oct 23, 2006
    Messages:
    4,027
    Location:
    Adelaide
    Ah, as long as they aren't on your network and aren't your problem, then fair enough :) I wouldn't touch them even to check the uptime :)
     
