your oh-shit moments.

Discussion in 'Business & Enterprise Computing' started by TehCamel, Dec 7, 2013.

  1. TehCamel

    TehCamel Member

    Joined:
    Oct 8, 2006
    Messages:
    4,175
    Location:
    Melbourne
    Come on - fess up. That moment where you've done something, and realised you did the wrong thing.

    rm -rf /* instead of rm -rf ~/*
    pulled the wrong disk out of a broken raid5 array
    underprovisioned hardware on a virtual cluster..
    ran a dodgy script you hadn't tested in a lab environment..
     
  2. blankpaper

    blankpaper Member

    Joined:
    Feb 1, 2013
    Messages:
    932
    nothing memorable really.

    the biggest i can think of was mistyping a password while installing some proprietary software, which goes to town on local admin accounts with this user/pass entered during installation. so if you don't know the password the box is screwed. it was a fresh box anyway so a re-image and you start again. no big deal, maybe 45 minutes of time wasted.
     
  3. CAPT-Irrelevant

    CAPT-Irrelevant Member

    Joined:
    Sep 7, 2007
    Messages:
    5,251
    Location:
    Sydney
    Synchronising a user's BlackBerry with their Outlook. Their contacts were on their BlackBerry but not in Outlook, so I ran a sync via the desktop software, and Outlook decided to overwrite the phone's data with the lack of data Outlook had.

    Accidentally lost hundreds of the user's contacts, and there was no backup...
     
  4. mooboyj

    mooboyj Member

    Joined:
    Sep 13, 2005
    Messages:
    984
    I broke BIND for ~120,000 users.....:Paranoid: Yes, over one hundred thousand users....
     
  5. Rezin

    Rezin Member

    Joined:
    Oct 27, 2002
    Messages:
    9,488
    lol, how'd you manage to do that?
     
  6. maddhatter

    maddhatter Member

    Joined:
    Jun 27, 2001
    Messages:
    4,798
    Location:
    Mackay, QLD.
    Years ago I was rolling out updates on a server at a medical practice (which operates paperless, so computers must work) and amongst accepting the terms and conditions for the updates the raid controller spat out an error to which I clicked something I shouldn't have...

    Destroyed the array...

    Format, re-install and restore from backup, took ~4 hours and that's about the worst from me.

    I may have introduced a rogue DHCP server to a large wireless network once with 100s of sites around the region, good times.
     
  7. inoshiro

    inoshiro Member

    Joined:
    Jun 28, 2001
    Messages:
    1,086
    Location:
    Siddeneee
    rm -R * blah instead of rm -R *.blah... Deleted a little more than planned :lol: Thank <deity> for backups.

    Best effort was a good couple of decades ago when on call for a major financial institution. Had to swap out a dodgy ATM controller, but got the numbers wrong. A few minutes later got a call wondering if I knew why half the ATMs in the state had gone offline. "Really? No idea. I'll check it out..." :Paranoid: :lol:
     
  8. OP
    TehCamel

    TehCamel Member

    Joined:
    Oct 8, 2006
    Messages:
    4,175
    Location:
    Melbourne
    put a bad acl on a firewall.. knocked out internet access during production
    thank god for reload in 5
     
  9. IncredibleBulk

    IncredibleBulk Member

    Joined:
    Apr 29, 2007
    Messages:
    2,060
    Location:
    Kings Langley
    I've had a couple in my current job

    - rm -rf *

    Realised that I was in the wrong folder when the "cannot delete blah as it is a folder" warning popped up. Managed to kill the process, but not before it deleted everything from A to I on a main production server

    Luckily had a backup from about a week before and not much had changed, but still a pants-shitting moment

    - drop_all_tables

    We have custom scripts to delete all tables (permanent and temp) in Oracle for specific db users. Was running two putty sessions side by side (the customer's and my own). Ran the script and then realised I was on the wrong session

    This happened at 2pm... two weeks before xmas shutdown.... with the customer already a week behind schedule due to the xmas rush

    You can imagine the chaos that caused, very nearly got me fired. I had a backup from the night before but that was still 8 hours worth of data entry and production info lost
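
The wrong-session slip above is the kind of thing a type-the-name confirmation can catch. A minimal sketch in Python, not the actual Oracle script from this post: `is_confirmed`, `drop_all_tables` and the `run_drop` callable are hypothetical names, and the real drop logic is assumed to live behind `run_drop`.

```python
def is_confirmed(target: str, typed: str) -> bool:
    """Return True only if the operator typed the exact target name."""
    return typed.strip() == target


def drop_all_tables(db_user: str, typed_confirmation: str, run_drop) -> bool:
    """Run the destructive callable only after an exact-match confirmation.

    run_drop is a placeholder for whatever actually drops the tables.
    Returns True if the drop ran, False if it was refused.
    """
    if not is_confirmed(db_user, typed_confirmation):
        return False
    run_drop(db_user)
    return True


# Example: the operator thinks they are on their own schema and types its
# name, but the session is actually pointed at the customer's schema.
dropped = []
ran = drop_all_tables("customer_prod", "my_dev", dropped.append)
print(ran, dropped)
```

The idea is only that the name you type has to match the thing you are about to destroy, so being on the wrong session turns into a refusal instead of a restore.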
     
  10. Primüs

    Primüs Member

    Joined:
    Apr 1, 2003
    Messages:
    3,354
    Location:
    CFS
    Mine is pretty similar - doing work remotely for a large ISP in a smallish country.

    They peer directly with transit providers in country, but flood their undersea links pretty quickly, and wanted WAN optimisation and to use the local peer as an L2 link only (essentially).

    Created GRE tunnel between local AU site and remote site. Got to point of remote site setting default route via the tunnel (this was 3 am emergency work as links were being hammered without WAN optimisation, so static routes were used not dynamic) and I forgot to set a static route for the tunnel endpoint to still go via peering and not via tunnel. So it all fell to bits.

    I had a reload in 15 on the box instead of 5. Longest ~13 minutes to wait when hoping a core router in another country reboots and comes back lol!

    This is one thing I prefer about RouterOS - Safe Mode. Easy to turn on/off, and if the management connection you were using drops off, it reverts back. Better than scheduling a reload, as I've also been caught with a reload in 5 when my work took longer than 5 minutes and I did not heed the 1 minute warning, whoops!
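
The Safe Mode behaviour described above (changes are undone if the management session drops before you confirm them) can be sketched roughly like this. It is a toy Python model, not how RouterOS implements it; `SafeModeConfig` and its method names are made up for illustration.

```python
import copy


class SafeModeConfig:
    """Toy model of a config that rolls back if the session drops unconfirmed."""

    def __init__(self, config):
        self.config = dict(config)
        self._snapshot = None  # only set while safe mode is active

    def enter_safe_mode(self):
        # Remember the known-good state before touching anything.
        self._snapshot = copy.deepcopy(self.config)

    def apply(self, key, value):
        self.config[key] = value

    def confirm(self):
        # Operator still has access and is happy: keep the changes.
        self._snapshot = None

    def connection_lost(self):
        # Session dropped while safe mode was on: revert to the snapshot.
        if self._snapshot is not None:
            self.config = self._snapshot
            self._snapshot = None


router = SafeModeConfig({"default_route": "via-peering"})
router.enter_safe_mode()
router.apply("default_route", "via-tunnel")  # the change that cuts you off
router.connection_lost()
print(router.config["default_route"])
```

Compared with a scheduled reload, the rollback here is tied to losing the session rather than to a timer, which is why slow but healthy work is not punished.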
     
  11. Alfonzo

    Alfonzo Member

    Joined:
    Apr 7, 2003
    Messages:
    13,136
    Location:
    4152, Brisbane
    I think the worst that I've done was plug a serial cable into a UPS that wasn't an 'official' APC serial cable - it had a panic attack and shut down instantly (still can't figure out why APC would build that feature into a UPS using a regular serial cable) - powered off the entire comms rack for that floor, knocked about 120 people off the network, voip, wireless.

    Lesser issues were with my own processes and didn't really impact on anyone else - back when I first started configuring Cisco switches I didn't really have any workflow in how I applied stuff, and I used to lock myself out of my own work-in-progress config - fortunately I never used to wr an awful lot either, so I could just pull the power and start from scratch. Then later on when I did start writing more frequently, I learned password recovery. :lol:
     
  12. GarethB

    GarethB Member

    Joined:
    Aug 19, 2001
    Messages:
    1,667
    Location:
    Melbourne
    Ok, a story from me. It was around 1987 or so, less than 12 months after I started working in IT full time. The system was a Burroughs B7800 series mainframe. The OS was proprietary and I forget what it was called. From the master console you could type in various commands to get a short set of information about what certain processes were doing. Adding a - sign to the command gave you more detailed information.

    The OS had a process called "swapper" which managed the virtual memory. Typing SW at the master console gave you some info about what system resources the swapper process was using for itself. One day, in the middle of the day, I decided to add the minus sign to the SW command, thinking I'd get more information about what swapper was doing and what resources it was using.

    Every rule has an exception and swapper was the exception to using the - sign. Adding the - sign to the command did not give me more information about the process, it shut down the process.

    Imagine a multi-user mainframe, in the middle of the day, and its virtual memory has just been shut down without warning. The only way to recover was to reboot the entire mainframe, thanks to yours truly. :o
     
  13. bcann

    bcann Member

    Joined:
    Feb 26, 2006
    Messages:
    5,362
    Location:
    NSW
    It was around 98 or 99. Had some issues with an Exchange 5.5 db. Was in DOS after a 10 hour day troubleshooting the issue with isinteg. Accidentally redirected the output using > to the live database, overwriting it. Very much felt my heart drop down to below my bowels, and about 3 seconds after enter was hit it sunk in. Very super lucky it was about 8pm and before I'd done anything I'd done a full db backup. Lost probably a few emails, but critical lesson learnt: back the fuck up before touching anything. Second lesson learnt: a 10 hour day and troubleshooting in a DOS prompt when already tired don't mix, especially for a non system stopping issue.
     
  14. Kommandant33

    Kommandant33 Member

    Joined:
    Mar 6, 2011
    Messages:
    4,011
    Location:
    Cheltenham VIC
    I know this was in the misc pics the other week, but I thought it was slightly relevant and funny for this thread:

    [image]
     
  15. cbb1935

    cbb1935 Guest

    I've had 5 "major" ones in 17 years in IT, my ex boss had a cracker as well, and another employee had 3 decent efforts in the 3 months he was with us. We had a few work experience guys come through our office and we'd train them up, but this one was the only real doozy.


    Linux wise, I never ever EVER use a delete command with a "*" or ".", because I shit myself that I might get something wrong.

    Mine:
    1. It's something I still do every now and then. START - RUN - "shutdown -s -f -t 0" *slams enter*. Then you realise you've made a typo in haste, and that the server in question has no iLO in it.
    DAMAGE: Wait until user comes in next to restart.

    2. Quite a number of years ago (back when Server 2000 was all the rage), a customer wanted specific permissions for specific folders. So I wanted to first reset all permissions, then start building them back up again. I selected all the files under C:, and went to security tab - then removed all permissions and hit OK. Left it running, rebooted, completely and royally screwed file system.
    DAMAGE: Reverted back to pre-server setup backup, reinstalled and reconfigured OS, and copied back data.

    3. Back in the days of DOS, ran "deltree . *" I think it was, but recursively went back 2 directories instead of one, so went back to the root directory and started nuking files. The end result was nasty.
    DAMAGE: Reinstall time.

    4. Replacing an AT power supply at a client's. Installed it, and went to test it before I installed the power button. Had button in my hand, pressed ON. Blew the PSU completely, and blew myself across the room. I'd overlooked the metal power button casing, created a loop and nuked myself with 240v.
    DAMAGE: Advise client I needed to go back to office to get a new PSU.

    5. Replacing a BIOS on a really old motherboard (circa Pentium 1 days). Swapped it, it didn't work, so touched it to remove, and burnt my fingerprint slightly off as the BIOS chip was well over 100 degrees.
    DAMAGE: Burnt off fingerprint partially.

    Boss blunder:
    1. After spending a week configuring a linux server he took it out to a client's, and still needed to get the raid array working. He typed a command, hit enter, and then followed the pressing of enter with an "ooooooohhh shit".

    He had wiped the wrong drive, and effectively raided the blank hard drive's contents over the completely working drive. OS and all setups, NUKED.

    Twas not a good day for all.
    DAMAGE: Reinstall from scratch and left there at 4am.

    Other Employee
    1. Back in the days of the AMD T'breds, and before thermal cutoff. Replaced a customer's HSF and didn't remove the plastic film.
    DAMAGE: I caught it at 120 degrees, and when I went to rectify it the heat had shifted the solder off the CPU.

    2. Connected the 3.5" hard drive plug, to the 4 pin fan socket on a motherboard. I don't think I need to go into specifics about what went *BANG* as the answer is, all of the above.
    DAMAGE: Killed the mobo, and the PSU.

    3. Couldn't make a power cable reach, so cut and spliced 2 together, WITHOUT any electrical tape. Found this one out when he tripped the safety switch after the 2 ends shorted, and he was sacked on the spot.
    DAMAGE: Could have killed him/me.

    =====================

    and the "doosie" for another professional tech, is one recently from Checkpoint's SMB team. Trying to fix a Client to Site routing issue with an 1180 remotely on our system, when he puts in the following route. The device's IP was 192.168.31.1

    ROUTE any TRAFFIC ON any PORT FROM 192.168.31.1 TO 192.168.31.1 <enter/apply/OK>

    DAMAGE: Put the router into an infinite loop, took down our VPN and both routers, took out our 1180 to the point even serial console cable could not access it. End result was a full factory reset after 4 hours to get other functions of it going (so didn't have a backup). I was SPEWIN!!!
     
    Last edited by a moderator: Dec 8, 2013
  16. scrantic

    scrantic Member

    Joined:
    Apr 8, 2002
    Messages:
    1,690
    Location:
    3350
    Similar but with chmod on /* recursively on a web server rather than the current directory.
     
  17. aokman

    aokman Member

    Joined:
    Jul 12, 2001
    Messages:
    12,453
    Location:
    Melbourne
    One of my clients uses jail for FTP access and a couple of mount_nullfs for remapped folder shares to keep them out of no go areas.

    I once deleted an expired user's home drive without unmounting :upset: Never do that again...

    Oh yeah I have done the AT PSU electric shock thing as well, I think it's a rite of passage from the old days :D
     
    Last edited: Dec 8, 2013
  18. ShaggyMoose

    ShaggyMoose Member

    Joined:
    Jul 1, 2002
    Messages:
    350
    Location:
    Sydney
    Not me, but at a bank I used to work at, someone decided it would be a good idea to let the graduate "learn by doing" on the AS400. Unfortunately, they should have checked the account permissions didn't include PWRDWNSYS. Hilarity ensues.
     
  19. cbb1935

    cbb1935 Guest

    hahaha :thumbup: You aren't a man until you've nuked yourself with 240v.

    I had one short and blow up on the front of a case ages ago. Ended up burning and marking the plastic on the back of the front bezel.

    So glad we moved on from the world of AT.
     
  20. plasticbastard

    plasticbastard Member

    Joined:
    Jul 30, 2003
    Messages:
    4,005
    Location:
    Sector ZZ9 Plural Z Alpha
    Killed a hard drive by accidentally placing the power cable over the interface and powering the system on. $900 data recovery effort on a drive that stupidly wasn't backed up.

    Ran an rm -rf <blah> on a folder which had a number of symlinks in it that ended up being followed. There were several dozen empty files after that, which caused deployments to stall for several days while I was figuring out what the hell I had done. Thankfully this was still in testing and not on production, and all the data was simply re-synced into the proper locations.

    Restarted squid in the middle of the day, instead of reloading configuration files, while also stupidly mistaking the issue I was trying to resolve as being related to squid cache, when it was really an external DNS matter. The really really stupid part about it was I know/knew better, and should have just let the situation resolve itself.
     
