Discussion in 'Business & Enterprise Computing' started by IntelInside, Oct 1, 2010.
The joys of large scale batch processing.
I assume you are referring to this article. I'd guess they want to do acceptance testing on the primary systems; that would mean not only making changes and rolling them back, but also cross-checking and validating all the records across databases, re-connecting data warehouses, etc. They may also be making changes that are intended to prevent a recurrence of the extended outage (e.g. changing decision trees, etc.) and testing or re-testing those procedures. Heck - they'd be culpable if they didn't fix the original problem (be it technology or process)!
Why aren't these organisations using something like SonicMQ and other related message-queue structures?
Seriously, every booking should be nothing more than an XML document on a message queue. Get the middleware to manage the flow of information between disparate systems. It works... well, here anyway; we've got nowhere near as much stuff floating about, but the principle should work.
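To illustrate the idea (this is a toy sketch using Python's in-process `queue` module, not real middleware; the booking fields and function names are made up): the producer serialises a booking as XML and drops it on a queue, and the consumer only has to understand the message format, not the producer's system.

```python
import queue
import xml.etree.ElementTree as ET

# Stand-in for a message bus; a real deployment would be SonicMQ, MQ, etc.
booking_bus = queue.Queue()

def publish_booking(bus, pnr, passenger, flight):
    """Producer: serialise the booking as XML and put it on the bus."""
    root = ET.Element("booking", pnr=pnr)
    ET.SubElement(root, "passenger").text = passenger
    ET.SubElement(root, "flight").text = flight
    bus.put(ET.tostring(root, encoding="unicode"))

def consume_booking(bus):
    """Consumer: knows nothing about the sender, only the agreed XML format."""
    doc = ET.fromstring(bus.get())
    return doc.get("pnr"), doc.findtext("passenger"), doc.findtext("flight")

publish_booking(booking_bus, "ABC123", "J. Citizen", "DJ401")
print(consume_booking(booking_bus))  # ('ABC123', 'J. Citizen', 'DJ401')
```

The point is the decoupling: either side can be rewritten or replaced so long as the XML contract holds.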
Airlines are hideously complex.
My company's ERP system is used by a variety of airlines (Aus, NZ, Malaysia, UK, etc. - no, VBA isn't a client), and depending on the complexity of the transactions, database size and volume of transactions, yeah, batch can take hours on current mainframe hardware.
Perhaps the effort of putting in such systems is financially out of reach?
One of the places I work for currently uses batch systems without any formal message queuing system in place (if there is, it's generally custom for a small chunk of the batch). But putting in something across the whole schedule end to end would be a massive financial undertaking, and introduce a lot of risk during the changeover.
Just for our ERP, you're looking at up to $1m/year for support and maintenance. And we're one of the cheapest Tier 1 co's on the block.
We just spent $12 million over the last 18 months upgrading a few core systems (and that's just the software/support costs for one vendor). And that was literally just a version bump and data conversion, with no real change to message-passing systems or business logic (which is yet to come).
Apparently the failure was due to the solid-state disk system. Quite interesting... Anyway, that's what I heard in an interview on 2GB radio.
I can't believe they're having to take their system offline to fail back from DR to their prod system all day today!!! Wow, botch-up 9000!!
I can definitely imagine; I'm dealing with FI-to-FI (Financial Institution) transactions at the moment, and I can tell you it's all batch based.
But I've worked with several transactor sources and we've moved most of our inbound and outbound online stuff to message queues. It takes a bit of work to get them to see the benefits, but when you tell them they don't need to know anything about the recipient system and vice versa, the discussion changes completely... it becomes "well, to do <this>, what do you need to know from us, and what's the best format for you?"
Once you start having that type of conversation you'd be absolutely blown away by just how fast completely disparate systems can be integrated.
But yeah, there's going to be cost and given the size of some of these operations it might be just too hard, which is a real shame.
Depends on the volume of data involved and the nature of the issue. There might be all sorts of different applications and databases involved; moving from DR back to production can be just as (if not more) involved than failing over to DR. Downtime is inevitable in such a big environment... we're not just talking about a single SQL Server.
It all comes down to how much time and money was spent on their DR systems.
But.. a disaster will reveal weaknesses in your DR strategy that you never thought of.
Eleventy-eight billion megabytes of data need to be restored... once the diagnosis of the failed unit is finished... the drive has to be sourced and replaced... then the backup restore can begin... and then the transaction logs can be replayed to make sure the restored database is good...
Oh shit... step 'whatever' failed, we need to go to an earlier backup.
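The roll-forward step above (restore the last good snapshot, then replay the transaction log on top of it) can be sketched roughly like this. Everything here is hypothetical and simplified to dictionaries; real databases do this at the page/WAL level, but the shape of the process is the same.

```python
# Hypothetical sketch: rebuild state from a backup snapshot plus an
# ordered transaction log, as in the restore sequence described above.
def restore_database(snapshot, txn_log):
    state = dict(snapshot)          # start from the backup image
    for txn in txn_log:             # replay each logged transaction in order
        if txn["op"] == "set":
            state[txn["key"]] = txn["value"]
        elif txn["op"] == "delete":
            state.pop(txn["key"], None)
    return state

snapshot = {"PNR001": "booked"}
txn_log = [
    {"op": "set", "key": "PNR002", "value": "booked"},
    {"op": "set", "key": "PNR001", "value": "cancelled"},
]
print(restore_database(snapshot, txn_log))
# {'PNR001': 'cancelled', 'PNR002': 'booked'}
```

And this is exactly where the "go to an earlier backup" pain comes from: if any logged transaction fails to apply, you restart the whole sequence from an older snapshot with a longer log to replay.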
Poor system design and maybe a lack of clustered/hot-swap capacity, plus poorly written or unpractised emergency disaster recovery procedures?
So many questions... I'm honestly amazed that it was only 21 hours.
Our ERP system is based on MQ. It's still not an easy process to replay things if you have a complex setup and a large DB with a high volume of daily transactions (although we have Test and Dev environments as standard, so refreshing live data into Dev, replaying logs, checking consistency, etc. can be done there).
This is on the money.
Can I call you sometime so you can give that speech to my clients? I'll pay you.
I can probably answer that, mate... it's because they probably never envisaged this, and neither they nor the IT guys in charge had the balls to do it... why? Because they have never "practised" it... what is the point of a DR and BCP plan if you never test it?
I designed a similar scenario years back for a company doing ISP billing (I think their clients included quite a few international airports). Once a month there was a failover to the DR site as business as usual; nothing was wrong with the primary site, it was just something I wanted the IT guys there to feel comfortable with. It only took 5 minutes and no transactions were lost. When a router caught fire (literally), they were able to switch to the DR site in less than 2 minutes!
Like I said, they have never tried their DR strategy nor felt comfortable failing over, through lack of training and knowledge, which sux considering their status.
I would use this quoted in my sig if not for the size lol
Couldn't have said it better; agree 100%, and I'm tired of fighting this very issue.
No worries. To my way of thinking, message buses are the future of business systems. I generally think in terms of message architecture, buses and interfaces, but am appalled at the lack of take-up... I am also appalled that the very concept of a message bus seems to exist only WITHIN an application or parts of an application, yet nobody wants to embrace them for system-to-system integration.
It really shames me that in 2010/2011, as I approach middle age (a reflective time in your life), we're still relying on big nasty text files to transfer data from one system to another...
SonicMQ EMB does all that sort of stuff regarding protocol conversion, but I've also heard good things about the IBM suite. Heck, I've integrated SonicMQ and the IBM stuff; they do actually talk to each other lol.
This is where having flow control really helps. You can (as you are aware) put processed messages into a history repository and assign each unit of work a tracking number... if there's a problem, simply replay from the last processed tracking number. We're still working on this here where I am, but the ramifications for DR are simply amazing if we can get it working.
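The tracking-number replay idea can be sketched in a few lines. This is a made-up, in-memory illustration (class and method names are mine, and a real history repository would be durable storage, not a dict), but it shows the mechanism: archive every unit of work under a monotonically increasing number, then re-deliver everything past the last number the downstream system confirmed.

```python
# Hypothetical sketch of replay-from-tracking-number flow control.
class MessageFlow:
    def __init__(self):
        self.history = {}       # tracking number -> archived message
        self.next_tracking = 1  # monotonically increasing counter

    def process(self, message):
        """Archive the message under a tracking number, then deliver it."""
        tn = self.next_tracking
        self.history[tn] = message
        self.next_tracking += 1
        return tn

    def replay_from(self, last_confirmed):
        """Re-deliver every message after the last confirmed tracking number."""
        return [msg for tn, msg in sorted(self.history.items())
                if tn > last_confirmed]

flow = MessageFlow()
for m in ["booking-A", "booking-B", "booking-C"]:
    flow.process(m)
print(flow.replay_from(1))  # ['booking-B', 'booking-C']
```

For DR this means the recovering system only needs to report one number (its last confirmed tracking number) and the bus can bring it back up to date.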
I.T has really gone backwards in recent years... to be honest I'm done with it. We're going back to the '70s in many ways in how we do things, despite all the new technology, but that's a different rant. I'm really over I.T... it has no future in Australia... everything's being sent to "India", hence why everything is so shit these days.
Happy to spread the word. I should warn you however: in my career thus far I've only ever found a single executive who understands a word of it, let alone agrees with it. That's a hit rate of a fraction of a percent, which doesn't bode well for the future of IT.
Seconded... I couldn't have worded it better myself...
You would be SHOCKED at the 'gutting' of I.T in the financial sector here in Melbourne, we've got corporations managing BILLIONS of dollars of investments handling incredibly sensitive information that have denuded themselves of anything even remotely related to Information Technology, I.T is a word that they do not use.
It's all about the infamous S.L.A and treating I.T like an investment (the next time I hear an executive use this analogy I'm going to hit him in the face with a brick)... The mindset is boggling...
I asked someone at a conference recently about their I.T outsourcing, here's what he said (I shit you not).
"I have no idea where these database things are, and frankly I don't give a fuck, they're just databases in the cloud, the outsourcer manages it in accordance with our SLA.. I.T is not our core business, so we outsourced it all and don't worry about it anymore".
They no longer even have a CIO or I.T or operations manager... the whole thing is being managed by their CFO and finance team, I won't name them but when I think of the vast amounts of data being managed it makes my blood run cold.
Executives everywhere are rambling on about 'the cloud! the cloud!' (which is always India).
Not only is the I.T infrastructure being outsourced, so is all the knowledge on how to manage information systems... I spoke to a big 4 bank exec recently in a meeting and asked him about some systems integration issues. He told me straight to my face that NOBODY working directly for the bank knows how that particular system works anymore... it was all done in India by 'some guys'...
Just.... my mind boggles...
Man, you're right on the money with this one. I should know, I used to work for Virgin Blue.
Try fighting for a better architecture for your area of responsibility, only for it to be shot down in flames by middle managers who think they know what is best for the business, even though they've had their foot in the door for five minutes and the business is screaming at them that they have no clue what they're talking about. Yet it's all about sucking up to a blue-chip vendor who provides a system that has minority support in Oceania.
Now that I've got that off my chest ... I got the hell out of dodge when I realised the direction they were heading, and it will end in tears.