Broadband Performance Devices Generate Bad Data

Discussion in 'Networking, Telephony & Internet' started by SiliconAngel, Jun 12, 2019.

  1. ViPeR-7

    ViPeR-7 Member

    Joined:
    Jun 28, 2001
    Messages:
    582
    Location:
    Newcastle, NSW, Australia
    Firstly, there are many international links from Australia to the rest of the world, and most ISPs do not use all of them, just a selection, which means various ISPs will have varying speed and latency depending on where the destination server is in the world.
Secondly, these links are capable of huge bandwidth (e.g. last I checked the AJC is capable of 10,000 Gbps), but they are huge bundles of fiber links; individual strands may be leased out to providers, or shared among whole groups of them, with usage split by wavelength, or limited by quotas or prioritization. While only a fraction of the whole cable bundle is being used, the individual strand(s) leased by the backhaul provider who supplies your ISP and a half dozen others may only be capable of a few Gbit, and your ISP may only pay for a 10% slice of that. They could pay for a bigger slice or more fibers, but often don't, because only a small selection of their customers will actually understand what's going on and that they're being ripped off.

    As such it's really not a question of Australia's international bandwidth - if the demand is there from providers, the bandwidth will be made available for purchase - but with providers offering 100 Mbit/1 Gbit "unlimited" connections for under $100/mo, and NBNco taking a sizable slice of this just for the end connection, the ISP may simply not be able to afford sufficient bandwidth to keep its customers running at full speed. This will vary by ISP, depending on their overheads, their profit margin, and the usage profiles of their userbase.
    Once an ISP gets a reputation as being "great for international performance", heavy users tend to flock towards it, saturating the international bandwidth the ISP pays for, and making it bad for international performance. At this point the ISP can either purchase more bandwidth, change their plans to include a data limit or shaping, or just sit on what they have and allow the contention ratio to get worse and worse, allowing their profits to increase.
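
    To put some rough numbers on that contention maths (every figure below is made up purely for illustration, not any real ISP's or cable's):

```python
# Rough contention arithmetic - all numbers are made-up illustrations,
# not real figures for any ISP or cable system.

strand_capacity_mbps = 4_000        # one leased strand/wavelength, say ~4 Gbps
isp_share = 0.10                    # ISP pays for a 10% slice of that strand
customers = 5_000                   # customers sharing that slice
peak_concurrency = 0.05             # fraction actively downloading at peak

isp_international_mbps = strand_capacity_mbps * isp_share
active_users = customers * peak_concurrency
per_user_mbps = isp_international_mbps / active_users

print(f"ISP's international slice: {isp_international_mbps:.0f} Mbps")
print(f"Active users at peak:      {active_users:.0f}")
print(f"Per-user international:    {per_user_mbps:.1f} Mbps")
# ~1.6 Mbps each - well short of a 100 Mbps plan, purely from contention.
```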
     
    SiliconAngel, caspian and JSmithDTV like this.
  2. caspian

    caspian Member

    Joined:
    Mar 11, 2002
    Messages:
    10,347
    Location:
    Melbourne
    absolutely it should.

    it could be any one of (or a combination of) a number of factors, and the end user's ability to determine which one is causing the issue is often limited by the lack of additional inputs, visibility into how the connection works, and sectionalisation tools.

    I just ran a few tests at random

    to local melbourne server - 94Mbps
    to Los Angeles - 85Mbps
    to UK - 24Mbps
    to Japan - 95Mbps

    what can I derive from that? not much, other than that the network connecting me to the ISP isn't at fault. does my ISP have crappy routing to the UK? no idea. do they have insufficient peering bandwidth? can't tell. does some 3rd party well beyond my ISP currently have congestion issues? maybe. is the download server I picked at random not capable of delivering enough data to max out my local connection? possible.

    this is where internet-based testing sucks, even if you control the test server at the other end, because there are simply too many factors that you have zero visibility of, and no control over, to determine why a problem occurs. all you know is that if the test result was good, then everything is OK.

    this is why you do a range of testing like I just did, to discount a single test to the UK skewing the results. yes, the best-effort nature of the internet means that some caution is required when using the results, but this doesn't render real-world testing fundamentally flawed.
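
    if anyone wants to repeat that kind of spread themselves, something like this rough sketch is all it takes - the URLs are placeholders, swap in test files hosted wherever you want to measure:

```python
# Crude multi-destination throughput check. The URLs are placeholders -
# substitute test files hosted in the regions you care about.
import time
import urllib.request

TEST_FILES = {
    "Melbourne": "https://example-mel.test/100MB.bin",
    "Los Angeles": "https://example-lax.test/100MB.bin",
    "London": "https://example-lon.test/100MB.bin",
    "Tokyo": "https://example-tyo.test/100MB.bin",
}

def measure(url, chunk=1 << 16):
    """Download the file and return throughput in Mbps."""
    start = time.monotonic()
    total = 0
    with urllib.request.urlopen(url) as resp:
        while True:
            data = resp.read(chunk)
            if not data:
                break
            total += len(data)
    elapsed = time.monotonic() - start
    return (total * 8 / 1_000_000) / elapsed

for region, url in TEST_FILES.items():
    try:
        print(f"{region:12s} {measure(url):6.1f} Mbps")
    except OSError as exc:
        print(f"{region:12s} failed: {exc}")
```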

    should the ISP be held accountable for poor overall performance? that depends on whether it's under their control, doesn't it? like my ISP deliberately cheaping out on bandwidth on their peering route to Youtube's CDN. in other cases, maybe not.

    but the point remains that the ACCC testing is designed to measure real-world user experience against what has been sold to them. it's not to determine the nature or location of any issues, because the end user doesn't care about why their experience is less than they are paying for - just that it is. if a particular ISP's results are poor compared to the rest of the industry, then absolutely there is something they can do about it - that might range from selling Joe Average the right sized plan for his line, through to fixing their own network performance, to maybe reconsidering who their external peering partners are. if others in the industry are delivering better results - across thousands of individual services and multiple tests per service - then clearly others are delivering a better end user experience, and publishing the test results lets the consumer see this.
     
    SiliconAngel likes this.
  3. OP
    OP
    SiliconAngel

    SiliconAngel Member

    Joined:
    Jun 27, 2001
    Messages:
    627
    Location:
    Perth, Western Australia
    Thanks for that ViPeR, much appreciated. I agree, it's a difficult balancing act...


    Thanks again for taking the time to go through that, Caspian. I think your final paragraph summarises why SamKnows does the broadband measurement in the way that it does, and why the ACCC will have chosen this methodology for the programme - across a broad enough test base, it should be possible to see performance issues caused by an RSP not purchasing enough international bandwidth, as their results will be poorer overall than others like Telstra that don't have the same restrictions (you don't have to lease bandwidth on cables you already own). While the details are obscured, as you say, the overview approach is more representative of the final product that the end user is experiencing.

    Unfortunately this won't reveal if there are performance issues caused by backhaul operators deliberately de-prioritising specific network traffic from RSP customers - it would be trivial to throttle traffic that matches the SamKnows data packets yet leave everything else alone, which would make your competitor's results look far worse than they should be. But I guess that's where communication with the RSP may help get to the bottom of it - if the RSP knows they have sufficient bandwidth, they could look into this behaviour to work out what's going on.

    This is really the thrust of my argument. The results I've seen from my specific device throw up a lot of questions. While I agree that real-world testing is important, my concern is that if the results are weighted towards testing international traffic, that's not remotely indicative of someone's real-world use - RSPs go to a lot of effort to ensure large proportions of their traffic can hit internal peers and caches of things like YouTube, which means that end users don't even realise that huge amounts of the content they're accessing is being served from within the RSP's network. That means weighting the results heavily in favour of international testing is unreasonable in the majority of cases - it should be part of the mix, yes, but certainly not the thing to focus on.

    What's interesting is that, while the SamKnows box is definitely telling me my Internet connection is far slower than it actually is, and it certainly seems to be doing most of its tests against the UK servers, if this was the case for all ABB's customers their overall results would be far lower than they are. I don't have any way to know if my circumstance is unique, but it's certainly unusual enough that it doesn't fit the overall trend. The trick is going to be working out the how and why.
     
  4. evilasdeath

    evilasdeath Member

    Joined:
    Jul 24, 2004
    Messages:
    4,861
The results are valid if the tests are only compared against the same host. Even if that host is international. And yes, international performance should form part of the overall analysis. I would guess that SamKnows compares against multiple sites and forms a grading on a combined result. User A on ISP A testing to site C is comparable to User B on ISP B testing to site C.
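
    My guess at the shape of that comparison - not SamKnows' actual method, just how you'd naturally group it:

```python
# Guess at the shape of the comparison, not SamKnows' actual method:
# group individual test results by (ISP, target server) and compare medians,
# so User A on ISP A to site C is only ever compared with other users
# testing to site C. All rows below are fabricated samples.
from collections import defaultdict
from statistics import median

results = [  # (isp, target_server, measured_mbps)
    ("ISP A", "Site C", 82.0), ("ISP A", "Site C", 79.5),
    ("ISP B", "Site C", 44.1), ("ISP B", "Site C", 51.3),
    ("ISP A", "Site D", 93.0), ("ISP B", "Site D", 90.2),
]

grouped = defaultdict(list)
for isp, target, mbps in results:
    grouped[(isp, target)].append(mbps)

for (isp, target), samples in sorted(grouped.items()):
    print(f"{isp} -> {target}: median {median(samples):.1f} Mbps "
          f"over {len(samples)} tests")
```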

    Looking at SamKnows, they have servers all over, but they're priced differently, and I'm sure the regulator that set this up only had $x to spend, so maybe it's partially flawed by people choosing the wrong server set for the wrong price. Or maybe they have a combination. Do you actually have access to the data dashboard?

    End users don't care where data is, only that it works, everyone likes to blame everyone else because it's a complex web in the background.

    Caspian also brought up an interesting thing about VPNs - how with the VPN it works fine and without it, it buffers. The VPN does two things: 1) it changes how your traffic looks as it passes over your ISP, and everyone thinks this is the main reason for the slowdown, but 2) it changes your external peering IP, which is actually far more interesting.

    When your computer talks to YouTube it will talk to your ISP's CDN cache; when you go via a VPN it will use a different CDN. CDNs are now a MASSIVE part of the internet, and far more important than international peering. CDN capacity within large ISPs far exceeds international peering. Now a question for you all: congestion on the CDNs - who is responsible? The ISP or the CDN provider?

    A combination of both: the CDN needs to provide the servers, and the ISP needs to provide the interconnect point.

    How much extra capacity should the CDN provide? N+1 at a server level or N+N at an ISP level, i.e. 10+1 servers at one location or 10+10 servers at two different locations? Both provide 100G of capacity. What about when you get to 400G? 40+10? 40+40? What happens when your ISP's 100G line card fails? You can keep stacking redundancy as much as you want for the one time in ten years that something fails unexpectedly. Deciding when to upgrade is hard as well.
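
    To put numbers on that trade-off (figures picked purely for illustration, not any real CDN deployment):

```python
# Surviving capacity under failure - figures only illustrate the N+1
# (single site) vs N+N (two sites) trade-off, not any real CDN deployment.

SERVER_GBPS = 10  # assume each cache server can push ~10G

def capacity(servers_per_site):
    """Total capacity in Gbps, given a list of server counts per site."""
    return sum(count * SERVER_GBPS for count in servers_per_site)

n_plus_1 = [11]       # 10+1 servers at one site
n_plus_n = [10, 10]   # 10+10 servers across two sites

# Lose one server: both layouts still cover ~100G of demand.
print("N+1 after one server fails:", capacity([10]), "G")      # 100 G
print("N+N after one server fails:", capacity([9, 10]), "G")   # 190 G

# Lose a whole site (say the ISP's 100G line card dies):
print("N+1 after a site fails:", capacity([0]), "G")           # 0 G
print("N+N after a site fails:", capacity([0, 10]), "G")       # 100 G
```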

    It's getting to the point that adding 10G doesn't do a whole lot. Then there are issues of load balancing, which is yet another discussion. It's a lot harder running an ISP than many users think, beyond the last mile.
     
  5. OP
    OP
    SiliconAngel

    SiliconAngel Member

    Joined:
    Jun 27, 2001
    Messages:
    627
    Location:
    Perth, Western Australia
    Thanks for taking the time to reply :)
    Yup, unless a network owner in the middle is interfering...

    Can you elaborate on this? Do they have a customer pricing sheet available somewhere?

    I have access to the dashboard for my specific device, if that's what you mean?

    I agree that testing through VPN tunnels can be very enlightening - thanks to VPN testing, I once discovered an ISP was deliberately blocking file downloads to a specific region (affecting about 10,000 customers) using DPI QoS, despite speed tests still working perfectly.
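
    The comparison itself is basically this - a minimal sketch, assuming you have some local proxy/VPN endpoint to push the second download through (the URL and proxy address below are placeholders):

```python
# Compare the same download direct vs via a tunnel/proxy. The URL and
# proxy endpoint are placeholders - point them at your own test file and
# whatever VPN/proxy exit you are testing through.
import time
import urllib.request

TEST_URL = "https://example.test/50MB.bin"
PROXY = "http://127.0.0.1:8080"   # placeholder local proxy into the VPN

def throughput(opener):
    """Download TEST_URL via the given opener, return Mbps."""
    start = time.monotonic()
    total = 0
    with opener.open(TEST_URL) as resp:
        while True:
            chunk = resp.read(1 << 16)
            if not chunk:
                break
            total += len(chunk)
    return (total * 8 / 1_000_000) / (time.monotonic() - start)

direct = urllib.request.build_opener()
tunnelled = urllib.request.build_opener(
    urllib.request.ProxyHandler({"http": PROXY, "https": PROXY}))

print(f"direct:    {throughput(direct):.1f} Mbps")
print(f"via proxy: {throughput(tunnelled):.1f} Mbps")
# A big gap on real downloads while plain speed tests look fine suggests
# that traffic is being treated differently somewhere on the direct path.
```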

    The SamKnows box does test against BBC media servers and a Netflix AWS server, but they're in the UK and USA respectively, so it isn't representative of local content server performance. At the end of the day, it's impossible for a device to test against the whole Internet, so any testing is going to be limited by definition, and possibly non-representative as a result... Unless the SamKnows boxes build up a profile based on network activity from the premises and somehow test against servers most indicative of the traffic it's seeing... But that would be a bit creepy.
     
    Last edited: Jun 16, 2019
  6. evilasdeath

    evilasdeath Member

    Joined:
    Jul 24, 2004
    Messages:
    4,861
    On the SamKnows website, if you go to the test servers page it shows the locations, plus four boxes at the bottom with different numbers of $$$ - I assume those are costings.

    One set of data is inconsequential; the whole idea of a testing platform is lots of data sources.

    How do you know it was DPI QoS? It could be just a missing route - it does happen. Getting consumer ISPs to care about specific routes is painful - it's hard enough for business ISPs - and it's hard to prove. Again, another question: if an ISP is not receiving a route, is it the destination ISP's or the IP owner's responsibility to maintain connectivity to everywhere?

    Also, with VPN tunnels the path changes. Before it was you going from A to B; with the VPN it's A to C and then to B. The path A-C can be very different to A-B. C could also be connected to the ISP's network directly, and as such pays to connect to that ISP, so if congestion exists on that interface it's no longer the ISP's problem.
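
    The easiest way to see that A-B vs A-C-B difference yourself is to grab the hop list with and without the tunnel up - a rough sketch, assuming a Unix-style traceroute is installed and using a placeholder destination:

```python
# Capture the hop list to a destination so you can diff the direct path
# against the VPN path. Assumes a Unix-style traceroute binary is on the
# system; the destination address is just a placeholder.
import subprocess

DEST = "203.0.113.10"   # placeholder destination (TEST-NET-3 address)

def hops(dest):
    out = subprocess.run(["traceroute", "-n", dest],
                         capture_output=True, text=True, check=False)
    return out.stdout

print(hops(DEST))
# Run once on the bare connection, bring the VPN up, run it again and
# diff the two hop lists - the divergence point is where C enters the path.
```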

    I can give you lots of stories where customers have not been able to get to a destination through no fault of the local ISP, and connectivity was required for legitimate purposes.


    Netflix is everywhere, not just the USA; which server you connect to will depend on your DNS.

    I'm not saying your observations are wrong - the whole testing regime could be entirely wrong, I don't know, and I don't think you have enough data to say it's one way or the other. You're one data point in a swarm. I'm also saying that it's not all as it seems within ISPs: they don't deliberately block/filter things just because they feel like it. There are usually very good reasons, and usually for the overall better result, which is usually fewer customer complaints. However, often the solution provided is the cheapest one, not the one that should be done.
     
  7. OP
    OP
    SiliconAngel

    SiliconAngel Member

    Joined:
    Jun 27, 2001
    Messages:
    627
    Location:
    Perth, Western Australia
    Ah yes, I had seen that page. I assume from the fact that there's definitely a server within the ABB network (confirmed by ABB - it's in Melbourne) that the ACCC have gone with the dedicated server option there.

    Yes, I thought that's what you meant, just confirming.

    Because I spoke to an iiNet engineer who confirmed it, and explained it was due to an optical port failure: they had a DPI box in place to keep the route working on the restricted bandwidth they had available over their backup path, supposedly while they were waiting for replacement equipment. So it was for network stability, which I completely understand - having some Internet access is better than nothing at all, which was the alternative if they hadn't done it.

    What I did have a problem with was the fact that a) iiNet/TPG not only didn't explain to customers what was happening, but this engineer asked me not to mention his name because his job would be on the line. So they were being secretive about it, instead of honest with their customers. But more importantly, b) this situation had been ongoing for nearly three months when I discovered it - my client had been putting up with poor performance and weird behaviour for a long time before they mentioned it to me.

    We shifted them to ABB and within a few hours they were working fine again. iiNet of course offered no discounts to their customers for the fact that they weren't getting anything like a reasonable service - they wouldn't even disclose it to them. I have no idea how long it went on for after my job was done, either.

    Yes, but in this case the Wireshark capture showed me the IP address the whitebox was communicating with, which is an Amazon AWS server located in one of Amazon's US datacentres hosting a Netflix speed test server. I don't know if the whitebox has that IP hard-coded into it or if Netflix aren't peering that instance to AWS's global infrastructure, but that specific traffic is going to a particular IP, and over an 80 minute capture that's the only one it contacted.
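
    For anyone wanting to check their own box, this is roughly the sort of tally I did on the capture afterwards (needs scapy installed; the filename is just whatever you exported from Wireshark/tcpdump):

```python
# Tally which destination IPs a capture actually talks to, and how much
# data goes to each - a rough way to see where whitebox traffic is headed.
# Requires scapy; the capture filename is a placeholder.
from collections import Counter
from scapy.all import rdpcap, IP

packets = rdpcap("whitebox_capture.pcap")

dest_bytes = Counter()
for pkt in packets:
    if IP in pkt:
        dest_bytes[pkt[IP].dst] += len(pkt)

for ip, nbytes in dest_bytes.most_common(10):
    print(f"{ip:15s} {nbytes / 1_000_000:8.1f} MB")
```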

    All good. I have no skin in this either way, other than wanting to ensure the system is providing reliable data. Yes, I'm just one data point, but based on the data I do have access to and can compare against, the results being reported are entirely wrong. Either ABB have a weird configuration somewhere that's causing a lot of problems for me specifically, or there's something odd about the behaviour of my whitebox. Because if the whole system was configured like this, either everyone's bandwidth results would look like this, or all of ABB's numbers would be severely impacted, which they're not (although they do seem lower than I would expect). So I'm going to keep pulling on this thread until I can figure out where it goes, 'cause I don't have a good answer yet.
     
    Last edited: Jun 17, 2019
  8. Luke212

    Luke212 Member

    Joined:
    Feb 26, 2003
    Messages:
    9,652
    Location:
    Sydney
    Obviously a real-world test will test points from around the world, because real users download data from around the world! It is NOT the same thing as speedtest.net, because speedtest.net finds the closest server to you and only tests a short section of your internet pipe. It is really good of SamKnows to do this, because each RSP buys/rents international bandwidth. So if they skimp on that, it will affect your speed!

    Hope that makes sense Silicon Angel!

    ps. to address your other concern, these international servers are very well provisioned and fast. they are unlikely to be a limiting factor. it is theoretically possible SamKnows is using bad servers, but very unlikely given how these things are usually set up in the real world.

    pps. if your SamKnows test bandwidth is far less than your speedtest.net results, you should change RSPs! you are being ripped off! boom, the system works :)
     
    Last edited: Jun 17, 2019
