This excessive inter-AZ data transfer pricing is distorting engineering best practices. It _should_ be cheap to operate HA systems across 2-3 AZs, but because of this price distortion on inter-AZ traffic charges, we lean towards designs that either silo data within an AZ, or that leverage S3 or other hosted solutions as a sort of accounting workaround (i.e. there are no data transfer charges to read/write an S3 bucket from any AZ in the same region).
While AWS egress pricing gets a lot of attention, I think that the high cost of inter-AZ traffic is much less defensible. This is transfer on short fat pipes completely owned by Amazon. And at $0.01/GB, that's 2~10X what smaller providers charge for _internet_ egress.
However, I do work for a company with >1 million servers, and scaling inter-datacentre bandwidth is quite hard. Sure, the datacentres might be geographically close, but laying network cable over distance is expensive. Moreover, unless you spend uber millions, you're never going to get as much bandwidth as you have inside the datacentre.
So you either apply hard limits per account, or price it so that people think twice about using it.
In Ashburn, VA, I can buy Dark Fiber for $750 MRC to any datacenter in the same city. I can buy Dark Fiber for $3-5K MRC to any random building in the same city.
That Duplex Dark Fiber with DWDM can run 4 Tbps of capacity at 100GE (40x 100GE). Each 100GE transceiver costs $2-4K NRC depending on the manufacturer - $160K NRC for 40x. (There are higher densities as well, like 200/400/800GE; 100GE is just getting cheap.)
In AWS, utilizing 1x 100GE will cost you >$1MM MRC. For significantly less than that - let's say an absolutely worst-case $5K MRC + $200K NRC - you can get 40x 100GE.
Now you have extra money for 4x redundancy, fancy routers, over-spec'd servers, world-class talent, and maybe a yacht if your heart desires.
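A quick back-of-envelope check on that ">$1MM MRC" figure. The $0.01/GB-billed-on-each-side rate and full link saturation are my assumptions here, not quoted AWS prices, so verify against current pricing:

```python
# Rough cost of saturating a 100GE link at assumed AWS inter-AZ rates.
# Assumption: $0.01/GB charged on each side of the transfer, i.e.
# $0.02/GB total. Not a quote of current AWS pricing.
gbps = 100
seconds_per_month = 30 * 24 * 3600            # 2,592,000 s
gb_one_way = gbps / 8 * seconds_per_month     # 32,400,000 GB/month
cost_per_gb = 0.02                            # $0.01 in + $0.01 out

one_way = gb_one_way * cost_per_gb
full_duplex = 2 * one_way                     # saturated in both directions
print(f"${one_way:,.0f}/mo one way, ${full_duplex:,.0f}/mo full duplex")
# -> $648,000/mo one way, $1,296,000/mo full duplex
```

So a one-direction saturated link is already ~$650K/month, and a full-duplex one clears the $1MM mark.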
I’m just throwing out a hypothetical, so I may be completely off base: perhaps AWS charges high inter-AZ bandwidth prices to keep users from tunneling traffic between availability zones to arbitrage lower Internet/egress costs at AZ 1 vs AZ 3.
Outside of my statement above, I do agree that the cost Amazon pays for bandwidth between their sites has to be practically nothing at their scale/size (and thus they should charge their customers very little for it, especially considering that easy multi-AZ is a big differentiator for cloud vs self-hosting/colo). The user above’s dark fiber MRC prices are spot on.
OK but $10/TB has gotta be like >99% profit margin for AWS. After massively jacking up their prices, Hetzner internet egress is only €1/TB. Also AWS encourages / in some cases practically forces you to do multi-AZ.
I remember switching to autoscaling spot instances to save a few bucks, then occasionally spot spinup would fail due to lack of availability within an AZ so I enabled multi-AZ spot. Then got hit with the inter-AZ bandwidth charges and wasn't actually saving any money vs single-AZ reserved. This was about the point I decided DIY Kubernetes was simpler to reason about.
Apples and oranges. Hetzner doesn't even have multiple AZs by AWS's definition - all of Hetzner's DCs, e.g. Falkenstein 1-14, would be in the same AZ.
AWS network is designed with a lot more internal capacity and reliability than Hetzner which costs a lot more - multiple uplinks to independent switches, etc.
AWS is also buying current-gen network gear, which is much more pricey - Hetzner is mostly doing 1 gig ports, or 10 gig at a push, which means they can get away with >10 year old switches (if you think they buy new switches, I have a bridge you might be interested in buying).
I agree with this post that Hetzner is a bad example. They are focused on a budget deployment.
I do not agree that a state-of-the-art high capacity deployment is as expensive as you think it is. If an organization pays MSRP on everything, has awful procurement with nonexistent negotiation, and multiple project failures, sure, maybe. In the real world though, we're not all working for the federal government (-:
While your caveats are all noteworthy, I'll add that Hetzner also offers unlimited/free bandwidth between their datacenters in Germany and Finland. That's sort of like AWS offering free data transfer between us-east-1 and us-east-2.
I assume it’s to discourage people from architecting designs that abuse the network. A good example would be collecting every single metric, every second, across every instance for no real business reason.
Price discrimination is when you charge different customers different amounts for the same thing. And usually the difference in those prices is not made apparent - like when travel websites quote iOS users higher prices than Android users because they can generally afford to pay more.
It's a bit of a mix, but price discrimination isn't far off. It's like the SSO tax; all organizations are paying for effectively the same service, but the provider has found a minor way to cripple the service that selectively targets people who can afford to pay more.
If we want to call this just regular ole pricing, it's not a leap to call most textbook cases of price discrimination "regular ole pricing" as well. An online game charges more if your IP is from a certain geography? That's not discrimination; we've simply priced the product differently if you live in Silicon Valley; don't buy it if you don't want it.
Price discrimination has a clear definition. It’s not illegal in the US (when consumers are the victims anyway) but it has a clear meaning and you’re blurring the lines for I’m not sure what reason. Your example of a video game doing regional pricing is a perfect example of textbook price discrimination.
> Price discrimination ("differential pricing",[1][2] "equity pricing", "preferential pricing",[3] "dual pricing",[4] "tiered pricing",[5] and "surveillance pricing"[6]) is a microeconomic pricing strategy where identical or largely similar goods or services are sold at different prices by the same provider to different buyers based on which market segment they are perceived to be part of.
That sounds exactly like what is happening here.
Intra-zone and inter-zone network traffic are two very similar services. One is free and one costs 1¢ per GB. And customers who need inter-AZ traffic are probably in a different market segment. Now, it is more expensive for AWS to build the infrastructure for inter-zone networking, so it isn't exclusively price discrimination, but assuming that getting more money from wealthier clients was a motivation, it seems to match the definition to me.
Re: excludability, yes it is excludable since there is a price, but that doesn't have much to do with how the price is much higher than the cost to AWS for providing the service.
Alright, let's take a look at that first link as if it's gospel. AWS charging excessively for inter-AZ networking is:
1. a microeconomic pricing strategy
2. where largely similar goods (AWS with or without substantial inter-AZ bandwidth)
3. are sold at different prices (excessive inter-AZ networking fees) to different buyers
4. based on perceived market segments (most customers don't need (or don't know they need till they're locked in) much inter-AZ bandwidth, but larger, richer corporations likely do)
I'm not trying to blur the lines. On top of any juggling of our favorite sources of definitions, that particular pricing strategy has all the qualitative hallmarks of price discrimination. Everyone still buys AWS, most customers are unaffected by the lack of bulk inter-AZ bandwidth, and AWS can successfully charge much more to those who can afford to pay.
Sending traffic between AZs doesn’t necessarily improve availability and can decrease it. Each of your services can be multi-AZ, but with hosts that talk only to other endpoints in their own AZ.
Unless your app is completely stateless, you will need some level of communication across AZs.
And you often want cross-zone routing on your load balancers so that if you lose all the instances in one AZ traffic will still get routed to healthy instances.
It very much is, because scaling bandwidth between physical datacenters which are not located next to each other is very expensive. So pricing it means that people don't use it as much as they would if it were free.
> It _should_ be cheap to operate HA systems across 2-3 AZs,
In the steady state. HA systems tend towards large data bursts when failures or upgrades occur.
> And at $0.01/GB, that's 2~10X what smaller providers charge for _internet_ egress.
It's a lower latency network with a high SLA and automatic credits if the SLA isn't maintained. I think the inter-AZ option provides a level of service that's much higher than what most people want or need.
It might be nice if there was a "best effort" inter-AZ network. This would probably fit better with the synchronization methods built into most HA software anyways.
So, to me, it's a good product, it's just designed for a very niche segment of the market and often mistaken for something more general than it actually is.
I've used VictoriaMetrics in the past (~4 years ago) for collecting not just service monitoring data but also network switch and cell tower module metrics. At the time I found it to be the most efficient Prometheus-like service in terms of query speed, data compression and, more importantly, the ability to handle high cardinality (10s or 100s of millions of series).
However, I later switched to ClickHouse because I needed the extra flexibility of running occasional async updates or deletes. In VictoriaMetrics you usually need to wipe out the entire series and re-ingest it. That may not be possible, or would be quite annoying, if you are dealing with a long history and just want to update/delete some bad data in one month.
So, if you want a more efficient Prometheus drop-in replacement and don't think the limited update/delete ability is an issue, then I highly recommend VictoriaMetrics. Otherwise, ClickHouse (larger scale) or Timescale (smaller scale) has been my go-to for anything time series.
When I see a license on a project I expect that project will provide the code under that license and function fully at runtime, not play games of "Speak to a sales rep to flip that bit or three to enable that codepath".
I find it frustrating that it is not immediately clear it is open core (in which case we shall never touch it, as per our lawyers). So hopefully people will keep commenting on that.
I'd love to see a comparison with Mimir. Some of the problems that this article describes with Prometheus are also solved by Mimir. I'm running it in single binary mode, and everything is stored in S3. I'm deploying Prometheus in agent mode so it just scrapes and remote writes to Mimir, but doesn't store anything. The helm chart is a bit hairy because I have to use a fork for single binary mode, but it has actually been extremely stable and cheap to run. The same AZ cost saving rules apply, but my traffic is low enough right now for it not to matter. But I suppose I could also run ingesters per AZ to eliminate cross-AZ traffic.
I was on a team once where we ran agent-mode Prometheus into a Mimir cluster and it was endless pain and suffering.
Parts of it would time out and blow up, one of the dozen components (slight hyperbole) they have you run would go down and take half the cluster with it. It often had to be nursed back to health by hand, it was expensive to run, and queries weren't even that fast.
Absolutely would not repeat the experience. We cheered the afternoon we landed the PR to dump it.
I definitely think that running the microservices deployment of Mimir (and Loki) looks hairy. But the monolithic deployments can handle pretty large volumes.
Interesting. I'm fairly new to the field, but would this configuration help reduce the cost of logging security events from multiple zones/regions/providers to a colocated cluster?
Not really. On AWS, you're always going to pay an egress cost to get those logs out of AWS to your colo. If you were to ship your security logs to S3 and host your security log indexing and search services on EC2 within the same AWS region as the S3 bucket, you wouldn't have to worry about egress.
> while it’s tempting to use the infinitely-scalable object storage (like S3), the good old block storage is just cheaper and more performant
How is it cheaper? Object storage is cheaper per GB. Does using S3 have another component that is more expensive, maybe a caching layer? Is the storage format significantly less efficient? Are you not using a VPC endpoint to avoid egress charges?
You are correct that storage is cheaper in S3, but S3 charges per request to GET, LIST, POST, COPY, etc objects in your bucket. Block storage can be cheaper when you are frequently modifying or querying your data.
It is, but it's not _that_ many. AWS pricing is complicated, but for fairly standard services and assuming bulk discounts at the ~100TB level, your break-even points for requests/network vs storage happen at:
1. (modifications) 4200 requests per GB stored per month
2. (bandwidth) Updating each byte more than once every 70 days
You'll hit the break-even sooner, typically, since you incur both bandwidth and request charges.
That might sound like a lot, but updating some byte in each 250KB chunk of your data once a month isn't that hard to imagine. Say each user has 1KB of data, 1% are active each month, and you record login data. You'll have 2.5x the break-even request count and pay 2.5x more for requests than storage, and that's only considering the mutations, not the accesses.
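To make those break-even points concrete, here's the arithmetic as a sketch. The ~$0.021/GB-month storage, $0.005-per-1K-request, and ~$0.05/GB transfer prices are illustrative bulk-discount assumptions on my part, not quoted AWS rates:

```python
# Break-even sketch: S3 request/bandwidth spend vs storage spend.
# All prices are assumed bulk-discount figures (~100TB level);
# substitute the real numbers from your own bill.
storage = 0.021        # $/GB-month stored
put = 0.005 / 1000     # $/request (PUT/COPY/POST/LIST class)
transfer = 0.05        # $/GB moved

# 1. Requests per stored GB per month where request spend == storage spend
req_break_even = storage / put

# 2. Days between full rewrites at which transfer spend == storage spend
days_per_full_rewrite = 30 * transfer / storage

print(round(req_break_even), round(days_per_full_rewrite))
# -> 4200 71
```

That's where the "4200 requests per GB per month" and "once every ~70 days" figures come from. The login example works the same way: 1GB holds ~1M users at 1KB each, so 1% monthly activity is ~10,000 requests per stored GB, about 2.5x the break-even.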
You can reduce request costs (not bandwidth though) if you can batch them, but that's not even slightly tenable till a certain scale because of latency, and even when it is you might find that user satisfaction and retention are more expensive than the extra requests you're trying to avoid. Batching is a tool to reduce costs for offline workloads.
Ok, there are definitely cases where it would be more expensive, like using it for user login data.
But for metrics, like you would use for prometheus:
- Data is usually write-once/append-only. There isn't usually any reason to modify the metrics after you have recorded them.
- The bulk of your data isn't going to be used very often. It will probably be processed by monitors/alerts, and maybe the most recent data will be shown in a dashboard (and that data could be cached on disk or in memory). But most of it is just going to sit there until you need it for an ad-hoc query, and you should probably have an index to reduce how much data you need to read for those.
- This metrics data is very amenable to batching. You do probably want to make recent data available from memory or disk for alerts, dashboards, queries, etc. But for longer term storage it is very reasonable to use chunks of at least several megabytes. If your metrics volume is low enough that you have to use tiny objects, then you probably aren't storing enough to be worried about the cost anyway.
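The batching point is easy to quantify. Assuming an illustrative $0.005-per-1K-PUT price (my assumption, not a quoted rate) and 100 GB/month of ingested metrics:

```python
# Request-cost gap between tiny objects and batched chunks when
# writing metrics to an object store. The PUT price and the 100 GB/mo
# ingest volume are assumptions for illustration only.
put_cost = 0.005 / 1000                        # $/request
data_bytes = 100 * 10**9                       # 100 GB of metrics per month

tiny = data_bytes / 1024 * put_cost            # one PUT per 1 KB object
chunked = data_bytes / (8 * 2**20) * put_cost  # one PUT per 8 MB chunk
print(f"tiny: ${tiny:,.2f}/mo, chunked: ${chunked:.2f}/mo")
# -> tiny: $488.28/mo, chunked: $0.06/mo
```

Four orders of magnitude, which is why several-MB chunks for long-term storage make the request charges a rounding error.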
I like the new generation of metric database, but are there systems that allow for true distributed deployments? E.g. where some machines might be offline for a few days, and need to sync (send some/receive some) metrics with other machines when back online?