tech planet

Wednesday, 28 December 2011

What to do when Amazon’s spot prices spike

Rapid price spikes are affecting buyers on the Amazon Web Services Spot Instances market, where users are now bidding extremely high prices for scarce compute capacity. These price spikes are new, and they call into question assumptions that many users have made about how the auctioning of computing resources works.
The first report of this came in late September, when marketing software service SEOMoz reported huge price spikes on the spot market. A sudden spike in the price of “m2.2xlarge” servers (normally $.44/hour) drove the price briefly up to $999/hour, causing a site-wide outage. While this was bad news for SEOMoz, it was probably worse news for the unlucky customers who ended up paying $999 for one hour of compute time!

Will you pay $999 per hour for a server?

Why would anyone bid such a high price? It’s hard to say for sure, but the unlucky winner of the auction probably did not expect to pay $999 per hour for a server. On the Amazon marketplace, your bid represents the maximum amount that you are willing to pay: you usually end up paying much less than your bid. Many buyers seem to have assumed that the price would never rise above the fixed-price “on-demand” rate charged by Amazon.
Unfortunately, it seems like a large number of people were using that flawed strategy. And when something changed in the spot market (perhaps a reduction in the number of machines available to rent, due to increased demand) the unrealistically high bids that customers made went into effect. Amazon has since posted a video describing the different strategies buyers on the spot market use. Litmus, one of the companies mentioned in the video, describes their strategy as “bidding high for convenience.” The $999 bidder who cornered the spot market on large servers was probably using an extreme version of this strategy.
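To make the risk concrete, here is a minimal sketch of how a bid plays out against a series of hourly prices. It assumes a simplified model in which each hour is billed at that hour's market price and the instance is terminated as soon as the price rises above the bid; Amazon's actual billing rules have more detail than this. The function name and the price series are made up for illustration, using the m2.2xlarge figures quoted above.

```python
def simulate_spot_cost(hourly_prices, bid):
    """Return (hours_run, total_cost) for one spot instance.

    Simplified model (an assumption, not Amazon's exact billing rules):
    - each hour is billed at that hour's market price, not at the bid;
    - the instance is terminated the first hour the price exceeds the bid.
    """
    hours_run = 0
    total_cost = 0.0
    for price in hourly_prices:
        if price > bid:
            break  # outbid: the instance disappears
        total_cost += price
        hours_run += 1
    return hours_run, total_cost


# Illustrative price series: normal price, then a one-hour spike to $999.
prices = [0.44, 0.44, 999.00, 0.44]

cautious = simulate_spot_cost(prices, bid=0.88)      # bid of 2x on-demand
reckless = simulate_spot_cost(prices, bid=1000.00)   # "never lose the machine"
print("cautious bid:", cautious)
print("reckless bid:", reckless)
```

In this toy run the cautious bidder loses the machine after two hours but has paid only 88 cents, while the bidder at $1,000 keeps the machine through the spike and ends up with a bill of just over $1,000.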
My company (SlideShare) was also affected by the recent price spikes on the spot market. Several times in October and November, all of our EC2 servers disappeared at once because of a price spike (this had never happened before). Fortunately, the software code that manages SlideShare’s cloud servers responded automatically by renting new machines at the “on-demand” rate, so we didn’t experience any actual downtime, only degraded service. But after this happened to us several times, we changed the mix of machines that we use so that only half of them are from the spot market, and the rest are on-demand.

Spikes are a recent problem

Looking through the pricing history for various classes of machines, it’s clear that these spikes are new, and that they are happening across almost all instance types, at least for servers that are on the East Coast of the United States. For example, “small” servers on AWS spiked as high as $100 an hour twice in November, even though the on-demand price for those servers is $.085/hr. “m1.large” machines also spiked as high as $40 an hour. Almost every class of servers has hit spikes of more than 10 times its retail price in the last few months. What is going on?
It’s hard to say why the spot market is suddenly showing more price spikes. A drop in supply (from Amazon requisitioning machines for its own purposes or for renting in the on-demand market) or a spike in demand (from the Christmas e-commerce rush) could be to blame. It’s important to remember that the AWS spot market is not a typical market, with many buyers and sellers doing business over a neutral exchange. One seller is servicing many buyers, and is also operating the exchange.
Amazon benefits from customer anxiety about getting access to spot servers: they sell on-demand instances for a higher price, and pre-paid reserved instances for better cash flow. So it’s unrealistic to expect Amazon to do anything to “fix” these price spikes. From Amazon’s perspective, they are a feature, not a bug.

How to deal with EC2 spot price spikes

For customers of the AWS spot market, there are some best practices to be learned from these recent price spikes:
  • Never EVER bid more than you are willing to pay for a server on the spot market. This is the most important lesson. Don’t even bother doing “convenience bidding” of double or triple the on-demand price: when the price starts to spike it will easily go way beyond any rational price. Do you want to be the gal who explains to the CEO why the company is paying $100 an hour for servers?
  • Don’t run all your infrastructure on spot market machines. In fact, don’t run more infrastructure on spot machines than you are prepared to lose. We use a rule of thumb of 50 percent at SlideShare, since our system can easily survive 50 percent of our machines disappearing at one time (which is what will happen during a price spike).
  • Write the code that manages your cloud infrastructure so that it responds intelligently to spot market price spikes. If you can’t get a spot machine at a reasonable price, your code should automatically request an on-demand server.
  • Consider having some “reserved instances,” so that you are guaranteed the right to a minimum base level of machines. I’ve argued in the past that reserved instances don’t make sense for startups, but it’s clear that when supply dries up at Amazon it happens all at once, without warning. Your portfolio of servers on Amazon is almost like a financial portfolio. You want some diversification between risky high-reward elements (spot market) and more conservative elements (reserved instances).
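The first three rules above can be combined into a small decision function. This is only a sketch: it does not call the AWS API, and `plan_fleet`, `FleetPlan`, and their parameters are hypothetical names chosen for illustration, not SlideShare's actual management code. It assumes you already know the current spot price and have picked a maximum bid you are genuinely willing to pay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FleetPlan:
    spot: int       # machines to request on the spot market
    on_demand: int  # machines to request at the on-demand rate


def plan_fleet(total_servers, spot_price, max_bid, max_spot_fraction=0.5):
    """Split a fleet between spot and on-demand machines.

    - Never plan on spot capacity if the current price is above max_bid.
    - Never put more than max_spot_fraction of the fleet on spot.
    - Whatever is not on spot is requested on-demand.
    """
    if total_servers < 0:
        raise ValueError("total_servers must be non-negative")
    if not 0.0 <= max_spot_fraction <= 1.0:
        raise ValueError("max_spot_fraction must be between 0 and 1")

    if spot_price > max_bid:
        spot = 0  # price spike: fall back entirely to on-demand
    else:
        spot = int(total_servers * max_spot_fraction)
    return FleetPlan(spot=spot, on_demand=total_servers - spot)


# Normal conditions: spot is cheaper than our cap, so use it for half the fleet.
print(plan_fleet(total_servers=20, spot_price=0.03, max_bid=0.085))

# During a spike to $100/hour, everything is requested on-demand instead.
print(plan_fleet(total_servers=20, spot_price=100.0, max_bid=0.085))
```

Capping the bid at the on-demand price, as in the example calls, means a spike can cost you machines but never money beyond what you would have paid anyway.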
These are early days for real-time pricing of cloud computing, and the spot market on Amazon is finally acting like a real market, with extreme price fluctuations. The “free ride” of getting reliable spot-priced machines for less than the on-demand price is over. So if you want to play with cheap cloud servers, make sure you have the infrastructure in place to handle a price spike that could make all your servers vanish in the blink of an eye!

