Etsy takes a different tack. For the 14-year-old marketplace, the biggest online shopping holiday of the year–when Etsy sees double the sales and search activity of a normal day–is not off limits for code tweaks. Etsy continuously deploys code onto the site, sometimes as many as 30 times a day, says Chief Technical Officer Mike Fisher. Continual deployment, he argues, also keeps the staff in the rhythm of making fixes quickly. “I think that’s the best way to keep things stable.”
Cyber Monday 2018 put this strategy to the test. On that day, Etsy’s 2.6 million sellers pulled in an average of nearly $19,000 in gross merchandise sales per minute. In the middle of it all, a tool the company had released in the days leading up to the shopping holiday malfunctioned. Sellers were supposed to be able to add their items to a sitewide Cyber Monday sale via their online dashboard; the problem was that the tool didn’t recognize some time zones properly. As a result, for some sellers farther east of Etsy’s Brooklyn, New York, headquarters, sales automatically closed long before Cyber Monday ended.
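The article does not say how the tool's time zone handling failed, so the following is only a minimal sketch of why the question matters. It shows that "the end of Cyber Monday" is a different instant depending on whose clock defines it. The function name `sale_end_utc` and the two fixed UTC offsets are assumptions made for illustration; production code would rely on a proper time zone database rather than hard-coded offsets.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets, valid for late November only.
NEW_YORK = timezone(timedelta(hours=-5))  # Eastern Standard Time, UTC-5
BERLIN = timezone(timedelta(hours=1))     # Central European Time, UTC+1


def sale_end_utc(year, month, day, tz):
    """Return, in UTC, the instant at which the given calendar day ends in tz."""
    start_of_next_day = datetime(year, month, day, tzinfo=tz) + timedelta(days=1)
    return start_of_next_day.astimezone(timezone.utc)


# Cyber Monday 2018 fell on November 26.
end_new_york = sale_end_utc(2018, 11, 26, NEW_YORK)
end_berlin = sale_end_utc(2018, 11, 26, BERLIN)

# How much earlier the day ends on the Berlin clock than on the New York clock.
gap = end_new_york - end_berlin

print("Day ends (New York clock):", end_new_york.isoformat())
print("Day ends (Berlin clock):  ", end_berlin.isoformat())
print("Difference:", gap)
```

Under these assumptions, a sale window keyed to the wrong clock would be off by hours, which is the kind of mismatch that could make a sale appear to end at the wrong time for some sellers.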
The company says it communicated the issue to sellers as soon as it figured out what had happened, and about 60 percent of sellers were able to restart their sales. To make up for the mistake, Etsy offered advertising and listing credits to the affected sellers.
While the glitch hasn’t deterred the company from rolling out code changes around the big day, it has helped inform Etsy’s plan of action to prevent possible site issues this year. To prepare for the surge of activity–the site is expecting $20,000 in sales per minute, 10 to 12 checkouts per second, and 150 product searches per second–the 400-person engineering team is making some changes. They’re scaling up their servers, and staff will work extra shifts. The team will also go into what they call code “slush” mode, in which they don’t make any major changes but do continue to push out small code changes to keep improving the site’s functioning. Fisher notes that many organizations instead prefer a code “freeze,” where sites are essentially untouchable during peak times.
Beyond the slush, Fisher and all of his engineers hold a pre-mortem meeting ahead of the week to brainstorm potential problems, and potential solutions, before they arise. When imagining potential issues, Fisher says they conceptualize a tree of scenarios, in which a root issue may lead to many branching issues. From there, they make a plan for every issue on the tree. In the case of, say, a checkout malfunction, Etsy has a response and action plan for everything from a small cart malfunction to a large-scale event in which the leadership team would need to get involved with a public response.
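The tree of scenarios Fisher describes can be pictured as a simple data structure. This is a rough sketch, not Etsy's actual tooling: the `Scenario` class, the `walk` and `missing_plans` helpers, and every scenario name and plan below are hypothetical, invented only to show how one might check that each issue on the tree has a plan attached.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Scenario:
    """One issue on the tree, with its response plan and the issues it branches into."""
    name: str
    plan: str = ""  # an empty string means no plan has been written yet
    children: List["Scenario"] = field(default_factory=list)


def walk(node, depth=0):
    """Yield (depth, scenario) pairs for the node and everything beneath it."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


def missing_plans(root):
    """Return the names of all scenarios on the tree that still lack a plan."""
    return [node.name for _, node in walk(root) if not node.plan]


# Hypothetical example tree; the names and plans are illustrative only.
tree = Scenario(
    "checkout malfunction",
    plan="page the on-call engineer",
    children=[
        Scenario("small cart malfunction", plan="roll back the latest deploy"),
        Scenario(
            "large-scale outage",
            children=[
                Scenario("public response needed", plan="escalate to leadership"),
            ],
        ),
    ],
)

for depth, node in walk(tree):
    print("  " * depth + node.name)

print("Still needs a plan:", missing_plans(tree))
```

Walking the tree this way makes gaps visible before the busy week starts: any scenario that shows up in the `missing_plans` list is one the team has imagined but not yet prepared for.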
He also prepares the team with worksheets that have action items and incident response plans for situations brainstormed in the pre-mortem. This includes having a plan for communicating issues internally and externally–you need to know how to explain what happened to customers and sellers, too, Fisher suggests.
And of course, it’s never too early to prepare. “In fact, we’ve already started planning for next year,” Fisher says.
This article is from Inc.com