- Amazon addressed GPU shortages with internal tenets to guide allocation of this valuable resource.
- The company launched Project Greenland to streamline GPU distribution and prioritize ROI.
- Amazon employees now have better access to GPUs, the company said.
Amazon, like many other tech companies, has grappled with significant GPU shortages in recent years.
To address the problem, Amazon created 8 new “tenets,” or guiding principles, for approving employee GPU requests, according to an internal document seen by Business Insider.
These tenets are part of a broader effort to streamline Amazon’s internal GPU distribution process, as BI previously reported. Last year, Amazon launched Project Greenland, a “centralized GPU orchestration platform,” to more efficiently allocate GPU capacity across the company. It also pushed for tighter controls by prioritizing return-on-investment for each AI chip.
As a result, Amazon is no longer facing a GPU crunch, which strained the company last year.
“Amazon has ample GPU capacity to continue innovating for our retail business and other customers across the company,” Amazon’s spokesperson told BI. “AWS recognized early on that generative AI innovations are fueling rapid adoption of cloud computing services for all our customers, including Amazon, and we quickly evaluated our customers’ growing GPU needs and took steps to deliver the capacity they need to drive innovation.”
How Amazon decides who gets GPUs
Here are the 8 tenets for GPU allocation, according to the internal Amazon document:
- ROI + High Judgment thinking is required for GPU usage prioritization. GPUs are too valuable to be given out on a first-come, first-served basis. Instead, distribution should be determined based on ROI layered with common sense considerations, and provide for the long-term growth of the Company’s free cash flow. Distribution can happen in bespoke infrastructure or in hours of a sharing/pooling tool.
- Continuously learn, assess, and improve: We solicit new ideas based on continuous review and are willing to improve our approach as we learn more.
- Avoid silo decisions: Avoid making decisions in isolation; instead, centralize the tracking of GPUs and GPU related initiatives in one place.
- Time is critical: Scalable tooling is a key to moving fast when making distribution decisions which, in turn, allows more time for innovation and learning from our experiences.
- Efficiency feeds innovation: Efficiency paves the way for innovation by encouraging optimal resource utilization, fostering collaboration and resource sharing.
- Embrace risk in the pursuit of innovation: Acceptable level of risk tolerance will allow to embrace the idea of ‘failing fast’ and maintain an environment conducive to Research and Development.
- Transparency and confidentiality: We encourage transparency around the GPU allocation methodology through education and updates on the wiki’s while applying confidentiality around sensitive information on R&D and ROI sharable with only limited stakeholders. We celebrate wins and share lessons learned broadly.
- GPUs previously given to fleets may be recalled if other initiatives show more value. Having a GPU doesn’t mean you’ll get to keep it.
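To make the first tenet concrete, here is a minimal sketch of ranking requests by ROI instead of arrival order. It is an illustration only, not Amazon's actual tooling or Project Greenland's logic: the `GpuRequest` fields, the team names, and the ROI figures are all hypothetical, and the "high judgment" layer the document describes is not modeled.

```python
from dataclasses import dataclass


@dataclass
class GpuRequest:
    team: str            # hypothetical team name
    gpus_requested: int  # number of GPUs the team asks for
    expected_roi: float  # hypothetical ROI estimate for the request


def allocate_by_roi(requests, capacity):
    """Grant GPUs to the highest-ROI requests first, ignoring arrival order."""
    grants = {}
    remaining = capacity
    # Sort by ROI, highest first; sorted() is stable, so ties keep arrival order.
    for req in sorted(requests, key=lambda r: r.expected_roi, reverse=True):
        granted = min(req.gpus_requested, remaining)
        grants[req.team] = granted
        remaining -= granted
    return grants


# Requests listed in arrival order; team_a asked first but has the lowest ROI.
requests = [
    GpuRequest("team_a", 6, 1.2),
    GpuRequest("team_b", 4, 3.0),
    GpuRequest("team_c", 5, 2.1),
]
grants = allocate_by_roi(requests, capacity=10)
print(grants)  # {'team_b': 4, 'team_c': 5, 'team_a': 1}
```

Under a first-come, first-served rule, team_a would have received all six GPUs it asked for. Rerunning the same function as ROI estimates change would also shrink an earlier grant, which loosely mirrors the last tenet about recalling GPUs.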