
Building a $5,000 Machine Learning Workstation with an NVIDIA TITAN RTX and Ryzen Threadripper

source link: https://towardsdatascience.com/building-a-5-000-machine-learning-workstation-with-an-nvidia-titan-rtx-and-ryzen-threadripper-46c49383fdac?gi=25f3ae13831d


3.8 GHz, 24-core, 64 GB, TITAN RTX-Based Machine Learning Workstation

Building a computer is not for everyone. At the high end, building a machine can save money and lets you specify exactly the hardware you want. Additionally, a custom build allows you to plan and execute an incremental upgrade path as more advanced components become available or economical. This article describes how I built a machine learning workstation in the $5,000 (USD, July 2020) price range. I also provide some suggestions on scaling this price up or down.

I will begin by describing my use case. The tasks I work on can be either GPU- or CPU-heavy. At times I want a large amount of RAM for data processing and staging. For memory requirements beyond what this machine is capable of, I typically use Spark or Dask. Because of these needs, I spent a decent amount on the CPU, RAM, and hard disk. Because the most common bottleneck for my processing needs is neural network training, I went for a high-end GPU.

Machine Specifications and Part List

The computer that I built is a 3.8 GHz (4.5 GHz boost) 24-core AMD Ryzen Threadripper 3960X with 64 GB of RAM and an NVIDIA TITAN RTX. I published the complete parts list to PC Part Picker. The highlights include:

The hard drive is fast enough that programs load very quickly, and moving data between RAM and disk is not punishingly slow. I paid extra for faster RAM, hoping to speed loading between CPU and GPU memory. Further benchmarking will let me know how well this worked out.
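As a starting point for the benchmarking mentioned above, sequential disk throughput can be estimated with a simple timed write/read loop. This is a minimal sketch; the helper name and file size are my own illustrative choices, not from the article:

```python
import os
import tempfile
import time

def disk_write_read_mbps(size_mb: int = 256) -> tuple[float, float]:
    """Write, then read, a temp file of size_mb and return (write, read) MB/s."""
    block = os.urandom(1024 * 1024)  # one 1 MB block of random bytes
    path = os.path.join(tempfile.gettempdir(), "disk_bench.tmp")
    try:
        start = time.perf_counter()
        with open(path, "wb") as f:
            for _ in range(size_mb):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # force data to disk, not just the OS cache
        write_mbps = size_mb / (time.perf_counter() - start)

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(1024 * 1024):
                pass
        read_mbps = size_mb / (time.perf_counter() - start)
        return write_mbps, read_mbps
    finally:
        os.remove(path)
```

Note that the read pass may be served from the OS page cache, so treat the read number as an upper bound unless you drop caches first.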

Choosing a GPU

The GPU is an important consideration for a machine learning workstation. While a machine learning engineer may wish to run complex visualizations that exercise the graphics potential of the machine, most modern GPUs can handle those needs. What matters most for a machine learning workstation is the GPU's number-crunching capability. For a gaming system, the choice would be between AMD and NVIDIA. For machine learning, particularly deep learning, the choice really is just NVIDIA.

CUDA and OpenCL are the technologies that allow a GPU to function as a mathematics engine for software. TensorFlow, PyTorch, and other frameworks commonly use CUDA, which requires an NVIDIA GPU. OpenCL is more open and supports GPUs from Intel, AMD, and NVIDIA. However, for a variety of reasons, mostly performance, CUDA has the widest support. NVIDIA also dominates the GPU offerings on AWS and Azure. Google Cloud Platform (GCP) offers an alternative technology called the Tensor Processing Unit (TPU); however, TPUs are not common on local systems. For these reasons, particularly cloud compatibility, I stick with NVIDIA GPUs.
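Once the drivers are installed, it is worth confirming that a framework actually sees the GPU through CUDA. The following is a hedged sketch using PyTorch (the helper name is my own, and it degrades gracefully if PyTorch is not installed):

```python
def cuda_report() -> str:
    """Report whether a CUDA-capable NVIDIA GPU is visible to PyTorch."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        mem_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
        return f"CUDA device: {name} ({mem_gb:.0f} GB)"
    return "No CUDA device visible; falling back to CPU"

print(cuda_report())
```

On a TITAN RTX system with working drivers, this should report the card name and roughly 24 GB of memory.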

NVIDIA offers GeForce GPUs for gaming and Quadro GPUs for more compute-centric tasks. Bridging this divide are the TITAN cards, which offer the additional memory that often benefits deep learning more than gaming, while still providing as many CUDA cores as NVIDIA's highest-end gaming GPUs. NVIDIA was kind enough to provide a TITAN RTX to my YouTube channel. I decided that, since I was being provided a $2,500 (USD, July 2020) GPU, I would invest the same amount and build an advanced deep learning workstation around this fantastic GPU.

Intel or AMD

I wanted my workstation to be flexible enough to be high-performance for both GPU- and CPU-centric tasks. As great as GPU-based deep learning is, I find myself running CPU-heavy tasks for data preprocessing and some visualizations. Also, since I frequently code automation tasks in Python, I can design my software to take advantage of multiple cores. Unless you are building a computer yourself, you are rarely given a direct choice between Intel and AMD, though some hardware manufacturers do let you choose a CPU type.

After watching and reading a considerable number of reports on the current state of Intel vs. AMD, I came to the following conclusion: AMD offers more cores, though at a slightly reduced clock speed, so AMD will be more efficient on multi-threaded software. Intel will be more effective on less parallel software that benefits from a single-core speed advantage. Because most of my software is multi-threaded, and I can design my own custom-crafted software to be multi-threaded, I chose AMD.
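The kind of multi-core data preprocessing described above can be sketched with Python's standard `multiprocessing` module. The transform and helper names here are hypothetical placeholders, not code from the article:

```python
from multiprocessing import Pool

def preprocess(record: dict) -> dict:
    # Placeholder for a CPU-heavy transform (tokenizing, feature extraction, ...)
    return {"id": record["id"], "value": record["value"] ** 2}

def preprocess_all(records: list[dict], workers: int = 24) -> list[dict]:
    # One worker per physical core keeps all cores of a 24-core CPU busy
    with Pool(processes=workers) as pool:
        return pool.map(preprocess, records)

if __name__ == "__main__":
    data = [{"id": i, "value": i} for i in range(1000)]
    processed = preprocess_all(data, workers=4)
```

For workloads like this, which split cleanly across records, more cores translate almost linearly into shorter preprocessing time, which is exactly the case where AMD's core-count advantage pays off.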

I chose a 24-core AMD Ryzen Threadripper that fits the TRX4 socket, currently AMD's latest socket type. This means that I can easily upgrade my CPU to a more advanced AMD offering later. Traditionally I've always used Intel. The only minor inconvenience I've encountered with AMD is that sometimes I must wait for the latest "Windows Insider" pre-release Windows versions.

Operating System Choice

For this computer, I decided to use Windows 10 Pro. I am very impressed with the Windows Subsystem for Linux (WSL), particularly now that the Linux subsystem can access the underlying GPU. I am just beginning with WSL 2, so my opinion of this system is still evolving. I hope to post more content on WSL in later articles and videos.

Cost Comparison to the Cloud

Prior to building this machine, I sent most of my GPU workloads to the cloud. My most common tasks are either Kaggle competitions or rerunning and optimizing examples for my deep learning course at Washington University. The cloud costs for these workloads are not trivial. To compare this machine to the AWS cloud, I used the following pricing (as of July 2020):

  • AWS Workspaces: 16 vCPUs, 122 GiB memory, 1 GPU, 8 GiB video memory; $999.00/month on monthly billing, or $66.00/month plus $11.62/hour on hourly billing.

The AWS Workspaces instance quoted above is considerably weaker than my machine; however, it is the closest equivalent. I have 24 GB of video RAM, whereas the AWS machine has only 8 GiB, which might require reworking batch sizes for neural network training. Also, I have 24 CPU cores versus 16 vCPUs on AWS, though AWS offers more memory. At $999/month, which makes the most sense for heavy loads, I would come out ahead in about 5 months.
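The break-even arithmetic behind that last sentence is simple enough to write down (a sketch using the article's July 2020 figures; the function name is my own):

```python
def break_even_months(build_cost: float, monthly_cloud_cost: float) -> float:
    """Months of steady cloud use after which the local build pays for itself."""
    return build_cost / monthly_cloud_cost

# $5,000 build vs. the $999/month AWS Workspaces monthly rate
months = break_even_months(5000, 999)  # ~5.0 months
```

This ignores electricity and depreciation on one side and data transfer or storage fees on the other, but for sustained heavy use the conclusion is not sensitive to those corrections.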

If you are willing to design some pipeline code around your workflow and use AWS proprietary technology, you could save considerable amounts in AWS fees by using SageMaker. I am not considering SageMaker or plain EC2 instances here, as I am looking at what most closely approximates the desktop experience of the system I just built.

Scaling Back

I am sure there are people reading this article who feel I spent too much on this machine, and others who feel I spent too little. I've worked with advanced Tesla V100-based systems that cost 5 to 10 times what this machine costs to build. If you are looking to spend less, there are many options.

One of the easiest savings is aesthetics. RGB lighting is quite popular among system builders. You can see some of the RGB on my system in the following picture. Not familiar with RGB? Anything glowing is RGB.


Inside the computer, lighted RGB components

I am a YouTuber, so the computer makes for an interesting component of my "set." If you are just going to shove the machine under a desk, get a cheap but accessible case and avoid RGB components. I kind of epically failed on RGB at the current stage of this build: my awesomely beautiful RGB RAM is covered up by my really effective, yet Borg-cube-shaped "Be Quiet!" cooler. I may swap it for a liquid AIO cooler someday soon.

Parts you can scale back on:

  • HDD speed: Drive speed mainly matters for loading data into RAM; once the data is in RAM, it becomes much less important.
  • RAM speed: Slower RAM will still get you by; most of the heavy processing is likely done on the GPU anyway.
  • CPU: For machine learning, more cores help, but fewer cores will still work; expect slower data preprocessing.
  • GPU: CUDA cores determine how fast your GPU will train. GPU memory determines how clever you need to be with your batch sizes and network structure to get something to run. You do not necessarily need a TITAN RTX.
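The GPU-memory point above can be made concrete with a rough back-of-the-envelope estimate. This is a hypothetical sketch (the function, the 2 GB overhead reserve, and the 50 MB-per-sample figure are all illustrative assumptions, not from the article):

```python
def max_batch_size(gpu_mem_gb: float, bytes_per_sample: int,
                   overhead_gb: float = 2.0) -> int:
    """Rough upper bound on batch size for a given amount of GPU memory.

    bytes_per_sample should cover activations and gradients for one sample;
    overhead_gb reserves room for weights, optimizer state, and the framework.
    """
    usable = (gpu_mem_gb - overhead_gb) * 1024**3
    return max(0, int(usable // bytes_per_sample))

# A 24 GB TITAN RTX vs. an 8 GB card, assuming ~50 MB per sample:
local = max_batch_size(24, 50 * 1024**2)   # 450
cloud = max_batch_size(8, 50 * 1024**2)    # 122
```

The absolute numbers are crude, but the ratio is the point: a smaller card forces either smaller batches, gradient accumulation, or a smaller network, and that is the "cleverness" a cheaper GPU demands.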

YouTube Video of the Completed Build

I created a YouTube video of the build of this computer. My nephew Nathan assisted with the build. You can see this video here.

I now look forward to benchmarking this GPU and using it to produce great examples for my GitHub repository and YouTube channel. Consider following me on either to see all the latest content.



