4 steps for high-performance computing for big data processing

Mar 10, 2018 am 09:48 AM

If an enterprise needs high-performance computing to process its big data, it may work best to operate it on-premises. Here's what businesses need to know, including how high-performance computing and Hadoop differ.

In the field of big data, not every company needs high-performance computing (HPC), but almost all companies using big data have adopted Hadoop-style analytical computing.

HPC and Hadoop can be hard to tell apart, because Hadoop analytics jobs can run on high-performance computing (HPC) hardware, but not vice versa. Both HPC and Hadoop analytics use parallel data processing, but in Hadoop and analytics environments, data is stored on commodity hardware and distributed across multiple nodes of that hardware. In HPC, individual data files are much larger and data is stored centrally. Because of these large file sizes, HPC requires high throughput and low latency, which in turn calls for more expensive network interconnects such as InfiniBand.
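The Hadoop-style pattern described above (data split across nodes, each node working on its own share, partial results merged at the end) can be sketched in a few lines of Python. This is only an illustration of the idea, not Hadoop's API: the `PARTITIONS` data, the function names, and the use of threads to stand in for cluster nodes are all assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical input: each inner list stands for the records held on one node.
PARTITIONS = [
    ["error disk", "ok", "error net"],
    ["ok", "ok", "error disk"],
    ["error net", "ok"],
]


def count_words(partition):
    """Map step: count words using only the records in one partition."""
    counts = Counter()
    for record in partition:
        counts.update(record.split())
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def distributed_word_count(partitions):
    """Process every partition in parallel, then merge the partial results."""
    # Threads stand in for separate nodes; at least one worker is required.
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        partials = list(pool.map(count_words, partitions))
    return reduce(merge_counts, partials, Counter())


if __name__ == "__main__":
    print(distributed_word_count(PARTITIONS))
```

Each worker touches only its own partition, which is why this style runs well on commodity nodes. An HPC workload, by contrast, typically has many processes working on the same very large, centrally stored files, so the network between them becomes the critical component.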

The message for enterprise CIOs is clear: if an enterprise can avoid HPC and use only Hadoop for its analytics, it should do so. This approach is cheaper, easier for employees to operate, and can even run in the cloud, where another company (such as a third-party vendor) can operate it.

Unfortunately, relying on Hadoop alone is not an option for the many enterprises and institutions in life sciences, meteorology, pharmaceuticals, mining, medicine, government, and academia that require high-performance computing (HPC). Because the files are so large and the processing requirements so strict, sending the data to the cloud is not a good solution either.

In short, high-performance computing (HPC) is a prime example of a big data platform that runs inside the enterprise's own data center. Because of this, companies face the challenge of ensuring that the hardware they invest in so heavily does the job it needs to do.

Alex Lesser, chief strategy officer of PSSC Labs, a provider of big data Hadoop and HPC platforms, said: "This is a challenge faced by many companies that must use HPC to process their big data. Most of these companies have traditional IT infrastructure behind them, so they naturally build the Hadoop analytics environment themselves, because it uses commodity hardware they are already familiar with. For high-performance computing (HPC), however, the response is usually to let the vendor handle it."

Companies considering adopting high-performance computing (HPC) need to take the following four steps:

1. Ensure senior-level support for high-performance computing (HPC)

Senior managers and board members do not need to be experts in high-performance computing, but an HPC initiative cannot succeed without their understanding and support. They should understand HPC well enough to clearly back the large hardware, software, and training investments the enterprise may have to make. This means they must be educated on two points: (1) what HPC is, and why it differs from ordinary analytics and requires special hardware and software; and (2) why the company needs HPC rather than conventional analytics to achieve its business goals. Both of these education efforts should be the responsibility of the chief information officer (CIO) or chief data officer (CDO).

"The companies that are most aggressive in adopting HPC are the ones that believe they are real technology companies," said Lesser, pointing to Amazon Web Services, the cloud service that grew out of Amazon.com's retail business and has become a huge profit center.

2. Consider a pre-configured hardware platform that can be customized

Companies such as PSSC Labs offer pre-packaged and pre-configured HPC hardware. "We have a base package based on HPC best practices and work with customers to customize that base package based on the customer's computing needs," Lesser said, noting that almost every data center must have some customization.

3. Understand the return

As with any IT investment, HPC must be cost-effective, and the business should be able to achieve a return on investment (ROI) that is already clear in the minds of management and the board of directors. "A good example is aircraft design," Lesser said. "High-performance computing (HPC) is a huge investment, but it pays back quickly when a company discovers it can use HPC to simulate designs with five nines of accuracy and no longer has to rent a physical wind tunnel."
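One simple way to make that return concrete for management is a payback-period estimate. The sketch below is a minimal illustration; the function name and every figure in the usage example are hypothetical and not taken from the article or from any real deployment.

```python
def payback_years(upfront_cost, annual_savings, annual_operating_cost):
    """Years needed for net annual savings to cover the upfront investment.

    Returns None when running costs eat up all the savings, meaning the
    investment never pays for itself.
    """
    net_annual_benefit = annual_savings - annual_operating_cost
    if net_annual_benefit <= 0:
        return None
    return upfront_cost / net_annual_benefit


if __name__ == "__main__":
    # Hypothetical figures: a 2,000,000 cluster that replaces 1,500,000 a year
    # in wind tunnel rental and costs 500,000 a year to operate.
    print(payback_years(2_000_000, 1_500_000, 500_000))
```

With these assumed numbers the cluster pays for itself in two years. A real business case would also account for staff training, depreciation, and the value of faster design cycles.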

4. Train your own IT staff

HPC is not an easy transition for your IT staff, but if you want to run an on-premises operation, you should position the team for self-sufficiency.

Initially, businesses may need to hire outside consultants to get started. But the goal of a consulting engagement should always be twofold: (1) keep the HPC application running, and (2) transfer knowledge to employees so they can take over operations. Businesses should not settle for less.

At the heart of the HPC team should be a data scientist who can develop the highly complex algorithms that high-performance computing needs to answer the enterprise's questions. The team also needs a strong systems programmer with C++ or Fortran skills who can work in a parallel processing environment, and an expert in network communications.

"The bottom line is that if an enterprise is running jobs once or twice every two weeks, it should go to the cloud to host its HPC," Lesser said. "But if an enterprise is using HPC resources and running jobs multiple times a day, as a pharmaceutical or biology company might, then running in the cloud would be a waste of money, and it should consider running its own in-house operation."

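Lesser's rule of thumb can be turned into a rough break-even calculation. The sketch below assumes a deliberately simple cost model: a flat monthly cost for an on-premises cluster and a fixed price per cloud job. The function names and all figures are hypothetical and chosen only to illustrate the reasoning.

```python
import math


def break_even_jobs(onprem_monthly_cost, cloud_cost_per_job):
    """Smallest monthly job count at which on-premises costs no more than cloud."""
    return math.ceil(onprem_monthly_cost / cloud_cost_per_job)


def cheaper_option(jobs_per_month, onprem_monthly_cost, cloud_cost_per_job):
    """Return which hosting choice is cheaper for a given monthly workload."""
    cloud_monthly_cost = jobs_per_month * cloud_cost_per_job
    return "cloud" if cloud_monthly_cost < onprem_monthly_cost else "on-premises"


if __name__ == "__main__":
    # Hypothetical costs: 40,000 a month to own and run a cluster,
    # 1,000 for each job submitted to a cloud HPC service.
    print(break_even_jobs(40_000, 1_000))     # jobs per month where the costs meet
    print(cheaper_option(4, 40_000, 1_000))   # a couple of jobs every two weeks
    print(cheaper_option(90, 40_000, 1_000))  # roughly three jobs a day
```

Under these assumptions an occasional user stays in the cloud, while a team running several jobs a day is better served by its own hardware. Real pricing also depends on data transfer, storage, and utilization, which this sketch ignores.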
