Varada updates analytics platform to dynamically scale out data lakes

source link: https://venturebeat.com/2021/08/04/varada-updates-analytics-platform-to-dynamically-scale-out-data-lakes/

Varada today extended its analytics platform to include the ability to rapidly add and remove nodes and clusters as workloads scale up and down.

One of the primary reasons organizations opt for a data warehouse deployed in the cloud over a data lake is performance. A data lake typically exposes a massive amount of data stored on inexpensive storage systems, usually on a cloud platform. Varada built the Varada Data Platform on indexing technology that organizes data into nano blocks according to the type of data being queried and how it is structured. This approach allows end users to query data where it resides without moving it into a central data warehouse.
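
Varada's platform is built on the open source Trino (formerly PrestoSQL) distributed SQL engine, so querying data in place looks like ordinary SQL submitted to the cluster. Below is a minimal sketch using the open source trino Python client; the endpoint, catalog, schema, and table names are illustrative placeholders, not anything from Varada's documentation.

```python
# A minimal sketch of querying lake data in place with the open source
# "trino" Python client. Endpoint, catalog, schema, and table names are
# illustrative placeholders.
import trino

conn = trino.dbapi.connect(
    host="coordinator.example.com",  # hypothetical cluster endpoint
    port=8080,
    user="analyst",
    catalog="hive",    # lake files registered in a Hive-compatible metastore
    schema="sales",
)

cur = conn.cursor()
# The query runs against files on object storage; nothing is copied
# into a central warehouse first.
cur.execute(
    "SELECT region, count(*) AS orders "
    "FROM orders "
    "WHERE order_date >= DATE '2021-01-01' "
    "GROUP BY region"
)
for region, orders in cur.fetchall():
    print(region, orders)
```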

The company’s latest 3.0 release employs Varada’s indexing engine to accelerate SQL queries with a scale-out architecture, enabling a data lake to rival the performance of a cloud data warehouse while keeping the cost of the infrastructure resources consumed down, Varada CEO Eran Vanounou told VentureBeat.
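
The article does not describe how Varada decides when to add or remove capacity, but the general shape of elastic scale-out, adding workers when queries back up and shedding them when the cluster idles, can be sketched as a simple control loop. The cluster object, metrics, and thresholds below are hypothetical.

```python
import time

# Hypothetical bounds and thresholds for the toy autoscaler.
MIN_WORKERS, MAX_WORKERS = 2, 32
IDLE_CPU_THRESHOLD = 0.3

def autoscale(cluster, poll_seconds=60):
    """Toy control loop: scale out on query backlog, scale in when idle.

    `cluster` stands in for a hypothetical cluster-management API; real
    platforms expose their own autoscaling hooks.
    """
    while True:
        queued = cluster.queued_queries()     # queries waiting for capacity
        busy = cluster.avg_cpu_utilization()  # 0.0 .. 1.0 across workers
        workers = cluster.worker_count()

        if queued > 0 and workers < MAX_WORKERS:
            # Backlog: add capacity proportional to the queue depth.
            cluster.add_workers(min(queued, MAX_WORKERS - workers))
        elif busy < IDLE_CPU_THRESHOLD and workers > MIN_WORKERS:
            # Sustained idle capacity: shed one worker at a time.
            cluster.remove_workers(1)

        time.sleep(poll_seconds)
```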

Data demands

As the amount of data organizations store in the cloud steadily increases, a data lake can easily degrade into a data swamp because the sheer quantity of data that needs to be queried eventually hurts performance. Approaches have emerged that employ various types of distributed SQL engines to optimize queries across a narrower set of the data residing in a data lake. Late last year, Varada claimed it had developed an adaptive engine that selects the optimal index for each dataset to achieve that goal. The platform also includes an observability capability that automatically determines when to index specific datasets based on usage.
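
Varada has not published its indexing heuristics, so the sketch below only illustrates the two decisions the paragraph describes: picking an index type from a column's observed characteristics, and indexing a dataset only once usage justifies the cost. The index types, heuristics, and threshold are invented for illustration.

```python
# Toy sketch of adaptive indexing: (1) pick an index type from a column's
# observed characteristics, and (2) index a dataset only after it is
# queried often enough. Heuristics and threshold are illustrative only.
from collections import Counter

def choose_index(values):
    distinct = len(set(values))
    if distinct <= 2:
        return "bitmap"      # near-boolean columns compress well as bitmaps
    if all(isinstance(v, str) for v in values):
        return "dictionary"  # repeated strings: dictionary-encode then index
    return "range"           # numeric/high-cardinality: min-max style index

class UsageTracker:
    """Index a dataset only after it has been queried 'enough'."""
    def __init__(self, threshold=100):
        self.hits = Counter()
        self.threshold = threshold

    def record_query(self, dataset):
        self.hits[dataset] += 1

    def should_index(self, dataset):
        return self.hits[dataset] >= self.threshold
```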

The Varada Data Platform takes data lakes to the next level by providing capabilities that go beyond simply storing data in a cloud platform, Vanounou said. “The storage aspects of a data lake have already been solved,” he said.

Cost benefit

The challenge now is finding the most efficient way to launch queries against a massive amount of data. The first wave of data lakes built on platforms such as Hadoop often left organizations with data swamps because there was no way to dynamically organize data to make it easier to query. Cloud data warehouses emerged as an alternative that lets IT teams manage data more effectively. The catch is that cloud data warehouses are more expensive to employ than a data lake that makes use of, for example, object-based cloud storage services. Providers of distributed SQL engines promise most of the benefits of a data warehouse at a much lower total cost.
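
As a rough illustration of that cost argument, a back-of-envelope model makes the trade-off concrete. Every figure below is an assumption chosen for illustration, not actual pricing from any vendor.

```python
# Back-of-envelope monthly cost model: data kept on object storage and
# queried in place versus data loaded into a managed warehouse. All
# figures are assumed for illustration, not actual vendor pricing.
LAKE_STORAGE_PER_GB = 0.023       # assumed object-storage $/GB-month
WAREHOUSE_STORAGE_PER_GB = 0.040  # assumed warehouse $/GB-month
LAKE_COMPUTE = 800.0              # assumed on-demand SQL-engine $/month
WAREHOUSE_COMPUTE = 2000.0        # assumed always-on warehouse $/month

def monthly_costs(data_gb):
    lake = data_gb * LAKE_STORAGE_PER_GB + LAKE_COMPUTE
    warehouse = data_gb * WAREHOUSE_STORAGE_PER_GB + WAREHOUSE_COMPUTE
    return lake, warehouse

lake, warehouse = monthly_costs(50_000)  # a 50 TB lake
print(f"lake: ${lake:,.0f}/month, warehouse: ${warehouse:,.0f}/month")
```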

Unfortunately, many first-generation instances of the platforms employed to create data lakes didn’t live up to expectations. Providers of data lakes need to convince IT organizations that the next generation of these platforms has solved that issue. In the meantime, providers of data warehouses in the cloud continue to gain traction. However, providers of platforms for building data lakes note that a cloud data warehouse simply replaces a proprietary on-premises platform with a larger cloud platform. A data lake is designed to enable IT organizations to retain more control over which tools and applications can access their data.

It’s too early to say how this battle will play out. There’s a lot of cloud data warehouse momentum within enterprise IT organizations, which often prefer the path of least resistance when it comes to centralizing data management. But as it becomes apparent that the volume of data that needs to be accessed will expand exponentially, a day of reckoning over the cost of managing all that data is not far off.

Sponsored

How do you sustain your speed of innovation?

Mark Porter, MongoDB, July 07, 2021 05:20 AM
Image Credit: Getty Images

Presented by MongoDB


Success in the digital age is predicated on the ability to deliver new experiences to customers quickly. That’s why companies are rethinking not just the front-end of their company, but every layer beneath as well. They are streamlining their supply chains, optimizing customer feedback loops, maintaining less inventory, and applying metrics and AI/ML to generate operational insight. All in an effort to accelerate.

So when it comes to digital innovation, faster is better. Right? Not so fast.

Forward-thinking business leaders know that the decisions they make today will determine their competitiveness for years or maybe even decades to come. With this relentless push for speed comes the temptation to skip the basics. Cutting corners on security or privacy, locking into proprietary technologies, and accruing massive tech debt all have long-term consequences.

Over time, these consequences add up, amounting to an “innovation tax” that must be continually paid in the form of inflexible technology, lost productivity, and slower time to market. And companies that don’t pay attention to this risk losing their best employees — a silent and hard-to-measure tax that nevertheless has killed some of the most innovative products in the market.

But there is a way for business and IT leaders to exercise both fast-twitch and slow-twitch muscles in this race; to think short-term and long-term. At MongoDB, we call it “sustainable speed,” and it starts with ensuring the proper digital foundations are in place. We believe that the cornerstone of your digital foundation is your data — the raw material of innovation in the digital age. In our work with thousands of customers, we’ve identified four pillars of sustainable speed, each of which allows organizations to accelerate innovation without courting long-term disaster.

Multi-cloud agility

Not all clouds are created equal — and neither are data centers. The fact is that each cloud provider can be the “best” cloud provider — albeit to different users, in different situations. While each provider offers a portfolio of services, they aren’t the same in terms of functionality or maturity. Developers should be able to use best-of-breed technologies across clouds — not just for different apps, but for the same application.

Imagine your devs being able to utilize AWS Lambda, Google Cloud’s AI Platform, and Microsoft’s Azure DevOps within a unified console. In addition, despite the energy around cloud, very few of the large companies I speak to are all-in on cloud — some plan to move slowly or even never move fully to the cloud because of regulation, compliance, or even cost at scale. Don’t fall for any kind of mantra about “all-in” on one thing or another — listen to your business units and listen to the developers in them.

Innovation velocity

If applications are the currency of the new economy, then development teams are the market makers. And yet despite the relentless strategic emphasis on speed and innovation in the digital economy, these teams continue to be mismanaged and malnourished inside both large and small companies. To maximize the innovation output of developers, companies must understand the fundamental nature of development work, provide the most intuitive and flexible tools on the market, and remove time-consuming, undifferentiated work, like database administration.

Listen to your developers when they talk about wanting to fix the underpinnings of their test, deployment, or monitoring systems. And invest in the daily developer workflows, removing barriers and streamlining the process whenever possible.

Predictability (a.k.a. reliability)

Here is where we start to think about the ability to build quickly, but with confidence. Creating or updating mission-critical applications is always high-stakes work, with inherent risks of losing data or running afoul of regulatory requirements. Executives must feel certain their application development platform will protect the integrity of customer and business data, handle outages with no significant impact (internally or externally), and scale to meet the ambition of the business.

You build this confidence by regularly asking your leaders and their teams to bring up areas of concern around the layers of what I call “The Onion of Requirements.”

These are:

  • Security
  • Durability
  • Correctness
  • Availability
  • Scalability
  • Operability
  • Features
  • Performance
  • Efficiency

The first six are the ones that, if you get them wrong, can completely trash your business predictability. In most companies, you won’t hear about these things as much as you should, because all executives ever ask about is one layer of the onion: Features. That’s all great until the breach or the outage or the release you have to pull back. Builders build buildings that behave predictably, with thousands of years of best practices behind them. Technology teams should do the same.

Privacy and compliance

I can’t tell you how many times I’ve heard things like “we innovate quickly, with no compromises on security, compliance, and safety.” That’s really hard to put into practice. We all know that the only way to be absolutely sure you never have an outage caused by a software deployment is to…not deploy software. The research and my own personal experience show that more than 65% of outage minutes are caused by bad software deployments.

What happens after an outage? Executives sow the fear of consequences among engineers. But this fear can be a debilitating force in the race toward digital innovation. Think of an athlete with blazing speed, holding back because they are terrified of pulling a hamstring or blowing an ACL. This is the effect that cyber-attacks, privacy concerns, and ever-changing regulatory standards can have on the innovation process.

Security is often seen as a counterweight to innovation. But the opposite can be true: the more secure the data platform and the more testing in place, the quicker the cycle time between development and production, and the more confidence a team has in moving quickly, identifying problems early, and rolling them back before damage is done. To achieve this, security must be baked in, compliance testing must be mandatory, and continuous integration and delivery must be a priority.

Much of this work should be automated: whenever I hear that humans are doing security and compliance testing at a company, I want to take my business somewhere else.
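
To make “automate it” concrete, a compliance check can run as an ordinary gate in the CI pipeline so that a non-compliant release fails before a human ever has to catch it. The sketch below is a minimal illustration; the config shape and policy rules are invented, not MongoDB’s actual practice.

```python
import sys

# Invented policy: required settings a release config must satisfy.
POLICY = {
    "encryption_at_rest": True,      # data must be encrypted on disk
    "public_network_access": False,  # no publicly exposed endpoints
}

def violations(config, policy=POLICY):
    """Return a list of human-readable policy violations."""
    return [
        f"{key!r} must be {required}, found {config.get(key)}"
        for key, required in policy.items()
        if config.get(key) != required
    ]

if __name__ == "__main__":
    # Stand-in for a config loaded from the release under test.
    release_config = {"encryption_at_rest": True, "public_network_access": True}
    problems = violations(release_config)
    if problems:
        print("compliance gate failed:", "; ".join(problems))
        sys.exit(1)  # non-zero exit fails the CI job, blocking the deploy
    print("compliance gate passed")
```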

A Fortune 500 CTO once said that technical debt should be shown on the balance sheet so the CFO can see it. Why? Because tech debt comes with costly interest payments and the same morale-crushing impact as personal debt.

The same could be said for all innovation taxes: the short-sighted, lift-and-shift strategies, the organizational data silos, the vendor lock-in, and the lack of a rock-solid testing infrastructure.

The companies that focus on both innovation and rigor can manage these long-term impediments. They don’t have to choose between the tortoise and the hare. They can be both.

To learn more about this topic, check out The Foundations of Sustainable Speed white paper.

Mark Porter is CTO of MongoDB.


Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. Content produced by our editorial team is never influenced by advertisers or sponsors in any way. For more information, contact [email protected].

