Preventing data infrastructure sprawl - what developers can do

source link: https://devm.io/databases/data-infrastructure-sprawl-api

Proactively looking at data services and APIs together

28. Oct 2022


For developers, building applications is exciting: who doesn’t want to create the next-generation app that customers love? However, the way we build applications in the cloud today can create problems around data, says Ovais Tariq, CEO at Tigris Data.

As our applications generate more data, we have to organize it, and that means more infrastructure. To deal with data infrastructure sprawl, we have to understand why it happens and then be proactive in how we approach it. By looking at data services and APIs together, we can improve how we support data over time.

Microservices and data

While a traditional application would use a single database as its data store, modern applications are built by connecting multiple microservices. Running microservices in software containers offers more flexibility and freedom in how to build an application, but this compartmentalized approach requires more database instances to capture all the data that is created.

Rather than a couple of large databases that hold all the data involved, each application might have thirty or forty database instances to capture and store data from each microservice over time. This is exacerbated by the need for different data models and for functionality such as search, indexing and event streaming. The result is multiple different database technologies woven into the fabric of your data infrastructure, and growing complexity.

This data infrastructure then has to be managed over time. Each database technology has to be deployed, configured, secured, monitored, and maintained as an infrastructure component. This takes developers away from the work they love doing and puts their focus on operational tasks instead, which slows down innovation.

Diagnosing this as a problem can be difficult. When a project starts small, developers will often pick the database they are most comfortable with and can get running quickly. This normally leads to them using the same database for multiple use cases, some of which are not a good fit. For example, developers will typically start with a database like MySQL for OLTP workloads, then try to apply it to other workloads such as full-text search and analytics. But because existing database technologies are not flexible enough to keep supporting diverse workloads as the application scales or its needs evolve, teams end up with a complicated data architecture made up of multiple different database technologies.

Cloud service providers offer services that can reduce some of the management overhead. These providers have proven time and time again that the platform approach is popular. Picking a cloud provider and locking into its tech stack can give you some reduction in operational costs and effort. But all the cloud providers are doing is giving you “as a service” instances of the popular database technologies. This does not solve the data infrastructure sprawl problem.

Fixing infrastructure sprawl starts with developers

Solving this problem is about managing data more efficiently as a basic principle, and then treating data as a product in its own right. This means looking at how data gets used through APIs.

From a developer perspective, interactions around data can be very simple - they want to use the standard set of actions CREATE, READ, UPDATE and DELETE, termed CRUD. Alongside these actions, developers may have to set up streaming or search services to meet user demands within an application. Putting these behind APIs can make the process easier for developers.
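
As a rough illustration of what putting CRUD behind an API looks like from the developer's side, here is a minimal Python sketch. The class and method names (InMemoryCollection, create, read, update, delete) are hypothetical and chosen only for this example; they are not the API of any particular product, and the in-memory dictionary merely stands in for a real data service.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class InMemoryCollection:
    """Hypothetical stand-in for a data service exposed through an API."""

    _docs: dict[str, dict[str, Any]] = field(default_factory=dict)

    def create(self, key: str, doc: dict[str, Any]) -> None:
        # Refuse to overwrite: CREATE only succeeds for new keys.
        if key in self._docs:
            raise KeyError(f"{key!r} already exists")
        self._docs[key] = dict(doc)

    def read(self, key: str) -> Optional[dict[str, Any]]:
        # Return a copy so callers cannot mutate stored state directly.
        doc = self._docs.get(key)
        return dict(doc) if doc is not None else None

    def update(self, key: str, changes: dict[str, Any]) -> None:
        # Raises KeyError if the document does not exist.
        self._docs[key].update(changes)

    def delete(self, key: str) -> bool:
        # True if something was removed, False if the key was unknown.
        return self._docs.pop(key, None) is not None


orders = InMemoryCollection()
orders.create("o-1", {"item": "book", "qty": 1})
orders.update("o-1", {"qty": 2})
```

The application code only sees four operations; which database sits behind them is an implementation detail of the service.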

However, accessing those data services through APIs rather than deploying multiple databases yourself is not an effective solution on its own. If the whole system is not built cohesively, you still have to learn each of these different APIs. It shifts some of the infrastructure sprawl burden, but it doesn’t remove the management overhead.

Using APIs alone also doesn't take away the fact that you need to connect all these systems together. To solve this effectively, you need to think about consolidating your platforms and APIs at the same time, so that you can serve all the different use cases related to data that your application developers will have over time.

To be effective, this “universal API” approach has to be combined with a platform approach. Rather than building applications with dozens of infrastructure components exposed to them, developers should be able to access these diverse functions through a single common interface. Instead of having to worry about data flowing between disparate systems, data should be available across these different functions automatically.
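
To make the idea of a single common interface more concrete, the following Python sketch shows one possible shape for it. Everything here is an assumption made for illustration: DataPlatform and its methods (put, get, delete, search, subscribe) are hypothetical names, the word index is a toy, and this is not how any specific platform is implemented. The point it illustrates is that one write makes the data available to lookup, search and event consumers without glue code between separate systems.

```python
from collections import defaultdict
from typing import Any, Callable, Optional

Document = dict[str, Any]
Listener = Callable[[str, str, Document], None]


class DataPlatform:
    """Hypothetical single interface over storage, search and events."""

    def __init__(self) -> None:
        self._docs: dict[str, Document] = {}
        self._index: dict[str, set[str]] = defaultdict(set)  # word -> keys
        self._listeners: list[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        # Listeners receive (operation, key, document) for every change.
        self._listeners.append(listener)

    def put(self, key: str, doc: Document) -> None:
        # One write updates storage, the search index and the event stream.
        self._unindex(key)
        self._docs[key] = dict(doc)
        for word in self._words(doc):
            self._index[word].add(key)
        self._emit("put", key, doc)

    def get(self, key: str) -> Optional[Document]:
        doc = self._docs.get(key)
        return dict(doc) if doc is not None else None

    def delete(self, key: str) -> None:
        doc = self._docs.get(key)
        if doc is None:
            return
        self._unindex(key)
        del self._docs[key]
        self._emit("delete", key, doc)

    def search(self, word: str) -> list[Document]:
        # Exact word match on string fields, results ordered by key.
        keys = self._index.get(word.lower(), set())
        return [dict(self._docs[k]) for k in sorted(keys)]

    def _emit(self, op: str, key: str, doc: Document) -> None:
        for listener in self._listeners:
            listener(op, key, dict(doc))

    def _unindex(self, key: str) -> None:
        old = self._docs.get(key)
        if old is None:
            return
        for word in self._words(old):
            self._index[word].discard(key)

    @staticmethod
    def _words(doc: Document) -> set[str]:
        return {
            w.lower()
            for value in doc.values()
            if isinstance(value, str)
            for w in value.split()
        }


platform = DataPlatform()
events: list[tuple[str, str]] = []
platform.subscribe(lambda op, key, doc: events.append((op, key)))

platform.put("p-1", {"title": "Red running shoes"})
platform.put("p-2", {"title": "Blue running jacket"})
platform.put("p-1", {"title": "Red trail shoes"})  # re-indexed automatically
```

After the last call, searching for "running" only returns the jacket, because updating p-1 also updated the index; the caller never had to synchronise a separate search system or message queue.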

Working with more data in interesting ways is essential to how developers deliver what businesses want. However, this has to be considered in the longer term, so that the sheer volume of data, services and requirements does not overwhelm your team with infrastructure sprawl.

Ovais Tariq

Ovais is the CEO of Tigris Data, where he leads the team building the world’s first truly open source developer data platform. Prior to Tigris Data, Ovais led data and storage engineering teams working on some of the toughest developer productivity problems around data, including at Uber, Khoros and Percona.