Discover Lakehouse – Databricks
source link: https://databricks.com/discoverlakehouse?utm_campaign=CB+-+AMER+-+USCA+-+EN+-+AW+-+ALL+-+DL+-+ALL&utm_adgroup=CTX+-+Lakehouse+-+ALL+-+TEX&utm_content=microsite&utm_offer=discoverlakehouse&utm_ad=EN+-+HUB+-+Discover+Lakehouse+-+Text+Link+-+1x1+-+TEX+-+Copy+2&utm_term=%7Bkeyword%7D
Your data warehouse wasn’t built for today’s world
Like the CD, disposable camera, floppy disk and most other 40-year-old innovations, the data warehouse had a great run. But new use cases have spawned new technologies. CDs can’t stream music. Film cameras can’t share photos. Floppy disks can’t compete with infinite cloud storage. And data warehouses can’t perform AI.
It’s time for a simpler approach
AI is a priority for every organization. But today’s complex and outdated legacy infrastructure can’t deliver on the promise of AI. It’s time for a new data architecture built to meet your needs today — and future-proofed so it’s ready for whatever tomorrow brings.
A new era of data and AI opens
The data lakehouse is an open data architecture that combines the best of data warehouses and data lakes on one platform.
Now you can store all your data — structured, semi-structured and unstructured — in your open data lake and still get the data quality, performance, security and governance you expect from a data warehouse. This makes lakehouse the only data architecture that supports business intelligence, SQL analytics, real-time data applications, data science and machine learning in one platform.
One platform for all use cases
The essential ingredients
Delta Lake is an open-source project that delivers reliability, security and performance on your data lake — essential to building lakehouse architecture on top of existing storage systems such as Amazon S3, Azure Data Lake Store and Google Cloud Storage.
Delta Lake is stored in an open data format so you avoid data lock-in from proprietary formats and gain access to a vast open source ecosystem. Today, thousands of companies are processing exabytes of data per month with Delta Lake.
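To make the reliability claim more concrete, here is a minimal sketch of the general idea of layering an ordered commit log over plain files in a data lake, so that readers only ever see committed data. It is a toy illustration, not Delta Lake's implementation or API: the `ToyTable` class, the `_log` directory, the JSON file layout and the file naming are all hypothetical and chosen only to keep the example self-contained.

```python
import json
import os
import tempfile
import uuid


class ToyTable:
    """Toy table: plain data files plus an ordered commit log (hypothetical layout)."""

    def __init__(self, root):
        self.root = root
        self.log_dir = os.path.join(root, "_log")  # assumed name, for illustration only
        os.makedirs(self.log_dir, exist_ok=True)

    def _versions(self):
        # Commit files are named by zero-padded version number, e.g. 00000003.json.
        return sorted(int(name.split(".")[0]) for name in os.listdir(self.log_dir))

    def append(self, rows):
        """Write rows to a new data file, then publish it with a single commit file."""
        # The data file gets a unique name, so concurrent writers never overwrite each other.
        data_name = f"part-{uuid.uuid4().hex}.json"
        with open(os.path.join(self.root, data_name), "w") as f:
            json.dump(rows, f)

        versions = self._versions()
        version = versions[-1] + 1 if versions else 0
        commit_path = os.path.join(self.log_dir, f"{version:08d}.json")
        # Exclusive create ("x") raises FileExistsError if another writer
        # already claimed this version, so a version is committed at most once.
        with open(commit_path, "x") as f:
            json.dump({"add": data_name}, f)
        return version

    def read(self, version=None):
        """Return rows from all commits up to `version` (latest if None)."""
        rows = []
        for v in self._versions():
            if version is not None and v > version:
                break
            with open(os.path.join(self.log_dir, f"{v:08d}.json")) as f:
                commit = json.load(f)
            with open(os.path.join(self.root, commit["add"])) as f:
                rows.extend(json.load(f))
        return rows


with tempfile.TemporaryDirectory() as root:
    table = ToyTable(root)
    table.append([{"id": 1, "event": "click"}])
    table.append([{"id": 2, "event": "view"}])
    print(table.read())           # rows from both commits
    print(table.read(version=0))  # rows as of the first commit only
```

Because readers follow the log rather than listing the data directory, a data file that was written but never committed stays invisible, and reading up to an older version reproduces an earlier state of the table. A production system would also need to handle partially written commit files, retries after a failed commit, and deletes or updates, none of which this sketch attempts.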
Lakehouses do what warehouses can’t
Lakehouse leapfrogs the limitations of the data warehouse because it’s designed to manage all types of data while supporting both traditional data warehouse workloads and machine learning natively. It adds all this functionality to your existing data lake, creating a single open system to both manage all of your data and support every use case.
Feature | Data Warehouse | Lakehouse |
---|---|---|
Data formats | Closed | Open |
Data types | Structured* | Any type of data |
Scalability | Limited** | Highly scalable |
Cost | $$$ | $ |
Use cases | BI, SQL | BI, SQL, ML, Real-Time Apps |
Data access | SQL only | Open APIs for direct access to files with SQL, R, Python and other languages |
Reliability | High-quality, reliable data with ACID transactions | High-quality, reliable data with ACID transactions |
Governance | Fine-grained security and governance for row/columnar level for tables | Fine-grained security and governance for row/columnar level for tables |
Performance | High | High |
*Limited support for semi-structured data
**Cost of scaling is prohibitive
The father of data warehousing agrees.
Grab your free copy of Bill Inmon’s new book, Building the Data Lakehouse.
The world’s first and only lakehouse platform in the cloud
Delivered and managed as a service on AWS, Microsoft Azure, or Google Cloud, the Databricks Lakehouse Platform makes all the data in your data lake available for any number of data-driven use cases.
Data engineers can build fast and reliable data pipelines. Business analysts can perform BI, running SQL queries faster than most data warehouses. Data scientists can streamline MLOps. And when all your data teams are on a common platform, you can significantly reduce infrastructure costs, increase data team productivity and accelerate innovation.
Analytics directly on your data lake
Databricks brings data analytics to your data lake, delivering data warehouse performance at data lake economics. Using open source standards to avoid data lock-in, the Databricks Lakehouse Platform provides the reliability, quality and performance capabilities that data lakes natively lack and up to 6x better price/performance than traditional cloud data warehouses.
The world’s leading companies are moving to the lakehouse
Transforming banking operations at a global scale
ABN AMRO uses the Databricks Lakehouse Platform to transform its banking operations, from preventing fraud to personalizing mobile experiences.
Improving patients’ lives with faster drug development and delivery
Amgen uses Databricks Lakehouse for 280+ ML and analytics use cases from genomic research to clinical trials.
Approaching data nirvana with infinite scale at manageable cost
Atlassian deploys the Databricks Lakehouse Platform as an open and secure platform to democratize data for thousands of users enterprise-wide.
Increasing investment sustainability and mitigating risk with ESG data
S&P Global uses Databricks Lakehouse to help their customers glean insights from alternative data sets, driving more sustainable investment and mitigating risk.
Democratizing data to deliver gaming experiences at scale
SEGA moved from data warehousing to Databricks Lakehouse, unifying massive amounts of diverse data sets to enable insights that deliver personalized experiences to 30 million gamers.
Personalizing experiences with data and ML
Grab standardized on Databricks Lakehouse to deliver insights at scale, democratizing data through the rapid deployment of AI and BI use cases across its operations.