

Hive-Metastore : A Basic Introduction
source link: https://blog.knoldus.com/hive-metastore-a-basic-introduction/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Hive-Metastore : A Basic Introduction
Reading Time: 3 minutes
As we know database is the most important and powerful part for any organisation. It is the collection of Schema, Tables, Relationships, Queries and Views. It is an organized collection of data. But can you ever think about these question –
- How does database manage all the tables?
- How does database manage all the relationship?
- How do we perform all operations so easy?
- Is there any way to know about all this?
There is one answer for all the question and that is Metastore. In the Metastore database keeps all the information related to our databases, tables and relations as Metadata. When ever you want to know about database than in the Metastore we can easily find all the information.
Here we will talk about the Hive-Metastore system where it keep all the information about the tables and relations.
Hive-Metastore :
All hive implementation need a metastore service, where it stores metadata. It is implemented using tables in relational database. By default, Hive uses built-in Derby SQL server. It provides single process storage, so when we use Derby we can not run instance of Hive CLI. Whenever we want to run Hive on a personal machine or for some developer task than it is good but when we want to use it on cluster then MYSQL or any other similar relational database is required.
Now when you run your hive query and you are using default derby database you will find that your current directory now contains a new sub-directory metastore_db. Also the metastore will be created if it doesn’t already exist. The property of interest here is javax.jdo.option.ConnectionURL. The default value of this property is jdbc:derby:;databaseName=metastore_db;create=true. This value specifies that you will be using embedded derby as your Hive metastore and the location of the metastore is metastore_db.
We can also configure directory for hive store table information. By default, the location of warehouse is file:///user/hive/warehouse and we can also use hive-site.xml file for local or remote metastore.
hive-site.xml : We used hive-site.xml for changing the configuring to specifying to Hive where the database is stored. We used JDBC compliant database for the metastore because default embedded database is not suitable for the production. For providing these configuration we used hive-site.xml file. An example of hive-site.xml for using MYSQL Database for storing metastore :
xxxxxxxxxx
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore</value>
<description>metadata is stored in a MySQL server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL.metastore_spark_plsql</name>
<value>jdbc:mysql://localhost/metastore_sql</value>
<description>user metadata is stored in a MySQL server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>MySQL JDBC driver class</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>user</value>
<description>user name for connecting to mysql server </description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>password</value>
<description>password for connecting to mysql server </description>
</property>
</configuration>
When we use MYSQL JDBC driver than we download Jconnector(MYSQL JDBC Driver) and place in the $HIVE_HOME/lib and place hive-site.xml in $HIVE_HOME/conf. After this we will be able to store metastore in MYSQL.
To know about the metastore tables, field and relation please look into this diagram :
Here in this diagram, we can find our all the answer regarding the metastore. We can find out that how metastore stores information of database & tables and How these tables are internally connected with each other. In these tables we an find all the information which related to the tables.
I hope it will help you to develop basic understanding with the metastore in hive.
Thanks.
Reference:
Recommend
-
43
Spring Security 5.1.4 RELEASE 对于使用Java进行开发的同学来说,Spring算是一个比较热门的选择,包括现在流行的Spring Boot更是能快速上手,并让开发者更好的关注业务核心的开发,而免去了原来冗杂的配置过程。...
-
26
Apache Hive is often referred to as a data warehouse infrastructure built on top of Apache Hadoop. Originally developed by Facebook to query their incoming ~20TB of data each day, currently, programmers use it for ad-hoc q...
-
23
A Basic Introduction to TensorFlow Lite An introduction to TensorFlow Lite Converter, Quantized Optimization, and Interpreter to run Tensorflow Lite models at the Edge In this article, we will understan...
-
8
Associated type constructors, part 1: basic concepts and introduction Nov 2, 2016 So for the end of last week, I was at Rust Belt Rust. This was awesome. And not only because the speakers and attendees at Rust Belt Rust...
-
9
How to query external hive Metastore From Spark ...
-
10
Hive & ldquo; Creating database tables Metastore Hive & rdquo; the command fails during installation 'Path A' using Cloudera Manager advertisements ...
-
9
Reading Time: 2 minutesActix-web is a small and pragmatic framework of RUST. If you are already a Rust programmer you will probably find yourself at home quickly, but even if you are coming from another programming language you should find a...
-
7
Hive Database : A basic Introduction Reading Time: 3 minutesWhat is Hive? Hive is a data warehouse infrastructure tool which process structured data in Hadoop. It resides on top of Hadoop to summar...
-
6
Copy link Collaborator inbal2l ...
-
5
This article was published as a part of the Data Science Blogathon. Introduction I’ve always wondered how big companies like Google process their information...
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK