0

Foreign Data Wrappers in PostgreSQL Database

 4 weeks ago
source link: https://www.percona.com/blog/foreign-data-wrappers-in-postgresql-databases-postgres_fdw-dblink/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Foreign Data Wrappers in PostgreSQL Databases (postgres_fdw & dblink)

April 3, 2024

Neha Korukula

A Foreign Data Wrapper (FDW) is a feature in database management systems that can communicate with an external data source and access data without physically moving the data on a working local database.

PostgreSQL implements portions of the SQL/MED specification, allowing access to data that resides outside PostgreSQL using regular SQL queries. Such data, which is referred to as foreign data, can be accessed by Foreign Data Wrapper.

Note: SQL/MED is Management of External Data, a part of the SQL standard that deals with how a database management system can integrate data stored outside the database.

Accessing remote data may involve authenticating to the external data source. This process of authentication can be managed by creating a user mapping. Additionally, establishing a foreign server object defines the connection to a specific external data source.

We have various Foreign Data wrappers in PostgreSQL like postgres_fdw, file_fdw, Oracle_fdw, Mysql_fdw, TDS_fdw  etc. Here in this blog post we will discuss dblink and postgres_fdw.

dblink

dblink is an extension that facilitates database sessions in establishing connections to other PostgreSQL databases.

  • dblink executes a query (usually a SELECT, but it can be any SQL statement that returns rows) in a remote database.
  • There exist a few dblink functions for accessing data from external databases by establishing connections and disconnecting after usage.

Installation of dblink on a database:

The installation of the dblink extension can be executed directly through the CREATE command, as illustrated below.

dblink_testing=# CREATE EXTENSION dblink;
CREATE EXTENSION
dblink_testing=# dx dblink
                             List of installed extensions
Name  | Version | Schema |                         Description                          
--------+---------+--------+--------------------------------------------------------------
dblink | 1.2     | public | connect to other PostgreSQL databases from within a database
(1 row)

Steps to set up and access the remote database using dblink:

  • dblink_connect(): This function opens a persistent connection to a remote source database.
dblink_testing=# SELECT dblink_connect('pg_conn','hostaddr=*.*.*.* port=5434 dbname=dblink user=dblink_test password=******');
dblink_connect
----------------
(1 row)
  • dblink() : This function executes a query in a remote database.
dblink_testing=# select * from dblink ('pg_conn', 'select nspname as schema, relname as tablename,reltuples as rowcounts from
pg_class c JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace') AS RESULT(schema name,tablename name,rowcounts int)  where
schema like 'fdw_testing' order by schema, rowcounts desc;
 schema    | tablename  | rowcounts 
-------------+------------+-----------
fdw_testing | staff      |      1000
fdw_testing | staff_pkey |      1000
fdw_testing | students   |       300
(3 rows)
  • dblink_exec(): This function executes a command in a remote database.

Below are a few examples shown for executing commands like CREATE TABLE, INSERT data, and CREATE VIEW on the remote database.

dblink_testing=# SELECT dblink_exec('pg_conn','CREATE TABLE fdw_testing.employees (id INT PRIMARY KEY , fname VARCHAR(20) ,
lname VARCHAR (20) , dob DATE NOT NULL ,joined DATE )');
dblink_exec
--------------
CREATE TABLE
(1 row)
dblink_testing=# SELECT dblink_exec('pg_conn','INSERT INTO fdw_testing.employees (id ,fname,lname,dob ,joined) VALUES
( generate_series(1, 145) , (array[''Oswald'', ''Henry'', ''Bob'', ''Vennie''])[floor(random() * 4 + 1)],(array[''Leo'', ''Jack'',
''Den'', ''Daisy'' ,''Woody'']) [floor(random() * 5 + 1)], ''1995-01-01''::date + trunc(random() * 366 * 3)::int,
generate_series(''01/03/2024''::timestamp, ''09/02/2023''::timestamp, ''10 minutes''))');
dblink_exec 
--------------
INSERT 0 145
(1 row)
dblink_testing=# select dblink_exec('pg_conn','CREATE OR REPLACE VIEW sample AS SELECT  table_catalog,table_schema,table_name FROM
information_schema.views WHERE is_insertable_into=''YES''');
dblink_exec
-------------
CREATE VIEW
(1 row)
  • dblink_error_message(): This function retrieves the last error message on the named connection.
dblink_testing=# SELECT dblink_exec('pg_conn','INSERT INTO students (id ,fname,lname,dob ,joined) VALUES ( generate_series(500, 1000) ,
(array[''Oswald'', ''Henry'', ''Bob'', ''Vennie''])[floor(random() * 4 + 1)],(array[''Leo'', ''Jack'', ''Den'', ''Daisy'' ,''Woody''])[floor(random() * 5 + 1)],
''1995-01-01''::date + trunc(random() * 366 * 3)::int,generate_series(''01/03/2024''::timestamp, ''09/02/2023''::timestamp, ''10 minutes''))');
ERROR:  duplicate key value violates unique constraint "students_pkey"
DETAIL:  Key (id)=(500) already exists.
CONTEXT:  while executing command on dblink connection named "pg_conn"
dblink_testing=# select dblink_error_message('pg_conn');
                        dblink_error_message                          
------------------------------------------------------------------------
ERROR:  duplicate key value violates unique constraint "students_pkey"+
DETAIL:  Key (id)=(500) already exists.
(1 row)
  • dblink_get_connections(): This function provides the names of all open-named dblink connections.
dblink_testing=# select dblink_get_connections();
dblink_get_connections
------------------------
{pg_conn}
(1 row)
  • dblink_disconnect(): This function closes a persistent connection to a remote database.
dblink_testing=# SELECT dblink_disconnect('pg_conn');
dblink_disconnect
-------------------
(1 row)

postgres_fdw

The postgres_fdw extension offers the foreign-data wrapper postgres_fdw, enabling access to data stored in external PostgreSQL servers.

  • postgres_fdw provides the same functionality as dblink but uses a more modern and standards-compliant infrastructure and provides better performance.

1: Steps to implement remote access using postgres_fdw:

  1. Here we Install the postgres_fdw extension using CREATE EXTENSION.
postgres_fdw_db=# CREATE EXTENSION postgres_fdw ;
CREATE EXTENSION
postgres_fdw_db=# dx postgres_fdw
                           List of installed extensions
   Name     | Version | Schema |                    Description                     
--------------+---------+--------+----------------------------------------------------
postgres_fdw | 1.0     | public | foreign-data wrapper for remote PostgreSQL servers
(1 row)

     2. To establish connections to remote (source) databases, you can create a foreign server object using the CREATE SERVER command.

E.g.: CREATE SERVER foreign_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (hostaddr  ‘remote_ip_addr’ , port ‘remote_port’ , dbname ‘remote_db);’

postgres_fdw_db=# CREATE SERVER foreign_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (hostaddr '*.*.*.*', port '5434',
dbname 'dblink');
CREATE SERVER

Furthermore, you can list foreign servers through the below command.

postgres_fdw_db=# des
           List of foreign servers
    Name      |  Owner   | Foreign-data wrapper 
----------------+----------+----------------------
foreign_server | postgres | postgres_fdw
(1 row)

      3. For each database user to access each remote server, create a user mapping using the CREATE USER MAPPING command.

E.g.: CREATE USER MAPPING FOR SERVER foreign_server OPTIONS (USER ‘remote_db_user’ , password ‘*****’);

postgres_fdw_db=# CREATE USER MAPPING FOR  postgres_fdw_testing SERVER foreign_server OPTIONS (USER 'dblink_test' , password '*******');
CREATE USER MAPPING

Furthermore, to list user mappings, use the below command.

postgres_fdw_db=# deu
       List of user mappings
   Server     |      User name       
----------------+----------------------
foreign_server | postgres_fdw_testing
(1 row)

2: Creating a FOREIGN TABLE and accessing remote data:

To create a new foreign table in the current database, use CREATE FOREIGN TABLE

Note: Users must have USAGE privilege on the foreign server to be able to create a foreign table.

Below is the sample example shown for the Creation Foreign Table and accessing the data.

2.1: Here, we see the data from the Remote Database, which we try to access from the Target Database.

dblink=# select nspname as schema, relname as tablename,reltuples as rowcounts from pg_class c JOIN pg_catalog.pg_namespace n ON
n.oid = c.relnamespace where relkind='r' and nspname like 'fdw_testing' order by nspname, reltuples desc;
 schema    | tablename | rowcounts 
-------------+-----------+-----------
fdw_testing | staff     |      1000
fdw_testing | students  |       300
fdw_testing | employees |       145
(3 rows)

 2.2: Below, we create Foreign Table on the target Database and access the data.

  • It’s essential that the data types and other properties of the columns declared in CREATE FOREIGN TABLE match the actual remote table. 
postgres_fdw_db=# CREATE FOREIGN TABLE foreign_schema.employee_foreign_table (id bigint , fname varchar(20) , joined date) SERVER
foreign_server OPTIONS (schema_name 'fdw_testing' , table_name 'employees');
CREATE FOREIGN TABLE
postgres_fdw_db=# CREATE FOREIGN TABLE foreign_schema.students_foreign_table (id int , lname varchar(20)) SERVER foreign_server
OPTIONS (schema_name 'fdw_testing' , table_name 'students'); 
CREATE FOREIGN TABLE

As a result, below we see a list of foreign tables created in the foreign_schema.

postgres_fdw_db=# det foreign_schema.*
                List of foreign tables
   Schema     |         Table          |     Server     
----------------+------------------------+----------------
foreign_schema | employee_foreign_table | foreign_server
foreign_schema | students_foreign_table | foreign_server
(2 rows)

2.3: Validation of Source Data with step 2.1 and data copied to the target local database.

postgres_fdw_db=# SELECT employee_foreign_table_count,students_foreign_table_count FROM (SELECT COUNT(*) from foreign_schema.employee_foreign_table)
AS employee_foreign_table_count , (SELECT COUNT(*) from foreign_schema.students_foreign_table) AS students_foreign_table_count;
employee_foreign_table_count | students_foreign_table_count
------------------------------+------------------------------
(145)                        | (300)
(1 row)

3: IMPORT FOREIGN SCHEMA and access the remote/source data:

The IMPORT FOREIGN SCHEMA command seamlessly integrates table definitions from a foreign server into the local database, facilitating the smooth incorporation of foreign table structures.

3.1: To import all the tables in the remote (source) schema to the local working database, use the below command.

postgres_fdw_db=# IMPORT FOREIGN SCHEMA fdw_testing FROM SERVER foreign_server INTO postgres_fdw_schema;
IMPORT FOREIGN SCHEMA

Below we list Foreign Tables imported from remote source database.

postgres_fdw_db=# det postgres_fdw_schema.*
            List of foreign tables
     Schema        |   Table   |     Server     
---------------------+-----------+----------------
postgres_fdw_schema | employees | foreign_server
postgres_fdw_schema | staff     | foreign_server
postgres_fdw_schema | students  | foreign_server
(3 rows)

Validating the data in the target after importing to the schema postgres_fdw_schema involves ensuring that it matches the source data.

postgres_fdw_db=# select employees_cnt, staff_cnt , students_cnt FROM (select count(*) from postgres_fdw_schema.employees )
AS employees_cnt, (select count(*) from postgres_fdw_schema.staff) AS staff_cnt, (select count(*) from postgres_fdw_schema.students)
AS students_cnt;
employees_cnt | staff_cnt | students_cnt
---------------+-----------+--------------
(145)         | (1000)    | (300)
(1 row)

3.2: Import only foreign tables matching one of the given table names using “LIMIT TO” enables selective inclusion of specific foreign tables.

postgres_fdw_db=# IMPORT FOREIGN SCHEMA fdw_testing LIMIT TO (employees) FROM SERVER foreign_server INTO fdw_1;
IMPORT FOREIGN SCHEMA
postgres_fdw_db=# det fdw_1.*
    List of foreign tables
Schema |   Table   |     Server    
--------+-----------+----------------
fdw_1  | employees | foreign_server
(1 row)

Furthermore, validation of copied data into the local working database with remote data (step2.1).

postgres_fdw_db=# select count(*) from fdw_1.employees;
count
------
(1 row)

3.3: Import all the foreign tables excluding the given table names using “EXCEPT” to enable selective exclusion of specific foreign tables.

postgres_fdw_db=# IMPORT FOREIGN SCHEMA fdw_testing EXCEPT (employees) FROM SERVER foreign_server INTO fdw_2;
IMPORT FOREIGN SCHEMA

Below are the tables imported excluding the table “employees” and validation of copied data into the local working database with remote data (step2.1).

postgres_fdw_db=# det fdw_2.*
     List of foreign tables
Schema |  Table   |     Server    
--------+----------+----------------
fdw_2  | staff    | foreign_server
fdw_2  | students | foreign_server
(2 rows)
postgres_fdw_db=# select students_cnt, staff_cnt FROM (select count(*) from fdw_2.students ) AS students_cnt, (select count(*)
from fdw_2.staff) AS staff_cnt;
students_cnt | staff_cnt
--------------+-----------
(300)        | (1000)
(1 row)

Limitations:

  1. The ONLY option specified in SELECT, UPDATE, DELETE, or TRUNCATE has no effect when accessing or modifying the remote table using postgres_fdw.
  2. postgres_fdw currently lacks support for INSERT statements with an ON CONFLICT DO UPDATE clause. However, the ON CONFLICT DO NOTHING clause is supported.

For more details, please refer to these blog posts.

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community in a single distribution, designed and tested to work together. Run PostgreSQL in your production and mission-critical environments and easily deploy and orchestrate reliable PostgreSQL in Kubernetes.

Download Percona Distribution for PostgreSQL Today!

Array

Share This Post!

Subscribe
Connect with
guest
Label
0 Comments

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK