
Python and Oracle Database

 3 years ago
source link: https://towardsdatascience.com/python-and-oracle-database-c7b5d4d7fa4c

A combination no data scientist can ignore

[Image: photo by bruce mars on Unsplash]

While learning data science, we mostly work with open-source data in Excel or CSV format. Real-world data science projects, however, usually involve accessing data from databases. If data needs to be stored and accessed reliably, it needs a database. The two major types of modern databases are relational (RDBMS) and non-relational (also called NoSQL) databases.

An RDBMS works well with data that can be stored in rows and columns. Consistency and speed are its greatest strengths.

In this article, I will take you through one of the most widely used relational databases, Oracle Database, and explain how you can combine Python applications with it.

Connection Strategy

[Figure: image by author]

A database connection is a physical communication pathway between an application and an Oracle database instance. Once an application makes a database connection, the resulting session lasts from the time the user is authenticated by the database until the user disconnects or exits the database application.

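Python connects to the Oracle database through cx_Oracle. It is a third-party extension module rather than part of the standard library, so it has to be installed separately, for example with pip install cx_Oracle.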

One of the features I like most about cx_Oracle is the transparency it provides for diagnosing connection-related and data-access-related issues. The figure below shows the different error codes raised at different stages of application and database connection and operation.

[Figure: error codes at different stages of application and database connection and operations; image by author]
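As a rough illustration of how those error codes surface in Python, the sketch below pulls the code and message out of a database exception. cx_Oracle attaches an error object to `exc.args[0]`; the helper names `describe_db_error` and `try_connect` are made up for this example.

```python
def describe_db_error(exc):
    """Return (code, message) for an exception raised by the driver.

    cx_Oracle stores an error object with .code and .message in
    exc.args[0]; any other exception falls back to (None, str(exc)).
    """
    detail = exc.args[0] if exc.args else None
    code = getattr(detail, "code", None)
    message = getattr(detail, "message", str(exc))
    return code, message


def try_connect(user, password, dsn):
    """Attempt a connection and report the Oracle error code on failure."""
    # Imported here so the helper above can be used even where the
    # third-party driver is not installed.
    import cx_Oracle

    try:
        return cx_Oracle.connect(user=user, password=password, dsn=dsn)
    except cx_Oracle.DatabaseError as exc:
        code, message = describe_db_error(exc)
        print(f"Connection failed (code {code}): {message}")
        raise
```

Logging the numeric code alongside the message makes it easier to tell a network or listener problem from an authentication or SQL problem.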

The code snippet below shows how to connect to an Oracle database using the cx_Oracle library. Replace the user ID, password, host, port, and service name with your own database details.
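The original snippet is not available here, so what follows is a minimal sketch of a standalone connection. It assumes an Easy Connect string of the form host:port/service_name; the helper names and the sample values in the comments are placeholders.

```python
def easy_connect_dsn(host, port, service_name):
    """Build an Easy Connect string of the form host:port/service_name."""
    return f"{host}:{port}/{service_name}"


def open_connection(userid, password, host, port, service_name):
    """Open a standalone connection; the caller closes it when finished."""
    # Imported here so this sketch can be loaded even where the
    # third-party driver is not installed (pip install cx_Oracle).
    import cx_Oracle

    dsn = easy_connect_dsn(host, port, service_name)
    return cx_Oracle.connect(user=userid, password=password, dsn=dsn)


# Placeholder values (assumptions); replace them with your database details.
# connection = open_connection("scott", "change_me", "dbhost.example.com", 1521, "orclpdb1")
# print(connection.version)
# connection.close()
```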

Instead of entering the database login details directly in a Python program, you can store them in a separate Python file and import that file into the program, as shown below.

Here, oracbledbconnect is the name of the file where database connection details are stored.
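One way this could look is sketched below. To keep the example self-contained, a `SimpleNamespace` stands in for the imported oracbledbconnect module. The attribute names (`userid`, `password`, `host`, `port`, `service_name`) and their values are assumptions, not the author's actual file contents.

```python
from types import SimpleNamespace

# Stand-in for `import oracbledbconnect`. In practice these would be
# module-level variables defined in oracbledbconnect.py (names assumed).
oracbledbconnect = SimpleNamespace(
    userid="scott",
    password="change_me",
    host="dbhost.example.com",
    port=1521,
    service_name="orclpdb1",
)


def connect_kwargs(config):
    """Build keyword arguments for cx_Oracle.connect() from a config module."""
    return {
        "user": config.userid,
        "password": config.password,
        "dsn": f"{config.host}:{config.port}/{config.service_name}",
    }


# With the driver installed, the connection would then be opened with:
# connection = cx_Oracle.connect(**connect_kwargs(oracbledbconnect))
```

Keeping credentials in a separate file also makes it easier to exclude them from version control.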

[Figure: image by author]

Connection Pool

Designing a sound connection strategy between a Python application and an Oracle database is crucial. A single standalone connection might not suffice for a data-intensive application, which may need a connection pool instead.

A connection pool is a cache of connections to an Oracle database.

At run time, the application requests a connection from the pool. If the pool contains a connection that can satisfy the request, then it returns the connection to the application. The application uses the connection to perform work on the database and then returns the connection to the pool. The released connection is then available for the next connection request.

The below code snippet shows how to create a connection pool.

This code creates five sessions at startup and grows the pool on demand up to a maximum of 15 sessions.
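A minimal sketch of such a pool, using `cx_Oracle.SessionPool`, might look like this. The minimum of 5 and maximum of 15 come from the text; the increment of 1 and the helper names are assumptions.

```python
# Pool sizing: 5 sessions at start, at most 15; the increment is assumed.
POOL_SETTINGS = {"min": 5, "max": 15, "increment": 1}


def create_pool(user, password, dsn):
    """Create a session pool that starts with 5 sessions and grows to 15."""
    # Imported here so the rest of the sketch loads without the driver.
    import cx_Oracle

    return cx_Oracle.SessionPool(user=user, password=password, dsn=dsn,
                                 **POOL_SETTINGS)


def run_with_pooled_connection(pool, work):
    """Borrow a connection, run work(connection), and always return it."""
    connection = pool.acquire()
    try:
        return work(connection)
    finally:
        # Releasing makes the connection available for the next request.
        pool.release(connection)
```

Releasing in a `finally` block means the connection goes back to the pool even when the work raises an exception, so failed requests do not exhaust the pool.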

Python & SQL

SQL is foundational, must-have knowledge for anyone working in a data-related profession.

Running a SQL statement on an Oracle database using cx_Oracle is not difficult at all. Just open a cursor on the connection object, execute the SQL, and close the connection when you are done.

The code snippet below shows how to connect to the database, select all records from a table, and print the value in the first row and third column.
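The steps above can be sketched as follows, assuming `connection` is an already open connection (for example one returned by `cx_Oracle.connect`). The function name and the `employees` table in the comment are hypothetical.

```python
def first_row_third_column(connection, select_sql):
    """Run a query and return the value in the first row, third column."""
    cursor = connection.cursor()
    try:
        cursor.execute(select_sql)
        rows = cursor.fetchall()   # list of tuples, one tuple per row
        return rows[0][2]          # indexes are zero-based
    finally:
        cursor.close()


# Hypothetical usage with an open cx_Oracle connection:
# print(first_row_third_column(connection, "SELECT * FROM employees"))
# connection.close()
```

Without an ORDER BY clause the database does not guarantee which row comes back first, so add one if the "first row" needs to be deterministic.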

Python & PL/SQL

PL/SQL is Oracle's procedural extension to SQL. Instead of issuing individual SQL statements as shown above, you can group several SQL statements in a PL/SQL program. A PL/SQL function can then be called from a Python application through the callfunc method of the connection's cursor, as shown in the code snippet below.
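Here is a small sketch of that call, assuming an open connection and a stored function that already exists in the database. The function `get_employee_count` and its argument are invented for illustration.

```python
def call_plsql_function(connection, function_name, return_type, arguments):
    """Call a stored PL/SQL function and return its result.

    cursor.callfunc takes the function name, the Python type of the
    return value, and a list of positional arguments.
    """
    cursor = connection.cursor()
    try:
        return cursor.callfunc(function_name, return_type, arguments)
    finally:
        cursor.close()


# Hypothetical usage, assuming the database contains something like
#   FUNCTION get_employee_count(p_dept_id IN NUMBER) RETURN NUMBER
# count = call_plsql_function(connection, "get_employee_count", int, [10])
```

Procedures that return nothing are called with the cursor's callproc method instead.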

Large Objects

Most data in an Oracle database is stored as rows and columns of scalar values. Data that does not fit ordinary column types, for example documents, images, and videos, cannot be stored in those columns directly.

Data such as large documents and images is referred to as large objects (LOBs) in the Oracle database. cx_Oracle provides types such as CLOB and BLOB to handle large objects.

The below code snippet shows how to store an image in the Oracle database table field.
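A minimal sketch of storing an image is shown below. It assumes a table `images` with a numeric `id` column and a BLOB `image` column; the table, column, and function names are made up. cx_Oracle lets you bind `bytes` directly to a BLOB column, which is the simplest approach for files that fit comfortably in memory.

```python
INSERT_IMAGE_SQL = "INSERT INTO images (id, image) VALUES (:id, :image)"


def store_image(connection, image_id, image_path):
    """Read a file as bytes and insert it into a BLOB column.

    Returns the number of bytes stored.
    """
    with open(image_path, "rb") as handle:
        data = handle.read()

    cursor = connection.cursor()
    try:
        # The bytes value is bound directly to the BLOB column.
        cursor.execute(INSERT_IMAGE_SQL, {"id": image_id, "image": data})
        connection.commit()
    finally:
        cursor.close()
    return len(data)


# Hypothetical usage with an open cx_Oracle connection:
# store_image(connection, 1, "logo.png")
```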

System Integration

Data science and machine learning involve getting data from different sources, for example public data, web scraping, and files from service providers. Using the cx_Oracle and sqlalchemy libraries, these different kinds of data can be loaded into Oracle database tables.

The code snippet below shows how to load data from all the files in a specific directory into an Oracle database table.
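The original snippet is not available, so this sketch swaps in a simpler technique than sqlalchemy: the standard `csv` module plus the cursor's `executemany` method. It assumes every file in the directory is a CSV with `order_id` and `amount` columns, and that the target table is called `sales`. All of these names are hypothetical.

```python
import csv
from pathlib import Path

INSERT_SQL = "INSERT INTO sales (order_id, amount) VALUES (:order_id, :amount)"


def load_csv_directory(connection, directory):
    """Insert the rows of every *.csv file in a directory into one table.

    Returns the total number of rows inserted.
    """
    cursor = connection.cursor()
    total = 0
    try:
        for csv_path in sorted(Path(directory).glob("*.csv")):
            with open(csv_path, newline="") as handle:
                rows = [
                    {"order_id": int(row["order_id"]),
                     "amount": float(row["amount"])}
                    for row in csv.DictReader(handle)
                ]
            if rows:
                # One batched call per file instead of one call per row.
                cursor.executemany(INSERT_SQL, rows)
                total += len(rows)
        connection.commit()
    finally:
        cursor.close()
    return total
```

Batching each file with `executemany` avoids issuing a separate insert call for every row, which matters when the files are large.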

Conclusion

Most large enterprises use relational databases such as Oracle, MySQL, or IBM DB2 to store business-critical data. SQL is an efficient way to work with most relational databases, and its declarative design makes it easy to learn and use, but its capacity is limited when it comes to developing a full application. This is where the combination of Python and SQL shines. As a data scientist, you cannot ignore the importance of learning Python with the Oracle database.
