37

SELECT answers FROM sql

 4 years ago
source link: https://towardsdatascience.com/select-answers-from-sql-5cef923380ac?gi=13d7c17f2f7
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

A review of some common SQL interview questions with intuitive answers

jmaI3uM.jpg!web

Nov 18 ·6min read

N36Njmi.png!web

Modded from photo by Ashim D’Silva on Unsplash

An important note for anyone interested in data analysis or data science: do not underestimate the utility of SQL! It is easy to launch into learning using Python and packages like pandas and numpy, but there is a lot of business done with SQL and many reporting and analysis work that can be done quickly and efficiently with it. If you are someone that is interested in big data and you use a package like PySpark to interact with Apache Spark, writing SQL queries can be the clearest and quickest ways to filter and analyze data.

To help cement some of those concepts, here is a breakdown of some of the more common interview questions that may come up about SQL. I pulled these from lists of others I have recently reviewed (links at the bottom for more) and am just presenting explanation around a few that I personally have had come up in recent interviews:

Difference between SQL, MySQL, and SQL Server

To be very basic, SQL stands for Structured Query Language. It is the language itself that is used in designing, managing, and querying structured relational databases. mySQL is an open-source relational database management system. SQL Server (also known as Microsoft SQL Server) is a relational database management system created by Microsoft for their server platform. Both of these platforms use a common language (SQL), but have some nuances of their own. There are others you will probably hear of, like Oracle, but the main thing to note is that SQL is the language and these others are platforms for databases.

WHERE vs. HAVING

If you need a one sentence answer: HAVING is used specifically for filtering on tables using aggregate functions, and WHERE is not. For longer context: the WHERE clause filters based on rows that meet a certain condition. It is looking for matches of specific data in rows. The HAVING clause is used for when you are filtering on conditions of data in the entire table. Here is a quick breakdown of what that looks like. Let’s pretend we have a table that looks likes this of a random assortment of Game of Thrones characters:

╔═══════════╦════════════════════╦════════╗
║   name    ║       house        ║  age   ║
╠═══════════╬════════════════════╬════════╣
║ Eddard    ║ Stark              ║ 40     ║
║ Catelyn   ║ Stark              ║ 35     ║
║ Rob       ║ Stark              ║ 17     ║
║ Sansa     ║ Stark              ║ 13     ║
║ Cersei    ║ Lannister          ║ 35     ║
║ Jaime     ║ Lannister          ║ 35     ║
║ Tyrion    ║ Lannister          ║ 30     ║
║ Daenerys  ║ Targaryen          ║ 15     ║
║ Theon     ║ Greyjoy            ║ 19     ║
╚═══════════╩════════════════════╩════════╝

If you wanted to wanted to look for just the characters above or equal to 30, we would write a query like this:

SELECT name, age
FROM got_table
WHERE age >= 30

This function looks at a table, filters based on rows that have an age listed greater than or equal to 30. It is looking at the data available in the table and filtering from there. But let’s say you want to use aggregate functions. Let’s start with aggregating this table based on house, just to count how many people are listed in each:

SELECT house, COUNT(*)
FROM got_table
GROUP BY house

There we have it! BUT What if we just wanted to look at the rows created in this aggregate table that have more than one member, we would need to use HAVING instead of WHERE :

SELECT house, COUNT(*)
FROM got_table
GROUP BY house
HAVING COUNT(*) < 2

In short, if you are either looking for rows WHERE or groups HAVING .

Primary vs. Foreign Key

A primary key is a unique identifier for a record in a database table. A foreign key is a field in a table that is the primary key in a different table. A primary key can only be once per table, where a foreign key can appear multiple times. Here I am going to make a couple tables to illustrate (bear with my fake data and mix of locations for fun):

╔════════╦═══════════╦═════════╗  ╔════════╦═══════════╦════════╗
║group_id║  name     ║army_size║  ║planetid║  name     ║group_id║
╠════════╬═══════════╬═════════╣  ╠════════╬═══════════╬════════╣  
║ 0001   ║Cpt. Planet║ 1       ║  ║ 7001   ║ Earth     ║ 0001   ║  
║ 0002   ║First Order║ 800,000 ║  ║ 7020   ║ Cybertron ║ 0001   ║
║ 0003   ║ Autobots  ║ 1,000   ║  ║ 7044   ║ Tatooine  ║ 0003   ║
║ 0004   ║Silverhawks║ 10      ║  ║ 5084   ║ Alderan   ║ 0004   ║
║ 0005   ║ C.H.U.D.s ║ 35,000  ║  ║ 2001   ║ Mars      ║ 0002   ║
╚════════╩═══════════╩═════════╝  ╚════════╩═══════════╩════════╝
TABLE 1                           TABLE 2

So let’s say we have a smattering of factions (taken largely from the back of brain) warring over a few different planets. We have them in separate tables to store the individual characteristics of each. The tables are related based on which group controls what planet. In table 1, the primary key is group_id. It is a unique id given to identify each record. In table 2, the primary key is planetid, and the foreign key is group_id. You can see how we can show that Captain Planet has taken over both Earth and Cyberton, a pretty powerful move! The group_id used in table 2 shows the relationship between the two tables, but having a foreign key allows us to have unique tables for each set. In larger databases you will have systems that have several tables connected in this way.

Joins & Unions

The basics are as follows, assuming we have two tables we are connecting data side by side that have some common key or value to match on:

ArArmyN.png!web

  • INNER JOIN: Just records that have matching values in BOTH tables (B)
  • LEFT JOIN: All records from the LEFT most table and records from the RIGHT that match (A & B)
  • RIGHT JOIN: All records from the RIGHT most table and records from the LEFT that match (B & C)
  • FULL OUTER JOIN: All records from BOTH (A & B & C)

Not so bad, just takes time to remember the layout. It is good to think through joins in their side by side nature. With unions, we just need to flip the perspective of how we are viewing the tables:

3iIFBnB.png!web

  • UNION: the combination of the two tables, stacked on top of each other, but no duplicates from the second (A & B)
  • UNION ALL: the combination of two tables, stacked on top of each other in their entirety (A & B & C)

DELETE vs. TRUNCATE vs. DROP

Quick answer: DELETE can be used to drop some records from a table, and TRUNCATE deletes an entire table’s rows, and DROP removes the object from the database. When using DELETE , you would specify a WHERE clause to single out records:

DELETE FROM got_table WHERE house = "Stark"

TRUNCATEwipes out all the rows from the table, with no modifiers:

TRUNCATE TABLE got_table

DROPactually removes the table object and rows from the database:

DROP TABLE got_table

Each has utility, but be wise on which you need based on your scenario.

Manipulating strings

There are a lot of functions to help with string manipulation, but one important thing is that indexing starts at 1 (though commonly you start with 0 in other programming languages). Here are a few quick important functions that might come up:

  • SUBSTR( text, position, length ) : will take text and grab any portion
  • LEFT( text, length ) & RIGHT( text, length ): both start from fixed farthest right or left position and trim from there.
  • REPLACE( text, string_to_find, string_to_replace_with ): fairly straight-forward replacing of text.
  • CONCAT( text_1, text_2, etc . ): simple combination of text

Here is a quick look at how each might appear:

╔═══════════╦══════════╦══════════════════════════╗
║ first ║ last ║ line ║
╠═══════════╬══════════╬══════════════════════════╣
║ Inigo ║ Montoya ║ "Prepare to battle!" ║
╚═══════════╩══════════╩══════════════════════════╝
SELECT LEFT(first, 1), last FROM pb_table
╔═════╦══════════╗
║ I ║ Montoya ║
╚═════╩══════════╝
SELECT CONCAT(last, "," , first) FROM pb_table
╔════════════════╗
║ Montoya, Inigo ║
╚════════════════╝
SELECT last, REPLACE(line, "battle", "die") FROM pb_table
╔══════════╦════════════════════╗
║ Montoya ║ "Prepare to die!" ║
╚══════════╩════════════════════╝

Some interview question sources I’ve found helpful:

Top SQL Interview Questions (IntelliPaat)

Top 65 SQL Interview Questions You Must Prepare In 2019 (Edureka)

Top 50 SQL Interview Questions & Answers (Guru99)

100+ Most Popular SQL Interview Questions And Answers (Software Testing Material)

Some SQL learning resources I’ve enjoyed:

Mode Analytics (free and can use their platform to do some interactive practice)

Codecademy (love their interactive platform and thorough explanations)

W3Schools (a pretty well written reference with some sample interaction as well)

Thanks!


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK