Key SQL Database Best Practices and Tips
When working with relational databases, adhering to best practices can make your queries more efficient, maintainable, and easier to understand. Here are some key guidelines to follow when creating and managing your tables and queries.
1. Table Names Should Be Plural
To keep your naming conventions consistent, it’s recommended that table names be plural. For example:
employees
books
This convention reflects that the table stores multiple records.
2.Field Names Should Be Singular
While table names are plural, field (column) names should be singular. For example:
name
(instead ofnames
)age
(instead ofages
)
This keeps field names clear and easy to understand, as each field represents a single attribute of a record.
3. More Tables are Better Than Fewer
It’s generally better to break up data into multiple smaller tables rather than having one large table with multiple subjects combined. For example, instead of having a single table that holds both employee information and department data, you might have one table for employees
and another for departments
. This normalization reduces redundancy and makes the database more flexible.
4. Table and Field Names Should Be Lowercase with No Spaces
Table and field names should be lowercase and use underscores (_
) instead of spaces for readability. For example:
employee_table
first_name
This naming convention makes queries more readable and easier to maintain.
5. SQL Formatting Best Practices
SQL formatting is not required for the code to execute, but it significantly enhances readability, especially when working with large and complex queries. Here are some best practices for formatting SQL queries:
Capitalizing SQL Keywords: SQL keywords (e.g.,
SELECT
,FROM
,WHERE
) should be written in uppercase to distinguish them from table and column names.Using Indentation and New Lines: Break down queries into multiple lines, especially for complex queries, to make them easier to read.
Avoiding Long Lines: Split long SQL queries across multiple lines. It makes your query easier to understand, especially in team environments.
Dealing with Non-Standard Field Names: If a column or table name contains spaces or special characters, wrap the name in double quotes (or backticks for MySQL).
- For further SQL style guidelines, you can refer to SQL Style Guide, a great resource for maintaining consistent formatting.
Records: Rows in a Table are Called Records
Each individual entry or row in a table is referred to as a record. It contains values for each column (or field) in that table.
Field: Columns in a Table are Called Fields
The individual components or attributes in each record are known as fields. For example, in an employees
table, fields might include name
, age
, email
, etc.
Unique Identifiers (Keys) are Used to Identify Records
Each record in a table should have a unique identifier, often referred to as a key (typically a number). The primary key uniquely identifies each record and ensures that there are no duplicates.
Result Set: The Query Result is Often Called a Result Set
When you run a query, the result returned is often referred to as a result set. This is the collection of data that matches your query criteria.
Order of Execution in SQL Queries
SQL queries may not execute in the order they are written. Understanding the order of execution can help with debugging, optimization, and structuring your queries correctly. Below is the standard SQL execution order:
FROM
: Determines the source of the data (tables or views).WHERE
: Filters the records based on conditions.SELECT
: Specifies the columns to be returned.ORDER BY
: Sorts the results.LIMIT
: Limits the number of rows returned.
e.g SELECT name FROM schools LIMIT 10;
In this query, the FROM
clause is executed first (to identify the table schools
), followed by filtering (if any) in the WHERE
clause (not used here), then the selection of columns (name
), and finally, the limit of 10 rows in the LIMIT
clause.
AS: Use Aliasing to Rename Columns
Aliasing allows you to rename a column temporarily for better readability in your query result. For example, if you want to rename the name
field as first_name
when selecting it, you can use the AS
keyword
LIMIT
LIMIT keyword is used to limit the number of records in the output e.g SELECT genre FROM books LIMIT 10;
SELECT name AS first_name, age FROM employee_tables;
#LIMIT
SELECT genre FROM books LIMIT 10;
DISTINCT: Select Distinct Records
If you want to retrieve unique records (i.e., no duplicates), you can use the DISTINCT
keyword.
-- Without DISTINCT:This query may return multiple entries with the same year_hired.
SELECT year_hired FROM employees;
-- With DISTINCT : This will return only unique years.
SELECT DISTINCT year_hired FROM employees;
-- This will return only unique combinations of dept_id and year_hired.
SELECT DISTINCT dept_id, year_hired FROM employees;
VIEW: Virtual Tables
A view is a virtual table created from the result of a saved SQL SELECT
statement. Views don’t store data themselves; instead, they store the query definition and display the result dynamically whenever accessed. A view automatically reflects changes in the underlying data.
You can use views to simplify queries and avoid direct modifications to the database. Views are especially helpful when you don’t have permission to modify the database but need custom views of the data.
-- After the view is created, you can use it in subsequent queries
CREATE VIEW employee_hire_years AS
SELECT id, name, year_hired
FROM employees;
SELECT id, name FROM employee_hire_years;
-- Another example:
CREATE VIEW library_authors AS
SELECT DISTINCT author AS unique_author
FROM books;
SELECT * FROM library_authors;
COUNT() Function: Counting Records in a Field
COUNT(field_name)
: Counts number of records with a value (i.e non NULL) in a field. Empty values are excluded from the count.
COUNT(*)
: It’s also used to determine the total number of records in a table. Counts all records in the table, including rows with null values in any column.
COUNT()
with DISTINCT
: Counting Unique Values
If you need to count only the distinct (unique) values in a column, you can use the COUNT(DISTINCT column_name)
syntax. This helps eliminate duplicate entries from your count.
-- This query counts the number of non-null values in the name and birthdate columns from the people table.
SELECT COUNT(name) AS count_name, COUNT(birthdate) AS count_birthdates
FROM people;
SELECT COUNT(*) AS total_records FROM people;
-- COUNT with DISTINCT
SELECT COUNT(DISTINCT birthdate) AS count_distinct_birthdates
FROM people;
WHERE: Filtering Data
Order of Execution:
FROM
: Table is identified.WHERE
: Filters rows based on condition.SELECT
: Chooses the output columns.LIMIT
: Limits the number of results.
-- ThWHERE clause filters out films released in 1960.
-- i.e not equal to 1960
SELECT title
FROM films
WHERE release_year <> 1960
LIMIT 5;
-- OR
SELECT * FROM coats WHERE color = 'white' OR length = 'short';
-- AND
SELECT * FROM coats WHERE color = 'white' AND length = 'short';
-- BETWEEN (inclusive range):
SELECT * FROM coats WHERE buttons BETWEEN 1 AND 5;
-- Combining Multiple Filters with Parentheses:
SELECT title
FROM films
WHERE (release_year = 1994 OR release_year = 2003)
AND (certification = 'PG' OR certification = 'R');
--more than 100,000
SELECT COUNT(num_votes) AS films_over_100K_votes
FROM reviews
WHERE num_votes >= 100000;