Most databases, whether Microsoft Access, SQL Server, Oracle, or another platform, let you create tables and designate any field or fields as the primary key. Unfortunately, this makes it easy to create primary keys that are not optimal, which can cause performance problems and, even worse, architectural problems that become expensive to fix later.
Every table in a relational database should be keyed. Records in a relational database can be sorted and physically stored in any order, but the key field (or fields) defines each record uniquely. This makes it easy to link the table to data in other tables. The primary key also matters for secondary indexes.
A secondary index is defined on one or more fields to speed up searches by maintaining a presorted list. For instance, a customer table may have a secondary index on Zip Code to allow quicker searches on zip code values.
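As a concrete illustration, here is a minimal sketch in SQL Server's T-SQL (Access SQL differs slightly); the Customers table and its columns are hypothetical names invented for this example:

```sql
-- Customers has a single numeric primary key, plus a secondary
-- index on ZipCode so zip code searches use a presorted list
-- instead of scanning the whole table.
CREATE TABLE Customers (
    CustomerID   INT IDENTITY(1,1) PRIMARY KEY,
    CustomerName NVARCHAR(100) NOT NULL,
    ZipCode      CHAR(5) NOT NULL
);

CREATE INDEX IX_Customers_ZipCode ON Customers (ZipCode);
```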
Here are some tips for creating primary keys. As with all rules, there are specific situations where one may deviate from them. Additionally, if your tables are small, adjusting the primary key may have no perceivable impact, so these tips may not matter. For larger tables, however, and to support scalability over time, these tips can have a huge impact. The goal is to understand the reasons behind them so that any deviation from these best practices is intentional.
The primary key should be the first field (or fields) in your table design. While most databases allow you to define primary keys on any field in your table, the common convention, and what the next developer will expect, is for the primary key field(s) to come first.
The values in a table's primary key should never change over time. For instance, a numeric customer ID can point to a customer, and regardless of whether they change their name, address, phone number, etc., we can track the data properly without updating more than one table.
If a primary key value has meaning, it can change, which requires updating every table that references the value. That update may be automatic through cascading referential integrity, but it adds overhead, slows performance, and may create locking conflicts while the data is updated.
Therefore, it's critical to avoid using things like people's names as primary keys. Fields with meaning can also result in duplicates, which is a problem when the primary key is supposed to define each record uniquely.
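To see the cost of a meaningful key, consider this hedged T-SQL sketch (hypothetical table and column names, SQL Server syntax):

```sql
-- A customer's name is used as the primary key: a meaningful value that can change.
-- ON UPDATE CASCADE keeps referencing rows in sync automatically, but every rename
-- must also rewrite the matching Orders rows, adding overhead and holding locks.
CREATE TABLE CustomersByName (
    CustomerName NVARCHAR(100) PRIMARY KEY
);

CREATE TABLE Orders (
    OrderID      INT IDENTITY(1,1) PRIMARY KEY,
    CustomerName NVARCHAR(100) NOT NULL
        REFERENCES CustomersByName (CustomerName) ON UPDATE CASCADE
);
```

With a meaningless numeric key instead, a name change touches exactly one row in one table.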
Fortunately, it's easy to define a unique primary key. An auto-generated value such as an Identity column in SQL Server or an AutoNumber field in Microsoft Access does this. It doesn't matter if you delete a record and the key values are no longer consecutive. The primary key should have no meaning other than being the permanent tag for that record.
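The following T-SQL sketch (with a hypothetical Invoices table) shows why gaps left by deletions are harmless:

```sql
-- IDENTITY assigns each row a permanent, meaningless key; in Access,
-- an AutoNumber field plays the same role.
CREATE TABLE Invoices (
    InvoiceID   INT IDENTITY(1,1) PRIMARY KEY,
    InvoiceDate DATE NOT NULL
);

INSERT INTO Invoices (InvoiceDate)
VALUES ('2024-01-05'), ('2024-01-06'), ('2024-01-07');

DELETE FROM Invoices WHERE InvoiceID = 2;
-- The surviving keys are 1 and 3. The gap doesn't matter because the
-- value's only job is to tag its record permanently.
```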
A reasonable exception to this is geographic lists such as Countries, States, and Zip Codes, where the name is keyed rather than using a separate ID. These lists are relatively small and stable.
A single number (long integer) is all that's needed to tag a record uniquely; there's no reason to have more than one field defining uniqueness. Text fields require more bytes than numeric fields, so using a number saves considerable space.
Keeping the primary key as small as possible also reduces the memory and disk reads required to use the primary key index when links or searches are performed.
Primary key size also impacts secondary indexes. A secondary index contains the sorted information on the field or fields in the index, plus a reference back to the primary key. Searching on the secondary index is fast because it uses a presorted list, but still points back to the record using the primary key to retrieve all of the record's other fields.
The smaller the primary key, the smaller the secondary index, and the faster the links and searches are. If a table has lots of secondary indexes and lots of records, the difference can be very significant if you are not using a single numeric field as your primary key.
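To make the size difference concrete, here is a small T-SQL illustration; the sample values are assumptions chosen for this example:

```sql
-- A long integer key is a fixed 4 bytes; a text key grows with its contents.
-- Because every secondary index entry carries a copy of the primary key, the
-- difference is multiplied by (rows) x (number of secondary indexes).
SELECT DATALENGTH(CAST(123456 AS INT)) AS NumericKeyBytes,  -- 4 bytes
       DATALENGTH(N'Smith, John - NY') AS TextKeyBytes;     -- 30 bytes (2 per character)
```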
Sometimes people justify composite (multi-field) primary keys to avoid duplicates. This is the wrong approach. To prevent duplicates, create a secondary index for those fields and define it as unique. That lets the secondary index handle that job while the single field primary index retains its efficiencies.
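Here is a hedged T-SQL sketch of that approach (hypothetical OrderLines table): the single-field numeric primary key stays small, while a unique secondary index prevents duplicate combinations:

```sql
-- Instead of a composite primary key on (OrderID, LineNo), keep a single
-- numeric primary key and let a unique index enforce the no-duplicates rule.
CREATE TABLE OrderLines (
    OrderLineID INT IDENTITY(1,1) PRIMARY KEY,
    OrderID     INT NOT NULL,
    LineNo      INT NOT NULL
);

CREATE UNIQUE INDEX UX_OrderLines_OrderID_LineNo
    ON OrderLines (OrderID, LineNo);
```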
Our Total Access Analyzer add-in performs detailed analysis of your MS Access database objects and detects tables with non-optimal primary keys and field type inconsistencies, plus hundreds of other tips to improve your database designs.