Finding Missing Values In A MySQL Table

If you have a table of incremental values it can be hard to find out which ones are missing. The only solution might be to write a script to get all the data from the database and see which ones are missing. However, there is a way of doing this without using a script.

Using a standard select query like this:

1
2
3
4
5
6
<p>Gets the following data:</p>
 
<pre language="plain">1
3
10
23

We can see that values are missing, but which ones? The following query will show us where the gaps are in the data of the table.

1
2
3
4
5
6
<p>Produces the following result.</p>
 
<pre language="plain">2
4
11
24

However, this only tell us where the gaps are, not how long they are. To get the range of where the gaps from and to we need to do something a little more complex.

1
2
3
4
5
6
7
8
<p>This query gives the following result.</p>
 
<table><tr><th>Missing From</th><th>To</th></tr>
<tr><td>2</td><td>2</td></tr>
<tr><td>4</td><td>9</td></tr>
<tr><td>11</td><td>22</td></tr></table>
 
<p>Using this dataset we can figure out where the gaps in the data are and perhaps do something with them.</p>
Category: 

Share:

  • Add news feed
  • Bookmark this on Delicious

Comments

Nice Work! My MySql keeps pulling up an error on the last SQL statement. There error is: #1054 - Unknown column 't1.id' in 'having clause' Versions are: phpMyAdmin - 2.11.5.2 MySQL client version: 5.0.45 Any ideas?
philipnorton42's picture

Odd this... It works in MySQL query browser, so I thought it must be an issue with phpMyAdmin. And it was! For some reason it has a problem with the table.column reference in the having command. I found the solution was to add quotes around the name, like this: HAVING 't1.ID' < MIN(t2.ID) - 1} Let us know if that works!

Very nice. I tweaked your example a little to find records that are missing by date

CREATE  TABLE IF NOT EXISTS `sequence` (
 `id` BIGINT NOT NULL AUTO_INCREMENT ,
 `date` DATETIME NOT NULL ,
 PRIMARY KEY (`id`)
)
ENGINE = InnoDB;



# missing 10/2, 10/12, and 10/13

INSERT INTO sequence (date) values ('2011-10-01 00:00:00');
INSERT INTO sequence (date) values ('2011-10-03 00:00:00');
INSERT INTO sequence (date) values ('2011-10-04 00:00:00');
INSERT INTO sequence (date) values ('2011-10-05 00:00:00');
INSERT INTO sequence (date) values ('2011-10-06 00:00:00');
INSERT INTO sequence (date) values ('2011-10-07 00:00:00');
INSERT INTO sequence (date) values ('2011-10-08 00:00:00');
INSERT INTO sequence (date) values ('2011-10-09 00:00:00');
INSERT INTO sequence (date) values ('2011-10-10 00:00:00');
INSERT INTO sequence (date) values ('2011-10-11 00:00:00');
INSERT INTO sequence (date) values ('2011-10-14 00:00:00');
INSERT INTO sequence (date) values ('2011-10-15 00:00:00');
INSERT INTO sequence (date) values ('2011-10-16 00:00:00');
INSERT INTO sequence (date) values ('2011-10-17 00:00:00');
INSERT INTO sequence (date) values ('2011-10-18 00:00:00');
INSERT INTO sequence (date) values ('2011-10-19 00:00:00');
INSERT INTO sequence (date) values ('2011-10-20 00:00:00');


mysql&gt; SELECT
   -> DATE_ADD(t1.date, INTERVAL 1 DAY) AS 'Missing From',
   -> DATE_ADD( MIN(t2.date), INTERVAL -1 DAY) AS 'To'
   -> FROM sequence AS t1, sequence AS t2
   -> WHERE t1.date  GROUP BY t1.date
   -> HAVING t1.date < DATE_ADD( MIN(t2.date), INTERVAL -1 DAY);
+---------------------+---------------------+
| Missing From        | To                  |
+---------------------+---------------------+
| 2011-10-02 00:00:00 | 2011-10-02 00:00:00 |
| 2011-10-12 00:00:00 | 2011-10-13 00:00:00 |
+---------------------+---------------------+
2 rows in set (0.00 sec)

Add new comment