Rowcount for Large Tables July 24, 2008
Posted by furrukhbaig in DMV's, Performance, SQL Server 2005, SQL Server 2005 features, Tips, TSQL.Tags: large tables, rowcount, rowcount scan, sys.dm_db_partition_stats
trackback
Ever wondered why simple statements like SELECT COUNT(*) FROM [Table_Name] takes forever to return row count on large tables? Its because it uses full table scan to count number of rows. The instant way to get the row count on any table is to query new Dynamic Management Views (DMV) in SQL Server 2005 sys.dm_db_partition_stats. DMV contains row count and page counts for any table including all the partitions. Please note even you did not create partition for your table, your table still going to be created on single default partition so this will include all the tables.
Following are some useful queries that will return row count almost instantly. In SQL Server 2000 sysindexes system table is used to get the number of rows for large table to avoid full table scan.
Comments are welcome.
— All the tables having ZERO rows
— compatable with partitioned tables
SELECT
Table_Name = object_name(object_id),
Total_Rows = SUM(st.row_count)
FROM
sys.dm_db_partition_stats st
GROUP BY
object_name(object_id)
HAVING
SUM(st.row_count) = 0
ORDER BY
object_name(object_id)
— Row Count without doing full table Scan
— This will include provide total number of rows in all partition (if table is partitioned)
SELECT
Total_Rows= SUM(st.row_count)
FROM
sys.dm_db_partition_stats st
WHERE
object_name(object_id) = ‘Your_Table_Name_Goes_Here’
AND (index_id < 2) — Only Heap or clustered index
I just ran the above queries on my DB along with full-scan selects and the results are different (by a large variance). Is there an explanation for that?
LikeLike
This one will come in handy, thanks 🙂
LikeLike
I tested my query for the heap table (having no index at all).
I have not got access to SQL 2005 server right now and therefore can’t test on tables with indexes on.
But I am quite confident that by adding folloing on WHERE clause this should give you exactly same count. following clause will avoid rowcount from NON CLUSTERED indexes and only look for HEAP (0) and CLUSTERED (1).
AND (index_id < 2)
so the query will be
SELECT
Total_Rows= SUM(st.row_count)
FROM
sys.dm_db_partition_stats st
WHERE
object_name(object_id) = ‘Your_Table_Name_Goes_Here’
AND (index_id < 2) — Only Heap or clustered index.
Feedback will be much appriciated.
Thanks,
LikeLike
During my latest project I encountered the same problem (poor performance with COUNT(*)) but I found out that there are cases when SUM(row_count) is not correct. You can find information on Internet that the “row_count” value is not guaranteed to be correct. I had some tables with above 100 millions records where differences were about 300-400 records. The simple explanation why is that, is that the column in system table is not updated with every single INSERT statement. This would cause performance problems.
LikeLike
Hi,
There is one best stored procedure, That gives you instant way to get the row count on any table.
SP_MSTABLESPACE ‘Table_Name’
Output of this SP is like that
Rows DataSpaceUsed IndexSpaceUsed
830 176 312
Hope this hidden thing helps you..
Regards
Shashi Kant Chauhan
LikeLike
Thanks Mark and Shashi for your feedback. First of all the sproc suggested by shashi (SP_MSTABLESPACE) uses sysindexes system table for rowcount. BOL says this is sql 2000 system table and only available for backward compatibility. Thats the reason I always prefer to use DMV. Also I am not sure if sysindexes is compatable with partition tables.
As far as difference in rowcount concern It could well be due to sum rowcount from nonclustered indexes as explained in my previous post apart from this I tried to run DMV query and full scan row count against 643 million row table and both give identical value.
Could you pleae provide me the query you are using and value of rowcount so I can investigate it further.
Thanks,
LikeLike
In my short tests with tables having a small amout of records your procedure is good… but can you reply me if you find out that actually is not that precise? Thank you.
LikeLike
Thanks Cristi for your feedback. Could you please provide me query you are using for rowcount which is not precise ? also please let me know the indexes on the table. I have tried it many times on tables contain 600+ million rows tables and everytime it is accurate. I am wondering if you are using updated query as suggested in my previous comments.
Thanks,
Furrukh Baig
LikeLike
Thank you for your reply and for your confirmation. I will use your method in my application, due to I want the fastest response for RowCount of a table. In my previous message I was just saying… IF you will find that is not accurate, or if you have another better solution… to keep me up-to-date, if you don’t mind.
Thank you very much.
LikeLike
sp_spaceused would also return the number of rows.
LikeLike
yes you are right but sp_spaceused uses same DMV dm_db_partition_stats for row count.
LikeLike
Authentic words, some true words dude. Thanks for makin my day!!
LikeLike
I have actually found dm_db_partition_stats to be unreliable. On many tables it’s accurate, but I have tables where it is way off. For instance a table with 5.000 rows of which dm_db_partition_stats claims it has 22.000 rows. When it is off, dm_db_partition_stats seems to always give to many rows, never too little.
sp_spaceused does use dm_db_partition_stats, but only when getting a database summary. When using it for a particular table it does not use dm_db_partition_stats and (in my experience) is very reliable.
LikeLike
[…] Rowcount for Large Tables ” SQL Server 2005 Performance Tuning Rowcount for Large Tables July 24, 2008. Posted by furrukhbaig in DMV’s, Performance, SQL Server 2005, SQL Server 2005 features, TSQL, Tips. … […]
LikeLike
For an exact record count I’d suggest:
EXEC sp_spaceused N’TableName’, N’TRUE’
The “TRUE” parameter forces an update to the statistics underlying the table to ensure accuracy. With a “FALSE” parameter the returned value should be used as an estimate only.
Some of Microsoft’s own documentation suggest the statistics are always up-to-date in version SQL Server 2005 and later – unless the database was upgraded from a previous version.
LikeLike