Md Mominul Islam | Software and Data Engineering | SQL Server, .NET, Power BI, Azure Blog

while(!(succeed=try()));


Wednesday, September 10, 2025

 

SQL Server High CPU Usage: Causes and Best Fixes

Understanding High CPU Usage

High CPU usage occurs when SQL Server processes consume excessive processor cycles, often due to inefficient queries, poor indexing, or configuration mismatches. In SQL Server 2025, features like adaptive joins and parameter-sensitive plan optimization reduce CPU strain, but issues persist in high-transaction environments. Common symptoms include slow query execution, application timeouts, and system lag, which can disrupt operations like real-time stock trading or inventory updates.
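Before digging into individual queries, a quick DMV check can confirm whether the server is genuinely under CPU pressure. A minimal sketch using sys.dm_os_schedulers (a sustained nonzero runnable_tasks_count means tasks are waiting for CPU rather than for I/O or locks):

```sql
-- CPU pressure check: tasks waiting in the runnable queue per scheduler
SELECT 
    scheduler_id,
    current_tasks_count,
    runnable_tasks_count   -- sustained values > 0 indicate CPU pressure
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255;  -- exclude hidden/internal schedulers
```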

Step-by-Step Troubleshooting and Fixes

Step 1: Identify High CPU Queries

Use DMVs to pinpoint queries consuming the most CPU resources.

  • Query DMVs for CPU Usage:

    SELECT 
        qsq.query_id,
        qt.query_sql_text,
        AVG(rs.avg_cpu_time) / 1000.0 AS avg_cpu_ms,   -- avg_cpu_time is in microseconds
        SUM(rs.count_executions) AS total_executions,
        SUM(rs.avg_cpu_time * rs.count_executions) / 1000.0 AS total_cpu_ms
    FROM sys.query_store_query qsq
    INNER JOIN sys.query_store_query_text qt ON qsq.query_text_id = qt.query_text_id
    INNER JOIN sys.query_store_runtime_stats rs ON qsq.query_id = rs.query_id
    GROUP BY qsq.query_id, qt.query_sql_text
    ORDER BY total_cpu_ms DESC;

    Enable Query Store if not already active:

    ALTER DATABASE YourDatabase SET QUERY_STORE = ON;
  • Real-Life Example: In an e-commerce system, a query like SELECT CustomerID, COUNT(*) FROM Orders WHERE OrderDate > '2025-01-01' GROUP BY CustomerID might spike CPU due to a table scan on millions of rows during a sale.

Pros: Query Store provides historical data; DMVs offer precise CPU metrics. Cons: Requires Query Store setup; high-frequency polling adds slight overhead. Business Impact: Identifying costly queries prevents checkout delays, reducing cart abandonment by 10-15% in retail.

Step 2: Analyze Execution Plans

Examine execution plans to identify CPU-intensive operations like sorts, hash joins, or table scans.

  • Generate Execution Plan: In SSMS, enable "Include Actual Execution Plan" (Ctrl+M) and run the suspect query. Look for operators like:

    • Sort: Indicates unindexed ORDER BY or GROUP BY.

    • Hash Match: Suggests inefficient joins.

    • Clustered Index Scan: Points to missing indexes.
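To find these operators without opening each plan by hand, the plan cache can be searched directly. A rough sketch (string-matching the plan XML is approximate but cheap):

```sql
-- Top CPU consumers whose cached plans contain a scan operator
SELECT TOP 10
    qs.total_worker_time / 1000 AS total_cpu_ms,
    st.text AS query_text,
    qp.query_plan
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE CAST(qp.query_plan AS NVARCHAR(MAX)) LIKE '%Index Scan%'
   OR CAST(qp.query_plan AS NVARCHAR(MAX)) LIKE '%Table Scan%'
ORDER BY qs.total_worker_time DESC;
```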

  • Example Fix: For the e-commerce query above, a table scan might occur due to no index on OrderDate. Create a non-clustered index:

    CREATE NONCLUSTERED INDEX IX_Orders_OrderDate 
    ON Orders(OrderDate) INCLUDE (CustomerID, OrderID);
  • Leverage 2025 IQP: Enable batch mode on rowstore for CPU-intensive aggregations:

    SELECT CustomerID, COUNT(*) AS OrderCount, SUM(Total) AS Revenue
    FROM Orders 
    WHERE OrderDate > '2025-01-01' 
    GROUP BY CustomerID
    OPTION (USE HINT('BATCH_MODE_ON_ROWSTORE'));

Pros: Indexes reduce CPU by 20-50%; 2025’s batch mode speeds up aggregations. Cons: Indexes increase storage and slow writes; batch mode may not apply to small datasets. Business Impact: Faster queries in finance apps ensure real-time trade execution, avoiding losses in volatile markets.

Step 3: Optimize Query Design

Rewrite inefficient queries to reduce CPU load.

  • Avoid SELECT *:

    -- Instead of:
    SELECT * FROM Orders WHERE CustomerID = 123;
    -- Use:
    SELECT OrderID, OrderDate, Total FROM Orders WHERE CustomerID = 123;
  • Use Parameterized Queries: Prevent plan cache bloat and improve CPU efficiency:

    EXEC sp_executesql 
        N'SELECT OrderID, OrderDate FROM Orders WHERE CustomerID = @CustID', 
        N'@CustID INT', 
        @CustID = 123;
  • Apply Query Hints: For parameter-sensitive queries, use 2025’s enhanced plan optimization:

    SELECT OrderID, OrderDate FROM Orders 
    WHERE CustomerID = @CustID 
    OPTION (RECOMPILE);

Pros: Reduces CPU by minimizing unnecessary data and optimizing plans. Cons: RECOMPILE increases compilation overhead; requires developer effort. Business Impact: In healthcare, optimized queries speed up patient record retrieval, ensuring compliance with response time SLAs.

Step 4: Tune Server Configuration

Misconfigured settings can exacerbate CPU usage.

  • Set Max Degree of Parallelism (MAXDOP): Limit parallelism for OLTP workloads:

    EXEC sp_configure 'show advanced options', 1;  -- MAXDOP is an advanced option
    RECONFIGURE;
    EXEC sp_configure 'max degree of parallelism', 4;  -- e.g., 4; cap at cores per NUMA node
    RECONFIGURE;
  • Adjust Cost Threshold for Parallelism: Increase to avoid parallelism on small queries:

    EXEC sp_configure 'cost threshold for parallelism', 50;
    RECONFIGURE;
  • Optimize Memory Settings: Ensure sufficient memory to avoid CPU-intensive spills:

    EXEC sp_configure 'max server memory (MB)', 8192;  -- e.g., 8GB for 16GB system
    RECONFIGURE;
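After any sp_configure change, it is worth verifying what the server is actually running with:

```sql
-- Confirm the values in effect (value_in_use reflects RECONFIGURE)
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name IN ('max degree of parallelism',
               'cost threshold for parallelism',
               'max server memory (MB)');
```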

Pros: Balances CPU load, reducing contention by 20-30%. Cons: Over-limiting parallelism slows analytical queries; requires testing. Business Impact: In logistics, proper MAXDOP settings ensure smooth shipment tracking, reducing delays and operational costs.

Step 5: Update Statistics and Indexes

Outdated statistics or fragmented indexes cause inefficient plans, spiking CPU.

  • Check Statistics:

    SELECT 
        OBJECT_NAME(s.object_id) AS TableName,
        s.name AS StatsName,
        STATS_DATE(s.object_id, s.stats_id) AS LastUpdated
    FROM sys.stats s
    WHERE STATS_DATE(s.object_id, s.stats_id) < DATEADD(DAY, -7, GETDATE());
  • Update Statistics:

    UPDATE STATISTICS Orders WITH FULLSCAN;
  • Rebuild Fragmented Indexes:

    SELECT 
        OBJECT_NAME(ips.object_id) AS TableName,
        i.name AS IndexName,
        ips.avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
    INNER JOIN sys.indexes i ON ips.object_id = i.object_id AND ips.index_id = i.index_id
    WHERE ips.avg_fragmentation_in_percent > 30;
    
    ALTER INDEX ALL ON Orders REBUILD WITH (ONLINE = ON);

Pros: Improves plan accuracy, cutting CPU by 15-40%; 2025 auto-stats reduce manual effort. Cons: FULLSCAN blocks briefly; rebuilds increase I/O temporarily. Business Impact: In manufacturing, updated stats speed up IoT sensor data queries, enabling predictive maintenance and saving 10% on downtime costs.

Step 6: Use Resource Governor

Control CPU allocation for different workloads.

  • Configure Resource Governor:

    -- Run in master; CREATE FUNCTION must be in its own batch
    USE master;
    GO
    CREATE RESOURCE POOL ReportingPool WITH (MAX_CPU_PERCENT = 30);
    CREATE WORKLOAD GROUP ReportingGroup USING ReportingPool;
    GO
    CREATE FUNCTION dbo.WorkloadClassifier()
    RETURNS sysname WITH SCHEMABINDING
    AS
    BEGIN
        RETURN CASE WHEN SUSER_NAME() = 'ReportUser' THEN 'ReportingGroup' ELSE 'default' END;
    END;
    GO
    ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.WorkloadClassifier);
    ALTER RESOURCE GOVERNOR RECONFIGURE;
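Once the classifier is active, pool-level CPU consumption can be checked via the Resource Governor DMVs to confirm the cap is working as intended:

```sql
-- Per-pool CPU consumption since pool statistics were last reset
SELECT name, total_cpu_usage_ms, used_memory_kb
FROM sys.dm_resource_governor_resource_pools;
```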

Pros: Limits CPU for non-critical workloads, ensuring OLTP performance. Cons: Setup complexity; requires user/role mapping. Business Impact: In telecom, isolating reporting queries ensures call data processing remains fast, boosting customer analytics efficiency.

Step 7: Monitor and Prevent Recurrence

  • Set Up Alerts: Use SQL Agent to alert on CPU thresholds:

    EXEC msdb.dbo.sp_add_alert 
        @name = N'High CPU Alert',
        @performance_condition = N'SQLServer:Resource Pool Stats|CPU usage %|default|>80',
        @enabled = 1;
    -- Notify an existing operator (replace 'DBA_Team' with your operator name):
    EXEC msdb.dbo.sp_add_notification 
        @alert_name = N'High CPU Alert',
        @operator_name = N'DBA_Team',
        @notification_method = 1;  -- 1 = e-mail
  • Use Extended Events: Track CPU-intensive events:

    CREATE EVENT SESSION HighCPUQueries ON SERVER
    ADD EVENT sqlserver.sql_statement_completed (
        WHERE cpu_time > 1000000  -- cpu_time is in microseconds; filter for > 1s CPU
    )
    ADD TARGET package0.event_file (SET filename = 'C:\Temp\HighCPU.xel');
    
    ALTER EVENT SESSION HighCPUQueries ON SERVER STATE = START;

Pros: Proactive alerts prevent outages; Extended Events provide detailed insights. Cons: Requires storage and analysis for event files. Business Impact: Continuous monitoring in SaaS platforms ensures uptime, reducing churn and maintaining SLAs.

Real-Life Scenarios and Business Impact

  • E-Commerce: During a Black Friday sale, a poorly indexed query spiked CPU to 90%, delaying checkouts. Adding indexes and setting MAXDOP reduced CPU to 50%, increasing completed orders by 25% and revenue by millions.

  • Finance: A trading platform saw CPU spikes during market surges. Optimizing queries with batch mode and Resource Governor ensured trades executed in milliseconds, avoiding losses.

  • Healthcare: High CPU from reporting queries delayed patient data access. Using snapshot isolation and updated stats cut CPU usage by 30%, ensuring HIPAA-compliant response times.

Pros and Cons of Approach

  • Pros: Comprehensive use of DMVs, Query Store, and 2025 features like IQP reduces CPU usage by 20-50%; scalable for high-transaction systems.

  • Cons: Requires expertise for DMV analysis and tuning; index maintenance adds overhead; Resource Governor setup is complex. Business Impact: Resolving CPU issues enhances user experience, reduces hardware costs, and supports data-driven decisions, critical for competitive industries.

Best Practices for Prevention

  • Schedule regular statistics updates:

    EXEC sp_updatestats;
  • Automate index maintenance with SQL Agent jobs:

    ALTER INDEX ALL ON YourTable REORGANIZE;
  • Monitor Query Store for plan regressions:

    SELECT * FROM sys.query_store_plan WHERE is_forced_plan = 1;
  • Review application logic for efficient query patterns, avoiding cursors and nested loops.
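As an illustration of the last point, a row-by-row cursor loop can usually be collapsed into a single set-based statement, giving the optimizer one plan instead of thousands of tiny ones (the Status column here is hypothetical):

```sql
-- Instead of a cursor updating Orders one row at a time,
-- archive old orders in one set-based statement:
UPDATE Orders
SET Status = 'Archived'   -- hypothetical column
WHERE OrderDate < DATEADD(YEAR, -1, GETDATE());
```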
