
About Rupert Wilson

So far Rupert Wilson has created 4 blog entries.

SQL SERVER SEVEN DAY CHALLENGE

#givingback over this testing time

Over the next seven days, from tomorrow, Wednesday 8th July, to Wednesday 15th July, I am open to ANYONE who wants to take up my invitation to use my knowledge and experience for help or advice on any SQL Server challenge, UNCONDITIONALLY.

For the next seven days I am ready to help anyone who has real, current SQL Server problems, issues or questions.

I’m intelligent enough to realise I may not resolve everything that comes my way, but I will endeavour to help you with all the resources I have.

So if you want to discuss, work together or just bounce ideas – send me a message.

Whether you are starting out in your career, an experienced, highly technical SQL expert, or a CTO wanting an unbiased perspective on your environment (current or future roadmap), contact me and let’s work together to make a difference in real-life scenarios.

Time is the most important commodity in life; once spent, you can never get it back. So while I am searching for my next role, I’m open to working with anyone out there who wants to make progress.

I can give you my time.

Up to a day per person/company (however many of the 24 hours it may require) AT NO COST.

I want to assist anyone who has real SQL Server challenges – NOT generic SQL Server training, workshops or advice – REAL-LIFE CHALLENGES you may be experiencing.

We can work either remotely, via video messaging, private messaging or email, or on-site – we can discuss the options.

Contact me directly via rupert@fourteen33.uk or LinkedIn and let’s work together via whatever method suits.

By Rupert Wilson|2020-07-07T21:05:07+01:00July 7th, 2020|SQL Server help|Comments Off on SQL SERVER SEVEN DAY CHALLENGE

Performance Tuning SQL Server – non-technical solution

Finding a resolution to a non-sargable query – it’s not always about your SQL Server technical astuteness

Passionate about and committed to solving SQL Server performance issues, I was eager to delve straight in when a resource-hungry SQL query became glaringly obvious within the database performance monitoring tool.

The T-SQL embedded within the application code was a non-sargable query:

WHERE customerid LIKE '%' + @customerid + '%'

  • It was reading tens of GBs of data
  • Taking on average 15 minutes to return results
  • Ran at least 20 times a day
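To illustrate the difference (table and column names here are my own, not from the original system): the leading wildcard stops SQL Server seeking on an index over customerid, so every row has to be read, while an equality predicate can seek straight to the matching rows.

```sql
-- Hypothetical example: dbo.Customers with a nonclustered index on customerid.

-- Non-sargable: the leading '%' means the optimiser cannot seek on the
-- index, so the whole index or table is scanned on every execution.
DECLARE @customerid varchar(20) = 'CUST-000123';
SELECT *
FROM dbo.Customers
WHERE customerid LIKE '%' + @customerid + '%';

-- Sargable: an equality predicate lets the optimiser seek directly to the
-- matching rows via the index on customerid.
SELECT *
FROM dbo.Customers
WHERE customerid = @customerid;
```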

Next Step

That was an easy analysis. I took off like a shot to discuss it with the application development team, eager to schedule time at the next stand-up for allocation within a forthcoming sprint. But nobody wanted to touch it!

Same old replies from Product Owners and developers alike:

  • “the customer service web interface is a legacy app”
  • “not my code, don’t know how it works, was written before I joined the company”

But the system was used by company employees 24/7 and there was no sign of the new system arriving.

Historically the system’s performance was labelled a database problem, not only by the whole Engineering department but throughout the whole company, from Accounts to Compliance, Marketing to the outsourced cleaners.

Everyone wanted to work on the latest technology, keen to develop new functionality, however large or small, for other services within the tech stack. In general, nobody had an appetite to re-architect the problem child, let alone try to assess and get to grips with the current code, which would then have their names associated with amendments in the corresponding GitHub repository.

Persistence and Change in Strategy

Well, I told myself:

  • IF I CAN’T FIX IT, I MUST FIX IT!

So it was time to be pragmatic:

  • TIME TO CHANGE MY TACTICS AND FIND ANOTHER WAY

Breakthrough

Time to schedule an informal meeting with the Customer Services Team Manager. Sit down at their desk, listen, observe and ask questions.

After users had experienced several years of pain and blame, without action from anyone within technology, there was a simple solution:

WHERE customerid = @customerid

  • Users never searched for part of a customerid
  • They cut and pasted the customerid field, in its entirety, every time from a prior search that brought up the customer details and this specific piece of information.

The fix was going to delight the users and managers:

  • increase employee productivity
  • change the business customers’ experience dramatically
  • reduce 150 man-hours/month of waiting on results.

This had me swaggering across the office like John Wayne with piles under my armpits (Bad joke alert)

Armed with this information, the Product Owners all wanted to assist now. They were fighting between themselves to take ownership. Wanting to assign a developer to make the small code change mid-sprint.

Developers were now ready to offer their services to test and deploy into Production as quickly as possible.

Can-do Attitude, Courage and Gaining Clarity

Having worked at large financial institutions supporting and developing critical profit-making systems as a database expert, you know that the blame for any sudden production issue, more often than not, initially comes heading towards the database owners.

DBAs are usually very risk averse and thus the first to investigate their platform when outages and degradation occur. Subsequently, if the issue isn’t database related, we often have enough information to assist other teams in finding and resolving the root cause.

Job satisfaction and reward, like life, is about giving
Take control, have a strategy and always give more to others than you ever expect to receive
Be grateful, and have grace for what you have learned and perhaps failed at in the past; it enables you to grow and utilise your experience to add value to others…

By Rupert Wilson|2020-06-30T22:50:30+01:00June 30th, 2020|Non-Sargable Query, Performance Tuning|Comments Off on Performance Tuning SQL Server – non-technical solution

Rolling Update AlwaysOn SQL 2014 SP2 CU5

My usual day-to-day work has been hectic since my first blog, but one of the tasks I have implemented over this period was a rolling update of the SQL Servers within an AlwaysOn Availability Group. SQL Server 2014 SP2 Cumulative Update 5 was required urgently due to a new API release that was going to take advantage of the read-only secondary replica instance.

The specific hotfix being: VSTS bug number 9278809; KB article no. 4013236

Description: FIX: Bad query plan created on secondary replicas after statistics are updated via the FULLSCAN option on the primary replica.

Due to the urgency of this requirement there wasn’t much time to prepare, but luckily I still had a run book from a previous AlwaysOn SQL update at the end of last year (albeit from when there were only two instances in the AlwaysOn automatic failover synchronous-commit setup).

When I have researched best practices, gotchas and step-by-step guides in the past, I have found you can easily end up with an overload of information, which can make the task even more daunting when you know the environment you are supporting is critical to the overall running of the business.

Therefore, one of the points of my post is that database changes, however large or small, should always be tested where possible, and a clear, easy-to-follow run book needs to be created. Taking the military adage of the 7 Ps…

Whenever there are database system changes everyone is aware and all eyes are on the DBA team. Thus, you are questioned first (rightly so) when end-users encounter any small deviations from the normal day-to-day running of your systems. Having a run book gives you, the DBA, room to respond to questions that may come your way mid-release. Even better if you have another DBA as your wingman, as they can fend off the questions, see where you are in the process and investigate accordingly when required.

With regard to the information overload you can find: my run book is written around Transact-SQL commands only, where possible. There are many articles that are very well written, with many different variations on how to do the task – i.e. SQL Server Management Studio wizards, PowerShell commands, Transact-SQL statements. For my benefit, I like to KISS!
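As a small example of that keep-it-simple, T-SQL-only approach, the current build on each replica can be confirmed before and after patching with a one-liner, no wizards needed:

```sql
-- Confirm the current build on each instance before and after applying the CU.
SELECT SERVERPROPERTY('ProductVersion') AS ProductVersion,  -- build number
       SERVERPROPERTY('ProductLevel')   AS ProductLevel,    -- e.g. SP2
       SERVERPROPERTY('Edition')        AS Edition;
```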

Disclaimer – some of the steps I have taken deviate from what you may expect to be part of best practice. This is due to good working knowledge of the environment, expectations on RTO/RPO, and a risk/reward call on performance during and after the implementation.

AlwaysOn Availability Group Configuration

[Diagram: AlwaysOn Availability Group – SQL Server 2014]

RUN BOOK

I also like to highlight each step within the document when I have completed it.

  • ON BOTH PRIMARY/SYNCHRONOUS SECONDARY REPLICA INSTANCES SR1 & SR2

exec msdb.dbo.sp_update_job @job_name = 'DBA: DatabaseMaintenance - UPDATE_STATISTICS', @enabled = 0;
exec msdb.dbo.sp_update_job @job_name = 'DBA: DatabaseBackup - USER_DATABASES_FULL', @enabled = 0;
GO

  • PRIMARY REPLICA SR1

BACKUP system databases
DIFFERENTIAL BACKUPS of user databases

Run for all databases:

EXECUTE dbo.DatabaseBackup @Databases = 'AlwaysOnDatabaseNames',
    @URL = 'https://BLOBNAME/diffbackups',
    @Credential = 'BackupCredential',
    @BackupType = 'DIFF',
    @Compress = 'Y',
    @Encrypt = 'Y',
    @EncryptionAlgorithm = 'AES_256',
    @ServerCertificate = 'BackupCertificate',
    @LogToTable = 'Y';
GO

EXECUTE dbo.DatabaseBackup @Databases = 'SYSTEM_DATABASES',
    @URL = 'https://BLOBNAME/diffbackups',
    @Credential = 'BackupCredential',
    @BackupType = 'FULL',
    @Compress = 'Y',
    @Encrypt = 'Y',
    @EncryptionAlgorithm = 'AES_256',
    @ServerCertificate = 'BackupCertificate',
    @LogToTable = 'Y';
GO

  • Check the availability group is healthy – see query at end of doc

  • 1) PRIMARY REPLICA SR1

Remove automatic failover on the synchronous-commit replicas, after double-checking the databases are failover ready on the primary:

ALTER AVAILABILITY GROUP AlwaysOnAG MODIFY REPLICA ON N'SR1'
   WITH (FAILOVER_MODE = MANUAL);
GO
ALTER AVAILABILITY GROUP AlwaysOnAG MODIFY REPLICA ON N'SR2'
   WITH (FAILOVER_MODE = MANUAL);
GO

  • 2) ASYNCHRONOUS SECONDARY REPLICA ASR3

RDP on to ASR3 Host:
RESTART ASR3
Run CU5 update from \Temp\SP2_CU5
RESTART ASR3
Check ErrorLog and Availability Group is still healthy (query at end of doc)
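For the error-log check, a quick T-SQL alternative to opening the log in SSMS is xp_readerrorlog; note it is an undocumented procedure, so treat this as a convenience rather than a supported interface:

```sql
-- Read the current SQL Server error log (first arg: 0 = current log;
-- second arg: 1 = SQL Server log) filtering for lines containing 'error'.
EXEC master.dbo.xp_readerrorlog 0, 1, N'error';
```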

  • 3) SYNCHRONOUS SECONDARY REPLICA SR2

RDP on to SR2 Host:
Manually fail over to the secondary replica:

ALTER AVAILABILITY GROUP AlwaysOnAG FAILOVER;

Check ErrorLog and Availability Group is still healthy (query at end of doc)
Run update statistics with full scan on SR2 for the database/transaction table:

UPDATE STATISTICS Database.dbo.Table WITH FULLSCAN;

  • 4) ORIGINAL SYNCHRONOUS PRIMARY REPLICA SR1

RDP on to SR1 Host
RESTART SR1 Host
Run CU5 update from \Temp\SP2_CU5
RESTART SR1
RDP again on to SR1
CHECK AG health (query and SSMS GUI)

  • 5) NEW SYNCHRONOUS PRIMARY REPLICA SR2

Configure the automatic failover partners again on the new primary:

USE master
GO
ALTER AVAILABILITY GROUP AlwaysOnAG MODIFY REPLICA ON N'SR2'
   WITH (FAILOVER_MODE = AUTOMATIC);
GO
ALTER AVAILABILITY GROUP AlwaysOnAG MODIFY REPLICA ON N'SR1'
   WITH (FAILOVER_MODE = AUTOMATIC);
GO

  • ON BOTH PRIMARY/SYNCHRONOUS SECONDARY REPLICA INSTANCES SR1 & SR2

exec msdb.dbo.sp_update_job @job_name = 'DBA: DatabaseMaintenance - UPDATE_STATISTICS', @enabled = 1;
exec msdb.dbo.sp_update_job @job_name = 'DBA: DatabaseBackup - USER_DATABASES_FULL', @enabled = 1;
GO

Run the above jobs if their schedule was due while the instances were being upgraded.
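If either schedule was missed during the upgrade window, the re-enabled jobs can be kicked off manually, for example:

```sql
-- Manually start an Agent job whose schedule elapsed during the upgrade.
EXEC msdb.dbo.sp_start_job @job_name = 'DBA: DatabaseBackup - USER_DATABASES_FULL';
```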

New AlwaysOn Availability Group Configuration

[Diagram: AlwaysOn Availability Group – SQL Server 2014 SP2 CU5]

Check health via T-SQL:

SELECT
    dg.primary_replica,
    dg.primary_recovery_health,
    dg.primary_recovery_health_desc,
    dg.synchronization_health,
    dg.synchronization_health_desc,
    dg.secondary_recovery_health,
    dg.secondary_recovery_health_desc,
    da.replica_server_name,
    da.join_state,
    da.join_state_desc,
    dr.database_name,
    dr.is_failover_ready,
    dr.is_pending_secondary_suspend,
    dr.is_database_joined,
    dr.recovery_lsn,
    dr.truncation_lsn
FROM sys.dm_hadr_database_replica_cluster_states dr
INNER JOIN sys.dm_hadr_availability_replica_cluster_states da
    ON dr.replica_id = da.replica_id
INNER JOIN sys.dm_hadr_availability_group_states dg
    ON da.group_id = dg.group_id;

By Rupert Wilson|2017-06-13T10:49:21+01:00June 13th, 2017|AlwaysOn, Availability Groups|Comments Off on Rolling Update AlwaysOn SQL 2014 SP2 CU5

SQL Server failed transactions – HADR_SYNC_COMMIT

I have finally decided it is about time I started sharing what I learn working with SQL Server day in, day out. Over the years I have used other people’s posts to help me solve the issues I have faced, and I feel it is about time I attempted to post my own experience. I have not been one to over-analyse issues or solutions throughout my career; it has been more a case of get it done and move on to the next task. With this ethos I am going to keep my posts as brief as possible and only detail what works/worked for me at the time, based on the specific environment I am/was using.

In this instance I am going to discuss how I resolved a high HADR_SYNC_COMMIT wait type within an Always On Availability Group setup.

So here goes with my first post:

Background to High HADR_SYNC_COMMIT wait type

I was recently involved in an issue where, all of a sudden, the environment I supported was experiencing failed or delayed customer transactions. This was causing pain not only to the technology teams but to the customer support teams and customers alike.

SQL Server environment

The SQL Server High Availability (HA) configuration is made up of two SQL Server 2014 instances, a primary and a secondary node, that use Availability Group (AlwaysOn) technology set up in synchronous-commit mode to manage failover with no data loss.

Issue in brief

Normal transactions were queueing considerably while the index maintenance was running, to the extent that they were timing out due to the duration limits specified by the applications.

Straight away I could see, looking at the SQL Servers in the SQL Sentry dashboard, that the HADR_SYNC_COMMIT wait type was consuming the largest percentage of SQL Server time/resources during maintenance. I didn’t need much more evidence to determine that this was the culprit holding back normal business transactions from committing, ultimately stopping customers from completing their online transactions.

[Screenshot: SQL Sentry Client Dashboard – SQL Server waits, with HADR_SYNC_COMMIT represented in purple]
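Outside of SQL Sentry, the same culprit can be confirmed from the server-wide wait statistics DMV; a minimal sketch:

```sql
-- Top waits by cumulative wait time since the last restart/stats clear.
-- HADR_SYNC_COMMIT near the top during maintenance points at
-- synchronous-commit replication as the bottleneck.
SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;
```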

Solution

The SQL Server Agent maintenance job was using Ola Hallengren’s scripts with the following parameters:

@FragmentationLow = NULL,
@FragmentationMedium = 'INDEX_REBUILD_ONLINE',
@FragmentationHigh = 'INDEX_REBUILD_ONLINE',
@FragmentationLevel1 = 10,
@FragmentationLevel2 = 30,
@PageCountLevel = 1000,
@WaitAtLowPriorityMaxDuration = 3,
@WaitAtLowPriorityAbortAfterWait = 'SELF',
@MSShippedObjects = 'Y',
@LogToTable = 'Y', @SortInTempdb = 'Y'

--Change: Add parameters:
@MaxDOP = 1, @Delay = 15
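Putting those parameters together, the amended maintenance command would look something like the following sketch of Ola Hallengren’s IndexOptimize procedure; the @Databases selection is my assumption, not taken from the original job:

```sql
-- Sketch of the amended maintenance call (database selection assumed).
EXECUTE dbo.IndexOptimize
    @Databases = 'USER_DATABASES',
    @FragmentationLow = NULL,
    @FragmentationMedium = 'INDEX_REBUILD_ONLINE',
    @FragmentationHigh = 'INDEX_REBUILD_ONLINE',
    @FragmentationLevel1 = 10,
    @FragmentationLevel2 = 30,
    @PageCountLevel = 1000,
    @WaitAtLowPriorityMaxDuration = 3,
    @WaitAtLowPriorityAbortAfterWait = 'SELF',
    @MSShippedObjects = 'Y',
    @LogToTable = 'Y',
    @SortInTempdb = 'Y',
    @MaxDOP = 1,   -- throttle the CPUs used by each rebuild
    @Delay = 15;   -- seconds of pause between index commands
```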

 

Having made the change to throttle down the CPUs used, the rebuilding of indexes took longer to complete (approximately 3-4 times longer) but eradicated the blocking and locking. The delay parameter was added just to give SQL Server a bit of respite between index rebuilds.

So now when the maintenance runs, the AG is not using up as many resources and the primary replica SQL instance has more resources available to commit normal transactions in a timely, as-expected manner.

[Screenshot: SQL Sentry Client Dashboard – SQL Server waits; the HADR_SYNC_COMMIT waits (purple) have gone]

Pre and Post Change Results

When I opened the monitoring dashboard within SQL Sentry, it was immediately apparent that the wait types and queues relating to AGs had reduced drastically from before I made the change; the HADR_SYNC_COMMIT waits are non-existent post-change.

Pre Change SentryOne Report snapshot

[Screenshots: Primary node – KB to Replica/sec, Log Send Queue KB (HADR_SYNC_COMMIT); Secondary node – KB from Replica/sec, Recovery Queue KB (HADR_SYNC_COMMIT)]

Post Change SentryOne Report snapshot

[Screenshots: Primary node – KB to Replica/sec, Log Send Queue KB; Secondary node – KB from Replica/sec, Recovery Queue KB]

The AG replica throughput and recovery queues are now lower overall, which has enabled usual business transactions to commit as expected, without issue, while the maintenance is running.

 

Maximum values observed:

              PRIMARY node                            SECONDARY node
              KB to Replica/sec   Log Send Queue KB   KB from Replica/sec   Recovery Queue KB
Post change
  Day 4       ~20,000             ~120                ~10,000               ~4,000
  Day 3       ~1,200              ~60                 ~600                  ~120
  Day 2       ~11,000             ~60                 ~6,000                ~4,000
  Day 1       ~6,000              ~120                ~3,000                ~4,000
Pre-change    ~40,000             ~600,000            ~25,000               ~1,500,000

Going Forward

The maintenance was monitored over several days and the issue never occurred again.

If required, the next step would have been to change synchronous-commit mode to asynchronous-commit mode before running the maintenance, changing it back afterwards. I didn’t fancy over-engineering the solution and putting the system in a different HA mode, but this was discussed with Microsoft product support specialists and confirmed as a legitimate step to take if required…
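For completeness, that fallback would have been a pair of commands run against the availability group before and after the maintenance window; a sketch, where the AG and replica names are assumptions of mine rather than taken from this environment:

```sql
-- Before maintenance: relax the secondary to asynchronous commit so the
-- primary no longer waits on log hardening at the secondary. Automatic
-- failover requires synchronous commit, so it must drop to MANUAL too.
ALTER AVAILABILITY GROUP AlwaysOnAG
    MODIFY REPLICA ON N'SR2'
    WITH (AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT, FAILOVER_MODE = MANUAL);

-- After maintenance: restore synchronous commit and automatic failover.
ALTER AVAILABILITY GROUP AlwaysOnAG
    MODIFY REPLICA ON N'SR2'
    WITH (AVAILABILITY_MODE = SYNCHRONOUS_COMMIT, FAILOVER_MODE = AUTOMATIC);
```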

By Rupert Wilson|2017-06-13T09:47:37+01:00May 17th, 2017|Availability Groups|1 Comment

I can become your DBA, giving you the peace of mind of knowing you have the level of expertise needed to resolve any critical, business-crippling issues related to SQL Server, while costing less than a full-time junior DBA.

Contact

Rupert Wilson – rupert@fourteen33.uk

Fourteen33 - SQL Server DBA Services