Andy Weir

Full Stack Developer

Andy is a Full Stack Developer here at Headforwards. An Agile practitioner with 10+ years' experience in software engineering, Andy now aspires to become a Technical Agile Coach. Andy enjoys sailing, surfing, and taking on a challenge, whether that is learning to snowboard, parachute jumping, indoor rock climbing, or running a 50K ultra marathon. Andy is also a prolific reader, which helps him relax.


The obvious approach to tackling frequent bugs might be to look at code quality, testing, legacy issues, or poor requirements. However, some other approaches could also help the situation. Headforwards’ Full Stack Developer, Andy Weir, shares three.

Spot and resolve issues before your customers with Monitoring & Observability. 

Monitoring is the instrumentation (metrics, events, logs and traces) that helps you understand your system’s health. It could take the form of low-level operational metrics like available disk space, memory, or CPU load, all the way to business metrics such as API response times or user engagement (such as sales or conversion rates). Your monitoring should be sufficient to alert you that there’s an issue. It could be something obvious, like a service outage or significant spike in errors, or more subtle, telling you that something is not quite right, such as a drop in conversions or slow API responses. 
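As a rough illustration of monitoring that alerts on a spike in errors, here is a minimal sketch in Python. The class name, the window size, and the 5% threshold are all assumptions made for the example, not values from any particular monitoring tool; in practice this job is usually done by a dedicated monitoring platform.

```python
from collections import deque


class ErrorRateMonitor:
    """Tracks the outcome of the most recent requests and flags a spike in errors.

    Hypothetical helper for illustration only; window_size and threshold
    are assumed values that a team would tune for its own system.
    """

    def __init__(self, window_size=100, threshold=0.05):
        # Each entry is True if the request failed, False if it succeeded.
        # The deque drops the oldest entry once window_size is reached.
        self.outcomes = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, failed):
        """Record the outcome of one request."""
        self.outcomes.append(bool(failed))

    def error_rate(self):
        """Share of failed requests in the current window (0.0 if empty)."""
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self):
        """True when the error rate in the window exceeds the threshold."""
        return self.error_rate() > self.threshold
```

A service would call `record()` after each request and check `should_alert()` periodically. The same pattern could be applied to subtler signals, such as slow API responses or a drop in conversions.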

Observability, on the other hand, measures how effectively that instrumentation allows you to look inside a system, understand why it behaves the way it does, and get to the root cause of an issue before it causes too much damage. Change failure rate (the proportion of changes to production that result in a failure) is an important metric, but the time it takes to recover is equally important. It’s impossible to prevent every bug from making it into production, but a system with good Observability will help reduce the time it takes to diagnose and fix issues.
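To make the two metrics concrete, the sketch below computes them from simple records. It assumes change failure rate is taken as the share of deployments that led to a failure in production, and time to recover as the gap between detecting and resolving an incident. The dictionary keys (`caused_failure`, `detected`, `resolved`) are hypothetical names chosen for the example.

```python
from datetime import datetime, timedelta


def change_failure_rate(deployments):
    """Fraction of deployments that led to a failure in production."""
    if not deployments:
        return 0.0
    failed = sum(1 for d in deployments if d["caused_failure"])
    return failed / len(deployments)


def mean_time_to_recover(incidents):
    """Average time between an incident being detected and being resolved."""
    if not incidents:
        return timedelta(0)
    total = sum(
        (i["resolved"] - i["detected"] for i in incidents),
        timedelta(0),  # start value so that timedeltas can be summed
    )
    return total / len(incidents)


# Made-up example data, for illustration only.
deployments = [
    {"caused_failure": False},
    {"caused_failure": True},
    {"caused_failure": False},
    {"caused_failure": False},
]
incidents = [
    {"detected": datetime(2024, 1, 1, 9, 0), "resolved": datetime(2024, 1, 1, 10, 0)},
    {"detected": datetime(2024, 1, 2, 14, 0), "resolved": datetime(2024, 1, 2, 14, 30)},
]

print(change_failure_rate(deployments))  # 0.25
print(mean_time_to_recover(incidents))   # 0:45:00
```

Tracking both numbers over time shows whether changes are becoming safer and whether the team is getting faster at diagnosing and fixing the failures that do slip through.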

Monitoring and Observability should:

  • Provide leading indicators of an outage or service degradation. 
  • Detect outages, service degradations, bugs, and unauthorised activity. 
  • Help debug outages, service degradations, bugs, and unauthorised activity. 
  • Identify long-term trends for capacity planning and business purposes. 
  • Expose unexpected side effects of changes or added functionality. 

Reduce risk by Streamlining Change Approval. 

A typical response to quality issues is to add more checks and balances to the change approval process. Counterintuitively, this is likely to have the opposite effect.

Change approval processes are designed to control operational and security risks. Change management involves obtaining approvals from external reviewers or change approval boards (CABs) to implement changes. Traditional heavy approval processes slow delivery, leading to infrequent releases of larger batches. Larger batches have a more significant impact on the production system, increasing risk and change failure rates. Instead of resorting to additional processes and heavier approvals, making smaller, faster, and safer changes is advisable. 

Teams can streamline change approval by: 

  • Transitioning to a peer-review-based process for code changes, supported by automated tests. 
  • Using automated tools to promptly identify regressions, performance problems, and security issues. 
  • Continuously analysing changes to flag high-risk ones for additional review. 
  • Implementing information security controls at the platform and infrastructure layer and in the development toolchain.
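One way to picture flagging high-risk changes for additional review is a simple rule applied to each change before it is merged. The sketch below is only an illustration: the `Change` type, the list of sensitive path prefixes, and the 400-line limit are all hypothetical and would need to be chosen by each team.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical areas of a codebase where a mistake is especially costly.
SENSITIVE_PREFIXES = ("auth/", "payments/", "migrations/")


@dataclass
class Change:
    files: List[str]      # paths touched by the change
    lines_changed: int    # total lines added plus removed


def needs_additional_review(change, max_lines=400):
    """Return True if the change is large or touches a sensitive area.

    max_lines is an assumed threshold, not a recommended value.
    """
    touches_sensitive = any(
        path.startswith(SENSITIVE_PREFIXES) for path in change.files
    )
    return touches_sensitive or change.lines_changed > max_lines


small_change = Change(files=["ui/button.py"], lines_changed=20)
risky_change = Change(files=["payments/refund.py"], lines_changed=20)

print(needs_additional_review(small_change))  # False
print(needs_additional_review(risky_change))  # True
```

Changes that pass the rule go through the normal peer review and automated tests, while only the flagged ones receive extra scrutiny, which keeps the heavier process for the cases that need it.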

Find bugs early with Deployment Automation and Testing. 

As discussed in the previous section, larger batches increase risk and change failure rates. Slow, unreliable, or heavily manual test processes also drive up batch size, because teams release less often when each release is expensive to test.

Automation is essential to reduce the risk of production deployments. It’s also essential to provide fast feedback on the quality of your software by allowing teams to do comprehensive testing as soon as possible after changes. 

The key to building quality into software is getting fast feedback on the impact of changes throughout the software delivery lifecycle. Traditionally, teams relied on manual testing and code inspection to verify systems’ correctness, typically in a separate phase after development was complete, which delayed that feedback.

Teams should consider: 

  • Performing all types of testing continuously throughout the software delivery lifecycle. 
  • Creating and curating fast, reliable suites of automated tests as part of your continuous delivery pipelines. 

Read how our approach to automated testing drove efficiencies for our big four client in this case study.

When addressing frequent bugs, consider Monitoring and Observability, the change approval process, and deployment automation and testing alongside the more obvious areas of code quality, testing, legacy issues, and poor requirements. By implementing these strategies, teams can proactively identify, diagnose, and resolve problems, ultimately improving the quality and reliability of their software.
