Static Code Analysis at Core Informatics

If you are a software engineer and not familiar with static code analysis, it’s a worthwhile investment to learn more about its purpose and value to your profession. Several terms point to the same concept: code analysis, static program analysis, static code analysis (SCA) and static code analysis tools (SCATs). The Latin root of SCA tools, so to speak, is Lint, the C-based program that originated in the late 1970s. Its popularity grew out of a simple idea: just because you can do something in C does not mean you should. For any language today, that message still rings true.

Static Analysis Tools

Static analysis of your source code and bytecode goes beyond just successfully compiling your code. Generally, static analysis tools are helpful because they check for formatting and best-practice conventions. More importantly, they find bad practices. Today, in our continuous delivery pipelines, we combine these findings with a dashboard that collects code size, test coverage, dependency reports and version control activity, to name a few metrics. When you add these metrics to a database and track the changes over time, you have a very effective feedback loop for your team on the quality trends of your application.

These tools should not be confused with runtime analysis tools such as profilers and runtime performance tools. Those also play an important feedback role for your team, but they monitor quality at runtime rather than at compile time.

There are many types of tools used for static code analysis, and they vary in popularity based on the language and development ecosystem. Here is an often referenced list of common SCA tools used in various development domains: SCA tools.

Static code analysis tools check hundreds of rules and gather metrics for your entire codebase to help you identify the hotspots that may require attention. Typical checks and metrics include:

  • Security violations
  • Memory leaks
  • Exception handling
  • Formatting and coding conventions
  • Language best practices
  • Coupling
  • Lack of Cohesion of methods (LCOM)
  • Percentage of duplicate code
  • Percentage of commented code
  • Complexity of classes in terms of method calls
  • Cyclomatic complexity
  • Source lines of code (SLOC)
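
To give a feel for how one of these metrics can be derived, here is a minimal sketch that approximates cyclomatic complexity using Python's standard `ast` module. It is an illustration under simplifying assumptions, not the algorithm of any particular SCA tool: the function names and the sample source are hypothetical, and it simply counts one path plus one for each decision point it finds.

```python
import ast
from typing import Dict

# Node types that add a decision point (one extra path) to a function.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp,
                 ast.ExceptHandler, ast.comprehension)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Approximate complexity: 1 + number of decision points in func."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two short-circuit branches.
            score += len(node.values) - 1
    return score


def analyze(source: str) -> Dict[str, int]:
    """Map each function name in source to its approximate complexity."""
    tree = ast.parse(source)  # the source is parsed, never executed
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


# Hypothetical sample input for the sketch.
SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and n > 2:
            return "composite"
    return "other"

def identity(x):
    return x
'''

for name, score in sorted(analyze(SAMPLE).items()):
    print(f"{name}: {score}")
```

Note that the code under inspection is only parsed, never run, which is what makes the analysis static. A production tool handles many more constructs (nested functions, for instance, are counted toward their enclosing function here), so treat the numbers as a rough guide.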

As a deeper example, here is a set of rules defined by the popular FindBugs tool. A powerful notion is that these rules were all debated and developed by software engineers from lessons learned, defects and antipatterns. By leveraging these rules, you are standing on the shoulders of those teams without having to be an expert in the details and nuances of each rule. Besides, we are all busy delivering great new features to our customers.
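
To make the idea of a rule concrete, here is a minimal sketch of one, written in Python with the standard `ast` module. It flags `except` blocks that silently swallow an exception. This is an illustration only: the function name and sample are hypothetical, and it is not how FindBugs (which inspects Java bytecode) implements its rules.

```python
import ast
from typing import List


def find_swallowed_exceptions(source: str) -> List[int]:
    """Return line numbers of except blocks whose body is only `pass`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler):
            if all(isinstance(stmt, ast.Pass) for stmt in node.body):
                findings.append(node.lineno)
    return sorted(findings)


# Hypothetical sample input; it is parsed, not executed, so the
# undefined names `risky` and `log` do not matter.
SAMPLE = """\
try:
    risky()
except ValueError:
    pass
try:
    risky()
except KeyError as err:
    log(err)
"""

for lineno in find_swallowed_exceptions(SAMPLE):
    print(f"line {lineno}: exception is caught and ignored")
```

Real rule sets bundle hundreds of checks like this one, each refined over years of reported defects, which is exactly why reusing them beats writing your own.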

Continuous Integration Pipelines

At Core Informatics we develop tightly integrated, state-of-the-art Laboratory Information Management Systems (LIMS), Electronic Laboratory Notebooks (ELN), Scientific Data Management Software (SDMS) and collaboration solutions for customers across multiple industries. All products and applications are built on our Platform for Science, a web-based informatics platform which can be configured to meet customer needs without custom coding.

Our development team leverages SCA tools as part of our Continuous Integration (CI) pipelines. Currently, we rely on SonarQube to aggregate this information for the team through a series of drill-down dashboards. We have a strong interest in Continuous Delivery (CD) and in delivering high-quality software as we move to more rapid deployments.

If you find your team documenting and debating, or simply ignoring, complex and outdated coding guidelines, consider these tools. They offer much of the same guidance and add governance through your CI process.

Apply Advice from Your SCA Before Your Code Review

If you do team code reviews, encourage your members to run SCA on their changes before the review. If you review your notes from past reviews, you will see that most of the discussion points are covered by a good set of SCA rules. Also, people-based code reviews often miss memory leaks or concurrency issues because the combinations of contexts involved are not always obvious. SCA tools are much more inclined to consider the wider context for each rule violation. Consider the monotonous and fallible mental exercise of calculating all the execution paths for a null variable passed through your method chains.

Before code reviews, consider applying a consistent code beautifier to your source files. This will reduce the low-level banter in the reviews and even reduce the technical debt often reported by tools such as Checkstyle. With this, your code reviews will be more focused on the requirements and design rather than the mechanics.

Make SCA a first-class citizen of your CI process. Ensure the analysis runs continuously and that developers have access to the trends, reports and dashboards at any time. Have a team dashboard that tracks these statistics over time. Big monitors have come down in price, so mount one in a public place where all can browse the reports. This will help your transparency, and as your technical debt numbers come down, a sense of pride will rise in the team.

Adding Code Analysis to Your CI Pipeline

As you add code analysis to your CI pipeline, keep in mind that most developers would like to discover and fix their rule violations before submitting their work to the pipeline and code reviews. Professional integrated development environments (IDEs) have built-in code analysis functions. Often these are packaged as plugins for the languages you wish to inspect. One particularly time-saving IDE feature is the ability not only to find problems but to fix them on the spot. Try to refrain from relying heavily on the rule set established by the IDE; instead, ensure the rules are IDE-agnostic. At Core Informatics, we allow developers to use whatever IDE they are most productive with and encourage them to hook into the team-agreed rule set on our SonarQube server.

If your team disagrees on the application of rules, use it as a way to encourage team collaboration. If you still can't decide, put the debatable points into a team survey using one of the various open survey tools available. You will often find that team members are more interested in consensus than in splitting hairs. New developers will appreciate the unbiased expert advice coming out of these SCA reports, while more seasoned developers will appreciate discovering new ways to improve and egoless reminders of what they may have been doing wrong for quite some time.

There is a note of caution. SCA reports can produce false positives and false negatives. No tool is perfect, and the complex heuristics we sometimes create are still beyond the scope of many of these tools. Take the advice reported by these tools with a grain of salt. The reports are there to help you and should not be looked at as a grading or merit system. Developers with pedantic tendencies can get stressed by the number of rules violated and may try to achieve a zero level of tolerance. Doing so misses the point of the tool.
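
One way to keep the reports helpful without chasing zero violations is to watch the trend instead of the absolute count. The sketch below is an assumption about how such a check could look, not a description of the Core Informatics pipeline or of any SonarQube feature. The rule names and counts are hypothetical, and the function simply lists the rules whose violation count grew since a stored baseline.

```python
from typing import Dict, List


def regressions(baseline: Dict[str, int], current: Dict[str, int]) -> List[str]:
    """List the rules whose violation count grew compared with the baseline."""
    messages = []
    for rule in sorted(current):
        before = baseline.get(rule, 0)  # a rule unseen before starts at zero
        after = current[rule]
        if after > before:
            messages.append(f"{rule}: {before} -> {after}")
    return messages


# Hypothetical counts, e.g. taken from whatever report your SCA tool produces.
BASELINE = {"unused-import": 12, "null-dereference": 3, "empty-catch": 5}
LATEST = {"unused-import": 9, "null-dereference": 4, "empty-catch": 5}

for line in regressions(BASELINE, LATEST):
    print("regression:", line)
```

A CI step built on this idea could fail the build only when the list is non-empty, so existing debt is tolerated while new debt gets attention. Known false positives would still need a way to be excluded from the counts.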

SCA and Open Source Projects

What about SCA against open source projects? I have seen people blog about how to make the right technology decisions when evaluating third-party projects. They often list documentation, community support, maintenance activity and community familiarity, yet never mention SCA. Why would you ever consider a third-party library with high technical debt? SCA tools are a way to gather an unbiased consensus on the maturity level of any open source framework or API.

Much has improved since the C Lint days: the analysis tools have grown in sophistication, in the range of languages they cover and in their associated rules. This analysis data is now aggregated in databases for your community, and the added dimension of tracking analysis changes over time gives deeper insight into how your application is evolving.

Integrate SCA into Your Team Culture

However, with all the advancement since the Lint days, a fundamental challenge remains: getting software engineers in the habit of not just unit testing, but also running their edits through a set of SCA tools.

Enlighten your software team on how to improve the applications you produce. Add these tools to your CI pipelines. Encourage developers to review their code right within their favorite IDE. Encourage your team to share with others what the code analysis tools found and how you all may improve your quality.

I would enjoy learning more about your experience with code analysis tools. If you want to know more about how we do it at Core Informatics, please join our workshop this Thursday, February 25th, 2016. Event details are here.

Jonathan Johnson is the Director of Platform Architecture at Core Informatics. Before joining Core he was a leading architect for 454 Life Sciences/Roche Diagnostics’ DNA Sequencing Instruments. His passion for software in laboratory and scientific fields took root back in the 80’s while interning at Spectrogram Corporation, one of the first commercial producers of LIMS software.