When you’re building an app that will be deployed in someone else’s environment, making it easy to troubleshoot and support is very important, even more so than when you are working on a SaaS product. Since your team won’t deploy the code into production or maintain it once it’s live, giving your customers the tools to evaluate what’s going on when something goes wrong helps them quickly identify and resolve any problems that arise.
In the three years I’ve been working at CyberArk, my team has developed methods for enhancing our supportability. We base our methods on a logs-first approach, as we focus on logs throughout the development cycle. Our logs-first process is a three-part cycle: before, during and after development. Throughout this blog post I will share the methods that we use to enhance supportability and will offer advice on how you can apply our strategies in your own ecosystem. To support this, I will add real examples from our work on a feature that we recently developed for authenticating Azure VMs with our product. These examples will demonstrate how the logs-first approach enhanced our supportability and productivity while working on this feature.
Before we write a line of code, we invest time in writing a solution design. This design not only includes diagrams and explanations of the actual solution, but also a section that is dedicated to logs. When we write the solution design, we try to think about which flows of the solution should introduce a new log message. We write the new log messages in a table, which will specify the scenario and the log message that will be printed. This table is our guideline for when we write the code.
When we decide on a log message, we think about what information will help us investigate the issue in the future. For example, when a request names an authenticator that is not enabled, logging the actual authenticator name is more informative than logging only “Authenticator is not enabled,” as the customer may be using the wrong authenticator, or may even have a typo in the authenticator name. Try to log as much data as you can, but do not forget to verify that no sensitive data is written in these log messages.
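To make this concrete, here is a minimal Python sketch of an informative log message that still keeps sensitive data out of the logs. The function name, the logger name, the parameters and the exact message wording are hypothetical and chosen only for illustration; they are not the product’s actual code.

```python
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("authn")


def check_authenticator_enabled(authenticator_name, token, enabled_authenticators):
    """Return True if the requested authenticator is enabled.

    On failure, log the name that was actually requested, so a wrong
    authenticator or a typo is visible in the logs right away.
    """
    if authenticator_name not in enabled_authenticators:
        # The authenticator name is safe to log and useful for troubleshooting.
        # The token is sensitive, so it is deliberately left out of the message.
        logger.error("Authenticator '%s' is not enabled", authenticator_name)
        return False
    return True
```

With a message like this, a request for `authn-azur/prod` when only `authn-azure/prod` is enabled shows the typo directly in the log line.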
When we write the test plan, we verify not only the expected outcome (success or failure) but also that the expected log message was written.
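A test that checks both the outcome and the log message could look roughly like the sketch below. It uses `assertLogs` from Python’s standard `unittest` module. The `authenticate` function, the logger name and the message text are assumptions made for this example, not our real test suite.

```python
import logging
import unittest

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("authn")


def authenticate(authenticator_name, enabled_authenticators):
    """Hypothetical stand-in for the code under test."""
    if authenticator_name not in enabled_authenticators:
        logger.error("Authenticator '%s' is not enabled", authenticator_name)
        return False
    return True


class AuthenticateTest(unittest.TestCase):
    def test_disabled_authenticator_fails_and_logs(self):
        # assertLogs fails the test if nothing is logged at ERROR or above.
        with self.assertLogs("authn", level="ERROR") as captured:
            result = authenticate("authn-azur/prod", {"authn-azure/prod"})

        # Verify the expected outcome...
        self.assertFalse(result)
        # ...and verify that the expected log message was written.
        self.assertIn(
            "Authenticator 'authn-azur/prod' is not enabled",
            captured.output[0],
        )
```

If someone later removes or rewords the log message, this test fails, so the message is protected the same way the behavior is.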
The key step for enhancing your logs during the development phase is not to use the debugger. Instead, use your product as if you were the customer who will deploy your application. This way, you will run into unexpected behavior yourself and see which log messages are missing, because those are the messages you need in order to keep developing. For example, when we parse the Azure token to extract the user’s Azure identity for authentication, we need to verify that all the required fields exist in the token. Although we could use the debugger to see what the token looks like at runtime and which fields it contains, we prefer to log the message “Verifying that field ‘{0}’ exists in token” in debug mode and use our logs to better understand our environment. This way we know what the next steps are, and we also provide a log message for the future developer or customer who is debugging the product.
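As a rough illustration of the token example, the Python sketch below logs a debug message before checking each required field. The field names, the function name, the logger name and the error message for a missing field are placeholders I made up for this sketch; only the debug message text comes from the example above.

```python
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("authn.azure")

# Placeholder field names; the real list of required fields is not shown here.
REQUIRED_FIELDS = ("oid", "xms_mirid")


def find_missing_fields(token_claims, required_fields=REQUIRED_FIELDS):
    """Return the required fields that are absent from the decoded token."""
    missing = []
    for field in required_fields:
        # Visible only in debug mode; shows how far the validation got.
        logger.debug("Verifying that field '%s' exists in token", field)
        if field not in token_claims:
            # Hypothetical error message for the failure case.
            logger.error("Field '%s' is missing from token", field)
            missing.append(field)
    return missing
```

Running the product with the log level set to debug then shows, field by field, what was checked, without attaching a debugger.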
Another concept that we use on the team is having one place in the code for all the errors and logs. This helps us maintain a common language in our messages, removes duplicate errors and logs, and makes it easier to track error codes.
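One possible shape for such a central place is sketched below in Python. The class names, the code format (`LOG001`, `ERR001`) and the helper that looks for duplicate codes are all assumptions for the sake of the example, not a description of how our code base is organized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A message template paired with a trackable code."""
    code: str
    template: str

    def render(self, *args):
        # Prefix the code so it can be searched for in logs and documentation.
        return f"{self.code} {self.template.format(*args)}"


# Every log and error message lives in one of these two catalogs.
# The codes below are hypothetical.
class Logs:
    VERIFYING_TOKEN_FIELD = Message(
        "LOG001", "Verifying that field '{0}' exists in token"
    )


class Errors:
    AUTHENTICATOR_NOT_ENABLED = Message(
        "ERR001", "Authenticator '{0}' is not enabled"
    )


def duplicate_codes(*catalogs):
    """Return the set of codes that appear more than once across catalogs."""
    seen, duplicates = set(), set()
    for catalog in catalogs:
        for value in vars(catalog).values():
            if isinstance(value, Message):
                if value.code in seen:
                    duplicates.add(value.code)
                seen.add(value.code)
    return duplicates
```

Calling code would then write something like `Errors.AUTHENTICATOR_NOT_ENABLED.render(name)` instead of an inline string, and a unit test can assert that `duplicate_codes(Logs, Errors)` is empty.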
After we write the code and know the tests are passing, we head back to the solution design and verify that we implemented all the log messages that are written in the Logs table. We also go over the new code – or even better, let someone else go over it – to find places where debug messages would help potential developers and customers. It is best to add an issue for this, so it doesn’t get forgotten.
Working with a logs-first approach will enhance your product’s supportability. It ensures that it’s easy to identify potential problems as they arise, which makes it easier to maintain. This applies to all software projects, but especially when your customer is the one deploying and maintaining the software. I encourage you to focus on your logs in your development process. Your customers will thank you.
If you can’t tell, I am passionate about development and log-driven design. Join me in the CyberArk Commons to continue the conversation!
Oren Ben-Meir is a Software Engineer at CyberArk. He is passionate about exploring new areas, both in code bases and around the world. Oren loves to pass on the knowledge and experience he acquired along the way and guide others in their own personal journey.