Within the last decade or so, the advancement of distributed systems has introduced new complexities in managing log data. Today’s systems can include servers, workstations, firewalls, databases, and more, all running different operating systems and each generating its own log data.
This log management best practices guide will help you get the most value out of your log data, covering how to get started with a log management program, what you should and should not monitor, and how to perform real-time monitoring.
What is Log Management?
Log management is the practice of collecting, formatting, aggregating, and analyzing all of that log data. Log data must be collected, stored, analyzed, and monitored to meet and report on regulatory and security compliance standards such as Sarbanes-Oxley (SOX), Basel II, HIPAA, GLB, FISMA, and PCI-DSS, as well as to support security, forensics, and development work.
By deploying an event and log management solution, you can manage the overwhelming amount of log information generated by all of these systems. Many organizations today generate huge amounts of log data and need to handle it in an orderly way. That is what log management is all about, in a nutshell: handling huge volumes of logs through a comprehensive approach spanning several processes, including log collection, aggregation, parsing, analysis, search, and reporting.
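To make the parsing step concrete, here is a minimal sketch of turning a raw log line into structured fields. The line layout and field names here are illustrative assumptions, since every system emits its own format:

```python
import re

# A hypothetical "timestamp host severity message" line layout; real systems
# each have their own format and would need their own pattern.
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\S+ \S+) (?P<host>\S+) (?P<severity>\w+) (?P<message>.*)"
)

def parse_line(line):
    """Parse one log line into a dict of fields, or None if it doesn't match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None

record = parse_line("2024-01-15 10:32:01 web01 ERROR disk quota exceeded")
# record["host"] == "web01", record["severity"] == "ERROR"
```

Once lines are structured like this, the later steps (aggregation, search, correlation) can operate on named fields instead of raw strings.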
Step 1: Start with a Strategy
The first and most important step when getting started with log management is to set a strategy. Don't start logging "just because", hoping somehow, down the line, your organization will profit. That's not how it works.
Think long and hard about what you want to log and why. Understand the value that you want to extract from your logs because this will guide most other decisions from now on.
Step 2: Know What Logs to Monitor, and What Not to Monitor
Know what not to log. Just because you can log something doesn't mean you should: logging too much data can make it harder to find the data that you need and that actually matters. It also adds complexity to your log storage and management processes, because it gives you more logs to manage, which can run into hundreds of terabytes in large organizations.
Thus, consider carefully what you actually need to log. Any production-environment data that is critical for compliance or auditing purposes should certainly be logged. So should data that helps you troubleshoot performance problems, solve user-experience issues, or monitor security-related events.
On the other hand, there are categories of data that you do not need to log, such as data from test environments that are not an essential part of your software delivery pipeline. There are also some kinds of data that you should not log for compliance or security reasons. For example, if a user has enabled a do-not-track setting, you should not log data associated with that user. Similarly, you should avoid logging highly sensitive data, such as credit card numbers, unless you are certain that your logging and storage processes meet the security requirements for that data. Some systems, such as NNT’s Log Tracker, can automatically redact that information.
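As a rough illustration of redaction, the sketch below masks card-like digit runs before a line is stored. This is a simple regex pass for illustration only, not how any particular product (such as NNT Log Tracker) implements redaction, and real deployments need more robust detection:

```python
import re

# Match runs of 13-16 digits, optionally separated by spaces or hyphens,
# which is the shape of most payment card numbers. Illustrative only.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def redact(line):
    """Replace card-like digit runs with a fixed placeholder before storage."""
    return CARD_PATTERN.sub("[REDACTED]", line)

print(redact("payment failed for card 4111 1111 1111 1111"))
# prints: payment failed for card [REDACTED]
```

Running redaction at collection time, before logs leave the host, means the sensitive values never reach your central store at all.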
A great resource to help you determine the correct logging levels is the Center for Internet Security (CIS). The CIS Benchmarks provide a wealth of information on logging and audit settings, along with a suggested logging and audit configuration for your systems.
Step 3: Separate and Centralize Your Log Data
Logs should always be automatically collected and shipped to a centralized location, separate from your production environment. Consolidating log data facilitates organized management and enriches analysis, enabling you to efficiently run cross-analyses and identify correlations between different data sources. Centralizing log data also mitigates the risk of losing it when logs are cleared to cover the tracks of nefarious user activity.
Forwarding log data to a centralized location enables system administrators to grant developer, QA, security, and support teams access to log data without giving them access to production environments. As a result, these teams can use log data to debug issues without risk of impacting the environment. Replicating and isolating log data also eliminates the risk of attackers deleting it in an effort to hide security breaches. Even if your system is compromised, your logs remain intact, protected, and secure.
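One common way to ship logs off-host is the syslog protocol, which Python's standard library supports directly. The sketch below sends each record to a collector while keeping a local copy; the collector address (127.0.0.1:514) is a placeholder you would replace with your central log server or forwarder:

```python
import logging
import logging.handlers

logger = logging.getLogger("app")
logger.setLevel(logging.INFO)

# Ship a copy of every record to a syslog collector over UDP.
# 127.0.0.1:514 is a placeholder for your central log server.
syslog = logging.handlers.SysLogHandler(address=("127.0.0.1", 514))
logger.addHandler(syslog)

# Keep a local copy as well, so on-host debugging still works.
local = logging.StreamHandler()
logger.addHandler(local)

logger.info("user alice logged in from 10.0.0.5")
```

Because the same record fans out to both handlers, losing the production host does not mean losing its logs.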
Step 4: Correlate Data Sources
End-to-end logging into a centralized location allows you to dynamically aggregate streams of data from different sources, such as applications, servers, users, and firewalls, and correlate key trends and metrics across them. Correlating data enables you to quickly and confidently identify and understand events that are causing system malfunctions or security breaches. For example, spotting a real-time correlation among many failed logons within a short period can indicate a possible brute-force attack, and the log server can email alerts to the appropriate personnel to investigate.
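The failed-logon example can be sketched as a sliding-window counter: flag a source once it produces too many failures inside the window. The threshold and window size below are illustrative choices, not recommendations:

```python
from collections import deque, defaultdict

WINDOW_SECONDS = 60  # illustrative window size
THRESHOLD = 5        # illustrative failure count before alerting

failures = defaultdict(deque)  # source IP -> timestamps of recent failures

def record_failed_logon(source_ip, timestamp):
    """Record one failure; return True if the source looks like brute force."""
    window = failures[source_ip]
    window.append(timestamp)
    # Drop failures that have aged out of the sliding window.
    while window and timestamp - window[0] > WINDOW_SECONDS:
        window.popleft()
    return len(window) >= THRESHOLD

# Five failures ten seconds apart: the fifth one trips the threshold.
alerts = [record_failed_logon("203.0.113.9", t) for t in range(0, 50, 10)]
```

A real correlation engine would also join these events against other sources (firewall denies, account lockouts) before alerting, but the windowing idea is the same.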
Step 5: Perform Real-Time Monitoring
Service disruptions can lead to a host of unfortunate outcomes, including unhappy customers, lost purchases and missing data. When production-level issues arise, a real-time monitoring solution can be crucial when every second counts.
Beyond simple notifications, the ability to investigate issues and identify important information in real time is just as important. Having “live tail” visibility into your log data, along with the ability to search and retrieve that information, empowers teams to troubleshoot, analyze, and debug, and even that only scratches the surface of what log data has to offer. Whereas logs were once considered a painful last resort for finding information, today’s logging services can empower everyone to identify useful trends and form key insights from their applications, systems, and networks.
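At its core, a “live tail” is just a reader that follows a file and surfaces lines as they are appended, much like `tail -f`. A minimal sketch, with an illustrative polling interval:

```python
import time

def follow(path, poll_interval=0.5):
    """Yield new lines appended to the file at `path`, starting from its end."""
    with open(path) as f:
        f.seek(0, 2)  # jump to end of file, so only new lines are yielded
        while True:
            line = f.readline()
            if line:
                yield line.rstrip("\n")
            else:
                # No new data yet; wait briefly before polling again.
                time.sleep(poll_interval)
```

In a logging service, each yielded line would typically be run through search filters or alert rules before being displayed to the team doing the troubleshooting.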
Treating log events as data creates opportunities to apply statistical analysis to user events and system activities. Calculating average values helps you to better identify anomalous values. Grouping event types and summing values enables you to compare events over time. This level of insight opens the door to make better informed business decisions based on data often unavailable outside of logs. A log management & analytics service like NNT Log Tracker can enable all your teams to benefit from this log data.
Logging is essential for any modern organization. However, simply creating log entries and leaving them alone wastes their potential at best. In the worst case, a lack of proper log rotation could bring down your servers.
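Log rotation is a solved problem in most logging stacks; for instance, Python's standard library caps disk usage by rolling files over at a size limit and keeping a fixed number of backups. The sizes below are deliberately tiny for illustration:

```python
import logging
import logging.handlers
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "app.log")

# Roll the file over once it exceeds maxBytes, keeping at most
# backupCount old files, so total disk usage stays bounded.
handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=1024, backupCount=3
)
logger = logging.getLogger("rotated")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

for i in range(200):
    logger.info("event %d: some log line payload", i)

# At most 4 files now exist: app.log plus up to 3 rotated backups.
```

Without a bound like this, a chatty or misbehaving service can fill the disk and take the whole server down with it.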
Proper log management does more than improve your troubleshooting and security processes, though that alone is already of great value. It can extract knowledge otherwise hidden in your logs, allow you to make sound decisions on security issues faster, and help you prevent problems before they actually occur.
The next step for you should be to familiarize yourself with the tools at your disposal, so you can start implementing your log management strategy ASAP.
For more information, read NNT CTO Mark Kedgley’s article, Fear and Loathing of Firewall and SIEM Log Savers. Don't Save Everything.