Apache Kafka: Open Source Streaming Data Lake for Threat Detection
Customers would normally need to deploy proprietary software with a database and/or data lake to hold the large volumes of data required for threat detection, such as inside a SIEM. This quickly becomes extremely expensive, and you are locked into a vendor.
Apache Kafka provides a platform to deploy your own threat detection data lake.
There are two ways to get data in: 1) send data directly in CEF format, or 2) normalise raw logs inside Transformation Hub (TH). Either way, the data source must be able to produce to Kafka itself, or another tool must do it on its behalf. Transformation Hub sits there passively, waiting for producers (e.g. a SmartConnector) to publish data into a TH topic.
You can have other producers pushing data into TH: e.g. ArcSight SmartConnector, Apache NiFi, and so on. Anything that can produce into Kafka should do. (Note that alongside Producers and Consumers, Kafka also has its own Kafka Connect connectors. Don't confuse them with ArcSight Connectors; they are very simple collectors that will most likely not help you here, and they are not something we support, so I would not go this route.)
Once you have the data published to the Hub by any means, TH will support routing and filtering of CEF data, and will parse raw syslog data only if it is sent to the syslog topic.
If data is produced by something other than an ArcSight connector, events will probably be unparsed. Is there an option to add custom parsers or parser overrides into TH?
Syslog works if sent to the th-syslog topic; you then deploy CTH (Connector on TH) to parse it and produce standard CEF or binary data into your CEF/BIN topics. Unless this has changed recently and I missed it, CTH supports syslog only, and there is no official way to override the parsers or add a FlexConnector. (One partner played with it and loaded custom parsers into CTH, but also had to publish the tweaked CTH back into the TH repository so it is redeployed whenever a pod crashes or a node restarts. Doable if you understand Kubernetes, pods, wrappers, and so on, but it is not exposed in the management UI and not supported.)
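To make the producer path concrete, here is a minimal sketch of publishing a CEF event to a TH topic from a custom producer. The topic name "th-cef", the broker address, and the use of the kafka-python client are illustrative assumptions, not TH defaults; check your Transformation Hub deployment for the real topic names.

```python
# Sketch: building a CEF record and publishing it to a Transformation Hub
# topic. Topic/broker names below are placeholders, not TH defaults.

def build_cef(vendor, product, version, sig_id, name, severity, extensions):
    """Build a CEF:0 record; extension keys/values per the CEF spec."""
    ext = " ".join(f"{k}={v}" for k, v in extensions.items())
    return f"CEF:0|{vendor}|{product}|{version}|{sig_id}|{name}|{severity}|{ext}"

event = build_cef("Acme", "CustomApp", "1.0", "100", "Login failed", "5",
                  {"src": "10.0.0.5", "suser": "alice"})

def publish_to_th(cef_event, broker="th-host:9092", topic="th-cef"):
    """Publish one record to TH. Not called here; needs a reachable broker
    and the kafka-python package (pip install kafka-python)."""
    from kafka import KafkaProducer
    producer = KafkaProducer(bootstrap_servers=broker)
    producer.send(topic, cef_event.encode("utf-8"))
    producer.flush()
```

Any Kafka client library works the same way; the point is that once the record is on the topic, TH can route and filter it like any SmartConnector-produced event.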
Here is a practical and tangible action plan, not just theoretical fluff that hackers ignore.
Firstly, as much as AWS want to advertise that they are secure, enabling logging and monitoring in AWS is:
Not straightforward.
Missing a lot of information from AWS, which falls under your side of the shared responsibility model. (The public cloud security get-out-of-jail card.)
There are many ways to skin the cat, but no real best practices.
You need to be well aware of the service limits.
AWS releases new products that don't exist anywhere else, so you have no idea what can be abused/exploited or how to detect those threats. (Of course, no one is going to question a behemoth, because everyone wants to work for them, right?)
They always advise you that the product is documented, but don't give you any advice on business outcomes and gaps.
Check the AWS HCL. People look at the features, but a basic tenet of solution architecture is to check the hardware compatibility list (HCL). This applies to AWS as well, where you need to check what is not supported.
AWS (complementary and additive) native Architecture.
AWS forces you to use all of their services for a single requirement, making Bezos a trillionaire. It's a nonsensical, intricate web where no one has a farking clue what is going on. Look at this example from the Security Hub FAQ:
Q: Will Security Hub replace the consoles of our other security services, such as Amazon GuardDuty, Amazon Inspector, or Amazon Macie?
No. Security Hub is complementary and additive to the AWS security services. In fact, Security Hub will link back into the other consoles to help you gain additional context. Security Hub does not replicate the setup, configuration, or specialised features available within each security service.
CloudTrail can also send logs into CloudWatch Logs (I have no clue why you would need to do that).
Another one: DNS traffic is not captured in VPC Flow Logs, VPC Flow Logs are not real-time, and they do not support some instance types.
You can’t modify a Flow Log’s configuration parameters once it is created. Instead, you have to delete it and create a new log. That’s not difficult, but it’s a bit annoying from a usability perspective.
Network interfaces with multiple IP addresses will have data logged only for the primary IP as the destination address. This makes Flow Logs less useful in configurations involving multiple IPs on a single interface.
Flow Logs exclude traffic related to DHCP requests and Amazon DNS activity. (Traffic for a non-Amazon DNS server is logged.) In many cases, this may not matter, but it is a limitation if you need to troubleshoot an issue with your site related to DHCP or DNS. For example, you may be experiencing poor performance due to slow DNS resolution. There are also valuable security insights that you can glean from DHCP and DNS traffic, such as detecting packet sniffing attempts by looking for unusual rates of IP conflicts, usage of the same MAC address by multiple hosts or the sharing of DNS records by machines with the same IP address.
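As a side note on working with the Flow Logs you do collect: each record in the default (version 2) format is a space-separated line in a documented field order, so splitting it into named fields is straightforward. A minimal Python sketch (the record itself is a made-up example):

```python
# Field order of the default (version 2) VPC Flow Log record format.
FIELDS = ["version", "account_id", "interface_id", "srcaddr", "dstaddr",
          "srcport", "dstport", "protocol", "packets", "bytes",
          "start", "end", "action", "log_status"]

def parse_flow_record(line):
    """Split one space-separated flow log record into a dict of named fields."""
    return dict(zip(FIELDS, line.split()))

# Example record (made up, but in the documented field order):
record = parse_flow_record(
    "2 123456789010 eni-abc123de 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK")
```

From here you can filter for interesting cases, e.g. records where `action == "REJECT"` and `dstport == "22"` to spot blocked SSH attempts.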
When the execs decided to digitally transform into AWS, did they evaluate the cost of talent? AWS isn't a single product; as of this writing it is 170 products that get upgraded and changed on a daily basis. Did you assess this risk? Of course you didn't. Oh, and don't get me started on the multi-cloud stupidity.
This is why AWS is just so easy to master! And also super easy to secure! 🙂 🙂 🙂 🙂
"Nobody ever got fired for buying IBM", goes the old proverb. Now it's public cloud!
AWS Security Actionable Security Monitoring Plan
You should make sure you get clear answers from AWS to the following questions:
So you're logging, that's great... what are you detecting?
What is your best practice for sending logs into a central SIEM?
Can you list the top use cases AWS covers/detects?
Threat Detection SOC Use cases;
Essentially, you need to log everything centrally (for investigations and compliance) and do threat detection on it. What are you logging, and what can you detect? You should run a Red Team against this configuration to see what you can and cannot detect.
From a Security Operations perspective, the following are the key use cases required to support your incident response plans:
Threat Detection and Alerting.
Governance and Compliance Reporting.
Investigation Searches and Digital Forensics.
Cloud Control Plane vs Cloud Data Plane Concept
To establish baseline monitoring, security teams should gather and process the following:
Cloud control plane logs (such as AWS CloudTrail logs)
Data Plane Workload OS/application logs
AWS Product (Access Logs)
Network flow logs for virtual private clouds (VPCs)
Inventory your threat landscape and exposure
Requirements for Threat Detection
UpStream Security Monitoring
Cloud Control Plane Logging
First, there’s the idea of a control plane. The control plane is the master controller (usually in the form of a master node) and includes API services, scheduling capabilities for containers and operational management tools/services. A master-level configuration database is also maintained in the control plane. In general, the control plane can be considered the brains of the Kubernetes infrastructure, and it needs to be very carefully protected.
Focus on the types of events that could be problematic to the environment. Examples include critical assets accessed or changed, identity policies modified, cryptographic keys deleted or changed, and so on.
AWS Product Access Logs
On top of the control and data planes, you need to consider the access logs for specific AWS products/services. For services such as Amazon CloudFront, the access logs are not captured via the control plane; therefore, you need to capture access logs, account activity, and configuration separately.
Billing alarms—If you have a reasonable idea of a monthly billing range, you can break this down to define “checkpoints” that your bill should be at any given time. If these thresholds are crossed, you can be alerted and investigate the reason for the additional cost. Tools like AWS Budgets provide simple alerting and reporting for cloud billing.
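The checkpoint idea can be sketched in a few lines. The linear burn-down assumption and the 20% tolerance below are illustrative choices, not AWS Budgets behaviour:

```python
# Sketch: billing "checkpoints". Assumes spend accrues roughly linearly
# through the month; the tolerance factor is an arbitrary example.

def billing_checkpoint(monthly_budget, day_of_month, days_in_month=30):
    """Expected spend at this point in the month, assuming linear burn."""
    return monthly_budget * day_of_month / days_in_month

def over_checkpoint(actual_spend, monthly_budget, day_of_month, tolerance=1.2):
    """True if actual spend exceeds the checkpoint by more than `tolerance`."""
    return actual_spend > billing_checkpoint(monthly_budget, day_of_month) * tolerance
```

For example, with a $3,000 monthly budget, the mid-month checkpoint is $1,500; spend of $2,000 on day 15 would trip the alarm, $1,600 would not.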
Resources and resource utilization—Cloud control plane logs from services like AWS CloudTrail can (and should) be heavily leveraged to monitor new, modified and deleted assets in the environment, as well as access to assets and service interaction in the cloud environment. These logs need to be integrated with a SIEM and/or cloud-native monitoring solution like Amazon CloudWatch to build the appropriate triggers for alerting, as well as monitoring and reporting metrics as warranted. Some behavioral trending over time can also be assessed and reported through analytics tools like AWS Security Hub and Amazon GuardDuty as well.
Amazon CloudWatch filters
Activity in specific regions—One of the best quick wins for security teams is to purposefully disable all geographic regions not in use; a follow-up to this is enabling explicit monitoring for cloud control plane logs (like AWS CloudTrail) to look for any activity in regions marked as “not in use” or “disabled.” A common tactic intruders use for malicious activities like cryptocurrency mining is to create unauthorized assets and workloads in unused regions to “buy time” before detection. Teams should consider any alert for activity in an unauthorized or unused region a high priority.
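A minimal sketch of this check against a parsed CloudTrail record; awsRegion and eventName are real CloudTrail record fields, but the allowed-region list is an assumption you would tailor to your own account:

```python
# Sketch: flag any CloudTrail activity in a region you do not use.
ALLOWED_REGIONS = {"us-east-1", "eu-west-1"}  # assumption: your approved regions

def region_alert(cloudtrail_event):
    """Return a high-priority alert dict if the event came from an
    unapproved region, else None. Expects a parsed CloudTrail record."""
    region = cloudtrail_event.get("awsRegion")
    if region not in ALLOWED_REGIONS:
        return {"priority": 1,
                "reason": f"activity in unapproved region {region}",
                "event": cloudtrail_event.get("eventName")}
    return None
```

In practice the same logic would sit in a SIEM rule or a CloudWatch/EventBridge filter rather than ad-hoc code; the sketch just shows how cheap the check is.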
Monitor your user activity within the cloud. Admins, in particular, should be monitored carefully, because these accounts are prime targets for attackers. Any nonfederated user access should also be a high priority.
VPC Flow Logs for your VPCs; they are not enabled by default.
AWS Config provides a detailed view of the configuration of AWS resources in your AWS account. This includes how the resources are related to one another and how they were configured in the past so that you can see how the configurations and relationships change over time.
However, AWS Config only collects information about EC2/VPC-related resources, not everything in your AWS account.
You should monitor changes to your AWS estate and ensure all changes come via ITIL change management and/or approved automation only.
First, you need to understand which AWS services and/or devices are in scope, then map their AWS-native security logging to ArcSight SmartConnectors.
Click on Resource Groups next to the AWS services in your AWS console page, select All Regions in the region field and All Resources in the resources field, and you will get a list of all the resources up and running in your AWS account. You can even tag them separately so you can check how much each resource is costing you. If there is another way, for example through the AWS CLI, I am curious to know it.
Adding context—If logs can be “tagged” as originating from a specific ISP or CSP, that can help provide context on the use cases of the service. For example, logs from identity management services like AWS Identity and Access Management (IAM) have a specific user context, whereas events from Amazon EC2 may need additional details about workloads to provide the proper context for evaluation.
For example, an AWS Config advanced query can scope results to a specific VPC: relationships.resourceId = 'vpc-#######'
What do you use: AWS Security Hub, GuardDuty, CloudWatch, CloudTrail or EventBridge?
The answer is that all of these are complementary and additive services. So let's examine each of them and their primary use cases. It's best to begin with your use cases in terms of SOC operations and threat detection:
Investigation and Search
Governance and Reporting
Threat Detection and Alerts
AWS GuardDuty, CloudTrail, Security Hub and CloudWatch each act as an aggregation point for other AWS services, and each is supported by a corresponding ArcSight SmartConnector. You need to determine where you want to do threat detection and where to hold raw logs for long-term retention and investigation. This is where the AWS (complementary and additive) native architecture comes into play:
Data Plane -> AWS EC2 -> Windows (SYSMON/WEC/WEF) -> ArcSight SmartConnector -> ESM/Logger
Data Plane -> AWS EC2 -> Linux (AuditD/Syslogs) -> ArcSight SmartConnector -> ESM/Logger
ArcSight SmartConnector for WiNC (Windows Native Connector) – Recommended for Production Environments
Windows Event Collection (WEC) and Windows Event Forwarding (WEF) are native Microsoft technologies that support Windows event log collection in a Windows environment.
WiNC SmartConnector is capable of collecting “Forwarded Events or Other WEC Logs from Local Or Remote Hosts”. As such, you may consider deploying a suitable Windows Event Forwarding architecture for your organization.
Directly on WEF aggregation point (WECServer)
Remotely on another Windows server, to connect and collect forwarded events from one or many WEC server(s).
SmartConnector for MS Windows Event Log – Native SmartConnector (WiNC)
Suspicious AWS CloudTrail event that indicates a cloud user trying to deactivate an MFA device.
How to Improve Security Visibility and Detection/Response Operations in AWS
IAM activity (logins in particular)—Monitor your user activity within the cloud. In particular, monitor admins carefully, because these user credentials are prime targets for attackers. Any nonfederated user access should also be a high priority.
Priority 1:
• Launching a workload that is not from an approved template
• Launching any containers from unapproved images in a repository
• Launching any assets in unapproved regions
• Modifying any IAM roles or policies
• Modifying or disabling cloud control plane logging or other security controls
• Logins to the web console (unauthorized)

Priority 2:
• Unusual user behaviors (trying to access unauthorized resources, etc.)
• Adding/updating new workload images
• Adding/updating new container images
• Logins to the web console (authorized)
• Updating/changing serverless configuration

Priority 3:
• Changes to security groups or network access control lists (ACLs)
• Updating/changing serverless function code
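A triage table like this can be encoded as a simple lookup. The event names below are real CloudTrail event names, but which priority each deserves is exactly the judgment call described in the lists above, so treat the mapping as an illustrative starting point:

```python
# Sketch: mapping CloudTrail event names to the alert priorities above.
EVENT_PRIORITY = {
    "StopLogging": 1,          # disabling cloud control plane logging
    "DeleteTrail": 1,          # ditto
    "PutRolePolicy": 1,        # modifying IAM roles or policies
    "ConsoleLogin": 2,         # authorized console login (escalate if unauthorized)
    "AuthorizeSecurityGroupIngress": 3,  # security group changes
}

def triage(event_name, default=3):
    """Return the alert priority for a CloudTrail event name."""
    return EVENT_PRIORITY.get(event_name, default)
```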
Table 1. Starting Points for Event Searches

AWS CloudTrail Event: Reason for Investigation
ConsoleLogin: A user initiates console login activity.
StopLogging: A user tries to stop AWS CloudTrail.
CreateNetworkAcl: Someone creates a network ACL, which could expose attack surfaces or vectors.
CreateRoute: Someone creates a new route for data path control, which could expose attack surfaces or vectors.
Enable Amazon VPC Flow logs for your VPCs; they are not enabled by default.
EC2 instances on the AWS Nitro system can mirror traffic from any EC2 instance (A1, C5, C5d, C5n, I3en, M5, M5a, M5ad, M5d, p3dn.24xlarge, R5, R5a, R5ad, R5d, T3, T3a, and z1d).
Utilise the default DNS service, as it is integrated with CloudTrail and GuardDuty. If you are using a third party for DNS, you need to make sure you can monitor it and correlate it within your SIEM; e.g. Cisco Umbrella is supported by an ArcSight SmartConnector.
Uses Amazon Machine Images (AMIs) to get started
Multiple OS support
Pay for what you use
Next-gen Nitro infrastructure, created by AWS
Amazon Elastic Block Store (EBS), Amazon Simple Storage Service (S3), Amazon Elastic File System (EFS)
Amazon S3 offers multiple storage classes for multiple use cases. Amazon EBS is used for the “block device” or hard drive for Amazon EC2 instances. Amazon EFS is used for file sharing storage with two storage classes to choose from.
Initial investigation and threat hunting—Analysts need to quickly find evidence of compromise or unusual activity, and often need to do so at scale.
Opening and updating incident tickets/cases—Due to improved integration with ticketing systems, event management and monitoring tools used by response teams can often generate tickets to the right team members and update these as evidence comes in.
Producing reports and metrics—Once evidence has been collected and cases are underway or resolved, generating reports and metrics can take a lot of analysts’ time.
• Automated DNS lookups of domain names never seen before
• Automated searches for detected indicators of compromise
• Automated forensic imaging of disk and memory from a suspect system, driven by alerts triggered in network- and host-based anti-malware platforms and tools
• Network access controls automatically blocking outbound command and control (C2) channels from a suspected system
The key products compared here are based on the Gartner Magic Quadrant, which is what organizations typically use to select SIEM vendors. The vendors mentioned in this deck are:
1. HP ArcSight
2. McAfee Nitro
3. IBM QRadar
4. Splunk SIEM
5. RSA Security Analytics
SIEM Technology Space
SIEM market analysis of the last 3 years suggests:
Market consolidation of SIEM players (25 vendors in 2011 to 16 vendors in 2013).
Only products with technology maturity and a strong road map have featured in the leaders quadrant.
HP ArcSight and IBM Q1 Labs have maintained leadership in the SIEM industry with continued technology upgrades.
McAfee Nitro has strong product features and a road map to challenge HP and IBM for leadership.
The ArcSight Enterprise Threat and Risk Management (ETRM) Platform is an integrated set of products for collecting, analysing, and managing enterprise Security Event information.
ArcSight Enterprise Security Manager (ESM): Correlation and analysis engine used to identify security threats in real time, in both physical and virtual environments
ArcSight Logger: Log storage and Search solution
ArcSight IdentityView: User Identity tracking/User activity monitoring
ArcSight Connectors: For data collection from a variety of data sources
ArcSight Auditor Applications: Automated continuous controls monitoring for both mobile
Extensive Log collection support for commercial IT products & applications
Complex deployment & configuration
Advanced support for Threat Management, Fraud Management & Behavior Analysis
Mostly suited for Medium to Large Scale deployment
Comprehensive Explanation: What is a SIEM (in 2020 and beyond.)
SIEM unifies Threat Detection and Hunting.
This is an old topic worth revisiting and level-setting with the latest advancements, concepts and lessons learned from decades of unsuccessful SIEM deployments! It is worth revisiting because a lot of people don't understand the value of a SIEM, and even fewer understand how to effectively operationalise one and achieve business outcomes with it.
After reading this you will gain enough insight into the basics of SIEM.
I am continually asked the same questions around SIEM design, so I am glad to finally brain-dump this knowledge and share it with the community.
(SIEM in the public cloud is beyond the scope of this article; while all the information here is relevant, I will write another article focusing specifically on threat detection for public cloud environments.)
Security Information and Event Management
A SIEM seeks to provide a holistic approach to an organisation's IT security. A SIEM represents a combination of services, appliances, and software products. It performs real-time collection of log data from devices, applications and hosts. It also processes the collected log data, enabling real-time analysis of security alerts generated by network hardware and applications, advanced correlation of security and operational events, as well as real-time alarming and scheduled reporting.
SIEM technology is used in many enterprise organizations to provide real time reporting and long term analysis of security events. SIEM products evolved from two previously distinct product categories, namely security information management (SIM) and security event management (SEM).
Table 1 shows this evolution.
Table 1. SIM and SEM Product Features Incorporated into SIEM
Combined SIEM Product: real-time reporting, log collection, normalization, correlation, aggregation.
SIEM combines the essential functions of SIM and SEM products to provide a comprehensive view of the enterprise network using the following functions:
Log collection of event records from sources throughout the organization provides important forensic tools and helps to address compliance reporting requirements.
Normalization maps log messages from different systems into a common data model, enabling the organization to connect and analyze related events, even if they are initially logged in different source formats.
Correlation links logs and events from disparate systems or applications, speeding detection of and reaction to security threats.
Aggregation reduces the volume of event data by consolidating duplicate event records.
Reporting presents the correlated aggregated event data in real-time monitoring and long-term summaries.
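As a rough illustration of the normalization and aggregation functions above, here is a Python sketch. The two raw formats and the common-model field names are invented for illustration; a real SIEM ships purpose-built parsers per device type:

```python
import re
from collections import Counter

def normalize(raw):
    """Map two hypothetical source formats onto one common data model."""
    m = re.match(r"FW: drop (\S+) -> (\S+)", raw)   # "firewall" style record
    if m:
        return {"action": "drop", "src": m.group(1), "dst": m.group(2)}
    m = re.match(r"auth failure from (\S+)", raw)   # "auth log" style record
    if m:
        return {"action": "auth_fail", "src": m.group(1), "dst": None}
    return {"action": "unknown", "src": None, "dst": None}

def aggregate(events):
    """Consolidate duplicate records into (action, source) -> count pairs."""
    return dict(Counter((e["action"], e["src"]) for e in events))
```

Three identical auth failures from one host normalize to the same record and aggregate down to a single entry with a count of 3, which is exactly the volume reduction described above.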
An internal IT environment consists of the services, networking equipment, applications, and components that an organisation wants to protect and prevent intrusion into. To protect these assets and data, you can deploy protection in the form of firewalls, antivirus, IPS/IDS and authentication. Protection examples include:
Secure Access Service Edge
Despite all of the systems and effort put into these solutions, those trying to breach that environment will get in. Once they are in, detecting and responding to their attack is time critical.
A SIEM receives or taps into all of this activity, continually receiving thousands of logs per second from all of the devices and systems within the environment. The SIEM processes the log data to make meaning of what is actually happening on a device (aka detection), and analytics are used to analyse activity in the data, providing more input into what is actually happening.
SIEM solutions also provide the ability to analyse historic log data and generate reports for compliance purposes, as well as providing digital forensics and fulfilling additional parts of an overall information security strategy.
SIEM solutions centralise log data within IT environments, augmenting security measures and enabling real-time analysis. A SIEM is constantly watching, monitoring and analysing events and alerts within the environment in an effort to detect attacks and intrusions.
Fourth Wave of SIEM
SIEMs sometimes get a bad name: the technology is incredibly powerful, yet it takes an enormous amount of skill and effort to get working. Not because of the SIEM itself, but because it requires data from your entire IT environment, and that in particular causes massive delays in successful SIEM deployments. (This can be easily solved. Keep reading.) SIEM has evolved into very mature platforms, e.g. ArcSight with 20+ years of evolution. Read the ArcSight history here.
PCI-DSS really drove the first phase of SIEM deployments, for compliance business outcomes.
Then people started to detect bad things in network activity.
This phase was when customers started to build SOCs.
This phase is about SOCs developing threat hunting utilising NDR, EDR, SIEM and SOAR.
A SIEM processes all types of machine data produced by devices in an IT environment.
Machine data is one of the most underused and undervalued assets of any organization. But some of the most important insights that you can gain—across IT and the business—are hidden in this data: where things went wrong, how to optimize the customer experience, the fingerprints of fraud. All of these insights can be found in the machine data that’s generated by the normal operations of your organization.
Machine data is valuable because it contains a definitive record of all the activity and behavior of your customers, users, transactions, applications, servers, networks and mobile devices. It includes configurations, data from APIs, message queues, change events, the output of diagnostic commands, call detail records and sensor data from industrial systems, and more.
The challenge with leveraging machine data is that it comes in a dizzying array of unpredictable formats, and traditional monitoring and analysis tools weren’t designed for the variety, velocity, volume or variability of this data.
In computing, syslog/ˈsɪslɒɡ/ is a standard for message logging. It allows separation of the software that generates messages, the system that stores them, and the software that reports and analyzes them. Each message is labeled with a facility code, indicating the software type generating the message, and assigned a severity level.
The syslog protocol, defined in RFC 3164, provides a transport to allow a device to send event notification messages across IP networks to event message collectors, also known as syslog servers. The protocol is simply designed to transport these event messages from the generating device to the collector; the collector doesn't send back an acknowledgment of receipt.
Syslog uses the User Datagram Protocol (UDP), port 514, for communication. Being a connectionless protocol, UDP does not provide acknowledgments. Additionally, at the application layer, syslog servers do not send acknowledgments back to the sender for receipt of syslog messages. Consequently, the sending device generates syslog messages without knowing whether the syslog server has received the messages. In fact, the sending devices send messages even if the syslog server does not exist.
The syslog packet size is limited to 1024 bytes and carries the following information: the priority value (encoding facility and severity), a header (timestamp and hostname), and the message itself.
Computer system designers may use syslog for system management and security auditing as well as general informational, analysis, and debugging messages. A wide variety of devices, such as printers, routers, and message receivers across many platforms use the syslog standard. This permits the consolidation of logging data from different types of systems in a central repository. Implementations of syslog exist for many operating systems.
When operating over a network, syslog uses a client-server architecture where a syslog server listens for and logs messages coming from clients.
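A minimal RFC 3164 parse in Python shows how the priority value decomposes into facility and severity (facility = PRI div 8, severity = PRI mod 8). Real-world syslog varies enormously, so treat this as a sketch rather than a production parser:

```python
import re

def parse_syslog(msg):
    """Minimal RFC 3164 parse: split PRI into facility/severity and pull
    out the timestamp, hostname and message. Sketch only; real messages
    deviate from this layout all the time."""
    m = re.match(r"<(\d{1,3})>(\w{3} [ \d]\d \d\d:\d\d:\d\d) (\S+) (.*)", msg)
    if not m:
        return None
    pri = int(m.group(1))
    return {"facility": pri // 8, "severity": pri % 8,
            "timestamp": m.group(2), "host": m.group(3), "message": m.group(4)}

# The classic example message from RFC 3164 itself:
parsed = parse_syslog("<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick")
```

Here PRI 34 decomposes to facility 4 (security/auth) and severity 2 (critical), which is exactly the labelling described above.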
A SIEM is a mandatory requirement for compliance audits such as PCI-DSS, ISO 27001, the Sarbanes-Oxley Act of 2002 (thanks, Enron), and other standards.
The Payment Card Industry (PCI) Security Standards Council was founded by five global payment brands: American Express, Discover Financial Services, JCB International, MasterCard, and Visa. These five payment brands had a common vision of strengthening security policies across the industry to prevent data breaches for businesses that accept and process payment cards. Together they drafted and released the first version of PCI Data Security Standard (PCI DSS 1.0) on December 15, 2004.
PCI DSS is a regulation with twelve requirements that serve as a security baseline to secure payment card data.
Requirement 10: Track and monitor all access to network resources and cardholder data.
Requirement 11.5: Deploy a change detection mechanism (for example, file integrity monitoring tools) to alert personnel to unauthorized modification (including changes, additions, and deletions) of critical system files, configuration files or content files. Configure the software to perform critical file comparisons at least weekly. Implement a process to respond to any alerts generated by the change-detection solution.
Depending on your PCI-DSS merchant level and the number of credit card transactions you process, you will need to adhere to different levels of PCI auditing.
Cyber Threat Intelligence
Threat intelligence, or cyber threat intelligence, is information an organization uses to understand the threats that have, will, or are currently targeting the organization. This info is used to prepare, prevent, and identify cyber threats looking to take advantage of valuable resources.
Cyber threat intelligence consists of many types of information, including indicators of compromise (IOCs) and indicators of attack (IOAs).
Indicators of compromise (IOCs) are “pieces of forensic data, such as data found in system log entries or files, that identify potentially malicious activity on a system or network.” Indicators of compromise aid information security and IT professionals in detecting data breaches, malware infections, or other threat activity. By monitoring for indicators of compromise, organizations can detect attacks and act quickly to prevent breaches from occurring or limit damages by stopping attacks in earlier stages.
Indicators of compromise act as breadcrumbs that lead infosec and IT pros to detect malicious activity early in the attack sequence. These unusual activities are the red flags that indicate a potential or in-progress attack that could lead to a data breach or systems compromise.
Indicators of attack are similar to IOCs, but rather than focusing on forensic analysis of a compromise that has already taken place, indicators of attack focus on identifying attacker activity while an attack is in process. Indicators of compromise help answer the question “What happened?” while indicators of attack can help answer questions like “What is happening and why?” A proactive approach to detection uses both IOAs and IOCs to discover security incidents or threats in as close to real time as possible
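A toy illustration of IOC matching against parsed log events. The indicator values and field names below are placeholders; in practice the watchlists come from a threat intelligence feed and the matching runs inside the SIEM:

```python
# Sketch: matching log events against a small IOC watchlist.
IOC_IPS = {"198.51.100.23", "203.0.113.99"}        # example known-bad addresses
IOC_HASHES = {"44d88612fea8a8f36de82e1278abb02f"}  # example file hash

def match_iocs(event):
    """Return the IOC indicators present in one parsed log event."""
    hits = []
    if event.get("src_ip") in IOC_IPS:
        hits.append(("ip", event["src_ip"]))
    if event.get("file_hash") in IOC_HASHES:
        hits.append(("hash", event["file_hash"]))
    return hits
```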
APTs and Tactics, Techniques and Procedures (TTPs)
A SIEM can utilise cyber threat intelligence (IOCs/IOAs/TTPs) and correlate it with IT environment log data to detect threats in both real-time and historic log data.
Correlation Rules, Behaviour patterns, Pattern matching, Anomaly detection, Conditions, Thresholds, Network Modelling and Machine learning (Phew give me a pay rise. )
Correlation is one of the key components of any effective SIEM tool. As information from across your digital environment feeds into a SIEM, it uses correlation to identify any possible issues. It does so by comparing sequences of activity against preset rules, conditions and thresholds. SIEMs allow sophisticated ways to implement risk based rules.
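A threshold rule of the kind described above can be sketched as a sliding window. The "N failed logins in T seconds from one source" rule here is a generic example, not any vendor's rule syntax:

```python
from collections import defaultdict, deque

class FailedLoginRule:
    """Threshold correlation sketch: fire when one source produces
    `threshold` failed logins within a `window` of seconds."""

    def __init__(self, threshold=5, window=60):
        self.threshold, self.window = threshold, window
        self.events = defaultdict(deque)  # per-source timestamps

    def feed(self, src, ts):
        """Record one failed login; return True when the rule fires."""
        q = self.events[src]
        q.append(ts)
        while q and ts - q[0] > self.window:  # drop timestamps outside window
            q.popleft()
        return len(q) >= self.threshold
```

Feeding three failures from one host within a minute trips a threshold-3 rule; a fourth failure much later does not, because the window has slid past the earlier events.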
The latest SIEM, can now implement Anomaly detection via Machine learning.
All integrated with Threat Intelligence information.
The Brains inside a SIEM is based on Correlation Rules, Pattern matching, Conditions, Thresholds and now implementation of Machine learning via Unsupervised and Supervised Models.
Supervised Machine Learning
Unsupervised Machine Learning
Network Modelling and Risk Scoring
"Use case" is the term used for threat detection in terms of business context. It combines value and context in the SIEM platform.
You can catch just about everything with ArcSight Default Content and SIGMA Rules! The rest you need to pay someone like me to workshop and write.
Machine Data Sources
Amazon Web Services (Security & Compliance, IT Operations): Data from AWS can support service monitoring, alarms and dashboards for metrics, and can also track security-relevant activities, such as login and logout events.

APM Tool Logs (Security & Compliance, IT Operations): APM tool logs can provide end-to-end measurement of complex, multi-tier applications, and be used to perform post-hoc forensic analytics on security incidents that span multiple systems.

Authentication (Security & Compliance, IT Operations, Application Delivery): Authentication data can help identify users that are struggling to log in to applications and provide insight into potentially anomalous behaviors, such as activities from different locations within a specified time period.

Firewalls (Security & Compliance, IT Operations): Firewall data can provide visibility into blocked traffic in case an application is having communication problems. It can also be used to help identify traffic to malicious and unknown domains.

Industrial Control Systems (ICS) (Security & Compliance, Internet of Things, Business Analytics): ICS data provides visibility into the uptime and availability of critical assets, and can play a major role in identifying when these systems have fallen victim to malicious activity.

Medical Devices (Security & Compliance, Internet of Things, Business Analytics): Medical device data can support patient monitoring and provide insights to optimize patient care. It can also help identify compromised protected health information.

Network Protocols (Security & Compliance, IT Operations): Network protocol data can provide visibility into the network's role in overall availability and performance of critical services. It's also an important source for identifying advanced persistent threats.

Sensors (Security & Compliance, IT Operations, Internet of Things): Sensor data can provide visibility into system performance and support compliance reporting of devices. It can also be used to proactively identify systems that require maintenance.

System Logs (Security & Compliance, IT Operations): System logs are key to troubleshooting system problems and can be used to alert security teams to network attacks, a security breach or compromised software.

Web Logs (Security & Compliance, IT Operations, Business Analytics): Web logs are critical in debugging web application and server problems, and can also be used to detect attacks, such as SQL injections.
SIEM Data formats
Typical formats supported by SIEM platforms for ingesting log data:
In the realm of security event management, a myriad of event formats streaming from disparate devices makes integration complex. The Common Event Format from ArcSight promotes interoperability between various event- and log-generating devices.
Although each vendor has its own format for reporting event information, these event formats often lack the key information necessary to integrate the events from their devices.
The ArcSight standard attempts to improve the interoperability of infrastructure devices by aligning the logging output from various technology vendors.
Common Event Format (CEF) is a Logging and Auditing file format from ArcSight and is an extensible, text-based format designed to support multiple device types by offering the most relevant information.
Message syntaxes are reduced to work with ArcSight normalization. Specifically, the Common Event Format defines a syntax for log records comprising a standard header and a variable extension formatted as key-value pairs. The format can be readily adopted by vendors of both security and non-security devices.
This format contains the most relevant event information, making it easy for event consumers to parse and use them. To simplify integration, the syslog message format is used as a transport mechanism.
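For illustration, the canonical sample record from the CEF documentation can be pulled apart with a short Python sketch (the parsing here is naive and ignores CEF's escaping rules; it is a conceptual aid, not a production parser):

```python
# Canonical sample event from the CEF documentation.
raw = ("CEF:0|Security|threatmanager|1.0|100|worm successfully stopped|10|"
       "src=10.0.0.1 dst=2.1.2.2 spt=1232")

def parse_cef(line):
    """Split a CEF record into its 7 header fields plus the key=value extension."""
    assert line.startswith("CEF:")
    # The header is the first 7 pipe-delimited fields; the rest is the extension.
    parts = line.split("|", 7)
    header = dict(zip(
        ["version", "device_vendor", "device_product", "device_version",
         "signature_id", "name", "severity"], parts[:7]))
    header["version"] = header["version"].removeprefix("CEF:")  # Python 3.9+
    # Naive split: real CEF extensions may contain escaped spaces.
    extension = dict(kv.split("=", 1) for kv in parts[7].split())
    return header, extension

header, ext = parse_cef(raw)
print(header["device_vendor"], header["severity"], ext["src"])
```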
Timestamp normalisation ensures all timestamps reflect the same time zone, so that events from different time zones can be correlated.
Time is an important piece of threat detection. Some time zones around the world don't observe Daylight Saving Time (DST), and some are actually offset by half an hour from their neighbours. In addition to time zone issues, some devices don't include a time in the log message at all. A SIEM needs to timestamp every log in a single time zone.
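A minimal sketch of this normalisation in Python (assuming Python 3.9+ for `zoneinfo`). Note the half-hour offset of Asia/Kolkata; both local wall-clock times below turn out to be the same UTC instant, which is exactly what correlation needs:

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

# Two devices log "the same moment" with different local wall-clock times.
# A SIEM must normalise both to one zone (UTC) before correlating.
sydney = datetime(2023, 6, 1, 20, 30, tzinfo=ZoneInfo("Australia/Sydney"))  # UTC+10 in June
india  = datetime(2023, 6, 1, 16, 0,  tzinfo=ZoneInfo("Asia/Kolkata"))      # UTC+5:30, no DST

utc_sydney = sydney.astimezone(ZoneInfo("UTC"))
utc_india  = india.astimezone(ZoneInfo("UTC"))
print(utc_sydney.isoformat(), utc_india.isoformat())  # both 2023-06-01T10:30:00+00:00
```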
Data Enrichment (Meta data extracting, tagging and enrichment)
The SIEM parses and breaks a log message down into its core components and adds context, e.g. a customer tag.
Log data is not uniform: sources may follow a standard transport protocol, but the information within is not standardised across log source providers, so a SIEM has to process each log into a unified threat detection taxonomy and universal schema before it can run mathematical rules.
Log information needs to be mapped into a common schema so that a [User Logon] message from various systems (Unix, Windows, Active Directory, AWS, etc.) is tagged consistently as a user logon, assisting threat detection search rules.
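A toy sketch of such a mapping; the taxonomy URIs and source/event identifiers below are hypothetical, not ArcSight's actual categorisation scheme:

```python
# Hypothetical mapping from per-source event identifiers to one unified
# taxonomy tag, so a single search rule matches "logon" from every source.
TAXONOMY = {
    ("windows", "4624"):         "/authentication/logon/success",
    ("unix",    "sshd_accept"):  "/authentication/logon/success",
    ("aws",     "ConsoleLogin"): "/authentication/logon/success",
}

def normalize(source, native_id):
    # Unmapped events fall through to a catch-all category for later tuning.
    return TAXONOMY.get((source, native_id), "/uncategorized")

print(normalize("windows", "4624"))       # /authentication/logon/success
print(normalize("aws", "ConsoleLogin"))   # /authentication/logon/success
```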
Threat and Risk Contextualisation
Evaluate each log and assign a risk-based priority value, e.g. information from edge services / DMZ, or authentication sources such as Active Directory, DNS information, etc.
Events are collections of logs created after processing with threat intelligence and/or correlation rules. An event is an actionable log item sent to human analysts for further triage, investigation and reporting.
Sizing SIEM solutions
Sizing a SIEM solution begins with a basic list of the devices you want to monitor. See the example device list collection tool:
Windows Server (Active Directory)
Windows Server (DNS)
Fortinet Firewall (IDS/IPS/VPN)
Citrix Access Gateway
SIEM Sizing (Events Per Second)
Critical to the sizing and design of a SIEM platform is determining the Events Per Second produced by the monitored devices.
You need to determine and estimate the following SIEM fundamentals:
Events Per Second
Events Per Day
Online Retention Period and required Storage in GB
Retention Period and required Storage in GB
Network Bandwidth Peak requirements (GB per second across all devices)
EPS average (Day, Week, Month, etc.)
Estimated Device Growth over 3 years
EPS Headroom (Allow 10-30%)
Recovery Point Objective
Recovery Time Objective
Event / Alert Size (roughly 512 bytes per event is a common estimate.)
SIEM Sizing Rosetta Stone
GB (1 GB = 1,000,000,000 BYTES)
EPS (1 EVENT = 600 BYTES)
Storage and Archival are critical for any Security Logging platform
Raw Event Size
Normalised Event Size
Online Retention Period
Events Per Day
GB Storage per day/Retention time.
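Putting the Rosetta Stone figures together, a back-of-envelope sizing calculation might look like this (the EPS, headroom and retention inputs are example values, not recommendations):

```python
# Back-of-envelope sizing using the figures above:
# 1 event ≈ 600 bytes, 1 GB = 1,000,000,000 bytes.
eps = 5000              # sustained events per second (example input)
event_bytes = 600
headroom = 0.30         # 30% EPS headroom
retention_days = 90     # online retention period (example input)

events_per_day = eps * (1 + headroom) * 86_400
gb_per_day = events_per_day * event_bytes / 1e9
gb_retained = gb_per_day * retention_days

print(f"{events_per_day:,.0f} events/day, "
      f"{gb_per_day:.1f} GB/day, {gb_retained:.0f} GB for {retention_days} days")
```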
It is vital to understand how your SIEM platform receives and processes data: what is the schema format, schema-on-read or schema-on-write? Is it using distributed search or in-memory real-time processing? The last thing you want to do is hoard data without understanding what you are collecting, be scared of getting rid of it, and still be unable to get any value from it. Don't turn into this guy, because the finance department will start knocking on your door, and the day will come when you have to provide justification and prove business results. If you ever get breached and can't even extract useful information after storing tons of data, you might need to find another job.
An overwhelming number of log sources without proper sanitisation and normalisation can lead to a massive amount of useless information in the SIEM, causing alert fatigue.
False-Positive and False-Negatives
A false positive state is when the SIEM identifies an activity as an attack but the activity is acceptable behavior. A false positive is a false alarm.
A false negative state is the most serious and dangerous state. This is when the SIEM identifies an activity as acceptable when the activity is actually an attack. That is, a false negative is when the SIEM fails to catch an attack. This is the most dangerous state since the security professional has no idea that an attack took place.
False positives, on the other hand, are an inconvenience at best and a significant operational burden at worst. However, with the right amount of overhead, false positives can be successfully adjudicated; false negatives cannot.
Airport Security: a “false positive” is when ordinary items such as keys or coins get mistaken for weapons (machine goes “beep”)
Medical screening: low-cost tests given to a large group can give many false positives (saying you have a disease when you don't), which are then followed up with more accurate tests.
Antivirus software: a “false positive” is when a normal file is thought to be a virus
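To see why false positives matter at SIEM scale, consider this base-rate arithmetic with illustrative (made-up) numbers: even a tiny false-positive rate swamps the handful of real detections.

```python
# Base-rate arithmetic: why even a tiny false-positive rate buries analysts.
events_per_day = 100_000_000   # example daily event volume
true_attacks = 50              # actual malicious events in that volume
fp_rate = 0.001                # 0.1% of benign events alert anyway
detection_rate = 0.99          # 1% of attacks are missed (false negatives)

false_positives = (events_per_day - true_attacks) * fp_rate
true_positives = true_attacks * detection_rate
false_negatives = true_attacks - true_positives

print(f"{false_positives:,.0f} false alarms vs {true_positives:.0f} real detections; "
      f"{false_negatives:.1f} attacks missed")
```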
Popular SYSLOG Servers
Log Sources Categories
SIEM – Real-Time vs Search
As data volumes keep growing, it becomes increasingly difficult for SIEMs and other data analytics platforms to gain critical insights from them. SIEMs need to detect threats in real time and search years of log source archives at the same time, so you are trying to solve two critical problems at once:
Security Event Management
Real-Time Streaming Data Analytics
Security Information Management
Searching Large Data sets at scale and speed
These two requirements are incredibly difficult to solve at scale. So, lo and behold, Open source to the rescue; Apache Kafka and Apache Hadoop provide solutions for both of these requirements.
A streaming platform has three key capabilities:
Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system.
Store streams of records in a fault-tolerant durable way.
Process streams of records as they occur.
Kafka is generally used for two broad classes of applications:
Building real-time streaming data pipelines that reliably get data between systems or applications
Building real-time streaming applications that transform or react to the streams of data
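The capabilities above can be sketched with a toy in-memory model of the append-only log abstraction Kafka provides (this is a conceptual illustration only, not the Kafka API):

```python
# Toy model of a Kafka-style topic: producers append records to an ordered
# log; each consumer reads from its own offset, so consumers are independent.
class Topic:
    def __init__(self):
        self.log = []             # ordered, durable record store (capability 2)

    def publish(self, record):    # producer side (capability 1)
        self.log.append(record)
        return len(self.log) - 1  # the record's offset

    def consume(self, offset):    # consumer side: read from an offset onward
        return self.log[offset:]

syslog = Topic()
syslog.publish("<34>Oct 11 22:14:15 host su: auth failure")
syslog.publish("<13>Oct 11 22:14:16 host app: started")

# Two independent consumers at different offsets see independent views.
print(len(syslog.consume(0)), len(syslog.consume(1)))  # 2 1
```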
Apache Hadoop (aka Data Lake)
The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing.
Security Orchestration, Automation and Response (SOAR)
This subject is beyond the scope of this article. I will dive into this in the near future.
Leading SIEM Vendor Solutions
ArcSight Data Platform
ArcSight practically invented the SIEM industry: with a 20+ year product portfolio, it created the CEF format for cyber security, and it now supports Apache Kafka and Apache Hadoop, integrating unsupervised machine learning via Vertica, IDOL and Interset.
Splunk, while gaining popularity for general-purpose IT monitoring, also has capability in security and big data analytics. Splunk Enterprise is the base solution, complemented by Splunk Enterprise Security, Splunk UBA, Splunk Cloud, Splunk Phantom and the Splunk Machine Learning Toolkit. Splunk uses the Common Information Model.
Another original SIEM vendor.
I don’t have any experience with QRadar.
ELK Security Onion / HELK
The fastest-growing open source search stack. Elastic is a very powerful open source platform that recently acquired Endgame. The ELK stack comprises Elasticsearch, Kibana, Logstash and Beats, with ECS (Elastic Common Schema) as its schema.
Popular due to McAfee Enterprise license agreements.
100% Windows Server based, with no Linux edition. Very complex to deploy, and it requires significant resources and application administration. It does have SYSMON, FIM, NETMON, UEBA and SOAR as part of the solution.
FireEye / Mandiant
Premium products offering banking- and defence-grade technology combined with 24/7 DFIR SOC services. So this is a product solution paired with arguably the best DFIR team (Mandiant). Very expensive. The HX, NX and MX product lines cover endpoint, network and cloud SIEM.
Thank you for reading this article; please support it by sharing. In the next article, I will look at log collection and SIEM design patterns in the cloud.
If you would like to sponsor my next article or this blog, please get in touch.
You can't find a solution without understanding the problem. Before buying or implementing new machine learning technology, identify the security use cases that are most critical for your organization. Once you understand and can articulate the problem you're trying to solve, you are then ready to select the technology that is best suited for your needs.
AI and machine learning are ubiquitous terms in cybersecurity, but there is plenty of snake oil among vendors who claim to use these technologies. Do your homework to understand what type of machine learning is behind a vendor's solution and whether or not that type of machine learning meets your security team's needs. You don't need to be a data scientist, but knowing just a little bit about how machine learning works can help you ask better questions when evaluating a vendor, like "What threats are not covered with existing tools and techniques?" or "Which data feeds contain valuable information but are currently underused?"
Your best defense comes from covering as many bases as possible. Machine learning alone will not find and stop a bad actor. Pairing a powerful UEBA with a next-gen SIEM provides a layered approach to security analytics that enables more visibility, better detection, and easier, quicker avenues for responding to both known and unknown threats. Real-time correlation quickly and effectively finds the known threats, and UEBA detects the subtle threats that would otherwise escape detection. The truth is that real-world threat scenarios often require a combination of both of these approaches.
The humans in your SOC are more valuable than ever, but they are facing formidable challenges. SOC teams consistently struggle to deal with snowballing feeds of data and constantly evolving threats. A proactive security posture comes from a human-machine team that leverages the strengths of each: faster-than-human analysis by machines to identify leads for investigation, and the contextual understanding of SOC analysts and threat hunters.
Different Types of Machine Learning
UEBA MITRE – Machine learning Use Case Examples
A sample of MITRE ATT&CK threat tactics and associated behavioral indicators detected by anomaly detection powered by unsupervised machine learning, such as Interset UEBA.
Customer tagging is a feature developed mainly to support MSSP environments, although private organizations can use the technique to denote cost centers, internal groups, or business units.
A Customer is not a source or target of an event, but it can be thought of as the owner of an event. Content developers can also use the Customer tag to develop customer-aware content.
Why is customer tagging critical in MSSP environments? The Customer designation identifies who owns the events. This ensures each customer (tenant) can view only its own events.
Consider this scenario: The customer tag is usually assigned based on the reporting device IP address. In an MSSP environment, different customers can have overlapping networks. This requires an elaborate mechanism for assigning a customer attribute to events.
Since most organizations use private address spaces (see https://en.wikipedia.org/wiki/Private_network), events from different customers may contain identical addresses that refer to different assets. For example, two tenants may use the private address space 192.0.2.x, and therefore the address 192.0.2.1 may be used by both tenants to refer to a local system.
Make sure you have the proper network information model, which includes zone information, and the asset model, which requires correct zone information. When a connector enriches an event with asset information derived from the ESM asset model, the event uses the asset address as key for locating asset information. The ESM asset model would therefore need a mechanism to differentiate between assets with the same address but belonging to different customers.
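One way to picture the disambiguation mechanism: key the asset lookup on (customer, address) rather than address alone. The tenant names and asset labels below are hypothetical, and this is a sketch of the concept, not the ESM asset model itself:

```python
# Sketch: keying the asset model on (customer, address) instead of address
# alone, so two tenants' identical private IPs resolve to different assets.
assets = {
    ("TenantA", "192.0.2.1"): "TenantA core router",
    ("TenantB", "192.0.2.1"): "TenantB mail server",
}

def lookup_asset(customer, ip):
    return assets.get((customer, ip), "unknown asset")

print(lookup_asset("TenantA", "192.0.2.1"))  # TenantA core router
print(lookup_asset("TenantB", "192.0.2.1"))  # TenantB mail server
```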
Allow for separation of events by retention periods or by event type and categorisation URI field.
Voltage SecureData data encryption and masking for Data Sovereignty
Global ID will assign a unique ID to each security event coming into ArcSight. This ID is globally unique and can be used to facilitate easier cross-portfolio analysis across multiple ESM installations as well as other ArcSight solutions.
Difference between a Smart Connector and Smart Collector
To understand Collectors vs. Connectors, we need to step back and look at what the SmartConnectors do.
Conceptually, the standard SmartConnectors have two main responsibilities: "Collect" raw data from various sources, and "Process" the collected data into enriched security events that are posted to a destination.
Introduced in ADP 2.30, customers can take advantage of the massive scalability and robustness of the Event Broker infrastructure, and move the computationally intensive "Process" step to the highly scalable and more robust Event Broker streaming infrastructure.
This is done by using syslog Collectors and syslog CEBs: Collectors are standalone components very similar to the SmartConnectors, but they only "Collect" raw syslog data, as the syslog SmartConnectors do, wrap it up and post it to a dedicated eb-con-syslog topic in Event Broker.
At that point, the Event Broker's CEB stream processors (CEB stands for Connector in Event Broker) read the data from the eb-con-syslog topic, do the parsing/normalization/enrichment/filtering processing (as the standalone SmartConnectors' destination pipelines do) and post the security events on the EB topics for consumption.
In other words, as their name suggests, the syslog Collectors are lightweight components responsible for collecting raw syslog data and passing it to Event Broker for processing.
Main advantages of the new architecture:
Potential for hardware consolidation and increased data throughput in the data collection layer where the Collectors are deployed, due to moving the processing to the EB streaming infrastructure.
Improved stability and easy horizontal scalability as data flows increase over time, or fluctuate during operations: CEBs are deployed or undeployed on the EB nodes with a single click in the ArcMC UI.
Reduced network traffic due to a single data feed to Event Broker, instead of multiple destinations coming from SmartConnectors.
The raw Syslog data is now available on the EB topic for any system that customer would like to share it with.
Note that at this time Collectors and CEBs are only available for syslog data.
Database connectors use SQL queries to periodically poll for events. Connectors support major database types, including
MS SQL, MS Access, MySQL, Oracle, DB2, Postgres, and Sybase.
IBM DB2 connectors: DB2 drivers are no longer provided in the connector installation due to licensing requirements.
Microsoft SQL Server Multiple Instance DB connector
McAfee Vulnerability Manager DB.
Time-Based Queries use a time field to retrieve events found since the most recent query time until the current time.
ID-Based Queries use a numerically increasing ID field to retrieve events from the last checked ID until the maximum ID.
Job ID-Based Queries use Job IDs that are not required to increase numerically. Processed Job IDs are filed in such a way that only new Job IDs are added. Unlike the other two types of database connector, Job ID-based queries can run in either Interactive mode or Automatic mode.
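The time-based polling pattern can be sketched as follows; the table and column names are illustrative, not an actual connector schema:

```python
import sqlite3

# Sketch of a time-based database poll: each cycle retrieves rows newer than
# the last checkpoint, then advances the checkpoint to the newest time seen.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, ts TEXT, msg TEXT)")
conn.executemany("INSERT INTO events (ts, msg) VALUES (?, ?)", [
    ("2023-06-01 10:00:00", "logon"),
    ("2023-06-01 10:05:00", "logoff"),
])

last_checkpoint = "2023-06-01 10:01:00"   # time of the previous poll
rows = conn.execute(
    "SELECT ts, msg FROM events WHERE ts > ? ORDER BY ts", (last_checkpoint,)
).fetchall()
if rows:
    last_checkpoint = rows[-1][0]         # advance the checkpoint
print(rows, last_checkpoint)
```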
Microsoft Windows Event Log Connectors
SmartConnector for Microsoft Windows Event Log
SmartConnector for Microsoft Windows Event Log – Native
SmartConnector for Microsoft Windows Event Log – Unified
Model Import Connectors
Rather than collecting and forwarding events from devices, Model Import Connectors import user data from an Identity Management system into ArcSight ESM. See individual configuration guides for Model Import Connectors on Protect724 for information about how these connectors are used
Model Import Connectors extract the user identity information from the database and populate the following lists in ESM with the data:
Identity Roles Session List
Identity Information Session List
Account-to-Identity Map Active List
SNMP Traps contain variable bindings, each of which holds a different piece of information for the event. They are usually sent over UDP to port 162, although the port can be changed. SNMP connectors listen on port 162 (or any other configured port) and process the received traps. They can process traps only from one device with a unique Enterprise OID, but can receive multiple trap types from this device. SNMP is based upon UDP, so there is a slight chance of events being lost over the network. Although there are still some SNMP connectors for individual connectors, most SNMP support is provided by the SmartConnector for SNMP Unified. Parsers use the knowledge of the MIB to map the event fields, but, unlike some other SNMP-based applications, the connector itself does not require the MIB to be loaded
Syslog messages are free-form log messages prefixed with a syslog header consisting of a numerical code (facility + severity), timestamp, and host name. They can be installed as a syslog daemon, pipe, or file connector. Unlike other file connectors, a syslog connector can receive and process events from multiple devices. There is a unique regular expression that identifies the device.
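A minimal sketch of parsing such a header in Python, using the canonical RFC 3164 sample message; the PRI value encodes facility and severity as priority = facility × 8 + severity, so 34 decodes as facility 4 (auth), severity 2 (critical):

```python
import re

# Parse a BSD-style syslog header: <PRI>timestamp hostname message.
line = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed on /dev/pts/8"
m = re.match(r"<(\d{1,3})>(\w{3} [ \d]\d \d\d:\d\d:\d\d) (\S+) (.*)", line)
pri, timestamp, host, msg = m.groups()
facility, severity = divmod(int(pri), 8)  # priority = facility*8 + severity
print(facility, severity, host)           # 4 2 mymachine
```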
Syslog Daemon connectors listen for syslog messages on a configurable port, using port 514 as a default. The default protocol is UDP, but other protocols such as Raw TCP are also supported. It is the only syslog option supported for Windows platforms.
Syslog Pipe connectors require syslog configuration to send messages with a certain syslog facility and severity. The Solaris platform tends to under perform when using Syslog Pipe connectors. The operating system requires that the connector (reader) open the connection to the pipe file before the syslog daemon (writer) writes the messages to it. When using Solaris and running the connector as a nonroot user, using a Syslog Pipe connector is not recommended. It does not include permissions to send an HUP signal to the syslog daemon.
Syslog File connectors require syslog configuration to send messages with a certain syslog facility and severity. For high-throughput connectors, Syslog File connectors perform better than Syslog Pipe connectors because of operating system buffer limitations on pipe transmissions.
Raw Syslog connectors generally do no parsing; they take the syslog string and put it in the rawEvent field as-is. The Raw Syslog destination type takes the rawEvent field and sends it as-is using whichever protocol is chosen (UDP, Raw TCP, or TLS). The Raw Syslog connector is always used with the Raw Syslog destination. The event flow is streamlined to eliminate components that do not add value (for example, with the Raw Syslog transport the category fields in the event are ignored, so the categorization components are skipped). If you are transporting data to ArcSight Logger, you can use specific configuration parameters to provide minimal normalization of the syslog data (for source and timestamp).
Syslog NG Daemon connectors support Syslog NG version 3.0 for BSD syslog format. Support is provided for collection of IETF standard events. This connector is capable of receiving events over a secure (encrypted) TLS channel from another connector (whose destination is configured as CEF Syslog over TLS), and can also receive events from devices
CEF Encrypted Syslog (UDP) connectors allow connector-to-connector communication through an encrypted channel by decrypting events previously encrypted through the CEF Encrypted Syslog (UDP) destination. The CEF connector lets ESM connect to, aggregate, filter, correlate, and analyze events from applications and devices that deliver their logs in the CEF standard, using the syslog transport protocol.
UNIX supports all types of syslog connector. If a syslog process is already running, you can end the process or run the connector on a different port. Because UDP is not a reliable protocol, there is a slight chance of missing syslog messages over the network. Generally, TCP is a supported protocol for syslog connectors. There is a basic syslog connector, the connector for UNIX OS Syslog, which provides the base parser for all syslog sub-connectors. For syslog connector deployment information, see the connector Configuration Guide for UNIX OS Syslog. For device-specific configuration information and field mappings, see the connector configuration guide for the specific device. Each syslog sub-connector has its own configuration guide. During connector installation, for all syslog connectors, choose Syslog Daemon, Syslog Pipe, or Syslog File. The names of the syslog sub-connectors are not listed
IP NetFlow (NetFlow/J-Flow) Retrieves data over TCP in a Cisco-defined binary format.
ArcSight Streaming Connector Retrieves data over TCP from Logger in an ArcSight-proprietary format
Connectors for Transformation Hub
Connectors in Transformation Hub support ArcSight customers who want large-scale distributed ingestion pipelines with 100% availability, where data from any existing or new source at any scale can be ingested while maintaining enterprise-level robustness. Transformation Hub can take messages with raw data collected from any source the ArcSight connector framework understands and automatically perform the data ingestion processing currently done by connectors, but deployed and managed at scale as Transformation Hub processing engines. Users deploy the Transformation Hub using the ArcSight Installer and Management Center to achieve the desired layout. New topics can be created in Management Center and designated to process raw data from a particular technology framework, with output into a specific format.
The connector technology in Transformation Hub performs all processing a connector would normally do: parser selection, normalization, main flow, destination specific flows, and categorization, as well as applying network zoning and Agent Name resolution. For more information, see the ArcSight Transformation Hub Administrator’s Guide and the ArcSight Management Center Administrator’s Guide.
Note: If you are using the Linux Red Hat 6.x or later platforms, ensure that you have these libraries or packages installed before installing a connector:
fontconfig, dejavu-sans-fonts
When installing the 32-bit SmartConnector executable on 64-bit machines, the 32-bit versions of glibc, libXext, libXrender, and libXtst must be installed as well as the 64-bit versions
/tmp – more than 6 GB
/opt – more than 100 GB
CentOS Software Selection
System Administration Tools
Log files /
Make sure that the partition in which your /tmp directory resides has at least 6 GB of space. Make sure that the partition in which your /opt/arcsight directory resides has at least 100 GB of space.
Specifying a Global Event ID Generator ID: global event IDs uniquely identify events across ArcSight.
The Manager host name is used to generate a self-signed certificate. The Common Name (CN) in the certificate is the host name that you specify when prompted
The Manager host name is the IP address (for IPv4 only) or the fully-qualified domain name of the machine where the Manager is installed. All clients (for example, the ArcSight Console) use this name to connect to the Manager. For flexibility, Micro Focus recommends using a fully-qualified domain name instead of an IP address.
Make sure that the IP address 127.0.0.1 is resolved to localhost in the /etc/hosts file, otherwise, the ESM installation will fail. This applies to IPv4 and IPv6 systems.
If you do not want the host name on your DNS server, add a static host entry to the /etc/hosts file to resolve the host name locally.
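For illustration, a static entry in /etc/hosts might look like this (the hostname and address below are placeholders, not values from any real deployment):

```
127.0.0.1    localhost
192.0.2.10   esm-manager.example.com   esm-manager
```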
<ARCSIGHT_HOME>/config/esm.properties (has cluster configuration properties and SSL properties common to persistor, correlator, and aggregator services on the node) This properties file is present on each node in a distributed correlation cluster.
<ARCSIGHT_HOME>/config/jaas.config (with RADIUS or SecurID enabled only, has shared node secret)
<ARCSIGHT_HOME>/config/client.properties (with SSL Client authentication only, has keystore passwords)
<ARCSIGHT_HOME>/reports/sree.properties (to protect the report license)
<ARCSIGHT_HOME>/reports/archive/* (to prevent archived reports from being stolen)
<ARCSIGHT_HOME>/jre/lib/security/cacerts (to prevent injection of new trusted CAs)
<ARCSIGHT_HOME>/lib/* (to prevent injection of malicious code)
<ARCSIGHT_HOME>/rules/classes/* (to prevent code injection)
The xmlrpc.accept.ips property restricts access for ArcSight Consoles.
The agents.accept.ips property restricts access for SmartConnectors.
For registration, the SmartConnectors need to be in xmlrpc.accept.ips as well, so that they can be registered. (Being “registered” does not mean you can then remove them.)
The format for specifying subnets is quite flexible, as shown in the following example:
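The original example did not survive here; as an illustrative (unverified) sketch only, entries of this kind might look like the following. Check the ESM Administrator's Guide for the exact subnet syntax your version supports:

```
# server.properties — hypothetical values, not from real documentation
xmlrpc.accept.ips=192.0.2.10,10.0.0.0/8
agents.accept.ips=10.1.*,192.0.2.0/24
```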
//Use CentOS 7.6 - http://ftp.iij.ad.jp/pub/linux/centos-vault/7.6.1810/
Boot into Troubleshooting —> install CentOS 7 in basic graphics mode
Download the ArcSightESMSuite-7.0.0.xxxx.1.tar from https://softwaresupport.softwaregrp.com/.
scp [email protected]:tmp/esminstall
//Install TMUX for remote installations
yum install tmux
tmux attach -t number-of-session
// USB Mount
mount -v -t auto /dev/sdf1 /mnt/usb
//Nic on laptop enp0s31f6
nmtui edit enp0s31f6
// Add hostname to IP address in hosts file
fdisk -l
mkdir /mnt/usb
mount -v -t auto /dev/sdf1 /mnt/usb
// Unarchive installer
Create arcsight user with GUID and SU rights
Create a folder called esm_installer
chown arcsight: esm_installer
tar xvf ArcSightESMSuite-7.0.0.xxxx.1.tar
// Copy the license files to same location
ulimit -a (open files 65536 / max user processes 10240)
// Download and set Timezone
wget tzdata-2019b-1.el7.noarch.rpm /opt/work/
rpm -Uvh /opt/work/tzdata-2019b-1.el7.noarch.rpm
sudo yum install tzdata -y
timedatectl list-timezones | egrep -i "australia"
timedatectl set-timezone “Asia/Kolkata”
timedatectl set-timezone America/Los_Angeles
timedatectl set-timezone UTC
timedatectl set-time 15:58:30
timedatectl set-time 20151120
timedatectl | grep local
timedatectl set-local-rtc 1
timedatectl set-local-rtc 0
timedatectl set-ntp true
su arcsight (enter the arcsight user's password)
Login under user account: arcsight into Console and install
/etc/init.d/arcsight_services stop all
/opt/arcsight/manager/bin/arcsight tzupdater /opt/arcsight /opt/arcsight/manager/lib/jre-tools/tzupdater
/etc/init.d/arcsight_services start all
//Starting the installer
chmod +x /tmp/esm_install/ArcSightESMSuite.bin
chown -R arcsight:arcsight ../Tools
// Error: You are installing this product on an unsupported platform.
// If you are installing on a later version, you might need to manually downgrade the version string, then update it later
sudo nano /etc/centos-release
sudo nano /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
CentOS Linux release 7.6 (Core)
// LOGIN into CONSOLE as arcsight
./ArcSightESMSuite.bin -i console
/opt/arcsight/manager/bin/arcsight firstbootsetup -boxster -soft -i console
/opt/arcsight/kubernetes/scripts/cdf-updateRE.sh > /tmp/ca.crt
//To install the time zone update package after you complete the ESM
/etc/init.d/arcsight_services stop all
/opt/arcsight/manager/bin/arcsight tzupdater /opt/arcsight
/etc/init.d/arcsight_services start all
// As arcsight user
// Install ESM Login under user account: arcsight into Console and install
/opt/arcsight/manager/bin/arcsight firstbootsetup -boxster -soft -i console
IMPORTANT: The root user must run the following script to start up required services:
// START SERVICES as arcsight user
/etc/init.d/arcsight_services stop all
/etc/init.d/arcsight_services start all
//Set the hostname in local hosts file
// macOS: /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --ignore-certificate-errors &> /dev/null &
// Access https://arcsight:8443
// Chrome SSL error: type "thisisunsafe"
// Remove ESM
Remove all files in /tmp and /opt/arcsight: rm -r *
The volume or partition required for installation of the /opt/arcsight directory does not contain the minimum of 50 GB of space to successfully install ArcSight.
0- ArcSight Content Management - This package contains resources to track content that is being managed across multiple ESM systems.
1- ArcSight ESM HA Monitoring - This package contains resources to track High Availability (HA) status and changes.
2- ArcSight Transformation Hub Monitoring - This package contains resources for monitoring Transformation Hub.
3- Security Threat Monitoring - This package contains default security threat monitoring content.
4- Threat Intelligence Platform - This package contains default content for threat intelligence platform.
Install ArcSight Console
Disable Hyper-Threading. This setting exists on most server-class processors (for example, Intel processors) that support hyper-threading. AMD processors do not have an equivalent setting.
Disable Intel VT-d. This setting is specific to Intel processors and is likely to be present on most recent server-class processors. AMD processors have an equivalent setting called AMD-Vi.
Set Power Regulator to Static High Performance. This setting tells the CPU(s) to always run at high speed, rather than slowing down to save power when the system senses that load has decreased. Most recent CPUs have an equivalent setting.
Set Thermal Configuration to Increased Cooling. This setting increases the server fan speed to avoid issues with the increased heat that results from constantly running the CPU(s) at high speed.
Enable the Minimum Processor Idle Power Package State setting. This setting tells the CPU not to use any of its C-states (various states of power saving in the CPU).
Set Power Profile to Maximum Performance. This setting results in the following changes:
QPI power management (the link between physical CPU sockets) is disabled.
PCIe support is forced to Gen 2.
C-states are disabled.
Lower speed settings on the CPUs are disabled so that the CPUs constantly run at high speed.
// Ensure the full Java version is installed on CentOS
[arcsight@vm-esm700-demo ~]$ java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
Micro Focus has many product lines that are very interesting for cybersecurity integrations.
ArcSight Connectors automate the process of collecting and managing logs from any device and in any format through normalization and categorization of logs into a unified format known as Common Event Format (CEF), which is now an industry standard for log formats. You can use this unified data for searching, reporting, analyzing, or storing logs. ArcSight Connectors also manage ongoing updates, upgrades, configuration changes, and administration of distributed deployments through a centralized web-based interface. They can be deployed as software or on an appliance.
ArcSight Connectors helps you with:
Scale easily to manage extreme machine data across IT
Reduce the cost of handling large volumes of logs and events in various formats
Automate the process of managing connectors to collect audit-quality log data
Share, upload, or download connectors within your ArcSight community
Seamlessly integrate with the ArcSight platform
Broadest set of built-in connectors that collect, aggregate, filter, and parse the logs
Managing log records in hundreds of different formats from hundreds of vendors
Patented technology to normalize and categorize logs that enables full-text English searching on rich metadata
High compression of log data up to 10:1 to reduce your storage costs significantly
Automate bandwidth management with low footprint
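To make the CEF idea above concrete, here is a minimal sketch of building a CEF event line. The header layout (version, vendor, product, device version, signature ID, name, severity, then key=value extensions) is the standard CEF shape; the vendor/product names, event values, and the helper function itself are illustrative, and real CEF additionally requires escaping of pipe and equals characters, which this sketch omits.

```python
# Sketch: composing a minimal CEF (Common Event Format) event line.
# Header: CEF:Version|Device Vendor|Device Product|Device Version|Signature ID|Name|Severity|Extension
def to_cef(vendor, product, version, sig_id, name, severity, **ext):
    header = f"CEF:0|{vendor}|{product}|{version}|{sig_id}|{name}|{severity}|"
    extension = " ".join(f"{k}={v}" for k, v in ext.items())  # key=value pairs, space-separated
    return header + extension

line = to_cef("Acme", "Firewall", "1.0", "100", "Port scan detected", 5,
              src="10.0.0.5", dst="10.0.0.9", spt=1232)
print(line)
# CEF:0|Acme|Firewall|1.0|100|Port scan detected|5|src=10.0.0.5 dst=10.0.0.9 spt=1232
```

A line in this shape is what you would produce into a Transformation Hub CEF topic.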
FlexConnector The FlexConnector framework is a software development kit (SDK) that enables you to create your own SmartConnector tailored to the nodes on your network and their specific event data. FlexConnector types include file reader, regular expression file reader, time-based database reader, syslog, and Simple Network Management Protocol (SNMP) readers.
Forwarding Connector The Forwarding Connectors forward events between multiple Managers in a hierarchical ESM deployment, and/or to one or more Logger deployments.
The ArcSight Manager is the heart of the solution. It is a Java-based server that drives analysis, workflow, and services. It also correlates output from a wide variety of security systems. The Manager writes events to the CORR-Engine as they stream into the system. It simultaneously processes them through the correlation engine, which evaluates each event with network model and vulnerability information to develop real-time threat summaries. ESM comes with default configurations and standard foundation use cases consisting of filters, rules, reports, data monitors, dashboards, and network models that make ESM ready to use upon installation.
The Correlation Optimized Retention and Retrieval (CORR) Engine is a proprietary data storage and retrieval framework that receives and processes events at high rates, and performs high-speed searches.
Security Use Case and Activate Framework Marketplace
ArcSight Activate Framework is a modular content development framework that allows you to implement ArcSight SIEM quickly and effectively. The framework provides a standard way of creating content. Standardized content means new analysts and engineers can easily review and understand existing content reducing the ramp-up time for new employees. It also opens up the possibility of sharing content with other ArcSight users. Best of all, the base content has been created from 10 years of experience implementing ArcSight in thousands of environments. What does this mean? It is proven and it works! ArcSight Activate Framework makes implementing SIEM easy. It helps you with:
Deploy modular content and standardized use cases to implement ArcSight quickly and effectively in your environment with minimal setup required.
Enable inexperienced users to create content quickly. Content created is easier to understand, reducing training and maintenance costs.
Provide a standardized approach to creating content that can be shared between ArcSight installations and within the community to easily keep up on the latest IT security threats. This results in a robust SIEM that is easier to set up and maintain.
Leverage proven use cases developed by ArcSight SIEM experts to provide a robust implementation to increase your effectiveness and deployment success.
ArcSight Interactive Discovery (AID) is a separate software application that augments Pattern Discovery, dashboards, reports, and analytical graphics. AID provides enhanced historical data analysis and reporting capabilities using a comprehensive selection of pre-built interactive statistical graphics. You can use AID to:
Quickly gain visibility into your complex security data
Explore and drill down into security data with precision control and flexibility
Accelerate discovery of hard-to-find events that may be dangerous
Present the state of security in compelling visual summaries
Build a persuasive, non-technical call to action
Prove IT Security value and help justify budgets
Pattern Discovery can automatically detect subtle, specialized, or long-term patterns that might otherwise go undiscovered in the flow of events. You can use Pattern Discovery to:
Discover zero-day attacks - Because Pattern Discovery does not rely on encoded domain knowledge (such as predefined rules or filters), it can discover patterns that otherwise go unseen, or are unique to your environment.
Detect low-and-slow attacks - Pattern Discovery can process up to a million events in just a few seconds (excluding read-time from the disk). This makes Pattern Discovery effective at capturing even low-and-slow attack patterns.
Profile common patterns on your network - New patterns discovered from current network traffic are like signatures for a particular subset of network traffic. By matching against a repository of historical patterns, you can detect attacks in progress. The patterns discovered in an event flow that either originate from or target a particular asset can be used to categorize those assets. For example, a pattern originating from machines that have a back door (an unauthorized program that initiates a connection to the attacker) installed can all be visualized as a cluster. If you see the same pattern originating from a new asset, it is a strong indication that the new asset also has a back door installed.
Automatically create rules - The patterns discovered can be transformed into a complete rule set with a single mouse click. These rules are derived from data patterns unique to your environment, whereas predefined rules must be generic enough to work in many customer environments.
Pattern Discovery is a vital tool for preventive maintenance and early detection in your ongoing security management operations. Using periodic, scheduled analysis, you can always be scanning for new patterns over varying time intervals to stay ahead of new exploitative behavior.
Logger ArcSight Logger is an event data storage appliance that is optimized for extremely high event throughput. Logger stores security events on board in compressed form, but can always retrieve unmodified events on demand for historical, analysis-quality litigation data. Logger can be deployed stand-alone to receive events from syslog messages or log files, or to receive events in Common Event Format from SmartConnectors. Logger can forward selected events as syslog messages to ESM. Multiple Loggers work together to scale up to support high sustained input rates. Event queries are distributed across a peer network of Loggers.
Content, Solutions, and CIPs for ESM and Logger
ArcSight ESM Compliance Insight Package for the Payment Card Industry (PCI) version 4.1 is now generally available. It can be downloaded by licensed customers from the HP support web site, along with the solution guide and release notes.
ESM Compliance Insight Package for PCI 4.1 contains the following important updates:
Support for PCI requirements specified in Payment Card Industry Data Security Standard 3.2 (PCI DSS 3.2)
Support for logs generated by applications subject to Payment Application Data Security Standard 3.2 (PA DSS 3.2)
About ESM Compliance Insight Package for PCI:
The ESM Compliance Insight Package for PCI provides a system of reports and real-time checks specifically designed to monitor systems that contain cardholder data, manage vulnerability and access control, monitor networks, and maintain security policies to help demonstrate to stakeholders and auditors that the controls over your company’s credit card data systems expose little or no risk.
ESM uses objects called resources to manage event-processing logic. A resource defines the properties, values, and relationships used to configure the functions that ESM performs. Resources can also be the output of such a configuration (such as archived reports, or Pattern Discovery snapshots and patterns).
ESM has more than 30 different types of resources and comes with hundreds of these resources already configured to give you functionality as soon as the product is installed. These resources are presented in the Navigator panel of the ArcSight Console.
Modeling Resources "The Network Model" on page 120 enables you to build a business-oriented view of data derived from physical information systems. These distinctions help ESM to clearly identify events in your network, providing additional layers of detail for correlation. "The Actor Model" on page 146 creates a real-time user model that maps humans or agents to activity in applications and on the network. Once the actor model is in place, you can use category models to visualize relationships among actors, and correlation to determine if their activity is above board.
Assets
Asset Ranges
Asset Categories
Zones
Networks
Customers
Vulnerabilities
Locations
Actors
Category Models
Correlation Resources Correlation is a process that discovers the relationships between events, infers the significance of those relationships, prioritizes them, then provides a framework for taking action.
Filters
Rules
Data Monitors
Active Lists
Session Lists
Integration Commands
Pattern Discovery
Monitoring and Investigation Resources Active channels and dashboards are tools that monitor all the activity that ESM processes for your network. Each of these views enables you to drill down on a particular event or series of events in order to investigate their details. Saved searches are those you run on a regular basis. They include query statements, the associated field set, and a specified time range. Search filters contain only the query statements. You define and save searches and search filters in the ArcSight Command Center, and export these resources as packages in the ArcSight Console.
Active Channels
Field Sets
Saved Searches and Search Filters
Dashboards
Query Viewers
Workflow and User Management Resources Workflow refers to the way in which people in your organization are informed about incidents, how incidents are escalated to other users, and how incident responses are tracked.
Annotations
Cases
Stages
Users and User Groups
Notifications
Knowledge Base
Reference Pages
Reporting Resources Reporting resources work together to create batch-oriented functions used to analyze incidents, find new patterns, and report on system activity.
Reports
Queries
Trends
Templates
Focused Reports
Administration Resources Administration resources are tools that manage ESM's daily maintenance and long-term health.
Packages
Files
Storage and storage volumes
Retention periods
Standard Content Standard content is a series of coordinated resources that address common enterprise network security and ESM management tasks. Many of these resources are installed automatically with ESM to provide essential system health and status operations. Others are presented as install-time options organized by category.
ArcSight Administration
ArcSight System
Content Synchronization and Management Content synchronization provides the ability to publish content from one ESM instance to multiple ESM instances. Synchronization is managed through the creation of supported packages, establishment of ESM subscribers, and scheduling the publication of content.
Packages
Normalising Event Data
Normalize means to conform to an accepted standard or norm. Because networks are heterogeneous environments, each device has a different logging format and reporting mechanism. You may also have logs from remote sites where security policies and procedures may be different, with different types of network devices, security devices, operating systems and application logs. Because the formats are all different, it is difficult to extract information for querying without normalizing the events first. The following examples are logs from different sources that each report on the same packet traveling across the network. These logs represent a remote printer buffer overflow that connects to IIS servers over port 80.
In order to productively store this diverse data in a common data store, SmartConnectors evaluate which fields are relevant and arrange them in a common schema. The choice of fields is content driven, not based on syntactic differences between what Checkpoint may call target address and what Cisco calls destination address. To normalize, SmartConnectors use a parser to pull out those values from the event and populate the corresponding fields in the schema. Here is a very simple example of these same alerts after they have been normalized.
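A connector-style parser can be sketched in a few lines: two regular expressions pull the same three values out of two differently worded log lines and populate one common schema. The log formats below are invented for illustration (they are not actual Checkpoint or Cisco output); the schema field names follow the CEF-style sourceAddress/destinationAddress/destinationPort convention.

```python
# Sketch of connector-style normalization: different device vocabularies,
# one common schema. The raw log formats here are invented for illustration.
import re

checkpoint_re = re.compile(r"src=(?P<src>\S+) target=(?P<dst>\S+) service=(?P<port>\d+)")
cisco_re      = re.compile(r"from (?P<src>\S+) to destination (?P<dst>\S+) port (?P<port>\d+)")

def normalize(line):
    """Map a raw log line onto a common schema, regardless of source format."""
    for parser in (checkpoint_re, cisco_re):
        m = parser.search(line)
        if m:
            return {"sourceAddress": m.group("src"),
                    "destinationAddress": m.group("dst"),
                    "destinationPort": int(m.group("port"))}
    return None  # unparsed: would be forwarded as a raw event

# Two devices report the same packet; both now populate identical fields.
a = normalize("src=10.0.111.2 target=10.0.6.2 service=80")
b = normalize("connection from 10.0.111.2 to destination 10.0.6.2 port 80")
assert a == b
```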
Another factor in normalization is converting timestamps to a common format. Since the devices may all use different time zones, ESM normalization converts the timestamps to UTC (GMT).
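The timestamp step can be sketched as follows: two devices in different zones report the same instant, and after conversion to UTC the timestamps compare equal. This is a minimal illustration of the principle, not ESM's actual implementation; the zone names and sample times are chosen for the example.

```python
# Sketch: normalizing device timestamps from different time zones to UTC.
from datetime import datetime
from zoneinfo import ZoneInfo

def to_utc(local_str, tz_name):
    """Attach the device's zone to a naive timestamp, then convert to UTC."""
    local = datetime.strptime(local_str, "%Y-%m-%d %H:%M:%S").replace(tzinfo=ZoneInfo(tz_name))
    return local.astimezone(ZoneInfo("UTC"))

# The same instant reported by devices in two zones normalizes identically.
a = to_utc("2024-01-15 07:00:00", "America/New_York")  # UTC-5 in January
b = to_utc("2024-01-15 13:00:00", "Europe/Berlin")     # UTC+1 in January
assert a == b
```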
During the normalization process, the SmartConnector collects data about the level of danger associated with a particular event as interpreted by the data source that reported the event to the connector. These data points, device severity and agent severity, become factors in calculating the event's overall priority described in "Evaluate the Priority Formula" on page 41.
Device severity captures the language used by the data source to describe its interpretation of the danger posed by a particular event. For example, if a network IDS detects a DHCP packet that does not contain enough data to conform to the DHCP format, the device flags this as a high-priority exploit.
Agent severity is the translation of the device severity into ESM-normalized values. For example, Snort uses a device severity scale of 1-10, whereas Checkpoint uses a scale of high, medium and low. ESM normalizes these values into a single agent severity scale. The default ESM scale is Low, Medium, High, and Very High. An event can also be classified as AgentSeverity Unknown if the data source did not provide a severity rating.
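The severity translation above can be sketched as a simple lookup. The breakpoints chosen for the Snort 1-10 scale are assumptions for illustration only, not ESM's actual mapping; the document only establishes the input scales and the Low/Medium/High/Very High/Unknown output scale.

```python
# Sketch: translating vendor-specific device severities onto one agent
# severity scale. Snort breakpoints below are illustrative assumptions.
def agent_severity(vendor, value):
    if vendor == "snort":                      # numeric scale 1-10
        v = int(value)
        if v <= 3:  return "Low"
        if v <= 6:  return "Medium"
        if v <= 8:  return "High"
        return "Very High"
    if vendor == "checkpoint":                 # textual high/medium/low
        return {"low": "Low", "medium": "Medium", "high": "High"}.get(str(value).lower(), "Unknown")
    return "Unknown"                           # source provided no severity

assert agent_severity("snort", 9) == "Very High"
assert agent_severity("checkpoint", "High") == "High"
assert agent_severity("syslog", "-") == "Unknown"
```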
Like the logs themselves, different security devices also include a model for describing the characteristics of the events they process. But no two devices or vendors use the same event-characteristic model. To solve this problem, ArcSight has developed a common model for describing events, which enables you to understand the real significance of a particular event as reported from different devices. This common model also enables you to write device-independent content that can correlate events with normalized characteristics. This model is expressed as event categories, and the SmartConnector assigns them using default criteria, which can be configured during connector setup. Event categories are a series of six criteria that translate the core meaning of an event from the system that generated it into a common format. These six criteria, taken individually or together, are a central tool in ESM's analysis capability.
Correlation is a four-dimensional process that draws upon the network model, the priority formula, and optionally, Pattern Discovery to discover, infer meaning, prioritize, and act upon events that meet specific conditions. For example, various systems on a network may report the following events:
UNIX operating system: multiple failed log-ins
IDS: attempted brute force attack
Windows operating systems: multiple failed log-ins
A correlation rule puts these data points together and detects five or more failed log-ins in a one-minute period targeting the same source. Based on these facts, this combination of events is considered an attempted brute force attack. The Windows operating system next reports a successful log-in from the same source. The attempted brute force attack followed by a successful log-in from the same source elevates the risk that the attack may have been successful. To verify whether an attack was successful, you can analyze the volume of traffic going to the Windows target. In this case, a sudden spike in traffic to this target can verify that a brute force attack was successful. ESM's correlation tools use statistical analysis, Boolean logic, and aggregation to find events with particular characteristics you specify. Rules can then take automated action to protect your network.
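The rule described above (five or more failures inside a one-minute window, then a success from the same source) can be sketched as a sliding-window aggregation. This is a toy model of the correlation logic, not ESM's rule engine; the event tuple format is invented for the example.

```python
# Sketch of the correlation rule: >= 5 failed log-ins within a 60-second
# window from one source, later followed by a successful log-in from it.
from collections import defaultdict, deque

WINDOW = 60  # seconds

def brute_force_then_success(events):
    """events: time-ordered iterable of (timestamp_sec, source_ip, outcome)."""
    failures = defaultdict(deque)   # source -> timestamps of recent failures
    suspects = set()                # sources that crossed the failure threshold
    alerts = []
    for ts, src, outcome in events:
        if outcome == "failure":
            q = failures[src]
            q.append(ts)
            while q and ts - q[0] > WINDOW:   # drop failures outside the window
                q.popleft()
            if len(q) >= 5:
                suspects.add(src)
        elif outcome == "success" and src in suspects:
            alerts.append((ts, src, "possible successful brute force"))
    return alerts

evts = [(t, "10.0.0.5", "failure") for t in range(0, 50, 10)] + [(55, "10.0.0.5", "success")]
assert brute_force_then_success(evts) == [(55, "10.0.0.5", "possible successful brute force")]
```

In production this aggregation runs inside the Manager's correlation engine against normalized events, which is why the normalization steps above matter: the rule only has to reason about one schema.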