Rohitdadlani
6 min read · Jun 5, 2021

Cyber Crime On Confusion Matrix

What is a Confusion Matrix?

A confusion matrix summarizes how the predicted results of a classification model compare with the actual results. This summary is essential for judging how well the model performs after it has been trained on some training data.

True Positive:

Interpretation: You predicted positive and it’s true.

You predicted that a woman is pregnant and she actually is.

True Negative:

Interpretation: You predicted negative and it’s true.

You predicted that a man is not pregnant and he actually is not.

False Positive: (Type 1 Error)

Interpretation: You predicted positive and it’s false.

You predicted that a man is pregnant but he actually is not.

False Negative: (Type 2 Error)

Interpretation: You predicted negative and it’s false.

You predicted that a woman is not pregnant but she actually is.

Just remember: "Positive" and "Negative" describe the predicted value, while "True" and "False" describe whether that prediction matched the actual value.
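To make the four outcomes concrete, here is a minimal Python sketch that tallies them from two lists of labels. The function name `confusion_counts` and the sample data are illustrative assumptions, not taken from any library or from a real dataset.

```python
def confusion_counts(actual, predicted):
    """Tally TP, TN, FP and FN for binary labels (True means positive)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["TP"] += 1  # predicted positive, actually positive
        elif not p and not a:
            counts["TN"] += 1  # predicted negative, actually negative
        elif p and not a:
            counts["FP"] += 1  # Type 1 error: predicted positive, actually negative
        else:
            counts["FN"] += 1  # Type 2 error: predicted negative, actually positive
    return counts


# Hypothetical data: True means "pregnant"
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]
print(confusion_counts(actual, predicted))
```

Each pair of labels lands in exactly one of the four cells, so the counts always add up to the number of predictions.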

Accuracy and Components of Confusion Matrix

Once the confusion matrix is created and we have determined all of its component values, it becomes quite easy to calculate the accuracy. So, let us have a look at the components to understand this better.

Classification Accuracy
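
Classification accuracy is the fraction of all predictions that were correct: (TP + TN) / (TP + TN + FP + FN). The short sketch below computes it from the four counts. The function name `accuracy` and the numbers are assumptions chosen for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that matched the actual value."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no predictions to score")
    return (tp + tn) / total


# Hypothetical balanced case: 4 of 6 predictions are correct
print(accuracy(tp=2, tn=2, fp=1, fn=1))

# Hypothetical imbalanced case: a model that never raises an alert
# still scores high when only 10 of 1,000 events are real threats
print(accuracy(tp=0, tn=990, fp=0, fn=10))
```

The second call hints at why accuracy alone can mislead in security work: when threats are rare, a detector can miss every one of them and still look accurate, which is why the false positive and false negative counts deserve separate attention.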

An Overview of False Positives and False Negatives

Understanding the differences between false positives and false negatives, and how they relate to cybersecurity, is important for anyone working in information security. Why? Investigating false positives wastes time and resources and distracts your team from the real cyber incidents (alerts) originating from your SIEM.

On the flip side, false negatives (uncaught threats) increase your cyber risk, reduce your ability to respond to those attackers, and, in the event of a data breach, could lead to the end of your business…

What Are False Positives?

False positives are mislabeled security alerts that indicate a threat when, in actuality, there isn’t one. These false/non-malicious alerts (SIEM events) increase noise for already overworked security teams and can stem from software bugs, poorly written software, or unrecognized network traffic.

By default, most security teams are conditioned to ignore false positives. Unfortunately, this practice of ignoring security alerts, no matter how trivial they may seem, can create alert fatigue and cause your team to miss actual, important alerts related to real, malicious cyber threats (as was the case with the Target data breach).

These false alarms account for roughly 40% of the alerts cybersecurity teams receive each day and, at large organizations, can be overwhelming and a huge waste of time.

What Are False Negatives?

False negatives are uncaught cyber threats, overlooked by security tooling because they’re dormant or highly sophisticated (e.g. file-less or capable of lateral movement), or because the security infrastructure in place lacks the technological ability to detect them.

These advanced/hidden cyber threats are capable of evading prevention technologies, like next-gen firewalls, antivirus software, and endpoint detection and response (EDR) platforms trained to look for “known” attacks and malware.

No cybersecurity or data breach prevention technology can block 100% of the threats it encounters. False negatives are among the 1% (roughly) of malware and cyber threats that most methods of prevention are prone to miss.
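
The two kinds of error can be expressed as rates over a batch of security events. The sketch below is a rough illustration only: the function name `detection_rates` and every count in it are hypothetical, chosen so that about 40% of alerts are false alarms and about 1% of real threats are missed, mirroring the rough figures quoted above.

```python
def detection_rates(tp, tn, fp, fn):
    """Summarize false alarms and misses for a batch of security events."""
    alerts = tp + fp      # everything the tool flagged
    benign = fp + tn      # events that were actually harmless
    malicious = tp + fn   # events that were actually threats
    return {
        # share of raised alerts that turned out to be false alarms
        "false_alert_share": fp / alerts if alerts else 0.0,
        # share of harmless events that were wrongly flagged
        "false_positive_rate": fp / benign if benign else 0.0,
        # share of real threats that slipped through unflagged
        "false_negative_rate": fn / malicious if malicious else 0.0,
    }


# Hypothetical batch: 10,000 events, 100 of them malicious
rates = detection_rates(tp=99, tn=9834, fp=66, fn=1)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
```

Note that the share of alerts that are false and the false positive rate use different denominators, so a tool can flag well under 1% of harmless events and still produce a queue in which a large share of alerts are noise.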

Strengthening Your Cybersecurity Posture

The existence of both false positives and false negatives raises the question: Does your cybersecurity strategy include proactive measures? Most security programs rely on preventative and reactive components, establishing strong defenses against the attacks those tools know exist. On the other hand, proactive security measures include implementing incident response policies and procedures and proactively hunting for hidden/unknown attacks.

Here are a few simple rules to help govern your approach to cybersecurity with a preventative, reactive, and proactive mindset:

  • Assume you’re breached and begin your offensive (proactive) initiatives with the goal of finding those breaches. By doing so, you’ll seek to validate the strength of your defensive/prevention tools with the understanding that none of them are 100% effective.
  • Use asset discovery tools to discover the hosts, systems, servers, and applications within your network environment, because you can’t protect what you don’t know exists.
  • Execute regular compromise assessments (we recommend at least once a week) and inspect every asset residing on your network.
  • Define security policies and procedures, and implement educational/training requirements so your entire team knows what to do in the event you discover a hidden breach, or worse, fall victim to a data breach.
  • Time is your most valuable asset, so implementing tools and technology that speed up detection and response is key and can help your security team prevent a data breach.

Steps to Improve Forensic Analytics

Forensic analytics — the combination of advanced analytics, forensic accounting and investigative techniques — is making breakthroughs every day in identifying rare events of fraud, corruption and other schemes. To meet rising regulatory and customer demand for fraud mitigation, forensic analytics can reveal signals of emerging risks months — or sometimes even years — before they happen. Of course, predicting anomalous events can also create false positives.

In an effort to reduce false positives in fraud investigations, careful attention should be paid to steps including:

  1. Create an analytics repository: Consolidate and integrate data from disparate sources so analytical models can take an enterprise-wide approach to anomalous activity detection.
  2. Employ network mapping and analysis: Explore fraudsters’ networks, affinities and relationships, as well as others committing similar illicit acts.
  3. Leverage both supervised and unsupervised modeling: Supervised modeling employs algorithms to sift through data, applying historical fraud patterns and digital fingerprints of fraudsters to new data and scoring the level of risk involved in new events based on historical data. Unsupervised modeling uses algorithms to sift through data independent of patterns relating to known historical cases, looking for new events following unprecedented patterns.
  4. Use natural language processing (NLP): Sift through unstructured data, including emails, messaging, audio and video files, to unearth unexpected nuance in communication or connections that are otherwise unclear in structured, text-only data. For example, the ability of NLP to analyze word choice, tone and possible stress levels expressed in a voicemail can sometimes offer more insight during investigations than text on a page alone could offer.
  5. Training and self-learning: Train analytics to learn from a variety of data sources, such as risk issues the organization has confronted in the past. The corresponding models can adapt over time to future risks.
  6. Backtesting: Scientifically test forensic analytics performance to evaluate its continued use. Backtesting can help establish confidence that pattern recognition models and algorithms work well and are effective in finding suspicious patterns of interest.
  7. Iterative approach: Iteratively develop, adapt and scale forensic analytics models so they respond to new and evolving fraud patterns. At the same time, develop a broader view of the risks an enterprise may face. This approach enables an organization to build the forensic analytics platform in stages, one step at a time with input and validation from the business stakeholders, while still staying a step ahead of bad actors.
  8. Feedback and continuous improvement: Incorporate feedback from the results of each investigation, from the continually growing body of forensic accounting and investigation knowledge and insight, and from the input of stakeholders across the enterprise, in an effort to continuously improve forensic analytics solution effectiveness.
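
The backtesting step can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the function name `backtest`, the risk scores, the confirmed outcomes and the thresholds are all hypothetical, and a real forensic analytics platform would replay far richer historical data.

```python
def backtest(scores, labels, threshold):
    """Count false positives and false negatives for one alert threshold.

    scores: model risk scores for historical cases (higher means riskier)
    labels: True where the case was later confirmed as fraud
    """
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    return fp, fn


# Hypothetical historical cases: risk score and confirmed outcome
scores = [0.10, 0.35, 0.40, 0.55, 0.62, 0.80, 0.91, 0.97]
labels = [False, False, True, False, False, True, True, True]

for threshold in (0.3, 0.5, 0.7):
    fp, fn = backtest(scores, labels, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

Replaying past cases at several thresholds shows the trade-off directly: raising the threshold cuts false positives but lets more confirmed fraud slip through as false negatives.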
