Cybersecurity & Confusion matrix

Rishabh Jain
5 min readJun 4, 2021

Task Description 📄

📌 Cyber crime cases where they talk about confusion matrix or its two types of error.

Hello Everyone…

Today I am here with a new article. Which is based on Confusion Matrix. In this article we are going to what is Confusion Matrix, why we need it? and also it’s types and lots of interesting things …

So lets jump on our today’s interesting topic guys…

What is Cybersecurity?

Cyber security is the practice of defending computers, servers, mobile devices, electronic systems, networks, and data from malicious attacks. It’s also known as information technology security or electronic information security. The term applies in a variety of contexts, from business to mobile computing, and can be divided into a few common categories.

Network security is the practice of securing a computer network from intruders, whether targeted attackers or opportunistic malware.

Application security focuses on keeping software and devices free of threats. A compromised application could provide access to the data its designed to protect. Successful security begins in the design stage, well before a program or device is deployed.

Information security protects the integrity and privacy of data, both in storage and in transit.

Operational security includes the processes and decisions for handling and protecting data assets. The permissions users have when accessing a network and the procedures that determine how and where data may be stored or shared all fall under this umbrella

Disaster recovery and business continuity define how an organization responds to a cyber-security incident or any other event that causes the loss of operations or data. Disaster recovery policies dictate how the organization restores its operations and information to return to the same operating capacity as before the event. Business continuity is the plan the organization falls back on while trying to operate without certain resources.

End-user education addresses the most unpredictable cyber-security factor: people. Anyone can accidentally introduce a virus to an otherwise secure system by failing to follow good security practices. Teaching users to delete suspicious email attachments, not plug in unidentified USB drives, and various other important lessons is vital for the security of any organization.

What is Confusion Matrix ?

A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by a classifier. It is used to measure the performance of a classification model. It can be used to evaluate the performance of a classification model through the calculation of performance metrics like accuracy, precision, recall, and F1-score.

If you have an imbalanced dataset to work with, it’s always better to use confusion matrix as your evaluation criteria for your machine learning model.

Let’s assume class A is positive class and class B is negative class. The key terms of confusion matrix are as follows:

Let’s assume class A is positive class and class B is negative class. The key terms of confusion matrix are as follows:

  • True positive (TP): Predicting positive class as positive (ok)
  • False positive (FP): Predicting negative class as positive (not ok)
  • False negative (FN): Predicting positive class as negative (not ok)
  • True negative (TN): Predicting negative class as negative (ok)

What can we learn from this?

A valid question arises that what we can do with this matrix. There are some important terminologies based on this:

  1. Precision: It is the portion of values that are identified by the model as correct and are relevant to the problem statement solution. We can also quote this as values, which are a portion of the total positive results given by the model and are positive. Therefore, we can give its formula as TP/ (TP + FP).
  2. Recall: It is the portion of values that are correctly identified as positive by the model. It is also termed as True Positive Rate or Sensitivity. Its formula comes out to be TP/ (TP+FN).
  3. F-1 Score: It is the harmonic mean of Precision and Recall. It means that if we were to compare two models, then this metric will suppress the extreme values and consider both False Positives and False Negatives at the same time. It can be quoted as 2*Precision*Recall/ (Precision+Recall).
  4. Accuracy: It is the portion of values that are identified correctly irrespective of whether they are positives or negatives. It means that all True positives and True negatives are included in this. The formula for this is (TP+TN)/ (TP+TN+FP+FN).

Out of all the terms, precision and recall are most widely used. Their tradeoff is a useful measure of the success of a prediction. The desired model is supposed to have high precision and high recall, but this is only in perfectly separable data. In practical use cases, the data is highly unorganized and imbalanced.

CYBER CRIME CASES AND CONFUSION MATRIX :

In the present world, cybercrime offenses are happening at an alarming rate. As the use of the Internet is increasing many offenders, make use of this as a means of communication in order to commit a crime. Cybercrime will cost nearly $6 trillion per annum by 2021 as per the cybersecurity ventures report in 2020. For illegal activities, cybercriminals utilize any network computing devices as a primary means of communication with a victims’ devices, so attackers get profit in terms of finance, publicity and others by exploiting the vulnerabilities over the system. Cybercrimes are steadily increasing daily.

Evaluating cybercrime attacks and providing protective measures by manual methods using existing technical approaches and also investigations has often failed to control cybercrime attacks. Existing literature in the area of cybercrime offenses suffers from a lack of a computation methods to predict cybercrime, especially on unstructured data. Therefore, this study proposes a flexible computational tool using machine learning techniques to analyze cybercrimes rate at a state wise in a country that helps to classify cybercrimes.

Security analytics with the association of data analytic approaches help us for analyzing and classifying offenses from India-based integrated data that may be either structured or unstructured. The main strength of this work is testing analysis reports, which classify the offenses accurately with 99 percent accuracy.

That’s All Guys…

Thanks for patient reading my article🤗

See you soon with new article..

For any help or suggestions find me on LinkedIn.

--

--