"Knowing a great deal is not the same as being smart; intelligence is not information alone but also judgment, the manner in which information is collected and used." - Carl Sagan

Graduate Research Assistant at Purdue University
M.Sc. in Statistics.
Specialist in Statistics.
Specialist in Management for Engineers.
GIAC Certified Forensic Analyst.
Electronic Engineer.

Paper: **Classification and Regression Trees for Handling Missing Values in a CMDB to reduce malware in an Information System**

Authors: Gustavo A Valencia-Zapata, Juan C Salazar-Uribe, Ph.D.

Escuela de Estadística, Universidad Nacional de Colombia-Sede Medellín

**Abstract**: In this paper we propose a Classification and Regression Trees (CART) model for handling missing values in a Configuration Management Database (CMDB). Once the missing information is completed, a statistical model to dose antivirus scans inside an information system (IS) in the banking sector is implemented. About 18.22% of the information extracted from the CMDB was incomplete, so we propose a data mining modeling strategy to impute the missing information. Finally, we illustrate both the imputation methodology and the statistical dosage model using real data from an IS.
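The imputation idea in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the field names (`model`, `ram`, `os`), the toy records, and all function names are hypothetical, and the tree is a bare-bones classification tree (Gini impurity, equality splits, no pruning) rather than a full CART procedure.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the (feature, value) equality test with the lowest weighted Gini,
    or None if no split improves on the unsplit node."""
    best, best_score = None, gini(labels)
    for f in features:
        for v in {r[f] for r in rows}:
            left = [y for r, y in zip(rows, labels) if r[f] == v]
            right = [y for r, y in zip(rows, labels) if r[f] != v]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, v), score
    return best

def grow(rows, labels, features, depth=0, max_depth=3):
    """Grow a binary tree; a leaf is the majority class label (a string)."""
    split = best_split(rows, labels, features) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, v = split
    yes = [(r, y) for r, y in zip(rows, labels) if r[f] == v]
    no = [(r, y) for r, y in zip(rows, labels) if r[f] != v]
    return (f, v,
            grow([r for r, _ in yes], [y for _, y in yes], features, depth + 1, max_depth),
            grow([r for r, _ in no], [y for _, y in no], features, depth + 1, max_depth))

def predict(tree, row):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, v, yes, no = tree
        tree = yes if row[f] == v else no
    return tree

def impute(records, target, features):
    """Fit the tree on complete records, then fill in missing target values."""
    complete = [r for r in records if r[target] is not None]
    tree = grow(complete, [r[target] for r in complete], features)
    return [dict(r, **{target: predict(tree, r)}) if r[target] is None else r
            for r in records]

# Hypothetical CMDB extract: None marks a missing operating system.
records = [
    {"model": "desktop", "ram": "4GB", "os": "Windows XP"},
    {"model": "desktop", "ram": "4GB", "os": "Windows XP"},
    {"model": "laptop", "ram": "8GB", "os": "Windows 7"},
    {"model": "laptop", "ram": "8GB", "os": "Windows 7"},
    {"model": "laptop", "ram": "4GB", "os": "Windows 7"},
    {"model": "desktop", "ram": "8GB", "os": None},
    {"model": "laptop", "ram": "4GB", "os": None},
]
filled = impute(records, "os", ["model", "ram"])
```

In this toy data the tree splits on `model`, so the incomplete desktop record is filled with "Windows XP" and the incomplete laptop record with "Windows 7". A numeric field would call for a regression tree instead, which this sketch does not cover.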

Paper: **A statistical approach to reduce malware inside an Information System in Banking Sector**

Authors: Gustavo A Valencia-Zapata, Juan C Salazar-Uribe, Ph.D.

Escuela de Estadística, Universidad Nacional de Colombia-Sede Medellín

**Abstract**: The aim of this article is to illustrate the first stages of the implementation of a statistical model to dose antivirus scans in an information system (IS) in the banking sector. As a result, the IS is strengthened by increasing malware detection and by decreasing malware attacks inside the bank. An IS has many components that help to build a dosage model. These components are applications such as antivirus, web filtering, Human Capital/Resource Management (HCM), and the Configuration Management Database (CMDB). A CMDB provides technical information about the computer population (e.g., hard disk, operating system, and so on). We can establish an analogy between some of these components and the components of an epidemiological system, which allows us to build the statistical model. For instance, the computers in the bank network can be seen as patients, and the malware can be seen as diseases in a population. We use this analogy to build a statistical model based on both survival analysis and data mining methods. With this modeling strategy we identify a risk profile in the IS, which allows us to dose the antivirus scans more effectively.
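To make the epidemiological analogy concrete, here is a minimal sketch of a Kaplan-Meier survival estimate in which each computer plays the role of a patient and a malware detection plays the role of the event. The abstract does not say which survival method the authors use; Kaplan-Meier is chosen here only because it is a standard estimator. The data and the names `durations`, `observed`, and `kaplan_meier` are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct event time.

    durations: days each computer was followed
    observed:  True if a malware infection was detected at that time,
               False if the computer was censored (still clean when
               observation ended)
    """
    at_risk = len(durations)
    survival, curve = 1.0, []
    for t in sorted(set(durations)):
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if events:
            # Multiply by the fraction of at-risk computers that stay clean at t.
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        # Infected and censored computers both leave the risk set after t.
        at_risk -= leaving
    return curve

# Hypothetical follow-up data for six computers.
durations = [5, 8, 8, 12, 15, 20]
observed = [True, True, False, True, False, True]
curve = kaplan_meier(durations, observed)
```

For this toy sample the estimated probability of remaining malware-free drops to 5/6 after day 5, 2/3 after day 8, 4/9 after day 12, and 0 at day 20. Comparing such curves across groups of computers (for example, by operating system) is one way a risk profile could be explored.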

Complete information about the DMIN'12 and SAM'12 conferences can be found at http://www.world-academy-of-science.org