Integration of Symbolic Data Mining Systems Developed in R with Platforms for WEB System Development
Date
2018
Authors
Loría Valverde, José Andrés
Publisher
Universidad Nacional (Costa Rica)
Abstract
This document describes the RSDA-WEB system, which integrates high-level languages with statistical languages in order to perform Symbolic Data Analysis (SDA).
The system's main characteristic is that it is intuitive to use, because it is designed as a step-by-step process: the user first selects the input file, then chooses the format options for loading it, then selects the symbolic functions to apply, and finally obtains the results as text and images, as appropriate.
The input files to be analyzed can be in several formats, both free and commercial. The free formats are RSDA, CSV, and XML; the commercial format is SODAS (versions 1 and 2).
The symbolic functions that the user selects in the step-by-step process are defined in a Symbolic Data Analysis package called RSDA, which can be downloaded from CRAN (the Comprehensive R Archive Network, an Internet repository of programs and libraries intended exclusively for use with the R language). RSDA was developed by Dr. Oldemar Rodríguez Rojas and is already installed in the R environment that the system uses.
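As an illustration only, the step-by-step selections described above (input file, format options, functions) could be turned into an R script along the following lines. This is a minimal sketch, not the RSDA-WEB source code: the class AnalysisScript, its Format enum, and the choice to wrap each selected function in print() are assumptions made for the example. The R calls read.sym.table and SODAS.to.RSDA are functions of the RSDA package, but the argument values shown here are illustrative defaults.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not taken from the RSDA-WEB source code.
final class AnalysisScript {

    // Assumed subset of the supported input formats.
    enum Format { CSV, SODAS_XML }

    // Turns the user's selections (file, format options, functions)
    // into the text of an R script that uses the RSDA package.
    static String toRScript(String inputPath, Format format,
                            String separator, List<String> functions) {
        StringBuilder r = new StringBuilder("library(RSDA)\n");
        if (format == Format.CSV) {
            r.append("data <- read.sym.table(").append(quote(inputPath))
             .append(", header = TRUE, sep = ").append(quote(separator))
             .append(", dec = \".\", row.names = 1)\n");
        } else {
            r.append("data <- SODAS.to.RSDA(").append(quote(inputPath)).append(")\n");
        }
        for (String name : functions) {
            // Accept only plain identifiers so user input cannot inject R code.
            if (!name.matches("[A-Za-z.][A-Za-z0-9._]*")) {
                throw new IllegalArgumentException("Not a plain R function name: " + name);
            }
            r.append("print(").append(name).append("(data))\n");
        }
        return r.toString();
    }

    // Renders a Java string as a double-quoted R string literal.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        // Example selections: a semicolon-separated file and one function.
        System.out.print(toRScript("data.csv", Format.CSV, ";", Arrays.asList("summary")));
    }
}
```

Keeping script generation separate from execution means the generated text can be inspected or logged before anything is sent to R.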
One of the main benefits of RSDA-WEB is that it frees the user from needing to know how to install and configure R or how to use its packages and commands in order to perform Symbolic Data Analysis. The system provides the mechanisms for the user to apply, from a web page, the same analyses they would otherwise run directly from the R console.
Another benefit of RSDA-WEB, and one that distinguishes it from traditional systems, is that it is the only system worldwide that allows SDA to be performed from a web system under the client-server model. The only things the end user needs are a web browser with an Internet connection and the URL of the server. In addition, the system is built with free software, so it requires no licensing of any kind.
RSDA-WEB combines the fast processing of statistical languages with the features of a high-level language (in this case Java), which made it possible to develop the system using a combination of recent technologies (AJAX, Responsive Web Design, Servlets) and good practices.
Two-way communication between R and Java was the project's main challenge. After an exhaustive search for alternatives, the most viable option was to design a class named "RInterface", which, together with other technologies and classes, made the required solution possible.
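The abstract does not say how "RInterface" communicates with R. One common way for Java code to call R, shown here purely as an assumption and not as the thesis's design, is to write the R code to a temporary file and run it with the Rscript command-line program through java.lang.ProcessBuilder. The class RProcessSketch and its method names are hypothetical, and the sketch assumes that Rscript is on the server's PATH.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of one way Java can call R; not the thesis's RInterface class.
final class RProcessSketch {

    // Builds the command line. Kept separate so it can be checked without R installed.
    static List<String> command(String rscriptBinary, Path script) {
        return Arrays.asList(rscriptBinary, "--vanilla", script.toString());
    }

    // Runs the given R code and returns everything R printed (stdout and stderr).
    static String run(String rCode, long timeoutSeconds)
            throws IOException, InterruptedException {
        Path script = Files.createTempFile("analysis", ".R");
        Path output = Files.createTempFile("analysis", ".out");
        try {
            Files.write(script, rCode.getBytes(StandardCharsets.UTF_8));
            Process process = new ProcessBuilder(command("Rscript", script))
                    .redirectErrorStream(true)       // merge stderr into stdout
                    .redirectOutput(output.toFile()) // a file avoids pipe-buffer deadlocks
                    .start();
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly();
                throw new IOException("R did not finish within " + timeoutSeconds + " s");
            }
            String text = new String(Files.readAllBytes(output), StandardCharsets.UTF_8);
            if (process.exitValue() != 0) {
                throw new IOException("R exited with an error:\n" + text);
            }
            return text;
        } finally {
            Files.deleteIfExists(script);
            Files.deleteIfExists(output);
        }
    }

    public static void main(String[] args) throws Exception {
        // Requires a local R installation; prints whatever R writes for this expression.
        System.out.print(run("cat(1 + 1, '\\n')", 30));
    }
}
```

In a servlet-based system, a method like run would be called from the request handler, and its text output returned to the browser. Images would need a further step, such as having the R script write plots to files that the server then serves.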
Finally, as later sections explain, the project was deployed on the servers of the Escuela de Informática of the Universidad Nacional, which kindly provided the hosting and the access required for the entire community of users interested in the system.
Description
Loría Valverde, J. A. & Ramírez Segura, E. (2018). Integración de Sistemas de Minería de Datos Simbólicos desarrollados en R y plataformas para el desarrollo de sistemas WEB. [Licentiate thesis]. Universidad Nacional, Heredia, C.R.
Keywords
WEBSITES, JAVA (COMPUTER PROGRAMMING LANGUAGE), R (PROGRAMMING LANGUAGE), JAVASCRIPT (COMPUTER PROGRAMMING LANGUAGE), COMPUTER PROGRAMS, PROGRAMMING LANGUAGES, INFORMATICS, DATA ANALYSIS, DATA MINING