Data Science Cloud Service (DataMixi.com)


Data Science Cloud Service – DATAMIXI is a cognitive analysis service that discovers hidden patterns in data and predicts future trends. It merges and deeply analyzes data and visualizes it from various viewpoints, combining analytical insight with artificial intelligence. It is a data science consulting service designed for IT engineers and developers who want to integrate AI-based data analysis services into their projects as a mashup.


< Data Science Cloud Service – DATAMIXI >

Data Science Cloud Service – DATAMIXI consists of data science, data curation and cognitive analysis services. Data science is the cloud service that supports the entire process, from building big data to analyzing and using it. Data curation is the general name for the framework that enables collaboration between data builders and software systems, which characterizes Saltlux’s data construction and analysis processes. The cognitive analysis service comprises trend analysis, emotion analysis and visualization services, all based on a data set of tens of billions of instances.

Main features

DATAMIXI, the only domestic data science portal for AI and data scientists, discovers hidden patterns in data and predicts future trends. It merges and deeply analyzes data and visualizes it from various viewpoints through cognitive analysis that combines analytical insight with artificial intelligence.

  •  The only domestic Big Data & AI community

This is the only Big Data & AI community where you can communicate with data scientists and other experts and find the latest information. The best experts in each field, including data architects, data engineers and data scientists, are with DATAMIXI.

  • Uses the largest domestic intelligent big data platform

You can use the largest domestic intelligent platform, which combines the best solutions in each field, including collection, storage, data analysis, machine learning and reasoning.

  • Provides the world’s best data processing service

The world’s best data service (Data as a Service), which lets machines read, learn and understand data semantically, is provided through the high-performance big data collection engine ‘TORNADO’, in-house domestic and international professional curation centers, and technical support from the AI research center.

  • Opens Big Data analysis technology through an intelligent OpenAPI

The service is built on the technologies of Saltlux, which has focused exclusively on artificial intelligence, from big data to large-scale machine learning and reasoning, for the last 20 years. Various analysis services that integrate big data and AI technologies are available both for testing and for live services.

  • Provides Asia’s largest data

You can receive or use analysis-ready big data, including various domain-specific dictionaries, in addition to social data, open data, linked data and real-time data.

  • Provides free intelligent cognitive analysis service

The cognitive analysis service, which applies AI technologies, is provided free of charge on approximately 20 billion social data items. It discovers hidden patterns in data and predicts future trends by merging and deeply analyzing data and visualizing it from various viewpoints. A personalized premium service that actively reflects each customer’s requirements is also available as a 100% personalized collection and analysis consulting service.

Main Services

Data Science Service

Saltlux’s Data Science Service is a consulting and education service that provides practical IT knowledge and technology training applicable to real services. It covers the whole process: data collection and purification, participation of experts (data scientists), selection and optimization of machine learning and analysis models, and evaluation and visualization of prediction and intelligence results. It draws on the cognitive analysis and machine learning experience that Saltlux has accumulated over the past 20 years.


< Data science service >

Data Science Cloud Service – DATAMIXI’s Data Science Service integrates computer engineering, mathematical statistics, machine learning algorithms and domain knowledge modeling methods through Saltlux’s unique dual spiral methodology and leads to the development of AI-based knowledge services, such as intelligent big data analysis services, Q&A, or dialog services.


< Dual spiral methodology-based data service >

Active cooperation between humans and machines (human-in-the-loop) is necessary for outstanding deep data analysis and intelligent services. DATAMIXI’s Data Science System is based on the dual spiral methodology, in which algorithms, tools and experts actively cooperate.

In a typical data science process, Saltlux’s unique dual spiral methodology is applied so that data collection and purification, selection and optimization of machine learning and analysis models, and evaluation and visualization of prediction and intelligence results are carried out iteratively.


< Data science consulting service process >

① Requirement analysis step

In this step, the core problems in the data analysis requested by the customer are analyzed and understood, the analysis and intelligence goals are derived, and on that basis the data resources needed for the analysis are defined and the direction is set.

② Data curation step

The biggest difficulties in deep analysis and machine learning are the purification of big data that contains errors and the lack of learning data. Data curation is the step that collects the data resources defined in the requirement analysis step, purifies the data through processes, tools and trained experts suited to each analysis and intelligence goal, and produces data for analysis and learning through filtering.

③ Data analysis and learning step

In this step, traditional statistical analysis is carried out alongside deep data analysis using machine learning technologies such as CRF and SVM and deep neural network-based deep learning technologies such as CNN and RNN. Large-scale machine learning, prediction and deep learning-based analysis run on an intelligent analysis platform that combines Saltlux’s various analysis engines with powerful open-source tools such as R and TensorFlow. An optimal analysis result that meets the customer’s requirements is produced through model verification and evaluation, tuning of model parameters and changes to the learning algorithm.

④ Data analysis verification and feedback step

This step discovers knowledge, patterns and exceptions in the analysis results, or evaluates and verifies the learning and prediction results through feedback from internal or external experts and the customer, before the analysis result is delivered to the customer.

⑤ Final data analysis report step

This step delivers a data analysis report that meets the requirements of the customer, for whom data analysis and utilization can become a new source of individual and organizational competitiveness.

Data curation service

Data curation encompasses all activities that improve the value of data, such as meta information tagging (annotation), classification and learning data creation during data collection and purification. For data-based deep analysis and machine learning, large-scale data must be secured and processed in a form that machines can read, learn and understand semantically. Saltlux’s Data Curation Service provides the world’s best data service, built on Saltlux’s 20 years of accumulated experience in data quality management and machine learning.


< Data Curation Service >

① Data curation service process

Six data curation steps are applied commonly to all domains, and the expert teams in each step systematically cooperate in establishing a customer’s knowledge service.

② Data curation service function

Data curation embraces all activities that improve the value of data use. In addition to general data processing, such as digitizing books, collecting raw data and purifying data, professional data curation services such as image and video annotation, R&D data annotation and knowledge base construction are provided.

Intelligent Cognitive Analysis Service

Saltlux’s Intelligent Cognitive Analysis Service applies AI technology to more than 10 billion social data items and provides, free of charge, advanced analysis functions: convergence analysis, related subject analysis, emotion analysis, trend analysis, issue detection and real-time R linkage. It also provides, free of charge, an intelligent cognitive analysis function that enables deep analysis of the semantic networks in the data. The service offers the following functions.

1) A data function that lets you directly upload, register and use your own data as well as the various public data provided by the data service.

2) A data merging function that lets you create data optimized for a desired analysis by selecting and merging only the desired elements from two or more files.

3) A widget creation function that lets you run intelligent analysis on a subject of interest using the provided social data, apply various charts and create a widget.

4) A dashboard creation function that lets you build your own dashboard by dragging and dropping the widgets you created into the desired positions.

5) A web sharing and publishing function that lets you share a dashboard built from various viewpoints through the gallery or on SNS.
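The data merging function in item 2 can be pictured as a key-based join that keeps only chosen columns. The following is a minimal sketch of that idea in plain Python, not DATAMIXI's implementation: the function name `merge_selected`, the column names and the sample data are all assumptions made for illustration.

```python
import csv
import io


def merge_selected(left_rows, right_rows, key, left_cols, right_cols):
    """Inner-join two lists of dict rows on `key`, keeping only the chosen columns."""
    # Index the right-hand rows by key so each lookup is a dictionary access.
    right_index = {row[key]: row for row in right_rows}
    merged = []
    for row in left_rows:
        match = right_index.get(row[key])
        if match is None:
            continue  # drop rows that have no partner in the second file
        out = {key: row[key]}
        out.update({col: row[col] for col in left_cols})
        out.update({col: match[col] for col in right_cols})
        merged.append(out)
    return merged


# Two tiny in-memory "files" (made-up sample data, hypothetical columns).
sales_csv = "region,year,sales\nSeoul,2020,120\nBusan,2020,80\nJeju,2020,15\n"
population_csv = "region,population,area\nSeoul,9700000,605\nBusan,3400000,770\n"

sales = list(csv.DictReader(io.StringIO(sales_csv)))
population = list(csv.DictReader(io.StringIO(population_csv)))

# Keep only `sales` from the first file and `population` from the second.
merged = merge_selected(sales, population, key="region",
                        left_cols=["sales"], right_cols=["population"])
for row in merged:
    print(row)
```

The sketch uses an inner join, so regions present in only one file are dropped; a real merging tool would likely let the user choose how unmatched rows are handled.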

① My Data function

The My Data function lets users process, store and register data for use in the intelligent cognitive analysis service. The data can be social data from more than 100 cases provided by Saltlux, open data from 340,000 cases, or the user’s own data uploaded as CSV or Excel files.

② Analysis widget function

The analysis widget function carries out intelligent cognitive analysis on social data from more than 100 cases provided by Saltlux, open data from 340,000 cases, or the user’s own data. The user defines an analysis subject, applies the analysis results to various charts and saves the result as a widget. The function is divided into cognitive analysis using social big data and cognitive analysis using My Data, and the detailed functions include trend analysis, related keyword analysis and emotion analysis.
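To make trend analysis and related keyword analysis concrete, here is a minimal sketch in plain Python. It only illustrates the general idea (counting mentions per day and ranking co-occurring words); it is not the algorithm used by the service, and the function names and sample posts are invented for this example.

```python
from collections import Counter

# Made-up sample posts: (date, text).
posts = [
    ("2020-01-01", "ai chatbot launch"),
    ("2020-01-01", "ai stock rally"),
    ("2020-01-02", "ai chatbot review"),
    ("2020-01-02", "weather forecast"),
    ("2020-01-03", "ai chatbot update"),
]


def trend(posts, keyword):
    """Count, per date, the posts that mention `keyword`."""
    counts = Counter()
    for date, text in posts:
        if keyword in text.split():
            counts[date] += 1
    return dict(sorted(counts.items()))


def related_keywords(posts, keyword, top_n=3):
    """Rank the words that appear in the same post as `keyword`."""
    co_occurrence = Counter()
    for _, text in posts:
        words = set(text.split())
        if keyword in words:
            co_occurrence.update(words - {keyword})
    return co_occurrence.most_common(top_n)


ai_trend = trend(posts, "ai")
ai_related = related_keywords(posts, "ai", top_n=1)
print(ai_trend)
print(ai_related)
```

The per-date counts are the kind of series a trend chart widget would plot, and the ranked co-occurring words are the kind of result a related keyword widget would show.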

③ User dashboard and gallery function

A widget containing the user’s cognitive analysis results can be stored in the analysis widget gallery, and the user can build a dashboard from the registered widgets. A created dashboard can be stored and registered in the user dashboard gallery, where the user can choose to share it with other users or download it.

Data processing and machine learning function service – Dataiku

Dataiku is a centralized, data-driven intelligent big data platform. It makes full use of analysis functions so that data stays closely tied to business processes rather than merely being stored. It supports the workflow up to data modeling through machine learning and the application of the results to company operations.


< Data Science Cloud Service – Dataiku >

① Data search function

Creates an automatic report for each data set and points out potential data quality problems. Computes univariate and multivariate statistics and creates a detailed data set audit report. Filters and searches data as easily as in Excel. Gains insight by expanding the analysis range through execution on Spark, Hadoop or SQL engines.

② Data pre-processing and visual conversion function

More than 80 built-in visual processors are easily accessible for code-free data wrangling. Transformations are suggested automatically based on context, and large-scale data operations can be carried out.

③ Machine learning function

Automatic feature engineering, generation and selection make it possible to use all types of data in a model. Model hyperparameters are optimized using various cross-validation strategies. Immediate visual insight is gained from the model (variable importance, feature interactions or parameter characteristics), and model performance can be evaluated through detailed metrics.
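Hyperparameter optimization with cross-validation can be sketched as follows. This is a minimal, standard-library illustration of the general technique (k-fold cross-validation over a small grid), not Dataiku's implementation. The classifier (a one-dimensional k-nearest-neighbor vote), the data set with one deliberately mislabeled point, and all function names are assumptions chosen to keep the example short.

```python
def knn_predict(train, x, k):
    """Majority label among the k training points nearest to x (1-D feature)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)


def k_fold_indices(n, folds):
    """Split range(n) into `folds` interleaved validation index lists."""
    return [list(range(start, n, folds)) for start in range(folds)]


def cross_val_accuracy(data, k, folds=3):
    """Mean validation accuracy of k-NN over all folds."""
    fold_scores = []
    for val_idx in k_fold_indices(len(data), folds):
        held_out = set(val_idx)
        train = [p for i, p in enumerate(data) if i not in held_out]
        correct = sum(knn_predict(train, data[i][0], k) == data[i][1]
                      for i in val_idx)
        fold_scores.append(correct / len(val_idx))
    return sum(fold_scores) / len(fold_scores)


# Made-up data: (feature, label). The point (4, 1) is deliberate label noise.
data = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 0), (6, 0),
        (20, 1), (21, 1), (22, 1), (23, 1), (24, 1), (25, 1)]

grid = [1, 3, 5]  # candidate values for the hyperparameter k (odd, so no vote ties)
scores = {k: cross_val_accuracy(data, k) for k in grid}
best_k = max(grid, key=lambda k: scores[k])  # first best value wins on ties

for k in grid:
    print(k, round(scores[k], 3))
print("best k:", best_k)
```

On this toy data, k = 1 is misled by the noisy point while larger k values vote it down, which is the kind of difference a cross-validated search is meant to expose.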

④ Machine learning-based model distribution function

It allows analysts and data scientists to deploy models to production in a few clicks. Data cleaning, enrichment and pre-processing become part of the scoring pipeline. The versions of deployed models are managed, so the user can deploy a new version, compare versions and roll back at any time.
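Version management with rollback can be pictured with a small registry that remembers every deployed model and tracks which one is live. This is a hypothetical sketch of the concept only; the class `ModelRegistry` and its methods are invented for illustration and do not correspond to any Dataiku API.

```python
class ModelRegistry:
    """Keeps every deployed model version and tracks which one is live."""

    def __init__(self):
        self._versions = []  # (version number, model) in deployment order
        self._active = None  # index into _versions of the live model

    def deploy(self, model):
        """Register a new version and make it the live one."""
        number = len(self._versions) + 1
        self._versions.append((number, model))
        self._active = len(self._versions) - 1
        return number

    def rollback(self, version):
        """Switch the live model back to a previously deployed version."""
        for index, (number, _) in enumerate(self._versions):
            if number == version:
                self._active = index
                return
        raise ValueError(f"unknown version: {version}")

    def predict(self, x):
        """Score x with the live model; returns (version, prediction)."""
        if self._active is None:
            raise RuntimeError("no model deployed")
        number, model = self._versions[self._active]
        return number, model(x)


registry = ModelRegistry()
v1 = registry.deploy(lambda x: x * 2)      # stand-in for a trained model
v2 = registry.deploy(lambda x: x * 2 + 1)  # stand-in for a newer version

live = registry.predict(10)         # served by version 2
registry.rollback(v1)
rolled_back = registry.predict(10)  # served by version 1 again
print(live, rolled_back)
```

Because old versions are never deleted, comparing two versions amounts to scoring the same input with each, and rollback is just a change of pointer.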

⑤ Data creation information management function

A deployment model that covers, in a single UI, all steps necessary for data production: ① development of the data creation model (workflow), ② testing of the model and production data, ③ data prototyping (verification before production), and ④ data commercialization (packaging of the data and the creation model).

Main competitive characteristics

  • Data analysis – Analysis visualization

Insight can be gained through network analysis of authors, similar papers, core technologies, related technologies and keywords, as well as convergence analysis between different technologies, cognitive analysis and deep analysis. This supports assessing the status of competitors’ technologies and R&D, briefing on government R&D policies, sensing new technologies and monitoring trends in R&D fields.

  • Data curation – Conversion into smart data

Data curation means all activities that improve the value of data use, such as annotation, classification and learning data creation during data collection and purification. For data-based deep analysis and machine learning, large-scale data must be secured and processed in a form that machines can read (readable), learn (learnable) and understand (understandable) semantically.

  • Provides the only domestic data science platform service – Conversion into Science Total Service

Supports all tasks for gaining insight or implementing an intelligent system through data collection, curation, statistical analysis and machine learning. Even users with no technical experience can carry out data analysis using this product.

  • Machine learning, AI – prediction of research experiments and results

When internal research data and experiment data from outside papers, such as graphs, tables, images and chemical formulas, are curated (extracted, purified and processed) and prepared, the ML function can use this data to carry out a research experiment indirectly, so results can be obtained easily and promptly.

  • Complete collection and integration of internal and external data – Data Banking

Enables the collection, sharing and reuse of research data scattered inside the organization. Collects and internalizes various unstructured external data such as papers, patents and technical documents. It embeds the largest number of collection functions (six types) among domestic and overseas collection engines and secures the best collection performance by applying real-time data collection and processing technology.

Screen of main engines

Screenshot (47)
Screenshot (48)