
Data Collection Engine Tornado

An advanced multi-platform big data collection solution

Saltlux technology

data collection solution Tornado

In a traditional database (DB) environment, data is generated internally, by the application at the DB's front end, rather than being imported from outside. In a big data environment, by contrast, data is brought in from outside rather than generated internally, so data processing starts with data collection.

The Data Collection Solution by Saltlux can crawl multiple platforms and collect big data from RSS feeds, the deep web, metasearch, social networks, and Open APIs.


Definition and features

Real-time

A powerful big data processing engine that performs real-time, automatic, parallel data collection according to users' preferences.

Multi-platform

Collects data from multiple platforms, including the deep web, social networks, IoT, metasearch, streaming data, and Open APIs.

Methods

Uses both active and passive collection approaches.

Data processing

Prevents data loss and duplication, and supports data compression, data structuring, encryption of stored data, and validation, with attention to user convenience.

Automation

Automatically extracts, converts, and stores big data from hidden (deep) web pages, alongside powerful general web collection.

Environment

Provides an optimized big data environment in which users can perform multi-faceted analysis (competitors, products, markets, risk management, and customer voice recognition) in real time.

Application

The Tornado Data Collection solution can be applied to business processes to help businesses improve their efficiency.


Improve efficiency

Helps businesses improve brand management and their feedback responses to VIP customers, and contributes to new product development.

Forecast

Through in-depth analysis of customer reviews and feedback, businesses can detect abnormalities early and operate a real-time feedback system.

Decision making

Provides a basis for businesses to analyze and evaluate customers' reactions and consumption trends, so that they can make timely decisions and strategies.

Highlights

Various collection features (user-scenario-based collection, RSS collection, web collection, deep web collection, social collection, and OpenAPI-based collection) are built in to handle various types of internal and external big data collection according to users' needs.

A web-based collection rule editor, designed with usability in mind, makes it easy to extract and collect data from various types of dynamic websites, such as those built with JS and AJAX.

Using a distributed parallel method, it can collect large amounts of data under many rule sets simultaneously, faster and more stably. It can also be installed and operated on multiple operating systems (UNIX, Windows, etc.).

For user convenience, a preview feature runs a collection simulation with previously created collection rules, so users can confirm the quality of the collected data before the actual collection starts.

Operators and managers can easily and quickly check the current status through an integrated dashboard, which monitors the overall condition of the collection engine, and an operation management tool, which monitors collection policies and schedule settings per collection source in real time.

Functions

Saltlux Technology's Tornado data collection solution is capable of cross-platform data collection based on RSS, deep web, metasearch, social networks, and OpenAPI. It also provides functions for operation, simulation, scheduling, and operating-status monitoring.

Social network data collection function

It collects multiple types of social data, such as Twitter, public Facebook pages, and Weibo timelines, and has a scheduling feature for setting the collection cycle of each target. It also has a status history view for checking collection status.

Scenario-based Data Collection function

Based on user scenarios from various sites such as news, blogs, shopping malls, and general homepages, data about the collection target is extracted and collected. It provides a scheduling feature to set collection cycles and a status history feature to view collection status within the workbench.

RSS collection function

It reads RSS (Really Simple Syndication) feeds and extracts both the data within the target feed and the original data it links to. It includes a scheduling function for setting the collection cycle and a collection status history feature, so users can check collection status from within the workbench.
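As an illustration of what reading a feed involves, here is a minimal sketch in Python using only the standard library. It is not Tornado's implementation: the sample feed, the function name parse_rss_items, and the output field names are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed used only for this illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <link>https://example.com/posts/1</link>
      <pubDate>Mon, 01 Jan 2024 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/posts/2</link>
      <pubDate>Tue, 02 Jan 2024 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_rss_items(feed_xml: str) -> list[dict]:
    """Return title, link, and publication date for each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_FEED):
        print(entry["published"], entry["title"], entry["link"])
```

Each extracted link points at the original article, which a collector would then fetch in a separate step to obtain the original data.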

Deep web collection function

It collects information within websites, either site-wide starting from given URLs or filtered by URL patterns or keywords. It also provides a scheduling feature to set the collection cycle and a status history view to check collection status.
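The filtering idea can be sketched as a single decision function. This is a hedged example in Python, not the product's rule format: the function name should_collect, its parameters, and the use of regular expressions for URL patterns are assumptions made for illustration.

```python
import re
from urllib.parse import urlparse


def should_collect(url: str, seed_host: str,
                   url_patterns: list[str], keywords: list[str]) -> bool:
    """Decide whether a discovered URL belongs to the collection target.

    Hypothetical rule: stay on the seed host; with no filters, collect
    site-wide; otherwise require a URL-pattern match or a keyword match.
    """
    parsed = urlparse(url)
    if parsed.netloc != seed_host:
        return False  # never leave the target site
    if not url_patterns and not keywords:
        return True  # site-wide collection
    if any(re.search(pattern, parsed.path) for pattern in url_patterns):
        return True
    lowered = url.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


if __name__ == "__main__":
    candidates = [
        "https://example.com/news/2024/01",
        "https://example.com/about",
        "https://other.com/news/2024/01",
    ]
    for candidate in candidates:
        print(candidate, should_collect(candidate, "example.com", [r"^/news/\d{4}/"], []))
```

Matching keywords against the URL alone keeps the sketch short; a real collector would more plausibly match keywords against page content as well.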

Metasearch collection function

It has a keyword-based collection feature that sends user keywords to various search engines, including Google, Bing, Daum, Naver, and Yahoo, and consolidates the search results into a single list. It also provides a scheduling feature to set the collection cycle for each collection target and a status history view to check collection status.
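Consolidating results from several engines into one list mostly comes down to de-duplicating by URL. The following Python sketch shows one way to do that, assuming each engine's results have already been fetched as a list of dictionaries with "url" and "title" keys. The engine names, data shape, and function names are hypothetical, and no search engine API is called.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def merge_results(results_by_engine: dict[str, list[dict]]) -> list[dict]:
    """Merge per-engine result lists into one list, one entry per distinct URL."""
    merged: dict[str, dict] = {}
    for engine, results in results_by_engine.items():
        for result in results:
            key = normalize_url(result["url"])
            # Keep the first title seen; record every engine that returned the URL.
            entry = merged.setdefault(
                key, {"url": key, "title": result["title"], "engines": []}
            )
            if engine not in entry["engines"]:
                entry["engines"].append(engine)
    return list(merged.values())


if __name__ == "__main__":
    sample = {
        "engine_a": [
            {"url": "https://example.com/a/", "title": "Page A"},
            {"url": "https://example.com/b", "title": "Page B"},
        ],
        "engine_b": [
            {"url": "https://EXAMPLE.com/a#top", "title": "Page A (other engine)"},
        ],
    }
    for row in merge_results(sample):
        print(row["url"], row["engines"])
```

Recording which engines returned each URL also gives a simple signal for ranking the consolidated list, should one be needed.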

Open API-based collection function

It collects various documents and open data, including domestic public data, overseas public data, and local government public data, and provides a scheduling function for setting the collection cycle of each target. It also provides a collection status history view for checking status.

Operation management function

Provides a dashboard that monitors and operates the Tornado engine features.

User management function

Allows one or more users to access the system and assigns permissions to each user.

Management feature per collection target (project)

Manages each data collection project separately by objective, data source, or preference.

Operating Process

Definition of collection tasks


Actions performed by the user on a website (input, click, search, etc.) are recorded and stored as collection rules.


Simulation and result preview

Run a simulation to preview the results and confirm that the configured rules perform properly.


Running the collection engine

Run the collection engine to collect and store web data based on the defined rules.


See the results

Verify in the workbench that unstructured data collected from the web has been converted into semi-structured or structured data.


Engine screen

The main control screens of the Tornado data collection engine.
