Big Data - Analysing large amounts of data in real time

"Big Data" is a buzzword for many European companies. Yet many people are still unsure how to analyse large amounts of data and, more specifically, how to evaluate them. Whether the amount of data generated counts as large should not be judged relative to the size of the enterprise or company. The more relevant question is what advantages Big Data applications can offer to interested companies. Big Data should be interesting to companies and enterprises that:

  • generate large amounts of data
  • generate complex data
  • want to generate dynamic data

and that would like to analyse, evaluate and process this data. Seen this way, "Big Data" is nothing more or less than a technology for analysing large amounts of structured and unstructured data.

 

How can enterprises and companies profit from Big Data?

Wherever we go, we leave traces of digital information, simply because our smartphone or mobile phone is always with us. In other words, we leave information about where we are (GPS data) and many other kinds of digital information, which are transferred via the servers of our mobile providers whether we want that or not. This data is sent even when we are not actively using our smartphones. From the perspective of a mobile provider, analysing and evaluating this very large volume of data can be really useful for improving the quality of its services, e.g. for offering clients better network availability. Analysing and evaluating Big Data enables a company to relate complex data sets to one another and to obtain answers to business questions that can hardly be answered by manual analysis, because that would take too much time.

For that reason, Big Data analysis software should provide appropriate interfaces for visualisation and for interoperating with Business Intelligence, CRM or ERP solutions.

 

Big Data in companies

What counts as a large amount of data is relative. It can be generated in single files or in groups of files. Big Data is not limited to huge companies. Even smaller companies generate data on a scale where we speak of "Big Data". That scale starts at:

Amount of data > 100 million lines or > 100 GB data volume
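
As a rough illustration, the rule of thumb above can be written as a simple check. This is only a sketch: the function name is made up for this example, and it assumes that 1 GB means 10^9 bytes.

```python
# Thresholds taken from the rule of thumb above.
LINE_THRESHOLD = 100_000_000   # 100 million lines
BYTE_THRESHOLD = 100 * 10**9   # 100 GB, assuming 1 GB = 10^9 bytes


def is_big_data(line_count: int, size_bytes: int) -> bool:
    """Return True if either the line count or the data volume exceeds its threshold."""
    return line_count > LINE_THRESHOLD or size_bytes > BYTE_THRESHOLD


# A log archive with 150 million lines qualifies, even if it is small in bytes.
print(is_big_data(150_000_000, 20 * 10**9))
```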

If we take the mobile provider example described above and apply it to a mid-sized company, we still find countless sources of Big Data from which the company can profit, for example time tracking for automated analysis or data logs. Log files and protocols are generated in many ways in every company. First of all, they are generated on every client and server within the network, but log files are also produced by telephone systems, web-based applications and all kinds of sensor systems such as RFID, video cameras, microphones, etc. Big Data is generated in all kinds of sectors, such as medicine and health, science, finance and engineering. When we speak of Big Data, we also have to speak about Industry 4.0, as Big Data is the basis for real-time analysis and control in complex production processes.

 

Structured or unstructured? Data has character(s)!

Data and data sets are generated in unstructured and structured form. What sounds harmless at first is a huge challenge for most of the Big Data analysis programs on the market. It is also challenging for users, as most Big Data applications are not able to analyse and process unstructured data. Unstructured data can be found in log files. Before it can be analysed, this kind of data has to run through a process called "normalisation", because otherwise analysis is simply impossible. A good example of unstructured data, based on a typical Apache web server log:

2011-01-10 10:05:03 H0 0.0.0.1 GET

The normalisation process turns the unstructured data into a structured and "readable" form:

Date        Time      Host name  IP       Method
2011-01-10  10:05:03  H0         0.0.0.1  GET
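
The normalisation step shown above can be sketched in a few lines of Python. This is a minimal illustration, not how LogDrill works internally: the `LogEntry` type and the `normalize` function are hypothetical names, and the sketch assumes that every line consists of exactly five whitespace-separated fields, as in the example.

```python
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    """One normalised log line with the five fields from the example."""
    date: str
    time: str
    host: str
    ip: str
    method: str


def normalize(line: str) -> Optional[LogEntry]:
    """Split a raw log line into named fields; return None if it does not fit."""
    parts = line.split()
    if len(parts) != 5:
        return None  # lines with a different layout are not handled in this sketch
    return LogEntry(*parts)


entry = normalize("2011-01-10 10:05:03 H0 0.0.0.1 GET")
print(entry)
```

Once each line is a named record, fields such as `entry.ip` or `entry.host` can be queried directly instead of being searched for in raw text.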

Our Big Data solution "LogDrill" does this at a speed of 130,000 lines per second per CPU core and filters identical log entries (based on a query, e.g. identical host names or IP addresses). No additional hardware is necessary: a notebook and access to your network are all you need. 5 billion of these entries are equal to 1 TB and can be analysed by LogDrill within 1 second. Once a query has been run, it can be used like a template that is applied to recurring patterns or processes within a company. These recurring processes and patterns can be monitored and analysed automatically, e.g. in the case of failed log-ins or access by third parties within a corporate network. The analysed "Big Data" can therefore also be used for real-time monitoring of specific events in IT security.
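
To make the idea of querying normalised entries for a recurring pattern more concrete, here is a small Python sketch that counts failed log-ins per IP address and flags addresses that reach a threshold. It does not represent LogDrill's actual query engine. The sample entries, the `LOGIN_FAILED` event name, the function names and the threshold of 3 are all assumptions made for this illustration.

```python
from collections import Counter

# Hypothetical normalised entries: (date, time, host, ip, event).
# The event names are assumptions made for this sketch.
entries = [
    ("2011-01-10", "10:05:03", "H0", "0.0.0.1", "LOGIN_FAILED"),
    ("2011-01-10", "10:05:04", "H0", "0.0.0.1", "LOGIN_FAILED"),
    ("2011-01-10", "10:05:05", "H1", "0.0.0.2", "LOGIN_OK"),
    ("2011-01-10", "10:05:06", "H0", "0.0.0.1", "LOGIN_FAILED"),
]


def failed_logins_by_ip(rows):
    """Count failed log-in events per IP address."""
    return Counter(ip for _, _, _, ip, event in rows if event == "LOGIN_FAILED")


def suspicious_ips(rows, threshold=3):
    """Return the IP addresses whose failed log-in count reaches the threshold."""
    counts = failed_logins_by_ip(rows)
    return sorted(ip for ip, n in counts.items() if n >= threshold)


print(suspicious_ips(entries))
```

The same grouping could be run on host names instead of IP addresses by changing which field is counted.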

 

Big Data Analysis with LogDrill and PetaPylon

For analysing structured and unstructured sets of data, we offer our clients two solutions:

[Image: LogDrill logo - Big Data real-time analysis of log files]

[Image: PetaPylon logo - PetaPylon Big Data Warehouse Appliance]

LogDrill

Fast and resource-efficient analysis of unstructured data, including normalisation

PetaPylon

Big Data Warehouse

  • MOLAP technology
  • Special and fast text processing
  • Cube-based query methods
  • Query exports via:
    • CSV, PDF, HTML, DOCX, ZIP or TXT
  • User administration
    • Adding users and roles / rules
  • Configurable dashboard
  • Easy handling, intuitive GUI, drag-and-drop functionality
  • Fast, secure, configurable and inexpensive
  • Hadoop technology
  • Scalable, reliable and inexpensive
  • Data management for ERP, CRM, Business Intelligence
  • SQL connection
  • Analysis of terabyte-scale data within seconds
    • ETL engine allows access to current data with just a few seconds of latency
  • Big log management
  • Collect, normalise and evaluate data in the TB/day range
  • Interactive ad-hoc analysis & reporting

 

Questions about data recovery?
0032 11 360 266