Tutorial on Data Analytics basics
Huge amounts of data are produced by sources such as social media, the web and sensors, and are stored in large databases. The volume has grown further with the spread of smartphones and phone applications, which makes this big data difficult for companies to manage. To store data efficiently and put it to use for the business, it has become essential to adopt tools and methods for data analytics.
Data Quality Issues
Data in the real world is dirty for the following reasons.
• Incomplete data results from faulty collection of information, human/software/hardware issues, differences in criteria etc.
• Noisy data results from faulty equipment, human/computer errors, data transmission errors etc.
• Inconsistent or duplicate data results from combining different data sources, non-uniform naming codes/conventions etc.
About 179 dimensions can be considered when evaluating the level of data quality. They include accuracy, accessibility, security, timeliness, amount of data, consistency, completeness, interpretability, objectivity and understandability.
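The three kinds of dirty data listed above can be flagged with a few simple checks. The following is a minimal sketch, not a definitive method: the sample records, the field names, the list of valid country codes and the age limit are all assumptions made up for illustration.

```python
# Hypothetical sample records; the field names are assumptions for illustration.
records = [
    {"id": 1, "name": "Alice", "country": "US", "age": 34},
    {"id": 2, "name": "Bob", "country": "usa", "age": None},  # missing age, odd country code
    {"id": 3, "name": "Carol", "country": "US", "age": 430},  # implausible age
    {"id": 1, "name": "Alice", "country": "US", "age": 34},   # exact copy of the first record
]


def find_quality_issues(rows, valid_countries=("US", "IN", "UK"), max_age=120):
    """Return (row_index, issue) pairs for incomplete, noisy,
    inconsistent and duplicate records."""
    issues = []
    seen = set()
    for index, row in enumerate(rows):
        # Incomplete: at least one field has no value.
        if any(value is None or value == "" for value in row.values()):
            issues.append((index, "incomplete"))
        # Noisy: a value falls outside a plausible range.
        age = row.get("age")
        if age is not None and not (0 <= age <= max_age):
            issues.append((index, "noisy"))
        # Inconsistent: a code does not follow the agreed naming convention.
        if row.get("country") not in valid_countries:
            issues.append((index, "inconsistent"))
        # Duplicate: the same field values were already seen.
        key = tuple(sorted(row.items()))
        if key in seen:
            issues.append((index, "duplicate"))
        seen.add(key)
    return issues


issues = find_quality_issues(records)
for index, issue in issues:
    print(index, issue)
```

In practice the rules (valid codes, plausible ranges) would come from the business domain rather than being hard-coded as they are here.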
Why is Data Analytics needed?
Companies use data analytics to run their business better, for the following reasons.
➨Increase in revenue
➨Decrease in costs
➨Increase in productivity
Data analytics tools and software solutions perform the following tasks to obtain high quality data.
• Remove inconsistency from dirty data by removing duplicates.
• Correct incomplete and erroneous data.
• Convert poorly structured data into structured data.
• Generate summarized reports in the form of text documents and graphs.
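Two of the tasks above, converting poorly structured data into structured data and generating a summarized report, can be sketched as follows. The pipe-separated line format, the field names and the report layout are assumptions chosen for illustration; real tools handle many more formats.

```python
from collections import Counter

# Hypothetical poorly structured input: uneven spacing and one unusable line.
raw_lines = [
    "2021-03-01 | alice | purchase | 120.50",
    "2021-03-01|bob|refund|  30",
    "garbage line without fields",
    "2021-03-02 | alice | purchase | 80",
]


def parse_line(line):
    """Turn one raw line into a structured record, or None if it cannot be parsed."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) != 4:
        return None
    date, user, action, amount = parts
    try:
        return {"date": date, "user": user, "action": action, "amount": float(amount)}
    except ValueError:
        return None


def summarize(rows):
    """Build a short text report: record count, count per action, purchase total."""
    by_action = Counter(row["action"] for row in rows)
    purchase_total = sum(row["amount"] for row in rows if row["action"] == "purchase")
    lines = [f"records: {len(rows)}"]
    for action, count in sorted(by_action.items()):
        lines.append(f"{action}: {count}")
    lines.append(f"purchase total: {purchase_total:.2f}")
    return "\n".join(lines)


structured = [record for record in map(parse_line, raw_lines) if record is not None]
report = summarize(structured)
print(report)
```

Lines that cannot be parsed are dropped silently here; a real pipeline would more likely log or quarantine them for review.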
Functions of Data extraction, Data Profiling, Data cleaning
Data analytics is the branch of data science which applies algorithms to data sets to derive useful information with the help of software/hardware.
As shown, there are three main parts in data analytics.
1. Data sources: These include the various sources which generate data in various forms, such as social media (Facebook, Twitter, Google, LinkedIn etc.), the web (mails, queries etc.), business transactions, sensor networks, patient records from hospitals, purchase transactions from e-commerce websites, subscriber information from telecom service providers etc.
2. Data Analytics: This includes the various tasks performed on the data in order to convert dirty data into high quality data. It covers data extraction, data profiling, data cleansing and data deduping.
3. Data targets or results: These include the cleaned, high quality data along with the results used to benefit the business.
Let us understand the functions/definitions of the core methods used in data analytics.
Data extraction: The process of extracting data from the data sources mentioned above and storing it is known as data extraction.
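As a rough illustration of extraction, the sketch below reads rows from a CSV source and stores them in a SQLite table using only the Python standard library. The CSV content, the `purchases` table and its columns are hypothetical; an in-memory string and database stand in for a real source and target.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real source could be a file, an API or a log stream.
source_csv = io.StringIO(
    "customer_id,city,amount\n"
    "101,Pune,250\n"
    "102,Delhi,90\n"
)


def extract_to_store(source, connection):
    """Read CSV rows from `source` and store them in the `purchases` table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS purchases "
        "(customer_id INTEGER, city TEXT, amount REAL)"
    )
    reader = csv.DictReader(source)
    rows = [
        (int(row["customer_id"]), row["city"], float(row["amount"]))
        for row in reader
    ]
    connection.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)
    connection.commit()
    return len(rows)


connection = sqlite3.connect(":memory:")  # in-memory database for the example
extracted = extract_to_store(source_csv, connection)
stored = connection.execute("SELECT COUNT(*) FROM purchases").fetchone()[0]
print(extracted, stored)
```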
Data Profiling: The process of examining the data and collecting an informative summary of it, in the form of a smaller database derived from the larger one, is known as data profiling.
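A profile can be as simple as a per-column summary. The sketch below is one possible approach, not a standard algorithm: the chosen statistics (count, missing, distinct, min, max) and the sample rows are assumptions for illustration.

```python
def profile(rows):
    """Return a small per-column summary of a list of dict records."""
    columns = sorted({key for row in rows for key in row})
    summary = {}
    for column in columns:
        values = [row.get(column) for row in rows]
        present = [value for value in values if value is not None and value != ""]
        numeric = [value for value in present if isinstance(value, (int, float))]
        summary[column] = {
            "count": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "min": min(numeric) if numeric else None,
            "max": max(numeric) if numeric else None,
        }
    return summary


# Hypothetical records to profile.
rows = [
    {"city": "Pune", "amount": 250},
    {"city": "Delhi", "amount": None},
    {"city": "Pune", "amount": 90},
]
summary = profile(rows)
for column, stats in summary.items():
    print(column, stats)
```

The summary is much smaller than the data it describes, which is what makes it useful for deciding where cleaning is needed.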
Data cleaning: The process of converting source data containing errors, duplicates and inconsistencies into cleaned target data is known as data cleansing or data cleaning.
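A minimal cleaning pass might normalize text, fill in missing values, drop records that cannot be repaired and skip repeats. The rules in this sketch (title-casing names, an `"unknown"` default city, dropping records without a name) are assumptions for illustration; real rules depend on the data set.

```python
def clean(rows, default_city="unknown"):
    """Convert dirty source records into cleaned target records."""
    cleaned = []
    seen = set()
    for row in rows:
        # Fix inconsistent spacing and capitalisation.
        name = (row.get("name") or "").strip().title()
        # Fill in a missing city with a default value.
        city = (row.get("city") or "").strip().title() or default_city
        if not name:
            continue  # a record without a name cannot be repaired here
        key = (name, city)
        if key in seen:
            continue  # the same record was already kept
        seen.add(key)
        cleaned.append({"name": name, "city": city})
    return cleaned


# Hypothetical dirty source data.
dirty = [
    {"name": " alice ", "city": "pune"},
    {"name": "Alice", "city": "Pune "},
    {"name": "bob", "city": None},
    {"name": "", "city": "Delhi"},
]
result = clean(dirty)
print(result)
```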
Data Deduping: The process of replacing multiple copies of data with a single stored instance in order to save storage space/bandwidth is known as data deduping or data deduplication.
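One common way to get single instance storage is to index content by a hash, so that identical copies share one stored blob. The sketch below assumes whole-item deduplication with SHA-256; the `SingleInstanceStore` class and the file names are hypothetical, and production systems often deduplicate at block level instead.

```python
import hashlib


class SingleInstanceStore:
    """Keep each distinct piece of content once, however many names refer to it."""

    def __init__(self):
        self._blobs = {}  # content digest -> content bytes
        self._names = {}  # item name -> content digest

    def put(self, name, content):
        digest = hashlib.sha256(content).hexdigest()
        # Store the bytes only if this content has not been seen before.
        self._blobs.setdefault(digest, content)
        self._names[name] = digest
        return digest

    def get(self, name):
        return self._blobs[self._names[name]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self._blobs.values())


store = SingleInstanceStore()
store.put("report_v1.txt", b"quarterly numbers")
store.put("report_copy.txt", b"quarterly numbers")  # same content, no new blob
store.put("notes.txt", b"meeting notes")
print(store.stored_bytes())
```

Both report names resolve to the same stored bytes, so the second copy costs only a name entry rather than extra storage.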
Data Analytics Use Cases or Applications
Following are a few of the applications or use cases of data analytics in different fields:
• BFSI (Banking, Finance and Insurance): Data analytics helps the banking, financial and insurance industries to better understand customers, competitors and markets. This helps them provide better services and rights to customers and win their confidence. The resulting mutual trust encourages people to invest more, which increases revenue for these companies.
• Telecom: Network capacity and traffic density are the two major drivers of growth for telecom companies. Data analytics helps telecom companies plan capacity according to traffic statistics and historical subscriber data. Hence telecom companies can save on maintenance and equipment installation costs. Moreover, they can provide better services to subscribers based on analysis of the data logs collected by their software.
• Hospitals: Data analytics in hospitals makes use of patients' medical records, medical equipment, test facilities etc. This helps hospitals to save administrative costs, take better decisions, reduce fraud/abuse, provide better care, improve the wellness of patients etc.
• Aerospace: Data analytics collects information from aircraft and airport ground stations, with the help of mounted sensors, to predict the status of the aerospace system and its surroundings. This makes it possible to provide better safety features to passengers.
• E-commerce: Data analytics software helps e-commerce companies derive useful information about their online customers from their purchase behaviour and historical data. This helps them show relevant advertisements selected by machine learning algorithms, which benefits both customers and the owners of e-commerce websites or online stores.
From this Data Analytics tutorial it can be concluded that data analytics is very useful for businesses and individuals alike.