Discussion: Why are the original/raw data not readily usable by analytics tasks? What are the main data preprocessing steps? List and explain their importance in analytics.



Title: The Importance of Data Preprocessing in Analytics: Exploring the Limitations of Raw Data and Preprocessing Steps

Data preprocessing plays a crucial role in analytics because raw or original data usually requires preparation before it can be used for analytical tasks. This discussion explains why raw data is not readily usable for analytics and describes the main steps involved in data preprocessing.

Limitations of Raw Data:
Raw data is often messy: it may be unstructured or inconsistently formatted, and it commonly contains errors, duplicates, and missing values. In that state it lacks the structure and quality needed for direct analysis. Raw data may also contain irrelevant features, outliers, or noise that reduce the accuracy and effectiveness of analytical models.

Main Data Preprocessing Steps and Their Importance:

1. Data Cleaning:
Data cleaning involves identifying and correcting errors and inconsistencies and handling missing values in the dataset, for example by removing duplicate records or imputing missing entries. This step improves the quality and reliability of the data. By correcting erroneous records and dealing with missing information, data cleaning reduces the risk of biased or misleading insights during analysis.
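As a rough illustration of data cleaning, the sketch below removes exact duplicate records and fills missing numeric values with the median of the observed values. The function name `clean_records`, the field names, and the sample rows are hypothetical, and median imputation is only one of several reasonable choices.

```python
from statistics import median

def clean_records(records, numeric_field):
    """Drop exact duplicate rows, then fill missing values with the median."""
    seen = set()
    unique = []
    for row in records:
        # A sorted tuple of (field, value) pairs identifies an exact duplicate.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))  # copy so the input is left untouched

    # Median of the values that are actually present.
    # Assumes at least one non-missing value exists for the field.
    observed = [r[numeric_field] for r in unique if r[numeric_field] is not None]
    fill_value = median(observed)

    for r in unique:
        if r[numeric_field] is None:
            r[numeric_field] = fill_value
    return unique

# Hypothetical sample: one duplicate row and one missing age.
raw = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 1, "age": 34},
    {"id": 3, "age": 52},
]
cleaned = clean_records(raw, "age")
```

Whether to impute, drop, or flag missing values depends on the analysis; the median is used here only because it is less sensitive to extreme values than the mean.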

2. Data Integration:
Data integration refers to combining data from multiple sources into a single, consistent dataset. Organizations often accumulate data from various systems, which leads to inconsistencies in data formats, naming conventions, or structures. Through data integration, the different datasets are merged and standardized, enabling comprehensive analysis and reducing redundancy.
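A minimal sketch of data integration, assuming two hypothetical sources (a CRM export and a billing export) that describe the same customers under different field names and types. The field names, the shared schema, and the `integrate` helper are all invented for illustration.

```python
def integrate(crm_rows, billing_rows):
    """Merge two sources on customer ID after harmonizing names and types."""
    merged = {}

    for row in crm_rows:
        # Normalize the key to a stripped string so 101 and "101" match.
        cid = str(row["CustomerID"]).strip()
        record = merged.setdefault(cid, {"customer_id": cid})
        record["name"] = row["FullName"].strip().title()

    for row in billing_rows:
        cid = str(row["cust_id"]).strip()
        record = merged.setdefault(cid, {"customer_id": cid})
        record["balance"] = float(row["balance_usd"])

    # One record per customer, ordered by ID for reproducible output.
    return [merged[cid] for cid in sorted(merged)]

# Hypothetical inputs with inconsistent naming, casing, and types.
crm = [
    {"CustomerID": 101, "FullName": "ada lovelace"},
    {"CustomerID": 102, "FullName": " Alan Turing "},
]
billing = [
    {"cust_id": "101", "balance_usd": "250.00"},
    {"cust_id": "103", "balance_usd": "75.5"},
]
combined = integrate(crm, billing)
```

Customers that appear in only one source are kept with the fields that are available, which corresponds to an outer join; a stricter inner join would drop them instead.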

3. Data Transformation:
Data transformation involves converting data into a format suitable for analysis, typically by applying mathematical functions (such as a logarithm) or normalization techniques (such as min-max scaling). This step puts features on comparable scales and can reduce the influence of skewed distributions and extreme values, making the data more suitable for statistical modeling and machine learning algorithms.
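Two common transformations can be sketched as follows: min-max normalization, which rescales values to the range [0, 1], and a log transform, which compresses a long right tail. The function names and the income figures are hypothetical.

```python
import math

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def log_transform(values):
    """Apply log(1 + v), which is defined at 0 and compresses large values."""
    return [math.log1p(v) for v in values]

# Hypothetical skewed data: one very large income among typical ones.
incomes = [20_000, 30_000, 40_000, 1_000_000]
scaled = min_max_scale(incomes)
logged = log_transform(incomes)
```

Note that min-max scaling alone does not remove the effect of the extreme value: the three typical incomes end up squeezed near 0. The log transform narrows the gap between the largest and smallest values, which is why the two are often combined or chosen according to the shape of the data.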

4. Dimensionality Reduction:
Dimensionality reduction aims to reduce the number of features or variables in a dataset without significant loss of information, either by selecting a subset of the features or by combining them into fewer derived ones. It is particularly useful for high-dimensional datasets that are computationally expensive to process. Fewer dimensions lower the complexity of the analysis and enable faster, more efficient analytics.
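One of the simplest forms of dimensionality reduction is a low-variance filter, a feature selection technique that drops columns carrying almost no information. The sketch below is only that filter, not a projection method such as principal component analysis; the function name, threshold, and feature values are hypothetical.

```python
from statistics import pvariance

def drop_low_variance(columns, threshold=1e-3):
    """Keep only the columns whose population variance exceeds the threshold."""
    return {
        name: values
        for name, values in columns.items()
        if pvariance(values) > threshold
    }

# Hypothetical feature table stored column by column.
features = {
    "height_cm": [160.0, 172.0, 181.0, 168.0],
    "country_code": [1.0, 1.0, 1.0, 1.0],  # constant, so it cannot explain anything
    "weight_kg": [55.0, 70.0, 82.0, 64.0],
}
reduced = drop_low_variance(features)
```

Variance depends on the scale of each feature, so in practice a threshold like this is usually applied after the features have been normalized.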

5. Data Discretization:
Data discretization involves transforming continuous variables into discrete intervals or categories. This step is commonly used with decision tree algorithms and association rule mining. Discretization simplifies the data representation and reduces computational complexity, and it suits methods that expect categorical or ordinal inputs, although some detail is lost when values are grouped into bins.
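Equal-width binning is one straightforward way to discretize a continuous variable: the observed range is split into intervals of the same width and each value is replaced by its interval. The function name, the ages, and the category labels below are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Return a bin index in [0, n_bins - 1] for each value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # No spread in the data: everything falls into the first bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Hypothetical ages mapped to three labeled groups.
ages = [18, 22, 35, 41, 58, 63, 78]
labels = ["young", "middle", "senior"]
indices = equal_width_bins(ages, 3)
binned = [labels[i] for i in indices]
```

Equal-width bins are easy to explain but can be unbalanced when the data is skewed; equal-frequency (quantile) binning is a common alternative in that case.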

Raw data requires preprocessing to overcome its inherent limitations, such as inconsistencies, errors, missing values, and irrelevant features. Data cleaning, integration, transformation, dimensionality reduction, and data discretization are essential preprocessing steps that enhance the quality, relevance, and suitability of data for analytics tasks. By conducting these preprocessing steps meticulously, organizations can ensure accurate and meaningful analysis, leading to valuable insights for decision-making purposes.


