Which type of data analysis is used to summarize the data?
Graphs show the shape of the distribution of the data and are a very useful tool for exploring a dataset. Besides graphs, statistics that summarize the (distribution of the) data are used to transform data into information. The five-number summary, which forms the basis for a boxplot, is a good example of summarizing data.

Central tendency metrics
Metrics for central tendency: (arithmetic) mean and median. For grouped data, the mode can also be used to measure central tendency. For individual cases the mode is usually not a good metric for central tendency, because an individual value can have the highest frequency without being in the center of the distribution.

Measuring spread
Most common metrics to measure variation in the data: range, interquartile range (IQR), variance and standard deviation.

Other statistics used to summarize the data
There are many other statistics that can be used to describe a data set. Some examples:

Aggregating and summarizing data
In many cases data are summarized per group. E.g. house prices in the Netherlands can be summarized per province or per municipality, healthcare costs can be summarized per age group, etc.

Example: table with summary statistics of houses sold in London in January 2019
Table 1
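The summary statistics named above (five-number summary, mean, median, range, IQR, variance and standard deviation) and the per-group summary can be sketched in Python with only the standard library. This is a minimal illustration, not the method used for Table 1. The function names, the `area` and `price` keys and the sample values are hypothetical, and the quartiles use the "inclusive" convention, which is only one of several in common use.

```python
import statistics
from collections import defaultdict


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; other conventions give
    # slightly different values for Q1 and Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


def summarize(values):
    """Central tendency and spread for one list of numbers."""
    minimum, q1, median, q3, maximum = five_number_summary(values)
    return {
        "mean": statistics.mean(values),
        "median": median,
        "range": maximum - minimum,
        "iqr": q3 - q1,
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "stdev": statistics.stdev(values),        # sample standard deviation
    }


def summarize_by_group(rows, group_key, value_key):
    """Summarize value_key separately for each distinct value of group_key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {group: summarize(values) for group, values in groups.items()}


# Hypothetical example data, invented for illustration only.
sales = [
    {"area": "North", "price": 200},
    {"area": "North", "price": 300},
    {"area": "North", "price": 400},
    {"area": "South", "price": 150},
    {"area": "South", "price": 250},
]
per_area = summarize_by_group(sales, "area", "price")
```

Each group needs at least two observations, because the sample variance and the quartiles are undefined for a single value.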
Summarizing data using MS Excel
MS Excel comes with many statistical functions that can be used to create a data summary. For an overview of statistical functions in Excel, see here. Another way to create an overview of the most common summary statistics is the Analysis ToolPak. Many instruction manuals on how to use this add-in can be found on the internet, see for instance here.

Standardizing variables
In many cases it is useful to transform values to another unit of measurement. E.g. if a data set contains a temperature variable (TEMP) measured in degrees Fahrenheit and the unit should be degrees Celsius, the following variable transformation can be used: TEMP_CELSIUS = (TEMP - 32) x 5/9. A common transformation is standardizing: the values of a variable (X-variable) are transformed into a new variable (Z-variable), which gives the location of each observation relative to the mean, expressed in standard deviations: \(z = \frac{x - \text{mean}}{\text{standard deviation}}\).

Collecting data: open data sources
There are many open data sources with a variety of data available on the internet. Some interesting examples are listed below. Open data sources offer data in a variety of formats. A commonly used format is the CSV (comma-separated values) format. Be aware that sometimes a semicolon instead of a comma is used as the separator in such a file.

Long and wide data format
In many cases the same data can be presented in different ways. A commonly used transformation is from long to wide data format, or the other way around.

Some MS Excel tips and tricks
Some commonly used techniques when working with MS Excel are:
- the use of $-signs to copy formulas with cells fixed
- naming cells

Example research case
Below, an example outline for a quantitative research project is given.
Case: comparison of the air quality in the four major Dutch cities
1 Introduction
2 Theoretical background
3 Empirical part: research methodology
3.1 PM10 data 2018 for Amsterdam, Rotterdam and The Hague
Insert:
- table with locations
3.2 Data collecting and cleaning
3.3 Operationalization (1): pattern over the year
3.4 Operationalization (2): daily pattern identification
3.5 Operationalization (3): comparing the major cities
4 Results data analysis
4.1 Data per city
4.2 Comparing air quality in the four major Dutch cities
4.3 Conclusion
Present the conclusions from the research and answer the central question.
5 Conclusions and recommendations

Homework group assignment
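As a worked illustration of the earlier sections on standardizing variables and on CSV separators, the sketch below reads a small CSV text, converts a TEMP column from degrees Fahrenheit to degrees Celsius, and standardizes the result with z = (x - mean) / standard deviation. It rests on several assumptions: the function names and the sample data are hypothetical, the separator is guessed with a simple count of commas and semicolons in the header line, and the sample standard deviation (dividing by n - 1) is used because the text does not say which variant is meant.

```python
import csv
import io
import statistics


def detect_separator(header_line):
    """Guess the separator by counting candidates in the header (simple heuristic)."""
    return ";" if header_line.count(";") > header_line.count(",") else ","


def read_column(text, column):
    """Read one numeric column from CSV text, whichever separator it uses."""
    header_line = text.splitlines()[0]
    reader = csv.DictReader(io.StringIO(text),
                            delimiter=detect_separator(header_line))
    return [float(row[column]) for row in reader]


def fahrenheit_to_celsius(temp_f):
    """TEMP_CELSIUS = (TEMP - 32) x 5/9."""
    return (temp_f - 32) * 5 / 9


def standardize(values):
    """Return z-scores: (x - mean) / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation
    return [(x - mean) / sd for x in values]


# Hypothetical semicolon-separated data with TEMP in degrees Fahrenheit.
sample = "CITY;TEMP\nA;32\nB;50\nC;68\n"
temps_f = read_column(sample, "TEMP")
temps_c = [fahrenheit_to_celsius(t) for t in temps_f]
z_scores = standardize(temps_c)
```

A z-score of 0 marks an observation equal to the mean. Standardized values are unit-free, so the z-scores of the Fahrenheit and the Celsius values are the same apart from rounding.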
What data analysis is used to summarize the data?
Besides graphs, statistics that summarize the (distribution of the) data are used to transform data into information. The five-number summary, which forms the basis for a boxplot, is a good example. The most important statistics are those that measure central tendency and spread.

What are types of data analysis?
In data analytics and data science, there are four main types of data analysis: descriptive, diagnostic, predictive, and prescriptive.

What are the four types of data analysis?
4 Key Types of Data Analytics:
- Descriptive Analytics. Descriptive analytics is the simplest type of analytics and the foundation the other types are built on. ...
- Diagnostic Analytics. Diagnostic analytics addresses the next logical question, “Why did this happen?” ...
- Predictive Analytics. ...
- Prescriptive Analytics.

Which method is used for data analysis?
The two primary methods for data analysis are qualitative and quantitative data analysis techniques. These techniques can be used independently or in combination to help business leaders and decision-makers gain business insights from different data types.