Plot histogram from dictionary python
I created a dictionary.
This is the content of the dictionary I want to plot:
So far I wrote this:
I tried by simply doing
but this is the result, and I don't know why the 3 bars are shifted. I'd also like the histogram to be displayed in an ordered fashion. Can somebody tell me how to do it?

Introduction

A histogram is one of the 7 basic tools for quality control. Histograms also figure prominently in the data visualization world. For a small data set, a histogram is easy to draw by hand. We can also use a tool like MS Excel to plot histograms. However, we are going to plot it the cool way: using Python. The module we will be using is called matplotlib. It is currently the most popular module for data visualization, but it is not the only one out there. There are more modern modules like altair and seaborn. But as it stands today, matplotlib remains the popular choice. In this post, we will learn to plot a histogram using a Python dictionary.

Objectives
What is data visualization?

Data visualization is the graphical representation of information and data. It has many tools, including but not limited to charts, graphs, and maps. Data visualization has gained wide prominence in the last decade due to the rise of big data. The easiest way to understand trends, outliers, and patterns in data is to represent them graphically. In a world that is increasingly making data-driven decisions, data visualization methods have proved to be very effective. One can say data visualization has 2 main uses.
What are some data visualization techniques?

Discussing either data visualization or its techniques in full would require an entire book. But we will go through a crude list of visualization techniques. These are easy to learn, and we probably came across them in school.
Data visualization is a fast-growing field. The techniques are always evolving. The above list is very crude but is a starting point for anyone looking to learn data visualization. In this article, we will specifically discuss histograms.

What is a histogram?

A histogram is used to graphically represent the distribution of numerical data. In other words, it is used to visualize how often values fall within a given range. You have probably heard of (or at least seen) a color histogram. A color histogram is a representation of the distribution of colors in an image. Anyone who has tried photo editing has probably come across one, even unknowingly. Histograms were introduced by the English mathematician and biostatistician Karl Pearson. He is also by and large credited with establishing the field of mathematical statistics. The general step-by-step method to plot histograms is as follows:
Things to note:
PS: The difference between a histogram and a bar chart is that a histogram represents continuous data. In a bar chart, each bar is dedicated to a separate category. We usually leave space between the bars in a bar chart, which is absent in a histogram. The space between the bars should immediately communicate whether we are looking at a bar chart or a histogram.

Reading a histogram

Since histograms are visual tools, reading them is easy, as it is supposed to be. We already mentioned that histograms usually have equal intervals, but this is not necessary. There are two ways of reading a histogram depending on the interval selection.

If the histogram has equal intervals, then the bars can be directly compared with each other.
If the histogram has unequal intervals, then the bars cannot be directly compared with each other.
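One common way to make bars of unequal width comparable is to look at frequency density (count divided by interval width) instead of raw height. A minimal sketch, using made-up intervals and counts that are not taken from this article:

```python
# Hypothetical bins of unequal width: (lower, upper) -> number of observations.
counts = {(0, 10): 4, (10, 20): 6, (20, 50): 9}

# Frequency density = count / interval width.
density = {
    interval: count / (interval[1] - interval[0])
    for interval, count in counts.items()
}

for (lower, upper), value in density.items():
    print(f"{lower}-{upper}: density {value:.2f}")
```

In this made-up data the widest interval holds the most observations (9) but has the lowest density (0.30), which is why its raw count alone would be misleading.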
Histograms have 5 patterns that we should be looking out for.
We should also try changing the interval width to see different patterns arise. Plotting over different intervals can give valuable information.

Applications of histogram

Histograms have many applications. We have listed a few.
Constructing a histogram

The best way to understand something is to make it. But before we show how to make one using Python, let us understand how it is made by hand. Let’s suppose we have 10 students who have recently taken a test. All the students have different marks. We want to understand the distribution of these marks so we can better see how the students are scoring. The marks are as follows:
The histogram will look like this. We made it using Matplotlib, which we will get to later.
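As a rough sketch of how such a figure can be produced, assume the made-up marks below (hypothetical values, not the article's data). The code counts marks per interval of 10 with a plain dictionary and defines an optional helper that draws the counts with matplotlib's `plt.bar`, with the bars ordered by interval. The names `marks`, `bin_marks`, and `plot_counts` are illustrative.

```python
# Hypothetical marks for 10 students (not the article's data).
marks = [45, 55, 62, 68, 71, 76, 81, 84, 88, 93]


def bin_marks(values, width=10):
    """Count how many values fall in each interval [start, start + width)."""
    counts = {}
    for value in values:
        start = (value // width) * width  # lower edge of the interval
        counts[start] = counts.get(start, 0) + 1
    return counts


def plot_counts(counts, width=10):
    """Draw the counts as adjacent bars, ordered by interval."""
    import matplotlib.pyplot as plt  # imported here so counting works without it

    starts = sorted(counts)  # sort the keys so the bars appear in order
    heights = [counts[s] for s in starts]
    # align="edge" puts each bar's left edge at the start of its interval.
    plt.bar(starts, heights, width=width, align="edge", edgecolor="black")
    plt.xlabel("Marks")
    plt.ylabel("Number of students")
    plt.show()


counts = bin_marks(marks)
print(counts)
```

Calling `plot_counts(counts)` opens the figure. Sorting the keys and using numeric bar positions with `align="edge"` keeps the bars in interval order and flush against each other.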
We divided the full range of marks into intervals of 10. Then we counted how many students scored in each interval. The number of students is marked on the y-axis and determines the height of the bar. The range itself is marked on the x-axis. By looking at the histogram, we can easily see that the largest group of students (3 out of 10) scored between 80 and 90 marks. We can also see that 80% of students scored above 60. See how a simple histogram made data visualization possible.

Python dictionaries

Dictionaries are one of the fundamental data structures in Python. A dictionary stores key-value pairs. For us to plot a histogram from a dictionary, we need to split our data into key-value pairs. By looking at the above graph, we can see that our keys will be the ranges on the x-axis while our values are the counts on the y-axis. In Python, dictionaries are the standard way to store key-value pairs. So it is important to understand Python dictionaries before going forward. There are a few ways to declare a dictionary, but the best way is the pythonic way. The above syntax will declare an empty dictionary called
But we will prefer the pythonic way of doing things. Now let’s understand how to add keys and values to a dictionary.
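A minimal sketch of adding key-value pairs by hand, using a hypothetical dictionary named `marks_count` whose keys are interval labels and whose values are student counts (the labels and numbers are made up):

```python
# Hypothetical example: interval labels as keys, student counts as values.
marks_count = {}               # an empty dictionary

marks_count["80-90"] = 3       # add a new key with its value
marks_count["90-100"] = 1      # add another pair
marks_count["80-90"] += 1      # update the value of an existing key

print(marks_count)
```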
Here we manually inserted the key-value pair into the dictionary. However, that is not possible while making a histogram. We need to be able to automatically count the values in each interval (the keys). To understand that, let’s first write a program that counts the frequency of words in a document. This is achieved through a dictionary where the word is the key and its count is the value. In this case, we will have to write a program that checks each word in the document and does two things:
This can be achieved using a simple for-loop.
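One possible version of that loop, assuming the document is already available as a string (the sample text and the names `text` and `word_count` are made up for illustration):

```python
# Hypothetical sample text standing in for a document.
text = "the cat and the dog and the bird"

word_count = {}
for word in text.split():
    if word in word_count:
        word_count[word] += 1   # word seen before: increment its count
    else:
        word_count[word] = 1    # first occurrence: add the key

print(word_count)
```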
While the above code does the job, Python has a better way of doing it.
using defaultdict
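A sketch of a word count with `collections.defaultdict` from the standard library. `defaultdict(int)` supplies 0 for a missing key, so no explicit membership check is needed. The sample text is made up:

```python
from collections import defaultdict

# Hypothetical sample text standing in for a document.
text = "the cat and the dog and the bird"

word_count = defaultdict(int)   # missing keys start at int() == 0
for word in text.split():
    word_count[word] += 1

print(dict(word_count))
```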
Here we declared