I have some data in a JSON file that contains a list of timings for different tests. Each test has a name, a start timestamp and a duration, both in milliseconds. It looks something like this:
{
  "timings": [
    {
      "testname": "Test A",
      "start": 123456,
      "duration": 121
    },
    {
      "testname": "Control",
      "start": 123599,
      "duration": 25
    },
    {
      "testname": "Test B",
      "start": 123400,
      "duration": 220
    }
  ]
}
Now, I love the Matplotlib Python library. I used it a lot when learning machine learning during the 2020 lockdowns. It is an incredibly rich and powerful tool for creating professional data visualizations. I'm still fairly new to it, though, so I thought I'd see if I could create a range bar plot of this data.
In essence, I want to show each test as an individual bar whose length corresponds to the duration and whose leftmost edge corresponds to the start time.
Here is the script:
# I have a JSON file with a list of tests. Each test has a name, start time and duration.
# I want to display these in a horizontal bar plot.
import json

import matplotlib.pyplot as plt

# Read the JSON file (the with block closes the file once it has been loaded)
with open('data.json') as f:
    data = json.load(f)

test_names = []
start_times = []
durations = []
colors = []
for timing in data['timings']:
    test_names.append(timing['testname'])
    start_times.append(timing['start'])
    durations.append(timing['duration'])

# Change the color of the longest duration bar
max_duration = max(durations)
for duration in durations:
    if duration == max_duration:
        colors.append("red")
    else:
        colors.append("blue")

# Now create the plot: each bar starts at its start time and is as wide as its duration
plt.title('Test metrics')
plt.xlabel('Start time/duration (ms)')
plt.barh(test_names, durations, left=start_times, color=colors)
plt.show()
It's very simple. Most of the script is concerned with reading the data from the file and building the lists that Matplotlib needs. I thought it would be nice to highlight the longest-running test in red. That is useful when the durations of different tests are close enough that the longest bar is hard to pick out by eye.
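Since most of the script is list-building, here is a rough sketch of how that part could be condensed with list comprehensions. The helper name `build_bar_data` is made up for illustration, and the sample data is copied inline from the JSON above rather than read from `data.json`, so treat it as one possible arrangement rather than a better version of the script.

```python
# Sample data copied from the JSON shown earlier in the post.
data = {
    "timings": [
        {"testname": "Test A", "start": 123456, "duration": 121},
        {"testname": "Control", "start": 123599, "duration": 25},
        {"testname": "Test B", "start": 123400, "duration": 220},
    ]
}


def build_bar_data(timings):
    """Hypothetical helper: split the timings into the lists barh() needs."""
    test_names = [t["testname"] for t in timings]
    start_times = [t["start"] for t in timings]
    durations = [t["duration"] for t in timings]
    # Every bar tied for the longest duration is coloured red, the rest blue.
    max_duration = max(durations)
    colors = ["red" if d == max_duration else "blue" for d in durations]
    return test_names, start_times, durations, colors


test_names, start_times, durations, colors = build_bar_data(data["timings"])
print(colors)  # ['blue', 'blue', 'red']
```

The four lists can then be passed to `plt.barh(test_names, durations, left=start_times, color=colors)` exactly as in the script. Note that `max()` raises a `ValueError` on an empty list, so this assumes the file contains at least one timing.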
There are, of course, many ways to polish this. I'll update the blog as I refine it.
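As one example of the kind of polish I have in mind (an idea, not something the script above does): the raw timestamps make for an x-axis full of six-digit numbers, so the bars could be plotted relative to the earliest start instead. The function names `relative_starts` and `plot_relative` are hypothetical, and the values are the ones from the sample data.

```python
def relative_starts(start_times):
    """Return each start time as an offset (ms) from the earliest start."""
    earliest = min(start_times)
    return [s - earliest for s in start_times]


def plot_relative(test_names, start_times, durations, colors):
    """Draw the same range bars, but with the x-axis starting at zero."""
    # Imported here so relative_starts() can be used without Matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.barh(test_names, durations, left=relative_starts(start_times), color=colors)
    ax.set_title("Test metrics")
    ax.set_xlabel("Time since first test started (ms)")
    plt.show()


offsets = relative_starts([123456, 123599, 123400])
print(offsets)  # [56, 199, 0]
```

Calling `plot_relative(["Test A", "Control", "Test B"], [123456, 123599, 123400], [121, 25, 220], ["blue", "blue", "red"])` should give the same picture as before, just with the x-axis beginning at 0 ms.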