Data Visualization

Tuesday, April 12, 2022

Reflection April 10th

For this week's reflection, I chose a visualization of John Wick's kills across the movies (starring Keanu Reeves). It was made by Zach Bowders.

John Wick Trilogy

The link above is interactive; for a still image:

As you can see, John Wick racks up quite a number of kills. In the interactive version, the user can click on individual bars to reveal the total number of kills in a given location or (to the right of each named location) the number of kills per weapon. This is quite interesting, in my opinion. This type of chart is called a "Sankey" diagram. Snake-like lines are drawn throughout the graph (the name seems to fit the idea). The rectangular blocks act as what are called "Sources," and the usual white space is replaced with a dark grey/black background.

This is a rather self-explanatory visualization; however, I wanted to point out a few things. The channels used in this graph include the colors and the widths of the lines. More specifically, the lines themselves are marks, and their color and width are the channels, while the path of each line as it snakes across the graph-space shows which blocks it connects. Some lines are much thinner, which represents lower counts. Each line appears to have a set width, not counting the space between the lines.
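To make the "thicker line, bigger count" idea concrete, here is a minimal sketch of how the links of a chart like this could be built. This is only my guess at the general approach, not how the original was made. The records, the `buildLinks` helper, and the 4-pixels-per-kill factor are all made up for illustration and are not taken from the real dataset.

```javascript
// Made-up records for illustration only; these are NOT figures from the real dataset.
const kills = [
  { where: "Location A", how: "Weapon X" },
  { where: "Location A", how: "Weapon X" },
  { where: "Location A", how: "Weapon Y" },
  { where: "Location B", how: "Weapon X" },
];

// Count how many records flow from each source block to each target block.
function buildLinks(records, sourceKey, targetKey) {
  const counts = new Map();
  for (const record of records) {
    const key = `${record[sourceKey]}|${record[targetKey]}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, value]) => {
    const [source, target] = key.split("|");
    return { source, target, value };
  });
}

// Line thickness is proportional to the count (assumed 4 pixels per kill).
const PIXELS_PER_KILL = 4;
const links = buildLinks(kills, "where", "how").map((link) => ({
  ...link,
  width: link.value * PIXELS_PER_KILL,
}));

console.log(links);
```

Each link ends up with a source, a target, a count, and a width, which is roughly the information a Sankey chart needs to draw one line.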

Furthermore, each block (or source) can be clicked, and all lines that are not part of the clicked source drop in opacity (turning light grey). Overall, this seems like an effective and useful graph. Near the top, there are indicators of the order: When | Where | How | Who, almost like a timeline for answering questions about the kills. However, there is no data representing the order of the kills or the time at which each person was killed.

There is nothing wrong with this; however, some of the data is imperfect because of John Wick's positioning. At times it is difficult to tell whether it is his character or someone else who makes the kill. John Wick 3, for example, has plenty of background characters (some shots may have missed Wick and hit someone else; friendly fire is one example of how casualties happen in these films).

There is not much else to say about this graph, although it's fun to look at if, like me, you're a fan of the John Wick films and the ridiculousness applied to each one. Shout out to Keanu Reeves!

The data is on Kaggle.

Sunday, March 20, 2022

Reflection (March 20)

For this week, I chose to focus on a "Dear Data" piece as requested. The data is self-explanatory: I decided to keep track of the time I spent watching a screen. Excluding my cell phone, this data is accurate to within an hour (some times are estimated). I believe that the data is well chosen; however, I debated whether this is the best way to visualize it. Perhaps a forbidden pie chart would have been a good idea, or a few bar charts to depict each day. The color key is given on the paper; however, to repeat what can be seen below...

Red is for my Desktop PC.
Black is for my Laptop.
Light Blue is for my Television.
Orange is for the Lab PC during class.
Green is for the movie watched this week during one of my classes.
Dark Blue is for the time spent at the theater.

Specifically, the theater had a showing of the latest Batman, which I rounded to about 3 hours of screen time. There was, of course, time spent watching previews, so this is a rough estimate to make things easier when finishing the visualization.

Overall, I chose colors that can be told apart easily without too much else going on. There was also a time when I had the television on along with my laptop while I did some work (filling out applications and a tax return). Other than that, there is not much else to say regarding times. At the bottom, you can find the times I watched, starting with 8am at x = 0. I marked a "p" and an "a" where the times cross into the afternoon and the morning. The y-axis, of course, lists each of the days, starting with Monday at the top and descending to Sunday at the bottom.
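If I were to redraw this on a computer instead of on paper, the x-axis placement could be sketched roughly like this. It is a minimal sketch under my own assumptions: the `hoursSinceStart` and `xFor` helpers and the 30-units-per-hour spacing are made up for illustration, and times are given on a 24-hour clock.

```javascript
// Hours elapsed since the 8am origin, wrapping past midnight (e.g. 1:45am -> 17.75).
function hoursSinceStart(hour24, minute, startHour = 8) {
  const hours = hour24 + minute / 60;
  return (hours - startHour + 24) % 24;
}

// Assumed layout: each hour takes up 30 units along the x-axis.
const UNITS_PER_HOUR = 30;
const xFor = (hour24, minute) => hoursSinceStart(hour24, minute) * UNITS_PER_HOUR;

console.log(xFor(8, 0));   // 0
console.log(xFor(13, 45)); // 172.5
console.log(xFor(1, 45));  // 532.5
```

The wrap-around is what lets a time after midnight land to the right of the evening hours instead of to the left of 8am.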

There is a bit of clutter on this graph, in my opinion, although I added the specific dates for each day just to make sure that it is understood. In hindsight, the dates could be deleted (or perhaps the days themselves), and the times on the bottom could have been marked every other hour rather than every hour. However, there are times ending at 1:45, for example, and I found that it is easier to read this way.

Note: The scratches were made in pencil first, as a rough draft.

Sunday, February 20, 2022

Reflection (Feb 20)

I attended the info session about data at Roc Brewing Co. with one of its leaders, Chris Spinelli.

Roc Brewing Co. is a family-owned business that has been operating for over ten years. Chris Spinelli, one of the company's leaders, went into detail about being an entrepreneur. What new company owners need to look out for is the first five years that the business is open. Most companies fail within this time; however, the numbers seem to get better the longer a company stays open.

The shelf life of the beers varies by type. Chris was able to provide a couple of examples off the top of his head. Whoop-Ass is a beer with a (rough estimate) 60-day shelf life, while the Lagerithm Lager has a 90-day shelf life. Storage and packaging are also factors in the shelf life of beer.

Some of the example data that is analyzed for running the business includes social media data. This can answer the question of who the target audience is or how popular a particular beer is. There is also data from the tap room and the wholesale side for marketing purposes. Some of the data is as simple as it sounds: the different kinds of beer that are produced, as well as the alcohol percentage. Beers vary in "type," such as Lagers, IPAs, DIPAs, Pilsners, and Porters. The brewing is done just a bit differently for each kind of beer, and the ingredients tend to vary to provide an assortment of flavors for each type.

Roc Brewing Co. Beer List

Note: Fermenting speeds are also a factor. Lagers ferment more slowly than Pale Ales, for example.

Wholesale marketing needs to keep track of not only Roc Brewing Co.'s numbers, but the numbers of rival brewing companies as well. There is local competition just as much as there is national competition.

Most of this info session was about a family-oriented business that has taken quite a few steps forward through the analysis of data. Not only was it necessary for the family to look at the statistics on the probability of success, it was also necessary to compare sales with their competitors. Questions such as "what works" and "what doesn't" were asked frequently in the company's early days.

One must also factor in possible merch and what its relative cost is. Supply & demand is, of course, a factor for this company, but the idea is this: Whatever works, works. Keeping the company moving upward and onward can only happen in time (another factored piece of data).

Sunday, February 6, 2022

Reflection (February 6)

During this last week, there were assignments regarding scatterplots and SVG range/domain manipulation. The idea is simple. Sometimes there is data which may make for a great scatterplot or bar graph! However, there's just one issue... the values are spread out in ways that would call for a very large SVG drawing. This can affect either the width or the height of a 2-dimensional visualization.

Furthermore, the data may be too close together when plotted on a scatterplot (such as numbers that range from 4 to 5). Using a broader range when scaling the x coordinates spreads the points out and makes the data clearer.
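As a rough illustration of that idea, here is a minimal sketch of a linear scale written in plain JavaScript. The `linearScale` helper is something I wrote for this example; it only mirrors the domain/range idea and is not D3's own API. The 400-pixel width is an assumption as well.

```javascript
// Minimal linear scale: maps a value in [d0, d1] to a position in [r0, r1].
// Hypothetical helper written for illustration; it is not D3's API.
function linearScale([d0, d1], [r0, r1]) {
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Assumed example: values between 4 and 5 spread over a 400px-wide drawing.
const xScale = linearScale([4, 5], [0, 400]);
const xs = [4, 4.25, 4.5, 5].map(xScale);

console.log(xs); // [0, 100, 200, 400]
```

Values that were only a quarter of a unit apart now sit 100 pixels apart, which is the whole point of widening the range.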

It is possible to scale multiple dimensions, including the radius of circles. All one needs to do is pick the data values to be scaled and define a scale with a domain (the extent of the data) and a range (the extent of the output). That scale can then be passed to another function which draws an axis. Finding the right numbers for the scale can take some time. In this event, be patient.

Sometimes it's important to compare the same kinds of data in different ways to better understand comparisons. In this sense, it's almost as if the data is telling a story of its own through the different ways of visualizing the dataset.
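For circle radii, one common choice (an assumption on my part, not something the assignment required) is a square-root scale, so that the area of the circle tracks the value instead of the radius. Here is a minimal sketch in plain JavaScript. The `sqrtScale` helper and the data values are made up for illustration and are not D3's API.

```javascript
// Square-root scale for circle radii, so circle area (not radius) tracks the value.
// Hypothetical helper for illustration; the data values below are made up.
function sqrtScale([d0, d1], [r0, r1]) {
  return (value) => {
    const t = (Math.sqrt(value) - Math.sqrt(d0)) / (Math.sqrt(d1) - Math.sqrt(d0));
    return r0 + t * (r1 - r0);
  };
}

const values = [0, 25, 100];
// The domain is taken from the smallest and largest data values.
const domain = [Math.min(...values), Math.max(...values)];
const rScale = sqrtScale(domain, [0, 20]);
const radii = values.map(rScale);

console.log(radii); // [0, 10, 20]
```

A value four times larger (100 vs 25) gets a radius only twice as large, which means its circle covers four times the area.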

Reddit link

Here is an example below which mirrors our first assignment regarding style and the type of graph used to display data. The main difference is that there is a title and code to represent a new variable called medical treatments. Really, this variable is a category of treatments that are all displayed below. There are quite a few different medical treatments in this dataset, so it's best to see the graph itself.

The cost of medical treatments in the USA is drastically different from the cost in other countries! This type of bar graph is one of the main types of data visualization we went over last week. The added category has multiple variables which are structured within the data category called "medical treatments".

As we can see from the graphs, even a country with a large population such as India has lower costs in comparison to the United States. However, perhaps the United States treats more people -- or perhaps the medical treatments cost more. There is no indication of the number of people who may have gotten treatment.

The number of people per medical treatment may also correlate with the price of each medical treatment. It's difficult to fully grasp everything going on in this graph. However, it would be possible to display another graph based on the number of medically treated patients vs cost. And, as can be seen, displaying the country for each type of treatment is not a difficult way to draw the data.

Overall, I think this data is flashy -- it really brings the user in and it gets to the point... although I wonder about the "story" behind the data. It seems like there is more that can be told. Perhaps a scatterplot can emphasize an explanation behind the cost based on color distribution.

The information learned this past week opened up the opportunity of understanding just a bit more about data visualization.

Thursday, January 27, 2022

Reflection (Jan 30)

The article "A Tour Through the Visualization Zoo" introduces new readers to different forms of displaying data. Graphical perception of information is treated as an experiment in spatial positioning. Most commonly, there are bar graphs and line graphs. However, there are far more types of graphs that are not seen as often yet still have a place in displaying data. A Choropleth Map, for example, uses color across geographic areas to encode values.

Choropleth Map Interactive Example
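To show what "color encodes a value" can mean in practice, here is a minimal sketch of one way to do it: split the value range into equal bins and give each bin a shade. The `quantizeScale` helper, the region names, the values, and the three blue shades are all made up for illustration. Real choropleth maps may bin their values differently.

```javascript
// Quantize scale: split [min, max] into equal bins, one color per bin.
// Hypothetical helper and made-up values, for illustration only.
function quantizeScale([min, max], colors) {
  return (value) => {
    const t = (value - min) / (max - min);
    const index = Math.min(colors.length - 1, Math.max(0, Math.floor(t * colors.length)));
    return colors[index];
  };
}

const shades = ["#deebf7", "#9ecae1", "#3182bd"]; // light to dark blue
const colorFor = quantizeScale([0, 30], shades);

const regions = { "Region A": 4, "Region B": 15, "Region C": 30 };
const fills = Object.fromEntries(
  Object.entries(regions).map(([name, value]) => [name, colorFor(value)])
);

console.log(fills);
```

Each region ends up with a fill color, with higher values getting the darker shades.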

Data management is a powerful idea which, when paired with the right tools, can display helpful information for people. Positioning is everything. With the correct style of mapping, data brings focus to statistics that may previously have been quite dismal to look at in a table. This practical approach engages readers in understanding the diversity of a particular subject matter. Choosing an appropriate plot reveals a new angle which may not have been obvious at first. This not only creates balance but also provides a pleasing aesthetic through design.

A graph I located on reddit/r/dataisbeautiful shows different genres and their popularity according to IMDB. The years covered run from 1910 to 2021. Although it may take a second for the onlooker to find the dates provided (on the bottom), the popularity is quite interesting to look at. This appears to be an Area Graph. Area graphs are essentially line graphs with the region beneath the line colored in to denote trends.

The distribution of said genres from IMDB is shown below. Note: This is not based on the ratings of movies -- simply on the popularity of the overall genre.

For starters, the fall in Westerns after the 1970s certainly rings true. The creator also seems to have color-coded each type of movie. To open the image in a new browser tab (in case it is cramped on this page): Film Genre Popularity

This data seems to combine continuous (quantitative) values with nominal (qualitative) categories/labels. This mix and match is rather appealing and, when done right, makes the chosen data easy to describe. Personally, I find it interesting that Fantasy and Comedy have been the most consistent. Looking at the 1950s, Crime movies certainly stand out, and people cannot seem to get enough of Thrillers!

As a reminder, the years are on the bottom. In my opinion, the years should be located under each graph for easier reading. As a whole, though, this representation does the job of explaining popularity throughout the years.

In conclusion, regarding this last week of new information and data management, there appear to be quite a few different ways of displaying data. Coding brings forth an entirely new way to display data as well. It will be interesting to see where the class goes from here using D3.

Side note: The music data is from a recent post on Reddit. Here is a site with a bit of a portfolio from Bo McCready

Data Visualization - IGME 460