r/dataisbeautiful Nov 01 '22

[Topic][Open] Open Discussion Thread: Anybody can post a general visualization question or start a fresh discussion!

Anybody can post a question related to data visualization or discussion in the monthly topical threads. Meta questions are fine too, but if you want a more direct line to the mods, message them directly.

If you have a general question you need answered, or a discussion you'd like to start, feel free to make a top-level comment.

Beginners are encouraged to ask basic questions, so please be patient responding to people who might not know as much as yourself.


35 Upvotes

7

u/RustyKarma076 Nov 07 '22

Can we make r/DataIsBeautiful good again?

I see way too many posts on this sub that boil down to lazy attempts at trying to make a political claim. The data isn’t that neat, or fun to look at. All it is, is a bar graph form of “hey guys look how little Fox News talks about school shootings.” or “hey guys look how awful Christians are.”

The point of the sub is to show the data, and the well done presentation of the data. If the focus of your post is politics, and specifically how one political group is wrong/hypocritical, and makes little to no attempt at presenting data in an interesting way, it should be deleted.

This is a great example

1

u/Summoarpleaz Nov 11 '22

THANK YOU. I subbed to this not to exclusively read about politics or, really, to read about the data, but to look at smart and beautifully presented data. Idk why it has boiled down to bar charts.

1

u/RustyKarma076 Nov 11 '22

There’s nothing inherently wrong if the post is about politics. Even if the data is very clearly trying to make a point about one political group. I’m fine with that, as long as the data is “beautiful.”

There’s nothing interesting about bar graphs and pie charts. That’s all they are

edit: like this

It’s the top post of the day and it’s just a wall of text to complain about billionaires. There’s nothing interesting about it

1

u/Summoarpleaz Nov 11 '22

Yes agreed. I have no issue with politics but it seems like the posts have basically gone down in the data visualization aspect in favor of making a political point. Again, wouldn’t be an issue if only the data visualization part were punched up a notch.

1

u/beingsubmitted Nov 30 '22

I'm a little in between, here. I've posted OC a couple of times, and I'm working on a new one currently.

I for one do think aesthetics should be part of it, absolutely. Having posted OC before, though, you will get a lot of pushback for aesthetic choices if your colors don't perfectly interpolate or anything can be interpreted as not strictly accurate. There's a degree to which accuracy and aesthetics are at odds. For example, a pie chart can be perfectly accurate, but if certain colors are too similar or one color stands out without good reason, you will be accused of manipulating data or not presenting it honestly.

On the other hand, I also think some data just are beautiful, regardless of aesthetics. Interesting data are fun to look at on their own, and the data itself is part of the beauty, like r/oddlysatisfying. I think sometimes a post isn't about the data visualization being beautiful, but the data itself.

Ultimately, though, it's a matter of voting, particularly on new/rising posts. Given that only about one in a thousand viewers of a post, or fewer, actually upvotes or downvotes, individuals have a lot of power to reshape the sub by voting in new/rising. If more aesthetically minded posts are lifted, the OC will follow suit.

1

u/FalseEconomy Nov 17 '22

I am sure no one cares but the 90 second video of burgers in place of a bar graph pushed me over the edge. I am unsubscribing. This sub is in dire need of active moderation.

2

u/acbcccdc Nov 07 '22

2

u/Infamousscorpion Nov 10 '22

I've wondered the same thing in the past and was thinking of making my own. The big picture idea is that it is called a 'Sankey' diagram.

https://plotly.com/python/sankey-diagram/

Hope that helps
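For anyone who wants to try it, here is a minimal sketch of the idea using Plotly's `go.Sankey`, assuming Plotly is installed. The flows are made-up placeholder numbers, and `build_links` and `make_figure` are hypothetical helper names, not part of Plotly.

```python
def build_links(flows):
    """Turn (source, target, value) tuples into the index-based lists go.Sankey expects."""
    labels = []
    for src, dst, _ in flows:
        for name in (src, dst):
            if name not in labels:
                labels.append(name)
    index = {name: i for i, name in enumerate(labels)}
    link = {
        "source": [index[s] for s, _, _ in flows],
        "target": [index[t] for _, t, _ in flows],
        "value": [v for _, _, v in flows],
    }
    return labels, link


# Placeholder data, invented purely for illustration.
FLOWS = [
    ("Applications", "No reply", 30),
    ("Applications", "Interview", 10),
    ("Interview", "Offer", 2),
    ("Interview", "Rejected", 8),
]


def make_figure(flows):
    # Imported lazily so build_links can be used without Plotly installed.
    import plotly.graph_objects as go

    labels, link = build_links(flows)
    return go.Figure(data=[go.Sankey(node=dict(label=labels), link=link)])
```

Calling `make_figure(FLOWS).show()` should open the diagram in a browser. Most of the work is usually getting your data into the source/target/value shape, which is why that part is split out.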

2

u/talkingtunataco501 Nov 10 '22 edited Nov 10 '22

For 2.5 years, I collected this data.

  • Drinking: whether I drank on a day or not, how much I had to drink, whether I did it for fun or stress, and whether I had a hangover or not the next day
  • Marijuana: whether I had pot on a day or not, the method I took (smoke or edible), whether I did it for fun or stress, and whether I had a hangover or not the next day

I started collecting this data to track some interesting things. From it, I figured out that I get alcohol-induced migraines. I also found out that on average I drank 1.05 times per week with 3.1 drinks per week, and I used pot an average of 1.93 times per week. This is during that particular 2.5-year period.

What are some good visualizations that I can do with this data?

2

u/giraffeingreen Nov 30 '22

You can do a box plot, see the outliers and focus on those.

Maybe you smoke more during the winter?
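A rough sketch of that seasonal box plot idea, assuming the log can be exported as (date, count) pairs. The sample `LOG` is invented, the season mapping assumes the northern hemisphere, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import quantiles

# Assumption: northern-hemisphere meteorological seasons.
SEASONS = {12: "winter", 1: "winter", 2: "winter",
           3: "spring", 4: "spring", 5: "spring",
           6: "summer", 7: "summer", 8: "summer",
           9: "autumn", 10: "autumn", 11: "autumn"}

# Placeholder rows, invented for illustration: (day, drinks that day).
LOG = [
    (date(2022, 1, 1), 0), (date(2022, 1, 2), 1), (date(2022, 2, 1), 2),
    (date(2022, 12, 5), 3), (date(2022, 12, 6), 4), (date(2022, 7, 4), 5),
]


def group_by_season(log):
    """Collect the daily counts under the season each date falls in."""
    groups = defaultdict(list)
    for day, count in log:
        groups[SEASONS[day.month]].append(count)
    return dict(groups)


def five_number_summary(values):
    """Min, Q1, median, Q3, max; needs at least two values."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def plot_by_season(log):
    # Imported lazily so the grouping code runs without Matplotlib.
    import matplotlib.pyplot as plt

    groups = group_by_season(log)
    names = [s for s in ("winter", "spring", "summer", "autumn") if s in groups]
    fig, ax = plt.subplots()
    ax.boxplot([groups[s] for s in names])
    ax.set_xticks(range(1, len(names) + 1))
    ax.set_xticklabels(names)
    ax.set_ylabel("drinks per day")
    return fig
```

The same grouping works for the marijuana column, or by weekday instead of season if you swap the key.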

2

u/_artbreaker Nov 13 '22

I have been thinking recently, especially with COP happening, that this subreddit could create something that collectively visualises and tracks the scale of climate impact as it's happening. Potentially as a micro-website.

Like how many countries have experienced drought, flood, unprecedented storms, famine etc. This could also be linked to news stories around the world.

I feel like the best thing about this sub is the way it can help convey complex information at scale. We get news articles around floods in Pakistan, droughts in China, but what does that all look like together?

1

u/CrowsinPrism Nov 26 '22

Great point and I think this is a great idea

2

u/SullyPanda76cl Nov 14 '22

Hi... can we pin a "101" thread on animating data?

I see that's a very common question for newbie enthusiasts (like me)

1

u/TempoMentalWriter Nov 02 '22

I'm looking for a software package that would allow me to do a timeline visualization, similar to what is done on Wikipedia for, say, musical groups.

https://imgur.com/a/bJPSFrv for an example.

1

u/elliotboney Nov 03 '22

I'm wondering if you could just do a Gantt chart in Excel, but with years instead of the typical months.

There are also free timeline visualization tools, like this: https://www.lucidchart.com/pages/examples/timeline-maker

1

u/jasonjonesresearch OC: 5 Nov 03 '22

R, ggplot2, geom_rect

See "A Timeline Using geom_rect" on https://plotly.com/ggplot2/geom_rect/
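The same one-rectangle-per-span idea can also be sketched in Python with Matplotlib's `broken_barh`, for anyone not using R. The members and years below are invented placeholders, and `to_bars` and `plot_timeline` are hypothetical names.

```python
def to_bars(spans):
    """(label, start_year, end_year) -> {label: [(start, width), ...]} for broken_barh."""
    bars = {}
    for label, start, end in spans:
        bars.setdefault(label, []).append((start, end - start))
    return bars


# Placeholder line-up, invented for illustration.
SPANS = [
    ("Singer", 1990, 2005),
    ("Drummer A", 1990, 1996),
    ("Drummer B", 1996, 2005),
    ("Singer", 2010, 2015),
]


def plot_timeline(spans):
    # Imported lazily so to_bars runs without Matplotlib.
    import matplotlib.pyplot as plt

    bars = to_bars(spans)
    fig, ax = plt.subplots()
    for row, segments in enumerate(bars.values()):
        # One row per label; each segment is a horizontal rectangle.
        ax.broken_barh(segments, (row - 0.4, 0.8))
    ax.set_yticks(range(len(bars)))
    ax.set_yticklabels(list(bars))
    ax.set_xlabel("year")
    return fig
```

A member who leaves and rejoins just gets two segments on the same row, which is what the Wikipedia-style timelines do.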

1

u/Zibbulon Nov 03 '22

Hello, I am in the process of learning Matplotlib and it's been very fun so far. Do you know where I can find basic public databases / files that contain various information on which to practice making graphs?

Thank you.

1

u/FeistyConstruction60 Nov 06 '22

People, I would like to know your opinion about the collection of our data without our awareness.

1

u/ShounakDas Nov 06 '22

What are the best data visualization tools that can also export to mp4? Using Python and APIs.

1

u/Cloudkid78 Nov 07 '22

That's honestly great! (Say hi guys, this screenshot is for my Statistics and Analysis homework.)

1

u/worksofter Nov 08 '22

I have some data where the values decrease the further along both the x and y axes you go. What is an attractive yet effective way of showing this data? The goal is that somebody can look at where they are on the x axis, and where they are on the y axis, and find the 'meeting point' for the answer (sorry to be vague). Would appreciate any help. Thank you

https://imgur.com/cQOqRbk

1

u/Helique Nov 09 '22

I am looking for a new job, are there any standardized data formats for recording the job search process? I have seen lots of people create visualizations, and it would be cool if we could standardize on data formats.

1

u/pappasmurf91 Nov 10 '22

So sort of an odd request considering this group's name. But I'm a teacher looking to talk about graphs and how they can be used for good and bad. I didn't know if people had "data is ugly" examples? If I need to make this into a different thread let me know.

1

u/vitaliyh OC: 2 Nov 14 '22

How would one go about making a heatmap of persons earning more than $100,000 per year, assuming each person is a 50-mile circle rather than a dot?...
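One possible approach, sketched under assumptions: lay a grid over the map and, for each grid point, count how many people's circles cover it. "50-mile circle" is read here as a 50-mile radius, the function names are hypothetical, and the haversine formula treats the Earth as a sphere.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean radius; spherical approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in miles."""
    p1, p2 = radians(lat1), radians(lat2)
    dlat = p2 - p1
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(p1) * cos(p2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def coverage(grid_points, people, radius_miles=50):
    """For each grid point, count the people whose circle covers it."""
    return [
        sum(haversine_miles(glat, glon, plat, plon) <= radius_miles
            for plat, plon in people)
        for glat, glon in grid_points
    ]
```

The resulting counts are the heat values; reshaped to the grid they can go into something like Matplotlib's `imshow`. This brute-force loop is fine for a few thousand people, but a spatial index would be needed well before a full census-sized dataset.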

1

u/MatteDambro Nov 14 '22

Where can I find data on luxury goods inflation over 20 years?

If it isn't aggregated, can I get price indices over the years for:

  • 5-star hotels
  • Luxury fashion
  • Sports cars
  • Yachts
  • Villas

1

u/skipjack- Nov 15 '22 edited Nov 16 '22

So I'm relatively new to this sub (and reddit as a whole), but I have a few interactive visualizations I'd like to share and get feedback on. I have two questions before I move forward though, both relating to the "Title" and image limitation...

  • The sourcing and description must be in the image?
  • Is it ok for me to respond to my own post with a link to the interactive version?

2

u/skipjack- Nov 16 '22

Ok looks like the answer to the first question is "no":

[OC] posts must state the data source(s) and tool(s) used in a top-level comment.

And for the second, yes (implied by the same quote above I think).

2

u/heresacorrection OC: 24 Nov 17 '22

Yes but please no youtube/social media spamming.

1

u/gftmc Nov 15 '22

I'm looking to do some clustering over time. What kind of tools would you suggest? (I can program well enough)

Specifically, I have a group of people and connections. The connections strengthen or weaken over time. I want to show how strong a connection between two individuals is, and cluster them together the stronger the connections are.
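If you can program, one simple starting point is to take a snapshot of the connection strengths at each point in time and group everyone who is linked by a connection above some threshold. The sketch below does that with a small union-find; it is plain connected-components grouping, not a full community-detection method, and the names and threshold are assumptions.

```python
def clusters(people, strengths, threshold):
    """Group people linked, directly or via others, by connections >= threshold.

    strengths: {(person_a, person_b): strength} for one point in time.
    """
    parent = {p: p for p in people}

    def find(p):
        # Walk up to the root, shortening the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), strength in strengths.items():
        if strength >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for p in people:
        groups.setdefault(find(p), set()).add(p)
    return sorted(sorted(g) for g in groups.values())
```

Running it once per time step gives a sequence of groupings you can animate or draw as an alluvial-style chart. For the drawing itself, a force-directed layout with edge weights (for example networkx's `spring_layout`) pulls strongly connected people together without needing a hard threshold.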

1

u/sinncross Nov 17 '22

Hi, I'm wondering what is the best way to graph a survey question related to ranked choices from 3 different demographics?

e.g.: Rank the following fruit in order of deliciousness: apples, watermelon, lemons. And the answers come from kids, teenagers and adults.
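Whatever chart you end up with, the first step is usually to tally how often each item got each rank within each group. A minimal sketch, assuming each response is a (group, ranked list) pair; the function names are hypothetical.

```python
from collections import Counter, defaultdict


def rank_counts(responses):
    """(group, [items in ranked order]) -> {group: {item: Counter({rank: n})}}."""
    table = defaultdict(lambda: defaultdict(Counter))
    for group, ranking in responses:
        for rank, item in enumerate(ranking, start=1):
            table[group][item][rank] += 1
    return table


def mean_rank(counter):
    """Average rank for one item in one group (lower means better liked)."""
    total = sum(counter.values())
    return sum(rank * n for rank, n in counter.items()) / total
```

From that table, two common options are a stacked bar per fruit showing the share of 1st/2nd/3rd placements, drawn as one small panel per demographic, or a simple dot plot of mean rank with one colour per demographic.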

1

u/melent3303 Nov 17 '22

What would be the best way to visualize this data set:

Black Friday Death Count

It has data on the: year, location, # of injured, and # of deaths.

1

u/Historical-Jello1745 Nov 21 '22

Health data question: With things like smart watches, calorie tracking apps and sleep cycle alarm apps - users are gathering a lot of data about their daily habits. While a lot of it might be quite dirty (e.g. sleep alarm apps may miss records for weekends, or automatic time guesses may be off, users may lie to calorie tracking apps) - in aggregate the data has to be somewhat useful, no?

Wondering:

  1. Why there aren't any major projects asking for this data?
  2. What are the risks/implications of using this data - even if donated freely by volunteers?
  3. What could such data actually be useful for? (e.g. HR data for a million people over 5 years, OR sleep timing and quality for similar sets of people, OR complete food logs from fitness enthusiasts or chronic dieters)

1

u/FearlessHead8689 Nov 23 '22

Hi!

I am looking to make a career change to data analytics in the next 2-5 years. Right now, I still work with data quite regularly in my current role and often make basic visuals in Excel.

I would like to up my data visual game! I assume Excel won't cut it, so what programs should I start practicing with?

Thanks!

1

u/qthrow12 Nov 24 '22

For any data people here, I have a question about how to present some data.

I've got a query that returns periods of vacation time for a group of people, over many years. (5 in this case).

It also identifies when there's a stat day.

So my data, for example, looks like this. This would be 1 row of data:

startdate - 2018-01-01 00:00:00.000

enddate - 2018-01-02 00:00:00.000

count of members - 2

Stat Day found (within that range) - Y

I'm trying to figure out how to compare 2018 to 19 to 20 to 21 etc.

How would I best display this?

My end goal is to hopefully be able to use this to say, for example, 2nd week of February you typically have X members off on vacation. So a manager could prepare.

The problem I'm having is that every year moves everything forward a day, and you can't just add those shifts so they all balance out, because a day that was in week 1 of the month might be part of week 2 now, but the stat fell in week 1 regardless, and that's the week people might take off.

I hope this makes sense. Thank you so much for your time.
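One way to make the years comparable is to bucket every day by its ISO week number instead of its calendar date, so "week 7" means the same thing in 2018 and 2021. A minimal sketch, assuming each row is (startdate, enddate, member count) with the end date inclusive; the function name is hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def members_off_by_week(rows):
    """rows: (start_date, end_date, member_count), end date inclusive.

    Returns {(iso_year, iso_week): member-days off in that week}.
    """
    totals = defaultdict(int)
    for start, end, members in rows:
        day = start
        while day <= end:
            iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
            totals[(iso[0], iso[1])] += members
            day += timedelta(days=1)
    return dict(totals)
```

With that table you can plot week number on the x axis and one line per year, or average across years to get the "typical" week. It does not fully solve the stat day issue, since a holiday fixed to a date can still land in different ISO weeks; an alternative is to number weeks relative to the stat day itself (week of the stat, week before, week after).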

1

u/[deleted] Nov 24 '22

I'd love to see somebody make a visualisation of how many countries actually did boycott the Qatar World Championship broadcasts. There seem to be mixed results, with some posts stating that Germany had half as many viewers as expected and Denmark had more than during a previous match against Russia.

1

u/TurtleChomps Nov 29 '22

Would love to know software used for the beautiful charts (static), maps etc… and where can we learn it? Any classes specific to data viz or

1

u/Hayk94 Nov 29 '22

Hey guys, a data viz noob here so excuse my foolish question.

Say we have a chart with bars and lines combined. And it's a dual Y axis chart. One axis is for the bars and the other one for the line. Now my understanding is that the two axes should have some correlation, shouldn't they? Is there any scenario where it makes sense for the axes not to be correlated?

Also, any links to reading material on the rules for this type of chart, and whether the axes should be correlated or not, would be appreciated.

1

u/YOLO4JESUS420SWAG Nov 29 '22

Where can I make a request? I love this sub but don't know shit about data mining nor how to make cool graphics like I see here. After hitting 8bn people on earth I was hoping to see a visual of something like the last 100 years global map of what areas exploded and by how much. It'd be so cool to see.

1

u/Atomicityy Nov 30 '22

Can someone recommend a noob friendly program to analyze data? Will said program also help me transform my findings into charts and graphs?

I want to analyse my Spotify Wrapped playlists from 2016 til now. Examples: how is each decade/genre represented, nationalities of the artists, which songs return over the years etc.

1

u/zestyping Nov 30 '22 edited Nov 30 '22

Members and mods, let's talk about misleading visualizations.

The subreddit description says DataIsBeautiful is for visualizations that effectively convey information.

To me, that means highly misleading visualizations shouldn't qualify. Don't we all want a high-quality subreddit?

Of course, we all make mistakes sometimes; I'm not talking about missing one point out of 100, or plotting a point a few pixels off to the side, the kind of thing that could be fixed with a small correction. I'm talking about egregious designs, where a visual element is 2 or 3 or even 10 times larger than it should be. Whether or not there's an intent to mislead, these are simply low-quality visualizations, and we'd be better off without them.

What do you think? Should this be addressed in the FAQ? Should there be a guideline?

1

u/fengchiafatty Dec 10 '22

I am looking for help. I have done bar chart races and line chart races. I would like a new or different way of showing the data. My students are reading books. I am showing how many words they have read. I wondered if anyone had any fun or exciting ways to display the data. I had the idea of showing a car leaving our school and driving away, with the number of words equal to km, or a rocket ship flying away from Earth. TIA