This week, I learned more about working with data using Python, Pandas, NumPy, and visualization tools. I already have some experience with coding, so some parts felt familiar, especially reading code, testing outputs, and understanding how variables work. However, this week helped me practice applying those skills specifically to data analysis and visualization.
One important thing I learned was how to choose the correct type of plot based on the variables. For example, a histogram is useful for showing the distribution of one numeric variable, a boxplot is helpful when comparing a numeric variable across categories, and a bar chart or count plot works well for categorical data. I realized that making a graph is not just about writing the code correctly. It is also about understanding what the question is asking and choosing a visualization that clearly answers it.
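To make that decision rule concrete for myself, here is a minimal sketch of it as a lookup. The function name `suggest_plot` and the labels "numeric" and "categorical" are my own hypothetical choices, not part of any library, and it only covers the three cases I mentioned above.

```python
def suggest_plot(x_kind, y_kind=None):
    """Suggest a plot type from variable kinds ('numeric' or 'categorical').

    Hypothetical helper: it only encodes the three cases from this post.
    """
    if y_kind is None:
        # One variable: look at its distribution or its counts.
        if x_kind == "numeric":
            return "histogram"
        return "bar chart or count plot"
    if {x_kind, y_kind} == {"numeric", "categorical"}:
        # A numeric variable compared across categories.
        return "boxplot"
    return "not covered in this sketch"


print(suggest_plot("numeric"))
print(suggest_plot("numeric", "categorical"))
```

It is a simplification, but writing it out reminds me to identify the variable types first and pick the plot second.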
I also practiced problems involving discrete distributions, such as binomial probability and expected value. These problems helped me understand how probability connects to real situations. For example, the lottery expected value problem showed me that I need to include both the prize and the cost of playing. At first, I was focused only on the winning amount, but after working through it, I understood why the expected winnings can be negative.
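Here is a small sketch of both ideas using only the standard library. The lottery numbers (a $1000 prize, a 1 in 1000 chance of winning, a $2 ticket) are made up for illustration and are not the numbers from the actual problem; the function names are my own.

```python
from fractions import Fraction
from math import comb


def expected_net_winnings(prize, p_win, cost):
    """Expected winnings after paying for the ticket.

    Win:  (prize - cost) with probability p_win
    Lose: (-cost)        with probability 1 - p_win
    This simplifies to prize * p_win - cost.
    """
    return prize * p_win - cost


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical lottery: the numbers are assumptions for illustration only.
ev = expected_net_winnings(prize=1000, p_win=Fraction(1, 1000), cost=2)
print(ev)  # -1, so on average I lose $1 per ticket

# Exactly 2 heads in 3 fair coin flips.
print(binomial_pmf(2, 3, 0.5))  # 0.375
```

Leaving out the cost would give an expected value of +1 instead of -1, which is exactly the mistake I made at first.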
Another topic I worked on was campaign contribution data. I practiced grouping data by candidate, occupation, and employment status using functions like groupby(), value_counts(), mean(), median(), and crosstab(). This helped me see how data can be summarized in different ways depending on the question. I also learned that data analysis is not only about getting numbers, but also about explaining what those numbers mean.
A concept I am still working on is deciding the best plot without second-guessing myself. I can usually understand the code after seeing it, but I want to get better at choosing the right visualization on my own. I also want to keep practicing stacked bar charts and normalized crosstabs because they are useful, but they require careful interpretation.
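To keep the grouping functions straight, here is a minimal sketch on a tiny made-up table. The column names (`candidate`, `occupation`, `amount`) and every value are hypothetical and are not taken from the real contribution data.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "candidate": ["A", "A", "B", "B", "B"],
    "occupation": ["Retired", "Teacher", "Retired", "Retired", "Teacher"],
    "amount": [100.0, 50.0, 200.0, 100.0, 25.0],
})

# Typical contribution per candidate, two ways.
mean_by_candidate = df.groupby("candidate")["amount"].mean()
median_by_candidate = df.groupby("candidate")["amount"].median()

# How many contributions came from each occupation.
occupation_counts = df["occupation"].value_counts()

# Raw counts of candidate vs. occupation.
counts = pd.crosstab(df["candidate"], df["occupation"])

# Normalized by row: each candidate's row sums to 1, so the values are
# shares within a candidate, not shares of all contributions.
row_shares = pd.crosstab(df["candidate"], df["occupation"], normalize="index")

print(counts)
print(row_shares)
```

The normalized version is the one I need to read carefully: with normalize="index" the comparison is within each candidate, while normalize="columns" would compare within each occupation instead.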
Overall, this week helped me build on the coding experience I already have and apply it more toward data science. I learned that before writing code, I should slow down, read the question carefully, identify the variables, decide whether they are numerical or categorical, and then choose the best method. This process will help me become more confident in future data analysis work.