This week, I learned more about probability distributions, density plots, histograms, and how to visualize data using Python libraries such as Pandas, Matplotlib, Seaborn, and SciPy. I practiced creating density plots, box plots, cumulative distribution plots, and histograms using real datasets. I also learned how changing settings like bin width, bandwidth, transparency, and sample size can affect the appearance and interpretation of graphs. Another important topic was understanding skewness and how transformations such as log10 can help make heavily skewed data easier to analyze. One thing I found interesting was how probability density functions (PDFs) and histograms can represent the same data differently. Before this week, I thought graphs mostly showed the same information in different styles, but now I understand that each type of plot has a different purpose and can make patterns easier or harder to notice. I also learned that larger sample sizes tend to reflect the true distribution...
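The points about bin width and the log10 transformation can be illustrated with a small sketch. It is not taken from the course labs: the data are synthetic lognormal values standing in for a real right-skewed dataset, and the variable names are made up for illustration.

```python
import numpy as np
from scipy import stats

# Assumption: synthetic right-skewed data (lognormal), not a course dataset.
rng = np.random.default_rng(0)
values = rng.lognormal(mean=3.0, sigma=1.0, size=5000)

# Skewness before and after a log10 transformation.
# A value near 0 means the distribution is roughly symmetric.
skew_raw = stats.skew(values)
skew_log = stats.skew(np.log10(values))
print(f"skewness of raw values:   {skew_raw:.2f}")
print(f"skewness of log10 values: {skew_log:.2f}")

# The same data binned two ways: the total count is identical,
# but coarse bins hide detail that finer bins reveal.
counts_coarse, edges_coarse = np.histogram(values, bins=5)
counts_fine, edges_fine = np.histogram(values, bins=50)
print("coarse bin width:", edges_coarse[1] - edges_coarse[0])
print("fine bin width:  ", edges_fine[1] - edges_fine[0])
```

With 5 bins nearly all observations fall into the first bin, so the long right tail is hard to see. With 50 bins the shape becomes visible, and after the log10 transformation the skewness drops close to zero.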
This week, I learned more about data distributions, probability, and how Pandas can be used for data analysis and aggregation. At first, I was confused about some of the concepts and coding exercises, especially when working with grouping, crosstabs, and understanding probability distributions. However, after rewatching the course videos and practicing with the other lab files, the topics became much clearer to me. I realized that repetition and hands-on practice really help me understand coding and data science concepts better. One concept that took me some time to understand was the difference between PDFs and CDFs, especially how probabilities are interpreted for continuous variables. I was also initially confused about aggregation with grouping in Pandas because there were many different functions like groupby(), value_counts(), and aggregation methods being used together. After trying the labs multiple times, I started understanding how these functions work together to sum...
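The PDF versus CDF distinction and the Pandas grouping functions mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the normal distributions and the tiny table (with hypothetical columns species, site, and mass) are invented here and do not come from the labs.

```python
import pandas as pd
from scipy import stats

# For a continuous variable, the PDF gives a density (curve height),
# while probabilities come from areas, i.e. differences of the CDF.
dist = stats.norm(loc=0, scale=1)
density_at_zero = dist.pdf(0)                    # about 0.399, not a probability
prob_within_one_sd = dist.cdf(1) - dist.cdf(-1)  # about 0.683

# A density can exceed 1 when the distribution is narrow.
narrow = stats.norm(loc=0, scale=0.1)
narrow_density = narrow.pdf(0)                   # about 3.99

# Assumption: a made-up table used only to show the three functions.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b", "b"],
    "site": ["x", "y", "x", "x", "y"],
    "mass": [1.0, 3.0, 2.0, 4.0, 6.0],
})

counts = df["species"].value_counts()                         # rows per category
summary = df.groupby("species")["mass"].agg(["mean", "sum"])  # stats per group
table = pd.crosstab(df["species"], df["site"])                # counts for two categories

print(counts)
print(summary)
print(table)
```

Roughly, value_counts() answers "how many rows per category", groupby() with an aggregation answers "what is the statistic per category", and crosstab() counts combinations of two categorical columns.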