CST383: Learning Probability Distributions and Data Visualization in Python

This week, I learned more about probability distributions, density plots, and histograms, and about visualizing data with Python libraries such as Pandas, Matplotlib, Seaborn, and SciPy. I practiced creating density plots, box plots, cumulative distribution plots, and histograms using real datasets. I also learned how bin width, bandwidth, transparency, and sample size affect the appearance and interpretation of a graph. Another important topic was skewness, and how a transformation such as log10 can make heavily skewed data easier to analyze.
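To make the log10 point concrete, here is a minimal sketch. It uses synthetic log-normal data as a stand-in for a real right-skewed column (the data, the seed, and the variable names are assumptions for illustration, not the course datasets), and it measures skewness before and after the transform with `scipy.stats.skew`.

```python
import numpy as np
from scipy import stats

# Assumption: synthetic, strictly positive, right-skewed data stands in
# for a real column such as incomes or prices.
rng = np.random.default_rng(0)
values = rng.lognormal(mean=3.0, sigma=1.0, size=5000)

# Sample skewness of the raw values (long right tail, so clearly positive).
skew_raw = stats.skew(values)

# log10 compresses the long right tail; it only works for positive values.
skew_log = stats.skew(np.log10(values))

print(f"skewness before log10: {skew_raw:.2f}")
print(f"skewness after log10:  {skew_log:.2f}")
```

A skewness close to zero after the transform suggests a roughly symmetric shape, which is usually easier to read in a histogram or density plot.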

One thing I found interesting was how probability density functions (PDFs) and histograms can represent the same data differently. Before this week, I thought graphs mostly showed the same information in different styles, but now I understand that each type of plot has a different purpose and can make patterns easier or harder to notice. I also learned that larger samples tend to follow the true distribution more closely, which helped me better understand randomness and sampling variability.
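The sample-size effect can be checked numerically. The sketch below is only an illustration under stated assumptions: it draws from a standard normal distribution, and the helper `max_cdf_gap` is a hypothetical name. It measures the largest gap between the sample's cumulative proportions and the theoretical normal CDF for a small and a large sample.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def max_cdf_gap(n):
    """Largest gap between the empirical CDF (taken at the sorted sample
    points) and the theoretical standard normal CDF, for a sample of size n."""
    sample = np.sort(rng.normal(loc=0.0, scale=1.0, size=n))
    ecdf = np.arange(1, n + 1) / n  # proportion of points <= each sorted value
    return np.max(np.abs(ecdf - stats.norm.cdf(sample)))

gap_small = max_cdf_gap(50)
gap_large = max_cdf_gap(50_000)

print(f"n = 50:     max gap = {gap_small:.4f}")
print(f"n = 50000:  max gap = {gap_large:.4f}")
```

The gap for the larger sample should be much smaller, which matches the idea that the remaining mismatch is sampling variability rather than a problem with the theoretical curve.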

At first, I was confused about the difference between a histogram and a density plot, especially when using parameters like density=True and bw_method. I also had to spend extra time understanding how normal distributions work and why sampled data does not perfectly match the theoretical PDF curve. After practicing more and reviewing the examples, the concepts started making more sense. I still want to improve my understanding of how bandwidth values affect density plots because sometimes it is hard to tell which bandwidth is considered the “best” choice for a dataset.
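A small sketch can help separate the two parameters. It assumes synthetic normal data and uses `np.histogram` and `scipy.stats.gaussian_kde` directly instead of a plotting call; the helper `count_peaks` is a hypothetical name used only for this illustration. `density=True` rescales the bars so their areas sum to 1, and a scalar `bw_method` controls how smooth the density curve is.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Assumption: a small synthetic sample from a standard normal distribution.
rng = np.random.default_rng(2)
data = rng.normal(loc=0.0, scale=1.0, size=200)

# density=True rescales bar heights so that the bar areas add up to 1,
# which puts the histogram on the same scale as a PDF.
heights, edges = np.histogram(data, bins=20, density=True)
hist_area = np.sum(heights * np.diff(edges))

grid = np.linspace(-4.0, 4.0, 801)

def count_peaks(curve):
    """Count strict local maxima in a curve sampled on a grid."""
    inner = curve[1:-1]
    return int(np.sum((inner > curve[:-2]) & (inner > curve[2:])))

# A scalar bw_method is used as the bandwidth factor:
# small values follow individual points, large values smooth them out.
narrow = gaussian_kde(data, bw_method=0.05)(grid)
wide = gaussian_kde(data, bw_method=1.0)(grid)

peaks_narrow = count_peaks(narrow)
peaks_wide = count_peaks(wide)

print(f"histogram area with density=True: {hist_area:.3f}")
print(f"peaks with bw_method=0.05: {peaks_narrow}")
print(f"peaks with bw_method=1.0:  {peaks_wide}")
```

There is no single "best" bandwidth here: a bumpy curve is probably undersmoothed and a curve that hides real structure is oversmoothed. When `bw_method` is left unset, `gaussian_kde` falls back to Scott's rule, which is a reasonable starting point before adjusting by eye.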

Overall, this week helped me become more comfortable with data visualization and statistical analysis in Python. I feel more confident reading graphs, understanding distributions, and writing plotting code compared to before.
