CST383: Reflection on Data Distributions and Pandas Aggregation

This week, I learned more about data distributions, probability, and how Pandas can be used for data analysis and aggregation. At first, some of the concepts and coding exercises confused me, especially grouping, crosstabs, and probability distributions. After rewatching the course videos and practicing with the other lab files, the topics became much clearer. I realized that repetition and hands-on practice are what help me understand coding and data science concepts.

One concept that took me some time to understand was the difference between PDFs and CDFs, especially how probabilities are interpreted for continuous variables. For a continuous variable, the PDF gives a density rather than a probability, and a probability comes from the area under the curve over an interval, which is what the CDF accumulates. I was also initially confused about aggregation with grouping in Pandas, because functions like groupby() and value_counts() were being combined with several aggregation methods. After working through the labs multiple times, I started to understand how these functions fit together to summarize and analyze datasets efficiently.
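To make the PDF and CDF distinction concrete, here is a minimal sketch. It assumes a normal distribution and uses `statistics.NormalDist` from Python's standard library; both are my own choices for illustration, not something taken from the course labs, and the variable names are hypothetical.

```python
from statistics import NormalDist

# Assumption: a standard normal distribution, chosen only for illustration.
standard = NormalDist(mu=0, sigma=1)

# The PDF returns a density at a point, not a probability.
density_at_zero = standard.pdf(0)

# The CDF returns P(X <= x), which is a real probability.
prob_below_zero = standard.cdf(0)

# The probability of landing in an interval is a difference of CDF values.
prob_within_one_sd = standard.cdf(1) - standard.cdf(-1)

# A narrow distribution shows why a density is not a probability:
# its PDF value at the peak is greater than 1.
narrow = NormalDist(mu=0, sigma=0.1)
narrow_density = narrow.pdf(0)

print(f"pdf(0) = {density_at_zero:.4f}")
print(f"cdf(0) = {prob_below_zero:.4f}")
print(f"P(-1 < X < 1) = {prob_within_one_sd:.4f}")
print(f"narrow pdf(0) = {narrow_density:.4f}")
```

The last value is the one that cleared things up for me: a density can be larger than 1, but any probability computed from the CDF stays between 0 and 1.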

I also found it interesting how data science uses distributions and probabilities to make predictions. The bike-sharing lab helped me see how real-world datasets can be analyzed with Pandas operations that work on whole columns instead of explicit loops. Overall, this week improved my confidence in working with datasets, reading distributions, and using Pandas for analysis.
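As a rough sketch of that loop-free style, the example below runs value_counts(), groupby() with agg(), and crosstab() on a tiny made-up table. The data and the column names (`season`, `day_type`, `rentals`) are hypothetical and are not the actual bike-sharing dataset from the lab.

```python
import pandas as pd

# Hypothetical toy data; not the lab's real bike-sharing dataset.
rides = pd.DataFrame({
    "season": ["spring", "spring", "summer", "summer", "summer", "fall"],
    "day_type": ["weekday", "weekend", "weekday", "weekday", "weekend", "weekday"],
    "rentals": [120, 80, 300, 260, 210, 150],
})

# value_counts(): how many rows fall into each category.
rows_per_season = rides["season"].value_counts()

# groupby() + agg(): several summaries per group in one call, no loop needed.
season_summary = rides.groupby("season")["rentals"].agg(["mean", "sum"])

# crosstab(): row counts for each combination of two categorical columns.
season_by_day_type = pd.crosstab(rides["season"], rides["day_type"])

print(rows_per_season)
print(season_summary)
print(season_by_day_type)
```

Each call replaces what would otherwise be a loop that builds a dictionary of counts or totals by hand, which is why these functions are so often used together.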
