Coursera | Introduction to Data Science in Python（University of Michigan）| Assignment4
To be honest, the assignments in this course are fairly difficult, especially Assignment 4 (sigh), so I am posting them here for reference. If I have time (and there is demand), I will put all the code on GitHub (though I worry it will get taken down). First, the related posts for this course:
- Coursera | Introduction to Data Science in Python (University of Michigan) | Quiz answers
- Coursera | Introduction to Data Science in Python (University of Michigan) | Assignment1
- Coursera | Introduction to Data Science in Python (University of Michigan) | Assignment2
- Coursera | Introduction to Data Science in Python (University of Michigan) | Assignment3
- Coursera | Introduction to Data Science in Python (University of Michigan) | Assignment4

By the way, a quick plug for my own blog: from now on, all of my CSDN articles will be posted there as well. After some hesitation I decided to post Assignment 4 after all. There is almost no code for it online right now, and working through it made my head spin. Still, I learned a lot from it. Good luck, everyone!
Assignment 4, description.
In this assignment you must read in a file of metropolitan regions and associated sports teams from assets/wikipedia_data.html and answer some questions about each metropolitan region. Each of these regions may have one or more teams from the “Big 4”: NFL (football, in assets/nfl.csv), MLB (baseball, in assets/mlb.csv), NBA (basketball, in assets/nba.csv) or NHL (hockey, in assets/nhl.csv). Please keep in mind that all questions are from the perspective of the metropolitan region, and that this file is the “source of authority” for the location of a given sports team. Thus teams which are commonly known by a different area (e.g. “Oakland Raiders”) need to be mapped into the metropolitan region given (e.g. San Francisco Bay Area). This will require some human data understanding outside of the data you’ve been given (e.g. you will have to hand-code some names, and might need to google to find out where teams are)!
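The hand-coding step described above can be sketched roughly as follows. This is only an illustration: the lookup table `NFL_TEAM_TO_AREA`, the helper names, and the idea that team names may carry trailing markers such as `*` or `+` are my assumptions, not something the assignment text specifies. Each sport would need its own table, since nicknames such as "Giants" are reused across leagues.

```python
import re

# Hypothetical, partial lookup for ONE sport: team nickname -> metropolitan
# area as it is named in the Wikipedia table. The entries are illustrative only.
NFL_TEAM_TO_AREA = {
    "Raiders": "San Francisco Bay Area",
    "49ers": "San Francisco Bay Area",
}

def clean_team_name(raw):
    """Drop assumed trailing markers ('*', '+') and surrounding whitespace."""
    return re.sub(r"[*+]", "", raw).strip()

def area_for(team, lookup=NFL_TEAM_TO_AREA):
    """Return the metropolitan area for a team, or None if it is not hand-coded yet."""
    cleaned = clean_team_name(team)
    for nickname, area in lookup.items():
        if cleaned.endswith(nickname):
            return area
    return None

print(area_for("Oakland Raiders*"))
```

Returning `None` for unmapped teams makes it easy to list the names that still need a manual entry.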
For each sport I would like you to answer the question: what is the win/loss ratio’s correlation with the population of the city it is in? Win/loss ratio refers to the number of wins over the number of wins plus the number of losses. Remember to calculate the correlation with pearsonr, so you are going to send in two ordered lists of values: the populations from the wikipedia_data.html file and the win/loss ratio for a given sport, in the same order. Average the win/loss ratios for those cities which have multiple teams of a single sport. Each sport is worth an equal amount in this assignment (20% × 4 = 80% of the grade for this assignment). You should only use data from year 2018 for your analysis. This is important!
- Do not include data about the MLS or CFL in any of the work you are doing, we’re only interested in the Big 4 in this assignment.
- I highly suggest that you first tackle the four correlation questions in order, as they are all similar and worth the majority of grades for this assignment. This is by design!
- There may be more teams than the assert statements test, remember to collapse multiple teams in one city into a single value!
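A minimal sketch of the calculation described above, using toy numbers instead of the real files. The column names `area`, `W`, `L` and `population`, and both function names, are assumptions chosen for the example. In the real data the win and loss columns may need converting to numbers first, and only 2018 rows should be kept.

```python
import pandas as pd
from scipy import stats

def area_ratios(teams):
    """Win/loss ratio W / (W + L) per team, averaged over teams in the same area."""
    ratio = teams["W"] / (teams["W"] + teams["L"])
    return ratio.groupby(teams["area"]).mean()

def population_correlation(teams, population):
    """Pearson correlation between area population and the averaged ratio."""
    ratios = area_ratios(teams).rename("ratio").reset_index()
    # The inner merge keeps both columns aligned row by row (same area order).
    merged = ratios.merge(population, on="area", how="inner")
    r, _p = stats.pearsonr(merged["population"], merged["ratio"])
    return float(r)

# Toy stand-ins for one league's rows and for the metropolitan table.
teams_2018 = pd.DataFrame({
    "area": ["A", "A", "B", "C", "D"],
    "W": [40, 30, 50, 20, 45],
    "L": [42, 52, 32, 62, 37],
})
metro = pd.DataFrame({
    "area": ["A", "B", "C", "D"],
    "population": [8_000_000, 4_000_000, 1_000_000, 6_000_000],
})

r = population_correlation(teams_2018, metro)
print(round(r, 3))
```

Area "A" has two teams here, so its ratio is the mean of the two team ratios, which is the collapsing step the last note refers to.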
For this question, calculate the win/loss ratio’s correlation with the population of the city it is in for the NHL using 2018 data.
For this question, calculate the win/loss ratio’s correlation with the population of the city it is in for the NBA using 2018 data.
For this question, calculate the win/loss ratio’s correlation with the population of the city it is in for the MLB using 2018 data.
For this question, calculate the win/loss ratio’s correlation with the population of the city it is in for the NFL using 2018 data.
In this question I would like you to explore the hypothesis that given that an area has two sports teams in different sports, those teams will perform the same within their respective sports. How I would like to see this explored is with a series of paired t-tests (so use ttest_rel) between all pairs of sports. Are there any sports where we can reject the null hypothesis? Again, average values where a sport has multiple teams in one region. Remember, you will only be including, for each sport, cities which have teams engaged in that sport, drop others as appropriate. This question is worth 20% of the grade for this assignment.
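One way to organise the paired tests, sketched on made-up ratios. The dictionary `ratios` and the function name are assumptions for illustration. Each Series is indexed by area and already holds the per-area average, and for every pair of sports only the areas present in both are compared.

```python
from itertools import combinations

import pandas as pd
from scipy import stats

# Toy per-area average win/loss ratios; the real values come from the 2018 data.
ratios = {
    "NFL": pd.Series({"A": 0.50, "B": 0.60, "C": 0.40, "D": 0.55}),
    "NBA": pd.Series({"A": 0.45, "B": 0.65, "C": 0.50}),
    "NHL": pd.Series({"B": 0.60, "C": 0.52, "D": 0.48, "E": 0.50}),
}

def pairwise_paired_ttests(ratios):
    """Return a symmetric table of ttest_rel p-values between all pairs of sports."""
    sports = list(ratios)
    pvals = pd.DataFrame(float("nan"), index=sports, columns=sports)
    for a, b in combinations(sports, 2):
        # join="inner" drops areas that lack a team in either sport.
        both = pd.concat([ratios[a], ratios[b]], axis=1, join="inner")
        _stat, p = stats.ttest_rel(both.iloc[:, 0], both.iloc[:, 1])
        pvals.loc[a, b] = p
        pvals.loc[b, a] = p
    return pvals

pvals = pairwise_paired_ttests(ratios)
print(pvals)
```

The diagonal stays NaN because a sport is never tested against itself. A pair would count as rejecting the null hypothesis when its p-value falls below the chosen significance level.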
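That wraps up all the assignments. I hope you got something out of them! If there is anything else you need, leave a comment 😃 Discussion and sharing are welcome!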
Coursera | Introduction to Data Science in Python（University of Michigan）| Assignment2
To be honest, the assignments in this course are fairly difficult, especially Assignment 4 (sigh), so I am posting them here for reference. If I have time (and there is demand), I will put all the code on GitHub (though I worry it will get taken down).

Related links:
- Coursera | Introduction to Data Science in Python (University of Michigan) | Quiz
- Coursera | Introduction to Data Science in Python (University of Michigan) | Assignment1
- Coursera | Introduction to Data Science in Python (University of Michigan) | Assignment2
- Coursera | Introduction to Data Science in Python (University of Michigan) | Assignment3
- Coursera | Introduction to Data Science in Python (University of Michigan) | Assignment4

CSDN links:
- Coursera | Introduction to Data Science in Python (University of Michigan) | Quiz answers
- Coursera | Introduction to Data Science in Python (University of Michigan) | Assignment1
- Coursera | Introduction to Data Science in Python (University of Michigan) | Assignment2
- Coursera | Introduction to Data Science in Python (University of Michigan) | Assignment3
- Coursera | Introduction to Data Science in Python (University of Michigan) | Assignment4
For this assignment you’ll be looking at 2017 data on immunizations from the CDC. Your datafile for this assignment is in assets/NISPUF17.csv. A data users guide for this, which you’ll need to map the variables in the data to the questions being asked, is available at assets/NIS-PUF17-DUG.pdf. Note: you may have to go to your Jupyter tree (click on the Coursera image) and navigate to the assignment 2 assets folder to see this PDF file.
Write a function called proportion_of_education which returns the proportion of children in the dataset who had a mother with the education levels equal to less than high school (<12), high school (12), more than high school but not a college graduate (>12) and college degree.
This function should return a dictionary in the form of (use the correct numbers, do not round numbers):
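A rough sketch of how such a function could be written, shown on toy data. The column name `EDUC1`, the 1 to 4 coding, and the dictionary keys below are my assumptions based on the description above. Check them against the data users guide before relying on them.

```python
import pandas as pd

# Assumed coding of the mother's education column (verify in the users guide).
EDUCATION_LABELS = {
    1: "less than high school",
    2: "high school",
    3: "more than high school but not college",
    4: "college",
}

def proportion_of_education(df, column="EDUC1"):
    """Share of children per education level of the mother, unrounded."""
    shares = df[column].value_counts(normalize=True)
    return {label: float(shares.get(code, 0.0))
            for code, label in EDUCATION_LABELS.items()}

# Toy frame standing in for pd.read_csv("assets/NISPUF17.csv").
toy = pd.DataFrame({"EDUC1": [1, 2, 2, 3, 4, 4, 4, 4]})
result = proportion_of_education(toy)
print(result)
```

`value_counts(normalize=True)` divides each count by the number of non-missing rows, so the four proportions add up to 1 when every row has one of the four codes.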
Let’s explore the relationship between being fed breastmilk as a child and getting a seasonal influenza vaccine from a healthcare provider. Return a tuple of the average number of influenza vaccines for those children we know received breastmilk as a child and those we know did not.
This function should return a tuple in the form (use the correct numbers):
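A minimal sketch of the comparison, again on toy data. The column names `CBF_01` (breastfed: 1 = yes, 2 = no) and `P_NUMFLU` (number of influenza doses) are assumptions on my part and should be confirmed in the data users guide.

```python
import pandas as pd

def average_influenza_doses(df, fed_col="CBF_01", flu_col="P_NUMFLU"):
    """Mean number of flu doses for children known to be breastfed vs. known not to be."""
    known = df.dropna(subset=[flu_col])
    fed = known.loc[known[fed_col] == 1, flu_col].mean()
    not_fed = known.loc[known[fed_col] == 2, flu_col].mean()
    return (float(fed), float(not_fed))

# Toy rows; the code 99 stands for an "unknown" answer and is excluded from both groups.
toy = pd.DataFrame({
    "CBF_01": [1, 1, 2, 2, 99],
    "P_NUMFLU": [2.0, 4.0, 1.0, None, 3.0],
})
averages = average_influenza_doses(toy)
print(averages)
```

Filtering on the exact codes 1 and 2 is what restricts the averages to children whose status is known.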
This function should return a dictionary in the form of (use the correct numbers):
Note: To aid in verification, the chickenpox_by_sex()['female'] value the autograder is looking for starts with the digits 0.0077.
A correlation is a statistical relationship between two variables. If we wanted to know if vaccines work, we might look at the correlation between the use of the vaccine and whether it results in prevention of the infection or disease. In this question, you are to see if there is a correlation between having had the chicken pox and the number of chickenpox vaccine doses given (varicella).
Some notes on interpreting the answer. The had_chickenpox_column is either 1 (for yes) or 2 (for no), and the num_chickenpox_vaccine_column is the number of doses a child has been given of the varicella vaccine. A positive correlation (e.g., corr > 0) means that an increase in had_chickenpox_column (which means more no’s) would also increase the values of num_chickenpox_vaccine_column (which means more doses of vaccine). If there is a negative correlation (e.g., corr < 0), it indicates that having had chickenpox is related to an increase in the number of vaccine doses.
Also, pval is the probability of observing, purely by chance, a correlation between had_chickenpox_column and num_chickenpox_vaccine_column that is at least as strong as the one measured. A small pval means that the observed correlation is highly unlikely to occur by chance. In this case, pval should be very small (it will end in e-18, indicating a very small number).
 This isn’t really the full picture, since we are not looking at when the dose was given. It’s possible that children had chickenpox and then their parents went to get them the vaccine. Does this dataset have the data we would need to investigate the timing of the dose?
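The correlation described above could be computed along these lines. This is a sketch on toy values. The real column names behind had_chickenpox_column and num_chickenpox_vaccine_column are assumed here to be `HAD_CPOX` and `P_NUMVRC`, which should be checked in the data users guide.

```python
import pandas as pd
from scipy import stats

def corr_chickenpox(df, had_col="HAD_CPOX", doses_col="P_NUMVRC"):
    """Pearson r and p-value between had-chickenpox (1 = yes, 2 = no) and dose count."""
    # Keep only definite yes/no answers and rows with a known dose count.
    subset = df[df[had_col].isin([1, 2])].dropna(subset=[doses_col])
    r, p = stats.pearsonr(subset[had_col], subset[doses_col])
    return float(r), float(p)

# Toy rows; the code 99 stands for an unknown answer and is filtered out.
toy = pd.DataFrame({
    "HAD_CPOX": [1, 1, 2, 2, 2, 1, 2, 99],
    "P_NUMVRC": [0, 1, 2, 1, 2, 0, 2, 1],
})
corr, pval = corr_chickenpox(toy)
print(corr, pval)
```

In this toy frame the children coded 2 (no chickenpox) have more doses, so corr comes out positive, matching the interpretation notes.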
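If there is anything else you need, leave a comment :) Discussion and sharing are welcome!
- Author: 买猫咪的小鱼干
- Post link: https://ycchen00.github.io/2021/08/18/Coursera/Intro2DS/Assignment2/
- Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting!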
Applied Data Science with Python Specialization
Gain new insights into your data. Learn to apply data science methods and techniques, and acquire analysis skills.
What you will learn
Conduct an inferential statistical analysis
Discern whether a data visualization is good or bad
Enhance a data analysis with applied machine learning
Analyze the connectivity of a social network
Skills you will gain
- Text Mining
- Python Programming
- Data Cleansing
- Data Virtualization
- Data Visualization (DataViz)
- Machine Learning (ML) Algorithms
- Machine Learning
- Natural Language Toolkit (NLTK)
About this Specialization
Some related experience required.
How the Specialization Works
A Coursera Specialization is a series of courses that helps you master a skill. To begin, enroll in the Specialization directly, or review its courses and choose the one you'd like to start with. When you subscribe to a course that is part of a Specialization, you’re automatically subscribed to the full Specialization. It’s okay to complete just one course — you can pause your learning or end your subscription at any time. Visit your learner dashboard to track your course enrollments and your progress.
Every Specialization includes a hands-on project. You'll need to successfully finish the project(s) to complete the Specialization and earn your certificate. If the Specialization includes a separate course for the hands-on project, you'll need to finish each of the other courses before you can start it.
Earn a Certificate
When you finish every course and complete the hands-on project, you'll earn a Certificate that you can share with prospective employers and your professional network.
There are 5 Courses in this Specialization
Introduction to Data Science in Python
This course will introduce the learner to the basics of the python programming environment, including fundamental python programming techniques such as lambdas, reading and manipulating csv files, and the numpy library. The course will introduce data manipulation and cleaning techniques using the popular python pandas data science library and introduce the abstraction of the Series and DataFrame as the central data structures for data analysis, along with tutorials on how to use functions such as groupby, merge, and pivot tables effectively. By the end of this course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses.
This course should be taken before any of the other Applied Data Science with Python courses: Applied Plotting, Charting & Data Representation in Python, Applied Machine Learning in Python, Applied Text Mining in Python, Applied Social Network Analysis in Python.
Applied Plotting, Charting & Data Representation in Python
This course will introduce the learner to information visualization basics, with a focus on reporting and charting using the matplotlib library. The course will start with a design and information literacy perspective, touching on what makes a good and bad visualization, and what statistical measures translate into in terms of visualizations. The second week will focus on the technology used to make visualizations in python, matplotlib, and introduce users to best practices when creating basic charts and how to realize design decisions in the framework. The third week will be a tutorial of functionality available in matplotlib, and demonstrate a variety of basic statistical charts helping learners to identify when a particular method is good for a particular problem. The course will end with a discussion of other forms of structuring and visualizing data.
This course should be taken after Introduction to Data Science in Python and before the remainder of the Applied Data Science with Python courses: Applied Machine Learning in Python, Applied Text Mining in Python, and Applied Social Network Analysis in Python.
Applied Machine Learning in Python
This course will introduce the learner to applied machine learning, focusing more on the techniques and methods than on the statistics behind these methods. The course will start with a discussion of how machine learning is different than descriptive statistics, and introduce the scikit learn toolkit through a tutorial. The issue of dimensionality of data will be discussed, and the task of clustering data, as well as evaluating those clusters, will be tackled. Supervised approaches for creating predictive models will be described, and learners will be able to apply the scikit learn predictive modelling methods while understanding process issues related to data generalizability (e.g. cross validation, overfitting). The course will end with a look at more advanced techniques, such as building ensembles, and practical limitations of predictive models. By the end of this course, students will be able to identify the difference between a supervised (classification) and unsupervised (clustering) technique, identify which technique they need to apply for a particular dataset and need, engineer features to meet that need, and write python code to carry out an analysis.
This course should be taken after Introduction to Data Science in Python and Applied Plotting, Charting & Data Representation in Python and before Applied Text Mining in Python and Applied Social Network Analysis in Python.
Applied Text Mining in Python
This course will introduce the learner to text mining and text manipulation basics. The course begins with an understanding of how text is handled by python, the structure of text both to the machine and to humans, and an overview of the nltk framework for manipulating text. The second week focuses on common manipulation needs, including regular expressions (searching for text), cleaning text, and preparing text for use by machine learning processes. The third week will apply basic natural language processing methods to text, and demonstrate how text classification is accomplished. The final week will explore more advanced methods for detecting the topics in documents and grouping them by similarity (topic modelling).
This course should be taken after: Introduction to Data Science in Python, Applied Plotting, Charting & Data Representation in Python, and Applied Machine Learning in Python.
V. G. Vinod Vydiswaran
University of Michigan
The mission of the University of Michigan is to serve the people of Michigan and the world through preeminence in creating, communicating, preserving and applying knowledge, art, and academic values, and in developing leaders and citizens who will challenge the present and enrich the future.
Frequently asked questions.
What is the refund policy?
If you subscribed, you get a 7-day free trial during which you can cancel at no penalty. After that, we don’t give refunds, but you can cancel your subscription at any time. See our full refund policy.
Can I just enroll in a single course?
Yes! To get started, click the course card that interests you and enroll. You can enroll and complete the course to earn a shareable certificate, or you can audit it to view the course materials for free. When you subscribe to a course that is part of a Specialization, you’re automatically subscribed to the full Specialization. Visit your learner dashboard to track your progress.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.
Can I take the course for free?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. If you only want to read and view the course content, you can audit the course for free. If you cannot afford the fee, you can apply for financial aid.
Is this course really 100% online? Do I need to attend any classes in person?
This course is completely online, so there’s no need to show up to a classroom in person. You can access your lectures, readings and assignments anytime and anywhere via the web or your mobile device.
Will I earn university credit for completing the Specialization?
This Specialization doesn't carry university credit, but some universities may choose to accept Specialization Certificates for credit. Check with your institution to learn more.
Introduction to Data Science in Python
University of Michigan via Coursera
- Fundamentals of Data Manipulation with Python
- In this week you'll get an introduction to the field of data science, review common Python functionality and features which data scientists use, and be introduced to the Coursera Jupyter Notebook for the lectures. All of the course information on grading, prerequisites, and expectations are on the course syllabus, and you can find more information about the Jupyter Notebooks on our Course Resources page.
- Basic Data Processing with Pandas
- In this week of the course you'll learn the fundamentals of one of the most important toolkits Python has for data cleaning and processing -- pandas. You'll learn how to read data into DataFrame structures, how to query these structures, and how such structures are indexed.
- More Data Processing with Pandas
- In this week you'll deepen your understanding of the python pandas library by learning how to merge DataFrames, generate summary tables, group data into logical pieces, and manipulate dates. We'll also refresh your understanding of scales of data, and discuss issues with creating metrics for analysis. The week ends with a more significant programming assignment.
- Answering Questions with Messy Data
- In this week of the course you'll be introduced to a variety of statistical techniques such as distributions, sampling and t-tests. The week ends with two discussions of science and the rise of the fourth paradigm -- data driven discovery.
Christopher Brooks, Kevyn Collins-Thompson, Daniel Romero and V. G. Vinod Vydiswaran
2.4 rating, based on 46 Class Central reviews
4.5 rating at Coursera based on 26527 ratings
Paul Leitner is taking this course right now.
Anonymous completed this course.
Rtodyssey completed this course, spending 6 hours a week on it and found the course difficulty to be hard.
Anonymous is taking this course right now.
D C is taking this course right now.
Graham C completed this course, spending 8 hours a week on it and found the course difficulty to be medium.
Julián Urrea completed this course.
Juan Velasquez is taking this course right now.
Jeff Trawick is taking this course right now, spending 4 hours a week on it and found the course difficulty to be medium.
- Anonymous, 2 years ago: This course is a real waste of time! Please avoid!! The lecturer in general teaches nothing. He explains some basic concepts. You can learn them in a 5 minutes YouTube video. Then, you should answer the detailed/technical coding assignments. The assignments have nothing to deal with the lectures. The lectures have zero to very limited coding explanation. Then, there is an outdated picky auto grader that grades your work. You will spend hours finding out that your code is correct, but the auto grader works with libraries very old versions. I learned nothing from the lectures but I passed the assignments with 90, thanks to StackOverflow and online resources. I am wondering who gives this course 5 stars. Fake reviews?
Mark Adelhelm completed this course, spending 8 hours a week on it and found the course difficulty to be hard.
- Anonymous, 2 years ago: The lectures barely scratch the surface of Python. Their assignment is absurdly difficult given their poor lectures and information provided. I took the course using 7-day trial for Coursera and it was the worst.