Introduction to Data Science Programming (3 Credits)

  • Course Description

    This course is designed for students who are new to data science. After an introduction to basic arithmetic, variables, and data structures in Python, students learn how to collect and extract data from real datasets. Data analysis skills using control flow and Python packages (e.g., NumPy, SciPy, and Pandas) are then introduced. To address the needs of big data processing, distributed computing frameworks (e.g., Spark) and Python visualization tools are discussed. Students may also apply basic learning algorithms from Python packages (e.g., scikit-learn) to extract knowledge from data.

  • Course Objectives

    Upon completing this course, students will be able to:

    • apply Python language fundamentals, including basic syntax, variables, and control flow, to write their first program
    • apply functions and import packages to work with complex or large datasets
    • apply scientific packages (e.g., NumPy and SciPy) to perform useful computations
    • process text files using external packages (e.g., tabula)
    • apply data visualization tools to visualize large datasets

Textbook

Learning Python, 5th Edition, by Mark Lutz

Remarks: This course will also be offered to senior undergraduate (year-4) students as a free elective.