Unit 7: Day 3


Tasks

73.1 CSP Word of the Day: Cleaning Data

  • Cleaning data is a process that makes the data uniform without changing its meaning (e.g., replacing all equivalent abbreviations, spellings, and capitalizations with the same word). 
  • Related terms: Data set, bias, filtering
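
The definition above can be sketched in Python. This is a minimal illustration, not a full cleaning pipeline: the `CANONICAL` table and the state names in it are made-up examples, and the function names are hypothetical.

```python
# Hypothetical lookup table: every equivalent spelling maps to one canonical word.
CANONICAL = {
    "ca": "California",
    "calif.": "California",
    "california": "California",
}

def clean_value(raw):
    """Return the canonical form of a value, ignoring case and extra spaces."""
    key = raw.strip().lower()
    # Values that are not in the table are kept, with only the spaces trimmed.
    return CANONICAL.get(key, raw.strip())

def clean_column(values):
    """Clean every value in a list without changing what the value means."""
    return [clean_value(v) for v in values]

print(clean_column(["CA", " calif. ", "California", "Texas"]))
```

After cleaning, the three spellings of California are identical, so counting or filtering the data treats them as the same value.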

73.2 Global Impact - Algorithmic Bias

  • If you have not already watched the video below, do so now.
  • Search for "Grandma" on Google. What do you see? Probably the stereotypical white granny. Add the filter ‘cute’ and the results are even more skewed.
  • Why are the search results disproportionately white?
  • Watch How I'm Fighting Bias in Algorithms [8:43]

  • When white males start a project, their data set is often their friends and acquaintances. The data set is often the problem, rather than any deliberate bias on their part. Algorithms can therefore adopt the unconscious skews and biases of their developers unless developers are hyperconscious of this potential problem.
  • Problems of bias are often created by the type or source of data being collected. Bias is not eliminated by simply collecting more data.
  • Computing innovations can reflect existing human biases because of biases written into the algorithms or biases in the data used by the innovation.
  • Programmers should take action to reduce bias in algorithms used for computing innovations as a way of combating existing human biases.
  • Biases can be embedded at all levels of software development.
  • Large, diverse, carefully selected data sets can help reduce algorithmic bias, but…
    • DAT-2.C.7: Large data sets are difficult to process using a single computer and may require parallel systems.
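
The point that bias is not eliminated by simply collecting more data can be sketched in Python. This is a simplified illustration with made-up numbers: the `SOURCE_MIX` proportions and the group labels `A` and `B` are hypothetical, and records arrive in a fixed repeating pattern rather than at random so the result is the same on every run.

```python
# Hypothetical source: 9 of every 10 records come from group "A".
SOURCE_MIX = {"A": 9, "B": 1}

def collect(source_mix, n):
    """Collect n records from a source whose group proportions are fixed."""
    # Build one repeating block, e.g. nine "A" records followed by one "B".
    pattern = [group for group, count in source_mix.items() for _ in range(count)]
    return [pattern[i % len(pattern)] for i in range(n)]

def share_of_group(sample, group):
    """Fraction of the sample that belongs to the given group."""
    return sample.count(group) / len(sample)

small = collect(SOURCE_MIX, 100)
large = collect(SOURCE_MIX, 100_000)

print(share_of_group(small, "B"))
print(share_of_group(large, "B"))
```

The larger sample is a thousand times bigger, but group B still makes up only one tenth of it. Changing the source of the data, not just the amount, is what changes the mix.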

73.2 Algorithms and What They Cannot Solve

  • If you have not watched this video earlier in the course, it does a good job of setting the stage for today’s topics.
  • Warning: This video is very challenging and may be approachable only for students comfortable with more academic material.
  • Watch P vs NP and the Computational Complexity Zoo [10:43]
  • AP CSP uses the terms decidable and undecidable problems. Discuss examples from the P vs NP video and list problems that fall into each category.
    • AAP-4.A.7 Algorithms with a polynomial efficiency or lower (constant, linear, square, cube, etc.) are said to run in a reasonable amount of time. Algorithms with exponential or factorial efficiencies are examples of algorithms that run in an unreasonable amount of time. 
    • AAP-4.A.8 Some problems cannot be solved in a reasonable amount of time because there is no efficient algorithm for solving them. In these cases, approximate solutions are sought.
    • AAP-4.A.9 A heuristic is an approach to a problem that produces a solution that is not guaranteed to be optimal but may be used when techniques that are guaranteed to always find an optimal solution are impractical.
    • AAP-4.B.1 A decidable problem is a decision problem for which an algorithm can be written to produce a correct output for all inputs (e.g., “Is the number even?”).
    • AAP-4.B.2 An undecidable problem is one for which no algorithm can be constructed that is always capable of providing a correct yes-or-no answer.
    • AAP-4.B.3 An undecidable problem may have some instances that have an algorithmic solution, but there is no algorithmic solution that could solve all instances of the problem.
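
The difference between an exact algorithm that runs in an unreasonable amount of time and a heuristic (AAP-4.A.7 through AAP-4.A.9) can be sketched in Python with a small route-planning problem. This is one possible illustration, not part of the AP CSP framework: the `DIST` table of distances between four cities is made up, and the "nearest neighbor" rule is just one common heuristic chosen for the example.

```python
from itertools import permutations
from math import factorial

# Hypothetical distances between four cities (row = from, column = to).
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

def tour_length(tour, dist):
    """Total length of a round trip that visits the cities in order and returns home."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def brute_force_tour(dist):
    """Exact answer: check every ordering of the cities after city 0."""
    candidates = ((0,) + rest for rest in permutations(range(1, len(dist))))
    return min(candidates, key=lambda tour: tour_length(tour, dist))

def nearest_neighbor_tour(dist):
    """Heuristic: always travel to the closest unvisited city. Fast, but not guaranteed optimal."""
    tour, unvisited = [0], set(range(1, len(dist)))
    while unvisited:
        nxt = min(unvisited, key=lambda city: dist[tour[-1]][city])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tuple(tour)

def orderings_to_check(n):
    """Number of orderings the brute-force search examines for n cities (factorial growth)."""
    return factorial(n - 1)

print(tour_length(brute_force_tour(DIST), DIST))
print(tour_length(nearest_neighbor_tour(DIST), DIST))
print(orderings_to_check(4), orderings_to_check(20))
```

With four cities the brute-force search checks only 6 orderings, but with 20 cities it would need to check 19! orderings, which is why the exact approach becomes unreasonable. The heuristic does far less work per city, at the cost of sometimes returning a longer route than the best one.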

73.3 CSP Review Assignment 1 - Algorithms

  • Instructions for this assignment can be found in the folder: 2.7 - Year2: Exam Prep
  • This activity makes use of the great Khan Academy CSP Review course.

